# Agent host
The agent host is the runtime that wires together messaging, LLM calls, memory, skills, tools, and the dream cycle into a working agent process. It lives in `RockBot.Host` and `RockBot.Host.Abstractions`, with the concrete `RockBot.Agent` project providing the runnable executable.
## Overview

```
Incoming MessageEnvelope (from RabbitMQ)
        │
        ▼
IMessagePipeline.DispatchAsync()
        │
        ├── Middleware chain (logging, tracing, error handling, ...)
        │
        ▼
IMessageHandler<TMessage>.HandleAsync()
        │     UserMessageHandler                — main LLM conversation loop
        │     ScheduledTaskHandler              — scheduled task delivery
        │     ConversationHistoryRequestHandler — history replay
        │
        ├── IConversationMemory — sliding window of turns
        ├── ILongTermMemory     — BM25 recall of relevant memories
        ├── ISkillStore         — BM25 recall of relevant skills
        ├── IWorkingMemory      — global path-namespaced scratch space (TTL-based)
        ├── ILlmClient          — serialized LLM gateway (one in-flight at a time)
        └── IFeedbackStore      — quality signal writes (fire-and-forget)
```
## Agent identity and profile
### AgentIdentity

```csharp
public sealed record AgentIdentity(
    string Name,       // Logical agent name, e.g. "rockbot"
    string InstanceId  // Unique instance; auto-generated GUID if not supplied
);
```
Used in system prompt construction, topic subscriptions, and as the `Source` field on outgoing envelopes.
### AgentProfile
The agent’s personality and instructions are loaded from markdown files on the data volume:
| File | Purpose |
|---|---|
| `soul.md` | Core identity, values, and personality — stable; authored by prompt engineers |
| `directives.md` | Deployment-specific operational instructions |
| `style.md` | (optional) Voice and tone polish |
| `memory-rules.md` | (optional) Rules governing when and how memories are formed |
The profile is parsed into an `AgentProfile` composed of `AgentProfileDocument` instances. Each document is split on `##` headings into named `AgentProfileSection` items. Sections can be looked up by name across all documents via `profile.FindSection("name")`.
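The split itself is simple to picture. A minimal sketch of the idea (illustrative only — the helper names here are hypothetical, not the actual `RockBot.Host` types):

```csharp
using System.Collections.Generic;
using System.Text;

// Illustrative sketch of the "## heading" split — not the actual implementation.
public sealed record AgentProfileSection(string Name, string Content);

public static class ProfileParsing
{
    public static List<AgentProfileSection> SplitSections(string markdown)
    {
        var sections = new List<AgentProfileSection>();
        string? name = null;
        var body = new StringBuilder();

        foreach (var line in markdown.Split('\n'))
        {
            if (line.StartsWith("## "))
            {
                if (name is not null)
                    sections.Add(new AgentProfileSection(name, body.ToString().Trim()));
                name = line[3..].Trim();   // section name follows the "## " marker
                body.Clear();
            }
            else if (name is not null)
            {
                body.AppendLine(line);     // accumulate lines under the current heading
            }
        }

        if (name is not null)
            sections.Add(new AgentProfileSection(name, body.ToString().Trim()));
        return sections;
    }
}
```

`FindSection` then reduces to a case-insensitive name lookup across every document's section list.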
### DefaultSystemPromptBuilder

Assembles the system prompt from the agent profile and identity:

```
You are {AgentName}.

{soul.md content}
{directives.md content}
{memory-rules.md content}   ← if present
{style.md content}          ← if present
```
The result is cached after the first call — the profile is immutable at runtime. The built system prompt is the starting system message on every LLM request.
## Message pipeline

### Registration

```csharp
agent
    .HandleMessage<UserMessage, UserMessageHandler>()
    .HandleMessage<ScheduledTaskMessage, ScheduledTaskHandler>()
    .HandleMessage<ConversationHistoryRequest, ConversationHistoryRequestHandler>()
    .UseMiddleware<LoggingMiddleware>()
    .UseMiddleware<TracingMiddleware>()
    .UseMiddleware<ErrorHandlingMiddleware>()
    .SubscribeTo(UserProxyTopics.UserMessage)
    .SubscribeTo(UserProxyTopics.ConversationHistoryRequest);
```
### Dispatch flow

`IMessagePipeline` receives a raw `MessageEnvelope` from the subscriber callback:

1. Deserializes the `MessageType` field to find the registered `IMessageHandler<T>`
2. Passes the envelope through the middleware chain
3. Middleware calls `next()` to continue, or short-circuits by returning a `MessageResult`
4. The innermost middleware invokes the handler
`MessageTypeResolver` maps `MessageType` strings to .NET types. Registration is done via `agent.HandleMessage<TMessage, THandler>()`, which records both the type mapping and the DI registration for `THandler`.
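Conceptually the resolver is just a string-to-type dictionary. A rough sketch (assumed shape; the real class may differ):

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the MessageType → .NET type mapping — a sketch, not the real class.
public sealed class MessageTypeResolver
{
    private readonly Dictionary<string, Type> _map = new(StringComparer.OrdinalIgnoreCase);

    // Called by agent.HandleMessage<TMessage, THandler>() at registration time.
    public void Register<TMessage>() => _map[typeof(TMessage).Name] = typeof(TMessage);

    // Called by the pipeline when an envelope arrives.
    public Type? Resolve(string messageType) =>
        _map.TryGetValue(messageType, out var type) ? type : null;
}
```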
## Conversation memory

### FileConversationMemory (implements `IConversationMemory`)

Wraps `InMemoryConversationMemory` with file-backed persistence:

- Each session serializes to `{BasePath}/{sessionId}.json`
- On startup, sessions whose last turn falls within `SessionIdleTimeout` are reloaded — so recent conversations survive agent restarts
- A per-session `SemaphoreSlim` prevents concurrent write races on the same file
- If `IConversationLog` is registered, every turn is also appended to the conversation log for the dream preference-inference pass
Session lifecycle:

- The first message in a session creates the file
- Subsequent messages append turns and re-serialize
- `ClearAsync` removes both the in-memory state and the file
- Stale sessions (beyond `SessionIdleTimeout`) are not loaded on restart
## Feedback and session evaluation

### FileFeedbackStore (implements `IFeedbackStore`)

Appends `FeedbackEntry` records to per-session JSONL files:

```
{BasePath}/{sessionId}.jsonl
```

One JSON object per line. Per-session semaphores prevent concurrent write races.
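For illustration, a feedback file might look like this — one entry per line. The field names below are representative guesses, not the exact `FeedbackEntry` schema:

```json
{"signalType": "Correction", "timestamp": "2026-02-24T18:03:11Z", "detail": "User corrected the meeting date"}
{"signalType": "SessionSummary", "timestamp": "2026-02-24T18:20:02Z", "summary": "Scheduled a reminder and answered two calendar questions", "overallQuality": "good"}
```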
`QueryRecentAsync` scans all JSONL files to find entries since a given timestamp — used by the dream cycle to gather quality signals for memory consolidation and skill optimization.
### SessionSummaryService

Background hosted service that evaluates completed sessions:

- Polls on `FeedbackOptions.PollInterval` (default 5 minutes)
- Finds sessions whose last turn is older than `SessionIdleThreshold` (default 10 minutes) and that haven't already been evaluated this run
- Backs off if the LLM is busy (polls every 5 s until idle)
- Sends the full session transcript to the LLM with an evaluator directive
- Writes a `FeedbackEntry` with `SignalType = SessionSummary` containing:
  - `summary`: one-sentence description
  - `toolsWorkedWell`, `toolsFailedOrMissed`, `correctionsMade`
  - `overallQuality`: `excellent` / `good` / `fair` / `poor`
The evaluator directive is loaded from `session-evaluator.md` on the data volume, with a built-in fallback.

The dream cycle's skill optimization pass uses `poor` / `fair` quality scores, along with explicit `Correction` signals, to identify skills that need improvement.
## Conversation log

### FileConversationLog (implements `IConversationLog`)

Single-file JSONL log of all conversation turns across all sessions:

```
{BasePath}/turns.jsonl
```
A single semaphore serializes all writes. Used exclusively by the dream cycle:
- The preference-inference pass reads the full log to infer durable user preferences
- The skill gap detection pass reads it to find recurring patterns
- Both passes clear the log after processing to prevent unbounded growth
`IConversationLog` is opt-in — call `WithConversationLog()` explicitly in the host builder. `WithMemory()` does not register it.
## Three-tier LLM routing

### ModelTier

```csharp
public enum ModelTier { Low, Balanced, High }
```

Every LLM call is tagged with a tier. The `TieredChatClientRegistry` singleton holds one `IChatClient` per tier, and `LlmClient` selects the right one at call time.
| Tier | Intended use | Falls back to |
|---|---|---|
| `Low` | Short factual questions, trivial single-step tasks | `Balanced` |
| `Balanced` | Moderate-complexity requests, patrol tasks | — (required) |
| `High` | Deep analysis, dream consolidation, research | `Balanced` |
`Low` and `High` are optional in configuration; when absent they fall back to `Balanced`.
### KeywordTierSelector (implements `ILlmTierSelector`)

Scores prompts using a keyword + length heuristic — no embeddings, no external calls:

- Length score (0 – 0.40) — longer prompts tend to be more complex
- Keyword score (0 – 0.35) — high-signal words (`analyze`, `research`, `distributed`, …) increase the score; low-signal words (`what is`, `define`, `list the`, …) decrease it
- Structural score (0 – 0.25) — code blocks, math notation, multi-step markers

Scores at or below `lowCeiling` → `Low`; at or below `balancedCeiling` → `Balanced`; above → `High`.
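As a sketch, the selection might look like the following. The caps (0.40 / 0.35 / 0.25) match the documentation, but the exact formulas are an assumption; it reuses the `ModelTier` enum above:

```csharp
using System;

// Illustrative sketch of the length + keyword + structure heuristic.
// Score caps match the documentation; the formulas themselves are assumed.
public static class TierScoring
{
    public static ModelTier SelectTier(
        string prompt, double lowCeiling, double balancedCeiling,
        string[] highSignalKeywords, string[] lowSignalKeywords)
    {
        // Length score: saturates at 0.40 for long prompts.
        double length = Math.Min(0.40, prompt.Length / 2000.0 * 0.40);

        // Keyword score: high-signal words add, low-signal words subtract,
        // clamped to the documented 0–0.35 range.
        double keywords = 0;
        foreach (var k in highSignalKeywords)
            if (prompt.Contains(k, StringComparison.OrdinalIgnoreCase)) keywords += 0.10;
        foreach (var k in lowSignalKeywords)
            if (prompt.Contains(k, StringComparison.OrdinalIgnoreCase)) keywords -= 0.10;
        keywords = Math.Clamp(keywords, 0.0, 0.35);

        // Structural score: a crude proxy — code fences bump complexity.
        double structure = prompt.Contains("```") ? 0.25 : 0.0;

        double score = length + keywords + structure;
        return score <= lowCeiling      ? ModelTier.Low
             : score <= balancedCeiling ? ModelTier.Balanced
             :                            ModelTier.High;
    }
}
```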
The parameterless constructor always uses compiled-in defaults (used in tests). The DI constructor hot-reloads `{BasePath}/tier-selector.json` every 60 seconds, so thresholds and keyword lists can be tuned without a pod restart.
### tier-selector.json (hot-reloadable)

```json
{
  "version": 1,
  "notes": "2026-02-24: tightened balancedCeiling after dream review",
  "lowCeiling": 0.15,
  "balancedCeiling": 0.46,
  "highSignalKeywords": ["analyze", "research", "distributed", "..."],
  "lowSignalKeywords": ["what is", "define ", "list the", "..."]
}
```
All fields are optional — omitted fields fall back to compiled defaults.
### Dream self-correction pass

Each routing decision is appended to `tier-routing-log.jsonl` on the PVC (capped at 200 entries). The dream cycle's tier-routing review pass reads the log and — when it detects systematic mis-routing — rewrites `tier-selector.json` with corrected thresholds and keyword lists. The pass skips when fewer than 10 entries exist.
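A log entry presumably captures enough to audit each decision. A hypothetical line (field names are illustrative, not the actual schema):

```json
{"timestamp": "2026-02-24T18:03:11Z", "score": 0.31, "tier": "Balanced", "promptLength": 412, "matchedKeywords": ["analyze"]}
```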
## LLM client

### ILlmClient

```csharp
public interface ILlmClient
{
    bool IsIdle { get; }

    Task<ChatResponse> GetResponseAsync(
        IList<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken ct = default);

    Task<ChatResponse> GetResponseAsync(
        IList<ChatMessage> messages,
        ModelTier tier,
        ChatOptions? options = null,
        CancellationToken ct = default);
}
```
A serialized gateway around the underlying `IChatClient` from Microsoft.Extensions.AI. Enforces that only one LLM call is in flight at a time within the agent process:

- If a second call arrives while the first is running, it queues and waits
- `IsIdle` lets background services (dream cycle, session evaluator) back off while the user is waiting for a response
The tier-less overload defaults to `ModelTier.Balanced`. Calls log `tier=Balanced model=...` so routing decisions are visible in the pod logs.
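The one-in-flight guarantee is the classic `SemaphoreSlim(1, 1)` pattern. A minimal sketch of the idea (not the actual `LlmClient`):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Sketch of the serialized-gateway pattern: one call in flight, later callers queue.
public sealed class SerializedGateway
{
    private readonly SemaphoreSlim _gate = new(1, 1);

    // True when no call currently holds the gate — lets background
    // services (dream cycle, session evaluator) back off while a user waits.
    public bool IsIdle => _gate.CurrentCount > 0;

    public async Task<T> RunAsync<T>(Func<CancellationToken, Task<T>> call, CancellationToken ct)
    {
        await _gate.WaitAsync(ct);       // a second caller queues here
        try
        {
            return await call(ct);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```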
## AgentLoopRunner

`AgentLoopRunner` is the single entry point for all LLM tool-calling interactions in the agent process. Every message handler (`UserMessageHandler`, `ScheduledTaskHandler`, `SubagentRunner`, A2A handlers, etc.) calls `AgentLoopRunner.RunAsync` rather than `ILlmClient.GetResponseAsync` directly.

> **Invariant:** Never call `ILlmClient.GetResponseAsync` from a message handler to drive a tool-calling loop. Always go through `AgentLoopRunner.RunAsync`. Direct calls bypass reasoning scaffolding, completion evaluation, hallucination nudging, context overflow trimming, and metrics recording.
### What RunAsync does

- DateTime context injection — ensures the model knows the user's current date/time
- Reasoning scaffolding — injects a system message with the iteration budget and step-by-step planning encouragement
- Inner tool loop — dispatches to either the native path (`FunctionInvokingChatClient`) or the text-based parsing loop, depending on `ModelBehavior.UseTextBasedToolCalling`
- Completion evaluation — after the inner loop returns, a cheap `ModelTier.Low` LLM call evaluates whether the response actually completes the original user request. If incomplete, a continuation nudge is appended and the tool loop re-enters (up to `MaxCompletionReprompts` times, default 2). Evaluation is skipped on force-termination (consecutive timeouts) and fails open on any evaluator error.
- Proactive follow-up — after the completion evaluator says COMPLETE, a second `ModelTier.Low` call assesses whether there are high-value proactive actions the agent could take within the current context (e.g. looking up a contact mentioned in conversation, cross-referencing calendar events, connecting related information). If found, context is enriched with relevant skills/services and the tool loop runs one more pass. The follow-up response is appended to the original. Skipped for simple exchanges. Fails open on error.
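Putting the last two stages together, the outer control flow is roughly the following (C#-flavored pseudocode — every helper here is hypothetical, named only to mirror the steps above):

```csharp
// Pseudocode sketch of the RunAsync outer loop. All helpers are hypothetical.
var response = await RunToolLoopAsync(messages);            // inner tool loop

for (var attempt = 0; attempt < maxCompletionReprompts; attempt++)
{
    // Cheap ModelTier.Low check; skipped on force-termination, fails open on error.
    if (await IsCompleteAsync(request, response))
        break;
    messages.Add(ContinuationNudge(response));              // re-prompt and re-enter
    response = await RunToolLoopAsync(messages);
}

// Proactive follow-up: one extra enriched pass when the evaluator finds value.
if (await HasProactiveFollowUpAsync(request, response))
{
    EnrichContext(messages);                                // relevant skills/services
    response = Append(response, await RunToolLoopAsync(messages));
}
```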
### Completion evaluator configuration

| Setting | Location | Default | Purpose |
|---|---|---|---|
| `AgentHost:MaxCompletionReprompts` | `AgentHostOptions` | 2 | Max re-prompts (0 = disabled) |
| `MaxCompletionRepromptsOverride` | `ModelBehavior` | `null` (use host default) | Per-model override |
| `AgentHost:MaxFollowUpPasses` | `AgentHostOptions` | 1 | Max proactive follow-up passes (0 = disabled) |
| `MaxFollowUpPassesOverride` | `ModelBehavior` | `null` (use host default) | Per-model override |
### Diagnostics counters

| Counter | Fires when |
|---|---|
| `rockbot.agent.completion_check.complete` | Evaluator says task is done |
| `rockbot.agent.completion_check.incomplete` | Evaluator triggers a re-prompt |
| `rockbot.agent.completion_check.skipped` | Evaluation skipped (force termination) |
| `rockbot.agent.follow_up.triggered` | Follow-up evaluator found an opportunity |
| `rockbot.agent.follow_up.none` | Follow-up evaluator found nothing worth doing |
| `rockbot.agent.follow_up.skipped` | Follow-up evaluation skipped (disabled, force termination) |
## Per-model behaviors

Model-specific behavioral overrides are loaded from `model-behaviors/{model-prefix}/` on the data volume. The model prefix is matched case-insensitively against the deployed model ID.
| File | Applied at |
|---|---|
| `additional-system-prompt.md` | Appended to every system prompt (guardrails, output constraints) |
| `pre-tool-loop-prompt.md` | Injected before each tool-calling iteration |
Additional properties are configurable in `appsettings.json` under `ModelBehaviors:Models:{prefix}`:
| Property | Type | Default | Purpose |
|---|---|---|---|
| `NudgeOnHallucinatedToolCalls` | `bool` | `false` | Inject a nudge when the model describes tool actions without emitting calls |
| `NudgeOnLeakedToolSyntax` | `bool` | `false` | Retry when the model leaks tool-calling scaffolding (`to=multi_tool_use.parallel`, `to=functions.X`) into its text output. Language-agnostic; targets a documented OpenAI GPT-family failure mode. Safe to enable for any deployment |
| `NudgeOnUnexpectedCjkOutput` | `bool` | `false` | Retry when the model emits 3+ consecutive CJK codepoints — a heuristic for English-primary deployments where CJK output correlates with training-data contamination. Leave off for agents that legitimately respond in Chinese or Japanese |
| `NudgeOnToolFailureGiveup` | `bool?` | `true` | Retry once when the model gives up after a tool returned an error (matches phrasings like "tool failure", "errored on both", "from the current tool state") instead of retrying the tool itself. General-purpose; on by default. Set to `false` to opt out for a specific model |
| `MaxToolIterationsOverride` | `int?` | `null` (uses `AgentHost:MaxToolIterations`) | Override the per-request tool-loop iteration cap |
| `ToolResultChunkingThreshold` | `int?` | `null` (uses 64 000) | Char count above which tool results are chunked into working memory instead of appended inline |
| `ScheduledTaskResultMode` | enum | `Summarize` | How scheduled task output is presented (`Summarize`, `VerbatimOutput`, `SummarizeWithOutput`) |
| `MaxCompletionRepromptsOverride` | `int?` | `null` (uses `AgentHost:MaxCompletionReprompts`) | Override the per-request completion-evaluator re-prompt cap |
| `MaxFollowUpPassesOverride` | `int?` | `null` (uses `AgentHost:MaxFollowUpPasses`) | Override the per-request proactive follow-up pass cap |
Example — lowering the chunking threshold for a small-context model:

```json
{
  "ModelBehaviors": {
    "Models": {
      "openrouter/deepseek": {
        "ToolResultChunkingThreshold": 32000
      }
    }
  }
}
```
See Tool result chunking for full details.
## Agent host builder

`AgentHostBuilder` is the fluent configuration API. Access it via `AddRockBotHost`:

```csharp
services.AddRockBotHost(agent =>
{
    agent.WithIdentity("rockbot");
    agent.WithProfile();            // Load soul.md, directives.md, etc. from data volume
    agent.WithRules();              // Load agent rules from rules/ directory
    agent.WithMemory();             // Conversation + long-term + working memory
    agent.WithConversationLog();    // Opt-in: enables dream gap detection + pref inference
    agent.WithFeedback();           // IFeedbackStore + SessionSummaryService
    agent.WithSkills();             // ISkillStore + ISkillUsageStore + StarterSkillService
    agent.WithDreaming(opts =>
    {
        opts.InitialDelay = TimeSpan.FromMinutes(5);
        opts.Interval = TimeSpan.FromHours(4);
    });

    // Message handlers
    agent.HandleMessage<UserMessage, UserMessageHandler>();
    agent.HandleMessage<ConversationHistoryRequest, ConversationHistoryRequestHandler>();
    agent.HandleMessage<ScheduledTaskMessage, ScheduledTaskHandler>();

    // Tool subsystems
    agent.AddToolHandler();             // Tool invocation dispatch
    agent.AddMcpToolProxy();            // MCP server bridge
    agent.AddWebTools(opts => { ... }); // Web search + browse
    agent.AddSchedulingTools();         // Scheduled task tools
    agent.AddRemoteScriptRunner();      // Script execution via Scripts Manager

    // Subscriptions
    agent.SubscribeTo(UserProxyTopics.UserMessage);
    agent.SubscribeTo(UserProxyTopics.ConversationHistoryRequest);

    // Optional middleware
    agent.UseMiddleware<LoggingMiddleware>();
    agent.UseMiddleware<TracingMiddleware>();
    agent.UseMiddleware<ErrorHandlingMiddleware>();
});
```
### Extension method reference

| Method | Registers |
|---|---|
| `WithIdentity(name)` | `AgentIdentity` |
| `WithProfile()` | `IAgentProfileProvider`, `AgentProfile`, `ISystemPromptBuilder` |
| `WithRules()` | `IRulesStore`, rules tools |
| `WithConversationMemory()` | `IConversationMemory` (file-backed + in-memory) |
| `WithLongTermMemory()` | `ILongTermMemory` (`FileMemoryStore`) |
| `WithWorkingMemory()` | `IWorkingMemory` (global, path-namespaced; `HybridCacheWorkingMemory` + `FileWorkingMemory`) |
| `WithMemory()` | All three memory tiers above |
| `WithConversationLog()` | `IConversationLog` (`FileConversationLog`) |
| `WithFeedback()` | `IFeedbackStore` + `SessionSummaryService` |
| `WithSkills()` | `ISkillStore` + `ISkillUsageStore` + `StarterSkillService` |
| `WithDreaming()` | `DreamService` (`IHostedService`) |
## Agent data volume layout

All persistent agent state lives under a single base path (default `/data/agent` in production, configurable via `AgentProfileOptions.BasePath`):

```
/data/agent/
├── soul.md                     # Core identity and personality
├── directives.md               # Operational instructions
├── style.md                    # (optional) Voice and tone
├── memory-rules.md             # (optional) Memory formation rules
├── dream.md                    # Dream: memory consolidation prompt
├── skill-dream.md              # Dream: skill consolidation prompt
├── skill-optimize.md           # Dream: skill optimization prompt
├── skill-gap.md                # Dream: skill gap detection prompt
├── pref-dream.md               # Dream: preference inference prompt
├── tier-routing-directive.md   # (optional) Dream: tier routing review prompt override
├── session-evaluator.md        # Session quality evaluation prompt
├── tier-selector.json          # (optional) Hot-reloadable tier routing config
├── tier-routing-log.jsonl      # Routing decision log (auto-managed, capped at 200 entries)
├── mcp.json                    # MCP server connection configuration
├── rules/                      # Agent rules (markdown files)
├── model-behaviors/            # Per-model prompt overrides
│   └── {model-prefix}/
│       ├── additional-system-prompt.md
│       └── pre-tool-loop-prompt.md
├── memory/                     # Long-term memory entries
│   └── {category}/
│       └── {id}.json
├── skills/                     # Learned skills
│   └── {name}.json             # (may be nested: skills/mcp/email.json)
├── skill-usage/                # Skill invocation event log
│   └── {sessionId}.jsonl
├── feedback/                   # Session quality signals
│   └── {sessionId}.jsonl
├── conversations/              # Persisted conversation sessions
│   └── {sessionId}.json
├── working-memory/             # Working memory persistence (path-namespaced, TTL-based)
│   ├── session.json            # Entries for all user sessions (session/{id}/...)
│   ├── patrol.json             # Entries for patrol tasks (patrol/{name}/...)
│   └── subagent.json           # Entries for subagents (subagent/{taskId}/...)
└── conversation-log/           # Aggregated turns for dream passes
    └── turns.jsonl
```
## Startup sequence

When the agent process starts:

1. `AgentHostBuilder.Build()` registers all services with the DI container
2. `IHostedService` implementations start in registration order:
   - `StarterSkillService` — seeds starter skills from registered `IToolSkillProvider`s
   - `McpBridgeService` — connects to configured MCP servers
   - `FileConversationMemory` — reloads sessions within `SessionIdleTimeout`
   - `SessionSummaryService` — begins polling for sessions to evaluate
   - `DreamService` — schedules the first dream cycle after `InitialDelay`
   - `AgentHostService` — subscribes to configured topics and begins processing messages
3. The agent is now ready to receive messages
## Configuration reference

Key configuration sections (from `appsettings.json` or environment variables):
```json
{
  "AgentProfile": {
    "BasePath": "/data/agent"
  },
  "AgentHost": {
    "MaxToolIterations": 50,
    "MaxCompletionReprompts": 2,
    "MaxFollowUpPasses": 1
  },
  "RabbitMq": {
    "HostName": "rabbitmq.cluster.local",
    "Port": 5672,
    "UserName": "rockbot",
    "Password": "..."
  },
  "LLM": {
    "Balanced": {
      "Endpoint": "https://openrouter.ai/api/v1",
      "ApiKey": "...",
      "ModelId": "anthropic/claude-haiku-4.5"
    },
    "Low": {
      "Endpoint": "https://openrouter.ai/api/v1",
      "ApiKey": "...",
      "ModelId": "google/gemini-flash-1.5-8b"
    },
    "High": {
      "Endpoint": "https://openrouter.ai/api/v1",
      "ApiKey": "...",
      "ModelId": "anthropic/claude-opus-4-6"
    }
  },
  "Memory": {
    "BasePath": "memory"
  },
  "Skills": {
    "BasePath": "skills",
    "UsageBasePath": "skill-usage"
  },
  "Dream": {
    "Enabled": true,
    "InitialDelay": "00:05:00",
    "Interval": "04:00:00",
    "TierRoutingReviewEnabled": true
  },
  "Feedback": {
    "BasePath": "feedback",
    "SessionIdleThreshold": "00:10:00",
    "PollInterval": "00:05:00"
  }
}
```
`LLM.Balanced` is required. `LLM.Low` and `LLM.High` are optional — when absent they fall back to `Balanced`. The flat legacy keys `LLM.Endpoint`, `LLM.ApiKey`, `LLM.ModelId` are still accepted for backward compatibility and are treated as `Balanced`.
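For example, this legacy flat form is still accepted and behaves exactly as if the same values had been written under `LLM.Balanced`:

```json
{
  "LLM": {
    "Endpoint": "https://openrouter.ai/api/v1",
    "ApiKey": "...",
    "ModelId": "anthropic/claude-haiku-4.5"
  }
}
```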