# Agent host
The agent host is the runtime that wires together messaging, LLM calls, memory, skills, tools, and the dream cycle into a working agent process. It lives in `RockBot.Host` and `RockBot.Host.Abstractions`, with the concrete `RockBot.Agent` project providing the runnable executable.
## Overview

```
Incoming MessageEnvelope (from RabbitMQ)
        │
        ▼
IMessagePipeline.DispatchAsync()
        │
        ├── Middleware chain (logging, tracing, error handling, ...)
        │
        ▼
IMessageHandler<TMessage>.HandleAsync()
        │     UserMessageHandler                — main LLM conversation loop
        │     ScheduledTaskHandler              — scheduled task delivery
        │     ConversationHistoryRequestHandler — history replay
        │
        ├── IConversationMemory — sliding window of turns
        ├── ILongTermMemory     — BM25 recall of relevant memories
        ├── ISkillStore         — BM25 recall of relevant skills
        ├── IWorkingMemory      — global path-namespaced scratch space (TTL-based)
        ├── ILlmClient          — serialized LLM gateway (one in-flight at a time)
        └── IFeedbackStore      — quality signal writes (fire-and-forget)
```
## Agent identity and profile
### AgentIdentity

```csharp
public sealed record AgentIdentity(
    string Name,       // Logical agent name, e.g. "rockbot"
    string InstanceId  // Unique instance; auto-generated GUID if not supplied
);
```
Used in system prompt construction, topic subscriptions, and as the `Source` field on outgoing envelopes.
### AgentProfile
The agent’s personality and instructions are loaded from markdown files on the data volume:
| File | Purpose |
|---|---|
| `soul.md` | Core identity, values, and personality — stable; authored by prompt engineers |
| `directives.md` | Deployment-specific operational instructions |
| `style.md` | (optional) Voice and tone polish |
| `memory-rules.md` | (optional) Rules governing when and how memories are formed |
The profile is parsed into an `AgentProfile` composed of `AgentProfileDocument` instances. Each document is split on `##` headings into named `AgentProfileSection` items. Sections can be looked up by name across all documents via `profile.FindSection("name")`.
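The split itself is simple to picture. A minimal sketch of the idea (illustrative only — the helper names here are hypothetical, not the actual `RockBot.Host` types):

```csharp
using System.Collections.Generic;
using System.Text;

// Illustrative sketch of the "## heading" split — not the actual implementation.
public sealed record AgentProfileSection(string Name, string Content);

public static class ProfileParsing
{
    public static List<AgentProfileSection> SplitSections(string markdown)
    {
        var sections = new List<AgentProfileSection>();
        string? name = null;
        var body = new StringBuilder();

        foreach (var line in markdown.Split('\n'))
        {
            if (line.StartsWith("## "))
            {
                if (name is not null)
                    sections.Add(new AgentProfileSection(name, body.ToString().Trim()));
                name = line[3..].Trim();   // section name follows the "## " marker
                body.Clear();
            }
            else if (name is not null)
            {
                body.AppendLine(line);     // accumulate lines under the current heading
            }
        }

        if (name is not null)
            sections.Add(new AgentProfileSection(name, body.ToString().Trim()));
        return sections;
    }
}
```

`FindSection` then reduces to a case-insensitive name lookup across every document's section list.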
### DefaultSystemPromptBuilder

Assembles the system prompt from the agent profile and identity:

```
You are {AgentName}.

{soul.md content}
{directives.md content}
{memory-rules.md content}   ← if present
{style.md content}          ← if present
```
The result is cached after the first call — the profile is immutable at runtime. The built system prompt is the starting system message on every LLM request.
## Message pipeline

### Registration

```csharp
agent
    .HandleMessage<UserMessage, UserMessageHandler>()
    .HandleMessage<ScheduledTaskMessage, ScheduledTaskHandler>()
    .HandleMessage<ConversationHistoryRequest, ConversationHistoryRequestHandler>()
    .UseMiddleware<LoggingMiddleware>()
    .UseMiddleware<TracingMiddleware>()
    .UseMiddleware<ErrorHandlingMiddleware>()
    .SubscribeTo(UserProxyTopics.UserMessage)
    .SubscribeTo(UserProxyTopics.ConversationHistoryRequest);
```
### Dispatch flow

`IMessagePipeline` receives a raw `MessageEnvelope` from the subscriber callback:

1. Deserializes the `MessageType` field to find the registered `IMessageHandler<T>`
2. Passes the envelope through the middleware chain
3. Middleware calls `next()` to continue, or short-circuits by returning a `MessageResult`
4. The innermost middleware invokes the handler
`MessageTypeResolver` maps `MessageType` strings to .NET types. Registration is done via `agent.HandleMessage<TMessage, THandler>()`, which records both the type mapping and the DI registration for `THandler`.
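Conceptually the resolver is just a string-to-type dictionary. A rough sketch (assumed shape; the real class may differ):

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the MessageType → .NET type mapping — a sketch, not the real class.
public sealed class MessageTypeResolver
{
    private readonly Dictionary<string, Type> _map = new(StringComparer.OrdinalIgnoreCase);

    // Called by agent.HandleMessage<TMessage, THandler>() at registration time.
    public void Register<TMessage>() => _map[typeof(TMessage).Name] = typeof(TMessage);

    // Called by the pipeline when an envelope arrives.
    public Type? Resolve(string messageType) =>
        _map.TryGetValue(messageType, out var type) ? type : null;
}
```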
## Conversation memory

### FileConversationMemory (implements `IConversationMemory`)

Wraps `InMemoryConversationMemory` with file-backed persistence:

- Each session serializes to `{BasePath}/{sessionId}.json`
- On startup, sessions whose last turn falls within `SessionIdleTimeout` are reloaded — so recent conversations survive agent restarts
- A per-session `SemaphoreSlim` prevents concurrent write races on the same file
- If `IConversationLog` is registered, every turn is also appended to the conversation log for the dream preference-inference pass
Session lifecycle:

- The first message in a session creates the file
- Subsequent messages append turns and re-serialize
- `ClearAsync` removes both the in-memory state and the file
- Stale sessions (beyond `SessionIdleTimeout`) are not loaded on restart
## Feedback and session evaluation

### FileFeedbackStore (implements `IFeedbackStore`)

Appends `FeedbackEntry` records to per-session JSONL files:

```
{BasePath}/{sessionId}.jsonl
```

One JSON object per line. Per-session semaphores prevent concurrent write races.
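For illustration, a feedback file might look like this — one entry per line. The field names below are representative guesses, not the exact `FeedbackEntry` schema:

```json
{"signalType": "Correction", "timestamp": "2026-02-24T18:03:11Z", "detail": "User corrected the meeting date"}
{"signalType": "SessionSummary", "timestamp": "2026-02-24T18:20:02Z", "summary": "Scheduled a reminder and answered two calendar questions", "overallQuality": "good"}
```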
`QueryRecentAsync` scans all JSONL files to find entries since a given timestamp — used by the dream cycle to gather quality signals for memory consolidation and skill optimization.
### SessionSummaryService

Background hosted service that evaluates completed sessions:

- Polls on `FeedbackOptions.PollInterval` (default 5 minutes)
- Finds sessions whose last turn is older than `SessionIdleThreshold` (default 10 minutes) and that haven't already been evaluated this run
- Backs off if the LLM is busy (polls every 5 s until idle)
- Sends the full session transcript to the LLM with an evaluator directive
- Writes a `FeedbackEntry` with `SignalType = SessionSummary` containing:
  - `summary`: one-sentence description
  - `toolsWorkedWell`, `toolsFailedOrMissed`, `correctionsMade`
  - `overallQuality`: `excellent` / `good` / `fair` / `poor`
The evaluator directive is loaded from `session-evaluator.md` on the data volume, with a built-in fallback.

The dream cycle's skill optimization pass uses `poor` / `fair` quality scores, along with explicit `Correction` signals, to identify skills that need improvement.
## Conversation log

### FileConversationLog (implements `IConversationLog`)

Single-file JSONL log of all conversation turns across all sessions:

```
{BasePath}/turns.jsonl
```
A single semaphore serializes all writes. Used exclusively by the dream cycle:
- The preference-inference pass reads the full log to infer durable user preferences
- The skill gap detection pass reads it to find recurring patterns
- Both passes clear the log after processing to prevent unbounded growth
`IConversationLog` is opt-in — call `WithConversationLog()` explicitly in the host builder. `WithMemory()` does not register it.
## Three-tier LLM routing

### ModelTier

```csharp
public enum ModelTier { Low, Balanced, High }
```

Every LLM call is tagged with a tier. The `TieredChatClientRegistry` singleton holds one `IChatClient` per tier, and `LlmClient` selects the right one at call time.
| Tier | Intended use | Falls back to |
|---|---|---|
| `Low` | Short factual questions, trivial single-step tasks | `Balanced` |
| `Balanced` | Moderate-complexity requests, patrol tasks | — (required) |
| `High` | Deep analysis, dream consolidation, research | `Balanced` |
`Low` and `High` are optional in configuration; when absent they fall back to `Balanced`.
### KeywordTierSelector (implements `ILlmTierSelector`)

Scores prompts using a keyword + length heuristic — no embeddings, no external calls:

- Length score (0 – 0.40) — longer prompts tend to be more complex
- Keyword score (0 – 0.35) — high-signal words (`analyze`, `research`, `distributed`, …) increase the score; low-signal words (`what is`, `define`, `list the`, …) decrease it
- Structural score (0 – 0.25) — code blocks, math notation, multi-step markers

Scores at or below `lowCeiling` → `Low`; at or below `balancedCeiling` → `Balanced`; above → `High`.
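As a sketch, the selection might look like the following. The caps (0.40 / 0.35 / 0.25) match the documentation, but the exact formulas are an assumption; it reuses the `ModelTier` enum above:

```csharp
using System;

// Illustrative sketch of the length + keyword + structure heuristic.
// Score caps match the documentation; the formulas themselves are assumed.
public static class TierScoring
{
    public static ModelTier SelectTier(
        string prompt, double lowCeiling, double balancedCeiling,
        string[] highSignalKeywords, string[] lowSignalKeywords)
    {
        // Length score: saturates at 0.40 for long prompts.
        double length = Math.Min(0.40, prompt.Length / 2000.0 * 0.40);

        // Keyword score: high-signal words add, low-signal words subtract,
        // clamped to the documented 0–0.35 range.
        double keywords = 0;
        foreach (var k in highSignalKeywords)
            if (prompt.Contains(k, StringComparison.OrdinalIgnoreCase)) keywords += 0.10;
        foreach (var k in lowSignalKeywords)
            if (prompt.Contains(k, StringComparison.OrdinalIgnoreCase)) keywords -= 0.10;
        keywords = Math.Clamp(keywords, 0.0, 0.35);

        // Structural score: a crude proxy — code fences bump complexity.
        double structure = prompt.Contains("```") ? 0.25 : 0.0;

        double score = length + keywords + structure;
        return score <= lowCeiling      ? ModelTier.Low
             : score <= balancedCeiling ? ModelTier.Balanced
             :                            ModelTier.High;
    }
}
```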
The parameterless constructor always uses compiled-in defaults (used in tests). The DI constructor hot-reloads `{BasePath}/tier-selector.json` every 60 seconds, so thresholds and keyword lists can be tuned without a pod restart.
### tier-selector.json (hot-reloadable)

```json
{
  "version": 1,
  "notes": "2026-02-24: tightened balancedCeiling after dream review",
  "lowCeiling": 0.15,
  "balancedCeiling": 0.46,
  "highSignalKeywords": ["analyze", "research", "distributed", "..."],
  "lowSignalKeywords": ["what is", "define ", "list the", "..."]
}
```
All fields are optional — omitted fields fall back to compiled defaults.
### Dream self-correction pass

Each routing decision is appended to `tier-routing-log.jsonl` on the PVC (capped at 200 entries). The dream cycle's tier-routing review pass reads the log and — when it detects systematic mis-routing — rewrites `tier-selector.json` with corrected thresholds and keyword lists. The pass skips when fewer than 10 entries exist.
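A log entry presumably captures enough to audit each decision. A hypothetical line (field names are illustrative, not the actual schema):

```json
{"timestamp": "2026-02-24T18:03:11Z", "score": 0.31, "tier": "Balanced", "promptLength": 412, "matchedKeywords": ["analyze"]}
```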
## LLM client

### ILlmClient

```csharp
public interface ILlmClient
{
    bool IsIdle { get; }

    Task<ChatResponse> GetResponseAsync(
        IList<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken ct = default);

    Task<ChatResponse> GetResponseAsync(
        IList<ChatMessage> messages,
        ModelTier tier,
        ChatOptions? options = null,
        CancellationToken ct = default);
}
```
A serialized gateway around the underlying `IChatClient` from Microsoft.Extensions.AI. Enforces that only one LLM call is in flight at a time within the agent process:

- If a second call arrives while the first is running, it queues and waits
- `IsIdle` lets background services (dream cycle, session evaluator) back off while the user is waiting for a response
The tier-less overload defaults to `ModelTier.Balanced`. Calls log `tier=Balanced model=...` so routing decisions are visible in the pod logs.
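The one-in-flight guarantee is the classic `SemaphoreSlim(1, 1)` pattern. A minimal sketch of the idea (not the actual `LlmClient`):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Sketch of the serialized-gateway pattern: one call in flight, later callers queue.
public sealed class SerializedGateway
{
    private readonly SemaphoreSlim _gate = new(1, 1);

    // True when no call currently holds the gate — lets background
    // services (dream cycle, session evaluator) back off while a user waits.
    public bool IsIdle => _gate.CurrentCount > 0;

    public async Task<T> RunAsync<T>(Func<CancellationToken, Task<T>> call, CancellationToken ct)
    {
        await _gate.WaitAsync(ct);       // a second caller queues here
        try
        {
            return await call(ct);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```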
## AgentLoopRunner

`AgentLoopRunner` is the single entry point for all LLM tool-calling interactions in the agent process. Every message handler (`UserMessageHandler`, `ScheduledTaskHandler`, `SubagentRunner`, A2A handlers, etc.) calls `AgentLoopRunner.RunAsync` rather than `ILlmClient.GetResponseAsync` directly.

> **Invariant:** Never call `ILlmClient.GetResponseAsync` from a message handler to drive a tool-calling loop. Always go through `AgentLoopRunner.RunAsync`. Direct calls bypass reasoning scaffolding, completion evaluation, hallucination nudging, context overflow trimming, and metrics recording.
### What RunAsync does

- DateTime context injection — ensures the model knows the user's current date/time
- Reasoning scaffolding — injects a system message with the iteration budget and step-by-step planning encouragement
- Inner tool loop — dispatches to either the native path (`FunctionInvokingChatClient`) or the text-based parsing loop, depending on `ModelBehavior.UseTextBasedToolCalling`
- Completion evaluation — after the inner loop returns, a cheap `ModelTier.Low` LLM call evaluates whether the response actually completes the original user request. If incomplete, a continuation nudge is appended and the tool loop re-enters (up to `MaxCompletionReprompts` times, default 2). Evaluation is skipped on force-termination (consecutive timeouts) and fails open on any evaluator error.
- Proactive follow-up — after the completion evaluator says COMPLETE, a second `ModelTier.Low` call assesses whether there are high-value proactive actions the agent could take within the current context (e.g. looking up a contact mentioned in conversation, cross-referencing calendar events, connecting related information). If found, context is enriched with relevant skills/services and the tool loop runs one more pass. The follow-up response is appended to the original. Skipped for simple exchanges. Fails open on error.
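Putting the last two stages together, the outer control flow is roughly the following (C#-flavored pseudocode — every helper here is hypothetical, named only to mirror the steps above):

```csharp
// Pseudocode sketch of the RunAsync outer loop. All helpers are hypothetical.
var response = await RunToolLoopAsync(messages);            // inner tool loop

for (var attempt = 0; attempt < maxCompletionReprompts; attempt++)
{
    // Cheap ModelTier.Low check; skipped on force-termination, fails open on error.
    if (await IsCompleteAsync(request, response))
        break;
    messages.Add(ContinuationNudge(response));              // re-prompt and re-enter
    response = await RunToolLoopAsync(messages);
}

// Proactive follow-up: one extra enriched pass when the evaluator finds value.
if (await HasProactiveFollowUpAsync(request, response))
{
    EnrichContext(messages);                                // relevant skills/services
    response = Append(response, await RunToolLoopAsync(messages));
}
```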
### Completion evaluator configuration

| Setting | Location | Default | Purpose |
|---|---|---|---|
| `AgentHost:MaxCompletionReprompts` | `AgentHostOptions` | 2 | Max re-prompts (0 = disabled) |
| `MaxCompletionRepromptsOverride` | `ModelBehavior` | `null` (use host default) | Per-model override |
| `AgentHost:MaxFollowUpPasses` | `AgentHostOptions` | 1 | Max proactive follow-up passes (0 = disabled) |
| `MaxFollowUpPassesOverride` | `ModelBehavior` | `null` (use host default) | Per-model override |
### Diagnostics counters

| Counter | Fires when |
|---|---|
| `rockbot.agent.completion_check.complete` | Evaluator says task is done |
| `rockbot.agent.completion_check.incomplete` | Evaluator triggers a re-prompt |
| `rockbot.agent.completion_check.skipped` | Evaluation skipped (force termination) |
| `rockbot.agent.follow_up.triggered` | Follow-up evaluator found an opportunity |
| `rockbot.agent.follow_up.none` | Follow-up evaluator found nothing worth doing |
| `rockbot.agent.follow_up.skipped` | Follow-up evaluation skipped (disabled, force termination) |
## Per-model behaviors

Model-specific behavioral overrides are loaded from `model-behaviors/{model-prefix}/` on the data volume. The model prefix is matched case-insensitively against the deployed model ID.
| File | Applied at |
|---|---|
| `additional-system-prompt.md` | Appended to every system prompt (guardrails, output constraints) |
| `pre-tool-loop-prompt.md` | Injected before each tool-calling iteration |
Additional properties are configurable in `appsettings.json` under `ModelBehaviors:Models:{prefix}`:
| Property | Type | Default | Purpose |
|---|---|---|---|
| `NudgeOnHallucinatedToolCalls` | `bool` | `false` | Inject a nudge when the model describes tool actions without emitting calls |
| `NudgeOnLeakedToolSyntax` | `bool` | `false` | Retry when the model leaks tool-calling scaffolding (`to=multi_tool_use.parallel`, `to=functions.X`) into its text output. Language-agnostic; targets a documented OpenAI GPT-family failure mode. Safe to enable for any deployment |
| `NudgeOnUnexpectedCjkOutput` | `bool` | `false` | Retry when the model emits 3+ consecutive CJK codepoints — a heuristic for English-primary deployments where CJK output correlates with training-data contamination. Leave off for agents that legitimately respond in Chinese or Japanese |
| `NudgeOnToolFailureGiveup` | `bool?` | `true` | Retry once when the model gives up after a tool returned an error (matches phrasings like "tool failure", "errored on both", "from the current tool state") instead of retrying the tool itself. General-purpose; on by default. Set to `false` to opt out for a specific model |
| `MaxToolIterationsOverride` | `int?` | `null` (uses `AgentHost:MaxToolIterations`) | Override the per-request tool-loop iteration cap |
| `ToolResultChunkingThreshold` | `int?` | `null` (uses 64 000) | Char count above which tool results are chunked into working memory instead of appended inline |
| `ScheduledTaskResultMode` | enum | `Summarize` | How scheduled task output is presented (`Summarize`, `VerbatimOutput`, `SummarizeWithOutput`) |
| `MaxCompletionRepromptsOverride` | `int?` | `null` (uses `AgentHost:MaxCompletionReprompts`) | Override the per-request completion-evaluator re-prompt cap |
| `MaxFollowUpPassesOverride` | `int?` | `null` (uses `AgentHost:MaxFollowUpPasses`) | Override the per-request proactive follow-up pass cap |
Example — lowering the chunking threshold for a small-context model:

```json
{
  "ModelBehaviors": {
    "Models": {
      "openrouter/deepseek": {
        "ToolResultChunkingThreshold": 32000
      }
    }
  }
}
```
See Tool result chunking for full details.
## Agent host builder

`AgentHostBuilder` is the fluent configuration API. Access it via `AddRockBotHost`:

```csharp
services.AddRockBotHost(agent =>
{
    agent.WithIdentity("rockbot");
    agent.WithProfile();            // Load soul.md, directives.md, etc. from data volume
    agent.WithRules();              // Load agent rules from rules/ directory
    agent.WithMemory();             // Conversation + long-term + working memory
    agent.WithConversationLog();    // Opt-in: enables dream gap detection + pref inference
    agent.WithFeedback();           // IFeedbackStore + SessionSummaryService
    agent.WithSkills();             // ISkillStore + ISkillUsageStore + StarterSkillService
    agent.WithDreaming(opts =>
    {
        opts.InitialDelay = TimeSpan.FromMinutes(5);
        opts.Interval = TimeSpan.FromHours(4);
    });

    // Message handlers
    agent.HandleMessage<UserMessage, UserMessageHandler>();
    agent.HandleMessage<ConversationHistoryRequest, ConversationHistoryRequestHandler>();
    agent.HandleMessage<ScheduledTaskMessage, ScheduledTaskHandler>();

    // Tool subsystems
    agent.AddToolHandler();             // Tool invocation dispatch
    agent.AddMcpToolProxy();            // MCP server bridge
    agent.AddWebTools(opts => { ... }); // Web search + browse
    agent.AddSchedulingTools();         // Scheduled task tools
    agent.AddRemoteScriptRunner();      // Script execution via Scripts Manager

    // Subscriptions
    agent.SubscribeTo(UserProxyTopics.UserMessage);
    agent.SubscribeTo(UserProxyTopics.ConversationHistoryRequest);

    // Optional middleware
    agent.UseMiddleware<LoggingMiddleware>();
    agent.UseMiddleware<TracingMiddleware>();
    agent.UseMiddleware<ErrorHandlingMiddleware>();
});
```
### Extension method reference

| Method | Registers |
|---|---|
| `WithIdentity(name)` | `AgentIdentity` |
| `WithProfile()` | `IAgentProfileProvider`, `AgentProfile`, `ISystemPromptBuilder` |
| `WithRules()` | `IRulesStore`, rules tools |
| `WithConversationMemory()` | `IConversationMemory` (file-backed + in-memory) |
| `WithLongTermMemory()` | `ILongTermMemory` (`FileMemoryStore`) |
| `WithWorkingMemory()` | `IWorkingMemory` (global, path-namespaced; `HybridCacheWorkingMemory` + `FileWorkingMemory`) |
| `WithMemory()` | All three memory tiers above |
| `WithConversationLog()` | `IConversationLog` (`FileConversationLog`) |
| `WithFeedback()` | `IFeedbackStore` + `SessionSummaryService` |
| `WithSkills()` | `ISkillStore` + `ISkillUsageStore` + `StarterSkillService` |
| `WithDreaming()` | `DreamService` (`IHostedService`) |
## Agent data volume layout

All persistent agent state lives under a single base path (default `/data/agent` in production, configurable via `AgentProfileOptions.BasePath`):

```
/data/agent/
├── soul.md                     # Core identity and personality
├── directives.md               # Operational instructions
├── style.md                    # (optional) Voice and tone
├── memory-rules.md             # (optional) Memory formation rules
├── dream.md                    # Dream: memory consolidation prompt
├── skill-dream.md              # Dream: skill consolidation prompt
├── skill-optimize.md           # Dream: skill optimization prompt
├── skill-gap.md                # Dream: skill gap detection prompt
├── pref-dream.md               # Dream: preference inference prompt
├── tier-routing-directive.md   # (optional) Dream: tier routing review prompt override
├── session-evaluator.md        # Session quality evaluation prompt
├── tier-selector.json          # (optional) Hot-reloadable tier routing config
├── tier-routing-log.jsonl      # Routing decision log (auto-managed, capped at 200 entries)
├── mcp.json                    # MCP server connection configuration
├── rules/                      # Agent rules (markdown files)
├── model-behaviors/            # Per-model prompt overrides
│   └── {model-prefix}/
│       ├── additional-system-prompt.md
│       └── pre-tool-loop-prompt.md
├── memory/                     # Long-term memory entries
│   └── {category}/
│       └── {id}.json
├── skills/                     # Learned skills
│   └── {name}.json             # (may be nested: skills/mcp/email.json)
├── skill-usage/                # Skill invocation event log
│   └── {sessionId}.jsonl
├── feedback/                   # Session quality signals
│   └── {sessionId}.jsonl
├── conversations/              # Persisted conversation sessions
│   └── {sessionId}.json
├── working-memory/             # Working memory persistence (path-namespaced, TTL-based)
│   ├── session.json            # Entries for all user sessions (session/{id}/...)
│   ├── patrol.json             # Entries for patrol tasks (patrol/{name}/...)
│   └── subagent.json           # Entries for subagents (subagent/{taskId}/...)
└── conversation-log/           # Aggregated turns for dream passes
    └── turns.jsonl
```
## Startup sequence

When the agent process starts:

1. `AgentHostBuilder.Build()` registers all services with the DI container
2. `IHostedService` implementations start in registration order:
   - `StarterSkillService` — seeds starter skills from registered `IToolSkillProvider`s
   - `McpBridgeService` — connects to configured MCP servers
   - `FileConversationMemory` — reloads sessions within `SessionIdleTimeout`
   - `SessionSummaryService` — begins polling for sessions to evaluate
   - `DreamService` — schedules the first dream cycle after `InitialDelay`
   - `AgentHostService` — subscribes to configured topics and begins processing messages
3. The agent is now ready to receive messages
## Configuration reference

Key configuration sections (from `appsettings.json` or environment variables):
```json
{
  "AgentProfile": {
    "BasePath": "/data/agent"
  },
  "AgentHost": {
    "MaxToolIterations": 50,
    "MaxCompletionReprompts": 2,
    "MaxFollowUpPasses": 1
  },
  "RabbitMq": {
    "HostName": "rabbitmq.cluster.local",
    "Port": 5672,
    "UserName": "rockbot",
    "Password": "..."
  },
  "LLM": {
    "Balanced": {
      "Endpoint": "https://openrouter.ai/api/v1",
      "ApiKey": "...",
      "ModelId": "anthropic/claude-haiku-4.5"
    },
    "Low": {
      "Endpoint": "https://openrouter.ai/api/v1",
      "ApiKey": "...",
      "ModelId": "google/gemini-flash-1.5-8b"
    },
    "High": {
      "Endpoint": "https://openrouter.ai/api/v1",
      "ApiKey": "...",
      "ModelId": "anthropic/claude-opus-4-6"
    }
  },
  "Memory": {
    "BasePath": "memory"
  },
  "Skills": {
    "BasePath": "skills",
    "UsageBasePath": "skill-usage"
  },
  "Dream": {
    "Enabled": true,
    "InitialDelay": "00:05:00",
    "Interval": "04:00:00",
    "TierRoutingReviewEnabled": true
  },
  "Feedback": {
    "BasePath": "feedback",
    "SessionIdleThreshold": "00:10:00",
    "PollInterval": "00:05:00"
  }
}
```
`LLM.Balanced` is required. `LLM.Low` and `LLM.High` are optional — when absent they fall back to `Balanced`. The flat legacy keys `LLM.Endpoint`, `LLM.ApiKey`, `LLM.ModelId` are still accepted for backward compatibility and are treated as `Balanced`.
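For example, this legacy flat form is still accepted and behaves exactly as if the same values had been written under `LLM.Balanced`:

```json
{
  "LLM": {
    "Endpoint": "https://openrouter.ai/api/v1",
    "ApiKey": "...",
    "ModelId": "anthropic/claude-haiku-4.5"
  }
}
```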