Context Assembly

The context assembly subsystem is the librarian of the agent loop. Every time the LLM is about to be called, it assembles a prompt from a dozen different sources and decides what actually makes it into the context window.

Source code: controllers/{contextAssembly,contextRuleEngine}, services/contextAssemblyService, services/contextRuleEngineService, services/contextRuleEngineDataService.

The problem it solves

A naive agent dumps everything into the prompt: system message, full history, all open files, all tools, recent tool results, relevant memory, project rules. This hits three walls immediately:

Budget. Context windows are finite and tokens cost money.
Relevance. The LLM's attention degrades as the prompt grows; more isn't better.
Staleness. Files change, memory updates, the previous step's "relevant" isn't the current step's "relevant".

contextAssemblyService exists to solve all three deterministically, so agent authors don't each reinvent it (badly).

What it assembles

For a single step, the assembler gathers:

Source	Provided by	Typical share of budget
System prompt	Agent config	fixed
Active rules	`contextRuleEngineService`	small, high priority
Recent turns	`episodicMemoryDataService`	medium, sliding window
Relevant long-term memory	`persistentMemoryDataService`, query-filtered	medium
Knowledge graph traversal	`kgDataService`, entity-seeded	medium
Vector hits	`vectordbService`, top-k semantic	medium
Narrative threads	`narrativeService`, active-thread filter	small
Open files / live project state	`projectStructureService`, `fileReadService`	small-medium
Tool schemas	`toolService`, filtered by relevance	varies
Previous tool results	Working memory	small

Each source is budget-capped independently, then the full prompt is budget-capped as a whole, then the assembler falls back to compression/truncation strategies if still over.

The rule engine

contextRuleEngineService is how projects customise what ends up in the prompt without writing code. A rule looks like:

when:
  task_contains: ["auth", "login", "session"]
then:
  always_include:
    - path: "docs/security-decisions.md"
  boost:
    - kg_entity: "AuthService"
      weight: 2.0
  exclude:
    - path: "generated/**"

Rules are evaluated every step. This is the reason two agents working on the same project can share institutional context without each being hand-coded for it.

Compression and truncation strategies

When the budget is tight (common), the assembler applies strategies in order:

Drop the lowest-relevance hits from each source.
Compress older turns (ConversationCompactorModifier from the processor pipeline).
Summarise long tool outputs.
Hard-truncate as a last resort, with a clear marker so the LLM knows data was cut.

Every decision is recorded so you can replay an assembly and see exactly what was included, what was dropped, and why.

How it plugs into the agent loop

agent step begins
   │
   ▼
contextAssemblyService.build({ task, history, budget })
   │
   ├── rule engine fires
   ├── each source queried + budget-capped
   ├── merged, deduped, compressed
   │
   ▼
LLM request with assembled messages

llmService never calls the assembler itself — it only receives the finished message list. This separation is what lets different agents use different assembly strategies while still using the same LLM path.

The problem it solves​

What it assembles​

The rule engine​

Compression and truncation strategies​

How it plugs into the agent loop​

See also​

The problem it solves

What it assembles

The rule engine

Compression and truncation strategies

How it plugs into the agent loop

See also