AI Requests

A unified view of every LLM call Codebolt has made: which agent, which model, how many tokens, how much it cost, how long it took. Used for debugging, cost tracking, and spotting performance issues.

The underlying data is the event log (type == llm.chat). Every surface is just a different lens on it.

Viewing AI requests

Desktop
CLI
HTTP API

Settings → AI Requests or click the status-bar token counter. Sortable table, filter sidebar, click any row for full payload.

codebolt events query "type == llm.chat" --since "1 hour ago"
codebolt events query "type == llm.chat and agent == 'reviewer'" --since "today"
codebolt provider usage --since "7 days ago" --by agent

GET /api/events?type=llm.chat&since=1h
GET /api/usage?since=7d&group_by=agent

What's shown

Time      Agent       Model               Tokens       Cost     Duration   Run
--------- ----------- ------------------- ------------ -------- ---------- --------
14:23:05  generalist  claude-sonnet-4-6   4.2k / 850   $0.018   2.1s       run_abc
14:23:10  generalist  claude-sonnet-4-6   5.1k / 120   $0.017   1.4s       run_abc
14:23:15  reviewer    gpt-5               3.8k / 340   $0.021   2.8s       run_def

Each row:

Time — when the call was made.
Agent — which agent requested it.
Model — which model answered.
Tokens — input / output.
Cost — computed from per-token rates.
Duration — wall time from request to response.
Run — the agent run ID; click through to the full trace.

Filtering

Filter by:

Agent
Model
Provider
Date range
Run
Cost threshold (> $0.10)
Duration threshold (> 5s)

Useful for finding the slow or expensive calls.

Aggregate view

Toggle from "list" to "aggregate" to see totals:

Cost per agent (which agents are expensive?)
Cost per model (which models dominate spend?)
Cost per day / week / month
Request count per agent
Average duration per agent

Per-project cost tracking

Costs are tagged with the project the run was in. The aggregate view can group by project for chargeback or attribution.

Spotting issues

Patterns to watch for:

An agent making many short calls — should it be batching?
Very long single calls (>30s) — usually a heavily context-loaded planner; check if the context can be trimmed.
High output token counts — the agent is generating long responses; consider tightening prompts.
Recent cost spike — a specific agent started burning more tokens; diff recent changes to that agent's config.
Failed requests — rate limits, auth errors. Check provider health.

Exporting

CLI
Desktop
HTTP API

codebolt events query "type == llm.chat" --json > llm-calls.json

GET /api/events?type=llm.chat&format=jsonl    # streaming JSONL
GET /api/events?type=llm.chat&format=csv

Full data for all LLM calls (subject to retention). Pipe into your analytics tool of choice.

Relationship to the event log

AI Requests is a filtered view of the event log where type == llm.chat. The event log is authoritative; AI Requests is the convenience UI. Anything in AI Requests is also queryable via codebolt events query.

Viewing AI requests​

What's shown​

Filtering​

Aggregate view​

Per-project cost tracking​

Spotting issues​

Exporting​

Relationship to the event log​

See also​