AI Requests
A unified view of every LLM call Codebolt has made: which agent, which model, how many tokens, how much it cost, and how long it took. Use it for debugging, cost tracking, and spotting performance issues.
The underlying data is the event log (type == llm.chat). Every surface is just a different lens on it.
Viewing AI requests
- Desktop
- CLI
- HTTP API
Open Settings → AI Requests, or click the status-bar token counter. The table is sortable, with a filter sidebar; click any row to see the full request/response payload.
codebolt events query "type == llm.chat" --since "1 hour ago"
codebolt events query "type == llm.chat and agent == 'reviewer'" --since "today"
codebolt provider usage --since "7 days ago" --by agent
GET /api/events?type=llm.chat&since=1h
GET /api/usage?since=7d&group_by=agent
What's shown
Time      Agent       Model               Tokens       Cost     Duration   Run
--------- ----------- ------------------- ------------ -------- ---------- --------
14:23:05 generalist claude-sonnet-4-6 4.2k / 850 $0.018 2.1s run_abc
14:23:10 generalist claude-sonnet-4-6 5.1k / 120 $0.017 1.4s run_abc
14:23:15 reviewer gpt-5 3.8k / 340 $0.021 2.8s run_def
Each row shows:
- Time — when the call was made.
- Agent — which agent requested it.
- Model — which model answered.
- Tokens — input / output.
- Cost — computed from per-token rates.
- Duration — wall time from request to response.
- Run — the agent run ID; click through to the full trace.
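The cost column is derived from the token counts and per-token rates. A minimal sketch of that calculation, assuming a rate table keyed by model (the rates below are illustrative placeholders, not Codebolt's actual pricing):

```python
# USD per 1M tokens: (input rate, output rate).
# Placeholder values for illustration only.
RATES = {
    "claude-sonnet-4-6": (3.00, 15.00),
    "gpt-5": (2.50, 10.00),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single LLM call, from per-token rates."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

print(f"${call_cost('claude-sonnet-4-6', 4200, 850):.3f}")
```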
Filtering
Filter by:
- Agent
- Model
- Provider
- Date range
- Run
- Cost threshold (> $0.10)
- Duration threshold (> 5s)
Useful for finding slow or expensive calls.
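The same threshold filters can be applied offline to an export. A sketch, assuming each exported event carries `cost` (USD) and `duration` (seconds) fields mirroring the table columns; the actual export schema may differ:

```python
def expensive_or_slow(events, cost_threshold=0.10, duration_threshold=5.0):
    """Return events that exceed either a cost or a duration threshold."""
    return [
        e for e in events
        if e.get("cost", 0) > cost_threshold
        or e.get("duration", 0) > duration_threshold
    ]

events = [
    {"agent": "generalist", "cost": 0.018, "duration": 2.1},
    {"agent": "reviewer",   "cost": 0.210, "duration": 2.8},
    {"agent": "planner",    "cost": 0.040, "duration": 31.0},
]
# reviewer trips the cost threshold; planner trips the duration threshold
print([e["agent"] for e in expensive_or_slow(events)])
```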
Aggregate view
Toggle from "list" to "aggregate" to see totals:
- Cost per agent (which agents are expensive?)
- Cost per model (which models dominate spend?)
- Cost per day / week / month
- Request count per agent
- Average duration per agent
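The aggregate rollups above amount to a group-by over the call records. A sketch of the same computation against an export, assuming `agent`, `cost`, and `duration` fields that mirror the table columns (the real schema may differ); passing a different `key` such as a project field would give per-project grouping:

```python
from collections import defaultdict

def aggregate(events, key="agent"):
    """Total cost, request count, and average duration per group."""
    totals = defaultdict(lambda: {"cost": 0.0, "count": 0, "duration": 0.0})
    for e in events:
        g = totals[e[key]]
        g["cost"] += e["cost"]
        g["count"] += 1
        g["duration"] += e["duration"]
    return {
        k: {"cost": v["cost"], "count": v["count"],
            "avg_duration": v["duration"] / v["count"]}
        for k, v in totals.items()
    }

events = [
    {"agent": "generalist", "cost": 0.018, "duration": 2.1},
    {"agent": "generalist", "cost": 0.017, "duration": 1.4},
    {"agent": "reviewer",   "cost": 0.021, "duration": 2.8},
]
print(aggregate(events)["generalist"])
```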
Per-project cost tracking
Costs are tagged with the project the run was in. The aggregate view can group by project for chargeback or attribution.
Spotting issues
Patterns to watch for:
- An agent making many short calls — should it be batching?
- Very long single calls (>30s) — usually a heavily context-loaded planner; check if the context can be trimmed.
- High output token counts — the agent is generating long responses; consider tightening prompts.
- Recent cost spike — a specific agent started burning more tokens; diff recent changes to that agent's config.
- Failed requests — rate limits, auth errors. Check provider health.
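A cost spike can be detected mechanically by comparing the most recent day against a trailing baseline. A minimal sketch over per-day totals (such as the aggregate view's cost-per-day numbers); the window and factor are illustrative, not Codebolt defaults:

```python
def daily_cost_spike(daily_costs, window=3, factor=2.0):
    """Flag a spike when the latest day's spend exceeds `factor` times
    the average of the preceding `window` days."""
    if len(daily_costs) < window + 1:
        return False  # not enough history for a baseline
    baseline = sum(daily_costs[-window - 1:-1]) / window
    return daily_costs[-1] > factor * baseline

# 4.0 is well over twice the ~1.03 baseline of the prior three days
print(daily_cost_spike([1.0, 1.2, 0.9, 4.0]))
```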
Exporting
- CLI
- Desktop
- HTTP API
codebolt events query "type == llm.chat" --json > llm-calls.json
Settings → AI Requests → Export button. CSV or JSON. Honours the active filters.
GET /api/events?type=llm.chat&format=jsonl # streaming JSONL
GET /api/events?type=llm.chat&format=csv
Full data for all LLM calls (subject to retention). Pipe into your analytics tool of choice.
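The streaming JSONL export is one JSON object per line, which is convenient to parse incrementally. A sketch, assuming a `cost` field on each record (the actual export schema may differ):

```python
import json
from io import StringIO

def load_jsonl(stream):
    """Parse a JSONL export (format=jsonl) line by line."""
    return [json.loads(line) for line in stream if line.strip()]

# Stand-in for the HTTP response body or a saved export file
sample = StringIO(
    '{"agent": "reviewer", "cost": 0.021}\n'
    '{"agent": "generalist", "cost": 0.018}\n'
)
calls = load_jsonl(sample)
print(f"{len(calls)} calls, ${sum(c['cost'] for c in calls):.3f} total")
```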
Relationship to the event log
AI Requests is a filtered view of the event log where type == llm.chat. The event log is authoritative; AI Requests is the convenience UI. Anything in AI Requests is also queryable via codebolt events query.