
Optimization Loop

The optimization loop automatically improves a subject (agent, skill, etc.) by having an optimizer agent read the source code, make targeted changes, and re-evaluate until the target score is reached or the maximum number of iterations is used up.

How It Works

One Optimization Run

  1. Baseline + Eval Set: start from the current agent and the benchmark you care about. The baseline score is always measured first.
  2. Proposer: generate candidate changes along one axis (prompt rewrite, model sweep, grid, capability toggle).
  3. Variant Batch: each variant is a precise diff from baseline, with n variants within budget.
  4. Evaluate Every Variant: run each candidate against the same fixtures and the same eval set, and compute comparable metrics (overall score, cost and latency, traces and tool behavior).
  5. Selector: rank variants by the metric, respecting confidence intervals.
  6. Promote Winner: keep the best variant as the new candidate, but promote only if it truly beats baseline.
  7. Iterate Or Stop: continue on a new axis or stop on plateau or regression (budget, max-iterations, stop-on-regression).

Per Iteration

  1. Duplicate — the subject's folder is copied to an isolated path.
  2. Modify — the optimizer agent receives the current score, eval output, evaluator reasoning, and previous iteration history. It reads the source code, decides what to change, and makes one modification.
  3. Evaluate — the modified copy is re-run against the same task and evaluators.
  4. Decide — the strategy determines whether to keep or discard the change.
  5. Cleanup — discarded copies are deleted immediately to save disk.
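
The five steps above can be sketched as one function. This is only an illustration under stated assumptions, not the product's implementation: `run_iteration` and the `modify`, `evaluate`, and `decide` callables are hypothetical stand-ins for the optimizer agent, the evaluators, and the strategy.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Optional, Tuple


def run_iteration(
    subject_dir: Path,
    current_score: float,
    modify: Callable[[Path], object],        # hypothetical: optimizer agent edits the copy
    evaluate: Callable[[Path], float],       # hypothetical: runs the task and evaluators
    decide: Callable[[float, float], bool],  # hypothetical: strategy, (old, new) -> keep?
) -> Tuple[Optional[Path], float]:
    """Run one duplicate / modify / evaluate / decide / cleanup cycle."""
    # 1. Duplicate: copy the subject's folder to an isolated path.
    work_root = Path(tempfile.mkdtemp(prefix="opt-iter-"))
    candidate = work_root / subject_dir.name
    shutil.copytree(subject_dir, candidate)

    # 2. Modify: the optimizer makes one modification to the copy.
    modify(candidate)

    # 3. Evaluate: score the modified copy on the same task.
    new_score = evaluate(candidate)

    # 4. Decide: the strategy keeps or discards the change.
    if decide(current_score, new_score):
        return candidate, new_score

    # 5. Cleanup: a discarded copy is deleted immediately.
    shutil.rmtree(work_root)
    return None, current_score
```

The original folder is never touched, so a discarded iteration leaves no trace beyond its entry in the history.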

Optimizer Types

Currently, only the agent optimizer is implemented:

Agent Optimizer

An optimizer agent (which you select) receives:

  • The subject's source code path
  • The task definition and instruction
  • Current score vs target score
  • Previous iteration history (what was tried, what worked)
  • Most recent eval output (first 2000 characters)
  • Evaluator scores and reasoning

The optimizer uses file tools to read and modify the subject's source, then outputs a JSON summary describing what it changed and why.
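
As a rough picture of what the optimizer agent is handed, the inputs listed above could be bundled like this. The class and every field name are hypothetical and chosen for illustration; only the 2000-character cut on the eval output comes from the description above.

```python
from dataclasses import dataclass, field

EVAL_OUTPUT_LIMIT = 2000  # only the first 2000 characters of eval output are passed on


@dataclass
class OptimizerContext:
    # Field names are illustrative, not the product's actual schema.
    source_path: str
    task_instruction: str
    current_score: float
    target_score: float
    history: list = field(default_factory=list)             # what was tried, what worked
    eval_output: str = ""                                   # most recent eval output
    evaluator_feedback: list = field(default_factory=list)  # scores and reasoning

    def __post_init__(self) -> None:
        # Truncate so a long eval log cannot crowd out the rest of the context.
        self.eval_output = self.eval_output[:EVAL_OUTPUT_LIMIT]
```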

Optimization Targets

The optimizer can modify different aspects of the subject:

Target          What it changes
instructions    The agent's system instructions/prompt
prompts         Prompt templates
tools           Tool configurations
config          Agent configuration
code            Source code logic

You select which targets are allowed when configuring optimization on a task.

Strategies

The strategy determines whether to keep each iteration's changes:

Greedy

Keep the change if:

  • Score improved over the baseline, OR
  • The optimizer successfully made a change and the score didn't decrease.

This is the most common strategy. Because a change that lowers the score is always discarded, the kept score never moves backward.
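
The two greedy conditions can be written as a small predicate. This is a sketch: the function name and parameters are hypothetical, and `reference_score` stands for whichever score the change is compared against.

```python
def greedy_keep(reference_score: float, new_score: float, change_made: bool) -> bool:
    """Keep on improvement, or on a successful change that did not lower the score."""
    improved = new_score > reference_score
    held_steady = change_made and new_score >= reference_score
    return improved or held_steady
```

A change that leaves the score exactly where it was is still kept, as long as the optimizer actually modified something.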

Best-of-N

Each iteration starts from the original source (not the previous iteration). The system tracks the best score globally and applies the best iteration at the end.

Use this when you want to explore diverse changes without sequential dependency.
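
A minimal sketch of the Best-of-N idea, assuming each proposal is a function applied to the original subject and that higher scores are better. `best_of_n`, `proposals`, and `evaluate` are illustrative names, not part of the product.

```python
from typing import Callable, Sequence, Tuple, TypeVar

S = TypeVar("S")


def best_of_n(
    original: S,
    baseline_score: float,
    proposals: Sequence[Callable[[S], S]],  # hypothetical: one candidate change each
    evaluate: Callable[[S], float],         # hypothetical: returns the eval score
) -> Tuple[S, float]:
    """Try every proposal against the original and return the best result seen."""
    best, best_score = original, baseline_score
    for propose in proposals:
        candidate = propose(original)  # always start from the original, never from `best`
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

Because no candidate builds on another, the iterations are independent of each other and their order does not affect which candidate wins, apart from ties.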

Annealing

Probabilistic acceptance: worse scores can be accepted early (when "temperature" is high), but acceptance becomes stricter as iterations progress.

Use this to escape local optima — it allows temporary regressions to find better solutions.
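
The acceptance rule and cooling schedule are not specified here, so the following sketch substitutes the textbook Metropolis criterion with geometric cooling purely to illustrate the behavior. The function names, the start and end temperatures, and the schedule are all assumptions.

```python
import math
import random


def temperature(iteration: int, max_iterations: int,
                t_start: float = 1.0, t_end: float = 0.01) -> float:
    """Geometric cooling from t_start (first iteration) to t_end (last iteration)."""
    if max_iterations <= 1:
        return t_end
    frac = iteration / (max_iterations - 1)
    return t_start * (t_end / t_start) ** frac


def anneal_accept(previous_score: float, new_score: float,
                  temp: float, rng: random.Random) -> bool:
    """Always accept non-worse scores; accept worse ones with probability exp(delta / temp)."""
    if new_score >= previous_score:
        return True
    delta = new_score - previous_score  # negative for a regression
    return rng.random() < math.exp(delta / temp)
```

With these assumed defaults, a drop of 0.1 in score is accepted about 90% of the time at temperature 1.0 and almost never at temperature 0.01.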

Configuration

When enabling optimization on a task:

Field                   Description
optimizerType           agent (select an optimizer agent)
optimizerAgentId        The agent that performs optimization
targets                 What the optimizer can modify
maxIterations           Maximum optimization iterations
targetScore             Stop when this score is reached
improvementThreshold    Minimum improvement to consider meaningful
strategy                greedy, best-of-n, or annealing
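
For orientation, a filled-in configuration might look like the dictionary below. The field names are the ones listed above; every value (the agent ID, the numbers, and the assumption that scores fall between 0 and 1) is made up for the example, and `validate_config` is a hypothetical helper rather than part of the product.

```python
ALLOWED_STRATEGIES = {"greedy", "best-of-n", "annealing"}
ALLOWED_TARGETS = {"instructions", "prompts", "tools", "config", "code"}

example_config = {
    "optimizerType": "agent",
    "optimizerAgentId": "my-optimizer-agent",  # hypothetical agent ID
    "targets": ["instructions", "prompts"],
    "maxIterations": 5,                        # illustrative value
    "targetScore": 0.9,                        # assumes a 0-1 score scale
    "improvementThreshold": 0.01,              # illustrative value
    "strategy": "greedy",
}


def validate_config(config: dict) -> list:
    """Return a list of problems; an empty list means the config looks consistent."""
    problems = []
    if config.get("optimizerType") != "agent":
        problems.append("optimizerType must be 'agent'")
    if not config.get("optimizerAgentId"):
        problems.append("optimizerAgentId is required")
    unknown = sorted(set(config.get("targets", [])) - ALLOWED_TARGETS)
    if unknown:
        problems.append(f"unknown targets: {unknown}")
    if config.get("strategy") not in ALLOWED_STRATEGIES:
        problems.append(f"unknown strategy: {config.get('strategy')!r}")
    if int(config.get("maxIterations", 0)) < 1:
        problems.append("maxIterations must be at least 1")
    return problems
```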

Optimization Results

Each iteration produces:

  • Modification — what target was changed, description, optional diff, files modified
  • Score — eval score after the change
  • Improvement — score delta from previous iteration
  • Kept — whether the strategy accepted this iteration
  • Optimizer reasoning — why the optimizer chose this change

The UI shows a timeline of all iterations with scores and decisions.
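
To make the shape of an iteration record concrete, here is one way it could be modeled and printed as a text timeline. The class, its field names, and the sample values in the usage note are hypothetical; only the list of items each iteration produces comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IterationResult:
    # Field names are illustrative, not the product's actual schema.
    target: str            # which target was changed, e.g. "instructions"
    description: str       # what was changed
    score: float           # eval score after the change
    improvement: float     # score delta from the previous iteration
    kept: bool             # whether the strategy accepted this iteration
    reasoning: str         # why the optimizer chose this change
    files_modified: List[str] = field(default_factory=list)
    diff: Optional[str] = None  # the diff is optional


def timeline(results: List[IterationResult]) -> str:
    """Render one line per iteration: target, score, delta, and decision."""
    lines = []
    for number, result in enumerate(results, start=1):
        status = "kept" if result.kept else "discarded"
        lines.append(
            f"#{number} {result.target}: score={result.score:.2f} "
            f"({result.improvement:+.2f}) {status}"
        )
    return "\n".join(lines)
```

For example, a kept instructions change scoring 0.60 with a delta of 0.10 renders as `#1 instructions: score=0.60 (+0.10) kept`.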

Output

When optimization completes, the best modified copy is saved at:

.codebolt/agents/{name}-opt-{tag}/

The subject's config is updated so that optimizedAgentPath points to this copy.
