Custom LLM Providers

Add any LLM backend — a cloud API, a local model, a fine-tuned endpoint, or a provider Codebolt doesn't ship with — and make it available as a first-class model choice across the entire application.

Once registered, your provider's models appear in the model picker everywhere: agent manifests, the chat panel, and optimization runs.

When to Write a Custom LLM Provider

  • You're running a local model (Ollama, llama.cpp, vLLM) and want agents to use it.
  • Your organisation has a private Azure OpenAI deployment or an on-prem API endpoint.
  • You've fine-tuned a model and want it selectable alongside built-in models.
  • You need a provider with non-standard auth, routing, or rate-limiting logic.

How It Works

A custom LLM provider is a plugin with type: "llmProvider" that registers itself on startup and handles inference requests from the application.

┌──────────────────────────┐
│   Codebolt Application   │
│   (Agent, Chat, etc.)    │
│                          │
│  "Use model X for this"  │
└────────────┬─────────────┘
             │ inference request
             │
┌────────────┴─────────────┐
│    Your LLM Provider     │
│          Plugin          │
│                          │
│ 1. Receive request       │
│ 2. Call your API         │
│ 3. Stream/send response  │
└──────────────────────────┘

Lifecycle

  1. Startup — plugin.onStart() fires. Your plugin calls llmProvider.register() with provider metadata and a model list.
  2. Request handling — The application sends inference requests. Your plugin handles them via llmProvider.onCompletionRequest() (non-streaming) and llmProvider.onStreamRequest() (streaming).
  3. Response — Your plugin sends responses back via llmProvider.sendReply(), llmProvider.sendChunk(), or llmProvider.sendError().
  4. Shutdown — plugin.onStop() fires. Your plugin calls llmProvider.unregister().
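
Put together, a provider plugin's entry point looks roughly like the sketch below. The import path is an assumption for illustration; the hooks and llmProvider calls mirror the four lifecycle steps above, and the full registration and handler details follow in the next sections.

// Minimal lifecycle sketch. The '@codebolt/plugin-sdk' import path is an
// assumption; the hooks and llmProvider calls follow the steps above.
import { plugin, llmProvider } from '@codebolt/plugin-sdk';

plugin.onStart(async () => {
  // 1. Register the provider and its models on startup.
  await llmProvider.register({
    providerId: 'my-provider',
    name: 'My LLM Provider',
    capabilities: ['chat', 'streaming'],
    requiresKey: false,
    models: [{ id: 'my-model-v1', name: 'My Model v1' }],
  });

  // 2./3. Attach handlers that call your API and send responses back
  //       (see "Request Handlers" and "Response Methods" below).
  llmProvider.onCompletionRequest(async (req) => {
    /* call your API, then llmProvider.sendReply(req.requestId, ...) */
  });
  llmProvider.onStreamRequest(async (req) => {
    /* call your API, then llmProvider.sendChunk(req.requestId, ...) per chunk */
  });
});

plugin.onStop(async () => {
  // 4. Unregister on shutdown so the models disappear from the picker.
  await llmProvider.unregister('my-provider');
});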

Plugin package.json

{
  "name": "my-llm-provider",
  "version": "1.0.0",
  "main": "dist/index.js",
  "codebolt": {
    "plugin": {
      "type": "llmProvider",
      "triggers": [{ "type": "startup" }]
    }
  }
}

The key fields:

  • type: "llmProvider" — tells Codebolt this plugin provides models.
  • triggers: [{ "type": "startup" }] — auto-starts when the application launches.

Provider SDK API

The Plugin SDK exposes the llmProvider module for registration and request handling.

Registration

await llmProvider.register({
  providerId: 'my-provider',
  name: 'My LLM Provider',
  description: 'Custom model provider',
  capabilities: ['chat', 'tools', 'streaming'],
  requiresKey: false,
  configFields: [
    { key: 'apiKey', label: 'API Key', type: 'password', required: false },
    { key: 'apiUrl', label: 'API URL', type: 'text', required: false },
  ],
  models: [
    { id: 'my-model-v1', name: 'My Model v1' },
    { id: 'my-model-v2', name: 'My Model v2' },
  ],
});
Field          Description
providerId     Unique identifier for the provider
name           Display name in the model picker
capabilities   What the provider supports: chat, tools, streaming
requiresKey    Whether an API key is mandatory
configFields   User-facing configuration fields (shown in settings)
models         List of models this provider offers

Request Handlers

// Non-streaming requests
llmProvider.onCompletionRequest(async (req) => {
  // req.requestId — unique request ID
  // req.options — model, messages, temperature, tools, etc.
});

// Streaming requests
llmProvider.onStreamRequest(async (req) => {
  // req.requestId — unique request ID
  // req.options — same as above, but expects a chunked response
});

Response Methods

Method                                                When to use
llmProvider.sendReply(requestId, response, success)   Send a complete response (non-streaming or final aggregated response)
llmProvider.sendChunk(requestId, chunk)               Send a streaming chunk
llmProvider.sendError(requestId, errorMessage)        Send an error
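
Putting the handlers and response methods together, the sketch below forwards requests to an OpenAI-compatible HTTP endpoint. The endpoint URL is a placeholder (Ollama's default port is used as an example), and the exact chunk and response shapes Codebolt expects are assumptions; adapt both to your backend.

// Placeholder backend URL; swap in your own API endpoint.
const API_URL = 'http://localhost:11434/v1/chat/completions';

llmProvider.onCompletionRequest(async (req) => {
  try {
    const res = await fetch(API_URL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ ...req.options, stream: false }),
    });
    if (!res.ok) {
      llmProvider.sendError(req.requestId, `Backend returned ${res.status}`);
      return;
    }
    // Forward the complete response back to the application.
    const data = await res.json();
    llmProvider.sendReply(req.requestId, data, true);
  } catch (err) {
    llmProvider.sendError(req.requestId, String(err));
  }
});

llmProvider.onStreamRequest(async (req) => {
  try {
    const res = await fetch(API_URL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ ...req.options, stream: true }),
    });
    if (!res.ok || !res.body) {
      llmProvider.sendError(req.requestId, `Backend returned ${res.status}`);
      return;
    }
    // Relay chunks as they arrive; passing raw text through is an assumption,
    // since the chunk format is backend- and application-specific.
    const reader = res.body.getReader();
    const decoder = new TextDecoder();
    let full = '';
    for (;;) {
      const { done, value } = await reader.read();
      if (done) break;
      const text = decoder.decode(value, { stream: true });
      full += text;
      llmProvider.sendChunk(req.requestId, text);
    }
    // Final aggregated response once the stream ends.
    llmProvider.sendReply(req.requestId, full, true);
  } catch (err) {
    llmProvider.sendError(req.requestId, String(err));
  }
});

Always send either a reply or an error for every requestId you receive; otherwise the application is left waiting on a request that will never complete.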

Cleanup

await llmProvider.unregister('my-provider');

Request Options

The req.options object contains everything your provider needs to make an inference call:

Field            Type       Description
model            string     Model ID (from your registered models list)
messages         array      Chat messages in OpenAI format
temperature      number?    Sampling temperature
top_p            number?    Nucleus sampling
max_tokens       number?    Maximum tokens to generate
tools            array?     Tool/function definitions
tool_choice      any?       Tool selection strategy
response_format  any?       Response format constraint
stop             any?       Stop sequences
stream           boolean    Whether streaming was requested
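
If your backend does not accept OpenAI-style payloads directly, translate req.options into its native shape before calling it. The target field names below are hypothetical and only illustrate the mapping:

// Sketch: map req.options onto a hypothetical backend payload.
// Every right-hand field name here is illustrative, not a real API.
function toBackendPayload(options) {
  return {
    model: options.model,
    messages: options.messages,            // OpenAI-format chat messages
    temperature: options.temperature,
    topP: options.top_p,
    maxTokens: options.max_tokens,
    tools: options.tools,
    toolChoice: options.tool_choice,
    responseFormat: options.response_format,
    stopSequences: options.stop,
    stream: options.stream,
  };
}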

See Also