# Config Reference

All agent configuration lives in a single `agent.toml` file in the project root.
Generate it interactively with `ai init`, then tune by hand. Sensitive values
(API keys) must always use `${ENV_VAR}` references — inline secrets are rejected at load time.

`agent.model.api_key` and any `[[agent.model.fallback_chain]]` `api_key` must be
`${ENV_VAR}` references. Literal secrets are rejected with an error.
## [agent]

Top-level agent identity and system prompt.
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | — | Required. Human-readable agent name (used in AgentCard and logs) |
| description | string | "" | Short description of the agent's purpose (exposed via A2A AgentCard) |
| system_prompt | string | "" | Base system prompt. Skill fragments are appended at serve time. Bare $name is never expanded — use ${VAR} only for API keys |
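Putting the three fields together, a minimal `[agent]` table might look like this (values are illustrative):

```toml
[agent]
name = "companies-researcher"
description = "Research companies using the knowledge graph"
system_prompt = """
You are a research agent. Prefer targeted lookups over broad queries.
"""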
## [agent.model]

Primary model selection and optional fallback chain.
| Field | Type | Default | Description |
|---|---|---|---|
| provider | string | — | "anthropic" (uses anthropic-sdk-go) or "openai-compat" (any OpenAI-compatible endpoint) |
| model | string | — | Model ID, e.g. "claude-sonnet-4-6", "gpt-5-mini", "llama3" (Ollama), "mistral" (vLLM/Fireworks) |
| api_key | string | "" | Must be ${ENV_VAR}. E.g. ${ANTHROPIC_API_KEY} |
| base_url | string | "" | For openai-compat: endpoint URL (e.g. http://localhost:11434/v1) |
| max_response_tokens | int | 8192 | Maximum tokens per model response |
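As a sketch, a local Ollama setup via the `openai-compat` provider could combine these fields as follows (the endpoint URL is Ollama's conventional default; no `api_key` is needed for a local server):

```toml
[agent.model]
provider = "openai-compat"
model = "llama3"
base_url = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint
max_response_tokens = 8192
```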
### [[agent.model.fallback_chain]]

Ordered list of fallback models, tried in sequence when the primary model fails or hits the cost circuit breaker.
| Field | Type | Default | Description |
|---|---|---|---|
| provider | string | — | "anthropic" or "openai-compat" |
| model | string | — | Fallback model ID |
| base_url | string | "" | For openai-compat fallbacks |
| api_key | string | "" | Must be ${ENV_VAR} if set |
Example:

```toml
[[agent.model.fallback_chain]]
provider = "openai-compat"
model = "llama3"
base_url = "${OLLAMA_URL}"

[[agent.model.fallback_chain]]
provider = "anthropic"
model = "claude-haiku-4-5-20251001"
api_key = "${ANTHROPIC_API_KEY}"
```
## [agent.budget]

Resource limits and context pressure thresholds for each session.
| Field | Type | Default | Description |
|---|---|---|---|
| max_turns | int | 10 | Maximum agent turns per session before aborting |
| max_tokens_per_session | int | 0 | Max total tokens (input + output) per session. 0 = unlimited |
| max_usd_per_session | float | 0.0 | Cost circuit breaker in USD per session. 0.0 = unlimited |
| context_warn_ratio | float | 0.70 | Context fill ratio that triggers a warning log (0.0–1.0) |
| context_compact_ratio | float | 0.80 | Context fill ratio that triggers automatic compaction |
| context_abort_ratio | float | 0.95 | Context fill ratio that triggers session abort to prevent truncation |
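For example, to allow 15 turns and cap spend at fifty cents per session while leaving token usage unlimited (the context ratio shown is the default):

```toml
[agent.budget]
max_turns = 15
max_tokens_per_session = 0    # 0 = unlimited
max_usd_per_session = 0.50    # cost circuit breaker at $0.50
context_compact_ratio = 0.80  # auto-compact at 80% context fill
```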
## [agent.memory]

Session persistence and semantic memory backends.
| Field | Type | Default | Description |
|---|---|---|---|
| session_store | string | "inmemory" | Session state backend: "file", "redis", or "inmemory" |
| semantic_memory | bool | false | Enable semantic memory sidecar (port 8092) for cross-session recall |
| trace_graph | bool | false | Write execution traces (turn-by-turn) to the graph DB for replay and evaluation |
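One plausible persistent setup, using values from the field table above:

```toml
[agent.memory]
session_store = "file"  # "file" | "redis" | "inmemory"
semantic_memory = true  # sidecar on port 8092
trace_graph = true      # turn-by-turn traces to the graph DB
```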
## [agent.graph]

Connection to a graph database backend (Neo4j Aura or local Cypherlite).
| Field | Type | Default | Description |
|---|---|---|---|
| uri | string | "" | Neo4j/Cypherlite connection URI, e.g. "neo4j+s://demo.neo4jlabs.com:7687" |
| username | string | "" | Graph DB username |
| password | string | "" | Must be ${ENV_VAR}. E.g. ${NEO4J_PASSWORD} |
| database | string | "" | Graph database name. Empty = driver default |
If `uri` is non-empty, `ai serve` exposes the schema via
`GET /api/graph/schema` in the web console.
## [agent.toolbox]

MCP tool server connection. Start mcp-toolbox with
`ai sidecar mcp-toolbox --config toolbox.yaml` — it supports
48 database sources out of the box, including Neo4j, PostgreSQL, MySQL,
MongoDB, Redis, BigQuery, Snowflake, Cassandra, Elasticsearch, and more.
See `ai sidecar` and the toolbox.yaml reference below.
| Field | Type | Default | Description |
|---|---|---|---|
| endpoint | string | "" | MCP server SSE URL, e.g. "http://localhost:15001/mcp/sse" |
| transport | string | "sse" | "sse" for HTTP/SSE (multi-session); "stdio" for subprocess / single-client |
| sidecar_config | string | "" | Path to toolbox YAML file. If set, ai serve auto-starts mcp-toolbox as a sidecar |
## toolbox.yaml

Separate config file consumed by mcp-toolbox (not agent.toml).
Defines database sources and the Cypher / SQL tools the agent can call.

### Supported source kinds (48 total)
| Category | Sources |
|---|---|
| Graph | neo4j, dgraph |
| Document | mongodb, firestore, couchbase, elasticsearch |
| Relational | postgres, mysql, mariadb, mssql, oracle, sqlite, cockroachdb, tidb, yugabytedb, clickhouse, trino, singlestore, oceanbase, firebird, mindsdb |
| Key-value / Cache | redis, valkey, cassandra, bigtable |
| Cloud analytics | bigquery, spanner, snowflake, looker, cloud-gda, dataplex, dataproc |
| Google Cloud SQL | alloydb-pg, cloud-sql-mysql, cloud-sql-pg, cloud-sql-mssql |
| Other | cloud-healthcare, cloud-logging-admin, cloud-monitoring, http |
### Minimal Neo4j example

```yaml
sources:
  my_graph:
    kind: neo4j
    uri: "${NEO4J_URI}"
    username: "${NEO4J_USERNAME}"
    password: "${NEO4J_PASSWORD}"
    database: "${NEO4J_DATABASE}"

tools:
  search_companies:
    source: my_graph
    description: "Find companies by name (case-insensitive partial match)"
    parameters:
      - name: query
        type: string
        description: "Company name fragment"
    statement: |
      MATCH (c:Company)
      WHERE toLower(c.name) CONTAINS toLower($query)
      RETURN c.name AS name LIMIT 10

toolsets:
  default:
    - search_companies
```
### Minimal PostgreSQL example

```yaml
sources:
  my_db:
    kind: postgres
    host: "${PG_HOST}"
    port: 5432
    database: "${PG_DATABASE}"
    user: "${PG_USER}"
    password: "${PG_PASSWORD}"

tools:
  list_orders:
    source: my_db
    description: "List recent orders for a customer"
    parameters:
      - name: customer_id
        type: integer
        description: "Customer ID"
    statement: |
      SELECT id, status, total FROM orders
      WHERE customer_id = $customer_id
      ORDER BY created_at DESC LIMIT 20
```
Start the server: `ai sidecar mcp-toolbox --config toolbox.yaml --port 15001`

Connect the agent: set `[agent.toolbox]` `endpoint = "http://localhost:15001/mcp/sse"`
## [[agent.tools]]

Explicit tool entries that supplement toolbox-discovered tools. Repeated table.
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | — | Tool name (must match a tool registered in the tool registry) |
| description | string | "" | Optional override for the tool's description shown to the model |
## [agent.tool_context]

Layered tool context management. Controls how many tool schemas are injected per turn and which strategies are active.
| Field | Type | Default | Description |
|---|---|---|---|
| group_threshold | int | 0 | Tool count above which list_tools meta-tool is injected. 0 = runtime default of 20; -1 = disable |
| max_schemas_per_turn | int | 8 | Maximum full tool schemas injected per turn |
| tool_selector | string | "" | Fast model ID for pre-filtering tools before the main model call. Empty = disabled |
| tool_embedder | string | "" | Embedding model for semantic tool retrieval. Empty = disabled |
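A sketch that enables the selector strategy alongside the schema cap (the model ID is one used elsewhere in this reference):

```toml
[agent.tool_context]
group_threshold = 20                         # inject list_tools above 20 tools
max_schemas_per_turn = 8
tool_selector = "claude-haiku-4-5-20251001"  # fast pre-filter before the main call
```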
## [agent.skills]

List of skills to load at serve time. Each skill contributes system prompt fragments and tool requirements.
| Field | Type | Default | Description |
|---|---|---|---|
| skills | []string | [] | Skill names or GitHub refs, e.g. ["graph-search", "memory-recall"] |
## [agent.a2a]

Inbound Agent-to-Agent (A2A) protocol server configuration.
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable the inbound A2A HTTP server |
| port | int | 8080 | Port for the A2A server to listen on |
| endpoint | string | "" | Outbound URL for task handoff to another agent (used by StepHandoff) |
| max_requests_per_minute | int | 0 | Rate limit per principal per sliding minute window. 0 = unlimited |
| max_concurrent_tasks | int | 0 | Max non-terminal tasks per principal. 0 = unlimited |
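For instance, to rate-limit each principal and bound concurrent work (the limits here are illustrative):

```toml
[agent.a2a]
enabled = true
port = 8080
max_requests_per_minute = 60  # per principal, sliding minute window
max_concurrent_tasks = 5      # non-terminal tasks per principal
```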
### [agent.a2a.auth]

Authentication mode for inbound A2A requests.
| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | "bearer" | Auth mode: "bearer" for static token, "google" for Google ID token (OIDC) validation |
| audience | string | "" | Expected token audience for "google" mode — usually the Cloud Run service URL |
Example — Google ID token auth for a Cloud Run service:

```toml
[agent.a2a.auth]
mode = "google"
audience = "https://my-agent-abc123-uc.a.run.app"
```
### [agent.a2a.card]

AgentCard metadata served at `GET /.well-known/agent.json`. Conforms to A2A protocol v0.3.
| Field | Type | Default | Description |
|---|---|---|---|
| url | string | "" | Public URL of the agent service (e.g. Cloud Run service URL). Auto-detected on Cloud Run via K_SERVICE metadata |
| protocol_version | string | "0.3" | A2A protocol version advertised in the card |
Use `ai show --card --validate` to check that all required fields are populated before registering with Gemini Enterprise.
## [agent.deploy.cloudrun]

Persistent Cloud Run deployment settings. Values here become defaults for `ai deploy --target cloudrun`; CLI flags always override.
| Field | Type | Default | Description |
|---|---|---|---|
| project | string | "" | GCP project ID; env: GOOGLE_CLOUD_PROJECT |
| region | string | "us-central1" | GCP region; env: GOOGLE_CLOUD_REGION |
| service_account | string | "" | Service account email for the Cloud Run service identity |
| allow_unauthenticated | bool | false | Allow public (unauthenticated) traffic. When false, Cloud Run requires a Google ID token |
| secrets | []string | [] | Secret Manager mounts: ["ENV_VAR=projects/P/secrets/S/versions/latest"] |
Example:

```toml
[agent.deploy.cloudrun]
project = "my-gcp-project"
region = "us-central1"
allow_unauthenticated = true
secrets = [
  "ANTHROPIC_API_KEY=projects/my-gcp-project/secrets/anthropic-key/versions/latest",
]
```
## [agent.mcp_server]

Outward-facing MCP server surface — exposes agent skills as MCP prompts to Claude Desktop, Cursor, and other MCP hosts.
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable the outward MCP server |
| transport | string | "http" | "stdio" (single host, e.g. Claude Desktop) or "http" (dual transport: Streamable HTTP at POST /mcp (preferred) + legacy SSE at GET /sse) |
| port | int | 8081 | Port for the MCP HTTP server |
## [agent.security]

Authentication and prompt injection protection.
| Field | Type | Default | Description |
|---|---|---|---|
| require_auth | bool | false | Require Bearer token auth on all A2A endpoints |
| injection_detection | bool | false | Scan incoming messages and tool results for prompt injection patterns |
| require_human_approval | bool | false | Require human approval for irreversible (R_DESTROY) tool calls |
| require_human_approval_except | []string | [] | Glob patterns (path.Match syntax) for tools exempt from the approval gate. Wildcard "*" is not permitted |
When `require_auth = true`, all A2A requests must include
`Authorization: Bearer <token>`. Task ownership is enforced — clients can only
access tasks they created.
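A hardened sketch enabling all three protections, with an illustrative exemption glob:

```toml
[agent.security]
require_auth = true
injection_detection = true
require_human_approval = true
require_human_approval_except = ["safe_read_*"]  # "*" alone is not permitted
```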
## [agent.sandbox]

Secure code execution sandbox. Registers `execute_js`, `execute_python`, and `execute_shell` MCP tools when enabled.
| Field | Type | Default | Description |
|---|---|---|---|
| enable | bool | false | Register sandbox MCP tools at serve time |
| require_human_approval | []string | [] | Glob patterns for tools requiring approval before execution, e.g. ["execute_shell", "delete_*"] |
| firecracker_pool_size | int | 2 | Number of pre-warmed Firecracker VMs |
| shell_backend | string | "auto" | "firecracker" (microVM, linux/amd64+KVM), "nsjail" (linux, no KVM), or "auto" (Firecracker if /dev/kvm present, nsjail otherwise) |
| python_timeout_sec | int | 30 | Execution timeout for execute_python |
| js_timeout_sec | int | 10 | Execution timeout for execute_js |
| shell_timeout_sec | int | 60 | Execution timeout for execute_shell |
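A sketch that gates shell execution behind approval and otherwise keeps the defaults:

```toml
[agent.sandbox]
enable = true
shell_backend = "auto"                      # Firecracker if /dev/kvm is present, else nsjail
require_human_approval = ["execute_shell"]  # approve before any shell run
```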
## [agent.tool_trust]

Override automatic trust classification for specific tools. Trust levels range from T4 (schema-meta, highest trust) to T9 (untrusted, lowest trust).
| Field | Type | Default | Description |
|---|---|---|---|
| overrides | map[string]int | {} | Maps tool name substrings to trust levels 4–9. Auto-classification is used for unmatched tools |
Example:

```toml
[agent.tool_trust.overrides]
"my_internal_api__" = 6  # T6: allowlisted external API
"scraped_content" = 8    # T8: web content
```
## [agent.tool_reversibility]

Override automatic reversibility classification for specific tools.
| Field | Type | Default | Description |
|---|---|---|---|
| overrides | map[string]int | {} | Maps tool name substrings to reversibility levels 0–2. Auto-classification is used for unmatched tools |
- 0 = R_READ (safe, no side effects)
- 1 = R_WRITE (reversible side effects)
- 2 = R_DESTROY (irreversible — triggers the approval gate when require_human_approval is true)
### Built-in tool classifications

These patterns determine the default classification. Use overrides to change the classification for your deployment.

| Tool name | Classification | Description |
|---|---|---|
| delete_* | R_DESTROY | Any tool matching the delete_ prefix |
| drop_* | R_DESTROY | Any tool matching the drop_ prefix |
| execute_shell | R_DESTROY | Shell execution |
| http_request | R_DESTROY | Arbitrary HTTP writes |
| send_*, email_* | R_DESTROY | Messaging and notification tools |
| deploy_*, ai_deploy | R_DESTROY | Deployment tools |
| write_cypher | R_DESTROY | Generic Cypher write fallback — always approval-gated |
| write_*, create_*, update_* | R_WRITE | Reversible write operations |
| read_cypher | R_READ | Generic read-only Cypher query — never gated |
| (everything else) | R_READ | Default when no pattern matches |
Example:

```toml
[agent.tool_reversibility.overrides]
"my_safe_shell" = 0  # treat as safe even though the name matches an R_DESTROY pattern
"read_api" = 2       # this read-looking tool has irreversible side effects
```
## [agent.intent_router]

Intent-based routing. Classifies user intent before the first LLM call and restricts tool visibility to the matched route.
| Field | Type | Default | Description |
|---|---|---|---|
| classify_model | string | "" | Model ID for lightweight intent classification. Empty = routing disabled |
### [[agent.intent_router.route]]

Ordered list of intent routes. A route with intent = "default" acts as the fallback when no other intent matches.
| Field | Type | Default | Description |
|---|---|---|---|
| intent | string | — | Intent label returned by the classifier. "default" = fallback route |
| agent | string | "" | Optional A2A endpoint URL for sub-agent handoff |
| tools | []string | [] | Allowlist of tool names visible for this intent. Empty = all tools |
Example:

```toml
[agent.intent_router]
classify_model = "claude-haiku-4-5-20251001"

[[agent.intent_router.route]]
intent = "research"
tools = ["search_companies", "get_company_details"]

[[agent.intent_router.route]]
intent = "billing"
agent = "http://billing-agent:8080"

[[agent.intent_router.route]]
intent = "default"
```
## [[agent.tool_guard]]

Before/after guards for tool calls. Repeated table — add one section per guard rule.
| Field | Type | Default | Description |
|---|---|---|---|
| tool | string | — | Glob pattern (path.Match syntax) matching tool names to guard |
| before | string | "" | Tool to call before the guarded tool. If it returns an error, the guarded tool is blocked |
| after | string | "" | Tool to call after the guarded tool. Always runs (defer semantics) |
Example:

```toml
[[agent.tool_guard]]
tool = "delete_*"
before = "human_approval"

[[agent.tool_guard]]
tool = "write_file"
after = "audit_log"
```
## [agent.honesty]

Calibrated uncertainty and identity acknowledgement behaviours injected into the system prompt.
| Field | Type | Default | Description |
|---|---|---|---|
| calibrated_uncertainty | bool | true | Adds uncertainty guidance to the system prompt ("when data is incomplete, say so") |
| identity_acknowledgement | bool | true | Adds AI identity line to the system prompt |
| data_freshness_note | string | "" | Operator note about data freshness appended to system prompt. Empty = no note |
## Full Example

A complete annotated agent.toml:

```toml
# agent.toml — complete example

[agent]
name = "companies-researcher"
description = "Research companies using the knowledge graph"
system_prompt = """
You are a research agent. Use available tools to answer questions
about companies. Prefer targeted lookups over broad queries.
Cite your sources.
"""

[agent.model]
provider = "anthropic"
model = "claude-sonnet-4-6"
api_key = "${ANTHROPIC_API_KEY}"
max_response_tokens = 8192

[[agent.model.fallback_chain]]
provider = "openai-compat"
model = "llama3"
base_url = "${OLLAMA_URL}"

[agent.budget]
max_turns = 15
max_tokens_per_session = 0  # 0 = unlimited
max_usd_per_session = 0.50
context_warn_ratio = 0.70
context_compact_ratio = 0.80
context_abort_ratio = 0.95

[agent.memory]
session_store = "file"  # file | redis | inmemory
semantic_memory = true
trace_graph = true      # write execution traces to graph DB

[agent.graph]
uri = "neo4j+s://demo.neo4jlabs.com:7687"
username = "companies2"
password = "${NEO4J_PASSWORD}"
database = "companies2"

[agent.toolbox]
endpoint = "http://localhost:15000/mcp/sse"
transport = "http"               # stdio | http
sidecar_config = "toolbox.yaml"  # auto-start mcp-toolbox

[agent.tool_context]
group_threshold = 20
max_schemas_per_turn = 8
tool_selector = "claude-haiku-4-5-20251001"

[agent.skills]
skills = ["graph-search", "memory-recall"]

[agent.a2a]
enabled = true
port = 8080
max_requests_per_minute = 60
max_concurrent_tasks = 5

[agent.mcp_server]
enabled = true
transport = "http"  # stdio | http (dual: POST /mcp + GET /sse)
port = 8081

[agent.security]
require_auth = true
injection_detection = true
require_human_approval = true
require_human_approval_except = ["safe_read_*"]

[agent.sandbox]
enable = true
shell_backend = "auto"

[agent.honesty]
calibrated_uncertainty = true
identity_acknowledgement = true
data_freshness_note = "query results may reflect data as of last nightly sync"

[agent.intent_router]
classify_model = "claude-haiku-4-5-20251001"

[[agent.intent_router.route]]
intent = "research"
tools = ["search_companies", "get_company_details"]

[[agent.intent_router.route]]
intent = "default"

[[agent.tool_guard]]
tool = "delete_*"
before = "human_approval"
```