# Config Reference

All agent configuration lives in a single `agent.toml` file in the project root.
Generate it interactively with `ai init`, then tune by hand. Sensitive values
(API keys) must always use `${ENV_VAR}` references — inline secrets are rejected at load time.

`agent.model.api_key` and any `[[agent.model.fallback_chain]]` `api_key` must be
`${ENV_VAR}` references. Literal secrets are rejected with an error.
## [agent]

Top-level agent identity and system prompt.
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | — | Required. Human-readable agent name (used in AgentCard and logs) |
| description | string | "" | Short description of the agent's purpose (exposed via A2A AgentCard) |
| system_prompt | string | "" | Base system prompt. Skill fragments are appended at serve time. Bare $name is never expanded — use ${VAR} only for API keys |
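Putting the three fields together, a minimal `[agent]` table might look like this (values are illustrative):

```toml
[agent]
name = "companies-researcher"
description = "Research companies using the knowledge graph"
system_prompt = """
You are a research agent. Prefer targeted lookups over broad queries.
"""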
## [agent.model]

Primary model selection and optional fallback chain.
| Field | Type | Default | Description |
|---|---|---|---|
| provider | string | — | "anthropic" (uses anthropic-sdk-go) or "openai-compat" (any OpenAI-compatible endpoint) |
| model | string | — | Model ID, e.g. "claude-sonnet-4-6", "gpt-5-mini", "llama3" (Ollama), "mistral" (vLLM/Fireworks) |
| api_key | string | "" | Must be ${ENV_VAR}. E.g. ${ANTHROPIC_API_KEY} |
| base_url | string | "" | For openai-compat: endpoint URL (e.g. http://localhost:11434/v1) |
| max_response_tokens | int | 8192 | Maximum tokens per model response |
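As a sketch, a local Ollama setup via the `openai-compat` provider could combine these fields as follows (the endpoint URL is Ollama's conventional default; no `api_key` is needed for a local server):

```toml
[agent.model]
provider = "openai-compat"
model = "llama3"
base_url = "http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint
max_response_tokens = 8192
```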
### [[agent.model.fallback_chain]]

Ordered list of fallback models, tried in sequence when the primary model fails or hits the cost circuit breaker.
| Field | Type | Default | Description |
|---|---|---|---|
| provider | string | — | "anthropic" or "openai-compat" |
| model | string | — | Fallback model ID |
| base_url | string | "" | For openai-compat fallbacks |
| api_key | string | "" | Must be ${ENV_VAR} if set |
Example:

```toml
[[agent.model.fallback_chain]]
provider = "openai-compat"
model = "llama3"
base_url = "${OLLAMA_URL}"

[[agent.model.fallback_chain]]
provider = "anthropic"
model = "claude-haiku-4-5-20251001"
api_key = "${ANTHROPIC_API_KEY}"
```
## [agent.budget]

Resource limits and context pressure thresholds for each session.
| Field | Type | Default | Description |
|---|---|---|---|
| max_turns | int | 10 | Maximum agent turns per session before aborting |
| max_tokens_per_session | int | 0 | Max total tokens (input + output) per session. 0 = unlimited |
| max_usd_per_session | float | 0.0 | Cost circuit breaker in USD per session. 0.0 = unlimited |
| context_warn_ratio | float | 0.70 | Context fill ratio that triggers a warning log (0.0–1.0) |
| context_compact_ratio | float | 0.80 | Context fill ratio that triggers automatic compaction |
| context_abort_ratio | float | 0.95 | Context fill ratio that triggers session abort to prevent truncation |
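For example, to allow 15 turns and cap spend at fifty cents per session while leaving token usage unlimited (the context ratio shown is the default):

```toml
[agent.budget]
max_turns = 15
max_tokens_per_session = 0    # 0 = unlimited
max_usd_per_session = 0.50    # cost circuit breaker at $0.50
context_compact_ratio = 0.80  # auto-compact at 80% context fill
```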
## [agent.memory]

Session persistence and semantic memory backends.
| Field | Type | Default | Description |
|---|---|---|---|
| session_store | string | "inmemory" | Session state backend: "file", "redis", or "inmemory" |
| semantic_memory | bool | false | Enable semantic memory sidecar (port 8092) for cross-session recall |
| trace_graph | bool | false | Write execution traces (turn-by-turn) to the graph DB for replay and evaluation |
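One plausible persistent setup, using values from the field table above:

```toml
[agent.memory]
session_store = "file"  # "file" | "redis" | "inmemory"
semantic_memory = true  # sidecar on port 8092
trace_graph = true      # turn-by-turn traces to the graph DB
```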
## [agent.graph]

Connection to a graph database backend (Neo4j Aura or local Cypherlite).
| Field | Type | Default | Description |
|---|---|---|---|
| uri | string | "" | Neo4j/Cypherlite connection URI, e.g. "neo4j+s://demo.neo4jlabs.com:7687" |
| username | string | "" | Graph DB username |
| password | string | "" | Must be ${ENV_VAR}. E.g. ${NEO4J_PASSWORD} |
| database | string | "" | Graph database name. Empty = driver default |
If `uri` is non-empty, `ai serve` exposes the schema via
`GET /api/graph/schema` in the web console.
## [agent.toolbox]

MCP tool server connection. Start mcp-toolbox with
`ai sidecar mcp-toolbox --config toolbox.yaml` — it supports
48 database sources out of the box, including Neo4j, PostgreSQL, MySQL,
MongoDB, Redis, BigQuery, Snowflake, Cassandra, Elasticsearch, and more.
See `ai sidecar` and the toolbox.yaml reference below.
| Field | Type | Default | Description |
|---|---|---|---|
| endpoint | string | "" | MCP server SSE URL, e.g. "http://localhost:15001/mcp/sse" |
| transport | string | "sse" | "sse" for HTTP/SSE (multi-session); "stdio" for subprocess / single-client |
| sidecar_config | string | "" | Path to toolbox YAML file. If set, ai serve auto-starts mcp-toolbox as a sidecar |
## toolbox.yaml

Separate config file consumed by mcp-toolbox (not agent.toml).
Defines database sources and the Cypher / SQL tools the agent can call.

### Supported source kinds (48 total)
| Category | Sources |
|---|---|
| Graph | neo4j, dgraph |
| Document | mongodb, firestore, couchbase, elasticsearch |
| Relational | postgres, mysql, mariadb, mssql, oracle, sqlite, cockroachdb, tidb, yugabytedb, clickhouse, trino, singlestore, oceanbase, firebird, mindsdb |
| Key-value / Cache | redis, valkey, cassandra, bigtable |
| Cloud analytics | bigquery, spanner, snowflake, looker, cloud-gda, dataplex, dataproc |
| Google Cloud SQL | alloydb-pg, cloud-sql-mysql, cloud-sql-pg, cloud-sql-mssql |
| Other | cloud-healthcare, cloud-logging-admin, cloud-monitoring, http |
### Minimal Neo4j example

```yaml
sources:
  my_graph:
    kind: neo4j
    uri: "${NEO4J_URI}"
    username: "${NEO4J_USERNAME}"
    password: "${NEO4J_PASSWORD}"
    database: "${NEO4J_DATABASE}"

tools:
  search_companies:
    source: my_graph
    description: "Find companies by name (case-insensitive partial match)"
    parameters:
      - name: query
        type: string
        description: "Company name fragment"
    statement: |
      MATCH (c:Company)
      WHERE toLower(c.name) CONTAINS toLower($query)
      RETURN c.name AS name LIMIT 10

toolsets:
  default:
    - search_companies
```
### Minimal PostgreSQL example

```yaml
sources:
  my_db:
    kind: postgres
    host: "${PG_HOST}"
    port: 5432
    database: "${PG_DATABASE}"
    user: "${PG_USER}"
    password: "${PG_PASSWORD}"

tools:
  list_orders:
    source: my_db
    description: "List recent orders for a customer"
    parameters:
      - name: customer_id
        type: integer
        description: "Customer ID"
    statement: |
      SELECT id, status, total FROM orders
      WHERE customer_id = $customer_id
      ORDER BY created_at DESC LIMIT 20
```
Start the server: `ai sidecar mcp-toolbox --config toolbox.yaml --port 15001`

Connect the agent: set `[agent.toolbox]` `endpoint = "http://localhost:15001/mcp/sse"`
## [[agent.tools]]

Explicit tool entries that supplement toolbox-discovered tools. Repeated table.
| Field | Type | Default | Description |
|---|---|---|---|
| name | string | — | Tool name (must match a tool registered in the tool registry) |
| description | string | "" | Optional override for the tool's description shown to the model |
## [agent.tool_context]

Layered tool context management. Controls how many tool schemas are injected per turn and which strategies are active.
| Field | Type | Default | Description |
|---|---|---|---|
| group_threshold | int | 0 | Tool count above which list_tools meta-tool is injected. 0 = runtime default of 20; -1 = disable |
| max_schemas_per_turn | int | 8 | Maximum full tool schemas injected per turn |
| tool_selector | string | "" | Fast model ID for pre-filtering tools before the main model call. Empty = disabled |
| tool_embedder | string | "" | Embedding model for semantic tool retrieval. Empty = disabled |
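A sketch that enables the selector strategy alongside the schema cap (the model ID is one used elsewhere in this reference):

```toml
[agent.tool_context]
group_threshold = 20                         # inject list_tools above 20 tools
max_schemas_per_turn = 8
tool_selector = "claude-haiku-4-5-20251001"  # fast pre-filter before the main call
```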
## [agent.skills]

List of skills to load at serve time. Each skill contributes system prompt fragments and tool requirements.
| Field | Type | Default | Description |
|---|---|---|---|
| skills | []string | [] | Skill names or GitHub refs, e.g. ["graph-search", "memory-recall"] |
## [agent.a2a]

Inbound Agent-to-Agent (A2A) protocol server configuration.
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | true | Enable the inbound A2A HTTP server |
| port | int | 8080 | Port for the A2A server to listen on |
| endpoint | string | "" | Outbound URL for task handoff to another agent (used by StepHandoff) |
| max_requests_per_minute | int | 0 | Rate limit per principal per sliding minute window. 0 = unlimited |
| max_concurrent_tasks | int | 0 | Max non-terminal tasks per principal. 0 = unlimited |
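For instance, to rate-limit each principal and bound concurrent work (the limits here are illustrative):

```toml
[agent.a2a]
enabled = true
port = 8080
max_requests_per_minute = 60  # per principal, sliding minute window
max_concurrent_tasks = 5      # non-terminal tasks per principal
```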
### [agent.a2a.auth]

Authentication mode for inbound A2A requests.
| Field | Type | Default | Description |
|---|---|---|---|
| mode | string | "bearer" | Auth mode: "bearer" for static token, "google" for Google ID token (OIDC) validation |
| audience | string | "" | Expected token audience for "google" mode — usually the Cloud Run service URL |
Example — Google ID token auth for a Cloud Run service:

```toml
[agent.a2a.auth]
mode = "google"
audience = "https://my-agent-abc123-uc.a.run.app"
```
### [agent.a2a.card]

AgentCard metadata served at `GET /.well-known/agent.json`. Conforms to A2A protocol v0.3.
| Field | Type | Default | Description |
|---|---|---|---|
| url | string | "" | Public URL of the agent service (e.g. Cloud Run service URL). Auto-detected on Cloud Run via K_SERVICE metadata |
| protocol_version | string | "0.3" | A2A protocol version advertised in the card |
Use `ai show --card --validate` to check that all required fields are populated before registering with Gemini Enterprise.
## [agent.deploy.cloudrun]

Persistent Cloud Run deployment settings. Values here become defaults for `ai deploy --target cloudrun`; CLI flags always override.
| Field | Type | Default | Description |
|---|---|---|---|
| project | string | "" | GCP project ID; env: GOOGLE_CLOUD_PROJECT |
| region | string | "us-central1" | GCP region; env: GOOGLE_CLOUD_REGION |
| service_account | string | "" | Service account email for the Cloud Run service identity |
| allow_unauthenticated | bool | false | Allow public (unauthenticated) traffic. When false, Cloud Run requires a Google ID token |
| secrets | []string | [] | Secret Manager mounts: ["ENV_VAR=projects/P/secrets/S/versions/latest"] |
Example:

```toml
[agent.deploy.cloudrun]
project = "my-gcp-project"
region = "us-central1"
allow_unauthenticated = true
secrets = [
  "ANTHROPIC_API_KEY=projects/my-gcp-project/secrets/anthropic-key/versions/latest",
]
```
## [agent.mcp_server]

Outward-facing MCP server surface — exposes agent skills as MCP prompts to Claude Desktop, Cursor, and other MCP hosts.
| Field | Type | Default | Description |
|---|---|---|---|
| enabled | bool | false | Enable the outward MCP server |
| transport | string | "http" | "stdio" (single host, e.g. Claude Desktop) or "http" (dual transport: Streamable HTTP at POST /mcp (preferred) + legacy SSE at GET /sse) |
| port | int | 8081 | Port for the MCP HTTP server |
## [agent.security]

Authentication and prompt injection protection.
| Field | Type | Default | Description |
|---|---|---|---|
| require_auth | bool | false | Require Bearer token auth on all A2A endpoints |
| injection_detection | bool | false | Scan incoming messages and tool results for prompt injection patterns |
| require_human_approval | bool | false | Require human approval for irreversible (R_DESTROY) tool calls |
| require_human_approval_except | []string | [] | Glob patterns (path.Match syntax) for tools exempt from the approval gate. Wildcard "*" is not permitted |
When `require_auth = true`, all A2A requests must include
`Authorization: Bearer <token>`. Task ownership is enforced — clients can only
access tasks they created.
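A hardened sketch enabling all three protections, with an illustrative exemption glob:

```toml
[agent.security]
require_auth = true
injection_detection = true
require_human_approval = true
require_human_approval_except = ["safe_read_*"]  # "*" alone is not permitted
```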
## [agent.sandbox]

Secure code execution sandbox. Registers `execute_js`, `execute_python`, and `execute_shell` MCP tools when enabled.
| Field | Type | Default | Description |
|---|---|---|---|
| enable | bool | false | Register sandbox MCP tools at serve time |
| require_human_approval | []string | [] | Glob patterns for tools requiring approval before execution, e.g. ["execute_shell", "delete_*"] |
| firecracker_pool_size | int | 2 | Number of pre-warmed Firecracker VMs |
| shell_backend | string | "auto" | "firecracker" (microVM, linux/amd64+KVM), "nsjail" (linux, no KVM), or "auto" (Firecracker if /dev/kvm present, nsjail otherwise) |
| python_timeout_sec | int | 30 | Execution timeout for execute_python |
| js_timeout_sec | int | 10 | Execution timeout for execute_js |
| shell_timeout_sec | int | 60 | Execution timeout for execute_shell |
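A sketch that gates shell execution behind approval and otherwise keeps the defaults:

```toml
[agent.sandbox]
enable = true
shell_backend = "auto"                      # Firecracker if /dev/kvm is present, else nsjail
require_human_approval = ["execute_shell"]  # approve before any shell run
```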
## [agent.tool_trust]

Override automatic trust classification for specific tools. Trust levels range from T4 (schema-meta, highest trust) to T9 (untrusted, lowest trust).
| Field | Type | Default | Description |
|---|---|---|---|
| overrides | map[string]int | {} | Maps tool name substrings to trust levels 4–9. Auto-classification is used for unmatched tools |
Example:

```toml
[agent.tool_trust.overrides]
"my_internal_api__" = 6  # T6: allowlisted external API
"scraped_content" = 8    # T8: web content
```
## [agent.tool_reversibility]

Override automatic reversibility classification for specific tools.
| Field | Type | Default | Description |
|---|---|---|---|
| overrides | map[string]int | {} | Maps tool name substrings to reversibility levels 0–2. Auto-classification is used for unmatched tools |
- 0 = R_READ (safe, no side effects)
- 1 = R_WRITE (reversible side effects)
- 2 = R_DESTROY (irreversible — triggers the approval gate when require_human_approval is true)
### Built-in tool classifications

These patterns determine the default classification. Use overrides to change the classification for your deployment.

| Tool name | Classification | Description |
|---|---|---|
| delete_* | R_DESTROY | Any tool matching the delete_ prefix |
| drop_* | R_DESTROY | Any tool matching the drop_ prefix |
| execute_shell | R_DESTROY | Shell execution |
| http_request | R_DESTROY | Arbitrary HTTP writes |
| send_*, email_* | R_DESTROY | Messaging and notification tools |
| deploy_*, ai_deploy | R_DESTROY | Deployment tools |
| write_cypher | R_DESTROY | Generic Cypher write fallback — always approval-gated |
| write_*, create_*, update_* | R_WRITE | Reversible write operations |
| read_cypher | R_READ | Generic read-only Cypher query — never gated |
| (everything else) | R_READ | Default when no pattern matches |
Example:

```toml
[agent.tool_reversibility.overrides]
"my_safe_shell" = 0  # treat as safe even though the name matches an R_DESTROY pattern
"read_api" = 2       # this read-looking tool has irreversible side effects
```
## [agent.intent_router]

Intent-based routing. Classifies user intent before the first LLM call and restricts tool visibility to the matched route.
| Field | Type | Default | Description |
|---|---|---|---|
| classify_model | string | "" | Model ID for lightweight intent classification. Empty = routing disabled |
### [[agent.intent_router.route]]

Ordered list of intent routes. A route with intent = "default" acts as the fallback when no other intent matches.
| Field | Type | Default | Description |
|---|---|---|---|
| intent | string | — | Intent label returned by the classifier. "default" = fallback route |
| agent | string | "" | Optional A2A endpoint URL for sub-agent handoff |
| tools | []string | [] | Allowlist of tool names visible for this intent. Empty = all tools |
Example:

```toml
[agent.intent_router]
classify_model = "claude-haiku-4-5-20251001"

[[agent.intent_router.route]]
intent = "research"
tools = ["search_companies", "get_company_details"]

[[agent.intent_router.route]]
intent = "billing"
agent = "http://billing-agent:8080"

[[agent.intent_router.route]]
intent = "default"
```
## [[agent.tool_guard]]

Before/after guards for tool calls. Repeated table — add one section per guard rule.
| Field | Type | Default | Description |
|---|---|---|---|
| tool | string | — | Glob pattern (path.Match syntax) matching tool names to guard |
| before | string | "" | Tool to call before the guarded tool. If it returns an error, the guarded tool is blocked |
| after | string | "" | Tool to call after the guarded tool. Always runs (defer semantics) |
Example:

```toml
[[agent.tool_guard]]
tool = "delete_*"
before = "human_approval"

[[agent.tool_guard]]
tool = "write_file"
after = "audit_log"
```
## [agent.honesty]

Calibrated uncertainty and identity acknowledgement behaviours injected into the system prompt.
| Field | Type | Default | Description |
|---|---|---|---|
| calibrated_uncertainty | bool | true | Adds uncertainty guidance to the system prompt ("when data is incomplete, say so") |
| identity_acknowledgement | bool | true | Adds AI identity line to the system prompt |
| data_freshness_note | string | "" | Operator note about data freshness appended to system prompt. Empty = no note |
## Full Example

A complete annotated agent.toml:

```toml
# agent.toml — complete example

[agent]
name = "companies-researcher"
description = "Research companies using the knowledge graph"
system_prompt = """
You are a research agent. Use available tools to answer questions
about companies. Prefer targeted lookups over broad queries.
Cite your sources.
"""

[agent.model]
provider = "anthropic"
model = "claude-sonnet-4-6"
api_key = "${ANTHROPIC_API_KEY}"
max_response_tokens = 8192

[[agent.model.fallback_chain]]
provider = "openai-compat"
model = "llama3"
base_url = "${OLLAMA_URL}"

[agent.budget]
max_turns = 15
max_tokens_per_session = 0  # 0 = unlimited
max_usd_per_session = 0.50
context_warn_ratio = 0.70
context_compact_ratio = 0.80
context_abort_ratio = 0.95

[agent.memory]
session_store = "file"  # file | redis | inmemory
semantic_memory = true
trace_graph = true      # write execution traces to graph DB

[agent.graph]
uri = "neo4j+s://demo.neo4jlabs.com:7687"
username = "companies2"
password = "${NEO4J_PASSWORD}"
database = "companies2"

[agent.toolbox]
endpoint = "http://localhost:15000/mcp/sse"
transport = "http"               # stdio | http
sidecar_config = "toolbox.yaml"  # auto-start mcp-toolbox

[agent.tool_context]
group_threshold = 20
max_schemas_per_turn = 8
tool_selector = "claude-haiku-4-5-20251001"

[agent.skills]
skills = ["graph-search", "memory-recall"]

[agent.a2a]
enabled = true
port = 8080
max_requests_per_minute = 60
max_concurrent_tasks = 5

[agent.mcp_server]
enabled = true
transport = "http"  # stdio | http (dual: POST /mcp + GET /sse)
port = 8081

[agent.security]
require_auth = true
injection_detection = true
require_human_approval = true
require_human_approval_except = ["safe_read_*"]

[agent.sandbox]
enable = true
shell_backend = "auto"

[agent.honesty]
calibrated_uncertainty = true
identity_acknowledgement = true
data_freshness_note = "query results may reflect data as of last nightly sync"

[agent.intent_router]
classify_model = "claude-haiku-4-5-20251001"

[[agent.intent_router.route]]
intent = "research"
tools = ["search_companies", "get_company_details"]

[[agent.intent_router.route]]
intent = "default"

[[agent.tool_guard]]
tool = "delete_*"
before = "human_approval"
```