API Reference

The agent runtime exposes three HTTP API surfaces: the A2A protocol for agent-to-agent communication, an MCP server for tool and prompt discovery by Claude Desktop and other MCP hosts, and a lightweight REST management API. All three are served by a single Go binary. A2A and the REST management API share a configurable port (default :8080); the MCP server listens on its own port (default 8081).

Authentication

When agent.security.require_auth = true, all A2A endpoints require a Bearer token in the Authorization header:

Authorization: Bearer <token>

Task ownership is enforced: clients can read or cancel only the tasks they created. A request for another client's task returns 404 (not 403) so that the response does not reveal whether the task exists. The REST management API and MCP server are not covered by bearer auth; place them behind a network perimeter in production.
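
A minimal client-side sketch of the rules above, using only the Python standard library. The helper names build_task_request and explain_not_found are hypothetical and not part of the runtime; the request is built but never sent.

```python
import urllib.request
from typing import Optional


def build_task_request(base_url: str, task_id: str, token: Optional[str]) -> urllib.request.Request:
    """Build (without sending) a GET request for /a2a/tasks/{id}."""
    headers = {"Accept": "application/json"}
    if token:
        # Only required when agent.security.require_auth = true.
        headers["Authorization"] = f"Bearer {token}"
    url = f"{base_url.rstrip('/')}/a2a/tasks/{task_id}"
    return urllib.request.Request(url, headers=headers, method="GET")


def explain_not_found(status_code: int) -> Optional[str]:
    """Hypothetical helper: a 404 cannot tell 'missing' apart from 'not yours'."""
    if status_code == 404:
        return "task does not exist or belongs to another client"
    return None
```

Because both cases return 404, a client should not retry with the same token in the hope of a different answer.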

A2A — Agent Discovery

Google A2A Protocol

The Agent Card describes the agent's capabilities, supported A2A version, skills, and authentication requirements. Any A2A-compatible client fetches this before initiating tasks.

GET /.well-known/agent.json

Return the AgentCard for capability discovery.

Response

{
  "name": "companies-researcher",
  "description": "Research companies using the knowledge graph",
  "version": "1.0.0",
  "url": "https://my-agent.example.com",
  "capabilities": {
    "streaming": true,
    "pushNotifications": false,
    "stateTransitionHistory": true
  },
  "defaultInputModes": ["text/plain"],
  "defaultOutputModes": ["text/plain"],
  "skills": [
    { "id": "graph-search", "name": "Graph Search", "description": "..." }
  ],
  "authentication": { "schemes": ["bearer"] }
}
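
As an illustration, a client might check a fetched Agent Card before submitting tasks. The sketch below assumes the card has already been downloaded and parsed; card_allows is a hypothetical helper, and the sample card is a trimmed copy of the response above.

```python
import json


def card_allows(card: dict, need_streaming: bool = False, auth_scheme: str = "") -> bool:
    """Return True if the card advertises the features the caller needs."""
    capabilities = card.get("capabilities", {})
    if need_streaming and not capabilities.get("streaming", False):
        return False
    schemes = card.get("authentication", {}).get("schemes", [])
    if auth_scheme and auth_scheme not in schemes:
        return False
    return True


# Trimmed copy of the AgentCard shown above.
sample_card = json.loads("""
{
  "name": "companies-researcher",
  "capabilities": {"streaming": true, "pushNotifications": false},
  "authentication": {"schemes": ["bearer"]}
}
""")

print(card_allows(sample_card, need_streaming=True, auth_scheme="bearer"))
```

Missing fields are treated as "not supported", which is a conservative assumption rather than documented behavior.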

A2A — Tasks

Submit tasks and poll their lifecycle state. Tasks progress through: submitted → working → completed | failed | canceled.

POST /a2a auth

Submit a task and get a synchronous response (Task or Message). For streaming, use /a2a/stream.

Request body (JSON-RPC)

{
  "jsonrpc": "2.0",
  "id": "1",
  "method": "message/send",
  "params": {
    "message": {
      "role": "user",
      "parts": [{ "type": "text", "text": "Top 5 companies by revenue?" }]
    }
  }
}

Response

{
  "jsonrpc": "2.0",
  "id": "1",
  "result": {
    "id": "task_01j...",
    "status": { "state": "completed" },
    "artifacts": [
      { "parts": [{ "type": "text", "text": "1. Apple — $394B ..." }] }
    ]
  }
}
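
The envelope above can be assembled and unpacked with a few lines of Python. This is a sketch under assumptions: message_send and artifact_text are hypothetical helper names, and the error branch relies only on the standard JSON-RPC 2.0 "error" member, not on any error shape documented here.

```python
def message_send(text: str, request_id: str = "1", method: str = "message/send") -> dict:
    """Build the JSON-RPC request body for POST /a2a."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": {
            "message": {"role": "user", "parts": [{"type": "text", "text": text}]}
        },
    }


def artifact_text(response: dict) -> str:
    """Join the text parts of every artifact in a Task result."""
    if "error" in response:
        raise RuntimeError(f"JSON-RPC error: {response['error']}")
    result = response.get("result", {})
    texts = [
        part["text"]
        for artifact in result.get("artifacts", [])
        for part in artifact.get("parts", [])
        if part.get("type") == "text"
    ]
    return "\n".join(texts)
```

If the server answers with a Message rather than a Task, there are no artifacts and artifact_text returns an empty string; a real client would need to handle that case separately.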

POST /a2a/stream auth SSE

Submit a task and receive a Server-Sent Events stream of TaskStatusUpdateEvent and TaskArtifactUpdateEvent messages. Connection closes when the task reaches a terminal state.

Request body

Same JSON-RPC envelope as POST /a2a, with method set to "message/stream".

SSE event stream

data: {"type":"TaskStatusUpdateEvent","taskId":"task_01j...","status":{"state":"working"}}
data: {"type":"TaskArtifactUpdateEvent","taskId":"task_01j...","artifact":{"parts":[{"type":"text","text":"1. Apple"}]}}
data: {"type":"TaskStatusUpdateEvent","taskId":"task_01j...","status":{"state":"completed"},"final":true}
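
One way to consume this stream is to read it line by line, decode each data: payload, and stop at the event flagged final. The sketch below takes any iterable of text lines, so it can be tried without a server; consume_stream is a hypothetical name.

```python
import json
from typing import Iterable, Optional, Tuple


def consume_stream(lines: Iterable[str]) -> Tuple[Optional[str], str]:
    """Return (last seen state, concatenated artifact text) from data: lines."""
    state = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any other SSE fields
        event = json.loads(line[len("data:"):])
        if event["type"] == "TaskStatusUpdateEvent":
            state = event["status"]["state"]
            if event.get("final"):
                break  # terminal state reached; the server closes the connection
        elif event["type"] == "TaskArtifactUpdateEvent":
            chunks.extend(
                part["text"]
                for part in event["artifact"]["parts"]
                if part.get("type") == "text"
            )
    return state, "".join(chunks)
```

In a real client the lines would come from a streaming HTTP response body instead of a list.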

GET /a2a/tasks/{id} auth

Poll the current state of a task by ID. Returns the full Task object including all artifacts.

Path parameters

id  string  Task ID returned from POST /a2a

Response

{ "id": "task_01j...", "status": { "state": "completed" }, "artifacts": [...] }

GET /a2a/tasks auth

List tasks for the authenticated client. Cursor-based pagination.

Query parameters

cursor  string  Pagination cursor from previous response
limit   int     Max results per page (default 20, max 100)

Response

{ "tasks": [...], "nextCursor": "eyJ..." }
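
Cursor pagination is usually driven by a loop that feeds each nextCursor back into the next request. The sketch below keeps the HTTP call behind a caller-supplied fetch_page function (hypothetical), and assumes that a missing or empty nextCursor marks the last page, which this reference does not state explicitly.

```python
from typing import Callable, Iterator, Optional


def iter_tasks(
    fetch_page: Callable[[Optional[str], int], dict], limit: int = 20
) -> Iterator[dict]:
    """Yield every task, following nextCursor until it is absent."""
    limit = max(1, min(limit, 100))  # documented bounds: default 20, max 100
    cursor = None
    while True:
        page = fetch_page(cursor, limit)
        yield from page.get("tasks", [])
        cursor = page.get("nextCursor")
        if not cursor:
            break  # assumption: no cursor means no further pages
```

Here fetch_page(cursor, limit) would issue GET /a2a/tasks with the matching query parameters and return the decoded JSON body.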

DELETE /a2a/tasks/{id} auth

Cancel a running task. The task transitions to the canceled state, and any in-flight model call is interrupted via context cancellation.

Response

{ "id": "task_01j...", "status": { "state": "canceled" } }

Agent-to-Agent (A2A) Orchestration

Multi-agent patterns

Agents can delegate tasks to other agents over HTTP using the same A2A protocol. An orchestrator agent submits tasks to a specialist agent, streams progress events, and optionally gates irreversible actions behind human approval — all without any out-of-band coordination. Specialist agents are discovered via /.well-known/agent.json.

Multi-agent topology

  User / ai run
       │
       │  POST /a2a
       ▼
 ┌──────────────────┐    POST /a2a       ┌──────────────────┐
 │  Analyst Agent   │──────────────────►│ Researcher Agent │
 │  :8080           │◄──────────────────│  :8082           │
 │  (orchestrator)  │  GET /a2a/{id}/   │  (specialist)    │
 │                  │  stream (SSE)     │                  │
 └──────────────────┘                   └──────────────────┘
          │                                      │
          │ POST /a2a/{id}/approve               │ mcp-toolbox :15000
          │ (human approval gate)                │
          ▼                                      ▼
    Human Operator                         Neo4j Graph DB
          

Task lifecycle

  POST /a2a
       │
       ▼
  submitted   Task accepted, queued for execution
       │
       ▼
  working     Agent is executing; streaming progress events
       │
       ├──► awaiting-approval   Human approval required before continuing
       │           │
       │     POST /a2a/{id}/approve (approved or rejected)
       │           │
       │    ◄──────┘
       │
       ├──► completed    Final answer available in artifacts
       ├──► failed       Unrecoverable error; see error field
       └──► canceled     Canceled via DELETE /a2a/tasks/{id}
          

The awaiting-approval state pauses execution until the operator posts an approval decision. The agent resumes from the same point — no work is lost.
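
The lifecycle diagram can be written down as a small transition table, which is handy for validating state changes in a client or test harness. This sketch encodes only the arrows drawn above; whether the server allows other transitions (for example, canceling a submitted task) is not specified here.

```python
# Transitions taken directly from the lifecycle diagram.
TRANSITIONS = {
    "submitted": {"working"},
    "working": {"awaiting-approval", "completed", "failed", "canceled"},
    "awaiting-approval": {"working"},
    "completed": set(),
    "failed": set(),
    "canceled": set(),
}

# A state with no outgoing transitions is terminal.
TERMINAL_STATES = {state for state, nxt in TRANSITIONS.items() if not nxt}


def can_transition(current: str, new: str) -> bool:
    """Return True if the diagram contains an arrow from current to new."""
    return new in TRANSITIONS.get(current, set())


def is_terminal(state: str) -> bool:
    """Return True for completed, failed, and canceled."""
    return state in TERMINAL_STATES
```

A polling client can stop as soon as is_terminal returns True for the reported state.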

GET /a2a/{id}/stream auth SSE

Subscribe to the real-time event stream for a specific task. Returns a Server-Sent Events stream that remains open until the task reaches a terminal state (completed, failed, or canceled). Used by ai run --stream and the a2a_delegate built-in tool.

SSE event types

event: progress
data: {"type":"progress","message":"Querying companies graph for subsidiaries..."}

event: approval_required
data: {"type":"approval_required","message":"About to publish investment memo to Confluence. Approve?"}

event: completed
data: {"type":"completed","result":"## Investment Memo\n\nAlphabet Inc. analysis..."}

event: failed
data: {"type":"failed","error":"Tool execution error: connection refused"}

Event type reference

progress           Intermediate status update; print and continue streaming
approval_required  Execution paused; operator must POST /a2a/{id}/approve
completed          Final result in result field; stream closes
failed             Terminal error in error field; stream closes
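
The named events above can be parsed by grouping event: and data: lines into frames separated by blank lines. The following is a simplified sketch of standard SSE framing, not a full parser; parse_sse and next_action are hypothetical names, and the action labels are illustrative.

```python
import json
from typing import Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, dict]]:
    """Yield (event name, decoded JSON payload) for each SSE frame."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if data:  # a blank line terminates the current frame
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].lstrip())
    if data:
        # Lenient: also emit a trailing frame that lacks a closing blank line.
        yield event or "message", json.loads("\n".join(data))


def next_action(event_name: str) -> str:
    """Map an event type to what the client should do next."""
    actions = {
        "progress": "continue",
        "approval_required": "approve",
        "completed": "stop",
        "failed": "stop",
    }
    return actions.get(event_name, "continue")
```

Unknown event names fall back to "continue" so that a client keeps reading if new event types are added.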

Example — stream with curl

curl -N -H "Authorization: Bearer $TOKEN" \
     https://my-agent.fly.dev/a2a/task_01j.../stream

POST /a2a/{id}/approve auth

Resume a task that is in the awaiting-approval state. Sending "approved": true continues execution. Sending "approved": false cancels the pending action and lets the agent decide how to proceed (it may retry, skip, or fail).

Request body

{
  "approved": true,
  "message": "Looks good, proceed"
}

Response (task resumes)

{ "id": "task_01j...", "status": { "state": "working" } }

Response (rejection)

{ "id": "task_01j...", "status": { "state": "working" }, "note": "Approval rejected — agent notified" }

Error — task not awaiting approval

HTTP 409
{ "error": "task is not in awaiting-approval state" }
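
A sketch of how an operator tool might build the approval call and classify the reply. The helper names are hypothetical. Detecting a rejection by the presence of the "note" field is an assumption drawn from the sample responses above, and the status code of a successful call is not documented here, so anything other than 409 is treated as accepted.

```python
import json
import urllib.request


def build_approval(base_url: str, task_id: str, token: str,
                   approved: bool, message: str = "") -> urllib.request.Request:
    """Build (without sending) the POST /a2a/{id}/approve request."""
    body = json.dumps({"approved": approved, "message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/a2a/{task_id}/approve",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def approval_outcome(status_code: int, body: dict) -> str:
    """Classify the server's reply to an approval decision."""
    if status_code == 409:
        return "not-awaiting-approval"
    if "note" in body:  # assumption: only rejection responses carry a note
        return "rejected"
    return "resumed"
```

In both the approved and rejected cases the sample responses report the working state, so the state alone does not reveal which decision was recorded.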

Agent discovery for orchestrators

Before delegating, an orchestrator fetches GET /.well-known/agent.json from
the target agent to verify its capabilities and authentication scheme.
The a2a_delegate built-in tool does this automatically.

Example — check if a specialist supports streaming before delegating:

  GET https://researcher.example.com/.well-known/agent.json
  → { "capabilities": { "streaming": true }, "authentication": { "schemes": ["bearer"] } }

Set RESEARCHER_ENDPOINT and RESEARCHER_TOKEN env vars in the orchestrator's
agent.toml so a2a_delegate can target the specialist at runtime.
See Demo 04 — Multi-Agent Pipeline for a full walkthrough.

MCP Server — Tools

MCP 2025-11-25 spec

When agent.mcp_server.enabled = true, the agent exposes itself as an MCP server on port 8081 by default (HTTP/SSE) or over stdio. MCP hosts (Claude Desktop, Cursor, etc.) discover tools and invoke them through the standard MCP JSON-RPC protocol.

RPC tools/list

Return all tools available to this agent (from toolbox + registered tools).

Response

{
  "tools": [
    {
      "name": "company_lookup",
      "description": "Look up a company by name or ID",
      "inputSchema": { "type": "object", "properties": { "name": { "type": "string" } } }
    },
    ...
  ]
}

RPC tools/call

Invoke a tool by name. The agent executes the tool and returns the result.

Request

{ "name": "company_lookup", "arguments": { "name": "Apple" } }

Response

{ "content": [{ "type": "text", "text": "{\"name\":\"Apple Inc.\",\"revenue\":\"$394B\"}" }] }
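
On the wire, the request shown above travels as the params of a JSON-RPC 2.0 call with method "tools/call", and the response is that call's result. The sketch below wraps and unwraps both; tools_call and decode_tool_result are hypothetical helper names, and decoding the text content as JSON is an assumption based on the sample response.

```python
import json
from typing import Any, List


def tools_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap a tool invocation in the MCP JSON-RPC envelope."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def decode_tool_result(result: dict) -> List[Any]:
    """Decode each text content item as JSON, falling back to the raw string."""
    decoded = []
    for item in result.get("content", []):
        if item.get("type") != "text":
            continue  # non-text content is ignored in this sketch
        try:
            decoded.append(json.loads(item["text"]))
        except json.JSONDecodeError:
            decoded.append(item["text"])
    return decoded
```

The fallback to the raw string keeps the helper usable for tools that return plain prose instead of serialized JSON.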

MCP Server — Prompts (Skills)

Skills are exposed as MCP Prompts, enabling MCP hosts to discover and inject skill context into their own conversations.

RPC prompts/list

List all skills registered with this agent as MCP prompts.

Response

{
  "prompts": [
    { "name": "graph-search", "description": "Search the knowledge graph...", "arguments": [] },
    { "name": "memory-recall", "description": "Recall entities from prior sessions", "arguments": [] }
  ]
}

RPC prompts/get

Retrieve the system prompt fragment for a skill. The MCP host can inject this into its own context.

Request

{ "name": "graph-search" }

Response

{
  "messages": [
    {
      "role": "user",
      "content": "When searching for companies:\n1. Use company_lookup for direct lookups..."
    }
  ]
}

Claude Desktop Integration

MCP host config

Claude Desktop (and any MCP-compatible host) can connect to a running ai serve instance as an MCP server. Tools defined in your toolbox.yaml are exposed directly inside Claude conversations — no CLI required for end users.

Config file locations

macOS    ~/Library/Application Support/Claude/claude_desktop_config.json
Windows  %APPDATA%\Claude\claude_desktop_config.json

Local agent (no auth)

{
  "mcpServers": {
    "my-agent": {
      "url": "http://localhost:8081/mcp/sse"
    }
  }
}

Remote / deployed agent (with Bearer token)

{
  "mcpServers": {
    "my-agent": {
      "url": "https://my-agent.fly.dev/mcp/sse",
      "headers": {
        "Authorization": "Bearer <your-bearer-token>"
      }
    }
  }
}

Multiple servers can be listed under mcpServers — each key becomes a named tool source in Claude Desktop.
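
For scripted setups, the config entries above can be generated rather than edited by hand. A minimal sketch, assuming the "url" and "headers" keys shown in the examples; add_mcp_server is a hypothetical helper that returns a new config and leaves the input untouched.

```python
import json
from typing import Optional


def add_mcp_server(config: dict, name: str, url: str, token: Optional[str] = None) -> dict:
    """Return a copy of config with one more entry under mcpServers."""
    servers = dict(config.get("mcpServers", {}))
    entry = {"url": url}
    if token:
        entry["headers"] = {"Authorization": f"Bearer {token}"}
    servers[name] = entry
    return {**config, "mcpServers": servers}


config = add_mcp_server({}, "my-agent", "http://localhost:8081/mcp/sse")
print(json.dumps(config, indent=2))
```

The printed JSON matches the local agent example and can be written to the platform-specific config file listed above.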

Verify the connection

1. Save the config file and restart Claude Desktop (Quit + relaunch).
2. In any conversation, click the 🔧 tools icon in the input bar.
3. Your server name (e.g. my-agent) should appear with a green indicator.
4. Start a conversation — Claude will call your tools automatically when relevant.

Notes

• The MCP server runs on port 8081 by default (separate from the A2A port 8080).
  Set agent.mcp_server.port in agent.toml to change it.
• Bearer token is set via agent.security.bearer_token in agent.toml.
• For deployed agents, ai deploy fly prints the /mcp/sse URL after deployment.
• See Demo 05 — Deploy & Claude Desktop for a full walkthrough.

REST Management API

Internal / local use

Lightweight management endpoints for tooling and health checks. Served on the same port as A2A. The OpenAPI spec for this surface is available at /openapi.json.

GET /api/agents

List all configured agents and their current status.

Response

{ "agents": [{ "id": "companies-researcher", "status": "running", "activeSessions": 3 }] }

PUT /api/agents/{id}

Hot-reload agent config without restarting the server. Skills and tools are re-resolved; active sessions drain gracefully.

Request body

{ "configPath": "./agent.toml" }

Response

{ "id": "companies-researcher", "status": "reloaded" }

GET /api/health

Liveness and readiness probe. Returns 200 when the server is up and all sidecars are healthy. Returns 503 if any required sidecar is down.

Response (healthy)

{
  "status": "ok",
  "sidecars": {
    "genai-toolbox": "running",
    "cypher-mcp":    "running",
    "graphrag":      "running"
  }
}

Response (degraded)

HTTP 503
{ "status": "degraded", "sidecars": { "graphrag": "stopped" } }
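
A monitoring script might reduce the health response to the list of sidecars that need attention. The sketch below is an assumption-labeled example: failing_sidecars is a hypothetical name, and any sidecar value other than "running" is treated as unhealthy.

```python
from typing import List


def failing_sidecars(status_code: int, body: dict) -> List[str]:
    """Return the names of sidecars that are not running, sorted by name."""
    if status_code == 200 and body.get("status") == "ok":
        return []
    return sorted(
        name
        for name, state in body.get("sidecars", {}).items()
        if state != "running"  # assumption: "running" is the only healthy value
    )
```

An empty list alongside a 503 would mean the body did not name the failing sidecar, which a caller should treat as an unknown failure rather than a healthy server.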

GET /openapi.json

OpenAPI 3.0 specification for the REST management API. Suitable for import into Salesforce External Services or any OAS-compatible tool.

Response

OpenAPI 3.0 JSON document describing all /api/* endpoints