Design Principles
Thirteen principles that guide how we build agent-intelligence.ai — and how agents built with it should behave. Inspired by the 12-factor app, the 12-factor agents methodology, and the agent-native principles by Every.
01 ▶ F01-TERMINAL — terminal-native
The terminal is a first-class interface. If it runs in a terminal, it runs everywhere — CI, containers, SSH sessions, scripts. The CLI is not a wrapper around a web UI; it is the product. Every feature must be fully accessible from the command line before a web interface is considered.
02 ▶ F02-ZERO2RUN — zero to running
ai init → ai serve is the entire getting-started experience. Every additional step must justify its existence. No account creation, no API dashboard, no config-file editing required to run a first agent. Defaults are production-grade; overrides are available but never required upfront.
03 ▶ F03-FAST — fast by default
Sub-200ms cold start. Static binary. No daemon required. Performance is a feature, not an optimisation pass. A Go binary with no runtime overhead means the tool gets out of the way. Slow tooling trains developers to avoid running it — fast tooling trains tight feedback loops.
04 ▶ F04-PROGX — progressive exposure
Simple commands work with zero flags. Depth scales with the task. ai run "summarise this" works on day one. Advanced flags — fallback chains, tool filtering, context budgets — are discoverable but never in the way. The UI should not front-load complexity onto beginners or hide power from experts.
05 ▶ F05-OWN-PROMPT — own your prompts
Your system prompt is yours. New features ship as prompt fragments, not code. agent.toml is plain text. The system prompt is a field you write, read, and version-control. Nothing is injected silently by the platform. New capabilities are delivered as composable prompt fragments (ai skill add) — features that would require code changes in traditional software ship as prompt additions instead.
06 ▶ F06-OWN-CTX — own your context window
Context is managed explicitly. The agent knows what it has access to. What goes into the context window is what you put there. Tool results are summarised before injection when configured. Token budgets are visible and enforced. At session start the agent receives a structured summary of available data sources, files, and tools — an agent unaware of its environment cannot act effectively.
07 ▶ F07-STATELESS — stateless reducers
Agents are pure functions. Memory is explicit state. Improvement is auditable. Pause, resume, and replay are first-class operations. Session state — including accumulated memory across sessions — is serialisable, auditable, and rollback-capable. When an agent proposes a change to its own configuration, that change is visible and diffable before being applied. Human approval gates are the mechanism for pausing execution at high-stakes decision points.
08 ▶ F08-LOCAL1ST — local first
Runs on your laptop. Files are the agent-legible data format. Cypherlite provides an embedded graph backend that needs no external service. The full agent loop works offline. Files are the preferred data format for agent-readable state — a directory layout an agent can navigate with fs_list and fs_read is more legible than rows in a database. Design for what agents can reason about: if a human can understand the structure, an agent can too.
09 ▶ F09-PROTOCOL — protocol-native
MCP and A2A are the transport layer. If a user can click it, an agent can call it. Every agent created with ai init is simultaneously an MCP server and an A2A endpoint from day one. Every capability in ai web or ai tui has an agent-callable MCP equivalent — no orphan UI actions. A parity map (.agint/parity.yaml) is maintained and checked in CI. MCP is the single agent-callable surface across all tiers: UI interactions, management operations (reload config, switch agent), and data-plane calls share one protocol. Any external agent that speaks MCP can fully operate the UI — no carve-outs, no parallel REST management surface.
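A parity map might look something like the following. This is a hypothetical sketch: the real schema of .agint/parity.yaml is defined by the project and is not reproduced here.

```yaml
# .agint/parity.yaml (hypothetical sketch, not the project's actual schema)
# Each UI action maps to the MCP tool that provides the same capability.
ui_actions:
  - action: graph.run_query     # "Run query" button in ai web
    mcp_tool: graph_query
  - action: config.reload       # management action in ai tui
    mcp_tool: config_reload
# CI fails if any UI action lacks an mcp_tool entry: no orphan UI actions.
```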
10 ▶ F10-COMPOSE — composable over monolithic
Small focused agents. Atomic tools. Behaviour changes via prompts, not refactors. An agent with fewer than 20 tools and a single well-defined responsibility outperforms a Swiss-army agent. Build the UNIX way: small programs composed via MCP and A2A. Tools are atomic primitives — business logic belongs in the prompt. The test: if changing agent behaviour requires refactoring tool code rather than editing a prompt, the tools encode too much logic.
11 ▶ F11-OBSERVE — observable over magical
Every inference and tool call is traceable. No black boxes. --debug shows the full context window, token counts, and tool call payloads. OpenTelemetry traces export to any backend. Errors are surfaced with enough context to self-correct, not swallowed by a retry loop. If you can't see it, you can't fix it.
12 ▶ F12-ACCESS — accessible by default
--plain output works in pipelines, CI, and screen readers. ANSI colour, Unicode symbols, box-drawing characters, and emoji are all stripped by --plain. The tool outputs clean, parseable text when piped. The web UI meets WCAG 2.1 AA. Accessibility is a correctness property, not a polish task.
13 ▶ F13-EMERGE — emergent by design
Build atomic primitives. Observe what agents accomplish. Formalize the patterns. The platform should enable agents to accomplish tasks it was not explicitly designed for. When an agent combines existing tools to handle an unanticipated request, that is not a bug — it is the system working as intended. New features are discovered by observing what agents do, then formalizing durable patterns as domain tools or prompt fragments. The test: describe an outcome within the agent's domain that has no dedicated feature. If the agent can compose existing tools to accomplish it, the architecture passes.
How we follow these in practice
Principles only matter when they shape code. Below is how each principle shows up in the agent-intelligence codebase — with concrete examples you can grep for.
01 ▶ F01-TERMINAL — stdout/stderr separation
All CLI commands (ai serve, ai run, ai init, ai web) write status and progress messages to stderr. Only the agent's final answer goes to stdout. This means ai run "summarise this" 2>/dev/null produces clean, pipeable output.
// cmd_run.go — tool trace and token counts go to stderr
fmt.Fprintf(os.Stderr, "[tool] %s (%s)\n", name, duration)
fmt.Fprintf(os.Stderr, "[tokens] in: %s out: %s\n", inTok, outTok)
cmd/ai/cmd_run.go, cmd/ai/cmd_serve.go, internal/agent/agent_runner.go
ADR: This is a cross-cutting convention. See ADR-001 for why the CLI is the product, not a web wrapper.
02 ▶ F02-ZERO2RUN — explicit error messages on the getting-started path
When ANTHROPIC_API_KEY is missing, ai init tells you exactly what's needed instead of a vague "AI interview unavailable" fallback:
Note: AI-assisted setup requires ANTHROPIC_API_KEY. Falling back to manual mode.
Set the key and re-run for the recommended experience.
Every error on the ai init → ai serve path names the missing prerequisite
and suggests the fix.
cmd/ai/cmd_init.go
ADR: ADR-002 — Cypherlite was chosen as the local graph backend to eliminate the Neo4j credential step from onboarding.
03 ▶ F03-FAST — static Go binary, no runtime
The ai binary is a single static executable (~25 MB). No Python runtime, no Node.js, no JVM. Cold start is sub-200ms. Cross-compiled for 5 targets via make cross-compile with CGO_ENABLED=0.
# Makefile — all cross-compile targets disable CGO
GOARCH=amd64 GOOS=linux CGO_ENABLED=0 go build -o bin/ai-linux-amd64 ...
Makefile
ADR: ADR-001 — custom assembly chosen over google/adk-go partly because the Google dep graph adds 8-15 MB to the binary.
04 ▶ F04-PROGX — zero flags to start, depth when you need it
ai run "hello" works with no config file. Power features are additive flags:
--verbose for tool trace, --model to override the default,
--max-turns to cap the agent loop. Config fields like
disabled_tools, max_tool_result_chars, and
classify_prompt are all optional with sensible defaults.
# zero-config run
ai run "what is 2+2"
# progressive depth
ai run --verbose --model claude-sonnet-4-20250514 --max-turns 5 "analyse this repo"
cmd/ai/cmd_run.go, internal/config/config.go
ADR: ADR-009 — sandbox tiers (JS → Python → Shell) are a progressive capability ladder: agents start with JS, opt into heavier runtimes as needed.
05 ▶ F05-OWN-PROMPT — configurable system and internal prompts
The system prompt lives in agent.toml as plain text. Internal prompts (intent classification, tool selection) that were previously hardcoded are now overridable via config:
# agent.toml
[agent]
system_prompt = "You are a research assistant..."
[agent.intent_router]
classify_prompt = "..." # optional — overrides built-in default
[agent.tool_context]
selector_prompt = "..." # optional — overrides built-in default
When these fields are empty, the built-in defaults are used. When set, operators
control exactly what the model sees.
internal/agent/intent_router.go, internal/agent/tool_context.go, internal/config/config.go
ADR: ADR-010 — the tool selector prompt was originally hardcoded (Strategy A); it is now configurable, following the audit.
06 ▶ F06-OWN-CTX — tool result truncation and token budgets
Large tool results are truncated before context injection when
max_tool_result_chars is set. The truncation is visible — the agent sees
exactly how much was cut:
[agent.budget]
max_tool_result_chars = 8000
// internal/agent/context.go
func TruncateToolResult(output string, maxChars int) string {
    if maxChars <= 0 || len(output) <= maxChars {
        return output
    }
    omitted := len(output) - maxChars
    return output[:maxChars] +
        fmt.Sprintf("\n[truncated — %d chars omitted]", omitted)
}
internal/agent/context.go, internal/agent/run.go
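A self-contained round-trip of the truncation logic, reproducing the helper from the excerpt above so the sketch compiles on its own:

```go
package main

import "fmt"

// TruncateToolResult cuts a tool result to maxChars and appends a visible
// marker so the agent can see how much was omitted.
func TruncateToolResult(output string, maxChars int) string {
	if maxChars <= 0 || len(output) <= maxChars {
		return output
	}
	omitted := len(output) - maxChars
	return output[:maxChars] +
		fmt.Sprintf("\n[truncated — %d chars omitted]", omitted)
}

func main() {
	long := make([]byte, 10000)
	for i := range long {
		long[i] = 'x'
	}
	out := TruncateToolResult(string(long), 8000)
	fmt.Println(len(out)) // 8000 chars plus the truncation marker
}
```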
ADR: ADR-007 — the W/S/C/I context strategy with Warn/Compact/Abort thresholds. Budget enforcement means operators always know what's in the window.
07 ▶ F07-STATELESS — serialisable session state
Agent state is a value — not hidden in goroutines or closures.
interview.State is a plain struct with exported fields, serialisable
to JSON for pause/resume. Package-level mutable state (like the skill registry's
sync.Once) is documented as a known limitation for full replay.
internal/interview/interview.go, internal/skill/skill.go
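The shape of that guarantee can be sketched as follows. The field names are hypothetical; interview.State's real fields live in the source tree.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// State is a plain value: exported fields only, no goroutines, no closures.
// Hypothetical fields for illustration.
type State struct {
	Step    int      `json:"step"`
	Answers []string `json:"answers"`
}

// Pause serialises the state; Resume restores it. Round-tripping through
// JSON is what makes pause/resume and replay first-class operations.
func Pause(s State) ([]byte, error) { return json.Marshal(s) }

func Resume(data []byte) (s State, err error) {
	err = json.Unmarshal(data, &s)
	return
}

func main() {
	saved, _ := Pause(State{Step: 2, Answers: []string{"local graph"}})
	restored, _ := Resume(saved)
	fmt.Println(restored.Step, restored.Answers[0]) // prints "2 local graph"
}
```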
ADR: ADR-007 — context engineering uses explicit token budgets and compaction thresholds, all serialisable as config values.
08 ▶ F08-LOCAL1ST — Cypherlite as the default graph backend
ai init offers "Local graph (Cypherlite — no credentials needed)" as the
first option in the database menu. Selecting it skips all credential
prompts and writes a local-only config:
[agent.graph]
uri = "cypherlite://.agint/graph"
Neo4j Aura and other cloud databases are available but always second in the list.
The full agent loop works without any external service.
cmd/ai/cmd_init.go, internal/interview/interview.go
ADR: ADR-002 — Cypherlite chosen as the local graph backend. Embeddable, Cypher-compatible, pure Go, no external service needed.
09 ▶ F09-PROTOCOL — MCP tools replace REST endpoints
Graph operations that were exposed as custom REST endpoints (/api/graph/query, /api/graph/schema) are now registered as MCP tools (graph_query, graph_schema). The REST endpoints return X-Deprecated headers and will be removed in a future release:
// internal/mgmtapi/mgmtapi.go
w.Header().Set("X-Deprecated", "true")
w.Header().Set("X-Deprecated-Message",
    `Use MCP tool "graph_query" instead.`)
Data-plane operations go through MCP. Only management-plane endpoints
(/api/config, /api/health) remain as REST.
internal/mcpserver/server.go, internal/mgmtapi/mgmtapi.go
ADR: ADR-006 — every agent is simultaneously an MCP server and client. MCP is the standard interface for all tool operations.
10 ▶ F10-COMPOSE — per-deployment tool filtering
Operators can disable specific MCP tools per deployment. A production agent that
should never execute shell commands:
# agent.toml
[mcp_server]
disabled_tools = ["execute_shell", "execute_python"]
Tools in this list are skipped during MCP server registration and never appear
in tools/list. The filter applies to both built-in and sandbox tools.
internal/mcpserver/server.go, internal/config/config.go
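The skip logic amounts to a set lookup during registration. A minimal sketch (hypothetical function, not the project's actual registration code):

```go
package main

import "fmt"

// filterTools drops any tool named in disabled before registration, so
// disabled tools never appear in tools/list.
// Hypothetical helper for illustration only.
func filterTools(all, disabled []string) []string {
	skip := make(map[string]bool, len(disabled))
	for _, name := range disabled {
		skip[name] = true
	}
	var kept []string
	for _, name := range all {
		if !skip[name] {
			kept = append(kept, name)
		}
	}
	return kept
}

func main() {
	all := []string{"fs_read", "execute_shell", "graph_query", "execute_python"}
	disabled := []string{"execute_shell", "execute_python"}
	fmt.Println(filterTools(all, disabled)) // prints "[fs_read graph_query]"
}
```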
ADR: ADR-005 — split MCP libraries (official SDK for client, mark3labs for server) so each role can be composed independently.
11 ▶ F11-OBSERVE — token counts and full-rate tracing
--verbose prints per-turn token counts and cost to stderr after each
model call. OpenTelemetry defaults to 100% sampling when
OTEL_EXPORTER_OTLP_ENDPOINT is set — because if you configured tracing,
you want traces:
// internal/telemetry/telemetry.go
if endpoint != "" && os.Getenv("OTEL_TRACES_SAMPLER") == "" {
    sampler = sdktrace.AlwaysSample()
}
The 10% default only applies when no exporter is configured (effectively a no-op).
cmd/ai/cmd_run.go, internal/telemetry/telemetry.go
ADR: ADR-007 — context budget thresholds (Warn at 70%, Compact at 80%, Abort at 95%) fire hooks so operators can observe budget consumption.
12 ▶ F12-ACCESS — --plain implies --no-color
--plain automatically sets --no-color at the root command
level, so every code path that checks NoColor also catches
--plain. The NO_COLOR environment variable disables ANSI
codes per the no-color.org spec.
// cmd/ai/root.go
if gf.Plain {
    gf.NoColor = true // --plain implies --no-color
}
cmd/ai/root.go
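Stripping ANSI escape sequences for plain output can be done with a single regular expression. A minimal sketch (hypothetical helper; the project's actual implementation is not shown):

```go
package main

import (
	"fmt"
	"regexp"
)

// ansiRe matches CSI escape sequences such as "\x1b[31m" (colour) and
// "\x1b[0m" (reset).
var ansiRe = regexp.MustCompile(`\x1b\[[0-9;]*[A-Za-z]`)

// StripANSI removes colour and cursor codes so piped output is clean text.
func StripANSI(s string) string {
	return ansiRe.ReplaceAllString(s, "")
}

func main() {
	coloured := "\x1b[32mOK\x1b[0m done"
	fmt.Println(StripANSI(coloured)) // prints "OK done"
}
```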
ADR: ADR-013 — Google ID token auth was designed to work without manual key distribution, keeping the zero-config accessible path.
13 ▶ F13-EMERGE — the agent-native UI is agent-intelligence running on itself
The ultimate test of F13-EMERGE is ai web and ai tui: instead of a hand-coded dashboard, a running ai[] agent is the interface. The agent uses its own MCP tools to navigate the UI, query databases, propose config changes, and crystallize successful patterns into reusable skills — tasks the developer never explicitly coded.
# Every UI action has an MCP equivalent (F09-PROTOCOL parity)
# Parity map lives at .agint/parity.yaml
# CI check: make parity-check
# The agent_config_propose tool lets the agent improve its own prompts
# New onboarding features ship as prompt fragments, not Go code (F05-OWN-PROMPT)
# Memory crystallizes successful tool sequences into skills (F07-STATELESS)
ADR: ADR-009 — the sandbox subsystem provides the atomic primitives (shell, JS, Python) that enable emergent agent behaviour. Agents compose these tiers to accomplish tasks the sandbox was not explicitly built for.
ADR: ADR-014 — serverless functions are rejected partly because ephemeral processes cannot accumulate the context and memory that emergent behaviour requires.