Design Principles
Thirteen principles that guide how we build agent-intelligence.ai — and how agents built with it should behave. Inspired by the 12-factor app, the 12-factor agents methodology, and the agent-native principles by Every.
01 ▶ F01-TERMINAL — terminal-native
The terminal is a first-class interface. If it runs in a terminal, it runs everywhere — CI, containers, SSH sessions, scripts. The CLI is not a wrapper around a web UI; it is the product. Every feature must be fully accessible from the command line before a web interface is considered.
02 ▶ F02-ZERO2RUN — zero to running
ai init → ai serve is the entire getting-started experience. Every additional step must justify its existence. No account creation, no API dashboard, no config-file editing required to run a first agent. Defaults are production-grade; overrides are available but never required upfront.
03 ▶ F03-FAST — fast by default
Sub-200ms cold start. Static binary. No daemon required. Performance is a feature, not an optimisation pass. A Go binary with no runtime overhead means the tool gets out of the way. Slow tooling trains developers to avoid running it — fast tooling trains tight feedback loops.
04 ▶ F04-PROGX — progressive exposure
Simple commands work with zero flags. Depth scales with the task. ai run "summarise this" works on day one. Advanced flags — fallback chains, tool filtering, context budgets — are discoverable but never in the way. The UI should not front-load complexity onto beginners or hide power from experts.
05 ▶ F05-OWN-PROMPT — own your prompts
Your system prompt is yours. New features ship as prompt fragments, not code. agent.toml is plain text. The system prompt is a field you write, read, and version-control. Nothing is injected silently by the platform. New capabilities are delivered as composable prompt fragments (ai skill add) — features that would require code changes in traditional software ship as prompt additions instead.
06 ▶ F06-OWN-CTX — own your context window
Context is managed explicitly. The agent knows what it has access to. What goes into the context window is what you put there. Tool results are summarised before injection when configured. Token budgets are visible and enforced. At session start the agent receives a structured summary of available data sources, files, and tools — an agent unaware of its environment cannot act effectively.
07 ▶ F07-STATELESS — stateless reducers
Agents are pure functions. Memory is explicit state. Improvement is auditable. Pause, resume, and replay are first-class operations. Session state — including accumulated memory across sessions — is serialisable, auditable, and rollback-capable. When an agent proposes a change to its own configuration, that change is visible and diffable before being applied. Human approval gates are the mechanism for pausing execution at high-stakes decision points.
08 ▶ F08-LOCAL1ST — local first
Runs on your laptop. Files are the agent-legible data format. Cypherlite provides an embedded graph backend that needs no external service. The full agent loop works offline. Files are the preferred data format for agent-readable state — a directory layout an agent can navigate with fs_list and fs_read is more legible than rows in a database. Design for what agents can reason about: if a human can understand the structure, an agent can too.
09 ▶ F09-PROTOCOL — protocol-native
MCP and A2A are the transport layer. If a user can click it, an agent can call it. Every agent created with ai init is simultaneously an MCP server and an A2A endpoint from day one. Every capability in ai web or ai tui has an agent-callable MCP equivalent — no orphan UI actions. A parity map (.agint/parity.yaml) is maintained and checked in CI. MCP is the single agent-callable surface across all tiers: UI interactions, management operations (reload config, switch agent), and data-plane calls share one protocol. Any external agent that speaks MCP can fully operate the UI — no carve-outs, no parallel REST management surface.
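A parity map might look something like the following. This is a hypothetical sketch: the real schema of .agint/parity.yaml is defined by the project and is not reproduced here.

```yaml
# .agint/parity.yaml (hypothetical sketch, not the project's actual schema)
# Each UI action maps to the MCP tool that provides the same capability.
ui_actions:
  - action: graph.run_query     # "Run query" button in ai web
    mcp_tool: graph_query
  - action: config.reload       # management action in ai tui
    mcp_tool: config_reload
# CI fails if any UI action lacks an mcp_tool entry: no orphan UI actions.
```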
10 ▶ F10-COMPOSE — composable over monolithic
Small focused agents. Atomic tools. Behaviour changes via prompts, not refactors. An agent with fewer than 20 tools and a single well-defined responsibility outperforms a Swiss-army agent. Build the UNIX way: small programs composed via MCP and A2A. Tools are atomic primitives — business logic belongs in the prompt. The test: if changing agent behaviour requires refactoring tool code rather than editing a prompt, the tools encode too much logic.
11 ▶ F11-OBSERVE — observable over magical
Every inference and tool call is traceable. No black boxes. --debug shows the full context window, token counts, and tool call payloads. OpenTelemetry traces export to any backend. Errors are surfaced with enough context to self-correct, not swallowed by a retry loop. If you can't see it, you can't fix it.
12 ▶ F12-ACCESS — accessible by default
--plain output works in pipelines, CI, and screen readers. ANSI colour, Unicode symbols, box-drawing characters, and emoji are all stripped by --plain. The tool outputs clean, parseable text when piped. The web UI meets WCAG 2.1 AA. Accessibility is a correctness property, not a polish task.
13 ▶ F13-EMERGE — emergent by design
Build atomic primitives. Observe what agents accomplish. Formalize the patterns. The platform should enable agents to accomplish tasks it was not explicitly designed for. When an agent combines existing tools to handle an unanticipated request, that is not a bug — it is the system working as intended. New features are discovered by observing what agents do, then formalizing durable patterns as domain tools or prompt fragments. The test: describe an outcome within the agent's domain that has no dedicated feature. If the agent can compose existing tools to accomplish it, the architecture passes.
How we follow these in practice
Principles only matter when they shape code. Below is how each principle shows up in the agent-intelligence codebase — with concrete examples you can grep for.
01 ▶ F01-TERMINAL — stdout/stderr separation
All CLI commands (ai serve, ai run, ai init, ai web) write status and progress messages to stderr. Only the agent's final answer goes to stdout. This means ai run "summarise this" 2>/dev/null produces clean, pipeable output.
// cmd_run.go — tool trace and token counts go to stderr
fmt.Fprintf(os.Stderr, "[tool] %s (%s)\n", name, duration)
fmt.Fprintf(os.Stderr, "[tokens] in: %s out: %s\n", inTok, outTok)
cmd/ai/cmd_run.go, cmd/ai/cmd_serve.go, internal/agent/agent_runner.go
ADR: This is a cross-cutting convention. See ADR-001 for why the CLI is the product, not a web wrapper.
02 ▶ F02-ZERO2RUN — explicit error messages on the getting-started path
When ANTHROPIC_API_KEY is missing, ai init tells you exactly what's needed instead of a vague "AI interview unavailable" fallback:
Note: AI-assisted setup requires ANTHROPIC_API_KEY. Falling back to manual mode.
Set the key and re-run for the recommended experience.
Every error on the ai init → ai serve path names the missing prerequisite
and suggests the fix.
cmd/ai/cmd_init.go
ADR: ADR-002 — Cypherlite was chosen as the local graph backend to eliminate the Neo4j credential step from onboarding.
03 ▶ F03-FAST — static Go binary, no runtime
The ai binary is a single static executable (~25 MB). No Python runtime, no Node.js, no JVM. Cold start is sub-200ms. Cross-compiled for 5 targets via make cross-compile with CGO_ENABLED=0.
# Makefile — all cross-compile targets disable CGO
GOARCH=amd64 GOOS=linux CGO_ENABLED=0 go build -o bin/ai-linux-amd64 ...
Makefile
ADR: ADR-001 — custom assembly chosen over google/adk-go partly because the Google dep graph adds 8-15 MB to the binary.
04 ▶ F04-PROGX — zero flags to start, depth when you need it
ai run "hello" works with no config file. Power features are additive flags:
--verbose for tool trace, --model to override the default,
--max-turns to cap the agent loop. Config fields like
disabled_tools, max_tool_result_chars, and
classify_prompt are all optional with sensible defaults.
# zero-config run
ai run "what is 2+2"
# progressive depth
ai run --verbose --model claude-sonnet-4-20250514 --max-turns 5 "analyse this repo"
cmd/ai/cmd_run.go, internal/config/config.go
ADR: ADR-009 — sandbox tiers (JS → Python → Shell) are a progressive capability ladder: agents start with JS, opt into heavier runtimes as needed.
05 ▶ F05-OWN-PROMPT — configurable system and internal prompts
The system prompt lives in agent.toml as plain text. Internal prompts (intent classification, tool selection) that were previously hardcoded are now overridable via config:
# agent.toml
[agent]
system_prompt = "You are a research assistant..."
[agent.intent_router]
classify_prompt = "..." # optional — overrides built-in default
[agent.tool_context]
selector_prompt = "..." # optional — overrides built-in default
When these fields are empty, the built-in defaults are used. When set, operators
control exactly what the model sees.
internal/agent/intent_router.go, internal/agent/tool_context.go, internal/config/config.go
ADR: ADR-010 — the tool selector prompt was originally hardcoded (Strategy A); it is now configurable, following the audit.
06 ▶ F06-OWN-CTX — tool result truncation and token budgets
Large tool results are truncated before context injection when
max_tool_result_chars is set. The truncation is visible — the agent sees
exactly how much was cut:
[agent.budget]
max_tool_result_chars = 8000
// internal/agent/context.go
func TruncateToolResult(output string, maxChars int) string {
    if maxChars <= 0 || len(output) <= maxChars {
        return output
    }
    omitted := len(output) - maxChars
    return output[:maxChars] +
        fmt.Sprintf("\n[truncated — %d chars omitted]", omitted)
}
internal/agent/context.go, internal/agent/run.go
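A self-contained round-trip of the truncation logic, reproducing the helper from the excerpt above so the sketch compiles on its own:

```go
package main

import "fmt"

// TruncateToolResult cuts a tool result to maxChars and appends a visible
// marker so the agent can see how much was omitted.
func TruncateToolResult(output string, maxChars int) string {
	if maxChars <= 0 || len(output) <= maxChars {
		return output
	}
	omitted := len(output) - maxChars
	return output[:maxChars] +
		fmt.Sprintf("\n[truncated — %d chars omitted]", omitted)
}

func main() {
	long := make([]byte, 10000)
	for i := range long {
		long[i] = 'x'
	}
	out := TruncateToolResult(string(long), 8000)
	fmt.Println(len(out)) // 8000 chars plus the truncation marker
}
```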
ADR: ADR-007 — the W/S/C/I context strategy with Warn/Compact/Abort thresholds. Budget enforcement means operators always know what's in the window.
07 ▶ F07-STATELESS — serialisable session state
Agent state is a value — not hidden in goroutines or closures.
interview.State is a plain struct with exported fields, serialisable
to JSON for pause/resume. Package-level mutable state (like the skill registry's
sync.Once) is documented as a known limitation for full replay.
internal/interview/interview.go, internal/skill/skill.go
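The shape of that guarantee can be sketched as follows. The field names are hypothetical; interview.State's real fields live in the source tree.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// State is a plain value: exported fields only, no goroutines, no closures.
// Hypothetical fields for illustration.
type State struct {
	Step    int      `json:"step"`
	Answers []string `json:"answers"`
}

// Pause serialises the state; Resume restores it. Round-tripping through
// JSON is what makes pause/resume and replay first-class operations.
func Pause(s State) ([]byte, error) { return json.Marshal(s) }

func Resume(data []byte) (s State, err error) {
	err = json.Unmarshal(data, &s)
	return
}

func main() {
	saved, _ := Pause(State{Step: 2, Answers: []string{"local graph"}})
	restored, _ := Resume(saved)
	fmt.Println(restored.Step, restored.Answers[0]) // prints "2 local graph"
}
```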
ADR: ADR-007 — context engineering uses explicit token budgets and compaction thresholds, all serialisable as config values.
08 ▶ F08-LOCAL1ST — Cypherlite as the default graph backend
ai init offers "Local graph (Cypherlite — no credentials needed)" as the
first option in the database menu. Selecting it skips all credential
prompts and writes a local-only config:
[agent.graph]
uri = "cypherlite://.agint/graph"
Neo4j Aura and other cloud databases are available but always second in the list.
The full agent loop works without any external service.
cmd/ai/cmd_init.go, internal/interview/interview.go
ADR: ADR-002 — Cypherlite chosen as the local graph backend. Embeddable, Cypher-compatible, pure Go, no external service needed.
09 ▶ F09-PROTOCOL — MCP tools replace REST endpoints
Graph operations that were exposed as custom REST endpoints (/api/graph/query, /api/graph/schema) are now registered as MCP tools (graph_query, graph_schema). The REST endpoints return X-Deprecated headers and will be removed in a future release:
// internal/mgmtapi/mgmtapi.go
w.Header().Set("X-Deprecated", "true")
w.Header().Set("X-Deprecated-Message",
    `Use MCP tool "graph_query" instead.`)
Data-plane operations go through MCP. Only management-plane endpoints
(/api/config, /api/health) remain as REST.
internal/mcpserver/server.go, internal/mgmtapi/mgmtapi.go
ADR: ADR-006 — every agent is simultaneously an MCP server and client. MCP is the standard interface for all tool operations.
10 ▶ F10-COMPOSE — per-deployment tool filtering
Operators can disable specific MCP tools per deployment. A production agent that
should never execute shell commands:
# agent.toml
[mcp_server]
disabled_tools = ["execute_shell", "execute_python"]
Tools in this list are skipped during MCP server registration and never appear
in tools/list. The filter applies to both built-in and sandbox tools.
internal/mcpserver/server.go, internal/config/config.go
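The skip logic amounts to a set lookup during registration. A minimal sketch (hypothetical function, not the project's actual registration code):

```go
package main

import "fmt"

// filterTools drops any tool named in disabled before registration, so
// disabled tools never appear in tools/list.
// Hypothetical helper for illustration only.
func filterTools(all, disabled []string) []string {
	skip := make(map[string]bool, len(disabled))
	for _, name := range disabled {
		skip[name] = true
	}
	var kept []string
	for _, name := range all {
		if !skip[name] {
			kept = append(kept, name)
		}
	}
	return kept
}

func main() {
	all := []string{"fs_read", "execute_shell", "graph_query", "execute_python"}
	disabled := []string{"execute_shell", "execute_python"}
	fmt.Println(filterTools(all, disabled)) // prints "[fs_read graph_query]"
}
```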
ADR: ADR-005 — split MCP libraries (official SDK for client, mark3labs for server) so each role can be composed independently.
11 ▶ F11-OBSERVE — token counts and full-rate tracing
--verbose prints per-turn token counts and cost to stderr after each
model call. OpenTelemetry defaults to 100% sampling when
OTEL_EXPORTER_OTLP_ENDPOINT is set — because if you configured tracing,
you want traces:
// internal/telemetry/telemetry.go
if endpoint != "" && os.Getenv("OTEL_TRACES_SAMPLER") == "" {
    sampler = sdktrace.AlwaysSample()
}
The 10% default only applies when no exporter is configured (effectively a no-op).
cmd/ai/cmd_run.go, internal/telemetry/telemetry.go
ADR: ADR-007 — context budget thresholds (Warn at 70%, Compact at 80%, Abort at 95%) fire hooks so operators can observe budget consumption.
12 ▶ F12-ACCESS — --plain implies --no-color
--plain automatically sets --no-color at the root command
level, so every code path that checks NoColor also catches
--plain. The NO_COLOR environment variable disables ANSI
codes per the no-color.org spec.
// cmd/ai/root.go
if gf.Plain {
    gf.NoColor = true // --plain implies --no-color
}
cmd/ai/root.go
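Stripping ANSI escape sequences for plain output can be done with a single regular expression. A minimal sketch (hypothetical helper; the project's actual implementation is not shown):

```go
package main

import (
	"fmt"
	"regexp"
)

// ansiRe matches CSI escape sequences such as "\x1b[31m" (colour) and
// "\x1b[0m" (reset).
var ansiRe = regexp.MustCompile(`\x1b\[[0-9;]*[A-Za-z]`)

// StripANSI removes colour and cursor codes so piped output is clean text.
func StripANSI(s string) string {
	return ansiRe.ReplaceAllString(s, "")
}

func main() {
	coloured := "\x1b[32mOK\x1b[0m done"
	fmt.Println(StripANSI(coloured)) // prints "OK done"
}
```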
ADR: ADR-013 — Google ID token auth was designed to work without manual key distribution, keeping the zero-config accessible path.
13 ▶ F13-EMERGE — the agent-native UI is agent-intelligence running on itself
The ultimate test of F13-EMERGE is ai web and ai tui: instead of a hand-coded dashboard, a running ai[] agent is the interface. The agent uses its own MCP tools to navigate the UI, query databases, propose config changes, and crystallize successful patterns into reusable skills — tasks the developer never explicitly coded.
# Every UI action has an MCP equivalent (F09-PROTOCOL parity)
# Parity map lives at .agint/parity.yaml
# CI check: make parity-check
# The agent_config_propose tool lets the agent improve its own prompts
# New onboarding features ship as prompt fragments, not Go code (F05-OWN-PROMPT)
# Memory crystallizes successful tool sequences into skills (F07-STATELESS)
ADR: ADR-009 — the sandbox subsystem provides the atomic primitives (shell, JS, Python) that enable emergent agent behaviour. Agents compose these tiers to accomplish tasks the sandbox was not explicitly built for.
ADR: ADR-014 — serverless functions are rejected partly because ephemeral processes cannot accumulate the context and memory that emergent behaviour requires.