Roadmap — agent-intelligence.ai

210 Completed

0 In Progress

44 Planned

257 Total

82% complete

Core System

Agent execution loop, model routing, config schema, context engineering, memory, cost tracking, and evaluation infrastructure.

ID	Task	Priority	Status
task-010	Go module and project skeleton F03 F02		completed
task-011	Agent config schema (TOML) and loader F05		completed
task-022	4-layer memory architecture F07		completed
task-030	AI-assisted config generation prompt and engine F02 F05		completed
task-040	Model router — Anthropic + OpenAI-compat + fallback chain F10 F06		completed
task-041	Core agent execution loop with NextStep and Hooks F07 F10		completed
task-042	Context engineering layer — ContextEngine F06 F05		completed
task-043	Tool executor — parallel dispatch, partial failure, timeouts F10 F11		completed
task-044	Security layer — prompt injection detection + A2A input validation F05 F06		completed
task-045	Model fallback chain and cost circuit breaker F03 F06		completed
task-046	Prompt caching optimization F06 F03		completed
task-047	Tool context management for large tool lists F06 F04		completed
task-083	Durable session store — FileSessionStore + crash recovery F07		completed
task-085	Cost observability and budget reporting F11 F06		completed
task-156	Human-in-the-loop: built-in `human_approval` MCP tool F07	medium	completed
task-157	Explicit control flow: intent router and tool-call guards F05 F06	low	completed

Validation spikes (completed)

task-001	Go agent framework evaluation F03 F09		completed
task-002	Cypherlite Go bindings validation F08		completed
task-003	Anthropic Go SDK — streaming + tool-use validation F06		completed
task-004	mcp-toolbox Neo4j source evaluation F09		completed
task-005	MCP Go library selection F09		completed
task-006	MCP dual-role PoC — agent as client AND server F09 F10		completed
task-007	Context engineering validation harness F06		completed
task-160	Prompt injection hardening in agent loop F05 F06	high	completed
task-161	Reversibility taxonomy + default-approve-required for irreversible tool calls F07	high	completed
task-162	Trust hierarchy: principal tiers + trust-level enforcement in agent loop F05	high	completed
task-163	Platform safety floor: non-overridable hardcoded behaviours F11	medium	completed
task-164	Honesty guidelines: uncertainty language, capability statements, AI identity F05	medium	completed
task-170	Wire agent.Agent.Run as real TaskRunner in ai serve F07 F09	critical	completed
task-172	Config-to-agent wiring verification: context thresholds and tool call Index sort F06 F05	high	completed
task-173	cmd_init test coverage F02	medium	completed
task-270	GenerateToolboxConfig — read_cypher + write_cypher opt-in tools F10 F05	high	completed
task-272	write_cypher — wire as R_DESTROY in generated agent.toml and built-in patterns F07 F05	high	completed
task-274	Integration test — read_cypher executes live Neo4j query via mcp-toolbox F11	medium	completed
task-281	Dynamic tool descriptions — inject agent name + purpose into agent_run/agent_stream F02 F01	high	completed
task-290	TokenValidator interface + auth middleware for MCP and A2A servers	critical	completed
task-291	RFC 9728 Protected Resource Metadata endpoint	high	completed
task-292	JWT validation with JWKS auto-discovery and key rotation	high	completed
task-293	A2A AgentCard OAuth2 SecurityScheme + Google DCR extension	medium	completed
task-294	Auth configuration in agent.toml + deploy.auth schema F02	high	completed
task-298	E2E auth test: MCP RFC 7591 DCR + A2A Google JWT DCR	high	completed
task-299	Google Marketplace DCR endpoint — software_statement JWT	high	completed
task-300	Spike: Validate Cloud Run Go SDK for service deployment F09 F02	critical	completed
task-301	Spike: Validate Google ID token validation for A2A auth F09	critical	completed
task-302	Spike: Validate Gemini Enterprise A2A agent registration F09	high	completed
task-320	Google ID token auth mode for A2A server F09	critical	completed
task-321	Outbound Google ID token minting for service-to-service A2A F09	high	completed
task-330	Extend AgentCard to A2A v0.3 / Gemini Enterprise schema F09	high	completed
task-350	Config schema: Cloud Run deployment and Google auth fields F02 F04	high	completed
task-370	SPIKE: Cypherlite technical analysis + integration plan F08 F02 F03	high	completed
task-385	Integration tests for CypherliteBackend and seed package F08 F02 F03	high	completed
task-403	Deprecate mgmtapi data-plane endpoints — migrate to MCP tools F09 F10	high	completed
task-405	Add disabled_tools config to MCP server for per-deployment tool filtering F10 F04	medium	completed
task-407	Make intent classification and tool selector prompts configurable F05	low	completed
task-408	Add max_tool_result_tokens config for tool result truncation F06	low	completed
task-600	Config schema: tool layer, memory, MCP Apps sections		planned
task-601	Generalizable tool: fs_read / fs_write / fs_list / fs_watch		planned
task-602	Generalizable tool: http_request		planned
task-603	Generalizable tool: shell_exec MCP adapter		planned
task-604	Generalizable tool: mcp_connect / mcp_list_tools / mcp_call		planned
task-605	Generalizable tool: browser_* (chromedp computer use)		planned
task-606	Generalizable tool: agent_config_read / agent_config_propose		planned
task-607	Approval gate: human_approval MCP tool + infrastructure		planned
task-608	Panel registry + dynamic panel system		planned
task-609	Chat panel + Trace panel (always-on minimum UI)		planned
task-610	UI parity tool surface (ui_* MCP tools)		planned
task-611	Parity map (.agint/parity.yaml) + make parity-check CI		planned
task-612	Context accumulation: context.md write/read lifecycle		planned
task-613	Terminal panel: xterm.js + WebSocket shell bridge		planned
task-614	HTTP Playground panel		planned
task-615	MCP Explorer panel		planned
task-616	agent-memory package integration (config + Go interface wiring)		planned
task-617	Memory MCP tools: memory_search / read / write / delete / consolidate		planned
task-618	Memory context injection at session start (budget-aware)		planned
task-619	Memory panel UI: four-layer tabbed view		planned
task-620	Failure learning: episodic write on tool failure + pre-retry search		planned
task-621	Automated memory consolidation + session threshold trigger		planned
task-622	Meta-insight generation in memory		planned
task-623	MCP Apps: config section + ui:// resource registry		planned
task-624	MCP Apps: bridge pages (docs/bridge/.html)		planned
task-625	MCP Apps: _meta.ui.resourceUri annotations on parity tools		planned
task-626	ai init: API key hard prerequisite + ai auth command		planned
task-627	Agent-native ai init --agentic: bootstrap agent + system prompt		planned
task-628	Init agent: project introspection + validation-first credential handling		planned
task-629	Init agent: memory-assisted defaults + composability test		planned
task-630	UI meta-agent topology design spike		planned
task-631	RL-inspired skill distillation spike (REQ-SPIKE-001)		planned
task-632	TUI: shared backend connection + bubbletea Chat and Trace		planned
task-633	TUI: approval gates + terminal viewport + responsive narrowing		planned

CLI

The ai command and its subcommands — the primary interface for developers running, configuring, and serving agents locally.

ID	Task	Priority	Status
task-012	CLI framework and top-level command structure F01 F03 F12		completed
task-031	`ai init` — onboarding interview agent F02 F04		completed
task-051	`ai serve` and `ai run` — serve and single-run modes F01 F02		completed
task-052	`ai show` — display agent config with syntax highlighting F05 F04		completed
task-054	Curl-based install script and binary distribution F02 F03		completed
task-135	Version in help output and `ai` with no arguments F01 F12		completed
task-136	Global `--quiet / -q` flag for silent mode F01 F12		completed
task-153	`ai show --diagram` — ASCII architecture diagram via D2 F11 F04	low	completed
task-155	`--plain` flag — screen reader and pipeline-safe output F12	low	completed
task-159	`ai init` — progressive elicitation with agent type templates F02 F04	medium	completed
task-165	ai init validation: warn when generated config violates our own principles F04 F10	low	completed
task-175	Shell completion: ai completion bash|zsh|fish — static completions F04 F01	medium	completed
task-176	Shell completion: dynamic completions — config files, skills, task IDs F04 F01	low	completed
task-177	Coding agent UX: --output flag + structured errors + exit codes on all commands F11 F01 F04	high	completed
task-178	Coding agent UX: --non-interactive, stdin piping, AI_ env var overrides F02 F01 F04	high	completed
task-179	Coding agent UX: ai status, --dry-run, --timeout, and completion self-doc F11 F07 F02	medium	completed
task-180	ai init: readline support for cursor/arrow key navigation F11 F01 F02	medium	completed
task-181	ai init: fix model name suggestions and add normalization F04 F01	high	completed
task-182	ai init: investigate and fix crash after model name entry F04	medium	completed
task-183	ai init: fix --agent → --config flag name in 'Next step' output F04	low	completed
task-184	ai init: --credentials flag for Aura/Sandbox credential file loading F01	low	completed
task-185	NDJSON done event: include cost_usd, input_tokens, output_tokens F04 F11	medium	completed
task-186	ai init: demo database catalog — gather and add known demo databases F09 F03	high	completed
task-188	ai serve: auto-detect and start mcp-toolbox sidecar without explicit flag F02 F11	high	completed
task-240	ai init — database type selector menu (replaces URI preset menu) F02 F04 F01	high	completed
task-248	Extend --credentials to parse Supabase, PlanetScale, Neon, MongoDB, Redis URLs F02 F03	medium	completed
task-250	ai show — opinionated default view (identity + truncated prompt + non-defaults) F01 F02	high	completed
task-251	ai show — --full flag (preserve current full TOML dump behavior) F01	high	completed
task-252	ai show — --tools flag (parse toolbox YAML, show sources + tools + env var status) F02 F03	high	completed
task-253	ai show — --section flag (show single named config section) F01	medium	completed
task-254	ai show — --filter flag (interactive fzf / built-in line filter) F01 F08	medium	completed
task-255	ai show — --grep flag (non-interactive regex line filter) F01	low	completed
task-271	ai init — offer generic Cypher fallback tools in onboarding interview F02 F04 F05	high	completed
task-273	ai show --tools — annotate fallback tools with visual separator and warning F11 F01	medium	completed
task-280	ai mcp — MCP stdio server command F02 F01	high	completed
task-282	ai show --mcp-config [target] — print ready-to-paste coding agent config snippets F05 F11	high	completed
task-283	ai init — write MCP config files for Claude Code and Cursor at end of onboarding F02 F01 F11	medium	completed
task-284	ai init — write MCP config files for Claude Code and Cursor at end of onboarding F02 F04	high	completed
task-285	ai show — MCP tool identity preview section F11 F05	low	completed
task-312	CLI: Wire Cloud Run service deployment into ai deploy F02 F04	high	completed
task-331	CLI: ai card command for Agent Card inspection and validation F09 F04	low	completed
task-340	CLI: ai deploy status and ai deploy logs for Cloud Run services F04 F11	medium	completed
task-400	Fix stdout/stderr separation across all CLI commands F08 F02	high	completed
task-401	Explicit ANTHROPIC_API_KEY message when ai init falls back to manual mode F02 F04	high	completed
task-402	Add per-turn token counts to ai run --verbose output F11 F06	high	completed
task-404	Offer local Cypherlite graph as default option in ai init F08 F02	medium	completed
task-409	Unify --plain and --no-color into consistent isPlainMode() check F12 F01	low	completed
task-410	ai init — Cypherlite local-first path (option 5 in URI preset menu) F02 F08 F04	high	completed
task-430	ai graph seed — seed Cypherlite DB from named dataset or .cyp file F08 F02 F01	medium	completed
task-440	ai init — AI-generated domain seed data for Cypherlite F02 F08 F04	medium	completed

Integrations

MCP client and server, mcp-toolbox sidecar, graph backends (Neo4j, Cypherlite), and Python sidecars for GraphRAG, graph construction, memory, and eval.

ID	Task	Priority	Status
task-020	Neo4j connection and schema introspection F08 F09		completed
task-021	Cypherlite local graph backend F08		completed
task-032	mcp-toolbox config generation and sidecar manager (`ai sidecar mcp-toolbox`) F09 F08		completed
task-033	MCP server — agent exposes its tools over MCP F09		completed
task-034	MCP client — agent connects to external MCP servers F09		completed
task-060	Skills system F10 F05		completed
task-070	Python sidecar service manager (Go) F10 F08		completed
task-071	GraphRAG retrieval sidecar (Python, port 8091) F08 F10		completed
task-072	Graph construction sidecar (Python, port 8090) F08 F10		completed
task-073	Agent memory sidecar (Python, port 8092) F07 F08		completed
task-158	Schema-first tool definitions: Go structs as tool contracts F09	low	completed
task-171	MCP client reconnect backoff implementation F03 F11	high	completed
task-174	mcpserver agent_stream: true SSE relay from agent execution F09 F11	medium	completed
task-241	Config generator: PostgreSQL (Supabase, Neon, plain postgres) F02 F03 F08	high	completed
task-242	Config generator: MySQL (PlanetScale, plain mysql) F02 F03	high	completed
task-243	Config generator: SQLite (local embedded relational) F08 F02 F03	medium	completed
task-244	Config generator: DuckDB (local) and MotherDuck (cloud) F08 F02 F03	medium	completed
task-245	Config generator: MongoDB Atlas (document store) F02 F04	medium	planned
task-246	Config generator: Redis / Valkey F03 F02	low	planned
task-247	SQL schema introspection for ai init (postgres, mysql, sqlite) F02 F04 F11	medium	planned
task-249	SQL seed data for DuckDB and SQLite (recommendations, northwind, countries) F08 F02 F03	medium	planned
task-380	Implement CypherliteBackend in internal/graph/ F08 F02 F03	critical	completed
task-390	Remove CGO dep; update go.mod and Makefile for Cypherlite F03 F02	critical	completed
task-406	Default OTel sampling to 100% when exporter is explicitly configured F11	medium	completed
task-420	Embed openCypher seed datasets for Cypherlite (recommendations, northwind, countries) F08 F02 F03	medium	completed
task-450	Wire Cypherlite backend into ai serve as runtime graph query target F08 F02 F09 F11	high	completed

Web UI

Local developer console — chat, trace, cost counter, eval integration, graph explorer. See web.html for design mockups. Depends on the agent execution loop (task-041) and REST API (task-091).

ID	Task	Priority	Status
task-053	`ai web` v1 — chat + ASCII trace + cost counter + eval F11 F04	low	completed
task-090	Web UI shell — routing, theme, agent list, config editor F04 F11	low	completed
task-091	REST management API — `/api/*` endpoints F09 F04		completed
task-141	Full eval tab — run evals, dataset management, diffs F11 F04	low	completed
task-142	Graph explorer tier 1 — schema view (read-only) F11 F08	low	completed
task-143	Graph explorer tier 2 — Cypher playground F11 F08	low	completed
task-144	Graph explorer tier 3 — visual canvas (Neo4j NVL) F11	low	completed
task-145	Web accessibility — WCAG 2.1 AA F12	low	completed
task-146	TUI mode — `ai web --tui` / `ai tui` via bubbletea F01 F12	low	completed
task-147	Session persistence — JSON/YAML in `.ai/` folder F07	low	completed
task-148	Session tab switcher and history browser F07 F04	low	completed
task-149	Team sharing — auth, per-user session isolation F09	low	completed
task-150	System prompt diff on `agent.toml` reload F05	low	completed
task-151	Span replay — re-run any LLM or tool call in isolation F11 F07	low	completed
task-152	Live streaming trace — spans appear in real time F11	low	completed
task-154	D2-powered trace diagram — optional visual mode F11	low	completed
task-260	ai web — graceful port conflict handling + reload signal F02 F04	medium	completed
task-261	ai web — unified Data tab with source switcher and generic schema view F02 F11	high	planned
task-262	ai web — SQL query playground (postgres, mysql, sqlite, duckdb) F11 F04	medium	completed
task-263	ai web — MongoDB document browser F11 F04	low	planned
task-264	ai web — Redis / Valkey key browser F11 F04	low	planned
task-265	ai web — command palette keyboard navigation + fuzzy search F04 F01	medium	completed
task-266	ai web — auto-start ai serve when backend is not running F02 F04	high	completed

Deployment & Production

A2A protocol server, OpenTelemetry instrumentation, evaluation framework, Fly.io and Cloud Run deploy targets, distribution, and integration test suites.

ID	Task	Priority	Status
task-050	A2A protocol server with auth and task ownership F09 F10		completed
task-055	A2A auth middleware and rate limiting F09		completed
task-080	OpenTelemetry instrumentation F11		completed
task-081	Agent execution trace graph writer F11		completed
task-082	Evaluation sidecar (Python, port 8093) F11 F10		completed
task-084	Structural evaluation — no-LLM, CI-runnable F11		completed
task-100	`ai deploy` — Fly.io container deployment F02 F08		completed
task-101	Cloudflare domain management and edge routing F02		completed
task-102	`ai deploy` — Cloud Run for graph construction jobs F08 F10		completed
task-103	Spike — WASM/WASI feasibility (Cloudflare Workers + agent binary) F03 F08		completed
task-110	Distribution — `go install`, brew tap, GitHub releases F02 F03		completed
task-120	Integration tests — end-to-end agent bootstrap F02 F11		completed
task-121	Integration tests — graph construction and GraphRAG retrieval F08 F11		completed
task-122	Acceptance tests — mcp-toolbox + Neo4j article and codelab flows F09 F11		completed
task-123	Integration tests — LLM provider and model router (Anthropic + OpenAI) F10 F11		completed
task-124	Integration tests — Neo4j and Aura database connectivity F08 F11		completed
task-187	CLI: add make install target and ai --version flag F02 F01	medium	completed
task-295	Cloudflare Worker OAuth 2.1 auth proxy with DCR + CIMD F02	medium	completed
task-296	ai deploy --target cloudflare command F02	medium	completed
task-297	ai deploy --target cloudrun + Google Marketplace Procurement F02	low	completed
task-310	Cloud Run service deployment — Dockerfile & manifest generation F02 F09	critical	completed
task-311	Cloud Run service deployment — Go SDK integration F02	critical	completed
task-313	Cloud Run Secret Manager integration F02	medium	completed
task-360	Integration test: end-to-end Cloud Run deploy and A2A call F02 F09	medium	completed
task-361	Documentation: Cloud Run deployment guide and Gemini Enterprise registration F09	low	completed
task-460	Update docs: Cypherlite local graph — cli.html, config.html, architecture.html, principles.html, roadmap.html F08 F02	medium	completed
task-500	Refactor Dockerfile generation into shared internal/deploy/container.go F02	critical	completed
task-501	ai deploy --target docker — portable container build + push to any OCI registry F02	high	completed
task-510	Kubernetes manifest generation: Deployment, Service, ConfigMap, Secret F02	high	completed
task-511	ai deploy --target k8s — local clusters (kind, minikube, k3s) + kubectl apply F02	high	completed
task-512	HorizontalPodAutoscaler manifest generation for k8s deployments	medium	completed
task-513	k8s --provider gke — Workload Identity + GKE-specific annotations	medium	completed
task-514	k8s --provider eks — IRSA + AWS Load Balancer Controller annotations	medium	completed
task-515	k8s --provider aks — Azure Workload Identity + AGIC ingress	low	planned
task-516	Helm chart generation for k8s deployments (--helm flag)	medium	planned
task-517	kubectl apply integration and rollout status for k8s deployments F02	high	completed
task-520	ECS task definition + service JSON generation (internal/deploy/fargate.go)	high	completed
task-521	ECR repository creation + image push via AWS SDK F02	high	completed
task-522	ECS service create/update via AWS SDK + service stabilization wait F02	high	completed
task-523	AWS Secrets Manager integration for Fargate (internal/deploy/aws_secrets.go)	medium	planned
task-524	ai deploy --target fargate — CLI integration, dry-run, interactive confirm F02	high	completed
task-525	ai deploy status + logs for Fargate (ECS + CloudWatch Logs)	medium	completed
task-550	Design + document the async A2A dispatch pattern for serverless functions F02	medium	planned
task-551	GCF Gen2 / Cloud Run Functions dispatch shim for single-turn A2A tasks F02	low	planned
task-552	AWS Lambda dispatch shim for single-turn A2A tasks (SQS → Fargate worker) F02	low	planned

Demo Series

Five end-to-end walkthroughs — movie recommendations, company intelligence, clinical knowledge graph, multi-agent A2A pipeline, and cloud deployment. Includes fixture configs, validation scripts, and an interactive demo page at agent-intelligence.ai/demos.

ID	Task	Priority	Status
task-200	Demos: Write demo-01 movie recommendations script F02 F04	high	completed
task-201	Demos: Write demo-02 company intelligence + GraphRAG script F09 F11	high	completed
task-202	Demos: Write demo-03 clinical knowledge graph script F10 F11	high	completed
task-203	Demos: Write demo-04 multi-agent A2A pipeline script F07 F09	high	completed
task-204	Demos: Write demo-05 deploy + Claude Desktop script F02 F09	high	completed
task-205	Demos: Build docs/demos.html interactive demo selector page F01 F04	high	completed
task-210	Demos: Create fixture agent.toml + toolbox.yaml for each demo F02	high	completed
task-220	Demos: Implement A2A client call from agent (agent → agent delegation) F09 F10	high	completed
task-221	Demos: Add --stream flag to `ai run` for SSE output F01 F11	high	completed
task-222	Demos: Add --endpoint and --token flags to `ai run` F01 F09	high	completed
task-223	Demos: Package and host fixture tarballs for quick-start F02	medium	completed
task-225	Demos: Document Claude Desktop MCP integration in `docs/api.html` F01 F09	medium	completed
task-226	Demos: Document `ai run` --endpoint, --token, --stream flags in CLI reference F01 F04	medium	completed
task-227	Demos: Add multi-sidecar architecture section to `docs/architecture.html` F10 F11	medium	completed
task-228	Demos: Add multi-agent A2A topology section to `docs/api.html` F07 F09	medium	completed
task-230	Demos: Verify all demo Cypher queries against live Neo4j demo databases F11	high	completed
task-231	Demos: Create end-to-end validation script for demo 01 F11	medium	completed
task-232	Demos: Add Demo Series section and link to `docs/roadmap.html` F02	low	completed

Docs & Landing Page

Public site at agent-intelligence.ai — landing page, docs pages, content compliance, analytics, and deployment to Cloudflare Pages.

	Task	Status
	Landing page — above-the-fold layout, hero, feature strip	completed
	Terminal animation engine and screenplay	completed
	CLI reference page — `docs/cli.html`	completed
	Config reference page — `docs/config.html`	completed
	API reference page — `docs/api.html`	completed
	Architecture page with ASCII diagrams (D2 pipeline)	completed
	Web UI design page — `docs/web.html` with TUI mockups	completed
	Roadmap page — `docs/roadmap.html`	completed
	Demos page — `docs/demos.html` interactive demo selector	completed
	`docs/llms.txt` for LLM navigation	completed
	Copy buttons and curl install block on landing page	completed
	GitHub star badge / widget on landing page	planned
	Inline email capture form with backend endpoint	planned
	Cloudflare Web Analytics script tag	planned
	Configure Cloudflare Pages deployment	planned
	Full accessibility pass — WCAG 2.1 AA	planned
	Performance audit and asset weight optimization	planned
	Content compliance — remove internal tech framing from copy	planned
	Cross-browser validation and smoke test	planned

Sandbox & Code Execution

Secure multi-tier code execution: JavaScript via QJS + wazero (tier 1), Python via CPython WASM (tier 2a) or Firecracker (tier 2b), mini-Linux via Firecracker + nsjail fallback (tier 3).

ID	Task	Status
task-130	Design secure code execution sandbox subsystem F10	completed
task-131	JavaScript sandbox — QJS + wazero, tier 1 F10 F08	completed
task-132	Python sandbox — CPython WASM (tier 2a) and Firecracker (tier 2b) F10 F08	completed
task-132b	Python Firecracker sandbox — Tier 2b (follow-on to task-132) F10 F08	completed
task-133	Mini-Linux sandbox — Firecracker + nsjail fallback, tier 3 F10 F08	completed
task-134	Sandbox MCP tools and code-execution skill F09 F10	completed