True Multi‑Agency
Why "true" multi‑agency matters
Most so‑called multi‑agent systems are linear toolchains under a new name: each step hands off to the next, and nothing acts concurrently or negotiates. True multi‑agency requires concurrent actors that negotiate over a shared objective, backed by explicit state, communication protocols, and guardrails.
Core ingredients
- Roles & capabilities: Planners, critics, executors, tools. Each agent exposes a typed capability set, not free‑form prompts.
- Shared state (blackboard/graph): Task graph + facts + artifacts. Updates are atomic and observable.
- Coordination policy: Who can act when, and on what. Turn‑taking, parallelism, and arbitration rules are explicit.
- Environments: Sandboxed I/O for tools, data, and effects; reproducible simulations for learning and testing.
- Evaluation hooks: Trace, critique, and score plans, actions, and outcomes continuously.
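As an illustration of the "typed capability set" idea in the first bullet, here is a minimal sketch in Python. The names `Capability`, `Agent`, `invoke`, and the toy `decompose` handler are assumptions made for this example, not part of any particular framework. The point is only that an agent declares what it can do and with which types, and calls outside that declaration are rejected.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Capability:
    """A named operation with declared input and output types (hypothetical)."""
    name: str
    input_type: type
    output_type: type
    handler: Callable[[Any], Any]


@dataclass
class Agent:
    role: str
    capabilities: Dict[str, Capability]

    def invoke(self, capability: str, payload: Any) -> Any:
        cap = self.capabilities.get(capability)
        if cap is None:
            # The role never declared this capability, so refuse outright.
            raise PermissionError(f"{self.role} has no capability {capability!r}")
        if not isinstance(payload, cap.input_type):
            raise TypeError(f"{capability} expects {cap.input_type.__name__}")
        result = cap.handler(payload)
        if not isinstance(result, cap.output_type):
            raise TypeError(f"{capability} must return {cap.output_type.__name__}")
        return result


def decompose(goal: str) -> list:
    # Toy planner logic for illustration: split a goal on " and ".
    return [part.strip() for part in goal.split(" and ")]


planner = Agent(
    role="planner",
    capabilities={"decompose": Capability("decompose", str, list, decompose)},
)
subtasks = planner.invoke("decompose", "fetch data and summarize it")
```

A real system would likely use richer schemas than bare Python types, but the contract is the same: capability name, input shape, output shape.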
Coordination patterns that work
- Supervisor → Workers: Planner decomposes; workers execute; critic verifies; loop until done.
- Peer consensus (debate → resolution): Multiple planners produce plans; a judge selects/merges.
- Market/auction: Tasks are put out to bid among specialized agents; cost/utility drives assignment.
- Hierarchical control: High‑level goals → subgoals → executable steps with feedback at each level.
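The first pattern above (Supervisor → Workers) can be sketched as a small loop. This is a simplified, single-threaded sketch under stated assumptions: `run_supervisor`, `plan`, `execute`, `verify`, and `max_rounds` are hypothetical names, and the worker is a toy function that deliberately fails once so the retry path is visible.

```python
from typing import Callable, Dict, List, Tuple


def run_supervisor(
    goal: str,
    plan: Callable[[str], List[str]],
    execute: Callable[[str], str],
    verify: Callable[[str, str], bool],
    max_rounds: int = 3,
) -> Tuple[Dict[str, str], List[str]]:
    """Planner decomposes; workers execute; critic verifies; loop until done."""
    results: Dict[str, str] = {}
    pending: List[str] = plan(goal)
    for _ in range(max_rounds):
        if not pending:
            break
        failed: List[str] = []
        for task in pending:
            output = execute(task)
            if verify(task, output):
                results[task] = output
            else:
                failed.append(task)  # retried in the next round
        pending = failed
    # Anything still pending after max_rounds is returned as unresolved.
    return results, pending


attempts: Dict[str, int] = {}


def plan(goal: str) -> List[str]:
    return [step.strip() for step in goal.split(",")]


def execute(task: str) -> str:
    attempts[task] = attempts.get(task, 0) + 1
    # Toy flaky worker: task "b" returns an empty result on its first attempt.
    if task == "b" and attempts[task] == 1:
        return ""
    return task.upper()


def verify(task: str, output: str) -> bool:
    return output == task.upper()


results, unresolved = run_supervisor("a, b", plan, execute, verify)
```

The bound on rounds matters: without it, a task the critic never accepts would loop forever. Returning the unresolved tasks lets a higher level decide whether to replan or escalate.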
Communication & memory
- Messages are structured: intent, inputs, preconditions, effects, confidence.
- Memory is layered: short‑term (episode), long‑term (project), external (vector/graph indices).
- State transitions are auditable: every change is attributed to an agent + rationale.
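To make the message fields and the audit requirement concrete, here is a minimal sketch. The `Message` fields mirror the list above (intent, inputs, preconditions, effects, confidence); the `sender` and `rationale` fields, the `AuditedState` class, and the log record layout are assumptions added for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass(frozen=True)
class Message:
    sender: str
    intent: str
    inputs: Dict[str, Any]
    preconditions: List[str]   # fact keys that must already exist
    effects: Dict[str, Any]    # fact keys this message sets
    confidence: float
    rationale: str


@dataclass
class AuditedState:
    facts: Dict[str, Any] = field(default_factory=dict)
    log: List[Dict[str, Any]] = field(default_factory=list)

    def apply(self, msg: Message) -> bool:
        missing = [p for p in msg.preconditions if p not in self.facts]
        if missing:
            # Rejected messages are logged too, so the trace stays complete.
            self.log.append({
                "agent": msg.sender, "intent": msg.intent, "applied": False,
                "rationale": f"missing preconditions: {missing}",
            })
            return False
        for key, value in msg.effects.items():
            # Every change records who made it, why, and the old/new values.
            self.log.append({
                "agent": msg.sender, "intent": msg.intent, "applied": True,
                "key": key, "old": self.facts.get(key), "new": value,
                "rationale": msg.rationale,
            })
            self.facts[key] = value
        return True


state = AuditedState()
loaded = state.apply(Message(
    sender="executor", intent="load_dataset", inputs={"path": "data.csv"},
    preconditions=[], effects={"dataset_loaded": True},
    confidence=0.9, rationale="summary step needs the dataset",
))
approved = state.apply(Message(
    sender="critic", intent="approve_plan", inputs={},
    preconditions=["plan_reviewed"], effects={"approved": True},
    confidence=0.5, rationale="plan looks complete",
))
```

Here the second message is rejected because `plan_reviewed` was never established, and the rejection itself appears in the log with the critic named as the source.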
Safety & governance
- Policy first: allowlists, redaction, rate limits, authority boundaries per role.
- Counterfactual checks: simulate high‑risk actions in a shadow environment first; require approval before they run for real.
- Human‑in‑the‑loop gates: sensitive scopes require escalation to a human; rollbacks are first‑class.
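A minimal sketch of how per-role allowlists and a human approval gate might combine into one authorization check. `Policy`, `authorize`, the three return strings, and the example roles and actions are all hypothetical; redaction, rate limits, and shadow-environment simulation are left out to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


@dataclass
class Policy:
    allowlist: Dict[str, Set[str]]                   # role -> permitted actions
    sensitive: Set[str] = field(default_factory=set) # actions needing a human


def authorize(
    policy: Policy,
    role: str,
    action: str,
    approver: Optional[Callable[[str, str], bool]] = None,
) -> str:
    """Return 'allowed', 'denied', or 'needs_approval'."""
    # Authority boundary: the role must be allowlisted for this action.
    if action not in policy.allowlist.get(role, set()):
        return "denied"
    # Sensitive actions additionally require an explicit human decision.
    if action in policy.sensitive:
        if approver is None:
            return "needs_approval"
        return "allowed" if approver(role, action) else "denied"
    return "allowed"


policy = Policy(
    allowlist={
        "planner": {"read_file"},
        "executor": {"read_file", "delete_file"},
    },
    sensitive={"delete_file"},
)

outside_role = authorize(policy, "planner", "delete_file")
routine = authorize(policy, "executor", "read_file")
gated = authorize(policy, "executor", "delete_file")
approved = authorize(policy, "executor", "delete_file", approver=lambda r, a: True)
```

Checking the allowlist before the approval gate means a human is never asked to approve something the role could not do anyway.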
Engineering checklist
- Define the agent roles and their typed capabilities (interfaces + schemas).
- Stand up a shared blackboard/graph with optimistic concurrency + versioned snapshots.
- Implement a coordinator that schedules turns, arbitrates conflicts, and enforces policy.
- Add a critic/evaluator with golden tasks and outcome metrics (precision/latency/cost/SLA).
- Build a replayable environment (sim + fixtures) and wire tracing for every message/action.
- Ship canaries; measure deltas; promote policies by evidence, not vibes.
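For the second checklist item, optimistic concurrency with versioned snapshots can be sketched in a few lines. This is an in-memory illustration only: `Blackboard`, `VersionConflict`, and the method names are assumptions, and a production store would persist snapshots rather than keep them all in a dict.

```python
import copy
import threading
from typing import Any, Dict, Tuple


class VersionConflict(Exception):
    """Raised when a writer's view of the blackboard is stale."""


class Blackboard:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._state: Dict[str, Any] = {}
        self._version = 0
        self._snapshots: Dict[int, Dict[str, Any]] = {0: {}}

    def read(self) -> Tuple[int, Dict[str, Any]]:
        with self._lock:
            return self._version, copy.deepcopy(self._state)

    def write(self, expected_version: int, updates: Dict[str, Any]) -> int:
        # Optimistic concurrency: the write succeeds only if nobody else
        # has committed since this writer last read.
        with self._lock:
            if expected_version != self._version:
                raise VersionConflict(
                    f"expected v{expected_version}, current is v{self._version}"
                )
            self._state.update(updates)
            self._version += 1
            self._snapshots[self._version] = copy.deepcopy(self._state)
            return self._version

    def snapshot(self, version: int) -> Dict[str, Any]:
        with self._lock:
            return copy.deepcopy(self._snapshots[version])


board = Blackboard()
seen_version, _ = board.read()
new_version = board.write(seen_version, {"task:1": "claimed by executor"})

try:
    # A second agent that read at the same version now holds a stale view.
    board.write(seen_version, {"task:1": "claimed by critic"})
    conflict = False
except VersionConflict:
    conflict = True
```

On conflict, the losing agent re-reads and decides whether its update still makes sense, which is where the coordinator's arbitration rules come in. Snapshots make rollback and replay a lookup rather than a reconstruction.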
North star: multiple specialized agents cooperating over a shared state to deliver a measurable outcome reliably, safely, and faster than a single generalist.