OSI for AI: Network Engineering Patterns for Multi-Agent Systems
A typical multi-agent AI system looks like this: agent A calls agent B for a task. Agent B sometimes calls agent C. Agent C sometimes calls agent A for context. There is no protocol. There is no routing table. There is no quality-of-service guarantee. When the system breaks, nobody can say why.
This is not a new problem. Network engineers solved it in 1984 by formalizing the OSI model — seven layers of abstraction, each with defined responsibilities, that turned the chaos of computer-to-computer communication into a discipline. The same approach applies to multi-agent AI.
The OSI-to-AI Mapping
Each layer of the OSI model has a direct counterpart in a multi-agent AI system:
- Application layer (L7) → Your top-level user request and the SINC-2 / structured prompt format that encodes it
- Presentation layer (L6) → Format conversion between agents (JSON ↔ markdown ↔ tool-call schemas)
- Session layer (L5) → Conversation state, context window management, multi-turn coherence
- Transport layer (L4) → Reliable delivery between agents — retries, timeouts, error correction
- Network layer (L3) → Routing decisions: which agent gets which request, based on what classifier
- Data link layer (L2) → Connection between specific agent pairs, including authentication and rate-limit enforcement
- Physical layer (L1) → The actual API call to the LLM provider (HTTPS, model selection, token limits)
Most teams ship with everything jammed into the application layer. There is no transport-layer reliability. There is no network-layer routing. The system works in development and falls apart in production.
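As a rough illustration of pulling those concerns out of the application layer, here is a minimal sketch. Every name in it (`call_model`, `route`, `deliver`, `handle_request`, the agent names) is hypothetical, `call_model` is a stub standing in for a real provider call, and a keyword check stands in for a real classifier.

```python
from typing import Callable, Optional

# Physical layer (L1): stand-in for the real provider API call (hypothetical stub).
def call_model(agent: str, prompt: str) -> str:
    return f"[{agent}] handled: {prompt}"

# Network layer (L3): decide which agent receives the request.
# A keyword check stands in for a real classifier.
def route(prompt: str) -> str:
    return "billing_agent" if "invoice" in prompt.lower() else "general_agent"

# Transport layer (L4): reliable delivery with a bounded number of retries.
def deliver(send: Callable[[], str], max_retries: int = 2) -> str:
    last_error: Optional[Exception] = None
    for _ in range(max_retries + 1):
        try:
            return send()
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc
    raise RuntimeError("delivery failed after retries") from last_error

# Application layer (L7): the top-level request only composes the layers below.
def handle_request(prompt: str) -> str:
    agent = route(prompt)
    return deliver(lambda: call_model(agent, prompt))
```

The point of the split is that retry policy and routing policy can each change without touching the prompt that encodes the user request.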
What the Topology Designer Returns
The free Multi-Agent Topology Designer reads your workflow description and returns the complete network spec:
Agents
Each agent gets an ID, a name, a tier assignment (opus / sonnet / haiku / free), and a defined role. Tier assignment matters because the cost ratio between Opus and Haiku is roughly 12:1. Sending every request to Opus is like routing every web request through your most expensive backend regardless of need.
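To make the cost argument concrete, here is a small sketch of an agent record and a relative-spend calculation. Only the 12:1 Opus-to-Haiku ratio comes from the text above; the `sonnet` and `free` weights, the field names, and the call volumes are assumptions chosen for illustration, not the Designer's actual output format.

```python
from dataclasses import dataclass

# Relative cost per call by tier. Only the 12:1 Opus-to-Haiku ratio comes from
# the text; the sonnet and free values are placeholders for illustration.
RELATIVE_COST = {"opus": 12.0, "sonnet": 3.0, "haiku": 1.0, "free": 0.0}

@dataclass(frozen=True)
class Agent:
    id: str
    name: str
    tier: str
    role: str

def relative_spend(agents: list[Agent], calls_per_agent: dict[str, int]) -> float:
    """Total spend in 'Haiku-call units' for a given call volume."""
    return sum(RELATIVE_COST[a.tier] * calls_per_agent.get(a.id, 0) for a in agents)

agents = [
    Agent("a1", "Planner", "opus", "decompose the task"),
    Agent("a2", "Classifier", "haiku", "label incoming requests"),
]
calls = {"a1": 10, "a2": 1000}  # hypothetical daily call volume

tiered = relative_spend(agents, calls)
all_opus = relative_spend([Agent(a.id, a.name, "opus", a.role) for a in agents], calls)
```

With these assumed volumes, the tiered layout comes to 1120 units against 12120 for the all-Opus layout, because the high-volume classifier is the agent that benefits most from the cheaper tier.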
Routes
Each route specifies from-agent, to-agent, trigger condition (on success / on failure / on classification), and retry count. This is your routing table. Without it, agents call each other based on prompt instructions that get reinterpreted on every run.
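A routing table with those four fields can be sketched in a few lines. The field names, trigger strings, and agent names below are hypothetical; the source does not show the Designer's actual schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Route:
    from_agent: str
    to_agent: str
    trigger: str      # "on_success", "on_failure", or "on_classification"
    retries: int = 0

# Hypothetical routing table for a three-stage workflow.
ROUTES = [
    Route("intake", "researcher", "on_classification", retries=1),
    Route("researcher", "writer", "on_success", retries=2),
    Route("researcher", "escalation", "on_failure"),
]

def next_hop(current: str, outcome: str) -> Optional[Route]:
    """Return the first route matching the current agent and outcome, if any."""
    for r in ROUTES:
        if r.from_agent == current and r.trigger == outcome:
            return r
    return None
```

Because the table is data rather than prompt text, the same input always resolves to the same next hop, and a missing route shows up as an explicit `None` instead of an improvised call.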
QoS Budget Classes
Network engineers split traffic into QoS classes — realtime (voice), interactive (web), background (file transfer). Each class gets different priority. The same applies to AI: a customer-facing chat response is realtime; a nightly report-generation task is background. Different budget per call, different timeout, different fallback behavior.
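One way to express those classes is as a small lookup table with a per-call budget, a timeout, and a fallback. The three class names come from the text; the dollar amounts, timeouts, and fallback labels are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QosClass:
    name: str
    max_cost_usd: float   # budget per call (illustrative value)
    timeout_s: float      # how long to wait before giving up
    fallback: str         # what to do when the budget is exceeded

# Hypothetical budgets; tune these to your own traffic and pricing.
QOS = {
    "realtime": QosClass("realtime", 0.05, 5.0, "return_cached_answer"),
    "interactive": QosClass("interactive", 0.20, 30.0, "downgrade_tier"),
    "background": QosClass("background", 1.00, 600.0, "requeue"),
}

def admit(qos_name: str, estimated_cost_usd: float) -> str:
    """Return 'proceed' if the call fits the class budget, else the class fallback."""
    qos = QOS[qos_name]
    return "proceed" if estimated_cost_usd <= qos.max_cost_usd else qos.fallback
```

The same estimated cost can be acceptable for a background job and rejected for a realtime one, which is the separation the QoS analogy is meant to enforce.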
TTL (Time-to-Live) Limits
In IP networking, every packet has a TTL that decrements at each hop. When TTL reaches zero, the packet is dropped. This prevents routing loops from consuming the network. The same primitive applies to agents: max_spawn_depth (how deep the agent tree can go) and max_iterations (how many times a single agent can retry). Without TTL, the A → B → C → A cycle described in the opening paragraph can run until the budget is gone.
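The decrement-and-drop behavior can be sketched as follows. The call graph, the `dispatch` function, and the `TtlExceeded` exception are hypothetical; the `ttl` argument plays the role of max_spawn_depth, and a max_iterations guard would be an analogous counter on retries.

```python
class TtlExceeded(Exception):
    """Raised when a request runs out of hops, like a dropped IP packet."""

# Hypothetical call graph containing the A -> B -> C -> A cycle.
CALLS = {"A": "B", "B": "C", "C": "A"}

def dispatch(agent: str, ttl: int, trace: list[str]) -> None:
    """Visit an agent, then forward to its callee with the TTL decremented."""
    if ttl <= 0:
        raise TtlExceeded(f"dropped at {agent} after {len(trace)} hops")
    trace.append(agent)
    dispatch(CALLS[agent], ttl - 1, trace)

trace: list[str] = []
try:
    dispatch("A", ttl=4, trace=trace)   # ttl plays the role of max_spawn_depth
except TtlExceeded as exc:
    dropped = str(exc)
```

With a TTL of 4, the request visits A, B, C, and A again, and is then dropped before reaching B a second time, so the cycle costs four calls instead of running unbounded.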
Congestion Control
What happens when the system gets overloaded? Networking and distributed-systems engineering offer well-studied techniques: exponential backoff, circuit breakers, load shedding. Most multi-agent systems implement none of these. The Designer recommends specific patterns: a rate limit per agent, a circuit breaker that trips after N consecutive failures, and graceful degradation paths.
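Two of these patterns fit in a short sketch: a breaker that opens after N consecutive failures, and an exponential backoff schedule. This is a deliberately simplified version with hypothetical names; it has no half-open state or reset timer, which a production breaker would normally add.

```python
class CircuitBreaker:
    """Opens after `threshold` consecutive failures; a success resets the count."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self.consecutive_failures = 0

    @property
    def is_open(self) -> bool:
        return self.consecutive_failures >= self.threshold

    def record(self, success: bool) -> None:
        self.consecutive_failures = 0 if success else self.consecutive_failures + 1

def backoff_delays(base_s: float, retries: int, cap_s: float = 30.0) -> list[float]:
    """Exponential backoff schedule: base, 2*base, 4*base, ... capped at cap_s."""
    return [min(cap_s, base_s * 2 ** i) for i in range(retries)]

def call_with_breaker(breaker: CircuitBreaker, call, degraded: str) -> str:
    """Skip the call while the breaker is open and return a degraded answer."""
    if breaker.is_open:
        return degraded
    try:
        result = call()
    except Exception:  # a real system would catch narrower errors
        breaker.record(success=False)
        return degraded
    breaker.record(success=True)
    return result
```

The `degraded` argument is the graceful degradation path: once the breaker is open, the failing agent is no longer called at all, which sheds load instead of piling retries onto a struggling dependency.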
Graphviz DOT Diagram
Output includes a valid Graphviz DOT specification you can paste into GraphvizOnline to render the architecture as an actual diagram. This is the deliverable a senior architect would produce in a design review.
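For a sense of what such a specification looks like, here is a minimal sketch that renders agents and routes as a DOT digraph. The `to_dot` helper and the sample topology are hypothetical and are not the Designer's actual output; only standard DOT syntax (`digraph`, `rankdir`, node and edge `label` attributes) is used.

```python
def to_dot(agents: dict[str, str], routes: list[tuple[str, str, str]]) -> str:
    """Render agents (id -> tier) and routes (src, dst, trigger) as a DOT digraph."""
    lines = ["digraph topology {", "  rankdir=LR;"]
    for agent_id, tier in agents.items():
        # "\\n" emits a literal \n, which DOT treats as a line break in a label.
        lines.append(f'  "{agent_id}" [label="{agent_id}\\n({tier})"];')
    for src, dst, trigger in routes:
        lines.append(f'  "{src}" -> "{dst}" [label="{trigger}"];')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical three-agent topology.
dot = to_dot(
    {"intake": "haiku", "researcher": "sonnet", "writer": "opus"},
    [("intake", "researcher", "on_classification"),
     ("researcher", "writer", "on_success")],
)
```

The resulting string in `dot` can be pasted into any Graphviz renderer; each node shows its tier and each edge shows its trigger condition.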
Why a Diagram Beats a Prose Description
A workflow described in prose can be read many ways. A workflow shown as a directed graph with labeled nodes, labeled edges, and explicit fan-in/fan-out is unambiguous. The diagram is the spec. When you hand it to another engineer, you are not asking them to interpret your intent; you are giving them an architecture they can implement edge by edge.
From a wiki synthesis I built mapping networking concepts to AI: "Your layered architecture (hooks → routing → agents → tools) IS the OSI model. routing_monitor_hook.py = routing table. SINC-2 = application layer protocol. Spawn depth limits = TTL. Budget pools = QoS classes."
Try It on a Workflow You Are About to Build
Describe a multi-agent system you are designing — even informally. The Designer will return a structured topology and an architecture diagram. You will see immediately whether your design has clean QoS separation, defined routing, TTL limits, and congestion control — or whether you were about to ship spaghetti.
For production systems where the topology needs to actually run — with monitoring on every route, observability on every agent, and circuit breakers wired to alerting — see the paid service. The patterns are reusable. The deployment is custom.
Agent Mesh Design — Service #37
Production multi-agent architecture with proper layering, routing tables, QoS classes, TTL limits, congestion control. The architecture you saw in 5 minutes — engineered, deployed, monitored.