# Swarm Architecture
*Bounded Parallel Agent Execution*

Author: William Christopher Anderson · Date: March 2026 · Version: 1.0
## Executive Summary
The Compiled Context Runtime (CCR) solves the problem of agent continuity. Process definitions codify what to do. Compiled context injection provides what to know. Memory chains preserve what was learned. Together, they produce an agent that operates with precision, consistency, and accumulating intelligence.
But the CCR, as described in its foundational whitepaper, operates as a single sequential executor. One agent. One process. One step at a time. This is correct for most work — the majority of knowledge tasks are inherently sequential, and parallelism introduces coordination complexity that rarely justifies the overhead.
Some work, however, is naturally parallel. A code review that spans twelve files can be decomposed into twelve independent analyses. A migration that touches eight databases can execute eight schema changes simultaneously. A research task that consults fourteen sources can dispatch fourteen retrieval operations and merge the results. In these cases, sequential execution is not merely slow — it is structurally wrong. The task’s natural shape is parallel, and forcing it through a sequential pipeline distorts the work.
Swarm Architecture extends the CCR with a model for bounded parallel agent execution. It introduces three constructs:
1. Swarms — bounded groups of agents executing task instances in parallel, governed by a single coordinator and a single correlation identity. A swarm is not a cluster, not a pool, not an unbounded collection of workers. It is a precisely scoped execution boundary: one process step decides to fan out, the swarm executes the parallel work, and the results converge before execution continues.
2. Containment rules — structural constraints that prevent swarm workers from escaping their execution boundary. A swarm worker may spawn sub-agents within its own process boundary, but it may not join another swarm, initiate new swarms, or communicate laterally with sibling workers. These rules are not conventions. They are enforced by the runtime.
3. Convergence protocols — mechanisms for collecting, merging, and validating the results of parallel execution before the parent process continues. Convergence is not implicit. It is a defined step in the process, with explicit merge strategies, conflict resolution rules, and quality gates.
A fourth property emerges from the containment model, one that is not obvious from its safety-oriented design:
4. Location independence — because containment rules prohibit workers from accessing shared state, communicating laterally, or reaching outside their execution boundary, workers have no requirement for co-location. A worker can execute on the coordinator’s machine, on a cloud instance, on an edge device in a factory, or on a partner organization’s infrastructure across the planet. The containment rules designed for safety become the enabling constraints for distribution. The compiled context boundary becomes the security boundary — workers receive only what their task requires, and cannot leak what they were never given.
This property transforms the scope of what the architecture can do. The same swarm model that parallelizes a local code review can distribute a compliance analysis across twelve jurisdictions, a research synthesis across institutions that cannot share raw data, or a global monitoring operation across edge devices on every continent. The mechanism is identical. The topology varies.
## Abstract
The Compiled Context Runtime provides process-driven, context-compiled agent execution with persistent memory. Its sequential execution model is correct for the majority of agent workflows, but structurally inadequate for tasks whose natural decomposition is parallel. Swarm Architecture extends the CCR with bounded parallel execution: swarms of agents that execute independent task instances simultaneously under a single coordinator, governed by containment rules that prevent execution boundary violations, and converged through explicit merge protocols before the parent process continues. This paper describes the architectural model, the containment and coordination mechanisms, the convergence protocols, the relationship to the CCR’s process definition language, the failure and recovery model, the cost implications of parallel versus sequential execution, and the distributed execution model that emerges from the containment architecture — enabling swarms whose workers span local machines, cloud regions, edge devices, and partner infrastructure without sacrificing determinism, auditability, or data security.
## 1. Introduction

### 1.1 The Sequential Assumption
The Compiled Context Runtime, as described in its foundational paper, executes processes as ordered sequences of steps. Step one completes before step two begins. Each step receives compiled context scoped to its requirements. Each step’s output is captured in execution history and available to subsequent steps. The model is simple, predictable, and auditable.
This sequential model is not a limitation — it is a design choice rooted in a structural observation: most knowledge work is inherently sequential. Writing code requires understanding the context before writing. Reviewing a pull request requires reading the changes before forming an opinion. Planning a project requires understanding the dependencies before sequencing the work. Forcing parallelism onto inherently sequential work produces coordination overhead without meaningful speedup.
But not all work is sequential.
### 1.2 The Parallelism That Already Exists
Consider a process that reviews a large pull request. The process definition might specify:
1. Retrieve the PR metadata and changed file list
2. For each changed file, analyze the diff against the relevant architectural standards
3. Synthesize individual file analyses into a coherent review
4. Post the review
Steps 1, 3, and 4 are inherently sequential. Step 2 is inherently parallel — the analysis of billing_engine.py does not depend on the analysis of payment_accessor.py. They share no state. They require no coordination. They can execute simultaneously without affecting each other’s results.
Today, the CCR executes step 2 as a loop: analyze file one, then file two, then file three. Each analysis is independent, but they execute sequentially because the runtime has no mechanism to express or execute parallel work. The result is correct but slow — a twelve-file review takes twelve sequential analysis cycles instead of one parallel cycle.
This is not a theoretical concern. It is a concrete performance penalty applied to every naturally parallel task the agent encounters.
### 1.3 Why Not General Multi-Agent Systems?
The obvious response is: deploy multiple agents. Let them coordinate. Let them discover work, distribute it, and merge results dynamically.
This is the approach taken by most multi-agent frameworks, and it fails for predictable reasons.
Containment failure. When agents can spawn other agents without structural constraints, the system’s execution boundary becomes unbounded. An agent debugging a test failure spawns an agent to read the source code, which spawns an agent to check the git history, which spawns an agent to analyze the CI configuration. Each spawn is locally reasonable. The aggregate is an uncontrolled expansion of execution scope, token consumption, and coordination complexity.
Coordination overhead. General multi-agent coordination requires consensus mechanisms, shared state management, conflict resolution, and deadlock detection. These mechanisms are well-understood in distributed systems, but they introduce complexity that is disproportionate to the problem. The CCR’s value proposition is deterministic, auditable execution. Adding distributed coordination undermines that proposition.
Emergent behavior. When multiple agents operate with overlapping scope and lateral communication, the system’s behavior becomes emergent rather than specified. The process definition says what should happen; the agents decide what actually happens. This is the opposite of the CCR’s design philosophy, where the process definition is the single source of truth for execution.
Swarm Architecture avoids all three failure modes by constraining parallelism to a specific, bounded pattern: fan out, execute independently, converge. No lateral communication. No dynamic scope expansion. No emergent coordination.
### 1.4 Scope of This Paper
This paper describes Swarm Architecture as an extension to the Compiled Context Runtime. It assumes familiarity with the CCR’s process definitions, compiled context injection, memory chains, and execution model. Readers unfamiliar with these concepts should consult the CCR whitepaper before proceeding.
The paper covers the swarm execution model, containment rules, convergence protocols, process definition extensions, failure and recovery, cost analysis, and the relationship to VBD component architecture. It does not cover general-purpose multi-agent orchestration, distributed consensus algorithms, or agent-to-agent communication protocols — these are explicitly out of scope.
## 2. The Swarm Model

### 2.1 Definition
A swarm is a bounded group of agents executing independent task instances in parallel, governed by a single coordinator, identified by a single correlation ID, and converged through an explicit merge step before the parent process continues.
Every swarm has exactly five properties:
1. A parent step — the process step that initiated the fan-out. The parent step is suspended until convergence completes.
2. A task blueprint — the process definition (or task definition) that each worker executes. All workers in a swarm execute the same blueprint against different inputs.
3. An input set — the collection of independent work items to be processed. Each item becomes the input to one worker instance.
4. A convergence strategy — the mechanism for collecting, merging, and validating worker outputs before returning control to the parent process.
5. A correlation ID — a unique identifier that links every worker instance, every log entry, every memory record, and every artifact produced by the swarm back to the parent step that initiated it.
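The five properties above map naturally onto a small data structure. The following is a minimal Python sketch; the class and field names are illustrative, not the CCR's actual API:

```python
from dataclasses import dataclass, field
from typing import Any
import uuid

@dataclass
class Swarm:
    """Bounded parallel execution unit: fan out, execute, converge."""
    parent_step: str             # process step that initiated the fan-out
    task_blueprint: str          # blueprint every worker executes
    input_set: list[Any]         # one work item per worker instance
    convergence_strategy: str    # e.g. "collect", "merge", "vote", "reduce"
    correlation_id: str = field(
        default_factory=lambda: f"swarm-{uuid.uuid4().hex[:8]}"
    )

# Example instance for the pull-request review described later in the paper
swarm = Swarm(
    parent_step="analyze_files",
    task_blueprint="analyze_single_file",
    input_set=["billing_engine.py", "payment_accessor.py"],
    convergence_strategy="collect",
)
```

Every worker, memory record, and artifact produced during execution would carry `swarm.correlation_id`, which is what makes the audit trail in Section 6 possible.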
### 2.2 The Fan-Out / Converge Pattern
Swarm execution follows a single pattern:
```
Parent Process
│
├─ Step N:   Sequential work
│
├─ Step N+1: Fan-out (swarm)
│    ├─ Worker 1: Task(input_1) ──→ result_1
│    ├─ Worker 2: Task(input_2) ──→ result_2
│    ├─ Worker 3: Task(input_3) ──→ result_3
│    └─ Worker K: Task(input_k) ──→ result_k
│
├─ Step N+2: Converge(result_1..k) ──→ merged_result
│
├─ Step N+3: Sequential work (uses merged_result)
│
```
The pattern is deliberately simple. There is no nesting of swarms within swarms. There is no lateral communication between workers. There is no dynamic addition of work items after fan-out begins. The swarm is a structural primitive — a single level of parallelism — not a recursive coordination framework.
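The pattern above can be sketched in a few lines. This illustration uses a bounded thread pool as the worker mechanism, which is only one possible implementation; as Section 2.3 notes, the runtime may equally use processes, API calls, or distributed workers:

```python
from concurrent.futures import ThreadPoolExecutor

def run_swarm(task, input_set, concurrency=8):
    """Fan out: one worker per input item, at most `concurrency` at a time.
    Converge: collect results in input order before the parent continues."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # map() preserves input order, so result_k corresponds to input_k
        results = list(pool.map(task, input_set))
    return results  # handed to the convergence step

# A trivial "task blueprint" applied to independent inputs
merged = run_swarm(lambda item: item.upper(), ["a", "b", "c"])
# merged == ["A", "B", "C"]
```

Note what the sketch deliberately lacks: no worker can see another worker's input or output, and no new work items can be added after `run_swarm` is called. That is the containment model of Section 3 in miniature.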
### 2.3 What a Swarm Is Not
A swarm is not a thread pool. Thread pools are infrastructure-level concurrency mechanisms. Swarms are architecture-level execution patterns. A swarm might be implemented using threads, processes, API calls, or distributed workers — the implementation is invisible to the process definition.
A swarm is not a MapReduce job. MapReduce operates on data partitions with a fixed reduce function. Swarms operate on task instances with configurable convergence strategies. The workers are agents executing process steps, not functions applied to data shards.
A swarm is not an agent swarm in the multi-agent literature. The term “swarm” in multi-agent systems typically implies emergent coordination, stigmergic communication, and self-organizing behavior. None of these properties apply here. A CCR swarm is deterministic, bounded, and fully specified by the process definition. The term is used for its intuitive meaning — a group working in parallel — not for its academic connotations.
## 3. Containment

### 3.1 The Containment Problem
Parallelism without containment is the defining failure mode of multi-agent systems. When an agent can spawn other agents without constraint, three problems emerge:
1. Scope creep — Each spawned agent may itself spawn agents, producing an expanding tree of execution that no single process definition governs.
2. Resource exhaustion — Each agent consumes context window tokens, API calls, and memory. Unbounded spawning produces unbounded cost.
3. Audit failure — When the execution tree is dynamic and unbounded, tracing what happened and why becomes intractable.
Swarm Architecture prevents all three through structural containment rules enforced by the runtime.
### 3.2 The Three Containment Rules
Rule 1: A swarm worker may not join another swarm.
A worker is executing a task instance within a specific swarm boundary. It may not register itself as a worker in a different swarm, even if that swarm is executing the same task blueprint. This prevents cross-swarm contamination and ensures that each swarm’s execution boundary is closed.
Rule 2: A swarm worker may not initiate a new swarm.
If a worker’s task requires further parallelism, it must express that need through its process definition, and the parent process must orchestrate it as a separate swarm step. Workers do not have the authority to create swarms. Only the process coordinator does. This prevents recursive fan-out and bounds the total parallelism to what the process definition explicitly specifies.
Rule 3: A swarm worker may not communicate laterally with sibling workers.
Workers in the same swarm share a correlation ID, but they do not share state, messages, or coordination signals. Worker 3 cannot read Worker 7’s intermediate results. Worker 7 cannot signal Worker 3 to change its approach. The only communication path is vertical: worker to coordinator (via result submission) and coordinator to worker (via task input and compiled context).
### 3.3 What Workers Can Do
The containment rules constrain inter-swarm and inter-worker behavior. Within its own execution boundary, a worker has full CCR capabilities:
- Execute process steps — The worker runs its assigned task blueprint as a normal CCR process.
- Use compiled context — The worker receives context compiled for its specific input, just as any CCR process step would.
- Record memory — The worker writes to memory chains, tagged with the swarm’s correlation ID.
- Spawn sub-agents — The worker may use the CCR’s standard sub-agent mechanism (tool calls, delegate steps) within its own process boundary. These sub-agents are scoped to the worker’s process and do not constitute a new swarm.
- Produce artifacts — The worker generates output artifacts that are collected during convergence.
The distinction is precise: a worker is a full CCR agent within its boundary, but it cannot extend its boundary or interact with agents outside it.
### 3.4 Runtime Enforcement
Containment rules are not guidelines. They are enforced by the runtime through structural checks:
- Swarm registration — When a swarm is created, each worker receives a swarm scope token. API calls that would create or join a swarm are rejected if the calling context already holds a swarm scope token.
- Communication isolation — Workers receive isolated memory chain namespaces. Cross-worker memory queries are structurally impossible because the namespace scoping prevents it.
- Execution boundary tracking — The runtime maintains an execution tree with strict parent-child relationships. Any attempt to create a lateral edge (worker-to-worker) or an upward edge (worker initiating a new swarm) is rejected.
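The swarm scope token check can be sketched as follows. The names and structure here are hypothetical, not the runtime's actual interface; the point is that the check is structural, not advisory:

```python
class SwarmContainmentError(RuntimeError):
    """Raised when a call would violate a containment rule."""

class ExecutionContext:
    """Tracks whether the calling context already runs inside a swarm."""
    def __init__(self, swarm_scope_token=None):
        self.swarm_scope_token = swarm_scope_token

def create_swarm(ctx: ExecutionContext) -> str:
    # Rule 2: a holder of a swarm scope token (i.e. a worker) may not
    # initiate a new swarm. Only a coordinator context may.
    if ctx.swarm_scope_token is not None:
        raise SwarmContainmentError(
            f"context {ctx.swarm_scope_token} may not create a swarm"
        )
    return "swarm-created"

coordinator = ExecutionContext()  # no token: creation is allowed
worker = ExecutionContext(swarm_scope_token="swarm-a1b2c3d4/worker-001")
create_swarm(coordinator)         # succeeds
try:
    create_swarm(worker)          # rejected by the runtime
except SwarmContainmentError:
    pass
```

The same shape of check applies to joining a swarm (Rule 1); lateral communication (Rule 3) is prevented differently, by namespace scoping rather than token checks.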
## 4. Convergence

### 4.1 The Convergence Step
When all workers in a swarm complete (or when a timeout or failure threshold is reached), the swarm enters convergence. Convergence is not implicit — it is a defined step in the parent process with its own context, logic, and quality gates.
The convergence step receives:
- The ordered list of worker results
- The original input set (for correlation)
- Metadata about each worker’s execution (duration, token usage, success/failure status)
- The swarm’s correlation ID (for memory chain queries)
The convergence step produces:
- A merged result that the parent process uses in subsequent steps
- A convergence report (which workers succeeded, which failed, what conflicts were resolved)
- Memory chain entries recording the swarm’s execution for future reference
### 4.2 Convergence Strategies
The process definition specifies which convergence strategy applies. The CCR provides four built-in strategies and supports custom strategies:
Collect — The simplest strategy. Worker results are collected into an ordered list and passed to the next step without transformation. The parent process is responsible for interpretation. Appropriate when results are independent observations that don’t need merging (e.g., file-level code review comments).
Merge — Worker results are combined into a single output using a merge function specified in the process definition. Conflicts are resolved by the merge function. Appropriate when results contribute to a single deliverable (e.g., parallel document sections assembled into a complete document).
Vote — Worker results are treated as votes. The convergence step tallies results and selects the majority or highest-confidence output. Appropriate when multiple workers analyze the same input from different perspectives and the system needs a consensus decision (e.g., parallel classification with confidence scoring).
Reduce — Worker results are processed sequentially through a reduction function, producing a single accumulated result. Appropriate when results need ordered integration (e.g., parallel test results reduced into a pass/fail summary with aggregated metrics).
Custom — The process definition specifies a convergence process (itself a CCR process definition) that receives the worker results and produces the merged output. This allows arbitrary convergence logic, including multi-step convergence with its own compiled context and quality gates.
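The four built-in strategies can be sketched as a single dispatch function. This is a simplified illustration; `merge_fn` and `reduce_fn` stand in for the functions a process definition would reference:

```python
from collections import Counter
from functools import reduce

def converge(results, strategy, merge_fn=None, reduce_fn=None, initial=None):
    """Dispatch on the convergence strategy named in the process definition."""
    if strategy == "collect":
        return list(results)                          # ordered, untransformed
    if strategy == "merge":
        return merge_fn(results)                      # single combined output
    if strategy == "vote":
        return Counter(results).most_common(1)[0][0]  # majority result
    if strategy == "reduce":
        return reduce(reduce_fn, results, initial)    # ordered accumulation
    raise ValueError(f"unknown strategy: {strategy}")

assert converge([1, 2, 3], "collect") == [1, 2, 3]
assert converge(["ok", "ok", "fail"], "vote") == "ok"
assert converge([1, 2, 3], "reduce",
                reduce_fn=lambda acc, r: acc + r, initial=0) == 6
```

The custom strategy does not fit this shape because it is itself a CCR process, with its own compiled context and gates rather than a single function.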
### 4.3 Partial Convergence and Failure Thresholds
Not all workers may succeed. A swarm of twelve workers analyzing twelve files may have one worker fail due to a token limit, a model error, or an input that cannot be processed. The convergence strategy must handle partial results.
The process definition specifies failure behavior through two parameters:
min_success_ratio — The minimum fraction of workers that must succeed for convergence to proceed. Default is 1.0 (all workers must succeed). Setting this to 0.8 means convergence proceeds if at least 80% of workers succeed; failed workers’ inputs are recorded for retry or manual review.

failure_action — What happens when a worker fails: retry (resubmit the failed input to a new worker), skip (proceed without the failed input), or abort (fail the entire swarm and return control to the parent process’s error handler).
These parameters allow the process definition to express the task’s tolerance for partial results without embedding retry logic in every worker.
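The interplay of min_success_ratio and failure_action can be sketched as a simple gate. This is a simplified model; the runtime's actual retry bookkeeping would be richer:

```python
def check_convergence(results, min_success_ratio=1.0, failure_action="abort"):
    """Decide whether convergence may proceed given partial worker failures.
    Returns (proceed, retry_inputs). Result records are illustrative dicts."""
    succeeded = [r for r in results if r["status"] == "success"]
    failed = [r for r in results if r["status"] != "success"]
    if len(succeeded) / len(results) >= min_success_ratio:
        return True, []                               # threshold met
    if failure_action == "retry":
        return False, [r["input"] for r in failed]    # resubmit failed inputs
    if failure_action == "skip":
        return True, []                               # proceed without them
    return False, []                                  # abort: parent error handler

# Nine of ten workers succeed: 90% clears a 0.8 threshold
results = [{"status": "success", "input": i} for i in range(9)]
results.append({"status": "failed", "input": 9})
proceed, _ = check_convergence(results, min_success_ratio=0.8,
                               failure_action="skip")
```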
## 5. Process Definition Extensions

### 5.1 Swarm-Eligible Steps
A process step becomes swarm-eligible through a fan_out declaration in the process definition:
```yaml
process: review_pull_request
version: 1
steps:
  - name: retrieve_pr
    action: "Fetch PR metadata and changed file list"
    output: pr_files

  - name: analyze_files
    action: "Analyze each changed file against architectural standards"
    fan_out:
      over: pr_files
      task: analyze_single_file
      concurrency: 8
      convergence:
        strategy: collect
        min_success_ratio: 0.9
        failure_action: skip
    output: file_analyses

  - name: synthesize_review
    action: "Combine file analyses into a coherent review"
    input: file_analyses
    output: review

  - name: post_review
    action: "Post the review to the pull request"
    input: review
```
The fan_out block specifies:
- over — The collection to iterate. Each element becomes the input to one worker.
- task — The task blueprint each worker executes. This is a reference to a task definition.
- concurrency — The maximum number of workers executing simultaneously. This is a resource constraint, not a parallelism constraint: all items will be processed, but at most concurrency workers run at any time.
- convergence — The convergence strategy and failure parameters.
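Expanding a fan_out declaration into worker task instances is mechanical. The following sketch represents the YAML block as a plain dict; the field names follow the example above, while the instance shape is hypothetical:

```python
def expand_fan_out(fan_out, context):
    """Expand a fan_out declaration into one task instance per input item.
    `context` holds prior step outputs, keyed by output name."""
    items = context[fan_out["over"]]   # e.g. pr_files from retrieve_pr
    return [
        {"task": fan_out["task"], "input": item, "index": i}
        for i, item in enumerate(items)
    ]

fan_out = {"over": "pr_files", "task": "analyze_single_file", "concurrency": 8}
instances = expand_fan_out(fan_out, {"pr_files": ["a.py", "b.py"]})
```

The concurrency value does not affect expansion; it only bounds how many of the resulting instances execute at once.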
### 5.2 Task Blueprints
A task blueprint is a process definition designed to be executed by a swarm worker. It is a standard CCR process with one constraint: it must accept a single input item and produce a single output result.
```yaml
task: analyze_single_file
version: 1
knowledge:
  - architecture/patterns/vbd-component-taxonomy
  - coding/python/style-guide
input:
  file_path: string
  diff_content: string
  pr_context: string
steps:
  - name: analyze
    action: "Analyze the diff against VBD standards and coding conventions"
    output: analysis
  - name: format
    action: "Format the analysis as a structured review comment"
    input: analysis
    output: review_comment
output: review_comment
```
Task blueprints inherit the full CCR capability set: compiled context injection, knowledge references, gates, and memory recording. They are not reduced-capability processes — they are full processes executing within a containment boundary.
### 5.3 Behavior Hints
Process and task definitions may include behavior hints that inform the runtime’s scheduling and resource allocation decisions:
```yaml
behavior:
  swarm_eligible: true
  containment: strict
  event_topics:
    - task.lifecycle
    - artifact.produced
  estimated_duration: short
  model_requirements:
    reasoning_depth: moderate
    code_generation: true
```
Behavior hints are advisory, not prescriptive. The runtime uses them for optimization — routing short tasks to faster models, pre-allocating resources for large fan-outs, selecting appropriate event channels — but the process definition’s semantic meaning does not depend on them.
## 6. Coordination

### 6.1 Correlation IDs
Every entity created during a swarm’s execution — worker instances, memory records, artifacts, log entries, execution records — carries the swarm’s correlation ID. This produces a complete, traceable execution graph:
```
Swarm: swarm-a1b2c3d4
├─ Worker: swarm-a1b2c3d4/worker-001
│    ├─ Memory:   mem-xxx (chain: review, corr: swarm-a1b2c3d4)
│    └─ Artifact: art-xxx (corr: swarm-a1b2c3d4)
├─ Worker: swarm-a1b2c3d4/worker-002
│    ├─ Memory:   mem-yyy (chain: review, corr: swarm-a1b2c3d4)
│    └─ Artifact: art-yyy (corr: swarm-a1b2c3d4)
└─ Convergence: swarm-a1b2c3d4/converge
     └─ Artifact: art-zzz (merged result, corr: swarm-a1b2c3d4)
```
Correlation IDs enable three capabilities:
1. Audit — Given a swarm’s correlation ID, the runtime can reconstruct the complete execution history: which workers ran, what each produced, how results were merged, what the final output was.
2. Cost attribution — Token usage, API calls, and execution time are attributed to the swarm and, through the correlation ID, to the parent process step that initiated it.
3. Memory scoping — Memory chain queries can be scoped to a swarm’s correlation ID, allowing the convergence step to access the collective observations of all workers without pollution from unrelated memory.
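Memory scoping by correlation ID amounts to a filtered query. A trivial sketch with illustrative records:

```python
def scoped_query(records, correlation_id):
    """Memory-chain query scoped to one swarm's correlation ID, so the
    convergence step sees every worker's observations and nothing else."""
    return [r for r in records if r["corr"] == correlation_id]

records = [
    {"id": "mem-xxx", "corr": "swarm-a1b2c3d4", "note": "worker-001 observation"},
    {"id": "mem-yyy", "corr": "swarm-a1b2c3d4", "note": "worker-002 observation"},
    {"id": "mem-zzz", "corr": "swarm-ffff0000", "note": "unrelated swarm"},
]
in_scope = scoped_query(records, "swarm-a1b2c3d4")
# in_scope contains the two worker observations and excludes the unrelated record
```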
### 6.2 Event-Driven Lifecycle
Swarm lifecycle events are published to event topics, allowing the parent process, monitoring systems, and the learning loop to observe swarm execution without polling. Typical events cover swarm creation, worker start, worker completion, worker failure, and convergence completion.
Events are published to the topics declared in the task blueprint’s behavior.event_topics. The parent process may subscribe to these events for progress reporting, but the events are informational — they do not affect execution flow.
## 7. Failure and Recovery

### 7.1 Worker Failure Modes
Swarm workers can fail in three categories:
Transient failures — API rate limits, network timeouts, model overload. These are retryable. The runtime resubmits the failed input to a new worker instance, up to a configurable retry limit.
Input failures — The input item is malformed, too large for the context window, or references content that doesn’t exist. These are not retryable with the same input. The convergence strategy’s failure_action determines the response: skip the item, abort the swarm, or flag for manual review.
Structural failures — The task blueprint itself is flawed: a step references nonexistent knowledge, a gate condition is unsatisfiable, or the output schema doesn’t match the convergence strategy’s expectations. These indicate a process definition error and always abort the swarm. The error is recorded in the execution history for the learning loop to analyze.
### 7.2 Swarm-Level Recovery
When a swarm is aborted, the parent process’s error handler receives:
- The partial results from workers that succeeded
- The error details from workers that failed
- The swarm’s execution metadata (duration, token usage, worker count)
The parent process may retry the entire swarm, proceed with partial results, fall back to sequential execution, or escalate to the user. The decision logic is expressed in the process definition’s error handling steps — not in the swarm infrastructure.
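The parent's post-abort options can be sketched as a policy dispatch. The policy names below are illustrative; as the text says, the actual decision logic lives in the process definition's error handling steps, not the swarm infrastructure:

```python
def handle_swarm_abort(partial_results, errors, policy):
    """Parent-process error handling after an aborted swarm.
    Returns (action, payload) for the next step to act on."""
    if policy == "retry":
        return ("retry", None)                 # re-run the entire swarm
    if policy == "partial" and partial_results:
        return ("proceed", partial_results)    # continue with what succeeded
    if policy == "sequential_fallback":
        return ("sequential", None)            # re-run items one at a time
    return ("escalate", errors)                # surface failures to the user

action, payload = handle_swarm_abort(["r1", "r2"], ["e3"], policy="partial")
# action == "proceed"; payload carries the two successful results
```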
### 7.3 Idempotency Requirement
Task blueprints used in swarms must be idempotent — executing the same input twice must produce the same result without side effects. This is required because the retry mechanism may resubmit inputs, and the system must guarantee that retried work does not corrupt state.
In practice, this means swarm tasks should:
- Read from compiled context and input parameters only
- Write to memory chains (which are append-only and thus naturally idempotent)
- Produce artifacts as output (which are captured by the convergence step, not written to external systems)
- Defer side effects (file writes, API calls, notifications) to the parent process’s post-convergence steps
## 8. Cost Model

### 8.1 Sequential Baseline
For a task with N independent items, sequential execution requires:
- N × (context compilation cost + inference cost + memory recording cost)
- Total wall-clock time: N × average_step_duration
- Total tokens: N × average_tokens_per_step
### 8.2 Swarm Execution
The same task with swarm execution requires:
- N × (context compilation cost + inference cost + memory recording cost) — the total token cost is identical
- 1 × convergence cost — an additional inference call to merge results
- Total wall-clock time: max(worker_durations) + convergence_duration ≈ average_step_duration + convergence_duration
The critical insight: swarms do not reduce token cost. They reduce wall-clock time.
For a twelve-file code review, the token cost is approximately the same whether the files are reviewed sequentially or in parallel. The difference is time: twelve sequential reviews might take six minutes; twelve parallel reviews with convergence might take forty-five seconds.
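The arithmetic above can be made concrete. A sketch of the wall-clock estimate, assuming all workers take roughly the average step duration (so the max over workers approximates the average, as in Section 8.2):

```python
def wall_clock(n_items, step_duration, convergence_duration=0.0, parallel=False):
    """Wall-clock estimate in seconds. Token cost is identical either way;
    only elapsed time differs."""
    if parallel:
        # All workers run simultaneously, then one convergence step
        return step_duration + convergence_duration
    return n_items * step_duration

# Twelve 30-second file reviews, with a 15-second convergence step:
sequential = wall_clock(12, 30.0)                                  # 360.0 s (6 min)
swarm = wall_clock(12, 30.0, convergence_duration=15.0,
                   parallel=True)                                  # 45.0 s
```

The numbers mirror the example in the text: six minutes sequential versus roughly forty-five seconds in parallel, at essentially the same token cost.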
### 8.3 When Swarms Are Worth It
Swarm execution adds overhead: worker lifecycle management, convergence processing, correlation tracking, and the convergence inference call. This overhead is justified when:
1. N is large enough — Below approximately four items, the coordination overhead exceeds the time savings. The exact threshold depends on the task duration and the convergence strategy.
2. Items are truly independent — If worker outputs depend on each other (worker 3 needs worker 1’s result), the task is not suitable for swarm execution. Dependencies require sequential execution or a more complex coordination model that is outside this architecture’s scope.
3. Wall-clock time matters — If the parent process is executing autonomously and the user is not waiting, sequential execution may be acceptable. Swarms are most valuable in interactive workflows where latency directly impacts the user experience.
## 9. Distributed Execution

### 9.1 Containment Enables Distribution
The three containment rules — no joining other swarms, no initiating new swarms, no lateral communication — were introduced in Section 3 as safety constraints. They prevent the unbounded execution expansion that makes multi-agent systems unreliable. But these same constraints produce a second, more consequential property: they make workers location-independent.
Consider what a worker requires to execute:
- A task blueprint (a YAML document)
- A compiled context package (a text payload)
- An input item (a data structure)
Consider what a worker does not require:
- Access to the coordinator’s memory chains
- Knowledge of other workers’ existence
- A shared filesystem, database, or message bus with sibling workers
- Physical proximity to the coordinator or to other workers
The containment rules guarantee that a worker’s execution boundary is closed. It reads its input, executes its task, and produces its output. It does not reach outside its boundary for anything. This means the boundary can be located anywhere — on the same machine as the coordinator, on a server across the network, on a cloud instance across the continent, or on a device on the other side of the planet.
This is not a deployment convenience. It is an architectural property that emerges from the containment model. Distribution is not something added to swarms — it is something the containment rules make structurally possible by eliminating every requirement for co-location.
### 9.2 Execution Topologies
A swarm’s workers can be distributed across any combination of execution environments. The coordinator dispatches task instances to workers based on a placement strategy; the workers execute and return results; the convergence step collects results regardless of origin. The coordinator does not care where a worker runs. It cares that the worker started, that the worker finished, and what the worker produced.
This produces several natural topologies:
Local swarm. All workers execute on the same machine as the coordinator. This is the simplest topology — workers are threads or processes on the local runtime. Appropriate for development, testing, and workloads where the machine has sufficient resources.
Cloud-burst swarm. The coordinator runs locally; workers execute on cloud instances. When a fan-out is large — fifty files to review, a hundred records to process — the local machine may not have the compute, memory, or API rate limits to run fifty workers simultaneously. Cloud-burst swarms dispatch workers to cloud instances that spin up for the duration of the swarm and shut down after convergence. The coordinator manages the lifecycle; the workers are ephemeral.
Edge swarm. Workers execute on edge devices or remote machines. A swarm analyzing sensor data from twelve factory floors dispatches one worker per floor, executing on local infrastructure close to the data. The compiled context package travels to the edge; the result travels back. The raw data never leaves the floor.
Federated swarm. Workers execute on machines owned by different participants. A research swarm analyzing datasets held by different institutions dispatches workers to each institution’s infrastructure. Each worker sees only its local dataset through the compiled context scoping. No institution’s data leaves its network. The convergence step operates on results — summaries, classifications, extracted features — not on raw data.
Hybrid swarm. Workers execute across a mixture of local, cloud, and edge environments based on input characteristics. A worker processing a small text file runs locally. A worker processing a large image dataset routes to a cloud GPU instance. A worker processing sensitive financial data routes to an on-premises secure enclave. The placement strategy makes the routing decision; the worker executes identically regardless of location.
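A placement strategy reduces to a routing function over input characteristics. The predicates and environment names below are hypothetical; a real strategy would consult resource availability and policy data:

```python
def place_worker(item):
    """Route a work item to an execution environment (hybrid-swarm sketch)."""
    if item.get("sensitive"):
        return "on_prem_enclave"   # e.g. regulated financial data stays on-prem
    if item.get("size_mb", 0) > 100:
        return "cloud_gpu"         # large inputs burst to cloud capacity
    return "local"                 # small work stays with the coordinator

jobs = [
    {"name": "notes.txt", "size_mb": 1},
    {"name": "images.tar", "size_mb": 4096},
    {"name": "ledger.csv", "size_mb": 2, "sensitive": True},
]
placements = [place_worker(j) for j in jobs]
# placements == ["local", "cloud_gpu", "on_prem_enclave"]
```

Whatever the routing decision, the worker executes identically: it receives a compiled context package and an input item, and returns a result.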
### 9.3 The Compiled Context Boundary as Security Boundary
In a distributed swarm, the compiled context package is the only information that crosses a network boundary on the way in. The worker’s result is the only information that crosses on the way out. This is not a coincidence — it is a direct consequence of the CCR’s compilation model.
The compiled context package is precision-scoped to the current task step. It does not contain the coordinator’s full memory. It does not contain other workers’ inputs. It does not contain the process definition’s internal metadata. It contains exactly what the worker needs to execute its task — nothing more.
This scoping produces a security property that is absent from most distributed agent systems: the worker cannot leak what it was never given. A worker dispatched to a remote environment to analyze a single file receives the compiled context for that file. It does not receive the contents of other files, the PR’s broader context, or the organizational knowledge that informed the process definition. If the remote environment is compromised, the exposure is limited to one compiled context package and one input item.
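The scoping rule above can be sketched directly. This is a toy illustration, not the CCR's actual compilation pipeline; the dictionary shape and the `steps` tag are assumptions made for the example.

```python
def compile_for_worker(item, knowledge_base, step):
    """Build the minimal payload that crosses the network boundary.

    Only knowledge entries explicitly scoped to this step are included.
    Everything omitted here never reaches the worker, so a compromised
    worker cannot leak it.
    """
    scoped = [entry["fact"] for entry in knowledge_base
              if step in entry["steps"]]
    return {"input_item": item, "context": scoped}
```

The security argument is visible in the code: the worker's exposure is bounded by what `compile_for_worker` chooses to include, not by what the worker promises not to read.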
In the federated topology, this property becomes essential. When workers execute on infrastructure controlled by different parties, each party must trust that the dispatched work does not carry unauthorized information. The compiled context boundary provides that guarantee structurally — not through access control lists, not through encryption alone, but through the architecture’s fundamental design: the worker receives a minimal, scoped payload because the CCR’s compilation pipeline produces minimal, scoped payloads. The security property is not bolted on. It is intrinsic.
9.4 Model Selection Across Geographies
The CCR’s dynamic model selection, described in the CCR whitepaper, takes on new dimensions in distributed swarms. When workers can execute anywhere, the model selection decision becomes a joint optimization across three variables:
Capability. The task requires a specific level of reasoning depth, code generation ability, or domain knowledge. Not all models satisfy the requirement.
Locality. The input data may have residency requirements. Financial data must be processed in-jurisdiction. Healthcare data must remain within HIPAA-compliant infrastructure. A model running in the right geography may be preferable to a more capable model running in the wrong one.
Cost. Cloud GPU instances in different regions have different pricing. Local models have zero marginal inference cost but limited capability. The optimal routing minimizes total cost while satisfying capability and locality constraints.
In a distributed swarm, these three variables are evaluated per worker, not per swarm. Worker 1, processing a small text file, routes to a local model at zero marginal cost. Worker 2, processing a complex architectural analysis, routes to a cloud-hosted reasoning model. Worker 3, processing data subject to EU data residency rules, routes to a model hosted in the EU region. All three participate in the same swarm. All three produce results that converge through the same merge step. The convergence step does not know or care which model each worker used — it operates on results, not on execution metadata.
9.5 Latency and the Geography of Work
Sequential execution has a fixed latency profile: total time equals the sum of individual step durations. Distributed swarm execution introduces a different profile: total time equals the maximum worker duration plus network round-trip time plus convergence duration.
For local swarms, network time is negligible. For cloud-burst swarms, network time is measurable but small relative to inference time — a compiled context package is kilobytes, not gigabytes. For edge and federated swarms, network time can be significant, particularly when workers are geographically distant.
The architecture handles this through deadline-aware scheduling. The process definition’s fan_out block may specify a deadline:
```yaml
fan_out:
  over: input_items
  task: analyze_item
  concurrency: 20
  deadline: 30s
  convergence:
    strategy: collect
    min_success_ratio: 0.8
    failure_action: skip
```
The runtime uses the deadline to make placement decisions. If a worker dispatched to a remote location is unlikely to complete within the deadline (based on historical latency data), the runtime places it closer — on a cloud instance in a nearer region, or on the local machine — even if that placement is suboptimal on other dimensions. The deadline constrains the placement strategy, ensuring that distribution does not sacrifice responsiveness beyond the process definition’s tolerance.
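A minimal sketch of that placement decision, assuming a table of historical dispatch latencies per location (the location names, figures, and fallback order here are invented for illustration):

```python
# Hypothetical historical network + dispatch latency per location, in seconds.
HISTORICAL_LATENCY = {"edge-remote": 25.0, "cloud-near": 4.0, "local": 0.1}

def place_worker(preferred, est_work_seconds, deadline_seconds):
    """Deadline-aware placement: try the preferred location, then fall
    back to progressively closer ones when historical latency predicts
    a deadline miss."""
    for location in [preferred, "cloud-near", "local"]:
        predicted = HISTORICAL_LATENCY[location] + est_work_seconds
        if predicted <= deadline_seconds:
            return location
    # No placement meets the deadline; minimize network time and let the
    # convergence strategy's failure handling deal with the overrun.
    return "local"
```

With a 30 s deadline, a 10 s task preferred at a distant edge location gets pulled to a nearer cloud region, while a 2 s task stays at the edge: the deadline constrains the strategy without overriding it when it is satisfiable.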
9.6 The Implications of Location-Independent Execution
The architectural consequence of location-independent workers extends beyond performance optimization. It changes what swarms can be used for.
Global-scale analysis. A swarm can dispatch workers to every continent simultaneously. A compliance review that must evaluate operations under twelve different regulatory frameworks dispatches twelve workers, each executing in the relevant jurisdiction, each using models trained on or fine-tuned for local regulatory language. The convergence step produces a unified compliance report from twelve jurisdiction-specific analyses, none of which required data to cross jurisdictional boundaries.
Collaborative execution without shared infrastructure. Two organizations working on a joint project can participate in the same swarm without sharing infrastructure, credentials, or raw data. Organization A runs workers on its infrastructure; Organization B runs workers on its infrastructure. The coordinator (running on either side, or on neutral infrastructure) dispatches inputs and collects results. The containment rules guarantee that neither organization’s workers access the other’s data or systems.
Hardware-aware routing. Some tasks benefit from specific hardware. A worker analyzing a large codebase benefits from fast local storage. A worker generating images benefits from GPU acceleration. A worker performing symbolic reasoning benefits from high-memory CPU instances. The placement strategy routes workers to hardware that matches their task profile, turning a homogeneous swarm (same task blueprint) into a hardware-heterogeneous execution with performance characteristics optimized per worker.
Resilience through geographic distribution. A swarm distributed across three cloud regions survives the failure of any single region. When workers are location-independent and the task is idempotent, the retry mechanism can resubmit failed inputs to workers in surviving regions. The swarm completes — slower, perhaps, but completely — even under partial infrastructure failure.
Progressive capability deployment. When a new model is deployed in one region but not yet available globally, distributed swarms can route specific workers to the new model while others continue using the existing model. The convergence step does not distinguish between results from different models. This enables gradual rollout of model upgrades without requiring global synchronization.
9.7 Trust and Verification in Distributed Swarms
When workers execute on infrastructure you do not control, the question of trust becomes concrete. Can you trust a worker’s result? Can you verify that the worker executed the task faithfully?
Swarm Architecture addresses this through three mechanisms:
Result validation. The convergence strategy can include validation logic that checks worker results against expected schemas, value ranges, or consistency conditions. A worker that returns a result outside expected bounds is flagged — its result can be excluded from the merge, retried on trusted infrastructure, or escalated for review.
Redundant execution. For high-stakes tasks, the same input can be dispatched to multiple workers on different infrastructure. If two workers produce consistent results, confidence is high. If they diverge, the convergence step can apply a tiebreaker (third worker, human review, or conservative default). This is the same principle as consensus in distributed systems, applied at the task level rather than the protocol level.
Execution attestation. Workers can produce signed execution records — cryptographic attestations of what input they received, what model they used, what output they produced, and when they completed. These attestations are collected during convergence and stored with the swarm's execution history. They do not prevent a compromised worker from producing a false result, but they provide an audit trail that makes falsification detectable after the fact.
These mechanisms are not required for all swarms. A local swarm on trusted infrastructure needs none of them. A federated swarm across organizational boundaries may require all three. The process definition specifies the appropriate level of verification for each swarm, matching the trust model to the deployment topology.
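The first two mechanisms can be sketched in a few lines. The schema shape and tiebreak signature below are assumptions for the example, not an interface the architecture prescribes.

```python
def validate_result(result, schema):
    """Result validation: check a worker's output against an expected
    schema (field name -> expected type) before it enters the merge."""
    return all(key in result and isinstance(result[key], expected)
               for key, expected in schema.items())

def converge_redundant(result_a, result_b, tiebreak):
    """Redundant execution: the same input ran on two workers on
    different infrastructure. Accept on agreement; otherwise defer to
    a tiebreaker (third worker, human review, or conservative default)."""
    if result_a == result_b:
        return result_a
    return tiebreak(result_a, result_b)
```

A federated swarm might validate every result and redundantly execute only the highest-stakes inputs, since redundancy doubles the cost of each item it covers.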
10. Limitations and Future Work
10.1 Deliberate Constraints
Swarm Architecture deliberately excludes several capabilities that might seem natural extensions:
No nested swarms. A swarm worker cannot initiate a sub-swarm. If a task requires nested parallelism, the process definition must express it as sequential swarm steps in the parent process. This constraint preserves containment and bounds the total parallelism to what is explicitly specified.
No inter-worker communication. Workers cannot share intermediate results, coordinate strategies, or negotiate resource allocation. If a task requires coordination between parallel workers, it is not suitable for swarm execution — it requires a different architectural pattern.
No dynamic work distribution. The input set is fixed at fan-out time. Workers cannot discover additional work items during execution. If the total work is not known at fan-out time, the process must use a different pattern (e.g., a loop with dynamic termination conditions).
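These constraints are enforced by the runtime, not left to convention. One way to picture that enforcement is a worker handle whose forbidden operations fail loudly; this is a hypothetical sketch of the idea, not the runtime's actual API.

```python
class ContainmentError(Exception):
    """Raised when a worker attempts an operation outside its boundary."""

class SwarmWorker:
    """Hypothetical runtime handle handed to a swarm worker."""

    def spawn_sub_agent(self, blueprint):
        # Allowed: sub-agents stay within the worker's own process boundary.
        return f"sub-agent running {blueprint}"

    def fan_out(self, *_args, **_kwargs):
        # Forbidden: no nested swarms, so total parallelism stays bounded
        # by what the process definition explicitly specifies.
        raise ContainmentError("swarm workers cannot initiate swarms")

    def send_to_sibling(self, *_args, **_kwargs):
        # Forbidden: no lateral communication between parallel workers.
        raise ContainmentError("no inter-worker communication")
```

Because the forbidden operations are structurally absent rather than merely discouraged, a worker cannot drift into them under a misbehaving model.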
10.2 Areas for Future Investigation
Hierarchical swarms. Some tasks have natural two-level parallelism: fan out across files, and within each file fan out across functions. The current architecture handles this through sequential swarm steps, but a hierarchical model could express it more naturally while maintaining containment guarantees.
Adaptive concurrency. The current model uses a fixed concurrency parameter. An adaptive model could monitor worker performance and adjust concurrency dynamically — scaling up when workers complete quickly, scaling down when API rate limits are hit.
Cross-swarm learning. Currently, each swarm’s execution is independent. A learning mechanism that analyzes patterns across swarms — which tasks benefit from parallelism, what concurrency levels produce the best cost/latency trade-offs, which convergence strategies produce the highest-quality results — could inform future process definitions.
Heterogeneous workers. The current model requires all workers to execute the same task blueprint. A heterogeneous model could assign different blueprints to different workers based on input characteristics, enabling specialization within a swarm.
11. Conclusion
Swarm Architecture extends the Compiled Context Runtime with a model for bounded parallel agent execution. It solves one problem precisely: naturally parallel work should execute in parallel. It does not attempt to solve general multi-agent coordination, emergent agent collaboration, or distributed consensus.
The architecture’s value lies in its constraints as much as its capabilities. Containment rules prevent the unbounded execution expansion that plagues multi-agent systems. Convergence protocols make parallel result merging explicit and auditable. Correlation IDs preserve the traceability that makes the CCR trustworthy. And — perhaps most significantly — the same containment rules that make swarms safe also make them distributable. A worker that cannot reach outside its boundary can execute anywhere without risk.
This is the paper’s central architectural insight. Constraints designed for safety produce a property — location independence — that transforms the scope of what agent systems can do. A local developer parallelizing a code review and a multinational organization distributing compliance analysis across twelve jurisdictions use the same architecture, the same containment model, the same convergence protocols. The difference is topology, not mechanism.
The compiled context boundary reinforces this at the security layer. Workers receive precisely what they need and nothing more — not because of access control lists or network segmentation, but because the CCR’s compilation pipeline produces minimal, scoped payloads by construction. Security is not a feature added to distribution. It is a property inherited from the context model.
One process definition. One execution model. Workers that can run anywhere — on your laptop, in your cloud, on your partner’s infrastructure, on a device across the planet — because the architecture guarantees they need nothing from each other and can leak nothing they were never given.
References
1. Anderson, W.C. (2026). Compiled Context Runtime: Process-Driven Agent Execution with Unbounded Local Memory. Version 1.0.
2. Anderson, W.C. (2026). Volatility-Based Decomposition in Software Architecture: A Practitioner-Oriented Articulation. Version 1.0.
3. Anderson, W.C. (2026). Harmonic Design: A Unified Software Engineering Framework. Version 1.0.