Behavior-Driven Testing

Boundary-Driven Testing

Testing difficulty is architectural evidence. When a component cannot be exercised in isolation, the problem is not the tests — it is the structure.


The Core Insight

The test spiral — unit, integration, end-to-end, system, user acceptance — is not a testing methodology. It is an architectural map. Each ring corresponds to a level of architectural scope, and that scope is determined entirely by where boundaries have been placed.

Get the boundaries right and the spiral populates itself: each tier has clear targets, predictable scope, and low maintenance overhead. Get them wrong and the spiral collapses: unit tests become integration tests in disguise, end-to-end (E2E) tests become the only reliable safety net, and the entire suite grows more expensive while providing less confidence.

Testing Is Not Separate from Design

The conventional framing of testing as a discipline separate from design produces a particular kind of pain. Teams adopt frameworks, mandate coverage minimums, and write guidelines. The tests improve. The pain persists. A change to a business rule breaks seventeen tests, most of which are not about business rules.

The reframe: a component designed around a coherent responsibility, with explicit inputs, explicit outputs, and dependencies passed rather than acquired, is inherently testable. No additional effort is required. The same structural choices that allow the component to change without cascading effects allow it to be tested without elaborate setup. The inverse is equally true: a component that cannot be unit-tested without mocking half the system is not badly tested — it is badly structured.
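To make "dependencies passed rather than acquired" concrete, here is a minimal Python sketch. The names `QuoteCalculator`, `RateSource`, `Quote`, and `FixedRates` are hypothetical and chosen only for illustration; the point is that a component with explicit inputs, explicit outputs, and an injected dependency can be tested with a plain stand-in object and no elaborate setup.

```python
from dataclasses import dataclass
from typing import Protocol


class RateSource(Protocol):
    """Hypothetical dependency: anything that can supply a tax rate."""

    def rate_for(self, region: str) -> float: ...


@dataclass(frozen=True)
class Quote:
    """Explicit output of the component."""

    net: float
    tax: float
    total: float


class QuoteCalculator:
    """One coherent responsibility: turn a net amount into a taxed quote."""

    def __init__(self, rates: RateSource) -> None:
        # The dependency is passed in, not constructed or looked up here.
        self._rates = rates

    def quote(self, net: float, region: str) -> Quote:
        tax = round(net * self._rates.rate_for(region), 2)
        return Quote(net=net, tax=tax, total=round(net + tax, 2))


class FixedRates:
    """Test stand-in: a plain object that satisfies RateSource."""

    def __init__(self, rate: float) -> None:
        self._rate = rate

    def rate_for(self, region: str) -> float:
        return self._rate


def test_quote_applies_rate() -> None:
    calc = QuoteCalculator(FixedRates(0.25))
    assert calc.quote(100.0, "anywhere") == Quote(net=100.0, tax=25.0, total=125.0)
```

If `QuoteCalculator` instead built its own rate lookup internally, the same test would need to intercept that construction, which is the kind of setup cost the paragraph above attributes to structure rather than to testing.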

The Spiral Is a Structural Map

The spiral describes a progression from narrowest to broadest scope. The common reading is proportional: many unit tests, fewer integration tests, and fewer still E2E tests. This is a useful heuristic, but the more important principle is scope. Where boundaries are placed determines what constitutes a “unit,” what constitutes an “integration,” and what constitutes a “journey.”

  • Unit — one component, all dependencies replaced. Fast, numerous, fine-grained.
  • Integration — component collaboration across one seam, dependencies mocked at the outer boundary. Verifies orchestration and contracts.
  • End-to-End — a complete user flow, full stack. Verifies the system behaves correctly from the outside.
  • System — the integrated system under realistic conditions: load, failure injection, configuration variation.
  • User Acceptance — real users or proxies confirm that what was built matches what was intended.

In a system with no meaningful component boundaries, unit scope and system scope are the same thing. The spiral collapses into E2E by default, because E2E is the only level at which anything coherent can be exercised.

Test Profiles by Role

Each component role in VBD and EBD has a characteristic test profile — not assigned arbitrarily, but derived from the role’s structural position, responsibilities, and communication rules.

Tier              | VBD / EBD Role          | Primary Test Level      | What to Mock
Execution         | Engine / Flow           | Unit                    | Resource Accessor below
External Boundary | Accessor / API Accessor | Unit (translation only) | Data source driver
Cross-cutting     | Utility / Interaction   | Unit                    | Nothing (pure) or external sink
Orchestration     | Manager / Experience    | Integration             | Engines mocked, Accessors mocked

Engines: The Unit Test Core

Engines are the most logic-dense tier and the natural home of the unit test suite. An Engine encapsulates business rules: given inputs, apply policy, produce a result. Its communication constraints — no peer Engine calls, no direct infrastructure access — ensure there is nothing else to mock beyond the Accessor. Mock the Accessor, supply controlled inputs, assert on the output. The test scope is exactly the Engine and nothing more.
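As a rough illustration of that profile, the sketch below unit-tests a hypothetical `DiscountEngine` using Python's `unittest.mock`. The Engine, its discount policy (5% per loyalty year, capped at 20%), and the Accessor method `get_loyalty_years` are all assumptions made for the example, not part of any real system.

```python
from unittest.mock import Mock


class DiscountEngine:
    """Hypothetical Engine: applies a loyalty discount policy."""

    def __init__(self, customer_accessor) -> None:
        # The Accessor is the Engine's only dependency, and it is injected.
        self._customers = customer_accessor

    def discounted_price(self, customer_id: str, price: float) -> float:
        years = self._customers.get_loyalty_years(customer_id)
        # Assumed policy: 5% off per loyalty year, capped at 20%.
        rate = min(0.05 * years, 0.20)
        return round(price * (1 - rate), 2)


def test_discount_is_capped() -> None:
    accessor = Mock()
    accessor.get_loyalty_years.return_value = 10  # controlled input

    engine = DiscountEngine(accessor)

    assert engine.discounted_price("c-1", 200.0) == 160.0
    accessor.get_loyalty_years.assert_called_once_with("c-1")


def test_new_customer_pays_full_price() -> None:
    accessor = Mock()
    accessor.get_loyalty_years.return_value = 0

    assert DiscountEngine(accessor).discounted_price("c-2", 80.0) == 80.0
```

Each test contains exactly one mock, the Accessor, which matches the single mock point described above.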

Resource Accessors: Thin Boundary, Minimal Unit Surface

An Accessor’s job is translation: convert a domain request into an external call, convert the response back. Whether the external system is reachable or correctly provisioned is an infrastructure concern belonging to system testing. The Accessor’s correctness is about the translation. The infrastructure’s correctness is about the infrastructure.
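A translation-only test might look like the following sketch. The `CustomerAccessor`, the `Customer` type, the SQL text, and the driver method `fetch_one(sql, params)` are hypothetical; a real driver will have its own interface. The test checks only that the request is translated outward and the response translated back, with no database involved.

```python
from dataclasses import dataclass
from typing import Optional
from unittest.mock import Mock


@dataclass(frozen=True)
class Customer:
    customer_id: str
    loyalty_years: int


class CustomerAccessor:
    """Hypothetical Accessor: translates domain requests into driver calls."""

    def __init__(self, driver) -> None:
        # Assumed driver interface: fetch_one(sql, params) -> dict or None.
        self._driver = driver

    def get_customer(self, customer_id: str) -> Optional[Customer]:
        row = self._driver.fetch_one(
            "SELECT id, loyalty_years FROM customers WHERE id = ?",
            (customer_id,),
        )
        if row is None:
            return None
        # Translate the raw row back into a domain object.
        return Customer(customer_id=row["id"], loyalty_years=int(row["loyalty_years"]))


def test_translates_request_and_response() -> None:
    driver = Mock()
    driver.fetch_one.return_value = {"id": "c-1", "loyalty_years": "3"}

    result = CustomerAccessor(driver).get_customer("c-1")

    assert result == Customer("c-1", 3)
    _sql, params = driver.fetch_one.call_args.args
    assert params == ("c-1",)


def test_missing_row_becomes_none() -> None:
    driver = Mock()
    driver.fetch_one.return_value = None

    assert CustomerAccessor(driver).get_customer("unknown") is None
```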

Interactions and Utilities: Narrow and Fast

Interactions are atomic: render in a harness, simulate the input event, assert what was emitted. No mocks needed — props and callbacks are the entire interface. Utilities are simpler still: inputs in, outputs out, no side effects. The only exception is a Utility wrapping an external sink, where the sink gets mocked.
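The sketch below illustrates both profiles in Python. `slugify` stands in for a pure Utility, and `QuantityStepper` is a deliberately simplified stand-in for an Interaction: a real Interaction would be rendered in a UI test harness, whereas here a plain class with a value and a callback plays that part. Both names are hypothetical.

```python
from typing import Callable, List


def slugify(title: str) -> str:
    """Hypothetical Utility: pure function, no side effects."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return "-".join(cleaned.split())


class QuantityStepper:
    """Hypothetical Interaction: its value and callback are the whole interface."""

    def __init__(self, value: int, on_change: Callable[[int], None]) -> None:
        self.value = value
        self._on_change = on_change

    def click_increment(self) -> None:
        # Simulated input event: update local state and emit the new value.
        self.value += 1
        self._on_change(self.value)


def test_slugify_is_pure() -> None:
    assert slugify("Boundary-Driven Testing!") == "boundary-driven-testing"


def test_stepper_emits_new_value() -> None:
    emitted: List[int] = []
    stepper = QuantityStepper(value=2, on_change=emitted.append)

    stepper.click_increment()

    # No mocks: a list's append method serves as the callback.
    assert emitted == [3]
```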

The Three Integration Seams

Integration tests verify the seams between roles — not individual components. Everything is still mocked at the external boundary. Real external systems do not enter until E2E. The distinction from unit tests is scope, not realism.

  1. Manager to Engine — Does the Manager invoke the Engine with correct inputs? Does it handle every response state — success, domain failure, unexpected error — and route accordingly? The Engine is mocked.
  2. Engine to Resource Accessor — Does the Engine correctly use the Accessor’s contract? Does it handle all return states? The Accessor is mocked. No database is involved.
  3. Manager to Resource Accessor — For reads that inform orchestration decisions or state persistence the Manager owns. The Accessor is mocked.

In EBD, the equivalent seam is Experience to Flow: does the Experience pass correct shared state, handle Flow completion and skip signals, and advance the journey correctly?
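The first seam, Manager to Engine, can be sketched as follows. The `CheckoutManager`, the `DomainFailure` exception, the `Outcome` type, and the method names `discounted_price` and `save_order` are assumptions for illustration. The tests cover the three response states named above (success, domain failure, unexpected error) with the Engine and the Accessor both mocked.

```python
from dataclasses import dataclass
from unittest.mock import Mock


class DomainFailure(Exception):
    """Assumed signal for an expected business-rule rejection."""


@dataclass(frozen=True)
class Outcome:
    status: str
    detail: str = ""


class CheckoutManager:
    """Hypothetical Manager: orchestrates, holds no business rules itself."""

    def __init__(self, pricing_engine, order_accessor) -> None:
        self._pricing = pricing_engine
        self._orders = order_accessor

    def checkout(self, customer_id: str, price: float) -> Outcome:
        try:
            total = self._pricing.discounted_price(customer_id, price)
        except DomainFailure as exc:
            return Outcome("rejected", str(exc))
        except Exception:
            return Outcome("error", "pricing unavailable")
        self._orders.save_order(customer_id, total)
        return Outcome("completed")


def make_manager():
    engine, accessor = Mock(), Mock()
    return CheckoutManager(engine, accessor), engine, accessor


def test_success_persists_order() -> None:
    manager, engine, accessor = make_manager()
    engine.discounted_price.return_value = 90.0

    assert manager.checkout("c-1", 100.0) == Outcome("completed")
    engine.discounted_price.assert_called_once_with("c-1", 100.0)
    accessor.save_order.assert_called_once_with("c-1", 90.0)


def test_domain_failure_is_rejected() -> None:
    manager, engine, accessor = make_manager()
    engine.discounted_price.side_effect = DomainFailure("price must be positive")

    assert manager.checkout("c-1", -1.0) == Outcome("rejected", "price must be positive")
    accessor.save_order.assert_not_called()


def test_unexpected_error_is_contained() -> None:
    manager, engine, accessor = make_manager()
    engine.discounted_price.side_effect = RuntimeError("boom")

    assert manager.checkout("c-1", 100.0).status == "error"
    accessor.save_order.assert_not_called()
```

The assertions target what crosses the seam: the arguments the Manager hands to the Engine and how it routes each response. The pricing rule itself is not asserted here, since that belongs to the Engine's unit tests.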

Mock Placement Is Architectural Evidence

Where you place mocks tells you where your boundaries are. Where you are forced to place mocks tells you where your boundaries should be.

The rule: mock at the role boundary, not inside the role. Each component role has one natural mock point — the interface at which it hands off to the next tier. Mock that interface and nothing else.

When a unit test requires mocking more than the single boundary below the component under test, something is wrong. Either the component has absorbed responsibilities that belong at a different tier, or its dependencies are implicit rather than injected, or an Accessor is missing. Mock proliferation is always a structural signal — not a testing problem, and not a problem that better mocking frameworks solve.
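One common source of a second mock is an implicit dependency such as the system clock. The sketch below assumes a hypothetical `LateFeeEngine` with an invented fee of 0.50 per day late. A version that called `date.today()` internally would force the test to patch the clock in addition to the Accessor; passing the date as an explicit input restores the single mock point.

```python
from datetime import date
from unittest.mock import Mock


class LateFeeEngine:
    """Hypothetical Engine: computes a late fee from a due date."""

    DAILY_FEE = 0.50  # assumed policy value, for illustration only

    def __init__(self, loan_accessor) -> None:
        self._loans = loan_accessor

    def fee(self, loan_id: str, today: date) -> float:
        # `today` is an explicit input. Reading the clock here instead
        # would create an implicit dependency and a second mock in every test.
        due = self._loans.get_due_date(loan_id)
        days_late = max((today - due).days, 0)
        return round(days_late * self.DAILY_FEE, 2)


def test_fee_for_three_days_late() -> None:
    accessor = Mock()  # the only mock: the boundary below the Engine
    accessor.get_due_date.return_value = date(2024, 1, 10)

    assert LateFeeEngine(accessor).fee("loan-1", today=date(2024, 1, 13)) == 1.5


def test_no_fee_before_due_date() -> None:
    accessor = Mock()
    accessor.get_due_date.return_value = date(2024, 1, 10)

    assert LateFeeEngine(accessor).fee("loan-1", today=date(2024, 1, 5)) == 0.0
```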

Scenarios Validate Architecture and Tests

The same core scenarios that validate structural boundaries in VBD and EBD are the test scenarios that matter most. They naturally exercise the most load-bearing code paths, the most significant collaborations, and the most complete representations of what the system is actually for. The structural models defined in VBD and EBD produce boundaries that localize change and test scope at the same time. The same line that prevents coupling prevents test contamination. The same role taxonomy that makes components replaceable makes them mockable.