Inhale/Exhale: A New Development Methodology for AI-Assisted Coding



This isn’t Test-Driven Development. This is something different.

The Pattern That Emerged

Over months of building Koru with Claude, a pattern emerged. We'd have intense sessions adding features (phantom types, comptime transforms), watching the test pass rate drop as we outpaced our test suite. Then we'd have cleanup sessions: fixing tests, deleting cruft, documenting real bugs.

At first this felt like failure. “We broke tests.” “We need to slow down.”

But then we noticed: the cleanup sessions were incredibly productive. We’d go from 72% to 76%+ in a single session. Not by writing code. By deleting the right tests.

We removed tests that were:

  • Using outdated APIs that no longer exist
  • Testing “aspirational” features with wrong signatures
  • Claiming things “aren’t implemented” when they ARE
  • Poisoning the AI’s understanding with lies

And we properly documented real bugs instead of hiding them.

This wasn’t sloppy development followed by cleanup. This was a methodology.

The Two Phases

INHALE (Velocity Phase)

During inhale, you’re building. Fast.

  • Blast out features rapidly
  • Tests WILL fail - that’s expected and OK
  • The failing test suite IS your issue tracker
  • Don’t stop to document - the code documents itself
  • Velocity over perfection
  • “Break things, leave breadcrumbs”

The key insight: failing tests during inhale are not failures. They’re breadcrumbs. They’re your issue tracker. They’re proof you’re moving fast enough to outpace your test suite.

EXHALE (Cleanup Phase)

During exhale, you’re stabilizing. Carefully.

  • Stop feature development
  • Look at every failure and ask: Bug? Outdated? Aspirational cruft?
  • DELETE poisonous tests (they pollute signal)
  • FIX tests that were just wrong (bad syntax, name collisions)
  • DOCUMENT real bugs as TODOs with clear explanations
  • Bring fresh eyes: another Claude session, ChatGPT, Codex
  • Tighten, verify, stabilize

The key insight: deleting bad tests is adding value. A clean test suite with 365 passing tests tells you more than a noisy one with 364 passing and 12 mysterious failures.

The Anti-Pattern: SPEC-Driven Development

There’s a popular approach in AI-assisted development: write a detailed SPEC first, then have the AI implement it.

This is antithetical to Inhale/Exhale. Here’s why:

SPECs Are Dead On Arrival

The moment you write a SPEC, it starts rotting. It describes what you planned to build, not what you actually built. And in fast-moving development, those diverge immediately.

# SPEC: Authentication System
- Users authenticate via JWT tokens
- Tokens expire after 24 hours
- Refresh tokens stored in httpOnly cookies

Three days later, you’ve pivoted to session-based auth because JWT didn’t fit. But the SPEC still says JWT. Now every AI session reads that SPEC and suggests JWT-based solutions.

The SPEC is actively lying to the AI.

SPECs Prevent Conversational Design

The best designs emerge through dialogue. You try something, it doesn’t feel right, you pivot. You discover edge cases mid-implementation. You realize the original plan was wrong.

SPECs freeze design before discovery. They say “here’s what we’re building” before you’ve learned what you should build.

With Inhale/Exhale:

  • Inhale: Try things. Let the design emerge through conversation.
  • Exhale: Document what you actually built via passing tests.

The tests ARE the spec. But they’re alive. They run. They verify. They can’t lie.
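
To make that concrete, here's a hypothetical illustration in Python (nothing below is Koru's actual test suite): the authentication "spec" from the example above, rewritten as a test after the pivot to session-based auth.

# Hypothetical sketch, not Koru code: the auth behaviour as a living spec.
# If the design pivots again, this fails loudly instead of rotting quietly
# the way SPEC.md does.
import uuid

SESSIONS: dict[str, str] = {}  # session_id -> username (stand-in for real storage)

def login(username: str, password: str) -> str:
    """Session-based auth: returns an opaque session id, not a JWT."""
    assert password, "real credential check elided"
    session_id = uuid.uuid4().hex
    SESSIONS[session_id] = username
    return session_id

def test_login_creates_server_side_session():
    session_id = login("ada", "hunter2")
    assert session_id in SESSIONS   # state lives on the server...
    assert "." not in session_id    # ...and the id is opaque, not a three-part JWT

When the next pivot happens, this "spec" breaks in CI and either gets fixed or deleted during exhale. A markdown SPEC would have just kept lying.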

SPECs Are Immediate Context Poisoning

In AI-assisted development, context is everything. The AI learns from every file it reads.

A SPEC file is:

  • Read early (it’s “documentation”)
  • Trusted implicitly (it’s “the plan”)
  • Never verified (it doesn’t run)
  • Stale within days (reality moved on)

This means the AI is learning from fiction. It’s building mental models based on plans that were abandoned. It’s suggesting patterns that don’t match the actual codebase.

Running tests can’t lie. SPECs can only lie.

What We Do Instead

Instead of: SPEC.md → implement → hope they stay in sync

We do: implement → tests fail → exhale → tests pass → tests ARE the spec

The documentation is generated from passing tests. If a feature works, it’s documented. If it doesn’t exist, it’s not documented. No drift. No lies. No context poisoning.

How It Differs From TDD

| TDD | Inhale/Exhale |
| --- | --- |
| Tests are sacred | Tests can become cruft |
| Red = STOP immediately | Red during inhale = breadcrumb |
| One failing test at a time | Batch failures, batch fixes |
| Tests drive the design | Design drives tests (inhale), tests verify design (exhale) |
| Solo developer mental model | Multi-agent collaboration model |

TDD says: “Never let the build break.”

We say: “Break it intentionally, clean it intentionally.”

Why This Works for AI-Assisted Development

1. Session Boundaries

Different Claude sessions for different phases. The inhale session has context about what you’re building. The exhale session has fresh eyes to see cruft.

2. Context Limits

AI can’t hold everything in context. So let the test suite be the memory. Failures ARE the issue tracker.

3. Fresh Eyes See Cruft

When you start a new exhale session, the AI doesn’t have attachment to the code. It can objectively ask: “Why does this test exist? Is it testing something real?”

4. Parallel Work

Inhale fast with one agent. Exhale can be delegated - even to a different AI or a human reviewer.

The Ecosystem That Makes This Work

Inhale/Exhale isn’t just a mindset - it requires infrastructure.

Running Documentation

Our test suite doesn’t just verify code - it generates documentation. The website you’re reading pulls examples directly from passing tests. This means:

  • Documentation can’t lie (it’s literally running code)
  • Examples are always current
  • Dead features disappear from docs automatically

When we delete a test with outdated syntax, we’re not just cleaning tests - we’re preventing that outdated syntax from appearing in docs and poisoning future understanding.
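
The exact pipeline is specific to Koru's tooling, but the shape is small enough to sketch. A minimal Python sketch, assuming a hypothetical JSON test report (the file name and fields are invented for illustration):

# Illustrative only - the real Koru pipeline differs.
# Assumes a report like: [{"name": ..., "passed": true, "source": ...}, ...]
import json
from pathlib import Path

def generate_examples(report_path: str, out_path: str) -> None:
    """Only passing tests contribute examples to the published docs."""
    results = json.loads(Path(report_path).read_text())
    sections = []
    for test in results:
        if not test["passed"]:
            continue  # failing or deleted tests never reach the docs
        sections.append(f"## {test['name']}\n\n{test['source']}\n")
    Path(out_path).write_text("\n".join(sections))

# generate_examples("test-report.json", "docs/examples.md")

Because only passing tests make it through, deleting a stale test removes its example from the site in the same commit.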

The Feedback System

During inhale, we capture issues in real-time through a structured feedback system. Not GitHub issues that go stale. Not TODO comments that get ignored. A living system that surfaces during exhale.
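
The specifics are ours, but the shape is roughly this, as a hedged Python sketch (the file name and fields are illustrative, not our actual format):

# Illustrative sketch - not Koru's actual feedback tooling.
# During inhale, append structured entries and keep moving.
# During exhale, read back whatever is still unresolved.
import json
import time
from pathlib import Path

FEEDBACK = Path("feedback.jsonl")

def capture(kind: str, note: str) -> None:
    """Called mid-inhale: record the issue without breaking flow."""
    entry = {"time": time.time(), "kind": kind, "note": note, "resolved": False}
    with FEEDBACK.open("a") as f:
        f.write(json.dumps(entry) + "\n")

def open_items() -> list[dict]:
    """Called during exhale: everything still unresolved surfaces here."""
    if not FEEDBACK.exists():
        return []
    entries = [json.loads(line) for line in FEEDBACK.read_text().splitlines()]
    return [e for e in entries if not e["resolved"]]

The point isn't the format - it's that the capture step is cheap enough not to break inhale, and the readback step is guaranteed to happen because exhale starts with it.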

Fighting Context Poisoning

This is the crucial insight for AI-assisted development: stale artifacts poison AI context.

When Claude reads a test that says “comptime transforms aren’t implemented yet” - but they ARE implemented - it learns the wrong thing. It might:

  • Avoid using a feature that works
  • Suggest workarounds for non-existent limitations
  • Repeat the lie in generated code comments

Every outdated test, every stale comment, every aspirational TODO is actively degrading AI performance. Deleting them isn’t cleanup - it’s protecting the quality of future AI sessions.

Velocity AND Correctness

Traditional wisdom says you trade off velocity for correctness. Inhale/Exhale challenges this:

  • Inhale gives velocity - move fast, let tests track what breaks
  • Exhale gives correctness - clean signal, verified state
  • Running docs ensure truth - can’t drift from reality
  • Pruning prevents rot - context stays clean for AI

You’re not sacrificing correctness for speed. You’re batching the correctness work and doing it more efficiently.

The Real Example

Here’s what we found today in our exhale phase:

Tests using wrong APIs:

// Test claimed this was the signature:
~[comptime|transform]event simple_transform {
    source: Source[Text],
    program_ast: *const Program  // WRONG!
}

// But the WORKING code uses:
~[keyword|comptime|transform]pub event if {
    expr: Expression,
    invocation: *const Invocation,
    item: *const Item,
    program: *const Program,  // RIGHT!
    allocator: std.mem.Allocator
}

The test was claiming comptime transforms “aren’t implemented yet” - but they ARE implemented! The test just had the wrong signature. Delete it.

Tests documenting real bugs:

// Test 407: This SHOULD work but doesn't
~import "$test/mylib"
~mylib:double(value: 21)  // BUG: shorthand doesn't resolve

// You have to write the full path:
~test.mylib:double(value: 21)  // This works

This test documents a real codegen bug. Don’t “fix” it by changing the test - document it as a TODO and fix the emitter later.

The Methodology in Practice

  1. Inhale: Build features for a session or two. Let tests break. Don’t stop.

  2. Check the damage: Run --status. See how many failures accumulated.

  3. Exhale: New session. Go through each failure:

    • Is this testing something real? Keep it, fix the bug.
    • Is this testing outdated syntax? Delete it.
    • Is this testing aspirational features? Delete it or TODO it.
    • Is this documenting a real bug? Keep it as TODO with clear docs.
  4. Measure: Pass rate should go UP from deletions (a sketch of the arithmetic follows this list). Signal should be cleaner.

  5. Repeat: Back to inhale with a clean baseline.
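
The triage calls themselves are judgment, but the arithmetic behind step 4 is worth making explicit. A minimal Python sketch - the test names, verdicts, and counts below are invented for illustration, not Koru's real numbers:

def exhale_pass_rate(passing: int, verdicts: dict[str, str]) -> float:
    """verdicts maps each failing test to 'fix', 'delete', or 'todo'."""
    deleted = sum(1 for v in verdicts.values() if v == "delete")
    # Deletions shrink the denominator. (Fixes raise the numerator too,
    # but this sketch models only the deletion effect.)
    return passing / (passing + len(verdicts) - deleted)

rate = exhale_pass_rate(
    passing=364,
    verdicts={
        "old_api_signature": "delete",     # outdated syntax: cruft
        "aspirational_feature": "delete",  # never matched reality
        "shorthand_import_bug": "todo",    # real bug, kept and documented
    },
)
print(f"{rate:.1%}")  # 99.7% - the rate rose without writing a line of feature code

The deletions aren't gaming the metric; they're removing failures that were never telling you anything true.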

The Uncomfortable Truth

Deleting tests feels wrong. We’re trained to think more tests = better.

But tests have a cost:

  • They take time to run
  • They take context to understand
  • They create noise when they fail for wrong reasons
  • They can lie about what the system does

A test that claims “feature X doesn’t work” when feature X DOES work is actively harmful. It’s not a safety net. It’s a landmine.

When to Exhale

You know it’s time to exhale when:

  • You’ve lost track of which failures matter
  • The pass rate has dropped significantly
  • You’re context-switching between “real work” and “test confusion”
  • A new session looks at the failures and says “wait, what?”

Conclusion

Inhale/Exhale isn’t about being sloppy. It’s about being intentional.

Intentionally fast during inhale. Intentionally careful during exhale.

The test suite is a living document. Living things need pruning.

But it’s more than that. In AI-assisted development, your codebase IS the context. Every file, every test, every comment shapes how the AI understands your project. Stale artifacts don’t just sit there - they actively mislead.

Inhale/Exhale is a methodology for:

  • Velocity without sacrificing correctness
  • Living documentation that can’t lie
  • Clean context that helps AI help you
  • Pivoting without accumulated cruft holding you back

The garden is healthier when you prune. And the AI is smarter when you delete the lies.


A methodology that emerged from building a programming language with Claude. Not planned. Discovered.