Inhale/Exhale: A New Development Methodology for AI-Assisted Coding
This isn’t Test-Driven Development. This is something different.
The Pattern That Emerged
Over months of building Koru with Claude, a pattern emerged. We’d have intense sessions adding features, phantom types, comptime transforms - watching the test pass rate drop as we outpaced our test suite. Then we’d have cleanup sessions: fixing tests, deleting cruft, documenting real bugs.
At first this felt like failure. “We broke tests.” “We need to slow down.”
But then we noticed: the cleanup sessions were incredibly productive. We’d go from 72% to 76%+ in a single session. Not by writing code. By deleting the right tests.
We removed tests that were:
- Using outdated APIs that no longer exist
- Testing “aspirational” features with wrong signatures
- Claiming things “aren’t implemented” when they ARE
- Poisoning the AI’s understanding with lies
And we properly documented real bugs instead of hiding them.
This wasn’t sloppy development followed by cleanup. This was a methodology.
The Two Phases
INHALE (Velocity Phase)
During inhale, you’re building. Fast.
- Blast out features rapidly
- Tests WILL fail - that’s expected and OK
- The failing test suite IS your issue tracker
- Don’t stop to document - the code documents itself
- Velocity over perfection
- “Break things, leave breadcrumbs”
The key insight: failing tests during inhale are not failures. They’re breadcrumbs. They’re your issue tracker. They’re proof you’re moving fast enough to outpace your test suite.
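As a concrete illustration of “failures as issue tracker”, here is a minimal sketch in Python. Everything in it is hypothetical: it assumes a runner invoked as `run-tests` that prints one `PASS <name>` or `FAIL <name>` line per test - Koru’s actual tooling differs, so adapt the parsing to whatever your runner emits.

```python
# breadcrumbs.py - a sketch of "the failing test suite IS your issue tracker".
# Assumes a hypothetical `run-tests` command that prints one "PASS <name>"
# or "FAIL <name>" line per test; adapt the parsing to your own tooling.
import subprocess

def collect_breadcrumbs(cmd=("run-tests",)) -> list[str]:
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    failures = [line.split(maxsplit=1)[1]
                for line in out.splitlines()
                if line.startswith("FAIL ")]
    # During inhale, record failures instead of fixing them: each one is
    # a breadcrumb for the exhale session, not a reason to stop building.
    with open("breadcrumbs.txt", "w") as f:
        f.write("\n".join(failures) + "\n")
    return failures
```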
EXHALE (Cleanup Phase)
During exhale, you’re stabilizing. Carefully.
- Stop feature development
- Look at every failure and ask: Bug? Outdated? Aspirational cruft?
- DELETE poisonous tests (they pollute signal)
- FIX tests that were just wrong (bad syntax, name collisions)
- DOCUMENT real bugs as TODOs with clear explanations
- Bring fresh eyes: another Claude session, ChatGPT, Codex
- Tighten, verify, stabilize
The key insight: deleting bad tests is adding value. A clean test suite with 365 passing tests tells you more than a noisy one with 364 passing and 12 mysterious failures.
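In practice the triage is conversational - a fresh session reasoning about each failure - but the decision table is mechanical enough to sketch. A toy version, reusing the hypothetical `breadcrumbs.txt` from the inhale sketch above:

```python
# exhale_triage.py - a toy triage loop over the inhale breadcrumbs.
# The verdicts mirror the exhale checklist above; names are hypothetical.
from enum import Enum

class Verdict(Enum):
    DELETE = "delete the test"              # outdated API or aspirational cruft
    TODO = "keep as TODO with clear docs"   # documents a real bug
    FIX_BUG = "keep the test, fix the bug"  # testing something real

QUESTIONS = [
    ("Outdated syntax or dead API?", Verdict.DELETE),
    ("Aspirational, never-built feature?", Verdict.DELETE),
    ("Documents a real, known bug?", Verdict.TODO),
]

def triage() -> dict[str, Verdict]:
    ledger = {}
    for name in open("breadcrumbs.txt").read().splitlines():
        print(f"\n{name}")
        for question, verdict in QUESTIONS:
            if input(f"  {question} [y/N] ").strip().lower() == "y":
                ledger[name] = verdict
                break
        else:
            # None of the above: it's testing something real that broke.
            ledger[name] = Verdict.FIX_BUG
    return ledger
```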
The Anti-Pattern: SPEC-Driven Development
There’s a popular approach in AI-assisted development: write a detailed SPEC first, then have the AI implement it.
This is antithetical to Inhale/Exhale. Here’s why:
SPECs Are Dead On Arrival
The moment you write a SPEC, it starts rotting. It describes what you planned to build, not what you actually built. And in fast-moving development, those diverge immediately.
```markdown
# SPEC: Authentication System
- Users authenticate via JWT tokens
- Tokens expire after 24 hours
- Refresh tokens stored in httpOnly cookies
```

Three days later, you’ve pivoted to session-based auth because JWT didn’t fit. But the SPEC still says JWT. Now every AI session reads that SPEC and suggests JWT-based solutions.
The SPEC is actively lying to the AI.
SPECs Prevent Conversational Design
The best designs emerge through dialogue. You try something, it doesn’t feel right, you pivot. You discover edge cases mid-implementation. You realize the original plan was wrong.
SPECs freeze design before discovery. They say “here’s what we’re building” before you’ve learned what you should build.
With Inhale/Exhale:
- Inhale: Try things. Let the design emerge through conversation.
- Exhale: Document what you actually built via passing tests.
The tests ARE the spec. But they’re alive. They run. They verify. They can’t lie.
SPECs Are Immediate Context Poisoning
In AI-assisted development, context is everything. The AI learns from every file it reads.
A SPEC file is:
- Read early (it’s “documentation”)
- Trusted implicitly (it’s “the plan”)
- Never verified (it doesn’t run)
- Stale within days (reality moved on)
This means the AI is learning from fiction. It’s building mental models based on plans that were abandoned. It’s suggesting patterns that don’t match the actual codebase.
Running tests can’t lie. SPECs can only lie.
What We Do Instead
Instead of: SPEC.md → implement → hope they stay in sync
We do: implement → tests fail → exhale → tests pass → tests ARE the spec
The documentation is generated from passing tests. If a feature works, it’s documented. If it doesn’t exist, it’s not documented. No drift. No lies. No context poisoning.
How It Differs From TDD
| TDD | Inhale/Exhale |
|---|---|
| Tests are sacred | Tests can become cruft |
| Red = STOP immediately | Red during inhale = breadcrumb |
| One failing test at a time | Batch failures, batch fixes |
| Tests drive the design | Design drives tests (inhale), tests verify design (exhale) |
| Solo developer mental model | Multi-agent collaboration model |
TDD says: “Never let the build break.”
We say: “Break it intentionally, clean it intentionally.”
Why This Works for AI-Assisted Development
1. Session Boundaries
Different Claude sessions for different phases. The inhale session has context about what you’re building. The exhale session has fresh eyes to see cruft.
2. Context Limits
AI can’t hold everything in context. So let the test suite be the memory. Failures ARE the issue tracker.
3. Fresh Eyes See Cruft
When you start a new exhale session, the AI doesn’t have attachment to the code. It can objectively ask: “Why does this test exist? Is it testing something real?”
4. Parallel Work
Inhale fast with one agent. Exhale can be delegated - even to a different AI or a human reviewer.
The Ecosystem That Makes This Work
Inhale/Exhale isn’t just a mindset - it requires infrastructure.
Running Documentation
Our test suite doesn’t just verify code - it generates documentation. The website you’re reading pulls examples directly from passing tests. This means:
- Documentation can’t lie (it’s literally running code)
- Examples are always current
- Dead features disappear from docs automatically
When we delete a test with outdated syntax, we’re not just cleaning tests - we’re preventing that outdated syntax from appearing in docs and poisoning future understanding.
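Here is a minimal sketch of the idea - not Koru’s actual pipeline. It assumes test files live under `tests/` with a `.koru` extension and that a `results` map records which tests passed; all of those details are made up for illustration.

```python
# docgen.py - a sketch of running documentation: only tests that pass
# make it into the docs, so dead features disappear automatically.
from pathlib import Path

def generate_docs(results: dict[str, bool],
                  test_dir: str = "tests",
                  out: str = "docs/examples.md") -> None:
    sections = []
    for path in sorted(Path(test_dir).glob("*.koru")):
        if not results.get(path.stem, False):
            continue  # failing or deleted tests never reach the docs
        sections.append(f"## {path.stem}\n\n```koru\n{path.read_text()}\n```\n")
    Path(out).write_text("\n".join(sections))
```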
The Feedback System
During inhale, we capture issues in real-time through a structured feedback system. Not GitHub issues that go stale. Not TODO comments that get ignored. A living system that surfaces during exhale.
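The shape is simple enough to sketch. Assume an append-only JSONL log; the format and function names below are illustrative, not the real feedback tooling:

```python
# feedback.py - a sketch of structured, in-session feedback capture.
import json
import time

FEEDBACK_LOG = "feedback.jsonl"

def capture(kind: str, message: str, test: str | None = None) -> None:
    """Append one entry during inhale; cheap enough to never break flow."""
    entry = {"ts": time.time(), "kind": kind, "message": message, "test": test}
    with open(FEEDBACK_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")

def surface() -> list[dict]:
    """Read everything back at the start of an exhale session."""
    with open(FEEDBACK_LOG) as f:
        return [json.loads(line) for line in f]

# e.g. capture("bug", "shorthand import doesn't resolve", test="test_407")
```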
Fighting Context Poisoning
This is the crucial insight for AI-assisted development: stale artifacts poison AI context.
When Claude reads a test that says “comptime transforms aren’t implemented yet” - but they ARE implemented - it learns the wrong thing. It might:
- Avoid using a feature that works
- Suggest workarounds for non-existent limitations
- Repeat the lie in generated code comments
Every outdated test, every stale comment, every aspirational TODO is actively degrading AI performance. Deleting them isn’t cleanup - it’s protecting the quality of future AI sessions.
Velocity AND Correctness
Traditional wisdom says you trade off velocity for correctness. Inhale/Exhale challenges this:
- Inhale gives velocity - move fast, let tests track what breaks
- Exhale gives correctness - clean signal, verified state
- Running docs ensure truth - can’t drift from reality
- Pruning prevents rot - context stays clean for AI
You’re not sacrificing correctness for speed. You’re batching the correctness work and doing it more efficiently.
The Real Example
Here’s what we found today in our exhale phase:
Tests using wrong APIs:
```koru
// Test claimed this was the signature:
~[comptime|transform]event simple_transform {
    source: Source[Text],
    program_ast: *const Program  // WRONG!
}

// But the WORKING code uses:
~[keyword|comptime|transform]pub event if {
    expr: Expression,
    invocation: *const Invocation,
    item: *const Item,
    program: *const Program,  // RIGHT!
    allocator: std.mem.Allocator
}
```

The test was claiming comptime transforms “aren’t implemented yet” - but they ARE implemented! The test just had the wrong signature. Delete it.
Tests documenting real bugs:
```koru
// Test 407: This SHOULD work but doesn't
~import "$test/mylib"

~mylib:double(value: 21)       // BUG: shorthand doesn't resolve

// You have to write the full path:
~test.mylib:double(value: 21)  // This works
```

This test documents a real codegen bug. Don’t “fix” it by changing the test - document it as a TODO and fix the emitter later.
The Methodology in Practice
1. Inhale: Build features for a session or two. Let tests break. Don’t stop.
2. Check the damage: Run `--status`. See how many failures accumulated.
3. Exhale: New session. Go through each failure:
   - Is this testing something real? Keep it, fix the bug.
   - Is this testing outdated syntax? Delete it.
   - Is this testing aspirational features? Delete it or TODO it.
   - Is this documenting a real bug? Keep it as a TODO with clear docs.
4. Measure: Pass rate should go UP from deletions. Signal should be cleaner. (See the status sketch after this list.)
5. Repeat: Back to inhale with a clean baseline.
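The “check the damage” and “measure” steps can be a single helper. A minimal sketch with illustrative numbers - the real `--status` output is richer, and 720/1000 is invented here to mirror the 72% → 76% jump described earlier:

```python
# status.py - a sketch of the "check the damage" and "measure" steps.
def pass_rate(passed: int, total: int) -> float:
    return 100.0 * passed / total if total else 0.0

# Illustrative numbers only: start at 720/1000 passing (72.0%), then let
# an exhale session delete 50 poisonous failing tests and write no code.
before = pass_rate(720, 1000)  # 72.0
after = pass_rate(720, 950)    # ~75.8 - nearly four points from deletions
```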
The Uncomfortable Truth
Deleting tests feels wrong. We’re trained to think more tests = better.
But tests have a cost:
- They take time to run
- They take context to understand
- They create noise when they fail for wrong reasons
- They can lie about what the system does
A test that claims “feature X doesn’t work” when feature X DOES work is actively harmful. It’s not a safety net. It’s a landmine.
When to Exhale
You know it’s time to exhale when:
- You’ve lost track of which failures matter
- The pass rate has dropped significantly
- You’re context-switching between “real work” and “test confusion”
- A new session looks at the failures and says “wait, what?”
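The first two signals are mechanical enough to automate. A toy trigger, with thresholds that are pure assumption - tune them to your project:

```python
# A sketch of a mechanical exhale trigger (thresholds are illustrative).
def should_exhale(pass_rate: float, baseline: float, failures: int) -> bool:
    # Exhale when the rate has slid well below the last clean baseline,
    # or when there are more failures than anyone can hold in their head.
    return (baseline - pass_rate) > 5.0 or failures > 20
```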
Conclusion
Inhale/Exhale isn’t about being sloppy. It’s about being intentional.
Intentionally fast during inhale. Intentionally careful during exhale.
The test suite is a living document. Living things need pruning.
But it’s more than that. In AI-assisted development, your codebase IS the context. Every file, every test, every comment shapes how the AI understands your project. Stale artifacts don’t just sit there - they actively mislead.
Inhale/Exhale is a methodology for:
- Velocity without sacrificing correctness
- Living documentation that can’t lie
- Clean context that helps AI help you
- Pivoting without accumulated cruft holding you back
The garden is healthier when you prune. And the AI is smarter when you delete the lies.
A methodology that emerged from building a programming language with Claude. Not planned. Discovered.