Tests as Living Documentation: Why Our Docs Are Always Correct

The problem: Documentation lies. It starts accurate, then drifts. The code changes, the docs don’t. Six months later, the “Getting Started” guide references APIs that no longer exist.

Our solution: We don’t write documentation. We write tests. Then we render the tests as documentation.


The Insight

Every regression test is already a tiny, self-contained example:

  • Input: Working code that demonstrates a feature
  • Expected output: What it should produce
  • Context: Comments explaining what’s being tested

That’s a tutorial. We were just hiding it in a test directory.


What We Built

Our /learn section pulls directly from koru/tests/regression/:

tests/regression/
├── 000_CORE_LANGUAGE/
│   ├── 010_BASIC_SYNTAX/
│   │   ├── 010_001_hello_world/
│   │   │   ├── input.kz          → Code example
│   │   │   ├── expected.txt      → Expected output
│   │   │   └── README.md         → Explanation (optional)
│   │   ├── 010_002_simple_event/
│   │   └── ...
│   └── 020_EVENTS/
├── 100_PARSER/
└── 300_ADVANCED_FEATURES/

A build script walks this tree and generates JSON:

{
  "categories": [
    {
      "name": "CORE LANGUAGE / BASIC SYNTAX",
      "tests": [
        {
          "name": "hello world",
          "inputKz": "// The actual test code...",
          "expectedTxt": "Hello World",
          "status": "success",     // From CI
          "relatedFiles": [...]    // Imported .kz files
        }
      ]
    }
  ]
}

The website renders this JSON with syntax highlighting, navigation, and status badges.
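
For the curious, the shape of that JSON maps to a few TypeScript interfaces. This is a sketch inferred from the example above; the exact set of status values, and any fields beyond those shown, are assumptions:

// Sketch of the lessons.json shape, inferred from the example above.
// Only the fields shown in the example are grounded; the rest is assumed.
interface Lesson {
  name: string;          // "hello world"
  inputKz: string;       // contents of input.kz
  expectedTxt: string;   // contents of expected.txt
  status: "success" | "failure" | "todo" | "broken";  // from CI (assumed union)
  relatedFiles: string[]; // imported .kz files
}

interface Category {
  name: string;          // "CORE LANGUAGE / BASIC SYNTAX"
  tests: Lesson[];
}

interface LessonsFile {
  categories: Category[];
}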


Why This Works

1. Tests Can’t Lie

If the documentation says “this code prints Hello World” but it actually crashes, the test fails. CI catches it. The docs are forced to be correct.

Traditional docs have no such constraint. They’re just text files that nobody runs.
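
The enforcement is purely mechanical. Here's a minimal sketch of the per-example check, assuming a koru CLI that runs a .kz file and prints its output (the command name and invocation are illustrative assumptions, not the actual runner):

import { execFileSync } from "node:child_process";
import { readFileSync } from "node:fs";
import { join } from "node:path";

// Run one example and compare its output to the committed expectation.
// "koru" as a CLI name is an assumption for illustration.
function verifyExample(testDir: string): boolean {
  const expected = readFileSync(join(testDir, "expected.txt"), "utf8");
  const actual = execFileSync("koru", [join(testDir, "input.kz")], {
    encoding: "utf8",
  });
  return actual.trim() === expected.trim();
}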

2. Single Source of Truth

There’s no sync problem because there’s nothing to sync. The test IS the doc. Change the test, the docs update automatically.

3. Real Code, Not Pseudocode

Every example on /learn is code that actually compiles and runs. We know because it runs in CI on every commit.

How many tutorials have you followed where the example code doesn’t actually work?

4. Status Visibility

Each lesson shows its CI status:

  • Passing: This works right now
  • Failing: We know it’s broken (transparency!)
  • TODO: Planned but not implemented
  • Broken: Test itself needs fixing

Users see exactly what works and what doesn’t. No false promises.
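
One plausible way to compute the badge is to combine a test's marker (the markers are described in the build pipeline below) with its latest CI result. The exact mapping here is an assumption, not the real implementation:

type Badge = "passing" | "failing" | "todo" | "broken";

// Hypothetical mapping from test marker + CI result to a badge.
function badgeFor(marker: string, ciPassed: boolean): Badge {
  if (marker === "TODO") return "todo";
  if (marker === "SKIP") return "broken";  // assumed: skipped = test needs fixing
  // MUST_RUN and MUST_FAIL both carry a definite expectation,
  // so the badge just reflects whether CI met it.
  return ciPassed ? "passing" : "failing";
}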


The Build Pipeline

# Generate lessons from test suite
npm run lessons

# Output: src/lib/data/lessons.json
# Contains all test code, expected outputs, related files

The script:

  1. Walks the regression test directory tree
  2. Reads input.kz, expected.txt, README.md from each test
  3. Captures test markers (MUST_RUN, MUST_FAIL, TODO, SKIP)
  4. Finds imported .kz files
  5. Generates structured JSON
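
In sketch form, the core of such a walker might look like this (the directory layout and file names come from the test suite above; the helper structure is illustrative):

import { existsSync, readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

// Recursively collect every test directory that contains an input.kz.
function collectTests(dir: string, out: string[] = []): string[] {
  for (const entry of readdirSync(dir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const sub = join(dir, entry.name);
    if (existsSync(join(sub, "input.kz"))) out.push(sub);
    else collectTests(sub, out);
  }
  return out;
}

// Read the pieces each lesson needs; README.md is optional.
function readTest(dir: string) {
  const read = (f: string) =>
    existsSync(join(dir, f)) ? readFileSync(join(dir, f), "utf8") : null;
  return {
    inputKz: read("input.kz"),
    expectedTxt: read("expected.txt"),
    readme: read("README.md"),
  };
}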

Vercel runs this at deploy time, so every deploy regenerates the lessons from the current test suite.


What You See

Each lesson page shows:

  • Breadcrumb: Learn / Core Language / Basic Syntax / Hello World
  • Status badge: Passing/Failing/TODO
  • The code: Syntax-highlighted, copyable
  • Expected output: What it produces
  • Imported files: Any related .kz files
  • Test configuration: Compiler flags, expected errors, etc.

It’s a test report disguised as a tutorial.
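
Even the breadcrumb falls straight out of the directory names. Here's a sketch of the kind of transformation involved, with the exact formatting rules assumed:

// Turn "000_CORE_LANGUAGE/010_BASIC_SYNTAX/010_001_hello_world"
// into ["Core Language", "Basic Syntax", "Hello World"].
function breadcrumb(testPath: string): string[] {
  return testPath.split("/").map((segment) =>
    segment
      .replace(/^\d+(_\d+)?_/, "")  // drop numeric ordering prefixes
      .split("_")
      .map((w) => w.charAt(0).toUpperCase() + w.slice(1).toLowerCase())
      .join(" ")
  );
}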


The Feedback Loop

We added a feedback widget to every page. Spot an issue? Add feedback directly on the lesson. The feedback includes:

  • The page URL (which maps to a test path)
  • The full test context
  • Priority and status
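
Put together, a feedback record has roughly this shape (field names and types beyond the list above are assumptions):

// Hypothetical shape of one feedback item, based on the fields listed above.
interface Feedback {
  id: string;
  pageUrl: string;      // maps back to a test path under tests/regression/
  testContext: string;  // the full test code the reader was looking at
  priority: "low" | "medium" | "high";
  status: "open" | "done";
}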

From the CLI:

npm run feedback              # See all open feedback
npm run feedback:context <id> # Get feedback + full test code
npm run feedback:done <id>    # Mark resolved

Found a problem in the docs? You’ve found a problem in the tests. Fix the test, the docs fix themselves.


The Numbers

Our test suite has 289 tests across:

  • Core language features
  • Parser behavior
  • Advanced features (comptime, taps)
  • Error cases (expected failures)

That’s 289 working examples, automatically documented, always current.


Try It

Browse the /learn section. Every page is a real test. The code runs. The output is verified. The status is live.

Documentation that can’t drift because it’s executed on every commit.


Published November 23, 2025