Dumb Boundaries, Smart Middles

Why Koru's parser and emitter know almost nothing

There is a design principle running through every part of Koru’s architecture, from transforms to phantom types to the compilation pipeline itself:

The parser is dumb. The emitter is dumb. Everything in between is smart.

This is not an accident or a limitation. It is a deliberate decomposition — one that transfers capability from the infrastructure to the people who use it.

What the Parser Knows

The Koru parser knows how to read source text and produce an AST. It understands events, flows, continuations, annotations. It can tell you that ~open(path: "test.txt") is an event invocation and that | opened f |> is a continuation binding.

What the parser does not know:

  • What [transform] means.
  • What [fs:open!] means.
  • What ~if does.
  • Whether [has:transform+sprite] is a valid component set.

These are opaque strings. The parser captures them faithfully and moves on. It has no opinion about what they mean.

This is intentional. If the parser understood annotations, it would need to know about every annotation in existence — including every annotation your user code might define. That is exactly the wrong place for that knowledge to live.
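A minimal sketch of this capture-without-interpretation idea, in Python (all names here are illustrative, not Koru's real internals): the parser splits a type like `*File[fs:open!]` into a base type and a list of verbatim annotation strings, and attaches them to the AST node with no further processing.

```python
import re
from dataclasses import dataclass, field

@dataclass
class TypeNode:
    base: str                                    # e.g. "*File"
    annotations: list[str] = field(default_factory=list)

def parse_type(src: str) -> TypeNode:
    """Capture bracketed annotations as opaque strings; interpret nothing."""
    annotations = re.findall(r"\[([^\]]+)\]", src)
    base = re.sub(r"\[[^\]]*\]", "", src)
    return TypeNode(base, annotations)

node = parse_type("*File[fs:open!]")
print(node.base)         # *File
print(node.annotations)  # ['fs:open!'] -- just a string; the parser has no opinion
```

The annotation survives into the AST exactly as written, so any later pass can decide what it means.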

What the Emitter Knows

The emitter knows how to turn an AST into Zig source code. It maps each node type to its output representation. It handles indentation and scope and variable naming.

What the emitter does not know:

  • Whether your phantom states were correctly satisfied.
  • Whether your transforms have run.
  • Whether the purity checker approved this proc.
  • Whether the structure makes semantic sense.

By the time the emitter runs, all of that has already been handled. The emitter trusts that the passes before it did their job. It does not verify — it transcribes.

This is also intentional. An emitter that double-checked semantic validity would be reimplementing the analysis passes that already ran. That is exactly the wrong place for that knowledge to live.
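The transcription step can be sketched in a few lines (a hypothetical illustration, not Koru's actual emitter): by emission time the annotations have already been validated or ignored by earlier passes, so the emitter simply strips them from the output type.

```python
import re

def emit_type(annotated: str) -> str:
    """Transcribe a type, dropping bracketed annotations entirely.

    The annotation has zero runtime representation; validity was
    established by earlier passes, so nothing is re-checked here.
    """
    return re.sub(r"\[[^\]]*\]", "", annotated)

print(emit_type("*File[fs:open!]"))  # *File
```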

The Middle Is Where Knowledge Lives

Between the parser and the emitter, there is a sequence of passes. Each pass understands something specific. Each pass claims its vocabulary from the AST and does its work.

The transform passes understand [transform] annotations. They walk the AST, find invocations of transform events, and rewrite them into their expanded form. ~if(expr) becomes a conditional node with zero overhead. The parser never knew what ~if meant. The emitter still does not. Only the transform pass does.

The phantom checker understands phantom type annotations. It walks the AST, finds event signatures with states like [fs:open!] and [!opened], tracks obligation creation and disposal through flows, and errors on violations. The parser captured [fs:open!] as an opaque string. The emitter will ignore it entirely. Only the phantom checker gives it meaning.

The purity checker understands [pure] annotations. The structure checker understands flow coverage. The dead-stripper understands reachability.

Each pass owns its slice of meaning. None of them overlap.

Phantom Types Are the Clearest Example

When you write:

~event open { path: []const u8 }
| opened { file: *File[fs:open!] }

~event close { file: *File[!fs:open] }
| closed { file: *File[fs:closed] }

The string fs:open! is opaque to the parser. It goes into the AST as-is. The emitter will later produce *File from *File[fs:open!], dropping the annotation entirely — it has zero runtime representation.

In between, the phantom checker walks the AST. It looks for annotations with the shapes it understands: [state!] creates an obligation, [!state] disposes one, mismatched states in event calls are errors. It claims those annotations and validates them according to its rules.

Here is the nuance that matters: the phantom checker is not dispatched-to. No part of the compiler says “here is an annotation — go handle it.” The checker is a compiler pass that walks the AST and finds what is relevant to it. It hunts, rather than receives.

This distinction matters for what comes next.
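The obligation-tracking rule the checker applies can be sketched as follows, under the semantics described above: `[state!]` creates an obligation, `[!state]` disposes one, and anything left open at the end of a flow is an error. This is a toy illustration with hypothetical names, not the real checker.

```python
def check_flow(annotations: list[str]) -> list[str]:
    """Track phantom obligations across a flow; return any violations."""
    open_obligations: set[str] = set()
    errors: list[str] = []
    for ann in annotations:
        if ann.endswith("!"):                  # [state!] creates an obligation
            open_obligations.add(ann[:-1])
        elif ann.startswith("!"):              # [!state] disposes one
            state = ann[1:]
            if state in open_obligations:
                open_obligations.remove(state)
            else:
                errors.append(f"disposing '{state}' with no open obligation")
    # Anything still open at the end of the flow was never disposed.
    errors += [f"obligation '{s}' never disposed" for s in sorted(open_obligations)]
    return errors

print(check_flow(["fs:open!", "!fs:open"]))  # [] -- balanced open/close
print(check_flow(["fs:open!"]))              # leaked obligation -> error
```

Note that the checker iterates over whatever annotations it finds in the AST: it hunts for its vocabulary rather than being handed it.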

Multiple Disciplines, Same Mechanism

The phantom checker validates resource lifecycles: opened → closed, allocated → freed. But [fs:open!] is just a string. There is nothing stopping a different pass from claiming a different vocabulary from the same annotation space.

An ECS checker would walk the AST and look for [has:transform+sprite] and [lacks:sprite]. It would validate that render() is only called on entities with the required components, and that remove_sprite() correctly transitions the component set. The phantom checker would ignore these entirely — it does not understand has: or lacks:.

A borrow checker would walk the AST and look for lifetime annotations: *Data['a:owned], *Data['a:borrowed]. It would enforce Rust-style ownership rules in the scopes where those annotations appear. The ECS checker would ignore them.

All three can coexist in the same program. Each claims its vocabulary. None interfere with each other. The parser does not know which checker applies to which annotation — it just captures the strings. The checkers sort it out.
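The "each pass claims its vocabulary" idea can be sketched with hypothetical recognizer functions: each checker tests whether an annotation belongs to it and silently skips everything else, so the three disciplines partition the annotation space without any central dispatcher.

```python
def phantom_claims(ann: str) -> bool:
    return ann.endswith("!") or ann.startswith("!")   # [state!] / [!state]

def ecs_claims(ann: str) -> bool:
    return ann.startswith(("has:", "lacks:"))         # [has:...] / [lacks:...]

def borrow_claims(ann: str) -> bool:
    return ann.startswith("'")                        # ['a:owned] etc.

annotations = ["fs:open!", "has:transform+sprite", "'a:owned"]

print([a for a in annotations if phantom_claims(a)])  # ['fs:open!']
print([a for a in annotations if ecs_claims(a)])      # ['has:transform+sprite']
print([a for a in annotations if borrow_claims(a)])   # ["'a:owned"]
```

Each checker sees only its own vocabulary; the other annotations pass through it untouched.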

The Missing Piece: Scope Binding

For multiple disciplines to coexist cleanly, you need a way to say which checker applies in which scope. Right now this is implicit — the checkers are registered in the pipeline and run globally. What is missing is a scope-level declaration:

~[phantom: ecs_checker]  // Everything in this scope uses ECS semantics

render(entity: e)  // Checker validates [has:transform+sprite]

This is small. A few lines in a custom compiler:coordinate proc, telling the relevant pass which scopes are its domain. The architecture already accommodates it. The passes already know how to walk the AST. The annotations are already opaque strings that each pass claims independently.

It is the last inch before multiple type disciplines can fully coexist in a single program without collision.

The Architecture Is Not Locked

The phantom checker itself is replaceable. It is a compiler pass — a [comptime] event in the standard library, called from compiler:coordinate. If you do not like the default checker, replace it:

~std.compiler:coordinate = my_pipeline(...)

Wire in your own semantic checker. Or keep the default and add your own alongside it. Or remove it entirely. The pipeline is user code, and the checker is in the pipeline.
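If the pipeline is ordinary user-visible data, replacing a checker is just a data edit. A hypothetical sketch (illustrative names, not Koru's standard library): the pipeline is an ordered list of passes, and swapping the default checker for your own is a one-line change.

```python
def parse(src: str) -> dict:
    return {"src": src, "checked_by": None}

def default_phantom_checker(ast: dict) -> dict:
    ast["checked_by"] = "default"
    return ast

def my_semantic_checker(ast: dict) -> dict:
    ast["checked_by"] = "custom"
    return ast

def run(pipeline, src: str) -> dict:
    ast = parse(src)
    for compiler_pass in pipeline:   # each pass transforms or checks the AST
        ast = compiler_pass(ast)
    return ast

# "Wire in your own semantic checker" amounts to editing the list:
pipeline = [my_semantic_checker]     # instead of [default_phantom_checker]
print(run(pipeline, "~open(...)")["checked_by"])  # custom
```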

This is not a side effect of the architecture. It is what the architecture was built for. The parser does not bake in any understanding of phantom annotations because the understanding is supposed to live in a pass — and passes are supposed to be replaceable.

Why Opaque Is a Feature

When you first encounter “phantom types are opaque strings,” it can sound like a limitation. Like the language is avoiding commitment.

It is the opposite.

Making the parser opaque to annotation semantics means annotation semantics can live anywhere — in a standard library pass, in a third-party module, in your own code. Making the emitter opaque to semantic validity means semantic validity can be defined by whoever has the best context for defining it. Making the checkers into passes means the checkers can be replaced, extended, or supplemented without touching the compiler.

Opaque is not a gap in the design. It is what transfers power to the right place.

The parser captures. The emitter transcribes. The passes between them — transforms, analysis, phantom checking, whatever else you add — are where all the understanding lives. And that understanding belongs to whoever writes the passes.

That is the design principle. Everything else in Koru’s architecture follows from it.