Your Regex Is a Branch, Not a Library Call
In Koru, match is control flow and each pattern is a branch the compiler turns into a specialized native DFA. Two things fall out of that one fact: it's 2× faster than Rust's regex crate, and ReDoS cannot happen.
Here is a regex match in basically every language you’ve used:
const re = /[a-z]+@[a-z]+/;
if (re.test(input)) { ... } You build an object. At runtime it parses the pattern, builds some internal machine, and walks your input through it. The pattern is data that a general engine interprets.
Here is the same thing in Koru:
import std/regex
import std/io
std/regex:match("foo@bar")
| `[a-z]+@[a-z]+` _ |> std/io:print.ln("matched-email")
| `[0-9]+` _ |> std/io:print.ln("matched-number")
| no-match |> std/io:print.ln("no-match") That isn’t a function call with a string argument. match is control flow — the same shape as any other branching construct in the language — and each
backtick pattern is a branch. The compiler reads those patterns at compile
time and turns each one into its own specialized native matcher, baked straight
into the dispatch. There is no Regex object at runtime. There is no pattern
stored anywhere. There is no general engine interpreting anything.
This is the whole post. Everything below is a consequence of that one move.
(The snippet above is tests/regression/600_STDLIB/640_REGEX/640_001_match_basic,
verbatim, green. Its expected output is matched-email.)
What the compiler actually does
When the std/regex transform sees a match, it walks the branches and hands
each backtick pattern to an in-house regex engine. That engine is small and it
is honest about what it is:
pattern string → regex AST → NFA (Thompson) → DFA (subset construction)
→ emitted as a specialized native matcher Parser, NFA builder, subset construction, code emitter — about 1,200 lines of
Zig, 25 unit tests, zig build test-regex green. It’s the same family as RE2,
the Rust regex crate, and Go’s regexp: a finite automaton, not a
backtracker. The difference is when it runs. Theirs build the automaton at
program startup. Koru’s builds it during compilation and emits the result as a
function — a flat table walk, one branch per input byte, that the Zig optimizer
then inlines into your dispatch like any other code.
So at runtime, std/regex:match("foo@bar") is not “look up a regex and run it.” It’s a
handful of specialized functions that exist because of the patterns you wrote,
called in first-match-wins order, falling through to no-match. The patterns
have evaporated into native code.
Consequence A: it’s 2× faster than Rust
We benchmarked it. The workload: classify 3,000,000 inputs (rotating through an
email, a number, and a non-matching string) against full-match patterns [a-z]+@[a-z]+ then [0-9]+, first match wins, count the buckets. Every
language runs the same dispatch shape; the runtime engines hoist their
compiled patterns out of the loop, as is idiomatic, so we’re measuring
steady-state per-dispatch cost, not startup.
| Language | Engine | Median wall | vs Koru |
|---|---|---|---|
| Koru | compile-time DFA, native | 45 ms | 1.0× |
| Rust | regex crate (linear, SIMD prefilters) | 93 ms | 2.0× |
| JavaScript | V8 Irregexp | 151 ms | 3.4× |
| Go | regexp (RE2) | 324 ms | 7.2× |
| Python | re | 917 ms | 20× |
That’s about 15 nanoseconds per dispatch for Koru — a table walk plus a branch — against 31 ns for the Rust regex crate, which is genuinely fast and genuinely linear.
Be clear about why Koru wins here, because it’s not magic and it’s not “Koru beats Rust.” Koru wins because it does strictly less. The Rust crate is a general engine: it does search, captures, Unicode, patterns it only learns at runtime. The matcher this workload exercises is full-match, ASCII, capture-free, and the patterns are known at compile time — so it can monomorphize each one into straight-line code with no engine indirection and no pattern it has to stay flexible about. Compile-time specialization beats a runtime engine 2:1 here precisely because Koru threw away the generality this job wasn’t using. That’s the trade, stated plainly.
Consequence B: ReDoS cannot happen
This is the one that matters more than the speed.
Take the textbook catastrophic pattern, (a+)+b, and feed it a string of n as with no b. A backtracking engine tries every way to split the a-run
between the two nested quantifiers before it can conclude “no match” — an
exponential number of ways. The wall time goes vertical:
n (length of the a-run) | JavaScript (V8) | Python (re) | Koru | Rust |
|---|---|---|---|---|
| 10 | 81 ms | 42 ms | — | — |
| 20 | 120 ms | 73 ms | — | — |
| 25 | 1.36 s | 1.03 s | — | — |
| 28 | 9.9 s | 7.9 s | — | — |
| 30 | 39.7 s | (didn’t run) | 26 ms | 25 ms |
| 1,000,000 | — | — | 24 ms | 25 ms |
| 10,000,000 | — | — | 30 ms | 30 ms |
Read the bottom-right corner against the JavaScript column. Koru full-matches ten million characters in less time than V8 needs to fail on thirty. Those 26–30 ms are almost entirely process startup — the match itself never registers. The Koru and Rust lines are flat from thirty characters to ten million.
Rust and Go are flat here too, and they earned it the hard way: their engines are deliberately engineered around backtracking blowup at runtime. Koru is flat for a different reason. It’s flat by construction — there is no backtracking code path to fall into, because the emitted matcher is a DFA table walk and a DFA has exactly one next state per byte. The blowup isn’t avoided; it’s unrepresentable.
And here’s the design consequence that makes this stick: the engine rejects backreferences. Not “doesn’t support yet” — rejects, on purpose. A backreference is the feature that forces backtracking and reopens the catastrophic case. Excluding it is the same decision as the DFA: the regular subset is exactly the subset that’s safe, and Koru stays inside it. You cannot write the ReDoS pattern in the first place.
It nests, because it’s just control flow
If match is real control flow, it should compose with the rest of the
language’s control flow. It does:
import std/regex
import std/io
for(0..3)
! each _ |> std/regex:match("foo@bar")
| `[a-z]+@[a-z]+` _ |> std/io:print.ln("hit")
| no-match |> std/io:print.ln("miss") A match sitting inside a for, each iteration dispatching through the
compiled DFAs. Output: hit three times. This is 640_002_match_nested_in_for, and it’s also the exact shape of the throughput
benchmark’s hot loop. Getting here took a real design step — match declares
its user surface as an ordinary event contract: any raw-name branch with a
bindable payload, plus an optional void no-match. The checker validates a
nested match exactly the way it validates a top-level one, exactly the way
it validates any event: against the declared branches. There is no “trust the
transform” exemption anywhere in the pipeline. The point: match isn’t a
special form bolted on; it earns its place by composing like everything else.
The match is a value, not just a destination
A branch that fires because something matched should be able to hold the thing that matched. It does — the pattern branch’s binding receives the matched text:
import std/regex
import std/io
std/regex:match("foo@bar")
| `[a-z]+@[a-z]+` m |> std/io:print.ln("matched: {{ m:s }}")
| `[0-9]+` _ |> std/io:print.ln("number")
| no-match |> std/io:print.ln("no-match") Prints matched: foo@bar (this is 640_003_match_binding, green in the
suite). And notice something about every Koru snippet in this post: they’re .k files — pure Koru, top to bottom. Koru programs that mix in
host-language code mark the boundary with a ~ host-switch glyph; none of
these programs have one, because nothing here needs the host. Regex
dispatch, iteration, binding, printing — not one escape. Bind the text with a name, or discard it with _ — the linear rule
demands one or the other, because a payload you silently ignore is a bug you
haven’t met yet. no-match carries nothing, so it binds nothing: the same
rule’s void half.
Under full-match semantics the matched text is the whole input, so the binding is a fat-pointer copy of a slice that already exists — zero allocation, zero bytes copied. The benchmark agrees: binding the match in every branch and accumulating its length on all three million hits moves the median from 45.6 ms to 47.0 ms, and that delta is the added accumulation work, not the binding. Still under half of Rust’s 92.7 ms for the discard-only version of the workload.
The pattern is the payload schema
Whole-input text is the weak version. The strong version: a pattern with named groups delivers structured data into the branch, and the destructure at the binding position is where you say what shape you want back.
import std/io
import std/regex
std/regex:match("2x3x4")
| `(?<l>[0-9]+)x(?<w>[0-9]+)x(?<h>[0-9]+)` { l: i64, w: i64, h: i64 } |> std/io:print.ln("{{ l:d }} {{ w:d }} {{ h:d }}")
| no-match |> std/io:print.ln("no") Prints 2 3 4. Look at what the branch head is doing. The pattern names three
groups — l, w, h — and the { l: i64, w: i64, h: i64 } is the same
shape-destructure Koru uses on any event payload. The pattern is the payload
schema. And the i64 on each field is not an assertion that the text happens
to be a number — it’s a conversion. The matched text dissolves into an
actual integer at the splice, so inside the body l, w, and h are numbers
you can do arithmetic on, not strings you have to parse in a second step. (This
is 640_005_match_groups_destructure, verbatim, green.)
Don’t want to name every field? Bind the whole payload and read the groups off it as fields — untyped, so they stay as matched slices:
std/regex:match("21,9")
| `(?<row>[0-9]+),(?<col>[0-9]+)` pos |> std/io:print.ln("row={{ pos.row:s }} col={{ pos.col:s }}")
| no-match |> std/io:print.ln("no") Prints row=21 col=9 (640_006_match_groups_whole_binding). Same pattern, two
ways to receive it — destructure the fields you want with the types you want, or
take the struct whole.
Because the transform owns the pattern and the destructure at the same site,
it checks them against each other and the match is strict, both ways: every
named group must be destructured and every destructured field must name a real
group. Capture a group you don’t consume and it’s a compile error, not a silent
leak — `(?<l>...)x(?<w>...)` { l: i64 } is rejected because w was
captured and never destructured (640_007_reject_groups_field_mismatch). If you
want the structural grouping without the capture, spell it bare: (...) groups
for the engine, (?<name>...) captures for you.
Two restrictions ride along with captures, and they’re the same doctrine as the
backreference rejection — refuse the cases where a captured span would be a lie.
A group under a quantifier (`(?<x>a)+`) has no single span — which a did you mean? — so it’s rejected (640_008). A group under alternation (`a|(?<x>b)`) is empty on the branch that wasn’t taken, and “empty when
skipped” is exactly the silent hole; rejected too (640_009). Loud at the match
site, both of them.
Captures are fast for the same reason
Pulling three integers out of a string three million times — the same LxWxH pattern, every language matching the named groups and parsing them to ints and
summing the volumes:
| Language | Engine | Median wall | vs Koru |
|---|---|---|---|
| Koru | compile-time tagged DFA, native | 111 ms | 1.0× |
| Rust | regex crate (named captures) | 301 ms | 2.7× |
| JavaScript | V8 Irregexp | 324 ms | 2.9× |
| Go | regexp (RE2) | 401 ms | 3.6× |
| Python | re | 1,680 ms | 15× |
Same shape as the discard benchmark, same winner — but notice the lead widened. Plain matching was 2.0× Rust; with captures it’s 2.7×. Captures are where Koru pulls further ahead, and that’s the opposite of what you’d expect, because captures are exactly where this design should have broken.
To know where each group matched and not just whether the input matched, the textbook move is to simulate the NFA thread by thread, carrying capture positions along — which is the per-state-per-byte work a DFA throws away to go fast. The first cut of this did exactly that, and on the benchmark above it was the slowest language here — behind V8, behind Go. A capture engine that falls back to NFA simulation isn’t the compile-time story; it’s a runtime engine wearing a hat.
Koru doesn’t have to simulate, and the reason is a restriction it already
imposes. Because a capture group can’t sit under a quantifier or an
alternation, every group is a clean once-entered bracket — so where each one
opens and closes is structurally determined, with no ambiguity left for the
engine to resolve at runtime. That is precisely the condition under which subset
construction can bake the capture bookkeeping straight onto the DFA’s transitions
(RE2 calls it the “one-pass” case). So (?<l>[0-9]+)x(?<w>[0-9]+)x(?<h>[0-9]+) compiles to the same flat table walk as [a-z]+@[a-z]+ did — one transition per
byte — that also writes down six byte offsets as it passes them. The restriction
that looked like a limitation is what buys the speed. The i64 conversions are
the only runtime arithmetic left on top.
(Benchmark, all five implementations, and raw CSV: benchmarks/workloads/regex_match_groups. Same machine and method as the
discard numbers above.)
What it can’t do yet — said plainly
This is cut 2. The honest gaps:
- Full-match only. The whole input must match the pattern. No search, no
“find the first occurrence.”
^and$parse but are no-ops, because the match is already whole-string-anchored. - Captures are full-span only. Named groups extract and convert (above), but a group under a quantifier or under an alternation is rejected — those are the cases where a captured span isn’t a single well-defined slice. No group-counted repetition extraction yet.
- No backreferences — and as above, that one is permanent by design, not a TODO.
- ASCII. No Unicode classes yet.
None of that is hidden behind a flag or a “coming soon” page. The grammar is a
clean regular subset — alternation, concatenation, the three quantifiers,
character classes, ., grouping, and named-group capture — and it is complete
and correct within that subset. Search semantics are a later cut, and they’ll
be built the same way: as automaton states, never as a backtracking escape
hatch.
The shape of the idea
Treating a backend as a compile-time target instead of a runtime dependency is the recurring Koru move — we’ve made the same argument about JavaScript and about the compiler being just event flows. Regex is the same idea aimed at a thing nobody thinks of as a backend. Your patterns are known when you compile. So compile them. What you get back is a branch that runs at native speed and a whole category of vulnerability that you are structurally unable to write.
Numbers measured on an Apple Silicon M-series Mac, Zig 0.15.2, all AOT languages in ReleaseFast. Workloads, raw CSVs, and the green regression tests are in the repo.