How Big Should a Binary Be?
Here’s a Koru program:
~import "$std/io"
~event add { a: i32, b: i32 }
| sum i32
~add = sum a + b
~add(a: 17, b: 25)
| sum s |> std.io:print.ln("17 + 25 = {{ s:d }}")
It defines an event with typed inputs and an identity branch. ~add = sum a + b is the whole implementation. Call the event, pattern-match the result, print with string interpolation. Standard Koru.
Compiled to a static Linux ELF:
-rwxr-xr-x 1 user staff 4848 Feb 13 00:11 output_emitted
output_emitted: ELF 64-bit LSB executable, x86-64, statically linked, stripped
4,848 bytes.
That’s not a typo. An event with an identity branch, a continuation, and formatted I/O compiles to 4.8KB.
For Context
Here’s what other languages produce for “Hello, World”:
| Language | Binary Size | Notes |
|---|---|---|
| Koru | 4,848 B | Event + identity branch + print |
| C (musl static) | ~17 KB | printf("Hello\n") |
| Go | ~1.8 MB | Static by default |
| Rust | ~300 KB | Default release, static |
| Zig (std.debug.print) | ~9.3 KB | Pulls in std.fmt |
| Zig (raw posix.write) | ~4.8 KB | Manual syscall |
Koru matches hand-written Zig using raw posix.write. But you didn’t write raw syscalls. You wrote events and continuations.
How
Three things make this possible, and they compound.
1. Events Disappear at Compile Time
Koru’s event system has no runtime representation. No vtables. No dispatch tables. No event objects. The compiler type-checks everything, verifies branch exhaustiveness, tracks purity — then emits direct function calls.
~add = sum a + b becomes:
pub const add_event = struct {
    pub fn handler(input: Input) Output {
        return .{ .sum = input.a + input.b };
    }
};
And ~add(a: 17, b: 25) | sum s |> becomes:
const result_0 = add_event.handler(.{ .a = 17, .b = 25 });
const s = result_0.sum;
Zig sees that all inputs are comptime-known. It inlines the handler, constant-folds 17 + 25 to 42, and the entire event system evaporates. What remains is a constant and a print.
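As a minimal sketch of why the handler can fold away, the snippet below fills in the Input and Output types that the emitted code refers to. Their definitions are assumptions (the source does not show them; Output is modeled here as a single-variant tagged union). The test forces the call to run at compile time, which only succeeds because nothing in the call depends on runtime state.

```zig
const std = @import("std");

// Assumed shape of the emitted event struct. Only `handler` appears in the
// article; `Input` and `Output` are hypothetical definitions for illustration.
const add_event = struct {
    pub const Input = struct { a: i32, b: i32 };
    pub const Output = union(enum) { sum: i32 };

    pub fn handler(input: Input) Output {
        return .{ .sum = input.a + input.b };
    }
};

test "handler call with literal inputs is evaluable at compile time" {
    // `comptime` makes the compiler evaluate the call itself; this would be
    // a compile error if the result depended on any runtime value.
    const result_0 = comptime add_event.handler(.{ .a = 17, .b = 25 });
    const s = result_0.sum;
    try std.testing.expectEqual(@as(i32, 42), s);
}
```

Running this with zig test should pass; the same call without the comptime keyword is left to the optimizer, which is the situation the article describes.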
2. Format Specifiers Control Code Generation
This is where it gets interesting. Koru’s print.ln transform looks at your format specifiers and chooses the code generation path:
- {{ x:d }}: integer. Emit raw posix.write with inline int-to-string conversion.
- {{ s:s }}: string. Emit raw posix.write directly.
- {{ v:any }}: anything. Fall back to std.debug.print (pulls in std.fmt).
The specifier is mandatory. {{ x }} without a specifier is a compile-time error:
error: std.io:print.ln: '{{ x }}' requires a format specifier (:d, :s, or :any)
This means the compiler knows whether your program needs std.fmt. If you stick to :d and :s, the formatter is never referenced, never linked, never in your binary. You don't pay for what you don't use, and the compiler can prove you don't use it.
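The decision described above can be sketched as a scan over the format string. This is an illustration only, not Koru's actual transform: the names choosePath, CodegenPath, and SpecError are hypothetical, and the placeholder grammar is assumed to be exactly {{ name:spec }}.

```zig
const std = @import("std");

// Hypothetical names; the real transform lives inside the Koru compiler.
const CodegenPath = enum { raw_write, std_fmt };
const SpecError = error{ MissingSpecifier, UnknownSpecifier, UnterminatedPlaceholder };

/// Returns .raw_write when every placeholder uses :d or :s, and .std_fmt
/// as soon as one placeholder uses :any.
fn choosePath(fmt: []const u8) SpecError!CodegenPath {
    var path: CodegenPath = .raw_write;
    var pos: usize = 0;
    while (std.mem.indexOfPos(u8, fmt, pos, "{{")) |open| {
        const close = std.mem.indexOfPos(u8, fmt, open + 2, "}}") orelse
            return error.UnterminatedPlaceholder;
        const inner = std.mem.trim(u8, fmt[open + 2 .. close], " ");
        // A placeholder without ':' has no specifier, which is rejected.
        const colon = std.mem.indexOfScalar(u8, inner, ':') orelse
            return error.MissingSpecifier;
        const spec = inner[colon + 1 ..];
        if (std.mem.eql(u8, spec, "any")) {
            path = .std_fmt;
        } else if (!std.mem.eql(u8, spec, "d") and !std.mem.eql(u8, spec, "s")) {
            return error.UnknownSpecifier;
        }
        pos = close + 2;
    }
    return path;
}

test "specifiers select the code generation path" {
    try std.testing.expectEqual(CodegenPath.raw_write, try choosePath("17 + 25 = {{ s:d }}"));
    try std.testing.expectEqual(CodegenPath.std_fmt, try choosePath("{{ v:any }}"));
    try std.testing.expectError(error.MissingSpecifier, choosePath("{{ x }}"));
}
```

Because the specifier is mandatory, the scan never has to guess a type: one pass over the string is enough to know whether std.fmt is needed at all.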
3. Dead Stripping Removes Everything Else
Koru’s dead strip pass walks the AST before emission. Any event or flow that isn’t reachable from your program’s entry points is removed. The standard library defines dozens of events — io.println, io.readln, io.eprintln, control.if, control.for, testing.assert, and more. If you don’t use them, they don’t exist in the output.
For the program above, the dead stripper removes 68 items. What’s left is your add event, the print.ln inline code, and the main function that calls them.
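A dead strip pass of this kind is essentially a reachability walk. The sketch below assumes a simplified model in which each item lists the indices of the items it references. The Item type, the markReachable function, and the sample item list are hypothetical; Koru's real pass works on its AST.

```zig
const std = @import("std");

// Hypothetical, simplified stand-in for an event or flow in the AST.
const Item = struct {
    name: []const u8,
    uses: []const usize, // indices of the items this one references
};

/// Marks every item reachable from `roots`; anything left false is dead.
fn markReachable(items: []const Item, roots: []const usize, live: []bool) void {
    for (roots) |root| visit(items, root, live);
}

fn visit(items: []const Item, index: usize, live: []bool) void {
    if (live[index]) return; // already visited, also guards against cycles
    live[index] = true;
    for (items[index].uses) |dep| visit(items, dep, live);
}

test "items not reachable from main are stripped" {
    const items = [_]Item{
        .{ .name = "main", .uses = &.{ 1, 2 } },
        .{ .name = "add", .uses = &.{} },
        .{ .name = "io.print.ln", .uses = &.{} },
        .{ .name = "io.readln", .uses = &.{} },
        .{ .name = "control.for", .uses = &.{3} },
    };
    var live = [_]bool{false} ** items.len;
    markReachable(&items, &.{0}, &live);
    // io.readln and control.for are never reached from main.
    try std.testing.expectEqualSlices(bool, &.{ true, true, true, false, false }, &live);
}
```

Note that control.for referencing io.readln does not keep either alive: only references that start from an entry point count.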
The Generated Code
Here’s what the compiler actually emits for the print:
{
    const __kio = struct {
        fn __kz_w(__kz_b: []const u8) void {
            _ = @import("std").posix.write(2, __kz_b) catch {};
        }
        fn __kz_wd(__kz_v: i64) void {
            // int-to-string conversion, ~20 instructions
        }
    };
    __kio.__kz_w("17 + 25 = ");
    __kio.__kz_wd(@as(i64, @intCast(s)));
    __kio.__kz_w("\n");
}
Three syscalls. String literal, integer conversion, newline. No allocator, no formatter, no buffering. The __kio struct is a zero-cost abstraction: Zig inlines everything and what's left is write(2, buf, len).
The function names are deliberately mangled (__kz_w, __kz_wd, __kz_b, __kz_v) to avoid shadowing any user variable in the surrounding scope.
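The body of __kz_wd is elided in the listing above. As a rough idea of what an allocation-free int-to-string conversion can look like, here is one possible version. It is not the code Koru emits; formatDecimal is a hypothetical name, and the sketch returns the digits instead of writing them so that it can be tested.

```zig
const std = @import("std");

/// Formats `value` as decimal into the tail of `buf` and returns that slice.
/// 20 bytes cover the longest i64: "-9223372036854775808".
fn formatDecimal(value: i64, buf: *[20]u8) []const u8 {
    // Take the magnitude as u64 so the most negative i64 does not overflow:
    // a wrapping negation of the bit pattern yields the absolute value.
    var magnitude: u64 = if (value < 0)
        0 -% @as(u64, @bitCast(value))
    else
        @as(u64, @intCast(value));

    // Fill the buffer from the end, least significant digit first.
    var i: usize = buf.len;
    while (true) {
        i -= 1;
        buf[i] = '0' + @as(u8, @intCast(magnitude % 10));
        magnitude /= 10;
        if (magnitude == 0) break;
    }
    if (value < 0) {
        i -= 1;
        buf[i] = '-';
    }
    return buf[i..];
}

test "formatDecimal handles zero, positives, and negatives" {
    var buf: [20]u8 = undefined;
    try std.testing.expectEqualStrings("42", formatDecimal(42, &buf));
    try std.testing.expectEqualStrings("0", formatDecimal(0, &buf));
    try std.testing.expectEqualStrings("-17", formatDecimal(-17, &buf));
}
```

A fixed stack buffer is what keeps the allocator and std.fmt out of the picture: the caller can hand the returned slice straight to a write call.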
Scaling
Does it stay small as programs grow?
| Program | Description | Binary |
|---|---|---|
| Print “hello” | One print statement | 4,848 B |
| Event + add + print | Identity branch, call, print result | 4,848 B |
| 3 events chained | double → add_ten → square, 3-deep nesting | 4,848 B |
| Branching | 3 branches, runtime dispatch, 3 calls | 8,944 B |
| FizzBuzz | 4 branches, 5 calls, modular arithmetic | 8,944 B |
The first three are the same size. Zig constant-folds the entire chain — three events, three identity branches, three levels of continuation nesting all reduce to a constant. The binary doesn’t know they existed.
The branching programs are larger because they have actual runtime logic — if comparisons that can’t be resolved at compile time. But 8.9KB for a FizzBuzz with four algebraic branches is still remarkable.
What This Means
The 4.8KB isn’t the point. The point is what’s in those 4.8KB: just your logic and the syscalls to express it. Nothing else.
No framework overhead. No runtime. No garbage collector. No unused standard library functions. No formatter you didn’t ask for. The binary contains exactly what the program does, and nothing it doesn’t.
This is what zero-cost abstraction means when the abstraction is the language itself. Events, identity branches, flows, continuations — they’re all compile-time concepts. They give you type safety, exhaustiveness checking, purity analysis, and structured control flow. Then they vanish.
You get the ergonomics of a high-level language and a binary the same size as hand-written Zig that calls posix.write directly.