How Big Should a Binary Be?

Here’s a Koru program:

~import "$std/io"

~event add { a: i32, b: i32 }
| sum i32

~add = sum a + b

~add(a: 17, b: 25)
| sum s |> std.io:print.ln("17 + 25 = {{ s:d }}")

It defines an event with typed inputs and an identity branch. ~add = sum a + b — that’s the whole implementation. Call the event, pattern-match the result, print with string interpolation. Standard Koru.

Compiled to a static Linux ELF:

-rwxr-xr-x 1 user staff 4848 Feb 13 00:11 output_emitted
output_emitted: ELF 64-bit LSB executable, x86-64, statically linked, stripped

4,848 bytes.

That’s not a typo. An event with an identity branch, a continuation, and formatted I/O compiles to 4.8KB.

For Context

Here’s how that compares with a “Hello, World” in other languages:

| Language | Binary Size | Notes |
| --- | --- | --- |
| Koru | 4,848 B | Event + identity branch + print |
| C (musl static) | ~17 KB | printf("Hello\n") |
| Go | ~1.8 MB | Static by default |
| Rust | ~300 KB | Default release, static |
| Zig (std.debug.print) | ~9.3 KB | Pulls in std.fmt |
| Zig (raw posix.write) | ~4.8 KB | Manual syscall |

Koru matches hand-written Zig using raw posix.write. But you didn’t write raw syscalls. You wrote events and continuations.

How

Three things make this possible, and they compound.

1. Events Disappear at Compile Time

Koru’s event system has no runtime representation. No vtables. No dispatch tables. No event objects. The compiler type-checks everything, verifies branch exhaustiveness, tracks purity — then emits direct function calls.

~add = sum a + b becomes:

pub const add_event = struct {
    pub fn handler(input: Input) Output {
        return .{ .sum = input.a + input.b };
    }
};

And ~add(a: 17, b: 25) | sum s |> becomes:

const result_0 = add_event.handler(.{ .a = 17, .b = 25 });
const s = result_0.sum;

Zig sees that all inputs are comptime-known. It inlines the handler, constant-folds 17 + 25 to 42, and the entire event system evaporates. What remains is a constant and a print.
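The folding can be checked with a self-contained version of the snippet. This is only a sketch: the post does not show how Input and Output are generated, so the plain structs below are an assumption, and the comptime block is added purely to demonstrate that the call needs no runtime state.

```zig
const std = @import("std");

const add_event = struct {
    // Assumed shapes: the generated Input and Output types are not shown in the post.
    pub const Input = struct { a: i32, b: i32 };
    pub const Output = struct { sum: i32 };

    pub fn handler(input: Input) Output {
        return .{ .sum = input.a + input.b };
    }
};

// Evaluated by the compiler: the build fails if the call does not fold to 42.
comptime {
    std.debug.assert(add_event.handler(.{ .a = 17, .b = 25 }).sum == 42);
}

pub fn main() void {
    // Same call shape as the emitted code above.
    const result_0 = add_event.handler(.{ .a = 17, .b = 25 });
    const s = result_0.sum;
    std.debug.print("17 + 25 = {d}\n", .{s});
}
```

Because the handler is an ordinary function over plain values, nothing about the event survives once the optimizer has inlined it.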

2. Format Specifiers Control Code Generation

This is where it gets interesting. Koru’s print.ln transform looks at your format specifiers and chooses the code generation path:

  • {{ x:d }} — integer. Emit raw posix.write with inline int-to-string conversion.
  • {{ s:s }} — string. Emit raw posix.write directly.
  • {{ v:any }} — anything. Fall back to std.debug.print (pulls in std.fmt).

The specifier is mandatory. {{ x }} without a specifier is a compile-time error:

error: std.io:print.ln: '{{ x }}' requires a format specifier (:d, :s, or :any)

This means the compiler knows whether your program needs std.fmt. If you stick to :d and :s, the formatter is never referenced, never linked, never in your binary. You don’t pay for what you don’t use — and the compiler can prove you don’t use it.
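The decision described above can be sketched as a small scan over the format string. This is not Koru's actual transform, which is not shown in this post; choosePath, Path, and SpecError are hypothetical names, and the parsing rules (a `{{ name:spec }}` placeholder with exactly the three specifiers listed) are assumptions taken from the examples.

```zig
const std = @import("std");

// Hypothetical: the two code generation paths described in the text.
const Path = enum { raw_write, std_fmt };
const SpecError = error{ MissingSpecifier, UnknownSpecifier, UnclosedPlaceholder };

/// Decide which path a format string needs by inspecting every placeholder.
fn choosePath(fmt: []const u8) SpecError!Path {
    var path: Path = .raw_write;
    var pos: usize = 0;
    while (std.mem.indexOfPos(u8, fmt, pos, "{{")) |open| {
        const close = std.mem.indexOfPos(u8, fmt, open + 2, "}}") orelse
            return error.UnclosedPlaceholder;
        const body = std.mem.trim(u8, fmt[open + 2 .. close], " ");
        // A placeholder without ':' has no specifier and is rejected.
        const colon = std.mem.indexOfScalar(u8, body, ':') orelse
            return error.MissingSpecifier;
        const spec = body[colon + 1 ..];
        if (std.mem.eql(u8, spec, "any")) {
            // A single :any is enough to require the general formatter.
            path = .std_fmt;
        } else if (!std.mem.eql(u8, spec, "d") and !std.mem.eql(u8, spec, "s")) {
            return error.UnknownSpecifier;
        }
        pos = close + 2;
    }
    return path;
}
```

The useful property is that the answer depends only on the format string, so it is known before any code is emitted.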

3. Dead Stripping Removes Everything Else

Koru’s dead strip pass walks the AST before emission. Any event or flow that isn’t reachable from your program’s entry points is removed. The standard library defines dozens of events — io.println, io.readln, io.eprintln, control.if, control.for, testing.assert, and more. If you don’t use them, they don’t exist in the output.

For the program above, the dead stripper removes 68 items. What’s left is your add event, the print.ln inline code, and the main function that calls them.
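A reachability pass of this kind is usually a mark phase over a reference graph. The sketch below illustrates the idea only: Item, items, and markReachable are hypothetical, the reference edges are invented for the example, and Koru's real pass works on its AST rather than on an index table.

```zig
// Hypothetical item table: each item lists the indices of the items it references.
const Item = struct { name: []const u8, refs: []const usize };

const items = [_]Item{
    .{ .name = "main", .refs = &.{ 1, 2 } },
    .{ .name = "add", .refs = &.{} },
    .{ .name = "io.print.ln", .refs = &.{} },
    .{ .name = "io.readln", .refs = &.{4} },
    .{ .name = "io.eprintln", .refs = &.{} },
    .{ .name = "testing.assert", .refs = &.{} },
};

/// Mark everything reachable from `entry`; unmarked items can be dropped.
fn markReachable(entry: usize) [items.len]bool {
    var live = [_]bool{false} ** items.len;
    // Each item is pushed at most once, so items.len slots are enough.
    var stack: [items.len]usize = undefined;
    var top: usize = 0;

    live[entry] = true;
    stack[top] = entry;
    top += 1;

    while (top > 0) {
        top -= 1;
        const cur = stack[top];
        for (items[cur].refs) |r| {
            if (!live[r]) {
                live[r] = true;
                stack[top] = r;
                top += 1;
            }
        }
    }
    return live;
}
```

Starting from main, only add and io.print.ln are marked. The other three entries are never visited, which is the condition under which they would be removed.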

The Generated Code

Here’s what the compiler actually emits for the print:

{ const __kio = struct {
    fn __kz_w(__kz_b: []const u8) void {
        _ = @import("std").posix.write(2, __kz_b) catch {};
    }
    fn __kz_wd(__kz_v: i64) void {
        // int-to-string conversion, ~20 instructions
    }
};
__kio.__kz_w("17 + 25 = ");
__kio.__kz_wd(@as(i64, @intCast(s)));
__kio.__kz_w("\n");
}

Three syscalls. String literal, integer conversion, newline. No allocator, no formatter, no buffering. The __kio struct is a zero-cost abstraction — Zig inlines everything and what’s left is write(2, buf, len).

The function and parameter names are deliberately mangled (__kz_w, __kz_wd, __kz_b, __kz_v) to avoid shadowing any user variable in the surrounding scope.
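The body of __kz_wd is elided above. One plausible shape for an inline int-to-string conversion is sketched here; formatDecimal is a hypothetical name and this is not the code Koru emits. It fills a fixed 20-byte buffer from the right, which is enough for the longest i64 (19 digits plus a sign), so no allocator is involved.

```zig
/// Write the decimal form of `v` into the tail of `buf` and return that slice.
fn formatDecimal(v: i64, buf: *[20]u8) []const u8 {
    // Take the magnitude as u64 so the most negative i64 does not overflow.
    var n: u64 = if (v < 0)
        @as(u64, @intCast(-(v + 1))) + 1
    else
        @as(u64, @intCast(v));

    var i: usize = buf.len;
    while (true) {
        i -= 1;
        buf[i] = '0' + @as(u8, @intCast(n % 10));
        n /= 10;
        if (n == 0) break;
    }
    if (v < 0) {
        i -= 1;
        buf[i] = '-';
    }
    return buf[i..];
}
```

The returned slice could then be passed to a single write call, which matches the one-syscall-per-piece pattern in the emitted code.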

Scaling

Does it stay small as programs grow?

| Program | Description | Binary |
| --- | --- | --- |
| Print “hello” | One print statement | 4,848 B |
| Event + add + print | Identity branch, call, print result | 4,848 B |
| 3 events chained | double → add_ten → square, 3-deep nesting | 4,848 B |
| Branching | 3 branches, runtime dispatch, 3 calls | 8,944 B |
| FizzBuzz | 4 branches, 5 calls, modular arithmetic | 8,944 B |

The first three are the same size. Zig constant-folds the entire chain — three events, three identity branches, three levels of continuation nesting all reduce to a constant. The binary doesn’t know they existed.

The branching programs are larger because they have actual runtime logic: if comparisons that can’t be resolved at compile time. But 8.9KB for a FizzBuzz with four algebraic branches is still about half the ~17 KB of the static C “Hello, World” above.
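To see why branches cost bytes, it helps to picture what a four-branch event might lower to. The post does not show the generated code for FizzBuzz, so the tagged union below is an assumption, and FizzResult, fizzbuzz, and label are hypothetical names. The comparisons and the switch survive into the binary whenever n is only known at runtime.

```zig
// Assumed lowering: one union variant per branch, with a payload where needed.
const FizzResult = union(enum) {
    fizzbuzz,
    fizz,
    buzz,
    number: u32,
};

fn fizzbuzz(n: u32) FizzResult {
    if (n % 15 == 0) return .fizzbuzz;
    if (n % 3 == 0) return .fizz;
    if (n % 5 == 0) return .buzz;
    return .{ .number = n };
}

/// The switch must cover every variant; omitting one is a compile error.
fn label(r: FizzResult) []const u8 {
    return switch (r) {
        .fizzbuzz => "FizzBuzz",
        .fizz => "Fizz",
        .buzz => "Buzz",
        .number => "number",
    };
}
```

The exhaustive switch is the Zig counterpart of the branch exhaustiveness check mentioned earlier: the check happens at compile time, while the dispatch itself remains as ordinary compare-and-jump code.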

What This Means

The 4.8KB isn’t the point. The point is what’s in those 4.8KB: just your logic and the syscalls to express it. Nothing else.

No framework overhead. No runtime. No garbage collector. No unused standard library functions. No formatter you didn’t ask for. The binary contains exactly what the program does, and nothing it doesn’t.

This is what zero-cost abstraction means when the abstraction is the language itself. Events, identity branches, flows, continuations — they’re all compile-time concepts. They give you type safety, exhaustiveness checking, purity analysis, and structured control flow. Then they vanish.

You get the ergonomics of a high-level language and a binary the same size as hand-written Zig making raw syscalls.