Negative-Cost Abstractions

Every modern systems language promises zero-cost abstractions. The idea is simple: you should be able to use high-level constructs without paying a runtime penalty. Rust made this its tagline. C++ demonstrated it with templates decades ago.

But zero-cost is the wrong goal. It frames abstraction as a tax you’re trying to minimize. The best you can hope for is breaking even.

What if the abstraction made your program faster, smaller, and safer than writing it by hand?

That’s a negative-cost abstraction. And it’s not theoretical. You already use them every day.


You’ve Seen This Before

SQL

SELECT * FROM users
WHERE age > 25
ORDER BY name

Nobody writes manual loops over B-trees. And if you did, the query planner would still beat you. It considers index selectivity, join ordering, cache line behavior, and statistics you don’t have access to.

SQL is a declarative abstraction that produces faster data retrieval than the imperative code you’d write without it. Using the abstraction is faster than not using it.

TypeScript

You add type annotations. More code at the source level. But the types erase completely at runtime. And all those if (typeof x === 'string') guards you’d scatter through JavaScript? Gone. Not because you’re being reckless — because the compiler already proved they’re unnecessary.

You write more to get less. Fewer bytes shipped, fewer runtime checks, fewer bugs. The abstraction subtracts from the output.
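A minimal sketch of that subtraction (function names are illustrative, not from any library): the untyped version needs a runtime guard, while the annotated version lets the compiler prove the guard away — and the annotation itself erases from the emitted JavaScript.

```typescript
// Without static types, every caller might pass anything,
// so a runtime guard has to survive into the shipped bundle:
function shoutUnsafe(x: unknown): string {
  if (typeof x !== "string") throw new TypeError("expected string");
  return x.toUpperCase();
}

// With a type annotation, the compiler proves the guard unnecessary.
// The `: string` erases at compile time — the emitted JavaScript
// contains no check at all:
function shout(x: string): string {
  return x.toUpperCase();
}

console.log(shout("hello")); // → HELLO
```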

HTTP/2

Front-end teams spent years hand-optimizing for HTTP/1.1: concatenating JavaScript bundles, building CSS sprites, domain-sharding across CDNs. Then HTTP/2 arrived with multiplexing, and the naive approach — just make normal requests — outperformed the hand-optimized version.

The “no optimization” path over the better abstraction beat the “heavily optimized” path over the worse one.

gzip

You add a compression step to your HTTP pipeline. Total transfer time goes down. The CPU cost is less than the bandwidth saved. Adding a layer makes things faster.
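The trade is easy to see with Node's built-in zlib module (a stand-in for the server-side compression step; the payload here is synthetic):

```typescript
import { gzipSync, gunzipSync } from "node:zlib";

// A repetitive payload, standing in for HTML or JSON on the wire.
const body = Buffer.from("hello world ".repeat(1000));

// The added layer: compress before sending.
const compressed = gzipSync(body);

// The compressed payload is a small fraction of the original;
// at typical CPU-vs-bandwidth ratios, the cycles spent here cost
// less wall-clock time than the bytes they save.
console.log(body.length, compressed.length);

// And the layer is lossless — it round-trips exactly.
console.log(gunzipSync(compressed).equals(body)); // → true
```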


The Spectrum

There are three kinds of abstractions:

Positive-cost — The abstraction adds overhead. You pay for convenience with cycles. Garbage collection. Runtime reflection. Dynamic dispatch. Most abstractions live here.

Zero-cost — The abstraction compiles away. You get the ergonomics without the tax. Rust’s ownership model. C++ templates at their best. The abstraction breaks even.

Negative-cost — The abstraction produces better output than you’d write by hand. The program is faster, smaller, or safer because you used it, not despite it.

The difference between zero-cost and negative-cost is this: zero-cost means the compiler generates code as good as what you’d write manually. Negative-cost means the compiler generates code better than what you’d write manually — because the abstraction gives it information or leverage that hand-written code doesn’t.


C++ Templates: The Original Negative-Cost Abstraction

This isn’t a new idea. The C++ community proved it thirty years ago.

C’s qsort takes a void* array and a function pointer for comparison. Every comparison is an indirect call through a pointer. Every swap is a memcpy over opaque bytes.

C++’s std::sort with a lambda: the template specializes for the exact type, inlines the comparator, eliminates the indirection. The abstraction is measurably faster than the “no abstraction” version.

Using std::sort isn’t zero-cost. It’s negative-cost. The high-level code beats the low-level code.

But C++ templates are a fixed mechanism. You can’t write new ones. They’re baked into the compiler. What if negative-cost abstractions were something you could define yourself?


Koru: Negative-Cost by Design

Koru is a systems language that compiles to Zig. Its core insight is that the standard library is a collection of compile-time transforms that rewrite your program’s AST before compilation — and then erase themselves.

Every feature points in the same direction: the abstraction’s value is measured by what it removes.

print.blk: Removing the Formatting Runtime

~std.io:print.blk {
    === {{ name:s }} ===
    Total: {{ total:d }}
    Max:   {{ max:d }}
    ==============
}

If you wrote this by hand in Zig, you’d reach for std.debug.print with a format string. That’s reasonable. It works. But it parses the format string at compile time, builds an argument tuple, and calls through the formatting machinery.

Koru’s print.blk does something different. At compile time, it sees that the only format specifiers are :s (string) and :d (integer). No {any}, no padding, no width specifiers. So it skips std.debug.print entirely and emits raw write(2, ...) syscalls with an inline integer-to-string converter.

The generated code:

{ const __kio = struct {
    fn __kz_w(b: []const u8) void {
        _ = @import("std").posix.write(2, b) catch 0;
    }
    fn __kz_wd(v: i64) void {
        // inline integer formatting, no allocator
    }
};
__kio.__kz_w("=== ");
__kio.__kz_w(name);
__kio.__kz_w(" ===\nTotal: ");
__kio.__kz_wd(@as(i64, @intCast(total)));
__kio.__kz_w("\nMax:   ");
__kio.__kz_wd(@as(i64, @intCast(max)));
__kio.__kz_w("\n==============\n"); }

No format string parsing. No argument tuple. No function call to std.debug.print. Just raw writes to stderr.

The abstraction is faster than the code you’d write without it. And at runtime, print.blk doesn’t exist. There’s no print.blk function in the binary. The abstraction produced the code and then disappeared.

Phantom Types: Removing Defensive Code

Every language has a version of this bug:

f = open("data.txt")
data = f.read()
f.close()
# ... 200 lines later ...
f.read()  # Runtime error: I/O operation on closed file

The standard fix is defensive coding. Check the state before every operation:

if not f.closed:
    data = f.read()

Or use a context manager, or RAII, or defer, or try/finally. Every approach adds runtime machinery to catch a mistake the programmer shouldn’t be able to make in the first place.

Koru’s phantom types take the opposite approach:

~app.fs:open(path: "test.txt")
| opened f |>
    app.fs:read(file: f.file)
    | data d |> process(d)
    | error e |> log(e)
    app.fs:close(file: f.file)
    | closed |> _

The type f.file carries a phantom state: Open. After close, the state becomes Closed. Use it after close? Compile error. Forget to close? The [allocated!] obligation fires a compile error.

And at runtime? The phantom type is erased. There’s no wrapper object. No reference count. No destructor. No deferred call. No if (!closed) check. Zero bytes in the binary.

The defensive code that doesn’t exist is the safety. Not because you’re being reckless — because the compiler proved those paths are impossible.

That’s negative-cost: the abstraction removes both the bug AND the code you’d write to guard against it.
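The same idea can be approximated in TypeScript with a phantom type parameter (all names below are illustrative, and TypeScript can't enforce Koru's "forget to close" obligation — only the use-after-close half):

```typescript
// The State parameter exists only for the type checker. The optional
// __state field is never assigned, so nothing about it survives to
// runtime — no wrapper, no flag, no check.
type FileHandle<State extends "Open" | "Closed"> = {
  readonly path: string;
  readonly __state?: State; // phantom: type-level only
};

function open(path: string): FileHandle<"Open"> {
  return { path };
}

function read(f: FileHandle<"Open">): string {
  return `contents of ${f.path}`; // stand-in for real I/O
}

function close(f: FileHandle<"Open">): FileHandle<"Closed"> {
  return { path: f.path };
}

const f = open("data.txt");
const data = read(f);
const closed = close(f);
// read(closed); // compile error: "Closed" is not assignable to "Open"
console.log(data);
```

The defensive `if (!f.closed)` check never gets written, because the program that needs it never compiles.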

kernel:pairwise: Removing Iteration Patterns

An N-body simulation needs to compute gravitational forces between every pair of bodies. The standard approach:

for (int i = 0; i < n; i++) {
    for (int j = i + 1; j < n; j++) {
        double dx = bodies[j].x - bodies[i].x;
        // ... force calculation ...
        bodies[i].vx += f * dx;
        bodies[j].vx -= f * dx;
    }
}

You manage the iteration pattern, the indexing, the symmetry optimization (j = i + 1), the data layout. The algorithm is buried in bookkeeping.

In Koru:

~std.kernel:shape(Body) { x: f64, y: f64, vx: f64, vy: f64, mass: f64 }

~std.kernel:init(Body) {
    { x: 0.0, y: 0.0, vx: 0.0, vy: 0.0, mass: 1.0 },
    { x: 1.0, y: 0.0, vx: 0.0, vy: 0.0, mass: 1.0 },
    { x: 0.0, y: 1.0, vx: 0.0, vy: 0.0, mass: 1.0 },
    { x: 1.0, y: 1.0, vx: 0.0, vy: 0.0, mass: 1.0 },
}
| kernel k |>
    std.kernel:pairwise {
        const dx = k.other.x - k.x;
        const dy = k.other.y - k.y;
        const dist_sq = dx*dx + dy*dy + 0.01;
        const f = k.mass * k.other.mass / dist_sq;
        k.vx += f * dx;
        k.other.vx -= f * dx
    }

The compiler sees four bodies at compile time. It emits for (0..4) |i| { for (i+1..4) |j| { ... } } with constant bounds. The loop body is inlined directly — no function call, no struct wrapper. k becomes ptr[i], k.other becomes ptr[j].

You’d never hand-write constant loop bounds for data you plan to scale later. The abstraction does it for you when it can, and falls back to runtime .len when it can’t.
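The bookkeeping the kernel absorbs can be sketched as an ordinary higher-order function — a TypeScript approximation without Koru's compile-time specialization, with illustrative names throughout:

```typescript
interface Body { x: number; y: number; vx: number; vy: number; mass: number; }

// The combinator owns the i/j indexing and the j = i + 1 symmetry
// optimization; the caller writes only the physics.
function pairwise<T>(items: T[], f: (a: T, b: T) => void): void {
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      f(items[i], items[j]);
    }
  }
}

const bodies: Body[] = [
  { x: 0, y: 0, vx: 0, vy: 0, mass: 1 },
  { x: 1, y: 0, vx: 0, vy: 0, mass: 1 },
];

pairwise(bodies, (a, b) => {
  const dx = b.x - a.x;
  const dy = b.y - a.y;
  const distSq = dx * dx + dy * dy + 0.01; // softening term
  const f = (a.mass * b.mass) / distSq;
  a.vx += f * dx;
  b.vx -= f * dx; // equal and opposite: computed once per pair
});
```

What Koru adds on top of this shape is the compile-time part: when the body count is known, the closure and the indexing dissolve into a constant-bound loop with direct pointer accesses.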

Exhaustive Branches: Removing Error Handling Overhead

In most languages, “proper error handling” adds runtime cost. Exceptions require stack unwinding and error-object allocation. Even Rust’s Result requires a match you might forget to write (the compiler’s #[must_use] warning is what catches you).

Koru events declare all outcomes upfront:

~event divide { a: i32, b: i32 }
| ok { result: i32 }
| err { msg: []const u8 }

This compiles to a tagged union with a switch. No exception mechanism. No stack unwinding. No error objects. The “error handling” is an ordinary branch the CPU can predict.

In languages with exceptions, adding error handling makes your code slower. In Koru, the error handling is the same switch statement you’d write anyway — but the compiler guarantees you handle every case.
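The shape of the guarantee translates directly to a discriminated union with an exhaustiveness check (a TypeScript sketch of the same divide event; names are illustrative):

```typescript
// All outcomes declared upfront, as in the Koru event.
type DivideResult =
  | { tag: "ok"; result: number }
  | { tag: "err"; msg: string };

function divide(a: number, b: number): DivideResult {
  return b === 0
    ? { tag: "err", msg: "division by zero" }
    : { tag: "ok", result: a / b };
}

function describe(r: DivideResult): string {
  switch (r.tag) {
    case "ok":  return `result: ${r.result}`;
    case "err": return `error: ${r.msg}`;
    default: {
      // Exhaustiveness check: if a new outcome is added to the
      // union, this assignment fails to compile instead of the
      // program failing at runtime.
      const unhandled: never = r;
      return unhandled;
    }
  }
}

console.log(describe(divide(10, 2))); // → result: 5
console.log(describe(divide(1, 0)));  // → error: division by zero
```

No unwinding machinery exists for this code, because no path was left undeclared.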


The Pattern

Every negative-cost abstraction in Koru follows the same structure:

  1. You describe intent — what you want, not how to do it
  2. The compiler transforms — using information the abstraction exposed
  3. The abstraction erases — nothing remains at runtime
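The three steps above can be sketched in miniature as a template specializer — a toy analogue of print.blk, not Koru's actual mechanism, with a simplified {{name}} syntax:

```typescript
// Step 1 (intent): the template names what to print, not how.
// Step 2 (transform): parse the {{name}} holes once, ahead of use.
// Step 3 (erase): the returned closure walks a precomputed plan;
// the template string is never re-parsed at call time.
function compileTemplate<T extends Record<string, unknown>>(
  template: string
): (values: T) => string {
  // Splitting on a capture group interleaves literal chunks with
  // hole names: odd indices are holes, even indices are literals.
  const parts = template.split(/\{\{(\w+)\}\}/);
  return (values: T) =>
    parts
      .map((p, i) => (i % 2 === 1 ? String(values[p as keyof T]) : p))
      .join("");
}

const render = compileTemplate<{ name: string; total: number }>(
  "=== {{name}} ===\nTotal: {{total}}"
);
console.log(render({ name: "report", total: 42 }));
```

The real print.blk goes further — it runs at compile time and emits raw writes with no closure left behind — but the structure is the same: describe, transform, erase.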

And critically: users can write new ones. print.blk isn’t a compiler builtin. It’s a standard library function — a [comptime|transform] event written in Koru itself that receives the AST, rewrites it, and hands back a simpler program.

~[comptime|transform]pub event print.blk {
    source: Source,
    invocation: *const Invocation,
    item: *const Item,
    program: *const Program,
    allocator: std.mem.Allocator
}
| transformed { program: *const Program }

That’s not a macro. It’s not a compiler plugin with a special API. It’s a Koru event that happens to operate on the program itself. The same language, the same mechanism, the same event-and-branch model.

The standard library is a collection of things that delete themselves.


Beyond Zero

Rust’s promise: “You don’t pay for what you don’t use.”

Koru’s promise: “Using it makes things better than not using it.”

Zero-cost is a floor. It says: we won’t slow you down. That’s admirable, and Rust delivered on it brilliantly.

But negative-cost is a direction. It says: the abstraction exists to subtract — subtract complexity, subtract defensive code, subtract runtime cost, subtract itself. The value is measured in what’s not in the final binary.

Every print.blk call removes formatting machinery. Every phantom type removes a defensive check. Every exhaustive branch removes an exception handler. Every kernel operation removes iteration bookkeeping.

Less code. Less runtime. Fewer bugs. All three, simultaneously, because you used the abstraction.

That’s the negative-cost hypothesis: the right abstraction doesn’t just avoid adding cost — it actively removes cost that would exist without it.

We think this is the next step past zero.