AI-First Tooling: When the Compiler Becomes a Protocol
The LSP Paradox
The Language Server Protocol has been embraced, in part, as a way to help AI understand code. Give an AI agent “go to definition,” “find references,” and “hover for type,” and suddenly it can navigate a codebase like a human developer. Brilliant, right?
Except LSP was designed for humans typing in editors. It’s chatty (request/response for every action), synchronous (wait for the server), and interactive (designed for cursor movements and keystrokes). An AI doesn’t need “go to definition” - it can read the entire file. It doesn’t need “hover for type” - it can parse the AST directly. We built a protocol to help AI act like humans, when we should have built programs that AI can read natively.
Instrument for Machines, Review for Humans
Traditional development looks like this:
- Human writes code (types in editor)
- LSP validates (red squiggles, autocomplete)
- Debugger fixes (breakpoints, step through, inspect variables)
AI-first development looks like this:
- AI reads AST (structured data, full program context)
- AI transforms code (generates optimization passes)
- Property tests validate (prove equivalence automatically)
The difference isn’t just tooling - it’s what we instrument. Traditional tools are designed for humans to interrogate programs (what’s the type here? what’s the value there?). AI-first tools are designed for programs to emit their behavior as structured data that machines consume directly.
We don’t need LSP if the language is AI-readable. We don’t need debuggers if programs are self-instrumenting.
Self-Instrumenting Programs
Here’s what full-program profiling looks like in Koru:
```
~[profile]import "$std/profiler"
```

That’s it. One line. Your entire program is now profiled with nanosecond precision.
How does it work? The profiler uses a universal tap that observes every event transition:
```
// From koru_std/profiler.kz
~* -> *
| Profile p |> write_event(
    source: p.source,
    dest: p.dest,
    branch: p.branch,
    timestamp_ns: p.timestamp_ns,
    duration_ns: p.duration_ns
)
```

The `~* -> *` pattern means “observe ALL transitions.” Every time one event calls another, the profiler captures:
- What event fired (`p.source`)
- What event it called (`p.dest`)
- Which branch was taken (`p.branch`)
- When it happened (`p.timestamp_ns`)
- How long it took (`p.duration_ns`)
The output is Chrome’s trace event JSON. Open chrome://tracing, load the file, and you see your entire program execution as a timeline. Every transition. Every branch. Every nanosecond. Click a transition in the trace, jump to source code.
No instrumentation. No manual logging. No debugging sessions. Just import the profiler and the program instruments itself.
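The profiler’s output follows Chrome’s trace event format. As a rough illustration of what that mapping looks like (a Python sketch, not the actual stdlib code; the converter and its field names are hypothetical, but the record fields are the ones the tap captures):

```python
import json

def to_chrome_trace(events):
    """Convert profiler transition records into Chrome trace-viewer events.

    `events` are dicts with the fields the universal tap captures:
    source, dest, branch, timestamp_ns, duration_ns.
    """
    trace = []
    for e in events:
        trace.append({
            "name": f'{e["source"]} -> {e["dest"]} [{e["branch"]}]',
            "ph": "X",                       # "complete" event: start + duration
            "ts": e["timestamp_ns"] / 1000,  # trace viewer expects microseconds
            "dur": e["duration_ns"] / 1000,
            "pid": 1,
            "tid": 1,
        })
    return json.dumps({"traceEvents": trace})

print(to_chrome_trace([{
    "source": "validate", "dest": "fetch", "branch": "ok",
    "timestamp_ns": 1_000_000, "duration_ns": 127_000,
}]))
```

Load the resulting file into chrome://tracing and each transition renders as one bar on the timeline.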
The Compiler as Protocol
Koru’s compiler doesn’t just compile - it emits everything it knows and accepts commands telling it what to do.
AST at Every Stage
```
koruc --ast-json=all main.kz
```

This dumps the AST at every compiler pipeline stage:

```json
{"stage":"parse","ast":{"events":[...],"procs":[...]}}
{"stage":"shape_check","ast":{"events":[...],"types":[...]}}
{"stage":"phantom_check","ast":{"events":[...],"states":[...]}}
{"stage":"optimize","ast":{"events":[...],"inlined":[...]}}
```

AI doesn’t need to ask the compiler questions. The compiler just tells you everything as structured JSON at every step.
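Because each line of the dump is a standalone JSON object, the consumer side is trivial. A minimal sketch (in Python; the stage and field names are just the ones from the example above):

```python
import json

def load_stage_asts(jsonl_dump):
    """Index a per-stage AST dump (one JSON object per line) by stage name,
    so an agent can diff what each pipeline stage added or rewrote."""
    return {rec["stage"]: rec["ast"]
            for rec in (json.loads(line)
                        for line in jsonl_dump.splitlines() if line.strip())}

dump = '''{"stage":"parse","ast":{"events":["fetch"],"procs":[]}}
{"stage":"optimize","ast":{"events":["fetch"],"inlined":["fetch"]}}'''

stages = load_stage_asts(dump)
print(stages["optimize"]["inlined"])   # → ['fetch']
```

From here, comparing `stages["parse"]` against `stages["optimize"]` tells the agent exactly what the optimizer changed.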
Compiler Control Protocol (CCP)
```
koruc --ccp main.kz
```

A bidirectional JSONL protocol:
- Compiler → AI: Event stream (hotspots detected, analysis complete, errors found)
- AI → Compiler: Command stream (apply this pass, generate this variant, benchmark this)
Example event from the compiler:
```json
{
  "type": "hotspot_detected",
  "event_name": "fetch",
  "calls": 47234,
  "avg_ns": 127,
  "total_ms": 6.0,
  "percentage": 12.3,
  "source_location": {"file": "main.kz", "line": 42},
  "optimization_candidates": [
    {"strategy": "inline", "estimated_gain_ms": 5.8, "confidence": 0.95}
  ]
}
```

Example command to the compiler:

```json
{
  "command": "apply_pass",
  "pass_file": "ai_passes/inline_fetch.kz",
  "verify": true
}
```

The compiler applies the pass, runs property tests to verify equivalence, benchmarks the change, and emits the results. All structured. All machine-readable. All automatic.
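The AI side of this loop can be as simple as reading events line by line and answering with commands. A sketch of one hypothetical policy - inline any hotspot whose best candidate clears a confidence bar (the policy, threshold, and pass-file naming are mine, not part of CCP):

```python
import json

def respond(event_line, min_confidence=0.9):
    """Answer one compiler event with a CCP command, or None to stay quiet.

    Example policy (hypothetical): apply the highest-confidence optimization
    candidate for any detected hotspot, and always ask for verification.
    """
    event = json.loads(event_line)
    if event.get("type") != "hotspot_detected":
        return None
    best = max(event["optimization_candidates"], key=lambda c: c["confidence"])
    if best["confidence"] < min_confidence:
        return None
    return json.dumps({
        "command": "apply_pass",
        "pass_file": f'ai_passes/{best["strategy"]}_{event["event_name"]}.kz',
        "verify": True,   # have the compiler property-test the pass
    })

hotspot = json.dumps({
    "type": "hotspot_detected", "event_name": "fetch",
    "optimization_candidates": [
        {"strategy": "inline", "estimated_gain_ms": 5.8, "confidence": 0.95}
    ],
})
print(respond(hotspot))
```

Fed the hotspot event from the example above, this emits an `apply_pass` command for `ai_passes/inline_fetch.kz`; low-confidence hotspots and unrelated events get no reply.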
Execution GeoHashes: Perceptual Hashing for Code
Here’s the problem with traditional debuggers: they use line numbers.
```
> break main.kz:42
[You add a line above line 42]
> break main.kz:43   # Have to update manually!
```

Line numbers are positional - they break when code moves. What if we used structural identity instead?
Hashing Transitions from Their Perspective
In Koru, every transition gets a 12-character execution geohash based on its local context:
```
Transition hash: h3k9df8a2xv4

Chars 0-1:  h3           = Module context   (85% confidence)
Chars 0-3:  h3k9         = Flow context     (93% confidence)
Chars 0-5:  h3k9df       = Siblings         (99% confidence)
Chars 0-8:  h3k9df8a2    = Neighbors        (99.9% confidence)
Chars 0-11: h3k9df8a2xv4 = Exact transition (100%)
```

The hash is built hierarchically from the transition’s point of view:
- Module context (chars 0-1): What file/module is this in?
- Flow context (chars 2-3): What flow contains this transition?
- Siblings (chars 4-5): What other transitions are in this flow?
- Neighbors (chars 6-8): What transitions are immediately before/after?
- Exact (chars 9-11): The specific source→dest via branch
This is like perceptual hashing for code execution - similar contexts produce similar hashes, even after refactoring.
Visual Hash Comparison
After refactoring, the system shows you:
```
Original tracepoint: fetch:h3k9df8a2xv4
Current best match:  fetch:h3k9df8a2xu8

Similarity visualization: ▓▓▓▓▓▓▓▓▓▓▓░ (11/12)
Confidence: 99.9% - Auto-followed

Differences:
  ✓ Module context: Unchanged (h3)
  ✓ Flow context:   Unchanged (k9)
  ✓ Siblings:       Unchanged (df)
  ✓ Neighbors:      Unchanged (8a2)
  ~ Exact details:  Minor change (xv4 → xu8)

Likely cause: Variable rename or minor logic change
```

The hash prefix tells you what changed:
- Full 12-char match → Identical code
- 9+ char match → Same transition, tiny variation
- 6+ char match → Same local context, high confidence
- 4+ char match → Same flow region
- 2+ char match → Same module area
- 0-1 char match → Completely different
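Prefix matching like this is cheap to implement. A sketch of the comparison logic in Python, with the tier thresholds taken from the table above (the exact tier labels and function names are mine):

```python
def shared_prefix(a, b):
    """Number of leading characters two execution geohashes share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def classify(a, b):
    """Map shared-prefix length to the match tiers listed above."""
    n = shared_prefix(a, b)
    if n == 12: return "identical code"
    if n >= 9:  return "same transition, tiny variation"
    if n >= 6:  return "same local context"
    if n >= 4:  return "same flow region"
    if n >= 2:  return "same module area"
    return "completely different"

print(classify("h3k9df8a2xv4", "h3k9df8a2xu8"))   # → same transition, tiny variation
print(classify("h3k9df8a2xv4", "h3k9df1m5p2r"))   # → same local context
```

Because the hash is hierarchical, a plain string-prefix comparison is all the "perceptual" matching requires - no edit distance, no fuzzy search.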
Hash-Based Tracepoints
Instead of setting breakpoints by line number, you set tracepoints by hash:
```
koruc --debug
  --trace=fetch:h3k9df8a2xv4
  --when='response.status >= 500'
  --before=5
  --after=3
  --break-after=10
```

This says:
- Trace: The transition identified by `fetch:h3k9df8a2xv4`
- When: Only when `response.status >= 500`
- Before: Capture 5 transitions before the tracepoint
- After: Capture 3 transitions after the tracepoint
- Break after: Stop after collecting 10 traces
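The capture semantics behind those flags fit in a few lines. A sketch in Python (the class and the shape of the transition records are hypothetical, modeled on the flags above):

```python
from collections import deque

class Tracepoint:
    """Sketch of --when/--before/--after/--break-after capture semantics."""

    def __init__(self, when, before=5, after=3, break_after=10):
        self.when = when                        # predicate over the payload
        self.before_buf = deque(maxlen=before)  # rolling pre-hit context
        self.after_n = after
        self.break_after = break_after
        self.captures = []
        self._pending = None                    # capture still collecting after-context

    def observe(self, transition):
        if self._pending is not None:           # still filling context_after
            self._pending["context_after"].append(transition)
            if len(self._pending["context_after"]) == self.after_n:
                self._pending = None
            return
        if len(self.captures) < self.break_after and self.when(transition["payload"]):
            cap = {"context_before": list(self.before_buf),
                   "tracepoint_hit": transition,
                   "context_after": []}
            self.captures.append(cap)
            self._pending = cap
        else:
            self.before_buf.append(transition)

tp = Tracepoint(when=lambda p: p.get("status", 0) >= 500, before=2, after=1)
for t in [{"payload": {"user_id": 42}},
          {"payload": {"data": "ok"}},
          {"payload": {"status": 503, "msg": "Service unavailable"}},
          {"payload": {"logged": True}}]:
    tp.observe(t)
print(len(tp.captures), len(tp.captures[0]["context_before"]))   # → 1 2
```

The ring buffer is the key design point: pre-hit context costs O(before) memory no matter how long the program runs, so the tracepoint can sit armed in production indefinitely.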
The output is structured JSONL with full AST payloads:
```json
{
  "type": "trace_capture",
  "hit_number": 1,
  "canonical": "fetch",
  "hash": "h3k9df8a2xu8",
  "matched_original": "h3k9df8a2xv4",
  "similarity": 0.999,
  "context_before": [
    {"transition": "validate->fetch", "branch": "ok", "payload": {"user_id": 42}},
    {"transition": "fetch->process", "branch": "found", "payload": {"data": {...}}}
  ],
  "tracepoint_hit": {
    "transition": "process->fetch",
    "branch": "error",
    "payload": {"status": 503, "msg": "Service unavailable"},
    "locals": {"retry_count": 3, "user_id": 42}
  },
  "context_after": [
    {"transition": "fetch->log.error", "branch": "done", "payload": {...}},
    {"transition": "log.error->response.send", "branch": "done", "payload": {...}}
  ]
}
```

Dual Identity: Names + Hashes
The tracepoint has both canonical name and structural hash:
- Canonical name (`fetch`): Human-readable, precise when code is stable
- Structural hash (`h3k9df8a2xv4`): Machine-trackable, resilient to refactoring
- Together: Unambiguous (which `fetch`?) + refactor-resistant (follows code)
When you refactor:
```
# Before refactor
Tracepoint: fetch:h3k9df8a2xv4

# After refactor (minor changes)
Tracepoint auto-matched: fetch:h3k9df8a2xu8 (11/12 chars, 99.9% confidence)
→ Auto-updated tracepoint

# After major refactor
Tracepoint matched: fetch:h3k9df1m5p2r (6/12 chars, 99% confidence)
→ Tracepoint likely followed, review changes
```

The tracepoint follows your code through refactoring. No manual updates. No broken breakpoints.
The Philosophy: AI-First, Human-Approved
This isn’t about replacing humans. It’s about division of labor:
Machines (AI):
- Read AST (structured, complete program context)
- Emit transformations (optimization passes as code)
- Run property tests (prove equivalence automatically)
- Benchmark changes (measure actual speedup)
Humans:
- Review passes (auditable Koru code, not magic)
- Approve results (benchmarks prove value)
- Set intentions (mark tracepoints, configure analysis)
- Make decisions (ship this optimization? investigate this trace?)
Property tests are the contract. AI can’t lie - the tests prove equivalence. Benchmarks can’t lie - they measure real performance. The baseline is sacred - original code is never deleted, only augmented with variants.
Programs become self-documenting because events describe WHAT (semantic intent), not HOW (implementation details). The compiler understands meaning and can optimize accordingly.
What This Unlocks
Full observability from one import:
```
~[profile]import "$std/profiler"
```

Every transition traced. No manual instrumentation. No print statements.
AI-driven optimization while you sleep:
```
koruc --ccp main.kz | ai-optimizer --auto-approve-if-speedup > 1.5x
```

AI analyzes, optimizes, tests, benchmarks, commits. You review in the morning.
Tracepoints that survive refactoring:
```
koruc --debug --trace=fetch:h3k9df8a2xv4
# ... refactor for 2 hours ...
koruc --debug --trace=fetch:h3k9df8a2xv4   # Same command!
# → Auto-matched to h3k9df8a2xu8 (99.9% similar)
```

Set once, works forever (or until you completely rewrite it).
Compiler as orchestratable system:
```
AI       → {"command": "analyze", "passes": ["hotspot", "memory"]}
Compiler → {"type": "hotspot_detected", "event": "fetch", ...}
AI       → {"command": "apply_pass", "pass": "inline_fetch.kz"}
Compiler → {"type": "benchmark_result", "speedup": "4.1x"}
```

The compiler tells you what it knows. You tell it what to do. Everything structured. Everything verifiable.
The Punchline
We’re not building better LSP for AI. We’re not building smarter debuggers for AI to use.
We’re building programs that AI can read and transform.
The tooling isn’t FOR AI. The LANGUAGE is.
When you write Koru, you’re not writing instructions for the CPU. You’re writing semantic intent in a form that both machines and humans can understand - machines to optimize, humans to verify.
The compiler becomes a protocol. The program becomes self-instrumenting. The tracepoints follow your code.
This is what AI-first tooling actually looks like.
Interested in trying Koru? Check out the language guide or explore how profiling works in detail.