I wanted to write code that scales linearly to 128+ cores, but feels like Ruby. That didn't exist, so I made it.
## The Goal
I wanted the scaling of a 128-core shared-nothing engine, but the ergonomics of Ruby. I wanted to write high-level, readable logic and have a "Magic Button" to make it fast.
In my latest benchmarks, this approach is already 6x faster than Go and 1.4x faster than Rust on a 128-core KV-store workload. But the speed isn't the story—the scaling is.
## The Problem: The "Lock Tax"
I wanted to write shared-nothing code easily and correctly. Writing this kind of architecture in C or C++ is notoriously difficult. You are constantly fighting race conditions, manual memory management, and the silent performance killer: Cache-Line Bouncing.
In traditional concurrent languages like Go or Rust, you eventually hit a "Contention Ceiling." You add more cores, but the overhead of keeping caches in sync (the "Lock Tax") means you stop seeing speedups. At 64 or 128 cores, these languages often hit negative scaling—adding more hardware actually makes your app slower.
## Why not Actors?
Languages like Erlang or Elixir "solve" this with the Actor Pattern. They isolate state by forbidding shared memory entirely. It’s a beautiful model, and it works perfectly for telecom or niche IO cases where you have millions of slow processes sending small messages.
But for high-performance systems, the Actor model just swaps one bottleneck for another: The Copying Tax. If you can't share memory, you have to copy it. When you're trying to perform 10 million operations per second on a KV store, the cost of copying keys and values between actors becomes far more expensive than the locks you were trying to avoid.
Actors are "Shared-Nothing" conceptually, but they are "Copy-Everything" in practice. I wanted True Shared-Nothing: Zero locks, zero contention, and zero-copy performance.
## Why a library wasn't enough
I didn't want to build a new language. I knew the "Ecosystem Desert" is where most new languages go to die. I tried to achieve my goals by building a library for existing languages, but I hit a wall every time:
- Go: You can't escape the Garbage Collector, and the scheduler is too opaque to guarantee the thread-pinning required for true shared-nothing performance.
- Rust: The type system is brilliant but rigid. To change a concurrency model, you have to refactor every function signature in the call chain. I couldn't find a way to make optimization a "one-line directive" without fighting the borrow checker at every turn.
- Zig: I love Zig's performance, but even with its `comptime` magic, the code never felt like Ruby. It’s a "Manual Transmission" language; I wanted a "Smart Transmission" that chose the gears for me.
I wanted code to work like SQL. In SQL, you write a query that describes what you want, and then you add an INDEX to optimize how it's retrieved. You don't rewrite the whole query in C just to make it scale. I wanted that "Optimization via Directive" for general-purpose logic.
## The Breakthrough: The Zig Bridge
I finally realized that I didn't need to build a "Walled Garden." By transpiling CLEAR to Zig, I got native access to the entire C standard library and the LLVM toolchain for free.
It didn't matter if people would use my language on Day 1, because CLEAR has access to everything that already exists in C and Zig. I didn't have to build a new world; I just had to build a better way to talk to the one we already have.
## The Result: Shared-Nothing Capabilities
CLEAR achieves its performance through Capabilities. Instead of baking a concurrency model into a Type, you apply it to a Binding:
```
-- Write it like Ruby:
MUTABLE map: HashMap<String> = {};

-- Optimize it with one line:
MUTABLE map: HashMap<String>@sharded(128) = {};
```
By adding @sharded(128) (and @pinned, which pins each shard's worker to its own core), you tell the CLEAR compiler to generate a shared-nothing, lock-free architecture that scales linearly. The business logic remains identical. The compiler handles the infrastructure.
## Conclusion
CLEAR is currently at v0.1. It’s not ready for your production database, and an early release like this doesn’t yet prove that the "Iron Triangle" of Performance, Scaling, and Ergonomics can be broken. But these early results convince me that, with further development, it will be.
I’m not here to ask you to switch your stack. I’m here to show you a solution to the "Many-Core" problem, a problem that is only going to get harder as our hardware keeps growing.
Problem solved.
Source: docs/manifesto/pre-release.md