Redis creator ran DeepSeek V4 on a MacBook in two weeks

Salvatore Sanfilippo (antirez), creator of Redis, built ds4, a program that runs DeepSeek V4 locally on Apple Silicon: 16k lines of code written in two weeks with GPT-5.5, generating 26 tokens/sec on an M3 Max.

Author: Michael Kokin

I spotted a retweet from Y Combinator CEO Garry Tan: he's running a surprisingly powerful model locally on a MacBook. I tried to figure out how that's even possible and what kind of output quality you can actually expect.

Salvatore Sanfilippo, aka antirez, is an Italian developer who wrote Redis in 2009. It is one of the most widely used databases in the world, powering X, GitHub, and thousands of other services. Salvatore ran the project for 11 years, then walked away.

Last week he published a program called ds4. It runs the Chinese AI model DeepSeek V4 directly on a laptop. No internet, no subscriptions, no cloud. antirez is upfront in the readme: "Without AI this wouldn't exist. If you don't like AI-written code — this isn't for you."

ds4 is 16,000 lines of complex code, written together with GPT-5.5. But antirez didn't just "ask ChatGPT to write a program." He owned the architecture and debugging himself, and the AI handled the routine work under his direction. Without his 30 years of programming experience, probably none of this would have worked.

How did the model fit on a laptop in the first place?
The weights are compressed 4x, but not uniformly: only the parts where compression is safe are compressed. Critical sections stay at full precision, so quality barely suffers.
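
To make the idea concrete, here is a minimal Python sketch of mixed-precision compression. It is an illustration, not ds4's actual scheme: the 16-bit starting precision, the symmetric 4-bit rounding, the function names, and the 95/5 split between compressible and critical weights are all assumptions chosen for the example.

```python
# Illustrative only: not ds4's real quantization format.

def quantize_4bit(values):
    """Round floats to integer codes in [-8, 7] that share one scale factor."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 7.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from 4-bit codes."""
    return [c * scale for c in codes]


def average_bits(layers):
    """layers: (name, parameter_count, bits_per_weight) tuples."""
    total_params = sum(count for _, count, _ in layers)
    total_bits = sum(count * bits for _, count, bits in layers)
    return total_bits / total_params


# Hypothetical split: 95% of weights at 4 bits, 5% kept at 16 bits.
layers = [("compressible", 95, 4), ("critical", 5, 16)]
bits = average_bits(layers)
print(f"average bits per weight: {bits:.2f}")
print(f"compression vs 16-bit: {16 / bits:.2f}x")

# Round-trip a few made-up weights to see the precision loss.
weights = [0.7, -0.35, 0.1, 0.0]
codes, scale = quantize_4bit(weights)
print(codes, [round(w, 3) for w in dequantize(codes, scale)])
```

With these made-up numbers the average comes out to 4.6 bits per weight, about 3.5x smaller than 16-bit. The larger the compressible share, the closer the ratio gets to 4x.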

Session memory lives on SSD rather than RAM. Apple's SSD is fast enough to load a million tokens of context at once. For reference: a million tokens is roughly a 700-page book that the model holds in its head in full.
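
One way to picture session state living on disk is a memory-mapped file: the data sits on the SSD and only the parts being read are pulled into memory. The sketch below assumes a made-up record layout (four 32-bit floats per token) and hypothetical function names; ds4's real on-disk format is not described here.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical layout: four little-endian 32-bit floats of state per token.
RECORD = struct.Struct("<4f")


def save_session(path, per_token_state):
    """Write one fixed-size record per token to disk."""
    with open(path, "wb") as f:
        for state in per_token_state:
            f.write(RECORD.pack(*state))


def load_token_state(path, token_index):
    """Read a single token's record through a memory map, not a full load."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            return RECORD.unpack_from(view, token_index * RECORD.size)


# Demo with made-up values in a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "session.bin")
    save_session(path, [(0.5, 1.0, 2.0, -1.5), (4.0, 0.25, 8.0, 0.0)])
    print(load_token_state(path, 1))
```

Because every record has the same size, any token's state can be found by offset alone, which is what makes reading a long session from disk practical.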

Narrow specialization. No general-purpose abstractions. The program does one thing: run one model on one type of hardware. But it does it relatively fast.

On a MacBook Pro M3 Max, the model outputs 26 tokens per second; on a Mac Studio M3 Ultra, 36. Not fast, but workable for coding. antirez himself says quality is solid and tool calls are reliable. No independent benchmarks yet.
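
To get a feel for those speeds, here is a quick back-of-the-envelope calculation. It reads the 26 and 36 figures as tokens per second, as in the summary at the top. The 500-token reply length is a hypothetical example, and the estimate ignores the time spent processing the prompt.

```python
def seconds_to_generate(n_tokens, tokens_per_second):
    """Time to stream n_tokens at a steady generation rate."""
    return n_tokens / tokens_per_second


# 500 tokens is a hypothetical reply length, not a figure from ds4.
for machine, rate in [("M3 Max", 26), ("M3 Ultra", 36)]:
    print(f"{machine}: {seconds_to_generate(500, rate):.1f} s for 500 tokens")
```

That works out to about 19.2 seconds on the M3 Max and 13.9 seconds on the M3 Ultra for a reply of that length.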

How do you try it?
You need a MacBook Pro or Mac Studio with 128 GB of RAM. Download the source from GitHub, compile it, and pull the model from Hugging Face. It won't run on regular laptops: Apple Silicon only. Entry price is around $5,000 for the Mac.
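
Before downloading anything, you can check whether a machine matches the requirements above. This is a small sketch, not part of ds4: the function names are hypothetical, the 128 GB threshold is taken from this article, and the script assumes Python is running natively on the Mac (under Rosetta the reported architecture would differ).

```python
import os
import platform

REQUIRED_RAM_BYTES = 128 * 1024**3  # the 128 GB named in the article


def meets_requirements(system, machine, ram_bytes):
    """True for an Apple Silicon Mac with at least 128 GB of RAM."""
    return (
        system == "Darwin"
        and machine == "arm64"
        and ram_bytes is not None
        and ram_bytes >= REQUIRED_RAM_BYTES
    )


def detect_ram_bytes():
    """Best-effort physical RAM lookup; returns None if unavailable."""
    try:
        return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None


print(meets_requirements(platform.system(), platform.machine(), detect_ram_bytes()))
```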

Two years ago, a GPT-4-level model cost billions of dollars and only ran in data centers. Today, a model at the same level fits on a laptop and can be set up by one person in two weeks. In two more years, it'll fit on a phone.

I keep seeing comments asking why local AI even matters. Sure, it's essentially free — but the bigger thing is that nobody sees your conversations, your code, your medical history. Every ChatGPT session gets sent to OpenAI and stored there. Every Claude request goes to Anthropic. ds4 shows that models capable of the same things will soon run on your own device.

For the industry, the takeaway is simpler. An experienced developer with an AI assistant can do in two weeks what used to take a team of ten a full quarter.

The big winners: experienced engineers. The big losers: juniors who don't have those 30 years yet.