iQuest Coder 40B: open-source model outperforming Sonnet and GPT-5.1

A new Chinese open-source model outperforms top models from OpenAI and Anthropic on key benchmarks.

iQuest Coder (from the Chinese iQuest Lab) is a 40B-parameter model (fits on one or two 4090 GPUs once quantized) that beats giants like Sonnet-4.5 and GPT-5.1.
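To sanity-check the "one or two 4090s" claim, here is a rough back-of-the-envelope sketch. It counts weights only and assumes 24 GB of VRAM per RTX 4090; the KV cache, activations, and runtime overhead are ignored, so real requirements will be somewhat higher. The function names are mine, purely for illustration.

```python
# Back-of-the-envelope VRAM estimate for model weights only.
# Assumption: KV cache, activations, and framework overhead are ignored.

PARAMS = 40e9      # 40B parameters, as stated in the post
GPU_VRAM_GB = 24   # a single RTX 4090 has 24 GB of VRAM


def weight_memory_gb(params: float, bits_per_param: int) -> float:
    """Return the size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_param / 8 / 1e9


def gpus_needed(memory_gb: float, vram_gb: float = GPU_VRAM_GB) -> int:
    """Smallest number of GPUs whose combined VRAM holds the weights."""
    return int(-(-memory_gb // vram_gb))  # ceiling division


for bits in (16, 8, 4):
    size = weight_memory_gb(PARAMS, bits)
    print(f"{bits:>2}-bit: {size:.0f} GB of weights, at least {gpus_needed(size)} x 4090")
```

In other words, "one or two 4090s" only holds for 4-bit or 8-bit quantized weights; at full 16-bit precision the weights alone are around 80 GB.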

The numbers speak for themselves:

SWE-Bench Verified: 76.2%. The gold standard of coding benchmarks: it tests the ability to resolve real GitHub issues. The model successfully closes roughly 3 out of 4 real bugs in someone else's code. That's the level of a top senior engineer (and of the best proprietary models).

Terminal-Bench: 51.3% (an absolute record). The most important one. It tests the ability to work in a Linux terminal: install a package, grep through files, fix an nginx config, restart a service. Regular LLMs struggle here (~35%), but iQuest genuinely knows how to administer a system.

Bird-SQL: ~70%. A test for complex database queries. Finally, a neural net that understands real DB structure and can write a complex JOIN without inventing non-existent tables.
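If you want to catch the "invented table" failure mode yourself, one cheap trick is to compile the generated query against an empty copy of your schema before running it. This is only a sketch: the schema and the function name `compiles_against_schema` are made up for illustration, and it uses SQLite, so dialect-specific SQL from other databases may be rejected for unrelated reasons.

```python
import sqlite3

# Hypothetical schema, used only for this illustration.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
"""


def compiles_against_schema(sql: str, schema: str = SCHEMA) -> bool:
    """Return True if SQLite can compile `sql` against an empty copy of `schema`.

    EXPLAIN compiles the statement without running the query itself, so
    unknown tables, unknown columns, and syntax errors all surface here.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)
        conn.execute(f"EXPLAIN {sql}")
        return True
    except sqlite3.OperationalError:
        return False
    finally:
        conn.close()


good = ("SELECT c.name, SUM(o.total) FROM customers c "
        "JOIN orders o ON o.customer_id = c.id GROUP BY c.name")
bad = "SELECT * FROM customers c JOIN invoices i ON i.customer_id = c.id"

print(compiles_against_schema(good))
print(compiles_against_schema(bad))
```

The second query joins against `invoices`, which does not exist in the schema, so it fails to compile.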

The key feature is Code-Flow Training. The model was trained not just on code files, but on git commit history. It understands the *logic of changes*: what the code was, what it became, and why it changed.
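To make the idea concrete, here is a minimal sketch of what a single "was, became, why" record could look like. To be clear, this is my own illustration and not the format iQuest Lab actually used: the `CommitExample` class and its layout are hypothetical, and only `difflib` from the standard library is real.

```python
import difflib
from dataclasses import dataclass


@dataclass
class CommitExample:
    """One hypothetical training record: before, after, and the stated reason."""
    before: str
    after: str
    message: str

    def diff(self) -> str:
        """Unified diff from `before` to `after`."""
        lines = difflib.unified_diff(
            self.before.splitlines(keepends=True),
            self.after.splitlines(keepends=True),
            fromfile="before",
            tofile="after",
        )
        return "".join(lines)

    def to_record(self) -> str:
        """Pair the reason (commit message) with the change (diff)."""
        return f"### Why\n{self.message}\n\n### Change\n{self.diff()}"


example = CommitExample(
    before="def area(r):\n    return 3.14 * r * r\n",
    after="import math\n\ndef area(r):\n    return math.pi * r * r\n",
    message="Use math.pi instead of a truncated constant",
)
print(example.to_record())
```

The point is that the model sees the edit together with its justification, rather than only the final state of the file.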

⚠️ Disclaimer: The community has already started debating: some say the model may have "peeked" at answers through git history on certain tests. But even if you dock 10% for cleverness, for a local 40B model this is still incredible.

Grab the weights and read the paper here.

---

Why This Matters

Why am I so hooked on this release? It's not about the numbers — it's about what this model enables.

We're used to the coding assistant (copilot) paradigm: you write code, AI completes lines. That's junior-level stuff.

iQuest (and the new tools being discussed on Hacker News) opens the era of autonomous local agents.

A record score in terminal work means the model can not only generate text but also act:

1. Set up the environment on its own.
2. Run tests, spot an error in logs, find the right file, and fix it.
3. All of this completely locally, without sending code to the cloud.
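The loop described in the list above can be sketched in a few dozen lines. This is a deliberately simplified illustration, not how iQuest or any particular agent framework works: `propose_fix` is a hypothetical stub where a locally hosted model would be called, and the log parsing only understands Python tracebacks.

```python
import re
import subprocess
import sys
from typing import Optional

# Matches the 'File "path", line N' frames that Python tracebacks contain.
TRACEBACK_FRAME = re.compile(r'File "([^"]+)", line (\d+)')


def run_command(cmd: list[str]) -> tuple[int, str]:
    """Run a command and return (exit code, combined stdout and stderr)."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout + proc.stderr


def locate_failure(log: str) -> Optional[tuple[str, int]]:
    """Return the (file, line) of the innermost traceback frame, if any."""
    frames = TRACEBACK_FRAME.findall(log)
    if not frames:
        return None
    path, line = frames[-1]
    return path, int(line)


def propose_fix(path: str, line: int, log: str) -> None:
    """Hypothetical hook: a local model would be asked for a patch here."""
    print(f"would ask the model to fix {path}:{line}")


def agent_loop(test_cmd: list[str], max_attempts: int = 3) -> bool:
    """Run tests; on failure, hand the failing location to the model and retry."""
    for _ in range(max_attempts):
        code, log = run_command(test_cmd)
        if code == 0:
            return True  # tests pass, nothing left to do
        location = locate_failure(log)
        if location is None:
            return False  # nothing actionable in the log
        propose_fix(*location, log)
    return False
```

You would call it with something like `agent_loop([sys.executable, "-m", "pytest", "-q"])`. Everything runs through `subprocess` on the local machine, so no code or logs leave it unless the model behind `propose_fix` is remote.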

This gives us back digital sovereignty. Right now, to get a "smart agent" you need to pay $20/month (more like $200) and leak your data to the cloud. With iQuest on a local server, you can build your own AI employee that knows your secret keys, has access to the prod database, and sends data nowhere.

The era of chatbots is rapidly ending. The era of AI colleagues working inside your perimeter is beginning.