Anthropic is testing a feature that will score users across 11 habits. A dedicated screen with a personal report is coming to settings — the model will tell you exactly how well (or how badly) you've been communicating with it.
The idea grew out of their recent AI Fluency Index research: engineers dissected nearly 10,000 conversations and identified 11 indicators that separate pros from beginners. The system scans everything — chat history, Cowork activity, Claude Code logs — and produces a score showing exactly where you've been slacking.
The biggest insight from all that data: we catastrophically trust pretty formatting.
When AI outputs neatly formatted code or a clean table, users push back 5.6x less often than with raw text. Polished output literally lulls you into compliance. Ideally, you should apply *more* skepticism to a well-formatted answer, but in practice we just nod and take the output. The new scoreboard is supposed to slap you on the wrist for that kind of laziness, pushing you to fact-check and argue with the model.
Here's the full checklist from Anthropic. See where you're leaving points on the table:
1. Delegation
- **Decide before you type:** Define your goal *before* you start writing the prompt.
- **Strategy first:** Talk through the approach with the model, then ask it to execute.
2. Task framing
- **Context is king:** Specify the audience (who this is for) and the tone you need.
- **Format and examples:** Don't expect the AI to guess the structure. Give it a reference.
- **Direct control:** Explicitly set the role and hard constraints (e.g., "respond without preamble").
- **Iterate:** Polish the output. Never take the first draft and run.
3. Critical thinking (where almost everyone falls apart)
- **Fact-check:** AI makes things up. Verify numbers and claims.
- **Push back:** Catch the model on logical gaps and don't hesitate to call them out.
- **Blind spots:** Feed the AI context it couldn't have known going in.
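To make the delegation and task-framing habits concrete, here is a minimal Python sketch of a prompt builder that refuses to run without a goal and then spells out role, audience, tone, constraints, and a reference example. The names `PromptBrief` and `build_prompt` and the field layout are hypothetical, made up for illustration; this is not Anthropic's scoring logic or the AI Score tool, just one way to turn the checklist into a habit.

```python
from dataclasses import dataclass, field


@dataclass
class PromptBrief:
    """Hypothetical container for the framing habits from the checklist."""
    goal: str                      # "Decide before you type"
    audience: str                  # "Context is king": who this is for
    tone: str                      # "Context is king": the tone you need
    role: str = ""                 # "Direct control": explicit role
    constraints: list[str] = field(default_factory=list)  # hard constraints
    example: str = ""              # "Format and examples": a reference


def build_prompt(brief: PromptBrief) -> str:
    """Assemble a plain-text prompt; fail early if no goal was defined."""
    if not brief.goal.strip():
        raise ValueError("Define the goal before writing the prompt")

    parts = []
    if brief.role:
        parts.append(f"Role: {brief.role}")
    parts.append(f"Goal: {brief.goal}")
    parts.append(f"Audience: {brief.audience}")
    parts.append(f"Tone: {brief.tone}")
    if brief.constraints:
        parts.append("Constraints:")
        parts.extend(f"- {c}" for c in brief.constraints)
    if brief.example:
        parts.append("Reference example:\n" + brief.example)
    return "\n".join(parts)


if __name__ == "__main__":
    brief = PromptBrief(
        goal="Summarize the attached release notes in five bullets",
        audience="non-technical customers",
        tone="friendly, plain language",
        role="product marketing writer",
        constraints=["respond without preamble"],
    )
    print(build_prompt(brief))
```

The critical-thinking habits (fact-checking, pushing back, supplying missing context) happen after the model answers, so they are deliberately left out of this sketch.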
Anthropic's official tracker is still behind closed doors, but I built my own tool — AI Score — so you can benchmark yourself right now. Drop in a typical conversation (from Claude, ChatGPT, or Gemini) and the algorithm surfaces your blind spots. It's safe: your text is processed once and immediately deleted from the server.
For those who prefer full privacy (totally get it), I packaged the same logic into a standalone skill on GitHub. You can load it directly into your Claude chat and analyze logs locally.