Anthropic is testing AI Score — a rating of how well you use AI

Anthropic is testing a feature that will score users across 11 habits. A dedicated screen with a personal report is coming to settings — the model will tell you exactly how well (or how badly) you've been communicating with it.

The idea grew out of their recent AI Fluency Index research: engineers dissected nearly 10,000 conversations and identified 11 indicators that separate pros from beginners. The system scans everything — chat history, Cowork activity, Claude Code logs — and produces a score showing exactly where you've been slacking.

The biggest insight from all that data: we put far too much trust in pretty formatting.

When AI outputs neatly formatted code or a clean table, users push back 5.6 times less often than they do with raw text. Polished output lulls you into compliance. Ideally, you should apply *more* skepticism to a well-formatted answer, but in practice we just nod and take the output. The new scoreboard is supposed to slap you on the wrist for that kind of laziness, pushing you to fact-check and argue with the model.

Here's the full checklist from Anthropic. See where you're leaving value on the table:

1. Delegation

**Decide before you type:** Define your goal *before* you start writing the prompt.
**Strategy first:** Talk through the approach with the model, then ask it to execute.

2. Task framing

**Context is king:** Specify the audience (who this is for) and the tone you need.
**Format and examples:** Don't expect the AI to guess the structure. Give it a reference.
**Direct control:** Explicitly set the role and hard constraints (e.g., "respond without preamble").
**Iterate:** Polish the output. Never take the first draft and run.
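
To make the framing habits above concrete, here is a minimal Python sketch of a prompt builder. It is not Anthropic's tooling and has nothing to do with how their score is computed; the `TaskFrame` class, its field names, and `build_prompt` are all hypothetical, made up purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TaskFrame:
    """Hypothetical container for the framing details from the checklist."""
    goal: str                 # decided before typing
    audience: str             # who the output is for
    tone: str                 # the tone you need
    role: str = ""            # optional role for the model
    output_format: str = ""   # expected structure of the answer
    example: str = ""         # optional reference sample
    constraints: list[str] = field(default_factory=list)  # hard limits


def build_prompt(frame: TaskFrame) -> str:
    """Assemble a plain-text prompt, skipping any optional field left empty."""
    parts = []
    if frame.role:
        parts.append(f"Role: {frame.role}")
    parts.append(f"Goal: {frame.goal}")
    parts.append(f"Audience: {frame.audience}")
    parts.append(f"Tone: {frame.tone}")
    if frame.output_format:
        parts.append(f"Format: {frame.output_format}")
    if frame.example:
        parts.append(f"Reference example:\n{frame.example}")
    if frame.constraints:
        bullets = "\n".join(f"- {c}" for c in frame.constraints)
        parts.append(f"Constraints:\n{bullets}")
    return "\n".join(parts)


if __name__ == "__main__":
    frame = TaskFrame(
        goal="Summarize the attached quarterly report",
        audience="non-technical executives",
        tone="neutral and concise",
        role="financial analyst",
        output_format="five bullet points",
        constraints=["respond without preamble"],
    )
    print(build_prompt(frame))
```

Filling in the fields forces you to decide on audience, tone, format, and constraints before you hit send. The "Iterate" habit still applies: treat whatever comes back as a first draft.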

3. Critical thinking (where almost everyone falls apart)

**Fact-check:** AI makes things up. Verify numbers and claims.
**Push back:** Catch the model on logical gaps and don't hesitate to call them out.
**Blind spots:** Feed the AI context it couldn't have known going in.

Anthropic's official tracker is still behind closed doors, but I built my own tool — AI Score — so you can benchmark yourself right now. Drop in a typical conversation (from Claude, ChatGPT, or Gemini) and the algorithm surfaces your blind spots. It's safe: your text is processed once and immediately deleted from the server.

For those who prefer full privacy (totally get it), I packaged the same logic into a standalone skill on GitHub. You can load it directly into your Claude chat and analyze logs locally.