Anthropic is testing AI Score — a rating of how well you use AI

The system analyzes chat history and Claude Code logs across 11 indicators drawn from the AI Fluency Index study of nearly 10,000 conversations.

Author: Michael Kokin

Anthropic is testing a feature that will score users across 11 habits. A dedicated screen with a personal report is coming to settings — the model will tell you exactly how well (or how badly) you've been communicating with it.

The idea grew out of Anthropic's recent AI Fluency Index research: engineers dissected nearly 10,000 conversations and identified 11 indicators that separate pros from beginners. The system scans everything (chat history, Cowork activity, Claude Code logs) and produces a score showing exactly where you've been slacking.

The biggest insight from all that data: we catastrophically trust pretty formatting.

When AI outputs neatly formatted code or a clean table, users push back 5.6x less often than with raw text. A polished UI literally lulls you into compliance. Ideally, you should apply *more* skepticism to a well-formatted answer — in practice, we just nod and take the output. The new scoreboard is supposed to slap you on the wrist for that kind of laziness, pushing you to fact-check and argue with the model.

Here's the checklist from Anthropic, grouped under three headings. See where you're leaving prompts on the table:

1. Delegation

2. Task framing

3. Critical thinking (where almost everyone falls apart)

Anthropic's official tracker is still behind closed doors, but I built my own tool — AI Score — so you can benchmark yourself right now. Drop in a typical conversation (from Claude, ChatGPT, or Gemini) and the algorithm surfaces your blind spots. It's safe: your text is processed once and immediately deleted from the server.

For those who prefer full privacy (totally get it), I packaged the same logic into a standalone skill on GitHub. You can load it directly into your Claude chat and analyze logs locally.