
AI Model Cheat Sheet

Every major model's price, context window, and sweet spot — on one page you can bookmark. Last verified July 2026 · printable (Ctrl+P) · updated monthly alongside our price tracker.


| Model | Provider | Input $/1M tokens | Output $/1M tokens | Context | Max output | Best for |
|---|---|---|---|---|---|---|
| Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | 1M | 64K | Bulk classification, tagging, cheapest possible calls |
| GPT-5.4 Nano | OpenAI | $0.20 | $1.25 | 400K* | 128K* | High-volume simple tasks in the OpenAI stack |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | 1M | 64K | Budget chat & summarization with big context |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | 200K | 64K | Fast production workloads needing Claude quality |
| GPT-5.6 Luna | OpenAI | $1.00 | $6.00 | 400K* | 128K* | Value tier of OpenAI's newest family |
| Gemini 3.5 Flash | Google | $1.50 | $9.00 | 1M | 65K | Speed/intelligence balance, agentic browsing |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | 500K | 64K | Strong reasoning at mid-tier price |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | 400K* | 128K* | Mainstream OpenAI production default |
| GPT-5.6 Terra | OpenAI | $2.50 | $15.00 | 400K* | 128K* | Newest mainstream flagship |
| Claude Sonnet 5 | Anthropic | $3.00† | $15.00† | 1M | 128K | Coding & agents: near-frontier quality at flagship price |
| Claude Opus 4.8 | Anthropic | $5.00 | $25.00 | 1M | 128K | Long-horizon agents, hard coding, knowledge work |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | 400K* | 128K* | OpenAI's frontier reasoning |
| GPT-5.6 Sol | OpenAI | $5.00 | $30.00 | 400K* | 128K* | Top of the GPT-5.6 family |
| Claude Fable 5 | Anthropic | $10.00 | $50.00 | 1M | 128K | The hardest reasoning and longest autonomous runs |

Standard-tier list prices, July 2026.
†Claude Sonnet 5 intro rate of $2/$10 runs through Aug 31, 2026.
*GPT-5.x figures follow the GPT-5 family's published specs; confirm on OpenAI's model page for the exact variant.
All listed models support vision input, prompt caching (~90% off repeated input), and batch processing (~50% off).
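To see how the caching and batch discounts change a bill, here is a minimal sketch. It assumes cached input tokens are billed at 10% of the input price and that batch halves the whole request cost. Both follow the approximate figures in the note above, but the exact rates, any cache-write surcharge, and whether the two discounts stack all vary by provider. The function name `discounted_cost` and its parameters are hypothetical.

```python
def discounted_cost(input_tokens, output_tokens, input_price, output_price,
                    cached_fraction=0.0, batch=False,
                    cache_discount=0.90, batch_discount=0.50):
    """Estimate the USD cost of one request at per-1M-token prices.

    Assumptions (not provider-confirmed): cached input is billed at
    (1 - cache_discount) of the input price, and batch applies
    batch_discount to the whole request.
    """
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    cost = (fresh * input_price
            + cached * input_price * (1 - cache_discount)
            + output_tokens * output_price) / 1_000_000
    if batch:
        cost *= (1 - batch_discount)
    return cost


# Hypothetical request at GPT-5.4 list prices ($2.50 / $15.00):
# 10,000 input tokens, 500 output tokens.
no_cache = discounted_cost(10_000, 500, 2.50, 15.00)
with_cache = discounted_cost(10_000, 500, 2.50, 15.00, cached_fraction=0.8)
print(f"no cache: ${no_cache:.4f}  80% cached: ${with_cache:.4f}")
```

Under these assumptions the request drops from about $0.0325 to about $0.0145. Output tokens are not discounted by caching, so the saving is largest for prompts with long repeated context and short answers.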

Quick picks — skip the analysis

Cheapest that's still good: Gemini 2.5 Flash-Lite ($0.10/$0.40) for classification and extraction; GPT-5.4 Nano if you're already on OpenAI.
Best value flagship: Claude Sonnet 5 — near-frontier coding/agents at $3/$15 (and $2/$10 until Aug 31, 2026).
Biggest context window: Claude's 1M-token models (Fable 5, Opus 4.8, Sonnet 5) and Gemini's 1M Flash tier — roughly 1,500 pages of text in one request.
Longest single output: 128K-output models (Claude 1M-tier, GPT-5.x) — a whole report or codebase-sized diff in one call.
Hardest problems, cost no object: Claude Fable 5 ($10/$50) or GPT-5.6 Sol ($5/$30).
Chatbot at scale: route 80% of traffic to Haiku 4.5 / Gemini Flash, escalate the rest — model it in the chatbot cost simulator.
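The 80/20 routing pick can be roughed out in a few lines. This is a sketch under stated assumptions: the escalation target is Claude Sonnet 5 (the text above does not name one), the token counts are invented for illustration, and escalated requests are billed only on the stronger model rather than on both. Prices come from the table above; the function names are hypothetical.

```python
# List prices in USD per 1M tokens: (input, output).
PRICES = {
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Sonnet 5": (3.00, 15.00),
}


def cost_per_request(model, input_tokens, output_tokens):
    """USD cost of one request on `model` at list price."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def blended_cost(cheap, strong, cheap_share, input_tokens, output_tokens):
    """Average cost per request when `cheap_share` of traffic goes to `cheap`."""
    return (cheap_share * cost_per_request(cheap, input_tokens, output_tokens)
            + (1 - cheap_share) * cost_per_request(strong, input_tokens, output_tokens))


# Hypothetical chat turn: 1,500 input tokens and 400 output tokens.
routed = blended_cost("Claude Haiku 4.5", "Claude Sonnet 5", 0.80, 1_500, 400)
all_strong = cost_per_request("Claude Sonnet 5", 1_500, 400)
print(f"routed: ${routed:.5f}  all-Sonnet: ${all_strong:.5f}")
```

With these made-up numbers the routed average is about $0.0049 per turn versus $0.0105 for sending everything to Sonnet 5, a bit under half the cost. A router that first tries the cheap model and then escalates would pay for both calls on escalated traffic, which this sketch does not model.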

Reading the spec sheet

Frequently asked questions

Which single model should I default to?

For most production work in mid-2026: a flagship-tier model (Claude Sonnet 5, GPT-5.6 Terra, or Gemini 3.1 Pro) as the default, a budget model for easy traffic, and a frontier model only where quality visibly pays for itself.

Are benchmark scores on this page?

Deliberately not — leaderboard positions shuffle monthly and rarely predict your task. Price, context, and output limits are the stable facts; test your top 2–3 candidates on your own data.

How do I estimate my cost with these numbers?

Tokens ÷ 1,000,000 × price, computed separately for input and output and then added together. Or skip the arithmetic: the AI cost calculator and token calculator do it live.
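As a worked example of that formula, the sketch below prices a hypothetical workload at Claude Haiku 4.5 list rates ($1.00 input, $5.00 output per 1M tokens). The token counts and request volume are made up for illustration, and the function name `request_cost` is an assumption.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Return the USD cost of one request at per-million-token list prices."""
    input_cost = input_tokens / 1_000_000 * input_price_per_m
    output_cost = output_tokens / 1_000_000 * output_price_per_m
    return input_cost + output_cost


# Hypothetical workload: 2,000 input and 500 output tokens per request,
# 100,000 requests per month.
per_request = request_cost(2_000, 500, 1.00, 5.00)
monthly = per_request * 100_000
print(f"per request: ${per_request:.4f}, per month: ${monthly:,.2f}")
# prints: per request: $0.0045, per month: $450.00
```

Output tokens cost five to six times as much as input tokens across the table, so the output side often dominates even when responses are much shorter than prompts.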

Can I print or save this?

Yes — Ctrl+P gives a clean printable version (navigation is stripped automatically). Bookmark the page for the monthly-updated version.