AI Model Price Tracker
Every major model's current API price in one table, plus a running changelog of cuts, hikes, and new releases. Last verified: July 2026.
Current prices (per 1M tokens, standard tier)
| Model | Provider | Input | Output | Tier |
|---|---|---|---|---|
| Gemini 2.5 Flash-Lite | Google | $0.10 | $0.40 | Budget |
| GPT-5.4 Nano | OpenAI | $0.20 | $1.25 | Budget |
| Gemini 2.5 Flash | Google | $0.30 | $2.50 | Budget |
| Claude Haiku 4.5 | Anthropic | $1.00 | $5.00 | Fast |
| GPT-5.6 Luna | OpenAI | $1.00 | $6.00 | Fast |
| Gemini 3.5 Flash | Google | $1.50 | $9.00 | Fast |
| Gemini 3.1 Pro | Google | $2.00 | $12.00 | Flagship |
| GPT-5.4 | OpenAI | $2.50 | $15.00 | Flagship |
| GPT-5.6 Terra | OpenAI | $2.50 | $15.00 | Flagship |
| Claude Sonnet 5 | Anthropic | $3.00* | $15.00* | Flagship |
| Claude Opus 4.8 | Anthropic | $5.00 | $25.00 | Frontier |
| GPT-5.5 | OpenAI | $5.00 | $30.00 | Frontier |
| GPT-5.6 Sol | OpenAI | $5.00 | $30.00 | Frontier |
| Claude Fable 5 | Anthropic | $10.00 | $50.00 | Frontier |
*Claude Sonnet 5 has an introductory rate of $2.00 / $10.00 through August 31, 2026; the table shows the list price. Estimate your own workload with the AI cost calculator.
Changelog
July 2026 — tracker launched (baseline)
- Baseline prices recorded for all 14 tracked models (table above).
- Notable going in: OpenAI's new GPT-5.6 family (Luna/Terra/Sol) spans $1–$5 input; Claude Sonnet 5 is running an intro discount ($2/$10) through the end of August; Gemini 2.5 Flash-Lite remains the cheapest tracked model at $0.10 input.
- This page is updated monthly — bookmark it to catch cuts before your next bill.
The one trend to know
The price of a given level of AI capability has been falling fast: industry analyses put it around 10× cheaper per year for equivalent quality, driven by hardware, efficiency gains, and competition. At that rate, the cost of equivalent quality falls by roughly 3× in six months, so the flagship model you priced six months ago probably has a successor that's cheaper and better. Re-check before committing an annual budget, and design your product so the model can be swapped easily.
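One way to keep the model swappable is to hold prices and model choice in a single lookup rather than scattering them through the code. The sketch below is illustrative only: the dictionary keys are hypothetical labels (not real API model IDs), the function names are made up for this example, and the prices are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# List prices from the table above (July 2026 snapshot).
# Keys are hypothetical labels, not provider model IDs.
PRICES = {
    "claude-haiku-4.5": ModelPrice(1.00, 5.00),
    "gpt-5.6-luna": ModelPrice(1.00, 6.00),
    "gemini-3.5-flash": ModelPrice(1.50, 9.00),
}


def monthly_cost(model_key, input_tokens, output_tokens):
    """USD cost at list price for a month's token volume."""
    p = PRICES[model_key]
    return (input_tokens * p.input_per_m + output_tokens * p.output_per_m) / 1_000_000


def cheapest(candidates, input_tokens, output_tokens):
    """Return the candidate key with the lowest cost for this workload."""
    return min(candidates, key=lambda k: monthly_cost(k, input_tokens, output_tokens))


# Example: 200M input and 50M output tokens per month.
print(monthly_cost("claude-haiku-4.5", 200_000_000, 50_000_000))  # 450.0
print(cheapest(PRICES, 200_000_000, 50_000_000))
```

With prices in one place, a monthly re-check means editing the table and re-running the comparison, not touching call sites.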
How to read AI pricing pages without getting burned
- Input vs output: output costs 4× to about 8× the input price for the models tracked above (most sit at 5–6×); workloads that write a lot (drafting, code generation) cost more than the input price suggests.
- Long-context surcharges: some models charge more above ~200K tokens of context.
- Cached/batch rates: the headline price is the ceiling — caching (~90% off repeated input) and batch (~50% off) are the working floor.
- Intro pricing: launch discounts expire (see Sonnet 5 above) — note the end date in your cost model.
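The points above can be folded into a small cost model. This is a rough sketch, not any provider's billing formula: the discount values are the approximate figures from the bullets, the function names are hypothetical, and it assumes the batch discount applies to the whole request and stacks with caching, which varies by provider.

```python
from datetime import date

CACHE_DISCOUNT = 0.90  # ~90% off repeated input (approximate, from the list above)
BATCH_DISCOUNT = 0.50  # ~50% off batch jobs (approximate, from the list above)


def workload_cost(input_tokens, output_tokens, input_price, output_price,
                  cached_fraction=0.0, batch=False):
    """Estimated USD cost; prices are per 1M tokens."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    # Cached input is billed at the discounted rate, fresh input at full price.
    billable_input = fresh + cached * (1 - CACHE_DISCOUNT)
    cost = (billable_input * input_price + output_tokens * output_price) / 1_000_000
    if batch:
        # Assumption: batch discount covers the whole request and stacks
        # with caching; check each provider's terms.
        cost *= 1 - BATCH_DISCOUNT
    return cost


def sonnet5_prices(on):
    """(input, output) price per 1M tokens for Claude Sonnet 5 on a given date."""
    intro_ends = date(2026, 8, 31)  # intro rate runs through this date
    return (2.00, 10.00) if on <= intro_ends else (3.00, 15.00)


# Example: 100M input and 20M output tokens per month on Claude Sonnet 5,
# priced during and after the intro period.
for day in (date(2026, 8, 15), date(2026, 9, 15)):
    inp, out = sonnet5_prices(day)
    ceiling = workload_cost(100_000_000, 20_000_000, inp, out)
    floor = workload_cost(100_000_000, 20_000_000, inp, out,
                          cached_fraction=0.5, batch=True)
    print(day, round(ceiling, 2), round(floor, 2))
```

For this example workload the headline cost rises from about $400 to about $600 a month once the intro rate ends, which is why the end date belongs in the cost model.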
Frequently asked questions
How often is this updated?
Monthly, and after any major provider announcement. The "last verified" date at the top tells you the snapshot's freshness.
Why do some sites show different prices?
Usually one of: outdated data, batch/cached rates presented as headline prices, reseller markups, or regional/enterprise tiers. We track the providers' standard public list prices.
Do prices differ by region?
List prices are global in USD; versions sold through cloud platforms (Amazon Bedrock, Google Vertex AI, Azure) can differ slightly and add enterprise features.