AI API Cost Calculator
Estimate what your LLM usage will cost across OpenAI (GPT), Anthropic (Claude), and Google (Gemini) — and compare every major model side by side. Pricing last updated July 2026.
Rule of thumb: 1,000 tokens ≈ 750 English words. A typical chat message is 50–300 tokens; a long document can be 10,000+.
Full pricing comparison (per 1 million tokens)
Sorted by output price, cheapest first. The "Cost / 1,000 requests" column is computed from your inputs above once you press Calculate.
| Model | Provider | Input $/1M | Output $/1M | Cost / 1,000 requests |
|---|---|---|---|---|
Prices are standard-tier, short-context list prices as of July 2026 and change often — always confirm on the provider's official pricing page before committing. Batch processing (up to 50% off) and prompt caching (up to 90% off repeated input) can substantially lower real-world costs.
Want details on one model? See the per-model breakdowns in our AI pricing directory.
How LLM API pricing works
Every major AI provider bills the same way: you pay per token, with separate rates for input (what you send — your prompt, documents, conversation history) and output (what the model writes back). Output tokens typically cost 3–6× more than input tokens.
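The billing rule above boils down to one line of arithmetic. Here is a minimal sketch of it in Python; the function name and the two rates are placeholders chosen for illustration, not any provider's actual prices.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_per_million: float, output_per_million: float) -> float:
    """Dollar cost of one request, given rates in dollars per 1M tokens."""
    return (input_tokens * input_per_million
            + output_tokens * output_per_million) / 1_000_000


# Hypothetical rates for illustration only (output assumed to be 4x input).
INPUT_RATE = 1.00   # dollars per 1M input tokens
OUTPUT_RATE = 4.00  # dollars per 1M output tokens

per_request = request_cost(1_000, 400, INPUT_RATE, OUTPUT_RATE)
per_thousand_requests = per_request * 1_000
print(f"${per_request:.4f} per request, "
      f"${per_thousand_requests:.2f} per 1,000 requests")
```

Multiplying the per-request figure by 1,000 gives the same kind of number as the "Cost / 1,000 requests" column in the comparison table.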
A token is roughly ¾ of an English word. "The quick brown fox" is 4 tokens; this paragraph is about 60. Code, non-English languages, and unusual formatting usually consume more tokens per word.
Three ways to cut your AI bill
- Prompt caching. If every request repeats the same long system prompt or document, providers charge cached input at a reduced rate, as low as ~10% of the normal input price. For agents and RAG apps this is often the single biggest saving.
- Batch processing. Work that can wait up to 24 hours (classification, embeddings, bulk generation) gets up to ~50% off at OpenAI, Anthropic, and Google.
- Model routing. Don't send every request to a flagship model. Route simple tasks (classification, extraction, short answers) to a budget model like Gemini 2.5 Flash-Lite or GPT-5.4 Nano — often 25–100× cheaper per token than flagships.
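To see how the three levers above change a bill, here is a rough Python sketch. The multipliers mirror the approximate discounts quoted on this page (~10% of the normal rate for cached input, ~50% off for batch); the rates, function names, and the assumption that caching and batch discounts stack are illustrative only, so check each provider's rules before relying on them.

```python
CACHED_INPUT_MULTIPLIER = 0.10  # cached input billed at ~10% of normal (approximate)
BATCH_MULTIPLIER = 0.50         # batch jobs billed at ~50% of normal (approximate)


def discounted_cost(input_tokens, output_tokens, input_rate, output_rate,
                    cached_fraction=0.0, batch=False):
    """Dollar cost of one request; rates are dollars per 1M tokens."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    input_cost = (fresh + cached * CACHED_INPUT_MULTIPLIER) * input_rate
    output_cost = output_tokens * output_rate
    total = (input_cost + output_cost) / 1_000_000
    # Assumption: the batch discount applies on top of any caching discount.
    return total * BATCH_MULTIPLIER if batch else total


def routed_cost(flagship_cost, budget_cost, budget_share):
    """Average cost per request when budget_share of traffic goes to a budget model."""
    return budget_share * budget_cost + (1 - budget_share) * flagship_cost


# Hypothetical request: 10,000 input tokens (90% repeated), 500 output tokens,
# at made-up rates of $1.00 / $4.00 per 1M tokens.
baseline = discounted_cost(10_000, 500, 1.00, 4.00)
with_cache = discounted_cost(10_000, 500, 1.00, 4.00, cached_fraction=0.9)
with_batch = discounted_cost(10_000, 500, 1.00, 4.00, batch=True)
print(f"baseline ${baseline:.4f}, cached ${with_cache:.4f}, batch ${with_batch:.4f}")
```

In this made-up case caching saves more than batching because the request is dominated by repeated input; an output-heavy workload would shift the balance the other way.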
API vs. a chat subscription — which is cheaper?
A ChatGPT/Claude/Gemini consumer subscription runs about $20/month flat. Using the calculator above: a heavy personal user sending ~100 messages a day (≈3,000 requests/month at ~1,000 input + 400 output tokens each, or about 3M input and 1.2M output tokens in total) costs roughly $2–8/month on a mid-tier model via API. That is often cheaper than the subscription, but you give up the polished app and pay for every call. For teams building products, the API is the only option; for casual personal use, subscriptions are simpler.
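The heavy-user scenario can be reproduced in a few lines of Python. The two rate pairs below are hypothetical stand-ins for the low and high end of "mid-tier", picked only to show how the monthly figure is derived; plug in current list prices for a real comparison.

```python
REQUESTS_PER_MONTH = 3_000   # ~100 messages a day
INPUT_TOKENS = 1_000         # per request
OUTPUT_TOKENS = 400          # per request
SUBSCRIPTION_PRICE = 20.00   # dollars per month, flat


def monthly_api_cost(input_rate: float, output_rate: float) -> float:
    """Monthly API bill in dollars; rates are dollars per 1M tokens."""
    total_input = REQUESTS_PER_MONTH * INPUT_TOKENS    # 3,000,000 tokens
    total_output = REQUESTS_PER_MONTH * OUTPUT_TOKENS  # 1,200,000 tokens
    return (total_input * input_rate + total_output * output_rate) / 1_000_000


# Hypothetical (input, output) rates, not real prices.
scenarios = {
    "lower mid-tier (assumed)": (0.30, 1.00),
    "upper mid-tier (assumed)": (1.00, 4.00),
}
for label, (input_rate, output_rate) in scenarios.items():
    cost = monthly_api_cost(input_rate, output_rate)
    verdict = "cheaper" if cost < SUBSCRIPTION_PRICE else "more expensive"
    print(f"{label}: ${cost:.2f}/month, {verdict} than the subscription")
```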
Frequently asked questions
How many tokens is my prompt?
Roughly: words × 1.33 for English prose. For an exact count, use the tokenizer tool each provider ships. For a quick estimate, paste your text into our word counter and multiply the word count by 1.33.
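If you would rather script the estimate, here is a small Python sketch of the same rule of thumb. It counts whitespace-separated words and scales by 1,000 tokens per 750 words (about 1.33); the function name is made up for this example, and the result is an approximation, not a real tokenizer count.

```python
import math

TOKENS = 1_000  # rule of thumb: about this many tokens...
WORDS = 750     # ...per this many English words


def estimate_tokens(text: str) -> int:
    """Rough token estimate from a whitespace word count (English prose only)."""
    word_count = len(text.split())
    return math.ceil(word_count * TOKENS / WORDS)


sample = "Summarize the attached report in three bullet points for a busy executive."
print(estimate_tokens(sample), "tokens (estimated)")
```

Expect the estimate to run low for code, non-English text, and unusual formatting, which tend to use more tokens per word.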
Do I pay for the conversation history?
Yes. Most chat APIs are stateless, so the whole conversation is re-sent (and billed as input) on every turn. Each new turn in a long chat therefore costs more than the one before it; caching mitigates this.
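To get a feel for how quickly re-sent history adds up, here is a simplified Python model. It assumes every user message and every reply has a fixed length and that nothing is cached or trimmed; the function name and token counts are illustrative only.

```python
def cumulative_input_tokens(turns: int, user_tokens: int, reply_tokens: int,
                            system_tokens: int = 0) -> int:
    """Total input tokens billed over a chat when full history is re-sent each turn."""
    total_billed = 0
    history = system_tokens
    for _ in range(turns):
        history += user_tokens    # the new user message joins the history
        total_billed += history   # the whole history is billed as input
        history += reply_tokens   # the reply is re-sent on later turns
    return total_billed


# Hypothetical chat: 100-token messages, 200-token replies.
for turns in (1, 10, 50):
    billed = cumulative_input_tokens(turns, user_tokens=100, reply_tokens=200)
    typed = turns * 100
    print(f"{turns} turns: {billed:,} input tokens billed for {typed:,} tokens typed")
```

Billed input grows roughly with the square of the number of turns, which is why trimming or caching history matters most in long conversations.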
Why is output so much more expensive than input?
Output is generated one token at a time, and each new token needs its own forward pass through the model. Input, by contrast, is processed in parallel in a single pass. The compute asymmetry shows up in the price.
Are there free tiers?
Google's Gemini API has a meaningful free tier for low volumes. OpenAI and Anthropic offer trial credits. All are rate-limited and intended for development, not production.