Question 1

What usage shape should I assume before launch?

Accepted Answer

A common planning baseline for consumer products: 10–20% of signups become monthly active users, each active user sends 5–20 messages per month, and support-style conversations average 3–6 turns. Start from your best guess, then re-run the simulation with real data after two weeks of traffic.
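To make the baseline concrete, here is a minimal sketch that turns the ranges above into a low and a high monthly volume. The function name, the 10,000-signup figure, and the reading of one "turn" as one user message are assumptions for illustration, not part of the simulator.

```python
def monthly_volume(signups, active_rate, messages_per_user, turns_per_conversation):
    """Return (active_users, messages, conversations) for one month.

    Assumes one turn equals one user message, so conversations are
    messages divided by turns per conversation.
    """
    active_users = signups * active_rate
    messages = active_users * messages_per_user
    conversations = messages / turns_per_conversation
    return active_users, messages, conversations


# Hypothetical product with 10,000 signups.
# Low end: 10% active, 5 messages each, long 6-turn conversations.
low = monthly_volume(10_000, 0.10, 5, 6)
# High end: 20% active, 20 messages each, short 3-turn conversations.
high = monthly_volume(10_000, 0.20, 20, 3)

for label, (users, messages, conversations) in (("low", low), ("high", high)):
    print(f"{label}: {users:.0f} active users, {messages:.0f} messages, "
          f"{conversations:.0f} conversations")
```

The two ends of the range differ by roughly an order of magnitude in message volume, which is why replacing the guesses with measured data early matters.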

Question 2

Does this include infrastructure costs?

Accepted Answer

No — this is the model (LLM API) bill only. Hosting a thin chat backend typically adds $5–50/month at small scale; vector databases for RAG add more.

Question 3

How do rate limits affect scale?

Accepted Answer

Entry API tiers often allow thousands of requests/minute, though exact limits vary by provider and model; that is enough for most products. Past that, providers raise limits based on usage history or a sales conversation rather than charging extra fees.
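A rough way to check this for your own product is to convert monthly message volume into a peak requests-per-minute estimate and compare it with your tier's limit. The sketch below assumes one API request per message, traffic concentrated in 12 active hours per day, and a peak of 5x the average rate; all three values, and the 1,000 requests/minute limit, are placeholders to replace with your own numbers.

```python
def peak_requests_per_minute(monthly_messages, active_hours_per_day=12,
                             peak_factor=5, days=30):
    """Estimate peak requests/minute, assuming one request per message."""
    active_minutes = days * active_hours_per_day * 60
    average_rpm = monthly_messages / active_minutes
    return average_rpm * peak_factor


def fits_within(peak_rpm, limit_rpm):
    """True if the estimated peak stays under the provider's limit."""
    return peak_rpm <= limit_rpm


# Hypothetical: 40,000 messages/month against a 1,000 requests/minute limit.
peak = peak_requests_per_minute(40_000)
print(f"estimated peak: {peak:.1f} requests/minute")
print("within limit:", fits_within(peak, 1_000))
```

Agents that make several model calls per user message break the one-request-per-message assumption, so multiply the volume accordingly before comparing.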

Question 4

Should I use batch pricing?

Accepted Answer

Not for live chat: batch jobs can take up to 24 hours to return. Use batch for the offline parts (nightly summarization, embedding generation, analytics), which get about 50% off.
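The effect on the total bill can be sketched as a blended cost: live traffic at list price plus offline work at the batch price. The function name and the dollar amounts below are hypothetical, and the 50% discount is the approximate figure from the answer above, so check your provider's current batch pricing.

```python
BATCH_DISCOUNT = 0.50  # approximate figure from the answer above; varies by provider


def blended_monthly_cost(live_cost, offline_cost, batch_discount=BATCH_DISCOUNT):
    """Live chat pays list price; offline work moved to batch pays the discounted price."""
    return live_cost + offline_cost * (1 - batch_discount)


# Hypothetical bill at list price: $300 of live chat, $200 of nightly jobs.
live, offline = 300.0, 200.0
total = blended_monthly_cost(live, offline)
savings = (live + offline) - total
print(f"total with batch: ${total:.2f}, saved: ${savings:.2f}")
```

The saving scales only with the offline share of the bill, so batch pricing matters most for products with heavy background processing.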

Product shape	Typical model cost
FAQ/support bot, budget model, cached	$0.001–0.01 per conversation
General assistant, flagship model	$0.01–0.05 per conversation
Agent doing multi-step work	$0.10–1.00+ per task

Rule of thumb: if model cost per user exceeds ~10% of revenue per user, revisit routing and caching.
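The 10% rule of thumb is simple to check in code. The sketch below is one way to express it, with hypothetical function names and example figures; the per-conversation and per-task costs are picked from within the benchmark ranges above.

```python
def model_cost_share(units_per_user, cost_per_unit, revenue_per_user):
    """Model cost per user as a fraction of revenue per user (same time period)."""
    return units_per_user * cost_per_unit / revenue_per_user


def needs_review(share, threshold=0.10):
    """Apply the ~10% rule of thumb: True means revisit routing and caching."""
    return share > threshold


# Hypothetical general assistant: 10 conversations/month at $0.05, $10/month revenue.
assistant_share = model_cost_share(10, 0.05, 10.0)
# Hypothetical agent product: 20 tasks/month at $0.50, $20/month revenue.
agent_share = model_cost_share(20, 0.50, 20.0)

for name, share in (("assistant", assistant_share), ("agent", agent_share)):
    print(f"{name}: {share:.0%} of revenue, review needed: {needs_review(share)}")
```

In this example the assistant sits at 5% of revenue and passes, while the agent sits at 50% and would need cheaper routing, caching, or higher pricing.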

AI Chatbot Cost Simulator

