AI model APIs bill per token, split into input and output and quoted per 1 million tokens. As of June 14, 2026, PrePrice tracks 76 models across 10 providers with output token prices from $0.08 to $75 per 1M. Output tokens consistently cost more than input, which is why your real bill depends on how your app spends tokens, not the sticker rate. Every price below links to the provider's official pricing page.
Cached-input reads are typically ~10% of the input rate and are the biggest lever on a high-volume bill. Rates that could not be verified are withheld rather than shown. Always confirm the live price on the provider's page before you commit.
These are list rates per token. What your app actually costs depends on how many tokens each user burns, what you cache, and which models you call where. PrePrice scans your code and tells you your real cost per user and what to charge.
The per-token price in the table is one input of four. Your monthly AI cost is, roughly, tokens per request × requests per user × active users × price per token. The sticker rate is the smallest and most quoted of the four, and the one you control least. The other three are decisions in your code: how long your prompts are, how often you call the model, and whether you cache.
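To make that formula concrete, here is a minimal Python sketch. Every number in it is a made-up placeholder, not a rate from the index, and the function name `monthly_model_cost` is ours for illustration. It splits tokens per request into input and output because the two are priced separately.

```python
def monthly_model_cost(
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    requests_per_user: float,
    active_users: int,
    input_price_per_1m: float,
    output_price_per_1m: float,
) -> float:
    """Rough monthly spend in dollars for one model.

    Implements: tokens per request x requests per user x active users
    x price per token, with input and output priced separately.
    """
    cost_per_request = (
        input_tokens_per_request * input_price_per_1m
        + output_tokens_per_request * output_price_per_1m
    ) / 1_000_000  # prices are quoted per 1 million tokens
    return cost_per_request * requests_per_user * active_users


# Hypothetical app: 1,500 input + 500 output tokens per request,
# 40 requests per user per month, 1,000 active users,
# at assumed rates of $1.00 (input) and $4.00 (output) per 1M tokens.
cost = monthly_model_cost(1_500, 500, 40, 1_000, 1.00, 4.00)
print(f"${cost:.2f} per month")  # $140.00 per month
```

Doubling prompt length, request frequency, or user count moves this number as much as doubling the sticker rate does, and the first two are in your code.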
Across the providers above, output tokens cost several times more than input, and reasoning models bill their hidden thinking tokens as output. An app that returns long answers, writes code, or runs a multi-step agent spends most of its budget on output. An app that classifies or extracts spends most on input. Two apps on the identical model can have wildly different bills.
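A short sketch of that last point, under stated assumptions: both workloads call the same hypothetical model at $1.00 input and $5.00 output per 1M tokens, and the token counts are invented to represent a classifier and an agent. None of these figures come from the index.

```python
from dataclasses import dataclass

# Assumed rates for one hypothetical model, in dollars per 1M tokens.
INPUT_PRICE_PER_1M = 1.00
OUTPUT_PRICE_PER_1M = 5.00


@dataclass
class Workload:
    name: str
    input_tokens: int
    output_tokens: int  # includes any reasoning tokens billed as output

    def input_cost(self) -> float:
        return self.input_tokens * INPUT_PRICE_PER_1M / 1_000_000

    def output_cost(self) -> float:
        return self.output_tokens * OUTPUT_PRICE_PER_1M / 1_000_000

    def total_cost(self) -> float:
        return self.input_cost() + self.output_cost()

    def output_share(self) -> float:
        # Fraction of the per-request cost that comes from output tokens.
        return self.output_cost() / self.total_cost()


# Invented token counts: a long prompt with a one-word answer,
# versus an agent that reasons and writes at length.
classifier = Workload("one-shot classifier", input_tokens=2_000, output_tokens=20)
agent = Workload("multi-step agent", input_tokens=4_000, output_tokens=6_000)

for w in (classifier, agent):
    print(f"{w.name}: ${w.total_cost():.4f} per request, "
          f"{w.output_share():.0%} of it output")
```

With these placeholder numbers the classifier spends almost all of its budget on input, the agent spends most of it on output, and the agent costs roughly 16 times more per request on the identical model.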
Cached input reads are typically about 10% of the standard input rate. If your system prompt, tools, or retrieved context repeat across requests, prompt caching can cut the cost of those repeated tokens by roughly 90%, and your total input cost by close to an order of magnitude when they make up most of the prompt. Most teams leave it off because the savings are invisible until someone models the per-request token flow, which is exactly what a scan does.
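Here is a rough sketch of how to model that token flow for a single request. It assumes cached reads bill at 10% of the input rate, uses a placeholder input price of $1.00 per 1M tokens, and ignores any cache-write surcharge or cache expiry a provider may apply, so treat it as an upper bound on savings and check the provider's caching terms.

```python
from typing import Optional


def input_cost(
    repeated_tokens: int,
    fresh_tokens: int,
    input_price_per_1m: float,
    cached_read_ratio: Optional[float] = None,
) -> float:
    """Input cost in dollars for one request.

    repeated_tokens: system prompt, tools, and context that repeat across requests.
    fresh_tokens: tokens unique to this request (for example, the user message).
    cached_read_ratio: None means no caching; otherwise the fraction of the
        input rate billed for repeated tokens (assumed 0.10 below).
    """
    ratio = 1.0 if cached_read_ratio is None else cached_read_ratio
    billable_tokens = repeated_tokens * ratio + fresh_tokens
    return billable_tokens * input_price_per_1m / 1_000_000


# Hypothetical request: 8,000 repeated tokens + 500 fresh tokens at $1.00 per 1M.
uncached = input_cost(8_000, 500, 1.00)
cached = input_cost(8_000, 500, 1.00, cached_read_ratio=0.10)
savings = 1 - cached / uncached

print(f"uncached ${uncached:.4f}, cached ${cached:.4f}, saved {savings:.0%}")
```

In this example the input cost drops by about 85%. The saving only approaches the full 90% as the repeated portion approaches the whole prompt.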
A real AI app also pays for hosting, a vector database, embeddings, auth, payments, and observability. PrePrice tracks 156+ services so the model is priced next to everything around it. The fastest way to see your whole bill, and what to charge so it clears margin, is to point a scan at your code.
Or see all 156 platforms we price, including hosting, vector databases, auth, and payments.
LLM APIs bill per token, split into input (the prompt) and output (the response), quoted per million tokens. Across the 10 providers tracked here, output tokens currently range from about $0.08 to $75 per 1M, and output almost always costs several times more than input. Pick a specific model in the table above for its exact input, output, and cached-input rates.
The sticker rate is per token, but your bill is tokens per request times requests per user times users, plus retries, system prompts, tool calls, and reasoning tokens you don't see. The same model can cost 5-10x more in a chatty agent than in a one-shot classifier. Per-token price tells you almost nothing about your monthly bill until you model your actual usage.
By output-token price, Llama 3.1 8B Instant (Groq) is currently the cheapest in this index at $0.08 per 1M output tokens. Cheapest is not the same as best value: a weaker model that needs more retries or longer prompts can cost more in practice than a pricier model that gets it right in one pass.
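One way to compare value rather than sticker price is cost per successful result. The sketch below assumes each attempt succeeds independently with a fixed probability, so the expected number of attempts is 1 / success rate. Both models, their prices, their token counts, and their success rates are hypothetical and do not describe any model in the index.

```python
def cost_per_attempt(input_tokens: int, output_tokens: int,
                     input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Dollar cost of a single call."""
    return (input_tokens * input_price_per_1m
            + output_tokens * output_price_per_1m) / 1_000_000


def cost_per_success(attempt_cost: float, success_rate: float) -> float:
    """Expected cost to get one acceptable result.

    Assumes independent retries until success, so the expected number
    of attempts is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return attempt_cost / success_rate


# Hypothetical cheap model: needs a long few-shot prompt and fails half the time.
cheap_attempt = cost_per_attempt(6_000, 300, input_price_per_1m=0.05, output_price_per_1m=0.08)
cheap_success = cost_per_success(cheap_attempt, success_rate=0.50)

# Hypothetical pricier model: short prompt, usually right on the first pass.
pricey_attempt = cost_per_attempt(1_000, 300, input_price_per_1m=0.15, output_price_per_1m=0.60)
pricey_success = cost_per_success(pricey_attempt, success_rate=0.95)

print(f"cheap model:   ${cheap_attempt:.6f} per attempt, ${cheap_success:.6f} per success")
print(f"pricier model: ${pricey_attempt:.6f} per attempt, ${pricey_success:.6f} per success")
```

Under these assumptions the cheap model is slightly cheaper per attempt but nearly twice as expensive per successful result. Measuring your own success rate and prompt length per model is what turns this from an illustration into an estimate.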
This index is generated from the PrePrice pricing catalog (version 2026-06-14.2), last verified June 14, 2026. Every model links to the provider's official pricing page so you can confirm the live rate before you commit.
Beyond LLM APIs, PrePrice tracks 156+ services across hosting, vector databases, auth, payments, voice, search, and analytics. An AI app's bill is rarely just the model. See any service's cost page or run a scan to get the whole stack priced together.
Point PrePrice at your project. We detect every paid service it calls, compute your real cost per user, and tell you what to charge. Most scans finish in 2 to 4 minutes. Free to run.