Comparing each provider's cheapest model, Groq wins on raw output price: Llama 3.1 8B Instant costs $0.08 per 1M output tokens versus $0.12. But "cheaper per token" rarely means "cheaper for your app". A model that needs fewer retries, shorter prompts, or less reasoning can cost less in practice, even at a higher sticker rate.
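To make the retries-and-prompt-length point concrete, here is a minimal sketch of cost per successful request. The per-token rates are taken from the tables on this page; the token counts and retry rates are made-up illustrative numbers, not measurements, and `cost_per_success` is a hypothetical helper name.

```python
def cost_per_success(in_price, out_price, in_tokens, out_tokens,
                     attempts_per_success=1.0):
    """USD cost of one successful request.

    Prices are USD per 1M tokens. attempts_per_success is the average
    number of calls needed to get one usable answer (1.0 = no retries).
    """
    per_attempt = (in_tokens * in_price + out_tokens * out_price) / 1_000_000
    return per_attempt * attempts_per_success


# Llama 3.1 8B Instant: $0.05 in / $0.08 out (rates from the table).
# ASSUMED workload: 2,000-token prompt, 500-token answer, 1.5 attempts.
cheap_sticker = cost_per_success(0.05, 0.08, 2000, 500, attempts_per_success=1.5)

# Gemma 3n E4B Instruct: $0.06 in / $0.12 out (rates from the table).
# ASSUMED workload: 1,200-token prompt, 400-token answer, no retries.
higher_sticker = cost_per_success(0.06, 0.12, 1200, 400, attempts_per_success=1.0)

print(f"lower sticker price:  ${cheap_sticker:.5f} per successful request")
print(f"higher sticker price: ${higher_sticker:.5f} per successful request")
```

Under these assumed numbers, the model with the higher per-token rate comes out cheaper per successful request. With different token counts or retry rates the result can flip, which is why the workload matters more than the rate card.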
Last verified June 14, 2026

| Model | In / 1M | Out / 1M |
|---|---|---|
| GPT OSS 20B | $0.075 | $0.3 |
| GPT OSS Safeguard 20B | $0.075 | $0.3 |
| GPT OSS 120B | $0.15 | $0.6 |
| Llama 4 Scout (17Bx16E) | $0.11 | $0.34 |
| Qwen3 32B | $0.29 | $0.59 |
| Llama 3.3 70B Versatile | $0.59 | $0.79 |
| Llama 3.1 8B Instant | $0.05 | $0.08 |
| Kimi K2 | — | $3 |

| Model | In / 1M | Out / 1M |
|---|---|---|
| Llama 3.3 70B | $3.5 | $3.5 |
| DeepSeek-R1-0528 | $0.18 | $0.88 |
| DeepSeek V4 Pro | $0.6 | $4.4 |
| DeepSeek-V3.1 | $0.6 | $1.7 |
| Qwen3.5-397B-A17B | $0.6 | $3.6 |
| Qwen3 235B A22B FP8 Throughput | $0.2 | $0.6 |
| Qwen3.6-Plus | $0.5 | $3 |
| Kimi K2.6 | $1.2 | $4.5 |
| Kimi K2.5 | $0.5 | $2.8 |
| MiniMax M2.7 | $0.3 | $1.2 |
| GLM-5.1 | $1.4 | $4.4 |
| gpt-oss-120B | $0.15 | $0.6 |
| gpt-oss-20B | $0.05 | $0.2 |
| LFM2 24B A2B | $0.03 | $0.12 |
| Gemma 4 31B | $0.2 | $0.5 |
| Gemma 3n E4B Instruct | $0.06 | $0.12 |
| Qwen3.5 9B | $0.1 | $0.15 |
| GLM-5 | $1 | $3.2 |
| Qwen3-Coder-Next | $0.5 | $1.2 |
| MiniMax M2.5 | $0.3 | $1.2 |
Per-token rates can't tell you which provider is cheaper for your app. The winner depends on how your code uses each model. PrePrice scans your project and computes the real per-user cost on each provider, plus what to charge so you clear your margin.
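As a rough illustration of the per-user arithmetic, the sketch below turns a per-token rate into a monthly cost per user and a price that hits a target gross margin. It is not PrePrice's actual method, which this page does not describe. The usage figures and the 80% margin are assumptions, and both function names are hypothetical.

```python
def monthly_cost_per_user(in_price, out_price, in_tokens, out_tokens,
                          requests_per_month):
    """USD per user per month. Prices are USD per 1M tokens."""
    per_request = (in_tokens * in_price + out_tokens * out_price) / 1_000_000
    return per_request * requests_per_month


def price_for_margin(cost, target_margin):
    """Price at which (price - cost) / price equals target_margin."""
    if not 0 <= target_margin < 1:
        raise ValueError("target_margin must be in [0, 1)")
    return cost / (1 - target_margin)


# Qwen3 32B: $0.29 in / $0.59 out (rates from the first table).
# ASSUMED usage: 3,000 input and 800 output tokens per request,
# 600 requests per user per month.
cost = monthly_cost_per_user(0.29, 0.59, 3000, 800, requests_per_month=600)

# ASSUMED target: 80% gross margin on model spend alone.
price = price_for_margin(cost, 0.80)

print(f"model cost per user: ${cost:.2f}/month")
print(f"price for 80% margin: ${price:.2f}/month")
```

This covers model spend only. Hosting, support, and payment fees would need to be added to `cost` before the margin calculation means much.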