Together AI bills per token, split into input (your prompt) and output (the model's reply), priced per million tokens. Across the 20 models tracked here, output runs from LFM2 24B A2B at $0.12 per 1M output tokens up to Kimi K2.6 at $4.5 per 1M output tokens. Input is usually cheaper than output (Llama 3.3 70B, listed at $3.5 for both, is the one exception in this table), and cached input is cheaper still. The table below has every model's exact rates, verified June 14, 2026.
| Model | Input (USD / 1M tokens) | Output (USD / 1M tokens) |
|---|---|---|
| Llama 3.3 70B | $3.5 | $3.5 |
| DeepSeek-R1-0528 | $0.18 | $0.88 |
| DeepSeek V4 Pro | $0.6 | $4.4 |
| DeepSeek-V3.1 | $0.6 | $1.7 |
| Qwen3.5-397B-A17B | $0.6 | $3.6 |
| Qwen3 235B A22B FP8 Throughput | $0.2 | $0.6 |
| Qwen3.6-Plus | $0.5 | $3 |
| Kimi K2.6 | $1.2 | $4.5 |
| Kimi K2.5 | $0.5 | $2.8 |
| MiniMax M2.7 | $0.3 | $1.2 |
| GLM-5.1 | $1.4 | $4.4 |
| gpt-oss-120B | $0.15 | $0.6 |
| gpt-oss-20B | $0.05 | $0.2 |
| LFM2 24B A2B | $0.03 | $0.12 |
| Gemma 4 31B | $0.2 | $0.5 |
| Gemma 3n E4B Instruct | $0.06 | $0.12 |
| Qwen3.5 9B | $0.1 | $0.15 |
| GLM-5 | $1 | $3.2 |
| Qwen3-Coder-Next | $0.5 | $1.2 |
| MiniMax M2.5 | $0.3 | $1.2 |
Together AI is a multi-model marketplace offering serverless inference, dedicated endpoints, GPU clusters, fine-tuning, and a sandbox. Prices are per 1M tokens for LLMs unless otherwise noted; dedicated inference and GPU clusters are billed per GPU-hour.
Source: together.ai · Catalog 2026-06-14.2. Confirm the live rate before you commit.
The rates above are per unit. Your bill is those rates times how hard your code leans on Together AI, plus everything around it. PrePrice scans your project, finds where you call Together AI, and computes your real cost per user and what to charge.
By output-token price, LFM2 24B A2B is currently the cheapest Together AI model at $0.12 per 1M output tokens and $0.03 per 1M input tokens. Gemma 3n E4B Instruct matches it on output at $0.12 but costs $0.06 per 1M input tokens. The cheapest model per token is not always the cheapest per finished task, since a weaker model can need more retries or longer prompts.
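To make the per-task point concrete, here is a minimal sketch comparing two models from the table. It assumes each attempt succeeds independently with a fixed probability, so the expected number of attempts is 1 divided by that probability. The success rates and token counts are invented for illustration and are not measurements of either model; only the per-token rates come from the table.

```python
def cost_per_finished_task(
    input_rate: float,      # USD per 1M input tokens
    output_rate: float,     # USD per 1M output tokens
    input_tokens: int,      # tokens sent per attempt
    output_tokens: int,     # tokens received per attempt
    success_rate: float,    # ASSUMED probability that one attempt is usable
) -> float:
    """Expected USD spent until one attempt succeeds (geometric retries)."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    per_attempt = (input_tokens * input_rate
                   + output_tokens * output_rate) / 1_000_000
    return per_attempt / success_rate

# Rates from the table; success rates are HYPOTHETICAL placeholders.
lfm2 = cost_per_finished_task(0.03, 0.12, 3_000, 800, success_rate=0.50)
gpt_oss_20b = cost_per_finished_task(0.05, 0.20, 3_000, 800, success_rate=0.95)

print(f"LFM2 24B A2B: ${lfm2:.6f} per finished task")
print(f"gpt-oss-20B:  ${gpt_oss_20b:.6f} per finished task")
```

Under these made-up success rates the model with the higher per-token price ends up cheaper per finished task. With your own measured success rates the ordering may well go the other way.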
Together AI is pay-as-you-go and billed per token: you pay for input tokens (your prompt and context) and output tokens (the response) separately, priced per million tokens. There is no monthly base fee for the API itself.
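As a rough illustration of that arithmetic, the sketch below prices a single request from its token counts. The rates are copied from the table above; the `Rate` class, the `request_cost` helper, and the token counts are hypothetical names and numbers chosen for this example, not part of any Together AI SDK.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rate:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens

# Rates copied from the table above; confirm the live rate before relying on them.
RATES = {
    "gpt-oss-120B": Rate(0.15, 0.60),
    "Kimi K2.6": Rate(1.20, 4.50),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request: input and output tokens are priced separately."""
    rate = RATES[model]
    return (input_tokens * rate.input_per_million
            + output_tokens * rate.output_per_million) / 1_000_000

# Hypothetical request: 2,000 prompt tokens in, 500 reply tokens out.
cheap = request_cost("gpt-oss-120B", 2_000, 500)
pricey = request_cost("Kimi K2.6", 2_000, 500)
print(f"gpt-oss-120B: ${cheap:.6f}")
print(f"Kimi K2.6:    ${pricey:.6f}")
```

There is no fixed term in the formula, which matches the absence of a monthly base fee: zero tokens means zero cost.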
The per-unit rate is only one of the four numbers that set your bill. The others are how much your app uses Together AI, how often, and across how many users. Long prompts, retries, multi-step agents, and uncached repeated context all multiply the rate. PrePrice models that usage from your code so the number is real, not a guess.
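One simple way to see how those numbers combine is a back-of-the-envelope estimate like the sketch below. It is not PrePrice's method, only an assumed model in which the monthly bill is the per-call cost times calls per user times users, with an optional multiplier for retries. Every usage figure here is hypothetical; only the MiniMax M2.7 rates come from the table.

```python
def monthly_bill(
    input_rate: float,            # USD per 1M input tokens
    output_rate: float,           # USD per 1M output tokens
    input_tokens_per_call: int,   # how much: prompt plus context per call
    output_tokens_per_call: int,  # how much: reply length per call
    calls_per_user_per_day: float,  # how often
    users: int,                   # across how many users
    days: int = 30,
    retry_multiplier: float = 1.0,  # e.g. 1.2 if about 20% of calls are retried
) -> float:
    """Estimated USD per month under a simple multiplicative usage model."""
    per_call = (input_tokens_per_call * input_rate
                + output_tokens_per_call * output_rate) / 1_000_000
    calls = calls_per_user_per_day * days * users * retry_multiplier
    return per_call * calls

# MiniMax M2.7 rates from the table; all usage numbers are HYPOTHETICAL.
users = 1_000
total = monthly_bill(0.30, 1.20, 1_500, 400,
                     calls_per_user_per_day=20, users=users)
print(f"Monthly bill:  ${total:.2f}")
print(f"Cost per user: ${total / users:.3f}")
```

Doubling the prompt length, the call frequency, or the user count each scales the bill, which is why the per-token rate alone says little about what you will pay.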
These rates come from Together AI's official pricing page (together.ai), verified June 14, 2026 and re-checked on a schedule. Pricing changes often, so confirm the live rate before you commit.
See the full AI Cost Index or estimate your monthly bill.