This is a real PrePrice audit of a sample AI app. Here's exactly what you get for yours — verdict, cost per run, top drivers with fixes, and risk scenarios.

Scan your repo →
PrePrice · ai-chatbot · 2026-05-11
Code wiped 5.0s after report · never persisted to our database · SHA-256 a3f291…6f8
Verdict: Healthy
Per user / month: $1.89
Worst case: $9.45 / month (the top 5% of users will use this much or more)
Margin at your price: 91%
How we picked this verdict

We compare your cost to your stated price. The cost figure includes every paid service in your stack: hosting, database, auth, monitoring, email, search, payments, and (if present) any AI, LLM, embedding, vector database, or voice spend. We total it at the audience size you have selected on the slider.

  • Green. Typical cost is 30 percent or less of your price (margin 70 percent or better). Pricing is healthy.
  • Yellow. Typical cost is 30 to 60 percent of your price (margin 40 to 70 percent). Workable, but worth tightening.
  • Red. Typical cost is over 60 percent of your price, or your worst-case heavy user costs more than they pay. You are losing money on power users.
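
The three bands above can be written as a small classifier. This is a minimal sketch, not PrePrice's actual implementation: the function name `verdict` and its argument names are hypothetical, and only the thresholds come from the list above. The $19.99 price is the one quoted in the scenarios later in this report.

```python
def verdict(typical_cost: float, worst_case_cost: float, price: float) -> str:
    """Classify unit economics using the thresholds listed above."""
    if price <= 0:
        raise ValueError("price must be positive")
    cost_share = typical_cost / price
    # Red: typical cost over 60% of price, or the heavy user costs more than they pay.
    if cost_share > 0.60 or worst_case_cost > price:
        return "red"
    # Yellow: typical cost above 30% and up to 60% of price.
    if cost_share > 0.30:
        return "yellow"
    # Green: typical cost is 30% or less of price.
    return "green"


# Figures from this report: $1.89 typical, $9.45 worst case, $19.99 price.
print(verdict(1.89, 9.45, 19.99))
```

With this report's figures, typical cost is roughly 9.5 percent of price and the worst case stays below the price, so the result falls in the green band.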

The "worst case" figure is the P95: the top 5 percent of users (the heaviest ones) cost at least this much. Heavy users dominate margin on freemium and chat apps especially, so we always show both.

We look at the right unit for the app. Chat and freemium apps get judged per user per month. Agentic apps that run a clear job (research agent, code generator, video render) get judged per run, because that is how they get billed.

Cost projection
Drag to scale by audience size
Users: 20
AI / API spend: $38
Infra (hosting + DB + …): $0
Total monthly bill: $38

Total = AI bill + infra at current scale ($0 across 5 services). Worst-case AI: $189 / mo · Annual run-rate: $454.
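
The projection figures can be reproduced from the per-user numbers in the verdict. The sketch below assumes the slider simply multiplies per-user cost by audience size; the function and key names are hypothetical.

```python
def project(users: int, ai_per_user: float, worst_per_user: float,
            infra_monthly: float = 0.0) -> dict:
    """Scale per-user monthly cost to an audience size."""
    ai_monthly = users * ai_per_user
    total_monthly = ai_monthly + infra_monthly
    return {
        "ai_monthly": ai_monthly,
        "total_monthly": total_monthly,
        "worst_case_ai_monthly": users * worst_per_user,
        "annual_run_rate": total_monthly * 12,
    }


# 20 users at $1.89 typical and $9.45 worst case, with $0 infra.
projection = project(users=20, ai_per_user=1.89, worst_per_user=9.45)
print({name: round(value) for name, value in projection.items()})
```

Under this assumption the annual run-rate comes from the unrounded monthly bill ($37.80 × 12 = $453.60), which is why it shows as $454 rather than $38 × 12 = $456.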

Detected stack · what you're running on
5 services across 4 categories · $0 / mo · 1 verified

Drag the cost projection slider above to watch each service cross tier boundaries. Verified services use multi-dimensional pricing. Others fall back to Sonnet's best estimate.

Hosting · Compute · 1 service
Free
Vercel
Assumed audience: 20 MAU (drag the slider above to override). Per-line quantities are scaled from this baseline via per-MAU industry defaults; override an individual service with explicit values to recalculate.
Plan: Hobby
Total: $0.00 / mo
Auth · Connectors · 1 service
Free
NextAuth
Open source (Auth.js) · est.
Database · Data · 2 services
Free
PostgreSQL (Drizzle ORM)
Free tier (Vercel Postgres 256MB or Neon 0.5GB) · est.
next: $19.00 past 10k
Redis (Upstash or Vercel KV)
Upstash Free (10k commands/day) · est.
Monitoring · Observability · 1 service
Free
OpenTelemetry + Vercel Observability
Bundled with Vercel Pro (basic observability) · est.
About what you built
Multi-model AI chat with code execution, spreadsheet editing, and writing suggestions powered by Vercel AI SDK and Next.js.
What it does: AI Chat With Artifacts
Who it's for: Prosumer creators and developers
Where it runs: Web

Features we found · 7

What we found · 4 issues
All 4 fixes shipped: ~$729.00 / month saved at 1k users
#1

Uncached System Prompt In Chat Streaming

High confidence
app/(chat)/api/chat/route.ts:232 · ~$540.00/mo saved at 1k users
What's wrong
System prompt is reconstructed on every chat call without prompt caching; assuming 2000+ token system prompt re-sent at full input cost every turn.
How to fix it
Add cache_control: {type: 'ephemeral'} to the system message if using Anthropic models (Claude Haiku 4.5 or Sonnet 4.5 support it).
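
One way to arrive at a saving of this size is sketched below; it is not the audit's actual formula. The 2000-token prompt is the report's own assumption, the 10 actions per day and 30-day month are borrowed from the scenarios later in this report, and the $1/M input rate is the Claude Haiku 4.5 rate quoted in the risk section. The 0.1× multiplier for cached reads is an assumption based on Anthropic's published cache-read pricing. The sketch ignores the cache-write surcharge and cache misses, so treat the result as an upper bound.

```python
def caching_saving_per_month(users: int,
                             prompt_tokens: int = 2000,
                             actions_per_day: int = 10,
                             days: int = 30,
                             input_usd_per_m: float = 1.00,
                             cached_read_multiplier: float = 0.10) -> float:
    """Monthly saving if every system-prompt read were served from cache."""
    tokens = users * prompt_tokens * actions_per_day * days
    full_cost = tokens / 1_000_000 * input_usd_per_m
    cached_cost = full_cost * cached_read_multiplier
    return full_cost - cached_cost


print(round(caching_saving_per_month(users=1000), 2))
```

Under these assumptions the system prompt alone accounts for 600M input tokens per month at 1k users ($600), and caching it would save $540, which lines up with the estimate above.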
#2

Tool-Use Agent Loop Amplification

Medium confidence
app/(chat)/api/chat/route.ts:237 · ~$162.00/mo saved at 1k users
What's wrong
Five tools available per invocation; no explicit call-count cap beyond stepCountIs(5). Each tool round-trip adds 1500-2000 tokens of context exchange; assuming 2 tool calls per chat turn on average.
How to fix it
Audit the tool_choice setting. To limit multi-tool cascades, consider pinning a turn to one named tool (in the Vercel AI SDK the shape is toolChoice: {type: 'tool', toolName: '...'}), or add max_tool_calls if your SDK exposes it.
#3

Uncached Suggestions System Prompt

Medium confidence
lib/ai/tools/request-suggestions.ts:45 · ~$27.00/mo saved at 1k users
What's wrong
streamText call with fixed 50-word system prompt lacks caching; repeated calls with same document content re-process identical instructions at full input cost.
How to fix it
Add cache_control to system prompt if using Anthropic; evaluate OpenAI's prompt caching if model supports it.
#4

Unknown Model Pricing Defaulted

Low confidence
lib/ai/models.ts:26 · ~$0/mo saved at 1k users
What's wrong
Multiple models in the model_registry (deepseek-v3.2, codestral, kimi-k2.5, gpt-oss, grok-4.1) lack pricing data in the pricing_data catalog; cost estimates default to Claude Haiku 4.5 rates, which may underestimate by 2-5x if user selects a more expensive model.
How to fix it
Add explicit pricing for each model in the registry or gate expensive models behind paid tiers; flag unknown models in the UI with a cost warning.
Things to watch · 9 flagged
High risk
Unknown Model Pricing
Models deepseek-v3.2, codestral, mistral-small, kimi-k2.5, gpt-oss, and grok-4.1 detected in lib/ai/models.ts:26 but not found in pricing_data.models. Cost estimates default to Claude Haiku 4.5 ($1/M input, $5/M output); actual cost may be 2-5× higher if user selects an o1-class or GPT-5-class reasoning model.
How to fix: Add explicit pricing for each model in pricing_data or gate high-cost models behind paid tiers. Surface a cost-per-call estimate in the model selector UI.
High risk
Output Cap Missing
No max_tokens or maxTokens parameter visible in app/(chat)/api/chat/route.ts:232 streamText call or hooks/use-active-chat.tsx:89 useChat config. Output can theoretically reach model's native limit (4096-8192 tokens for most models), causing unpredictable cost spikes on long-form responses.
How to fix: Add max_tokens: 1500 (or category-appropriate cap) to streamText options and useChat transport config. This prevents runaway output and caps worst-case cost per turn.
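
To see what a cap buys, the sketch below prices the maximum possible output per turn at the Claude Haiku 4.5 output rate quoted in this report ($5/M). The helper name is hypothetical, and the 4096 and 8192 figures are the native limits mentioned above, not measured values.

```python
OUTPUT_USD_PER_M = 5.00  # Claude Haiku 4.5 output rate quoted in this report


def max_output_cost(max_tokens: int, usd_per_m: float = OUTPUT_USD_PER_M) -> float:
    """Upper bound on output cost for one turn at a given token cap."""
    return max_tokens * usd_per_m / 1_000_000


for cap in (1500, 4096, 8192):
    print(cap, f"${max_output_cost(cap):.4f}")
```

With the suggested cap of 1500 tokens, the worst-case output cost per turn stays under one cent; at an 8192-token limit it is about four cents.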
Medium risk
Retry Amplification
No retry backoff or rate-limit middleware detected in API routes (app/(chat)/api/chat/route.ts, app/(chat)/api/suggestions/route.ts). If upstream LLM returns 429 or 5xx, client-side retries via useChat's automatic retry mechanism could amplify cost by 2-3× during transient outages.
How to fix: Add exponential backoff middleware at the transport layer; use Vercel AI Gateway's built-in retry with jitter if available. Cap client-side retries to 1-2 attempts.
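
The backoff logic itself is language-agnostic. The sketch below shows it in Python for brevity, although the audited routes are TypeScript. `TransientError` and `call_with_backoff` are hypothetical names; the retry cap of 2 follows the suggestion above, and the delay values are illustrative assumptions.

```python
import random
import time


class TransientError(Exception):
    """Stand-in for an upstream 429 or 5xx response (hypothetical)."""


def call_with_backoff(fn, max_retries=2, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep, rng=random.random):
    """Call fn, retrying transient failures with capped exponential backoff."""
    attempt = 0
    while True:
        try:
            return fn()
        except TransientError:
            if attempt >= max_retries:
                raise  # Give up: total calls never exceed max_retries + 1.
            # Full jitter: wait a random time up to min(max_delay, base * 2^attempt).
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * ceiling)
            attempt += 1
```

Capping retries bounds the cost amplification during an outage: with `max_retries=2`, one user action can trigger at most three upstream calls.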
High risk
Free Tier Without Per-User Cap
User rate limiting present (app/(chat)/api/chat/route.ts:85 checks getMessageCountByUserId against entitlementsByUserType.maxMessagesPerHour) but no cost-per-user budget guard. If a guest user maxes out their 3 messages/hour quota with tool-heavy or long-output requests, cost can reach $0.10-0.30/user—eating 50-150% of free tier allowance if you offer unlimited guests.
How to fix: Add per-user monthly cost accumulator; pause service for guest users who exceed $0.05/mo COGS until they upgrade. Track actual cost-per-user in Redis or DB.
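
A per-user cost accumulator can be very small. The sketch below keeps totals in memory to stay self-contained; a real deployment would back it with Redis or the database, as suggested above. The class and method names are hypothetical, the rates are the Claude Haiku 4.5 rates quoted in this report, and the $0.05 cap is the one from the fix.

```python
INPUT_USD_PER_M = 1.00        # Claude Haiku 4.5 input rate quoted in this report
OUTPUT_USD_PER_M = 5.00       # Claude Haiku 4.5 output rate quoted in this report
GUEST_MONTHLY_CAP_USD = 0.05  # Guest budget suggested in the fix above


class CostLedger:
    """Tracks estimated spend per user per calendar month (in-memory sketch)."""

    def __init__(self):
        self._spend = {}  # (user_id, "YYYY-MM") -> USD

    def record(self, user_id, month, input_tokens, output_tokens):
        """Add one turn's estimated cost and return the running total."""
        cost = (input_tokens * INPUT_USD_PER_M
                + output_tokens * OUTPUT_USD_PER_M) / 1_000_000
        key = (user_id, month)
        self._spend[key] = self._spend.get(key, 0.0) + cost
        return self._spend[key]

    def is_paused(self, user_id, month, cap=GUEST_MONTHLY_CAP_USD):
        """True once the user's spend for the month exceeds the cap."""
        return self._spend.get((user_id, month), 0.0) > cap


ledger = CostLedger()
turns = 0
while not ledger.is_paused("guest-demo", "2026-05"):
    # One turn: 2000 input tokens and 1200 output tokens (report assumptions).
    ledger.record("guest-demo", "2026-05", 2000, 1200)
    turns += 1
print(turns)
```

At the report's assumed sizes a turn costs about $0.008, so a guest crosses the $0.05 budget on the seventh turn.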
Medium risk
Reasoning Model Flag Ignored
app/(chat)/api/chat/route.ts:237 checks isReasoningModel to disable tools, but no cost multiplier applied. Reasoning models (o1, o3, Claude Opus thinking, GPT-5) incur 2-5× hidden token overhead for chain-of-thought that isn't visible in prompt or completion. Current cost estimate may be 50-400% low if user selects a reasoning model.
How to fix: Apply a 3× multiplier to output cost when isReasoningModel is true; flag reasoning models in the UI as 'High Cost' and require explicit opt-in.
Medium risk
Tool-Use Cost Uncapped
experimental_activeTools array in app/(chat)/api/chat/route.ts:237 includes 5 tools with stepCountIs(5) as the only limit. Each tool call adds 1500-2000 tokens of round-trip context; worst case is 5 steps × 2 tools/step × 1800 tokens = 18k input tokens = $0.018/turn using Claude Haiku 4.5 uncached. This is 3× the base chat cost.
How to fix: Add max_tool_calls: 3 to streamText options or use tool_choice: 'required' to force single-tool responses. Monitor tool call frequency in production and gate tool-heavy users behind paid tier.
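
The worst-case arithmetic above can be checked directly. The sketch uses the report's own inputs (5 steps, 2 tools per step, 1800 tokens per call, $1/M input); the function name is hypothetical.

```python
def tool_context_cost(steps: int = 5, tools_per_step: int = 2,
                      tokens_per_call: int = 1800,
                      input_usd_per_m: float = 1.00):
    """Return (extra input tokens, USD) added by tool round-trips in one turn."""
    tokens = steps * tools_per_step * tokens_per_call
    return tokens, tokens * input_usd_per_m / 1_000_000


print(tool_context_cost())                           # current worst case
print(tool_context_cost(steps=3, tools_per_step=1))  # capped at 3 tool calls
```

Capping a turn at three tool calls brings the worst case down from 18,000 to 5,400 extra input tokens under the same per-call assumption.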
Medium risk
Vercel Pro Tier Assumption Unverified
Hosting cost set to $20/mo (Vercel Pro) based on detecting multiple environments, Vercel Functions, and production-grade setup (detected via @vercel/functions, @vercel/blob, @vercel/otel in package.json). However, no crons block found in vercel.json and no explicit production domain or org config detected—user might still be on Hobby tier ($0/mo) if deploying personal projects.
How to fix: Verify current Vercel tier via usage dashboard. If MAU < 10k and no team members, stay on Hobby. If crons or advanced functions are required, Pro is mandatory.
Low risk
Stripe Fee Assumption
Payment processing fees computed at 2.9% + $0.30 (Stripe US Standard) per transaction. If user base is European or Latin American, actual fee may be 1.5%-3.6% + currency-local fixed fee. Cost breakdown assumes US-based users; margin may be 0.5-1.5% higher or lower depending on geography.
How to fix: Segment users by billing country in your DB; apply region-specific Stripe fees when computing margin. For EU users, fee is 1.5% + €0.25.
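
A region-aware fee calculation might look like the sketch below. The rates are the two quoted above; the table keys and function name are hypothetical, and the sketch works in a single currency unit with no FX conversion, so the EU fixed fee is treated as 0.25 of whatever unit the price is in.

```python
# (percentage, fixed fee) per region, using the figures quoted in this report.
FEE_SCHEDULE = {
    "US": (0.029, 0.30),
    "EU": (0.015, 0.25),
}


def processing_fee(price: float, region: str) -> float:
    """Fee for one transaction of the given amount."""
    percentage, fixed = FEE_SCHEDULE[region]
    return price * percentage + fixed


for region in FEE_SCHEDULE:
    fee = processing_fee(19.99, region)
    print(region, round(fee, 2), f"{fee / 19.99:.1%} of price")
```

On a $19.99 plan the fixed component matters as much as the percentage, which is why the effective fee share is well above the headline rate in both regions.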
Low risk
Database Cost Missing From Breakdown
Drizzle ORM detected (package.json:43, lib/db/migrate.ts:10) but no database service detected in pre_computed_tech_stack. POSTGRES_URL and REDIS_URL env vars present but no corresponding Neon/Supabase/PlanetScale/Upstash entries. Database cost likely $0 if using Vercel Postgres free tier (256MB) or Supabase free (500MB), but this caps at 10k MAU.
How to fix: Audit POSTGRES_URL provider; if Vercel Postgres, note that free tier is 256MB and pauses after 7 days inactivity—upgrade to Neon Launch ($19/mo) or Supabase Pro ($25/mo) when hitting 10k MAU or requiring always-on availability.
What if things change
Cost-per-user delta if assumptions shift
10x Heavy Users
10% of users generate 100 actions/day (10× median); assume lognormal tail with P95 multiplier already capturing some of this. If top decile hits 100 actions/day, their COGS is $18.90/mo—eating 95% of $19.99 revenue. Blended cost at 1k users: +$1.69/user/mo.
+$1.69 / user / mo
Reasoning Model Upgrade
User switches from Claude Haiku 4.5 to o1-preview or Claude Opus thinking mode; reasoning token overhead 3× on output. Output cost goes from $0.006/turn to $0.018/turn; assuming 50% of actions use reasoning model, +$0.0036/action → +$1.08/mo/user at 10 actions/day.
+$1.08 / user / mo
Free Tier Abuse Doubles Actions
Guest users exploit rate limit (3 messages/hour = 72/day theoretical max) by rotating IPs or creating burner accounts; actual free tier usage doubles to 10 actions/day from assumed 5. Free tier COGS goes from $0.95/mo to $1.89/mo—still under $3 threshold, but conversion pressure increases.
+$0.95 / user / mo
Pricing Model Price Increase 25%
Anthropic raises Claude Haiku 4.5 input cost from $1/M to $1.25/M tokens (25% hike, mirroring GPT-4o mini → GPT-4.5 pricing shift). Per-action cost increases $0.00146 → +$1.31/mo at 10 actions/day × 30.
+$1.31 / user / mo
Output Token Drift 2x
System prompt changes or user behavior shift causes output length to double from 1200 → 2400 tokens/turn. Output cost per turn goes from $0.006 to $0.012; +$1.80/mo at 10 actions/day.
+$1.80 / user / mo
Tool-Use Frequency Doubles
Tool-use adoption increases from 20% of chats to 40% (users discover createDocument/editDocument features); tool round-trips add $0.0006/action → +$0.54/mo.
+$0.54 / user / mo
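
Each scenario converts a per-action change into a monthly per-user delta. As a worked example, the sketch below reproduces the Output Token Drift 2x figure, assuming 10 actions per day, a 30-day month, and the $5/M output rate quoted in this report. The function name is hypothetical.

```python
def monthly_delta(extra_tokens_per_action: int, usd_per_m: float,
                  actions_per_day: int = 10, days: int = 30) -> float:
    """Extra monthly cost per user from additional tokens on every action."""
    per_action = extra_tokens_per_action * usd_per_m / 1_000_000
    return per_action * actions_per_day * days


# Output grows from 1200 to 2400 tokens per turn at $5/M output.
print(round(monthly_delta(2400 - 1200, usd_per_m=5.00), 2))
```

The extra 1200 output tokens cost $0.006 per turn, and 300 turns a month gives the +$1.80 shown for that scenario.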
What this report can't tell you
  • Static analysis cannot estimate the relative frequency of chat vs. tool-use actions—assumed 80/20 split for cost allocation.
  • System prompt size from runtime config (systemPrompt({ requestHints, supportsTools })) is unknown; assumed 2000 tokens based on typical chat app prompts.
  • Reasoning token cost is approximated at 30% output overhead; actual cost depends on model and may be 2-5× if using o1/o3-class models.
  • Per-feature usage split is inferred from UI signals; actual user behavior (e.g., 90% chat, 5% code, 5% suggestions) may differ significantly.
  • Cache hit rate for prompt caching (if implemented) is unverified—assumed 0% hit rate (worst case) for cost estimates.
  • Tool call frequency per chat turn is assumed at 20%; actual rate depends on user behavior and tool discoverability in UI.
  • Database and Redis costs assumed free tier ($0/mo); upgrade costs ($19-25/mo for DB, $10/mo for Redis) not included in headline COGS but flagged in risk section.
Lifecycle
  1. Code received · 2026-05-11 00:00 UTC · t+0.0s

    git clone --depth 1, into ephemeral worker tmpdir

  2. Code extracted · 2026-05-11 00:00 UTC · t+3.0s

    Secrets purged · 47 files / 1.2 MB kept for analysis

  3. Report generated · 2026-05-11 00:02 UTC · t+2m 25s

    audit_report.json written to scans table (no source code)

  4. Code wiped · 2026-05-11 00:02 UTC · t+2m 30s

    shutil.rmtree(workdir) in finally block · 0 files remaining
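
The wipe-in-finally pattern from step 4 can be sketched as follows. This is an illustration of the pattern, not PrePrice's worker code: `run_scan` and the `analyze` callback are hypothetical, and the clone step is left out so the example stays self-contained.

```python
import os
import shutil
import tempfile


def run_scan(analyze):
    """Run analyze(workdir) in a temp directory that is always removed."""
    workdir = tempfile.mkdtemp(prefix="scan-")
    try:
        return analyze(workdir)
    finally:
        # Runs on success and on any exception, so no source files outlive the scan.
        shutil.rmtree(workdir, ignore_errors=True)


seen = []


def fake_analyze(workdir):
    """Stand-in for the analysis step: writes a file and returns a report."""
    seen.append(workdir)
    with open(os.path.join(workdir, "route.ts"), "w") as handle:
        handle.write("// source")
    return {"files": len(os.listdir(workdir))}


print(run_scan(fake_analyze), os.path.exists(seen[0]))
```

Because the cleanup sits in `finally`, a crash during analysis still removes the working directory before the exception propagates.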

Total time your source existed on our infrastructure: 2 minutes 30 seconds
Input digest
Source: github.com/preprice/sample @ HEAD a3f291c
SHA-256: a3f291c5e8b2d4f7a1c9e6b8d3f5a2e7c1b4d6f8
File count: 47 files analyzed
Byte count: 1.2 MB (after secrets purge, before chunking)
Storage region: us-east-1

Verify yourself

Run this against your local copy and confirm that the commit hash it prints matches the 40-character value listed under SHA-256 above:

    git rev-parse HEAD
What we kept · What we wiped

Kept (in our database)

audit_report.json: Findings, costs, fix prompts. No source code. No code snippets.

Wiped from our infrastructure

Repository contents: 47 files / 1.2 MB. Worker tmpdir rmtree'd in finally block at 2026-05-11 00:02 UTC.
Env / credential files: .env, .env.local, .env.production, credentials.json, secrets.yml, service-account.json, firebase-adminsdk.json. All purged BEFORE analysis ran.
Private keys + auth tokens: id_rsa, id_ed25519, all SSH keys, .npmrc, .yarnrc. Purged BEFORE analysis ran. Stack inferred from package.json, imports, and hosting config.
Per-driver code snippets: Stripped from audit_report before persistence (server.py:443-446). The file:line reference survives so the report can render a 'See the code' placeholder; the literal snippet does not.
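
A name-based purge like the one described above could be sketched as follows. The filenames are the ones listed in this section; the function name is hypothetical, and matching "all SSH keys" would need extra rules (for example by file content) that this sketch does not attempt.

```python
import os
import tempfile
from pathlib import Path

# Filenames listed in this report as purged before analysis.
SECRET_NAMES = {
    ".env", ".env.local", ".env.production", "credentials.json",
    "secrets.yml", "service-account.json", "firebase-adminsdk.json",
    "id_rsa", "id_ed25519", ".npmrc", ".yarnrc",
}


def purge_secrets(root) -> list:
    """Delete files whose name is on the secret list; return what was removed."""
    removed = []
    # sorted() materializes the listing before anything is deleted.
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.name in SECRET_NAMES:
            path.unlink()
            removed.append(path.relative_to(root).as_posix())
    return removed


demo_root = tempfile.mkdtemp(prefix="purge-demo-")
os.makedirs(os.path.join(demo_root, "app"))
for name in (".env", "package.json", "app/id_rsa", "app/route.ts"):
    Path(demo_root, name).write_text("x")
print(purge_secrets(demo_root))
```

Files such as package.json survive the purge, which is consistent with the stack being inferred from package.json, imports, and hosting config.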
Verification

This scan never wrote source code to our database. Enforced by RLS policy scans_select_own on table scans (migration 007).

What that means: even an authenticated user with a stolen anon JWT can only read their own scans rows. There is no row in any table that contains your source. Code snippets that the synth step might have embedded are stripped before persistence (server.py:443-446) so the JSON we keep is structurally incapable of holding your code.
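
The stripping step might look something like the sketch below. The contents of server.py:443-446 are not shown in this report, so this is only an illustration: the function name and the `snippet` key are hypothetical.

```python
def strip_snippets(node, banned_keys=frozenset({"snippet"})):
    """Return a copy of a JSON-like structure without any banned keys."""
    if isinstance(node, dict):
        return {key: strip_snippets(value, banned_keys)
                for key, value in node.items() if key not in banned_keys}
    if isinstance(node, list):
        return [strip_snippets(item, banned_keys) for item in node]
    return node


report = {
    "drivers": [
        {"file": "app/(chat)/api/chat/route.ts", "line": 232,
         "snippet": "streamText({ ... })"},
    ],
}
print(strip_snippets(report))
```

The file and line reference survive while the literal code does not, which matches the behavior described for the persisted audit_report.json.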

How we handle your data
Scan ID: sample
Report generated: 2026-05-11
Used to train AI? Never. Anthropic (our AI provider) operates under a zero data retention and no-training contract for API customers.
What we logged: File paths and line numbers only. Never the code itself.
Who can see this report: Only you, when signed in. PrePrice staff do not review your scan unless you email support with the scan ID and ask us to.
Verifiable cost numbers: Every dollar figure in this report links back to the vendor's public pricing page (Anthropic, Vercel, Stripe, and so on). Check our math.
Delete everything: Delete my account and every scan I've run →
Heads up: Pricing data is scraped from vendor pages and audit reports are AI-generated. Cost figures are estimates only and your actual usage will vary. Prices shown are in USD and don't include local taxes. Vendors change pricing tiers without notice. PrePrice is informational only, not financial, legal, or business advice. Always click through to the vendor's pricing page in your report and verify every figure before making any business or pricing decision.

Want this for your own app? It takes a couple of minutes.
