AI Token Cost Calculator — LLM API Pricing per Request & Month

Model tier (sets default prices)

Input $ / 1M

Output $ / 1M

Input tokens per request

Output tokens per request

Requests per month

How the AI Token Cost Calculator Works

Pick a model tier or enter your own per-million-token prices.
Enter input & output tokens per request and your monthly volume.
See the cost per request, per 1,000 calls, and per month.

Estimating LLM API Costs

Large language models charge by the token (roughly ¾ of an English word), with separate prices for input (your prompt) and output (the model's reply), usually quoted per million tokens. Output is typically priced at 3–5× the input rate, so long replies can dominate cost. The formula: cost per request = (input tokens ÷ 1M × input price) + (output tokens ÷ 1M × output price); multiply by your request volume for the monthly total.
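The formula can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual code: the function name `estimate_cost` and the example prices ($3 input and $15 output per 1M tokens) are assumptions chosen to keep the arithmetic easy to follow, so substitute your provider's real rates.

```python
def estimate_cost(input_tokens, output_tokens, input_price, output_price, requests_per_month):
    """Estimate LLM API spend in dollars. Prices are dollars per 1M tokens."""
    # Cost of one request: each token count is weighted by its own per-million price.
    per_request = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return {
        "per_request": per_request,
        "per_1000_calls": per_request * 1000,
        "per_month": per_request * requests_per_month,
    }


# Hypothetical workload: 1,000 input tokens and 500 output tokens per request,
# 10,000 requests per month, at assumed prices of $3 / $15 per 1M tokens.
costs = estimate_cost(1000, 500, 3.00, 15.00, 10_000)
for label, dollars in costs.items():
    print(f"{label}: ${dollars:,.4f}")
```

With these assumed numbers the estimate works out to about $0.0105 per request, $10.50 per 1,000 calls, and $105 per month.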

Because prices change often and vary by provider, this calculator keeps the rates editable: enter the exact numbers from your provider's pricing page. To cut spend, trim prompts, cap output length, cache repeated context, and use a smaller model for simple tasks. All results are estimates.

AI Token Cost FAQ

What is a token?

A token is a chunk of text — roughly ¾ of a word, or about 4 characters. "Hello world" is about 2–3 tokens. Models bill by tokens in and tokens out.
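If you only have raw text and no tokenizer at hand, the 4-characters-per-token rule of thumb can be turned into a quick estimate. The sketch below is an approximation only: `estimate_tokens` is a hypothetical helper, and real counts depend on the provider's tokenizer and on the language of the text.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Rough token estimate from character count; real tokenizers will differ."""
    if not text:
        return 0
    # Round up so that any non-empty text counts as at least one token.
    return math.ceil(len(text) / chars_per_token)


print(estimate_tokens("Hello world"))  # 11 characters / 4, rounded up -> 3
```

For billing-grade numbers, count tokens with your provider's own tokenizer or read the usage figures returned with each API response.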

Why is output more expensive than input?

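Generating text is more compute-intensive than reading it: the model produces its reply one token at a time, while the prompt can be processed in parallel. Providers therefore charge more per output token, often 3–5× the input price. Limiting reply length is a fast way to cut costs.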
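To see how much of a bill comes from replies, you can compute the output share of a request's cost. The sketch below assumes a hypothetical 4× price ratio ($1 input, $4 output per 1M tokens); `output_share` is an illustrative helper, not part of any provider's API.

```python
def output_share(input_tokens, output_tokens, input_price, output_price):
    """Fraction of a request's cost that comes from output tokens."""
    input_cost = input_tokens * input_price
    output_cost = output_tokens * output_price
    total = input_cost + output_cost
    return output_cost / total if total else 0.0


# Equal token counts at a 4x price ratio: output is most of the cost.
print(f"{output_share(1000, 1000, 1.00, 4.00):.0%}")  # 80%

# A long prompt with a short reply shifts the balance back toward input.
print(f"{output_share(2000, 250, 1.00, 4.00):.0%}")  # 33%
```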

How can I reduce my AI bill?

Shorten prompts, cap max output tokens, cache or reuse context, batch requests, and route simple tasks to a cheaper/smaller model. Even small per-request savings multiply at scale.
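A quick before-and-after comparison shows how per-request trimming adds up over a month. All figures here are hypothetical: the prices ($3 and $15 per 1M tokens), the token counts, and the 50,000 monthly requests are assumptions for illustration.

```python
def monthly_cost(input_tokens, output_tokens, input_price, output_price, requests):
    """Monthly spend in dollars. Prices are dollars per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) * requests / 1_000_000


# Baseline: 2,000-token prompts and 800-token replies.
baseline = monthly_cost(2000, 800, 3.00, 15.00, 50_000)

# After trimming the prompt to 1,500 tokens and capping replies at 400 tokens.
trimmed = monthly_cost(1500, 400, 3.00, 15.00, 50_000)

savings = baseline - trimmed
print(f"baseline ${baseline:,.2f}, trimmed ${trimmed:,.2f}, "
      f"saved ${savings:,.2f} ({savings / baseline:.0%})")
```

Under these assumptions the monthly bill drops from $900 to $525, a saving of roughly 42%, most of it from the shorter replies.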
