diff --git a/pages/generative-apis/faq.mdx b/pages/generative-apis/faq.mdx
index 0bec3dc647..75f50185b8 100644
--- a/pages/generative-apis/faq.mdx
+++ b/pages/generative-apis/faq.mdx
@@ -21,7 +21,43 @@ Our Generative APIs support a range of popular models, including:
 
 ## How does the free tier work?
 The free tier allows you to process up to 1,000,000 tokens without incurring any costs. After reaching this limit, you will be charged per million tokens processed. Free tier usage is calculated by adding all input and output tokens consumed from all models used.
-For more information, refer to our [pricing page](https://www.scaleway.com/en/pricing/model-as-a-service/#generative-apis).
+For more information, refer to our [pricing page](https://www.scaleway.com/en/pricing/model-as-a-service/#generative-apis), or access your bills by token type and model in the [billing section of the Scaleway console](https://console.scaleway.com/billing/payment) (both past bills and the provisional bill for the current month).
+
+Note that when your consumption exceeds the free tier, you will be billed for each additional token consumed, per model and token type. The minimum billing unit is 1 million tokens. Here are two examples of low-volume consumption:
+
+Example 1: Free Tier only
+
+| Model | Token type | Tokens consumed | Price | Bill |
+|-----------------|-----------------|-----------------|-----------------|-----------------|
+| `llama-3.3-70b-instruct` | Input | 500k | 0.90€/million tokens | 0.00€ |
+| `llama-3.3-70b-instruct` | Output | 200k | 0.90€/million tokens | 0.00€ |
+| `mistral-small-3.1-24b-instruct-2503` | Input | 100k | 0.15€/million tokens | 0.00€ |
+| `mistral-small-3.1-24b-instruct-2503` | Output | 100k | 0.35€/million tokens | 0.00€ |
+
+Total tokens consumed: `900k`
+Total bill: `0.00€`
+
+Example 2: Exceeding Free Tier
+
+| Model | Token type | Tokens consumed | Price | Billed consumption | Bill |
+|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|
+| `llama-3.3-70b-instruct` | Input | 800k | 0.90€/million tokens | 1 million tokens | 0.00€ (Free Tier application) |
+| `llama-3.3-70b-instruct` | Output | 2 500k | 0.90€/million tokens | 3 million tokens | 2.70€ |
+| `mistral-small-3.1-24b-instruct-2503` | Input | 100k | 0.15€/million tokens | 1 million tokens | 0.15€ |
+| `mistral-small-3.1-24b-instruct-2503` | Output | 100k | 0.35€/million tokens | 1 million tokens | 0.35€ |
+
+Total tokens consumed: `3 500k`
+Total billed consumption: `6 million tokens`
+Total bill: `3.20€`
+
+Note that in this example, the first line, where the free tier applies, will not be displayed by model in your current Scaleway bill, but will instead be listed under `Generative APIs Free Tier - First 1M tokens for free`.
+
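+The rounding and free-tier logic of Example 2 can be reproduced with a short script. The following Python sketch is purely illustrative (it is not Scaleway's billing implementation): the per-line prices come from the table above, and the free-tier flag on the first line is taken from the example rather than computed.
+
+```python
+import math
+
+# Per-line consumption from Example 2:
+# (model, token type, tokens consumed, price in €/million tokens, covered by free tier?)
+# The free-tier flag mirrors the table above; this sketch only illustrates
+# the "minimum billing unit is 1 million tokens" rounding rule.
+usage = [
+    ("llama-3.3-70b-instruct", "input", 800_000, 0.90, True),
+    ("llama-3.3-70b-instruct", "output", 2_500_000, 0.90, False),
+    ("mistral-small-3.1-24b-instruct-2503", "input", 100_000, 0.15, False),
+    ("mistral-small-3.1-24b-instruct-2503", "output", 100_000, 0.35, False),
+]
+
+total_bill = 0.0
+total_billed_millions = 0
+for model, token_type, tokens, price_per_million, free_tier in usage:
+    # Each (model, token type) line is rounded up to the next million tokens.
+    billed_millions = math.ceil(tokens / 1_000_000)
+    total_billed_millions += billed_millions
+    cost = 0.0 if free_tier else billed_millions * price_per_million
+    total_bill += cost
+    print(f"{model} ({token_type}): {billed_millions}M tokens billed -> {cost:.2f}€")
+
+print(f"Total billed consumption: {total_billed_millions} million tokens")  # 6 million tokens
+print(f"Total bill: {total_bill:.2f}€")                                      # 3.20€
+```
+
+Running it prints a total billed consumption of `6 million tokens` and a total bill of `3.20€`, matching the example above.
+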
+## What is a token and how is it counted?
+A token is the smallest unit of content that a model sees and processes. As a result, the definition of a token depends on the input type:
+- For text, on average, `1` token corresponds to `~4` characters, and thus to roughly `0.75` words (since words are on average around five characters long).
+- For images, `1` token corresponds to a square of pixels. For example, the [pixtral-12b-2409 model](https://www.scaleway.com/en/docs/managed-inference/reference-content/pixtral-12b-2409/#frequently-asked-questions) uses image tokens of `16x16` pixels (16 pixels high by 16 pixels wide, hence `256` pixels in total).
+
+The exact token count and definition depend on the [tokenizer](https://huggingface.co/learn/llm-course/en/chapter2/4) used by each model. When this difference is significant (such as for image processing), you can find detailed information in each model's documentation (for instance, in the [`pixtral-12b-2409` size limit documentation](https://www.scaleway.com/en/docs/managed-inference/reference-content/pixtral-12b-2409/#frequently-asked-questions)). Otherwise, for open models, you can find this information in the model files published on platforms such as Hugging Face, usually in the `tokenizer_config.json` file.
 
 ## How can I monitor my token consumption?
 You can see your token consumption in [Scaleway Cockpit](/cockpit/). You can access it from the Scaleway console under the [Metrics tab](https://console.scaleway.com/generative-api/metrics).
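+
+In addition to Cockpit metrics, you can estimate how many tokens a prompt will use before sending it by loading the model's tokenizer locally (see the `tokenizer_config.json` note above). The following is a minimal sketch assuming the `transformers` library is installed and the tokenizer is available on Hugging Face; the repository id below is only an example, and gated repositories may require authentication.
+
+```python
+# Illustrative only: count tokens locally with the model's own tokenizer.
+# Requires `pip install transformers`. The repository id is an example;
+# some repositories are gated and need a Hugging Face access token.
+from transformers import AutoTokenizer
+
+tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.3-70B-Instruct")
+
+prompt = "Tell me about the Generative APIs free tier."
+token_ids = tokenizer.encode(prompt)
+
+# Note: the server-side count can be slightly higher, since chat templates
+# add special tokens around your messages.
+print(f"{len(token_ids)} tokens for {len(prompt)} characters")
+```
+
+This only approximates input tokens; output tokens depend on the length of the model's response.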