Token Calculation Details
What is a Token
A Token is the smallest basic unit processed by large language models. Models do not process raw text directly; instead, they convert text into token sequences for processing.
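As a toy sketch of this round trip (the vocabulary and token IDs below are made up for illustration; real models use subword vocabularies with tens of thousands of entries), tokenization and detokenization are inverse lookups:

```python
# Toy vocabulary: made-up word → ID mapping (illustration only).
VOCAB = {"Today": 192, "is": 3847, "sunny": 2093}
INVERSE = {token_id: word for word, token_id in VOCAB.items()}

def encode(text):
    """Text → token sequence (whitespace split; real tokenizers use BPE subwords)."""
    return [VOCAB[word] for word in text.split()]

def decode(tokens):
    """Token sequence → text."""
    return " ".join(INVERSE[token_id] for token_id in tokens)

print(encode("Today is sunny"))   # [192, 3847, 2093]
print(decode([192, 3847, 2093]))  # Today is sunny
```

Real tokenizers split on subword units rather than whitespace, which is why one word can become several tokens.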
Tokenizer
The tokenizer is responsible for converting text into token sequences:
Input Text → Tokenizer → Token Sequence
"Today is sunny" → [192, 3847, 2093, 3847]
Detokenizer
The detokenizer is responsible for converting token sequences back into text:
Token Sequence → Detokenizer → Output Text
[192, 3847, 2093, 3847] → "Today is sunny"
Token Calculation Rules
Chinese and English Differences
| Text Type | Avg Tokens | Example |
|---|---|---|
| English word | 1 token ≈ 0.75 words | "hello" → 1 token |
| Chinese character | 1 char ≈ 1–2 tokens | "今天" → 2 tokens |
| Punctuation | 1 mark ≈ 1 token | "." → 1 token |
| Space | Usually merged into the following word's token | "hello world" → 2 tokens |
Practical Calculation Examples
English: "hello world"
Tokens: [15339, 1917]
Count: 2 tokens
Chinese: "今天天气很好"
Tokens: [192, 3847, 2093, 3847, 452, 2398]
Count: 6 tokens
Common Token Calculation Tools
Platform Built-in Calculation
The ai.TokenHub platform API response includes actual token consumption:
{
  "id": "chatcmpl-xxx",
  "model": "gpt-4o",
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 45,
    "total_tokens": 73
  }
}
Common Token Calculation Formulas
English Text
Token count ≈ characters ÷ 4
Token count ≈ words × 1.33
Chinese Text
Token count ≈ characters × 1.3 ~ 1.5
Online Calculation Tools
| Tool | URL | Description |
|---|---|---|
| OpenAI Tokenizer | platform.openai.com/tokenizer | Official tool |
| AiToken | aitoken.fly.dev | Online calculator |
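The formulas above can be wrapped into a rough local estimator. This is a heuristic sketch only (the function name and the 1.5 multiplier are illustrative choices within the stated 1.3–1.5 range); the usage field in the API response is always the authoritative count:

```python
def estimate_tokens(text):
    """Rough token estimate: CJK characters × 1.5, other characters ÷ 4."""
    cjk = sum(1 for ch in text if "\u4e00" <= ch <= "\u9fff")
    other = len(text) - cjk
    return round(cjk * 1.5 + other / 4)

print(estimate_tokens("hello world"))  # ≈ 3
print(estimate_tokens("今天天气很好"))  # ≈ 9
```

Expect errors of a token or two on short strings; the estimator is only useful for budgeting, not billing.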
ai.TokenHub Platform Token Billing
Billing Components
| Type | Description | Billing |
|---|---|---|
| Prompt Tokens | Input text consumption | ✅ Billable |
| Completion Tokens | Output text consumption | ✅ Billable |
| Total Tokens | Prompt + completion combined | = Sum of the above |
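Billing is driven by the usage object shown earlier. With hypothetical per-token prices (the rates below are invented for illustration and are not actual ai.TokenHub pricing), the cost of one request could be computed like this:

```python
# Hypothetical prices in dollars per 1,000 tokens (illustration only).
PRICE_PER_1K_PROMPT = 0.0025
PRICE_PER_1K_COMPLETION = 0.0100

def request_cost(usage):
    """Compute the cost of a request from the API's usage dict."""
    # total_tokens is always prompt + completion, not a third billable item.
    assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
    return (usage["prompt_tokens"] / 1000 * PRICE_PER_1K_PROMPT
            + usage["completion_tokens"] / 1000 * PRICE_PER_1K_COMPLETION)

usage = {"prompt_tokens": 28, "completion_tokens": 45, "total_tokens": 73}
print(f"${request_cost(usage):.6f}")  # $0.000520
```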
Billing Example
# User input
user_input = "Write a poem about spring"
# Assumed to be converted to 14 tokens
# Model output
model_output = "Spring breeze caresses, all things revive.\nFlowers bloom everywhere, green everywhere."
# Assumed to be converted to 28 tokens
# Total: 14 + 28 = 42 tokens
Platform Token Calculation Page
On ai.TokenHub dashboard's "Usage Statistics" page, you can view:
- Daily/weekly/monthly token consumption details
- Token consumption proportion by model
- Token consumption trend charts
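The views above boil down to grouping usage records by day and by model. A minimal sketch of that aggregation (the record fields are assumed for illustration, not the platform's actual export format):

```python
from collections import defaultdict

# Assumed record shape: (date, model, total_tokens).
records = [
    ("2024-05-01", "gpt-4o", 1200),
    ("2024-05-01", "claude-3-5-sonnet", 800),
    ("2024-05-02", "gpt-4o", 500),
]

by_day = defaultdict(int)    # daily consumption
by_model = defaultdict(int)  # per-model proportion
for date, model, tokens in records:
    by_day[date] += tokens
    by_model[model] += tokens

print(dict(by_day))    # {'2024-05-01': 2000, '2024-05-02': 500}
print(dict(by_model))  # {'gpt-4o': 1700, 'claude-3-5-sonnet': 800}
```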
How to Save Tokens
1. Streamline Prompts
# ❌ Not recommended - redundant description
messages = [
    {"role": "user", "content": "Please act as a very professional, experienced programmer to help me write a piece of code. This code's function is to calculate the sum of two numbers. Please write it in detail, including comments and error handling"}
]
# ✅ Recommended - concise and clear
messages = [
    {"role": "user", "content": "Write a Python addition function with error handling"}
]
2. Use System Messages
Put fixed role settings in system messages:
messages = [
    {"role": "system", "content": "You are a professional Python programmer"},
    {"role": "user", "content": "Write a sorting algorithm"}
]
3. Control Response Length
Specify response format and length:
{"role": "user", "content": "Explain quantum computing in one sentence"}
4. Multi-turn Conversation Optimization
# ❌ Repeating context every time
messages = [
    {"role": "user", "content": "Explain what machine learning is, including definition, principles, and application scenarios"},
    {"role": "assistant", "content": "Machine learning is..."},
    {"role": "user", "content": "Please explain in more detail the definition, principles, and application scenarios of machine learning"}
]
# ✅ Utilize historical context
messages = [
    {"role": "user", "content": "Explain what machine learning is, including definition, principles, and application scenarios"},
    {"role": "assistant", "content": "Machine learning is..."},
    {"role": "user", "content": "Give a specific example"}  # Automatically understands context
]
Tokens and Context Window
Token Count Affects Context
Context window: 4096 tokens
Input: 3000 tokens
Available output: 4096 - 3000 = 1096 tokens
Input: 4000 tokens
Available output: 4096 - 4000 = 96 tokens ⚠️ Almost no output
Platform Context Windows
| Model | Context Window | Description |
|---|---|---|
| GPT-4o | 128K Tokens | 128,000 |
| Claude 3.5 Sonnet | 200K Tokens | 200,000 |
| Gemini 3 Pro | 1M Tokens | 1,000,000 |
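The arithmetic above generalizes to a one-line helper (the function name is illustrative):

```python
def available_output_tokens(context_window, input_tokens):
    """Tokens left for the model's reply within the context window; never negative."""
    return max(0, context_window - input_tokens)

print(available_output_tokens(4096, 3000))  # 1096
print(available_output_tokens(4096, 4000))  # 96
```

If the input alone exceeds the window, nothing is left for output and the request will typically be rejected or truncated.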
FAQ
Q: Why is my token count high even though my input is short?
Possible reasons:
- Contains special characters or code
- Many Chinese punctuation marks
- Contains URLs or email addresses
Q: Can token counts vary between tools?
Different models use different tokenizers, so counts for the same text may differ slightly. Treat the usage field in the API response as authoritative.
Q: How do I count tokens accurately?
The most accurate way is to read the count from the API response:
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)
# Exact counts come back in the usage field
print(f"Total Tokens: {response.usage.total_tokens}")