Token Calculation Details

What is a Token

A Token is the smallest basic unit processed by large language models. Models do not process raw text directly; instead, they convert text into token sequences for processing.

Tokenizer

The tokenizer is responsible for converting text into token sequences:

Input Text → Tokenizer → Token Sequence
"Today is sunny" → [15496, 374, 40798]  (illustrative token IDs)

Detokenizer

The detokenizer is responsible for converting token sequences back into text:

Token Sequence → Detokenizer → Output Text
[15496, 374, 40798] → "Today is sunny"
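The round trip above can be sketched with a toy word-level tokenizer; the vocabulary and token IDs below are invented for illustration (real models use subword schemes such as BPE, but the encode/decode cycle works the same way):

```python
# Toy word-level tokenizer. Real tokenizers split on subwords, not words,
# and have vocabularies of ~100K entries; this one is invented for illustration.
VOCAB = {"Today": 15496, "is": 374, "sunny": 40798}
INVERSE = {token_id: word for word, token_id in VOCAB.items()}

def tokenize(text):
    """Convert text into a token (ID) sequence."""
    return [VOCAB[word] for word in text.split()]

def detokenize(tokens):
    """Convert a token sequence back into text."""
    return " ".join(INVERSE[t] for t in tokens)

tokens = tokenize("Today is sunny")
print(tokens)              # [15496, 374, 40798]
print(detokenize(tokens))  # Today is sunny
```

Note that each distinct token always maps to the same ID, which is why a deterministic round trip is possible.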

Token Calculation Rules

Chinese and English Differences

| Text type | Rough ratio | Example |
| --- | --- | --- |
| English word | 1 token ≈ 0.75 words | "hello" → 1 token |
| Chinese character | 1 character ≈ 1-2 tokens | "今天" → 2 tokens |
| Punctuation | 1 mark ≈ 1 token | "." → 1 token |
| Space | Usually merged into the following word's token | "hello world" → 2 tokens |

Practical Calculation Examples

English: "hello world"
Tokens: [15339, 1917]
Count: 2 tokens

Chinese: "今天天气很好"
Tokens: [192, 3847, 3847, 2093, 452, 2398]  (illustrative; the repeated character 天 maps to the same ID)
Count: 6 tokens

Common Token Calculation Tools

Platform Built-in Calculation

The ai.TokenHub platform API response includes actual token consumption:

json
{
  "id": "chatcmpl-xxx",
  "model": "gpt-4o",
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 45,
    "total_tokens": 73
  }
}
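Reading the usage block out of a parsed response is straightforward; a minimal sketch using the example payload above:

```python
import json

# Example response body as returned by the API (abridged).
raw = """
{
  "id": "chatcmpl-xxx",
  "model": "gpt-4o",
  "usage": {"prompt_tokens": 28, "completion_tokens": 45, "total_tokens": 73}
}
"""

response = json.loads(raw)
usage = response["usage"]
print(usage["total_tokens"])  # 73

# total_tokens is always the sum of the other two fields.
assert usage["total_tokens"] == usage["prompt_tokens"] + usage["completion_tokens"]
```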

Common Token Calculation Formulas

English Text

Token count ≈ characters ÷ 4
Token count ≈ words × 1.33

Chinese Text

Token count ≈ characters × 1.3 ~ 1.5
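Wrapped as helpers, the formulas above look like this (the ratios are the rough heuristics from this page, not exact tokenizer output):

```python
def estimate_tokens_english(text):
    """Rough English estimate: ~4 characters per token, or ~1.33 tokens per word."""
    by_chars = len(text) / 4
    by_words = len(text.split()) * 1.33
    # Average the two heuristics for a slightly more stable guess.
    return round((by_chars + by_words) / 2)

def estimate_tokens_chinese(text):
    """Rough Chinese estimate: ~1.3-1.5 tokens per character (midpoint used)."""
    return round(len(text) * 1.4)

print(estimate_tokens_english("hello world"))  # 3
print(estimate_tokens_chinese("今天天气很好"))  # 8
```

These are planning-level estimates only; the actual count depends on the model's tokenizer.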

Online Calculation Tools

| Tool | URL | Description |
| --- | --- | --- |
| OpenAI Tokenizer | platform.openai.com/tokenizer | Official tool |
| AiToken | aitoken.fly.dev | Online calculator |

ai.TokenHub Platform Token Billing

Billing Components

| Type | Description | Billing |
| --- | --- | --- |
| Prompt tokens | Input text consumption | ✅ Billable |
| Completion tokens | Output text consumption | ✅ Billable |
| Total tokens | Sum of prompt and completion tokens | Billed as that sum (not charged twice) |

Billing Example

python
# User input
user_input = "Write a poem about spring"
# Assume this converts to 14 tokens

# Model output
model_output = "Spring breeze caresses, all things revive.\nFlowers bloom everywhere, green everywhere."
# Assume this converts to 28 tokens

# Total: 14 + 28 = 42 tokens
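Given per-token prices, the cost of this exchange follows directly. The rates below are invented placeholders, not real ai.TokenHub prices; check the platform's pricing page for actual rates:

```python
# Hypothetical prices in USD per 1,000 tokens -- NOT real rates.
PROMPT_PRICE_PER_1K = 0.005
COMPLETION_PRICE_PER_1K = 0.015

def billing_cost(prompt_tokens, completion_tokens):
    """Cost = input tokens at the prompt rate + output tokens at the completion rate."""
    return (prompt_tokens / 1000 * PROMPT_PRICE_PER_1K
            + completion_tokens / 1000 * COMPLETION_PRICE_PER_1K)

# The example above: 14 prompt tokens + 28 completion tokens = 42 total.
cost = billing_cost(14, 28)
print(f"${cost:.6f}")  # $0.000490
```

Prompt and completion tokens are usually priced differently, which is why the two are tracked separately in the usage field.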

Platform Token Calculation Page

On the ai.TokenHub dashboard's "Usage Statistics" page, you can view:

  • Daily/weekly/monthly token consumption details
  • Token consumption proportion by model
  • Token consumption trend charts

How to Save Tokens

1. Streamline Prompts

python
# ❌ Not recommended - redundant description
messages = [
    {"role": "user", "content": "Please act as a very professional, experienced programmer to help me write a piece of code. This code's function is to calculate the sum of two numbers. Please write it in detail, including comments and error handling"}
]

# ✅ Recommended - concise and clear
messages = [
    {"role": "user", "content": "Write a Python addition function with error handling"}
]

2. Use System Messages

Put fixed role settings in system messages:

python
messages = [
    {"role": "system", "content": "You are a professional Python programmer"},
    {"role": "user", "content": "Write a sorting algorithm"}
]

3. Control Response Length

Specify response format and length:

python
{"role": "user", "content": "Explain quantum computing in one sentence"}

4. Multi-turn Conversation Optimization

python
# ❌ Repeating context every time
messages = [
    {"role": "user", "content": "Explain what machine learning is, including definition, principles, and application scenarios"},
    {"role": "assistant", "content": "Machine learning is..."},
    {"role": "user", "content": "Please explain in more detail the definition, principles, and application scenarios of machine learning"}
]

# ✅ Utilize historical context
messages = [
    {"role": "user", "content": "Explain what machine learning is, including definition, principles, and application scenarios"},
    {"role": "assistant", "content": "Machine learning is..."},
    {"role": "user", "content": "Give a specific example"}  # Automatically understands context
]
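One way to apply this in code is to keep the system message plus only the newest turns that fit a token budget; `estimate_tokens` below is a crude 4-characters-per-token stand-in for a real tokenizer:

```python
def estimate_tokens(message):
    """Crude heuristic: ~4 characters per token (stand-in for a real tokenizer)."""
    return max(1, len(message["content"]) // 4)

def trim_history(messages, budget):
    """Keep the system message plus the newest turns that fit in the token budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    kept, used = [], sum(estimate_tokens(m) for m in system)
    for m in reversed(rest):  # walk from the newest turn backwards
        cost = estimate_tokens(m)
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return system + list(reversed(kept))  # restore chronological order
```

Dropping the oldest turns first preserves the context the model most likely needs to answer the latest question.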

Tokens and Context Window

Token Count Affects Context

Context window: 4096 tokens

Input: 3000 tokens
Available output: 4096 - 3000 = 1096 tokens

Input: 4000 tokens
Available output: 4096 - 4000 = 96 tokens  ⚠️ Almost no output

Platform Context Windows

| Model | Context Window | Tokens |
| --- | --- | --- |
| GPT-4o | 128K | 128,000 |
| Claude 3.5 Sonnet | 200K | 200,000 |
| Gemini 3 Pro | 1M | 1,000,000 |

FAQ

Q: Why is my token count high even though my input is short?

Possible reasons:

  • Contains special characters or code
  • Many Chinese punctuation marks
  • Contains URLs or email addresses

Q: Why do token counts differ between tools?

Different models use different tokenizers, so counts for the same text can vary slightly. Treat the usage field in the API response as the authoritative figure.

Q: How to accurately calculate tokens?

The most accurate way is to get it from the API response:

python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}]
)

# Get accurate count from response
print(f"Total Tokens: {response.usage.total_tokens}")