Development

Token

The atomic unit of input and output in an LLM. Not a word or a character -- a chunk produced by the model's tokenizer, roughly 3-4 characters of English.

First published April 14, 2026

Everything an LLM sees and produces is tokens. Your prompt, its response, the context window limit, the per-request cost -- all measured in tokens. Different models use different tokenizers (BPE, SentencePiece, tiktoken variants), so "1000 tokens" on GPT is not identical to 1000 on Claude.

Rules of thumb for English prose: 1 token ≈ 0.75 words ≈ 4 characters. Code tokenizes more densely (more tokens per character) because of punctuation and identifiers. Whitespace counts. A 1000-word article ≈ 1300 tokens.

Example

# Approximating cost before you call
text = "your prompt or document here"
INPUT_PRICE_PER_M = 2.50  # USD per 1M input tokens; substitute your model's rate

word_count = len(text.split())
estimated_tokens = int(word_count * 1.3)  # English prose ≈ 1.3 tokens/word
estimated_cost_usd = estimated_tokens * (INPUT_PRICE_PER_M / 1_000_000)

# Exact count via the model's tokenizer (pip install tiktoken):
import tiktoken
enc = tiktoken.encoding_for_model("gpt-4o")
exact_tokens = len(enc.encode(text))
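The same estimate can drive a fit check before a request goes out. A minimal sketch using the ~1.3 tokens-per-word rule of thumb; MODEL_CONTEXT_LIMIT and RESPONSE_HEADROOM are illustrative values, not any model's real limits:

```python
MODEL_CONTEXT_LIMIT = 128_000   # hypothetical total-token limit for the model
RESPONSE_HEADROOM = 4_000       # tokens reserved for the model's reply

def fits_context(text: str, limit: int = MODEL_CONTEXT_LIMIT,
                 headroom: int = RESPONSE_HEADROOM) -> bool:
    """Approximate fit check: ~1.3 tokens per English word, plus reply headroom."""
    estimated_tokens = int(len(text.split()) * 1.3)
    return estimated_tokens + headroom <= limit

print(fits_context("a short prompt"))  # a small input easily fits
```

For anything close to the limit, swap the estimate for an exact tokenizer count; the approximation is only for cheap first-pass filtering.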

When to use it

  • Estimating cost before calling
  • Deciding whether content fits the context window
  • Budgeting prompts and retrievals in a RAG pipeline
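For the RAG case, a common pattern is to pack retrieved chunks greedily, highest-ranked first, until a token budget runs out. A minimal sketch using the ~4-characters-per-token approximation from above; the chunk contents and budget are made up for illustration:

```python
def approx_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per English token."""
    return max(1, len(text) // 4)

def pack_chunks(chunks: list[str], budget: int) -> list[str]:
    """Keep retrieved chunks (assumed sorted by relevance) within a token budget."""
    kept, used = [], 0
    for chunk in chunks:
        cost = approx_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept

# Illustrative: three retrieved passages, 20-token budget.
retrieved = ["alpha " * 10, "beta " * 10, "gamma " * 10]
print(len(pack_chunks(retrieved, budget=20)))  # only the top chunk fits
```

Stopping at the first over-budget chunk keeps the highest-ranked context intact; a real pipeline might instead truncate the last chunk or re-rank before packing.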

When NOT to use it

  • Exact counting when you don't have a tokenizer handy -- a rough character-count approximation is good enough for a first pass