Embedding Dimension
Optimal embedding dimensions for LLMs, RAG, classification, and search. Balance memory vs expressiveness using the Mikolov heuristic, MTEB benchmarks, and common model dimensions (e.g., OpenAI).
Why This ML Metric Matters
Why: Choosing the right dimension affects retrieval quality, memory footprint, and inference speed. Too low loses expressiveness; too high wastes memory.
How: The calculator applies Mikolov heuristic, purpose-based ranges (LLM/RAG/classification/search), and memory budget constraints.
Balance Memory vs Expressiveness for RAG, Search & Classification
Mikolov heuristic, MTEB benchmarks, OpenAI dimensions. Plan vector stores and choose models.
📊 Quick Examples
Chart: Memory vs Dimension (corpus: 100,000)
Chart: Common Model Dimensions
🤖 AI & ML Facts
Mikolov 2013: embedding dim ≈ √vocab balances expressiveness and overfitting
— Word2Vec
MTEB leaderboard ranks embedding models by retrieval, clustering, reranking
— MTEB
1M vectors × 1536 dims × 4 bytes ≈ 5.7 GiB for a float32 vector store
— Memory calc
Lower dim = faster similarity search (cosine, dot product)
— Performance
📋 Key Takeaways
• Heuristic: dim ≈ √vocab to 4·√vocab (Mikolov Word2Vec). Larger vocab → larger dim.
• RAG/search: 768–1536 typical. OpenAI text-embedding-3-small = 1536.
• Classification: 256–768 often sufficient. Higher dim = more expressiveness, more memory.
• Memory = corpus × dim × 4 bytes (float32). Plan vector store size accordingly.
• MTEB leaderboard lists each model's dimension alongside task scores; compare before choosing.
📖 How It Works
1. Heuristic (Mikolov)
Word2Vec rule of thumb: dim ≈ √vocab to 4·√vocab. Larger vocabularies need more dimensions to avoid collisions.
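A minimal Python sketch of this heuristic; the function name and the reading of the upper bound as 4·√vocab mirror this page's convention and are not code from Word2Vec itself:

```python
import math

def heuristic_dim_range(vocab_size: int) -> tuple[int, int]:
    """Rule of thumb: dim between sqrt(vocab) and 4*sqrt(vocab)."""
    low = math.isqrt(vocab_size)   # dim ≈ √vocab
    return low, 4 * low            # dim ≈ 4·√vocab

print(heuristic_dim_range(100_000))  # -> (316, 1264)
```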
2. Purpose-Based Ranges
LLM: 4096–16384. RAG: 768–3072. Classification: 128–768. Search: 384–1024. Based on MTEB and common models.
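As a sketch, these ranges can be encoded as a simple lookup; the `PURPOSE_RANGES` dict and `recommend_range()` helper are illustrative names, not the calculator's actual code:

```python
# Purpose -> (min_dim, max_dim), mirroring the ranges listed above.
PURPOSE_RANGES = {
    "llm":            (4096, 16384),
    "rag":            (768, 3072),
    "classification": (128, 768),
    "search":         (384, 1024),
}

def recommend_range(purpose: str) -> tuple[int, int]:
    return PURPOSE_RANGES[purpose.lower()]

print(recommend_range("rag"))  # -> (768, 3072)
```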
3. Memory Constraint
Memory = corpus × dim × 4 bytes. If budget is limited, reduce dim or corpus.
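A small sketch of this constraint, assuming float32 (4 bytes per value); `max_dim_for_budget` is a hypothetical helper that inverts the formula:

```python
def vector_store_bytes(corpus: int, dim: int, bytes_per_value: int = 4) -> int:
    """Memory = corpus × dim × bytes per value (float32 by default)."""
    return corpus * dim * bytes_per_value

def max_dim_for_budget(corpus: int, budget_gib: float, bytes_per_value: int = 4) -> int:
    """Largest dimension whose raw vectors fit the memory budget."""
    return int(budget_gib * 2**30 // (corpus * bytes_per_value))

print(vector_store_bytes(1_000_000, 1536) / 2**30)  # ≈ 5.72 GiB
print(max_dim_for_budget(1_000_000, 4.0))           # -> 1073
```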
4. Task Complexity
High complexity → higher dim. Low complexity → lower dim for faster inference.
🎯 Expert Tips
Check MTEB first
Compare models on retrieval, clustering, reranking before choosing dimension.
Memory planning
Vector store: N × d × 4 bytes. Add index overhead (HNSW, IVF) for production.
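A sketch of overhead-adjusted sizing; the 1.3× multiplier is an assumed midpoint of the ~20–50% HNSW/IVF range quoted in the FAQ below, not a measured constant:

```python
def production_bytes(corpus: int, dim: int,
                     index_overhead: float = 1.3,
                     bytes_per_value: int = 4) -> float:
    """Raw vector memory scaled by an estimated index overhead factor."""
    return corpus * dim * bytes_per_value * index_overhead

print(f"{production_bytes(1_000_000, 1536) / 2**30:.1f} GiB")  # ≈ 7.4 GiB
```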
RAG pipeline
Chunk size, overlap, and embedding model matter. 1536 (OpenAI text-embedding-3-small) is a solid default.
Quantization
int8 cuts memory to about a quarter of float32 (float16 halves it); minimal accuracy loss for many use cases.
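To make the dtype trade-off concrete, a quick footprint comparison (int8 stores 1 byte per value, so it is ~4× smaller than float32):

```python
BYTES_PER_VALUE = {"float32": 4, "float16": 2, "int8": 1}

corpus, dim = 1_000_000, 1536
for dtype, nbytes in BYTES_PER_VALUE.items():
    gib = corpus * dim * nbytes / 2**30
    print(f"{dtype:>7}: {gib:.2f} GiB")
# float32: 5.72 GiB | float16: 2.86 GiB | int8: 1.43 GiB
```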
⚖️ Practical Ranges by Purpose
| Purpose | Min | Typical | Max | Examples |
|---|---|---|---|---|
| LLM | 4096 | 12288 | 16384 | Llama 3, GPT-4 |
| RAG | 768 | 1536 | 3072 | OpenAI text-embedding-3 |
| Search | 384 | 768 | 1024 | sentence-BERT, E5 |
| Classification | 128 | 384 | 768 | Lightweight classifiers |
❓ Frequently Asked Questions
What is embedding dimension?
The size of the vector representing each token, sentence, or document. Higher dim = more expressiveness but more memory and compute.
Why sqrt(vocab) heuristic?
Mikolov's Word2Vec heuristic: dim ≈ √vocab balances capacity and overfitting. Too small → collisions; too large → overfitting and wasted memory.
RAG: 768 vs 1536?
1536 (OpenAI) often retrieves better; 768 (sentence-BERT) is cheaper and faster. Benchmark on your data with an MTEB-style eval.
How much memory for 1M vectors?
1M × 1536 × 4 bytes ≈ 5.7 GiB for float32; half that for float16. Add ~20–50% for HNSW/IVF index overhead.
LLM embedding vs hidden dim?
LLM hidden dim (e.g., 4096) is internal. Embedding table maps vocab→hidden. This calculator focuses on standalone embedding models.
When to use lower dimension?
Fast inference, limited memory, simple tasks (classification), edge deployment. Trade-off: some retrieval quality loss.
MTEB vs custom eval?
MTEB gives baselines. Always evaluate on your domain (e.g., legal, medical) for production decisions.
Quantization impact?
int8 stores 1 byte per value, roughly a 4× reduction from float32; typically <1% retrieval quality drop. Test on your data before deploying.
⚠️ Disclaimer: This calculator provides heuristic guidance for educational and planning purposes. Optimal dimension depends on your data, task, and model. Always benchmark on your domain (MTEB-style or custom). Memory estimates assume float32; float16/int8 reduce footprint. Production vector stores have index overhead (HNSW, IVF). Consult MTEB leaderboard and model documentation for production decisions.
Related Calculators
Cross-Validation Sample Size Calculator
Calculate minimum sample sizes for reliable k-fold cross-validation with stratification and class imbalance.
RAG Optimizer Calculator
Calculate chunk sizes, vector store memory, and token budgets for retrieval-augmented generation pipelines.
Training Data Size Estimator
How much training data do you need? Chinchilla ratios, fine-tuning guidelines, and LIMA insights for optimal data requirements.
Activation Memory Calculator
Estimate activation memory with and without gradient checkpointing. Based on NVIDIA selective recomputation research.
AI Fairness & Bias Calculator
Calculate demographic parity, equalized odds, equal opportunity, and disparate impact ratio. Based on IBM AIF360 and Microsoft Fairlearn.
Attention Head Configuration Calculator
Configure MHA, MQA, and GQA attention. Calculate head counts, dimensions, KV cache savings, and memory per attention type.