Vector Database -- Qurtoo Glossary

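Vector databases index embeddings so that a query like "the 10 most similar items to this vector" returns in milliseconds over millions of rows. They get that speed from approximate nearest neighbor (ANN) indexes, which give up a small amount of recall in exchange for not scanning every row. Common index types: HNSW (a hierarchical proximity graph), IVF (inverted file: vectors are clustered and only the closest clusters are searched), and ScaNN (Google's quantization-based ANN library). Choosing among them is a trade-off between recall, query latency, and memory use.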
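To make "most similar" concrete, here is a minimal sketch of the exact search that an ANN index approximates: score every vector against the query and keep the k closest. It uses cosine distance (1 minus cosine similarity), which is what the `<=>` operator in the pgvector example on this page orders by. The function names and the toy corpus are illustrative only, and the sketch assumes no vector is all zeros.

```python
import heapq
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def exact_top_k(query, items, k=10):
    """Linear scan: score every (id, vector) pair and return the k closest ids."""
    scored = ((cosine_distance(query, vec), item_id) for item_id, vec in items)
    return [item_id for _, item_id in heapq.nsmallest(k, scored)]


# Hypothetical toy corpus of 2-dimensional embeddings.
corpus = [
    ("a", [1.0, 0.0]),
    ("b", [0.9, 0.1]),
    ("c", [0.0, 1.0]),
    ("d", [-1.0, 0.0]),
]
nearest = exact_top_k([1.0, 0.0], corpus, k=2)  # ids of the two closest vectors
```

The cost of this scan grows linearly with the number of rows, which is why it is fine for a few thousand items and why graph or cluster indexes take over at larger sizes.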

Production-ready options in 2026: pgvector (a Postgres extension, most popular because it lives alongside your data), Pinecone, Weaviate, Qdrant, LanceDB, Milvus. For small corpora (under about 1M vectors), pgvector in Postgres is usually the right answer -- no new service, the same backup story, and SQL joins against your other data. Specialized databases earn their keep at 10M+ vectors, or when you need the operational features they ship.
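A rough way to see why corpus size drives the choice is to estimate the raw storage for the embeddings themselves. The sketch below assumes 3072-dimensional embeddings (the dimension used in the example on this page) and counts only the vector values, not index structures or row overhead, so real usage will be higher. The helper name is hypothetical.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Bytes needed for the embedding values alone (no index, no row overhead)."""
    return num_vectors * dimensions * bytes_per_value


GIB = 1024 ** 3

small = raw_vector_bytes(1_000_000, 3072)    # 1M vectors, float32
large = raw_vector_bytes(10_000_000, 3072)   # 10M vectors, float32
small_half = raw_vector_bytes(1_000_000, 3072, bytes_per_value=2)  # 1M vectors, float16

for label, size in [("1M float32", small), ("10M float32", large), ("1M float16", small_half)]:
    print(f"{label}: {size / GIB:.1f} GiB")
```

Under these assumptions, 1M float32 vectors come to roughly 11.4 GiB and 10M to roughly 114 GiB, which is about where a single general-purpose database instance starts to feel the memory pressure.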

Example

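-- pgvector example: create embedding column + HNSW index
-- HNSW indexes on the vector type are limited to 2,000 dimensions, so a
-- 3072-dimension embedding uses halfvec here (pgvector 0.7.0+, indexable up to 4,000).
CREATE EXTENSION IF NOT EXISTS vector;

ALTER TABLE docs ADD COLUMN embedding halfvec(3072);

CREATE INDEX ON docs USING hnsw (embedding halfvec_cosine_ops)
WITH (m = 16, ef_construction = 64);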

-- Query: 10 nearest neighbors of a given vector
SELECT id, title
FROM docs
ORDER BY embedding <=> $1
LIMIT 10;
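The `$1` parameter in the query above has to arrive as a vector value. One common approach, sketched here, is to send pgvector's text form (a bracketed, comma-separated list such as `[0.1,0.2,0.3]`) and cast it to the column's type in SQL. The helper name is hypothetical, and the finiteness check reflects the assumption that the database rejects NaN and infinite values.

```python
import math


def to_vector_literal(values):
    """Format a sequence of numbers as a bracketed text vector, e.g. '[0.1,0.2,0.3]'."""
    floats = [float(v) for v in values]
    if not floats:
        raise ValueError("embedding must have at least one dimension")
    if not all(math.isfinite(v) for v in floats):
        raise ValueError("embedding must not contain NaN or infinity")
    return "[" + ",".join(repr(v) for v in floats) + "]"


literal = to_vector_literal([0.25, -1, 3.5])  # bind this string as the query parameter
```

Client libraries often ship their own adapters that do this for you; the manual form is mainly useful when a driver has no pgvector support.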

When to use it

RAG retrieval backend
Similarity search at scale (duplicates, recommendations)
You already have Postgres and don't want a new service (use pgvector)

When NOT to use it

Your corpus fits in memory and a linear scan is fast enough (< 10k items)
Exact match / keyword search is the real requirement -- use BM25 or full-text search (FTS)