Definition

A vector database is a specialized data store designed to index, store, and query high-dimensional vector embeddings efficiently. Unlike traditional databases, which match exact values, a vector database finds the items most similar to a query vector under a similarity or distance measure such as cosine similarity or Euclidean distance. Approximate nearest neighbor (ANN) algorithms make this semantic search practical at scale by trading a small amount of accuracy for much faster lookups.
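As a rough illustration of the two measures named above, here is a minimal pure-Python sketch. The function names and the sample vectors are made up for this example; production systems compute these measures with optimized native code.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 = same direction, 0.0 = orthogonal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 = identical vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Illustrative 2-dimensional vectors; real embeddings have far more dimensions.
query = [1.0, 0.0]
same_direction = [2.0, 0.0]   # points the same way as query, but is longer
orthogonal = [0.0, 3.0]       # shares no direction with query

print(cosine_similarity(query, same_direction), euclidean_distance(query, same_direction))
print(cosine_similarity(query, orthogonal), euclidean_distance(query, orthogonal))
```

Cosine similarity ignores vector length, so `same_direction` counts as a perfect match for `query` even though the two vectors are a nonzero Euclidean distance apart. Which measure is appropriate generally depends on how the embedding model was trained.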

Why it matters

Vector databases are essential infrastructure for modern AI applications:

Semantic search — find conceptually similar content regardless of exact keywords
RAG systems — retrieve relevant context for language model responses
Recommendation engines — find similar products, content, or users
Anomaly detection — identify outliers in embedding space
Scalability — search very large collections (up to billions of vectors) with low latency by using approximate indexes

Any system that needs “similar” rather than “exact” answers is a candidate for vector search.

How it works

┌────────────────────────────────────────────────────────────┐
│                    VECTOR DATABASE                         │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  INDEXING:                                                 │
│  Documents → Embed → [0.1, 0.4, ...] → Index (HNSW/IVF)   │
│                                                            │
│  ┌─────────────────────────────────────────┐               │
│  │  Vector Index (graphs/clusters)         │               │
│  │    ●──●──●                              │               │
│  │   /│\   \                               │               │
│  │  ● ● ●   ●──●                           │               │
│  └─────────────────────────────────────────┘               │
│                                                            │
│  QUERYING:                                                 │
│  Query → Embed → [0.2, 0.3, ...] → ANN Search → Top K     │
│                                            │               │
│                                            ▼               │
│                                    Similar documents       │
└────────────────────────────────────────────────────────────┘

Embedding — documents are converted to vectors by an embedding model
Indexing — vectors are organized into efficient search structures (HNSW, IVF, etc.)
Query embedding — the search query is converted to a vector by the same embedding model, so it lies in the same vector space
ANN search — the index is traversed to find approximate nearest neighbors
Results — the top K most similar vectors are returned with their metadata
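The five steps above can be sketched end to end in a few lines of Python. This is a toy under loud assumptions: `toy_embed` is a hypothetical stand-in for a real embedding model (it only hashes words into buckets, so it captures word overlap, not meaning), `TinyVectorStore` is an invented name, and the search is an exact brute-force scan rather than a true ANN index.

```python
import math
import zlib

DIM = 64  # toy dimensionality, chosen arbitrarily for this sketch

def toy_embed(text):
    """Hypothetical stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0  # avoid dividing by zero for empty text
    return [x / norm for x in vec]

class TinyVectorStore:
    """Steps 1-2: embed documents and keep them in a flat list (no real index)."""

    def __init__(self):
        self.rows = []  # list of (vector, metadata) pairs

    def add(self, text, metadata=None):
        self.rows.append((toy_embed(text), {"text": text, **(metadata or {})}))

    def search(self, query, k=3):
        """Steps 3-5: embed the query, score every row, return the top k with metadata."""
        q = toy_embed(query)
        # Vectors are unit length, so the dot product equals cosine similarity.
        scored = [(sum(a * b for a, b in zip(q, vec)), meta) for vec, meta in self.rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]

store = TinyVectorStore()
store.add("chocolate cake recipe", {"category": "food"})
store.add("contract termination clause", {"category": "legal"})
store.add("employment contract law", {"category": "legal"})

for score, meta in store.search("chocolate cake recipe", k=2):
    print(round(score, 3), meta["text"])
```

The brute-force scan compares the query against every stored vector, which is what an ANN index such as HNSW or IVF is designed to avoid on large collections.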

Common questions

Q: How is a vector database different from a traditional database?

A: Traditional databases are built around exact-match and range predicates (WHERE name = 'John'). Vector databases use similarity search: they find the vectors closest to your query in high-dimensional space. They complement rather than replace traditional databases.

Q: What indexing algorithms are used?

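A: Common techniques include HNSW (Hierarchical Navigable Small World), a graph-based index; IVF (Inverted File Index), a cluster-based index; and PQ (Product Quantization), which compresses vectors and is often combined with IVF. HNSW is widely used for its speed/accuracy tradeoff.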
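To make the cluster-based idea concrete, here is a minimal IVF-style sketch. It assumes the centroids have already been learned (for example by k-means, which is omitted here), and the class name `TinyIVF` and the `nprobe` parameter name are illustrative choices for this example, not the API of any particular library.

```python
import math

def dist(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class TinyIVF:
    """Toy inverted-file index: each vector is stored in the list of its nearest centroid."""

    def __init__(self, centroids):
        self.centroids = centroids              # assumed to be pre-trained
        self.lists = [[] for _ in centroids]    # one inverted list per centroid

    def _nearest_centroids(self, vec, n):
        order = sorted(range(len(self.centroids)),
                       key=lambda i: dist(vec, self.centroids[i]))
        return order[:n]

    def add(self, item_id, vec):
        cell = self._nearest_centroids(vec, 1)[0]
        self.lists[cell].append((item_id, vec))

    def search(self, query, k=1, nprobe=1):
        """Scan only the nprobe inverted lists whose centroids are closest to the query."""
        candidates = []
        for cell in self._nearest_centroids(query, nprobe):
            candidates.extend(self.lists[cell])
        candidates.sort(key=lambda item: dist(query, item[1]))
        return [item_id for item_id, _ in candidates[:k]]

index = TinyIVF(centroids=[[0.0, 0.0], [10.0, 0.0]])
index.add("a", [4.9, 0.0])   # lands in the first cell
index.add("b", [5.4, 0.0])   # lands in the second cell
index.add("c", [0.5, 0.0])
index.add("d", [9.0, 0.0])

query = [5.1, 0.0]
print(index.search(query, k=1, nprobe=1))  # probes one cell only
print(index.search(query, k=1, nprobe=2))  # probes both cells
```

With `nprobe=1` the search returns "b" even though "a" is slightly closer to the query, because "a" sits just across the cell boundary. Probing more cells recovers the true nearest neighbor at the cost of scanning more vectors, which is the speed/accuracy tradeoff mentioned above.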

Q: What are popular vector databases?

A: Purpose-built options include Pinecone, Weaviate, Milvus, Qdrant, and Chroma. General-purpose systems such as PostgreSQL (with the pgvector extension) and Elasticsearch also support vector search.

Q: How do I handle metadata filtering?

A: Most vector databases can combine vector similarity with metadata filters (e.g., “similar to query AND category = ‘legal’”), a pattern often called filtered search. This is critical for production RAG systems.
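One simple way to picture this is pre-filtering: keep only the rows whose metadata matches, then rank the survivors by similarity. The sketch below assumes a hypothetical row layout (`id`, `vector`, `metadata`) and equality-only filters. Real systems expose their own filter syntax and may apply the filter during index traversal instead.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def filtered_search(rows, query_vec, where, k=3):
    """Pre-filter rows by exact metadata equality, then rank the matches by similarity."""
    matches = [
        row for row in rows
        if all(row["metadata"].get(key) == value for key, value in where.items())
    ]
    matches.sort(key=lambda row: cosine(query_vec, row["vector"]), reverse=True)
    return [row["id"] for row in matches[:k]]

# Hypothetical rows with hand-written 2-dimensional vectors.
rows = [
    {"id": "nda",   "vector": [0.9, 0.1], "metadata": {"category": "legal"}},
    {"id": "lease", "vector": [0.2, 0.8], "metadata": {"category": "legal"}},
    {"id": "blog",  "vector": [1.0, 0.0], "metadata": {"category": "marketing"}},
]

query_vec = [1.0, 0.0]
print(filtered_search(rows, query_vec, {"category": "legal"}, k=2))
print(filtered_search(rows, query_vec, {}, k=1))
```

Without the filter, the marketing post is the closest match. With `category = 'legal'` it is excluded before ranking, so only legal documents can be returned.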

Related terms

Embeddings — the vectors stored in vector databases
RAG — retrieval architecture powered by vector search
Semantic Similarity — the measure used for ranking
Approximate Nearest Neighbor — the core search technique

References

Johnson et al. (2019), “Billion-scale similarity search with GPUs”, IEEE TBD. [2,600+ citations]

Malkov & Yashunin (2020), “Efficient and robust approximate nearest neighbor search using HNSW graphs”, IEEE TPAMI. [1,800+ citations]

Pan et al. (2024), “Vector Database Management Systems: Fundamental Concepts, Use-Cases, and Current Challenges”, arXiv. [100+ citations]
