Definition
Cosine similarity measures how similar two vectors are by computing the cosine of the angle between them. Values range from -1 (opposite directions) to 1 (identical direction), with 0 indicating orthogonal vectors (no similarity). It’s the most common metric for comparing text embeddings because it focuses on direction (meaning) rather than magnitude (length).
Why it matters
Cosine similarity is foundational to modern AI search and retrieval:
- Direction over magnitude — captures semantic orientation, not vector length
- Normalized comparison — works with embeddings of different magnitudes
- Efficiency — fast to compute, especially with optimized libraries
- Interpretability — easy to understand: 1 = same, 0 = unrelated, -1 = opposite
- Standard metric — default in most vector databases and embedding APIs
It enables meaningful comparison of text, images, and other embedded representations.
How it works
┌────────────────────────────────────────────────────────────┐
│ COSINE SIMILARITY │
├────────────────────────────────────────────────────────────┤
│ │
│ FORMULA: cos(θ) = (A · B) / (||A|| × ||B||) │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ │ │
│ │ B │ │
│ │ / │ │
│ │ / θ = angle │ │
│ │ / │ │
│ │ / │ │
│ │ ──────────────► A │ │
│ │ │ │
│ │ cos(0°) = 1.0 → Identical direction │ │
│ │ cos(45°) ≈ 0.71 → Similar direction │ │
│ │ cos(90°) = 0.0 → Orthogonal (unrelated) │ │
│ │ cos(180°) = -1.0 → Opposite direction │ │
│ │ │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ EXAMPLE CALCULATION: │
│ Vector A = [0.8, 0.6] │
│ Vector B = [0.6, 0.8] │
│ │
│ A · B = (0.8 × 0.6) + (0.6 × 0.8) = 0.96 │
│ ||A|| = √(0.8² + 0.6²) = 1.0 │
│ ||B|| = √(0.6² + 0.8²) = 1.0 │
│ │
│ cos(θ) = 0.96 / (1.0 × 1.0) = 0.96 → Very similar │
│ │
└────────────────────────────────────────────────────────────┘
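As a quick check of the arithmetic above, here is a minimal NumPy sketch (the vector values come from the example; the function name is illustrative):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between vectors a and b."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

A = np.array([0.8, 0.6])
B = np.array([0.6, 0.8])

print(cosine_similarity(A, B))  # ≈ 0.96 -> very similar
```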
Comparison with other metrics:
| Metric | Formula | When to use |
|---|---|---|
| Cosine similarity | (A · B) / (||A|| × ||B||) | Text similarity where magnitude should not matter |
| Euclidean distance | √Σ(aᵢ - bᵢ)² | When absolute distances and magnitudes matter |
| Dot Product | Σ(aᵢ × bᵢ) | Pre-normalized vectors (equals cosine, cheaper to compute) |
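To make the table concrete, a small sketch (NumPy only; values reuse the worked example above) computes all three metrics on the same pair of unit vectors:

```python
import numpy as np

a = np.array([0.8, 0.6])  # unit vector (||a|| = 1)
b = np.array([0.6, 0.8])  # unit vector (||b|| = 1)

cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))  # ≈ 0.96
euclidean = np.linalg.norm(a - b)                                # ≈ 0.28
dot = np.dot(a, b)                                               # ≈ 0.96, equals cosine here

print(cosine, euclidean, dot)
```

Because both vectors are already unit length, the dot product equals the cosine similarity, which is why pre-normalizing embeddings lets vector databases fall back to the cheaper dot product.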
Common questions
Q: Why use cosine similarity instead of Euclidean distance?
A: Cosine ignores vector magnitude, focusing only on direction. Two documents about “tax law” should be similar even if one is longer and has a larger embedding magnitude. Cosine captures this; Euclidean treats them as more different.
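A toy illustration of that point (purely synthetic 2-D vectors; real embeddings have hundreds or thousands of dimensions):

```python
import numpy as np

doc = np.array([0.8, 0.6])
longer_doc = 3 * doc  # same direction (same topic), three times the magnitude

def cos_sim(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos_sim(doc, longer_doc))          # 1.0 -> identical by cosine
print(np.linalg.norm(doc - longer_doc))  # 2.0 -> far apart by Euclidean distance
```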
Q: What does a cosine similarity of 0.8 mean?
A: The vectors point in nearly the same direction—they’re semantically similar. For text embeddings, 0.8+ typically indicates strong relevance. However, thresholds vary by model; calibrate with your data.
Q: Can cosine similarity be negative?
A: Yes, when vectors point in opposite directions. In practice, negative values are rare with text embeddings: unrelated texts usually end up roughly orthogonal, so a value near 0, rather than a negative one, is the typical signal for unrelated content.
Q: Is cosine similarity the same as cosine distance?
A: They’re complements: cosine distance = 1 - cosine similarity. Databases often report “distance” (lower = more similar) while APIs report “similarity” (higher = more similar). Check your tool’s convention.
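One concrete convention worth knowing: SciPy’s `scipy.spatial.distance.cosine` returns cosine distance, so similarity is recovered with `1 - distance` (a minimal sketch):

```python
import numpy as np
from scipy.spatial.distance import cosine  # note: this function returns cosine *distance*

a = np.array([0.8, 0.6])
b = np.array([0.6, 0.8])

distance = cosine(a, b)      # ≈ 0.04
similarity = 1.0 - distance  # ≈ 0.96

print(distance, similarity)
```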
Related terms
- Embeddings — vectors that cosine similarity compares
- Semantic Similarity — concept measured by cosine similarity
- Vector Database — uses cosine for nearest-neighbor search
- Semantic Search — retrieval powered by cosine comparisons