Definition

Full-text search is keyword-based retrieval over text fields such as titles, body text, and metadata. Most systems build an inverted index that maps terms to the documents (and positions) where they occur, enabling fast lookups and scoring.

Why it matters

Precision: exact terms, phrases, and filters are often critical in legal and tax content.
Speed: an inverted index answers a query by reading only the posting lists of the query terms, so lookups stay fast on large corpora.
Transparency: you can explain why a result matched (terms, fields, boosts).
Control: supports field weighting, phrase matching, and Boolean operators.

How it works

Text -> tokenize/normalize -> inverted index -> query -> scoring -> ranked results
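
The pipeline above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular search engine implements it: the tokenizer, the function names (tokenize, build_index, search), the sample documents, and the simple TF-IDF-style score are all assumptions made for the example.

```python
import math
import re
from collections import defaultdict


def tokenize(text):
    """Tokenize/normalize: lowercase, then split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Inverted index: term -> {doc_id: [positions where the term occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def search(index, n_docs, query):
    """Score each document by summing term frequency * IDF over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur anywhere in the corpus
        idf = math.log(1 + n_docs / len(postings))  # rarer terms weigh more
        for doc_id, positions in postings.items():
            scores[doc_id] += len(positions) * idf
    # Highest score first; ties broken by doc id for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample corpus.
docs = {
    1: "Withholding tax on dividends in Belgium",
    2: "Corporate income tax rates",
    3: "Belgium VAT registration",
}
index = build_index(docs)
results = search(index, len(docs), "withholding tax")
print(results)  # document 1 ranks above document 2; document 3 is not returned
```

Document 1 ranks first because it contains both query terms, and the rarer term ("withholding") contributes more to the score than the common one ("tax"). Production engines typically use more refined scoring, but the flow from text to ranked results is the same.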

Practical example

Searching for "withholding tax" AND Belgium returns only documents that contain that exact phrase and the jurisdiction term. This holds even where a semantic model would drift toward related but different content.
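
A query of this shape can be evaluated against a positional index roughly as follows. This is a sketch under assumptions: the helper names (build_positional_index, phrase_docs, term_docs) and the three sample documents are hypothetical, and real engines expose this through their own query syntax.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_positional_index(docs):
    """term -> {doc_id: [positions]}; positions are needed for phrase matching."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return index


def phrase_docs(index, phrase):
    """Return ids of documents where the phrase terms occur at consecutive positions."""
    terms = tokenize(phrase)
    if not terms:
        return set()
    # Only documents containing every term can contain the phrase.
    candidates = set(index.get(terms[0], {}))
    for term in terms[1:]:
        candidates &= set(index.get(term, {}))
    matches = set()
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(terms)):
                matches.add(doc_id)
                break
    return matches


def term_docs(index, term):
    """Return ids of documents containing a single term."""
    return set(index.get(term.lower(), {}))


# Hypothetical sample corpus.
docs = {
    1: "Belgium applies a withholding tax on dividends.",
    2: "Tax authorities in Belgium discussed withholding of payments.",
    3: "Withholding tax rules in the Netherlands.",
}
index = build_positional_index(docs)

# "withholding tax" AND Belgium: Boolean AND is a set intersection.
hits = phrase_docs(index, "withholding tax") & term_docs(index, "Belgium")
print(hits)  # {1}
```

Document 2 contains all three words but not the phrase, and document 3 contains the phrase but not the jurisdiction, so only document 1 is returned.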

Common questions

Q: Is full-text search the same as semantic search?

A: No. Full-text search matches tokens/phrases. Semantic search matches meaning (often via embeddings). Many systems combine both (hybrid search).

Q: Why do I sometimes miss results that “should” match?

A: Common causes are analyzer choices (stemming, stopwords) that make query terms differ from indexed terms, fields that are missing from the index or the query, or content that was never indexed because of the indexing strategy.
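
The analyzer-related causes can be reproduced with a toy example. Everything here is an assumption for illustration: the stopword list, the crude plural-stripping toy_stem function (real stemmers are far more careful), and the sample document.

```python
import re

# Illustrative stopword list; not taken from any specific engine.
STOPWORDS = {"a", "an", "in", "of", "on", "the"}


def toy_stem(token):
    """Crude plural stripping, used here only to show the effect of stemming."""
    if token.endswith("es") and len(token) > 4:
        return token[:-2]
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text, stem=False, drop_stopwords=False):
    """Turn text into index or query terms under the chosen analyzer settings."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    if drop_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    if stem:
        tokens = [toy_stem(t) for t in tokens]
    return tokens


def matches(doc_tokens, query_tokens):
    """A document matches when it contains every query term."""
    return all(q in doc_tokens for q in query_tokens)


doc = "Withholding tax on dividends"

# No stemming: the query "taxes" does not match the indexed term "tax".
print(matches(analyze(doc), analyze("taxes")))  # False

# Same stemming at index and query time: "taxes" and "tax" both become "tax".
print(matches(analyze(doc, stem=True), analyze("taxes", stem=True)))  # True

# Stemming at index time only: the index holds "dividend", the query asks for "dividends".
print(matches(analyze(doc, stem=True), analyze("dividends")))  # False

# Stopwords dropped at index time: the term "on" was never indexed.
print(matches(analyze(doc, drop_stopwords=True), analyze("on")))  # False
```

The two middle cases show why the same analyzer is usually applied at index time and at query time: a mismatch makes terms that "should" match look different to the index.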

Related terms

Indexing Strategy - decide what and how to index
Boolean Search - strict AND/OR/NOT logic
Relevance Tuning - improve ranking quality
Content Discoverability - ensure content can be found and indexed
