Definition
Full-text search is keyword-based retrieval over text fields such as titles, body text, and metadata. Most systems build an inverted index that maps terms to the documents (and positions) where they occur, enabling fast lookups and scoring.
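To make the data structure concrete, here is a minimal sketch of a positional inverted index. The function name `build_index`, the sample documents, and the whitespace-only tokenization are illustrative assumptions, not part of any particular search engine.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to {doc_id: [positions where the term occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Simplification: lowercase and split on whitespace only.
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(doc_id, []).append(pos)
    return index


# Hypothetical sample documents.
docs = {
    1: "withholding tax on dividends",
    2: "tax treaty with Belgium",
}
index = build_index(docs)

# Looking up a term returns its postings without scanning the documents.
tax_postings = index["tax"]
```

Storing positions as well as document ids is what later makes phrase matching possible.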
Why it matters
- Precision: exact terms, phrases, and filters are often critical in legal and tax content.
- Speed: a query reads only the posting lists for its terms instead of scanning every document, so inverted indexes scale well to large corpora.
- Transparency: you can explain why a result matched (terms, fields, boosts).
- Control: supports field weighting, phrase matching, and Boolean operators.
How it works
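Text -> tokenize/normalize -> inverted index -> query -> scoring -> ranked results
At query time the query text is tokenized and normalized, usually with the same analyzer used at index time. Its terms are then looked up in the index, and the matching documents are scored and sorted.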
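The pipeline can be sketched end to end in a few lines. This is a simplified illustration that assumes a regex tokenizer and a basic TF-IDF score. Production engines typically use more refined ranking functions such as BM25, and every name here (`tokenize`, `build_index`, `search`) is hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Normalize to lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Return term -> {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return index


def search(index, n_docs, query):
    """Rank documents by summed TF-IDF over the query terms."""
    scores = defaultdict(float)
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # the term does not occur in any document
        idf = math.log(n_docs / len(postings)) + 1.0
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Highest score first; ties are broken by document id.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical sample corpus.
docs = {
    1: "Withholding tax on dividends paid to non-residents.",
    2: "Corporate income tax rates in Belgium.",
    3: "Belgium: withholding tax on interest and royalties.",
}
index = build_index(docs)
results = search(index, len(docs), "withholding tax Belgium")
```

Document 3 ranks first because it is the only one that contains all three query terms.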
Practical example
The query "withholding tax" AND Belgium returns only documents that contain the exact phrase "withholding tax" and the term "Belgium". A semantic model, by contrast, may also return documents about related topics, such as other taxes or other jurisdictions.
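A query like this one can be evaluated with a positional index: the phrase must occur at consecutive positions, and AND becomes a set intersection. The sketch below is a minimal illustration; the helper names and sample documents are assumptions made for the example.

```python
import re


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_positional_index(docs):
    """Return term -> {doc_id: [positions]}."""
    index = {}
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index.setdefault(term, {}).setdefault(doc_id, []).append(pos)
    return index


def phrase_docs(index, phrase):
    """Return ids of documents where the phrase terms are consecutive."""
    terms = tokenize(phrase)
    if not terms or any(t not in index for t in terms):
        return set()
    # Only documents containing every term can contain the phrase.
    candidates = set.intersection(*(set(index[t]) for t in terms))
    hits = set()
    for doc_id in candidates:
        for start in index[terms[0]][doc_id]:
            if all(start + i in index[t][doc_id] for i, t in enumerate(terms)):
                hits.add(doc_id)
                break
    return hits


# Hypothetical sample documents.
docs = {
    1: "Belgium levies a withholding tax on dividends.",
    2: "Tax withholding rules differ in Belgium.",
    3: "Withholding tax applies in the Netherlands.",
}
index = build_positional_index(docs)

# "withholding tax" AND Belgium
matches = phrase_docs(index, "withholding tax") & phrase_docs(index, "Belgium")
```

Document 2 contains both words but in the wrong order, and document 3 lacks the jurisdiction, so only document 1 matches.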
Common questions
Q: Is full-text search the same as semantic search?
A: No. Full-text search matches tokens/phrases. Semantic search matches meaning (often via embeddings). Many systems combine both (hybrid search).
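One common way to combine the two result lists in hybrid search is reciprocal rank fusion (RRF), which needs only the rank positions and not the raw scores. The sketch below assumes two already-ranked lists of hypothetical document ids and the commonly used constant k = 60. It is an illustration of one option, not the method any specific system uses.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each list adds 1 / (k + rank) to a document's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical outputs of a keyword search and a semantic search.
keyword_ranking = ["doc3", "doc1", "doc2"]
semantic_ranking = ["doc1", "doc4", "doc3"]

fused = reciprocal_rank_fusion([keyword_ranking, semantic_ranking])
```

Documents that appear near the top of both lists rise to the top of the fused list.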
Q: Why do I sometimes miss results that “should” match?
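A: Common causes are analyzer choices (for example, stemming or stopword removal applied differently at index time and query time), fields that were not indexed or not included in the query, or content that was never indexed because of the indexing strategy.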
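The effect of analyzer choices can be shown with a small sketch. It compares a plain analyzer with one that removes stopwords and applies a deliberately crude suffix-stripping stemmer. The stemmer, the stopword list, and all function names are toy assumptions for illustration, not a real stemming algorithm.

```python
import re

# Illustrative stopword list, far smaller than those used in practice.
STOPWORDS = {"the", "on", "of", "in", "a"}


def plain_analyzer(text):
    """Lowercase and split on non-alphanumeric characters; no stemming."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_stem(term):
    """Crude plural stripping, for illustration only."""
    if len(term) > 4 and term.endswith("es"):
        return term[:-2]
    if len(term) > 3 and term.endswith("s"):
        return term[:-1]
    return term


def stemming_analyzer(text):
    """Drop stopwords, then stem the remaining terms."""
    return [toy_stem(t) for t in plain_analyzer(text) if t not in STOPWORDS]


def matches(doc, query, analyzer):
    """True if every analyzed query term occurs in the analyzed document."""
    doc_terms = set(analyzer(doc))
    return all(t in doc_terms for t in analyzer(query))


doc = "Withholding taxes on dividends"

# Without stemming, the query term "tax" does not equal the indexed "taxes".
miss = matches(doc, "withholding tax", plain_analyzer)
# With the same stemming applied to both sides, "taxes" and "tax" agree.
hit = matches(doc, "withholding tax", stemming_analyzer)
```

The same document is missed or found depending only on how the text and the query are analyzed.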
Related terms
- Indexing Strategy - decide what and how to index
- Boolean Search - strict AND/OR/NOT logic
- Relevance Tuning - improve ranking quality
- Content Discoverability - ensure content can be found and indexed
References
Manning, Raghavan & Schütze (2008), Introduction to Information Retrieval.