How It Works
From question to cited answer.
Not a chatbot with a legal plug-in. A purpose-built retrieval system that searches Belgian tax law, ranks sources by authority, and shows you exactly what supports every answer.
The retrieval pipeline
Every question goes through five stages before you see a single word. Each stage reduces noise and increases precision.
Your question
You ask in plain language — Dutch, French, or English. The system parses your intent, identifies the tax domain, and determines the relevant time period.
Hybrid search
Two search strategies run in parallel: BM25 (exact legal term matching) and vector embeddings (semantic meaning). Results are merged using Reciprocal Rank Fusion — so you find relevant sources even when you use different words than the legislation.
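To make the merging step concrete, here is a minimal Python sketch of Reciprocal Rank Fusion. It is an illustration, not Auryth TX's actual code: the function name, the document IDs, and the constant k = 60 (a value commonly used for RRF) are all assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both searches rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical document IDs, standing in for real corpus entries.
bm25_hits = ["art_215_wib92", "circular_2018_c_12", "art_216_wib92"]
vector_hits = ["ruling_2019_0412", "art_215_wib92", "circular_2018_c_12"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

A document that appears in both lists, even in second or third place, can outrank one that tops only a single list. That is why a differently worded question can still surface the right provision.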
Authority ranking
Results are reranked by legal weight. A Court of Cassation judgment outranks an administrative circular. The Constitution outranks everything. This mirrors how a tax professional actually reasons.
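One simple way to picture authority ranking is a weight per source type. The Python sketch below is an assumption-heavy illustration: only the relative order (Constitution above a Court of Cassation judgment, which is above an administrative circular) comes from the description here. The numeric weights, the tier names, and the choice to multiply weight by retrieval score are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights: only their relative order reflects the text above.
AUTHORITY_WEIGHT = {
    "constitution": 1.0,
    "cassation_judgment": 0.8,
    "administrative_circular": 0.5,
}


@dataclass(frozen=True)
class Hit:
    doc_id: str
    tier: str
    retrieval_score: float  # e.g. a normalized score from hybrid search


def rank_by_authority(hits):
    """Order hits by retrieval score scaled by the weight of their tier."""
    return sorted(
        hits,
        key=lambda h: h.retrieval_score * AUTHORITY_WEIGHT[h.tier],
        reverse=True,
    )


# Made-up hits: the circular matches best on text alone.
hits = [
    Hit("circular_a", "administrative_circular", 0.90),
    Hit("cassation_b", "cassation_judgment", 0.70),
    Hit("constitution_c", "constitution", 0.60),
]
ranked_ids = [h.doc_id for h in rank_by_authority(hits)]
print(ranked_ids)
```

In this toy example the circular has the best text match, yet it ends up last once legal weight is taken into account.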
Cross-encoder reranking
A specialized model reads each source alongside your question and scores true relevance — catching false positives that keyword matching alone would miss.
Structured answer
The AI generates a research card with citations, confidence score, temporal scope, and jurisdiction — not a wall of chat text. Every claim is tied to a retrievable source.
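A research card of this kind could be modeled roughly as follows. This Python sketch only mirrors the fields named above (citations, confidence score, temporal scope, jurisdiction). The class names, field types, and every example value are placeholders, not the product's real schema or output.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Citation:
    source_id: str  # hypothetical identifier of a retrievable source
    locator: str    # where in the source, e.g. article and paragraph


@dataclass(frozen=True)
class Claim:
    text: str
    citations: Tuple[Citation, ...]


@dataclass(frozen=True)
class ResearchCard:
    question: str
    claims: Tuple[Claim, ...]
    confidence: float
    temporal_scope: str
    jurisdiction: str

    def uncited_claims(self) -> List[Claim]:
        """Return claims with no citation; a valid card should have none."""
        return [c for c in self.claims if not c.citations]


# Placeholder values throughout; only the question and the 25% figure
# are taken from the example on this page.
card = ResearchCard(
    question="What is the corporate tax rate today?",
    claims=(
        Claim(
            text="The standard corporate tax rate is 25%.",
            citations=(Citation("example_source_id", "article, paragraph"),),
        ),
    ),
    confidence=0.9,
    temporal_scope="current law",
    jurisdiction="Belgium",
)
print(card.uncited_claims())
```

Keeping citations attached to each individual claim, rather than to the answer as a whole, is what makes a check like uncited_claims possible.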
Temporal versioning
Tax law changes constantly. The system tracks when each provision was in force — so you get the right answer for the right year.
You ask:
"What was the corporate tax rate in 2019?"
Answered with the law as it was then: 29.58% standard rate (a 29% base rate increased by the 2% crisis surcharge, which is levied on the tax itself: 29% × 1.02 = 29.58%).
You ask:
"What is the corporate tax rate today?"
Answered with current law: 25% standard rate. The system retrieves the correct version automatically.
How it works under the hood
Every document carries explicit metadata: effective_from, effective_to, and assessment_year. The retrieval pipeline filters by temporal scope before ranking, so provisions from the wrong period are not mixed in with the law that applied at the time you are asking about.
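The filter-then-rank idea can be sketched in a few lines of Python. The effective_from and effective_to fields come from the description above. The class, the function names, and the dates are simplifying assumptions: real commencement rules depend on assessment years and taxable periods, which this sketch leaves out.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Provision:
    doc_id: str
    text: str
    effective_from: date
    effective_to: Optional[date]  # None means still in force


def in_force_on(provision: Provision, day: date) -> bool:
    """True if the provision applied on the given day (bounds inclusive)."""
    if day < provision.effective_from:
        return False
    return provision.effective_to is None or day <= provision.effective_to


def filter_by_date(corpus: List[Provision], day: date) -> List[Provision]:
    """Keep only provisions in force on `day`; ranking would happen afterwards."""
    return [p for p in corpus if in_force_on(p, day)]


# Illustrative dates only, chosen to match the two example questions.
corpus = [
    Provision("rate_v1", "Standard rate 29.58%", date(2018, 1, 1), date(2019, 12, 31)),
    Provision("rate_v2", "Standard rate 25%", date(2020, 1, 1), None),
]

print([p.doc_id for p in filter_by_date(corpus, date(2019, 6, 30))])
print([p.doc_id for p in filter_by_date(corpus, date(2024, 1, 1))])
```

Filtering before ranking means a superseded version never competes for a top spot, however well its wording matches the question.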
4-layer quality gate
Every document passes through four validation stages before it becomes searchable. No shortcuts, no bulk imports.
Structure validation
Documents are split at natural legal boundaries — articles, sections, paragraphs — not arbitrary character limits.
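Splitting at legal boundaries rather than at a fixed character count might look like the Python sketch below. It assumes article headings of the form "Art. 1." at the start of a line, and the sample text is invented placeholder wording, not real legislation. A production splitter would also handle sections, paragraphs, and other heading styles.

```python
import re
from typing import List

# Assumed heading style: "Art. <number>" at the start of a line.
ARTICLE_HEADING = re.compile(r"^Art\.\s*\d+", re.MULTILINE)


def split_into_articles(text: str) -> List[str]:
    """Split text into one chunk per article, keeping any preamble first."""
    starts = [m.start() for m in ARTICLE_HEADING.finditer(text)]
    if not starts:
        stripped = text.strip()
        return [stripped] if stripped else []
    bounds = starts + [len(text)]
    chunks = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    preamble = text[: starts[0]].strip()
    return ([preamble] if preamble else []) + chunks


# Invented placeholder text for illustration.
sample = (
    "Art. 1. First placeholder provision.\n"
    "It continues on a second line.\n"
    "Art. 2. Second placeholder provision.\n"
)
chunks = split_into_articles(sample)
for chunk in chunks:
    print(repr(chunk))
```

Each chunk then holds one complete article, so a citation can point to a whole provision rather than to a fragment cut mid-sentence.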
Quality checks
Encoding, completeness, and formatting checks ensure no corrupted or truncated content enters the corpus.
Metadata enrichment
Authority tier, temporal fields, jurisdiction, language, and topic tags are attached to every document chunk.
Deduplication
Duplicate and near-duplicate content is detected and consolidated — so you see each provision once, not five times from different sources.
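Near-duplicate detection is often done by comparing overlapping word sequences. The Python sketch below uses Jaccard similarity over three-word shingles. This is a common technique chosen here for illustration, not necessarily the method Auryth TX uses. The 0.8 threshold and the sample sentences are assumptions, and the sketch simply keeps the first copy rather than consolidating metadata.

```python
from typing import List, Set, Tuple


def shingles(text: str, n: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of n-word sequences in the text, case-insensitive."""
    words = text.lower().split()
    if len(words) < n:
        return {tuple(words)}
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a: str, b: str) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    sa, sb = shingles(a), shingles(b)
    return len(sa & sb) / len(sa | sb)


def deduplicate(chunks: List[str], threshold: float = 0.8) -> List[str]:
    """Keep a chunk only if it is not too similar to one already kept."""
    kept: List[str] = []
    for chunk in chunks:
        if all(jaccard(chunk, seen) < threshold for seen in kept):
            kept.append(chunk)
    return kept


# Invented sample sentences for illustration.
a = "the standard rate is 25 percent for all resident companies in belgium"
b = a + " today"  # same provision with one extra word
c = "small companies may qualify for a reduced rate under certain conditions"

unique = deduplicate([a, b, c])
print(len(unique))
```

Here the second sentence shares 10 of 11 shingles with the first, so it is treated as the same provision, while the third sentence is kept.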
New documents are quarantined for administrator review before becoming searchable. The corpus grows carefully, not recklessly.
Why retrieval beats fine-tuning
Some legal AI tools bake knowledge into their models through fine-tuning. We chose a different path — and here's why it matters for Belgian tax.
Fine-tuned models: Knowledge frozen at training time, already months or years behind.
Auryth TX (RAG): Updates without retraining. New laws become searchable as soon as they clear review.
Fine-tuned models: Can't reliably point to which document a claim comes from.
Auryth TX (RAG): Full citation traceability. Every claim is linked to a specific article and paragraph.
Fine-tuned models: Absorb all law versions into one set of weights, which causes temporal confusion.
Auryth TX (RAG): Temporal precision. Explicit metadata tracks which version applies when.
Fine-tuned models: Expensive and slow to retrain when the law changes.
Auryth TX (RAG): Cost-effective. Adding new documents is fast and needs no retraining.
Fine-tuned models: Locked to one model version. Fine-tuning doesn't transfer to newer models.
Auryth TX (RAG): Model-agnostic. When a better AI model launches, we switch. Our corpus and pipeline carry over.
Fine-tuning may work for stable legal domains. But Belgian tax law is a moving target — rates change, exemptions appear and disappear, regional rules diverge. Retrieval keeps you current.
We publish our accuracy data
Most AI tools claim to be accurate. We show you — with real metrics, updated regularly, and open to scrutiny.