
How It Works

From question to cited answer.

Not a chatbot with a legal plug-in. A purpose-built retrieval system that searches Belgian tax law, ranks sources by authority, and shows you exactly what supports every answer.

The retrieval pipeline

Every question goes through five stages before you see a single word. Each stage reduces noise and increases precision.

1. Your question

You ask in plain language — Dutch, French, or English. The system parses your intent, identifies the tax domain, and determines the relevant time period.

2. Hybrid search

Two search strategies run in parallel: BM25 (exact legal term matching) and vector embeddings (semantic meaning). Results are merged using Reciprocal Rank Fusion — so you find relevant sources even when you use different words than the legislation.
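As a rough illustration of the fusion step, here is a minimal Python sketch of Reciprocal Rank Fusion. The function name, the document IDs, and the constant k=60 (a commonly used default) are assumptions for illustration; the values used in the actual pipeline are not stated here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical document IDs, for illustration only.
bm25_hits = ["wib-art-215", "circular-2019-12", "cass-2018-03"]
vector_hits = ["cass-2018-03", "wib-art-215", "doctrine-77"]

fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that both strategies rank highly rise to the top, while a document found by only one strategy still stays in the list.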

3. Authority ranking

Results are reranked by legal weight. A Court of Cassation judgment outranks an administrative circular. The Constitution outranks everything. This mirrors how a tax professional actually reasons.

4. Cross-encoder reranking

A specialized model reads each source alongside your question and scores true relevance, filtering out false positives that keyword matching alone would let through.
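The control flow of this stage can be sketched in Python as follows. This is only a sketch: `rerank`, `toy_overlap_score`, and the sample passages are hypothetical, and the toy scorer is a stand-in that mimics a cross-encoder's interface (one score per question and passage pair) so the example runs offline. A real cross-encoder is a neural model that reads both texts together.

```python
from typing import Callable, List, Tuple


def rerank(query: str, passages: List[str],
           score_pair: Callable[[str, str], float],
           top_n: int = 3) -> List[Tuple[float, str]]:
    """Score every (query, passage) pair jointly and keep the best top_n."""
    scored = [(score_pair(query, p), p) for p in passages]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


def toy_overlap_score(query: str, passage: str) -> float:
    """Stand-in scorer: fraction of query words that occur in the passage.

    NOT a cross-encoder; it only has the same (query, passage) -> score shape.
    """
    q_words = set(query.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0


# Placeholder passages, not real source texts.
query = "corporate tax rate 2019"
passages = [
    "VAT on restaurant services",
    "The corporate tax rate for 2019 was reduced",
    "Rules on inheritance tax in Flanders",
]
top = rerank(query, passages, toy_overlap_score, top_n=2)
```

Swapping `toy_overlap_score` for a call into a trained model would leave `rerank` unchanged, which is the point of keeping the scorer pluggable.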

5. Structured answer

The AI generates a research card with citations, confidence score, temporal scope, and jurisdiction — not a wall of chat text. Every claim is tied to a retrievable source.

Screenshot: full research card showing the query at top, a structured answer with inline citation markers [1][2][3], a confidence badge (87%), a temporal scope tag (valid until 2026-12-31), a jurisdiction chip, and a collapsed source accordion.
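One way to picture the research card is as a small data structure. The Python sketch below is an assumption about its shape, not the product's actual schema: the class names, field names, and the `uncited_markers` helper are all hypothetical. It shows how "every claim is tied to a retrievable source" could be checked mechanically.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Citation:
    marker: int      # the [1], [2], ... shown inline in the answer
    source_id: str   # hypothetical identifier of the retrieved source
    locator: str     # e.g. article and paragraph


@dataclass
class ResearchCard:
    question: str
    answer: str
    confidence: float                   # 0.0 to 1.0
    jurisdiction: str
    valid_until: Optional[date] = None  # temporal scope, if bounded
    citations: List[Citation] = field(default_factory=list)

    def uncited_markers(self) -> List[int]:
        """Return inline markers in the answer that have no matching citation."""
        used = {int(m) for m in re.findall(r"\[(\d+)\]", self.answer)}
        known = {c.marker for c in self.citations}
        return sorted(used - known)


# Illustrative values only; the source ID and locator are placeholders.
card = ResearchCard(
    question="What is the corporate tax rate today?",
    answer="The standard rate is 25% [1][2].",
    confidence=0.87,
    jurisdiction="Federal",
    valid_until=date(2026, 12, 31),
    citations=[Citation(marker=1, source_id="example-source", locator="Art. X")],
)
missing = card.uncited_markers()
```

Here `missing` flags marker 2, which appears in the answer without a backing citation.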

Source authority ranking

Belgian law has a pecking order. The system applies authority weighting when ranking sources — so higher-weight sources surface first. Here's how it works in practice:

Example: Legislative vs. administrative

Federal Law (WIB) > Administrative Circular

A provision in the Income Tax Code outranks an administrative interpretation of that provision.

Example: Court hierarchy

Court of Cassation > Court of Appeal

A ruling of the Court of Cassation (Belgium's supreme court) carries more weight than a Court of Appeal decision on the same issue.

Example: Binding vs. informative

Royal Decree > Legal Doctrine

Binding regulatory instruments outrank academic commentary, which is informative but not authoritative.

Source categories we rank

The system weighs sources across these categories — from constitutional provisions down to academic commentary:

Constitution · EU Law · Treaties · Federal Laws · Regional Decrees · Royal Decrees · Supreme Court · Constitutional Court · Appeals Courts · Advance Rulings · Circulars · Parliamentary Works · Doctrine

When sources conflict, the system knows which one carries more weight — automatically. No guessing, no flat citation lists.
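A minimal Python sketch of authority-first ordering follows. It assumes that the position of a category in the list above is its rank, and that retrieval score only breaks ties within a tier. Both are assumptions made for illustration; the real weighting scheme is not described here, and the hit titles and scores are made up.

```python
# Tier order copied from the category list above. Treating list position as
# rank is an assumption; the actual weights are not published on this page.
AUTHORITY_ORDER = [
    "Constitution", "EU Law", "Treaties", "Federal Laws", "Regional Decrees",
    "Royal Decrees", "Supreme Court", "Constitutional Court", "Appeals Courts",
    "Advance Rulings", "Circulars", "Parliamentary Works", "Doctrine",
]
TIER = {name: i for i, name in enumerate(AUTHORITY_ORDER)}


def rank_by_authority(hits):
    """Order hits by authority tier, then by retrieval score within a tier.

    Each hit is a (category, retrieval_score, title) tuple.
    """
    return sorted(hits, key=lambda h: (TIER[h[0]], -h[1]))


# Made-up hits and scores, for illustration only.
hits = [
    ("Circulars", 0.91, "Administrative circular"),
    ("Doctrine", 0.95, "Academic commentary"),
    ("Federal Laws", 0.80, "Income Tax Code provision"),
    ("Royal Decrees", 0.85, "Royal decree"),
]
ranked = [title for _, _, title in rank_by_authority(hits)]
print(ranked)
```

Even though the commentary has the highest retrieval score in this toy data, the Income Tax Code provision surfaces first, matching the examples above.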

Temporal versioning

Tax law changes constantly. The system tracks when each provision was in force — so you get the right answer for the right year.

You ask:

"What was the corporate tax rate in 2019?"

Assessment year 2020

Answered with the law as it was then: 29.58% standard rate (the 29% base rate increased by the 2% crisis surcharge, i.e. 29% × 1.02).

You ask:

"What is the corporate tax rate today?"

Assessment year 2026

Answered with current law: 25% standard rate. The system retrieves the correct version automatically.

How it works under the hood

Every document carries explicit metadata: effective_from, effective_to, and assessment_year. The retrieval pipeline filters by temporal scope before ranking — so you never see outdated provisions mixed in with current law.
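The temporal filter can be sketched in Python like this. The field names `effective_from` and `effective_to` come from the description above; the `Provision` class, the `in_force_on` function, and the validity dates in the sample data are illustrative assumptions, not values taken from the legislation or from the real corpus.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Provision:
    text: str
    effective_from: date
    effective_to: Optional[date]  # None means still in force


def in_force_on(provisions: List[Provision], when: date) -> List[Provision]:
    """Keep only provisions whose validity window contains `when`."""
    return [
        p for p in provisions
        if p.effective_from <= when
        and (p.effective_to is None or when <= p.effective_to)
    ]


# Illustrative windows only; the exact dates are assumptions.
corpus = [
    Provision("Standard rate 29.58%", date(2018, 1, 1), date(2019, 12, 31)),
    Provision("Standard rate 25%", date(2020, 1, 1), None),
]

then = in_force_on(corpus, date(2019, 6, 30))
now = in_force_on(corpus, date(2026, 1, 15))
```

Because the filter runs before ranking, a superseded version never competes with the current one for the same question.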

4-layer quality gate

Every document passes through four validation stages before it becomes searchable. No shortcuts, no bulk imports.

1. Structure validation

Documents are split at natural legal boundaries — articles, sections, paragraphs — not arbitrary character limits.
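As a rough sketch of boundary-aware splitting, the Python below cuts a text at article headings instead of at a fixed character count. The heading pattern ("Art." followed by a number at the start of a line) is an assumption; real legal texts use more variants, and the actual splitter is not described here. The sample text is a placeholder.

```python
import re

# Assumed heading pattern: a line starting with "Art." and a number.
# Real texts have more variants, which this sketch does not handle.
ARTICLE_HEADING = re.compile(r"^Art\.\s*\d+\S*", re.MULTILINE)


def split_into_articles(text: str):
    """Split a legal text into chunks that each start at an article heading."""
    starts = [m.start() for m in ARTICLE_HEADING.finditer(text)]
    if not starts:
        return [text.strip()] if text.strip() else []
    chunks = []
    # Text before the first heading (a preamble) is kept as its own chunk.
    if text[:starts[0]].strip():
        chunks.append(text[:starts[0]].strip())
    for begin, end in zip(starts, starts[1:] + [len(text)]):
        chunks.append(text[begin:end].strip())
    return chunks


# Placeholder text, not a real provision.
sample = (
    "Preamble text.\n"
    "Art. 1. First provision.\nSecond sentence.\n"
    "Art. 2. Second provision.\n"
)
chunks = split_into_articles(sample)
```

Each chunk keeps a whole article together, so a citation can point at a complete provision.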

2. Quality checks

Encoding, completeness, and formatting checks ensure no corrupted or truncated content enters the corpus.

3. Metadata enrichment

Authority tier, temporal fields, jurisdiction, language, and topic tags are attached to every document chunk.

4. Deduplication

Duplicate and near-duplicate content is detected and consolidated — so you see each provision once, not five times from different sources.
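One common way to detect near-duplicates is to compare sets of word shingles with Jaccard similarity. The Python sketch below uses that technique as an assumption; the method actually used is not stated here, and the shingle size, the 0.8 threshold, and the sample sentences are hypothetical.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a: str, text_b: str, threshold: float = 0.8) -> bool:
    """True when the shingle sets overlap at or above the assumed threshold."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold


# Placeholder sentences, not real provisions.
original = "the standard rate applies to all resident companies in the assessment year"
reprint = original + " concerned"
unrelated = "regional rules on inheritance tax differ between the three regions"

same = is_near_duplicate(original, reprint)
different = is_near_duplicate(original, unrelated)
```

Pairwise comparison like this scales poorly on a large corpus; it is shown only to make the idea concrete.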

New documents are quarantined for administrator review before becoming searchable. The corpus grows carefully, not recklessly.

Why retrieval beats fine-tuning

Some legal AI tools bake knowledge into their models through fine-tuning. We chose a different path — and here's why it matters for Belgian tax.

Fine-tuned models: Knowledge frozen at training time, already months or years behind.
Auryth TX (RAG): Fast updates. New laws become searchable as soon as they clear review, with no retraining.

Fine-tuned models: Can't reliably point to which document a claim comes from.
Auryth TX (RAG): Full citation traceability. Every claim is linked to a specific article and paragraph.

Fine-tuned models: Absorb all law versions into one set of weights, which causes temporal confusion.
Auryth TX (RAG): Temporal precision. Explicit metadata tracks which version applies when.

Fine-tuned models: Expensive and slow to retrain when the law changes.
Auryth TX (RAG): Cost-effective. Adding new documents is fast, and switching to a better base model requires no retraining.

Fine-tuned models: Locked to one model version, because fine-tuning doesn't transfer to newer models.
Auryth TX (RAG): Model-agnostic. When a better AI model launches, we switch. Our corpus and pipeline carry over.

Fine-tuning may work for stable legal domains. But Belgian tax law is a moving target — rates change, exemptions appear and disappear, regional rules diverge. Retrieval keeps you current.

Deep dive: RAG vs fine-tuning for legal AI →

We publish our accuracy data

Most AI tools claim to be accurate. We show you — with real metrics, updated regularly, and open to scrutiny.

View accuracy dashboard
