Key terms in Belgian tax and AI explained
Small trainable modules inserted into frozen pretrained models, enabling efficient task-specific fine-tuning with minimal parameters.
Systematically probing models with difficult or malicious inputs to find failures.
The process of training AI systems to behave in accordance with human values, intentions, and preferences—ensuring models are helpful, harmless, and honest.
Ensuring that generated answers can be traced back to specific supporting sources.
Algorithms that find approximately similar vectors quickly by trading perfect accuracy for massive speed improvements.
A neural network technique that allows models to focus on relevant parts of input when producing output, enabling context-aware processing.
The AI capability of linking generated statements to specific source evidence, establishing which parts of the output are supported by which documents or data points.
An algorithm that efficiently computes gradients by propagating errors backward through a neural network layer by layer.
A decoding algorithm that explores multiple candidate sequences in parallel, keeping a fixed number of the most promising paths (the beam width) at each step.
The systematic process of evaluating model performance against standardized datasets and metrics, enabling fair comparison between different models, architectures, and approaches.
A neural architecture that separately encodes queries and documents into fixed-size vectors, enabling efficient similarity search through pre-computed embeddings and approximate nearest neighbor indexes.
Bias mitigation is the set of methods used to detect and reduce unfair bias in an AI system’s data, model behavior, and outcomes.
Best Match 25 - a widely used probabilistic ranking function for text search that builds on TF-IDF principles, adding term-frequency saturation and document-length normalization.
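As an illustrative sketch of how BM25 scores a document (toy pre-tokenized corpus, common default k1 and b values; not tied to any particular search library):

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Score one tokenized document against a query with Okapi BM25 (a sketch)."""
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N      # average document length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)           # document frequency
        idf = math.log((N - df + 0.5) / (df + 0.5) + 1)    # smoothed IDF
        tf = doc.count(term)                               # term frequency
        score += idf * tf * (k1 + 1) / (
            tf + k1 * (1 - b + b * len(doc) / avgdl)
        )
    return score

corpus = [["tax", "law", "belgium"], ["tax", "rate"], ["machine", "learning"]]
score = bm25_score(["tax"], corpus[0], corpus)
print(score)  # positive: the document contains the query term
```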
A subword tokenization algorithm that builds a vocabulary by iteratively merging frequent symbol pairs.
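A minimal sketch of the BPE merge loop on a toy corpus (character-level start, no end-of-word markers or special tokens; production tokenizers add many refinements):

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Learn byte-pair-encoding merges from a tiny corpus (illustrative sketch)."""
    # Represent each word as a tuple of symbols, starting from characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for word, freq in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)     # most frequent pair wins
        merges.append(best)
        merged = best[0] + best[1]
        # Rewrite the vocabulary with the pair fused into one symbol.
        new_vocab = Counter()
        for word, freq in vocab.items():
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(word[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

learned = bpe_merges(["low", "lower", "lowest"], 2)
print(learned)
```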
Aligning model confidence scores with the true likelihood of correctness.
A prompting technique that elicits step-by-step reasoning from language models, improving performance on complex tasks by making the model's reasoning process explicit and verifiable.
The method of dividing documents into smaller segments for effective retrieval and processing in RAG systems.
The practice of explicitly referencing source documents in AI-generated responses, enabling verification of claims and building trust through transparency and traceability.
A range of values within which a quantity is believed to lie with a specified probability.
The practice of adding retrieved or auxiliary information into an LLM prompt to guide generation.
The maximum amount of text (measured in tokens) that a language model can process in a single interaction.
Regularly re-running evaluations in production to detect regressions or drift early.
A mathematical measure of similarity between two vectors based on the cosine of the angle between them.
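The computation, sketched in plain Python:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # same direction -> 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal -> 0.0
```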
A neural architecture that jointly encodes query and document pairs to produce relevance scores, providing higher accuracy than bi-encoders but at greater computational cost.
A subset of machine learning using neural networks with many layers to learn hierarchical representations from data.
Information retrieval using learned dense vector representations, enabling semantic matching beyond keyword overlap.
Techniques that reduce the number of features in data while preserving as much structure as possible.
A function that quantifies how far apart two points are in a space, subject to metric properties.
A similarity measure computed as the dot product between two embedding vectors.
Techniques to map embeddings from different models or languages into a shared space.
Reducing the size or precision of embeddings to save memory and speed up search.
Changes in embedding distributions over time that can degrade retrieval or model performance.
A machine learning model that converts inputs like text or images into vector embeddings.
The high-dimensional vector space in which embeddings live and semantic relationships are encoded geometrically.
Dense vector representations of data (text, images, etc.) that capture semantic meaning in a continuous numerical space.
Carefully examining where and why a model fails to improve future iterations.
The straight-line distance between two points in a vector space, used as a metric between embeddings.
A reusable setup for defining, running, and tracking evaluations of AI systems.
A curated set of inputs and gold-standard outputs used to measure model or system performance.
The ability to understand, interpret, and explain how AI/ML models make predictions—essential for trust, debugging, regulatory compliance, and responsible AI deployment.
The degree to which a generated answer agrees with trusted sources or ground truth.
The degree to which AI-generated content accurately reflects verifiable truth, distinguishing correct statements from fabrications, errors, and hallucinations.
Facebook AI Similarity Search - an open-source library for efficient similarity search and clustering of dense vectors.
The property that a model’s explanation or answer accurately reflects its underlying reasoning or evidence.
A neural network where information flows in one direction from input to output without recurrent connections.
A machine learning paradigm where models learn to perform tasks from just a handful of examples, enabling rapid adaptation without extensive retraining or fine-tuning.
The process of further training a pre-trained model on domain-specific data to improve performance on specialized tasks.
An LLM capability where the model selects and fills structured arguments to call external tools or functions.
The part of a RAG system where the language model conditions on retrieved context to produce an answer.
An optimization algorithm that iteratively adjusts model parameters by moving in the direction that reduces the loss function.
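A toy one-dimensional sketch, assuming the gradient function is already known:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    """Minimize a 1-D function given its gradient (illustrative sketch)."""
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)   # step against the gradient
    return x

# Minimize f(x) = (x - 3)^2, whose gradient is 2 * (x - 3); the minimum is at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(x_min)
```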
A simple text generation strategy that always selects the highest-probability token at each step.
The authoritative, verified reference data used to train and evaluate machine learning models—the 'correct' answers against which model predictions are measured.
The technique of anchoring AI model outputs to verifiable sources, facts, or retrieved documents to reduce hallucinations and increase response accuracy and trustworthiness.
Safety mechanisms and constraints that prevent AI systems from generating harmful, inappropriate, or off-topic outputs—providing runtime protection beyond training-time alignment.
When an AI model generates false, fabricated, or unsupported information presented as fact.
The proportion of a model’s outputs that contain unsupported, fabricated, or false statements.
Hierarchical Navigable Small World graphs - a graph-based algorithm widely used for fast approximate nearest neighbor search in high-dimensional spaces.
Using human reviewers to check, correct, or approve AI outputs as part of an evaluation process.
Building and maintaining combined sparse and dense indices to support hybrid search.
A retrieval approach combining keyword-based and semantic vector search to leverage the strengths of both methods.
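One common way to combine the two result lists is reciprocal rank fusion (RRF); this sketch assumes each retriever returns a ranked list of document IDs:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it returns.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["d1", "d2", "d3"]   # e.g. from a BM25 index
vector_hits = ["d2", "d4", "d1"]    # e.g. from dense retrieval
fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # d2 ranks first: it appears high in both lists
```

RRF is only one fusion strategy; weighted score interpolation is another common choice.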
The ability of large language models to learn new tasks at inference time by conditioning on examples or instructions provided directly in the prompt, without any gradient updates.
The process of updating a vector or search index to reflect new or changed data.
Splitting an index across multiple shards or machines to scale retrieval.
The process of using a trained model to generate predictions or outputs on new, unseen data.
A fine-tuning method that trains language models to follow natural language instructions across diverse tasks.
A data structure mapping terms to document locations, enabling fast full-text search over large document collections.
A retrieval strategy that repeatedly refines queries and context based on intermediate results.
The practice of crafting prompts or inputs to bypass an AI system's safety and policy constraints.
Training a smaller student model to mimic a larger teacher model, transferring knowledge while dramatically reducing size and cost.
A structured network of entities and their relationships that enables machines to understand and reason about real-world concepts.
The high-level design choices for how a system retrieves and structures knowledge for use by LLMs.
Legal domain adaptation tailors an AI or search system to legal language, sources, and reasoning so outputs are more precise and defensible.
Large Language Models are AI systems trained on vast text data to understand and generate human-like text, powering modern conversational AI.
The logarithms of token probabilities produced by a language model, used for scoring and analysis of generations.
Low-Rank Adaptation - an efficient fine-tuning technique that trains small adapter matrices instead of updating all model weights.
A mathematical function that measures how far a model's predictions are from the desired outputs during training.
A field of AI where systems learn patterns from data to make predictions or decisions without explicit programming.
Restricting retrieval results based on document attributes like type, date, or jurisdiction.
An open-source vector database optimized for storing, indexing, and searching massive-scale embedding vectors—enabling similarity search for AI applications like RAG, semantic search, and recommendations.
Techniques to reduce AI model size and computational requirements while preserving performance, enabling efficient deployment.
A degradation in model performance over time because the data distribution or usage changes.
How well a model maintains performance under noise, shifts, or adversarial inputs.
A technique that runs multiple attention operations in parallel, allowing models to capture different types of relationships simultaneously.
Retrieval that chains multiple retrieval steps together to answer complex, multi-step questions.
AI technique that identifies and classifies named entities like people, places, and organizations in text for information extraction.
Finding the closest items to a query in a vector space under a given distance metric.
A retrieval pattern that explicitly searches for contradicting, missing, or disconfirming evidence.
A machine learning model composed of interconnected layers of artificial neurons that learn patterns from data.
Optical Character Recognition—technology that converts images of text (scanned documents, photos, PDFs) into machine-readable text, enabling search, editing, and AI processing of printed or handwritten content.
Retrieving small passages or chunks of text rather than whole documents for more precise answers.
A metric measuring how well a language model predicts text, with lower values indicating better prediction ability.
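Sketch, assuming per-token log-probabilities are available:

```python
import math

def perplexity(token_logprobs):
    """Perplexity = exp(-mean log-probability) over a token sequence."""
    avg = sum(token_logprobs) / len(token_logprobs)
    return math.exp(-avg)

# A model that assigns probability 0.25 to every token has perplexity 4:
# it is, on average, as uncertain as a uniform choice over 4 options.
ppl = perplexity([math.log(0.25)] * 5)
print(ppl)
```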
A fully managed vector database service designed specifically for machine learning applications, providing serverless similarity search at scale.
A technique used in transformer models to inject information about token positions into otherwise order-agnostic embeddings.
The initial phase of training a large language model on massive text corpora to learn general language patterns, world knowledge, and reasoning capabilities before task-specific fine-tuning.
The input text or instruction given to a language model to guide its response generation.
An attack technique where malicious instructions are inserted into LLM inputs to override system prompts, bypass guardrails, or manipulate model behavior in unintended ways.
Removing unnecessary weights or neurons from neural networks to reduce model size and computational cost without significant accuracy loss.
Quantized LoRA - combines 4-bit quantization with LoRA adapters, enabling fine-tuning of 65B+ models on a single 48GB GPU.
Reducing model precision from 32/16-bit to 8/4-bit, dramatically decreasing memory usage and speeding up inference.
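An illustrative sketch of symmetric 8-bit quantization (no zero-point handling, and it assumes at least one non-zero value):

```python
def quantize_int8(values):
    """Symmetric 8-bit quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the quantized integers."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
print(q, approx)  # small round-trip error, 4x less storage than float32
```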
Techniques that automatically reformulate or augment search queries to improve retrieval by adding synonyms, related terms, or rephrased versions.
The process of transforming a user query into a more effective form for retrieval.
Checking that changes to models or pipelines do not unintentionally degrade existing behavior.
A machine learning approach where agents learn optimal behavior through trial-and-error interactions with an environment.
Metrics that capture how stable, predictable, and safe an AI system is over time.
A second-stage retrieval technique that reorders initial search results to improve relevance using more sophisticated models.
The extent to which a retrieval system can surface all the information needed to answer questions in a domain.
Applying rules or metadata filters to restrict which documents can be retrieved for a query.
The time it takes for a retrieval system to return results for a query.
The part of a RAG system that finds and ranks relevant documents or chunks before generation.
Coordinating multiple retrieval steps, indices, or tools to serve a single AI task or query.
An ordered sequence of steps that process a query and documents to return ranked results in a RAG or search system.
The fraction of retrieved documents that are actually relevant to the query.
The fraction of all truly relevant documents that a retrieval system successfully returns.
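Both metrics in one sketch, using hypothetical document IDs:

```python
def retrieval_precision_recall(retrieved, relevant):
    """Precision = relevant hits / retrieved; recall = relevant hits / all relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved)
    recall = len(hits) / len(relevant)
    return precision, recall

p, r = retrieval_precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],   # what the system returned
    relevant=["d2", "d4", "d7"],          # what it should have returned
)
print(p, r)  # 2 of 4 retrieved are relevant; 2 of 3 relevant were found
```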
The computation of numeric relevance scores for documents or chunks given a query.
Retrieval-Augmented Generation (RAG) is an AI technique that combines information retrieval with text generation to produce accurate, source-grounded responses.
Reinforcement Learning from Human Feedback—a technique to fine-tune language models using human preferences as reward signals.
A mechanism where each element in a sequence computes attention weights with all other elements in the same sequence.
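A toy, unbatched sketch of the underlying scaled dot-product attention, softmax(Q K^T / sqrt(d)) V, in plain Python (real implementations are vectorized and multi-headed):

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(Q, K, V):
    """Scaled dot-product attention for lists of row vectors (toy, no batching)."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Attention weights of this query over every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Output is the weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Two tokens attending over each other (queries = keys = values here).
X = [[1.0, 0.0], [0.0, 1.0]]
out = self_attention(X, X, X)
print(out)  # each token attends mostly, but not only, to itself
```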
Grouping embeddings into clusters so that each group represents a coherent semantic concept or topic.
Search technology that understands meaning and intent rather than just matching keywords, enabling more relevant and intelligent results.
A measure of how alike two pieces of text are in meaning, regardless of the specific words used.
A language-agnostic subword tokenization library that learns a vocabulary directly from raw text.
Searching for items in a dataset that are similar to a query item under a chosen distance metric.
A chunking strategy where overlapping windows move across a document to preserve context between chunks.
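Sketch, using characters as stand-in tokens:

```python
def sliding_window_chunks(tokens, size=4, overlap=2):
    """Split a token list into overlapping chunks; the stride is size - overlap."""
    stride = size - overlap
    chunks = []
    for start in range(0, len(tokens), stride):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break   # the last window already reaches the end
    return chunks

chunks = sliding_window_chunks(list("abcdefgh"), size=4, overlap=2)
print(chunks)  # each chunk shares 2 tokens with its neighbor
```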
Information retrieval using high-dimensional sparse vectors based on term frequencies, like BM25 and TF-IDF.
Evaluating how an AI system behaves under extreme or degraded conditions.
The practice of constraining LLM responses into well-defined formats such as JSON, XML, or schemas.
A machine learning approach where models learn from labeled training data to predict outputs for new inputs.
The hidden or fixed instruction block that sets overall behavior and constraints for an LLM in a given application.
A parameter controlling the randomness of language model outputs, affecting creativity versus consistency.
Term Frequency-Inverse Document Frequency - a statistical measure of word importance in a document relative to a collection.
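A minimal sketch of one common TF-IDF variant (libraries differ in smoothing and normalization; this one also assumes the term occurs somewhere in the corpus):

```python
import math

def tf_idf(term, doc, corpus):
    """TF-IDF: term frequency in the document times inverse document frequency."""
    tf = doc.count(term) / len(doc)                  # relative frequency in doc
    df = sum(1 for d in corpus if term in d)         # how many docs contain it
    idf = math.log(len(corpus) / df)                 # rarer terms weigh more
    return tf * idf

corpus = [["tax", "law", "tax"], ["tax", "rate"], ["machine", "learning"]]
score = tf_idf("tax", corpus[0], corpus)
print(score)
```

A term that appears in every document gets idf = log(1) = 0, so ubiquitous words carry no weight.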
The process of splitting text into smaller units (tokens) that language models can process and understand.
The design pattern where LLMs decide when and how to call external tools to complete tasks.
A sampling method that restricts token selection to the k most probable next tokens at each generation step.
A sampling method that selects from the smallest set of tokens whose cumulative probability exceeds a threshold p.
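Top-k and top-p filtering, together with the temperature parameter, can be sketched in one toy sampler (illustrative only, not any library's API):

```python
import math, random

def sample_token(logits, temperature=1.0, top_k=None, top_p=None, seed=None):
    """Sample a token index with temperature, top-k, and top-p filtering (sketch)."""
    # Temperature-scaled softmax over the logits.
    probs = [math.exp(l / temperature) for l in logits]
    total = sum(probs)
    probs = [p / total for p in probs]
    # Candidate indices sorted by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]          # keep only the k most probable tokens
    if top_p is not None:
        kept, cumulative = [], 0.0
        for i in ranked:                 # smallest set whose mass exceeds p
            kept.append(i)
            cumulative += probs[i]
            if cumulative >= top_p:
                break
        ranked = kept
    rng = random.Random(seed)
    mass = sum(probs[i] for i in ranked)
    return rng.choices(ranked, weights=[probs[i] / mass for i in ranked])[0]

logits = [2.0, 1.0, 0.1, -1.0]
print(sample_token(logits, top_k=1))  # top_k=1 reduces to greedy decoding: index 0
```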
A neural network architecture using self-attention to process sequential data in parallel, powering modern LLMs.
Quantifying how uncertain a model is about its predictions or answers.
A machine learning approach where models discover patterns and structure in data without labeled examples.
A specialized database optimized for storing and searching high-dimensional vector embeddings using similarity metrics.
Embeddings represented as numerical vectors in a high-dimensional space, used for similarity and retrieval.
The process of organizing embeddings in a data structure that supports fast similarity search.
Rescaling vectors to have a standard length, often unit norm, before computing similarity.
Approximating vectors using a small set of codebook entries to reduce storage and speed up search.
An open-source vector database that combines vector search with structured data filtering and built-in machine learning modules—enabling semantic search, RAG, and AI-native applications.
A machine learning capability where models perform tasks without any task-specific examples, relying solely on pre-trained knowledge and natural language instructions.