A failure taxonomy for Retrieval-Augmented Generation (RAG) distinguishes three error types. A retrieval miss occurs when relevant documents are not fetched. A retrieval mismatch occurs when the retrieved documents are irrelevant or only loosely connected to the query. Generation drift occurs when the model’s output diverges from the retrieved evidence and introduces inaccuracies. Separating these failures lets each be addressed where it originates: retrieval recall, document relevance, or the faithfulness of generation to the sourced information.
What is a retrieval miss in a retrieval-augmented system?
A retrieval miss occurs when the system fails to fetch the documents or evidence needed to answer a question, so the answer is uninformed or incorrect. Causes include weak recall, indexing gaps, and misinterpretation of the query.
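One way to make a retrieval miss measurable is recall at k over a small labelled set of queries. The sketch below is illustrative only: the function names, the document IDs, and the rule that "zero relevant documents in the top k" counts as a miss are assumptions, not part of any particular RAG library.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the known-relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant) / len(relevant)


def is_retrieval_miss(retrieved_ids, relevant_ids, k):
    """Assumed rule: a miss means no relevant document reached the top k."""
    return recall_at_k(retrieved_ids, relevant_ids, k) == 0.0


if __name__ == "__main__":
    ranked = ["d1", "d7", "d3"]   # hypothetical retriever output, best first
    gold = {"d3", "d9"}           # hypothetical labelled relevant documents
    print(recall_at_k(ranked, gold, k=3))
    print(is_retrieval_miss(ranked, gold, k=2))
```

Tracking this number per query helps separate weak recall (relevant documents exist but rank too low) from indexing gaps (relevant documents never appear at any k).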
What is a retrieval mismatch?
A retrieval mismatch happens when the system retrieves documents that don’t match the user’s question or intent, producing irrelevant or noisy information that can mislead the answer.
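A rough way to surface mismatches is to check how much of the query each retrieved document actually covers. The following sketch uses plain token overlap as a stand-in for a real relevance model; the helper names and the 0.5 threshold are assumptions chosen for illustration.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def query_coverage(query, doc):
    """Share of query tokens that also occur in the document."""
    q = tokens(query)
    if not q:
        return 0.0
    return len(q & tokens(doc)) / len(q)


def split_by_relevance(query, docs, min_coverage=0.5):
    """Separate documents into on-target and off-target lists.

    min_coverage is an assumed cutoff; tune it on labelled data.
    """
    on_target, off_target = [], []
    for doc in docs:
        if query_coverage(query, doc) >= min_coverage:
            on_target.append(doc)
        else:
            off_target.append(doc)
    return on_target, off_target


if __name__ == "__main__":
    docs = [
        "How to reset the password on your router",
        "Router firmware release notes",
        "Baking sourdough bread at home",
    ]
    keep, drop = split_by_relevance("reset router password", docs)
    print(keep)
    print(drop)
```

Lexical overlap misses paraphrases, so in practice a cross-encoder or embedding similarity score would likely replace `query_coverage`; the filtering step around it stays the same.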
What is generation drift?
Generation drift occurs when the generated answer departs from the retrieved evidence, producing hallucinations or inconsistent details even when the retrieved documents are relevant.
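Drift can be approximated by checking each answer sentence against the retrieved evidence. The sketch below flags sentences whose tokens are mostly absent from the evidence. It is a crude lexical proxy, not an entailment check, and the function names and the 0.6 support threshold are assumptions.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def unsupported_sentences(answer, evidence_docs, min_support=0.6):
    """Return answer sentences with low token overlap against the evidence.

    min_support is an assumed cutoff on the share of a sentence's tokens
    that must appear somewhere in the retrieved documents.
    """
    evidence_tokens = set()
    for doc in evidence_docs:
        evidence_tokens |= tokens(doc)

    flagged = []
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", answer.strip()):
        sent_tokens = tokens(sentence)
        if not sent_tokens:
            continue
        support = len(sent_tokens & evidence_tokens) / len(sent_tokens)
        if support < min_support:
            flagged.append(sentence)
    return flagged


if __name__ == "__main__":
    evidence = ["The Eiffel Tower is 330 metres tall and was completed in 1889."]
    answer = ("The Eiffel Tower was completed in 1889. "
              "It was designed by Leonardo da Vinci.")
    print(unsupported_sentences(answer, evidence))
```

A production check would more plausibly use a natural language inference model or a second LLM pass, since token overlap cannot detect a sentence that reuses the evidence vocabulary but reverses its meaning.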
How do these failures differ and how can they be mitigated?
A miss is a failure to retrieve relevant evidence, a mismatch is the retrieval of wrong or off-target documents, and drift is generated content that diverges from its sources. Misses and mismatches are retrieval problems, mitigated by improving retrieval recall and ranking. Drift is a generation problem, mitigated by grounding outputs with citations, constraining generation to the retrieved content, and applying post-generation verification.
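The distinction can be turned into a simple triage routine for logged failures. This is a minimal sketch under stated assumptions: labelled relevant document IDs are available, a separate check has already decided whether the answer is grounded, and the order of the tests and the 0.5 precision cutoff are illustrative choices rather than an established standard.

```python
def classify_failure(retrieved_ids, relevant_ids, answer_is_grounded,
                     min_precision=0.5):
    """Label one RAG interaction as 'miss', 'mismatch', 'drift', or 'ok'.

    Checks run in pipeline order, so a retrieval problem is reported
    before a generation problem.
    """
    retrieved = set(retrieved_ids)
    relevant = set(relevant_ids)
    hits = retrieved & relevant

    # Miss: nothing relevant was fetched at all.
    if not hits:
        return "miss"

    # Mismatch: something relevant was fetched, but most results are off-target.
    precision = len(hits) / len(retrieved)
    if precision < min_precision:
        return "mismatch"

    # Drift: retrieval looks fine, yet the answer departs from the evidence.
    if not answer_is_grounded:
        return "drift"

    return "ok"


if __name__ == "__main__":
    print(classify_failure(["d1", "d2"], ["d9"], True))
    print(classify_failure(["d1", "d2", "d3", "d9"], ["d9"], True))
    print(classify_failure(["d9", "d1"], ["d9"], False))
    print(classify_failure(["d9"], ["d9"], True))
```

Counting the labels over a batch of logged queries indicates where to invest first: recall and indexing work for misses, ranking and filtering for mismatches, and grounding or verification for drift.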