Fusion-in-Decoder (FiD) is a Retrieval-Augmented Generation (RAG) technique in which a language model encodes multiple retrieved passages independently, each typically paired with the question, using a shared encoder. The resulting representations are concatenated and passed to the decoder, which attends over all of them jointly to generate a contextually grounded response. This approach lets the model synthesize evidence from diverse sources, improving answer accuracy and coverage in complex open-domain question-answering tasks.
What is Fusion-in-Decoder (FiD) in NLP?
FiD is a question-answering approach that retrieves multiple passages and fuses their information during decoding, letting the decoder attend to all retrieved passages to generate an answer grounded in sources.
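A minimal sketch of this pipeline in numpy, assuming toy dimensions and a stand-in linear "encoder" (real FiD uses a shared Transformer encoder, e.g. T5's):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, seq_len, n_passages = 8, 5, 3

# Stand-in "encoder": one fixed linear projection applied independently
# to each (question + passage) token sequence. In real FiD this is a
# shared Transformer encoder.
W_enc = rng.normal(size=(d_model, d_model))

def encode(tokens):
    # tokens: (seq_len, d_model) token embeddings for one passage
    return tokens @ W_enc

# Encode each retrieved passage separately: no cross-passage attention
# happens at this stage.
passages = [rng.normal(size=(seq_len, d_model)) for _ in range(n_passages)]
encodings = [encode(p) for p in passages]

# Fusion happens downstream: concatenate all encodings along the sequence
# axis so the decoder can cross-attend over every passage at once.
memory = np.concatenate(encodings, axis=0)  # (n_passages * seq_len, d_model)
assert memory.shape == (n_passages * seq_len, d_model)
```

The key structural point is that the encoder never sees more than one passage at a time; only the concatenated `memory` exposed to the decoder joins them.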
How does FiD differ from a standard encoder-decoder model?
FiD processes several retrieved passages and decodes by attending to all their encodings, enabling evidence fusion, whereas a standard model typically handles a single input.
What is the role of the encoder in FiD?
The encoder processes each retrieved passage independently, typically with the question prepended, producing one encoding per passage; these encodings are then concatenated and consumed by the decoder.
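In the original FiD setup (Izacard & Grave, 2021), each encoder input pairs the question with one passage using "question: ... title: ... context: ..." fields; a sketch of that input construction (the question and passages here are illustrative):

```python
question = "Who wrote The Origin of Species?"
retrieved = [
    {"title": "Charles Darwin",
     "text": "Darwin published On the Origin of Species in 1859."},
    {"title": "Natural selection",
     "text": "The theory was introduced by Charles Darwin."},
]

# One encoder input per passage: the question is prepended to each, so
# every passage is encoded jointly with the question but independently
# of the other passages.
inputs = [
    f"question: {question} title: {p['title']} context: {p['text']}"
    for p in retrieved
]
for s in inputs:
    print(s)
```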
How does the decoder fuse information from multiple passages?
The decoder uses cross-attention over the encoder outputs from all retrieved passages to generate a single, coherent answer that combines evidence.
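This fusion step can be sketched as scaled dot-product cross-attention over the concatenated encoder outputs (numpy, toy dimensions, a single head and a single decoder position for clarity):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_passages, seq_len = 8, 3, 5

# Concatenated encoder outputs from all passages: (n_passages*seq_len, d)
memory = rng.normal(size=(n_passages * seq_len, d))

# One decoder query vector cross-attends over ALL passage tokens at once;
# this joint softmax is where evidence from different passages is fused.
q = rng.normal(size=(d,))
scores = memory @ q / np.sqrt(d)          # (n_passages*seq_len,)
weights = np.exp(scores - scores.max())
weights /= weights.sum()                  # one softmax across all passages
context = weights @ memory                # fused context vector, shape (d,)

assert np.isclose(weights.sum(), 1.0)
assert context.shape == (d,)
```

Because the softmax is taken over all passages' tokens together, a single generated token can draw on evidence from several passages at once.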
When should you use FiD, and what are its benefits?
Use FiD for open-domain QA or tasks needing synthesis from multiple sources; benefits include improved factual grounding and coverage, with trade-offs like higher compute and memory needs.
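One reason FiD remains tractable: encoding passages independently makes encoder self-attention cost grow linearly with the number of passages, whereas concatenating everything before the encoder would grow quadratically. A rough illustration with hypothetical numbers (100 passages of 250 tokens each):

```python
# Self-attention cost is roughly quadratic in sequence length.
n_passages, seq_len = 100, 250

# FiD: encode each passage separately -> n * L^2 attention pairs.
fid_encoder_cost = n_passages * seq_len**2        # 6,250,000

# Naive alternative: concatenate first -> (n * L)^2 attention pairs.
concat_cost = (n_passages * seq_len) ** 2         # 625,000,000

# The ratio is exactly n_passages: FiD's encoder is ~100x cheaper here,
# though the decoder still cross-attends over all n * L tokens.
print(concat_cost // fid_encoder_cost)  # -> 100
```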