Retrieval-Augmented Generation (RAG) has been shown to enhance the factual accuracy of Large Language Models (LLMs), but existing methods often suffer from limited reasoning capabilities in effectively using the retrieved evidence, particularly when using open-source LLMs. To mitigate this gap, we introduce a novel framework, Open-RAG, designed to enhance reasoning capabilities in RAG with open-source LLMs. Our framework transforms an arbitrary dense LLM into a parameter-efficient sparse mixture of experts (MoE) model capable of handling complex reasoning tasks, including both single- and multi-hop queries. Open-RAG uniquely trains the model to navigate challenging distractors that appear relevant but are misleading. As a result, Open-RAG leverages latent learning, dynamically selecting relevant experts and integrating external knowledge effectively for more accurate and contextually relevant responses. In addition, we propose a hybrid adaptive retrieval method to determine retrieval necessity and balance the trade-off between performance gain and inference speed. Experimental results show that the Llama2-7B-based Open-RAG outperforms state-of-the-art LLMs and RAG models in various knowledge-intensive tasks, surpassing ChatGPT, Self-RAG, and Command R+ in the RAG setting.
@inproceedings{
islam2024openrag,
title={Open-{RAG}: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models},
author={Islam, Shayekh Bin and Rahman, Md Asib and Hossain, KSM Tozammel and Hoque, Enamul and Joty, Shafiq and Parvez, Md Rizwan},
booktitle={The 2024 Conference on Empirical Methods in Natural Language Processing},
year={2024},
url={https://openreview.net/forum?id=J8H25KJ1cv}
}