Retrieval Augmented Generation (RAG) is gaining traction among IT leaders and businesses eager to adopt generative AI. By combining a large language model (LLM) with RAG, organizations can tap into their own data, enhancing the reliability of AI outputs.
So, how does RAG function? What’s its practical application in business? Are there any true alternatives? To explore these questions, TechRepublic spoke with Davor Bonaci, the CTO of DataStax, a company focused on databases and AI. He shared insights on how RAG is being utilized as generative AI rolls out in 2024 and where he sees it heading in 2025.
RAG boosts the accuracy and relevance of generative AI outputs by enriching the model with context from enterprise data. Instead of relying solely on the information the LLM learned from the internet, which can lead to errors such as hallucinations, RAG grounds the model's responses in accurate, domain-specific information.
According to Bonaci, LLMs have been trained on vast amounts of internet data, but they can stumble when faced with enterprise-specific inquiries. RAG addresses this issue by providing the model with precise information. When a user submits a query, the RAG process first retrieves relevant documents or data from a knowledge base or database. This additional context is then used alongside the original question, enabling the model to give informed answers and minimizing the chances of producing misleading or incorrect information.
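To make that flow concrete, here is a minimal, self-contained sketch of the retrieve-then-generate loop described above. The toy knowledge base, the keyword-overlap scoring, and the `call_llm` stub are all invented for illustration; they stand in for a real vector database and model API, not DataStax's or any particular vendor's product.

```python
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    text: str

# Toy in-memory knowledge base. In production this would typically be a
# vector database queried by embedding similarity, not keyword overlap.
KNOWLEDGE_BASE = [
    Document("refund-policy", "Refunds are issued within 14 days of purchase."),
    Document("shipping", "Standard shipping takes 3 to 5 business days."),
]

def retrieve(query: str, k: int = 2) -> list[Document]:
    """Rank documents by naive keyword overlap with the query."""
    terms = set(query.lower().split())
    return sorted(
        KNOWLEDGE_BASE,
        key=lambda d: len(terms & set(d.text.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query: str, docs: list[Document]) -> str:
    """Prepend retrieved context so the model answers from enterprise data."""
    context = "\n".join(f"- {d.text}" for d in docs)
    return f"Answer using only the context below.\nContext:\n{context}\n\nQuestion: {query}"

def call_llm(prompt: str) -> str:
    # Stand-in for a real model API call; it echoes the prompt so the
    # example runs with no external dependencies.
    return f"[model response grounded in]:\n{prompt}"

def answer(query: str) -> str:
    return call_llm(build_prompt(query, retrieve(query)))

print(answer("How long do refunds take?"))
```

The point of the pattern is visible in `build_prompt`: the model is instructed to answer from the retrieved context rather than from whatever it absorbed during training, which is what curbs hallucinated answers.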
Bonaci believes RAG significantly improves LLM outputs; without it, he argues, LLMs can be nearly unusable in a business context. Integrating RAG opens new possibilities for generative AI applications and vastly enhances their reliability.
He pointed out that using RAG can elevate the accuracy of model outputs to over 90% for straightforward tasks. For more complex queries requiring reasoning, the accuracy tends to range between 70% and 80%.
Organizations are employing RAG across various use cases. In automation, businesses use RAG-enhanced LLMs to streamline repetitive tasks, especially in customer support, where they can search documentation and perform actions like processing refunds or managing orders. For personalization, RAG helps summarize large sets of data, tailoring information like customer reviews to a specific user’s context, such as their past purchases. In search functions, RAG refines results, making them more relevant to users’ interests.
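As a hedged illustration of the personalization pattern, the short sketch below filters a review corpus by a customer's purchase history before building the summarization prompt. The product names, review data, and helper function are hypothetical.

```python
# Invented review data keyed by product, standing in for a real review store.
REVIEWS = {
    "trail-shoes": ["Great grip on wet rock.", "Runs half a size small."],
    "road-shoes": ["Light and fast.", "The sole wears quickly."],
}

def personalized_context(past_purchases: list[str]) -> str:
    """Keep only reviews for products this user has actually bought."""
    lines = [
        f"{product}: {review}"
        for product in past_purchases
        for review in REVIEWS.get(product, [])
    ]
    return "\n".join(lines)

# The tailored context, not the whole review corpus, goes into the prompt.
prompt = (
    "Summarize these reviews for the customer:\n"
    + personalized_context(["trail-shoes"])
)
print(prompt)
```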
Bonaci also highlighted the advanced potential of incorporating knowledge graphs with RAG, which offers a refined approach to managing complex data relationships. For example, when a telecom customer queries about plan details, a knowledge graph can systematically organize and clarify conflicting information, allowing the AI to provide accurate answers.
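The sketch below illustrates that idea in miniature, assuming facts are stored as dated triples and that the newest value wins when facts conflict. The telecom plan data and the resolution rule are invented for illustration, not a description of any production knowledge graph.

```python
# Facts stored as (subject, relation, value, effective_date) tuples -- a
# stripped-down stand-in for a real knowledge graph.
TRIPLES = [
    ("PlanA", "monthly_price", "$40", "2023-01"),
    ("PlanA", "monthly_price", "$45", "2024-06"),  # supersedes the 2023 price
    ("PlanA", "data_cap", "20GB", "2024-06"),
]

def lookup(subject: str, relation: str) -> str:
    """Resolve conflicting facts by returning the most recent value."""
    matches = [t for t in TRIPLES if t[0] == subject and t[1] == relation]
    return max(matches, key=lambda t: t[3])[2]

# The resolved fact, rather than the raw conflicting snippets, is what gets
# passed to the model as context.
print(lookup("PlanA", "monthly_price"))  # -> $45
```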
When considering alternatives to RAG, the main option is fine-tuning a generative AI model, which means further training the model on a tailored dataset built from enterprise data. While Bonaci acknowledges that fine-tuning has its place, he points out that it addresses only a narrow range of problems, which has limited its adoption; in his view, RAG remains the most recognized and effective way to make generative AI truly functional for businesses.