What Is a RAG Application?
RAG is a technique that combines the power of information retrieval with the creativity of generative large language models (LLMs).
With this technique, the model becomes more useful: the application stores data in a database, and the model can draw on that data when it answers.
Traditional LLMs depend entirely on their training data. They can generate human-like answers, but they have limitations; for example, they cannot access up-to-date data.
RAG overcomes this limitation. With RAG, the application around the LLM can store up-to-date data and retrieve it at query time. A traditional LLM on its own cannot store or fetch new information; RAG extends its abilities.
That is why the technique is called Retrieval-Augmented Generation (RAG): retrieved data augments what the model generates.
Technical Definition
RAG was designed to improve large language model performance and reduce hallucinations by enabling LLMs to go beyond their training data, with access to external knowledge bases.
RAG stores data as vector embeddings in a vector database.
I covered vector embeddings in a recent blog post. Pinecone, Qdrant, and Astra are common vector databases, and MongoDB can now store vector embeddings as well.
The Two Phases of RAG
~ Indexing
In this phase, the user provides raw data. This raw data is split into chunks, each chunk is transformed into a vector embedding, and the embeddings are stored in the vector database.
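The indexing steps can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular vector database works: `chunk_text`, `toy_embed`, and `build_index` are hypothetical names, the hashed bag-of-words function stands in for a real embedding model, and a plain list stands in for the vector database.

```python
import hashlib
import math

def chunk_text(text, chunk_size=8):
    """Split raw text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def toy_embed(text, dim=16):
    """Hypothetical stand-in for an embedding model:
    a hashed bag of words, scaled to unit length."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def build_index(raw_text, chunk_size=8):
    """Chunk, embed, and store. The returned list plays the role
    of the vector database (an assumption for this sketch)."""
    return [{"text": chunk, "vector": toy_embed(chunk)}
            for chunk in chunk_text(raw_text, chunk_size)]

raw_data = ("RAG combines retrieval with generation. "
            "Documents are split into chunks. "
            "Each chunk is embedded and stored.")
index = build_index(raw_data, chunk_size=6)
for entry in index:
    print(entry["text"])
```

In a real application, the embedding call and the storage step would go through an embedding model and a vector database client instead of these toy functions.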
~ Retrieval
In this phase, the user's query is converted into a vector embedding. Using this embedding, the system searches the vector database and retrieves the most relevant chunks, which are then passed to the LLM as context for answering the query.
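Here is a rough sketch of the retrieval phase, assuming a tiny in-memory index. The word-count `embed` function is a hypothetical stand-in for a real embedding model, and `retrieve` and `build_prompt` are illustrative names, not part of any library. Cosine similarity is used as the similarity measure, which is a common choice but an assumption here.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    return [w.strip(".,?!").lower() for w in text.split()]

def embed(text, vocab):
    """Hypothetical stand-in for an embedding model: word counts over a fixed vocabulary."""
    counts = Counter(tokenize(text))
    return [float(counts[w]) for w in vocab]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, index, vocab, top_k=2):
    """Embed the query and return the top_k (score, text) pairs by similarity."""
    q = embed(query, vocab)
    scored = [(cosine(q, e["vector"]), e["text"]) for e in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

def build_prompt(query, results):
    """Put the retrieved chunks in front of the question for the LLM."""
    context = "\n".join(text for _score, text in results)
    return f"Context:\n{context}\n\nQuestion: {query}"

chunks = [
    "Pinecone is a vector database.",
    "RAG reduces hallucinations in language models.",
    "Chunks are converted into vector embeddings.",
]
vocab = sorted({w for c in chunks for w in tokenize(c)})
index = [{"text": c, "vector": embed(c, vocab)} for c in chunks]

query = "How does RAG reduce hallucinations?"
results = retrieve(query, index, vocab, top_k=1)
prompt = build_prompt(query, results)
print(prompt)
```

The prompt built at the end is what would be sent to the LLM, so the model answers from the retrieved chunk rather than from its training data alone.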
Limitations of RAG
~ Data Leakage
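Data leakage is a major concern with RAG. Anything indexed in the vector database, including sensitive or private documents, can be retrieved and surfaced in the model's answers.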
~ Cannot Perform Aggregation Operations
If you ask an LLM powered by RAG to calculate the sum over an unknown dataset, this is not just difficult; with a vector database alone it is effectively impossible. Vector databases store data in vector format and are designed mainly for similarity search, so a query returns only the few most similar chunks and the model never sees the whole dataset. Essentially, a vector database alone is not sufficient to handle aggregation operations.
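A small sketch makes the problem concrete. The order records below are made-up illustration data, and `top_k_by_overlap` is a hypothetical stand-in for similarity search that ranks records by shared words. The point is only that the model receives k records, not all of them.

```python
# Made-up records for illustration; "amount" is the field we want to sum.
records = [
    {"text": "Order 1: laptop, amount 1200", "amount": 1200},
    {"text": "Order 2: mouse, amount 25", "amount": 25},
    {"text": "Order 3: monitor, amount 300", "amount": 300},
    {"text": "Order 4: keyboard, amount 80", "amount": 80},
    {"text": "Order 5: webcam, amount 45", "amount": 45},
]

def top_k_by_overlap(query, records, k):
    """Hypothetical stand-in for similarity search:
    rank records by the number of words shared with the query."""
    q = set(query.lower().split())
    ranked = sorted(
        records,
        key=lambda r: len(q & set(r["text"].lower().split())),
        reverse=True,
    )
    return ranked[:k]

# Only k records ever reach the LLM, so any "sum" it computes is partial.
retrieved = top_k_by_overlap("total amount of all orders", records, k=2)
partial_sum = sum(r["amount"] for r in retrieved)

# The correct total needs a pass over every record, which is what
# a regular database aggregation does.
true_sum = sum(r["amount"] for r in records)

print(partial_sum, true_sum)
```

Raising k only delays the problem: for a dataset of unknown size, there is no k that guarantees every record is retrieved.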