From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)


RAG question-answering process

Having set up the knowledge base, let's now look at what the question-answering process looks like. We have the knowledge base that we built using the curation process discussed in the previous video. A user who wants answers issues a prompt to the RAG system. The RAG system receives the input prompt and may perform validations like authentication. Additional metadata, such as product IDs or user IDs, may be passed along and used as scalar filters on the knowledge base. We then compute the embedding vector for the prompt using the same embedding model that was used during the curation process, and query the knowledge base for answers. On the vector database, we find the top-K matches to the input query by comparing embeddings, optionally applying the additional metadata as filters. The results from the knowledge base are then used as context for the LLM: the prompt is sent to the LLM along with this context. We will specifically instruct the…
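The flow described above can be sketched in plain Python. This is a minimal illustration, not the course's implementation: the in-memory knowledge base, the stand-in query embedding, and the `product_id` filter field are all hypothetical, and a real system would call an embedding model and a vector database instead.

```python
import math

# Hypothetical in-memory knowledge base: each entry holds an embedding,
# the source text, and scalar metadata (here, a product ID).
KNOWLEDGE_BASE = [
    {"embedding": [1.0, 0.0], "text": "Reset the router by holding the button.", "product_id": "router"},
    {"embedding": [0.9, 0.1], "text": "Router firmware updates ship monthly.", "product_id": "router"},
    {"embedding": [0.0, 1.0], "text": "The camera supports 4K recording.", "product_id": "camera"},
]

def cosine_similarity(a, b):
    # Standard cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, top_k=2, product_id=None):
    # Apply the scalar metadata filter first, then rank by similarity
    # and return the top-K matches.
    candidates = [
        e for e in KNOWLEDGE_BASE
        if product_id is None or e["product_id"] == product_id
    ]
    ranked = sorted(
        candidates,
        key=lambda e: cosine_similarity(query_embedding, e["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, context_entries):
    # The retrieved passages become the context passed to the LLM.
    context = "\n".join(e["text"] for e in context_entries)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# In a real system the query embedding would come from the same embedding
# model used during curation; here it is a stand-in vector.
matches = retrieve([1.0, 0.05], top_k=2, product_id="router")
prompt = build_prompt("How do I reset my router?", matches)
```

The filter runs before the similarity ranking, mirroring how a vector database combines scalar filters with a top-K vector search.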
