From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)

Set up the Milvus cache

The notebook for LLM caching is in the file code_04_XX Using Vector DB as LLM Cache. I have the file loaded up here. Let's proceed to set up a Milvus cache. We will create a database called cache_db and a collection called llm_cache. We first create a connection to the Milvus instance and then check whether the database already exists. If not, we create it. We then switch the connection to use this database. Let's run this code now to create the database. Next, we create the collection. We add a cache_id field as the primary key and make it auto-generated so we don't need to populate it. The prompt_text field stores the input prompt string as is, and the response_text field stores the LLM response as is. Note that we are limiting both fields to 2,048 characters; your use case may require much larger sizes, so plan accordingly. The prompt_embedding field stores the embedding vector for the prompt. Now, we will proceed to…