From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)
Create collections
Having set up the database and users, let's now create a collection. We will use a dataset that contains course names and descriptions, create an embedding vector for each description, and make it searchable. First, we need to set up a schema for the collection by defining its fields one by one. The first field is course_id. We provide a name and set the datatype to INT64. This is also set as the primary key with a maximum length of 32. Next is the title of the course, which is set as a VARCHAR field with a max_length of 256. Then the description field is set as VARCHAR with a max_length of 2048. Finally, we set up the vector field for storing the embedding of the description. We specify the datatype as FLOAT_VECTOR. The dimension attribute must be set to the exact size of the embedding, which depends on the embedding model used. We will be using OpenAI's embedding model, which produces embeddings of dimension 1536. Once the fields are set…