From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)

Create collections

Having set up the database and users, let's now create a collection. We will use a dataset that contains course names and descriptions, create an embedding vector for each description, and make it searchable. First, we need to set up a schema for the collection by defining its fields one by one. The first field is course_id. We give it a name and set its datatype to INT64. It is also set as the primary key, with a maximum length of 32. Next is the title of the course, which is set as a VARCHAR field with a max_length of 256. The description field is also set as VARCHAR, with a max_length of 2048. Finally, we set up the vector field that stores the embedding of the description. We specify its datatype as FLOAT_VECTOR. The dimension attribute must be set to the exact size of the embedding, which depends on the embedding model used. We will be using OpenAI's embedding model, which produces embeddings of dimension 1536. Once the fields are set…
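The field list described above can be sketched in plain Python. This is a library-agnostic illustration, not the course's actual code: the excerpt does not show the client API, so FieldSpec, COURSE_FIELDS, and validate_row are hypothetical names, and the field names title, description, and desc_embedding are assumptions (only course_id is named in the transcript). The datatypes, lengths, and the 1536 dimension come from the transcript.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FieldSpec:
    """Hypothetical stand-in for a vector database field definition."""
    name: str
    dtype: str                      # "INT64", "VARCHAR", or "FLOAT_VECTOR"
    is_primary: bool = False
    max_length: Optional[int] = None
    dim: Optional[int] = None       # only meaningful for FLOAT_VECTOR


# Dimension of the OpenAI embedding model mentioned in the transcript.
EMBEDDING_DIM = 1536

# Field names other than course_id are assumed for illustration.
COURSE_FIELDS = [
    FieldSpec("course_id", "INT64", is_primary=True, max_length=32),
    FieldSpec("title", "VARCHAR", max_length=256),
    FieldSpec("description", "VARCHAR", max_length=2048),
    FieldSpec("desc_embedding", "FLOAT_VECTOR", dim=EMBEDDING_DIM),
]


def validate_row(row: dict, fields: List[FieldSpec] = COURSE_FIELDS) -> List[str]:
    """Return a list of schema violations for one row (empty if valid)."""
    errors = []
    for f in fields:
        if f.name not in row:
            errors.append(f"missing field: {f.name}")
            continue
        value = row[f.name]
        if f.dtype == "INT64":
            if not isinstance(value, int) or isinstance(value, bool):
                errors.append(f"{f.name}: expected an integer")
        elif f.dtype == "VARCHAR":
            if not isinstance(value, str):
                errors.append(f"{f.name}: expected a string")
            elif f.max_length is not None and len(value) > f.max_length:
                errors.append(f"{f.name}: longer than {f.max_length} characters")
        elif f.dtype == "FLOAT_VECTOR":
            # The vector length must match the schema dimension exactly.
            if len(value) != f.dim:
                errors.append(f"{f.name}: expected {f.dim} values, got {len(value)}")
    return errors
```

Calling validate_row on a row whose embedding has 768 values instead of 1536 returns a single error for the vector field, which mirrors why the dimension attribute has to match the embedding model exactly. A real client library would carry these same attributes in its own field-definition objects; that mapping is an assumption here, since the excerpt does not name the client.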
