From the course: LLM Foundations: Vector Databases for Caching and Retrieval Augmented Generation (RAG)
Distance measure considerations
When doing semantic search with vector databases, a key design consideration is the distance measure, and it's critical to understand how that measure behaves for your specific use case. As seen in the earlier code examples, a vector search will always return hits as long as there are records available in the database. If we set a limit of 10 in the query, it will return 10 records as long as there are at least 10 records in the database. The results are sorted by the distance between the vector for the search string and the vectors for the strings stored in the database. So how do we determine if there is actually a match? We need to use distance or similarity thresholds. A distance threshold is the maximum distance at which we still consider a result to be a match. When a search is executed in Milvus, we can set the radius search parameter to this value so the search only returns those results where the distance is below the radius. What exactly do we mean by similar when comparing two strings? How close…
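To make the threshold idea concrete, here is a minimal sketch in plain Python. It does not call Milvus. The three-dimensional vectors, the helper names `l2_distance`, `top_k`, and `within_radius`, and the radius of 0.5 are all made up for illustration; real embeddings have far more dimensions, and a sensible threshold depends on the embedding model and the metric in use.

```python
import math

def l2_distance(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def top_k(query, records, limit):
    """Return up to `limit` records, nearest first, with no threshold applied."""
    scored = [(text, l2_distance(query, vec)) for text, vec in records]
    scored.sort(key=lambda pair: pair[1])
    return scored[:limit]

def within_radius(hits, radius):
    """Keep only hits whose distance is below the threshold."""
    return [(text, dist) for text, dist in hits if dist < radius]

# Toy data: hypothetical strings with hand-picked vectors (assumption).
records = [
    ("reset my password", [0.9, 0.1, 0.0]),
    ("change account password", [0.7, 0.2, 0.1]),
    ("best pizza toppings", [0.0, 0.1, 0.9]),
]
query = [0.85, 0.15, 0.05]

# A limit-only search returns every record it can, relevant or not.
hits = top_k(query, records, limit=3)

# Applying a distance threshold drops the unrelated record.
matches = within_radius(hits, radius=0.5)

for text, dist in hits:
    label = "match" if dist < 0.5 else "no match"
    print(f"{dist:.3f}  {label:8}  {text}")
```

The limit-only search returns all three records, including the unrelated one, while the thresholded list keeps only the two password-related strings. In Milvus, the same filtering can be pushed into the query by passing a radius in the search parameters, as the transcript describes, instead of filtering on the client.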