Vector Search & RAG: A Practical Mental Model for Semantic and Conversational Search
Vector search, semantic search, and RAG are often used interchangeably, but they solve different problems. This article introduces a clear mental model to distinguish search types, techniques, and tools, and to reason about when each approach is appropriate.
Introduction
The goal of this article is to give you a high-level overview of how Vector Search and RAG work, and to show the basic tools used to implement semantic and conversational search (what is all that? you might ask. Stick with me, we’ll get to it).
I won’t go deep into details or trade-offs. Instead, I want to provide a broad mental model that helps you reason about different search types and tools, and then decide where to go deeper depending on your needs.
To do that, I will break this article into three parts, this one being the first:
- Basic concepts and mental model
- Implementing Semantic Search with OpenSearch
- Implementing Conversational Search with OpenSearch and the OpenAI API
Basic concepts and mental models
To understand Vector Search, it helps to first look at searching at the type level and at the tool level, and at how the two correlate. The diagrams below map the different search types and tools, and show which tool fits each type. Hopefully the diagrams make the connections clear without needing a long description for each.
Search Types

Search Techniques

Now that we understand the different types of search and the tools, let's go a bit deeper into the core concepts behind the search types and tools used in this article.
Vector Search
The following is a mental model that helps me simplify how Vector Search works, but there are a couple of concepts to understand first.

How does it work in practice?
The following is an oversimplification and might leave you with some questions, which the next section should hopefully clarify.

At a high level, vector search uses two main ways to find the nearest vectors:
- K vectors with the smallest distance (Euclidean)
- K vectors with the largest cosine similarity
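To make the two notions of "nearest" concrete, here is a minimal brute-force sketch in plain NumPy. The toy 2-D vectors and the query are made up for illustration; real embeddings have hundreds of dimensions, and production systems use approximate nearest-neighbor indexes instead of scanning every vector.

```python
import numpy as np

def top_k_euclidean(query, vectors, k=3):
    # Indices of the k vectors with the SMALLEST Euclidean distance to the query.
    dists = np.linalg.norm(vectors - query, axis=1)
    return np.argsort(dists)[:k]

def top_k_cosine(query, vectors, k=3):
    # Indices of the k vectors with the LARGEST cosine similarity to the query.
    sims = (vectors @ query) / (np.linalg.norm(vectors, axis=1) * np.linalg.norm(query))
    return np.argsort(-sims)[:k]

# Toy 2-D "embeddings": note that [10.0, 0.5] points in the same direction
# as the query but is far away, so the two metrics rank it differently.
vectors = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [10.0, 0.5]])
query = np.array([1.0, 0.05])

euclidean_hits = top_k_euclidean(query, vectors, k=2)  # favors nearby points
cosine_hits = top_k_cosine(query, vectors, k=2)        # favors same-direction points
```

Cosine similarity ignores vector magnitude, which is why it is a common default for text embeddings: document length shouldn't dominate relevance.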

Semantic Search
As we saw earlier, Vector Search is the tool we use to implement Semantic Search. To achieve this, there are two important steps, described below, which will give you a better picture of how everything comes together when a user makes a search.
Indexing phase
In this phase the goal is to take all the data we have available for searching, whether structured or unstructured, create embeddings (vectors with meaning), and store them in a database that supports storing these vectors.
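A sketch of the indexing phase is below. The `embed` function is a toy stand-in so the snippet runs on its own; in a real pipeline you would call an actual embedding model. The index mapping follows the shape used by OpenSearch's k-NN plugin (a `knn_vector` field plus the `index.knn` setting); the index name and field names are my own choices for illustration.

```python
import hashlib

def embed(text: str, dim: int = 8) -> list[float]:
    # Toy deterministic "embedding" so the sketch is runnable.
    # A real system would call an embedding model (e.g. via the OpenAI API).
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:dim]]

# Index definition with a k-NN vector field, in the shape the
# OpenSearch k-NN plugin expects.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "text": {"type": "text"},
            "embedding": {"type": "knn_vector", "dimension": 8},
        }
    },
}

documents = [
    "OpenSearch supports vector search via the k-NN plugin",
    "RAG combines retrieval with an LLM",
]
indexed = [{"text": doc, "embedding": embed(doc)} for doc in documents]

# Against a running cluster, these would be sent with opensearch-py, roughly:
#   client.indices.create(index="docs", body=index_body)
#   for doc in indexed:
#       client.index(index="docs", body=doc)
```

The key point: every document is embedded once, up front, and the vector is stored alongside the original text so search results can return something human-readable.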

Search phase
The diagram below describes what happens when the user makes a query. For this to work, the indexing phase needs to have happened first.
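In code, the search phase boils down to embedding the user's query with the same model used at indexing time and sending a k-NN query. The helper below builds a query body in the shape the OpenSearch k-NN plugin accepts; the `embedding` field name matches the indexing sketch and is my own naming.

```python
def build_knn_query(query_vector: list[float], k: int = 3) -> dict:
    # k-NN query body for OpenSearch: return the k documents whose
    # "embedding" vector is closest to the query vector.
    return {
        "size": k,
        "query": {
            "knn": {
                "embedding": {
                    "vector": query_vector,
                    "k": k,
                }
            }
        },
    }

# Against a running cluster, roughly:
#   response = client.search(index="docs", body=build_knn_query(embed(user_query)))
```

Note the symmetry with indexing: if the query is embedded with a different model (or a different version of the same model), distances become meaningless and results degrade badly.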

RAG
There’s plenty of information out there on this topic, and others very likely explain the theory better than I can, so I won’t go into detail; I will link some references I personally found useful. Instead of theory, I would like to give you a mental model of how RAG is used in the context of conversational search, using what we know about Vector Search as a baseline.

The LLM:
- Compares multiple documents (which are result of a vector search)
- Extracts relevant details
- Explains trade-offs
- Summarizes
- And answers why and how in natural language
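The "conversational" step can be sketched as assembling the retrieved documents into a prompt for the LLM. The message format below matches the OpenAI chat API; the instruction wording, citation scheme, and model name in the comment are my own choices, and the retrieved texts are placeholders standing in for vector-search results.

```python
def build_rag_messages(question: str, retrieved_docs: list[str]) -> list[dict]:
    # Number the retrieved documents so the model can cite them as [n].
    context = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    system = (
        "Answer the user's question using only the context below. "
        "Compare the documents, extract the relevant details, explain "
        "trade-offs, and cite sources as [n].\n\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

# With the openai package, these messages would be sent roughly as:
#   client.chat.completions.create(model="gpt-4o-mini", messages=messages)
messages = build_rag_messages(
    "Which option has lower operational cost?",
    ["Doc about option A...", "Doc about option B..."],
)
```

This is the whole trick of RAG in one picture: vector search narrows the corpus down to a handful of relevant documents, and the LLM does the comparing, extracting, and summarizing listed above over just that context.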
Next
This mental model helps you choose the right search strategy before reaching for a specific tool. Now that we have the foundations, the next parts explore two practical examples: implementing Semantic Search with OpenSearch, and implementing Conversational Search with OpenSearch and the OpenAI API.