Traditional search is great for matching exact terms, and it goes further with synonyms, fuzzy matching, and other techniques that aim to find relevant results for a user’s query. However, these techniques fall short when users express their query in natural language or look for conceptually related content. This post showcases an implementation of semantic search using Elasticsearch and e5-small, an NLP model developed by Microsoft researchers that generates embeddings capturing the underlying meaning of a text.
Understanding Semantic Search
Traditional keyword search has several shortcomings and adds maintenance overhead:
- It misses conceptually related results.
- Managing synonyms becomes increasingly difficult.
- Multi-language support needs a very complex implementation.
- Context can get lost in the search process.
The limitations of traditional text search become apparent when, for instance, we search for “car maintenance tips”. A traditional search engine will match documents containing the terms “car”, “maintenance” and/or “tips”. However, it will completely miss highly relevant documents containing the phrase “automobile repair guides”, even though they are conceptually the same. We can mitigate this specific scenario with synonyms, but in general that approach becomes increasingly hard to maintain, and we would still miss the user’s intent because a traditional search engine, in the end, only matches terms.
Other interesting examples come from technical vs. casual language. For example, consider:
- “Internal combustion engine maintenance” vs. “how to take care of your car’s motor”
- “Brake system inspection” vs. “checking your brakes”
- “Transmission fluid replacement” vs. “changing gear oil”
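To make the gap concrete, here is a minimal sketch of plain term matching applied to some of the pairs above. It is a deliberately naive illustration (no stemming, no synonyms, no fuzzy matching), not how a real search engine analyzes text, and the helper names `terms` and `term_overlap` are made up for this example.

```python
import re

def terms(text: str) -> set[str]:
    """Lowercase a text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def term_overlap(query: str, document: str) -> set[str]:
    """Return the terms shared by query and document (no stemming, no synonyms)."""
    return terms(query) & terms(document)

# Pairs taken from the examples in this post.
pairs = [
    ("car maintenance tips", "automobile repair guides"),
    ("Brake system inspection", "checking your brakes"),
    ("Transmission fluid replacement", "changing gear oil"),
]

for query, document in pairs:
    shared = term_overlap(query, document)
    print(f"{query!r} vs {document!r}: shared terms = {sorted(shared)}")
```

None of the three pairs shares a single term, so a pure term matcher has nothing to score on, even though each pair describes the same task.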
On the other hand, semantic search changes this paradigm by extracting the meaning of a text rather than looking for exact or partial term matches with the query. Instead of indexing terms and mapping synonyms, semantic search relies on multidimensional vector representations of the text that capture its meaning. These vectors enable operations such as k-nearest neighbor (kNN) search, which finds the stored vectors closest to the one generated from the query in the same multidimensional space.
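The idea of “closeness” between vectors can be sketched in a few lines. The example below is a brute-force nearest-neighbor lookup using cosine similarity. The 3-dimensional vectors are hand-made for illustration only; they are not output from any model (multilingual-e5-small produces 384-dimensional embeddings), and the function names are assumptions of this sketch.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def k_nearest(query_vec: list[float], corpus: dict[str, list[float]], k: int):
    """Score every document against the query and keep the k most similar."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Toy vectors, invented for this example; real embeddings come from the model.
corpus = {
    "automobile repair guides": [0.9, 0.1, 0.0],
    "checking your brakes": [0.7, 0.3, 0.1],
    "chocolate cake recipes": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.2, 0.0]  # stand-in for the embedding of "car maintenance tips"

for score, text in k_nearest(query_vec, corpus, k=2):
    print(f"{score:.3f}  {text}")
```

With these toy values, the two car-related documents rank first and the unrelated one is left out, even though none of them shares a term with the query. A production engine typically replaces the exhaustive scan with an approximate index, but the ranking principle is the same.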
Semantic Search Implementation with Elasticsearch
This project will use an NLP model to generate vectors from Wikipedia articles and user queries. It will also use Elasticsearch as the vector database and search engine to retrieve relevant results.
System Architecture
- NLP Model: The chosen model, multilingual-e5-small (https://huggingface.co/intfloat/multilingual-e5-small), can generate vectors from text in multiple languages.
- Vector Database: Elasticsearch is one of the most versatile search engines. It supports vector storage and operations natively, making the entire process easier. Its RESTful API simplifies indexing documents, posting queries, and performing semantic searches.

Davila, A. (n.d.). Semantic search implementation architecture
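As a rough illustration of native vector storage, the sketch below builds an index mapping with a `dense_vector` field sized for multilingual-e5-small (384 dimensions). The field names `title`, `text`, and `text_embedding` are assumptions for this example, not necessarily the ones used in the project. The script only prints the JSON body; it does not contact a cluster.

```python
import json

# Embedding size of multilingual-e5-small.
EMBEDDING_DIMS = 384

def build_index_mapping(vector_field: str = "text_embedding") -> dict:
    """Build a request body for creating an index that stores text plus its vector."""
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},  # hypothetical field name
                "text": {"type": "text"},   # hypothetical field name
                vector_field: {
                    "type": "dense_vector",
                    "dims": EMBEDDING_DIMS,
                    "index": True,           # make the field searchable with kNN
                    "similarity": "cosine",
                },
            }
        }
    }

print(json.dumps(build_index_mapping(), indent=2))
```

The printed body could be sent with a `PUT` request to the index name of your choice, for example through Kibana Dev Tools or any HTTP client.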
Implementation Process
- Data Ingestion: The e5 model running inside a Machine Learning node in Elasticsearch is used to generate vectors from Wikipedia articles through an ingest pipeline also running within Elasticsearch. This approach reduces the complexity of the implementation by eliminating the need for an external script to calculate the vectors. As a result, when a new document is received, the vector is automatically calculated and stored.
- Query Vectorization: The user’s query is vectorized with the same model, so the resulting vector lives in the same vector space as the data vectors and can be compared with them.
- Search Execution: The vector from the query is then compared to the data vectors using a K-Nearest-Neighbor algorithm to find the most similar ones.
- Semantic Search Results: The system returns results based on conceptual relevance, not just keyword presence.
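The steps above can be sketched as two request bodies: an ingest pipeline with an `inference` processor, and a kNN search that lets Elasticsearch embed the query text through `query_vector_builder`. This is a sketch under assumptions, not the exact configuration used in this project: the model id `.multilingual-e5-small` assumes the Elastic-provided E5 deployment, the field names are hypothetical, and the `input_output` form of the processor assumes a recent 8.x release. The code only builds and prints JSON.

```python
import json

# Assumption: the model is deployed under the Elastic-provided id.
MODEL_ID = ".multilingual-e5-small"

def build_ingest_pipeline(source_field: str = "text",
                          vector_field: str = "text_embedding") -> dict:
    """Pipeline body: embed source_field at ingest time and store the vector."""
    return {
        "description": "Embed article text at ingest time",
        "processors": [
            {
                "inference": {
                    "model_id": MODEL_ID,
                    "input_output": [
                        {"input_field": source_field, "output_field": vector_field}
                    ],
                }
            }
        ],
    }

def build_knn_search(query_text: str,
                     vector_field: str = "text_embedding",
                     k: int = 10,
                     num_candidates: int = 100) -> dict:
    """Search body: the same model embeds the query, then kNN finds close vectors."""
    return {
        "knn": {
            "field": vector_field,
            "k": k,                            # number of results to return
            "num_candidates": num_candidates,  # candidates examined per shard
            "query_vector_builder": {
                "text_embedding": {"model_id": MODEL_ID, "model_text": query_text}
            },
        },
        "_source": ["title"],  # hypothetical field to display in results
    }

print(json.dumps(build_ingest_pipeline(), indent=2))
print(json.dumps(build_knn_search("car maintenance tips"), indent=2))
```

The first body would be registered under `_ingest/pipeline/<name>` and referenced when indexing documents; the second would be posted to the index’s `_search` endpoint. Because both bodies point at the same model id, documents and queries end up in the same vector space.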
Key Advantages
- Context is everything: Semantic search understands the meaning behind queries, not just the words.
- No need for synonyms: It finds relevant content even when exact terms aren’t used.
- Relevance reimagined: Results are based on conceptual similarity, significantly enhancing accuracy.
- Language barriers? What barriers?: It works effectively across multiple languages.
Example Queries

Davila, A. (n.d.). Semantic search results 1

Davila, A. (n.d.). Semantic search results 2

Davila, A. (n.d.). Semantic search results 3
Conclusion
This approach offers a compelling alternative to traditional search. By understanding the context of the data, it can deliver better matches for complex user queries without requiring exact terms to be present.
Elasticsearch streamlines the process and enables more complex use cases, such as Retrieval-Augmented Generation (RAG) applications that integrate semantic search with an LLM, or hybrid queries that provide context-aware search.
Personally, I recommend considering semantic search as part of any modern search engine, given that it can deliver better results than traditional text matching by understanding the intent behind the query. It becomes even more powerful when combined with LLMs and AI agents, enabling things like conversational search and making it a foundational piece of next-generation search solutions.
Bibliography
intfloat. (n.d.). Multilingual-e5-small. intfloat/multilingual-e5-small. https://huggingface.co/intfloat/multilingual-e5-small
Elastic. (n.d.). Semantic Search. Semantic search. https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html
Written by:
Édgar Alexander Dávila
Elasticsearch Engineer
Country: Ecuador



