Semantic search in Elasticsearch: Step-by-step

Traditional search is great for matching exact terms and even goes further with synonyms, fuzzy matching, and other techniques that aim to find relevant results for a user’s query. However, these techniques fall short when a user phrases a query in natural language or is looking for conceptually related content. This blog showcases an implementation of semantic search using Elasticsearch and e5-small, an NLP model developed by Microsoft researchers that generates embeddings capturing the underlying meaning of a text.

Understanding Semantic Search

Traditional keyword search has a few shortcomings and brings some management complexity, such as:

  • It misses conceptually related results.
  • Managing synonyms becomes increasingly difficult.
  • Multi-language support needs a very complex implementation.
  • Context can get lost in the search process.

The limitations of traditional text search become apparent when, for instance, we search for “car maintenance tips”. A traditional search engine will match documents containing the terms “car”, “maintenance” and/or “tips”. However, it will completely miss highly relevant documents containing the phrase “automobile repair guides”, even though they are conceptually the same. We can mitigate this specific scenario using synonyms, but in general that approach becomes increasingly hard to maintain, and we would still miss the intent of the user because a traditional search engine, in the end, is just matching terms.
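The mismatch can be reproduced with a few lines of Python. This is only a simplified sketch of term matching: a real engine applies analyzers and relevance scoring, and the helper names and the first sample document are made up for illustration.

```python
import re

def tokenize(text):
    """Lowercase a text and split it into a set of alphabetic terms."""
    return set(re.findall(r"[a-z]+", text.lower()))

def term_overlap(query, document):
    """Return the terms that appear in both the query and the document."""
    return tokenize(query) & tokenize(document)

query = "car maintenance tips"
docs = [
    "Ten maintenance tips for your first car",  # hypothetical document
    "Automobile repair guides",                 # phrase from the example above
]

for doc in docs:
    # An empty overlap means a pure term matcher has nothing to score.
    print(doc, "->", sorted(term_overlap(query, doc)))
```

The second document shares no term with the query, so term matching alone cannot retrieve it, however relevant it is.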

Other interesting examples come from the gap between technical and casual language. For example, consider:

  • “Internal combustion engine maintenance” vs. “how to take care of your car’s motor”
  • “Brake system inspection” vs. “checking your brakes”
  • “Transmission fluid replacement” vs. “changing gear oil”

On the other hand, semantic search completely changes this paradigm by extracting the meaning of a text rather than trying to get exact or partially exact matches based on the query. Instead of indexing terms and mapping synonyms, a semantic search is based on multidimensional vector representations of the text that capture its meaning and enable vector-related operations, such as K-nearest neighbor algorithms, to find vectors that are close to the one generated from the query in the same multidimensional space. 
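To make the idea of "closeness" concrete, here is a minimal brute-force nearest-neighbor sketch using cosine similarity. The three-dimensional vectors are hand-made for illustration and are not output from any model; real embeddings have hundreds of dimensions, and a search engine would normally use an approximate index rather than comparing against every document.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def k_nearest(query_vector, documents, k):
    """Return the k (title, score) pairs most similar to the query vector."""
    scored = [
        (title, cosine_similarity(query_vector, vector))
        for title, vector in documents.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hand-made toy vectors (assumption: illustration only, not model output).
documents = {
    "Automobile repair guides": [0.9, 0.1, 0.0],
    "Checking your brakes": [0.7, 0.3, 0.1],
    "Sourdough bread recipes": [0.0, 0.1, 0.9],
}
query_vector = [0.8, 0.2, 0.0]  # stands in for "car maintenance tips"

top = k_nearest(query_vector, documents, k=2)
for title, score in top:
    print(f"{score:.3f}  {title}")
```

With these toy values, the two car-related titles rank above the unrelated one even though none of them shares a word with the query.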

Semantic Search Implementation with Elasticsearch

This project will use an NLP model to generate vectors from Wikipedia articles and user queries. It will also use Elasticsearch as the vector database and search engine to retrieve relevant results. 

System Architecture

  • NLP Model: The chosen model, multilingual-e5-small (https://huggingface.co/intfloat/multilingual-e5-small), can generate vectors from text in multiple languages. 
  • Vector Database: Elasticsearch is one of the most versatile search engines. It supports vector storage and operations natively, making the entire process easier. Its RESTful API simplifies indexing documents, posting queries, and performing semantic searches.

Davila, A. (n.d.). Semantic search implementation architecture

Implementation Process

  • Data Ingestion: The e5 model, running inside a Machine Learning node in Elasticsearch, generates vectors from Wikipedia articles through an ingest pipeline that also runs within Elasticsearch. This approach reduces the complexity of the implementation by eliminating the need for an external script to calculate the vectors. As a result, when a new document is received, its vector is automatically calculated and stored.
  • Query Vectorization: The user’s query is vectorized with the same model, so it is projected into the same vector space as the data and the two can be compared.
  • Search Execution: The vector from the query is then compared to the data vectors using a K-Nearest-Neighbor algorithm to find the most similar ones. 
  • Semantic Search Results: The system returns results based on conceptual relevance, not just keyword presence.
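The steps above can be sketched as three request bodies, built here as plain Python dictionaries so they can be inspected without a running cluster. This is one possible setup, not necessarily the exact configuration used in this project: the field names and the model id are assumptions, and it assumes a recent 8.x release in which the inference processor accepts input_output and the knn search option accepts query_vector_builder.

```python
import json

MODEL_ID = ".multilingual-e5-small"  # assumption: id of the deployed model
TEXT_FIELD = "content"               # assumption: field holding article text
VECTOR_FIELD = "content_embedding"   # assumption: field storing the vector
EMBEDDING_DIMS = 384                 # embedding size of multilingual-e5-small

def ingest_pipeline_body():
    """Body for PUT _ingest/pipeline/<name>: embed TEXT_FIELD at index time."""
    return {
        "processors": [
            {
                "inference": {
                    "model_id": MODEL_ID,
                    "input_output": [
                        {"input_field": TEXT_FIELD, "output_field": VECTOR_FIELD}
                    ],
                }
            }
        ]
    }

def index_body(pipeline_name):
    """Body for PUT <index>: route new documents through the pipeline and
    store the resulting vector in an indexed dense_vector field."""
    return {
        "settings": {"index": {"default_pipeline": pipeline_name}},
        "mappings": {
            "properties": {
                TEXT_FIELD: {"type": "text"},
                VECTOR_FIELD: {
                    "type": "dense_vector",
                    "dims": EMBEDDING_DIMS,
                    "index": True,
                    "similarity": "cosine",
                },
            }
        },
    }

def knn_search_body(query_text, k=10, num_candidates=100):
    """Body for POST <index>/_search: the query text is embedded by the same
    model on the cluster, then compared with the stored vectors."""
    return {
        "knn": {
            "field": VECTOR_FIELD,
            "k": k,
            "num_candidates": num_candidates,
            "query_vector_builder": {
                "text_embedding": {"model_id": MODEL_ID, "model_text": query_text}
            },
        },
        "_source": [TEXT_FIELD],
    }

print(json.dumps(knn_search_body("how to take care of your car's motor"), indent=2))
```

Because the same model id appears in both the pipeline and the query, documents and queries land in the same vector space, which is what makes the comparison meaningful.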

Key Advantages

  • Context is everything: Semantic search understands the meaning behind queries, not just the words.
  • No need for synonyms: It finds relevant content even when exact terms aren’t used.
  • Relevance reimagined: Results are based on conceptual similarity, significantly enhancing accuracy.
  • Language barriers? What barriers?: It works effectively across multiple languages.

Example Queries

Davila, A. (n.d.). Semantic search results 1

Davila, A. (n.d.). Semantic search results 2

Davila, A. (n.d.). Semantic search results 3

Conclusion

This approach offers a very interesting alternative to traditional search. By understanding the context of the data, it can deliver better matches for complex user queries without requiring exact terms to be present.

Elasticsearch streamlines this process and enables more complex use cases, such as Retrieval-Augmented Generation (RAG) applications that integrate semantic search with an LLM, or hybrid queries that provide a context-aware search engine.

Personally, I recommend considering semantic search as a part of modern search engines, given that it can deliver better results than traditional text matching by understanding the intent behind the query. It becomes even more powerful when combined with LLMs and AI agents, which enable things like conversational search, making it a foundational piece for next-generation search solutions.

Bibliography

intfloat. (n.d.). Multilingual-e5-small. intfloat/multilingual-e5-small. https://huggingface.co/intfloat/multilingual-e5-small 

Elastic. (n.d.). Semantic Search. Semantic search. https://www.elastic.co/guide/en/elasticsearch/reference/current/semantic-search.html

Written by:

Édgar Alexander Dávila
Elasticsearch Engineer
Country: Ecuador

Introduction to Elasticsearch

Elasticsearch is a powerful search engine enriched with analytical capabilities, all rooted in Lucene. This versatile platform integrates three key solutions: Observability, Security, and Enterprise Search. Moreover, it gives users the flexibility to craft ad hoc applications that leverage its search, machine learning, and analytics functionalities. Whether deployed on-premises or through the Elastic Cloud service, Elasticsearch provides businesses with strong search capabilities and data insights.

Key Features of Elasticsearch:

  • Full Text Search: Elasticsearch offers robust full-text search capabilities, including customizable analyzers tailored to suit specific use cases. 
  • Distributed Architecture and Scalability: Its distributed architecture allows Elasticsearch to scale horizontally, facilitating efficient data management and lifecycle processes. This scalability ensures high availability, making data resilient to major outages. 
  • Fast Response Times: Elasticsearch boasts impressively fast response times, making it ideal for customer-facing search applications. This attribute has led to its widespread adoption by online retailers worldwide.
  • Machine Learning Capabilities: Elasticsearch features dedicated machine learning nodes, providing access to pre-built models and the ability to upload and execute custom models. This opens up avenues for advanced natural language processing (NLP), clustering, and other machine-learning applications.

Main Concepts

1. Kibana: Kibana serves as a vital component within the Elastic ecosystem, offering a web interface for Elasticsearch. Positioned as the visualization and UI layer of the stack, Kibana empowers users with dashboards, maps, and a monitoring interface, facilitating the overall usability of the stack.

2. Elasticsearch Node: An Elasticsearch node represents an individual instance within the Elasticsearch infrastructure. Each node may fulfill one or more roles, such as data storage, master management, or machine learning capabilities.

2.1 Cluster: A cluster comprises one or more Elasticsearch nodes, with a minimum of three recommended to achieve high availability. Within an Elasticsearch cluster, data, processing, and management are shared, ensuring robustness and high availability.

3. Index: An index serves as a mechanism for organizing documents with similar characteristics within Elasticsearch. Each index has settings and mappings that dictate how data is stored and retrieved.

4. Shard: Shards are subdivisions of an index designed to be distributed across data nodes, thereby facilitating scalability and fault tolerance. Replicas are copies of shards maintained on different nodes to ensure data availability in the event of node failures. Additionally, having replicas facilitates distributed query processing, leading to faster response times.
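As a small sketch of how shards and replicas are expressed, the function below builds the settings body that could be sent when creating an index. The function names and the example values are illustrative assumptions; number_of_shards and number_of_replicas are the index settings involved.

```python
def index_settings(primary_shards, replicas_per_primary):
    """Body for PUT <index>: how many primary shards the index is split into
    and how many replica copies each primary gets."""
    return {
        "settings": {
            "index": {
                "number_of_shards": primary_shards,
                "number_of_replicas": replicas_per_primary,
            }
        }
    }

def total_shard_copies(primary_shards, replicas_per_primary):
    """Total shards the cluster has to place: each primary plus its replicas."""
    return primary_shards * (1 + replicas_per_primary)

# Example values (assumption): 3 primaries, 1 replica each.
print(index_settings(3, 1))
print(total_shard_copies(3, 1))  # 3 primaries + 3 replicas
```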

Basic Architecture for an Elastic Deployment

The simplest architecture that ensures high availability and stability typically consists of three nodes, each fulfilling both the data and master roles. Among these nodes, one is elected as the active master. With this configuration, up to two replicas can be maintained, distributing data across all nodes for redundancy.
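The arithmetic behind the three-node recommendation can be sketched as follows. The helper names are hypothetical; the reasoning assumes that a replica is never placed on the same node as its primary and that electing a master requires a majority of the master-eligible nodes.

```python
def max_useful_replicas(data_nodes):
    """A replica cannot share a node with its primary, so at most
    data_nodes - 1 replicas per shard can actually be allocated."""
    return max(data_nodes - 1, 0)

def master_quorum(master_eligible_nodes):
    """Smallest majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1

def tolerated_master_failures(master_eligible_nodes):
    """Master-eligible nodes that can be lost while a majority remains."""
    return master_eligible_nodes - master_quorum(master_eligible_nodes)

nodes = 3
print(max_useful_replicas(nodes))        # replicas per shard that fit
print(master_quorum(nodes))              # nodes needed to elect a master
print(tolerated_master_failures(nodes))  # node losses the cluster survives
```

With two nodes the quorum is still two, so losing either node blocks master election; three is the smallest size that tolerates one failure.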

Access is facilitated through a dedicated Kibana node, establishing a connection to the Elasticsearch nodes. Via Kibana, users can execute queries, construct visualizations, and manage the cluster, including configuration adjustments within Elasticsearch.

Alternatively, data can be accessed by sending requests to the RESTful API provided by Elasticsearch. This approach makes it possible to perform programmatically the same tasks that are accomplished through Kibana. A common scenario involves generating a search request based on user input, forwarding it to Elasticsearch, and presenting the results on the frontend.
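A minimal sketch of that scenario using only the Python standard library is shown below. The cluster URL, the index name, and the field name are assumptions, and a real deployment would also need authentication and TLS. The request is built but not sent, so the example runs without a cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured test cluster
INDEX = "articles"                # hypothetical index name

def build_search_request(user_input, size=10):
    """Turn raw user input into a POST <index>/_search request."""
    body = {"query": {"match": {"content": user_input}}, "size": size}
    return urllib.request.Request(
        url=f"{ES_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def extract_hits(response_json):
    """Reduce a search response to (id, score) pairs for the frontend."""
    return [(hit["_id"], hit["_score"]) for hit in response_json["hits"]["hits"]]

request = build_search_request("checking your brakes")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request would then be a matter of passing it to urllib.request.urlopen and decoding the JSON response before calling extract_hits.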

Going further, we can have much more complex architectures, with multiple Kibana nodes; dedicated coordinating, master, and machine learning Elasticsearch nodes; and even data tiers.

Elasticsearch emerges as an invaluable tool catering to a spectrum of real-time use cases, ranging from its comprehensive full-text search functionality to leveraging machine learning-powered forecasting. Having a robust architecture that ensures high availability and the option to use it as a service, Elasticsearch can be used in production environments with confidence. In my experience, Elasticsearch is a very useful tool that enables a wide range of use cases and adapts very well to any of the client’s needs. It is useful to build search engines, recommendation systems, observability, and security platforms alike.

Written by:

Alexander Dávila
Software Engineer – Elastic Certified Engineer & Elastic Certified Analyst
Country: Ecuador