My use case has evolved over time with Elasticsearch. Initially, we adopted it as a search solution. Before Elasticsearch, our primary source of truth was a traditional RDBMS, a SQL database, and it could not cater to the scale we wanted to achieve. So we kept MySQL as the primary source of truth but copied the data into Elasticsearch and moved the search workload, including wildcard searches, over to it.
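A minimal sketch of that split, assuming MySQL stays the source of truth and a copy of the records is indexed into Elasticsearch for querying; the index name, field, and search value are illustrative assumptions, not taken from the actual system.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Wildcard searches like this are what the RDBMS struggled with at scale
# (the SQL equivalent would be a full-table LIKE '%acme%' scan).
resp = es.search(
    index="orders",
    query={"wildcard": {"customer_name": {"value": "*acme*"}}},
)
for hit in resp["hits"]["hits"]:
    print(hit["_id"], hit["_source"].get("customer_name"))
```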
My experience with the relevancy of search results in Elasticsearch covers both traditional keyword search and full-text search. In the supply chain industry, with millions of orders from customers such as CMA CGM, Maersk, or Kuehne+Nagel, filtering orders was essential, whether by shipment number, transportation order number, or an origin or destination number. In the gaming industry at FDJ United, full-text search makes more sense for understanding gaming intent. For example, when a user searches for 'I really want to play action games', we break down that full-text query, run it through custom text analyzers, and derive the intent behind it, using a vector database alongside Elasticsearch.
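A hypothetical sketch of those two relevancy patterns, assuming a recent elasticsearch-py client; the index names, field names, filter values, and analyzer settings are illustrative, not taken from the original systems.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Pattern 1: exact keyword filtering on supply-chain orders.
orders = es.search(
    index="orders",
    query={
        "bool": {
            "filter": [
                {"term": {"shipment_number": "SHP-0042"}},
                {"term": {"origin": "NLRTM"}},
            ]
        }
    },
)

# Pattern 2: full-text search through a custom analyzer that lowercases,
# removes stopwords, and stems, so a query like "I really want to play
# action games" reduces to the tokens that actually carry intent.
es.indices.create(
    index="games",
    settings={
        "analysis": {
            "analyzer": {
                "intent_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "stop", "porter_stem"],
                }
            }
        }
    },
    mappings={
        "properties": {
            "description": {"type": "text", "analyzer": "intent_analyzer"}
        }
    },
)

games = es.search(
    index="games",
    query={"match": {"description": "I really want to play action games"}},
)
```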
My assessment of the effectiveness of hybrid search, combining vector and text search, is that Elasticsearch is remarkable for text-based search; I have explored other solutions, but none beats Elasticsearch in that area. When I build hybrid searches with a vector database, the vector database stores the mathematical representation of the data. For instance, to find the ten documents closest to a query, the vector database runs cosine similarity over the stored embeddings and returns the top 10 results, while Elasticsearch holds the metadata, enabling quick access to the full records behind the derived intent.
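A minimal sketch of that hybrid flow. NumPy cosine similarity stands in for the vector database, and Elasticsearch returns the metadata for the matched IDs; the index name, helper names, and the assumption that document embeddings and their Elasticsearch IDs are already aligned are all illustrative.

```python
import numpy as np
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

def top_k_cosine(query_vec: np.ndarray, doc_vecs: np.ndarray, k: int = 10) -> list[int]:
    """Indices of the k document vectors closest to the query by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    return list(np.argsort(scores)[::-1][:k])

def hybrid_lookup(query_vec: np.ndarray, doc_ids: list[str], doc_vecs: np.ndarray) -> list[dict]:
    """Vector search for the top 10, then fetch the full metadata from Elasticsearch."""
    nearest = top_k_cosine(query_vec, doc_vecs, k=10)
    ids = [doc_ids[i] for i in nearest]
    resp = es.search(index="catalog", query={"ids": {"values": ids}}, size=len(ids))
    return [hit["_source"] for hit in resp["hits"]["hits"]]
```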
I have drawn on GenAI experience for both semantic search and text-based search in my current project using Elasticsearch. My go-to solution for text-based search will always be Elasticsearch, but for semantic search, I am building a solution centered on system-level understanding agents. For example, if a new engineer asks the agent for an explanation of a system, it scans all the relevant data and provides a comprehensive analysis of the service, using contextualized inputs to reduce hallucination, a controlled temperature for the LLM, and reduced nucleus sampling. For knowledge preservation, I use a vector database to store significant outputs generated by the LLM, depending on how much weight users give the analyses performed.
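A minimal sketch of those generation controls, assuming an OpenAI-style chat API: the retrieved documentation is injected as context, temperature is kept low, and nucleus sampling (top_p) is reduced so the explanation stays close to the source material. The model name, prompt wording, and function signature are illustrative assumptions, not the actual agent.

```python
from openai import OpenAI

client = OpenAI()

def explain_system(question: str, retrieved_context: list[str]) -> str:
    """Answer an onboarding question strictly from the retrieved service docs."""
    context = "\n\n".join(retrieved_context)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0.1,  # low temperature: stay close to the provided context
        top_p=0.2,        # reduced nucleus sampling: drop low-probability tokens
        messages=[
            {
                "role": "system",
                "content": "Explain the system using only the provided context. "
                           "If the context does not cover something, say so.",
            },
            {
                "role": "user",
                "content": f"Context:\n{context}\n\nQuestion: {question}",
            },
        ],
    )
    return resp.choices[0].message.content
```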