How Algolia uses AI to deliver smarter search
AI is a key part of Algolia’s strategy. We are continually investing in AI to enhance and extend our existing battle-tested solutions to the most complex problems in search. With 1 trillion searches per year touching 1 in 6 web users every day, and more than 9,000 customers, we have access to a powerful dataset that enables us to develop the most comprehensive suite of intelligent automation capabilities in the market, while maintaining our stringent data privacy policies.
Why AI in search is hard
Search is a complex problem that can be an important business differentiator. As Doug Turnbull, author of “Relevant Search” and a prominent relevance expert, put it: “Delivering a truly relevant search experience has been elusive, if not a critical blind spot for most organizations.” Even market leaders in consumer-grade search like Netflix and Amazon struggle to optimize and tune their search results with their AI stack.
Building great search requires companies to evolve their relevance by solving multiple problems:
- Clean and structured data. Search data, often in large amounts, needs to be cleaned up and indexed before it can serve as input for search. For example, products can lack certain “attributes” like color or material, or have messy descriptions that need to be organized and structured so that the search engine can recognize them. Custom code and AI can solve this algorithmically, saving significant time and resources.
- User intent detection. Humans search in “messy” ways: we use different synonyms to describe the same thing (“red trousers” for “red pants”), we make typos (“cooiking” for “cooking”), and ask questions rather than type in or speak clean search terms. Furthermore, we speak different languages with their own specificities, such as very long words or words without spaces. This is where Natural Language Processing (NLP) — a branch of AI that helps read, understand, and make sense of human languages — comes in.
- Result ranking (and re-ranking). Once the search engine “understands” the user intent, it needs to return results ranked so that the user perceives them as most relevant in the context of her query. Ranking formulas alone are a significant challenge: even with the best of them, there will be edge cases that the main configuration doesn’t cover. In addition, presenting the results most relevant to the user is only half of the equation. Imagine a situation in which the top result based on user relevance is a discontinued product, and the next most relevant one is a product on which the business makes no money. In that case, the business will want to re-rank the results so that they are also optimized for “business relevance”: metrics such as availability, margins, and popularity. Fine-tuning ranking formulas query by query is incredibly time-consuming: this is where AI can be used to analyze user behavior and suggest improved ranking based on the findings, then iterate and self-learn to improve further.
- Transparency. Given the complexity of each of the above problems, it is critical for businesses to understand how each layer of AI affects search results and how the layers work together, and, perhaps most importantly, to be able to make changes based on the findings. Even before AI, good search solutions offered relevance understanding and optimization to developers only. Complex ranking formulas (e.g., assigning one unique score to each document and mixing a lot of information like attribute weights, proximity between words, etc.) are effectively a black box, making it nearly impossible for a search team to understand why results are appearing in a certain order, and how to adjust them to achieve the expected relevance.
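To make the “business relevance” idea from the list above concrete, here is a minimal sketch in Python. It is not Algolia’s ranking formula: the Product fields, the score values, and the choice to bucket textual relevance to one decimal place are all assumptions made for illustration. It shows one way to push unavailable items down and let margin and popularity break ties between results that are about equally relevant to the user.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    text_score: float  # hypothetical textual relevance, higher is better
    available: bool    # False for discontinued or out-of-stock items
    margin: float      # hypothetical profit margin, 0.0 to 1.0
    popularity: int    # hypothetical popularity signal, e.g. recent sales


def rerank_for_business(products: list[Product]) -> list[Product]:
    """Order by availability, then coarse textual relevance,
    then margin and popularity as tie-breakers."""
    def sort_key(p: Product):
        return (
            not p.available,          # available items first (False < True)
            -round(p.text_score, 1),  # bucket relevance to one decimal place
            -p.margin,                # higher margin wins within a bucket
            -p.popularity,            # then higher popularity
        )
    return sorted(products, key=sort_key)


results = [
    Product("Classic trousers (discontinued)", 0.97, False, 0.30, 500),
    Product("Slim trousers", 0.91, True, 0.00, 300),
    Product("Chino trousers", 0.89, True, 0.25, 200),
]
reranked = rerank_for_business(results)
print([p.name for p in reranked])
```

In this toy data, the two available products fall into the same relevance bucket, so the one with a non-zero margin is listed first and the discontinued product drops to the end. Because each criterion is a separate element of the sort key, it stays easy to see why a given result landed where it did.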
In order for AI to be applied successfully, it needs to solve for each of the above problems, while ensuring that all AI algorithms work well holistically. In other words: to provide quality consumer-grade search that drives measurable business results, companies need a comprehensive approach to AI.
This is why Algolia has built a family of AI algorithms that solve the most important aspects of search. Think of it as an AI studio for search: a group of solutions to the most complex search problems, packaged inside an easy-to-use and flexible API. This unique approach applies state-of-the-art practices for each specific need (be it an industry, company, use case, or partner solution) to help our customers achieve more in less time.
Algolia’s AI platform in 2020
In 2020, we continue to enhance our existing AI practices and expand to new ones:
- Natural Language Processing (NLP) is a branch of AI that helps read, understand, and make sense of human languages. Algolia’s search engine uses advanced machine learning algorithms to automatically recognize words that would normally be difficult to decipher, such as words written without spaces in Asian languages like Japanese and Chinese, or long compound words in languages such as German, Dutch, Finnish, and Swedish. One example is what’s often cited as the longest Dutch word: Kindercarnavalsoptochtvoorbereidingswerkzaamhedenplan, meaning “preparation activities plan for a children’s carnival procession”.
- Automated personalization. Personalizing search results is a surefire way to improve the user experience and maximize business KPIs. Algolia’s automatic personalization analyzes user signals (like a click on a product page, a purchase, an add to cart, an add to wishlist, etc.) to deduce a profile for each user according to a business configuration, and serve a personalized search and discovery experience for each user, while preserving user privacy. What’s unique about Algolia is how we handle the ranking. We are able to personalize while respecting all other key elements of relevance. Our customers can understand how results are ranked based on personalization and each relevance factor (textual factors like synonyms, plurals, lemmatization, and business relevance factors like margin or popularity). They can then iterate and adjust to get to that perfect search result formula.
- Dynamic synonym suggestions. Configuring synonyms is one of the most important (and complex) ways to improve search relevance and increase its business impact. However, synonyms in search are not dictionary based; they are specific to each business and each use case. This makes configuring them a time-consuming, manual process, difficult even for experts who know their content and product catalog. Algolia has built a streamlined process that makes recommendations based on how real users use synonyms, and makes it easy to accept or reject those recommendations. This feature is currently in beta.
- Dynamic re-ranking, in beta this quarter, suggests improved query ranking based on the findings from analyzing user behavior. For example, if the third result for a specific query is the most clicked one, the recommendation or automation will move it to the first position. The feature will improve in accuracy and precision as additional feedback loops from user behavior are analyzed. Keeping with the white-box approach, customers will have full transparency into the algorithmic computations and have the ability to override the behavior. Furthermore, Algolia is the only solution that allows you to know exactly how each of those computations impacts the experience.
- Voice and natural language understanding. Users are expressing more and more queries in the form of natural language instead of keywords, particularly when it comes to voice search. Because search engines are used to understanding and dealing with keywords, we are introducing a layer of AI to analyze user intent before the query reaches the search engine. This new NLU engine enables new conversational interfaces, where search and NLU combine to enhance each other. Algolia’s NLU engine detects intents and entities to understand what users really want, for example identifying that a user who says “add a classy black dress for less than $500 to my cart” wants to perform a different action than the one who says “show me all black dresses.”
- Combining Algolia with customers’ ML/AI models and technologies. Algolia customers can use any data point (click through rate, product or content popularity, etc.), and even insert the results of their own ML model into Algolia’s extensible ranking formula to enhance their feedback loop. Direct measurement of the impact of their decisions is subsequently facilitated by the Algolia Analytics and A/B testing suites.
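The dynamic re-ranking example in the list above (moving the most clicked result to the first position) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Algolia’s algorithm: the function name, the record IDs, and the min_clicks threshold are hypothetical, and a production system would also need to account for effects such as position bias, where higher-ranked results attract more clicks simply because they are shown first.

```python
from collections import Counter


def promote_most_clicked(ranked_ids: list[str], clicked_ids: list[str],
                         min_clicks: int = 3) -> list[str]:
    """Suggest a new order in which the most clicked result comes first.

    ranked_ids: result IDs in their current order for one query.
    clicked_ids: one entry per recorded click on that query's results.
    min_clicks: hypothetical threshold so sparse data changes nothing.
    """
    # Only count clicks on results that are part of the current ranking.
    counts = Counter(cid for cid in clicked_ids if cid in ranked_ids)
    if not counts:
        return list(ranked_ids)
    top_id, top_clicks = counts.most_common(1)[0]
    if top_clicks < min_clicks:
        return list(ranked_ids)
    # Move the winner to the front and keep the rest in their original order.
    return [top_id] + [rid for rid in ranked_ids if rid != top_id]


current = ["sku-1", "sku-2", "sku-3", "sku-4"]
clicks = ["sku-3", "sku-1", "sku-3", "sku-3", "sku-2", "sku-3"]
suggested = promote_most_clicked(current, clicks)
print(suggested)
```

Returning a suggested order rather than changing the ranking in place fits the white-box idea described above: the suggestion and the click counts behind it can be shown to a person, who can accept or override it.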
Our vision is to be able to use any type of custom AI algorithm with Algolia. The first step we released in this direction is to efficiently handle millions of rules in our engine; our customers can use those rules as the output of their AI algorithms to rerank results, promote categories, etc.
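As a rough illustration of using rules as the output of a custom model, the Python sketch below turns per-query predictions into rule-like dictionaries and applies them to a result list. The dictionary layout (“when”, “then”, “promote”) and both function names are invented for this example and do not reproduce Algolia’s actual rule schema or API; the point is only that a model’s output can be expressed as declarative, inspectable rules.

```python
def build_promotion_rules(predictions: dict[str, list[str]]) -> list[dict]:
    """Turn model output (query -> record IDs to promote, best first)
    into rule-like dictionaries. All field names are illustrative."""
    rules = []
    for query, object_ids in predictions.items():
        rules.append({
            "id": "promote-" + query.replace(" ", "-"),
            "when": {"query_is": query},
            "then": {
                "promote": [
                    {"object_id": oid, "position": pos}
                    for pos, oid in enumerate(object_ids)
                ]
            },
        })
    return rules


def apply_rules(query: str, ranked_ids: list[str],
                rules: list[dict]) -> list[str]:
    """Apply every rule whose condition matches the query exactly."""
    result = list(ranked_ids)
    for rule in rules:
        if rule["when"]["query_is"] != query:
            continue
        for promo in rule["then"]["promote"]:
            # Pin the record at its target position, removing any duplicate.
            if promo["object_id"] in result:
                result.remove(promo["object_id"])
            result.insert(promo["position"], promo["object_id"])
    return result


# Hypothetical model output: for "black dress", promote sku-9, then sku-4.
rules = build_promotion_rules({"black dress": ["sku-9", "sku-4"]})
promoted = apply_rules("black dress", ["sku-1", "sku-4", "sku-9"], rules)
untouched = apply_rules("red pants", ["sku-1", "sku-4", "sku-9"], rules)
print(promoted, untouched)
```

Keeping the model’s decisions as plain data means each promotion can be reviewed, edited, or deleted individually, which is harder to do when a model re-scores results directly.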
Transparent and understandable
We clearly state what AI features do, where and how they impact the experience, and how they work alongside other features. We made sure that our customers have transparency into how their relevance is computed, as opposed to having to blindly trust an algorithm that they don’t understand. We call this the white box approach, and we believe it is key to keeping customers empowered, informed by the insights of an iterative approach, and confident in Algolia.
A white box approach is critical for enabling continuous improvement. If a company cannot tell you why results are ranked in a particular order, it will be impossible for you to test alternative ways to tune and improve the results.
What’s next
We will continue to explore, develop, and broaden our AI studio: the suite of integrated algorithms to help our customers harness the power of AI, while keeping ease of use and flexibility as design principles.
For example, we are a launch partner with OpenAI to bundle their technology on top of our search engine. The goal is to be able to provide the answers to user questions, in addition to a list of results that the user needs to analyze (similar to what Google is doing on a query like “Why did Franklin Roosevelt support the formation of the UN”). You can imagine how this technology would be useful in the context of a help center, with significant impact on reducing support cost, or for news organizations to keep readers coming back for the latest facts.
We are evaluating many other AI solutions, including, but not limited to:
- AI-powered cleaning and structuring of your data
- Dynamic merchandising suggestions: identifying categories to promote on specific queries
- Automatic configuration of searchable attributes (one of the most important elements of your ranking formula, such as product name or category), making sure it always reflects the way your users are searching
- Dynamic filter optimization, to ensure the top filter values you display in your UI match your users’ behavior
As AI continues to mature, we continue our work to reduce search complexity and enable our customers to provide the best search and discovery experiences to their users. And as natural language processing (NLP) becomes better at analyzing intent, we will get closer to the ideal of a two-way dialogue in conversational search. Eventually, we humans will interact with software like we would with a human assistant. Stay tuned as Algolia paves the way to this not-so-distant scenario.