Your site search is broken. Maybe it returns nothing useful. Maybe it returns everything, which is just as bad. Before you can fix it, you need to understand what you're actually working with.
Here is a plain-English guide to the key components of a search engine: what each one does, why it matters, and how it shapes the results your customers see.
The search engine itself
The engine is the system that powers what people can search for on your website or product. It has several distinct parts, and they all need to work together.
The index
The index is the master record of everything the search engine should look at. It sets the rules for what gets crawled and, crucially, what gets ignored. Think of it as the foundation. Without a well-configured index, nothing else works properly.
Crawling
Crawling is the process of the search engine reading your site and gathering information according to those index rules. The first crawl pulls everything. Subsequent crawls look for changes: additions, removals, updates.
How often you crawl depends on how often your content changes.
On demand suits small or rarely updated sites. Crawl when you make a change.
Scheduled suits sites with regular updates. A nightly crawl, Tuesday to Saturday, is a sensible default.
Constant is only necessary if your business is built around search, as Amazon, LinkedIn and TikTok are. For most sites, it's overkill.
Crawling is resource-intensive. Running it unnecessarily costs processing power and tends to make engineers grumpy.
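For the technically curious, here is a minimal sketch of how a subsequent crawl might spot additions, removals and updates. It assumes the crawler keeps a fingerprint (a hash) of every page from the last crawl. The function names and the page data are hypothetical, and real platforms handle this in their own ways.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash the page content so a change can be detected cheaply."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def diff_crawl(previous: dict, current_pages: dict):
    """Compare this crawl with the fingerprints saved from the last one.

    previous:      URL -> fingerprint saved by the last crawl
    current_pages: URL -> page text gathered by this crawl
    Returns (added, removed, updated, fingerprints_to_save).
    """
    current = {url: fingerprint(text) for url, text in current_pages.items()}
    added = sorted(url for url in current if url not in previous)
    removed = sorted(url for url in previous if url not in current)
    updated = sorted(
        url for url in current
        if url in previous and current[url] != previous[url]
    )
    return added, removed, updated, current


# Hypothetical example: the first crawl pulls everything, the second finds changes.
first_crawl = {"/home": "Welcome", "/shoes": "Red shoes"}
_, _, _, saved = diff_crawl({}, first_crawl)

second_crawl = {"/home": "Welcome", "/shoes": "Blue shoes", "/hats": "New hats"}
added, removed, updated, saved = diff_crawl(saved, second_crawl)
print(added, removed, updated)
```

Only the pages that come back as added, removed or updated need re-indexing, which is one way to keep the cost of a scheduled crawl down.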
The algorithm
The algorithm is applied to the index. In plain terms, it controls what gets returned when someone searches, and in what order.
If you use an off-the-shelf search platform like Algolia, the algorithm is a set of configurable rules. If you build from scratch using something like Solr, you design it yourself. Either way, understanding how it works is essential for diagnosing what's going wrong.
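To make "what gets returned, and in what order" concrete, here is a deliberately simple sketch. It is not how Algolia or Solr score results. It assumes each document has a title and a body, and it uses made-up weights so that a match in the title counts for more than a match in the body.

```python
def score(query_terms, doc, title_weight=3, body_weight=1):
    """Count query-term matches, weighting the title above the body.

    The weights are illustrative assumptions, not platform defaults.
    """
    title_words = doc["title"].lower().split()
    body_words = doc["body"].lower().split()
    return sum(
        title_weight * title_words.count(term) + body_weight * body_words.count(term)
        for term in query_terms
    )


def search(query, docs):
    """Return titles of matching documents, highest score first."""
    terms = query.lower().split()
    scored = [(score(terms, doc), doc) for doc in docs]
    matches = [(s, doc) for s, doc in scored if s > 0]  # what gets returned
    matches.sort(key=lambda pair: pair[0], reverse=True)  # in what order
    return [doc["title"] for _, doc in matches]


# Hypothetical content
docs = [
    {"title": "Returns policy", "body": "How to send items back"},
    {"title": "Waterproof jacket", "body": "A jacket for rain"},
    {"title": "Rain hat", "body": "Pairs well with a jacket"},
]
print(search("jacket", docs))
```

The two commented lines are the two levers: one decides which documents qualify at all, the other decides the order they appear in.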
Relevancy
Relevancy is the gold standard. When someone searches for something, they want the most accurate result, not just any result. If your search returns a long list of loosely related items, relevancy is your problem.
Typo tolerance
Most platforms include typo tolerance: the ability to interpret what someone meant, even if they transposed letters or misspelled a complex word. It matters more than people realise, particularly on sites with specialist or technical terminology.
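One common way to measure how close a typed word is to a known word is edit distance: the number of single-letter changes needed to turn one into the other. The sketch below counts insertions, deletions, substitutions and swaps of adjacent letters. It is an illustration of the idea, not any particular platform's implementation, and the vocabulary and the threshold of two edits are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Optimal string alignment distance between two words.

    Counts insertions, deletions, substitutions and swaps of adjacent letters.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # delete a letter
                d[i][j - 1] + 1,         # insert a letter
                d[i - 1][j - 1] + cost,  # substitute (or keep) a letter
            )
            # Two adjacent letters typed in the wrong order
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def suggest(word, vocabulary, max_distance=2):
    """Return the closest known term within max_distance edits, or None."""
    candidates = [(edit_distance(word.lower(), term.lower()), term) for term in vocabulary]
    best = min(candidates, default=None)
    if best is not None and best[0] <= max_distance:
        return best[1]
    return None


# Hypothetical vocabulary taken from indexed content
print(suggest("jakcet", ["jacket", "jumper"]))
print(suggest("diarhea", ["diarrhoea", "diabetes"]))
```

A tighter threshold means fewer wrong guesses but more missed typos, so it is worth testing against the specialist terms your visitors actually search for.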
Stop words
Many searches are phrased as questions. But the words that hold a question together ("what is," "how do I," "where can I find") are noise to a search engine. These are called stop words. The engine strips them out and focuses on the meaningful terms, which makes results faster and more relevant.
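Stripping stop words can be sketched in a few lines. The list below is a small, hypothetical sample built from the phrases above. Real platforms ship their own lists, usually one per language.

```python
import re

# Illustrative sample only; not any platform's actual stop word list.
STOP_WORDS = {
    "what", "is", "how", "do", "i", "where", "can", "find",
    "a", "an", "the", "to", "for",
}


def strip_stop_words(query: str) -> list:
    """Lower-case the query, drop punctuation, and remove stop words."""
    words = re.findall(r"[a-z0-9']+", query.lower())
    return [word for word in words if word not in STOP_WORDS]


print(strip_stop_words("Where can I find a waterproof jacket?"))
print(strip_stop_words("How do I return the shoes?"))
```

One design point to watch: a query made up entirely of stop words would come back empty, so a real implementation needs a fallback for that case.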
Metadata
If your site has metadata and your search engine is configured to read it, that metadata gets indexed alongside your content.
Metadata is structured information that describes your content. On an e-commerce site selling clothing, metadata might include gender, category, size and colour. It is controlled by your CMS, your e-commerce platform, or a separate system that feeds into the search engine.
Here is the rule: if you want visitors to be able to filter search results, you must have metadata. No metadata, no filters. It really is that simple.
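A small sketch shows why. A filter is just a check against a metadata field, so an item with no metadata can never pass it. The product records and field names here are hypothetical.

```python
# Hypothetical search results; the last item has no metadata at all.
products = [
    {"name": "Linen shirt", "gender": "men", "category": "shirts", "size": "M", "colour": "white"},
    {"name": "Cotton shirt", "gender": "men", "category": "shirts", "size": "L", "colour": "white"},
    {"name": "Wool coat", "gender": "women", "category": "coats", "size": "M", "colour": "navy"},
    {"name": "Mystery jumper"},
]


def filter_results(results, **filters):
    """Keep only results whose metadata matches every requested filter."""
    return [
        item for item in results
        if all(item.get(field) == value for field, value in filters.items())
    ]


white_medium = filter_results(products, colour="white", size="M")
print([item["name"] for item in white_medium])
```

The "Mystery jumper" might be white and a medium, but without metadata saying so, no filter will ever surface it.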
Synonyms
Synonyms are one of the most underused tools in site search. They help visitors find things even when the words they use do not match the words in your content.
There are three good reasons to use them.
Bridging a language gap
If your organisation uses jargon, clinical terms or legal language, synonyms let you map plain-language equivalents onto technical ones. The NHS uses everyday words like "pee" and "poo" in its search configuration because those are the words patients actually type. (For what it is worth, nobody can spell diarrhoea correctly on the first attempt.)
Handling misspellings
For words specific to your sector that are routinely misspelled, and that a standard spell-checker would not catch, synonyms pick up the slack.
Filling metadata gaps
If you cannot implement metadata properly, synonyms can partially compensate. It is messy and hard to maintain, so treat it as a short-term workaround, not a strategy.
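In practice, synonyms usually work by expanding the query before it reaches the index. Here is a minimal sketch of that idea. The mappings are invented for illustration and are not taken from the NHS or any real configuration.

```python
# Hypothetical synonym map: the word a visitor types -> terms used in the content.
SYNONYMS = {
    "pee": ["urine"],                # bridging a language gap
    "poo": ["stool", "faeces"],
    "diarrhea": ["diarrhoea"],       # handling a sector-specific misspelling
}


def expand_query(terms):
    """Add any configured synonyms to the search terms, without duplicates."""
    expanded = []
    for term in terms:
        for candidate in [term, *SYNONYMS.get(term, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


print(expand_query(["pee", "colour"]))
```

Every entry in a map like this has to be maintained by hand, which is why leaning on synonyms to stand in for missing metadata gets messy quickly.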
Noise
Noise is what you get when a search returns thousands of results and almost none of them are relevant. If your site has a noise problem, it almost certainly has a relevancy problem too.
One nuance worth noting: on some sites, a high volume of results builds confidence. On others, it creates doubt. If visitors do not trust what they are seeing, they will not click, even when the top result is exactly right. Understanding your users' relationship with volume is part of understanding whether your search is actually working.
Murmuration helps retailers and digital teams understand and fix onsite search. Get in touch if you'd like to talk through what a diagnostic might look like for your site.