The cost of starting without a baseline

Navigation projects almost always begin the same way. Someone says users can't find anything, a stakeholder demands a fix, and the team starts moving things around, relabelling, restructuring, and eventually launching something new.

Then comes the question nobody can answer: is it better?

Without a benchmark, there is no honest answer. You have opinions, you have instincts, and you have whoever shouted loudest in the last review. What you do not have is evidence.

A benchmark is a stake in the ground. It captures where you are before the work starts, so that when something launches, you can measure what actually changed.

What benchmarking is not

There is a question I have heard many times: what is the industry best practice for navigation, so we can benchmark against it?

Navigation performance is not like that. It is specific to your product, your users, and your context. You are not measuring yourself against a competitor or an industry average. You are measuring yourself against yourself.

The only metric that comes close to a universal comparison is bounce rate, and even that shifts depending on how your analytics platform defines it. Everything else needs to be interpreted in your own context, using your own data.

What to benchmark

The right data depends on what you have identified as the problem. Useful sources include analytics, session recordings, task completion rates from usability testing, customer support query volumes, tree testing scores, and NPS or satisfaction scores tied to findability.

Before you collect anything, define your parameters. That means a consistent date range (a full calendar month is a minimum, because a single week is almost always anomalous), a clear measure of volume, and a defined threshold for what counts as a problem.

One week of data is not a baseline. It is a moment. It could reflect a news spike, a seasonal shift, or a technical blip. You need enough data to see a pattern.
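As a rough illustration of those parameters in practice, here is a minimal sketch in Python. The 28-day minimum follows the full-calendar-month rule above; the function names and the 25% problem threshold are hypothetical, stand-ins for whatever window and threshold your team agrees on.

```python
from datetime import date

MIN_WINDOW_DAYS = 28  # assumption: a full calendar month as the minimum baseline

def validate_window(start: date, end: date) -> None:
    """Reject baseline windows too short to smooth out a single anomalous week."""
    days = (end - start).days + 1
    if days < MIN_WINDOW_DAYS:
        raise ValueError(
            f"Baseline window is {days} days; need at least {MIN_WINDOW_DAYS} "
            "so one news spike or seasonal blip cannot masquerade as a pattern."
        )

def exceeds_threshold(failures: int, volume: int, threshold: float = 0.25) -> bool:
    """Flag a metric as a problem when its failure rate crosses the agreed threshold.

    The 0.25 default is illustrative, not a recommendation.
    """
    return volume > 0 and failures / volume >= threshold
```

The point is not the code itself but the discipline: the window, the volume measure, and the threshold are all written down before any data is collected.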

Why sample size matters more than you think

Small samples are the silent killer of navigation projects.

I worked on a project where a team was pushing a particular navigation approach past resistant stakeholders. They ran a test, compared it to the existing navigation, and declared their preferred option the winner. The data said so.

The benchmark had 34 participants. Their preferred option had 26. The site had over a million visitors a year. Neither sample was large enough to be statistically significant, and the two were not directly comparable. A major decision was being made on the basis of it anyway.

This is not unusual. It happens constantly.

If you are using benchmark data to secure budget or approval, representative sample sizes are not optional. Stakeholders who do not understand the methodology will use thin data against you the moment it suits them.
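You can sanity-check a sample like the ones above with a confidence interval. The sketch below uses the Wilson score interval for a proportion; the success count of 20 out of 34 is hypothetical (the project data above only records participant counts), chosen purely to show how wide the interval gets at that sample size.

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """95% Wilson score interval for a proportion. A wide interval means thin data."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (centre - half, centre + half)

# Hypothetical: 20 of 34 participants succeed. The 95% interval is roughly
# 42%-74%, a spread of over thirty percentage points.
low, high = wilson_interval(20, 34)
```

An interval that wide cannot distinguish between two navigation options scoring, say, 55% and 65%, which is exactly why a decision built on 34 versus 26 participants does not hold up.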

Tree testing and first-click testing as benchmark tools

Two methods are particularly useful for establishing a navigation baseline.

Tree testing

Tree testing strips out visual design and asks users to find things using your navigation structure alone. It gives you a findability score: the percentage of users who successfully located a given item.

If 68% of people could not find a key section in your current structure, that is your benchmark. That is the number you are trying to improve. The important caveat is that tree testing removes all visual context, so it will not catch problems caused by hidden menus, unlabelled icons, or mobile layout issues.

First-click testing

First-click testing adds the design back in. Users see a static version of the interface and indicate where they would click first to find something. This gives you the visual context that tree testing lacks, and lets you test whether a navigation element is actually visible and legible in context, not just whether the label makes sense in isolation.

Used together, the two methods can tell you whether you have a structural problem, a labelling problem, a visual design problem, or all three.
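That triage logic can be sketched as a simple decision table. Everything here is an assumption for illustration: the 70% pass mark is arbitrary, and real interpretation would look at per-task results rather than one aggregate score.

```python
def diagnose(tree_score: float, first_click_score: float, pass_mark: float = 0.7) -> str:
    """Rough triage from a tree test score and a first-click score.

    Scores are proportions (0.0-1.0). The 0.7 pass mark is illustrative only.
    """
    structure_ok = tree_score >= pass_mark
    visual_ok = first_click_score >= pass_mark
    if structure_ok and visual_ok:
        return "no clear problem at this threshold"
    if structure_ok and not visual_ok:
        # Structure works when stripped bare, fails in the real UI.
        return "likely a visual design problem: the structure works without the UI"
    if not structure_ok and visual_ok:
        # Real UI compensates for a structure that fails on its own.
        return "likely a structural or labelling problem: the UI is carrying a weak hierarchy"
    return "both structural and visual problems"
```

A low tree score with a high first-click score, for instance, suggests the visual design is papering over a weak hierarchy, which tells you where to spend the redesign effort.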

What a benchmark actually gives you

Three things, in order of usefulness.

A clearer picture of the problem. Defining what to measure often surfaces issues you had not identified yet.

Evidence to defend your decisions. If you are asking for budget, approval, or resource, you need numbers. A well-constructed benchmark is hard to argue with.

A way to measure success after launch. Without a baseline, you cannot know whether what you built is better than what you replaced. That matters for the project you are on now, and for every project that follows it.

Navigation work lives and dies by its data. The benchmark is where that data starts.

Murmuration helps retailers and digital teams find out why things aren't findable, and fix them. Get in touch if you'd like to talk through what a diagnostic might look like for your site.

