How many participants do you need for a usability study? This is one of the most common questions clients ask when approaching us about a new research project. And it's no wonder – running too many sessions drives up project costs, while running too few can leave you wondering if you're missing big problems. Here are a few questions to consider when choosing your sample size:
Is your study mostly about benchmarking or analyzing usability? If you're after precise benchmarking measures such as completion rate, you'll want a big sample – in the range of a few hundred users. The key here is to minimize your margin of error (how far your sample statistics may differ from those of the whole population). Even with 100 users, a measure like completion rate can be off (high or low) by about 10 percentage points at a 95% confidence level. For studies like this, we use automated testing platforms like UserZoom and Loop11 to help keep costs down.
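To see where the "about 10 percentage points" figure comes from, here is a minimal sketch of the margin-of-error arithmetic. It assumes a 95% confidence level (z = 1.96), the simple normal-approximation formula for a proportion, and the worst case of a 50% completion rate; the function name `margin_of_error` is ours, chosen for illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a normal-approximation confidence interval for a proportion.

    n: number of participants
    p: assumed true proportion (0.5 is the worst case, giving the widest interval)
    z: z-score for the confidence level (1.96 is roughly 95%; an assumption here)
    """
    return z * math.sqrt(p * (1 - p) / n)

# Compare a 100-user study with the 200 to 400 range used for benchmarking.
for n in (100, 200, 400):
    print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
# n=100: +/- 9.8 percentage points
# n=200: +/- 6.9 percentage points
# n=400: +/- 4.9 percentage points
```

Note that the margin shrinks with the square root of the sample size, so halving it takes roughly four times as many users. For small samples or completion rates near 0% or 100%, this approximation gets rough, and an adjusted interval is usually a better choice.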
To identify and diagnose usability issues, you'll want to run live test sessions, and a sample of as few as 8 people is often enough – a stark contrast to the hundreds that are required for benchmarking studies. In this situation, statistical projections are less of a concern, because whether a problem impacts 20% or 50% of users, it's still a problem.
But how do we know such a small sample will uncover most usability problems? In a published study, Laura Faulkner (2004) tested 60 users, tallied the total number of problems discovered, then analyzed random subsamples of 5-10 users. She found that samples of 5 uncovered an average of 85% of all problems, while samples of 10 found 95% of the problems. So you can feel comfortable that even a few test sessions will reveal most usability issues.
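The subsampling idea can be sketched in a few lines. This is not Faulkner's actual analysis or data: the user-by-problem records below are synthetic, generated under the assumption that each of 20 hypothetical problems trips up any given user 30% of the time, and the function name `mean_coverage` is ours.

```python
import random

def mean_coverage(found_by_user, sample_size, trials=1000, seed=0):
    """Average share of all known problems uncovered by random subsamples of users.

    found_by_user: list of sets, one per user, holding the problems that user hit
    sample_size:   number of users drawn (without replacement) in each trial
    """
    rng = random.Random(seed)
    all_problems = set().union(*found_by_user)  # every problem seen by anyone
    total = 0.0
    for _ in range(trials):
        sample = rng.sample(found_by_user, sample_size)
        found = set().union(*sample)
        total += len(found) / len(all_problems)
    return total / trials

# Synthetic stand-in for a 60-user study (an assumption, not real test data):
# 20 problems, each encountered by a given user with probability 0.3.
data_rng = random.Random(42)
users = [{p for p in range(20) if data_rng.random() < 0.3} for _ in range(60)]

for size in (5, 10):
    print(f"samples of {size}: {mean_coverage(users, size):.0%} of problems found on average")
```

Keep in mind that this reports an average across many random subsamples. Any single small sample can land above or below it, which is one reason we lean toward 8 or more sessions rather than the bare minimum.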
Do you have distinct user subgroups? Different user types (e.g., doctors vs. patients) may have very different knowledge and needs, and this can greatly impact usability. For example, terms familiar to one group may make no sense to another. For moderated usability testing, we generally recommend completing 6 to 7 sessions per subgroup.
Can all the key user tasks be covered in each session? For very large websites or applications, the answer may be no. In this situation, we may need to add a few sessions, rotating tasks across them to be sure we cover each one several times.
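One simple way to plan that rotation is a round-robin schedule. The sketch below is only an illustration: the task names, session count, and the helper `rotate_tasks` are all hypothetical, and it assumes each session has room for fewer tasks than the full list.

```python
from collections import Counter
from itertools import cycle, islice

def rotate_tasks(tasks, sessions, tasks_per_session):
    """Assign tasks to sessions round-robin so coverage stays as even as possible.

    Assumes tasks_per_session <= len(tasks), so no task repeats within a session.
    """
    order = cycle(tasks)  # endless loop over the task list
    return [list(islice(order, tasks_per_session)) for _ in range(sessions)]

# Hypothetical example: 6 key tasks, 9 sessions, room for 4 tasks per session.
tasks = ["search", "register", "checkout", "track order", "returns", "contact support"]
plan = rotate_tasks(tasks, sessions=9, tasks_per_session=4)

for number, session_tasks in enumerate(plan, start=1):
    print(f"Session {number}: {', '.join(session_tasks)}")

# Count how many sessions cover each task (6 each in this example).
coverage = Counter(task for session_tasks in plan for task in session_tasks)
print(coverage)
```

Because each session picks up where the last one left off, the starting task also shifts from session to session, which helps spread out any order effects.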
Is exploring user attitudes and preferences a key objective? Users’ subjective reactions are variable, and larger samples can be important to gauge the range of user opinions and come away with a good sense of how people generally respond to your design.
Will management want big numbers? In some cases, you may need to boost your sample size just to be sure key players will buy in to your findings and recommendations.
As you can see, choosing the right sample size is a mix of art and science. Most of the time, we end up testing between 8 and 14 users for live moderated test sessions, and between 200 and 400 for benchmarking studies. If you’d like some help thinking through the right size for your project, get in touch. We’d love to hear from you!