There are many good reasons to do competitive user testing, such as finding areas that are frustrating your users, failing to convert, or simply falling short of their potential. Watching users try out a competitor’s website or app can show you what their designs do well and where they’re lacking; which of their features users really like; and how they display and organize information and options, and how well that works.
A less obvious, but perhaps even more valuable reason, is that competitive usability testing improves the quality of feedback on your own website or app. By giving users something to compare your interface to, it sharpens their critiques and increases their awareness.
Read more: 5 secrets to comparative usability testing
If a user has only experienced your website’s way of doing something, for example, it’s easy for them to take it for granted. As long as they were able to complete what was asked of them, they may have relatively little to say about how it could be improved. But send them to a competitor’s site to complete the same tasks, and they’ll almost certainly have a lot more to say about whose way was better, in what ways, and why they liked it more. Thanks to this effect alone, the feedback you collect about your own designs will be much more useful and insight-dense.
Setting up a competitive usability study
Not only can competitive user testing get you more incisive feedback on what and how users think, it’s also a great opportunity to quantitatively measure the effectiveness of different pages, flows, and features on your site or app, and to quantify users’ attitudes towards them. Quantitative metrics and hard data provide landmarks of objectivity as you plan your roadmap and make decisions about your designs. They deepen your understanding of user preferences, and strengthen your ability to gauge the efficacy of different design choices.
When doing competitive UX testing – whether between your products and a competitor’s, or between multiple versions of your own products – quantitative metrics are a valuable baseline that provide quick, unambiguous answers and lay the groundwork for a thorough qualitative analysis.
Same or different users?
The first choice you need to make when setting up a competitive UX study is whether to test each interface with separate groups of users, or send the same users to each one.
As described above, we prefer sending the same users to both if possible, so that they can directly compare their experiences with a sharp and keenly aware eye. We recommend trying this method if it’s feasible for your situation, but there are a few things to consider:
1. Time: How long will it take users to go through both (or all) of the interfaces you’re testing? If the flows aren’t too long and the tasks aren’t too complicated, you can safely fit 2 or even 3 different sites or apps into a single session.
The default session duration for Trymata tests is 30 minutes, which we’ve found to be a good upper limit. The longer the session goes, the more your results could degrade due to tester fatigue, so keep this in mind and make sure you’re not asking too much of your participants.
2. Depth: There will necessarily be a trade-off between how many different sites or apps users visit in a single session, and how deeply they interact with each one. If you need users to go into serious depth, it may be better to use separate groups for each different interface.
3. Scale: To get statistically reliable quantitative data, at least 20 users should be reviewing each interface. If every tester tries out both sites during their session, you only need 20 in all. If you use different batches of testers per site, you would need 40 total users to compare two sites.
So if you don’t have the ability or bandwidth to recruit and test with lots of users, you may want to simplify each flow so that they all fit into a single session; but if your UX team can handle larger numbers, you can have 20 users visit each site separately (or even have some users visit multiple sites, and others go deeper into a single one).
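If it helps, here’s a minimal sketch of that recruiting math; the 20-users-per-interface baseline comes from the numbers above, while the function and parameter names are just illustrative:

```python
# Minimal sketch of the recruiting math described above.
# The 20-per-interface baseline is from the article; names are illustrative.

def users_needed(num_interfaces: int, same_users: bool, per_interface: int = 20) -> int:
    """Total participants to recruit for a competitive study."""
    if same_users:
        # Within-subjects: every tester visits every interface in one session.
        return per_interface
    # Between-subjects: a separate batch of testers per interface.
    return per_interface * num_interfaces

print(users_needed(2, same_users=True))   # 20 testers, each visiting both sites
print(users_needed(2, same_users=False))  # 40 testers, 20 per site
```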
For our Domino’s vs Pizza Hut test, we chose to send the same users to both sites so they could directly compare their experience on each. This wasn’t too much of a challenge, as ordering pizza is a relatively simple flow that doesn’t require intense or deep interaction, and the experience of both sites could fit easily into a 30-minute window.
Accounting for bias
As with any kind of usability testing, it’s critical to be aware of potential sources of bias in your test setup. In addition to the typical sources, competitive testing can also be biased by the order of the websites.
There are several ways that this bias can play out: in many cases, users are biased in favor of the first site they use, as this site gets to set their expectations of how things will look and work, and where different options or features might be found. When the user moves on to the next website, they may have a harder time simply because it’s different from the first one.
On the other hand, users may end up finding the second site easier if they had to struggle through a learning curve on the first one. In such cases, the extra effort they put in to understand key functions or concepts on the first site might make it seem harder, while simultaneously giving them a jump-start on understanding the second site.
Lastly, due to simple recency effects, the last interface might be more salient in users’ minds and therefore viewed more favorably (or perhaps just more extremely).
To account for this bias, we set up 2 tests: one going from A→B, and one from B→A, with 10 users per flow. This way, both sites would get 20 total pairs of eyes checking them out, but half of those users would see each site first and half would see it second.
No matter whether the site order would bias users in favor of the second platform or the first, the 10/10 split would balance these effects out as much as possible.
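If you’d rather script the counterbalanced assignment than split users by hand, a minimal sketch might look like the one below; the site names are from this study, and everything else (the function name, the seed, the tester labels) is purely illustrative:

```python
import random

# Sketch of a counterbalanced (A→B / B→A) assignment.
# Site names are from this study; all other names are illustrative.

def assign_orders(participants, sites=("Domino's", "Pizza Hut"), seed=42):
    """Split participants evenly between the two visit orders."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    a_first = [(p, list(sites)) for p in shuffled[:half]]
    b_first = [(p, list(reversed(sites))) for p in shuffled[half:]]
    return a_first + b_first

testers = [f"tester_{i:02d}" for i in range(1, 21)]
for tester, order in assign_orders(testers):
    print(tester, " → ".join(order))
```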
The other benefit of setting up the study this way is that we would get to observe how brand-new visitors and visitors with prior expectations would view and interact with each site. Both Domino’s and Pizza Hut would get their share of open-minded new orderers and discerning, sharp-eyed pizza veterans.
Writing the task script
We re-used the same task script from our previous Domino’s vs Pizza Hut test, which has been dissected and explained in an older blog post here; you can read all about how we chose the wording for those tasks in that post. Here’s a quick rundown of the task list:
Scenario: You’re having a late night in with a few friends and people are starting to get hungry, so you decide to order a couple of pizzas for delivery.
- Have you ordered pizza online before? Which website(s) did you use?
- Does this site have any deals you can take advantage of for your order?
- Customize your pizzas with the toppings, sauce, and crust options you would like.
- Finalize your order with any other items you want besides your pizzas.
- Go through the checkout until you are asked to enter billing information.
- Please now go to [link to second site] and go through the pizza ordering process there too. Compare your experience as you go.
- Which site was easier to use, and why? Which would you use next time you order pizza online?
We also could have broken down Task 6 into several more discrete steps – for example, mirroring the exact same steps we wrote for the first website. This would have allowed us to collect task usability ratings, time on task, and other metrics that could be compared between the sites.
However, we decided to keep the flow more free-form and let users chart their own course through the second site. You can choose between a looser task script and a more structured one based on the kinds of data you want to collect for your study.
The post-test survey
After users completed the tasks during their video session, we had them respond to a post-test survey. This is where we posed a number of different rating-style and multiple-choice questions to try and quantify users’ attitudes and determine which site performed better in which areas. Our post-test survey:
- Which website did you prefer using to create your pizzas and place your order? (multiple choice, single-select)
- Domino’s
- Pizza Hut
After completing both flows and giving feedback on each step, we wanted the users to unequivocally choose one of the websites. This way we could instantly see the final outcome of each test, without having to parse noncommittal verbal responses from the videos.
For each test, we listed the sites in the order they were experienced, to avoid creating any additional variables between the tests.
- How would you rate your experience on the Domino’s website, on a scale of 1 (Hated it!) to 10 (Loved it!)? (slider rating, 1-10)
- How would you rate your experience on the Pizza Hut website, on a scale of 1 (Hated it!) to 10 (Loved it!)? (slider rating, 1-10)
Here again we showed the questions in an order corresponding to the order from the video session. First users rated the site they started on, then they rated the site they finished on.
- In which of the following areas do you feel that PIZZA HUT’S website was BETTER than DOMINO’s, if any? (multiple choice, multi-select)
- Overall mood/feel of the site
- Attractive pictures, images, and illustrations
- Ease of navigating around the site
- Clarity of information provided by the site
- None of the above
For the fourth question, we listed several different aspects of the user experience to see which site held the edge in each. Users could check any number of options, and we also included a “none of the above” option.
If we were to run this test again, we would include more options to pick from that later came up in our results, such as the availability of appealing promotions and deals, and the choice of pizza toppings, customizations, and other food options.
- What is the #1 most important thing about a pizza website, to you? (free response)
Since we knew that we probably wouldn’t think of every possible area of the experience that users cared about, we followed up by asking a free-response question about what users prioritized the most in their online ordering experience. This allowed us to get more insight into the previous question, and build a deeper understanding of each user’s mindset while viewing their videos and other responses.
- On average, about how often do you order pizza for delivery (including online or over the phone)? (multiple choice, single-select)
- Several times a week
- About once a week
- Once or twice a month
- Less than once a month
- Which of the following is your preferred chain to order delivery pizza from? (multiple choice, single-select)
- Domino’s
- Pizza Hut
- Papa John’s
- Little Caesars
- Other
The final 2 questions were just general information-gathering questions. We were interested to see what kind of backgrounds the testers had (and were maybe also a little excited to try out more of the new post-test question types).
Besides expanding the options in question 4, the other thing we would change about the post-test survey if we ran this study again would be to ask more free-response questions. We found that with so many quantitative questions, we actually missed out on some qualitative answers that would have been useful (especially in conjunction with the data we did get).
Some example questions we would add, which we thought of after getting the results in, are:
- What did you like the best about the [Domino’s/Pizza Hut] website?
- What did you like the least about the [Domino’s/Pizza Hut] website?
- Do you feel that your experience on the two sites would influence your choice between Domino’s and Pizza Hut if you were going to order online from one of them in the future?
Wrapping up Part 1
Besides the task script and post-test survey, the rest of the setup just consisted of choosing a target demographic – we selected users in the US, ages 18-34, making under $50,000. Once the tests were finalized, we launched them and collected the 20 results in less than a day.
After they had completed their video session, we had each user answer some post-test survey questions, including multiple-choice, slider rating, and free response questions, to better understand and quantify their final opinions.
Read more: How many users should I test with?
As expected, each website performed better on average when users had experienced it first out of the two sites. However, Domino’s still emerged as the overall winner of the comparison, with higher combined scores and a still-strong performance amongst those who had used the site second.
Below, we’ll walk through the survey and video results in detail and describe key insights from the data.
Post-test survey results
Before diving into our video results (6 hours of them!), we started by looking through our post-test survey data to get an idea of the final outcome and main takeaways. That way, we could tackle the videos with a mental framework in place, and save ourselves time and effort.
Read more: Using quantitative data in your UX research
Q1. Which website did you prefer using to create your pizzas and place your order?
In total, 13 out of 20 users (65%) indicated that they preferred the Domino’s website. Among users who visited Domino’s first, 8 out of 10 preferred it. Of those who visited Pizza Hut first and Domino’s second, it was an even split – 5 users chose each site.
We discussed bias and how to control for it in Part 1, and we were expecting each site to do better among users who experienced it first (since the first site gets to set user expectations for where things are and how they work).
Read more: Bias in user testing
It seems that’s exactly what happened. Each website performed better when it came first instead of second – Domino’s getting 8 votes instead of 5, and Pizza Hut getting 5 votes instead of 2.
However, the fact that Domino’s still tied Pizza Hut even when it came second in the flow was an early indicator that it had an objectively better UX.
Q2-3. How would you rate your experience on each website, on a scale of 1 (Hated it!) to 10 (Loved it!)?
The slider rating questions further reinforced the overall performance of the two sites – and the effects of order bias on users’ perceptions.
Domino’s scored higher on average than Pizza Hut whether it came first or second. Not only that, the site also had a higher median and mode in both cases, and a lower standard deviation (indicating a more consistent experience for all users).
The gap between the sites was smaller when Pizza Hut came first, though: just a 0.7 difference, instead of 1.4 when Domino’s was first.
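For reference, this is roughly how those summary statistics can be computed from the raw slider scores; the two rating lists below are made-up placeholders rather than the study’s actual data:

```python
import statistics

# Summary stats we compared for the 1-10 slider ratings.
# The rating lists are placeholders, not the study's real data.

dominos_ratings  = [9, 8, 10, 7, 9, 8, 9, 10, 8, 9]
pizzahut_ratings = [7, 8, 6, 9, 7, 5, 8, 7, 9, 6]

for site, scores in [("Domino's", dominos_ratings), ("Pizza Hut", pizzahut_ratings)]:
    print(
        site,
        "mean:", round(statistics.mean(scores), 1),
        "median:", statistics.median(scores),
        "mode:", statistics.mode(scores),
        "stdev:", round(statistics.stdev(scores), 2),
    )
```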
Q4. In which of the following areas do you feel that [the second company]’s website was BETTER than [the first], if any?
We asked users to indicate whether Domino’s or Pizza Hut was superior in each of the following categories (labeled A, B, C, and D for reference in the analysis below):
- A: Overall mood/feel of the site
- B: Attractive pictures, images, and illustrations
- C: Ease of navigating around the site
- D: Clarity of information provided by the site
Across both test flows, users favored the first website they visited – out of a total of 40 possible points (10 users x 4 categories), only 16 were given to the second website in either case.
However, among each set of users, one category stood out as a key strength for the second site. Those who used Domino’s first said that Pizza Hut had more “attractive pictures, images, and illustrations” by 8 to 2; those who used Pizza Hut first said that Domino’s had superior “ease of navigating around the site” by 7 to 3.
The most interesting takeaway here is that among Pizza Hut-first users, Domino’s received fewer total votes across categories (16 out of 40) despite receiving better overall ratings on the slider question, and 50% preference among users.
There are a couple of possible ways to read this:
(1) The one category in which Domino’s received more votes (“ease of navigating”) could have been disproportionately important to users, and impacted the overall standings more than categories like “attractive pictures.”
(2) There may be other ways in which the Domino’s site is superior to Pizza Hut’s, which were not accounted for by this survey question.
In fact, we felt that there was likely some truth to both of these possibilities. More on that below.
Q5. What is the #1 most important thing about a pizza website, to you?
Knowing that our previous question likely wouldn’t cover all of the different aspects that mattered to users, we asked them this free-response question to dig deeper into their preferences.
Then, we did a direct comparison between the written responses to this question and the responses from Question 4. First, we read through the Q5 responses; if an answer matched one of the four categories from the last question, we assigned it to that category, and for answers that didn’t match anything, we created a new category.
Next, we cross-referenced each user’s most important thing against their selections from Q4, and against their final decision from Q1. For example…
Tester Paul said the most important thing to him was “images and presentation” (we assigned this to category B, “attractive pictures, images, and illustrations”).
In Q4, Paul voted that Domino’s was superior in that category. In Q1, he had picked Domino’s as his preferred website to use. Therefore, his ultimate preference was consistent with his most important thing.
11 of the 20 users named a “most important thing” that we correlated with one of our pre-defined categories. Out of those 11, 8 users’ final decisions were consistent with their Q5 answers. On the other hand…
Tester Timothy said the most important thing to him was “visualization: I’m a visual person and like to see pictures rather than text.” We assigned his answer to the same category as Paul’s.
In Q4, Timothy voted that Pizza Hut was superior in that category. Nonetheless, his preferred website from Q1 was Domino’s, not Pizza Hut.
Why wouldn’t his answer be consistent? We can think of a few explanations:
- People actually aren’t great at knowing what they like and why. While Timothy said that visuals were the most important thing, subconsciously he might have prioritized other aspects instead.
- The cumulative advantages held by Domino’s in other aspects may have outweighed its visual inferiority for Timothy, even if he really did prioritize visuals the most.
- The fact that Timothy experienced the Domino’s website first, and Pizza Hut second, may have subconsciously influenced his preference and led him to favor the first site despite objective considerations.
Of the 3 users whose most important thing contradicted their final decision, 2 of them had visited their preferred website first in the flow.
As for the other “most important things” that didn’t correspond with any of our four categories? Deals/cost was the most popular, accounting for 6 users. Half of them chose Domino’s, half chose Pizza Hut.
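To make that cross-referencing step concrete, here’s a sketch of the consistency check; Paul’s and Timothy’s records mirror the answers described above, while the third record and all of the field names are illustrative:

```python
# Sketch of the Q1/Q4/Q5 cross-check described above.
# Paul's and Timothy's entries mirror the answers quoted earlier;
# the third entry and all field names are illustrative.
#   q1_choice:   preferred site
#   q4_votes:    which site the user said was better, per category
#   q5_category: category assigned to their free-response answer (None if no match)

users = [
    {"name": "Paul",    "q1_choice": "Domino's",  "q5_category": "B",
     "q4_votes": {"B": "Domino's"}},
    {"name": "Timothy", "q1_choice": "Domino's",  "q5_category": "B",
     "q4_votes": {"B": "Pizza Hut"}},
    {"name": "Dana",    "q1_choice": "Pizza Hut", "q5_category": None,  # e.g. "deals/cost"
     "q4_votes": {}},
]

for user in users:
    category = user["q5_category"]
    if category is None or category not in user["q4_votes"]:
        print(user["name"], "→ no matching category, skipped")
        continue
    consistent = user["q4_votes"][category] == user["q1_choice"]
    print(user["name"], "→", "consistent" if consistent else "inconsistent")
```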
Some final thoughts on the data so far:
At this point, can we categorically identify the most important factors affecting users’ ultimate preference between the two websites? Aside from what users said they valued, which of the categories actually showed the strongest correlation with users’ final choices?

Among the 13 users who chose Domino’s, the most commonly cited areas of advantage were C (“Ease of navigating around the site”) and D (“Clarity of information provided by the site”) – both chosen by 11 of the 13. Options A and B, by comparison, were picked by only 8 and 4 of these users, respectively. Among the 7 users who chose Pizza Hut, no option stood out as having an especially strong correlation (6 votes for A, 6 for B, 4 for C, and 6 for D). We can at least tell, though, that the weakest link was C, the “ease of navigating around the site.”
Summary: ease of navigation and clarity of information closely correlated with the strong performance by Domino’s, while navigation was a weakness for Pizza Hut.
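If you want to reproduce this kind of per-category tally programmatically, a rough sketch (using illustrative stand-in records rather than the real survey export) could look like this:

```python
from collections import Counter

# Tally of "areas of advantage" grouped by each user's final choice.
# The records below are illustrative stand-ins for the real survey export.

responses = [
    {"q1_choice": "Domino's",  "advantage_areas": ["C", "D"]},
    {"q1_choice": "Domino's",  "advantage_areas": ["A", "C", "D"]},
    {"q1_choice": "Pizza Hut", "advantage_areas": ["A", "B", "D"]},
]

tallies = {}
for response in responses:
    group = tallies.setdefault(response["q1_choice"], Counter())
    group.update(response["advantage_areas"])

for site, counts in tallies.items():
    print(site, dict(counts))
```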
Findings from the video results
Having analyzed our post-test survey data, we moved on to watching the user videos. Rather than watching all 6 hours beginning to end, though, we had a plan.
In each video, we jumped right to the final task, Task 7 (“Which site was easier to use, and why? Which would you use next time you order pizza online?”) and watched it. We listened to each user explain their final preference and the key factors that influenced their decision.
Then, in many cases, we jumped to the task(s) in their video that were directly relevant to their feedback, or any other parts we felt would be especially insightful (the SEQ and task completion rates are great features for identifying these). As we watched our user videos, we noticed patterns in the things users commented on and the things they liked about each site. It also became clear that noticeably more of these patterns favored Domino’s over Pizza Hut.
What do I get with my coupon?
The most impactful difference between the two experiences was related to how users tracked the fulfillment of their deal coupon. Both sites offered deals combining pizzas of various sizes and toppings and other menu items, but only Domino’s actually helped users keep track of the different items as they added them to their order, and figure out when all of the parts of the coupon had been checked off.
As one user put it, “It’s a multi-faceted deal, so they’re walking you through it. That’s what Domino’s did… [W]ith Pizza Hut, you’re just kinda guessing things. I went back and clicked on that deal 4 times to try to add what I wanted… I’m still really confused about what I was buying and I don’t even know if my deal was applied.”
Another user noted that he chose a coupon on Pizza Hut and created his first pizza, but then the site took him directly to his cart – even though the coupon he had clicked wasn’t fulfilled yet:
“When I went to the Domino’s site, it was keeping track of the deal as it went on. It made sure that it was telling me where the deal was fulfilled and how many items I’ve added so far, that are in accordance with the deal. And this one [Pizza Hut], I try fulfilling this deal, and I order the one pizza…”
Other frequent patterns that cropped up:
- Domino’s consolidated the checkout step into a single page, while Pizza Hut split it into several screens. Most users strongly preferred the single-page version, where they could see all the details at once.
- Domino’s made it easy to toggle through their whole menu, so users could easily view all of their sides, desserts, and other items. On Pizza Hut it was much more difficult to jump between menu categories.
- Domino’s had better deals that several users said were a better match for combinations they actually wanted to order.
But not every pattern favored Domino’s. Here are some other common pieces of feedback that worked in Pizza Hut’s favor:
- Pizza Hut had more attractive visual presentation, with a modern style and nicer pictures. “The Domino’s site kinda looks boxy and old and not fun to look at,” said one user. Many users voiced similar sentiments.
- Pizza Hut provided much more information about the toppings and ingredients going onto the pizzas. “They told me what the options actually are. Whereas with Domino’s, when I was creating my pizza they didn’t tell me what the things were, like I didn’t know what the ‘hearty sauce’ was. On the Pizza Hut website it all came with nice fancy descriptors…it described the food better.”
Going back to our post-test survey data, what does all this mean?
1. The comparative advantages for Domino’s that we saw in the videos align pretty clearly with the two survey categories closely correlated with its success: ease of navigation (C), and clarity of information (D).
The coupon tracker falls under both categories, providing users with clear information about what they needed to do (D) and guiding them through the steps to accomplish it (C). Putting the checkout on a single page also relates to both: it reduces the number of steps users need to take to finish (C) and lets them easily access all the information about their order (D). Easy navigation around the menu is mostly C, but it also makes it easier for users to gather information about the different menu options.
2. Pizza Hut’s favorable feedback correlated with (B) attractive visuals and (D) clarity of information. Clearly, though, the usefulness of the information provided across the rest of the Domino’s site outweighed Pizza Hut’s informational advantage in the create-a-pizza step.
So, what did we learn about running competitive usability studies from these results?
The order of the websites does matter to the outcome of the study. In this case, it clearly caused users to favor the first one they tried (though this won’t always be the case). Testing both orders with an equal number of users not only helps to nullify these order biases, but also allows you to cross-reference the data points you collect and analyze the relative performance of the two sites at a much deeper level, as we did above. We also saw how quantitative data can be leveraged to build an understanding of what the videos contain, and then to execute a highly targeted viewing of those videos that saves time and enhances that understanding.
Of course, you don’t have to go through all the steps we did above. The primary aim of this exercise has been to put competitive usability testing under the microscope and see what’s possible with such a study. But we hope that this series has inspired you to think about new ways to collect and analyze quantitative data in your user testing, and encouraged you to try testing your own products against a competitor or two.