Tipping the balance: the danger of mixing scales in market research

Posted Oct 14, 2016

Scale-based questions are a hallmark of quantitative market research, from Net Promoter scoring (to determine willingness to recommend) to Purchase Likelihood to Customer Satisfaction. Indeed, scales are a preferred format when we ask consumers to predict future behavior, because they allow respondents to assign a likelihood to that behavior rather than indicate that they are “all in” (yes) or “all out” (no).

When it comes to using scales, there are many things to consider, including which labels or end points to use, whether to have a bipolar or unipolar scale, and how many scale points to include. This last issue is one that I’ve been thinking about a lot lately. Many times I use a 5-point scale so that I can easily label each point and ensure that all respondents interpret each point the same way. But other times, I want to create room for more variability in responses, so I’ll use a larger scale of 7 or even 10 points.

This got me thinking: Are there differences in the way respondents use a smaller scale versus a larger scale? Do they give more extreme responses on one versus the other? And what implications might that have for survey design? To find out, the AYTM research team ran the following experiment.

Methodology

We fielded a survey in which a nationally representative sample of N=1000 respondents evaluated two new tech products: the Nanoleaf Aurora and the Livestream Mevo. For each product, respondents were asked to read about the product and then rate their agreement with:

1. A positive evaluative statement (this product is appealing)
2. A negative evaluative statement (this product is not for someone like me)
3. A behavioral statement (I would purchase this product if the price were right)
4. A personality statement (I am usually one of the first people to try a new technology product when it comes out)

After going through this series of scale questions for the first product, respondents then completed an identical set of questions for the second tech product. Importantly, respondents used a 5-point scale for questions related to one product and a 10-point scale for questions related to the other product. Both A) which product was assigned the 5-point versus the 10-point scale and B) the order in which respondents evaluated the products were randomized for all respondents.
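For readers who want to set up a similar design, here is a minimal Python sketch of the randomization described above. It is not the survey logic we actually fielded: the function name `assign_conditions`, the seed, and the use of two independent shuffles are assumptions made for illustration.

```python
import random

PRODUCTS = ["Nanoleaf Aurora", "Livestream Mevo"]
SCALES = [5, 10]  # number of scale points

def assign_conditions(rng):
    """Return one respondent's two (product, scale_points) blocks in viewing order.

    Hypothetical helper: shuffling products randomizes which product is seen
    first, and shuffling scales independently randomizes which scale is paired
    with which product.
    """
    products = PRODUCTS[:]
    scales = SCALES[:]
    rng.shuffle(products)
    rng.shuffle(scales)
    return list(zip(products, scales))

rng = random.Random(2016)  # fixed seed (an assumption) so the plan is reproducible
plan = [assign_conditions(rng) for _ in range(1000)]
print(plan[0])
```

Because the two shuffles are independent, each respondent lands in one of four cells (two possible product orders times two possible scale pairings), and every respondent sees both products and both scales exactly once.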

Results

Overall, the results* consistently indicate that respondents use a 5-point scale in a more extreme way than they use a 10-point scale, particularly at the upper end of the scale. For example, for the Nanoleaf Aurora the average appeal rating among those who used the 5-point scale was 3.5 (or 75% of the scale), whereas the average appeal rating among those who used the 10-point scale was 7.1 (or 71% of the scale). This same effect held true for every metric we tested, across both products and consistently across age and gender groups.
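To compare ratings collected on scales of different lengths, it helps to put them on a common footing. The sketch below shows two simple ways to do that in Python: expressing a mean rating as a percentage of the scale (the mean divided by the number of scale points, the convention under which 7.1 on a 10-point scale is 71%), and computing a top-box share (the fraction of respondents who chose one of the top k points). The function names and the ratings are made up for illustration and are not data from this study.

```python
def percent_of_scale(ratings, scale_points):
    """Mean rating expressed as a percentage of the scale maximum."""
    return 100 * (sum(ratings) / len(ratings)) / scale_points

def top_box_share(ratings, scale_points, k):
    """Fraction of ratings that fall in the top k points of the scale."""
    cutoff = scale_points - k + 1  # lowest rating that still counts as top-k
    return sum(r >= cutoff for r in ratings) / len(ratings)

# Hypothetical ratings, NOT results from the experiment
five_point = [5, 4, 4, 3, 5, 2, 4, 5]
ten_point = [8, 7, 9, 6, 7, 4, 8, 7]

print(percent_of_scale(five_point, 5), percent_of_scale(ten_point, 10))
print(top_box_share(five_point, 5, k=2))   # Top 2 Box on a 5-point scale
print(top_box_share(ten_point, 10, k=3))   # Top 3 Box on a 10-point scale
```

Running the same two functions on each scale's responses gives numbers that can be set side by side, which is where differences in how respondents use the upper end of each scale become visible.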

Implications

The fact that respondents in the current study so consistently used 5-point scales more extremely than 10-point scales has important implications. At the very least, we must use extreme caution when attempting to compare results across studies that use different scales and stay mindful of the need for scale consistency when designing studies. This is true both within surveys (across all scale questions that one might wish to compare) and across surveys (when running multiple waves, as in a tracker).

These findings also have implications for the standard practice of taking Top 2 Box for 5-point scales and Top 3 Box for 10-point scales. Market researchers have a tendency to see these metrics as largely interchangeable, when in reality our findings suggest that a Top 4 Box metric on a 10-point scale would STILL be more conservative than Top 2 Box on a 5-point scale.

Finally, these results suggest that there may be times when a 5-point or a 10-point scale is the more desirable choice for a study. For instance, in cases where a more conservative estimate is desired (e.g., market sizing a new product category), our study suggests a 10-point scale will yield the more conservative estimate.

*Results were analyzed using univariate analysis of variance (ANOVA), analysis of covariance (ANCOVA), and repeated measures ANOVA and ANCOVA.

Photo credits:

500 g by Martin Cathrae under CC BY-SA 2.0 // cropped from original

Weight Scale by Philipp under CC BY 2.0 // cropped from original