Repeated measures design and analysis
The logic, use, and analysis of the Repeated Measures Design are presented. This is a data collection strategy in which respondents perform a number of rating tasks on each of a set of objects. The Repeated Measures Design possesses a number of attractive characteristics, making it of great value in marketing research.
Introduction:
A common objective in marketing research is to obtain perceptions about each of a number of objects. The objects may be brands or concepts. The perceptions may relate to brand image on a number of product-related attributes or to likelihood of purchase of new product concepts. Once these perceptions are obtained, a second objective involves comparison: evaluating differences among objects.
Mean ratings are frequently used as the basis for assessing differences among objects. Analysis of variance and t-tests serve as extremely powerful tools for inferring whether differences in means truly exist. The user of such procedures can assess the significance of differences with reasonably well-determined levels of confidence.
Types of Designs:
Two research designs are generally available for data collection:
- The monadic design: Each respondent evaluates one and only one of the objects.
- The repeated measures design: Each respondent rates all objects, or some subset of all objects, each subset containing more than one object. It is "repeated" in the sense that a respondent provides a rating for each object presented.
The monadic design is, perhaps, the better known of the two designs. Its characteristics are widely understood and the analyses of data collected in this fashion usually proceed smoothly. The repeated measures design is widely used, but its properties and advantages relative to the monadic design are not as well understood.
Understanding Repeated Measures Design:
The basic idea behind the repeated measures design is illustrated by considering what is called the randomized block design. To construct this type of design, the researcher forms a number of blocks or groups of respondents. When forming the blocks, the objective is to have the respondents within each block be as similar as possible with regard to some set of characteristics. The characteristics may be, for example, demographics, attitudes, or product usage behavior. Further, blocks should differ widely among themselves with regard to these same characteristics. The researcher strives for maximum homogeneity within blocks and maximum heterogeneity among blocks. (Maximum heterogeneity among blocks allows objects to be compared across a wide range of characteristics. Validity and generalizability of findings may be enhanced under these circumstances. Maximum homogeneity within blocks leads to increased precision in how blocks are defined which, in turn, promotes greater clarity in distinguishing among blocks.)
For example, blocks could be constructed by age. All respondents within a block would be of the same, or very similar, age. Across blocks, though, age would vary widely.
Typical randomized block designs have as many respondents in a block as there are objects to be rated.¹ The respondents in each block are then randomly assigned to each of the objects. Differences among objects are calculated within each block and added across blocks. Assessing object differences in this manner effectively eliminates the influence of the set of characteristics used in forming the blocks. This can lead to increased precision in all subsequent statistical analyses since it removes variability in the ratings due to those characteristics.
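The assignment step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the age blocks, the respondent labels, and the function name `assign_within_blocks` are hypothetical and do not come from any study described here. Each block holds as many respondents as there are objects (a complete block design), and each respondent in a block receives one randomly chosen object.

```python
import random

def assign_within_blocks(blocks, objects, seed=None):
    """Randomly assign each respondent in a block to one object.

    blocks: dict mapping a block label to its list of respondents.
    objects: list of objects to be rated; every block must have
             exactly len(objects) respondents (complete blocks).
    """
    rng = random.Random(seed)
    assignment = {}
    for block_label, respondents in blocks.items():
        if len(respondents) != len(objects):
            raise ValueError(f"block {block_label!r} is not complete")
        shuffled = list(objects)
        rng.shuffle(shuffled)  # fresh random order for every block
        for respondent, obj in zip(respondents, shuffled):
            assignment[respondent] = (block_label, obj)
    return assignment

# Hypothetical age blocks with three respondents each (illustration only).
blocks = {
    "18-24": ["r1", "r2", "r3"],
    "35-44": ["r4", "r5", "r6"],
    "55-64": ["r7", "r8", "r9"],
}
objects = ["Concept A", "Concept B", "Concept C"]
assignment = assign_within_blocks(blocks, objects, seed=1)

for respondent, (block_label, obj) in assignment.items():
    print(respondent, block_label, obj)
```

Because every object is rated exactly once within every block, differences among objects can be computed within a block without being confounded with the blocking characteristic.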
Taking this blocking idea to its logical extreme leads to the repeated measures design. Maximally homogeneous blocks could be formed by treating each respondent as a block. Each respondent would rate all objects. In essence, each respondent acts as his/her own control. Subsequent analyses would assess and eliminate respondent, or block, differences to improve statistical precision.
Example:
Consider the following example: Seven respondents were asked to rate three new product concepts in accordance with their likelihood to purchase the products these concepts portrayed. Their ratings are recorded in Table One.
There are two approaches to analyzing these data. The repeated measures nature of the data could be taken into account by removing respondent differences. Or, the data could be treated as though they were collected with a monadic design.
A comparison of these two approaches is of great importance. To facilitate such a comparison, an estimate of variability is calculated for both approaches. The most reasonable variance estimate would be that calculated within concepts: the pooled within-group variance. This variance estimate is the basis for comparing mean differences using statistical techniques such as t-tests and the analysis of variance.
Treating the data from a monadic point of view leads to a pooled within-group variance estimate of 4.873 (with eighteen degrees of freedom if the data were, in fact, from a monadic design).
To calculate a variance estimate for the repeated measures approach the respondent differences are removed first. This involves calculating respondent means across concepts and subtracting these means from each of the concept ratings for that respondent.
¹A distinction can be made between complete and incomplete block designs. Blocks containing as many respondents as objects, where each object is rated once within each block, are considered to form a complete design. Blocks which have fewer respondents than objects, so that only a subset of objects can be evaluated within a block, are considered to form an incomplete design.
The residuals, those values resulting from the subtraction, are shown in Table Two.
Recalculating the respondent means shows them all to be zero. In a real sense the respondent differences have been removed from the data. The pooled within-group variance estimate is now 1.595 (with twelve degrees of freedom).
Clearly, taking the repeated measures approach and removing respondent differences has reduced the error variability. A more sensitive analysis of concept differences can now be performed.²
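The two variance estimates can be reproduced from the Table One ratings with a short Python sketch. The variable and function names are illustrative choices, not part of the original analysis; the degrees of freedom follow the text (eighteen for the monadic treatment, twelve once the respondent means have been removed).

```python
# Ratings from Table One: one row per respondent, columns are Concepts A, B, C.
ratings = [
    [2, 2, 2],
    [3, 5, 7],
    [2, 4, 6],
    [1, 3, 2],
    [1, 3, 5],
    [7, 6, 5],
    [7, 7, 7],
]

def within_concept_ss(rows):
    """Sum of squared deviations of each rating from its concept (column) mean."""
    n = len(rows)
    total = 0.0
    for column in zip(*rows):
        mean = sum(column) / n
        total += sum((x - mean) ** 2 for x in column)
    return total

n_resp = len(ratings)       # 7 respondents
n_conc = len(ratings[0])    # 3 concepts

# Monadic treatment: pool within concepts, df = 3 * (7 - 1) = 18.
monadic_df = n_conc * (n_resp - 1)
monadic_var = within_concept_ss(ratings) / monadic_df

# Repeated measures treatment: subtract each respondent's mean (Table Two),
# then pool within concepts. Removing the means costs 6 df: 18 - 6 = 12.
residuals = [[x - sum(row) / n_conc for x in row] for row in ratings]
rm_df = (n_resp - 1) * (n_conc - 1)
rm_var = within_concept_ss(residuals) / rm_df

print(round(monadic_var, 3), monadic_df)  # 4.873 18
print(round(rm_var, 3), rm_df)            # 1.595 12
```

The sum of squares drops from about 87.7 to about 19.1 once respondent differences are set aside, which is what drives the smaller error variance.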
Design considerations in analysis:
Quite often studies are designed for repeated measurements but the analysis treats the data as if a monadic design had been used. It is clear that such an analysis cannot take advantage of the greater precision potentially offered by a more thorough analysis which explicitly recognizes the repeated measurement nature of the data. Further, this monadic design analysis can lead the researcher to incorrect conclusions about the extent of significance of mean differences. Failure to take the respondent differences into account typically leads to an underestimate of the significance of statistical tests.
As an example, consider testing the significance of the difference between the means of Concept C and Concept A in Table One. A t-test is an appropriate statistic to calculate and is found by forming a ratio of the mean difference to an estimate of the standard error of that difference. The mean difference is 4.857 - 3.286 = 1.571.
The standard error used if these data were analyzed as though they originated from a monadic design is 1.180 (the square root of 4.873 × [1/7 + 1/7]). Again, this standard error would have eighteen degrees of freedom if a monadic design had actually been used. Combining these two pieces of information gives a t-value of 1.33, significant with about 80% confidence (using a two-tailed test).
Recognizing the repeated measurement nature of the data, the t-test can be recomputed using the smaller standard error of .675 (the square root of 1.595 × [1/7 + 1/7]), with twelve degrees of freedom. Under these circumstances a t-value of 2.33 is obtained, which is significant with about 96% confidence.
²The above example is purely an illustration of aspects involved in an analysis of a repeated measures design. In practice, all statistical analyses involved in the comparison of objects would use the mean ratings of the original data (Table One) and not those of the residuals (Table Two). The measurement scale, in this instance a seven-point scale, is preserved and interpretation of statistical tests would proceed as usual. However, note that the differences between means of concepts are identical both before and after respondent mean removal, as comparison of means in Table One and Table Two shows. The removal of individual differences (when there are no missing data) does not affect the relative position and differences among objects.
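The invariance noted in this footnote follows directly from the definitions. As a short sketch, using notation introduced here for illustration only: let x_{ij} be the rating respondent i gives concept j, with n respondents, k concepts, and no missing data, and let r_{ij} be the residual after the respondent mean is removed.

```latex
\begin{align*}
r_{ij} &= x_{ij} - \bar{x}_{i\cdot},
  \qquad \bar{x}_{i\cdot} = \frac{1}{k}\sum_{j=1}^{k} x_{ij} \\
\bar{r}_{\cdot j} &= \frac{1}{n}\sum_{i=1}^{n} r_{ij}
  = \bar{x}_{\cdot j} - \bar{x}_{\cdot\cdot} \\
\bar{r}_{\cdot j} - \bar{r}_{\cdot j'} &= \bar{x}_{\cdot j} - \bar{x}_{\cdot j'}
\end{align*}
```

Every concept mean is shifted by the same grand mean, so the shift cancels in any difference between two concepts. For Concept A, for instance, 3.286 minus the grand mean of 4.143 gives the residual mean of -.857.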
Table One
Example data on a seven-point likelihood-to-purchase scale
| Respondent | Concept A | Concept B | Concept C | Respondent Mean |
| 1 | 2 | 2 | 2 | 2 |
| 2 | 3 | 5 | 7 | 5 |
| 3 | 2 | 4 | 6 | 4 |
| 4 | 1 | 3 | 2 | 2 |
| 5 | 1 | 3 | 5 | 3 |
| 6 | 7 | 6 | 5 | 6 |
| 7 | 7 | 7 | 7 | 7 |
| Concept Mean | 3.286 | 4.286 | 4.857 | |
Differences in Concept Means
| Comparison | Difference |
| A Versus B | -1.000 |
| B Versus C | -.571 |
| C Versus A | 1.571 |
This second test, which recognizes the repeated measures aspect of the design by removing respondent differences, strongly suggests that Concept C is significantly more likely to be purchased than Concept A, a very different conclusion from that drawn from the first analysis.
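The arithmetic behind the two t-tests for Concept C versus Concept A can be checked with a brief Python sketch. It takes the mean difference and the two variance estimates quoted in the text as given; the helper name `t_for_difference` is an illustrative choice. Confidence levels are not computed here, since that would require a t-distribution routine from a statistics library.

```python
import math

def t_for_difference(mean_diff, pooled_var, n_per_concept):
    """Return (standard error, t-value) for a difference between two means."""
    se = math.sqrt(pooled_var * (1 / n_per_concept + 1 / n_per_concept))
    return se, mean_diff / se

mean_diff = 4.857 - 3.286  # Concept C minus Concept A, from Table One

# Monadic treatment: pooled variance 4.873 on 18 degrees of freedom.
se_monadic, t_monadic = t_for_difference(mean_diff, 4.873, 7)

# Repeated measures treatment: pooled variance 1.595 on 12 degrees of freedom.
se_rm, t_rm = t_for_difference(mean_diff, 1.595, 7)

print(f"monadic:           SE = {se_monadic:.3f}, t = {t_monadic:.2f}")  # SE = 1.180, t = 1.33
print(f"repeated measures: SE = {se_rm:.3f}, t = {t_rm:.2f}")            # SE = 0.675, t = 2.33
```

The mean difference is the same in both tests; only the error term and its degrees of freedom change.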
A general rule based on the above example is as follows: The analysis of any data should take into account the sampling design and constraints imposed on the data. This means that any special conditions under which the data are collected should be taken into account in the analysis. For example, blocks in the randomized block design, and hence respondents in a repeated measures design, are constraints or special conditions associated with the sampling design. This rule will generally work to the researcher's benefit as it allows for realistic assessment of the information contained in the data. Consideration of the sampling design and constraints leads to the calculation of an appropriate error term and the correct number of degrees of freedom associated with that error term. As such, results of an analysis can be evaluated with reasonably correct levels of confidence.
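One way to see how the design determines the error term is to partition the total variability in the Table One ratings into a concept part, a respondent part, and a remainder. The Python sketch below does this with the usual two-way sums of squares; the variable names are illustrative, and the partition is offered as a way of reading the example rather than as the only possible analysis.

```python
# Ratings from Table One: rows are respondents, columns are Concepts A, B, C.
ratings = [
    [2, 2, 2],
    [3, 5, 7],
    [2, 4, 6],
    [1, 3, 2],
    [1, 3, 5],
    [7, 6, 5],
    [7, 7, 7],
]
n = len(ratings)      # respondents (blocks)
k = len(ratings[0])   # concepts (objects)
grand_mean = sum(map(sum, ratings)) / (n * k)

ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
ss_concepts = n * sum((sum(col) / n - grand_mean) ** 2 for col in zip(*ratings))
ss_respondents = k * sum((sum(row) / k - grand_mean) ** 2 for row in ratings)
ss_error = ss_total - ss_concepts - ss_respondents

# Ignoring the design leaves respondent variation inside the error term.
monadic_df = k * (n - 1)
monadic_error = (ss_respondents + ss_error) / monadic_df

# Respecting the design sets respondent variation aside, at a cost of n - 1 df.
rm_df = (n - 1) * (k - 1)
rm_error = ss_error / rm_df

print(f"respondents: SS = {ss_respondents:.3f}, df = {n - 1}")
print(f"error term ignoring respondents:   {monadic_error:.3f} on {monadic_df} df")
print(f"error term accounting for them:    {rm_error:.3f} on {rm_df} df")
```

In this example most of the within-concept variability is attributable to respondents, which is why the two error terms (4.873 and 1.595) differ so sharply.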
Advantages of Repeated Measures Designs:
The repeated measures design has several advantages over the monadic design. Because fewer people need to be interviewed, substantial reductions in field costs and timing can be achieved. This will be true whether the data are collected by phone, mail or personal interview.
Table Two
Example data with respondent means removed
| Respondent | Concept A | Concept B | Concept C | Respondent Mean |
| 1 | 0 | 0 | 0 | 0 |
| 2 | -2 | 0 | 2 | 0 |
| 3 | -2 | 0 | 2 | 0 |
| 4 | -1 | 1 | 0 | 0 |
| 5 | -2 | 0 | 2 | 0 |
| 6 | 1 | 0 | -1 | 0 |
| 7 | 0 | 0 | 0 | 0 |
| Concept Mean | -.857 | .143 | .714 | |
Differences in Concept Means
| Comparison | Difference |
| A Versus B | -1.000 |
| B Versus C | -.571 |
| C Versus A | 1.571 |
Respondents may differ among themselves due to the idiosyncratic nature with which they respond to attitude scales. Using a seven-point scale as an example, some people may restrict their ratings of objects to the top two or three points only. Others may use only the lower two or three points. This response patterning or bias, which is due to respondent differences and which does not necessarily affect the relative relationships among objects, can be isolated, studied and removed in a repeated measures design.
Finally, the error variability used in the analysis of mean differences can be reduced by estimating and eliminating differences due to respondents. Means and mean differences consequently can be estimated with greater statistical precision. The result is a more sensitive and powerful analysis allowing the researcher to detect differences among objects with greater confidence.
Precautions In Use:
As a last note, below are two conditions that ensure the validity of data collected through repeated measures designs:
- The evaluation of one object should not affect or bias the evaluation of any other object (as can happen in taste tests).
- The respondent should not be asked to evaluate so many objects as to bring on fatigue. As a safeguard the number of objects to be rated should be minimized and the order of presentation of the objects should be randomized to prevent order or position bias.
If a biasing effect, commonly called a carry-over or context effect, is anticipated, a monadic design should be used. (Special carry-over designs exist. The basic idea involved with these designs is to rotate the ordering of objects so that each object follows each other object, in presentation to respondents, an equal number of times. The biasing effect is assumed to be averaged out, but not necessarily removed.)
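One well-known construction with the rotation property described above is the Williams design, which is named here as an illustration and is not attributed to the original text. The Python sketch below builds a set of presentation orders in which each object immediately follows each other object equally often, and then counts the pairs to confirm it. The function names are illustrative.

```python
from collections import Counter

def williams_orders(objects):
    """Presentation orders balanced for immediate (first-order) carry-over.

    Even number of objects: n orders, each ordered pair occurs once.
    Odd number of objects: 2n orders, each ordered pair occurs twice.
    """
    n = len(objects)
    # First order visits positions 0, 1, n-1, 2, n-2, ...
    first, low, high = [0], 1, n - 1
    while len(first) < n:
        first.append(low)
        low += 1
        if len(first) < n:
            first.append(high)
            high -= 1
    # Remaining orders are cyclic shifts of the first one.
    rows = [[(p + shift) % n for p in first] for shift in range(n)]
    if n % 2 == 1:
        # With an odd count, the mirror images are needed for balance.
        rows += [row[::-1] for row in rows]
    return [[objects[i] for i in row] for row in rows]

def follow_counts(orders):
    """Count how often each object is presented immediately after each other."""
    counts = Counter()
    for order in orders:
        for earlier, later in zip(order, order[1:]):
            counts[(earlier, later)] += 1
    return counts

orders = williams_orders(["A", "B", "C", "D"])
for order in orders:
    print(" ".join(order))
counts = follow_counts(orders)
print(set(counts.values()))  # {1}: every ordered pair occurs exactly once
```

Respondents would be spread evenly across the resulting orders. As the text notes, such balancing averages a carry-over effect across objects; it does not remove the effect.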
How many objects a respondent can rate before fatigue sets in will depend on the complexity of the rating task. As no hard and fast rules are available, pretesting questionnaires is advisable.
If a large number of objects need to be rated, an incomplete block design is recommended. Use of such a design restricts the number of objects each respondent evaluates. Further, the incomplete block design can be balanced: All objects are rated an equal number of times; all possible pairs of objects appear an equal number of times. These design characteristics enhance and simplify subsequent statistical analyses.
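The two balance conditions can be checked mechanically. The Python sketch below uses the simplest balanced arrangement, in which the blocks are all possible subsets of a fixed size, purely as an assumption for illustration; more economical balanced designs exist for many combinations of objects and block size and are usually taken from published design tables. The five objects and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def all_subsets_design(objects, block_size):
    """One block per possible subset of `block_size` objects."""
    return [list(block) for block in combinations(objects, block_size)]

def balance_report(blocks):
    """Count appearances of each object and of each pair of objects."""
    object_counts = Counter(obj for block in blocks for obj in block)
    pair_counts = Counter(
        pair for block in blocks for pair in combinations(sorted(block), 2)
    )
    return object_counts, pair_counts

# Five hypothetical objects, each respondent rating only three of them.
objects = ["A", "B", "C", "D", "E"]
blocks = all_subsets_design(objects, 3)
object_counts, pair_counts = balance_report(blocks)

print(len(blocks))                       # 10 blocks
print(set(object_counts.values()))       # {6}: each object is rated six times
print(set(pair_counts.values()))         # {3}: each pair appears together three times
```

The same `balance_report` check can be applied to any proposed set of blocks: a design is balanced in the sense used above when both printed sets contain a single value.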
Summary:
Repeated measures designs, when correctly analyzed, give the researcher the ability to improve precision of statistical tests in making comparisons among objects. Such designs are also efficient from a cost and timing point of view.
