Examining group differences: components of t plots

Components of t plots are a valuable graphic aid in summarizing the results of t-tests for comparing means of independent groups on several variables. These plots show the direction and magnitude of differences between group means on all dependent variables, as well as confidence bounds adjusted to take into account the fact that comparisons are made on several variables.


Introduction:

The t-test is one of the most widely used techniques for assessing the statistical significance of differences between means. A common situation for which t-tests are appropriate is when a researcher wants to compare the means of one group of respondents with the means of one or more other groups on a number of variables. The groups could be based on demographic characteristics of the respondents, on the brand of a product used most often, on the product or concept evaluated in a monadic design, etc.

When the number of groups and/or variables is large, however, as is often the case in marketing applications, it is useful to have some method of efficiently summarizing the results of the numerous possible t-tests. In addition, when the number of t-tests is large, some adjustments of the significance levels are necessary in order to maintain a reasonable overall level of confidence across all of the tests.

This report describes a graphic procedure for displaying the results of t-tests for comparing means of independent groups on several variables. One group is designated as a standard or control with which all other groups are compared on each variable. The technique can also be used to examine differences among all possible pairs of groups. The information obtained from these t-tests is used to construct "components of t" plots, which show the directions and magnitudes of the differences between means as well as confidence limits adjusted to take into account the fact that comparisons are made on several characteristics.


Example:

A coffee manufacturer was interested in measuring consumer perceptions of four coffee blends, three of which were experimental and one of which was currently on the market. Prior to the expenditure of additional resources on the experimental blends, the manufacturer wanted to know how each of them fared relative to the product on the market. A screening study was conducted in which 400 respondents each used one of the four products for a week and then evaluated it on eight characteristics: flavor, richness, bitterness, smoothness, satisfying, color, texture, and value. Seven-point rating scales were used, with a 7 denoting "describes this product exactly" and a 1 meaning "does not describe this product at all."

Since the primary interest is in comparing the three experimental coffees (Products 1, 2, and 3) with the existing blend (Product 4), a typical analysis of the data would involve 24 t-tests to assess the differences between each of the three experimental blends and the existing product on all eight variables. If a significance criterion of .05 were used for each of these comparisons - that is, if one were willing to accept, for each comparison, a 5% risk of erroneously declaring a chance difference to be significant - then the overall risk of making at least one such error across the 24 tests would be very high. (The risks accumulate: the overall risk is bounded by the sum of the 5% risks for the individual t-tests, and if the 24 tests were independent it would be 1 - .95^24, or about 71%.) This implies a low overall confidence level across the 24 tests and suggests the need for some adjustment of the significance levels, as mentioned earlier.
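The accumulation of risk across tests can be illustrated with a short calculation. The sketch below is only an illustration: the function name is hypothetical, and the independence assumption is a simplification, since the 24 tests in the example share a control product and correlated ratings.

```python
# Overall (family-wise) risk of at least one false positive across m tests,
# each run at significance level alpha.
def familywise_risk(alpha, m):
    # Upper bound from adding the individual risks (capped at 1).
    additive_bound = min(1.0, m * alpha)
    # Value that would hold only if the m tests were independent
    # (an assumption made here purely for illustration).
    if_independent = 1.0 - (1.0 - alpha) ** m
    return additive_bound, if_independent

bound, indep = familywise_risk(alpha=0.05, m=24)
print(f"additive bound: {bound:.2f}, if independent: {indep:.3f}")
```

Either way of looking at it, an unadjusted 5% criterion per test leaves little overall confidence across 24 comparisons.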

The use of components of t plots in this case would serve two purposes. First, the plots would graphically depict the direction and magnitude of the differences between the experimental and existing products. Second, the plots would allow us to identify on which variables the products differ significantly while maintaining some prespecified level of confidence. In the next section, components of t plots are described and applied to the example data.


Components of t Plots:

The basic idea behind components of t plots stems from a simple re-expression of the t statistic. In its most familiar form, the t statistic for testing the equality of means of two independent groups is

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} $$

where X̄1 and X̄2 are the means of the two groups, n1 and n2 are the sample sizes, and s² is an estimate of the variance, obtained by pooling (or averaging) the variances within the two groups. The denominator of the t statistic is the standard error of the difference between the means (SED). This formula for t can be re-expressed as


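$$ t = C_1 - C_2, \qquad C_1 = \frac{\bar{X}_1}{\mathrm{SED}}, \quad C_2 = \frac{\bar{X}_2}{\mathrm{SED}} $$

The terms C1 and C2 are the components of the t statistic. They are simply the group means rescaled (divided by the SED) so that the difference between them yields the t value.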
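As a rough illustration of this re-expression, the following sketch computes the two components and the t value from summary statistics. It assumes the usual pooled-variance t statistic, with within-group variances weighted by their degrees of freedom, and takes each component to be the group mean divided by the SED. The function name and the input values are hypothetical and are not taken from the study.

```python
import math

def components_of_t(mean1, mean2, sd1, sd2, n1, n2):
    """Return (C1, C2, t) for two independent groups, using a pooled variance."""
    # Pooled variance: within-group variances weighted by degrees of freedom.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    # Standard error of the difference between the means (SED).
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    # Each component is the group mean rescaled by the SED.
    c1 = mean1 / sed
    c2 = mean2 / sed
    return c1, c2, c1 - c2

# Made-up illustrative values, not the study's data.
c1, c2, t = components_of_t(mean1=5.2, mean2=4.7, sd1=1.4, sd2=1.5, n1=100, n2=100)
print(f"C1 = {c1:.2f}, C2 = {c2:.2f}, t = {t:.2f}")
```

Because both means are divided by the same SED, the difference C1 - C2 reproduces the ordinary t value exactly.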

A components of t plot is constructed by plotting the components of t (C1) for one group against the components of t for another group (C2) for each variable. The general form of this plot is illustrated in Figure One. If the two groups being compared have identical means (and thus identical t components) on each variable, the plotted points would lie on the "line of no difference" extending from lower left to upper right in the figure. If the group means are similar, but not equal, the plotted t components would lie near this line.

The parallel lines above and below the line of no difference form the outer bounds of a "region of nonsignificant differences." These bounds are constructed so that any point falling outside the region of nonsignificance corresponds to a difference between means which is statistically significant with some prespecified level of confidence. To account for the fact that comparisons are made on several variables, the confidence bounds are adjusted using the Bonferroni inequality, to be discussed in detail in a future Research on Research report.
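One common way to apply the Bonferroni inequality is to divide the overall significance level equally among the comparisons. The sketch below follows that approach under stated assumptions: the tests are two-sided, and a normal quantile stands in for the t quantile, which is reasonable only for large samples. The function name is hypothetical, and the report does not state that its bounds were computed exactly this way.

```python
from statistics import NormalDist

def bonferroni_critical_value(overall_alpha, n_comparisons):
    # Split the overall risk equally across comparisons (Bonferroni),
    # then halve it because each test is assumed to be two-sided.
    per_tail = overall_alpha / n_comparisons / 2.0
    # Normal quantile used as a large-sample stand-in for the t quantile.
    return NormalDist().inv_cdf(1.0 - per_tail)

# Eight variables per pair of groups, 95% overall confidence.
adjusted = bonferroni_critical_value(overall_alpha=0.05, n_comparisons=8)
unadjusted = bonferroni_critical_value(overall_alpha=0.05, n_comparisons=1)
print(f"adjusted: {adjusted:.2f}, unadjusted: {unadjusted:.2f}")
```

Under these assumptions the two bounding lines are C2 = C1 + critical value and C2 = C1 - critical value, so adjusting for more comparisons widens the region of nonsignificant differences.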

A plotted point falling in the horizontally hatched region in Figure One indicates that the mean of Group Two is significantly greater than the mean of Group One on that variable. Similarly, a plotted point falling in the vertically hatched region indicates that the mean of Group One is significantly greater than the mean of Group Two.
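Reading the plot amounts to comparing C1 - C2 with the adjusted critical value. The following sketch shows that rule in code; the function name, the input components, and the critical value of 2.73 are hypothetical and chosen only for illustration.

```python
def classify_point(c1, c2, critical_value):
    """Locate a plotted point (C1, C2) relative to the region of nonsignificance."""
    t = c1 - c2  # the difference of the components is the t value
    if t > critical_value:
        return "Group One significantly greater"
    if t < -critical_value:
        return "Group Two significantly greater"
    return "no significant difference"

# Hypothetical components and critical value, for illustration only.
print(classify_point(26.0, 22.0, critical_value=2.73))
print(classify_point(22.0, 25.0, critical_value=2.73))
print(classify_point(25.3, 22.9, critical_value=2.73))
```

The three calls cover a point beyond each bound and a point inside the region of nonsignificant differences.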

For the example data, components of t plots were used to compare the three experimental coffees with the existing blend (Product 4) on the eight characteristics. Table One includes summary statistics for the 24 comparisons. For each comparison, the table shows the sample sizes, means, standard deviations, and components of t for each group, as well as the t statistic and its associated level of significance (or one minus the level of confidence).

The components of t plots are shown in Figures 2, 3, and 4. One plot is produced for each pair of products. Figure 2 is a plot of comparisons between Products 1 and 4, Figure 3 is a plot of comparisons between Products 2 and 4, and Figure 4 a plot of comparisons between Products 3 and 4. The confidence bounds shown in each figure are 95% bounds, which have been adjusted to account for the fact that eight comparisons are made for each pair of products.

Figure 2 indicates that Product 1 was rated significantly higher than Product 4 on flavor, richness, smoothness, value, and satisfying and significantly lower than Product 4 on texture. From Figure 3, we can see that Product 2 was rated significantly higher than Product 4 on flavor and richness and significantly lower on color. Finally, Figure 4 shows that Product 3 was rated higher than Product 4 on flavor and satisfying and significantly lower on color and texture.



Alternative Analysis:

It should be noted that a multivariate test, Hotelling's T², is also applicable for testing the statistical significance of differences between two groups on several variables. However, the results of this test do not indicate on which of the variables, if any, the groups differ significantly. Therefore, the use of t-tests with appropriately adjusted confidence limits is a valuable approach to the problem of comparing groups on several variables.

