Interpretation of T-test results

This paper discusses the relationship between t-statistics, obtained from testing differences between means, and correlations. The relationship should be beneficial in aiding the researcher to interpret and evaluate results of t-tests. Because potential information provided by correlations is generally not considered with t-test information, it is very possible that researchers are providing spurious conclusions to management.


An Example:

A study was conducted to determine the potential market acceptance of two variations of peanut butter. The products, as concepts, were presented for evaluation to two samples of 150 respondents each. Degree of believability was measured in addition to purchase intent and overall appeal.

A seven-point scale ("1" representing "totally unbelievable" and "7" being "totally believable") was used for measuring believability. Table One contains descriptive statistics summarizing responses.

Table One
Descriptive statistics for believability

               Mean    Variance    Sample Size
Concept One    5.7     2.81        150
Concept Two    4.9     2.94        150

Pooled variance: 2.875


Statistical Analysis:

A t-test of the significance of the difference between the means yielded a t-value of 4.09, statistically significant with at least 99% confidence; Concept One was judged to be more believable.
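As a check on the arithmetic, the t-value can be recomputed from the summary statistics in Table One alone. The sketch below assumes the equal-variance (pooled) two-sample t-test, which is consistent with the pooled variance reported in the table. The function name `pooled_t` is illustrative and not part of the original study.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t-statistic using a pooled variance estimate (illustrative helper)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference between the two means.
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / std_err, df, pooled_var

# Summary statistics from Table One.
t_value, df, pooled_var = pooled_t(5.7, 2.81, 150, 4.9, 2.94, 150)
print(round(pooled_var, 3), df, round(t_value, 2))  # 2.875 298 4.09
```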

It was also of interest to study the relationship or strength of association between believability and purchase intent (likelihood to purchase). A correlation coefficient of .23 was obtained. Although this correlation is statistically significantly greater than zero (again, with at least 99% confidence) it was dismissed as being too small to be of practical utility; no real relationship was felt to exist.


Discussion of Analysis:

An interpretive "double standard" appears when comparing decision rules for evaluating the significance of t-statistics from tests of mean differences and correlation coefficients. Statistical significance and the practical importance of a difference between two means are often considered synonymous. Indeed, some researchers may conclude marketing actions should be undertaken when two means differ with at least 95% confidence. Often, little regard is given to absolute magnitude of the difference.

Correlation coefficients, on the other hand, are evaluated quite differently. The size of the correlation, not statistical significance, forms the basis for assessing practical importance. Researchers frequently assume the correlations are statistically significant (or completely ignore the statistical test). Then they move to a second interpretive stage of evaluating the size of that correlation. As such, correlations below .4, although they may be statistically significant, are usually interpreted as indicating no "meaningful" relationship. Herein lies the double standard: researchers typically do not go to this second interpretive stage of examining the size of mean differences when evaluating t-statistics from tests of mean differences.

A valuable rule for evaluating mean differences beyond usual significance tests can be obtained by exploiting this double standard: combining the statistical/probabilistic advantages of significance testing and the realistic interpretive advantages of taking the magnitude of mean differences into account. This combination is possible through the use and understanding of the mathematical relationship between t-statistics and correlation coefficients.

A t-statistic can be expressed as a correlation. As such, t-statistics can be interpreted on the perhaps more easily understood correlation scale of -1 to +1. While examining this mathematical relationship, it will be of interest to show that correlations which result from re-expressing statistically significant t-statistics can be quite small. A change in perspective about interpreting t-statistics and/or correlations may be warranted.


The relationship between the t-statistic and the Correlation Coefficient:

The t-statistic is by far the most popular form of standardizing the difference between two means. The relationship between the t-statistic (t) and the correlation coefficient (r) can be shown by exploiting the idea of a standardized difference. (Standardization in this sense means dividing some statistic by a measure of its variability, typically its standard error.) The t-statistic is of the form

$$ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} $$

where the denominator is the standard error of the difference between the means using a pooled estimate of variability, $s_p^2$, and $n_1$ and $n_2$ are the two group sample sizes.

A second form of standardization of the mean difference is the point biserial correlation:

$$ r_{pb} = (\bar{X}_1 - \bar{X}_2)\sqrt{\frac{N P_1 P_2}{\sum_{i}\sum_{j}(X_{ij} - \bar{X})^2}} $$

where:

rpb is called the point biserial correlation, N is the total sample size, P1 and P2 are the proportions of respondents in groups one and two, respectively, $\bar{X}_1$ and $\bar{X}_2$ are the two group means, and $\sum_{i}\sum_{j}(X_{ij} - \bar{X})^2$ is the sum of squared deviations of all scores (Xij) from the grand mean $\bar{X}$.

Although mean differences can be expressed as a correlation using rpb, it is only one step toward showing the relationship between the t-statistic and the correlation coefficient. The second step is taken by noting that the significance of the difference of any correlation coefficient from zero can be tested by the use of a t-statistic. Using the point biserial correlation as a specific instance, the formula is:

$$ t = \frac{r_{pb}\sqrt{df}}{\sqrt{1 - r_{pb}^2}} $$

where df represents the number of degrees of freedom associated with the standard error of the correlation. Re-expressing the above formula to solve for rpb yields¹:

$$ r_{pb} = \sqrt{\frac{t^2}{t^2 + df}} $$

As such, given a t-statistic computed from a test of mean differences and its associated degrees of freedom it is possible to express that t-statistic as a correlation coefficient. (More information about the generality of the above formula is given in Appendix One.)
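The two conversions can be sketched in a few lines of Python. This is a minimal illustration, not the author's own procedure: the function names `t_to_r` and `r_to_t` are hypothetical, and the sign handling follows the footnote's convention that rpb takes the sign of the t-statistic.

```python
import math

def t_to_r(t, df):
    """Express a t-statistic as a correlation; the sign of t is carried over."""
    magnitude = math.sqrt(t * t / (t * t + df))
    return math.copysign(magnitude, t)

def r_to_t(r, df):
    """t-statistic for testing a correlation r against zero (requires |r| < 1)."""
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)

r = t_to_r(4.09, 298)
print(round(r, 2))               # 0.23
print(round(r_to_t(r, 298), 2))  # 4.09
```

Because each function inverts the other, converting in one direction and back returns the starting value up to floating-point rounding.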

[Note: The point biserial correlation can also be calculated using the usual Pearson product moment correlation formula. A second variable consisting of ones and zeroes is constructed and correlated with the variable containing the quantities of interest. This one/zero variable, called a dummy variable, would be created by arbitrarily assigning a score of one to one of the two groups of respondents. The second group would receive zero values. The absolute value of the correlation obtained by this method will match the absolute point biserial correlation value. The sign of the correlation may differ depending on which group receives what code.]
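The equivalence described in the note can be checked numerically. The sketch below uses a small set of ratings invented purely for illustration (they are not data from the study) and two hypothetical helper functions, `pearson` and `point_biserial`, written from the formulas given above.

```python
import math

def pearson(x, y):
    """Pearson product moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(group1, group2):
    """Point biserial correlation from the mean difference and total sum of squares."""
    scores = group1 + group2
    n = len(scores)
    p1, p2 = len(group1) / n, len(group2) / n
    grand_mean = sum(scores) / n
    ss_total = sum((s - grand_mean) ** 2 for s in scores)
    diff = sum(group1) / len(group1) - sum(group2) / len(group2)
    return diff * math.sqrt(n * p1 * p2 / ss_total)

# Hypothetical seven-point ratings, invented for this illustration only.
group1 = [7, 6, 5, 6, 4, 7]
group2 = [5, 4, 6, 3, 5, 4]

dummy = [1] * len(group1) + [0] * len(group2)  # group one coded 1, group two coded 0
r_dummy = pearson(dummy, group1 + group2)
r_pb = point_biserial(group1, group2)
r_flipped = pearson([1 - d for d in dummy], group1 + group2)  # codes reversed

print(r_dummy, r_pb, r_flipped)
```

With group one coded as 1 the two values agree; reversing the codes changes only the sign.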


Reconsideration of Earlier Example:

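Reconsider the example and data supplied earlier. The t-statistic is

$$ t = \frac{5.7 - 4.9}{\sqrt{2.875\left(\frac{1}{150} + \frac{1}{150}\right)}} = \frac{.8}{.1958} = 4.09 $$

with 298 degrees of freedom.

Next, the mean difference of .8 (5.7 - 4.9) can be standardized to give the point biserial correlation:

$$ r_{pb} = (5.7 - 4.9)\sqrt{\frac{300(.5)(.5)}{904.75}} = .23 $$

The total sum of squares, 904.75, is not reported in Table One; it is recovered here as the within-group component, 298(2.875) = 856.75, plus the between-group component, 300(.5)(.5)(.8)² = 48.

The relationship between the t-statistic and the correlation can be shown in two ways: the t-statistic can be re-expressed as a correlation,

$$ r_{pb} = \sqrt{\frac{t^2}{t^2 + df}} = \sqrt{\frac{(4.086)^2}{(4.086)^2 + 298}} = .2303 $$

matching the point biserial result. Or the significance of the point biserial correlation of .23 from zero can be tested:

$$ t = \frac{r_{pb}\sqrt{df}}{\sqrt{1 - r_{pb}^2}} = \frac{.2303\sqrt{298}}{\sqrt{1 - (.2303)^2}} = 4.09 $$

matching the t-test results (more decimal points are used to reduce round-off error). In both instances the results are consistent.

¹ rpb takes the original sign (+ or -) assumed by the t-statistic prior to squaring.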
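The full chain of calculations can be reproduced from the Table One summary statistics. The sketch below is one way to do so, assuming the reported variances are the usual sample variances (n - 1 denominators), so that the total sum of squares can be rebuilt from its within-group and between-group parts. The variable names are illustrative.

```python
import math

# Summary statistics from Table One.
mean1, var1, n1 = 5.7, 2.81, 150
mean2, var2, n2 = 4.9, 2.94, 150

n = n1 + n2
df = n - 2
p1, p2 = n1 / n, n2 / n
diff = mean1 - mean2

# Total sum of squares = within-group part + between-group part.
ss_within = (n1 - 1) * var1 + (n2 - 1) * var2
ss_between = n * p1 * p2 * diff ** 2
ss_total = ss_within + ss_between

r_pb = diff * math.sqrt(n * p1 * p2 / ss_total)           # point biserial correlation
t = diff / math.sqrt(ss_within / df * (1 / n1 + 1 / n2))  # pooled t-statistic

r_from_t = math.sqrt(t ** 2 / (t ** 2 + df))                # t re-expressed as r
t_from_r = r_pb * math.sqrt(df) / math.sqrt(1 - r_pb ** 2)  # r tested against zero

print(round(ss_total, 2), round(r_pb, 4), round(t, 3))  # 904.75 0.2303 4.086
print(round(r_from_t, 4), round(t_from_r, 3))           # 0.2303 4.086
```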


Interpretation:

Along with expressing mean differences and their associated t-statistics as correlations come the usual interpretive trappings. Correlations in this context retain their ease of interpretation. Specifically, the correlation measures the degree of systematic change in response, as indicated by group means, from one group to the other. The degree of change is represented by the absolute magnitude of the correlation; a large difference relative to its error variability would yield a large correlation. The larger the difference, the closer the correlation would be to ± 1, depending on the direction of the difference. Conversely, zero correlation would indicate no systematic change: the means are equal.

Further, the square of the correlation retains the meaning of "proportion of variability accounted for." Tailored slightly to fit this context, r² indicates the percentage of total variability of the measured quantity ("believability" in the preceding example) due to the systematic change in average response from group to group. If this percentage is small then, again, the difference between means is also small (relative to the total variability).
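As a worked instance for the believability example, using the t-statistic of 4.09 and the 298 degrees of freedom reported earlier, the proportion of variability accounted for is:

```latex
r_{pb}^2 = \frac{t^2}{t^2 + df} = \frac{(4.09)^2}{(4.09)^2 + 298} = \frac{16.73}{314.73} \approx .053
```

That is, roughly 5% of the total variability in believability ratings is associated with the difference between the two concepts, even though that difference is statistically significant with at least 99% confidence.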


t-Statistic/Correlation Coefficient Relationship Continued:

Consider the size of correlations corresponding to a t-value of 2.00 for various sample sizes (N) displayed in Table Two. The value 2.00 is used because it closely approximates the critical value for statistical significance with 95% confidence when the sample size is at least 50.² Although the mean differences are large enough to be considered statistically significant, the corresponding correlations are quite small, suggesting limited practical importance. Clearly, statistically significant t-statistics will not always correspond to correlations of an "acceptable" magnitude. As such, incorporating correlations into the realm of tests of mean differences shows that larger t-statistics may be desired.
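The entries of Table Two can be regenerated from the conversion formula. The sketch below assumes a two-group comparison, so that the degrees of freedom equal N - 2; the function name `r_for_t` is illustrative.

```python
import math

def r_for_t(t, n):
    """Correlation equivalent to a two-group t-statistic with df = n - 2."""
    df = n - 2
    return math.sqrt(t * t / (t * t + df))

# Sample sizes 50, 100, ..., 1000 with t fixed at 2.00.
for n in range(50, 1001, 50):
    print(f"{n:5d}  {r_for_t(2.0, n):.6f}")
```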


Table Two

Correlation Coefficients for Various Sample Sizes (t = 2.00, df = N - 2)

   N    r
  50    0.277350
 100    0.198030
 150    0.162221
 200    0.140720
 250    0.125988
 300    0.115087
 350    0.106600
 400    0.099751
 450    0.094072
 500    0.089264
 550    0.085126
 600    0.081514
 650    0.078326
 700    0.075485
 750    0.072932
 800    0.070622
 850    0.068519
 900    0.066593
 950    0.064820
1000    0.063182

² The t-statistic/correlation coefficient relationship is expanded upon in Appendix Two.


Evaluating t-Statistics: A Rule of Thumb:

A researcher can judge the t-statistic/correlation coefficient relationship in one of two ways: discount statistically significant t-statistics that correspond to small correlations, or have more respect for small correlations. The first of these alternatives may be more useful, as it can identify statistically significant mean differences that are also large enough to be of practical importance. Although there are no hard and fast rules for determining a small correlation (or when a correlation stops being small), experience with correlational techniques (regression and factor analysis) suggests a value of .4 to be on the borderline. A "rule of thumb" can be developed by substituting .4 into the significance test formula for correlations:

$$ t = \frac{r_{pb}\sqrt{df}}{\sqrt{1 - r_{pb}^2}} $$

Setting rpb = .4 and rearranging terms shows that the absolute value of the t-statistic equals $(.4364)\sqrt{df}$, since $.4/\sqrt{1 - (.4)^2} = .4364$. As such, a t-statistic would be considered as reflecting a practically useful mean difference if it were statistically significant with a reasonable level of confidence and roughly greater, in absolute size, than .4 times the square root of the number of degrees of freedom associated with the statistic.

From the example cited earlier, with 298 degrees of freedom the t-statistic would need to exceed 7.53 to be considered a statistically significant and practically useful mean difference.
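The rule of thumb can be written as a short check. This is a sketch under stated assumptions rather than a definitive procedure: the function names are hypothetical, the .4 cutoff is the borderline value suggested above, and the critical t-value for significance is supplied by the caller (2.0 is used here as the approximate 95% value from the text).

```python
import math

def practical_t_threshold(df, r_min=0.4):
    """Smallest |t| whose equivalent correlation reaches r_min."""
    return r_min * math.sqrt(df) / math.sqrt(1.0 - r_min ** 2)

def is_practically_useful(t, df, t_critical, r_min=0.4):
    """True when |t| clears both the significance cutoff and the size cutoff."""
    return abs(t) >= t_critical and abs(t) >= practical_t_threshold(df, r_min)

print(round(practical_t_threshold(298), 2))   # 7.53
print(is_practically_useful(4.09, 298, 2.0))  # False
```

For the peanut butter example, the observed t of 4.09 is statistically significant but falls short of the 7.53 needed under this rule.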


Summary:

Having a statistically significant t-statistic may not be enough to warrant a conclusion of "meaningful difference" or "practical importance." The magnitude of the mean difference should also be taken into account. Linking correlations with t-statistics is a simple way of providing some guidance.


Appendix One:

The formula relating t-statistics to correlation coefficients

$$ r = \sqrt{\frac{t^2}{t^2 + df}} $$

extends to more complicated cases. For example, it can be used to convert t-statistics from one degree of freedom contrasts in an analysis of variance. Also, it can be used on t-statistics calculated from differences between adjusted (least square or marginal) means from analysis of covariance or unbalanced factorial designs. Again, by having information on the t-statistic and its degrees of freedom the above general form can be used.³

However, the point biserial correlation is not readily generalizable to the examples cited. In general, without adjustment of the means and sum of squares components of the formula, which may be computationally difficult, the point biserial correlation will not give correct results. As such, use of the above general form is recommended.


The general use of this form stems from the fact that the components of the t-statistic reflect the appropriate adjustments included in more complex analyses. The numerator of the t-statistic contains the adjusted variability for the hypothesis of interest. The denominator has the appropriate error term and degrees of freedom, derived from the analysis of variance or regression model, to be used in converting the t-statistic to a correlation.




Appendix Two:

Figure One shows the relationship between t-statistics and correlation coefficients for t-statistic values between 1.25 and 20.00 at .25 increments. The four curves represent four degree of freedom values: 50, 250, 500 and 1000. Clearly, the relationship is monotonic. If the mean differences are small relative to their standard errors then the associated correlation will be small. Conversely, as the t-statistic values increase, the correlation coefficients increase as well, although more slowly for larger degrees of freedom. The correlation value of one will be achieved when t = ∞ which can occur only when the standard error of the mean difference is zero. Further, note that for a given t-statistic the correlation decreases as the degrees of freedom increase. Computationally, this is an obvious result. Statistically, it shows that as degrees of freedom (and hence sample size) increase smaller standardized differences are needed to achieve a constant level of significance. Larger samples increase precision making small differences appear more statistically significant. This same relationship will be true for correlations as a measure of standardized difference: smaller correlations are needed to express the same magnitude of differences as sample sizes increase.
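The curves described for Figure One can be tabulated directly. The sketch below assumes only the conversion formula and the grid stated above (t from 1.25 to 20.00 in steps of .25; degrees of freedom 50, 250, 500 and 1000). It does not reproduce the original figure itself, and the names are illustrative.

```python
import math

def t_to_r(t, df):
    """Correlation equivalent to a (non-negative) t-statistic with df degrees of freedom."""
    return math.sqrt(t * t / (t * t + df))

t_values = [1.25 + 0.25 * i for i in range(76)]  # 1.25, 1.50, ..., 20.00
dfs = [50, 250, 500, 1000]

# One curve per degrees-of-freedom value, evaluated over the whole t grid.
curves = {df: [t_to_r(t, df) for t in t_values] for df in dfs}

# Print a few points from each curve for inspection.
for t in (2.0, 5.0, 10.0, 20.0):
    i = t_values.index(t)
    print(t, [round(curves[df][i], 3) for df in dfs])
```

Each curve rises with t, and at any fixed t the correlation falls as the degrees of freedom increase, which is the pattern described above.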