Clustering Concepts

Introduction


There is great interest in the accumulation of information concerning the performance of concepts. This information, typically in the form of some measure of purchase likelihood coupled with marketing information, is used as the basis for models which predict future product profitability from concept performance. Predictive ability improves considerably with the quality, breadth and relevance of the accumulated information. However, many researchers may have scant data that can be used with these models: potentially many recently surveyed concepts, but few brought to market, and little idea as to levels at which to set marketing variables. In lieu of these data, a simple approach to identifying successful concepts will be presented which relies only on some measure of purchase intent. The techniques employed here will not estimate sales or ultimate product profitability. Rather, the goal is to simply identify some subset or cluster of concepts which is "better" (evaluated more positively) than others. A statement of statistical significance of the distinctiveness of these clusters is also available.


Background

A databank had been developed containing information on 39 children's vitamin concepts. As background, respondents were asked to evaluate a concept in terms of their likelihood to purchase the product presented by the concept. A five-point scale was used for evaluation. The top two boxes were labeled as "definitely" and "probably" would buy, respectively. Responses falling in the remaining three categories were considered as reflecting little enthusiasm for the product depicted. Each concept was evaluated by 150 female heads of households with children between the ages of 2 and 12 present in the households. Of specific interest here are the top two box purchase likelihood percentages, although any other measure of concept acceptance (e.g., means, top box percentages, ratios of top box to bottom box percentages) is equally applicable. The 39 concepts are listed in Table One, ordered by their top two box percentages.


Gaps and Clusters

The general approach taken here is the statistical assessment of differences, or gaps, between adjacent concepts, as measured, for example, by top two box purchase likelihood percentages. These gaps, in turn, help to define clusters of concepts. The basis for this assessment is the supposition that real or true differences among concepts will be exhibited as large gaps within an ordered set of concept percentages. These "naturally occurring" breaks in the data are taken as indicative of, or the result of, respondent discrimination among concepts. In addition, the size of the gap is a measure of stability or reliability of the clustering created by the gaps; large gaps indicate greater respondent discrimination.

Of specific interest are clusters which contain those concepts which have the largest top two box percentages. These clusters form a subset of the "best" concepts. Gaps, as measures of discrimination, can serve the purpose of delimiting such clusters and are the basis for statistical tests of significance of the existence of these clusters. Further, the gap which separates the "best" cluster from the rest of the data represents a boundary or hurdle that must be surpassed for a concept to gain entrance into this cluster. The upper bound of the gap, or the lowest top two box percentage included in the "best" cluster, is that value which must be met or exceeded for a concept to be included. Note that inferences concerning gaps apply only to concepts currently included in the databank. The position and/or size of gaps, and location of clusters, may change as concepts are added, so new analyses should be performed whenever the databank is updated.
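How a gap delimits the "best" cluster and sets the hurdle for entry can be sketched in a few lines of Python. This is an illustration only: the function name best_cluster is hypothetical, the five percentages are invented rather than taken from Table One, and cutting at the single largest gap is an assumption made for simplicity. Whether that gap is statistically significant is a separate question.

```python
def best_cluster(scores):
    """Split concepts at the largest gap between adjacent ordered scores.

    scores: dict mapping concept name -> top two box percentage.
    Returns (names in the top cluster, hurdle value, gap size).
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two concepts to form a gap")
    # gaps[i] separates ranked[i] from ranked[i + 1].
    gaps = [ranked[i][1] - ranked[i + 1][1] for i in range(len(ranked) - 1)]
    cut = max(range(len(gaps)), key=gaps.__getitem__)
    top = [name for name, _ in ranked[: cut + 1]]
    hurdle = ranked[cut][1]  # lowest percentage inside the top cluster
    return top, hurdle, round(gaps[cut], 1)


# Illustrative values only, not taken from Table One.
example = {"A": 50.0, "B": 49.5, "C": 46.0, "D": 45.8, "E": 45.1}
top, hurdle, gap = best_cluster(example)
print(top, hurdle, gap)
```

In this made-up example the largest gap falls between "B" and "C", so a new concept would need a percentage of at least 49.5 to join the top cluster.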



Ultimately, a statistical question will be raised concerning the size of a gap: how large must a gap be to be considered as indicative of the existence of a cluster? This question may best be answered with reference to a specific statistical model. In this instance, reliance is placed on the assumption that concepts are distributed in a uniform fashion. A uniform distribution is one within which data values (percentages in this example) are evenly spread out across the full range of values encountered.

Given the nature of the concepts tested, a uniform distribution is likely. Concepts are typically non-randomly selected for testing. Those that are obvious "losers" are eliminated before testing, thus reducing the chances of getting poor top two box purchase likelihood percentages. Truly great concepts probably already exist as products, so the upper end of the percentage continuum may be truncated as well. Within the band of remaining moderate percentages, concepts will probably be reasonably evenly dispersed.


Graphic Display of Gaps

The vitamin concept data in Table One are ordered by top two box percentages. Differences between adjacent concepts can be calculated by subtracting a concept's percentage from the percentage of the concept above it. These differences, or gaps, also appear in Table One. The gap between the two most highly rated concepts, "GG" and "FF," is .3 (51.4% - 51.1%). Note that the gap value for the highest concept is undefined, since no concept lies above it.
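The gap calculation is a simple difference between neighbors in the ordered list. A minimal sketch follows, using only the two percentages quoted above. The function name adjacent_gaps is hypothetical, and rounding to one decimal place is an assumption that matches the precision at which the percentages are reported.

```python
def adjacent_gaps(percentages):
    """Return the percentages ordered high to low, with the gap to the
    concept above each one.

    The first gap is None because the highest concept has no concept
    above it.
    """
    ordered = sorted(percentages, reverse=True)
    gaps = [None]
    for above, current in zip(ordered, ordered[1:]):
        gaps.append(round(above - current, 1))
    return ordered, gaps


# "GG" (51.4) and "FF" (51.1), as quoted in the text.
ordered, gaps = adjacent_gaps([51.1, 51.4])
print(list(zip(ordered, gaps)))
```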

Before any statistical analysis is performed, an examination of the gap values is useful to get a feel for their size and range. The concepts could be reordered by gap size but there are many numbers (38 gaps) to review. Consider, too, that this is a small databank. Reviewing larger sets of gaps would be even more difficult. A reasonable approach is to graphically display the concept top two box percentages in a form which highlights the gaps. One method is to create what is called a percentile plot: plotting each concept percentage against an index number indicating that concept's ranked position. The index number is small for small percentages and increases as percentages increase. Such a plot is shown in Figure One. Differences between adjacent top two box percentages are literally displayed as gaps. Large differences yield large gaps. For example, consider the difference between the third ("Q") and fourth ("Y") concepts. The difference of 3.5 is large relative to all others, as is the gap in the plot.
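The coordinates of a percentile plot are easy to produce. The sketch below pairs each percentage with its rank index and adds a crude text rendering so that no plotting library is needed. The function names are hypothetical. The values for "GG" and "FF" are those quoted in this paper; the values for "Q" (50.8) and "Y" (47.3) are inferred from the hurdle of 50.8 and the gap of 3.5 reported here, and should be read as assumptions.

```python
def percentile_plot_points(percentages):
    """Pair each percentage with its rank index (1 = smallest)."""
    ordered = sorted(percentages)
    return [(rank, value) for rank, value in enumerate(ordered, start=1)]


def text_plot(points, width=40):
    """Crude text rendering: one row per concept, highest rank first.

    The horizontal position of each '*' is proportional to the
    percentage, so a large gap shows up as a large horizontal jump
    between consecutive rows.
    """
    low = min(value for _, value in points)
    high = max(value for _, value in points)
    span = (high - low) or 1.0  # avoid dividing by zero if all values tie
    rows = []
    for rank, value in reversed(points):
        col = round((value - low) / span * (width - 1))
        rows.append(f"{rank:>3} {value:5.1f} " + " " * col + "*")
    return "\n".join(rows)


# GG, FF as quoted; Q and Y inferred (assumption).
points = percentile_plot_points([51.4, 51.1, 50.8, 47.3])
print(text_plot(points))
```

The same (rank, percentage) pairs can be passed to any charting tool to draw a conventional version of the plot.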


A Test of Significance

It now must be determined whether a particular gap is statistically significant or just an excessively large aberration in the data. An intuitively reasonable approach to significance assessment is to compare the size of a specific gap against an entire distribution of gaps obtained from randomly drawn, uniformly distributed data. A statistically significant gap would be relatively large when compared to (theoretically) all gaps that could be obtained from uniformly distributed data. Using statistical significance testing logic, the uniform distribution represents a "null case" in which no clusters exist. Large gaps are expected to occur only infrequently by chance. The significance of an obtained gap is found by noting the percentage of gaps from the "null" distribution which are larger, and can be interpreted as the percentage of times that a gap this size or larger would occur if the data from which it was drawn were uniformly distributed, i.e., no clusters really existed. Significance is commensurate with the risk of considering a gap (and the clusters separated by this gap) as "real and useful" when, in fact, no clusters of concepts really exist. Large gaps have small significance levels and, hence, small risks of suggesting the existence of clusters which really don't exist. Conversely, the percentage of gaps from the "null" distribution smaller than the gap of specific interest can be taken as the level of confidence that the gap is greater than would be expected by chance. Again, high confidence indicates the existence of clustering, or non-uniformity.

Of specific interest is the gap of 3.5, separating concepts "Q" and "Y." An estimate of significance is found by referring to a distribution of 38,000 gaps. The 38 gaps from each of 1,000 randomly generated uniformly distributed samples of 39 observations (the same as the number of concepts in Table One) were accumulated to form this distribution. Table Two displays a summary of gap values from this distribution along with their associated significance levels, against which the obtained gap of 3.5 is compared. The gap of 3.5 falls close to the .005 significance level: the chance occurrence of a gap as large as or larger than 3.5 from a uniform distribution which contains no clusters is roughly .5%. Conversely, the gap is considered significant with about 99.5% confidence. (The exact level of significance is .00553, which is calculated as the proportion of gaps within the reference distribution larger than 3.5. There are 210 such gaps which, when divided by 38,000, yield the appropriate significance level.) The gap separates the top three concepts into a cluster of "best" concepts. Entrance into this cluster requires a top two box percentage of at least 50.8.
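The construction of the reference distribution can be sketched as a small simulation. This is not the program used to produce Table Two. The function names are hypothetical, Python's random module stands in for whatever generator was actually used, and the range of the uniform samples (low and high) is a placeholder because the range is not stated here. Since the result depends on that range and on the random seed, the simulated significance level will not necessarily reproduce .00553.

```python
import random


def null_gaps(n_concepts, n_samples, low, high, seed=0):
    """Pool the adjacent gaps from many uniform samples (the "null case")."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_samples):
        sample = sorted(rng.uniform(low, high) for _ in range(n_concepts))
        pooled.extend(b - a for a, b in zip(sample, sample[1:]))
    return pooled


def significance(observed_gap, pooled):
    """Proportion of null gaps that are larger than the observed gap."""
    larger = sum(1 for g in pooled if g > observed_gap)
    return larger / len(pooled)


# low and high are placeholders (assumption): set them to the range
# over which the concept percentages are taken to be uniform.
pooled = null_gaps(n_concepts=39, n_samples=1000, low=25.0, high=51.4)
p = significance(3.5, pooled)
print(len(pooled), p)

# The figure quoted in the text: 210 of 38,000 gaps exceed 3.5.
print(round(210 / 38000, 5))
```

Each sample of 39 observations contributes 38 gaps, so 1,000 samples give the pool of 38,000 gaps described above.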



Adjusting Significance Levels

If there is interest in testing one and only one gap whose position is known before the data are examined (say between the third and fourth highest rated concepts), the estimated significance level obtained by the above method would be correct. However, several large, potentially useful gaps may be present, all of which appear only after some examination of the data. The subsequent selection of a subset of gaps for testing, after reviewing all gaps, leads to a biased estimate of significance. The actual level of significance can be much greater (i.e., much lower confidence) than reported by the testing method. The obtained probabilities of significance can be adjusted in a fashion akin to the use of the Bonferroni inequality (Research on Research Paper Number 30). A better measure of significance, which is at worst an upper bound on the true level of significance, is calculated by 1 - (1 - P)^n, where P is the level of significance obtained from the above method and n is the number of gaps. For example, if all 38 gaps from the vitamin concept data were examined but only one gap, that between concepts "Q" and "Y," was tested, the adjusted level of significance is at most 1 - (1 - .00553)^38, or .19. The gap is now considered significant with at least 81% confidence. This significance finding is tantamount to calculating the proportion of randomly drawn, uniformly distributed samples of 39 concepts each which have at least one gap as large as 3.5. Also, this significance level equals the true significance level if all of the gaps are independent of one another. Since gaps, in general, will be correlated to some extent (adjacent gaps will be most highly correlated since they are calculated from common concepts), the obtained significance level will almost always serve as an upper bound on the true level of significance.
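The adjustment is a one-line calculation. A minimal sketch follows, using the figures quoted above; the function name adjusted_significance is hypothetical.

```python
def adjusted_significance(p, n_gaps):
    """Upper bound on the significance level when n_gaps were examined.

    Implements 1 - (1 - p) ** n_gaps, where p is the unadjusted level.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability between 0 and 1")
    if n_gaps < 1:
        raise ValueError("n_gaps must be at least 1")
    return 1.0 - (1.0 - p) ** n_gaps


p_unadjusted = 210 / 38000  # 210 of 38,000 null gaps exceed 3.5
p_adjusted = adjusted_significance(p_unadjusted, 38)
print(round(p_adjusted, 2))  # 0.19, i.e. at least 81% confidence
```

With n_gaps equal to 1 the function returns the unadjusted level, which corresponds to testing a single gap chosen before the data were examined.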

Clearly, the level of confidence associated with the testing of a single gap with no adjustment yields too optimistic a result. Rarely, though, would all gaps be of interest for possible significance testing. If the objective of the analysis is to identify clusters of "best" concepts, restricting attention to gaps among the concepts with the largest top two box percentages is a reasonable compromise. Using only the nineteen largest concept percentages, those larger than the median, limits to 18 the number of gaps of possible interest. The upper bound on the level of significance for a gap of 3.5 is now obtained from the subset of the "null" distribution used above corresponding to the eighteen gaps between the largest nineteen top two box percentages for each of the 1,000 randomly generated uniform distributions. Eighty-nine of the 18,000 gaps are larger than 3.5, yielding an unadjusted level of significance of .00494. Then, 1 - (1 - .00494)^18 gives an adjusted significance level of .085, or significant with at least 91.5% confidence.
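Restricting the reference distribution to the upper half of each sample can be sketched as follows. As before, this is an illustration under stated assumptions: the function name is hypothetical, and low and high are placeholders for a range that is not given here, so the simulated counts will not necessarily match the 89 of 18,000 reported. The final lines simply check the arithmetic quoted in the text.

```python
import random


def upper_half_null_gaps(n_concepts, n_samples, low, high, seed=0):
    """Pool gaps among the values above the median of each uniform sample."""
    rng = random.Random(seed)
    n_top = n_concepts // 2  # 19 values above the median when n_concepts = 39
    pooled = []
    for _ in range(n_samples):
        sample = sorted(rng.uniform(low, high) for _ in range(n_concepts))
        top = sample[-n_top:]
        pooled.extend(b - a for a, b in zip(top, top[1:]))
    return pooled


# low and high are placeholders (assumption).
upper_pool = upper_half_null_gaps(39, 1000, low=25.0, high=51.4)
print(len(upper_pool))  # 18 gaps per sample, 18,000 in all

# Arithmetic quoted in the text: 89 of 18,000 gaps exceed 3.5.
p_upper = 89 / 18000
p_upper_adjusted = 1.0 - (1.0 - p_upper) ** 18
print(round(p_upper, 5), round(p_upper_adjusted, 3))
```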


When No Gaps Occur

Concept databanks will be encountered that have no significant gaps and, thus, no apparent "best" cluster. This is not to say that no "best" clusters exist; this merely indicates that no naturally occurring point of segmentation among concepts is present. Any attempt to cluster concepts must either be more arbitrarily imposed or based on additional information not previously considered.



A percentile plot, such as that shown in Figure One, can also be used to assess characteristics of the distribution of data. Specifically, if the data plotted form a straight line (as drawn in the figure), running from the lower left corner to the upper right, the data conform to a uniform distribution. If the data were perfectly uniformly distributed, the plot would display a perfectly straight line, formed by the one-to-one correspondence of differences between adjacent ranks and differences between adjacent top two box percentages. The existence of a gap suggests that the difference between two adjacent concepts is greater than would be expected if the data were perfectly uniformly distributed. The gap also suggests the position of an area of respondent discrimination, a zone which provides distinction between two groups of concepts. The data displayed in Figure One look straight enough to recommend use of the uniform distribution as a statistical model, yet wobbly enough to suggest the existence of large gaps.
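One rough way to put numbers on "straight enough" is to compare each ordered percentage with the value it would take on a straight line. The sketch below is a supplementary check rather than part of the method described in this paper: the function name is hypothetical, and drawing the reference line through the smallest and largest values is an assumption, since the line in the figure may have been drawn differently. The input values are invented for illustration.

```python
def line_deviations(percentages):
    """Deviation of each ordered value from a straight line through the
    smallest and largest values (evenly spaced data give all zeros)."""
    ordered = sorted(percentages)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    low, high = ordered[0], ordered[-1]
    step = (high - low) / (n - 1)
    return [round(value - (low + i * step), 2) for i, value in enumerate(ordered)]


# Invented values: the first set is evenly spaced, the second has a
# large gap below its top value.
even = line_deviations([10, 20, 30, 40])
gappy = line_deviations([10, 11, 12, 40])
print(even)
print(gappy)
```

Large deviations on one side of the line, followed by a jump back to it, mark the position of a gap.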