Censored Scales

Introduction


Discrete scales are often used to record responses concerning characteristics which really vary along a continuum. Examples in attitudinal measurement are most prevalent where respondents are given several choices (e.g., a 6-point agreement scale) which are used to capture attitudes that fall along a continuum. Number of units consumed (e.g., glasses of beer), or demographic characteristics, such as age or income, are also obtained by splitting a measurement continuum into several discrete intervals. Respondents are then instructed to check the scale position or interval which most accurately describes their intended response. Quite often the discrete measurements of consumption or demographics are characterized by open-ended intervals at the scale extremes. "More than 20 glasses" or "over 65 years of age" are examples of open-ended upper bounds on scales for beer consumption and age, respectively. "Under $10,000 a year" reflects the open-ended nature of a lower bound on income. This paper addresses some issues which arise when discrete scales with open-ended extremes are used to measure characteristics which really vary along a continuum.


Example Data

The product category is in-home fabric furniture cleaning kits. Market Facts conducted a survey among 3,000 Consumer Mail Panel members. The questionnaire contained a two-question sequence. The first asked for the number of kits ever purchased, while the second asked, conditional on a response of at least two kits, how much time had transpired between purchases. The intent of the latter question was to estimate the purchase cycle for this product. The first question was open-ended: the respondent wrote in a response. However, a rather crude five-point scale was used to record time between purchases. (See Table One.)

Of interest was the calculation of a purchase cycle, the average number of months between purchases, for each of two segments: 1) those who had purchased two kits, "single repeat" purchasers, and 2) "multiple repeat" purchasers, those who purchased more than two kits.* Each of the five scale points was coded in terms of months to facilitate calculation of an average with a useful interpretation. The coding is shown in Table One, along with frequencies and percentages of response for both segments and the average which is to be interpreted as indicative of a purchase cycle. The codes used here have no special mathematical or statistical origin and were chosen purely on intuitive grounds. The averages have some face validity in that "multiple repeat" purchasers have a shorter purchase cycle. But the average values themselves may be more arbitrarily calculated than would suit most researchers' needs.


*When defining segments in this fashion, it is quite likely that some respondents may be misclassified. Imposition of the questionnaire at a particular point in time may truncate the purchase cycle. Respondents who have yet to repurchase but may do so in the future are not represented in these segments. Further, those in the "single repeat" segment may, at some point, purchase again.



Before considering some analysis problems, the concept of "purchase cycle" needs to be defined. Literally, based on the information collected, purchase cycle refers to time between purchases. Time until next purchase is inferred. Longitudinal surveys, purchase diaries for example, could yield far more precise data on several subsequent purchases and the time between them. However, the product was considered to have a sufficiently long purchase cycle to warrant the use of this retrospective, cross-sectional survey. The construction and maintenance of a special diary panel was too costly. Further, product usage was considered very salient, with usage easily recalled over relatively long intervals of time. Errors in measurement due to poor recall (e.g., telescoping) were considered negligible.


Scale Problems

The problems with averages here stem from the scale from which these have been calculated. The inexactness of the month codes assigned to represent time between purchases for each of the five time intervals (response categories) is the first problem. For example, a person having one to two years transpire between kit purchases is assigned a "month score" of 18, which assumes, for analysis purposes, that 18 months is the average of all responses in that time interval. Within each response category, there is a span of time which, for the scale used here, increases as the length of time between purchases increases. (The scaling used does not yield equally spaced intervals, nor is there a consistent progression in the numbers used as month codes. A scale with equally spaced scale points, say by six-month intervals, would remedy these maladies.)

The use of a specific scale point to represent all responses in an interval would suggest, hopefully, that purchases within a time interval are at least symmetrically distributed around this point. If this is the case, this point would be both the average and midpoint of the interval, where the midpoint is halfway between the upper and lower bounds of the interval. In actuality, the true average value within an interval is unknown when dealing with grouped data and the symmetric distribution of responses within intervals is rarely the case, especially where the true underlying continuum is skewed, such as with the data presented here. In fact, interval averages and midpoints may differ greatly when the data are skewed and neither may be a good indicator of a central value (e.g., median) within the interval. The month codes shown in Table One are not always midpoints of their respective intervals, nor is it likely that they represent average responses.

Scale values which more accurately reflect true centers within each interval can be obtained. First, the distribution of data is smoothed. Graphically, this would be depicted by conversion from a bar chart indicating frequency of response in each interval, to a smooth curve representing a conjectured continuous distribution for the data. Calculus would be used to estimate the values which are more indicative of the centers within the intervals. As such, a certain amount of "play" within response categories is to be expected when a discrete scale is used to record quantities or events which take place along a continuum.
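The gap between an interval midpoint and an interval average can be illustrated with a minimal sketch. It assumes, purely for illustration, that time between purchases follows an exponential distribution; the distribution, the 18-month mean and the function name are assumptions introduced here, not quantities taken from the survey. For such a skewed curve, calculus gives a closed form for the average within an interval.

```python
import math


def exponential_interval_mean(lower, upper, mean_months):
    """Average of an exponential variable, given it falls in [lower, upper].

    Assumes an exponential distribution with the stated mean; this is an
    illustrative smoothing choice, not the paper's fitted curve.
    """
    w_lower = math.exp(-lower / mean_months)
    w_upper = math.exp(-upper / mean_months)
    # Closed form of the conditional mean obtained by integrating x * f(x).
    return mean_months + (lower * w_lower - upper * w_upper) / (w_lower - w_upper)


# "1 to 2 years" interval, hypothetical overall mean of 18 months.
interval_mean = exponential_interval_mean(12.0, 24.0, 18.0)
midpoint = (12.0 + 24.0) / 2
print(f"midpoint {midpoint:.2f}, interval average {interval_mean:.2f}")
```

Under this assumed curve the interval average falls below the midpoint of 18, because responses pile up toward the lower bound of the interval.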


Censoring

The major problem with this scale is that it is censored: there is no measurable upper bound. Respondents placing themselves in the last scale interval, "more than two years," are essentially unmeasured. At this upper bound the inexactness of the monthly code assigned, 36 months, is most noticeable. Coding responses as "36 months" assumes that the actual average response within this undefined interval is three years. To complicate matters, a large percentage of respondents placed themselves in this category, making the segment average especially sensitive to changes in coding this interval. For example, assuming that responses in this interval actually average around 48 months rather than 36 months dramatically changes the segment averages. The "multiple repeat" purchaser average increases to 23.88 months and the "single repeat" purchaser average advances to 35.34 months.
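The sensitivity described above follows from simple arithmetic: changing the code of one interval moves the segment average by the change in code times the share of respondents in that interval. The sketch below applies this to the "multiple repeat" segment, using the 696 censored respondents out of 1,924 reported later in the paper; the function name is hypothetical.

```python
def shift_in_average(n_censored, n_total, old_code, new_code):
    """Change in the segment average when the censored interval is recoded."""
    return (n_censored / n_total) * (new_code - old_code)


# "Multiple repeat" segment: 696 of 1,924 respondents in the last interval,
# recoded from 36 to 48 months.
shift = shift_in_average(696, 1924, 36, 48)
print(round(shift, 2))  # 4.34 months

# Average implied under the 36-month code, derived here from the 23.88
# figure rather than read from Table One.
print(round(23.88 - shift, 2))  # 19.54 months
```

A twelve-month change in a single code thus moves this segment's average by more than four months, which is why the choice of code for the censored interval matters so much.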


Alternative Analyses: Cumulative Percentages

A simple alternative, more easily interpreted than averages, yet still weak because of the scale used, is to accumulate the percentage of respondents having time between purchases within a certain interval: within six months, a year or two years. These cumulative percentages are shown in Table Two.



For example, 34.5% of "single repeat" purchasers and 63.8% of "multiple repeat" purchasers repurchased within two years. The two percentages can be more cleanly compared than the averages, yet still may be of limited interpretability due to the condensed nature of the scale. A chi-squared test of the equality of the percentage distributions and z-tests of specific differences between segment percentages may be used to assess statistical significance. The statistical assumptions underlying the use and interpretation of the chi-squared and z-tests are far more likely to be met than is the case for a test of differences between averages. The price to be paid for use of this alternative analysis is that the purchase cycle is no longer expressed or summarized as an average number of months. To the extent this is a problem, a second alternative is offered below.
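A z-test of the difference between two segment percentages might be sketched as follows, using the usual pooled two-proportion statistic. The "multiple repeat" base of 1,924 comes from the paper; the "single repeat" base of 500 is a made-up placeholder, since the actual base is reported only in the tables, and the function name is hypothetical.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic for the difference p1 - p2."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error


# Percent repurchasing within two years: 63.8% vs. 34.5%.
# n2=500 is a HYPOTHETICAL base size used only to make the sketch runnable.
z = two_proportion_z(0.638, 1924, 0.345, 500)
print(f"z = {z:.2f}")
```

With the actual base sizes substituted, a value beyond roughly 1.96 in absolute terms would indicate a difference significant at the conventional 5% level (two-sided).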


Alternative Analyses: Averages

An improved estimate of the average number of months, as indicative of a purchase cycle, can be obtained by more explicitly recognizing the censored nature of the data. The magnitude of the censoring is used directly in the calculation of the average. Further, this alternate approach allows for the estimation of a month code which "best" fits the distribution of data, "best" in the sense of fulfilling the requirements of the statistical model underlying this approach.

The procedure follows a number of steps, described below as applied to the "multiple repeat" segment. While the calculations are reasonably simple, estimation is aided greatly by the use of two tables which are not reproduced here. The tables are available upon request from Market Facts.

The first step in the calculation is the identification of the point of censoring, the month beyond which responses, as represented by the last scale interval, are unmeasured. This point, denoted as C, is 24 months, the upper bound of the fourth scale interval, "1 to 2 years." Further, 696 of the 1,924 respondents, or 36.2%, fall beyond this point and are considered censored. Among the "uncensored" 1,228 respondents, two statistics are needed: an average purchase cycle, estimated as 10.22 months (denoted u), and an average squared distance from this average. The latter value, denoted M, is calculated as:

M = (1/n) Σ (x - u)²,

where the sum is taken over the month codes x of the uncensored respondents.

(This average squared distance is very much like the variance but is divided by n, the number of respondents in the uncensored portion of the scale, rather than n - 1.) The average squared distance, uncensored average and point of censoring are used in the following fashion:

M / (u - C)²,

which yields:

35.81 / (10.22 - 24)²,

or a value of .1886. This statistic and the proportion censored (.362) are used to enter the first of the two tables referenced above. The resulting tabled value, denoted λ, is needed in the calculation of the total "multiple repeat" sample purchase cycle average and variance. The obtained tabled value for λ is .54. The formulas for the average and variance are, respectively:

u - λ(u - C),

which yields a value of:

10.22 - (.54)(10.22 - 24),

or 17.66, and:

M + λ(u - C)²,

which yields a value of:

35.81 + (.54)(10.22 - 24)²,

or 138.35. As such, the estimated purchase cycle is 17.66 months, about two months shorter than the average originally calculated. The variance associated with the purchase cycle is 138.35.
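The arithmetic of these steps can be checked with a short sketch. It takes the tabled value (.54) as an input, since the lookup tables are not reproduced here; the function and argument names are hypothetical.

```python
def censored_estimates(u, m, c, tabled_value):
    """Return (table entry statistic, adjusted average, adjusted variance).

    u: average among uncensored respondents
    m: average squared distance from u among uncensored respondents
    c: point of censoring
    tabled_value: value read from the first table (not computed here)
    """
    gap_squared = (u - c) ** 2
    entry_statistic = m / gap_squared
    adjusted_average = u - tabled_value * (u - c)
    adjusted_variance = m + tabled_value * gap_squared
    return entry_statistic, adjusted_average, adjusted_variance


# "Multiple repeat" segment values from the text.
stat, average, variance = censored_estimates(u=10.22, m=35.81, c=24.0, tabled_value=0.54)
print(f"{stat:.4f} {average:.2f} {variance:.2f}")  # 0.1886 17.66 138.35
```

The same function could be applied to the "single repeat" segment once its own uncensored average, squared distance and tabled value are in hand.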

Once the average is obtained, an estimate of the month code, which when assigned to the censored interval leads to this average, can be calculated as follows. An estimate of the total "multiple repeat" sample sum may be obtained by multiplying the estimated purchase cycle, 17.66, by the total number of respondents, 1,924. The product is 33,977.84 months. The sum of months attributable to the uncensored sample is subtracted from this total sample sum. The result is (33,977.84 - 12,546.50) = 21,431.34 months attributable to those in the censored interval. Division of this remainder by the number of censored respondents, 696, yields an average number of months of 30.8 which is the implicit month code for the censored interval. Note that this new code is about five months smaller than that used initially.
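The back-calculation of the implicit month code can be sketched in the same way, using only the figures quoted in this paragraph; the function name is hypothetical.

```python
def implicit_censored_code(adjusted_average, n_total, uncensored_sum, n_censored):
    """Month code for the censored interval that reproduces the adjusted average."""
    total_sum = adjusted_average * n_total      # estimated months, whole sample
    censored_sum = total_sum - uncensored_sum   # months left for the censored cell
    return censored_sum / n_censored


code = implicit_censored_code(17.66, 1924, 12546.50, 696)
print(round(code, 1))  # 30.8 months
```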

Lastly, the standard error of the total "multiple repeat" sample purchase cycle is estimated with the aid of the second table. The statistic:

is used to enter this table. The resulting tabled value, denoted B and equal to 1.12, is then used to modify the usual standard error calculation. The standard error is estimated as:

The standard error can be used to create confidence intervals around the estimated purchase cycle (refer to Research on Research Paper Number 34 for details) or to test the statistical significance of differences between purchase cycles.

Summary statistics for both segments are reported in Table Three. Although both segments are shown together, the purchase cycles are not comparable due to the separate and different codes used for the censored interval. Using the month code for one segment to calculate the purchase cycle for the other would eliminate this problem. Further, both segments could be combined, with the censored interval code calculated from the total sample. Both approaches should be used with caution given the disparity of the distributions for the two segments.


While this estimation procedure produced a slightly shorter purchase cycle for "multiple repeat" purchasers, the purchase cycle for the "single repeat" segment has increased by about two months. As might be expected, the month code for this segment is larger (by three months) than that used originally.


Additional Points

The above analyses are attempts at doing the best with bad data. A better approach is to obtain the information in a more useful form. A reasonable choice is a write-in response. The respondent may record the number of months between their last two purchases, the dates of their last two purchases or, perhaps, the typical or average time between purchases.

In lieu of the potentially more precise written-in response, a discrete scale with more response categories than used in the example can be applied. The scale can be improved by increasing the number of categories within which respondents may place themselves. Also, the time interval within each category should be shortened. In the above example, later response categories would be six months in length, rather than a year or more. Further, and to the point of the above two suggestions, an attempt should be made to minimize the portion of respondents falling into the censored interval at the end of the scale. With reasonable refinement in the number of categories, say in the vicinity of seven, and with a smaller portion, say 10%, in the censored cell, the average and its standard error can be more precisely estimated.