Factors Involved in Conducting Product Tests via Central Location Facilities

Introduction


The ultimate success of a product test can be judged by whether the resulting data:

Success, then, depends upon the validity of product measurements obtained and the reliability or consistency of these measurements. This success (i.e., that test product measurements are valid and reliable) is often taken for granted. The product tester may not realize that errors may have occurred in design or implementation of the test and consequently incorrect decisions might have been made.

Perhaps, then, a better definition of successful product tests can be developed: successful product tests are those in which biasing factors have been minimized or tightly controlled. For this paper, biasing factors will be defined as any factors which could confound or distort measurement of product characteristics.

Factors which present the greatest likelihood of confounding product test results and which must be controlled carefully include the following:

Each of these factors relates to the execution and "actual doing" of the product test. Control of these factors is critical to product testing success, regardless of the goal of the test itself.

This paper presents an example of, and guidelines for, product testing that ensure proper control of the set-up and execution factors listed above. The example is based on experience with tests conducted in central location testing facilities. Observations can be generalized to other modes of data collection, such as mall intercept facilities.


An Example

In recent years "convenience" meals have enjoyed great growth in consumer demand. Consumers desire meals that are easy to prepare and require shorter cooking times. Current offerings in the category include both conventionally cooked items, using stove-top or oven baking, and microwaveable products.

A manufacturer was planning to enter the market with a line of high quality, microwaveable beef and noodle dishes. At the time of this test, a fully microwaveable product was not available for testing, although a partially microwaveable prototype had been developed.

The Research and Development (R&D) Department had produced one flavor within this line of dishes and wished to test six variations of this flavor. These variations consisted of modifications to sauce thickness and salt level. Three levels of sauce thickness and two levels of salt were to be evaluated. Statistically, the product test could be described as a 3 x 2 factorial design. Respondents were asked to evaluate three of the six possible sauce/salt combinations. Of greatest concern to R&D was the identification of the most acceptable sauce thickness. Salt level was of secondary interest. As such, a split plot design was used to form a basis for blocking. Respondents were exposed to all three sauce levels within a level of salt. This approach to blocking maximizes the precision of statistical comparisons among sauces at the expense of salt level contrasts. Research on Research Paper Numbers 41 and 49 can be consulted for further details.
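
The blocking scheme described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure used in the original test. The level labels ("thin," "medium," "thick," "low," "high") and the simple alternation of respondents between salt levels are assumptions introduced here; in practice respondents would normally be assigned to salt levels at random.

```python
from itertools import product

# Hypothetical level labels; the paper does not name the actual levels.
SAUCE_LEVELS = ["thin", "medium", "thick"]
SALT_LEVELS = ["low", "high"]


def factorial_cells():
    """Return all sauce/salt combinations of the 3 x 2 factorial design."""
    return list(product(SAUCE_LEVELS, SALT_LEVELS))


def assign_blocks(respondent_ids):
    """Assign each respondent to one salt level (the block).

    Within that salt level the respondent evaluates all three sauce
    thicknesses. Alternating respondents between salt levels is an
    assumption made to keep the sketch deterministic.
    """
    plan = {}
    for i, rid in enumerate(respondent_ids):
        salt = SALT_LEVELS[i % len(SALT_LEVELS)]
        plan[rid] = [(sauce, salt) for sauce in SAUCE_LEVELS]
    return plan


cells = factorial_cells()
plan = assign_blocks(range(1, 7))
for rid, products in plan.items():
    print(rid, products)
```

Because every respondent sees all three sauces at a single salt level, sauce comparisons are made within respondents, while salt comparisons are made between groups of respondents. This is why the design favors precision for sauce thickness over salt level.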

The overall goal of the product test was to measure consumer reaction to the new flavor, and the extent to which sauce thickness and salt level affect consumer acceptability. R&D intended to use the information obtained in continuing product reformulation. To facilitate evaluation of this new product among consumers, a Central Location Facility was chosen as the method for data collection.


Environmental Issues

The facility constitutes the environment in which tasting occurs, and control of this environment is critical. Selection of the facility is typically based on two sets of criteria: the location of the facility and the environmental characteristics of the facility.

The facility should be located in an expanding residential area that offers a wide variety of respondents (ages, incomes, occupations) for the development of a respondent bank from which participants can be drawn for future tests. (No attempt is made to draw samples from the bank with strict statistical virtues. Convenience is the overriding concern in obtaining respondents who are at least representative of consumers within the product category of interest.) Younger, newly developed areas are preferable in order to ensure the ability to freshen and replenish the respondent bank over time. Further, the facility must be easily accessible to respondents in order to increase cooperation rates.

The most important characteristics of the facility include the kitchen, storage capacity, lighting and ventilation of the room where evaluations are conducted. The facility must be amenable to change with the requirements of the products being tested. For example, the beef and noodle dish required each main ingredient to be prepared separately. So, kitchen space requirements included adequate stove-top space for preparation of the meat, counter-top space for additional handling and preparation, and room for four microwave ovens to heat the final product.

Generally, products are presented as "holistic entities" to be evaluated in their natural prepared state across a multitude of product "dimensions." The test products are to be smelled and seen in addition to tasted. For example, the aroma emitted by products may be as crucial to correct evaluation as taste. The facility or environment in which the products are tested, then, must guarantee that nothing detracts from these aspects of the evaluation. When respondents evaluate more than a single product, strong aromatic products may adversely influence the evaluation of other products seen later in the testing sequence, resulting in a form of carry-over effect. Adequate rotation of products for presentation may minimize this problem. However, good ventilation is required to ensure that only the product currently being evaluated is sensed, and that other product odors do not linger or intermix.
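
One way to check whether a set of presentation orders is adequately rotated is sketched below. This is a hedged illustration, not a procedure taken from the test described here: it assumes three products labeled "A," "B" and "C" (hypothetical labels) and uses every possible serving order, then counts how often each product appears in each position and how often each product immediately follows each other product.

```python
from collections import Counter
from itertools import permutations


def all_orders(products):
    """Return every possible serving order of the given products."""
    return [list(p) for p in permutations(products)]


def position_counts(orders):
    """Count how often each product appears in each serving position."""
    return Counter(
        (pos, prod)
        for order in orders
        for pos, prod in enumerate(order, start=1)
    )


def predecessor_counts(orders):
    """Count how often product b is served immediately after product a."""
    return Counter(
        (a, b) for order in orders for a, b in zip(order, order[1:])
    )


orders = all_orders(["A", "B", "C"])  # hypothetical product labels
print(position_counts(orders))
print(predecessor_counts(orders))
```

When the counts are equal across products, no product is systematically favored by position or by what preceded it. Rotation spreads any carry-over evenly across products; it does not remove the carry-over itself, which is why ventilation remains necessary.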

Proper lighting is essential so that the product may be seen as it would be when served at home. Products may, then, be clearly and consistently evaluated in terms of appearance (e.g., size, amount and color of the beef and noodles).

The above issues illustrate the complex interaction between respondent and product. Controlling factors such as ventilation and lighting ensures that products are presented to respondents in a consistent and natural manner. For example, although a beef and noodle dish is placed before a respondent with the intent of gathering taste perceptions, other senses come into play. The initial smell and sight imparted by the product reach the respondent long before tasting occurs. The sense of smell intermingles with taste to create a complete impression of the palatability of the product. As such, the product is not just for tasting but for experiencing.


Product Preparation Issues

Product preparation instructions must be precise. The validity and reliability of the evaluations depend quite heavily on whether the test product has been prepared consistently and correctly. To ensure useful product evaluations, the following must be provided:

  1. An ingredient list
  2. A supply list
  3. A list of equipment and utensils required for product preparation
  4. Preparation, cooking and serving instructions
  5. "Holding time" allowed

The specific ingredients found in each product should be made clear to respondents. Some respondents may be sensitive to one or more of the ingredients and, hence, should be screened out at the recruiting stage.

The purpose of the supply list is twofold. It should alert those responsible for product preparation to the amount of product that will be delivered. This may prevent shortages, which cause interruptions in data collection. Once the amount of beef and noodles being sent is known, steps can be taken to ensure there is adequate refrigerator space for the beef and sufficient storage and handling room for the noodles. Further, yield (i.e., the number of servings per batch or package) is critical for planning the number of respondents to pre-recruit for each "seating."
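
The role of yield in planning a seating can be illustrated with a small calculation. The figures used below (20 respondents, 8 servings per batch) are hypothetical and are not taken from the test described in this paper; the sketch also assumes one serving per respondent for a given product.

```python
import math


def batches_needed(respondents, servings_per_batch):
    """Number of batches of one product needed to serve every respondent once."""
    if servings_per_batch <= 0:
        raise ValueError("servings_per_batch must be positive")
    return math.ceil(respondents / servings_per_batch)


def servings_left_over(respondents, servings_per_batch):
    """Servings prepared but not used, given whole batches only."""
    batches = batches_needed(respondents, servings_per_batch)
    return batches * servings_per_batch - respondents


# Hypothetical seating: 20 respondents, 8 servings per batch.
print(batches_needed(20, 8))      # 3
print(servings_left_over(20, 8))  # 4
```

A seating size that is a multiple of the yield wastes the least product, which is one reason the yield should be known before respondents are pre-recruited.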

The equipment necessary to prepare, cook and serve the product must be tailored specifically to the product being tested. As applied to the beef and noodle example, the test product was not fully microwaveable, so each ingredient had to be prepared separately before the microwaving took place. The beef had to be browned while the noodles were boiled, and both steps had to be completed before the dish was microwaved. Therefore, skillets and pots had to be available in the kitchen. Consistent preparation is always key, so this equipment had to be of uniform size and type. Different size skillets would have altered browning time (larger skillets tend to brown meat faster), and different pot diameters would have altered water evaporation during noodle preparation. The number of skillets and pots necessary depends greatly on the number of servings each batch or package provides, as well as the number of dishes respondents would try.

Once the pre-preparation of the product was completed, the actual cooking of the beef and noodles was done in a microwave oven. The wattage and type (digital vs. dial) had to be determined before preparation began. Different wattages can cause cooking times to fluctuate, and digital timers allow for a more precise measurement of cooking time. These considerations were especially important because more than one microwave was needed.

A final issue involved in product preparation is the maximum amount of "holding time" allowed, if any. Holding time is the amount of time a product may be kept (after cooking and prior to serving) before it must be discarded. Consideration of holding time becomes more critical as the number of products being tested increases. The beef and noodles required 15 minutes of preparation and cooking and could be held up to 15 minutes prior to serving. This amount of holding time allowed the kitchen to "overlap" products. While respondents evaluated one product, a second was prepared and held. This assured an even and timely flow of products for testing.
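
The "overlap" described above can be sketched as a simple backward schedule. The 15 minutes of preparation and cooking and the 15-minute holding limit come from the example; everything else is an assumption made for illustration: the evaluation time per product (10 minutes here), a kitchen that prepares only one product at a time, and the function and field names. Times are in minutes relative to the first serving, so negative values fall before the seating begins.

```python
def kitchen_schedule(n_products, eval_minutes, prep_minutes=15, max_hold_minutes=15):
    """Work backward from serving times to find preparation start times.

    Assumes (hypothetically) that the kitchen prepares one product at a
    time and that product k is served k * eval_minutes after the first.
    """
    serve_times = [k * eval_minutes for k in range(n_products)]
    starts = [0] * n_products
    next_start = None
    for k in reversed(range(n_products)):
        # Latest start that still has the product ready when it is served.
        latest = serve_times[k] - prep_minutes
        # The kitchen must also be free before the next product starts.
        if next_start is not None:
            latest = min(latest, next_start - prep_minutes)
        starts[k] = latest
        next_start = latest

    schedule = []
    for k in range(n_products):
        ready = starts[k] + prep_minutes
        hold = serve_times[k] - ready
        schedule.append({
            "product": k + 1,
            "start": starts[k],
            "ready": ready,
            "serve": serve_times[k],
            "hold": hold,
            "within_limit": hold <= max_hold_minutes,
        })
    return schedule


for row in kitchen_schedule(n_products=3, eval_minutes=10):
    print(row)
```

Under these assumptions the three products are held for 10, 5 and 0 minutes respectively, all within the limit. If the evaluation time were shortened to 5 minutes, the first product would have to be held for 20 minutes, which exceeds the limit and signals that a second preparation line or a longer interval between servings would be needed.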

How the product will be served to respondents is also an important consideration. The density of the beef and noodles required the use of steel, rather than plastic utensils. Further, the product involved a sauce, so paper plates would not have been a wise choice as they tend to absorb and bleed liquids.


Conducting a Dry Run

A final control or check before actual data collection begins is the "dry run." The main purpose of the dry run is to allow the people who prepare the product to go through the actual preparation step-by-step before testing begins. Any ambiguous or confusing instructions as well as mistakes can be caught and corrected before the test begins. Additionally, any unexpected product variation during this trial will be recognizable, allowing immediate discussions to take place with the client and greatly reducing any "down time." The dry run also lets those responsible for the product preparation see how the product should look and smell once it has been prepared.

A crucial factor in the success of any ongoing product testing program is communication among the parties involved in product development. This would include personnel from Marketing Research (both client and supplier), Marketing and R&D. The dry run serves as a focal point for fostering communication and cooperation among these parties. Each has its own set of issues and objectives to be addressed by the test. The dry run can serve to bring all these issues out at one time before the testing actually begins.


Respondent Preparation Issues

Consistent with attempts to maximize the goodness of the data (validity and reliability), respondents must be prepared to complete the task at hand. While the intent is not to "prep" or train respondents (as might happen with trained sensory panels), standardization is desired in the way in which responses are given.

Obviously, there is no control over what the respondent has eaten prior to the test or, in general, the tastes they may have in their mouths. One way to neutralize this unpredictable source of variance is to expose respondents to a "standard" product in advance of the test products. This standard product is evaluated by all respondents in the first position (first product seen). The standard product would be the same as, or similar to, the flavor of the actual test products. The purpose is to put all respondents on common ground in preparation for the actual evaluations (i.e., respondents start out with the same taste on their palates). Keeping in mind the purpose of tasting this product, no analysis should be done on the ratings obtained for this product.

Quite often respondents come to the product evaluation task unprepared and unaware of what is expected. Consequently, another important advantage of preliminary exposure to a standard product is that respondents become acclimated and gain a sense of what the product test entails. Time taken to familiarize respondents with the terms used to describe and rate the products leads to a greater understanding and appreciation of what is expected. Validity of the information obtained is enhanced if respondents understand the tasks before them. The effort of learning is, thus, absorbed by this first, standard product and should not differentially influence ratings of products tested later.

A second issue to consider is respondent preparation between product trials. To ensure that all products are being rated "fresh," respondents must cleanse their palates after the standard product is served, as well as between each subsequent tasting of the test products. The typical cleansing procedure involves eating an unsalted soda cracker which acts as a sponge, absorbing tastes (e.g., oils) left behind by the previous product. For rinsing purposes, the respondent is asked to drink a small cup (2 oz.) of bottled water served at room temperature.

Finally, respondents need to be given specific instructions for answering items in the questionnaire. As mentioned above, the standard product may allow respondents to learn how to respond to scaled items. In addition, respondents are told to be as specific as possible with their answers in responding to open-ended questions (e.g., "What, specifically, was good about the flavor?"). Also, respondents need to be instructed to answer all questions in their own words, and not to discuss any answers or aspects of the test with other respondents until the test has been concluded.


Summary

Taken as separate pieces, the importance of environmental, product preparation and respondent preparation factors can easily be overlooked. However, in order for the product test's mission to be performed with sufficient power, all of these factors must be addressed by the researcher.

Ultimately, the power of a product test lies in the test's ability to produce reliable and actionable data. By addressing these factors as a whole, the researcher is forced to recognize that the overall test is no better than its weakest element. In recognizing this interdependence, standardization in product testing methodology can develop. This is desirable, of course, because standardized testing procedures allow the researcher to limit the biasing factors which act to confound test results.