## Sources of Bias and Solutions to Bias in the Consumer Price Index

by
Jerry Hausman

Citation

Title:

Sources of Bias and Solutions to Bias in the Consumer Price Index

Author:

Jerry Hausman

Year:

2003

Publication:

The Journal of Economic Perspectives

Volume:

17

Issue:

1

Start Page:

23

End Page:

44

Language:

English

**Updated:**December 5th, 2012

Abstract:

Journal of Economic Perspectives-Volume 17, Number 1-Winter 2003-Pages 23-44

Sources of Bias and Solutions to Bias in the Consumer Price Index

Jerry Hausman

The idea of using a basket of goods as the basis for measuring the cost of living dates back to at least the early nineteenth century in England, as Diewert (1993) discusses in his interesting early history of price index research. As "every schoolboy knows" (an English expression), this "constant basket" approach suffers from numerous biases and flaws as the basis for calculating a cost-of-living index. It fails to allow for substitution that occurs when consumers switch away from goods that have become relatively more expensive and toward goods that have become relatively less expensive. It ignores the introduction of new goods. It ignores quality changes in existing goods. Finally, it ignores shifts in shopping patterns to lower-priced stores, like the shift to stores such as Wal-Mart, which is both the largest retailer for consumer products and the largest supermarket chain in the United States, a shift that creates the problem of "outlet bias."

These problems have been known for a long time; for example, the substitution issue is discussed in Bowley (1899), the new goods problem arises in Marshall (1887), and the quality change problem comes up in Sidgwick (1883). They are discussed again in the 2002 report from the National Research Council, At What Price (Schultze and Mackie, 2002). However, the study was primarily funded by the U.S. Bureau of Labor Statistics, and the new report basically accepts the current BLS approaches to these problems.1

1 I find it unfortunate that many economists have interpreted the Boskin et al. (1996) report as a "Republican" view of the Consumer Price Index and the report of the National Research Council as the "Democratic" response. For an example of such a discussion, see Madrick (2001). However, in my reading, some of the committee analysis in At What Price does seem designed to counter the Boskin report and to defend the Bureau of Labor Statistics approach.

Jerry Hausman is John and Jennie S. MacDonald Professor of Economics, Massachusetts Institute of Technology, Cambridge, Massachusetts. His e-mail address is ([email protected]).

Modern economics provides a solution to each of these problems. Use of a cost-of-living index based on utility functions (or, equivalently, expenditure functions) allows estimation of each of the effects of substitution, new goods, quality change and outlet bias. To estimate these effects, economists will need both price and quantity data, rather than using primarily monthly price data, which is the current BLS approach. Quantity data are a necessary input to solve the problems of new goods, changing quality and outlet shifts. However, quantity data are in large part readily available given the widespread collection of computerized retail outlet and household scanner data. Unfortunately, the BLS has not yet incorporated either modern economic theory or the availability of scanner quantity data into its estimation of the Consumer Price Index, which is meant to approximate a cost-of-living index.

In this paper, I will demonstrate that while the revised Bureau of Labor Statistics approach to the substitution effect is sufficient, the BLS approach to biases caused by new goods, quality change and new outlets is severely inadequate. What is often not recognized is that failure to include the substitution bias is a "second-order" effect, while failure to include the effects of new goods, quality changes in existing goods and outlet effects are all "first-order" effects.2 The substitution problem can largely be addressed by using a mathematical formula for calculating the Consumer Price Index that, instead of assuming a constant basket of goods, uses the (geometric) mean of the fixed basket approach before and after the price change. The specific formula that gives this average is the Fisher (1922) ideal index. However, a correct approach to incorporating new goods, quality improvements and outlet changes into a cost-of-living index cannot be based on a constant basket of goods and a survey of updated prices, not even if that basket of goods is gradually rotated and updated over time. Instead, it must take account of changes in both prices and quantities or, equivalently, changes in prices and expenditures (Diewert, 1998; Hausman, 1999). The BLS periodically collects data on quantities to estimate the weights that enter the Consumer Price Index. However, the BLS would need to collect quantity data at high frequency, similar to collection of price data, to take account of these three sources of bias.
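The geometric-mean construction described above can be illustrated with a minimal sketch. The two-good prices and quantities below are hypothetical, and the function names are chosen here for illustration only; they are not taken from any BLS procedure.

```python
from math import sqrt

def basket_cost(prices, quantities):
    """Cost of buying `quantities` at `prices`."""
    return sum(p * q for p, q in zip(prices, quantities))

def laspeyres(p0, p1, q0):
    """Fixed-basket index using the basket bought before the price change."""
    return basket_cost(p1, q0) / basket_cost(p0, q0)

def paasche(p0, p1, q1):
    """Fixed-basket index using the basket bought after the price change."""
    return basket_cost(p1, q1) / basket_cost(p0, q1)

def fisher(p0, p1, q0, q1):
    """Fisher ideal index: geometric mean of the two fixed-basket indexes."""
    return sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))

# Hypothetical data: good A doubles in price and consumers shift toward good B.
p0, p1 = [1.0, 1.0], [2.0, 1.0]
q0, q1 = [10.0, 10.0], [5.0, 20.0]

print(laspeyres(p0, p1, q0))   # index with the before-change basket
print(paasche(p0, p1, q1))     # index with the after-change basket
print(fisher(p0, p1, q0, q1))  # lies between the two
```

Note that the calculation needs the quantities in both periods, not only the prices, which is the data requirement the text emphasizes.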

2 By first- and second-order effects, I mean the term that arises in a Taylor expansion of the cost-of-living index, as I demonstrate subsequently.

Until fairly recently, data on quantities could not be collected in a cost-effective manner. However, beginning in about 1985, bar code scanners became common in U.S. retail outlets, and almost every retail outlet is now computerized. Two companies, AC Nielsen and IRI, collect price and quantity data in great detail from retail outlets and households and resell the data to manufacturers. Supermarkets, neighborhood convenience stores, pharmacies and "big-box" retail outlets all have data that can be purchased from vendors. These companies gather the information at the "stock keeping unit" level so that not only is the exact product known (say, Apple Cinnamon Cheerios) but the package size and type is also known along with the price. Family purchases in terms of prices and quantities for a random sample of households, again using scanners, are also available. Scanner data are available almost immediately. Thus, the quantity data needed to estimate an accurate cost-of-living index are in large part available.3

Thus, my suggestion would be for the Bureau of Labor Statistics to begin to collect these quantity and price data and for the BLS to research and develop methods to collect quantity data where it does not currently exist. Sending price surveyors out to stores, which is the original approach used in England in the nineteenth century and is the main approach currently used by the BLS, will not get the job done in the twenty-first century.

### Evaluation of Biases in the CPI

I conduct all my analysis in terms of a cost-of-living index (see the Appendix, where I define mathematically the cost-of-living index). A cost-of-living index is the correct theoretical tool to measure the effect on consumer welfare of price changes, quality changes and introduction of new goods, as the academic literature has long noted (for example, Boskin et al., 1996). The Bureau of Labor Statistics has recognized (Abraham, Greenlees and Moulton, 1998) that a cost-of-living index provides the correct approach, although, oddly enough, the recent National Research Council report seems ambivalent on this issue (Schultze and Mackie, 2002, chapter 2; Schultze, this issue). A cost-of-living index is based on the minimum levels of income needed to reach a given utility level at two different time periods, given the prices and goods available in the economy.

#### The Effect of New Goods and Services on a Cost-of-Living Index

Many new products and services have a significant effect on consumer welfare. The gain in consumer welfare from one new product, the introduction of the cellular telephone in the United States, exceeded $50 billion per year in 1994 and $111 billion per year in 1999 (Hausman, 1997a, 1999, 2002a). However, the Bureau of Labor Statistics approach is to omit the introduction of new goods in its calculation of the Consumer Price Index until they are eventually discovered as part of the gradual rotation of the sample of goods. This approach can take considerable time; for example, the BLS did not include cellular telephones in the CPI calculations until 15 years after their introduction in the United States (Hausman, 1999, 2002a). Even when the no-longer-new good eventually does enter the CPI calculation, no adjustment is made for the consumer gains it provides in relation to the earlier goods.5 The Committee recommends that the BLS continue this practice (Schultze and Mackie, 2002, p. 160).

3 The treatment of scanner data in the At What Price report (Schultze and Mackie, 2002) is disappointing in many ways. While the committee worried that the Consumer Expenditure Survey is inaccurate (p. 253), it did not explore the use of electronic collection of family expenditure data that is currently ongoing. Although the committee repeatedly emphasizes the two- or three-year delay associated with collection of expenditure or quantity data (for example, p. 57), it fails to notice that scanner data is available almost in real time. While the committee discusses the use of scanner data within the current Bureau of Labor Statistics framework (pp. 266ff), it has only a very brief discussion regarding the use of scanner data to decrease biases in the Consumer Price Index (pp. 273ff), and it does not recognize the requirement of using quantity data to reduce bias.

4 Silver and Heravi (2001a, b) discuss the results of using scanner data in the United Kingdom and its effect on the calculation of traditional price indices.

To include new goods in a cost-of-living index, the key conceptual step is to use a "virtual price" for the new good before its appearance, which as Hicks (1940) demonstrated, sets quantity demanded equal to zero.6 Estimation of this virtual price requires estimation of a demand function. Given the demand function, the analyst can solve for the virtual price and for the expenditure function, as in Hausman (1981), and thus make an exact calculation of consumer welfare and the change in the cost-of-living index from the introduction of a new product or service. This calculation is presented in the Appendix. There are various methods to estimate a demand curve. One can specify a parametric form of the demand function as in Hausman (1981, 1996, 1999), or alternatively, estimation of a nonparametric demand curve could be used with welfare calculations following the approach of Hausman and Newey (1995). All of these approaches will require significant amounts of both price and quantity data. While substantial econometric issues can arise over estimation of demand curves, recent econometric advances for estimation with differentiated products have evolved using scanner data and discrete choice approaches.7 I expect further research on the topic, but the availability of large amounts of panel data from scanners and other sources reduces the econometric problems of estimating demand curves and expenditure functions significantly.
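As a rough illustration of the virtual-price idea, the sketch below fits a linear demand curve to price and quantity observations and solves for the price at which quantity demanded is zero. Everything specific here is an assumption made for illustration: the linear functional form, the data, and the function names. It also uses the market demand curve rather than the compensated demand curve that an exact cost-of-living calculation requires.

```python
def fit_linear_demand(prices, quantities):
    """Least-squares fit of q = a - b * p; returns (a, b)."""
    n = len(prices)
    mean_p = sum(prices) / n
    mean_q = sum(quantities) / n
    cov_pq = sum((p - mean_p) * (q - mean_q) for p, q in zip(prices, quantities))
    var_p = sum((p - mean_p) ** 2 for p in prices)
    slope = cov_pq / var_p
    intercept = mean_q - slope * mean_p
    return intercept, -slope

def virtual_price(a, b):
    """Price at which the fitted demand q = a - b * p equals zero."""
    return a / b

def surplus_at(price, a, b):
    """Area under the fitted linear demand curve above `price`."""
    quantity = a - b * price
    return 0.5 * (virtual_price(a, b) - price) * quantity

# Hypothetical scanner-style observations of price and quantity.
prices = [10.0, 8.0, 6.0, 4.0]
quantities = [50.0, 60.0, 70.0, 80.0]

a, b = fit_linear_demand(prices, quantities)
print(virtual_price(a, b))    # price that chokes off demand
print(surplus_at(4.0, a, b))  # welfare gain at the latest observed price
```

The fit cannot be done without the quantity column, which is why price surveys alone do not identify the virtual price.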

As a simpler alternative, I have proposed a conservative approach that decreases the information requirements and should provide a "lower bound" estimate, which I have applied in Hausman (1996, 1997a, 1999). Figure 1 illustrates this approach. The current price and quantity for the good is given by the point at $(p_n, q_n)$ on the convex demand curve labeled D. Ideally, we would estimate the (compensated) demand curve and find the virtual price p* where the quantity demanded will equal zero. The change in expenditure needed to hold utility constant with the introduction of the new product is the compensating variation, which is measured by the area under the compensated demand curve above the observed price. However, as a simple approximation one can take the line that

5 When goods disappear from the market, the opposite effect occurs. However, typically only unsuccessful (unprofitable) goods disappear, and their negative effect on consumer welfare will typically be small, as I demonstrate in Hausman (1999). The methodology developed here can also be used to measure the negative effect on economic welfare from the disappearance of goods from the market.

6 Hicks (1940) used the market demand curve, while the correct approach for a cost-of-living index is to use the compensated demand curve, as in Hausman (1996).

7 Identification and consistent estimation of the demand curve with differentiated goods poses a potential problem. However, the use of scanner panel data allows a solution to the problem. For discussion, see Hausman, Leonard and Zona (1994), Hausman (1996) and Hausman and Leonard (2002). See also Berry, Levinsohn and Pakes (1995) and Petrin (2002) for the use of a discrete choice approach.

approaches require both price and quantity data, which follows from the definition of a price elasticity.

The introduction of new goods is likely to have a significant effect in a correctly calculated cost-of-living index. Some economists have claimed that the introduction of many new goods offers only a trivial degree of variation that typically does not benefit consumers (for example, Bresnahan, 1997). However, even a new product like Apple Cinnamon Cheerios seems to offer substantial welfare gains (Hausman, 1996).8 Hausman (1997a, 1999) also found large consumer benefits from the introduction of cellular phones, and Petrin (2002) found large consumer effects from the introduction of the minivan. I call this outcome the "invisible hand of imperfect competition": when firms introduce successful new products in the expectation of making economic profits, significant consumer surplus will also be created.9

The cost of introducing a substantial new product is typically in the tens of millions of dollars and may exceed $100 million. Nonetheless, companies make hundreds of product introductions of this magnitude each year. Most of these costs are sunk, so that the firm must expect to earn sufficient profits to recoup its investment in the new product introduction. While many new product introductions fail, a sufficient number succeed to make the investment profitable in a risk-adjusted expected value sense. Thus, in expectation the firm has profits Π ≥ F, where F is the fixed cost for introducing and advertising the product and Π is the difference between revenue and variable cost. The new product will create consumer surplus as well. If the demand elasticity η is very large, the firm cannot earn significant marginal profits to earn back its sunk cost investment in the new good. However, a moderate price elasticity leads both to the possibility of the firm earning profits and also to significant consumer surplus if the good is successful. The relationship between consumer surplus and profits is a close one.

To demonstrate this claim, I begin with a constant elasticity demand curve and constant marginal costs, MC, while holding prices of other products constant. Consumer surplus for the price elasticity (in absolute value) η > 1 is CS = pq/(η − 1). The price elasticity η > 1 because a firm setting price in an imperfectly competitive market will not operate in a range where product demand is inelastic. Calculating marginal profit with constant marginal cost, MC = c, and using the first-order conditions that set (p − c) = p/η, the firm's profits (producer surplus) equal Π = pq/η. Rearranging and using the first-order condition, the firm's profits Π = CS/δ, where δ = η/(η − 1) > 1. Therefore, consumer surplus exceeds the fixed costs of introduction since

8 See Hausman and Leonard (2002) for a recent estimation of the consumer welfare gains from another new consumer product.

9 While significant consumer benefit will arise, social welfare need not increase, because much of the producer surplus for the firm introducing the new good may arise from the "business stealing" effect from other firms, as discussed, for example, in Spence (1976) and the numerous papers that have followed.

the analysis here in terms of the consumer surplus, rather than the compensating variation between two expenditure functions, to keep the intuition straightforward. But the finding holds also when the compensating variation is used.
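The constant-elasticity relationships used in this argument, that consumer surplus equals pq/(η − 1) and profit equals pq/η, can be checked numerically. The sketch below is only an illustration: the demand curve q = A·p^(−η) and the parameter values are hypothetical, and the surplus is computed by integrating that demand curve above the price.

```python
def demand(p, A, eta):
    """Constant-elasticity demand q = A * p**(-eta)."""
    return A * p ** (-eta)

def consumer_surplus_numeric(p, A, eta, steps=10_000):
    """Integrate demand from p to infinity using the substitution s = p / t."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h               # midpoint of each slice of (0, 1]
        s = p / t                       # price level mapped from t
        total += demand(s, A, eta) * (p / t ** 2) * h
    return total

# Hypothetical parameter values.
eta, c, A = 3.0, 1.0, 100.0

p = c * eta / (eta - 1.0)               # solves the condition (p - c) = p / eta
q = demand(p, A, eta)
profit = (p - c) * q                    # producer surplus before fixed costs
cs_formula = p * q / (eta - 1.0)
cs_numeric = consumer_surplus_numeric(p, A, eta)
delta = eta / (eta - 1.0)

print(profit, p * q / eta)              # the two profit expressions agree
print(cs_formula, cs_numeric)           # formula matches the integral
print(cs_formula / delta)               # equals profit
```

Because δ exceeds one, consumer surplus is larger than profit in this setup, so a product that covers its fixed cost out of profit generates consumer surplus at least as large.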

Of course, the calculation of a precise compensating variation for the introduction of a new good raises various econometric issues. Nevertheless, the effect on consumer surplus from the introduction of a successful new good will typically be substantial. Moreover, the usual outcome is that with the invention of a new product, the prices for current substitute products will decrease, so that consumer surplus will increase even more than this calculation (for example, Hausman and Leonard, 2002). However, a multiproduct firm that introduces a new product may be able to increase prices of its other products (Hausman, 1996), which creates a partially offsetting effect.

Taking these factors as a whole, the "invisible hand of imperfect competition" will typically lead to significant welfare gains from successful new product introduction; otherwise, firms would not find it economically rational to introduce new products. Omission of the effects of the introduction of new goods by the Bureau of Labor Statistics is likely to create substantial upward bias in the Consumer Price Index.

#### The Effects of Quality Change on a Cost-of-Living Index

The Bureau of Labor Statistics does not adjust directly for quality change. It does gradually rotate new higher-quality goods into the sample that is used to calculate the Consumer Price Index, but the results can be problematic. When the product with improved quality enters the sample, it is matched to an existing product, and a linking procedure follows. For example, when Windows 95 was introduced, it replaced the combination of MS-DOS and Windows 3.1. The BLS procedure led to introduction of the product with no quality adjustment, compared to the existing operating systems. But Windows 95 had much superior functionality, as market evidence demonstrates, since most of the consumers who purchased Windows 95 could have continued to purchase MS-DOS/Windows 3.1, often at a lower price. Because the BLS procedure fails to capture the quality improvement, the measured CPI will overstate the rise in a true cost-of-living index. For some products, the BLS uses a hedonic procedure to make adjustments for quality. I will discuss this procedure later.

Introduction of new goods and improved quality of existing goods are similar economic effects, which enter a cost-of-living index in a similar manner as I discuss in the Appendix. Assume that the old quality good exists in period 1, while the new quality good exists in period 2. The difference in the minimum expenditure needed in period 2 to achieve the same utility that existed in period 1 will yield the compensating variation. Figure 3 illustrates the case where the good with improved

the demand curve would not be concave to the origin. For most new differentiated products, I expect this outcome to be unlikely, since brand name has a significant role in demand.

index.12 In the Appendix, I provide an example of an exact calculation of the gains from a quality improvement.

The Schultze panel recommends that the Bureau of Labor Statistics should endeavor to introduce new and improved goods into the Consumer Price Index earlier. Much more frequent updates of expenditure weights in the CPI would help alleviate the current problem, as has been recommended by economists in the past (for example, Boskin et al., 1996). But more frequent updates do not really solve the problem of quality change. For many products, quality improvements are a continuous event, causing the demand curve for the product to shift outward for a significant period of time. Moreover, this improved quality is often accompanied by a greater diffusion of the product. As an example, think of the evolution of cellular phones in recent years as they have become smaller, more reliable, with longer battery life and more convenient to use (Hausman, 1997a, 1999, 2002a). In the presence of the introduction of new goods and quality improvement of existing goods, both prices and quantities (or alternatively, prices and expenditures) must be used to calculate a correct cost-of-living index. Using only prices and ignoring the information in quantity data will never allow for a correct estimate of a cost-of-living index in the presence of new goods and improvements in existing goods.

#### The Effect of Lower Price Stores: Outlet Bias

An important market outcome in retailing over the past 30 years is the growth of discount retail outlets such as Wal-Mart, Best Buy and Circuit City. Wal-Mart is now the largest supermarket chain in the United States, and for a number of branded goods, it now sells 10 to 20 percent of the output of large branded companies; for example, in 2001, Wal-Mart made 12 percent of total sales for Gillette, 15 percent for Procter and Gamble and 20 percent for Revlon. These outlets offer significantly lower prices than "traditional" outlets such as department stores.

12 The quality of individual goods may also decrease, but the analysis would be the same.

The Bureau of Labor Statistics approach gradually rotates products sold at these discount outlets into the Consumer Price Index calculations. However, when a certain good sold at a department store is rotated out of the index and the same good sold at a discount retail outlet is rotated into the index, the BLS procedure treats these as different goods, not as a reduction in the price of the same good. The result is that the measured inflation rate fails to capture the gains as consumers shift to these low-price retailers. Knowing only the prices at the discount outlet does not give the needed information to calculate a cost-of-living index; it is also necessary to measure the quantity purchased in the discount outlets. Further, differences in service quality and how they affect consumer surplus could be estimated using the techniques that I discussed above if quantity data are available. While some consumers do not like shopping at Wal-Mart, the situation is similar to the outcome that some consumers do not buy new goods since they continue to prefer the old products. However, for those consumers who shift to Wal-Mart, the gain in utility is first order.13 Obtaining current price and quantity data from these outlets would be straightforward, since they all employ scanner technology.

While the Schultze panel recognizes the possibility of "outlet bias," it essentially recommends ignoring this problem (Schultze and Mackie, 2002, p. 176). It agrees with the implicit assumption made by the Bureau of Labor Statistics that the lower prices at discount retailers reflect a lower quality of service, and thus that the "service-adjusted" price is not actually lower. This argument fails on at least two grounds. First, the managers and shareholders of Wal-Mart would be amazed to find that consumers are indifferent between shopping at Wal-Mart and shopping at traditional outlets, given Wal-Mart's phenomenal growth over the years. Indeed, Wal-Mart's lower prices are in part related to superior logistical systems that have nothing to do with lower service quality levels. Second, many consumers seem to place relatively lower value on their time spent shopping. This is one reason retail e-commerce has performed so poorly: many consumers didn't place a high value on reducing their time spent shopping.

Thus, the market evidence goes against the claim that outlet bias is not an important factor in a correct calculation of a cost-of-living index. When the discount stores are gaining market share, then this market outcome is evidence of an outlet substitution effect that should not be ignored by the Bureau of Labor Statistics.

#### Substitution Bias

Substitution bias, the problem that arises because a fixed basket of goods does not take into account that consumers will shift away from goods that have become relatively more expensive, is worth some attention. Shapiro and Wilcox (1997) found that substitution bias leads to overstating the rise in a cost-of-living index by about 0.3 percent per year. But the National Research Council panel spent too much of its effort discussing substitution bias, compared to the other biases from new goods, quality change and outlets. The substitution bias is a second-order effect, while the other three are first-order effects.

The terms "first-order" and "second-order" have a mathematical meaning; they refer to the terms in a Taylor series expansion. The Appendix presents the Taylor expansion as a method of approximating the size of the compensating variation (that is, the difference in expenditure functions that measures the true change in the cost of living) and shows how the new goods, quality change and outlet biases appear in the first term, while substitution bias appears in the second term.
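As a sketch of what such an expansion looks like, consider the expenditure function $e(p, u)$ expanded around the period-2 price vector $p^2$. The notation is assumed here for illustration and is not necessarily that of the Appendix: $h_i$ denotes the compensated demand for good $i$, and $\Delta p_i = p_i^1 - p_i^2$.

$$
e(p^1, u) - e(p^2, u) \approx \sum_i h_i(p^2, u)\,\Delta p_i + \frac{1}{2} \sum_i \sum_j \frac{\partial h_i(p^2, u)}{\partial p_j}\,\Delta p_i\,\Delta p_j .
$$

The first sum uses the result that the derivative of the expenditure function with respect to a price equals the quantity purchased, so each price change is weighted by a quantity. A price change that the index never records, such as the fall from a virtual price to the observed price of a new good, or the lower price paid at a discount outlet, drops out of this first-order term entirely. The second sum involves the substitution responses $\partial h_i / \partial p_j$, which a fixed-basket index ignores even when every price change is recorded.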

At an intuitive level, when calculating the change in a cost-of-living index,

13 The lower prices create a first-order welfare increase in a cost-of-living index using the expenditure function, because the derivative of the expenditure function with respect to a change in price equals the quantity purchased, which is a first-order effect. This result is known as "Shephard's Lemma" (for example, Deaton and Muellbauer, 1980).

assumption is an approximate way of dealing with substitution bias.14 As a practical step, it is an improvement over not allowing any substitution at all, and it should have the effect of eliminating concerns over substitution bias. But while the BLS will collect quantity data to account for substitution bias, it is only collecting expenditure data at the highest levels of aggregation, at the level of some 200 aggregate commodities. Thus, this database will not allow for estimation of new good bias, quality bias or outlet bias that requires quantity data at a lower level of aggregation.

### Skepticism about Hedonic Regressions

Although the Bureau of Labor Statistics typically uses the linking process when it rotates new goods or goods of improved quality into the Consumer Price Index calculations, for a small number of goods, such as personal computers, the BLS has used a hedonic adjustment for quality change. The BLS website lists a number of "developmental" hedonic research programs for consumer durable goods, including clothes dryers, microwave ovens, college textbooks, VCRs, DVD players, camcorders and consumer audio equipment.

Unfortunately, I do not think that a hedonic approach is correct in general. The hedonic approach used by the BLS is a "pure price" approach, which does not capture consumer preferences with the combination of quantity and price data that are the fundamental basis for the demand curve and the related expenditure function. This hedonic approach cannot be used to calculate a true cost-of-living index.15

14 The Fisher ideal index described in the text is a form of a superlative price index, which Diewert (1976) showed offers an exact calculation of some second-order flexible function for a homothetic utility (expenditure) function. Homothetic utility functions imply that all expenditure elasticities are assumed to be unity so that expenditure shares are fixed. This assumption is well known not to hold in practice. Alternatively, homotheticity can be eliminated if the reference utility level is changed to a geometric average of the reference utilities in the two periods. Also, note that while the substitution effect is second order, summing over all price changes can have a significant effect.

15 Diewert (2001) develops sufficient conditions to allow a hedonic regression to be interpreted as a function of consumer preferences. However, the assumptions, in my opinion, are too strong to be useful, and I do not believe that the hedonic regression is identified in an econometric interpretation since the characteristics of the goods would be jointly endogenous with the price.

A hedonic regression has price (or log price) on the left-hand side ("dependent variable") and product characteristics on the right-hand side ("explanatory variables"). The idea is to estimate the coefficients of the right-hand side variables and then to adjust observed prices for changes in attributes. For example, suppose a hedonic regression concerning computers has as a right-hand variable the log of the microprocessor speed. Suppose that the price of a computer decreased by 10 percent over a year's time and its processor speed increased from 1.5 mHz to 2.0 mHz (that is, an increase of one-third). If other right-hand side variables remained constant, the estimated price decrease in percentage terms would approximately be $\hat{p} = -0.1 - b \times 0.33$, where b is the estimated coefficient of processor speed (measured in logs) in the estimated regression.
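The arithmetic of this adjustment can be sketched in a few lines. The coefficient value below is hypothetical, chosen only to make the example concrete, and the function name is an illustrative assumption; the second calculation uses exact log changes instead of the percentage approximations in the text.

```python
from math import log

def hedonic_adjusted_change(price_change, b, attribute_change):
    """Quality-adjusted price change: observed change minus b times the attribute change."""
    return price_change - b * attribute_change

b = 0.5  # hypothetical estimated coefficient on log processor speed

# Percentage approximation used in the text: -0.1 - b * 0.33
approx = hedonic_adjusted_change(-0.10, b, 0.33)

# Same adjustment with exact log changes in price and speed
exact = hedonic_adjusted_change(log(0.90), b, log(2.0 / 1.5))

print(approx)
print(exact)
```

Whatever value b takes, the adjustment is built entirely from prices and attributes; no quantity data enter, which is the limitation discussed in the text.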

However, this hedonic regression adjustment has no simple relationship with consumer valuation of the computer, which is the correct basis for a cost-of-living index. Under the very special conditions of perfect competition, cost alone determines the price, but many hedonic regressions find that including the brand as a right-hand side variable is empirically important, which suggests that these markets are often imperfectly competitive. In considering a good in a market with imperfect competition, the price is an interaction of three factors: demand, cost and competitive interaction.16 A hedonic regression mixes these three sets of factors together.

To understand the shortcomings of the hedonic approach, consider how the true compensating variation should be calculated in this situation. Think of a (partially) indirect utility function that calculates the maximum utility attainable for a consumer from the attributes of a good and from income minus price:

$$v^1 = m(x^1, z^1) + (y^1 - p^1) = g_1(x^1) + g_2(z^1) + (y^1 - p^1),$$

where $v^1$ is the maximum possible utility level in period 1, $x^1$ and $z^1$ are two attributes in period 1, microprocessor speed (mHz) and hard drive capacity on a computer, $y^1$ is income in period 1, and $p^1$ is the price of the computer. For ease of exposition, I assume that the two features enter separably into the indirect utility function, represented by the functions $g_1$ and $g_2$, respectively, instead of the $m$ function. In period 2, the features change to $x^2$ and $z^2$, which will alter the maximum achievable utility to $v^2$. To determine the true compensating variation, one needs to calculate the price that would be needed to adjust for the changes in the available attributes such that the maximum achievable utility would be the same in the two periods: $v^1 = v^2$. Thus, the new "quality-adjusted" price is the old price $p^1$ adjusted for changes in features, where the evaluation is done using the consumers' utility valuation.
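Under the separable form just described, the equal-utility condition can be solved explicitly. The following is a sketch under the added assumption that income is held at its period-1 level $y^1$; the symbol $\tilde{p}$ for the price that equates utility across periods is introduced here for illustration.

$$
g_1(x^1) + g_2(z^1) + (y^1 - p^1) = g_1(x^2) + g_2(z^2) + (y^1 - \tilde{p})
\quad\Longrightarrow\quad
\tilde{p} = p^1 + \bigl[g_1(x^2) - g_1(x^1)\bigr] + \bigl[g_2(z^2) - g_2(z^1)\bigr].
$$

If the actual period-2 price $p^2$ is below $\tilde{p}$, the consumer is better off, and $\tilde{p} - p^2$ measures the gain in money terms. The bracketed terms are consumers' valuations of the attribute changes, which need not equal the attribute coefficients from a regression of price on attributes.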

In contrast, a hedonic regression as used by the Bureau of Labor Statistics determines the price as a function of product attributes, which is not the same at all. The attributes $x^1$ and $z^1$ that are offered in the marketplace are determined by the interaction of consumers' preferences, technology and competition. The costs of producing these attributes will be determined by cost, demand and competition in the factor input markets, like competition between AMD and Intel in microprocessors. In a situation of imperfect competition, firms will charge a markup over marginal cost. This markup will vary according to the extent of competition, and it will vary from firm to firm according to how the attributes of the products for a particular firm differ from what else is available in the market.17 The coefficients in a hedonic regression on these attributes will mix together factor input prices, markups that vary by firm and the utility that consumers derive from various attributes, all of which may vary across time. Thus, hedonic regressions are not structural econometric equations. This argument that hedonic regressions are not capturing a structural relationship is consistent with the empirical evidence for hedonic regressions with personal computers that the coefficients change significantly across years (Berndt, Griliches and Rappaport, 1995). Overall, price adjustment using hedonic price regression has no relationship, under general conditions, to what is supposed to be measured in a cost-of-living index.

16 In one very special case, price does not depend on demand, so a hedonic regression could identify the cost factor. However, this special case arises when no economies of scale or scope are present, essentially the conditions needed for the Samuelson-Mirrlees nonsubstitution theorems. An assumption of the absence of economies of scale and scope would not make sense in most industries, including those where hedonics is most commonly used. I discuss this question at greater length in Hausman (2002b). Pakes (2001) also discusses problems in interpreting a hedonic regression specification using these economic factors.
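To make concrete what a hedonic regression estimates, the following is a minimal sketch in Python. The log-log functional form, the attribute names and all numbers are hypothetical assumptions chosen for illustration; they are not the specification used by the Bureau of Labor Statistics. The regression recovers the pricing rule that generated the data in each year, and nothing in it uses quantities or consumer valuations.

```python
import numpy as np

def hedonic_fit(log_price, log_speed, log_disk):
    """OLS of log price on a constant and two log attributes.

    Returns [intercept, speed coefficient, disk coefficient].
    """
    X = np.column_stack([np.ones_like(log_speed), log_speed, log_disk])
    coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)
    return coef

# Hypothetical attribute data for 50 computer models (assumed values).
rng = np.random.default_rng(0)
log_speed = rng.uniform(5.0, 7.0, size=50)
log_disk = rng.uniform(1.0, 4.0, size=50)

# Assumed reduced-form pricing rules that differ across two years, for
# example because markups or input prices changed between the years.
log_price_y1 = 4.0 + 0.60 * log_speed + 0.30 * log_disk
log_price_y2 = 3.5 + 0.40 * log_speed + 0.25 * log_disk

coef_y1 = hedonic_fit(log_price_y1, log_speed, log_disk)
coef_y2 = hedonic_fit(log_price_y2, log_speed, log_disk)
print(coef_y1)
print(coef_y2)
```

In this sketch the attribute coefficients shift between the two years solely because the assumed pricing rule shifts, which is the sense in which the estimated coefficients are reduced-form rather than structural.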

The discussion to this point presumes that the relevant attributes of a product can be clearly enumerated for the purposes of a hedonic regression, but this assumption will not hold for all products. Consider the problems posed by medical goods and services, an area in which technological change is especially rapid. The Bureau of Labor Statistics has measured the price of medical inputs in the past, like the price of a day in the hospital. But of course, a day in the hospital is a service that has changed a great deal over time, so the Schultze committee recommends using, where feasible, diagnosis-based measures such as childbirth or coronary bypass surgery instead of input-based measures such as "day in the hospital" (Schultze and Mackie, 2002, p. 188). This recommendation is an improvement, but it still does not measure quality change. For example, improvements in treating breast cancer have been remarkable over the past decade. Measuring the cost of "breast cancer treatment" would not take into account that many families would be willing to pay a large amount of money for the improvement in survival probabilities. Nor would a hedonic approach easily be able to take into account factors such as "fewer side effects." In many instances of medical care and services, identifying the key product attributes for a hedonic regression seems implausible.

Ultimately, data on price and product attributes alone will not allow correct estimation of the compensating variation adjustment to a cost-of-living index. Quantity data are also needed, so that the demand functions (or equivalently, the expenditure or utility functions) can be estimated. For this reason, I disagree with the panel's conclusion that hedonic methods are "probably the best hope" for improving quality adjustments (Schultze and Mackie, 2002, pp. 64, 122), since hedonic methods do not use quantity data to estimate consumer valuation of a product, and consumer demand must be the basis of a cost-of-living index.

17 This discussion also demonstrates that the other Bureau of Labor Statistics method of "cost-based adjustment" for quality change, which the BLS uses for automobiles, is also incorrect. The presence of a markup for cars means that the change in price is not the same as a change in costs. In addition, the value of a change to consumers is not determined by the incremental costs.


Interestingly, the panel recommends the "direct method" of hedonic adjustment (p. 129), which requires high-frequency data collection of prices and product attributes. Hedonic demand functions that use quantity data could be estimated, which could allow for correct treatment of quality change (for example, Hausman, 1979; Berry, Levinsohn and Pakes, 1995). But if the Bureau of Labor Statistics collected high-frequency quantity data, it could estimate the change in the cost-of-living index from quality change using methods discussed in this paper.

Estimation of Overall Bias in the CPI

The U.S. government should devote significant resources to the measurement of the Consumer Price Index because economic knowledge of consumer welfare depends, in large part, on drawing an accurate separation between real and nominal changes. Yet several recent studies using aggregate consumption data, studies not addressed by the Schultze panel, suggest that the CPI as currently measured contains a substantial upward bias.

Costa (2001) and Hamilton (2001) estimate bias in the Consumer Price Index by using expenditure survey data to estimate the increase in households' expenditures versus their real income over time. The empirical methodology is to compare households with similar demographic characteristics and the same real income at different points in time and compare their expenditures on given categories of expenditure. The key identifying assumption (besides functional form) is that the expenditure elasticities remain constant over time for a given category of expenditure after controlling for demographic characteristics. Thus, residuals from the relationship between real income and predicted expenditures are used to estimate the bias in prices that are used to deflate income. Using data from 1972-1994 on food and recreation expenditures, Costa finds an annual bias averaging 1.6 percent over this time period. Hamilton (2001) also estimates CPI bias to be 1.6 percent per year during this period, using a similar econometric approach on a different data set. This procedure will capture "outlet bias" and "substitution bias," but since it will not measure either "new good bias" or "quality change" bias (which the Boskin et al. (1996) commission argued were the largest source of bias in the CPI during the 1975-1994 period), it will yield an underestimate of CPI bias. Bils and Klenow (2001) estimate that the BLS understated quality improvement and, thus, overstated inflation by 2.2 percent per year over the period 1980-1996 on products that constituted over 80 percent of U.S. spending on consumer durables.18
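The Engel-curve logic behind estimates of this kind can be sketched as follows. This is a simplified illustration in Python under assumptions of my own: a food share that is linear in log real income with a constant slope over time, no demographic controls, and synthetic data. It is not the specification estimated by Costa (2001) or Hamilton (2001). If the CPI overstates inflation, measured real income is too low, and the period dummy in the share regression absorbs the slope times the cumulative bias, so the bias is recovered as the ratio of the dummy coefficient to the income slope.

```python
import numpy as np

def engel_cpi_bias(food_share, log_real_income, period):
    """Estimate cumulative log CPI bias by period, relative to the first period.

    Regresses the food share on a constant, log CPI-deflated income and
    period dummies, then divides each dummy coefficient by the income slope.
    """
    periods = np.unique(period)
    dummies = (period[:, None] == periods[None, 1:]).astype(float)
    X = np.column_stack([np.ones(len(food_share)), log_real_income, dummies])
    coef, *_ = np.linalg.lstsq(X, food_share, rcond=None)
    beta, delta = coef[1], coef[2:]
    return {int(t): float(d / beta) for t, d in zip(periods[1:], delta)}

# Synthetic households in year 0 and year 10 (all values are assumptions).
rng = np.random.default_rng(0)
n = 200
period = np.repeat(np.array([0, 10]), n)
log_nominal = rng.normal(10.0, 0.5, size=2 * n) + np.where(period == 10, 0.5, 0.0)
log_p_true = np.where(period == 10, 0.30, 0.0)   # true cost-of-living index
bias = np.where(period == 10, 0.16, 0.0)         # cumulative CPI overstatement
log_cpi = log_p_true + bias

# Food share depends on TRUE real income; the analyst only observes
# income deflated by the (biased) CPI.
food_share = 0.9 - 0.07 * (log_nominal - log_p_true)
log_real_income = log_nominal - log_cpi

estimate = engel_cpi_bias(food_share, log_real_income, period)
annual_bias = estimate[10] / 10
print(estimate, annual_bias)
```

The sketch recovers the assumed cumulative bias only because the slope of the share equation is held constant across periods, which mirrors the identifying assumption stated in the text.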

These aggregate studies, along with numerous micro studies on particular goods, demonstrate that the magnitude of the biases in the Consumer Price Index are much too large to be ignored. The Bureau of Labor Statistics has taken only very modest steps to address new goods bias, quality bias and outlet bias. Until the BLS incorporates a continual updating of quantity data into their data collection and estimation procedure, these sources of bias will continue to exist in the CPI.

18 Bils and Klenow (2001) use a constant elasticity of substitution specification, which, with its implication of the independence of irrelevant alternatives, may yield an overestimate of quality change, as I discuss in Hausman (1996).

The program that I have outlined to correct for new good bias, quality change bias and outlet bias may seem substantial in terms of added resources. However, the use of scanner data, which would collect both price data and quantity data, would save substantial current resources, since BLS price surveyors would be largely eliminated. Instead, computerized data collected from a stratified random sample of retail outlets or households would provide the current price data as well as quantity data. Further, in my simplified approach to estimation of each of the biases, only own-price elasticities are required. While not every minor instance of a quality change would be estimated, the important instances of quality change would provide a starting point. For instance, the current products for which the BLS estimates hedonic indices would provide a natural place to begin. Similarly, new goods and outlet bias effects would be identified and estimated using the available scanner data. A change in focus from the current "pure price" approach of the BLS to an economic approach that uses both price and quantity data would lead to a framework for estimating a cost-of-living index that reflects twenty-first century economics and technology.

Appendix

Some Analytics of a Cost-of-Living Index

The expenditure function is defined as the minimum income required for a consumer to reach a given utility level:

(1) $y = e(p_1, p_2, \ldots, p_n; \bar{u}) = e(p, \bar{u})$ solves $\min \sum_i p_i q_i$ such that $u(x) = \bar{u}$,

where there are n goods labeled $q_i$. The change in the required income when, for instance, prices change between period 1 and period 2, follows from the compensating variation (CV), where superscripts denote the period and subscripts number the goods:

(2) $CV = e(p_1^2, \ldots, p_n^2, u^1) - e(p_1^1, \ldots, p_n^1, u^1) = y^2 - y^1$.

The exact cost-of-living index (COLI) becomes $P(p^2, p^1, u^1) = y^2(p^2, u^1)/y^1$, which gives the ratio of the required amount of income at period 2 prices to be as well off as in period 1. As with any index number calculation, the period 2 utility level, $u^2$, allows for a different basis to calculate the cost-of-living index.


New Good Bias: A First-Order Effect

In period 1, consider the demand for the new good, $x_n$, as a function of all prices and income, $y$: $q_n^1 = g_n(p_1^1, p_2^1, \ldots, p_{n-1}^1, p_n^1, y^1)$. Now if the good were not available in period 1, I solve for the virtual price, $p_n^*$, which causes the demand for the new good to be equal to zero:

(3) $0 = q_n^1 = g_n(p_1^1, p_2^1, \ldots, p_{n-1}^1, p_n^*, y^1)$.

Instead of using the Marshallian demand curve approach of Hicks (1940) and Rothbarth (1941), I instead use the income-compensated and utility-constant Hicksian demand curve to do an exact welfare evaluation (Hausman, 1996, 1999). Estimation of the Marshallian demand curve provides the necessary information to calculate the Hicksian demand curve (Hausman, 1981). Income, $y$, is solved in terms of the utility level, $u^1$, to find the Hicksian demand curve given the Marshallian demand curve specification.

In terms of the expenditure function, I solve the differential equation from Roy's identity that corresponds to the demand function to find the (partial) expenditure function, using the techniques in Hausman (1981) and Hausman and Newey (1995). The approach solves the differential equation, which arises from Roy's identity in the case of common parametric specifications or nonparametric specifications of demand. To solve for the amount of income needed to achieve utility level $u^1$ in the absence of the new good, I use the expenditure function to calculate $y^*$, which is the required income to reach the reference utility level $u^1$. The compensating variation from the introduction of the new good is $CV = y^* - y^1 = e(p_1^1, p_2^1, \ldots, p_{n-1}^1, p_n^*, u^1) - y^1$.

While this approach holds prices of the other goods constant, price changes of the other goods caused by the introduction of the new good are easily treated by allowing for substitution effects in the usual way. Consumer demand theory (the integrability conditions) allows for one price to change at a time, holding other prices constant, so the correct answer does not require all prices to be changed simultaneously. Also, price changes of other goods from the introduction of a new good typically lead to a further increase in the compensating variation because of competition from the new good (Hausman and Leonard, 2002).

The effect on the correctly calculated cost-of-living index is "first order" because it arises in the leading term in a Taylor approximation to the change in the cost-of-living index. Using Taylor's theorem,

(4) $y^* - y^1 = (p_n^* - p_n^1)\, h_n(p^\#, u^1) = (p_n^* - p_n^1)\, \dfrac{\partial e(p^\#, u^1)}{\partial p_n}$ for $p^\# \in (p^1, p^*)$,

where the Marshallian demand for the new good equals the compensated Hicksian demand, $g_n(p^1, y^1) = h_n(p^1, u^1)$. Alternatively, equation (4) measures the area under the compensated demand curve for the new good or service, which yields a first-order magnitude, as illustrated in Figure 1 in the text. The exact cost-of-living index becomes $P(p^1, p^*, u^1) = y^*/y^1$.
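As a numerical illustration of the virtual price and the first-order welfare effect, the sketch below assumes a linear demand curve through the observed price and quantity, with the own-price elasticity entered as a positive number. Under that assumption the virtual price is p(α + 1)/α and the area under the demand curve above the observed price is a triangle. The function names and the numbers are hypothetical, and the linear form is an approximation rather than the exact Hicksian calculation described above.

```python
def virtual_price_linear(price, elasticity):
    """Price at which a linear demand curve through the observed point hits zero.

    `elasticity` is the absolute value of the own-price elasticity at the
    observed point (an assumption of this sketch).
    """
    if elasticity <= 0:
        raise ValueError("elasticity must be entered as a positive number")
    return price * (elasticity + 1.0) / elasticity

def new_good_cv_linear(price, quantity, elasticity):
    """Triangle under the linear demand curve between price and virtual price."""
    p_star = virtual_price_linear(price, elasticity)
    return 0.5 * (p_star - price) * quantity

def coli_without_new_good(income, cv):
    """Ratio y*/y1: income needed without the new good relative to actual income."""
    return (income + cv) / income

# Hypothetical numbers: price 10, quantity 100, elasticity 2, income 50,000.
p_star = virtual_price_linear(10.0, 2.0)
cv = new_good_cv_linear(10.0, 100.0, 2.0)
ratio = coli_without_new_good(50_000.0, cv)
print(p_star, cv, ratio)
```

With these inputs the triangle equals 0.5 times expenditure on the new good divided by the elasticity, which shows why the effect scales with expenditure on the good and is therefore first order.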

Substitution Bias: A Second-Order Effect

In contrast to the first-order effect of a new good, the substitution effect of a price change is a second-order effect. Using a Taylor approximation around the period 1 price when only the jth price changes and all other prices remain constant, I rewrite equation (2) as

For each given form of expenditure function, or equivalently, the demand functions h and q, there exists a given $p_j^\# \in (p_j^1, p_j^2)$ that makes equation (5) hold with exact equality. This expression is the basis for Diewert's (1976) notion of a superlative index. The analysis also explains why Irving Fisher's (1922) geometric mean approach for period 1 and period 2 prices and numerous other approaches will all approximate some expenditure (utility) function up to second order. The first term (the first-order term) in equation (5) is taken account of in the current CPI, which multiplies the change in price times quantities in the reference basket of goods, but the "substitution bias" effect arises from the second-order term. Hausman (1981) offers further discussion on the accuracy of measuring this deadweight loss amount. More generally, if all prices change, the derivatives of the compensated demands with respect to prices are the terms in the Slutsky matrix.
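The relative size of the first-order and second-order terms can be checked numerically. The sketch below assumes Cobb-Douglas preferences, which is my choice for illustration and not a specification used in the paper, because the exact expenditure function is then known in closed form. With budget share a for good j and a price ratio R for that good, the exact income ratio is R^a, the fixed-basket (first-order) term is 1 + a(R - 1), and the second-order term adds 0.5 a(a - 1)(R - 1)^2.

```python
def cobb_douglas_income_ratios(share_j, price_ratio):
    """Required-income ratios y2/y1 when only the price of good j changes.

    Assumes Cobb-Douglas preferences with budget share `share_j` for good j
    (an illustrative assumption). Returns (exact, first_order, second_order).
    """
    r = price_ratio - 1.0
    exact = price_ratio ** share_j
    # Fixed-basket term: price change times the period 1 quantity.
    first_order = 1.0 + share_j * r
    # Add the own-price derivative of the compensated demand at period 1 prices.
    second_order = first_order + 0.5 * share_j * (share_j - 1.0) * r ** 2
    return exact, first_order, second_order

# Hypothetical case: good j has a 20 percent budget share and its price rises 50 percent.
exact, first_order, second_order = cobb_douglas_income_ratios(0.2, 1.5)
substitution_bias = first_order - exact
print(exact, first_order, second_order, substitution_bias)
```

The fixed-basket ratio exceeds the exact ratio, and the gap (the substitution bias in this example) is mostly removed once the second-order term is included.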

Estimating the Effect of a Quality Change: A First-Order Effect

I assume that good n (old quality) exists in period 1, while good n + 1 (new quality) exists in period 2. The difference in expenditure functions yields the CV, which is the difference in areas under the two Hicksian compensated demand curves as illustrated in Figure 2:19

where $p_n^*$ and $p_{n+1}^*$ are the virtual prices that set demand for good n + 1 in period 1 and good n in period 2 equal to zero. The difference between the two expenditure functions typically is a first-order effect, as I demonstrated in equation (4). The quality of goods may also decrease. The analysis would be the same. Also, to the extent that a new model introduction is accompanied by an increase in price, $p_{n+1} > p_n$, this approach takes account of the effect of the price increase. Thus, omission of quality change leads to a first-order bias in the estimation of a cost-of-living index.

19 I am keeping all other prices the same between the two periods. Prices of other goods in period 2 may differ from period 1, but equation (5) can be used and then other prices in period 2 can be changed. The order of the change does not matter because of integrability. For a discussion, see, for example, Hausman (1981) or Hausman and Newey (1995).

A (lower bound) approximation, holding the price of the product constant, can again be used to compute

where $p_n^* = p_n(\alpha_n + 1)/\alpha_n$. This approximation gives a lower bound for the effect of quality change under the assumptions of a linear demand curve (as before) and the assumption that the new good with improved quality always has higher demand than the old quality good at each price (their virtual price is assumed to be the same). If the price does not change, the estimate is the change in quantity times the price adjusted with the demand elasticity. If the price also changes, using equation (6) I find

so that the change in consumer welfare arises from the increase in quantity purchased minus the difference in expenditure for the new quality product minus the difference in expenditure for the previous quality product. In general, the net result of both a quantity increase due to a shift of the demand curve and a price increase can either increase or decrease the cost-of-living index (Spence, 1976). Again, we see the first-order effect of a quality change and the requirement to measure quantities to estimate the CV or change in the cost-of-living index.
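For the constant-price case, the lower-bound calculation can be sketched as follows. The sketch rests on my reading of the assumptions stated above: two linear demand curves that share the same virtual price p(α + 1)/α, with the elasticity α entered as a positive number, so that the welfare change is the difference between two triangles. The function name and the numbers are hypothetical, and the factor of one half follows from that geometric assumption rather than from an equation reproduced here.

```python
def quality_change_cv_linear(price, q_old, q_new, elasticity):
    """Difference in triangle areas under two linear demand curves.

    Both curves are assumed to share the virtual price p * (a + 1) / a,
    where `elasticity` (a) is the absolute own-price elasticity.
    """
    if elasticity <= 0:
        raise ValueError("elasticity must be entered as a positive number")
    p_star = price * (elasticity + 1.0) / elasticity
    return 0.5 * (p_star - price) * (q_new - q_old)

# Hypothetical numbers: price stays at 10, quantity rises from 100 to 130,
# and the own-price elasticity is 2 in absolute value.
cv_quality = quality_change_cv_linear(10.0, 100.0, 130.0, 2.0)
print(cv_quality)
```

The result equals 0.5 times price times the change in quantity divided by the elasticity, so the calculation cannot be made without quantity data for both quality levels.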

To calculate in the framework of equation (6), as an example, I use the results of Hausman (1981) for a constant elasticity demand curve to calculate an expenditure function

where A is the intercept of the demand curve, $\alpha$ is the price elasticity and $\delta$ is the income elasticity. To consider quality change in its most straightforward setting, I assume that the price $p_n$ remains constant across the two periods, but that A increases due to quality improvement so that the demand curve shifts outward. Thus, the coefficient A captures the attributes of the product that may be changed by the manufacturer or may change due to factors such as network effects, for example, for cellular telephones. The combined effect of both a shift of the demand curve and a price change can be estimated in a straightforward manner using this approach. Of course, the econometric estimation must be able to separate out a shift of the demand curve from movement along a demand curve, which is one of the oldest problems in econometrics. The compensating variation is calculated from equation (9) where y is income:

If a greater quantity is bought at the same price, consumer welfare typically increases. I applied this approach to calculate the increase in consumer welfare from improved quality of cellular telephone networks in Hausman (1999).
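A numerical version of this calculation can be sketched under explicit assumptions. I assume the constant elasticity demand q = A p^α y^δ of Hausman (1981), whose indirect utility function is -A p^(1+α)/(1+α) + y^(1-δ)/(1-δ), and I assume that utility is comparable across the two quality levels under this normalization with α < -1. Whether this matches the paper's equations term by term is not verified here. Holding price and income fixed, the income at which the consumer with the new quality reaches the old utility level is [(1-δ)/(1+α) y^(-δ) p (q2 - q1) + y^(1-δ)]^(1/(1-δ)), and the welfare gain is actual income minus that amount.

```python
def quality_gain_constant_elasticity(price, q_old, q_new, income, alpha, delta):
    """Welfare gain from an outward demand shift at a constant price.

    Assumes demand q = A * p**alpha * y**delta with alpha < -1 and delta != 1
    (assumptions of this sketch). q_old and q_new are the quantities bought
    at the same price and income under the old and new quality.
    """
    if alpha >= -1.0:
        raise ValueError("this sketch assumes alpha < -1")
    if delta == 1.0:
        raise ValueError("delta = 1 requires a separate (logarithmic) formula")
    inner = ((1.0 - delta) / (1.0 + alpha)) * income ** (-delta) * price * (q_new - q_old) \
        + income ** (1.0 - delta)
    required_income = inner ** (1.0 / (1.0 - delta))
    return income - required_income

# Hypothetical numbers: price 10, quantity rises from 100 to 130, income 50,000.
gain_no_income_effect = quality_gain_constant_elasticity(10.0, 100.0, 130.0, 50_000.0, -1.5, 0.0)
gain_with_income_effect = quality_gain_constant_elasticity(10.0, 100.0, 130.0, 50_000.0, -1.5, 0.5)
print(gain_no_income_effect, gain_with_income_effect)
```

With a zero income elasticity the gain reduces to the change in expenditure divided by |α| - 1, and in both cases a larger quantity at the same price gives a positive gain.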

Peter Diamond and Ariel Pakes provided useful discussions. Erwin Diewert provided many comments and has made many suggestions in my research on this topic over the years. The editors provided extremely helpful comments. Carol Miu and Amy Sheridan provided research assistance.

References

Abraham, Katharine, John Greenlees and Brent Moulton. 1998. "Working to Improve the Consumer Price Index." Journal of Economic Perspectives. Winter, 12:1, pp. 27-36.

Berndt, Ernst R., Zvi Griliches and Neal J. Rappaport. 1995. "Econometric Estimates of Price Indexes for Personal Computers in the 1990s." Journal of Econometrics. 68:1, pp. 245-68.

Berry, Steven, James Levinsohn and Ariel Pakes. 1995. "Automobile Prices in Market Equilibrium." Econometrica. 63:4, pp. 841-90.

Bils, Mark and Peter Klenow. 2001. "Quantifying Quality Growth." American Economic Review. 91:4, pp. 1006-030.

Boskin, Michael et al. 1996. "Toward a More Accurate Measure of the Cost of Living." Final Report to the Senate Finance Committee.

Bowley, A.L. 1899. "Wages, Nominal and Real," in Dictionary of Political Economy. R.H. Palgrave, ed. London: Macmillan, pp. 640-51.

Bresnahan, Timothy. 1997. "Comment," in The Economics of New Goods. T. Bresnahan and R. Gordon, eds. Chicago: University of Chicago Press, pp. 238-41.

Costa, Dora L. 2001. "Estimating Real Income in the U.S. from 1888 to 1994: Correcting CPI Bias Using Engel Curves." Journal of Political Economy. December, 109:6, pp. 1288-310.

Deaton, Angus and John Muellbauer. 1980. Economics and Consumer Behavior. Cambridge: Cambridge University Press.

Diewert, W. Erwin. 1976. "Exact and Superlative Index Numbers." Journal of Econometrics. May, 4, pp. 114-45.

Diewert, W. Erwin. 1980. "Aggregation Problems in the Measurement of Capital," in The Measurement of Capital. Dan Usher, ed. Chicago: University of Chicago Press, pp. 433-528.

Diewert, W. Erwin. 1993. "The Early History of Price Index Research," in Essays in Index Number Theory, Volume I. W.E. Diewert and A.O. Nakamura, eds. Amsterdam: Elsevier, pp. 33-65.

Diewert, W. Erwin. 1998. "Index Number Issues in the Consumer Price Index." Journal of Economic Perspectives. Winter, 12:1, pp. 47-58.

Diewert, W. Erwin. 2001. "Hedonic Regressions: A Consumer Theory Approach," forthcoming in Scanner Data and Price Indexes. R. Feenstra and M. Shapiro, eds. Studies in Income and Wealth, Volume 61, Chicago: University of Chicago Press, part IV.

Fisher, Irving. 1922. The Making of Index Numbers. Boston: Houghton Mifflin.


Hamilton, Bruce W. 2001. "Using Engel's Law to Estimate CPI Bias." American Economic Review. June, 91:3, pp. 619-30.

Hausman, Jerry. 1979. "Individual Discount Rates and the Purchase and Utilization of Energy Using Durables." Bell Journal of Economics. Spring, 10:1, pp. 33-54.

Hausman, Jerry. 1981. "Exact Consumer's Surplus and Deadweight Loss." American Economic Review. September, 71:4, pp. 662-76.

Hausman, Jerry. 1996. "Valuation of New Goods Under Perfect and Imperfect Competition," in The Economics of New Goods. T. Bresnahan and R. Gordon, eds. Chicago: University of Chicago Press, pp. 209-37.

Hausman, Jerry. 1997a. "Valuing the Effect of Regulation on New Services in Telecommunications." Brookings Papers on Economic Activity: Microeconomics. pp. 1-38.

Hausman, Jerry. 1997b. "The CPI Commission: Discussion." American Economic Review. May, 87, pp. 94-98.

Hausman, Jerry. 1999. "Cellular Telephone, New Products and the CPI." Journal of Business and Economic Statistics. April, 17:2, pp. 188-94.

Hausman, Jerry. 2002a. "Mobile Telephone," in Handbook of Telecommunications Economics. M. Cave, S. Majumdar and I. Vogelsang, eds. Amsterdam: North Holland, pp. 564-605.

Hausman, Jerry. 2002b. "Regulated Costs and Prices in Telecommunications," in International Handbook of Telecommunications. G. Madden, ed. London: Elgar Publishing, forthcoming.

Hausman, Jerry and Gregory Leonard. 2002. "The Competitive Effects of a New Product Introduction: A Case Study." Journal of Industrial Economics. September, 50:3, pp. 237-63.

Hausman, Jerry and Whitney Newey. 1995. "Nonparametric Estimation of Exact Consumers Surplus and Deadweight Loss." Econometrica. 63:6, pp. 1445-476.

Hausman, Jerry, Gregory Leonard and J. Douglas Zona. 1994. "Competitive Analysis with Differentiated Products." Annales d'Economie et de Statistique. 34, pp. 159-80.

Hicks, J.R. 1940. "The Valuation of Social Income." Economica. 7:2, pp. 105-24.

Konus, A. 1939. "The Problem of the True Index of the Cost of Living." Econometrica. 7, pp. 10-29.

Lowe, Joseph. 1823. The Present State of England in Regard to Agriculture, Trade and Finance, Second Edition. London: Longman, Hurst, Rees, Orme and Brown.

Madrick, Jeff. 2001. "Economic Scene." New York Times. December 27, 2001, p. C2.

Marshall, Alfred. 1887. "Remedies for Fluctuations of General Prices," in Memorials of Alfred Marshall. A.C. Pigou, ed. London: Macmillan, 1925, pp. 188-211.

Pakes, Ariel. 2001. "Some Notes on Hedonic Price Indices, with an Application to PCs." Paper presented at the NBER Productivity Program Meeting, March 16, Cambridge, Mass.

Petrin, Amil. 2002. "Quantifying the Benefits of New Products: The Case of the Minivan." Journal of Political Economy. August, 110:4, pp. 705-29.

Pollak, Robert. 1989. The Theory of the Cost-of-Living Index. Oxford: Oxford University Press.

Rothbarth, E. 1941. "The Measurement of Changes in Real Income under Conditions of Rationing." Review of Economic Studies. 8, pp. 100-07.

Schultze, Charles and Christopher Mackie, eds. 2002. At What Price? Washington, D.C.: National Academy Press.

Shapiro, Matthew and David Wilcox. 1997. "Alternative Strategies for Aggregating Prices in the CPI." Federal Reserve Bank of St. Louis Review. May, 79, pp. 113-25.

Sidgwick, Henry. 1883. The Principles of Political Economy. London: Macmillan.

Silver, Mick S. and Saeed Heravi. 2001a. "Scanner Data and the Measurement of Inflation." Economic Journal. June, 111:472, pp. F383-F404.

Silver, Mick S. and Saeed Heravi. 2001b. "Why the CPI Matched Models Method May Fail Us: Results from an Hedonic and Matched Experiment Using Scanner Data." Mimeo, University of Cardiff.

Spence, Michael. 1976. "Product Selection, Fixed Costs, and Monopolistic Competition." Review of Economic Studies. June, 43:2, pp. 217-33.
