ESTIMATING DEMAND FOR DIFFERENTIATED PRODUCTS WITH CONTINUOUS CHOICE AND VARIETY-SEEKING: AN APPLICATION TO THE PUZZLE OF UNIFORM PRICING

A dissertation submitted to the Department of Economics and the Committee on Graduate Studies of Stanford University in partial fulfillment of the requirements for the degree of Doctor of Philosophy

Robert Stanton McMillan

March 2005

© Copyright by Robert Stanton McMillan 2005

All Rights Reserved


I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Peter Reiss (Principal Adviser)

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Frank Wolak

I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Liran Einav

Approved for the University Committee on Graduate Studies.


Abstract

Retailers typically sell many different products from the same manufacturer at the same price. I consider retailer-based explanations for this uniform pricing puzzle, estimating the counterfactual profits that would be lost by a retailer switching from a non-uniform to a uniform pricing regime in the carbonated soft drink category. In order to calculate this profit difference, I develop a new structural model of demand that improves on existing work by more closely matching several key features of the data. These key features include the fact that households may be variety-seeking, that they make continuous choices, and that they choose from a large number of products. Using household-level panel data on purchases of carbonated soft drinks, I estimate that the retail store I observe earned an additional $36.56 (1992 dollars) in average weekly profits by charging non-uniform prices. This corresponds to roughly a 3% difference in profits, and suggests that when a retail store faces even a relatively small cost to determine the optimal set of non-uniform prices, it may be optimal to charge the same price for many products.


Acknowledgements

I thank my advisers Peter Reiss and Frank Wolak for their guidance and advice. I also thank Pat Bajari, Michaela Draganska, Liran Einav, Cristobal Huneeus, Navin Kartik, Tom MaCurdy, Mikko Packalen, James Pearce, and many others at the Stanford Economics department for helpful discussions. Finally, I thank my parents, my brother, and Helen Chabot. Without their support and encouragement, this would never have come to fruition.


Contents

Abstract
Acknowledgements

1 Introduction
  1.1 Introduction
  1.2 Demand-side Explanations
  1.3 An Overview of Menu Costs
  1.4 Estimating Menu Costs with Soft Drink Data
  1.5 Conclusion

2 A Model of Continuous Demand and Variety-Seeking
  2.1 Introduction
  2.2 Previous Literature
  2.3 The Model
  2.4 Behavior of the Model When There Are Two Inside Goods
    2.4.1 Demand Curves
    2.4.2 Engel Curves
    2.4.3 The A Matrix and Some Additional Remarks
  2.5 Choice Behavior Generated by the Model
  2.6 The Costs of Misspecification
  2.7 Conclusion
  2.8 Appendix A: Restrictions Yielding Concavity
    2.8.1 Two Goods, Two Characteristics
    2.8.2 Two Goods, Three Characteristics
    2.8.3 Three Goods, Three Characteristics
  2.9 Appendix B: Analytic Solutions to the Two-Inside Good Case: Details
  2.10 Appendix C: Additional Figures

3 Investigating the Costs of Uniform Pricing
  3.1 Introduction
    3.1.1 Overview
    3.1.2 Data: Soft Drinks
    3.1.3 Estimating Demand
    3.1.4 Counter-Factuals and Preview of Main Result
    3.1.5 Conclusion
  3.2 Empirical Demand Model
    3.2.1 Product-Level Demand Model
    3.2.2 Store Choice Model
    3.2.3 Residual Demand
  3.3 Data
    3.3.1 IRI Basket Data
    3.3.2 Dominick’s Finer Foods Data
  3.4 Results
    3.4.1 Structural Model
    3.4.2 Store Choice Model
    3.4.3 Counter-Factuals
  3.5 Interpreting “Lost” Profits
    3.5.1 Menu Costs
  3.6 Conclusion
  3.7 Appendix 3.A: Numerically Solving The Utility Function
  3.8 Appendix 3.B: Modeling Heterogeneity of Preferences
  3.9 Appendix 3.C: Analysis of the Panel Composition

References

List of Tables

1.1 Distribution of Purchase Occasions by Number of Items and Number of UPCs Purchased (All Carbonated Soft Drinks)
1.2 Distribution of Purchase Occasions by Number of Items and Number of UPCs Purchased (Among Top 25 Carbonated Soft Drinks)
2.1 Pairs of \((\varepsilon_1, \varepsilon_2)\) corresponding to the demand curves in Figures 2.1-2.7
2.2 Pairs of \((\varepsilon_1, \varepsilon_2)\) corresponding to the demand curves in Figures 2.1-2.30
2.3 Marginal Cost Configurations
2.4 Example Product Universe Number One
2.5 Example Product Universe Number Two
2.6 Summary of Dominance Conditions
3.1 Distribution of Purchase Occasion Expenditure by Store, All Purchase Occasions
3.2 Distribution of Purchase Occasion Expenditure by Store, Purchases Made by Panelists Who Visited Store A at Least Once
3.3 Variety and Size Distribution of Products in the Dataset, Grouped by Manufacturer
3.4 Summary Statistics for Prices and Quantities Sold at Store A, Grouped by Manufacturer
3.5 Characteristics of Products in the Dataset, Grouped by Manufacturer
3.6 Parameter Estimates from Structural Model of Product Choice
3.7 Selected Own and Cross Price Elasticities from Homogenous Logit Model of Product Choice
3.8 Matrix of Estimated Average Own and Cross-Price Elasticities from Structural Model of Product Choice
3.9 Coefficients on Demographic Variables from Specifications III and VI of the Conditional Logit Model of Store Choice
3.10 Coefficients on Price Index Variables for Specifications I-VI of Conditional Logit Model of Store Choice. Standard errors are in brackets.
3.11 Coefficients on Price Indices for Specifications VII and VIII of Conditional Logit Model of Store Choice. Standard errors are in brackets.
3.12 Summary Statistics for Marginal Costs (in Dollars per 12oz Serving) Implied by the Model
3.13 Summary Statistics on the Differences Between Observed Non-Uniform Prices and “Optimal” Uniform Prices (in Dollars per 12oz Serving)
3.14 Uniform and Non-Uniform Prices, Quantities and Estimated Profits for the Week of July 7, 1991. Prices and marginal costs reported are in cents per 12oz serving. Quantity is measured in 12oz servings. Profits are measured in dollars. All prices and profits are in nominal terms. The total difference in profits for the week from the two pricing strategies is $61.52. The 3L size of Pepsi was not offered in this week.
3.15 Size and Composition of Households in Panel
3.16 Age Distribution of Primary Male and Female in Households in Panel
3.17 Income Distribution of Panel Households
3.18 Summary Statistics for Population Living Near Stores in the Panel

List of Figures

1.1 Graph of Prices for Coke and Pepsi at Store B, 6/91-6/93

1.2 Graph of Prices for Coke and Pepsi at Store A, 6/91-6/93

1.3 Graph of Prices Ratios of Pepsi Varieties to Regular Pepsi at Store B, 6/91-6/93

1.4 Graph of Prices Ratios of Pepsi Varieties to Regular Pepsi at Store A, 6/91-6/93

2.1 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}\), \(\beta_1 = 5.5\), \(\beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\).

2.2 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = 5.5\), \(\beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\). Compare to Figures 2.3 and 2.4.

2.3 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = 6.875\), \(\beta_2 = 5\), \(\rho_1 = 0.4\), \(\rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\). Compare to Figures 2.2 and 2.4.

2.4 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\). Compare to Figures 2.2 and 2.3.

2.5 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\). Compare to Figure 2.1.

2.6 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}\), \(\beta_1 = 6.875\), \(\beta_2 = 5\), \(\rho_1 = 0.4\), \(\rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\). Compare to Figures 2.1 and 2.5.

2.7 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0.999 \\ 0.999 & 1 \end{bmatrix}\), \(\beta_1 = 5.5\), \(\beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\). Compare to Figure 2.1.

2.8 Demand for good one and two as a function of expenditure level, \(w\), for four different values of \(p_1\) and \(p_2\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). Each symbol represents an increase of $0.20 in \(w\). \(A = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}\), \(\beta_1 = 5.5\), \(\beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\).

2.9 Demand for good one and two as a function of expenditure level, \(w\), for four different values of \(p_1\) and \(p_2\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). Each symbol represents an increase of $0.20 in \(w\). \(A = \begin{bmatrix} 1 & 0.5 \\ 0.5 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\).

2.10 Demand for good one and two as a function of expenditure level, \(w\), for four different values of \(p_1\) and \(p_2\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). Each symbol represents an increase of $0.10 in \(w\). \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = 5.5\), \(\beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\).

2.11 Demand for good one and two as a function of expenditure level, \(w\), for four different values of \(p_1\) and \(p_2\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). Each symbol represents an increase of $0.10 in \(w\). \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\).

2.12 Demand for good one and two as a function of expenditure level, \(w\), for four different values of \(p_1\) and \(p_2\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). Each symbol represents an increase of $0.10 in \(w\). \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = 6.875\), \(\beta_2 = 5\), \(\rho_1 = 0.4\), \(\rho_2 = 0.5\).

2.13 Demand for good one and two as a function of expenditure level, \(w\), for four different values of \(p_1\) and \(p_2\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). Each symbol represents an increase of $0.40 in \(w\). \(A = \begin{bmatrix} 1 & 0.999 \\ 0.999 & 1 \end{bmatrix}\), \(\beta_1 = 5.5\), \(\beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\).

2.14 Graph of logit choices as a function of error terms

2.15 Graph of purchases (broken into groups) as a function of error terms. \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = \$0.30\), \(w = \$30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.21 & 0.09 \\ 0.49 & 0.21 \end{bmatrix}\).

2.16 Negative Lognormal Probability Distribution of \(\varepsilon_j\). Mean \(= e^{0.5}\), Variance \(= e^{2} - e\), Mode \(= e^{-1}\), Median \(= 1\). \(\ln(-\varepsilon_j) \sim N(0, 1)\).

2.17 Graph of purchases (broken into groups) as a function of error terms. \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 1\). The probabilities of the seven regions are: 0.09, 0.13, 0.09; 0.14, 0.13; 0.27, 0.09.

2.18 Graph of purchases as a function of error terms. \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = 7\), \(\beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.24 & 0.05 \\ 0.57 & 0.13 \end{bmatrix}\).

2.19 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 10 & 0 \\ 0 & 10 \end{bmatrix}\), \(\beta_1 = \beta_2 = 0.5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.21 & 0.09 \\ 0.49 & 0.21 \end{bmatrix}\).

2.20 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0.1 \\ 0.1 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.28 & 0.07 \\ 0.38 & 0.27 \end{bmatrix}\).

2.21 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.57 & 0.05 \\ 0.24 & 0.15 \end{bmatrix}\).

2.22 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\).

2.23 Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,3},{2,4}. The marginal costs corresponding to the horizontal axis are shown in Table 2.3. \(A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\), \(\beta = \begin{bmatrix} 3 \\ 5 \\ 7 \\ 9 \end{bmatrix}\), \(\rho = \begin{bmatrix} 0.05 \\ 0.08 \\ 0.07 \\ 0.09 \end{bmatrix}\).

2.24 Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,3},{2,4}. The marginal costs corresponding to the horizontal axis are shown in Table 2.3. \(A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\), \(\beta = \begin{bmatrix} 9 \\ 5 \\ 7 \\ 9 \end{bmatrix}\), \(\rho = \begin{bmatrix} 0.05 \\ 0.08 \\ 0.07 \\ 0.09 \end{bmatrix}\).

2.25 Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,3},{2,4}. The marginal costs corresponding to the horizontal axis are shown in Table 2.3. \(A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\), \(\beta = \begin{bmatrix} 1 \\ 5 \\ 7 \\ 9 \end{bmatrix}\), \(\rho = \begin{bmatrix} 0.15 \\ 0.08 \\ 0.07 \\ 0.09 \end{bmatrix}\).

2.26 Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,3},{2,4}. This corresponds to uniform pricing by “diet” characteristic. The marginal costs corresponding to the horizontal axis are shown in Table 2.3. \(A = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}\), \(\beta = \begin{bmatrix} 5 \\ 5 \\ 5 \\ 5 \end{bmatrix}\), \(\rho = \begin{bmatrix} 0.05 \\ 0.07 \\ 0.07 \\ 0.09 \end{bmatrix}\).

2.27 Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,2},{3,4}. This corresponds to uniform pricing by “Coke” or “Pepsi” characteristics. The marginal costs corresponding to the horizontal axis are shown in Table 2.3. \(A = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}\), \(\beta = \begin{bmatrix} 5 \\ 5 \\ 5 \\ 5 \end{bmatrix}\), \(\rho = \begin{bmatrix} 0.05 \\ 0.07 \\ 0.07 \\ 0.09 \end{bmatrix}\).

2.28 Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,2},{3,4}. This corresponds to uniform pricing by “Coke” or “Pepsi” characteristics. The marginal costs corresponding to the horizontal axis are shown in Table 2.3. \(A = \begin{bmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{bmatrix}\), \(\beta = \begin{bmatrix} 3.5 \\ 5.5 \\ 3.5 \\ 3.5 \end{bmatrix}\), \(\rho = \begin{bmatrix} 0.05 \\ 0.07 \\ 0.07 \\ 0.09 \end{bmatrix}\).

2.29 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0.999 \\ 0.999 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\).

2.30 Demand for good one and two as a function of \(p_1\), for the 40th and 60th percentiles of \(\varepsilon_1\) and \(\varepsilon_2\). \(A = \begin{bmatrix} 1 & 0.999 \\ 0.999 & 1 \end{bmatrix}\), \(\beta_1 = 6.875\), \(\beta_2 = 5\), \(\rho_1 = 0.4\), \(\rho_2 = 0.5\), \(p_2 = \$0.50\), \(w = 60\).

2.31 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = 0.5\), \(\rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.23 & 0.06 \\ 0.55 & 0.15 \end{bmatrix}\).

2.32 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0.02 \\ 0.02 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.22 & 0.08 \\ 0.47 & 0.23 \end{bmatrix}\).

2.33 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0.25 \\ 0.25 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.36 & 0.05 \\ 0.22 & 0.37 \end{bmatrix}\).

2.34 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0.1 \\ 0.1 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 1\). The probabilities of the seven regions are: 0.16, 0.11, 0.07; 0.10, 0.11; 0.23, 0.16.

2.35 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 10 & 1 \\ 1 & 10 \end{bmatrix}\), \(\beta_1 = \beta_2 = 0.5\), \(\rho_1 = 0.5\), \(\rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.28 & 0.07 \\ 0.38 & 0.28 \end{bmatrix}\).

2.36 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0.08 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.27 & 0.08 \\ 0.45 & 0.20 \end{bmatrix}\).

2.37 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = 7\), \(\beta_2 = 5\), \(\rho_1 = \rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.65 & 0.03 \\ 0.22 & 0.10 \end{bmatrix}\).

2.38 Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \(A = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix}\), \(\beta_1 = \beta_2 = 5\), \(\rho_1 = 0.5\), \(\rho_2 = 0.4\), \(p_1 = p_2 = 0.3\), \(w = 30\). The probabilities of the four regions are: \(\begin{bmatrix} 0.66 & 0.04 \\ 0.19 & 0.11 \end{bmatrix}\).

3.1 Graph of the Price and Implied Marginal Cost (in cents per 12oz serving) for a 2L Bottle of Regular Pepsi, 6/91-6/93

3.2 Graph of the Average Markup (in cents per 12oz serving) Across Products, 6/91-6/93

3.3 Graph of the Maximum Difference Across Products (in cents per 12oz serving) Between a Product’s Uniform and Non-Uniform Prices, 6/91-6/93

3.4 Graph of the Difference Between Profits from Uniform and Non-Uniform Price Strategies, as a Percent of the Profits Earned at Non-Uniform Prices, 6/91-6/93

3.5 Graph of the Counterfactual Dollars Lost from Charging Uniform Prices, 6/91-6/93

Chapter 1

Introduction

1.1 Introduction

Retailers typically sell many different products from the same manufacturer at the same price. For example, in the yogurt category, all flavors of six ounce Dannon Fruit-on-the-Bottom yogurt are sold at one price, while all flavors of six ounce Yoplait Original yogurt are sold at a second (uniform) price. This practice is common in many product categories, including frozen dinners, ice cream and salsa. In other product categories, however (e.g., frozen juice), items are typically sold at different prices within manufacturer brands. While not true at all times, in all stores, and for all products, the extent of these uniform prices across different retailers, product categories, and time is stunning. Why is it optimal for the retailer to sell many different items at the same price? Consider the following anecdotal examples:

Tea and Juice: Although many teas and juices are sold at uniform prices, there are notable exceptions. Frozen orange juice is almost always priced differently from other frozen juice. Within the premium juice category (e.g., brands such as Odwalla and Naked Juice), prices are frequently completely non-uniform. Similarly, although most teas are sold at uniform prices, some varieties of tea are frequently sold at a higher price. These non-uniform prices seem to correlate with marked differences in marginal costs. Although most tea leaves cost roughly the same, some cost more to produce. Similarly, differences in cost and juiciness across different fruits can lead to different marginal costs for the same volume of liquid. Furthermore, both tea and juice are products for which manufacturers cannot easily adjust the amount of input: if manufacturers adjust the amount of real juice or tea leaves, consumers are likely to notice.

Wine: Different varieties of wine from the same vineyard and vintage are typically sold at the same price when they are sold for less than $15 a bottle. For example, the much-reviewed Charles Shaw wines are $1.99, regardless of variety (e.g., Merlot, Cabernet Sauvignon, Shiraz, Gamay Beaujolais, Chardonnay, or Sauvignon Blanc). However, for more expensive wine, different varieties are generally priced non-uniformly. As prices increase, we see more and larger deviations from uniform prices.

Clothing: Within a particular style, clothing is typically sold at the same price for different colors and sizes. There are, however, exceptions to this rule: while S, M, L, and XL sizes are typically the same price, many retailers charge more for XXXL and “tall” sizes. These sizes generally cost the retailer more, either because of the amount of fabric used, or because average costs are higher due to lower volumes. Also, although men’s shirts are usually priced uniformly across colors, striped shirts are frequently priced differently. Retailers frequently claim that this is because striped shirts are a different style than solid colored shirts, with different demand. Finally, upscale clothing stores are more likely to charge different prices for different colors.

Books: Books, even Harlequin romance novels, are not sold at uniform prices. Even different books by a single author generally have different prices. At first this seems puzzling. But unlike many products, books are frequently sold with a suggested retail price stamped on their cover or dust jacket. Furthermore, this price is the same everywhere. While many retailers offer lower prices (e.g., discounts for New York Times Bestsellers), these are nearly always offered as a percentage difference from this suggested price. If one is willing to take as given the constraint that most books are sold at a single price (or at most 2-3 prices) nationally, it becomes clear that the demand for different books is almost certainly quite different.

While many products exhibit uniform prices, there seem to be clear patterns characterizing products that are not priced uniformly. Products with ostensibly different marginal costs, such as different flavors of tea, varieties of frozen juice, and odd sizes of clothing, are frequently sold at different prices. Other products with demand that clearly differs across varieties, such as different colors of designer clothes, colors of cars, and expensive varieties of wine, also tend to be priced differently.

A suggestive pattern emerges: unless there are clearly additional profits from non-uniform pricing, there is a strong tendency towards uniform prices. If marginal costs are sufficiently different, we tend to see non-uniform prices. If demand for the products is sufficiently different, we tend to see non-uniform prices. If prices are sufficiently high, we tend to see non-uniform prices. This suggests that managerial menu costs on the part of the retailer may be able to explain the observed uniform pricing behavior. In determining what price to charge, the retailer incurs a cost, most obviously the opportunity cost of a price-setting executive’s time. Given this cost, it may be optimal for the retailer to group products with similar costs and demands and sell them at a single price.

This dissertation is devoted to the careful examination of this managerial menu cost explanation. To test this hypothesis roughly, by measuring the menu costs that would be required to rationalize this behavior, I estimate a structural model of the residual demand curve faced by the retailer. Using this demand model, I calculate the counter-factual difference between the profit (defined as revenue minus cost of goods sold) a retailer earns by charging uniform and non-uniform prices. Only a structural model will allow me to predict demand, and hence calculate profits, at counter-factual prices. I interpret this profit difference as the additional managerial menu cost (incurred when using non-uniform prices instead of uniform ones) that would be sufficient to induce a retailer to use uniform rather than non-uniform prices.
I find that in a product category where uniform prices are typically found, the necessary managerial menu costs average roughly 3% of profits, or $35 per week, per store, per category (in 1992 dollars).¹ According to Levy, Bergen, Dutta & Venable (1997), any price change would, at a minimum, involve a category pricing manager. In this time period, they report that such a manager typically earned roughly $100,000 annually. Assuming a 40-hour work week, this corresponds to a wage of roughly $50 per hour. Hence, my finding can be seen as saying the following: “For the category I consider, charging non-uniform prices nets the retailer an additional average $35 per week, per store. If charging non-uniform prices instead of uniform prices takes up more than about an hour of a pricing executive’s time per store per category, the retailer will find it more profitable to simply charge uniform prices.”

¹ For reference, one 2004 dollar is equivalent to $0.72 1991 dollars, $0.74 1992 dollars, or $0.76 1993 dollars, the years covered by the data in this dissertation.
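The break-even reasoning in the paragraph above can be checked with a short calculation. The following is a minimal sketch, not part of the original analysis: it assumes a 52-week work year (the text states only the 40-hour week and the roughly $100,000 salary), and the function and variable names are illustrative.

```python
# Rough check of the break-even arithmetic quoted in the text.
# Assumption (not stated in the text): the manager is paid for 52 weeks per year.

def hourly_wage(annual_salary: float, hours_per_week: float = 40.0,
                weeks_per_year: float = 52.0) -> float:
    """Convert an annual salary into an hourly wage."""
    return annual_salary / (hours_per_week * weeks_per_year)


def break_even_hours(weekly_profit_gain: float, wage_per_hour: float) -> float:
    """Weekly hours of managerial time whose cost equals the weekly profit gain."""
    return weekly_profit_gain / wage_per_hour


wage = hourly_wage(100_000)           # salary figure reported by Levy et al. (1997)
hours = break_even_hours(35.0, wage)  # $35 per week, per store, per category
print(f"wage = ${wage:.2f}/hour, break-even time = {hours:.2f} hours/week")
```

Under these assumptions the wage comes out slightly below the rounded $50 per hour, and the break-even time is somewhat under one hour per store per week, which is in line with the "about an hour" figure used in the text.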


The remainder of the chapter explores alternative explanations proposed by the literature, and describes my approach in more detail. Given the pervasiveness of uniform pricing in retail environments, the dearth of papers on the subject is quite surprising. To my knowledge, only Orbach & Einav (2001) confront the uniform pricing puzzle directly (and then only in the movie industry). They observe that tickets for movies that are a priori expected to be blockbusters are sold at the same price as movies that are a priori expected to be box office bombs. Unfortunately, they are hamstrung by the fact that they never observe non-uniform pricing for movies, and are unable to find any convincing explanation. The answer to the uniform pricing puzzle must revolve around the key question: How do retailers set prices? Menu costs are not the only potential explanation. Indeed, in addition to the menu cost explanation, there are a variety of explanations, which fall into two groups: demand-side (consumer-based) and supply-side (retailer-based) explanations. Potential demand-side explanations involve explicit consumer preferences for uniform prices. Based on discussions with price-setters, Kashyap (1995) and Canetti, Blinder & Lebow (1998) find that many firms believe they face a kinked demand curve, containing socalled “price points” where marginal revenue is discontinuous. Two consumer preferences that might yield a “uniform price” price point are that more prices makes it harder to figure out what to buy, and that more prices make the consumer feel that the retailer is trying to take advantage of her. Shugan (1980) and Hauser & Wernerfelt (1990) develop theoretical models of costly optimization and Draganska & Jain (2001) find some evidence of this in yogurt. Kahneman, Knetsch & Thaler (1986) look at consumers’ perceptions of fairness, and show that consumers may perceive unfairness in retailers pricing policies. 
Evidence that these perceptions of fairness can affect demand can be found in recent popular press coverage of actions by Coke (Hays 1997) and Amazon (Heun 2001).

In addition to the “null” hypothesis that the observed prices are actually optimal in a traditional supply and demand framework, the two principal supply-side explanations are that the observed uniform pricing behavior is either driven by menu costs or stems from an attempt by retailers to soften price competition. The menu cost explanation is the one favored by Ball & Mankiw (2004), who use it to explain “sticky” prices. The puzzle of uniform pricing across differentiated products is closely related to the long-standing macroeconomic issue of sticky prices. Sticky prices can be thought of as uniform prices for products that are differentiated by time, an example of inter-temporal price uniformity. Addressing this issue, Ball & Mankiw (2004) suggest that much of the

observed inter-temporal price uniformity can be explained by the menu costs associated with “the time and attention required of managers to gather the relevant information and make... decisions” (p. 24-25). Leslie (2004) also believes that menu costs play a role in pricing decisions. He finds that Broadway theaters can earn higher profits by charging prices that differ across seats. While he finds that charging different prices for different seating categories results in higher profits, he cannot explain why theaters use only two or three different categories. Because he does not observe seat-level price variation, he cannot estimate the implied menu costs.

Although they do not describe it as such, Chintagunta, Dubé & Singh (2003) provide a measurement of implied menu costs. They find that multi-store retail chains can earn higher profits by charging different prices in different geographic areas. They predict that if retailers charged a different price in each store (rather than using only three or four different menus of prices for over 80 stores), the chain would have earned an additional $10,000 per week in the orange juice category alone! Such a large result seems unreasonable in light of the salary levels found in Levy et al. (1997), and may be due to the potential demand-side explanations discussed above or to the restrictive assumptions that they make about the nature of competition between retailers in order to identify demand. Taking a different tack and inferring managerial menu costs from salary data, Levy et al. (1997) guesstimate that the annual price-setting managerial costs are $2.3-$2.9 million at the chain level, which translate to average annual per-store costs of roughly $7000.² These are average, not marginal, costs, however.

In addition to menu costs, another possible supply-side explanation for uniform pricing is that it leads to a softening of price competition.
There is an extensive literature investigating the effects of multi-market interaction on firms’ abilities to collude (for example, Nevo (2001) looks at the case of breakfast cereal, while Carlton (1989) suggests this as an explanation for inter-temporal price uniformity). Most relevant to my analysis is Corts (1998), who shows that firms engaged in multi-market competition may prefer to commit themselves to charging the same price in both markets in order to soften price competition. Corts models the interaction between two firms that compete in a Bertrand setting in two markets. In his model, the two firms have identical costs, but these costs differ across markets. If they are able to charge different prices in each market, the firms drive the price down to marginal cost. However, if they are able to restrict themselves to charge the same price

2 These figures are in 1992 dollars. One 1992 dollar is equivalent to $1.35 2004 dollars.

in both markets, then they are able to earn positive profits in expectation. Viewing two different flavors as different markets, his result suggests that retailers may benefit if they are able to tacitly agree to charge fewer prices than the number of distinct products that they sell. Alternatively, charging fewer prices may allow colluding firms to more easily detect cheating. However, both of these theories of collusion require retaliation by the cartel in the face of detected cheating. This does not coincide with what I see in my data: I see repeated non-uniform pricing by one retailer that is not met by non-uniform pricing by any other retailers in my sample. As we will see in more detail later (see Figures 1.1-1.4), I do not see the stores in my data retaliating through the use of non-uniform prices in the face of deviations.

Finally, there has also been a growing body of literature exploring the implications of line length, the dual of the uniform pricing puzzle (see, for example, Draganska & Jain (2001), Bayus & Putsis (1999), and Kadiyali, Vilcassim & Chintagunta (1999)). These papers model the retailer’s decision to add products to a “line”. This literature has taken for granted that all the products in a line have the same price. Indeed, many authors have defined a product line as the set of products from a single manufacturer sold by the retailer at a uniform price. Rather than examine the pricing decision, this literature has focused exclusively on the decision of whether to introduce additional products. Clearly, in addition to facing the problem of optimizing product line length, manufacturers face the decision of when to price products differently, that is, when to split a line. To my knowledge, this question has not been directly addressed by the line length literature.
Although many of these explanations for uniform pricing are plausible (and may be the cause of some cases of uniform pricing), this dissertation focuses primarily on the menu cost explanation, which is essentially a story of bounded rationality on the part of the retailer. There are several reasons for this focus. First, the menu cost explanation uses standard assumptions concerning consumer choice behavior, and at a minimum will provide a useful benchmark for comparison when considering other explanations. Second, it is certainly plausible that for many goods, menu costs may be able to explain uniform pricing. Third, while there is reason to believe that demand-side explanations may play a role, exploring them in more detail cannot be done with the data presently available.³ By

3 Doing so would, at a minimum, require exogenous switching between uniform and non-uniform pricing strategies. In addition, Kahneman et al. (1986) suggest that consumer responses to alternative price behavior are heavily dependent on framing (e.g., explanations for alternative pricing behavior).

contrast, if I assume that consumers do not have explicit preferences for uniform prices, it is possible to examine the implications of uniform pricing for retailers. This dissertation estimates the economic profits that retailers appear to lose by following optimal uniform pricing strategies. I compare the expected profits earned under the actual pricing regime with the expected profits earned under several alternative pricing strategies, such as having one price for each size, one per-unit price for each manufacturer, and one price for each manufacturer-brand-size. These lost profits place bounds on the retailer’s costs of implementing these alternative pricing regimes.

In the next section, I describe the potential demand-side explanations in more detail, before continuing (in section 1.3) to develop my framework for estimating the menu costs to the retailer from following non-uniform pricing strategies.

1.2 Demand-side Explanations

In this section, I briefly discuss “fairness” and costly consumer optimization and their implications for uniform pricing. As mentioned in the introduction, the assumption that uniform pricing is entirely driven by supply-side factors is quite strong. One might reasonably suppose that the source of uniform pricing lies with the consumer. Unfortunately, when examining these demand-side explanations, my maintained assumption that consumer preferences are stable across different stores is far less tenable. To see this, consider two markets: A and B. Suppose that market A has uniform prices while market B has non-uniform prices. One cannot simultaneously assume that the demand systems in these two markets are the same, while assuming that the cause of the uniform pricing lies with the demand system. Estimation of the demand system in the case of explicit demand-side preferences for uniform prices would require either truly exogenous changes between uniform and non-uniform prices (as in a controlled experiment), or much stronger assumptions. However, even if we observed apparently exogenous price variation, with the same households making purchases under both uniform and non-uniform pricing regimes, evidence suggests that many of these demand-side explanations are subject to framing issues. In thinking about fairness and its effects on pricing and consumer demand, we need to consider two questions: (1) Do consumers believe the products are priced fairly? and (2) If consumers believe that products are priced unfairly, what do they do? In an attempt to better understand consumer ideas about fairness, Kahneman et al. (1986) surveyed roughly

100 Canadian households. Their results showed that people’s perceptions regarding whether a firm’s actions were fair or unfair depended on the way the actions were framed. For example, 38% of respondents thought that it was fair for a firm to lower wages 7% during a recession with no inflation, while 78% of respondents thought it was fair for a firm to raise wages only 5% during a recession with 12% inflation. In several other questions, they found that people typically believe that a firm is acting unfairly when price differences stem from changes in market power, but that they believe firms act fairly when price differences are the result of shifting costs. Since, in the absence of market power, prices are driven to marginal cost, the implication of this finding is that when prices are non-uniform, consumers may believe (rightly or wrongly) that the only fair reason for non-uniform prices is that the marginal costs (or perhaps average costs) for the products are different. Presumably if consumers perceive that they are being treated unfairly, they may react negatively. Although I am unaware of any studies that systematically look at consumer behavior when consumers perceive that firms are acting unfairly, it is reasonable to conjecture that the consumer response would be to decrease purchases of either the products priced unfairly, the category these products are in, or the store that sells them. Hence, if a retailer charges non-uniform prices for products that consumers perceive to have equal costs, the firm may face decreased demand. Two rather extreme real-world examples bear this out. First, in late 1999 it was revealed that Coca-Cola was considering reprogramming some of its vending machines to charge higher prices during warmer weather (Hays 1997). This met with quick disapproval by consumers, who threatened a boycott of all Coca-Cola products if the temperature-contingent pricing was implemented. Hasty back-pedalling by Coca-Cola ensued. 
Later, in the fall of 2000, consumers discovered that Amazon.com was charging different prices to different consumers, both randomly and based on purchase histories. After threats by consumers to boycott the company, the company agreed to discontinue the practice (Heun 2001). These are two extreme examples, but they illustrate that consumers may respond to non-uniform prices by decreasing overall demand.

The principal implication of the fairness theory is that if products are priced non-uniformly, consumers may ask themselves whether there is a cost-based justification. Unfortunately, this leaves a great deal unexplained. Can retailers that wish to sell at non-uniform prices explain their actions in a framework that is palatable to consumers? Two pieces of

evidence suggest that they can. The first is that consumers do not protest the existence of sales. If consumers believed that all products were ordinarily sold at marginal cost, they should interpret sales as evidence that regular prices exceed marginal cost. The second, more convincing piece of evidence that consumer wrath can be mitigated is the recent change in pricing regimes undertaken by many major league baseball teams. Formerly sold at the same price for a particular seat, professional baseball tickets have recently switched to non-uniform prices. The price of a ticket now depends on the quality of the opponent (Fatsis 2002). This has been accomplished through aggressive framing by teams. Games against poorly performing teams have been “discounted”, while tickets for games against “premium” opponents are “a few dollars more”.

Another rationalization for a non-standard demand system centers on the way that consumers optimize. Grocery stores sell thousands of products; an average store in a large grocery chain might carry more than 14,000 unique items. Even within a single category, they frequently stock several hundred different items. As discussed in Shugan (1980) and Hauser & Wernerfelt (1990), having to choose from such a large number of products may make it difficult for the household to figure out which bundle of goods offers it the greatest utility.⁴ Uniform prices may make it easier for consumers to decide between products: if two products are the same price, then consumers need only consider the relative utility of the goods, rather than deciding between the more expensive good and the bundle consisting of the less expensive good and the other goods that could be purchased with the difference. However, it is also possible to imagine that if products are too close to each other in characteristics space, uniform prices confound the choice process. In this case, consumers may find it easier to decide between two items when their prices are different.
To my knowledge, the behavioral choice literature has not yet addressed the issue of how uniform prices affect choice, instead focusing on the effects of varying numbers and varieties of goods. The literature on computational costs for consumers is still relatively undeveloped, but some authors have tried including product variety in the consumer’s utility function. Draganska & Jain (2001) specify consumer $i$’s indirect utility from choosing good $j$ with characteristics $X_{jt}$ and price $p_{jt}$ as:

$$u_{ijt} = X_{jt}'\beta - \alpha p_{jt} + f(l_{jt}) + \varepsilon_{ijt} \tag{1.1}$$

4 The related “choice overload” problem is currently an active research area in behavioral marketing. Looking at jams and jellies, Iyengar & Lepper (2000) find evidence that consumers feel overwhelmed when faced with choosing between too many alternatives, decreasing their purchase probability.

where $f$ is a parametric quadratic function and $l_{jt}$ is the number of products in the same line as good $j$. Consistent with Iyengar & Lepper’s results, they find that consumer utility first increases with line length, and then decreases. Similarly, Ackerberg & Rysman (2004) assume that household utility varies with the total number of products offered. They estimate both additive and multiplicative effects, and find that both effects significantly alter the estimated price elasticities. Unfortunately, experimental results have yet to reveal why consumers should care only about the line length for the line that the product is in, rather than all products in the category, or all products within some visual radius.

Further examination of consumers’ perceptions of fairness and the factors that influence the costs of consumers’ utility maximization are clearly fertile ground for future research. In the interim, however, we can investigate the plausibility of the menu costs explanation.
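As an illustration of the shape restriction in equation (1.1), the following minimal sketch evaluates a quadratic line-length term. The functional form f(l) = g1*l + g2*l^2 and the coefficient values are assumptions chosen for illustration only; they are not estimates from Draganska & Jain (2001). With g1 > 0 and g2 < 0 the term first rises and then falls, peaking at l* = -g1/(2*g2).

```python
def line_length_term(l, g1=0.30, g2=-0.025):
    """Hypothetical quadratic f(l) = g1*l + g2*l**2 (illustrative coefficients)."""
    return g1 * l + g2 * l ** 2


def peak_line_length(g1=0.30, g2=-0.025):
    """Line length that maximizes f when g2 < 0 (the vertex of the parabola)."""
    if g2 >= 0:
        raise ValueError("f has an interior maximum only when g2 < 0")
    return -g1 / (2.0 * g2)


if __name__ == "__main__":
    # Utility contribution of line length at a few hypothetical line lengths.
    for l in (1, 3, 6, 9, 12):
        print(l, round(line_length_term(l), 3))
    print("approximate peak:", round(peak_line_length(), 3))
```

Under these assumed coefficients, lines shorter or longer than the peak contribute less utility, which is the inverted-U pattern described in the text.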

1.3 An Overview of Menu Costs

If uniform pricing is driven entirely by supply-side factors, then clearly the first-order question is: “How much profit is forgone?” To answer this question, I begin by assuming that retailers are able to choose among a variety of different pricing strategies of differing sophistication. These strategies are functions that map each period’s state space to a vector of prices. Consider the array of potential pricing strategies that a retailer of carbonated soft drinks might choose from:

• Charge a constant percentage markup of 30% over wholesale price.
• Charge a markup of a constant amount of $0.25 over wholesale price.
• Charge a constant percentage markup of x% over wholesale price, where x is chosen optimally.
• Charge a single per unit price (e.g., $0.25 per 12 ounces) for all soft drinks, regardless of size or flavor.
• Charge a single per unit price for all soft drinks from the same manufacturer, regardless of size, but charge different prices across manufacturers.
• Charge a single per unit price for all soft drinks of the same size, regardless of manufacturer, but charge different prices across sizes.

• Charge a different price for each product (e.g., one price for a 2-Liter bottle of Diet Coke, a second price for a 2-Liter bottle of Coke Classic, a third price for a 2-Liter bottle of Diet Pepsi, etc.).
• Charge two different per-unit prices for all soft drinks, but determine the groupings of products and these two price levels optimally.

In this context, uniform pricing by manufacturer-brand-size and completely non-uniform prices are just two of many potential pricing strategies. The retailer’s implementation costs for these pricing strategies clearly differ. For example, charging a constant markup of 30% on all products requires no knowledge on the part of the retailer about the residual demand curve that it faces. In fact, many books on applied pricing for small retailers (e.g., Burstiner (1997)) suggest that they simply charge a 100% markup on their entire inventory, a practice known as “keystone pricing”. By contrast, charging a different (and profit-maximizing) price for each product requires intimate knowledge on the part of the retailer of the residual demand curve that it faces. It must be cognizant not only of consumers’ preferences, but also of the current state space (competitors’ current prices, current advertising activity, current wholesale prices, holiday periods, etc.). Furthermore, mapping these state variables to optimal prices for each product involves solving a high-dimensional optimization problem every period.

The following framework is useful for analyzing the retailer’s decision process. Assume that each period, the retailer maximizes expected profits, less menu costs:

$$\text{Expected Profit this Period} = \text{Expected Revenue from Sales} - \text{Expected Cost of Goods} - \text{Menu Costs this Period} \tag{1.2}$$
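To make the idea of a pricing strategy as a function from the state to a price vector concrete, here is a minimal sketch of three of the simpler strategies listed above, together with the accounting in equation (1.2). Everything beyond the text is an assumption for illustration: the state contains only wholesale prices, `toy_demand` is a hypothetical independent linear demand curve (not the demand model developed in this dissertation), and the menu cost figures are placeholders.

```python
from typing import Callable, Dict, List

# A pricing strategy maps this period's state (here only wholesale prices,
# a deliberate simplification) to a vector of retail prices.
State = Dict[str, List[float]]
Strategy = Callable[[State], List[float]]


def percentage_markup(rate: float) -> Strategy:
    """Constant percentage markup over wholesale price."""
    return lambda state: [w * (1.0 + rate) for w in state["wholesale"]]


def fixed_markup(amount: float) -> Strategy:
    """Constant dollar markup over wholesale price."""
    return lambda state: [w + amount for w in state["wholesale"]]


def single_price(price: float) -> Strategy:
    """One price for every product, regardless of its cost."""
    return lambda state: [price for _ in state["wholesale"]]


def toy_demand(prices: List[float]) -> List[float]:
    """Hypothetical independent linear demand; NOT the dissertation's model."""
    return [max(100.0 - 80.0 * p, 0.0) for p in prices]


def net_expected_profit(strategy: Strategy, state: State, menu_cost: float) -> float:
    """Equation (1.2): expected revenue - expected cost of goods - menu costs."""
    prices = strategy(state)
    quantities = toy_demand(prices)
    revenue = sum(p * q for p, q in zip(prices, quantities))
    cost_of_goods = sum(w * q for w, q in zip(state["wholesale"], quantities))
    return revenue - cost_of_goods - menu_cost


if __name__ == "__main__":
    state = {"wholesale": [0.20, 0.25, 0.30]}  # hypothetical wholesale prices
    candidates = {
        "30% markup": (percentage_markup(0.30), 0.0),
        "$0.25 markup": (fixed_markup(0.25), 0.0),
        "single price $0.75": (single_price(0.75), 0.0),
    }
    for name, (strategy, menu_cost) in candidates.items():
        print(name, round(net_expected_profit(strategy, state, menu_cost), 2))
```

Comparing the gross profit of two strategies in this way gives the largest menu cost difference at which the more sophisticated strategy would still be worth adopting, which is the sense in which forgone profits bound menu costs.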

The menu costs incurred by the retailer each period can be thought of as having two components:

$$\text{Menu Costs this Period} = \text{Maintenance Costs (Recurring)} + \text{Upgrade Costs (if any, Non-Recurring)} \tag{1.3}$$

The first component of the current period’s menu cost is the recurring cost of maintaining the current pricing strategy. Obviously, this cost would vary depending on the pricing

strategy chosen. A simple pricing strategy, such as “charge the same prices as last period”, would incur zero maintenance costs. But in the case of a complex strategy, this cost could be quite high. Each period, the retailer would have to learn the current state space (entering wholesale prices into a computer pricing program, learning competitors’ prices, etc.) and apply the pricing rule to determine the prices for that period. Even within the class of complicated pricing rules, costs might vary. Due to the difficulties inherent in numerical optimization, uni-dimensional pricing strategies (e.g., charging a single optimal markup) are much easier to implement (and hence require less managerial time) than high-dimensional pricing strategies.

The second component of the current period’s menu cost is the cost of upgrading to a better pricing strategy. This fixed cost is non-recurring (or at least infrequently recurring). If the retailer decides to use the same pricing strategy as in the previous period, no upgrade costs would be incurred. But if, for example, a retailer decided to switch from a “charge the same prices as last period” strategy to a “charge the profit-maximizing price for each good” strategy, it would potentially have to incur several costs. First, in order to learn its demand curve, the retailer may want to introduce exogenous price variation.⁵ In addition to the opportunity cost of the time it takes for a manager to determine these prices, this experimentation involves forgone profits, because it explicitly requires charging prices that are believed to be non-optimal. Fortunately, this experimentation is required only infrequently, when it is believed that the structural parameters of the demand system have changed. Second, the retailer must analyze the data to recover the structural parameters of the demand system. Here the retailer faces a choice regarding the level of sophistication used in estimating the demand curve.
For example, the retailer could employ a homogeneous logit model, a heterogeneous logit model, or another model (such as that found in this paper) in estimating demand. In choosing among alternative models, the retailer must trade off the opportunity cost of a manager’s time, or the cost of hiring consulting services, against the expected cost of mis-specification. Like the costs associated with experimentation, this cost must be incurred only when it is believed that the structural parameters of the demand system have changed. Third, the retailer may potentially have to purchase software (such as optimization software) to allow it to convert the estimated demand system to a set of

5 As discussed in more detail later, it is not possible to estimate the cross-price elasticities between two products if their relative prices are constant.

optimal prices.

Unlike the maintenance costs, which are incurred each period, these upgrade costs would only be incurred by the retailer infrequently, when the retailer perceives that the upgrade costs are less than the present discounted value of the additional profits gained from following the new pricing strategy.⁶ Upgrade costs must also be incurred again whenever there is reason to believe that the structural parameters of the demand system have changed. If the demand system changes substantially over time, or across distance, this will lead to additional upgrade costs. For example, if Coke is more popular in some areas, while Diet Coke is more popular in other areas, then demand must be estimated separately in each of these areas. This explains why even large chains might charge uniform prices: demand may differ structurally across geographic areas, and hence it may need to be re-estimated for each area, eliminating returns to scale in upgrade costs. Similarly (though less likely), the retailer will have to re-estimate demand more frequently in areas where the distribution of consumers’ preferences changes frequently. We are less likely to see costly-to-implement pricing strategies when the retailer cannot expect to recoup the upgrade costs.

For tractability, this dissertation assumes a static model, and estimates the additional per-period maintenance costs for non-uniform pricing relative to uniform pricing. A dynamic structural model would have to include a model of the retailer’s expectations about the demand curve it would find if it experimented, as well as the retailer’s expectations about how marginal costs, demand, and its competitors’ actions would change over time.
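The upgrade rule described in this section can be sketched as a simple present-value comparison. This is only an illustration under stated assumptions: the extra profit is taken to be constant each period, it is netted against any extra maintenance cost, and `periods` is a hypothetical horizon standing in for how long the estimated demand parameters are expected to remain valid. All function and parameter names are introduced here for illustration.

```python
def present_value(per_period_gain: float, discount_rate: float, periods: int) -> float:
    """Present discounted value of a constant gain received at the end of each period."""
    return sum(per_period_gain / (1.0 + discount_rate) ** t
               for t in range(1, periods + 1))


def should_upgrade(extra_profit: float, extra_maintenance: float,
                   upgrade_cost: float, discount_rate: float, periods: int) -> bool:
    """Upgrade only if the discounted net gain exceeds the one-time upgrade cost."""
    net_gain = extra_profit - extra_maintenance  # assumption: maintenance is netted out
    return present_value(net_gain, discount_rate, periods) > upgrade_cost


if __name__ == "__main__":
    # Hypothetical numbers: a weekly gain, a weekly discount rate, a two-year horizon.
    print(should_upgrade(extra_profit=35.0, extra_maintenance=10.0,
                         upgrade_cost=2000.0, discount_rate=0.001, periods=104))
```

A shorter horizon, for example because demand is believed to shift often or to differ by area, lowers the present value and makes the upgrade less likely to pay for itself.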

1.4 Estimating Menu Costs with Soft Drink Data

In order to actually calculate the implied menu costs, I must learn both the marginal costs and the demand curves faced by the retailer. In practice, I only observe data on weekly prices and quantities purchased by households. Economic theory suggests I can recover the marginal costs if I know the residual demand function, but I must observe price and

6 I choose to remain agnostic about the process that allows the retailer to form expectations about the additional profit to be gained from upgrading to a new pricing system without actually implementing the system.

quantity data that includes variation in prices: I must observe a retailer charging non-uniform prices.⁷ I solve this problem by considering a product that is frequently (but not always) priced uniformly: different flavors of carbonated soft drinks.

As I discuss in more detail below (and in chapter 2), although several demand models currently exist, all either ignore potential variety-seeking behavior by consumers, or cannot be feasibly estimated for more than a handful of products. Given that the proposed counterfactual pricing experiments involve adjusting the relative prices between different varieties, there is reason to expect that allowing for this variety-seeking behavior, in the form of negative cross-price elasticities, may be important. Chapter 2 investigates the necessity of accounting for this variety-seeking behavior in more detail, comparing the results from counterfactual exercises using my new model and the traditional logit model, and shows that menu cost estimates may be substantially incorrect if the wrong model is used.

Within the soft drink category, there is often a great deal of price variation, both for the same product over time and between products from different manufacturers. Indeed, Coke and Pepsi frequently alternate promotion weeks, with Coke on sale one week and Pepsi on sale the next. This behavior can be seen in Figures 1.1 and 1.2, which plot the weekly price (normalized to 12-ounce servings) of 2-Liter containers of Coke and Pepsi over a two year period at two different stores. In contrast to the price variation seen in these figures, soft drinks are typically sold at uniform prices by manufacturer-brand-size. Within a size, all flavors of Pepsi are typically sold at one price, and all flavors of Coke at another uniform price. Figure 1.3 illustrates this by plotting the ratios of the price of Diet Pepsi and Diet Caffeine Free Pepsi to Regular Pepsi.
From the graph, it is easy to see that at this store

7 Strictly speaking this is not quite true. For example, it is possible to empirically estimate a homogeneous logit model and hence derive cross-price elasticities between two goods whose price ratio is constant as long as their price levels are changing (i.e., get identification off sales). To see this, consider the case with two products and an outside good, with product dummies for characteristics. In this case, a homogeneous logit model of household choice implies a system of 2 linear equations, where the dependent variable in equation $j$ is the log of the ratio of the market share of good $j$ to the market share of the outside good: $\ln(s_{jt}) - \ln(s_{0t})$. The independent variables are the price of good $j$ and indicator variables for each good. Because the logit assumes that the coefficients on these prices and dummy variables are equal in each equation, the system collapses to a single equation with three variables: price and indicator variables for the two goods. This equation is usually estimated by OLS. If the two goods are sold at the same price, say $1 in every period, clearly the price variable is collinear with the product dummies, and the model cannot be estimated. If, however, the two goods are sold at different prices in each period, but at the same price relative to each other, then the model can be estimated. Unfortunately, the identification in this case is coming entirely from the functional form. The logit model assumes that households choose the alternative yielding the highest indirect utility. Because indirect utility functions are homogeneous of degree zero (meaning that a change in the level of all prices is equivalent to a change in the level of income), the effect of this price variation is equivalent to variation in household income.

(the same store as in Figure 1.1), the three Pepsi UPCs⁸ were always sold at the same price. However, not all stores charged uniform prices during this time. Figure 1.4 shows the same price ratios as in Figure 1.3 but at another store (the same store as in Figure 1.2). The graph for this store clearly shows a great deal more variation in the prices of different flavors of Pepsi. This variation allows us to estimate demand separately for each Pepsi variety. Similar price variation among varieties of other manufacturer brands at this store allows us to estimate demand for many different items.
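The collinearity argument in footnote 7 can be checked numerically. The sketch below builds the stacked regressor matrix for the two-good homogeneous logit regression (columns: price, dummy for good 1, dummy for good 2) and tests whether it has full column rank. The price series are made up for illustration, and the small Gaussian-elimination rank routine is a stand-in written here to keep the example dependency-free; it is not part of the dissertation's estimation code.

```python
def matrix_rank(rows, tol=1e-10):
    """Rank via Gaussian elimination with partial pivoting (small matrices only)."""
    m = [[float(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        candidates = range(rank, len(m))
        pivot = max(candidates, key=lambda r: abs(m[r][col]), default=None)
        if pivot is None or abs(m[pivot][col]) < tol:
            continue  # no usable pivot in this column
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def design_matrix(prices_good1, prices_good2):
    """Stack both goods' observations; columns are [price, dummy1, dummy2]."""
    rows = [[p, 1.0, 0.0] for p in prices_good1]
    rows += [[p, 0.0, 1.0] for p in prices_good2]
    return rows


def is_identified(prices_good1, prices_good2):
    """OLS coefficients are pinned down only if the design has full column rank."""
    return matrix_rank(design_matrix(prices_good1, prices_good2)) == 3


if __name__ == "__main__":
    constant = [1.0, 1.0, 1.0, 1.0]  # both goods at $1 every week
    on_sale = [1.0, 0.8, 1.0, 0.7]   # same price for both goods, but the level moves
    print(is_identified(constant, constant))  # price is collinear with the dummies
    print(is_identified(on_sale, on_sale))    # level variation restores full rank
```

As the footnote cautions, full rank here only says the logit coefficients can be computed; with a constant price ratio the implied cross-price elasticities still come from functional form rather than from observed relative price changes.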

8 The Universal Product Code (UPC), also sometimes known as a Stock Keeping Unit (SKU), is a number that uniquely identifies each product/size. For example, a 2-Liter plastic bottle of Caffeine-Free Diet Coca-Cola has a different UPC than a 2-Liter plastic bottle of Diet Coca-Cola, or a 12oz can of Caffeine-Free Diet Coca-Cola.

[Figure 1.1 omitted: line plot titled “Prices of 2L (67.6oz) sizes at Store B”; series: Coke, Pepsi; horizontal axis: Week, 08jun1991 to 28may1993; vertical axis: Cents per 12oz Serving.]

Figure 1.1: Graph of Prices for Coke and Pepsi at Store B, 6/91-6/93

[Figure 1.2 omitted: line plot titled “Prices of 2L (67.6oz) sizes at Store A”; series: Coke, Pepsi; horizontal axis: Week, 08jun1991 to 28may1993; vertical axis: Cents per 12oz Serving.]

Figure 1.2: Graph of Prices for Coke and Pepsi at Store A, 6/91-6/93

[Figure 1.3 omitted: line plot titled “Price Ratios of Pepsi Varieties to Regular Pepsi, 2L (67.6oz) size at Store B”; series: Diet Pepsi, Diet Caffeine Free Pepsi; horizontal axis: Week, 08jun1991 to 28may1993; vertical axis: Price Ratio.]

Figure 1.3: Graph of Price Ratios of Pepsi Varieties to Regular Pepsi at Store B, 6/91-6/93

[Figure 1.4 omitted: line plot titled “Price Ratios of Pepsi Varieties to Regular Pepsi, 2L (67.6oz) size at Store A”; series: Diet Pepsi, Diet Caffeine Free Pepsi; horizontal axis: Week, 08jun1991 to 28may1993; vertical axis: Price Ratio.]

Figure 1.4: Graph of Price Ratios of Pepsi Varieties to Regular Pepsi at Store A, 6/91-6/93

As an aside, note that Figures 1.3 and 1.4 (and the puzzle addressed in this dissertation) highlight two features of the data: (1) the prices of the three goods are moving in lock-step (i.e., Figure 1.3 is essentially two straight lines) and (2) they are all the same price (i.e., these lines are at 1). Although this dissertation considers (2), observing only (1) but not (2) would also be explained by menu costs. For example, if a retailer knew that Diet Coke was more popular than Coke (but not how this varied with their absolute prices or the prices of other products), then one unsophisticated pricing strategy that would result from this would be to always charge $0.10 more for Diet Coke than Coke.

Given this dataset, I need a model of consumer demand that will allow me to estimate the residual demand curve in period $t$ for each product $j$: $Q_{jt}(\cdot)$. It is important to note that I need to estimate the residual demand function faced by a single store. Unless the retailer is a monopolist, this is not the same as the market demand function faced by all stores. The residual demand function reflects the presence of other stores in the market: it accounts for the fact that the prices charged at other stores affect demand at store A. This means that if store B has a clearance sale, I should expect demand at store A to decline. Several previous empirical demand studies have ignored this aspect for the very good reason that in most datasets, this information is simply not available: prices for other stores are not observed. In my case, however, I observe the prices charged at four other competing stores. Furthermore, as explained later, the additional stores in the dataset were chosen precisely because they were the stores that shoppers were most likely to visit.⁹

To estimate the retailer’s residual demand curve, I begin by decomposing the household’s demand for soft drinks into two parts.
I assume that conditional on going shopping (a process that, following the existing literature, I take to be exogenous) the household first chooses which store to shop at. Then, conditional on the choice of store, the household chooses the bundle of goods from that store that maximizes their utility. This means that the residual demand faced by a particular store in a given week is equal to the sum over all households (that went shopping in that week) of the probability that the household chose that, multiplied by their expected purchases, conditional on choosing that store. In other 9 Throughout this dissertation I assume that the retailer uses a best-response to other retailers, and does not account for the fact that deviations may lead to changes in rivals’ pricing strategy. This is the same as the criterion for a Nash equilibrium.

CHAPTER 1. INTRODUCTION

21

words, this equation:

Et [Q(p)] =

X

E [i’s purchases|i goes to A] · P

i

! i’s characteristics i goes to A and prices at all stores (1.4)

describes the residual demand curve faced by store A in week t. Although I estimate a structural model of product choice, conditional on store choice (my approach to this problem is the main subject of the next chapter), for tractability I estimate a reduced-form model of store choice. Ideally, I would like the household’s entire choice problem to satisfy utility maximization. One way to do this would be to model the household’s choice of store as a multinomial logit choice, where the mean indirect utilities are derived from the optimal bundles that the household could have selected at each store. Unfortunately, this approach is computationally very burdensome. As an approximation, I model the household’s choice of store as a multinomial logit choice, where the indirect utility from each store varies with household and purchase-occasion characteristics, as well as price indices for various categories from each store.

Many authors have used the logit model to estimate household demand for differentiated products. The principal advantage of the logit is that it is very easy to estimate. A significant drawback of the logit is that in its traditional form, it does not allow for variety-seeking behavior by households: all cross-price elasticities are constrained to be positive. In the applications for which the logit was originally developed, where each agent makes a single choice among several competing alternatives, this is not a problem. It becomes problematic, however, in applications where agents’ choices are not exclusive. Such nonexclusive choices occur in many settings, but are particularly frequent in grocery stores. Empirical evidence from households’ purchase behavior of soft drinks suggests that these patterns are present in the data. As shown in Table 1.1, households typically buy several units of the same product, and/or several different products, within a single purchase occasion. This is also evident, though to a lesser degree, when I consider only purchases among the top 25 UPCs by sales in Table 1.2.

Several alternatives to the traditional logit model exist to account for potential complementarities that arise when households buy bundles. One approach that does account for complementarities is the AIDS model of Deaton & Muellbauer (1980), but this approach can only be applied to aggregate data. Because using household-level data offers the


Table 1.1: Distribution of Purchase Occasions by Number of Items and Number of UPCs Purchased (All Carbonated Soft Drinks)

Total Number of              Total Number of UPCs Purchased
Items Purchased          1       2       3       4      5+     Total
1                    1,332       0       0       0       0     1,332
2                      967     377       0       0       0     1,344
3                      177     195      86       0       0       458
4                      370     111      45      16       0       542
5                       25      50      22       9       3       109
6                      122      46      22       5       2       197
7                        7      23      13       4       2        49
8                       83      16      17       1       1       118
9                       15      11       3       4       0        33
10+                    169      39      11       5       4       228
Total                3,267     868     219      44      12     4,410

This table shows the distribution of household purchase occasions across multiple units and multiple products, replicating a similar table found in Dubé (2001).

potential for increasing the number of observations (it gives us significantly more information over a shorter time period), it seems wasteful to aggregate potentially useful variation. Furthermore, because the AIDS approach is non-hedonic and does not use product characteristics, it can only be estimated when all products are offered in all weeks. If even one product is missing in a given week, the AIDS approach cannot use that week. Recent work by Israilevich (2004) has attempted to ameliorate this problem with mixed success.

As I will discuss in more detail in chapter 2, several other approaches to this problem have been suggested, including modifications to the logit by Gentzkow (2004), Hendel (1999), Dubé (2001), and Chan (2002), or taking a different approach altogether, as in Kim, Allenby & Rossi (2002). All of these attempts, however, have had significant shortcomings. Models based closely on the logit require that the econometrician specify the number of potential bundles that the household can choose, while other models suffer computational difficulties that severely restrict the number of products that can be included in the analysis.

This dissertation improves on these existing models by developing and estimating a model that contains the flexibility to capture variety-seeking choice behavior as well as the scalability to handle continuous choice and larger product spaces. Using the parameter estimates from this new structural model, I am able to calculate the implied profits that


Table 1.2: Distribution of Purchase Occasions by Number of Items and Number of UPCs Purchased (Among Top 25 Carbonated Soft Drinks)

Total Number of              Total Number of UPCs Purchased
Items Purchased          1       2       3       4       5     Total
1                    1,163       0       0       0       0     1,163
2                      274     108       0       0       0       382
3                       86      38       6       0       0       130
4                       49      23       3       1       0        76
5                       17      23       5       2       0        47
6                       25      15       6       0       0        46
7                        5       6       2       0       0        13
8                        8       3       3       0       0        14
9                        3       2       2       0       0         7
10+                     60       5       2       0       1        68
Total                1,690     223      29       3       1     1,946

This table shows the distribution of household purchase occasions across multiple units and multiple products, replicating a similar table found in Dubé (2001).

the retailer earned following its non-uniform pricing strategy as well as the profits that it would have earned had it followed a strategy of uniform prices.
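The decomposition in equation (1.4) can be illustrated with a short sketch. It is only an illustration under stated assumptions: the household records, the store utilities, and the helper names (`store_choice_probs`, `residual_demand`, `with_sale_at_b`) are hypothetical and are not taken from the estimated model. The sketch uses a multinomial logit for store choice and holds expected purchases conditional on visiting store A fixed.

```python
import math

def store_choice_probs(store_utils):
    """Multinomial logit probabilities over stores from mean utilities."""
    top = max(store_utils.values())  # subtract the max for numerical stability
    weights = {s: math.exp(v - top) for s, v in store_utils.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

def residual_demand(households, store="A"):
    """Sum over households of P(store) * E[purchases | store], as in (1.4)."""
    return sum(
        store_choice_probs(h["store_utils"])[store] * h["expected_purchases"]
        for h in households
    )

def with_sale_at_b(households, boost=1.0):
    """Copy of the households with store B made more attractive (hypothetical)."""
    return [
        {**h, "store_utils": {**h["store_utils"], "B": h["store_utils"]["B"] + boost}}
        for h in households
    ]

# Hypothetical households: store utilities and expected purchases at store A.
households = [
    {"store_utils": {"A": 0.5, "B": 0.0}, "expected_purchases": 2.0},
    {"store_utils": {"A": -0.2, "B": 0.3}, "expected_purchases": 4.0},
]

demand_before = residual_demand(households)
demand_after = residual_demand(with_sale_at_b(households))
print(demand_before, demand_after)
```

In this toy setting a sale at store B lowers the probability that each household visits store A, so residual demand at store A falls even though purchases conditional on visiting A are unchanged.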

1.5 Conclusion

This chapter has described the puzzle of uniform pricing for differentiated products, and has covered many anecdotal facts. These anecdotes suggest that managerial menu costs on the part of the retailer may be able to explain the observed uniform pricing behavior. In order to test this hypothesis, and measure the menu costs that would be required to rationalize this behavior, I need to carefully estimate a structural model of the residual demand curve faced by the retailer. Only a structural model will allow me to predict demand at counter-factual prices. The remainder of the dissertation proceeds as follows: chapter 2 lays out the problems with the existing demand estimation literature as it pertains to my data. Building on this literature, I develop a new model and explore its characteristics in detail. Chapter 3 uses this model to estimate household-level demand, using grocery data. These estimates are then used to perform the counter-factual experiments described in this


chapter. By comparing the expected profit earned by a single retailer at the weekly prices actually charged to the expected profit that same retailer would have earned had it charged uniform prices that were optimal in each week (subject only to the restriction that they be uniform by manufacturer-brand-size), I am able to infer that the retailer would have experienced a profit loss of roughly $36.56 (in 1992 dollars) per week if it had charged uniform, rather than non-uniform, prices. This leads me to conclude that relatively small managerial menu costs associated with weekly price optimization are sufficient to lead to the observed behavior by many other retailers: that of uniform prices.
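The logic of this counterfactual comparison can be sketched with a toy example. Everything in it is an assumption made for illustration: the two linear demand curves, the unit cost, and the price grid are hypothetical and have no connection to the estimated demand system or to the $36.56 figure. The sketch compares the best profit over all price pairs with the best profit when both products must carry the same price.

```python
def profit(p1, p2, cost=0.5):
    """Profit from two substitute products under hypothetical linear demands."""
    q1 = max(0.0, 10.0 - 6.0 * p1 + 2.0 * p2)
    q2 = max(0.0, 8.0 - 6.0 * p2 + 2.0 * p1)
    return (p1 - cost) * q1 + (p2 - cost) * q2

# Candidate prices from 0.50 to 3.00 in one-cent steps.
grid = [i / 100 for i in range(50, 301)]

best_nonuniform = max(profit(p1, p2) for p1 in grid for p2 in grid)
best_uniform = max(profit(p, p) for p in grid)  # same price for both products

loss = best_nonuniform - best_uniform
share = loss / best_nonuniform
print(best_nonuniform, best_uniform, loss, share)
```

Because the uniform price pairs are a subset of the full grid, the loss from uniform pricing can never be negative. The empirical question addressed in the dissertation is how large that loss is relative to the cost of computing non-uniform prices.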

Chapter 2

A Model of Continuous Demand and Variety-Seeking

2.1 Introduction

As discussed in chapter 1, existing demand models are unable to simultaneously address several key features that are frequently encountered in consumer choice data (and that are found in my dataset). These three key features are: that households may be variety-seeking, that they make continuous choices, and that they choose from a large number of products. This chapter begins by exploring in more detail the shortcomings of existing models in dealing with these features. I consider both deviations from the classic AIDS model, as well as deviations from the traditional multinomial logit model.¹

¹ With the exception of Kim et al. (2002), the current literature has tended to use the logit framework (hedonic utility and an i.i.d. extreme value idiosyncratic shock) to incorporate variety-seeking behavior, assuming that idiosyncratic shocks are i.i.d. across choices.

After identifying these shortcomings in the existing literature, I develop a new demand model that allows for variety-seeking households that make continuous choices from a large number of products. This new model is based on the hedonic framework of many existing models; each good is represented by a vector of characteristics. However, it involves maximizing a direct utility function subject to a budget constraint in order to compute a household’s demand for a vector of products. By allowing households’ preferences to be nonlinear in the characteristics, I am able to model variety-seeking behavior. In addition, the use of a direct utility function and budget constraint enables us to capture households’ continuous choice in a way that reflects economic theory. Finally, I am able to estimate the model despite the large number of products by using the Method of Simulated Moments, which is significantly less computationally costly than the Simulated Maximum Likelihood approach proposed by other authors. I discuss this last point in chapter 3.

After developing the new model, I demonstrate its advantages in several ways. First, I look at the demand and Engel curves generated by the model, showing how the parameters affect their slope, curvature, and reservation prices. Second, I look at how the model distributes probabilities determining propensities to purchase different bundles of goods. This shows how the parameters and the characteristics matrix affect the extensive margin. Finally, I generate data based on the new model, and estimate the profit difference between uniform and non-uniform prices using both the new model and the logit model. I find that the logit estimates of the profit differences are not robust to the misspecification.

2.2 Previous Literature

Existing demand models are unable to address simultaneously complementarities between goods, continuous choice, and a large number of products. Given that households typically purchase several different soft drink products within the same purchase occasion, there is reason to believe that complementarities exist in my data. This behavior is summarized in Table 1.1, which replicates a similar table in Dubé (2001). It shows the frequency with which households buy either multiple units of the same product, or several different products, where a product is defined in this case to be a UPC.

One approach that does account for these complementarities is the Almost Ideal Demand System model of Deaton & Muellbauer (1980). The AIDS model is appealing because it does not require the econometrician to locate the products in characteristics space, and allows for a relatively unconstrained system of cross-price elasticities. However, using the AIDS model here is problematic for two reasons. First, the approach can only be applied to aggregated data. In my case, aggregating the data would ignore the potentially useful variation contained in the household-level data. Second, because the traditional AIDS approach is not hedonic and does not use product characteristics, it can only be estimated when all products are offered in all weeks. If even one product is missing in a given week,


the AIDS approach cannot use that week.² This problem is not readily apparent when dealing with data that is aggregated across several sizes to the brand level (see, for example, Hausman, Leonard & Zona (1994)). But when dealing with UPC-level data, the problem can frequently become severe. In my data, nearly every week has some product that is not offered. Recent work by Israilevich (2004) has attempted to ameliorate this problem by parameterizing the prices of the “missing” goods in weeks when they are not offered. However, he reports difficulty in getting the estimates to converge when there are more than a handful of missing good-weeks.

Although the traditional multinomial logit does not allow for purchases of more than one product, there are two obvious ways to remedy this. One way is to view the purchase of several products at once as a set of independent choices. This is the approach implicitly taken by papers using aggregate data, such as Berry, Levinsohn & Pakes (1995) and Nevo (2001). Recall that the traditional logit model with two inside goods, with mean utilities $V_1$ and $V_2$, and an outside good with mean utility equal to zero gives the following choice probabilities:

$$P(\text{Choose outside good}) = \frac{1}{1 + e^{V_1} + e^{V_2}}$$

and

$$P(\text{Choose good } j) = \frac{e^{V_j}}{1 + e^{V_1} + e^{V_2}}$$
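The choice probabilities above are simple to evaluate numerically. The following is a minimal sketch with hypothetical mean utilities; the function name `logit_probs` is introduced here for illustration only. It also shows the restriction on substitution patterns noted in chapter 1: raising the mean utility of good two (for example, through a price cut) can only lower the probability of choosing good one.

```python
import math

def logit_probs(v1, v2):
    """Logit choice probabilities with two inside goods and an outside good."""
    denom = 1.0 + math.exp(v1) + math.exp(v2)
    return {
        "outside": 1.0 / denom,
        "good1": math.exp(v1) / denom,
        "good2": math.exp(v2) / denom,
    }

# Hypothetical mean utilities; only V2 changes between the two cases.
low_v2 = logit_probs(0.5, 0.0)
high_v2 = logit_probs(0.5, 1.0)
print(low_v2["good1"], high_v2["good1"])
```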

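Adapting this as suggested above, we have:

$$\begin{aligned}
P(q_1, q_2, z) &= P(\text{Choose Good One})^{q_1} \cdot P(\text{Choose Good Two})^{q_2} \cdot P(\text{Choose Outside Good})^{z} \\
&= \left(\frac{e^{V_1}}{1 + e^{V_1} + e^{V_2}}\right)^{q_1} \cdot \left(\frac{e^{V_2}}{1 + e^{V_1} + e^{V_2}}\right)^{q_2} \cdot \left(\frac{1}{1 + e^{V_1} + e^{V_2}}\right)^{z}
\end{aligned}$$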

where $q_1$, $q_2$, and $z$ are the quantities of the different goods chosen. However, two problems exist. First, the econometrician must specify the number of choices that the household makes. In the traditional logit model, each household is assumed to make a single choice: $C_{\max} = q_1 + q_2 + z = 1$. But when the household could potentially make more than one choice, the econometrician must specify the number. This is the problem that Berry et al. (1995) and Nevo (2001) refer to as needing to specify the size of the market, or the share of the outside good. We will explore this problem in more detail below. The second problem with this model is that it is not able to capture complementarities between products: all cross-price elasticities are constrained to be positive. This can be seen by noting that in this case $E[q_1] = C_{\max} \cdot P(\text{Choose good 1})$, which is a strictly decreasing function of $V_2$.

² The AIDS model leads to an econometric model in which each week is an observation, with one equation for each product (predicting the market share of that product), and the prices of each good entering as explanatory variables in all equations. Hence, if I do not observe a single product’s price and demand in a particular week, I cannot use that week as an observation.

A second potential way to tweak the traditional logit model, one that does allow for complementarities, is to assume that the household chooses from among a large number of possible bundles. This amounts to assuming that the household aggregates the characteristics of each product in the bundle to create a kind of meta-product. In this case:

$$P(q_1, q_2, z) = \frac{\exp(V_1 q_1 + V_2 q_2)}{\sum_{(q_{1i}, q_{2j}, z) \in C} \exp(V_1 q_{1i} + V_2 q_{2j})}$$

where the set of possible choices is defined as³

$$C = \{(q_{1i}, q_{2j}, z) \mid q_{1i} \ge 0,\ q_{2j} \ge 0,\ z \ge 0,\ q_{1i} + q_{2j} + z = C_{\max}\}$$

In order to make the sum in the denominator finite, it is necessary to restrict the number of potential bundles. This may be done by discretizing $q_1$ and $q_2$ (and, by implication, $z$). Now we have:

$$E(q_1) = \frac{e^{V_1} + 2e^{2V_1} + e^{V_1 + V_2}}{1 + e^{V_1} + e^{V_2} + e^{V_1 + V_2} + e^{2V_1} + e^{2V_2}}$$

when $C_{\max} = 2$. Or:

$$E(q_1) = \frac{e^{V_1} + 2e^{2V_1} + 3e^{3V_1} + e^{V_1 + V_2} + 2e^{2V_1 + V_2} + e^{V_1 + 2V_2}}{1 + e^{V_1} + e^{V_2} + e^{V_1 + V_2} + e^{2V_1} + e^{2V_2} + e^{2V_1 + V_2} + e^{V_1 + 2V_2} + e^{3V_1} + e^{3V_2}}$$

when $C_{\max} = 3$.

³ For example, when the number of choices $C_{\max} = 2$, there are six potential choices, with indirect utilities given by: $u(1, 0, 1) = V_1 + \varepsilon_{1,0,1}$, $u(0, 1, 1) = V_2 + \varepsilon_{0,1,1}$, $u(1, 1, 0) = V_1 + V_2 + \varepsilon_{1,1,0}$, $u(2, 0, 0) = 2V_1 + \varepsilon_{2,0,0}$, $u(0, 2, 0) = 2V_2 + \varepsilon_{0,2,0}$, and $u(0, 0, 2) = \varepsilon_{0,0,2}$, where the $\varepsilon$’s are i.i.d. extreme value (Gumbel) distributed.

I critique this approach in more detail by following Gentzkow (2004). Gentzkow generalizes the above model by parameterizing the marginal utility from each potential bundle of goods. Essentially, he estimates the mean utility for each good (relative to the outside good), as well as the mean utility from each possible bundle. This approach is best understood in the context of the following example. Suppose that there are two goods: coffee and cream. Then at time $t$, household $i$ will receive indirect utility

$$u_{it}(\text{coffee}) = \Gamma_{\text{coffee}} + \varepsilon_{i,t,\text{coffee}}$$

from choosing coffee,

$$u_{it}(\text{cream}) = \Gamma_{\text{cream}} + \varepsilon_{i,t,\text{cream}}$$

from choosing cream,

$$u_{it}(\text{coffee \& cream}) = \Gamma_{\text{coffee \& cream}} + \varepsilon_{i,t,\text{coffee \& cream}}$$

from choosing coffee and cream (where, in general, $\Gamma_{\text{coffee \& cream}} \ne \Gamma_{\text{coffee}} + \Gamma_{\text{cream}}$), or

$$u_{it}(\text{outside good}) = \varepsilon_{i,t,0}$$

from choosing the outside good, where the $\Gamma$’s are all estimated parameters, and the $\varepsilon$’s are all i.i.d. extreme value (Gumbel) distributed.

This approach has several significant drawbacks in this setting. The first problem with applying Gentzkow’s model in this setting is that it cannot handle large numbers of products, because the number of parameters in the model increases exponentially with the number of products. In the above example, there are three parameters: $\Gamma_{\text{coffee}}$, $\Gamma_{\text{cream}}$, and $\Gamma_{\text{coffee \& cream}}$. If we add another alternative, say sugar, then we add four(!) additional parameters: $\Gamma_{\text{sugar}}$, $\Gamma_{\text{coffee \& sugar}}$, $\Gamma_{\text{cream \& sugar}}$, and $\Gamma_{\text{coffee \& cream \& sugar}}$. This problem becomes even more severe when, as in this case, consumers make non-binary choices. Suppose I could buy two coffees. Or twenty. The parameter space explodes.

The second problem with applying Gentzkow’s model is that it requires that the econometrician specify exactly how many choices each household makes. This is fine in his application, where the choice is the weekly decision to subscribe to a newspaper or not, but it is difficult to say how many times you have made the choice between having a coffee (with or without cream) in any reasonable time frame. Once? Twice? Zero? One might think that it would be possible to make the number of choices depend upon the household’s actual expenditure and the price index via a budget constraint. But this means that the number of choices depends on the price level. This serves to expose an underlying problem with the logit model of demand.
It predicts that all goods will be purchased with positive probability regardless of their price. Hendel’s (1999) solution to this problem is to assume that the


number of choices made follows a Poisson distribution whose mean varies with observable characteristics of the household (or firm in his case).

The third problem with Gentzkow’s model in this setting is that even if we deal with non-binary choice sets by placing restrictions on the parameters to reduce their number, it is not possible to place restrictions on the distribution of the error terms of the bundles. Returning to the 2-good example, $\varepsilon_{i,t,\text{coffee}}$ and $\varepsilon_{i,t,\text{cream}}$ are distributed independently of $\varepsilon_{i,t,\text{coffee \& cream}}$ (as well as of each other). Allowing for the choice of two coffees ({coffee & coffee}) would give us:

$$u_{it}(\text{coffee \& coffee}) = \Gamma_{\text{coffee \& coffee}} + \varepsilon_{i,t,\text{coffee \& coffee}}$$

where $\varepsilon_{i,t,\text{coffee}}$ is again distributed independently of $\varepsilon_{i,t,\text{coffee \& coffee}}$. These independence assumptions are justified only by the computational convenience that they deliver: closed-form solutions for the choice probabilities.

Although Hendel does not address the complementarity issue, he allows for continuous choice by essentially assuming that the utility from the non-price product characteristics is concave in quantity. Hence, while the traditional logit assumes that the mean utility from alternative $j$ with characteristics vector $X_j$ and price $p_j$ is $u_j = \beta X_j$, Hendel proposes $u_j = (\beta X_j q_j)^{\gamma} - \alpha p_j$, where $q_j$ is the quantity of good $j$ consumed, and $\gamma$ is a curvature parameter. Chan (2002) also proposes a continuous choice model based on the logit, but it predicts positive consumption of all characteristics and infinite consumption of some products.

Kim et al. (2002) adopt a different approach. They write down the household’s direct utility function, which they assume is additively separable in products, and solve the Kuhn-Tucker conditions. Unfortunately, estimating their model requires using (simulated) maximum likelihood, which becomes very computationally intensive when considering more than a handful of products, as it involves integrating a normal distribution whose number of dimensions equals the number of products in the choice space.

My model is a hybrid between that of Kim et al. (2002) and that of Chan (2002). I improve on Chan by explicitly solving for the household’s budget constraint, by extending it to UPC-level demand, and by modelling the panel aspect of the data. The non-linearity also allows me to use a more flexible matrix of characteristics, with more characteristics than products. I also use physical characteristics, rather than brand-level (as in Chan) or product-level (as


in Kim et al. (2002)) dummy variables.
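The bundle-choice logit discussed earlier in this section can be checked by brute-force enumeration. The sketch below is illustrative only: it assumes quantities are discretized to whole units, and the function names (`bundles`, `expected_q1`, `closed_form_cmax2`) and the mean utilities are hypothetical. It enumerates every bundle with $q_1 + q_2 + z = C_{\max}$, computes $E(q_1)$ from the bundle probabilities, and compares the result with the closed-form expression for $C_{\max} = 2$.

```python
import math
from itertools import product

def bundles(c_max):
    """All (q1, q2, z) of non-negative integers with q1 + q2 + z = c_max."""
    return [
        (q1, q2, c_max - q1 - q2)
        for q1, q2 in product(range(c_max + 1), repeat=2)
        if q1 + q2 <= c_max
    ]

def expected_q1(v1, v2, c_max):
    """E(q1) when each bundle has logit weight exp(V1*q1 + V2*q2)."""
    weights = {b: math.exp(v1 * b[0] + v2 * b[1]) for b in bundles(c_max)}
    denom = sum(weights.values())
    return sum(b[0] * w for b, w in weights.items()) / denom

def closed_form_cmax2(v1, v2):
    """Closed-form E(q1) for C_max = 2, written out term by term."""
    e = math.exp
    num = e(v1) + 2 * e(2 * v1) + e(v1 + v2)
    den = 1 + e(v1) + e(v2) + e(v1 + v2) + e(2 * v1) + e(2 * v2)
    return num / den

v1, v2 = 0.3, -0.2  # hypothetical mean utilities
print(len(bundles(2)), len(bundles(3)))
print(expected_q1(v1, v2, 2), closed_form_cmax2(v1, v2))
```

The enumeration also makes the scaling problem visible: the number of bundles grows quickly with $C_{\max}$ and with the number of products, which is why the econometrician has to restrict the choice set.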

2.3 The Model

To formulate a model that accounts for variety-seeking, continuous choice, and a large choice set, I begin by writing down a household-level direct utility function over bundles of goods. This section describes the model in detail.

In order to reduce the dimensionality of the parameter space, I use a hedonic approach and assume that, as in the logit model, households derive utility from product characteristics. This means that each product $j \in J$ is completely described by a vector of characteristics of length $C$.⁴ The menu faced by the household can be represented by a $J \times C$ matrix $A$, where the rows of $A$ are the products and the columns are characteristics. Hence, the $A$ matrix can be thought of as a stacked matrix consisting of the $X_j$’s from the logit model, although, unlike the logit model, the $A$ matrix can have more characteristics than products ($C > J$).

More concretely, consider an example with two characteristics (Diet and Cola) and two available products (Diet Coke and Diet 7Up). Because it is a diet cola, Diet Coke has characteristic vector [1 1].⁵ As a diet non-cola, Diet 7Up has characteristic vector [1 0]. Stacking these two products’ characteristic vectors yields the following $A$ matrix:

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$$

⁴ Abusing notation, I use $J$ to mean both the set of products and the number of elements in that set.

⁵ Note that although in this case the product characteristics are indicator variables, in general they need only be non-negative. For example, in the estimated model one of the characteristics is the number of milligrams of caffeine per 12-ounce serving.

Again following the logit, I assume that a household’s utility function is additively separable in these characteristics. Unlike the logit, however, I allow households to consume multiple units of a single product, as well as several different products. In particular, I assume that household $i$ myopically maximizes the utility function:

$$U_{it}(q_{it}, z_{it}) = \sum_{c \in C} \beta_c (A_c' q_{it} + 1)^{\rho_c} + \varepsilon_{it}' q_{it} + z_{it} \tag{2.1}$$

with respect to $q_{it}$ and $z_{it}$. $A_c$ is the $c$th column of $A$, $q_{it}$ is a column vector of length $J$ comprising household $i$’s purchases at time $t$ of the $J$ goods described by $A$, $z_{it}$ is the amount of outside good consumed, and $\beta_c$ and $\rho_c$ are the characteristic-specific scalar components of $\beta$ and $\rho$. The $J$-dimensional vector $\varepsilon_{it}$ represents the household/shopping-occasion marginal utility shocks, which are observed by the utility-maximizing household, but not by the econometrician. The household maximizes this utility function subject to the budget constraint:

$$\sum_{j \in J} p_{jt} q_{itj} + z_{it} \le w_{it} \tag{2.2}$$

where $w_{it}$ is the household’s total grocery expenditure in week $t$. The imposition of the budget constraint is a significant difference between my model and logit-based models. Since the logit assumes that the household is constrained by making a fixed number of choices, rather than by a budget constraint, the logit model places positive probability on the household purchasing any particular product, regardless of its price. This conflicts with standard economic theory. By contrast, my model assigns zero probability to the household purchasing products once their prices pass a certain threshold.

I assume that $\varepsilon_{it}$ is i.i.d. across products, time, and households, and negatively lognormally distributed on the interval $(-\infty, 0)$. These error terms represent purchase-occasion specific idiosyncratic taste shocks for particular items that are observed by the household, but not by the econometrician, such as a sudden aversion to consuming Diet Coke. In principle, these error terms may be correlated across products within a purchase occasion. This could stem from characteristic-specific shocks, such as a sudden preference for low calorie or diet beverages. Additionally, the shocks may be heteroscedastic, either across time or, more likely, across products. However, for simplicity (and to reduce the number of parameters required to estimate the model), I restrict these shocks to be i.i.d., and assume they follow a negative standard log-normal distribution.

These taste shocks are bounded from above because I assume that the household does not satiate on the idiosyncratic taste characteristic. The taste shocks can be thought of as a log-normally distributed product and purchase-occasion specific characteristic with parameter values $\beta = -1$ and $\rho = 1$. I make these parametric restrictions because my data do not contain sufficient variation to identify them.
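The shock distribution described above is easy to simulate. The following is a minimal sketch, assuming that a negative log-normal shock can be written as $\varepsilon = -\exp(\mu + \sigma Z)$ with $Z$ standard normal; the function name `draw_taste_shock`, the seed, and the quantity used in the example are hypothetical. With $\mu = 0$ and $\sigma = 1$ this corresponds to the negative standard log-normal case.

```python
import random

def draw_taste_shock(rng, mu=0.0, sigma=1.0):
    """Draw epsilon = -exp(mu + sigma * Z), so epsilon lies in (-inf, 0)."""
    return -rng.lognormvariate(mu, sigma)

rng = random.Random(0)  # fixed seed so the draws are reproducible
shocks = [draw_taste_shock(rng) for _ in range(1000)]

quantity = 2.0  # hypothetical number of units of one product
utility_terms = [e * quantity for e in shocks]  # the eps_j * q_j term in (2.1)
print(max(shocks) < 0.0, max(utility_terms) < 0.0)
```

Every draw is strictly negative, so the shock term in (2.1) always lowers the utility of an additional unit. This is the sense in which the shocks are bounded from above.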
As shown in the next section, the result of these restrictions is that $p_j - \varepsilon_j$ is the effective price of product $j$ to the household. If this effective price were negative, the household would inelastically spend its entire budget on good $j$. This behavior is not reasonable. Hence, it is necessary to


bound $\varepsilon_{it}$ from above in order to prevent unreasonable choice behavior. If, for example, the realization of $\varepsilon_{itj}$ is greater than the price of good $j$, a household may never consume the outside good on that purchase occasion, regardless of the level of $w_{it}$. Therefore, to avoid these negative effective prices I must, at a minimum, have $\varepsilon_j < p_j$; that is, $-\infty < \varepsilon_j < p_j$. However, because it makes little theoretical sense to have the support of the distribution of the error terms depend upon prices, I impose the restriction $\varepsilon_j < 0$. Within the class of distributions having one-sided support, the log-normal distribution is desirable because it offers flexible variation with two parameters (although I do not exploit this flexibility here).

The $\rho$ parameters introduce the nonlinearity that is key to the model’s ability to capture households’ continuous choice behavior and potential preference for variety. Unfortunately, because of this nonlinearity, it is not possible to assign clear-cut interpretations to the parameters as one can do with a linear model. With this caveat, I provide some economic intuition for these parameters. $\beta_c \rho_c$ is the maximum marginal utility the household can receive from characteristic $c$, and hence is a measure of the household’s preference for a particular characteristic (similar to the familiar coefficients on characteristics in the logit model).⁶ At the most basic level, the sign of $\beta_c$ determines whether the household considers characteristic $c$ to be a good or a bad. Positive $\beta_c$’s imply that the household receives positive marginal utility from that characteristic, while negative $\beta_c$’s imply negative marginal utility from that characteristic. At a secondary level, the magnitude of $\beta_c$ determines the strength of a household’s preferences for a particular characteristic.⁷ However, as seen in the next section, the impact of $\beta_c$ is affected in a nonlinear way by the satiation rate, $\rho_c$. The parameter $\rho_c$ is the satiation rate for that characteristic: it determines the rate at which the marginal utility for characteristic $c$ changes. Values of $\rho_c$ closer to one mean that the household’s marginal utility from additional units of that characteristic changes more slowly. Also, in addition to increasing $\beta_c \rho_c$, changes in $\rho_c$ affect the behavior of households’ demand curves as prices approach zero. For this reason, throughout much of the analysis in this section when I vary $\rho_c$, I hold $\beta_c \rho_c$ constant.

⁶ When $\beta_c > 0$ (and hence $0 < \rho_c < 1$), marginal utility is positive, but decreasing. When $\beta_c < 0$ (and hence $1 < \rho_c$), marginal utility is both negative and decreasing.

⁷ In order to capture household heterogeneity, in chapter 3 I allow the taste parameters $\beta$ to vary across households. This allows us to capture persistent household-specific preferences, such as a taste for Coke or Pepsi.

Returning to the example, if we substitute the $A$ matrix into the utility function, we can see that the household maximizes:

$$U_{it}(q_{it}, z_{it}) = \beta_{Diet} (q_{i,t,DietCoke} + q_{i,t,Diet7Up} + 1)^{\rho_{Diet}} + \beta_{Cola} (q_{i,t,DietCoke} + 1)^{\rho_{Cola}} + \varepsilon_{i,t,DietCoke}\, q_{i,t,DietCoke} + \varepsilon_{i,t,Diet7Up}\, q_{i,t,Diet7Up} + z_{it}$$

with respect to $q_{it}$ and $z_{it}$, subject to:

$$p_{t,DietCoke}\, q_{i,t,DietCoke} + p_{t,Diet7Up}\, q_{i,t,Diet7Up} + z_{it} \le w_{it}$$

Unlike the logit, which offers closed form solutions for the households’ expected purchases, this model generally does not have such simple solutions, even for particular values of $\varepsilon$. Because this model differs so substantially from familiar approaches, the remainder of the chapter will both show the choice behavior generated by the model, and demonstrate that this behavior includes both complementarities and continuous choice.
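Because the model has no simple closed form, the household’s problem in the Diet Coke and Diet 7Up example can be explored numerically. The sketch below is illustrative only: the parameter values, prices, and budget are hypothetical, and the coarse grid search in `best_bundle` is a stand-in chosen for transparency, not the solution method used in this dissertation.

```python
def utility(q, z, A, beta, rho, eps):
    """Equation (2.1): sum_c beta_c * (A_c'q + 1)**rho_c + eps'q + z."""
    total = z + sum(e * qj for e, qj in zip(eps, q))
    for c in range(len(beta)):
        amount = sum(A[j][c] * q[j] for j in range(len(q)))  # A_c'q
        total += beta[c] * (amount + 1.0) ** rho[c]
    return total

def best_bundle(A, beta, rho, eps, prices, wealth, step=0.05):
    """Grid search over two inside goods; the outside good takes the leftover budget."""
    best = None
    for i in range(int(wealth / prices[0] / step) + 1):
        q1 = i * step
        for k in range(int(wealth / prices[1] / step) + 1):
            q2 = k * step
            z = wealth - prices[0] * q1 - prices[1] * q2
            if z < 0:  # bundle violates the budget constraint (2.2)
                continue
            u = utility((q1, q2), z, A, beta, rho, eps)
            if best is None or u > best[0]:
                best = (u, q1, q2, z)
    return best

# Rows: Diet Coke, Diet 7Up. Columns: Diet, Cola.
A = [[1, 1], [1, 0]]
beta = (4.0, 2.0)    # hypothetical taste parameters
rho = (0.5, 0.5)     # hypothetical satiation rates
eps = (-1.0, -1.0)   # hypothetical (negative) taste shocks
prices = (1.0, 1.0)  # hypothetical prices
wealth = 10.0        # hypothetical grocery budget

u_best, q_coke, q_7up, z_best = best_bundle(A, beta, rho, eps, prices, wealth)
print(u_best, q_coke, q_7up, z_best)
```

With these hypothetical values, the first-order conditions put the optimum at 1.25 units of Diet Coke and none of Diet 7Up, so the example produces both a continuous quantity and a corner solution.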

2.4 Behavior of the Model When There Are Two Inside Goods

This demand model differs significantly from many of the existing demand models discussed above. For this reason, we would like to know what the demand curves and Engel curves associated with the model look like. The primary purpose of this section is to show a variety of typical demand and Engel curves from the model and to relate the features we see in these curves to the mathematics of the model. The secondary purpose is to show how these features vary as we change the parameters of the model and the characteristics of the products in the choice set, providing us with economic interpretations of the taste parameters.

In order to allow greater analytical clarity, this section deals exclusively with the case when there are two inside goods, each with two differentiating characteristics, and an outside good. When possible, I solve for the analytical solutions. However, in many cases closed-form solutions are not obtainable; in these cases I determine the effect of changes in the parameters or the $A$ matrix on the solution. The general form of the characteristic matrix with two goods and two characteristics is:

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$


With preference parameters $\beta$ and $\rho$, the household’s utility function is thus:

$$U(q_1, q_2, z) = \beta_1 (a_{11} q_1 + a_{21} q_2 + 1)^{\rho_1} + \beta_2 (a_{12} q_1 + a_{22} q_2 + 1)^{\rho_2} + \varepsilon_1 q_1 + \varepsilon_2 q_2 + z$$

The household’s objective is to choose $q_1$, $q_2$, and $z$ to maximize this utility function subject to a budget constraint:

$$p_1 q_1 + p_2 q_2 + z \le w$$

where $w$ is the household’s grocery expenditure, $q_j$ is the number of units of good $j$ consumed, $p_j$ is the price per unit of that good, and $z$ is the number of units of outside good consumed (and whose price I normalize to one). I also impose non-negativity constraints for $q_1$, $q_2$, and $z$. Assuming that the concavity conditions derived in the Appendix are satisfied, we can solve this optimization problem by forming the Lagrangian:

$$L(q_1, q_2, z, \lambda_0, \lambda_1, \lambda_2, \lambda_3) = \beta_1 (a_{11} q_1 + a_{21} q_2 + 1)^{\rho_1} + \beta_2 (a_{12} q_1 + a_{22} q_2 + 1)^{\rho_2} + \varepsilon_1 q_1 + \varepsilon_2 q_2 + z + \lambda_0 (w - p_1 q_1 - p_2 q_2 - z) + \lambda_1 q_1 + \lambda_2 q_2 + \lambda_3 z$$

The mechanics of the solutions to this Lagrangian discussed in this section are derived in the Appendix.
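For reference, the first-order and complementary slackness conditions implied by this Lagrangian can be sketched as follows. This is a direct differentiation of the expression above under the usual sign convention that all multipliers are non-negative; the Appendix may organize the cases differently.

```latex
\begin{align*}
\frac{\partial L}{\partial q_1} &= \beta_1 \rho_1 a_{11} (a_{11} q_1 + a_{21} q_2 + 1)^{\rho_1 - 1}
  + \beta_2 \rho_2 a_{12} (a_{12} q_1 + a_{22} q_2 + 1)^{\rho_2 - 1}
  + \varepsilon_1 - \lambda_0 p_1 + \lambda_1 = 0 \\
\frac{\partial L}{\partial q_2} &= \beta_1 \rho_1 a_{21} (a_{11} q_1 + a_{21} q_2 + 1)^{\rho_1 - 1}
  + \beta_2 \rho_2 a_{22} (a_{12} q_1 + a_{22} q_2 + 1)^{\rho_2 - 1}
  + \varepsilon_2 - \lambda_0 p_2 + \lambda_2 = 0 \\
\frac{\partial L}{\partial z} &= 1 - \lambda_0 + \lambda_3 = 0 \\
0 &= \lambda_0 (w - p_1 q_1 - p_2 q_2 - z) = \lambda_1 q_1 = \lambda_2 q_2 = \lambda_3 z,
  \qquad \lambda_0, \lambda_1, \lambda_2, \lambda_3 \ge 0
\end{align*}
```

The third condition gives $\lambda_0 = 1 + \lambda_3 \ge 1$, so the budget constraint always binds. When the household consumes some of the outside good ($z > 0$, hence $\lambda_3 = 0$ and $\lambda_0 = 1$) and buys a positive amount of good one ($\lambda_1 = 0$), the first condition says that the marginal utility good one delivers through its characteristics equals $p_1 - \varepsilon_1$.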

2.4.1 Demand Curves

As mentioned earlier, because of the high degree of nonlinearity present in the household’s demand and indirect utility functions, it is not possible to assign clear-cut interpretations to the parameters in my model as one can do with the logit model. The purpose of this subsection is to use the demand curves generated by the model to gain insight into both the characteristics of the model and the way in which the parameters affect the choice behavior it generates.

This subsection begins by showing a series of typical demand curves generated by the model. These curves have three main features. First, although difficult to fully appreciate in the two-good case, the model generates a large number of corner solutions. This should not be surprising, since this behavior is one of the key features of the data that the model was intended to match. Second, the demand curves are usually at least somewhat kinked at the point where the household shifts from substituting between goods of differing marginal utilities. This is a consequence of my method of introducing corner solutions. Third, the household-level own- and cross-price elasticities exhibit a large amount of variation with respect to prices. In particular, many households have own-price and cross-price elasticities of zero over substantial regions of the price space.

After covering these typical demand curves, I show the effects of changes in the parameters β and ρ, as well as the effects of changes in A, the product characteristics matrix, as it moves between a diagonal and a row-dependent matrix. In brief, βρ has the largest effect upon the demand curve, while ρ has the greatest impact on demand when prices are very low.

Figure 2.1 shows some typical demand curves for the two-good model when A = [1 0.5; 0.5 1].

The left half of the figure contains four different own-price demand curves for good one, each from a different combination of ε1 and ε2, the idiosyncratic product-specific shocks. The right half of the figure graphs the corresponding demand curves for good two as a function of the price of good one. These four curves correspond to the potential combinations of the 40th and 60th percentile values of the distributions of ε1 and ε2, which are approximately -1.49 and -1.82. Table 2.1 lists the (ε1, ε2) pairs corresponding to the curves in Figures 2.1-2.7. I use these same values in all the demand curve graphs in this subsection, as well as the Engel curves shown in the following subsection.

Table 2.1: Pairs of (ε1, ε2) corresponding to the demand curves in Figures 2.1-2.7

Curve    ε1       ε2
①        −1.49    −1.49
②        −1.49    −1.82
③        −1.82    −1.49
④        −1.82    −1.82

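With this in mind, the interior solutions (when all three goods are consumed) are:

\[
\hat{q}_1 = \frac{a_{22}\left[\dfrac{(p_1-\varepsilon_1)a_{22}-(p_2-\varepsilon_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22}-a_{21}a_{12})}\right]^{\frac{1}{\rho_1-1}} - a_{21}\left[\dfrac{(p_2-\varepsilon_2)a_{11}-(p_1-\varepsilon_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22}-a_{21}a_{12})}\right]^{\frac{1}{\rho_2-1}} - a_{22} + a_{21}}{a_{11}a_{22}-a_{21}a_{12}} \tag{2.3}
\]

\[
\hat{q}_2 = \frac{a_{11}\left[\dfrac{(p_2-\varepsilon_2)a_{11}-(p_1-\varepsilon_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22}-a_{21}a_{12})}\right]^{\frac{1}{\rho_2-1}} - a_{12}\left[\dfrac{(p_1-\varepsilon_1)a_{22}-(p_2-\varepsilon_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22}-a_{21}a_{12})}\right]^{\frac{1}{\rho_1-1}} - a_{11} + a_{12}}{a_{11}a_{22}-a_{21}a_{12}} \tag{2.4}
\]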

Figure 2.1: Demand for good one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. A = [1 0.5; 0.5 1], β1 = 5.5, β2 = 5, ρ1 = ρ2 = 0.5, p2 = $0.50, w = 60.
[Two panels: q1(p1 | p2 = $0.50) on the left and q2(p1 | p2 = $0.50) on the right. Vertical axis: Price of Good 1; horizontal axes: q1 and q2. Curves are numbered 1-4 as in Table 2.1.]

\[
\hat{z} = w - p_1 q_1 - p_2 q_2 \tag{2.5}
\]

The ˆ’s are to remind the reader that these can be considered candidate solutions. In many cases we will not have an interior solution. This will occur if either of the terms in brackets is negative, or if q̂1 < 0, q̂2 < 0, or ẑ < 0. I discuss each of these possibilities below.

In Figure 2.1, households are at interior solutions on those portions of the curves where both q1 > 0 and q2 > 0. For example, the portion of curve ① covering roughly 0.2 < p1 < 0.54 is at an interior solution. Plugging values into the above equations tells us that the relevant demand equations for this segment of the demand curve are (holding p2 constant at 0.5):

\[
q_1 \approx -9.38\,(2.49 - p_1)^{-2} + 5.67\,(p_1 + 0.5)^{-2} - 0.667 \tag{2.6}
\]

\[
q_2 \approx 18.75\,(2.49 - p_1)^{-2} - 2.84\,(p_1 + 0.5)^{-2} - 0.667 \tag{2.7}
\]
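The candidate interior solution in Equations 2.3-2.5 is straightforward to evaluate numerically. The sketch below is one way to do so in Python; the function name `interior_candidate` and its argument layout are assumptions made for illustration, and returning `None` when a bracketed term is non-positive is a simplification of the dominance cases discussed later.

```python
def interior_candidate(p1, p2, w, A, beta, rho, eps):
    """Candidate interior solution (q1_hat, q2_hat, z_hat), Equations 2.3-2.5.

    Returns None when either bracketed term is non-positive, in which case
    no interior candidate exists (illustrative convention).
    """
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a21 * a12
    # The two bracketed terms of Equations 2.3 and 2.4.
    x1 = ((p1 - eps[0]) * a22 - (p2 - eps[1]) * a12) / (beta[0] * rho[0] * det)
    x2 = ((p2 - eps[1]) * a11 - (p1 - eps[0]) * a21) / (beta[1] * rho[1] * det)
    if x1 <= 0.0 or x2 <= 0.0:
        return None
    c1 = x1 ** (1.0 / (rho[0] - 1.0))
    c2 = x2 ** (1.0 / (rho[1] - 1.0))
    q1 = (a22 * c1 - a21 * c2 - a22 + a21) / det
    q2 = (a11 * c2 - a12 * c1 - a11 + a12) / det
    z = w - p1 * q1 - p2 * q2
    return q1, q2, z


if __name__ == "__main__":
    A = ((1.0, 0.5), (0.5, 1.0))
    # Curve 1 of Table 2.1 with the Figure 2.1 parameters.
    print(interior_candidate(0.4, 0.5, 60.0, A, (5.5, 5.0), (0.5, 0.5),
                             (-1.49, -1.49)))
```

With the Figure 2.1 parameters and ε1 = ε2 = −1.49, the coefficients implied by this function agree with the rounded coefficients in Equations 2.6 and 2.7 (for example, a22 (β1ρ1 det A)² / det A ≈ 5.67), up to the rounding of 0.495 to 0.5 inside the second term.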

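While the household is at an interior solution, the own-price elasticity of good one is:

\[
\frac{\partial q_1}{\partial p_1}\frac{p_1}{q_1} = \frac{-p_1}{q_1 (\det A)^2}\left\{ \frac{a_{22}^2}{\beta_1\rho_1(1-\rho_1)}\left[\frac{(p_1-\varepsilon_1)a_{22}-(p_2-\varepsilon_2)a_{12}}{\beta_1\rho_1\det A}\right]^{\frac{2-\rho_1}{\rho_1-1}} + \frac{a_{21}^2}{\beta_2\rho_2(1-\rho_2)}\left[\frac{(p_2-\varepsilon_2)a_{11}-(p_1-\varepsilon_1)a_{21}}{\beta_2\rho_2\det A}\right]^{\frac{2-\rho_2}{\rho_2-1}} \right\} \tag{2.8}
\]

and the cross-price elasticity is:

\[
\frac{\partial q_1}{\partial p_2}\frac{p_2}{q_1} = \frac{p_2}{q_1 (\det A)^2}\left\{ \frac{a_{22}a_{12}}{\beta_1\rho_1(1-\rho_1)}\left[\frac{(p_1-\varepsilon_1)a_{22}-(p_2-\varepsilon_2)a_{12}}{\beta_1\rho_1\det A}\right]^{\frac{2-\rho_1}{\rho_1-1}} + \frac{a_{21}a_{11}}{\beta_2\rho_2(1-\rho_2)}\left[\frac{(p_2-\varepsilon_2)a_{11}-(p_1-\varepsilon_1)a_{21}}{\beta_2\rho_2\det A}\right]^{\frac{2-\rho_2}{\rho_2-1}} \right\} \tag{2.9}
\]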
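As a sketch of where the own-price term comes from, one can differentiate Equation 2.3 with the chain rule. The shorthand X1 and X2 for the two bracketed terms is introduced here only for convenience and is not notation used elsewhere in the text.

\[
X_1 = \frac{(p_1-\varepsilon_1)a_{22}-(p_2-\varepsilon_2)a_{12}}{\beta_1\rho_1\det A}, \qquad
X_2 = \frac{(p_2-\varepsilon_2)a_{11}-(p_1-\varepsilon_1)a_{21}}{\beta_2\rho_2\det A}, \qquad
\hat{q}_1 = \frac{a_{22}X_1^{\frac{1}{\rho_1-1}} - a_{21}X_2^{\frac{1}{\rho_2-1}} - a_{22} + a_{21}}{\det A}
\]

\[
\frac{\partial X_1}{\partial p_1} = \frac{a_{22}}{\beta_1\rho_1\det A}, \qquad
\frac{\partial X_2}{\partial p_1} = -\frac{a_{21}}{\beta_2\rho_2\det A}
\]

\[
\frac{\partial \hat{q}_1}{\partial p_1}
= \frac{1}{\det A}\left[\frac{a_{22}}{\rho_1-1}X_1^{\frac{2-\rho_1}{\rho_1-1}}\frac{\partial X_1}{\partial p_1} - \frac{a_{21}}{\rho_2-1}X_2^{\frac{2-\rho_2}{\rho_2-1}}\frac{\partial X_2}{\partial p_1}\right]
= -\frac{1}{(\det A)^2}\left[\frac{a_{22}^2}{\beta_1\rho_1(1-\rho_1)}X_1^{\frac{2-\rho_1}{\rho_1-1}} + \frac{a_{21}^2}{\beta_2\rho_2(1-\rho_2)}X_2^{\frac{2-\rho_2}{\rho_2-1}}\right]
\]

Assuming positive β's and 0 < ρ < 1, both terms in the final brackets are positive whenever X1 and X2 are positive, so the own-price derivative is negative at an interior solution. Multiplying by p1/q1 gives the elasticity.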

Note several things about these demand equations and the elasticities they imply. First, at interior solutions, the household’s expenditure level w does not enter the demand function for the inside goods, making the compensated and uncompensated demand elasticities equal. This is because at the interior, all three goods are purchased, and any additional wealth will be spent on the outside good. Second, the cross-price elasticity in this case (two goods and two characteristics) is always positive. This is specific to the two-inside-good case, and is due to the adding-up constraint. Third, when the A matrix is diagonal, the cross-price elasticities are all zero (at the interior solutions).

Since I only show demand curves here for the inside goods, a brief word is in order about demand for the outside good. Some households with very low expenditure levels (small w), or particularly strong preferences for inside goods (very high β or ρ, or ε1 or ε2 very small in magnitude) may find it optimal not to purchase any of the outside good. These households spend all of their grocery expenditure on the inside goods. While I will consider these cases later in this section when I discuss the Engel curves generated by the model, all of the demand curves in this subsection have a sufficient expenditure level ($60) that some outside good is always purchased.

Returning to the example of curve ① in Figure 2.1, as p1 increases above about 0.54, the household with demand curve ① reaches a corner solution for good one. At this point, q̂1 < 0, ẑ > 0, and the household’s demand function for good two becomes:

\[
q_1 = 0, \quad z = w - p_2 q_2, \quad \text{and} \quad q_2 = \max\left\{0, \min\left\{q_2^*, \frac{w}{p_2}\right\}\right\} \tag{2.10}
\]

where q2∗ is the solution to the equation:⁸

\[
\frac{a_{21}\beta_1\rho_1(a_{21}q_2^* + 1)^{\rho_1-1} + a_{22}\beta_2\rho_2(a_{22}q_2^* + 1)^{\rho_2-1} + \varepsilon_2}{p_2} = 1 \tag{2.11}
\]

Note that p1 does not enter Equation 2.10. This means that once a household’s demand for a good goes to zero, the cross-price elasticity between that good and all other goods is zero. This is the reason that the demand curve shown on the right, which plots q2 as a function of p1, becomes vertical once p1 > 0.54. A similar effect occurs on the initial portion of curve ①, up to the point where q̂2 moves away from zero and the household switches from consuming only good one to also consuming good two. In this range, covering roughly 0 < p1 < 0.2, we have ẑ > 0 and q̂2 < 0, which yields the demand curve

\[
q_2 = 0, \quad z = w - p_1 q_1, \quad \text{and} \quad q_1 = \max\left\{0, \min\left\{q_1^*, \frac{w}{p_1}\right\}\right\} \tag{2.12}
\]

where q1∗ is the solution to the equation:

\[
\frac{a_{11}\beta_1\rho_1(a_{11}q_1^* + 1)^{\rho_1-1} + a_{12}\beta_2\rho_2(a_{12}q_1^* + 1)^{\rho_2-1} + \varepsilon_1}{p_1} = 1 \tag{2.13}
\]
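Since Equations 2.11 and 2.13 have no closed form in general, the corner demands have to be found numerically. The following Python sketch uses bisection, which is an assumption made here for illustration (the text does not say which root-finder is used). It relies on the marginal utility being decreasing in the quantity, which holds for positive β's and 0 < ρ < 1; the function name `corner_demand` is hypothetical.

```python
def corner_demand(p, w, a_own, beta, rho, eps, tol=1e-10):
    """Demand for the one inside good purchased at a corner solution.

    a_own = (units of characteristic one, units of characteristic two)
    contained in one unit of the good; eps is that good's shock.
    Implements q = max{0, min{q*, w/p}} from Equations 2.10 and 2.12.
    """
    def net_marginal_utility(q):
        mu = sum(a * b * r * (a * q + 1.0) ** (r - 1.0)
                 for a, b, r in zip(a_own, beta, rho))
        return mu + eps - p

    if net_marginal_utility(0.0) <= 0.0:
        return 0.0                      # never worth buying the first unit
    q_max = w / p
    if net_marginal_utility(q_max) >= 0.0:
        return q_max                    # budget exhausted before q* is reached
    lo, hi = 0.0, q_max
    while hi - lo > tol:                # bisection on a decreasing function
        mid = 0.5 * (lo + hi)
        if net_marginal_utility(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Good two when q1 = 0, using row two of A = [1 0.5; 0.5 1].
    print(corner_demand(0.5, 60.0, (0.5, 1.0), (5.5, 5.0), (0.5, 0.5), -1.49))
```

Note that the price of the other good does not appear anywhere in this function, which is the numerical counterpart of the zero cross-price elasticity at a corner.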

Again, note that once q̂2 < 0, the cross-price elasticity between goods one and two is zero. Before moving on, the last feature to recognize about the model is that (although the behavior is not shown in these figures) it is possible that the household will never want to consume any good one (or, symmetrically, good two), even when its price is zero. This is because the household may receive a small negative realization of ε2 and a large negative realization of ε1. This makes the household unwilling to purchase good one, even at a price of zero.

Having covered the basic characteristics of the demand curves, we’d like to know how changes in the parameters lead to changes in the demand curves. These changes are clearest when the product characteristic matrix is the identity matrix, so we’ll consider that case first, before returning to the non-diagonal case.

⁸ Note that the concavity conditions guarantee that this solution, if positive, is unique.

Figure 2.2: Demand for good one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. A = [1 0; 0 1], β1 = 5.5, β2 = 5, ρ1 = ρ2 = 0.5, p2 = $0.50, w = 60. Compare to Figures 2.3 and 2.4.
[Two panels: q1(p1 | p2 = $0.50) on the left and q2(p1 | p2 = $0.50) on the right. Vertical axis: Price of Good 1; horizontal axes: q1 and q2.]

When the product characteristics matrix is diagonal, the characteristics are product-specific. A real-world example of this would be if the available goods were: food, shelter, and a composite outside good. One implication of this is that the household’s utility function is additively separable in the two goods, which means that changes in the price of good two can only affect demand for good one (and vice versa) through the budget constraint. Figure 2.2 shows some typical demand curves when the A matrix is diagonal. Even though these graphs each contain four different demand curves, just like Figure 2.1, in both the left and right panes each curve has another curve stacked on top of it. This is because in this case, Equations 2.3 and 2.4 reduce to:

\[
\hat{q}_1 = \left(\frac{p_1 - \varepsilon_1}{\beta_1\rho_1}\right)^{\frac{1}{\rho_1-1}} - 1 \tag{2.14}
\]

\[
\hat{q}_2 = \left(\frac{p_2 - \varepsilon_2}{\beta_2\rho_2}\right)^{\frac{1}{\rho_2-1}} - 1 \tag{2.15}
\]

Hence, as long as the budget constraint is not binding on the interior goods (i.e., p1 q̂1 + p2 q̂2 ≤ w), the cross-price elasticities in this case are zero for all combinations of p1 and p2.

Figure 2.3: Demand for good one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. A = [1 0; 0 1], β1 = 6.875, β2 = 5, ρ1 = 0.4, ρ2 = 0.5, p2 = $0.50, w = 60. Compare to Figures 2.2 and 2.4.
[Two panels: q1(p1 | p2 = $0.50) on the left and q2(p1 | p2 = $0.50) on the right. Vertical axis: Price of Good 1; horizontal axes: q1 and q2.]

Figure 2.3 is identical to Figure 2.2, except that ρ1 has decreased from 0.50 to 0.40, and β1 has increased from 5.5 to 6.875 in order to hold their product, β1 ρ1 , constant. The reason I hold β1 ρ1 constant is to isolate the second-derivative effect from the first-derivative effect (that is, isolate the rate of change in the marginal utility from the maximal marginal utility).


The most noticeable difference between the two graphs is that the quantity demanded of good one at p1 = 0 decreases substantially, a change from roughly 2.4 units to 1.75 units for the 40th percentile value of ε1, with a similar change for the 60th percentile. At the same time, there is very little change in the reservation price, the price at which demand for good one goes to zero. This will make it relatively difficult to empirically estimate ρ precisely as distinct from βρ without observing demand when prices are close to zero.

Figure 2.4: Demand for good one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. A = [1 0; 0 1], β1 = β2 = 5, ρ1 = ρ2 = 0.5, p2 = $0.50, w = 60. Compare to Figures 2.2 and 2.3.
[Two panels: q1(p1 | p2 = $0.50) on the left and q2(p1 | p2 = $0.50) on the right. Vertical axis: Price of Good 1; horizontal axes: q1 and q2.]
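The comparison between Figures 2.2 and 2.3 at p1 = 0 can be reproduced directly from Equation 2.14. The Python sketch below is illustrative only: the function name `diagonal_demand` is an assumption, and it ignores the budget constraint, which does not bind at w = 60 for these parameter values.

```python
def diagonal_demand(p, beta, rho, eps):
    """Demand for an inside good when A is the identity matrix (Eq. 2.14).

    Valid only when p - eps > 0; the budget constraint is ignored
    (an assumption that holds when w is large, as in the demand figures).
    """
    ratio = (p - eps) / (beta * rho)
    if ratio <= 0.0:
        raise ValueError("Equation 2.14 requires p - eps > 0")
    return max(0.0, ratio ** (1.0 / (rho - 1.0)) - 1.0)


if __name__ == "__main__":
    eps1 = -1.49
    # Figure 2.2 parameters versus Figure 2.3 parameters, same beta*rho.
    print(diagonal_demand(0.0, 5.5, 0.5, eps1))
    print(diagonal_demand(0.0, 6.875, 0.4, eps1))
```

Under Equation 2.14 demand reaches zero when p1 − ε1 = β1ρ1, so holding β1ρ1 fixed leaves that zero-demand price unchanged while the quantity at p1 = 0 moves with ρ1.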

Contrast this with what occurs when we decrease β1 from 5.5 to 5.0, as in Figure 2.4. In this case, there is also some movement in the behavior of the demand curve as p1 goes to zero. But more significant is the effect on the reservation price for good one, which decreases by nearly a third for the 60th percentile group of ε1, with a similar decrease for the 40th percentile as well. To see why changes to β1 have a much larger effect than changes to ρ1 in determining the reservation price, consider the following condition governing the reservation prices of the two goods. Since good one is the sole source of characteristic one, there are only two constraints. First, the cost of marginal utility from consuming good one must be less than the cost of marginal utility from consuming the outside good. Second, good one must not be “crowded out” by good two if the household’s budget is not great enough to buy any outside good. Mathematically, this means that good one will be consumed as long as:

\[
\frac{\beta_1\rho_1 a_{11}}{p_1} > \max\left\{1, \frac{\beta_2\rho_2 a_{22}\left(a_{22}\frac{w}{p_2} + 1\right)^{\rho_2-1}}{p_2}\right\} \tag{2.16}
\]

In this case, increasing β1, ρ1, or a11 linearly increases the reservation price. For example, with ρ1 = 0.5 and a11 = 1, an increase in β1 of 0.5 raises the reservation price by $0.25. If the budget constraint is binding for the interior goods, then the second term in braces is greater than one, and increasing β2, ρ2, or a22 decreases the reservation price of good one. An increase in β2 of 10% would decrease the reservation price for good one by roughly 9%.

By contrast, changes in ρ1 have a much larger effect on q1 than changes in β1 when p1 is close to zero. The reason for this can be seen by evaluating Equation 2.14 at p1 = 0, and comparing the derivatives with respect to β1 and ρ1 (holding β1ρ1 constant):

\[
\frac{\partial q_1}{\partial \beta_1}(p_1 = 0) = \left(\frac{\beta_1\rho_1}{-\varepsilon_1}\right)^{\frac{1}{1-\rho_1}} \cdot \beta_1^{-1} \cdot \frac{1}{1-\rho_1} \tag{2.17}
\]

\[
\frac{\partial q_1}{\partial \rho_1}(p_1 = 0) = \left(\frac{\beta_1\rho_1}{-\varepsilon_1}\right)^{\frac{1}{1-\rho_1}} \cdot \log\left(\frac{\beta_1\rho_1}{-\varepsilon_1}\right) \cdot (1-\rho_1)^{-2} \tag{2.18}
\]

When evaluated at the parameter values β1 = 5, ρ1 = 0.5, a change in ρ (holding βρ constant) has an effect more than seven times larger. If βρ is not held constant, the effect of a change in ρ is even greater.

Returning to the original case, with A = [1 0.5; 0.5 1], the parameters β and ρ have similar effects as above, as can be seen in Figures 2.5 and 2.6, which respectively show the effects of decreasing β1 and of decreasing ρ1 while holding β1ρ1 constant. Unlike the case with a diagonal A matrix, however, here the products overlap significantly. This means that much less simplification is possible, and as a result, interpretation becomes more difficult. In general, demand for good one is positive as long as:

\[
a_{22}\left[\frac{(p_1-\varepsilon_1)a_{22}-(p_2-\varepsilon_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22}-a_{21}a_{12})}\right]^{\frac{1}{\rho_1-1}} - a_{21}\left[\frac{(p_2-\varepsilon_2)a_{11}-(p_1-\varepsilon_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22}-a_{21}a_{12})}\right]^{\frac{1}{\rho_2-1}} - a_{22} + a_{21} > 0 \tag{2.19}
\]

However, if p2 < min{β1ρ1a21 + β2ρ2a22, p1 a22/a12}, we also require the second condition that:

\[
p_1 < p_2\,\frac{a_{11}\beta_1\rho_1\left(a_{21}\frac{w}{p_2}+1\right)^{\rho_1-1} + a_{12}\beta_2\rho_2\left(a_{22}\frac{w}{p_2}+1\right)^{\rho_2-1}}{a_{21}\beta_1\rho_1\left(a_{21}\frac{w}{p_2}+1\right)^{\rho_1-1} + a_{22}\beta_2\rho_2\left(a_{22}\frac{w}{p_2}+1\right)^{\rho_2-1}} \tag{2.20}
\]

Figure 2.5: Demand for good one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. A = [1 0.5; 0.5 1], β1 = β2 = 5, ρ1 = ρ2 = 0.5, p2 = $0.50, w = 60. Compare to Figure 2.1.
[Two panels: q1(p1 | p2 = $0.50) on the left and q2(p1 | p2 = $0.50) on the right. Vertical axis: Price of Good 1; horizontal axes: q1 and q2.]

The first condition just says that q̂1 > 0. The second is a condition that assures that it is “cheaper” (loosely defined) to buy characteristic one from good one than it is to buy characteristic one from good two. This brings us to a discussion of what I call “dominance”.

Figure 2.6: Demand for good one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. A = [1 0.5; 0.5 1], β1 = 6.875, β2 = 5, ρ1 = 0.4, ρ2 = 0.5, p2 = $0.50, w = 60. Compare to Figures 2.1 and 2.5.
[Two panels: q1(p1 | p2 = $0.50) on the left and q2(p1 | p2 = $0.50) on the right. Vertical axis: Price of Good 1; horizontal axes: q1 and q2.]

In addition to interior solutions, this model also allows for corner solutions. Indeed, this is one of the key features of the model. In many cases the household will choose not to purchase one or more of the goods. The household will choose to purchase no units of an interior good, say good one, if either: (1) given the amount of good two that the household has purchased, the cost of marginal utility from good one is greater than the cost of marginal utility from the outside good, or (2) all of good one’s characteristics can be obtained more cheaply by buying good two.

The first possibility for non-interior solutions is that which has already been discussed above: if any of ẑ, q̂1, or q̂2 is less than zero, demand for that good will be zero, and if two of the three predicted consumption values ẑ, q̂1, and q̂2 are negative, then the household spends its entire budget on the third (positive) good. Hence, q̂1 < 0 and ẑ < 0 implies that the solution is (0, w/p2, 0), while q̂2 < 0 and ẑ < 0 implies that the solution is (w/p1, 0, 0). Finally, q̂1 < 0 and q̂2 < 0 implies that the household spends all of its money on the outside good.

The second possibility occurs when one of the terms in the brackets is negative. When this occurs, I say that one good “dominates” the other. While this is nearly always a redundant condition when there are only two inside goods, it occurs with greater frequency when more goods are considered. From a household’s perspective, one good is dominated by another if all of its desirable characteristics can be obtained more cheaply from some other good (or combination of goods). Note that because of the idiosyncratic shocks (the ε’s) this will vary from household to household.

There are two ways for good one to be dominated by good two. If good one’s principal⁹ characteristic is bad (has a negative β), then unless good one is a cheaper source of the other characteristic than good two, it will have zero demand. Similarly, if good one’s non-principal characteristic is good (has a positive β), then unless good one is a cheaper source for its principal characteristic than good two, it will have zero demand. Mathematically, if one good dominates the other, we will have either

\[
\frac{(p_1-\varepsilon_1)a_{22}-(p_2-\varepsilon_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22}-a_{21}a_{12})} < 0 \tag{2.21}
\]

or¹⁰

\[
\frac{(p_2-\varepsilon_2)a_{11}-(p_1-\varepsilon_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22}-a_{21}a_{12})} < 0 \tag{2.22}
\]

The signs of the determinant of A and of the β’s determine which inequality implies dominance of which good (this is discussed in more detail in the Appendix). This condition generally remains in the background, but it becomes more of a possibility when the products move closer together in the characteristics space. For example, Figure 2.7 illustrates what happens when the two products are nearly perfect substitutes: A = [1 0.999; 0.999 1]. In this case, good one will dominate good two if p1 − ε1 < 0.999(p2 − ε2), and good two will dominate good one if p2 − ε2 < 0.999(p1 − ε1). Obviously, one good or the other is dominant except in a very small range of prices.

⁹ In the two-good/two-characteristic case, if det A > 0, good one’s principal characteristic is the first characteristic. If det A < 0, good one’s principal characteristic is the second characteristic. In the three-good case, the intuition is less clear, but the mathematics are similar.
¹⁰ It is not possible for both inequalities to hold simultaneously.

Figure 2.7: Demand for good one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. A = [1 0.999; 0.999 1], β1 = 5.5, β2 = 5, ρ1 = ρ2 = 0.5, p2 = $0.50, w = 60. Compare to Figure 2.1.
[Two panels: q1(p1 | p2 = $0.50) on the left and q2(p1 | p2 = $0.50) on the right. Vertical axis: Price of Good 1; horizontal axes: q1 and q2.]
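The dominance conditions in Equations 2.21 and 2.22 amount to checking the sign of the two bracketed terms. A minimal Python sketch of that check follows; the function names are hypothetical, and the code deliberately reports only which inequality holds, since which good is dominated depends on the signs of det A and the β's as described in the Appendix.

```python
def bracket_terms(p, eps, A, beta, rho):
    """The two bracketed terms whose signs appear in Eqs. 2.21 and 2.22."""
    (a11, a12), (a21, a22) = A
    det = a11 * a22 - a21 * a12
    t1 = ((p[0] - eps[0]) * a22 - (p[1] - eps[1]) * a12) / (beta[0] * rho[0] * det)
    t2 = ((p[1] - eps[1]) * a11 - (p[0] - eps[0]) * a21) / (beta[1] * rho[1] * det)
    return t1, t2


def dominance_inequality(p, eps, A, beta, rho):
    """Return "2.21" or "2.22" if that inequality holds, otherwise None."""
    t1, t2 = bracket_terms(p, eps, A, beta, rho)
    if t1 < 0.0:
        return "2.21"
    if t2 < 0.0:
        return "2.22"
    return None


if __name__ == "__main__":
    near_substitutes = ((1.0, 0.999), (0.999, 1.0))
    beta, rho, p = (5.5, 5.0), (0.5, 0.5), (0.5, 0.5)
    for eps in [(-1.49, -1.82), (-1.82, -1.49), (-1.49, -1.49)]:
        print(eps, dominance_inequality(p, eps, near_substitutes, beta, rho))
```

With nearly identical products and equal prices, any difference between ε1 and ε2 is enough to trigger one of the inequalities, whereas with A = [1 0.5; 0.5 1] the same shocks leave both bracketed terms positive.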

The feature of the model that makes it possible for one product to dominate the other is the same feature that reduces the dimensionality of the parameter space: assuming that every product is completely represented by a set of many observable characteristics and one unobservable characteristic. These features are also apparent in Figures 2.29 and 2.30. In sharp contrast to Figure 2.4, increasing β1 has essentially no effect on the reservation price of good one. This is because, as the price of good one increases, households are substituting not to the outside good, as they were when the A matrix was diagonal, but rather they are substituting to good two. And since an increase in β1 affects both good one and good two nearly equally, we see no effect on the reservation price of good one. As noted earlier,

where we do see a large effect, in both Figures 2.29 and 2.30, is in the quantity demanded, particularly when prices are low.

2.4.2 Engel Curves

In addition to allowing corners in the inside goods, the model also allows corner solutions in the outside good: in some cases the household may not desire to consume any outside good. In this case, we will have ẑ ≤ 0 in Equation 2.5 (and therefore z = 0, due to the non-negativity constraint). Because the household’s utility from the inside goods is strictly concave, this situation is dependent on the household’s expenditure level. Once the household’s budget reaches the point where it purchases positive amounts of the outside good, further increases in the household’s budget will not result in any increase in its purchases of the inside goods. This subsection explores this case in more detail by examining the Engel curves generated by the model.¹¹

The four main characteristics of these Engel curves are as follows. First, they contain many corners: in many cases, the marginal utility households receive from one good is strictly greater than that from another over some range of expenditure levels. In the two-good case, this means that many of the Engel curves begin (and may even end) by following one of the axes. Second, relating to these corners, the Engel curves tend to be piecewise linear (or nearly linear). The third main characteristic of these Engel curves is that they terminate. Unlike many other models, the households in my model always reach a satiation point with respect to each of the interior goods. Beyond this point, any additional income is spent on the outside good. Finally, the Engel curves in this model may be negatively sloped for some range of expenditure levels. This behavior only occurs when the A matrix is non-diagonal, and reflects the fact that as households’ expenditure levels get high enough to allow them to purchase both goods, the second good may provide some of the characteristics formerly obtained through the first good, leading the household to consume less of the first good.

This subsection highlights these characteristics, as well as illustrating the effects on the Engel curves of changes in the parameters and the product characteristics matrix. In contrast to the discussion of demand curves in the previous subsection, in which I was able to provide closed-form solutions for many cases, when ẑ < 0 there is no general closed-form solution, making interpretation of the Engel curves more difficult. When ẑ < 0, q̂1 > 0 and q̂2 > 0 the solutions are

\[
z = 0, \quad q_2 = \frac{w - p_1 q_1}{p_2}, \quad \text{and} \quad q_1 = q_1^* \tag{2.23}
\]

where q1∗ is the solution to

\begin{align}
& (p_2 a_{11}\beta_1\rho_1 - p_1 a_{21}\beta_1\rho_1)\left(a_{11}q_1 + a_{21}\frac{w - p_1 q_1}{p_2} + 1\right)^{\rho_1-1} + p_2\varepsilon_1 = \tag{2.24} \\
& (p_1 a_{22}\beta_2\rho_2 - p_2 a_{12}\beta_2\rho_2)\left(a_{12}q_1 + a_{22}\frac{w - p_1 q_1}{p_2} + 1\right)^{\rho_2-1} + p_1\varepsilon_2 \tag{2.25}
\end{align}

¹¹ One feature of these graphs that may require explanation is that at the end of the Engel curves, the symbols are sometimes closer together. This is due to the fact that the household reaches its satiation point for the inside goods somewhere during the last incremental increase in w.
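A point on the Engel curve in this regime can be traced by solving Equations 2.24-2.25 for q1 at a given w. The Python sketch below uses bisection on the difference between the two sides; this solver choice and the function name `engel_point` are assumptions made for illustration. It only covers the case described above (ẑ < 0 with both inside goods purchased) and returns `None` if the difference does not change sign on [0, w/p1].

```python
def engel_point(p1, p2, w, A, beta, rho, eps, tol=1e-12):
    """(q1, q2) on the Engel curve when z = 0 and both inside goods are bought.

    Solves Equations 2.24-2.25 for q1 by bisection, with q2 given by the
    budget constraint as in Equation 2.23.
    """
    (a11, a12), (a21, a22) = A

    def residual(q1):
        q2 = (w - p1 * q1) / p2
        c1 = a11 * q1 + a21 * q2 + 1.0
        c2 = a12 * q1 + a22 * q2 + 1.0
        lhs = (p2 * a11 - p1 * a21) * beta[0] * rho[0] * c1 ** (rho[0] - 1.0) + p2 * eps[0]
        rhs = (p1 * a22 - p2 * a12) * beta[1] * rho[1] * c2 ** (rho[1] - 1.0) + p1 * eps[1]
        return lhs - rhs

    lo, hi = 0.0, w / p1
    f_lo, f_hi = residual(lo), residual(hi)
    if f_lo * f_hi > 0.0:
        return None                     # no interior split of the budget
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) * f_lo > 0.0:  # same sign as the left end
            lo = mid
        else:
            hi = mid
    q1 = 0.5 * (lo + hi)
    return q1, (w - p1 * q1) / p2


if __name__ == "__main__":
    A = ((1.0, 0.5), (0.5, 1.0))
    # Equal prices and equal shocks, with a budget of $1.00 (illustrative).
    print(engel_point(0.5, 0.5, 1.0, A, (5.5, 5.0), (0.5, 0.5), (-1.49, -1.49)))
```

Repeating the call over a grid of w values, and switching to the interior or corner formulas once ẑ ≥ 0 or a quantity hits zero, would trace out a full Engel curve.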

Figure 2.8 plots the Engel curves that these equations generate for an identical set of parameters to those in Figure 2.1. Here we have returned to the case where A = [1 0.5; 0.5 1]. Each pane of the graph represents a price pair, (p1, p2), and each curve within the panes represents an (ε1, ε2) pair. The parameters are the same in this figure as in Figure 2.1. I chose to display these particular percentiles to provide substantial variation in the choices while not allowing the variation from one percentile to completely dominate the other in the figures.

Table 2.2: Pairs of (ε1, ε2) corresponding to the Engel curves in Figures 2.8-2.13

Curve    ε1       ε2
#        −1.49    −1.82
M        −1.49    −1.49
□        −1.82    −1.82
O        −1.82    −1.49

Table 2.2 lists the symbols that correspond to the (ε1, ε2) pairs in Figures 2.8-2.13. These symbols serve both to mark the curves and to denote the expenditure levels associated with those points on the Engel curves. Although the increment varies slightly from one figure to another, each symbol in Figure 2.8 reflects an increase in the expenditure level of $0.20. Hence, in Figure 2.8, the Engel curves only continue from an expenditure level of zero to (at most) an expenditure level of $2.20. After $2.20 (in the Northeast pane), the household becomes satiated with respect to the interior goods, and does not increase its expenditure on the inside goods, regardless of the total expenditure level.

Looking at Figure 2.8, both the Northeast and Southwest panes have similar patterns.

Figure 2.8: Demand for good one and two as a function of expenditure level, w, for four different values of p1 and p2, for the 40th and 60th percentiles of ε1 and ε2. Each symbol represents an increase of $0.20 in w. A = [1 0.5; 0.5 1], β1 = 5.5, β2 = 5, ρ1 = ρ2 = 0.5.
[Four panes: P1=0.25, P2=0.5 (Northwest); P1=0.5, P2=0.5 (Northeast); P1=0.25, P2=0.25 (Southwest); P1=0.5, P2=0.25 (Southeast). Horizontal axis: q1; vertical axis: q2.]

The O pair of ε’s leads that household to consume only good two, while the # pair leads that household to consume only good one. Because p1 = p2 in both of these panes, the curves for pairs M and □, which both have ε1 = ε2, share the same path. The curve for M goes further, however, because the ε’s are closer to zero in this case, and therefore lead the household to consume more of both goods before switching to the outside good.

In the Northwest pane, only the O curve results in purchases of good two. It may seem surprising that the Engel curve in this pane has a negative slope for part of its length. This is due to the fact that initially characteristic one offers the highest marginal utility, which leads the household to consume only good one, which is the better source of characteristic one. Eventually, however, characteristic two offers higher marginal utility than characteristic one. Since good two contains characteristic one as well as characteristic two, the household is able to cut back on its consumption of good one, leading to the observed negative slope. Also, the Northwest and Southeast panes are similar, although reflected about the 45° line. The reason that the non-corner Engel curve terminates in a larger value of q1 in the Southeast than it does of q2 in the Northwest is that β1 is slightly greater than β2.

Figure 2.9: Demand for good one and two as a function of expenditure level, w, for four different values of p1 and p2, for the 40th and 60th percentiles of ε1 and ε2. Each symbol represents an increase of $0.20 in w. A = [1 0.5; 0.5 1], β1 = β2 = 5, ρ1 = ρ2 = 0.5.
[Four panes: P1=0.25, P2=0.5 (Northwest); P1=0.5, P2=0.5 (Northeast); P1=0.25, P2=0.25 (Southwest); P1=0.5, P2=0.25 (Southeast). Horizontal axis: q1; vertical axis: q2.]

Figure 2.9 shows that this asymmetry disappears when both β1 = β2 and ρ1 = ρ2. The figure considers the effect on the Engel curves of a slight decrease in β1 from 5.5 to 5.0, holding all else constant. Now the Northwest and Southeast panes are completely symmetric. Additionally, there is a slight decrease in the satiation points for both goods, although good one is affected more, since it contains more of the first characteristic.

The Engel curves change significantly when we make the A matrix more diagonal, as seen in Figure 2.10, which plots the Engel curves when A = [1 0; 0 1].

Figure 2.10: Demand for good one and two as a function of expenditure level, w, for four different values of p1 and p2, for the 40th and 60th percentiles of ε1 and ε2. Each symbol represents an increase of $0.10 in w. A = [1 0; 0 1], β1 = 5.5, β2 = 5, ρ1 = ρ2 = 0.5.
[Four panes: P1=0.25, P2=0.5; P1=0.5, P2=0.5; P1=0.25, P2=0.25; P1=0.5, P2=0.25. Horizontal axis: q1; vertical axis: q2.]

Notice that unlike the three previous sets of curves, none of the curves in this graph show corner solutions in either inside good at the satiation point. The result of this is fewer overlapping curves. This change is due to the fact that neither product can substitute for the other in characteristic space. Good one is now the only source of characteristic one, and good two is the only source of characteristic two. These changes aside, there are similarities. The Northeast and Southwest panes are still very similar to each other, with the lower prices in the Southwest pane leading to higher levels of consumption before satiation. Notice as well that in all four panes, the prices and ε’s of good two have no effect on the satiation point of good one, and vice versa. This can be seen in the fact that the O and the □ have the same value of q1 at the satiation point, and that the M and the # also have the same value of q1 at the satiation point. This independence is due to the fact that the products have no overlapping characteristics, and is true for all the Engel curves with this A matrix.

Figure 2.11: Demand for good one and two as a function of expenditure level, w, for four different values of p1 and p2, for the 40th and 60th percentiles of ε1 and ε2. Each symbol represents an increase of $0.10 in w. A = [1 0; 0 1], β1 = β2 = 5, ρ1 = ρ2 = 0.5.
[Four panes: P1=0.25, P2=0.5; P1=0.5, P2=0.5; P1=0.25, P2=0.25; P1=0.5, P2=0.25. Horizontal axis: q1; vertical axis: q2.]
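The independence of the satiation points when A is the identity matrix can be illustrated with a short Python sketch based on Equations 2.14 and 2.15. The function names below are hypothetical, and the ε pairs are those listed in Table 2.2.

```python
def satiation_point(p, beta, rho, eps):
    """(q1, q2) at which purchases of the inside goods stop growing.

    Applies Equations 2.14 and 2.15 good by good, so it is valid only
    when A is the identity matrix and p[j] - eps[j] > 0.
    """
    return tuple(
        max(0.0, ((pj - ej) / (bj * rj)) ** (1.0 / (rj - 1.0)) - 1.0)
        for pj, bj, rj, ej in zip(p, beta, rho, eps)
    )


def satiation_expenditure(p, beta, rho, eps):
    """Expenditure level w at which the Engel curve terminates."""
    q = satiation_point(p, beta, rho, eps)
    return sum(pj * qj for pj, qj in zip(p, q))


if __name__ == "__main__":
    p, beta, rho = (0.5, 0.5), (5.5, 5.0), (0.5, 0.5)
    for eps in [(-1.49, -1.82), (-1.49, -1.49), (-1.82, -1.82), (-1.82, -1.49)]:
        print(eps, satiation_point(p, beta, rho, eps),
              round(satiation_expenditure(p, beta, rho, eps), 3))
```

Pairs that share the same ε1 produce the same satiation quantity of good one regardless of ε2, because each quantity depends only on that good's own price, shock, and taste parameters.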

Figures 2.11 and 2.12 are very similar to Figure 2.10. The difference is that in Figure 2.11, β1 has been decreased from 5.5 to 5.0, and in Figure 2.12 ρ1 has been decreased from 0.5 to 0.4, while holding β1ρ1 = 2.75 constant. The main effect of decreasing β1 in Figure 2.11 is to decrease the satiation point for good one in all four panes. Because good two does not contain any of characteristic one, however, the satiation points for good two remain unchanged. A second effect of the change in β1 is that now we have β1 = β2 and ρ1 = ρ2. This means that when ε1 = ε2, as in the Northeast and Southwest panes for the M and □ pairs, the Engel curve moves along the 45° line.

Figure 2.12: Demand for good one and two as a function of expenditure level, w, for four different values of p1 and p2, for 40th and 60th percentiles of ε1 and ε2. Each symbol represents an increase of $0.10 in w. \( A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \), β1 = 6.875, β2 = 5, ρ1 = 0.4, ρ2 = 0.5.
[Four panels, each plotting q2 against q1: P1=0.5, P2=0.5; P1=0.25, P2=0.5; P1=0.5, P2=0.25; P1=0.25, P2=0.25.]

Finally, in Figure 2.13, I show the effects of a dramatically different A matrix. In this case, \( A = \begin{bmatrix} 1 & 0.999 \\ 0.999 & 1 \end{bmatrix} \), making the products nearly perfect substitutes. The Northwest and Southeast panes are most representative of actual choice behavior in this situation. The Northeast and Southwest panes show knife-edge behavior for the △ and □ pairs. Because the products have nearly identical characteristics, the β’s and ρ’s make little difference in distinguishing between goods one and two. Here, ε1 and ε2 have the largest effect on which good is purchased. Except in the rare cases (as shown here) when p1 − ε1 = p2 − ε2, only one of the two goods will be purchased.

Figure 2.13: Demand for good one and two as a function of expenditure level, w, for four different values of p1 and p2, for 40th and 60th percentiles of ε1 and ε2. Each symbol represents an increase of $0.40 in w. \( A = \begin{bmatrix} 1 & 0.999 \\ 0.999 & 1 \end{bmatrix} \), β1 = 5.5, β2 = 5, ρ1 = ρ2 = 0.5.
[Four panels, each plotting q2 against q1: P1=0.5, P2=0.5; P1=0.25, P2=0.25; P1=0.5, P2=0.25; P1=0.25, P2=0.5.]

To recap, this subsection has illustrated the behavior of the Engel curves generated by the model. These curves have several features that differentiate them from many other Engel curves. First, as shown throughout this subsection, all of the Engel curves terminate, signifying satiation in the inside goods. Second, the curves tend to be primarily piecewise linear, often sharply changing direction once a certain expenditure level is reached. This behavior is closely related to the many corners found in the model, and is generated by the fact that I have assumed that households’ utility is derived primarily from the product characteristics, rather than the products themselves. Finally, although changes in β and ρ do have significant effects on the Engel curves, much of the variation we see is generated by changes in the product characteristics matrix. The next subsection discusses this fact in more detail.

2.4.3 The A Matrix and Some Additional Remarks

As the previous two subsections have illustrated, the structure of the A matrix can have a large impact on the choice behavior that the model predicts. In this subsection, therefore, I summarize the restrictions and recommendations for A matrices in empirical applications.

First, all of the elements of A must be non-negative. Essentially, the elements of A should be thought of as a ranking. For example, because all errors are negative, a good with all zeros will never be purchased.

Second, while it is possible for A to have more rows than columns, if this is the case then the rows of A must be linearly dependent, since the rank of A cannot exceed its number of columns. This means that one good can be synthesized by one or more other goods. Therefore only one of these two (possibly composite) goods will be chosen (whichever has the higher value for Aj1/(pj − εj)). The number of un-dominated products that A can contain (i.e., the maximum number of different products that will be purchased on a single purchase occasion) is thus limited to the number of columns.

Third, in estimation, it is a good idea to allow for more characteristics than products. Consider the two-good case. In the two-good case, it is not possible to have negative cross-price elasticities. The closest my model can come to complements is having cross-price elasticities of zero. In the two-good, two-characteristic case, the cross-price elasticities are zero (at interior solutions) when \( A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \). Similarly, the closest the model can come to a world with perfect substitutes is when the two products have identical characteristics. Again, in the two-good, two-characteristic case, this requires \( A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \). In this case, we will never observe households purchasing both goods simultaneously. In order to allow the parameters of the model to determine the probabilities that the household buys both goods together, it is necessary for the econometrician to choose an A matrix that accommodates both extremes. One such A matrix in this case is \( A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \). In this case, as β1 approaches zero, the cross-price elasticities will approach zero. Conversely, when β2 and β3 approach zero, each household has a discontinuity point where they are perfectly elastic between good one and good two (this occurs when p1 − ε1 = p2 − ε2).

Fourth, as we have seen throughout this section, the specification of the A matrix greatly influences the possible choice behavior of the model. Unlike many other models, as currently formulated, even the scale chosen for measuring each characteristic is important in determining some of the potential quantity behavior. For example, if we scale all of the elements of a particular characteristic of A by a positive scalar λ, we have:

\[ \beta_1 (a_{11} q_1 + a_{21} q_2 + 1)^{\rho_1} = \frac{\beta_1}{\lambda^{\rho_1}} (\lambda a_{11} q_1 + \lambda a_{21} q_2 + \lambda)^{\rho_1} \neq \frac{\beta_1}{\lambda^{\rho_1}} (\lambda a_{11} q_1 + \lambda a_{21} q_2 + 1)^{\rho_1} \]

Note that the λ is not simply absorbed by the β. The size of λ affects the closeness of the corner. The effect is that as the elements of A increase in magnitude for a particular characteristic, we are less likely to see corner solutions for that characteristic. In principle, this problem could be eliminated by parameterizing the distance from the corner, and substituting a characteristic-specific parameter for each “1”. I do not do so here because, like Kim et al. (2002), I lack sufficient variation in my data. In practice, it may be partially parameterized by comparing results from several alternative scalings. If the predicted purchases for a particular product are frequently very close to, but slightly greater than zero, this indicates that the additive term (the “1”) for the characteristic most closely identified with that product (the characteristic for which that product is the cheapest source) is too small relative to the scale of that characteristic. Hence, the scale of that characteristic should be decreased.

Finally, I have assumed that households’ decisions are static, and that they do not condition current purchases on expectations of future prices (or past purchases). Within this context, although the A matrix above is treated as time invariant throughout this chapter, in the next chapter I include feature and display as time-varying characteristics, with the matrix A varying from week to week. Implicit in using these time-varying characteristics is the assumption that households do not condition current purchases on their expectations of future changes in the characteristics matrix.

Before proceeding further, I emphasize several significant assumptions, normalizations, and operational differences from existing models that are worth highlighting and explaining in more detail. These are:

• I normalize ρ > 0. This is because it is very difficult to distinguish the choice behavior produced by negative ρ’s from positive ones. It is also theoretically unappealing to have the derivatives of the utility function alternate signs.

• As derived in the Appendix, in the case with two inside goods and two product characteristics the utility function is concave (assuming det A > 0 and ρ > 0) if and only if βc(ρc − 1) ≤ 0 ∀c ∈ C. When there are more than two goods or characteristics, this is a sufficient, but not necessary, condition.

• Although the model extends the existing literature by allowing households to have preferences for variety, the model does not currently allow these preferences for variety to be time-varying. For example, this could be violated if households occasionally had a party. This could be modeled by introducing correlation across the εj’s within a purchase occasion. I assume that the εj’s are i.i.d. both across goods and across time for computational tractability (relaxing this assumption would further increase the number of parameters to estimate). In future work, this distribution could be easily parameterized.

In addition to these, there are several other key differences between this model and most other demand models. First, as noted previously, unlike much of the existing literature, my model uses an explicit budget constraint. This is a necessary feature in modeling bundles. There must be some way in which the household’s choice set is restricted from considering an infinite number of potential bundles. Because the logit model does not include a budget constraint, small departures from it on this front would place positive probability on purchasing items at near-infinite prices.
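The scaling remark made earlier in this subsection (that λ is not simply absorbed by β) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function name `subutility`, its `shift` argument, and all numeric values are hypothetical and chosen only for illustration. It evaluates the sub-utility β1(a11 q1 + a21 q2 + 1)^ρ1 before and after rescaling the first column of A by λ.

```python
def subutility(q1, q2, a11, a21, beta1, rho1, shift=1.0):
    """Sub-utility from characteristic one: beta1 * (a11*q1 + a21*q2 + shift)**rho1."""
    return beta1 * (a11 * q1 + a21 * q2 + shift) ** rho1


# Illustrative values only (hypothetical, not estimates from the text).
a11, a21, beta1, rho1 = 1.0, 0.5, 5.0, 0.4
lam = 10.0
q1, q2 = 0.3, 0.2

original = subutility(q1, q2, a11, a21, beta1, rho1)

# Rescaling the column of A, the additive constant, and beta together
# reproduces the original sub-utility.
fully_rescaled = subutility(q1, q2, lam * a11, lam * a21,
                            beta1 / lam ** rho1, rho1, shift=lam)

# Rescaling only the column of A and beta, while keeping the "+1", does not.
partly_rescaled = subutility(q1, q2, lam * a11, lam * a21,
                             beta1 / lam ** rho1, rho1, shift=1.0)

print(original, fully_rescaled, partly_rescaled)
```

The first two printed values agree up to floating-point error, while the third differs, which is the sense in which the scale of a characteristic is not a free normalization here.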

2.5 Choice Behavior Generated by the Model

Section 2.4 showed that the model can generate a wide variety of choice behavior. The purpose of this section is to show that the model has a large amount of freedom in allocating probability to these different outcomes. The characteristics matrix does place a large amount of structure on the “shape” of the error space decomposition, but within this structure the parameters are able to “stretch” the divisions in several directions. This is in striking contrast to the logit model, which can only reposition the way in which the error space is divided.

The logit model requires the econometrician to specify the functional form for the mean indirect utilities from each choice, V0, V1, and V2. Frequently, the specification Vi = βXi − αpi is used, where Xi is a vector of characteristics of product i, pi is the price of good i, and β and α are parameters. The price and characteristics of the outside good are usually normalized to zero, and hence V0 = 0.

The logit model roughly analogous to the example with two inside goods¹² is one with two inside goods and one outside good. The household receives three marginal utility shocks (one for each good) that are unobserved by the econometrician, and that are assumed to be distributed i.i.d. according to a Gumbel distribution. Hence, if V1, V2, and V0 (normalized to zero) are the mean indirect utilities, the household chooses among the three goods to maximize:

\[ \max \{ V_{0i} = \varepsilon_{0i},\; V_{1i} = V_1 + \varepsilon_{1i},\; V_{2i} = V_2 + \varepsilon_{2i} \} \qquad (2.26) \]

¹² The model is only roughly analogous because, in addition to the reasons mentioned above, this particular specification justifies the inclusion of the αpi term by assuming that there is a second outside good, with price normalized to one, on which the household spends any remaining money.

Figure 2.14: Graph of logit choices as a function of error terms.
[Three-dimensional plot with axes ε0, ε1, and ε2; regions labeled “Buy Outside Good”, “Buy Good 1”, and “Buy Good 2”; tick labels include −V1, −V2, −V1+1, and −V2+1.]

Figure 2.14 graphs these logit choices as a function of the logit error terms. To understand the graph, it is helpful to think of the shape depicted as a paper airplane descending at a 45° angle into the page. The arrows along the axes indicate the directions of increase for ε0, ε1, and ε2. Although this orientation is non-standard, I believe it shows the graph more clearly. V1 and V2 are the mean indirect utilities that the household receives from choosing good one or good two respectively. V0, the mean utility from the outside good, is normalized to zero, and is not shown. The paper airplane shape divides the graph into three regions. The top region represents the error space that results in the household choosing the outside good (i.e., good 0). The lower right region is the region of the error space that results in the household choosing good 2, while the lower left represents a choice of good 1. The fuselage of the “plane”, where all three planes intersect, goes through the point (−V1, −V2, 0).

Recall that in the logit model, the ε’s are i.i.d. and follow an extreme value (Gumbel) distribution. This distribution is bell-shaped like the normal distribution, but is asymmetric, with a thicker upper tail. The important point is that there are only two parameters, V1 and V2, that completely determine how the error space affects the choice outcomes. The probability of the occurrence of any particular point in this error space is given by

\[ P(\varepsilon_0, \varepsilon_1, \varepsilon_2) = e^{-\varepsilon_0 - \varepsilon_1 - \varepsilon_2} \, e^{-e^{-\varepsilon_0} - e^{-\varepsilon_1} - e^{-\varepsilon_2}} \]
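To make concrete the point that V1 and V2 alone pin down the logit choice probabilities, the following sketch simulates the three Gumbel shocks and compares the simulated choice shares with the standard closed-form logit probabilities. It is only an illustration: the function names and the values of V1 and V2 are hypothetical and do not come from the text.

```python
import math
import random


def logit_probs(v1, v2):
    """Closed-form logit probabilities (outside good, good 1, good 2), with V0 = 0."""
    denom = 1.0 + math.exp(v1) + math.exp(v2)
    return [1.0 / denom, math.exp(v1) / denom, math.exp(v2) / denom]


def gumbel_draw(rng):
    """Standard Gumbel draw obtained by inverting the CDF exp(-exp(-x))."""
    u = rng.random()
    while u == 0.0:  # guard against log(0); random() never returns 1.0
        u = rng.random()
    return -math.log(-math.log(u))


def simulate_choice_shares(v1, v2, n_draws=200_000, seed=0):
    """Share of draws in which each alternative maximizes V_j + eps_j."""
    rng = random.Random(seed)
    counts = [0, 0, 0]
    for _ in range(n_draws):
        utils = [gumbel_draw(rng), v1 + gumbel_draw(rng), v2 + gumbel_draw(rng)]
        counts[utils.index(max(utils))] += 1
    return [c / n_draws for c in counts]


# Illustrative mean utilities (hypothetical values, not from the text).
V1, V2 = 0.5, -0.25
exact = logit_probs(V1, V2)
simulated = simulate_choice_shares(V1, V2)
print([round(x, 3) for x in exact])
print([round(x, 3) for x in simulated])
```

Changing V1 or V2 shifts the three shares, but no other parameter is available to reshape how the error space is divided.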

To illustrate the richer error structure provided by my model, the remainder of this section shows how the different parts of the model affect the patterns of choice behavior that it generates. I look at several different configurations of the A matrix (specified by the econometrician and the data), including the extreme examples of diagonal, square, and triangular, as well as several intermediate cases. For each of these configurations, I look at the effects of changing the parameters, scaling the A matrix, and a binding budget constraint.

Figure 2.15 shows the distribution of choices resulting from the error structure in my model when the A matrix is diagonal.¹³ The parameters used to generate the figure are β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30, and \( A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \). Because my model is continuous, rather than discrete, I have discretized the choices into the four regions shown in Figure 2.15 in order to allow for better comparison. The regions in this graph (and most of the rest of the graphs in this section) are color-coded as follows. The dark rectangular region in the Northeast corner is the region of the error space in which neither good one nor good two is purchased. The light rectangle in the Southwest corner represents the region where both goods are purchased. The two rectangles in the Northwest and Southeast represent regions where only one of the two goods is purchased.

For reference, Figure 2.16 graphs the probability distribution of εj. Recall that the εj’s are assumed to be i.i.d. across products, purchase occasions, and households. The probabilities of each region are given in the figure’s description.¹⁴ These probabilities were obtained by simulating ε draws. Although the regions in Figure 2.15 are rectangular, and allow for a relatively simple analytical transformation, in general this will not be the case. As we will see in subsequent graphs, the regions are often very irregular. For this reason, simulation methods are necessary.

¹³ Due to rounding, probabilities may not sum to one.
¹⁴ In each case, the probabilities have been rounded, and hence, may not sum to one.

Figure 2.15: Graph of purchases (broken into groups) as a function of error terms. \( A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \), β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = $0.30, w = $30. The probabilities of the four regions are: \( \begin{bmatrix} 0.21 & 0.09 \\ 0.49 & 0.21 \end{bmatrix} \).
[Plot regions: columns “Buy Some of Good One” and “Don’t Buy Good One”; rows “Don’t Buy Good Two” and “Buy Some of Good Two”.]

Figure 2.16: Negative Lognormal Probability Distribution of εj. For −εj: Mean = e^{0.5}, Variance = e^{2} − e, Mode = e^{−1}, Median = 1. ln(−εj) ∼ N(0, 1).
[Horizontal axis: ε, from 0 to −5; vertical axis: Probability Density, from 0 to 0.7.]

Figure 2.17 illustrates the effect of the budget constraint. The majority of the graph is identical to Figure 2.15. The only differences between the figures occur in the areas where either ε1 or ε2 is small in magnitude. In these areas, the regions that are slightly lighter represent the areas of the error space where the budget constraint is binding, that is, regions of the graph in which z = 0. Except where noted, the probability of these regions in the other graphs in this section is near zero. The only effect of the budget constraint on the propensity to purchase the two goods can be seen by noting that there are now 45° lines between the point of intersection of the purchase boundaries and the axes. The slope of the lines comes from the ratio of the prices of the two goods. This is because when the budget constraint is binding on the inside goods, no outside good is purchased (z = 0). Therefore, assuming that neither good dominates the other, Equation 2.24 gives us:

\[ p_2 \varepsilon_1 - p_1 \varepsilon_2 = -(p_2 a_{11} \beta_1 \rho_1 - p_1 a_{21} \beta_1 \rho_1) \left( a_{11} q_1 + a_{21} \frac{w - p_1 q_1}{p_2} + 1 \right)^{\rho_1 - 1} \qquad (2.27) \]
\[ \qquad\qquad + (p_1 a_{22} \beta_2 \rho_2 - p_2 a_{12} \beta_2 \rho_2) \left( a_{12} q_1 + a_{22} \frac{w - p_1 q_1}{p_2} + 1 \right)^{\rho_2 - 1} \qquad (2.28) \]

Hence, the relative amounts of q1 and q2 purchased are determined by the weighted sum p2 ε1 − p1 ε2 of the error terms. This leads to the linear transitions seen in the Southwest corner of the figure.

The effects of increasing the parameters β1 and ρ1 are shown in Figures 2.18 and 2.31. Both β and ρ have similar effects on the household’s propensity to purchase. Higher values of either β1 or ρ1 mean that good one is purchased even at more negative realizations of ε1. However, as discussed in section 2.4.1, ρ has a greater impact than β on the amount that the household purchases, conditional on purchase.

The effects of increasing the scale of the A matrix can be seen in Figure 2.19, which shows that the propensity to purchase can be held roughly constant by adjusting β to compensate for changes in the scale of the A matrix. Here I multiply all of the elements of A by 10, while dividing β by 10. The propensity to purchase shown in this figure is nearly identical to that shown in Figure 2.15. As noted in subsection 2.4.3, while the propensity to purchase remains the same, the amounts purchased are affected by the transformation.

Figure 2.17: Graph of purchases (broken into groups) as a function of error terms. \( A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \), β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 1. The probabilities of the seven regions are: 0.09, 0.13, 0.09, 0.14, 0.13, 0.27, 0.09.

Thus far in this section, we have considered only the case when A is a diagonal matrix, in which each good has its own characteristic. Looking back, one can see that in all of the graphs thus far, ε1 does not affect the probability of purchasing good two, and vice versa. The next few figures illustrate the changes in choice behavior as the A matrix becomes non-diagonal. As we move to progressively more non-diagonal matrices, such as in Figures 2.20 and 2.32 (in the Appendix), we begin to see interaction effects. For values of ε1 or ε2 below roughly −2.5, Figures 2.32 and 2.15 are nearly identical. The only difference in these regions is that the boundary lines, where the household is indifferent between buying and not buying, have shifted outwards very slightly. For values of ε closer to zero, however, it is clear that even a relatively small amount of the off-diagonal characteristics can significantly alter the outcome space. Realizations of ε1 that are close to zero lead to large purchases of good one. Because good one now contains some of the second characteristic, this leads to a decrease in the demand for good two. The degree of curvature in the Southwest portion of the graph is determined by setting q̂1 = 0 in Equation 2.3, or q̂2 = 0 in Equation 2.4. This effect is even more marked in Figure 2.33, which moves A even further from a diagonal form.

Figure 2.18: Graph of purchases as a function of error terms. \( A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \), β1 = 7, β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30. The probabilities of the four regions are: \( \begin{bmatrix} 0.24 & 0.05 \\ 0.57 & 0.13 \end{bmatrix} \).

Figure 2.34 is analogous to Figure 2.17, but with a different A matrix. It shows the effects of a binding budget constraint on choice behavior. As discussed earlier (see Equation 2.27), there are 45° lines between the points of intersection of the purchase boundary and the axes, representing the boundary between purchasing only one inside good and both inside goods. As was the case in Figures 2.18 and 2.31, increasing β1 or ρ1 has the effect of stretching the graph in the horizontal direction. That is, ε1 can be more negative, and we will still observe households buying good one. Figure 2.35 shows the effect of scaling the A matrix. As was the case with Figure 2.19, we can achieve very similar cutoffs in the boundaries between buying and not buying by decreasing the β’s to compensate for increasing the scale of the A matrix.

Having considered the cases of diagonal and square A matrices, we turn our attention to the transition from diagonal to progressively more triangular characteristics matrices. Figures 2.22 and 2.36 show how this progression affects choice behavior. Notice that unlike in Figure 2.20, where both purchase boundaries in the Southwest corner of the graph were curved, here we only have curvature in the transition from buying to not buying good two. This comes directly from the fact that when a21 = 0, the boundary equation from Equation 2.3 is linear in ε1 and ε2, while the boundary equation from setting q̂2 = 0 in Equation 2.4 is nonlinear in ε1 and ε2.

Figure 2.19: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \( A = \begin{bmatrix} 10 & 0 \\ 0 & 10 \end{bmatrix} \), β1 = β2 = 0.5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30. The probabilities of the four regions are: \( \begin{bmatrix} 0.21 & 0.09 \\ 0.49 & 0.21 \end{bmatrix} \).

As the figures throughout this section have shown, my model is able to flexibly assign probabilities to different outcomes. Although the structure of the A matrix has a significant effect on the choice probabilities and their variation, in general the parameters β and ρ are able to accommodate observed purchase probabilities such as those shown in Table 1.1.
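The region probabilities reported for the diagonal cases (Figures 2.15, 2.18, and 2.19) can be roughly reproduced by simulating ε draws. The sketch below rests on assumptions that are not taken from the text: it supposes that, with a diagonal A and the outside good's marginal utility normalized to one, good j is purchased whenever εj > pj − βj ρj ajj (the marginal utility of the first unit, βj ρj ajj, exceeds the error-adjusted price), and it ignores the budget constraint, which the text describes as binding with near-zero probability in these figures. The function names are hypothetical.

```python
import math
import random


def purchase_cutoff(beta, rho, a, price):
    """Assumed cutoff: good j is bought when eps_j > price - beta * rho * a."""
    return price - beta * rho * a


def region_probabilities(beta, rho, a, price, n_draws=200_000, seed=0):
    """Simulated shares of draws in which both, one, or neither good is bought."""
    rng = random.Random(seed)
    cutoffs = [purchase_cutoff(beta[j], rho[j], a[j], price[j]) for j in range(2)]
    counts = {"both": 0, "only_one": 0, "only_two": 0, "neither": 0}
    for _ in range(n_draws):
        # Negative lognormal errors: ln(-eps_j) ~ N(0, 1), independent across goods.
        eps = [-math.exp(rng.gauss(0.0, 1.0)) for _ in range(2)]
        buy_one, buy_two = eps[0] > cutoffs[0], eps[1] > cutoffs[1]
        if buy_one and buy_two:
            counts["both"] += 1
        elif buy_one:
            counts["only_one"] += 1
        elif buy_two:
            counts["only_two"] += 1
        else:
            counts["neither"] += 1
    return {region: n / n_draws for region, n in counts.items()}


# Parameter values are those listed in the captions; a holds the diagonal of A.
common = dict(rho=(0.4, 0.4), price=(0.3, 0.3))
fig_2_15 = region_probabilities(beta=(5, 5), a=(1, 1), **common)
fig_2_18 = region_probabilities(beta=(7, 5), a=(1, 1), **common)
fig_2_19 = region_probabilities(beta=(0.5, 0.5), a=(10, 10), **common)
for name, probs in (("2.15", fig_2_15), ("2.18", fig_2_18), ("2.19", fig_2_19)):
    print(name, {region: round(p, 2) for region, p in probs.items()})
```

Under the assumed cutoff, multiplying A by 10 and dividing β by 10 leaves βj ρj ajj unchanged, which is consistent with Figures 2.15 and 2.19 reporting the same probabilities. For non-diagonal A the regions are no longer rectangles and this shortcut does not apply.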

Figure 2.20: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \( A = \begin{bmatrix} 1 & 0.1 \\ 0.1 & 1 \end{bmatrix} \), β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30. The probabilities of the four regions are: \( \begin{bmatrix} 0.28 & 0.07 \\ 0.38 & 0.27 \end{bmatrix} \).

Figure 2.21: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \( A = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix} \), β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30. The probabilities of the four regions are: \( \begin{bmatrix} 0.57 & 0.05 \\ 0.24 & 0.15 \end{bmatrix} \).

Figure 2.22: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. \( A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \), β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30.

2.6 The Costs of Misspecification

Although the previous sections have shown the additional flexibility of the choice model developed in this chapter, given the added complexity in estimating the model, it seems worth investigating whether the effort to achieve additional flexibility is worthwhile. For example, if the true data generating process is that described in this chapter, how far off would we be if we estimated demand using a traditional logit model? In particular, how poorly does the logit perform in making the calculations described in chapter one (and those I plan to make in chapter three), namely the profit difference between uniform and non-uniform prices? As the Monte Carlo results in this section show, depending on the parameters and the A matrix, the answers given by the logit may be significantly off.

Throughout this section, I consider a model with four inside goods and one outside good. Although my model can handle more product characteristics than products (an A matrix that has more columns than rows), here I use a square A matrix to avoid tilting the results in favor of my model. For the Monte Carlo results in this section, I have made the following assumptions:

1. I assume that 5000 shoppers visited the store each week, and that their budgets were distributed normally, with a mean of $30, and a standard deviation of $20. This reflects the actual distribution of budgets in my data. The assumption about the number of shoppers is essentially a normalization, and affects only the amount of simulation error introduced.

2. I consider several different parameter combinations and also show the effects of changes in β and ρ.

3. I consider two potential product characteristic spaces, one in which each characteristic is product-specific, and another more representative of the soft drink category, which is the focus of the following chapter.

I consider several potential sets of parameters and product characteristics. In each case, I proceed as follows. First, I draw one set of ε’s for each household. Then for each set of ε’s, at a particular set of prices (more on this in a moment), I solve for the optimal consumption bundle for each household. After solving for each household’s demand, I aggregate the realized demand, and optimize over the price space, first allowing for non-uniform prices, and then restricting the prices to be uniform. This gives me the true optimal uniform and non-uniform prices.

In order to illustrate how the relationship between the logit predictions and the true model depends on the marginal costs of the products, I calculate prices and resulting profit differences under 3^4 = 81 different combinations of marginal costs. Doing so also shows how changes in the products’ marginal costs translate to changes in the profit difference between uniform and non-uniform prices. Each figure uses the same sequence of marginal costs, numbered 1 through 81. The cost configurations used for each week are shown in Table 2.3.

Table 2.3: Marginal Cost Configurations

 #   c1   c2   c3   c4      #   c1   c2   c3   c4      #   c1   c2   c3   c4
 1  0.1  0.1  0.1  0.1     28  0.2  0.1  0.1  0.1     55  0.3  0.1  0.1  0.1
 2  0.1  0.1  0.1  0.2     29  0.2  0.1  0.1  0.2     56  0.3  0.1  0.1  0.2
 3  0.1  0.1  0.1  0.3     30  0.2  0.1  0.1  0.3     57  0.3  0.1  0.1  0.3
 4  0.1  0.1  0.2  0.1     31  0.2  0.1  0.2  0.1     58  0.3  0.1  0.2  0.1
 5  0.1  0.1  0.2  0.2     32  0.2  0.1  0.2  0.2     59  0.3  0.1  0.2  0.2
 6  0.1  0.1  0.2  0.3     33  0.2  0.1  0.2  0.3     60  0.3  0.1  0.2  0.3
 7  0.1  0.1  0.3  0.1     34  0.2  0.1  0.3  0.1     61  0.3  0.1  0.3  0.1
 8  0.1  0.1  0.3  0.2     35  0.2  0.1  0.3  0.2     62  0.3  0.1  0.3  0.2
 9  0.1  0.1  0.3  0.3     36  0.2  0.1  0.3  0.3     63  0.3  0.1  0.3  0.3
10  0.1  0.2  0.1  0.1     37  0.2  0.2  0.1  0.1     64  0.3  0.2  0.1  0.1
11  0.1  0.2  0.1  0.2     38  0.2  0.2  0.1  0.2     65  0.3  0.2  0.1  0.2
12  0.1  0.2  0.1  0.3     39  0.2  0.2  0.1  0.3     66  0.3  0.2  0.1  0.3
13  0.1  0.2  0.2  0.1     40  0.2  0.2  0.2  0.1     67  0.3  0.2  0.2  0.1
14  0.1  0.2  0.2  0.2     41  0.2  0.2  0.2  0.2     68  0.3  0.2  0.2  0.2
15  0.1  0.2  0.2  0.3     42  0.2  0.2  0.2  0.3     69  0.3  0.2  0.2  0.3
16  0.1  0.2  0.3  0.1     43  0.2  0.2  0.3  0.1     70  0.3  0.2  0.3  0.1
17  0.1  0.2  0.3  0.2     44  0.2  0.2  0.3  0.2     71  0.3  0.2  0.3  0.2
18  0.1  0.2  0.3  0.3     45  0.2  0.2  0.3  0.3     72  0.3  0.2  0.3  0.3
19  0.1  0.3  0.1  0.1     46  0.2  0.3  0.1  0.1     73  0.3  0.3  0.1  0.1
20  0.1  0.3  0.1  0.2     47  0.2  0.3  0.1  0.2     74  0.3  0.3  0.1  0.2
21  0.1  0.3  0.1  0.3     48  0.2  0.3  0.1  0.3     75  0.3  0.3  0.1  0.3
22  0.1  0.3  0.2  0.1     49  0.2  0.3  0.2  0.1     76  0.3  0.3  0.2  0.1
23  0.1  0.3  0.2  0.2     50  0.2  0.3  0.2  0.2     77  0.3  0.3  0.2  0.2
24  0.1  0.3  0.2  0.3     51  0.2  0.3  0.2  0.3     78  0.3  0.3  0.2  0.3
25  0.1  0.3  0.3  0.1     52  0.2  0.3  0.3  0.1     79  0.3  0.3  0.3  0.1
26  0.1  0.3  0.3  0.2     53  0.2  0.3  0.3  0.2     80  0.3  0.3  0.3  0.2
27  0.1  0.3  0.3  0.3     54  0.2  0.3  0.3  0.3     81  0.3  0.3  0.3  0.3
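The ordering of the 81 configurations in Table 2.3 follows a simple pattern: each cost takes the values 0.1, 0.2, and 0.3, with c1 varying slowest and c4 fastest. As a convenience, the sketch below regenerates the table and looks up individual “weeks”. The names `COST_LEVELS`, `cost_configs`, and `config` are hypothetical and introduced only for this illustration.

```python
from itertools import product

COST_LEVELS = (0.1, 0.2, 0.3)

# Configuration k (1-based) is the k-th element; c1 varies slowest, c4 fastest.
cost_configs = list(product(COST_LEVELS, repeat=4))


def config(week):
    """Return (c1, c2, c3, c4) for a 1-based configuration number."""
    return cost_configs[week - 1]


# Weeks in which good four costs 0.2 more than good two.
weeks_c4_above_c2 = [
    week
    for week, (c1, c2, c3, c4) in enumerate(cost_configs, start=1)
    if abs((c4 - c2) - 0.2) < 1e-9
]

print(len(cost_configs), config(9), weeks_c4_above_c2)
```

The filtered list makes it easy to see which rows of the table correspond to the weeks singled out later in the discussion of Figure 2.23.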

Using the 81 “weeks” of data generated by the different marginal cost configurations, I estimated a homogeneous logit model of demand. The homogeneous logit model is particularly convenient in this case, because it yields the estimating equation:

\[ \ln(s_{jt}) - \ln(s_{0t}) = \beta X_j - \alpha p_{jt} + \eta_{jt} \qquad (2.29) \]

where sjt is the share of good j in “week” t, and s0t is the share of the outside good in that week. As is well known, the econometrician is required to specify the size of the market, and hence the share of the outside good in each week. After experimenting with different values, I found that in general, the logit profit difference predictions did not change significantly for a wide range of market size definitions.

I consider two product characteristic possibilities. First, I consider the case when each product is identified by a product-specific characteristic. The example given in Table 2.4 is one where the four products (Coke, Diet Coke, Pepsi, and Diet Pepsi) have no demand-relevant characteristics in common. Specifying the A matrix requires two steps. The first step is to assume its structure, as I have done. The second step is to assume the scale of each of its columns. Here, I set each of the entries to either “1” or “0”. It is important to stress that this choice is not a normalization. Choosing the scale of each product characteristic is equivalent to choosing the scale of the additive constant (the “+1”) in each of the households’ sub-utility functions. This choice affects the probability that households are at a corner for that characteristic. In future work, I hope to estimate these scales (or equivalently, parameterize the “+1”), but for now I assume the scale.
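The estimating equation (2.29) can be illustrated with a small regression on synthetic data. The sketch below is not the estimation routine used in this chapter: it generates shares from a logit model itself (rather than from the model developed here), treats the βXj term as a product-specific intercept, and recovers α by a within-product least-squares slope. All names and parameter values are hypothetical.

```python
import math
import random


def logit_shares(deltas):
    """Inside-good shares and outside-good share for mean utilities (V0 = 0)."""
    denom = 1.0 + sum(math.exp(d) for d in deltas)
    return [math.exp(d) / denom for d in deltas], 1.0 / denom


def estimate_alpha(prices, shares, outside_shares):
    """Within-product OLS slope of ln(s_jt) - ln(s_0t) on p_jt, sign-flipped.

    Product-specific means absorb the beta * X_j term of Equation 2.29.
    """
    n_weeks, n_goods = len(prices), len(prices[0])
    num = den = 0.0
    for j in range(n_goods):
        y = [math.log(shares[t][j]) - math.log(outside_shares[t])
             for t in range(n_weeks)]
        p = [prices[t][j] for t in range(n_weeks)]
        y_bar, p_bar = sum(y) / n_weeks, sum(p) / n_weeks
        num += sum((pi - p_bar) * (yi - y_bar) for pi, yi in zip(p, y))
        den += sum((pi - p_bar) ** 2 for pi in p)
    return -num / den


# Synthetic data: hypothetical intercepts (standing in for beta * X_j) and alpha.
rng = random.Random(0)
true_alpha = 4.0
intercepts = [1.0, 0.5, 0.8, 0.2]
prices, shares, outside_shares = [], [], []
for _ in range(81):
    p_t = [rng.uniform(0.3, 1.0) for _ in intercepts]
    eta_t = [rng.gauss(0.0, 0.05) for _ in intercepts]
    deltas = [b - true_alpha * p + e for b, p, e in zip(intercepts, p_t, eta_t)]
    s_t, s0_t = logit_shares(deltas)
    prices.append(p_t)
    shares.append(s_t)
    outside_shares.append(s0_t)

alpha_hat = estimate_alpha(prices, shares, outside_shares)
print(round(alpha_hat, 2))
```

Because the synthetic data come from a logit, the regression recovers α closely; the point of the Monte Carlo exercise in this section is that this need not hold when the data are generated by a different model.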

Table 2.4: Example Product Universe Number One

                  Characteristics:
Products:         Coke    Diet Coke    Pepsi    Diet Pepsi
Coke               X          0          0          0
Diet Coke          0          X          0          0
Pepsi              0          0          X          0
Diet Pepsi         0          0          0          X

\[ A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]

This product characteristic structure ensures that the products are poor substitutes for one another. Each product is the sole source of its characteristic; there is no overlap. Hence, this case reflects a world in which households have a greater preference for variety. Figure 2.23 plots the percentage by which the expected profits from optimal uniform prices (when I restrict p1 = p3 and p2 = p4) are lower than the expected profits from optimal non-uniform prices. The vertical axis in the figure measures:

\[ \frac{\Pi_{\text{Non-Uniform}} - \Pi_{\text{Uniform}}}{\Pi_{\text{Non-Uniform}}} \]

where

\[ \Pi_{\text{Non-Uniform}} = \max_{\{p_1, p_2, p_3, p_4\}} Q_1(p)(p_1 - c_1) + Q_2(p)(p_2 - c_2) + Q_3(p)(p_3 - c_3) + Q_4(p)(p_4 - c_4) \]

and

\[ \Pi_{\text{Uniform}} = \max_{\{p_1, p_2\}} Q_1(p)(p_1 - c_1) + Q_2(p)(p_2 - c_2) + Q_3(p)(p_1 - c_3) + Q_4(p)(p_2 - c_4) \]
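The profit-loss measure defined above can be computed by brute force once a demand system is available. The sketch below is only an illustration of the calculation: it replaces the Q's with a hypothetical linear demand in which each good's quantity depends only on its own price, and it searches over a coarse price grid rather than using a proper optimizer. None of the names or numbers come from the Monte Carlo exercise itself.

```python
from itertools import product


def linear_demand(j, prices):
    """Hypothetical demand: quantity of good j depends only on its own price."""
    return max(0.0, 1.0 - prices[j])


def profit(prices, costs, demand=linear_demand):
    """Sum over goods of Q_j(p) * (p_j - c_j)."""
    return sum(demand(j, prices) * (prices[j] - costs[j])
               for j in range(len(costs)))


def uniform_pricing_loss(costs, grid, demand=linear_demand):
    """(Pi_non_uniform - Pi_uniform) / Pi_non_uniform, by grid search."""
    pi_non_uniform = max(profit(p, costs, demand)
                         for p in product(grid, repeat=4))
    # Uniform regime: goods {1, 3} share one price and goods {2, 4} share another.
    pi_uniform = max(profit((pa, pb, pa, pb), costs, demand)
                     for pa, pb in product(grid, repeat=2))
    return (pi_non_uniform - pi_uniform) / pi_non_uniform


price_grid = [k / 20 for k in range(6, 17)]  # 0.30, 0.35, ..., 0.80
loss_equal_costs = uniform_pricing_loss((0.1, 0.1, 0.1, 0.1), price_grid)
loss_unequal_costs = uniform_pricing_loss((0.1, 0.1, 0.3, 0.3), price_grid)
print(loss_equal_costs, loss_unequal_costs)
```

With identical demands, the loss is zero when costs are equal within each price group and positive when they differ. Any other demand function with the same `(j, prices)` signature could be passed in through the `demand` argument.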

For each marginal cost configuration (“week”) I plot both the “true” loss, which assumes my model is the true model and uses it to generate the Q’s in the equations above, as well as the loss predicted by the logit model, which uses the parameters estimated from the simulated data to generate the Q’s. In general, the “actual” profit differences (i.e., assuming that my model is the data generating process) are typically 5% or less, with no week higher than 20%. The lost profits are greater when the marginal costs of the grouped products are farther apart. The logit estimates correlate with this, but predict much larger differences across the board. Weeks 3, 6, 9, 30, 33, 36, 57, 60, and 63 show the largest difference in expected profits when pricing uniformly. These weeks correspond to the weeks with the greatest differences between the costs of goods two and four. As seen in the figure, the logit model’s predictions deviate significantly from the truth, particularly in weeks 9, 36, and 63. As mentioned above, these are weeks when the actual marginal costs of the products within the uniform clusters are significantly different, and the true profit difference is greater. However, the logit does not consistently predict the magnitude of the profit differences. For example, weeks 3, 30, and 57 all have large actual profit differences, but while the logit estimates are too high in other cases, in these they are correct or too low.


Figure 2.23: Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,3},{2,4}. The marginal costs corresponding to the horizontal axis are shown in Table 2.3.

\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \beta = \begin{pmatrix} 3 \\ 5 \\ 7 \\ 9 \end{pmatrix}, \quad \rho = \begin{pmatrix} 0.05 \\ 0.08 \\ 0.07 \\ 0.09 \end{pmatrix} \]

[Plot: “Percentage Profit Decrease from Uniform Price Restriction”. Vertical axis: % Profit Decrease (0% to 70%). Horizontal axis: Marginal Cost Configuration (0 to 80). Series: Our Model, Logit.]


Figure 2.24: Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,3},{2,4}. The marginal costs corresponding to the horizontal axis are shown in Table 2.3.

\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \beta = \begin{pmatrix} 9 \\ 5 \\ 7 \\ 9 \end{pmatrix}, \quad \rho = \begin{pmatrix} 0.05 \\ 0.08 \\ 0.07 \\ 0.09 \end{pmatrix} \]

[Plot: “Percentage Profit Decrease from Uniform Price Restriction”. Vertical axis: % Profit Decrease (0% to 100%). Horizontal axis: Marginal Cost Configuration (0 to 80). Series: Our Model, Logit.]


Figure 2.24 shows the predicted profit differences when I increase β1 from 3 to 9. This change can be thought of as directly increasing demand for good one. The effect of this change is to increase the profits lost by pricing uniformly in many cases. The weeks with the greatest profit difference from uniform pricing are now 9, 18, 27, 57, 66, and 75, and the cost differences between goods one and three appear to be driving the profit differences. When the cost of good one is close to the cost of good three, the actual lost profits are relatively small. The logit is again an imprecise predictor, exaggerating weeks with large profit differences and underestimating weeks with moderate profit losses.

Figure 2.25: Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,3},{2,4}. The marginal costs corresponding to the horizontal axis are shown in Table 2.3.

\[ A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad \beta = \begin{pmatrix} 1 \\ 5 \\ 7 \\ 9 \end{pmatrix}, \quad \rho = \begin{pmatrix} 0.15 \\ 0.08 \\ 0.07 \\ 0.09 \end{pmatrix} \]

[Plot: “Percentage Profit Decrease from Uniform Price Restriction”. Vertical axis: % Profit Decrease (0% to 70%). Horizontal axis: Marginal Cost Configuration (0 to 80). Series: Our Model, Logit.]

Figure 2.25 shows the predicted profit differences when we hold β1ρ1 = 0.15 fixed and increase ρ1 from 0.05 to 0.15 (so that β1 falls from 3 to 1). This change can be thought of as decreasing the rate of satiation for good one. The effect of this change is to increase the profits lost by pricing uniformly in some cases, but decrease them significantly when the marginal costs of goods one and three are closer. This figure can be seen as an amalgam of Figures 2.23 and 2.24. Differences in marginal costs between goods two and four drive much of the variation, but the middle third of the weeks (weeks 28 through 54) show comparatively smaller differences, coming from a lack of large cost differences between goods one and three.

Table 2.5: Example Product Universe Number Two

Products: Coke, Diet Coke, Pepsi, Diet Pepsi

Characteristics:

Soda   Coke   Pepsi   Diet
X      X      0       X
X      X      0       0
X      0      X       X
X      0      X       0

\[ A = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 \end{pmatrix} \]



These examples have shown the effects when the A matrix is diagonal, a world in which demand for one good does not significantly affect demand for the others. Next, I consider the case where there is significantly more overlap. In particular, I assume the A matrix is as shown in Table 2.5. In this case, the four demand-relevant characteristics are: Soda, Coke, Pepsi, and Diet. Again, I choose to set each positive characteristic to “1”. Because the logit demand model is linear, it is not able to match this product characteristic matrix exactly: the constant term is collinear with the pair of brand dummy variables. Therefore, in this case, I omit one of the brand dummy variables in estimating the logit for this second set of characteristics. In Figures 2.26 and 2.27, I vary the groupings of the products, first grouping them by whether or not they are diet sodas, and then grouping them by brand. I chose

\[ \beta = \begin{pmatrix} 5 \\ 5 \\ 5 \\ 5 \end{pmatrix} \quad \text{and} \quad \rho = \begin{pmatrix} 0.05 \\ 0.07 \\ 0.07 \\ 0.09 \end{pmatrix} \]

in order to simulate an environment where households cared most about whether a product was a diet drink, and then about the brand. Figure 2.26 shows both larger median profit losses for the true model, and less variation with respect to changes in marginal costs. The logit model, however, predicts much greater profit differences. Figure 2.27 uses the same parameters as Figure 2.26, but now groups the products by brand, putting Coke and Diet Coke in one group, and Pepsi and Diet Pepsi in another. Here the logit actually does a remarkably good job of predicting the differences in expected profits. The picture changes dramatically, however, if we change the parameters slightly. Figure 2.28 decreases all of the β’s by 30%, from 5.0 to 3.5. Although the true percentage profit differences remain more or less unchanged, the logit predictions change dramatically. At low levels they are too low, and at high levels they are too high.

This section has shown that, at least for this application (measuring the difference in expected profits between uniform and non-uniform prices), the logit model does not perform well when it does not accurately describe the data generating process. I conclude that in this case, the additional complexity of my model is worthwhile. Unfortunately, I have been unable to determine the factors that affect the degree to which the logit model miscalculates the result. Learning to recognize in advance the cases in which it is more important to estimate demand using my model (and incur the associated additional costs) than to use the logit as an approximation is an area for future research.

Figure 2.26: Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,3},{2,4}. This corresponds to uniform pricing by “diet” characteristic. The marginal costs corresponding to the horizontal axis are shown in Table 2.3.

\[ A = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}, \quad \beta = \begin{pmatrix} 5 \\ 5 \\ 5 \\ 5 \end{pmatrix}, \quad \rho = \begin{pmatrix} 0.05 \\ 0.07 \\ 0.07 \\ 0.09 \end{pmatrix} \]

[Plot: “Percentage Profit Decrease from Uniform Price Restriction”. Vertical axis: % Profit Decrease (0% to 30%). Horizontal axis: Marginal Cost Configuration (0 to 80). Series: Our Model, Logit.]

Figure 2.27: Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,2},{3,4}. This corresponds to uniform pricing by “Coke” or “Pepsi” characteristics. The marginal costs corresponding to the horizontal axis are shown in Table 2.3.

\[ A = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}, \quad \beta = \begin{pmatrix} 5 \\ 5 \\ 5 \\ 5 \end{pmatrix}, \quad \rho = \begin{pmatrix} 0.05 \\ 0.07 \\ 0.07 \\ 0.09 \end{pmatrix} \]

[Plot: “Percentage Profit Decrease from Uniform Price Restriction”. Vertical axis: % Profit Decrease (0% to 30%). Horizontal axis: Marginal Cost Configuration (0 to 80). Series: Our Model, Logit.]


Figure 2.28: Graph of percentage decrease in expected profits from optimal uniform prices versus optimal non-uniform prices. For the uniform case, products were grouped as: {1,2},{3,4}. This corresponds to uniform pricing by “Coke” or “Pepsi” characteristics. The marginal costs corresponding to the horizontal axis are shown in Table 2.3.

\[ A = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 \end{pmatrix}, \quad \beta = \begin{pmatrix} 3.5 \\ 5.5 \\ 3.5 \\ 3.5 \end{pmatrix}, \quad \rho = \begin{pmatrix} 0.05 \\ 0.07 \\ 0.07 \\ 0.09 \end{pmatrix} \]

[Plot: “Percentage Profit Decrease from Uniform Price Restriction”. Vertical axis: % Profit Decrease (0% to 70%). Horizontal axis: Marginal Cost Configuration (0 to 80). Series: Our Model, Logit.]

2.7 Conclusion

The logit demand model has been a workhorse of demand estimation for over thirty years. Even at its most basic form, it is a powerful and easily-estimated model. And yet, even with recent extensions, it has significant shortcomings. These shortcomings are most apparent when the model is applied in a setting with continuous choice, with many products, and when households may be variety-seeking. This chapter has described the state of the current demand estimation literature, and its continued shortcomings with respect to these challenges. In response to these challenges, I developed a new model, able to accommodate these features. Although retaining the hedonic framework of the logit model, I introduce significant nonlinearity to the household’s utility function. Allowing for continuous choice forces us to use a budget constraint to limit the household’s purchase behavior. As we will see in more detail in the following chapter, these changes come at significant computational cost. To justify this cost, and because this model represents a departure from much of the existing demand literature, a large portion of this chapter has been spent exploring and dissecting the workings of the new model. By showing the demand and Engel curves that it generates, as well as the possible probability distributions over various choices, I showed that the model was indeed able to deliver realistic choice patterns. Additionally, this chapter showed that for my own application, looking at the difference in expected profits from uniform and non-uniform pricing, the logit model performs quite poorly when compared to the new model. This finding helps to justify the added computational complexity. As computers become faster, and more household-level data becomes available, I expect the techniques employed in this dissertation should continue to gain momentum. 
The strategy of writing down a direct utility function and numerically maximizing it subject to a budget constraint yields much more freedom than previous approaches. Unlike the work of Kim et al. (2002), which uses Simulated Maximum Likelihood, my approach using the Method of Simulated Moments does not require high-dimensional numerical integration, and thus allows for a much larger product space. The particular utility function assumed in this dissertation is flexible, concave, and parsimonious in parameters. Finally, this demand model has a variety of potential uses beyond the uniform pricing application. For example, although I do not do so in this paper, modeling households’ purchases of bundles in this new way allows for a variety of additional counterfactuals that were previously not possible. It is now possible to calculate the effects of promotions of the kind “buy good A, get good B free”, as well as more complex models of cross-category interactions.

2.8 Appendix A: Restrictions Yielding Concavity

In order to ensure that the necessary first-order conditions of the Lagrangian are also sufficient, as well as to reflect the predictions of economic theory, we would like to restrict the parameters of the model to make utility a concave function of consumption. The utility function is concave if and only if its Hessian matrix is negative semi-definite, which in turn is true if and only if all of the Hessian’s eigenvalues are real and non-positive.

2.8.1 Two Goods, Two Characteristics

In the case of only two interior goods, the utility function is concave (and the Hessian’s eigenvalues are all real and non-positive) if and only if the following three conditions are satisfied: (1) $h_{11} \le 0$, (2) $h_{22} \le 0$, and (3) $h_{11}h_{22} - h_{12}h_{21} \ge 0$, where $h_{ij} = \frac{\partial^2 U}{\partial q_i \partial q_j}$ represents the $i,j$th element of the Hessian. With two characteristics and two goods,

\[ h_{11} = a_{11}^2 \beta_1\rho_1(\rho_1 - 1)(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 2} + a_{12}^2 \beta_2\rho_2(\rho_2 - 1)(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 2} \]

\[ h_{22} = a_{21}^2 \beta_1\rho_1(\rho_1 - 1)(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 2} + a_{22}^2 \beta_2\rho_2(\rho_2 - 1)(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 2} \]

and

\[ h_{12} = h_{21} = a_{11}a_{21} \beta_1\rho_1(\rho_1 - 1)(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 2} + a_{12}a_{22} \beta_2\rho_2(\rho_2 - 1)(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 2} \]

If we define:

\[ T_1 = \beta_1\rho_1(\rho_1 - 1)(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 2}, \qquad T_2 = \beta_2\rho_2(\rho_2 - 1)(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 2} \]

\[ \bar{T}_1 = (a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 2}, \qquad \bar{T}_2 = (a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 2} \]

then we can see that

\[ h_{11} = a_{11}^2 T_1 + a_{12}^2 T_2, \qquad h_{22} = a_{21}^2 T_1 + a_{22}^2 T_2, \qquad h_{12} = a_{11}a_{21} T_1 + a_{12}a_{22} T_2 \]

Also, note that $\bar{T}_j \ge 0$.

\[
\begin{aligned}
h_{22}h_{11} - h_{21}^2 &= (a_{11}^2 T_1 + a_{12}^2 T_2)(a_{21}^2 T_1 + a_{22}^2 T_2) - (a_{11}a_{21} T_1 + a_{12}a_{22} T_2)^2 && (2.30) \\
&= a_{11}^2 a_{22}^2 T_1 T_2 + a_{12}^2 a_{21}^2 T_1 T_2 - 2 a_{11}a_{12}a_{21}a_{22} T_1 T_2 && (2.31) \\
&= T_1 T_2 (a_{11}^2 a_{22}^2 + a_{12}^2 a_{21}^2 - 2 a_{11}a_{12}a_{21}a_{22}) && (2.32) \\
&= T_1 T_2 (\det A)^2 && (2.33)
\end{aligned}
\]

Hence,

\[
\begin{aligned}
& h_{22}h_{11} - h_{21}^2 \ge 0 \iff && (2.34) \\
& \beta_1\rho_1(\rho_1 - 1)\,\beta_2\rho_2(\rho_2 - 1)(\det A)^2 \ge 0 && (2.35)
\end{aligned}
\]

Combining these three conditions, for the two good, two characteristic case, a necessary and sufficient condition for concavity (assuming $\det A \ne 0$) is:

\[ \beta_j\rho_j(\rho_j - 1) \le 0, \quad j = 1, 2 \]
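The identity in Equations 2.30 through 2.33 is easy to check numerically. The short sketch below evaluates the analytic Hessian terms at one point and compares the determinant with $T_1 T_2 (\det A)^2$. The parameter values and the function name are assumptions chosen for illustration; `A[i][j]` is taken to hold $a_{(i+1)(j+1)}$, the amount of characteristic j+1 in good i+1.

```python
def hessian_terms(q, A, beta, rho):
    """Return (h11, h22, h12, T) for U = sum_j beta_j (a_1j q_1 + a_2j q_2 + 1)^rho_j."""
    # T_j = beta_j rho_j (rho_j - 1) (a_1j q_1 + a_2j q_2 + 1)^(rho_j - 2)
    T = [beta[j] * rho[j] * (rho[j] - 1.0)
         * (A[0][j] * q[0] + A[1][j] * q[1] + 1.0) ** (rho[j] - 2.0)
         for j in range(2)]
    h11 = sum(A[0][j] ** 2 * T[j] for j in range(2))
    h22 = sum(A[1][j] ** 2 * T[j] for j in range(2))
    h12 = sum(A[0][j] * A[1][j] * T[j] for j in range(2))
    return h11, h22, h12, T


# Hypothetical parameter values, used only to exercise the formulas.
A = [[1.0, 0.25], [0.25, 1.0]]
beta, rho, q = [5.0, 5.0], [0.5, 0.4], [2.0, 3.0]

h11, h22, h12, T = hessian_terms(q, A, beta, rho)
det_A = A[0][0] * A[1][1] - A[1][0] * A[0][1]
lhs = h11 * h22 - h12 ** 2            # left-hand side of (2.30)
rhs = T[0] * T[1] * det_A ** 2        # right-hand side of (2.33)
```

With positive β and ρ between 0 and 1, both T terms are negative, so `h11` and `h22` are non-positive and `lhs` is non-negative, as the three concavity conditions require.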

2.8.2 Two Goods, Three Characteristics

As in the previous case, we require (1) $h_{11} \le 0$, (2) $h_{22} \le 0$, and (3) $h_{11}h_{22} - h_{12}h_{21} \ge 0$. In this case,

\[ h_{11} = a_{11}^2 \beta_1\rho_1(\rho_1 - 1)(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 2} + a_{12}^2 \beta_2\rho_2(\rho_2 - 1)(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 2} + a_{13}^2 \beta_3\rho_3(\rho_3 - 1)(a_{13}q_1 + a_{23}q_2 + 1)^{\rho_3 - 2} \]

\[ h_{22} = a_{21}^2 \beta_1\rho_1(\rho_1 - 1)(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 2} + a_{22}^2 \beta_2\rho_2(\rho_2 - 1)(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 2} + a_{23}^2 \beta_3\rho_3(\rho_3 - 1)(a_{13}q_1 + a_{23}q_2 + 1)^{\rho_3 - 2} \]

and

\[ h_{12} = h_{21} = a_{11}a_{21} \beta_1\rho_1(\rho_1 - 1)(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 2} + a_{12}a_{22} \beta_2\rho_2(\rho_2 - 1)(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 2} + a_{13}a_{23} \beta_3\rho_3(\rho_3 - 1)(a_{13}q_1 + a_{23}q_2 + 1)^{\rho_3 - 2} \]

Following the previous subsection, if we define:

\[ T_3 = \beta_3\rho_3(\rho_3 - 1)(a_{13}q_1 + a_{23}q_2 + 1)^{\rho_3 - 2}, \qquad \bar{T}_3 = (a_{13}q_1 + a_{23}q_2 + 1)^{\rho_3 - 2} \]

then we can see that

\[ h_{11} = a_{11}^2 T_1 + a_{12}^2 T_2 + a_{13}^2 T_3, \qquad h_{22} = a_{21}^2 T_1 + a_{22}^2 T_2 + a_{23}^2 T_3, \qquad h_{12} = a_{11}a_{21} T_1 + a_{12}a_{22} T_2 + a_{13}a_{23} T_3 \]

As before, we find that $\beta_j\rho_j(\rho_j - 1) \le 0$, $j = 1, 2, 3$ is a sufficient condition; however, it is no longer necessary.

2.8.3 Three Goods, Three Characteristics

The case of three goods and three characteristics does not offer any additional intuition, but it is included for completeness, to show that the result extends to higher dimensions. In this case the Hessian is negative semi-definite if and only if: (1) $h_{11} \le 0$, (2) $h_{22} \le 0$, (3) $h_{33} \le 0$, (4) $h_{11}h_{22} - h_{12}h_{21} \ge 0$, (5) $h_{11}h_{33} - h_{13}h_{31} \ge 0$, (6) $h_{22}h_{33} - h_{23}h_{32} \ge 0$, and (7) $h_{11}h_{22}h_{33} + 2h_{12}h_{23}h_{31} - h_{11}h_{23}^2 - h_{22}h_{13}^2 - h_{33}h_{12}^2 \le 0$. Once again, $\beta_j\rho_j(\rho_j - 1) \le 0$, $j = 1, 2, 3$ is a sufficient, but not necessary, condition.
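The sufficiency claim in these subsections can be seen in one step for any number of goods and characteristics. The sketch below is not part of the original derivation. It assumes the same utility specification as above with I goods and J characteristics, and it uses v for an arbitrary vector of length I (a symbol introduced here for illustration).

```latex
h_{ik} = \sum_{j=1}^{J} a_{ij}\, a_{kj}\, T_j ,
\qquad
T_j = \beta_j \rho_j (\rho_j - 1)\Big(\sum_{i=1}^{I} a_{ij} q_i + 1\Big)^{\rho_j - 2}
\quad\Longrightarrow\quad
v' H v = \sum_{i=1}^{I}\sum_{k=1}^{I} v_i h_{ik} v_k
       = \sum_{j=1}^{J} T_j \Big(\sum_{i=1}^{I} a_{ij} v_i\Big)^{2} .
```

Each squared term is non-negative, so if every $T_j \le 0$ (which holds whenever $\beta_j\rho_j(\rho_j - 1) \le 0$, since the power term $\bar{T}_j$ is non-negative) then $v'Hv \le 0$ for every v and the Hessian is negative semi-definite.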

2.9 Appendix B: Analytic Solutions to the Two-Inside Good Case: Details

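If the concavity conditions derived in the previous Appendix are satisfied, we can solve the household’s optimization problem by forming the Lagrangian:

\[
\begin{aligned}
L(q_1, q_2, z, \lambda_0, \lambda_1, \lambda_2, \lambda_3) ={}& \beta_1(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1} + \beta_2(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2} + \varepsilon_1 q_1 + \varepsilon_2 q_2 + z \\
& + \lambda_0(w - p_1 q_1 - p_2 q_2 - z) + \lambda_1 q_1 + \lambda_2 q_2 + \lambda_3 z
\end{aligned}
\]

The Lagrangian yields the following equations for a local maximum:

\[
\begin{aligned}
\frac{\partial L}{\partial q_1} &= a_{11}\beta_1\rho_1(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 1} + a_{12}\beta_2\rho_2(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 1} + \varepsilon_1 - \lambda_0 p_1 + \lambda_1 = 0 && (2.36) \\
\frac{\partial L}{\partial q_2} &= a_{21}\beta_1\rho_1(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 1} + a_{22}\beta_2\rho_2(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 1} + \varepsilon_2 - \lambda_0 p_2 + \lambda_2 = 0 && (2.37) \\
\frac{\partial L}{\partial z} &= 1 - \lambda_0 + \lambda_3 = 0 && (2.38)
\end{aligned}
\]

\[ \lambda_0(w - p_1 q_1 - p_2 q_2 - z) = 0, \quad \lambda_1 q_1 = 0, \quad \lambda_2 q_2 = 0, \quad \lambda_3 z = 0, \quad \lambda_0, \lambda_1, \lambda_2, \lambda_3 \ge 0 \]

First, note that we can substitute $\lambda_0 = 1 + \lambda_3$. Next, if we define:

\[ \alpha_1 = (a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 1}, \qquad \alpha_2 = (a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 1} \]

then Equations 2.36 and 2.37 represent a linear system with two equations and two unknowns, $\alpha_1$ and $\alpha_2$. Assuming that the determinant of A is non-zero¹⁵ and that $\beta_1\beta_2\rho_1\rho_2 \ne 0$, the solution to this system is:

\[ \alpha_1 = \frac{(p_1 + \lambda_3 p_1 - \varepsilon_1 - \lambda_1)a_{22} - (p_2 + \lambda_3 p_2 - \varepsilon_2 - \lambda_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22} - a_{21}a_{12})} \]

\[ \alpha_2 = \frac{(p_2 + \lambda_3 p_2 - \varepsilon_2 - \lambda_2)a_{11} - (p_1 + \lambda_3 p_1 - \varepsilon_1 - \lambda_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22} - a_{21}a_{12})} \]

15 If det A = 0, then the characteristics of one good are equal to a multiple of the characteristics of the other. Therefore only one of the two inside goods will be chosen (whichever has the higher value for $a_{j1}/(p_j - \varepsilon_j)$). This “dominance” concept will be discussed later.

Therefore:

\[ a_{11}q_1 + a_{21}q_2 + 1 = \left[\frac{(p_1 + \lambda_3 p_1 - \varepsilon_1 - \lambda_1)a_{22} - (p_2 + \lambda_3 p_2 - \varepsilon_2 - \lambda_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22} - a_{21}a_{12})}\right]^{\frac{1}{\rho_1 - 1}} \]

\[ a_{12}q_1 + a_{22}q_2 + 1 = \left[\frac{(p_2 + \lambda_3 p_2 - \varepsilon_2 - \lambda_2)a_{11} - (p_1 + \lambda_3 p_1 - \varepsilon_1 - \lambda_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22} - a_{21}a_{12})}\right]^{\frac{1}{\rho_2 - 1}} \]

Continuing, this gives us:

\[ q_1 = \frac{a_{22}\left(\delta_1^{\frac{1}{\rho_1 - 1}} - 1\right) - a_{21}\left(\delta_2^{\frac{1}{\rho_2 - 1}} - 1\right)}{a_{11}a_{22} - a_{21}a_{12}} \]

\[ q_2 = \frac{a_{11}\left(\delta_2^{\frac{1}{\rho_2 - 1}} - 1\right) - a_{12}\left(\delta_1^{\frac{1}{\rho_1 - 1}} - 1\right)}{a_{11}a_{22} - a_{21}a_{12}} \]

where, for convenience, we define

\[ \delta_1 = \frac{(p_1 + \lambda_3 p_1 - \varepsilon_1 - \lambda_1)a_{22} - (p_2 + \lambda_3 p_2 - \varepsilon_2 - \lambda_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22} - a_{21}a_{12})} \]

\[ \delta_2 = \frac{(p_2 + \lambda_3 p_2 - \varepsilon_2 - \lambda_2)a_{11} - (p_1 + \lambda_3 p_1 - \varepsilon_1 - \lambda_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22} - a_{21}a_{12})} \]

We begin by examining the interior solution, when all three goods are consumed. In this case $\lambda_1 = \lambda_2 = \lambda_3 = 0$ and $\lambda_0 = 1$, and the solutions are:

\[ \hat{q}_1 = \frac{a_{22}\left[\frac{(p_1 - \varepsilon_1)a_{22} - (p_2 - \varepsilon_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22} - a_{21}a_{12})}\right]^{\frac{1}{\rho_1 - 1}} - a_{21}\left[\frac{(p_2 - \varepsilon_2)a_{11} - (p_1 - \varepsilon_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22} - a_{21}a_{12})}\right]^{\frac{1}{\rho_2 - 1}} - a_{22} + a_{21}}{a_{11}a_{22} - a_{21}a_{12}} \tag{2.39} \]

\[ \hat{q}_2 = \frac{a_{11}\left[\frac{(p_2 - \varepsilon_2)a_{11} - (p_1 - \varepsilon_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22} - a_{21}a_{12})}\right]^{\frac{1}{\rho_2 - 1}} - a_{12}\left[\frac{(p_1 - \varepsilon_1)a_{22} - (p_2 - \varepsilon_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22} - a_{21}a_{12})}\right]^{\frac{1}{\rho_1 - 1}} - a_{11} + a_{12}}{a_{11}a_{22} - a_{21}a_{12}} \tag{2.40} \]

\[ \hat{z} = w - p_1\hat{q}_1 - p_2\hat{q}_2 \tag{2.41} \]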
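The candidate interior solution in Equations 2.39 through 2.41 can be transcribed directly into code. The sketch below does so and then confirms that the result satisfies the first-order conditions 2.36 and 2.37 with all multipliers on the non-negativity constraints set to zero. The function names are hypothetical, the parameter values are illustrative assumptions (the errors are set to zero), and `A[i][j]` is taken to hold $a_{(i+1)(j+1)}$.

```python
def interior_solution(A, beta, rho, p, eps, w):
    """Candidate interior solution (q1_hat, q2_hat, z_hat); None if a delta is not positive."""
    det = A[0][0] * A[1][1] - A[1][0] * A[0][1]
    pe1, pe2 = p[0] - eps[0], p[1] - eps[1]          # p_j - eps_j
    d1 = (pe1 * A[1][1] - pe2 * A[0][1]) / (beta[0] * rho[0] * det)
    d2 = (pe2 * A[0][0] - pe1 * A[1][0]) / (beta[1] * rho[1] * det)
    if d1 <= 0.0 or d2 <= 0.0:
        return None                                   # fractional power undefined
    x = d1 ** (1.0 / (rho[0] - 1.0)) - 1.0            # a11 q1 + a21 q2
    y = d2 ** (1.0 / (rho[1] - 1.0)) - 1.0            # a12 q1 + a22 q2
    q1 = (A[1][1] * x - A[1][0] * y) / det
    q2 = (A[0][0] * y - A[0][1] * x) / det
    return q1, q2, w - p[0] * q1 - p[1] * q2


def foc_residual(i, q, A, beta, rho, p, eps):
    """Marginal utility of good i plus eps_i minus p_i (zero at an interior optimum)."""
    mu = sum(A[i][j] * beta[j] * rho[j]
             * (A[0][j] * q[0] + A[1][j] * q[1] + 1.0) ** (rho[j] - 1.0)
             for j in range(2))
    return mu + eps[i] - p[i]


# Illustrative values only.
A = [[1.0, 0.0], [0.0, 1.0]]
beta, rho = [5.0, 5.0], [0.5, 0.4]
p, eps, w = [0.3, 0.3], [0.0, 0.0], 30.0

q1_hat, q2_hat, z_hat = interior_solution(A, beta, rho, p, eps, w)
residuals = [foc_residual(i, (q1_hat, q2_hat), A, beta, rho, p, eps) for i in range(2)]
```

If any of the three returned quantities is negative, or if the function returns `None`, the interior candidate is not the solution and one of the corner cases applies.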

The hats are to remind the reader that these can be considered candidate solutions: there are several ways in which this interior solution may break down, leading to a non-interior solution. The above solutions may yield $\hat{q}_1 < 0$, $\hat{q}_2 < 0$, or $\hat{z} < 0$. Additionally, if either of the δ’s is negative, then the above solution will predict complex consumption. I explore each of these cases in the next few paragraphs.

Assume for the moment that both δ’s are positive. Then if $\hat{q}_1 < 0$ and $\hat{z} < 0$, the solution is $(0, \frac{w}{p_2}, 0)$, while if $\hat{q}_2 < 0$ and $\hat{z} < 0$, the solution is $(\frac{w}{p_1}, 0, 0)$. If $\hat{z} > 0$ and $\hat{q}_1 < 0$, then

\[ q_1 = 0, \quad z = w - p_2 q_2 \quad \text{and} \quad q_2 = \max\left\{0, \min\left\{q_2^*, \frac{w}{p_2}\right\}\right\} \tag{2.42} \]

where $q_2^*$ is the solution to the equation¹⁶:

\[ \frac{a_{21}\beta_1\rho_1(a_{21}q_2^* + 1)^{\rho_1 - 1} + a_{22}\beta_2\rho_2(a_{22}q_2^* + 1)^{\rho_2 - 1} + \varepsilon_2}{p_2} = 1 \]

16 Note that the concavity conditions guarantee that this solution, if positive, is unique.

Similarly, if $\hat{z} > 0$ and $\hat{q}_2 < 0$, then

\[ q_2 = 0, \quad z = w - p_1 q_1 \quad \text{and} \quad q_1 = \max\left\{0, \min\left\{q_1^*, \frac{w}{p_1}\right\}\right\} \tag{2.43} \]

where $q_1^*$ is the solution to the equation:

\[ \frac{a_{11}\beta_1\rho_1(a_{11}q_1^* + 1)^{\rho_1 - 1} + a_{12}\beta_2\rho_2(a_{12}q_1^* + 1)^{\rho_2 - 1} + \varepsilon_1}{p_1} = 1 \]

Finally, if $\hat{z} < 0$, $\hat{q}_1 > 0$ and $\hat{q}_2 > 0$, then we have $\lambda_1 = \lambda_2 = 0$ and $\lambda_0 = 1 + \lambda_3$. Rather than trying to solve for $\lambda_3$, in this case it is easier to return to the first-order conditions, which give us:

\[ \lambda_0 p_1 = a_{11}\beta_1\rho_1(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 1} + a_{12}\beta_2\rho_2(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 1} + \varepsilon_1 \tag{2.44} \]

\[ \lambda_0 p_2 = a_{21}\beta_1\rho_1(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 1} + a_{22}\beta_2\rho_2(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 1} + \varepsilon_2 \tag{2.45} \]

Dividing the first equation by the second yields:

\[ \frac{p_1}{p_2} = \frac{a_{11}\beta_1\rho_1(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 1} + a_{12}\beta_2\rho_2(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 1} + \varepsilon_1}{a_{21}\beta_1\rho_1(a_{11}q_1 + a_{21}q_2 + 1)^{\rho_1 - 1} + a_{22}\beta_2\rho_2(a_{12}q_1 + a_{22}q_2 + 1)^{\rho_2 - 1} + \varepsilon_2} \]

Substituting in $q_2 = \frac{w - p_1 q_1}{p_2}$ and simplifying:

\[ (p_2 a_{11}\beta_1\rho_1 - p_1 a_{21}\beta_1\rho_1)\left(a_{11}q_1 + a_{21}\frac{w - p_1 q_1}{p_2} + 1\right)^{\rho_1 - 1} + p_2\varepsilon_1 = (p_1 a_{22}\beta_2\rho_2 - p_2 a_{12}\beta_2\rho_2)\left(a_{12}q_1 + a_{22}\frac{w - p_1 q_1}{p_2} + 1\right)^{\rho_2 - 1} + p_1\varepsilon_2 \]

Hence the solution is

\[ z = 0, \quad q_2 = \frac{w - p_1 q_1}{p_2} \quad \text{and} \quad q_1 = q_1^* \tag{2.46} \]

where $q_1^*$ is the solution to

\[
\begin{aligned}
& (p_2 a_{11}\beta_1\rho_1 - p_1 a_{21}\beta_1\rho_1)\left(a_{11}q_1 + a_{21}\frac{w - p_1 q_1}{p_2} + 1\right)^{\rho_1 - 1} + p_2\varepsilon_1 = && (2.47) \\
& (p_1 a_{22}\beta_2\rho_2 - p_2 a_{12}\beta_2\rho_2)\left(a_{12}q_1 + a_{22}\frac{w - p_1 q_1}{p_2} + 1\right)^{\rho_2 - 1} + p_1\varepsilon_2 && (2.48)
\end{aligned}
\]
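The single-good corner case of Equation 2.42 has no closed form in general, but it is a one-dimensional root-finding problem. The sketch below solves it by bisection, which is a substitute technique chosen here for simplicity and is not taken from the text. It assumes the concavity conditions hold, so that marginal utility is decreasing in the quantity purchased. The function name, argument names, and parameter values are hypothetical.

```python
def single_good_demand(a1, a2, beta, rho, price, eps, w, tol=1e-10):
    """q = max{0, min{q*, w/price}} when only one inside good is purchased.

    a1, a2 are the good's amounts of characteristics one and two.
    Assumes marginal utility is decreasing in q (the concavity conditions).
    """
    def excess(q):
        # Marginal utility plus eps minus price; q* is where this equals zero.
        mu = sum(a * b * r * (a * q + 1.0) ** (r - 1.0)
                 for a, b, r in zip((a1, a2), beta, rho))
        return mu + eps - price

    lo, hi = 0.0, w / price
    if excess(lo) <= 0.0:
        return 0.0            # the first unit is not worth its price
    if excess(hi) >= 0.0:
        return hi             # budget constraint binds before q* is reached
    while hi - lo > tol:      # bisection on a decreasing function
        mid = 0.5 * (lo + hi)
        if excess(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative values: good two with characteristics (0, 1).
beta, rho = (5.0, 5.0), (0.5, 0.4)
q2_unconstrained = single_good_demand(0.0, 1.0, beta, rho, 0.3, 0.0, 30.0)
q2_budget_bound = single_good_demand(0.0, 1.0, beta, rho, 0.3, 0.0, 1.0)
q2_priced_out = single_good_demand(0.0, 1.0, beta, rho, 3.0, 0.0, 30.0)
```

The same function covers Equation 2.43 by passing good one's characteristics and price instead.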

Now, consider the case in which we would have either $\delta_1 < 0$ or $\delta_2 < 0$ at an interior solution, i.e., when

\[ \frac{(p_1 - \varepsilon_1)a_{22} - (p_2 - \varepsilon_2)a_{12}}{\beta_1\rho_1(a_{11}a_{22} - a_{21}a_{12})} < 0 \tag{2.49} \]

or

\[ \frac{(p_2 - \varepsilon_2)a_{11} - (p_1 - \varepsilon_1)a_{21}}{\beta_2\rho_2(a_{11}a_{22} - a_{21}a_{12})} < 0 \tag{2.50} \]

Closer inspection reveals two insights: (1) these are mutually exclusive cases and (2) they correspond to “dominance” of one inside good by the other. Basically, one good is dominated by another if you can get all of its desirable characteristics more cheaply from some other good (or combination of goods). There are two ways for good one to be dominated by good two. If good one’s principal¹⁷ characteristic is bad (has a negative β), then unless good one is a cheaper source of the other characteristic than good two, it will have zero demand. Similarly, if good one’s non-principal characteristic is good (has a positive β), then unless good one is a cheaper source of its principal characteristic than good two, it will have zero demand. In what follows, I define $\bar{p}_j = p_j - \varepsilon_j$. $\bar{p}_j$ can be thought of as the effective price for the household of one unit of good j.

17 In the two-good/two-characteristic case, if det A > 0, good one’s principal characteristic is the first characteristic. If det A < 0, good one’s principal characteristic is the second characteristic. In the three good case, the intuition is less clear, but the mathematics are similar.

Table 2.6: Summary of Dominance Conditions

det A   β1    β2    a12/p̄1 − a22/p̄2   a11/p̄1 − a21/p̄2   Dominant Good
>0      >0          >0                                    1
>0      <0          <0                                    2
>0            >0                       <0                 2
>0            <0                       >0                 1
<0      >0          <0                                    2
<0      <0          >0                                    1
<0            >0                       >0                 1
<0            <0                       <0                 2

The inequalities in Equations 2.49 and 2.50 imply that good one is dominated by good two (and thus has zero demand) if any of A1-A4 are met:

Condition A1: det A > 0, β1 < 0, and a12/p̄1 < a22/p̄2.
Condition A2: det A > 0, β2 > 0, and a11/p̄1 < a21/p̄2.
Condition A3: det A < 0, β1 > 0, and a12/p̄1 < a22/p̄2.
Condition A4: det A < 0, β2 < 0, and a11/p̄1 < a21/p̄2.

Note that for a particular set of parameters (A, β, ρ), at most one of these conditions is relevant. Similarly, good two is dominated by good one if any of B1-B4 hold:

Condition B1: det A > 0, β1 > 0, and a22/p̄2 < a12/p̄1.
Condition B2: det A > 0, β2 < 0, and a21/p̄2 < a11/p̄1.
Condition B3: det A < 0, β1 < 0, and a22/p̄2 < a12/p̄1.
Condition B4: det A < 0, β2 > 0, and a21/p̄2 < a11/p̄1.

If good two is dominated by good one, then the solution is given by Equation 2.43, while if good one is dominated by good two, the solution is given by Equation 2.42. These conditions are summarized in Table 2.6.
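Conditions A1-A4 and B1-B4 collapse into two sign checks, which the sketch below encodes. It is only an illustration of the table above: the function name and the example values are hypothetical, effective prices $p_j - \varepsilon_j$ are assumed to be strictly positive, and `A[i][j]` is taken to hold $a_{(i+1)(j+1)}$.

```python
def dominant_good(A, beta, p, eps):
    """Return 1 or 2 for the dominant good per Table 2.6, or None if neither dominates."""
    det = A[0][0] * A[1][1] - A[1][0] * A[0][1]
    pe1, pe2 = p[0] - eps[0], p[1] - eps[1]      # effective prices, assumed > 0
    diff1 = A[0][1] / pe1 - A[1][1] / pe2        # a12/p1bar - a22/p2bar
    diff2 = A[0][0] / pe1 - A[1][0] / pe2        # a11/p1bar - a21/p2bar
    # A1/A3: det*beta1 < 0 with diff1 < 0; A2/A4: det*beta2 > 0 with diff2 < 0.
    if (det * beta[0] < 0 and diff1 < 0) or (det * beta[1] > 0 and diff2 < 0):
        return 2                                  # good one is dominated
    # B1/B3: det*beta1 > 0 with diff1 > 0; B2/B4: det*beta2 < 0 with diff2 > 0.
    if (det * beta[0] > 0 and diff1 > 0) or (det * beta[1] < 0 and diff2 > 0):
        return 1                                  # good two is dominated
    return None


# Illustrative values only.
A = [[1.0, 0.5], [0.0, 1.0]]
beta = [5.0, 5.0]
neither = dominant_good(A, beta, (0.3, 0.3), (0.0, 0.0))
good_one_wins = dominant_good(A, beta, (0.1, 0.3), (0.0, 0.0))
good_two_wins = dominant_good([[1.0, 0.5], [0.9, 1.0]], beta, (0.5, 0.3), (0.0, 0.0))
```

In the second example good one supplies the second characteristic more cheaply than good two does, which is Condition B1; in the third, good two supplies the first characteristic more cheaply than good one, which is Condition A2.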

2.10 Appendix C: Additional Figures

Figure 2.29: Demand for goods one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. β1 = β2 = 5, ρ1 = ρ2 = 0.5, p2 = $0.50, w = 60, and

\[ A = \begin{pmatrix} 1 & 0.999 \\ 0.999 & 1 \end{pmatrix} \]

[Two panels: q1(p1 | p2 = $0.50) and q2(p1 | p2 = $0.50). Vertical axis in both panels: Price of Good 1. Horizontal axes: q1 and q2.]

Figure 2.30: Demand for goods one and two as a function of p1, for the 40th and 60th percentiles of ε1 and ε2. β1 = 6.875, β2 = 5, ρ1 = 0.4, ρ2 = 0.5, p2 = $0.50, w = 60, and

\[ A = \begin{pmatrix} 1 & 0.999 \\ 0.999 & 1 \end{pmatrix} \]

[Two panels: q1(p1 | p2 = $0.50) and q2(p1 | p2 = $0.50). Vertical axis in both panels: Price of Good 1. Horizontal axes: q1 and q2.]

Figure 2.31: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. β1 = β2 = 5, ρ1 = 0.5, ρ2 = 0.4, p1 = p2 = 0.3, w = 30, and

\[ A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \]

The probabilities of the four regions are: 0.23, 0.06, 0.55, 0.15.

Figure 2.32: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30, and

\[ A = \begin{pmatrix} 1 & 0.02 \\ 0.02 & 1 \end{pmatrix} \]

The probabilities of the four regions are: 0.22, 0.08, 0.47, 0.23.



Figure 2.33: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30, and

\[ A = \begin{pmatrix} 1 & 0.25 \\ 0.25 & 1 \end{pmatrix} \]

The probabilities of the four regions are: 0.36, 0.05, 0.22, 0.37.

Figure 2.34: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 1, and

\[ A = \begin{pmatrix} 1 & 0.1 \\ 0.1 & 1 \end{pmatrix} \]

The probabilities of the seven regions are: 0.16, 0.11, 0.07, 0.10, 0.11, 0.23, 0.16.

Figure 2.35: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. β1 = β2 = 0.5, ρ1 = 0.5, ρ2 = 0.4, p1 = p2 = 0.3, w = 30, and

\[ A = \begin{pmatrix} 10 & 1 \\ 1 & 10 \end{pmatrix} \]

The probabilities of the four regions are: 0.28, 0.07, 0.38, 0.28.

Figure 2.36: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. β1 = β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30, and

\[ A = \begin{pmatrix} 1 & 0.08 \\ 0 & 1 \end{pmatrix} \]

The probabilities of the four regions are: 0.27, 0.08, 0.45, 0.20.



Figure 2.37: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. β1 = 7, β2 = 5, ρ1 = ρ2 = 0.4, p1 = p2 = 0.3, w = 30, and

\[ A = \begin{pmatrix} 1 & 0.5 \\ 0 & 1 \end{pmatrix} \]

The probabilities of the four regions are: 0.65, 0.03, 0.22, 0.10.

Figure 2.38: Graph of purchases as a function of error terms. The groupings are the same as in Figure 2.15. β1 = β2 = 5, ρ1 = 0.5, ρ2 = 0.4, p1 = p2 = 0.3, w = 30, and

\[ A = \begin{pmatrix} 1 & 0.5 \\ 0 & 1 \end{pmatrix} \]

The probabilities of the four regions are: 0.66, 0.04, 0.19, 0.11.



Chapter 3

Investigating the Costs of Uniform Pricing

3.1 Introduction

3.1.1 Overview

Retailers typically sell many different products from the same manufacturer at the same price. As mentioned in chapter one, there are a large number of potential explanations for this uniform pricing behavior. One potential explanation is that retailers face managerial menu costs, and hence find it costly to charge the optimal price for each product in each period. It is this explanation that I investigate in more depth in this chapter. In order to assess the plausibility of the menu costs explanation, I would like to know how large these menu costs would have to be to lead retailers to charge uniform prices. The following framework is useful for analyzing the retailer’s decision process. Assume that each period, the retailer maximizes expected profits, less menu costs:

\[ \text{Expected Profit this Period} = \text{Expected Revenue from Sales} - \text{Expected Cost of Goods} - \text{Menu Costs this Period} \tag{3.1} \]

My approach to measuring the menu costs that would rationalize the observed uniform pricing behavior is to look at the counterfactual expected profits earned by the retailer


CHAPTER 3. COSTS OF UNIFORM PRICING


under uniform and non-uniform pricing strategies. That is, for each week, I compute:

\[
\begin{aligned}
\text{Minimum Menu Cost this Period that Rationalizes Uniform Prices} ={}& \left( \text{Expected Revenue from Sales this Period with Non-Uniform Prices} - \text{Expected Cost of Goods this Period with Non-Uniform Prices} \right) && (3.2) \\
& - \left( \text{Expected Revenue from Sales this Period with Uniform Prices} - \text{Expected Cost of Goods this Period with Uniform Prices} \right) && (3.3)
\end{aligned}
\]

The left-hand side of this equation corresponds to the amount of profit that the retailer actually earned by charging non-uniform prices instead of uniform prices. Therefore, the model predicts that if the retailer had faced menu costs higher than this amount, I would have observed that retailer charging uniform prices, if the choice was made on a week-by-week basis. More likely, as discussed in chapter one, the uniform versus non-uniform pricing strategy decision is made infrequently. This would mean that the relevant profit difference to consider would be the discounted value of the sum of the profit differences over several weeks. The difference between the profits earned under uniform and non-uniform prices depends only on the demand function faced by the store and its marginal costs. In order to make this comparison empirically, it is necessary to learn (a) the demand system faced by the retailer and (b) the cost structure faced by the retailer. Learning these two pieces, in order to perform the counterfactual experiment posed above, requires developing a structural model of demand and supply.
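The two versions of the menu cost threshold described above, week-by-week and discounted over several weeks, amount to simple arithmetic. The sketch below illustrates them with made-up numbers. The function names, the weekly profit figures, and the weekly discount factor are all hypothetical and are not estimates from this dissertation.

```python
def weekly_threshold(profit_nonuniform, profit_uniform):
    """Equations 3.2-3.3: smallest weekly menu cost that rationalizes uniform prices."""
    return profit_nonuniform - profit_uniform


def discounted_threshold(weekly_differences, weekly_discount_factor):
    """Discounted sum of weekly profit differences, for an infrequent pricing-regime choice."""
    return sum(diff * weekly_discount_factor ** t
               for t, diff in enumerate(weekly_differences))


# Hypothetical expected profits (non-uniform, uniform) for three weeks.
weeks = [(110.0, 100.0), (120.0, 110.0), (95.0, 85.0)]
differences = [weekly_threshold(non, uni) for non, uni in weeks]
multi_week_threshold = discounted_threshold(differences, 0.5)
```

A discount factor of 0.5 per week is used here only so that the arithmetic is easy to follow; any realistic weekly factor would be very close to one.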

3.1.2 Data: Soft Drinks

The main problem in calculating this potential profit difference between uniform and non-uniform pricing strategies is in estimating demand. Specifically, I must estimate a demand system for different varieties. As discussed in chapter one, if these different varieties always sell at the same price, it is not possible to identify demand for the different varieties. After much searching, I found a dataset that contained the necessary variation. In this dataset, one retailer charged non-uniform prices for different flavors of carbonated soft drinks.


Like many products, carbonated soft drinks are typically priced uniformly by manufacturer-brand-size. For example, at a typical retailer, 2-Liter bottles of Pepsi, Diet Pepsi, and Diet Caffeine-Free Pepsi all typically sell at the same price as each other. This behavior can be seen in Figure 1.3. By contrast, a typical retailer will sell Coke and Pepsi at different prices, as seen in Figure 1.1. In my dataset, I observe a retailer charging non-uniform prices for soft drinks. An example of this variation can be seen in Figure 1.4. It is this variation (and similar variation in other soft drink varieties at this retailer) that I use to identify demand for individual varieties of soft drinks. There are a number of features that make the carbonated soft drink category amenable to this investigation. First, the soft drink category has a large number of products, and a large number of different varieties of similar products. Second, contrary to many other product categories, these products could plausibly be grouped in a number of alternative ways. This stems from the fact that the product packaging for soft drinks is largely identical across brands and manufacturers. The label on a 2-Liter bottle of Coke may differ from its Pepsi counterpart, but the physical shape of the bottle is identical. This physical similarity is quite different from other product categories, like yogurt, where consumers might be less likely to accept line pricing by flavors. Third, soft drinks are the most frequently purchased item in scanner data. According to scanner data, in most product categories, the median household makes a purchase only a couple of times per year. By contrast, the median household in the sample purchases soft drinks on eleven occasions over the sample period of two years. This gives us the hope of obtaining reasonably good estimates. There are, however, several potential drawbacks to looking at carbonated soft drinks.
First, anecdotal evidence suggests that consumers may stockpile soft drinks. This would be problematic for me, since the demand model I developed in chapter two does not account for dynamics. However, contrary to expectations, in their descriptive paper, Hendel & Nevo (2002) find no evidence of stockpiling of soft drinks. This is not definitive however, since they also find little evidence of stockpiling in detergents, while their structural paper on detergents (Hendel & Nevo 2002) does find such effects. A second potential problem with using soft drinks is that retailers may view the soft drink category as a “loss-leader” category, pricing it low in order to drive store traffic. Although it features in some theoretical work (see for example, Hess & Gerstner (1987)) there is little empirical evidence of loss leader behavior. In a dataset covering the same geographic area and time period as my own in which I observe actual retail and wholesale

CHAPTER 3. COSTS OF UNIFORM PRICING


prices, I do not observe negative margins except in a handful of cases. Nevertheless, it is possible for loss-leader behavior to subtly depress soft drink prices without actually pushing them below marginal costs. To the extent that cross-category loss-leader behavior by the retailer does occur, it would only affect the assumptions I use to recover marginal costs. My method would interpret low prices as evidence of low marginal costs, when in fact prices would be low in order to drive store traffic. My approach could be modified to account for cross-category loss-leader behavior, but I lack strong candidates for an alternative to assuming the retailer maximizes profits on a category-by-category basis, and simultaneously estimating demand for several product categories is beyond the scope of this paper.

A third potential problem with the soft drink industry is the argument that the soft drink category is different because the two main manufacturers, Coke and Pepsi, have extraordinarily strong brands and, as a result, may exert pressure on retailers to price their products in a particular way. If Coke and Pepsi exerted influence over pricing, this would not present a problem for my demand estimation, which relies solely on the fact that I observe price variation. It could potentially affect my profit calculations. However, what I see in the data does not appear to be consistent with Coke and Pepsi influencing pricing (at least not at the store that charged non-uniform prices). What I observe in the data is one store charging non-uniform prices, and all other stores charging uniform prices. It seems highly implausible that Coke and Pepsi would (or could) dictate to every other store in the area to charge uniform prices, but allow the retailer I observe to charge non-uniform prices.

3.1.3

Estimating Demand

Having found a dataset containing sufficient price variation to identify demand for individual varieties, it became clear that demand for carbonated soft drinks presents a fourth potential problem: it has several key features that make existing demand models unsuitable. These features are that choice is continuous, there are a large number of products, and households may be variety-seeking. As noted in chapter two, existing demand models are unable to simultaneously handle these features. This led me to develop the new model of demand described in chapter two.

In calculating the retailer’s counterfactual expected profit, the relevant demand function to consider is the residual demand function, which reflects the presence of other stores in the market. The difference between residual demand and market demand is that the residual demand function accounts for the fact that the prices charged at other stores affect demand


at any particular store. One can think of the residual demand curve as having several components; that is, in purchasing carbonated soft drinks, households make a series of decisions. They must decide whether to shop, where to shop, what to buy, and how much to buy. With respect to the first stage of this process, although some authors, such as Kahn & Schmittlein (1989), have investigated the household’s decision of when to shop, little progress has been made in identifying the factors that affect this process. Therefore, following Bell & Lattin (1998), Rhee & Bell (2002), and others, I take the household’s decision to go shopping, as well as its shopping budget on that occasion, to be exogenously determined and uncorrelated with all unobservables. Moving to the next level of the decision tree, rather than develop a structural model that incorporates the household’s simultaneous decision of where to shop, what to buy, and in what quantity,[1] I follow Bell & Lattin (1998) in decomposing the decision into two conditionally independent parts. Thus the demand at store A in week t is:

$$
E_t[Q_t(p_t)] = \sum_i E\left[\text{$i$'s purchases} \mid \text{$i$ goes to A}\right] \cdot P\left(\text{$i$ goes to A} \mid \text{$i$'s characteristics and prices at all stores}\right) \tag{3.4}
$$

In order to estimate this residual demand function $Q_t(p_t)$ for carbonated soft drinks faced by the retailer, I follow a two-step approach described in more detail in section 3.2. First, in section 3.2.1, I briefly review the demand model from the previous chapter, which I use to estimate a structural model of product choice, conditional on store choice. Then, in section 3.2.2, I describe the model of store choice.

3.1.4

Counter-Factuals and Preview of Main Result

The second step required to calculate the profit difference suggested in Equation 3.3 is to recover the cost structure faced by the retailer. In particular, to calculate the implied managerial menu costs, I need to make assumptions about how the retailer set its prices during the time period of my data. This step is essential to recovering the retailer’s marginal costs. If I had marginal cost data for the retailer, I would not need to make these assumptions about the competitive environment. The key assumption I choose to make is that the retailer faced Bertrand competition with the other local grocery stores in my sample, as well as other retailers outside my sample. I further assume that the retailer maximized profits for the soft drink category separately from its other categories (i.e., I assume that the retailer did not use loss leaders), faced constant marginal costs (which may have varied from product to product and week to week), and charged the profit-maximizing price for each product (UPC). Although I could choose from a multitude of alternative assumptions about the retailer’s price-setting strategy, among them those listed in section 1.3, it is difficult to know how to select among them without further information.

[1] A fully structural model would involve calculating the household’s expected utility, net of travel costs, from visiting each of the stores in its choice set. Such a model would also involve consumers forming expectations of the menu of prices they would face at each store. Given my estimation procedure, this approach is far too computationally burdensome. Instead, I approximate this choice by assuming that households’ choices among the stores in my sample follow a logit choice model. This model represents an approximation of the true model and does not directly correspond to any model of utility maximization. I use it because it is computationally cheap, and because I believe the approximation is reasonable.

Together, these assumptions imply that in each week t, the retailer chooses prices for each soft drink j that maximize the single-period expected profit[2] function for carbonated soft drinks:

$$
E_t[\Pi_t] = \sum_{j \in J} E_t[Q_{jt}(p_t)] \, (p_{jt} - c_{jt}) \tag{3.5}
$$

where $p_t$ is the vector of prices $p_{jt}$, $c_{jt}$ is the marginal cost of good j in week t, and the retailer’s expectation is taken over the idiosyncratic household-level preference shocks. These assumptions about the retailer’s price-setting strategy allow me to recover the implied marginal cost for each product from the retailer’s first-order conditions for profit maximization. The assumption of profit maximization implies that:

$$
\frac{\partial E_t[\Pi_t]}{\partial p_{jt}} = E_t[Q_{jt}(p_t)] + \sum_{k \in J} (p_{kt} - c_{kt}) \, \frac{\partial E_t[Q_{kt}(p_t)]}{\partial p_{jt}} = 0 \tag{3.6}
$$

in each week t, for each of the products j sold by the retailer. This gives me a system of J equations in J unknowns (the $c_{jt}$’s) for each week. I then use these marginal costs to calculate the retailer’s expected profit for each week under uniform and non-uniform pricing strategies. To calculate the retailer’s expected profits from non-uniform pricing, I calculate:

$$
E_t[\Pi_t] = \sum_{j \in J} E_t[Q_{jt}(p_t)] \, (p_{jt} - c_{jt})
$$

at the actual prices. Because I do not observe the retailer charging uniform prices, calculating the retailer’s expected profits from uniform pricing involves one final step: I must numerically solve for the hypothetically optimal uniform prices, that is, the set of prices that maximize expected profits (based on the demand system and marginal costs that I estimate) subject to the constraint that they be uniform by manufacturer-brand-size.

[2] The retailer’s expectations are over the unknown (to the retailer and the econometrician) realizations of the households’ idiosyncratic preference shocks.
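As a purely illustrative sketch of the two calculations just described, the following Python fragment inverts the first-order conditions (3.6) to recover marginal costs and then searches for the best single uniform price. It assumes a hypothetical two-product linear demand system, Q = a + Bp, in place of the simulated demand system estimated in this chapter; the function names (`solve`, `demand`, `implied_costs`, `profit`, `best_uniform_price`) and all numbers are made up for the example.

```python
def solve(matrix, rhs):
    """Solve a small dense linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def demand(a, B, p):
    """Hypothetical linear stand-in for E_t[Q_t(p_t)]: Q = a + B p, B[k][j] = dQ_k/dp_j."""
    return [a_k + sum(b * p_j for b, p_j in zip(row, p)) for a_k, row in zip(a, B)]


def implied_costs(a, B, p):
    """Invert the first-order conditions (3.6): Q + B'(p - c) = 0, so c = p - m, B'm = -Q."""
    B_t = [list(col) for col in zip(*B)]
    margins = solve(B_t, [-q for q in demand(a, B, p)])
    return [p_j - m_j for p_j, m_j in zip(p, margins)]


def profit(a, B, p, c):
    """Category profit, as in equation (3.5)."""
    return sum(q * (p_j - c_j) for q, p_j, c_j in zip(demand(a, B, p), p, c))


def best_uniform_price(a, B, c, lo=0.0, hi=10.0, iters=200):
    """Ternary search for the single common price that maximizes (concave) profit."""
    n = len(c)
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if profit(a, B, [m1] * n, c) < profit(a, B, [m2] * n, c):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


# Made-up numbers for two products; observed prices are treated as profit-maximizing.
a = [10.0, 8.0]
B = [[-4.0, 1.0], [1.0, -3.0]]
observed_prices = [2.0, 2.5]

costs = implied_costs(a, B, observed_prices)
uniform = best_uniform_price(a, B, costs)
gain = profit(a, B, observed_prices, costs) - profit(a, B, [uniform] * 2, costs)
print(costs, uniform, gain)
```

Because the observed prices are assumed to satisfy the first-order conditions, the profit at those prices can never fall below the profit at the constrained uniform price; the difference `gain` is the toy analogue of the weekly profit difference studied in this chapter.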

3.1.5

Conclusion

Comparing profits on a week-by-week basis between the uniform and non-uniform pricing strategies, I find that although the difference varies from week to week, depending primarily on changes in the marginal costs, the average weekly difference in profits is $36.56 (in nominal dollars). After adjusting for inflation, my estimate of the total difference in profits between the two pricing strategies over the entire two-year sample is $5,135. This suggests that single-store grocery retailers will not find it profitable to learn what optimal non-uniform prices they should charge if this learning costs them more than roughly $2,500 per year. However, it also suggests that, absent demand-side factors not measured in this paper, many large grocery chains may be leaving money on the table with respect to pricing in the soft drink category.

3.2

Empirical Demand Model

As mentioned in the introduction, the distinction between market demand and residual demand is an important one. Several previous empirical demand studies have ignored this aspect for the very good reason that in most datasets, this information is simply not available – prices for other stores are not observed. In my case, however, I observe the prices charged at four other competing stores. Furthermore, the additional stores in the dataset were chosen precisely because they were the stores that shoppers were most likely to switch to. I assume that the households’ choice process is as follows. First, I assume that an exogenous process governs consumers’ decision of whether to shop in a given week, as well as their total grocery expenditure in that week. Second, conditional on deciding to shop, I model households’ store-choice decision using a conditional logit. Then, conditional on a household’s choice of store, I assume that they optimally allocate expenditure between soft drinks and all other groceries. Hence, the residual demand faced by store A is equal to


the sum over all households i (that went shopping in that week) of the probability that the household chose store A, multiplied by their expected purchases $q_{it}$, conditional on choosing store A. The resulting expected residual demand vector faced by store A in week t is:

$$
E_t[Q(p)] = \sum_i E\left[\text{$i$'s purchases} \mid \text{$i$ goes to A}\right] \cdot P\left(\text{$i$ goes to A} \mid \text{$i$'s characteristics and prices at all stores}\right) \tag{3.7}
$$

or, more formally:

$$
E\left[Q(p_t \mid \Omega_{obs,t}, \Omega_{unobs,t})\right] = \sum_i E\left[q_{it} \mid \text{parameters}, \Omega_{obs,t}\right] \cdot P\left(A \mid \text{parameters}, \Omega_{obs,t}\right) \tag{3.8}
$$

where $\Omega_{obs,t}$ and $\Omega_{unobs,t}$ represent observed and unobserved variables specifying the state of the world in week t. For tractability, I assume that the store choice decision is made independently of the households’ soft drink purchase decision. Economically, this rules out, for example, going to store A because it is the only store that carries product j. More importantly, it also rules out going to a particular store based on a purchase-occasion-specific shock. This means that I am assuming that households’ idiosyncratic preference shocks for varieties of soft drinks are independent of their idiosyncratic shock for their store-choice decision. Unfortunately, this means that the complete demand model is not consistent with utility maximization.[3] The next two subsections describe the specification and estimation procedure used for each of these two components of residual demand in more detail.

3.2.1

Product-Level Demand Model

As discussed earlier, it is a fundamental feature of the grocery market for carbonated soft drinks that households make continuous choices among a large number of products, and may be variety-seeking. Recall that in Table 1.1, nearly 70% of the households’ purchase occasions involve the purchase of more than one unit, and roughly 25% of purchase occasions involve the purchase of more than one different UPC. As discussed in chapter two, existing demand models were unable to accommodate these features. Therefore, in modeling demand for soft drinks I use the household-level demand model developed in the previous chapter, which I briefly review here.

[3] For example, the household’s choice of store is assumed to be conditionally independent of the household’s preferences for particular items. This is reflected in my use of a generic average price, rather than an average of the bundle of products the household would purchase.


In order to reduce the dimensionality of the parameter space, I assume that, as in the logit models considered above, households derive utility from product characteristics. Each product j (shown in Table 3.3) can then be expressed as a vector of C different characteristics (described in section 3.3.1), and the menu faced by the household can be represented by a J × C matrix A, where the rows of A are the products and the columns are characteristics. Hence, A is just a stacked matrix consisting of the $X_{jt}$’s from the logit model. To illustrate how the dataset fits into the model from chapter two, consider an example with just two characteristics (Diet and Cola) and three available products (Coke, Diet Coke, and Diet 7Up). Because it is a non-diet cola, Coke has characteristic vector [0 1].[4] As a diet cola, Diet Coke has characteristic vector [1 1]. Finally, as a diet non-cola, Diet 7Up has characteristic vector [1 0]. Stacking these three products’ characteristic vectors yields the following A matrix:[5]

$$
A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \\ 1 & 0 \end{bmatrix}
$$

Again following the logit, I assume that a household’s utility function is additively separable in these characteristics. Unlike the logit, however, my model allows households to consume multiple units of a single product, as well as several different products. In particular, I assume that in week t, household i myopically maximizes the utility function:[6]

$$
U_{it}(q_t, z_t) = \sum_{c \in C} \beta_c \left(A_{ct}' q_t\right)^{\rho_c} + \varepsilon_{it}' q_t + z_t \tag{3.9}
$$

with respect to the vector $q_t$ and the scalar $z_t$. $A_{ct}$ is the cth column of A in week t, $q_t$ is a column vector of length J comprising the household’s purchases of the J goods described by A, $z_t$ is the amount of outside good consumed, and $\beta_c$ and $\rho_c$ are the characteristic-specific scalar parameters for that household. The J-dimensional vector $\varepsilon_{it}$ represents the household/shopping-occasion marginal utility shock, which is observed by the utility-maximizing household, but not by the econometrician.[7] The household maximizes this utility function subject to the budget constraint:

$$
\sum_{j \in J} p_{jt} q_{jt} + z_t \le w_{it} \tag{3.10}
$$

where $w_{it}$ is the household’s total grocery expenditure in the store in week t.

[4] Note that although in this case the product characteristics are indicator variables, in general they need only be non-negative. For example, in the estimated model one of the characteristics is the number of milligrams of caffeine per 12-ounce serving.

[5] Although the A matrix shown here is time invariant, the empirical A matrix will typically vary from week to week, because I include feature and display as time-varying characteristics.

[6] For simplicity, the model assumes households are homogeneous in their preferences (β and ρ). Extending the model to account for heterogeneity (observed or unobserved) is straightforward, though computationally burdensome. The relevant details, difficulties and tradeoffs are discussed in the Appendix.

Returning to the three-good example above and substituting the A matrix into the utility function, we can see that the household maximizes:[8]

$$
U_{it}(q_t, z_t) = \beta_{Diet} \left(q_{DietCoke,t} + q_{Diet7Up,t}\right)^{\rho_{Diet}} + \beta_{Cola} \left(q_{Coke,t} + q_{DietCoke,t}\right)^{\rho_{Cola}} + \varepsilon_{i,Coke,t} \, q_{Coke,t} + \varepsilon_{i,DietCoke,t} \, q_{DietCoke,t} + \varepsilon_{i,Diet7Up,t} \, q_{Diet7Up,t} + z_t
$$

with respect to q and z, subject to:

$$
p_{Coke,t} \, q_{Coke,t} + p_{DietCoke,t} \, q_{DietCoke,t} + p_{Diet7Up,t} \, q_{Diet7Up,t} + z_t \le w_{it}
$$

The model is estimated using the Method of Simulated Moments (MSM), developed independently by McFadden (1989) and Pakes & Pollard (1989).[9] This estimation method uses the fact that the expectation of the difference between the actual, observed purchases and the expected purchases is zero at the true parameter values. More formally, I use the (|J| + 2) × (|J| + 1) moment conditions:

$$
H_{I,T,R}(\beta, \rho) = \frac{1}{IT} \sum_{i=1}^{I} \sum_{t=1}^{T} \left(q_{it} - E[q_{it} \mid \beta, \rho, p_t, w_{it}]\right) x_{it} \tag{3.11}
$$

[7] I currently assume that $\varepsilon_{it}$ is i.i.d. across products, time, and households, and negatively log-normally distributed on the interval (−∞, 0). It is necessary to bound $\varepsilon_{it}$ from above in order to prevent unreasonable choice behavior. If, for example, the realization of $\varepsilon_{itj}$ is greater than the price of good j, a household may never consume the outside good on that purchase occasion, regardless of the level of $w_{it}$.

[8] In reality, households are forced to choose between the discrete sizes offered by the store. I make no attempt to model this feature of the data, as it introduces an extraordinary amount of computational cost with little clear return. In the estimation, I do not restrict the predicted purchases to the discrete purchases that the household could have actually made, instead allowing purchase quantities to vary continuously. Although Dubé (2001) suggests selecting the purchasable grid point adjacent to the unconstrained maximum, this is not numerically feasible in my case as it would require the examination of $2^{25} \approx 3.3 \times 10^{7}$ points for each household maximization.

[9] Gouriéroux & Monfort (1996) contains the best summary and discussion of various simulation estimators and their properties that I have found.


where $q_{it}$ is the household’s vector of actual purchases. Each moment is an average across all purchase occasions of the difference between expected purchases and the actual, observed purchases, interacted with the instruments. The vector of instruments, $x_{it}$, consists of all exogenous variables in the model (more on this in the following paragraphs), namely: the prices of each good, the household’s budget, and a constant. Because exact computation is infeasible, I simulate $E[q_{it} \mid \beta, \rho, p_t, w_{it}]$ by drawing R = 30 sets of $\varepsilon_{it}$’s (which I hold constant as I search the parameter space). Hence, I use the fact that:

$$
E[q_{it} \mid \beta, \rho, w_{it}, p_t] \approx \frac{1}{R} \sum_{r=1}^{R} q(\beta, \rho, w_{it}, p_t, \varepsilon^{r}_{it}) \tag{3.12}
$$
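To make the utility function (3.9), the budget constraint (3.10), and the simulation step (3.12) concrete, the following Python sketch works through the three-good example (Coke, Diet Coke, Diet 7Up). It is a toy illustration under stated assumptions: the parameter values, prices, and budget are invented, the log-normal parameters (0, 1) are arbitrary, and the household’s problem is solved by brute-force search over a coarse quantity grid rather than by the numerical algorithms discussed in the Appendix. The function names are hypothetical.

```python
import itertools
import math
import random

# Rows: Coke, Diet Coke, Diet 7Up; columns: Diet, Cola (the A matrix from the text).
A = [[0, 1],
     [1, 1],
     [1, 0]]


def utility(q, z, eps, beta, rho):
    """Equation (3.9): sum_c beta_c * (A_c' q)**rho_c + eps' q + z."""
    total = z + sum(e * q_j for e, q_j in zip(eps, q))
    for c in range(len(beta)):
        bundle_c = sum(A[j][c] * q[j] for j in range(len(q)))
        total += beta[c] * bundle_c ** rho[c]
    return total


def best_bundle(prices, w, eps, beta, rho, grid):
    """Brute-force search over a quantity grid; unspent budget goes to the outside good."""
    best_q, best_u = None, -math.inf
    for q in itertools.product(grid, repeat=len(prices)):
        spend = sum(p * q_j for p, q_j in zip(prices, q))
        if spend > w:
            continue  # violates the budget constraint (3.10)
        u = utility(q, w - spend, eps, beta, rho)
        if u > best_u:
            best_q, best_u = q, u
    return best_q


def expected_purchases(prices, w, beta, rho, R=30, seed=0):
    """Equation (3.12): average the optimal bundle over R negative log-normal draws."""
    rng = random.Random(seed)
    grid = [0.5 * k for k in range(9)]  # 0, 0.5, ..., 4 units of each good
    totals = [0.0] * len(prices)
    for _ in range(R):
        eps = [-rng.lognormvariate(0.0, 1.0) for _ in prices]
        q = best_bundle(prices, w, eps, beta, rho, grid)
        totals = [t + q_j for t, q_j in zip(totals, q)]
    return [t / R for t in totals]


# Invented parameter values, prices, and budget, for illustration only.
beta, rho = [2.0, 3.0], [0.5, 0.5]
prices, budget = [1.0, 1.0, 1.2], 20.0
print(expected_purchases(prices, budget, beta, rho))
```

The grid search is only practical here because there are three goods; with 25 products the same approach would be infeasible, which is why a continuous numerical optimizer is needed in the actual estimation.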

Using these moments, I define my estimates as minimizing the distance function:

$$
[\hat{\rho}, \hat{\beta}] = \arg\min_{\rho, \beta} \; \left(H_{I,T,R}(\rho, \beta)\right)' \, W \, \left(H_{I,T,R}(\rho, \beta)\right) \tag{3.13}
$$

Ideally, I would implement this as a two-stage procedure. The first stage of this procedure would involve choosing the weighting matrix, and finding consistent estimates of the parameters. The second stage takes these estimates and uses them to calculate the optimal weighting matrix, and then re-estimates the parameters. In practice, however, estimation currently takes several weeks. Therefore, I report only the (inefficient) first-stage estimates. Each of these two stages of estimation consists of iterating over several steps, which I review now:

1. Choose starting values for the parameters: β and ρ.

2. Take the actual characteristics of the households in the sample that went shopping in that week. In this case, a household is completely characterized by its budget ($w_{it}$). This amounts to assuming that the retailer knows the distribution of the households that would go shopping (not necessarily at its store) in each week.

3. Draw R sets of $\varepsilon_{it}$’s for each observed purchase occasion. I use R = 30. This means drawing R × (Number of Purchase Occasions) × (Number of Products) = 30 × 16,008 × 25 = 12,006,000 i.i.d. negative lognormal random numbers. These random numbers are held constant across iterations.

4. Using the expenditures from the actual purchase occasion as the budget constraint, and the actual menu of prices in the week of the purchase occasion, take the current parameters and solve explicitly (numerically) the household’s utility maximization problem. This step is non-trivial and accounts for the bulk of the computational power involved in this estimation procedure. This means solving R × (Number of Purchase Occasions) = 30 × 16,008 = 480,240 utility maximization problems for each iteration of the parameter values. I discuss this process, and suggest numerical algorithms, at greater length in the Appendix.

5. For each purchase occasion, average over the R different purchase vectors to get the expected purchases for that purchase occasion at the current set of parameter values.

6. Using the difference between the actual vector of purchases on that purchase occasion and the expected purchases calculated in Step 5, calculate the current moment equations.

7. Interact these moment equations with the weighting matrix W to calculate the current distance function. The weighting matrix is of dimension (|J| + 2) × (|J| + 1) by (|J| + 2) × (|J| + 1).

8. Using a numerical minimization algorithm,[10] choose a new set of parameter values (β and ρ) and repeat steps 4 through 7 until a minimum is found.

In addition to the computational cost, this simulation method forces me to make assumptions about the distribution of the unobservables (ε). The assumption that I choose to make is that these unobservables are distributed independently of all observable variables. In particular, I assume that they are distributed independently of prices. The distributions of ε could be made dependent upon prices (or other observables). I do not do so here for two reasons. First, it seems at least plausible that brand, holiday, feature, and display variables account for much of the potential for unobserved correlation between price and these unobservables. To the extent, however, that the retailer observes time-varying changes in the household error terms, the unconditional distribution of the error term will differ significantly from its distribution conditional on prices. In this case my estimates may be both biased and inconsistent. Short of simulation, I cannot think of a way to “bound” the effects of violations of this assumption. The second, and principal, justification for this assumption is that it is crucial in making the estimation tractable. Even implementing a recursive routine to match predicted with observed market shares (as in Berry et al. (1995)) would be prohibitively computationally expensive. Economically, with respect to prices, I am assuming that the retailer does not observe any (or at least does not adjust prices in response to) time-varying changes in the distribution of the idiosyncratic demand shocks. Given this assumption that the idiosyncratic shocks are distributed independently of prices, it is internally consistent to use prices as instruments (since they will be orthogonal to the difference between the actual and the expected demands).

[10] I have had the most success using the simplex-based E04CCF routine available from the Numerical Algorithms Group (NAG).
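The estimation loop in steps 1 through 8 can be sketched in a few lines of Python for a deliberately stripped-down case. The sketch below is an assumption-laden toy, not the procedure used for the reported estimates: there is a single good with a single characteristic, so the household’s problem has a closed-form solution; ρ is held fixed at 0.5; the weighting matrix is the identity; the “observed” purchases are noise-free synthetic data generated at a known β so that the minimum is known in advance; and a grid search stands in for the NAG simplex routine. All names and numbers are hypothetical.

```python
import random

RHO = 0.5  # held fixed for brevity; the text estimates rho as well


def demand(beta, price, budget, eps):
    """Closed-form maximizer of beta*q**RHO + eps*q + z subject to price*q + z <= budget."""
    interior = (beta * RHO / (price - eps)) ** (1.0 / (1.0 - RHO))
    return min(interior, budget / price)


def simulated_mean(beta, price, budget, draws):
    """Steps 4-5: average the optimal purchase over the R held-constant draws."""
    return sum(demand(beta, price, budget, e) for e in draws) / len(draws)


def moments(beta, occasions, observed, draws):
    """Step 6: average of (q_obs - E[q]) times the instruments (price, budget, constant)."""
    h = [0.0, 0.0, 0.0]
    for (price, budget), q_obs, eps in zip(occasions, observed, draws):
        resid = q_obs - simulated_mean(beta, price, budget, eps)
        for m, x in enumerate((price, budget, 1.0)):
            h[m] += resid * x / len(occasions)
    return h


def distance(beta, occasions, observed, draws):
    """Step 7: H' W H with W set to the identity matrix."""
    return sum(v * v for v in moments(beta, occasions, observed, draws))


rng = random.Random(1)
# Step 2: synthetic purchase occasions, each described by a price and a budget.
occasions = [(rng.uniform(0.8, 1.5), rng.uniform(10.0, 40.0)) for _ in range(200)]
# Step 3: R = 30 negative log-normal draws per occasion, held constant throughout.
draws = [[-rng.lognormvariate(0.0, 1.0) for _ in range(30)] for _ in occasions]

TRUE_BETA = 3.0
observed = [simulated_mean(TRUE_BETA, p, w, e) for (p, w), e in zip(occasions, draws)]

# Steps 1 and 8: a grid search over beta in place of a simplex minimizer.
grid = [1.0 + 0.25 * k for k in range(17)]
beta_hat = min(grid, key=lambda b: distance(b, occasions, observed, draws))
print(beta_hat)
```

Holding the draws fixed across candidate parameter values, as in step 3, keeps the simulated objective a deterministic function of the parameters, so that the minimizer is not chasing simulation noise.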

3.2.2

Store Choice Model

As noted earlier, I need to estimate the store’s residual demand, not market demand. The true process by which households choose where to shop is almost certainly related to their decisions about exactly what to purchase once they get there. However, I am unaware of any paper that simultaneously models the household’s store choice and product-level purchasing decisions. A fully structural model would involve calculating the household’s expected utility, net of travel costs, from visiting each of the stores in its choice set. Such a model would also involve consumers forming expectations of the menu of prices they would face at each store. Furthermore, the effects of these prices on store choice would almost certainly depend on the household’s expected shopping basket on that purchase occasion. Given my estimation procedure, this approach is far too computationally burdensome. Instead, I approximate this choice by assuming that households’ choices among the stores in my sample follow a logit choice model. This model represents an approximation of the true model and does not directly correspond to any model of utility maximization. I use it because it is computationally cheap, and because I believe the approximation is reasonable. The model I estimate assumes that in week t, conditional on going shopping, household i derives indirect utility:

$$
u_{ist} = D_{ist}' \delta_0 - p_{st}' \delta_1 + \eta_{ist} \tag{3.14}
$$

from choosing store s at time t, where $D_{ist}$ is a vector of household characteristics interacted with store indicator variables, $p_{st}$ is a vector of price indices for several product categories at store s, including soft drinks, and $\delta_0$ and $\delta_1$ are vectors of parameters.[11] I defer discussion of the exact specifications and of the estimated coefficients to section 3.4.2. This model implicitly assumes that households form expectations about current prices (see Ho, Tang & Bell (1998)). I estimate several specifications using current prices, implicitly assuming that the household is able to perfectly forecast (or learn) these prices. I also estimate several other specifications using prices from the previous two weeks as predictors of store choice,[12] though in general I do not find that prices substantially influence households’ choice of store. These findings are consistent with those of Hoch, Dreze & Purk (1994), who also find that consumers are largely inelastic to short-term price changes in their choice of store.

Although I am not aware of any papers that simultaneously address the household’s decision of what to buy and where to shop, there is an extensive literature on store choice, which I will not attempt to summarize in detail here. Instead I focus on the portions of that literature that I have included in the specification of this model. I follow Bell, Ho & Tang (1998) and Leszczyc, Sinha & Timmermans (2000) in incorporating household-level demographics and find that these are both statistically and economically significant in predicting store choice. While Rhee & Bell (2002) find that once unobserved heterogeneity is accounted for, shoppers’ demographic characteristics are not statistically significant in predicting the probability of switching, they do not allow the effects of these characteristics to vary across stores. Although I do not control for unobserved heterogeneity, I find that the effect of household characteristics varies on a store-by-store basis.

After demographic variables, I find that one of the most significant predictors of store choice is whether the household visited the store in the previous two weeks. This is consistent with Rhee & Bell (2002), who find that households are highly path-dependent in their choice of store. However, this may simply be controlling for unobserved time-varying heterogeneity among consumers.

I also account for the possibility that, as suggested by Bell & Lattin (1998), households with higher expenditure levels tend to prefer stores with certain pricing formats. Specifically, they found that households with large average expenditure levels tend to prefer so-called Every Day Low Price (EDLP) stores to High-Low stores, whose prices fluctuate more widely from week to week. To account for this effect, I interacted the household’s expenditure level for the purchase occasion with store indicator variables. This allows shoppers who expect to have high (or low) expenditure levels to seek out specific stores.

[11] In principle, the elements of $\delta_0$ and $\delta_1$ could be allowed to vary across households, though I do not do so here.

[12] I also experimented with using longer lags, but found that they did not improve the predictive power of the model.


Finally, as mentioned earlier, I take the household’s decision to go shopping to be governed by an exogenous process. This is consistent with evidence in Chiang, Chung & Cremers (2001), who find that consumers’ decision to shop is largely unaffected by marketing mix variables.

3.2.3

Residual Demand

I complete the construction of the residual demand system by bringing together the product choice and store choice models. The residual demand faced by store A is equal to the sum over all households (that went shopping in that week) of the probability that the household chose store A, multiplied by their expected purchases, conditional on choosing store A. Hence, the expected demand system faced by store A in week t is:

$$
E_t[Q(p)] = \sum_i E[q_{it} \mid \rho, \beta, p_t, w_{it}] \cdot P\left(A \mid D_{ist}, p_{At}, p_{-At}\right) \tag{3.15}
$$

$$
\approx \sum_i \left[ \frac{1}{R} \sum_{r=1}^{R} q(\rho, \beta, p_t, w_{it}, \varepsilon^{r}_{it}) \right] \cdot \frac{\exp\left(D_{iAt}' \delta_0 - p_{At}' \delta_1\right)}{\sum_{s \in \{A,B,C,D,E\}} \exp\left(D_{ist}' \delta_0 - p_{st}' \delta_1\right)} \tag{3.16}
$$

In calculating this expected demand, I follow similar steps to those used in the estimation:

1. Take the estimated values of the parameters: β and ρ.

2. Draw R sets of $\varepsilon_{it}$ for each observed purchase occasion. I use R = 30. These random numbers need not be the same as those used to estimate the parameters above. Note that in this case the number of purchase occasions is the total number of store trips (to any store) made in that week, not just the purchase occasions from store A.

3. Using the expenditures from the actual purchase occasions as the budget constraints, and the actual menu of prices in the week of the purchase occasion, take the current parameters and explicitly solve the household’s utility maximization problem (using numerical methods discussed in the Appendix).

4. For each purchase occasion, average over the R different purchase vectors to get the expected purchases for that purchase occasion at the current set of parameter values.

5. For each purchase occasion, multiply these expected purchases by the probability of choosing store A in that week.
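The final aggregation in step 5, which corresponds to equations (3.15) and (3.16), can be sketched as follows. This is a minimal Python illustration under assumptions: each household is represented by a hypothetical dictionary holding its deterministic store utilities (the index $D_{ist}' \delta_0 - p_{st}' \delta_1$ for each store) and its simulated expected purchase vector, both of which are taken as given rather than computed here. The function names and the numbers are invented.

```python
import math


def store_probabilities(index_by_store):
    """Logit choice probabilities from the deterministic part of utility in (3.14)."""
    top = max(index_by_store.values())  # subtract the maximum for numerical stability
    weights = {s: math.exp(v - top) for s, v in index_by_store.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def residual_demand(households, store="A"):
    """Sum over households of P(store) times E[purchases | store], as in (3.15)."""
    n_products = len(households[0]["expected_q"])
    total = [0.0] * n_products
    for hh in households:
        prob = store_probabilities(hh["utility"])[store]
        for j, q in enumerate(hh["expected_q"]):
            total[j] += prob * q
    return total


# Two invented households shopping in the same week, two products.
households = [
    {"utility": {"A": 1.0, "B": 0.2, "C": 0.0, "D": -0.5, "E": 0.1},
     "expected_q": [1.5, 0.0]},
    {"utility": {"A": -0.3, "B": 0.8, "C": 0.4, "D": 0.0, "E": 0.0},
     "expected_q": [0.5, 2.0]},
]
print(residual_demand(households))
```

Note that every household that went shopping in the week contributes to store A’s residual demand, weighted by its probability of choosing store A, including households that were actually observed at another store.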

3.3

Data

Although a trip to nearly any store offers many cases of uniform pricing, I focus exclusively on grocery stores. There are a number of reasons for restricting my attention to grocery stores. First, they offer literally thousands of examples of uniform pricing. Second, given that grocery stores carry a large number of products,[13] one might expect them to use relatively sophisticated pricing techniques. Third, grocery stores are a significant portion of the economy. In 1997, U.S. grocery stores had sales in excess of $368 billion, with roughly 100,000 establishments (U.S. Economic Census, 1997). Fourth, with few exceptions, groceries are not characterized by consumer uncertainty. Consumers are presumably quite familiar with products’ characteristics as well as their preferences over these characteristics. For example, there is not much uncertainty about what will be inside when you pop open a can of Diet Coke. Finally, grocery stores conveniently offer the availability of scanner panel data.

This paper utilizes two Chicago-area grocery datasets, both of which have already been extensively studied. The principal dataset used has store-level price and quantity data for a geographic cluster of five stores, from several different chains. It also contains a household-level component that allows us to observe the purchase patterns of individual households. The second dataset used has store-level price, quantity, and cost data for all the grocery stores of a single chain, Dominick’s Finer Foods. This section describes each of these datasets in more detail, and briefly takes a rough look at the representativeness of the panel.

3.3.1

IRI Basket Data

For a two-year period from 1991 to 1993, Information Resources Incorporated (IRI) collected a panel dataset in urban Chicago. This dataset has both aggregate and micro components. The aggregate component consists of weekly price and quantity¹⁴ data at the store/UPC level for several different product categories at five geographically close stores. Throughout the paper, these stores are referred to as stores A through E. As mentioned earlier, one of these stores, which I will call store A, charged non-uniform prices for carbonated soft drinks during this period. The micro-level component of this dataset contains carbonated

13 A typical grocery store carries over 14,000 different products.
14 Quantity sold includes sales to all customers, not just those in the panel.

CHAPTER 3. COSTS OF UNIFORM PRICING


soft drink purchase histories for 548 households at these five grocery stores over the two-year period. The dataset also contains the households’ total grocery expenditure on each purchase occasion. IRI paid these households to use a special electronic card that recorded their purchases when they shopped at these stores. For the majority of the analysis, I use only a subset of 262 households that visited store A (the store at which I estimate demand) at least once during the two-year sample period.

According to the documentation provided with the data, these five stores and 548 panelists were selected by IRI using two criteria. First, although very little information is available on the actual sampling procedures used, IRI tried to create a stratified random sample of households, reflective of the population in the area. Second, in order to avoid the effects of unobserved market fluctuations, IRI’s goal was to achieve, as much as possible, a closed system. That is, IRI tried to include the stores that the households in the panel would be most likely to shop at, in order to observe as large a fraction of their grocery expenditure as possible. That IRI achieved this goal is supported by the fact that, for the vast majority of the households, grocery expenditure at stores within the sample universe appears to be fairly constant over time.

Tables 3.1 and 3.2 show the distribution of households’ expenditure at different stores for all households, as well as for those who shopped at store A at least once. The mean weekly expenditure by a household shopping at store A was $22, while the median was $15.¹⁵ This is less than at stores B and C, but similar to stores D and E. Even households that shopped at store A at least once tended to spend more at these other stores, although the majority ($350,000) of their total expenditure of $610,000 over the period was at store A.

15 For clarity, all dollar references in this section are nominal. Hence, prices in 1991 use 1991 dollars, etc. I use this approach because retail and wholesale prices over the period do not appear to move with inflation. For reference, one 1991 dollar is equivalent to $1.39 in 2004 dollars, a 1992 dollar is equivalent to $1.35 in 2004 dollars, and a 1993 dollar is equivalent to $1.31 in 2004 dollars.

Table 3.1: Distribution of Purchase Occasion Expenditure by Store, All Purchase Occasions

           Number of                        Expenditure in Dollars
           Purchase             Standard              25th                 75th                 Total
Store      Occasions    Mean    Deviation   Minimum   Percentile   Median  Percentile  Maximum  ($000's)
Store A     16,008       22        21        0.25         8          15        27        228       350
Store B     10,063       38        36        0.16        14          27        50        281       390
Store C     13,733       44        38        0.14        17          33        60        378       600
Store D      7,835       22        30        0.34         6          12        23        325       170
Store E      5,637       26        28        0.34         9          16        33        254       150
Total       53,516       31        32        0.14        10          20        40        378     1,700


Table 3.2: Distribution of Purchase Occasion Expenditure by Store, Purchases Made by Panelists Who Visited Store A at Least Once

           Number of                        Expenditure in Dollars
           Purchase             Standard              25th                 75th                 Total
Store      Occasions    Mean    Deviation   Minimum   Percentile   Median  Percentile  Maximum  ($000's)
Store A     16,008       22        21        0.25         8          15        27        228       350
Store B        615       32        27        1.24        14          23        41        177        20
Store C      2,234       34        38        0.14        13          24        40        378         8
Store D      6,435       15        29        0.34         5          10        17        188        99
Store E      3,479       19        19        0.34         7          13        26        201        68
Total       28,782       21        23        0.14         8          14        26        378       610


Unlike many previous papers, which have estimated brand-level demand, this paper estimates UPC-level demand. Over a two-year period, a typical grocery store sells over 200 different items in the carbonated soft drink category. The vast majority of these products are offered only rarely, or quickly enter and exit. Because this paper uses panel purchases to estimate demand, and many of these products are only rarely (or never) purchased by the panel, it is not practical to estimate the households’ demand for them. Instead, I estimate the households’ demand for the 25 products with the largest market share by volume. These products represent 71% of Store A’s carbonated soft drink sales by volume, and 69% of its total soft drink sales by dollar value. The products included in the analysis are shown in Table 3.3. Of these, three varieties (8 items) were distributed by the Coca Cola Corp., two varieties (8 items) were distributed by Pepsi Co., two varieties (4 items) were distributed by Dr. Pepper/7Up, two varieties (4 items) were distributed by the Royal Crown Corp., and one variety (1 item) was distributed by an independent producer under a private label.

Table 3.3: Variety and Size Distribution of Products in the Dataset, Grouped by Manufacturer

                                   Number of 12oz servings (liters in parentheses)
                                   1       5.63      6       8.45      12       24      Number of
Manufacturer   Variety           (0.36)   (2.0)    (2.13)   (3.0)    (4.26)   (8.52)    Sizes Avail.
Coca Cola      Coke                         ✓                           ✓        ✓           3
               Diet Coke                    ✓                           ✓        ✓           3
               Diet CF Coke                                             ✓        ✓           2
Pepsico        Pepsi                        ✓        ✓                  ✓        ✓           4
               Diet Pepsi                   ✓                 ✓         ✓        ✓           4
RC Corp.       RC                           ✓                                                1
               Diet Rite                    ✓                 ✓                  ✓           3
DP/7Up         7Up                          ✓        ✓                                       2
               Diet 7Up                     ✓                                    ✓           2
PL             Private Label       ✓                                                         1
Total Number of Items              1        8        2        2         5        7          25

Some descriptive statistics on the price and sales volume for these products are shown in Table 3.4. The price of a 12-ounce serving of carbonated soft drink varied from a high of $0.49 as part of a 12-pack of 12-ounce cans of Diet Coke, to a low of $0.12 for a single can of the Private Label cola. Most products appear to have had either an end-of-aisle display or a mention in the store’s circular in between one-third and one-half of the weeks. The notable exceptions were the 3L bottle of Pepsi, the 2L bottle of Diet 7up, and the store brand, which received significantly less advertising (as measured by circular and display activity).

2L Bottle 12-pack 12oz cans 24-pack 12oz cans 2L Bottle 12-pack 12oz cans 24-pack 12oz cans 12-pack 12oz cans 24-pack 12oz cans 2L Bottle 6-pack 12oz cans 12-pack 12oz cans 24-pack 12oz cans 2L Bottle 3L Bottle 12-pack 12oz cans 24-pack 12oz cans 2L Bottle 2L Bottle 3L Bottle 24-pack 12oz cans 2L Bottle 6-pack 12oz cans 2L Bottle 24-pack 12oz cans 12oz Can

Mean Price ($) 0.22 0.31 0.27 0.23 0.31 0.27 0.33 0.27 0.22 0.39 0.31 0.26 0.23 0.20 0.31 0.26 0.21 0.37 0.21 0.26 0.17 0.21 0.18 0.26 0.19

Maximum Price ($) 0.32 0.47 0.34 0.32 0.49 0.34 0.47 0.34 0.32 0.48 0.47 0.34 0.32 0.20 0.47 0.34 0.32 0.48 0.32 0.34 0.32 0.32 0.20 0.32 0.22

S.D. of Price 0.06 0.09 0.06 0.06 0.10 0.06 0.10 0.06 0.06 0.10 0.10 0.06 0.06 0.00 0.10 0.06 0.06 0.11 0.06 0.06 0.02 0.05 0.03 0.05 0.03

Mean Price at DFF 0.26 0.34 0.24 0.26 0.34 0.25 0.36 0.21 0.25 0.22 0.33 0.20 0.25 0.04 0.34 0.24 0.25 0.39 0.24 0.19 0.19 0.24 0.19 0.21 NA

Mean DFF Wholesale Price 0.23 0.29 0.18 0.23 0.31 0.19 0.30 0.15 0.22 0.17 0.28 0.15 0.22 0.03 0.30 0.18 0.22 0.30 0.21 0.14 0.15 0.19 0.15 0.18 NA

S.D. of DFF Wholesale Price 0.04 0.06 0.07 0.05 0.04 0.08 0.04 0.06 0.04 0.09 0.08 0.06 0.05 0.05 0.05 0.07 0.04 0.10 0.05 0.05 0.05 0.04 0.00 0.02 NA

Mean Number of Servings Sold Per Week 30 29 17 21 28 28 15 14 32 13 25 50 14 7 21 23 20 15 12 14 30 13 10 19 20

All prices are in nominal Dollars per 12-ounce serving. Source: IRI and DFF Data.

Private Label

Diet 7up

7up

RC Diet Rite

Diet Pepsi

Pepsi

Diet Caffeine Free Coke

Diet Coke

Coke

Minimum Price ($) 0.12 0.17 0.17 0.12 0.17 0.15 0.17 0.17 0.12 0.24 0.16 0.17 0.12 0.20 0.10 0.17 0.12 0.17 0.12 0.17 0.12 0.12 0.11 0.17 0.12

S.D. of Number of Servings Sold Per Week 26 30 27 27 31 41 20 22 23 15 26 84 16 7 24 35 16 22 16 24 17 14 20 34 13

% of Weeks on End of Aisle Display 47 44 38 47 33 38 22 33 46 17 38 42 40 2 46 42 57 28 55 43 31 37 11 37 2

% of Weeks Featured in Circular 61 51 46 56 49 45 40 45 58 33 50 52 56 1 50 47 54 32 53 44 35 24 9 31 25

Table 3.4: Summary Statistics for Prices and Quantities Sold at Store A, grouped by Manufacturer % Market Share by Volume 3.93 3.22 2.67 2.65 3.19 3.95 2.06 2.09 5.03 1.60 2.61 5.23 2.02 1.41 2.06 3.19 2.74 1.67 1.49 2.03 5.88 1.68 1.43 2.67 4.13

% Market Share by Sales 3.63 3.77 2.61 2.26 3.78 4.16 2.37 2.26 4.68 2.33 2.91 4.96 1.86 1.25 2.30 2.99 2.37 2.23 1.17 1.92 4.51 1.41 1.00 2.56 3.39


Both the traditional logit model and the new model I propose reduce the dimensionality of the demand system parameter space by assuming that households’ preferences over products are driven by product characteristics. The store’s residual demand $Q_{jt}(\cdot)$ for product j is denominated in twelve-ounce servings of carbonated soft drink. The characteristics used in the analysis are: calories (per 12-ounce serving), milligrams of sodium (per 12-ounce serving), milligrams of caffeine (per 12-ounce serving), grams of sugar (per 12-ounce serving), as well as indicator variables for the presence of citric acid, phosphoric acid, and whether it is a diet drink. These physical characteristics were obtained by contacting the manufacturers of the products, and, to the best of my knowledge, represent the characteristics of the products during the relevant time period. I also include indicator variables for size, brand, and whether the product was featured in store A’s weekly circular or in an in-store display (in store A), as well as a constant common to all products. These characteristics were chosen based on earlier work by Dubé (2001). They are the elements of the A matrix, and are shown in Table 3.5. This table also shows the number of weeks that each product was available. For example, the 12-pack of 12-ounce cans of Diet Pepsi was unavailable for 16 of the 104 weeks, while the 24-pack of 12-ounce cans of Diet Caffeine Free Coke and the 24-pack of 12-ounce cans of Diet 7up were not available for 15 weeks.

Table 3.5: Characteristics of Products in the Dataset, grouped by Manufacturer

                                                                                                    Contains    Contains          % of Weeks   % of Weeks on
                                                          Weeks              Sodium  Sugar  Caffeine  Phosphoric  Citric            Featured in  End of Aisle
Manufacturer   Variety                  Size                Sold   Calories   (mg)    (g)     (mg)     Acid        Acid      Diet   Circular     Display
Coca Cola      Coke                     2L Bottle            104     140      50      39      34        1           0         0       47           35
                                        12-pack 12oz cans     99     140      50      39      34        1           0         0       32           27
                                        24-pack 12oz cans     99     140      50      39      34        1           0         0       42           32
               Diet Coke                2L Bottle            104       0      40       0      45        1           1         1       46           36
                                        12-pack 12oz cans     99       0      40       0      45        1           1         1       32           23
                                        24-pack 12oz cans     99       0      40       0      45        1           1         1       41           32
               Diet Caffeine Free Coke  12-pack 12oz cans    100       0      40       0       0        1           1         1       32           19
                                        24-pack 12oz cans     89       0      40       0       0        1           1         1       44           31
Pepsico        Pepsi                    2L Bottle            104     150      37.5    40.5    38        1           1         0       44           28
                                        6-pack 12oz cans     104     150      37.5    40.5    38        1           1         0       20           11
                                        12-pack 12oz cans     90     150      37.5    40.5    38        1           1         0       36           27
                                        24-pack 12oz cans    100     150      37.5    40.5    38        1           1         0       42           33
               Diet Pepsi               2L Bottle            104       0      37.5     0      36        1           1         1       44           27
                                        3L Bottle             94       0      37.5     0      36        1           1         1        1            2
                                        12-pack 12oz cans     88       0      37.5     0      36        1           1         1       33           30
                                        24-pack 12oz cans    101       0      37.5     0      36        1           1         1       42           37
Royal Crown    RC                       2L Bottle            104     160      50      42      43        1           0         0       32           26
               Diet Rite                2L Bottle            104       0      45       0      48        1           1         1       20           29
                                        3L Bottle            104       0      45       0      48        1           1         1        5            6
                                        24-pack 12oz cans     98       0      45       0      48        1           1         1       23           34
DP/7up         7up                      2L Bottle            104     140      75      39       0        0           1         0       45           42
                                        6-pack 12oz cans     104     140      75      39       0        0           1         0       22           21
               Diet 7up                 2L Bottle            104       0      45       0       0        0           1         1       44           42
                                        24-pack 12oz cans     89       0      45       0       0        0           1         1       39           38
Private Label  Private Label            12oz Can             104     140      50      39      34        1           0         0       20            3

Characteristics are per 12oz serving. Source: Coca Cola Corp., Pepsico, Royal Crown Corp., and Cadbury Beverages.

3.3.2 Dominick’s Finer Foods Data

I supplement the IRI data with demographic and wholesale price data from Dominick’s Finer Foods, a grocery chain located in the greater Chicago metropolitan area. From 1989 to 1997, through an arrangement with the University of Chicago Graduate School of Business, Dominick’s kept track of store-level, weekly unit sales and price for every UPC symbol for a number of product categories, including carbonated soft drinks. This dataset has weekly store sales totals in these product categories as well as store area demographics pulled from census data. In addition, the dataset contains the actual wholesale prices that Dominick’s paid for each good. For a more thorough description of the dataset, see Hoch et al. (1994).¹⁶ Finally, because this dataset covers the same time and geographic area as the IRI dataset, one can match many of the products across the two datasets, giving us a measure of the wholesale prices for these products. To the best of my knowledge, this matching has not been done in previous work.

3.4 Results

3.4.1 Structural Model

The parameter estimates and standard errors¹⁷ from the structural model are presented in Table 3.6. As discussed in chapter 2, the nonlinearity of the model makes these parameter estimates difficult to interpret directly; however, we can make some inferences, particularly about their magnitudes relative to each other. The characteristic that appears to have the largest effect on households’ soft drink purchasing decisions is the indicator variable for whether the product is sold on a holiday. This characteristic has both the largest maximal marginal effect¹⁸ ($\beta_{holiday}\rho_{holiday}$) and a $\beta$ that is statistically significantly greater than zero. The effect of the Holiday characteristic is offset somewhat by the fact that $\rho_{holiday}$ is quite close to zero. This means that although the marginal utility from soda on holidays starts off higher, its second derivative is more negative than for other characteristics. Taken together, these two facts imply that households are more likely to purchase soft drinks on holidays, but not likely to substantially increase the quantity that they purchase. After the Holiday characteristic, the Coke and Pepsi characteristics have the largest effects, based on their high maximal marginal utility values. Interestingly, while $\beta_{Pepsi}$ is statistically significantly different from zero, $\beta_{Coke}$ is not. Given that both $\beta_{Pepsi}$ and $\rho_{Pepsi}$ are relatively large, consumers differentially prefer Pepsi (and Coke, to a lesser degree) to other soft drinks: they are both more likely to purchase these brands, and more likely to purchase more of them. At the other end of the spectrum, because both their $\beta$’s and their $\rho$’s are relatively small, Sodium and Caffeine do not appear to significantly affect consumers’ purchasing behavior (although their coefficients are imprecisely estimated).

More readily interpretable than the parameter estimates are the own and cross-price elasticities that they imply. In the logit model, the cross-price elasticities are infamously dependent upon market shares. These restrictions are evident in Table 3.7, which shows a matrix of own and cross-price elasticities for several sizes and varieties of Coke and Pepsi. The logit model predicts, for example, that the demand for 24-packs (288oz) of Diet Coke increases by 0.046% from either a 1% increase in the price of a 2-liter bottle of Diet Coke or a 1% increase in the price of a 2-liter bottle of Pepsi. This contrasts sharply with the price elasticities implied by my structural model, presented in Table 3.8. As that table shows, my model allows for a rich pattern of cross-price elasticities. In addition to allowing cross-price elasticities to vary across products, my model also allows for negative off-diagonal price elasticities, suggesting that complementarities exist. For example, if households tend to purchase both 12-packs of Coke and 6-packs of 7Up, then an increase in the price of Coke may cause more people to buy Pepsi, but it will also lead people both to buy less Coke and to buy less of the Coke/7Up bundle. This flexibility is not possible using the traditional logit model.

16 This dataset is publicly available from the Kilts Center at the University of Chicago Graduate School of Business web page: http://gsbwww.uchicago.edu/kilts/research/db/dominicks/

17 The standard errors shown in Table 3.6 are calculated using the fact that $\sqrt{n}(\hat{\theta}_n - \theta_0) \to_d N(0, V)$ as the number of observations $n$ goes to infinity, with $V = [G_0' W G_0]^{-1} G_0' W \Omega_0 (1 + \frac{1}{R}) W G_0 [G_0' W G_0]^{-1}$, where $W$ is the weighting matrix. I obtain a consistent estimator of $V$ by using $\hat{G}_{IT} = \frac{1}{IT} \sum_{i=1}^{I} \sum_{t=1}^{T} x_{it} \frac{\partial E[q_{it} | \theta, p_t, w_{it}]}{\partial \theta}$ and $\hat{\theta} = [\hat{\beta}, \hat{\rho}]$. I use the estimated parameters $\hat{\beta}$ and $\hat{\rho}$ to compute $\hat{\Omega}_{IT} = \frac{1}{IT} \sum_{i=1}^{I} \sum_{t=1}^{T} ((q_{it} - E[q_{it} | \hat{\beta}, \hat{\rho}, p_t, w_{it}]) x_{it}) ((q_{it} - E[q_{it} | \hat{\beta}, \hat{\rho}, p_t, w_{it}]) x_{it})'$, which is a consistent estimator of $\Omega_0$. For further discussion, see Gouriéroux & Monfort (1996). I use a diagonal weighting matrix ($W$), with the elements scaled by the sum of squares of each of the instruments.

18 The term “maximal marginal effect” is somewhat misleading. It is actually the most positive marginal effect. When $\beta_c$ is negative, $\rho_c$ is necessarily greater than one, and hence the marginal effect is increasing in magnitude as quantity increases.
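The variance formula in footnote 17 has the usual "sandwich" form, and it may help to see how its pieces fit together numerically. The following is a minimal Python sketch, not the code used for Table 3.6. The function name `msm_standard_errors`, the randomly generated inputs, and the particular diagonal weighting matrix are all hypothetical, and `n_draws` stands for R, which is assumed here to be the number of simulation draws.

```python
import numpy as np

def msm_standard_errors(G, Omega, W, n_obs, n_draws):
    """Standard errors from V = (G'WG)^-1 G'W Omega (1 + 1/R) W G (G'WG)^-1.

    G       : (m, k) Jacobian of the m moments with respect to the k parameters
    Omega   : (m, m) covariance matrix of the moment contributions
    W       : (m, m) weighting matrix
    n_obs   : number of observations n
    n_draws : R (assumed here to be the number of simulation draws)
    """
    bread = np.linalg.inv(G.T @ W @ G)
    meat = G.T @ W @ Omega @ W @ G
    V = (1.0 + 1.0 / n_draws) * bread @ meat @ bread
    # sqrt(n) * (theta_hat - theta_0) -> N(0, V), so Var(theta_hat) is roughly V / n
    return np.sqrt(np.diag(V) / n_obs)

# Made-up inputs, for illustration only (not estimates from the dissertation).
rng = np.random.default_rng(0)
m, k = 5, 3
G = rng.normal(size=(m, k))
A = rng.normal(size=(m, m))
Omega = A @ A.T + m * np.eye(m)        # symmetric positive definite
W = np.diag(1.0 / np.diag(Omega))      # one possible diagonal weighting matrix
se = msm_standard_errors(G, Omega, W, n_obs=1000, n_draws=30)
print(se)
```

With the efficient choice W equal to the inverse of Omega, the same function collapses to the familiar (1 + 1/R)(G' Omega^-1 G)^-1 expression, which gives a quick consistency check on an implementation.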


Table 3.6: Parameter Estimates from Structural Model of Product Choice β Characteristic Constant Calories Sugar Sodium Caffeine Phosphoric Acid Citric Acid Cola Flavored Single Serving 288oz Diet Coke Diet Coke Pepsi 7up RC Jewel Holiday Feature Display

Units per 12oz g per 12oz mg per 12 oz mg per 12oz Indicator Variable ” ” ” ” ” ” ” ” ” ” ” ” ” ” ”

coeff. 0.0001 -0.0001 0.0054 0.0000 0.0000 0.5431 0.1674 0.2588 0.3671 0.3174 -0.0003 -0.0076 0.5836 -0.0115 0.4477 0.2333 0.2380 0.4449 5.3081 0.2232 0.4830

ρ s.e. (0.5435) (0.6231) (1.2754) (0.8227) (0.3318) (0.1842) (0.0708) (0.1187) (0.1610) (0.0311) (0.0046) (0.0059) (0.3054) (0.0062) (0.0216) (0.1140) (0.0906) (0.3412) (1.8195) (0.1304) (0.4114)

coeff. 0.7471 1.2581 0.8134 0.1608 0.0957 0.3742 0.6585 0.4188 0.5139 0.5690 1.6013 1.3875 0.4090 1.6398 0.4764 0.5892 0.4824 0.3507 0.0586 0.3683 0.2712

s.e. (0.6382) (0.5957) (0.5749) (0.2259) (0.0737) (0.2111) (0.2603) (0.3062) (0.3254) (0.0640) (0.4515) (0.4267) (0.2517) (0.0793) (0.0778) (0.2415) (0.1868) (0.2108) (0.0305) (0.2217) (0.2434)

Maximal Marginal Effect (βρ) 0.0001 -0.0001 0.0044 0.0000 0.0000 0.2032 0.1102 0.1084 0.1886 0.1806 -0.0005 -0.0105 0.2387 -0.0189 0.2133 0.1374 0.1148 0.1560 0.3111 0.0822 0.1310

This table shows parameter estimates from the structural model under two different specifications. The right-most column is the product of β and ρ.

Table 3.7: Selected Own and Cross Price Elasticities from Homogenous Logit Model of Product Choice

                            Coke            Diet Coke            Pepsi           Diet Pepsi
                       288oz     2L       288oz     2L       288oz     2L       288oz     2L
Coke - 288oz          -2.398   0.015      0.046   0.014      0.073   0.019      0.051   0.013
Coke - 2L              0.052  -1.992      0.046   0.014      0.073   0.019      0.051   0.013
Diet Coke - 288oz      0.052   0.015     -2.404   0.014      0.073   0.019      0.051   0.013
Diet Coke - 2L         0.052   0.015      0.046  -2.001      0.073   0.019      0.051   0.013
Pepsi - 288oz          0.052   0.015      0.046   0.014     -2.203   0.019      0.051   0.013
Pepsi - 2L             0.052   0.015      0.046   0.014      0.073  -2.002      0.051   0.013
Diet Pepsi - 288oz     0.052   0.015      0.046   0.014      0.073   0.019     -2.392   0.013
Diet Pepsi - 2L        0.052   0.015      0.046   0.014      0.073   0.019      0.051  -2.000
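The repeated off-diagonal entries in Table 3.7 follow from the closed form of homogeneous logit elasticities: the cross-price elasticity with respect to the price of product k is the same for every other product. The short Python sketch below illustrates this under stated assumptions. The mean utilities `delta`, the prices, and the price coefficient `alpha` are hypothetical values chosen for illustration, not estimates from this chapter, and utility is assumed to be linear in price with an outside good normalized to zero.

```python
import math

def logit_shares(delta, prices, alpha):
    """Homogeneous logit shares; the outside good has utility normalized to 0."""
    expu = [math.exp(d - alpha * p) for d, p in zip(delta, prices)]
    denom = 1.0 + sum(expu)
    return [e / denom for e in expu]

def logit_elasticities(delta, prices, alpha):
    """E[j][k] = percent change in the share of j per one percent change in price k."""
    s = logit_shares(delta, prices, alpha)
    n = len(prices)
    E = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(n):
            if j == k:
                E[j][k] = -alpha * prices[k] * (1.0 - s[k])   # own-price
            else:
                E[j][k] = alpha * prices[k] * s[k]            # cross-price: no j term
    return E

delta = [1.0, 0.2, 0.8, 0.1]          # hypothetical mean utilities
prices = [0.27, 0.22, 0.26, 0.23]     # hypothetical prices per 12oz serving
E = logit_elasticities(delta, prices, alpha=8.0)
for row in E:
    print(" ".join(f"{x:8.3f}" for x in row))
```

Because the cross-price expression depends only on the price and share of the product whose price changes, every column of the printed matrix has identical off-diagonal entries, which is the pattern visible in Table 3.7.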

Table 3.8: Matrix of Estimated Average Own and Cross-Price Elasticities from Structural Model of Product Choice

Columns (1) through (25) refer to the products in the same order as the numbered rows.

                                                   (1)    (2)    (3)    (4)    (5)    (6)    (7)    (8)    (9)   (10)   (11)   (12)   (13)   (14)   (15)   (16)   (17)   (18)   (19)   (20)   (21)   (22)   (23)   (24)   (25)
(1)  7up, 2L Bottle                             -4.158  0.054  0.056  0.131  0.023  0.014  0.036 -0.015 -0.001  0.008 -0.005  0.000  0.013  0.030  0.020  0.035  0.044  0.001 -0.002  0.000  0.007  0.018  0.006  0.003  0.006
(2)  7up, 6-pack 12oz cans                       0.081 -7.921  0.049  0.121  0.040  0.034  0.034  0.011  0.025 -0.003  0.024  0.013  0.026  0.056  0.024  0.028  0.061 -0.008  0.033  0.016 -0.001  0.012  0.000  0.026 -0.003
(3)  Diet 7up, 24-pack 12oz cans                 0.082 -0.003 -3.409  0.109  0.034  0.019 -0.049  0.064  0.039  0.011  0.061  0.033  0.015  0.035  0.026  0.000  0.025  0.007  0.028  0.034  0.016 -0.020  0.023  0.057  0.023
(4)  Diet 7up, 2L Bottle                         0.063  0.072  0.036 -6.704 -0.019 -0.010 -0.009  0.010  0.008  0.014  0.006  0.008 -0.010  0.005  0.002  0.009  0.001  0.010  0.012  0.009  0.016 -0.004  0.019  0.011  0.026
(5)  Coke, 12-pack 12oz cans                     0.052 -0.131  0.020 -0.056 -4.584  0.071  0.088  0.084  0.068  0.090  0.100  0.064  0.063  0.109  0.049  0.065  0.078  0.028  0.083  0.045  0.056  0.042  0.060  0.054  0.076
(6)  Coke, 24-pack 12oz cans                     0.048  0.055  0.028 -0.032  0.119 -2.971  0.134  0.093  0.108  0.119  0.109  0.082  0.070  0.106  0.059  0.070  0.127  0.033  0.083  0.056  0.073  0.059  0.071  0.076  0.065
(7)  Coke, 2L Bottle                             0.027  0.028 -0.017 -0.022  0.094  0.053 -4.048  0.074  0.043  0.065  0.067  0.037  0.032  0.039  0.022  0.050  0.074  0.026  0.010  0.012  0.045  0.034  0.033  0.022  0.068
(8)  Diet Coke, 12-pack 12oz cans                0.020 -0.113  0.041  0.070  0.088  0.070  0.105 -4.757  0.075  0.134  0.188  0.067  0.044  0.081  0.045  0.052  0.054  0.054  0.065  0.053  0.066  0.021  0.090  0.072  0.093
(9)  Diet Coke, 24-pack 12oz cans               -0.003  0.155  0.057  0.068  0.111  0.121  0.143  0.113 -2.891  0.138  0.140  0.115  0.058  0.120  0.057  0.049  0.094  0.070  0.137  0.079  0.098  0.028  0.111  0.094  0.145
(10) Diet Coke, 2L Bottle                        0.012  0.029  0.006  0.027  0.081  0.044  0.072  0.103  0.055 -3.981  0.093  0.043  0.017  0.037  0.019  0.041  0.054  0.041  0.038  0.028  0.055  0.017  0.069  0.036  0.082
(11) Diet Caffeine Free Coke, 12-pack 12oz cans  0.000 -0.057  0.040  0.030  0.099  0.069  0.110  0.180  0.083  0.117 -4.979  0.071  0.042  0.100  0.044  0.047  0.081  0.055  0.093  0.063  0.085  0.018  0.093  0.071  0.104
(12) Diet Caffeine Free Coke, 24-pack 12oz cans  0.003 -0.013  0.048  0.044  0.123  0.089  0.124  0.099  0.115  0.133  0.113 -2.516  0.060  0.057  0.055  0.056  0.084  0.066  0.075  0.071  0.074  0.025  0.131  0.080  0.102
(13) Private Label, 12oz Can                     0.041  0.036  0.016 -0.039  0.088  0.057  0.083  0.066  0.044  0.049  0.059  0.044 -2.191  0.068  0.059  0.065  0.069  0.032  0.054  0.047  0.053  0.041  0.050  0.062  0.059
(14) Pepsi, 12-pack 12oz cans                    0.060  0.053  0.033  0.033  0.124  0.067  0.097  0.110  0.056  0.069  0.116  0.045  0.060 -4.085  0.060  0.075  0.074  0.055  0.101  0.051  0.042  0.046  0.068  0.066  0.092
(15) Pepsi, 24-pack 12oz cans                    0.105 -0.131  0.058  0.055  0.151  0.109  0.110  0.121  0.097  0.108  0.156  0.088  0.125  0.212 -2.591  0.133  0.121  0.150  0.199  0.121  0.137  0.113  0.138  0.124  0.146
(16) Pepsi, 2L Bottle                            0.080 -0.052  0.003  0.098  0.071  0.050  0.094  0.061  0.034  0.073  0.053  0.036  0.052  0.069  0.057 -3.286  0.096  0.075  0.089  0.059  0.096  0.063  0.105  0.049  0.087
(17) Pepsi, 6-pack 12oz cans                     0.076  0.037  0.033  0.012  0.061  0.064  0.124  0.087  0.050  0.067  0.095  0.045  0.051  0.127  0.047  0.088 -6.029  0.076  0.123  0.059  0.121  0.066  0.109  0.070  0.092
(18) Diet Pepsi, 3L Bottle                      -0.002  0.000  0.006  0.013  0.029  0.017  0.044  0.060  0.031  0.057  0.050  0.028  0.019  0.044  0.039  0.050  0.078 -2.972  0.040  0.038  0.096  0.039  0.075  0.042  0.058
(19) Diet Pepsi, 12-pack 12oz cans               0.006 -0.008  0.041  0.040  0.072  0.054  0.039  0.068  0.068  0.077  0.100  0.053  0.051  0.114  0.063  0.094  0.090  0.088 -4.340  0.064  0.117  0.057  0.088  0.091  0.130
(20) Diet Pepsi, 24-pack 12oz cans               0.006 -0.095  0.064  0.090  0.112  0.084  0.065  0.109  0.110  0.114  0.168  0.090  0.084  0.132  0.097  0.118  0.120  0.113  0.147 -3.122  0.126  0.085  0.134  0.112  0.126
(21) Diet Pepsi, 2L Bottle                       0.017  0.011  0.011  0.075  0.027  0.022  0.052  0.049  0.032  0.060  0.067  0.029  0.020  0.014  0.036  0.054  0.086  0.069  0.049  0.047 -4.250  0.041  0.069  0.054  0.063
(22) RC, 2L Bottle                               0.052  0.074 -0.027 -0.005  0.077  0.061  0.097  0.034  0.024  0.042  0.039  0.021  0.050  0.065  0.066  0.091  0.113  0.087  0.057  0.060  0.095 -1.825  0.094  0.060  0.089
(23) RC, 3L Bottle                               0.010  0.021  0.009  0.029  0.024  0.014  0.023  0.030  0.023  0.045  0.035  0.024  0.012  0.022  0.015  0.029  0.055  0.030  0.027  0.020  0.033  0.018 -4.157  0.026  0.068
(24) Diet Rite, 24-pack 12oz cans                0.006  0.074  0.052  0.053  0.096  0.058  0.056  0.135  0.066  0.074  0.092  0.050  0.056  0.069  0.054  0.054  0.097  0.063  0.075  0.057  0.083  0.047  0.103 -3.693  0.128
(25) Diet Rite, 2L Bottle                        0.008  0.002  0.009  0.042  0.039  0.016  0.044  0.041  0.027  0.045  0.036  0.019  0.016  0.030  0.019  0.029  0.037  0.035  0.031  0.028  0.037  0.022  0.064  0.041 -4.980

3.4.2 Store Choice Model

As discussed in section 3.2.2, I estimate a conditional logit model of store choice, where the choice is conditional on shopping at one of the stores that we observe.¹⁹ The results from eight different specifications of the store choice model are presented in Tables 3.9, 3.10, and 3.11. Specifications I-III have identical demographic variables, but involve progressively longer lagged price indices. Specifications IV-VI are nearly identical to I-III; they differ only in that they include an additional variable that measures whether the household made a purchase at the store in the previous two weeks. Specifications VII and VIII simultaneously incorporate price indices from several periods for a subset of products, both with and without the lagged store choice variable.²⁰

The two main results of my analysis of store choice are: (1) that observable demographics significantly affect households’ choice of store, even after incorporating a measure of path dependence, and (2) that, for the product categories for which we have data, households are (at least in the short term) relatively inelastic with respect to store choice.

Table 3.9 presents the coefficients on the demographic variables in Specifications III and VI. These coefficients remain essentially unchanged with respect to different combinations of price index variables. The demographic variables include an indicator variable equal to one if the household made a purchase (of any kind) at the store in the previous two weeks. This lagged store choice variable accounts for two things: first, it acts like a household-level fixed effect, and second, it accounts for the fact that it is easier to shop at a store when you know that store’s layout.²¹ With respect to observable demographics, I find that people with lower incomes were less likely to shop at stores A and B, and more likely to shop at stores C, D and E. Having an unemployed female in the household at the beginning of the sample period was a significant factor in store choice (although having an unemployed male was not), with unemployed female households much more likely to shop at store C. Non-white households were more likely to shop at store A, and households that subscribed to a newspaper were substantially more likely to shop at stores A, D, and to a lesser extent E. They were less likely to shop at store B. Furthermore, the coefficients on these demographic variables are largely invariant to the other aspects of the specification (depending primarily on whether lagged store choice is included). For this reason, I only present these coefficients for Specifications III and VI in Table 3.9. Although the effects of expenditure levels on store choice are statistically significant, they are not economically so.²²

As mentioned in section 3.2.2, I explored a variety of specifications for the store choice model, including lags of price indices, alternative measures of price, additional demographic variables, and additional index variables measuring the fraction of the category that was featured in the store circular or an end-of-aisle display. The evidence on the effects of price on store choice was less encouraging, though, as noted earlier, it is in substantial agreement with the literature. The coefficients for the prices of Cookies and Detergent have the expected sign, and are statistically significant. Unfortunately, although some of the price index variables are significant, many are only significant at the five percent level. Given the number of coefficients, it is not surprising that a subset are statistically significant. Furthermore, many of the price index variables do not have the expected sign. Cat food, bar soap, and yogurt, for example, all have statistically significantly positive coefficients in several specifications. This suggests that these price indices may be capturing effects other than price (i.e., that they are correlated with an omitted variable). This gives me less confidence in interpreting the coefficient on soft drinks, which (although the point estimates do not move too wildly) is only significant when I do not account for path dependence. While I do not report their results here, I also estimated models using alternative price indices, including the Stone price index and a variety of indices measuring the extent of discounts offered. These alternative measures of price did not appear to have any effect on store choice.

19 For two different sets of stores, there are a non-trivial number of households that shop at both stores. This is true for stores A&D and B&C. I treat going to both stores as a separate alternative. In generating the price variables in this case, I use the lower of the two price indices of the stores in the bundle.

20 Note: The coefficients are all measured with respect to Store C. If a household made multiple purchases at the same store in the same week, I collapsed these into a single purchase occasion.

21 In the case of the “bundled” stores, the variable is created slightly differently. For example, if you went to stores A and D in the last two weeks, then this week the variable would be one for store A, store D, and the bundle of stores A and D. If you only went to store A in the last two weeks, then this week the variable would be one for store A, zero for store D, and 0.5 for the bundle of stores A and D.

22 I explored using logged expenditure, as well as nonlinear effects from expenditure levels, but the effects were not substantially different.
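The construction of the lagged store choice variable in footnote 21, including the 0.5 value for a partially visited bundle, can be sketched in a few lines of Python. This is an illustrative reading rather than the code used for the estimates: the rule "share of the bundle's member stores visited" is an assumption that reproduces the two cases given in the footnote, and the coefficient `gamma` is a made-up number used only to show how the variable would enter conditional logit probabilities.

```python
import math

# Choice alternatives, including the two store "bundles" described in footnote 19.
ALTERNATIVES = {
    "A": ("A",), "B": ("B",), "C": ("C",), "D": ("D",), "E": ("E",),
    "A and D": ("A", "D"), "B and C": ("B", "C"),
}

def lagged_visit_variable(recent_stores):
    """Share of each alternative's member stores visited in the last two weeks.

    Assumption: a bundle's value is the fraction of its stores that were visited,
    which yields 1 when both were visited and 0.5 when only one was.
    """
    recent = set(recent_stores)
    return {
        name: sum(store in recent for store in members) / len(members)
        for name, members in ALTERNATIVES.items()
    }

def choice_probabilities(utilities):
    """Conditional logit probabilities from a dict of systematic utilities."""
    top = max(utilities.values())          # subtract the max for numerical stability
    expu = {name: math.exp(v - top) for name, v in utilities.items()}
    total = sum(expu.values())
    return {name: e / total for name, e in expu.items()}

# Hypothetical household that visited only store A in the previous two weeks.
lag = lagged_visit_variable({"A"})
gamma = 1.5                                # hypothetical coefficient on the lag variable
probs = choice_probabilities({name: gamma * x for name, x in lag.items()})
print(lag["A"], lag["D"], lag["A and D"])  # prints: 1.0 0.0 0.5
```

In this toy setup the utilities contain only the lagged-visit term, so the implied probabilities are ordered store A, then the A and D bundle, then the stores not visited; a full specification would add the demographic and price index terms described above.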

3.4.3 Counter-Factuals

Marginal Costs In order to use the estimated demand system to recover estimates of the expected profits lost from uniform pricing, I return to the steps discussed in section 1.3. As discussed there, in order to recover the marginal costs for each product in each week, I need to make an assumption about the actual price-setting behavior of the retailer during the sample period. The assumption I choose to make is that the retailer maximizes total weekly profit for the soft drink category, and charges the profit-maximizing price for each product in each week. This assumption implies the following J first-order conditions (one for each good j) for each week t: X ∂Πt,N on−U nif orm ∂Et [Qkt (pt )] = Et [Qjt (pt )] + (pkt − ckt ) =0 ∂pjt ∂pjt

(3.17)

k∈J

By taking these first-order conditions numerically,23 I am able to solve the system of J equations in J unknowns for each week (the $c_{jt}$'s) and recover the implied weekly marginal costs for each product. Note that in recovering these marginal costs, the level of $Q_{jt}$ drops out. That is, the implied marginal cost is independent of the total number of households shopping in that week. In recovering the marginal costs from the first-order conditions of the retailer, I am implicitly assuming that the demand system that I have estimated is the true demand system (and by association, that it is the demand system that store A used in setting its prices), and that in each week, the retailer knows the distribution of the budgets of the households. Solving this system of equations gives me the implied marginal cost $c_{jt}$ for each good, at store A, during week t.

Table 3.12 contains summary statistics for these implied marginal costs. In general, the estimated marginal costs are substantially lower than the wholesale prices reported in the Dominick’s dataset (taken from a geographically proximate competing grocery retailer) and shown in Table 3.4. This discrepancy may be explained by the fact that store A is part of a large chain, and therefore may have received preferential wholesale prices. Additionally, these implied marginal costs may be capturing the effects of slotting allowances or nonlinearities in wholesale prices (such as block discounts) not accounted for in the Dominick’s data (see Israilevich (2004)). If I am underestimating the true marginal costs, this would suggest that either retailers are pricing non-optimally with respect to the soft drink category, that the estimated demand model is incorrect, or that the assumed supply model is incorrect (e.g., retailers may be engaging in cross-category subsidization).

23 I do this by (1) choosing a fixed number of households, (2) simulating demand from these households at the observed prices at store A in week t, and (3) numerically taking the derivatives of demand for each good with respect to the prices of all goods.
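The cost-recovery step can be illustrated with a minimal sketch. It assumes a made-up two-good linear demand system (the names `G`, `a`, `true_c`, and `make_demand` are hypothetical and stand in for the simulated demand of the estimated model), takes the demand derivatives by finite differences, and solves the J first-order conditions for the implied costs. It is not the procedure used to produce Table 3.12, only an instance of the same algebra.

```python
# Sketch: back out marginal costs from the retailer's first-order conditions.
# The linear demand system below is a made-up stand-in, NOT the estimated model.

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def jacobian(demand, p, h=1e-6):
    """Central finite differences: J[k][j] = dQ_k / dp_j."""
    n = len(p)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = list(p); up[j] += h
        dn = list(p); dn[j] -= h
        q_up, q_dn = demand(up), demand(dn)
        for k in range(n):
            J[k][j] = (q_up[k] - q_dn[k]) / (2 * h)
    return J

def implied_costs(demand, p):
    """Solve Q_j + sum_k (p_k - c_k) dQ_k/dp_j = 0 for c, given prices p."""
    q = demand(p)
    J = jacobian(demand, p)
    n = len(p)
    Jt = [[J[k][j] for k in range(n)] for j in range(n)]   # transpose
    markup = solve(Jt, [-qj for qj in q])                   # J' (p - c) = -Q
    return [p[j] - markup[j] for j in range(n)]

G = [[-4.0, 1.0], [1.0, -3.0]]   # hypothetical demand slopes dQ_k/dp_j
a = [3.0, 2.5]                   # hypothetical demand intercepts
true_c = [0.10, 0.08]            # hypothetical "true" marginal costs

def make_demand(n_households):
    """Expected demand scaled by the number of households shopping."""
    return lambda p: [n_households * (a[k] + sum(G[k][j] * p[j] for j in range(2)))
                      for k in range(2)]

# Profit-maximizing prices for linear demand: (G + G') p = G' c - a
lhs = [[G[k][j] + G[j][k] for j in range(2)] for k in range(2)]
rhs = [sum(G[k][j] * true_c[k] for k in range(2)) - a[j] for j in range(2)]
p_star = solve(lhs, rhs)

c_small = implied_costs(make_demand(1), p_star)     # one household
c_large = implied_costs(make_demand(500), p_star)   # five hundred households
print(c_small, c_large)
```

Both calls recover the same costs because demand and its derivatives scale together, which is the sense in which the level of demand drops out of the first-order conditions.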

CHAPTER 3. COSTS OF UNIFORM PRICING


Table 3.9: Coefficients on Demographic Variables from Specifications III and VI of the Conditional Logit Model of Store Choice

Specification III

                                                       Store
Variable                  A          A and D    B          B and C    D          E
Constant                  4.633      4.595      2.38       1.18       -0.544     -4.904
                          [0.190]    [0.235]    [0.203]    [0.279]    [0.276]    [0.296]
Expenditure ($)           -0.012     -0.010     -0.001     0.002      -0.014     -0.012
                          [0.000]    [0.001]    [0.000]    [0.000]    [0.001]    [0.001]
Log(Income)               -0.443     -0.507     -0.256     -0.246     0.030      0.452
                          [0.019]    [0.024]    [0.020]    [0.028]    [0.027]    [0.028]
Unemployed Female         -0.768     -0.479     -0.070     0.209      -0.640     -1.110
                          [0.038]    [0.049]    [0.036]    [0.049]    [0.055]    [0.060]
Unemployed Male           -0.053     -0.542     -0.427     -0.176     -2.183     0.127
                          [0.069]    [0.097]    [0.078]    [0.105]    [0.256]    [0.106]
Non-white                 0.921      0.243      0.271      -0.710     -0.907     -0.819
                          [0.035]    [0.045]    [0.037]    [0.063]    [0.067]    [0.065]
Subscribes to Newspaper   0.428      0.089      -0.463     -0.311     0.361      0.150
                          [0.034]    [0.045]    [0.038]    [0.052]    [0.044]    [0.043]
Household Has No Kids     -0.920     -16.726    -1.071     1.752      -16.315    -0.807
                          [0.209]    [742.304]  [0.248]    [0.170]    [713.396]  [0.299]

Specification VI (Adds an Indicator Variable for Whether the Household Visited that Store in the Last Two Weeks)

                                                       Store
Variable                  A          A and D    B          B and C    D          E
Constant                  2.586      3.887      2.462      1.849      -0.679     -1.668
                          [0.309]    [0.321]    [0.294]    [0.306]    [0.387]    [0.432]
Expenditure ($)           -0.011     -0.007     0.000      0.002      -0.013     -0.014
                          [0.001]    [0.001]    [0.001]    [0.001]    [0.001]    [0.001]
Log(Income)               -0.263     -0.410     -0.261     -0.260     0.027      0.178
                          [0.030]    [0.032]    [0.029]    [0.030]    [0.037]    [0.041]
Unemployed Female         -0.552     -0.306     -0.026     0.288      -0.586     -0.631
                          [0.062]    [0.065]    [0.052]    [0.054]    [0.080]    [0.086]
Unemployed Male           0.331      -0.027     -0.278     -0.013     -1.42      0.091
                          [0.120]    [0.131]    [0.112]    [0.113]    [0.280]    [0.165]
Non-white                 0.551      0.093      0.090      -0.747     -0.642     -0.538
                          [0.056]    [0.060]    [0.054]    [0.066]    [0.082]    [0.086]
Subscribes to Newspaper   0.450      0.151      -0.400     -0.254     0.475      0.373
                          [0.055]    [0.059]    [0.053]    [0.056]    [0.065]    [0.067]
Does not Have Kids        1.088      -14.635    -0.368     1.773      -13.314    1.455
                          [0.310]    [631.998]  [0.279]    [0.197]    [676.928]  [0.372]


Table 3.10: Coefficients on Price Index Variables for Specifications I-VI of Conditional Logit Model of Store Choice. Standard errors are in brackets.

Specification                     I         II        III       IV        V         VI
Prices                            Current   Lagged    Lagged    Current   Lagged    Lagged
Product Category                            1 Week    2 Weeks             1 Week    2 Weeks
Bacon                             0.022     -0.005    -0.036    0.019     -0.042    -0.063
                                  [0.023]   [0.023]   [0.023]   [0.032]   [0.032]   [0.032]
BBQ Sauce                         -0.121    -0.019    0.139     -0.101    0.087     0.180
                                  [0.093]   [0.094]   [0.095]   [0.128]   [0.130]   [0.132]
Butter                            -0.065    -0.050    -0.023    -0.002    -0.069    -0.056
                                  [0.067]   [0.068]   [0.068]   [0.092]   [0.094]   [0.094]
Cat Food                          1.217     0.993     1.070     0.286     0.181     0.718
                                  [0.284]   [0.288]   [0.283]   [0.412]   [0.422]   [0.417]
Cereal                            -0.123    0.006     0.166     -0.139    0.159     0.240
                                  [0.051]   [0.052]   [0.052]   [0.074]   [0.074]   [0.075]
Cleansers                         0.129     0.101     -0.041    0.039     -0.055    -0.217
                                  [0.061]   [0.062]   [0.062]   [0.088]   [0.091]   [0.091]
Coffee                            0.094     0.060     0.010     0.123     -0.007    -0.001
                                  [0.035]   [0.036]   [0.036]   [0.049]   [0.050]   [0.049]
Cookies                           -0.468    -0.479    -0.526    -0.437    -0.291    -0.410
                                  [0.071]   [0.071]   [0.071]   [0.101]   [0.103]   [0.102]
Crackers                          -0.023    -0.01     0.038     -0.090    -0.001    0.065
                                  [0.043]   [0.043]   [0.043]   [0.062]   [0.062]   [0.062]
Detergents                        -0.064    -0.076    -0.042    -0.095    -0.003    0.026
                                  [0.020]   [0.020]   [0.020]   [0.028]   [0.028]   [0.028]
Eggs                              0.073     0.149     0.011     -0.101    0.051     -0.303
                                  [0.057]   [0.058]   [0.057]   [0.084]   [0.085]   [0.083]
Fabric Softener                   -0.086    -0.086    -0.091    -0.131    -0.157    -0.065
                                  [0.054]   [0.055]   [0.054]   [0.076]   [0.078]   [0.078]
Frozen Pizza                      -0.010    -0.028    -0.032    -0.031    -0.026    -0.042
                                  [0.031]   [0.033]   [0.032]   [0.042]   [0.046]   [0.045]
Hot Dogs                          0.015     0.059     0.105     -0.028    0.043     0.122
                                  [0.032]   [0.032]   [0.032]   [0.044]   [0.045]   [0.045]
Ice Cream                         -0.051    -0.048    -0.002    -0.015    -0.035    0.029
                                  [0.036]   [0.037]   [0.036]   [0.049]   [0.051]   [0.050]
Peanut Butter                     -0.091    -0.093    0.003     -0.194    -0.072    0.092
                                  [0.045]   [0.046]   [0.044]   [0.064]   [0.065]   [0.061]
Snacks                            -0.173    -0.247    -0.270    -0.160    -0.198    -0.237
                                  [0.061]   [0.061]   [0.061]   [0.085]   [0.085]   [0.085]
Bar Soap                          0.304     0.252     0.288     0.465     0.324     0.470
                                  [0.064]   [0.065]   [0.064]   [0.089]   [0.089]   [0.088]
Soft Drinks                       -0.041    -0.101    -0.082    -0.026    -0.037    -0.019
                                  [0.031]   [0.032]   [0.032]   [0.044]   [0.045]   [0.045]
Sugarless Gum                     0.007     -0.012    0.058     0.026     0.058     0.209
                                  [0.069]   [0.069]   [0.069]   [0.094]   [0.096]   [0.096]
Toilet Tissue                     -0.066    -0.003    -0.002    -0.042    0.055     -0.008
                                  [0.045]   [0.045]   [0.046]   [0.063]   [0.063]   [0.064]
Yogurt                            0.432     0.195     0.681     0.149     -0.053    0.869
                                  [0.105]   [0.105]   [0.105]   [0.148]   [0.150]   [0.151]
Shopped at Store in Past 2 Weeks                                4.372     4.376     4.386
                                                                [0.029]   [0.029]   [0.029]

Number of Observations            277,011   274,337   273,489   277,011   274,337   273,489
Pseudo R2                         0.124     0.1241    0.1246    0.5243    0.5282    0.5291
Log Likelihood                    -67,458   -66,800   -66,586   -36,632   -35,982   -35,821

Effect on Market Share from a increase in Soft Drink Prices:
Before:                           .3405     .3407     .3414     .3405     .3407     .3412
After:                            .3395     .3384     .3395     .3402     .3404     .3412


Table 3.11: Coefficients on Price Indices for Specifications VII and VIII of Conditional Logit Model of Store Choice. Standard errors are in brackets.

                                  Specification VII             Specification VIII
                                  Current   Prices    Prices    Current   Prices    Prices
                                  Prices    Lagged    Lagged    Prices    Lagged    Lagged
Product Category                            1 week    2 weeks             1 week    2 weeks
Cat Food                          0.657     0.469     0.715     -0.192    0.283     0.681
                                  [0.317]   [0.335]   [0.316]   [0.461]   [0.485]   [0.460]
Cereal                            -0.129    -0.019    0.126     -0.153    0.121     0.247
                                  [0.054]   [0.055]   [0.055]   [0.077]   [0.077]   [0.077]
Coffee                            0.091     0.009     0.002     0.162     -0.066    -0.003
                                  [0.036]   [0.036]   [0.036]   [0.050]   [0.049]   [0.050]
Cookies                           -0.372    -0.189    -0.365    -0.391    -0.021    -0.248
                                  [0.078]   [0.081]   [0.077]   [0.112]   [0.117]   [0.112]
Detergents                        -0.049    -0.036    -0.014    -0.076    0.042     0.035
                                  [0.021]   [0.021]   [0.021]   [0.029]   [0.029]   [0.029]
Eggs                              0.023     0.148     -0.043    -0.183    0.112     -0.337
                                  [0.060]   [0.061]   [0.059]   [0.088]   [0.089]   [0.086]
Hot Dogs                          -0.036    0.013     0.074     -0.060    0.005     0.105
                                  [0.032]   [0.032]   [0.033]   [0.045]   [0.045]   [0.047]
Peanut Butter                     -0.034    -0.064    0.025     -0.131    -0.072    0.090
                                  [0.048]   [0.049]   [0.046]   [0.066]   [0.068]   [0.063]
Salty Snacks                      -0.069    -0.220    -0.134    -0.152    -0.122    -0.135
                                  [0.069]   [0.072]   [0.069]   [0.095]   [0.096]   [0.094]
Bar Soap                          0.241     0.035     0.198     0.362     -0.075    0.384
                                  [0.077]   [0.083]   [0.075]   [0.106]   [0.113]   [0.103]
Fabric Softener                   -0.064    -0.081    -0.025    -0.126    -0.097    0.008
                                  [0.059]   [0.062]   [0.058]   [0.083]   [0.087]   [0.083]
Soft Drinks                       -0.008    -0.101    -0.081    0.010     -0.070    -0.018
                                  [0.033]   [0.033]   [0.033]   [0.046]   [0.045]   [0.046]
Yogurt                            0.327     0.037     0.586     0.075     -0.182    0.812
                                  [0.109]   [0.108]   [0.107]   [0.156]   [0.154]   [0.153]

                                  Specification VII             Specification VIII
Shopped at Store in Past 2 Weeks                                4.389 [0.029]
Number of Observations            273,489                       273,489
Pseudo R2                         0.1255                        0.5296
Log Likelihood                    -66,524                       -35,786

Effect on Market Share from a increase in Soft Drink Prices:
Before:                           0.3414                        0.3414
After:                            0.3370                        0.3406

Table 3.12: Summary Statistics for Marginal Costs (in Dollars per 12oz Serving) Implied by the Model

                                            Mean       Std. Dev.  Weeks w/   Mean       Std. Dev.  Mean
                                            Marginal   of Marg.   Negative   Markup     of         Margin
Product                                     Cost       Cost       Costs      ($)        Markup     (%)
7up, 2L Bottle                              0.123      0.090      6          0.108      0.035      53.2
7up, 6-pack 12oz cans                       0.310      0.204      2          0.081      0.135      27.7
Diet 7up, 24-pack 12oz cans                 0.148      0.088      3          0.119      0.032      49.1
Diet 7up, 2L Bottle                         0.148      0.096      3          0.083      0.046      41.7
Coke, 12-pack 12oz cans                     0.160      0.120      2          0.179      0.048      58.5
Coke, 24-pack 12oz cans                     0.093      0.079      12         0.181      0.026      71.1
Coke, 2L Bottle                             0.099      0.075      10         0.150      0.027      65.3
Diet Coke, 12-pack 12oz cans                0.165      0.133      3          0.179      0.054      58.8
Diet Coke, 24-pack 12oz cans                0.085      0.084      19         0.189      0.031      74.8
Diet Coke, 2L Bottle                        0.094      0.080      16         0.154      0.032      67.6
Diet Caffeine Free Coke, 12-pack 12oz cans  0.165      0.127      3          0.180      0.048      58.4
Diet Caffeine Free Coke, 24-pack 12oz cans  0.080      0.080      18         0.190      0.029      75.5
Private Label, 12oz Can                     0.036      0.034      19         0.154      0.018      82.9
Pepsi, 12-pack 12oz cans                    0.149      0.130      3          0.190      0.078      61.6
Pepsi, 24-pack 12oz cans                    0.082      0.077      14         0.191      0.024      75.0
Pepsi, 2L Bottle                            0.087      0.068      14         0.161      0.023      69.3
Pepsi, 6-pack 12oz cans                     0.249      0.136      1          0.164      0.080      43.7
Diet Pepsi, 3L Bottle                       0.077      0.014      0          0.123      0.014      61.6
Diet Pepsi, 12-pack 12oz cans               0.170      0.137      4          0.171      0.088      55.6
Diet Pepsi, 24-pack 12oz cans               0.109      0.074      9          0.167      0.023      64.7
Diet Pepsi, 2L Bottle                       0.111      0.067      3          0.134      0.023      58.5
RC, 2L Bottle                               0.014      0.035      29         0.165      0.020      93.8
RC, 3L Bottle                               0.084      0.033      6          0.108      0.017      58.1
Diet Rite, 24-pack 12oz cans                0.116      0.062      1          0.147      0.023      58.7
Diet Rite, 2L Bottle                        0.105      0.068      3          0.113      0.025      56.0

As is frequently the case (see Villas-Boas (2002)), my estimates imply that in some weeks, for some products, marginal costs are negative. Although this seems economically bizarre, in theory these negative costs could be explained by slotting allowances. In practice (see Israilevich (2004)), the arrangements between retailers and manufacturers regularly involve nonlinear contracting schemes such as block discounts (which imply negative marginal costs over some regions). To the extent that my assumption of constant marginal costs is violated, I may be picking up some of these nonlinearities.

Figure 3.1 shows a typical path of prices and implied marginal costs over time. The key feature to notice here is that intertemporal marginal cost (e.g., wholesale price) variation is responsible for nearly all of the intertemporal price variation. This feature is mirrored in the Dominick’s data. Figure 3.2 plots the average markup in cents per 12oz serving over the sample period implied by the derived marginal costs. The average markup across products appears to be roughly fourteen cents per 12oz serving, with occasional spikes upwards (and one large spike downwards). Together with the low standard deviations on margins shown in Table 3.12, this also agrees with what is observed in the Dominick’s data.

“Optimal” Uniform Prices

In order to calculate the profits the firm would have earned by following a uniform pricing strategy, I must first solve for the “optimal” uniform prices. I do this by restricting the prices each week to be uniform by manufacturer-brand-size.24 Then for each week, I numerically solve for the set of prices that maximizes expected profits, subject to this restriction. Table 3.13 presents summary statistics on the differences between these “optimal” uniform prices and the non-uniform prices actually charged by the retailer. Contrary to (my) expectations, the majority of the differences were not uniformly positive or negative. That is, in some weeks the non-uniform price was higher than the optimal uniform price, while in other weeks it was lower. In hindsight, this is actually suggested by the variation in the price ordering of the varieties in Figure 1.4. Furthermore, for all but two products (2L containers of RC and Diet Rite), the average difference between the non-uniform and the optimal uniform prices was less than one cent per 12oz serving.

24 For example, I restrict 12-packs of 12oz cans of Coke, Diet Coke, and Diet Caffeine Free Coke to all sell at the same price each week, although I allow this price to vary across weeks.
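For reference, the restricted problem can be written in the notation of the first-order conditions (3.17). This is only a sketch: the group label $g(j)$, the group price $p_{gt}$, and the subscript on $\Pi_{t,\text{Uniform}}$ are introduced here for illustration, and the text states only that the problem is solved numerically. Let $g(j)$ denote the manufacturer-brand-size group containing product j, and require $p_{jt} = p_{g(j)t}$ for every j. The retailer then chooses one price per group:

\[
\max_{\{p_{gt}\}} \; \Pi_{t,\text{Uniform}} = \sum_{k \in J} \left( p_{g(k)t} - c_{kt} \right) E_t[Q_{kt}(p_t)]
\]

At an interior optimum, the chain rule gives one condition per group g, which is the sum of the unrestricted expressions in (3.17) over the products in that group:

\[
\frac{\partial \Pi_{t,\text{Uniform}}}{\partial p_{gt}} = \sum_{j \in g} \left( E_t[Q_{jt}(p_t)] + \sum_{k \in J} (p_{kt} - c_{kt}) \frac{\partial E_t[Q_{kt}(p_t)]}{\partial p_{jt}} \right) = 0
\]

Under the restriction, the bracketed terms need only sum to zero within a group, so at the optimal uniform price the individual terms can be of either sign.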

Figure 3.1: Graph of the Price and Implied Marginal Cost (in cents per 12oz serving) for a 2L Bottle of Regular Pepsi, 6/91-6/93
[Figure omitted. Vertical axis: Cents per 12 Ounce Serving (0 to 40). Horizontal axis: Week (08jun1991 to 28may1993). Series: Price of 2L Pepsi (288oz size); Cost of 2L Pepsi (288oz size).]

Figure 3.2: Graph of the Average Markup (in cents per 12oz serving) Across Products, 6/91-6/93
[Figure omitted. Panel title: Average Markup Over Time. Vertical axis: Cents per 12 Ounce Serving (10 to 20). Horizontal axis: Week (08jun1991 to 28may1993).]

Figure 3.3: Graph of the Maximum Difference Across Products (in cents per 12oz serving) Between a Product’s Uniform and Non-Uniform Prices, 6/91-6/93
[Figure omitted. Panel title: Maximum Price Difference Per Serving Between Uniform and Non-Uniform Prices. Vertical axis: Cents per 12oz Serving (0 to 15). Horizontal axis: Week (08jun1991 to 28may1993).]

Table 3.13: Summary Statistics on the Differences Between Observed Non-Uniform Prices and “Optimal” Uniform Prices (in Dollars per 12oz Serving)

                                                                      Greatest      Greatest
                                            Mean        Std. Dev. of  Increase      Decrease
Product                                     Difference  Difference    from Uniform  from Uniform
7up, 2L Bottle                              -0.001      0.015         0.145         0.028
Diet 7up, 2L Bottle                         -0.001      0.016         0.152         0.023
Coke, 12-pack 12oz cans                     0.004       0.024         0.053         0.161
Coke, 24-pack 12oz cans                     0.000       0.018         0.111         0.135
Coke, 2L Bottle                             0.001       0.005         0.021         0.023
Diet Coke, 12-pack 12oz cans                -0.002      0.012         0.054         0.035
Diet Coke, 24-pack 12oz cans                0.001       0.011         0.046         0.084
Diet Coke, 2L Bottle                        0.001       0.006         0.021         0.018
Diet Caffeine Free Coke, 12-pack 12oz cans  -0.001      0.010         0.051         0.030
Diet Caffeine Free Coke, 24-pack 12oz cans  -0.001      0.007         0.046         0.016
Pepsi, 12-pack 12oz cans                    -0.005      0.031         0.232         0.031
Pepsi, 24-pack 12oz cans                    0.000       0.004         0.017         0.019
Pepsi, 2L Bottle                            -0.001      0.005         0.031         0.011
Diet Pepsi, 12-pack 12oz cans               -0.001      0.022         0.184         0.046
Diet Pepsi, 24-pack 12oz cans               -0.002      0.007         0.039         0.013
Diet Pepsi, 2L Bottle                       0.002       0.009         0.019         0.065
RC, 2L Bottle                               0.011       0.015         0.014         0.056
Diet Rite, 2L Bottle                        -0.028      0.037         0.125         0.023

Profit Differences

Again using my marginal cost estimates and the actual quantities sold, I can estimate the profits that store A actually earned in each week. In addition, by simulating expected demand, I can calculate the profits that the firm expected to earn each week. The difference between these two numbers is that the former is scaled by the number of shoppers who actually went shopping in that week. These give me a measure of the profits earned by the firm under the non-uniform pricing regime. Comparing the expected profits at the observed non-uniform prices with the expected profits at the “optimal” uniform prices yields a weekly estimate of the percentage profit decrease that store A would have experienced if it had charged uniform prices.

Table 3.14 shows the detailed results of these calculations for a typical week of the sample: the week beginning July 7, 1991. Several features are apparent. The first is that demand is strongly skewed towards the lowest-priced products. The ten products priced at $0.21 per 12oz serving or lower account for by far the largest share of the quantity sold. The second feature is that many of the prices are the same or nearly the same under both uniform and non-uniform pricing policies. In this week, store A actually charged the same price for 24-packs of 12oz cans of both Coke and Diet Coke. Because I assume (in order to identify the marginal costs) that the retailer charged the optimal prices in each week, the results are skewed towards finding a smaller estimate of the profit difference. Third, much of the increase in profits comes from a significant decrease in the price of a single product: 2L bottles of Royal Crown (RC) cola. Finally, I note in passing that demand for some goods increased in spite of an increase in their price going from uniform to non-uniform pricing. This can be attributed to the effects of cross-price elasticities: the prices of many other goods also moved.

Table 3.14: Uniform and Non-Uniform Prices, Quantities and Estimated Profits for the Week of July 7, 1991. Prices and marginal costs reported are in cents per 12oz serving. Quantity is measured in 12oz servings. Profits are measured in dollars. All prices and profits are in nominal terms. The total difference in profits for the week from the two pricing strategies is $61.52. The 3L size of Pepsi was not offered in this week.

                                                                 p_Non-    Q_Non-    Marginal  Π_Uniform  Π_Non-Uniform
Product                                     p_Uniform  Q_Uniform Uniform   Uniform   Cost      ($)        ($)
7up, 2L Bottle                              31.9       58        31.8      57        24.9      4.05       3.94
7up, 6-pack 12oz cans                       50.9       12        48.0      24        58.5      -0.90      -2.55
Diet 7up, 24-pack 12oz cans                 20.8       5381      20.8      5406      6.6       761.30     764.38
Diet 7up, 2L Bottle                         31.9       8         31.8      10        29.1      0.23       0.25
Coke, 12-pack 12oz cans                     47.1       57        47.0      58        33.5      7.80       7.76
Coke, 24-pack 12oz cans                     20.8       5609      20.8      5538      0.7       1126.52    1115.18
Coke, 2L Bottle                             32.0       83        31.8      85        18.8      10.88      10.98
Diet Coke, 12-pack 12oz cans                47.1       81        47.0      81        32.8      11.58      11.48
Diet Coke, 24-pack 12oz cans                20.8       5788      21.4      5463      -0.7      1242.81    1209.34
Diet Coke, 2L Bottle                        32.0       83        31.9      86        20.4      9.58       9.82
Diet Caffeine Free Coke, 12-pack 12oz cans  47.1       70        47.0      71        32.9      10.01      9.99
Diet Caffeine Free Coke, 24-pack 12oz cans  20.8       5944      20.8      5896      -0.7      1272.49    1265.51
Private Label, 12oz Can                     17.0       3050      17.0      3016      1.1       484.78     479.22
Pepsi, 12-pack 12oz cans                    48.8       68        47.0      87        31.0      12.04      13.88
Pepsi, 24-pack 12oz cans                    20.8       9099      20.8      8964      0.0       1887.23    1862.89
Pepsi, 2L Bottle                            31.8       284       31.8      274       19.0      36.41      35.04
Pepsi, 6-pack 12oz cans                     50.4       108       48.0      151       40.8      10.27      10.80
Diet Pepsi, 12-pack 12oz cans               48.8       51        47.0      69        31.6      8.79       10.69
Diet Pepsi, 24-pack 12oz cans               20.8       8337      20.7      8283      2.9       1492.67    1479.67
Diet Pepsi, 2L Bottle                       31.8       94        31.8      90        20.5      10.75      10.20
RC, 2L Bottle                               19.7       2703      14.1      4495      -1.7      578.59     708.66
RC, 3L Bottle                               20.1       298       20.0      296       9.7       30.78      30.44
Diet Rite, 24-pack 12oz cans                32.5       285       31.2      348       18.9      38.73      42.75
Diet Rite, 2L Bottle                        19.7       367       31.7      16        24.5      -17.43     1.16
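As a consistency check on Table 3.14, the short script below sums the two profit columns and compares the totals. The numbers are transcribed from the table; the variable names are chosen here for illustration.

```python
import math

# Per-product weekly profits in dollars, transcribed from Table 3.14
# (same product order as the table).
profit_uniform = [4.05, -0.90, 761.30, 0.23, 7.80, 1126.52, 10.88, 11.58,
                  1242.81, 9.58, 10.01, 1272.49, 484.78, 12.04, 1887.23,
                  36.41, 10.27, 8.79, 1492.67, 10.75, 578.59, 30.78, 38.73,
                  -17.43]
profit_non_uniform = [3.94, -2.55, 764.38, 0.25, 7.76, 1115.18, 10.98, 11.48,
                      1209.34, 9.82, 9.99, 1265.51, 479.22, 13.88, 1862.89,
                      35.04, 10.80, 10.69, 1479.67, 10.20, 708.66, 30.44,
                      42.75, 1.16]

total_uniform = math.fsum(profit_uniform)
total_non_uniform = math.fsum(profit_non_uniform)
difference = total_non_uniform - total_uniform
percent_of_non_uniform = 100.0 * difference / total_non_uniform

# Index 20 is the 2L bottle of RC, the product singled out in the text.
rc_2l_gain = profit_non_uniform[20] - profit_uniform[20]

print(f"uniform: {total_uniform:.2f}  non-uniform: {total_non_uniform:.2f}")
print(f"difference: {difference:.2f} ({percent_of_non_uniform:.2f}% of non-uniform)")
print(f"RC 2L alone: {rc_2l_gain:.2f}")
```

The summed difference reproduces the $61.52 reported in the table caption. The 2L bottle of RC by itself gains $130.07, more than the total, so the remaining products net out negative.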


The weekly expected differences in profits are presented in Figures 3.4 and 3.5. I estimate a distinct mass point at zero lost profits. As noted above, this is largely due to my assumption that, in each week, the prices charged by store A were optimal. This assumption implies that for weeks in which store A actually charged uniform prices, it could not have expected to lose any profits by doing so. My estimates imply that by charging uniform rather than non-uniform prices, the retailer would have lost $36.56 per week in profit, or a total of $3,803 over the two-year period of my sample.

The prospect of earning an additional $3,803 in profits (roughly $5,135 in 2004 dollars) over a two-year period for the soft drink category may seem small, but this is only a single store in a much larger chain of more than 100 stores. If the chain were able to realize similar profit increases at its other stores, a rough estimate of the profit increase would be over $250,000 per year in 2004 dollars. This would presumably be more than enough to hire an empirical economist to determine the optimal prices for each product in each store in each week. Furthermore, this estimate is solely for the soft drink category. While it is not clear what the results would be for other categories, similar profit increases may be possible.
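The chain-wide figure follows from simple arithmetic on the numbers quoted above. The sketch below reproduces it, taking `N_STORES = 100` as the lower bound implied by "more than 100 stores"; that value and the variable names are assumptions made for illustration.

```python
# Back-of-the-envelope check of the chain-wide figure quoted in the text.
WEEKLY_LOSS = 36.56       # dollars per week, from the text
TOTAL_NOMINAL = 3803.0    # two-year total in nominal dollars, from the text
TOTAL_2004 = 5135.0       # the same total in 2004 dollars, from the text
N_STORES = 100            # assumption: lower bound on the size of the chain
YEARS = 2

weeks_implied = TOTAL_NOMINAL / WEEKLY_LOSS    # length of the sample in weeks
per_store_per_year = TOTAL_2004 / YEARS        # 2004 dollars
chain_per_year = per_store_per_year * N_STORES

print(f"implied weeks in sample: {weeks_implied:.1f}")
print(f"per store per year (2004 $): {per_store_per_year:,.2f}")
print(f"chain per year (2004 $): {chain_per_year:,.0f}")
```

With exactly 100 stores the annual figure is $256,750, consistent with "over $250,000", and the two-year total corresponds to about 104 weeks of the weekly average.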

3.5 Interpreting “Lost” Profits

3.5.1 Menu Costs

As mentioned in the introduction, if we set aside demand-side explanations, the two reasons for retailers to charge uniform prices are: to reduce menu costs and to soften price competition with other retailers. In the event that the difference between uniform and non-uniform prices is close to zero, this would suggest that retailers do not expect to lose much (if any) profit by charging uniform prices. On the other hand, if the predicted profit differences are positive, we must try to differentiate between these (and potentially other) explanations for the hypothetical “lost profits”. When talking to store managers, the most frequently offered explanation for the observed price uniformity is some form of menu costs. When pressed, Safeway store managers respond that the reason for uniform pricing is that it is “too much trouble” to price every good separately. In understanding what is meant by “too much trouble” it is important to distinguish between two different kinds of menu costs: physical menu costs and managerial menu costs. One type of menu cost comes from the costs associated with physically changing prices.

Figure 3.4: Graph of the Difference Between Profits from Uniform and Non-Uniform Price Strategies, as a Percent of the Profits Earned at Non-Uniform Prices, 6/91-6/93
[Figure omitted. Vertical axis: Percentage Difference Between Uniform and Non-Uniform Prices (0 to 15). Horizontal axis: Week (08jun1991 to 28may1993).]

Figure 3.5: Graph of the Counterfactual Dollars Lost from Charging Uniform Prices, 6/91-6/93
[Figure omitted. Panel title: Profit Difference in Dollars Between Uniform and Non-Uniform Prices. Vertical axis: Dollars (0 to 500). Horizontal axis: Week (08jun1991 to 28may1993).]

According to Tony Mather, Director Business Systems, Safeway (U.K.): “Pricing at the moment is very labor-intensive. Shelf-edge labels are batch printed, manually sorted and changed by hand while customers are out of the store.”25 Levy et al. (1997) estimate the average menu cost to a large chain-owned grocery store of physically changing a single price tag to be $0.52. To put this in perspective, a typical large grocery store changes the price tags on about 4,000 items each week, and as many as 14,000 tags in some weeks. Although their study was conducted on behalf of a company selling electronic price display tags, their stated aim was to put a lower bound on menu costs, and they report that grocery store executives generally agreed with their findings.

One might think physical menu costs promote uniform pricing, that is, that stores reduce their physical menu costs by charging uniform prices. However, two pieces of evidence suggest that physical menu costs do not explain uniform prices. First, grocery stores typically post prices for every UPC even when they are uniformly priced. Hence, the physical menu costs are the same regardless of whether the products are priced uniformly or non-uniformly. Second, in cases where the physical menu cost is presumably small or insignificant, we still observe uniform prices. Even grocery stores that have implemented electronic display tags, and that can change prices throughout the store at the touch of a button from the store’s central computer, continue to charge uniform prices. Furthermore, online grocery stores, which presumably have nearly zero physical menu costs, also sell at uniform prices.

A second kind of menu cost, and one that has not typically been discussed in the literature, is the managerial cost associated with figuring out what price to charge for each product. While academic papers generally assume that retailers learn optimal prices costlessly, this is an abstraction from reality. In order to learn its demand function, a retailer must experiment by charging a variety of prices (introducing exogenous price variation), and this experimentation can be costly. In addition, the retailer may have to hire personnel or consulting services to determine “optimal” prices. These costs may be substantial. A recent article in Business Week (Keenan 2003) suggests that implementing the advanced techniques offered by pricing consultants typically requires a “12-month average installation” time and a price that “start[s] at around $3 million.” If the additional expected profit to be gained from charging different prices for two products is less than the cost of figuring out what those prices should be, then we will see uniform prices. This suggests that the store’s choice of whether to follow a uniform or non-uniform pricing strategy is more likely a long-term decision than a week-by-week decision. In this case, the relevant comparison is between the present discounted value of the lost expected profits across weeks and the one-time or infrequent cost of either experimentation or consulting services.

Managerial menu costs also suggest a reason that pricing strategies may vary across stores, leading some stores to charge uniform prices while others charge non-uniform prices. Pricing decisions for most large grocery chains are made at the chain level. Store managers at these chains typically receive the week’s prices electronically from company headquarters, and are only responsible for making sure that price labels are printed and placed on shelves. This centralization allows large chains to spread these managerial costs across many stores. However, evidence suggests that even large chains may be influenced by managerial menu costs. Chintagunta et al. (2003) document the fact that Dominick’s Finer Foods grouped its stores into three different categories based on the levels of competition the stores faced, with each of roughly one hundred stores charging one of three menus of prices. Such pricing heuristics presumably lower managerial costs by reducing the dimensionality of the optimal pricing problem, but at the cost of non-optimal prices.

Other pricing heuristics seem to be in widespread use. Both small retailers and large grocery stores26 frequently use constant-markup pricing heuristics, such as pricing all goods at wholesale cost plus a fixed percentage or amount. The apparent widespread use of these pricing heuristics may explain why soft drink prices tend to vary dramatically over time, but not cross-sectionally: while wholesale prices move a good deal over time, wholesale prices are typically uniform within manufacturer-brand. Unfortunately, this raises the question of why manufacturers would choose to price their products uniformly.

25 http://www.symbol.com/uk/Solutions/case study safeway.html

26 The data suggest that Dominick’s Finer Foods (described in section 5) frequently followed a constant-markup pricing strategy.
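The long-run comparison described above, between the discounted stream of profits forgone under uniform pricing and a one-time cost of learning the "optimal" prices, can be sketched as follows. The discount rate, the horizon, and the assumption of 100 identical stores are illustrative choices made here, not values from the text; only the weekly figure and the $3 million consulting price are taken from it.

```python
# Sketch: compare the discounted stream of weekly profits forgone under
# uniform pricing with a one-time cost of learning "optimal" prices.
# ANNUAL_RATE, HORIZON_WEEKS and the store count are illustrative assumptions.

def pdv_weekly(weekly_amount, annual_rate, n_weeks):
    """Present value of a constant weekly amount received for n_weeks."""
    r = (1.0 + annual_rate) ** (1.0 / 52.0) - 1.0   # weekly discount rate
    return sum(weekly_amount / (1.0 + r) ** t for t in range(1, n_weeks + 1))

def prefers_uniform(pdv_gain, one_time_cost):
    """Uniform pricing is chosen when the cost exceeds the discounted gain."""
    return one_time_cost > pdv_gain

WEEKLY_GAIN = 36.56       # dollars per week for one store, from the text
CONSULTING_COST = 3.0e6   # "start[s] at around $3 million", from the text
ANNUAL_RATE = 0.05        # assumption
HORIZON_WEEKS = 520       # assumption: a ten-year horizon

gain_one_store = pdv_weekly(WEEKLY_GAIN, ANNUAL_RATE, HORIZON_WEEKS)
gain_chain = 100 * gain_one_store   # assumes 100 identical stores

print(f"PDV of gain, one store: {gain_one_store:,.0f}")
print(f"PDV of gain, 100 stores: {gain_chain:,.0f}")
print("single store prefers uniform:", prefers_uniform(gain_one_store, CONSULTING_COST))
```

The comparison is sensitive to every assumed input, and the consulting price quoted in the text presumably covers more than one product category, so the output should be read as an illustration of the decision rule rather than as an estimate.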

3.6 Conclusion

In retail environments, many differentiated products are sold at uniform prices. Explanations for this behavior can be grouped into demand-side and supply-side explanations. Lacking the necessary data to investigate demand-side explanations, I look at supply-side explanations. Using grocery store scanner panel data and household grocery purchase histories, I examine the market for carbonated soft drinks, a product that is frequently, but not always, sold at uniform prices, and evaluate the validity of several supply-side explanations. To do this, I develop a new structural model of household demand for carbonated soft drinks. Using the estimated demand system, I conduct the counterfactual experiment of forcing the prices of a particular store to be uniform and comparing the resulting profits to the non-uniform case. The results from the new structural model suggest that uniform pricing leads to a total profit loss for the retailer over the two-year sample period of roughly $5,135 in 2004 dollars. This result suggests that there are additional profits to be earned from non-uniform pricing, under the assumption that the retailer charged optimal prices. Clearly, however, it may not be profitable for single-store retailers to take advantage of this opportunity. Without the benefit of multiple stores over which to spread the managerial costs of determining optimal prices, single-store retailers may find it optimal to charge uniform prices. Unfortunately, this “scale” explanation cannot be the whole story. Anecdotal evidence suggests that Walmart charges uniform prices for many products, even though that company has almost certainly realized most returns to scale with respect to managerial menu costs.
Although additional research is necessary regarding demand-side reactions to non-uniform pricing, these results suggest that pricing managers, particularly those at large retail chains, should be aware of potential additional profits available from non-uniform pricing. Moreover, they suggest that for single-store retailers, relatively small managerial menu costs are able to generate the observed behavior.

3.7 Appendix 3.A: Numerically Solving The Utility Function

The key ingredient to this estimation procedure is the ability to quickly and reliably solve the household’s constrained utility maximization problem. In addition to theoretical reasons for imposing concavity in the household’s utility function, without this restriction, solving for the household’s optimal bundle would be difficult if not impossible. When the utility function is concave, this is much easier, and gradient-based numerical methods give good results.

It is imperative that the numerical solutions to the household’s optimization problem be correct. If the solutions are not the true optimal bundles, the parameter estimates will not be consistent or unbiased. To perform these optimizations, I have employed several different numerical methods, with varying degrees of success. Because there is no way to analytically verify the solution when using large numbers of goods and/or characteristics, I use the best solution from a large number of randomly drawn starting values with the Subplex and Nelder-Mead optimization routines as the “true” solution. The following three optimization algorithms have proven useful:

• NAG E04UGF - This algorithm is based on the SNOPT/NPSOL packages and currently gives the best results. Unlike the other optimization packages used, this uses a user-supplied analytic gradient. With a single randomized starting value, the “true” solution is found at least 99% of the time. With two randomized starting values, this increases to 100%.

• NAG E04CCF - This is the Numerical Algorithms Group’s FORTRAN implementation of the Nelder-Mead simplex method. It also gives good results, but, because it does not use gradients, takes much longer.

• Subplex - Subplex is a subspace-searching simplex method for the unconstrained optimization of general multivariate functions. Like the Nelder-Mead simplex method it generalizes, the subplex method is well suited for optimizing noisy objective functions. Subplex was developed by Tom Rowan at Oak Ridge National Laboratory and is described in: T. Rowan, “Functional Stability Analysis of Numerical Algorithms”, Ph.D. thesis, Department of Computer Sciences, University of Texas at Austin, 1990. Subplex tends to be less consistent at finding the correct solution, but occasionally significantly improves on the solutions from the above two methods.
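The "best of many random starts" benchmark described above can be sketched in a few lines. Everything in the example is an assumption made for illustration: the utility function is a made-up concave form with a closed-form solution, not the dissertation's specification, and a simple derivative-free compass search stands in for the Nelder-Mead, Subplex, and NAG routines named in the list.

```python
import math
import random

# Hypothetical concave utility, NOT the dissertation's specification:
# u(q) = sum_j a_j * log(1 + q_j) + (w - p.q), with q_j >= 0 and p.q <= w.
A = [0.9, 0.5, 0.2]      # taste parameters (made up)
P = [0.30, 0.25, 0.40]   # prices per serving (made up)
W = 10.0                 # weekly budget (made up)

def feasible(q):
    return all(x >= 0.0 for x in q) and sum(p * x for p, x in zip(P, q)) <= W

def utility(q):
    spend = sum(p * x for p, x in zip(P, q))
    return sum(a * math.log(1.0 + x) for a, x in zip(A, q)) + (W - spend)

def compass_search(start, step=1.0, tol=1e-8):
    """Derivative-free local search: try +/- step on each coordinate."""
    q, best = list(start), utility(start)
    while step > tol:
        improved = False
        for j in range(len(q)):
            for delta in (step, -step):
                trial = list(q)
                trial[j] += delta
                if feasible(trial) and utility(trial) > best:
                    q, best, improved = trial, utility(trial), True
        if not improved:
            step *= 0.5   # shrink the step once no neighbor improves
    return q, best

def multistart(n_starts, seed=0):
    """Keep the best local solution over randomly drawn feasible starts."""
    rng = random.Random(seed)
    best_q, best_u = None, -math.inf
    for _ in range(n_starts):
        start = [rng.uniform(0.0, 5.0) for _ in A]
        if not feasible(start):
            continue
        q, u = compass_search(start)
        if u > best_u:
            best_q, best_u = q, u
    return best_q, best_u

q_star, u_star = multistart(n_starts=20)
# Closed-form solution of this toy problem (the budget is slack here).
q_exact = [max(a / p - 1.0, 0.0) for a, p in zip(A, P)]
print(q_star, q_exact)
```

Because the toy problem has a known solution, the multistart result can be checked directly. In the setting described in the text no such check is available, which is why the best solution across many starting values serves as the benchmark.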

3.8 Appendix 3.B: Modeling Heterogeneity of Preferences

Although the product-level demand model I estimate does not incorporate household-level heterogeneity, this would be a relatively straightforward extension for future work. The greatest obstacle to estimating such a model is that it increases the number of parameters to estimate. I explored specifications including discrete types driven by observable demographics, including household size and median expenditure level, but the MSM distance function in this case was poorly behaved.

Similarly, one could estimate a model of unobserved heterogeneity, but the greatest obstacle in this case would be a dramatic increase in the required computing power. As mentioned earlier, in order to get a sufficiently good estimate of the expected purchases, it is necessary to use at least R = 30 simulations. Unfortunately, with current processing power, this means that estimation takes several weeks. Incorporating additional parameters increases the difficulty of the MSM distance function optimization. Incorporating unobserved heterogeneity requires at least an order of magnitude increase in the number of simulations, making such a venture prohibitively computationally expensive for this application at the present time, although it might be possible by using a smaller set of products (which would allow the utility function to be solved more quickly).

3.9 Appendix 3.C: Analysis of the Panel Composition

This paper makes extensive use of the purchase histories of the households in the IRI dataset. Hence, one would like to know whether these households are indeed representative of the households that typically shop at these stores. Fortunately, in addition to purchase histories, the IRI dataset contains demographic information for each of the households. As seen in Table 3.15, the mean household size for the panel is 1.9, with a standard deviation of 1.23. Additionally, nearly all of the households have children, and more than 1/3 have a retired female. Table 3.16 shows the age distribution for the primary man and woman in the household. For both men and women in the panel, the median age appears to be in the range of 55-64, although these numbers may be skewed by the large number of possible non-responses. Finally, Table 3.17 shows the distribution of income among households in the panel, with a median in the range of $20,000-25,000.

Table 3.15: Size and Composition of Households in Panel

Total Number of Households                        262
Median Number of Members in Household             1
Mean Number of Members in Household (s.d.)        1.75 (1.06)
Fraction of Households with
    No Kids                                       .004
    Kids Aged 0-5                                 .015
    Kids Aged 6-11                                .069
    Kids Aged 12-18                               .061
    Kids Aged 18+                                 .889
    a Retired Male                                .168
    a Retired Female                              .347

Sample consists of all households that shopped at store A at least once during the two year period. Source: IRI Data.

Table 3.16: Age Distribution of Primary Male and Female in Households in Panel

                              Primary Female              Primary Male
Age                           Num. of Hhds   % of Hhds    Num. of Hhds   % of Hhds
None Present/No Response       35            13.4         145            55.3
18-29                           1             0.4           2             0.8
30-34                           8             3.1           2             0.8
35-44                          26             9.9          11             4.2
45-54                          37            14.1          19             7.3
55-64                          40            15.3          24             9.2
65+                           115            43.9          59            22.5

Sample consists of all households that shopped at store A at least once during the two year period. Source: IRI Data.

Table 3.17: Income Distribution of Panel Households

Household Income    Num. of Hhds in Cat.    % of Hhds
<10k                52                      19.9
10-12k              14                       5.3
12-15k              26                       9.9
15-20k              16                       6.1
20-25k              14                       5.3
25-35k              28                      10.7
35-45k              23                       8.8
45-55k              20                       7.6
55-65k              19                       7.3
65-75k              12                       4.6
75k+                33                      12.6
N.R.                 5                       1.9

Sample consists of all households that shopped at store A at least once during the two year period. Source: IRI Data.

Table 3.18: Summary Statistics for Population Living Near Stores in the Panel

                                              W             X             Y             Z
Mean Number of Members in Household           1.55          2.74          2.53          2.15
Fraction of Households
    with 1 Member                             .614          .270          .324          .426
    with 2 Members                            .280          .277          .288          .302
    with 3 or 4 Members                       .092          .309          .269          .193
    with 5+ Members                           .014          .144          .119          .079
Fraction of Women
    that have No Kids                         .881          .708          .689          .738
    that have Kids Aged 0-5                   .060          .144          .152          .142
    that have Kids Aged 6-17                  .059          .147          .159          .121
Fraction of Population Retired                .094          .172          .169          .124
Median (s.d.) Household Income ($000’s)       31.1 (25.9)   25.4 (20.4)   24.1 (21.3)   26.5 (23.9)
Fraction of Households with Income<$15k       .088          .133          .153          .152

Demographic data for population in US Census tracts surrounding four Dominick’s Finer Foods stores located in close geographical proximity to the stores in the panel. No information is available on the population size of these areas. Source: Market Metrics, based on 1990 US Census Data.

In the end, it is difficult to know how to assess these numbers. Ideally, I would like to know how they compare to the population of shoppers at store A. A rough proxy for this population is shown in Table 3.18, which displays demographic information on the population surrounding four stores owned by Dominick’s Finer Foods that are located in close geographic proximity to store A. Several stark differences are apparent. Households in the panel are much more likely to have children than those in the surrounding population. They also tend to have lower incomes, with a substantially larger fraction of them earning below $15,000 per year. These differences suggest that the households in the panel are quite different from those in the background population, but they do no more than suggest it: if typical shoppers at store A also differ from the background population, my panel may still be representative. Regardless, it is important to note that the ability to draw broad inferences from household-level data hinges critically on the representativeness of the panel.

Bibliography Ackerberg, D. & Rysman, M. (2004), ‘Unobserved product differentiation in discrete choice models: Estimating price elasticities and welfare effects’, RAND Journal of Economics, forthcoming . Ball, L. & Mankiw, N. G. (2004), ‘A sticky-price manifesto’, NBER Working Paper (4677). Bayus, B. L. & Putsis, W. P. (1999), ‘Product proliferation: An empirical analysis of product line determinants and market outcomes’, Marketing Science 18(2), 137–153. Bell, D., Ho, T.-H. & Tang, C. (1998), ‘Determining where to shop: Fixed and variable costs of shopping’, Journal of Marketing Research 35(3), 352–369. Bell, D. R. & Lattin, J. M. (1998), ‘Shopping behavior and consumer preference for store price format: Why “large basket” shoppers prefer edlp’, Marketing Science 78, 66–88. Berry, S., Levinsohn, J. & Pakes, A. (1995), ‘Automobile prices in market equilibrium’, Econometrica pp. 841–890. Burstiner, I. (1997), The Small Business Handbook: a Comprehensive Guide to Starting and Running Your Own Business, third edn, Simon and Schuster. Canetti, E., Blinder, A. & Lebow, D. (1998), Asking About Prices: A New Approach to Understanding Price Stickiness, Russell Sage Foundation Publications. Carlton, D. W. (1989), The Theory and the Facts of How Markets Clear: Is Industrial Organization Valuable for Understanding Macroeconomics?, Vol. 1 of The Handbook of Industrial Organization, Elsevier Science Publishers, chapter 15, pp. 909–946.

146

BIBLIOGRAPHY

147

Chan, T. Y. (2002), ‘Estimating a continuous hedonic choice model with an application to demand for soft drinks’. Mimeo, Washington University, St. Louis, Olin School of Business. Chiang, J., Chung, C.-F. & Cremers, E. T. (2001), ‘Promotions and the pattern of grocery shopping time’, Journal of Applied Statistics 28(7), 801–819. Chintagunta, P., Dub´e, J.-P. & Singh, V. (2003), ‘Balancing profitability and customer welfare in a supermarket chain’, Quantitative Marketing and Economics 1, 111–147. Corts, K. (1998), ‘Third degree price discrimination in oligopoly: All-out competition and strategic commitment’, RAND Journal of Economics 29(2), 306–323. Deaton, A. & Muellbauer, J. (1980), ‘An almost ideal demand system’, American Economic Review 70(3), 312–326. Draganska, M. & Jain, D. (2001), Product line length decisions in a competitive environment. Mimeo, Stanford University, Graduate School of Business. Dub´e, J.-P. (2001), Multiple discreteness and product differentiation: Strategy and demand for carbonated soft drinks. Mimeo, University of Chicago, Graduate School of Business. Fatsis, S. (2002), ‘The Barry Bonds tax: Teams raise prices for good games’, Wall Street Journal (Eastern Edition) (12/3), D1. Gentzkow, M. (2004), Valuing new goods in a model with complementarities: Online newspapers. Mimeo, University of Chicago, Graduate School of Business. Gouri´eroux, C. & Monfort, A. (1996), Simulation-Based Econometric Methods, Oxford University Press. Hauser, J. R. & Wernerfelt, B. (1990), ‘An evaluation cost model of consideration sets’, The Journal of Consumer Research 16(4), 393–408. Hausman, J., Leonard, G. & Zona, J. (1994), ‘Competitive analysis with differentiated products’, Annales D’Economie et de Statistique 34, 159–180. Hays, C. L. (1997), ‘Variable-price coke machine being tested’, The New York Times p. C1. October 28.

BIBLIOGRAPHY

148

Hendel, I. (1999), ‘Estimating multiple-discrete choice models: An application to computerization returns’, Review of Economic Studies 66(2), 423–446. Hendel, I. & Nevo, A. (2002), Measuring the implications of sales and consumer stockpiling behavior. Mimeo, University of Wisconsin, Madison and University of California, Berkeley. Hess, J. D. & Gerstner, E. (1987), ‘Loss leader pricing and rain check policy’, Marketing Science 6(4), 358–374. Heun, C. T. (2001), ‘Dynamic pricing boosts bottom line’, Information Week (861), 59. Ho, T.-H., Tang, C. S. & Bell, D. R. (1998), ‘Rational shopping behavior and the option value of variable pricing’, Marketing Science 44(12), 115–160. Hoch, S. J., Dreze, X. & Purk, M. (1994), ‘EDLP, hi-lo, and margin arithmetic’, Journal of Retailing 58, 16–27. Israilevich, G. (2004), ‘Assessing product-line decisions with supermarket scanner data’, Quantitative Marketing and Economics 2(2), 141–167. Iyengar, S. S. & Lepper, M. R. (2000), ‘When choice is demotivating: Can one desire too much of a good thing?’, Journal of Personality and Social Psychology 79(6), 995–1006. Kadiyali, V., Vilcassim, N. & Chintagunta, P. (1999), ‘Product line extensions and competitive market interactions: An empirical analysis’, Journal of Econometrics 89, 339–363. Kahn, B. E. & Schmittlein, D. C. (1989), ‘Shopping trip behavior: An empirical investigation’, Marketing Letters 1(1), 55–69. Kahneman, D., Knetsch, J. L. & Thaler, R. (1986), ‘Fairness as a constraint on profit seeking: Entitlements in the market’, American Economic Review 76(4), 728–741. Kashyap, A. K. (1995), ‘Sticky prices: New evidence from retail catalogs’, Quarterly Journal of Economics 110(1), 245–274. Keenan, F. (2003), ‘The price is really right’, Business Week (3826), 62. Kim, J., Allenby, G. & Rossi, P. (2002), ‘Modeling consumer demand for variety’, Marketing Science 21, 229–250.

BIBLIOGRAPHY

149

Leslie, P. (2004), ‘Price discrimination in broadway theatre’, RAND Journal of Economics 35(3). Leszczyc, P. P., Sinha, A. & Timmermans, H. (2000), ‘Consumer store choice dynamics: An analysis of the competitive market structure for grocery stores’, Journal of Retailing 76, 323–345. Levy, D., Bergen, M., Dutta, S. & Venable, R. (1997), ‘The magnitude of menu costs: Direct evidence from large u.s. supermarket chains’, Quarterly Journal of Economics 112(3), 792–825. McFadden, D. (1989), ‘A method of simulated moments for estimation of discrete response models without numerical integration’, Econometrica 57(5), 995–1026. Nevo, A. (2001), ‘Measuring market power in the ready-to-eat cereal industry’, Econometrica 69(2), 307–342. Orbach, B. Y. & Einav, L. (2001), ‘Uniform prices for differentiated goods: The case of the movie-theater industry’, Harvard John M. Olin Discussion Paper Series (337). Pakes, A. & Pollard, D. (1989), ‘Simulation and the asymptotics of optimization estimators’, Econometrica 57(5), 1027–1057. Rhee, H. & Bell, D. R. (2002), ‘The inter-store mobility of supermarket shoppers’, Journal of Retailing (1), 225–237. Shugan, S. M. (1980), ‘The cost of thinking’, The Journal of Consumer Research 7(2), 99– 111. Villas-Boas, S. B. (2002), Vertical contracts between manufacturers and retailers: An empirical analysis.
