Causal Ambiguity as a Source of Sustained Capability-Based Advantage
Michael D. Ryall
Melbourne Business School
May 24, 2007

Abstract This paper presents the first formal examination of the role of causal ambiguity as a barrier to imitation. Here, the aspiring imitator faces a knowledge (i.e., "capabilities-based") barrier to imitation that is both causal and ambiguous in a precise sense of both words. Imitation conforms to a well-explicated process of learning-by-observing. I provide a precise distinction between the intrinsic causal ambiguity associated with a particular strategy and the subjective ambiguity perceived by a challenger. I find that intrinsic ambiguity is a necessary but insufficient condition for a sustained capability-based advantage. I also demonstrate that combinatorial complexity, a phenomenon that has attracted the recent attention of strategy theorists, and causal ambiguity are distinct barriers to imitation. The former acts as a barrier to explorative/active learning and the latter as one to absorptive/passive learning. One implication of this is that learning-by-doing and learning-by-observing are complementary strategic activities, not substitutes; in most cases, we should expect firm strategies to seek performance enhancement using efforts of both types.

I wish to thank R. Cassadesus-Masanel, D. Chickering, G. Cooper, J. Denrell, J. Gans, B. Gibbs, P. Ghemawat, H. Harmon, A. W. King, M. Lenox, D. Levinthal, G. MacDonald, J. Rivkin, S. Rockhart, O. Sorenson and participants at the 2006 ACAC and AoM conferences, as well as strategy workshops at Rice and U. T., Austin.


1 Introduction

A central proposition in strategy is that firms sustain relative performance advantages only if their existing and potential rivals cannot imitate them (Nelson and Winter, 1985; Dierickx and Cool, 1989; Barney, 1991). In this context, "imitation" means the purposeful endeavor to improve performance by copying the form and strategy of a superior rival. An imitation strategy is one of many ways two firms may become similar in appearance and performance. For example, de novo innovation can result in such similarities and, when it does, also be referred to as "imitation" (as in Lippman and Rumelt, 1982). The primary focus of this paper is upon imitation as an explicit strategy and causal ambiguity as a particular barrier to its success.1

Generally speaking, imitation fails when it is physically impossible, legally prevented, economically unattractive, or the necessary knowledge is lacking. Saloner et al. (2001) label barriers of the first three types "positional" and those of the last "capabilities-based." The conditions leading to positional barriers (e.g., switching costs, entry costs, scope and scale economies, and the likelihood of ex-post retaliation) have been extensively studied in both formal and informal settings and are presently well understood (e.g., Porter, 1980; Tirole, 1988). Capabilities-based barriers have received much less in the way of formal attention. Certainly, when imitation is hampered by a lack of knowledge, learning must become a core strategic objective of the aspiring imitator. Learning can be explorative in the active sense of learning from one's own experience (learning-by-doing), or absorptive in the passive sense of learning from external information (learning-by-observing).2 Thus, a capabilities-based advantage may be sustainable if learning of both types is effectively blocked.
This means that any theory of sustained capabilities-based advantage must be precise in its description not only of what is known and believed by the imitator a priori, but also of the dynamic process by which learning occurs, what the specific impediments to learning are, how they operate, and the conditions under which they persist. One exception to the paucity of formal work in this area is the recent stream of literature examining the role of combinatorial complexity in creating knowledge-based barriers to imitation (Levinthal, 1997; Ghemawat and Levinthal, 2000; Rivkin, 2000; Lenox et al., 2006a, 2006b). This work applies the formalism of Kauffman's (1993) NK model of evolutionary biology to competition between firms. In this setting, firm performance is a function of N strategic activities. The extent to which performance is driven by interactions

1 It is worth pointing out that MacDonald and Ryall (2004) demonstrate that inimitability is neither necessary nor sufficient to prevent imitation. In the case of complementors, imitation may actually improve the performance of both firms (MacDonald and Ryall, 2006). In what follows, I develop a setting in which the motivation for and effect of imitation do conform to the conventional intuition.

2 There are multiple streams of literature in strategy that focus upon learning-by-doing, each deserving a survey paper in its own right. These include papers on learning curves (e.g., Lieberman, 1984; Ghemawat, 1985), exploration (e.g., March, 1991; Levinthal and March, 1993), dynamic experimentation (e.g., Besanko et al., 2007) and search (e.g., LR82; Rivkin, 2000). Learning-by-observing has seen much less attention, a notable exception being the work on absorptive capacity (e.g., Cohen and Levinthal, 1990).


between activities is an exogenous feature of the competitive environment and is summarized by a parameter, K. Managers do not know the relationship between activities and performance. They learn by exploring the local neighborhood of their current activities or, if attempting to imitate, in the neighborhood of an industry leader.3 When combinatorial complexity is low (NK small), the globally optimal activity configuration is quickly learned via local search. When complexity is high, however, interactions between activities make local search much less effective. Thus, combinatorial complexity is a barrier to exploration by local search and, as is broadly demonstrated by this stream of research, can be a source of sustained capabilities-based advantage.4

The focus here is upon causal ambiguity as an obstacle to imitation, one that is distinct from combinatorial complexity. In my analysis, causal ambiguity operates as a barrier to absorptive, or passive, learning. The term "causal ambiguity" in its traditional usage refers to any knowledge-based impediment to imitation (e.g., Saloner et al., 2001, p. 49). The first strategy paper using this term appears to be Lippman & Rumelt (1982), who assert (p. 418) that "basic ambiguity concerning the nature of the causal connections between actions and results" can result in persistent performance heterogeneity because "the factors responsible for performance differentials resist precise identification." Although Lippman & Rumelt (1982) present a formal model, causal ambiguity does not enter into it as a specific object of analysis. Just the same, the preceding assertion is now commonplace (often supported by a reference to Lippman & Rumelt, 1982), appearing in everything from foundational scholarly contributions (e.g., Barney, 1991, p. 107; Peteraf, 1993, p. 182) to MBA textbooks (e.g., Besanko et al., 1996, p. 552; Collis and Montgomery, 1998, p. 34; Grant, 2002, p. 238).

The point of view taken here is that when "causal ambiguity" is as broadly defined as "the state in which managers do not know how their actions map to consequences," the statement "managers experience causal ambiguity" is indistinguishable from "managers don't know what they're doing," in which case a bias toward plain language should favor the latter.5 This paper is motivated by the interesting possibility, as initially raised by Lippman & Rumelt (1982), that a particular type of confusion can arise in the context of competitive imitation that is both "causal" and "ambiguous" in a precise sense of both words. To explore this possibility, I create a model in which firms are conceptualized as a collection of activity centers, with differences in firm performance arising from differences in activities undertaken. This is

3 What Rivkin (2000) refers to as incremental improvement versus follow-the-leader imitation, respectively.

4 As a very relevant aside, these papers also raise the possibility that the degree of interaction is a managerial choice. Rivkin (2000, p. 843) says, "Analyses in the paper also hint at the potency of altering the relationships among decisions," and Ghemawat and Levinthal (2000, p. 8) observe, "All this could be read to suggest that an over-arching choice of configuration set the context for most of the other exceptional rather than normal choices embedded in Southwest's activity system" [emph. added]. The idea that managers choose the interrelationships is central to this paper.

5 The connection between causal ambiguity and imitation has seen some informal refinement. King and Zeithaml (2001) distinguish between knowing what to imitate vs. how to imitate. Mosakowski (1997) examines the extent to which relevant information is knowable. Reed and DeFillippi (1990) conjecture a positive relationship between causal ambiguity and the number of firm activities (this conjecture is superseded by the previously described complexity work).


consistent with Porter (1996) and the previously cited work on combinatorial complexity. Here, however, managers cannot choose the entirety of firm-wide activities with deterministic precision but, instead, can only specify influence relations between subordinates who, themselves, decide which activities get done. Once the firm's network of influence relationships is created, activities are assumed to occur according to a stochastic process. Different influence structures generate different probability distributions over activities and, hence, different levels of expected profit.

The potential imitator knows neither the stochastic implications of influence structure nor the actual influence relations chosen by an industry exemplar. It does, however, observe the activities of the exemplary firm and, from these observations, may be able to piece together the network of influence relations underlying its successful performance. If so, absorptive learning may lead to successful imitation.

To take a simple example, the market leader in an industry may know that new products are most successful when marketing professionals drive their development. It may have a formal process by which Marketing (an activity center populated with such professionals) develops new products and hands them off to Engineering for design and materials specification, or it may simply have a corporate culture in which "Marketing is king." Either way, the effect is that engineers react to the activities of marketers. Now, consider the problem from the perspective of a potential entrant who wishes to imitate the successful performance of the market leader. This challenger must decide whether to make its product development marketing- or engineering-driven. If the latter, engineers generate new product ideas (driven by, e.g., technological niftiness, design elegance, production cost concerns, etc.) and hand them off to marketers for pricing and distribution.

It seems natural to assume that the imitator can observe the leader's enticing financial performance and even the leader's Marketing and Engineering activities (ad campaigns, sales force deployments, product features, components, etc.), but not its internal (possibly informal) web of influence relations under which these activities arise. In the intuitive notation of causal systems, then, one of the key problems facing the imitator (in this paper, the key problem) is figuring out whether the leader has structured operations as M → E or M ← E, where M and E are variables identifying Marketing and Engineering activities, respectively.6
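With only two variables, this question is already difficult: both directions can generate exactly the same joint distribution over observed activities. The sketch below (with purely illustrative numbers) factors one joint distribution both ways, as M → E and as M ← E, and confirms that each factorization reproduces the data equally well; richer structures, which is what §3 exploits, are needed before observation can discriminate.

```python
# Illustrative joint distribution over (M, E) activity outcomes.
JOINT = {(1, 1): 0.56, (1, 0): 0.14, (0, 1): 0.06, (0, 0): 0.24}

# Factorization under M -> E: P(M) * P(E | M).
p_m = JOINT[(1, 1)] + JOINT[(1, 0)]
p_e_given_m = {m: JOINT[(m, 1)] / (JOINT[(m, 1)] + JOINT[(m, 0)]) for m in (0, 1)}

# Factorization under M <- E: P(E) * P(M | E).
p_e = JOINT[(1, 1)] + JOINT[(0, 1)]
p_m_given_e = {e: JOINT[(1, e)] / (JOINT[(1, e)] + JOINT[(0, e)]) for e in (0, 1)}

# Both structures reproduce the observed joint distribution exactly.
for m in (0, 1):
    for e in (0, 1):
        forward = (p_m if m else 1 - p_m) * (p_e_given_m[m] if e else 1 - p_e_given_m[m])
        backward = (p_e if e else 1 - p_e) * (p_m_given_e[e] if m else 1 - p_m_given_e[e])
        assert abs(forward - JOINT[(m, e)]) < 1e-12
        assert abs(backward - JOINT[(m, e)]) < 1e-12
```

Because any two-variable joint admits both factorizations, no amount of passive observation of M and E alone settles the direction of influence.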

Where does causal ambiguity enter the analysis? First, when the behavior of activity centers is consistent with the overarching network of influence relations in a very natural way, the firm's operations can be represented as a stochastic causal system.7 As a result, a potential imitator can attempt to learn the influence structure of a rival by applying standard techniques of causal inference to that rival's observed activities.8 Second, suppose the imitator begins with subjective beliefs about the stochastic implications of

6 The third possibility is that the two groups are simply left to fight it out every time, each marching to the beat of its own drummer.

7 These are sometimes referred to as "Bayes nets." Basic introductions to these systems include Pearl (1988, 2000), Edwards (1995), Jensen (1996, 2001), and Korb and Nicholson (2004). More advanced coverage is found in Cowell et al. (1999) and Spirtes et al. (2000).

8 Al-Najjar (2006) also explores the limits of statistical identification procedures in a competitive setting. He shows that persistent strategic experimentation will, eventually, reveal the optimal strategy. In my model, causal ambiguity can withstand a subjectively rational program of strategic experimentation.

influence relations and about the true structure underlying the superior performance of its target. Then, the imitator holds beliefs over probabilities and, as a result, ambiguity is at play (in the technical sense of Einhorn and Hogarth, 1986). By introducing these elements into my model, I am able to define subjective causal ambiguity as a precise measure of the spread in the imitator's subjective beliefs with respect to which influence relations are most efficacious (specifically, the entropy of these beliefs). In addition, I develop a related measure of the extent to which the exemplary firm's influence network is inherently resistant to causal inference, which I call its intrinsic causal ambiguity. The question, then, is the extent to which intrinsic ambiguity (a feature of the competitive environment) acts as a barrier to learning-by-observation and, hence, as a source of capabilities-based advantage.

My results demonstrate that the intrinsic causal ambiguity of a firm's internal influence network presents a long-run upper bound on the subjective ambiguity experienced by its challengers. When financial risk is factored into the analysis, I show that imitation is less likely in industries characterized by higher levels of intrinsic ambiguity. Because the relationship between causal ambiguity and the density of influence relations is not monotonic, some causally ambiguous strategies may be complex but others deceptively simple. This last point suggests, contrary to the conventional wisdom, that the strategies of small, younger firms may be inherently more difficult to imitate than those of their larger, more established counterparts. It also implies a normative conclusion that, sometimes, simpler strategies may be better than complex ones. Because everything is formalized, my results permit one to distinguish, in situations where causal inference is possible, when a simple or complex strategy is more ambiguous.

In addition, because the analysis focuses upon a firm's observable activities, these results are open to direct empirical refutation. Finally, I demonstrate that combinatorial complexity and causal ambiguity are, indeed, distinct barriers to imitation. This leads to the nice implication that, to the extent combinatorial complexity and causal ambiguity are both present in the competitive environment, explorative and absorptive learning are complementary strategic activities, not substitutes. Learning-by-doing generates information that may be useful in overcoming causal ambiguity, and learning-by-observing shrinks what may be a large, combinatorially complex search space. From a positive standpoint, we should expect firms to seek performance enhancement using efforts of both types.

In the next two sections, I use some simple examples to illustrate the key concepts that are developed more generally in later sections. First, I discuss the notion of causal ambiguity used here and describe how it differs from other types of incomprehension. Next, I demonstrate how causal inference works. The setup, key assumptions, and essential formal objects are presented in §4. Because the math is fairly involved and likely to be unfamiliar to many readers, I take my time developing the model with numerous examples and discussions that aim to facilitate its interpretation. §5 introduces a result on the observational equivalence of causal systems from the causal inference literature. §6 specifies the bounds on managerial rationality


Activity Profile & Associated Cost

a1  a2  a3  |  cE
 1   1   1  |  29
 1   1   0  |  86
 1   0   1  |  43
 1   0   0  |  32
 0   1   1  |  74
 0   1   0  |  63
 0   0   1  |  40
 0   0   0  |  52

Table 1: Costs arising from specific activity outcomes

assumed in the model. The formal definition of causal ambiguity is presented in §7. The main results are in §8 and §9 (proofs of the propositions are in Appendix B). Closing thoughts on the relationship between my results and those in the NK literature, as well as on business policy implications, are presented in §10 and §11, respectively. A notation glossary is provided in Appendix A, p. 32.
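To preview the entropy-based measure of subjective ambiguity introduced above, here is a minimal sketch; the candidate structures, belief vectors, and the use of Shannon entropy in bits are all illustrative assumptions, not the paper's formal definition.

```python
import math

def subjective_ambiguity(beliefs):
    """Shannon entropy (in bits) of a challenger's beliefs over candidate
    influence structures; higher means more subjective causal ambiguity."""
    return -sum(p * math.log2(p) for p in beliefs if p > 0)

# A hypothetical challenger weighing four candidate structures for the leader:
uniform = [0.25, 0.25, 0.25, 0.25]   # no idea which structure is at work
sharp = [0.97, 0.01, 0.01, 0.01]     # nearly certain which structure it is

assert subjective_ambiguity(uniform) == 2.0   # log2(4), the maximum for 4 candidates
assert subjective_ambiguity(sharp) < 0.3      # close to no ambiguity at all
```

The spread of the belief vector, not the identity of its largest element, is what the measure captures: a challenger can be confidently wrong and still face low subjective ambiguity.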

2 "Causal ambiguity" vs. other confusions

To help understand the ideas underlying the following analysis, consider a simple version of the setup in Lenox et al. (2006; hereafter, LRL06). A potential entrant faces an incumbent in a homogeneous-goods market in which price is set according to Cournot competition with linear market demand

    p(qI, qE) = α − β (qI + qE),

where p is price and superscripts I and E indicate the incumbent and entrant, respectively. The entrant's profit function is

    πE(qI, qE; a) = [p(qI, qE) − cE(a)] qE − φ,     (1)

where φ is a fixed operating cost and cE is a constant marginal cost (on qE) that depends upon an activity profile a. Suppose the firm has three cost-relevant activities, each of which can be in one of two states. This is represented as a = (a1 a2 a3) with ai ∈ {0, 1}; i.e., a is a 3-tuple of zeros and ones. For example, if the challenger is an airline, then activities might include the heterogeneity of aircraft types employed, choice of routes, the frequency of service along routes, the specifics of frequent flyer programs, passenger boarding procedures, crew decision rights, and so on.9 There are 8 = 2^3 possible joint activity outcomes. Assume the activity-cost relationship is as shown in Table 1.

In order to distinguish between causal ambiguity and combinatorial complexity, this cost function is designed as a maximally-interactive NK fitness landscape (N = 3, K = 2).10 The cost-minimizing activity configuration is a = (111). Suppose: (i) the challenger can observe the incumbent's activities; (ii) managers

9 Porter's (1996) in-depth discussion of the strategically relevant activities of Southwest Airlines is a good illustration of the kinds of situations analyzed in this paper.

10 See, e.g., Ghemawat and Levinthal (2000, p. 14).


can simply set all activities to their desired values; and (iii) the market can support two efficient firms. Then, E enters, implements a = (111) by managerial fiat, and immediately succeeds in imitating I.

Now, consider the ways in which a boundedly rational entrant might misunderstand the situation. First, the mapping from outcomes to payoffs might be unknown. This could occur at many levels: the forms taken by p, πE, and cE, or the specific values of α and β. The entrant may not know certain features of industry structure. For example, it may not know whether there are other entrants lurking in the shadows, the actions available to the incumbent, or whether one of the firms has a first-mover advantage (i.e., whether the game is Cournot vs. Stackelberg). It may not know the incumbent's type. For example, the incumbent may be a type that irrationally punishes entrants by flooding the market with product, thereby driving down the price to unprofitable levels for both firms. Any of these confusions could, loosely, be termed "causal ambiguity" in the broadest sense that the entrant does not know the exact consequences of its actions. However, interpreted in this way, the term is not particularly descriptive because none of these problems actually requires causal inference for its solution, nor need they involve ambiguity in the specific technical sense of Einhorn and Hogarth (1986).11 Moreover, each of the problems described above is a well-studied issue in game theory (e.g., games of imperfect information, of incomplete information, subjective games, etc.). My goal here is to create a setting in which "causal ambiguity" describes incomprehension with respect to a distinct type of knowledge, one that involves both causality and ambiguity.

What is causal knowledge and why does it matter? To answer this, suppose you wake up each morning, look out the window, note whether the grass is wet or dry, and then turn on the TV to discover whether or not it rained overnight.12 Over time you learn that whenever the grass is wet, say 10% of the time, it also rained overnight, and whenever the grass is dry, it did not rain. Statistically, overnight rain and wet grass are perfectly correlated. At this point, you do not need causal knowledge to make precise predictions about future events: 10% of the time rain and wet grass occur together; the rest of the time it does not rain and the grass is dry. Still, based upon the correlation, you might hypothesize the existence of a causal relationship between these events.
One situation in which causal knowledge plays an important role is when agents must choose among interventions designed to shift outcomes in some favored direction. For example, if wet grass → rain and more rain is desired, one might increase the frequency of overnight lawn sprinklings in order to achieve better outcomes. On the other hand, rain → wet grass implies that changing the frequency with which the grass gets wet has no effect on the frequency with which it rains. Thus, causal knowledge is more general than strictly probabilistic knowledge: the direction of the arrows encodes additional information about the underlying

11 A problem is "ambiguous" if it involves second-order uncertainty; that is, the assessment of probability distributions over probability distributions.

12 This example is adapted from Pearl (1988).


process driving stochastic outcomes. The additional generality also creates new opportunities for confusion. With causality present, even when the joint distribution on rain and the state of grass moisture is known with great precision, agents may still have no idea which of {seed clouds, turn on lawn sprinkler} is the most efficacious way to alter the behavior of the system. In a business context, explicit identification of causal relationships is useful when managers must choose among interventions. Suppose superior performing firms tend to have "high powered" sales incentive systems coupled with "aggressive" sales cultures. Does the incentive system drive the culture, or is it the other way around? Confusion as to the correct answer might lead a firm to implement high powered incentives and let the sales force adapt, when it should have hired an aggressive sales force and then adapted the incentive system to them.

A second situation in which causal knowledge is an issue is when influence relations are not an exogenous feature of the environment but, instead, an object of choice. Both Rivkin (2000) and Ghemawat and Levinthal (2000) suggest that, within the firm, influence relations may be malleable. That is, some strategic decisions may actually involve the choice of a causal system. For example, managers may choose to structure influence relations between activity centers in a certain way. The resulting web of influence relations gives rise to a causal system with activity centers as the main components. Presumably, different relationships induce different behaviors across activity centers. At a high level of abstraction, these behaviors are summarized by joint probability distributions on activities. As in the rain/grass case, although a firm might observe a superior competitor's operating activities in great detail, the influence relations driving them may not be observable.13 Yet, identifying the competitor's underlying causal system is the first step toward imitating it.

If the firm knows where and in what direction the influence relations flow, it is well on its way toward successful imitation. Unfortunately, not only may the firm be uncertain about which system is used by its competitor, it may also be uncertain about the performance implications of one system versus another. If so, the firm faces uncertainty with respect to causal uncertainty, that is, causal ambiguity. The focus of this paper is upon causal ambiguity arising in situations where the issue is choice of influence structure (i.e., as opposed to intervention).

How do managers "structure influence relationships between activity centers"? At the more tangible end of the spectrum is the organization of work-in-process (WIP) between factory units. Organization of this kind is often imposed explicitly, in great detail, and vigorously monitored and enforced. For example, in semiconductor production, manufacturing proceeds through a sequence of activities (planarization, cleaning, etching, diffusion, chemical coating, lithography, oxidation, implantation, sputtering, grinding, polishing, vapor deposition, testing, and spinning), each typically conducted within a specific work unit. The flow of processor WIP is tightly controlled. Indeed, engineering documents usually describe activities at a very fine

13 As Ghemawat and Levinthal (2000, p. 13) point out, "... given informational barriers, all that we might be able to do is observe linkages between choices, not the direction of influence."


level of detail, right down to step-by-step procedures for individual production technicians. (Though, even under the most carefully managed conditions, such processes are still prone to random fluctuations.) Alternatively, senior management may induce tacit structure at a higher level of aggregation. For example, at Air Products and Chemicals in the 1980s the informal organization was "Engineers Rule," in which engineers enjoyed implicit decision rights on which new products to commercialize. Other departments, such as Manufacturing and Marketing, essentially took new product decisions as given and had to respond to engineering initiatives, doing their respective best to produce and sell the products they were given. Hence, in the context of this analysis, Engineering would be said to influence (or, drive) Manufacturing and Marketing in new product development. Note that encouraging this aspect of the informal organization is a senior management choice, one that might be very different in, say, a consumer products firm like Procter & Gamble.

In general, activity centers (departments, shop teams, task forces, etc.) are influenced by their formal objectives, the individual preferences of their constituent employees (which may be at odds with their formal objectives), their skills and knowledge, the firm's incentive and compensation schemes, and so on. It seems uncontroversial to assert that, in the real world, managers do not have the resources or knowledge required to force each and every activity center to produce a desired outcome every period. Rather, senior managers must rely on a set of fairly blunt organizational instruments to push efforts roughly in the desired strategic direction.14 One of the novelties of the upcoming model is the assumption that the object of managerial choice is the influence structure between activity centers.

To see the idea, return to the running example and suppose managers are limited to issuing activity center targets, creating broad incentives for their achievement and, then, hoping for the best. According to Table 1, they will set the target of activity center i to ai = 1. Assume that, using the best available incentive schemes, managers induce each of the three activity centers to hit their targets 90% of the time. Then, Pr(111) = .9^3 ≈ 73% and, overall, expected cost is 39. In this case, the targets are independent and, hence, so is the behavior of centers. Alternatively, managers might impose conditional targets (and structure the work flow appropriately). For example, assume I continues to set independent targets for centers 1 and 2 at ai = 1 but now creates the following set of targets for center

14 This is in the spirit of Ghemawat and Levinthal (2000), who say, "Discussions of cross-sectional linkages often presume that a coherent system of policy choices is arrived at by some process of a priori theorizing ... A more plausible characterization is that a firm makes a few choices about how it will compete and these choices, in turn, influence subsequent decisions."


Empirical Frequencies of Activity Profiles

a1  a2  a3  |  Freq. (%)
 1   1   1  |  72.9
 1   1   0  |   8.1
 1   0   1  |   0.9
 1   0   0  |   8.1
 0   1   1  |   0.9
 0   1   0  |   8.1
 0   0   1  |   0.9
 0   0   0  |   0.1

Table 2: Frequency of Incumbent operating outcomes.

3 that are contingent on the activities of units 1 and 2:

(a1 a2):    (00)  (01)  (10)  (11)
a3 target:    1     0     0     1

Without changing any center's ability to hit targets (90% success rate), Table 2 summarizes the frequency of joint activities that is, over time, generated by this organizational strategy. As a result, expected cost performance is improved by 5% over the scheme with independent targets. Qualitatively, the influence relations created by the conditional targets described above are neatly summarized by

    a1 → a3 ← a2,     (2)

meaning the activities of area 3 are influenced by the activities of its two independently operating counterparts. When entrant E knows neither (2) nor the performance implications summarized in Table 2, it faces a problem of causal ambiguity.
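The arithmetic of this section can be reproduced directly. Only the Table 1 costs, the 90% hit rate, and the conditional targets come from the text; the code itself is an illustrative sketch.

```python
from itertools import product

# Marginal costs by activity profile (Table 1).
COST = {(1,1,1): 29, (1,1,0): 86, (1,0,1): 43, (1,0,0): 32,
        (0,1,1): 74, (0,1,0): 63, (0,0,1): 40, (0,0,0): 52}
HIT = 0.9  # every center hits its assigned target 90% of the time

def distribution(target3):
    """Joint activity distribution when centers 1 and 2 aim for 1 and center 3's
    target is target3[(a1, a2)]; target3=None means an unconditional target of 1."""
    freq = {}
    for a in product((0, 1), repeat=3):
        t3 = 1 if target3 is None else target3[(a[0], a[1])]
        p = 1.0
        for ai, t in zip(a, (1, 1, t3)):
            p *= HIT if ai == t else 1 - HIT
        freq[a] = p
    return freq

def expected_cost(freq):
    return sum(p * COST[a] for a, p in freq.items())

indep = distribution(None)
cond = distribution({(0,0): 1, (0,1): 0, (1,0): 0, (1,1): 1})  # the conditional scheme

assert abs(indep[(1, 1, 1)] - 0.729) < 1e-9       # Pr(111) = .9^3, roughly 73%
assert abs(cond[(1, 0, 0)] - 0.081) < 1e-9        # matches Table 2
assert expected_cost(cond) < expected_cost(indep)  # conditional targets cost less
```

The same `distribution` helper reproduces every entry of Table 2 from the conditional targets, which is exactly the sense in which the influence structure (2), rather than any change in execution skill, generates the incumbent's cost advantage.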

3 Causal inference

What, if anything, can an agent do in the face of causal ambiguity? Continuing with the previous example, let us keep the assumption that I adopts (2). Suppose I is considered a shining example of superior strategic design and, as such, has had its operations examined in intimate detail in academic papers, business case studies, and the popular press. Assume that I's operating history is public but that the organizational structure underlying that performance is hidden to outsiders. E, having read all the studies, is aware of the information in Table 2. I call these frequencies the empirical distribution on activities induced by (2) and denote it by ν. E does not know the cost landscape, I's organizational strategy (2), nor the rate at which departmental targets are attained. Assume E has

access to all the same productive resources available to I (and, hence, could instantly imitate if it knew which in‡uence relationships to establish between activity centers). Does Table 2 provide su¢ cient information to imitate? The answer, as it turns out, is yes. Moreover, the path to enlightenment, in this case, is provided by what are now standard techniques of causal inference. One hypothesis that might be entertained by E is that I’s activity centers operate independently. If so, is fully described by three in‡uence parameters,

1;

2;

and

associated with the independent operating structure; i.e., (111) =

3;

(ai = 1) =

(1

1 2

(101) = .. .

1

(1

i.

Then, According to Table 2,

= :729;

1 2 3

(110) =

each corresponding to a “local”probability

3)

= :081;

2) 3

= :009; .. .

.. .

A moment's reflection should convince E (and us) that these equalities cannot be consistently satisfied by any choice of parameter values. Hence, even though E is ignorant of I's hidden operating structure, it should at least rule out "establish independent operations" as its imitative goal.

Alternatively, suppose E hypothesizes that I has adopted (2). Then, π is fully described by the set of local influence parameters θ1, θ2, θ3|00, θ3|10, θ3|01, θ3|11, where, e.g., θ3|00 is the likelihood that a3 = 1 when a1 = a2 = 0. In order for these to be consistent with Table 2, they must satisfy

π(111) = θ1 θ2 θ3|11 = .729,
π(110) = θ1 θ2 (1 − θ3|11) = .081,
π(101) = θ1 (1 − θ2) θ3|10 = .009,
⋮

As we know (from the construction of Table 2), these parameters do, indeed, exist. Thus, the entrant cannot rule out (2). By applying this procedure to the remaining possibilities, the only structure that survives is the true one. Thus, it is not necessary for E to enter and engage in potentially costly explorations of the myriad feasible organizational influence structures (active learning). Rather, by carefully observing the operations of I, E can infer the efficient structure and imitate I directly upon entry.

This happens to be a very special case. For example, if I adopts a1 → a2 → a3 then, regardless of the specific empirical distribution π this generates, E is never able to distinguish it from either a1 ← a2 ← a3 or a1 ← a2 → a3. As is explained later in the paper, these three structures form an observational equivalence class: any π consistent with one of these structures is also consistent with each of the other two in the sense that local influence parameters can be chosen in each that generate π. Thus, no matter how long E observes I's operations, it can never distinguish between these three structures. E could still choose to enter and explore the landscape on its own (perhaps using NK-style search procedures). However, if experimentation is perceived to be risky, then this level of confusion can deter imitation. If so, the resulting failure does not rely on any assumption of short-run fixity of resources or other exogenously imposed technological constraint.

As a final note, the "causal" interpretation of these structures arises because influence only flows in the specified direction. Consider once again an intervention of the kind discussed in the previous section. Suppose that under (2), I is presented with an opportunity to intervene directly and set a3 = 1. If it does, what is the effect on overall performance? With only the empirical distribution to go by, managers might use π(a1 a2 = 11 | a3 = 1) = 96.4% to predict the resultant likelihood that a1 a2 = 11 when a3 = 1. However, under (2), the actual probability of a1 a2 = 11 when a3 is fixed at 1 is only 81% (= .9²): changing the behavior of center 3 does not affect the behaviors of centers 1 and 2. As mentioned earlier, this paper focuses on causal inference for the purpose of structure selection, not intervention. Still, it is useful to see how causal knowledge refines purely probabilistic knowledge.
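The observational/interventional contrast above can be sketched numerically. The local parameters below are assumptions chosen to be consistent with the figures quoted in the text (π(111) = .729, the 96.4% conditional, and .9² = 81%); the paper's actual Table 2 parameters may differ.

```python
# Sketch: observing a3 = 1 vs. setting a3 = 1 under structure (2),
# a1 -> a3 <- a2. Local parameters are ASSUMED for illustration.
t1, t2 = 0.9, 0.9                                    # P(a1=1), P(a2=1)
t3 = {(1, 1): .9, (1, 0): .1, (0, 1): .1, (0, 0): .9}  # P(a3=1 | a1, a2)

def joint(a1, a2, a3):
    p = (t1 if a1 else 1 - t1) * (t2 if a2 else 1 - t2)
    q = t3[(a1, a2)]
    return p * (q if a3 else 1 - q)

# Observational prediction: Bayes' rule on the empirical distribution.
p_a3 = sum(joint(i, j, 1) for i in (0, 1) for j in (0, 1))
obs = joint(1, 1, 1) / p_a3

# Interventional prediction: do(a3=1) severs the edges into a3, so
# centers 1 and 2 keep their marginal behavior.
do = t1 * t2

print(round(obs, 3), round(do, 2))   # -> 0.964 0.81
```

The gap between the two numbers is exactly the text's point: conditioning on a3 = 1 is informative about a1 and a2, but intervening on center 3 is not.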

4 The Model

The following formalism describes: (i) the way the industry modelled actually works; and, (ii) the way agents in the model believe it works. Correspondingly, in this section I begin by specifying all the elements required to identify the agents, their policy choices and the objective consequences of those choices. Later, I specify what agents know and how they make assessments about the things they do not know. Terms being defined are set in italics. Common graph-theoretic terms (e.g., "parents" and "descendants" of a node) are generally used without formal elaboration.

To start, assume that a lone incumbent, I, faces a single potential entrant, E. Because I wish to isolate the effects of causal ambiguity from differences in firm capabilities, assume that both firms have access to identical resource portfolios. Each resource portfolio is associated with 2 ≤ r < ∞ activity centers.^15 Firms share a common discount factor δ. Competition is dynamic. The timing of events within each period is as follows. E decides whether to enter and, if so, how to organize its influence relations. I always enters and implements the superior organization (as described in the next section). Activities are simultaneously generated for I and, if it enters, E via a stochastic process that is induced by the firms' respective influence structures. These activities determine the marginal costs under which the firms compete Cournot-style to receive their payoffs.

15. If resources are distinct, then the canonical mapping from resources to activity centers is bijective (i.e., each resource is its own activity center). More generally, the resource/activity center distinction allows resources to be aggregated in obvious ways (e.g., individual marketers grouped into a marketing dept.).

4.1 Activities and performance outcomes

An activity profile for firm i in period t, denoted a_t^i, is an r-vector of 0s and 1s; e.g., the k-th component a_k,t^i ∈ {0, 1} so that, e.g., a_t^i = (100...010). Broadly, "activities" are externally observed resource state variables indicating the quality, quantities or prices of productive inputs, inventory levels, plant locations, composition of workforce skills, production processes utilized, and so on. Define A to be the set of all possible activity profiles including a null profile, a_∅, to go with the no-entry case.^16 Let c(a_t^i) be the period-t cost to firm i when its activity profile is a_t^i. Notice that the cost function is identical for both firms. As in LRL06, firm i's activity profile determines its marginal cost in a Cournot game (and may, if we wish, be "tuned" according to the NK procedure). Firms know their costs at the time they choose quantities. Thus, following (1), profit for firm i can be restated as a function of the activity profiles of both firms,

π^i(a_t^I, a_t^E) ≡ (p(q_t^I + q_t^E) − c(a_t^i)) q_t^i,

where it is understood that q_t^I and q_t^E are the Cournot equilibrium quantity choices given linear, downward-sloping demand and constant marginal costs c(a_t^I) and c(a_t^E). Assume that: i) there is a cost-minimizing activity profile a^best such that π^E(a^best, a^best) > 0; and, ii) there is a cost-maximizing activity profile a^worst ≠ a_∅ such that a_t^E = a^worst implies q_t^E = 0 regardless of the value of a_t^I; a^worst results in π^E(a_t^I, a_t^E) < 0.^17 Normalize the payoff to staying out of the industry to zero: π^i(a_t^I, a_t^E) = 0 if a_t^i = a_∅ (i.e., not entering implies zero economic profit).^18
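The Cournot stage game just described can be sketched as follows. The linear inverse demand p = α − β(qI + qE) is an illustrative parameterization, not taken from the paper, which only requires linear, downward-sloping demand and constant marginal costs c(a).

```python
# Minimal sketch of the per-period Cournot stage game, assuming linear
# inverse demand p = alpha - beta*(qI + qE) (illustrative parameters).
def cournot(cI, cE, alpha=10.0, beta=1.0):
    """Equilibrium quantities and profits for marginal costs cI, cE."""
    qI = max((alpha - 2 * cI + cE) / (3 * beta), 0.0)
    qE = max((alpha - 2 * cE + cI) / (3 * beta), 0.0)
    # If one firm is cornered at zero output, the other produces its
    # monopoly quantity.
    if qE == 0.0:
        qI = max((alpha - cI) / (2 * beta), 0.0)
    if qI == 0.0:
        qE = max((alpha - cE) / (2 * beta), 0.0)
    p = alpha - beta * (qI + qE)
    return qI, qE, (p - cI) * qI, (p - cE) * qE

# Symmetric costs give symmetric duopoly profits...
qI, qE, piI, piE = cournot(1.0, 1.0)
# ...while a sufficiently cost-inefficient entrant (a "worst" profile)
# is driven to zero output, as assumption (ii) requires.
_, q0, _, _ = cournot(1.0, 6.0)
print(qI == qE, piI == piE, q0)   # -> True True 0.0
```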

4.2 Policy choices

In this model, the structure of linkages between activity centers is a managerial choice. Formally, a firm's operating structure is depicted by a directed, acyclic graph in which the nodes correspond to the firm's individual activity centers and the edges correspond to the direct influence relationships established between them. For example, Intel's arrangement of the ten semiconductor activities mentioned earlier is described by the graph a_1,t^Intel → ... → a_10,t^Intel; e.g., the output of vapor deposition directly influences testing results, a_8,t^Intel → a_9,t^Intel, while planarization affects testing as well, but indirectly through its influence on the outcomes of intermediating processes.^19

At a higher and, perhaps, more strategic level of aggregation, activities can be viewed as being driven via a top-down decision process of cascading influence relationships that correspond to a firm's formal organization. For example, Bainbridge asserts that bad operating performance at GM in 2005 (supplier bankruptcies, high labor costs, poor product designs, and a corresponding $8 billion loss) was due to its "Detroit-centric" management hierarchy, which sported "six layers of management between top executives and those in the field."^20 A stylized representation of this operating structure is a layered tree in which a_1,t^GM influences a_2,t^GM and a_3,t^GM, which in turn influence a_4,t^GM through a_7,t^GM, where, e.g., a_1,t^GM indicates the period-t decisions, actions, communications, etc., of GM CEO (2006) Rick Wagoner.

As an object of analysis, operating structure is intended to capture the actual influence relationships that drive outcomes. Often, these do not correspond to any official organizational chart but, instead, are tacit and, hence, invisible to outside observers (such relationships are commonly referred to as the "informal organization"). For example, Moody (1995) chronicles a year spent at Microsoft shadowing the Encarta design and development team. As Moody points out, managers often imposed broad structure on the informal organization in the sense I have in mind here; referring to one senior manager, Moody says (p. 217), "His direct interventions in team disputes invariably were in support of Bjerke – an endorsement, it seemed to me, of the decisions she was making," where Bjerke represented the Design component of this team (the other departments included Development and Marketing). This observation is consistent with: a_Design → a_Development and a_Design → a_Marketing.

19. Narrowly interpreted, the notion that managers can precisely establish stable influence relationships between activity centers is a heady one. Real-world managers face a legion of constraints when attempting to do so. More broadly, imagine that managers choose from a menu of monitoring, enforcement, incentive and compensation policies and, as a result, induce the formation of a de facto operating structure – this being the object of analysis here.
20. See www.professorbainbridge.com/2006/03/hierarchy_and_g.html.
21. Clearly, most productive activities undertaken by firms involve feedback loops between resource units. Note that in such cases, loops can often be eliminated via an appropriate choice of time period and action labeling. Also, although simplifications have been made for the purpose of this paper, the theory of causal inference is sufficiently rich to relax this assumption, including allowing for hidden variables and bidirectional influence relations.
16. A ≡ {0, 1}^r ∪ {a_∅}. Unless otherwise indicated, all sets are finite.
17. By symmetry, these conditions also apply to I.
18. It should be mentioned that the following results hold in much more general settings, including those with multiple entrants, large activity domains (a_k,t^i ∈ {0, ..., k}, k < ∞), and environments in which individual profits depend directly upon a_t (allowing, e.g., activity-contingent product differentiation).

Any operating structure is permitted, provided it is free of influence loops (acyclic).^21 Structures need not be fully connected nor are they required to be sensible. For example, Intel is allowed to try a_10,t^Intel → a_9,t^Intel → ... → a_1,t^Intel,

presumably with disastrous results. Index the various operating structures that can be arranged using the r activity centers by 1, ..., m. Robinson (1977) demonstrates that there are

m(r) = Σ_{k=1}^{r} (−1)^{k+1} C(r, k) 2^{k(r−k)} m(r − k),    (3)

directed acyclic graphs that can be constructed from r nodes (where m(0) ≡ 1). By (3), m(1) = 1, m(2) = 3, m(3) = 25, m(4) = 543, m(5) = 29,281, and so on. Note: the Reed and DeFillippi (1990) idea is that causal ambiguity is increasing in m.

Let S denote the set of m operating structures available to both firms, with a typical element (structure) denoted Sk and including a null structure, S_∅. Managers choose an element in S. In the initial, entry-organization phase of period t, firm i chooses a structure S_t^i ∈ S in which the options are: stay out (S_t^i = S_∅) or enter using one of the m non-null operating structures (e.g., S_t^i = Sk). If S_t^i = S_∅, then a_t^i = a_∅ is certain.
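Equation (3) is a straightforward recursion; the following sketch reproduces the counts quoted in the text.

```python
from math import comb
from functools import lru_cache

# Robinson's (1977) recursion, equation (3): the number of directed
# acyclic graphs on r labeled nodes, with m(0) := 1 by convention.
@lru_cache(maxsize=None)
def m(r: int) -> int:
    if r == 0:
        return 1
    return sum((-1) ** (k + 1) * comb(r, k) * 2 ** (k * (r - k)) * m(r - k)
               for k in range(1, r + 1))

print([m(r) for r in range(1, 6)])   # -> [1, 3, 25, 543, 29281]
```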

4.3 Performance implications of structure

In the Microsoft case mentioned above, different elements of the Encarta team (at the time, code-named Sendak) had different agendas (Moody, p. 27): "Sendak's designers and editors would want to pack the encyclopedia with features seen nowhere else ... Sendak's developers would want a far less ambitious set of new features and ample time in which to write code for them." In this situation, "... which element held sway would largely determine the functionality of the software, the timing of its release and, ultimately, its success in the marketplace," [emph. added]. In this model, the strategic decision facing managers is determining which activity centers "hold sway" over one another.

Inevitably, productive activities of all kinds are prone to a certain measure of unpredictability. In addition to independent "noise" at the local level, I assume activity likelihoods vary in systematic ways with choice of influence structure. In order to represent these effects, assume that once an operating structure is established, it generates activities according to a stochastic process along the lines presented in the preceding examples. Suppose S_t^i = Sk. The empirical distribution generated by Sk, denoted π_k, is a probability distribution on A. Thus, all firms implementing structure Sk experience the same expected operating performance. Rather than keeping track of the local influence parameters associated with Sk (i.e., the θ's of §3) and then using them to construct π_k, I simply take π_k as a primitive and make an assumption that guarantees the existence of parameter values that will generate π_k in the desired way.

Assumption 1 (Faithfulness) For all Sk ∈ S and i = 1, ..., r:

1. a_i is π_k-conditionally independent of all its nondescendants given the outcomes of its parents in Sk; and,

2. The removal of any edge in Sk causes item 1 to fail for some a_i.

This assumption is what makes the activity centers behave as a causal system; it creates a link between structure and activity that makes causal inference possible by ruling out degenerate cases. For example, if π is faithful to (2), then it is the case that, for all a ∈ A, π(a) = π(a1) π(a2) π(a3 | a1, a2) but not π(a) = π(a1) π(a2) π(a3).^22 I also assume that the π_k are positive on A (i.e., managers cannot eliminate undesired action profiles by choice of structure). Let F_k denote the set of empirical distributions that are faithful to Sk (i.e., for Sk ≠ S_∅).

In keeping with my focus on imitation, assume there is a uniquely cost-efficient organizational structure and, without loss of generality, label it S1. Fix the incumbent's actions to S_t^I = S1 for all t. Let Π_k^E denote the expected profit for E when it chooses entry-structure Sk (i.e., before actual activity profiles are generated),

Π_k^E ≡ Σ_{a^I ∈ A} Σ_{a^E ∈ A} [p(a^I, a^E) − c(a^E)] q^E(a^I, a^E) π_1(a^I) π_k(a^E),    (4)

where p(a^I, a^E) and q^E(a^I, a^E) are the Cournot equilibrium market price and entrant quantity choices given c(a^I) and c(a^E). Note that (4) is "objective" in the sense that Π_k^i is the true expected profit for firm i when S_t^i = Sk. Assume π_1 is such that Π_1^E > 0, making imitative entry the objectively optimal choice.
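The double sum in (4) can be sketched directly. The demand parameterization and the tiny one-activity-center example (two profiles, one cheap and one expensive) are illustrative assumptions, not taken from the paper.

```python
import itertools

# Hedged sketch of expected-profit equation (4): weight Cournot stage
# profits by pi_1 (incumbent) and pi_k (entrant). Demand and costs are
# illustrative assumptions.
def cournot_qE_p(cI, cE, alpha=10.0, beta=1.0):
    qI = max((alpha - 2 * cI + cE) / (3 * beta), 0.0)
    qE = max((alpha - 2 * cE + cI) / (3 * beta), 0.0)
    return qE, alpha - beta * (qI + qE)

def expected_profit(pi1, pik, cost, profiles):
    total = 0.0
    for aI, aE in itertools.product(profiles, repeat=2):
        qE, p = cournot_qE_p(cost[aI], cost[aE])
        total += (p - cost[aE]) * qE * pi1[aI] * pik[aE]
    return total

profiles = ["0", "1"]              # r = 1: one expensive, one cheap profile
cost = {"0": 4.0, "1": 1.0}
pi1 = {"0": 0.1, "1": 0.9}         # efficient incumbent: mostly low-cost
pik = {"0": 0.9, "1": 0.1}         # inefficient entrant structure

# Imitating the efficient structure yields a higher expected profit.
assert expected_profit(pi1, pi1, cost, profiles) > expected_profit(pi1, pik, cost, profiles)
```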

Hence, E certainly enters if imitation is guaranteed. I prefers that E stay out. Barring that, I prefers that E choose an inefficient operating structure since I's profit is inversely proportional to E's cost.^23

Summing Up. An incumbent I faces a potential entrant E. The firms have access to symmetric resource portfolios. A period-t activity profile for firm i, a_t^i, is a list of observed activities, one for each of r activity centers. The set of all such activity profiles is A. Associated with each a_t^i is a constant marginal cost c(a_t^i). An operating structure, Sk ∈ S, is a graph with nodes corresponding to activity centers and directed edges corresponding to the influence relationships established between them. Structure is the key decision variable. Each period, I adopts the efficient structure, S1, and E makes an entry/operating structure choice S_t^E ∈ S. Each Sk generates a faithful empirical distribution π_k ∈ F_k on firm activities A. At the time of making its decision S_t^E = Sk (i.e., before the resolution of uncertainty), the entrant faces an expected profit of Π_k^E.

5 Observational indistinguishability theorem

Sl is said to be observationally indistinguishable from Sk if the empirical distribution generated by Sl is also faithful to Sk (i.e., if π_l ∈ F_k). To see why this distinction is important, suppose that, over a sufficient period of observation, the challenger develops an arbitrarily accurate assessment of the empirical distribution on the incumbent's activities, π_1. If π_1 ∈ F_k where k ≠ 1, the challenger (who does not observe the incumbent's actual operating structure, S1) cannot tell which of S1 or Sk is actually generating the results. This is where the opportunity for confusion arises. E might choose to implement Sk, in which case direct experience will eventually reveal that it is not the optimal structure.^24 On the other hand, if E is worried that the incorrect choice results in very poor performance, it might choose to skip entry and stick with something safer (i.e., its known outside alternative). Alternatively, as I will show, if π_1 ∉ F_k, then Sk can, over time, be ruled out strictly via passive observation.

Let OI_k denote the set of structures with which Sk is observationally indistinguishable. Under the faithfulness assumption, Sl ∈ OI_k if and only if π_l ∈ F_k. Thus, if S1 is completely "transparent" in the sense that no other structure is observationally indistinguishable from it (OI_1 = {S1}), then it is only a matter of time before the challenger properly identifies the efficient organization and enters. As we will see, operating structures are, in general, not so transparent.

22. See Spirtes et al. (2000, p. 13) for additional technical details. Faithfulness turns out to be a "reasonable" assumption in the sense that, under mild regularity conditions, of all possible parameter values (θ), the set failing the faithfulness condition has Lebesgue measure zero (Meek, 1995).
23. Loss-assuring a^worst implies, for all Sk, there are faithful empirical distributions π_k ∈ F_k under which Π_k^E < 0.

Example 1 Recall from §3 that the organizational structure (2) is the only one capable of generating, over a long period, the data in Table 2. Consider, instead, the structure a1 ← a2 → a3. As mentioned earlier, a1 → a2 → a3 and a1 ← a2 ← a3 are both in the observational indistinguishability class of this structure.

To see this, suppose that the influence parameters of a1 ← a2 → a3 are

θ2 = .90,  θ1|0 = .10,  θ1|1 = .90,  θ3|0 = .80,  θ3|1 = .90.

Then, the empirical distribution on activity profiles is (in percent)

a1 a2 a3 : π      a1 a2 a3 : π
1  1  1  : 72.9   0  1  1  : 8.1
1  1  0  : 8.1    0  1  0  : 0.9
1  0  1  : 0.8    0  0  1  : 7.2
1  0  0  : 0.2    0  0  0  : 1.8    (5)

However, this same distribution is implied under a1 → a2 → a3 with

θ1 = .82,  θ2|0 = .50,  θ2|1 = .98,  θ3|0 = .80,  θ3|1 = .90.

Similarly, it is implied under a1 ← a2 ← a3 with local parameters obtained from (5) by Bayes' rule (approximately θ3 = .89, θ2|0 = .82, θ2|1 = .91, θ1|0 = .10, θ1|1 = .90).

It is important to note that there is no assumption that the parameters for a1 → a2 → a3 and a1 ← a2 ← a3 are, indeed, as described above. Rather, the key point is that an observer of (5), ignorant of the actual parameters, could not tell which of the three operating structures generated the data. As demonstrated, there exist parameters for all three that produce the observed correlations in activities. Hence, all three are in the same observational indistinguishability class.

It would certainly be useful if the observational indistinguishability class of the incumbent's operating structure could be constructed directly from the features of its influence network (i.e., not requiring "brute force" construction by repeated application of Bayes' rule to all m possibilities). Fortunately, as it turns out, this is possible. For the following theorem, given a structure Sk, three activities are said to constitute a local structure identifier (hereafter, LSI) if two unlinked activities are organized to influence the third directly; e.g., a structure like a1 → a3 ← a2.

24. Over time, E learns that its performance is, on average, different from I's (π_k ≠ π_1). Because S1 is assumed to be uniquely efficient, it also discovers Π_k^E < Π^I.
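The "brute force" Bayes'-rule construction that the text mentions can be sketched for the three structures of Example 1: derive each structure's local parameters from the joint distribution (5) and check that every factorization reproduces the joint exactly.

```python
import itertools

# Sketch: for each structure in the equivalence class of a1 <- a2 -> a3,
# recover local parameters from the joint (5) by Bayes' rule and verify
# that the implied factorization reproduces the joint.
joint = {(1,1,1): .729, (1,1,0): .081, (1,0,1): .008, (1,0,0): .002,
         (0,1,1): .081, (0,1,0): .009, (0,0,1): .072, (0,0,0): .018}

def marg(p, idx):
    """P(a_idx = 1)."""
    return sum(v for a, v in p.items() if a[idx] == 1)

def cond(p, idx, given, val):
    """P(a_idx = 1 | a_given = val)."""
    num = sum(v for a, v in p.items() if a[idx] == 1 and a[given] == val)
    den = sum(v for a, v in p.items() if a[given] == val)
    return num / den

def reconstruct(edges):
    # edges: child index -> parent index (None for a root); chains/forks.
    q = {}
    for a in itertools.product((0, 1), repeat=3):
        prob = 1.0
        for child, parent in edges.items():
            t = (marg(joint, child) if parent is None
                 else cond(joint, child, parent, a[parent]))
            prob *= t if a[child] else 1 - t
        q[a] = prob
    return q

fork   = {1: None, 0: 1, 2: 1}   # a1 <- a2 -> a3
chain  = {0: None, 1: 0, 2: 1}   # a1 -> a2 -> a3
rchain = {2: None, 1: 2, 0: 1}   # a1 <- a2 <- a3
for structure in (fork, chain, rchain):
    assert all(abs(reconstruct(structure)[a] - joint[a]) < 1e-9 for a in joint)
```

All three factorizations pass, which is exactly what places them in one observational indistinguishability class; repeating the check for the remaining structures on three nodes would rule them out.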

Theorem 1 (Verma and Pearl, 1990) Two organizational structures are observationally indistinguishable if and only if they have the same edges (regardless of direction) and the same set of LSIs.

Example 2 Suppose the incumbent's organization is a1 ← a3 → a2. Using Theorem 1, we can determine – via visual inspection alone:

OI_1 = {a1 ← a3 → a2, a1 → a3 → a2, a1 ← a3 ← a2}.    (6)

To see this, first note that a structure is observationally indistinguishable from a1 ← a3 → a2 only if it has the same edges: the only possibilities are those shown in (6) plus a1 → a3 ← a2. None of the structures in (6) are ruled out because they all have the same set of LSIs (the empty set). However, this is not true of a1 → a3 ← a2 since, as configured, this structure contains one LSI (in this case, the graph itself).

Example 3 Alternatively, consider the following structure involving 5 activity centers: Budgeting, Engineering, Finance, Manufacturing, and Marketing:

Bud → Fin ← Eng,  Fin → Mfg,  Fin → Mkt.    (7)

Here, Financial Analysis serves as a gatekeeper, checking Engineering projects against Budget's projections before forwarding approved projects to Manufacturing and Marketing. Using Theorem 1, we can instantly determine that there are no other structures in the observational indistinguishability class; keeping the edges constant, there is no way to reverse an arrow without either breaking up an LSI or forming a new one.

This result is important not only because it demonstrates exactly how to construct a structure's observational indistinguishability class, but also because it demonstrates a general insight on the difficulty of strategic inference in the presence of interactions between activity centers. To wit, if the interactions are causal in nature (directed) and if they induce a consistent process of history generation, sometimes more interactions make the inference problem easier. Moreover, Theorem 1 tells us exactly what kinds of relationships serve to increase transparency in this way.
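Theorem 1's criterion is mechanical enough to sketch in a few lines: compare skeletons (edges ignoring direction) and LSI sets. The structures below are those of Examples 2 and 3.

```python
from itertools import combinations

# Sketch of the Theorem 1 (Verma and Pearl, 1990) test: same skeleton and
# same set of LSIs (a -> c <- b with a, b unlinked) iff observationally
# indistinguishable. Structures are sets of directed edges (parent, child).
def skeleton(dag):
    return {frozenset(e) for e in dag}

def lsis(dag):
    parents = {}
    for p, c in dag:
        parents.setdefault(c, set()).add(p)
    skel = skeleton(dag)
    return {(a, c, b) for c, ps in parents.items()
            for a, b in combinations(sorted(ps), 2)
            if frozenset((a, b)) not in skel}

def indistinguishable(d1, d2):
    return skeleton(d1) == skeleton(d2) and lsis(d1) == lsis(d2)

fork     = {(3, 1), (3, 2)}   # a1 <- a3 -> a2
chain    = {(1, 3), (3, 2)}   # a1 -> a3 -> a2
rchain   = {(2, 3), (3, 1)}   # a1 <- a3 <- a2
collider = {(1, 3), (2, 3)}   # a1 -> a3 <- a2: one LSI, the graph itself

assert indistinguishable(fork, chain) and indistinguishable(fork, rchain)
assert not indistinguishable(fork, collider)

# Example 3's gatekeeper structure (7) contains exactly one LSI.
gate = {("Bud", "Fin"), ("Eng", "Fin"), ("Fin", "Mfg"), ("Fin", "Mkt")}
assert len(lsis(gate)) == 1
```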

6 Subjective rationality

What, exactly, does E know about the market? To keep our attention on the causal inference problem, assume that E is aware of all environmental primitives except: i) which structure is the efficient one consistently chosen by I; and, ii) the actual empirical distributions associated with each of the m available operating structures.^25 In this scenario, E is far better informed than most real-world counterparts would be under similar circumstances. For example, it knows what its resources are, how to parameterize market demand, how to organize its activity centers to create the desired influence relationships, and so on. Still, a crucial piece of the strategic puzzle – which operating structure is efficient – is missing. Thus, at a very fundamental level, E does not know the consequences of its policy options.

First, assume that, following each period, E observes I's activity profile. E also knows its own entry/organization decisions and activity profile outcomes. Therefore, at the start of period t, E observes a history of the form

h_t ≡ (h_0, S_1^E, a_1^I, a_1^E, ..., S_{t−1}^E, a_{t−1}^I, a_{t−1}^E),

where h_0 is the period-1 null history. I adopt the notational convention of using "^" to indicate E's assessment with respect to an object. So, π ≡ (π_1, ..., π_m) summarizes the true empirical distributions associated with each of the m structures and π̂ ≡ (π̂_1, ..., π̂_m) summarizes E's initial beliefs about these distributions. Let λ̂ ≡ (λ̂_1, ..., λ̂_m) be E's initial assessment about which structure is optimal; e.g., λ̂_k is E's initial belief that S_k is the true efficient structure (the one employed by I). λ̂ is a set of beliefs that is used to weigh other beliefs, thereby adding the ambiguity dimension to the model.^26 I write, e.g., λ̂_k(h_t) to indicate E's updated assessment of λ_k given the history h_t or, when the history is implied, simply λ̂_k,t. E is savvy to the idea of using causal inference to infer I's operating structure: it knows that, for any S_k, π_k ∈ F_k. Finally, assume that E's initial priors are independent, Dirichlet distributed, and result in strictly positive multinomial distributions.^27

25. Earlier versions of this paper included uncertainty about the general mapping from activity configurations to payoffs and, in the case with multiple entrants, the strategies of rivals. The results were virtually identical at the expense of a massive increase in mathematical complexity.
26. In this formulation, λ̂ depends upon π̂. See Appendix B for the technical details.
27. The Dirichlet assumption is primarily to ensure that E can, indeed, update its beliefs in response to new information. This distribution is well-studied in the context of learning a hidden causal structure from data. Under fairly mild assumptions on the resulting posterior empirical distributions, Dirichlet priors are actually implied. Interested readers are referred to Neapolitan (2004, p. 309).

This setup allows a fairly wide array of possibilities. From E's perspective, there are m relevant "states of the world" – one corresponding to each structure in which that structure is the optimal choice (λ̂ weighs these states each period). E may have different assessments conditional upon which state of the world it finds itself in. For example, suppose there is a large number of activity centers that includes Engineering, Marketing and Manufacturing. Let "Engineering is king" be the structure in which Engineering influences everything and define "Marketing is king" similarly. The setup is sufficiently general that E is also allowed to believe that, conditional on "Engineering is king" being the optimal choice, Engineering → Manufacturing is locally efficacious (i.e., that any structure with this link results in lower expected costs than without it). Conversely, discovering that "Marketing is king" is optimal may lead to the opposite conclusion regarding Engineering → Manufacturing. Moreover, nothing prevents E being wrong on both counts.

A dynamic strategy, denoted σ, for firm E specifies a (possibly random) entry/organization choice for every history: formally, σ(S_k | h_t) is the probability that E chooses S_t^E = S_k upon observing the history h_t. σ can encode simple strategies ("Stay out forever," "Enter in even periods under S_k," etc.) as well as much more sophisticated, outcome-dependent ones ("Fix ε ∈ [0, 1] and enter with S_k in any period t where λ̂_k,t ≥ 1 − ε," "Employ NK-style search from period t on," "Embark on a subjectively optimal program of Bayesian experimentation," etc.). Assume E is subjectively rational, meaning: (i) beliefs are updated in Bayesian fashion; and, (ii) σ maximizes the subjective expected present value of profits in every period.^28 Remember that entry from period 1 on is, in fact, the optimal strategy. The only way imitation is forestalled, therefore, is if E's subjective assessments are persistently wrong. If E is allowed to believe anything (e.g., "Martians always strike down imitators with death rays") then finding beliefs that cause imitation to fail is trivial. However, E is not only a subjective optimizer, but also a rational learner – its beliefs are properly updated in response to new information. Thus, it is not obvious that there are any initial beliefs that, ultimately, cause imitation to fail.

7 Causal ambiguity

Given this setup, it is possible to introduce two measures of causal ambiguity, one with respect to the inherent transparency of the incumbent's operating structure and another with respect to a firm's subjective beliefs regarding which structure that is. Given perfect knowledge of π_1 (I's empirical distribution), there is a limited number of structures that E might confuse with S1 under the faithfulness assumption. The idea is to relate this number to the subjective beliefs of E over time under Bayesian learning. Presumably, E's beliefs must at least converge (almost surely) to place positive weight only on structures faithful to S1. Less obvious is whether, under subjective rationality, the opportunity to enter and experiment with structures of its own implies that E must inevitably learn the efficient structure.

Definition 1 Given beliefs λ̂, E's subjective degree of causal ambiguity is

α̂(λ̂) ≡ − Σ_{k=1}^{m} λ̂_k ln(λ̂_k),    (8)

where 0 ln(0) ≡ 0.

This measure ranges from zero to |ln(m)| (a positive number). It equals |ln(m)| when the challenger's priors regarding the optimal structure are uninformative (i.e., λ̂_1 = ... = λ̂_m = 1/m)^29 and zero when the challenger is certain that it knows which structure is the optimal one.

28. This particular notion of subjective rationality was introduced by Kalai and Lehrer (1993). Ryall (2004) provides the first application to strategy.
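Equation (8) is the familiar entropy measure, and its two boundary cases can be checked directly (the symbol names below are illustrative):

```python
from math import log

# Sketch of equation (8): subjective causal ambiguity as the entropy of
# E's beliefs over which structure is efficient, with 0*ln(0) := 0.
def ambiguity(beliefs):
    s = sum(b * log(b) for b in beliefs if b > 0)
    return -s if s else 0.0

m = 25                               # number of structures when r = 3
uniform = [1 / m] * m                # uninformative priors
certain = [1.0] + [0.0] * (m - 1)    # E is sure which structure is best

print(round(ambiguity(uniform), 1), ambiguity(certain))   # -> 3.2 0.0
```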

This definition is useful because it summarizes each firm's uncertainty regarding the efficient causal structure in a single number. From E's perspective, there are m primary states of the world, one corresponding to each structure Sk ∈ S in which S^I = Sk; i.e., the condition that Sk is the true efficient structure (λ̂ summarizes E's subjective weights on these states). E's beliefs regarding the empirical distributions associated with each structure can vary depending upon the state of the world. For example, E may believe that the probability of (111) under a1 ← a2 → a3 is .9 if S^I = a1 → a2 → a3 and .1 if S^I = a1 ← a2 ← a3. If λ̂(S^I = a1 → a2 → a3) = .4 and λ̂(S^I = a1 ← a2 ← a3) = .6, then E's subjective expected probability of (111) should it adopt a1 ← a2 → a3 is (.9)(.4) + (.1)(.6) = .42. In other words, (8) is a measure of the spread in subjective beliefs over the probabilities associated with causal structures – hence the term "causal ambiguity." Let m1 be the number of structures that are observationally indistinguishable from S1 (i.e., the set cardinality of OI_1).

other words, (8) is a measure of the spread in subjective beliefs over the probabilities associated with causal structures –hence the term “causal ambiguity.”Let m1 be the number of structures that are observationally indistinguishable from S1 (i.e., the set cardinality of OI1 ). De…nition 2 The intrinsic ambiguity of S1 is

jln (m1 )j :

Intrinsic ambiguity is equal to the subjective degree of causal ambiguity when managers place equal weight on, and only on, the elements of a structure's observational indistinguishability set. Other than the requirement that E not initially rule out any structure from potentially being the efficient one, λ̂ is fairly unrestricted. Therefore, it is not obvious what, if any, long-run relationship exists between α̂ and α. For example, if the subjectively optimal strategy specifies entry and experimentation until the true structure is discovered, α̂ must eventually converge to 0.

Example 4 Suppose that there are 3 structures in the observational indistinguishability class of S1. Then α = ln(3) = 1.1. Suppose that r = 3 and that E has uninformative initial beliefs (places equal weight on each of the m = 25 possible structures that it is the efficient one). Then, α̂_0 = ln(25) = 3.2.

29. α̂ is the entropy measure of λ̂_t (see, e.g., Golan et al. 1996).
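The two numbers in Example 4 can be recovered by brute force for r = 3: enumerate all 25 DAGs, pick out the observational indistinguishability class of a1 ← a3 → a2 with the Theorem 1 test, and take logs.

```python
from itertools import combinations, product
from math import log

# Sketch for r = 3: enumerate all DAGs on three nodes and compute
# intrinsic ambiguity ln(m1) for a1 <- a3 -> a2 versus the entropy
# ln(m) of uninformative priors over all m structures.
NODES = (1, 2, 3)
PAIRS = list(combinations(NODES, 2))

def acyclic(edges):
    nodes, es = set(NODES), set(edges)
    while nodes:
        roots = [n for n in nodes if all(c != n for _, c in es)]
        if not roots:
            return False          # every remaining node has a parent: cycle
        for r in roots:
            nodes.remove(r)
        es = {(p, c) for p, c in es if p not in roots}
    return True

def sig(dag):
    # Theorem 1 signature: skeleton plus the set of LSIs.
    skel = {frozenset(e) for e in dag}
    parents = {}
    for p, c in dag:
        parents.setdefault(c, set()).add(p)
    ls = {(a, c, b) for c, ps in parents.items()
          for a, b in combinations(sorted(ps), 2)
          if frozenset((a, b)) not in skel}
    return (frozenset(skel), frozenset(ls))

dags = []
for choice in product((0, 1, 2), repeat=len(PAIRS)):   # absent / a->b / b->a
    edges = {((a, b) if c == 1 else (b, a)) for (a, b), c in zip(PAIRS, choice) if c}
    if acyclic(edges):
        dags.append(frozenset(edges))

fork = frozenset({(3, 1), (3, 2)})                     # a1 <- a3 -> a2
cls = [d for d in dags if sig(d) == sig(fork)]
assert len(dags) == 25 and len(cls) == 3
print(round(log(len(cls)), 1), round(log(len(dags)), 1))   # -> 1.1 3.2
```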

Example 5 The structure in Example 3 has α = 0 even though, in total, there are m = 29,281 ways to organize these units. Compare this against a1 ← a3 → a2, which has α = 1.1 even though m = 25.

8 Result on long-run ambiguity

I now proceed to show the extent to which the incumbent's operating structure is revealed to an entirely passive observer. That is, if E stays out and does no exploration of its own, how much can it learn about I's true choice of operating structure? The questions of interest in this section include, assuming E stays out and, hence, gains no first-hand knowledge regarding the structure-performance relationship: To what extent does E learn to predict I's operating activities? How accurate does E's assessment eventually become regarding the efficient structure? What is the relationship between intrinsic ambiguity and E's long-run subjective ambiguity?

First, how well does E come to predict I's operating behavior? Let π̂_t^I denote E's period-t assessment of the incumbent's true empirical distribution. Then, the answer, provided by the next lemma, is that π̂_t^I converges to reality with probability 1.^30 Because I's activities are stochastically determined, it is always possible that the actual history observed by E will, by pure chance, happen to mimic data driven by some distribution other than π_1. What the lemma says is that, over time, large discrepancies between E's beliefs and the truth are highly unlikely. This degree of learning occurs even under a strategy of strictly passive observation (E stays out forever).

Lemma 1 For all strategies σ and initial beliefs π̂, E's subjective assessment π̂_t^I converges in probability to π_1. Formally, plim_{t→∞} π̂_t^I = π_1.
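Lemma 1's convergence can be illustrated with a crude simulation. The distribution used is (5) from Example 1, and the Bayesian machinery is simplified to raw frequency counts (the Dirichlet posterior mean behaves the same way asymptotically); both simplifications are assumptions for illustration only.

```python
import random

# Illustrative simulation of Lemma 1: a purely passive observer's
# frequency estimate of the incumbent's empirical distribution converges.
random.seed(0)
pi1 = {(1,1,1): .729, (1,1,0): .081, (1,0,1): .008, (1,0,0): .002,
       (0,1,1): .081, (0,1,0): .009, (0,0,1): .072, (0,0,0): .018}
profiles, weights = zip(*pi1.items())

T = 100_000
counts = {a: 0 for a in profiles}
for a in random.choices(profiles, weights=weights, k=T):
    counts[a] += 1

max_gap = max(abs(counts[a] / T - pi1[a]) for a in profiles)
assert max_gap < 0.01   # all cells within one percentage point after T periods
```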

Lemma 1 and Theorem 1 imply that E’s beliefs regarding I’s operating structure are highly likely to become concentrated with weight 1 on the set of structures in S1 ’s observational indistinguishability class, OI1 . Thus, because causal structures induce certain, well-de…ned regularities in the observations they induce, shrewd scrutiny of incumbent conduct cannot but help to reduce the number of structures considered likely candidates for imitation. This important result is stated formally in the following proposition. Proposition 1 For all strategies

and initial beliefs ^ ; E’s subjective likelihood that StI 2 OI1 converges

in probability to 1: plim t!1 3 0 The

X

^k;t = 1:

StI 2OI1

reference distribution in these propositions is always reality (i.e., the distribution over histories implied by the policy

choices of I and E and the true empirical distributions associated with each operating structure).

22
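Lemma 1 can be illustrated with a small simulation. The sketch below uses an invented true distribution $\mu_1$ over four activity profiles and a Dirichlet prior of my own choosing (these numbers are illustrative, not parameters from the model); the posterior-predictive estimate $(\nu_l + n_{lt})/(\nu + t)$ from the proof of Lemma 1 approaches the truth as passive observations accumulate.

```python
import random

random.seed(0)

# Hypothetical true empirical distribution mu_1 over the r = 2 activity profiles
true_mu = {"00": 0.5, "01": 0.2, "10": 0.2, "11": 0.1}
profiles = list(true_mu)

nu = {a: 1 for a in profiles}    # Dirichlet prior parameters (uniform prior)
counts = {a: 0 for a in profiles}
T = 20000                        # periods of strictly passive observation

for _ in range(T):
    a = random.choices(profiles, weights=[true_mu[p] for p in profiles])[0]
    counts[a] += 1

nu_total = sum(nu.values())
# Posterior predictive (nu_l + n_lt) / (nu + t), as in the proof of Lemma 1
post = {a: (nu[a] + counts[a]) / (nu_total + T) for a in profiles}
max_err = max(abs(post[a] - true_mu[a]) for a in profiles)
```

With 20,000 observations the maximum error across profiles is typically well below 0.01, consistent with convergence in probability to $\mu_1$.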

As we now know, more activities permit more linkage choices which can, in many cases, make the overall structure more transparent. Clearly, if $|OI_1| = 1$ ($S_1$ is perfectly transparent), then E eventually enters and imitates. Linking this result to intrinsic ambiguity, it is immediate that $\alpha = 0$ is sufficient to imply successful imitation. More generally, E's subjective ambiguity is limited by the intrinsic ambiguity of I's operating structure.

Corollary 1 For all strategies $\sigma$ and initial beliefs $\hat\rho$, E's subjective degree of causal ambiguity converges in probability to a number bounded by the intrinsic ambiguity of $S_1$: $\operatorname{plim}_{t\to\infty} \hat\alpha(\hat\rho_t) = x \le \alpha$.

Corollary 1 highlights the fact that the intrinsic ambiguity of the optimal operating structure bounds the ambiguity that can exist under conditions of passive observation. Even the challenger that never enters and, hence, never experiments with operations of its own, eventually learns the incumbent's empirical distribution to an arbitrary degree of accuracy. By Theorem 1, this limits beliefs with respect to what the incumbent is actually doing to generate its observed behavior. Of course, under sufficient experimentation, ambiguity may be reduced even further and, perhaps, eliminated altogether. However, even these results are sufficient to refine the NK hypothesis.

Corollary 2 E's subjective degree of causal ambiguity converges in probability to a value that is not monotonic in the NK complexity of the cost function.

Corollary 2 makes an important point about causal ambiguity in the real world, one that slips through the intuitively appealing reasoning employed in traditional discussions on this topic. Causal ambiguity arises as an issue in strategy because it is thought to be a source of the kind of confusion that prevents managers from imitating the performance of their more successful competitors. As we see in (3), the aggregate number of feasible causal structures is, indeed, exponentially increasing in the number of observed activities. However, the very context of the problem – posited as a challenger attempting to imitate a successful incumbent – already implies that the challenger observes certain aspects of the exemplary firm's behavior. This, in turn, implies the possibility of applying the tools of causal inference to the observed history. Over time E learns $\mu_1$ and, in turn, the observational equivalence class of structures capable of generating it. Sometimes problems with lots of choices are hard to solve and sometimes they are easy – it all depends upon the ruggedness of the landscape. Similarly, if points on a rugged landscape emit data according to a location-specific causal process, then sometimes the area of search is small and sometimes large – it all depends upon the degree of causal ambiguity. As in (7), the lucky case for the entrant is when the incumbent's location can be narrowed down to a single point. Such cases may be rare since the intrinsic ambiguity of most operating structures is greater than zero. Of course, once causal inference is taken as far as it will go, E may very well wish to enter and apply active learning procedures to assess the remaining options. The fact that $\alpha < |\ln(m)|$ implies that learning-by-observing always results in a reduction in the search space.

Interestingly, managers who increase the number of linkages between activities in the hope of introducing imitator-confusing complexity may inadvertently make their operations more transparent instead. Alternatively, simple strategies with sparse cross-sectional interrelationships may be more difficult to piece together from the outside. Although I do not pursue it further, Theorem 1 provides a road map to managers who, if they understand their external landscape and the expected operating implications of various structures, can use it to maximize the performance-to-ambiguity ratio (à la Rivkin, 2000). Finally, for any number of operating variables, the largest observational indistinguishability class is the one containing (all) the fully connected graphs. Therefore, if the optimal deployment always imposes the densest set of influence relations, then the upper bound on subjective ambiguity does increase monotonically in the number of activities. However, it is difficult to imagine a compelling reason to make such an assumption, especially since the costs required to impose and maintain more interrelated organizations should be higher (costs that are ignored in my setup).
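These claims can be checked by brute force for $r = 3$. The sketch below enumerates the 25 acyclic structures on three activity centers and groups them using the standard graphical criterion for observational indistinguishability under faithfulness (same skeleton and same v-structures); this grouping is my reading of $OI_k$, not code from the paper. The class of fully connected graphs is indeed the largest, and class sizes, hence $\alpha = |\ln(m_1)|$, are far from monotonic in the number of links.

```python
from itertools import combinations, product
import math

nodes = (0, 1, 2)
pairs = list(combinations(nodes, 2))  # the 3 undirected pairs

def all_dags():
    # Each pair is absent (0) or oriented one way (1) or the other (2).
    for states in product(range(3), repeat=3):
        edges = set()
        for (a, b), s in zip(pairs, states):
            if s == 1:
                edges.add((a, b))
            elif s == 2:
                edges.add((b, a))
        # On 3 nodes the only possible cycles are the two directed triangles.
        if {(0, 1), (1, 2), (2, 0)} <= edges or {(1, 0), (0, 2), (2, 1)} <= edges:
            continue
        yield frozenset(edges)

def signature(edges):
    # Same skeleton and same v-structures (x -> y <- z with x, z non-adjacent)
    # identify an observational indistinguishability class.
    skeleton = frozenset(frozenset(e) for e in edges)
    colliders = set()
    for y in nodes:
        parents = [x for (x, t) in edges if t == y]
        for x, z in combinations(parents, 2):
            if frozenset((x, z)) not in skeleton:
                colliders.add((frozenset((x, z)), y))
    return (skeleton, frozenset(colliders))

classes = {}
for d in all_dags():
    classes.setdefault(signature(d), []).append(d)

n_structures = sum(len(v) for v in classes.values())   # 25, matching (3)
sizes = sorted(len(v) for v in classes.values())
ambiguities = [abs(math.log(m)) for m in sizes]        # alpha = |ln(m_1)| per class
```

The 25 DAGs fall into 11 classes of sizes 1 through 6: the empty and collider structures are perfectly transparent ($\alpha = 0$), while the six fully connected DAGs form a single class with the maximal $\alpha = \ln 6$.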

9  Result on sustained advantage

Because E controls the same technology as I, any sustainable advantage for the incumbent is "capability-based" in the sense of Saloner et al. (2001, p. 41-55). Therefore, I is said to sustain a strong capabilities-based advantage if E never enters. This is a very strong form of imitative failure. In this case, the incumbent's advantage with respect to the potential entrant is sufficient to guarantee it monopoly profits. Alternatively, a weak advantage is one in which E never imitates. I would enjoy a weak advantage if E attempted imitation, found a profitable but suboptimal structure and decided the risks of further organizational innovation offset the perceived benefits. A strong advantage implies a weak advantage but not conversely (hence, the terminology). In order to decide its course of action, E must conduct a subjectively rational risk analysis. That is, E must convert its beliefs regarding its various organizational options – whatever their degree of ambiguity – into an appropriate profit assessment. Ambiguity by itself is never sufficient to deter entry or imitation. For example, an entrant may have strong expectations that entry is profitable under any of the firm's feasible range of activity profiles (i.e., the entrant's priors place high probabilities on good outcomes under every choice of structure). Even if this assessment is overly optimistic, entry occurs and, at best, only a weak advantage obtains. On the other hand, if entry is perceived to be sufficiently risky, it may be deterred altogether (with causal ambiguity playing a key supporting role). To capture these considerations, I now introduce the following risk measure.


Definition 3 The intrinsic risk of $S_1$ is
$$\psi \equiv \frac{m_1 - 1}{m_1}\,\phi \;-\; \frac{1}{m_1}\,\pi^E_1. \qquad (9)$$

It is important to note that this measure does not depend upon E's subjective beliefs – it is computed from the objective primitives of the model. To interpret $\psi$, consider a situation in which sufficient time has passed that E's beliefs regarding $\mu_1$ are arbitrarily accurate (as Lemma 1 guarantees). Suppose that no entry has occurred up to this point. E knows that there are $m_1$ structures in $OI_1$ and that one of these structures is the one generating $\mu_1$. What E does not know is which empirical distributions go with which structures. In this case, a lower bound on the worst possible payoff should E choose unwisely is $-\phi$.³¹ $\psi$ is the Bayes risk (see DeGroot, 1970, p. 121-23) of an extremely pessimistic challenger who knows the true empirical distribution is $\mu_1$ but has uninformative beliefs with respect to which structure in $OI_1$ is the efficient one.

Proposition 2 Assume that E knows $\mu^I = \mu_1$. If $\psi < 0$, then I does not enjoy a strong capabilities-based advantage nor, with probability 1, does it enjoy a weak capabilities-based advantage.

If $\psi < 0$, then entry happens immediately because even the most pessimistic beliefs cannot deter it.

Indeed, the situation is even worse: E not only enters but persists in experimenting until imitation succeeds. Although I may enjoy a weak advantage for some time, it is highly unlikely to last forever (i.e., there are sequences of random events under which E never imitates, but they occur with probability 0). Eventually, E figures out that its current structure is not performing to expectation and tries something different (because $\psi < 0$, E does not exit). Notice the connection to intrinsic ambiguity. When intrinsic causal ambiguity is zero, $m_1 = 1$ and, by (9), $\psi = -\pi^E_1 < 0$. This implies the following corollary.

Corollary 3 Assume that E knows $\mu^I = \mu_1$. If $\alpha = 0$, then I does not enjoy a strong capabilities-based advantage nor, with probability 1, does it enjoy a weak capabilities-based advantage.

It is very important to note that, when $\psi \ge 0$, it is not hard to construct subjective beliefs that support E staying out with probability 1. This holds even though, in this setup, the direct cost of entry/imitation is zero. Interestingly, subjectively optimal experimentation – as must arise under the assumption that E is subjectively rational – is not sufficient to assure objectively optimal behavior. Corollary 3 confirms the conventional wisdom in strategy insofar as causal ambiguity is a necessary condition for imitation to fail. However, it is not sufficient: if the financial benefit of successful imitation is too strong relative to the cost of failure, challengers enter and doggedly experiment with organizational structures until they get it right. To gain some intuition into these effects and their relationship to $\psi$, let us turn to a final example.

³¹ For each structure $S_k$ and all $\varepsilon > 0$, there exists an empirical distribution $\mu_k \in F_k$ such that $\pi^E_k = \varepsilon - \phi$.

Example 6 Consider a situation in which there are two activity centers ($r = 2$). Let $S_1 = a_1 \to a_2$, $S_2 = a_1 \leftarrow a_2$ and $S_3 = $ (indep. ops.). By the indexing convention, $S^I = S_1$. For the sake of simplicity,

limit the number of periods to two. Should E imitate successfully, it enjoys an expected payoff in each period of $\pi^E_1 > 0$. E does not know that $S^I = S_1$ but, let us suppose, it is sufficiently informed (either by direct observation or by piecing together case studies, news reports and academic journal articles) to have a very accurate assessment of I's operating behavior; i.e., $\hat\mu^I = \mu^I = \mu_1$. Let $\hat\rho_1$, $\hat\rho_2$, and $\hat\rho_3$ be E's initial priors on $S^I = S_1$, $S^I = S_2$ and $S^I = S_3$. Because E knows that $\mu_1$ is generated by a causal system, it knows at least that $S^I \in OI_1 = \{S_1, S_2\}$. Therefore, $\hat\rho_3 = 0$ and $\hat\rho_1 + \hat\rho_2 = 1$. In addition to not knowing which structure is the optimal way to organize its two activity centers, it also does not know the empirical distribution associated with the wrong choice. Assume that E is very pessimistic: it believes the wrong choice of structure generates the bad activity profile $a^{worst}$ with certainty, resulting in an expected loss equal to fixed operating costs $\phi$.³²

Now, from E's perspective, if it enters and chooses wisely it enjoys expected profit of $\hat\pi^E_{good}$, where $\hat\pi^E_{good} = \pi^E_1$. On the other hand, if it chooses unwisely, it suffers an expected loss of $\hat\pi^E_{bad} = -\phi$. To simplify things even further, assume that exploration is incredibly effective; specifically, if E enters in period 1 then it learns beyond a doubt whether its organization is the efficient one. Of course, if E stays out, it learns no additional information.

E has three logical choices: i) stay out, ii) enter under $S_1$, and iii) enter under $S_2$. If staying out is optimal in period 1, given the fact that E learns nothing new, it is also optimal in period 2. The net present value of the stay-out strategy is, therefore, 0. If E enters under $S_1$ then, with probability $\hat\rho_1$, it earns period-1 profits of $\hat\pi^E_{good}$ and, with probability $(1 - \hat\rho_1)$, a loss of $\hat\pi^E_{bad}$. However, by entering in period 1, it learns the correct structure and, therefore, is assured a payoff of $\hat\pi^E_{good}$ in period 2. Recalling that the discount rate is $\delta$, the subjective expected present value of this plan is
$$V_1 = \hat\rho_1 \hat\pi^E_{good} + (1 - \hat\rho_1)\,\hat\pi^E_{bad} + \delta\,\hat\pi^E_{good} = (\hat\rho_1 + \delta)\,\pi^E_1 - (1 - \hat\rho_1)\,\phi.$$
Similarly, E's assessment of the present value of entering under $S_2$ is
$$V_2 = (\hat\rho_2 + \delta)\,\pi^E_1 - (1 - \hat\rho_2)\,\phi, \quad \text{where } \hat\rho_2 = (1 - \hat\rho_1).$$
Since E is subjectively rational, if it does enter, it chooses the structure corresponding to the larger of $V_1$ or $V_2$. This is entirely determined by the larger of $\hat\rho_1$ or $\hat\rho_2$. Let $V_i = \max\{V_1, V_2\}$. Then, because it can stay out and be assured a payoff of 0, E does not enter if $V_i < 0$; that is,
$$(\hat\rho_i + \delta)\,\pi^E_1 - (1 - \hat\rho_i)\,\phi < 0, \qquad (10)$$
where, since it is the value-maximizing entry choice, $\hat\rho_i \ge \tfrac{1}{2}$. Rearranging terms:
$$\hat\rho_i < \frac{\phi - \delta\,\pi^E_1}{\pi^E_1 + \phi}. \qquad (11)$$

³² More properly, E believes $a^{worst}$ occurs with probability $1 - \varepsilon$ for $\varepsilon > 0$ arbitrarily small, a technical detail I ignore for the purpose of the example.
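Conditions (10) and (11) are easy to verify numerically. The sketch below uses illustrative values for $\delta$, $\pi^E_1$ and $\phi$ (my own choices, not values from the paper) and confirms that staying out is subjectively optimal exactly when the best entry prior falls below the threshold in (11), and that a threshold below one half means E always enters.

```python
# Illustrative parameters (assumed for this sketch): discount rate,
# per-period profit under the efficient structure, and fixed cost.
delta, pi_E, phi = 0.9, 1.0, 3.0

def entry_value(rho):
    # V_i = (rho_i + delta) * pi_1^E - (1 - rho_i) * phi, as in (10)
    return (rho + delta) * pi_E - (1 - rho) * phi

threshold = (phi - delta * pi_E) / (pi_E + phi)  # right-hand side of (11)

# A prior just below the threshold deters entry; one above it does not.
v_deterred, v_enters = entry_value(0.52), entry_value(0.60)

# With a smaller fixed cost the threshold drops below 1/2, so no
# value-maximizing prior rho_i >= 1/2 can satisfy (11): E always enters.
phi_small = 0.8
threshold_small = (phi_small - delta * pi_E) / (pi_E + phi_small)
v_half = (0.5 + delta) * pi_E - 0.5 * phi_small
```

The comparison of priors against the single threshold mirrors the text: impatience (low $\delta$) and a low benefit-to-cost ratio $\pi^E_1/\phi$ raise the threshold and make deterrence easier.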

This example highlights several insights that carry through to the more general case. First, condition (11) and, hence, the possibility of failed imitation, only arises as a result of causal ambiguity. If E knows $S^I = S_1$ (i.e., $\hat\rho_1 = 1$), it always enters and earns $(1 + \delta)\,\pi^E_1$. Second, entry offers the opportunity to gain additional knowledge via direct experience. Here, all the learning happens in one period. This is unrealistically fast, but is consistent with what happens over longer periods (with high probability). Thus, the pessimistic E must trade off the benefit from learning (here, getting $\pi^E_1$ for certain in period 2) with the downside of implementing the wrong structure in period 1. From condition (11), we see that E is less likely to enter the more impatient it is (low $\delta$) and the lower the relative benefit to learning (i.e., the size of $\pi^E_1$ relative to $\phi$). In particular, when $(\phi - \delta\,\pi^E_1)/(\pi^E_1 + \phi) < \tfrac{1}{2}$, the payoff to experimentation is sufficiently large that E always enters.

Tying the example back to our main result, note that $m_1 = 2$ so that $\psi = \tfrac{1}{2}\phi - \tfrac{1}{2}\pi^E_1$. Suppose $\psi < 0$. Then $\pi^E_1 - \phi > 0$. But, from (10) and the fact that $\hat\rho_i \ge \tfrac{1}{2}$, this implies that E's subjectively best choice of entry structure (whichever it is) results in a strictly positive expected payoff in period 1. Therefore, there is no trade-off to make – entry is the strictly dominant, subjectively optimal decision! Keep in mind, we constructed E's beliefs to be maximally pessimistic given its knowledge of I's operating performance. Thus, if E enters under these beliefs, it enters under any beliefs (consistent with $\hat\mu^I = \mu_1$). Moreover, even in the more general case, E never exits the market. It continues to learn and adopt subjectively optimal structures until (with probability 1) it succeeds in imitating.

Let me conclude this section with the following observation. Because $\psi$ is constructed from objective primitives, Proposition 2 is, in theory, empirically refutable. In the simplest setting, this requires estimating $\mu_1$ and $\pi^E_1$ from incumbent operating data. The estimate of $\mu_1$ then implies $OI_1$ and, hence, $m_1$. Finally, $\phi$ would be estimated from the expected profit in the worst-case organizational scenario. In the real world, the analysis is complicated by hidden causal relationships, multiple firms, etc. However, analytic techniques do exist for estimating causal structures from the data they generate under these complications. Refuting the corollary is a somewhat simpler affair, requiring "only" the estimation of $\mu_1$ and comparing the resulting $\alpha$ to some measure of imitative success within the industry of study.
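The empirical program sketched above is easy to prototype once estimates of $m_1$, $\pi^E_1$ and $\phi$ are in hand. The sketch below (with made-up numbers) computes $\psi$ from (9); note that $m_1 = 1$ forces $\psi = -\pi^E_1 < 0$, the Corollary 3 case in which entry is never deterred, and that $\psi$ rises with the size of the indistinguishability class.

```python
def intrinsic_risk(m1, pi_1E, phi):
    """Intrinsic risk of S_1, equation (9): the Bayes risk of a
    uniform-prior pessimist over the m1 structures in OI_1."""
    return (m1 - 1) / m1 * phi - pi_1E / m1

# Zero intrinsic ambiguity (m1 = 1) implies psi = -pi_1E < 0.
psi_zero_ambiguity = intrinsic_risk(1, 2.0, 5.0)

# Holding payoffs fixed, risk increases in the class size m1.
risks = [intrinsic_risk(m, 2.0, 5.0) for m in (1, 2, 4, 8)]
```

Proposition 2's entry test then reduces to checking the sign of the computed $\psi$.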

10  Causal ambiguity versus combinatorial complexity

As mentioned earlier, my model was designed to extend LRL06 in a very transparent way, with the objective of facilitating comparisons with the growing number of strategy applications that utilize the NK formalism.

Specifically, although the cost function in my model is assumed to be NK-tuned, the preceding results in no way depend upon any choices for N and K. This demonstrates that causal ambiguity is, indeed, a distinct barrier to imitation. Still, comparisons must be made with care. Here, as in LRL06, the complexity of the marginal cost function on activities can be NK-tuned. However, unlike LRL06, activities in my model are (quite purposefully) not managerial choice variables. Thus, on the one hand, I have shown a set of circumstances under which managers facing an NK-complex cost landscape can avoid the correspondent search problem by applying causal inference to the operations of the incumbent.³³ On the other hand, as the astute reader might (rightly) point out, this result is achieved by shifting the object of managerial choice from action profiles to operating structures, thereby causing a mismatch on a very critical dimension. As Rivkin (2000) is careful to emphasize, the issue is the complexity of the decision problem; i.e., the mapping from decisions/policies/choices to payoff outcomes.³⁴ Thus, because the only policy variable here is choice of operating structure, one might suspect that the original, complex decision problem was simply replaced with a relatively simple one. To see that the essential message of this paper with respect to imitation withstands this observation, it suffices to consider an instance of the model in which the choice of operating structure is, itself, NK-complex. To make things concrete, return to the 3-activity case. By (3), there are 25 possible operating structures that meet the acyclicality requirement (not counting the stay-out option). Arbitrarily index these 1 to 25 and assign each its number in 5-digit binary (e.g., structure #1 = 00001, #9 = 01001); the optimal structure need no longer be #1.
As we know, assuming the incumbent chooses the optimal structure (whichever one it is), each 5-digit string corresponds to an expected payoff; e.g., if the optimal structure happens to be #2, the entrant gets an expected payoff of $\pi^E(S_{00010}, S_{01001})$ when it picks structure #9. It should be easy to see that by manipulating the demand parameters, cost function and activity probabilities, we can tune $\pi^E$ to any level of NK-complexity.³⁵ My model is sufficiently general to allow E to adopt a strategy $\sigma$ in which it enters and pursues an exploration strategy using an NK-style hill-climbing algorithm. Indeed, the model requires that E do exactly this whenever subjective rationality demands it. At the same time, because $\sigma$ must be optimal with respect to E's updated beliefs in every period, the implications of causal inference must also be respected. That is, to the extent causal inference rules out certain structures, these must be removed from the set upon which $\sigma$ searches. Therefore, my results complement the NK-studies by saying something about the likely "area of the landscape" upon which firms in a particular industry search (as well as how that area changes over time).

³³ Note that when $\alpha = 0$, E need not know anything about the mapping from action profiles to costs (or even profits) in order to imitate successfully.
³⁴ LRL06 is similarly careful; the interpretation of binary strings as "activity decisions" happens to arise naturally in their setting.
³⁵ At least to within an arbitrary margin of error.

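The interaction between the two learning modes can be caricatured in code. Below, a hypothetical payoff landscape over the 25 indexed structures (random numbers, purely illustrative, not the paper's $\pi^E$) is searched by one-bit-flip hill climbing, once over the full index set and once over a small subset standing in for the indistinguishability class that passive observation leaves behind; causal inference simply shrinks the domain the climber explores.

```python
import random

random.seed(1)

labels = [format(i, "05b") for i in range(1, 26)]   # structures #1..#25 as 5-bit strings
payoff = {s: random.random() for s in labels}        # stand-in payoff landscape

def neighbors(s):
    flips = (s[:i] + ("1" if s[i] == "0" else "0") + s[i + 1:] for i in range(5))
    return [n for n in flips if n in payoff]

def hill_climb(start, allowed):
    # Greedy one-bit-flip ascent restricted to the allowed set of structures.
    cur = start
    while True:
        nbrs = [n for n in neighbors(cur) if n in allowed]
        best = max(nbrs, key=payoff.get, default=cur)
        if payoff[best] <= payoff[cur]:
            return cur
        cur = best

pruned = set(labels[:6])        # stand-in for OI_1 after learning-by-observing
start = labels[0]
peak_full = hill_climb(start, set(labels))
peak_pruned = hill_climb(start, pruned)
```

Both searches only ever move uphill, so each local peak weakly improves on the starting structure; the pruned search never evaluates structures that observation has already ruled out.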

Viewed in this way, the theory implies that explorative and absorptive learning strategies should not be viewed as substitutes but as complements, each defeating a different type of learning barrier. My results demonstrate that passive learning (e.g., accumulating competitive intelligence) inevitably shrinks the search space upon which active learning operates. Conversely, active learning (e.g., attempts at de novo innovation via experimentation) creates new information that cannot help but refine a firm's understanding of how the world works, thereby improving its ability to decipher the behavior of an industry's superior performers. From a positive point of view, subjective rationality implies both of these approaches are constantly being weighed, with efforts of each type typically being applied contemporaneously. Thinking of firms as employing either an active or a passive imitation strategy is too narrow. When intrinsic ambiguity is low, more emphasis may be placed on the latter (only in the extreme case of zero ambiguity is exploration entirely uncalled for). When ambiguity is high, less is learned by observing, the benefits to exploration are greater and, as a result, we should observe more of it. This also implies some caution in inferring firm strategies from outcomes: innovative search on a landscape narrowed by causal induction can produce outcomes that – to the outside observer – look either more like imitation or more like innovation.

11  Conclusions

On the one hand, the preceding results confirm the conjecture that causal ambiguity may well play an important role in developing a capabilities-based advantage. On the other, its mere existence is not sufficient to ensure it. Given enough experimentation, an entrant eventually discovers the optimal operating policy. More patient firms place greater weight on the benefits of experimentation and, hence, are more likely to adopt exploration strategies. Firms with a fairly high level of confidence (lower subjective ambiguity) in their ability to imitate may also view the downside to doing so as sufficiently limited that they attempt it. This confidence may be misplaced. Even so, once the process of exploration begins, it may yet lead to success. My results highlight three dimensions that are important in analyzing the sustainability of capabilities-based advantages under causal ambiguity: 1) the intrinsic riskiness of entry associated with the optimal structure, 2) the intrinsic level of causal ambiguity of that structure, and 3) the accuracy of challenger beliefs with respect to the elements in this class. The first item is intimately related to the second. However, items 1 and 2 are not sufficient to guarantee a strong capabilities-based advantage – challengers' initial priors (item 3) also play a strong role. Sufficiently optimistic challengers always enter in the short term and may spend a long time experimenting, thereby reducing a strong capabilities-based advantage to, at best, a weak one. If the implied performance differences between causal structures are small, then picking a random structure in the equivalence class is almost as good as imitating the incumbent. As described above, the bias under such circumstances should be toward sustained entry. However, there is also little incentive to literally imitate the incumbent, especially if changing causal relations involves switching costs (not considered here).


Entry occurs and erodes the incumbent's profits, but the incremental return to entrants getting it exactly right is low. If imitation is risky (high intrinsic ambiguity coupled with high risk), the opposite dynamic is at work. That is, the risks of entry keep such activity low, but those firms that do enter are compelled to get it right. One of the more interesting findings, especially given the growing interest in the relationship of complexity to performance, is that denser causal relationships do not necessarily imply greater causal ambiguity. The fact that more interrelationships between observable operating variables may actually reveal a lot to potential imitators has implications. Much has been written, for example, about the relative performance advantages enjoyed by Southwest Airlines and the difficulty of its larger competitors in their attempts to imitate it. This is true even though its activities are simpler than its competitors' and fairly transparent (and, indeed, well documented). The organization is informal, there is no ticketing, routes are simple point-to-point, equipment is standardized, and teams are independent.³⁶ Each of these operational features implies either fewer influence relations between activities or greater difficulty in observing heterogeneity in outcomes (e.g., due to equipment standardization). These devices make strategic sense if they have the effect of simplifying operations in a way that increases the intrinsic ambiguity of Southwest's operations and, thereby, prevents imitation. Moreover, this argument does not rely upon any assumption of resource "stickiness" on the part of Southwest's competitors. Rather, the linkages adopted by Southwest may be sufficiently ambiguous, and the risk of experimentation sufficiently high, that imitation is foreclosed.
For similar reasons, small startups may be more difficult to imitate than larger, established firms – which may be one reason they tend to be good acquisition candidates (i.e., since this may be the only way to observe their hidden structure). Alternatively, outsourcing is a way to introduce independencies in observed operations. Done appropriately, this can actually increase the level of causal ambiguity connected to a firm's strategy. So-called "flat" organizations (a feature of Southwest) also push in the direction of fewer interrelationships. While it may be simple enough to create a flat organization, it may be quite difficult to do so successfully – even when observing the behavior of those who have.

Finally, let us speculate on some possible extensions of this theory. Throughout the analysis, it was assumed that the incumbent simply implemented the optimal structure. An obvious extension is to examine the incumbent as a strategic player. For example, it seems unlikely that a firm would adopt an easily imitated operating structure, even though highly efficient. What are the competitive trade-offs between operating performance and causal ambiguity? Answering this may improve our general understanding of when, e.g., the kinds of simplifying and decoupling devices seen at Southwest are likely to be implemented for strategic purposes.

It was also assumed that causal relationships between key, observable operating variables could be represented by directed acyclic graphs. When important operating variables are not observed, this assumption is no longer appropriate since correlations may be induced by hidden causes. In such cases, more general approaches must be used (e.g., "chain" graphs) to represent the relations observed by outsiders. Fortunately, the literature on probabilistic networks includes numerous approaches to this issue. Another significant assumption was that managers knew the causal implications of their operating plans. However, the strategy literature also raises the possibility that these implications may not be known, resulting in causal ambiguity with respect to one's own structure. Also, the actions of senior managers aim not only to implement an appropriate set of interrelationships between operating entities but also to affect the local behaviors of those entities (i.e., to influence the parameters that determine the empirical distribution). It may be worthwhile to extend the analysis presented here to these cases. Finally, much of the literature on probabilistic networks is concerned with the empirical exercise of estimating equivalence classes of causal structures from real-world history. This raises the interesting possibility of investigating the propositions presented here via empirical methods.

³⁶ Saloner et al. (2001, p. 67-71).


A  Glossary of Notation

$r$ : Number of activity centers
$a^i_{k,t} \in \{0,1\}$, $k = 1, \dots, r$ : The activity of activity center $k$
$a^i_t \equiv (a^i_{1,t}, \dots, a^i_{r,t})$ : Activity profile of firm $i$ in period $t$
$a^?, a^{best}, a^{worst}$ : Stay-out, cost-minimizing, cost-maximizing profiles
$A \equiv \{0,1\}^r$ : Set of all activity profiles
$c(a^i_t)$ : Constant marginal cost given activities $a^i_t$
$m(r)$ : Number of operating structures on $r$ activity centers
$\phi$ : Unavoidable fixed operating cost if E enters
$S$ : Set of all operating structures
$S_k \in S$ : The $k$th operating structure
$S^i_t = S_k$ : Firm $i$ chooses $S_k$ in period $t$
$S_?$ : Stay-out decision; $S^i_t = S_? \Rightarrow a^i_t = a^?$
$F_k$ : Set of distributions on $A$ faithful to $S_k$
$\mu_k \in F_k$ : True empirical distribution generated by $S_k$
$\pi^E_k$ : E's true expected profit in a period when $S_k$ is chosen
$OI_k \equiv \{S_l \mid \mu_l \in F_k\}$ : Observational indistinguishability class of $S_k$
LSI : Local structure identifier: $a_x \to a_y \leftarrow a_z$
$h_t \equiv (h_0; S^E_1, a^I_1, a^E_1; \dots; S^E_{t-1}, a^I_{t-1}, a^E_{t-1})$ : A period-$t$ history
$\hat\mu \equiv (\hat\mu_1, \dots, \hat\mu_m)$ : E's initial beliefs w.r.t. the true empirical distributions
$\hat\mu^I_t$ : E's period-$t$ belief w.r.t. I's empirical distribution
$\hat\rho_k$ : E's initial belief that $S_k$ is the efficient structure
$\hat\rho \equiv (\hat\rho_1, \dots, \hat\rho_m)$ : Profile of beliefs w.r.t. the efficient structure
$\sigma(S_k \mid h_t)$ : E's strategy: prob. choose $S_k$ given history $h_t$
$\hat\alpha(\hat\rho)$ : Subjective degree of causal ambiguity in $\hat\rho$
$\alpha \equiv |\ln(m_1)|$ : Intrinsic ambiguity of $S_1$, the efficient structure
$\psi \equiv \frac{m_1-1}{m_1}\phi - \frac{1}{m_1}\pi^E_1$ : Intrinsic risk of $S_1$

B  The propositions

B.1  Proof of Lemma 1

Suppose E stays out. Then, it only observes histories of the form $h_t = (h_0; S_?, a^I_1, a^?; \dots; S_?, a^I_t, a^?)$. In this case, E's assessments $\hat\mu^I_t$ depend only upon its initial priors and a stream of incumbent operating data $d_t \equiv (d_0, a^I_1, \dots, a^I_t)$, where $d_0 = h_0$. To indicate the dependence of $\hat\mu^I_t$ on a specific $d_t$, I write $\hat\mu^I_t(\cdot \mid d_t)$.

Assume E's priors are independent and Dirichlet distributed with positive integer parameters $\nu_1, \dots, \nu_z$, where $z \equiv 2^r$ is the total number of activity profiles in $A$, and let $\nu \equiv \sum_{l=1}^{z} \nu_l$. Then, for $a_l \in A$,
$$\hat\mu^I_0(a^I_1 = a_l \mid h_0) = \frac{\nu_l}{\nu}.$$
Given $d_t$, let $n_{lt}$ denote the number of times $a_l$ shows up as a component. Then, since E is Bayesian, it can be shown that
$$\hat\mu^I_t(a^I_{t+1} = a_l \mid d_t) = \frac{\nu_l + n_{lt}}{\nu + t}.$$
Consider $a_l \in A$. Define the sequence of random variables $Y_1, \dots, Y_t$ where $Y_j \equiv 1$ if $a^I_j = a_l$ and $0$ otherwise. Note that, given the Dirichlet assumption, $E(Y_j) = \mu_1(a_l)$ and $Var(Y_j) = \frac{\nu_l(\nu - \nu_l)}{\nu^2(\nu + 1)}$. Then, we can write $\hat\mu^I_1(a^I_2 = a_l \mid d_1), \hat\mu^I_2(a^I_3 = a_l \mid d_2), \dots$ as a sequence of random variables,
$$X_t \equiv \hat\mu^I_{t,l} = \frac{1}{\nu + t}\,(\nu_l + Y_1 + \dots + Y_t).$$
Therefore,
$$E(X_t) = \frac{1}{\nu + t}\Big(\nu_l + \sum_{j=1}^{t} E(Y_j)\Big) = \frac{\nu_l + t\,\mu_1(a_l)}{\nu + t}.$$
Since $Y_1, \dots, Y_t$ are independent,
$$Var(X_t) = \frac{1}{(\nu + t)^2}\sum_{j=1}^{t} Var(Y_j) = \frac{t\,\nu_l(\nu - \nu_l)}{(\nu + t)^2\,\nu^2\,(\nu + 1)}.$$
Note that the $X_t$'s are not identically distributed. Just the same, by the Chebyshev inequality, for all $\varepsilon > 0$,
$$P^{\mu_1}\big(|X_t - E(X_t)| \ge \varepsilon\big) \le \frac{Var(X_t)}{\varepsilon^2} = \frac{t\,\nu_l(\nu - \nu_l)}{\varepsilon^2\,(\nu + t)^2\,\nu^2\,(\nu + 1)},$$
where $P^{\mu_1}(\cdot)$ indicates the $\mu_1$-probability of an event. Thus, for all $\varepsilon > 0$,
$$P^{\mu_1}\big(|X_t - E(X_t)| < \varepsilon\big) \ge 1 - \beta(\varepsilon, t),$$
where $\beta(\varepsilon, t) \equiv \frac{t\,\nu_l(\nu - \nu_l)}{\varepsilon^2\,(\nu + t)^2\,\nu^2\,(\nu + 1)}$. Hence, for all $\varepsilon > 0$,
$$\lim_{t\to\infty} P^{\mu_1}\big(|X_t - E(X_t)| < \varepsilon\big) = 1.$$
Moreover, for all histories,
$$\lim_{t\to\infty} E(X_t) = \lim_{t\to\infty} \frac{\nu_l + t\,\mu_1(a_l)}{\nu + t} = \mu_1(a_l).$$
Therefore, $\hat\mu^I_{t,l}$ converges in probability to $E(X_t)$ which, in the limit, equals $\mu_1(a_l)$.

B.2  Proposition 1

This is a direct consequence of Lemma 1 and Theorem 1 (see Neapolitan 2004, p. 457-68, on Bayesian structure selection given large random samples).
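The rates in the proof of Lemma 1 can be sanity-checked numerically. The sketch below implements $E(X_t)$ and $Var(X_t)$ exactly as derived, with illustrative values for $\nu_l$, $\nu$ and $\mu_1(a_l)$ (my own choices), and confirms that the variance term vanishes while the mean approaches $\mu_1(a_l)$.

```python
def mean_Xt(t, nu_l, nu, mu_l):
    # E(X_t) = (nu_l + t * mu_1(a_l)) / (nu + t)
    return (nu_l + t * mu_l) / (nu + t)

def var_Xt(t, nu_l, nu):
    # Var(X_t) = t * nu_l * (nu - nu_l) / ((nu + t)^2 * nu^2 * (nu + 1))
    return t * nu_l * (nu - nu_l) / ((nu + t) ** 2 * nu ** 2 * (nu + 1))

nu_l, nu, mu_l = 2, 8, 0.3   # assumed Dirichlet parameters and true probability
```

Since $Var(X_t)$ behaves like $t/(\nu + t)^2$, it is eventually decreasing in $t$ and goes to zero, which is what drives the Chebyshev bound $\beta(\varepsilon, t)$ to zero.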

B.3  Proposition 2

B.3.1  Preliminaries

From E’s perspective, there are m primary states of the world, one corresponding to each structure Sk 2 S in

which S I = Sk ; i.e., the condition that Sk is the true e¢ cient structure (^t summarizes E’s subjective weights on these states each period). In addition, assume for each primary state, Sk = true, and each structure, Sj 2 S, E has priors that generate an associated empirical distribution. Speci…cally, let ^ k=true ( jht ) be j E’s assessment of the empirical distribution that E believes is generated by Sj conditional upon: (i ) Sk being the true e¢ cient structure and, (ii ) ht being the observed history. Assume these meet the faithfulness condition. As before, ^k (ht ) is the probability E assigns to Sk = true given history ht : Initial priors are indicated by h0 : Given these primitives, E’s assessment, given history ht ; that a 2 A will occur following a choice of Sj is ^ j (ajht )

m X

(ajht ) ^k (ht ) : ^ k=true j

k=1

The true empirical distributions are summarized by $\pi \equiv (\pi_1, \dots, \pi_m)$ and the subjective ones by $\hat{\pi} \equiv (\hat{\pi}_1, \dots, \hat{\pi}_m)$. E knows that staying out ($S^E_t = S_\varnothing$) results in the occurrence of $a_\varnothing$ (and zero profit) with certainty. Let $\Omega$ denote the set of infinite histories with typical element $\omega = \left( h_0, S^E_1, a^I_1, a^E_1, S^E_2, a^I_2, a^E_2, \dots \right)$. For notational convenience, let $h_t$ denote both a $t$-period history and the cylinder set defined by it. That is, each history $h_t$ is associated with a subset of infinite play paths (the cylinder set defined by $h_t$) such that $\omega \in h_t$ if the projection of $\omega$ into its first $3t + 1$ components is $h_t$. For $t < \infty$, let $H_t$ denote the set of all cylinders associated with $t$-length histories and $\mathcal{H}_t$ the $\sigma$-algebra generated by $H_t$. Then, $(\Omega, \mathcal{H})$ is the measure space in which $\mathcal{H}$ is the smallest $\sigma$-algebra generated by the finite-length cylinder sets.

Next, construct the true distribution on finite-length histories that is jointly implied by E's strategy $\sigma$, I's consistent use of $S_1$ and the true $\pi_k$'s. Since I's choice of structure is fixed and the true empirical distributions are primitives of the model, let $P^{\sigma,\pi}$ denote the probability distribution induced by E's choice of strategy $\sigma$ and the true empirical distributions associated with each structure. Following Kalai and Lehrer (1995, p. 146), construct $P^{\sigma,\pi}$ inductively. Start with $P^{\sigma,\pi}(h_0) \equiv 1$. Assuming $P^{\sigma,\pi}$ is defined for all period-$t$ histories, define it for $h_{t+1} = (h_t, S_k, a_i, a_j)$ by
$$P^{\sigma,\pi}(h_{t+1}) \equiv P^{\sigma,\pi}(h_t)\, \sigma(S_k \mid h_t)\, \pi_1(a_i)\, \pi_k(a_j). \tag{12}$$
Then, $(\Omega, \mathcal{H}, P^{\sigma,\pi})$ is a well-defined probability space ($P^{\sigma,\pi}$ here is the unique extension of (12) from the $H_t$'s to $\mathcal{H}$). Then, it is straightforward to construct the subjective probability space, $(\Omega, \mathcal{H}, P^{\sigma,\hat{\pi}})$. Once again, start by setting $P^{\sigma,\hat{\pi}}(h_0) \equiv 1$. Then, for $h_{t+1} = (h_t, S_k, a_i, a_j)$, define
$$P^{\sigma,\hat{\pi}}(h_{t+1}) \equiv P^{\sigma,\hat{\pi}}(h_t)\, \sigma(S_k \mid h_t)\, \pi_1(a_i)\, \hat{\pi}_k(a_j \mid h_t). \tag{13}$$
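The inductive construction in (12) can be sketched in code; the example below uses a hypothetical two-action, two-structure setup with a stub strategy $\sigma$, purely as an illustration of the recursion:

```python
from itertools import product

# Hypothetical primitives: I always uses S1 with output distribution pi_1;
# E's structure choice sigma is a degenerate stub that always selects S2.
pi = {"S1": {"a": 0.7, "b": 0.3}, "S2": {"a": 0.4, "b": 0.6}}

def sigma(history):
    return "S2"  # stub strategy: probability 1 on S2 at every history

def P(history):
    """P^{sigma,pi} of the cylinder set defined by a finite history, built
    inductively as in (12): P(h0) = 1, and each period multiplies by sigma's
    choice probability (degenerate here), pi_1(a_i), and pi_k(a_j)."""
    p = 1.0  # P(h0) = 1
    h = []
    for (Sk, ai, aj) in history:
        assert Sk == sigma(h)            # degenerate sigma: its choice has prob. 1
        p *= pi["S1"][ai] * pi[Sk][aj]   # I's action ~ pi_1, E's action ~ pi_k
        h.append((Sk, ai, aj))
    return p

# The one-period cylinders partition the space, so their masses sum to 1.
total = sum(P([("S2", ai, aj)]) for ai, aj in product("ab", repeat=2))
print(total)
```

The check that cylinder masses sum to one is exactly the consistency condition that lets (12) extend uniquely from the $H_t$'s to $\mathcal{H}$.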

For any specific infinite history $\omega \in \Omega$, the net present value of profit for E from period 1 on is
$$v(\omega) \equiv \sum_{t=1}^{\infty} \delta^{t-1}\, \pi^E\!\left( a^I_t(\omega), a^E_t(\omega) \right),$$
where $a^i_t(\omega)$ is the period-$t$ action taken by firm $i$ along history $\omega$. Thus, the expected net present value of strategy $\sigma$ under beliefs $\hat{\pi}$ is given by
$$V(\sigma, \hat{\pi}) \equiv \int_{\Omega} v(\omega)\, dP^{\sigma,\hat{\pi}}(\omega).$$
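The value expression is an ordinary discounted sum, which a short numeric sketch makes concrete (the discount factor and the profit stream below are illustrative assumptions, not the paper's calibration):

```python
# NPV of a hypothetical per-period profit stream for E:
#   v = sum_t delta^(t-1) * profit_t,
# truncated at a horizon long enough that the discounted tail is negligible.
delta = 0.9
profits = [(-1.0 if t < 3 else 2.0) for t in range(200)]  # losses while learning, then gains

v = sum(delta ** t * p for t, p in enumerate(profits))    # index t here equals t-1 above

# Closed form for comparison: -1 for the first three periods, 2 thereafter.
closed = -1 * (1 - delta ** 3) / (1 - delta) + 2 * delta ** 3 / (1 - delta)
print(round(v, 4), round(closed, 4))
```

The truncated sum agrees with the closed form to well within floating-point tolerance, since the discarded tail is of order $\delta^{200}$.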

E has objectively perfect knowledge if, for all $S_j, S_k \in \mathcal{S}$, $\hat{\pi}_j^{k=true} = \pi_j$. In this case, we say $\hat{\pi} = \pi$ and note that $V(\sigma, \pi)$ is then the actual present value associated with $\sigma$.

B.3.2 Main result

Assume $\bar{\pi}^E < 0$ and let $\hat{\pi}$ be arbitrary beliefs satisfying the preceding assumptions. Let $\sigma^\varnothing$ be the stay-out-forever strategy. Since E is subjectively rational, $\sigma^\varnothing$ maximizes $V(\sigma, \hat{\pi})$. By the premise, for all $S_k \in O_{I_1}$, $\hat{\pi}_k^{k=true}(\cdot \mid h_0) = \pi_1$. By the assumption that E knows the faithfulness condition, this implies
$$\sum_{S_k \in O_{I_1}} \hat{\mu}_k(h_0) = 1. \tag{14}$$
Let $\hat{\mu}_k(h_0) \in \max\left\{ \hat{\mu}_j(h_0) \mid S_j \in O_{I_1} \right\}$. By (14),
$$\hat{\mu}_k(h_0) \geq \frac{1}{m_1}. \tag{15}$$
The faithfulness condition implies, for all $S_j \in \mathcal{S}$, $\hat{\pi}_j^{l=true}(a_{worst} \mid h_0) < 1$. Therefore, E's initial subjective assessment, for all $S_j \in \mathcal{S}$, is
$$\hat{\pi}^E_j(S_j \neq true) = x_j,$$
where $x_j > 0$ and $\hat{\pi}^E_j(S_j \neq true)$ is the $P^{\sigma,\hat{\pi}}$-expected payoff in period 1 to choosing $S_j$ conditional upon $S_j \neq true$. Then, the subjective expected profit of choosing $S_k$ in period 1 is
$$\hat{\pi}^E_k = \hat{\mu}_k(h_0)\, \bar{\pi}^E_1 + \left( 1 - \hat{\mu}_k(h_0) \right) x_k.$$
Since $\hat{\mu}_k(h_0) \geq 1/m_1$ and $x_k > 0$, $\hat{\pi}^E_k > 0$. This immediately violates the assumption that $\sigma^\varnothing$ maximizes $V(\sigma, \hat{\pi})$. Therefore, E enters in period 1. Suppose $S_k \neq S_1$. Then $\pi_k \neq \pi_1$ and, with probability arbitrarily close to 1, in some period $t$, for some other $S_l \in O_{I_1}$, $\hat{\mu}_l(h_t) \in \max\left\{ \hat{\mu}_j(h_t) \mid S_j \in O_{I_1} \right\}$ and, because changing structures is costless, E adopts $S_l$ following the same logic as above. This continues until $S_1$ is eventually tried (which occurs $P^{\sigma,\pi}$-almost always). The actual sequence of choices is determined by the subjectively optimal $\sigma$, which selects structures using the Gittins index for multi-armed bandit problems (see Whittle, 1982, for a general discussion).
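The exploration dynamic in this argument — adopt the subjectively best structure in $O_{I_1}$, observe a payoff, reweight, and switch when beliefs shift — can be caricatured with a small simulation. The payoff numbers and the myopic multiplicative reweighting rule below are illustrative stand-ins for the Gittins-index policy, not an implementation of it:

```python
import random

random.seed(1)

# Hypothetical equivalence class O_I1 of three structures; only S1 is
# truly efficient (positive mean payoff). E repeatedly adopts the structure
# with the highest current weight, observes a noisy payoff, and updates.
mean_payoff = {"S1": 1.0, "S2": -0.5, "S3": -0.5}
weight = {"S1": 1.0, "S2": 1.0, "S3": 1.0}  # uniform initial weights

tried = set()
for _ in range(60):
    s = max(weight, key=weight.get)                 # adopt subjectively best structure
    tried.add(s)
    payoff = mean_payoff[s] + random.gauss(0, 0.5)  # noisy observed payoff
    # crude update: reward structures that pay, penalize those that don't
    weight[s] *= 1.1 if payoff > 0 else 0.9

print("S1" in tried, max(weight, key=weight.get))
```

As in the proof, unprofitable structures are abandoned and the efficient one is eventually tried and retained; the simulation only dramatizes that logic, while the paper's entrant solves the bandit problem optimally.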

References

[1] Al-Najjar, N. I., L. Anderlini, and L. Felli. 2006. Undescribable events. Review of Economic Studies, 73(4): 849-868.
[2] Barney, J. 1991. Firm resources and sustained competitive advantage. Journal of Management, 17(1): 99-120.
[3] Besanko, D., U. Doraszelski, Y. Kryukov, and M. Satterthwaite. 2007. Learning-by-doing, organizational forgetting, and industry dynamics. Harvard Institute of Economic Research Discussion Paper No. 2128. Available at SSRN: http://ssrn.com/abstract=962878.
[4] Besanko, D., D. Dranove, and M. Shanley. 1997. Economics of Strategy. John Wiley & Sons, Inc.: New York.
[5] Cohen, W. M., and D. A. Levinthal. 1990. Absorptive capacity: a new perspective on learning and innovation. Administrative Science Quarterly, 35: 128-152.
[6] Collis, D. J., and C. A. Montgomery. 1998. Corporate Strategy: A Resource-Based Approach. Boston: McGraw-Hill.
[7] Cowell, R. G., A. P. Dawid, S. L. Lauritzen, and D. J. Spiegelhalter. 1999. Probabilistic Networks and Expert Systems. Springer, New York.
[8] Dierickx, I., and K. Cool. 1989. Asset stock accumulation and sustainability of competitive advantage. Management Science, 35: 1504-1511.
[9] Edwards, D. 1995. Introduction to Graphical Modelling. Springer-Verlag, New York.
[10] Einhorn, H. J., and R. M. Hogarth. 1986. Decision making under ambiguity. Journal of Business, 59: S225-50.
[11] Ghemawat, P. 1985. Building strategy on the experience curve. Harvard Business Review, 63, March-April, 143-49.
[12] Ghemawat, P., and D. Levinthal. 2000. Choice structures and business strategy. Working paper, Harvard Business School.
[13] Golan, A., G. Judge, and D. Miller. 1996. Maximum Entropy Econometrics: Robust Estimation with Limited Data. New York: John Wiley & Sons.
[14] Grant, R. M. 2002. Contemporary Strategy Analysis: Concepts, Techniques, Applications. Blackwell: Malden.
[15] Jensen, F. V. 1996. An Introduction to Bayesian Networks. Springer, New York.
[16] — — 2001. Bayesian Networks and Decision Graphs. Springer, New York.
[17] Kalai, E., and E. Lehrer. 1993. Rational learning leads to Nash equilibrium. Econometrica, 61(5): 1019-1045.
[18] Kauffman, S. A. 1993. The Origins of Order: Self-Organization and Selection in Evolution. Oxford University Press, Oxford, U.K.
[19] King, A. W., and C. P. Zeithaml. 2001. Competencies and firm performance: examining the causal ambiguity paradox. Strategic Management Journal, 22: 75-99.
[20] Korb, K. B., and A. E. Nicholson. 2004. Bayesian Artificial Intelligence. Chapman and Hall/CRC, Boca Raton.
[21] Lieberman, M. 1984. The learning curve and pricing in the chemical processing industries. RAND Journal of Economics, 15(2), Summer.
[22] Lenox, M. J., S. R. Rockart, and A. Y. Lewin. 2006. Interdependency, competition, and the distribution of firm and industry profits. Management Science, 52: 757-72.
[23] — — 2006. Interdependencies, competition and industry dynamics. Forthcoming, Management Science.
[24] Levinthal, D. A. 1997. Adaptation on rugged landscapes. Management Science, 43: 934-50.
[25] Levinthal, D. A., and J. G. March. 1993. The myopia of learning. Strategic Management Journal, 14: 95-112.
[26] Lippman, S. A., and R. P. Rumelt. 1982. Uncertain imitability: an analysis of interfirm differences in efficiency under competition. Bell Journal of Economics, 13(3): 418-38.
[27] MacDonald, G., and M. D. Ryall. 2004. How do value creation and competition determine whether a firm appropriates value? Management Science, 50(10): 1319-33.
[28] MacDonald, G., and M. D. Ryall. 2006. Do new competitors, new customers, new suppliers,... sustain, destroy or create competitive advantage? Working paper, Melbourne Business School.
[29] March, J. 1991. Exploration and exploitation in organizational learning. Organization Science, 2: 71-87.
[30] Mayer, R. C., and M. B. Gavin. 2005. Trust in management and performance: who minds the shop while the employees watch the boss? Academy of Management Journal, 48(5): 874-888.
[31] Meek, C. 1995. Strong completeness and faithfulness in Bayesian networks. In Besnard, P., and S. Hanks (eds.), Uncertainty in Artificial Intelligence: Proceedings of the Eleventh Conference. Morgan Kaufmann, San Mateo, CA.
[32] Moody, F. 1995. I Sing the Body Electronic: A Year with Microsoft on the Multimedia Frontier. New York: Penguin Books.
[33] Mosakowski, E. 1997. Strategy making under causal ambiguity: conceptual issues and empirical evidence. Organization Science, 8(4).
[34] Nelson, R., and S. Winter. 1982. An Evolutionary Theory of Economic Change. Belknap, Cambridge, MA.
[35] Neapolitan, R. E. 2004. Learning Bayesian Networks. Pearson/Prentice Hall: Upper Saddle River.
[36] Pearl, J. 1988. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. North Holland, Amsterdam.
[37] — — 2000. Causality: Models, Reasoning, and Inference. Cambridge University Press, Cambridge.
[38] Peteraf, M. A. 1993. The cornerstones of competitive advantage: a resource-based view. Strategic Management Journal, 14: 179-191.
[39] Porter, M. E. 1980. Competitive Strategy. Free Press, New York.
[40] — — 1996. What is strategy? Harvard Business Review, November-December.
[41] Reed, R., and R. J. DeFillippi. 1990. Causal ambiguity, barriers to imitation, and sustainable competitive advantage. Academy of Management Review, 15(1): 88-102.
[42] Rivkin, J. W. 2000. Imitation of complex strategies. Management Science, 46(6): 824-44.
[43] Ryall, M. D. 2003. Subjective rationality, self-confirming equilibrium and corporate strategy. Management Science, 49(7): 936-49.
[44] Saloner, G., A. Shepard, and J. Podolny. 2001. Strategic Management. New York: John Wiley & Sons, Inc.
[45] Spirtes, P., C. Glymour, and R. Scheines. 2000. Causation, Prediction and Search. The MIT Press, Cambridge.
[46] Tirole, J. 1988. The Theory of Industrial Organization. The MIT Press, Cambridge.
[47] Verma, T. S., and J. Pearl. 1990. Equivalence and synthesis of causal models. In Proceedings of the 6th Conference on Uncertainty in Artificial Intelligence, Cambridge, 220-7. Reprinted in Bonissone, P., M. Henrion, L. N. Kanal, and J. F. Lemmer (eds.), Uncertainty in Artificial Intelligence, vol. 6, 255-68.
[48] Whittle, P. 1982. Optimization Over Time, Vol. 1. New York: Wiley.
