We Don't Know Why She Swallowed the Fly: Policy and Path Dependence

Stephen E. Margolis and Stan Liebowitz

I know an old lady who swallowed a fly,

I don't know why she swallowed the fly.

Perhaps she'll die.

Welcome to the world of path dependence, a world governed not by our stars, not by ourselves, but by insignificant accidents of history. In this unpredictable world, small, seemingly inconsequential decisions lead inexorably to uncontrollable consequences. Ingestion of a fly leads an old lady to swallow a spider, a bird, a cat, a dog, a goat, a cow, and then, tragically, a horse. A typewriter keyboard arrangement that solves a temporary mechanical problem on the first typewriter becomes entrenched as the standard for generations to come, even though it is notoriously inefficient. A headstart for one computer operating system ensures its persistence, even against superior alternatives. In the world of path dependence, because individual decisions that may seem individually inconsequential or privately rational lead to large, lingering, and widely felt consequences, our expectations for market outcomes are turned upside down. The Invisible Hand does not work in the world of path dependence. Or so it is claimed.

Path dependence is the application to economic systems of an intellectual movement that has lately come into fashion in several academic disciplines. In physics and mathematics the related idea is called chaos -- sensitive dependence on initial conditions. As chaos theory has it, a hurricane off the coast of Florida may be the fault of a butterfly flapping its wings in the Sahara. In biology, the related idea is called contingency -- the irreversible character of natural selection. Contingency implies that fitness is only a relative notion: Survival is not of the fittest possible, but only of the fittest that happen to be around at the time.

Scientific popularizations like James Gleick's book Chaos and Mitchell Waldrop's Complexity have moved these ideas into the public view. In Wonderful Life, Stephen J. Gould applies this intellectual revolution to paleontology. And now the ideas are percolating into the popular culture. In the movie Jurassic Park, Jeff Goldblum plays a character identified as a "chaotician." He warns that brewing up a few dinosaurs might lead to a situation that cannot be controlled. Small things, he tells us ominously, lead inexorably to big ones, maybe even disastrous ones.

The first explicit application of this concept to economics is credited to Brian Arthur, who warns of the danger of "lock-in by insignificant historical events." Paul Krugman captures the idea in a succinct definition of path dependence: "the powerful role of historical accident in determining the shape of the economy." On the face of it, the arguments of the advocates of path dependence have tremendous appeal. After all, it is hard to deny that if the right accident occurred, such as a meltdown of a nuclear reactor, it might have considerable consequences. But the type of accident that is envisioned in the path dependence literature is not of this magnitude. Rather, it is of the insignificant kind: a headstart, a quirky choice, a passing condition. Furthermore, lock-in has a very specific meaning in this literature. In some sense, of course, we are all locked in -- if only to eating, breathing, and remaining in our solar system. But this is not what lock-in means in this literature. In path dependence, getting "locked-in" means having to accept inferior standards or products even though superior alternatives exist, even though it is known that these superior alternatives exist, and even though the costs of switching are not high.

What is especially important is that for Arthur, Krugman, and their ilk, path dependence is no oddity: It's a likely phenomenon that can affect choices of technologies, networks, standards, industrial location, or almost any arrangement that might exhibit what economists call increasing returns to scale. And according to these authors, increasing returns, which is economist-speak for saying that bigger is better when it comes to the production of goods and services, is a common phenomenon -- one that is found in virtually all high-tech industries.

It's only a small step from there to sweeping policy prescription. In general discussions, path dependence arguments have been used to support active management of trade and every sort of industrial policy. Taking one specific case, path dependence arguments have been used to support antitrust actions against Microsoft. A white paper by Gary Reback and a group of coauthors, including Brian Arthur, uses path dependence arguments to claim that Microsoft's successes in the personal computer software market are due not to Microsoft's ability to provide consumers with handy solutions to their problems, but instead are caused by consumers' inabilities to escape from a path controlled by Microsoft. Reback et al. chillingly portray the ominous end of that path: "It is difficult to imagine that in an open society such as this one, with multiple information sources, a single company could seize sufficient control of information transmission so as to constitute a threat to the underpinnings of a free society. But such a scenario is a realistic (and perhaps probable) outcome." Lock the doors, it might be Bob, or Windows 95, come to enslave us.

With no less than the underpinnings of the free society supposedly at stake, this new theory of market failure certainly deserves a closer look. We start with the path dependence literature's paradigmatic example, which both illustrates the theory and provides its rather questionable empirical foundation. After that we return to the theory itself.

Typewriter Keyboards and Other Fables

Paul Krugman's new book, Peddling Prosperity, is a popularization of economic ideas that bear on public policy issues. He titles his chapter on path dependence "The Economics of QWERTY," referring to the widely reported story of the typewriter keyboard. The title is appropriate. Despite the claims for the prevalence of path dependence, there is no other example that has been able to convince so many people of the significance of path dependence. Understandably, the theoretical literature of path dependence is littered with references to the QWERTY typewriter keyboard.

Here's how the path dependence folks tell the story: In 1867 Christopher Latham Sholes obtained a patent on the typewriter. In developing the typewriter, Sholes and his associates had one persistent problem: The hammers that put the letters on the page tended to jam. Sholes solved this problem by arranging the keyboard so as to slow down the typist, thus reducing the frequency of jamming.

Sholes sold the rights to the typewriter to the Remington company, which did some development work of its own and then began production. By the 1880's typewriters were still not quite a standard for business writing, but they were becoming familiar and of some interest to the general public. Demonstrations of typing speed were a source of public entertainment. In 1888, a well-publicized contest was held in Cincinnati that pitted Louis Taub, who had been traveling in the east and billing himself as the world's fastest typist, against Francis McGurrin, a typist from Salt Lake City. Taub used a rival machine with a rival keyboard arrangement, the Caligraph, and a hunt-and-peck method of typing. McGurrin used a Remington and had memorized the keyboard. He was, arguably, the world's first touch typist. McGurrin won the contest, going away.

Word went out that memorizing the keyboard on a Remington machine was the way to go. Typing schools were soon converted and started teaching ten-finger typing on Remington machines. QWERTY was established. In the meantime, the mechanical considerations that led to the QWERTY arrangement had largely been eliminated. They would be completely eliminated with the advent of electric typewriters and computer keyboards.

But it was too late. The solution to a hammer-jamming problem led inexorably to an inefficient standard for a new technology. We're still stuck with QWERTY, locked in by accidental and insignificant historical events such as a typing contest in Cincinnati and the desultory choice of a keyboard to memorize. And, the path dependence advocates add, here's the real tragedy of it: We have known of a much better keyboard design for over half a century. In 1936, August Dvorak announced his Dvorak Simplified Keyboard. Dvorak designed his keyboard to minimize finger movement, to keep the hands on the home row as much as possible, to load the right and left hands about equally, and to keep most of the load on the stronger fingers. The Dvorak keyboard is easier to learn, allows faster typing and fewer errors, and lowers stress. A study by the U.S. Navy found that the investment in retraining a typist on the Dvorak keyboard would be fully repaid ten days after the start of training! Other evidence indicates that the advantage in typing speed on the Dvorak typewriter is between 20 and 40 percent.

So why wasn't the Dvorak keyboard widely adopted? Path dependence, of course. Since compatibility is supposed to be of great importance to typists, a particular keyboard design increases in value as more people use it, providing a kind of increasing returns. Because of an accident, QWERTY came first, which established the path to which we are now locked-in. No one learns to use the Dvorak keyboard because there are so few Dvorak typewriters, and there are so few Dvorak typewriters because no one learns to use the Dvorak keyboard. It is, as some authors have written, a failure of decentralized decision making, i.e. markets. This failure to use a different keyboard design, reports a recent Fortune magazine article, results in "billions of dollars in lost productivity."

There you have it: The foundation case for path dependence, the story that's told over and over again to illustrate the phenomenon. It's an appealing yarn. It has everything: an engaging premise, a puzzling conflict, and a resolution just perverse enough to feed our 1990s cynicism. Unfortunately, it is false. Virtually every bit of it.

In the first place, and contrary to the most persistent claim, the QWERTY arrangement was not designed to slow down touch typists. The QWERTY arrangement solved the jamming problem not by addressing speed, but rather by addressing sequencing. Pairs of keys that are frequently struck in succession were placed as far from each other as possible. This arrangement made the paths of the hammers that were apt to be used in close succession as different as possible, making them less likely to interfere with each other, but it didn't necessarily have anything to do with speed.

Furthermore, the claims for the superiority of QWERTY's chief modern rival, the Dvorak keyboard, which the path dependence people have accepted on their face, are simply not true. Most of the specific claims for Dvorak's keyboard originate in studies that Dvorak conducted himself. Dvorak's experiments consisted primarily of comparing the performance of groups of students learning his keyboard, with the performance of other groups of students, of different ages, at other schools, with different training regimens, learning QWERTY. Dvorak's reports on his successes are actually fairly entertaining, with the same sense of mission and enthusiasm as a late-night infomercial on cable TV. But nobody would call them science. More orthodox experimental studies have repeatedly found little or no advantage for the Dvorak arrangement. An influential study conducted in 1956 by Penn State Professor Earle Strong for the U.S. General Services Administration concluded that the investment in retraining in the Dvorak keyboard would never be repaid. Ergonomic studies, which use simulations of typing movements, have found little or no advantage for the Dvorak arrangement over QWERTY. In fact, these studies suggest that one key to an ergonomically effective keyboard arrangement is to alternate the two hands as much as possible. Thus, the QWERTY keyboard, which tends to space successive characters far apart from one another, turns out to be ergonomically sound.

And finally, the Navy study on the potential return to investment in Dvorak training isn't quite as convincing as one might suppose. We tried to find this study as a government document, in our own libraries, and then many, many other libraries including the Naval Library and the Library of Congress. No luck. We finally found what we take to be the study (though it is mentioned in the path dependence literature, it is never fully cited) in a private collection. Although it does carry the markings of declassified wartime paperwork, it is not an official U.S. Navy study. There is no indication that it was ever commissioned, reviewed, or accepted by the Navy. It was, however, conducted under the supervision of a Navy lieutenant commander who could bring his own keyboards to the playground: one Lieutenant Commander August Dvorak. Experimenter bias is not the least bit hard to find in the document -- the study's results are clearly exaggerated, if not outright fudged. Because there is no evidence that the path dependence advocates ever found this document, it is not entirely surprising that this experimenter bias was overlooked.

But problems with method and documentation can be seen as a mere academic quibble. Let's look at the big issues, the ones that should have made researchers skeptical. If training pays for itself ten days from its start, it yields an annual return of over 2,000 percent. Surely this kind of opportunity would be profitable for firms that use large numbers of typists. Yet rarely has anyone ever made the switch -- even today, when software to remap computer keyboards allows easy conversion to the Dvorak keyboard. How many path dependence researchers, we wonder, have made this easy, painless switch?
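The arithmetic behind a figure of that size can be checked with a rough sketch. It assumes a simple, non-compounded return and a working year of about 250 days; both are illustrative assumptions, not numbers taken from the Navy document, and the function name is hypothetical.

```python
# Illustrative sketch: simple annual return implied by a ten-day payback.
# Assumptions (not from the study): no compounding, and the benefit keeps
# arriving at the same daily rate for the rest of the year.

def simple_annual_return_pct(payback_days: float, days_per_year: float) -> float:
    """Net return per year, in percent, when the outlay is recovered every payback_days."""
    paybacks_per_year = days_per_year / payback_days
    # Subtract the one payback that merely recovers the original outlay.
    return (paybacks_per_year - 1.0) * 100.0

working_year = simple_annual_return_pct(10, 250)   # about 250 working days
calendar_year = simple_annual_return_pct(10, 365)  # full calendar year

print(working_year, calendar_year)
```

Under either assumption about the length of the year, the implied return comfortably exceeds 2,000 percent, which is the point of the comparison: an opportunity that large would be hard for employers of typists to ignore.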

The fable also leaves out other important details. It turns out that the supposedly decisive Cincinnati contest was not unique. There were other contests, with results reported in major newspapers, in the same year as the Cincinnati contest. Some were won by typists using Caligraph machines, others using Remingtons. There were also several manufacturers in competition with Remington, marketing their machines with claims that their alternative keyboard arrangements were easier to use. One of these machines, the Hammond, has notable similarity to the Dvorak keyboard. This competition makes all the difference in the world for path dependence. Our current keyboard comes to us not as a happenstance choice but rather as the winner in a battle of different keyboard designs. So it is not all that surprising to find that QWERTY is a fairly efficient keyboard.

It is hard to overstate the significance of the typewriter keyboard example for the path dependence literature. Krugman has it right: This movement in economic policy is the "economics of QWERTY." The received story has great appeal -- it sounds like it ought to be true -- and economists, policy wonks, editorialists, and writers of white papers continue to tell it. Our debunking of the myth was first published in 1990, and the inconvenient truth has been whispered around since, but the received fable of the keys has proven to have great staying power. Doubtless the path dependence folks would now greet an alternate real-world example of detrimental path dependence with great enthusiasm, but so far, and despite the claims of the prevalence of the phenomenon, no convincing examples have come to the fore. Some have been proposed. One writer claims that the government backed the wrong kind of nuclear reactor. This may or may not be true, but it says little about the performance of markets, and surely it is a dog-bites-man story even if it is true.

Another writer argues that in the battle between AC and DC, although we got the right system, we barely dodged a bullet. The failure of the DC system was supposedly just an odd outcome that had nothing to do with the inferiority of DC transmission, but rather with the fact that Edison's DC power companies were having financial difficulties not shared by any of the competing AC systems. All of that might be interesting, but surely you can't prove a market failure by pointing to a correct outcome.

A number of other writers have claimed that Beta-format videorecorders were clearly the superior technology, but that an early lead for the VHS format was an insurmountable hurdle for the otherwise preferable Beta. This gets things wrong in a couple of ways. First, Beta was on the market first, and had a headstart of almost two years before VHS sold any machines at all -- so it was Beta, not VHS, that had the historical edge. Sony, the creator of Beta, can hardly be portrayed as a weakling unable to capitalize on a superior mousetrap. Furthermore, the two formats actually used the same technologies since Sony and JVC had jointly produced a prior machine, had a patent-sharing agreement, and even had discussed collaborating on the Betamax. In fact, Sony managers and engineers felt that JVC had copied the Betamax when JVC produced the VHS design: The only serious format differences reflected differing priorities regarding compactness versus playing time. Apparently being able to record an entire movie on one tape was more important to consumers than a smaller cassette.

Other claims of real-world path dependence can only be described as fanciful. Picking up on current environmental enthusiasms, some authors have argued that if early automobile producers had chosen electrical power, electrical-propulsion systems today would be as good as internal combustion engines. Never mind that even with all of the applications of motors and batteries in the century since, and with all the advantages of digital electronic-power management systems, the most advanced electric automobiles that anyone has been able to make do not yet equal the state of the art in internal-combustion automobiles as of the early nineteen-twenties. Never mind that electric automobiles actually were commercially viable in the early stages of the industry, and that electric power has been viable the entire time in the nearby technologies of smaller industrial and recreational vehicles.

Steam power is another lamented missed opportunity in this line of utopian speculation. Never mind that in the applications in which steam has been dominant, railroads and oceangoing ships, it has gradually been eclipsed by diesel, electric, and hybrid designs. The claim that is made in these cases is that we can never know what breakthroughs might have occurred in these technologies if only they had been explored further. Such arguments are hard to disprove, but they're also hard to take seriously.

There you have it. Theory tells us that path dependence is a phenomenon that is likely to afflict choices of technologies, standards, the location of industries, and so forth. Path dependence supposedly describes the technologies of modern life. One would expect an embarrassment of rich examples -- but for some reason no one can come up with any. The only examples of the phenomenon that have been presented seem to be either fictitious stories or pure conjecture.

How can this be? Perhaps the theory behind all this is itself deficient.

Increasing Returns and Path Dependence

For a firm, a condition of increasing returns means that bigger is better -- the firm can produce goods at lower cost as its own output increases. Increasing returns can also be understood to occur when products become more valuable to each consumer as more consumers use the product. So, for example, in a network like the telephone system, the advantage of having a phone increases as more people get phones. This condition has been identified as a "network effect" (sometimes incorrectly called a "network externality"). For a technology, the payoffs to a user may increase as the number of other users of that technology increases. For typewriter keyboards, video recorders, microprocessors, or word processors, the advantage of using a particular design seems to increase with the number of users of that design.

Arthur, Krugman, and some others argue that markets, which might have done fine at organizing the manufacture of old-technology things like steel and sailing ships, are ill suited to organizing the manufacture of new-technology items like silicon chips and software because of these network effects.

There are problems with both parts of this argument. First let's look at the claim that new-tech goods lead to a different kind of market than old-tech goods. It's true that the past decades have seen great declines in the costs of computers, fax machines, and videotape machines, to name just a few examples. At the same time, production of these items has grown enormously. But much of this cost saving may be the result of advances in know-how, including the general state of technology, rather than an economy of scale per se. That is to say, a decline in cost is likely due to advances in technology and not increases in scale. In fact, the increase in sales is probably due to this decrease in costs, rather than the other way around. So the apparent economies of scale for firms in new-technology industries may not be a permanent condition, if they exist at all.

What is probably more important is that even where there are increasing returns, the theoretical argument for lock-in does not withstand careful scrutiny. Table 1 is reproduced from a 1989 paper by Brian Arthur that is often credited with starting the whole discussion. The table is the basis for an exercise that seemingly demonstrates the likelihood of unsatisfactory lock-in.

The story of the table is that there are two technologies that are in competition with each other. A would-be adopter arrives on the scene and chooses between technologies A and B. Assume for now that an adopter receives a payoff (value), as shown in the table, that is determined by the number of prior adopters of a given technology. So, for example, if there are twenty-one adopters of technology A, each would enjoy a payoff of 12. The payoffs increase with the number of adopters, consistent with models of increasing returns.

Arthur uses the table to illustrate the likelihood of undesirable lock-in. The first adopter on the scene, choosing between a payoff of 10 with technology A and a payoff of 4 with technology B, would be expected to choose technology A. The arrival of subsequent adopters will only serve to reinforce the advantage of choosing A. But notice that if the eventual number of adopters is large enough, technology B would yield greater returns. But the choices of individual adopters will lock us in to technology A.
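The mechanics of that argument can be sketched in a few lines. This is a minimal illustration, not Arthur's own model: the payoff line for A matches the figures quoted above (10 with no prior adopters, 12 with twenty), and B starts at 4 as in the text, but B's slope and all function names are assumptions chosen for the example.

```python
# Illustrative sketch of myopic, one-at-a-time adoption under increasing returns.
# The payoff to an adopter depends on the number of PRIOR adopters of the
# same technology. B's slope (0.3 per prior adopter) is an assumption.

def payoff_a(prior: int) -> float:
    return 10 + prior / 10

def payoff_b(prior: int) -> float:
    return 4 + 3 * prior / 10

def myopic_adoption(n_adopters: int) -> tuple[int, int]:
    """Each arrival picks whichever technology pays more right now."""
    count_a = count_b = 0
    for _ in range(n_adopters):
        if payoff_a(count_a) >= payoff_b(count_b):
            count_a += 1
        else:
            count_b += 1
    return count_a, count_b

adopters = 101
count_a, count_b = myopic_adoption(adopters)

# Payoff each adopter ends up with if everyone is on one technology.
final_if_all_a = payoff_a(adopters - 1)
final_if_all_b = payoff_b(adopters - 1)

print(count_a, count_b, final_if_all_a, final_if_all_b)
```

With these assumed numbers, every arrival chooses A because B never gets a first adopter, even though a population of this size would have done better on B. The sketch reproduces the story exactly as the table tells it, which is also its limitation: the adopters in it look only at the current payoffs and nobody owns or promotes either technology.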

Arthur's story of lock-in is simple -- deceptively so. If we look at the table alone, it seems unavoidable that individuals' choices will lead to an irreversible choice of technology A and it seems undeniable that A is an unfortunate choice where the number of eventual adopters is large. The first adopter would rather have 10 than 4, and so would anyone else. We are locked-in; the market fails. Each agent acts rationally, given the payoffs in the table, but as a group we end up with less than we might have had. Perhaps, the argument goes, we need the government to protect consumers from themselves.

But what is lacking from the table and is also lacking in the great outpouring of abstract modeling of path dependency, is an appreciation of both the variety of steps that people take to avoid such harms, and the restrictive conditions assumed in the table.

These analyses make the common mistake of assuming that market organization and perfect decentralization are, or ought to be, the same thing. Imagine for a moment that each of these technologies is owned, perhaps through patent or copyright. In that case, if the number of potential adopters is large, the owner of technology B would have a significant incentive to establish B as the technology of choice. Just as the owner of especially productive land is expected to capture the value of its advantages, the owner of a technology would be expected to capture the advantages that it offers over the next best alternatives. Given that, it is worthwhile for the owner of technology B to cut prices for early adopters or provide other incentives to induce adoptions of B. While the owner of A will have similar incentives, the total wealth potential of technology B is greater, so B would be able to offer greater incentives to become the technology of choice, under the assumption that B is the technology capable of yielding greater total benefits. Alternatively, if the technology is not owned, it would pay all would-be adopters to enter agreements to adopt the preferred technology.

More generally, the inefficiency that seems inescapable in the table is a profit opportunity for someone who can figure out the means to move the outcome from A to B and appropriate the difference. Such entrepreneurship can take various forms, some of which are familiar. Where a technology is not patentable or otherwise ownable, a firm may be able to create a format or a variant of the technology that is. Firms can advertise, they can lease out the goods that implement the technology, they can enter strategic alliances. On the consumer side, a large user of a technology may be able to profit from technology B regardless of the choices of other users. For example, large firms with numerous typists would have switched to Dvorak if Dvorak really were such a superior design.

Not only does this model reduce producers to the role of mere spectators, but it assumes that consumers have no foresight. For if consumers were aware of the entire table, all that is required to prevent lock-in to an inferior alternative is that adopters can make reasonable forecasts of the number of eventual adopters. If, for example, early adopters know that they will be joined by 100 more, they will see that everyone will be better off with technology B. The latecomers will see it that way too, and the earlycomers know it.
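A rough sketch shows how little foresight is needed. It assumes illustrative linear payoffs (A starts at 10 and B at 4, as in the text; the slopes and the function names are assumptions), and it assumes each adopter simply compares the payoffs that would prevail once the expected number of others has joined.

```python
# Illustrative sketch: an adopter who forecasts the eventual number of
# other adopters and compares payoffs at that size, rather than today's.
# Slopes are assumptions chosen so that the payoff lines cross.

def payoff_a(others: int) -> float:
    return 10 + others / 10

def payoff_b(others: int) -> float:
    return 4 + 3 * others / 10

def foresighted_choice(expected_others: int) -> str:
    """Pick the technology that pays more once expected_others have joined."""
    if payoff_b(expected_others) > payoff_a(expected_others):
        return "B"
    return "A"

small_market = foresighted_choice(10)   # few eventual adopters expected
large_market = foresighted_choice(100)  # joined by 100 more, as in the text

print(small_market, large_market)
```

Under these assumptions an adopter expecting only a handful of others still picks A, which is the right choice for a small market, while one expecting 100 more picks B. The forecast does not have to be precise; it only has to land on the correct side of the point where the two payoff lines cross.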

The kind of foresight that we are talking about here is not the stuff of gifted visionaries. It is actually pretty ordinary. It led you to buy an FM radio in the early sixties. It led you to buy service for eight when you were a newlywed, even if you only knew one other couple in town. And in 1990, it led you to buy a Windows-capable computer because it was the coming thing, even if you were only using DOS-based software at the time.

There is another special aspect to this table that is easy to overlook. The paths of returns in the table must "cross." As demonstrated in figure 1, the slopes of the payoff lines must differ, with the slope of B being steeper than the slope of A. Such a cross presumably requires that the network effects or scale economies of production for technology B are much stronger than for technology A. Otherwise we cannot be misled into choices that we will regret.

But go back to the examples put forward by advocates of path dependence. QWERTY and Dvorak don't differ in terms of network effects or production costs. The cost of producing typewriters is not affected by the alignment of keys. Nor does the value of having other typists knowing the same design (and thus making keyboards more interchangeable) differ for the two types of keyboards. Similarly for videorecorders, the advantage that comes with increased availability of rental tapes is unlikely to differ between formats. Thus the conditions needed for lock-in are very special, and are not consistent with the examples that have been put forward.

Finally, to ward off confusion, we must note there is a commonplace type of lock-in that is of little interest to the path dependence literature because it does not give rise to inefficiency. In the simplest sense, path dependence might be thought of as the mere passage of durable goods through time: What we have today depends upon some of the things that we did yesterday. People deal with durability even in the most ordinary aspects of their lives. Any rational approach to durable commitments must invoke some set of beliefs about the future. Individuals probably would not know all of the numbers in the table, nor do they know the eventual number of adopters of either technology with certainty. They would have some expectations, some guesses about the future of computing or their dinner guests two years down the road. But they would not know the table with certainty. The day may come when, looking back, individuals regret the choices that they made in the past. They may even be locked in to their choices in the sense that they do not find it worthwhile to buy a new computer again so soon after their previous purchases. They may regret their choice of word processor, or the job they took.

Such regrets are common. Their very abundance may help to explain the uncritical acceptance that path dependence and lock-in have received, since it is easy to confuse the one type of path dependence with the other. But these are ex post regrets. They are caused by imperfect information about the future, and not the inability to choose. Maybe if we could go back in time, knowing what we now know, we would buy the other word processor. But this is a problem of prediction, not of coordination. This is not Arthur's story of lock-in.

In Arthur's story, individuals regret purchasing the same word processor that everyone else bought because they cannot get everyone else to buy a better one that is available. The more common regret of individuals is that they did not know the best product at the time that they had to choose. If there is a systematic tendency to err, it stems from systematic misinformation about the future. That problem is not one that is inherent in the decision mechanism, or one that uniquely follows from increasing returns, but rather is one of information. So long as there is no one who knows the payoffs to technologies, we are cast back to the usual problem of regulation: Can governments know more about the likely payoffs to technologies than individual consumers and producers in the markets know? It's not impossible, of course, but it would seem that the affirmative answer carries a burden of proof.


As an economic theory, path dependence offers a new expression of how government action might improve on market outcomes. As stated by Paul Krugman: "In a QWERTY world, markets cannot be relied upon to get things right." That in itself says nothing about the correctness of the theory of path dependence, but it does suggest that we might want to pay attention, especially given the theory's simple logical appeal and its romantic tales of butterflies, dinosaurs and old typewriters that come to us, all reflecting the glowing halo of science.

Certainly one thing does lead to another, sometimes in ways that are surprising or intriguing. Discovering this interconnectedness is much of what science is about. Certainly important interconnections, including important economic ones, are intertemporal. I can live in a house today because someone built it sometime in the past. People understand these interconnections, and they plan their lives with these sorts of things in mind: They build, they save, they get educated, they put a turkey in the oven at noon. But the claim of path dependence, at least as would matter for public policy, is that people often either ignore these interconnections, or only look at them in a narrow and myopic manner, and so they get locked into bad solutions.

It is, of course, possible that lack of foresight, or difficulties in communication, or common property in technologies or other hazards could create instances in which complete decentralization could commit us to unfortunate paths. In a world of path dependence, for example, there might not be any automobiles. After all, automobiles are not particularly useful until there are gas stations, and gas stations won't be profitable until there are automobiles. In a world of path dependence, there might not be any fax machines. I won't buy a fax because I don't know for sure that you will buy one, and you won't buy one because you don't know that I'll buy one.

But something's amiss. We have cars and we have faxes. We found our ways out of these traps. People are clever. They anticipate the future, they look for profit opportunities, they advertise, contract, warranty, and make other sorts of commitments. For every hypothetical trap that can be thought up there are hypothetical escapes. Whether the traps are real and whether the escapes are practical cannot be resolved on theory alone. That something could have happened does not mean that it did. If path dependence is a common phenomenon, the real world should be rife with examples of it. We're waiting for evidence of one.

Meanwhile we'll be checking the obituaries, looking for an old lady who swallowed a horse.

Selected Readings