The second qualification is that the post-transition technology base would enable material resources to be converted into an unprecedented range of products, including some goods that are not currently available at any price even though they are highly valued by many humans. A billionaire does not live a thousand times longer than a millionaire. In the era of digital minds, however, the billionaire could afford a thousandfold more computing power and could thus enjoy a thousandfold longer subjective lifespan. Mental capacity, likewise, could be for sale. In such circumstances, with economic capital convertible into vital goods at a constant rate even for great levels of wealth, unbounded greed would make more sense than it does in today’s world where the affluent (those among them lacking a philanthropic heart) are reduced to spending their riches on airplanes, boats, art collections, or a fourth and a fifth residence.
Does this mean that an egoist should be risk-neutral with respect to his or her post-transition resource endowment? Not quite. Physical resources may not be convertible into lifespan or mental performance at arbitrary scales. If a life must be lived sequentially, so that observer moments can remember earlier events and be affected by prior choices, then the life of a digital mind cannot be extended arbitrarily without utilizing an increasing number of sequential computational operations. But physics limits the extent to which resources can be transformed into sequential computations.[42]
The limits on sequential computation may also constrain some aspects of cognitive performance to scale radically sublinearly beyond a relatively modest resource endowment. Furthermore, it is not obvious that an egoist would or should be risk-neutral even with regard to highly normatively relevant outcome metrics such as number of quality-adjusted subjective life years. If offered the choice between an extra 2,000 years of life for certain and a one-in-ten chance of an extra 30,000 years of life, I think most people would select the former (even under the stipulation that each life year would be of equal quality).[43]
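A quick, purely illustrative calculation (mine, not the book's; the square-root utility function is an arbitrary stand-in for risk aversion) makes the tension with risk neutrality explicit: the gamble has the higher expected number of life years, yet a concave utility function ranks the certain option higher.

```python
# Illustrative only: compares the two options from the text under risk
# neutrality and under an assumed concave (risk-averse) utility function.
import math

certain_years = 2_000              # option A: 2,000 extra years for certain
gamble_years = 30_000              # option B: 30,000 extra years...
gamble_prob = 0.10                 # ...with probability 1 in 10

# Risk-neutral comparison: plain expected number of life years.
ev_certain = certain_years                  # 2,000
ev_gamble = gamble_prob * gamble_years      # 3,000 -> the gamble "wins"

# Risk-averse comparison: square root is an arbitrary concave utility,
# chosen only to illustrate diminishing marginal value of extra years.
u = math.sqrt
eu_certain = u(certain_years)               # ~44.7
eu_gamble = gamble_prob * u(gamble_years)   # ~17.3 (the 9-in-10 "no extra years"
                                            # outcome contributes zero utility)

print(f"expected years   certain={ev_certain}  gamble={ev_gamble:.0f}")
print(f"expected utility certain={eu_certain:.1f}  gamble={eu_gamble:.1f}")
```

Under risk neutrality the gamble is worth 3,000 expected years against 2,000 for certain; under the concave utility the ordering flips, matching the choice the text attributes to most people.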
In reality, the prudential case for favoring a wide distribution of gains is presumably subject-relative and situation-dependent. Yet, on the whole, people would be more likely to get (almost all of) what they want if a way is found to achieve a wide distribution—and this holds even before taking into account that a commitment to a wider distribution would tend to foster collaboration and thereby increase the chances of avoiding existential catastrophe. Favoring a broad distribution, therefore, appears to be not only morally mandated but also prudentially advisable.
There is a further set of consequences of collaboration that should be given at least some shrift: the possibility that pre-transition collaboration influences the level of post-transition collaboration. Assume humanity solves the control problem. (If the control problem is not solved, it may scarcely matter how much collaboration there is post transition.) There are two cases to consider. The first is that the intelligence explosion does not create a winner-takes-all dynamic (presumably because the takeoff is relatively slow). In this case it is plausible that if pre-transition collaboration has any systematic effect on post-transition collaboration, it has a positive effect, tending to promote subsequent collaboration. The original collaborative relationships may endure and continue beyond the transition; also, pre-transition collaboration may offer more opportunity for people to steer developments in desirable (and, presumably, more collaborative) post-transition directions.
The second case is that the nature of the intelligence explosion does encourage a winner-takes-all dynamic (presumably because the takeoff is relatively fast). In this case, if there is no extensive collaboration before the takeoff, a singleton is likely to emerge—a single project would undergo the transition alone, at some point obtaining a decisive strategic advantage combined with superintelligence. A singleton, by definition, is a highly collaborative social order.[44]
The absence of extensive collaboration pre-transition would thus lead to an extreme degree of collaboration post-transition. By contrast, a somewhat higher level of collaboration in the run-up to the intelligence explosion opens up a wider variety of possible outcomes. Collaborating projects could synchronize their ascent to ensure they transition in tandem without any of them getting a decisive strategic advantage. Or different sponsor groups might merge their efforts into a single project, while refusing to give that project a mandate to form a singleton. For example, one could imagine a consortium of nations forming a joint scientific project to develop machine superintelligence, yet not authorizing this project to evolve into anything like a supercharged United Nations, electing instead to maintain the factious world order that existed before.
Particularly in the case of a fast takeoff, therefore, the possibility exists that greater pre-transition collaboration would result in less post-transition collaboration. However, to the extent that collaborating entities are able to shape the outcome, they may allow the emergence or continuation of non-collaboration only if they foresee that no catastrophic consequences would follow from post-transition factiousness. Scenarios in which pre-transition collaboration leads to reduced post-transition collaboration may therefore mostly be ones in which reduced post-transition collaboration is innocuous.
In general, greater post-transition collaboration appears desirable. It would reduce the risk of dystopian dynamics in which economic competition and a rapidly expanding population lead to a Malthusian condition, or in which evolutionary selection erodes human values and selects for non-eudaemonic forms, or in which rival powers suffer other coordination failures such as wars and technology races. The last of these issues, the prospect of technology races, may be particularly problematic if the transition is to an intermediary form of machine intelligence (whole brain emulation) since it would create a new race dynamic that would harm the chances of the control problem being solved for the subsequent second transition to a more advanced form of machine intelligence (artificial intelligence).
We described earlier how collaboration can reduce conflict in the run-up to the intelligence explosion, increasing the chances that the control problem will be solved, and improve both the moral legitimacy and the prudential desirability of the resulting resource allocation. To these benefits of collaboration it may thus be possible to add one more: that broader collaboration pre-transition could help with important coordination problems in the post-transition era.
Collaboration can take different forms depending on the scale of the collaborating entities. At a small scale, individual AI teams who believe themselves to be in competition with one another could choose to pool their efforts.[45]
Corporations could merge or cross-invest. At a larger scale, states could join in a big international project. There are precedents for large-scale international collaboration in science and technology (such as CERN, the Human Genome Project, and the International Space Station), but an international project to develop safe superintelligence would pose a different order of challenge because of the security implications of the work. It would have to be constituted not as an open academic collaboration but as an extremely tightly controlled joint enterprise. Perhaps the scientists involved would have to be physically isolated and prevented from communicating with the rest of the world for the duration of the project, except through a single carefully vetted communication channel. The required level of security might be nearly unattainable at present, but advances in lie detection and surveillance technology could make it feasible later this century. It is also worth bearing in mind that broad collaboration does not necessarily mean that large numbers of researchers would be involved in the project; it simply means that many people would have a say in the project’s aims. In principle, a project could involve a maximally broad collaboration comprising all of humanity as sponsors (represented, let us say, by the General Assembly of the United Nations), yet employ only a single scientist to carry out the work.[46]
There is a reason for starting collaboration as early as possible, namely to take advantage of the veil of ignorance that hides from our view any specific information about which individual project will get to superintelligence first. The closer to the finishing line we get, the less uncertainty will remain about the relative chances of competing projects; and the harder it may consequently be to make a case based on the self-interest of the frontrunner to join a collaborative project that would distribute the benefits to all of humanity. On the other hand, it also looks hard to establish a formal collaboration of worldwide scope before the prospect of superintelligence has become much more widely recognized than it currently is and before there is a clearly visible road leading to the creation of machine superintelligence. Moreover, to the extent that collaboration would promote progress along that road, it may actually be counterproductive in terms of safety, as discussed earlier.
The ideal form of collaboration for the present may therefore be one that does not initially require specific formalized agreements and that does not expedite advances in machine intelligence. One proposal that fits these criteria is that we propound an appropriate moral norm, expressing our commitment to the idea that superintelligence should be for the common good. Such a norm could be formulated as follows:
The common good principle
Superintelligence should be developed only for the benefit of all of humanity and in the service of widely shared ethical ideals.[47]
Establishing from an early stage that the immense potential of superintelligence belongs to all of humanity will give more time for such a norm to become entrenched.
The common good principle does not preclude commercial incentives for individuals or firms active in related areas. For example, a firm might satisfy the call for universal sharing of the benefits of superintelligence by adopting a “windfall clause” to the effect that all profits up to some very high ceiling (say, a trillion dollars annually) would be distributed in the ordinary way to the firm’s shareholders and other legal claimants, and that only profits in excess of the threshold would be distributed to all of humanity evenly (or otherwise according to universal moral criteria). Adopting such a windfall clause should be substantially costless, any given firm being extremely unlikely ever to exceed the stratospheric profit threshold (and such low-probability scenarios ordinarily playing no role in the decisions of the firm’s managers and investors). Yet its widespread adoption would give humankind a valuable guarantee (insofar as the commitments could be trusted) that if ever some private enterprise were to hit the jackpot with the intelligence explosion, everybody would share in most of the benefits. The same idea could be applied to entities other than firms. For example, states could agree that if ever any one state’s GDP exceeds some very high fraction (say, 90%) of world GDP, the overshoot should be distributed evenly to all.[48]
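The threshold logic of such a clause is simple enough to sketch. The following is a hypothetical illustration only: the flat, even split of the excess and the one-trillion-dollar ceiling merely track the example given in the text, and nothing here is a proposed legal mechanism.

```python
# Hypothetical sketch of a windfall clause: profits up to a very high ceiling
# flow to shareholders as usual; only the excess is earmarked for humanity.
# The flat split and the $1 trillion ceiling are illustrative assumptions.
WINDFALL_CEILING = 1_000_000_000_000  # "some very high ceiling", here $1 trillion/year

def windfall_split(annual_profit: float, ceiling: float = WINDFALL_CEILING):
    """Return (to_shareholders, to_humanity) for a given year's profit."""
    to_shareholders = min(annual_profit, ceiling)
    to_humanity = max(annual_profit - ceiling, 0.0)
    return to_shareholders, to_humanity

# Ordinary firm: the clause is inert, hence effectively costless to adopt.
print(windfall_split(5_000_000_000))       # all $5 billion to shareholders, nothing shared

# Jackpot scenario: everything above the ceiling is shared.
print(windfall_split(3_000_000_000_000))   # shareholders keep $1 trillion; $2 trillion shared
```

The same rule transposes directly to the inter-state version mentioned above: replace annual profit with a state’s share of world GDP and set the ceiling at 90%.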
The common good principle (and particular instantiations, such as windfall clauses) could be adopted initially as a voluntary moral commitment by responsible individuals and organizations that are active in areas related to machine intelligence. Later, it could be endorsed by a wider set of entities and enacted into law and treaty. A vague formulation, such as the one given here, may serve well as a starting point; but it would ultimately need to be sharpened into a set of specific verifiable requirements.
We find ourselves in a thicket of strategic complexity, surrounded by a dense mist of uncertainty. Though many considerations have been discerned, their details and interrelationships remain unclear and iffy—and there might be other factors we have not even thought of yet. What are we to do in this predicament?
A colleague of mine likes to point out that a Fields Medal (the highest honor in mathematics) indicates two things about the recipient: that he was capable of accomplishing something important, and that he didn’t. Though harsh, the remark hints at a truth.
Think of a “discovery” as an act that moves the arrival of information from a later point in time to an earlier time. The discovery’s value does not equal the value of the information discovered but rather the value of having the information available earlier than it otherwise would have been. A scientist or a mathematician may show great skill by being the first to find a solution that has eluded many others; yet if the problem would soon have been solved anyway, then the work probably has not much benefited the world. There are cases in which having a solution even slightly sooner is immensely valuable, but this is most plausible when the solution is immediately put to use, either being deployed for some practical end or serving as a foundation to further theoretical work. And in the latter case, where a solution is immediately used only in the sense of serving as a building block for further theorizing, there is great value in obtaining a solution slightly sooner only if the further work it enables is itself both important and urgent.[1]
The question, then, is not whether the result discovered by the Fields Medalist is in itself “important” (whether instrumentally or for knowledge’s own sake). Rather, the question is whether it was important that the medalist enabled the publication of the result to occur at an earlier date. The value of this temporal transport should be compared to the value that a world-class mathematical mind could have generated by working on something else. At least in some cases, the Fields Medal might indicate a life spent solving the wrong problem—for instance, a problem whose allure consisted primarily in being famously difficult to solve.
Similar barbs could be directed at other fields, such as academic philosophy. Philosophy covers some problems that are relevant to existential risk mitigation—we encountered several in this book. Yet there are also subfields within philosophy that have no apparent link to existential risk or indeed any practical concern. As with pure mathematics, some of the problems that philosophy studies might be regarded as intrinsically important, in the sense that humans have reason to care about them independently of any practical application. The fundamental nature of reality, for instance, might be worth knowing about, for its own sake. The world would arguably be less glorious if nobody studied metaphysics, cosmology, or string theory. However, the dawning prospect of an intelligence explosion shines a new light on this ancient quest for wisdom.
The outlook now suggests that philosophic progress can be maximized via an indirect path rather than by immediate philosophizing. One of the many tasks on which superintelligence (or even just moderately enhanced human intelligence) would outperform the current cast of thinkers is in answering fundamental questions in science and philosophy. This reflection suggests a strategy of deferred gratification. We could postpone work on some of the eternal questions for a little while, delegating that task to our hopefully more competent successors—in order to focus our own attention on a more pressing challenge: increasing the chance that we will actually have competent successors. This would be high-impact philosophy and high-impact mathematics.[2]
We thus want to focus on problems that are not only important but urgent in the sense that their solutions are needed prior to the intelligence explosion. We should also take heed not to work on problems that are negative-value (such that solving them is harmful). Some technical problems in the field of artificial intelligence, for instance, might be negative-value inasmuch as their solution would speed the development of machine intelligence without doing as much to expedite the development of control methods that could render the machine intelligence revolution survivable and beneficial.
It can be hard to identify problems that are both urgent and important and are such that we can confidently take them to be positive-value. The strategic uncertainty surrounding existential risk mitigation means that we must worry that even well-intentioned interventions may turn out to be not only unproductive but counterproductive. To limit the risk of doing something actively harmful or morally wrong, we should prefer to work on problems that seem robustly positive-value (i.e., whose solution would make a positive contribution across a wide range of scenarios) and to employ means that are robustly justifiable (i.e., acceptable from a wide range of moral views).
There is a further desideratum to consider in selecting which problems to prioritize. We want to work on problems that are elastic to our efforts at solving them. Highly elastic problems are those that can be solved much faster, or solved to a much greater extent, given one extra unit of effort. Encouraging more kindness in the world is an important and urgent problem—one, moreover, that seems quite robustly positive-value; yet absent a breakthrough idea for how to go about it, it is probably a problem of quite low elasticity. Achieving world peace, similarly, would be highly desirable; but considering the numerous efforts already targeting that problem, and the formidable obstacles arrayed against a quick solution, it seems unlikely that the contributions of a few extra individuals would make a large difference.
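To make the notion of elasticity concrete, here is a toy model of my own devising; the diminishing-returns curve and the particular parameter values are assumptions, not anything specified in the text. It compares the marginal progress bought by one extra unit of effort on a neglected, tractable problem with that on a crowded, formidable one.

```python
# Toy model of problem "elasticity": the extra progress bought by one more
# unit of effort, under an assumed diminishing-returns curve
# progress(e) = 1 - exp(-e / scale). Curve and numbers are illustrative only.
import math

def marginal_progress(current_effort: float, scale: float, extra: float = 1.0) -> float:
    """Extra progress gained by adding `extra` units of effort."""
    def progress(e: float) -> float:
        return 1.0 - math.exp(-e / scale)
    return progress(current_effort + extra) - progress(current_effort)

# Neglected, tractable problem: little prior effort, modest difficulty -> elastic.
print(f"{marginal_progress(current_effort=2, scale=10):.4f}")          # ~0.0779

# Crowded, formidable problem (think "world peace"): one more contributor
# barely moves the needle -> low elasticity.
print(f"{marginal_progress(current_effort=10_000, scale=50_000):.6f}")  # ~0.000016
```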
To reduce the risks of the machine intelligence revolution, we will propose two objectives that appear to best meet all those desiderata: strategic analysis and capacity-building. We can be relatively confident about the sign of these parameters—more strategic insight and more capacity being better. Furthermore, the parameters are elastic: a small extra investment can make a relatively large difference. Gaining insight and capacity is also urgent because early boosts to these parameters may compound, making subsequent efforts more effective. In addition to these two broad objectives, we will point to a few other potentially worthwhile aims for initiatives.
Against a backdrop of perplexity and uncertainty, analysis stands out as being of particularly high expected value.[3]
Illumination of our strategic situation would help us target subsequent interventions more effectively. Strategic analysis is especially needful when we are radically uncertain not just about some detail of some peripheral matter but about the cardinal qualities of the central things. For many key parameters, we are radically uncertain even about their sign—that is, we know not which direction of change would be desirable and which undesirable. Our ignorance might not be irremediable. The field has been little prospected, and glimmering strategic insights could still be awaiting their unearthing just a few feet beneath the surface.
What we mean by “strategic analysis” here is a search for crucial considerations: ideas or arguments with the potential to change our views not merely about the fine-structure of implementation but about the general topology of desirability.[4]
Even a single missed crucial consideration could vitiate our most valiant efforts or render them as actively harmful as those of a soldier who is fighting on the wrong side. The search for crucial considerations (which must explore normative as well as descriptive issues) will often require crisscrossing the boundaries between different academic disciplines and other fields of knowledge. As there is no established methodology for how to go about this kind of research, difficult original thinking is necessary.
Another high-value activity, one that shares with strategic analysis the robustness property of being beneficial across a wide range of scenarios, is the development of a well-constituted support base that takes the future seriously. Such a base can immediately provide resources for research and analysis. If and when other priorities become visible, resources can be redirected accordingly. A support base is thus a general-purpose capability whose use can be guided by new insights as they emerge.
One valuable asset would be a donor network comprising individuals devoted to rational philanthropy, informed about existential risk, and discerning about the means of mitigation. It is especially desirable that the early-day funders be astute and altruistic, because they may have opportunities to shape the field’s culture before the usual venal interests take up position and entrench. The focus during these opening gambits should thus be to recruit the right kinds of people into the field. It could be worth foregoing some technical advances in the short term in order to fill the ranks with individuals who genuinely care about safety and who have a truth-seeking orientation (and who are likely to attract more of their own kind).
One important variable is the quality of the “social epistemology” of the AI field and its leading projects. Discovering crucial considerations is valuable, but only if it affects action. This cannot always be taken for granted. Imagine a project that invests millions of dollars and years of toil to develop a prototype AI, and that, after surmounting many technical challenges, is finally beginning to show real progress. There is a chance that with just a bit more work it could turn into something useful and profitable. Now a crucial consideration is discovered, indicating that a completely different approach would be a bit safer. Does the project kill itself off like a dishonored samurai, relinquishing its unsafe design and all the progress that had been made? Or does it react like a worried octopus, puffing out a cloud of motivated skepticism in the hope of eluding the attack? A project that would reliably choose the samurai option in such a dilemma would be a far preferable developer.[5]
Yet building processes and institutions that are willing to commit seppuku based on uncertain allegations and speculative reasoning is not easy. Another dimension of social epistemology is the management of sensitive information, in particular the ability to avoid leaking information that ought to be kept secret. (Information continence may be especially challenging for academic researchers, accustomed as they are to constantly disseminating their results on every available lamppost and tree.)
In addition to the general objectives of strategic light and good capacity, some more specific objectives could also present cost-effective opportunities for action.
One such is progress on the technical challenges of machine intelligence safety. In pursuing this objective, care should be taken to manage information hazards. Some work that would be useful for solving the control problem would also be useful for solving the competence problem. Work that burns down the AI fuse could easily be a net negative.
Another specific objective is to promote “best practices” among AI researchers. Whatever progress has been made on the control problem needs to be disseminated. Some forms of computational experimentation, particularly if involving strong recursive self-improvement, may also require the use of capability control to mitigate the risk of an accidental takeoff. While the actual implementation of safety methods is not so relevant today, it will increasingly become so as the state of the art advances. And it is not too soon to call for practitioners to express a commitment to safety, including endorsing the common good principle and promising to ramp up safety if and when the prospect of machine superintelligence begins to look more imminent. Pious words are not sufficient and will not by themselves make a dangerous technology safe: but where the mouth goeth, the mind might gradually follow.
Other opportunities may also occasionally arise to push on some pivotal parameter, for example to mitigate some other existential risk, or to promote biological cognitive enhancement and improvements of our collective wisdom, or even to shift world politics into a more harmonious register.
Before the prospect of an intelligence explosion, we humans are like small children playing with a bomb. Such is the mismatch between the power of our plaything and the immaturity of our conduct. Superintelligence is a challenge for which we are not ready now and will not be ready for a long time. We have little idea when the detonation will occur, though if we hold the device to our ear we can hear a faint ticking sound.
For a child with an undetonated bomb in its hands, a sensible thing to do would be to put it down gently, quickly back out of the room, and contact the nearest adult. Yet what we have here is not one child but many, each with access to an independent trigger mechanism. The chances that we will all find the sense to put down the dangerous stuff seem almost negligible. Some little idiot is bound to press the ignite button just to see what happens.
Nor can we attain safety by running away, for the blast of an intelligence explosion would bring down the entire firmament. Nor is there a grown-up in sight.
In this situation, any feeling of gee-whiz exhilaration would be out of place. Consternation and fear would be closer to the mark; but the most appropriate attitude may be a bitter determination to be as competent as we can, much as if we were preparing for a difficult exam that will either realize our dreams or obliterate them.