Superintelligence: Paths, Dangers, Strategies

Nick Bostrom

13. Freitas (1980); Freitas and Merkle (2004, Chap. 3); Armstrong and Sandberg (2013).

14. See, e.g., Huffman and Pless (2003), Knill et al. (2000), Drexler (1986).

15. That is to say, the distance would be small on some “natural” metric, such as the logarithm of the size of the population that could be sustainably supported at subsistence level by a given level of capability if all resources were devoted to that end.

16. This estimate is based on the WMAP estimate of a cosmological baryon density of 9.9×10^-30 g/cm^3 and assumes that 90% of the mass is intergalactic gas, that some 15% of the galactic mass is stars (about 80% of baryonic matter), and that the average star weighs in at 0.7 solar masses (Read and Trentham 2005; Carroll and Ostlie 2007).
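
As a rough illustration only, one might chain these assumptions together as follows; the way the percentages are combined here is our reading of the note, not something it spells out, so treat the output purely as an implied figure under those assumptions.

```python
# One possible reading of the note's chain of assumptions: keep the fraction of
# the quoted mass density assumed to end up in stars, then divide by the assumed
# average stellar mass. The chaining of the percentages is itself an assumption.
rho_quoted = 9.9e-30                     # g/cm^3, the WMAP-based figure in the note
frac_in_galaxies = 0.10                  # 90% assumed to be intergalactic gas
frac_galactic_baryons_in_stars = 0.80    # "about 80% of baryonic matter"
avg_star_mass_g = 0.7 * 1.989e33         # 0.7 solar masses, in grams

stellar_mass_density = rho_quoted * frac_in_galaxies * frac_galactic_baryons_in_stars
stars_per_cm3 = stellar_mass_density / avg_star_mass_g

cm_per_mpc = 3.086e24                    # centimetres per megaparsec
stars_per_mpc3 = stars_per_cm3 * cm_per_mpc ** 3
print(f"Implied stellar number density: {stars_per_mpc3:.1e} stars per Mpc^3")
```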

17. Armstrong and Sandberg (2013).

18. Even at 100% of c (which is unattainable for objects with nonzero rest mass) the number of reachable galaxies is only about 6×10^9. (Cf. Gott et al. [2005] and Heyl [2005].) We are assuming that our current understanding of the relevant physics is correct. It is hard to be very confident in any upper bound, since it is at least conceivable that a superintelligent civilization might extend its reach in some way that we take to be physically impossible (for instance, by building time machines, by spawning new inflationary universes, or by some other, as yet unimagined means).

19. The number of habitable planets per star is currently uncertain, so this is merely a crude estimate. Traub (2012) predicts that one-third of stars in spectral classes F, G, or K have at least one terrestrial planet in the habitable zone; see also Clavin (2012). FGK stars form about 22.7% of the stars in the solar neighborhood, suggesting that 7.6% of stars have potentially suitable planets. In addition, there might be habitable planets around the more numerous M stars (Gilster 2012). See also Robles et al. (2008).
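
The 7.6% figure is simply the product of the two quoted fractions; a minimal check, using only the values stated in the note:

```python
# Rederiving the 7.6% figure quoted in this note from its two inputs.
frac_fgk_with_terrestrial_hz_planet = 1.0 / 3.0   # Traub (2012)
frac_stars_that_are_fgk = 0.227                   # FGK share of nearby stars

frac_stars_with_suitable_planet = (frac_fgk_with_terrestrial_hz_planet
                                   * frac_stars_that_are_fgk)
print(f"{frac_stars_with_suitable_planet:.1%}")   # about 7.6%
```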

It would not be necessary to subject human bodies to the rigors of intergalactic travel. AIs could oversee the colonization process. Homo sapiens could be brought along as information, which the AIs could later use to instantiate specimens of our species. For example, genetic information could be synthesized into DNA, and a first generation of humans could be incubated, raised, and educated by AI guardians taking an anthropomorphic guise.

20. O’Neill (1974).

21. Dyson (1960) claims to have gotten the basic idea from science fiction writer Olaf Stapledon (1937), who in turn might have been inspired by similar thoughts by J. D. Bernal (Dyson 1979, 211).

22. Landauer’s principle states that there is a minimum amount of energy required to change one bit of information, known as the Landauer limit, equal to kT ln 2, where k is the Boltzmann constant (1.38×10^-23 J/K) and T is the temperature. If we assume the circuitry is maintained at around 300 K, then 10^26 watts allows us to erase approximately 10^47 bits per second. (On the achievable efficiency of nanomechanical computational devices, see Drexler [1992]. See also Bradbury [1999]; Sandberg [1999]; Ćirković [2004]. The foundations of Landauer’s principle are still somewhat in dispute; see, e.g., Norton [2011].)
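
A quick numerical check of this figure, plugging the stated constants into the Landauer limit kT ln 2 (a sketch added for illustration, not part of the original note):

```python
import math

k_B = 1.38e-23   # Boltzmann constant, J/K (as quoted in the note)
T = 300.0        # assumed operating temperature, K
P = 1e26         # assumed available power, watts

# Landauer limit: minimum energy dissipated per bit erased.
energy_per_bit = k_B * T * math.log(2)      # about 2.9e-21 J

erasures_per_second = P / energy_per_bit    # about 3.5e46, i.e. on the order of 10^47
print(f"Energy per bit erasure: {energy_per_bit:.2e} J")
print(f"Bit erasures per second: {erasures_per_second:.2e}")
```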

23. Stars vary in their power output, but the Sun is a fairly typical main-sequence star.

24. A more detailed analysis might consider more closely what types of computation we are interested in. The number of serial computations that can be performed is quite limited, since a fast serial computer must be small in order to minimize communication lags between its different parts. There are also limits on the number of bits that can be stored, and, as we saw, on the number of irreversible computational steps (involving the erasure of information) that can be performed.

25. We are assuming here that there are no extraterrestrial civilizations that might get in the way. We are also assuming that the simulation hypothesis is false. See Bostrom (2003a). If either of these assumptions is incorrect, there may be important non-anthropogenic risks—ones that involve intelligent agency of a nonhuman sort. See also Bostrom (2003b, 2009c).

26. At least a wise singleton that grasped the idea of evolution could, in principle, have embarked on a eugenics program by means of which it could slowly have raised its level of collective intelligence.

27. Tetlock and Belkin (1996).

28. To be clear: colonizing and re-engineering a large part of the accessible universe is not currently within our direct reach. Intergalactic colonization is far beyond today’s technology. The point is that we could in principle use our present capabilities to develop the additional capabilities that would be needed, thus placing the accomplishment within our indirect reach. It is of course also true that humanity is not currently a singleton and that we do not know that we would never face intelligent opposition from some external power if we began to re-engineer the accessible universe. To meet the wise-singleton sustainability threshold, however, it suffices that one possesses a capability set such that, if a wise singleton facing no intelligent opposition had possessed this capability set, the colonization and re-engineering of a large part of the accessible universe would be within its indirect reach.

29. Sometimes it might be useful to speak of two AIs as each having a given superpower. In an extended sense of the word, one could thus conceive of a superpower as something that an agent has relative to some field of action—in this example, perhaps a field that includes all of human civilization but excludes the other AI.

CHAPTER 7: THE SUPERINTELLIGENT WILL
 

1. This is of course not to deny that differences that appear small visually can be functionally profound.

2. Yudkowsky (2008a, 310).

3. David Hume, the Scottish Enlightenment philosopher, thought that beliefs alone (say, about what is a good thing to do) cannot motivate action: some desire is required. This would support the orthogonality thesis by undercutting one possible objection to it, namely that sufficient intelligence might entail the acquisition of certain beliefs which would then necessarily produce certain motivations. However, although the orthogonality thesis can draw support from the Humean theory of motivation, it does not presuppose it. In particular, one need not maintain that beliefs alone can never motivate action. It would suffice to assume, for example, that an agent—be it ever so intelligent—can be motivated to pursue any course of action if the agent happens to have certain desires of some sufficient, overriding strength. Another way in which the orthogonality thesis could be true even if the Humean theory of motivation is false is if arbitrarily high intelligence does not entail the acquisition of any such beliefs as are (putatively) motivating on their own. A third way in which it might be possible for the orthogonality thesis to be true even if the Humean theory were false is if it is possible to build an agent (or more neutrally, an “optimization process”) with arbitrarily high intelligence but with constitution so alien as to contain no clear functional analogs to what in humans we call “beliefs” and “desires.” (For some recent attempts to defend the Humean theory of motivation see Smith [1987], Lewis [1988], and Sinhababu [2009].)

4. For instance, Derek Parfit has argued that certain basic preferences would be irrational, such as that of an otherwise normal agent who has “Future-Tuesday-Indifference”:

A certain hedonist cares greatly about the quality of his future experiences. With one exception, he cares equally about all the parts of his future. The exception is that he has Future-Tuesday-Indifference. Throughout every Tuesday he cares in the normal way about what is happening to him. But he never cares about possible pains or pleasures on a future Tuesday…. This indifference is a bare fact. When he is planning his future, it is simply true that he always prefers the prospect of great suffering on a Tuesday to the mildest pain on any other day. (Parfit [1986, 123–4]; see also Parfit [2011])

 
 

For our purposes, we need take no stand on whether Parfit is right that this agent is irrational, so long as we grant that it is not necessarily unintelligent in the instrumental sense explained in the text. Parfit’s agent could have impeccable instrumental rationality, and therefore great intelligence, even if he falls short on some kind of sensitivity to “objective reason” that might be required of a fully rational agent. Therefore, this kind of example does not undermine the orthogonality thesis.

5. Even if there are objective moral facts that any fully rational agent would comprehend, and even if these moral facts are somehow intrinsically motivating (such that anybody who fully comprehends them is necessarily motivated to act in accordance with them), this need not undermine the orthogonality thesis. The thesis could still be true if an agent could have impeccable instrumental rationality even whilst lacking some other faculty constitutive of rationality proper, or some faculty required for the full comprehension of the objective moral facts. (An agent could also be extremely intelligent, even superintelligent, without having full instrumental rationality in every domain.)

6. For more on the orthogonality thesis, see Bostrom (2012) and Armstrong (2013).

7. Sandberg and Bostrom (2008).

8. Stephen Omohundro has written two pioneering papers on this topic (Omohundro 2007, 2008). Omohundro argues that all advanced AI systems are likely to exhibit a number of “basic drives,” by which he means “tendencies which will be present unless explicitly counteracted.” The term “AI drive” has the advantage of being short and evocative, but it has the disadvantage of suggesting that the instrumental goals to which it refers influence the AI’s decision-making in the same way as psychological drives influence human decision-making, i.e. via a kind of phenomenological tug on our ego which our willpower may occasionally succeed in resisting. That connotation is unhelpful. One would not normally say that a typical human being has a “drive” to fill out their tax return, even though filing taxes may be a fairly convergent instrumental goal for humans in contemporary societies (a goal whose realization averts trouble that would prevent us from realizing many of our final goals). Our treatment here also differs from that of Omohundro in some other more substantial ways, although the underlying idea is the same. (See also Chalmers [2010] and Omohundro [2012].)

9. Chislenko (1997).

10. See also Shulman (2010b).

11. An agent might also change its goal representation if it changes its ontology, in order to transpose its old representation into the new ontology; cf. de Blanc (2011).

Another type of factor that might make an evidential decision theorist undertake various actions, including changing its final goals, is the evidential import of deciding to do so. For example, an agent that follows evidential decision theory might believe that there exist other agents like it in the universe, and that its own actions will provide some evidence about how those other agents will act. The agent might therefore choose to adopt a final goal that is altruistic towards those other evidentially linked agents, on grounds that this will give the agent evidence that those other agents will have chosen to act in like manner. An equivalent outcome might be obtained, however, without changing one’s final goals, by choosing in each instant to act as if one had those final goals.

12. An extensive psychological literature explores adaptive preference formation. See, e.g., Forgas et al. (2010).

13. In formal models, the value of information is quantified as the difference between the expected value realized by optimal decisions made with that information and the expected value realized by optimal decisions made without it. (See, e.g., Russell and Norvig [2010].) It follows that the value of information is never negative. It also follows that any information you know will never affect any decision you will ever make has zero value for you. However, this kind of model assumes several idealizations which are often invalid in the real world—such as that knowledge has no final value (meaning that knowledge has only instrumental value and is not valuable for its own sake) and that agents are not transparent to other agents.
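
To make these two claims concrete, here is a minimal toy example (the states, actions, and payoffs are invented for illustration): the value of information is the gain from choosing after learning the state rather than before, which is never negative and is exactly zero when the information could not change the decision.

```python
# A toy decision problem with two equally likely states of the world.
states = ["s1", "s2"]
prob = {"s1": 0.5, "s2": 0.5}

def value_of_information(payoff):
    actions = {a for a, _ in payoff}
    # Best single action chosen WITHOUT knowing the state.
    ev_without = max(sum(prob[s] * payoff[(a, s)] for s in states) for a in actions)
    # With perfect information, the best action is chosen separately in each state.
    ev_with = sum(prob[s] * max(payoff[(a, s)] for a in actions) for s in states)
    return ev_with - ev_without

# Case 1: the best action depends on the state, so information has positive value.
print(value_of_information({("A", "s1"): 1, ("A", "s2"): 0,
                            ("B", "s1"): 0, ("B", "s2"): 1}))   # 0.5

# Case 2: action A is best in every state; the information can never change the
# decision, so its value is exactly zero (and never negative).
print(value_of_information({("A", "s1"): 2, ("A", "s2"): 2,
                            ("B", "s1"): 0, ("B", "s2"): 1}))   # 0.0
```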

14. E.g., Hájek (2009).

15. This strategy is exemplified by the sea squirt larva, which swims about until it finds a suitable rock, to which it then permanently affixes itself. Cemented in place, the larva has less need for complex information processing, whence it proceeds to digest part of its own brain (its cerebral ganglion). One can observe the same phenomenon in some academics when they have been granted tenure.

16. Bostrom (2012).

17. Bostrom (2006c).

18. One could reverse the question and look instead at possible reasons for a superintelligent singleton not to develop some technological capabilities. These include the following: (a) the singleton foresees that it will have no use for the capability; (b) the development cost is too large relative to its anticipated utility (e.g. if the technology will never be suitable for achieving any of the singleton’s ends, or if the singleton has a very high discount rate that strongly discourages investment); (c) the singleton has some final value that requires abstention from particular avenues of technology development; (d) if the singleton is not certain it will remain stable, it might prefer to refrain from developing technologies that could threaten its internal stability or that would make the consequences of dissolution worse (for instance, a world government may not wish to develop technologies that would facilitate rebellion, even if they have some good uses, nor develop technologies for the easy production of weapons of mass destruction which could wreak havoc if the world government were to dissolve); (e) similarly, the singleton might have made some kind of binding strategic commitment not to develop some technology, a commitment that remains operative even if it would now be convenient to develop it. (Note, however, that some current reasons for technology development would not apply to a singleton: for instance, reasons arising from arms races.)

19. Suppose that an agent discounts resources obtained in the future at an exponential rate, and that because of the light speed limitation the agent can only increase its resource endowment at a polynomial rate. Would this mean that there will be some time after which the agent would not find it worthwhile to continue acquisitive expansion? No, because although the present value of the resources obtained at future times would asymptote to zero the further into the future we look, so would the present cost of obtaining them. The present cost of sending out one more von Neumann probe 100 million years from now (possibly using some resource acquired some short time earlier) would be diminished by the same discount factor that would diminish the present value of the future resources that the extra probe would acquire (modulo a constant factor).
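
A minimal numerical sketch of this cancellation (the discount rate, probe cost, and payoff are illustrative stand-ins, not figures from the text): because the cost of launching a probe at time t and the value of what it later acquires are discounted back to the present by the same factor, the benefit-to-cost ratio of further expansion does not decay with t.

```python
import math

# Illustrative stand-in numbers only.
r = 0.05           # exponential discount rate per unit time
probe_cost = 1.0   # cost of one more probe, valued at the launch date
payoff = 20.0      # resources it acquires, valued at roughly the same future date

def benefit_cost_ratio(t_launch):
    # Both quantities are discounted by exp(-r * t_launch), so the factor cancels.
    pv_cost = probe_cost * math.exp(-r * t_launch)
    pv_payoff = payoff * math.exp(-r * t_launch)
    return pv_payoff / pv_cost

for t in [0, 10, 100, 500]:
    print(t, benefit_cost_ratio(t))   # roughly 20 at every launch date
```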

20. While the volume reached by colonization probes at a given time might be roughly spherical and expanding with a rate proportional to the square of time elapsed since the first probe was launched (~t^2), the amount of resources contained within this volume will follow a less regular growth pattern, since the distribution of resources is inhomogeneous and varies over several scales. Initially, the growth rate might be ~t^2 as the home planet is colonized; then the growth rate might become spiky as nearby planets and solar systems are colonized; then, as the roughly disc-shaped volume of the Milky Way gets filled out, the growth rate might even out, to be approximately proportional to t; then the growth rate might again become spiky as nearby galaxies are colonized; then the growth rate might again approximate ~t^2 as expansion proceeds on a scale over which the distribution of galaxies is roughly homogeneous; then another period of spiky growth followed by smooth ~t^2 growth as galactic superclusters are colonized; until ultimately the growth rate starts a final decline, eventually reaching zero as the expansion speed of the universe increases to such an extent as to make further colonization impossible.
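
As a small illustration of the smooth ~t^2 regime only (the speed and density values are arbitrary): if the colonized region is a sphere of radius v·t sweeping through resources of uniform density, the resource stock grows as t^3, so the acquisition rate, its derivative, grows as ~t^2.

```python
import math

v = 0.5     # expansion speed (illustrative)
rho = 1.0   # resource density per unit volume (illustrative)

def resources(t):
    # Resources inside a sphere of radius v*t in a uniform medium: grows as t^3.
    return rho * (4.0 / 3.0) * math.pi * (v * t) ** 3

def acquisition_rate(t):
    # Time derivative of resources(t): grows as t^2.
    return rho * 4.0 * math.pi * v ** 3 * t ** 2

for t in [1, 2, 4, 8]:
    print(t, round(resources(t), 2), round(acquisition_rate(t), 2))
# Doubling t multiplies the stock by 8 and the rate by 4; the spiky and ~t
# phases described in the note come from the lumpy, disc-shaped distribution
# of matter, which this uniform-density sketch deliberately ignores.
```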

21. The simulation argument may be of particular importance in this context. A superintelligent agent may assign a significant probability to hypotheses according to which it lives in a computer simulation and its percept sequence is generated by another superintelligence, and this might generate various convergent instrumental reasons depending on the agent’s guesses about what types of simulations it is most likely to be in. Cf. Bostrom (2003a).

22. Discovering the basic laws of physics and other fundamental facts about the world is a convergent instrumental goal. We may place it under the rubric “cognitive enhancement” here, though it could also be derived from the “technology perfection” goal (since novel physical phenomena might enable novel technologies).

CHAPTER 8: IS THE DEFAULT OUTCOME DOOM?
 

1. Some additional existential risk resides in scenarios in which humanity survives in some highly suboptimal state or in which a large portion of our potential for desirable development is irreversibly squandered. On top of this, there may be existential risks associated with the lead-up to a potential intelligence explosion, arising, for example, from war between countries competing to develop superintelligence first.

2. There is an important moment of vulnerability when the AI first realizes the need for such concealment (an event which we may term the conception of deception). This initial realization would not itself be deliberately concealed when it occurs. But having had this realization, the AI might move swiftly to hide the fact that the realization has occurred, while setting up some covert internal dynamic (perhaps disguised as some innocuous process that blends in with all the other complicated processes taking place in its mind) that will enable it to continue to plan its long-term strategy in privacy.

3. Even human hackers can write small and seemingly innocuous programs that do completely unexpected things. (For examples, see some of the winning entries in the International Obfuscated C Code Contest.)

4. The point that some AI control measures could appear to work within a fixed context yet fail catastrophically when the context changes is also emphasized by Eliezer Yudkowsky; see, e.g., Yudkowsky (2008a).

5. The term seems to have been coined by science-fiction writer Larry Niven (1973), but is based on real-world brain stimulation reward experiments; cf. Olds and Milner (1954) and Oshima and Katayama (2010). See also Ring and Orseau (2011).
