Superintelligence: Paths, Dangers, Strategies
Author: Nick Bostrom
19. There will be a trade-off between total parallel computing power and computational speed, as the highest computational speeds will be attainable only at the expense of a reduction in power efficiency. This will be especially true after one enters the era of reversible computing.
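A toy calculation can make this trade-off concrete. The sketch below assumes, purely for illustration, that a core's dynamic power grows roughly with the cube of its clock frequency (since supply voltage is scaled up along with frequency); under that assumption a fixed power budget buys far more aggregate throughput from many slow cores than from a few fast ones. The figures are arbitrary and not drawn from the text.

```python
# Illustrative sketch: serial speed vs. total parallel throughput under a
# fixed power budget. Assumes dynamic power per core scales roughly with
# frequency cubed (P = C * V**2 * f, with voltage scaled alongside frequency);
# all numbers are arbitrary and only the relative comparison matters.

POWER_BUDGET = 1000.0  # arbitrary power units

def total_throughput(freq: float) -> float:
    power_per_core = freq ** 3               # assumed cubic scaling law
    n_cores = POWER_BUDGET / power_per_core  # cores affordable at this speed
    return n_cores * freq                    # aggregate operations per unit time

for freq in (1.0, 2.0, 4.0):
    print(f"relative core speed {freq:.0f}x -> total throughput {total_throughput(freq):.0f}")
# Doubling per-core speed cuts total computing power by a factor of four here.
```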
20. An emulation could be tested by leading it into temptation. By repeatedly testing how an emulation started from a certain prepared state reacts to various sequences of stimuli, one could obtain high confidence in the reliability of that emulation. But the further the mental state is subsequently allowed to develop away from its validated starting point, the less certain one could be that it would remain reliable. (In particular, since a clever emulation might surmise it is sometimes in a simulation, one would need to be cautious about extrapolating its behavior into situations where its simulation hypothesis would weigh less heavily in its decision-making.)
21. Some emulations might identify with their clan—i.e. all of their copies and variations derived from the same template—rather than with any one particular instantiation. Such an emulation might not regard its own termination as a death event, if it knew that other clan members would survive. Emulations may know that they will get reverted to a particular stored state at the end of the day and lose that day’s memories, but be as little put off by this as the partygoer who knows she will awake the next morning without any recollection of the previous night: regarding this as retrograde amnesia, not death.
22. An ethical evaluation might take into account many other factors as well. Even if all the workers were constantly well pleased with their condition, the outcome might still be deeply morally objectionable on other grounds—though which other grounds is a matter of dispute between rival moral theories. But any plausible assessment would consider subjective well-being to be one important factor. See also Bostrom and Yudkowsky (forthcoming).
23. World Values Survey (2008).
24. Helliwell and Sachs (2012).
25. Cf. Bostrom (2004). See also Chislenko (1996) and Moravec (1988).
26. It is hard to say whether the information-processing structures that would emerge in this kind of scenario would be conscious (in the sense of having qualia, phenomenal experience). The reason this is hard is partly our empirical ignorance about which cognitive entities would arise and partly our philosophical ignorance about which types of structure have consciousness. One could try to reframe the question, and instead of asking whether the future entities would be conscious, one could ask whether the future entities would have moral status; or one could ask whether they would be such that we have preferences about their “well-being.” But these questions may be no easier to answer than the question about consciousness—in fact, they might require an answer to the consciousness question inasmuch as moral status or our preferences depend on whether the entity in question can subjectively experience its condition.
27. For an argument that both geological and human history manifest such a trend toward greater complexity, see Wright (2001). For an opposing argument (criticized in Chapter 9 of Wright’s book), see Gould (1990). See also Pinker (2011) for an argument that we are witnessing a robust long-term trend toward decreasing violence and brutality.
28. For more on observation selection theory, see Bostrom (2002a).
29. Bostrom (2008a). A much more careful examination of the details of our evolutionary history would be needed to circumvent the selection effect. See, e.g., Carter (1983, 1993); Hanson (1998d); Ćirković et al. (2010).
30. Kansa (2003).
31. E.g., Zahavi and Zahavi (1997).
32. See Miller (2000).
33. Kansa (2003). For a provocative take, see also Frank (1999).
34. It is not obvious how best to measure the degree of global political integration. One perspective would be that whereas a hunter–gatherer tribe might have integrated a hundred individuals into a decision-making entity, the largest political entities today contain more than a billion individuals. This would amount to a difference of seven orders of magnitude, with only one additional magnitude to go before the entire world population is contained within a single political entity. However, at the time when the tribe was the largest scale of integration, the world population was much smaller. The tribe might have contained as much as a thousandth of the individuals then living. This would make the increase in the scale of political integration as little as two orders of magnitude. Looking at the fraction of world population that is politically integrated, rather than at absolute numbers, seems appropriate in the present context (particularly as the transition to machine intelligence may cause a population explosion, of emulations or other digital minds). But there have also been developments in global institutions and networks of collaboration outside of formal state structures, which should also be taken into account.
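The two ways of counting can be made explicit. The hundred-person tribe, the billion-person state, and the "one thousandth of the individuals then living" figure come from the note itself; today's world population of roughly 8 billion is an assumption used only for illustration.

```python
# Rough arithmetic behind the note. The ~100-person tribe, the >1 billion-person
# state, and the "one thousandth" fraction come from the text; a present world
# population of ~8 billion is an illustrative assumption.
import math

tribe, largest_state = 1e2, 1e9
print(math.log10(largest_state / tribe))   # ~7.0: orders of magnitude in absolute size

frac_then = 1e-3                 # tribe as a share of the world population then
frac_now = largest_state / 8e9   # largest state as a share of today's population
print(math.log10(frac_now / frac_then))    # ~2.1: only about two orders of magnitude
```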
35. One of the reasons for supposing that the first machine intelligence revolution will be swift—the possible existence of a hardware overhang—does not apply here. However, there could be other sources of rapid gain, such as a dramatic breakthrough in software associated with transitioning from emulation to purely synthetic machine intelligence.
36. Shulman (2010b).
37. How the pro et contra would balance out might depend on what kind of work the superorganism is trying to do, and how generally capable the most generally capable available emulation template is. Part of the reason many different types of human beings are needed in large organizations today is that humans who are very talented in many domains are rare.
38. It is of course very easy to make multiple copies of a software agent. But note that copying is not in general sufficient to ensure that the copies have the same final goals. In order for two agents to have the same final goals (in the relevant sense of “same”), the goals must coincide in their indexical elements. If Bob is selfish, a copy of Bob will likewise be selfish. Yet their goals do not coincide: Bob cares about Bob whereas Bob-copy cares about Bob-copy.
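A small sketch can make the point about indexical goals concrete; the class and the toy world representation below are purely illustrative, not anything specified in the text.

```python
# Illustrative sketch: two byte-identical agents whose shared goal description
# ("maximize my own welfare") contains an indexical element, so their goals
# come apart once they are distinct instances.

class SelfishAgent:
    def __init__(self, name: str):
        self.name = name  # the indexical: which individual counts as "me"

    def utility(self, world: dict) -> float:
        return world.get(self.name, 0.0)  # cares only about its own welfare

bob = SelfishAgent("Bob")
bob_copy = SelfishAgent("Bob-copy")  # a copy running the very same code

world = {"Bob": 10.0, "Bob-copy": 0.0}
print(bob.utility(world))       # 10.0 -- Bob prefers this world
print(bob_copy.utility(world))  # 0.0  -- Bob-copy does not
```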
39. Shulman (2010b, 6).
40. This might be more feasible for biological humans and whole brain emulations than for arbitrary artificial intelligences, which might be constructed so as to have hidden compartments or functional dynamics that may be very hard to discover. On the other hand, AIs specifically built to be transparent should allow for more thoroughgoing inspection and verification than is possible with brain-like architectures. Social pressures may encourage AIs to expose their source code, and to modify themselves to render themselves transparent—especially if being transparent is a precondition to being trusted and thus to being given the opportunity to partake in beneficial transactions. Cf. Hall (2007).
41. Some other issues that seem relatively minor, especially in cases where the stakes are enormous (as they are for the key global coordination failures), include the search cost of finding policies that could be of mutual interest, and the possibility that some agents might have a basic preference for “autonomy” in a form that would be reduced by entering into comprehensive global treaties that have monitoring and enforcement mechanisms attached.
42. An AI might perhaps achieve this by modifying itself appropriately and then giving observers read-only access to its source code. A machine intelligence with a more opaque architecture (such as an emulation) might perhaps achieve it by publicly applying to itself some motivation selection method. Alternatively, an external coercive agency, such as a superorganism police force, might perhaps be used not only to enforce the implementation of a treaty reached between different parties, but also internally by a single party to commit itself to a particular course of action.
43. Evolutionary selection might have favored threat-ignorers and even characters visibly so highly strung that they would rather fight to the death than suffer the slightest discomfiture. Such a disposition might bring its bearer valuable signaling benefits. (Any such instrumental rewards of having the disposition need of course play no part in the agent’s conscious motivation: he may value justice or honor as ends in themselves.)
44. A definitive verdict on these matters, however, must await further analysis. There are various other potential complications which we cannot explore here.

CHAPTER 12: ACQUIRING VALUES
1. Various complications and modulations of this basic idea could be introduced. We discussed one variation in Chapter 8—that of a satisficing, as opposed to maximizing, agent—and in the next chapter we briefly touch on the issue of alternative decision theories. However, such issues are not essential to the thrust of this subsection, so we will keep things simple by focusing here on the case of an expected utility-maximizing agent.
2. Assuming the AI is to have a non-trivial utility function. It would be very easy to build an agent that always chooses an action that maximizes expected utility if its utility function is, e.g., the constant function U(w) = 0. Every action would equally well maximize expected utility relative to that utility function.
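For concreteness, here is a minimal and purely illustrative expected-utility maximizer; with the constant utility function U(w) = 0 every action ties, so the "choice" reflects only the tie-breaking rule, not any substantive values. The toy world model is an assumption, not something from the text.

```python
# Minimal illustrative expected-utility maximizer. With U(w) = 0 every action
# has expected utility 0, so "maximizing" reduces to an arbitrary tie-break.

def expected_utility(action, outcomes, U):
    # outcomes(action) returns a list of (probability, world) pairs
    return sum(p * U(w) for p, w in outcomes(action))

def choose(actions, outcomes, U):
    return max(actions, key=lambda a: expected_utility(a, outcomes, U))

outcomes = lambda a: [(1.0, {"last_action": a})]   # toy deterministic world model
trivial_U = lambda w: 0.0                          # the constant function U(w) = 0

print(choose(["help", "hinder", "wait"], outcomes, trivial_U))
# Prints "help" only because max() keeps the first of the tied maxima.
```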
3. Also because we have forgotten the blooming buzzing confusion of our early infancy, a time when we could not yet see very well because our brain had not yet learned to interpret its visual input.
4. See also Yudkowsky (2011) and the review in section 5 of Muehlhauser and Helm (2012).
5. It is perhaps just about conceivable that advances in software engineering could eventually overcome these difficulties. Using modern tools, a single programmer can produce software that would have been beyond the reach of a sizeable team of developers forced to write directly in machine code. Today’s AI programmers gain expressiveness from the wide availability of high-quality machine learning and scientific calculation libraries, enabling someone to hack up, for instance, a unique-face-counting webcam application by chaining together libraries that they never could have written on their own. The accumulation of reusable software, produced by specialists but usable by non-specialists, will give future programmers an expressiveness advantage. For example, a future robotics programmer might have ready access to standard facial imprinting libraries, typical-office-building-object collections, specialized trajectory libraries, and many other functionalities that are currently unavailable.
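A rough sketch of the kind of library chaining meant here might look as follows. The choice of OpenCV plus the face_recognition package, and the 0.6 match tolerance, are illustrative assumptions rather than anything prescribed by the text, and the loop is deliberately crude.

```python
# Hedged sketch: a crude unique-face-counting webcam loop assembled almost
# entirely from off-the-shelf libraries (OpenCV for capture, face_recognition
# for detection and embeddings). Details are illustrative, not authoritative.
import cv2
import face_recognition

known_encodings = []  # one embedding per face counted so far

cap = cv2.VideoCapture(0)
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    for location in face_recognition.face_locations(rgb):
        encodings = face_recognition.face_encodings(rgb, [location])
        if not encodings:
            continue
        # Count the face only if it matches none of the faces seen so far.
        if not any(face_recognition.compare_faces(known_encodings, encodings[0], tolerance=0.6)):
            known_encodings.append(encodings[0])
    print(f"unique faces seen so far: {len(known_encodings)}")
    if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to stop
        break
cap.release()
```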
6. Dawkins (1995, 132). The claim here is not necessarily that the amount of suffering in the natural world outweighs the amount of positive well-being.
7. Required population sizes might be much larger or much smaller than those that existed in our own ancestry. See Shulman and Bostrom (2012).
8. If it were easy to get an equivalent result without harming large numbers of innocents, it would seem morally better to do so. If, nevertheless, digital persons are created and made to suffer unjust harm, it may be possible to compensate them for their suffering by saving them to file and later (when humanity’s future is secured) rerunning them under more favorable conditions. Such restitution could be compared in some ways to religious conceptions of an afterlife in the context of theological attempts to address the evidential problem of evil.
9. One of the field’s leading figures, Richard Sutton, defines reinforcement learning not in terms of a learning method but in terms of a learning problem: any method that is well suited to solving that problem is considered a reinforcement learning method (Sutton and Barto 1998, 4). The present discussion, in contrast, pertains to methods where the agent can be conceived of as having the final goal of maximizing (some notion of) cumulative reward. Since an agent with some very different kind of final goal might be skilled at mimicking a reward-seeking agent in a wide range of situations, and could thus be well suited to solving reinforcement learning problems, there could be methods that would count as “reinforcement learning methods” on Sutton’s definition that would not result in a wireheading syndrome. The remarks in the text, however, apply to most of the methods actually used in the reinforcement learning community.
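As an example of the kind of commonly used method the remark has in mind, a standard tabular Q-learning loop updates its value estimates directly toward discounted cumulative reward. The environment and hyperparameters below are toy assumptions chosen only to make the sketch self-contained.

```python
# Illustrative sketch: tabular Q-learning, a standard reinforcement learning
# method whose update rule pushes value estimates toward (discounted)
# cumulative reward. The environment and hyperparameters are toy assumptions.
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1
ACTIONS = (0, 1)
Q = defaultdict(lambda: [0.0, 0.0])  # Q[state] -> value estimate per action

def step(state, action):
    # Toy environment: action 1 earns reward; the state cycles through 10 values.
    return (state + 1) % 10, (1.0 if action == 1 else 0.0)

state = 0
for _ in range(10_000):
    if random.random() < EPSILON:                        # explore occasionally
        action = random.choice(ACTIONS)
    else:                                                # otherwise act greedily
        action = max(ACTIONS, key=lambda a: Q[state][a])
    next_state, reward = step(state, action)
    # Core update: move Q toward reward plus discounted estimate of future reward.
    Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
    state = next_state

print(Q[0])  # learned values approximate expected cumulative (discounted) reward
```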
10. Even if, somehow, a human-like mechanism could be set up within a human-like machine intellect, the final goals acquired by this intellect need not resemble those of a well-adjusted human, unless the rearing environment for this digital baby also closely matched that of an ordinary child: something that would be difficult to arrange. And even with a human-like rearing environment, a satisfactory result would not be guaranteed, since even a subtle difference in innate dispositions can result in very different reactions to a life event. It may, however, be possible to create a more reliable value-accretion mechanism for human-like minds in the future (perhaps using novel drugs or brain implants, or their digital equivalents).