Superintelligence: Paths, Dangers, Strategies
Nick Bostrom
26. Stansberry and Kudritzki (2012). Electricity used in data centers worldwide amounted to 1.1–1.5% of total electricity use (Koomey 2011). See also Muehlhauser and Salamon (2012).
27. This is an oversimplification. The number of chunks working memory can maintain is both information- and task-dependent; however, it is clearly limited to a small number of chunks. See Miller (1956) and Cowan (2001).
28. An example might be that the difficulty of learning Boolean concepts (categories defined by logical rules) is proportional to the length of the shortest logically equivalent propositional formula. Typically, even formulae just 3–4 literals long are very difficult to learn. See Feldman (2000).
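As a rough illustration of this complexity measure, the sketch below (assuming Python with the sympy library; the feature names, the example concept, and the use of minimal disjunctive normal form as a stand-in for the shortest formula are illustrative assumptions, not Feldman's materials) minimizes a small concept and counts its literals:

    # Illustration only: approximate the "Boolean complexity" of a small concept by
    # minimizing it (Quine-McCluskey via sympy) and counting literals in the result.
    # Minimal DNF length is an upper bound on the length of the shortest formula.
    from sympy import symbols
    from sympy.logic import SOPform
    from sympy.logic.boolalg import Not

    def literal_count(expr):
        """Count occurrences of literals (a variable or its negation) in a formula."""
        if expr.is_Symbol or isinstance(expr, Not):
            return 1
        return sum(literal_count(arg) for arg in expr.args)

    size, color, shape = symbols("size color shape")

    # Concept given by its positive examples over (size, color, shape):
    # "large, and not both dark and square", i.e. size & ~(color & shape).
    positives = [(1, 0, 0), (1, 0, 1), (1, 1, 0)]
    minimal_dnf = SOPform([size, color, shape], positives)

    print(minimal_dnf)                 # (size & ~color) | (size & ~shape), up to term order
    print(literal_count(minimal_dnf))  # 4 literals; the shortest equivalent formula,
                                       # size & (~color | ~shape), uses only 3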
29. See Landauer (1986). This study is based on experimental estimates of learning and forgetting rates in humans. Taking into account implicit learning might push the estimate up a little. If one assumes a storage capacity of ~1 bit per synapse, one gets an upper bound on human memory capacity of about 10^15 bits. For an overview of different estimates, see Appendix A of Sandberg and Bostrom (2008).
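For scale, a back-of-the-envelope version of that bound (the synapse count of roughly 10^15 is an assumed order of magnitude at the high end of common estimates, not a figure from the text):

    ~10^15 synapses × 1 bit/synapse = 10^15 bits ≈ 1.25 × 10^14 bytes ≈ 125 terabytes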
30. Channel noise can trigger action potentials, and synaptic noise produces significant variability in the strength of transmitted signals. Nervous systems appear to have evolved to make numerous trade-offs between noise tolerance and costs (mass, size, time delays); see Faisal et al. (2008). For example, axons cannot be thinner than 0.1 μm lest random opening of ion channels create spontaneous action potentials (Faisal et al. 2005).
31. Trachtenberg et al. (2002).
32. In terms of memory and computational power, though not in terms of energy efficiency. The fastest computer in the world at the time of writing was China’s “Tianhe-2,” which displaced Cray Inc.’s Titan in June 2013 with a performance of 33.86 petaFLOPS. It uses 17.6 MW of power, almost six orders of magnitude more than the brain’s ~20 W.
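The stated gap can be checked directly:

    17.6 MW / 20 W = (1.76 × 10^7 W) / (2 × 10^1 W) = 8.8 × 10^5 ≈ 10^5.9

i.e. just under six orders of magnitude.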
33. Note that this survey of sources of machine advantage is disjunctive: our argument succeeds even if some of the items listed are illusory, so long as there is at least one source that can provide a sufficiently large advantage.
CHAPTER 4: THE KINETICS OF AN INTELLIGENCE EXPLOSION
1. The system may not reach one of these baselines at any sharply defined point. There may instead be an interval during which the system gradually becomes able to outperform the external research team on an increasing number of system-improving development tasks.
2. In the past half-century, at least one scenario has been widely recognized in which the existing world order would come to an end in the course of minutes or hours: global thermonuclear war.
3. This would be consistent with the observation that the Flynn effect—the secular increase in measured IQ scores within most populations at a rate of some 3 IQ points per decade over the past 60 years or so—appears to have ceased or even reversed in recent years in some highly developed countries such as the United Kingdom, Denmark, and Norway (Teasdale and Owen 2008; Sundet et al. 2004). The cause of the Flynn effect in the past—and whether and to what extent it represents any genuine gain in general intelligence or merely improved skill at solving IQ test-style puzzles—has been the subject of wide debate and is still not known. Even if the Flynn effect (at least partially) reflects real cognitive gains, and even if the effect is now diminishing or even reversing, this does not prove that we have yet hit diminishing returns in whatever underlying cause was responsible for the observed Flynn effect in the past. The decline or reversal could instead be due to some independent detrimental factor that would otherwise have produced an even bigger observed decline.
4. Bostrom and Roache (2011).
5. Somatic gene therapy could eliminate the maturational lag, but is technically much more challenging than germline interventions and has a lower ultimate potential.
6. Average global economic productivity growth per year over the period 1960–2000 was 4.3% (Isaksson 2007). Only part of this productivity growth is due to gains in organizational efficiency. Some particular networks or organizational processes of course are improving at much faster rates.
7. Biological brain evolution was subject to many constraints and trade-offs that are drastically relaxed when the mind moves to a digital medium. For example, brain size is limited by head size, and a head that is too big has trouble passing through the birth canal. A large brain also guzzles metabolic resources and is a dead weight that impedes movement. The connectivity between certain brain regions might be limited by steric constraints—the volume of white matter is significantly larger than the volume of the gray matter it connects. Heat dissipation is limited by blood flow, and might be close to the upper limit for acceptable functioning. Furthermore, biological neurons are noisy, slow, and in need of constant protection, maintenance, and resupply by glial cells and blood vessels (contributing to the intracranial crowding). See Bostrom and Sandberg (2009b).
8. Yudkowsky (2008a, 326). For a more recent discussion, see Yudkowsky (2013).
9. The picture shows cognitive ability as a one-dimensional parameter, to keep the drawing simple. But this is not essential to the point being made here. One could, for example, instead represent a cognitive ability profile as a hypersurface in a multidimensional space.
10. Lin et al. (2012).
11. One gets a certain increase in collective intelligence simply by increasing the number of its constituent intellects. Doing so should at least enable better overall performance on tasks that can be easily parallelized. To reap the full returns from such a population explosion, however, one would also need to achieve some (more than minimal) level of coordination between the constituents.
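One conventional way to make the parallelization point precise, offered here only as an illustration (Amdahl's law is not invoked in the text): if a fraction p of the work can be parallelized across n contributors, the attainable speedup is

    speedup = 1 / ((1 − p) + p/n)

so with p = 0.95, even unboundedly many contributors yield at most a 20-fold speedup; raising p through better coordination matters as much as adding heads.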
12. The distinction between speed and quality of intelligence is anyhow blurred in the case of non-neuromorphic AI systems.
13. Rajab et al. (2006, 41–52).
14. It has been suggested that using field-programmable gate arrays (FPGAs) rather than general-purpose processors could increase computational speeds in neural network simulations by up to two orders of magnitude (Markram 2006). A study of high-resolution climate modeling in the petaFLOP range found a twenty-four- to thirty-four-fold reduction of cost and about two orders of magnitude reduction in power requirements using a custom variant of embedded processor chips (Wehner et al. 2008).
15. Nordhaus (2007). There are many overviews of the different meanings of Moore’s law; see, e.g., Tuomi (2002) and Mack (2011).
16. If the development is slow enough, the project can avail itself of progress being made in the interim by the outside world, such as advances in computer science made by university researchers and improvements in hardware made by the semiconductor industry.
17. Algorithmic overhang is perhaps less likely, but one exception would be if exotic hardware such as quantum computing becomes available to run algorithms that were previously infeasible. One might also argue that neural networks and deep machine learning are cases of algorithmic overhang: too computationally expensive to work well when first invented, they were shelved for a while, then dusted off when fast graphics processing units made them cheap to run. Now they win contests.
18. And even if progress on the way toward the human baseline were slow.
19. The outside-world contribution to optimization power is that part of the world’s optimization power that is applied to improving the system in question. For a project operating in complete isolation, one that receives no significant ongoing support from the external world, this contribution is ≈ 0, even though the project must have started with a resource endowment (computers, scientific concepts, educated personnel, etc.) that is derived from the entire world economy and many centuries of development.
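Schematically, and using subscripts of our own devising rather than notation from the text, the quantities in this note relate to the chapter's rate-of-change equation as

    Rate of change in intelligence = Optimization power / Recalcitrance
    Optimization power ≈ O_project + O_world + O_system

where O_world is the outside-world contribution discussed here and O_system is the contribution the system itself begins to make once it becomes capable of improving itself.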
20. The most relevant of the seed AI’s cognitive abilities here is its ability to perform intelligent design work to improve itself, i.e. its intelligence amplification capability. (If the seed AI is good at enhancing another system, which is good at enhancing the seed AI, then we could view these as subsystems of a larger system and focus our analysis on the greater whole.)
21. This assumes that recalcitrance is not known to be so high as to discourage investment altogether or divert it to some alternative project.
22. A similar example is discussed in Yudkowsky (2008b).
23. Since inputs have risen (e.g. amounts invested in building new foundries, and the number of people working in the semiconductor industry), Moore’s law itself has not given such rapid growth if we control for this increase in inputs. Combined with advances in software, however, an 18-month doubling time in performance per unit of input may be more historically plausible.
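For reference, a constant 18-month doubling time corresponds to a growth factor of

    2^(12/18) = 2^(2/3) ≈ 1.59 per year

i.e. roughly 59% annual growth in performance per unit of input.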
24. Some tentative attempts have been made to develop the idea of an intelligence explosion within the framework of economic growth theory; see, e.g., Hanson (1998b); Jones (2009); Salamon (2009). These studies have pointed to the potential of extremely rapid growth given the arrival of digital minds, but since endogenous growth theory is relatively poorly developed even for historical and contemporary applications, any application to a potentially discontinuous future context is better viewed at this stage as a source of potentially useful concepts and considerations than as an exercise likely to deliver authoritative forecasts. For an overview of attempts to mathematically model a technological singularity, see Sandberg (2010).
25. It is of course also possible that there will be no takeoff at all. But since, as argued earlier, superintelligence looks technically feasible, the absence of a takeoff would likely be due to the intervention of some defeater, such as an existential catastrophe. If strong superintelligence arrived not in the shape of artificial intelligence or whole brain emulation but through one of the other paths we considered above, then a slower takeoff would be more likely.
CHAPTER 5: DECISIVE STRATEGIC ADVANTAGE
1. A software mind might run on a single machine as opposed to a worldwide network of computers; but this is not what we mean by “concentration.” Instead, what we are interested in here is the extent to which power, specifically power derived from technological ability, will be concentrated in the advanced stages of, or immediately following, the machine intelligence revolution.
2. Technology diffusion of consumer products, for example, tends to be slower in developing countries (Talukdar et al. 2002). See also Keller (2004) and The World Bank (2008).
3. The economic literature dealing with the theory of the firm is relevant as a comparison point for the present discussion. The locus classicus is Coase (1937). See also, e.g., Canbäck et al. (2006); Milgrom and Roberts (1990); Hart (2008); Simester and Knez (2002).
4. On the other hand, it could be especially easy to steal a seed AI, since it consists of software that could be transmitted electronically or carried on a portable memory device.