The Act of Creation

Authors: Arthur Koestler

p. 509. The Concise Oxford Dictionary gives no less than thirty-four
meanings of the word.

X
PERCEPTION AND MEMORY
I must now switch from animal to man, and later back again. The manner
in which animals learn holds important lessons for man; but in order to
interpret the data the experimenter must make certain minimum assumptions
regarding the animals' experiences; and, whether he is aware of it or not,
these assumptions are based on his own human experience. We talk about
the animal's pain-reaction or fear-reaction because we have experienced
pain and fear; we interpret certain signs as meaning that the animal
is alert or apprehensive by inferences which are often unconscious and
contain an unavoidably anthropomorphic element. Even Lloyd Morgan's
canon acknowledged this; it merely said that one should not be more
anthropomorphic than one could help.*
Now, learning involves perception and memory; and since we know
incomparably more about both in man than in cats or rats, we must discuss
some aspects of man's perceptual and sensory-motor skills before we turn
to learning in animals. Instead of the over-worked province of visual
perception, I shall start, for a change, on hearing.
Screening the Input
It has been said that visitors to Stalin had to go through seventeen
successive screenings: at the outer gate of the Kremlin compound,
at several inner gates, and so forth, until the last corridor and the
last door leading to the inner sanctum. 'Inputs' which aspire to become
'stimuli' apparently suffer a similar fate. Where hearing is concerned,
the brain's stimulus-screening activity starts in the ear. Efferent,
inhibitory fibres from higher centres to the cochlea of the cat were
discovered by Galambos in 1956. In a series of remarkable experiments
[1] the cat's auditory nerve was tapped and wired to an amplifier,
so that impulses passing from ear to brain were directly recorded. The
impulses were caused by the clicking of a metronome. But the moment a
mouse in a glass jar was shown to the cat the firings in the auditory
nerve were diminished or ceased altogether: the cat was turning a 'deaf
ear' on the metronome. The point of the experiment was to show that the
process of stimulus-selection is centrally controlled, but sets in at
the periphery -- the outer gate of the Kremlin compound.
Attitude and expectation -- the pattern of the behavioural matrix
to which the organism is attuned at the time -- determine what shall
constitute a stimulus and what shall not. On a happy family evening,
when people are talking while the radio is playing, junior is crying,
and the dog is begging to be let out, each of these simultaneous inputs
may be perceived as 'signal' and the rest as 'noise'. In audition, at
least, the 'figure-background relation' seems to be more complicated than
the Gestalt school suggests; it is not something innate in perceptual
organization, but dependent on past experience and present state of
mind. Women were known to sleep soundly through an air-raid but to
awake at the slightest cry of their babies; people deeply asleep show
sharp EEG reactions when their own name, or the name of a girlfriend,
is read out in a list of other 'background' names. [2]
The point has also an indirect bearing on the controversy whether
discrimination is based on the 'absolute' or relational properties of
stimuli. [3] The answer seems to be, briefly, that absolute stimuli
do not exist -- short of sticking a knife into somebody. Yet even
on the primitive level of pain, the matrix influences perception --
as witnessed by the General in the American Civil War who, in the heat
of battle, did not notice that his middle finger was shot away; not to
mention anaesthesia by hypnosis in dentistry and child-birth -- or the
even more remarkable phenomena of hysterical conversion blindness.
Thus the higher centres exercise a selective influence on sensation
and perception; those aspects of the input which are irrelevant will
be treated as noise, and forgotten 'without leaving a trace'. But the
criteria of relevance depend on the 'rules of the game' which the organism
is playing at the time.
Stripping the Input
Selective control of the input is the first stage in the process of
extracting information from the chaotic noises and other sensations which
bombard the organism's receptors; without it, the mind would be in a
kind of Brownian motion. This first stage is followed by the processing
of the input in a series of relaying operations, each of them designed
to strip the input of what appears to be irrelevant -- according to the
criteria of relevance which operate along that input-channel. One might
call this a process of "de-particularization". It is a clumsy word,
but it conveys what is really implied in the terms 'generalization' or
'abstraction', with their multiple connotations.
The most familiar examples of 'de-particularization' are, of course, the
visual constancies. The triangle, or the letter 'W', is stripped of the
irrelevancies of retinal position, size, etc. Thanks to colour constancy
the accidents of light and shadow are discarded; thanks to size constancy,
my moving hand does not seem to shrink or grow -- changes in perspective
size are 'dis-regarded' by the regard. Yet the criteria of relevance
and irrelevance depend, even in these cases of apparently spontaneous
perception, to a considerable degree on interpretative frames -- on
perceptual matrices acquired by past experience. When an object of the
appearance of a tennis ball is inflated against a homogeneous background,
it will be seen as if it were retaining its size and approaching the
observer. [4] This is different from size constancy because in this
case the observer has to accommodate his eyes and make them converge at
a closer range so that the ball gets out of focus and should be seen
as a blurred double image. Yet the knowledge that tennis balls behave
reasonably and do not grow into footballs somehow manages to compensate
for this, and to discard the anomalies in the situation as irrelevant
noise. To quote Bartlett once more: 'Even the most elementary perceptions
have the character of inferential constructions.' The Baconian ideal
of observation without theorizing is undermined by the mechanism of
observation itself. Perception is polluted by implied hypotheses. To
look, to listen, to taste, means to ask questions; and mostly they are
leading questions.
To obtain a more detached view of the living organism's methods of coding
and storing its experiences, let me make a naïve comparison with a
typical engineering procedure. Examining a modern gramophone record with
a magnifying glass, you see a spiral curve with lateral oscillations of
varying amplitude and spacing -- a curve where the abscissa represents
time, and the ordinate the amplitude of the needle's oscillations. And
yet this two-dimensional curve, with a single independent variable, can
reproduce any sequence of sounds, from the Sermon on the Mount to the
Ninth Symphony performed by orchestra and choir, including the buzzing
of a fly and a cough in the audience. In fact the entire range of human
knowledge and experience could be expressed by the function of this one
independent variable, so that one is tempted to ask why the nervous
system does not produce engrams in this simple type of code, instead
of the incomparably more complicated methods it uses. The answer is,
that a 'linear' memory trace of this type would be completely useless
for the purposes of analysing, recognizing, and matching new inputs, and
for working out the appropriate responses. It would merely represent the
'blooming, buzzing confusion of pure sensation sans organization' which,
in the words of William James, is the new-born infant's world. Before
it can be more or less permanently stored, the input must be processed,
dismantled, and reassembled in various ways, which the following examples
may serve to illustrate.
Dismantling and Reassembling
Let the input be fifty instruments and fifty voices performing a choral
part of Beethoven's Ninth. On the gramophone record, and in the air-waves
which make the ear-drum vibrate, the pitch, timbre, and loudness of the
individual voices and instruments have all been superimposed on each other
-- scrambled together into a single variable pulse. The individuality
of soprano, flute, viola, is lost in the process; it requires a human
nervous system to reconstitute it.
The pulse is transmitted and amplified by the bones of the middle ear
and enters through the oval window into the cochlea. Here the basilar
membrane, based in viscous fluid, starts the process of unscrambling
the acoustic omelette. This is done, partly at least, by a kind of
Fourier-analysis of the oscillatory curve, which breaks it down into
its spectrum of basic frequencies.* The parallel fibres of the basilar
membrane form a kind of spiral harp; each fibre responds to a specific
frequency. This analysing mechanism operates over a range of twenty to
twenty thousand cycles per second, and auditory discrimination varies from
about 0.05 at low frequencies to 0.025 at 2,000 c/s (Piccolo flute). Each
frequency has its separate 'place' on the spiral membrane. Each 'place'
is presumed to be connected by a separate group of fibres, running
through several relay stations, to a presumably fixed location in the
primary reception area in the auditory cortex -- area 22.
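
As a rough illustration of the kind of frequency analysis alluded to
above, the following Python sketch superimposes three pure tones into a
single pulse and then recovers their frequencies with a naive discrete
Fourier transform. The sample rate, the tone frequencies and amplitudes,
and the function names are all assumptions chosen for the example; it
shows the arithmetic of unscrambling only, and is not offered as a model
of the basilar membrane.

```python
import cmath
import math

SAMPLE_RATE = 400  # samples per second (assumed for the example)
N = 400            # one second of signal, so DFT bin k corresponds to k Hz


def mix_tones(tones, n=N, rate=SAMPLE_RATE):
    """Superimpose pure tones, given as (frequency_hz, amplitude) pairs."""
    return [
        sum(a * math.sin(2 * math.pi * f * t / rate) for f, a in tones)
        for t in range(n)
    ]


def dft_magnitudes(signal):
    """Naive O(n^2) discrete Fourier transform; amplitudes for bins 0..n/2-1."""
    n = len(signal)
    mags = []
    for k in range(n // 2):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        mags.append(abs(acc) * 2 / n)
    return mags


def dominant_frequencies(signal, threshold=0.1):
    """Return the bins (here equal to Hz) whose amplitude exceeds the threshold."""
    return [k for k, m in enumerate(dft_magnitudes(signal)) if m > threshold]


# Three hypothetical voices scrambled into one variable pulse.
pulse = mix_tones([(40, 1.0), (80, 0.5), (120, 0.25)])
print(dominant_frequencies(pulse))  # [40, 80, 120]
```

The single list `pulse` plays the part of the gramophone groove: one
variable in time, from which the component frequencies can nevertheless
be recovered.
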
But this mechanism of transmission by fixed pathways and non-specific
impulses is only half of the story; the other half is transmission of the
lower frequencies by 'volleys' in a bundle of fibres firing in turn at
the specific frequency of the input. [5] The details of both theories
are still controversial, but the available evidence indicates that they
complement each other. We have, then, here one more instance of the
complementary character of two types of nervous function: conduction by
specific pathways, and conduction by specific signals over equipotential
pathways.
We now have our fifty singers and fifty instruments decomposed into
a constantly changing mosaic pattern of excitations in area 22 where
each point (or region or circuit) [6] represents the frequency of one
pure tone, and in some form also its intensity -- regardless of the
instrument or voice in which it originated. This state of affairs
bears no resemblance to any conceivable neural model based on S.-R. theory
-- or on the Gestalt physiology of Köhler and Koffka.* In fact the whole
physiological theory of Gestalt, and many of its psychological postulates,
break down when we come to audition. This is not surprising since Köhler
concentrated entirely on visual perception; and in the seven hundred-odd
pages of Koffka's "Principles of Gestalt Psychology" exactly one page
(p. 200) is devoted to 'other (than visual) senses'. Even on this one
page, the only reference to audition is the statement that 'sound' and
'stillness' have a reversible figure-background relation.
At the auditory projection area we must assume the dismantling process
to end and the reassembling to start. When we listen to the symphony
we do not hear an ensemble of the pure tones into which it has been
broken up in the cochlea, but an ensemble of individual instruments
and voices: that is, of organized sub-wholes. The individual timbre of
an instrument is determined by its overtones -- the series of partials
which accompany the fundamental, and the energy-distribution of them. By
superimposition of the sine curves of the partials, we obtain the periodic
curve characteristic for each instrument. When we identify the sound of
a violin or flute by picking out and bracketing together its partials --
which were 'drowned' among thousands of other partials in the air-pressure
wave -- we have achieved 'timbre constancy', comparable to visual figure
constancy. This, of course, is based on past experience and involves an
act of recognition by the 'trained ear' of instruments previously heard
in isolation.
'Coloured Filters'
Since all but the most elementary perceptions interact with past
experience, it seems a rather unsound procedure to discuss perception
divorced from the problem of memory. The question, then, is how the
'trace' was originally acquired which enables me to recognize an
instrument or voice on subsequent occasions. Let us assume that I
am hearing an exotic instrument for the first time, and that I am
interested at the moment only in its timbre, not in the melody played on
it (which, in the case of a Japanese koto or samisen, would be above my
head anyway). As I am listening, the mathematical relations between the
partials remain constant and enduring, whereas their pitch and loudness
are changing all the time. This stable and enduring relation-pattern
(the fixed ratios between the part-frequencies) will be treated by my
nervous system, which is processing the input, as relevant, whereas
the changes in the relata (the absolute frequencies) are discarded as
irrelevant. When this filtering-out process is completed, the input
will have been finally stripped of all irrelevant detail, according
to the demands of parsimony, and reduced to its invariant pattern --
to 'information' purified of 'noise'. If an input has undergone these
transformations and was permitted to progress this far without being
blocked somewhere on its way (as, for instance, the voices of irrelevant
strangers at a cocktail party are) then it will tend to leave a lasting
'trace' -- which will enable the nervous system to recognize in future
the same voice or instrument.
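
The idea of keeping the relation and discarding the relata can be put
in a few lines of Python. This is a minimal sketch under simplifying
assumptions: the instruments and their partial frequencies are invented,
frequencies are whole numbers of cycles per second, and the
energy-distribution among the partials is ignored altogether.

```python
from fractions import Fraction


def partial_ratios(partials_hz):
    """Express each partial as an exact ratio to the lowest one (the fundamental)."""
    fundamental = min(partials_hz)
    return [Fraction(p, fundamental) for p in sorted(partials_hz)]


# Hypothetical instrument whose partials sit at 1, 2, 3 and 5 times the fundamental.
note_low = [220, 440, 660, 1100]    # sounded on a fundamental of 220 c/s
note_high = [330, 660, 990, 1650]   # same instrument, fundamental of 330 c/s

# A second hypothetical instrument with a different pattern (1, 3, 5, 7).
other = [220, 660, 1100, 1540]

same_timbre = partial_ratios(note_low) == partial_ratios(note_high)
different_timbre = partial_ratios(note_low) != partial_ratios(other)
print(same_timbre, different_timbre)  # True True
```

The absolute frequencies of `note_low` and `note_high` have nothing in
common, yet the ratio pattern is identical; that pattern is what the
sketch treats as the 'trace'.
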
We have witnessed, as it were, the formation of the code of a new
perceptual skill. The organism feeds on negative entropy; in communication
theory, 'entropy' becomes 'noise'. The sensorium abstracts information
from the chaotic environment as the mitochondria extract, by a series
of dismantling and reassembling processes, a specific form of energy
from food. The abstracting and recording of information involves, as
we have just seen, the sacrifice of details which are filtered out as
irrelevant in a given context. But what is considered as irrelevant in one
context, may be relevant in another; and vice versa. We can recognize an
instrument regardless of the tune played on it; but we can also recognize
a tune regardless of the instrument on which it is played. The tune is
abstracted and recorded in a memory-trace de-particularized of timbre;
timbre is recorded de-particularized of tune. Thus the filtering-out of
redundancies as the input is relayed from periphery to centre does not
proceed along a single channel, but along several channels, each with
its series of filters of different colour, as it were. The different
colours represent the criteria of relevance in different perceptual
hierarchies. Each hierarchy analyses the input according to its own
criteria of relevance; but the loss of detail incurred in the process
of memory-formation along a single channel is partly counteracted by
the fact that information rejected as irrelevant by its coloured filters
may be admitted as relevant by another channel belonging to a different
hierarchy. We shall see that this principle of multi-dimensional analysis
is of basic importance in the phenomena of recognition and recall.
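
The notion of several channels, each with its own criteria of
relevance, may be made concrete by a toy example in Python. Everything
in it is assumed for the purpose of illustration: the two 'instruments'
are defined only by invented ratios between their partials, a 'tune' is
a bare sequence of fundamentals, and the two filter functions are
hypothetical stand-ins for the hierarchies described above.

```python
from fractions import Fraction


def play(fundamentals, pattern):
    """Sound a tune on an instrument: each note is a list of partial frequencies."""
    return [[f * m for m in pattern] for f in fundamentals]


def timbre_filter(performance):
    """Keep the ratios between partials; discard which notes were played."""
    return {tuple(Fraction(p, note[0]) for p in note) for note in performance}


def tune_filter(performance):
    """Keep the ratios between successive fundamentals; discard the partials."""
    fundamentals = [note[0] for note in performance]
    return [Fraction(b, a) for a, b in zip(fundamentals, fundamentals[1:])]


tune_a = [200, 250, 300, 250]
tune_b = [300, 200, 300, 400]

a_on_first = play(tune_a, (1, 2, 3))    # hypothetical instrument, partials 1:2:3
a_on_second = play(tune_a, (1, 3, 5))   # hypothetical instrument, partials 1:3:5
b_on_first = play(tune_b, (1, 2, 3))

print(tune_filter(a_on_first) == tune_filter(a_on_second))      # True
print(timbre_filter(a_on_first) == timbre_filter(b_on_first))   # True
print(timbre_filter(a_on_first) == timbre_filter(a_on_second))  # False
print(tune_filter(a_on_first) == tune_filter(b_on_first))       # False
```

What one filter throws away is exactly what the other retains, so the
same input is recognized along two dimensions at once.
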
A Digression on Engrams
The neurophysiological problems of memory are beyond the scope of
this book, but the following remarks may help to forestall possible
misunderstandings. Perceptual codes of the type which enables us to
recognize an instrument are devices which analyse complex acoustic
inputs by some unknown process of 'matching' or 'resonance'. The
quotes indicate that these words are used as metaphors only; the
process must of course be incomparably more complicated than acoustic
or electric resonance. What matters is that a memory-trace cannot be
visualized as a mechanical record like a gramophone groove, 'stamped'
into the brain and activated by specific pathways. Such an arrangement
would be as useless for purposes of auditory analysis as it would be
useless for the visual recognition of shapes to have an archive of
photographic engrams. Instead of this, we must hypothesize some kind of
'attunement' of a cluster or clusters of neurons, with a hierarchic
organization and containing sub-wholes which are equipotential in
their response to one specific pattern of excitation and to that
pattern only. Pringle [7] assumed that memory-traces function like
'coupled resonators'; Hyden's [8] theory of RNA changes which determine
selective responsiveness to frequency-modulation sequences of excitation
seems more plausible. Whatever the mechanism is, it must combine
the principle of fixed but partly equipotential spatial connections,
with selective responses to specific excitation patterns, to account
for the hierarchic organization of perceptual, conceptual, and motor
skills. Weiss' excitation clang was an approach in that direction. Hebb's
phase sequences in neuron assemblies was another. The Pitts-McCulloch
model of a scanning analyser to account for figure constancies should
show that basically similar principles can be applied in vision as in
audition to the problem of analysing and matching the input.
The matrix of a complex skill -- such as the maze-running skill of
Lashley's rats -- may be no more 'localized' than the programme of a
political party is localized by the addresses of all members who adhere
to it. If some members or groups of members are eliminated, other groups
may take over. Simpler and more primitive matrices, however, are perhaps
rather like the professional guilds of craftsmen concentrated in one
area of a medieval town; if that area is destroyed, the skill is lost.
Tracing a Melody
We have seen that recognition of a voice or instrument is based on
an invariant relation (the fixed ratios between partials or formants)
which has been extracted from the variable relata. Once the instrument
is perceived as a recognizable whole, the relation becomes a relatum
-- e.g. 'a violin' -- regardless whether a verbal symbol is attached
to it or not. This relatum then enters into relations with the sound
of other instruments, which are analysed on higher levels according to
the criteria of more complex rules of the game -- harmony, melodic, and
contrapuntal form -- in which several perceptual hierarchies participate.
A tune is defined by rhythm and pitch. Rhythm derives from the
hierarchic organization of beat-cum-accent into measure, measure into
phrase. To qualify as a tune, the pitch-variation sequence must conform
to certain codes of modality, key, harmony. These codes must also be
represented in the listener's perceptual organization, otherwise there
would be no musical experience, only the sensation of a medley of sounds
-- as when a European listens for the first time to Chinese opera. The
melody itself has the structural coherence of a closed figure as distinct
from an open, linear chain. It is either 'taken in' as a whole in the
specious present, or learned by the integration of sub-wholes, that is
of entire phrases -- but never by chaining note to note in the manner of
learning nonsense-syllables (though even these tend to form patterns). A
chain of notes could not be transposed from one key or instrument to
another; nor recognized after transposition.
A tune is a temporal pattern of notes in a given scale. The notes are
the relata; by humming it in a different key, or playing it on a different
instrument, the relata are changed but the relation remains invariant in
all transformations. On a higher level, the tune as a whole again becomes
a relatum which enters into relations with other tonal patterns; or with
itself in symmetrical reversal; or contrapuntally with other themes.
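
The invariance of a tune under transposition can be shown in a short
Python sketch. It assumes that pitches are written as MIDI-style note
numbers, one step per semitone, and it leaves rhythm out of account; the
function names are illustrative only.

```python
def intervals(notes):
    """Successive pitch steps in semitones: the relation, not the relata."""
    return [b - a for a, b in zip(notes, notes[1:])]


def transpose(notes, semitones):
    """Shift every note by the same number of semitones (a change of key)."""
    return [n + semitones for n in notes]


# Opening of a simple tune as note numbers (assumed encoding): C C G G A A G
tune = [60, 60, 67, 67, 69, 69, 67]
tune_in_new_key = transpose(tune, 5)

print(tune == tune_in_new_key)                        # False: every relatum changed
print(intervals(tune) == intervals(tune_in_new_key))  # True: the relation is invariant
```

A memory that stored only the note numbers would fail to recognize the
transposed version; one that stores the interval pattern recognizes it
at once.
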
Most people are capable of learning and recognizing simple melodies,
and equally capable of recognizing the sound of various instruments --
but few mortals share the privilege of 'absolute pitch', of being able to
identify single notes. In other words, retention of a pattern of stimuli
is the rule, retention of an isolated stimulus the exception. If the
pattern is relatively simple, it is 'taken in at a glance', as a whole:
as a rule, listening to the first two transients is sufficient to identify
an instrument. [9] But the more complex the pattern, the more difficult
it becomes to 'take the whole in at a glance', and it can be retained
only by dint of a certain amount of rote learning.
Conditioning and Insight in Perception
Once more, however, the items memorized are not discrete bits, but
organized sub-wholes; and they are not summated in an open chain but
interrelated in a closed figure. Thus the first movement of a sonata
will fall into three sub-wholes: statement of themes, development,
recapitulation; and the first of these is usually subdivided into the
exposition of two themes in the order A-B-A; while in the rondo we
usually have A-B-A-C-A.
Similar considerations apply, for instance, to the learning of a poem.
Rhythm, rhyme, grammar, and meaning provide patterns or 'grids'
superimposed on each other -- matrices governed by already established
codes; and the memorizing that remains to be done is not so much a
'stamping in' but a 'filling of gaps'. This is shown by the typical
way of 'getting stuck' in reciting a poem; e.g.: '. . . Cannon to left
of them / Cannon in front of them / (-- --) and thundered'. A word has
fallen out like a piece from a jigsaw puzzle
-- but it merely leaves a gap; it does not break the 'chain'. The
old-fashioned method of teaching history by reigns and battles is
an obvious example of stamping in. Even so, the data often show
some rudimentary organization into rhythmic or visual pattern,
acquired spontaneously or by some memorizing trick such as rhyming
jingles. Calculating prodigies memorize long series of numbers, not
by chaining but by ordering them into familiar sub-groups. Nonsense
syllables are easier retained by twisting them into a semblance of words,
and weaving these into a story. [10] The position of thirty men on a
chessboard is easier retained than of five chessmen lying in a heap on
the floor.
How do we recognize complex patterns? Take a professional musician who
has turned on his radio in the middle of a programme: 'It's a string
quartet. . . . Something by Beethoven. . . . It's a quartet of the
middle period. . . . It's the second Rasoumovsky. . . . It is probably
played by the Amadeus Quartet. . . .' The input has been matched in
rapid succession against the very complex coded constancies in several
interlocking hierarchies -- timbre, melody, rhythm, accent, phrasing,
volume, density, etc. -- until the last drop of 'information' has been
extracted from it. Each independent hierarchy of 'coloured filters'
activated by the input adds an additional dimension to understanding.
Perception cannot be divorced from past experience. What I have said
so far already foreshadows a continuous scale of gradations between
opposite methods of perceptual learning. At one end, in classical
conditioning, we shall find stamping-in, under artificial conditions,
of excitation-patterns which outside the laboratory would be treated as
biologically irrelevant and would accordingly leave no trace. Outside
the laboratory, edible things do not emit signals by metronome-clicks,
or by displaying the figure of an ellipse on a cardboard. The dog's
perceptual organization is not 'attuned' to this kind of input-signal;
it lies outside all recognized rules of the game; and there will be no
inherent tendency in the naïve dog to abstract information from the
rate of metronome-clicks. However (see below,
