The First Word: The Search for the Origins of Language
Author: Christine Kenneally
That claim was made without any relevant data from animal studies, and it took only a few years to be invalidated. In 1975 two researchers repeated the infant study but used chinchillas, which also proved to have categorical perception. So even though this trait fundamentally underlies the human ability to perceive speech, it’s a much more general feature of animal auditory systems. Later experiments have shown that categorical perception also applies to nonspeech sounds.
Other important properties of human speech perception are shared by other animals. In a study conducted by Marc Hauser and colleagues, researchers found that humans aren’t the only species with the ability to identify different languages based on their characteristic rhythms. Tamarins, tiny primates that roam the forests of the Amazon basin, can distinguish between languages based on different rhythmic cues.
This ability suggests that we probably didn’t evolve our sensitivity to linguistic rhythm for the specific purpose of understanding or producing speech, even though that is now its primary function. Instead we use a general perceptual mechanism that is shared among animals. In another study Hauser and colleagues extended the earlier findings to show that other properties of this perceptual mechanism are common to humans and tamarins. For example, neither human babies nor tamarins distinguish between languages that come from the same rhythmic class, such as English and German, or that are rhythmically similar, like English and Dutch. However, they can tell the difference between rhythmically different languages like Japanese and Polish. Another property of speech perception is the ability to hear the formant frequencies that characterize different vowels. In another study, Hauser and colleagues pointed out that some animals are able to use formant frequencies to make distinctions between sounds and that other species perceive formants in their own species’ vocalizations.
Many questions remain about the animal perception of speech. There is no evidence that animals either have or could be trained to develop the ability to parse out the vast number of words in the semicontinuous speech stream of human conversation. Still, we have yet to explain the very basic fact that animals like the Border collie Rico, the African gray parrot Alex, and the bonobo Kanzi clearly have some capacity for perceiving and understanding words within a semicontinuous speech stream. These animals appear to take the speech-noise, identify distinct sounds within it, break the whole thing up into smaller meaningful units (if not as many as humans, then at least some), and derive a meaning from that. Kanzi, for example, has learned that the buzz coming out of someone’s mouth can be broken up into recognizable units (“throw,” “ball,” “water”) that can be combined to create larger meaningful units (“Throw the ball in the water”).
In order to accurately determine how much of speech perception is shared by humans and animals, researchers must eventually explain how these creatures adjust to different speakers in the same way that humans do and, even if one person’s p is different from another’s, still make sense of the word, no matter who is saying it.
Of course, humans do a lot more perceptually than simply pulling a few words out of a larger set of vocalizations. We parse the speech stream exhaustively, and we do it in real time, picking out sounds that are jammed many to a second. We identify the words they create and at the same time the sentences they create. “Speech flows together like this” actually sounds more like “Speechflowstogetherlikethis,” and yet we effortlessly work out where one word has ended and another has begun in real time.
Researchers like Marc Hauser and Tecumseh Fitch believe that the claims for human uniqueness have been proven wrong so often in the perceptual domain that people should no longer make default assumptions about any special human ability. In their view, it is reasonable to believe that the hearing part of language is completely shared with many other animals. But others are more skeptical.
Speech perception is such a complicated task, Steven Pinker pointed out, that even speech recognition systems on today’s computers require that you talk to them with exaggerated breaks between words unless they are trained on a specific person’s voice. “Understanding connected speech from a variety of speakers is a remarkable ability,” he said, “one that artificial intelligence researchers have had enormous difficulty duplicating in computers. It certainly has not been shown that other animals are capable of processing continuous speech. It would be very hard to test, because they don’t have the language that continuous speech is converted into. The fact is that we don’t know that they can do it, and I’d be very skeptical if they can.”
Although many components of language have some kind of analog in animal communication, our close relatives typically lack highly structured signals. Of course birdsong can be complexly patterned, but ape and monkey communication seems to consist mostly of unanalyzable cries. Human language involves two types of structures. In the first, elements from a finite set of meaningless sounds are combined into meaningful words and parts of words, known as morphemes. Linguists call this phonology. The rules of phonology cover intonation and rhythm as well as the way specific sounds can be combined. The rules of sound apply at the smallest scale, between two single sounds that occur side by side, and over vast tracts of speech, from single sentences that either rise or fall depending on whether they are questions, to lengthier statements that end on a falling intonation. All these rules change depending on the language that is spoken.
In the second type of structure, words and morphemes are combined into phrases. This is what linguists call syntax. In 1960 the linguist Charles Hockett said that the relationship between the two types of combinatory rules was one of the major design features of human language; he called it “duality of patterning.”
Inevitably, both kinds of structure have been found not to be restricted to humans. Elements of phonology operate not just in birdsong but in the songs of whales. Phrases in these songs recur. In one early experiment Marc Hauser and a colleague demonstrated that vervet monkeys use a fall in pitch to mark the end of an utterance and that other vervets seem to interpret this as a signal to take a turn in vocalizing, as humans do. Tecumseh Fitch suggests there may be other elements of sound rules that animals share. Rhythm is an important element of human language, and Fitch points to the rhythm in the dominance displays of chimpanzees and gorillas as a possible precursor for this ability in humans. Gorillas put on impressive performances of vocalizing and rhythmic chest beating, and while this behavior has been little studied, it might provide a clue to the origins of rhythm in humans. Still, chimpanzees do not speak, and neither do they dance. If important analogs for this aspect of language exist in other animals, there are also important distinctions. Not only does other animal vocal communication not have the range of distinct sounds of human language, it doesn’t appear to employ anything like the number and range of rules that we have for combining speech sounds.
Interestingly, it’s been pointed out that the rules of phonology contradict Chomsky’s notion of the poverty of stimulus—the idea that there is not enough information in the language a child hears for it to learn language. Philip Carr, a phonologist at the University of Montpellier in France, says there is abundant evidence of the rules of phonology in the speech that children hear. The “data are more than complete,” he wrote. Neonates, according to Carr, have access to more information than they need to understand the sound system of their language.
Of the two types of structures, syntax has been the more hotly contested in the language evolution debate. At its most basic, syntax is a series of rules for combining words in a meaningful way. All the words in the following sentence make perfect sense by themselves, but because the way they are lined up defies the syntax of English, there is no larger meaning:
the the are up way they meaning lined there no syntax English is defies larger of.
Until very recently it was believed that only we could understand or deploy any of the structural devices found in human syntax, but Kanzi showed that this is not entirely the case. He is able to learn and apply some rules to structure the symbols with which he communicates. In addition, Klaus Zuberbühler has established that rudimentary syntax can occur in the natural cries of monkeys in the wild.
Different types of syntax have been observed in the communication of a number of primate species. The black-and-white colobus, the titi monkey, the male gibbon, the chimpanzee, and the wedge-capped capuchin monkey have combinations of calls in their repertoire of cries. The black-and-white colobus uses a snort as an alarm call, but also places it before a roar, a combination that is used to help groups of these monkeys keep their distance from one another. The titi monkey combines several different calls into various combinations, and the response of its listeners shows that they distinguish between the different ordering of the sounds. Gibbons arrange a series of sounds into structured vocalizations, and the same is true of capuchins. In the case of gibbons, when the animal’s song is arranged in a normal order, the listening gibbons squeak in response.
Zuberbühler wanted to know whether an obvious change of meaning resulted from the way that elements of the calls were ordered. He started with the Campbell’s monkey in the Taï Forest of Côte d’Ivoire. Like vervets, these animals employ different kinds of alarm calls, with one distinctive cry to warn of crowned hawk-eagles and another for leopards. They also use an interesting combination cry, in which one of the alarm calls is preceded by a boom sound. Boom-plus-alarm combinations appear to indicate a lesser threat, and are used in a situation that calls for a response to the alarm cries of a distant group, the detection of a far-off predator, or less direct dangers like falling trees or breaking branches.
Zuberbühler had shown in earlier experiments that Diana monkeys respond to the cries of other species. Even though the calls of the Diana monkeys are very different from those of the Campbell’s monkey, the Diana monkeys, who live closely with the Campbell’s monkeys, appear to both understand and use their alarm cries to protect themselves. For example, if it hears a Campbell’s monkey make an alarm call for an eagle, a Diana monkey will make its own distinct eagle alarm cry. In the syntax experiment, Zuberbühler played a series of Campbell’s monkey alarm calls to a group of Diana monkeys. The recordings consisted either of Campbell’s monkey alarm calls or the Campbell’s monkey phrase, boom-plus-alarm. (In order to run the experiment, Zuberbühler had to use a great deal of stealth, approaching the monkeys without detection; otherwise he would have just provoked a series of human-induced alarm calls.) Zuberbühler confirmed that the Diana monkeys responded to Campbell’s monkey alarm cries with alarm cries of their own. If he played an eagle alarm call, they’d respond with their own eagle alarm call; if he played the leopard alarm call, they would start making leopard alarm calls themselves. If Zuberbühler played a boom and then one of the Campbell’s monkey alarm cries, the Diana monkeys wouldn’t respond with one of their own alarms—indicating that they understood the nondirect nature of the threat.
Zuberbühler likens the boom to qualifiers in our own language, such as “maybe” and “kind of.” His study, he says, suggests that primates have some naturally occurring syntactic abilities, and also suggests that projects in which animals are trained by humans to use syntax are tapping into abilities that occur naturally in these species.
In a more recent experiment, Zuberbühler and Kate Arnold showed that male putty-nosed monkeys combine two basic calls to add meaning to a message. Typically, these monkeys produce a pyow sound in various situations, most often as an alarm in response to the sighting of a leopard. They also make a hack sound when an eagle has been seen. Zuberbühler and Arnold discovered that male putty-nosed monkeys also make a pyow-hack sound, a combination call that signals that either a leopard or an eagle has been seen. The difference in response is that shortly after a pyow-hack is made, the whole monkey troop will move location, suggesting that it has the additional message of “Move!”
Gibbons structure units of sound to create meaning, but their vocalizations are quite different from those of most other primates; they produce complex songs, communicating over distances up to one kilometer away. Typically gibbons form monogamous pairs, and every morning mated pairs sing a duet that pronounces their bond to neighboring apes.
Zuberbühler and colleagues recorded white-handed gibbons at Khao Yai National Park, Thailand, and found that the gibbons use their songs to repel predators as well as to perform duets. The duets and the predator songs used the same notes (“wa,” “hoo,” “leaning wa,” “oo,” “sharp wow,” “waoo,” and “other”), but they systematically differed in how they were arranged. At the beginning of a song, there were fewer “leaning wa” notes and significantly more “hoo” notes if a predator had been sighted. In addition, predator songs had more “sharp wows” in them and were longer overall than duets. Male- and female-specific parts of the songs also differed depending on the referent. While female-specific parts came later in the predator song, the males replied earlier to females in these songs than in the duets. As with the other Zuberbühler experiments, it was also found that the structured utterances were meaningful to neighboring animals. Nearby gibbons responded differently to the two kinds of songs.
The scientists don’t view the gibbon songs as sentences created with syntactic rules about word order. There is no context in which to determine whether notes have smaller discrete meanings, like words, which build a larger meaning when combined in different ways. What is important about the gibbon utterances is that they use combinatorial rules to functionally refer to different things. The same set of sounds has two different meanings when ordered in different ways.
The simple structural rules that these primates use in the wild contradict the idea that creating meaning with structure is a special human ability. Though there remains a wide gulf between what we do with structure and what other animals do, at least some elements of our ability seem to be graded. Robert Seyfarth and Dorothy Cheney, the researchers who pioneered the vervet monkey work, suggest that more evidence for an evolutionary precursor to human syntax may be found somewhere other than in the vocal domain.
After their vervet work, Seyfarth and Cheney began to study a baboon group in the Okavango Delta of Botswana. Baboons—Old World monkeys—typically live in stable groups of 50 to 150 animals. They have a small and limited set of calls, which are largely innate, and they have no call combinations.
There are 80 to 90 baboons in the Seyfarth-Cheney group, and every day since 1992 someone has observed the animals. By now, Seyfarth, Cheney, and their colleagues recognize all the animals individually. The rules of baboon society, said Seyfarth, are similar to those of Jane Austen’s: be nice to your relatives, and get in with the high-ranking family. For the researchers this extended period of observation has been like watching a long-running soap opera.
Baboons have a matrilineal society. Females stay in the group into which they are born, while somewhere between the ages of six and nine the males leave and join another group. Each baboon family is ranked from highest to lowest in the troop, and within each of those families each baboon is also ranked for dominance. What this means practically, said Seyfarth, is that within each group, there is one baboon that can go wherever she wants, eat whatever she wants, and sit wherever she pleases. All the other baboons will give way to her. Then there is a second baboon that can do all the same things, except with respect to the top baboon, and so it goes down the dominance hierarchy to the lowest-ranked baboon in the group. Seyfarth and Cheney have found over the years that the dominance ranks within families are as stable as those between families. Some families will, en masse, give way to other families, while within families there is a number one baboon, a number two baboon, and on, until the last baboon.
Some vocalizations are universal within the baboon group. For example, all the baboons seem to grunt all the time. Also, some cries are given in only one direction—up or down the dominance ranking. Screams and fear barks are given only to those higher in rank, and threat grunts only to those lower in rank than the grunter. The scientists also found that baboon calls are individually distinctive. Because of this, a third-party baboon can tell a great deal about the social dynamic in a group of animals just by listening to an exchange between them—he can tell which is more dominant and which individuals are involved, and therefore to what family they belong.
The researchers and some of their colleagues decided to exploit this information in an experiment in which they recorded baboon utterances and then played them back to baboon listeners. For example, they played the threat grunts and fear barks of two baboons that would normally give these kinds of calls to each other (the threat grunter was higher in dominance than the barker), and they also manipulated the recordings so that an interaction sounded as if it defied social order—a lower-ranked baboon threat-grunted at a higher-ranked baboon, and it fear-barked back.
Experiments like this have found that when played a “normal” interaction, the baboon listeners will either ignore it or look at the source of the sound for a short amount of time: the dynamic is normal to them and doesn’t arouse surprise or require further investigation. When a baboon looks longer, it suggests that what it just heard has caught its attention and violated its expectations in some way, as in the case of the vocalizations that subverted the baboons’ ranking. The baboon listeners looked longer for the source when the interaction violated normal expectations. These results confirmed that individual baboons recognized the ranking of others.
Seyfarth and Cheney’s team also wanted to know if individual baboons understood the family rankings in the group. Two researchers waited until two baboons of different families were sitting near each other. First they played them a recording of a high-ranking baboon arguing with a low-ranking baboon, both from different families than those of the listeners. Typically enough, the sitting baboons paid little attention. Then, a few weeks later, the experimenters played a recording of a fight between an unrelated baboon and a baboon from the same family as the high-ranking listener. In this case, the low-ranking listener looked up at the higher-ranked baboon, as if to see what she would do next. The researchers later played a recording of a fight between a family member of the high-ranking baboon and a family member of the low-ranking baboon. Immediately, the listeners looked at each other, indicating their awareness of the family relationships.