The Origins of the British: The New Prehistory of Britain

In spite of the apparent consistency of the P/Q split in the British Isles, Brythonic and Goidelic do have a number of similarities which make them even closer to each other than either is to the extinct celtic languages recovered from inscriptions in south-west Europe. This similarity leads to an alternative tree of celtic languages with three deep branches: the insular-celtic group, Gaulish/Lepontic and Celtiberian. We then need to ask whether the relatedness of Brythonic and Goidelic results from common ‘genetic’ descent, or from geographic closeness – a so-called areal phenomenon, as suggested by linguist Kim McCone of the National University of Ireland (Figure 2.4).[52]

This question of which language tree to use is not as academic as it seems, since it obviously affects any attempted reconstruction of which part of Europe (e.g. Spain vs southern France or Italy) each of the Brythonic and Goidelic branches might have come from and when. The importance of language history to anyone but linguists is, apart from natural curiosity, what it might contribute to the history or prehistory of peoples and their cultures. Although language is likely, on balance, to be rather more about movement of culture than about movement of people, both aspects are fascinating to the interested layperson as well as the academic. And as we shall see from the human genetics, the Irish legendary connection with Spain may not be as risible as some archaeologists make it out to be.

Dating celtic-language splits

From this thumbnail sketch of linguistic comparison we can see that, although the common ancestry of the Continental and insular-celtic languages is not in doubt, the structure of that relationship is anything but agreed by scholars. In trying to deduce the geographical origin of a group of languages in a historical context, the dating of splits and changes is as important to the historian as the reconstruction of the order of branching of the language tree is to linguists. Although in previous decades linguists confidently dated deep splits in languages on the evidence of word-sharing, some had their fingers burnt and, rather than try again, have tended to avoid the practice. Others have simply never strayed from the tight confines of the last couple of thousand years, for which period the written word in documents and inscriptions is the ultimate test of time and place.

Scholars studying celtic languages have shared this reluctance. For the British Isles, they do have the advantage that the large body of extant inscriptions and other texts provides a tremendous opportunity to look in detail at sound changes over the past 1,600 years. They can cross-check their dates against those determined by the archaeologists.[53] While this makes possible a microscopic in-depth exploration using the core tool of their craft (known as the comparative method), the evidence on which it is based needs to be rigorously determined and of high quality, which means that this approach cannot be extended back any further than the first celtic inscriptions, around 2,500 years ago. Celtic linguists are therefore content for the big questions of European Celtic homeland origins and dates, first posed so long ago, to remain on the shelf.

One of the main methods previously used to date language splits depended on measuring the degree of change in numbers of shared words (cognates – quite literally, words with a shared birth) between related languages. Since our vocabulary is also our dictionary or lexicon, this mathematical approach to language diversity is called lexico-statistics and the dating method glottochronology (see below). Family trees can be constructed, based on the degree of lexical sharing between related languages. Such trees, although they may look superficially similar to some of the trees produced using the strict comparative method, are fundamentally different in concept and meaning.

The comparative method is used in a rigorous tree-building approach (as with genetics) and places groups and subgroups of related languages on the tree, with exclusively shared sound changes (called innovations) appearing in new subgroups. Not only that, but such sound changes are expected to be similar and reproducible in cognate words throughout the lexicon. For instance, th is retained in English, but was systematically changed to d in Old High German, so that German der corresponds to English the, and dunne to thin. Systematic reproducibility of sound changes is a hallmark of the comparative method.
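The idea of a reproducible sound correspondence can be checked mechanically. The following is only a minimal sketch, not a tool linguists actually use: it takes a handful of English/German cognate pairs in their modern spellings (my own choice of examples, not the book's) and tests whether every English initial th is matched by a German initial d. The function names are hypothetical.

```python
# Illustrative English/German cognate pairs in modern spelling.
# ASSUMPTION: these examples are chosen for this sketch; only "the"/"der"
# and "thin" correspond to the pairs mentioned in the text.
PAIRS = [
    ("the", "der"),
    ("thin", "dünn"),
    ("three", "drei"),
    ("thing", "Ding"),
    ("thank", "danken"),
    ("thorn", "Dorn"),
]


def initial_correspondence(english, german):
    """Return (English onset, German onset) for one word pair.

    The English onset is reported as "th" when the word starts with it,
    otherwise as the first letter; the German onset is the first letter.
    """
    eng = english.lower()
    eng_onset = "th" if eng.startswith("th") else eng[0]
    return eng_onset, german[0].lower()


def is_regular(pairs, eng_sound="th", ger_sound="d"):
    """True if every pair shows the same initial correspondence."""
    return all(
        initial_correspondence(e, g) == (eng_sound, ger_sound) for e, g in pairs
    )


if __name__ == "__main__":
    for e, g in PAIRS:
        print(e, g, initial_correspondence(e, g))
    print("regular th : d correspondence:", is_regular(PAIRS))
```

A real comparison would work on reconstructed sounds rather than spelling, and would look at every position in the word, but the principle is the same: one exception-free pattern repeated across the lexicon.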

Counting words

In contrast, the lexico-statistical tree, while it ideally uses attested cognate words as the basic unit of measure, concentrates on the proportion of cognates shared between related languages, rather than structuring according to the strict rules of the comparative method. Lexico-statistics is thus more similar to phenetic analysis, meaning literally a comparison of phenotypes (the different varieties actually seen in a population, rather than the genes or ‘genotypes’ that underlie them). The phenetic approach was used in the past by population geneticists, and still is by physical anthropologists, to compare human populations around the world. Those biologists compared the frequency of common markers, such as different aspects of head shape, between populations to see how close they were to one another. The trouble with this statistical method is that for living human populations it gives very blurred trees.
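To show what grouping by overall similarity looks like in practice, here is a small sketch of average-linkage clustering on shared-cognate proportions. It is an assumption-laden illustration: the four languages A to D and all the proportions are invented, the function names are hypothetical, and this is a generic phenetic procedure rather than the method of any study cited here.

```python
from itertools import combinations

# ASSUMPTION: invented shared-cognate proportions for four made-up languages.
# These are NOT figures from any published dataset.
SHARED = {
    frozenset(("A", "B")): 0.80,
    frozenset(("A", "C")): 0.35,
    frozenset(("B", "C")): 0.33,
    frozenset(("A", "D")): 0.20,
    frozenset(("B", "D")): 0.18,
    frozenset(("C", "D")): 0.19,
}


def average_similarity(group1, group2, shared):
    """Mean shared-cognate proportion over all cross-group language pairs."""
    values = [shared[frozenset((a, b))] for a in group1 for b in group2]
    return sum(values) / len(values)


def cluster(languages, shared):
    """Repeatedly merge the two most similar groups.

    Returns the merge history as (group, group, similarity) tuples,
    from the shallowest (most similar) join to the deepest.
    """
    groups = [(lang,) for lang in languages]
    history = []
    while len(groups) > 1:
        g1, g2 = max(
            combinations(groups, 2),
            key=lambda pair: average_similarity(pair[0], pair[1], shared),
        )
        history.append((g1, g2, round(average_similarity(g1, g2, shared), 3)))
        groups = [g for g in groups if g not in (g1, g2)] + [g1 + g2]
    return history


if __name__ == "__main__":
    for g1, g2, sim in cluster(["A", "B", "C", "D"], SHARED):
        print(g1, "+", g2, "joined at similarity", sim)
```

Note that nothing in the procedure asks whether a shared word is an inherited innovation or a loan; it sees only the proportions, which is exactly why such trees differ in meaning from those of the comparative method.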

On this comparison of ‘genetic vs phenetic’ methods, the strict comparative approach may sound better and more rigorous, and for most academic purposes it is, but there are some problems. It is very difficult to use the comparative method for dating language change unless, as in the case of historical documents and inscriptions, one has a rigorous, historical/archaeological method of checking dates.

Lexico-statistical analysis, on the other hand, is a quantitative technique and thus lends itself more to the use of the data to estimate dates of splits back beyond the written word. Using the comparative method, individual words, say in English and German, can be identified as sharing a common ancestor by inheritance rather than by borrowing. So, for example, the two words mentioned above (thin and dunne) can be shown ultimately to have a common Germanic ancestor which has changed in a systematic way in each language. These words are cognates. By contrast, the fact that English beef and French boeuf have a common origin and meaning results from the Norman Conquest and is an example of borrowing. The French have more recently borrowed the same word back in bifteck (meaning ‘beef steak’).

Since some words drop out of use from individual languages, there is a decay process, rather as there is with radiocarbon. Counting up the proportion of remaining shared cognate words between two languages is thus some measure of the closeness of their relationship in time. This general principle can be used to reconstruct a tree, with dates on the branches. The dating method is called glottochronology.
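The radiocarbon analogy can be made concrete with the formula traditionally used in glottochronology. The sketch below is an illustration rather than anything taken from the studies discussed in this chapter. It assumes that each language independently retains a fixed proportion r of a basic word list per thousand years, so that two languages are expected to share r^(2t) of their cognates after t millennia apart. The default value r = 0.805 is an assumed textbook figure often quoted for a 200-word list, and the function name is hypothetical.

```python
import math

# ASSUMPTION: retention per millennium for one language; a conventional
# textbook constant, not a value taken from the book.
ASSUMED_RETENTION = 0.805


def divergence_millennia(shared_fraction, retention=ASSUMED_RETENTION):
    """Estimate time since two languages split, in thousands of years.

    Solves shared_fraction = retention ** (2 * t) for t. The factor 2
    appears because both languages lose words independently.
    """
    if not 0.0 < shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be in (0, 1]")
    if not 0.0 < retention < 1.0:
        raise ValueError("retention must be in (0, 1)")
    return math.log(shared_fraction) / (2.0 * math.log(retention))


if __name__ == "__main__":
    for shared in (0.9, 0.7, 0.5, 0.3):
        t = divergence_millennia(shared)
        print(f"{shared:.0%} shared cognates -> about {t:.1f} millennia")
```

The calculation is only as good as the assumption of a constant retention rate, which is the weak point taken up next.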

The chief problem with glottochronology is that the decay appears to occur at different rates in different language groups, and that puts a fatal flaw in the method. For this reason among others, most linguists long ago rejected the glottochronological method of dating language splits using lexico-statistical data as inaccurate. Furthermore, since the measurement of the percentage of shared cognates is a measure of decay of relationship, this also means that one has to assume accurate identification of cognates between more distantly related language groups if it is to be calibrated securely. Differentiating cognates from borrowed words becomes increasingly difficult the more distant the relationship.
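How much the rate problem matters can be seen by holding the proportion of shared cognates fixed and varying the assumed retention rate. This is again only a sketch under stated assumptions: the shared proportion of 0.33 is an arbitrary illustrative value, the four retention rates are invented to span a plausible-looking range, and the formula is the conventional constant-rate one rather than anything endorsed here.

```python
import math


def divergence_millennia(shared_fraction, retention):
    """Constant-rate estimate: solve shared_fraction = retention ** (2 * t)."""
    return math.log(shared_fraction) / (2.0 * math.log(retention))


# ASSUMPTION: one arbitrary shared-cognate proportion and four invented
# per-millennium retention rates, chosen only to show the spread.
SHARED = 0.33
ASSUMED_RATES = (0.75, 0.80, 0.85, 0.90)

estimates = {rate: divergence_millennia(SHARED, rate) for rate in ASSUMED_RATES}

if __name__ == "__main__":
    for rate, t in estimates.items():
        print(f"retention {rate:.2f} per millennium -> {t:.1f} millennia")
```

With these inputs the slowest-decaying case gives a date more than twice as old as the fastest, from identical word counts. Without an independent way of fixing the rate for a given language group, the date is largely a reflection of the rate one chose.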

There is a general issue which affects both the comparative and the glottochronological method: a large number of different languages on a single large landmass do not simply branch in a tree-like genetic manner from one another. Neighbourly languages co-exist in a multilingual environment and interchange their content: i.e. there may be a lot of borrowing of words and even syntax between languages. Where such neighbourly languages are largely related, as in Europe, the borrowings – although initially obvious – may become increasingly difficult to detect as borrowings rather than inheritance. Undetected borrowing distorts both kinds of language tree (comparative and lexico-statistical) and is probably the main underlying reason for structural differences between them.

So, what to do? There are basic differences between the disciplines of archaeology and linguistics on the one hand, and sciences such as geology and biology on the other. In their attitude to the scientific method, some linguists seem to misunderstand the meaning of, or are unable to accept, uncertainty. They interpret the scientific method as implying authority, rigour and certainty, while scientists accept that, in many situations, comparisons have to be made using measurements that have some degree of error and theories of classification with a degree of uncertainty. A statistical approach has to be used to handle such uncertainty. Unlike disagreements between academic authorities, observational error and uncertainty can be dealt with by standard methods. Archaeologists, in contrast to linguists, have learnt through experience that if a method such as carbon dating gives inaccurate results at first, it should not be thrown out of the window, but attempts should be made to sort out the problems of error and improve it.

My observer’s take on all of this is, ‘If at first you don’t succeed, try, try and try again.’ That does seem to be happening, at least amongst some linguists. A huge set of cognate data on Indo-European languages, originally published in the early 1990s by Hawaiian linguist Isidore Dyen,[54] one of the doyens of lexico-statistics, has recently been recycled in some high-profile publications.[55] Rather than just reinventing Dyen’s analysis and conclusions, these publications use new tree-building methods developed to deal with similarly heterogeneous data in genetics studies.

Dyen made an observation on celtic languages which has not really been disproved or falsified in subsequent re-analyses. Although it shows some relationship with the three Indo-European branches which are dominant in Europe (Germanic, Italic-Romance and Balto-Slavic, described as ‘Meso-European’ by Dyen), the celtic group tends to stand on its own as a deep branch sharing fewer than 20% of cognates with them. Put simply, this suggests that celtic languages separated from the other three groups before those three split from one another. Not only that, but the branching within insular celtic is also deep (see Figure 6.2a).

Dyen’s dataset includes seven dialects from surviving insular-celtic languages, two each of Irish and Welsh and three of Breton. His analysis confirms two general points from the comparative method: that these three groups are internally consistent (i.e. on a regional basis their dialects are as closely related to each other as are other modern dialects, such as different types of Swedish); and that Welsh and Breton dialects group together as Brythonic, separate from the Irish dialects (Goidelic).

But there the similarities stop, in terms of the expected degree of relationship. When Dyen analysed the two large Meso-European branches, Romance and Germanic, he found that each Romance dialect shared between 47% and 67% of cognates with each Germanic dialect. On the other hand, only 30% to 36% of cognates were shared between Brythonic dialects and Goidelic dialects, suggesting deeper splits within the celtic group. I should stress that this is not a false result of the kind that might follow from borrowing between Irish and Welsh[56] as a result of the geographical proximity of Ireland and Wales; rather the opposite – the older genetic relationship is apparently still strong. Measured by percentage of shared cognates, the deep ‘celtic split’ between Brythonic and Goidelic is of the same order as that between Lithuanian and Slavic languages or between the various Indic languages.

What does this mean? In a relative sense, it is consistent with Schmidt’s argument for a deep genetic split between Irish and Brythonic languages rather than McCone’s (later) insular-celtic classification, based on an areal effect (Figure 2.4).[57] Schmidt postulated a deep split between Goidelic (Irish) on the one hand and all the rest, including all the Continental celtic languages (Celtiberian, Lepontic and Gaulish) and Brythonic, on the other. This implies a very different history and age of separation of the Goidelic languages from the rest.

Edinburgh professor of linguistics April McMahon and her husband, geneticist Robert McMahon, confirm these deep celtic relationships in a re-analysis of Dyen’s data, using various different tree-building methods.[58] While the McMahons urge caution against rushing into dates,[59] Russell Gray and Quentin Atkinson of Auckland University have done just that in the journal Nature,[60] again using Dyen’s dataset. Gray’s own headline attempts to use lexical analysis to prove the validity of linguists’ trees of Pacific languages, based on cognate sets provided by the same linguists, have been justifiably derided for circular reasoning and false logic. However, his luck changed when he joined forces with the moderating influence of three other mathematicians, Quentin Atkinson, Geoff Nichols and David Welch, and worked on the Indo-European tree.
