
Language Families and the Pre-History of English 59

2.4 Meeting the Ancestors II

In the last section, we noted that as nineteenth-century work on linguistic (and genetic) relatedness progressed, scholars moved away from the search for the original Adamic language of humankind, and instead became increasingly focused on the provenance of particular language families, such as the Indo-European. Interestingly, as the twentieth and twenty-first centuries have unfolded, research has increasingly moved back to seeking out the ultimate original source. Palaeo-anthropology and, more recently, genetics have sought to identify our original human ancestors, their Urheimat and subsequent patterns of dispersal. Certain linguists have also joined in this search, seeking to establish early links between the large, recognized language families, and in some cases even positing an original linguistic matriarch, known as Proto-World. Thus, proposals have been made for, among others, a genetic affiliation between IE and Semitic (Levin, 1971); IE and Sino-Tibetan (Pulleyblank, 1978); Sino-Tibetan and Austronesian (Wurm, 1982); Sino-Tibetan, North Caucasian and Yeniseian (Starostin, 1984); and North Caucasian and Etruscan (Orël and Starostin, 1990) (cited in Trask, 1996: 380; see Chapter 13 for a detailed discussion).

There have also been proposals for super-families, one of the most famous being that for Nostratic (from Latin nostrates ‘our countrymen’), which links the IE, Uralic, Altaic, Afro-Asiatic and Kartvelian families. Although it was first proposed by Holger Pedersen in 1903, detailed work on Nostratic began in the 1960s with the Russian linguists A. Dolgopolsky and V.M. Illich-Svitych, the latter of whom added Dravidian to the family. The Nostratic grouping has perhaps received the greatest amount of detailed work done in the historical linguistic tradition: Illich-Svitych, for example, applied the comparative method to reconstructed data from the individual Ursprache of each language family (such as PIE, Proto-Uralic, and so on) and reconstructed a detailed phonological system for Proto-Nostratic, as well as some seven hundred lexical items and a few grammatical processes.

Other well-known proposals for super-families have taken shape through the work of Joseph Greenberg, who used a much more controversial methodology for his groupings, namely multilateral comparison. This involves comparing huge numbers (hundreds or thousands) of words from the languages being investigated and determining whether there is a high frequency of resemblance between any of them. In cases where the number of resemblances is deemed to be significant, a genetic relationship between the relevant languages is declared. Greenberg used this technique throughout his career, establishing independent classifications for the Australian and African families of languages which have largely been accepted. For example, in relation to the latter, he classified the 1,500 or so African languages into four families, a categorization to which there is ‘almost no serious opposition’ (Trask, 1996: 386). Again using multilateral comparison, he proposed another super-family, termed Indo-Pacific, which links the Papuan languages of New Guinea and surrounding islands, the languages of the Andaman Islands and also the now extinct languages of Tasmania. The Indo-Pacific grouping has remained controversial, but even more so has been Greenberg’s

60 The History of English

classification of the languages of the Americas, of which there are approximately 650. He maintained the existence of two already established families, those of Eskimo-Aleut and Na-Déné, which comprise about 50 languages, but grouped the remaining 600, spanning North and South America as well as the Caribbean, into a single family which he named Amerind. The vituperation received by this hypothesis, and indeed by Greenberg, from fellow linguists was enormous: Trask (1996: 387) cites criticisms which label the work as ‘worthless’, ‘crude and puerile’, ‘misguided and dangerous’ and ‘completely unscientific’. On the other hand, Greenberg’s thesis has received support from areas of research outside linguistics, such as genetics. For example, the geneticist Cavalli-Sforza, researching gene maps for the Native American population, has postulated the existence of three distinct population groups ‘whose distribution corresponds remarkably well to the distribution of Greenberg’s three language families, with the Eskimo-Aleut and Na-Déné speakers being genetically noticeably distinct from the comparatively homogenous remainder: Greenberg’s Amerinds’ (ibid.: 387–8). Before his death in 2001, Greenberg, undeterred by his critics, was working on yet another super-family, Eurasiatic, which would include IE, Uralic-Yukaghir, Altaic, Korean, Japanese, Ainu, Gilyak, Chukchi-Kamchatkan and Eskimo-Aleut.
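The logic of multilateral comparison described above can be made concrete with a minimal sketch. It is not Greenberg's procedure: it assumes ordinary spellings as a stand-in for phonetic forms, uses Python's `difflib.SequenceMatcher` as an arbitrary measure of resemblance, and introduces a hypothetical `threshold` parameter to decide what counts as a resemblance. The word lists cover only six meanings in three languages, far fewer than the hundreds or thousands of items the method calls for.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Six meanings in the same order for each language:
# 'father', 'foot', 'snow', 'three', 'two', 'woman'.
# Spellings (lowercased, 'ß' written as 'ss') stand in for phonetic forms.
WORDS = {
    "English": ["father", "foot", "snow", "three", "two", "woman"],
    "German": ["vater", "fuss", "schnee", "drei", "zwei", "frau"],
    "Spanish": ["padre", "pie", "nieve", "tres", "dos", "mujer"],
}


def resemblance(a, b):
    """Return a similarity score between 0.0 and 1.0 for two word forms."""
    return SequenceMatcher(None, a, b).ratio()


def count_resemblances(threshold):
    """Count, for each language pair, the meanings whose forms
    resemble each other at or above the (hypothetical) threshold."""
    counts = {}
    for lang1, lang2 in combinations(WORDS, 2):
        pairs = zip(WORDS[lang1], WORDS[lang2])
        counts[(lang1, lang2)] = sum(
            resemblance(a, b) >= threshold for a, b in pairs
        )
    return counts


if __name__ == "__main__":
    # The same data give different tallies under different thresholds.
    for threshold in (0.3, 0.5, 0.7):
        print(threshold, count_resemblances(threshold))
```

Running the script with several thresholds shows the tallies shifting as the cut-off moves. In this sketch the cut-off is at least stated openly, whereas a criterion left to the analyst's judgement is not.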

It perhaps goes without saying that all such work is always going to be controversial. The majority of historical linguists, while having a better than average grasp of the workings of various languages, have a less than superhuman knowledge of all of them. Knowledge of the various language families therefore tends to be specialized, and it is difficult, if not impossible, to wholly endorse or productively scrutinize groupings which include families with which the researcher is not familiar. It is possible too that certain groupings contradict researchers’ perceptions about genetic affiliations, and their reactions are consequently negative: Trask (ibid.: 379), for example, states that racist perspectives once precluded the incorporation of Chadic, whose native speakers are predominantly Black, into the larger Afro-Asiatic family, which contains languages whose speakers are primarily White. There is also a very real issue of time-depth. We know that languages undergo continuous change and given enough time (say, a few thousand years) can evolve into mutually unintelligible forms, despite a genetic relationship. In addition, the written record is relatively sparse – many languages have never been written down, no known text is more than five thousand years old, and no language’s entire history to date is represented by available texts. Consequently, as we have seen, proto-forms, both at the head of and within a language family, have typically had to be reconstructed. Because the comparative method is well established in reconstructive methodology and possibly also because such reconstructed forms have an anchor in actual data, we have come to accept that by and large, ‘shape’ can be given to languages spoken up to about eight thousand years ago (bearing in mind, of course, issues such as semantic change). However, when the reconstructions themselves become the database for further reconstructions, researchers may be forgiven for becoming somewhat nervous.
Remember that reconstruction can depend heavily on the database used, as well as on what the researcher considers plausible avenues of
change. Given this measure of subjectivity, the use of unattested forms as ‘DNA’ for hypothetical languages such as Proto-Nostratic, for example, which, if it had existed, would have been spoken 10,000–15,000 years ago, or for Proto-World (up to at least 100,000 years ago), leaves even the most supportive linguist feeling rather unsettled. We are left in the position of only being able to say whether such reconstructions, given the database used and the methodology employed, are plausible; but they bring us no closer to verifying or falsifying the existence of such early languages or their genetic affiliations. Therefore, what is typically presented as an exercise in rationality is very often an act of faith, which may go some way towards explaining the sometimes personal vehemence of the criticisms levelled against this kind of work.

This brings us to an important point: how rational, how scientific, are the methods used for grouping and reconstructing language families? One of the side effects of considering if, and how, we can discover languages and genetic affiliations with no recorded data, is the deeper scrutiny of what we do when we actually have data. Even though some of the results of Greenberg’s multilateral comparison, for example, have become largely accepted (as in the case of the Australian and African classifications), the method itself has been criticized as unscientific and subjective. Thus, McMahon and McMahon state that, in terms of his Amerind classification, Greenberg fails to make explicit the criteria on which phonetic and semantic resemblances between the languages being compared are based, and so ‘mass comparison could produce a whole range of results, depending on the linguist’s personal judgement’ (2003: 18). The shortcomings of this methodology are often no doubt emphasized in contrast to its seemingly more sensible alternative, comparative reconstruction. But even this long-accepted procedure has its limitations. We have already mentioned the fact that the method does not preclude a measure of subjectivity in terms of reconstructed forms, and that it cannot reliably work beyond a certain time-depth. There are, however, other issues. For example, as we have seen, regular and repeated correspondences between cognates are taken as symptomatic of genetic affiliation between languages. In addition, cognates are drawn from the ‘stable’ elements of a language’s lexicon, in order to minimize the risk of contamination from chance similarities and loanwords. It is important to note, though, that this only works if the assumption of stable lexis is true.
McMahon and McMahon point out that this has never been tested (nor is it clear how it might be); and there are also attested cases where borrowing across allegedly stable elements has occurred: witness, for example, the ‘everyday’ borrowings from Old Norse into Old English at the end of the Anglo-Saxon period (see Chapters 1 and 3). In cases where knowledge of the socio-historical context of a language’s development is sparse or non-existent, we cannot always be certain of the validity of the cognate database. The result is, of course, that we could end up constructing an inaccurate system of relationships. More generally, an overarching issue with the comparative method lies with the fact that it is not, ultimately, an objective and generalizable method for determining linguistic affiliations. McMahon and McMahon (ibid.: 13–14) quote a contribution on the HISTLING on-line discussion group which states that ‘it is a mistake to assume that the
C[omparative] M[ethod] is a set of airtight procedures which, if followed faithfully, will produce the desired answers – genetic relationships will automatically emerge’ (Bob Rankin, 2 May 2002). Indeed, the method is typically elucidated in the context of its application to families such as Indo-European (as in this chapter), which becomes its anchor in extrapolation to ‘newer’ cases. With no set procedures, the method is essentially heuristic in nature and therefore very open to erroneous conclusions and groupings, especially in cases where the linguist has little or no knowledge of the languages in question.

As such, no statistical, objective tests exist for the determination of genetic affiliation, as well as the degrees of that affiliation, between languages. This is perhaps because:

There has been a strong tendency to see comparative linguistics as an art rather than a science, and as requiring sensitivity and depth of knowledge of one particular language group on the part of the individual scholar, rather than generalisable techniques which allow the processing of large quantities of data, regardless of the region or family from which they come.

(McMahon and McMahon, 2003: 9)

Given this state of affairs, it is not surprising that language classifications and relationships remain an area of controversy. It is also not surprising that some researchers are now looking at ways in which the discipline can be made more disciplined. The benefits of this would be mainly twofold. First, the use of an objective, testable and generalizable methodology would allow the field itself to progress: researchers could apply a defined set of procedures to relevant data from any set of languages and statistically determine whether genetic affiliations existed or not. For those so inclined, this would have the potential for ultimately testing super-family groupings. Second, it would bring comparative work in line with methodologies employed by other disciplines such as genetics and archaeology which, with some similar aims in mind, are increasingly taking note of linguistic attempts to ‘meet the ancestors’. At the moment, the lack of quantitative analysis and evaluation in comparative linguistics means that it is not producing data which is ‘interpretable and usable by neighbouring disciplines’ (ibid.: 20). The authors also point out that at the moment, hypotheses (and their methodologies) which are viewed as unreliable by linguists, such as those of Greenberg, often find sympathetic ears among practitioners in other fields, who may see them as bold and progressive ideas. In addition, given the lack of other options, such practitioners may take it upon themselves to develop those ideas and create their own approaches to quantification and evaluation of linguistic data. If linguists want to convincingly demonstrate their reservations about certain classifications, as well as retain control of the data, then they must also be able to provide viable alternatives.

McMahon and McMahon are currently spearheading a project geared towards developing quantitative methods for language classification. Detail on their work so far is available on their website,⁸ and in McMahon and McMahon (2003), but I will outline the main points here. First, they are working within a framework proposed by Embleton (1986: 3; quoted in McMahon and McMahon, 2003: 21) for developing a quantitative methodology in linguistics. Embleton (ibid.) states that this is a three-stage process, involving (1) developing a procedure, based on
either ‘theoretical grounds . . . a particular model, or on past experience’; (2) verifying the procedure by applying it to data ‘where there already exists a large body of opinion for comparison’; and (3) applying the procedure to cases where linguistic opinion has not yet been established or produced. Thus, the method, once devised, can be tested for reliability on data that is already categorized and if it succeeds, be generalized to new, untested cases.

McMahon and McMahon have decided to test their procedure on the established Indo-European family, using a cognate database of 200 lexical items. They emphasize, however, that their primary focus in determining cognacy is meaning rather than form; that is, if two words in two languages have the same meaning, then they are considered cognates. Their data is based on Dyen, Kruskal and Black’s (1992) 200-word comparative lists for 95 IE languages and dialects⁹ and includes items with meanings such as ‘and’, ‘father’, ‘foot’, ‘snow’, ‘three’, ‘that’, ‘two’, ‘woman’, and so on. Dyen et al. also provide numerical information, namely the percentages of both cognate and non-cognate material that holds between each pair of languages. The latter percentage (that of non-cognate material) yields a ‘distance matrix’, which indicates the level of distance between the two languages being compared. This data has been fed into programs from the PHYLIP package, which were developed for the reconstruction of evolutionary histories in biology. As the authors state, the programs treat the numerical encodings of genetic and linguistic data in exactly the same way, and run them through the same quantitative procedures. They also have an additional advantage: the programs do not just produce one tree of relationships but instead, generate a set of plausible models. They then select the tree which is most appropriate to the data that has been input; that is ‘the tree where branch lengths and order of branching are most consistent with the distances in the data matrix’ (McMahon and McMahon, 2003: 29–30). Thus, the effects of a linguist’s preconceptions or judgements about linguistic relationships are minimized in the final result, which, as a product of a set of objective procedures, can also be reliably tested for statistical significance.
It must be borne in mind, of course, that the programs are only as effective as the data they are given, and it is arguable that the initial determination of cognates, as well as of the degrees of relationship between them, is inevitably subject to human error. However, this is currently unavoidable, and the only solution would seem to be to retain an awareness that there must be a margin of error for all results.
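The step from cognate judgements to a distance matrix, and from there to a tree, can be sketched in a few lines. This is an illustration only: it does not call PHYLIP, and it substitutes simple average-linkage clustering (UPGMA) for the tree-building procedures of the PHYLIP programs. The languages `LangA` to `LangD` and their cognate-class codes are hypothetical, invented purely to show the arithmetic. Each number labels a cognate class for one meaning, so two languages share a cognate for that meaning when their codes match.

```python
from itertools import combinations

# Hypothetical cognate-class codes, one entry per meaning slot.
# Matching codes in the same position mean the two words are cognate.
CODES = {
    "LangA": [1, 1, 1, 1, 1, 1, 1, 1],
    "LangB": [1, 1, 1, 1, 1, 1, 2, 2],
    "LangC": [1, 1, 2, 2, 2, 2, 3, 3],
    "LangD": [1, 1, 2, 2, 2, 3, 3, 4],
}


def distance(a, b):
    """Proportion of meaning slots in which the two words are NOT cognate."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(codes):
    """Map each unordered pair of languages to its non-cognate proportion."""
    return {
        frozenset((p, q)): distance(codes[p], codes[q])
        for p, q in combinations(codes, 2)
    }


def average_distance(dist, members1, members2):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(dist[frozenset((m, n))] for m in members1 for n in members2)
    return total / (len(members1) * len(members2))


def upgma(codes):
    """Repeatedly merge the two closest clusters; return the tree
    as nested tuples of language names."""
    dist = distance_matrix(codes)
    # Each cluster is a pair: (subtree, list of member languages).
    clusters = [(name, [name]) for name in codes]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: average_distance(
                dist, clusters[ij[0]][1], clusters[ij[1]][1]
            ),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


if __name__ == "__main__":
    print(distance_matrix(CODES))
    print(upgma(CODES))
```

With these invented codes, `LangA` and `LangB` differ in two of eight slots (a distance of 0.25), as do `LangC` and `LangD`, while every cross-pair differs in six of eight (0.75), so the clustering groups the languages into two pairs. The tree follows mechanically from the matrix, but the codes themselves still rest on human cognacy judgements, which is the caveat raised above.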

The trees generated as most plausible by the three PHYLIP programs used (Neighbor, Fitch and Kitsch) produced very similar patterns of grouping. In addition, they were not significantly dissimilar to the traditional IE tree. Repeated tests, involving re-sampling of the cognate data, demonstrated that the sub-family groupings, such as Germanic, Celtic, Slavic and Romance, were ‘extremely robust’ (ibid.: 36). In different runs, the languages within those sub-families do appear in slightly different permutations, but all languages consistently stay within their sub-group. The fact that the computer programs largely confirm the less scientific groupings of traditional comparative linguistics is no bad thing: it validates the years of careful research and thought that have gone into determining linguistic relationships, and it also adds some weight to certain
