THIRTY years ago the first effort was made to reconstruct the history of human differentiation by employing the genetic divergence observed among human groups. The data base comprised gene frequencies, that is, frequencies of alleles at polymorphic loci known to be clearly inherited. Observed frequencies are very stable and seem to be rather insensitive to short-term environmental change. There are, however, very few if any data from the past, and stability in time is inferred from the stability in space, essentially the regularity of gene-frequency distributions and the very small differences usually observed among populations that live in widely different environments. Fortunately, in very recent times, new developments in molecular technology have generated the hope of obtaining substantial information from individuals or populations that have been dead for a long time.
Data from physical anthropology (including skin color, body build, and facial traits) had previously been the only source of information. Some of these data, especially measurements on bones, have the great advantage of being readable in the fossil material. Unfortunately, data available for the past have shown conspicuous changes in the last 200 years, as, for instance, the trend to increase in stature and changes in other measurements observed in Europe. It is difficult to ascribe these observations to genetic causes, and it is more likely that they represent responses to recent environmental changes. They are therefore less suitable for the study of genetic history. Even so, major differences observed in the fossil material have been important for reconstructing the general lines of evolution of the genus Homo. More detailed conclusions are still controversial because of the rarity of informative specimens and of dating difficulties in the time range of greater interest. Some of these limitations are slowly being removed.
Genetic data about extant populations useful for our purposes are extremely numerous. Two of the first polymorphic loci discovered, the ABO and RH blood groups, had considerable clinical importance and were tested very widely. Many other markers with no clinical interest were nevertheless investigated in many populations because of the anthropological information they can provide. Unfortunately, the existing data vary widely in number and geographic distribution. If they had been collected more systematically according to a rational plan, as occurred for important markers like those of the HLA system, we would have a much more informative body of data. Today molecular genetics is providing us with enormously more powerful technology, but the data base thus generated is still minimal, and we should better organize our future efforts.
There is another important reason for starting a major program in analyzing human diversity now. While our potential skills for analyzing human evolution are increasing, social changes taking place in developing countries are rapidly destroying the identities--if not the very existence--of the most important aboriginal populations. Thus, organized research efforts to save this precious information about our past have acquired a new urgency. Fortunately, recent technical developments make the prospects very exciting, so that this is a good time for taking stock of available knowledge and using it as a guide for planning future research.
This book was started with the desire to analyze the geography of human genes, using new techniques we have developed for the purpose of studying ancient human migrations. While the very demanding work of computerizing the enormous data base in existence was proceeding, it became clear that there was a need to analyze the same information with other techniques, developed by us and by others, which can lead to conclusions of historical interest. But the challenging task of reconstructing the history of human evolution can hardly be entirely satisfactory using only evidence provided by the genetic data. Information from historical, linguistic, anthropological, and archaeological sources is also useful, and it should be compared with the genetic evidence if we wish to reach fully satisfactory conclusions.
Needless to say, all these sources have their own limitations. Relevant data from history are infrequent, far from quantitative, and do not usually probe deep enough in time. Archaeology says very little about the physical populations it studies, but it gives dates and some, however vague, information on demography, especially on numbers of people, that are important for predicting the rates of genetic evolution. But archaeologists often find it difficult to distinguish between the migration of people and the diffusion of artifacts or the techniques for making them. Linguistic change follows rules that are somewhat analogous to those of genetic evolution, except that it is much faster and the reconstruction of early stages is therefore especially difficult. Moreover, languages are sometimes replaced by others of totally different origin in a very short time, partially blurring the concordances. Physical anthropology can be misleading because certain physical traits observed in bones can sometimes change quickly with changing environmental conditions. Only genes almost always have the degree of permanence necessary for discussing fissions, fusions, and migrations of populations that took place during the history of our subspecies, which goes back for at least 100,000 years. A large fraction of the genetic variants we study appeared before that time. Their relative proportions have changed considerably since and can orient us in understanding population history.
Although population geneticists often summarize knowledge about the archaeology, history, and linguistics of the ethnic groups they have studied, there has been no comprehensive treatment or attempt at a global picture of our species from the points of view of general history that are relevant for genetics. We hope to fill this gap with the present volume. In the first chapter we give some general historical information on the subject, a discussion of the concept of race, its failure, and an elementary introduction to the major analytical techniques used for our purposes. We have tried to make the book readable to scientists of as many disciplines as possible, given that not only geneticists but also scholars from fields as diverse as archaeology, anthropology, history, geography, and linguistics have a potential interest in the subject. Most barriers to cross-disciplinary exchanges are the result of the specialized vocabularies of each field, and we have tried to counter this limitation as much as possible. This is tantamount to saying that lay readers could also understand this book, if they have the motivation necessary for going through a scientific analysis. Inevitably, the discussion is kept at an elementary level in each of these disciplines, and the language used is as simple as possible. Statistical methods and basic population genetics theory are explained in a qualitative way with economical use of scientific terms, all of which are defined at their first introduction.
The second chapter is dedicated to an analysis of the world data with the aim of understanding the general history of Homo sapiens sapiens. Trees of descent are reconstructed and compared with archaeological data and linguistic classifications. Other methods of analysis are applied to the global data for an evaluation of the genetic structure of the species as a whole.
The five chapters that follow are dedicated to the major geographic subdivisions of the inhabited Earth. We start with the continent where the genus and probably also the subspecies to which we belong first developed, Africa, and then proceed with the other continents successively occupied, though not in the strict order of occupation: Asia, Europe, America, and Oceania. In each chapter we briefly discuss geography and ecology, and then history, starting with paleoanthropological and archaeological information. We pay special attention, when possible, to population numbers and densities, as well as to migrations that have special relevance for the evolutionary processes in which we are interested. Physical anthropology and linguistics follow. Then an analysis of the available genetic data is given for each continent in general and for its most important subsections. Geographic maps of genes for which there is enough information are given for the world and each continent at the end of the volume. "Synthetic" geographic maps derived from them are given in the text and show the major genetic patterns that can be abstracted from the total genetic "landscape" by suitable methods. There is not always enough genetic information to make full use of or to interpret all the historical and other nongenetic information given in the first sections of each chapter, nor is there enough of the latter to explain all details of the former, but we hope the unused information may be a stimulus for further research.
The last chapter is an epilogue that discusses our conclusions in general terms from a methodological point of view, together with the most urgent problems facing the continuation of research at this crucial time. We now have the tools for doing a much better job than has been done thus far at the level of both data collection and analysis. There is, of course, room for improvement in both, but the usefulness of living populations is being destroyed by a rapid increase in the rate at which human populations are vanishing. The mixing of formerly isolated groups is especially damaging for future research. This is a critical time for organizing our efforts before we lose a unique opportunity for understanding our genetic heritage.
As already mentioned, the second half of the book is dedicated to geographic maps for all genes for which the amount of data of aboriginal populations was deemed adequate. It is difficult to establish an objective criterion for deciding when data are sufficient for making a map, and the choice of alleles and continents represented in the maps was in part subjective. Gene frequencies from samples that were geographically too close had to be averaged before they were used in constructing maps. For different populations inhabiting the same region, we had to choose between discarding some of them or pooling them. In general, when there were both aboriginal populations and late settlers that could be easily distinguished, the former were chosen. The pooling of distinct populations living in the same narrow area generated local heterogeneities, which were systematically estimated and are shown on each map.
For satisfactory map construction, the regularity of the geographic distribution of the data is even more important than the total number of observations. Even for the most intensively studied genes, some areas are not well sampled. In order to give an idea of the strengths and weaknesses of each map from the point of view of the spatial distribution of data, we have indicated the locations at which data were available, as well as significant local heterogeneity, if any. No smoothing of the data could be perfect; we have therefore indicated where the calculated surface of gene frequencies departed significantly from the observations, and the direction of departure. A brief comment on the map of each gene is given in a special section of the appropriate chapter. These single-gene maps were used to generate the synthetic ones.
All gene frequencies obtained from the literature were used for building the geographic maps, but only a selection of populations tested for a greater number of genes was employed for tree analysis. The two methods, trees and geographic maps, can be considered complementary descriptions of the same reality. The first stresses historical aspects, and the second the geographic ones. The historical interpretation of trees needs to be strengthened by tests of the validity of the hypotheses underlying them, which is sometimes possible. We usually find good agreement between genetic and nongenetic information, which encourages us further to believe in our conclusions. If nothing else, the presentation of clear hypotheses that can be tested is, we believe, a valuable contribution.
The gene frequencies of the population samples used for tree analysis in the various chapters and their sources are also given in the second part of the book (Appendixes 1 and 2). The bibliography of gene frequencies used for trees and maps is separate from that of works cited in the text (Appendix 3 and Appendix Bibliography). The largest part of gene-frequency data is also found in earlier tabulations that report the relevant sources, and we make specific reference to them population by population.
The task we set before ourselves was not an easy one, and we hope critical readers will recognize that the need to summarize a substantial amount of information of varied nature has inevitably generated the possibility of important omissions and errors. In particular, we apologize to authors who may feel their work has not been adequately considered. In many cases we have preferred to give our conclusions without comparing them with dissenting ones. Our excuse is that we wanted to present testable hypotheses and indicate the basis on which we have accepted or discarded them, without attempting to be fully comprehensive (a nearly impossible task). We are hopeful that our effort will help to spread knowledge and interest in human population genetics, and to recognize the usefulness of thinking in multidisciplinary terms. Much work is necessary for filling important gaps, for organizing future research more satisfactorily at an international level, and for making full use of the power of present techniques at this critical time, when crucial information is slipping out of our hands.
File created: 8/7/2007
Princeton University Press