I have repeated the experiment with a much larger set of populations:
English_D, British_D, Ukranians_Y, Karitiana, Spaniards, Sardinian, Serb_D, Mordovians_Y, Irish_D, French, Finnish_D, Chuvashs_16, Romanian_D, N_Italian_D, French_Basque, Austrian_D, Russian_D, Hungarians_19, Kent_1KG, German_D, Belorussian, Tuscan, Lithuanian_D, Orkney_1KG, Dutch_D, TSI30, Ukrainian_D, Bulgarians_Y, Bulgarian_D, Russian, Swedish_D, Pais_Vasco_1KG, French_D, Castilla_Y_Leon_1KG, Lithuanians, San, Polish_D, Romanians_14, Orcadian, Cornwall_1KG, Valencia_1KG, North_Italian, FIN30, Norwegian_D, CEU30
I used Sardinians as the Caucasoid reference population, Karitiana for Mongoloids, and San for Africans. The latter two were chosen because they live at maximally opposite corners of the Earth (South America vs. South Africa).
A first plot of the f4 statistics used for f4 regression ancestry estimation is seen below:
Clearly, some evidence of a cline is present, but several populations appear to deviate from it. In order to get the cleanest possible cline, I carried out the following greedy procedure: I calculated the correlation coefficient of the full set, then iteratively removed the one population whose removal led to the maximum improvement of the correlation, until no further improvement took place. The following populations were removed with this procedure:
Spaniards, Serb_D, Romanian_D, N_Italian_D, Tuscan, TSI30, Bulgarians_Y, Bulgarian_D, Castilla_Y_Leon_1KG, Romanians_14, Valencia_1KG
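The greedy pruning described above can be sketched roughly as follows. This is only an illustration, not the script actually used: the function name `greedy_prune`, the use of the absolute value of Pearson's r, and the `min_keep` and `tol` stopping parameters are all assumptions added here, since with real data removing a point nearly always changes the correlation by some tiny amount.

```python
import numpy as np

def greedy_prune(x, y, labels, min_keep=3, tol=1e-9):
    """Drop one point at a time, always the one whose removal raises |r| most.

    x, y     : the two f4 statistics plotted against each other (one value per population)
    labels   : population names, same order as x and y
    min_keep : hypothetical floor on the number of populations kept (assumption)
    tol      : hypothetical minimum gain in |r| to count as an improvement (assumption)
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = list(range(len(labels)))
    removed = []
    best = abs(np.corrcoef(x[keep], y[keep])[0, 1])

    while len(keep) > min_keep:
        # Try removing each remaining population and record the resulting |r|.
        trials = []
        for i in keep:
            rest = [j for j in keep if j != i]
            r = abs(np.corrcoef(x[rest], y[rest])[0, 1])
            trials.append((r, i))
        r_new, drop = max(trials)
        # Stop as soon as the best single removal no longer helps.
        if r_new - best <= tol:
            break
        keep.remove(drop)
        removed.append(labels[drop])
        best = r_new

    return [labels[i] for i in keep], removed, best
```

Calling `greedy_prune(x, y, labels)` returns the retained populations, the removed ones in order of removal, and the final correlation. Being greedy, it does not guarantee the best possible subset, only that no single further removal improves the fit.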
This seems to make sense, as all these are southern European populations. Note that their removal does not mean that they do not partake in the same phenomenon as northern Europeans: they also exhibit Karitiana-shift relative to the Sardinians, but there are probably other confounding factors that make them fall "off-cline". Including them would diminish the clarity of the cline for Northern European populations. The regression of the remaining populations can be seen on the right:
I can't say that I've made any obvious mistakes, but these admixture proportions are substantial, and call for an explanation. Whatever their true levels, I am fairly confident of at least a few points:
First, it is evident that northern Europeans have higher levels of this element than southern Europeans; the latter are not altogether deficient in it, but they fall "off-cline", making estimation of their admixture proportions more difficult.
Second, within northern Europe, there is a fairly clear east-west cline of diminishing Amerasian-like admixture. The minimum occurs in Sardinians and secondarily in Southwest Europe. Romance, Celtic, and Germanic populations all have less of it than Balto-Slavic and Uralic ones. And, some populations of northeastern Europe seem to have a noticeable excess of it.
The groups with the most Amerasian-like admixture possess Y-haplogroup N, a clear trace of eastern ancestry that is not shared by most Europeans. The arrival of this haplogroup, either with Comb Ceramic of the Baltic Neolithic or later with Seima-Turbino Bronze Age expansions, is probably responsible for the local excess in Northeastern Europe. The Chuvash are, of course, a Turkic population but of Finno-Ugrian genetic origin.
But the presence of this element even in Western Europe cannot be explained on the basis of typically Mongoloid elements, which are almost completely lacking there. If Mesolithic Europeans were themselves Asian-shifted, then this would account for the presence of the element, but not necessarily for its clinal manifestation. The double (north-south and east-west) cline shows every sign of an intrusive element. So, for the time being, I will propose that this is associated with late (e.g., Copper and Bronze Age) phenomena, such as the northern stream of the Bronze Age Indo-European invasion of Europe.
This may be due to either
- (i) northern Indo-European groups picking up some native east European or Siberian elements as they made their way into Europe,
- or (ii), more likely in my opinion, the Y-haplogroup R1 group of people, whose closest relatives are in Central/South Asia (R2) and whose more distant relatives (Q) are in Siberia and the Americas, having been an "intermediate population" between West and East Eurasia from the beginning. The R1 group, in its R1b and R1a varieties, first appears in Europe during the Copper Age and is lacking in early Neolithic sites.
Eight years ago, and in a totally different context, I wrote:
Similarly, 9 out of 10 Basques are descended from a man who has also fathered 9 out of 10 Kets from Siberia and 9 out of 10 Maya Indians from America. That man, founder of haplogroup P thus has descendants who belong to two of the major human races (or three, if Amerindians are considered as separate from Asian Mongoloids)
In conclusion, human continental populations form groups of genetic and phenotypic similarity, and these groups can be considered races in the phenetic sense. However, these groups are not monophyletic, hence in the cladistic sense they should not be considered as valid taxa. Since the principle of common descent is generally applied in modern systematics (or at least it should!), I think it's best not to recognize human subspecies.
(A raw dump of fourpop output can be found here).