Friday, October 30, 2015

Inferences that can be made from a Sudanese Arab's Gedmatch results

A while back I said I'd be sharing the various genomic results of both a Sudanese Arab man & a Nubian woman who were commercially tested (via 23andme) because this would allow for the utilization of more SNPs when dealing with analyses like admixture or PCAs.

The reason I wanted to do this is that while Dobon et al. 2015 sampled a lot of Nubians & Sudanese Arabs for about 200,000 SNPs, which should allow for high quality analyses, they utilized a genotyping chip whose SNPs overlap poorly with the chips used to genotype the samples in other datasets.
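To make the overlap problem a bit more concrete, here's a minimal sketch of how the usable marker count shrinks once datasets from different chips are merged. The rsIDs and panel contents are made-up placeholders and `shared_snps` is a hypothetical helper of mine, not anything from Dobon et al. or Eurogenes; the only point is that an analysis can use just the SNPs present on every chip.

```python
def shared_snps(*panels):
    """Return the set of SNP IDs present in every panel (hypothetical helper)."""
    sets = [set(panel) for panel in panels]
    return set.intersection(*sets) if sets else set()


# Placeholder rsIDs, purely illustrative; real chips carry hundreds of thousands.
chip_a = {"rs1", "rs2", "rs3", "rs4", "rs5"}
chip_b = {"rs2", "rs4", "rs6", "rs7"}

overlap = shared_snps(chip_a, chip_b)
print(sorted(overlap))  # ['rs2', 'rs4']
print(f"{len(overlap)} of {len(chip_a)} chip A markers survive the merge")
```

The same logic is presumably how a chip with about 200,000 SNPs ends up contributing only about 15,000 to a merged analysis.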

For example, the admixture analysis above from Dobon et al. 2015, if I recall correctly what I was told by one of the study's researchers via email, only utilized about 15,000 SNPs and, indeed, the decent enough PCAs I shared in my post about Dobon et al. also used such a low number of SNPs.

It's usually preferable for PCAs like the one below, for example, to utilize 100,000 SNPs or more for high quality results:

The PCAs I shared are good enough and the inferences that can be made from them like Bejas seeming as though they are intermediates between certain Horn Africans (Somalis, Habeshas, Wolaytas, Oromos) & Sudanese Arabs + Nubians aren't wrong at all. [note]

On the other hand, admixture analyses become increasingly unstable & therefore unreliable the fewer SNPs you use, so such a low number of genetic markers is problematic for them.
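As a rough illustration of why fewer markers means shakier admixture estimates, here's a toy simulation. It is not the model that ADMIXTURE or the Eurogenes calculators actually use: it assumes two made-up source populations with known allele frequencies, independent SNPs (real SNPs are correlated through linkage) and a simple least-squares estimator. The function name and all parameter values are my own assumptions, and the marker counts are kept small (1,000 vs 10,000) only so it runs quickly.

```python
import random
import statistics


def simulate_estimates(n_snps, true_q=0.4, replicates=40, seed=1):
    """Estimate an individual's ancestry fraction from population A, many times over.

    Toy model: the individual has a fraction true_q of ancestry from A and the
    rest from B; allele frequencies in A and B are drawn at random per SNP.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(replicates):
        numerator = 0.0
        denominator = 0.0
        for _ in range(n_snps):
            p_a = rng.uniform(0.05, 0.95)
            p_b = rng.uniform(0.05, 0.95)
            p_mix = true_q * p_a + (1 - true_q) * p_b
            # Diploid genotype: two independent allele draws (0, 1 or 2 copies).
            genotype = (rng.random() < p_mix) + (rng.random() < p_mix)
            diff = p_a - p_b
            # Least-squares fit of genotype/2 = p_b + q * (p_a - p_b).
            numerator += (genotype / 2 - p_b) * diff
            denominator += diff * diff
        estimates.append(numerator / denominator)
    return estimates


few = simulate_estimates(1_000)
many = simulate_estimates(10_000)
spread_few = statistics.stdev(few)
spread_many = statistics.stdev(many)
print(f"spread with 1,000 SNPs:  {spread_few:.4f}")
print(f"spread with 10,000 SNPs: {spread_many:.4f}")
```

In this toy setup the estimates from the smaller marker set scatter noticeably more around the true value, which is the general pattern I mean by "unstable".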

So what I essentially wanted to do was circumvent this issue by taking two people who were not tested in this study and who were genotyped for anywhere between 500,000 and 1,000,000 SNPs (like at 23andme) & run them through some admixture analyses in order to reliably compare them to modern populations in terms of, for example, just how much ancestry they may share with Somalis & Habeshas.

I'm happy to say the data very much supports what was noticeable via the PCAs I shared that were ultimately owed to David Wesolowski who runs the Eurogenes genome blog and project.

Basically, what the results for Eurogenes K=36 suggest is that Horn Africans such as Somalis & Tigrinyas are, in terms of more recent ancestry, closer to one another, which explains why in various PCAs like the one below we cluster off together rather than either population clustering with Sudanese populations instead:

By more recent ancestry I mean ancestry from, I suppose, roughly the last few thousand years. In terms of such ancestry it's obvious that most Cushitic & Ethiopian Semitic speaking Horn African populations are closer to each other than either is to Sudanese Arab or Nubian groups, as the PCAs suggest and as this Sudanese Arab's admixture results do as well.

Although another inference that can be made is that it seems quite evident lots of likely post-Neolithic ancestry is shared between Horn African groups like Somalis & Northern Sudanese groups like Sudanese Arabs.
I say this once again based on these Eurogenes K=36 results. Firstly it must be noted that these results were acquired via Gedmatch which allowed me to utilize David Wesolowski's K=36 admixture calculator; however, these results are skewed to some extent by the notorious calculator effect.

This is why, for example, the Eritrean Tigrinya, this Sudanese Arab chap & I are showing "East African", which is actually a Maasai peaking / seemingly based component from what I understand. In the original table David made for this analysis, things are more fine tuned and not affected by the calculator effect, and Somalis alongside Tigrinyas are missing components like "East African".

However, the Northeast African component here peaks in Somalis & is somewhat similar to the "Ethio-Somali" component or the "Lowland East Cushitic" component. And of course, even if it is only being shown here because of the calculator effect, the "East African" component carries with it Horn African-related ancestry.

This Sudanese Arab along with Dobon et al.'s Nubians & Sudanese Arabs as the global PCA below including them (that's a bit messy because of the low number of SNPs utilized) suggests- :

-is on a rather fundamental scale very similar to Horn African populations & essentially a mixture between West Eurasian ancestry and African ancestry related to the kind of ancestry that forms the non-Niger-Congo-related ancestry in populations such as Dinkas. [note]

In fact, in this regard he is somewhat more similar to Tigrinyas than I am as he is closer to them in West Eurasian ancestry levels than Somalis are as you can see below:

However, the difference seems to be that he has some much more "recent" West Eurasian-derived ancestry than either Somalis or Tigrinyas do, as evidenced by his higher "Arabian", "Near Eastern" & "Eastern Mediterranean" scores than either me or my Tigrinya colleague. [note]

Another distinction seems to be that he has much more recent Nilo-Saharan speaker-related ancestry, evidenced by his "Central African" score (peaks in South Sudanese, if I recall correctly). This also explains why, in various admixture analyses available at Gedmatch, he consistently displays Niger-Congo-related ancestry, which is present in contemporary South Sudanese populations and seemingly also in populations like Darfurians.

The "East African component" based ancestry in Horn Africans, while seemingly related to the non-Niger-Congo ancestry in Nilo-Saharan speakers such as Dinkas, isn't exactly like theirs & is seemingly mostly to entirely pre-historic in origin / extremely ancient. Hence we lack the likely ancient & substantial Niger-Congo-related ancestry present in populations like Dinkas; our non-West Eurasian ancestry is not derived from them.

However, in this Sudanese Arab's case it does seem as though he derives actual ancestry from groups like the South Sudanese rather than pre-historic populations similar to much of the ancestry in them. This is quite understandable as many Sudanese Arabs used to be actual Nilo-Saharan speakers prior to their Arabization, and Nubians are Nilo-Saharan speakers.

As for the signals of perhaps post-Neolithic Horn African-related / Somali-like ancestry showing in this Sudanese Arab, I would have liked to see the results of a Nubian in these kinds of analyses, but what this likely suggests to me is that ancestry from actual early to not so early Cushites is present in Sudanese Arabs & Nubians. [note]

As suggested by both archaeological and linguistic studies, in many cases some of the earliest inhabitants of Northern Sudan (as early as the Neolithic) were in fact people of Cushitic speaking origins who were eventually assimilated by the Nilo-Saharan speaking ancestors of populations like Nubians just a few thousand years ago. [2] [3]

The Kerma culture's people are suggested by some linguists to have been Cushitic speakers

Then there's of course the strong & long-time presence of actual Cushitic speakers such as Bejas [note], seemingly closely related to Horn African populations, who are, in some cases, thought to be a relic not just of Sudan's early Cushitic speaking nature but also of the ancestors of Horn African Cushitic speakers having originally migrated into the Horn from northerly regions such as Sudan.

Further linguistic, archaeological & genomic study is needed to confirm these possibilities of shared ancestry from, say, just the last few thousand years or so between Nubians and Somalis for example (the sampling of ancient genomes would be the most ideal)...

Northeast Africa
Nevertheless, for now those are the best inferences I can make using this one man as a proxy for Dobon et al.'s various samples, of whom he seems representative. Granted, one should recall that at least in terms of admixture levels (proportions of West Eurasian and African ancestry), Sudanese Arabs and Nubians seem quite heterogeneous.

For now my bet would be that Nubians for example are a mixture between earlier Cushitic or generally Afro-Asiatic speaking peoples perhaps genetically similar to Somalis, populations perhaps similar to Copts from earlier periods of Egypt & populations similar to contemporary Nilo-Saharan populations such as Dinkas with Sudanese Arabs being all that + some later Arabian-related gene flow. [note]

Only further genomic study of these groups will tell and what I wrote in the above paragraph is honestly just an educated guess on my part based on our current data.

Reference List:


1. Eurogenes ANE K=7 mostly inflates the ENF & WHG-UHG scores of people who've been run through it (Northwest Africans turning up as over 20% WHG-UHG as opposed to the more fine-tuned K=8's ~15% values) though its ENF scores are not inflated by much (Somalis hopping from 41% in K=8 to 42% in K=7). This is because the run was honestly mostly designed to spot ANE-related ancestry & not as much care was put into the other components. [note] Sub-Saharan African is just a result of me adding the "East African" & "West African" components in ANE K=7 together to neaten things up.

3. Notes from a friend more knowledgeable than I in this particular subject on the Niger-Congo-related admixture in groups like the South Sudanese: Link to note

Northwestern Neolithic Anatolians were essentially "EEF" with less "WHG"

Well, it seems David wasn't wrong when he worked on the extremely low coverage genome of a farmer from Neolithic Northwestern Anatolia as we now have proof with far more samples from the recent Mathieson et al. 2015 to back up what he noticed.

What he noticed can be summarized via the PCA (Principal Component Analysis) / autosomal DNA based cluster above, where the Barcin / Neolithic Northwestern Anatolian farmer whose genome he analyzed looked like she was essentially an Early European Farmer with less Western European Hunter-Gatherer-related ancestry.

As you can see in the PCA above where Mathieson et al. 2015's Anatolian Neolithic samples are present, they essentially cluster with Neolithic Europeans (for example, Middle Neolithic) / Early European Farmers but pull farther away from Western European Hunter-Gatherers than those groups do.

This is all quite interesting as it now suggests that the Neolithic Farmers who entered Europe from West Asia (specifically from Anatolia and into the Balkans) during the Neolithic period already carried Western European Hunter-Gatherer-related ancestry.

These farmers who brought agriculture to Europe then stocked up on more WHG-related ancestry once they got to Europe (from mixing with the continent's local earlier forager inhabitants) resulting in individuals like the Stuttgart farmer who're more WHG-related in ancestry than these Anatolian Farmers from around 7,000 years ago.

Scope of Mathieson et al.'s samples
The next step would obviously be acquiring more and more ancient DNA from West Asia. For now, it's difficult to assume all of Neolithic Anatolia was like these Barcin Farmers who lived so close to the Balkans & the Bosphorus (perhaps this explains their WHG-related ancestry visible outside of "ENF"?).

It would be truly intriguing if Neolithic Anatolian farmers from areas such as Eastern or Central Anatolia also proved to be very EEF-like and not distinct from their Barcin Farmer neighbors.

However, I suppose the old "EEF + ANE + WHG" model has been shaken up once again and Europeans are for now looking to basically be "Anatolian Neolithic-related + Yamnaya-related + WHG-related", as displayed above in charts from Mathieson et al. 2015.

The Pontic-Caspian Steppe side of Europeans is part Caucasian-like & part Eastern European Hunter-Gatherer-related, which, as this paper notes, has a certain extra relation to Ancient North Eurasians like MA-1 that makes modeling Europeans as part "ANE" possible. Finally, the Neolithic Farmers who came to Europe seemingly from Anatolia would've been part this component & part WHG-related.

The plot thickens with the more ancient DNA we get and there are still unanswered questions like the exact source of West Asian-related ancestry in Bronze Age Steppe Pastoralists like the Yamnaya & what exactly "Basal Eurasian" really could be. 

Saturday, October 10, 2015

All Modern Africans are part Eurasian? We should probably be a bit cautious

In light of a recent study that sampled a 4,500 year old Southwestern Ethiopian (labeled "Mota" for the cave he was found in), we now have data suggesting that Africans all around, beyond the obvious already known cases like Horn Africans & various Southeast Africans, supposedly have non-negligible Eurasian ancestry.

Truthfully, upon really taking a look at this paper's various results... I must personally advise some caution and say that I'm not 100% convinced that this Eurasian ancestry in all these groups is real. Why? Well, it's mostly but not entirely the scope as well as the levels. 

Every population from Hunter-Gatherer Pygmies like Mbutis & Biakas [note] down to farmer groups like the Yoruba or pastoralists like Anuaks is supposedly ~6-8% Eurasian (supposedly even West Eurasian?). What's suspicious about this is the generally homogeneous amount found across all these populations, which prior to this study showed no strong Eurasian affinities other than some signs of Neanderthal ancestry or a hint here and there in some admixture runs in only some of them, if I recall correctly. [2]

But what large scale event occurred that spread Eurasian ancestry at such a homogeneous level to so many populations that are culturally & genetically distinct as well as geographically far removed from one another? This doesn't even entirely make sense.

It can make some sense when observing some populations like Yorubas showing prior signs of Neanderthal ancestry and then the presence of markers like mtDNA U in West Africa but all across Africa? From Nilo-Saharan speakers like the South Sudanese ("Sudanese" samples) to Niger-Congo speakers like South African Bantu speakers and at a roughly equal level?

I mean there's nothing shocking about Eurasian ancestry showing up in groups like the Nama, whom we know have both South Cushitic pastoralist & European admixture, or in the Maasai, who are substantially South Cushitic (and perhaps even part East Cushitic?) admixed, or in Horn Africans. In these other populations it's surprising, but it would to my mind be more acceptable if these levels varied more.

For example, if the South Sudanese were like ~3% and Yorubas were like ~7% & things just varied altogether, then these results would seem a bit more plausible to me, as it would mean there were some very ancient Eurasian expansions into Africa (plenty of evidence for such) that just so happened to hit these populations in different ways and likely at different times, understandably resulting in different levels of Eurasian admixture.

But then Mbutis are ~6%, Yorubas are ~7%, Southern African Bantus are ~7% & the South Sudanese are ~6.5%... I must ask how? Why such extremely close levels? I'm not going to come out and say these results are 100% moot but I must say that one should be a little suspicious of them.
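Just to put a number on how tight that spread is, here's a quick sketch using only the rough percentages quoted above (my approximate readings, not exact values from the paper) next to the kind of hypothetical spread I'd find more plausible. The dictionary names are my own labels.

```python
import statistics

# Approximate Eurasian ancestry estimates (%) as quoted in this post.
reported = {
    "Mbuti": 6.0,
    "Yoruba": 7.0,
    "Southern African Bantu": 7.0,
    "South Sudanese": 6.5,
}

# The hypothetical, more varied scenario from the paragraph above.
hypothetical = {"South Sudanese": 3.0, "Yoruba": 7.0}


def value_range(estimates):
    """Difference between the highest and lowest estimate, in percentage points."""
    values = list(estimates.values())
    return max(values) - min(values)


reported_range = value_range(reported)
hypothetical_range = value_range(hypothetical)
reported_sd = statistics.pstdev(reported.values())

print(f"reported range: {reported_range} points")          # 1.0
print(f"hypothetical range: {hypothetical_range} points")  # 4.0
print(f"reported standard deviation: {reported_sd:.2f} points")
```

A one point range across populations this far apart is what strikes me as odd; it doesn't prove anything by itself.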

Perhaps this is due to Mota containing some form of a very archaic African element that all these populations lack? [note] This could perhaps shift them all away from Mota and towards Eurasians at a roughly equal rate? I'm really just shooting in the dark here as I'm generally just in shock over this study's results.

All I can say at this point is that further study is required... More ancient genomes from Africa, more from Eurasia and I'll be excited to see what we get when people like David who runs Eurogenes or Kurd over at Anthrogenica get their hands on this sample and start comparing him more and more to modern populations.

A lot more still needs to be learned and tested but for now this study's claim that these groups have Eurasian admixture is what the data we currently have is saying...

Reference List:

1. Ancient Ethiopian genome reveals extensive Eurasian admixture throughout the African continent, Llorente et al. (the study's text is pay-walled but the supplementary information which contains most of what you'll need anyway is free)


Friday, October 9, 2015

Ancient Ethiopian Genome has some interesting things to share

Well, it seems we no longer have to utilize modern Africans as a reference for a completely "African" population (from a genetic point of view) with essentially no "Eurasian" input as we now have an ancient genome from Southwest Ethiopia that's about 4,500 years old and he has interesting things to share about modern Africans.

"Characterizing genetic diversity in Africa is a crucial step for most analyses reconstructing the evolutionary history of anatomically modern humans. However, historic migrations from Eurasia into Africa have affected many contemporary populations, confounding inferences. Here, we present a 12.5x coverage ancient genome of an Ethiopian male (‘Mota’) who lived approximately 4,500 years ago. We use this genome to demonstrate that the Eurasian backflow into Africa came from a population closely related to Early Neolithic farmers, who had colonized Europe 4,000 years earlier. The extent of this backflow was much greater than previously reported, reaching all the way to Central, West and Southern Africa, affecting even populations such as Yoruba and Mbuti, previously thought to be relatively unadmixed, who harbor 6-7% Eurasian ancestry."

They're not joking about that last emboldened part either. Yorubas and various African populations thought to completely lack "Eurasian" / Out-of-Africa ancestry are in fact anything between 5 and 10% Eurasian (seemingly West Eurasian) based on comparing them to Mota (the name for this ancient individual being based on the "Mota cave" which he was found in) and ancient Eurasians like Early European Farmers whom I talk about here.

Basically every population from Mbuti pygmies to Yorubas to Kenyan Bantus now demonstrates non-negligible non-African / Eurasian / Out-of-Africa ancestry. It seems that yet again, we were misguided to use modern samples to try and fully understand Human history.

The reason we now have these estimates is that Mota seems to utterly lack any signs of Eurasian ancestry, signs that other groups have shown in the past. Yorubas, for example, have shown non-zero Neanderthal ancestry levels [2], whilst Mota, unlike various modern Africans, utterly lacks any signs of Neanderthal ancestry.

This really shakes up and complicates African genetics and slightly backs up statements I've made in other areas of the internet, somewhat based on what geneticists like David Reich have said; even populations like Yorubas thought to be mostly "pure" due to ADMIXTURE analyses like the following one from an old study- :

- are not in fact "pure" but clearly somewhat complex mixtures, like many Eurasian and other African populations such as Horn Africans, various Southeast Africans, Central Asians, Europeans and so on. [note] We will of course need more ancient genomes from Africa in the future to add to and further comprehend what these results show, as well as to back them up, but for now that's what we have.

But for now that's all I'll be saying about this paper until I've gotten around to reading it more thoroughly than my current skimming of it.

I also highly recommend going through the study yourself or at least its currently available supplementary information.

Reference list:


1. Mota belongs to Y-DNA E1b1 & mtDNA L3x2a

2. Seems to be a lot of news on this study in the mass media...

3. A fruitful discussion about this study is ensuing here.

4. One interesting thing to note is that when Mota & the Druze are used to assess how "Eurasian" all of these populations are, the West Eurasian ancestry in Horn Africans increases, but it then decreases (something that doesn't happen to other populations like Yorubas and Mbutis) when Mota & Early European Farmers (LBK) are used...

The Druze have a small amount of "African" ancestry, if I recall correctly at about ~3% or some such (some of this may in light of this recent find carry Eurasian ancestry?) so perhaps this causes the sudden lowering in Horn Africans in respect to when LBKs are used or perhaps it's the greater European Hunter-Gatherer-related ancestry in LBKs? It's an interesting thing to note, I suppose.