Wednesday, December 9, 2015

Kenyan Somalis: Not all "Borana admixed"

Strangely, I've encountered a few people online who have gone off the deep end with Lazaridis et al.'s Kenyan Somali samples and used them to assume that all Kenyan Somalis have some kind of seemingly Borana-related admixture.

I would like to point out to some of these people that nearly half of the Kenyan Somali samples basically fit within the normal variation of North-Central Somalis (Pagani et al.'s Somalian Somali samples + myself & Brainblaster). In fact I thought this was quite interesting and mirrored the situation we got with Ethiopian Somali samples from Jijiga:

They seemed mostly identical to the North-Central samples used in Hodgson et al. (owed to Pagani et al.), except for some samples showing a little bit of Omotic-related admixture at a rate of ~2%. This painted an interesting and seemingly homogeneous picture of Somalis from the Northeast of the country down to areas close to Hararghe in Ethiopia, and that picture is so far holding up: numerous Somalis tested commercially (23andme and other outlets) from areas like Togdheer or Woqooyi Galbeed again remain within this same variation. [note]

Rough perception of "North-Central Somalia"

I would personally hold off on jumping to conclusions about Kenyan Somalis as a whole. Most Somalis tested so far (either the hundreds from other regions and clans sampled for their Haplogroups or those academically and commercially genotyped for their autosomal DNA from other regions) have proven quite remarkably homogeneous (I was personally expecting more heterogeneity).

It's quite likely that these 6-7 seemingly Borana-admixed ethnic Somalis from Kenya reflect recent admixture and are not entirely representative. The presence of 4-6 Somalis who remain within Northeastern Somali region and North-Central Somalia variation implies that there was & is obviously a more "un-admixed" presence, which in my opinion will prove to be the predominant state of various Somalis in Kenya.

I say this because of where Somalis in Kenya actually live:

There are areas that don't exactly overlap with the Oromo-inhabited areas. Garissa, which is where Lazaridis et al.'s samples are from, is in a region that overlaps with the presence of other ethnic groups, but even there we clearly found samples that don't look "mixed", so take a moment to think about what people further north of the town, in areas that prove to be 99% Somali, will likely look like.

Intermixture can be expected, though, in areas where Somalis overlap with Oromos as a demographic, like on that map of Kenya above, but the rhetoric I've seen among some, which is "Ohhh, all Somalis in Kenya are Borana mixed!", isn't sensible and, if we're going by these samples, isn't even true.

I only tend to voice concerns about these samples because it can get perplexing if someone produces an ADMIXTURE analysis that utilizes them in his or her main spreadsheet or dataset as the 6-7 who do look like they have some Borana-esque admixture could skew the ultimate results of the Somalis utilized in the analysis and things could thus prove somewhat uninformative for Somalis from Somalia, Ethiopia and so on.

Reference List:

3. Ethiopian Genetic Diversity Reveals Linguistic Stratification and Complex Influences on the Ethiopian Gene Pool, Pagani et al.


1. The idea that some of these Somalis are Borana admixed is mostly an assumption based on three things: their PCA (Principal Component Analysis / "cluster") affinity for Boranas from Ethiopia; how they straddle other Somalis and these same Boranas in terms of levels of West Eurasian ancestry; and the fact that they show no signs of Niger-Congo-related admixture, which rules out admixture from Bantu-speaking populations in Kenya. No one, to my knowledge, has managed to confirm whether or not they have Omotic-related admixture, as Boranas from Ethiopia (who could prove representative of those in Kenya) indeed seem to; that would prove quite informative.

Caucasus Hunter Gatherer-related ancestry in the Horn?

In light of this new Caucasus Hunter-Gatherer component, which seems to act as an ancient DNA stand-in for the likes of "Caucasus-Gedrosia" & "Teal", I do wonder if some Horn Africans in fact have some Caucasus Hunter-Gatherer (CHG) related ancestry.

The tree-mixes above seem to focus the West Eurasian gene flow into Somalis & Beta Israels on Anatolian Neolithic Farmers, which makes sense: outside of their more overt WHG-related ancestry, these farmers essentially seem to be made up of a component akin to "ENF/EF/Near Eastern/Southwest Asian", which the likes of Somalis, Habeshas and Agaws have always shown the highest affinity for. [note]

However in the past Amharas & Tigrinyas (Habeshas), Xamirs & Beta Israels (Agaws), Wolaytas & Oromos as well as sometimes Somalis have shown noticeable levels of the "Caucasian" component with this presence being more overt among all of the non-Somali populations.

So I suppose I took great interest in the progress of this "component" and how it could be integrated into analyses like qpAdm or ADMIXTURE. It's practically impossible to create a perfect ancient component in ADMIXTURE when the pre-historic samples available are so small in number and can't form a cluster such as with these two CHG genomes.

However more independent (non-academic) but experienced people who make ADMIXTURE calculators have attempted to create a component that corresponds as well as possible with these two pre-historics (at levels of 95-100%) by, to my understanding, using both them and some modern populations to form the cluster.

That K=11 run is incredibly messy in my humble opinion and sadly utilizes (like Lazaridis et al.) George Ayodo's Kenyan Somalis alongside myself to create its Somali results, so the slight Borana-looking admixture in some of them could skew things. On top of that, its "East African" component is a mixed component of West Eurasian (entirely Southwest Asian related according to Dilawer) & "African" ancestry. So I don't entirely know what to make of the run; its future derivatives seem better but it needs some work. [note]

PuntDNAL's K=10 seems a bit more pristine, although the "African" ancestry in these populations looks rather deflated. Punt says his "CHG" component shows up in the likes of Satsurblia (a CHG) at a level of about 98% & that he's quite sure about these estimates, although I find it perplexing that the Maasai, who in the past never showed hints of "Caucasian" or anything of the sort, are showing CHG. This too may need some work.

Perhaps CHG-related ancestry was present in some Neolithic to post-Neolithic peoples ancestral to even South Cushites, but to me "CHG" does correspond with a good amount of the "ANE" type affinities populations across West Asia were showing, so it'd be interesting if some Horn Africans show it despite not really showing signs of ANE-related affinities via analyses like ADMIXTURE or formal stats. Although one can't jump to too many conclusions, as we still need to fully understand "CHG", I suppose.

I cannot truly be conclusive about whether or not this or that population has this or that much "CHG" (or if it has any at all) with the data we have now but my theory for a while since this component surfaced has been akin to what I stated to Dilawer/Kurd at Anthrogenica:

I have a feeling CHG-related ancestry to some extent will appear in the likes of Habeshas, Agaws, various Oromos & Wolaytas and at best appear at lower levels in Somalis like in Punt's run or not appear in them at all. It'll be interesting to see the results of David Wesolowski's own CHG integrating run once it's finished as I have a feeling he may produce the best results.

Reference List:


1. Lazaridis et al. uses those Kenyan Somalis as well so that could skew their "Caucasus-Gedrosia" related levels in Somalis although they show oddly lower than usual levels in the "Afars" (Xamir) & Beta Israel samples, the run just generally displays that some Horn Africans do show such ancestry that seems related to the CHG population.

2. Tree-mixes are owed to David Wesolowski

The plot thickens yet again

Well, we now seem to have an in-the-flesh replacement for the "Armenian" proxies that Haak et al. and Mathieson et al. used for the origins of Early Bronze Age Steppe folk: a new component described in a new study, "Caucasus Hunter-Gatherer".

It seems that two pre-historic Caucasus Hunter-Gatherers ("Satsurblia" & "Kotias") touched on by a new paper (Jones et al.) are essentially the old "Caucasian~West Asian" ADMIXTURE component (which appeared in dual models as "Caucasus" & "Gedrosia") in the flesh, as Dienekes would put it.

It also seems to be what people were looking for when theorizing about the hypothetical "teal" people in that Yamnayas according to Jones et al. can be modeled as "EHG + CHG" (EHG= Eastern European Hunter-Gatherer) although David Wesolowski over at Eurogenes seems to find that this model isn't perfect and that there may have been some excess WHG (WHG = Western European Hunter-Gatherer) related ancestry contributing to these Steppe pastoralists as well.

One odd detail from the paper is that it seems to be peddling this view that the component is essentially descended from a "basal lineage" and makes no real mention, if I recall correctly, of an affinity for MA-1 or EHG, something Razib Khan chews on over at his blog.

Like Razib, I don't really buy this model. This component is likely some sort of mixture / hybridization of something "EF/ENF/Near Eastern" like and something "ANE" related. It clearly has some great part in why various West Asian populations among others showed much of the "ANE" affinities they did in older runs based on derivatives of the now obsolete three-way model:

Yet some seem to be claiming it has no ANE-related affinities to it but perhaps at this point in the game with CHG, EHG and all these more recent pre-historic samples, "ANE" is moving farther and farther into redundancy in respect to West Eurasians along with some other groups at least.

I suppose my caution with my Ancient North Eurasian post was warranted, where I mentioned that whatever was making certain populations in South to South-Central Asia & West Asia look part "ANE" that wasn't EHG might just prove to be a different kind of Hunter-Gatherer component that merely shows strange affinities for Mal'ta boy and his kind or is extra-related to "ANEs" somehow or partially mixed with ancestry related to them.

My personal contention is that CHG is something Near Eastern related + something Ancient North Eurasian related but all I can say with any certainty is:

1. These simplistic mixture (this+this) models might be misguided and the truth may prove more complex with further study once Lazaridis and company release their analyses of these genomes & we have more and more new genomes.

2. Caucasus Hunter-Gatherer has a lot of "Basal Eurasian" ancestry & is indeed somehow related to "Early Farmer/ EF/ENF/Near Eastern".

3. Something about it sure is making various populations fit as part "ANE" and this "Near Eastern + ANE" model was the original image most had of "Caucasian" & "Teal" whose roles it has now annexed.

At any rate, the story of Europeans and other West Eurasians' geneses has definitely thickened in plot, with Europeans now looking to have four heavily drifted (and divergently admixed in some cases) pre-historic ancestral populations found for now in the ancient DNA record.

I'll of course be sitting back and waiting for further plot thickening like the understanding of "Basal Eurasian", the finding of yet more unique pre-historic H. sapiens populations and so on. Ancient DNA is an exciting business.

Sunday, December 6, 2015

Enjoy some PCAs

I thought I'd quite randomly share some PCAs (Principal Component Analyses) or what some less familiar with them might call "genetic clusters" in my possession that are owed to David Wesolowski who runs the Eurogenes genome blog and project:

The above is basically a global PCA utilizing about 166,000 SNPs (anything above 100,000, if I recall correctly, is generally considered a high quality analysis) and the following is roughly the same thing but with Maasai people excluded (long story as to why they were excluded; it's nothing very interesting, hold off on the weird theories):

Below is a sort of intra-East African PCA where myself and a fellow ethnic Somali friend of mine are included (he's of the Isaaq clan with the Y-DNA marker T-M184 and mtDNA L0a1a, he's labeled as "Brainblaster" & is included in the above two PCAs alongside myself as well):

Various different dimensions for the PCA set-up above:

I'm unaware as to the exact number of SNPs utilized for these more "intra-East African" PCAs but they're certainly high coverage (over 100,000 SNPs) at minimum. If I had to add one more note I'd remind some that my own Y-DNA marker is E-V32 whilst my mtDNA marker is N1a.


Thursday, December 3, 2015

The Reer Xamar are a substantially mixed population?

A while back I encountered a young man of Reer Xamar origins (the Somali "X" is a voiceless pharyngeal fricative, often romanized as "Ḥ") on both sides (paternal & maternal) who'd been genotyped via AncestryDNA & I ended up helping him figure out his origins at Anthrogenica via services like Gedmatch (running him through various calculators) for quite some time:

Above is a general gist of what we discovered about this individual's ancestry by running him through various analyses aimed at analyzing autosomal DNA. [note] He looks to be a mixture between Somalis (perhaps with some other Horn African input such as from Oromos?), West Asians (seem to mostly look Arabian or Levantine Arab related but some Iranian is plausible), South Asians & Southeast African Bantu speakers.

Of course, a great amount of emphasis was placed on this ancestry being old on his part. For example, none of his grandparents are full-blown ethnic Somalis, and neither are any of his great-grandparents from what he's told me. None are full-blown Arabs and so on. Instead his family is quite insistent that all of this ancestry, including the very strange substantial South Asian ancestry (which is very real), is centuries old.

Honestly, I was expecting him to be somewhat "mixed" from the get-go as the historical data on Reer Xamars sometimes referred to more generally as "Benadiris" (they are speakers of Benadiri Somali) is that they are of somewhat mixed origins as even Wikipedia will simplistically outline:

"Although the Benadiri are sometimes described as the founders of Mogadishu (hence, their colloquial name Reer Xamar or "People of Mogadishu", though the city itself is postulated to be a successor of ancient Sarapion), their members actually trace their origins to diverse groups. The latter include Arab, Persian and Somali people."

Some sources will also outline that they have some ancestry from Bantu speakers of mostly or seemingly Southeast African origins given their strong involvement in Mogadishu's medieval and early modern slave trade.

So to my mind seeing some Somali & some Southeastern African Bantu speaker related ancestry as well as some Arabian and Iranian ancestry was plausible and expected but the real deal is quite shocking if this individual is representative (and I somewhat think he is as there is another person from a different region who is quite similar to him, though they are related) of Reer Xamars.

However what was ultimately shocking is how substantial his Somali-related ancestry is. As a matter of fact, he clearly draws the greatest proportion of his ancestry from the Horn of Africa / Somalis (could be some Oromo-related stuff in there I'm missing as there were some Oromo or Oromo-esque tribals in Southern Somalia during the Middle Ages and such), numbering up to ~50% of the chap's ancestry.

The most surprising element of his ancestry, however, is the South Asian element, which seems substantial as it consistently shows up as ~20% of his ancestry. In fact even AncestryDNA's equivalent of 23andme's ancestry composition had him pegged as ~20% South Asian, as does nearly every decent Gedmatch admixture calculator I've run him through.

This ancestry is further corroborated by how he constantly turns up with South Asian or South-Central Asian (Pakistan & Afghanistan) populations in various admixture calculators' Oracle-4 results, where a person is modeled as a clean mix of the 4 populations (each making up 25% of his or her ancestry) that best fits his or her ADMIXTURE proportions.
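To make the Oracle-4 idea concrete, here is a minimal sketch of that kind of fit. It is not Gedmatch's actual implementation, which may well differ; it simply brute-forces every set of 4 reference populations (repeats allowed), averages their component proportions at 25% each, and keeps the set closest to the target by Euclidean distance. The population names, component values and the `oracle4` function are all hypothetical toy inputs made up for illustration, not real calculator output.

```python
from itertools import combinations_with_replacement
from math import sqrt

# Hypothetical toy reference proportions over four made-up components.
# These are NOT real calculator averages; they only illustrate the mechanics.
REFERENCES = {
    "PopA": [0.60, 0.40, 0.00, 0.00],
    "PopB": [0.00, 0.10, 0.90, 0.00],
    "PopC": [0.00, 0.80, 0.20, 0.00],
    "PopD": [0.00, 0.00, 0.00, 1.00],
}


def oracle4(target, references):
    """Return (distance, populations) for the best equal 4-way mix.

    Each candidate is four reference populations weighted 25% each;
    the same population may be picked more than once.
    """
    best = None
    for combo in combinations_with_replacement(sorted(references), 4):
        # Average the four populations component by component.
        mix = [
            sum(references[pop][i] for pop in combo) / 4
            for i in range(len(target))
        ]
        # Euclidean distance between the mix and the target proportions.
        dist = sqrt(sum((m - t) ** 2 for m, t in zip(mix, target)))
        if best is None or dist < best[0]:
            best = (dist, combo)
    return best


# A toy individual built as exactly 50% PopA + 25% PopB + 25% PopD.
toy_target = [0.30, 0.225, 0.225, 0.25]
distance, populations = oracle4(toy_target, REFERENCES)
print(populations, round(distance, 4))
```

A population appearing twice in the winning set (as the toy "PopA" does here) is how a roughly 50% contribution shows up in a 4-way fit of this kind.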

He consistently turns up as something like "Somali or Horn African + South Asian or South-Central Asian + West Asian or North African + any population with substantial West-Central African / "Niger-Congo" related ancestry" (whether they be "Sudanese/ South Sudanese" or peoples like the Hema):

At this point his various ancestries are backed up by too many lines of evidence and analyses not to be real so it's quite perplexing where his ancestors could've acquired South Asian ancestry.

Contact between the Medieval and Early Modern Somali coast and South Asia was quite strong to a point where the Somali language itself has a plethora of Indo-Aryan loan-words (some via proxy through Arabic or Persian though) [2], and traders from the Somali coast as well as South Asian traders supposedly made stops on each other's lands.

Nevertheless, I never expected real inter-mixture to have ever occurred. The majority of ethnic Somalis so far tested seem quite homogeneous and while "mixed" are more of an ancient mixture that seems to have (for the most part anyway) remained endogamous through periods such as the Middle Ages, avoiding notable Arabian, Iranian or Ethiopian Semitic speaker or Oromo-related input. [note]

At any rate, strong history between the Horn of Africa's coastline and South to South-Central Asia or not, I was not expecting this chap to be so substantially South to South-Central Asian admixed and, at the same time, so low on West Asian ancestry (either Arabian or Iranian), when Reer Xamars are quite often characterized as Peninsular Arabian (Yemenite et al.) & Iranian Plateau migrants who intermixed with the likes of Somalis to some extent.

Further details on this Reer Xamar's results are available at Anthrogenica, where I ran him through a plethora of analyses in order to corroborate the kinds of inferences I've shared in this blog post.

The most important point now would be to see the results of other Reer Xamars or Reer Xamar-related peoples and see if they at all resemble the results of the individual I encountered. I'm surprised to say that we (myself and this individual) encountered another chap whose results are extremely similar to his, if somewhat different:

This individual who is a relative of the Reer Xamar I originally encountered and helped out is of paternally Barawi / "Bravanese" descent & maternally Reer Xamar descent (of the "Shanshiyo" tribe like the full-blown Reer Xamar).

The Bravanese are basically Swahili speaking coastal inhabitants of Southern Somalia associated with the town of Barawa much like Reer Xamars are associated with Mogadishu (another name for Mogadishu is "Xamar/Ḥamar"). They too are thought to be some kind of Arabian-Iranian-Bantu-"Somali" mixture with, to my knowledge, no real mention of South to South-Central Asian related ancestry.

At any rate, this individual's results are staggering as while a bit different from that apparently full-blown Reer Xamar he's extremely similar to him and fundamentally of more or less the exact same mixture.

Old picture of Barawa / Baraawe / Brava

This individual basically shows up as a very similar mixture to the full-blown Reer Xamar although he does have (as you can see from that modified Eurogenes K=36 based pie chart I made) more West Asian ancestry than the individual who is fully Reer Xamar also resulting in lower levels of Somali or Somali/Horn African-related ancestry.

One cool line of data this new sample affords is Haplogroup data. You see, the reason I never touched on that Reer Xamar's Y-DNA & mtDNA (which would've been very interesting to touch upon) is because he got tested at AncestryDNA which does not test for Haplogroups anymore but simply autosomal DNA. This other chap who is paternally Bravanese however got himself genotyped by 23andme meaning we have Y-DNA & mtDNA results.

His Y-DNA Haplogroup is L-M20 (of its L1 subclade):

His mtDNA is L0f:

His Y-DNA, given his substantial South to South-Central Asian ancestry, is likely owed to that ancestry, especially since his subclade (L1) is supposedly most often found in India. However, as a colleague more knowledgeable about this Haplogroup pointed out:

An Iranian origin of this Y-DNA is possible because, as I noted myself, this guy does likely have some Iranian (I'm speaking of the Iranian plateau here & not Iranic speaker related ancestry as a whole) ancestry.

His mtDNA L0f, funnily enough, tends to peak in South Cushitic speakers like the Iraqw (whose language I've uploaded recordings of by the way, as just a side-note) and substantially South (to perhaps East?) Cushitic speaker admixed populations in Southeastern Africa. [1] 

It is most likely to be owed to his Southeast African Bantu related ancestry as various Bantu speaking populations in Southeast Africa have some South Cushitic speaker related admixture (even if it's barely even noticeable via analyses such as ADMIXTURE but instead rears its head at times via Haplogroup frequencies). Although this Haplogroup, as a friend notes, is originally owed to African Hunter-Gatherer groups ("Khoisan" related and so on):

The hunters in Southern Somalia he's talking about are the likes of the Eyle who may indeed have contributed just a bit to some ethnic Somalis in Southern Somalia whilst being assimilated decades ago, and perhaps that's where he's getting his mtDNA but I personally find the SE African origin more plausible.

In the end his Haplogroups corroborate what autosomal DNA based analyses like admixture analyses or Oracle-4 are telling us which is that this individual is of mixed origins (carrying a South to South-Central Asian to perhaps even West Asian Iranian Y-DNA & either a Southeast African or Southern Somali mtDNA).

Old picture of Mogadishu before the Somali Civil War

Now, one thing that I'd like corroborated via more than one line of evidence is just how "Somali" these two seem (40-50%; that's an incredible amount of Somali to Somali-related ancestry, I must say) via analyses like Eurogenes K=36, which affords us components like "Northeast African" & "East African" (I've commented elsewhere on this run's usefulness for noticing West Eurasian admixed Horn African ancestry).

Now, ANE K=7 is a rather outdated run (the old ANE + WHG + ENF model is mostly dead now, even more so with the recent finding of "CHG", although his ANE-related ancestry is consistent with his being substantially South Asian, as South Asians tended to show high levels of such ancestry before CHG's appearance), but in the end it proves very useful in this context because it allows us to gauge just how "Eurasian" (Out-of-Africa ancestry) & "African" (non-Eurasian or substantially non-Eurasian admixed African ancestry) a person is.

Various analyses that afford a sort of "African" (green) vs. "Eurasian" (blue) dichotomy in their results display roughly the same picture, where the full-blown Reer Xamar is about 35-40% "African" & the half Bravanese and half Reer Xamar chap is 30-35% "African". This does interestingly go in line with their estimated scores of Somali-related ancestry...

How so? Ethnic Somalis as I've outlined in the past are like many of their close relatives in the Horn of Africa; substantially West Eurasian admixed (possessing as a result a strong amount of likely West Eurasian Hunter-Gatherer & definitely Basal Eurasian ancestry).

In such admixture analyses Somalis tend to turn up as 55 to 65% "African" (most often ~60%) so suppose I take the Reer Xamar's result in the Jtest where he's ~38% "African" (essentially ~40%) and account for the fact that as various calculators show; about 5-10% of this is owed to West-Central African-related (most likely SE African Bantu-derived) ancestry. [note]

What do I have left? About ~30% "African" ancestry (though a tiny amount of this could be owed to West Asian Arab ancestry, as they all have some "African" ancestry at a rate of 5-15%); roughly half the amount ethnic Somalis such as myself tend to show, which corroborates that this individual could indeed be roughly-to-nearly half ethnic Somali in ancestry.
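The back-of-the-envelope arithmetic above can be written out as a small sketch. It assumes, purely for illustration, that whatever "African" score remains after subtracting the Bantu-related share is entirely Somali-derived (ignoring the small Arab-related contribution mentioned above), and that a typical ethnic Somali scores ~60% "African" in the same kind of run. The function name and its default value are hypothetical.

```python
def somali_share(total_african, bantu_related, typical_somali_african=0.60):
    """Rough fraction of a person's ancestry that is Somali-like.

    total_african: the person's overall "African" score (e.g. 0.38).
    bantu_related: the part of that score attributed to SE African Bantu input.
    typical_somali_african: assumed "African" score of an unmixed Somali.
    """
    # "African" ancestry left over once the Bantu-related part is removed.
    residual_african = total_african - bantu_related
    # A full Somali would show typical_somali_african, so scale accordingly.
    return residual_african / typical_somali_african


# The Reer Xamar's Jtest score of ~38% "African", with 5-10% Bantu-related.
for bantu in (0.05, 0.08, 0.10):
    share = somali_share(0.38, bantu)
    print(f"Bantu-related {bantu:.0%}: roughly {share:.0%} Somali-like")
```

Swapping the 60% default for 55% or 65% (the range Somalis tend to show) moves the estimate by a few points either way, which is why "roughly-to-nearly half" is about as precise as this approach gets.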

South-Central Asia

I mean there's no doubt at this point that these two have substantial Somali or Somali-related ancestry (again, perhaps there's something Oromo about some of their ancestry but I haven't seen overt evidence of this), but I needed to be sure about these incredible levels implied by runs such as Eurogenes K=36 and I'd say their levels of "African" ancestry do corroborate such levels.

Although the problem here with the chap who is paternally Bravanese but maternally Reer Xamar and our Reer Xamar is that they're quite notably related (2nd to 4th cousin range related; though they were not aware of one another before matching on Gedmatch):

This relation can really skew things in that it would always be more ideal to have two totally unrelated samples & it could explain why they're so similar, though, somehow, I have a feeling that most Reer Xamars and such will turn out interrelated within the last few centuries to some extent due to how small their population is.

But one telling detail is that one of them is of a different group paternally, yet both of them look very similar in ancestry anyway (implying perhaps that his two parents, one Bravanese and one Reer Xamar, are not very distinct from each other). This may imply that both groups could be well-represented by the type of mixture that characterizes these two individuals, but for the time being this is too small a sample size to be sure of anything.

This is all very, very intriguing stuff but it's obvious what's needed here: more samples & more study. We won't be sure before then, but for the time being, if these two individuals are at all representative of coastal people like Reer Xamars and Barawis, then it can only be said that these populations are incredibly mixed and have quite the "cosmopolitan genetic profile". [note]

Reference List:


1. The fully Reer Xamar guy has a cousin who was tested at 23andme, meaning he got some Y-DNA, some information about the supposed origins of his clan and his cousin's Y-DNA which could be shared with him (I'll have to ask soon if this cousin shares a paternal line with him):

2. Original K=36 results

3. I urge anyone of either Reer Xamar or Barawi / Bravanese descent who's had their genome genotyped to contact me via email and share their raw data or their Gedmatch kit number. The more samples we have the better!

4. I basically have the raw data of both these individuals and have uploaded their kits to Gedmatch where I ran them through a plethora of calculators there; most of the inferences made in this blog post are based on that. You can rummage about with their kits yourself here: A653627 (Reer Xamar) , M044752 (Half Bravanese & half Reer Xamar)

5. If you're wondering about the East Asian-related stuff they both seem to show; it's most likely owed to some of their other ancestries as various South to South-Central Asians as well as West Asian Iranians show such ancestry at relatively low levels. Similarly the affinities they show for African Hunter-Gatherer groups like Pygmies and San are probably owed to other ancestries in my humble opinion (perhaps their Southeast African derived ancestry).

6. I'll possibly be sharing newer and somewhat more accurate ancestry estimates with a future post. For now I'd say, for the three samples I so far have, the estimates are more like 35-45% "Somali/Horn African", 20-25% South Asian, 5-15% SE African Bantu-related ancestry with the rest being mostly West Asian ancestry.

Wednesday, December 2, 2015

Comments on Mota PCAs from Eurogenes

Thought I'd make some quick comments on David's PCAs (principal component analyses) utilizing autosomal DNA and aimed at seeing where the ancient Southwestern Ethiopian individual Mota from Llorente et al. sits in regards to modern populations:

The most interesting of all the PCAs has to be this one, in my humble opinion. I say this because Mota clearly demonstrates a sort of affinity for the likes of Ari Blacksmiths and Cultivators, and both he and they form a sort of curve with Oromos, Wolaytas, Agaws and Ḥabeshas, which are populations with overt Omotic-related admixture.

Somalis are excluded from this curve and instead look more like a mixture between West Eurasians and South Sudanese-related populations (at least related to what ancestry in these groups isn't related to the ancestry that dominates West-Central African populations such as Yorubans). 

This somewhat supports data from an old admixture analysis by a fellow ethnic Somali which posits that ethnic Somalis largely lack Omotic-related admixture but instead may display (in some admixture analyses for example) an affinity for "Omotics" due to very ancient shared ancestry between them (perhaps related to Afro-Asiatic migrations or some such?).

Further study will be needed, of course but these results are fantastically intriguing.

This other PCA is intriguing, I suppose, in that Mota displays a sort of pull toward Out-of-Africa (OoA) populations, more so than the likes of Yorubas, Mbutis and so on, despite being painted in Llorente et al. as entirely free of OoA influence (0%). A colleague I correspond with finds this intriguing as it could mean he carries a notable amount of pre-historic East African ancestry that may have a stronger affinity than usual for OoA populations; could be...

This PCA seems perhaps somewhat uninteresting in comparison to the two prior ones; however, it is fascinating that Mota clusters in this PCA with Hadzas, a population that seems similar to him in that it looks to be a mixture (for now) between African Hunter-Gatherer-related ("Khoisan" & "Pygmy") & pre-historic East African / "South-Sudanese-esque" ancestry (when you exclude a population like Dinkas' West-Central African / "Niger-Congo" related input).

The thing is, Mota also shares the most drift with modern Southwestern Ethiopians like Ari Blacksmiths (with this population for the time being tending to be the modern peak of the "Omotic" component) & this is enthralling because Mota in David's admixture analyses like Eurogenes K=36 turns up as substantially "Omotic":

Central African: 14%
Omotic: 68%
Pygmy: 14%
West African: 3%

Hadzas have turned up as substantially Omotic-related in the past as well, such as in this old analysis that proves quite good at spotting the Omotic-related input in various Horn African populations (I'd say that's its main use):

Bandar's run K=7

But of course, this does not need to imply actual Southwestern Ethiopian or Omotic speaker related input into Hadzas (wildly unlikely), but it's riveting that a pre-historic sample like Mota, much like Ari Blacksmiths and Cultivators, displays an affinity for Hadzas as well. Perhaps this implies some shared pre-historic ancestry between the likes of some Southeastern Africans & the likes of Mota.

That last bit might be the case given that both some of the African ancestry in Hadzas & some of the African ancestry in Mota has displayed a weird affinity for Basal Eurasian admixed pre-historic West Eurasian populations at different times via tree-mix:

Hadza-related migration edge toward Stuttgart:

Mota-related migration edge toward :

Given Mota's strange pull toward OoA populations (perhaps I am making too much of this?) despite being claimed to be vastly less OoA influenced than all modern African populations, Hadzas and Mota could share in a pre-historic East African / South Sudanese-esque element that is perhaps shifted more than usual toward Out-of-Africa populations, whose origins tend to point toward East Africa from a genetic and archaeological perspective.

This last bit is highly speculative and I place no great weight behind what I've said, but I suppose it is food for thought.

Another thing I would note before finishing, however, is that in the last PCA both the Hadza and Mota display a clear pull away from other Africans toward Hunter-Gatherer groups such as the Ju-Hoan-North, though they pull not exactly toward them but in their general direction (right), perhaps corroborating that Mota carries ancestry at the very least related to these types of divergent African populations.



Friday, October 30, 2015

Inferences that can be made from a Sudanese Arab's Gedmatch results

A while back I said I'd be sharing the various genomic results of both a Sudanese Arab man & a Nubian woman who were commercially tested (via 23andme) because this would allow for the utilization of more SNPs when dealing with analyses like admixture or PCAs.

The reason I wanted to do this is that while Dobon et al. 2015 sampled a lot of Nubians & Sudanese Arabs for about 200,000 SNPs, which should allow for high quality analyses, they utilized a genotyping chip whose SNPs overlap poorly with those on the chips used for genotyping samples in other datasets.

For example, the admixture analysis above from Dobon et al. 2015, if I correctly recall what I was told by one of the study's researchers via email, only utilized about 15,000 SNPs and, indeed, the decent enough PCAs I shared in my post about Dobon et al. also used such a low number of SNPs.
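To make the chip-overlap problem a bit more concrete, here is a minimal Python sketch of how one might count the markers two datasets share before merging them. The rsIDs and the function names (`shared_snps`, `overlap_fraction`) are made up for illustration and are not taken from Dobon et al. or any real chip manifest. Going by the figures above, roughly 15,000 usable SNPs out of about 200,000 genotyped would be an overlap of only about 7.5%.

```python
def shared_snps(chip_a, chip_b):
    """Return the set of SNP IDs present on both chips."""
    return set(chip_a) & set(chip_b)


def overlap_fraction(chip_a, chip_b):
    """Fraction of chip_a's SNPs that are also found on chip_b."""
    a = set(chip_a)
    if not a:
        return 0.0
    return len(a & set(chip_b)) / len(a)


# Hypothetical marker lists; real chips carry hundreds of thousands of IDs.
study_chip = ["rs1", "rs2", "rs3", "rs4"]
reference_chip = ["rs3", "rs4", "rs5"]

print(sorted(shared_snps(study_chip, reference_chip)))
print(overlap_fraction(study_chip, reference_chip))
```

Only the markers in that intersection can be used once the two datasets are merged, which is how a 200,000 SNP dataset can end up contributing far fewer SNPs to an analysis.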

It's usually preferable for PCAs like the one below, for example, to utilize 100,000 SNPs or more for high quality results:

The PCAs I shared are good enough and the inferences that can be made from them like Bejas seeming as though they are intermediates between certain Horn Africans (Somalis, Habeshas, Wolaytas, Oromos) & Sudanese Arabs + Nubians aren't wrong at all. [note]

On the other hand, admixture analyses become increasingly unstable & therefore unreliable the fewer SNPs you use, so such a low number of genetic markers is problematic for them.
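A rough way to see why fewer SNPs means noisier admixture estimates is a toy simulation. The Python sketch below is only an illustration under simplifying assumptions of my own: two hypothetical source populations with known allele frequencies, independent SNPs (no linkage), and a plain least-squares estimate of one ancestry fraction. It is not the algorithm used by ADMIXTURE or by the Eurogenes calculators, and the SNP counts are scaled down so it runs quickly.

```python
import random
import statistics


def simulate_estimate(n_snps, true_alpha, rng):
    """Simulate one admixed individual and estimate the ancestry fraction."""
    num = 0.0
    den = 0.0
    for _ in range(n_snps):
        p1 = rng.uniform(0.05, 0.95)  # allele frequency in hypothetical source A
        p2 = rng.uniform(0.05, 0.95)  # allele frequency in hypothetical source B
        p_mix = true_alpha * p1 + (1 - true_alpha) * p2
        # Diploid genotype: two independent allele draws at the mixed frequency.
        g = (rng.random() < p_mix) + (rng.random() < p_mix)
        d = p1 - p2
        # Least squares for alpha in: g/2 - p2 = alpha * (p1 - p2) + noise
        num += (g / 2 - p2) * d
        den += d * d
    return num / den


def spread(n_snps, replicates=20, true_alpha=0.4, seed=1):
    """Standard deviation of the estimate across simulated individuals."""
    rng = random.Random(seed)
    estimates = [simulate_estimate(n_snps, true_alpha, rng) for _ in range(replicates)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (500, 20000):
        print(n, "SNPs: spread of estimates =", round(spread(n), 4))
```

Under these assumptions the spread shrinks roughly with the square root of the number of SNPs, so going from about 15,000 to 500,000 markers would be expected to cut the noise by a factor of somewhere around 5 to 6. Real data has linked markers and many more components, so the actual gain will differ.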

So what I essentially wanted to do was circumvent this issue by taking two people who were not tested in this study and who were genotyped for anywhere between 500,000 and 1,000,000 SNPs (like at 23andme) & running them through some admixture analyses in order to reliably compare them to modern populations in terms of, for example, just how much ancestry they may share with Somalis & Habeshas.

I'm happy to say the data very much supports what was noticeable via the PCAs I shared that were ultimately owed to David Wesolowski who runs the Eurogenes genome blog and project.

Basically, what the results for Eurogenes K=36 suggest is that Horn Africans such as Somalis & Tigrinyas are, in terms of more recent ancestry, closer to one another, which explains why in various PCAs like the one below we cluster off together rather than either population clustering with Sudanese populations instead:

By more recent ancestry I mean ancestry more along the lines of the last few thousand years. In terms of such ancestry it's obvious that most Horn Africans of Cushitic & Ethiopian Semitic speaking populations are closer to each other than either is to Sudanese Arab or Nubian groups, as the PCAs suggest and as this Sudanese Arab's admixture results do as well.

Although another inference that can be made is that it seems quite evident lots of likely post-Neolithic ancestry is shared between Horn African groups like Somalis & Northern Sudanese groups like Sudanese Arabs.
I say this once again based on these Eurogenes K=36 results. Firstly it must be noted that these results were acquired via Gedmatch which allowed me to utilize David Wesolowski's K=36 admixture calculator; however, these results are skewed to some extent by the notorious calculator effect.

This is why, for example, the Eritrean Tigrinya, this Sudanese Arab chap & I are showing "East African", which is actually a Maasai-peaking / seemingly Maasai-based component from what I understand. In the original table David made for this analysis, things are more fine-tuned and not affected by the calculator effect, and Somalis alongside Tigrinyas are missing components like "East African".

However, the Northeast African component here peaks in Somalis & is somewhat similar to the "Ethio-Somali" component or the "Lowland East Cushitic" component. And of course, even if it is now being shown via a calculator effect; the "East African" component carries with it Horn African-related ancestry.

This Sudanese Arab along with Dobon et al.'s Nubians & Sudanese Arabs as the global PCA below including them (that's a bit messy because of the low number of SNPs utilized) suggests- :

-is on a rather fundamental scale very similar to Horn African populations & essentially a mixture between West Eurasian ancestry and African ancestry related to the kind of ancestry that forms the non-Niger-Congo-related ancestry in populations such as Dinkas. [note]

In fact, in this regard he is somewhat more similar to Tigrinyas than I am as he is closer to them in West Eurasian ancestry levels than Somalis are as you can see below:

 However, the difference seems to be that he has some much more "recent" West Eurasian-derived ancestry than either Somalis or Tigrinyas do, as evidenced by his higher "Arabian", "Near Eastern" & "Eastern Mediterranean" scores than either me or my Tigrinya colleague. [note]

Another distinction seems to be that he has much more recent Nilo-Saharan speaker-related ancestry, evidenced by his "Central African" score (peaks in South Sudanese, if I recall correctly), which also explains why, in various admixture analyses available at Gedmatch, he consistently displays Niger-Congo-related ancestry, which is present in contemporary South Sudanese populations and seemingly also populations like Darfurians.

The "East African" component-based ancestry in Horn Africans, while seemingly related to the non-Niger-Congo ancestry in Nilo-Saharan speakers such as Dinkas, isn't exactly like theirs & is seemingly mostly to entirely pre-historic in origin / extremely ancient, hence why we lack the likely ancient & substantial Niger-Congo-related ancestry present in populations like Dinkas; our non-West Eurasian ancestry is not derived from them.

However, in this Sudanese Arab's case it does seem as though he derives actual ancestry from groups like the South Sudanese rather than pre-historic populations similar to much of the ancestry in them. This is quite understandable as many Sudanese Arabs used to be actual Nilo-Saharan speakers prior to their Arabization, and Nubians are Nilo-Saharan speakers.

As for the signals of perhaps post-Neolithic Horn African-related / Somali-like ancestry showing in this Sudanese Arab, I would have liked to see the results of a Nubian in these kinds of analyses but what this likely suggests to me is that this is due to ancestry from actual early to not so early Cushites being present in Sudanese Arabs & Nubians. [note]

As is suggested by both archaeological and linguistic studies, some of the earliest inhabitants of Northern Sudan (as early as the Neolithic) were in many cases people of Cushitic speaking origins who were eventually assimilated by the Nilo-Saharan speaking ancestors of populations like Nubians just a few thousand years ago. [2] [3]

The Kerma culture's people are suggested by some linguists to have been Cushitic speakers

Then there's of course the strong & long-time presence of actual Cushitic speakers seemingly closely related to Horn African populations such as Bejas [note] who are, in some cases, thought to be a relic of not just Sudan's early Cushitic speaking nature but also that the ancestors of Horn African Cushitic speakers originally migrated into the Horn from northerly regions such as Sudan.

Further linguistic, archaeological & genomic study is needed to confirm these possibilities of shared ancestry from, say, the last few thousand years or so between Nubians and Somalis, for example (the sampling of ancient genomes would be the most ideal)...

Northeast Africa
Nevertheless, for now those are the best inferences I can make using this one man as a proxy for Dobon et al.'s various samples, of whom he seems representative. Granted, one should recall that, at least in terms of admixture levels (proportions of West Eurasian and African ancestry), Sudanese Arabs and Nubians seem quite heterogeneous.

For now my bet would be that Nubians for example are a mixture between earlier Cushitic or generally Afro-Asiatic speaking peoples perhaps genetically similar to Somalis, populations perhaps similar to Copts from earlier periods of Egypt & populations similar to contemporary Nilo-Saharan populations such as Dinkas with Sudanese Arabs being all that + some later Arabian-related gene flow. [note]

Only further genomic study of these groups will tell and what I wrote in the above paragraph is honestly just an educated guess on my part based on our current data.

Reference List:


1. Eurogenes ANE K=7 mostly inflates the ENF & WHG-UHG scores of people who've been run through it (Northwest Africans turning up as over 20% WHG-UHG as opposed to the more fine-tuned K=8's ~15% values) though its ENF scores are not inflated by much (Somalis hopping from 41% in K=8 to 42% in K=7). This is because the run was honestly mostly designed to spot ANE-related ancestry & not as much care was put into the other components. [note] Sub-Saharan African is just a result of me adding the "East African" & "West African" components in ANE K=7 together to neaten things up.

3. Notes from a friend more knowledgeable than I in this particular subject on the Niger-Congo-related admixture in groups like the South Sudanese: Link to note