Tuesday, December 4, 2018

Paternal and maternal origins of Ashkenazi Jews

Putting a lid on the Khazar myth with uniparentals


This is the fourth and last part of my Khazar special series. Read part I, part II and part III if you haven't already.

Until now, I've mostly focused on autosomal analyses to refute the Khazar narrative by showing the high similarity and overlap between Ashkenazi Jews and other Western Jews (and East Mediterranean populations in general). This means that I’ve dealt with overall ancestry, reflected in our 22 pairs of autosomes (the non-sex chromosomes), and not with any sex-specific markers.

However, sex-specific markers (or uniparental lineages), which are passed down through the generations from father to son and from mother to daughter, can actually offer more direct "proof" of origin and in many cases complement the autosomal results.

First, let's start with the paternal lineages, or the genetic markers men inherit via their fathers, their grandfathers, their great-great-great-grandfathers, and so on.

To understand why it is of such importance in determining accurate ancestry, let's first understand how it is inherited. Most of the DNA in our body is packed into 23 pairs of chromosomes. The first 22 pairs are matching, meaning that they contain roughly the same DNA inherited equally from both parents. The 23rd pair is different because in men, the pair does not match. The chromosomes in this pair are known as "sex" chromosomes and they have different names: X and Y. Women have two X chromosomes and men have one X and one Y.

Each generation, fathers pass down copies of their Y chromosomes to their sons essentially unchanged. Between generations, the matching chromosomes in the other 22 pairs make contact and exchange segments of DNA. However, the Y chromosome skips this step. Instead, a nearly identical copy is handed down each time.




But every now and then, small changes to the DNA sequence do occur in the Y chromosome. These changes, called mutations, create new genetic variants on the Y chromosome. Because the Y does not recombine between generations, these variants collect in patterns that uniquely mark individual paternal lineages. We call these new genetic variants "subclades" of the original Y lineage. And because humans have moved around a lot in history, subclades are an important step in actually tracing these migrations. This enables us to establish a direct line of ancestry between populations which no longer seem to originate from the same location, and also to clarify the picture autosomal DNA gives us.

So for example, while South Italians, Aegean Greeks and Maltese all seem to autosomally overlap with Western Jews, when one compares the latter's paternal lineages, the vast majority of those markers come from the Levant, rather than from South Eastern Europe as in the former.

Here are the proportions of the most common paternal genetic marker groups (called haplogroups) among Lebanese people, divided by the different denominations (source from  here):



And these are the proportions of the most common paternal haplogroups among Ashkenazi Jews:



The comparison makes it clear that Ashkenazi paternal haplogroups as well as their proportions are extremely similar to Lebanese ones. Also, we can test to see if these are also similar to Sephardic paternal haplogroups, considering how we just saw that the two populations - Ashkenazi and Sephardi Jews - overlap in PCAs showing genetic distance between populations:



So it’s quite clear that Sephardic and Ashkenazi paternal haplogroups are very similar to both each other and to modern Levantine populations (especially the Lebanese).

We can compare these paternal haplogroups and their proportions with those of several populations that probably have some Khazar ancestry. We can assume this because they have lived in the region of the Khazar kingdom for centuries, or because they descend from peoples who are known to have lived there or who are closely related to the Khazars linguistically.

The Khazar language is considered closely related to the Chuvash language and, according to Wikipedia, “Modern Chuvashes claim to descend from Sabirs and Kazan Tatars from the Volga Bulgars.” The tenth century Arab writer Al-Istakhri wrote the following: "the language of the Bulgars is like the language of the Khazars”. Here he meant the Volga Bulgars, from whom the Chuvash seem to be descended.

So let’s compare the haplogroups of the Ashkenazi Jews and Lebanese shown above to the paternal haplogroups of the Chuvash people, and see how much they resemble each other:




Clearly they are almost nothing alike. There’s also a map of where most of the Chuvash were concentrated, which is in the location where Volga Bulgaria used to be, which was once ruled by Khazaria, and is close to Khazaria proper.

Kevin Brook, author of the book The Jews of Khazaria (which, as I've mentioned previously, I highly recommend for reliable information about the Khazars), thinks that the North Caucasian Karachay people might be a better candidate for claiming descent from the Khazars (you can read his article about the Karachay people at his website Khazaria.com here):

Karachay (meaning "Black River") — also spelled Karachai — are a people living in the North Caucasus....
Some researchers believe the Karachays descend mainly from medieval Kipchaks (Cumans) and Alans. There are also theories according to which the Khazars, Bulgars, and/or Huns helped to form this people.
... about 5.8% of Karachays have the Y-DNA haplogroup R1b1b1.... More common than R1b varieties is R1a1 at about 39% according to the latest results... The "Karachay-Balkar DNA" project includes many ethnic Karachays with R-M459 (part of R1a1) and R-M512 (R1a1a). One each has R-M417 (R1a1a1) and R-Z94 (R1a1a1b2a). G2a is found among Karachays at about 34% and J2a at about 11%.

So judging from their paternal haplogroups distribution, it's quite clear they couldn't possibly be the ancestors of modern Ashkenazi Jews.
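For readers who would rather put a number on "nothing alike" than eyeball charts, here is a minimal Python sketch. It is not the method used to make the charts in this post; it simply computes the total variation distance (half the sum of the absolute differences) between two haplogroup frequency profiles. The Karachay shares are the ones quoted above, while the second profile is an invented placeholder that exists only to make the example run, and the function name `profile_distance` is my own.

```python
def profile_distance(p, q):
    """Total variation distance between two haplogroup frequency profiles.

    p and q map haplogroup name -> share in percent. Haplogroups missing
    from a profile count as 0. Identical profiles give 0; for profiles
    that each sum to 100, completely disjoint ones give 100.
    """
    haplogroups = set(p) | set(q)
    return 0.5 * sum(abs(p.get(h, 0.0) - q.get(h, 0.0)) for h in haplogroups)


# Karachay shares as quoted from Khazaria.com above (they do not sum to
# 100 because rarer haplogroups are not listed in the quote).
karachay = {"R1b": 5.8, "R1a": 39.0, "G2a": 34.0, "J2a": 11.0}

# PLACEHOLDER profile, invented for illustration only. Replace it with
# real frequencies from a published table before drawing any conclusion.
placeholder = {"R1b": 10.0, "R1a": 10.0, "G2a": 10.0, "J2a": 20.0, "E1b": 20.0}

print(profile_distance(karachay, karachay))     # identical profiles
print(profile_distance(karachay, placeholder))  # dissimilar profiles
```

A larger value means the two populations share less of their paternal haplogroup makeup; which threshold counts as "similar" is a judgment call that the sketch does not settle.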

Next, let’s also compare Ossetian haplogroups - they are considered by historians as modern descendants of the Alans, a Steppe Iranic people closely related to the Khazars (they got their independence from Khazaria around the 9th century CE):




Again, there is very little similarity.

Now that we’ve finished with the Caucasus’ Iranic people (Ossetians, descended from the Alans) and Finno-Ugric Turkic people (Chuvash people, descended from Volga Bulgars), let’s check Tatars, who are also sometimes considered to be descendants of the Volga Bulgars and other ethnic groups related to the Khazars:




Once again, as expected, there are few similarities in terms of both proportions and actual haplogroups.

Although I think I've made it abundantly clear that the vast majority of Ashkenazi paternal lineages have no discernable Khazar or Turkic origin, proponents of the Khazar hypothesis have often relied on two paternal haplogroups present in Ashkenazi Jews as proof of Khazar ancestry:
  • Haplogroup Q-M242 (my own haplogroup), which is found among ~5% of Ashkenazi Jewish men and is a major Central Asian and North Amerindian haplogroup.
  • Haplogroup R1a, which is found among ~50% of Ashkenazi Levites and is also very common among peoples from East Europe and the North Caucasus.
While these haplogroups are relatively rare among Ashkenazi Jews, they have nonetheless been used as a smoking gun for paternal East European and Turkic ancestry among Ashkenazi Jews. After all, if you can’t prove that the majority of Ashkenazi Jews are descended from the Khazars, why not at least claim that some Ashkenazi Jews are?

However, this route is yet another dead end for the Khazar hypothesis.

Let’s first discuss haplogroup Q-M242. Do you remember how I mentioned before that subclades are important? Well, although this paternal lineage is primarily concentrated in Central Asians, Turks, and Mongols, all Ashkenazi Jews who carry it have been found to belong to a specific subclade that originates in the Middle East and is shared with non-Jewish populations native to that region.
This subclade is called Q-M378. It seems to have diverged from its Central Asian parent (Q-M242) around 8,000-12,000 years ago and was carried into Western Asia at that time, thousands of years before the Judeans and Israelites first appeared in the Levant. For all intents and purposes, by the time the Jews came to form as a people it could be considered native to Western Asia. More specifically, an additional "child" subclade of Q-M378, Q-L245, is shared among Ashkenazi, non-Ashkenazi Western, and Mizrahi Jews.

On top of this, Ashkenazi Jews seem to all belong to a specific child subclade of Q-L245: Q-Y2232, whose age roughly correlates to when Jews first began leaving the Levant.
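The nesting of these subclades is easy to lose track of, so here is a small Python sketch that stores only the parent-child relationships named in the text above and walks from a subclade back to its root. The helper names `lineage` and `descends_from` are made up for this example, and the tree is deliberately incomplete: it contains just the four markers mentioned in this post.

```python
# Parent of each subclade, exactly as described in the text above.
PARENT = {
    "Q-M378": "Q-M242",
    "Q-L245": "Q-M378",
    "Q-Y2232": "Q-L245",
}


def lineage(subclade):
    """Return the chain from a subclade up to the root haplogroup."""
    chain = [subclade]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def descends_from(subclade, ancestor):
    """True if `ancestor` is on the path from `subclade` to the root.

    A subclade counts as being on its own path.
    """
    return ancestor in lineage(subclade)


print(" <- ".join(lineage("Q-Y2232")))
# Q-Y2232 <- Q-L245 <- Q-M378 <- Q-M242
```

This is why the subclade matters more than the top-level label: two men can both be "Q-M242" while sitting on branches that split thousands of years apart.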

To better help you visualize this somewhat complicated subject, I’ve included a branch tree of the Q-M242 Jewish subclades, taken from the Jewish Q FamilyTreeDNA project:


As you can see from this tree, all Western Jews and Mizrahi Jews share the same West Asian branch of Q-M242, which predates the Jewish diaspora and has no traceable or reasonable connection to Khazar ancestry.  The only Jewish exception is among Yemenite Jews, whose Q subclade seems to be related to Arabian ancestry.

Not surprisingly, Eran Elhaik has chosen to ignore the overwhelming evidence and recently, in September 2018, claimed that Ashkenazi Q is of Khazar origin, further dissolving any vague semblance of academic or scientific merit in his studies.

Regarding the R1a subclades that are found in ~50% of Ashkenazi Levites, these too turned out to be of West Asian origin. The Ashkenazi subclade of M582 is shared with Near Eastern populations rather than East Europeans. (You can learn more about this here.)

The evidence has spoken loud and clear: the vast majority of paternal lineages found in Ashkenazi Jews seem to originate in the Middle East and are shared with other Western Jews.

Now let’s shift our focus to Ashkenazi maternal lineages. Here the picture is a little bit more complicated.

While the paternal lineage is passed down through the Y chromosome, women have no Y chromosome but two X chromosomes instead. Men always get their single X chromosome from their mother, but the mother herself got one X from her father and one X from her mother. That means her son has a 50% chance of getting either the paternal X or the maternal X, which essentially means there is no way to trace genealogy this way.

So how can we trace maternal lineages? Well, as it turns out, the mitochondria (small structures inside our body cells that turn fuel from the food we eat into energy) carry DNA which is passed down solely from the mother, and grandmother, and great-great-grandmother, and so on. The reason for this is that when a man's sperm cell fertilizes a woman's egg, the part of the sperm that carries its mitochondria, which give it the energy it needs to move and reach the egg, stays outside of the egg, and so the embryo gets only the mitochondria of the egg cell. Here's an illustration (circled in red is the only part that gets absorbed in the egg, so, as can be seen, no male mitochondria):




We call this DNA marker mtDNA, and it is passed on relatively unchanged, similarly to the Y chromosome in men. And, like the Y chromosome, mtDNA also mutates every now and then, so we have mtDNA subclades, similar to paternal subclades.

So what do we know about Ashkenazi Jews' mtDNA, or maternal lineages? Well, here the picture isn't as simple. In 2013, Costa et al. published the most extensive and in-depth study of the maternal origins of Ashkenazi Jews. They found that the large majority of maternal lineages, 81%, originated in Europe (mostly Southern Europe, but also some from Central and Eastern Europe), 8.3% originated in the Near East, 1.1% originated elsewhere in Asia, and 9.9% were too difficult to assign to either Europe or the Near East due to overlap.

Below is a pie chart from that study that clearly distinguishes the different origins of maternal haplogroups among Ashkenazi Jews:



The study also revealed or reiterated several important facts:
  • There exist a number of shared maternal haplogroups between Sephardic and Ashkenazi Jews, which demonstrates a common origin for both Western Jewish communities:
"Evidence for haplotype sharing with non-Ashkenazi Jews for each of the three main haplogroup K founders may imply a partial common ancestry in Mediterranean Europe for Ashkenazi and Spanish-exile Sephardic Jews"
  • Most maternal uniparental haplogroups in Ashkenazi Jews come from Mediterranean Europe, and there's little to no difference between the various Ashkenazi communities:
"These analyses suggest that the first major wave of assimilation probably took place in Mediterranean Europe, most likely in the Italian peninsula ~2 ka, with substantial further assimilation of minor founders in west/central Europe. There is less evidence for assimilation in Eastern Europe, and almost none for a source in the North Caucasus/Chuvashia, as would be predicted by the Khazar hypothesis8,9—rather, the results show strong genetic continuities between west and east European Ashkenazi communities10, albeit with gradual clines of frequency of founders between east and west"
  • Ashkenazi paternal lineages originate in the Levant:
"As might be expected from the autosomal picture, Y-chromosome studies generally show the opposite trend to mtDNA with a predominantly Near Eastern source"

In conclusion, it’s safe to say that the case has been closed from a genetic standpoint, both in terms of autosomal makeup and uniparental lineages.

However extensive Khazar conversion to Judaism may have been, it left virtually no genetic impact on any living Jewish community.

I hope that these entries on the Khazar hypothesis have adequately demonstrated that Ashkenazi Jews mostly descend from and still belong among East Mediterranean peoples, and that they are extremely closely related to all other Western Jews (Sephardi, Romaniote, Italian, Syrian, and North African), none of whom are suspected of having any Khazar ancestry.

Thursday, November 15, 2018

DIY: Refuting the Khazar myth

Showing the Khazar fallacy with open genomics tools


This is the third part of my Khazar special series. Read part I and part II if you haven't already.

In my previous posts, I've reviewed the academic papers refuting the Khazar "hypothesis" and debunked the most "serious" attempts to actually prove this narrative (Elhaik et al., Das et al., etc.).

However, since most people would find it difficult to actually understand everything that is written in those studies, I think the best way to find out something is to do it yourself.

Fortunately, simple at-home DNA ancestry tests such as 23andme, AncestryDNA, and MyHeritage are now easily accessible and can confirm the conclusions of those peer-reviewed publications without much effort. And thanks to open source and open data communities, anyone with a GEDmatch account (GEDmatch is an open data personal genomics database and genealogy website) and a little bit of technical ability can verify this for themselves in no time.

There are a number of ADMIXTURE calculators on GEDmatch that you can run on your own kit, or on any other kit, to see both the closest single populations and the closest mixes of populations.

For example, let's run my own kit - a full East European Ashkenazi Jew:

Eurogenes K13 calculator results:

Single Population Sharing:

#    Population (source)    Distance
1    Ashkenazi              4.14
2    East_Sicilian          9.03
3    Italian_Jewish         9.18
4    Central_Greek          9.64
5    South_Italian          9.73
6    Algerian_Jewish        10.09
7    Sephardic_Jewish       10.66
8    West_Sicilian          10.91
9    Italian_Abruzzo        12.03
10   Greek_Thessaly         12.33
11   Tunisian_Jewish        12.9
12   Libyan_Jewish          13.8
13   Cyprian                15.76
14   Tuscan                 16.14
15   Lebanese_Muslim        19.41
16   Bulgarian              19.63
17   Lebanese_Druze         20.7
18   Syrian                 20.9
19   Samaritan              21.17
20   Palestinian            21.84

As can be seen - the closest single population to my kit are other Ashkenazi Jews, followed by East Mediterranean non-Jewish populations and other (Western) Jewish populations.
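If you copy results like these out of the calculator page, the columns often arrive run together (for example `1Ashkenazi4.14`). The sketch below is one possible way to split such rows back into rank, population and distance in Python; it also accepts rows that still have their spacing. The pattern is my own guess at the row layout rather than anything GEDmatch documents, and it assumes population labels contain no digits.

```python
import re

# rank, population label, optional empty "( )" source column, distance.
ROW = re.compile(r"^\s*(\d+)\s*([A-Za-z_.]+)\s*(?:\(\s*\))?\s*(\d+(?:\.\d+)?)\s*$")


def parse_row(line):
    """Split one result row into (rank, population, distance)."""
    match = ROW.match(line)
    if match is None:
        raise ValueError(f"unrecognised row: {line!r}")
    rank, population, distance = match.groups()
    return int(rank), population, float(distance)


# Three rows taken from the Eurogenes K13 results above.
rows = [parse_row(r) for r in ("1Ashkenazi4.14", "2East_Sicilian9.03", "13Cyprian15.76")]
closest = min(rows, key=lambda row: row[2])
print(closest)  # (1, 'Ashkenazi', 4.14)
```

Once the rows are tuples, sorting, filtering (say, Jewish versus non-Jewish references) or comparing the output of different calculators takes only a line or two each.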

My ADMIXTURE results also show I am pretty much your average Ashkenazi Jew:

Mixed Mode Population Sharing:

# Primary Population (source) Secondary Population (source) Distance
1 97.2% Ashkenazi + 2.8% Lebanese_Druze @ 4.1
2 100% Ashkenazi + 0% Abhkasian @ 4.14
3 100% Ashkenazi + 0% Adygei @ 4.14
4 100% Ashkenazi + 0% Afghan_Pashtun @ 4.14
5 100% Ashkenazi + 0% Afghan_Tadjik @ 4.14
6 100% Ashkenazi + 0% Afghan_Turkmen @ 4.14
7 100% Ashkenazi + 0% Aghan_Hazara @ 4.14
8 100% Ashkenazi + 0% Algerian @ 4.14
9 100% Ashkenazi + 0% Algerian_Jewish @ 4.14
10 100% Ashkenazi + 0% Altaian @ 4.14
11 100% Ashkenazi + 0% Armenian @ 4.14
12 100% Ashkenazi + 0% Assyrian @ 4.14
13 100% Ashkenazi + 0% Austrian @ 4.14
14 100% Ashkenazi + 0% Austroasiatic_Ho @ 4.14
15 100% Ashkenazi + 0% Azeri @ 4.14
16 100% Ashkenazi + 0% Balkar @ 4.14
17 100% Ashkenazi + 0% Balochi @ 4.14
18 100% Ashkenazi + 0% Bangladeshi @ 4.14
19 100% Ashkenazi + 0% Bantu_N.E. @ 4.14
20 100% Ashkenazi + 0% Bantu_S.E. @ 4.14
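To give a feel for what a line such as "97.2% Ashkenazi + 2.8% Lebanese_Druze @ 4.1" means, here is a hedged Python sketch of a two-population fit. I am assuming the calculator looks for the mixing share that brings a blend of two reference populations as close as possible (in Euclidean distance) to the kit's admixture percentages; GEDmatch's actual implementation is not described in this post and may differ. The function name and all of the numbers are invented for illustration.

```python
import math


def best_two_way_mix(kit, pop_a, pop_b):
    """Find the share w of pop_a (and 1 - w of pop_b) closest to the kit.

    All three arguments are equal-length lists of admixture component
    percentages. Returns (w, distance), with w clipped to the range 0..1.
    """
    diff_ab = [a - b for a, b in zip(pop_a, pop_b)]
    diff_kb = [k - b for k, b in zip(kit, pop_b)]
    denom = sum(d * d for d in diff_ab)
    # If both references are identical, any w fits equally well; use w = 1.
    w = 1.0 if denom == 0 else sum(x * y for x, y in zip(diff_kb, diff_ab)) / denom
    w = min(1.0, max(0.0, w))
    mix = [w * a + (1 - w) * b for a, b in zip(pop_a, pop_b)]
    return w, math.dist(kit, mix)


# INVENTED three-component vectors, for illustration only.
kit = [50.0, 30.0, 20.0]
pop_a = [52.0, 30.0, 18.0]
pop_b = [20.0, 30.0, 50.0]

w, distance = best_two_way_mix(kit, pop_a, pop_b)
print(f"{w:.1%} pop_a + {1 - w:.1%} pop_b @ {distance:.2f}")
```

Under this reading, the long run of "100% Ashkenazi + 0% ..." rows above simply means that adding any share of the second population would only push the blend further from the kit.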
  
And if I try some other calculators, I get pretty much the same picture:

puntDNAL K13 Global:

Single Population Sharing:

#    Population (source)    Distance
1    Ashkenazy_Jew          3.82
2    Italian_Sicilian       4.67
3    Greek_Central          5.86
4    Italian_Abruzzo        6.52
5    Sephardic_Jew          7.16
6    Greek_Thessaly         10.54
7    Albanian               11.07
8    Kosovar                12.41
9    Italian_Tuscan         13.03
10   Turkish                13.23
11   Cypriot                14.68
12   Turkish_Aydin          15.13
13   Turkish_Kayseri        15.74
14   Bulgarian              16.74
15   Macedonian             17.24
16   Syrian                 18.37
17   Romanian               18.58
18   Lebanese_Christian     18.95
19   Italian_Bergamo        18.99
20   Lebanese_Druze         19.19


MDLP K23b:

Single Population Sharing:

#    Population (source)    Distance
1    Ashkenazi_Jew          3.04
2    Sicilian_West          4.78
3    Sicilian_Siracusa      5.21
4    Sicilian_Agrigento     5.45
5    French_Jew             5.72
6    Maltese                5.87
7    Turk_Jew               5.9
8    Sephardic_Jew          5.98
9    Greek_Peloponnesos     6.27
10   Sicilian_Trapani       6.28
11   Sicilian_East          6.3
12   Italian_Jew            6.39
13   Ashkenazi              6.65
14   Greek_Northwest        7.14
15   Greek_Thessaloniki     7.53
16   Greek_Thessaly         7.54
17   Bulgarian              7.56
18   Macedonian             7.7
19   Moroccan_Jew           7.82
20   Romanian_Jew           8.56

And so forth. As can be seen, this is pretty consistent with what we've seen in the PCA I posted in my previous entry:




Ashkenazi Jews, like their genetically closest population, Sephardic Jews, are grouped together as part of the Western Jews cluster, which in a larger scope can be viewed as part of the East Mediterranean Continuum, which again means Ashkenazi Jews cluster genetically with other non-Jewish East Mediterranean populations - Aegean Greek populations (like from Crete, Rhodes etc.), Sicilians, Maltese and Cypriots.

The PCA above is made from academic samples gathered as part of Davidski's Global25 endeavour. It also includes coordinates for 471 Ashkenazi individuals (colored in grey) from Bray et al. 2010, who, as can be seen, all cluster tightly together, as expected from such an endogamous population.

Examining these ~500 Ashkenazi samples to find their lowest genetic distance, the shortest is to the Ashkenazi Global25 reference panel, followed by various East Mediterranean populations:

 Ashkenazi_G25_reference 1.031630  
 Maltese 1.995762  
 Italian_South 2.020773  
 Sicilian_East 2.166531  
 Sicilian_West 2.313971  
 Italian_Abruzzo 2.593161  
 Italian_Jew 2.757282  
 Greek_Crete 2.930938  
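For anyone who wants to reproduce a ranking like this, the sketch below shows the basic idea in Python, assuming the "genetic distance" is a plain Euclidean distance between PCA coordinates (the post does not spell out the exact scaling or averaging used). The coordinates and population labels are invented placeholders in three dimensions; real Global25 data has 25 dimensions per sample.

```python
import math


def rank_by_distance(target, references):
    """Sort reference populations by Euclidean distance to the target.

    `target` is a list of PCA coordinates; `references` maps a population
    label to a coordinate list of the same length.
    """
    distances = {name: math.dist(target, coords) for name, coords in references.items()}
    return sorted(distances.items(), key=lambda item: item[1])


# INVENTED coordinates, for illustration only.
target = [0.10, 0.20, 0.05]
references = {
    "pop_x": [0.10, 0.20, 0.08],
    "pop_y": [0.40, 0.20, 0.05],
    "pop_z": [0.10, 0.60, 0.05],
}

for name, dist in rank_by_distance(target, references):
    print(f"{name} {dist:.3f}")
```

With real data you would replace `target` with the average coordinates of the samples you are testing and `references` with the averaged reference populations.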


The same trend can be observed in other PCAs, based on Eurogenes K36 (one of the ADMIXTURE calculators found on GEDmatch) values, using a mix of both academic and non-academic samples:




As can be seen, with open genetic data now available and people getting their own ancestry profile and raw DNA for less than $100 these days, it's quite easy to refute the Khazar myth with about 5 minutes of work, simply by showing that Ashkenazi Jews cluster with all other Western Jews and close to other Mediterranean peoples.

But don't just take my word for it.
If you're of Ashkenazi origin, or know someone who is and has done one of those home ancestry tests, get hold of the raw data and upload it to GEDmatch.
Run any of the ADMIXTURE calculators.

And if you want to dive deeper, you can research nMonte, PAST3, and the other open source tools that I used to make the PCAs included in this post. There’s a vast world of information that’s easily accessible to anyone who is curious about genetics and wants to test theories out for themselves.

So far I've used autosomal analyses to refute the Khazar narrative. This means that I’ve dealt with overall ancestry and not with any sex-specific chromosomes.

In my next post, I will tackle uniparental lineages: the Y chromosome and mitochondrial DNA that are passed down through generations from father to son and mother to daughter. These can offer more detailed information in some areas that can complement what we’ve already seen. Stay tuned!

Monday, November 5, 2018

The Return of the Khazars

"Proving" the Khazar ancestry of Ashkenazi Jews with bad science


This is the second part of a series of blog entries dedicated to showing the invalidity of the theory that Ashkenazi Jews are Khazars. If you haven't read the first part yet, I suggest you do.

In my previous post, I referenced a long list of peer reviewed studies from the past decade or so that completely dismantle this now-defunct theory.

However, as I mentioned at the end of the post, the overwhelming amount of conclusive data was still not enough to kill off the Khazar theory.

In December 2012, Dr. Eran Elhaik published his peer-reviewed study "The Missing Link of Jewish European Ancestry: Contrasting the Rhineland and the Khazarian Hypotheses" in the Oxford journal Genome Biology and Evolution, which garnered a lot of publicity at the time by claiming to prove, using population genetics, that Ashkenazi Jews are indeed descendants of the Khazars.

At the time, the study’s controversial results, the scientific community’s rejection of it, and Elhaik's own cries that he's being persecuted due to politics rather than admitting that perhaps there was something wrong with his study, all helped to elevate its publicity. Most non-scientific journals that covered this study accepted it without even questioning its scientific validity.

However, the scientific community soon responded with force: in 2013, Human Biology published a study that refuted both Elhaik's claims and the Khazar narrative in general. The leading author of this paper was Doron M. Behar, a known geneticist and researcher on Jewish genetics. Another thirty scholars, many of whom are well known scientists, cosigned the paper. Here's the abstract of that study:
The origin and history of the Ashkenazi Jewish population have long been of great interest, and advances in high-throughput genetic analysis have recently provided a new approach for investigating these topics. We and others have argued on the basis of genome-wide data that the Ashkenazi Jewish population derives its ancestry from a combination of sources tracing to both Europe and the Middle East. It has been claimed, however, through a reanalysis of some of our data, that a large part of the ancestry of the Ashkenazi population originates with the Khazars, a Turkic-speaking group that lived to the north of the Caucasus region ~1,000 years ago. Because the Khazar population has left no obvious modern descendants that could enable a clear test for a contribution to Ashkenazi Jewish ancestry, the Khazar hypothesis has been difficult to examine using genetics. Furthermore, because only limited genetic data have been available from the Caucasus region, and because these data have been concentrated in populations that are genetically close to populations from the Middle East, the attribution of any signal of Ashkenazi-Caucasus genetic similarity to Khazar ancestry rather than shared ancestral Middle Eastern ancestry has been problematic. Here, through integration of genotypes on newly collected samples with data from several of our past studies, we have assembled the largest data set available to date for assessment of Ashkenazi Jewish genetic origins. This data set contains genome-wide single-nucleotide polymorphisms in 1,774 samples from 106 Jewish and non-Jewish populations that span the possible regions of potential Ashkenazi ancestry: Europe, the Middle East, and the region historically associated with the Khazar Khaganate. The data set includes 261 samples from 15 populations from the Caucasus region and the region directly to its north, samples that have not previously been included alongside Ashkenazi Jewish samples in genomic studies.
Employing a variety of standard techniques for the analysis of population-genetic structure, we find that Ashkenazi Jews share the greatest genetic ancestry with other Jewish populations, and among non-Jewish populations, with groups from Europe and the Middle East. No particular similarity of Ashkenazi Jews with populations from the Caucasus is evident, particularly with the populations that most closely represent the Khazar region. Thus, analysis of Ashkenazi Jews together with a large sample from the region of the Khazar Khaganate corroborates the earlier results that Ashkenazi Jews derive their ancestry primarily from populations of the Middle East and Europe, that they possess considerable shared ancestry with other Jewish populations, and that there is no indication of a significant genetic contribution either from within or from north of the Caucasus region.


Citations:

Behar, Doron M.; Metspalu, Mait; Baran, Yael; Kopelman, Naama M.; Yunusbayev, Bayazit; Gladstein, Ariella; Tzur, Shay; Sahakyan, Hovhannes; Bahmanimehr, Ardeshir; Yepiskoposyan, Levon; Tambets, Kristiina; Khusnutdinova, Elza K.; Kusniarevich, Aljona; Balanovsky, Oleg; Balanovsky, Elena; Kovacevic, Lejla; Marjanovic, Damir; Mihailov, Evelin; Kouvatsi, Anastasia; Triantaphyllidis, Costas; King, Roy J.; Semino, Ornella; Torroni, Antonio; Hammer, Michael F.; Metspalu, Ene; Skorecki, Karl; Rosset, Saharon; Halperin, Eran; Villems, Richard; and Rosenberg, Noah A.

These are basically the foremost experts on Jewish genetic studies.

And you can read the entire rebuttal here:

No Evidence from Genome-Wide Data of a Khazar Origin for the Ashkenazi Jews

In response to this rebuttal, Elhaik seemingly became obsessed with proving the Khazar hypothesis, going so far as to create an entire website dedicated to it. And when his previous attempts fell flat in the face of science, history, and logic, he continued publishing similar follow-up "studies" as part of Das et al. in 2016 and 2017, which was essentially akin to trolling the scientific community.

To understand just how problematic Elhaik's papers and theories are, one has to understand that population genetics is essentially a comparative field. You have to carefully construct reference groups as a basis for relationships between populations. For example, if I assume that my Near East/Middle East reference group will be composed of Iraqis, Iranians and Kurds, I might end up with Lebanese, Druze and Palestinians getting bogus results that they are only 50% Middle Eastern / Near Eastern, and about 50% South European.

In addition to this, assumptions about modern populations representing ancient ones need to be carefully verified with ancient DNA samples before being used as such. A good example is the Haber et al. paper from 2017, which successfully established that modern-day Lebanese are pretty good proxies for Bronze Age Canaanites by testing ancient samples found in Sidon, Lebanon and dated to ~1750 BC. This paper found a 93% correlation between modern-day Lebanese and those ancient samples. So, it's safe to say that we can use Lebanese as a good modern reference population for Levantine ancestry.

And while Elhaik's 2012 paper has numerous flaws, these two factors—reference groups and using modern populations as proxies of ancient populations—are where his entire narrative totally collapses.

First, in his paper, he seems to have intentionally omitted all Western Jewish populations except for Ashkenazi Jews. Considering what data was widely available at the time of this study, 2012, this seems to have been a deliberate and calculated move, as one cannot escape the thought that he knew that the autosomal similarities between Ashkenazi, Sephardi, Italian, and even North African Jews would completely undermine the entire premise of his study. Oddly enough, he actually admits to having Sephardic Jewish samples, yet without giving any clear reason, states:
In congruence with the literature that considers “Ashkenazi Jews” distinct from “Sephardic Jews,” we excluded the later.
Just like that. No reason is given why they were excluded.

Second, in what Elhaik describes as the choice of "surrogate" populations—essentially what I've described as using modern populations as proxies for ancient ones—he states the following:
Choice of Surrogate Populations
As the ancient Judeans and Khazars have been vanquished and their remains have yet to be sequenced, in accordance with previous studies (Levy-Coffman 2005; Kopelman et al. 2009; Atzmon et al. 2010; Behar et al. 2010), contemporary Middle Eastern and Caucasus populations were used as surrogates. Palestinians were considered proto-Judeans because they are assumed to share a similar linguistic, ethnic, and geographic background with the Judeans and were shown to share common ancestry with European Jews (Bonné-Tamir and Adam 1992; Nebel et al. 2000; Atzmon et al. 2010; Behar et al. 2010). Similarly, Caucasus Georgians and Armenians were considered proto-Khazars because they are believed to have emerged from the same genetic cohort as the Khazars (Polak 1951; Dvornik 1962; Brook 2006).

Essentially, he chose to represent ancient Levantine Jewish population with modern day Palestinians, and Khazars with Georgians and Armenians. The funny thing is that he actually claims one of the reasons for choosing Palestinians as his reference group was that they were shown to share common ancestry with European Jews! This here alone indicates that he recognizes the Levantine ancestry of Ashkenazi Jews.

However, using Palestinians as a reference group for ancient Judean Jews, with no concrete historical or genetic evidence for such a connection available at the time, can rightly be considered more politically driven than good science, and using Armenians (or Georgians) as Khazar proxies is just odd.

Palestinians are Levantine people, just like ancient Judean Jews most likely were. However, the majority of Palestinians today are Muslim, and Muslim Levantines are not the best proxy for ancient Levantines because previous studies have established that they drift towards North African and peninsular Arab (Saudi, Yemenite) populations. It is true that Elhaik's original paper came out years before Haber et al. provided proof that Lebanese are a much better proxy for the ancient Levant. However, the drift that Levantine Muslims show on the different PCAs, and even some degree of Sub-Saharan African ancestry that is found among them but is lacking in Christian and Jewish populations, had been established at least as early as 2003:

Extensive Female-Mediated Gene Flow from Sub-Saharan Africa into Near Eastern Arab Populations

This study found that haplogroups L1-L3A, which are common among people of sub-Saharan African descent and usually indicate such admixture, can be found among Muslim Middle Eastern populations:
“Haplogroups L1–L3A in the Near East reach their highest frequency in the Yemen Hadramawt (∼35%). Other Arab populations—Palestinians, Jordanians, Syrians, Iraqis, and Bedouin—have ∼10%–15% of lineages of sub-Saharan African origin. These types are rarely shared between different Arab populations. By contrast, non-Arab Near Eastern populations—Turks, Kurds, Armenians, Azeris, and Georgians—have few or no such lineages, suggesting that gene flow from Africa has been specifically into Arab populations.”

And regarding non-Muslim Middle Easterners, specifically Middle Eastern Jews:
“Near Eastern Jewish groups almost entirely lack haplogroups L1–L3A.”
Later studies reaffirmed these findings, which can be seen in the PCA I posted in my previous entry here.

Another important fact from the 2003 paper is that the members of the "Khazar" reference group Elhaik constructed, Armenians and Georgians, are treated here as Near Eastern populations rather than Caucasus populations related to the northern Caucasus that the Khazars inhabited and ruled over. And for a good reason, but we'll get to that soon.

First, let’s return to Elhaik's choice of Palestinians as a "surrogate" population representing ancient Judean Jews. He justifies this choice with an argument that undermines the paper's central premise: that Palestinians were shown to share common ancestry with Ashkenazi Jews. In saying so, he recognizes the Levantine ancestry and partial origin of Ashkenazi Jews, in contradiction to the Khazar narrative that he would later conclude with. From this, the conclusion that I (and many others who have read his paper) draw is that this decision was driven by political views.

In fact, one cannot but suspect that Elhaik believes that Palestinians descend from the ancient Jews and are thus the real Jews, while Ashkenazi Jews, who make up the majority of the world’s modern Jewish population, are essentially fake Jews. This seems to me (and others) to be the main reason that he chose Palestinians rather than Samaritans, for instance, who would have made a more logical surrogate population due to historical, genetic, cultural, and religious factors.

In his later studies and on the website that he created to promote his ideas, Elhaik regularly alludes to Shlomo Sand's theories. Sand, the author of politically biased and historically controversial books such as The Invention of the Jewish People (2009) and How I Ceased to Be a Jew (2013), has repeatedly claimed that all of the different Jewish ethnic groups that lived around the world are made up of local converts to Judaism and that there is no common Jewish ethnicity or ancestral origin. He similarly claims that modern Palestinians, rather than modern Jews, are the true descendants of ancient Jews. These claims are not supported by population genetic studies. And predictably, when Elhaik published his first paper in 2012, Sand was quick to seize on it and dismiss all other genetic studies as erroneous, despite the fact that he has no credentials in or knowledge of population genetics. Sand's ideological and controversial books on Jewish history and Elhaik's seemingly poorly reasoned papers complement each other, and are equally detached from earlier and more recent scientific evidence.


Elhaik’s use of Armenians and Georgians as a proxy for Khazars further illustrates the ahistorical nature of his narrative. He admits in his paper that “Khazars have been vanquished and their remains have yet to be sequenced,” so nobody really knows which modern populations, if any, are predominantly descended from the Khazars. In virtually all genetic studies, Armenians are considered to be a Near Eastern population that overlaps with Mesopotamian populations like modern Assyrians and, to a lesser degree, Kurds. This is clearly evident in both the Eurogenes PCA:

and if I zoom in on the Global25 PCA that I posted in my previous entry:

Armenians very clearly overlap with Assyrians, Kurds, and Iranians. Ironically, their closest Jewish populations are Mizrahi Jews—Georgian, Iranian, and Iraqi Jews—not Ashkenazi Jews. These populations all cluster tightly with Armenians and other Mesopotamian-like non-Jewish populations. Are we now to believe that Iraqi and Iranian Jews descend from Khazars?

Elhaik also seems to have chosen not to use the Turkic-speaking Chuvash people, who are widely assumed by scholars to be the closest modern population to the Khazars. In fact, he didn’t choose a single Turkic-speaking people as a reference for Khazars, despite the fact that the Khazars were almost certainly Turkic-speaking themselves. He also didn't choose any North Caucasian populations such as Kumyks or Ossetians, both of whom reside in the actual areas that the Khazars’ kingdom was centered around. On top of all this, the Ossetians, a North Caucasus Iranian people, even have historical traditions linking their ethnogenesis to the Khazars via the medieval Alans.
Lastly, the only non-Ashkenazi Jewish population that Elhaik chose to feature in his study is Azeri Jews, otherwise known as Mountain/Caucasian Jews. Elhaik cites Kevin Alan Brook's The Jews of Khazaria (2006) as one of his bases for choosing Armenians and Georgians as his "surrogate populations" for Khazars. I actually have a copy of this book at home, and I highly recommend it as an excellent scholarly work about the Khazars. In his book as well as on his website, Brook actually argues against the notion that Azeri Jews are descended from Khazars:

"I have not yet been convinced of a connection between Mountain Jews and Khazarian Jews. It is possibly a coincidence that Khazarian Jews and Mountain Jews lived in roughly the same geographic area. And most of the Khazars who remained in the Caucasus after the 10th century are known to have been forced into Islam, leaving us with the more likely scenario that the Turkic groups of the North Caucasus who are Muslims, especially the Karachays and Balkars, but not the Kumukhs, are partly descended from the Khazars."
It's funny because in this same work, Brook claims that North Caucasus populations can more reasonably be assumed to be descendants of the Khazars, and yet Elhaik still chose Armenians and Georgians, who inhabit regions that are on the very edge of Khazaria’s historical southern boundaries and that the Khazars did not consistently control. In fact, Arabs had a much stronger hold over these regions, especially Armenia, than the Khazars did during this time period. Azeri Jews are similarly assumed by Elhaik to be the descendants of Khazars without any explanation, again contradicting the same sources that he cites.

Elhaik basically assumes with no rational basis that Armenians are descended from Khazars, and that Palestinians are the primary and most authentic modern descendants of ancient Jews. Therefore, he concludes that Ashkenazi Jews, who show some affinity to other Northern Near Eastern populations like Armenians (which in reality is likely due to shared Near Eastern ancestry), are descendants of Khazars.

If anything, Elhaik unknowingly corroborated a paper published the following year by Haber et al. (2013), which found that:

"Levantine populations [can be split to] two branches: one leading to Europeans and Central Asians that includes Lebanese, Armenians, Cypriots, Druze and Jews, as well as Turks, Iranians and Caucasian populations; and a second branch composed of Palestinians, Jordanians, Syrians, as well as North Africans, Ethiopians, Saudis, and Bedouins." 

I hope that this entry was clear and elaborate enough to show just how poorly reasoned and unscientific the most serious attempt in recent years to "prove" the Khazar ancestry of Ashkenazi Jews (or any Jews, for that matter) was. In fact, it had to be this bad because this theory has no merit.

Unfortunately, as I mentioned at the beginning of this blog entry, Elhaik didn't stop with his first bad study. He has subsequently tried to support this theory in later studies by Das et al. (2015, 2016 and 2017), though he has changed his narrative repeatedly to accommodate the evidence that overwhelmingly contradicts his arguments. Bizarrely, though, he has pushed his narrative to an even greater extreme, arguing that Ashkenazi Jews derive only 3% of their ancestry from the Levant by assuming that Bedouins are in fact pure Levantines and that, as a result, modern-day Levantine populations, including Palestinians and Lebanese, are heavily descended from Iranian and Anatolian populations.

In my next post, I'll show how simple it is to disprove the Khazar theory with today's available open genomics data.

Wednesday, October 31, 2018

Khazarian Rhapsody

Wikipedia describes a rhapsody as a form of music that is:
"episodic yet integrated, free-flowing in structure, featuring a range of highly contrasted moods, colour and tonality. An air of spontaneous inspiration and a sense of improvisation make it freer in form than a set of variations"
As I'll show in the next several blog entries dedicated to this subject, much like the art form, the Khazar hypothesis of Ashkenazi ancestry presents dreams, a range of emotions, and fantasies that are highly contrasted with reality, as has been clearly established by recent advancements in population genetics.

But before we delve into why this theory should be considered complete and utter nonsense, I think a short recap of what it's all about should be our first order of business.

As a child of East European Ashkenazi Jews, and especially having family roots among Jews from Ukraine and Russia, I was introduced to the notion of the Kingdom of Khazaria pretty soon after my inquiries into the origins of Ashkenazi Jewry began.

For people who might not be familiar with it, the Khazar theory states the following: a group of nomadic Turkic people who ruled a vast kingdom in the East European Steppe region decided to convert to Judaism around the 8th or 9th century AD. There is some argument as to whether this conversion was limited to the royal court or also included substantial parts of the Khazar commoners. In any case, the theory claims that after the destruction of their kingdom around the 10th-11th centuries AD by the Slavic Rus people (forefathers of modern-day Ukrainians and Russians), the Jews of Khazaria were dispersed among the Slavs, adopted Slavic as their language, and later on German mixed with Slavic words, and basically became the forefathers of all or most Ashkenazi Jews, who in recent times mostly lived in nearby Eastern Europe. During the 20th century, several works propagated this idea, the most famous being The Thirteenth Tribe, written by Arthur Koestler in 1976, in which he presented the thesis that Ashkenazi Jews are not descended from the historical Israelites of antiquity, but from Khazars, a Turkic people.

The notion of being the descendant of Jewish knights fighting in Eastern Europe against both the pagan barbarians from the North and the Roman legions from the South, was really appealing to me as a teenager.

To anyone who's interested in learning about this mysterious yet fascinating period of Jewish history, I honestly recommend paying a visit to Kevin Brook's Khazaria.com. He really is the expert when it comes to Khazar historical affairs.

In any case, during the 20th century this Khazar hypothesis became a compelling alternative to the more traditional "Rhineland hypothesis", which states that the forefathers of Ashkenazi Jews were Israelite Jews who were expelled by the Romans in the 1st century AD from Judea to Rome, Italy, as slaves. They then migrated to the Rhineland region of Germany in the early medieval period, picked up a German dialect and infused it with Hebrew and Aramaic, and later on were expelled to Eastern Europe, where they adopted many Slavic words into their Germanic dialect, Yiddish. The Rhineland hypothesis gained its support from a plethora of historical evidence showing that most Ashkenazi Jews arrived in Eastern Europe from Western and Central Europe, that there were vast Jewish communities in France and along the Rhineland during the early Middle Ages, and that Yiddish derives from early High German.

Both theories had their own supporters. Unfortunately, in sharp contrast to Koestler's original intent, the Khazar theory has been hijacked in the last few decades and used by anti-Zionists to challenge the idea that Jews have ancestral ties to ancient Israel, and it has also played a role in anti-Semitic attitudes.

In any case, up until 15 years ago, both theories, based only on interpretations of history, could have claimed to be of equal validity.

Then along came modern population genetics.

From the early 2000s, dozens of peer-reviewed publications uncovered that Ashkenazi Jews and Sephardi Jews are closer to each other than to their historically neighboring non-Jewish populations (Iberians in the case of Sephardic Jews, Germans and Slavs in the case of Ashkenazi Jews).

Furthermore, Ashkenazi Jews have been shown to have substantial Levantine genetic admixture, and the non-Levantine component seemed to be mainly of Southern European (Italian, Greek etc.) origin and not Turkic or North Caucasian.

Those findings have been further confirmed and re-validated by numerous peer-reviewed publications in the following decade and a half, among which can be cited:


Behar et al. (2006):
Behar et al. (2008):
Behar et al. (2010):
Atzmon et al. (2010):
Bray et al. (2010):
Ostrer et al. (2013):
Costa et al. (2013):
Xue et al. (2017):
While many of these and similar scientific studies sometimes contradict and disagree with each other on several details, they all seem to be in agreement that there is no genetic evidence whatsoever that Ashkenazi Jews are Khazar converts or are particularly related to any Khazar-like people, and all of these papers strongly reaffirm that Ashkenazi Jews are highly related to Sephardic Jews and other Western Jewish populations (more so than to any non-Jewish population).
Unfortunately, as we will see in the following entries, all of this evidence wasn't enough to kill this theory, with some even resorting to bad science in order to keep it going. As I suggested at the beginning of this post by quoting the definition of a rhapsody, this hypothesis and the attempts to continue promoting it are now causing a few academics to run wild and come up with fantasies that contrast with reality, all just to pursue a dead end.

So stay tuned for the next entries in our Jewish Genes Khazar special.

Sunday, October 28, 2018

"Western" & "Mizrahi" Jews

How genetics break apart the traditional ethnic divisions of Jewish diasporas


I think the best way to get this blog going is by introducing the way the different Jewish ethnic groups are divided according to genetics. It's important to break the mold for people who are new to this subject and are usually hung up on the traditional cultural-linguistic-religious divisions.

Jewish ethnic groups are traditionally divided as follows:
  • Ashkenazi Jews (plural Ashkenazim) - are the descendants of Jews who migrated into northern France and Germany around 800–1000, and later into Eastern Europe. Religiously they follow minhag Ashkenaz, and traditionally spoke Yiddish, a Jewish German dialect.
  • Sephardic Jews (plural Sephardim) - from Hebrew "Spanish" - are Jews whose ancestors lived in Iberia prior to 1492, and were forced to leave after the Alhambra edict in that year. Most of their ancestors settled around the Mediterranean Sea - mainly North Africa, Italy and the regions controlled by the Ottoman Empire (Balkans and Asia Minor, Syria and Land of Israel), while much smaller numbers also fled as far as Poland and the Netherlands. Religiously they follow minhag Sepharad, and traditionally spoke Ladino, a Jewish Spanish dialect.
  • Mizrahi Jews (plural Mizrahim) - from Hebrew "Eastern/Oriental" - are Jews whose ancestors never left the Middle East, and who lived, prior to 1492, in many of the places where Sephardic Jews later arrived. Because in many cases the two communities intermarried extensively, there is a bit of confusion, and many Mizrahi Jews are called Sephardim and vice versa. Basically, one can say that Mizrahi Jews are all Jews who traditionally adhered to a religious liturgical rite very similar to that of the Sephardic Jews and whose ancestors spoke Middle Eastern-based languages such as Judeo-Arabic, Judeo-Aramaic or Judeo-Persian, as opposed to Sephardic Jewish communities, which spoke Ladino.

Alongside those three major divisions that almost everyone is familiar with, there are other Jewish diasporas which, in the traditional religious-cultural-linguistic scheme, have their own unique status:

  • Italkim - are a distinct non-Sephardic Italian Jewish community that traces its origins as far back as the 2nd century BCE. It is thought that some families descend from Jews deported from Judaea in 70 CE. They have traditionally spoken a variety of Judeo-Italian languages (Italkian) and used Italian Hebrew as a pronunciation system.
  • Romaniotes - are a distinct non-Sephardic Greek Jewish community that has resided in Greece and neighboring areas for over 2,000 years. They have historically spoken the Judæo-Greek dialect Yevanic.
  • Gruzim - Georgian Jews
  • Beta Israel or Falashim - Ethiopian Jews.
  • Maghrebi Jews - pre-Sephardic Jewish communities of North Africa that traditionally spoke Jewish Berber dialects. In some places like Morocco and Algeria, most of them merged with the Sephardic communities, while in other places like Tunisia and Libya, they remained separate.

However, genetically speaking, this makes no sense. As you'll come to see in the next couple of blog entries, the different Jewish ethnic groups do not follow these religious, linguistic or even cultural divisions. Genes don't care about them.

In the most fascinating study "Abraham's Children in the Genome Era: Major Jewish Diaspora Populations Comprise Distinct Genetic Clusters with Shared Middle Eastern Ancestry" by Atzmon et al. (2010), the terms "European/Syrian Jews" and "Middle Eastern Jews" are used to differentiate between two Jewish ethnic groups that they've recognized:
"Two major differences among the populations in this study ... Ashkenazi, Sephardic, Italian, and Syrian Jews and the genetic proximity of these populations to each other compared to their proximity to Iranian and Iraqi Jews. This time of a split between Middle Eastern Iraqi and Iranian Jews and European/Syrian Jews, calculated by simulation and comparison of length distributions of IBD segments, is 100–150 generations, compatible with a historical divide that is reported to have occurred more than 2500 years ago. The Middle Eastern populations were formed by Jews in the Babylonian and Persian empires who are thought to have remained geographically continuous in those locales. In contrast, the other Jewish populations were formed more recently from Jews who migrated or were expelled from Palestine and from individuals who were converted to Judaism during Hellenic-Hasmonean times, when proselytism was a common Jewish practice."
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3032072/

Basically, it was found that Ashkenazi, Sephardic, Italian and Syrian Jews all cluster together and are designated as "European/Syrian" Jews, compared to Iraqi and Iranian Jews, which also cluster with each other but form a cluster separate from the European/Syrian one.

They rightly attribute the time of the split to roughly 2500 years ago, or the Babylonian diaspora, and say that European/Syrian Jews descend from Jews who mixed with Mediterranean Europeans (Greco-Romans) during the Second Temple period and migrated or were expelled westward, while Middle Eastern Jews descend from those Babylonian Jews who arrived in Mesopotamia (modern-day Iraq) after the destruction of the First Temple. This actually makes sense when the different genetic clustering and the paternal and maternal lineages of those two separate communities are examined.
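
For readers who like to check the arithmetic, here is a rough sketch of how a split measured in generations translates into calendar years. The 100–150 generations figure comes from the Atzmon et al. quote above; the generation lengths of 20 and 25 years are my own illustrative assumptions, not values taken from the paper, and the function name is made up for this example.

```python
def split_age_range(min_generations=100, max_generations=150, years_per_generation=25):
    """Convert a range of generations into a range of years.

    years_per_generation is an assumed average generation length,
    not a figure reported by Atzmon et al.
    """
    youngest = min_generations * years_per_generation
    oldest = max_generations * years_per_generation
    return youngest, oldest


# Try two commonly assumed generation lengths (both are assumptions).
for years in (20, 25):
    low, high = split_age_range(years_per_generation=years)
    print(f"{years} years per generation: {low} to {high} years ago")
```

With an assumed 25 years per generation, the range starts at 2500 years, which is in line with the "more than 2500 years ago" divide that the authors mention. A shorter assumed generation length pulls the estimate closer to the present, so the result should be read as a ballpark rather than a date.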

However, instead of the names given to these two separate ethnic groups by Atzmon et al., it has become increasingly popular, when discussing genetics, to name these Jewish ethnic groups in the following way: Western Jews for the European/Syrian Jews and Mizrahi Jews for the Middle Eastern Jews.

I, for one, find this terminology to be much better. I prefer the term Western Jews over the term European/Syrian Jews because it also includes North African (Sephardic and pre-Sephardic) Jews and because, geographically speaking, they all cluster with populations west of the Levant, namely Southeast/East Mediterranean European populations (Aegean Greeks, Maltese, Sicilians and Cypriots). All these communities can rightly be considered part of the "Western diaspora".

For those "Middle East" Jewish communities, I also prefer the Jewish/Hebrew designation of "Mizrahi" Jews, first of all because it sounds better, and second of all because it somewhat overlaps with the traditional designation of these communities (i.e. they are considered Mizrahi Jews already). Also, Syria, last time I checked, is part of the Middle East, and so is Egypt, and in many ways, linguistically and culturally speaking, North Africa could be considered part of the Middle East as well. So it doesn't make sense to refer only to Iraqi and Iranian Jews as Middle Eastern. In addition, Georgian and Uzbek Jews closely cluster with Iraqi and Iranian Jews, and those communities do not live in the Middle East. The geographic meaning of the word "Mizrahi", "Eastern" in Hebrew, is also suitable here since they all originate from communities that existed east of the Levant. In essence, they form the "Eastern diaspora".

I've added a PCA created from academic samples gathered by Davidski from the Eurogenes project (if you don't know who that is, I suggest you head to his most excellent Eurogenes blog; he's also behind many of the ADMIXTURE calculators offered on the open genomic data site GEDmatch). It visually shows the coordinates and genetic distances of each of these Jewish ethnic groups relative to each other and to other West Asian, Near Eastern and South European populations:



As can be clearly seen, despite the two groups - Western Jews and Mizrahi Jews - clustering in two different locations on this PCA and with different non-Jewish populations - all Jewish ethnic groups which were identified as Western Jews pretty much overlap - with non-Sephardic North African Jews extremely close to them; and all groups identified as Mizrahi Jews also overlap with each other.

So from now on, in this blog, we will refer to Western Jews when discussing:
Ashkenazi Jews, Sephardic Jews, non-Sephardic Greek Jews (Romaniote Jews), non-Sephardic Italian Jews (Italkim), non-Sephardic North African Jews (Maghrebi or Berber Jews) and non-Sephardic Syrian Jews (also known as Musta'arabi Syrian Jews).

And Mizrahi Jews when discussing:
Iraqi Jews, Persian Jews, Mountain Jews ("Kavkazim"), Kurdish Jews, Georgian Jews and Uzbek Jews ("Bukharim").

Yemenite Jews, Ethiopian Jews etc. all form their own genetically distinct populations, and cannot be included in those groups.

Why this blog

The origins of the Jewish people have been a personal interest of mine, as an Ashkenazi Jew myself, for almost 20 years now.

It also seems to be of immense interest to the rest of the world - more than 30 (!) different population genetics studies have been published in the past 20 years trying to uncover the genetic history of our people.

In this blog, I'll try to sum up as clearly as possible, in layman's terms, the complete picture that has been drawn so far by these recent genetic studies regarding the origins of the Jewish people.

I will also use this blog as an opportunity to lay out my own theory about the origins of several of the more demographically dominant Jewish ethnic groups (such as Ashkenazi, Sephardi etc.). Thanks to the advancements in population genetics and DNA ancestry tests which can be done at home at relatively affordable prices, we are at an exciting age where tons of cultural and linguistic mysteries, and basically dark corners of the history of populations, can be uncovered with the help of genomic tools.
However, despite the boom in published studies and the amount of knowledge gathered in this exciting field, very few people roll up their sleeves and try to do their own study with the abundance of readily available open genetic data.
Sometimes this means that even an amateur such as myself can pick up on connections that no one in either the scientific community or semi-professional popular genetics blogs has noticed before.

And last but not least, the intent of this blog is also to help disprove once and for all the notorious "Khazar hypothesis" which should have been buried for good as a result of genetic studies, but is still being propagated endlessly by pseudo-scientists, scientists with political agendas, and anti-Semites.

In essence, this blog is a pop-science Jewish genetics blog, focusing on the origins and the history of the Jewish people. Its main purpose is to be accessible and readable to the non-scientific general audience.