Let’s face it, “Ethnicity Estimates”, i.e. analysing your DNA to break down your ancestry into different regional world populations, are fun. In certain circumstances they can be useful for a genealogist, for example when researching an unknown recent ancestor. At a more general level they can help a person connect to their place of origin, both at an individual level (hey, I’m a bit Scottish!) and collectively with their ancestors (hey, I’m related to people all over the world!). They are also a major selling point for the firms selling DNA testing to the public. The richest and most mature genealogical community in the world is based in North America, where most people are a colourful blend of immigrants from many different countries and cultures. As such it is worthwhile to provide a form of instant gratification for DNA testers in this market.
The downside of ethnicity estimates is that they are, at the moment, an imprecise science. This problem is compounded when people take their ethnicity estimates too literally.
What’s new in 2017?
Back in 2015 I wrote about my own ethnicity results. Since then there have been a number of changes in the DNA testing market, specifically:
- MyHeritage (https://www.myheritage.com/) have launched a DNA test that includes an ethnicity estimate. Details of their test are here.
- LivingDNA (https://www.LivingDNA.com) have launched as a company. Their single test covers ethnicity testing, Mitochondrial DNA testing and (if you’re a bloke) yDNA testing. Details of their test are here.
- FamilyTreeDNA (https://www.familytreeDNA.com) have updated their ethnicity estimates (referred to as My Origins). Details of the update are here.
What do I want to cover?
In this post I’m trying to show a couple of things. Firstly, what ethnicity reports each company provides when you buy their DNA test. Secondly, what results a plain, theoretically 100% British, person gets at each company. On top of this I hope this article shows why you need to look at the bigger picture of your ethnicity and try not to take the ethnicity results you get at face value.
One other important point to address here is the fact that whilst you inherit exactly 50% of your autosomal (i.e. non-sex) DNA from each parent, the percentages you inherit from each grandparent are more variable due to the random nature of recombination during meiosis. On average you inherit 25% from each grandparent, but for any one grandparent it could be slightly more or slightly less. Allow this process to work for a few generations and it may well be that you don’t carry any DNA at all from some of your ancestors.
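To make the grandparent point concrete, here is a minimal sketch of the idea in Python. It is a toy model, not how recombination really works: it assumes each parent’s contribution is cut into a made-up number of equal blocks (`n_blocks`, a purely illustrative parameter) and that each block comes from one of that parent’s two parents on the toss of a coin.

```python
import random

# Toy model only: n_blocks is an arbitrary, hypothetical number of
# independently inherited blocks per parent, not a measured biological value.
def grandparent_shares(n_blocks=60, rng=None):
    """Return the fraction of a child's autosomal DNA traced to each grandparent."""
    rng = rng or random.Random()
    sides = {
        "father": ("paternal grandfather", "paternal grandmother"),
        "mother": ("maternal grandfather", "maternal grandmother"),
    }
    shares = {}
    for grandfather, grandmother in sides.values():
        # Each block handed down by this parent came from one of their two parents.
        from_grandfather = sum(rng.random() < 0.5 for _ in range(n_blocks))
        # Each parent supplies exactly half of the child's autosomal DNA.
        shares[grandfather] = 0.5 * from_grandfather / n_blocks
        shares[grandmother] = 0.5 * (n_blocks - from_grandfather) / n_blocks
    return shares


if __name__ == "__main__":
    for name, share in grandparent_shares(rng=random.Random(2017)).items():
        print(f"{name}: {share:.1%}")
```

Run it a few times without the fixed seed and the four shares wobble around 25%, while each pair of grandparents on the same side always adds up to exactly 50%.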
I’m British (at least I think I am)
Now let’s talk about me. I’ve written on a few occasions about my ancestors. If you want a slightly longer read you can read about them here, but let me try and summarise things a little.
If I look at my eight great-grandparents, six came from small rural communities within the British Isles. One great-grandparent came from the market-town of Darlington, but had clear roots within the local area, whilst the other city dweller was born in the beautiful cathedral city of York, but had her roots in the surrounding villages.
The only great-grandparent who came from a community with a clear migration history is Robert Stewart. He came from Lissan, on the County Derry/Co. Tyrone border in Ulster. This area was greatly affected by the Plantation of Ulster. I have a Stewart cousin who has had his yDNA tested. It shows a direct link back to the Scottish King Robert III, so I’m assuming that my Stewart ancestor was part of the Scottish settlement within the Plantation.
Naturally there may be instances of NPEs (Non Paternal Events) in my family, but since the communities of my ancestors were not the major cities and ports of Britain it’s more likely that any such events were the result of the neighbour next door rather than some tall, dark, handsome stranger.
Taking all the information I have on my ancestors I would expect my ethnicity report to show me as 100% British with a breakdown of 50% North Yorkshire, 25% East Anglia, 12.5% Scots-Irish (but probably lowland Scots) and 12.5% Durham/Scottish Borders.
For those of you that prefer something more visual I have a map and chart of my great-grandparents, and their grandparents.
For comparison, below is a rather wonderful map of Norse placenames in the British Isles, produced by the British Museum. As you can see there were many areas of Norse settlement names in Northern England and along the eastern seaboard of England, which correspond with many of my family’s own birth locations. Whilst there is some dispute as to how many Vikings settled in Britain, many of the place names (at least within the area I grew up) refer to farms belonging to people with Norse names.
Finally in this section you may want to cast your eyes over this colourful spreadsheet that I created. These are my eight great-grandparents (again) and their grandparents. They are coloured to match the 21 regions used in the LivingDNA map:
Who are the British?
If you are interested in the genetics of the British then there are two important studies that will help you understand the background. The first is the seminal “People of the British Isles” study that was published in 2015. There is a nice summary article here, and I wrote about it here. In brief the study looked at 2,039 people with ancestors born in the same part of the country. Taking this data they managed to break the population of the British Isles (ex. Irish Republic) into 17 regional groups based on their genetic variation only. As ever a picture paints a thousand words, so below is the clustering of the 2,039 samples into regions. More interestingly the data was used to create a “hierarchical cluster tree” based on these genetic differences. This is shown top-right in the diagram below. It shows that the people of the Orkney Islands have the most significantly different genetic make-up from people in the other parts of the British Isles, followed by the genetic difference between the Welsh and the rest of mainland Britain. (Sorry for these horribly complex sentences; I hope you are following.)
One important conclusion from the results was that there was a large cluster of relatively homogeneous people sampled within central and eastern England. These are the red squares in the diagram below. Further, “the best estimates for the proportion of presumed Anglo-Saxon ancestry in the large eastern, central and southern England cluster (red squares) are a maximum of 40% and could be as little as 10%”. This is suggested to show that the Anglo-Saxon invaders of Britain did not replace the local population, as sometimes thought, but interbred with them.
The other significant scientific research on the British people is a recently published study (in pre-print) on “The Beaker Phenomenon And The Genomic Transformation Of Northwest Europe.” As the Abstract states, “Bell Beaker pottery spread across western and central Europe beginning around 2750 BCE before disappearing between 2200-1800 BCE”. The study sampled “170 Neolithic, Copper Age and Bronze Age Europeans, including 100 Beaker-associated individuals.” Among their conclusions was that “British Neolithic farmers were genetically similar to contemporary populations in continental Europe and in particular to Neolithic Iberians, suggesting that a portion of the farmer ancestry in Britain came from the Mediterranean rather than the Danubian route of farming expansion. Beginning with the Beaker period, and continuing through the Bronze Age, all British individuals harboured high proportions of Steppe ancestry and were genetically closely related to Beaker-associated individuals from the Lower Rhine area. We use these observations to show that the spread of the Beaker Complex to Britain was mediated by migration from the continent that replaced >90% of Britain’s Neolithic gene pool within a few hundred years”. In other words British Neolithic farmers, with their Iberian ancestry, were almost totally replaced within a few hundred years by people from the Bell Beaker culture of the Lower Rhine area arriving in the British Isles. If you are interested in this study there is an excellent podcast you can listen to here.
It’s important to note that neither of these pieces of research is unchallenged within the academic world (for example here), so expect arguments and counter-arguments to continue. That is one of the great joys of living in such an era of historical research.
I’m starting with LivingDNA on purpose. Their USP (Unique Selling Point) within the genetic genealogy world is that their test breaks down ethnicity into 80 regions worldwide, including 21 within the British Isles. They are currently involved in genetic studies in Ireland and the German-speaking world to provide richer details of both of those regions. Below is a map of the regions LivingDNA can identify within the British Isles. In part their detailed breakdown is due to the fact that LivingDNA “analyse unique combinations of ‘linked DNA’. This proprietary method is the reason our test delivers a level of power never seen before.”
Recently (June 2017) LivingDNA revised their results and enabled testers to look at 3 different “Views” of their results: Complete, Standard and Cautious. Their blog provides an explanation of the three different views, but to summarise: Standard is, well, the standard interpretation of your ethnicity, Complete attempts to assign all your DNA, whilst Cautious will group “genetically similar populations together” to give the ethnicity which LivingDNA are most certain about.
I’ll start with the Standard View. My results from LivingDNA were as follows:
As you can see LivingDNA identified 98.4% of my ancestry, all of which was assigned as European, although as a mix of Great Britain/Ireland and a more general Europe (unassigned).
The next, and highest, level of detail is the Sub-Regional ancestry, again shown below:
If you prefer a more visual representation the British Isles regions are shown below.
Now on to the Complete View. In my case what it has done is assign all my unassigned percentages to an ethnicity. This did not change any of the ethnicity percentages that were already assigned in the Standard View. The specific additional assignments I received were:
- 1.6% World unassigned was replaced by 1.6% Pashtun.
- Roughly 5% Great Britain and Ireland unassigned was identified as 1.9% Cornwall, 1.6% Southwest Scotland and Northern Ireland and 1.6% Aberdeenshire.
- 6.7% Europe Unassigned was identified as 4.1% Scandinavia, 1.4% Basque and 1.3% Sardinia.
To be honest all of these assignments look like background noise.
The final results are the Cautious View, and I think the most interesting. This view has not changed either the Global or Regional level results from the Standard View and therefore only works at the sub-regional level. Below are my results. As you can see they come close to my paper-trail ancestry, especially on my two main British regions, North Yorkshire (45.5% identified vs 50% paper-trail) and East Anglia (30.1% identified vs 25% paper-trail). The percentage ancestry figures appear to be calculated from my Standard View results as follows:
- 45.5% North Yorkshire-related ancestry is the sum of Northumbria (21.2%), North Yorkshire (17.2%), Lincolnshire (4.8%) and South Yorkshire (2.3%)
- 30.1% East Anglia-related ancestry is the sum of East Anglia (17.9%) and Southeast England (12.2%)
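If you want to check that arithmetic yourself, the roll-up can be sketched in a few lines of Python. The percentages are my own Standard View figures quoted above; the grouping of regions is only my inference from the numbers, not something LivingDNA have published, and the names `standard_view`, `cautious_groups` and `cautious_view` are just labels I’ve made up for the sketch.

```python
# My Standard View sub-regional percentages, as quoted in the text.
standard_view = {
    "Northumbria": 21.2,
    "North Yorkshire": 17.2,
    "Lincolnshire": 4.8,
    "South Yorkshire": 2.3,
    "East Anglia": 17.9,
    "Southeast England": 12.2,
}

# Assumed grouping: inferred from the figures, not confirmed by LivingDNA.
cautious_groups = {
    "North Yorkshire-related": ["Northumbria", "North Yorkshire", "Lincolnshire", "South Yorkshire"],
    "East Anglia-related": ["East Anglia", "Southeast England"],
}


def cautious_view(standard, groups):
    """Sum the Standard View percentages of the regions inside each group."""
    return {
        group: round(sum(standard[region] for region in regions), 1)
        for group, regions in groups.items()
    }


if __name__ == "__main__":
    for group, total in cautious_view(standard_view, cautious_groups).items():
        print(f"{group}: {total}%")
```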
The map view provides some other interesting details. There are two regions, Cumbria and Northwest England, where the colouring does not match the text colours. This is because LivingDNA believe that the region is genetically similar to other regions. As an example Northwest England (shaded purple) is very close (genetically) to both North Wales (shaded pink) and East Anglia (shaded blue), so it’s possible that either, or both of, my East Anglia and North Wales ancestry could have come from Northwest England instead.
Ancestry provide their ethnicity estimates as part of the single AncestryDNA test. There are two important points to note. Firstly, “Your ethnicity estimate shows where your ancestors came from hundreds to thousands of years ago”. Secondly, as part of the methodology used to determine ethnicity they re-sample the DNA data 40 times. This means you actually get a range of estimates. A more detailed explanation of the methodology used is described in this Ethnicity Whitepaper. Now to the results.
If you click on the SEE DETAILS button Ancestry have a very good explanation of the Genetic Diversity of each region and how you compare to that. For example this is how I compare to a “Typical native”:
Among other details shown is the size of the Reference Panel used by Ancestry to model each ethnicity, for example 195 people in the case of Great Britain. A full discussion of the Ancestry Reference Panel can be found here. One key point to highlight from the discussion of Reference Panel is the lower quality of ethnicity estimates from the regions “Great Britain, Western Europe, Iberia, and Mali” due to “the number of reference samples in the panel for each region and the similarity of a given region to others.” (My emphasis).
Finally on this topic Ancestry provide one more interesting snippet of data. They give statistics on the other regions commonly seen in people native to the Great Britain region (see below). As Ancestry says, “Since approximately 60% of the typical native’s DNA comes from this region, 40% is more similar to other regions, such as Ireland, Europe West, Scandinavia and the Iberian Peninsula”. This is important: you can be 100% British native and will typically show only 60% British, whilst Ireland, Europe West, Scandinavia and the Iberian Peninsula in particular will show up in your Ancestry ethnicity. This is exactly what happens with my Ethnicity Estimate. Ancestry have also blogged quite extensively about some of these issues, for example there are articles about “The British Are Less British Than We Think”, “AncestryDNA – The Viking In The Room”, “What does our DNA tell us about being Irish?” and “AncestryDNA – The Irish Connection.”
There is also a nice map showing what they mean for each of the regions. You can see that the borders are fuzzy and overlap, both of which are important to consider.
One ancestry.com-specific feature is their “Genetic Communities”. These are the orange and red dots that are shown in the map. The communities are “groups of AncestryDNA members who are connected through DNA most likely because they descend from a population of common ancestors, even if they no longer live in the area where those ancestors once lived”. At the time of writing this blog I have two communities, although Ancestry expect that as more people test more communities will develop and people will become part of more communities. There is a very good explanation of the communities feature at the Ancestry website. I’m still trying to understand the subtle differences between Northern English and English Northerners…
MyHeritage is a recent arrival in the world of DNA testing, and their Ethnicity Estimate is still in Beta testing. I obtained my Ethnicity Estimate from this company by transferring my DNA data from one of my other tests. For a while it was unclear if transferees like myself would receive the Ethnicity Estimates for free. This has now been cleared up.
The test from MyHeritage covers 42 regions. More importantly for myself they have separated out the British Isles into the English and a single category for the “Irish, Scottish and Welsh”. It’s worth noting that MyHeritage views the Irish, Scottish and Welsh as “very close genetically, because of their shared Celtic roots, but hopefully we will be able to tell them apart successfully in the future”.
In addition, the latest iteration of their product includes an improvement to separate “German, French, Dutch, and other related ethnicities, which are more accurately detected now as North and West European, whereas previously many of them were incorrectly described as British”.
Details of the methodology MyHeritage use aren’t (AFAIK) available, however their blog does mention the “Founder Populations” project where they sampled “more than 5,000 users [who] were handpicked by MyHeritage from its 90 million-strong user base, by virtue of their family trees exemplifying consistent ancestry from the same region or ethnicity for many generations.”
The results from MyHeritage are shown below. As you can see they have done a very good job of separating my English heritage from my Scottish, although the high percentages for South Europe are a little odd:
23andMe have always been a bit of an outlier in the world of genetic genealogy. Their product is a mix of genealogical information and medical information. As a result the product they have developed has had to be tailored to the demands of government legislation in the countries in which they operate. This has led to a fragmented website experience. One result of this was that I was for a number of years on the “old 23andMe Experience”. This has only recently changed to the “23andMe New Experience”. My new 23andMe website has updated Ancestry reports, but appears to have lost some functionality along the way.
For background their Ancestry Composition report identifies 31 different world regions, based on a reference population of over 10,000 people. Details of their methodology are published here. A simplified explanation of the methodology is here, whilst there is also a FAQ section here. One interesting point to note is that 23andMe provide details of the “Precision” and “Recall” for each of their 31 ethnicities. As the 23andMe website explains, “Precision” corresponds to the question “When the system predicts that a piece of DNA is from population A, how often is it actually from population A?” and “Recall” corresponds to the question “Of the pieces of DNA that actually were from population A, how often did the system predict that they were from population A?” The “British and Irish” ethnicity from 23andMe has values as follows: Precision 0.90, Recall 0.39. This indicates that when 23andMe do label a segment “British and Irish” they are usually right, but they find well under half of the segments that really are British or Irish. In layman’s terms, don’t expect 23andMe to identify all your Britishness/Irishness.
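For anyone who likes to see the sums, here is a small sketch of what those two numbers mean. The formulas are the standard definitions of precision and recall; the segment counts are entirely made up by me (a hypothetical 1,000 truly British and Irish segments) and were chosen only so that the results land on 23andMe’s published 0.90 and 0.39.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from counts of labelled DNA segments."""
    # Precision: of everything labelled "A", how much really was "A"?
    precision = true_positives / (true_positives + false_positives)
    # Recall: of everything that really was "A", how much was labelled "A"?
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall


if __name__ == "__main__":
    # Hypothetical counts for illustration only, not 23andMe data:
    # 1,000 segments are truly British and Irish, 390 of them are found,
    # 610 are missed, and 43 other segments are wrongly labelled British and Irish.
    precision, recall = precision_recall(
        true_positives=390, false_positives=43, false_negatives=610
    )
    print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

In this made-up example more than six in ten of the genuinely British and Irish segments end up labelled as something else, which is the sort of behaviour a recall of 0.39 describes.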
This is how I’m currently reported at 23andMe.
This is my default reporting from 23andMe, however it is one of 5 different “Confidence Level” reports that are available. These vary from 90% to 50%, with 90% being the report that 23andMe are most confident in and 50% being the most speculative. Interestingly 23andMe prefer to report your most speculative results as the default. Accessing the more conservative estimates is a bit of a faff. They are available as part of the additional feature “Your Ancestry Composition Chromosome Painting”. This feature, as you may have gathered from the title, allows you to see each of your pairs of chromosomes painted with the ethnicity that 23andMe have calculated. This can be quite useful if you have clearly mixed ancestry, as you could use this information, together with any details you have of where you match other people to help identify a common ancestor. Obviously for a plain “vanilla” Brit like myself there is not much to gain.
Looking at the 90% confidence level ethnicity estimates you can see that my sub-regional estimates such as “British & Irish” have dropped considerably (from 64.8% to 7.1%), however at the regional level, e.g. Northwestern Europe, this difference is lower (97.3% to 74.6%). At the continental level there is hardly any difference between my European estimates (99.8% to 98.6%).
One final note on this report. If you are a 23andMe user and are having difficulty separating out all the shades of blue in your “Ancestry Composition Chromosome Painting” report you can hover your mouse over each sub-regional category label (the bit on the right) and it will highlight which parts of your chromosomes it refers to.
In addition to the percentages report there is a new feature that shows a possible timeline for how my ethnicity may have occurred. I must admit I find this feature unrealistic and annoying. It simply doesn’t fit in with my known paper-based ancestry. I suspect it’s a product that works better in the North American market where such diverse ethnicity mixes are more likely to occur.
Family Tree DNA
Last up is FamilyTreeDNA.com, hereafter ftDNA. Their ethnicity product “MyOrigins” is now in its second iteration. A description of the update can be found here. Their ethnicity estimates are based on a panel of “24 reference populations”, an increase of six from their previous reference populations. The list of ethnicities can be found here. My results are shown below. As you can see they struggle to isolate my British Isles (33%) from my West and Central Europe (45%).
On top of this ftDNA provide a map showing the regions you come from, complete with a number of layers of “fluffiness”.
Comparing the Results
I must admit I’m a little reluctant to compare the results, as we’re really not comparing like with like. The LivingDNA results break out my UK populations, which none of the other groups do. Similarly the ancestry.com results claim to cover a period thousands of years ago. Finally both Ancestry and MyHeritage separate out some parts of the British Isles. To try and level the playing field a bit I’ve tabulated the results in two ways (see table and chart below). The first row of data looks at the percentage of my DNA identified as British by each test company (including Irish, since some companies don’t separate this out). The second row of data covers the more general Western Europe category. By this I mean that it will include any percentages that are identified as Scandinavian, Northwest Europe and Iberian. I’ve included these ethnicities because they (i.) all have direct sea connections to Britain, (ii.) all have migration events to the British Isles and (iii.) are identified by Ancestry.com as commonly being seen in “British” DNA samples.
As you can see, if we look only at “British” DNA then LivingDNA is way ahead of the other testing companies, however once you include the neighbouring populations to Britain the results are remarkably close. Only MyHeritage is significantly lower than the rest, due to the 7% Greek they have identified.
|Test Provider||British Isles||Broadly Western European|
1 ancestry.com separate British and Irish ethnicity. (59% British + 11% Irish)
2 MyHeritage separate English from Irish, Scottish and Welsh ancestry. (70.1% English + 11.8% Irish, Scottish & Welsh combined)
3 91.7% Great Britain and Ireland + 4.1% Scandinavian – taken from the Complete View.
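The three footnoted figures are simple sums of the percentages each company reported to me. As a quick sketch, the Python below adds them up; the dictionary labels and the `combine` helper are just names I’ve invented for the illustration.

```python
# Component percentages exactly as given in the three footnotes above.
footnoted_figures = {
    "ancestry.com: British + Irish": (59.0, 11.0),
    "MyHeritage: English + Irish, Scottish & Welsh": (70.1, 11.8),
    "LivingDNA (Complete View): Great Britain and Ireland + Scandinavian": (91.7, 4.1),
}


def combine(parts):
    """Add the component percentages, rounded to one decimal place."""
    return round(sum(parts), 1)


totals = {label: combine(parts) for label, parts in footnoted_figures.items()}

if __name__ == "__main__":
    for label, total in totals.items():
        print(f"{label} = {total}%")
```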
For British testers LivingDNA has a clear advantage over the other groups, both due to the high percentage of British ancestry they detect and the fact they break your British ancestry down into regional groups. However whilst the regional results may roughly map to your known identity it’s not a perfect fit.
What I find exciting about LivingDNA is knowing that their large-scale regional analysis of DNA will in future be used for their Irish and Germanic ethnicities, hopefully bringing the same level of detail to people from these regions.
Personally I like the approach ancestry.com have taken. The fact that they analyse your data 40 times to give a range of ethnicity estimates helps people understand the fuzziness of such reports. If you are looking at your own results you may find it useful to discard any ethnicities that have a range starting with 0%, as zero may well be the truth!
Despite my cynicism I must admit that receiving your first ethnicity report can be an exciting and revealing occasion. It can recast your whole understanding of your family history. The flip-side is that if you are rather plain “Vanilla-British” then the issue that you are likely to encounter is understanding that genetically British people are incredibly similar to their neighbours. As a result you are likely to show other European ethnicities. Don’t get hung up on these; they may well be “false positives”.
The truth is that there are no British genes. We Brits are merely the product of the mixing of people who have washed up on our shores since the last Ice Age covered our islands.