Dealing with Endogamy, Part I: Exploring Amounts of Shared DNA
Autosomal DNA testing is a valuable resource for genealogists seeking to overcome recent brick walls in their family history, particularly in instances where traditional historical research is limited or unavailing. At Legacy Tree Genealogists, we frequently use autosomal DNA test results to answer questions regarding adoption, unknown paternity, or difficult-to-trace ancestors. To learn more about autosomal DNA testing, see our previous blog post on the basics.
Endogamy is the custom of marrying only within the limits of a local community, clan, or tribe over the course of many generations. The reasons for this genetic isolation could be cultural or religious (as with Ashkenazi Jews and Low-German Mennonites) or geographic (as with island and tribal populations). Members of endogamous populations may descend from a limited pool of “founder” ancestors who represented the initial genetic makeup of their population. After many generations and hundreds of years of isolation from outside pedigrees, autosomal DNA markers converge within endogamous populations. As a result, genetic profiles of population members can easily be distinguished from the DNA of outside populations.
Pedigree collapse occurs when two related individuals produce offspring. As a result, the number of unique individuals occupying locations in a pedigree decreases or collapses. Whereas most people have eight unique great-grandparents, a child of two first cousins will only have six unique great-grandparents. They will also have inherited a larger portion of their DNA from the ancestors held in common between their parents. Though a related concept, pedigree collapse is not the same as endogamy. However, recent cases of pedigree collapse in an individual’s tree and long-term endogamy can have similar effects on DNA inheritance. When practiced over multiple generations and over the course of several hundred years, continued pedigree collapse can lead to endogamy.
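To make the great-grandparent count concrete, here is a minimal Python sketch. The pedigree and all of its labels are hypothetical, chosen only to model a child whose parents are first cousins (their fathers are brothers). It compares the number of great-grandparent positions with the number of unique people filling them.

```python
def ancestors_at(person, generations, parents):
    """Return every ancestor position `generations` steps above `person`.

    The same individual appears more than once if they fill more than
    one position in the pedigree.
    """
    current = [person]
    for _ in range(generations):
        current = [parent for child in current for parent in parents[child]]
    return current


# Hypothetical pedigree: each key maps to (father, mother).
# The two grandfathers are brothers, so the child's parents are first cousins.
parents = {
    "child": ("father", "mother"),
    "father": ("paternal_grandfather", "paternal_grandmother"),
    "mother": ("maternal_grandfather", "maternal_grandmother"),
    "paternal_grandfather": ("shared_ggf", "shared_ggm"),
    "paternal_grandmother": ("ggf_a", "ggm_a"),
    "maternal_grandfather": ("shared_ggf", "shared_ggm"),
    "maternal_grandmother": ("ggf_b", "ggm_b"),
}

positions = ancestors_at("child", 3, parents)
print(len(positions), "positions,", len(set(positions)), "unique great-grandparents")
```

With this pedigree the eight great-grandparent positions are filled by six unique individuals, because the shared couple appears on both the paternal and maternal sides.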
As a result of endogamy, individuals from the same population will frequently share multiple ancestors in common with each other. They also may descend from the same ancestral couple multiple times, which can greatly complicate autosomal DNA analysis. In one of our research cases, we found that an individual descended 12 different times from the same ancestral couple who lived in the late 1600s in French Canada. Although they were quite distant ancestors in every case (within the range of 9th-11th great grandparents), he had inherited a disproportionate amount of DNA from them due to their heavy representation within his family tree.
In this and a future blog post, we will explore two keys for dealing with endogamy in autosomal DNA test results: 1) Exploring the exact amounts of shared DNA between relatives; and 2) Testing multiple relatives. In this post we will address the first of these strategies.
Measurements of Shared DNA
Amounts of shared DNA are communicated in autosomal DNA test results in one of two ways: as a percentage of the total autosomal DNA, or in centimorgans (cM). Centimorgans are a unit of measurement commonly used in genetics to specify how much DNA two individuals share in common. They are actually measures of recombination and express the likelihood that two locations on a DNA strand are inherited together: one centimorgan corresponds to roughly a 1% chance that two locations will be separated by recombination in a single generation. Because centimorgans reflect recombination frequency rather than physical length, segments on the ends of a DNA strand typically have higher centimorgan values than segments of the same physical size in the center, since segments on the end are more likely to recombine. Larger segments with high centimorgan values typically suggest that two individuals share a recent common ancestor. Different levels of relationship carry different expected shared percentages and centimorgan totals. These estimates can be found at http://isogg.org/wiki/Autosomal_DNA_statistics.
In endogamous populations, genetic cousins may share much more DNA than would be expected given their closest relationship. This may be because, in addition to being a 3rd cousin, a match is also a double 6th cousin, a 5th cousin once removed, a triple 4th cousin, and a half 4th cousin twice removed. As you explore your relationships to genetic cousins, be sure to consider all possible sources of shared DNA.
Calculating the Coefficient of Relationship
One useful equation that can help to explore and evaluate relationships is the coefficient of relationship (CofR = ∑(1/2)^n). This coefficient communicates the estimated amount of shared DNA between two individuals who are related through multiple family lines. For each line of descent, it is calculated by raising ½ to the number of generational steps (n) that separate an individual from their relative, counted up to the common ancestor and back down. This is calculated for each relationship and then summed for all pertinent relationships. It is important to remember that the coefficient of relationship is calculated through every common ancestor, not through every common ancestral couple. Therefore, if two individuals share a common ancestral couple, there will be two elements which contribute to the final coefficient. For example, first cousins are four steps apart through each of two shared grandparents, giving 2 × (1/2)^4 = 12.5%. Also, very distant relationships may or may not result in shared DNA, so the coefficient may not always be representative of the observed amounts of shared DNA. The coefficient is a better representation of the total expected amount of DNA than of the actual amount of DNA that two relatives will share. However, if the calculated coefficient is much lower than the observed amount of shared DNA, then this could indicate that there are additional relationships between the subject and the match which have not yet been identified, some of which could be beyond a brick wall.
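As a worked illustration, the sketch below applies the coefficient to the combination of relationships mentioned earlier: a third cousin who is also a double 6th cousin, a 5th cousin once removed, a triple 4th cousin, and a half 4th cousin twice removed. The function name and the (steps, ancestor count) input format are our own choices for this example. It also assumes that every relationship runs through different common ancestors and that none of those ancestors were themselves related.

```python
from fractions import Fraction


def coefficient_of_relationship(paths):
    """Sum (1/2)**n over every common ancestor.

    `paths` is a list of (n, ancestors) pairs, where n is the number of
    generational steps between the two relatives through a common ancestor
    and `ancestors` is how many common ancestors sit at that distance.
    """
    return sum(ancestors * Fraction(1, 2) ** n for n, ancestors in paths)


# Steps are counted up to the common ancestor and back down to the match.
# Full relationships through a couple contribute two ancestors each.
relationships = [
    (8, 2),   # 3rd cousin: 4 steps up + 4 down, one ancestral couple
    (14, 4),  # double 6th cousin: 7 + 7 steps, two ancestral couples
    (13, 2),  # 5th cousin once removed: 6 + 7 steps, one couple
    (10, 6),  # triple 4th cousin: 5 + 5 steps, three couples
    (12, 1),  # half 4th cousin twice removed: 5 + 7 steps, one ancestor
]

third_cousin_only = coefficient_of_relationship([(8, 2)])
combined = coefficient_of_relationship(relationships)

print(f"3rd cousin alone: {float(third_cousin_only):.2%}")
print(f"All relationships: {float(combined):.2%}")
```

Under these assumptions the third cousin relationship alone predicts 1/128 (about 0.78%) shared DNA, while all five relationships together predict 59/4096 (about 1.44%), nearly double.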
Runs of Homozygosity and Fully Identical Regions
In order to accurately evaluate DNA test results in endogamous populations, consider exactly how much DNA two individuals share in common. Most genetic matches will only share DNA segments on one copy of their chromosomes, either maternal or paternal. However, if both of an individual’s parents are from the same endogamous population, or are known close relatives of each other, then the individual may have a “Run of Homozygosity,” or a region of their DNA where the maternal copy is identical to the paternal copy. In these cases, the subject is a genetic match to themselves. If another genetic cousin overlaps in this same region of DNA, then the amount of DNA they share in common with the test subject should be doubled for that particular segment since they match the maternal copy and the paternal copy.
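The doubling described above could be sketched roughly as follows. The segment and region tuples, their coordinates, and the centimorgan values are all hypothetical and do not reflect the export format of any testing company or tool. For simplicity, this sketch only doubles a matching segment that lies entirely inside one of the subject’s runs of homozygosity; a partial overlap would need a genetic map to split the segment’s centimorgans properly.

```python
def adjusted_total_cm(segments, roh_regions):
    """Total shared cM, counting segments inside a run of homozygosity twice.

    segments:    (chromosome, start, end, cm) for each segment shared with a match
    roh_regions: (chromosome, start, end) for each of the subject's own ROHs
    """
    total = 0.0
    for chrom, start, end, cm in segments:
        inside_roh = any(
            chrom == roh_chrom and roh_start <= start and end <= roh_end
            for roh_chrom, roh_start, roh_end in roh_regions
        )
        # The match agrees with both the maternal and paternal copy in an ROH.
        total += 2 * cm if inside_roh else cm
    return total


# Hypothetical data: one shared segment falls inside the subject's ROH.
shared_segments = [
    (3, 10_000_000, 25_000_000, 18.0),
    (7, 50_000_000, 62_000_000, 11.5),
]
subject_roh = [(3, 8_000_000, 30_000_000)]

print(adjusted_total_cm(shared_segments, subject_roh))  # 47.5
```

Here the reported total would be 29.5 cM, but counting the chromosome 3 segment on both copies raises it to 47.5 cM.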
Doubling of shared DNA segments can also occur between matches if they share multiple recent common ancestral couples, as is the case with double cousins. If double cousins have fathers who are brothers, and mothers who are sisters, then they may match each other in fully identical regions through shared DNA on their paternal and maternal chromosomes even if they do not personally have runs of homozygosity.
None of the testing companies report total amounts of shared DNA which take into account runs of homozygosity or fully identical regions. These regions can, however, be discovered through analysis at Gedmatch.com and through David Pike’s utilities, and can be used to confirm and refine the total amounts of shared DNA between two individuals. To perform these comparisons, the researcher must have access to test results for both the subject and the genetic match.
Applying Different Centimorgan Thresholds
In endogamous populations, much of the population shares extremely small segments of DNA from many distant ancestors. If these small segments of DNA are included as part of the total shared DNA between two individuals, they can skew the estimates for how closely the two might actually be related. At Family Tree DNA, all segments larger than 1 cM are included as part of the total shared DNA. At Ancestry, all segments larger than 5 cM are included; however, some larger segments may be excluded based on their matching algorithms. At 23andMe, all segments larger than 5 cM are included as part of the total. Depending on the nature of the endogamous population, it may be beneficial to recalculate the total amounts of shared DNA between matches through comparison at Gedmatch and through application of higher centimorgan thresholds. For example, an individual may share 120 centimorgans with a match at Family Tree DNA, but when segments smaller than 7 cM are excluded, this total may drop to 60 or 70 and may be more representative of the nature of their closest relationship. The appropriate threshold to apply in any given case will depend on the amount of endogamy within a population and whether the test subject is a full member of that population or has recent admixture from outside populations. Consider calculating several totals using different thresholds to give a better indication and overview of the shared DNA.
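Recalculating totals at several thresholds is simple arithmetic once you have a list of segment sizes. The sketch below uses an invented list of segments for a single match, constructed so that the full total is 120 cM; the values are illustrative only and do not come from a real comparison.

```python
def total_shared_cm(segment_cms, min_cm):
    """Sum the shared segments that are at least `min_cm` centimorgans long."""
    return sum(cm for cm in segment_cms if cm >= min_cm)


# Hypothetical segment sizes (in cM) shared with one match.
segments = [
    32.0, 21.0, 12.0, 6.5, 6.0, 5.5, 5.0, 4.5, 4.0, 3.5,
    3.0, 3.0, 2.5, 2.5, 2.0, 2.0, 1.5, 1.5, 1.0, 1.0,
]

# Compare the totals produced by several candidate thresholds.
for threshold in (1, 5, 7, 10):
    print(f">= {threshold} cM: {total_shared_cm(segments, threshold)} cM")
```

For this invented match the total falls from 120 cM with every segment counted to 88 cM at a 5 cM threshold and 65 cM at 7 cM, where only the three largest segments remain. Laying the totals side by side makes it easier to see how much of a match is built from small segments.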
Calculating exact levels of expected and observed shared DNA and applying different thresholds for total calculations can help to overcome some of the challenges of autosomal analysis in endogamous populations. If you struggle with endogamy in your autosomal DNA test results, Legacy Tree would be happy to assist you in your research. Contact us today for a free consultation.
 Angie Bush, “Cousin Marriage and Endogamy” in “Course Introduction and Overview,” in Advanced DNA Analysis Techniques, Salt Lake Institute of Genealogy, 2016.
 Are your parents related? Gedmatch.com, accessed October 2016; and David Pike, “Search for Runs of Homozygosity (ROHs),” David Pike’s Utilities, www.math.mun.ca, accessed October 2016.