Five Ways to Use the New DNA Coverage Estimator Tool at DNA Painter
The DNA Coverage Estimator is now available and makes the process of identifying matching ancestors through DNA much simpler than ever before.
In April 2018, Legacy Tree Genealogists published an article by Paul Woodbury introducing the concept of DNA coverage – the amount of an ancestor’s DNA represented in a DNA database through the test results of their tested descendants. Different descendants of an ancestor inherit different portions of that individual’s DNA. Therefore, they have different shared segments and total amounts of shared DNA with key genetic cousins. They may even have altogether unique key genetic cousins not shared with other descendants of the target ancestor. By testing multiple descendants of a research subject, it is possible to maximize the coverage of that individual’s DNA in a database.
As part of our previous article, we presented several equations to help in calculating and estimating the coverage of an ancestor. Coverage estimates can be helpful for prioritizing DNA testing candidates, developing strategies for collaboration with existing genetic cousins, and estimating amounts of DNA an ancestor might have shared with a key match or group of matches. However, using the published equations has often been cumbersome and complicated. Further, the setup of the equations has limited their scalability when calculating coverage estimates for descendants of large families.
Recently, Leah Larkin at the DNA Geek refined and simplified the original equations to a more intuitive approach (See her description of the math here). Using these revised equations, Leah and Paul worked with Jonny Perl (owner and developer of DNA Painter) to create the coverage estimator tool. (To learn more about the tool and how to use it, visit Jonny’s article at DNA Painter).
This tool makes coverage analysis much more straightforward and accessible for genealogists: no more complicated calculations or limitations on the number of descendants to include. Here are just a few ideas of how you might use the coverage estimator tool in your research.
1. Estimate Your Ancestor’s Coverage in a Single Testing Database
One way you might use the coverage estimator tool is by entering all of the tested descendants of a known ancestor at a single DNA testing company (no mixing and matching between databases) in order to estimate their coverage at that company. If you have used WATO previously, you might use the same tree structure and mark the individuals who have tested.
For example, in the attached screenshot, I have identified the five tested descendants of Susan at AncestryDNA. Their combined DNA tests account for approximately 78.9% of Susan’s DNA at that company.
Keep in mind that while including all of the tested descendants of a research subject at a particular DNA testing company can give you an idea of what the coverage of that ancestor is in that particular database, the only way to take full advantage of that coverage is to collaborate with and seek access to the test results of the other tested descendants.
Without access to DNA test results for other descendants, you will still be limited to the coverage created by your own DNA test results. You won’t be able to learn about the additional chunks of DNA that other descendants inherited (and by extension the amounts of DNA shared with key genetic cousins) or even the additional key genetic cousins that you don’t match but they do.
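To make the idea concrete, here is a minimal Python sketch of one common way to approximate coverage. It assumes each child carries half of a parent's DNA and that different children's halves overlap only by chance. This is not necessarily the exact math used by the DNA Painter tool (see Leah Larkin's description for that), and the tree, the names, and the `Person` and `coverage` helpers are hypothetical.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Person:
    name: str
    tested: bool = False
    children: list[Person] = field(default_factory=list)


def coverage(person: Person) -> float:
    """Expected fraction of this person's DNA represented by tested people.

    Assumption: a tested person covers themselves fully; otherwise each
    child independently recovers half of the parent, scaled by how well
    that child is covered in turn.
    """
    if person.tested:
        return 1.0
    missing = 1.0
    for child in person.children:
        missing *= 1.0 - 0.5 * coverage(child)
    return 1.0 - missing


# Hypothetical tree: one tested child, plus two tested grandchildren
# who are siblings descending through a second, untested child.
ancestor = Person("Ancestor", children=[
    Person("Child A", tested=True),
    Person("Child B", children=[
        Person("Grandchild 1", tested=True),
        Person("Grandchild 2", tested=True),
    ]),
])

print(coverage(ancestor))  # 0.6875, i.e. about 68.8%
```

Under these assumptions a single tested child covers 50% of the ancestor, two tested children cover 75%, and each additional tester adds a smaller share than the one before.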
2. Prioritize the Genetic Cousins with Whom You Should Collaborate
Because you can only fully leverage the coverage of your ancestor’s DNA by obtaining access to the test results of other descendants, another way to use the coverage tool is to take the tree you created for your ancestor’s tested descendants and unmark every individual whose test results you cannot currently access, keeping only those whose results you can. This will estimate how much of your ancestor’s DNA you have access to through tested descendants (we could call it your active coverage).
The DNA Painter tool will identify which individual should be the next tester to maximize your active coverage. (In this case, these individuals are already tested, but this information might be interpreted to help you understand who you should reach out to and seek to collaborate with). If possible, you should try to work with those individuals to obtain access to their test results and thereby benefit from the additional perspective, segments, matches, and relationships that their test results can provide for the purposes of your research.
The tool will also reveal how much more helpful these individuals could be for increasing your coverage. If you attempt to collaborate with the highest priority individual and they decline to share test result access, you can mark them as “unwilling to test” in the data, and the tool will let you know who is the next best priority for collaboration.
In the same example from above, I have access to the test results of Elizabeth, Peter and Bernard, but not Matthew or Lillian, so I unmarked Matthew and Lillian as having tested. This reveals that my active coverage (the coverage based on the test results I actually have access to) is about 68.8%. The tool then tells me that the next best person to test (or in this case, the best person for me to work with for collaboration) would be Lillian. Getting access to her test results would increase my active coverage by 7.8%.
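The prioritization step can be sketched as a ranking of candidates by how much each would add to the active coverage. The Python below is an illustration under assumptions, not the tool's actual implementation. The tree shape is a guess (the article does not give it), chosen because it is consistent with the 68.8% and 7.8% figures above. The placeholder relatives ("Child B", "Great-grandchild D", and so on) and the function names are hypothetical.

```python
def coverage(tree, tested, person):
    """Expected fraction of `person`'s DNA covered by people in `tested`.

    Assumption: each child recovers half of a parent, independently.
    """
    if person in tested:
        return 1.0
    missing = 1.0
    for child in tree.get(person, []):
        missing *= 1.0 - 0.5 * coverage(tree, tested, child)
    return 1.0 - missing


def rank_candidates(tree, root, accessible, candidates):
    """Sort candidates by the coverage each would add on their own."""
    base = coverage(tree, accessible, root)
    gains = {
        c: coverage(tree, accessible | {c}, root) - base
        for c in candidates
    }
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Assumed tree: parent -> list of children.
tree = {
    "Susan": ["Elizabeth", "Child B", "Child C", "Child D"],
    "Child B": ["Peter", "Bernard"],
    "Child C": ["Lillian"],
    "Child D": ["Grandchild D"],
    "Grandchild D": ["Great-grandchild D"],
}
accessible = {"Elizabeth", "Peter", "Bernard"}

for name, gain in rank_candidates(
    tree, "Susan", accessible, ["Lillian", "Great-grandchild D"]
):
    print(f"{name}: +{gain:.4f}")
```

If a candidate declines, dropping them from the candidate list and re-running the ranking gives the next best person to approach.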
3. Prioritize the Relatives You Should Invite to Test or Transfer
To this point we have only considered already-tested descendants of a research subject ancestor, but the Coverage Estimator tool can also help you identify priorities for targeted testing or autosomal DNA transfers.
Consider adding other known descendants of your research subject to your chart. These individuals might not have performed DNA testing yet. On the other hand, they may have performed DNA testing, but at a different database.
When you add these people into your tree, the tool will identify who you should contact to invite to perform DNA testing (or to upload their test results from elsewhere) in order to maximize coverage. As in the case of collaboration prioritization described above, if individuals decline to test or transfer, you can mark them as “unwilling to test” and the tool will identify the next best candidates.
Alternatively, if they indicate that they are willing (but they haven’t done it yet, or you will need to wait for the results to process) you can mark that they are willing to test, and the tool will update to identify the next best testing or transfer option for you to consider.
There will be a decreasing return on investment as you test more relatives. As potential increases in coverage become smaller and smaller with each new testing candidate that agrees to test or transfer (or who declines), you might want to consider if the potential increase in coverage is worth it for your case.
Keep in mind that 1% of an individual’s DNA represents about 70 centimorgans of DNA. Is 70 centimorgans more coverage worth the cost of a test? You decide.
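As a quick illustration of that rule of thumb, the small Python sketch below converts a potential coverage gain into centimorgans. The 70 cM per 1% figure is the approximation quoted above, and the function name is hypothetical.

```python
CM_PER_PERCENT = 70  # approximate rule of thumb quoted in this article


def coverage_gain_in_cm(gain_percent: float) -> float:
    """Translate a coverage gain (in percentage points) into approximate cM."""
    return gain_percent * CM_PER_PERCENT


# A 7.8-point gain, like the one reported for Lillian earlier,
# corresponds to roughly 546 cM of additional coverage.
print(round(coverage_gain_in_cm(7.8)))
```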
In our example case, imagine that Susan had another son, John, who has two living sons. When we add them to the tree, the tool will identify them as the next best testing candidates to invite to test.
4. Estimate the Amount of DNA Your Ancestor Shared with a Genetic Cousin
Once you have started to collaborate with living descendants of a research subject and have obtained access to their test results, arranged transfers of their results into a desired database or arranged targeted testing, you can take coverage analysis a step further and begin reconstructing the DNA of your research subject ancestor and estimating how much DNA they might have shared with key genetic cousins.
Consider a scenario where you and several other descendants of your research subject all share DNA with a key genetic cousin who might be related through the ancestor of your subject. If you are working with results at a company that reports on segment data, you might generate a comparison between the key match and all independent tested descendants of your research subject. Next, you can determine the total number of centimorgans that the match shares on unique segments with all of the descendants of the research subject.
Some tools that might help with this include:
>DNA Painter’s Centimorgan Estimator tool (take the start position of one segment and the end position of a partially overlapping segment and calculate the length of the composite segment)
>DNA Painter’s Distinct Segment Generator (copy and paste two or more segments that multiple family members share with a single match and get the cM values for each composite segment and the total shared cM).
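The idea behind combining overlapping segments can be sketched as a simple interval merge. The Python below is only an illustration: it works on chromosome numbers and base-pair positions, and the segment values are made up. It does not compute centimorgans, because converting positions to cM requires a genetic map, which is what the tools above handle for you.

```python
from collections import defaultdict


def merge_segments(segments):
    """Merge overlapping (chromosome, start, end) segments into distinct ones."""
    by_chromosome = defaultdict(list)
    for chromosome, start, end in segments:
        by_chromosome[chromosome].append((start, end))

    merged = []
    for chromosome in sorted(by_chromosome):
        current_start, current_end = None, None
        for start, end in sorted(by_chromosome[chromosome]):
            if current_end is None or start > current_end:
                # No overlap with the segment being built: close it out.
                if current_end is not None:
                    merged.append((chromosome, current_start, current_end))
                current_start, current_end = start, end
            else:
                # Overlap: extend the composite segment.
                current_end = max(current_end, end)
        merged.append((chromosome, current_start, current_end))
    return merged


# Hypothetical segments that several descendants share with one match.
shared = [
    (1, 10_000_000, 30_000_000),   # descendant 1
    (1, 25_000_000, 45_000_000),   # descendant 2, overlaps the first
    (7, 5_000_000, 12_000_000),    # descendant 2
    (1, 80_000_000, 95_000_000),   # descendant 3
]
print(merge_segments(shared))
```

Here the two overlapping chromosome 1 segments collapse into one composite segment, so the shared region is counted only once.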
Once you know how much DNA a match shares with all of the tested descendants of the research subject, divide the total number by the coverage estimate as a decimal (e.g. 70.1% = .701) and you can estimate how much DNA your research subject might have shared with the genetic cousin. From there, you can evaluate the likely relationship levels using DNA Painter’s Shared cM Project tool.
Estimates are…Estimates
Keep in mind that the estimates provided by the Coverage Estimator are just that – ESTIMATES. The true coverage of an ancestor may be lower or higher given that genetic inheritance is random. As such, you should be careful about drawing assumptions and conclusions from these data.
As a general rule, the closer relatives a research subject has and the more tested descendants they have from unique descent lines, the closer the coverage estimate will be to the true coverage of the subject. At the very least (assuming there are not multiple relationships between the descendants of the subject and the key match), engaging in analysis of unique shared segments can reveal the minimum amount of DNA that an ancestor would have shared with a key genetic cousin.
In our example case, I was able to get the test results of Elizabeth, Peter and Bernie transferred to GEDmatch.com. I also managed to collaborate with a key genetic cousin, Katy, and invite that individual to transfer their test results to GEDmatch. While Elizabeth was born in the 1930s, and Peter and Bernie were born in the 1960s, Katy was born in the 1990s (and may be a generation further removed from the common ancestors with Susan’s descendants).
Comparisons of Katy’s DNA against Elizabeth, Peter and Bernie reveal that she shares 351 cM of DNA on unique segments with Susan’s tested descendants. Assuming that all of this shared DNA came from a common ancestor between Katy and Susan, we can conclude that Susan and Katy would have shared at least 351 cM of DNA with each other.
However, the combined results of Elizabeth, Peter and Bernie only account for a portion of Susan’s DNA – about 68.8% of her DNA according to the Coverage Estimator tool. If we assume that the 351 cM of DNA that Katy shares with Susan’s descendants represents only 68.8% of the DNA she would have shared with Susan, we estimate that Katy might have shared approximately 510 cM.
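The arithmetic in this step is a single division, sketched below in Python. The function name is hypothetical, and the result is only as good as the coverage estimate and the assumption that all of the shared DNA comes through Susan.

```python
def estimate_ancestor_shared_cm(observed_cm: float, coverage: float) -> float:
    """Scale the cM observed through descendants up to the ancestor's level."""
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be a decimal between 0 and 1")
    return observed_cm / coverage


# 351 cM observed across descendants who cover about 68.8% of Susan's DNA.
print(round(estimate_ancestor_shared_cm(351, 0.688)))  # 510
```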
With that level of sharing, we would expect that Katy is related to Susan at the level of a first cousin once removed or (given her age) possibly at the genetically equivalent level of a great-grandniece. For more information on navigating genetically equivalent relationships see our article on the subject.
The amounts of DNA that Katy shares with Elizabeth, Peter and Bernie separately corroborate this hypothesis, but the combined unique segments and application of coverage estimates provide stronger evidence of the proposed relationship, at higher probabilities, than any of the individual amounts of shared DNA between Katy and Elizabeth, Katy and Peter or Katy and Bernie.
5. Estimate the Amount of DNA Your Ancestor Shared with a Deceased Relative
In the case above, we considered a scenario where a group of descendants of a research subject were compared against a single genetic match. To take it one step further, you might also consider the unique shared segments between two groups of matches. You could calculate the coverage of the descendants of the research subject for whom you have test result access, and then in a new chart, calculate the estimated coverage of a candidate relative based on the relationships between the tested descendants of the candidate.
Use the same tools as were utilized in the previous recommendation (DNA Painter’s Centimorgan Estimator, and Distinct Segment Generator) to determine the segments that the research subject’s descendants share with the candidate relative’s descendants. Alternatively, you can get all individuals transferred to GEDmatch and use the Lazarus tool to identify the segments shared between both groups.
Next, divide by the coverage of the research subject (in decimal format) and divide again by the coverage of the relative candidate (in decimal format). In this way you can estimate how much DNA the research subject and the relative candidate would have shared in common with each other.
In our sample case, after transferring the test results of Elizabeth, Peter and Bernie to GEDmatch, we found a group of additional genetic cousins already at GEDmatch including a pair of siblings and their aunt (coverage of 68.8% for the common ancestor). All three matches descended from a woman named Laverna. Using the Lazarus tool with Elizabeth, Peter and Bernie in one group and these three matches in the other group, we found that Susan’s descendants shared 472.2 cM of DNA with the three descendants of Laverna. When we divided this by .688 (for Susan’s coverage) and by .688 (for Laverna’s coverage) we found that Susan and Laverna might have shared approximately 999 cM of DNA (100% probability of a first cousin or genetically equivalent relationship). Additional exploration revealed that Laverna was a great-niece of Susan (which is genetically equivalent to a first cousin relationship).
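The two-sided calculation can be sketched the same way: divide the observed total by each coverage estimate in turn. The Python below uses a hypothetical function name. The comparison with 0.6875 rests on an assumption: that the reported 68.8% is a rounded 68.75%, which is what one tested child plus two sibling grandchildren would give if each child recovers half of a parent.

```python
from math import prod


def estimate_shared_between_ancestors(observed_cm: float, *coverages: float) -> float:
    """Divide the observed cM by every coverage estimate (as decimals)."""
    for coverage in coverages:
        if not 0 < coverage <= 1:
            raise ValueError("each coverage must be a decimal between 0 and 1")
    return observed_cm / prod(coverages)


# Using the rounded 68.8% for both Susan and Laverna.
with_rounded = estimate_shared_between_ancestors(472.2, 0.688, 0.688)

# Using 68.75%, on the assumption that 68.8% was rounded from it.
with_unrounded = estimate_shared_between_ancestors(472.2, 0.6875, 0.6875)

print(round(with_rounded), round(with_unrounded))  # 998 999
```

The small gap between the two results is a reminder that rounding in the coverage figures carries through to the final estimate, which is one more reason to treat these numbers as approximate.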
Again, keep in mind that these are estimates. The closer the descendants of each individual, the more independent descendants who have tested, and the higher the estimated coverage, the more accurate the estimates will be of how much DNA a research subject may have shared with a match or with another relative candidate.
Give it a try: use the Coverage Estimator tool to estimate the coverage of an ancestor’s DNA in any given testing database, to prioritize the individuals with whom you will collaborate, seek access to test results, invite to transfer or invite to test, and ultimately to help determine the amount of DNA your ancestor shared with a deceased relative.
If you have a tough DNA mystery you'd like to solve, our DNA experts can help! Contact us today for a free consultation to discuss which of our project options works best for you.