In the last few weeks, genetic genealogy has been brought to the forefront of the national and international stage with the revelation that U.S. law enforcement is applying genetic genealogy methodologies to cold cases for unidentified persons and criminal investigations. The identification of the “Chameleon” killer, “Buckskin Girl,” the “Golden State Killer,” “Lyle Stevik” and a suspect in a 1987 Washington State murder case all have been made possible through application of genetic genealogy approaches.[1] It is anticipated that additional cases will be solved in the coming weeks and months as these approaches become more widely utilized.
Genetic Genealogy Methodology
The approaches utilized in these cases are the same as the familial searching approaches commonly utilized in genetic genealogy investigations. Genetic genealogy research centers around the observation that when two individuals share DNA, they share common ancestry. In fact, all humans share approximately 99.9% of their DNA with all other humans. The remaining 0.1% of our DNA on which we differ can be used to identify portions of DNA which likely came from recent common ancestors. In this way, shared mutations or differences point to shared ancestry. In genetic genealogy investigations incorporating autosomal DNA evidence, individuals perform DNA testing at one or more of the major DNA testing companies. These tests analyze an array of several hundred thousand DNA markers across an individual’s genome. When individuals share long consecutive runs of these markers with other individuals, the resulting “segments” are considered to be identical-by-descent (IBD). In other words, those two individuals are expected to share recent common ancestors.
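The idea of "long consecutive runs" of shared markers can be illustrated with a short sketch. This is a deliberately simplified toy, not the matching algorithm of any testing company or of GEDmatch: it assumes each person is represented by one value per marker and that a hypothetical `min_len` threshold counts markers, whereas real tools compare two-allele genotypes and measure segments by genetic distance. The function name `shared_runs` is our own.

```python
from typing import List, Tuple


def shared_runs(a: List[str], b: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, for every run of at
    least min_len consecutive markers on which the two lists agree.

    Toy model: one value per marker and a marker-count threshold
    (both are assumptions made for illustration)."""
    runs = []
    start = None  # index where the current matching run began, if any
    for i, (x, y) in enumerate(zip(a, b)):
        if x == y:
            if start is None:
                start = i
        else:
            # A mismatch ends the current run; keep it only if long enough.
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the compared markers.
    n = min(len(a), len(b))
    if start is not None and n - start >= min_len:
        runs.append((start, n))
    return runs


person_1 = list("AAGGTTCCAA")
person_2 = list("AAGGTACCAA")  # differs from person_1 at index 5 only
print(shared_runs(person_1, person_2, min_len=5))
```

Raising `min_len` discards short runs, which mirrors the reasoning in the text: only long shared segments are treated as evidence of recent common ancestry.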
Once these markers are tested and analyzed, a test subject’s results are compared to a large database of other tested customers. This comparison generates a list of genetic cousins – individuals who share large segments of IBD DNA inherited from recent common ancestors. Based on the amounts of DNA shared between the test subject and genetic cousins, levels of genealogical relationship are estimated. The amount of DNA that siblings share is vastly different from the amounts shared between first, second or third cousins. These estimates of genetic relationship are based only on genetic data, but can be confirmed through genealogical investigation in traditional document sources. Also, in some cases genetic relationships between multiple proposed relatives can be utilized to organize genetic matches into unique relationship groups. These groups, in turn, can often be associated with specific ancestral lines of the test subject. For matches and for groups, family trees are reviewed or constructed and common ancestors between genetic cousins are identified. When genetic cousins of a test subject share common ancestors, those ancestors are frequently ancestors or relatives of the test subject as well. Finally, descendants of these ancestral candidates are researched until individuals are located whose genealogical relationships account for all proposed ancestors and whose personal history places them in the correct place at the correct time to be the person of interest. This methodology is commonly employed for cases of misattributed or unknown parentage, adoption, multiple identities, and exploration of other brick-wall scenarios in genealogical research.
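To make the relationship-estimation step concrete, the sketch below maps a shared fraction of autosomal DNA to the nearest expected average for a few relationship categories. The expected values are the standard theoretical averages (for example, 12.5% for first cousins); the table `EXPECTED_SHARE`, the function `estimate_relationship`, and the nearest-match rule are illustrative assumptions of ours. In practice, observed sharing varies around these averages, ranges for different relationships overlap, and testing companies report shared DNA in centimorgans rather than as a simple fraction.

```python
import math

# Theoretical average fraction of autosomal DNA shared (illustrative table).
EXPECTED_SHARE = {
    "parent/child or full sibling": 0.5,
    "grandparent, aunt/uncle or half sibling": 0.25,
    "first cousin": 0.125,
    "second cousin": 0.03125,
    "third cousin": 0.0078125,
}


def estimate_relationship(shared_fraction: float) -> str:
    """Return the category whose expected share is closest to the observed
    fraction, compared on a log scale because each step roughly halves
    or quarters the expected amount."""
    if not 0 < shared_fraction <= 1:
        raise ValueError("shared_fraction must be in the interval (0, 1]")
    return min(
        EXPECTED_SHARE,
        key=lambda label: abs(
            math.log(shared_fraction) - math.log(EXPECTED_SHARE[label])
        ),
    )


print(estimate_relationship(0.12))
print(estimate_relationship(0.03))
```

An estimate of this kind only narrows the possibilities. As the text notes, it rests on genetic data alone and still needs confirmation in traditional document sources.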
Use of Genetic Genealogy Methodologies by Law Enforcement
In the recent cases in which law enforcement has utilized these same approaches, DNA samples have been obtained from unidentified remains or crime scenes and have subsequently been sequenced using whole-genome sequencing technologies or microarray technologies – the same technologies utilized by genetic genealogy testing companies. From these sequences, fake kits comparable to those utilized in genetic genealogy were created and uploaded to at least one of the databases utilized by genealogists. Each of the cases mentioned above utilized GEDmatch.com, a free third-party website which accepts autosomal DNA transfers from each of the major DNA testing companies. At least for the Golden State Killer case, MyHeritage, Ancestry, 23andMe, and Family Tree DNA have denied direct involvement in relation to the use of their autosomal DNA databases. Family Tree DNA was contacted in relation to this case for information on a customer’s Y-DNA sample, which eventually proved to be a false lead.[2] Nevertheless, Family Tree DNA and MyHeritage do accept autosomal DNA transfers and could potentially have been utilized as well. Ancestry, 23andMe, Family Tree DNA and MyHeritage specifically prohibit law enforcement submission of DNA samples or kits to their databases as part of criminal or forensic investigations.
Once suspects’ DNA profiles were uploaded to GEDmatch.com, their results were compared against the existing database of tested genealogists at that website. Just as with other genetic genealogy test results, GEDmatch provided a list of genetic cousins who shared large segments of IBD DNA with the suspect. Some of these individuals were genetically related to each other. By building out the family trees of these matches and searching for common ancestors, ancestral candidates of the suspect were identified. Before identifying the suspects in these cases, it was first necessary to identify their second, third or fourth great-grandparents. Once ancestral candidates were identified, descendants of these individuals were researched through genealogical investigation. In these cases, as descendants of unrelated ancestral candidates were researched, connections and intersections between the descending family lines may also have been identified. Eventually, individuals were found who were in the right place at the right time to be the missing persons of interest or, in the case of the murder investigations, fit the expected profiles for the suspected killers. These individuals’ family trees accounted for all ancestral candidates and connections to close genetic cousins. In the case of criminal investigations, once these individuals were identified, contemporary DNA samples were obtained from discarded materials. Those new samples were compared to the crime scene evidence utilizing law enforcement databases and were confirmed to match.
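The search for common ancestors among matches can be sketched as a simple counting exercise. The example below assumes each match's researched tree has already been reduced to a set of ancestor labels. The names, the `ancestral_candidates` function, and the `min_matches` threshold are hypothetical and chosen only for illustration. Real investigations must also reconcile spelling variants, incomplete trees, and multiple lines of descent, none of which this sketch handles.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def ancestral_candidates(
    trees: Dict[str, Set[str]], min_matches: int = 2
) -> List[Tuple[str, int]]:
    """Count how many matches' trees contain each ancestor and return those
    appearing in at least min_matches trees, most widely shared first."""
    counts: Counter = Counter()
    for ancestors in trees.values():
        # Each tree is a set, so an ancestor counts once per match.
        counts.update(ancestors)
    shared = [(name, n) for name, n in counts.items() if n >= min_matches]
    # Sort by descending count, then by name for a stable order.
    return sorted(shared, key=lambda item: (-item[1], item[0]))


# Hypothetical trees for three genetic cousins of the test subject.
trees = {
    "match_A": {"John Doe b.1850", "Mary Roe b.1855"},
    "match_B": {"John Doe b.1850", "Ann Poe b.1860"},
    "match_C": {"John Doe b.1850", "Mary Roe b.1855"},
}
print(ancestral_candidates(trees))
```

Ancestors shared by several otherwise separate matches become the candidates whose descendants are then researched, as described above.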
Using Genetic Genealogy to Solve Cold Cases: The Ethics Debate
While the capture of elusive serial killers, murderers and rapists is undoubtedly a huge benefit to society, the use of GEDmatch and genetic genealogy databases in this way raises important ethical concerns which demand continued consideration. First, the individuals whose DNA kits were utilized in these cases and are being utilized in other cases are largely unaware of this use of their DNA. Additionally, because GEDmatch is an international database, individuals living outside the United States may have their DNA test results utilized in a jurisdiction and a country of which they are not even citizens. Those who have submitted their samples to GEDmatch and who may be uncomfortable with other uses of their DNA results have been instructed to remove their submissions. For those administering the DNA kits of relatives, we urge changing the settings of the kits for family members to be private or research kits, or removing them entirely from the GEDmatch database until permission and consent can be obtained from relatives for these uses.
While law enforcement has long been utilizing DNA samples for investigations, the DNA databases they have until now utilized are vastly different from the databases commonly used by genealogists. Law enforcement databases center around fewer markers, more variable markers, and markers which are generally uninformative regarding health or appearance. Additionally, strict safeguards have been set in place to protect privacy and prevent abuse of the more than 15 million DNA samples in the database. These samples come entirely from convicted felons or suspects in criminal investigations. Meanwhile, GEDmatch is much more accessible to the public, is composed of fewer than one million samples voluntarily submitted by genealogists, includes contact information and identifying details associated with kits, can be used to determine ethnic origins, is much more effective at identifying relatives of an individual, from close to distant, and has very few limitations on its use.[3] In the future, we call for protections on the data of innocent genealogists and strict procedures for how and when genetic genealogy databases may be used in law enforcement investigations.
Just as the structure and composition of law enforcement databases are quite different from genetic genealogy databases, so too are the ways in which results from these databases are interpreted. Any use of genetic genealogy approaches by law enforcement must be held to the highest standards of interpretation and, just as with genealogical research, genetic evidence must be considered within the context of all other available evidence. DNA tests run at DNA testing companies are held to high quality assurance standards and are obtained through samples containing large amounts of DNA. Meanwhile, crime scene DNA and degraded DNA from unidentified remains are sometimes found only in trace amounts, may be incomplete, and may be contaminated with other unrelated DNA signatures. In order to be comparable against genetic genealogy databases, these trace samples must achieve high-coverage reads in analysis, must be protected against the potential for contamination and must be of sufficient quality to accurately call the markers of interest.
Finally, while many laud law enforcement use of these genetic genealogy databases for violent crimes, it follows that reasonable restrictions on future and/or unanticipated uses must be established. Companies must provide the opportunity for informed user consent, with transparency about the potential for law enforcement use that exists in public databases such as GEDmatch. It is equally important that consumers use due diligence and familiarize themselves with the privacy policies and terms of service of each company with which they choose to share their data.
While genetic genealogy methodologies have immense power for good in convicting violent criminals, they also have a potential for misuse. We encourage the creation of regulation for the use of genetic genealogy methodologies in law enforcement investigations: regulation which will balance the potential benefits of these methodologies against potential abuses of power and which will maximize the benefits that genetic genealogy has to offer in many arenas, while preserving opportunities for utilizing DNA evidence in journeys of self-discovery through genealogical investigation.
Legacy Tree Genealogists employs a team of leading genetic genealogists. If you have a genealogy question you think DNA might be able to help answer, we would love to help! Contact us to discuss your questions and goals, and we’ll help you choose a project option and get started.
Special Thanks to Debbie Cruwys Kennett for her input and consultation.
[1] Tim Arango, “The Cold Case That Inspired the ‘Golden State Killer’ Detective to Try Genealogy,” New York Times (https://www.nytimes.com/2018/05/03/us/golden-state-killer-genealogy.html: accessed May 2018).
Crimesider Staff, “‘Buckskin Girl’ case: DNA breakthrough leads to ID of 1981 murder victim,” CBS News (https://www.cbsnews.com/news/buckskin-girl-case-groundbreaking-dna-tech-leads-to-id-of-1981-murder-victim/: accessed May 2018).
Sami Edge, “Dead man found in Washington state, who had ties to N.M., ID’d through DNA,” Santa Fe New Mexican (http://www.santafenewmexican.com/news/local_news/dead-man-found-in-washington-state-who-had-ties-to/article_00accfb7-b964-5a57-a9ae-ed6841bb2917.html: accessed May 2018).
CBS/AP, “DNA leads to arrest of man in 1987 killing of couple in Washington state,” CBS News (https://www.cbsnews.com/news/william-earl-talbott-arrested-in-1987-killing-of-tanya-van-cuylenborg-jay-cook-in-washington/: accessed May 2018).
[2] Peter Aldhous, “Cops Forced A Company To Share A Customer’s Identity For The Golden State Killer Investigation,” BuzzFeed News (https://www.buzzfeed.com/peteraldhous/family-tree-dna-subpoena-golden-state-killer: accessed May 2018).
[3] Leah Larkin, “A Comparison of GEDmatch and the FBI’s CODIS Database,” The DNA Geek (http://thednageek.com/a-comparison-of-gedmatch-and-the-fbis-codis-database/: accessed May 2018).
What steps do I need to take to remove my DNA information from GEDmatch? I put it up years ago and don’t have the contact information anymore.
Hi Diane, we suggest contacting GEDmatch directly. They can provide information on how to remove your kit as well as other options for limiting how your data is accessed and used.
I think it is pretty amazing that criminals are now being found through GEDmatch. Great tool for law enforcement.
I suppose if you have something to hide or are covering for an evil person, you shouldn’t upload your DNA. Really sad to think that a relative would cover up badness.
There are a myriad of reasons that someone might not want to share their DNA with the government. Of course some small-minded people will always assume the worst — another reason to keep one’s medical information private.