STR analysis

Short tandem repeat (STR) analysis is a common molecular biology method used to compare allele repeats at specific loci in DNA between two or more samples. A short tandem repeat is a microsatellite whose repeat units are 2 to 7 base pairs long; the number of repeats varies among individuals, which makes STRs effective for human identification.^[2] The method differs from restriction fragment length polymorphism (RFLP) analysis in that the DNA is not cut with restriction enzymes. Instead, polymerase chain reaction (PCR) is used to amplify each locus, and the length of the short tandem repeat is determined from the length of the PCR product.
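The link between PCR product length and repeat number can be sketched in a few lines of Python. This is a minimal illustration, assuming a simple locus in which the product consists of a fixed flanking sequence (primers included) plus one uninterrupted run of repeats; the function name and all numbers are hypothetical and do not describe a real locus.

```python
def repeat_count(amplicon_length: int, flank_length: int, unit_length: int) -> float:
    """Estimate the number of repeat units from a PCR product length.

    Assumes amplicon = flanking sequence (primers included) + repeat run.
    The values used in the example below are hypothetical.
    """
    if unit_length <= 0:
        raise ValueError("unit_length must be positive")
    repeat_region = amplicon_length - flank_length
    if repeat_region < 0:
        raise ValueError("amplicon is shorter than the flanking sequence")
    return repeat_region / unit_length


# Two hypothetical alleles at a 4-base-repeat locus with 100 bp of flanking sequence
alleles = [repeat_count(size, flank_length=100, unit_length=4) for size in (132, 140)]
print(alleles)
```

Under these assumptions, a difference of 8 bp between two products corresponds to a difference of two repeat units.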

Forensic uses

STR analysis is a tool in forensic analysis that evaluates specific STR regions found on nuclear DNA. The variable (polymorphic) nature of the STR regions analyzed in forensic testing increases the discrimination between one DNA profile and another.^[3] Scientific tools such as the FBI-approved STRmix incorporate this technique.^[4]^[5] Forensic science takes advantage of the population's variability in STR lengths, enabling scientists to distinguish one DNA sample from another.

The system of DNA profiling used today is based on PCR and uses simple sequences^[6] or short tandem repeats (STRs). This method uses highly polymorphic regions that contain short repeated sequences of DNA (the most common repeat is 4 bases long, but other lengths are in use, including 3 and 5 bases). Because unrelated people almost certainly have different numbers of repeat units, STRs can be used to discriminate between unrelated individuals. These STR loci (locations on a chromosome) are targeted with sequence-specific primers and amplified using PCR. The resulting DNA fragments are then separated and detected using electrophoresis. Two methods of separation and detection are common: capillary electrophoresis (CE) and gel electrophoresis.
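To make the idea of "short repeated sequences" concrete, the following sketch counts the longest uninterrupted run of a repeat motif in a DNA string. It is only an illustration: the sequence is made up, the helper name is hypothetical, and real typing relies on fragment sizing or sequencing rather than on a text search like this.

```python
import re


def longest_tandem_run(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    motif = motif.upper()
    # Each match is one uninterrupted run of the motif (at least one copy).
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Made-up sequence: flanking bases, six GATA repeats, an interruption, two more repeats
seq = "ACGT" + "GATA" * 6 + "TTCA" + "GATA" * 2 + "GG"
print(longest_tandem_run(seq, "GATA"))
```

Two samples that differ in this count at a locus would carry different alleles there, which is the property the profiling systems described above exploit.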

Each STR is polymorphic, but the number of alleles is very small. Typically each STR allele is shared by around 5–20% of individuals. The power of STR analysis comes from examining multiple STR loci simultaneously.^[6] The resulting pattern of alleles can identify an individual quite accurately, and the more STR regions that are tested in an individual, the more discriminating the test becomes.^[6] However, an analysis based on 10 STR loci can have a genotyping error margin of 30%, meaning it errs nearly one third (1/3) of the time.^[7] Even the 15 Identifiler microsatellite STR loci are not informative markers for inferring ancestry; a much larger set of genetic markers is needed to detect fine-scale population structure.^[8] One study reported that 30 DIP-STRs were suitable for prenatal paternity testing and for roughly outlining biogeographic ancestry in forensics, but more markers and multiplex panels need to be developed to promote use of this original approach.^[9]

In a comparison of SNP and STR analysis, high-quality SNPs proved better for delineating population structure, as well as genetic relationships at the individual and population level.^[10] The best 15 SNPs (30 alleles) performed similarly to the best 4 STR loci (83 alleles); increasing the number of STR loci made no difference, whereas increasing to 100 SNPs substantially improved assignment and gave the highest result. The researchers found that some STR loci outperformed the SNP loci on a single-locus basis, but that combinations of SNPs outperformed the STRs based on total number of alleles. SNPs from a larger panel gave significantly more accurate individual genetic self-assignment than any combination of the STR loci.^[10]

Different STR-based DNA-profiling systems are in use from country to country. In North America, systems that amplify the 20 CODIS core loci are almost universal, whereas the United Kingdom uses the DNA-17 system of 17 loci (which is compatible with the National DNA Database). Whichever system is used, many of the STR regions tested are the same. These DNA-profiling systems are based on multiplex reactions, in which many STR regions are tested at the same time.

The true strength of STR analysis lies in its statistical power of discrimination. Because the 20 loci currently used for discrimination in CODIS are independently assorted (having a certain number of repeats at one locus does not change the likelihood of having any number of repeats at any other locus), the product rule for probabilities can be applied. This means that if someone has the DNA type ABC, where the three loci are independent, the probability of having that DNA type is the probability of having type A times the probability of having type B times the probability of having type C. This has made it possible to generate match probabilities of 1 in a quintillion (1 × 10¹⁸) or more. However, DNA database searches have shown false DNA profile matches much more frequently than expected.^[11] Moreover, since there are about 12 million monozygotic twins on Earth, the theoretical probability is not accurate.
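As a worked illustration of the product rule, the sketch below multiplies per-locus genotype frequencies into a single profile frequency. It assumes Hardy-Weinberg proportions for turning allele frequencies into genotype frequencies (an assumption not stated in the text above), and the allele frequencies and function names are invented for illustration only.

```python
from math import prod
from typing import Optional, Sequence, Tuple

Genotype = Tuple[float, Optional[float]]  # (p, q) heterozygote, (p, None) homozygote


def genotype_frequency(p: float, q: Optional[float] = None) -> float:
    """Hardy-Weinberg genotype frequency: p*p if homozygous, 2*p*q if heterozygous."""
    return p * p if q is None else 2.0 * p * q


def profile_frequency(genotypes: Sequence[Genotype]) -> float:
    """Product rule: multiply genotype frequencies across independent loci."""
    return prod(genotype_frequency(p, q) for p, q in genotypes)


# Hypothetical allele frequencies for three independent loci A, B and C
profile = [(0.10, 0.20), (0.15, None), (0.05, 0.25)]
frequency = profile_frequency(profile)
print(f"{frequency:.3e}")
```

With these made-up values the three loci alone give a profile frequency of about 1 in 44,000; each additional independent locus multiplies in another factor well below 1, which is how 20 loci reach the very small match probabilities quoted above.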

In practice, the risk of a match caused by contamination is much greater than the risk of matching a distant relative; contamination can come from nearby objects or from left-over cells transferred from a prior test. The risk is greatest for matching the person most commonly represented in the samples: everything collected from, or in contact with, a victim is a major source of contamination for any other samples brought into a lab. For that reason, multiple control samples, prepared during the same period as the actual test samples, are typically tested to ensure that they stayed clean. Unexpected matches (or variations) in several control samples indicate a high probability of contamination of the actual test samples. In a relationship test, the full DNA profiles should differ (except for twins), to prove that a person was not matched as being related to their own DNA in another sample.^{[citation needed]}

In biomedical research, STR profiles are used to authenticate cell lines.^[12] Self-generated STR profiles can be compared with databases such as CLASTR (https://www.cellosaurus.org/cellosaurus-str-search/) or STRBase (https://strbase.nist.gov/). In addition, self-generated primary murine cell lines cultured before the first passaging can be matched with later passages, thus ensuring the identity of the cell line.
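Comparing a self-generated profile with a reference profile can be sketched as an allele-sharing score. The example below uses a measure often called the Tanabe score (twice the number of shared alleles divided by the total number of alleles in both profiles); the text does not say which measure the databases above apply, so treat the choice as an assumption. The locus names are real human STR loci, but the allele values and the function name are hypothetical.

```python
from typing import Dict, Set

Profile = Dict[str, Set[float]]  # locus name -> set of allele designations


def tanabe_score(query: Profile, reference: Profile) -> float:
    """Allele-sharing score over loci typed in both profiles (1.0 = identical)."""
    shared = 0
    total = 0
    for locus in query.keys() & reference.keys():
        shared += len(query[locus] & reference[locus])
        total += len(query[locus]) + len(reference[locus])
    return 2 * shared / total if total else 0.0


# Hypothetical profiles for a working stock and a reference entry
query = {"D5S818": {11, 12}, "TH01": {7, 9.3}, "TPOX": {8}}
reference = {"D5S818": {11, 12}, "TH01": {7}, "TPOX": {8, 9}}
score = tanabe_score(query, reference)
print(f"{score:.2f}")
```

A score near 1 suggests the two profiles come from the same line; what threshold counts as a match depends on the guideline a laboratory follows and is not specified here.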

References

[1] Image by Mikael Häggström, MD, using the following source image: Figure 1 (available under a Creative Commons Attribution 4.0 International license) from the following article: Roberta Sitnik, Margareth Afonso Torres, Nydia Strachman Bacal, João Renato Rebello Pinho (2006). "Using PCR for molecular monitoring of post-transplantation chimerism". Einstein (Sao Paulo). 4 (2).

[2] Butler, John M. (4 August 2011). Advanced Topics in Forensic DNA Typing: Methodology. San Diego: Elsevier Academic Press. pp. 99–100. ISBN 9780123745132.

[3] National Commission on the Future of DNA Evidence (July 2002). "Using DNA to Solve Cold Cases" (PDF). U.S. Department of Justice. Retrieved 2006-08-08.

[4] "Internal Validation of STRmix™ V2.3" (PDF). dfs.dc.gov.

[5] Moretti, Tamyra R.; Just, Rebecca S.; Kehl, Susannah C.; Willis, Leah E.; Buckleton, John S.; Bright, Jo-Anne; Taylor, Duncan A.; Onorato, Anthony J. (2017). "Internal validation of STRmix™ for the interpretation of single source and mixed DNA profiles". Forensic Science International: Genetics. 29: 126–144. doi:10.1016/j.fsigen.2017.04.004. PMID 28504203.

[6] Tautz D. (1989). "Hypervariability of simple sequences as a general source for polymorphic DNA markers". Nucleic Acids Research. 17 (16): 6463–6471. doi:10.1093/nar/17.16.6463. PMC 318341. PMID 2780284.

[7] Witherspoon, D. J.; Wooding, S.; Rogers, A. R.; Marchani, E. E.; Watkins, W. S.; Batzer, M. A.; Jorde, L. B. (2007-05-01). "Genetic Similarities Within and Between Human Populations". Genetics. 176 (1): 351–359. doi:10.1534/genetics.106.067355. ISSN 0016-6731. PMC 1893020. PMID 17339205.

[8] Babiker, Hiba MA; Schlebusch, Carina M; Hassan, Hisham Y; Jakobsson, Mattias (December 2011). "Genetic variation and population structure of Sudanese populations as indicated by 15 Identifiler sequence-tagged repeat (STR) loci". Investigative Genetics. 2 (1): 12. doi:10.1186/2041-2223-2-12. ISSN 2041-2223. PMC 3118356. PMID 21542921.

[9] Damour, Géraldine; Mauffrey, Florian; Hall, Diana (2023-05-01). "Identification and characterization of novel DIP-STRs from whole-genome sequencing data". Forensic Science International: Genetics. 64: 102849. doi:10.1016/j.fsigen.2023.102849. ISSN 1872-4973. PMID 36827792.

[10] Glover, Kevin A.; Hansen, Michael M.; Lien, Sigbjørn; Als, Thomas D.; Høyheim, Bjørn; Skaala, Oystein (2010-01-06). "A comparison of SNP and STR loci for delineating population structure and performing individual genetic assignment". BMC Genetics. 11: 2. doi:10.1186/1471-2156-11-2. ISSN 1471-2156. PMC 2818610. PMID 20051144.

[11] Felch, Jason; et al. (July 20, 2008). "FBI resists scrutiny of 'matches'". Los Angeles Times. pp. P8.

[12] Hong Y. (2020). "Authentication of Primary Murine Cell Lines by a Microfluidics-Based Lab-On-Chip System". Biomedicines. 8 (12): 590. doi:10.3390/biomedicines8120590. PMC 7763653. PMID 33317212.
