Publications by authors named "Bruce Budowle"

293 Publications

Linkage and linkage disequilibrium among the markers in the forensic MPS panels.

J Forensic Sci 2021 Apr 22. Epub 2021 Apr 22.

Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA.

For the past two to three decades, forensic DNA evidence has been analyzed with a limited number of short tandem repeats (STRs), and these STRs are usually assumed to be independent for statistical calculations. With the development and implementation of massively parallel sequencing (MPS) technologies, more autosomal markers, both single nucleotide polymorphisms (SNPs) and STRs, can be analyzed. A number of these markers are physically very close to each other, and it may not be appropriate to assume all these markers are genetically unlinked or in linkage equilibrium. In this study, publicly accessible genomic data from five representative populations were used to evaluate the genetic linkage and linkage disequilibrium (LD) between autosomal markers represented in six major commercial panels (in total, 362 markers). Among the 3041 syntenic marker pairs, 1524 pairs had sex-average genetic distances <50 cM, and thus, these marker pairs can be considered as genetically linked. Among the 143 marker pairs with physical distances <1 Mb, 19 LD haplotype blocks (comprising 39 SNPs in total) were detected for at least one of the tested populations. Statistical methods for interpreting linked markers and/or markers in LD were suggested for various case scenarios.
Source
http://dx.doi.org/10.1111/1556-4029.14724
April 2021
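The two quantities discussed in the abstract above, genetic linkage and linkage disequilibrium, can be illustrated with a minimal Python sketch. It assumes the Haldane map function to convert a genetic distance in cM into a recombination fraction, and it computes the standard D, |D'| and r² statistics for a pair of biallelic markers. The function names and the choice of map function are illustrative assumptions and are not taken from the paper.

```python
import math


def haldane_recombination_fraction(distance_cm: float) -> float:
    """Recombination fraction for a genetic distance in cM (Haldane map function)."""
    morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * morgans))


def ld_stats(p_ab: float, p_a: float, p_b: float):
    """Return (D, |D'|, r^2) for two biallelic markers.

    p_ab is the frequency of the haplotype carrying allele A and allele B;
    p_a and p_b are the frequencies of alleles A and B, each strictly
    between 0 and 1.
    """
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    # The maximum attainable |D| depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Under the Haldane assumption, 50 cM maps to a recombination
    # fraction below 0.5, i.e. the markers are not fully independent.
    print(haldane_recombination_fraction(50.0))
    # Hypothetical frequencies: alleles always co-occur (complete LD).
    print(ld_stats(0.5, 0.5, 0.5))
```

Under this assumed map function, markers separated by less than 50 cM have a recombination fraction below 0.5, which is consistent with treating such pairs as genetically linked.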

Evaluation of Promega PowerSeq™ Auto/Y systems prototype on an admixed sample of Rio de Janeiro, Brazil: Population data, sensitivity, stutter and mixture studies.

Forensic Sci Int Genet 2021 Apr 6;53:102516. Epub 2021 Apr 6.

Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.

Forensic DNA typing typically relies on the length-based (LB) separation of PCR products containing short tandem repeat loci (STRs). Massively parallel sequencing (MPS) elucidates an additional level of STR motif and flanking region variation. Also, MPS enables simultaneous analysis of different marker types (autosomal STRs, SNPs for lineage and identification purposes), reducing both the amount of sample used and the turn-around-time of analysis. Therefore, MPS methodologies are being considered as an additional tool in forensic genetic casework. The PowerSeq™ Auto/Y System (Promega Corp), a multiplex forensic kit for MPS, enables analysis of the 22 autosomal STR markers (plus Amelogenin) from the PowerPlex® Fusion 6C kit and 23 Y-STR markers from the PowerPlex® Y23 kit. Population data were generated from 140 individuals from an admixed sample from Rio de Janeiro, Brazil. All samples were processed according to the manufacturers' recommended protocols. Raw data (FastQ) were generated for each indexed sample and analyzed using STRait Razor v2s and the PowerSeqv2.config file. The subsequent population data showed the largest increase in expected heterozygosity (23%) from LB to sequence-based (SB) analyses at the D5S818 locus. An unreported allele was found at the D21S11 locus. The random match probability across all loci decreased from 5.9 × 10 to 7.6 × 10. Sensitivity studies using 1, 0.25, 0.062 and 0.016 ng of DNA input were analyzed in triplicate. Full Y-STR profiles were detected in all samples, and no autosomal allele drop-out was observed with 62 pg of input DNA. For mixture studies, 1 ng of genomic DNA from a male and a female sample at 1:1, 1:4, 1:9, 1:19 and 1:49 proportions was analyzed in triplicate. Clearly resolvable alleles (i.e., no stacking or shared alleles) were obtained at a 1:19 male to female contributor ratio. The minus one stutter (-1) increased with the longest uninterrupted stretch (LUS) allele size reads and according to simple or compound/complex repeats. The haplotype-specific stutter rates add more information for mixed-sample interpretation. These data support the use of the PowerSeq Auto/Y systems prototype kit (22 autosomal STR loci, 23 Y-STR loci and Amelogenin) for forensic genetics applications.
Source
http://dx.doi.org/10.1016/j.fsigen.2021.102516
April 2021
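The gain in expected heterozygosity when moving from length-based to sequence-based alleles, as reported in the abstract above, can be illustrated with a small Python sketch. It uses the uncorrected estimator H = 1 - Σ pᵢ² and entirely hypothetical allele frequencies in which one length-based allele splits into two sequence-based alleles. The study's own estimator, sample-size correction and data are not reproduced here.

```python
def expected_heterozygosity(freqs):
    """Uncorrected expected heterozygosity: 1 minus the sum of squared allele frequencies."""
    total = sum(freqs)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


if __name__ == "__main__":
    # Hypothetical locus with three length-based (LB) alleles.
    lb = [0.40, 0.35, 0.25]
    # Same locus after sequencing: the 0.40 allele resolves into two
    # sequence-based (SB) alleles of frequency 0.25 and 0.15.
    sb = [0.25, 0.15, 0.35, 0.25]
    h_lb = expected_heterozygosity(lb)
    h_sb = expected_heterozygosity(sb)
    print(h_lb, h_sb, (h_sb - h_lb) / h_lb)
```

Splitting an allele can only lower the sum of squared frequencies, so sequence-based heterozygosity is never smaller than the length-based value at the same locus.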

Autosomal STR and SNP characterization of populations from the Northeastern Peruvian Andes with the ForenSeq™ DNA Signature Prep Kit.

Forensic Sci Int Genet 2021 May 23;52:102487. Epub 2021 Feb 23.

Department of Forensic Medicine, University of Helsinki, PO Box 40, FI-00014 Helsinki, Finland; Forensic Medicine Unit, Finnish Institute for Health and Welfare, PO Box 30, FI-00271 Helsinki, Finland.

Autosomal DNA data from Peru for human identity testing purposes are scarce in the scientific literature, which hinders obtaining an appropriate portrait of the genetic variation of the resident populations. In this study we genetically characterize five populations from the Northeastern Peruvian Andes (Chachapoyas, Awajún, Wampís, Huancas and Cajamarca). Autosomal short tandem repeat (aSTR) and identity informative single nucleotide polymorphism (iiSNP) data from a total of 233 unrelated individuals are provided, and forensic genetic parameters are calculated for each population and for the combined set Northeastern Peruvian Andes. After correction for multiple testing in the whole dataset of the Northeastern Peruvian Andes, the only departure from Hardy-Weinberg equilibrium was observed in locus rs2111980. Twenty-one out of 27 aSTR loci exhibited an increased number of alleles due to sequence variation in the repeat motif and flanking regions. For iiSNPs, 33% of the loci displayed flanking region variation. The combined random match probability (RMP), assuming independence of all loci (aSTRs and iiSNPs), in the Chachapoyas, the population with the largest sample size (N = 172), was 8.14 × 10 for length-based data while for sequence-based was 4.15 × 10. In the merged dataset (Northeastern Peruvian Andes; N = 233), the combined RMPs when including all markers were 2.96 × 10 (length-based) and 3.21 × 10 (sequence-based). These new data help to fill up some of the gaps in the genetic canvas of South America and provide essential length- and sequence-based background information for other forensic genetic studies in Peru.
Source
http://dx.doi.org/10.1016/j.fsigen.2021.102487
May 2021

Graph Algorithms for Mixture Interpretation.

Genes (Basel) 2021 Jan 27;12(2). Epub 2021 Jan 27.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA.

The scale of genetic methods is presently being expanded: forensic genetic assays previously were limited to tens of loci, but now technologies allow for a transition to forensic genomic approaches that assess thousands to millions of loci. However, there are subtle distinctions between genetic assays and their genomic counterparts (especially in the context of forensics). For instance, forensic genetic approaches tend to describe a locus as a haplotype, be it a microhaplotype or a short tandem repeat with its accompanying flanking information. In contrast, genomic assays tend to provide not haplotypes but sequence variants or differences, variants which in turn describe how the alleles apparently differ from the reference sequence. By the given construction, mitochondrial genetic assays can be thought of as genomic as they often describe genetic differences in a similar way. The mitochondrial genetics literature makes clear that sequence differences, unlike the haplotypes they encode, are not comparable to each other. Different alignment algorithms and different variant calling conventions may cause the same haplotype to be encoded in multiple ways. This ambiguity can affect evidence and reference profile comparisons as well as how "match" statistics are computed. In this study, a graph algorithm is described (and implemented in the MMDIT (Mitochondrial Mixture Database and Interpretation Tool) R package) that permits the assessment of forensic match statistics on mitochondrial DNA mixtures in a way that is invariant to both the variant calling conventions followed and the alignment parameters considered. The algorithm described, given a few modest constraints, can be used to compute the "random man not excluded" statistic or the likelihood ratio. The performance of the approach is assessed in in silico mitochondrial DNA mixtures.
Source
http://dx.doi.org/10.3390/genes12020185
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7911948
January 2021

A Continuous Statistical Phasing Framework for the Analysis of Forensic Mitochondrial DNA Mixtures.

Genes (Basel) 2021 Jan 20;12(2). Epub 2021 Jan 20.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA.

Despite the benefits of quantitative data generated by massively parallel sequencing, resolving mitotypes from mixtures occurring in certain ratios remains challenging. In this study, a bioinformatic mixture deconvolution method centered on population-based phasing was developed and validated. The method was first tested on 270 in silico two-person mixtures varying in mixture proportions. An assortment of external reference panels containing information on haplotypic variation (from similar and different haplogroups) was leveraged to assess the effect of panel composition on phasing accuracy. Building on these simulations, mitochondrial genomes from the Human Mitochondrial DataBase were sourced to populate the panels and key parameter values were identified by deconvolving an additional 7290 in silico two-person mixtures. Finally, employing an optimized reference panel and phasing parameters, the approach was validated with in vitro two-person mixtures with differing proportions. Deconvolution was most accurate when the haplotypes in the mixture were similar to haplotypes present in the reference panel and when the mixture ratios were neither highly imbalanced nor subequal (e.g., 4:1). Overall, errors in haplotype estimation were largely bounded by the accuracy of the mixture's genotype results. The proposed framework is the first available approach that automates the reconstruction of complete individual mitotypes from mixtures, even in ratios that have traditionally been considered problematic.
Source
http://dx.doi.org/10.3390/genes12020128
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7909279
January 2021

STRait Razor Online: An enhanced user interface to facilitate interpretation of MPS data.

Forensic Sci Int Genet 2021 May 13;52:102463. Epub 2021 Jan 13.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA.

Since 2013, STRait Razor has enabled analysis of massively parallel sequencing (MPS) data from various marker systems such as short tandem repeats, single nucleotide polymorphisms, insertion/deletions, and mitochondrial DNA. In this paper, STRait Razor Online (SRO), available at https://www.unthsc.edu/straitrazor, is introduced as an interactive, Shiny-based user interface for primary analysis of MPS data and secondary analysis of STRait Razor haplotype pileups. This software can be accessed from any common browser via desktop, tablet, or smartphone device. SRO is also available as a standalone application and as an open-source R script at https://github.com/ExpectationsManaged/STRaitRazorOnline. The local application is capable of batch processing of both fastq files and primary analysis output. Processed batches generate individual report folders and summary reports at the locus- and haplotype-level in a matter of minutes. For example, the processing of data from ∼700 samples generated with the ForenSeq Signature Preparation Kit from allsequences.txt to a final table can be performed in ∼40 min whereas the Excel-based workbooks can take 35-60 h to compile a subset of the tables generated by SRO. To facilitate analysis of single-source, reference samples, a preliminary triaging system was implemented that calls potential alleles and flags loci suspected of severe heterozygote imbalance. When compared to published, manually curated data sets, 98.72 % of software-assigned allele calls without manual interpretation were consistent with curated data sets, 0.99 % of loci were presented to the user for interpretation due to heterozygote imbalance, and the remaining 0.29 % of loci were inconsistent due to the analytical thresholds used across the studies.
Source
http://dx.doi.org/10.1016/j.fsigen.2021.102463
May 2021

ProDerAl: Reference Position Dependent Alignment.

Bioinformatics 2021 Jan 18. Epub 2021 Jan 18.

Center for Human Identification, University of North Texas, Fort Worth, Texas.

Motivation: Current read-mapping software uses a singular specification of alignment parameters with respect to the reference. In the presence of varying reference structures (such as the repetitive regions of the human genome), alignments can be improved if those parameters are allowed to vary.

Results: To that end, the C++ program ProDerAl was written to refine previously generated alignments using varying parameters for these problematic regions. Synthetic benchmarks show that this realignment can result in an order of magnitude fewer misaligned bases.

Availability: *Nix users can retrieve the source from GitHub (https://github.com/Benjamin-Crysup/proderal.git). A Windows binary is available at https://github.com/Benjamin-Crysup/proderal/releases/download/v1.1/proderal.zip.

Supplementary Information: Supplementary data are available at Bioinformatics online.
Source
http://dx.doi.org/10.1093/bioinformatics/btab008
January 2021

Reducing noise and stutter in short tandem repeat loci with unique molecular identifiers.

Forensic Sci Int Genet 2021 Mar 25;51:102459. Epub 2020 Dec 25.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA.

Unique molecular identifiers (UMIs) are a promising approach to contend with errors generated during PCR and massively parallel sequencing (MPS). With UMI technology, random molecular barcodes are ligated to template DNA molecules prior to PCR, allowing PCR and sequencing error to be tracked and corrected bioinformatically. UMIs have the potential to be particularly informative for the interpretation of short tandem repeats (STRs). Traditional MPS approaches may simply lead to the observation of alleles that are consistent with the hypotheses of stutter, while with UMIs stutter products bioinformatically may be re-associated with their parental alleles and subsequently removed. Herein, a bioinformatics pipeline named strumi is described that is designed for the analysis of STRs that are tagged with UMIs. Unlike other tools, strumi is an alignment-free machine learning driven algorithm that clusters individual MPS reads into UMI families, infers consensus super-reads that represent each family and provides an estimate of the resulting haplotype's accuracy. Super-reads, in turn, approximate independent measurements not of the PCR products, but of the original template molecules, both in terms of quantity and sequence identity. Provisional assessments show that naïve threshold-based approaches generate super-reads that are accurate (∼97 % haplotype accuracy, compared to ∼78 % when UMIs are not used), and the application of a more nuanced machine learning approach increases the accuracy to ∼99.5 % depending on the level of certainty desired. With these features, UMIs may greatly simplify probabilistic genotyping systems and reduce uncertainty. However, the ability to interpret alleles at trace levels also permits the interpretation, characterization and quantification of contamination as well as somatic variation (including somatic stutter), which may present newfound challenges.
Source
http://dx.doi.org/10.1016/j.fsigen.2020.102459
March 2021
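The idea of collapsing reads into UMI families and deriving one consensus "super-read" per family, as described in the abstract above, can be sketched in a few lines of Python. This is a naive threshold-based illustration only and is not the strumi algorithm, which the abstract describes as alignment-free and machine learning driven. The function name, the exact-match grouping of UMIs, and both thresholds are hypothetical choices made for the example.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=3, min_fraction=0.6):
    """Collapse (umi, sequence) pairs into one consensus sequence per UMI.

    A family is kept only if it has at least `min_family_size` reads and
    its most common sequence accounts for at least `min_fraction` of them.
    Both thresholds are illustrative assumptions.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        families[umi].append(seq)

    super_reads = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to trust a consensus
        seq, count = Counter(seqs).most_common(1)[0]
        if count / len(seqs) >= min_fraction:
            super_reads[umi] = seq
    return super_reads


if __name__ == "__main__":
    # Hypothetical reads: family AAC contains one shorter, stutter-like read.
    reads = [
        ("AAC", "TCTATCTATCTA"),
        ("AAC", "TCTATCTATCTA"),
        ("AAC", "TCTATCTA"),
        ("GGT", "TCTATCTA"),
        ("GGT", "TCTATCTA"),
        ("GGT", "TCTATCTA"),
        ("CCA", "TCTA"),
    ]
    print(umi_consensus(reads))
```

In this toy input the shorter read inside family AAC is outvoted by its family, whereas the same sequence in family GGT is retained because every read in that family agrees.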

A novel approach for visualization and localization of small amounts of DNA on swabs to improve DNA collection and recovery process.

Analyst 2021 Feb 4;146(4):1198-1206. Epub 2021 Jan 4.

Department of Physics and Astronomy, Texas Christian University, Fort Worth, TX 76129, USA.

In this report, a simple and practical procedure is proposed for DNA localization on a solid matrix, e.g., a collection swab. The approach is straightforward and employs spectrum decomposition using a model DNA intercalator, ethidium bromide (EtBr). The proposed approach can detect picograms of DNA in solution and nanograms of DNA on solid surfaces (swabs) without the need for PCR amplification. The proposed technology offers the possibility for developing an inexpensive, sensitive, rapid, and practical method for localizing and recovering DNA deposited on collection swabs during routine DNA screening. Improved detection of low DNA concentrations is needed and, if feasible, will allow for better decision making in clinical medicine, biological and environmental research, and human identification in forensic investigations.
Source
http://dx.doi.org/10.1039/d0an02043e
February 2021

Genetic assessment reveals no population substructure and divergent regional and sex-specific histories in the Chachapoyas from northeast Peru.

PLoS One 2020 31;15(12):e0244497. Epub 2020 Dec 31.

Department of Forensic Medicine, University of Helsinki, Helsinki, Finland.

Many native populations in South America have been severely impacted by two relatively recent historical events, the Inca and the Spanish conquest. However decisive these disruptive events may have been, the populations and their gene pools have been shaped markedly also by the history prior to the conquests. This study focuses mainly on the Chachapoya peoples that inhabit the montane forests on the eastern slopes of the northern Peruvian Andes, but also includes three distinct neighboring populations (the Jívaro, the Huancas and the Cajamarca). By assessing mitochondrial, Y-chromosomal and autosomal diversity in the region, we explore questions that have emerged from archaeological and historical studies of the regional culture(s). These studies have shown, among other things, that Chachapoyas was a crossroads for Coast-Andes-Amazon interactions since very early times. In this study, we examine the following questions: 1) was there pre-Hispanic genetic population substructure in the Chachapoyas sample? 2) did the Spanish conquest cause a more severe population decline on Chachapoyan males than on females? 3) can we detect different patterns of European gene flow in the Chachapoyas region? and, 4) did the demographic history in the Chachapoyas resemble the one from the Andean area? Despite cultural differences within the Chachapoyas region as shown by archaeological and ethnohistorical research, genetic markers show no significant evidence for past or current population substructure, although an Amazonian gene flow dynamic in the northern part of this territory is suggested. The data also indicate a bottleneck c. 25 generations ago that was more severe among males than females, as well as divergent population histories for populations in the Andean and Amazonian regions. In line with previous studies, we observe high genetic diversity in the Chachapoyas, despite the documented dramatic population declines. The diverse topography and great biodiversity of the northeastern Peruvian montane forests are potential contributing agents in shaping and maintaining the high genetic diversity in the Chachapoyas region.
Source
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0244497
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7774974
March 2021

Developmental Validation of a MPS Workflow with a PCR-Based Short Amplicon Whole Mitochondrial Genome Panel.

Genes (Basel) 2020 Nov 13;11(11). Epub 2020 Nov 13.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Boulevard, Fort Worth, TX 76107, USA.

For the adoption of massively parallel sequencing (MPS) systems by forensic laboratories, validation studies on specific workflows are needed to support the feasibility of implementation and the reliability of the data they produce. As such, the whole mitochondrial genome sequencing methodology (Precision ID mtDNA Whole Genome Panel, Ion Chef, Ion S5, and Converge) has been subjected to a variety of developmental validation studies. These validation studies were completed in accordance with the Scientific Working Group on DNA Analysis Methods (SWGDAM) validation guidelines and assessed reproducibility, repeatability, accuracy, sensitivity, specificity to human DNA, and ability to analyze challenging (e.g., mixed, degraded, or low quantity) samples. Intra- and inter-run replicates produced an average maximum pairwise difference in variant frequency of 1.2%. Concordance with data generated with traditional Sanger sequencing and an orthogonal MPS platform methodology was used to assess accuracy, and generation of complete and concordant haplotypes at DNA input levels as low as 37.5 pg of nuclear DNA or 187.5 mitochondrial genome copies illustrated the sensitivity of the system. Overall, data presented herein demonstrate that highly accurate and reproducible results were generated for a variety of sample qualities and quantities, supporting the reliability of this specific whole genome mitochondrial DNA MPS system for analysis of forensic biological evidence.
Source
http://dx.doi.org/10.3390/genes11111345
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7709034
November 2020

Forensic investigation approaches of searching relatives in DNA databases.

J Forensic Sci 2021 Mar 2;66(2):430-443. Epub 2020 Nov 2.

Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA.

There are several indirect database searching approaches to identify the potential source of a forensic biological sample. These DNA-based approaches are familial searching, Y-STR database searching, and investigative genetic genealogy (IGG). The first two strategies use forensic DNA databases managed by the government, and the latter uses databases managed by private citizens or companies. Each of these search strategies relies on DNA testing to identify relatives of the donor of the crime scene sample, provided such profiles reside in the DNA database(s). All three approaches have been successfully used to identify the donor of biological evidence, which assisted in solving criminal cases or identifying unknown human remains. This paper describes and compares these approaches in terms of genotyping technologies, searching methods, database structures, searching efficiency, data quality, data security, and costs, and raises some potential privacy and legal considerations for further discussion by stakeholders and scientists. Y-STR database searching and IGG are advantageous since they are able to assist in more cases than familial searching by readily identifying distant relatives. In contrast, familial searching can be performed more readily with existing laboratory systems. Every country or state may have its own unique economic, technical, cultural, and legal considerations and should decide the best approach(es) to fit those circumstances. Regardless of the approach, the ultimate goal should be the same: generate investigative leads and solve active and cold criminal cases to support public safety, under stringent policies and security practices designed to protect the privacy of its citizenry.
Source
http://dx.doi.org/10.1111/1556-4029.14615
March 2021

Allelic frequencies with 23 autosomic STRS in the Aymara population of Peru.

Int J Legal Med 2021 May 21;135(3):779-781. Epub 2020 Oct 21.

Center for Human Identification, University of North Texas Health Science Center, Ft Worth, TX, 76107, USA.

Population data of the Aymara in the province of Puno were established for 23 autosomal STR markers. DNA was obtained from unrelated individuals (n = 190) who reside in three areas: the Floating Islands of Lake Titicaca, the border with Bolivia, and areas away from the border with Bolivia. The PENTA E marker presented the highest PD (0.9738), PIC (0.8793), and PM (0.7847) values. The combined PD was greater than 0.99999999 and the combined PE was 0.99999994. The largest distance, based on Fst values, was between the Aymara population and the Ashaninca population (0.04022), and the smallest distance was with the populations of Bolivia (0.00136) and Peru (0.00525).
Source
http://dx.doi.org/10.1007/s00414-020-02448-0
May 2021

Evaluation of 16S rRNA Hypervariable Regions for Bioweapon Species Detection by Massively Parallel Sequencing.

Int J Microbiol 2020 26;2020:8865520. Epub 2020 Sep 26.

Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil.

Molecular detection and classification of the bacterial groups in a sample are relevant in several areas, including medical research and forensics. Sanger sequencing of the 16S rRNA gene is considered the gold standard for microbial phylogenetic analysis. However, the development of massively parallel sequencing (MPS) offers enhanced sensitivity and specificity for microbiological analyses. In addition, 16S rRNA target amplification followed by MPS facilitates the combined use of multiple markers/regions, better discrimination of sample background, and higher sample throughput. We designed a novel set of 16S rRNA gene primers for detection of bacterial species associated with clinical, bioweapon, and biohazard microorganisms via alignment of 364 sequences representing 19 bacterial species and strains relevant to medical and forensic applications. In silico results indicated that the hypervariable regions (V1V2), (V4V5), and (V6V7V8) support the resolution of a selected group of bacteria. Interspecies and intraspecies comparisons showed 74.23%-85.51% and 94.48%-99.98% sequencing variation among species and strains, respectively. Sequence reads from a simulated scenario of bacterial species mapped to each of the three hypervariable regions of the respective species with different affinities. The minimum limit of detection was achieved using two different MPS platforms. This protocol can be used to detect or monitor as low as 2,000 genome equivalents of bacterial species associated with clinical, bioweapon, and biohazard microorganisms and potentially can distinguish natural outbreaks of pathogenic microorganisms from those occurring by intentional release.
Source
http://dx.doi.org/10.1155/2020/8865520
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7533751
September 2020

Are low LRs reliable?

Forensic Sci Int Genet 2020 Nov 8;49:102350. Epub 2020 Jul 8.

University of Auckland, Department of Statistics, Private Bag 92019, Auckland, New Zealand; Institute of Environmental Science and Research Limited, Private Bag 92021, Auckland, 1142 New Zealand.

Answering the question "Are low likelihood ratios reliable?" requires both a definition of reliable and a test of whether low likelihood ratios (LRs) meet that definition. We offer, from a purely statistical standpoint, that reliability can be determined by assessing whether the rate of inclusionary support for non-donors over many cases is not larger than expected from the LR value. Thus, it is not the magnitude of the LR alone that determines reliability. Turing's rule is used to inform the expected rate of non-donor inclusionary support, where the rate of non-donor inclusionary support is at most the reciprocal of the LR, i.e. Pr(LR > x | H) ≤ 1/x. There are parallel concerns about whether the value of the evidence can be communicated. We do not discuss that in depth here although it is an important consideration to be addressed with training. In this paper, we use a mixture of real and simulated data to show that the rate of non-donor inclusionary support for these data is significantly lower than the upper bound given by Turing's rule. We take this as strong evidence that low LRs are reliable.
Source
http://dx.doi.org/10.1016/j.fsigen.2020.102350
November 2020
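The bound quoted in the abstract above, Pr(LR > x | H) ≤ 1/x, can be checked exactly on a small discrete example. The Python sketch below takes H to be the proposition under which the person of interest is a non-donor, which is an interpretation based on the surrounding sentence. The bound follows from Markov's inequality because the expected LR under that proposition is at most 1. The evidence outcomes and their probabilities are hypothetical, and the function name is an illustrative assumption.

```python
def non_donor_exceedance(p_donor, p_non_donor, x):
    """Exact Pr(LR > x | non-donor) over a finite set of evidence outcomes.

    p_donor and p_non_donor map each outcome to its probability under the
    donor and non-donor propositions. LR = p_donor / p_non_donor.
    """
    total = 0.0
    for outcome, q in p_non_donor.items():
        if q <= 0.0:
            continue  # outcome cannot occur for a non-donor
        lr = p_donor.get(outcome, 0.0) / q
        if lr > x:
            total += q
    return total


if __name__ == "__main__":
    # Hypothetical outcome probabilities under each proposition.
    p_donor = {"a": 0.7, "b": 0.2, "c": 0.1}
    p_non_donor = {"a": 0.1, "b": 0.3, "c": 0.6}
    for x in (0.5, 2.0, 5.0, 7.0):
        rate = non_donor_exceedance(p_donor, p_non_donor, x)
        print(x, rate, rate <= 1.0 / x)
```

In this toy example the observed rate sits well below the bound for every threshold, which mirrors the kind of comparison the abstract describes without reproducing its data.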

Distinguishing mitochondrial DNA and NUMT sequences amplified with the precision ID mtDNA whole genome panel.

Mitochondrion 2020 Nov 17;55:122-133. Epub 2020 Sep 17.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Boulevard, Fort Worth, TX 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Boulevard, Fort Worth, TX 76107, USA.

Nuclear mitochondrial DNA segments (NUMTs) are generated via transfer of portions of the mitochondrial genome into the nuclear genome. Given their common origin, there is the possibility that both the mitochondrial and NUMT segments may co-amplify using the same set of primers. Thus, analysis of the variation of the mitochondrial genome must take into account this co-amplification of mitochondrial and NUMT sequences. The study herein builds on data from Strobl et al. (2019), in which multiple point heteroplasmies were called with an "N" to prevent NUMT sequences that mimic mitochondrial heteroplasmy from being interpreted as true mitochondrial sequence variants. Each of these point heteroplasmies was studied in greater detail, both molecularly and bioinformatically, to determine whether NUMT or true mitochondrial DNA variation was present. The bioinformatic and molecular tools available to help distinguish between NUMT and mitochondrial DNA and the effect of NUMT sequences on interpretation were discussed.
Source
http://dx.doi.org/10.1016/j.mito.2020.09.001
November 2020

Population genetic study of a Peruvian population using human identification STRs.

Int J Legal Med 2020 Nov 2;134(6):2071-2073. Epub 2020 Sep 2.

Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, 76107, USA.

In this study, allele frequencies were determined in a Peruvian population for application to human identification. A population of 601 unrelated individuals was analyzed (400 individuals with the GlobalFiler Express kit and 201 individuals with the VeriFiler Express kit). The locus with the highest power of discrimination (PD) was SE33 (0.9851, 31 alleles), while the least polymorphic locus was D22S1045 (0.75810, 11 alleles). The PE in a similar fashion ranged from 0.2421 (D22S1045) to 0.7818 (SE33). Under the assumption of independence, the combined PD was > 0.9999999999 while the combined PE = 0.9999999933. When comparing the population studied with different populations of Latin America, the greatest Fst genetic distance was obtained with a Venezuelan population (0.052), and the shortest distance was with a Bolivian and Peruvian population (0.004).

Source
http://dx.doi.org/10.1007/s00414-020-02418-6
November 2020

How many familial relationship testing results could be wrong?

PLoS Genet 2020 08 13;16(8):e1008929. Epub 2020 Aug 13.

Center for Human Identification, University of North Texas Health Science Center, Fort Worth, Texas, United States of America.

Source
http://dx.doi.org/10.1371/journal.pgen.1008929
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7425842
August 2020

A standalone humanitarian DNA identification database system to increase identification of human remains of foreign nationals.

Int J Legal Med 2020 Nov 6;134(6):2039-2044. Epub 2020 Aug 6.

Center for Human Identification, Graduate School of Biomedical Sciences, University of North Texas Health Science Center, 3500 Camp Bowie Blvd, CBH-250, Ft Worth, TX, 76107, USA.

The identification of missing persons and human remains is a worldwide problem which has been exacerbated by increased migrations and rampant human trafficking and smuggling cases. DNA typing and DNA databases are primary tools and resources used to help identify human remains and missing persons. The foundation of most, if not all, national DNA database systems, e.g., CODIS, is law enforcement identification. With such database systems, compliance with statutory and operational requirements is necessary to ensure the integrity of the databases. However, because of conditions in their homelands, relatives of missing persons at times may not trust the government and may be reluctant to contact a law enforcement agency, making it difficult to satisfy the law enforcement nexus necessary for entry into a national DNA database. As a potential solution to increase the identification of unidentified human remains found within the USA, such as those that may be of foreign nationals, the University of North Texas Center for Human Identification (UNTCHI) has created a Humanitarian DNA Identification Database (HDID) that enables family reference sample DNA profiles from non-US citizens to be compared with the DNA profiles from unidentified human remains within its local database system. This short communication describes the needs, basis, policies, and practices to inform the scientific, investigative, and legal communities and the public so that various entities may become aware and consider submitting family reference sample (FRS) profiles from foreign nationals for the purpose of searching against UNTCHI's HDID. It is our hope that by creating this HDID, another vehicle is available to support identification of human remains within the USA and to bring much needed answers to the family members of missing persons. The HDID will merge high forensic quality and best practices with broader accessibility for non-US families to voluntarily donate DNA profiles for searching for missing loved ones.

Source
http://dx.doi.org/10.1007/s00414-020-02396-9
November 2020

Numt identification and removal with RtN!

Bioinformatics 2020 12;36(20):5115-5116

Department of Microbiology, Immunology and Genetics.

Motivation: Assays in mitochondrial genomics rely on accurate read mapping and variant calling. However, there are known and unknown nuclear paralogs whose genetic properties differ fundamentally from those of the mitochondrial genome. Such paralogs complicate the interpretation of mitochondrial genome data and confound variant calling.

Results: Remove the Numts! (RtN!) was developed to categorize reads from massively parallel sequencing data not based on the expected properties and sequence identities of paralogous nuclear encoded mitochondrial sequences, but instead using sequence similarity to a large database of publicly available mitochondrial genomes. RtN! removes low-level sequencing noise and mitochondrial paralogs without impacting variant calling, whereas competing methods were shown to remove true variants from mitochondrial mixtures.

Availability and Implementation: https://github.com/Ahhgust/RtN.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
http://dx.doi.org/10.1093/bioinformatics/btaa642
December 2020

The lot-to-lot variability in the mitochondrial genome of controls.

Forensic Sci Int Genet 2020 07 30;47:102298. Epub 2020 Apr 30.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Boulevard, Fort Worth, TX 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Boulevard, Fort Worth, TX 76107, USA.

Current research in the biomedical field has illustrated how cell lines used as reference standards can change over time and, more importantly, can affect research and diagnostic results obtained from these cell lines. With the use of increasingly sensitive and highly resolving technologies (e.g., massively parallel sequencing), forensic scientists must be aware of and account for potential variability in the cell lines used as controls in their validation studies and day-to-day casework. In this study, multiple lot numbers from four commonly used control cell line DNAs were sequenced with massively parallel sequencing on the Ion S5. The variability among these different lots was evaluated, and the effect on forensic laboratory work was discussed.

Source
http://dx.doi.org/10.1016/j.fsigen.2020.102298
July 2020

An algorithm for random match probability calculation from peptide sequences.

Forensic Sci Int Genet 2020 07 6;47:102295. Epub 2020 Apr 6.

Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, United States.

For the past three decades, forensic genetic investigations have focused on elucidating DNA signatures. While DNA has a number of desirable properties (e.g., presence in most biological materials, an amenable chemistry for analysis and well-developed statistics), DNA also has limitations. DNA may be in low quantity in some tissues, such as hair, and in some tissues it may degrade more readily than its protein counterparts. Recent research efforts have shown the feasibility of performing protein-based human identification in cases in which recovery of DNA is challenged; however, the methods involved in assessing the rarity of a given protein profile have not been addressed adequately. In this paper an algorithm is proposed that describes the computation of a random match probability (RMP) resulting from a genetically variable peptide signature. The approach described herein explicitly models proteomic error and genetic linkage, makes no assumptions as to allelic drop-out, and maps the observed proteomic alleles to their expected protein products from DNA which, in turn, permits standard corrections for population structure and finite database sizes. To assess the feasibility of this approach, RMPs were estimated from peptide profiles of skin samples from 25 individuals of European ancestry. 126 common peptide alleles were used in this approach, yielding a mean RMP of approximately 10.

Source
http://dx.doi.org/10.1016/j.fsigen.2020.102295
July 2020

Forensic genetic investigation of human skeletal remains recovered from the La Belle shipwreck.

Forensic Sci Int 2020 Jan 12;306:110050. Epub 2019 Nov 12.

Center for Human Identification, Research and Development Unit, University of North Texas Health Science Center, 3500 Camp Bowie Boulevard, Fort Worth, TX 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Boulevard, Fort Worth, TX 76107, USA.

In 1995, the historical shipwreck of La Belle was discovered off the coast of Texas. One partial human skeleton was recovered from alongside cargo in the rear portion of the ship; a second (complete) skeleton was found atop coiled anchor rope in the bow. In late 2015, comprehensive forensic genetic testing began on multiple samplings from each set of remains. For the partial skeleton recovered from the ship's rear cargo area, results were obtained for 26/27 Y-STRs using traditional CE; with MPS technology, results were obtained for 18/24 Y-STRs, 56/56 ancestry-informative SNPs (aiSNPs), 22/22 phenotype-informative SNPs (piSNPs), 22/27 autosomal STRs, 4/7 X-STRs, and 94/94 identity-informative SNPs (iiSNPs). For the complete skeleton of the second individual, results were obtained for 7/17 Y-STRs using traditional CE; with MPS technology, results were obtained for 5/24 Y-STRs, 49/56 aiSNPs, 18/22 piSNPs, 15/27 autosomal STRs, 1/7 X-STRs, and 66/94 iiSNPs. Biogeographic ancestry for each set of skeletal remains was predicted using the ancestry feature and metapopulation tool of the Y-STR Haplotype Reference Database (YHRD), Haplogroup Predictor, and the Forensic Research/Reference on Genetics knowledge base (FROG-kb). Phenotype prediction was performed using piSNP data and the HIrisPlex eye color and hair color DNA phenotyping webtool. Whole-genome mtDNA sequencing also was performed successfully. This study highlights the sensitivity of current forensic laboratory methods in recovering DNA from historical and archaeological human remains. Using advanced sequencing technology provided by MiSeq™ FGx (Verogen) and Ion S5™ (Thermo Fisher Scientific) instrumentation, degraded skeletal remains can be characterized using a panel of diverse and highly informative markers, producing data which can be useful in both forensic and genealogical investigations.

Source
http://dx.doi.org/10.1016/j.forsciint.2019.110050
January 2020

Reverse Complement PCR: A novel one-step PCR system for typing highly degraded DNA for human identification.

Forensic Sci Int Genet 2020 01 6;44:102201. Epub 2019 Nov 6.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology, and Genetics, Graduate School of Biomedical Sciences, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA.

Reverse Complement PCR (RC-PCR) is an innovative, one-step PCR target enrichment technology adapted for the amplification of highly degraded (fragmented) DNA. It provides simultaneous amplification and tagging of a targeted sequence construct in a single, closed-tube assay. A human identification (HID) RC-PCR panel was designed targeting 27 identity single nucleotide polymorphisms (SNPs), generating targets only 50 base pairs in length. In a single reaction, the complete sequencing construct essential for massively parallel sequencing (MPS) library preparation is produced, thus reducing time and labor as well as minimizing the risk of sample carry-over or other forms of contamination. The RC-PCR system was evaluated and found to produce reliable and concordant variant calls. The RC-PCR system also demonstrated substantial sensitivity of detection, with a majority of alleles detected at 60 pg of input DNA, and robustness in tolerating known PCR inhibitors. The RC-PCR system may be an effective alternative to current forensic genetic methods in the analysis of highly degraded DNA.

Source
http://dx.doi.org/10.1016/j.fsigen.2019.102201
January 2020

Utility of the Ion S5™ and MiSeq FGx™ sequencing platforms to characterize challenging human remains.

Leg Med (Tokyo) 2019 Nov 14;41:101623. Epub 2019 Aug 14.

Center for Human Identification, University of North Texas Health Science Center, Fort Worth, TX, USA.

Often in missing persons and mass disaster cases, the samples remaining for analysis are hard tissues such as bones, teeth, nails, and hair. These remains may have been exposed to harsh environmental conditions, which pose challenges for downstream genotyping. Short tandem repeat (STR) analysis via capillary electrophoresis (CE) is still the gold standard for DNA typing; however, a newer technology known as massively parallel sequencing (MPS) could improve upon current techniques by typing different and more markers in a single analysis, and consequently improving the power of discrimination. In this study, bone and tooth samples exposed to a variety of DNA insults (cremation, embalming, decomposition, thermal degradation, and fire) were assessed and sequenced using the Precision ID chemistry and a custom AmpliSeq™ STR and iiSNP panel on the Ion S5™ System, and the ForenSeq DNA Signature Prep Kit on the MiSeq FGx™ system, as well as the GlobalFiler™ PCR Amplification Kit on the 3500™ Genetic Analyzer. The results demonstrated that traditional CE-based genotyping performed as expected, producing a partial or full DNA profile for all samples, and that both sequencing chemistries and platforms were able to recover sufficient STR and SNP information from a majority of the same challenging samples. Run metrics including profile completeness and mean read depth produced good results with each system, considering the degree of damage of some samples. Most sample insults (except decomposition) produced similar numbers of alleles for both MPS systems. Comparable markers produced full concordance between the two platforms.

Source
http://dx.doi.org/10.1016/j.legalmed.2019.08.001
November 2019

A novel phylogenetic approach for de novo discovery of putative nuclear mitochondrial (pNumt) haplotypes.

Forensic Sci Int Genet 2019 11 14;43:102146. Epub 2019 Aug 14.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA; Department of Microbiology, Immunology, and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA.

Current approaches for parsing true variation (i.e., signal) from noise broadly involve estimating a baseline value of the latter, below which all sequence data are ignored. In an effort to deliver a more objective criterion for setting such thresholds, a novel approach based on phylogenetic principles is presented here. Our method deconstructs a special category of noise from true mitochondrial genome data, namely nuclear insertions of mitochondrial DNA (Numts). This bioinformatic approach leverages the relationship of massively parallel sequence reads and is capable of discovering putative Numts (pNumts) in the absence of a reference genome. The new method was tested on a whole mitochondrial genome dataset (n = 41 individuals from an admixed population sample from Rio de Janeiro) and led to the discovery of 451 pNumt variants. Comparison of these pNumt haplotypes against an existing Numt database revealed 147 exact matches to previously discovered Numts, while 122 haplotypes differed only by a single base pair and none matched exclusively to the mitochondrial genome. In general, these sequences were considerably more divergent from the mitochondrial genome than from those of the Numt database, supporting that the novel pNumts were probably hitherto uncatalogued variants. Unlike previous techniques, our method appears to be able to detect both polymorphic and fixed Numt sequences. It was also found that the region containing the D-Loop and associated Promoters (DLP) in the human mitochondrial genome, which harbors markers of forensic genetics importance, is the origin of several Numts. Though currently designed for the mitochondrial genome, our novel approach has the potential to be expanded to other scenarios that might require construing signal from noise, including the deconvolution of mixtures, thus significantly improving how analytical thresholds may be established.

Source
http://dx.doi.org/10.1016/j.fsigen.2019.102146
November 2019

Evaluation of mitogenome sequence concordance, heteroplasmy detection, and haplogrouping in a worldwide lineage study using the Precision ID mtDNA Whole Genome Panel.

Forensic Sci Int Genet 2019 09 23;42:244-251. Epub 2019 Jul 23.

Institute of Legal Medicine, Medical University of Innsbruck, Innsbruck, Austria; Forensic Science Program, The Pennsylvania State University, University Park, PA, USA.

The emergence of Massively Parallel Sequencing technologies enabled the analysis of full mitochondrial (mt)DNA sequences from forensically relevant samples that have, so far, only been typed in the control region or its hypervariable segments. In this study, we evaluated the performance of a commercially available multiplex-PCR-based assay, the Precision ID mtDNA Whole Genome Panel (Thermo Fisher Scientific), for the amplification and sequencing of the entire mitochondrial genome (mitogenome) from even degraded forensic specimens. For this purpose, more than 500 samples from 24 different populations were selected to cover the vast majority of established superhaplogroups. These are known to harbor different signature sequence motifs corresponding to their phylogenetic background that could have an effect on primer binding and, thus, could limit a broad application of this molecular genetic tool. The selected samples derived from various forensically relevant tissue sources, and DNA was extracted using different methods. We evaluated sequence concordance and heteroplasmy detection and compared the findings to conventional Sanger sequencing as well as an orthogonal MPS platform. We discuss advantages and limitations of this approach with respect to forensic genetic workflow and analytical requirements.

Source
http://dx.doi.org/10.1016/j.fsigen.2019.07.013
September 2019

Linkage, recombination, and mutation rate analyses of 19 X-chromosomal STR loci in Chinese Southern Han pedigrees.

Int J Legal Med 2019 Nov 17;133(6):1691-1698. Epub 2019 Jul 17.

Guangzhou Forensic Science Institute, 1708 Baiyun Avenue, Guangzhou, 510030, China.

We analyzed 19 X-STR markers in Southern Han Chinese samples for linkage, linkage disequilibrium (LD), and mutation rate. The data were collected from two- and three-generation Southern Han Chinese families. These data suggested that both linkage and linkage disequilibrium should be considered when calculating likelihood ratios with X-STR markers in relationship tests. The linkage disequilibrium of these 19 X-STR markers was calculated in our previous study of a Southern Han Chinese population. In this study, the recombination fractions between pairs of markers were compared with those obtained from the second-generation Rutgers combined linkage-physical map of the human genome. The observed differences indicated that recombination was not homogeneous along the X chromosome. Therefore, we evaluated the effect on likelihood calculations by referring to haplotype frequencies obtained from allele distributions rather than haplotype counts of the Southern Han Chinese population.
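
Comparing observed recombination fractions with a genetic map requires converting map distance into an expected recombination fraction. The abstract does not state which map function was used, so the sketch below simply shows the two textbook choices (Haldane and Kosambi) as an assumption. The function names and the example distances are illustrative and are not values from the study.

```python
import math


def haldane_theta(distance_cm: float) -> float:
    """Haldane map function (assumes no crossover interference)."""
    d = distance_cm / 100.0  # centimorgans -> Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def kosambi_theta(distance_cm: float) -> float:
    """Kosambi map function (allows for crossover interference)."""
    d = distance_cm / 100.0  # centimorgans -> Morgans
    return 0.5 * math.tanh(2.0 * d)


# Illustrative distances only; both functions approach 0.5 (free
# recombination) as the distance grows.
for cm in (1, 10, 50):
    print(cm, round(haldane_theta(cm), 4), round(kosambi_theta(cm), 4))
```

An observed pedigree-based recombination fraction that differs markedly from either expectation for a marker pair is the kind of discrepancy the abstract attributes to non-homogeneous recombination along the chromosome.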

Source
http://dx.doi.org/10.1007/s00414-019-02121-1
November 2019

Massively parallel sequence data of 31 autosomal STR loci from 496 Spanish individuals revealed concordance with CE-STR technology and enhanced discrimination power.

Forensic Sci Int Genet 2019 09 14;42:49-55. Epub 2019 Jun 14.

Center for Human Identification, University of North Texas Health Science Center, USA.

This study reports Short Tandem Repeat (STR) sequence-based allele data from 496 Spanish individuals across 31 autosomal STR (auSTR) loci included in the Precision ID GlobalFiler™ NGS STR Panel v2: D12S391, D13S317, D8S1179, D21S11, D3S1358, D5S818, D1S1656, D2S1338, vWA, D2S441, D5S2800, D7S820, D16S539, D6S474, D12ATA63, D4S2408, D6S1043, D19S433, D14S1434, CSF1PO, D10S1248, D18S51, D1S1677, D22S1045, D2S1776, D3S4529, FGA, Penta D, Penta E, TH01 and TPOX. The sequence of each allele was aligned to the reference sequence GRCh37 (hg19) and formatted according to the guidance of the International Society for Forensic Genetics. A subset of 221 samples was evaluated for testing concordance with allele calls derived from CE-based analysis using PowerPlex Fusion 6C, and there was 99.95% allele concordance. Twenty-five out of 31 auSTR loci showed an increased number of alleles due to repeat region sequence variation and/or single nucleotide polymorphisms (SNPs) residing in the flanking regions. A total of 18 loci showed increased observed heterozygosity due to sequence variation; the loci exhibiting the greatest increase were: D13S317 (12% points), D5S818 (10% points), D8S1179 (7% points), D3S1358 (7% points), and D21S11 (6% points). The combined match probability decreased from 2.022E-24 (length-based data) to 1.042E-27 (sequence-based data) for the 20 CODIS core STR loci. The combined match probability (sequence-based data) for the 31 STR loci studied was 4.777E-40. The combined typical paternity index increased from 1.118E+12 to 8.179E+13 using length and sequence-based data, respectively. This Spanish population study performed in the framework of the EU-funded DNASEQEX project is expected to provide STR sequence-based allele frequencies for forensic casework and support implementation of massively parallel sequencing (MPS) technology in forensic laboratories.
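
The size of the gain from sequence-based alleles can be read directly off the reported figures. The short sketch below uses only the combined values quoted in the abstract; the variable names are illustrative.

```python
import math

# Combined match probabilities for the 20 CODIS core loci (from the abstract)
rmp_length = 2.022e-24
rmp_sequence = 1.042e-27

# Combined typical paternity index (from the abstract)
pi_length = 1.118e12
pi_sequence = 8.179e13

# Fold change: how many times rarer a matching profile becomes, and how many
# times larger the paternity index becomes, with sequence-based data.
rmp_fold = rmp_length / rmp_sequence
pi_fold = pi_sequence / pi_length

print(f"match probability is ~{rmp_fold:.0f}x smaller "
      f"({math.log10(rmp_fold):.2f} orders of magnitude)")
print(f"paternity index is ~{pi_fold:.0f}x larger")
```

On these figures the match probability drops by roughly three orders of magnitude, while the paternity index rises by a little under two.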

Source
http://dx.doi.org/10.1016/j.fsigen.2019.06.009
September 2019

Copan microFLOQ® Direct Swab collection of bloodstains, saliva, and semen on cotton cloth.

Int J Legal Med 2020 Jan 4;134(1):45-54. Epub 2019 Jun 4.

Center for Human Identification, UNT Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX, 76107, USA.

The microFLOQ® Direct Swab was tested by sampling diluted blood, semen, and saliva stains deposited on cotton cloth. DNA typing was performed using the PowerPlex® Fusion 6C System by direct PCR or a modified direct PCR. Direct PCR of swabs that sampled the center of a stain, compared to their respective edge samplings, had higher profile completeness and total relative fluorescent units (RFU) for all dilutions of blood and semen stains tested. The modified direct PCR used template DNA eluted from the swab head using the Casework Direct Kit, Custom, and washes contained either 1-thioglycerol (TG) additive or no TG. Modified direct PCR had mixed results for blood, saliva, and semen stains, with semen stains showing significant differences in profile completeness (5% and 1%) and total RFU (neat, 5% and 1%) with the addition of TG to the Casework Direct Reagent. No significant difference was seen in any dilution of blood or saliva stains processed with the modified direct PCR, but profile completeness and total RFU were improved overall compared to stains swabbed with cotton swabs or 4N6FLOQSwabs™. This study supports the hypothesis that the microFLOQ® Direct Swab is able to collect minute amounts of DNA from cotton cloth and may be considered as an alternate pre-screening methodology in forensic biology casework.

Source
http://dx.doi.org/10.1007/s00414-019-02081-6
January 2020