
1.

FDSTools: A software package for analysis of massively parallel sequencing data with the ability to recognise and correct STR stutter and other PCR or sequencing noise

《Forensic science international. Genetics》2017

Massively parallel sequencing (MPS) is on the verge of broad-scale application in forensic research and casework. The improved capability to analyse evidentiary traces representing unbalanced mixtures is often mentioned as one of the major advantages of this technique. However, most of the available software packages that analyse forensic short tandem repeat (STR) sequencing data are not well suited for high-throughput analysis of such mixed traces. The largest challenge is the presence of stutter artefacts in STR amplifications, which are not readily discerned from minor contributions. FDSTools is an open-source software solution developed for this purpose. The level of stutter formation is influenced by various aspects of the sequence, such as the length of the longest uninterrupted stretch occurring in an STR. When MPS is used, STRs are evaluated as sequence variants that each have particular stutter characteristics which can be precisely determined. FDSTools uses a database of reference samples to determine stutter and other systemic PCR or sequencing artefacts for each individual allele. In addition, stutter models are created for each repeating element in order to predict stutter artefacts for alleles that are not included in the reference set. This information is subsequently used to recognise and compensate for the noise in a sequence profile. The result is a better representation of the true composition of a sample. Using Promega PowerSeq™ Auto System data from 450 reference samples and 31 two-person mixtures, we show that the FDSTools correction module decreases stutter ratios above 20% to below 3%. Consequently, much lower levels of contributions in the mixed traces are detected. FDSTools contains modules to visualise the data in an interactive format allowing users to filter data with their own preferred thresholds.
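
The stutter-ratio idea in this abstract can be illustrated with a minimal Python sketch. It is not FDSTools code and does not use its API: the function names, the read counts and the 20% expected ratio are all hypothetical, chosen only to show how subtracting a predicted stutter contribution lowers the residual ratio.

```python
def stutter_ratio(read_counts, parent, offset=-1):
    """Ratio of reads at the stutter position to reads of the parent allele.

    read_counts maps allele (repeat number) -> read count.
    offset=-1 is the common one-repeat contraction stutter.
    """
    parent_reads = read_counts.get(parent, 0)
    if parent_reads == 0:
        raise ValueError("parent allele has no reads")
    return read_counts.get(parent + offset, 0) / parent_reads


def correct_stutter(read_counts, parent, expected_ratio, offset=-1):
    """Subtract the predicted stutter reads (assumed known) from the stutter position."""
    corrected = dict(read_counts)
    predicted = expected_ratio * read_counts.get(parent, 0)
    position = parent + offset
    corrected[position] = max(0.0, corrected.get(position, 0) - predicted)
    return corrected


# Hypothetical single-source locus: allele 12 with stutter at 11.
counts = {12: 1000, 11: 220, 13: 15}
before = stutter_ratio(counts, parent=12)
after = stutter_ratio(correct_stutter(counts, 12, expected_ratio=0.20), parent=12)
print(f"stutter ratio before: {before:.2f}, after correction: {after:.2f}")
```

In a mixture, reads that remain at the stutter position after such a correction are easier to attribute to a minor contributor than the uncorrected signal.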

2.

Degradation in forensic trace DNA samples explored by massively parallel sequencing

《Forensic science international. Genetics》2017

Routine forensic analysis using STRs will fail if the DNA is too degraded. The DNA degradation process in biological stain material is not well understood. In this study we sequenced old semen and blood stains by massively parallel sequencing. The sequence data coverage was used to measure degradation across the genome. The results supported the contention that degradation is uniform across the genome, showing no evidence of regions with increased or decreased resistance towards degradation. Thus the lack of genetic regions robust to degradation removes the possibility of using such regions to further optimize analysis performance for degraded DNA.

3.

Investigation of length heteroplasmy in mitochondrial DNA control region by massively parallel sequencing

《Forensic science international. Genetics》2017

Accurate sequencing of the control region of the mitochondrial genome is notoriously difficult due to the presence of polycytosine bases, termed C-tracts. The precise number of bases that constitute a C-tract and the bases beyond the poly cytosines may not be accurately defined when analyzing Sanger sequencing data separated by capillary electrophoresis. Massively parallel sequencing has the potential to resolve such poor definition and provides the opportunity to discover variants due to length heteroplasmy. In this study, the control region of mitochondrial genomes from 20 samples was sequenced using both standard Sanger methods with separation by capillary electrophoresis and also using massively parallel DNA sequencing technology. After comparison of the two sets of generated sequence, with the exception of the C-tracts where length heteroplasmy was observed, all sequences were concordant. Sequences of three segments 16184–16193, 303–315 and 568–573 with C-tracts in HVI, II and III can be clearly defined from the massively parallel sequencing data using the program SEQ Mapper. Multiple sequence variants were observed in the length of C-tracts longer than 7 bases. Our report illustrates the accurate designation of all the length variants leading to heteroplasmy in the control region of the mitochondrial genome that can be determined by SEQ Mapper based on data generated by massively parallel DNA sequencing.

4.

MAPlex - A massively parallel sequencing ancestry analysis multiplex for Asia-Pacific populations

《Forensic science international. Genetics》2019

Current forensic ancestry-informative panels are limited in their ability to differentiate populations in the Asia-Pacific region. MAPlex (Multiplex for the Asia-Pacific), a massively parallel sequencing (MPS) assay, was developed to improve differentiation of East Asian, South Asian and Near Oceanian populations found in the extensive cross-continental Asian region that shows complex patterns of admixture at its margins. This study reports the development of MAPlex; the selection of SNPs in combination with microhaplotype markers; assay design considerations for reducing the lengths of microhaplotypes while preserving their ancestry-informativeness; adoption of new population-informative multiple-allele SNPs; compilation of South Asian-informative SNPs suitable for forensic AIMs panels; and the compilation of extensive reference and test population genotypes from online whole-genome-sequence data for MAPlex markers. STRUCTURE genetic clustering software was used to gauge the ability of MAPlex to differentiate a broad set of populations from South and East Asia, the West Pacific regions of Near Oceania, as well as the other globally distributed population groups. Preliminary assessment of MAPlex indicates enhanced South Asian differentiation with increased divergence between West Eurasian, South Asian and East Asian populations, compared to previous forensic SNP panels of comparable scale. In addition, MAPlex shows efficient differentiation of Middle Eastern individuals from Europeans. MAPlex is the first forensic AIM assay to combine binary and multiple-allele SNPs with microhaplotypes, adding the potential to detect and analyze mixed source forensic DNA.

5.

High sensitivity multiplex short tandem repeat loci analyses with massively parallel sequencing

《Forensic science international. Genetics》2015

STR typing in forensic genetics has traditionally been performed using capillary electrophoresis (CE). However, CE-based methods have some limitations: only a small number of STR loci can be used, and stutter products, dye artifacts and low-level alleles complicate interpretation. Massively parallel sequencing (MPS) has been considered a viable technology in recent years, allowing high-throughput coverage at a relatively affordable price. Some of the CE-based limitations may be overcome with the application of MPS. In this study, a prototype multiplex STR System (Promega) was amplified and prepared using the TruSeq DNA LT Sample Preparation Kit (Illumina) in 24 samples. Results showed that the MinElute PCR Purification Kit (Qiagen) was a better size selection method than the recommended diluted bead mixtures. The library input sensitivity study showed that a wide range of amplicon product (6–200 ng) could be used for library preparation without apparent differences in the STR profile. The PCR sensitivity study indicated that 62 pg may be the minimum input amount for generating complete profiles. Reliability study results on 24 different individuals showed that high depth of coverage (DoC) and balanced heterozygote allele coverage ratios (ACRs) could be obtained with 250 pg of input DNA, and 62 pg could generate complete or nearly complete profiles. These studies indicate that this STR multiplex system and the Illumina MiSeq can generate reliable STR profiles at a sensitivity level that competes with currently widely used CE-based methods.

6.

Predicting the origin of stains from whole miRNome massively parallel sequencing data

《Forensic science international. Genetics》2019

In this study, we have screened the six most relevant forensic body fluids / tissues, namely blood, semen, saliva, vaginal secretion, menstrual blood and skin, for miRNAs using a whole miRNome massively parallel sequencing approach. We applied partial least squares (PLS) and linear discriminant analysis (LDA) to predict body fluids based on the expression of the miRNA markers. We estimated the prediction accuracy for models including different subsets of miRNA markers to identify the minimum number of markers needed for sufficient prediction performance. For one selected model consisting of 9 miRNA markers we calculated their importance for prediction of each of the six different body fluid categories.

7.

Pairwise kinship analysis of 17 pedigrees using massively parallel sequencing

《Forensic science international. Genetics》2022

With the tremendous development of massively parallel sequencing (MPS) in the last decade, it has been widely applied in basic science, clinical diagnostics, microbial genomics, as well as forensic genetics. MPS has many advantages that may facilitate kinship analysis. In this study, 243 Chinese Han individuals from 17 families were enrolled and sequenced using the ForenSeq™ DNA Signature Prep Kit (Verogen, Inc., San Diego, USA), which provided the sequence information of 27 autosomal STRs (A-STRs), 7 X chromosomal STRs (X-STRs), 24 Y chromosomal STRs (Y-STRs) and 94 identity-informative SNPs (iSNPs). A total of 275 pairs of parent-child, 123 pairs of full siblings, 1 pair of twins, 1 pair of half siblings, 158 pairs of grandparent-grandchild, 222 pairs of uncle/aunt-nephew/niece and 121 pairs of first cousins, as well as 701 pairs of unrelated individuals were identified. Using both likelihood ratio (LR) and identical by state (IBS) methods, kinship analysis was conducted among these relative and non-relative pairs based on the A-STRs and SNPs. As a result, the ForenSeq Signature Kit could resolve parent-child (t1 = −4, t2 = 4), full sibling (t1 = −2, t2 = 2) and most second-degree kinships (t1 = −1, t2 = 1) using the LR method. When the IBS method was applied, the 123 full sibling pairs had a higher average IBS value than the other kinship groups in this study. The IBS method could also play a role in testing parent-child and full sibling relationships.

8.

Improved pairwise kinship analysis using massively parallel sequencing

《Forensic science international. Genetics》2019

In the present study, 67 individuals from two families were analyzed to explore the efficacy of the ForenSeq™ DNA Signature Prep Kit for pairwise kinship analysis. Six types of pairwise relationships including 81 parent-offspring, 60 full siblings, 48 grandparent-grandchildren, 147 uncle/aunt-nephew/nieces, 97 first cousins and 190 non-relatives were generated from these two families and the corresponding likelihood ratio (LR) was calculated using either sequence-based or length-based STR genotype data (i.e., LR_sequence and LR_length). In addition, 10,000 pairs of different relationships were simulated to estimate the system powers of the STRs and SNPs in this panel. The results showed that 54, 9 and 5 additional alleles were observed based on sequence for 27 autosomal STRs, 24 Y-STRs and 7 X-STRs, respectively, compared to those based on length information and 11 novel alleles were identified. Five mutations were found for 58 STRs in 81 parent-offspring but no mutations were observed for SNPs. For 27 autosomal STR loci, the LRs were increased from 9.20, 7.87, 2.01, 2.07, 0.42 for log₁₀LR_length to 11.52, 10.12, 2.61, 2.60, 0.52 for log₁₀LR_sequence for paternity index (PI), full siblings index (FSI), grandparent-grandchild index (GI), uncle/aunt-nephew/niece index (UNI) and first cousins index (FCI), respectively. PI values for 94 SNPs separated more than those of 27 STRs if two individuals were non parent-offspring relatives. For the simulation study, the effectiveness was 1 for the parent-offspring relationship at the thresholds of t₁ = −4 and t₂ = 4 and was 0.9998 for full siblings (t₁ = −2, t₂ = 2). With an error rate of 0.42%, 93.02% of second degree relatives could be identified at the thresholds of t₁ = −1 and t₂ = 1. However, the effectiveness was only 0.4300 for first cousins with a relatively high error rate of 2.68% (t₁ = −1, t₂ = 1).
In conclusion, STR typing according to the sequence information is more polymorphic, which increases the discrimination power for kinship testing. Compared to these 27 STR markers, 94 SNP markers in this panel have advantages in paternity testing especially when mutated STRs are involved or when a relative is an alleged parent. This panel is powerful enough to resolve paternity and full sibling testing. Most of the second degree relationships could be identified with a low error rate while more markers are still needed for first cousins testing.
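
The paired thresholds quoted above (t₁, t₂) can be read as a three-way decision rule on log₁₀LR. The following Python sketch assumes that values at or above t₂ support the tested relationship, values at or below t₁ support unrelatedness, and anything in between is inconclusive. The function name, the labels and the treatment of boundary values are illustrative assumptions rather than the authors' implementation; the example inputs are the log₁₀LR_sequence values reported in the abstract.

```python
# Threshold pairs (t1, t2) as quoted in the abstract.
THRESHOLDS = {
    "parent-offspring": (-4, 4),
    "full siblings": (-2, 2),
    "second degree": (-1, 1),
}


def classify_log10_lr(log10_lr, t1, t2):
    """Three-way call on a log10 likelihood ratio (boundary handling is an assumption)."""
    if t1 >= t2:
        raise ValueError("t1 must be smaller than t2")
    if log10_lr >= t2:
        return "relationship supported"
    if log10_lr <= t1:
        return "unrelated supported"
    return "inconclusive"


# Reported log10 LR for the paternity index, tested against (-4, 4).
pi_call = classify_log10_lr(11.52, *THRESHOLDS["parent-offspring"])
# Reported log10 LR for first cousins, tested against (-1, 1).
fci_call = classify_log10_lr(0.52, -1, 1)
print(pi_call, "|", fci_call)
```

Under this reading, a value such as 0.52 falls inside the inconclusive band, which is consistent with the abstract's remark that more markers are needed for first cousins.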

9.

Mitochondrial DNA heteroplasmy in the emerging field of massively parallel sequencing

《Forensic science international. Genetics》2015

Long an important and useful tool in forensic genetic investigations, mitochondrial DNA (mtDNA) typing continues to mature. Research in the last few years has demonstrated both that data from the entire molecule will have practical benefits in forensic DNA casework, and that massively parallel sequencing (MPS) methods will make full mitochondrial genome (mtGenome) sequencing of forensic specimens feasible and cost-effective. A spate of recent studies has employed these new technologies to assess intraindividual mtDNA variation. However, in several instances, contamination and other sources of mixed mtDNA data have been erroneously identified as heteroplasmy. Well vetted mtGenome datasets based on both Sanger and MPS sequences have found authentic point heteroplasmy in approximately 25% of individuals when minor component detection thresholds are in the range of 10–20%, along with positional distribution patterns in the coding region that differ from patterns of point heteroplasmy in the well-studied control region. A few recent studies that examined very low-level heteroplasmy are concordant with these observations when the data are examined at a common level of resolution. In this review we provide an overview of considerations related to the use of MPS technologies to detect mtDNA heteroplasmy. In addition, we examine published reports on point heteroplasmy to characterize features of the data that will assist in the evaluation of future mtGenome data developed by any typing method.

10.

Targeted DNA methylation analysis and prediction of smoking habits in blood based on massively parallel sequencing

《Forensic science international. Genetics》2023

Tobacco smoking is a frequent habit sustained by > 1.3 billion people in 2020 and the leading preventable factor for health risk and premature mortality worldwide. In the forensic context, predicting smoking habits from biological samples may allow broadening DNA phenotyping. In this study, we aimed to implement previously published smoking habit classification models based on blood DNA methylation at 13 CpGs. First, we developed a matching lab tool based on bisulfite conversion and multiplex PCR followed by amplification-free library preparation and targeted paired-end massively parallel sequencing (MPS). Analysis of six technical duplicates revealed high reproducibility of methylation measurements (Pearson correlation of 0.983). Artificially methylated standards uncovered marker-specific amplification bias, which we corrected via bi-exponential models. We then applied our MPS tool to 232 blood samples from Europeans of a wide age range, of which 90 were current, 71 former and 71 never smokers. On average, we obtained 189,000 reads/sample and 15,000 reads/CpG, without marker drop-out. Methylation distributions per smoking category roughly corresponded to previous microarray analysis, showcasing large inter-individual variation but with technology-driven bias. Methylation at 11 out of 13 smoking-CpGs correlated with daily cigarettes in current smokers, while solely one was weakly correlated with time since cessation in former smokers. Interestingly, eight smoking-CpGs correlated with age, and one displayed weak but significant sex-associated methylation differences. Using bias-uncorrected MPS data, smoking habits were relatively accurately predicted using both two- (current/non-current) and three- (never/former/current) category model, but bias correction resulted in worse prediction performance for both models. 
Finally, to account for technology-driven variation, we built new, joint models with inter-technology corrections, which resulted in improved prediction results for both models, with or without PCR bias correction (e.g. MPS cross-validation F₁-score > 0.8; 2-categories). Overall, our novel assay takes us one step closer towards the forensic application of viable smoking habit prediction from blood traces. However, future research is needed towards forensically validating the assay, especially in terms of sensitivity. We also need to shed further light on the employed biomarkers, particularly on the mechanisms, tissue specificity and putative confounders of smoking epigenetic signatures.

11.

Estimating number of contributors in massively parallel sequencing data of STR loci

《Forensic science international. Genetics》2019

In recent years a number of computer-based algorithms have been developed for the deconvolution of complex DNA mixtures in forensic science. These procedures utilize likelihood ratios that quantify the evidence for a hypothesis for the presence of a person of interest in a DNA profile compared to an alternative hypothesis. Proper operation of these software systems requires an assumption regarding the total number of contributors present in the mixture. Unfortunately, estimates based on counting the number of alleles at a locus can be inaccurate due to the sharing and masking of alleles at individual loci. The effects of allele masking become increasingly severe as the number of contributors increases, rendering estimates about high-order mixtures uncertain. The accuracy of these estimates can be improved by increasing the number of STR markers in panels, and by using highly polymorphic markers. Increasing the number of STR markers from 13 to 20 (expanded CODIS panel) improves the accuracy of allele count-based estimation methods for low-order mixtures, but accuracy for high-order mixtures (> 3 contributors) remains poor due to allele masking. An alternative technique, massively parallel sequencing, holds great potential to improve the accuracy of the estimate of number of contributors due to its ability to detect sequence polymorphisms within alleles. This process results in an expansion of the number of alleles when compared to that obtained using capillary electrophoresis. Here, we show that the detection of these additional sequence-defined alleles in 22-marker panels improves number of contributor estimates in conceptual mixtures of 4 and 5 contributors.
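
The allele count-based estimate mentioned in this abstract is commonly computed as the maximum number of distinct alleles at any locus, divided by two and rounded up, since each diploid contributor carries at most two alleles per autosomal locus. The Python sketch below illustrates that rule and how sequence-defined alleles can raise the estimate. The locus name and allele strings are invented for illustration and are not data from the study.

```python
import math


def min_contributors(profile):
    """Lower bound on contributors from the maximum allele count (MAC) rule.

    profile maps locus name -> iterable of observed allele labels.
    Each diploid contributor adds at most two distinct alleles per locus.
    """
    max_alleles = max((len(set(alleles)) for alleles in profile.values()), default=0)
    return math.ceil(max_alleles / 2)


# Hypothetical mixture seen by length only: four distinct lengths at one locus.
length_based = {"LOCUS_A": ["12", "13", "14", "15"]}

# The same reads with sequence resolved: two lengths split into isoalleles
# (the bracketed motifs are made-up labels for same-length sequence variants).
sequence_based = {
    "LOCUS_A": ["12[GATA]", "13[GATA]", "13[GACA]", "14[GATA]", "14[GACA]", "15[GATA]"]
}

by_length = min_contributors(length_based)
by_sequence = min_contributors(sequence_based)
print(by_length, by_sequence)
```

The rule only gives a minimum: allele sharing and masking can still hide contributors, which is why the abstract treats additional sequence-defined alleles as an improvement rather than a complete solution.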

12.

Sequence polymorphisms of forensic Y-STRs revealed by a 68-plex in-house massively parallel sequencing panel

《Forensic science international. Genetics》2022

Sequence polymorphisms of Y chromosome short tandem repeat (Y-STR) markers can be unveiled using next generation sequencing (NGS). Compared to capillary electrophoresis, NGS has the advantage of distinguishing between some alleles of the same length. Here, a 68-plex in-house panel covering 67 Y-STR loci and the sex-determining Amelogenin locus was developed. The panel showed 100% concordance with three standard reference samples. The sensitivity was as low as 250 pg. A total of 466 length-based alleles, 806 sequence-based alleles, and 149 haplotypes were observed across 149 Chinese Han individuals. The total haplotype diversity and discrimination capacity were both 1.0000 in the samples tested. The DYS710 locus possessed the highest diversity by sequence among these Y-STRs, with 109 sequence-based alleles observed. Micro-variant alleles with the same length were observed in 39 Y-STR loci, with their sequence variations mainly attributable to repeat pattern variations. While the number of sequence-based alleles identified for DYS447, DYS449, DYS710, DYS720 and DYF387S1a/b was approximately three times that of their length-based alleles, flanking sequence variations were observed in 18 alleles. In addition, 201 sequence-based alleles in 42 loci were newly discovered. This significantly expanded the knowledge of human Y-STR sequence polymorphisms. Collectively, the 68-plex panel provided reliable Y-STR results as well as higher resolution for paternal lineage analysis.
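
Haplotype diversity and discrimination capacity, both reported as 1.0000 above, follow from simple counts. The Python sketch below assumes the usual definitions (Nei's unbiased diversity, HD = n/(n−1) · (1 − Σpᵢ²), and DC = distinct haplotypes / n); the abstract does not state which formulas the authors used. The haplotype labels are placeholders: only the counts (149 distinct haplotypes in 149 individuals) come from the abstract.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (haplotype diversity, discrimination capacity) for a list of haplotypes."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(haplotypes)
    sum_sq_freq = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1 - sum_sq_freq)  # Nei's unbiased estimator
    capacity = len(counts) / n                   # distinct haplotypes per sample
    return diversity, capacity


# 149 individuals, each with a unique (placeholder) haplotype label.
sample = [f"hap{i}" for i in range(149)]
hd, dc = haplotype_stats(sample)
print(f"HD = {hd:.4f}, DC = {dc:.4f}")
```

When every individual carries a unique haplotype, both statistics reach their maximum, so a value of 1.0000 says the panel separated all sampled males but does not by itself indicate how it would behave in a larger population.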

13.

Characterizing the amplification of STR markers in multiplex polymerase chain displacement reaction using massively parallel sequencing

《Forensic science international. Genetics》2023

Polymerase chain displacement reaction (PCDR) has shown advantages in forensic low-template DNA analysis, with improved amplification efficiency, higher allele detection capacity, and lower stutter artifacts than PCR. However, the characteristics of STR markers after PCDR amplification remain unclear owing to the limited resolving power of capillary electrophoresis (CE). This issue can be addressed by massively parallel sequencing (MPS) technology with higher throughput and discriminability. Here, we developed a multiplex PCDR system including 24 STRs and amelogenin. In addition, a PCR reference was established for comparison. After amplification, products were subjected to PCR-free library construction and sequenced on the Illumina NovaSeq system. We implemented a sequence-matching pipeline to separate different amplicon types of PCDR products from the combination of primers. In the sensitivity test, the PCDR multiplex obtained full STR profiles with as little as 125 pg of 2800M control DNA. Based on that, single-source DNA samples were tested. First, highly concordant genotypes were observed among the PCDR multiplex, the PCR reference, and CE-based STR kits. Next, read counts of different PCDR amplicon types were investigated, showing a relative abundance of 78:12:12:1 for the shortest amplicon S, the two medium amplicons M1 and M2, and the longest amplicon L. We also analyzed the stutter artifacts for distinct amplicon types, and the results revealed the reduction of N − 1 and N − 2 contraction stutters, and the increase of N + 1 and N + 2 elongation stutters in PCDR samples. Moreover, we confirmed the feasibility of PCDR for amplifying degraded DNA samples and unbalanced DNA mixtures. Compared to the previous proof of principle study, our work took a further step to characterize the complete profile of STR markers in the PCDR context. Our results suggested that the PCDR-MPS workflow is an effective approach for forensic STR analysis.
Corresponding findings in this study may help the development of PCDR-based assays and probabilistic methods in future studies.

14.

Forensic identity SNPs: Characterisation of flanking region variation using massively parallel sequencing

《Forensic science international. Genetics》2023

Single nucleotide polymorphisms (SNPs) can be analysed for identity or kinship applications in forensic genetics to either provide an adjunct to traditional STR typing or as a stand-alone approach. The advent of massively parallel sequencing technology (MPS) has provided a useful opportunity to more easily deploy SNP typing in a forensic context, given the ability to simultaneously amplify a large number of markers. Furthermore, MPS also provides valuable sequence data for the targeted regions, which enables the detection of any additional variation seen in the flanking regions of amplicons. In this study we genotyped 977 samples across five UK-relevant population groups (White British, East Asian, South Asian, North-East African and West African) for 94 identity-informative SNP markers using the ForenSeq DNA Signature Prep Kit. Examination of flanking region variation allowed for the identification of 158 additional alleles across all populations studied. Here we present allele frequencies for all 94 identity-informative SNPs, both including and excluding the flanking region sequence of these markers. We also present information on the configuration of these SNPs in the ForenSeq DNA Signature Prep Kit, including performance metrics for the markers and investigation of bioinformatic and chemistry-based discordances. Overall, the inclusion of flanking region variation in the analysis workflow for these markers reduced the average combined match probability 2175-fold across all populations, with a maximum reduction of 675,000-fold in the West African population. The gain due to flanking region-based discrimination increased the heterozygosity of some loci above that of some of the least useful forensic STR loci, thus demonstrating the benefit of enhanced analysis of currently targeted SNP markers for forensic applications.

15.

Techniques for estimating genetically variable peptides and semi-continuous likelihoods from massively parallel sequencing data

《Forensic science international. Genetics》2022

Forensic genetic investigations typically rely on analysis of DNA for attribution purposes. There are times, however, when the amount and/or the quality of the DNA is limited, and thus little or no information can be obtained regarding the source of the sample. An alternative biochemical target that also contains genetic signatures is protein. One class of genetic signatures is protein polymorphisms that are a direct consequence of simple/single/short nucleotide polymorphisms (SNPs) in DNA. However, to interpret protein polymorphisms in a forensic context, certain complexities must be understood and addressed. These complexities include: 1) SNPs can generate 0, 1, or arbitrarily many polymorphisms in a polypeptide; and 2) as an object of expression that is modulated by alleles, genes and interactions with the environment, proteins may be present or absent in a given sample. To address these issues, a novel approach was taken to generate the expected protein alleles in a reference sample based on whole genome (or exome) sequence data and assess the significance of the evidence using a haplotype-based semi-continuous likelihood algorithm that leverages whole proteome data. Converting the genomic information into the proteomic information allows for the zero-to-many relationship between SNPs and genetically variable peptides (GVPs) to be abstracted away. When viewed as a haplotype, many GVPs that correspond to the same SNP are equivalent to many SNPs in perfect linkage disequilibrium (LD). As long as the likelihood formulation correctly accounts for LD, the correspondence between the SNP and the proteome can be safely neglected. Tests were performed on simulated samples, including single-source and two-person mixtures, and the power of using a classical semi-continuous likelihood versus one that has been adapted to neglect drop-out was compared. Additionally, summary statistics and a rudimentary set of decision guidelines were introduced to help identify mixtures from protein data.

16.

Performance of a next generation sequencing SNP assay on degraded DNA

《Forensic science international. Genetics》2015

Forensic DNA casework samples are often of insufficient quantity or quality to generate full profiles by conventional DNA typing methods. Polymerase chain reaction (PCR) amplification of short tandem repeat (STR) loci is inherently limited in samples containing degraded DNA, as the cumulative size of repeat regions, primer binding regions, and flanking sequence is necessarily larger than the PCR template. Additionally, traditional capillary electrophoresis (CE) assay design further inherently limits shortening amplicons because the markers must be separated by size. Non-traditional markers, such as single nucleotide polymorphisms (SNPs) and insertion deletion polymorphisms (InDels), may yield more information from challenging samples due to their smaller amplicon size. In this study, the performance of a next generation sequencing (NGS) SNP assay and CE-based STR, mini-STR, and InDel assays was evaluated with a series of fragmented, size-selected samples. Information obtained from the NGS SNP assay exhibited higher overall inverse random match probability (1/RMP) values compared to the CE-based typing assays, with particular benefit for fragment sizes ≤150 base pairs (bp). The InDel, mini-STR, and NGS SNP assays all had similar percentages of loci with reportable alleles at this level of degradation; however, the relatively fewer number of loci in the InDel and mini-STR assays results in the NGS SNP assay having at least nine orders of magnitude higher 1/RMP values. In addition, the NGS SNP assay and three CE-based assays (two STR and one InDel assay) were tested using a dilution series consisting of 0.5 ng, 0.1 ng, and 0.05 ng non-degraded DNA. All tested assays showed similar percentages of loci with reportable alleles at these levels of input DNA; however, due to the larger number of loci, the NGS SNP assay and the larger of the two tested CE-based STR assays both resulted in considerably higher 1/RMP values than the other assays. 
These results indicate the potential advantage of NGS SNP assays for forensic analysis of degraded DNA samples.

17.

STRait Razor: A length-based forensic STR allele-calling tool for use with second generation sequencing data

《Forensic science international. Genetics》2013,7(4):409-417

Recent studies have demonstrated the capability of second generation sequencing (SGS) to provide coverage of short tandem repeats (STRs) found within the human genome. However, there are relatively few bioinformatic software packages capable of detecting these markers in the raw sequence data. The extant STR-calling tools are sophisticated, but are not always applicable to the analysis of the STR loci commonly used in forensic analyses. STRait Razor is a newly developed Perl-based software tool that runs on the Linux/Unix operating system and is designed to detect forensically-relevant STR alleles in FASTQ sequence data, based on allelic length. It is capable of analyzing STR loci with repeat motifs ranging from simple to complex without the need for extensive allelic sequence data. STRait Razor is designed to interpret both single-end and paired-end data and relies on intelligent parallel processing to reduce analysis time. Users are presented with a number of customization options, including variable mismatch detection parameters, as well as the ability to easily allow for the detection of alleles at new loci. In its current state, the software detects alleles for 44 autosomal and Y-chromosome STR loci. The study described herein demonstrates that STRait Razor is capable of detecting STR alleles in data generated by multiple library preparation methods and two Illumina® sequencing instruments, with 100% concordance. The data also reveal noteworthy concepts related to the effect of different preparation chemistries and sequencing parameters on the bioinformatic detection of STR alleles.
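
Length-based allele calling of the kind described above can be sketched in a few lines: locate the flanking sequences in a read, measure the distance between them, and convert that length into a repeat number. The Python example below is a simplified illustration only. It is not STRait Razor (which is Perl-based), it uses exact matching instead of the mismatch-tolerant detection the abstract mentions, and the flank sequences, motif and reads are invented.

```python
def call_allele(read, left_flank, right_flank, repeat_len=4):
    """Call an STR allele from the length between two exact-match flanks.

    Returns a label such as "10" or "9.3" (nine full repeats plus three
    bases), or None if either flank is missing from the read.
    """
    start = read.find(left_flank)
    if start == -1:
        return None
    start += len(left_flank)
    end = read.find(right_flank, start)
    if end == -1:
        return None
    whole, partial = divmod(end - start, repeat_len)
    return f"{whole}.{partial}" if partial else str(whole)


# Hypothetical flanks and reads for a tetranucleotide GATA locus.
LEFT, RIGHT = "ACGTAC", "TTGCAA"
read_full = "CC" + LEFT + "GATA" * 10 + RIGHT + "GG"
read_micro = LEFT + "GATA" * 9 + "GAT" + RIGHT
read_truncated = LEFT + "GATA" * 10  # right flank not sequenced

alleles = [call_allele(r, LEFT, RIGHT) for r in (read_full, read_micro, read_truncated)]
print(alleles)
```

Because the call depends only on length, two reads with the same length but different internal sequence receive the same allele label, which is the limitation that sequence-based callers are meant to address.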

18.

Massively parallel sequencing of complete mitochondrial genomes from hair shaft samples

《Forensic science international. Genetics》2015

Though shed hairs are one of the most commonly encountered evidence types, they are among the most limited in terms of DNA quantity and quality. As a result, DNA testing has historically focused on the recovery of just about 600 base pairs of the mitochondrial DNA control region. Here, we describe our success in recovering complete mitochondrial genome (mtGenome) data (∼16,569 bp) from single shed hairs. By employing massively parallel sequencing (MPS), we demonstrate that particular hair samples yield DNA sufficient in quantity and quality to produce 2–3 kb mtGenome amplicons and that entire mtGenome data can be recovered from hair extracts even without PCR enrichment. Most importantly, we describe a small amplicon multiplex assay comprised of sixty-two primer sets that can be routinely applied to the compromised hair samples typically encountered in forensic casework. In all samples tested here, the MPS data recovered using any one of the three methods were consistent with the control Sanger sequence data developed from high quality known specimens. Given the recently demonstrated value of complete mtGenome data in terms of discrimination power among randomly sampled individuals, the possibility of recovering mtGenome data from the most compromised and limited evidentiary material is likely to vastly increase the utility of mtDNA testing for hair evidence.

19.

Mitigating the effects of reference sequence bias in single-multiplex massively parallel sequencing of the mitochondrial DNA control region

《Forensic science international. Genetics》2019

Sequence analysis of the mitochondrial DNA (mtDNA) control region can provide forensically useful information, particularly in challenging samples where autosomal DNA profiling fails. Sub-division of the 1122-bp region into shorter PCR fragments improves data recovery, and such fragments can be analysed together via massively parallel sequencing (MPS). Here, we generate mtDNA data using the prototype PowerSeq™ Auto/Mito/Y System (Promega) MPS assay, in which a single PCR reaction amplifies ten overlapping amplicons of the control region, in a set of 101 highly diverse samples representing most major clades of the mtDNA phylogeny. The overlapping multiplex design leads to non-uniform coverage in the regions of overlap, where it is further increased by short amplicons generated alongside the intended products. Primer sequences in targeted amplification libraries are a potential source of reference sequence bias and thus should be removed, but the proprietary nature of the primers in commercial kits necessitates an alternative approach that minimises data loss: here, we introduce the bioinformatic selection of sequencing reads spanning putative primer sites (Overarching Read Enrichment Option, OREO). While OREO performs well in mitigating the effects of primer sequences at the ends of sequence reads, we still find evidence of the internalisation of primer-derived sequences by overlap extension, which may compromise the ability to call variants or to measure heteroplasmy in primer-binding regions. The commercially available PowerSeq™ CRM Nested System design prevents primer internalisation, as shown in a reanalysis of a subset of 57 samples that contain possible heteroplasmies. In combination with OREO, the CRM Nested kit mitigates reference sequence bias, allowing heteroplasmic variants to be estimated down to a 5% threshold. 
Provided appropriate steps are taken in data processing, single-reaction multiplex assays represent robust tools to analyse mtDNA control region variation. The OREO approach will allow users to bypass the effects of unknown primer sequences in any single-reaction tiled multiplex and eliminate primer-derived bias in overlapping amplicon sequencing studies, in both forensic and non-forensic settings.
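The notion of selecting reads that span a putative primer site can be sketched in Python as an interval test. This is only an illustration of the general idea, not the published OREO implementation: the abstract gives no algorithmic details, so the linear 0-based coordinates, the `margin` parameter, and the names `Interval`, `spans` and `select_spanning_reads` are all assumptions made here.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: int  # 0-based, inclusive
    end: int    # exclusive


def spans(read, site, margin=0):
    """True if the aligned read covers the whole site plus `margin` bases
    on each side, so the site is not at the end of the read."""
    return read.start <= site.start - margin and read.end >= site.end + margin


def select_spanning_reads(reads, site, margin=5):
    """Keep only reads that overarch the putative primer site."""
    return [r for r in reads if spans(r, site, margin)]


# Hypothetical alignment coordinates, for illustration only
site = Interval(50, 70)
reads = [Interval(0, 100), Interval(40, 120), Interval(55, 150)]
print(select_spanning_reads(reads, site))
```

In this sketch the third read starts inside the site, so its first bases could be primer-derived and it is dropped; the other two cover the site with bases to spare on both sides.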

20.

Development of a multiplex forensic identity panel for massively parallel sequencing and its systematic optimization using design of experiments

《Forensic science international. Genetics》2019

The application of massively parallel sequencing (MPS) in forensic sciences enables high-resolution short tandem repeat (STR) genotyping for the characterization of biological evidence. While MPS supports multiplexing of a large number of forensic markers, the performance of an MPS-STR panel depends on good primer design and optimal PCR conditions. However, conventional strategies for multifactorial assay optimization are labor-intensive and do not necessarily allow the experimenter to identify optimum factor settings. Here we describe our new multiplex PCR assay, monSTR, which supports the simultaneous amplification of 21 forensic markers followed by targeted sequencing on the Illumina MiSeq. The selection of STR markers is based on the expanded European Standard Set (ESS), including the highly polymorphic locus SE33, for compatibility with existing forensic DNA databases. Primer engineering involved bioinformatics tools to create a multiplex-compatible primer set. Primer quality was evaluated in silico and in vitro. We demonstrate the systematic optimization of multiplex PCR thermocycling conditions using Design of Experiments (DOE) methodology. The objective was to yield a specific, balanced, low-noise amplification of forensic targets. A central composite face design of experiments enabled an efficient simultaneous investigation of multiple critical process parameters and their interactions. Optimal multiplex PCR conditions were predicted using software-aided modelling based on DOE data. Verification experiments suggested a balanced, reproducible amplification of all markers with reduced formation of artefacts. Fully concordant STR profiles were obtained for the investigated reference samples even with challenging input DNA concentrations. We found that application of DOE principles enabled an experimentally practical and economically justifiable assay development and optimization, even beyond the field of forensic genetics.
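For readers unfamiliar with the design type mentioned above, a face-centred central composite design can be enumerated in a few lines of Python. The sketch shows the standard construction in coded units (factorial corners at ±1, axial points on the faces, and a centre point). The three factors and their ranges in the usage lines are hypothetical and are not taken from the study, as are the function names `face_centred_ccd` and `decode`.

```python
from itertools import product


def face_centred_ccd(n_factors, n_centre=1):
    """Design points in coded units (-1, 0, +1).

    Corners: full two-level factorial. Axial points: one factor at -1 or
    +1 with the others at 0 (axial distance of 1, i.e. on the cube faces).
    """
    corners = list(product((-1, 1), repeat=n_factors))
    axial = []
    for i in range(n_factors):
        for level in (-1, 1):
            point = [0] * n_factors
            point[i] = level
            axial.append(tuple(point))
    centre = [tuple([0] * n_factors)] * n_centre
    return corners + axial + centre


def decode(point, ranges):
    """Map coded levels to real settings, given (low, high) per factor."""
    return tuple(lo + (c + 1) / 2 * (hi - lo) for c, (lo, hi) in zip(point, ranges))


# Hypothetical factors: annealing temperature (°C), annealing time (s), cycles
ranges = [(56, 64), (20, 40), (24, 32)]
design = face_centred_ccd(3)
print(len(design))  # 2**3 corners + 2*3 axial points + 1 centre point
for point in design:
    print(point, decode(point, ranges))
```

With k factors the design has 2^k + 2k runs plus the centre replicates, which is far fewer than a full three-level factorial (3^k) while still supporting a quadratic response model.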