Similar literature
20 similar records found
1.
2.
3.
Genome sequencing is positioned as a routine clinical work‐up for diverse clinical conditions. A commonly used approach to highlight candidate variants with potential clinical implication is to search over locus‐ and gene‐centric knowledge databases. Most web‐based applications allow a federated query across diverse databases for a single variant; however, sifting through a large number of genomic variants with a combination of filtering criteria is a substantial challenge. Here we describe the Clinical Genome and Ancestry Report (CGAR), an interactive web application developed to follow clinical interpretation workflows by organizing variants into seven categories: (1) reported disease‐associated variants, (2) rare‐ and high‐impact variants in putative disease‐associated genes, (3) secondary findings that the American College of Medical Genetics and Genomics recommends reporting back to patients, (4) actionable pharmacogenomic variants, (5) focused reports for candidate genes, (6) de novo variant candidates for trio analysis, and (7) germline and somatic variants implicated in cancer risk, diagnosis, treatment and prognosis. For each variant, a comprehensive list of external links to variant‐centric and phenotype databases is provided. Furthermore, genotype‐derived ancestral composition is used to highlight allele frequencies from a matched population, since some disease‐associated variants show a wide variation between populations. CGAR is open‐source software and is available at https://tom.tch.harvard.edu/apps/cgar/ .

4.
Classification of variants of unknown significance is a challenging technical problem in clinical genetics. As up to one‐third of disease‐causing mutations are thought to affect pre‐mRNA splicing, it is important to accurately classify splicing mutations in patient sequencing data. Several consortia and healthcare systems have conducted large‐scale patient sequencing studies, which discover novel variants faster than they can be classified. Here, we compare the advantages and limitations of several high‐throughput splicing assays aimed at mitigating this bottleneck, and describe a data set of ~5,000 variants that we analyzed using our Massively Parallel Splicing Assay (MaPSy). The Critical Assessment of Genome Interpretation group (CAGI) organized a challenge, in which participants submitted machine learning models to predict the splicing effects of variants in this data set. We discuss the winning submission of the challenge (MMSplice), which outperformed existing software. Finally, we highlight methods to overcome the limitations of MaPSy and similar assays, such as tissue‐specific splicing, the effect of surrounding sequence context, classifying intronic variants, synthesizing large exons, and amplifying complex libraries of minigene species. Further development of these assays will greatly benefit the field of clinical genetics, which lacks high‐throughput methods for variant interpretation.

5.
The CAGI‐5 pericentriolar material 1 (PCM1) challenge aimed to predict the effect of 38 transgenic human missense mutations in the PCM1 protein implicated in schizophrenia. Participants were provided with 16 benign variants (negative controls), 10 hypomorphic, and 12 loss of function variants. Six groups participated and were asked to predict the probability of effect and the standard deviation associated with each mutation. Here, we present the challenge assessment. Prediction performance was evaluated using different measures to arrive at a final ranking that highlights the strengths and weaknesses of each group. The results show a great variety of predictions where some methods performed significantly better than others. Benign variants played an important role as negative controls, highlighting predictors biased to identify disease phenotypes. The best predictor, Bromberg lab, used a neural‐network‐based method able to discriminate between neutral and non‐neutral single nucleotide polymorphisms. The CAGI‐5 PCM1 challenge allowed us to evaluate the state‐of‐the‐art techniques for interpreting the effect of novel variants for a difficult target protein.

6.
Deciphering the potential of noncoding loci to influence gene regulation has been the subject of intense research, with important implications in understanding genetic underpinnings of human diseases. Massively parallel reporter assays (MPRAs) can measure regulatory activity of thousands of DNA sequences and their variants in a single experiment. With an increasing number of publicly available MPRA data sets, one can now develop data‐driven models which, given a DNA sequence, predict its regulatory activity. Here, we performed a comprehensive meta‐analysis of several MPRA data sets in a variety of cellular contexts. We first applied an ensemble of methods to predict MPRA output in each context and observed that the most predictive features are consistent across data sets. We then demonstrate that predictive models trained in one cellular context can be used to predict MPRA output in another, with loss of accuracy attributed to cell‐type‐specific features. Finally, we show that our approach achieves top performance in the Fifth Critical Assessment of Genome Interpretation “Regulation Saturation” Challenge for predicting effects of single‐nucleotide variants. Overall, our analysis provides insights into how MPRA data can be leveraged to highlight functional regulatory regions throughout the genome and can guide effective design of future experiments by better prioritizing regions of interest.

7.
8.
Single nucleotide mutations in exonic regions can significantly affect gene function through a disruption of splicing, and various computational methods have been developed to predict the splicing‐related effects of a single nucleotide mutation. We implemented a new method using ensemble learning that combines two types of predictive models: (a) base sequence‐based deep neural networks (DNNs) and (b) machine learning models based on genomic attributes. This method was applied to the Massively Parallel Splicing Assay challenge of the Fifth Critical Assessment of Genome Interpretation, in which challenge participants predicted various experimentally‐defined exonic splicing mutations, and achieved a promising result. We successfully revealed that combining different predictive models based upon the stacked generalization method led to significant improvement in prediction performance. In addition, whereas most of the genomic features adopted in constructing machine learning models were previously reported, feature values generated with DSSP, a DNN‐based splice site prediction tool, were novel and helpful for the prediction. Learning the sequence patterns associated with normal splicing and the change in splicing site probabilities caused by a mutation was presumed to be helpful in predicting splicing disruption.

9.
10.
Accurate prediction of the impact of genomic variation on phenotype is a major goal of computational biology and an important contributor to personalized medicine. Computational predictions can lead to a better understanding of the mechanisms underlying genetic diseases, including cancer, but their adoption requires thorough and unbiased assessment. Cystathionine‐beta‐synthase (CBS) is an enzyme that catalyzes the first step of the transsulfuration pathway, from homocysteine to cystathionine, and in which variations are associated with human hyperhomocysteinemia and homocystinuria. We have created a computational challenge under the CAGI framework to evaluate how well different methods can predict the phenotypic effect(s) of CBS single amino acid substitutions using a blinded experimental data set. CAGI participants were asked to predict yeast growth based on the identity of the mutations. The performance of the methods was evaluated using several metrics. The CBS challenge highlighted the difficulty of predicting the phenotype of an ex vivo system in a model organism when classification models were trained on human disease data. We also discuss the variations in difficulty of prediction for known benign and deleterious variants, as well as identify methodological and experimental constraints with lessons to be learned for future challenges.

11.
Precision medicine and sequence‐based clinical diagnostics seek to predict disease risk or to identify causative variants from sequencing data. The Critical Assessment of Genome Interpretation (CAGI) is a community experiment consisting of genotype‐phenotype prediction challenges; participants build models, undergo assessment, and share key findings. In the past, few CAGI challenges have addressed the impact of sequence variants on splicing. In CAGI5, two challenges (Vex‐seq and MaPSy) involved prediction of the effect of variants, primarily single‐nucleotide changes, on splicing. Although there are significant differences between these two challenges, both involved prediction of results from high‐throughput exon inclusion assays. Here, we discuss the methods used to predict the impact of these variants on splicing, their performance, strengths, and weaknesses, and prospects for predicting the impact of sequence variation on splicing and disease phenotypes.

12.
Protein kinases represent a large and diverse family of evolutionarily related proteins that are abnormally regulated in human cancers. Although genome sequencing studies have revealed thousands of variants in protein kinases, translating “big” genomic data into biological knowledge remains a challenge. Here, we describe an ontological framework for integrating and conceptualizing diverse forms of information related to kinase activation and regulatory mechanisms in a machine readable, human understandable form. We demonstrate the utility of this framework in analyzing the cancer kinome, and in generating testable hypotheses for experimental studies. Through the iterative process of aggregate ontology querying, hypothesis generation and experimental validation, we identify a novel mutational hotspot in the αC‐β4 loop of the kinase domain and demonstrate the functional impact of the identified variants in epidermal growth factor receptor (EGFR) constitutive activity and inhibitor sensitivity. We provide a unified resource for the kinase and cancer community, ProKinO, housed at http://vulcan.cs.uga.edu/prokino .

13.
Deleterious variants in SLC2A2 cause Fanconi‐Bickel Syndrome (FBS), a glycogen storage disorder, whereas less common variants in SLC2A2 associate with numerous metabolic diseases. Phenotypic heterogeneity in FBS has been observed, but its causes remain unknown. Our goal was to functionally characterize rare SLC2A2 variants found in FBS and metabolic disease‐associated variants to understand the impact of these variants on GLUT2 activity and expression and establish genotype‐phenotype correlations. Complementary RNA‐injected Xenopus laevis oocytes were used to study mutant transporter activity and membrane expression. GLUT2 homology models were constructed for mutation analysis using GLUT1, GLUT3, and XylE as templates. Seventeen FBS variants were characterized. Only c.457_462delCTTATA (p.Leu153_Ile154del) exhibited residual glucose uptake. Functional characterization revealed that only half of the variants were expressed on the plasma membrane. Most less common variants (except c.593 C>A (p.Thr198Lys) and c.1087 G>T (p.Ala363Ser)) exhibited GLUT2 transport activity similar to that of the wild type. Structural analysis of GLUT2 revealed that variants affect substrate‐binding, steric hindrance, or overall transporter structure. The mutant transporter that is associated with a milder FBS phenotype, p.Leu153_Ile154del, retained transport activity. These results improve our overall understanding of the underlying causes of FBS and the impact of GLUT2 function on various clinical phenotypes ranging from rare to common disease.

14.
The NAGLU challenge of the fourth edition of the Critical Assessment of Genome Interpretation experiment (CAGI4) in 2016 invited participants to predict the impact of variants of unknown significance (VUS) on the enzymatic activity of the lysosomal hydrolase α‐N‐acetylglucosaminidase (NAGLU). Deficiencies in NAGLU activity lead to a rare, monogenic, recessive lysosomal storage disorder, Sanfilippo syndrome type B (MPS type IIIB). This challenge attracted 17 submissions from 10 groups. We observed that top models were able to predict the impact of missense mutations on enzymatic activity with Pearson's correlation coefficients of up to 0.61. We also observed that top methods were significantly more correlated with each other than they were with observed enzymatic activity values, which we believe speaks to the importance of sequence conservation across the different methods. Improved functional predictions on the VUS will help population‐scale analysis of disease epidemiology and rare variant association analysis.
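
As an illustration of the evaluation measure named in this abstract, the following is a minimal sketch of Pearson's correlation coefficient between predicted and observed activity values. The function name `pearson_r` and the sample numbers are hypothetical and are not taken from the challenge data or its assessment code.

```python
import math


def pearson_r(predicted, observed):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sum of cross-products and sums of squared deviations from the means
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_o)


# Hypothetical relative enzymatic activities (fraction of wild type)
predicted_activity = [0.10, 0.45, 0.80, 0.30, 0.95]
observed_activity = [0.05, 0.60, 0.70, 0.55, 0.90]
print(round(pearson_r(predicted_activity, observed_activity), 2))
```

The same function applied to two sets of predictions, rather than to predictions and measurements, gives the method-to-method correlation that the abstract compares against.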

15.
Improving predictions of phenotypic consequences for genomic variants is part of ongoing efforts in the scientific community to gain meaningful insights into genomic function. Within the framework of the critical assessment of genome interpretation experiments, we participated in the Vex‐seq challenge, which required predicting the change in the percent spliced in measure (ΔΨ) for 58 exons caused by more than 1,000 genomic variants. Experimentally determined through the Vex‐seq assay, the Ψ quantifies the fraction of reads that include an exon of interest. Predicting the change in Ψ associated with specific genomic variants implies determining the sequence changes relevant for splicing regulators, such as splicing enhancers and silencers. Here we took advantage of two computational tools, SplicePort and SPANR, that incorporate relevant sequence features in their models of splice sites and exon‐inclusion level, respectively. Specifically, we used the SplicePort and SPANR outputs to build mathematical models of the experimental data obtained for the variants in the training set, which we then used to predict the ΔΨ associated with the mutations in the test set. We show that the sequence changes captured by these computational tools provide a reasonable foundation for modeling the impact on splicing associated with genomic variants.
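
The definition of Ψ given in this abstract (the fraction of reads that include an exon of interest) can be sketched as below. This is only an illustration under stated assumptions: the function names, the read counts, and the choice to report ΔΨ in percentage points are hypothetical and do not reproduce the Vex‐seq pipeline.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Psi as a fraction: reads including the exon over all informative reads."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        raise ValueError("no reads cover this exon")
    return inclusion_reads / total


def delta_psi(reference_counts, variant_counts):
    """Change in Psi (variant minus reference), in percentage points (assumed unit)."""
    psi_ref = percent_spliced_in(*reference_counts)
    psi_var = percent_spliced_in(*variant_counts)
    return 100.0 * (psi_var - psi_ref)


# Hypothetical (inclusion, exclusion) read counts for one exon
reference = (80, 20)  # Psi = 0.80
variant = (45, 55)    # Psi = 0.45
print(round(delta_psi(reference, variant), 1))  # negative: the variant reduces inclusion
```

A negative ΔΨ means the variant lowers exon inclusion relative to the reference sequence; a positive value means it raises it.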

16.
Recent studies have highlighted a potential role of genetic and epigenetic variation in the development of Alzheimer’s disease. Application of the CRISPR‐Cas genome‐editing platform has enabled investigation of the functional impact that Alzheimer’s disease‐associated gene mutations have on gene expression. Moreover, recent advances in the technology have led to the generation of CRISPR‐Cas–based tools that allow for high‐throughput interrogation of different risk variants to elucidate the interplay between genomic regulatory features, epigenetic modifications, and chromatin structure. In this review, we examine the various iterations of the CRISPR‐Cas system and their potential application for exploring the complex interactions and disruptions in gene regulatory circuits that contribute to Alzheimer’s disease.

17.
Accurate interpretation of genomic variants that alter RNA splicing is critical to precision medicine. We present a computational framework, Prediction of variant Effect on Percent Spliced In (PEPSI), that predicts the splicing impact of coding and noncoding variants for the Fifth Critical Assessment of Genome Interpretation (CAGI5) “Vex‐seq” challenge. PEPSI is a random forest regression model trained on multiple layers of features associated with sequence conservation and regulatory sequence elements. Compared to other splicing defect prediction tools from the literature, our framework integrates secondary structure information in predicting variants that disrupt splicing regulatory elements (SREs). We applied our model to classify splice‐disrupting variants among 2,094 single‐nucleotide polymorphisms from the Exome Aggregation Consortium using model‐predicted changes in percent spliced in (ΔPSI) associated with tested variants. Benchmarking our model against widely used state‐of‐the‐art tools, we demonstrate that PEPSI achieves comparable performance in terms of sensitivity and precision. Moreover, we also show that using secondary structure context can help resolve several cases where changes in the counts of SREs do not correspond with the directionality of ΔPSI measured for tested variants.
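
The two benchmark measures mentioned in this abstract, sensitivity and precision, can be sketched as follows for a binary "splice‐disrupting" call. The function name and the example labels are hypothetical; the abstract does not state how PEPSI turns a predicted ΔPSI into a call, so the labels are simply assumed as given.

```python
def sensitivity_and_precision(predicted, actual):
    """Return (sensitivity, precision) for paired boolean calls and labels."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    # Sensitivity: share of truly disrupting variants that were called
    sensitivity = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else float("nan")
    # Precision: share of called variants that are truly disrupting
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else float("nan")
    return sensitivity, precision


# Hypothetical calls for five variants (True = splice-disrupting)
predicted_calls = [True, True, False, True, False]
measured_labels = [True, False, True, True, False]
sens, prec = sensitivity_and_precision(predicted_calls, measured_labels)
print(round(sens, 3), round(prec, 3))
```

Reporting both values matters because a tool that calls every variant disrupting reaches full sensitivity at the cost of precision.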

18.
Type 1 diabetes is an autoimmune disease characterized by destruction of the pancreatic islet beta cells that is mediated primarily by T cells specific for beta cell antigens. Insulin administration prolongs the life of affected individuals, but often fails to prevent the serious complications that decrease quality of life and result in significant morbidity and mortality. Thus, new strategies for the prevention and treatment of this disease are warranted. Given the important role of dendritic cells (DCs) in the establishment of peripheral T cell tolerance, DC‐based strategies are a rational and exciting avenue of exploration. DCs employ a diverse arsenal to maintain tolerance, including the induction of T cell deletion or anergy and the generation and expansion of regulatory T cell populations. Here we review DC‐based immunotherapeutic approaches to type 1 diabetes, most of which have been employed in non‐obese diabetic (NOD) mice or other murine models of the disease. These strategies include administration of in vitro‐generated DCs, deliberate exposure of DCs to antigens before transfer and the targeting of antigens to DCs in vivo. Although remarkable results have often been obtained in these model systems, the challenge now is to translate DC‐based immunotherapeutic strategies to humans, while at the same time minimizing the potential for global immunosuppression or exacerbation of autoimmune responses. In this review, we have devoted considerable attention to antigen‐specific DC‐based approaches, as results from murine models suggest that they have the potential to result in regulatory T cell populations capable of both preventing and reversing type 1 diabetes.

19.
20.
It is possible to estimate the prior probability of pathogenicity for germline disease gene variants based on bioinformatic prediction of variant effect(s). However, routinely used approaches have likely led to the underestimation and underreporting of variants located outside donor and acceptor splice site motifs that affect messenger RNA (mRNA) processing. This review presents information about hereditary cancer gene germline variants, outside native splice sites, with experimentally validated splicing effects. We list 95 exonic variants that impact splicing regulatory elements (SREs) in BRCA1, BRCA2, MLH1, MSH2, MSH6, and PMS2. We utilized a pre‐existing large‐scale BRCA1 functional data set to map functional SREs, and assess the relative performance of different tools to predict effects of 283 variants on such elements. We also describe rare examples of intronic variants that impact branchpoint (BP) sites and create pseudoexons. We discuss the challenges in predicting variant effect on BP site usage and pseudoexonization, and suggest strategies to improve the bioinformatic prioritization of such variants for experimental validation. Importantly, our review and analysis highlight the value of considering the impact of variants outside donor and acceptor motifs on mRNA splicing and disease causation.
