Similar literature
 20 similar articles found.
1.
Membrane proteins are unique in that segments thereof concurrently reside in vastly different physicochemical environments: the extracellular space, the lipid bilayer, and the cytoplasm. Accordingly, the effects of missense variants disrupting their sequence depend greatly on the characteristics of the environment of the protein segment affected as well as the function it performs. Because membrane proteins have many crucial roles (transport, signal transduction, cell adhesion, etc.), compromising their functionality often leads to diseases including cancers, diabetes mellitus or cystic fibrosis. Here, we report a suite of sequence‐based computational methods, “Pred‐MutHTP”, for discriminating between disease‐causing and neutral alterations in their sequence. With a data set of 11,846 disease‐causing and 9,533 neutral mutations, we obtained an accuracy of 74% and 78% with 10‐fold group‐wise cross‐validation and test set, respectively. The features used in the models include evolutionary information, physicochemical properties, neighboring residue information, and specialized membrane protein attributes incorporating the number of transmembrane segments, substitution matrices specific to membrane proteins as well as residue distributions occurring in specific topological regions. Across 11 disease classes, the method achieved accuracies in the range of 75–85%. The model designed specifically for the transmembrane segments achieved an accuracy of 85% on the test set with a sensitivity and specificity of 86% and 83%, respectively. This renders our method the current state‐of‐the‐art with regard to predicting the effects of variants in the transmembrane protein segments. Pred‐MutHTP allows predicting the effect of any variant occurring in a membrane protein; it is available at https://www.iitm.ac.in/bioinfo/PredMutHTP/

2.
This paper reports the evaluation of predictions for the “CALM1” challenge in the fifth round of the Critical Assessment of Genome Interpretation held in 2018. In the challenge, the participants were asked to predict effects on yeast growth caused by missense variants of human calmodulin, a highly conserved protein in eukaryotic cells sensing calcium concentration. The performance of predictors implementing different algorithms and methods is similar. Most predictors are able to identify the deleterious or tolerated variants with modest accuracy, with a baseline predictor based purely on sequence conservation slightly outperforming the submitted predictions. Nevertheless, we think that the accuracy of predictions remains far from satisfactory, and the field awaits substantial improvements. The most poorly predicted variants in this round surround functional CALM1 sites that bind calcium or peptide, which suggests that better incorporation of structural analysis may help improve predictions.

3.
The human genome contains frequent single-basepair variants that may or may not cause genetic disease. To characterize benign vs. pathogenic missense variants, numerous computational algorithms have been developed based on comparative sequence and/or protein structure analysis. We compared computational methods that use evolutionary conservation alone, amino acid (AA) change alone, and a combination of conservation and AA change in predicting the consequences of 254 missense variants in the CDKN2A (n = 92), MLH1 (n = 28), MSH2 (n = 14), MECP2 (n = 30), and tyrosinase (TYR) (n = 90) genes. Variants were validated as either neutral or deleterious by curated locus-specific mutation databases and published functional data. All methods that use evolutionary sequence analysis have comparable overall prediction accuracy (72.9-82.0%). Mutations at codons where the AA is absolutely conserved over a sufficient evolutionary distance (about one-third of variants) had a 91.6 to 96.8% likelihood of being deleterious. Three algorithms (SIFT, PolyPhen, and A-GVGD) that differentiate one variant from another at a given codon did not significantly improve predictive value over conservation score alone using the BLOSUM62 matrix. However, when all four methods were in agreement (62.7% of variants), predictive value improved to 88.1%. These results confirm a high predictive value for methods that use evolutionary sequence conservation, with or without considering protein structural change, to predict the clinical consequences of missense variants. The methods can be generalized across genes that cause different types of genetic disease. The results support the clinical use of computational methods as one tool to help interpret missense variants in genes associated with human genetic disease.

4.
Multiple algorithms are used to predict the impact of missense mutations on protein structure and function using algorithm-generated sequence alignments or manually curated alignments. We compared the accuracy with native alignment of SIFT, Align-GVGD, PolyPhen-2, and Xvar when generating functionality predictions of well-characterized missense mutations (n = 267) within the BRCA1, MSH2, MLH1, and TP53 genes. We also evaluated the impact of the alignment employed on predictions from these algorithms (except Xvar) when supplied the same four alignments including alignments automatically generated by (1) SIFT, (2) PolyPhen-2, (3) UniProt, and (4) a manually curated alignment tuned for Align-GVGD. Alignments differ in sequence composition and evolutionary depth. Data-based receiver operating characteristic curves employing the native alignment for each algorithm result in area under the curve of 78-79% for all four algorithms. Predictions from the PolyPhen-2 algorithm were least dependent on the alignment employed. In contrast, Align-GVGD predicts all variants neutral when provided alignments with a large number of sequences. Of note, algorithms make different predictions of variants even when provided the same alignment and do not necessarily perform best using their own alignment. Thus, researchers should consider optimizing both the algorithm and sequence alignment employed in missense prediction.

5.
Pompe disease is an autosomal recessive lysosomal storage disorder caused by disease‐associated variants in the acid alpha‐glucosidase (GAA) gene. The current Pompe mutation database provides a severity rating of GAA variants based on in silico predictions and expression studies. Here, we extended the database with clinical information of reported phenotypes. We added additional in silico predictions for effects on splicing and protein function and for cross reactive immunologic material (CRIM) status, minor allele frequencies, and molecular analyses. We analyzed 867 patients and 562 GAA variants. Based on their combination with a GAA null allele (i.e., complete deficiency of GAA enzyme activity), 49% of the 422 disease‐associated variants could be linked to classic infantile, childhood, or adult phenotypes. Predictions and immunoblot analyses identified 131 CRIM negative and 216 CRIM positive variants. While disease‐associated missense variants were found throughout the GAA protein, they were enriched up to seven‐fold in the catalytic site. Fifteen percent of disease‐associated missense variants were predicted to affect splicing. This should be confirmed using splicing assays. Inclusion of clinical severity rating in the Pompe mutation database provides an invaluable tool for diagnosis, prognosis of disease progression, treatment regimens, and the future development of personalized medicine for Pompe disease.

6.
The REarranged during Transfection (RET) gene encodes a receptor tyrosine kinase required for maturation of the enteric nervous system. RET sequence variants occur in the congenital abnormality Hirschsprung disease (HSCR), characterized by absence of ganglia in the intestinal tract. Although HSCR‐RET variants are predicted to inactivate RET, the molecular mechanisms of these events are not well characterized. Using structure‐based models of RET, we predicted the molecular consequences of 23 HSCR‐associated missense variants and how they lead to receptor dysfunction. We validated our predictions in biochemical and cell‐based assays to explore mutational effects on RET protein functions. We found a minority of HSCR‐RET variants abrogated RET kinase function, while the remaining mutants were phosphorylated and transduced intracellular signals. HSCR‐RET sequence variants also impacted on maturation, stability, and degradation of RET proteins. We showed that each variant conferred a unique combination of effects that together impaired RET protein activity. However, all tested variants impaired RET‐mediated cellular functions, including cell transformation and migration. Our data indicate that the molecular mechanisms of impaired RET function in HSCR are highly variable. Although a subset of variants cause loss of RET kinase activity and downstream signaling, enzymatic inactivation is not the sole mechanism at play in HSCR.

7.
Pathogenic variants in the core spliceosome U5 small nuclear ribonucleoprotein gene EFTUD2/SNU114 cause the craniofacial disorder mandibulofacial dysostosis Guion‐Almeida type (MFDGA). MFDGA‐associated variants in EFTUD2 comprise large deletions encompassing EFTUD2, intragenic deletions and single nucleotide truncating or missense variants. These variants are predicted to result in haploinsufficiency by loss‐of‐function of the variant allele. While the contribution of deletions within EFTUD2 to allele loss‐of‐function is self‐evident, the mechanisms by which missense variants are disease‐causing have not been characterized functionally. Combining bioinformatics software prediction, yeast functional growth assays, and a minigene (MG) splicing assay, we have characterized how MFDGA missense variants result in EFTUD2 loss‐of‐function. Only four of 19 assessed missense variants cause EFTUD2 loss‐of‐function through altered protein function when modeled in yeast. Of the remaining 15 missense variants, five altered the normal splicing pattern of EFTUD2 pre‐messenger RNA predominantly through exon skipping or cryptic splice site activation, leading to the introduction of a premature termination codon. Comparison of bioinformatic predictors for each missense variant revealed a disparity amongst different software packages and, in many cases, an inability to correctly predict changes in splicing subsequently determined by MG interrogation. This study highlights the need for laboratory‐based validation of bioinformatic predictions for EFTUD2 missense variants.

8.
Single nucleotide polymorphisms (SNPs) are the simplest and most frequent form of human DNA variation, also valuable as genetic markers of disease susceptibility. The most investigated SNPs are missense mutations resulting in residue substitutions in the protein. Here we propose SNPs&GO, an accurate method that, starting from a protein sequence, can predict whether a mutation is disease related or not by exploiting the protein functional annotation. The scoring efficiency of SNPs&GO is as high as 82%, with a Matthews correlation coefficient equal to 0.63 over a wide set of annotated nonsynonymous mutations in proteins, including 16,330 disease‐related and 17,432 neutral polymorphisms. SNPs&GO collects in a unique framework information derived from protein sequence, evolutionary information, and function as encoded in the Gene Ontology terms, and outperforms other available predictive methods. Hum Mutat 30:1–8, 2009. © 2009 Wiley‐Liss, Inc.
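For readers unfamiliar with the two figures quoted in the abstract above, the following is a minimal sketch of how an overall accuracy and a Matthews correlation coefficient can be computed from confusion-matrix counts. The function names and the counts are illustrative assumptions; they are not taken from the SNPs&GO paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (denominator undefined),
    which is a common convention rather than part of the definition.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts chosen only for illustration:
# tp = disease-related variants called disease-related, and so on.
tp, tn, fp, fn = 80, 85, 15, 20

print("accuracy:", accuracy(tp, tn, fp, fn))
print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
```

Unlike accuracy, the coefficient stays near zero for a predictor that simply guesses the majority class, which is presumably why abstracts in this area often report both.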

9.
10.
Genetics in Medicine, 2021, 23(12): 2386-2393
Purpose: Genetic variation in MC1R is a main determinant of red hair color (RHC) phenotype and confers susceptibility to skin disorders. Methods: We assessed the effects and function of MC1R variants identified in our clinical cohort of 135,947 participants with available exome sequencing using phenome-wide association scan (PheWAS). Expression and function of several variants were evaluated. Results: We found 24 nonsense and 215 missense variants in MC1R. Many common missense MC1R variants are strongly associated with skin disorders including skin cancer; however, each variant shows different penetrance and expressivity. Severity of skin phenotype was well correlated with the magnitude of functional defect measured as receptor expression and α-MSH stimulated cAMP production. Remarkably, MC1R deletions and nonsense variants are only weakly associated with milder skin phenotypes. Conclusion: Our comprehensive assessment of all MC1R variants in a large cohort clearly establishes that individuals with some missense variants are more susceptible to severe skin disorders than those with MC1R deletions or nonsense variants.

11.
Advances in genome sequencing have led to a tremendous increase in the discovery of novel missense variants, but evidence for determining clinical significance can be limited or conflicting. Here, we present Learning from Evidence to Assess Pathogenicity (LEAP), a machine learning model that utilizes a variety of feature categories to classify variants, and achieves high performance in multiple genes and different health conditions. Feature categories include functional predictions, splice predictions, population frequencies, conservation scores, protein domain data, and clinical observation data such as personal and family history and covariant information. L2‐regularized logistic regression and random forest classification models were trained on missense variants detected and classified during the course of routine clinical testing at Color Genomics (14,226 variants from 24 cancer‐related genes and 5,398 variants from 30 cardiovascular‐related genes). Using 10‐fold cross‐validated predictions, the logistic regression model achieved an area under the receiver operating characteristic curve (AUROC) of 97.8% (cancer) and 98.8% (cardiovascular), while the random forest model achieved 98.3% (cancer) and 98.6% (cardiovascular). We demonstrate generalizability to different genes by validating predictions on genes withheld from training (96.8% AUROC). High accuracy and broad applicability make LEAP effective in the clinical setting as a high‐throughput quality control layer.
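The AUROC values reported in the abstract above can be read as the probability that a randomly chosen pathogenic variant receives a higher score than a randomly chosen benign one. The sketch below computes that quantity directly from pairwise comparisons. It is a definitional illustration only, not the LEAP pipeline; the function name, labels, and scores are assumptions made up for the example.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for pathogenic, 0 for benign.
    scores: model outputs, higher meaning more likely pathogenic.
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out predictions (not data from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print("AUROC:", auroc(labels, scores))
```

The quadratic loop is fine for a small illustration; for data sets of the size described in the abstract, a rank-based implementation from an established library would be the more practical choice.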

12.
To assist in distinguishing disease‐causing mutations from nonpathogenic polymorphisms, we developed an objective algorithm to calculate an “estimate of pathogenic probability” (EPP) based on the prevalence of a specific variation, its segregation within families, and its predicted effects on protein structure. Eleven missense variations in the RPE65 gene were evaluated in patients with Leber congenital amaurosis (LCA) using the EPP algorithm. The accuracy of the EPP algorithm was evaluated using a cell‐culture assay of RPE65‐isomerase activity. The variations were engineered into plasmids containing a human RPE65 cDNA and the retinoid isomerase activity of each variant was determined in cultured cells. The EPP algorithm predicted eight substitution mutations to be disease‐causing variants. The isomerase catalytic activities of these RPE65 variants were all less than 6% of wild‐type. In contrast, the EPP algorithm predicted the other three substitutions to be non‐disease‐causing, with isomerase activities of 68%, 127%, and 110% of wild‐type, respectively. We observed complete concordance between the predicted pathogenicities of missense variations in the RPE65 gene and retinoid isomerase activities measured in a functional assay. These results suggest that the EPP algorithm may be useful to evaluate the pathogenicity of missense variations in other disease genes where functional assays are not available. Hum Mutat 30:1–7, 2009. © 2009 Wiley‐Liss, Inc.

13.
More than 90% of genetic variants are rare in most modern sequencing studies, such as the Alzheimer's Disease Sequencing Project (ADSP) whole-exome sequencing (WES) data. Furthermore, 54% of the rare variants in ADSP WES are singletons. However, both single variant and unit-based tests are limited in their statistical power to detect an association between rare variants and phenotypes. To best use missense rare variants and investigate their biological effect, we examine their association with phenotypes in the context of protein structures. We developed a protein structure–based approach, protein optimized kernel evaluation of missense nucleotides (POKEMON), which evaluates rare missense variants based on their spatial distribution within a protein rather than their allele frequency. The hypothesis behind this test is that the three-dimensional spatial distribution of variants within a protein structure provides functional context to power an association test. POKEMON identified three candidate genes (TREM2, SORL1, and EXOC3L4) and another suggestive gene from the ADSP WES data. For TREM2 and SORL1, two known Alzheimer's disease (AD) genes, the signal from the spatial cluster is stable even if we exclude known AD risk variants, indicating the presence of additional low-frequency risk variants within these genes. EXOC3L4 is a novel AD risk gene that has a cluster of variants primarily shared by case subjects around the Sec6 domain. This cluster is also validated in an independent replication data set and a validation data set with a larger sample size.

High-throughput DNA sequencing of diverse humans has identified millions of genetic variants, the vast majority of which are exceptionally rare. A survey of ∼60,000 individuals from the Exome Aggregation Consortium (ExAC) found that out of ∼7 million variants, 99% have a frequency <1% and 54% are singletons (Taliun et al. 2021). Similarly, in the Alzheimer's Disease Sequencing Project (ADSP) whole-exome sequencing (WES) of ∼10,000 individuals, 97% of identified variants have a minor allele frequency <1%, and 23% are singletons (Butkiewicz et al. 2018). However, the effect of most rare variants on diseases of interest remains unknown because of insufficient statistical power to detect the associations between these variants and phenotypes.

We hypothesized that rare missense variants contribute to common diseases by disrupting the protein function and are likely to form clustered or dispersed patterns within protein structures when examined in population-based studies. Therefore, incorporating spatial context will improve rare variant association tests. Prior studies have shown that missense variants show nonrandom patterns in protein structures, such as cancer-associated hotspot regions with a high density of missense somatic mutations (Tokheim et al. 2016). Our group (Sivley et al. 2018) also found that germline causal missense variants for Mendelian diseases show nonrandom patterns in three-dimensional (3D) space. These patterns include clusters that likely reflect disruption of a key functional region and dispersions that likely reflect depletion of variants within a sensitive protein core.

To test this hypothesis within sequencing studies of disease traits, we developed a kernel function to quantify genetic similarity among individuals by using protein structure information. When two individuals have different missense variants distal in genomic coordinates but close in 3D protein structure, these individuals will be assigned a high genetic similarity through our kernel function. When applied over an entire data set, our kernel function captures differences in the spatial patterns of rare missense variants among cases and controls or over continuous traits. Using a statistical framework similar to SKAT (Wu et al. 2011), we test the association of rare variants with quantitative and dichotomous phenotypes using this structure-based kernel. We call this approach protein optimized kernel evaluation of missense nucleotides (POKEMON). We validated that POKEMON can identify trait associations with spatial patterns formed by missense variants both in simulation studies and real-world data.
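The idea of a structure-based kernel described above can be illustrated with a small sketch: two individuals score as similar when their missense variants fall on residues that are close in 3D space, even if the residues are far apart in sequence. The text does not give the functional form of the kernel, so the Gaussian distance decay, the `scale` parameter, and the names `residue_weight` and `structure_kernel` are all assumptions made for illustration; the published POKEMON kernel may differ.

```python
import math


def residue_weight(xyz_a, xyz_b, scale=10.0):
    """Similarity of two residues from their 3D distance (assumed Gaussian decay).

    scale is a hypothetical length scale in angstroms.
    """
    d = math.dist(xyz_a, xyz_b)
    return math.exp(-((d / scale) ** 2))


def structure_kernel(genotypes, coords, scale=10.0):
    """Pairwise similarity matrix between individuals.

    genotypes: one dict per individual, mapping residue index -> allele count.
    coords: dict mapping residue index -> (x, y, z) coordinate.
    """
    n = len(genotypes)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            total = sum(
                count_a * count_b * residue_weight(coords[a], coords[b], scale)
                for a, count_a in genotypes[i].items()
                for b, count_b in genotypes[j].items()
            )
            kernel[i][j] = kernel[j][i] = total
    return kernel


# Hypothetical protein: residues 10 and 200 are far apart in sequence but
# 3 angstroms apart in the fold; residue 450 is 40 angstroms away.
coords = {10: (0.0, 0.0, 0.0), 200: (3.0, 0.0, 0.0), 450: (40.0, 0.0, 0.0)}
genotypes = [{10: 1}, {200: 1}, {450: 1}]  # three individuals, one variant each

K = structure_kernel(genotypes, coords)
print("similarity of individuals 0 and 1:", K[0][1])
print("similarity of individuals 0 and 2:", K[0][2])
```

In a SKAT-style framework, a matrix of this kind would take the place of the usual allele-frequency-weighted kernel in the variance-component score test; that step is not shown here.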

14.
15.
Hereditary non-polyposis colorectal cancer (HNPCC) is an autosomal dominant inherited disease caused by defects in the process of DNA mismatch repair (MMR), and mutations in the hMLH1 or hMSH2 genes are responsible for the majority of HNPCC. In addition to clear loss-of-function mutations conferred by nonsense or frameshift alterations in the coding sequence or by splice variants, genetic screening has revealed a large number of missense codons with less obvious functional consequences. The ability to discriminate between a loss-of-function mutation and a silent polymorphism is important for genetic testing for inherited diseases like HNPCC where the opportunity exists for early diagnosis and preventive intervention. In this study, quantitative in vivo DNA MMR assays in the yeast Saccharomyces cerevisiae were performed to determine the functional significance of amino acid replacements observed in the human population. Missense codons previously observed in human genes were introduced at the homologous residue in the yeast MLH1 or MSH2 genes. This study also demonstrated feasibility of constructing genes that encode functional hybrid human-yeast MLH1 proteins. Three classes of missense codons were found: (i) complete loss of function, i.e. mutations; (ii) variants indistinguishable from wild-type protein, i.e. silent polymorphisms; and (iii) functional variants which support MMR at reduced efficiency, i.e. efficiency polymorphisms. There was a good correlation between the functional results in yeast and available human clinical data regarding penetrance of the missense codon. The results reported here raise the intriguing possibility that differences in the efficiency of DNA MMR exist between individuals in the human population due to common polymorphisms.

16.
Resequencing genes in individuals at extremes of the population distribution constitutes a powerful and efficient strategy to identify sequence variants associated with complex traits. An excess of sequence variants at one extreme relative to the other that is not due to chance or to population stratification constitutes evidence for genetic association and implies the presence of functionally significant sequence variants. Recently, we reported that non-synonymous sequence variants in Niemann-Pick type C1-like 1 (NPC1L1), an intestinal cholesterol transporter, were significantly more common among individuals with low cholesterol absorption than in those with high cholesterol absorption. To determine whether sequence variations identified in individuals with low cholesterol absorption affect protein function, we performed studies in cultured cells and in families. Expression of the mutant proteins in Chinese hamster ovarian-K1 cells revealed that a majority (14 of 20) of the variants identified in low absorbers were associated with very low levels of NPC1L1 protein. In two extended families, mean cholesterol absorption levels, as measured using stable isotopes, were significantly lower in family members with the sequence variants than in those without the variant. These data indicate that the excess of sequence variations in individuals with extreme phenotypes reflects an enrichment of functionally significant variants. These findings are consistent with in silico predictions that some sequence variations found in healthy individuals are as deleterious to protein function as mutations that, in other genes, cause monogenic diseases. Such sequence variations may explain a significant fraction of quantitative phenotypic variation in humans.
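One common way to ask whether an excess of variant carriers at one phenotypic extreme could be due to chance is a one-sided Fisher's exact test on a 2×2 table of carriers versus non-carriers in the two groups. The abstract above does not say which test the authors used, so the choice of test, the function name, and the counts below are assumptions for illustration only.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    a, b: carriers and non-carriers in the first group (e.g. low absorbers).
    c, d: carriers and non-carriers in the second group (e.g. high absorbers).
    Returns the probability, with all margins fixed, of observing a or more
    carriers in the first group.
    """
    row1 = a + b          # size of the first group
    col1 = a + c          # total number of carriers
    n = a + b + c + d     # total number of individuals
    upper = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favorable / comb(n, col1)


# Hypothetical counts, not taken from the NPC1L1 study:
# 10 of 100 low absorbers carry a variant versus 2 of 100 high absorbers.
p_value = fisher_one_sided(10, 90, 2, 98)
print("one-sided p-value:", p_value)
```

A small p-value here addresses only chance; ruling out population stratification, the other caveat named in the abstract, requires separate analysis.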

17.
Thanks to the advent of rapid DNA sequencing technology and its prevalence, many disease‐associated genetic variants are rapidly identified in many genes from patient samples. However, the subsequent effort to experimentally validate and define their pathological roles is extremely slow. Consequently, the pathogenicity of most disease‐associated genetic variants is solely speculated in silico, which is no longer deemed compelling. We developed an experimental approach to efficiently quantify the pathogenic effects of disease‐associated genetic variants with a focus on SLC26A4, which is essential for normal inner ear function. Alterations of this gene are associated with both syndromic and nonsyndromic hereditary hearing loss with various degrees of severity. We established HEK293T‐based stable cell lines that express pendrin missense variants in a doxycycline‐dependent manner, and systematically determined their anion transport activities with high accuracy in a 96‐well plate format using a high throughput plate reader. Our doxycycline dosage‐dependent transport assay objectively distinguishes missense variants that indeed impair the function of pendrin from those that do not (functional variants). We also found that some of these putative missense variants disrupt normal messenger RNA splicing. Our comprehensive experimental approach helps determine the pathogenicity of each pendrin variant, which should guide future efforts to benefit patients.

18.
CHARGE syndrome is characterized by the variable occurrence of multisensory impairment, congenital anomalies, and developmental delay, and is caused by heterozygous mutations in the CHD7 gene. Correct interpretation of CHD7 variants is essential for genetic counseling. This is particularly difficult for missense variants because most variants in the CHD7 gene are private and a functional assay is not yet available. We have therefore developed a novel classification system to predict the pathogenic effects of CHD7 missense variants that can be used in a diagnostic setting. Our classification system combines the results from two computational algorithms (PolyPhen-2 and Align-GVGD) and the prediction of a newly developed structural model of the chromo- and helicase domains of CHD7 with segregation and phenotypic data. The combination of different variables will lead to a more confident prediction of pathogenicity than was previously possible. We have used our system to classify 145 CHD7 missense variants. Our data show that pathogenic missense mutations are mainly present in the middle of the CHD7 gene, whereas benign variants are mainly clustered in the 5' and 3' regions. Finally, we show that CHD7 missense mutations are, in general, associated with a milder phenotype than truncating mutations.

19.
Reliable methods for predicting functional consequences of variants in disease genes would be beneficial in the clinical setting. This study was undertaken to predict, and confirm in vitro, splicing aberrations associated with mismatch repair (MMR) variants identified in familial colon cancer patients. Six programs were used to predict the effect of 13 MLH1 and 6 MSH2 gene variants on pre‐mRNA splicing. mRNA from cycloheximide‐treated lymphoblastoid cell lines of variant carriers was screened for splicing aberrations. Tumors of variant carriers were tested for microsatellite instability and MMR protein expression. Variant segregation in families was assessed using Bayes factor causality analysis. Amino acid alterations were examined for evolutionary conservation and physicochemical properties. Splicing aberrations were detected for 10 variants, including a frameshift as a minor cDNA product, and altered ratio of known alternate splice products. Loss of splice sites was well predicted by splice‐site prediction programs SpliceSiteFinder (90%) and NNSPLICE (90%), but consequence of splice site loss was less accurately predicted. No aberrations correlated with ESE predictions for the nine exonic variants studied. Seven of eight missense variants had normal splicing (88%), but only one was a substitution considered neutral from evolutionary/physicochemical analysis. Combined with information from tumor and segregation analysis, and literature review, 16 of 19 variants were considered clinically relevant. Bioinformatic tools for prediction of splicing aberrations need improvement before use without supporting studies to assess variant pathogenicity. Classification of mismatch repair gene variants is assisted by a comprehensive approach that includes in vitro, tumor pathology, clinical, and evolutionary conservation data. Hum Mutat 0, 1–14, 2009. © 2009 Wiley‐Liss, Inc.

20.
Niemann-Pick type C (NPC) disease is a rare autosomal-recessive lysosomal storage disease typically accompanied by progressive impairment of nervous system and liver function. Biochemically, the disorder presents with an inhibited egress of cholesterol and glycosphingolipids from endosomal and lysosomal compartments in neuronal and nonneuronal cells. In the majority of NPC patients, mutations in the NPC1 gene can be identified. About 5% of patients show mutations in the NPC2 gene. Many different mutations can cause NPC disease and multiple variants not associated with the disease are known in both genes. A continuously updated collection of gene variants is lacking to date and only limited information is available on genotype-phenotype correlation. We have created the NPC disease gene variation database (NPC-db; http://npc.fzk.de; last accessed 24 October 2007). This database aims to provide a comprehensive overview of the sequence variants in NPC1 and NPC2, including information on their functional consequences and associated haplotypes. Moreover, genotype data and clinical information from individual NPC patients provide information on the impact of functional variants. NPC-db addresses professionals and nonprofessionals dealing with NPC disease on a clinical, diagnostic, research, or personal basis. The user is encouraged to search contents and submit novel information, thereby contributing to generate a valuable open-access tool that will allow a better understanding of the molecular and clinical details of NPC disease.
