Negative selection on human genes underlying inborn errors depends on disease outcome and both the mode and mechanism of inheritance |
| |
Authors: | Franck Rapaport, Bertrand Boisson, Anne Gregor, Vivien Béziat, Stéphanie Boisson-Dupuis, Jacinta Bustamante, Emmanuelle Jouanguy, Anne Puel, Jérémie Rosain, Qian Zhang, Shen-Ying Zhang, Joseph G. Gleeson, Lluis Quintana-Murci, Jean-Laurent Casanova, Laurent Abel, Etienne Patin |
| |
Abstract: | Genetic variants underlying life-threatening diseases, being unlikely to be transmitted to the next generation, are gradually and selectively eliminated from the population through negative selection. We study the determinants of this evolutionary process in human genes underlying monogenic diseases by comparing various negative selection scores and an integrative approach, CoNeS, at 366 loci underlying inborn errors of immunity (IEI). We find that genes underlying autosomal dominant (AD) or X-linked IEI have stronger negative selection scores than those underlying autosomal recessive (AR) IEI, whose scores are not different from those of genes not known to be disease causing. Nevertheless, genes underlying AR IEI that are lethal before reproductive maturity with complete penetrance have stronger negative selection scores than other genes underlying AR IEI. We also show that genes underlying AD IEI by loss of function have stronger negative selection scores than genes underlying AD IEI by gain of function, while genes underlying AD IEI by haploinsufficiency are under stronger negative selection than other genes underlying AD IEI. These results are replicated in 1,140 genes underlying inborn errors of neurodevelopment. Finally, we propose a supervised classifier, SCoNeS, which predicts better than state-of-the-art approaches whether a gene is more likely to underlie an AD or AR disease. The clinical outcomes of monogenic inborn errors, together with their mode and mechanisms of inheritance, determine the levels of negative selection at their corresponding loci. Integrating scores of negative selection may facilitate the prioritization of candidate genes and variants in patients suspected to carry an inborn error.
Negative (or purifying) selection is the natural process by which deleterious alleles are selectively purged from the population (1).
In diploid species, the strength of negative selection at a given locus is predicted to increase with decreasing fitness and increasing dominance of the genetic variants controlling traits: Variants causing early death in the heterozygous state are the least likely to be transmitted to the next generation, as their carriers have fewer offspring than noncarriers (2). Human genetic variants that cause severe diseases are, thus, expected to be the primary targets of negative selection, particularly for diseases affecting heterozygous individuals. In humans, several studies have ranked protein-coding genes according to their levels of negative selection (3–5). Nevertheless, the extent to which negative selection affects human disease-causing genes, and the factors determining its strength, remain largely unknown, particularly because our knowledge of the severity, mode, and mechanism of inheritance of the corresponding human diseases remains incomplete (3, 6–8).
The strength of negative selection at a given gene has been traditionally approximated by comparing the coding sequence of the gene in a given species with that of one or several closely related species; it depends on the proportion of amino acid changes that have accumulated during evolution (9–11). With the advent of high-throughput sequencing, intraspecies metrics have been developed, based on the comparison of the probability of predicted loss-of-function (pLOF) mutations for a gene under a random model with the frequency of pLOF mutations observed in population databases (5, 12, 13), which capture the species-specific evolution of genes.
Using an interspecies-based method and a hand-curated version of the Online Mendelian Inheritance in Man (hOMIM) database, a previous study elegantly showed that most human genes for which mutations cause highly penetrant diseases, including autosomal dominant (AD) diseases in particular, evolve under stronger negative selection than genes associated with complex disorders (6). However, other studies based on OMIM genes have reported conflicting results (3, 14–17), probably due to the incompleteness and heterogeneity of the datasets used. Moreover, no study has yet addressed this problem with intraspecies metrics, even though it has been suggested that the choice of the reference species for interspecies metrics contributes to discrepancies across studies (6).
We aimed to improve the identification of the drivers of negative selection acting on human disease-causing genes, by developing a negative selection score combining several informative intraspecies and interspecies statistics, focusing on inborn errors of immunity (IEI). IEI, previously known as primary immunodeficiencies (18), are genetic diseases that disrupt the development or function of human immunity. They form a large and expanding group of genetic diseases that has been widely studied, and they are well characterized physiologically (immunologically) and phenotypically (clinically) (19–21). IEI are often symptomatic in early childhood, and at least until the turn of the 20th century and the introduction of antibiotics, most individuals with IEI probably died before reaching reproductive maturity. Accordingly, IEI genes have probably been under strong negative selection from the dawn of humankind until very recently. In this study, we investigated whether the severity of IEI and their mode and mechanism of inheritance have left signatures of negative selection of various intensities in the corresponding human genes.
Furthermore, we validated our model on genes underlying inborn errors of neurodevelopment (IEND), another group of well-characterized severe genetic diseases. |
| |
Keywords: | immunodeficiency, selection, evolution, genetics, method |
|
|