Impact of natural selection on global patterns of genetic variation and association with clinical phenotypes at genes involved in SARS-CoV-2 infection
Authors: Chao Zhang, Anurag Verma, Yuanqing Feng, Marcelo C. R. Melo, Michael McQuillan, Matthew Hansen, Anastasia Lucas, Joseph Park, Alessia Ranciaro, Simon Thompson, Meagan A. Rubel, Michael C. Campbell, William Beggs, Jibril Hirbo, Sununguko Wata Mpoloka, Gaonyadiwe George Mokone, Regeneron Genetic Center, Thomas Nyambo, Dawit Wolde Meskel, Gurja Belay, Charles Fokunang, Alfred K. Njamnshi, Sabah A. Omar, Scott M. Williams, Daniel J. Rader, Marylyn D. Ritchie, Cesar de la Fuente-Nunez, Giorgio Sirugo, Sarah A. Tishkoff
Abstract: Human genomic diversity has been shaped by both ancient and ongoing challenges from viruses. The current coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has had a devastating impact on population health. However, genetic diversity and evolutionary forces impacting host genes related to SARS-CoV-2 infection are not well understood. We investigated global patterns of genetic variation and signatures of natural selection at host genes relevant to SARS-CoV-2 infection (angiotensin converting enzyme 2 [ACE2], transmembrane protease serine 2 [TMPRSS2], dipeptidyl peptidase 4 [DPP4], and lymphocyte antigen 6 complex locus E [LY6E]). We analyzed data from 2,012 ethnically diverse Africans and 15,977 individuals of European and African ancestry with electronic health records and integrated with global data from the 1000 Genomes Project. At ACE2, we identified 41 nonsynonymous variants that were rare in most populations, several of which impact protein function. However, three nonsynonymous variants (rs138390800, rs147311723, and rs145437639) were common among central African hunter-gatherers from Cameroon (minor allele frequency 0.083 to 0.164) and are on haplotypes that exhibit signatures of positive selection. We identify signatures of selection impacting variation at regulatory regions influencing ACE2 expression in multiple African populations. At TMPRSS2, we identified 13 amino acid changes that are adaptive and specific to the human lineage compared with the chimpanzee genome. Genetic variants that are targets of natural selection are associated with clinical phenotypes common in patients with COVID-19. Our study provides insights into global variation at host genes related to SARS-CoV-2 infection, which have been shaped by natural selection in some populations, possibly due to prior viral infections.

Coronavirus disease 2019 (COVID-19) is caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Coronaviruses are enveloped, positive-sense, and single-stranded RNA viruses, many of which are zoonotic pathogens that crossed over into humans. Seven coronavirus species, including SARS-CoV-2, have been discovered that, depending on the virus and host physiological condition, may cause mild or lethal respiratory disease. There is considerable variation in disease prevalence and severity across populations and communities. Importantly, minority populations in the United States appear to have been disproportionally affected by COVID-19 (1, 2). For example, in Chicago, more than 50% of COVID-19 cases and nearly 70% of COVID-19 deaths are in African Americans (who make up 30% of the population of Chicago) (1). While social and economic factors are largely responsible for driving COVID-19 health disparities, investigating genetic diversity at host genes related to SARS-CoV-2 infection could help identify functionally important variation, which may play a role in individual risk for severe COVID-19 infection.

In this study, we focused on four key genes playing a role in SARS-CoV-2 infection (3). The ACE2 gene, encoding the angiotensin-converting enzyme-2 protein, was reported to be a main binding site for severe acute respiratory syndrome coronavirus (SARS-CoV) during an outbreak in 2003, and evidence showed stronger binding affinity to SARS-CoV-2, which enters the target cells via ACE2 receptors (3, 4). The ACE2 gene is located on the X chromosome (chrX); its expression level varies among populations (5); and it is ubiquitously expressed in the lung, blood vessels, gut, kidney, testis, and brain, all organs that appear to be affected as part of the COVID-19 clinical spectrum (6). SARS-CoV-2 infects cells through a membrane fusion mechanism, which in the case of SARS-CoV, is known to induce down-regulation of ACE2 (7).
Such down-regulation has been shown to cause inefficient counteraction of angiotensin II effects, leading to enhanced pulmonary inflammation and intravascular coagulation (7). Additionally, altered expression of ACE2 has been associated with cardiovascular and cerebrovascular disease, which is highly relevant to COVID-19 as several cardiovascular conditions are associated with severe disease. TMPRSS2, located on the outer membrane of host target cells, binds to and cleaves ACE2, resulting in activation of spike proteins on the viral envelope and facilitating membrane fusion and endocytosis (8). Two additional genes, DPP4 and LY6E, have been shown to play an important role in the entry of SARS-CoV-2 virus into host cells. DPP4 is a known functional receptor for the Middle East respiratory syndrome coronavirus (MERS-CoV), causing a severe respiratory illness with high mortality (9, 10). LY6E encodes a glycosylphosphatidylinositol-anchored cell surface protein, which is a critical antiviral immune effector that controls coronavirus infection and pathogenesis (11). Mice lacking LY6E in hematopoietic cells were susceptible to murine coronavirus infection (11).

Previous studies of genetic diversity at ACE2 and TMPRSS2 in global human populations did not include an extensive set of African populations (5, 12–14). No common coding variants (defined here as minor allele frequency [MAF] > 0.05) at ACE2 were identified in any prior population studies. However, few studies included diverse indigenous African populations whose genomes harbor the greatest diversity among humans. This leads to a substantial disparity in the representation of African ancestries in human genetic studies of COVID-19, impeding health equity as the transferability of findings based on non-African ancestries to African populations can be low (15). Including more African populations in studies of genetic diversity at genes involved in SARS-CoV-2 infection is therefore essential.
Additionally, the evolutionary forces underlying global patterns of genetic diversity at host genes related to SARS-CoV-2 infection are not well understood. Using methods to detect natural selection signatures at host genes related to viral infections helps identify putatively functional variants that could play a role in disease risk.

We characterized genetic variation and studied natural selection signatures at ACE2, TMPRSS2, DPP4, and LY6E in ethnically diverse human populations by analyzing 2,012 genomes from ethnically diverse Africans (referred to as the “African diversity” dataset), 2,504 genomes from the 1000 Genomes Project (1KG), and whole-exome sequencing of 15,977 individuals of European ancestry (EA) and African ancestry from the Penn Medicine BioBank (PMBB) dataset (SI Appendix, Fig. S1). The African diversity dataset includes populations with diverse subsistence patterns (hunter-gatherers, pastoralists, agriculturalists) and speaking languages belonging to the four major language families in Africa (Khoesan; Niger–Congo, of which Bantu is the largest subfamily; Afroasiatic; and Nilo-Saharan). We identify functionally relevant variation, compare the patterns of variation across global populations, and provide insight into the evolutionary forces underlying these patterns of genetic variation. In addition, we perform an association study using the variants identified from whole-exome sequencing at the four genes and clinical traits derived from electronic health record (EHR) data linked to the subjects enrolled in the PMBB. The EHR data include diseases related to organ dysfunctions associated with severe COVID-19, such as respiratory, cardiovascular, liver, and renal complications. Our study of genetic variation in genes involved in SARS-CoV-2 infection provides data to investigate infection susceptibility within and between populations and indicates that variants in these genes may play a role in comorbidities relevant to COVID-19 severity.
Keywords: SARS-CoV-2/COVID-19; genetic variation; phenotype association; natural selection; African diversity