Signatures of positive selection apparent in a small sample of human exomes |
| |
Authors: | Jacob A. Tennessen Jennifer Madeoy Joshua M. Akey |
| |
Affiliation: | Department of Genome Sciences, University of Washington, Seattle, Washington 98195-5065, USA |
| |
Abstract: | Exome sequences, which comprise all protein-coding regions, are promising data sets for studies of natural selection because they offer unbiased genome-wide estimates of polymorphism while focusing on the portions of the genome that are most likely to be functionally important. We examine genomic patterns of polymorphism within 10 diploid autosomal exomes of European and African descent. Using coalescent simulations, we show how polymorphism, site frequency spectra, and intercontinental divergence in these samples would be influenced by different modes of positive selection. We examine putatively selected loci from four previous genome-wide scans of SNP genotypes and demonstrate that these regions indeed show unusual population genetic patterns in the exome data. Using a series of conservative criteria based on exome polymorphism, we are able to fine-scale map signatures of selection, in many cases pinpointing a single candidate SNP. We also identify and evaluate novel candidate selection genes that show unusual patterns of polymorphism. We sequence a portion of one novel candidate locus, IVL, in 74 individuals from multiple continents and examine global genetic diversity. Thus, we confirm, narrow, and supplement existing catalogs of putative targets of selection, and show that exome data sets, which are likely to soon become common, will be powerful tools for identifying adaptive genetic variation.
Regions of the human genome that have recently evolved via positive selection have been sought for decades, but often remain elusive or difficult to confirm (Bodmer and Cavalli-Sforza 1976; Akey 2009). Initial single locus approaches have now yielded to genome-wide analyses that extensively sample loci across all chromosomes, even if they do not sample the whole genome exhaustively (Hinds et al. 2005; The International HapMap Consortium 2005; Akey 2009).
Multiple genomic scans have produced extensive, often poorly overlapping lists of candidate genes under positive selection in humans (Kelley et al. 2006; Voight et al. 2006; Williamson et al. 2007; Akey 2009). A limitation of most of these scans is that they have primarily focused on ascertained SNP markers (Hinds et al. 2005; The International HapMap Consortium 2005), complicating population genetics inferences. In order to extract well-supported regions of recent adaptation from existing catalogs of putatively selected loci, it is important to reevaluate and refine such lists using data that are free from ascertainment biases. Fortunately, more ideal genome-wide data sets are beginning to emerge. These include sets of all genomic exons, or “exomes,” which are more practical to sequence at high coverage in multiple individuals than whole genomes. Although the sample sizes are still small, analysis of these genome-wide sequence data sets can be useful for evolutionary studies, as the unbiased estimates of polymorphism and divergence they provide can be used to assess previously identified candidate regions under selection and more precisely determine targets of selection.
Here, we analyze the autosomal exomes of four African and six European individuals (Ng et al. 2009). We first perform coalescent simulations with selection to evaluate whether selection could leave a signature in the exomes of a small number of individuals. We then test whether genomic regions previously identified as possible targets of positive selection show evidence of non-neutrality in the exome data, and we filter the candidate regions accordingly. We also identify and evaluate several novel regions of unusual polymorphism suggestive of positive selection, and we collect and analyze additional sequence data for one of the most interesting novel genes. |
| |
Keywords: | |