Similar literature
 Found 20 similar documents (search time: 31 ms)
8.
Pan-American mitochondrial DNA (mtDNA) haplogroup C1 has been recently subdivided into three branches, two of which (C1b and C1c) are characterized by ages and geographical distributions that are indicative of an early arrival from Beringia with Paleo-Indians. In contrast, the estimated ages of C1d, the third subset of C1, looked too young to fit the above scenario. To define the origin of this enigmatic C1 branch, we completely sequenced 63 C1d mitochondrial genomes from a wide range of geographically diverse, mixed, and indigenous American populations. The revised phylogeny not only brings the age of C1d within the range of that of its two sister clades, but reveals that there were two C1d founder genomes for Paleo-Indians. Thus, the recognized maternal founding lineages of Native Americans are at least 15, indicating that the overall number of Beringian or Asian founder mitochondrial genomes will probably increase extensively when all Native American haplogroups reach the same level of phylogenetic and genomic resolution as obtained here for C1d.

While debate is still ongoing among scientists from several disciplines regarding the number of migratory events, their timing, and entry routes into the Americas (Wallace and Torroni 1992; Torroni et al. 1993; Forster et al. 1996; Kaufman and Golla 2000; Goebel et al. 2003, 2008; Schurr and Sherry 2004; Wang et al. 2007; Waters and Stafford 2007; Dillehay et al. 2008; Gilbert et al. 2008a; O'Rourke and Raff 2010), the general consensus is that modern Native American populations ultimately trace their gene pool to Asian groups who colonized northeast Siberia, including parts of Beringia, prior to the last glacial period. These ancestral population(s) probably retreated into refugial areas during the Last Glacial Maximum (LGM), where their genetic variation was reshaped by drift.
Thus, pre-LGM haplotypes of Asian ancestry were differently preserved and lost in Beringian enclaves, but at the same time, novel haplotypes and alleles arose in situ due to new mutations, often becoming predominant because of major founder events (Tamm et al. 2007; Achilli et al. 2008; Bourgeois et al. 2009; Perego et al. 2009; Schroeder et al. 2009). The scenario of a temporally important differentiation stage in Beringia explains the predominance in Native Americans of private alleles and haplogroups such as the autosomal 9-repeat at microsatellite locus D9S1120 (Phillips et al. 2008; Schroeder et al. 2009), the Y chromosome haplogroup Q1a3a-M3 (Bortolini et al. 2003; Karafet et al. 2008; Rasmussen et al. 2010), and the pan-American mtDNA haplogroups A2, B2, C1b, C1c, C1d, D1, and D4h3a (Tamm et al. 2007; Achilli et al. 2008; Fagundes et al. 2008; Perego et al. 2009).

In the millennia after the initial Paleo-Indian migrations, other groups from Beringia or eastern Siberia expanded into North America. If the gene pool of the source population(s) had in the meantime partially changed, not only because of drift, but also due to the admixture with population groups newly arrived from regions located west of Beringia, this would have resulted in the entry of additional Asian lineages into North America. This scenario, sometimes invoked to explain the presence of certain mtDNA haplogroups such as A2a, A2b, D2a, D3, and X2a only in populations of northern North America (Torroni et al. 1992; Brown et al. 1998; Schurr and Sherry 2004; Helgason et al. 2006; Achilli et al. 2008; Gilbert et al. 2008b; Perego et al. 2009), has recently received support from nuclear and morphometric data showing that some native groups from northern North America harbor stronger genetic similarities with some eastern Siberian groups than with Native American groups located more in the South (González-José et al. 2008; Bourgeois et al. 2009; Wang et al. 2009; Rasmussen et al. 2010).

As for the pan-American mtDNA haplogroups, when analyzed at the highest level of molecular resolution (Bandelt et al. 2003; Tamm et al. 2007; Fagundes et al. 2008; Perego et al. 2009), they all reveal, with the exception of C1d, entry times of 15–18 thousand years ago (kya), which are suggestive of a (quasi) concomitant post-LGM arrival from Beringia with early Paleo-Indians. A similar entry time is also shown for haplogroup X2a, whose restricted geographical distribution in northern North America appears to be due not to a later arrival, but to its entry route through the ice-free corridor (Perego et al. 2009). Despite its continent-wide distribution, C1d was hitherto characterized by an expansion time of only 7.6–9.7 ky (Perego et al. 2009). This major discrepancy has been attributed to a poor and possibly biased representation of complete C1d mtDNA sequences (only 10) in the available data sets (Achilli et al. 2008; Malhi et al. 2010). To clarify the issue of the age of haplogroup C1d and its role as a founding Paleo-Indian lineage, we sequenced and analyzed 63 C1d mtDNAs from populations distributed over the entire geographical range of the haplogroup.

10.
Understanding patterns of spontaneous mutations is of fundamental interest in studies of human genome evolution and genetic disease. Here, we used extremely rare variants in humans to model the molecular spectrum of single-nucleotide mutations. Compared to common variants in humans and human–chimpanzee fixed differences (substitutions), rare variants, on average, arose more recently in the human lineage and are less affected by the potentially confounding effects of natural selection, population demographic history, and biased gene conversion. We analyzed variants obtained from a population-based sequencing study of 202 genes in >14,000 individuals. We observed considerable variability in the per-gene mutation rate, which was correlated with local GC content, but not recombination rate. Using >20,000 variants with a derived allele frequency ≤10⁻⁴, we examined the effect of local GC content and recombination rate on individual variant subtypes and performed comparisons with common variants and substitutions. The influence of local GC content on rare variants differed from that on common variants or substitutions, and the differences varied by variant subtype. Furthermore, recombination rate and recombination hotspots have little effect on rare variants of any subtype, yet both have a relatively strong impact on multiple variant subtypes in common variants and substitutions. This observation is consistent with the effect of biased gene conversion or selection-dependent processes. Our results highlight the distinct biases inherent in the initial mutation patterns and subsequent evolutionary processes that affect segregating variants.

Mutation is one of the most fundamental processes in biology. It is the ultimate source of genetic variation and one of the driving forces of evolution. Mutation also plays a significant role in the etiology of human diseases.
There is considerable interest in understanding the underlying pattern and molecular spectrum of spontaneous mutations. Historically, two approaches were applied to estimate the single-nucleotide mutation rate in humans. The first analyzes divergent sites between humans and another species, typically chimpanzee. According to Kimura's neutral theory, the majority of substitutions are neutral and therefore the extent of between-species divergence can be used to estimate the neutral mutation rate (Kimura 1983). Many groups have applied this approach to estimate the spontaneous mutation rate in humans (Drake et al. 1998; Nachman and Crowell 2000; Kumar and Subramanian 2002; Silva and Kondrashov 2002). However, several forces, including natural selection, biased gene conversion (BGC), and demographic history, can alter fixation probabilities and reshape the spectrum and genomic distribution of between-species substitution patterns. A second, more direct approach, pioneered by Haldane (1935), uses incidence rates of dominant disorders in humans to estimate the mutation rate (Sommer 1995; Sommer and Ketterling 1996; Kondrashov 2003; Lynch 2010). This approach, however, is limited by the fact that only a small subset of new mutations manifest as disease variants (Nachman 2004).

The mutation rates from these studies represent a genome-wide average. However, there is extensive variability among different genes or genomic regions in both between-species divergence and within-species diversity (Wolfe et al. 1989; Nachman and Crowell 2000; Sachidanandam et al. 2001; Smith and Lercher 2002; Kondrashov 2003; Hodgkinson et al. 2009). This suggests that spontaneous mutation rates are not constant throughout the genome, although the reasons behind this variability are unclear.

Local nucleotide composition is a frequently studied feature that could contribute to mutation rate variability.
One study showed that AT > GC (an A base replaced with a G or a T base replaced with a C) common variants segregate at a higher frequency in regions with higher GC content (Webster et al. 2003), and others similarly reported increased fixation bias toward GC base pairs in GC-rich regions (Lercher and Hurst 2002a; Lercher et al. 2002). However, analyses of GC content and variant patterns often reported contradictory findings. For example, while some studies showed that GC content is positively correlated with both divergence rates between humans and chimpanzee (Smith et al. 2002; Webster et al. 2003; Arndt and Hwa 2005; Duret and Arndt 2008) and within-human nucleotide diversity (Sachidanandam et al. 2001; Hellmann et al. 2005), another study found a negative correlation (Cai et al. 2009). Furthermore, while some studies reported increasing GC > AT substitution rates with increasing GC content (Smith et al. 2002; Webster et al. 2003), others showed a decrease (Arndt and Hwa 2004; Duret and Arndt 2008). These inconsistencies could be partly explained by differences in the allele frequency, and therefore the evolutionary time scale, of the variants analyzed in different studies. Consequently, the observed patterns could be the result of confounding factors, such as selection and demography, instead of alterations in the actual mutation rate.

Recombination is known to influence patterns of common variation and substitution rates. Correlations between recombination rate and nucleotide diversity or between-species substitution rates have been observed in humans (Nachman et al. 1998; Nachman 2001; Lercher and Hurst 2002b; Hellmann et al. 2003, 2005; Spencer et al. 2006; Duret and Arndt 2008; Cai et al. 2009; Lohmueller et al. 2011), Drosophila (Begun and Aquadro 1992; Begun et al. 2007; Kulathinal et al. 2008), and several plant species (Dvorak et al. 1998; Kraft et al. 1998; Stephan and Langley 1998; Tenaillon et al. 2004).
Three major theories exist to explain these observations. First, recombination may be directly mutagenic, leading to increased mutation rates in regions of high recombination and thus higher diversity (Lercher and Hurst 2002b; Hellmann et al. 2003, 2008). Second, while background selection and selective sweeps reduce haplotype diversity, recombination generates new haplotypes by shuffling variants onto different backgrounds, thereby maintaining diversity in regions of high recombination rates (Kaplan et al. 1989; Charlesworth et al. 1993, 1995; Hudson and Kaplan 1995; Nachman 2001). A third explanation is BGC, a recombination-associated process that preferentially repairs AT/GC mismatches produced during recombination to GC bases, leading to preferential fixation of GC alleles (for review, see Duret and Galtier 2009). Over time, the observed effect of BGC can mimic that of natural selection, leading to an excess of “weak” (W) A/T bases converted to “strong” (S) G/C bases as if the latter were under positive selection (Berglund et al. 2009; Galtier et al. 2009; Necsulea et al. 2011). The reports hypothesizing a mutagenic effect of recombination relied on common variants and substitutions (Lercher and Hurst 2002b; Hellmann et al. 2003, 2008). Several lines of evidence argue against the mutagenic recombination theory and instead suggest that a selection-dependent mechanism or BGC can explain the observed correlation between diversity and recombination rate (Duret and Arndt 2008; Berglund et al. 2009; Galtier et al. 2009; Lohmueller et al. 2011).

Previous studies using common variants within humans and substitutions between humans and chimpanzees are effectively dealing with mutations accumulated over many generations. Their patterns, therefore, reflect the cumulative influence of many processes, including natural selection, population demographic history, and BGC.
A major challenge in the field is to elucidate the extent to which these forces alter the distribution of variants over time and to distinguish their relative contributions. To minimize the effects of selection, many studies restrict their analysis to noncoding regions of the genome. However, widespread signatures of recent positive selection, even within supposedly neutral regions (Williamson et al. 2007), suggest that noncoding regions may also be influenced by selection.

Rare variants represent a newly available and expanding resource that can overcome some of these limitations. Rare variants are relatively young, predominantly because they are the result of recent mutation events. Therefore, rare variants are typically less affected by population demographic history or natural selection (Messer 2009). Furthermore, as BGC acts only on variants after they have arisen in the population (Duret and Galtier 2009), it does not influence innate mutation rates. Rare variants, therefore, are an appropriate resource for studying the spectrum and genomic distribution of mutations while minimizing the potentially confounding influences. In addition, while family-based whole-genome sequencing has begun to identify de novo mutations that provide more direct measures of mutation rates (The 1000 Genomes Project Consortium 2010; Conrad et al. 2011; Campbell et al. 2012; Kong et al. 2012), the identified mutations sparsely cover the genome. For example, if whole-genome sequencing of each parent-offspring trio yields ∼40 de novo mutations (Conrad et al. 2011), 500 such trios would need to be sequenced to accumulate roughly 20,000 mutations. These mutations, however, would occur once per 150 kb on average, and the data would lack the spatial resolution necessary to detect the effect of local genomic context on a finer scale.

We studied a set of rare variants discovered via targeted resequencing of 202 genes in >14,000 unrelated individuals.
We analyzed the per-gene mutation rate as well as the probability of each site to contain a variant of a specific subtype relative to local GC content, recombination rate, and recombination hotspots. In order to compare mutation rate inferences based on rare variants with those obtained by within- and between-species data, we compared rare variant patterns to common variant data from The 1000 Genomes Project Consortium and substitution sites between humans and chimpanzee. These three variant classes cover different evolutionary time scales, and the differences between them allow us to examine the distinct influence of genomic context on the initial mutation process, the subsequent rise of some mutations to become common variants, and eventual fixation.
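
The sampling arithmetic in the introduction above (about 40 de novo mutations per trio, 500 trios, roughly 20,000 mutations, one per 150 kb) can be checked with a short sketch. The genome size of 3.0 × 10⁹ bp is an assumption that the text does not state; it is used here because it reproduces the quoted 150 kb spacing. The function names are hypothetical.

```python
import math

# Figures quoted in the text (Conrad et al. 2011 for the per-trio yield).
MUTATIONS_PER_TRIO = 40
TARGET_MUTATIONS = 20_000

# Assumption, not stated in the text: approximate haploid human genome size.
GENOME_SIZE_BP = 3.0e9


def trios_needed(target_mutations: int, per_trio: int) -> int:
    """Number of parent-offspring trios needed to reach the target count."""
    return math.ceil(target_mutations / per_trio)


def mean_spacing_kb(genome_size_bp: float, n_mutations: int) -> float:
    """Average distance between mutations, in kilobases, if spread evenly."""
    return genome_size_bp / n_mutations / 1000


trios = trios_needed(TARGET_MUTATIONS, MUTATIONS_PER_TRIO)
spacing_kb = mean_spacing_kb(GENOME_SIZE_BP, TARGET_MUTATIONS)

print(f"trios needed: {trios}")
print(f"mean spacing: {spacing_kb:.0f} kb")
```

Under these inputs the spacing scales inversely with the number of trios, which is why trio sequencing alone gives little resolution at the scale of local GC content or recombination hotspots.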

16.
The human gut microbiome is a complex ecosystem composed mainly of uncultured bacteria. It plays an essential role in the catabolism of dietary fibers, the part of plant material in our diet that is not metabolized in the upper digestive tract, because the human genome does not encode adequate carbohydrate active enzymes (CAZymes). We describe a multi-step functionally based approach to guide the in-depth pyrosequencing of specific regions of the human gut metagenome encoding the CAZymes involved in dietary fiber breakdown. High-throughput functional screens were first applied to a library covering 5.4 × 10⁹ bp of metagenomic DNA, allowing the isolation of 310 clones showing beta-glucanase, hemicellulase, galactanase, amylase, or pectinase activities. Based on the results of refined secondary screens, sequencing efforts were reduced to 0.84 Mb of nonredundant metagenomic DNA, corresponding to 26 clones that were particularly efficient for the degradation of raw plant polysaccharides. Seventy-three CAZymes from 35 different families were discovered. This corresponds to a fivefold target-gene enrichment compared to random sequencing of the human gut metagenome. Thirty-three of these CAZy-encoding genes are highly homologous to prevalent genes found in the gut microbiome of at least 20 individuals for whom metagenomic data are available. Moreover, 18 multigenic clusters encoding complementary enzyme activities for plant cell wall degradation were also identified. Gene taxonomic assignment is consistent with horizontal gene transfer events in dominant gut species and provides new insights into the human gut functional trophic chain.

The human intestinal microbiome is the dense and complex ecosystem that resides in the distal part of our digestive tract. Its role in metabolizing dietary constituents (Sonnenburg et al. 2005; Flint et al. 2008; Ley et al. 2008) and in protecting the host against pathogens (Rakoff-Nahoum et al. 2004) is crucial to human health (Macdonald and Monteleone 2005; McGarr et al. 2005; Manichanh et al. 2006; Turnbaugh and Gordon 2009). It is mainly composed of commensal bacteria from the Bacteroidetes, Firmicutes, Proteobacteria, and Actinobacteria phyla (five), and of several archaeal and eukaryotic species. With up to 10¹² cells per gram of feces, the bacterial abundance is estimated to reach 1000 operational taxonomic units (OTUs) per individual, 70% to 80% of the most dominant ones being subject-specific (Zoetendal et al. 1998; Tap et al. 2009). However, only 20% of the bacterial species have been successfully cultured so far (Eckburg et al. 2005). Large-scale analyses of genomic and metagenomic sequences have provided gene catalogs and statistical evidence on protein families involved in the predominant functions of the human gut microbiome (Gill et al. 2006; Kurokawa et al. 2007; Flint et al. 2008; Turnbaugh et al. 2009; Qin et al. 2010), among which the catabolism of dietary fibers is of particular interest in human nutrition and health. Dietary fibers are the components of vegetables, cereals, leguminous seeds, and fruits that are not digested in the stomach or in the small intestine, but are fermented in the colon by the gut microbiome and/or excreted in feces (Grabitske and Slavin 2008). Chemically, dietary fibers are mainly composed of complex plant cell wall polysaccharides and their associated lignin (Selvendran 1984), along with storage polysaccharides such as fructans and resistant starch (Institute of Medicine 2005). Dietary fibers have been identified as a strong positive dietary factor in the prevention of obesity, diabetes, and cardiovascular diseases (World Health Organization 2003).
Because of the wide structural diversity of dietary fibers, the human gut bacteria produce a huge panel of carbohydrate active enzymes (CAZymes), with widely different substrate specificities, to degrade these compounds into metabolizable monosaccharides and disaccharides. The functions and the evolutionary relationships of CAZyme-encoding genes of the human gut microbiome are being extensively studied through functional and structural genomics investigations (Flint et al. 2008; Lozupone et al. 2008; Mahowald et al. 2009; Martens et al. 2009), which are nevertheless restricted to cultivated bacterial species. CAZyme diversity has also been described in three metagenomics studies focused on this microbiome (Gill et al. 2006; Turnbaugh et al. 2009, 2010), and these revealed the presence of at least 81 families of glycoside-hydrolases, making the human gut metagenome one of the richest sources of CAZymes (Li et al. 2009). However, the proof of function of annotated genes issued from metagenomes still constitutes a goal for enzyme discovery. This can be addressed by functional screening of metagenomic libraries, in order to retrieve genes of interest. Numerous studies have provided conclusive evidence on the potential of such an approach for the identification of novel glycoside-hydrolases from various ecosystems such as soil (Rondon et al. 2000; Richardson et al. 2002; Voget et al. 2003; Pang et al. 2009), lakes (Rees et al. 2003), hot springs (Tang et al. 2006, 2008), rumen (Ferrer et al. 2005; Guo et al. 2008; Liu et al. 2008; Duan et al. 2009), rabbit (Feng et al. 2007), and insect guts (Brennan et al. 2004; for review, see Ferrer et al. 2009; Li et al. 2009; Simon and Daniel 2009; Uchiyama and Miyazaki 2009). In all cases, the identification of the gene responsible for the screened activity was carried out by sequencing only a few kilobases of metagenomic DNA.
Collectively these studies have established an experimental proof of function for 35 glycoside hydrolases (from eight families) issued from metagenomes (data from the CAZy database; http://www.cazy.org/), a number that is very small considering the known CAZy diversity. Here, we examined the potential of high-throughput functional screening of large insert libraries to guide in-depth pyrosequencing of specific regions of the human gut metagenome that encode the enzymatic machinery involved in dietary fiber catabolism.
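
The screening funnel described in the abstract above can be put in numbers with a minimal sketch. All inputs are figures quoted in the abstract. The per-megabase density for random sequencing is not reported in the text; it is only inferred here from the stated fivefold enrichment, so it should be read as an illustration rather than a published value. Variable names are hypothetical.

```python
# Figures quoted in the abstract.
LIBRARY_BP = 5.4e9          # metagenomic DNA covered by the screened library
SEQUENCED_MB = 0.84         # nonredundant DNA that was finally sequenced
CAZYMES_FOUND = 73          # CAZymes discovered in the sequenced clones
ENRICHMENT_FOLD = 5         # stated enrichment over random sequencing

# How much the functional screens narrowed the sequencing effort.
fold_reduction = LIBRARY_BP / (SEQUENCED_MB * 1e6)

# CAZyme density in the sequenced clones.
density_per_mb = CAZYMES_FOUND / SEQUENCED_MB

# Inferred (not reported) density expected from random sequencing.
implied_random_per_mb = density_per_mb / ENRICHMENT_FOLD

print(f"sequencing effort reduced about {fold_reduction:.0f}-fold")
print(f"CAZyme density in selected clones: {density_per_mb:.1f} per Mb")
print(f"implied random-sequencing density: {implied_random_per_mb:.1f} per Mb")
```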
