Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Genome-wide association scans are rapidly becoming reality, but there is no present consensus regarding genotyping strategies to optimise the discovery of true genetic risk factors. For a given investment in genotyping, should tag SNPs be selected in a gene-centric manner, or instead, should coverage be optimised based on linkage disequilibrium alone? We explored this question using empirical data from the HapMap-ENCODE project, and we found that tags designed specifically to capture common variation in exonic and evolutionarily conserved regions provide good coverage for 15-30% of the total common variation (depending on the population sample studied), and yield genotype savings compared with an anonymous tagging approach that captures all common variation. However, the same number of tags based on linkage disequilibrium alone captures substantially more (30-46%) of the total common variation. Therefore, the best strategy depends crucially on the unknown degree to which functional variation resides in recognisable exons and evolutionarily conserved sequence. A hypothetical but reasonable scenario might be one in which trait-causing variation is equally distributed between exons plus conserved sequence, and the rest of the genome. In this scenario, our analysis suggests that a tagging approach that captures variation in exons and conserved sequence provides only modestly better coverage of putatively causal variation than does anonymous tagging. In HapMap CEU samples (with northern and western European ancestry), we observed roughly equivalent coverage for equal investment for both tagging strategies.

2.
Recent advances in high-throughput genotyping technologies will allow large-scale association studies to disentangle the genetic basis of common human diseases. Currently, a large-scale genotyping effort is being carried out by the HapMap project, and the outcome of this project is expected to help researchers in their efforts to understand how genetic variation influences susceptibility to disease. However, there is some controversy over whether this huge public effort will be of value for populations not studied in the HapMap project. Here, we present simulation results based on the empirical distribution of linkage disequilibrium (LD) on a large chromosomal region (10 Mb) on human chromosome 20(1,2) for two European and two Asian populations. These results show that statistical power to detect associations does not depend on the population where SNP tagging was performed.

3.
The human genome is estimated to contain one single nucleotide polymorphism (SNP) every 300 base pairs. The presence of LD between SNP markers can be used to save genotyping cost via appropriate SNP tagging strategies, whereas absence or a low level of LD between markers generally increases genotyping cost. It is quite common for a large proportion of tagging SNPs in a tagging scheme to turn out to be singleton SNPs, that is, SNPs that only tag themselves rather than contribute power to the rest of a region. If genotyping cost is a major concern, which is often the case at the present time for genome-wide association studies, these singleton tagging SNPs would be the primary targets to be removed from genotyping. It is important, however, to understand the characteristics of such SNPs and estimate the impact of removing them in a study. Using the HapMap genotype data and genome-wide expression data, we assessed the distribution and functional implications of singleton SNPs in the human genome. Our results demonstrated that SNPs of potentially higher functional importance (e.g., nonsynonymous SNPs, SNPs in splicing sites and SNPs in 5' and 3' UTR) are associated with a higher tendency to be singleton SNPs than SNPs in intronic and intergenic regions. We further assessed whether singleton SNPs can be tagged using haplotypes of tagSNPs in the three genome-wide chips, that is, GeneChip 500k of Affymetrix, HumanHap300 and HumanHap550 of Illumina, and discussed the general implications for genetic association studies.
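To make the notion of a singleton tag SNP concrete, here is a minimal sketch of greedy LD-based tagging. It is not the procedure used in the abstract above: the r² threshold of 0.8, the use of genotype allele counts (0/1/2) in place of phased haplotypes, and the SNP names and toy genotypes are all assumptions chosen for illustration.

```python
from itertools import combinations

def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treat as not in LD
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def greedy_tags(genotypes, threshold=0.8):
    """Greedily pick tag SNPs; return {tag: set of SNPs it captures, itself included}."""
    names = sorted(genotypes)
    # neighbours[s] holds every SNP (including s) with r^2 >= threshold to s
    neighbours = {s: {s} for s in names}
    for a, b in combinations(names, 2):
        if r_squared(genotypes[a], genotypes[b]) >= threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)
    untagged = set(names)
    tags = {}
    while untagged:
        # choose the untagged SNP that captures the most still-untagged SNPs
        best = max(sorted(untagged), key=lambda s: len(neighbours[s] & untagged))
        captured = neighbours[best] & untagged
        tags[best] = captured
        untagged -= captured
    return tags

# Hypothetical toy data: six individuals, three SNPs.
genotypes = {
    "rs_a": [0, 1, 2, 0, 1, 2],
    "rs_b": [0, 1, 2, 0, 1, 2],  # perfectly correlated with rs_a
    "rs_c": [0, 0, 1, 2, 2, 1],  # uncorrelated with rs_a and rs_b
}
tags = greedy_tags(genotypes)
# A singleton tag captures nothing but itself.
singletons = [t for t, captured in tags.items() if captured == {t}]
```

In this toy example `rs_c` ends up as a singleton tag: dropping it saves one genotyping assay but leaves that SNP uncovered, which is the trade-off the abstract examines.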

4.
There is great interest in the use of tagging single nucleotide polymorphisms (tSNPs) to facilitate association studies of complex diseases. This is based on the premise that a minimum set of tSNPs may be sufficient to capture most of the variation in certain regions of the human genome. Several methods have been described to select tSNPs, based on either haplotype-block structure or independent of the underlying block structure. In this paper, we compare eight methods for choosing tSNPs in 10 representative resequenced candidate genes (a total of 194.2 kb) with different levels of linkage disequilibrium (LD) in a sample of European-Americans. We compared tagging efficiency (TE) and prediction accuracy of tSNPs identified by these methods, as a function of several factors, including LD level, minor allele frequency, and tagging criteria. We also assessed tagging consistency between methods. We found that tSNPs selected based on the methods Haplotype Diversity and Haplotype r2 provided the highest TE, whereas the prediction accuracy was comparable among different methods. Tagging consistency between different methods of tSNP selection was poor. This work demonstrates that when tSNP-based association studies are undertaken, the choice of method for selecting tSNPs requires careful consideration.

5.
6.
Recent developments in genome-wide association studies (GWAS) have led to the localization of disease genes for many complex diseases. Scrutiny of the respective publications reveals, first, that statistical analysis is typically restricted to single-marker analysis in the first step, and, second, that the presence of multiple, independently associated SNPs within the same linkage disequilibrium (LD) region is a common phenomenon. Motivated by this observation, we show through a power simulation study that a simultaneous analysis of tightly linked SNPs in the initial GWAS analysis step would lead to increased power compared with single-marker analysis. This is true for all three approaches we considered (implementations in BEAGLE, FAMHAP and UNPHASED). The best performance was obtained using a two-marker haplotype analysis. In conclusion, we would expect additional gene findings from re-analyzing successful GWAS with a multi-marker approach.

7.
8.
Genome-wide association studies (GWAS) are now being conducted to identify common genetic variants that predispose to human diseases and to unravel the genetic etiology of complex human diseases. Because of genotyping cost constraints, a GWAS often follows a two-stage design, in which a large number of markers are genotyped in a proportion of the available samples in stage 1, and then the markers identified in stage 1 are examined in all the samples in stage 2. In this paper, we introduce a nonlinear entropy-based statistic for joint analysis of two-stage genome-wide association studies. Type I error rates and power of the entropy-based statistic for association tests are validated using simulation studies in a single-locus test. The power of entropy-based joint analysis is investigated by simulations. The results suggest that, when detecting rare genetic variants, entropy-based joint analysis is always more powerful than linear joint analysis, which uses a linear function of risk allele frequencies in cases and controls; the powers of these two joint analyses are comparable when detecting common genetic variants. Furthermore, when the false discovery rate is controlled, entropy-based joint analysis is more powerful and needs fewer samples than linear joint analysis. We therefore recommend an entropy-based strategy for two-stage genome-wide association studies to detect rare and common genetic variants with moderate to large genetic effects underlying a complex disease.

9.
Bayes factor analysis has the attractive property of accommodating the risks of both false negatives and false positives when identifying susceptibility gene variants in genome-wide association studies (GWASs). For a particular SNP, the critical aspect of this analysis is that it incorporates the probability of obtaining the observed value of a statistic on disease association under the alternative hypotheses of non-null association. An approximate Bayes factor (ABF) was proposed by Wakefield (Genetic Epidemiology 2009;33:79–86) based on a normal prior for the underlying effect-size distribution. However, misspecification of the prior can lead to failure in incorporating the probability under the alternative hypothesis. In this paper, we propose a semi-parametric, empirical Bayes factor (SP-EBF) based on a nonparametric effect-size distribution estimated from the data. Analysis of several GWAS datasets revealed the presence of substantial numbers of SNPs with small effect sizes, and the SP-EBF attributed much greater significance to such SNPs than the ABF. Overall, the SP-EBF incorporates an effect-size distribution that is estimated from the data, and it has the potential to improve the accuracy of Bayes factor analysis in GWASs. Subject terms: Epidemiology, Genetics
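For orientation, the normal-prior ABF mentioned above can be sketched in a few lines. The formula below is the commonly cited form of Wakefield's approximation, written as null over alternative (values below 1 favour association); it is reproduced from memory rather than from this abstract, and the function name, argument names, and example numbers are hypothetical. The SP-EBF proposed in the abstract is not implemented here.

```python
import math

def approximate_bayes_factor(beta_hat, se, prior_sd):
    """Approximate Bayes factor for H0 (no effect) versus H1 (effect ~ N(0, prior_sd^2)).

    beta_hat : estimated log odds ratio for the SNP
    se       : standard error of beta_hat
    prior_sd : standard deviation of the normal prior on the true effect (an assumption)
    """
    v = se ** 2          # sampling variance of the estimate
    w = prior_sd ** 2    # prior variance of the effect size
    z = beta_hat / se    # Wald statistic
    return math.sqrt((v + w) / v) * math.exp(-0.5 * z * z * w / (v + w))

# Hypothetical inputs: a null-looking SNP and a strongly associated one.
abf_null = approximate_bayes_factor(beta_hat=0.0, se=0.1, prior_sd=0.2)
abf_signal = approximate_bayes_factor(beta_hat=0.5, se=0.1, prior_sd=0.2)
```

Because the prior variance enters both factors, a prior that is too wide or too narrow shifts the ABF for small-effect SNPs, which is the misspecification concern the abstract raises.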

10.
11.
12.
13.
A high-throughput SNP typing system for genome-wide association studies   Total citations: 16, self-citations: 2, citations by others: 16  
One of the most difficult issues to be solved in genome-wide association studies is to reduce the amount of genomic DNA required for genotyping. Currently available technologies require too large a quantity of genomic DNA to genotype hundreds or thousands of single-nucleotide polymorphisms (SNPs). To overcome this problem, we combined the Invader assay with multiplex polymerase chain reaction (PCR), carried out in the presence of antibody to Taq polymerase, and used a novel 384-well card system that can reduce the required reaction volume. We amplified 100 genomic DNA fragments, each containing one SNP, in a single tube, and analyzed each SNP with the Invader assay. This procedure correctly genotyped 98 of the 100 SNP loci examined in PCR-amplified samples from ten individuals; the genotypes were confirmed by direct sequencing. The reproducibility and universality of the method were confirmed with two additional sets of 100 SNPs. Because we used 40 ng of genomic DNA as a template for multiplex PCR, the amount needed to assay one SNP was only 0.4 ng; therefore, theoretically, more than 200,000 SNPs could be genotyped at once when 100 μg of genomic DNA is available. Our results indicate the feasibility of undertaking genome-wide association studies using blood samples of only 5–10 ml. Received: May 18, 2001 / Accepted: May 21, 2001
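The DNA budget quoted in this abstract follows from simple division; the worked arithmetic below uses only the figures stated above (40 ng of template per 100-SNP multiplex reaction, 100 μg of DNA available).

```latex
\[
\frac{40\ \text{ng}}{100\ \text{SNPs}} = 0.4\ \text{ng per SNP},
\qquad
\frac{100\ \mu\text{g}}{0.4\ \text{ng per SNP}}
  = \frac{100{,}000\ \text{ng}}{0.4\ \text{ng per SNP}}
  = 250{,}000\ \text{SNPs}
\]
```

The result, 250,000 SNPs, is consistent with the abstract's statement that more than 200,000 SNPs could be genotyped.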

14.
Genome-wide association studies can guide researchers in gene mapping of complex traits, and a key issue is how to improve the power of the association test. Recently, two-stage approaches have been widely used in genome-wide association analysis. In the first stage, a screening test is used to select markers, and in the second stage, a family-based association test is performed on the smaller set of selected markers. Here, we modify an existing two-stage approach and propose a new test statistic for the association analysis. Simulation studies are conducted to compare the type I error rates and powers of the proposed approach with those of the existing two-stage approaches. Simulation results show that the new two-stage approach has somewhat greater power than the other two-stage approaches.

15.
16.
Genome-wide association (GWA) studies for complex human diseases are now feasible. Many GWA studies rely on commercial SNP chips, for which a common evaluation criterion is global coverage of the genome. Although providing an overall evaluation of an SNP chip, the global coverage does not tell us how the coverage varies across the genome, an important feature that should be taken into consideration, as coverage variation often results in power variation and a potentially biased search in subsequent association analysis. To achieve a fuller understanding of SNP chip coverage, we conducted a detailed evaluation of coverage, including (1) a map of local coverage - calculated over small consecutive genomic regions and (2) gene coverage - calculated for each known gene in the genome. These evaluations can reveal the degree of variation of each SNP chip in covering the genome and can facilitate SNP chip comparisons at a finer scale.

17.
There are considerable expectations about the ability of genome-wide association (GWA) studies to make exciting discoveries about the role of genes in common diseases. GWA studies may allow researchers to identify causal pathways that have not been unveiled before, thus opening new avenues to disease understanding, prevention and therapy. However, there are still many open challenges. One is how to analyse these studies. The problem of false positives and false negatives provides an interesting methodological stimulus to find optimal solutions. Once main genetic effects have been concretely documented, the next question is how to proceed with the investigation of gene-gene and gene-environment interactions. It is possible that what really counts is not the main effect of genes but complex interactions. Finding and interpreting such interactions is not straightforward. Finally, continuously updated integration of all evidence, from old studies, current GWA investigations and future replication studies, and careful interpretation of the strength of the evidence are crucial to maximize transparency and lead to informative selection of the next steps of research in this field. The present Commentary is a report of an Environmental Cancer Risk, Nutrition and Individual Susceptibility network Workshop held in Venice in October 2007 and discusses some of the problems outlined above, with examples.

18.
Complex diseases such as hypertension are inherently multifactorial and involve many factors of mild-to-minute effect sizes. A genome-wide association study (GWAS) typically tests hundreds of thousands of single-nucleotide polymorphisms (SNPs), and offers an opportunity to evaluate aggregated effects of many genetic variants with effects that are too small to detect individually. The gene-set-enrichment analysis (GSEA) is a pathway-based approach that tests for such aggregated effects of genes that are linked by biological functions. A key step in GSEA is the summary statistic (gene score) used to measure the overall relevance of a gene based on all SNPs tested in the gene. Existing GSEA methods use maximum statistics sensitive to gene size and linkage equilibrium. We propose the approach of variable set enrichment analysis (VSEA) and study new gene score methods that are less dependent on gene size. The new method treats groups of variables (SNPs or other variants) as base units for summarizing gene scores and relies less on gene definition itself. The power of VSEA is analyzed by simulation studies modeling various scenarios of complex multilocus interactions. Results show that the new gene scores generally performed better, some substantially so, than the existing GSEA extension to GWAS. The new methods are implemented in an R package and, when applied to a real GWAS data set, demonstrated their practical utility in a GWAS setting.

19.
Genetics in Medicine, 2010, 12(2): 81-84
The article describes the limited population diversity of genome-wide association studies and its resulting impact on the development of commercial genetic tests with restricted applicability and usefulness to certain groups, potentially increasing existing disparities. To enable development of new clinical tools applicable to all groups, much more focus is needed to engage minority communities to enroll in genetics or genomic research studies and on investigators to reach out to underrepresented communities.

20.
