Similar literature
1.
One of the major challenges in the near future is the identification of genes that contribute to complex disorders. Large scale association studies that utilize a dense map of single nucleotide polymorphisms (SNPs) have been considered as a valuable tool for this purpose. However, genome-wide screens are limited by costs of genotyping thousands of SNPs in a large number of individuals. Here we present a pooling strategy that enables high-throughput SNP validation and determination of allele frequencies in case and control populations. Quantitative analysis of allele frequencies of SNPs in DNA pools is based on matrix-assisted laser desorption/ionization time of flight (MALDI-TOF) mass spectrometry of primer extension assays. We demonstrate the accuracy and reliability of this approach on pools of eight previously genotyped individuals with an allele frequency representation in the range of 0.1 to 0.9. The accuracy of measured allele frequencies was shown in DNA pools of 142 to 186 individuals using additional markers. Allele frequencies determined from the pooled samples deviate from the real frequencies by about 3%. The described method reduces costs and time and enables genotyping of up to thousands of samples by taking advantage of the high-throughput MALDI-TOF technology.
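
As a rough illustration of how an allele frequency might be read off a pooled measurement, the sketch below turns two allele-specific peak signals into a frequency estimate. The function name, the peak-height inputs, and the correction factor `k` are assumptions made for this example; the abstract does not state how the signals were quantified.

```python
def pooled_allele_frequency(peak_a: float, peak_b: float, k: float = 1.0) -> float:
    """Estimate the frequency of allele A in a DNA pool from two peak signals.

    k is a hypothetical correction factor for unequal signal from the two
    alleles (for example, the mean A/B peak ratio seen in heterozygotes).
    k = 1.0 means no correction is applied.
    """
    if peak_a < 0 or peak_b < 0 or k <= 0:
        raise ValueError("peaks must be non-negative and k must be positive")
    total = peak_a + k * peak_b
    if total == 0:
        raise ValueError("no signal in either peak")
    return peak_a / total


def absolute_deviation(estimated: float, true: float) -> float:
    """Absolute difference between a pooled estimate and the known frequency."""
    return abs(estimated - true)
```

Comparing such estimates against frequencies from individually genotyped samples, as the abstract describes, is what yields a deviation figure like the reported 3%.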

2.
The challenge in the postgenome era is to measure sequence variations over large genomic regions in numerous patient samples. This massive amount of work can only be completed if more accurate, cost-effective, and high-throughput solutions become available. Here we describe a novel DNA fragmentation approach for single nucleotide polymorphism (SNP) discovery and sequence validation. The base-specific cleavage is achieved by creating primer extension products, in which acid-labile phosphoramidite (P-N) bonds replace the 5' phosphodiester bonds of newly incorporated pyrimidine nucleotides. Sequence variations are detected by hydrolysis of this acid-labile bond and MALDI-TOF analysis of the resulting fragments. In this study, we developed a robust protocol for P-N-bond fragmentation and investigated additional ways to improve its sensitivity and reproducibility. We also present the analysis of several human genomic targets ranging from 100 to 450 bp in length. By using a semiautomated sample processing protocol, we investigated an array of SNPs within a 240-bp segment of the NFKBIA gene in 48 human DNA samples. We identified and measured frequencies for the two common SNPs in the 3'UTR of NFKBIA (separated by 123 bp) and then confirmed these values in an independent genotyping experiment. The calculated allele frequencies in white and African American groups differed significantly, yet both fit Hardy-Weinberg expectations. This demonstrates the utility and effectiveness of P-N-bond DNA fragmentation and subsequent MALDI-TOF MS analysis for the high-throughput discovery and measurement of sequence variations in fragments up to 0.5 kb in length in multiple human blood DNA samples.
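
The statement that genotype counts "fit Hardy-Weinberg expectations" can be checked with a standard chi-square goodness-of-fit test. The sketch below is a generic textbook version for one biallelic SNP; the function name and the example counts are illustrative and are not taken from the study.

```python
def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic for deviation from Hardy-Weinberg proportions.

    Arguments are the observed genotype counts at one biallelic SNP.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic sites).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

With one degree of freedom, a statistic below about 3.84 is conventionally read as no significant deviation at the 5% level.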

3.
Detection of unknown single nucleotide polymorphisms (SNPs) relies on large scale sequencing expeditions of genomic fragments or complex high-throughput chip technology. We describe a simplified strategy for fluorimetric detection of known and unknown SNPs by proportional hybridization to oligonucleotide arrays based on optimization of the established principle of signal loss or gain that requires a drastically reduced number of matched or mismatched probes. The array consists of two sets of 18-mer oligonucleotide probes. One set includes overlapping oligos with 4-nucleotide tiling representing an arbitrarily selected "consensus" sequence (consensus-oligos), the other includes oligos specific for known SNPs within the same genomic region (variant-oligos). Fluorescence-labeled DNA amplified from a homozygous source identical to the consensus represents the reference target and is co-hybridized with a differentially-labeled test sample. Lack of hybridization of the test sample to consensus-oligos with simultaneous hybridization to variant-oligos designates a known allele. Lack of hybridization to both consensus- and variant-oligos indicates a new allele. Detection of unknown variants in heterozygous samples depends upon fluorimetric analysis of signal intensity based on the principle that homozygous samples generate twice the amount of signal. This method can identify unknown SNPs in heterozygous conditions with a sensitivity of 82% and specificity of 90%. This strategy should dramatically increase the efficiency of SNP detection throughout the human genome and will decrease the cost and complexity of applying genome-wide analysis in the context of clinical trials.
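
The decision rule stated in the abstract for homozygous samples can be written down as a small lookup. This is a minimal sketch: it assumes each probe set has already been reduced to a yes/no hybridization call, and the function name and return labels are invented for illustration. The case where both probe sets hybridize is not described in the abstract, so the sketch leaves it undetermined.

```python
def classify_site(consensus_hybridizes: bool, variant_hybridizes: bool) -> str:
    """Classify a test sample at one array position from two probe-set calls."""
    if not consensus_hybridizes and variant_hybridizes:
        return "known allele"      # stated rule: variant-oligo signal only
    if not consensus_hybridizes and not variant_hybridizes:
        return "new allele"        # stated rule: no signal on either set
    if consensus_hybridizes and not variant_hybridizes:
        return "consensus"         # sample matches the reference sequence
    return "undetermined"          # both sets hybridize: not covered by the abstract
```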

4.
Xiao M, Kwok PY. Genome Research, 2003, 13(5): 932-939
The analysis of human genetic variations such as single nucleotide polymorphisms (SNPs) has great applications in genome-wide association studies of complex genetic traits. We have developed an SNP genotyping method based on the primer extension assay with fluorescence quenching as the detection method. The template-directed dye-terminator incorporation with fluorescence quenching detection (FQ-TDI) assay is based on the observation that the intensity of fluorescent dye R110- and R6G-labeled acycloterminators is universally quenched once they are incorporated onto a DNA oligonucleotide primer. By comparing the rate of fluorescence quenching of the two allelic dyes in real time, we have extended this method for allele frequency estimation of SNPs in pooled DNA samples. The kinetic FQ-TDI assay is highly accurate and reproducible both in genotyping and in allele frequency estimation. Allele frequencies estimated by the kinetic FQ-TDI assay correlated well with known allele frequencies, with an r² value of 0.993. Applying this strategy to large-scale studies will greatly reduce the time and cost for genotyping hundreds and thousands of SNP markers between affected and control populations.

5.
Association screening involving numerous genetic markers is facilitated by the analysis of pooled DNA samples rather than individual samples. Several genotyping methods have shown high accuracy and precision of allele frequency estimation in pools. Here, we expand the validation of SNP allele frequency estimation in DNA pools using Pyrosequencing by analyzing 186 pools for three SNPs representing complex sequencing cases. The correlation coefficient between estimated and true allele frequencies ranged between 0.979 and 0.996 and tended to increase with pool size, whereas the difference between estimated and true allele frequencies was 2.37 ± 0.11% in post-PCR pools. The precision was 1.73%. Pool size had no significant effect on accuracy and precision. A comparison between post-PCR and pre-PCR pools showed that, for pre-PCR pooling, efforts to accurately quantify the genomic DNA samples to be pooled and subsequently amplified are critical. To conclude, Pyrosequencing can be used for allele frequency estimation in DNA pools of SNPs with complex sequencing scenarios with accuracy and precision values in ranges comparable with those of other SNP typing techniques. Considering the ease of use, short run and analysis times, and little instrument maintenance requirements, Pyrosequencing may even be a preferred option.

6.
High-throughput genotyping by whole-genome resequencing
The next-generation sequencing technology coupled with the growing number of genome sequences opens the opportunity to redesign genotyping strategies for more effective genetic mapping and genome analysis. We have developed a high-throughput method for genotyping recombinant populations utilizing whole-genome resequencing data generated by the Illumina Genome Analyzer. A sliding window approach is designed to collectively examine genome-wide single nucleotide polymorphisms for genotype calling and recombination breakpoint determination. Using this method, we constructed a genetic map for 150 rice recombinant inbred lines with an expected genotype calling accuracy of 99.94% and a resolution of recombination breakpoints within an average of 40 kb. In comparison to the genetic map constructed with 287 PCR-based markers for the rice population, the sequencing-based method was ~20× faster in data collection and 35× more precise in recombination breakpoint determination. Using the sequencing-based genetic map, we located a quantitative trait locus of large effect on plant height in a 100-kb region containing the rice “green revolution” gene. Through computer simulation, we demonstrate that the method is robust for different types of mapping populations derived from organisms with variable quality of genome sequences and is feasible for organisms with large genome sizes and low polymorphisms. With continuous advances in sequencing technologies, this genome-based method may replace the conventional marker-based genotyping approach to provide a powerful tool for large-scale gene discovery and for addressing a wide range of biological questions.
The first use of DNA-based markers decades ago laid the groundwork for gene discovery through forward and reverse genetics. The types of markers and methods for constructing genetic maps have evolved rapidly with advances in molecular biology techniques. The development of PCR triggered the burst of a generation of markers that considerably simplified experimental procedures for marker designing and scoring. However, these markers, although still widely used, have shown growing limitations in chromosomal coverage, time, and cost effectiveness. The development of genomics concepts and tools has set the stage for replacing the marker-based mapping approach with genome-based high-throughput strategies.
The availability of genome sequences opened the door to high-throughput genotyping. This was initially accomplished by adopting microarray technology, which detects single nucleotide polymorphisms (SNPs) through hybridizing genomic DNA to oligonucleotides spotted on gene chips. This genotyping method substantially improved the efficiency of marker collection by allowing the detection of hundreds to thousands of markers in a single hybridization (Winzeler et al. 1998). It has been applied to model systems such as human, Arabidopsis, and rice (Meaburn et al. 2006; Singer et al. 2006; Jeremy et al. 2008). Although the goal of high-throughput was achieved, serious limitations remain for the array-based method. It is laborious, time-consuming, and expensive to design, produce, and process microarrays suited for specific mapping populations.
The advent of the next-generation sequencing technology holds the promise for a methodological leap forward in genotyping and genetic mapping. The new sequencing techniques not only increase sequencing throughput by several orders of magnitude but also allow simultaneously sequencing a large number of samples using a multiplexed sequencing strategy (Craig et al. 2008; Cronn et al. 2008). These recent technical advances have paved the way for the development of a sequencing-based high-throughput genotyping method that combines advantages of time and cost effectiveness, dense marker coverage, high mapping accuracy and resolution, and more comparable genome and genetic maps among mapping populations and organisms.
Here we describe the first high-throughput genotyping method that uses SNPs detected by whole-genome resequencing. This type of SNP data differs from traditional genetic markers primarily in two aspects. First, it is often not the case that all members of a recombinant population can be scored at a given SNP site. Second, an individual SNP site is no longer a reliable marker or locus for genotyping due to several potential sources of sequence errors. To deal with these unique features of the SNP data generated by the next-generation sequencing, we developed a new analytical framework, that is, a sliding window approach for evaluating SNPs collectively rather than individually. The method was applied to analyzing 150 rice recombinant inbred lines (RILs) derived from a cross between indica and japonica rice cultivars using sequences generated on the Illumina Genome Analyzer (GA).
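
The idea of evaluating SNPs collectively rather than individually can be sketched as a sliding window over per-SNP parental calls. This is only an illustration: the window size, the `low` and `high` thresholds, and the function names are assumptions, not parameters taken from the paper.

```python
from typing import List


def call_windows(snp_calls: str, window: int = 5,
                 low: float = 0.2, high: float = 0.8) -> List[str]:
    """Call a genotype for each window of consecutive SNPs.

    snp_calls holds one character per SNP, ordered by position:
    'A' for one parent's allele, 'B' for the other's.
    A window is called 'A' or 'B' when one parent clearly dominates,
    otherwise 'H' (mixed signal, for example near a breakpoint).
    """
    calls = []
    for start in range(len(snp_calls) - window + 1):
        frac_b = snp_calls[start:start + window].count("B") / window
        if frac_b <= low:
            calls.append("A")
        elif frac_b >= high:
            calls.append("B")
        else:
            calls.append("H")
    return calls


def breakpoints(window_calls: List[str]) -> List[int]:
    """Indices of windows whose call differs from the previous window."""
    return [i for i in range(1, len(window_calls))
            if window_calls[i] != window_calls[i - 1]]
```

In this sketch a single discordant SNP inside an otherwise uniform stretch does not change the window call, which is the property that makes window-based calling tolerant of isolated sequencing errors.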

7.
A pragmatic approach that balances the benefit of a whole-genome association (WGA) experiment against the cost of individual genotyping is to use pooled genomic DNA samples. We aimed to determine the feasibility of this approach in a WGA scan in rheumatoid arthritis (RA) using the validated human leucocyte antigen (HLA) and PTPN22 associations as test loci. A total of 203,269 single-nucleotide polymorphisms (SNPs) on the Affymetrix 100K GeneChip and Illumina Infinium microarrays were examined. A new approach to the estimation of allele frequencies from Affymetrix hybridization intensities was developed involving weighting for quality signals from the probe quartets. SNPs were ranked by z-scores, combined from United Kingdom and New Zealand case-control cohorts. Within a 1.7 Mb HLA region, 33 of the 257 SNPs and at PTPN22, 21 of the 45 SNPs, were ranked within the top 100 associated SNPs genome wide. Within PTPN22, individual genotyping of SNP rs1343125 within MAGI3 confirmed association and provided some evidence for association independent of the PTPN22 620W variant (P=0.03). Our results emphasize the feasibility of using genomic DNA pooling for the detection of association with complex disease susceptibility alleles. The results also underscore the importance of the HLA and PTPN22 loci in RA aetiology.
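
The abstract does not say how the z-scores from the two cohorts were combined. One common choice is the unweighted Stouffer method, sketched below as an assumption rather than as the authors' procedure; the SNP identifiers in the usage note are placeholders.

```python
import math
from typing import Dict, List, Sequence


def combine_z(z_scores: Sequence[float]) -> float:
    """Unweighted Stouffer combination of independent z-scores."""
    if not z_scores:
        raise ValueError("at least one z-score is required")
    return sum(z_scores) / math.sqrt(len(z_scores))


def rank_snps(z_by_snp: Dict[str, Sequence[float]]) -> List[str]:
    """SNP identifiers sorted by absolute combined z-score, strongest first."""
    return sorted(z_by_snp,
                  key=lambda snp: abs(combine_z(z_by_snp[snp])),
                  reverse=True)
```

For example, `rank_snps({"snp1": (1.0, 1.0), "snp2": (3.0, 2.5)})` places `snp2` first because both cohorts point the same way with larger effect.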

8.
The human genome is estimated to contain one single nucleotide polymorphism (SNP) every 300 base pairs. The presence of linkage disequilibrium (LD) between SNP markers can be used to save genotyping cost via appropriate SNP tagging strategies, whereas absence or a low level of LD between markers generally increases genotyping cost. It is quite common that a large proportion of tagging SNPs in a tagging scheme turn out to be singleton SNPs, that is, SNPs that only tag themselves rather than contribute power to the rest of a region. If genotyping cost is a major concern, which often is the case at the present time for genome-wide association studies, these singleton tagging SNPs would be the primary targets to be removed from genotyping. It is important, however, to understand the characteristics of such SNPs and estimate the impact of removing them in a study. Using the HapMap genotype data and genome-wide expression data, we assessed the distribution and functional implications of singleton SNPs in the human genome. Our results demonstrated that SNPs of potentially higher functional importance (e.g., nonsynonymous SNPs, SNPs in splicing sites and SNPs in 5' and 3' UTR) are associated with a higher tendency to be singleton SNPs than SNPs in intronic and intergenic regions. We further assessed whether singleton SNPs can be tagged using haplotypes of tagSNPs in the three genome-wide chips, that is, GeneChip 500k of Affymetrix, HumanHap300 and HumanHap550 of Illumina, and discussed the general implications on genetic association studies.

9.
The discovery of single nucleotide polymorphisms (SNPs) is currently pursued with a tremendous effort. SNPs represent a rich source for molecular markers, since estimations predict six to seven million of these DNA variations in the human genome. A subset of these genetic variants is thought to have a pervasive impact on modern medicine, be it for the elucidation of differential pharmacological response or for the facilitated identification of genes involved in monogenetic and complex human diseases. Here we describe the overall process that leads to the set up of a SNP database. We describe a high-throughput sequencing assay for SNP discovery, automation of the dataflow from the DNA sequencer to the SNP analysis, and the tools to facilitate it. At the end of the process, a web-accessible interface collects the SNP information, which is processed in order to be written into the SNP database and to be available for end users who would like to select appropriate SNPs for their special screening needs.

10.
SNP discovery in pooled samples with mismatch repair detection
A targeted discovery effort is required to identify low frequency single nucleotide polymorphisms (SNPs) in human coding and regulatory regions. We here describe combining mismatch repair detection (MRD) with dideoxy terminator sequencing to detect SNPs in pooled DNA samples. MRD enriches for variant alleles in the pooled sample, and sequencing determines the nature of the variants. By using a genomic DNA pool as a template, approximately 100 fragments were amplified and subsequently combined and subjected en masse to the MRD procedure. The variant-enriched pool from this one MRD reaction is enriched for the population variants of all the tested fragments. Each fragment was amplified from the variant-enriched pool and sequenced, allowing the discovery of alleles with frequencies as low as 1% in the initial population. Our results support that MRD-based SNP discovery can be used for large-scale discovery of SNPs at low frequencies in a population.

11.
Inexpensive, high-throughput genotyping methods are needed for analyzing human genetic variations. We have successfully applied the regular bioluminometric assay coupled with modified primer extension reactions (BAMPER) method to single-nucleotide polymorphism (SNP) typing as well as the allele frequency determination for various SNPs. This method includes the production of single-strand target DNA from a genome and a primer extension reaction coupled with inorganic pyrophosphate (PPi) detection by a bioluminometric assay. It is an efficient way to get accurate allele frequencies for various SNPs, while single-strand DNA preparation is labor intensive. The procedure can be simplified in the typing of SNPs. We demonstrate that a modified BAMPER method in which we need not prepare a single-strand DNA can be carried out in one tube. A PCR product is directly used as a template for SNP typing in the new BAMPER method. Generally, tremendous amounts of PPi are produced in a PCR process, as well as many residual dNTPs, and residual PCR primers remain in the PCR products, which cause a large background signal in a bioluminometric assay. Here, shrimp alkaline phosphatase (SAP) and E. coli exonuclease I were used to degrade these components prior to BAMPER detection. The specific primer extension reactions in BAMPER were carried out under thermocycle conditions. The primers were extended to produce large amounts of PPi only when their bases at 3'-termini were complementary to the target. The extension products, PPis, were converted to ATP to be analyzed using the luciferin-luciferase detection system. We successfully demonstrated that PCR products can be directly genotyped by BAMPER in one tube for SNPs with various GC contents. As all reactions can be carried out in a single tube, the method will be useful for realizing a fully automated genotyping system.

12.
Polymorphism ratio sequencing (PRS) combines the advantages of high-throughput DNA sequencing with new labeling and pooling schemes to produce a powerful assay for sensitive single nucleotide polymorphism (SNP) discovery, rapid genotyping, and accurate, multiplexed allele frequency determination. In the PRS method, dideoxy-terminator extension ladders generated from a sample and reference template are labeled with different energy-transfer fluorescent dyes and coinjected into a separation capillary for comparison of relative signal intensities. We demonstrate the PRS method by screening two human mitochondrial genomes for sequence variations using a microfabricated capillary array electrophoresis device. A titration of multiplexed DNA samples places the limit of minor allele frequency detection at 5%. PRS is a sensitive and robust polymorphism detection method for the analysis of individual or multiplexed samples that is compatible with any four-color fluorescence DNA sequencer.

13.
High-throughput gene mapping in Caenorhabditis elegans
Positional cloning of mutations in model genetic systems is a powerful method for the identification of targets of medical and agricultural importance. To facilitate the high-throughput mapping of mutations in Caenorhabditis elegans, we have identified a further 9602 putative new single nucleotide polymorphisms (SNPs) between two C. elegans strains, Bristol N2 and the Hawaiian mapping strain CB4856, by sequencing inserts from a CB4856 genomic DNA library and using an informatics pipeline to compare sequences with the canonical N2 genomic sequence. When combined with data from other laboratories, our marker set of 17,189 SNPs provides even coverage of the complete worm genome. To date, we have confirmed >1099 evenly spaced SNPs (one every 91 ± 56 kb) across the six chromosomes and validated the utility of our SNP marker set and new fluorescence polarization-based genotyping methods for systematic and high-throughput identification of genes in C. elegans by cloning several proprietary genes. We illustrate our approach by recombination mapping and confirmation of the mutation in the cloned gene, dpy-18.

14.
Utilizing the results of extensive single nucleotide polymorphism (SNP) studies in humans, stimulated by the International HapMap Project, we present evidence that SNPs are not randomly spaced across the genome, but are somewhat clustered. This observation has important consequences for assay design, since hidden variants in primer sites can affect the accuracy of data. Indeed, using data from the calibration exercises of the HapMap Project, we found instances in which primer site mutations caused allele dropout and other genotyping failures. Given the dynamic nature of SNP discovery, it was inevitable that SNPs would be identified in the primer sites of many assays used for HapMap genotyping. We found that assays with such primer site mutations were correlated with elevated rates of genotype failure and allele dropout. This suggests that taking nearby SNPs into account is important for optimal genotyping assay design.

15.
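Objective: To evaluate the applicability of DNA pooling combined with pyrosequencing (DNA pooling-PSQ) in case-control association studies. Methods: Using previously collected samples and genotyping results from Dai hypertensive cases and healthy controls, four SNP sites representative in sequence complexity and allele frequency were selected. DNA pooling-PSQ was used to measure allele frequencies in DNA pools constructed from 453 Dai hypertensive patients and 488 Dai controls, and the results were compared with those obtained by genotyping each individual with SNaPshot and counting alleles. Results: DNA pooling-PSQ estimated allele frequencies fairly accurately for non-complex SNP sites of medium to high frequency, with differences of 0.9% to 2.7%, and the resulting association findings agreed with those of the SNaPshot method. For the complex SNP rs12046278 and the low-frequency SNP rs11066280, however, the estimated frequencies differed considerably from those of individual genotyping, by 11.2% to 15.6%, and the association results for these sites were also inconsistent. Conclusion: DNA pooling-PSQ is suitable for association analysis of non-complex SNPs of medium and high frequency.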

16.
Ascertainment bias in studies of human genome-wide polymorphism
Large-scale SNP genotyping studies rely on an initial assessment of nucleotide variation to identify sites in the DNA sequence that harbor variation among individuals. This "SNP discovery" sample may be quite variable in size and composition, and it has been well established that properties of the SNPs that are found are influenced by the discovery sampling effort. The International HapMap project relied on nearly any piece of information available to identify SNPs (including BAC end sequences, shotgun reads, and differences between public and private sequences) and even made use of chimpanzee data to confirm human sequence differences. In addition, the ascertainment criteria shifted from using only SNPs that had been validated in population samples, to double-hit SNPs, to finally accepting SNPs that were singletons in small discovery samples. In contrast, Perlegen's primary discovery was a resequencing-by-hybridization effort using the 24 people of diverse origin in the Polymorphism Discovery Resource. Here we take these two data sets and contrast two basic summary statistics, heterozygosity and F_ST, as well as the site frequency spectra, for 500-kb windows spanning the genome. The magnitude of disparity between these samples in these measures of variability indicates that population genetic analysis on the raw genotype data is ill advised. Given the knowledge of the discovery samples, we perform an ascertainment correction and show how the post-correction data are more consistent across these studies. However, discrepancies persist, suggesting that the heterogeneity in the SNP discovery process of the HapMap project resulted in a data set resistant to complete ascertainment correction. Ascertainment bias will likely erode the power of tests of association between SNPs and complex disorders, but the effect will likely be small, and perhaps more importantly, it is unlikely that the bias will introduce false-positive inferences.
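
For readers unfamiliar with the two summary statistics named above, the sketch below computes expected heterozygosity and a simple two-population F_ST for one biallelic SNP. It assumes equal sample sizes in both populations and uses the textbook definition F_ST = (H_T - H_S) / H_T; the abstract does not specify which estimator the authors used.

```python
def heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1 - p) for a biallelic site."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must lie in [0, 1]")
    return 2.0 * p * (1.0 - p)


def fst(p1: float, p2: float) -> float:
    """Two-population F_ST from allele frequencies p1 and p2."""
    h_total = heterozygosity((p1 + p2) / 2.0)   # pooled population
    if h_total == 0.0:
        return 0.0                               # site fixed everywhere
    h_within = (heterozygosity(p1) + heterozygosity(p2)) / 2.0
    return (h_total - h_within) / h_total
```

Because SNPs found in a small discovery panel tend to have intermediate frequencies, both quantities are sensitive to how the SNPs were ascertained, which is the bias the abstract examines.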

17.
Genotyping costs still preclude analysis of a comprehensive SNP map in thousands of individual subjects in the search for disease susceptibility loci. Allele frequency estimation in DNA pools from cases and controls offers a partial solution, but variance in these estimates will result in some loss of statistical power. However, there has been no systematic attempt to quantify the several sources of error in previous studies. We report an analysis of the magnitude of variance components of each experimental stage in DNA pooling studies, and find that a design based on the formation of numerous small pools of approximately 50 individuals is superior to the formation of fewer, larger pools and the replication of any of the experimental stages. We conclude that this approach may retain an effective sample size greater than 68% of the true sample size, whilst offering a 60-fold reduction in DNA usage and a greater than 30-fold saving in cost, compared to individual genotyping. The possibility of combining pooling with informed selection of haplotype tag SNPs is also considered. In this way further savings in efficiency may be possible by using pooled allele frequency estimates to infer haplotype frequencies and hence, allele frequencies at untyped markers.

19.
Madsen BE, Villesen P, Wiuf C. Genome Research, 2007, 17(10): 1414-1419
By surveying a filtered, high-quality set of SNPs in the human genome, we have found that SNPs positioned 1, 2, 4, 6, or 8 bp apart are more frequent than SNPs positioned 3, 5, 7, or 9 bp apart. The observed pattern is not restricted to genomic regions that are known to cause sequencing or alignment errors, for example, transposable elements (SINE, LINE, and LTR), tandem repeats, and large duplicated regions. However, we found that the pattern is almost entirely confined to what we define as "periodic DNA." Periodic DNA is a genomic region with a high degree of periodicity in nucleotide usage. It turned out that periodic DNA is mainly small regions (average length 16.9 bp), widely distributed in the genome. Furthermore, periodic DNA has a 1.8 times higher SNP density than the rest of the genome and SNPs inside periodic DNA have a significantly higher genotyping error rate than SNPs outside periodic DNA. Our results suggest that not all SNPs in the human genome are created by independent single nucleotide mutations, and that care should be taken in analysis of SNPs from periodic DNA. The latter may have important consequences for SNP and association studies.
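
A spacing tally of the kind described above can be sketched in a few lines. For simplicity this version counts only the gaps between neighbouring SNPs on one chromosome, which is an assumption; the abstract does not say whether the authors counted adjacent pairs or all pairs within a distance. The function name and example positions are illustrative.

```python
from collections import Counter
from typing import Iterable


def spacing_counts(positions: Iterable[int], max_gap: int = 9) -> Counter:
    """Count distances (in bp) between neighbouring SNPs, up to max_gap."""
    ordered = sorted(set(positions))
    gaps = (right - left for left, right in zip(ordered, ordered[1:]))
    return Counter(gap for gap in gaps if gap <= max_gap)
```

For example, `spacing_counts([100, 102, 104, 110, 111, 150])` records two gaps of 2 bp, one of 6 bp and one of 1 bp, and ignores the 39-bp gap.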

20.
Recent studies have revealed that linkage disequilibrium (LD) patterns vary across the human genome with some regions of high LD interspersed by regions of low LD. A small fraction of SNPs (tag SNPs) is sufficient to capture most of the haplotype structure of the human genome. In this paper, we develop a method to partition haplotypes into blocks and to identify tag SNPs based on genotype data by combining a dynamic programming algorithm for haplotype block partitioning and tag SNP selection based on haplotype data with a variation of the expectation maximization (EM) algorithm for haplotype inference. We assess the effects of using either haplotype or genotype data in haplotype block identification and tag SNP selection as a function of several factors, including sample size, density or number of SNPs studied, allele frequencies, fraction of missing data, and genotyping error rate, using extensive simulations. We find that a modest number of haplotype or genotype samples will result in consistent block partitions and tag SNP selection. The power of association studies based on tag SNPs using genotype data is similar to that using haplotype data.
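
The dynamic programming and EM procedures of this paper are not reproduced here. As a much simpler illustration of the quantity that tag SNP selection builds on, the sketch below computes the pairwise LD measure r² between two biallelic SNPs from phased haplotypes. The 0/1 allele coding and the function name are assumptions made for the example.

```python
from typing import Sequence, Tuple


def ld_r_squared(haplotypes: Sequence[Tuple[int, int]]) -> float:
    """Pairwise LD (r squared) between two SNPs.

    Each haplotype is a pair (allele at SNP 1, allele at SNP 2), coded 0 or 1.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haplotypes) / n                # freq of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n                # freq of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                                   # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom
```

An r² near 1 means one SNP predicts the other almost perfectly, so only one of the two needs to be genotyped; an r² near 0 means neither can stand in for the other.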
