Similar literature
20 similar articles found; search time 15 ms
1.
The probabilities that two individuals share 0, 1, or 2 alleles identical by descent (IBD) at a given genotyped marker locus are quantities of fundamental importance for disease gene and quantitative trait mapping and in family-based tests of association. Until recently, genotyped markers were sufficiently sparse that founder haplotypes could be modelled as having been drawn from a population in linkage equilibrium for the purpose of estimating IBD probabilities. However, with the advent of high-throughput single nucleotide polymorphism genotyping assays, this is no longer a reasonable assumption. Indeed, the imminent arrival of individual sequencing will enable high-density single nucleotide polymorphism genotyping on a scale for which current algorithms are not equipped. In this paper, we present a simple new model in which founder haplotypes are modelled as a Markov chain. Another important innovation is that genotyping errors are explicitly incorporated into the model. We compare results obtained using the new model to those obtained using the popular genetic linkage analysis package Merlin, with and without using the cluster model of linkage disequilibrium that is incorporated into that program. We find that the new model results in accuracy approaching that of Merlin with haplotype blocks, but achieves this with orders of magnitude faster run times. Moreover, the new algorithm scales linearly with number of markers, irrespective of density, whereas Merlin scales supralinearly. We also confirm a previous finding that ignoring linkage disequilibrium in founder haplotypes can cause errors in the calculation of IBD probabilities.
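As a rough illustration of the modelling idea in this abstract, the sketch below contrasts the probability of a founder haplotype under linkage equilibrium (a product of allele frequencies) with its probability under a first-order Markov chain along the markers. The abstract does not state the order or parameterisation of the chain, so the first-order form, the function names, and the toy numbers are all assumptions; genotyping error is not modelled here.

```python
from math import prod


def prob_linkage_equilibrium(hap, freqs):
    """P(haplotype) when alleles at different markers are independent.

    hap   : sequence of alleles (0 or 1), one per marker
    freqs : freqs[k][a] = frequency of allele a at marker k
    """
    return prod(freqs[k][a] for k, a in enumerate(hap))


def prob_markov_chain(hap, init, trans):
    """P(haplotype) under an assumed first-order Markov chain.

    init  : init[a] = P(allele a at the first marker)
    trans : trans[k][a][b] = P(allele b at marker k+1 | allele a at marker k)
    """
    p = init[hap[0]]
    for k in range(len(hap) - 1):
        p *= trans[k][hap[k]][hap[k + 1]]
    return p


# Toy example (hypothetical values): three biallelic markers with
# strong dependence between neighbouring markers.
init = [0.6, 0.4]
trans = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
]
print(prob_markov_chain([0, 0, 1], init, trans))
```

When every row of each transition matrix equals the allele frequencies at the next marker, the chain reduces to the linkage-equilibrium product, which is one way to see the older assumption as a special case.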

2.
The proportion of the genome that is shared identical by descent (IBD) between pairs of individuals is often estimated in studies involving genome‐wide SNP data. These estimates can be used to check pedigrees, estimate heritability, and adjust association analyses. We focus on the method of moments technique as implemented in PLINK [Purcell et al., 2007] and other software that estimates the proportions of the genome at which two individuals share 0, 1, or 2 alleles IBD. This technique is based on the assumption that the study sample is drawn from a single, homogeneous, randomly mating population. This assumption is violated if pedigree founders are drawn from multiple populations or include admixed individuals. In the presence of population structure, the method of moments estimator has an inflated variance and can be biased because it relies on sample‐based allele frequency estimates. In the case of the PLINK estimator, which truncates genome‐wide sharing estimates at zero and one to generate biologically interpretable results, the bias is most often towards over‐estimation of relatedness between ancestrally similar individuals. Using simulated pedigrees, we are able to demonstrate and quantify the behavior of the PLINK method of moments estimator under different population structure conditions. We also propose a simple method based on SNP pruning for improving genome‐wide IBD estimates when the assumption of a single, homogeneous population is violated.

3.
We performed Haseman-Elston regression on a set of bipolar pedigrees using each of three dependent variables: a binary trait indicating disease concordance or discordance, a binary trait adjusted for age-of-onset, and the residuals from a survival analysis. The latter two methods, which both adjust for age-of-onset, gave smaller p-values when previous analyses suggested linkage between disease and marker, but not when previous analyses were not suggestive of linkage. © 1997 Wiley-Liss, Inc.

4.
Maximizing association statistics over genetic models   (Total citations: 1; self-citations: 0; citations by others: 1)
The assessment of the association between a candidate locus and a disease may require the assumption of an inheritance model. Most researchers select the additive model and test the association with the Cochran-Armitage trend test. This test assumes a dose-response effect with regard to the number of copies of the variant allele. However, if there is reason to expect dominance or recessiveness in the effect of the variant allele, the heterozygous genotype may be grouped with one of the two homozygous genotypes, depending on the inheritance model, and a simple test on the 2 x 2 table can be used to assess independence. When the underlying genetic model is unknown, association may be assessed using the max-statistic, which selects the largest test statistic from the dominant, recessive and additive models. The statistical significance of the max-statistic has been previously addressed using permutation or Monte Carlo simulation approaches. We aimed to provide simpler alternatives to the max-test to make it feasible in large-scale association studies. Our simulations show that this procedure has an effective number of tests of 2.2, which can be used to correct the significance level or P-values. We also derive the asymptotic distribution of the max-statistic, which leads to a simple way to calculate the significance level and allows the derivation of a formula for power calculations in the design of studies that plan to use the max-statistic. A simulation study shows that the use of the max-statistic is a powerful approach that provides a safeguard against model uncertainty.
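The max-statistic described in this abstract can be sketched as follows: compute a Cochran-Armitage trend statistic under dominant, additive, and recessive genotype scores, take the largest absolute value, and correct the smallest P-value by the reported effective number of tests (2.2). This is only a minimal sketch under stated assumptions: the score vectors, the function names, and the Bonferroni-style use of 2.2 as a multiplier are one plausible reading of the abstract, not the authors' published procedure or their asymptotic formula.

```python
import math

# Assumed genotype scores for 0, 1, 2 copies of the variant allele.
MODEL_SCORES = {
    "dominant": (0.0, 1.0, 1.0),
    "additive": (0.0, 0.5, 1.0),
    "recessive": (0.0, 0.0, 1.0),
}


def trend_z(cases, controls, scores):
    """Cochran-Armitage trend statistic, approximately N(0, 1) under the null.

    cases, controls : genotype counts for 0, 1, 2 variant alleles
    scores          : one score per genotype
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    totals = [r + s for r, s in zip(cases, controls)]
    t = sum(x * (r * S - s * R) for x, r, s in zip(scores, cases, controls))
    spread = (N * sum(x * x * n for x, n in zip(scores, totals))
              - sum(x * n for x, n in zip(scores, totals)) ** 2)
    if R == 0 or S == 0 or spread <= 0:
        raise ValueError("table is degenerate for this scoring")
    return t * math.sqrt(N) / math.sqrt(R * S * spread)


def max_statistic(cases, controls):
    """Return (best model, max |Z|, all Z values) over the three models."""
    zs = {m: trend_z(cases, controls, x) for m, x in MODEL_SCORES.items()}
    best = max(zs, key=lambda m: abs(zs[m]))
    return best, abs(zs[best]), zs


def adjusted_p(max_z, effective_tests=2.2):
    """Two-sided normal P-value scaled by the effective number of tests.

    Treating 2.2 as a Bonferroni-style multiplier is an assumption.
    """
    p = math.erfc(max_z / math.sqrt(2.0))
    return min(1.0, effective_tests * p)


# Hypothetical genotype counts, for illustration only.
model, z, _ = max_statistic(cases=(10, 20, 30), controls=(30, 20, 10))
print(model, z, adjusted_p(z))
```

Multiplying by 2.2 instead of 3 reflects the positive correlation among the three model-specific statistics, which is why the correction is milder than a full Bonferroni adjustment.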

5.
The number of identical deleterious mutations present in a population may become very large, depending on the combined effect of genetic drift, population growth and limited negative selection. The distribution of the length of the shared area between two random chromosomes carrying the mutations has been investigated for a number of generations varying from 20-100 since introduction. The consequences for investigations using association and haplotype sharing methods are discussed. © 1997 Wiley-Liss, Inc.

6.
Genetic Analysis Workshop II: sib pair screening tests for linkage   (Total citations: 4; self-citations: 0; citations by others: 4)
For each marker locus and for every pair of sibs with data available in the 1983 workshop data, the proportion of genes identical by descent was estimated. The mean proportions were compared between concordant and discordant sib pairs, and the mean proportion for concordantly affected pairs was compared with one half. Together with standard tests of association, these were found to be sensitive screening tests for linkage.
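The comparison of the mean IBD proportion in concordantly affected pairs with one half can be sketched as a one-sample test. The abstract does not say which test statistic was used, so the t-type statistic below, the function name, and the input values are assumptions for illustration only.

```python
import math
import statistics


def mean_ibd_test(ibd_props, null_mean=0.5):
    """One-sample t-type statistic for mean IBD sharing against a null value.

    ibd_props : estimated proportion of genes shared IBD, one per sib pair
    Returns (sample mean, t statistic, degrees of freedom).
    """
    n = len(ibd_props)
    if n < 2:
        raise ValueError("need at least two sib pairs")
    mean = statistics.fmean(ibd_props)
    sd = statistics.stdev(ibd_props)
    if sd == 0:
        raise ValueError("no variation in the IBD estimates")
    t = (mean - null_mean) / (sd / math.sqrt(n))
    return mean, t, n - 1


# Hypothetical IBD estimates for four affected sib pairs at one marker.
print(mean_ibd_test([0.75, 0.75, 1.0, 0.5]))
```

Under linkage, affected pairs are expected to share more than half of their genes IBD at the marker, so a one-sided comparison (large positive t) is the natural screening direction.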

7.
The methods proposed by Haseman and Elston [1972] were used to estimate the proportion of genes identical by descent shared by each sib pair at each locus. These estimates were then used as a basis for obtaining all possible locus-locus correlations. The 12 significant correlations (P < .01) and their rank order indicated the correct linkage groups and the order of the loci.

8.
Recently, Liang et al. ([2001] Hum. Hered. 51:64-78) proposed a general multipoint linkage method for estimating the chromosomal position of a putative susceptibility locus. Their technique is computationally simple and does not require specification of penetrance or a mode of inheritance. In complex genetic diseases, covariate data may be available which reflect etiologic or locus heterogeneity. We developed approaches to incorporating covariates into the method of Liang et al. ([2001] Hum. Hered. 51:64-78) with particular attention to exploiting age-at-onset information. The results of simulation studies, and a worked data example using a family data set ascertained through probands with schizophrenia, suggest that utilizing covariate information can yield substantial efficiency gains in localizing susceptibility genes.

9.
Recently, Wen and Stephens (Wen and Stephens [2010] Ann Appl Stat 4(3):1158–1182) proposed a linear predictor, called BLIMP, that uses conditional multivariate normal moments to impute genotypes with accuracy similar to current state‐of‐the‐art methods. One novelty is that it regularized the estimated covariance matrix based on a model from population genetics. We extended multivariate moments to impute genotypes in pedigrees. Our proposed method, PedBLIMP, utilizes both the linkage‐disequilibrium (LD) information estimated from external panel data and the pedigree structure or identity‐by‐descent (IBD) information. The proposed method was evaluated on a pedigree design where some individuals were genotyped with dense markers and the rest with sparse markers. We found that incorporating the pedigree/IBD information can improve imputation accuracy compared to BLIMP. Because rare variants usually have low LD with other single‐nucleotide polymorphisms (SNPs), incorporating pedigree/IBD information largely improved imputation accuracy for rare variants. We also compared PedBLIMP with IMPUTE2 and GIGI. Results show that when sparse markers are in a certain density range, our method can outperform both IMPUTE2 and GIGI.

10.
In this paper, we propose a multipoint method to assess evidence of linkage to one region by incorporating linkage evidence from another region. This approach uses affected sib pairs in which the number of alleles shared identical by descent (IBD) is the primary statistic. This generalized estimating equation (GEE) approach is robust in that no assumption about the mode of inheritance is required, other than assuming the two regions being considered are unlinked and that there is no more than one susceptibility gene in each region. The method proposed here uses data from all available families to simultaneously test the hypothesis of statistical interaction between regions and to estimate the location of the susceptibility gene in the target region. As an illustration, we have applied this GEE method to an asthma sib pair study (Wjst et al. [1999] Genomics 58:1-8), which earlier reported evidence of linkage to chromosome 6 but showed no evidence for chromosome 20. Our results yield strong evidence of linkage to chromosome 20 (P value = 0.0001) after incorporating linkage information from chromosome 6. Furthermore, the method estimates with 95% certainty that the map location of the susceptibility gene is flanked by markers D20S186 and D20S101, which are approximately 16.3 cM apart.

11.
Recently, Liang et al. ([2001b] Genet. Epidemiol. 21:105-122) proposed a conditional approach to assess linkage evidence on the target region by incorporating linkage information from an unlinked (reference) region using alleles shared identical by descent (IBD) from affected sib pairs. This is carried out by conditioning on the IBD sharing value at the estimated trait locus of the reference region. Since the markers considered are typically not fully informative, the IBD sharing at each marker needs to be estimated (or imputed). In this report, we propose an alternative approach to deal with the IBD sharing in the reference region. This new approach makes full use of the observed data without having to categorize the imputed IBD sharing as needed in Liang et al. ([2001b] Genet. Epidemiol. 21:105-122). We compare these two approaches by simulating data from a variety of two-locus models including heterogeneity, additive and multiplicative with either fully informative markers or non-fully informative markers. The performance of both approaches is quite comparable, showing consistent estimates of the trait locus and key genetic parameters.

12.
For diseases with complex genetic etiology, more than one susceptibility gene may exist in a single chromosomal region. Extending the work of Liang et al. ([2001] Hum. Hered. 51:64-78), we developed a method for simultaneous localization of two susceptibility genes in one region. We derived an expression for expected allele sharing of an affected sib pair (ASP) at each point across a chromosomal segment containing two susceptibility genes. Using generalized estimating equations (GEE), we developed an algorithm that uses marker identical-by-descent (IBD) sharing in affected sib pairs to simultaneously estimate the locations of the two genes and the mean IBD sharing in ASPs at these two disease loci. Confidence intervals for gene locations can be constructed based on large sample approximations. Application of the described methods to data from a genome scan for type 1 diabetes (Mein et al. [1998] Nat. Genet. 19:297-300) yielded estimates of two putative disease gene locations on chromosome 6, approximately 20 cM apart. Properties of the estimators, including bias, precision, and confidence interval coverage, were studied by simulation for a range of genetic models. The simulations demonstrated that the proposed method can improve disease gene localization and aid in resolving large peaks when two disease genes are present in one chromosomal region. Joint localization of two disease genes improves with increased excess allele sharing at the disease gene loci, increased distance between the disease genes, and increased number of affected sib pairs in the sample.

13.
Estimates of relatedness have several applications such as the identification of relatives or in identifying disease related genes through identity by descent (IBD) mapping. Here we present a new method for identifying IBD tracts among individuals from genome‐wide single nucleotide polymorphisms data. We use a continuous time Markov model where the hidden states are the number of alleles shared IBD between pairs of individuals at a given position. In contrast to previous methods, our method accurately accounts for linkage disequilibrium using pairwise haplotype probabilities. The method provides a map of the local relatedness along the genome. We illustrate the potential of the method for mapping disease genes on a real data set, and show that the method has the potential to map causative disease mutations using only a handful of affected individuals. The new IBD mapping method provides considerable improvement in mapping power in natural populations compared to standard association mapping methods. Genet. Epidemiol. 2009. © 2008 Wiley‐Liss, Inc.

14.
The aim of this study was to compare, under different models of gene-environment (G x E) interaction, the power to detect linkage and G x E interaction of different tests using affected sib-pairs. Methods considered were: 1) the maximum likelihood lod-score (MLS), based on the distribution of parental alleles identical by descent (IBD) in affected sibs; 2) the sum of the MLS (sMLS) calculated in affected sib-pairs with 2, 1, or 0 sibs exposed; 3) the predivided sample test (PST), which compares the IBD distribution between affected sib-pairs with 2, 1, or 0 sibs exposed; 4) the triangle test statistic (TTS), which uses the IBD distribution among discordant affected sib-pairs (one exposed, one unexposed); and 5) the mean interaction test (MIT), based on the regression of the proportion of alleles shared IBD among affected sib-pairs on the exposure among sib-pairs. The MLS, sMLS, and MIT allow detection of linkage. However, the sMLS and MIT account for a possible G x E interaction without testing it. In contrast, the PST and the TTS allow detection of both linkage and G x E interaction. Results showed that when exposure cancels the effect of the gene, or changes the direction of this effect (i.e., the protective allele becomes the risk allele), the PST, sMLS, and MIT may provide, under some models, greater power to detect linkage than the MLS. Under models where exposure changes the direction of the effect of the gene, the TTS test may also be more powerful than the other tests accounting for G x E interaction. Under the other models, the MLS remains the most powerful test to detect linkage. However, only the PST and TTS allow the detection of G x E interaction.

15.
16.
It is 100 years since R. A. Fisher proposed that a Mendelian model of genetic variant effects, additive over loci, could explain the patterns of observed phenotypic correlations between relatives. His loci were hypothetical and his model theoretical. It is only about 50 years since the first genetic markers allowed the detection of even variants with major effects on phenotype, and only 20 years since the development of single-nucleotide polymorphism technology provided dense markers over the genome. Then both mappings in defined pedigrees and population-based genome-wide association study samples allowed the detection of multiple contributing variants of smaller effect. Finally, with methods based on genotypic correlations between individuals, or on allelic associations between loci, the additive heritability contributions of the genome can be estimated from large population samples. In this review we trace, from 1918 to 2018, the analysis of observed phenotypic correlations between relatives to estimate underlying genetic components of traits in human populations. As with studies from 1918 onward, we use height as the example trait where not only data are readily available, but where Fisher's model of large numbers of variants of infinitesimal effect appears to provide a good approximation to reality. However, we also trace the use of phenotypic and genotypic correlations between relatives in mapping causal variants and resolving genetic contributions to more complex human traits. With the availability of DNA sequence data, we can hope to not only estimate the total genetic contribution to a trait, but to resolve effects of individual genetic variants on biological function.

17.
The Cochran-Armitage trend test has been used in case-control studies for testing genetic association. As the variance of the test statistic is a function of unknown parameters, e.g. disease prevalence and allele frequency, it must be estimated. The usual estimator combining data for cases and controls assumes they follow the same distribution under the null hypothesis. Under the alternative hypothesis, however, the cases and controls follow different distributions. Thus, the power of the trend tests may be affected by the variance estimator used. In particular, the usual method combining both cases and controls is not an asymptotically unbiased estimator of the null variance when the alternative is true. Two different estimates of the null variance are available which are consistent under both the null and alternative hypotheses. In this paper, we examine sample size and small sample power performance of trend tests, which are optimal for three common genetic models as well as a robust trend test based on the three estimates of the variance and provide guidelines for choosing an appropriate test.

18.
For diseases with complex genetic etiology, more than one susceptibility gene may exist in a single chromosomal region. Under explicit assumptions about the number of disease genes in a region, generalized estimating equations (GEE) can be used to estimate the putative disease gene location(s) and expected identical-by-descent allele sharing in affected sib pairs at these gene(s). Extending the work of Liang et al., earlier work developed a method for simultaneous localization of two susceptibility genes in one region using marker identical-by-descent (IBD) sharing in affected sib pairs. Here we propose methods to evaluate the evidence for two versus one disease loci in a region in a quasi-likelihood/GEE framework. We describe tests based on approximate quasi-likelihood ratio and generalized score test statistics. Because of difficulties in determining the asymptotic null distributions of these statistics and the small sample sizes that can be available in genetic studies, we recommend that significance be evaluated empirically. Application of the described methods to data from a genome scan for type 1 diabetes yielded some evidence for two linked disease genes on chromosome 6, approximately 20 cM apart (p value for an approximate quasi-likelihood ratio test = 0.049). In simulation studies, we found that both tests performed quite well for a range of scenarios. Power to detect the presence of two linked disease genes increased with the number of affected sib pairs, greater IBD sharing at the two loci, and larger distance between the two loci.

19.
This report describes a retrospective exposure assessment method used in a follow-up mortality study of workers exposed to benzene. The approach quantified historical exposure to benzene in a multi-industry, multicenter cohort, involving 672 factories in 12 cities in China. Historical exposure data were collected to obtain exposure information related to 1,427 work units (departments) and 3,179 unique job titles from benzene-producing or -using factories in which written records and other data sources were evaluated. The basic unit for exposure assessment was a factory/work unit/job title combination which was considered separately during each of seven calendar-year time periods between 1949 and 1987 for a total of 18,435 exposure assignments. Historical information collected to estimate exposure included benzene monitoring data; lists of raw materials and factory products, and the percentage of benzene in each; the total amount and dates of use of benzene or benzene-containing materials; use of engineering controls and personal protective equipment; and other available exposure information. Overall, 38% (ranging from 3% for the earliest periods to 67% for the last period) of the estimates were based primarily on benzene monitoring data. In the absence of job-specific benzene monitoring data for a given calendar period, measurement results or exposure estimates for similar jobs and/or other calendar periods were used in conjunction with other exposure information to derive estimates. Estimated exposure levels are presented by industries and occupations. The highest average exposures during 1949–1987 were observed for the rubber and plastic industry (30.7 ppm), and for rubber glue applicators (52.6 ppm).

20.
The antigen/allele genotype frequencies among patients (AGFAP) method has been powerful in discriminating between modes of inheritance, and detecting heterogeneity effects, for a number of diseases associated with the HLA system. The method is not dependent on the high level of polymorphism seen in the HLA system, but does require a marker allele association with disease. With recent rapid advances in mapping of the human genome, the method is increasingly relevant in all disease studies. Extension of the AGFAP method to ascertainment schemes other than random sampling of patients is presented here. The method is shown to be robust for distinguishing between incompletely penetrant recessive vs. additive or dominant models if affected children are obtained from nuclear families selected on the basis of at least two affected members: two affected sibs, or an affected parent and affected child. The method can lead to false conclusions for data from families ascertained for at least one affected parent and two affected children. A new test, termed the parental contributions test, applicable in families selected for the presence of an affected parent, and one or more affected children, is presented. The test, based on the expected symmetry (recessive) vs. asymmetry (additive and dominant) of parental marker allele contributions to an affected offspring in these pedigrees, is powerful in distinguishing between these modes of inheritance when there is a marker allele association with disease. Sporadic cases of disease are shown to cause deviations from AGFAP expectations for the recessive model, but not for the additive model. These results will aid in study of the genetics, and hence molecular basis, of complex diseases. © 1993 Wiley-Liss, Inc.
