Similar articles
20 similar articles found.
1.
For diseases with complex genetic etiology, more than one susceptibility gene may exist in a single chromosomal region. Extending the work of Liang et al. ([2001] Hum. Hered. 51:64-78), we developed a method for simultaneous localization of two susceptibility genes in one region. We derived an expression for expected allele sharing of an affected sib pair (ASP) at each point across a chromosomal segment containing two susceptibility genes. Using generalized estimating equations (GEE), we developed an algorithm that uses marker identical-by-descent (IBD) sharing in affected sib pairs to simultaneously estimate the locations of the two genes and the mean IBD sharing in ASPs at these two disease loci. Confidence intervals for gene locations can be constructed based on large sample approximations. Application of the described methods to data from a genome scan for type 1 diabetes (Mein et al. [1998] Nat. Genet. 19:297-300) yielded estimates of two putative disease gene locations on chromosome 6, approximately 20 cM apart. Properties of the estimators, including bias, precision, and confidence interval coverage, were studied by simulation for a range of genetic models. The simulations demonstrated that the proposed method can improve disease gene localization and aid in resolving large peaks when two disease genes are present in one chromosomal region. Joint localization of two disease genes improves with increased excess allele sharing at the disease gene loci, increased distance between the disease genes, and increased number of affected sib pairs in the sample.
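
The two-locus sharing expression derived in this abstract is not reproduced on this page. As a hedged illustration of the underlying idea, the sketch below computes the expected IBD sharing of an affected sib pair along a chromosome for the simpler single-locus case, assuming Haldane's map function (no interference). The function names, the parameter `c` (excess sharing at the trait locus), and the positions used are hypothetical and chosen only for illustration.

```python
import math


def haldane_theta(distance_cm: float) -> float:
    """Recombination fraction for a map distance in cM (Haldane's map function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * abs(distance_cm) / 100.0))


def expected_ibd_sharing(t_cm: float, tau_cm: float, c: float) -> float:
    """Expected number of alleles (0 to 2) shared IBD by an ASP at position t_cm.

    Assumes ONE trait locus at tau_cm; c = E[S(tau) | ASP] - 1 is the
    excess sharing at that locus. Sharing decays toward the null value 1
    as the marker moves away from the trait locus.
    """
    theta = haldane_theta(t_cm - tau_cm)
    return 1.0 + (1.0 - 2.0 * theta) ** 2 * c


if __name__ == "__main__":
    # Hypothetical trait locus at 50 cM with excess sharing 0.3.
    for pos in (30, 40, 50, 60, 70):
        print(pos, round(expected_ibd_sharing(pos, 50.0, 0.3), 4))
```

The sharing curve peaks at the trait locus and falls off symmetrically, which is why two nearby disease genes can merge into one broad peak that a single-locus model resolves poorly.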

2.
In some genetic association studies, samples contain both parental and unrelated controls. Under such scenarios, instead of analyzing only trios using family-based association tests or only unrelated subjects using a case-control study design, Nagelkerke et al. ([2004] Eur. J. Hum. Genet. 12:964-970) and Epstein et al. ([2005] Am. J. Hum. Genet. 76:592-608) proposed methods that implemented a likelihood ratio test to combine the two different types of data. In this article, we put forward a more powerful and simplified strategy to combine trios with unrelated subjects based on the haplotype relative risk (HRR) (Falk and Rubinstein [1987] Ann. Hum. Genet. 51:227-233). The HRR compares parental marker alleles transmitted to an affected offspring to those not transmitted as a test for association, a strategy that is similar to a case-control study that compares allele frequencies in diseased cases to those of unrelated controls. We prove that affected offspring can be pooled with diseased cases and that parental controls can be treated as unrelated controls when the trios and unrelated subjects are randomly sampled from the same population. Therefore, unrelated subjects can be incorporated into the HRR intuitively and effortlessly. For trios without complete parental genotypes, we adopted the strategy proposed by Guo et al. ([2005a] BMC Genet. 6:S90; [2005b] Hum. Hered. 59:125-135), which is more feasible than the one proposed by Weinberg ([1999] Am. J. Hum. Genet. 64:1186-1193). In addition, simulation results suggest that the combined haplotype relative risk is more powerful than Epstein et al.'s method regardless of the disease prevalence in a homogeneous population.
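
The pooling idea described above can be sketched as a single 2x2 allele-count table: transmitted alleles from trios are added to case alleles, and non-transmitted alleles are added to control alleles. The sketch below uses an ordinary Pearson chi-square on allele counts, which is an assumption made for illustration; the abstract does not give the exact form of the combined statistic, and the function name and counts are hypothetical.

```python
def pooled_allele_chi_square(transmitted, nontransmitted, cases, controls):
    """Pearson chi-square (1 df) on a pooled 2x2 allele-count table.

    Each argument is a pair (count_of_marker_allele, count_of_other_alleles).
    Row 1 pools transmitted alleles with case alleles; row 2 pools
    non-transmitted alleles with control alleles.
    """
    a = transmitted[0] + cases[0]
    b = transmitted[1] + cases[1]
    c = nontransmitted[0] + controls[0]
    d = nontransmitted[1] + controls[1]
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column of the pooled table is empty")
    return n * (a * d - b * c) ** 2 / denom


if __name__ == "__main__":
    # Hypothetical counts: 50 trios (100 parental alleles each way)
    # plus 50 cases and 50 controls (100 alleles each).
    print(pooled_allele_chi_square((30, 70), (20, 80), (30, 70), (20, 80)))
```

Pooling is only justified under the condition stated in the abstract, namely that trios and unrelated subjects are sampled from the same population.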

3.
Based on the symmetry of transmitted/nontransmitted alleles from heterozygous parents under the null hypothesis of no association, the work proposed here establishes a general statistical framework for constructing association tests with data from nuclear families with multiple affected children. A class of association tests is proposed for both diallelic and multiallelic markers. The proposed test statistics reduce to the transmission disequilibrium test for trios, to T(su) by Martin et al. ([1997] Am. J. Hum. Genet. 61:439-448) for affected sib pairs, and to the pedigree disequilibrium test by Martin et al. ([2000] Am. J. Hum. Genet. 67:146-154; [2001] Am. J. Hum. Genet. 68:1065-1067) when using affected sibships only. The association test used in simulation and for real data (sitosterolemia) is the one which has the best overall power in detecting association. This association test is generally more powerful than the association tests proposed by Martin et al. ([2000] Am. J. Hum. Genet. 67:146-154; [2001] Am. J. Hum. Genet. 68:1065-1067) when using only affected sibships. For the sitosterolemia data set, the association test has its most significant result (P-value=0.0012) for the marker locus on the same bacterial artificial chromosome as the disease locus.
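
For trios, the framework above reduces to the transmission disequilibrium test (TDT). A minimal sketch of the standard diallelic TDT follows; it is the well-known McNemar-type statistic, not the new class of tests proposed in this abstract, and the function name and counts are hypothetical.

```python
import math


def tdt(b: int, c: int):
    """Diallelic TDT from heterozygous parents.

    b = number of heterozygous parents transmitting the test allele,
    c = number transmitting the other allele.
    Returns (chi-square statistic with 1 df, asymptotic P-value).
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    stat = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df via the complementary error function.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    stat, p = tdt(60, 40)
    print(stat, p)
```

Under the null hypothesis of no association, transmissions and non-transmissions are symmetric, so `b` and `c` should be roughly equal and the statistic close to zero.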

4.
Recently, Liang et al. ([2001b] Genet. Epidemiol. 21:105-122) proposed a conditional approach to assess linkage evidence on the target region by incorporating linkage information from an unlinked (reference) region using alleles shared identical by descent (IBD) in affected sib pairs. This is carried out by conditioning on the IBD sharing value at the estimated trait locus of the reference region. Since the markers considered are typically not fully informative, the IBD sharing at each marker needs to be estimated (or imputed). In this report, we propose an alternative approach to deal with the IBD sharing in the reference region. This new approach makes full use of the observed data without having to categorize the imputed IBD sharing as needed in Liang et al. ([2001b] Genet. Epidemiol. 21:105-122). We compare these two approaches by simulating data from a variety of two-locus models, including heterogeneity, additive, and multiplicative models, with either fully informative or non-fully informative markers. The performance of both approaches is quite comparable, showing consistent estimates of the trait locus and key genetic parameters.

5.
We consider the analysis of multiple single nucleotide polymorphisms (SNPs) within a gene or region. The simplest analysis of such data is based on a series of single SNP hypothesis tests, followed by correction for multiple testing, but it is intuitively plausible that a joint analysis of the SNPs will have higher power, particularly when the causal locus may not have been observed. However, standard tests, such as a likelihood ratio test based on an unrestricted alternative hypothesis, tend to have large numbers of degrees of freedom and hence low power. This has motivated a number of alternative test statistics. Here we compare several of the competing methods, including the multivariate score test (Hotelling's test) of Chapman et al. ([2003] Hum. Hered. 56:18-31), Fisher's method for combining P-values, the minimum P-value approach, a Fourier-transform-based approach recently suggested by Wang and Elston ([2007] Am. J. Human Genet. 80:353-360) and a Bayesian score statistic proposed for microarray data by Goeman et al. ([2005] J. R. Stat. Soc. B 68:477-493). Some relationships between these methods are pointed out, and simulation results given to show that the minimum P-value and the Goeman et al. ([2005] J. R. Stat. Soc. B 68:477-493) approaches work well over a range of scenarios. The Wang and Elston approach often performs poorly; we explain why, and show how its performance can be substantially improved.
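
Two of the methods named above, Fisher's method and the minimum P-value approach, are simple enough to sketch. The version below assumes the single-SNP tests are independent, which rarely holds for SNPs in linkage disequilibrium; in practice significance would more likely be assessed by permutation, and the abstract does not say which variant was used. The Šidák adjustment for the minimum P-value and all function names are assumptions for illustration.

```python
import math


def chi2_sf_even_df(x: float, k: int) -> float:
    """P(chi-square with 2k degrees of freedom > x), closed form for even df."""
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return math.exp(-half) * total


def fisher_combined_p(pvalues) -> float:
    """Fisher's method: -2 * sum(log p) is chi-square with 2k df under independence."""
    k = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)  # requires every p > 0
    return chi2_sf_even_df(stat, k)


def min_p_sidak(pvalues) -> float:
    """Minimum P-value with a Sidak adjustment (independence assumed)."""
    k = len(pvalues)
    return 1.0 - (1.0 - min(pvalues)) ** k


if __name__ == "__main__":
    ps = [0.05, 0.05]  # hypothetical single-SNP P-values
    print(fisher_combined_p(ps), min_p_sidak(ps))
```

Fisher's method rewards several moderately small P-values, while the minimum P-value approach responds to a single strong signal, which is one reason their relative power varies across scenarios.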

6.
In the presence of multiple data sets, an important issue is how to best measure the overall evidence for linkage across data sets. Previously, we advocated the use of the posterior probability of linkage (PPL) for this purpose [Vieland, Am J Hum Genet 63:947–54, 1998; Wang et al., Ann Hum Genet 64:533–53, 2000; Vieland et al., Hum Hered 51:199–208, 2001]. In this paper, we propose a critical modification of our earlier two‐point PPL in order to handle multiple‐point calculations. The proposed modification is then applied to the genome‐screen data sets and the COAG chromosome 5 data sets provided by GAW 12. We find linkage signals at the following locations (in order of signal strength): 45 cM on chromosome 6, 23 cM on chromosome 20, and 30 cM on chromosome 1. No linkage signal is found on chromosome 5. © 2001 Wiley‐Liss, Inc.

7.
Multipoint linkage analysis using sibpair designs remains a common approach to help investigators narrow chromosomal regions for traits (either qualitative or quantitative) of interest. Despite its popularity, the success of this approach depends heavily on how issues such as genetic heterogeneity, gene-gene, and gene-environment interactions are handled. If addressed properly, the likelihood of detecting genetic linkage and of efficiently estimating the location of the trait locus would be enhanced, sometimes drastically. Previously, we have proposed an approach to deal with these issues by modeling the genetic effect of the target trait locus as a function of covariates pertaining to the sibpairs. Here the genetic effect is simply the probability that a sibpair shares the same allele at the trait locus from their parents. Such modeling helps to divide the sibpairs into more homogeneous subgroups, which in turn helps to enhance the chance to detect linkage. One limitation of this approach is the need to categorize the covariates so that a small and fixed number of genetic effect parameters are introduced. In this report, we take advantage of the fact that nowadays multiple markers are readily available for genotyping simultaneously. This suggests that one could estimate the dependence of the genetic effect on the covariates nonparametrically. We present an iterative procedure to estimate (1) the genetic effect nonparametrically and (2) the location of the trait locus through estimating functions developed by Liang et al. ([2001a] Hum Hered 51:67-76). We apply this new method to the linkage study of schizophrenia to illustrate how the onset ages of each sibpair may help to address the issue of genetic heterogeneity. This analysis sheds new light on the dependence of the trait effect on onset ages from affected sibpairs, an observation not revealed previously. In addition, we have carried out some simulation work, which suggests that this method provides accurate inference for estimating the location of quantitative trait loci.

8.
We consider two-stage case-control designs for testing associations between single nucleotide polymorphisms (SNPs) and disease, in which a subsample of subjects is used to select a panel of "tagging" SNPs that will be considered in the main study. We propose a pseudolikelihood [Pepe and Flemming, 1991: JASA 86:108-113] that combines the information from both the main study and the substudy to test the association with any polymorphism in the original set. SNP-tagging [Chapman et al., 2003: Hum Hered 56:18-31] and haplotype-tagging [Stram et al., 2003a; Hum Hered 55:27-36] approaches are compared. We show that the cost-efficiency of such a design for estimating the relative risk associated with the causal polymorphism can be considerably better than for a single-stage design, even if the causal polymorphism is not included in the tag-SNP set. We also consider the optimal selection of cases and controls in such designs and the relative efficiency for estimating the location of a causal variant in linkage disequilibrium mapping. Nevertheless, as the cost of high-volume genotyping plummets and haplotype tagging information from the International HapMap project [Gibbs et al., 2003; Nature 426:789-796] rapidly accumulates in public databases, such two-stage designs may soon become unnecessary.

9.
In a small region several marker loci may be associated with a trait, either because they directly influence the trait or because they are in linkage disequilibrium (LD) with a causal variant. Having established a potentially causal effect at a primary variant, we may ask if any other variants in the region appear to further contribute to the trait, indicating that the additional variant is either causal or is in LD with another causal locus. Methods of approaching this problem using case-parent trio data include the stepwise conditional logistic regression approach described by Cordell and Clayton ([2002] Am. J. Hum. Genet. 70:124-141), and a constrained-permutation method recently proposed by Spijker et al. ([2005] Ann. Hum. Genet. 69:90-101). Through simulation we demonstrate that the procedure described by Spijker et al. [2005], as well as unconditional logistic regression with "affected family-based controls" (AFBACs), can lead to inflated type 1 errors in situations when haplotypes are not inferable for all trios, whereas the conditional logistic regression approach gives correct significance levels. We propose an alternative to the permutation method of Spijker et al. [2005], which does not rely on haplotyping, and results in correct type 1 errors and potentially high power when assumptions of random mating, Hardy-Weinberg Equilibrium, and multiplicative effects of disease alleles are satisfied.

10.
Linkage disequilibrium (LD) or association studies using case-parent trios have become a common approach to locate unobserved susceptibility genes underlying complex diseases. With the availability of ever denser marker maps, how to utilize the information carried by multiple markers simultaneously remains challenging. Recently, Liang et al. ([2001a] Am. J. Hum. Genet. 68:937-950) proposed a multipoint LD method to estimate the location of a susceptibility gene within a framework map along with its sampling uncertainty. Two important features of this method are that 1) it uses all trios whether parents are heterozygous for a given marker or not, and 2) it provides a single test statistic for the null hypothesis of no linkage or no LD to the region, avoiding the multiple testing problem encountered when performing a separate transmission disequilibrium test (TDT) for each marker. In this paper, we discuss how this method can be expanded to address important issues pertaining to complex diseases in a unified fashion. These issues include, among others, gene-gene and gene-environment interactions, genetic heterogeneity, phenotypic refinement, and paternal vs. maternal transmission. We applied this method to asthmatic case-parent trios from the Collaborative Study on the Genetics of Asthma (CSGA), and found that the previous evidence for linkage and LD in a 13.6 cM region of chromosome 11 can be attributed to maternal transmission, while there was no evidence of excess paternal transmission. Furthermore, such discrepancy in preferential transmission was most evident among probands with early onset age (6 years old or younger).

11.
Nonrandom ascertainment is commonly used in genetic studies of rare diseases, since this design is often more convenient than the random-sampling design. When there is an underlying latent heterogeneity, Epstein et al. ([2002] Am. J. Hum. Genet. 70:886-895) showed that it is possible to get unbiased or consistent estimation of population parameters under ascertainment adjustment, but Glidden and Liang ([2002] Genet. Epidemiol. 23:201-208) showed in a simulation study that the resulting estimates are highly sensitive to misspecification of the latent components. To overcome this difficulty, we consider a heavy-tailed model for latent variables that allows a robust estimation of the parameters. We describe a hierarchical-likelihood approach that avoids the integration used in the standard marginal likelihood approach. We revisit and extend the previous simulation, and show that the resulting estimator is efficient and robust against misspecification of the distribution of latent variables.

12.
We consider three tests for genetic association in data from nuclear families (the Family-Based Association Test (FBAT) test proposed by Rabinowitz and Laird ([2000] Hum. Hered. 50:211-223), a second test proposed by Rabinowitz ([2002] J. Am. Stat. Assoc. 97:742-758), and the Family Genotype Analysis Program (FGAP) nonfounder or partial score test proposed by Clayton ([1999] Am. J. Hum. Genet. 65:1170-1177) and Whittemore and Tu ([2000] Am. J. Hum. Genet. 66:1329-1340)). We show that each test statistic arises from the efficient score of the family data as the solution to a set of constraints on its null expectation. Moreover, the FBAT and Rabinowitz tests (but not the FGAP test) are locally the most powerful among all tests satisfying their constraints. We used simulations to examine how the three tests perform in situations when their assumptions are violated and the number of families is not huge. We found that the FBAT test tended to have less power than the other two tests, particularly when applied to families in whom all offspring were affected. The Rabinowitz and FGAP tests performed similarly, although the latter tended to extract more information from families containing one typed parent. While none of the tests showed good power to detect rare, recessively acting genes, the Rabinowitz test with a sample variance estimate performed particularly poorly in this case. However, the Rabinowitz test with a model-based variance had power comparable to that of the FGAP test, and more accurate type I error rates. We conclude that for the situations we considered, the Rabinowitz test with model-based variance has good power without forfeiting robustness against misspecification of parental genotype probabilities. However, its utility is limited by the lack of a simple algorithm to apply it to families with varying structures and phenotypes.

13.
We have developed a method for jointly testing linkage and association using data from affected sib pairs and their parents. We specify a conditional logistic regression model with two covariates, one that quantifies association (either direct association or indirect association via linkage disequilibrium), and a second that quantifies linkage. The latter covariate is computed based on expected identity-by-descent (IBD) sharing of marker alleles between siblings. In addition to a joint test of linkage and association, our general framework can be used to obtain a linkage test comparable to the mean test (Blackwelder and Elston [1985] Genet. Epidemiol. 2:85-97), and an association test comparable to the Family-Based Association Test (FBAT; Rabinowitz and Laird [2000] Hum. Hered. 50:211-223). We present simulation results demonstrating that our joint test can be more powerful than some standard tests of linkage or association. For example, with a relative risk of 2.7 per variant allele at a disease locus, the estimated power to detect a nearby marker with a modest level of LD was 58.1% by the mean test (linkage only), 69.8% by FBAT, and 82.5% by our joint test of linkage and association. Our model can also be used to obtain tests of linkage conditional on association and association conditional on linkage, which can be helpful in fine mapping.

14.
Haplotypes of closely linked single-nucleotide polymorphisms (SNPs) potentially offer greater power than individual SNPs to detect association between genetic variants and disease. We present a novel approach for association mapping in which density-based clustering of haplotypes reduces the dimensionality of the general linear model (GLM)-based score test of association implemented in the HaploStats software (Schaid et al. [2002] Am. J. Hum. Genet. 70:425-434). A flexible haplotype similarity score, a generalization of previously used measures, forms the basis for grouping haplotypes of probable recent common ancestry. All haplotypes within a cluster are assigned the same regression coefficient within the GLM, and evidence for association is assessed with a score statistic. The approach is applicable to both binary and continuous trait data, and does not require prior phase information. Results of simulation studies demonstrated that clustering enhanced the power of the score test to detect association, under a variety of conditions, while preserving valid Type-I error. Improvement in performance was most dramatic in the presence of extreme haplotype diversity, while a slight improvement was observed even at low diversity. Our method also offers, for binary traits, a slight advantage in power over a similar approach based on an evolutionary model (Tzeng et al. [2006] Am. J. Hum. Genet. 78:231-242).

15.
Variance component models form a powerful and flexible tool for multipoint linkage analysis of quantitative traits. Estimates of genetic similarity are needed for the variance component model to detect linkage and to locate genes, and two methods are commonly used to calculate multipoint identity-by-descent (IBD) estimates for autosomes. Fulker et al. ([1995] Am. J. Hum. Genet. 56: 1229-1233) and Almasy and Blangero ([1998] Am. J. Hum. Genet. 62: 119-121) used multiple regression to estimate the IBD sharing along a chromosome, while the approach of Kruglyak and Lander ([1995] Am. J. Hum. Genet. 57: 439-454) is based on a hidden Markov model. In this paper, we modify the variance component model to accommodate sex-chromosomes, and we extend both multipoint IBD estimation methods to accommodate sex-linked loci. Simulation studies demonstrate the power and precision of the variance component model to detect QTLs located on the sex-chromosome. The two multipoint IBD estimation methods have the same accuracy to identify QTL position, but the hidden Markov model yields a larger average maximum LOD score to detect linkage than the regression model. The extension of the multipoint IBD estimation methods and the variance component model to the X chromosome shows that the variance component model is a powerful and flexible tool for linkage analysis of quantitative traits on both autosomes and sex-chromosomes.

16.
The concept of haplotype sharing (HS) has received considerable attention recently, and several haplotype association methods have been proposed. Here, we extend the work of Beckmann and colleagues [2005 Hum. Hered. 59:67-78] who derived an HS statistic (BHS) as a special case of Mantel's space-time clustering approach. The Mantel-type HS statistic correlates genetic similarity with phenotypic similarity across pairs of individuals. While phenotypic similarity is measured as the mean-corrected cross product of phenotypes, we propose to incorporate information on the underlying genetic model in the measurement of the genetic similarity. Specifically, for the recessive and dominant modes of inheritance we suggest the use of the minimum and maximum of the shared length of haplotypes around a marker locus for pairs of individuals. If the underlying genetic model is unknown, we propose a model-free HS Mantel statistic using the max-test approach. We compare our novel HS statistics to BHS using simulated case-control data and illustrate their use by re-analyzing data from a candidate region of chromosome 18q from the Rheumatoid Arthritis (RA) Consortium. We demonstrate that our approach is point-wise valid and superior to BHS. In the re-analysis of the RA data, we identified three regions with point-wise P-values<0.005 containing six known genes (PMIP1, MC4R, PIGN, KIAA1468, TNFRSF11A and ZCCHC2) which might be worth follow-up.

17.
Many studies are done in small isolated populations and populations where marriages between relatives are encouraged. In this paper, we point out some problems with applying the maximum lod score (MLS) method (Risch, [1990] Am. J. Hum. Genet. 46:242-253) in these populations where relationships exist between the two parents of the affected sib-pairs. Characterizing the parental relationships by the kinship coefficient between the parents (f), the maternal inbreeding coefficient (alpha(m)), and the paternal inbreeding coefficient (alpha(p)), we explored the relationship between the identity by descent (IBD) vector expected under the null hypothesis of no linkage and these quantities. We find that the expected IBD vector is no longer (0.25, 0.5, 0.25) when f, alpha(m), and alpha(p) differ from zero. In addition, the expected IBD vector does not always follow the triangle constraints recommended by Holmans ([1993] Am. J. Hum. Genet. 52:362-374). So the classically used MLS statistic needs to be adapted to the presence of parental relationships. We modified the software GENEHUNTER (Kruglyak et al. [1996] Am. J. Hum. Genet. 58: 1347-1363) to do so. Indeed, the current version of the software does not compute the likelihood properly under the null hypothesis. We studied the adapted statistic by simulating data on three different family structures: (1) parents are double first cousins (f=0.125, alpha(m)=alpha(p)=0), (2) each parent is the offspring of first cousins (f=0, alpha(m)=alpha(p)=0.0625), and (3) parents are related as in the pedigree from Goddard et al. ([1996] Am. J. Hum. Genet. 58:1286-1302) (f=0.109, alpha(m)=alpha(p)=0.0625). The appropriate threshold needs to be derived for each case in order to get the correct type I error. Moreover, using the classical statistic in the presence of both parental kinship and parental inbreeding almost always leads to false conclusions.
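
The triangle constraints mentioned above are commonly stated as z1 <= 1/2 and 2*z0 <= z1 for an IBD vector (z0, z1, z2) of an affected sib pair with unrelated, non-inbred parents. The sketch below checks an IBD vector against that standard statement; the function name and tolerance are hypothetical, and the inbreeding-adjusted null vectors derived in the paper are not reproduced here.

```python
def satisfies_triangle_constraints(z0: float, z1: float, z2: float,
                                   tol: float = 1e-9) -> bool:
    """Check the 'possible triangle' for a sib-pair IBD vector.

    z0, z1, z2 are the probabilities of sharing 0, 1, and 2 alleles IBD.
    Constraints checked: z0 >= 0, z1 <= 1/2, and 2*z0 <= z1.
    """
    if abs(z0 + z1 + z2 - 1.0) > tol:
        raise ValueError("IBD probabilities must sum to 1")
    return z0 >= -tol and z1 <= 0.5 + tol and 2.0 * z0 <= z1 + tol


if __name__ == "__main__":
    # The classical null vector lies on the boundary of the triangle.
    print(satisfies_triangle_constraints(0.25, 0.5, 0.25))
    # A hypothetical vector with too much single-allele sharing.
    print(satisfies_triangle_constraints(0.1, 0.6, 0.3))
```

Because the classical null vector (0.25, 0.5, 0.25) sits on a vertex of this region, a null vector shifted by parental kinship or inbreeding can fall outside it, which is the motivation for adapting the MLS statistic.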

18.
Relative efficiency of ambiguous vs. directly measured haplotype frequencies
Haplotypes are useful for both fine-mapping of susceptibility loci and evaluation of sequence variation at multiple sites along a chromosome. However, they are difficult to directly measure over long stretches of DNA in diploid organisms. Consequently, multiple genetic markers are typically measured, without linkage phase information, giving rise to a subject's diplotype. From diplotype data, haplotypes are often inferred by pedigree information, or treated as partially missing data when haplotype frequencies are estimated among unrelated subjects. This latter ambiguity can increase the variance of the estimated haplotype frequencies. Douglas et al. ([2001] Nat. Genet. 28:361-364) recently quantified the relative efficiency of estimating haplotype frequencies from the diplotypes of unrelated subjects, relative to directly measured haplotypes via somatic cell hybrids (conversion technology), and demonstrated that unknown linkage phase can lead to a large loss of efficiency. However, their results were based on linkage equilibrium among marker loci, which may not be realistic for closely linked markers. We extend their relative efficiency calculations in several respects: 1) allowance for linkage disequilibrium (LD) among marker loci; 2) evaluation of different patterns of LD; and 3) evaluation of nuclear families with and without parents. We show that although the loss in efficiency of haplotype frequencies among unrelated subjects decreases as LD increases to its maximum value, the general conclusions of Douglas et al. ([2001] Nat. Genet. 28:361-364) hold true for a variety of LD patterns and magnitudes. However, our results also demonstrate that trios of parents plus one child are highly efficient for haplotype frequency estimation, that additional children offer little information, and that siblings without parents can be grossly inefficient. Genet. Epidemiol. 23:426-443, 2002.

19.
The curse of multiple testing has led to the adoption of a stringent Bonferroni threshold for declaring genome-wide statistical significance for any one SNP as standard practice. Although justified in avoiding false positives, this conservative approach has the potential to miss true associations as most studies are drastically underpowered. As an alternative to increasing sample size, we compare results from a typical SNP-by-SNP analysis with three other methods that incorporate regional information in order to boost or dampen an otherwise noisy signal: the haplotype method (Schaid et al. [2002] Am J Hum Genet 70:425-434), the gene-based method (Liu et al. [2010] Am J Hum Genet 87:139-145), and a new method (interaction count) that uses genome-wide screening of pairwise SNP interactions. Using a modestly sized case-control study, we conduct a genome-wide association study (GWAS) of age-related macular degeneration, and find striking agreement across all methods in regions of known associated variants. We also find strong evidence of novel associated variants in two regions (Chromosome 2p25 and Chromosome 10p15) in which the individual SNP P-values are only suggestive, but where there are very high levels of agreement between all methods. We propose that consistency between different analysis methods may be an alternative to increasingly larger sample sizes in sifting true signals from noise in GWAS.

20.
Population-based case-control studies measuring associations between haplotypes of single nucleotide polymorphisms (SNPs) are increasingly popular, in part because haplotypes of a few "tagging" SNPs may serve as surrogates for variation in relatively large sections of the genome. Due to current technological limitations, haplotypes in cases and controls must be inferred from unphased genotypic data. Using individual-specific inferred haplotypes as covariates in standard epidemiologic analyses (e.g., conditional logistic regression) is an attractive analysis strategy, as it allows adjustment for nongenetic covariates, provides omnibus and haplotype-specific tests of association, and can estimate haplotype and haplotype x environment interaction effects. In principle, some adjustment for the uncertainty in inferred haplotypes should be made. Via simulation, we compare the performance (bias and mean squared error of haplotype and haplotype x environment interaction effect estimates) of several analytic strategies using inferred haplotypes in the context of matched case-control data. These strategies include using only the most likely haplotype assignment, the expectation substitution approach described by Stram et al. ([2003b] Hum. Hered. 55:179-190) and others, and an improper version of multiple imputation. For relatively uncomplicated haplotype structures and moderate haplotype relative risks (/=5). An application to progesterone-receptor haplotypes and endometrial cancer further illustrates that the performance of all these methods depends on how well the observed haplotypes "tag" the unobserved causal variant.
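
The difference between the "most likely haplotype assignment" and "expectation substitution" strategies can be illustrated with a small sketch. It assumes that posterior probabilities of each compatible haplotype pair, given a subject's unphased genotype, have already been estimated (for example by an EM algorithm); the data structure, function names, and numbers below are hypothetical.

```python
def expected_haplotype_dosage(pair_probs: dict, haplotype: str) -> float:
    """Expectation substitution: expected copies (0 to 2) of `haplotype`.

    pair_probs maps a haplotype pair (h1, h2) to its posterior probability
    given the subject's unphased genotype; probabilities are renormalized.
    """
    total = sum(pair_probs.values())
    if total <= 0:
        raise ValueError("posterior probabilities must have a positive sum")
    weighted = sum(
        prob * ((h1 == haplotype) + (h2 == haplotype))
        for (h1, h2), prob in pair_probs.items()
    )
    return weighted / total


def most_likely_dosage(pair_probs: dict, haplotype: str) -> int:
    """Most-likely assignment: copies of `haplotype` in the top-ranked pair only."""
    h1, h2 = max(pair_probs, key=pair_probs.get)
    return (h1 == haplotype) + (h2 == haplotype)


if __name__ == "__main__":
    # A double heterozygote at two SNPs has two compatible phasings.
    posterior = {("AB", "ab"): 0.7, ("Ab", "aB"): 0.3}
    print(expected_haplotype_dosage(posterior, "AB"))  # fractional covariate
    print(most_likely_dosage(posterior, "AB"))         # phase uncertainty ignored
```

The expected dosage enters the regression as a fractional covariate, whereas the most-likely assignment discards the phase uncertainty entirely.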
