Similar articles
Found 20 similar articles (search time: 15 ms)
1.
Genome‐wide association studies are helping to dissect the etiology of complex diseases. Although case‐control association tests are generally more powerful than family‐based association tests, population stratification can lead to spurious disease‐marker association or mask a true association. Several methods have been proposed to match cases and controls prior to genotyping, using family information or epidemiological data, or using genotype data for a modest number of genetic markers. Here, we describe a genetic similarity score matching (GSM) method for efficient matched analysis of cases and controls in a genome‐wide or large‐scale candidate gene association study. GSM comprises three steps: (1) calculating similarity scores for pairs of individuals using the genotype data; (2) matching sets of cases and controls based on the similarity scores so that matched cases and controls have similar genetic background; and (3) using conditional logistic regression to perform association tests. Through computer simulation we show that GSM correctly controls false‐positive rates and improves power to detect true disease predisposing variants. We compare GSM to genomic control using computer simulations, and find improved power using GSM. We suggest that initial matching of cases and controls prior to genotyping combined with careful re‐matching after genotyping is a method of choice for genome‐wide association studies. Genet. Epidemiol. 33:508–517, 2009. © 2009 Wiley‐Liss, Inc.
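The first two GSM steps can be illustrated with a minimal sketch. The abstract does not define the similarity score, so the identity-by-state (IBS) measure used here is an assumption, as are the 0/1/2 genotype coding, the greedy 1:1 matching rule, and the function names `ibs_similarity` and `match_cases_to_controls`. Step 3 (conditional logistic regression on the matched sets) is not shown.

```python
def ibs_similarity(g1, g2):
    """Mean identity-by-state sharing between two genotype vectors coded 0/1/2.

    Assumed score for illustration: 1.0 means identical genotypes at every
    marker, 0.0 means opposite homozygotes at every marker.
    """
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    return sum(2 - abs(a - b) for a, b in zip(g1, g2)) / (2 * len(g1))


def match_cases_to_controls(cases, controls):
    """Greedy 1:1 matching: each case takes the most similar unused control.

    `cases` and `controls` map subject IDs to genotype vectors.
    """
    available = dict(controls)
    pairs = {}
    for case_id, case_genotypes in cases.items():
        if not available:
            raise ValueError("not enough controls to match every case")
        best = max(available,
                   key=lambda cid: ibs_similarity(case_genotypes, available[cid]))
        pairs[case_id] = best
        del available[best]  # a control can be used only once
    return pairs


if __name__ == "__main__":
    cases = {"c1": [0, 0, 1], "c2": [2, 2, 1]}
    controls = {"k1": [2, 2, 2], "k2": [0, 0, 0], "k3": [1, 1, 1]}
    print(match_cases_to_controls(cases, controls))
```

Greedy matching is order-dependent and is used here only because it is short; an optimal matching algorithm would be a natural replacement in practice.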

2.
Genome‐wide case‐control association studies are gaining popularity, thanks to the rapid development of modern genotyping technology. In such studies, population stratification is a potential concern, especially when the number of study subjects is large, as it can lead to seriously inflated false‐positive rates. Current methods addressing this issue are still not completely immune to excess false positives. A simple method that corrects for population stratification is proposed. This method modifies a test statistic such as the Armitage trend test by using an additive constant that measures the variation of the effect size confounded by population stratification across genomic control (GC) markers. As a result, the original statistic is deflated by a multiplying factor that is specific to the marker being tested for association. This deflating multiplying factor is guaranteed to be larger than 1. These properties are in contrast to the conventional GC method, where the original statistic is deflated by a common factor regardless of the marker being tested and the deflation factor may turn out to be less than 1. The new method is introduced first for the regular case‐control design and then for other situations such as quantitative traits and the presence of covariates. An extensive simulation study indicates that this new method provides an appealing alternative for genetic association analysis in the presence of population stratification. Genet. Epidemiol. 33:637–645, 2009. © 2009 Wiley‐Liss, Inc.
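For contrast, the conventional GC correction mentioned above can be sketched as follows. This is the standard median-based version (one common factor for all markers), not the marker-specific method the abstract proposes; the function names are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def gc_inflation_factor(null_marker_stats):
    """Estimate the common GC factor from 1-df chi-square statistics at GC markers."""
    return statistics.median(null_marker_stats) / CHI2_1DF_MEDIAN


def gc_adjust(stat, inflation_factor):
    """Deflate a test statistic by the common factor (same factor for every marker)."""
    return stat / inflation_factor


if __name__ == "__main__":
    inflated = [0.5, 0.9098728, 2.0]
    lam = gc_inflation_factor(inflated)
    print(lam, gc_adjust(4.0, lam))
    # Nothing prevents the estimate from dropping below 1,
    # which is the weakness the abstract points out.
    print(gc_inflation_factor([0.1, 0.2, 0.3]))
```

In practice the estimated factor is often truncated at 1 so that statistics are never inflated; that convention is omitted here to keep the behaviour described in the abstract visible.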

3.
The potential for bias from population stratification (PS) has raised concerns about case-control studies involving admixed ethnicities. We evaluated the potential bias due to PS in relating a binary outcome with a candidate gene under simulated settings where study populations consist of multiple ethnicities. Disease risks were assigned within the range of prostate cancer rates of African Americans reported in SEER registries assuming k=2, 5, or 10 admixed ethnicities. Genotype frequencies were considered in the range of 5-95%. Under a model assuming no genotype effect on disease (odds ratio (OR)=1), the range of observed OR estimates ignoring ethnicity was 0.64-1.55 for k=2, 0.72-1.33 for k=5, and 0.81-1.22 for k=10. When genotype effect on disease was modeled to be OR=2, the ranges of observed OR estimates were 1.28-3.09, 1.43-2.65, and 1.62-2.42 for k=2, 5, and 10 ethnicities, respectively. Our results indicate that the magnitude of bias is small unless extreme differences exist in genotype frequency. Bias due to PS decreases as the number of admixed ethnicities increases. The biases are bounded by the minimum and maximum of all pairwise baseline disease odds ratios across ethnicities. Therefore, bias due to PS alone may be small when baseline risk differences are small within major categories of admixed ethnicity, such as African Americans.
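The mechanism behind this bias can be reproduced with a small calculation. The sketch below assumes no genotype effect within any subpopulation (true OR = 1) and computes the odds ratio that would be observed if ethnicity were ignored. The subpopulation weights, genotype frequencies, and baseline risks are made-up illustrative values, not figures from the study, and `crude_odds_ratio` is a hypothetical name.

```python
def crude_odds_ratio(strata):
    """Expected crude genotype-disease OR when pooling subpopulations.

    `strata` is a list of (weight, genotype_frequency, baseline_risk) tuples.
    Within each stratum the genotype has no effect on disease (OR = 1), so any
    departure of the pooled OR from 1 is due to stratification alone.
    """
    carrier_case = sum(w * f * r for w, f, r in strata)
    carrier_ctrl = sum(w * f * (1 - r) for w, f, r in strata)
    noncarrier_case = sum(w * (1 - f) * r for w, f, r in strata)
    noncarrier_ctrl = sum(w * (1 - f) * (1 - r) for w, f, r in strata)
    return (carrier_case * noncarrier_ctrl) / (carrier_ctrl * noncarrier_case)


if __name__ == "__main__":
    # Two subpopulations (k = 2) that differ in both genotype frequency and risk.
    confounded = [(0.5, 0.1, 0.01), (0.5, 0.5, 0.02)]
    # Same risks, but identical genotype frequency: no confounding.
    unconfounded = [(0.5, 0.3, 0.01), (0.5, 0.3, 0.02)]
    print(crude_odds_ratio(confounded))
    print(crude_odds_ratio(unconfounded))
```

In the confounded example the crude OR exceeds 1 but stays below the baseline disease odds ratio between the two subpopulations (about 2.02), in line with the bound stated in the abstract.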

4.
Health summary measures are commonly used by policy makers to help make decisions on the allocation of societal resources for competing medical treatments. The net monetary benefit is a health summary measure that overcomes the statistical limitations of a popular measure, namely the cost-effectiveness ratio. We introduce a linear model framework to estimate propensity score adjusted net monetary benefit. This method provides less biased estimates in the presence of significant differences in baseline measures and demographic characteristics between treatment groups in quasi-randomized or observational studies. Simulation studies were conducted to better understand the utility of propensity score adjusted estimates of net monetary benefits when important covariates are unobserved. The results indicated that the propensity score adjusted net monetary benefit provides a robust measure of cost-effectiveness in the presence of hidden bias. The methods are illustrated using data from SEER-Medicare for the treatment of bladder cancer.
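As background, the unadjusted net monetary benefit has a simple closed form: for a willingness-to-pay threshold, it is the threshold times the health effect minus the cost. The sketch below shows only this standard definition and the incremental comparison of two treatments; the propensity score adjustment and linear model described in the abstract are not reproduced, and the numbers and function names are illustrative assumptions.

```python
def net_monetary_benefit(wtp, effect, cost):
    """NMB = willingness-to-pay per unit of effect * effect - cost."""
    return wtp * effect - cost


def incremental_nmb(wtp, effect_new, cost_new, effect_old, cost_old):
    """Difference in NMB between a new and an old treatment.

    A positive value favours the new treatment at the given threshold.
    """
    return (net_monetary_benefit(wtp, effect_new, cost_new)
            - net_monetary_benefit(wtp, effect_old, cost_old))


if __name__ == "__main__":
    # Hypothetical values: effects in quality-adjusted life years, costs in dollars.
    print(incremental_nmb(50_000, 1.2, 30_000, 1.0, 22_000))
```

Unlike a cost-effectiveness ratio, this quantity is a linear combination of effect and cost, which is what makes it convenient as the response in a regression model.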

5.
Proper control of confounding due to population stratification is crucial for valid analysis of case-control association studies. Fine matching of cases and controls based on genetic ancestry is an increasingly popular strategy to correct for such confounding, both in genome-wide association studies (GWASs) as well as studies that employ next-generation sequencing, where matching can be used when selecting a subset of participants from a GWAS for rare-variant analysis. Existing matching methods match on measures of genetic ancestry that combine multiple components of ancestry into a scalar quantity. However, we show that including nonconfounding ancestry components in a matching criterion can lead to inaccurate matches, and hence to an improper control of confounding. To resolve this issue, we propose a novel method that assigns cases and controls to matched strata based on the stratification score (Epstein et al. [2007] Am J Hum Genet 80:921-930), which is the probability of disease given genomic variables. Matching on the stratification score leads to more accurate matches because case participants are matched to control participants who have a similar risk of disease given ancestry information. We illustrate our matching method using the African-American arm of the GAIN GWAS of schizophrenia. In this study, we observe that confounding due to stratification can be resolved by our matching approach but not by other existing matching procedures. We also use simulated data to show our novel matching approach can provide a more appropriate correction for population stratification than existing matching approaches.

6.
Population stratification (PS) can lead to an inflated rate of false‐positive findings in genome‐wide association studies (GWAS). The commonly used approach of adjustment for a fixed number of principal components (PCs) could have a deleterious impact on power when selected PCs are equally distributed in cases and controls, or the adjustment of certain covariates, such as self‐identified ethnicity or recruitment center, already included in the association analyses, correctly maps to major axes of genetic heterogeneity. We propose a computationally efficient procedure, PC‐Finder, to identify a minimal set of PCs while permitting an effective correction for PS. A general pseudo F statistic, derived from a non‐parametric multivariate regression model, can be used to assess whether PS exists or has been adequately corrected by a set of selected PCs. Empirical data from two GWAS conducted as part of the Cancer Genetic Markers of Susceptibility (CGEMS) project demonstrate the application of the procedure. Furthermore, simulation studies show the power advantage of the proposed procedure in GWAS over currently used PS correction strategies, particularly when the PCs with substantial genetic variation are distributed similarly in cases and controls and therefore do not induce PS. Genet. Epidemiol. 33:432–441, 2009. © 2009 Wiley‐Liss, Inc.

7.
Li Q, Yu K. Genetic Epidemiology 2008, 32(3):215-226
Hidden population substructure can cause population stratification and lead to false-positive findings in population-based genome-wide association (GWA) studies. Given a large panel of markers scanned in a GWA study, it becomes increasingly feasible to uncover the hidden population substructure within the study sample based on measured genotypes across the genome. Recognizing that population substructure can be displayed as clustered and/or continuous patterns of genetic variation, we propose a method that aims at the detection and correction of the confounding effect resulting from both patterns of population substructure. The proposed method is an extension of the EIGENSTRAT method (Price et al. [2006] Nat Genet 38:904-909). This approach is computationally feasible and easily applied to large-scale GWA studies. We show through simulation studies that, compared with the EIGENSTRAT method, the new method requires a smaller number of markers and yields a more appropriate correction for population stratification.

8.
The recent successes of GWAS based on large sample sizes motivate combining independent datasets to obtain larger sample sizes and thereby increase statistical power. Analysis methods that can accommodate different study designs, such as family-based and case-control designs, are of general interest. However, population stratification can cause spurious association for population-based association analyses. For family-based association analysis that infers missing parental genotypes based on the allele frequencies estimated in the entire sample, the parental mating-type probabilities may not be correctly estimated in the presence of population stratification. Therefore, any approach to combining family and case-control data should also properly account for population stratification. Although several methods have been proposed to accommodate family-based and case-control data, all have restrictions. Most of them require sampling a homogeneous population, which may not be a reasonable assumption for data from a large consortium. One of the methods, FamCC, can account for population stratification and uses nuclear families with arbitrary number of siblings but requires parental genotype data, which are often unavailable for late-onset diseases. We extended the family-based test, Association in the Presence of Linkage (APL), to combine family and case-control data (CAPL). CAPL can accommodate case-control data and families with multiple affected siblings and missing parents in the presence of population stratification. We used simulations to demonstrate that CAPL is a valid test either in a homogeneous population or in the presence of population stratification. We also showed that CAPL can have more power than other methods that combine family and case-control data.

9.
Population stratification leads to a predictable phenomenon: a reduction in the number of heterozygotes compared to that calculated assuming Hardy‐Weinberg Equilibrium (HWE). We show that population stratification results in another phenomenon: an excess in the proportion of spouse‐pairs with the same genotypes at all ancestrally informative markers, resulting in ancestrally related positive assortative mating. We use principal components analysis to show that there is evidence of population stratification within the Framingham Heart Study, and show that the first principal component correlates with a North‐South European cline. We then show that the first principal component is highly correlated between spouses (r = 0.58, p = 0.0013), demonstrating that there is ancestrally related positive assortative mating among the Framingham Caucasian population. We also show that the single nucleotide polymorphisms loading most heavily on the first principal component show an excess of homozygotes within the spouses, consistent with similar ancestry‐related assortative mating in the previous generation. This nonrandom mating likely affects genetic structure seen more generally in the North American population of European descent today, and decreases the rate of decay of linkage disequilibrium for ancestrally informative markers. Genet. Epidemiol. 34: 674‐679, 2010. © 2010 Wiley‐Liss, Inc.
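The heterozygote deficit described in the first sentence (often called the Wahlund effect) can be checked with a short worked example. The sketch assumes a single biallelic marker and subpopulations that are each in HWE internally; the allele frequencies and the name `heterozygosity_deficit` are illustrative assumptions, not values from the Framingham data.

```python
def heterozygosity_deficit(weights, allele_freqs):
    """Compare pooled-HWE and actual heterozygosity in a stratified population.

    `weights` are subpopulation proportions (summing to 1) and `allele_freqs`
    are the corresponding allele frequencies. Returns a tuple
    (expected_under_pooled_hwe, actual, deficit).
    """
    if len(weights) != len(allele_freqs):
        raise ValueError("weights and allele_freqs must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    pooled_freq = sum(w * p for w, p in zip(weights, allele_freqs))
    # Heterozygosity expected if the pooled sample were one HWE population.
    expected = 2 * pooled_freq * (1 - pooled_freq)
    # Actual heterozygosity: weighted average of within-subpopulation HWE values.
    actual = sum(w * 2 * p * (1 - p) for w, p in zip(weights, allele_freqs))
    return expected, actual, expected - actual


if __name__ == "__main__":
    print(heterozygosity_deficit([0.5, 0.5], [0.2, 0.8]))
```

The deficit equals twice the variance of the allele frequency across subpopulations, so it vanishes when all subpopulations share the same frequency.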

10.
For genome‐wide association studies with family‐based designs, we propose a Bayesian approach. We show that standard transmission disequilibrium test and family‐based association test statistics can naturally be implemented in a Bayesian framework, allowing flexible specification of the likelihood and prior odds. We construct a Bayes factor conditional on the offspring phenotype and parental genotype data and then use the data we conditioned on to inform the prior odds for each marker. In the construction of the prior odds, the evidence for association for each single marker is obtained at the population‐level by estimating its genetic effect size by fitting the conditional mean model. Since such genetic effect size estimates are statistically independent of the effect size estimation within the families, the actual data set can inform the construction of the prior odds without any statistical penalty. In contrast to Bayesian approaches that have recently been proposed for genome‐wide association studies, our approach does not require assumptions about the genetic effect size; this makes the proposed method entirely data‐driven. The power of the approach was assessed through simulation. We then applied the approach to a genome‐wide association scan to search for associations between single nucleotide polymorphisms and body mass index in the Childhood Asthma Management Program data. Genet. Epidemiol. 34:569–574, 2010. © 2010 Wiley‐Liss, Inc.

11.
12.
We present a new method, the delta-centralization (DC) method, to correct for population stratification (PS) in case-control association studies. DC works well even when there is a lot of confounding due to PS. The latter causes overdispersion in the usual chi-square statistics which then have non-central chi-square distributions. Other methods approach the noncentrality indirectly, but we deal with it directly, by estimating the non-centrality parameter tau itself. Specifically: (1) We define a quantity delta, a function of the relevant subpopulation parameters. We show that, for relatively large samples, delta exactly predicts the elevation of the false positive rate due to PS, when there is no true association between marker genotype and disease. (This quantity delta is quite different from Wright's F(ST) and can be large even when F(ST) is small.) (2) We show how to estimate delta, using a panel of unlinked "neutral" loci. (3) We then show that delta^2 corresponds to tau, the noncentrality parameter of the chi-square distribution. Thus, we can centralize the chi-square using our estimate of delta; this is the DC method. (4) We demonstrate, via computer simulations, that DC works well with as few as 25-30 unlinked markers, where the markers are chosen to have allele frequencies reasonably close (within +/- .1) to those at the test locus. (5) We compare DC with genomic control and show that whereas the latter becomes overconservative when there is considerable confounding due to PS (i.e. when delta is large), DC performs well for all values of delta.

13.
Concordance indices are used to assess the degree of agreement between different methods that measure the same characteristic. In this context, the total deviation index (TDI) is an unscaled concordance measure that quantifies the extent to which the readings from the same subject obtained by different methods may differ with a certain probability. Common approaches to estimate the TDI assume that data are normally distributed and that the response is linear in the effects (subjects, methods and random error). Here, we introduce a new non‐parametric methodology for estimation and inference of the TDI that can deal with any kind of quantitative data. The present study introduces this non‐parametric approach and compares it with the already established methods in two real case examples that represent situations of non‐normal data (more specifically, skewed data and count data). The performance of the already established methodologies and our approach in these contexts is assessed by means of a simulation study. Copyright © 2015 John Wiley & Sons, Ltd.
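Read as a definition, the TDI at probability p is the p-th quantile of the absolute difference between paired readings. A distribution-free point estimate can therefore be sketched as an empirical quantile, as below. This is only an illustration of the idea: the specific quantile estimator and the inference procedure used in the paper are not given in the abstract, and the order-statistic rule and the name `empirical_tdi` are assumptions.

```python
import math


def empirical_tdi(method_a, method_b, p=0.9):
    """Empirical p-th quantile of |a - b| over paired readings.

    Uses the order statistic at position ceil(n * p), one simple
    distribution-free quantile estimate among several possible choices.
    """
    if len(method_a) != len(method_b) or not method_a:
        raise ValueError("paired readings must be non-empty and of equal length")
    if not 0 < p <= 1:
        raise ValueError("p must lie in (0, 1]")
    abs_diffs = sorted(abs(a - b) for a, b in zip(method_a, method_b))
    index = math.ceil(len(abs_diffs) * p) - 1  # convert 1-based rank to 0-based
    return abs_diffs[index]


if __name__ == "__main__":
    readings_a = [10, 12, 14, 16, 18]
    readings_b = [11, 10, 17, 12, 23]
    print(empirical_tdi(readings_a, readings_b, p=0.5))
```

Because only ranks of the absolute differences are used, the estimate makes no normality assumption and applies equally to skewed or count data.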

14.
We propose a method to analyze family‐based samples together with unrelated cases and controls. The method builds on the idea of matched case–control analysis using conditional logistic regression (CLR). For each trio within the family, a case (the proband) and matched pseudo‐controls are constructed, based upon the transmitted and untransmitted alleles. Unrelated controls, matched by genetic ancestry, supplement the sample of pseudo‐controls; likewise unrelated cases are also paired with genetically matched controls. Within each matched stratum, the case genotype is contrasted with control/pseudo‐control genotypes via CLR, using a method we call matched‐CLR (mCLR). Eigenanalysis of numerous SNP genotypes provides a tool for mapping genetic ancestry. The result of such an analysis can be thought of as a multidimensional map, or eigenmap, in which the relative genetic similarities and differences amongst individuals are encoded. Once constructed, new individuals can be projected onto the ancestry map based on their genotypes. Successful differentiation of individuals of distinct ancestry depends on having a diverse, yet representative sample from which to construct the ancestry map. Once samples are well‐matched, mCLR yields comparable power to competing methods while ensuring excellent control over Type I error. Copyright © 2010 John Wiley & Sons, Ltd.

15.
Longitudinal studies collect information on a sample of individuals which is followed over time to analyze the effects of individual and time‐dependent characteristics on the observed response. These studies often suffer from attrition: individuals drop out of the study before its completion time and thus present incomplete data records. When the missing mechanism, once conditioned on other (observed) variables, does not depend on current (possibly unobserved) values of the response variable, the dropout mechanism is known to be ignorable. We propose a selection model extending semiparametric variance component models for longitudinal binary responses to allow for dependence between the missing data mechanism and the primary response process. The model is applied to a data set from a methadone maintenance treatment programme held in Sydney, 1986. Copyright © 2009 John Wiley & Sons, Ltd.

16.
Genotype imputation is a critical technique for following up genome‐wide association studies. Efficient methods are available for dealing with the probabilistic nature of imputed single nucleotide polymorphisms (SNPs) in population‐based designs, but not for family‐based studies. We have developed a new analytical approach (FBATdosage), using imputed allele dosage in the general framework of family‐based association tests to bridge this gap. Simulation studies showed that FBATdosage yielded highly consistent type I error rates, whatever the level of genotype uncertainty, and a much higher power than the best‐guess genotype approach. FBATdosage allows fast linkage and association testing of several million imputed variants with binary or quantitative phenotypes in nuclear families of arbitrary size with arbitrary missing data for the parents. The application of this approach to a family‐based association study of leprosy susceptibility successfully refined the association signal at two candidate loci, C1orf141‐IL23R on chromosome 1 and RAB32‐C6orf103 on chromosome 6.

17.
While intent‐to‐treat (ITT) analysis is widely accepted for superiority trials, there remains debate about its role in non‐inferiority trials. It has often been said that ITT analysis tends to be anti‐conservative in demonstrating non‐inferiority, suggesting that per‐protocol (PP) analysis may be preferable for non‐inferiority trials, despite the inherent bias of such analyses. We propose using randomization‐based g‐estimation analyses that more effectively preserve the integrity of randomization than do the more widely used PP analyses. Simulation studies were conducted to investigate the impacts of different types of treatment changes on the conservatism or anti‐conservatism of analyses using the ITT, PP, and g‐estimation methods in a time‐to‐event outcome. The ITT results were anti‐conservative for all simulations. Anti‐conservativeness increased with the percentage of treatment change and was more pronounced for outcome‐dependent treatment changes. PP analysis, in which treatment‐switching cases were censored at the time of treatment change, maintained type I error near the nominal level for independent treatment changes, whereas for outcome‐dependent cases, PP analysis was either conservative or anti‐conservative depending on the mechanism underlying the percentage of treatment changes. G‐estimation analysis maintained type I error near the nominal level even for outcome‐dependent treatment changes, although information on unmeasured covariates is not used in the analysis. Thus, randomization‐based g‐estimation analyses should be used to supplement the more conventional ITT and PP analyses, especially for non‐inferiority trials. Copyright © 2010 John Wiley & Sons, Ltd.

18.
Emerging data suggest that the genetic regulation of the biological response to inflammatory stress may be fundamentally different to the genetic underpinning of the homeostatic control (resting state) of the same biological measures. In this paper, we interrogate this hypothesis using a single‐SNP score test and a novel class‐level testing strategy to characterize protein‐coding gene and regulatory element‐level associations with longitudinal biomarker trajectories in response to stimulus. Using the proposed class‐level association score statistic for longitudinal data, which accounts for correlations induced by linkage disequilibrium, the genetic underpinnings of evoked dynamic changes in repeatedly measured biomarkers are investigated. The proposed method is applied to data on two biomarkers arising from the Genetics of Evoked Responses to Niacin and Endotoxemia study, a National Institutes of Health‐sponsored investigation of the genomics of inflammatory and metabolic responses during low‐grade endotoxemia. Our results suggest that the genetic basis of evoked inflammatory response is different than the genetic contributors to resting state, and several potentially novel loci are identified. A simulation study demonstrates appropriate control of type‐1 error rates, relative computational efficiency, and power. Copyright © 2017 John Wiley & Sons, Ltd.

19.
The three‐arm clinical trial design, which includes a test treatment, an active reference, and placebo control, is the gold standard for the assessment of non‐inferiority. In the presence of non‐compliance, one common concern is that an intent‐to‐treat (ITT) analysis (which is the standard approach to non‐inferiority trials) tends to increase the chances of erroneously concluding non‐inferiority, suggesting that the per‐protocol (PP) analysis may be preferable for non‐inferiority trials despite its inherent bias. The objective of this paper was to develop statistical methodology for dealing with non‐compliance in three‐arm non‐inferiority trials for censored, time‐to‐event data. Changes in treatment were here considered the only form of non‐compliance. An approach using a three‐arm rank preserving structural failure time model and G‐estimation analysis is here presented. Using simulations, the impact of non‐compliance on non‐inferiority trials was investigated in detail using ITT, PP analyses, and the present proposed method. Results indicate that the proposed method shows good characteristics, and that neither ITT nor PP analyses can always guarantee the validity of the non‐inferiority conclusion. A Statistical Analysis System program for the implementation of the proposed test procedure is available from the authors upon request. Copyright © 2014 John Wiley & Sons, Ltd.

20.
Patients often respond differently to a treatment because of individual heterogeneity. Failures of clinical trials can be substantially reduced if, prior to an investigational treatment, patients are stratified into responders and nonresponders based on biological or demographic characteristics. These characteristics are captured by a predictive signature. In this paper, we propose a procedure to search for predictive signatures based on the approach of patient rule induction method. Specifically, we discuss selection of a proper objective function for the search, present its algorithm, and describe a resampling scheme that can enhance search performance. Through simulations, we characterize conditions under which the procedure works well. To demonstrate practical uses of the procedure, we apply it to two real‐world data sets. We also compare the results with those obtained from a recent regression‐based approach, Adaptive Index Models, and discuss their respective advantages. In this study, we focus on oncology applications with survival responses. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
