Similar articles
Found 20 similar articles; search time 46 ms
1.
In studies of older adults, researchers often recruit proxy respondents, such as relatives or caregivers, when study participants cannot provide self‐reports (e.g., because of illness). Proxies are usually only sought to report on behalf of participants with missing self‐reports; thus, either a participant self‐report or proxy report, but not both, is available for each participant. Furthermore, the missing‐data mechanism for participant self‐reports is not identifiable and may be nonignorable. When exposures are binary and participant self‐reports are conceptualized as the gold standard, substituting error‐prone proxy reports for missing participant self‐reports may produce biased estimates of outcome means. Researchers can handle this data structure by treating the problem as one of misclassification within the stratum of participants with missing self‐reports. Most methods for addressing exposure misclassification require validation data, replicate data, or an assumption of nondifferential misclassification; other methods may result in an exposure misclassification model that is incompatible with the analysis model. We propose a model that makes none of the aforementioned requirements and still preserves model compatibility. Two user‐specified tuning parameters encode the exposure misclassification model. Two proposed approaches estimate outcome means standardized for (potentially) high‐dimensional covariates using multiple imputation followed by propensity score methods. The first method is parametric and uses maximum likelihood to estimate the exposure misclassification model (i.e., the imputation model) and the propensity score model (i.e., the analysis model); the second method is nonparametric and uses boosted classification and regression trees to estimate both models. We apply both methods to a study of elderly hip fracture patients. Copyright © 2014 John Wiley & Sons, Ltd.

2.
Objectives. We explore how misclassification in disease status can distort the exposure–disease association in a study with dichotomous disease and exposure status.
Methods. We define the difference in population odds ratios between populations with and without disease misclassification as population-level bias and derive the bias as a function of sensitivity and specificity for observed disease status. The magnitude and direction of bias can be elucidated through analytic derivations, as illustrated with numerical examples.
Results. Patterns of bias exist not only for nondifferential misclassification but also for some differential misclassification scenarios. We have provided conditions defined in terms of sensitivity and specificity that correspond to each pattern of bias.
Conclusions. Caution is needed in interpreting results when misclassification is present. Our findings can be used to assess the effects of disease misclassification in a population when sensitivity and specificity are known or can be estimated.
In epidemiological and clinical studies, we are often interested in the association between a dichotomous exposure and a dichotomous health outcome such as disease status. However, misclassification is often present in these measures when the gold standard assessment is too expensive to apply and a more affordable but less accurate assessment is used instead. For example, misclassification for disease status is likely to occur when psychiatric disorder status is assessed through self-reported surveys instead of in-person clinical diagnosis. Likewise, misclassification for exposure status is likely to occur when individual exposure to air pollution is assessed by measurements recorded at neighborhood monitoring stations rather than by personal monitoring devices.
Misclassification can alter the odds ratio (OR) that measures the exposure–disease association in a population. This difference can sometimes present significant problems in drawing conclusions about the nature and strength of the exposure–disease association, because the direction of the deviation is unclear and the magnitude of the deviation can be large. Here we focus on the impact of disease misclassification on the exposure–disease relationship when the exposure category is correctly classified.
Two types of disease misclassification can arise in an exposure–disease association study: nondifferential and differential. Nondifferential misclassification occurs when neither sensitivity nor specificity for disease classification varies by exposure category. By contrast, differential misclassification occurs when misclassification of disease status varies by exposure category [1,2]. It is usually believed that nondifferential misclassification in either exposure or disease status results in an estimate that has the same sign as the true association but reduced magnitude, unless the misclassification is so severe that the estimate might switch over to the opposite side of the null [3–9]. However, differential misclassification can have effects with indeterminate direction [6]: away from the null, toward the null, or even switched to the opposite side of the null. It is unclear what conditions cause specific deviations. Chyou studied patterns of effects in the OR estimation attributable to differential misclassification by case–control status in a case–control study, with limited numerical examples [10]. However, conclusions based on limited numerical examples may be sensitive to the conditions chosen for the study. Thus it is desirable to use analytic derivation to examine the pattern of misclassification effects in the exposure–disease association, especially when differential misclassification occurs.
Here we focus on the difference in population parameters (here the OR) between populations with and without disease misclassification, referred to as population-level bias. This population-level bias is different from the bias of an estimator, which represents the difference between an estimator’s expectation and the true value of the parameter being estimated. For sample-based estimation, the parameters estimated are consistent asymptotically for the corresponding population parameters; thus the patterns of bias for the sample estimators are the same asymptotically as the patterns for the population parameters. We focus on population parameters without estimation error and refer to population-level bias simply as bias.
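The dependence of the observed odds ratio on sensitivity and specificity can be illustrated with a short Python sketch. It is not the authors' derivation: the function names, the sign convention (observed OR minus true OR), and the numerical inputs are assumptions made for illustration. It relies only on the standard identity that, when the true disease risk is p, the probability of being classified as diseased is Se·p + (1 − Sp)·(1 − p).

```python
def odds(p):
    """Odds corresponding to a probability p (0 < p < 1)."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_unexposed):
    """Odds ratio comparing disease risk in exposed vs. unexposed."""
    return odds(p_exposed) / odds(p_unexposed)


def observed_risk(p, se, sp):
    """Probability of being *classified* as diseased.

    True cases are detected with probability se (sensitivity);
    non-cases are falsely labelled with probability 1 - sp.
    """
    return se * p + (1.0 - sp) * (1.0 - p)


def observed_odds_ratio(p1, p0, se1, sp1, se0, sp0):
    """OR computed from misclassified disease status.

    (se1, sp1) apply to the exposed group, (se0, sp0) to the unexposed
    group. Equal pairs give nondifferential misclassification.
    """
    return odds_ratio(observed_risk(p1, se1, sp1),
                      observed_risk(p0, se0, sp0))


def population_level_bias(p1, p0, se1, sp1, se0, sp0):
    """Assumed convention: observed OR minus true OR."""
    return observed_odds_ratio(p1, p0, se1, sp1, se0, sp0) - odds_ratio(p1, p0)


if __name__ == "__main__":
    # Hypothetical risks: 20% in the exposed, 10% in the unexposed (true OR = 2.25).
    true_or = odds_ratio(0.2, 0.1)
    # Nondifferential error: same sensitivity and specificity in both groups.
    nondiff = observed_odds_ratio(0.2, 0.1, 0.8, 0.9, 0.8, 0.9)
    # Differential error: lower specificity among the exposed.
    diff = observed_odds_ratio(0.2, 0.1, 0.8, 0.8, 0.8, 0.95)
    print(true_or, nondiff, diff)
```

With these hypothetical inputs the nondifferential case yields an observed OR between 1 and the true value of 2.25, consistent with attenuation toward the null; passing different sensitivity and specificity pairs for the two exposure groups lets the differential scenarios be explored.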

3.
In case–control studies, it is common for a categorical exposure variable to be misclassified. It is also common for exposure status to be informatively missing for some individuals, in that the probability of missingness may be related to exposure. Procedures for addressing the bias due to misclassification via validation data have been extensively studied, and related methods have been proposed for dealing with informative missingness based on supplemental sampling of some of those with missing data. In this paper, we introduce study designs and analytic procedures for dealing with both problems simultaneously in a 2×2 analysis. Results based on convergence in probability illustrate that the combined effects of missingness and misclassification, even when the latter is non‐differential, can lead to naïve exposure odds ratio estimates that are inflated or on the wrong side of the null. The motivating example comes from a case–control study of the association between low birth weight and the diagnosis of breast cancer later in life, where self‐reported birth weight for some women is supplemented by accurate information from birth certificates. Copyright © 2006 John Wiley & Sons, Ltd.

4.
The case‐only test has been proposed as a more powerful approach to detect gene–environment (G × E) interactions. This approach assumes that the genetic and environmental factors are independent. Although it is well known that the Type I error rate will increase if this assumption is violated, it is less widely appreciated that G × E correlation can also lead to power loss. We illustrate this phenomenon by comparing the performance of the case‐only test to other approaches to detect G × E interactions in a genome‐wide association study (GWAS) of esophageal squamous‐cell carcinoma (ESCC) in Chinese populations. Some of these approaches do not use information on the correlation between exposure and genotype (standard logistic regression), whereas others seek to use this information in a robust fashion to boost power without increasing Type I error (two‐step, empirical Bayes, and cocktail methods). G × E interactions were identified involving drinking status and two regions containing genes in the alcohol metabolism pathway, 4q23 and 12q24. Although the case‐only test yielded the most significant tests of G × E interaction in the 4q23 region, the case‐only test failed to identify significant interactions in the 12q24 region which were readily identified using other approaches. The low power of the case‐only test in the 12q24 region is likely due to the strong inverse association between the single nucleotide polymorphisms (SNPs) in this region and drinking status. This example underscores the need to consider multiple approaches to detect G × E interactions, as different tests are more or less sensitive to different alternative hypotheses and violations of the G × E independence assumption.

5.
Multistate Markov regression models, used for quantifying the effect size of state‐specific covariates pertaining to the dynamics of multistate outcomes, have gained popularity. However, measurements of a multistate outcome are prone to classification errors, particularly when a population‐based survey relies on proxy measurements of the outcome for cost reasons. Such misclassification may affect the effect size of relevant covariates, such as the odds ratio used in epidemiology. We proposed a Bayesian measurement‐error‐driven hidden Markov regression model for calibrating these biased estimates with and without a 2‐stage validation design. A simulation algorithm was developed to assess various scenarios of underestimation and overestimation given nondifferential misclassification (independent of covariates) and differential misclassification (dependent on covariates). We applied our proposed method to a community‐based survey of androgenetic alopecia and found that the effect sizes of most covariates were inflated after calibration, regardless of the type of misclassification. Our proposed Bayesian measurement‐error‐driven hidden Markov regression model is practicable and effective in calibrating the effects of covariates on a multistate outcome, but using a prior distribution on measurement errors derived from a 2‐stage validation design is strongly recommended.

6.
Two types of misclassification that commonly occur in family-genetic studies are distinguished: 1) nondifferential misclassification, in which the probability of error as to phenotype (presence or absence of psychiatric disorder) does not depend on exposure status (being kin to a case or control proband) and 2) differential misclassification, in which it does. Nondifferential misclassification of phenotype reduces the observed relative risk towards the null value, sometimes quite dramatically. Differential misclassification can bias the observed relative risk in either direction, depending on the different values of sensitivity and specificity among relatives of cases and controls. The impact of these biases on genetic-epidemiologic studies is reviewed and discussed. In particular, the ability to detect major gene effects from the pattern of relative risks in first-, second-, and third-degree relatives can be severely compromised. Although there are some methods available to correct the effects of nondifferential misclassification, a major priority for family history studies is to minimize differential misclassification.

7.
Poor measurement of explanatory variables occurs frequently in observational studies. Error‐prone observations may lead to biased estimation and loss of power in detecting the impact of explanatory variables on the response. We consider misclassified binary exposure in the context of case–control studies, assuming the availability of validation data to inform the magnitude of the misclassification. A Bayesian adjustment to correct the misclassification is investigated. Simulation studies show that the Bayesian method can have advantages over non‐Bayesian counterparts, particularly in the face of a rare exposure, small validation sample sizes, and uncertainty about whether exposure misclassification is differential or non‐differential. The method is illustrated via application to several real studies. Copyright © 2010 John Wiley & Sons, Ltd.

8.
In cross-sectional studies or studies based on questionnaires, errors in exposures and misclassification of health status may be related. The reason may be that some subjects tend to over- or underreport both exposure and disease. The author investigated the effects of such dependent misclassification from a threshold-model point of view, in that an assumption was made of an underlying linear relation between a continuous exposure and response, both measured with error, and where these errors are correlated. Allowance is also made for covariates measured without error. This approach enables the derivation of explicit expressions for bias in the estimated association between exposure and outcome in different situations. It is shown that, depending on the true effect of the exposure, the errors can lead to either an over- or an underestimation of the true relation. In addition, a study design from which the true effect can be consistently estimated is also provided.

9.
We propose a Bayesian adjustment for the misclassification of a binary exposure variable in a matched case–control study. The method admits a priori knowledge about both the misclassification parameters and the exposure–disease association. The standard Dirichlet prior distribution for a multinomial model is extended to allow separation of prior assertions about the exposure–disease association from assertions about other parameters. The method is applied to a study of occupational risk factors for new‐onset adult asthma. Copyright © 2009 John Wiley & Sons, Ltd.

10.
Case-control studies are particularly susceptible to differential exposure misclassification when exposure status is determined following incident case status. Probabilistic bias analysis methods have been developed as ways to adjust standard effect estimates based on the sensitivity and specificity of exposure misclassification. The iterative sampling method advocated in probabilistic bias analysis bears a distinct resemblance to a Bayesian adjustment; however, it is not identical. Furthermore, without a formal theoretical framework (Bayesian or frequentist), the results of a probabilistic bias analysis remain somewhat difficult to interpret. We describe, both theoretically and empirically, the extent to which probabilistic bias analysis can be viewed as approximately Bayesian. Although the differences between probabilistic bias analysis and Bayesian approaches to misclassification can be substantial, these situations often involve unrealistic prior specifications and are relatively easy to detect. Outside of these special cases, probabilistic bias analysis and Bayesian approaches to exposure misclassification in case-control studies appear to perform equally well.
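The iterative sampling idea behind probabilistic bias analysis can be sketched in Python as follows. This is a deliberately simplified illustration, not the procedure evaluated in the abstract: it assumes nondifferential exposure misclassification (one sensitivity and one specificity for both cases and controls), draws them from hypothetical Beta priors, back-corrects the observed 2×2 counts with the standard formula, and omits the additional random-error step that full implementations typically include. All function names, counts, and prior parameters are assumptions.

```python
import random
import statistics


def corrected_exposed(observed_exposed, total, se, sp):
    """Back-calculate the expected true number exposed in one group.

    Solves observed = se * true + (1 - sp) * (total - true) for `true`.
    """
    return (observed_exposed - (1.0 - sp) * total) / (se + sp - 1.0)


def bias_adjusted_odds_ratios(a, b, c, d, se_prior, sp_prior,
                              n_draws=5000, seed=0):
    """Return bias-adjusted ORs, one per valid draw of (se, sp).

    a, b: observed exposed / unexposed cases
    c, d: observed exposed / unexposed controls
    se_prior, sp_prior: (alpha, beta) of hypothetical Beta priors
    """
    rng = random.Random(seed)
    n_cases, n_controls = a + b, c + d
    adjusted = []
    for _ in range(n_draws):
        se = rng.betavariate(*se_prior)
        sp = rng.betavariate(*sp_prior)
        if se + sp <= 1.0:
            continue  # classification no better than chance: not correctable
        a_true = corrected_exposed(a, n_cases, se, sp)
        c_true = corrected_exposed(c, n_controls, se, sp)
        b_true = n_cases - a_true
        d_true = n_controls - c_true
        if min(a_true, b_true, c_true, d_true) <= 0:
            continue  # draw implies an impossible table; discard it
        adjusted.append((a_true * d_true) / (b_true * c_true))
    return adjusted


if __name__ == "__main__":
    # Hypothetical observed table and priors (se centred near 0.8, sp near 0.9).
    draws = bias_adjusted_odds_ratios(40, 60, 20, 80, (80, 20), (90, 10))
    print("crude OR:", (40 * 80) / (60 * 20))
    print("median adjusted OR:", statistics.median(draws))
```

To represent the differential misclassification the abstract emphasizes, separate priors would be drawn for cases and controls; the single shared pair here keeps the sketch short.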

11.
12.
In case-control studies, subjects in the case group may be recruited from suspected patients who are diagnosed positively with disease. While many statistical methods have been developed for measurement error or misclassification of exposure variables in epidemiological studies, no studies have been reported on the effect of errors in diagnosing disease on testing genetic association in case-control studies. We study the impact of using the original Cochran-Armitage trend test assuming no diagnostic error when, in fact, cases and controls may be clinically diagnosed by an imperfect gold standard or a reference test. The type I error, sample size and asymptotic power of trend tests are examined under a family of genetic models in the presence of diagnostic error. The empirical powers of the trend tests are also compared by simulation studies under various genetic models.

13.
Case‐control genome‐wide association studies provide a vast amount of genetic information that may be used to investigate secondary phenotypes. We study the situation in which the primary disease is rare and the secondary phenotype and genetic markers are dichotomous. An analysis of the association between a genetic marker and the secondary phenotype based on controls only (CO) is valid, whereas standard methods that also use cases result in biased estimates and highly inflated type I error if there is an interaction between the secondary phenotype and the genetic marker on the risk of the primary disease. Here we present an adaptively weighted (AW) method that combines the case and control data to study the association, while reducing to the CO analysis if there is strong evidence of an interaction. The possibility of such an interaction and the misleading results for standard methods, but not for the AW or CO approaches, are illustrated by data from a case‐control study of colorectal adenoma. Simulations and asymptotic theory indicate that the AW method can reduce the mean square error for estimation with a prespecified SNP and increase the power to discover a new association in a genome‐wide study, compared to CO analysis. Further experience with genome‐wide studies is needed to determine when methods that assume no interaction gain precision and power, and can thereby be recommended, and when methods such as the AW or CO approaches are needed to guard against the possibility of nonzero interactions. Genet. Epidemiol. 34:427–433, 2010. Published 2010 Wiley‐Liss, Inc.

14.
Misclassification is a long‐standing statistical problem in epidemiology. In many real studies, either an exposure or a response variable or both may be misclassified. As such, potential threats to the validity of the analytic results (e.g., estimates of odds ratios) that stem from misclassification are widely discussed in the literature. Much of the discussion has been restricted to the nondifferential case, in which misclassification rates for a particular variable are assumed not to depend on other variables. However, complex differential misclassification patterns are common in practice, as we illustrate here using bacterial vaginosis and Trichomoniasis data from the HIV Epidemiology Research Study (HERS). Therefore, clear illustrations of valid and accessible methods that deal with complex misclassification are still in high demand. We formulate a maximum likelihood (ML) framework that allows flexible modeling of misclassification in both the response and a key binary exposure variable, while adjusting for other covariates via logistic regression. The approach emphasizes the use of internal validation data in order to evaluate the underlying misclassification mechanisms. Data‐driven simulations show that the proposed ML analysis outperforms less flexible approaches that fail to appropriately account for complex misclassification patterns. The value and validity of the method are further demonstrated through a comprehensive analysis of the HERS example data. Copyright © 2015 John Wiley & Sons, Ltd.

15.
A number of new study designs have appeared in which the exposure distribution of a case series is compared to an exposure distribution representing a complete theoretical population or distribution. These designs include the case‐genotype study, the case‐cross‐over study, and the case‐specular study. This paper describes a unified likelihood‐based approach to the analysis of such studies, and discusses extensions of these methods when a control group is available. The approach clarifies certain assumptions implicit in the methods, and helps contrast these assumptions to those underlying ordinary case‐control studies. There are several reasons to expect discrepancies between ordinary case‐control estimates and case‐distribution estimates; for example, case‐distribution estimates can be more sensitive to exposure misclassification. Some discrepancies are illustrated in an application to case‐specular data on wire codes and childhood cancer. Copyright © 1999 John Wiley & Sons, Ltd.

16.
BACKGROUND: The case-control study is still one of the most commonly used study designs in epidemiological research. Misclassification of case-control status remains a significant issue because it will bias the results of a case-control study. There are two types of misclassification: differential and nondifferential. It is commonly accepted that nondifferential misclassification will bias the results of the study towards the null hypothesis. By contrast, no reports have assessed the impact and direction of differential misclassification on the odds ratio (OR) estimate. The goal of the present study is to demonstrate by statistical derivation that patterns exist in the bias induced by differential misclassification. METHODS: Based on a 2 x 2 case-control study design, we derive the odds ratio without misclassification, and those with misclassification according to: (1) controls are misclassified as cases by exposure status; (2) cases are misclassified as controls by exposure status; and (3) both controls and cases are misclassified by exposure status simultaneously. Furthermore, mathematical derivations are shown for each of the ratios of the two odds ratios with and without misclassification. These derivations are examined through simulation analyses. RESULTS: Simulation analyses show that quite a number of biased odds ratios tend to move away from the null hypothesis and approach zero or infinity with an increasing proportion of misclassification among cases, controls, or both. These patterns are associated with the exposure status and the values of the unbiased odds ratio (<1, 1, or >1). CONCLUSIONS: Our findings suggest that, unlike nondifferential misclassification, differential misclassification of case-control status in a case-control study may not weaken the exposure-outcome association toward the null hypothesis. Care needs to be taken in interpreting the results of a case-control study when differential misclassification bias exists, a practical issue in epidemiological research.
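The three misclassification scenarios listed under METHODS can be made concrete with a small Python sketch. It is an illustration under assumed notation rather than the authors' derivation: the cell labels (a, b, c, d), the function names, the misclassification fractions, and the example counts are all hypothetical.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2 x 2 table.

    a: exposed cases,    b: unexposed cases,
    c: exposed controls, d: unexposed controls.
    """
    return (a * d) / (b * c)


def misclassified_table(a, b, c, d,
                        ctrl_to_case=(0.0, 0.0),
                        case_to_ctrl=(0.0, 0.0)):
    """Expected observed table after case-control status is misclassified.

    ctrl_to_case: fractions of (exposed, unexposed) controls labelled as cases
    case_to_ctrl: fractions of (exposed, unexposed) cases labelled as controls
    Unequal fractions across exposure levels make the error differential.
    """
    f_exp, f_unexp = ctrl_to_case
    g_exp, g_unexp = case_to_ctrl
    a_obs = a * (1 - g_exp) + c * f_exp
    c_obs = c * (1 - f_exp) + a * g_exp
    b_obs = b * (1 - g_unexp) + d * f_unexp
    d_obs = d * (1 - f_unexp) + b * g_unexp
    return a_obs, b_obs, c_obs, d_obs


if __name__ == "__main__":
    table = (40, 60, 20, 80)  # hypothetical true counts
    true_or = odds_ratio(*table)

    # Scenario (1): 20% of exposed controls are labelled as cases.
    up = odds_ratio(*misclassified_table(*table, ctrl_to_case=(0.2, 0.0)))
    # Same scenario, but the error affects unexposed controls instead.
    down = odds_ratio(*misclassified_table(*table, ctrl_to_case=(0.0, 0.2)))

    # Ratio of biased to unbiased OR, the quantity the abstract refers to.
    print(true_or, up / true_or, down / true_or)
```

In this hypothetical table, misclassifying exposed controls as cases pushes the OR further from the null, while misclassifying unexposed controls pulls it toward the null, which mirrors the abstract's point that the direction of bias depends on exposure status.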

17.
Genotype misclassification occurs frequently in human genetic association studies. When cases and controls are subject to the same misclassification model, Pearson's chi-square test has the correct type I error but may lose power. Most current methods adjusting for genotyping errors assume that the misclassification model is known a priori or can be assessed by a gold standard instrument. But in practical applications, the misclassification probabilities may not be completely known or the gold standard method can be too costly to be available. The repeated measurement design provides an alternative approach for identifying misclassification probabilities. With this design, a proportion of the subjects are measured repeatedly (five or more repeats) for the genotypes when the error model is completely unknown. We investigate the applications of the repeated measurement method in genetic association analysis. A cost-effectiveness study shows that if the phenotyping-to-genotyping cost ratio or the misclassification rates are relatively large, the repeat sampling can gain power over the regular case-control design. We also show that the power gain is not sensitive to the genetic model, genetic relative risk and the population high-risk allele frequency, all of which are typically important ingredients in association studies. An important implication of this result is that whatever the genetic factors are, the repeated measurement method can be applied if the genotyping errors must be accounted for or the phenotyping cost is high.

18.
Complex diseases are likely to be caused by the interplay of genetic and environmental factors. Despite this, gene‐disease associations are frequently investigated using models that focus solely on a marginal gene effect, ignoring environmental factors entirely. Failing to take into account a gene‐environment interaction can weaken the apparent gene‐disease association, leading to loss in statistical power and, potentially, inability to identify genuine risk factors. If a gene‐environment interaction exists, therefore, a joint analysis allowing the effect of the gene to differ between groups defined by the environmental exposure can have greater statistical power than a marginal gene‐disease model. However, environmental data are subject to measurement error. Substantial losses in statistical power for detecting gene‐environment interactions can arise from measurement error in the environmental exposure. It is unclear, however, what effect measurement error may have on the power of the joint analysis. We consider the potential benefits, in terms of statistical power, of collecting concurrent environmental data within large cohorts in order to enhance gene detection. We further consider whether these benefits remain in the presence of misclassification in both the gene and the environmental exposure. We find that when an effect of the gene is apparent only in the presence of the environmental exposure, the joint analysis has greater power than a marginal gene‐disease analysis. This comparative increase in power remains in the presence of likely levels of misclassification of either the gene or environmental exposure. Genet. Epidemiol. 34:552–560, 2010. © 2010 Wiley‐Liss, Inc.

19.
We examine the impact of nondifferential outcome misclassification on odds ratios estimated from pair‐matched case‐control studies and propose a Bayesian model to adjust these estimates for misclassification bias. The model relies on access to a validation subgroup with confirmed outcome status for all case‐control pairs as well as prior knowledge about the positive and negative predictive value of the classification mechanism. We illustrate the model's performance on simulated data and apply it to a database study examining the presence of ten morbidities in the prodromal phase of multiple sclerosis.

20.
Case‐control association studies often collect extensive information on secondary phenotypes, which are quantitative or qualitative traits other than the case‐control status. Exploring secondary phenotypes can yield valuable insights into biological pathways and identify genetic variants influencing phenotypes of direct interest. All publications on secondary phenotypes have used standard statistical methods, such as least‐squares regression for quantitative traits. Because of unequal selection probabilities between cases and controls, the case‐control sample is not a random sample from the general population. As a result, standard statistical analysis of secondary phenotype data can be extremely misleading. Although one may avoid the sampling bias by analyzing cases and controls separately or by including the case‐control status as a covariate in the model, the associations between a secondary phenotype and a genetic variant in the case and control groups can be quite different from the association in the general population. In this article, we present novel statistical methods that properly reflect the case‐control sampling in the analysis of secondary phenotype data. The new methods provide unbiased estimation of genetic effects and accurate control of false‐positive rates while maximizing statistical power. We demonstrate the pitfalls of the standard methods and the advantages of the new methods both analytically and numerically. The relevant software is available at our website. Genet. Epidemiol. 2009. © 2008 Wiley‐Liss, Inc.
