Similar Articles (20 results)
1.
Many epidemiological studies use a nested case-control (NCC) design to reduce cost while maintaining study power. Because NCC sampling is conditional on the primary outcome, routine application of logistic regression to analyze a secondary outcome will generally be biased. Several methods have recently been proposed to obtain unbiased estimates of risk for a secondary outcome from NCC data. All current methods share two requirements: the times of onset of the secondary outcome must be known for cohort members not selected into the NCC study, and the hazards of the two outcomes must be conditionally independent given the available covariates. The latter assumption is not plausible when the individual frailty of study subjects is not captured by the measured covariates. We provide a maximum-likelihood method that explicitly models the individual frailties and also avoids the need for access to the full cohort data. We derive the likelihood contribution by respecting the original sampling procedure with respect to the primary outcome. We use proportional hazards models for the individual hazards, and Clayton's copula is used to model additional dependence between primary and secondary outcomes beyond that explained by the measured risk factors. We show that the proposed method is more efficient than weighted likelihood and is unbiased in the presence of shared frailty for the primary and secondary outcome. We illustrate the method with an application to a study of risk factors for diabetes in a Swedish cohort. Copyright © 2014 John Wiley & Sons, Ltd.
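The dependence structure referenced above can be illustrated with the Clayton copula itself, whose closed form is standard. The sketch below is illustrative only (the function names are ours, not the paper's) and shows the copula CDF and its implied Kendall's tau, which quantifies how strongly the primary and secondary outcome times are tied together beyond what the measured covariates explain.

```python
def clayton_cdf(u, v, theta):
    """Clayton copula C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta).

    u, v are uniform margins in (0, 1]; theta > 0 controls dependence
    (theta -> 0 approaches independence, larger theta means stronger
    positive dependence in the lower tail).
    """
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def kendalls_tau(theta):
    """Kendall's tau implied by a Clayton copula: theta / (theta + 2)."""
    return theta / (theta + 2.0)
```

A quick sanity check: the copula collapses to its margins on the boundary, `C(u, 1) = u`, and joint probability at the median pair `C(0.5, 0.5, theta)` grows with theta, reflecting increasing dependence.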

2.
We compare the calibration and variability of risk prediction models that were estimated using various approaches for combining information on new predictors, termed ‘markers’, with parameter information available for other variables from an earlier model, which was estimated from a large data source. We assess the performance of risk prediction models updated based on likelihood ratio (LR) approaches that incorporate dependence between new and old risk factors as well as approaches that assume independence (‘naive Bayes’ methods). We study the impact of estimating the LR by (i) fitting a single model to cases and non-cases when the distribution of the new markers is in the exponential family or (ii) fitting separate models to cases and non-cases. We also evaluate a new constrained maximum likelihood method. We study updating the risk prediction model when the new data arise from a cohort and extend available methods to accommodate updating when the new data source is a case-control study. To create realistic correlations between predictors, we also based simulations on real data on response to antiviral therapy for hepatitis C. From these studies, we recommend the LR method fit using a single model or constrained maximum likelihood. Copyright © 2016 John Wiley & Sons, Ltd.

3.
The matched case-control design is frequently used in the study of complex disorders and can result in significant gains in efficiency, especially in the context of measuring biomarkers; however, risk prediction in this setting is not straightforward. We propose an inverse-probability weighting approach to estimate the predictive ability associated with a set of covariates. In particular, we propose an algorithm for estimating the summary index, the area under the Receiver Operating Characteristic (ROC) curve, associated with a set of pre-defined covariates for predicting a binary outcome. By combining data from the parent cohort with that generated in a matched case-control study, we describe methods for estimation of the population parameters of interest and the corresponding area under the curve. We evaluate the bias associated with the proposed methods in simulations by considering a range of parameter settings. We illustrate the methods in two data applications: (1) a prospective cohort study of cardiovascular disease in women, the Women's Health Study, and (2) a matched case-control study nested within the Nurses' Health Study aimed at risk prediction of invasive breast cancer.
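The core of an inverse-probability-weighted AUC is a weighted Mann-Whitney statistic: every case-control pair is weighted by the product of the two subjects' inverse selection probabilities, so the matched sample re-represents the parent cohort. A minimal sketch, assuming per-subject weights are already known (the function name and O(n²) pairwise loop are ours for clarity, not the authors' algorithm):

```python
def ipw_auc(y, score, weight):
    """Inverse-probability-weighted AUC (weighted Mann-Whitney statistic).

    y: 0/1 binary outcomes; score: predicted risks; weight: one inverse
    selection probability per subject (1/P(sampled into the sub-study)).
    Ties in score count as half a concordant pair.
    """
    num = den = 0.0
    for yi, si, wi in zip(y, score, weight):
        if yi != 1:
            continue  # outer loop over cases only
        for yj, sj, wj in zip(y, score, weight):
            if yj != 0:
                continue  # inner loop over controls only
            w = wi * wj  # pair weight
            den += w
            if si > sj:
                num += w
            elif si == sj:
                num += 0.5 * w
    return num / den
```

With all weights equal to 1 this reduces to the ordinary empirical AUC, which is a useful check before plugging in study-specific sampling weights.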

4.
5.
An index measuring the utility of testing a DNA marker before deciding between two alternative treatments is proposed, which can be estimated from pharmaco-epidemiological case-control or cohort studies. In the case-control design, external estimates of the prevalence of the disease and of the frequency of the genetic risk variant are required for estimating the utility index. Formulas for point and interval estimates are derived. Empirical coverage probabilities of the confidence intervals were estimated under different scenarios of disease prevalence, prevalence of drug use, and population frequency of the genetic variant. To illustrate our method, we re-analyse pharmaco-epidemiological case-control data on oral contraceptive intake and venous thrombosis in carriers and non-carriers of the factor V Leiden mutation. We also re-analyse cross-sectional data from the Framingham study on a gene-diet interaction between an APOA2 polymorphism and high saturated fat intake on obesity. We conclude that the utility index may be helpful to evaluate and appraise the potential clinical and public health relevance of gene-environment interaction effects detected in genomic and candidate gene association studies and may be a valuable decision support for designing prospective studies on clinical utility.

6.
Matched case-control designs are commonly used to control for potential confounding factors in genetic epidemiology studies, especially epigenetic studies with DNA methylation. Compared with unmatched case-control studies with high-dimensional genomic or epigenetic data, there have been few variable selection methods for matched sets. In an earlier paper, we proposed the penalized logistic regression model for the analysis of unmatched DNA methylation data using a network-based penalty. However, for the matched designs widely applied in epigenetic studies that compare DNA methylation between tumor and adjacent non-tumor tissues or between pre-treatment and post-treatment conditions, applying ordinary logistic regression while ignoring the matching is known to introduce serious bias in estimation. In this paper, we developed a penalized conditional logistic model using the network-based penalty that encourages a grouping effect of (1) linked Cytosine-phosphate-Guanine (CpG) sites within a gene or (2) linked genes within a genetic pathway for analysis of matched DNA methylation data. In our simulation studies, we demonstrated the superiority of the conditional logistic model over the unconditional logistic model in high-dimensional variable selection problems for matched case-control data. We further investigated the benefits of utilizing biological group or graph information for matched case-control data. We applied the proposed method to a genome-wide DNA methylation study on hepatocellular carcinoma (HCC) where we investigated the DNA methylation levels of tumor and adjacent non-tumor tissues from HCC patients by using the Illumina Infinium HumanMethylation27 Beadchip. Several new CpG sites and genes known to be related to HCC were identified but were missed by the standard method in the original paper. Copyright © 2012 John Wiley & Sons, Ltd.

7.
Misconceptions about the impact of case-control matching remain common. We discuss several subtle problems associated with matched case-control studies that do not arise or are minor in matched cohort studies: (1) matching, even for non-confounders, can create selection bias; (2) matching distorts dose-response relations between matching variables and the outcome; (3) unbiased estimation requires accounting for the actual matching protocol as well as for any residual confounding effects; (4) for efficiency, identically matched groups should be collapsed; (5) matching may harm precision and power; (6) matched analyses may suffer from sparse-data bias, even when using basic sparse-data methods. These problems support advice to limit case-control matching to a few strong well-measured confounders, which would devolve to no matching if no such confounders are measured. On the positive side, odds ratio modification by matched variables can be assessed in matched case-control studies without further data, and when one knows either the distribution of the matching factors or their relation to the outcome in the source population, one can estimate and study patterns in absolute rates. Throughout, we emphasize distinctions from the more intuitive impacts of cohort matching.

8.
The case-cohort study design has often been used in studies of a rare disease or for a common disease with some biospecimens needing to be preserved for future studies. A case-cohort study design consists of a random sample, called the subcohort, and all or a portion of the subjects with the disease of interest. One advantage of the case-cohort design is that the same subcohort can be used for studying multiple diseases. Stratified random sampling is often used for the subcohort. Additive hazards models are often preferred in studies where the risk difference, instead of relative risk, is of main interest. Existing methods do not use the available covariate information fully. We propose a more efficient estimator by making full use of available covariate information for the additive hazards model with data from a stratified case-cohort design with rare (the traditional situation) and non-rare (the generalized situation) diseases. We propose an estimating equation approach with a new weight function. The proposed estimators are shown to be consistent and asymptotically normally distributed. Simulation studies show that the proposed method using all available information leads to efficiency gain and stratification of the subcohort improves efficiency when the strata are highly correlated with the covariates. Our proposed method is applied to data from the Atherosclerosis Risk in Communities study. Copyright © 2015 John Wiley & Sons, Ltd.

9.
In observational studies of the effect of an exposure on an outcome, the exposure-outcome association is usually confounded by other causes of the outcome (potential confounders). One common method to increase efficiency is to match the study on potential confounders. Matched case-control studies are relatively common and well covered by the literature. Matched cohort studies are less common but do sometimes occur. It is often argued that it is valid to ignore the matching variables in the analysis of matched cohort data. In this paper, we provide analyses delineating the scope and limits of this argument. We discuss why the argument does not carry over to effect estimation in matched case-control studies, although it does carry over to null-hypothesis testing. We also show how the argument does not extend to matched cohort studies when one adjusts for additional confounders in the analysis. Ignoring the matching variables can sometimes reduce variance, even though this is not guaranteed. We investigate the trade-off between bias and variance in deciding whether adjustment for matching factors is advisable. Copyright © 2013 John Wiley & Sons, Ltd.

10.
In many large prospective cohorts, expensive exposure measurements cannot be obtained for all individuals. Exposure-disease association studies are therefore often based on nested case-control or case-cohort studies in which complete information is obtained only for sampled individuals. However, in the full cohort, there may be a large amount of information on cheaply available covariates and possibly a surrogate of the main exposure(s), which typically goes unused. We view the nested case-control or case-cohort study plus the remainder of the cohort as a full-cohort study with missing data. Hence, we propose using multiple imputation (MI) to utilise information in the full cohort when data from the sub-studies are analysed. We use the fully observed data to fit the imputation models. We consider using approximate imputation models and also using rejection sampling to draw imputed values from the true distribution of the missing values given the observed data. Simulation studies show that using MI to utilise full-cohort information in the analysis of nested case-control and case-cohort studies can result in important gains in efficiency, particularly when a surrogate of the main exposure is available in the full cohort. In simulations, this method outperforms counter-matching in nested case-control studies and a weighted analysis for case-cohort studies, both of which use some full-cohort information. Approximate imputation models perform well except when there are interactions or non-linear terms in the outcome model, where imputation using rejection sampling works well. Copyright © 2013 John Wiley & Sons, Ltd.
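The key idea above, fitting an imputation model on the fully observed data and drawing the missing exposure from it, can be sketched with a single surrogate and stochastic regression imputation. This is only a toy approximation of one imputation model, assuming a linear exposure-surrogate relationship; the function name and interface are ours, and a real analysis would also impute conditional on the outcome.

```python
import random


def impute_from_surrogate(exposure, surrogate, n_imputations=5, seed=1):
    """Stochastic regression imputation of a partially observed exposure.

    exposure: list of floats with None for subjects not sampled into the
    sub-study; surrogate: fully observed proxy for the exposure.
    Returns n_imputations completed datasets.
    """
    pairs = [(s, x) for s, x in zip(surrogate, exposure) if x is not None]
    # Least-squares fit x = a + b*s on the complete cases (imputation model).
    ms = sum(s for s, _ in pairs) / len(pairs)
    mx = sum(x for _, x in pairs) / len(pairs)
    b = (sum((s - ms) * (x - mx) for s, x in pairs)
         / sum((s - ms) ** 2 for s, _ in pairs))
    a = mx - b * ms
    resids = [x - (a + b * s) for s, x in pairs]
    resid_sd = (sum(r * r for r in resids) / max(1, len(resids) - 2)) ** 0.5
    rng = random.Random(seed)
    # Draw each missing value from its predictive distribution, once per
    # imputed dataset; observed values are carried through unchanged.
    return [
        [x if x is not None else rng.gauss(a + b * s, resid_sd)
         for s, x in zip(surrogate, exposure)]
        for _ in range(n_imputations)
    ]
```

Each completed dataset would then be analysed as a full cohort and the results pooled across imputations (e.g., with Rubin's rules).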

11.
Matched cohort analyses are becoming increasingly popular for estimating treatment effects in observational studies. However, in the applied biomedical literature, analysts and authors are inconsistent regarding whether to terminate follow-up among members of a matched set once one member is no longer under observation. This paper focused on time-to-event outcomes and used Monte Carlo simulation methods to determine the optimal approach. We found that the bias of the treatment effect estimate was negligible under both approaches and that the percentage of censoring had no discernible effect on the magnitude of bias. The mean model-based standard error of the treatment estimate was consistently higher when we terminated observation within matched pairs. Furthermore, the type 1 error rate was consistently lower when we did not terminate follow-up within matched pairs. In conclusion, when the focus was on time-to-event outcomes, we demonstrated that there was no advantage to terminating follow-up within matched pairs. Continuing follow-up on each subject until their observation was naturally complete was superior to terminating a subject's observation time once their matched pair had ceased to be under observation. Given the frequency with which these analyses are conducted in the applied literature, our results provide important guidance to analysts and applied researchers as to the preferred analytic approach. Copyright © 2015 John Wiley & Sons, Ltd.

12.
A fundamental goal of epidemiologic research is to investigate the relationship between exposures and disease risk. Cases of the disease are often considered a single outcome and assumed to share a common etiology. However, evidence indicates that many human diseases arise and evolve through a range of heterogeneous molecular pathologic processes, influenced by diverse exposures. Pathogenic heterogeneity has been considered in various neoplasms such as colorectal, lung, prostate, and breast cancers, leukemia and lymphoma, and non-neoplastic diseases, including obesity, type II diabetes, glaucoma, stroke, cardiovascular disease, autism, and autoimmune disease. In this article, we discuss analytic options for studying disease subtype heterogeneity, emphasizing methods for evaluating whether the association of a potential risk factor with disease varies by disease subtype. Methods are described for scenarios where disease subtypes are categorical and ordinal and for cohort studies, matched and unmatched case-control studies, and case-case study designs. For illustration, we apply the methods to a molecular pathological epidemiology study of alcohol intake and colon cancer risk by tumor LINE-1 methylation subtypes. User-friendly software to implement the methods is publicly available. Copyright © 2015 John Wiley & Sons, Ltd.

13.
We describe a flexible family of tests for evaluating the goodness of fit (calibration) of a pre-specified personal risk model to the outcomes observed in a longitudinal cohort. Such evaluation involves using the risk model to assign each subject an absolute risk of developing the outcome within a given time from cohort entry and comparing subjects' assigned risks with their observed outcomes. This comparison involves several issues. For example, subjects followed only for part of the risk period have unknown outcomes. Moreover, existing tests do not reveal the reasons for poor model fit when it occurs, which can reflect misspecification of the model's hazards for the competing risks of outcome development and death. To address these issues, we extend the model-specified hazards for outcome and death, and use score statistics to test the null hypothesis that the extensions are unnecessary. Simulated cohort data applied to risk models whose outcome and mortality hazards agreed and disagreed with those generating the data show that the tests are sensitive to poor model fit, provide insight into the reasons for poor fit, and accommodate a wide range of model misspecification. We illustrate the methods by examining the calibration of two breast cancer risk models as applied to a cohort of participants in the Breast Cancer Family Registry. The methods can be implemented using the Risk Model Assessment Program, an R package freely available at http://stanford.edu/~ggong/rmap/. Copyright © 2014 John Wiley & Sons, Ltd.

14.
Analysing the determinants and consequences of hospital-acquired infections involves the evaluation of large cohorts. Infected patients in the cohort are often rare for specific pathogens, because most of the patients admitted to the hospital are discharged or die without such an infection. Death and discharge are competing events to acquiring an infection, because these individuals are no longer at risk of getting a hospital-acquired infection. Therefore, the data are best analysed with an extended survival model - the extended illness-death model. A common problem in cohort studies is the costly collection of covariate values. In order to provide efficient use of data from infected as well as uninfected patients, we propose a tailored case-cohort approach for the extended illness-death model. The basic idea of the case-cohort design is to only use a random sample of the full cohort, referred to as the subcohort, and all cases, namely the infected patients. Thus, covariate values are only obtained for a small part of the full cohort. The method is based on existing and established methods and is used to perform regression analysis in adapted Cox proportional hazards models. We propose estimation of all cause-specific cumulative hazards and transition probabilities in an extended illness-death model based on case-cohort sampling. As an example, we apply the methodology to infection with a specific pathogen using a large cohort from Spanish hospital data. The obtained results of the case-cohort design are compared with the results in the full cohort to investigate the performance of the proposed method. Copyright © 2016 John Wiley & Sons, Ltd.
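The sampling step described above, a random subcohort plus all cases, is mechanically simple, which is part of the design's appeal: expensive covariates are collected only for the returned subjects. A minimal sketch (function name and interface are ours, not from the paper):

```python
import random


def case_cohort_sample(ids, is_case, subcohort_fraction=0.1, seed=42):
    """Draw a case-cohort sample: a random subcohort plus all cases.

    ids: subject identifiers for the full cohort; is_case: parallel list of
    booleans (e.g., acquired the infection of interest). Covariates need
    only be collected for the returned set of ids.
    """
    rng = random.Random(seed)
    n_sub = max(1, round(subcohort_fraction * len(ids)))
    subcohort = set(rng.sample(ids, n_sub))          # random subsample of everyone
    cases = {i for i, c in zip(ids, is_case) if c}   # all infected patients
    return subcohort | cases
```

Because the subcohort is drawn without regard to outcome, the same subcohort can be reused for other endpoints, which is the efficiency argument made for this design.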

15.
Propensity-score matching allows one to reduce the effects of treatment-selection bias or confounding when estimating the effects of treatments when using observational data. Some authors have suggested that methods of inference appropriate for independent samples can be used for assessing the statistical significance of treatment effects when using propensity-score matching. Indeed, many authors in the applied medical literature use methods for independent samples when making inferences about treatment effects using propensity-score matched samples. Dichotomous outcomes are common in healthcare research. In this study, we used Monte Carlo simulations to examine the effect on inferences about risk differences (or absolute risk reductions) when statistical methods for independent samples are used compared with when statistical methods for paired samples are used in propensity-score matched samples. We found that compared with using methods for independent samples, the use of methods for paired samples resulted in: (i) empirical type I error rates that were closer to the advertised rate; (ii) empirical coverage rates of 95 per cent confidence intervals that were closer to the advertised rate; (iii) narrower 95 per cent confidence intervals; and (iv) estimated standard errors that more closely reflected the sampling variability of the estimated risk difference. Differences between the empirical and advertised performance of methods for independent samples were greater when the treatment-selection process was stronger compared with when treatment-selection process was weaker. We recommend using statistical methods for paired samples when using propensity-score matched samples for making inferences on the effect of treatment on the reduction in the probability of an event occurring.
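A paired-samples analysis of a dichotomous outcome in matched data rests on the discordant pairs: pairs where only the treated member (count b) or only the control member (count c) had the event. The McNemar statistic and a pairing-respecting standard error for the risk difference are standard formulas, sketched here (function names are ours):

```python
import math


def mcnemar_statistic(b, c):
    """McNemar chi-square for paired binary outcomes.

    b: pairs where only the treated member had the event;
    c: pairs where only the control member had the event.
    Concordant pairs carry no information about the effect and drop out.
    """
    return (b - c) ** 2 / (b + c)


def paired_risk_difference_se(b, c, n_pairs):
    """Standard error of the risk difference (b - c)/n that respects pairing.

    Treats each pair's outcome difference d_i in {-1, 0, +1}:
    Var(d) = (b + c)/n - ((b - c)/n)^2, and SE = sqrt(Var(d)/n).
    """
    n = n_pairs
    return math.sqrt((b + c) / n - ((b - c) / n) ** 2) / math.sqrt(n)
```

Using an independent-samples (two-proportion) standard error here would ignore the within-pair correlation induced by matching on the propensity score, which is exactly the mismatch the simulations above quantify.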

16.
OBJECTIVE--The aim was to assess the extent to which selection bias affects a case-control study of breast cancer screening in which attenders and non-attenders for screening are compared. DESIGN--There were two retrospective case-control studies, one estimating the risk of death from breast cancer in women in the screening district relative to those in the comparison district (study A), the second estimating the relative risk for women who had ever been screened compared with women who had never been screened in the screening district alone (study B). For cases and controls in study B, the women's screening history was summarised for the time period from date of entry to diagnosis of the case, or the equivalent time from date of entry for the matched controls. For cases detected by screening, the screen at which cancer was detected was included in the screening history. SUBJECTS--Cases were deaths from breast cancer in women with disease diagnosed after entry to the trial, up to 31 December 1986 or a maximum of seven years from date of entry, in one of the screening districts (Guildford) and one of the comparison districts (Stoke) participating in the UK Trial of Early Detection of Breast Cancer: study A: 198 deaths in Guildford and Stoke; study B: 51 deaths in Guildford only. There were five age matched controls for each case, with length of follow up at least as great as the time from entry to death of the case. MAIN RESULTS--The estimate of the risk of death from breast cancer in the screening district relative to the comparison district from study A was 0.76, thus implying a reduction of 24% in the screening district, similar to that obtained from a cohort analysis of data from the two districts. In contrast, the relative risk in study B for ever v never screened women was 0.51, which, taking the 72% compliance into account, would result in a relative risk of 0.65 for the screening district if there were no selection bias. 
The risk of breast cancer mortality in the never screened relative to the comparison district was 1.13, despite the fact that incidence rates in the two populations were similar. This suggested that cancers in the never screened group had a particularly poor prognosis, contributing to selection bias. CONCLUSIONS--The possible existence of selection bias should lead to caution in interpretation of the results of case-control studies of the effect of breast cancer screening on mortality.

17.
The process by which patients experience a series of recurrent events, such as hospitalizations, may be subject to death. In cohort studies, one strategy for analyzing such data is to fit a joint frailty model for the intensities of the recurrent event and death, which estimates covariate effects on the two event types while accounting for their dependence. When certain covariates are difficult to obtain, however, researchers may only have the resources to subsample patients on whom to collect complete data: one way is using the nested case-control (NCC) design, in which risk set sampling is performed based on a single outcome. We develop a general framework for the design of NCC studies in the presence of recurrent and terminal events and propose estimation and inference for a joint frailty model for recurrence and death using data arising from such studies. We propose a maximum weighted penalized likelihood approach using flexible spline models for the baseline intensity functions. Two standard error estimators are proposed: a sandwich estimator and a perturbation resampling procedure. We investigate operating characteristics of our estimators as well as design considerations via a simulation study and illustrate our methods using two studies: one on recurrent cardiac hospitalizations in patients with heart failure and the other on local recurrence and metastasis in patients with breast cancer.
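The risk-set sampling step at the heart of an NCC design can be sketched for the simplest single-outcome case: at each observed event time, a fixed number of controls is drawn at random from the subjects still under observation. This is a generic illustration of risk-set sampling, not the paper's recurrent-event extension; the function name and interface are ours.

```python
import random


def ncc_sample(event_time, is_event, m_controls=2, seed=7):
    """Nested case-control sampling with risk-set matching.

    event_time: follow-up time for each subject; is_event: True if the
    subject's follow-up ended with the event of interest. For each case,
    returns (case_index, control_indices) with up to m_controls drawn from
    the risk set: subjects still under observation at the case's event time.
    """
    rng = random.Random(seed)
    n = len(event_time)
    events = sorted((t, i) for i, (t, e)
                    in enumerate(zip(event_time, is_event)) if e)
    sampled_sets = []
    for t, i in events:
        risk_set = [j for j in range(n) if event_time[j] >= t and j != i]
        controls = rng.sample(risk_set, min(m_controls, len(risk_set)))
        sampled_sets.append((i, controls))
    return sampled_sets
```

Each (case, controls) set would then feed a conditional likelihood; extending the sampling to handle both recurrent and terminal events is the design question the paper addresses.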

18.
Epidemiologic studies of occupational cohorts have played a major role in the quantitative assessment of risks associated with several carcinogenic hazards and are likely to play an increasingly important role in this area. Relatively little attention has been given in either the epidemiologic or the risk assessment literature to the development of appropriate methods for modeling epidemiologic data for quantitative risk assessment (QRA). The purpose of this paper is to review currently available methods for modeling epidemiologic data for risk assessment. The focus of this paper is on methods for use with retrospective cohort mortality studies of occupational groups for estimating cancer risk, since these are the data most commonly used when epidemiologic information is used for QRA. Both empirical (e.g., Poisson regression and the Cox proportional hazards model) and biologic (e.g., two-stage models) models are considered. Analyses of a study of lung cancer among workers exposed to cadmium are used to illustrate these modeling methods. Based on this example it is demonstrated that the selection of a particular model may have a large influence on the resulting estimates of risk.

19.
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case-control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS), are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non-linear models. Adjustment of cohort data for the DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on the DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

20.
Biomarkers are often measured over time in epidemiological studies and clinical trials for better understanding of the mechanism of diseases. In large cohort studies, case-cohort sampling provides a cost-effective method to collect expensive biomarker data for revealing the relationship between biomarker trajectories and time to event. However, biomarker measurements are often limited by the sensitivity and precision of a given assay, resulting in data that are censored at detection limits and prone to measurement errors. Additionally, the occurrence of an event of interest may preclude biomarkers from being further evaluated. Inappropriate handling of these types of data can lead to biased conclusions. Under a classical case-cohort design, we propose a modified likelihood-based approach to accommodate these special features of longitudinal biomarker measurements in the accelerated failure time models. The maximum likelihood estimators based on the full likelihood function are obtained by the Gaussian quadrature method. We evaluate the performance of our case-cohort estimator and compare its relative efficiency to the full cohort estimator through simulation studies. The proposed method is further illustrated using the data from a biomarker study of sepsis among patients with community acquired pneumonia. Copyright © 2015 John Wiley & Sons, Ltd.
