Similar articles
20 similar articles found.
1.
2.
PURPOSE: Dependent binary responses, such as health outcomes in twin pairs or siblings, frequently arise in perinatal epidemiologic research. This gives rise to correlated data, which must be taken into account during analysis to avoid erroneous statistical and biological inferences. METHODS: An analysis of perinatal mortality (fetal deaths plus deaths within the first 28 days) in twins in relation to cluster-varying (those that are unique to each fetus within a twin pregnancy such as birthweight) and cluster-constant (those that are identical for both twins within a sibship such as maternal smoking status) risk factors is presented. Marginal (ordinary logistic regression [OLR] and logistic regression using generalized estimating equations [GEE]) and cluster-specific (conditional and random-intercept logistic regression models) regression models are fit and their results contrasted. The United States "matched multiple data" file of twin births (1995-1997), which includes 285,226 twins from 142,613 pregnancies, was used to examine the implications of ignoring clustering on regression inferences. RESULTS: The OLR models provide variance estimates for cluster-constant covariates that ranged from 7% to 71% smaller than those from GEE-based models. This underestimation is even more pronounced for some cluster-varying covariates, ranging from 21% to 198%. CONCLUSIONS: Ignoring the cluster dependency is likely to affect the precision of covariate effects and consequently interpretation of results. With widespread availability of appropriate software, statistical methods for taking the intracluster dependency into account are easily implemented and necessary.
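A rough way to see why ignoring clustering understates variance is the classical design effect, 1 + (m - 1)ρ, where m is the cluster size and ρ the intracluster correlation. The sketch below is not the analysis reported in the abstract: it assumes equal-sized clusters, a cluster-constant covariate and an exchangeable correlation, and the function names and the value of ρ are illustrative only.

```python
def design_effect(cluster_size: int, icc: float) -> float:
    """Approximate variance inflation caused by clustering.

    cluster_size: members per cluster (2 for twin pairs).
    icc: assumed intracluster correlation, between 0 and 1.
    """
    return 1.0 + (cluster_size - 1) * icc


def naive_variance_shortfall(cluster_size: int, icc: float) -> float:
    """Fraction by which an independence-based variance falls short
    of the clustering-adjusted variance under the same assumptions."""
    return 1.0 - 1.0 / design_effect(cluster_size, icc)


if __name__ == "__main__":
    # Hypothetical correlation of 0.5 between co-twins.
    print(design_effect(2, 0.5))
    print(naive_variance_shortfall(2, 0.5))
```

For twin pairs the factor reduces to 1 + ρ, so under these assumptions the naive variance can be too small by at most one half.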

3.
Twin studies have long been recognized for their value in learning about the aetiology of disease and specifically for their potential for separating genetic effects from environmental effects. The recent upsurge of interest in life-course epidemiology and the study of developmental influences on later health has provided a new impetus to study twins as a source of unique insights. Twins are of special interest because they provide naturally matched pairs where the confounding effects of a large number of potentially causal factors (such as maternal nutrition or gestation length) may be removed by comparisons between twins who share them. The traditional tool of epidemiological 'risk factor analysis' is the regression model, but it is not straightforward to transfer standard regression methods to twin data, because the analysis needs to reflect the paired structure of the data, which induces correlation between twins. This paper reviews the use of more specialized regression methods for twin data, based on generalized least squares or linear mixed models, and explains the relationship between these methods and the commonly used approach of analysing within-twin-pair difference values. Methods and issues of interpretation are illustrated using an example from a recent study of the association between birth weight and cord blood erythropoietin. We focus on the analysis of continuous outcome measures but review additional complexities that arise with binary outcomes. We recommend the use of a general model that includes separate regression coefficients for within-twin-pair and between-pair effects, and provide guidelines for the interpretation of estimates obtained under this model.
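The link between a model with separate within-pair and between-pair coefficients and the analysis of within-pair differences can be sketched for complete pairs and a continuous outcome. The code below is a minimal illustration, not the paper's generalized least squares or mixed-model fit: it uses ordinary least squares, for which the within-pair slope equals the slope of outcome differences on exposure differences through the origin, and the between-pair slope is the regression of pair-mean outcome on pair-mean exposure. The function name and data layout are assumptions.

```python
from statistics import mean


def between_within_slopes(pairs):
    """Return (within_slope, between_slope) for complete twin pairs.

    pairs: list of ((x1, y1), (x2, y2)), one entry per twin pair,
    where x is the exposure and y the continuous outcome.
    """
    # Within-pair effect: regress outcome differences on exposure
    # differences, with no intercept.
    num_w = sum((x1 - x2) * (y1 - y2) for (x1, y1), (x2, y2) in pairs)
    den_w = sum((x1 - x2) ** 2 for (x1, _), (x2, _) in pairs)

    # Between-pair effect: regress pair-mean outcome on pair-mean exposure.
    x_means = [(x1 + x2) / 2 for (x1, _), (x2, _) in pairs]
    y_means = [(y1 + y2) / 2 for (_, y1), (_, y2) in pairs]
    x_bar, y_bar = mean(x_means), mean(y_means)
    num_b = sum((xm - x_bar) * (ym - y_bar) for xm, ym in zip(x_means, y_means))
    den_b = sum((xm - x_bar) ** 2 for xm in x_means)

    if den_w == 0 or den_b == 0:
        raise ValueError("exposure must vary within and between pairs")
    return num_w / den_w, num_b / den_b


if __name__ == "__main__":
    # Made-up data with a within-pair slope of 2 and a between-pair slope of 5.
    data = [((1, 9), (3, 13)), ((2, 17), (6, 25)), ((4, 27), (8, 35))]
    print(between_within_slopes(data))
```

When the two slopes differ, a single pooled regression coefficient mixes them, which is one reason to report them separately.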

4.
Alcohol consumption, anxiety, and depression were measured by questionnaire in 572 twin families ascertained from the Institute of Psychiatry (London) normal twin register, each family consisting of an adult twin pair, their parents, and siblings, a total of 1,742 individuals. A multivariate normal model for pedigree analysis was applied to each variable, with power transformations fitted to maximise the fit with distributional assumptions. The effect of shared twin environment was estimated by considering the measured cohabitation history of twin pairs. For log-transformed alcohol consumption, amongst current drinkers this effect was the same for MZ and DZ pairs but depended on the cohabitation status of pairs. For both anxiety and depression the effect was clearly not the same for MZ and DZ pairs. Therefore the basic assumption of the classical twin method appears to be invalid for all three traits. Estimates of heritability derived from these analyses were compared with those obtained (1) by applying the classical twin method to twin data only, and (2) by a pedigree analysis ignoring the effect of shared twin environment. For all variables there were considerable differences between estimates based on the three models. This study illustrates that data from twins and their relatives which includes information on cohabitation history might distinguish shared genes and shared environment as causes of familial aggregation. In these behavioral traits the effect of shared twin environment may depend on zygosity and play a major role in explaining familial aggregation in twin family data.

5.
Prognostic models can be developed with multiple regression analysis of a data set containing individual patient data. Often this data set is relatively small, while previously published studies present results for larger numbers of patients. We describe a method to combine univariable regression results from the medical literature with univariable and multivariable results from the data set containing individual patient data. This 'adaptation method' exploits the generally strong correlation between univariable and multivariable regression coefficients. The method is illustrated with several logistic regression models to predict 30-day mortality in patients with acute myocardial infarction. The regression coefficients showed considerably less variability when estimated with the adaptation method, compared to standard maximum likelihood estimates. Also, model performance, assessed in terms of calibration and discrimination, improved clearly when compared to models including shrunk or penalized estimates. We conclude that prognostic models may benefit substantially from explicit incorporation of literature data.

6.
We assess the asymptotic bias of estimates of exposure effects conditional on covariates when summary scores of confounders, instead of the confounders themselves, are used to analyze observational data. First, we study regression models for cohort data that are adjusted for summary scores. Second, we derive the asymptotic bias for case‐control studies when cases and controls are matched on a summary score, and then analyzed either using conditional logistic regression or by unconditional logistic regression adjusted for the summary score. Two scores, the propensity score (PS) and the disease risk score (DRS) are studied in detail. For cohort analysis, when regression models are adjusted for the PS, the estimated conditional treatment effect is unbiased only for linear models, or at the null for non‐linear models. Adjustment of cohort data for DRS yields unbiased estimates only for linear regression; all other estimates of exposure effects are biased. Matching cases and controls on DRS and analyzing them using conditional logistic regression yields unbiased estimates of exposure effect, whereas adjusting for the DRS in unconditional logistic regression yields biased estimates, even under the null hypothesis of no association. Matching cases and controls on the PS yields unbiased estimates only under the null for both conditional and unconditional logistic regression, adjusted for the PS. We study the bias for various confounding scenarios and compare our asymptotic results with those from simulations with limited sample sizes. To create realistic correlations among multiple confounders, we also based simulations on a real dataset. Copyright © 2015 John Wiley & Sons, Ltd.

7.
This paper compares three published methods for analysing multiple correlated ROC curves: a method using generalized estimating equations with marginal non-proportional ordinal regression models; a method using jackknifed pseudovalues of summary statistics; a method using a corrected F-test from analysis of variance of summary statistics. Use of these methods is illustrated through six real data examples from studies with the common factorial design, that is, multiple readers interpreting images obtained with each test modality on each study subject. The issue of the difference between typical summary statistics and summary statistics from typical ROC curves is explored. The examples also address similarities and differences among the analytical methods. In particular, while point estimates of differences between test modalities are similar, the standard errors of these differences do not agree for all three methods. A simulation study supports the standard errors provided by the generalized estimating equations with marginal non-proportional ordinal regression models.

8.
Estimation of individual genetic and environmental factor scores   Cited by: 1 (self-citations: 0; citations by others: 1)
Implicit in the application of the common-factor model as a method for decomposing trait covariance into a genetic and environmental part is the use of factor scores. In multivariate analyses, it is possible to estimate these factor scores for the communal part of the model. Estimation of scores on latent factors in terms of individual observations within the context of a twin/family study amounts to estimation of individual genetic and environmental scores. Such estimates may be of both theoretical and practical interest and may be provided with confidence intervals around the individual estimates. The method is first illustrated with simulated twin data and next is applied to blood pressure data obtained in a Dutch sample of 59 male adolescent twin pairs. Subjects with high blood pressure can be distinguished into groups with high genetic or high environmental scores.

9.
The most important statistical methods currently used for the analysis of twin data are described. The main objective of these methods is to estimate the contribution of the genetic and environmental factors to the variability of normal or pathological human traits, by means of the information obtained from monozygotic and dizygotic twin pairs. In this context, the concept of heritability becomes relevant. Not only the simple comparison between monozygotic and dizygotic twins, based on measures such as the concordance and the correlation, but also new and more complex approaches are presented, with a special emphasis on the structural equation models, and a synthetic view on the DF-analysis and the correlated frailty models. Some examples of applications to the data of the Italian Twin Registry are also illustrated.
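The simple comparison of monozygotic and dizygotic correlations mentioned above is often summarised with Falconer's formulas. The sketch below uses those classical formulas rather than the structural equation models the abstract emphasises; it assumes purely additive genetic effects and equal shared environments for MZ and DZ pairs, and the function name and input correlations are illustrative.

```python
def falconer_components(r_mz: float, r_dz: float) -> dict:
    """Variance components from twin correlations (Falconer's formulas).

    r_mz: trait correlation in monozygotic pairs.
    r_dz: trait correlation in dizygotic pairs.
    """
    return {
        "heritability": 2.0 * (r_mz - r_dz),        # additive genetic share
        "shared_environment": 2.0 * r_dz - r_mz,    # common environment share
        "unique_environment": 1.0 - r_mz,           # individual-specific share
    }


if __name__ == "__main__":
    # Hypothetical correlations, not taken from any registry.
    print(falconer_components(r_mz=0.8, r_dz=0.5))
```

The three components sum to one by construction; estimates outside the range 0 to 1 signal that the assumptions do not hold for the data at hand.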

10.
Log-linear models for the analysis of matched cohort studies   Cited by: 1 (self-citations: 0; citations by others: 1)
The application of conditional logistic regression to the analysis of matched case-control studies has now become quite customary. In addition, it is well known that software designed to fit linear logistic and log-linear models can be used in these analyses. The application of conditional logistic regression to cohort designs is described, and an approach is developed that adapts the linear logistic and log-linear models for the analysis of prospectively collected data. Specific situations discussed include matched pairs, 2:1 matching, and studies in which some subjects are pair matched and others matched 2:1. The methods are illustrated with numeric examples.
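For the simplest case, 1:1 matched pairs with a single binary exposure and no other covariates, the conditional likelihood estimate of the odds ratio reduces to a ratio of outcome-discordant pairs. The sketch below illustrates only that special case for a matched cohort, not the log-linear approach developed in the paper; the function name and the input layout are assumptions.

```python
def matched_pair_odds_ratio(pairs):
    """Conditional odds ratio estimate for 1:1 matched cohort pairs.

    pairs: iterable of (exposed_event, unexposed_event) booleans,
    one tuple per matched pair. Pairs in which both or neither
    member has the event carry no information and are skipped.
    """
    exposed_only = sum(1 for e, u in pairs if e and not u)
    unexposed_only = sum(1 for e, u in pairs if u and not e)
    if unexposed_only == 0:
        raise ValueError("no pairs with an event in the unexposed member only")
    return exposed_only / unexposed_only


if __name__ == "__main__":
    # Made-up data: 6 pairs with an event only in the exposed member,
    # 3 pairs with an event only in the unexposed member, 5 concordant pairs.
    data = [(True, False)] * 6 + [(False, True)] * 3 + [(True, True)] * 5
    print(matched_pair_odds_ratio(data))
```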

11.
Marginal modeling of nonnested multilevel data using standard software   Cited by: 1 (self-citations: 0; citations by others: 1)
Epidemiologic data are often clustered within multiple levels that may not be nested within each other. Generalized estimating equations are commonly used to adjust for correlation among observations within clusters when fitting regression models; however, standard software does not currently accommodate nonnested clusters. This paper introduces a simple generalized estimating equation strategy that uses available commercial or public software for the regression analysis of nonnested multilevel data. The authors describe how to obtain empirical standard error estimates for constructing valid confidence intervals and conducting statistical hypothesis tests. The method is evaluated using simulations and illustrated with an analysis of data from the Breast Cancer Surveillance Consortium that estimates the influence of woman, radiologist, and facility characteristics on the positive predictive value of screening mammography. Performance with a small number of clusters is discussed. Both the simulations and the example demonstrate the importance of accounting for the correlation within all levels of clustering for proper inference.

12.
Mean‐based semi‐parametric regression models such as the popular generalized estimating equations are widely used to improve robustness of inference over parametric models. Unfortunately, such models are quite sensitive to outlying observations. The Wilcoxon‐score‐based rank regression (RR) provides more robust estimates over generalized estimating equations against outliers. However, the RR and its extensions do not sufficiently address missing data arising in longitudinal studies. In this paper, we propose a new approach to address outliers under a different framework based on the functional response models. This functional‐response‐model‐based alternative not only addresses limitations of the RR and its extensions for longitudinal data, but, with its rank‐preserving property, even provides more robust estimates than these alternatives. The proposed approach is illustrated with both real and simulated data. Copyright © 2016 John Wiley & Sons, Ltd.

13.
The univariate analysis of categorical twin data can be performed using either structural equation modeling (SEM) or logistic regression. This paper presents a comparison between these two methods using a simulation study. Dichotomous and ordinal (three category) twin data are simulated under two different sample sizes (1,000 and 2,000 twin pairs) and according to different additive genetic and common environmental models of phenotypic variation. The two methods are found to be generally comparable in their ability to detect a “correct” model under the specifications of the simulation. Both methods lack power to detect the right model for dichotomous data when the additive genetic effect is low (between 10 and 20%) or medium (between 30 and 40%); the ordinal data simulations produce similar results except for the additive genetic model with medium or high heritability. Neither method could adequately detect a correct model that included a modest common environmental effect (20%) even when the additive genetic effect was large and the sample size included 2,000 twin pairs. The SEM method was found to have better power than logistic regression when there is a medium (30%) or high (50%) additive genetic effect and a modest common environmental effect. Conversely, logistic regression performed better than SEM in correctly detecting additive genetic effects with simulated ordinal data (for both 1,000 and 2,000 pairs) that did not contain modest common environmental effects; in this case the SEM method incorrectly detected a common environmental effect that was not present. © 1996 Wiley-Liss, Inc.

14.
The contributions of shared genes and shared environments to familial aggregation of coronary heart disease risk factors were investigated by genetic and epidemiologic analysis of 434 adult female twin pairs from the Kaiser-Permanente Twin Registry in Oakland, California, during 1978 and 1979. Initial estimates of genetic heritability were statistically significant for serum levels of high density lipoprotein (HDL) cholesterol, low density lipoprotein (LDL) cholesterol, triglycerides, and Quetelet index, but were only marginally significant for systolic and diastolic blood pressures. These estimates were biased, however, because sisters in the same identical twin pair were more similar than sisters in the same fraternal twin pair not only with respect to shared genes but also with respect to shared environmental and behavioral influences. Heritability was estimated again after adjusting for shared environmental and behavioral effects by multiple regression analysis. Genetic heritability remained significant for HDL cholesterol (0.66), LDL cholesterol (0.88), triglycerides (0.53), and relative weight (0.55) but not for systolic (0.42) and diastolic (0.25) blood pressures. The strong genetic components of the levels of LDL cholesterol, HDL cholesterol, and relative weight may in part explain why some women have high levels of these coronary disease risk factors despite following recommended health behaviors.

15.
The use of standard univariate fixed- and random-effects models in meta-analysis has become well known in the last 20 years. However, these models are unsuitable for meta-analysis of clinical trials that present multiple survival estimates (usually illustrated by a survival curve) during a follow-up period. Therefore, special methods are needed to combine the survival curve data from different trials in a meta-analysis. For this purpose, only fixed-effects models have been suggested in the literature. In this paper, we propose a multivariate random-effects model for joint analysis of survival proportions reported at multiple time points and in different studies, to be combined in a meta-analysis. The model could be seen as a generalization of the fixed-effects model of Dear (Biometrics 1994; 50:989-1002). We illustrate the method by using a simulated data example as well as using a clinical data example of meta-analysis with aggregated survival curve data. All analyses can be carried out with standard general linear MIXED model software. Copyright (c) 2008 John Wiley & Sons, Ltd.

16.
17.
Distributed lag models (DLM) are regression models that include multiple lagged exposure variables as covariates. They are frequently used to model the relationship between daily mortality and short-term air pollution exposures. Specifying a maximum lag number is but one of the difficulties in using a DLM for environmental epidemiology. We propose an easily extendible ensemble post-processing approach. The resultant estimates are both more parsimonious, approaching zero with increasing lag, and more efficient. The benefits are shown to be robust under various simulation scenarios and illustrated with data from the National Morbidity, Mortality and Air Pollution Study.
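To make "multiple lagged exposure variables as covariates" concrete, the sketch below builds the covariate rows for a chosen maximum lag from a daily exposure series. It covers only this data-preparation step, not the ensemble post-processing approach proposed in the abstract; the function name and the example series are hypothetical.

```python
def lagged_exposure_matrix(exposure, max_lag):
    """Build distributed-lag covariate rows from a daily exposure series.

    Row t holds [x_t, x_(t-1), ..., x_(t-max_lag)]. The first max_lag
    days are dropped because their full lag history is unavailable.
    """
    if max_lag < 0:
        raise ValueError("max_lag must be non-negative")
    if len(exposure) <= max_lag:
        raise ValueError("series is too short for the requested max_lag")
    return [
        [exposure[t - lag] for lag in range(max_lag + 1)]
        for t in range(max_lag, len(exposure))
    ]


if __name__ == "__main__":
    # Made-up daily pollutant concentrations.
    daily_pm = [10, 12, 9, 15, 11]
    for row in lagged_exposure_matrix(daily_pm, max_lag=2):
        print(row)
```

Each additional lag costs one day of usable data and adds one strongly correlated covariate, which is part of why choosing the maximum lag is difficult.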

18.
The classic twin model design has a wide application in human genetics. Under the assumption that nongenetic effects are shared to the same degree by monozygotic (MZ) and dizygotic (DZ) twin pairs, a test of the equality of casewise concordances between MZ and DZ twins provides a clue to the influence of genetic and environmental factors on a disease. The casewise concordance is the conditional probability that, given that one member of a twin pair is affected, the other is also affected. When disease prevalence is low or cost-effectiveness is considered, collection of twin pairs by ascertainment for performing casewise concordance analysis is required. In this article, by defining an overall casewise concordance parameter, several likelihood-based tests, namely the likelihood ratio test (LR), the score test (Score), the usual Wald test (Wald) and an alternative Wald test (WaldA), are investigated for a test of the equality of concordances between ascertained MZ and DZ twin pairs under multinomial models. Simulation studies were conducted for data with small sample sizes. The results show that the type I error rates and power of LR and Score are stable only when the overall casewise concordances are not extremely small or large. The Wald test has higher power in most cases but slightly inflates type I error rates; WaldA is the most robust and is the recommended approach.
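The definition of casewise concordance above can be turned into a simple point estimate. The sketch below uses the standard estimator 2C / (2C + D), with C concordant affected pairs and D discordant pairs, which assumes complete ascertainment of affected twins; it does not reproduce the overall concordance parameter or the likelihood-based tests studied in the article, and the counts are invented.

```python
def casewise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Estimate P(co-twin affected | twin affected).

    concordant_pairs: pairs in which both twins are affected.
    discordant_pairs: pairs in which exactly one twin is affected.
    Assumes every affected twin in the population was ascertained.
    """
    affected_twins = 2 * concordant_pairs + discordant_pairs
    if affected_twins == 0:
        raise ValueError("no affected twins in the sample")
    # Each concordant pair contributes two affected twins whose co-twin
    # is also affected; each discordant pair contributes one whose is not.
    return 2 * concordant_pairs / affected_twins


if __name__ == "__main__":
    # Hypothetical counts for MZ and DZ pairs.
    print(casewise_concordance(concordant_pairs=10, discordant_pairs=20))
    print(casewise_concordance(concordant_pairs=4, discordant_pairs=32))
```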

19.
OBJECTIVE: To illustrate methods for handling incomplete data in health research. METHODS: Two strategies for handling missing data are presented: complete-case analysis and imputations. The imputations used were mean imputations, regression imputations, and multiple imputations. These strategies are illustrated in the context of logistic regression through an example using data from the "Second Cuban national survey on risk factors and non communicable disease", carried out in 2001. RESULTS: The results obtained via mean and regression imputation were similar. The odds ratios were overestimated by 10%. The results of complete-case analysis showed the greatest difference from the reference odds ratios, with a variation of between 2 and 65%. The three methods distorted the relationship between age and hypertension. Multiple imputations produced estimates closest to those of the reference estimates with a variation of less than 16%. This was the only procedure preserving the relationship between age and hypertension. CONCLUSIONS: Selecting methods for handling missing data is difficult, since the same procedure can give precise estimations in certain circumstances and not in others. Complete-case analysis should be used with caution due to the substantial loss of information it produces. Mean and regression imputations produce unreliable estimates under missing at random (MAR) mechanisms.
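Two of the strategies named above, complete-case analysis and mean imputation, can be contrasted on a single variable. The sketch below is a toy illustration with invented values, not the survey analysis from the abstract; regression and multiple imputation are not shown, and the function names are assumptions.

```python
from statistics import mean, pvariance


def complete_case(values):
    """Keep only the observed values; missing entries are None."""
    return [v for v in values if v is not None]


def mean_impute(values):
    """Replace each missing entry with the mean of the observed values."""
    observed = complete_case(values)
    if not observed:
        raise ValueError("no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    # Made-up systolic blood pressure readings with two missing values.
    sbp = [120.0, None, 140.0, None, 160.0]
    print(len(complete_case(sbp)), pvariance(complete_case(sbp)))
    print(len(mean_impute(sbp)), pvariance(mean_impute(sbp)))
```

Complete-case analysis discards records, while mean imputation keeps the sample size but shrinks the spread of the variable, which is one way single imputation can distort associations.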

20.
A popular way to control for confounding in observational studies is to identify clusters of individuals (e.g., twin pairs), such that a large set of potential confounders are constant (shared) within each cluster. By studying the exposure–outcome association within clusters, we are in effect controlling for the whole set of shared confounders. An increasingly popular analysis tool is the between–within (BW) model, which decomposes the exposure–outcome association into a ‘within‐cluster effect’ and a ‘between‐cluster effect’. BW models are relatively common for nonsurvival outcomes and have been studied in the theoretical literature. Although it is straightforward to use BW models for survival outcomes, this has rarely been carried out in practice, and such models have not been studied in the theoretical literature. In this paper, we propose a gamma BW model for survival outcomes. We compare the properties of this model with the more standard stratified Cox regression model and use the proposed model to analyze data from a twin study of obesity and mortality. We find the following: (i) the gamma BW model often produces a more powerful test of the ‘within‐cluster effect’ than stratified Cox regression; and (ii) the gamma BW model is robust against model misspecification, although there are situations where it could give biased estimates. Copyright © 2013 John Wiley & Sons, Ltd.
