Similar literature
 20 similar articles found
1.
Generalized estimating equations (GEEs) are commonly used for the marginal analysis of longitudinal data. In order to obtain consistent regression parameter estimates, these estimating equations must be unbiased. However, in the presence of certain types of time‐dependent covariates, these equations can be biased unless they incorporate the independence working correlation structure. Moreover, in this case, regression parameter estimation can be very inefficient because not all valid moment conditions are incorporated within the corresponding estimating equations. Therefore, approaches based on the generalized method of moments or quadratic inference functions have been proposed in order to utilize all valid moment conditions. However, we have found in previous studies, as well as the current study, that such methods will not always provide valid inference and can also be improved upon in terms of finite‐sample regression parameter estimation. Therefore, we propose both a modified GEE approach and a method selection strategy in order to ensure valid inference with the goal of improving regression parameter estimation. In a simulation study and application example, we compare existing and proposed methods and demonstrate that our modified GEE approach performs well, and the correlation information criterion has good accuracy with respect to selecting the best approach in terms of regression parameter estimation. Copyright © 2017 John Wiley & Sons, Ltd.

2.
M Hu, J M Lachin. Statistics in Medicine, 2001, 20(22): 3411-3428
A model fit by generalized estimating equations (GEE) has been used extensively for the analysis of longitudinal data in medical studies. To some extent, GEE tries to minimize a quadratic form of the residuals, and therefore is not robust in the sense that it, like least squares estimates, is sensitive to heavy-tailed distributions, contaminated distributions and extreme values. This paper describes the family of truncated robust estimating equations and its properties for the analysis of quantitative longitudinal data. Like GEE, the robust estimating equations aim to assess the covariate effects in the generalized linear model in the complete population of observations, but in a manner that is more robust to the influence of aberrant observations. A simulation study has been conducted to compare the finite-sample performance of GEE and the robust estimating equations under a variety of error distributions and data structures. It shows that the parameter estimates based on GEE and the robust estimating equations are approximately unbiased and the type I errors of Wald tests do not tend to be inflated. GEE is slightly more efficient with pure normal data, but the efficiency of GEE declines much more quickly than that of the robust estimating equations when the data become contaminated or have heavy tails, which makes the robust estimating equations advantageous with non-normal data. Both GEE and the robust estimating equations are applied to a longitudinal analysis of renal function in the Diabetes Control and Complications Trial (DCCT). For this application, GEE seems to be sensitive to the working correlation specification in that different working correlation structures may lead to different conclusions about the effect of intensive diabetes treatment. On the other hand, the robust estimating equations consistently conclude that the treatment effect is highly significant no matter which working correlation structure is used. The DCCT Research Group also demonstrated a significant effect using a mixed-effects longitudinal model.

3.
Two analytic methods were used in the Problem 2 data set. First, generalized estimating equations (GEE) modelling was developed to adjust for familial correlation in regressions evaluating candidate genes and an environmental factor. Second, the affected-pedigree-member (APM) method was used to identify chromosomal regions of interest and linkage of candidate genes with disease affection status. The GEE method identified C5 (MG1) as important for the quantitative trait Q1 and the corresponding affection status DIS, but the APM method was only suggestive. The GEE method identified C2 (MG2) as important for Q2 but only marginally important for Q1 and not important for DIS. © 1995 Wiley-Liss, Inc.

4.
Generalized estimating equations (GEE) are commonly used for the analysis of correlated data. However, use of quadratic inference functions (QIFs) is becoming popular because it increases efficiency relative to GEE when the working covariance structure is misspecified. Although shown to be advantageous in the literature, the impacts of covariates and imbalanced cluster sizes on the estimation performance of the QIF method in finite samples have not been studied. This cluster size variation causes QIF's estimating equations and GEE to be in separate classes when an exchangeable correlation structure is implemented, causing QIF and GEE to be incomparable in terms of efficiency. When utilizing this structure and the number of clusters is not large, we discuss how covariates and cluster size imbalance can cause QIF, rather than GEE, to produce estimates with the larger variability. This occurrence is mainly due to the empirical nature of weighting QIF employs, rather than differences in estimating equations classes. We demonstrate QIF's lost estimation precision through simulation studies covering a variety of general cluster randomized trial scenarios and compare QIF and GEE in the analysis of data from a cluster randomized trial. Copyright © 2012 John Wiley & Sons, Ltd.

5.
In recent years health services researchers have conducted 'volume-outcome' studies to evaluate whether providers (hospitals or surgeons) who treat many patients for a specialized condition have better outcomes than those that treat few patients. These studies and the inherent clustering of events by provider present an unusual statistical problem. The volume-outcome setting is unique in that 'volume' reflects both the primary factor under study and also the cluster size. Consequently, the assumptions inherent in the use of available methods that correct for clustering might be violated in this setting. To address this issue, we investigate via simulation the properties of three estimation procedures for the analysis of cluster correlated data, specifically in the context of volume-outcome studies. We examine and compare the validity and efficiency of widely-available statistical techniques that have been used in the context of volume-outcome studies: generalized estimating equations (GEE) using both the independence and exchangeable correlation structures; random effects models; and the weighted GEE approach proposed by Williamson et al. (Biometrics 2003; 59:36-42) to account for informative clustering. Using data generated either from an underlying true random effects model or a cluster correlated model we show that both the random effects and the GEE with an exchangeable correlation structure have generally good properties, with relatively low bias for estimating the volume parameter and its variance. By contrast, the cluster weighted GEE method is inefficient.

6.
Elevated plasma levels of apolipoproteins A1 (apoA1) and B (apoB) are important protective factors and risk factors, respectively, for atherosclerosis and coronary heart disease. It is well known that both apoA1 and apoB reveal strong familial aggregation. Our goal was to investigate whether exogenous variables influence these associations. We used marginal regression models for the mean and association structure (generalized estimating equations 2; GEE2) to analyse data from 1435 family members within 469 families of different sizes included in the Donolo-Tel Aviv Three-Generation Offspring Study. The usual robust variance matrix was approximated by extensions of jack-knife estimators of variance to GEE2 models. Estimation of standard errors in models with quite complex correlation structures was possible using this approach. All analyses were easily carried out using a menu-driven stand-alone software tool for marginal regression modelling. We demonstrate that a variety of hypotheses can be tested using Wald statistics by modelling regression matrices for the association structure. We show that correlation for apoB between parent-offspring pairs increased with decreasing age difference and that pairs with individuals of the same gender had more similar apoA1 levels than individuals of different gender. Associations between different relative pairs did not all agree with those expected from differences in kinship coefficients. The analysis using GEE2 models revealed structures that would not have been detected by other models and should therefore be used in addition to traditional approaches of analysing family data. GEE2 should be considered a standard method for the investigation of familial aggregation.

7.
The method of generalized estimating equations (GEE) models the association between the repeated observations on a subject with a patterned correlation matrix. Correct specification of the underlying structure is a potentially beneficial goal, in terms of improving efficiency and enhancing scientific understanding. We consider two sets of criteria that have previously been suggested, respectively, for selecting an appropriate working correlation structure, and for ruling out a particular structure(s), in the GEE analysis of longitudinal studies with binary outcomes. The first selection criterion chooses the structure for which the model‐based and the sandwich‐based estimator of the covariance matrix of the regression parameter estimator are closest, while the second selection criterion chooses the structure that minimizes the weighted error sum of squares. The rule out criterion deselects structures for which the estimated correlation parameter violates standard constraints for binary data that depend on the marginal means. In addition, we remove structures from consideration if their estimated parameter values yield an estimated correlation structure that is not positive definite. We investigate the performance of the two sets of criteria using both simulated and real data, in the context of a longitudinal trial that compares two treatments for major depressive episode. Practical recommendations are also given on using these criteria to aid in the efficient selection of a working correlation structure in GEE analysis of longitudinal binary data. Copyright © 2009 John Wiley & Sons, Ltd.

8.
The concordance correlation coefficient (CCC) is a commonly accepted measure of agreement between two observers for continuous responses. This paper proposes a generalized estimating equations (GEE) approach allowing dependency between repeated measurements over time to assess intra‐agreement for each observer and inter‐ and total agreement among multiple observers simultaneously. Furthermore, the indices of intra‐, inter‐, and total agreement through variance components (VC) from an extended three‐way linear mixed model (LMM) are also developed with consideration of the correlation structure of longitudinal repeated measurements. Simulation studies are conducted to compare the performance of the GEE and VC approaches for repeated measurements from longitudinal data. An application of optometric conformity study is used for illustration. In conclusion, the GEE approach allowing flexibility in model assumptions and correlation structures of repeated measurements gives satisfactory results with small mean square errors and nominal 95% coverage rates for large data sets, and when the assumption of the relationship between variances and covariances for the extended three‐way LMM holds, the VC approach performs outstandingly well for all sample sizes. Copyright © 2017 John Wiley & Sons, Ltd.

9.
Case-control studies provide an important epidemiological tool to evaluate candidate genes. There are many different study designs available. We focus on a more recently proposed design, which we call a multiplex case-control (MCC) design. This design compares allele frequencies between related cases, each of whom is sampled from multiplex families, and unrelated controls. Since within-family genotype correlations will exist, statistical methods will need to take this into account. Moreover, there is a need to develop methods to simultaneously control for potential confounders in the analysis. Generalized estimating equations (GEE) are one approach to analyze this type of data; however, this approach can have singularity problems when estimating the correlation matrix. To allow for modeling of other covariates, we extend our previously developed method to a more general model-based approach. Our proposed methods use the score statistic, derived from a composite likelihood. We propose three different approaches to estimate the variance of this statistic. Under random ascertainment of pedigrees, score tests have correct type I error rates; however, pedigrees are not randomly ascertained. Thus, through simulations, we test the validity and power of the score tests under different ascertainment schemes, and an illustration of our methods, applied to data from a prostate cancer study, is presented. We find that our robust score statistic has estimated type I error rates within the expected range for all situations we considered whereas the other two statistics have inflated type I error rates under nonrandom ascertainment schemes. We also find GEE to fail at least 5% of the time for each simulation configuration; at times, the failure rate reaches above 80%. In summary, our robust method may be the only current regression analysis method available for MCC data.

10.
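Estimating familial correlation of quantitative traits using generalized estimating equations   Cited by: 2 (self-citations: 1, citations by others: 1)
Objective: To study methods for measuring the familial correlation of quantitative traits and to analyze height pedigree data. Methods: Generalized estimating equations 2 (GEE2) were used to estimate marginal regression models for the mean structure and the association structure of a quantitative trait. All estimation can be carried out in the software MAREG. A worked example with height pedigree data illustrates the method. Results: GEE2 simultaneously accounts for the effects of covariates on the trait and for the within-family correlation of the trait, yielding robust estimates of the regression and correlation coefficients. In the analysis of the height pedigree data, after adjustment for sex, main place of residence before age 18, and birth cohort, the parent-offspring correlation (r=0.459) and the sibling correlation (r=0.671) were higher than the spouse correlation (r=0.184), and the difference was statistically significant. Within the same type of relative pair, same-sex correlations (e.g., father-son r=0.603, mother-daughter r=0.456, brother-brother r=0.947, sister-sister r=0.681) were greater than opposite-sex correlations (e.g., father-daughter r=0.431, mother-son r=0.364, brother-sister r=0.530). Conclusion: GEE2 can flexibly estimate a variety of familial correlation coefficients and the effects of covariates on the trait mean, and its parameter estimates are robust; it can therefore serve as one of the standard methods for evaluating familial aggregation of quantitative traits.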

11.
As medical applications for cluster randomization designs become more common, investigators look for guidance on optimal methods for estimating the effect of group-based interventions over time. This study examines two distinct cluster randomization designs: (1) the repeated cross-sectional design in which centres are followed over time but patients change, and (2) the longitudinal design in which individual patients are followed over time within treatment clusters. Simulations of each study design stipulated a multiplicative treatment effect (on the log odds scale), between 5 and 15 clusters in each of two treatment arms, and followed over two time periods. Estimation options included linear mixed effects models using restricted maximum likelihood (REML), generalized estimating equations (GEE), mixed effects logistic regression using both penalized quasi likelihood (PQL) and numerical integration, and Bayesian Monte Carlo analysis. For the repeated cross-sectional designs, most methods performed well in terms of bias and coverage when clusters were numerous (30) and variability across clusters of baseline risk and treatment effect was modest. With few clusters (two groups of five) and higher variability, only the Bayesian methods maintained coverage. In the longitudinal designs, the common methods of REML, GEE, or PQL performed poorly when compared to numerical integration, while Bayesian methods demonstrated less bias and better coverage for estimates of both log odds ratios and risk differences. The performance of common statistical tools for the analysis of cluster randomization designs depends heavily on the precise design, the number of clusters, and the variability of baseline outcomes and treatment effects across centres.

12.
By modeling the effects of predictor variables as a multiplicative function of regression parameters being invariant over categories, and category-specific scalar effects, the ordered stereotype logit model is a flexible regression model for ordinal response variables. In this article, we propose a generalized estimating equations (GEE) approach to estimate the ordered stereotype logit model for panel data based on working covariance matrices, which are not required to be correctly specified. A simulation study compares the performance of GEE estimators based on various working correlation matrices and working covariance matrices using local odds ratios. Estimation of the model is illustrated using a real-world dataset. The results from the simulation study suggest that GEE estimation of this model is feasible in medium-sized and large samples and that estimators based on local odds ratios as realized in this study tend to be less efficient compared with estimators based on a working correlation matrix. For low true correlations, the efficiency gains seem to be rather small and if the working covariance structure is too flexible, the corresponding estimator may even be less efficient compared with the GEE estimator assuming independence. Like for GEE estimators more generally, if the true correlations over time are high, then a working covariance structure which is close to the true structure can lead to considerable efficiency gains compared with assuming independence.

13.
The analysis of repeated measure or clustered data is often complicated by the presence of correlation. Further complications arise for discrete responses, where the marginal probability‐dependent Fréchet bounds impose feasibility limits on the correlation that are often more restrictive than the positive definite range. Some popular statistical methods, such as generalized estimating equations (GEE), ignore these bounds, and as such can generate erroneous estimates and lead to incorrect inferential results. In this paper, we discuss two alternative strategies: (i) using QIC to select a data‐driven correlation value within the Fréchet bounds, and (ii) the use of likelihood‐based latent variable modeling, such as multivariate probit, to get around the problem altogether. We provide two examples of the repercussions of incorrectly using existing GEE software in the presence of correlated binary responses. Copyright © 2010 John Wiley & Sons, Ltd.
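
The feasibility limits mentioned in this abstract can be illustrated for a single pair of binary outcomes. The sketch below is not taken from the paper; it applies the standard Fréchet bounds for two Bernoulli variables with marginal means `p1` and `p2`, and the function name `binary_correlation_bounds` is a hypothetical label chosen for illustration.

```python
import math


def binary_correlation_bounds(p1, p2):
    """Return (lower, upper) limits on corr(Y1, Y2) for Bernoulli margins p1 and p2.

    The limits follow from max(0, p1 + p2 - 1) <= P(Y1=1, Y2=1) <= min(p1, p2).
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("marginal means must lie strictly between 0 and 1")
    q1, q2 = 1 - p1, 1 - p2
    odds_product = (p1 * p2) / (q1 * q2)
    odds_ratio = (p1 * q2) / (q1 * p2)
    lower = max(-math.sqrt(odds_product), -math.sqrt(1 / odds_product))
    upper = min(math.sqrt(odds_ratio), math.sqrt(1 / odds_ratio))
    return lower, upper


# With unequal margins the feasible range is much narrower than (-1, 1).
print(binary_correlation_bounds(0.1, 0.5))
```

A working correlation estimate that falls outside this range for some pair of fitted means cannot correspond to any valid joint distribution of the two binary outcomes.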

14.
Jung SH, Ahn CW. Statistics in Medicine, 2005, 24(17): 2583-2596
Controlled clinical trials often randomize subjects to two treatment groups and repeatedly evaluate them at baseline and intervals across a treatment period of fixed duration. A popular primary objective in these trials is to compare the change rates in the repeated measurements between treatment groups. Repeated measurements usually involve missing data and a serial correlation within each subject. The generalized estimating equation (GEE) method has been widely used to fit the time trend in repeated measurements because of its robustness to random missing and misspecification of the true correlation structure. In this paper, we propose a closed form sample size formula for comparing the change rates of binary repeated measurements using GEE for a two-group comparison. The sample size formula is derived incorporating missing patterns, such as independent missing and monotone missing, and correlation structures, such as AR(1) model. We also propose an algorithm to generate correlated binary data with arbitrary marginal means and a Markov dependency and use it in simulation studies.
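
The abstract does not spell out the data-generating algorithm, so the following is only a minimal sketch of one common way to produce binary sequences with specified marginal means and first-order Markov dependence. It assumes a constant correlation `rho` between adjacent outcomes; the function names `transition_probs` and `simulate_subject` are hypothetical and are not claimed to match the authors' procedure.

```python
import math
import random


def transition_probs(p_prev, p_curr, rho):
    """Return (P(Y_t=1 | Y_{t-1}=0), P(Y_t=1 | Y_{t-1}=1)).

    The pair is chosen so that E[Y_t] = p_curr and corr(Y_{t-1}, Y_t) = rho.
    """
    q_prev, q_curr = 1 - p_prev, 1 - p_curr
    scale = rho * math.sqrt((p_curr * q_curr) / (p_prev * q_prev))
    prob_after_0 = p_curr - scale * p_prev
    prob_after_1 = p_curr + scale * q_prev
    if not (0 <= prob_after_0 <= 1 and 0 <= prob_after_1 <= 1):
        raise ValueError("rho is not feasible for these marginal means")
    return prob_after_0, prob_after_1


def simulate_subject(means, rho, rng):
    """Draw one binary sequence whose t-th marginal mean is means[t]."""
    y = [1 if rng.random() < means[0] else 0]
    for t in range(1, len(means)):
        prob_after_0, prob_after_1 = transition_probs(means[t - 1], means[t], rho)
        prob = prob_after_1 if y[-1] == 1 else prob_after_0
        y.append(1 if rng.random() < prob else 0)
    return y


rng = random.Random(2005)  # fixed seed so the sketch is reproducible
print(simulate_subject([0.3, 0.4, 0.5], rho=0.5, rng=rng))
```

The `ValueError` branch reflects the fact that not every combination of marginal means and `rho` is attainable for binary data.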

15.
Generalized estimating equations (GEEs) are routinely used for the marginal analysis of correlated data. The efficiency of GEE depends on how closely the working covariance structure resembles the true structure, and therefore accurate modeling of the working correlation of the data is important. A popular approach is the use of an unstructured working correlation matrix, as it is not as restrictive as simpler structures such as exchangeable and AR‐1 and thus can theoretically improve efficiency. However, because of the potential for having to estimate a large number of correlation parameters, variances of regression parameter estimates can be larger than theoretically expected when utilizing the unstructured working correlation matrix. Therefore, standard error estimates can be negatively biased. To account for this additional finite‐sample variability, we derive a bias correction that can be applied to typical estimators of the covariance matrix of parameter estimates. Via simulation and in application to a longitudinal study, we show that our proposed correction improves standard error estimation and statistical inference. Copyright © 2012 John Wiley & Sons, Ltd.

16.
As part of our ongoing studies of genetic markers of reproductive outcome in the Hutterites, we have been analyzing potential risk factors for pregnancy outcomes. In particular, we are interested in the effects of HLA sharing between parents on fetal loss rates. Pregnancy outcome data such as these have two characteristics that create statistical challenges, i.e., repeated observations per couple and between-couple heterogeneity in risk. We critically examine four approaches based on the logistic model for the analysis of this and similar data: 1) unconditional likelihood analysis with and without fixed cluster effects; 2) conditional likelihood analysis; 3) mixed-effects analysis with random cluster effects; and 4) the robust generalized estimating equation (GEE) procedure. Of these approaches, the GEE method of Liang and Zeger would be best suited for the analysis of our data when the question of interest concerns a variable that is constant over all pregnancies, such as HLA sharing. If the question concerns a couple's risk associated with a changing variable such as maternal age, the mixed-effects analysis is the more appropriate.

17.
Selecting an appropriate working correlation structure is pertinent to clustered data analysis using generalized estimating equations (GEE) because an inappropriate choice will lead to inefficient parameter estimation. We investigate the well‐known criterion of QIC for selecting a working correlation structure, and have found that the performance of the QIC is degraded by a term that is theoretically independent of the correlation structures but has to be estimated with error. This leads us to propose a correlation information criterion (CIC) that substantially improves the QIC performance. Extensive simulation studies indicate that the CIC offers a remarkable improvement in selecting the correct correlation structures. We also illustrate our findings using a data set from the Madras Longitudinal Schizophrenia Study. Copyright © 2008 John Wiley & Sons, Ltd.
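
The abstract does not give the formula, so the sketch below rests on the usual definition of the CIC as the trace term of the QIC, that is, trace(Ω_I V_R), where Ω_I is the inverse of the model-based covariance under independence and V_R is the sandwich covariance under working structure R. The matrices are made-up toy values, and `select_by_cic` is a hypothetical helper, not part of any GEE package.

```python
def trace_of_product(a, b):
    """Trace of the matrix product a @ b, for matrices given as lists of rows."""
    return sum(a[i][j] * b[j][i] for i in range(len(a)) for j in range(len(b)))


def select_by_cic(candidates):
    """Pick the working structure with the smallest CIC.

    `candidates` maps a structure name to a pair (omega_independence, sandwich),
    both already computed from a fitted GEE for that structure.
    """
    cic = {name: trace_of_product(omega, sandwich)
           for name, (omega, sandwich) in candidates.items()}
    best = min(cic, key=cic.get)
    return best, cic


# Toy inputs for two regression parameters (illustrative numbers only).
omega = [[2.0, 0.0], [0.0, 1.0]]
candidates = {
    "exchangeable": (omega, [[0.5, 0.1], [0.1, 1.0]]),
    "ar1": (omega, [[0.6, 0.0], [0.0, 1.2]]),
}
print(select_by_cic(candidates))
```

In a real analysis both matrices would come from the fitted models, and Ω_I would be re-evaluated at each structure's own parameter estimates.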

18.
Three-level cluster randomized trials (CRTs) are increasingly used in implementation science, where two-fold nested correlated data arise. For example, interventions are randomly assigned to practices, and providers within the same practice who provide care to participants are trained with the assigned intervention. Teerenstra et al. proposed a nested exchangeable correlation structure that accounts for two levels of clustering within the generalized estimating equations (GEE) approach. In this article, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in three-level CRTs. Given the nested exchangeable correlation structure, we derive the asymptotic variances of the estimator of the treatment effect for different types of outcomes. When the number of clusters is small, researchers have proposed bias-corrected sandwich estimators to improve performance in two-level CRTs. We extend the variances of two bias-corrected sandwich estimators to three-level CRTs. For simplicity, equal provider and practice sizes were assumed when calculating the number of practices; however, equal sizes are not guaranteed in practice. Relative efficiency (RE) is defined as the ratio of variance of the estimator of the treatment effect for equal to unequal provider and practice sizes. The expressions of REs are obtained from both asymptotic variance estimation and bias-corrected sandwich estimators. Their performances are evaluated for different scenarios of provider and practice size distributions through simulation studies. Finally, a percentage increase in the number of practices is proposed to compensate for the efficiency loss from unequal provider and/or practice sizes.

19.
We used simulation to compare accuracy of estimation and confidence interval coverage of several methods for analysing binary outcomes from cluster randomized trials. The following methods were used to estimate the population-averaged intervention effect on the log-odds scale: marginal logistic regression models using generalized estimating equations with information sandwich estimates of standard error (GEE); unweighted cluster-level mean difference (CL/U); weighted cluster-level mean difference (CL/W) and cluster-level random effects linear regression (CL/RE). Methods were compared across trials simulated with different numbers of clusters per trial arm, numbers of subjects per cluster, intraclass correlation coefficients (rho), and intervention versus control arm proportions. Two thousand data sets were generated for each combination of design parameter values. The results showed that the GEE method has generally acceptable properties, including close to nominal levels of confidence interval coverage, when a simple adjustment is made for data with relatively few clusters. CL/U and CL/W have good properties for trials where the number of subjects per cluster is sufficiently large and rho is sufficiently small. CL/RE also has good properties in this situation provided a t-distribution multiplier is used for confidence interval calculation in studies with small numbers of clusters. For studies where the number of subjects per cluster is small and rho is large all cluster-level methods may perform poorly for studies with between 10 and 50 clusters per trial arm.

20.
Generalized estimating equations (GEE) are often used for the marginal analysis of longitudinal data. Although much work has been performed to improve the validity of GEE for the analysis of data arising from small‐sample studies, little attention has been given to power in such settings. Therefore, we propose a valid GEE approach to improve power in small‐sample longitudinal study settings in which the temporal spacing of outcomes is the same for each subject. Specifically, we use a modified empirical sandwich covariance matrix estimator within correlation structure selection criteria and test statistics. Use of this estimator can improve the accuracy of selection criteria and increase the degrees of freedom to be used for inference. The resulting impacts on power are demonstrated via a simulation study and application example. Copyright © 2016 John Wiley & Sons, Ltd.
