1.

Covariance estimators for generalized estimating equations (GEE) in longitudinal analysis with small samples

Ming Wang, Lan Kong, Zheng Li, Lijun Zhang. Statistics in Medicine, 2016, 35(10):1706-1721

Generalized estimating equations (GEE) is a general statistical method to fit marginal models for longitudinal data in biomedical studies. The variance–covariance matrix of the regression parameter coefficients is usually estimated by a robust “sandwich” variance estimator, which does not perform satisfactorily when the sample size is small. To reduce the downward bias and improve the efficiency, several modified variance estimators have been proposed for bias‐correction or efficiency improvement. In this paper, we provide a comprehensive review on recent developments of modified variance estimators and compare their small‐sample performance theoretically and numerically through simulation and real data examples. In particular, Wald tests and t‐tests based on different variance estimators are used for hypothesis testing, and the guideline on appropriate sample sizes for each estimator is provided for preserving type I error in general cases based on numerical results. Moreover, we develop a user‐friendly R package “geesmv” incorporating all of these variance estimators for public usage in practice. Copyright © 2015 John Wiley & Sons, Ltd.
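
To make the "sandwich" idea in the abstract above concrete, here is a minimal sketch for the simplest possible case: an intercept-only marginal model (estimating one mean) with an independence working correlation. The function name and the toy data are assumptions for illustration only; this is not the geesmv interface, and none of the bias-corrected estimators reviewed in the paper are implemented here.

```python
def mean_with_sandwich_variance(clusters):
    """Intercept-only GEE with independence working correlation.

    clusters: list of lists, one inner list of outcomes per subject.
    Returns (estimate, model_based_variance, sandwich_variance).
    """
    n_total = sum(len(c) for c in clusters)
    beta = sum(sum(c) for c in clusters) / n_total  # pooled mean

    # "Bread": for an intercept-only model this is the number of observations.
    bread = n_total
    # "Meat": squared sum of residuals within each cluster, added over clusters.
    meat = sum(sum(y - beta for y in c) ** 2 for c in clusters)
    sandwich = meat / bread ** 2

    # Model-based (naive) variance treats all observations as independent.
    resid_ss = sum((y - beta) ** 2 for c in clusters for y in c)
    model_based = (resid_ss / (n_total - 1)) / n_total
    return beta, model_based, sandwich


# Three subjects with three repeated measurements each (made-up numbers).
beta, naive, robust = mean_with_sandwich_variance([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(beta, naive, robust)
```

Because the meat is a sum over clusters, it rests on very few terms when the number of subjects is small, which is one way to see why the estimator becomes unstable in that setting.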

2.

Simple generalized estimating equations (GEEs) and weighted generalized estimating equations (WGEEs) in longitudinal studies with dropouts: guidelines and implementation in R

Alejandro Salazar, Begoña Ojeda, María Dueñas, Fernando Fernández, Inmaculada Failde. Statistics in Medicine, 2016, 35(19):3424-3448

Missing data are a common problem in clinical and epidemiological research, especially in longitudinal studies. Despite many methodological advances in recent decades, many papers on clinical trials and epidemiological studies do not report using principled statistical methods to accommodate missing data or use ineffective or inappropriate techniques. Two refined techniques are presented here: generalized estimating equations (GEEs) and weighted generalized estimating equations (WGEEs). These techniques are an extension of generalized linear models to longitudinal or clustered data, where observations are no longer independent. They can appropriately handle missing data when the missingness is completely at random (GEE and WGEE) or at random (WGEE) and do not require the outcome to be normally distributed. Our aim is to describe and illustrate with a real example, in a simple and accessible way to researchers, these techniques for handling missing data in the context of longitudinal studies subject to dropout and show how to implement them in R. We apply them to assess the evolution of health‐related quality of life in coronary patients in a data set subject to dropout. Copyright © 2016 John Wiley & Sons, Ltd.

3.

Small-sample performance of the robust score test and its modifications in generalized estimating equations

Guo X, Pan W, Connett JE, Hannan PJ, French SA. Statistics in Medicine, 2005, 24(22):3479-3495

The sandwich variance estimator of generalized estimating equations (GEE) may not perform well when the number of independent clusters is small. This could jeopardize the validity of the robust Wald test by causing inflated type I error and lower coverage probability of the corresponding confidence interval than the nominal level. Here, we investigate the small-sample performance of the robust score test for correlated data and propose several modifications to improve the performance. In a simulation study, we compare the robust score test to the robust Wald test for correlated Bernoulli and Poisson data, respectively. It is confirmed that the robust Wald test is too liberal whereas the robust score test is too conservative for small samples. To explain this puzzling operating difference between the two tests, we consider their applications to two special cases, one-sample and two-sample comparisons, thus motivating some modifications to the robust score test. A modification based on a simple adjustment to the usual robust score statistic by a factor of J/(J - 1) (where J is the number of clusters) reduces the conservativeness of the generalized score test. Simulation studies mimicking group-randomized clinical trials with binary and count responses indicated that it may improve the small-sample performance over that of the generalized score and Wald tests with test size closer to the nominal level. Finally, we demonstrate the utility of our proposal by applying it to a group-randomized clinical trial, trying alternative cafeteria options in schools (TACOS).
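
The J/(J - 1) adjustment mentioned above can be illustrated with a small sketch for a single tested parameter. The scalar form used here, with one score contribution per cluster evaluated under the null hypothesis, is an assumption made for illustration, and the function name is hypothetical; the paper's general multi-parameter statistic is not reproduced.

```python
def robust_score_statistics(cluster_scores):
    """Return (usual, adjusted) robust score statistics for one parameter.

    cluster_scores: one scalar score contribution U_i per cluster,
    evaluated under the null hypothesis (assumed to be precomputed).
    """
    j = len(cluster_scores)
    if j < 2:
        raise ValueError("at least two clusters are needed")
    total = sum(cluster_scores)
    robust_var = sum(u * u for u in cluster_scores)  # empirical variance of the score
    if robust_var == 0:
        raise ValueError("all score contributions are zero")
    usual = total ** 2 / robust_var
    adjusted = usual * j / (j - 1)  # the J/(J - 1) factor from the abstract
    return usual, adjusted


usual, adjusted = robust_score_statistics([1.0, 2.0, 3.0])
print(usual, adjusted)
```

In this scalar form the usual statistic can never exceed J (by the Cauchy-Schwarz inequality), which gives some intuition for why the unadjusted test tends to be conservative when J is small.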

4.

A robust and unified framework for estimating heritability in twin studies using generalized estimating equations

Jaron Arbet, Matt McGue, Saonli Basu. Statistics in Medicine, 2020, 39(27):3897-3913

The ‘heritability’ of a phenotype measures the proportion of trait variance due to genetic factors in a population. In the past 50 years, studies with monozygotic and dizygotic twins have estimated heritability for 17,804 traits;¹ thus twin studies are popular for estimating heritability. Researchers are often interested in estimating heritability for non-normally distributed outcomes such as binary, counts, skewed or heavy-tailed continuous traits. In these settings, the traditional normal ACE model (NACE) and Falconer's method can produce poor coverage of the true heritability. Therefore, we propose a robust generalized estimating equations (GEE2) framework for estimating the heritability of non-normally distributed outcomes. The traditional NACE and Falconer's method are derived within this unified GEE2 framework, which additionally provides robust standard errors. Although the traditional Falconer's method cannot adjust for covariates, the corresponding ‘GEE2-Falconer’ can incorporate mean and variance-level covariate effects (e.g. let heritability vary by sex or age). Given a non-normally distributed outcome, the GEE2 models are shown to attain better coverage of the true heritability compared to traditional methods. Finally, a scenario is demonstrated where NACE produces biased estimates of heritability while Falconer remains unbiased. Therefore, we recommend GEE2-Falconer for estimating the heritability of non-normally distributed outcomes in twin studies.
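
For readers unfamiliar with the traditional Falconer estimate referred to above, it is twice the difference between the monozygotic and dizygotic twin correlations. The sketch below computes that classical quantity only; the function names and toy data are assumptions, a plain Pearson correlation is used for simplicity, and nothing here implements the GEE2 framework or its robust standard errors.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def falconer_heritability(mz_pairs, dz_pairs):
    """Classical Falconer estimate: h2 = 2 * (r_MZ - r_DZ).

    Each argument is a list of (twin1, twin2) trait values.
    """
    r_mz = pearson([a for a, _ in mz_pairs], [b for _, b in mz_pairs])
    r_dz = pearson([a for a, _ in dz_pairs], [b for _, b in dz_pairs])
    return 2.0 * (r_mz - r_dz)


# Made-up pairs: MZ twins perfectly correlated, DZ twins with r = 0.5.
h2 = falconer_heritability([(1, 1), (2, 2), (3, 3)], [(1, 2), (2, 1), (3, 3)])
print(h2)
```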

5.

Concordance correlation coefficients estimated by generalized estimating equations and variance components for longitudinal repeated measurements

Miao‐Yu Tsai. Statistics in Medicine, 2017, 36(8):1319-1333

The concordance correlation coefficient (CCC) is a commonly accepted measure of agreement between two observers for continuous responses. This paper proposes a generalized estimating equations (GEE) approach allowing dependency between repeated measurements over time to assess intra‐agreement for each observer and inter‐ and total agreement among multiple observers simultaneously. Furthermore, the indices of intra‐, inter‐, and total agreement through variance components (VC) from an extended three‐way linear mixed model (LMM) are also developed with consideration of the correlation structure of longitudinal repeated measurements. Simulation studies are conducted to compare the performance of the GEE and VC approaches for repeated measurements from longitudinal data. An application to an optometric conformity study is used for illustration. In conclusion, the GEE approach allowing flexibility in model assumptions and correlation structures of repeated measurements gives satisfactory results with small mean square errors and nominal 95% coverage rates for large data sets, and when the assumption of the relationship between variances and covariances for the extended three‐way LMM holds, the VC approach performs outstandingly well for all sample sizes. Copyright © 2017 John Wiley & Sons, Ltd.
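
As background to the abstract above, the basic two-observer CCC (without repeated measurements) can be computed directly from sample moments. This sketch shows only that elementary definition; the function name is hypothetical, and the GEE and variance-components estimators proposed in the paper for longitudinal data are not implemented.

```python
def concordance_correlation(x, y):
    """Two-observer concordance correlation coefficient.

    CCC = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2),
    using moments with divisor n.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


# Perfect agreement gives 1; a constant shift between observers lowers the CCC
# even though the Pearson correlation is still 1.
print(concordance_correlation([1, 2, 3], [1, 2, 3]))
print(concordance_correlation([1, 2, 3], [2, 3, 4]))
```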

6.

Joint modeling of multiple ordinal adherence outcomes via generalized estimating equations with flexible correlation structure

Zhen Jiang, Yimeng Liu, Abdus S. Wahed, Geert Molenberghs. Statistics in Medicine, 2018, 37(6):983-995

Adherence to medication is critical in achieving effectiveness of many treatments. Factors that influence adherence behavior have been the subject of many clinical studies. Analyzing adherence is complicated because it is often measured on multiple drugs over a period, resulting in a multivariate longitudinal outcome. This paper is motivated by the Viral Resistance to Antiviral Therapy of Chronic Hepatitis C study, where adherence is measured on two drugs as a bivariate ordinal longitudinal outcome. To analyze such an outcome, we propose a joint model assuming the multivariate ordinal outcome arose from a partitioned latent multivariate normal process. We also provide a flexible multilevel association structure covering both between and within outcome correlation. In simulation studies, we show that the joint model provides unbiased estimators for regression parameters, which are more efficient than those obtained through fitting a separate model for each outcome. The joint method also yields unbiased estimators for the correlation parameters when the correlation structure is correctly specified. Finally, we analyze the Viral Resistance to Antiviral Therapy of Chronic Hepatitis C adherence data and discuss the findings.

7.

Small-sample adjustments in using the sandwich variance estimator in generalized estimating equations

Pan W, Wall MM. Statistics in Medicine, 2002, 21(10):1429-1441

The generalized estimating equation (GEE) approach is widely used in regression analyses with correlated response data. Under mild conditions, the resulting regression coefficient estimator is consistent and asymptotically normal with its variance being consistently estimated by the so-called sandwich estimator. Statistical inference is thus accomplished by using the asymptotic Wald chi-squared test. However, it has been noted in the literature that for small samples the sandwich estimator may not perform well and may lead to much inflated type I errors for the Wald chi-squared test. Here we propose using an approximate t- or F-test that takes account of the variability of the sandwich estimator. The level of type I error of the proposed t- or F-test is guaranteed to be no larger than that of the Wald chi-squared test. The satisfactory performance of the proposed new tests is confirmed in a simulation study. Our proposal also has some advantages when compared with other new approaches based on direct modifications of the sandwich estimator, including the one that corrects the downward bias of the sandwich estimator. In addition to hypothesis testing, our result has a clear implication on constructing Wald-type confidence intervals or regions.

8.

Maintaining the validity of inference in small-sample stepped wedge cluster randomized trials with binary outcomes when using generalized estimating equations

Whitney P. Ford, Philip M. Westgate. Statistics in Medicine, 2020, 39(21):2779-2792

Stepped wedge cluster trials are an increasingly popular alternative to traditional parallel cluster randomized trials. Such trials often utilize a small number of clusters and numerous time intervals, and these components must be considered when choosing an analysis method. A generalized linear mixed model containing a random intercept and fixed time and intervention covariates is the most common analysis approach. However, the sole use of a random intercept applies a constant intraclass correlation coefficient structure, which is an assumption that is likely to be violated given that stepped wedge trials (SWTs) have multiple time intervals. Alternatively, generalized estimating equations (GEE) are robust to the misspecification of the working correlation structure, although it has been shown that small-sample adjustments to standard error estimates and the use of appropriate degrees of freedom are required to maintain the validity of inference when the number of clusters is small. In this article, we show, using an extensive simulation study based on a motivating example and a more general design, that the use of GEE can maintain the validity of inference in small-sample SWTs with binary outcomes. Furthermore, we show which combinations of bias corrections to standard error estimates and degrees of freedom work best in terms of attaining nominal type I error rates.

9.

Analysis of data with multiple sources of correlation in the framework of generalized estimating equations

Shults J, Whitt MC, Kumanyika S. Statistics in Medicine, 2004, 23(20):3209-3226

This paper is motivated by a study of physical activity participation habits in African American women with three potential sources of correlation among study outcomes, according to method of assessment, timing of measurement, and intensity of physical activity. To adjust for the multiple sources of correlation in this study, we implement an approach based on generalized estimating equations that models association via a patterned correlation matrix. We present a general algorithm that is relatively straightforward to program, an analysis of our physical activity study, and some asymptotic relative efficiency comparisons between correctly specifying the correlation structure vs ignoring two sources of correlation in the analysis of data from this study. The efficiency comparisons demonstrate that correctly modeling the correlation structure can prevent substantial losses in efficiency in estimation of the regression parameter.

10.

Correlation analysis of twin data with repeated measures based on generalized estimating equations

John S. Grove, Lue Ping Zhao, Filemon Quiaoit. Genetic Epidemiology, 1993, 10(6):539-544

Repeated measures allow additional tests of common assumptions in twin correlation analysis. Analysis of log serum triglyceride level in NHLBI male twins using generalized estimating equations disclosed that the mean and variance shifted across exams, presumably because of changes in laboratory practice. © 1993 Wiley-Liss, Inc.

11.

Doubly robust generalized estimating equations for longitudinal data

Shaun Seaman, Andrew Copas. Statistics in Medicine, 2009, 28(6):937-955

A popular method for analysing repeated‐measures data is generalized estimating equations (GEE). When response data are missing at random (MAR), two modifications of GEE use inverse‐probability weighting and imputation. The weighted GEE (WGEE) method involves weighting observations by their inverse probability of being observed, according to some assumed missingness model. Imputation methods involve filling in missing observations with values predicted by an assumed imputation model. WGEE are consistent when the data are MAR and the dropout model is correctly specified. Imputation methods are consistent when the data are MAR and the imputation model is correctly specified. Recently, doubly robust (DR) methods have been developed. These involve both a model for probability of missingness and an imputation model for the expectation of each missing observation, and are consistent when either is correct. We describe DR GEE, and illustrate their use on simulated data. We also analyse the INITIO randomized clinical trial of HIV therapy allowing for MAR dropout. Copyright © 2009 John Wiley & Sons, Ltd.
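
The inverse-probability weighting step described above can be sketched as follows, assuming monotone dropout and assuming that the conditional probabilities of remaining in the study have already been estimated from some dropout model. The function name and input layout are hypothetical, and the doubly robust estimator itself is not shown.

```python
def dropout_weights(stay_probs):
    """Inverse-probability weights for one subject under monotone dropout.

    stay_probs[t] is the estimated probability that the subject is still
    observed at visit t, given that they were observed at visit t - 1
    (use 1.0 for the baseline visit). The weight at visit t is the inverse
    of the cumulative product of these probabilities.
    """
    weights = []
    cumulative = 1.0
    for p in stay_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
        cumulative *= p
        weights.append(1.0 / cumulative)
    return weights


# A subject who becomes progressively less likely to remain gets larger
# weights at later visits, standing in for similar subjects who dropped out.
print(dropout_weights([1.0, 0.8, 0.5]))
```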

12.

A comparison of several approaches for choosing between working correlation structures in generalized estimating equation analysis of longitudinal binary data

Justine Shults, Wenguang Sun, Xin Tu, Hanjoo Kim, Jay Amsterdam, Joseph M. Hilbe, Thomas Ten‐Have. Statistics in Medicine, 2009, 28(18):2338-2355

The method of generalized estimating equations (GEE) models the association between the repeated observations on a subject with a patterned correlation matrix. Correct specification of the underlying structure is a potentially beneficial goal, in terms of improving efficiency and enhancing scientific understanding. We consider two sets of criteria that have previously been suggested, respectively, for selecting an appropriate working correlation structure, and for ruling out a particular structure(s), in the GEE analysis of longitudinal studies with binary outcomes. The first selection criterion chooses the structure for which the model‐based and the sandwich‐based estimator of the covariance matrix of the regression parameter estimator are closest, while the second selection criterion chooses the structure that minimizes the weighted error sum of squares. The rule out criterion deselects structures for which the estimated correlation parameter violates standard constraints for binary data that depend on the marginal means. In addition, we remove structures from consideration if their estimated parameter values yield an estimated correlation structure that is not positive definite. We investigate the performance of the two sets of criteria using both simulated and real data, in the context of a longitudinal trial that compares two treatments for major depressive episode. Practical recommendations are also given on using these criteria to aid in the efficient selection of a working correlation structure in GEE analysis of longitudinal binary data. Copyright © 2009 John Wiley & Sons, Ltd.
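
The "standard constraints for binary data that depend on the marginal means" can be made concrete for a single pair of binary outcomes. The sketch below computes the feasible range of the pairwise correlation given the two marginal probabilities; the function names are hypothetical, and how the paper applies such bounds across a whole working structure is not reproduced.

```python
from math import sqrt


def binary_correlation_bounds(p1, p2):
    """Feasible range of the correlation between two Bernoulli variables
    with success probabilities p1 and p2 (both strictly between 0 and 1)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    lower = max(-sqrt(p1 * p2 / (q1 * q2)), -sqrt(q1 * q2 / (p1 * p2)))
    upper = min(sqrt(p1 * q2 / (p2 * q1)), sqrt(p2 * q1 / (p1 * q2)))
    return lower, upper


def correlation_is_feasible(rho, p1, p2):
    """True if rho is attainable for the given pair of marginal means."""
    lower, upper = binary_correlation_bounds(p1, p2)
    return lower <= rho <= upper


# With unequal means the attainable range is narrower than [-1, 1].
print(binary_correlation_bounds(0.2, 0.5))
print(correlation_is_feasible(0.7, 0.2, 0.5))
```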

13.

Weighted estimating equations for longitudinal studies with death and non-monotone missing time-dependent covariates and outcomes

Shardell M, Miller RR. Statistics in Medicine, 2008, 27(7):1008-1025

We propose a marginal modeling approach to estimate the association between a time-dependent covariate and an outcome in longitudinal studies where some study participants die during follow-up and both variables have non-monotone response patterns. The proposed method is an extension of weighted estimating equations that allows the outcome and covariate to have different missing-data patterns. We present methods for both random and non-random missing-data mechanisms. A study of functional recovery in a cohort of elderly female hip-fracture patients motivates the approach.

14.

Working covariance model selection for generalized estimating equations

Carey VJ, Wang YG. Statistics in Medicine, 2011, 30(26):3117-3124

We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice.

15.

Analysis of repeated bouts of measurements in the framework of generalized estimating equations

Shults J, Mazurick CA, Richard Landis J. Statistics in Medicine, 2006, 25(23):4114-4128

We consider a longitudinal study of interstitial cystitis (IC) in women, in which the time between bouts of repeated measurements is large relative to the within-bout separation in time. Our outcome of interest is the number of nocturnal voids that we model via quasi-least squares (QLS) in the framework of generalized estimating equations (GEE). To account for potential intra-subject correlation, we directly apply a banded Toeplitz correlation structure that previously was only implemented in an ad hoc approach using GEE. We describe this structure, its appropriateness for data from the IC study, and the results of our analysis. We then demonstrate that correct specification of the underlying correlation structure versus incorrectly applying a simpler structure can prevent substantial losses in efficiency in estimation of the regression parameter. These comparisons involve the limiting values of the estimates of the correlation parameters, which are not consistent for the misspecification scenarios considered here. We therefore obtain the limiting values of the QLS estimates when the structure is incorrectly specified.
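
A generic banded Toeplitz correlation matrix, the kind of structure named above, can be built as in the sketch below: the correlation depends only on the lag between two measurements and is set to zero beyond a chosen bandwidth. The function name and parameter layout are assumptions, and the exact banding used for the bouts in the IC study may differ from this generic form.

```python
def banded_toeplitz(n, alphas):
    """Build an n x n banded Toeplitz correlation matrix.

    alphas[k - 1] is the correlation at lag k; lags larger than
    len(alphas) are assigned correlation 0.
    """
    def entry(i, j):
        lag = abs(i - j)
        if lag == 0:
            return 1.0
        if lag <= len(alphas):
            return alphas[lag - 1]
        return 0.0

    return [[entry(i, j) for j in range(n)] for i in range(n)]


# Four measurements, non-zero correlation only up to lag 2.
for row in banded_toeplitz(4, [0.5, 0.2]):
    print(row)
```

Not every choice of lag correlations gives a positive definite matrix, so estimated values still need to be checked before use.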

16.

Augmented generalized estimating equations for improving efficiency and validity of estimation in cluster randomized trials by leveraging cluster-level and individual-level covariates

Stephens AJ, Tchetgen Tchetgen EJ, De Gruttola V. Statistics in Medicine, 2012, 31(10):915-930

Recent methodological advances in covariate adjustment in randomized clinical trials have used semiparametric theory to improve efficiency of inferences by incorporating baseline covariates; these methods have focused on independent outcomes. We modify one of these approaches, augmentation of standard estimators, for use within cluster randomized trials in which treatments are assigned to groups of individuals, thereby inducing correlation. We demonstrate the potential for imbalance correction and efficiency improvement through consideration of both cluster-level covariates and individual-level covariates. To improve small-sample estimation, we consider several variance adjustments. We evaluate this approach for continuous and binary outcomes through simulation and apply it to data from a cluster randomized trial of a community behavioral intervention related to HIV prevention in Tanzania.

17.

The modeling of medical expenditure data from a longitudinal survey using the generalized method of moments (GMM) approach

Zachary Hass, Michael Levine, Laura P. Sands, Jeffrey Ting, Huiping Xu. Statistics in Medicine, 2016, 35(15):2652-2664

18.

A determinant‐based criterion for working correlation structure selection in generalized estimating equations

Ajmery Jaman, Mahbub A. H. M. Latif, Wasimul Bari, Abdus S. Wahed. Statistics in Medicine, 2016, 35(11):1819-1833

In generalized estimating equations (GEE), the correlation between the repeated observations on a subject is specified with a working correlation matrix. Correct specification of the working correlation structure ensures efficient estimators of the regression coefficients. Among the criteria used, in practice, for selecting working correlation structure, Rotnitzky‐Jewell, Quasi Information Criterion (QIC) and Correlation Information Criterion (CIC) are based on the fact that if the assumed working correlation structure is correct then the model‐based (naive) and the sandwich (robust) covariance estimators of the regression coefficient estimators should be close to each other. The sandwich covariance estimator, used in defining the Rotnitzky‐Jewell, QIC and CIC criteria, is biased downward and has a larger variability than the corresponding model‐based covariance estimator. Motivated by this fact, a new criterion is proposed in this paper based on the bias‐corrected sandwich covariance estimator for selecting an appropriate working correlation structure in GEE. A comparison of the proposed and the competing criteria is shown using simulation studies with correlated binary responses. The results revealed that the proposed criterion generally performs better than the competing criteria. An example of selecting the appropriate working correlation structure has also been shown using the data from Madras Schizophrenia Study. Copyright © 2015 John Wiley & Sons, Ltd.
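
For orientation, two of the criteria named above are commonly written as follows. The notation is an assumption of this note rather than something given in the abstract: $\hat\beta_R$ is the GEE estimate under working structure $R$, $Q$ is the quasi-likelihood evaluated under independence, $\hat\Omega_I$ is the inverse of the model-based covariance estimate under independence, and $\hat V_R$ is the sandwich covariance estimate under $R$.

```latex
\mathrm{QIC}(R) = -2\,Q(\hat\beta_R;\, I) + 2\,\operatorname{tr}\!\left(\hat\Omega_I \hat V_R\right),
\qquad
\mathrm{CIC}(R) = \operatorname{tr}\!\left(\hat\Omega_I \hat V_R\right).
```

Both depend on the sandwich estimate $\hat V_R$, which is why its small-sample bias carries over to the selection criteria. The determinant-based criterion proposed in the paper is not reproduced here, since the abstract does not give its form.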

19.

Improved hypothesis testing for coefficients in generalized estimating equations with small samples of clusters

McCaffrey DF, Bell RM. Statistics in Medicine, 2006, 25(23):4081-4098

The sandwich standard error estimator is commonly used for making inferences about parameter estimates found as solutions to generalized estimating equations (GEE) for clustered data. The sandwich tends to underestimate the variability in the parameter estimates when the number of clusters is small, and reference distributions commonly used for hypothesis testing poorly approximate the distribution of Wald test statistics. Consequently, tests have greater than nominal type I error rates. We propose tests that use bias-reduced linearization, BRL, to adjust the sandwich estimator and Satterthwaite or saddlepoint approximations for the reference distribution of resulting Wald t-tests. We conducted a large simulation study of tests using a variety of estimators (traditional sandwich, BRL, Mancl and DeRouen's BC estimator, and a modification of an estimator proposed by Kott) and approximations to reference distributions under diverse settings that varied the distribution of the explanatory variables, the values of coefficients, and the degree of intra-cluster correlation (ICC). Our new method generally worked well, providing accurate estimates of the variability of fitted coefficients and tests with near-nominal type I error rates when the ICC is small. Our method works less well when the ICC is large, but it continues to out-perform the traditional sandwich and other alternatives.
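
Two of the estimators compared above, the traditional sandwich and the Mancl and DeRouen bias-corrected (BC) estimator, can be written side by side. The notation is an assumption of this note: for cluster $i = 1, \dots, J$, $D_i$ is the derivative of the mean with respect to the coefficients, $V_i$ is the working covariance matrix, and $r_i$ is the residual vector. The BRL adjustment and the Kott-type estimator are not shown because the abstract does not specify them.

```latex
\begin{align*}
B &= \sum_{i=1}^{J} D_i^{\top} V_i^{-1} D_i,
\qquad
H_{ii} = D_i B^{-1} D_i^{\top} V_i^{-1}, \\
\hat V_{S} &= B^{-1} \Big( \sum_{i=1}^{J} D_i^{\top} V_i^{-1}\, r_i r_i^{\top}\, V_i^{-1} D_i \Big) B^{-1}, \\
\hat V_{BC} &= B^{-1} \Big( \sum_{i=1}^{J} D_i^{\top} V_i^{-1} (I - H_{ii})^{-1} r_i r_i^{\top} (I - H_{ii}^{\top})^{-1} V_i^{-1} D_i \Big) B^{-1}.
\end{align*}
```

The correction inflates each cluster's residuals through the leverage matrix $H_{ii}$, which matters most when $J$ is small and each cluster has substantial leverage.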

20.

An efficient and robust method for analyzing population pharmacokinetic data in genome‐wide pharmacogenomic studies: a generalized estimating equation approach

Kengo Nagashima, Yasunori Sato, Hisashi Noma, Chikuma Hamada. Statistics in Medicine, 2013, 32(27):4838-4858

Powerful array‐based single‐nucleotide polymorphism‐typing platforms have recently heralded a new era in which genome‐wide studies are conducted with increasing frequency. A genetic polymorphism associated with population pharmacokinetics (PK) is typically analyzed using nonlinear mixed‐effect models (NLMM). Applying NLMM to large‐scale data, such as those generated by genome‐wide studies, raises several issues related to the assumption of random effects as follows: (i) computation time: it takes a long time to compute the marginal likelihood; (ii) convergence of iterative calculation: an adaptive Gauss–Hermite quadrature is generally used to estimate NLMM; however, iterative calculations may not converge in complex models; and (iii) random‐effects misspecification leads to slightly inflated type‐I error rates. As an alternative effective approach to resolving these issues, in this article, we propose a generalized estimating equation (GEE) approach for analyzing population PK data. In general, GEE analysis does not account for interindividual variability in PK parameters; therefore, the usual GEE estimators cannot be interpreted straightforwardly, and their validities have not been justified. Here, we propose valid inference methods for using GEE even under conditions of interindividual variability and provide theoretical justifications of the proposed GEE estimators for population PK data. In numerical evaluations by simulations, the proposed GEE approach exhibited high computational speed and stability relative to the NLMM approach. Furthermore, the NLMM analysis was sensitive to the misspecification of the random‐effects distribution, and the proposed GEE inference is valid for any distributional form. We provided an illustration by using data from a genome‐wide pharmacogenomic study of an anticancer drug. Copyright © 2013 John Wiley & Sons, Ltd.