1.

Improving the correlation structure selection approach for generalized estimating equations and balanced longitudinal data

Philip M. Westgate 《Statistics in medicine》2014,33(13):2222-2237

Generalized estimating equations are commonly used to analyze correlated data. Choosing an appropriate working correlation structure for the data is important, as the efficiency of generalized estimating equations depends on how closely this structure approximates the true structure. Therefore, most studies have proposed multiple criteria to select the working correlation structure, although some of these criteria have neither been compared nor extensively studied. To ease the correlation selection process, we propose a criterion that utilizes the trace of the empirical covariance matrix. Furthermore, use of the unstructured working correlation can potentially improve estimation precision and therefore should be considered when data arise from a balanced longitudinal study. However, most previous studies have not allowed the unstructured working correlation to be selected as it estimates more nuisance correlation parameters than other structures such as AR‐1 or exchangeable. Therefore, we propose appropriate penalties for the selection criteria that can be imposed upon the unstructured working correlation. Via simulation in multiple scenarios and in application to a longitudinal study, we show that the trace of the empirical covariance matrix works very well relative to existing criteria. We further show that allowing criteria to select the unstructured working correlation when utilizing the penalties can substantially improve parameter estimation. Copyright © 2014 John Wiley & Sons, Ltd.
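
The selection idea in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes that the structure with the smallest trace of its empirical (sandwich) covariance estimate is preferred, and it omits the penalty for the unstructured correlation because the abstract does not state its form. The covariance matrices and the names `trace` and `select_by_trace` are hypothetical.

```python
def trace(matrix):
    """Sum of the diagonal entries of a square matrix given as nested lists."""
    return sum(matrix[i][i] for i in range(len(matrix)))


def select_by_trace(cov_by_structure):
    """Return the structure name whose covariance estimate has the smallest trace.

    Assumption: a smaller trace is taken to indicate more precise estimation.
    """
    return min(cov_by_structure, key=lambda name: trace(cov_by_structure[name]))


# Hypothetical sandwich covariance estimates for two regression parameters,
# one per candidate working correlation structure.
candidates = {
    "independence": [[0.040, 0.002], [0.002, 0.030]],
    "exchangeable": [[0.031, 0.001], [0.001, 0.024]],
    "AR-1": [[0.028, 0.001], [0.001, 0.022]],
}
best = select_by_trace(candidates)
print(best)
```

In practice the matrices would come from fitting the same mean model under each working structure.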

2.

Finite sample adjustments in estimating equations and covariance estimators for intracluster correlations

Preisser JS Lu B Qaqish BF 《Statistics in medicine》2008,27(27):5764-5785

Bias-corrected covariance estimators are introduced in the context of an estimating equations approach for intracluster correlations among binary outcomes. Simulation study results show that the bias-corrected covariance estimators perform better than uncorrected sandwich estimators in terms of bias and coverage probabilities. Additionally, introduction of a matrix-based bias-correction into the estimating equations considerably improves point and interval estimation for the intracluster correlations. The methods are illustrated using data from a nested cross-sectional cluster trial on reducing underage drinking.

3.

What can go wrong when ignoring correlation bounds in the use of generalized estimating equations

R. T. Sabo N. R. Chaganty 《Statistics in medicine》2010,29(24):2501-2507

The analysis of repeated measure or clustered data is often complicated by the presence of correlation. Further complications arise for discrete responses, where the marginal probability‐dependent Fréchet bounds impose feasibility limits on the correlation that are often more restrictive than the positive definite range. Some popular statistical methods, such as generalized estimating equations (GEE), ignore these bounds, and as such can generate erroneous estimates and lead to incorrect inferential results. In this paper, we discuss two alternative strategies: (i) using QIC to select a data‐driven correlation value within the Fréchet bounds, and (ii) the use of likelihood‐based latent variable modeling, such as multivariate probit, to get around the problem altogether. We provide two examples of the repercussions of incorrectly using existing GEE software in the presence of correlated binary responses. Copyright © 2010 John Wiley & Sons, Ltd.
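
To make the bounds concrete, here is a minimal sketch of the standard Fréchet limits on the correlation between two binary variables with given marginal success probabilities. The function names are hypothetical and the code is not taken from the paper; it only shows how narrow the feasible range can become when the marginals differ.

```python
import math


def binary_correlation_bounds(p1, p2):
    """Feasible range of the correlation between two Bernoulli variables
    with success probabilities p1 and p2 (both strictly between 0 and 1)."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    lower = max(-math.sqrt((p1 * p2) / (q1 * q2)),
                -math.sqrt((q1 * q2) / (p1 * p2)))
    upper = min(math.sqrt((p1 * q2) / (p2 * q1)),
                math.sqrt((p2 * q1) / (p1 * q2)))
    return lower, upper


def is_feasible(rho, p1, p2):
    """Check whether a correlation value lies inside the Fréchet bounds."""
    lower, upper = binary_correlation_bounds(p1, p2)
    return lower <= rho <= upper


# With marginals 0.1 and 0.5 the correlation cannot exceed 1/3 in absolute value.
print(binary_correlation_bounds(0.1, 0.5))
print(is_feasible(0.6, 0.1, 0.5))
```

A working correlation estimate that fails `is_feasible` for some pair of fitted marginal means is positive definite in the usual sense yet impossible for binary data, which is the situation the abstract warns about.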

4.

Covariance estimators for generalized estimating equations (GEE) in longitudinal analysis with small samples

Ming Wang Lan Kong Zheng Li Lijun Zhang 《Statistics in medicine》2016,35(10):1706-1721

Generalized estimating equations (GEE) is a general statistical method to fit marginal models for longitudinal data in biomedical studies. The variance–covariance matrix of the regression parameter coefficients is usually estimated by a robust “sandwich” variance estimator, which does not perform satisfactorily when the sample size is small. To reduce the downward bias and improve the efficiency, several modified variance estimators have been proposed for bias‐correction or efficiency improvement. In this paper, we provide a comprehensive review on recent developments of modified variance estimators and compare their small‐sample performance theoretically and numerically through simulation and real data examples. In particular, Wald tests and t‐tests based on different variance estimators are used for hypothesis testing, and the guideline on appropriate sample sizes for each estimator is provided for preserving type I error in general cases based on numerical results. Moreover, we develop a user‐friendly R package “geesmv” incorporating all of these variance estimators for public usage in practice. Copyright © 2015 John Wiley & Sons, Ltd.

5.

Maintaining the validity of inference in small-sample stepped wedge cluster randomized trials with binary outcomes when using generalized estimating equations

Whitney P. Ford Philip M. Westgate 《Statistics in medicine》2020,39(21):2779-2792

Stepped wedge cluster trials are an increasingly popular alternative to traditional parallel cluster randomized trials. Such trials often utilize a small number of clusters and numerous time intervals, and these components must be considered when choosing an analysis method. A generalized linear mixed model containing a random intercept and fixed time and intervention covariates is the most common analysis approach. However, the sole use of a random intercept applies a constant intraclass correlation coefficient structure, which is an assumption that is likely to be violated given stepped wedge trials (SWTs) have multiple time intervals. Alternatively, generalized estimating equations (GEE) are robust to the misspecification of the working correlation structure, although it has been shown that small-sample adjustments to standard error estimates and the use of appropriate degrees of freedom are required to maintain the validity of inference when the number of clusters is small. In this article, we show, using an extensive simulation study based on a motivating example and a more general design, the use of GEE can maintain the validity of inference in small-sample SWTs with binary outcomes. Furthermore, we show which combinations of bias corrections to standard error estimates and degrees of freedom work best in terms of attaining nominal type I error rates.

6.

A robust and unified framework for estimating heritability in twin studies using generalized estimating equations

Jaron Arbet Matt McGue Saonli Basu 《Statistics in medicine》2020,39(27):3897-3913

The ‘heritability’ of a phenotype measures the proportion of trait variance due to genetic factors in a population. In the past 50 years, studies with monozygotic and dizygotic twins have estimated heritability for 17,804 traits;¹ thus twin studies are popular for estimating heritability. Researchers are often interested in estimating heritability for non-normally distributed outcomes such as binary, counts, skewed or heavy-tailed continuous traits. In these settings, the traditional normal ACE model (NACE) and Falconer's method can produce poor coverage of the true heritability. Therefore, we propose a robust generalized estimating equations (GEE2) framework for estimating the heritability of non-normally distributed outcomes. The traditional NACE and Falconer's method are derived within this unified GEE2 framework, which additionally provides robust standard errors. Although the traditional Falconer's method cannot adjust for covariates, the corresponding ‘GEE2-Falconer’ can incorporate mean and variance-level covariate effects (e.g. let heritability vary by sex or age). Given a non-normally distributed outcome, the GEE2 models are shown to attain better coverage of the true heritability compared to traditional methods. Finally, a scenario is demonstrated where NACE produces biased estimates of heritability while Falconer remains unbiased. Therefore, we recommend GEE2-Falconer for estimating the heritability of non-normally distributed outcomes in twin studies.
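
For orientation, the traditional Falconer estimate mentioned in this abstract is twice the difference between the monozygotic and dizygotic twin correlations. The sketch below shows only that classical formula with hypothetical input values; it does not reproduce the GEE2 framework, covariate adjustment, or robust standard errors described by the authors.

```python
def falconer_heritability(r_mz, r_dz):
    """Classical Falconer estimate: h^2 = 2 * (r_MZ - r_DZ).

    r_mz: trait correlation within monozygotic twin pairs
    r_dz: trait correlation within dizygotic twin pairs
    """
    return 2.0 * (r_mz - r_dz)


# Hypothetical twin correlations, chosen only for illustration.
h2 = falconer_heritability(r_mz=0.8, r_dz=0.5)
print(round(h2, 3))
```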

7.

Working covariance model selection for generalized estimating equations

Carey VJ Wang YG 《Statistics in medicine》2011,30(26):3117-3124

We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on discrepancy between model-sensitive and model-robust regression parameter covariance estimators. The Gaussian pseudolikelihood is found in simulation to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice.

8.

Simple generalized estimating equations (GEEs) and weighted generalized estimating equations (WGEEs) in longitudinal studies with dropouts: guidelines and implementation in R

Alejandro Salazar Begoña Ojeda María Dueñas Fernando Fernández Inmaculada Failde 《Statistics in medicine》2016,35(19):3424-3448

Missing data are a common problem in clinical and epidemiological research, especially in longitudinal studies. Despite many methodological advances in recent decades, many papers on clinical trials and epidemiological studies do not report using principled statistical methods to accommodate missing data or use ineffective or inappropriate techniques. Two refined techniques are presented here: generalized estimating equations (GEEs) and weighted generalized estimating equations (WGEEs). These techniques are an extension of generalized linear models to longitudinal or clustered data, where observations are no longer independent. They can appropriately handle missing data when the missingness is completely at random (GEE and WGEE) or at random (WGEE) and do not require the outcome to be normally distributed. Our aim is to describe and illustrate with a real example, in a simple and accessible way to researchers, these techniques for handling missing data in the context of longitudinal studies subject to dropout and show how to implement them in R. We apply them to assess the evolution of health‐related quality of life in coronary patients in a data set subject to dropout. Copyright © 2016 John Wiley & Sons, Ltd.

9.

Working‐correlation‐structure identification in generalized estimating equations

Lin‐Yee Hin You‐Gan Wang 《Statistics in medicine》2009,28(4):642-658

Selecting an appropriate working correlation structure is pertinent to clustered data analysis using generalized estimating equations (GEE) because an inappropriate choice will lead to inefficient parameter estimation. We investigate the well‐known criterion of QIC for selecting a working correlation structure, and have found that performance of the QIC is deteriorated by a term that is theoretically independent of the correlation structures but has to be estimated with an error. This leads us to propose a correlation information criterion (CIC) that substantially improves the QIC performance. Extensive simulation studies indicate that the CIC has remarkable improvement in selecting the correct correlation structures. We also illustrate our findings using a data set from the Madras Longitudinal Schizophrenia Study. Copyright © 2008 John Wiley & Sons, Ltd.
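
As a rough sketch of what the CIC computes: it is commonly written as the trace of the product of the model-based information matrix under independence and the robust (sandwich) covariance under the candidate working structure, i.e. the penalty part of QIC without the quasi-likelihood term. The abstract does not spell out the formula, so that definition is an assumption here, as are the helper names and the numeric matrices.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def cic(omega_indep, v_robust):
    """Trace of omega_indep @ v_robust.

    omega_indep: inverse of the model-based covariance under independence
    v_robust:    sandwich covariance under the candidate working structure
    """
    product = matmul(omega_indep, v_robust)
    return sum(product[i][i] for i in range(len(product)))


# Hypothetical inputs for a model with two regression parameters.
omega = [[2.0, 0.0], [0.0, 4.0]]
v_r = [[0.5, 0.1], [0.1, 0.25]]
print(cic(omega, v_r))
```

Under this definition the candidate structure with the smallest value would be preferred.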

10.

Correlation analysis of twin data with repeated measures based on generalized estimating equations

John S. Grove Lue Ping Zhao Filemon Quiaoit 《Genetic epidemiology》1993,10(6):539-544

Repeated measures allow additional tests of common assumptions in twin correlation analysis. Analysis of log serum triglyceride level in NHLBI male twins using generalized estimating equations disclosed that the mean and variance shifted across exams, presumably because of changes in laboratory practice. © 1993 Wiley-Liss, Inc.

11.

Using second‐order generalized estimating equations to model heterogeneous intraclass correlation in cluster‐randomized trials

Catherine M. Crespi Weng Kee Wong Shiraz I. Mishra 《Statistics in medicine》2009,28(5):814-827

In cluster‐randomized trials, it is commonly assumed that the magnitude of the correlation among subjects within a cluster is constant across clusters. However, the correlation may in fact be heterogeneous and depend on cluster characteristics. Accurate modeling of the correlation has the potential to improve inference. We use second‐order generalized estimating equations to model heterogeneous correlation in cluster‐randomized trials. Using simulation studies we show that accurate modeling of heterogeneous correlation can improve inference when the correlation is high or varies by cluster size. We apply the methods to a cluster‐randomized trial of an intervention to promote breast cancer screening. Copyright © 2008 John Wiley & Sons, Ltd.

12.

Joint modeling of multiple ordinal adherence outcomes via generalized estimating equations with flexible correlation structure

Zhen Jiang Yimeng Liu Abdus S. Wahed Geert Molenberghs 《Statistics in medicine》2018,37(6):983-995

Adherence to medication is critical in achieving effectiveness of many treatments. Factors that influence adherence behavior have been the subject of many clinical studies. Analyzing adherence is complicated because it is often measured on multiple drugs over a period, resulting in a multivariate longitudinal outcome. This paper is motivated by the Viral Resistance to Antiviral Therapy of Chronic Hepatitis C study, where adherence is measured on two drugs as a bivariate ordinal longitudinal outcome. To analyze such an outcome, we propose a joint model assuming the multivariate ordinal outcome arose from a partitioned latent multivariate normal process. We also provide a flexible multilevel association structure covering both between and within outcome correlation. In simulation studies, we show that the joint model provides unbiased estimators for regression parameters, which are more efficient than those obtained through fitting a separate model for each outcome. The joint method also yields unbiased estimators for the correlation parameters when the correlation structure is correctly specified. Finally, we analyze the Viral Resistance to Antiviral Therapy of Chronic Hepatitis C adherence data and discuss the findings.

13.

Concordance correlation coefficients estimated by generalized estimating equations and variance components for longitudinal repeated measurements

Miao‐Yu Tsai 《Statistics in medicine》2017,36(8):1319-1333

The concordance correlation coefficient (CCC) is a commonly accepted measure of agreement between two observers for continuous responses. This paper proposes a generalized estimating equations (GEE) approach allowing dependency between repeated measurements over time to assess intra‐agreement for each observer and inter‐ and total agreement among multiple observers simultaneously. Furthermore, the indices of intra‐, inter‐, and total agreement through variance components (VC) from an extended three‐way linear mixed model (LMM) are also developed with consideration of the correlation structure of longitudinal repeated measurements. Simulation studies are conducted to compare the performance of the GEE and VC approaches for repeated measurements from longitudinal data. An application to an optometric conformity study is used for illustration. In conclusion, the GEE approach allowing flexibility in model assumptions and correlation structures of repeated measurements gives satisfactory results with small mean square errors and nominal 95% coverage rates for large data sets, and when the assumption of the relationship between variances and covariances for the extended three‐way LMM holds, the VC approach performs outstandingly well for all sample sizes. Copyright © 2017 John Wiley & Sons, Ltd.
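
As background for the quantity being estimated, the basic two-observer CCC can be computed from sample moments as 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²). The sketch below implements only this simple moment estimator with 1/n divisors on hypothetical readings; it is not the GEE or variance-components estimator for repeated measurements that the paper develops.

```python
def concordance_correlation(x, y):
    """Moment estimator of the concordance correlation coefficient
    between paired readings x and y from two observers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2.0 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical readings: observer B is perfectly correlated with observer A
# but shifted by one unit, so agreement is well below 1.
observer_a = [1.0, 2.0, 3.0]
observer_b = [2.0, 3.0, 4.0]
print(concordance_correlation(observer_a, observer_b))
```

The example illustrates why the CCC differs from the Pearson correlation: a constant shift between observers leaves the Pearson correlation at 1 but lowers the CCC.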

14.

A comparison of several approaches for choosing between working correlation structures in generalized estimating equation analysis of longitudinal binary data

Justine Shults Wenguang Sun Xin Tu Hanjoo Kim Jay Amsterdam Joseph M. Hilbe Thomas Ten‐Have 《Statistics in medicine》2009,28(18):2338-2355

The method of generalized estimating equations (GEE) models the association between the repeated observations on a subject with a patterned correlation matrix. Correct specification of the underlying structure is a potentially beneficial goal, in terms of improving efficiency and enhancing scientific understanding. We consider two sets of criteria that have previously been suggested, respectively, for selecting an appropriate working correlation structure, and for ruling out a particular structure(s), in the GEE analysis of longitudinal studies with binary outcomes. The first selection criterion chooses the structure for which the model‐based and the sandwich‐based estimator of the covariance matrix of the regression parameter estimator are closest, while the second selection criterion chooses the structure that minimizes the weighted error sum of squares. The rule out criterion deselects structures for which the estimated correlation parameter violates standard constraints for binary data that depend on the marginal means. In addition, we remove structures from consideration if their estimated parameter values yield an estimated correlation structure that is not positive definite. We investigate the performance of the two sets of criteria using both simulated and real data, in the context of a longitudinal trial that compares two treatments for major depressive episode. Practical recommendations are also given on using these criteria to aid in the efficient selection of a working correlation structure in GEE analysis of longitudinal binary data. Copyright © 2009 John Wiley & Sons, Ltd.

15.

Analysis of data with multiple sources of correlation in the framework of generalized estimating equations

Shults J Whitt MC Kumanyika S 《Statistics in medicine》2004,23(20):3209-3226

This paper is motivated by a study of physical activity participation habits in African American women with three potential sources of correlation among study outcomes, according to method of assessment, timing of measurement, and intensity of physical activity. To adjust for the multiple sources of correlation in this study, we implement an approach based on generalized estimating equations that models association via a patterned correlation matrix. We present a general algorithm that is relatively straightforward to program, an analysis of our physical activity study, and some asymptotic relative efficiency comparisons between correctly specifying the correlation structure vs ignoring two sources of correlation in the analysis of data from this study. The efficiency comparisons demonstrate that correctly modeling the correlation structure can prevent substantial losses in efficiency in estimation of the regression parameter.

16.

Modified robust variance estimator for generalized estimating equations with improved small-sample performance

Wang M Long Q 《Statistics in medicine》2011,30(11):1278-1291

Generalized estimating equations (GEE) (Biometrika 1986; 73(1):13-22) is a general statistical method to fit marginal models for correlated or clustered responses, and it uses a robust sandwich estimator to estimate the variance-covariance matrix of the regression coefficient estimates. While this sandwich estimator is robust to the misspecification of the correlation structure of the responses, its finite sample performance deteriorates as the number of clusters or observations per cluster decreases. To address this limitation, Pan (Biometrika 2001; 88(3):901-906) and Mancl and DeRouen (Biometrics 2001; 57(1):126-134) investigated two modifications to the original sandwich variance estimator. Motivated by the ideas underlying these two modifications, we propose a novel robust variance estimator that combines the strengths of these estimators. Our theoretical and numerical results show that the proposed estimator attains better efficiency and achieves better finite sample performance compared with existing estimators. In particular, when the sample size or cluster size is small, our proposed estimator exhibits lower bias and the resulting confidence intervals for GEE estimates achieve better coverage rates. We illustrate the proposed method using data from a dental study.
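
To show what the original (uncorrected) sandwich estimator looks like in the simplest case, the sketch below treats an intercept-only model with identity link and an independence working correlation. There the estimate is the overall mean, the "bread" is the total number of observations N, and the "meat" is the sum over clusters of the squared cluster residual totals, giving variance meat / N². This is an illustrative special case with hypothetical data; it does not implement the Pan, Mancl and DeRouen, or proposed modifications.

```python
def intercept_only_sandwich(clusters):
    """Return (estimate, robust variance) for an intercept-only marginal model.

    clusters: list of lists, one inner list of responses per cluster.
    Assumes identity link and independence working correlation.
    """
    n_total = sum(len(c) for c in clusters)
    beta = sum(sum(c) for c in clusters) / n_total
    # Meat: squared sum of residuals within each cluster, added over clusters.
    meat = sum(sum(y - beta for y in c) ** 2 for c in clusters)
    return beta, meat / n_total ** 2


# Hypothetical data: two clusters with two observations each.
estimate, robust_var = intercept_only_sandwich([[1.0, 2.0], [3.0, 4.0]])
print(estimate, robust_var)
```

With only two clusters the meat is built from two terms, which gives some intuition for why the uncorrected estimator becomes unreliable when the number of clusters is small.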

17.

Sample size calculation in three-level cluster randomized trials using generalized estimating equation models

Jingxia Liu Graham A. Colditz 《Statistics in medicine》2020,39(24):3347-3372

Three-level cluster randomized trials (CRTs) are increasingly used in implementation science, where 2-fold nested correlated data arise. For example, interventions are randomly assigned to practices, and providers within the same practice who provide care to participants are trained with the assigned intervention. Teerenstra et al proposed a nested exchangeable correlation structure that accounts for two levels of clustering within the generalized estimating equations (GEE) approach. In this article, we utilize GEE models to test the treatment effect in a two-group comparison for continuous, binary, or count data in three-level CRTs. Given the nested exchangeable correlation structure, we derive the asymptotic variances of the estimator of the treatment effect for different types of outcomes. When the number of clusters is small, researchers have proposed bias-corrected sandwich estimators to improve performance in two-level CRTs. We extend the variances of two bias-corrected sandwich estimators to three-level CRTs. Equal provider and practice sizes were assumed for simplicity when calculating the number of practices. However, they are not guaranteed in practice. Relative efficiency (RE) is defined as the ratio of variance of the estimator of the treatment effect for equal to unequal provider and practice sizes. The expressions of REs are obtained from both asymptotic variance estimation and bias-corrected sandwich estimators. Their performances are evaluated for different scenarios of provider and practice size distributions through simulation studies. Finally, a percentage increase in the number of practices is proposed due to efficiency loss from unequal provider and/or practice sizes.

18.

Analysis of repeated bouts of measurements in the framework of generalized estimating equations

Shults J Mazurick CA Richard Landis J 《Statistics in medicine》2006,25(23):4114-4128

We consider a longitudinal study of interstitial cystitis (IC) in women, in which the time between bouts of repeated measurements is large relative to the within-bout separation in time. Our outcome of interest is the number of nocturnal voids that we model via quasi-least squares (QLS) in the framework of generalized estimating equations (GEE). To account for potential intra-subject correlation, we directly apply a banded Toeplitz correlation structure that previously was only implemented in an ad hoc approach using GEE. We describe this structure, its appropriateness for data from the IC study, and the results of our analysis. We then demonstrate that correct specification of the underlying correlation structure versus incorrectly applying a simpler structure can prevent substantial losses in efficiency in estimation of the regression parameter. These comparisons involve the limiting values of the estimates of the correlation parameters, which are not consistent for the misspecification scenarios considered here. We therefore obtain the limiting values of the QLS estimates when the structure is incorrectly specified.

19.

Bivariate association analyses for the mixture of continuous and binary traits with the use of extended generalized estimating equations

Liu J Pei Y Papasian CJ Deng HW 《Genetic epidemiology》2009,33(3):217-227

Genome-wide association (GWA) study is becoming a powerful tool in deciphering the genetic basis of complex human diseases/traits. Currently, the univariate analysis is the most commonly used method to identify genes associated with a certain disease/phenotype under study. A major limitation with the univariate analysis is that it may not make use of the information of multiple correlated phenotypes, which are usually measured and collected in practical studies. The multivariate analysis has proven to be a powerful approach in linkage studies of complex diseases/traits, but it has received little attention in GWA. In this study, we aim to develop a bivariate analytical method for GWA study, which can be used for a complex situation in which a continuous trait and a binary trait are measured under study. Based on the modified extended generalized estimating equation (EGEE) method we proposed herein, we assessed the performance of our bivariate analyses through extensive simulations as well as real data analyses. In the study, to develop an EGEE approach for bivariate genetic analyses, we combined two different generalized linear models corresponding to phenotypic variables using a seemingly unrelated regression model. The simulation results demonstrated that our EGEE-based bivariate analytical method outperforms univariate analyses in increasing statistical power under a variety of simulation scenarios. Notably, EGEE-based bivariate analyses have consistent advantages over univariate analyses whether or not there exists a phenotypic correlation between the two traits. Our study has practical importance, as one can always use multivariate analyses as a screening tool when multiple phenotypes are available, without extra costs of statistical power and false-positive rate. Analyses on empirical GWA data further affirm the advantages of our bivariate analytical method.

20.

Generalized estimating equations to estimate the ordered stereotype logit model for panel data

Martin Spiess Daniel Fernández Thuong Nguyen Ivy Liu 《Statistics in medicine》2020,39(14):1919-1940

By modeling the effects of predictor variables as a multiplicative function of regression parameters being invariant over categories, and category-specific scalar effects, the ordered stereotype logit model is a flexible regression model for ordinal response variables. In this article, we propose a generalized estimating equations (GEE) approach to estimate the ordered stereotype logit model for panel data based on working covariance matrices, which are not required to be correctly specified. A simulation study compares the performance of GEE estimators based on various working correlation matrices and working covariance matrices using local odds ratios. Estimation of the model is illustrated using a real-world dataset. The results from the simulation study suggest that GEE estimation of this model is feasible in medium-sized and large samples and that estimators based on local odds ratios as realized in this study tend to be less efficient compared with estimators based on a working correlation matrix. For low true correlations, the efficiency gains seem to be rather small and if the working covariance structure is too flexible, the corresponding estimator may even be less efficient compared with the GEE estimator assuming independence. Like for GEE estimators more generally, if the true correlations over time are high, then a working covariance structure which is close to the true structure can lead to considerable efficiency gains compared with assuming independence.