Similar Articles
20 similar articles found.
1.
Existing methods for power and sample size estimation for longitudinal and other clustered study designs have limited applications. In this paper, we review and extend existing approaches to address these limitations. In particular, we focus on power analysis for the two most popular approaches for clustered data analysis: generalized estimating equations and linear mixed-effects models. By basing the derivation of the power function on the asymptotic distribution of the model estimates, the proposed approach provides estimates of power that are consistent with the methods of inference used for data analysis. The proposed methodology is illustrated with numerous examples motivated by real study designs.
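The derivations themselves are in the paper; as a minimal illustration of the general recipe — basing power on the asymptotic normal distribution of a model estimate — a two-sided Wald-test power calculation can be sketched in Python. The `effect` and `avar` inputs are hypothetical placeholders for quantities that would come from the GEE or mixed-model working assumptions:

```python
import numpy as np
from scipy.stats import norm

def wald_power(effect, avar, n, alpha=0.05):
    """Two-sided Wald-test power when sqrt(n) * (betahat - beta) is
    asymptotically N(0, avar): the test statistic has noncentrality
    |effect| * sqrt(n / avar) under the alternative."""
    z = norm.ppf(1 - alpha / 2)
    ncp = abs(effect) * np.sqrt(n / avar)
    return norm.cdf(ncp - z) + norm.cdf(-ncp - z)

# e.g. detecting an effect of 0.3 with asymptotic variance 4.0, n = 200
print(wald_power(effect=0.3, avar=4.0, n=200))
```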

2.
Existing methods for power analysis for longitudinal study designs are limited in that they do not adequately address random missing data patterns. Although the pattern of missing data can be assessed during data analysis, it is unknown during the design phase of a study. The random nature of the missing data pattern adds another layer of complexity in addressing missing data for power analysis. In this paper, we model the occurrence of missing data with a two-state, first-order Markov process and integrate the modelling information into the power function to account for random missing data patterns. The Markov model is easily specified to accommodate different anticipated missing data processes. We develop this approach for the two most popular longitudinal models: the generalized estimating equations (GEE) and the linear mixed-effects model under the missing completely at random (MCAR) assumption. For GEE, we also limit our consideration to the working independence correlation model. The proposed methodology is illustrated with numerous examples that are motivated by real study designs.
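As a rough sketch of how a two-state, first-order Markov missingness process feeds into design calculations: the chain's transition probabilities determine the expected proportion of subjects observed at each visit, which in turn would enter the power function. The transition probabilities below are illustrative, not from the paper:

```python
import numpy as np

# Two states: observed / missing. p_oo = P(observed at t | observed at t-1),
# p_mo = P(observed at t | missing at t-1); values are illustrative.
p_oo, p_mo = 0.90, 0.30
T = 5                       # number of scheduled visits
p_obs = np.empty(T)
p_obs[0] = 1.0              # everyone observed at baseline
for t in range(1, T):
    p_obs[t] = p_obs[t - 1] * p_oo + (1 - p_obs[t - 1]) * p_mo
print(p_obs)  # expected observation rates feeding into the power function
```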

3.
The analysis of a baseline predictor with a longitudinally measured outcome is well established, and sample size calculations are reasonably well understood. Analysis of bivariate longitudinally measured outcomes is gaining in popularity, and methods to address design issues are required. The focus in a random effects model for bivariate longitudinal outcomes is on the correlations that arise between the random effects and between the bivariate residuals. In the bivariate random effects model, we estimate the asymptotic variances of the correlations and propose power calculations for testing and estimating the correlations. We compare asymptotic variance estimates to variance estimates obtained from simulation studies, and we compare our proposed power calculations for correlations on bivariate longitudinal data to power calculations for correlations on cross-sectional data.
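The paper's asymptotic variances are specific to the bivariate random-effects model and are not reproduced here; for orientation, the cross-sectional benchmark it compares against can be sketched with the familiar Fisher z approximation (an assumption of this sketch, not the authors' estimator):

```python
import numpy as np
from scipy.stats import norm

def corr_power(rho, n, alpha=0.05):
    """Power to reject H0: rho = 0 via the Fisher z transform, where
    atanh(r) is approximately N(atanh(rho), 1 / (n - 3))."""
    z_alpha = norm.ppf(1 - alpha / 2)
    ncp = np.arctanh(rho) * np.sqrt(n - 3)
    return norm.cdf(ncp - z_alpha) + norm.cdf(-ncp - z_alpha)

print(corr_power(rho=0.3, n=100))
```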

4.
Existing study design formulas for longitudinal studies have assumed that the exposure is time-invariant. We derived sample size formulas for studies comparing rates of change by exposure when the exposure varies with time within a subject, focusing on observational studies where this variation is not controlled by the investigator. Two scenarios are considered: one assuming that the effect of exposure on the response is acute, and the other assuming that it is cumulative. We show that accurate calculations can often be obtained by providing the intraclass correlation of exposure and the exposure prevalence at each time point. When comparing rates of change, studies with a time-varying exposure are, in general, less efficient than studies with a time-invariant one. We provide a public-access program to perform the calculations described in the paper (http://www.hsph.harvard.edu/faculty/spiegelman/optitxs.html).
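The time-varying-exposure formulas are in the paper and its optitxs program; for reference, the time-invariant special case they generalize is the standard two-group slope comparison. A sketch under that simpler assumption, where `var_slope` is a hypothetical variance of an individual slope estimate:

```python
import numpy as np
from scipy.stats import norm

def n_per_group(delta, var_slope, alpha=0.05, power=0.80):
    """Subjects per arm to detect a rate-of-change difference `delta`
    when each individual slope estimate has variance `var_slope`."""
    z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
    return 2 * var_slope * (z / delta) ** 2

print(np.ceil(n_per_group(delta=0.5, var_slope=2.0)))
```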

5.
In medical and health studies, heterogeneities in clustered count data have traditionally been modeled by positive random effects in Poisson mixed models; however, excessive zeros often occur in clustered medical and health count data. In this paper, we consider a three-level random effects zero-inflated Poisson model for health-care utilization data where data are clustered by both subjects and families. To accommodate the zero and positive components of the count response compatibly, we model the subject-level random effects with a compound Poisson distribution. Our model displays a variance components decomposition that clearly reflects the hierarchical structure of clustered data. A quasi-likelihood approach is developed for the estimation of our model. We illustrate the method with an analysis of the health-care utilization data, and the performance of our method is also evaluated through simulation studies.
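A hypothetical simulation of the data structure — clustered counts with family- and subject-level heterogeneity plus excess zeros — may help fix ideas; this stand-in uses multiplicative gamma frailties and a simple zero-inflation indicator rather than the authors' compound-Poisson formulation:

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative three-level structure: families > subjects > repeated counts.
n_fam, n_sub, n_rep = 50, 3, 4
fam_eff = rng.gamma(shape=2.0, scale=0.5, size=n_fam)            # family level
sub_eff = rng.gamma(shape=2.0, scale=0.5, size=(n_fam, n_sub))   # subject level
base_rate, p_zero = 2.0, 0.3
lam = base_rate * fam_eff[:, None, None] * sub_eff[:, :, None]
counts = rng.poisson(lam, size=(n_fam, n_sub, n_rep))
counts *= rng.random((n_fam, n_sub, n_rep)) > p_zero             # excess zeros
print(counts.mean(), (counts == 0).mean())
```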

6.
Many chronic diseases or health conditions manifest with recurring episodes, each of which can be characterized by a measure of intensity or severity. Both the number of episodes and the severity of each episode can depend on the latent severity of an individual's underlying condition. Such data are commonly gathered repeatedly at fixed follow-up intervals. An example is a study of the association between stressful life events and the onset of depression, in which stress exposure is assessed through the frequency and intensity of stressful life events occurring each month. Both the number of events and the intensity of each event at each measurement occasion are informative about the underlying severity of stress over time. One might hypothesize that people who approach the onset of a depressive episode have worse stress profiles than controls, reflected by both more frequent and more intense stressors. We propose models to analyze data collected repeatedly on both the frequency of an event and its severity when both are informative about the underlying latent severity. Maximum likelihood estimators are developed, and simulations with small to moderate sample sizes show that the estimators have good finite sample properties and are robust against misspecification of the model. The method is applied to a psychiatric data set.

7.
The D-optimality criterion is used to construct optimal designs for different numbers of independent cohorts, each comprising a number of repeated measurements per subject over time. A cost function for longitudinal data is proposed, and the optimality criterion is optimized taking the cost of the study into account. First, an optimal number of design points for a given number of cohorts and cost is identified; then, an optimal number of cohorts is identified by comparing relative efficiencies (REs). A numerical study shows that for models describing the trend of a continuous outcome over time by polynomials, the most efficient number of repeated measurements equals the total number of cohorts plus the degree of the polynomial in the model. REs of a purely longitudinal design with only one cohort and of mixed longitudinal and cross-sectional designs with more cohorts are compared. The results show that a purely longitudinal design with a single cohort of subjects measured at the optimal time points is the most efficient. These findings show that a highly efficient design for parameter estimation can be obtained with only a few repeated measurements, which will reduce the cost of data collection and ease the logistical burdens of cohort studies.
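As a simplified illustration of the D-optimality idea (fixed-effects polynomial regression only — no cost function or cohort structure, unlike the paper), one can search candidate sets of measurement occasions for the set maximizing det(X'X):

```python
import numpy as np
from itertools import combinations

def d_criterion(times, degree=2):
    """D-optimality criterion det(X'X) for a polynomial trend of the
    given degree measured at the given time points."""
    X = np.vander(np.asarray(times, float), degree + 1)
    return np.linalg.det(X.T @ X)

# Which 4 of 7 equally spaced visits best support a quadratic trend?
grid = np.linspace(0, 1, 7)
best = max(combinations(grid, 4), key=lambda t: d_criterion(t, degree=2))
print(best)
```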

8.
We present a methodology motivated by a controlled trial designed to validate SPOT GRADE, a novel surgical bleeding severity scale. Briefly, the study was designed to quantify inter- and intra-surgeon agreement in characterizing the severity of surgical bleeds via a Kappa statistic. Multiple surgeons were presented with a randomized sequence of controlled bleeding videos and asked to apply the rating system to characterize each wound. Each video was shown multiple times to quantify intra-surgeon reliability, creating clustered data. In addition, videos within the same category may have had different classification probabilities due to changes in blood flow rates and wound sizes. In this work, we propose a new variance estimator for the Kappa statistic that accommodates clustered data as well as heterogeneity among items within the same classification category. We then apply this methodology to data from the SPOT GRADE trial.
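The proposed variance estimator is the paper's contribution and is not reproduced here; for context, the Kappa point estimate itself is a short computation from a square table of rating pairs:

```python
import numpy as np

def cohens_kappa(table):
    """Kappa from a square contingency table of rating pairs:
    kappa = (p_observed - p_expected) / (1 - p_expected)."""
    table = np.asarray(table, float)
    p = table / table.sum()
    p_obs = np.trace(p)                      # observed agreement
    p_exp = p.sum(axis=1) @ p.sum(axis=0)    # chance agreement from margins
    return (p_obs - p_exp) / (1 - p_exp)

# Illustrative 3-category rating table (not SPOT GRADE data)
print(cohens_kappa([[20, 5, 0], [4, 30, 6], [1, 5, 29]]))
```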

9.
In randomized clinical trials, a pre-treatment measurement is often taken at baseline, and post-treatment effects are measured at several time points post-baseline, say t = 1, ..., T. At the end of the trial, it is of interest to assess the treatment effect based on the mean change from baseline at the last time point T. We consider statistical methods for (i) a point estimate and 95 per cent confidence interval for the mean change from baseline at time T for each treatment group, and (ii) a p-value and 95 per cent confidence interval for the between-group difference in the mean change from baseline. The manner in which the baseline responses are used in the analysis influences both the accuracy and the efficiency of items (i) and (ii). In this paper, we consider the ANCOVA approach with change from baseline as the dependent variable and compare it with the constrained longitudinal data analysis (cLDA) model proposed by Liang and Zeger (Sankhya: Indian J. Stat. (Ser B) 2000; 62:134-148), in which the baseline is modeled as a dependent variable in conjunction with the constraint of a common baseline mean across the treatment groups. Some drawbacks of the ANCOVA model and potential advantages of the cLDA approach are discussed and illustrated using numerical simulations.
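A minimal sketch of the ANCOVA comparator on simulated data (the cLDA model, which treats baseline as an outcome with a constrained common mean, is not reproduced here; the data-generating values are arbitrary):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200
trt = rng.integers(0, 2, n)                       # randomized treatment
base = rng.normal(10, 2, n)                       # baseline measurement
yT = base + 1.5 * trt + rng.normal(0, 2, n)       # response at time T
df = pd.DataFrame({"trt": trt, "base": base, "change": yT - base})

# ANCOVA: change from baseline as the outcome, adjusted for baseline.
fit = smf.ols("change ~ base + trt", data=df).fit()
print(fit.params["trt"], fit.conf_int().loc["trt"].values)
```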

10.

Background

Environmental and biomedical researchers frequently encounter laboratory data constrained by a lower limit of detection (LOD). Commonly used methods to address these left-censored data, such as simple substitution of a constant for all values < LOD, may bias parameter estimation. In contrast, multiple imputation (MI) methods yield valid and robust parameter estimates and explicit imputed values for variables that can be analyzed as outcomes or predictors.

Objective

In this article we expand distribution-based MI methods for left-censored data to a bivariate setting, specifically, a longitudinal study with biological measures at two points in time.

Methods

We present the likelihood function for a bivariate normal distribution that accounts for values < LOD as well as data assumed missing at random, and we use the estimated distributional parameters to impute values < LOD and to generate multiple plausible data sets for analysis by standard statistical methods (a minimal one-variable sketch follows this entry). We conducted a simulation study to evaluate the sampling properties of the estimators, and we illustrate a practical application using data from the Community Participatory Approach to Measuring Farmworker Pesticide Exposure (PACE3) study to estimate associations between urinary acephate (APE) concentrations (indicating pesticide exposure) at two points in time and self-reported symptoms.

Results

Simulation study results demonstrated that imputed and observed values together were consistent with the assumed and estimated underlying distribution. Our analysis of PACE3 data using MI to impute APE values < LOD showed that urinary APE concentration was significantly associated with potential pesticide poisoning symptoms. Results based on simple substitution methods were substantially different from those based on the MI method.

Conclusions

The distribution-based MI method is a valid and feasible approach to analyze bivariate data with values < LOD, especially when explicit values for the nondetections are needed. We recommend the use of this approach in environmental and biomedical research.
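A minimal single-variable sketch of distribution-based imputation below an LOD, assuming lognormal concentrations; the paper's method is bivariate and additionally handles missing-at-random data, and the parameters here would in practice come from a censored-data maximum likelihood fit:

```python
import numpy as np
from scipy.stats import truncnorm

rng = np.random.default_rng(0)
mu, sigma, lod = 1.0, 0.8, 0.5           # log-scale parameters, detection limit
x = rng.normal(mu, sigma, 300)            # latent log concentrations
censored = x < np.log(lod)                # flags for values below the LOD

m = 5                                     # number of imputed data sets
b = (np.log(lod) - mu) / sigma            # upper truncation point (z-scale)
imputations = []
for _ in range(m):
    xi = x.copy()
    # Draw each nondetection from the normal truncated above at log(LOD)
    xi[censored] = truncnorm.rvs(-np.inf, b, loc=mu, scale=sigma,
                                 size=censored.sum(), random_state=rng)
    imputations.append(xi)                # analyze each set, then pool (Rubin)
print(censored.mean(), np.mean([xi.mean() for xi in imputations]))
```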

11.
Robins introduced marginal structural models (MSMs) and inverse probability of treatment weighted (IPTW) estimators for the causal effect of a time-varying treatment on the mean of repeated measures. We investigate the sensitivity of IPTW estimators to unmeasured confounding. We examine a new framework for sensitivity analyses based on a nonidentifiable model that quantifies unmeasured confounding in terms of a sensitivity parameter and a user-specified function. We present augmented IPTW estimators of MSM parameters and prove their consistency for the causal effect of an MSM, assuming a correctly specified confounding bias function for unmeasured confounding. We apply the methods to assess the sensitivity of the analysis of Hernán et al., who used an MSM to estimate the causal effect of zidovudine therapy on repeated CD4 counts among HIV-infected men in the Multicenter AIDS Cohort Study. Under the assumption of no unmeasured confounders, a 95 per cent confidence interval for the treatment effect includes zero; under the assumption of a moderate amount of unmeasured confounding, it no longer includes zero. Thus, the analysis of Hernán et al. is somewhat sensitive to unmeasured confounding. We hope that our research will encourage and facilitate analyses of sensitivity to unmeasured confounding in other applications.
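The augmented IPTW estimators and the confounding bias function are the paper's contribution; as background, the basic IPTW recipe for a single time point — weight each subject by the inverse estimated probability of the treatment actually received, then fit a weighted outcome model — can be sketched on simulated data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 1000
L = rng.normal(size=n)                        # measured confounder
A = rng.binomial(1, 1 / (1 + np.exp(-L)))     # treatment depends on L
Y = 2.0 * A + L + rng.normal(size=n)          # outcome; true effect = 2.0

ps = sm.Logit(A, sm.add_constant(L)).fit(disp=0).predict()   # propensity score
w = np.where(A == 1, 1 / ps, 1 / (1 - ps))                   # IPTW weights
msm = sm.WLS(Y, sm.add_constant(A), weights=w).fit()
print(msm.params[1])                          # approx. causal effect of A
```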

12.
Motivated by the multivariate nature of microbiome data — with hierarchical taxonomic clusters, counts that are often skewed and zero inflated, and repeated measures — we propose a Bayesian latent variable methodology to jointly model multiple operational taxonomic units within a single taxonomic cluster. This novel method can incorporate both negative binomial and zero-inflated negative binomial responses, and can account for serial and familial correlations. We develop a Markov chain Monte Carlo algorithm that is built on a data augmentation scheme using Pólya-Gamma random variables. Hierarchical centering and parameter expansion techniques are also used to improve the convergence of the Markov chain. We evaluate the performance of our proposed method through extensive simulations, and we apply it to a human microbiome study.
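The Pólya-Gamma MCMC sampler is beyond a short sketch, but simulating the kind of data the model targets — zero-inflated negative binomial counts with subject-level heterogeneity — is straightforward; this illustrative stand-in uses a gamma-Poisson mixture for the NB component:

```python
import numpy as np

rng = np.random.default_rng(0)
n_sub, n_rep, disp, mu, p_zero = 100, 4, 2.0, 5.0, 0.25
u = rng.normal(0, 0.5, n_sub)                        # subject random effect
# Gamma-Poisson mixture: lambda ~ Gamma(disp, mean/disp) gives NB counts
lam = rng.gamma(disp, np.exp(np.log(mu) + u)[:, None] / disp,
                (n_sub, n_rep))
counts = rng.poisson(lam)
counts[rng.random((n_sub, n_rep)) < p_zero] = 0      # structural zeros
print(counts.mean(), (counts == 0).mean())
```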

13.
We investigate methods for data-based selection of working covariance models in the analysis of correlated data with generalized estimating equations. We study two selection criteria: Gaussian pseudolikelihood and a geodesic distance based on the discrepancy between model-sensitive and model-robust regression parameter covariance estimators. In simulation, the Gaussian pseudolikelihood is found to be reasonably sensitive for several response distributions and noncanonical mean-variance relations for longitudinal data. Application is also made to a clinical dataset. Assessment of the adequacy of both correlation and variance models for longitudinal data should be routine in applications, and we describe open-source software supporting this practice.
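A minimal sketch of the Gaussian pseudolikelihood criterion: score each candidate working covariance by the Gaussian log-likelihood of the cluster residuals. A single illustrative cluster with hypothetical residuals is shown; in practice the criterion is summed over clusters from a fitted GEE:

```python
import numpy as np

def gaussian_pseudolik(resid, V):
    """Gaussian pseudolikelihood of one cluster's residual vector under
    working covariance V: -0.5 * (log det V + r' V^{-1} r), up to a constant."""
    sign, logdet = np.linalg.slogdet(V)
    return -0.5 * (logdet + resid @ np.linalg.solve(V, resid))

def exchangeable(t, rho, sigma2=1.0):
    """Exchangeable working covariance for a cluster of size t."""
    return sigma2 * ((1 - rho) * np.eye(t) + rho * np.ones((t, t)))

r = np.array([0.5, -0.2, 0.9, 0.1])       # hypothetical cluster residuals
for rho in (0.0, 0.3, 0.6):
    print(rho, gaussian_pseudolik(r, exchangeable(4, rho)))
```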

14.
In clinical nutrition research, power (sample size) analysis is critical to the design of a clinical study protocol; adequate power helps ensure that study results are credible and reliable. Using a clinical study of ω-3 fat emulsion intervention as an example, this paper describes the power analysis performed during study design: parameter estimates were obtained from high-quality published data and meta-analysis results, sample sizes were computed under different parameter settings, and the final sample size was determined in light of clinical practicalities. Building on this, simulation was further used to hypothesize the outcome combinations that might arise during the study, showing, at the calculated sample size, the scenarios under which the actual study would yield positive or negative results. Sample size determination for a clinical study should balance both clinical and statistical considerations.
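A minimal sketch of the simulation-based power check the abstract describes, for a two-arm comparison; the effect size and standard deviation are placeholders for values estimated from prior literature or meta-analysis:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
delta, sd, n_per_arm, n_sim, alpha = 0.5, 1.2, 100, 2000, 0.05
hits = 0
for _ in range(n_sim):
    a = rng.normal(0.0, sd, n_per_arm)      # control arm
    b = rng.normal(delta, sd, n_per_arm)    # intervention arm
    hits += ttest_ind(a, b).pvalue < alpha  # count significant trials
print("simulated power:", hits / n_sim)
```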

15.
Parametric mixed-effects models are useful in longitudinal data analysis when the sampling frequencies of a response variable and the associated covariates are the same. We propose a three-step estimation procedure using local polynomial smoothing and demonstrate it with data in which the variables to be assessed are repeatedly sampled at different frequencies within the same time frame. We first insert pseudo data for the less frequently sampled variable, based on the observed measurements, to create a new dataset. Then standard simple linear regressions are fitted at each time point to obtain raw estimates of the association between dependent and independent variables. Last, local polynomial smoothing is applied to smooth the raw estimates. Rather than using a kernel function to assign weights, only analytical weights that reflect the importance of each raw estimate are used: the standard errors of the raw estimates and the distance between the pseudo data and the observed data serve as measures of that importance. We applied the proposed method to a weight loss clinical trial, where it efficiently estimated the correlation between the inconsistently sampled longitudinal data. Our approach was also evaluated via simulations. The results showed that the proposed method works better when the residual variances of the standard linear regressions are small and the within-subject correlations are high. Using analytic weights instead of a kernel function during local polynomial smoothing is also important when raw estimates have extreme values or when the association between the dependent and independent variables is nonlinear.
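A rough sketch of the third step — smoothing raw estimates with analytic (inverse-variance) weights in place of a kernel. The windowing rule and inputs here are illustrative assumptions, not the authors' exact weighting scheme:

```python
import numpy as np

def smooth_raw_estimates(t, beta, var_beta, t0, h, degree=1):
    """Locally weighted polynomial fit of raw estimates beta(t) at t0,
    weighting each raw estimate by its precision within bandwidth h."""
    near = np.abs(t - t0) <= h
    w = 1.0 / var_beta[near]                  # precision of each raw estimate
    X = np.vander(t[near] - t0, degree + 1)   # local polynomial basis
    W = np.diag(w)
    coef = np.linalg.solve(X.T @ W @ X, X.T @ W @ beta[near])
    return coef[-1]                           # intercept = smoothed value at t0

t = np.linspace(0, 1, 20)
beta = np.sin(2 * t) + np.random.default_rng(0).normal(0, 0.1, 20)
var_beta = np.full(20, 0.01)                  # hypothetical raw-estimate variances
print(smooth_raw_estimates(t, beta, var_beta, t0=0.5, h=0.2))
```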

16.
Correlation is inherent in longitudinal studies due to the repeated measurements on subjects, as well as due to time-dependent covariates in the study. In the National Longitudinal Study of Adolescent to Adult Health (Add Health), data were repeatedly collected on children in grades 7-12 across four waves; thus, observations obtained on the same adolescent were correlated, while predictors were correlated with current and future outcomes such as obesity status, among other health issues. Previous methods, such as the generalized method of moments (GMM) approach, have been proposed to estimate regression coefficients for time-dependent covariates. However, these approaches combined all valid moment conditions to produce an averaged parameter estimate for each covariate and thus assumed that the effect of each covariate on the response was constant across time. This assumption is not necessarily optimal in applications such as Add Health or other health-related data. We therefore depart from this assumption and instead use the Partitioned GMM approach to estimate multiple coefficients based on different time periods. These extra regression coefficients are obtained by partitioning the moment conditions pertaining to each respective relationship, offering deeper insight into the effect of each covariate on the response. We conduct simulation studies, as well as analyses of obesity in Add Health, rehospitalization in Medicare data, and depression scores in a clinical study. The Partitioned GMM methods exhibit benefits over previously proposed models, with improved insight into the nonconstant relationships realized when analyzing longitudinal data.

17.
Objective
Whether perceived job insecurity increases the risk of suicidal behaviors is unclear. Improved understanding in this area could inform efforts to reduce suicide risk among those experiencing elevated job insecurity during the COVID-19 pandemic as well as post-pandemic. We aimed to investigate whether perceived job insecurity predicted an increased risk of suicide mortality and suicide attempts.

Method
Employees (N=65 571), representative of the Swedish working population, who participated in the Swedish Work Environment Survey in 1991-2003 were followed up through 2016 in the National Inpatient and Death Registers. Suicide deaths and suicide attempts were defined according to International Classification of Diseases (ICD) 10 and ICD-8/9 codes for the underlying cause of death and in-/outpatient care. Job insecurity and the subsequent risk of suicide and suicide attempt were investigated with marginal structural Cox regression analyses and inverse probability of treatment weighting to control for confounding.

Results
Perceived job insecurity was associated with an elevated risk of suicide [hazard ratio (HR) 1.51, 95% confidence interval (CI) 1.03-2.20], but not with incident suicide attempts (HR 1.03, CI 0.86-1.24). Estimates remained similar after considering prevalent/previous poor mental health and other work factors, and when restricting the follow-up time to ten years.

Conclusion
The study suggests that job insecurity is associated with an increased risk of suicide mortality. Concerns about elevated job insecurity and suicide levels in the wake of the current pandemic could thus be considered in strategies to reduce the population health impact of job insecurity both during and following the COVID-19 pandemic.

18.
Joint effects of genetic and environmental factors have been increasingly recognized in the development of many complex human diseases. Despite the popularity of case-control and case-only designs, longitudinal cohort studies that can capture time-varying outcome and exposure information have long been recommended for gene-environment (G × E) interactions. To date, the literature on sampling designs for longitudinal studies of G × E interaction is quite limited. We therefore consider designs that can prioritize a subsample of an existing cohort for retrospective genotyping on the basis of currently available outcome, exposure, and covariate data. In this work, we propose stratified sampling based on summaries of individual exposures and outcome trajectories, and we develop a full conditional likelihood approach for estimation that adjusts for the biased sample. We compare the performance of our proposed design and analysis with combinations of different sampling designs and estimation approaches via simulation. We observe that the full conditional likelihood provides improved estimates of the G × E interaction and joint exposure effects over uncorrected complete-case analysis, and that the exposure-enriched, outcome-trajectory-dependent design outperforms other designs in terms of estimation efficiency and power for detecting the G × E interaction. We also illustrate our design and analysis using data from the Normative Aging Study, an ongoing longitudinal cohort study initiated by the Veterans Administration in 1963.

19.
Response-dependent two-phase designs are used increasingly often in epidemiological studies to ensure sampling strategies offer good statistical efficiency while working within resource constraints. Optimal response-dependent two-phase designs are difficult to implement, however, as they require specification of unknown parameters. We propose adaptive two-phase designs that exploit information from an internal pilot study to approximate the optimal sampling scheme for an analysis based on mean score estimating equations. The frequency properties of estimators arising from this design are assessed through simulation, and they are shown to be similar to those from optimal designs. The design procedure is then illustrated through application to a motivating biomarker study in an ongoing rheumatology research program.

20.
Objective: To longitudinally follow the neuropsychological development of low birth weight infants aged 0-12 months in the main urban area of Hefei, providing a reference for growth and development monitoring and early intervention. Methods: From September 2012 to September 2013, a birth cohort of low birth weight infants (LBWI) was established, enrolling 228 LBWI; 161 normal birth weight infants (NBWI) were selected as controls from the child health care system in the same area over the same period. All infants underwent developmental assessment at 6, 9, and 12 months of age. Results: At 6, 9, and 12 months, the Mental Development Index (79.4±16.2 vs 93.5±13.3; 85.6±11.7 vs 93.5±8.6; 79.7±13.3 vs 86.9±13.0) and Psychomotor Development Index (78.1±12.1 vs 88.9±9.1; 79.8±16.5 vs 92.3±13.6; 79.3±14.6 vs 90.8±13.8) of the LBWI group were significantly lower than those of the NBWI group (P<0.05). Within the first year, the incidence of developmental delay among LBWI ranged from 10.7% to 37.5%, and the incidence of developmental deviation from 12.7% to 29.8%, both significantly higher than among NBWI (below 10% and 15%, respectively; P<0.01). Catch-up in neuropsychological development among LBWI during infancy was not evident, with motor development particularly slow. Conclusion: Low birth weight adversely affects infant neuropsychological development; at one year of age, LBWI still do not reach the developmental level of normal birth weight children. Early intervention during infancy should be emphasized for LBWI, and their neuropsychological development should be followed and monitored after birth.
