Similar articles
 20 similar articles found (search time: 31 ms)
1.
The zero‐inflated negative binomial regression model (ZINB) is often employed in diverse fields such as dentistry, health care utilization, highway safety, and medicine to examine relationships between exposures of interest and overdispersed count outcomes exhibiting many zeros. The regression coefficients of ZINB have latent class interpretations for a susceptible subpopulation at risk for the disease/condition under study with counts generated from a negative binomial distribution and for a non‐susceptible subpopulation that provides only zero counts. The ZINB parameters, however, are not well‐suited for estimating overall exposure effects, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. In this paper, a marginalized zero‐inflated negative binomial regression (MZINB) model for independent responses is proposed to model the population marginal mean count directly, providing straightforward inference for overall exposure effects based on maximum likelihood estimation. Through simulation studies, the finite sample performance of MZINB is compared with marginalized zero‐inflated Poisson, Poisson, and negative binomial regression. The MZINB model is applied in the evaluation of a school‐based fluoride mouthrinse program on dental caries in 677 children. Copyright © 2015 John Wiley & Sons, Ltd.
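To illustrate why the latent-class ZINB coefficients do not directly give the overall exposure effect, here is a minimal Python sketch. It uses the standard mixture identity that the overall mean of a zero-inflated count is (1 - psi) * mu, where psi is the probability of the non-susceptible class and mu is the mean in the susceptible class. The names `psi`, `mu`, `alpha`, the NB2 variance form mu + alpha * mu^2, and all numeric values are assumptions made for illustration; none of them come from the paper.

```python
def zinb_marginal_mean(psi: float, mu: float) -> float:
    """Overall (marginal) mean of a zero-inflated count.

    psi: probability of the non-susceptible class (always zero counts).
    mu:  mean count in the susceptible (negative binomial) class.
    """
    if not 0.0 <= psi <= 1.0:
        raise ValueError("psi must lie in [0, 1]")
    return (1.0 - psi) * mu


def zinb_marginal_variance(psi: float, mu: float, alpha: float) -> float:
    """Overall variance, assuming an NB2 susceptible class
    with variance mu + alpha * mu**2 (an assumed parameterization)."""
    nb_var = mu + alpha * mu ** 2
    return (1.0 - psi) * nb_var + psi * (1.0 - psi) * mu ** 2


# Hypothetical unexposed and exposed groups.
psi0, mu0 = 0.4, 5.0   # unexposed: 40% non-susceptible, susceptible mean 5
psi1, mu1 = 0.5, 4.0   # exposed:   50% non-susceptible, susceptible mean 4

# Rate ratio within the susceptible class only (latent-class interpretation).
latent_ratio = mu1 / mu0

# Rate ratio of the overall population means (marginal interpretation).
marginal_ratio = zinb_marginal_mean(psi1, mu1) / zinb_marginal_mean(psi0, mu0)

print(f"latent-class rate ratio: {latent_ratio:.3f}")
print(f"marginal rate ratio:     {marginal_ratio:.3f}")
```

In this hypothetical setting the two ratios differ because exposure changes both the zero-inflation probability and the susceptible-class mean; a model that parameterizes the marginal mean directly reports the second quantity as a single coefficient.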

2.
A zero-truncated negative binomial mixed regression model is presented to analyse overdispersed positive count data. The study is motivated by the determination of pertinent risk factors associated with ischaemic stroke hospitalizations. Random effects are incorporated in the linear predictor to adjust for inter-hospital variations and the dependency of clustered observations using the generalized linear mixed model approach. The method helps hospital administrators and clinicians estimate the number of subsequent readmissions based on characteristics of the patient at the index stroke. The findings have important implications for resource usage, rehabilitation planning and management of acute stroke care.

3.
The negative binomial model has been increasingly used to model count data in recent clinical trials. It is frequently chosen over the Poisson model in cases of overdispersed count data, which are commonly seen in clinical trials. One of the challenges of applying the negative binomial model in clinical trial design is sample size estimation. In practice, simulation methods have been frequently used for sample size estimation. In this paper, an explicit formula is developed to calculate sample size based on the negative binomial model. Depending on the approach used to estimate the variance under the null hypothesis, three variations of the sample size formula are proposed and discussed. Important characteristics of the formula include its accuracy and its ability to explicitly incorporate the dispersion parameter and exposure time. The performance of the formula with each variation is assessed using simulations. Copyright © 2013 John Wiley & Sons, Ltd.

4.
Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity‐link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model‐fitting methods are often unable to cope with the constrained parameter space arising from the non‐negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation–conditional maximisation–either algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi‐parametric regression functions. We illustrate the method using a placebo‐controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd.

5.
When conducting a meta‐analysis of studies with bivariate binary outcomes, challenges arise when the within‐study correlation and between‐study heterogeneity should be taken into account. In this paper, we propose a marginal beta‐binomial model for the meta‐analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with existing models such as the bivariate generalized linear mixed model (Chu and Cole, 2006) and the Sarmanov beta‐binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having a closed‐form expression of the likelihood function, and no constraints on the correlation parameter. More importantly, because the marginal beta‐binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study‐specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta‐binomial model with the bivariate generalized linear mixed model and the Sarmanov beta‐binomial model by simulation studies. Interestingly, the results show that the marginal beta‐binomial model performs better than the Sarmanov beta‐binomial model, whether or not the true model is Sarmanov beta‐binomial, and the marginal beta‐binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta‐analyses of diagnostic accuracy studies and a meta‐analysis of case–control studies are conducted for illustration. Copyright © 2015 John Wiley & Sons, Ltd.

6.
Lee AH, Xiang L, Fung WK. Statistics in Medicine 2004; 23(17): 2757-2769
In many biomedical applications, count data have a large proportion of zeros and the zero-inflated Poisson regression (ZIP) model may be appropriate. A popular score test for zero-inflation, comparing the ZIP model to a standard Poisson regression model, was given by van den Broek. Similarly, for count data that exhibit extra zeros and are simultaneously overdispersed, a score test for testing the ZIP model against a zero-inflated negative binomial alternative was proposed by Ridout, Hinde and Demétrio. However, these test statistics are sensitive to anomalous cases in the data, and incorrect inferences concerning the choice of model may be drawn. In this paper, diagnostic measures are derived to assess the influence of observations on the score statistics. Two examples that motivated the application of zero-inflated regression models are considered to illustrate the importance of sensitivity analysis of the zero-inflation tests.

7.
Clustered overdispersed multivariate count data are challenging to model due to the presence of correlation within and between samples. Typically, the first source of correlation needs to be addressed but its quantification is of less interest. Here, we focus on the correlation between time points. In addition, the effects of covariates on the multivariate counts distribution need to be assessed. To fulfill these requirements, a regression model based on the Dirichlet-multinomial distribution for association between covariates and the categorical counts is extended by using random effects to deal with the additional clustering. This model is the Dirichlet-multinomial mixed regression model. Alternatively, a negative binomial regression mixed model can be deployed where the corresponding likelihood is conditioned on the total count. It appears that these two approaches are equivalent when the total count is fixed and independent of the random effects. We consider both subject-specific and categorical-specific random effects. However, the latter has a larger computational burden when the number of categories increases. Our work is motivated by microbiome data sets obtained by sequencing of the amplicon of the bacterial 16S rRNA gene. These data have a compositional structure and are typically overdispersed. The microbiome data set is from an epidemiological study carried out in a helminth-endemic area in Indonesia. The conclusions are as follows: time has no statistically significant effect on microbiome composition, the correlation between subjects is statistically significant, and treatment has a significant effect on the microbiome composition only in infected subjects who remained infected.

8.
We present a case study using the negative binomial regression model for discrete outcome data arising from a clinical trial designed to evaluate the effectiveness of a prehabilitation program in preventing functional decline among physically frail, community-living older persons. The primary outcome was a measure of disability at 7 months that had a range from 0 to 16 with a mean of 2.8 (variance of 16.4) and a median of 1. The data were right skewed with clumping at zero (i.e., 40% of subjects had no disability at 7 months). Because the variance was nearly 6 times greater than the mean, the negative binomial model provided an improved fit to the data and accounted better for overdispersion than the Poisson regression model, which assumes that the mean and variance are the same. Although correcting the variance and corresponding test statistics for overdispersion is a standard procedure in the Poisson model, the estimates of the regression parameters are inefficient because they have more sampling variability than is necessary. The negative binomial model provides an alternative approach for the analysis of discrete data where overdispersion is a problem, provided that the model is correctly specified and adequately fits the data.
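The overdispersion arithmetic in this abstract can be reproduced with a short Python sketch. It takes the reported mean (2.8) and variance (16.4), computes the variance-to-mean ratio, and, as an assumption not stated in the abstract, derives a crude method-of-moments dispersion parameter under the NB2 variance function mu + alpha * mu^2. The function names are hypothetical, and the calculation ignores covariates, so it is only a rough check and not the authors' fitted model.

```python
import math


def dispersion_ratio(mean: float, variance: float) -> float:
    """Variance-to-mean ratio; equals 1 under a Poisson model."""
    return variance / mean


def nb2_alpha_moments(mean: float, variance: float) -> float:
    """Method-of-moments alpha, assuming Var = mean + alpha * mean**2 (NB2)."""
    return (variance - mean) / mean ** 2


def poisson_p0(mean: float) -> float:
    """Probability of a zero count under a Poisson model."""
    return math.exp(-mean)


def nb2_p0(mean: float, alpha: float) -> float:
    """Probability of a zero count under an NB2 model."""
    return (1.0 + alpha * mean) ** (-1.0 / alpha)


mean, variance = 2.8, 16.4             # values reported in the abstract
ratio = dispersion_ratio(mean, variance)
alpha = nb2_alpha_moments(mean, variance)

print(f"variance / mean:           {ratio:.2f}")
print(f"moment estimate of alpha:  {alpha:.3f}")
print(f"P(Y = 0), Poisson:         {poisson_p0(mean):.3f}")
print(f"P(Y = 0), NB2 (moments):   {nb2_p0(mean, alpha):.3f}")
```

The ratio of roughly 5.9 matches the "nearly 6 times" statement, and comparing the two zero probabilities with the reported 40% of subjects at zero shows why a Poisson model with this mean would badly underpredict the clumping at zero.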

9.
In this article, we present a general procedure to analyze exchangeable binary data that may also be viewed as realizations of binomial mixtures. Our approach unifies existing models and is practical and computationally easy. Resulting from completely monotonic functions, we introduce a rich family of parametric parsimonious binomial mixtures, including the incomplete Beta‐, Gamma‐, Normal‐, and Poisson‐binomial, generalizing the Beta‐binomial. We show that the family is closed under convex linear combinations, products, and composites. We also give the moments and the Markov property of the family. With such distributions, we can perform statistical inference on correlated binary data and, in particular, overdispersed data. We propose a regression procedure that generalizes logistic regression. We provide a forward model selection procedure. We run a small simulation to validate the inclusion of the binomial distribution. Finally, we apply the proposed procedure to analyze the 2,4,5‐trichlorophenoxyacetic acid and E2 data and compare the results with existing procedures. Copyright © 2009 John Wiley & Sons, Ltd.

10.
Count data with extra zeros are common in many medical applications. The zero-inflated Poisson (ZIP) regression model is useful to analyse such data. For hierarchical or correlated count data where the observations are either clustered or represent repeated outcomes from individual subjects, a class of ZIP mixed regression models may be appropriate. However, the ZIP parameter estimates can be severely biased if the non-zero counts are overdispersed in relation to the Poisson distribution. In this paper, a score test is proposed for testing the ZIP mixed regression model against the zero-inflated negative binomial alternative. Sampling distribution and power of the test statistic are evaluated by simulation studies. The results show that the test statistic performs satisfactorily under a wide range of conditions. The test procedure is applied to pancreas disorder length of stay that comprised mainly same-day separations and simultaneous prolonged hospitalizations.

11.
Motivated by the analysis of quality of life data from a clinical trial on early breast cancer, we propose in this paper a generalized partially linear mean‐covariance regression model for longitudinal proportional data, which are bounded in a closed interval. Cholesky decomposition of the covariance matrix for within‐subject responses and generalized estimation equations are used to estimate unknown parameters and the nonlinear function in the model. Simulation studies are performed to evaluate the performance of the proposed estimation procedures. Our new model is also applied to analyze the data from the cancer clinical trial that motivated this research. In comparison with available models in the literature, the proposed model does not require specific parametric assumptions on the density function of the longitudinal responses and the probability function of the boundary values and can capture dynamic changes of time or other variables of interest on both mean and covariance of the correlated proportional responses. Copyright © 2017 John Wiley & Sons, Ltd.

12.
We introduce a semi‐parametric approach to ecological regression for disease mapping, based on modelling the regression M‐quantiles of a negative binomial variable. The proposed method is robust to outliers in the model covariates, including those due to measurement error, and can account for both spatial heterogeneity and spatial clustering. A simulation experiment based on the well‐known Scottish lip cancer data set is used to compare the M‐quantile modelling approach with a disease mapping approach based on a random effects model. This suggests that the M‐quantile approach leads to predicted relative risks with smaller root mean square error. The paper concludes with an illustrative application of the M‐quantile approach, mapping low birth weight incidence data for English Local Authority Districts for the years 2005–2010. Copyright © 2014 John Wiley & Sons, Ltd.

13.
Zero‐inflated Poisson (ZIP) and negative binomial (ZINB) models are widely used to model zero‐inflated count responses. These models extend the Poisson and negative binomial (NB) to address excessive zeros in the count response. By adding a degenerate distribution centered at 0 and interpreting it as describing a non‐risk group in the population, the ZIP (ZINB) models a two‐component population mixture. As in applications of Poisson and NB, the key difference between ZIP and ZINB is the allowance for overdispersion by the ZINB in its NB component in modeling the count response for the at‐risk group. Overdispersion arising in practice too often does not follow the NB, and applications of ZINB to such data yield invalid inference. If sources of overdispersion are known, other parametric models may be used to directly model the overdispersion. Such models too are subject to assumed distributions. Further, this approach may not be applicable if information about the sources of overdispersion is unavailable. In this paper, we propose a distribution‐free alternative and compare its performance with these popular parametric models as well as a moment‐based approach proposed by Yu et al. [Statistics in Medicine 2013; 32: 2390–2405]. Like the generalized estimating equations, the proposed approach requires no elaborate distribution assumptions. Compared with the approach of Yu et al., it is more robust to overdispersed zero‐inflated responses. We illustrate our approach with both simulated and real study data. Copyright © 2015 John Wiley & Sons, Ltd.

14.
In many practical applications, count data often exhibit greater or less variability than allowed by the equality of mean and variance, referred to as overdispersion/underdispersion, and there are several reasons that may lead to the overdispersion/underdispersion such as zero inflation and mixture. Moreover, if the count data are distributed as a generalized Poisson or a negative binomial distribution that accommodates extra variation not explained by a simple Poisson or a binomial model, then the dispersion occurs too. In this paper, we deal with a class of two‐component zero‐inflated generalized Poisson mixture regression models to fit such data and propose a local influence measure procedure for model comparison and statistical diagnostics. First, we formally develop a general model framework that unifies zero inflation, mixture, and overdispersion/underdispersion simultaneously, and then we mainly investigate two types of perturbation schemes, the global and individual perturbation schemes, for perturbing various model assumptions and detecting influential observations. Also, we obtain the corresponding local influence measures. Our method is novel for count data analysis and can be used to explore essential issues such as zero inflation, mixture, and dispersion related to zero‐inflated generalized Poisson mixture models. On the basis of the results of model comparison, we could further conduct the sensitivity analysis of perturbation as well as hypothesis tests with more accuracy. Finally, we employ a simulation study and a real example to illustrate the proposed local influence measures. Copyright © 2012 John Wiley & Sons, Ltd.

15.
Investigators in clinical research are often interested in determining the association between patient characteristics and post-operative length of stay (LOS). We examined the relative performance of seven different statistical strategies for analyzing LOS in a cohort of patients undergoing CABG surgery. We compared linear regression; linear regression with log-transformed length of stay; generalized linear models with the following distributions: Poisson, negative binomial, normal, and gamma; and semi-parametric survival models. Nine of twenty patient characteristics were found to be significantly associated with increased LOS in all models. The models disagreed upon the statistical significance of the association between the remaining patient characteristics and increased LOS. Generalized linear models with Poisson, negative binomial, and gamma distributions, and the Cox regression model demonstrated the greatest consistency. With the exception of Cox regression, all models had similar ability to predict length of stay in the actual data. However, the generalized linear models tended to have marginally lower prediction error than the linear models. Using four measures of prediction error, Cox regression had substantially higher prediction error than the other models. Generalized linear models were best able to predict patient length of stay in Monte Carlo simulations that were performed. Researchers should consider generalized linear models with normal, Poisson, or negative binomial distributions for predicting length of stay following CABG surgery. Post-operative length of stay is a complex phenomenon that is difficult to incorporate into a simple parametric model due to a small proportion of patients having very long lengths of stay.

16.
A robust likelihood approach for the analysis of overdispersed correlated count data that takes into account cluster varying covariates is proposed. We emphasise two characteristics of the proposed method: that the correlation structure satisfies the constraints on the second moments and that the estimation of the correlation structure guarantees consistent estimates of the regression coefficients. In addition, we extend the mean specification to include within- and between-cluster effects. The method is illustrated through the analysis of data from two studies. In the first study, cross-sectional count data from a randomised controlled trial are analysed to evaluate the efficacy of a communication skills training programme. The second study involves longitudinal count data which represent counts of damaged hand joints in patients with psoriatic arthritis. Motivated by this study, we generalize our model to accommodate a subpopulation of patients who are not susceptible to the development of damaged hand joints.

17.
Many different methods have been proposed for the analysis of cluster randomized trials (CRTs) over the last 30 years. However, the evaluation of methods on overdispersed count data has been based mostly on the comparison of results using empiric data; i.e. when the true model parameters are not known. In this study, we assess via simulation the performance of five methods for the analysis of counts in situations similar to real community‐intervention trials. We used the negative binomial distribution to simulate overdispersed counts of CRTs with two study arms, allowing the period of time under observation to vary among individuals. We assessed different sample sizes, degrees of clustering and degrees of cluster‐size imbalance. The compared methods are: (i) the two‐sample t‐test of cluster‐level rates, (ii) generalized estimating equations (GEE) with empirical covariance estimators, (iii) GEE with model‐based covariance estimators, (iv) generalized linear mixed models (GLMM) and (v) Bayesian hierarchical models (Bayes‐HM). Variation in sample size and clustering led to differences between the methods in terms of coverage, significance, power and random‐effects estimation. GLMM and Bayes‐HM performed better in general with Bayes‐HM producing less dispersed results for random‐effects estimates although upward biased when clustering was low. GEE showed higher power but anticonservative coverage and elevated type I error rates. Imbalance affected the overall performance of the cluster‐level t‐test and the GEE's coverage in small samples. Important effects arising from accounting for overdispersion are illustrated through the analysis of a community‐intervention trial on Solar Water Disinfection in rural Bolivia. Copyright © 2009 John Wiley & Sons, Ltd.

18.
Longitudinal measurement of biomarkers is important in determining risk factors for binary endpoints such as infection or disease. However, biomarkers are subject to measurement error, and some are also subject to left‐censoring due to a lower limit of detection. Statistical methods to address these issues are few. We herein propose a generalized linear mixed model and estimate the model parameters using the Monte Carlo Newton‐Raphson (MCNR) method. Inferences regarding the parameters are made by applying Louis's method and the delta method. Simulation studies were conducted to compare the proposed MCNR method with existing methods including the maximum likelihood (ML) method and the ad hoc approach of replacing the left‐censored values with half of the detection limit (HDL). The results showed that the performance of the MCNR method is superior to ML and HDL with respect to the empirical standard error, as well as the coverage probability for the 95% confidence interval. The HDL method uses an incorrect imputation method, and its computation is constrained by the number of quadrature points. The ML method also suffers from the constraint on the number of quadrature points, whereas the MCNR method does not have this limitation and approximates the likelihood function better than the other methods. The improvement of the MCNR method is further illustrated with real‐world data from a longitudinal study of local cervicovaginal HIV viral load and its effects on oncogenic HPV detection in HIV‐positive women.

19.
The generalized odds‐rate model is a class of semiparametric regression models, which includes the proportional hazards and proportional odds models as special cases. There are few works on estimation of the generalized odds‐rate model with interval censored data because of the challenges in maximizing the complex likelihood function. In this paper, we propose a gamma‐Poisson data augmentation approach to develop an Expectation Maximization algorithm, which can be used to fit the generalized odds‐rate model to interval censored data. The proposed Expectation Maximization algorithm is easy to implement and is computationally efficient. The performance of the proposed method is evaluated by comprehensive simulation studies and illustrated through applications to datasets from breast cancer and hemophilia studies. In order to make the proposed method easy to use in practice, an R package ‘ICGOR’ was developed. Copyright © 2016 John Wiley & Sons, Ltd.

20.
Yee TW. Statistics in Medicine 2004; 23(14): 2295-2315
One of the most popular methods for quantile regression is the LMS method of Cole and Green. The method naturally falls within a penalized likelihood framework, and consequently allows for considerable flexibility because all three parameters may be modelled by cubic smoothing splines. The model is also very understandable: for a given value of the covariate, the LMS method applies a Box-Cox transformation to the response in order to transform it to standard normality; to obtain the quantiles, an inverse Box-Cox transformation is applied to the quantiles of the standard normal distribution. The purposes of this article are three-fold. Firstly, LMS quantile regression is presented within the framework of the class of vector generalized additive models. This confers a number of advantages such as a unifying theory and estimation process. Secondly, a new LMS method based on the Yeo-Johnson transformation is proposed, which has the advantage that the response is not restricted to be positive. Lastly, this paper describes a software implementation of three LMS quantile regression methods in the S language. This includes the LMS-Yeo-Johnson method, which is estimated efficiently by a new numerical integration scheme. The LMS-Yeo-Johnson method is illustrated by way of a large cross-sectional data set from a New Zealand working population.
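The transform-then-invert step described in this abstract can be sketched in a few lines of Python. The snippet below uses the classic Box-Cox form of the LMS method, in which z = ((y / mu)^lam - 1) / (lam * sigma) is treated as standard normal, so a quantile is recovered as mu * (1 + lam * sigma * z_p)^(1 / lam). The function names and the parameter values are hypothetical; in the actual method lam, mu and sigma are smooth functions of the covariate, whereas here they are fixed numbers for a single covariate value. The Yeo-Johnson variant proposed in the paper is not shown.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def lms_z(y: float, lam: float, mu: float, sigma: float) -> float:
    """Box-Cox LMS z-score of a positive response y."""
    if y <= 0:
        raise ValueError("the Box-Cox LMS form requires y > 0")
    if abs(lam) < 1e-12:                      # limiting case lam -> 0
        return math.log(y / mu) / sigma
    return ((y / mu) ** lam - 1.0) / (lam * sigma)


def lms_quantile(p: float, lam: float, mu: float, sigma: float) -> float:
    """p-th quantile of the response: inverse Box-Cox of the normal quantile."""
    z = _STD_NORMAL.inv_cdf(p)
    if abs(lam) < 1e-12:                      # limiting case lam -> 0
        return mu * math.exp(sigma * z)
    base = 1.0 + lam * sigma * z
    if base <= 0:
        raise ValueError("quantile undefined for these parameter values")
    return mu * base ** (1.0 / lam)


# Hypothetical parameter values at one covariate value (not from the paper).
lam, mu, sigma = 0.5, 20.0, 0.1

for p in (0.05, 0.5, 0.95):
    print(f"p = {p:.2f}: quantile = {lms_quantile(p, lam, mu, sigma):.3f}")
```

Because the transformation is monotone, the median of the response is always mu, and applying `lms_z` to a computed quantile returns the corresponding standard normal quantile, which is a convenient sanity check on any implementation.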
