Similar Articles
20 similar articles found (search time: 31 ms)
1.
Health-care providers in the UK and elsewhere are required to maintain records of incidents relating to patient safety, including the date and time of each incident. However, for reporting and analysis, the resulting data are typically grouped into discrete time intervals, for example, weekly or monthly counts. The grouping represents a potential loss of information for estimating variations in incidence over time. We use a Poisson point process model to quantify this loss of information. We also suggest some diagnostic procedures for checking the goodness of fit of the Poisson model. Finally, we apply the model to data on hospital-acquired methicillin-resistant Staphylococcus aureus infections in two hospitals in the north of England. We find that, in one of the hospitals, the estimated incidence decreased by a factor of approximately 3.3 over a 7-year period from 0.323 to 0.097 cases per day per 1000 beds, whereas in the other, the estimated incidence showed only a small and nonsignificant decrease over the same period from 0.137 to 0.131.
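As a sketch of the kind of log-linear Poisson trend fit described above, the snippet below fits log lambda(t) = b0 + b1*t by Newton-Raphson to synthetic daily counts whose rate declines over seven years; the data, rates, and starting values are invented for illustration and are not taken from the paper.

```python
import numpy as np

def fit_poisson_trend(t, y, iters=25):
    """Newton-Raphson for a Poisson GLM with log link: log E[y] = b0 + b1*t."""
    X = np.column_stack([np.ones_like(t), t])
    beta = np.zeros(2)
    for _ in range(iters):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu)           # score vector
        hess = X.T @ (X * mu[:, None])  # Fisher information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

# Invented example: ~7 years of daily counts with a declining rate.
rng = np.random.default_rng(0)
t = np.arange(7 * 365) / 365.0              # elapsed time in years
y = rng.poisson(np.exp(-1.13 - 0.17 * t))   # rate ~0.32/day falling to ~0.10/day
b0, b1 = fit_poisson_trend(t, y)            # b1 should be clearly negative
```

Fitting at the daily resolution, as here, uses the exact event dates; grouping into weekly or monthly bins before fitting is what the paper quantifies as a loss of information.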

2.
We use a hierarchical model for a meta-analysis that combines information from autopsy studies of adenoma prevalence and counts. The studies we included reported findings using a variety of adenoma prevalence groupings and age categories. We use a non-homogeneous Poisson model for multinomial bin probabilities. The Poisson model allows risk to depend on age and sex, and incorporates extra-Poisson variability. We evaluate model fit using the posterior predicted distribution of adenoma prevalence reported by the studies included in our analyses and validate our model using adenoma prevalence reported by more recent colonoscopy studies. For 1990, the estimated adenoma prevalence among Americans at age 60 is 40.3 per cent for men compared to 29.2 per cent for women.

3.
Zero-inflated Poisson regression is a popular tool for analyzing data with excessive zeros. Although much work has been done on fitting zero-inflated data, most models depend heavily on special features of the individual data; specifically, a sizable group of respondents endorse the same answer, producing peaks in the data. In this paper, we propose a new model with the flexibility to handle excessive counts other than zero. The model is a mixture of multinomial logistic and Poisson regression, in which the multinomial logistic component models the occurrence of the excessive counts, namely zeros, K (where K is a positive integer), and all other values, and the Poisson regression component models the counts assumed to follow a Poisson distribution. Two examples illustrate the models on data whose counts contain many ones and sixes. In both, the zero-inflated and K-inflated models exhibit a better fit than zero-inflated Poisson and standard Poisson regression. Copyright © 2012 John Wiley & Sons, Ltd.
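The mixture described above is easy to state for fixed mixing weights. The sketch below writes out the marginal probability mass function of a {0, K, Poisson} mixture; the fixed probabilities p0 and pk stand in for the paper's covariate-dependent multinomial logistic component, and all numbers are invented.

```python
import math

def k_inflated_pmf(y, lam, p0, pk, K):
    """P(Y = y) for a mixture of point masses at 0 and K with a Poisson(lam);
    p0 and pk are fixed inflation weights here (covariate-driven in the paper)."""
    pois = math.exp(-lam) * lam ** y / math.factorial(y)
    return p0 * (y == 0) + pk * (y == K) + (1.0 - p0 - pk) * pois

# With lam=2, p0=0.1, pk=0.2, K=6, sixes are far likelier than under a plain Poisson.
mass = sum(k_inflated_pmf(y, 2.0, 0.1, 0.2, 6) for y in range(61))
```

Setting pk = 0 recovers the ordinary zero-inflated Poisson, which is why the authors describe their model as a generalization of it.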

4.
We propose functional linear models for zero-inflated count data with a focus on the functional hurdle and functional zero-inflated Poisson (ZIP) models. The hurdle model assumes the counts come from a mixture of a degenerate distribution at zero and a zero-truncated Poisson distribution, whereas the ZIP model considers a mixture of a degenerate distribution at zero and a standard Poisson distribution. We extend the generalized functional linear model framework with a functional predictor and multiple cross-sectional predictors to model counts generated by a mixture distribution. We propose an estimation procedure for functional hurdle and ZIP models, called penalized reconstruction, geared towards error-prone and sparsely observed longitudinal functional predictors. The approach relies on dimension reduction and pooling of information across subjects, involving basis expansions and penalized maximum likelihood techniques. The developed functional hurdle model is applied to modeling hospitalizations within the first 2 years from initiation of dialysis, with a high percentage of zeros, in the Comprehensive Dialysis Study participants. Hospitalization counts are modeled as a function of sparse longitudinal measurements of serum albumin concentrations, patient demographics, and comorbidities. Simulation studies are used to study finite sample properties of the proposed method and include comparisons with an adaptation of standard principal components regression. Copyright © 2014 John Wiley & Sons, Ltd.
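The structural difference between the two mixtures can be made concrete in a few lines. This is a minimal sketch with scalar parameters only (no functional predictors): the ZIP mixes a point mass at zero with a full Poisson, so zeros arise in two ways, while the hurdle pins P(Y = 0) to the mixing weight exactly.

```python
import math

def _pois(y, lam):
    return math.exp(-lam) * lam ** y / math.factorial(y)

def zip_pmf(y, pi, lam):
    """ZIP: point mass at zero mixed with a full Poisson (which also emits zeros)."""
    return pi * (y == 0) + (1.0 - pi) * _pois(y, lam)

def hurdle_pmf(y, pi, lam):
    """Hurdle: P(0) = pi exactly; positives follow a zero-truncated Poisson."""
    if y == 0:
        return pi
    return (1.0 - pi) * _pois(y, lam) / (1.0 - math.exp(-lam))

z0 = zip_pmf(0, 0.3, 2.0)     # 0.3 + 0.7 * exp(-2), strictly above 0.3
h0 = hurdle_pmf(0, 0.3, 2.0)  # exactly 0.3
```

This is why the hurdle parameterization is often preferred when the zero probability itself is of direct interest.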

5.
An adaptive approach to Poisson regression modelling is presented for analysing event data from electronic devices monitoring medication-taking. The emphasis is on applying this approach to data for individual subjects although it also applies to data for multiple subjects. This approach provides for visualization of adherence patterns as well as for objective comparison of actual device use with prescribed medication-taking. Example analyses are presented using data on openings of electronic pill bottle caps monitoring adherence of subjects with HIV undergoing highly active antiretroviral therapies. The modelling approach consists of partitioning the observation period, computing grouped event counts/rates for intervals in this partition, and modelling these event counts/rates in terms of elapsed time after entry into the study using Poisson regression. These models are based on adaptively selected sets of power transforms of elapsed time determined by rule-based heuristic search through arbitrary sets of parametric models, thereby effectively generating a smooth non-parametric regression fit to the data. Models are compared using k-fold likelihood cross-validation.
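A toy version of the adaptive selection idea, assuming a single power transform chosen from a small candidate set by k-fold cross-validated log-likelihood (the paper's rule-based search over sets of transforms is far richer); the data and candidate powers are invented.

```python
import numpy as np

def fit_glm(X, y, iters=25):
    """Poisson GLM (log link): least-squares start on log(y+0.5), then Newton."""
    beta = np.linalg.lstsq(X, np.log(y + 0.5), rcond=None)[0]
    for _ in range(iters):
        mu = np.exp(X @ beta)
        beta = beta + np.linalg.solve(X.T @ (X * mu[:, None]), X.T @ (y - mu))
    return beta

def cv_loglik(t, y, p, k=5):
    """k-fold cross-validated Poisson log-likelihood for log-rate b0 + b1*t**p."""
    X = np.column_stack([np.ones_like(t), t ** p])
    fold = np.arange(len(t)) % k
    ll = 0.0
    for f in range(k):
        beta = fit_glm(X[fold != f], y[fold != f])
        mu = np.exp(X[fold == f] @ beta)
        ll += np.sum(y[fold == f] * np.log(mu) - mu)  # up to the log(y!) constant
    return ll

# Invented data whose true transform is t**0.5.
rng = np.random.default_rng(1)
t = np.linspace(0.1, 10.0, 400)
y = rng.poisson(np.exp(0.5 + 0.8 * np.sqrt(t)))
powers = [0.25, 0.5, 1.0, 2.0]
best = max(powers, key=lambda p: cv_loglik(t, y, p))  # expected to favour 0.5
```

Held-out likelihood, rather than in-sample fit, is what keeps the adaptively chosen transform from overfitting, which mirrors the paper's use of k-fold likelihood cross-validation.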

6.
BACKGROUND: Numerous studies have reported associations between fine particulate and sulfur oxide air pollution and human mortality. Yet there continues to be concern that public policy efforts to improve air quality may not produce actual improvement in human health. OBJECTIVES: This study retrospectively explored a natural experiment associated with a copper smelter strike from 15 July 1967 through the beginning of April 1968. METHODS: In the 1960s, copper smelters accounted for approximately 90% of all sulfate emissions in the four Southwest states of New Mexico, Arizona, Utah, and Nevada. Over the 8.5-month strike period, a regional improvement in visibility accompanied an approximately 60% decrease in concentrations of suspended sulfate particles. We collected monthly mortality counts for 1960-1975 and analyzed them using Poisson regression models. RESULTS: The strike-related estimated percent decrease in mortality was 2.5% (95% confidence interval, 1.1-4.0%), based on a Poisson regression model that controlled for time trends, mortality counts in bordering states, and nationwide mortality counts for influenza/pneumonia, cardiovascular, and other respiratory deaths. CONCLUSIONS: These results contribute to the growing body of evidence that ambient sulfate particulate matter and related air pollutants are adversely associated with human health and that the reduction in this pollution can result in reduced mortality.

7.
PURPOSE: Few studies have examined the relationship between weather variables and cryptosporidiosis in Australia. This paper examines the potential impact of weather variability on the transmission of cryptosporidiosis and explores the possibility of developing an empirical forecast system. METHODS: Data on weather variables, notified cryptosporidiosis cases, and population size in Brisbane were supplied by the Australian Bureau of Meteorology, the Queensland Department of Health, and the Australian Bureau of Statistics, respectively, for the period January 1, 1996 to December 31, 2004. Time series Poisson regression and seasonal autoregressive integrated moving average (SARIMA) models were used to examine the potential impact of weather variability on the transmission of cryptosporidiosis. RESULTS: Both the time series Poisson regression and SARIMA models show that seasonal and monthly maximum temperature, at prior moving averages of 1 and 3 months, were significantly associated with cryptosporidiosis. The models suggest there may be about 50 more cases a year for each 1°C increase in average maximum temperature in Brisbane. Model assessments indicated that the SARIMA model had better predictive ability than the Poisson regression model (SARIMA: root mean square error (RMSE): 0.40, Akaike information criterion (AIC): -12.53; Poisson regression: RMSE: 0.54, AIC: -2.84). Furthermore, the analysis of residuals shows that the time series Poisson regression appeared to violate a modeling assumption, in that residual autocorrelation persisted. CONCLUSIONS: The results of this study suggest that weather variability (particularly maximum temperature) may have played a significant role in the transmission of cryptosporidiosis, and that a SARIMA model may be a better predictive model than Poisson regression for assessing the relationship between weather variability and the incidence of cryptosporidiosis.

8.
Over the past two decades, data mining methods for signal detection have been developed for drug safety surveillance, using large post-market safety data. Several of these methods assume that the number of reports for each drug–adverse event combination is a Poisson random variable with mean proportional to the unknown reporting rate of the drug–adverse event pair. Here, a Bayesian method based on the Poisson–Dirichlet process (DP) model is proposed for signal detection from large databases, such as the Food and Drug Administration's Adverse Event Reporting System (AERS) database. Instead of using a parametric distribution as a common prior for the reporting rates, as is the case with existing Bayesian or empirical Bayesian methods, a nonparametric prior, namely the DP, is used. The precision parameter and the baseline distribution of the DP, which characterize the process, are modeled hierarchically. The performance of the Poisson–DP model is compared with that of several other models through an intensive simulation study using Bayesian model selection and frequentist performance characteristics such as type-I error, false discovery rate, sensitivity, and power. For illustration, the proposed model and its extension to handle a large number of zero counts are used to analyze statin drugs for signals using the 2006–2011 AERS data. Copyright © 2015 John Wiley & Sons, Ltd.
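For context, the Poisson assumption above is often introduced via the simplest frequentist baseline in this literature: the relative reporting ratio, the observed count for a drug-event pair divided by its expected count under row/column independence. The sketch below shows that baseline only, not the paper's Poisson-DP model, and the 3x3 table is invented.

```python
import numpy as np

# Invented drug x adverse-event report counts (rows: drugs, columns: events).
n = np.array([[20.0,  5.0,  75.0],
              [ 3.0, 40.0,  57.0],
              [10.0, 12.0, 178.0]])

# Expected count under independence of rows and columns: E_ij = n_i. * n_.j / n..
E = np.outer(n.sum(axis=1), n.sum(axis=0)) / n.sum()

# Relative reporting ratio; pairs with RR well above 1 are candidate signals.
RR = n / E
```

Bayesian and empirical Bayesian methods, including the Poisson-DP approach, shrink these raw ratios toward a prior, which stabilizes pairs with very small expected counts.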

9.
Count data often arise in biomedical studies, and the observed counts may exhibit excessive zeros. The zero-inflated Poisson model provides a natural approach to accounting for the excess zero counts. In the semiparametric framework, we propose a generalized partially linear single-index model for the mean of the Poisson component, the probability of zero, or both. We develop the estimation and inference procedure via a profile maximum likelihood method. Under some mild conditions, we establish the asymptotic properties of the profile likelihood estimators. The finite sample performance of the proposed method is demonstrated by simulation studies, and the new model is illustrated with a medical care dataset. Copyright © 2014 John Wiley & Sons, Ltd.

10.
In recent years, the availability of infectious disease counts in time and space has increased, and consequently, there has been renewed interest in model formulation for such data. In this paper, we describe a model that was motivated by the need to analyze hand, foot, and mouth disease surveillance data in China. The data are aggregated by geographical areas and by week, with the aims of the analysis being to gain insight into the space–time dynamics and to make short-term predictions, which will aid in the implementation of public health campaigns in those areas with a large predicted disease burden. The model we develop decomposes disease-risk into marginal spatial and temporal components and a space–time interaction piece. The latter is the crucial element, and we use a tensor product spline model with a Markov random field prior on the coefficients of the basis functions. The model can be formulated as a Gaussian Markov random field and so fast computation can be carried out using the integrated nested Laplace approximation approach. A simulation study shows that the model can pick up complex space–time structure and our analysis of hand, foot, and mouth disease data in the central north region of China provides new insights into the dynamics of the disease. Copyright © 2015 John Wiley & Sons, Ltd.

11.
Monthly counts of medical visits across several years for persons identified as having alcoholism problems are modeled using two-state hidden Markov models (HMMs) in order to describe the effect of alcoholism treatment on the likelihood that a person is in a 'healthy' or 'unhealthy' state. The medical visits can be classified into different types, leading to multivariate counts of medical visits each month. A multiple-indicator HMM is introduced, which simultaneously fits the multivariate Poisson counts by assuming a shared hidden state underlying all of them, borrowing information across the different types of medical encounters. A univariate HMM based on the total count across types of medical visits each month is also considered. Comparisons are made between the multiple-indicator HMM and the total-count HMM, as well as with more traditional longitudinal models that directly model the counts. A Bayesian framework is used for estimation of the HMMs, implemented in WinBUGS.

12.
The zero-inflated Poisson (ZIP) regression model is often employed in public health research to examine the relationships between exposures of interest and a count outcome exhibiting many zeros, in excess of the amount expected under sampling from a Poisson distribution. The regression coefficients of the ZIP model have latent class interpretations, which correspond to a susceptible subpopulation at risk for the condition with counts generated from a Poisson distribution and a non-susceptible subpopulation that provides the extra or excess zeros. The ZIP model parameters, however, are not well suited for inference targeted at marginal means, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. We develop a marginalized ZIP model approach for independent responses to model the population mean count directly, allowing straightforward inference for overall exposure effects and empirical robust variance estimation for overall log-incidence density ratios. Through simulation studies, the performance of maximum likelihood estimation of the marginalized ZIP model is assessed and compared with other methods of estimating overall exposure effects. The marginalized ZIP model is applied to a recent study of a motivational interviewing-based safer sex counseling intervention, designed to reduce unprotected sexual act counts. Copyright © 2014 John Wiley & Sons, Ltd.
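The marginalization rests on a simple identity: the overall (mixture) mean of a ZIP outcome is nu = (1 - pi) * lambda, so modelling log nu directly yields exposure effects on the whole population, with the latent Poisson mean recoverable as nu / (1 - pi). A quick numeric sketch with invented parameters:

```python
import numpy as np

pi_, lam = 0.4, 3.0        # invented: 40% structural zeros, latent Poisson mean 3
nu = (1 - pi_) * lam       # marginal (whole-population) mean: 1.8

rng = np.random.default_rng(2)
n = 200_000
structural = rng.random(n) < pi_
y = np.where(structural, 0, rng.poisson(lam, size=n))

empirical_mean = y.mean()  # close to nu
lam_back = nu / (1 - pi_)  # recovers the latent Poisson mean, 3.0
```

A covariate effect on log nu is therefore interpretable as an overall incidence density ratio, which is the quantity the standard ZIP coefficients do not directly provide.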

13.
The statistical analysis of spatially correlated data has recently become an important research topic. Analysis of the mortality or morbidity rates observed in different areas may help determine whether people living in certain locations are at higher risk than others. Once the statistical model for the data of interest has been chosen, further effort can be devoted to identifying the areas at higher risk. Many scientists, including statisticians, have used the conditional autoregressive (CAR) model to describe the spatial autocorrelation among the observed data. This model has a greater smoothing effect than exchangeable models, such as the Poisson gamma model for spatial data. This paper focuses on comparing the two types of models using the index LG, the ratio of local to global variability. Two applications, Taiwan asthma mortality and Scotland lip cancer, are considered and the use of LG is illustrated. The estimated values for both data sets are small, implying that a Poisson gamma model may be favoured over the CAR model. We discuss the implications for the two applications respectively. To evaluate the performance of the index LG, we also compute the Bayes factor, a Bayesian model selection criterion, to see which model is preferred for the two applications and for simulated data. To derive the value of LG, we estimate its posterior mode based on samples from the BUGS program, while for the Bayes factor we use the double Laplace-Metropolis method, the Schwarz criterion, and a modified harmonic mean for approximations. The results for LG and the Bayes factor are consistent. We conclude that LG is fairly accurate as an index for selecting between the Poisson gamma and CAR models. When easy and fast computation is of concern, we recommend LG as a first, less costly index.

14.
The zero-inflated negative binomial regression model (ZINB) is often employed in diverse fields such as dentistry, health care utilization, highway safety, and medicine to examine relationships between exposures of interest and overdispersed count outcomes exhibiting many zeros. The regression coefficients of ZINB have latent class interpretations for a susceptible subpopulation at risk for the disease/condition under study with counts generated from a negative binomial distribution and for a non-susceptible subpopulation that provides only zero counts. The ZINB parameters, however, are not well-suited for estimating overall exposure effects, specifically, in quantifying the effect of an explanatory variable in the overall mixture population. In this paper, a marginalized zero-inflated negative binomial regression (MZINB) model for independent responses is proposed to model the population marginal mean count directly, providing straightforward inference for overall exposure effects based on maximum likelihood estimation. Through simulation studies, the finite sample performance of MZINB is compared with marginalized zero-inflated Poisson, Poisson, and negative binomial regression. The MZINB model is applied in the evaluation of a school-based fluoride mouthrinse program on dental caries in 677 children. Copyright © 2015 John Wiley & Sons, Ltd.

15.
Temporal trends in organ donor harvesting rates are subject to variability. It is important to detect variations as early as possible using current data. We developed a predictive model for monitoring harvesting activity using the number of donors harvested monthly between 1996 and 2001. A Poisson model was used to predict the number of donors harvested each month along with their confidence intervals. This model also updates, on a monthly basis, the predicted number of donors for the current year. During 2002, the number of donors observed each month followed the predicted monthly variations, but a significant increase was observed in March and May. These models can be used by transplantation agencies for monitoring purposes and for the evaluation of organ donation programmes.

16.
An important topic when estimating the effect of air pollutants on human health is choosing the best method to control for seasonal patterns and time-varying confounders, such as temperature and humidity. Semi-parametric Poisson time-series models include smooth functions of calendar time and weather effects to control for potential confounders. Case-crossover (CC) approaches are considered efficient alternatives that control seasonal confounding by design and allow inclusion of smooth functions of weather confounders through their equivalent Poisson representations. We evaluate both methodological designs with respect to seasonal control and compare spline-based approaches, using natural splines and penalized splines, and two time-stratified CC approaches. For the spline-based methods, we consider fixed degrees of freedom, minimization of the partial autocorrelation function, and generalized cross-validation as smoothing criteria. Issues of model misspecification with respect to weather confounding are investigated under simulation scenarios, which allow quantifying omitted, misspecified, and irrelevant-variable bias. The simulations are based on fully parametric mechanisms designed to replicate two datasets with different mortality and atmospheric patterns. Overall, minimum partial autocorrelation function approaches provide more stable results for high mortality counts and strong seasonal trends, whereas natural splines with fixed degrees of freedom perform better for low mortality counts and weak seasonal trends, followed by the time-season-stratified CC model, which performs equally well in terms of bias but yields higher standard errors. Copyright © 2014 John Wiley & Sons, Ltd.

17.

Objective

To propose a more realistic model for disease cluster detection, through a modification of the spatial scan statistic to account simultaneously for inflated zeros and overdispersion.

Introduction

Spatial scan statistics [1] usually assume Poisson- or binomial-distributed data, which is not adequate in many disease surveillance scenarios. For example, small areas distant from hospitals may exhibit fewer cases than those simple models expect. Underreporting may also occur in underdeveloped regions, owing to inefficient data collection or the difficulty of accessing remote sites. These factors generate excess zero case counts or overdispersion, violating the statistical model and inflating the type I error (false alarms). Overdispersion occurs when the data variance is greater than that predicted by the model in use; to accommodate it, an extra parameter must be included, whereas the Poisson model forces the variance to equal the mean.

Methods

Tools like the Generalized Poisson (GP) and the Double Poisson [2] may be a better option for this kind of problem, modeling the mean and variance separately, each of which can easily be adjusted by covariates. When excess zeros occur, the Zero-Inflated Poisson (ZIP) model is used, although ZIP's estimated parameters may be severely biased if the nonzero counts are overdispersed relative to the Poisson distribution. In that case the zero-inflated versions of the Generalized Poisson (ZIGP), Double Poisson (ZIDP), and Negative Binomial (ZINB) could be good alternatives for jointly modeling excess zeros and overdispersion. On the one hand, Zero-Inflated Poisson (ZIP) models have been combined with the spatial scan statistic to deal with excess zeros [3]. On the other hand, another spatial scan statistic was based on a Poisson-Gamma mixture model for overdispersion [4]. In this work we present a model, based on the ZIDP, that accommodates inflated zeros and overdispersion simultaneously. Let the parameter p indicate the zero inflation. Because p and the remaining parameters of the observed case-count map are not independent, maximizing the likelihood is not straightforward, and it becomes more complicated still when covariates are included in the analysis. To solve this problem we introduce a vector of latent variables that factorizes the likelihood, making the maximization tractable via the EM (Expectation-Maximization) algorithm. We derive the formulas to maximize the likelihood iteratively and implement a computer program that uses the EM algorithm to estimate the parameters under the null and alternative hypotheses. The p-value is obtained via the Fast Double Bootstrap Test [5].
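In the covariate-free ZIP case, the latent-variable factorization described above reduces to the classical EM scheme: augment each observed zero with an indicator of whether it is structural. A minimal sketch (plain ZIP only, with no scan statistic, covariates, or double-Poisson dispersion; all numbers invented):

```python
import numpy as np

def zip_em(y, iters=200):
    """EM for a covariate-free zero-inflated Poisson; returns (p, lam).
    E-step: posterior probability z_i that an observed zero is structural.
    M-step: closed-form updates of p and lam given the z_i."""
    y = np.asarray(y, dtype=float)
    p, lam = 0.5, max(y.mean(), 0.1)
    for _ in range(iters):
        z = np.where(y == 0, p / (p + (1 - p) * np.exp(-lam)), 0.0)
        p = z.mean()
        lam = ((1 - z) * y).sum() / (1 - z).sum()
    return p, lam

# Invented data: 30% structural zeros on top of a Poisson(2) component.
rng = np.random.default_rng(3)
y = np.where(rng.random(20_000) < 0.3, 0, rng.poisson(2.0, size=20_000))
p_hat, lam_hat = zip_em(y)
```

Conditioning on the latent indicators is what makes each M-step closed form; the same augmentation trick is what keeps the ZIDP likelihood tractable inside the scan statistic.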

Results

Numerical simulations are conducted to assess the effectiveness of the method. We present results for Hanseniasis surveillance in the Brazilian Amazon in 2010 using this technique. We obtain the most likely spatial clusters for the Poisson, ZIP, Poisson-Gamma mixture and ZIDP models and compare the results.

Conclusions

The Zero-Inflated Double Poisson spatial scan statistic for disease cluster detection incorporates the flexibility of the previous models, accounting for inflated zeros and overdispersion simultaneously. The Hanseniasis case study map, with its excess of zero case counts in many municipalities of the Brazilian Amazon and the presence of overdispersion, was a good benchmark for testing the ZIDP model. The results are easier to interpret than those of either of the previous spatial scan statistic models, the Zero-Inflated Poisson (ZIP) model and the Poisson-Gamma mixture model for overdispersion, taken separately. The EM algorithm and the Fast Double Bootstrap test are computationally efficient for this type of problem.

18.
This paper presents a case study in longitudinal data analysis where the goal is to estimate the efficacy of a new drug for treatment of severe chronic constipation. Data consist of long sequences of binary outcomes (relief/no relief) on each of a large number of patients randomized to treatment (low and high dose) or placebo. Data characteristics indicate: (1) the treatment effects vary non-linearly with time; (2) there is substantial heterogeneity across subjects in their responses to treatment; and (3) there is a high proportion of subjects who never experience any relief (the non-responders). To overcome these challenges, we develop a hierarchical model for binary longitudinal data with a mixture distribution on the probability of response to account for the high frequency of non-responders. While the model is specified conditionally on subject-specific latent variables, we also draw inferences on key population-average parameters for the assessment of the treatments' efficacy in a population. In addition we employ a model-checking method to compare the goodness-of-fit of our model against simpler modelling approaches for aggregated counts, such as the zero-inflated Poisson and zero-inflated negative binomial models. We estimate subject-specific and population-average rate ratios of relief for the treatment with respect to the placebo as functions of time (RR(t)), and compare them with the rate ratios estimated from the models for aggregated counts.
We find that: (1) the treatment is effective with respect to the placebo, with higher efficacy at the beginning of the study; (2) the estimated rate ratios from the models for aggregated counts are similar to the average across time of the population-average rate ratios estimated under our model; and (3) model-checking suggests that the hierarchical and zero-inflated negative binomial models fit the data best. If we are mainly interested in establishing the overall efficacy (or safety) of a new drug, it is appropriate to aggregate the longitudinal data over time and analyse the resulting counts with standard statistical methods. However, the models for aggregated counts cannot capture time trends in the treatment effect, such as an initial treatment benefit or the development of tolerance early in treatment, which may be important information for physicians predicting treatment effects for their patients.

19.
A major challenge when monitoring risks in socially deprived areas of underdeveloped countries is that economic, epidemiological, and social data are typically underreported. Thus, statistical models that do not take data quality into account will produce biased estimates. To deal with this problem, counts in suspected regions are usually treated as censored information. The censored Poisson model can be considered, but all censored regions must be precisely known a priori, which is not a reasonable assumption in most practical situations. We introduce the random-censoring Poisson model (RCPM), which accounts for the uncertainty in both the count and the data-reporting processes. Consequently, for each region, we are able to estimate both the relative risk for the event of interest and the censoring probability. To facilitate the posterior sampling process, we propose a Markov chain Monte Carlo scheme based on the data augmentation technique. We run a simulation study comparing the proposed RCPM with two competing models under different scenarios. Finally, the RCPM and the censored Poisson model are applied to account for potential underreporting of early neonatal mortality counts in regions of Minas Gerais State, Brazil, where data quality is known to be poor.

20.
Paul M, Held L. Statistics in Medicine 2011; 30(10): 1118-1136.
Infectious disease counts from surveillance systems are typically observed in several administrative geographical areas. In this paper, a non-linear model for the analysis of such multiple time series of counts is discussed. To account for heterogeneous incidence levels or varying transmission of a pathogen across regions, region-specific and possibly spatially correlated random effects are introduced. Inference is based on penalized likelihood methodology for mixed models. Since the use of classical model choice criteria such as AIC or BIC can be problematic in the presence of random effects, models are compared by means of one-step-ahead predictions and proper scoring rules. In a case study, the model is applied to monthly counts of meningococcal disease cases in 94 departments of France (excluding Corsica) and weekly counts of influenza cases in 140 administrative districts of Southern Germany. The predictive performance improves if existing heterogeneity is accounted for by random effects.
