Similar Documents
Found 20 similar documents (search time: 15 ms)
1.
We consider two- and three-stage hierarchical designs containing the effects of k clusters with n units per cluster. In the two-stage model, the conditional distribution of the discrete response Y_i is assumed to be independent binomial with mean nθ_i (i = 1, …, k). The success probabilities θ_i are assumed exchangeable across the k clusters, each arising from a beta distribution. In the three-stage model, the parameters of the beta distribution are assumed to have independent gamma distributions. The size of each cluster, n, is determined for functions of the θ_i. Lengths of central posterior intervals are computed for various functions of the θ_i using Markov chain Monte Carlo and Monte Carlo simulations. Several prior distributions are characterized, and tables of n are provided for given k. Methods for sample size calculation under the two- and three-stage models are illustrated and compared for the design of a multi-institutional study evaluating the appropriateness of discharge planning rates for a cohort of patients with congestive heart failure.
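A minimal Monte Carlo sketch of the two-stage calculation this abstract describes, assuming Python with NumPy/SciPy (not the authors' code): under the beta-binomial model the posterior of θ_i given y_i is Beta(a + y, b + n − y), so the average central-interval length can be simulated for each candidate cluster size n. The prior parameters a, b and the target length below are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def posterior_interval_length(n, a=2.0, b=2.0, level=0.95, n_sim=2000, seed=0):
    """Average length of the central posterior interval for a cluster-level
    success probability theta under the two-stage beta-binomial model:
    theta ~ Beta(a, b), Y | theta ~ Binomial(n, theta)."""
    rng = np.random.default_rng(seed)
    theta = rng.beta(a, b, size=n_sim)   # draw true success probabilities
    y = rng.binomial(n, theta)           # simulate cluster responses
    lo = stats.beta.ppf((1 - level) / 2, a + y, b + n - y)
    hi = stats.beta.ppf(1 - (1 - level) / 2, a + y, b + n - y)
    return float(np.mean(hi - lo))

def smallest_n(target, **kw):
    """Smallest cluster size n whose mean central-interval length is below
    the target length (the sample size criterion sketched in the abstract)."""
    n = 1
    while posterior_interval_length(n, **kw) > target:
        n += 1
    return n
```

The three-stage version would additionally draw (a, b) from gamma distributions inside the simulation loop.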

2.
Multiple informant data refers to information obtained from different individuals or sources used to measure the same construct; for example, researchers might collect information regarding child psychopathology from the child's teacher and the child's parent. Frequently, studies with multiple informants have incomplete observations; in some cases the missingness of informants is substantial. We introduce a Maximum Likelihood (ML) technique to fit models with multiple informants as predictors that permits missingness in the predictors as well as the response. We provide closed form solutions when possible and analytically compare the ML technique to the existing Generalized Estimating Equations (GEE) approach. We demonstrate that the ML approach can be used to compare the effect of the informants on response without standardizing the data. Simulations incorporating missingness show that ML is more efficient than the existing GEE method. In the presence of MCAR missing data, we find through a simulation study that the ML approach is robust to a relatively extreme departure from the normality assumption. We implement both methods in a study investigating the association between physical activity and obesity with activity measured using multiple informants (children and their mothers).

3.
Two-part models assume that the data have a probability mass at zero and a continuous response for values greater than zero. We construct tests based on the proportion of zeros and the difference among the positive values of a response, giving rise to a two-degree-of-freedom χ² test. This note gives the non-centrality parameter and derives the power of the test; the power is compared with simulation results in a recent paper. We also derive sample size calculations, and note that some modifications of this procedure may provide better tests when the non-zero responses are discrete. Published in 2001 by John Wiley & Sons, Ltd.
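A hedged sketch of the two-part idea, assuming NumPy/SciPy: one z statistic compares the proportions of zeros, another compares the positive values, and their squared sum is referred to a χ² distribution on 2 degrees of freedom. This normal-approximation version illustrates the construction rather than reproducing the note's exact statistic.

```python
import numpy as np
from scipy import stats

def two_part_test(x, y):
    """Two-degree-of-freedom chi-square test for two-part (semicontinuous)
    data: one component compares the proportions of zeros, the other
    compares the means of the positive values (normal approximation)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    # Part 1: pooled two-proportion z statistic for the zeros
    p1, p2 = np.mean(x == 0), np.mean(y == 0)
    p = (np.sum(x == 0) + np.sum(y == 0)) / (len(x) + len(y))
    se_p = np.sqrt(p * (1 - p) * (1 / len(x) + 1 / len(y)))
    z_zero = (p1 - p2) / se_p if se_p > 0 else 0.0
    # Part 2: two-sample z statistic for the positive values
    xp, yp = x[x > 0], y[y > 0]
    se_c = np.sqrt(xp.var(ddof=1) / len(xp) + yp.var(ddof=1) / len(yp))
    z_cont = (xp.mean() - yp.mean()) / se_c
    chi2 = z_zero ** 2 + z_cont ** 2   # approximately chi-square on 2 df
    return chi2, stats.chi2.sf(chi2, df=2)
```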

4.
5.
The most cost-effective method to measure the morbidity managed and treatments provided in general practice is from records of a cluster of consultations (encounters) from each general practitioner (GP) in a random sample. A cluster sampling method is proposed for future surveys for analysis of encounter-based general practice data. The sample sizes needed to measure the most common problems managed and drugs prescribed were estimated using ratio-estimator models for cluster sample surveys. Morbidity and treatment rates were estimated from the Australian Morbidity and Treatment Survey in General Practice 1990–1991 (AMTS). The 20 most common problems in the AMTS were managed at estimated rates of 1.5 to 9.5 per 100 encounters, and the 20 most common drugs were prescribed at estimated rates of 0.7 to 3.6 per 100 problems. These rates were used to determine precision as a percentage of each true value for future surveys, that is, relative precision. To be 95 per cent confident that these rates fall within 5 per cent of each true rate, sample sizes of 552 to 5675 GPs are needed. If the sample size is fixed at 1000 GPs, relative precision lies within 12 per cent of these rates; increasing the sample to 1500 GPs improves relative precision only marginally. The differences in the sample sizes required for the most frequent morbidity and treatment data are largely due to their variable distributions and relatively infrequent occurrence in general practice. A sample size of 1000 GPs will enable measurement of the most common morbidity and treatments at 95 per cent confidence.

6.
Recent advances in social science surveys include collection of biological samples. Although biomarkers offer a large potential for social science and economic research, they impose a number of statistical challenges, often being distributed asymmetrically with heavy tails. Using data from the UK Household Panel Survey, we illustrate the comparative performance of a set of flexible parametric distributions, which allow for a wide range of skewness and kurtosis: the four‐parameter generalized beta of the second kind (GB2), the three‐parameter generalized gamma, and their three‐, two‐, or one‐parameter nested and limiting cases. Commonly used blood‐based biomarkers for inflammation, diabetes, cholesterol, and stress‐related hormones are modelled. Although some of the three‐parameter distributions nested within the GB2 outperform the latter for most of the biomarkers considered, the GB2 can be used as a guide for choosing among competing parametric distributions for biomarkers. Going “beyond the mean” to estimate tail probabilities, we find that GB2 performs fairly well with some disparities at the very high levels of glycated hemoglobin and fibrinogen. Commonly used linear models are shown to perform worse than almost all the flexible distributions.
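A sketch of the model-comparison exercise this abstract performs, assuming SciPy: fit several right-skewed candidate distributions to a biomarker sample and rank them by AIC. The GB2 is not available in SciPy, so the generalized gamma (`scipy.stats.gengamma`) stands in as the most flexible candidate here; the candidate set and the simulated data are illustrative assumptions.

```python
import numpy as np
from scipy import stats

def compare_fits(x):
    """Fit flexible right-skewed candidates to a positive-valued biomarker
    sample by maximum likelihood and rank them by AIC (smaller is better)."""
    candidates = {
        "gengamma": stats.gengamma,   # 2 shape parameters: most flexible here
        "gamma": stats.gamma,         # nested special case
        "lognorm": stats.lognorm,     # common limiting case
        "expon": stats.expon,         # 1-parameter baseline
    }
    out = {}
    for name, dist in candidates.items():
        params = dist.fit(x, floc=0)              # fix location at zero
        ll = np.sum(dist.logpdf(x, *params))
        k = len(params) - 1                       # loc was fixed, not estimated
        out[name] = 2 * k - 2 * ll                # AIC
    return dict(sorted(out.items(), key=lambda kv: kv[1]))
```

Tail probabilities "beyond the mean" can then be read off the winning fit via its `sf` method.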

7.
Background: Birthweight distributions for early last-menstrual-period-based gestational ages are bimodal, and some birthweights in the right-side distribution are implausible for the specified gestational age. Mixture models can be used to identify births in the right-side distribution. The objective of this study was to determine which maternal and infant factors to include in the mixture models to obtain the best fitting models for New Jersey state birth records. Methods: We included covariates in the models as linear predictors of the means of the component distributions and the proportion of births in each component. This allowed both the means and the proportions to vary across levels of the covariates. Results: The final model included maternal age and timing of entry into prenatal care. The proportion of births in the right-side distribution was lowest for older mothers who entered prenatal care early, higher for teen mothers who entered prenatal care early, higher still for older mothers who entered prenatal care late, and highest for teens who entered prenatal care late. Over 44% of births were classified as having an incorrectly reported gestational age. Conclusion: These results suggest that (1) including these two covariates as linear predictors of the means and mixing proportions gives the best model for identifying births with incorrectly reported gestational ages, (2) late entry into prenatal care is a mechanism by which erroneously short last-menstrual-period-based gestational ages are generated, and (3) including linear predictors of the mixing proportions in the model increases the validity of the classification of incorrect reported gestational age.
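The core machinery here is a two-component normal mixture fitted by EM; a minimal NumPy sketch without the paper's covariates (which would enter as linear predictors of the means and mixing proportion) is shown below. The simulated component means are illustrative, not the study's estimates.

```python
import numpy as np

def norm_pdf(x, mu, sd):
    """Normal density, vectorized over components."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def fit_two_normal_mixture(x, n_iter=200):
    """EM for a two-component normal mixture: the building block behind
    classifying birthweights into a plausible component and an implausible
    right-side component for a given gestational age."""
    x = np.asarray(x, float)
    # crude initialisation from the quartiles and overall spread
    mu = np.array([np.quantile(x, 0.25), np.quantile(x, 0.75)])
    sd = np.array([x.std(), x.std()])
    pi = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior probability that each birth belongs to each component
        dens = pi * norm_pdf(x[:, None], mu, sd)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update mixing proportions, means, and standard deviations
        nk = resp.sum(axis=0)
        pi = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sd, resp
```

Births with high posterior probability for the right-side component would be flagged as having an implausible gestational age.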

8.
The role of maternal infections, nutritional status and obstetric history in low birth weight is not clear. Thus, the objective of the present study was to assess the effects of maternal HIV infection, nutritional status, obstetric history, and season of birth on gestation length and birth size. The study population was 1669 antenatal care attendees in Harare, Zimbabwe. A prospective cohort study was conducted as part of a randomised, controlled trial. Maternal anthropometry, age, gravidity, and HIV status and load were assessed at 22–35 weeks of gestation. Outcomes were gestation length and birth size. Birth data were available for 1106 (66.3%) women, of whom 360 (32.5%) had HIV infection. Mean gestation length was 39.1 weeks with 16.6% <37 weeks; mean birth weight was 3030 g with 10.5% <2500 g. Gestation length increased with age in primigravidae, but not multigravidae (interaction, P=0.005), and birth in the early dry season, low arm fat area, multiple pregnancies and maternal HIV load were negative predictors. Birth weight increased with maternal height and with birth in the late rainy and early dry season; primi-secundigravidity, low arm fat area, HIV load, multiple pregnancies and female sex were negative predictors. In conclusion, gestation length and birth weight decline with increasing maternal HIV load. In addition, season of birth, gravidity, maternal height and body fat mass, and infant sex are predictors of birth weight.

9.
Missing data in longitudinal studies (total citations: 11; self-citations: 0; citations by others: 11)
When observations are made repeatedly over time on the same experimental units, unbalanced patterns of observations are a common occurrence. This complication makes standard analyses more difficult or inappropriate to implement, entails a loss of efficiency, and may also introduce bias into the results. Possible approaches to dealing with missing data include complete case analyses, univariate analyses with adjustments for variance estimates, two-step analyses, and likelihood-based approaches. Likelihood approaches can be further categorized by whether or not an explicit model is introduced for the non-response mechanism. This paper reviews the use of likelihood-based analyses for longitudinal data with missing responses, both from the point of view of ease of implementation and of appropriateness in view of the non-response mechanism. Models for both measured and dichotomous outcome data are discussed, and the appropriateness of some non-likelihood-based analyses is briefly considered.

10.
11.
12.
We propose a simple method to compute sample size for an arbitrary test hypothesis in population pharmacokinetics (PK) studies analysed with non-linear mixed effects models. Sample size procedures exist for linear mixed effects models, and have recently been extended by Rochon using the generalized estimating equations of Liang and Zeger, making full model-based inference in sample size computation possible. The method we propose extends this approach using a first-order linearization of the non-linear mixed effects model and the Wald χ² test statistic. The proposed method is general: it allows an arbitrary non-linear model as well as arbitrary distributions of the random effects characterizing both inter- and intra-individual variability of the mixed effects model. To illustrate possible uses of the method we present tables of minimum sample sizes, in particular with an illustration of the effect of sampling design on sample size. We demonstrate how (D-)optimal or frequent sampling requires fewer subjects than a sparse sampling design. We also present results from Monte Carlo simulations showing that the computed sample size can produce the desired power. The proposed method greatly reduces computing times compared with simulation-based methods of estimating sample sizes for population PK studies.
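The Wald-based sample size logic sketched above can be illustrated in a few lines, assuming SciPy (this is the generic noncentral-χ² power calculation, not the paper's PK-specific linearization): given the variance of the effect estimate contributed by one subject, find the smallest N at which the Wald test reaches the target power.

```python
from scipy import stats

def wald_sample_size(delta, var1, alpha=0.05, power=0.9, df=1):
    """Smallest N such that a Wald chi-square test of a (linearised) model
    effect of size delta reaches the target power. var1 is the variance of
    the effect estimate from a single subject, so Var_N = var1 / N and the
    noncentrality parameter is N * delta^2 / var1."""
    crit = stats.chi2.ppf(1 - alpha, df)
    n = 2
    while True:  # delta must be detectable for the loop to terminate
        nc = n * delta ** 2 / var1               # noncentrality parameter
        achieved = stats.ncx2.sf(crit, df, nc)   # power of the Wald test
        if achieved >= power:
            return n
        n += 1
```

In the paper's setting, var1 would come from the first-order linearization of the non-linear mixed effects model under a given sampling design, which is why denser or D-optimal designs (smaller var1) need fewer subjects.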

13.
Multinomial Logistic Regression (MLR) has been advocated for developing clinical prediction models that distinguish between three or more unordered outcomes. We present a full-factorial simulation study to examine the predictive performance of MLR models in relation to the relative size of outcome categories, the number of predictors, and the number of events per variable. It is shown that MLR estimated by Maximum Likelihood yields overfitted prediction models in small to medium-sized data sets. In most cases, the calibration and overall predictive performance of the multinomial prediction model is improved by using penalized MLR. Our simulation study also highlights the importance of events per variable in the multinomial context as well as the total sample size. As expected, our study demonstrates the need for optimism correction of the predictive performance measures when developing the multinomial logistic prediction model. We recommend the use of penalized MLR when prediction models are developed in small data sets or in medium-sized data sets with a small total sample size (i.e., when the sizes of the outcome categories are balanced). Finally, we present a case study in which we illustrate the development and validation of penalized and unpenalized multinomial prediction models for predicting malignancy of ovarian cancer.
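A hedged illustration of the penalization effect, assuming scikit-learn (the simulated data and penalty strength are illustrative, not the study's design): with few events per variable, ridge (L2) penalization shrinks the multinomial coefficients relative to near-unpenalized maximum likelihood, which is what curbs the overfitting the abstract describes.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# small simulated data set: 3 unordered outcome classes, 10 predictors
n, p = 120, 10
X = rng.normal(size=(n, p))
beta = np.zeros((p, 3))
beta[:3, 0] = 1.2      # predictors 0-2 drive class 0
beta[3:6, 1] = -1.0    # predictors 3-5 push away from class 1
logits = X @ beta
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
y = np.array([rng.choice(3, p=pr) for pr in probs])

# near-unpenalized maximum likelihood vs ridge-penalized multinomial fit;
# in scikit-learn a smaller C means a stronger L2 penalty
ml_fit = LogisticRegression(C=1e6, max_iter=2000).fit(X, y)
pen_fit = LogisticRegression(C=0.5, max_iter=2000).fit(X, y)

# the penalty shrinks the average |coefficient| toward zero
shrunk = np.abs(pen_fit.coef_).mean() < np.abs(ml_fit.coef_).mean()
```

In practice the penalty strength would be tuned (e.g. by cross-validation) and the apparent performance optimism-corrected, as the abstract recommends.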

14.
Time-to-event regression is a frequent tool in biomedical research. In clinical trials this time is usually measured from the beginning of the study, and the same approach is often adopted in the analysis of longitudinal observational studies. In recent years, however, literature has appeared making a case for using the date of birth as the starting point, thus using age as the time to event. In this paper, we explore different types of age-scale models and compare them with time-on-study models in terms of the estimated regression coefficients they produce. We consider six proportional hazards regression models that differ in the choice of time scale and in the method of adjusting for the years before the study. By considering the estimating equations of these models as well as numerical simulations, we conclude that correct adjustment for the age at entry is crucial in reducing bias of the estimated coefficients. The unadjusted age-scale model is inferior to any of the five other models considered, regardless of their choice of time scale. Additionally, if adjustment for age at entry is made, our analyses show very little to suggest any practically meaningful difference in the estimated regression coefficients depending on the choice of time scale. These findings are supported by four practical examples from the Framingham Heart Study.

15.
The effect of litter size on weight gain in mice (total citations: 2; self-citations: 0; citations by others: 2)
The body weights of rodents at weaning are generally believed to be inversely related to the number of animals in the litter during the birth-to-weaning period. Quantitative data for rats have been published, but not for mice. Using carefully matched litters, we have measured the average body weights of mice raised in litters containing 2, 4, 6, and 12 pups relative to the average body weight of litters of eight pups. Except for the 2 pup litters, the inverse relation was found to hold. It is also shown that the pattern of average weaning weight as a function of litter size is the same as previously published by two other groups using rats.

16.
We present a new likelihood-based approach for constructing confidence intervals for effect sizes that is applicable to small samples. We also conduct a simulation study to compare the coverage probability of the new likelihood-based method with three other methods proposed by Hedges and Olkin and by Kraemer and Paik. The simulations show that the confidence interval generated by the modified signed log-likelihood ratio method possesses essentially exact coverage probabilities even for small samples, although the coverage probabilities are consistently, if only slightly, below the nominal level. The methods are also applied to two examples.

17.
Despite our best efforts, missing outcomes are common in randomized controlled clinical trials. The National Research Council's Committee on National Statistics panel report titled The Prevention and Treatment of Missing Data in Clinical Trials noted that further research is required to assess the impact of missing data on the power of clinical trials and how to set useful target rates and acceptable rates of missing data in clinical trials. In this article, using binary responses for illustration, we establish that conclusions based on statistical analyses that include only complete cases can be seriously misleading, and that the adverse impact of missing data grows not only with increasing rates of missingness but also with increasing sample size. We illustrate how principled sensitivity analysis can be used to assess the robustness of the conclusions. Finally, we illustrate how sample sizes can be adjusted to account for expected rates of missingness. We find that when sensitivity analyses are considered as part of the primary analysis, the required adjustments to the sample size are dramatically larger than those that are traditionally used. Furthermore, in some cases, especially in large trials with small target effect sizes, it is impossible to achieve the desired power.
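For concreteness, the "traditional" adjustment the abstract contrasts against is simply dividing the complete-data sample size by the expected completion rate. A sketch assuming SciPy, using the standard normal-approximation formula for comparing two proportions (the binary-response setting the article uses for illustration):

```python
import math
from scipy import stats

def two_prop_sample_size(p1, p2, alpha=0.05, power=0.9):
    """Per-arm sample size for a two-sided test comparing two proportions
    (standard normal-approximation formula with pooled variance under H0)."""
    za = stats.norm.ppf(1 - alpha / 2)
    zb = stats.norm.ppf(power)
    pbar = (p1 + p2) / 2
    num = (za * math.sqrt(2 * pbar * (1 - pbar))
           + zb * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

def inflate_for_missingness(n, rate):
    """Traditional adjustment: divide by the expected completion rate.
    The article's point is that adjustments based on principled sensitivity
    analyses can be dramatically larger than this simple correction."""
    return math.ceil(n / (1 - rate))
```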

18.
Multiple imputation for missing data (total citations: 2; self-citations: 0; citations by others: 2)
Objective: To explore the application of multiple imputation in the analysis of missing data. Methods: Using Bayesian theory and MCMC methods, m complete data sets were generated by simulation in the NORM software. Results: The m completed repeated-measures data sets were analysed with SAS, and the m sets of results were combined; the standard deviations from the data sets completed by NORM were more stable than those from the incomplete data set. Conclusion: Multiple imputation both reflects the uncertainty due to the missing data and makes full use of the information in the data, yielding more credible model estimates.
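A toy NumPy sketch of the workflow NORM automates: impute the missing values m times, analyse each completed data set, and pool the m results with Rubin's rules. This simplified version imputes from a plug-in normal predictive distribution and ignores parameter uncertainty, which a proper Bayesian MI (as in NORM's MCMC) would also draw.

```python
import numpy as np

def multiply_impute_mean(x, m=20, seed=0):
    """Toy multiple imputation for the mean of a normal variable with values
    missing completely at random: build m completed data sets, estimate the
    mean and its variance in each, and pool with Rubin's rules."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    obs = x[~np.isnan(x)]
    n_mis = int(np.isnan(x).sum())
    ests, vars_ = [], []
    for _ in range(m):
        # impute from the (plug-in) observed-data predictive distribution
        draws = rng.normal(obs.mean(), obs.std(ddof=1), n_mis)
        filled = np.concatenate([obs, draws])
        ests.append(filled.mean())                       # per-imputation estimate
        vars_.append(filled.var(ddof=1) / len(filled))   # per-imputation variance
    qbar = np.mean(ests)                  # pooled point estimate
    ubar = np.mean(vars_)                 # within-imputation variance
    b = np.var(ests, ddof=1)              # between-imputation variance
    total_var = ubar + (1 + 1 / m) * b    # Rubin's rules total variance
    return qbar, np.sqrt(total_var)
```

The between-imputation term b is what "reflects the uncertainty due to the missing data" in the abstract's conclusion.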

19.
Using successively larger random samples of data on weight, length and head circumference, with Ns of 25, 100, 400, 900, 1600 and 2500, and calculated values for the 2.5th, 5th, 10th, 15th, 50th, 85th, 90th, 95th and 97.5th percentiles, we show that the required sample size ranges from a minimum of 25 to effectively distinguish the 10th from the 50th percentile, up to 1300 to distinguish the 2.5th from the 5th percentile at the 95 per cent confidence limits. From these randomly generated examples, arranged by doubling values of N, and from theoretical calculations, it follows that nearly all published “norms” and standards are based on age/sex samples of inadequate size for the intended purpose.

20.
Missing data are common in longitudinal studies and can occur in the exposure of interest. There has been little work assessing the impact of missing data in marginal structural models (MSMs), which are used to estimate the effect of an exposure history on an outcome when time‐dependent confounding is present. We design a series of simulations based on the Framingham Heart Study data set to investigate the impact of missing data in the primary exposure of interest in a complex, realistic setting. We use a standard application of MSMs to estimate the causal odds ratio of a specific activity history on the outcome. We report and discuss the results of four missing data methods under seven possible missing data structures, including scenarios in which an unmeasured variable predicts missing information. In all missing data structures, we found that a complete case analysis, in which all subjects with missing exposure data are removed from the analysis, provided the least bias. An analysis that censored individuals at the first occasion of missing exposure, and included a censorship model as well as a propensity model when creating the inverse probability weights, also performed well. The presence of an unmeasured predictor of missing data only slightly increased bias, except when the exposure had a large impact on missingness and the unmeasured variable had a large impact on both missingness and the outcome. A discussion of the results is provided using causal diagrams, showing the usefulness of drawing such diagrams before conducting an analysis. Copyright © 2009 John Wiley & Sons, Ltd.  
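The inverse probability weights at the heart of an MSM can be sketched for the simplest case of a single binary exposure with one confounder, assuming NumPy and scikit-learn (the full time-dependent version multiplies such weights over occasions, and the paper's censoring analysis adds an analogous censorship model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
L = rng.normal(size=n)                     # measured confounder
p_a = 1 / (1 + np.exp(-0.5 * L))           # exposure probability depends on L
A = rng.binomial(1, p_a)                   # observed binary exposure

# propensity model: probability of exposure given the confounder
ps = LogisticRegression().fit(L[:, None], A).predict_proba(L[:, None])[:, 1]

# stabilized inverse probability weights, as used to fit an MSM:
# marginal probability of the observed exposure / conditional probability
p_marg = A.mean()
sw = np.where(A == 1, p_marg / ps, (1 - p_marg) / (1 - ps))
```

A weighted outcome regression of Y on A using `sw` as weights then estimates the marginal (causal) effect; stabilized weights should average close to 1, which is a standard diagnostic.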

