Similar articles
Found 20 similar articles (search time: 31 ms)
1.
We argue that, due to the curse of dimensionality, there are major difficulties with any pure or smoothed likelihood-based method of inference in designed studies with randomly missing data when missingness depends on a high-dimensional vector of variables. We study in detail a semi-parametric superpopulation version of continuously stratified random sampling. We show that all estimators of the population mean that are uniformly consistent, or that achieve an algebraic rate of convergence, no matter how slow, require the use of the selection (randomization) probabilities. We argue that, in contrast to likelihood methods, which ignore these probabilities, inverse selection probability weighted estimators continue to perform well, achieving uniform n^(1/2)-rates of convergence. We propose a curse of dimensionality appropriate (CODA) asymptotic theory for inference in non- and semi-parametric models in an attempt to formalize our arguments. We discuss whether our results constitute a fatal blow to the likelihood principle and study the attitude toward them that a committed subjective Bayesian would adopt. Finally, we apply our CODA theory to analyse the effect of the 'curse of dimensionality' in several interesting semi-parametric models, including a model for a two-armed randomized trial with randomization probabilities depending on a vector of continuous pre-treatment covariates X. We provide substantive settings under which a subjective Bayesian would ignore the randomization probabilities in analysing the trial data. We then show that any statistician who ignores the randomization probabilities is unable to construct nominal 95 per cent confidence intervals for the true treatment effect that have both: (i) an expected length which goes to zero with increasing sample size; and (ii) a guaranteed expected actual coverage rate of at least 95 per cent over the ensemble of trials analysed by the statistician during his or her lifetime. However, we derive a new interval estimator, depending on the randomization probabilities, that satisfies (i) and (ii). © 1997 by John Wiley & Sons, Ltd.
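As a rough illustration of the inverse selection probability weighted (Horvitz-Thompson-type) estimator discussed above, and not the paper's CODA machinery, here is a minimal simulated sketch in which the selection probabilities are known by design; all numbers are hypothetical:

```python
import random

def ipw_mean(ys, observed, probs):
    """Inverse-probability-weighted (Horvitz-Thompson-type) estimate of the
    population mean: each observed outcome is weighted by 1/p, where p is
    the known selection probability, and the sum is divided by the full n."""
    n = len(ys)
    return sum(y / p for y, o, p in zip(ys, observed, probs) if o) / n

# Simulated example: selection depends on a covariate that also drives the
# outcome, so the complete-case (naive) mean is biased but the IPW mean is not.
random.seed(0)
n = 100_000
xs = [random.random() for _ in range(n)]
probs = [0.2 + 0.6 * x for x in xs]                  # known design probabilities
observed = [random.random() < p for p in probs]
ys = [2.0 * x + random.gauss(0.0, 1.0) for x in xs]  # true mean = 1.0

naive = sum(y for y, o in zip(ys, observed) if o) / sum(observed)
ipw = ipw_mean(ys, observed, probs)
```

Here the naive complete-case mean concentrates near 1.2 because high-covariate units are over-selected, while the weighted estimate recovers the true mean of 1.0.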

2.
In this paper we compare several methods for estimating population disease prevalence from data collected by two-phase sampling when there is non-response at the second phase. The traditional weighting-type estimator requires the missing-completely-at-random assumption and may yield biased estimates if the assumption does not hold. We review two approaches and propose one new approach to adjust for non-response, assuming that the non-response depends on a set of covariates collected at the first phase: an adjusted weighting-type estimator using estimated response probabilities from a response model; a modelling-type estimator using predicted disease probabilities from a disease model; and a regression-type estimator combining the adjusted weighting-type estimator and the modelling-type estimator. These estimators are illustrated using data from an Alzheimer's disease study in two populations.
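A minimal sketch of the adjusted weighting-type estimator described above, assuming for simplicity that the phase-1 covariate is a discrete stratum and that the response model is just the within-stratum response rate (an actual response model may be richer, e.g. a regression); the data are a made-up toy example:

```python
from collections import defaultdict

def adjusted_weighted_prevalence(strata, responded, disease):
    """Adjusted weighting-type estimator of prevalence: each phase-2
    respondent is weighted by the inverse of the response rate estimated
    within their phase-1 stratum."""
    counts = defaultdict(lambda: [0, 0])   # stratum -> [sampled, responded]
    for s, r in zip(strata, responded):
        counts[s][0] += 1
        counts[s][1] += r
    rate = {s: resp / tot for s, (tot, resp) in counts.items()}
    total = sum(d / rate[s]
                for s, r, d in zip(strata, responded, disease) if r)
    return total / len(strata)

# Hypothetical toy data: in stratum A only half respond, and respondents
# there are mostly diseased; disease status of non-respondents is unknown
# (coded 0 but never used by the estimator).
strata = ["A"] * 10 + ["B"] * 10
responded = [1] * 5 + [0] * 5 + [1] * 10
disease = [1, 1, 1, 1, 0] + [0] * 5 + [1] + [0] * 9

adjusted = adjusted_weighted_prevalence(strata, responded, disease)  # 0.45
complete_case = 5 / 15                                               # ~0.33
```

The complete-case estimate under-counts the diseased because the low-response stratum is disease-rich; reweighting restores its contribution.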

3.
In medical research, a two-phase study is often used for the estimation of the area under the receiver operating characteristic curve (AUC) of a diagnostic test. However, such a design introduces verification bias. One of the methods to correct verification bias is inverse probability weighting (IPW). Since the probability that a subject is selected into phase 2 of the study for disease verification is known, both true and estimated verification probabilities can be used to form an IPW estimator for the AUC. In this article, we derive explicit variance formulas for both IPW AUC estimators and show that the IPW AUC estimator using the true verification probabilities, even when they are known, is less efficient than its counterpart using the estimated values. Our simulation results show that the efficiency loss can be substantial, especially when the variance of the test result in the diseased population is small relative to its counterpart in the non-diseased population.

4.
The consistency of doubly robust estimators relies on the consistent estimation of at least one of two nuisance regression parameters. In moderate-to-large dimensions, the use of flexible data-adaptive regression estimators may aid in achieving this consistency. However, n^(1/2)-consistency of doubly robust estimators is not guaranteed if one of the nuisance estimators is inconsistent. In this paper, we present a doubly robust estimator for survival analysis with the novel property that it converges to a Gaussian variable at an n^(1/2)-rate for a large class of data-adaptive estimators of the nuisance parameters, under the sole assumption that at least one of them is consistently estimated at an n^(1/4)-rate. This result is achieved through the adaptation of recent ideas in semiparametric inference, which amount to (i) Gaussianizing (i.e., making asymptotically linear) a drift term that arises in the asymptotic analysis of the doubly robust estimator and (ii) using cross-fitting to avoid entropy conditions on the nuisance estimators. We present the formula for the asymptotic variance of the estimator, which allows for the computation of doubly robust confidence intervals and p-values. We illustrate the finite-sample properties of the estimator in simulation studies and demonstrate its use in a phase III clinical trial for estimating the effect of a novel therapy for the treatment of human epidermal growth factor receptor 2 (HER2)-positive breast cancer.
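The generic doubly robust construction underlying the estimator described above can be illustrated with the augmented IPW estimator of a counterfactual mean E[Y(1)]; the numbers below are hypothetical, and the paper's survival-specific, cross-fitted estimator is considerably more involved:

```python
def aipw_mean(y, a, ps, m1):
    """Augmented IPW (doubly robust) estimator of E[Y(1)]: the average of
    m1_i + a_i * (y_i - m1_i) / ps_i over all subjects, where ps is the
    (estimated) treatment propensity and m1 the (estimated) outcome
    regression E[Y | A=1, X]. It is consistent if either model is correct."""
    return sum(m + ai * (yi - m) / p
               for yi, ai, p, m in zip(y, a, ps, m1)) / len(y)

# Tiny worked example with hypothetical numbers: y is only meaningful
# for subjects with a == 1 (the a == 0 entries are never used).
y = [2.0, 0.0, 4.0, 0.0]
a = [1, 0, 1, 0]
ps = [0.5, 0.5, 0.5, 0.5]
m1 = [1.0, 1.0, 3.0, 3.0]       # outcome-model predictions E[Y | A=1, X]
est = aipw_mean(y, a, ps, m1)   # (3 + 1 + 5 + 3) / 4 = 3.0
```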

5.
A Bayesian analysis of a proportion under non-ignorable non-response
The National Health Interview Survey (NHIS) is one of the surveys used to assess one aspect of the health status of the U.S. population. One indicator of the nation's health is the total number of doctor visits made by household members in the past year. We study the binary variable of at least one doctor visit versus no doctor visit by any household member, for each of the 50 states and the District of Columbia. The proportion of households with at least one doctor visit is an indicator of the health status of the U.S. population. There is a substantial number of non-respondents among the sampled households. The main issue we address here is that the non-response mechanism should not be ignored, because respondents and non-respondents differ. The purpose of this work is to estimate the proportion of households with at least one doctor visit, and to investigate what adjustment needs to be made for non-ignorable non-response. We consider a non-ignorable non-response model that expresses uncertainty about ignorability through the ratio of the odds of a household doctor visit among respondents to the odds of a doctor visit among all households; this ratio varies from state to state. We use a hierarchical Bayesian selection model to accommodate this non-response mechanism. Because of the weak identifiability of the parameters, it is necessary to 'borrow strength' across states, as in small area estimation. We also perform a simulation study to compare the expansion model with an alternative expansion model, an ignorable model and a non-ignorable model. Inference for the probability of a doctor visit is generally similar across the models. Our main result is that for some of the states the non-response mechanism can be considered non-ignorable, and that 95 per cent credible intervals for the probability of a household doctor visit and the probability that a household responds shed important light on the NHIS data.

6.
Group sequential designs are widely used in clinical trials to determine whether a trial should be terminated early. In such trials, maximum likelihood estimates are often used to describe the difference in efficacy between the experimental and reference treatments; however, these are well known to display conditional and unconditional biases. Established bias-adjusted estimators include the conditional mean-adjusted estimator (CMAE), the conditional median unbiased estimator, the conditional uniformly minimum variance unbiased estimator (CUMVUE), and the weighted estimator. However, their performance has been inadequately investigated. In this study, we review the characteristics of these bias-adjusted estimators and compare their conditional bias, overall bias, and conditional mean-squared errors in clinical trials with survival endpoints through simulation studies. The coverage probabilities of the confidence intervals for the four estimators are also evaluated. We find that the CMAE reduced conditional bias and showed relatively small conditional mean-squared errors when the trials terminated at the interim analysis. The conditional coverage probability of the conditional median unbiased estimator was well below the nominal value. In trials that did not terminate early, the CUMVUE performed with less bias and a more acceptable conditional coverage probability than the other estimators. In conclusion, when planning an interim analysis, we recommend using the CUMVUE for trials that do not terminate early and the CMAE for those that terminate early. Copyright © 2017 John Wiley & Sons, Ltd.

7.
Often in biomedical studies, the event of interest is recurrent, and within-subject events cannot usually be assumed to be independent. In semi-parametric estimation of the proportional rates model, a working independence assumption leads to an estimating equation for the regression parameter vector, with within-subject correlation accounted for through a robust (sandwich) variance estimator; these methods have been extended to the case of clustered subjects. We consider variance estimation in the setting where subjects are clustered and the study consists of a small number of moderate-to-large-sized clusters. We demonstrate through simulation that the robust estimator is quite inaccurate in this setting. We propose a corrected version of the robust variance estimator, as well as jackknife and bootstrap estimators. Simulation studies reveal that the corrected variance estimator is considerably more accurate than the robust estimator, and slightly more accurate than the jackknife and bootstrap variances. The proposed methods are used to compare hospitalization rates between Canada and the U.S. in a multi-centre dialysis study. Copyright (c) 2005 John Wiley & Sons, Ltd.

8.
In a cost-effectiveness analysis using clinical trial data, estimates of the between-treatment difference in mean cost and mean effectiveness are needed. Several methods for handling censored data have been suggested. One of them, inverse-probability weighting, has the advantage that it can also be applied to estimate the parameters of a linear regression of the mean. Such regression models can potentially estimate the treatment contrast more precisely, since some of the residual variance can be explained by baseline covariates. The drawback, however, is that inverse-probability weighting may not be efficient. Using existing results on semi-parametric efficiency, this paper derives the semi-parametric efficient parameter estimates for regression of mean cost, mean quality-adjusted survival time and mean survival time. The performance of these estimates is evaluated through a simulation study. Applying both the new estimators and the inverse-probability weighted estimators to the results of the EVALUATE trial showed that the new estimators achieved a halving of the variance of the estimated treatment contrast for cost. Some practical suggestions for choosing an estimator are offered.

9.
Four estimators of annual infection probability, relevant to Quantitative Microbial Risk Assessment (QMRA), were compared. A stochastic model, the Gold Standard, was used as the benchmark. It is a product of independent daily infection probabilities, which in turn are based on daily doses. An alternative and commonly used estimator, here referred to as the Naïve, assumes a single daily infection probability from a single value of daily dose. The typical use of this estimator in stochastic QMRA involves the generation of a distribution of annual infection probabilities, but since each of these is based on a single realisation of the dose distribution, the resultant annual infection probability distribution simply represents a set of inaccurate estimates. While the medians of both distributions were within an order of magnitude for our test scenario, the 95th percentiles, which are sometimes used in QMRA as conservative estimates of risk, differed by around one order of magnitude. The other two estimators examined, the Geometric and Arithmetic, are closely related to the Naïve and use the same equation, and both proved to be poor estimators. Lastly, this paper proposes a simple adjustment to the Gold Standard equation to accommodate periodic infection probabilities when the daily infection probabilities are unknown.
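Under the definitions above, the Gold Standard is a product over daily infection probabilities, while the Naïve estimator applies a single daily probability to every day of the year; a minimal sketch with illustrative numbers only:

```python
def annual_infection_prob(daily_probs):
    """'Gold Standard' form: P_year = 1 - prod_i (1 - p_i), the complement
    of surviving every day's independent infection probability."""
    survival = 1.0
    for p in daily_probs:
        survival *= 1.0 - p
    return 1.0 - survival

def naive_annual_prob(p_daily, days=365):
    """Naive form: one daily probability applied to all days."""
    return 1.0 - (1.0 - p_daily) ** days

# With constant daily risk the two coincide; with variable daily doses
# (here a hypothetical on/off exposure pattern) they differ.
constant = annual_infection_prob([0.01] * 365)
variable = annual_infection_prob([0.02, 0.0] * 182 + [0.01])
single = naive_annual_prob(0.01)
```

The on/off pattern has the same average daily probability (0.01) as the constant case, yet yields a different annual risk, which is why an estimator built from a single realisation of the dose distribution misestimates.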

10.
In survival analyses, inverse-probability-of-treatment (IPT) and inverse-probability-of-censoring (IPC) weighted estimators of parameters in marginal structural Cox models are often used to estimate treatment effects in the presence of time-dependent confounding and censoring. In most applications, a robust variance estimator of the IPT and IPC weighted estimator is calculated, leading to conservative confidence intervals. This estimator assumes that the weights are known rather than estimated from the data. Although a consistent estimator of the asymptotic variance of the IPT and IPC weighted estimator is generally available, applications, and thus information on the performance of the consistent estimator, are lacking. One reason might be its cumbersome implementation in statistical software, which is further complicated by missing details on the variance formula. In this paper, we therefore provide a detailed derivation of the variance of the asymptotic distribution of the IPT and IPC weighted estimator and explicitly state the terms necessary to calculate a consistent estimator of this variance. We compare the performance of the robust and consistent variance estimators in an application based on routine health care data and in a simulation study. The simulation reveals no substantial differences between the two estimators in medium and large data sets with no unmeasured confounding, but the consistent variance estimator performs poorly in small samples or under unmeasured confounding if the number of confounders is large. We thus conclude that the robust estimator is more appropriate for all practical purposes.

11.
In meta-analysis of odds ratios (ORs), heterogeneity between the studies is usually modelled via the additive random effects model (REM). An alternative, multiplicative REM for ORs uses overdispersion. The multiplicative factor in this overdispersion model (ODM) can be interpreted as an intra-class correlation (ICC) parameter. This model arises naturally when the probabilities of an event in one or both arms of a comparative study are themselves beta-distributed, resulting in beta-binomial distributions. We propose two new estimators of the ICC for meta-analysis in this setting. One is based on the inverted Breslow-Day test, and the other on the improved gamma approximation by Kulinskaya and Dollinger (2015, p. 26) to the distribution of Cochran's Q. The performance of these and several other estimators of the ICC in terms of bias and coverage is studied by simulation. Additionally, the Mantel-Haenszel approach to estimation of ORs is extended to the beta-binomial model, and we study the performance of various ICC estimators when used in the Mantel-Haenszel or the inverse-variance method to combine ORs in meta-analysis. The results of the simulations show that the improved gamma-based estimator of the ICC is superior for small sample sizes, and the Breslow-Day-based estimator is the best for . The Mantel-Haenszel-based estimator of the OR is very biased and is not recommended. The inverse-variance approach is also somewhat biased for ORs ≠ 1, but this bias is not large in practical settings. The developed methods and R programs, provided in the Web Appendix, make the beta-binomial model a feasible alternative to the standard REM for meta-analysis of ORs. © 2017 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.

12.
The relative concentration index is a widely used measure for assessing relative differences in health across socioeconomic population groups. We extend its usage to individual-level data collected through complex surveys by deriving its variance using the Taylor linearization (TL) method. Two existing plug-in variance estimators that only require grouped data are also compared. We discuss the sources of uncertainty that each variance estimator considers and present simulation studies to compare the performance of the three estimators under various sampling designs. The proposed TL variance estimator consistently produces valid results; however, it requires access to individual-level data. Both plug-in variance estimators are biased because they fail to account for certain error sources. However, when only grouped data are available, one of the plug-in estimators can be valid as long as the socioeconomic groups are treated as equally sized, a commonly used analytic strategy to emphasize the group's rather than the individual's burden of disease in health disparity assessment. We illustrate the three variance estimators by applying them to the assessment of socioeconomic disparities in child and adolescent obesity, using complex survey samples drawn from the National Health and Nutrition Examination Survey.

13.
Methods for random-effects meta-analysis require an estimate of the between-study variance, τ². The performance of estimators of τ² (measured by bias and coverage) affects their usefulness in assessing heterogeneity of study-level effects and also the performance of related estimators of the overall effect. However, as we show, the performance of the methods varies widely among effect measures. For the effect measures mean difference (MD) and standardized MD (SMD), we use improved effect-measure-specific approximations to the expected value of Q to introduce two new methods of point estimation of τ² for MD (Welch-type (WT) and corrected DerSimonian-Laird) and one WT interval method. We also introduce one point estimator and one interval estimator for τ² for SMD. Extensive simulations compare our methods with four point estimators of τ² (the popular methods of DerSimonian-Laird, restricted maximum likelihood, and Mandel and Paule, and the less familiar method of Jackson) and four interval estimators for τ² (profile likelihood, Q-profile, Biggerstaff and Jackson, and Jackson). We also study related point and interval estimators of the overall effect, including an estimator whose weights use only study-level sample sizes. We provide measure-specific recommendations from our comprehensive simulation study and discuss an example.
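For orientation, the popular DerSimonian-Laird moment estimator mentioned above has a closed form based on Cochran's Q; a minimal sketch with made-up effect estimates (the corrected and Welch-type variants the abstract introduces modify this baseline and are not shown):

```python
def dersimonian_laird_tau2(effects, variances):
    """DerSimonian-Laird moment estimator of the between-study variance:
    tau2 = max(0, (Q - (k - 1)) / (S1 - S2 / S1)), with fixed-effect
    weights w_i = 1 / v_i, S1 = sum w_i, S2 = sum w_i^2, and Cochran's
    Q = sum w_i (y_i - ybar_w)^2."""
    w = [1.0 / v for v in variances]
    s1 = sum(w)
    s2 = sum(wi * wi for wi in w)
    ybar = sum(wi * y for wi, y in zip(w, effects)) / s1
    q = sum(wi * (y - ybar) ** 2 for wi, y in zip(w, effects))
    k = len(effects)
    return max(0.0, (q - (k - 1)) / (s1 - s2 / s1))

# Hypothetical study-level effects and within-study variances.
tau2_homogeneous = dersimonian_laird_tau2([0.5, 0.5, 0.5, 0.5],
                                          [0.1, 0.1, 0.1, 0.1])  # 0.0
tau2_heterogeneous = dersimonian_laird_tau2([0.0, 2.0], [1.0, 1.0])  # 1.0
```

The truncation at zero is what makes the estimator non-negative, and is one source of the bias the simulations above investigate.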

14.
We propose a maximum likelihood estimator (MLE) of the kappa coefficient from a 2×2 table when the binary ratings depend on patient and/or clinician effects. We achieve this by expressing the logit of the probability of a positive rating as a linear function of subject-specific and rater-specific covariates. We investigate the bias and variance of the MLE in small and moderate sized samples through Monte Carlo simulation, and we provide the sample size calculation needed to detect departure from the null hypothesis H0: κ = κ0 in the direction of H1: κ > κ0.
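For reference, the kappa coefficient itself has a simple closed form from the 2×2 table of ratings; a minimal sketch with hypothetical counts (the covariate-dependent MLE proposed above is not reproduced here):

```python
def cohen_kappa_2x2(a, b, c, d):
    """Cohen's kappa from a 2x2 agreement table
        [[a, b],
         [c, d]]
    where a = both raters positive and d = both raters negative:
    kappa = (p_o - p_e) / (1 - p_e)."""
    n = a + b + c + d
    po = (a + d) / n                     # observed agreement
    p1 = ((a + b) / n) * ((a + c) / n)   # chance agreement on 'positive'
    p2 = ((c + d) / n) * ((b + d) / n)   # chance agreement on 'negative'
    pe = p1 + p2
    return (po - pe) / (1 - pe)

perfect = cohen_kappa_2x2(10, 0, 0, 10)  # 1.0: raters always agree
chance = cohen_kappa_2x2(5, 5, 5, 5)     # 0.0: agreement no better than chance
```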

15.
We consider the estimation of parameters in a particular segmented generalized linear model with additive measurement error in predictors, with a focus on linear and logistic regression. In epidemiologic studies, segmented regression models often occur as threshold models, where it is assumed that the exposure has no influence on the response up to a possibly unknown threshold. Furthermore, in occupational and environmental studies the exposure typically cannot be measured exactly. Ignoring this measurement error leads to asymptotically biased estimators of the threshold. It is shown that this asymptotic bias differs from that observed when estimating standard generalized linear model parameters in the presence of measurement error, being both larger and in different directions than expected. In most cases considered, the threshold is asymptotically underestimated. Two standard general methods for correcting for this bias are considered: regression calibration and simulation extrapolation (SIMEX). In ordinary logistic and linear regression these procedures behave similarly, but in the threshold segmented regression model they operate quite differently. The regression calibration estimator usually has more bias but less variance than the SIMEX estimator. Regression calibration and SIMEX are typically thought of as functional methods, also known as semi-parametric methods, because they make no assumptions about the distribution of the unobservable covariate X. The contrasting structural, parametric maximum likelihood estimate assumes a parametric distributional form for X. In ordinary linear regression there is typically little difference between structural and functional methods. One of the major, surprising findings of our study is that in threshold regression, the functional and structural methods differ substantially in their performance. In one of our simulations, approximately consistent functional estimates were as much as 25 times more variable than the maximum likelihood estimate for a properly specified parametric model. Structural (parametric) modelling ought not to be a neglected tool in measurement error models. An example involving dust concentration and bronchitis in a mechanical engineering plant in Munich is used to illustrate the results. © 1997 by John Wiley & Sons, Ltd.

16.
We provide a Bayesian analysis of data categorized into two levels of age (younger than 50 years, at least 50 years) and three levels of bone mineral density (BMD: normal, osteopenia, osteoporosis) for white females at least 20 years old in the third National Health and Nutrition Examination Survey. For the sample, the age of each individual is known, but some individuals did not have their BMD measured. We use two types of models: in the ignorable non-response models the propensity to respond does not depend on the BMD and age of an individual, while in the non-ignorable non-response models it does. These are the baseline models from which all models for testing are derived. Our non-ignorable non-response models are 'close' to the ignorable non-response models, thereby reducing the effects of the assumptions about non-respondents that cannot be tested in non-response models. We have data from 35 counties (small areas), and therefore our models are hierarchical, a feature that allows a 'borrowing of strength' across the counties and provides a substantial reduction in variation. The non-ignorable non-response models are generalizations of the ignorable non-response models, and therefore allow broader inference. The joint posterior density of the parameters for each model is complex, and therefore we fit each model using Markov chain Monte Carlo methods to obtain samples which are used to make inference about BMD and age. For each county we can estimate the proportion of individuals in each BMD-by-age cell of the categorical table, and we can assess the relation between BMD and age using the Bayes factor. A sensitivity analysis shows that there are differences (typically small) in inference across models that permit different levels of association between BMD and age. A simulation study shows that there is not much difference between the baseline ignorable and non-ignorable non-response models.

17.
In personalized medicine, it is often desired to determine whether all patients or only a subset of them benefit from a treatment. We consider estimation in two-stage adaptive designs that in stage 1 recruit patients from the full population. In stage 2, patient recruitment is restricted to the part of the population which, based on stage 1 data, benefits from the experimental treatment. Existing estimators, which adjust for using stage 1 data for selecting the part of the population from which stage 2 patients are recruited, as well as for the confirmatory analysis after stage 2, do not consider time-to-event patient outcomes. In this work, for time-to-event data, we derive a new asymptotically unbiased estimator for the log hazard ratio and a new interval estimator with good coverage probabilities and probabilities that the upper bounds are below the true values. The estimators are appropriate for several selection rules based on a single biomarker or multiple biomarkers, which can be categorical or continuous.

18.
Liu A, Wu C, Yu KF, Gehan E. Statistics in Medicine 2005; 24(7): 1009-1027.
We consider estimation of various probabilities after termination of a group sequential phase II trial. A motivating example is a phase II oncologic trial whose stopping rule is determined solely by response to a drug treatment, where at the end of the trial estimating the rates of toxicity and response is desirable. The conventional maximum likelihood estimator (the sample proportion) of a probability is shown to be biased, and two alternative estimators are proposed to correct for bias: a bias-reduced estimator obtained by using Whitehead's bias-adjusted approach, and an unbiased estimator from the Rao-Blackwell method of conditioning. All three estimation procedures are shown to have a certain invariance property in bias. Moreover, estimators of a probability and their bias and precision can be evaluated through the observed response rate and the stage at which the trial stops, thus avoiding extensive computation.

19.
Shen H, Brown LD, Zhi H. Statistics in Medicine 2006; 25(17): 3023-3038.
In this paper, the problem of interest is efficient estimation of log-normal means. Several existing estimators are reviewed first, including the sample mean, the maximum likelihood estimator, the uniformly minimum variance unbiased estimator and a conditional minimal mean squared error estimator. A new estimator is then proposed, and we show that it improves over the existing estimators in terms of squared error risk. The improvement is more significant with small sample sizes and large coefficients of variation, which are common in clinical pharmacokinetic (PK) studies. In addition, the new estimator is very easy to implement, and provides a simple alternative for summarizing PK data, which are usually modelled by log-normal distributions. We also propose a parametric bootstrap confidence interval for log-normal means around the new estimator and illustrate its good coverage properties with a simulation study. Our estimator is compared with the existing ones via theoretical calculations and applications to real PK studies.
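The plug-in MLE of a log-normal mean that the comparison above starts from is exp(mu_hat + sigma2_hat/2), computed on the log scale; a minimal simulated sketch (the paper's improved estimator shrinks this further and is not shown):

```python
import math
import random

def lognormal_mean_mle(xs):
    """Plug-in MLE of a log-normal mean: exp(mu_hat + sigma2_hat / 2),
    where mu_hat and sigma2_hat are the MLEs on the log scale."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    s2 = sum((l - mu) ** 2 for l in logs) / n   # MLE uses divisor n
    return math.exp(mu + s2 / 2)

# Simulated log-normal sample with mu = 0 and sigma = 1, so the true mean
# is exp(0.5), approximately 1.649.
random.seed(1)
xs = [math.exp(random.gauss(0.0, 1.0)) for _ in range(50_000)]
mle = lognormal_mean_mle(xs)
```

At large n this plug-in estimator and the sample mean both approach the true mean; the differences the paper studies show up at small n and large coefficients of variation.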

20.
Propensity score methods are increasingly used to estimate the effect of a treatment or exposure on an outcome in non-randomised studies. We focus on one such method, stratification on the propensity score, comparing it with the method of inverse-probability weighting by the propensity score. The propensity score--the conditional probability of receiving the treatment given observed covariates--is usually an unknown probability estimated from the data. Estimators for the variance of treatment effect estimates typically used in practice, however, do not take into account that the propensity score itself has been estimated from the data. By deriving the asymptotic marginal variance of the stratified estimate of treatment effect, correctly taking into account the estimation of the propensity score, we show that routinely used variance estimators are likely to produce confidence intervals that are too conservative when the propensity score model includes variables that predict (cause) the outcome, but only weakly predict the treatment. In contrast, a comparison with the analogous marginal variance for the inverse probability weighted (IPW) estimator shows that routinely used variance estimators for the IPW estimator are likely to produce confidence intervals that are almost always too conservative. Because exact calculation of the asymptotic marginal variance is likely to be complex, particularly for the stratified estimator, we suggest that bootstrap estimates of variance should be used in practice.
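A minimal sketch of the two point estimators being compared above, taking the propensity scores as already given and using simulated data with a true treatment effect of 1 (the paper's variance results concern the uncertainty of such estimates and are not reproduced here):

```python
import random

def ipw_ate(y, t, ps):
    """Inverse-probability-weighted estimate of the average treatment
    effect, given outcomes y, binary treatment t, and propensity scores ps."""
    n = len(y)
    treated = sum(ti * yi / p for yi, ti, p in zip(y, t, ps)) / n
    control = sum((1 - ti) * yi / (1 - p) for yi, ti, p in zip(y, t, ps)) / n
    return treated - control

def stratified_ate(y, t, ps, n_strata=5):
    """Stratification on the propensity score: size-weighted average of
    within-stratum treated-minus-control mean differences."""
    order = sorted(range(len(y)), key=lambda i: ps[i])
    est, total = 0.0, 0
    for s in range(n_strata):
        idx = order[s * len(y) // n_strata:(s + 1) * len(y) // n_strata]
        yt = [y[i] for i in idx if t[i] == 1]
        yc = [y[i] for i in idx if t[i] == 0]
        if yt and yc:
            est += len(idx) * (sum(yt) / len(yt) - sum(yc) / len(yc))
            total += len(idx)
    return est / total

# Simulated data with a confounder x: the true effect is 1, but the
# unadjusted difference in means is biased upward.
random.seed(2)
n = 40_000
x = [random.random() for _ in range(n)]
ps = [0.2 + 0.6 * xi for xi in x]
t = [1 if random.random() < p else 0 for p in ps]
y = [xi + 1.0 * ti + random.gauss(0.0, 0.5) for xi, ti in zip(x, t)]

unadjusted = (sum(yi for yi, ti in zip(y, t) if ti) / sum(t)
              - sum(yi for yi, ti in zip(y, t) if not ti) / (n - sum(t)))
ate_ipw = ipw_ate(y, t, ps)
ate_strat = stratified_ate(y, t, ps)
```

Both adjusted estimates land near the true effect of 1; the stratified version retains a small residual confounding bias because the propensity score varies within each quintile.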
