Similar Articles
Found 20 similar articles (search time: 15 ms)
1.
Array comparative genomic hybridization (aCGH) provides genome-wide information on DNA copy number that is potentially useful for disease classification. One immediate problem is that the data contain many features (probes) but only a few samples. Existing approaches to overcome this problem include feature selection, ridge regression, and partial least squares; however, these methods typically ignore the spatial character of aCGH data. To make explicit use of this spatial information, we develop a procedure called the smoothed logistic regression (SLR) model. The procedure is based on a mixed logistic regression model in which the random component is a mixture distribution that controls smoothness and sparseness. Conceptually such a procedure is straightforward, but its implementation is complicated by computational problems. We develop a fast and reliable iterative weighted least-squares algorithm based on the singular value decomposition. Simulated data and two real data sets are used to illustrate the procedure. For the real data sets, error rates are calculated using leave-one-out cross-validation. For both the simulated and real data examples, SLR achieves better misclassification error rates than previous methods. Copyright © 2009 John Wiley & Sons, Ltd.
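A minimal sketch of the computational core described above: iteratively reweighted least squares in which each weighted ridge solve is carried out through an SVD, so the many-probes/few-samples case stays cheap and stable. The plain ridge penalty `lam` stands in for the paper's mixture random-effect prior; all names are illustrative.

```python
import numpy as np

def iwls_ridge_logistic(X, y, lam=1.0, n_iter=100, tol=1e-8):
    """Ridge-penalized logistic regression by IWLS; each weighted ridge
    solve uses an SVD of the weighted design (stable when p >> n)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta
        p = 1.0 / (1.0 + np.exp(-eta))
        w = np.clip(p * (1.0 - p), 1e-10, None)   # IWLS weights
        z = eta + (y - p) / w                     # working response
        Xw = X * np.sqrt(w)[:, None]
        U, s, Vt = np.linalg.svd(Xw, full_matrices=False)
        # Ridge solution: V diag(s / (s^2 + lam)) U' (sqrt(w) z)
        beta_new = Vt.T @ (s / (s**2 + lam) * (U.T @ (np.sqrt(w) * z)))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta
```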

2.
In conventional survival analysis there is an underlying assumption that all study subjects are susceptible to the event. In general, this assumption does not adequately hold when investigating the time to an event other than death: owing to genetic and/or environmental etiology, some study subjects may never be susceptible to the disease. Analyzing nonsusceptibility has become an important topic in biomedical, epidemiological, and sociological research, and recent statistical studies have proposed several mixture models for right-censored data in regression analysis. In longitudinal studies, we often encounter left-, interval-, and right-censored data because of incomplete observation of the time endpoint, as well as possibly left-truncated data arising from the dissimilar entry ages of recruited healthy subjects. To analyze these kinds of incomplete data while accounting for nonsusceptibility and possible crossing hazards in the framework of mixture regression models, we use a logistic regression model to specify the probability of susceptibility, and a generalized gamma distribution, or a log-logistic distribution, in an accelerated failure time location-scale regression model to formulate the time to the event. Relative times of the conditional event-time distribution for susceptible subjects are expressed through the accelerated failure time location-scale submodel. We also construct graphical goodness-of-fit procedures based on the Turnbull–Frydman estimator and newly proposed residuals. Simulation studies were conducted to demonstrate the validity of the proposed estimation procedure. The mixture regression models are illustrated with alcohol abuse data from the Taiwan Aboriginal Study Project and hypertriglyceridemia data from the Cardiovascular Disease Risk Factor Two-township Study in Taiwan. Copyright © 2013 John Wiley & Sons, Ltd.
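To make the mixture structure concrete, here is a hedged sketch of the likelihood for the simplest member of this family: logistic incidence plus a log-logistic accelerated failure time latency model, with right censoring only (the paper additionally handles left/interval censoring and left truncation). The design matrices `Xc`, `Xs` and the parameter layout are illustrative.

```python
import numpy as np
from scipy.special import expit

def neg_loglik_cure(theta, t, delta, Xc, Xs):
    """Mixture cure negative log-likelihood: logistic incidence model,
    log-logistic AFT latency, right censoring only (delta = 1 for events).
    theta packs incidence coefficients, AFT coefficients, and log(scale);
    pass to scipy.optimize.minimize to fit."""
    pc = Xc.shape[1]
    gamma, beta = theta[:pc], theta[pc:-1]
    sigma = np.exp(theta[-1])
    pi = expit(Xc @ gamma)                      # P(susceptible | covariates)
    u = (np.log(t) - Xs @ beta) / sigma
    S_u = expit(-u)                             # latency survival (log-logistic)
    f_u = np.exp(u - 2.0 * np.logaddexp(0.0, u)) / (sigma * t)  # latency density
    lik = np.where(delta == 1, pi * f_u, 1.0 - pi + pi * S_u)
    return -np.sum(np.log(lik + 1e-300))
```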

3.
Ignoring the fact that the reference test used to establish the discriminative properties of a combination of diagnostic biomarkers is imperfect can lead to a biased estimate of the diagnostic accuracy of the combination. In this paper, we propose a Bayesian latent-class mixture model to select a combination of biomarkers that maximizes the area under the ROC curve (AUC) while taking into account the imperfect nature of the reference test. In particular, we develop a method for specifying the prior for the mixture component parameters that allows the amount of prior information provided for the AUC to be controlled. The properties of the model are evaluated in a simulation study and in an application to real data from Alzheimer's disease research. In the simulation study, 100 data sets are simulated for sample sizes ranging from 100 to 600 observations, with varying correlation between biomarkers; both an informative and a flat prior for the diagnostic accuracy of the reference test are investigated. In the real-data application, the proposed model is compared with the generally used logistic-regression model that ignores the imperfection of the reference test. For the sample sizes and prior distributions considered, the simulation results indicate satisfactory performance of the model-based estimates; in particular, the average estimates of all parameters are close to the true values. In the real-data application, AUC estimates for the proposed model are substantially higher than those from the 'traditional' logistic-regression model. Copyright © 2015 John Wiley & Sons, Ltd.
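A quick simulation (not the paper's Bayesian model) showing the bias the latent-class approach is designed to remove: scoring a biomarker combination against an imperfect reference test attenuates the apparent AUC relative to the true latent status. All numbers below are illustrative.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n, prev = 2000, 0.3
d = rng.binomial(1, prev, n)          # true (latent) disease status
x = rng.normal(1.2 * d, 1.0)          # biomarker-combination score
# Imperfect reference test: sensitivity 0.85, specificity 0.90
ref = np.where(d == 1, rng.binomial(1, 0.85, n), rng.binomial(1, 0.10, n))
print("AUC vs truth:    ", roc_auc_score(d, x))
print("AUC vs reference:", roc_auc_score(ref, x))  # attenuated toward 0.5
```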

4.
Postmastectomy survival rates are often based on the previous outcomes of large numbers of women who had the disease, but they do not accurately predict what will happen in any particular patient's case. Pathologic explanatory variables such as disease multifocality, tumor size, tumor grade, lymphovascular invasion, and enhanced lymph node staining are prognostically significant for predicting these survival rates. We propose a new cure rate survival regression model for predicting breast carcinoma survival in women who underwent mastectomy. We assume that the unknown number of competing causes that can influence the survival time follows a power series distribution, and that the time taken by tumor cells left active after the mastectomy to metastasize follows the beta-Weibull distribution. The new compounding regression model includes several well-known cure rate models discussed in the literature as special cases. The model parameters are estimated by maximum likelihood. Further, simulations are performed for different parameter settings, sample sizes, and censoring percentages. We derive the appropriate matrices for assessing local influence on the parameter estimates under different perturbation schemes and present some ways to assess local influence. The potential of the new regression model to accurately predict breast carcinoma mortality is illustrated with real data. Copyright © 2015 John Wiley & Sons, Ltd.
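As a worked special case, a sketch of the population survival function when the power series law for the number of competing causes is Poisson (the promotion-time special case) and activation times are beta-Weibull: then S_pop(t) = exp(-theta * F(t)), with cure fraction exp(-theta). Parameter names are illustrative.

```python
import numpy as np
from scipy.special import betainc

def beta_weibull_cdf(t, a, b, c, lam):
    """Beta-Weibull cdf: regularized incomplete beta evaluated at the
    Weibull cdf 1 - exp(-(t/lam)^c)."""
    return betainc(a, b, 1.0 - np.exp(-(t / lam) ** c))

def pop_survival_poisson(t, theta, a, b, c, lam):
    """Population survival under a Poisson number of competing causes
    (one member of the power-series family) with beta-Weibull activation
    times; the long-term survivor (cure) fraction is exp(-theta)."""
    return np.exp(-theta * beta_weibull_cdf(t, a, b, c, lam))
```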

5.
Forensic medicine is increasingly called upon to assess the age of individuals. Forensic age estimation is mostly required in relation to illegal immigration and the identification of bodies or skeletal remains. A variety of age estimation methods are based on dental samples and use regression models in which the age of an individual is predicted from morphological tooth changes that take place over time. From the medico-legal point of view, regression models with age as the dependent random variable entail that age tends to be overestimated in the young and underestimated in the old. To overcome this bias, we describe a new fully Bayesian calibration method (asymmetric Laplace Bayesian calibration) for forensic age estimation that uses the asymmetric Laplace distribution as the probability model. The method was compared with three existing approaches (two Bayesian and one classical) using simulated data. Although its accuracy was comparable with that of the other methods, asymmetric Laplace Bayesian calibration appears to be markedly more reliable and robust when the probability model is misspecified. The proposed method was also applied to a real dataset of pulp-chamber measurements of the right lower premolar taken from x-ray scans of individuals of known age. Copyright © 2015 John Wiley & Sons, Ltd.
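A minimal grid-based sketch of the calibration idea: treat age as the unknown, put a prior on it, and invert an asymmetric Laplace measurement model f(x | age) by Bayes' rule. It uses scipy's `laplace_asymmetric`; the linear location model and all numeric settings are made-up placeholders, not the paper's fit.

```python
import numpy as np
from scipy.stats import laplace_asymmetric

def calibrate_age(x_obs, ages, a=2.0, b=-0.03, kappa=1.5, scale=0.15):
    """Posterior over age for one tooth measurement: posterior(age)
    proportional to f(x_obs | age) * prior(age), asymmetric Laplace
    likelihood with location linear in age, flat prior on the grid."""
    loc = a + b * ages                                 # expected measurement
    like = laplace_asymmetric.pdf(x_obs, kappa, loc=loc, scale=scale)
    return like / (like.sum() * (ages[1] - ages[0]))   # normalized density

ages = np.linspace(15, 70, 551)
post = calibrate_age(0.9, ages)
print("posterior mean age:", (ages * post).sum() * (ages[1] - ages[0]))
```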

6.
We present a model for meta-regression in the presence of missing information on some of the study-level covariates, obtaining inferences using Bayesian methods. In practice, when confronted with missing covariate data in a meta-regression, it is common to carry out a complete-case or available-case analysis. We propose instead to use the full observed data, modelling the joint density as the factorization of a meta-regression model and a conditional factorization of the density for the covariates. With the inclusion of several covariates, inter-relations between these covariates are modelled. Under this joint likelihood-based approach, it is shown that the weaker assumption of the covariates being Missing At Random is imposed, instead of the more usual Missing Completely At Random (MCAR) assumption. The model is easily programmed in WinBUGS, and we examine, through the analysis of two real data sets, the sensitivity and robustness of the results to the MCAR assumption. Copyright © 2010 John Wiley & Sons, Ltd.
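A hedged sketch of the joint-likelihood factorization with one normal study-level covariate: studies with the covariate observed contribute f(y | x) f(x), while studies missing it (coded `np.nan`) contribute the analytically integrated marginal of y. This is a maximum-likelihood stand-in for intuition only (the paper fits the model in WinBUGS); the parameter layout is illustrative.

```python
import numpy as np
from scipy.stats import norm

def neg_joint_loglik(theta, y, se, x):
    """Joint likelihood for a normal meta-regression with one normal
    covariate; theta = (b0, b1, log tau, mu_x, log sd_x). Pass to
    scipy.optimize.minimize to fit."""
    b0, b1, ltau, mu_x, lsd = theta
    tau2, sdx2 = np.exp(2 * ltau), np.exp(2 * lsd)
    obs = ~np.isnan(x)
    ll = 0.0
    # Observed covariate: f(y | x) * f(x)
    ll += norm.logpdf(y[obs], b0 + b1 * x[obs],
                      np.sqrt(se[obs] ** 2 + tau2)).sum()
    ll += norm.logpdf(x[obs], mu_x, np.sqrt(sdx2)).sum()
    # Missing covariate: marginal of y after integrating x out
    ll += norm.logpdf(y[~obs], b0 + b1 * mu_x,
                      np.sqrt(se[~obs] ** 2 + tau2 + b1 ** 2 * sdx2)).sum()
    return -ll
```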

7.
This paper presents a Bayesian model for the meta-analysis of sparse discrete binomial data, which lie outside the scope of the usual hierarchical normal random-effect models. Treatment effectiveness data are often of this type. The crucial linking distribution between the effectiveness conditional on the healthcare center and the unconditional effectiveness is constructed from specific bivariate classes of distributions with given marginals. This ensures coherency between the marginal and conditional prior distributions used in the analysis. Further, we impose a bivariate class of priors that can accommodate a wide range of degrees of heterogeneity between the multicenter clinical trials involved. Applications to real multicenter data are given and compared with previous meta-analyses. Copyright © 2014 John Wiley & Sons, Ltd.

8.
In many practical applications, count data exhibit greater or less variability than is allowed by the equality of mean and variance, referred to as overdispersion or underdispersion, and several mechanisms, such as zero inflation and mixture structure, can lead to it. Moreover, if the count data follow a generalized Poisson or a negative binomial distribution, which accommodate extra variation not explained by a simple Poisson or binomial model, extra dispersion also arises. In this paper, we deal with a class of two-component zero-inflated generalized Poisson mixture regression models for such data and propose local influence measures for model comparison and statistical diagnostics. We first develop a general model framework that unifies zero inflation, mixture, and over/underdispersion simultaneously; we then investigate two types of perturbation schemes, global and individual, for perturbing various model assumptions and detecting influential observations, and we obtain the corresponding local influence measures. Our method is novel for count data analysis and can be used to explore essential issues such as zero inflation, mixture, and dispersion in zero-inflated generalized Poisson mixture models. On the basis of the model-comparison results, sensitivity analyses of the perturbations and more accurate hypothesis tests can then be conducted. Finally, a simulation study and a real example illustrate the proposed local influence measures. Copyright © 2012 John Wiley & Sons, Ltd.
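For reference, a sketch of the base distribution this model class is built on: the zero-inflated generalized Poisson log-pmf, where `lam` > 0 gives overdispersion and `lam` < 0 underdispersion relative to Poisson. The two-component mixture regression and the influence diagnostics sit on top of this; the parameterization below is one common convention, not necessarily the paper's.

```python
import numpy as np
from scipy.special import gammaln

def zigp_logpmf(y, theta, lam, omega):
    """Zero-inflated generalized Poisson log-pmf. GP(theta, lam) has mean
    theta / (1 - lam) and variance theta / (1 - lam)**3; omega is the
    zero-inflation probability."""
    gp = (np.log(theta) + (y - 1) * np.log(theta + lam * y)
          - (theta + lam * y) - gammaln(y + 1))
    return np.where(y == 0,
                    np.log(omega + (1 - omega) * np.exp(-theta)),
                    np.log(1 - omega) + gp)
```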

9.
Rate differences are an important effect measure in biostatistics and provide an alternative perspective to rate ratios. When the data are event counts observed during an exposure period, adjusted rate differences may be estimated using an identity-link Poisson generalised linear model, also known as additive Poisson regression. A problem with this approach is that the assumption of equality of mean and variance rarely holds in real data, which often show overdispersion. An additive negative binomial model is the natural alternative to account for this; however, standard model-fitting methods are often unable to cope with the constrained parameter space arising from the non-negativity restrictions of the additive model. In this paper, we propose a novel solution to this problem using a variant of the expectation–conditional maximisation either (ECME) algorithm. Our method provides a reliable way to fit an additive negative binomial regression model and also permits flexible generalisations using semi-parametric regression functions. We illustrate the method using a placebo-controlled clinical trial of fenofibrate treatment in patients with type II diabetes, where the outcome is the number of laser therapy courses administered to treat diabetic retinopathy. An R package is available that implements the proposed method. Copyright © 2016 John Wiley & Sons, Ltd.
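A crude stand-in for the fitting problem (not the paper's ECME algorithm): maximize the identity-link negative binomial likelihood directly, with box constraints keeping the additive coefficients non-negative, which keeps the mean in the parameter space when all covariates are non-negative. Everything below is an illustrative sketch.

```python
import numpy as np
from scipy.special import gammaln
from scipy.optimize import minimize

def fit_additive_nb(X, y):
    """Additive (identity-link) NB2 regression: mean mu = X @ beta with
    beta >= 0 enforced via L-BFGS-B bounds; r is the NB dispersion."""
    def nll(par):
        beta, r = par[:-1], np.exp(par[-1])
        mu = X @ beta + 1e-10
        return -np.sum(gammaln(y + r) - gammaln(r) - gammaln(y + 1)
                       + r * np.log(r / (r + mu)) + y * np.log(mu / (r + mu)))
    p = X.shape[1]
    x0 = np.append(np.full(p, y.mean() / max(p, 1)), 0.0)
    bounds = [(0, None)] * p + [(None, None)]   # beta >= 0, log r free
    return minimize(nll, x0, bounds=bounds, method="L-BFGS-B")
```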

10.
We provide a simple and practical, yet flexible, penalized estimation method for a Cox proportional hazards model with current status data. We approximate the baseline cumulative hazard function by monotone B-splines and use a hybrid approach based on the Fisher scoring algorithm and isotonic regression to compute the penalized estimates. We show that the penalized estimator of the nonparametric component achieves the optimal rate of convergence under some smoothness conditions and that the estimators of the regression parameters are asymptotically normal and efficient. Moreover, a simple variance estimation method is considered for inference on the regression parameters. We perform two extensive Monte Carlo studies to evaluate the finite-sample performance of the penalized approach and compare it with three competing R packages: C1.coxph, intcox, and ICsurv. A goodness-of-fit test and model diagnostics are also discussed. The methodology is illustrated with two real applications.
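To fix ideas, a hedged sketch of the current-status likelihood being maximized, with the monotone B-spline baseline replaced by a nondecreasing step function (a cumulative sum of positive jumps on a knot grid, which trivially enforces monotonicity). The knot grid is assumed to start at or below the smallest monitoring time; names are illustrative.

```python
import numpy as np

def neg_loglik_cs(par, c, delta, X, knots):
    """Current-status Cox negative log-likelihood: delta_i = 1 if the event
    occurred by monitoring time c_i; S(c | x) = exp(-Lam0(c) * exp(x'beta))
    with Lam0 a nondecreasing step function on `knots`."""
    p = X.shape[1]
    beta, jumps = par[:p], np.exp(par[p:])
    Lam_knots = np.cumsum(jumps)                 # nondecreasing by construction
    Lam = Lam_knots[np.searchsorted(knots, c, side="right") - 1]
    S = np.exp(-Lam * np.exp(X @ beta))          # P(T > c | x)
    eps = 1e-12
    return -np.sum(delta * np.log(1 - S + eps) + (1 - delta) * np.log(S + eps))
```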

11.
Mean-based semi-parametric regression models such as the popular generalized estimating equations are widely used to improve robustness of inference over parametric models. Unfortunately, such models are quite sensitive to outlying observations. Wilcoxon-score-based rank regression (RR) provides estimates that are more robust to outliers than generalized estimating equations. However, RR and its extensions do not adequately address missing data arising in longitudinal studies. In this paper, we propose a new approach to addressing outliers under a different framework based on functional response models. This functional-response-model-based alternative not only addresses the limitations of RR and its extensions for longitudinal data but, with its rank-preserving property, provides even more robust estimates than these alternatives. The proposed approach is illustrated with both real and simulated data. Copyright © 2016 John Wiley & Sons, Ltd.
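For context, a sketch of the rank-regression baseline the paper improves upon: Jaeckel's rank-based dispersion with Wilcoxon scores, whose minimizer over beta is the RR estimate. The dispersion is piecewise linear, so a derivative-free optimizer is appropriate. This is the classical RR, not the paper's functional-response-model estimator.

```python
import numpy as np
from scipy.stats import rankdata
from scipy.optimize import minimize

def jaeckel_dispersion(beta, X, y):
    """Jaeckel's dispersion D(beta) = sum a(R(e_i)) e_i with Wilcoxon
    scores a(i) = sqrt(12) * (i / (n + 1) - 1/2)."""
    e = y - X @ beta
    a = np.sqrt(12) * (rankdata(e) / (len(e) + 1) - 0.5)
    return np.sum(a * e)

# RR estimate (illustrative call):
# beta_hat = minimize(jaeckel_dispersion, x0, args=(X, y),
#                     method="Nelder-Mead").x
```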

12.
This paper presents a new Bayesian methodology for identifying a transition period in the development of resistance to an antiretroviral drug or therapy in HIV/AIDS studies and related fields. Estimating such a transition period requires longitudinal data in which growth trajectories of a response variable tend to exhibit a gradual change from a declining trend to an increasing trend, rather than an abrupt change. We assess this clinically important feature of longitudinal HIV/AIDS data using the bent-cable framework within a growth mixture Tobit model. To account for heterogeneity of drug resistance among subjects, the parameters of the bent-cable growth mixture Tobit model are also allowed to differ across subgroups (subpopulations) of patients classified into latent classes on the basis of trajectories of observed viral load data with skewness and left-censoring. The proposed methods are illustrated using real data from an AIDS clinical study. Copyright © 2016 John Wiley & Sons, Ltd.
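The bent-cable mean function at the heart of this framework is easy to state: two linear phases joined by a smooth quadratic bend of half-width gamma centered at the transition point tau. A minimal sketch:

```python
import numpy as np

def bent_cable(t, b0, b1, b2, tau, gamma):
    """Bent-cable trajectory: linear for t < tau - gamma, quadratic bend on
    [tau - gamma, tau + gamma], linear again for t > tau + gamma; the
    gradual (rather than abrupt) change used to date resistance onset."""
    q = np.where(t < tau - gamma, 0.0,
        np.where(t > tau + gamma, t - tau,
                 (t - tau + gamma) ** 2 / (4 * gamma)))
    return b0 + b1 * t + b2 * q
```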

13.
Proportion data with support on the interval [0,1] are commonplace in various domains of medicine and public health. When these data are available as clusters, it is important to correctly incorporate the within-cluster correlation to improve estimation efficiency when conducting regression-based risk evaluation. Furthermore, covariates may exhibit a nonlinear relationship with the (proportion) responses when quantifying disease status. As an alternative to various existing classical methods for modeling proportion data (such as augmented beta regression) that use maximum likelihood or generalized estimating equations, we develop a partially linear additive model based on the quadratic inference function. Relying on quasi-likelihood estimation techniques and polynomial spline approximation of the unknown nonparametric functions, we obtain estimators for both the parametric and nonparametric parts of our model and study their large-sample theoretical properties. We illustrate the advantages and usefulness of our proposal over the alternatives via extensive simulation studies and an application to a real dataset from a clinical periodontal study.
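A hedged sketch of the quadratic inference function machinery, written for a linear marginal model with the standard working-correlation basis {identity, all-ones off-diagonal}: Q(beta) = N * gbar' C^{-1} gbar, where gbar averages the extended scores across clusters. The paper couples this with spline bases for the additive terms and a proportion-scale mean model; both are omitted here for brevity.

```python
import numpy as np

def qif_objective(beta, X_clusters, y_clusters):
    """QIF for a linear marginal model; minimize over beta to estimate it.
    X_clusters / y_clusters are lists of per-cluster arrays."""
    gs = []
    for X, y in zip(X_clusters, y_clusters):
        r = y - X @ beta                        # working residuals
        m = len(y)
        M1, M2 = np.eye(m), np.ones((m, m)) - np.eye(m)
        gs.append(np.concatenate([X.T @ M1 @ r, X.T @ M2 @ r]))
    G = np.array(gs)
    gbar = G.mean(axis=0)
    C = G.T @ G / len(G)                        # sample covariance of scores
    return len(G) * gbar @ np.linalg.solve(C, gbar)
```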

14.
In real life, and somewhat contrary to biostatistical textbook knowledge, the sensitivity and specificity (and not only the predictive values) of diagnostic tests can vary with the underlying prevalence of disease. In meta-analysis of diagnostic studies, accounting for this fact naturally leads to a trivariate expansion of the traditional bivariate logistic regression model with random study effects. In this paper, a new model is proposed using trivariate copulas and beta-binomial marginal distributions for sensitivity, specificity, and prevalence as an expansion of the bivariate model. Two different copulas are used: the trivariate Gaussian copula and a trivariate vine copula based on the bivariate Plackett copula. This model has a closed-form likelihood, so standard software (e.g., SAS PROC NLMIXED) can be used. The results of a simulation study show that the copula models perform at least as well as, and frequently better than, the standard model. The methods are illustrated by two examples. Copyright © 2015 John Wiley & Sons, Ltd.
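A sketch of the study-level generative model in the Gaussian-copula case: draw correlated (sensitivity, specificity, prevalence) triples by pushing trivariate normal samples through the normal cdf and then through Beta quantile functions. The correlation matrix and Beta parameters below are illustrative values, not fitted ones.

```python
import numpy as np
from scipy.stats import norm, beta

def sample_sens_spec_prev(n, R, a, b, rng=None):
    """Gaussian copula with Beta marginals: R is the 3x3 copula
    correlation; a, b are length-3 Beta parameter vectors."""
    rng = rng or np.random.default_rng(1)
    z = rng.multivariate_normal(np.zeros(3), R, size=n)
    u = norm.cdf(z)                               # uniform margins (copula)
    return np.column_stack([beta.ppf(u[:, j], a[j], b[j]) for j in range(3)])

R = np.array([[1.0, -0.3, 0.2], [-0.3, 1.0, -0.1], [0.2, -0.1, 1.0]])
draws = sample_sens_spec_prev(1000, R, a=[8, 9, 2], b=[2, 1, 6])
```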

15.
The problem of analyzing a continuous variable with a discrete component is addressed within the framework of the mixture model proposed by Moulton and Halsey (Biometrics 1995; 51:1570-1578). The model can be generalized by the introduction of the log-skew-normal distribution for the continuous component, and the fit can be significantly improved by its use, while retaining the interpretation of regression parameter estimates. Simulation studies and application to a real data set are used for demonstration.
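A minimal sketch of the generalized mixture density: a point mass at the discrete component (coded y == 0 here, e.g. values below an assay detection limit, as in Moulton and Halsey's setting) plus a log-skew-normal continuous part, i.e. a skew-normal on the log scale with the 1/y Jacobian.

```python
import numpy as np
from scipy.stats import skewnorm

def lsn_mixture_logpdf(y, p0, xi, omega, alpha):
    """Mixture log-density: P(point component) = p0 at y == 0; for y > 0,
    (1 - p0) times the log-skew-normal density with shape alpha."""
    y = np.asarray(y, dtype=float)
    out = np.full(y.shape, np.log(p0))            # point component
    pos = y > 0
    ly = np.log(y[pos])
    out[pos] = (np.log(1 - p0)
                + skewnorm.logpdf(ly, alpha, loc=xi, scale=omega) - ly)
    return out
```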

16.
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
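For the univariate case, the refinement is small enough to state in a few lines: scale the pooled estimate's variance by a data-driven factor and use t rather than normal quantiles (a Hartung-Knapp-type adjustment). The multivariate extension is the paper's contribution and is not shown; tau2 is assumed pre-estimated (e.g., by REML).

```python
import numpy as np
from scipy.stats import t as t_dist

def refined_meta(y, se, tau2):
    """Univariate refined random-effects meta-analysis: inverse-variance
    pooling with a scaling factor on the standard error and a t interval."""
    w = 1.0 / (se ** 2 + tau2)
    mu = np.sum(w * y) / np.sum(w)
    k = len(y)
    q = np.sum(w * (y - mu) ** 2) / (k - 1)      # scaling factor
    se_mu = np.sqrt(q / np.sum(w))
    half = t_dist.ppf(0.975, k - 1) * se_mu
    return mu, (mu - half, mu + half)
```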

17.
In randomized clinical trials, it is common that patients stop taking their assigned treatments and then switch to a standard treatment (the standard of care available to the patient) rather than to the treatments under investigation. Although only limited retrieved data are available on patients who switch to the standard treatment, these so-called off-protocol data can be highly valuable in assessing the treatment effect associated with the experimental therapy; however, they lead to a complex data structure requiring models that link the information in the per-protocol data with the off-protocol data. In this paper, we develop a novel Bayesian method to jointly model longitudinal treatment measurements under various dropout scenarios. Specifically, we propose a multivariate normal mixed-effects model for repeated measurements from the assigned treatments and the standard treatment, a multivariate logistic regression model for stopping the assigned treatments, logistic regression models for starting a standard treatment off protocol, and a conditional multivariate logistic regression model for complete withdrawal from the study. We assume that withdrawal from the study is non-ignorable, whereas intermittent missingness is assumed to be at random. We examine various properties of the proposed model and develop an efficient Markov chain Monte Carlo sampling algorithm. We analyze in detail a real dataset from a clinical trial using the proposed method. Copyright © 2013 John Wiley & Sons, Ltd.

18.
Increasing evidence suggests that rare and generally deleterious genetic variants may have a strong impact on disease risk, not only for Mendelian diseases but also for many common diseases. However, identifying such rare variants remains challenging, and novel statistical methods and bioinformatic software must be developed and extensively evaluated under reasonable genetic models. Although genomic data are abundant, they are of limited help in evaluating these methods because the underlying disease mechanism is unknown. Thus, it is imperative to simulate genomic data that mimic real data containing rare variants and that allow a known disease penetrance model to be imposed. Although resampling-based simulation methods have shown advantages in computational efficiency and in preserving important properties such as linkage disequilibrium (LD) and allele frequency, they still have limitations, as we demonstrate. We propose an algorithm that combines regression-based imputation with resampling to simulate genetic data with both rare and common variants. A logistic regression model was employed to fit the relationship between a rare variant and its nearby common variants in the 1000 Genomes Project data and then applied to the real data to fill in one rare variant at a time using the fitted logistic model based on common variants. Individuals were then simulated using the real data with imputed rare variants. We compared our method with existing simulators and demonstrate that it performs well in qualitatively retaining the properties of the real sample, such as LD and minor allele frequency.
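A hedged sketch of the imputation step described above: fit a logistic regression of one rare variant's carrier status on nearby common variants in a reference panel, then fill that variant into the target sample by drawing from the fitted probabilities. Function and variable names are illustrative; scikit-learn is used for brevity, and the reference panel is assumed to contain at least one carrier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def impute_rare_variant(common_ref, rare_ref, common_target, rng=None):
    """Regression-based fill-in of one rare variant: train on a reference
    panel (e.g. 1000 Genomes genotypes), then draw carrier status in the
    target sample from the fitted probabilities (stochastic, not hard,
    assignment preserves variability)."""
    rng = rng or np.random.default_rng()
    model = LogisticRegression(max_iter=1000).fit(common_ref, rare_ref)
    p = model.predict_proba(common_target)[:, 1]   # P(carrier | common SNPs)
    return rng.binomial(1, p)
```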

19.
DNA methylation is a key epigenetic mark involved in both normal development and disease progression. Recent advances in high-throughput technologies have enabled genome-wide profiling of DNA methylation. However, DNA methylation profiling often employs different designs and platforms with varying resolution, which hinders joint analysis of methylation data from multiple platforms. In this study, we propose a penalized functional regression model to impute missing methylation data. By incorporating functional predictors, our model utilizes information from nonlocal probes to improve imputation quality. We compared the performance of our functional model with linear regression and the best single-probe surrogate in real data and via simulations. Specifically, we applied the different imputation approaches to an acute myeloid leukemia dataset consisting of 194 samples; our method showed higher imputation accuracy, manifested, for example, by a 94% relative increase in information content and up to 86% more CpG sites passing post-imputation filtering. Our simulated association study further demonstrated that our method substantially improves the statistical power to identify trait-associated methylation loci. These findings indicate that the penalized functional regression model is a convenient and valuable imputation tool for methylation data that can boost statistical power in downstream epigenome-wide association studies (EWAS).
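A stripped-down sketch of the idea: summarize each sample's surrounding methylation curve by projecting observed probes onto a smooth basis, then regress the missing probe on those functional scores with a ridge penalty. A simple Gaussian-bump basis stands in for the B-splines, and scikit-learn's `Ridge` for the paper's penalized functional regression; all names are illustrative.

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_functional_imputer(M_obs, pos, target_idx, n_basis=15, alpha=1.0):
    """M_obs: samples x probes methylation matrix; pos: genomic positions;
    target_idx: the probe to impute. Returns a fitted ridge model mapping
    functional scores of the remaining probes to the target probe."""
    centers = np.linspace(pos.min(), pos.max(), n_basis)
    width = (pos.max() - pos.min()) / n_basis
    B = np.exp(-0.5 * ((pos[:, None] - centers[None, :]) / width) ** 2)
    keep = np.ones(len(pos), dtype=bool)
    keep[target_idx] = False
    scores = M_obs[:, keep] @ B[keep]          # per-sample functional summaries
    return Ridge(alpha=alpha).fit(scores, M_obs[:, target_idx])
```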

20.
Interval-censored data occur naturally in many fields; their main feature is that the failure time of interest is not observed exactly but is known to fall within some interval. In this paper, we propose a semiparametric probit model for analyzing case 2 interval-censored data as an alternative to the existing semiparametric models in the literature. Specifically, we propose to approximate the unknown nonparametric nondecreasing function in the probit model with a linear combination of monotone splines, leaving only a finite number of parameters to estimate. Both maximum likelihood and Bayesian estimation methods are proposed; for each, the regression parameters and the baseline survival function are estimated jointly. The proposed methods make no assumptions about the observation process, are applicable to any interval-censored data, and are easy to implement. The methods are evaluated by simulation studies and illustrated by two real-life interval-censored data applications. Copyright © 2010 John Wiley & Sons, Ltd.
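To make the likelihood concrete, a hedged sketch with the monotone-spline function replaced by a free level plus a cumulative sum of positive jumps on a knot grid (one jump parameter per knot after the first), which is a crude step-function stand-in that still enforces monotonicity. Code L = 0 for left-censored and R = np.inf for right-censored subjects; names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def neg_loglik_probit_ic(par, L, R, X, knots):
    """Case 2 interval-censored probit likelihood:
    P(L < T <= R | x) = Phi(a(R) + x'b) - Phi(a(L) + x'b), with a(.)
    nondecreasing; par = (beta, level a0, log-jumps)."""
    p = X.shape[1]
    beta, a0, jumps = par[:p], par[p], np.exp(par[p + 1:])
    a_knots = a0 + np.concatenate([[0.0], np.cumsum(jumps)])  # len(knots) values
    def a(t):                                   # step evaluation of a(.)
        idx = np.searchsorted(knots, np.minimum(t, knots[-1]), side="right") - 1
        val = a_knots[np.maximum(idx, 0)]
        return np.where(t <= 0, -np.inf, np.where(np.isinf(t), np.inf, val))
    xb = X @ beta
    prob = norm.cdf(a(R) + xb) - norm.cdf(a(L) + xb)
    return -np.sum(np.log(prob + 1e-12))
```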
