Similar articles
20 similar articles found (search time: 15 ms)
1.
Bivariate multinomial data such as the left and right eyes retinopathy status data are analyzed either by using a joint bivariate probability model or by exploiting certain odds ratio‐based association models. However, the joint bivariate probability model yields marginal probabilities, which are complicated functions of marginal and association parameters for both variables, and the odds ratio‐based association model treats the odds ratios involved in the joint probabilities as ‘working’ parameters, which are consequently estimated through certain arbitrary ‘working’ regression models. Also, this latter odds ratio‐based model does not provide any easy interpretation of the correlations between two categorical variables. On the basis of pre‐specified marginal probabilities, in this paper, we develop a bivariate normal type linear conditional multinomial probability model to understand the correlations between two categorical variables. The parameters involved in the model are consistently estimated using the optimal likelihood and generalized quasi‐likelihood approaches. The proposed model and the inferences are illustrated through an intensive simulation study as well as an analysis of the well‐known Wisconsin Diabetic Retinopathy status data. Copyright © 2014 John Wiley & Sons, Ltd.

2.
We discuss maximum likelihood methods for analysing binary responses measured at two times, such as in a cross-over design. We construct a 2 x 2 table for each individual with cell probabilities corresponding to the cross-classification of the responses at the two times; the underlying likelihood for each individual is multinomial with four cells. The three-dimensional parameter space of the multinomial distribution is completely specified by the two marginal probabilities of success of the 2 x 2 table and an association parameter between the binary responses at the two times. We examine a logistic model for the marginal probabilities of the 2 x 2 table for individual i; the association parameters we consider are the correlation coefficient, the odds ratio and the relative risk. Simulations show that the parameter estimates for the logistic regression model for the marginal probabilities are not very sensitive to the parameters used to describe the association between the binary responses at the two times. Thus, we suggest choosing the measure of association for ease of interpretation.
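The parameterisation described in this abstract can be made concrete with a minimal sketch: it rebuilds the four cell probabilities of a 2 x 2 table from the two marginal success probabilities and an odds ratio, using the standard closed-form (Plackett-type) solution of the resulting quadratic. The function name `cells_from_margins_and_or` and the example values are illustrative assumptions and are not taken from the paper.

```python
import math


def cells_from_margins_and_or(p1, p2, psi):
    """Return (p11, p10, p01, p00) for a 2 x 2 table.

    p1, p2: marginal success probabilities at times 1 and 2.
    psi:    odds ratio p11 * p00 / (p10 * p01); must be positive.
    """
    if psi <= 0:
        raise ValueError("odds ratio must be positive")
    if abs(psi - 1.0) < 1e-12:
        # Odds ratio of 1 means the two responses are independent.
        p11 = p1 * p2
    else:
        # Solve (psi - 1) * p11^2 - a * p11 + psi * p1 * p2 = 0 for p11.
        a = 1.0 + (p1 + p2) * (psi - 1.0)
        disc = a * a - 4.0 * psi * (psi - 1.0) * p1 * p2
        p11 = (a - math.sqrt(disc)) / (2.0 * (psi - 1.0))
    p10 = p1 - p11
    p01 = p2 - p11
    p00 = 1.0 - p1 - p2 + p11
    return p11, p10, p01, p00


# Hypothetical example: marginals 0.3 and 0.4, odds ratio 2.
p11, p10, p01, p00 = cells_from_margins_and_or(0.3, 0.4, 2.0)
print(p11, p10, p01, p00)
```

The same three numbers (two marginals and one association parameter) could instead use a correlation coefficient or a relative risk; only the mapping to `p11` would change.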

3.
In typical clinical trials or epidemiologic studies of a bilateral eye disease, the primary outcome data consist of pairs of ordered categorical responses that tend to be highly correlated. In such studies, interest often centres on associating the outcome data with a grouping variable such as the treatment indicator or the exposure status and other person- and eye-specific covariates. In this paper, we propose a latent variable regression model to analyse such bivariate ordered categorical data. We use as a joint distribution for bivariate latent random variables the cross ratio distribution proposed by Plackett, which results in modelling the dependency between the fellow eyes with the global odds ratio. We illustrate the proposed model with data from the Wisconsin Epidemiologic Study of Diabetic Retinopathy, a study that seeks to identify risk factors among younger-onset diabetics.

4.
Multivariate binary responses from the same subject are usually correlated. For example, malnutrition of children is usually measured using ‘stunting’ (low height-for-age) and ‘wasting’ (low weight-for-age) calculated from their height, weight and age, and hence the status of being stunted may depend on the status of being wasted and vice versa. For analyzing such malnutrition data, one needs special statistical models allowing for dependence between the responses to avoid misleading inference. The problem of dependence in multivariate binary responses is generally addressed by using marginal models with generalized estimating equations. However, using the marginal models alone, it is difficult to specify the measures of dependence between the responses precisely. Islam et al. (J Appl Stat 40(5):1064–1075, 2013) proposed a joint modeling approach for bivariate binary responses using both the conditional and marginal models where the dependence between the responses can be measured and tested using a link function of the models. However, the authors did not examine the properties of the regression coefficients except for the dependence parameter. This paper gives further insight into the joint model and investigates the properties of the regression coefficients using an extensive simulation study. The simulation results showed that the maximum likelihood estimators (MLEs) of the regression coefficients of the joint model performed well in terms of bias, mean squared error and coverage probability, particularly when the sample size is large. Generally speaking, the MLEs of the parameters associated with joint models possessed the same asymptotic properties as the MLEs of those associated with standard generalized linear models, except for the interpretations. Further, the paper provides an application of the joint model for analyzing malnutrition data from the Bangladesh Demographic and Health Survey 2011. The results revealed that the estimates of both the marginal and conditional regression coefficients of the joint model have meaningful interpretations, which will in turn help policy makers design appropriate policies for improving nutrition status.

5.
PURPOSE: Dependent binary responses, such as health outcomes in twin pairs or siblings, frequently arise in perinatal epidemiologic research. This gives rise to correlated data, which must be taken into account during analysis to avoid erroneous statistical and biological inferences. METHODS: An analysis of perinatal mortality (fetal deaths plus deaths within the first 28 days) in twins in relation to cluster-varying (those that are unique to each fetus within a twin pregnancy such as birthweight) and cluster-constant (those that are identical for both twins within a sibship such as maternal smoking status) risk factors is presented. Marginal (ordinary logistic regression [OLR] and logistic regression using generalized estimating equations [GEE]) and cluster-specific (conditional and random-intercept logistic regression models) regression models are fit and their results contrasted. The United States "matched multiple data" file of twin births (1995-1997), which includes 285,226 twins from 142,613 pregnancies, was used to examine the implications of ignoring clustering on regression inferences. RESULTS: The OLR models provide variance estimates for cluster-constant covariates that ranged from 7% to 71% smaller than those from GEE-based models. This underestimation is even more pronounced for some cluster-varying covariates, ranging from 21% to 198%. CONCLUSIONS: Ignoring the cluster dependency is likely to affect the precision of covariate effects and consequently interpretation of results. With widespread availability of appropriate software, statistical methods for taking the intracluster dependency into account are easily implemented and necessary.

6.
Bivariate copula regression allows for the flexible combination of two arbitrary, continuous marginal distributions with regression effects being placed on potentially all parameters of the resulting bivariate joint response distribution. Motivated by the risk factors for adverse birth outcomes, many of which are dichotomous, we consider mixed binary-continuous responses that extend the bivariate continuous framework to the situation where one response variable is discrete (more precisely, binary) whereas the other response remains continuous. Utilizing the latent continuous representation of binary regression models, we implement a penalized likelihood-based approach for the resulting class of copula regression models and employ it in the context of modeling gestational age and the presence/absence of low birth weight. The analysis demonstrates the advantage of the flexible specification of regression impacts including nonlinear effects of continuous covariates and spatial effects. Our results imply that racial and spatial inequalities in the risk factors for infant mortality are even greater than previously suggested.

7.
Bivariate observations of binary and ordinal data arise frequently and require a bivariate modeling approach in cases where one is interested in aspects of the marginal distributions as separate outcomes along with the association between the two. We consider methods for constructing such bivariate models based on latent variables with logistic marginals and propose a model based on the Ali-Mikhail-Haq bivariate logistic distribution. We motivate the model as an extension of that based on the Gumbel type 2 distribution as considered by other authors and as a bivariate extension of the logistic distribution, which preserves certain natural characteristics. Basic properties of the obtained model are studied and the proposed methods are illustrated through analysis of two data sets: a basic science cognitive experiment of visual recognition and awareness and a clinical data set describing assessments of walking disability among multiple sclerosis patients.

8.
In genome‐wide association studies of binary traits, investigators typically use logistic regression to test common variants for disease association within studies, and combine association results across studies using meta‐analysis. For common variants, logistic regression tests are well calibrated, and meta‐analysis of study‐specific association results is only slightly less powerful than joint analysis of the combined individual‐level data. In recent sequencing and dense chip based association studies, investigators increasingly test low‐frequency variants for disease association. In this paper, we seek to (1) identify the association test with maximal power among tests with well controlled type I error rate and (2) compare the relative power of joint and meta‐analysis tests. We use analytic calculation and simulation to compare the empirical type I error rate and power of four logistic regression based tests: Wald, score, likelihood ratio, and Firth bias‐corrected. We demonstrate for low‐count variants (roughly minor allele count [MAC] < 400) that: (1) for joint analysis, the Firth test has the best combination of type I error and power; (2) for meta‐analysis of balanced studies (equal numbers of cases and controls), the score test is best, but is less powerful than Firth test based joint analysis; and (3) for meta‐analysis of sufficiently unbalanced studies, all four tests can be anti‐conservative, particularly the score test. We also establish MAC as the key parameter determining test calibration for joint and meta‐analysis.

9.
We explore the ‘reassessment’ design in a logistic regression setting, where a second wave of sampling is applied to recover a portion of the missing data on a binary exposure and/or outcome variable. We construct a joint likelihood function based on the original model of interest and a model for the missing data mechanism, with emphasis on non‐ignorable missingness. The estimation is carried out by numerical maximization of the joint likelihood function with close approximation of the accompanying Hessian matrix, using sharable programs that take advantage of general optimization routines in standard software. We show how likelihood ratio tests can be used for model selection and how they facilitate direct hypothesis testing for whether missingness is at random. Examples and simulations are presented to demonstrate the performance of the proposed method. Copyright © 2015 John Wiley & Sons, Ltd.

10.
Nonresponses and missing data are common in observational studies. Ignoring or inadequately handling missing data may lead to biased parameter estimation, incorrect standard errors and, as a consequence, incorrect statistical inference and conclusions. We present a strategy for modelling non‐ignorable missingness where the probability of nonresponse depends on the outcome. Using a simple case of logistic regression, we quantify the bias in regression estimates and show the observed likelihood is non‐identifiable under a non‐ignorable missing data mechanism. We then adopt a selection model factorisation of the joint distribution as the basis for a sensitivity analysis to study changes in estimated parameters and the robustness of study conclusions against different assumptions. A Bayesian framework for model estimation is used as it provides a flexible approach for incorporating different missing data assumptions and conducting sensitivity analysis. Using simulated data, we explore the performance of the Bayesian selection model in correcting for bias in a logistic regression. We then implement our strategy using survey data from the 45 and Up Study to investigate factors associated with worsening health from the baseline to follow‐up survey. Our findings have practical implications for the use of the 45 and Up Study data to answer important research questions relating to health and quality‐of‐life. Copyright © 2017 John Wiley & Sons, Ltd.

11.
We propose a new weighted hurdle regression method for modeling count data, with particular interest in modeling cardiovascular events in patients on dialysis. Cardiovascular disease remains one of the leading causes of hospitalization and death in this population. Our aim is to jointly model the relationship/association between covariates and (i) the probability of cardiovascular events, a binary process, and (ii) the rate of events once the realization is positive (that is, when the ‘hurdle’ is crossed), using a zero‐truncated Poisson distribution. When the observation period or follow‐up time, from the start of dialysis, varies among individuals, the estimated probability of positive cardiovascular events during the study period will be biased. Furthermore, when the model contains covariates, the estimated relationship between the covariates and the probability of cardiovascular events will also be biased. These challenges are addressed with the proposed weighted hurdle regression method. Estimation for the weighted hurdle regression model is a weighted likelihood approach, where standard maximum likelihood estimation can be utilized. The method is illustrated with data from the United States Renal Data System. Simulation studies show the ability of the proposed method to successfully adjust for differential follow‐up times and incorporate the effects of covariates in the weighting. Copyright © 2014 John Wiley & Sons, Ltd.
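As a rough illustration of the two-part structure this abstract describes, the sketch below evaluates the probability mass function of a plain, unweighted hurdle model: a binary part for whether any event occurs and a zero-truncated Poisson for the positive counts. The weighting by follow-up time that the paper proposes is not reproduced here, and the function name `hurdle_pmf` and the parameter values are assumptions made for illustration.

```python
import math


def hurdle_pmf(k, pi, lam):
    """P(Y = k) under an unweighted hurdle model (illustrative only).

    pi:  probability that at least one event occurs (hurdle is crossed).
    lam: rate parameter of the zero-truncated Poisson for positive counts.
    """
    if k < 0:
        return 0.0
    if k == 0:
        return 1.0 - pi
    # Zero-truncated Poisson: ordinary Poisson pmf rescaled by P(count > 0).
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * poisson / (1.0 - math.exp(-lam))


# Hypothetical values: 30% of patients have any event, rate 2 among those.
for k in range(4):
    print(k, hurdle_pmf(k, pi=0.3, lam=2.0))
```

In a regression setting, `pi` and `lam` would each be linked to covariates (for example through logit and log links), which is where the binary and count parts are modelled jointly.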

12.
We describe a methodology for analysing self‐reported risk behaviour transitional patterns in a binary outcome variable, subject to misclassification and a large loss to follow‐up. The motivation stems from the analysis of self‐reported transitional patterns in responses to the question ‘have you ever smoked a whole cigarette?’ in a cohort of South African school children. The partially complete records analysis (PCRA) introduced here estimates the transitional probability as the ratio of the joint probability of the response at two time points, based on the complete records for this time sequence, over the marginal probability of the response based on the complete records at the first time point, and assumes a non‐informative missing pattern. A comparison was made using un‐weighted complete records and inverse probability weighted logistic regression. The estimates of the probabilities of reporting ever having smoked a cigarette obtained from the three methods were similar for a particular transition. The PCRA method lacked precision compared with the inverse probability weighted logistic regression. A simulation study indicated an association between bias and reporting error in all three methods. The PCRA method can be considered as a method for the estimation of transition probabilities in a cohort study where there is consistency in the self‐reported risk behaviour pattern and the sample size is large at baseline. The inverse probability weighting approach is more precise and is suitable for this setting in order to determine risk factors for the incidence of self‐reported substance use in a cohort with a high dropout rate. Copyright © 2008 John Wiley & Sons, Ltd.
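The ratio that defines the PCRA estimate can be sketched in a few lines. This is one reading of the abstract's description, not the authors' code: the joint probability is taken from records observed at both time points, and the marginal probability from all records observed at the first time point. The function name `pcra_transition`, the use of `None` for a missing response, and the toy data are assumptions.

```python
def pcra_transition(records, a, b):
    """Estimate P(Y2 = b | Y1 = a) from (y1, y2) pairs.

    records: iterable of (y1, y2); None marks a missing response.
    """
    records = list(records)
    # Records complete at both time points feed the joint probability.
    both = [(y1, y2) for y1, y2 in records
            if y1 is not None and y2 is not None]
    # Records complete at the first time point feed the marginal probability.
    first = [y1 for y1, _ in records if y1 is not None]
    if not both or not first:
        raise ValueError("not enough complete records")
    joint = sum(1 for y1, y2 in both if y1 == a and y2 == b) / len(both)
    marginal = sum(1 for y1 in first if y1 == a) / len(first)
    if marginal == 0:
        raise ValueError("no records with the requested first response")
    return joint / marginal


# Toy cohort: 1 = reports ever smoking, 0 = does not, None = lost to follow-up.
toy = [(1, 1), (1, 0), (1, None), (0, 1), (0, None), (0, 0)]
print(pcra_transition(toy, a=1, b=1))
```

Because numerator and denominator come from different subsets of the cohort, the estimate is only sensible under the non-informative missingness assumption that the abstract states.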

13.
We consider Bayesian sensitivity analysis for unmeasured confounding in observational studies where the association between a binary exposure, binary response, measured confounders and a single binary unmeasured confounder can be formulated using logistic regression models. A model for unmeasured confounding is presented along with a family of prior distributions that model beliefs about a possible unknown unmeasured confounder. Simulation from the posterior distribution is accomplished using Markov chain Monte Carlo. Because the model for unmeasured confounding is not identifiable, standard large-sample theory for Bayesian analysis is not applicable. Consequently, the impact of different choices of prior distributions on the coverage probability of credible intervals is unknown. Using simulations, we investigate the coverage probability when averaged with respect to various distributions over the parameter space. The results indicate that credible intervals will have approximately nominal coverage probability, on average, when the prior distribution used for sensitivity analysis approximates the sampling distribution of model parameters in a hypothetical sequence of observational studies. We motivate the method in a study of the effectiveness of beta blocker therapy for treatment of heart failure.

14.
Multiple logistic regression is an accepted statistical method for assessing association between an antecedent characteristic (risk factor) and a quantal outcome (probability of disease occurrence), statistically adjusting for potential confounding effects of other covariates. Yet the method has potential drawbacks which are not generally recognized. This article considers one important drawback of logistic regression. Specifically, the so-called main effect logistic model assumes that the probability of developing disease is linearly and additively related to the risk factors on the logistic scale. This assumption stipulates that for each risk factor, the odds ratio is constant over all reference exposure levels, and that the odds ratio for exposure to two or more factors is equal to the product of the individual risk factor odds ratios. If the observed odds ratios in the data follow this pattern, the model-predicted odds ratios will be accurate, and the meaning of the odds ratio for each risk factor will be straightforward. But if the observed odds ratios deviate from the model assumption, the model will not fit the data accurately, and the model-predicted odds ratios will not reflect those in the data. Although satisfactory fit can always be achieved by adding to the model polynomial and product terms derived from the original risk factors, the odds ratios estimated by such an interaction logistic model are difficult to interpret, viz., the odds ratio for each risk factor depends not only on the reference exposure levels of that factor, but also on the exposure level in other factors. (ABSTRACT TRUNCATED AT 250 WORDS)
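The multiplicative property stated in this abstract follows directly from the form of the main effect model. A short worked version for two binary risk factors is given here; the symbols $x_1$, $x_2$, $\beta_0$, $\beta_1$, $\beta_2$ and $D$ are notation introduced for illustration and do not come from the article.

```latex
\operatorname{logit} P(D = 1 \mid x_1, x_2) = \beta_0 + \beta_1 x_1 + \beta_2 x_2
\quad\Longrightarrow\quad
\mathrm{OR}_{(1,1)\,\text{vs}\,(0,0)}
  = \frac{e^{\beta_0 + \beta_1 + \beta_2}}{e^{\beta_0}}
  = e^{\beta_1}\, e^{\beta_2}
  = \mathrm{OR}_1 \cdot \mathrm{OR}_2 .
```

Adding a product term $\beta_{12} x_1 x_2$ changes the joint odds ratio to $e^{\beta_1 + \beta_2 + \beta_{12}}$, so the odds ratio for one factor then depends on the level of the other, which is the interpretability problem the abstract points to.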

15.
When conducting a meta‐analysis of studies with bivariate binary outcomes, challenges arise when the within‐study correlation and between‐study heterogeneity should be taken into account. In this paper, we propose a marginal beta‐binomial model for the meta‐analysis of studies with binary outcomes. This model is based on the composite likelihood approach and has several attractive features compared with the existing models such as the bivariate generalized linear mixed model (Chu and Cole, 2006) and the Sarmanov beta‐binomial model (Chen et al., 2012). The advantages of the proposed marginal model include modeling the probabilities in the original scale, not requiring any transformation of probabilities or any link function, having a closed‐form expression of the likelihood function, and no constraints on the correlation parameter. More importantly, because the marginal beta‐binomial model is only based on the marginal distributions, it does not suffer from potential misspecification of the joint distribution of bivariate study‐specific probabilities. Such misspecification is difficult to detect and can lead to biased inference using current methods. We compare the performance of the marginal beta‐binomial model with the bivariate generalized linear mixed model and the Sarmanov beta‐binomial model by simulation studies. Interestingly, the results show that the marginal beta‐binomial model performs better than the Sarmanov beta‐binomial model, whether or not the true model is Sarmanov beta‐binomial, and the marginal beta‐binomial model is more robust than the bivariate generalized linear mixed model under model misspecifications. Two meta‐analyses of diagnostic accuracy studies and a meta‐analysis of case–control studies are conducted for illustration. Copyright © 2015 John Wiley & Sons, Ltd.

16.
J K Lindsey. Statistics in Medicine 1999, 18(17-18): 2223-2236
Although generalized linear models are reasonably well known, they are not as widely used in medical statistics as might be appropriate, with the exception of logistic, log-linear, and some survival models. At the same time, the generalized linear modelling methodology is decidedly outdated in that more powerful methods, involving wider classes of distributions, non-linear regression, censoring and dependence among responses, are required. Limitations of the generalized linear modelling approach include the need for the iterated weighted least squares (IWLS) procedure for estimation and deviances for inferences; these restrict the class of models that can be used and do not allow direct comparisons among models from different distributions. Powerful non-linear optimization routines are now available and comparisons can more fruitfully be made using the complete likelihood function. The link function is an artefact, necessary for IWLS to function with linear models, but that disappears once the class is extended to truly non-linear models. Restricting comparisons of responses under different treatments to differences in means can be extremely misleading if the shape of the distribution is changing. This may involve changes in dispersion, or of other shape-related parameters such as the skewness in a stable distribution, with the treatments or covariates. Any exact likelihood function, defined as the probability of the observed data, takes into account the fact that all observable data are interval censored, thus directly encompassing the various types of censoring possible with duration-type data. In most situations this can now be as easily used as the traditional approximate likelihood based on densities. Finally, methods are required for incorporating dependencies among responses in models including conditioning on previous history and on random effects. One important procedure for constructing such likelihoods is based on Kalman filtering.

17.
Suppose we use generalized estimating equations to estimate a marginal regression model for repeated binary observations. There are no established summary statistics available for assessing the adequacy of the fitted model. In this paper we propose a goodness-of-fit test statistic which has an approximate chi-squared distribution when we have specified the model correctly. The proposed statistic can be viewed as an extension of the Hosmer and Lemeshow goodness-of-fit statistic for ordinary logistic regression to marginal regression models for repeated binary responses. We illustrate the methods using data from a study of mental health service utilization by children. The repeated responses are a set of binary measures of service use. We fit a marginal logistic regression model to the data using generalized estimating equations, and we apply the proposed goodness-of-fit statistic to assess the adequacy of the fitted model.

18.
Matching in case-control studies is a situation in which one wishes to make inferences about a parameter of interest in the presence of nuisance parameters. The usual approach is to apply a conditional likelihood. A bivariate latent class log-linear model for binomial responses is shown to yield a standard likelihood identical to the usual conditional one. This extension of the Rasch model for binary responses gives consistent estimates and a suitable likelihood function for cases matched with any fixed number of controls.

19.
Recently, analytical models for pedigree disease data have been developed that combine genetic and epidemiological modelling techniques. The regressive logistic model [Bonney, Biometrics 42: 611-625; 1986] relies on decomposing the likelihood of a pedigree into the product of conditional probabilities, one for each individual, by imposing a (natural) order on pedigree members. In addition to modelling measured epidemiological variables, vertical transmission, transmission of unmeasured ousiotypes (a special case being genotypes), and some modelling of sibship dependencies have been proposed. In this paper the model is extended to include an unmeasured sibship environment factor using a log-linear model for binary pedigree traits [Hopper et al., Genet Epidemiol 1: 183-188; 1984], which breaks the pedigree into conditionally independent groups. Statistical issues, such as designs for which these factors will be discernible and tests of fit, are discussed.

20.
In order to examine the bias of the estimate of the log odds ratio in a 2 x 2 contingency table, Walter computed the entire distribution of the estimated log odds ratio using various small sample sizes. This is equivalent to computing the distribution of the estimated parameter b1 in a logistic regression with one independent binary variable. In this paper, the distributions of the estimated parameters b1 and b2 for two independent binary variables are computed for some small sample logistic regressions using six different estimation methods based on maximum likelihood. These estimates are then compared to the true parameter values. The best estimation method depends on the frequency of the outcome of interest and on whether the bias or mean square error is considered more important.
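The equivalence mentioned in this abstract, between the log odds ratio of a 2 x 2 table and the slope b1 of a logistic regression on one binary covariate, can be checked numerically. The sketch below uses the fact that this model is saturated, so its maximum likelihood fit reproduces the observed proportions in each covariate group. The function names, the table layout and the counts are illustrative assumptions.

```python
import math


def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def saturated_logistic_coefficients(a, b, c, d):
    """ML estimates (b0, b1) for logit P(Y=1 | x) = b0 + b1 * x, binary x.

    Table layout (all counts must be positive):
      a = (x=1, y=1), b = (x=1, y=0), c = (x=0, y=1), d = (x=0, y=0).
    """
    b0 = logit(c / (c + d))        # fitted log odds when x = 0
    b1 = logit(a / (a + b)) - b0   # change in log odds from x = 0 to x = 1
    return b0, b1


def log_odds_ratio(a, b, c, d):
    """Sample log odds ratio of the same 2 x 2 table."""
    return math.log((a * d) / (b * c))


# Hypothetical small-sample table.
b0, b1 = saturated_logistic_coefficients(12, 8, 5, 15)
print(b1, log_odds_ratio(12, 8, 5, 15))
```

With a zero cell the estimate is undefined, which is one reason small-sample studies such as this one compare several estimation methods.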
