Similar literature
1.
Calibration, that is, whether observed outcomes agree with predicted risks, is important when evaluating risk prediction models. For dichotomous outcomes, several tools exist to assess different aspects of model calibration, such as calibration‐in‐the‐large, logistic recalibration, and (non‐)parametric calibration plots. We aim to extend these tools to prediction models for polytomous outcomes. We focus on models developed using multinomial logistic regression (MLR): outcome Y with k categories is predicted using k − 1 equations comparing each category i (i = 2, …, k) with reference category 1 using a set of predictors, resulting in k − 1 linear predictors. We propose a multinomial logistic recalibration framework that involves an MLR fit where Y is predicted using the k − 1 linear predictors from the prediction model. A non‐parametric alternative may use vector splines for the effects of the linear predictors. The parametric and non‐parametric frameworks can be used to generate multinomial calibration plots. Further, the parametric framework can be used for the estimation and statistical testing of calibration intercepts and slopes. Two illustrative case studies are presented, one on the diagnosis of malignancy of ovarian tumors and one on residual mass diagnosis in testicular cancer patients treated with cisplatin‐based chemotherapy. The risk prediction models were developed on data from 2037 and 544 patients and externally validated on 1107 and 550 patients, respectively. We conclude that calibration tools can be extended to polytomous outcomes. The polytomous calibration plots are particularly informative through the visual summary of the calibration performance. Copyright © 2014 John Wiley & Sons, Ltd.
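The recalibration fit described in this abstract can be written out explicitly. The following is a sketch inferred from the abstract's wording only; the symbols $\alpha_i$, $\beta_{ij}$, and $LP_j$ (the linear predictor for category j versus category 1) are notation assumed here, not taken from the paper.

```latex
\log\frac{P(Y = i)}{P(Y = 1)} \;=\; \alpha_i + \sum_{j=2}^{k} \beta_{ij}\, LP_j ,
\qquad i = 2, \dots, k
```

Under this reading, the $\alpha_i$ play the role of calibration intercepts and the $\beta_{ij}$ of calibration slopes, and a perfectly calibrated model would correspond to $\alpha_i = 0$, $\beta_{ii} = 1$, and $\beta_{ij} = 0$ for $j \neq i$.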

2.
In the last decade, few topics in the area of cardiovascular disease (CVD) research have received as much attention as risk prediction. One of the well‐documented risk factors for CVD is high blood pressure (BP). Traditional CVD risk prediction models consider BP levels measured at a single time and such models form the basis for current clinical guidelines for CVD prevention. However, in clinical practice, BP levels are often observed and recorded in a longitudinal fashion. Information on BP trajectories can be a powerful predictor of CVD events. We consider joint modeling of time to coronary artery disease and individual longitudinal measures of systolic and diastolic BPs in a primary care cohort with up to 20 years of follow‐up. We applied novel prediction metrics to assess the predictive performance of joint models. Predictive performances of proposed joint models and other models were assessed via simulations and illustrated using the primary care cohort. Copyright © 2015 John Wiley & Sons, Ltd.

3.
OBJECTIVE: Various prognostic models have been developed to predict outcome after traumatic brain injury (TBI). We aimed to determine the validity of six models that used baseline clinical and computed tomographic characteristics to predict mortality or unfavorable outcome at 6 months or later after severe or moderate TBI. STUDY DESIGN AND SETTING: The validity was studied in two selected series of TBI patients enrolled in clinical trials (Tirilazad trials; n = 2,269; International Selfotel Trial; n = 409) and in two unselected series of patients consecutively admitted to participating centers (European Brain Injury Consortium [EBIC] survey; n = 796; Traumatic Coma Data Bank; n = 746). Validity was indicated by discriminative ability (AUC) and calibration (Hosmer-Lemeshow goodness-of-fit test). RESULTS: The models varied in number of predictors (four to seven) and in development technique (two prediction trees and four logistic regression models). Discriminative ability varied widely (AUC: .61-.89), but calibration was poor for most models. Better discrimination was observed for logistic regression models compared with trees, and for models including more predictors. Further, discrimination was better when tested on unselected series that contained more heterogeneous populations. CONCLUSION: Our findings emphasize the need for external validation of prognostic models. The satisfactory discrimination indicates that logistic regression models, developed on large samples, can be used for classifying TBI patients according to prognostic risk.
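The two validity measures named in this abstract, AUC for discrimination and the Hosmer-Lemeshow statistic for calibration, can be illustrated with a minimal sketch. The function names `auc` and `hosmer_lemeshow` are hypothetical, the grouping into equal-sized risk groups is an assumption, and nothing here reproduces the study's own computations.

```python
def auc(y_true, y_prob):
    """AUC as the probability that a random event gets a higher predicted
    risk than a random non-event; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow chi-square statistic over equal-sized groups of
    sorted predicted risk. Assumes no group has mean risk exactly 0 or 1."""
    pairs = sorted(zip(y_prob, y_true))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        expected = sum(p for p, _ in chunk)   # expected number of events
        observed = sum(y for _, y in chunk)   # observed number of events
        mean_p = expected / size
        stat += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return stat
```

The statistic is usually referred to a chi-square distribution; the degrees of freedom depend on whether the data are development or validation data, so that choice is left to the reader.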

4.
In this study, we investigated the role of occupational noise exposure and blood pressure among workers at 2 plants. A noise-exposed plant (plant 1, ≥ 89 dBA) and a less-noise-exposed plant (plant 2, ≤ 83 dBA) were chosen. Exposure was based on department-wide average noise measures; on the basis of job location and adjusting for layoffs during their employment at the plant, a cumulative time-weighted average noise level was calculated for each worker. The study population comprised 329 males in plant 1 and 314 males in plant 2. Their ages ranged from 40 to 63 y (mean ages = 49.6 and 48.7, respectively), and they had worked at least 15 y at the plant. The clinical examination was administered prior to the workday and measured height, weight, pulse, and blood pressure. In addition, we noted medical and personal-habits histories, including alcohol intake and cigarette smoking patterns. We used a questionnaire to determine in-depth occupation, military history, noisy hobbies, and family history of hypertension. When individuals who took blood-pressure medication were removed from the analysis, t tests for differences in average blood pressure between plants showed a mean systolic blood pressure of 123.3 mm Hg in plant 1 versus 120.8 mm Hg in plant 2 (p = .06) and a mean diastolic blood pressure of 80.3 mm Hg versus 77.8 mm Hg in plants 1 and 2, respectively (p = .014). On the basis of data from the combined plants, multivariate analysis revealed that age, body mass index, cumulative noise exposure, current use of blood pressure medications, and alcohol intake were significant predictors for systolic blood pressure. Cumulative noise exposure was a significant predictor of diastolic blood pressure in plant 1 but not in plant 2, possibly reflecting a threshold effect.

5.
Few studies have compared associations of blood lead and tibia lead with blood pressure and hypertension, and associations have differed in samples with occupational exposure compared with those with mainly environmental lead exposure. African Americans have been underrepresented in prior studies. The authors performed a cross-sectional analysis of 2001-2002 data from a community-based cohort in Baltimore, Maryland, of 964 men and women aged 50-70 years (40% African American, 55% White, 5% other race/ethnicity) to evaluate associations of blood lead and tibia lead with systolic and diastolic blood pressure and hypertension while adjusting for a large set of potential confounding variables. Blood lead was a strong and consistent predictor of both systolic and diastolic blood pressure in models adjusted and not adjusted for race/ethnicity and socioeconomic status. Tibia lead was associated with hypertension status before adjustment for race/ethnicity and socioeconomic status (p = 0.01); after such adjustment, the association was borderline significant (p = 0.09). Propensity score analysis suggested that standard regression analysis may have exaggerated the attenuation. These findings are discussed in the context of complex causal pathways. The data suggest that lead has an acute effect on blood pressure via recent dose and a chronic effect on hypertension risk via cumulative dose.

6.
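Objective: To construct 3-month outcome prediction models for acute ischemic stroke (AIS) based on logistic regression and random forest, and to compare their predictive performance. Methods: AIS data from the China National Stroke Registry Ⅱ (CNSR Ⅱ) database were used. Candidate predictors included variables from different time points: demographic characteristics, medical history, medication history, clinical test indicators, status at admission, in-hospital course, and status at discharge. The data were randomly split 8:2 into a training set and a test set. In the training set, logistic regression and random forest were each used to build a 3-month outcome prediction model for AIS patients; in the test set, discrimination was evaluated with the area under the receiver operating characteristic curve (AUC), and calibration with the Hosmer-Lemeshow test and calibration plots. Results: A total of 9,847 AIS patients were included in the analysis, of whom 6,093 were aged 61-80 years, 6,477 were male, and 1,515 had a poor outcome. In the test set, the AUCs of logistic regression and random forest did not differ significantly (0.821, 95% CI: 0.815-0.827 vs 0.825, 95% CI: 0.821-0.829; P = 0.268), and both models were well calibrated (χ² = 5.67, P = 0.684 vs χ² = 8.52, P = 0.385). Conclusion: The 3-month outcome prediction models for AIS patients built with logistic regression and random forest both showed good discrimination and calibration.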

7.
We compare the calibration and variability of risk prediction models that were estimated using various approaches for combining information on new predictors, termed ‘markers’, with parameter information available for other variables from an earlier model, which was estimated from a large data source. We assess the performance of risk prediction models updated based on likelihood ratio (LR) approaches that incorporate dependence between new and old risk factors as well as approaches that assume independence (‘naive Bayes’ methods). We study the impact of estimating the LR by (i) fitting a single model to cases and non‐cases when the distribution of the new markers is in the exponential family or (ii) fitting separate models to cases and non‐cases. We also evaluate a new constrained maximum likelihood method. We study updating the risk prediction model when the new data arise from a cohort and extend available methods to accommodate updating when the new data source is a case‐control study. To create realistic correlations between predictors, we also based simulations on real data on response to antiviral therapy for hepatitis C. From these studies, we recommend the LR method fit using a single model or constrained maximum likelihood. Copyright © 2016 John Wiley & Sons, Ltd.

8.
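Objective: To develop and validate a prediction model for ischemic cardiovascular and cerebrovascular disease (ICVD) in an elderly population. Methods: We analyzed physical examination data from May 2003, hospitalization records over the years, questionnaire data, and telephone follow-up data from a health-care hospital. The baseline population was randomly divided in a 4:1 ratio into a derivation group and a validation group. Baseline data of the validation group were entered into the regression model fitted on the derivation group to generate predicted values. The area under the ROC curve (AUC) was used to test the model's discriminative ability; the Hosmer-Lemeshow test, comparing the mean predicted rate with the observed rate within each decile of predicted risk, was used to judge predictive accuracy; and the population mean of the predicted 6-year ICVD risk was compared with the observed 6-year cumulative prevalence to compute an error rate, validating prediction at the population level. Results: The analysis sample comprised 2,271 elderly men aged >65 years, with 1,817 in the derivation group and 454 in the validation group. Age was divided into two strata (≥75 years, the advanced-age group; <75 years, the elderly group) to build a stratified Cox proportional hazards regression model. In the elderly group, the statistically significant risk factors were age, SBP, serum creatinine (Scr), and fasting blood glucose (FBG), and the protective factor was high-density lipoprotein cholesterol (HDL-C); in the advanced-age group, the statistically significant risk factors were BMI, SBP, TC, Scr, and FBG, and the protective factor was HDL-C. The AUC was 0.723 (95% CI: 0.687-0.759). The Hosmer-Lemeshow test comparing individuals' predicted cumulative ICVD prevalence with the observed prevalence gave χ² = 1.43, P = 0.786, and the population-level prediction error rate was -2.23%, indicating good performance. Conclusion: The ICVD prediction model for elderly men has good discriminative ability, and its individual-level and population-level predictive performance is satisfactory.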

9.
Competing risk analysis considers event times due to multiple causes or of more than one event type. Commonly used regression models for such data include (1) the cause‐specific hazards model, which focuses on modeling one type of event while acknowledging other event types simultaneously, and (2) the subdistribution hazards model, which links the covariate effects directly to the cumulative incidence function. Their use in the presence of high‐dimensional predictors is largely unexplored. Motivated by an analysis using the linked SEER‐Medicare database for the purposes of predicting cancer versus noncancer mortality for patients with prostate cancer, we study the accuracy of prediction and variable selection of existing machine learning methods under both models using extensive simulation experiments, including different approaches to choosing penalty parameters in each method. We then apply the optimal approaches to the analysis of the SEER‐Medicare data.

10.
Biomedical studies have a common interest in assessing relationships between multiple related health outcomes and high‐dimensional predictors. For example, in reproductive epidemiology, one may collect pregnancy outcomes such as length of gestation and birth weight and predictors such as single nucleotide polymorphisms in multiple candidate genes and environmental exposures. In such settings, there is a need for simple yet flexible methods for selecting true predictors of adverse health responses from a high‐dimensional set of candidate predictors. To address this problem, one may either consider linear regression models for the continuous outcomes or convert these outcomes into binary indicators of adverse responses using predefined cutoffs. The former strategy has the disadvantage of often leading to a poorly fitting model that does not predict risk well, whereas the latter approach can be very sensitive to the cutoff choice. As a simple yet flexible alternative, we propose a method for adverse subpopulation regression, which relies on a two‐component latent class model, with the dominant component corresponding to (presumed) healthy individuals and the risk of falling in the minority component characterized via a logistic regression. The logistic regression model is designed to accommodate high‐dimensional predictors, as occur in studies with a large number of gene by environment interactions, through the use of a flexible nonparametric multiple shrinkage approach. The Gibbs sampler is developed for posterior computation. We evaluate the methods with the use of simulation studies and apply these to a genetic epidemiology study of pregnancy outcomes. Copyright © 2012 John Wiley & Sons, Ltd.

11.
In a nutrition survey of 247 Chinese Americans (38% men and 62% women) aged 60–96, the mean plasma ascorbic acid was found to be lower than that of white Americans. The association between their blood pressure and plasma ascorbic acid was explored by analysis of covariance and multiple regression, adjusting for age, sex, body mass index, alcohol and cigarette consumption, dietary Na:K ratio, serum Ca:P ratio and physical activity level. Inverse associations were observed between plasma ascorbic acid and systolic blood pressure (p<.001) and diastolic blood pressure (p<.01). Subjects in the lowest quartile of plasma ascorbic acid had a mean systolic blood pressure which was 17.6 mm Hg higher than those in the highest quartile; their mean diastolic blood pressure was higher by 5.5 mm Hg. We speculate that subclinical vitamin C deficiency may be a risk factor for hypertension in Chinese Americans.

12.
Objective: Fall prevention is important in many hospitals. Current fall-risk-screening tools have limited predictive accuracy specifically for older inpatients. Their administration can be time-consuming. A reliable and easy-to-administer tool is desirable to identify older inpatients at higher fall risk. We aimed to develop and internally validate a prognostic prediction model for inpatient falls for older patients. Design: Retrospective analysis of a large cohort drawn from hospital electronic health record data. Setting and Participants: Older patients (≥70 years) admitted to a university medical center (2016 until 2021). Methods: The outcome was an inpatient fall (≥24 hours of admission). Two prediction models were developed using regularized logistic regression in 5 imputed data sets: one model without predictors indicating missing values (Model-without) and one model with these additional predictors indicating missing values (Model-with). We internally validated our whole model development strategy using 10-fold stratified cross-validation. The models were evaluated using discrimination (area under the receiver operating characteristic curve) and calibration (plot assessment). We determined whether the areas under the receiver operating characteristic curves (AUCs) of the models were significantly different using the DeLong test. Results: Our data set included 21,286 admissions. In total, 470 (2.2%) had a fall after 24 hours of admission. The Model-without had 12 predictors and the Model-with 13, of which 4 were indicators of missing values. The AUCs of the Model-without and Model-with were 0.676 (95% CI 0.646-0.707) and 0.695 (95% CI 0.667-0.724). The AUCs of the two models were significantly different (P = .013). Calibration was good for both models. Conclusions and Implications: Both the Model-with and Model-without indicators of missing values showed good calibration and fair discrimination, where the Model-with performed better. Our models showed competitive performance to well-established fall-risk-screening tools, and they have the advantage of being based on routinely collected data. This may substantially reduce the burden on nurses, compared with nonautomatic fall-risk-screening tools.

13.
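Objective: To explore the application of logistic regression and random forest in predicting the risk of diabetes in a health check-up population. Methods: A total of 11,769 examination records of non-diabetic individuals who underwent physical examination at the Physical Examination Center of Beijing Aerospace General Hospital between January 2006 and December 2015 were selected. A random 70% of the sample was used to build diabetes prediction models with logistic regression and random forest, with 14 factors as independent variables (sex, age, BMI, smoking history, drinking history, history of hypertension, family history of hypertension, family history of diabetes, systolic blood pressure, diastolic blood pressure, fasting blood glucose, total cholesterol, triglycerides, and fatty liver) and the development of diabetes within 5 years as the dependent variable. The models were then applied to the remaining 30% of the sample, and predictive performance was evaluated by the area under the receiver operating characteristic curve (AUC). Results: The AUCs of the logistic regression and random forest models were 0.912 (95% CI: 0.898-0.927) and 0.919 (95% CI: 0.906-0.932), respectively. At the optimal cut-off point, the sensitivity and specificity were 80.8% and 87.3% for the logistic regression model and 84.1% and 85.3% for the random forest model. Conclusion: Both the logistic regression and random forest models predict diabetes risk well in a health check-up population.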
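The abstract above reports sensitivity and specificity at an optimal cut-off without saying how that cut-off was chosen. One common choice, assumed here purely for illustration, is the threshold that maximizes Youden's J (sensitivity + specificity − 1). The function name `best_cutoff` is hypothetical.

```python
def best_cutoff(y_true, y_prob):
    """Return (cutoff, sensitivity, specificity) maximizing Youden's J.

    A case is classified positive when its predicted risk is >= cutoff.
    Assumes y_true contains at least one event (1) and one non-event (0).
    """
    best = None
    for c in sorted(set(y_prob)):
        tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= c)
        fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < c)
        tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < c)
        fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= c)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1.0
        # Keep the first (lowest) cutoff among ties in J.
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]
```

Other criteria, such as fixing sensitivity at a clinically required level, would give a different cut-off for the same model.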

14.
Prediction of an outcome for a given unit based on prediction models built on a training sample plays a major role in many research areas. The uncertainty of the prediction is predominantly characterized by the subject sampling variation in current practice, where prediction models built on hypothetically re‐sampled units yield variable predictions for the same unit of interest. It is almost always true that the predictors used to build prediction models are simply a subset of the entirety of factors related to the outcome. Following the frequentist principle, we can account for the variation because of hypothetically re‐sampled predictors used to build the prediction models. This is particularly important in medicine where the prediction has important and sometimes life‐or‐death consequences for a patient's health status. In this article, we discuss some rationale along this line in the context of medicine. We propose a simple approach to estimate the standard error of the prediction that accounts for the variation because of sampling both subjects and predictors under logistic and Cox regression models. A simulation study is presented to support our argument and demonstrate the performance of our method. The concept and method are applied to a real data set. Copyright © 2015 John Wiley & Sons, Ltd.

15.
16.
Tracking of cardiovascular risk factors: the Tromsø study, 1979-1995 (cited 3 times in total: 0 self-citations, 3 citations by others)
Tracking of cardiovascular risk factors (blood pressure, body mass index (BMI), and serum lipids) has not been studied much in a general, adult population. No known study has compared tracking of these factors for both sexes. In the present study, 17,710 men and women aged 20-61 years at baseline attended two or three population-based health surveys in Tromsø, Norway, over 16 years (between 1979-1980 and 1994-1995). Tracking coefficients were estimated by using different methods, and possible predictors of tracking were found. There was a high degree of tracking for BMI (overall tracking coefficients: 0.85 for men, 0.80 for women). Relatively high (or moderate) tracking was found for systolic blood pressure (respective sex-specific coefficients: 0.52, 0.54), diastolic blood pressure (0.48, 0.48), high density lipoprotein cholesterol (0.55, 0.64), and total cholesterol (0.77, 0.65). The lowest coefficients were for triglycerides (0.43, 0.39). Analysis of tracking in the upper sextile confirmed these results. Although some baseline predictors were associated with tracking, the effects were relatively weak. When predictors for tracking in the upper sextile were assessed, significant associations were found with relatively strong effects. No major sex differences were observed in tracking. However, women were more likely than men to remain in the upper sextile of systolic and diastolic blood pressures and of BMI.

17.
Background: Environmental uranium exposure originating as a byproduct of uranium processing can impact human health. The Fernald Feed Materials Production Center functioned as a uranium processing facility from 1951 to 1989, and potential health effects among residents living near this plant were investigated via the Fernald Medical Monitoring Program (FMMP). Methods: Data from 8216 adult FMMP participants were used to test the hypothesis that elevated uranium exposure was associated with indicators of hypertension or changes in hematologic parameters at entry into the program. A cumulative uranium exposure estimate, developed by FMMP investigators, was used to classify exposure. Systolic and diastolic blood pressure and physician diagnoses were used to assess hypertension; and red blood cells, platelets, and white blood cell differential counts were used to characterize hematology. The relationship between uranium exposure and hypertension or hematologic parameters was evaluated using generalized linear models and quantile regression for continuous outcomes, and logistic regression or ordinal logistic regression for categorical outcomes, after adjustment for potential confounding factors. Results: Of 8216 adult FMMP participants, 4187 (51%) had low cumulative uranium exposure, 1273 (15%) had moderate exposure, and 2756 (34%) were in the high (>0.50 Sievert) cumulative lifetime uranium exposure category. Participants with elevated uranium exposure had decreased white blood cell and lymphocyte counts and increased eosinophil counts. Female participants with higher uranium exposures had elevated systolic blood pressure compared to women with lower exposures. However, no exposure-related changes were observed in diastolic blood pressure or hypertension diagnoses among female or male participants. Conclusions: Results from this investigation suggest that residents in the vicinity of the Fernald plant with elevated exposure to uranium primarily via inhalation exhibited decreases in white blood cell counts, and small, though statistically significant, gender-specific alterations in systolic blood pressure at entry into the FMMP.

18.
Dynamic prediction models make use of patient‐specific longitudinal data to update individualized survival probability predictions based on current and past information. Colonoscopy (COL) and fecal occult blood test (FOBT) results were collected from two Australian surveillance studies on individuals characterized as high‐risk based on a personal or family history of colorectal cancer. Motivated by a Poisson process, this paper proposes a generalized nonlinear model with a complementary log–log link as a dynamic prediction tool that produces individualized probabilities for the risk of developing advanced adenoma or colorectal cancer (AAC). This model allows predicted risk to depend on a patient's baseline characteristics and time‐dependent covariates. Information on the dates and results of COLs and FOBTs was incorporated using time‐dependent covariates that contributed to patient risk of AAC for a specified period following the test result. These covariates serve to update a person's risk as additional COL and FOBT test information becomes available. Model selection was conducted systematically through the comparison of Akaike information criterion. Goodness‐of‐fit was assessed with the use of calibration plots to compare the predicted probability of event occurrence with the proportion of events observed. Abnormal COL results were found to significantly increase risk of AAC for 1 year following the test. Positive FOBTs were found to significantly increase the risk of AAC for 3 months following the result. The covariates that incorporated the updated test results were of greater significance and had a larger effect on risk than the baseline variables. Copyright © 2015 John Wiley & Sons, Ltd.
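The link between a Poisson process and the complementary log–log link mentioned above can be made concrete: if events arrive at rate exp(η) over a unit interval, the probability of at least one event is 1 − exp(−exp(η)), whose inverse is log(−log(1 − p)). The sketch below shows this, plus a toy risk function with time-limited test covariates. The function `aac_risk`, its coefficient values, and the simple additive form are all hypothetical illustrations, not the paper's fitted model; only the 12-month and 3-month windows come from the abstract.

```python
import math


def cloglog(p):
    """Complementary log-log link: maps a probability to the linear scale."""
    return math.log(-math.log(1.0 - p))


def inv_cloglog(eta):
    """Inverse link: P(at least one event) for a Poisson rate of exp(eta)."""
    return 1.0 - math.exp(-math.exp(eta))


def aac_risk(months_since_abnormal_col=None, months_since_positive_fobt=None,
             b0=-3.0, b_col=1.0, b_fobt=0.8):
    """Toy risk with time-limited covariates (coefficients are made up).

    An abnormal colonoscopy contributes for 12 months and a positive FOBT
    for 3 months, mirroring the windows reported in the abstract.
    """
    col_active = (months_since_abnormal_col is not None
                  and 0 <= months_since_abnormal_col <= 12)
    fobt_active = (months_since_positive_fobt is not None
                   and 0 <= months_since_positive_fobt <= 3)
    eta = b0 + b_col * col_active + b_fobt * fobt_active
    return inv_cloglog(eta)
```

For example, `aac_risk(6)` exceeds `aac_risk(18)`, because in this toy model the abnormal colonoscopy only raises risk while it is within its 12-month window.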

19.
Predicting the probability of the occurrence of a binary outcome or condition is important in biomedical research. While assessing discrimination is an essential issue in developing and validating binary prediction models, less attention has been paid to methods for assessing model calibration. Calibration refers to the degree of agreement between observed and predicted probabilities and is often assessed by testing for lack‐of‐fit. The objective of our study was to examine the ability of graphical methods to assess the calibration of logistic regression models. We examined lack of internal calibration, which was related to misspecification of the logistic regression model, and external calibration, which was related to an overfit model or to shrinkage of the linear predictor. We conducted an extensive set of Monte Carlo simulations with a locally weighted least squares regression smoother (i.e., the loess algorithm) to examine the ability of graphical methods to assess model calibration. We found that loess‐based methods were able to provide evidence of moderate departures from linearity and indicate omission of a moderately strong interaction. Misspecification of the link function was harder to detect. Visual patterns were clearer with higher sample sizes, higher incidence of the outcome, or higher discrimination. Loess‐based methods were also able to identify the lack of calibration in external validation samples when an overfit regression model had been used. In conclusion, loess‐based smoothing methods are adequate tools to graphically assess calibration and merit wider application. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd

20.
When statistical models are used to predict the values of unobserved random variables, loss functions are often used to quantify the accuracy of a prediction. The expected loss over some specified set of occasions is called the prediction error. This paper considers the estimation of prediction error when regression models are used to predict survival times and discusses the use of these estimates. Extending the previous work, we consider both point and confidence interval estimations of prediction error, and allow for variable selection and model misspecification. Different estimators are compared in a simulation study for an absolute relative error loss function, and results indicate that cross‐validation procedures typically produce reliable point estimates and confidence intervals, whereas model‐based estimates are sensitive to model misspecification. Links between performance measures for point predictors and for predictive distributions of survival times are also discussed. The methodology is illustrated in a medical setting involving survival after treatment for disease. Copyright © 2009 John Wiley & Sons, Ltd.
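A cross-validated estimate of prediction error under an absolute relative error loss, |T − T̂| / T, can be sketched as follows. This is a deliberately simplified illustration: it assumes fully observed (uncensored) survival times, a single covariate, and a least-squares fit of log time, none of which is taken from the paper. The function names are hypothetical.

```python
import math


def fit_loglinear(xs, ts):
    """Least-squares fit of log(t) = a + b * x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(math.log(t) for t in ts) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (math.log(t) - my) for x, t in zip(xs, ts))
    b = sxy / sxx if sxx > 0 else 0.0
    return my - b * mx, b


def cv_abs_rel_error(xs, ts, k=5):
    """k-fold cross-validated mean of |t - t_hat| / t.

    Fold membership is assigned by index modulo k; each observation is
    predicted from a model fitted without its own fold.
    """
    n = len(xs)
    total = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        a, b = fit_loglinear([xs[i] for i in train], [ts[i] for i in train])
        for i in test:
            pred = math.exp(a + b * xs[i])
            total += abs(ts[i] - pred) / ts[i]
    return total / n
```

With censored survival data, which is the usual case, the loss would need a censoring adjustment such as inverse probability weighting; that step is omitted here.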
