Similar Articles
20 similar articles retrieved.
1.
Objective: Fall prevention is important in many hospitals. Current fall-risk-screening tools have limited predictive accuracy specifically for older inpatients, and their administration can be time-consuming. A reliable and easy-to-administer tool is desirable to identify older inpatients at higher fall risk. We aimed to develop and internally validate a prognostic prediction model for inpatient falls in older patients. Design: Retrospective analysis of a large cohort drawn from hospital electronic health record data. Setting and Participants: Older patients (≥70 years) admitted to a university medical center (2016-2021). Methods: The outcome was an inpatient fall occurring ≥24 hours after admission. Two prediction models were developed using regularized logistic regression in 5 imputed data sets: one without predictors indicating missing values (Model-without) and one with these additional missing-value indicators (Model-with). We internally validated the whole model development strategy using 10-fold stratified cross-validation. The models were evaluated on discrimination (area under the receiver operating characteristic curve) and calibration (plot assessment). We tested whether the areas under the receiver operating characteristic curves (AUCs) of the two models differed significantly using the DeLong test. Results: Our data set included 21,286 admissions. In total, 470 (2.2%) involved a fall more than 24 hours after admission. Model-without had 12 predictors and Model-with 13, of which 4 were indicators of missing values. The AUCs of Model-without and Model-with were 0.676 (95% CI 0.646-0.707) and 0.695 (95% CI 0.667-0.724), respectively; the difference was statistically significant (P = .013). Calibration was good for both models. Conclusions and Implications: Both models showed good calibration and fair discrimination, with Model-with performing better. Our models showed performance competitive with well-established fall-risk-screening tools and have the advantage of being based on routinely collected data, which may substantially reduce the burden on nurses compared with nonautomatic fall-risk-screening tools.
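For orientation, a minimal Python sketch of the general workflow described above. The file name admissions.csv, the column names, the single median imputation, and the L2 penalty are illustrative assumptions; the study itself pooled five multiply imputed data sets.

```python
# Hedged sketch: penalized logistic regression with missing-value indicator
# columns, evaluated by stratified 10-fold cross-validated AUC.
# Assumes a numeric, routinely collected EHR extract (hypothetical file/columns).
import pandas as pd
from sklearn.pipeline import make_pipeline
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

df = pd.read_csv("admissions.csv")           # hypothetical EHR extract
y = df.pop("fall_after_24h").to_numpy()      # 1 = inpatient fall >= 24 h after admission

# "Model-with": add an indicator column for each predictor that has missing values
for col in df.columns[df.isna().any()]:
    df[f"{col}_missing"] = df[col].isna().astype(int)

model = make_pipeline(
    SimpleImputer(strategy="median"),        # single imputation for brevity only
    StandardScaler(),
    LogisticRegression(penalty="l2", C=1.0, max_iter=1000),
)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
aucs = cross_val_score(model, df, y, cv=cv, scoring="roc_auc")
print(f"cross-validated AUC: {aucs.mean():.3f} (+/- {aucs.std():.3f})")
```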

2.
Objective: We compared popular methods for handling missing data with multiple imputation, a more sophisticated method that preserves data. Study Design and Setting: We used data from 804 patients with suspected deep venous thrombosis (DVT). We studied three covariates to predict the presence of DVT: D-dimer level, difference in calf circumference, and history of leg trauma. We introduced missing values (missing at random) ranging from 10% to 90%. The risk of DVT was modeled with logistic regression under three approaches: complete case analysis, exclusion of D-dimer level from the model, and multiple imputation. Results: Multiple imputation showed less bias in the regression coefficients of the three variables and more accurate coverage of the corresponding 90% confidence intervals than complete case analysis or dropping D-dimer level from the analysis. Multiple imputation yielded unbiased estimates of the area under the receiver operating characteristic curve (0.88), compared with complete case analysis (0.77) and with dropping the variable with missing values (0.65). Conclusion: Because this study shows that simple methods for dealing with missing data can lead to seriously misleading results, we advise considering multiple imputation. The purpose of multiple imputation is not to create data, but to prevent the exclusion of observed data.
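A minimal sketch of the contrast being made: complete-case analysis versus multiple imputation with the estimates pooled by Rubin's rules. The file dvt.csv and the variable names (ddimer, calf_dif, leg_trauma, dvt) are illustrative assumptions, not the study's coding.

```python
# Hedged sketch: complete-case analysis vs. multiple imputation pooled by hand.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

df = pd.read_csv("dvt.csv")                      # hypothetical data set
X_cols = ["ddimer", "calf_dif", "leg_trauma"]

# 1) Complete-case analysis: drop every row with a missing covariate
cc = df.dropna(subset=X_cols)
cc_fit = sm.Logit(cc["dvt"], sm.add_constant(cc[X_cols])).fit(disp=0)

# 2) Multiple imputation: m imputed data sets, pooled with Rubin's rules
m, coefs, variances = 20, [], []
for seed in range(m):
    imputer = IterativeImputer(sample_posterior=True, random_state=seed)
    imputed = pd.DataFrame(imputer.fit_transform(df[X_cols]),
                           columns=X_cols, index=df.index)
    fit = sm.Logit(df["dvt"], sm.add_constant(imputed)).fit(disp=0)
    coefs.append(fit.params)
    variances.append(fit.bse ** 2)

coefs, variances = np.array(coefs), np.array(variances)
pooled_beta = coefs.mean(axis=0)                   # Rubin: pooled point estimate
within = variances.mean(axis=0)                    # within-imputation variance
between = coefs.var(axis=0, ddof=1)                # between-imputation variance
pooled_se = np.sqrt(within + (1 + 1 / m) * between)
print(pd.DataFrame({"beta": pooled_beta, "se": pooled_se},
                   index=["const"] + X_cols))
```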

3.
Objective: To evaluate whether different strategies for categorizing continuous variables in multivariable logistic regression result in prognostic models that differ in content and performance. Study Design and Setting: Backward multivariable logistic regression (P < 0.05 and P < 0.157) was performed with candidate predictors of persistent complaints in patients with nonspecific neck pain. The continuous variables were entered into the analysis in three separate ways: (1) kept continuous, (2) split into multiple categories, and (3) dichotomized. The resulting models were compared with regard to content, goodness of fit, explained variation, and discriminative ability. We also compared the effect on performance of categorizing before versus after the selection procedure. Results: For P < 0.05, the final model with continuous variables, containing five predictors, disagreed on three predictors with both categorization strategies. For P < 0.157, the model with continuous variables, containing six predictors, disagreed on three predictors with the model containing stratified continuous variables and on six predictors with the model containing dichotomized variables. The models in which the variables were kept continuous performed best. There was no clear difference in performance between categorization before and after the selection procedure. Conclusion: Categorization of continuous variables resulted in different content and poorer performance of the final model.
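A small, self-contained illustration of the comparison being drawn: the same predictor entered as a continuous term versus dichotomized at the median. The data are simulated and the variable names are mine, not the study's.

```python
# Hedged illustration: continuous vs. dichotomized predictor in logistic regression.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"pain_intensity": rng.uniform(0, 10, n)})
logit_true = -2 + 0.35 * df["pain_intensity"]            # simulated true relationship
df["persistent"] = rng.binomial(1, 1 / (1 + np.exp(-logit_true)))
df["pain_dich"] = (df["pain_intensity"] > df["pain_intensity"].median()).astype(int)

m_cont = smf.logit("persistent ~ pain_intensity", df).fit(disp=0)
m_dich = smf.logit("persistent ~ pain_dich", df).fit(disp=0)
print("AUC, continuous predictor :", roc_auc_score(df["persistent"], m_cont.predict(df)))
print("AUC, dichotomized predictor:", roc_auc_score(df["persistent"], m_dich.predict(df)))
```

Dichotomizing discards within-category information, which is why the categorized models in the study lost both content agreement and performance.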

4.
Tests for regression coefficients such as global, local, and partial F-tests are common in applied research. In the framework of multiple imputation, several papers address tests for regression coefficients. However, for simultaneous hypothesis testing, the existing methods are computationally intensive because they involve calculations with vectors and (inversion of) matrices. In this paper, we propose a simple method based on a scalar quantity, the coefficient of determination, to perform (global, local, and partial) F-tests with multiply imputed data. The proposed method is evaluated using simulated data and applied to suicide prevention data. Copyright © 2014 John Wiley & Sons, Ltd.

5.
Objectives: Previous studies have investigated factors associated with mortality, but evidence on the determinants of lifespan remains limited. We aimed to develop and validate a lifespan prediction model based on the most important predictors. Design: A prospective cohort study. Setting and Participants: A total of 23,892 community-living adults aged 65 years or older with confirmed death records between 1998 and 2018 from 23 provinces in China. Methods: Information on demographic characteristics, lifestyle, functional health, and prevalence of diseases was collected. The risk prediction model was generated using multivariate linear regression, incorporating the most important predictors identified by the Lasso selection method. We used 1000 bootstrap resamples for internal validation. Model performance was assessed by adjusted R2, root mean square error (RMSE), mean absolute error (MAE), and intraclass correlation coefficient (ICC). Results: Twenty-one predictors were included in the final lifespan prediction model. Older adults with longer lifespans were characterized by older age at baseline, female sex, minority ethnicity, living in rural areas, being married, healthier lifestyles and more leisure engagement, better functional status, and absence of diseases. The predicted lifespans were highly consistent with observed lifespans, with an adjusted R2 of 0.893; the RMSE was 2.86 (95% CI 2.84-2.88) years and the MAE was 2.18 (95% CI 2.16-2.20) years. The ICC between observed and predicted lifespans was 0.971 (95% CI 0.971-0.971). Conclusions and Implications: The lifespan prediction model was validated with good performance; the web-based prediction tool can be easily applied in practice as it relies on easily accessible variables.
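A minimal Python sketch of the Lasso-selection-then-regression pipeline with bootstrap optimism correction. The file clhls.csv, the column names, and the reduced number of bootstrap replicates are illustrative assumptions; the study used 1000 bootstrap resamples.

```python
# Hedged sketch: Lasso predictor selection, ordinary linear regression,
# and optimism-corrected R^2 via the bootstrap. Numeric predictors assumed.
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV, LinearRegression
from sklearn.utils import resample
from sklearn.metrics import r2_score, mean_squared_error, mean_absolute_error

df = pd.read_csv("clhls.csv")                 # hypothetical cohort extract
y = df.pop("lifespan_years")

# Lasso with 10-fold CV picks the penalty; nonzero coefficients define the predictors
lasso = LassoCV(cv=10, random_state=0).fit(df, y)
selected = df.columns[lasso.coef_ != 0]

final = LinearRegression().fit(df[selected], y)
pred = final.predict(df[selected])
apparent_r2 = r2_score(y, pred)
print("apparent R2:", apparent_r2,
      "RMSE:", np.sqrt(mean_squared_error(y, pred)),
      "MAE:", mean_absolute_error(y, pred))

# Optimism correction: refit on bootstrap samples, test each refit on the original data
optimism = []
for b in range(200):                          # abridged; the study used 1000 resamples
    Xb, yb = resample(df[selected], y, random_state=b)
    fit_b = LinearRegression().fit(Xb, yb)
    optimism.append(r2_score(yb, fit_b.predict(Xb))
                    - r2_score(y, fit_b.predict(df[selected])))
print("optimism-corrected R2:", apparent_r2 - np.mean(optimism))
```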

6.
Objectives: The choice of an adequate sample size for a Cox regression analysis is generally based on the rule of thumb, derived from simulation studies, of a minimum of 10 events per variable (EPV). One simulation study suggested scenarios in which the 10 EPV rule can be relaxed. The effect of a range of binary predictors with varying prevalence, reflecting clinical practice, has not yet been fully investigated. Study Design and Setting: We conducted an extended resampling study using a large general-practice data set, comprising over 2 million anonymized patient records, to examine the EPV requirements for prediction models with low-prevalence binary predictors developed using Cox regression. The performance of the models was then evaluated using an independent external validation data set. We investigated both fully specified models and models derived using variable selection. Results: Our results indicate that an EPV rule of thumb should be data driven and that EPV ≥ 20 generally eliminates bias in regression coefficients when many low-prevalence predictors are included in a Cox model. Conclusion: A higher EPV is needed when low-prevalence predictors are present in a model to eliminate bias in regression coefficients and improve predictive accuracy.
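The EPV arithmetic itself is simple; a tiny worked example (the numbers below are made up, not the study's):

```python
# Events-per-variable (EPV) check for a candidate Cox model.
events = 1200               # illustrative count of outcome events in the development data
candidate_parameters = 45   # illustrative degrees of freedom spent on candidate predictors
epv = events / candidate_parameters
print(f"EPV = {epv:.1f}")   # the abstract suggests EPV >= 20 when many
                            # low-prevalence binary predictors enter the model
```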

7.
Objectives: This article aimed to develop and validate an anthropometric equation based on least absolute shrinkage and selection operator (LASSO) regression, a machine learning approach, to predict appendicular skeletal muscle mass (ASM) in 60- to 70-year-old women. Design: A cross-sectional study. Setting and Participants: Community-dwelling women aged 60-70 years. Methods: A total of 1296 community-dwelling women aged 60-70 years were randomly divided into a development or a validation group (1:1 ratio). ASM was evaluated by bioelectrical impedance analysis (BIA) as the reference. Variables including weight, height, body mass index (BMI), sitting height, waist-to-hip ratio (WHR), calf circumference (CC), and 5 summary measures of limb length were considered as candidate predictors. LASSO regression with 10-fold cross-validation was used to select predictors, and multiple linear regression was applied to develop the BIA-measured ASM prediction equation. Paired t tests and Bland-Altman analysis were used to assess agreement. Results: Weight, WHR, CC, and sitting height were selected by LASSO regression as independent variables, and the equation is ASM = 0.2308 × weight (kg) − 27.5652 × WHR + 8.0179 × CC (m) + 2.3772 × sitting height (m) + 22.2405 (adjusted R2 = 0.848, standard error of the estimate = 0.661 kg, P < .001). Bland-Altman analysis showed high agreement between BIA-measured ASM and predicted ASM: the mean difference between the 2 methods was −0.041 kg, with 95% limits of agreement of −1.441 to 1.359 kg. Conclusions and Implications: The equation for 60- to 70-year-old women provides an accessible estimate of ASM for communities that are not equipped with BIA, which promotes early screening of sarcopenia at the community level. Additionally, sitting height predicted ASM effectively, suggesting it may be useful in further studies of muscle mass.
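The published equation can be applied directly; the short sketch below plugs in one illustrative set of measurements (the input values are invented for demonstration, the coefficients are those reported in the abstract).

```python
# The reported ASM prediction equation, applied to one illustrative woman.
def predicted_asm(weight_kg: float, whr: float,
                  calf_circumference_m: float, sitting_height_m: float) -> float:
    """Appendicular skeletal muscle mass (kg) from the equation in the abstract."""
    return (0.2308 * weight_kg
            - 27.5652 * whr
            + 8.0179 * calf_circumference_m
            + 2.3772 * sitting_height_m
            + 22.2405)

# Example inputs (made up): 62 kg, WHR 0.88, calf circumference 0.34 m, sitting height 0.84 m
print(f"predicted ASM: {predicted_asm(62, 0.88, 0.34, 0.84):.2f} kg")   # about 17.0 kg
```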

8.
We propose a method to combine several predictors (markers) that are measured repeatedly over time into a composite marker score without assuming a model and requiring only a mild condition on the predictor distribution. Assuming that the first and second moments of the predictors can be decomposed into a time and a marker component via a Kronecker product structure that accommodates the longitudinal nature of the predictors, we develop first-moment sufficient dimension reduction techniques to replace the original markers with linear transformations that contain sufficient information for the regression of the predictors on the outcome. These linear combinations can then be combined into a score that has better predictive performance than a score built under a general model that ignores the longitudinal structure of the data. Our methods can be applied to either continuous or categorical outcome measures. In simulations, we focus on binary outcomes and show that our method outperforms existing alternatives, using the area under the receiver operating characteristic (ROC) curve (AUC) as a summary measure of the discriminatory ability of a single continuous diagnostic marker for binary disease outcomes. Published 2011. This article is a US Government work and is in the public domain in the USA.

9.
Objectives: Regardless of the proportion of missing values, complete-case analysis is most frequently applied, although advanced techniques such as multiple imputation (MI) are available. The objective of this study was to explore the performance of simple and more advanced methods for handling missing data when some, many, or all item scores are missing in a multi-item instrument. Study Design and Setting: Real-life missing data situations were simulated in a multi-item variable used as a covariate in a linear regression model. Various missing data mechanisms were simulated with increasing percentages of missing data. Subsequently, several techniques for handling missing data were applied to determine the most optimal technique for each scenario. Fitted regression coefficients were compared using bias and coverage as performance parameters. Results: Mean imputation caused biased estimates in every missing data scenario when data were missing for more than 10% of the subjects. Furthermore, when a large percentage of subjects had missing items (>25%), MI methods applied to the item scores outperformed methods applied to the total score. Conclusion: We recommend applying MI to the item scores to obtain the most accurate regression model estimates. Moreover, we advise against using any form of mean imputation to handle missing data.

10.
Objective: To show how the bivariate random effects meta-analysis model can be used to study the relation between explanatory variables and the performance of diagnostic tests as characterized by a summary receiver operating characteristic curve (SROCC). Study Design and Setting: The subject is discussed by means of a data example in which sensitivity and specificity are available for 149 studies of one of three tests for the diagnosis of coronary artery disease. The focus is on comparing SROCCs between different tests adjusted for potential confounders, but the methods can be applied much more generally. Results: Different types of SROCCs can be calculated. The influence of explanatory variables on an SROCC is an ensemble of sensitivity and specificity regression coefficients and covariance parameters. The regression coefficients of the SROCC are estimated and tested, and the percentage of explained variability is determined. Under certain assumptions, the SROCCs for different covariate values do not cross; if these assumptions are fulfilled, it is much easier to describe the influence of explanatory variables. Conclusions can depend on the type of SROCC. Conclusion: The bivariate random effects meta-analysis model is an appropriate and convenient framework for investigating the effect of covariates on the performance of diagnostic tests as measured by SROCCs.

11.
Summary measures of cardiovascular risk have long been used in public health, but few include nutritional predictors despite extensive evidence linking diet and heart disease. The study objectives were to develop and validate a novel risk score in a case-control study of myocardial infarction (MI) conducted in Costa Rica during 1994-2004. After restricting the data set to healthy participants (n = 1678), conditional logistic regression analyses modeled associations of lifestyle factors (unhealthy diet, decreased physical activity, smoking, waist:hip ratio, low or high alcohol intake, and low socioeconomic status) with risk of MI. Using the estimated coefficients as weights for each component, a regression model was fit to assess score performance. The score was subsequently validated in participants with a history of chronic disease. Higher risk score values were associated with a significantly increased risk of MI [OR = 2.72 (95% CI = 2.28-3.24)]. The findings were replicated in a model (n = 1392) that included the best covariate measures available in the study [OR = 2.71 (95% CI = 2.26-3.26)]. Performance of the score in different subsets of the study population showed c-statistics ranging from 0.63 to 0.67. The new score provides a quantitative summary of modifiable cardiovascular risk factors in the study population.

12.
Objectives: To develop and validate a prediction model to detect sensitization to wheat allergens in bakery workers. Study Design and Setting: The prediction model was developed in 867 Dutch bakery workers (development set; prevalence of sensitization, 13%) and included questionnaire items as candidate predictors. First, principal component analysis was used to reduce the number of candidate predictors. Then, multivariable logistic regression analysis was used to develop the model. Internal validity and the extent of optimism were assessed with bootstrapping. External validation was studied in 390 independent Dutch bakery workers (validation set; prevalence of sensitization, 20%). Results: The prediction model contained the predictors nasoconjunctival symptoms, asthma symptoms, shortness of breath and wheeze, work-related upper and lower respiratory symptoms, and working in a traditional bakery. The model showed good discrimination, with an area under the receiver operating characteristic (ROC) curve of 0.76 (0.75 after internal validation). Application of the model in the validation set gave reasonable discrimination (ROC area = 0.69) and good calibration after a small adjustment of the model intercept. Conclusion: A simple model with questionnaire items only can be used to stratify bakers according to their risk of sensitization to wheat allergens. Its use may increase the cost-effectiveness of (subsequent) medical surveillance.

13.
Background: This study builds on previous research that seeks to estimate kilocalorie intake through microstructural analysis of eating behaviors. As opposed to previous methods, which used a static, individual-based measure of kilocalories per bite, the new method incorporates time- and food-varying predictors. A measure of kilocalories per bite (KPB) was estimated using between- and within-subjects variables. Objective: The purpose of this study was to examine the relationship between within-subjects and between-subjects predictors and KPB, and to develop a model of KPB that improves on previous models. Within-subjects predictors included time since last bite, food item enjoyment, premeal satiety, and time in meal. Between-subjects predictors included body mass index, mouth volume, and sex. Participants/Setting: Seventy-two participants (39 female) consumed two random meals out of five possible meal options with known weights and energy densities. In total, 4,051 usable bites were measured. Main Outcome Measures: The outcome measure of the first analysis was KPB. The outcome measure of the second analysis was meal-level kilocalorie intake, with true intake compared with three estimation methods. Statistical Analyses Performed: Multilevel modeling was used to analyze the influence of the seven predictors of KPB. The accuracy of the model was compared with previous methods of estimating KPB using a repeated-measures analysis of variance. Results: All hypothesized relationships were significant, with slopes in the expected direction, except for body mass index and time in meal. In addition, the new model (with nonsignificant predictors removed) improved on earlier models of KPB. Conclusions: This model offers a new direction for inexpensive, accurate, and objective estimates of kilocalorie intake from bite-based measures.
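A minimal sketch of the kind of two-level model described (bites nested within participants). The file bites.csv and the column names are illustrative assumptions, not the study's variable names; only a random intercept per participant is shown.

```python
# Hedged sketch: multilevel (mixed-effects) model of kilocalories per bite (KPB)
# with within- and between-subject predictors and a participant-level random intercept.
import pandas as pd
import statsmodels.formula.api as smf

bites = pd.read_csv("bites.csv")   # hypothetical bite-level file (~4,051 rows)

model = smf.mixedlm(
    "kpb ~ time_since_last_bite + enjoyment + premeal_satiety + time_in_meal"
    " + bmi + mouth_volume + sex",
    data=bites,
    groups=bites["participant_id"],   # random intercept per participant
)
fit = model.fit()
print(fit.summary())
```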

14.
Logistic regression analysis may well be used to develop a prognostic model for a dichotomous outcome. Especially when limited data are available, it is difficult to determine an appropriate selection of covariables for inclusion in such models. Predictions may also be improved by applying some form of shrinkage in the estimation of the regression coefficients. In this study we compare the performance of several selection and shrinkage methods in small data sets of patients with acute myocardial infarction, where we aim to predict 30-day mortality. Selection methods included backward stepwise selection with significance levels alpha of 0.01, 0.05, 0.157 (the AIC criterion), or 0.50, and the use of qualitative external information on the sign of regression coefficients in the model. Estimation methods included standard maximum likelihood, a linear shrinkage factor, penalized maximum likelihood, the lasso, and quantitative external information on univariable regression coefficients. We found that stepwise selection with a low alpha (for example, 0.05) led to relatively poor model performance when evaluated on independent data. Substantially better performance was obtained with full models containing a limited number of important predictors, in which the regression coefficients were reduced with any of the shrinkage methods. Incorporation of external information for selection and estimation improved the stability and quality of the prognostic models. We therefore recommend shrinkage methods in full models including prespecified predictors, together with incorporation of external information, when prognostic models are constructed in small data sets.
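For orientation, the estimation variants compared can be written schematically (notation mine, not the paper's): maximum likelihood maximizes the log-likelihood $\ell(\beta)$, a linear shrinkage factor multiplies the maximum-likelihood coefficients by a constant $\hat{c}\in(0,1)$, and penalized maximum likelihood and the lasso subtract a penalty from the log-likelihood:

$$
\hat\beta_{\text{penalized}} = \arg\max_\beta \;\ell(\beta) - \lambda \sum_j \beta_j^2,
\qquad
\hat\beta_{\text{lasso}} = \arg\max_\beta \;\ell(\beta) - \lambda \sum_j |\beta_j|,
\qquad
\hat\beta_{\text{shrunk}} = \hat c\,\hat\beta_{\text{ML}}.
$$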

15.
Objective: To systematically review risk prediction models for cervical cancer, provide evidence for selecting the most suitable model in practice, and guide cervical cancer screening. Methods: Using two sets of Chinese and English keywords related to cervical cancer and risk prediction models, we searched CNKI, Wanfang Data, PubMed, Embase, and the Cochrane Library for literature published up to November 21, 2019 that developed or validated cervical cancer risk models. Data were extracted with a form based on the CHARMS checklist, and risk of bias was assessed with the PROBAST tool. Results: Twelve articles covering 15 models were included, of which 5 models were developed in China. Predicted outcomes spanned multiple stages from cervical precancerous lesions to cancer: abnormal cervical smears (1), occurrence or recurrence of CIN (9), and occurrence of cervical cancer (5). The most frequently used predictors were HPV infection (12), age (7), smoking (5), and educational level (5). Two models were built with machine learning. In terms of performance, discrimination ranged from 0.53 to 0.87, whereas calibration was correctly evaluated for only 2 models. Only 2 models were externally validated, in Taiwan, China, using populations from different time periods. Risk-of-bias assessment found all models to be at high risk, particularly in the analysis domain; the main problems were inappropriate handling of missing data (13), incomplete evaluation of model performance (13), inappropriate internal validation (12), and insufficient sample size (11). In addition, inconsistent measurement of predictors and outcomes (8) and unreported blinding of outcome measurement (8) were prominent issues. The model by Rothberg et al. (2018) was of relatively high quality. Conclusions: A number of cervical cancer risk prediction models exist, but their quality is poor. There is an urgent need to improve the measurement of predictors and outcomes, statistical details such as missing data handling and model performance evaluation, and the external validation of existing models, so as to better guide screening.

16.
Objective: To develop and validate machine-learning-based risk prediction models for large-for-gestational-age (LGA) infants during pregnancy and to compare their performance with a model built using traditional logistic regression. Methods: The study population came from the National Free Preconception Health Examination Project, conducted from 2010 to 2012 in 220 counties across 31 provinces of China and covering all rural couples planning a pregnancy. Couples of reproductive age with singleton live births at 24-42 weeks of gestation, together with their newborns, were included. Ten machine learning algorithms were used to build LGA prediction models, and their predictive performance was evaluated. Results: A total of 104,936 newborns were included: 54,856 boys (52.3%) and 50,080 girls (47.7%), with an LGA incidence of 11.7% (12,279 cases). After the data were balanced by downsampling, the overall performance of the machine-learning models improved markedly. The CatBoost model performed best in predicting LGA risk, with an area under the receiver operating characteristic curve (AUC) of 0.932; the logistic regression model performed worst, with an AUC of only 0.555. Conclusions: Compared with traditional logistic regression, machine learning algorithms can build more effective prediction models for LGA risk during pregnancy and have potential practical value.
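A minimal sketch of the general workflow: downsample the majority (non-LGA) class in the training data, then compare a boosted-tree model with logistic regression on held-out AUC. The study used CatBoost; scikit-learn's HistGradientBoostingClassifier stands in here so the sketch has no extra dependency. The file births.csv and the column names are illustrative assumptions, and numeric, preprocessed predictors are assumed.

```python
# Hedged sketch: class balancing by downsampling, then boosted trees vs. logistic regression.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

df = pd.read_csv("births.csv")                       # hypothetical extract
train, test = train_test_split(df, test_size=0.2, stratify=df["lga"], random_state=0)

# Downsample the majority (non-LGA) class in the training data only
lga = train[train["lga"] == 1]
non_lga = train[train["lga"] == 0].sample(n=len(lga), random_state=0)
balanced = pd.concat([lga, non_lga])

X_cols = [c for c in df.columns if c != "lga"]
for name, clf in [("boosted trees (CatBoost stand-in)", HistGradientBoostingClassifier(random_state=0)),
                  ("logistic regression", LogisticRegression(max_iter=1000))]:
    clf.fit(balanced[X_cols], balanced["lga"])
    auc = roc_auc_score(test["lga"], clf.predict_proba(test[X_cols])[:, 1])
    print(f"{name}: test AUC = {auc:.3f}")
```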

17.
Multiple imputation (MI) is a commonly used technique for handling missing data in large-scale medical and public health studies. However, variable selection on multiply imputed data remains an important and longstanding statistical problem. If a variable selection method is applied to each imputed data set separately, it may select different variables for different imputed data sets, which makes it difficult to interpret the final model or draw scientific conclusions. In this paper, we propose a novel multiple imputation-least absolute shrinkage and selection operator (MI-LASSO) variable selection method as an extension of the least absolute shrinkage and selection operator (LASSO) to multiply imputed data. The MI-LASSO method treats the estimated regression coefficients of the same variable across all imputed data sets as a group and applies the group LASSO penalty to yield a consistent variable selection across the multiply imputed data sets. We use a simulation study to demonstrate the advantage of the MI-LASSO method compared with the alternatives. We also apply the MI-LASSO method to the University of Michigan Dioxin Exposure Study to identify important circumstances and exposure factors associated with human serum dioxin concentration in Midland, Michigan. Copyright © 2013 John Wiley & Sons, Ltd.
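A schematic form of the objective consistent with that description (notation mine, not necessarily the paper's exact formulation): with $D$ imputed data sets and $p$ candidate variables, the coefficients of variable $j$ across imputations, $\beta_j^{(1)},\dots,\beta_j^{(D)}$, form one group, so variable $j$ is either kept or dropped in every imputed data set simultaneously:

$$
\hat\beta = \arg\min_{\beta}\;\sum_{d=1}^{D}\bigl\|y^{(d)} - X^{(d)}\beta^{(d)}\bigr\|_2^2
\;+\;\lambda \sum_{j=1}^{p}\sqrt{\sum_{d=1}^{D}\bigl(\beta_j^{(d)}\bigr)^2}.
$$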

18.
Objective: To introduce four analysis methods for multiple parallel mediator models, namely the pure regression approach, inverse probability weighting, the extended natural effect model, and weighting-based imputation, and to examine and compare them. Methods: For multiple parallel mediator models, simulation experiments under three scenarios were used to compare how the different methods estimate direct and indirect effects in each scenario, and a real-data analysis was performed using a data set from the UK Biobank. Results: The simulations and the real-data analysis showed that the pure regression approach and the inverse...

19.
Continuous predictors are routinely encountered when developing a prognostic model. Investigators, who are often non-statisticians, must decide how to handle continuous predictors in their models. Categorising continuous measurements into two or more categories has been widely discredited, yet it is still frequently done because of its simplicity, investigator ignorance of the potential impact and of suitable alternatives, or to facilitate model uptake. We examine the impact of three broad approaches for handling continuous predictors on the performance of a prognostic model: various methods of categorising predictors, modelling a linear relationship between the predictor and outcome, and modelling a nonlinear relationship using fractional polynomials or restricted cubic splines. We compare the performance (measured by the c-index, calibration, and net benefit) of prognostic models built using each approach, evaluating them on data separate from those used to build them. We show that categorising continuous predictors produces models with poor predictive performance and poor clinical usefulness. Categorising continuous predictors is unnecessary, biologically implausible, and inefficient, and should not be used in prognostic model development. © 2016 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
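A small illustration of the recommended alternative to categorisation: modelling a continuous predictor flexibly with a natural (restricted) cubic spline via a patsy formula. The data are simulated and the names are mine; the comparison simply contrasts a linear term with a 4-df spline by AIC.

```python
# Hedged sketch: linear term vs. restricted cubic spline for a continuous predictor.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 1000
df = pd.DataFrame({"age": rng.uniform(30, 90, n)})
# Simulated nonlinear (U-shaped) relationship between age and event risk
logit = -2 + 0.0015 * (df["age"] - 60) ** 2
df["event"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

linear = smf.logit("event ~ age", df).fit(disp=0)
spline = smf.logit("event ~ cr(age, df=4)", df).fit(disp=0)   # natural cubic spline, 4 df
print("AIC, linear term       :", linear.aic)
print("AIC, restricted spline :", spline.aic)
```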

20.
Value in Health, 2015, 18(8): 1025-1036
Background: Condition-specific measures are frequently used to assess the health-related quality of life of people with multiple sclerosis (MS). Such measures are unsuitable for use in economic evaluations that require estimates of cost per quality-adjusted life-year because they are not preference based. Objectives: To report the estimation of a preference-based single index for an eight-dimensional instrument for MS, the Multiple Sclerosis Impact Scale - Eight Dimensions (MSIS-8D), derived from an MS-specific measure of health-related quality of life, the 29-item Multiple Sclerosis Impact Scale (MSIS-29). Methods: We elicited preferences for a sample of MSIS-8D states (n = 169) from a sample (n = 1702) of the UK general population. Preferences were elicited using the time trade-off technique via an Internet-based survey. We fitted regression models to these data to estimate values for all health states described by the MSIS-8D. Estimated values were assessed against MSIS-29 scores and against values derived from generic preference-based measures in a large, representative sample of people with MS. Results: Participants reported that the time trade-off questions were easy to understand. Observed health state values ranged from 0.08 to 0.89. The best-performing model was a main-effects, random-effects model (mean absolute error = 0.04). Validation analyses support the performance of the MSIS-8D index: it correlated more strongly than generic measures with MSIS-29 scores, and it discriminated effectively between subgroups of people with MS. Conclusions: The MSIS-8D enables health state values to be estimated from the MSIS-29, adding to the methods available to assess health outcomes and to estimate quality-adjusted life-years for MS in health technology assessment and decision-making contexts.
