Similar literature
1.
Clinical prediction models (CPMs) can predict clinically relevant outcomes or events. Typically, prognostic CPMs are derived to predict the risk of a single future outcome. However, there are many medical applications where two or more outcomes are of interest, meaning this should be more widely reflected in CPMs so they can accurately estimate the joint risk of multiple outcomes simultaneously. A potentially naïve approach to multi‐outcome risk prediction is to derive a CPM for each outcome separately, then multiply the predicted risks. This approach is only valid if the outcomes are conditionally independent given the covariates, and it fails to exploit the potential relationships between the outcomes. This paper outlines several approaches that could be used to develop CPMs for multiple binary outcomes. We consider four methods, ranging in complexity and conditional independence assumptions: namely, probabilistic classifier chain, multinomial logistic regression, multivariate logistic regression, and a Bayesian probit model. These are compared with methods that rely on conditional independence: separate univariate CPMs and stacked regression. Employing a simulation study and real‐world example, we illustrate that CPMs for joint risk prediction of multiple outcomes should only be derived using methods that model the residual correlation between outcomes. In such a situation, our results suggest that probabilistic classification chains, multinomial logistic regression or the Bayesian probit model are all appropriate choices. We call into question the development of CPMs for each outcome in isolation when multiple correlated or structurally related outcomes are of interest and recommend more multivariate approaches to risk prediction.
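The difference between multiplying separately predicted risks and modelling the dependence between outcomes can be illustrated with a small sketch. It is not the paper's implementation: the two logistic models, their coefficients, and the dependence parameter `gamma` are all hypothetical and chosen only to show the chain-rule factorisation that a probabilistic classifier chain relies on.

```python
import math


def sigmoid(z: float) -> float:
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def p_y1(x: float) -> float:
    """P(Y1 = 1 | x) from a hypothetical logistic model."""
    return sigmoid(-1.0 + 0.8 * x)


def p_y2_given_y1(x: float, y1: int, gamma: float) -> float:
    """P(Y2 = 1 | Y1 = y1, x); gamma is a hypothetical dependence on Y1."""
    return sigmoid(-1.5 + 0.5 * x + gamma * y1)


def p_y2(x: float, gamma: float) -> float:
    """Marginal P(Y2 = 1 | x), obtained by summing over both values of Y1."""
    p1 = p_y1(x)
    return p1 * p_y2_given_y1(x, 1, gamma) + (1.0 - p1) * p_y2_given_y1(x, 0, gamma)


def joint_naive(x: float, gamma: float) -> float:
    """Product of the two marginal risks; valid only under conditional independence."""
    return p_y1(x) * p_y2(x, gamma)


def joint_chain(x: float, gamma: float) -> float:
    """Chain rule: P(Y1 = 1 | x) * P(Y2 = 1 | Y1 = 1, x)."""
    return p_y1(x) * p_y2_given_y1(x, 1, gamma)
```

With `gamma = 0` the outcomes are conditionally independent and the two functions agree; with a positive `gamma` the naive product understates the joint risk in this toy setup.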

2.
We compare the calibration and variability of risk prediction models that were estimated using various approaches for combining information on new predictors, termed ‘markers’, with parameter information available for other variables from an earlier model, which was estimated from a large data source. We assess the performance of risk prediction models updated based on likelihood ratio (LR) approaches that incorporate dependence between new and old risk factors as well as approaches that assume independence (‘naive Bayes’ methods). We study the impact of estimating the LR by (i) fitting a single model to cases and non‐cases when the distribution of the new markers is in the exponential family or (ii) fitting separate models to cases and non‐cases. We also evaluate a new constrained maximum likelihood method. We study updating the risk prediction model when the new data arise from a cohort and extend available methods to accommodate updating when the new data source is a case‐control study. To create realistic correlations between predictors, we also based simulations on real data on response to antiviral therapy for hepatitis C. From these studies, we recommend the LR method fit using a single model or constrained maximum likelihood. Copyright © 2016 John Wiley & Sons, Ltd.
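A rough sketch of the independence-based (‘naive Bayes’) update mentioned above: the risk from the earlier model is converted to odds, multiplied by the likelihood ratio of each new marker, and converted back. The Gaussian marker distributions, their parameters, and the function names are assumptions made for illustration; the abstract does not specify them.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def likelihood_ratio(marker: float, case_mean: float, control_mean: float, sd: float) -> float:
    """LR of a marker value, assuming normal densities in cases and non-cases (hypothetical)."""
    return normal_pdf(marker, case_mean, sd) / normal_pdf(marker, control_mean, sd)


def update_risk(prior_risk: float, lrs: list[float]) -> float:
    """Posterior risk = prior odds times the product of marker LRs, back on the risk scale.

    Multiplying the LRs assumes the new markers are independent of each other
    and of the old risk factors, given disease status.
    """
    odds = prior_risk / (1.0 - prior_risk)
    for lr in lrs:
        odds *= lr
    return odds / (1.0 + odds)
```

For example, `update_risk(0.1, [likelihood_ratio(1.4, 2.0, 0.0, 1.0)])` updates a 10% prior risk using one marker value. Approaches that model dependence would replace the simple product with an LR conditional on the old risk factors.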

3.
During recent decades, interest in prediction models has substantially increased, but approaches to synthesize evidence from previously developed models have failed to keep pace. This causes researchers to ignore potentially useful past evidence when developing a novel prediction model with individual participant data (IPD) from their population of interest. We aimed to evaluate approaches to aggregate previously published prediction models with new data. We consider the situation that models are reported in the literature with predictors similar to those available in an IPD dataset. We adopt a two‐stage method and explore three approaches to calculate a synthesis model, hereby relying on the principles of multivariate meta‐analysis. The former approach employs a naive pooling strategy, whereas the latter accounts for within‐study and between‐study covariance. These approaches are applied to a collection of 15 datasets of patients with traumatic brain injury, and to five previously published models for predicting deep venous thrombosis. Here, we illustrated how the generally unrealistic assumption of consistency in the availability of evidence across included studies can be relaxed. Results from the case studies demonstrate that aggregation yields prediction models with improved discrimination and calibration in the vast majority of scenarios, and results in equivalent performance (compared with the standard approach) in a small minority of situations. The proposed aggregation approaches are particularly useful when few participant data are at hand. Assessing the degree of heterogeneity between IPD and literature findings remains crucial to determine the optimal approach in aggregating previous evidence into new prediction models. Copyright © 2012 John Wiley & Sons, Ltd.

4.
Many factors determine a woman's risk of breast cancer. Some of them are genetic and relate to family history, others are based on personal factors such as reproductive history and medical history. While many papers have concentrated on subsets of these risk factors, no papers have incorporated personal risk factors with a detailed genetic analysis. There is a need to combine these factors to provide a better overall determinant of risk. The discovery of the BRCA1 and BRCA2 genes has explained some of the genetic determinants of breast cancer risk, but these genes alone do not explain all of the familial aggregation of breast cancer. We have developed a model incorporating the BRCA genes, a low penetrance gene and personal risk factors. For an individual woman, her family history is used in conjunction with Bayes' theorem to iteratively produce the likelihood of her carrying any genes predisposing to breast cancer, which in turn affects her likelihood of developing breast cancer. This risk was further refined based on the woman's personal history. The model has been incorporated into a computer program that gives a personalised risk estimate.

5.
Synthesis analysis refers to a statistical method that integrates multiple univariate regression models and the correlation between each pair of predictors into a single multivariate regression model. The practical application of such a method could be developing a multivariate disease prediction model where a dataset containing the disease outcome and every predictor of interest is not available. In this study, we propose a new version of synthesis analysis that is specific to binary outcomes. We show that our proposed method possesses desirable statistical properties. We also conduct a simulation study to assess the robustness of the proposed method and compare it to a competing method. Copyright © 2014 John Wiley & Sons, Ltd.

6.
Genome-wide association studies have identified several single nucleotide polymorphisms (SNPs) that are independently associated with small increments in risk of prostate cancer, opening up the possibility for using such variants in risk prediction. Using segregation analysis of population‐based samples of 4,390 families of prostate cancer patients from the UK and Australia, and assuming all familial aggregation has genetic causes, we previously found that the best model for the genetic susceptibility to prostate cancer was a mixed model of inheritance that included both a recessive major gene component and a polygenic component (P) that represents the effect of a large number of genetic variants each of small effect. Based on published studies of 26 SNPs that are currently known to be associated with prostate cancer, we have extended our model to incorporate these SNPs by decomposing the polygenic component into two parts: a polygenic component due to the known susceptibility SNPs, and the residual polygenic component due to the postulated but as yet unknown genetic variants. The resulting algorithm can be used for predicting the probability of developing prostate cancer in the future based on both SNP profiles and explicit family history information. This approach can be applied to other diseases for which population‐based family data and established risk variants exist. Genet. Epidemiol. 35: 549‐556, 2011. © 2011 Wiley‐Liss, Inc.

7.
The aim of the present study was to determine the association of socio-demographic factors, lifestyle factors, and dietary habits with the risk of prostate cancer (PC) in a case–control study of Spanish men. None of the socio-demographic, lifestyle or dietary variables was found to be a predictor of PC risk. Body mass index was associated with an increased risk of aggressive PC, and fruit consumption with lower Gleason scores and thus less aggressive cancers. Nonetheless, after applying the Bonferroni correction, these variables were no longer associated with PC aggressiveness. More adequately powered epidemiological studies that measure the effect of lifestyle and dietary intake on PC risk and aggressiveness are warranted to further elucidate the role of these modifiable factors in PC etiology.

8.
The heritability of most complex traits is driven by variants throughout the genome. Consequently, polygenic risk scores, which combine information on multiple variants genome-wide, have demonstrated improved accuracy in genetic risk prediction. We present a new two-step approach to constructing genome-wide polygenic risk scores from meta-GWAS summary statistics. Local linkage disequilibrium (LD) is adjusted for in Step 1, followed by, uniquely, long-range LD in Step 2. Our algorithm is highly parallelizable since block-wise analyses in Step 1 can be distributed across a high-performance computing cluster, and flexible, since sparsity and heritability are estimated within each block. Inference is obtained through a formal Bayesian variable selection framework, meaning final risk predictions are averaged over competing models. We compared our method to two alternative approaches, LDPred and lassosum, using all seven traits in the Wellcome Trust Case Control Consortium as well as meta-GWAS summaries for type 1 diabetes (T1D), coronary artery disease, and schizophrenia. Performance was generally similar across methods, although our framework provided more accurate predictions for T1D, for which there are multiple heterogeneous signals in regions of both short- and long-range LD. With sufficient compute resources, our method also allows the fastest runtimes.

9.
This paper is concerned with methods for the external validation of the prognostic scores that predict the probability of some event such as death. The problem is similar to that of testing the goodness-of-fit of a logistic regression model, except that the prognostic scores are not estimated and validated using the same data. A number of methods for assessing logistic model goodness-of-fit have been proposed, and some of these can also be used in the setting considered. A simple test based on the likelihood ratio statistic is proposed, which does not require arbitrary choice of groups or smoothing parameters. In a simulation study, the proposed method is found to be as powerful as commonly used methods under the scenarios considered.

10.
The importance of developing personalized risk prediction estimates has become increasingly evident in recent years. In general, patient populations may be heterogeneous and represent a mixture of different unknown subtypes of disease. When the source of this heterogeneity and resulting subtypes of disease are unknown, accurate prediction of survival may be difficult. However, in certain disease settings, the onset time of an observable short‐term event may be highly associated with these unknown subtypes of disease and thus may be useful in predicting long‐term survival. One approach to incorporate short‐term event information along with baseline markers for the prediction of long‐term survival is through a landmark Cox model, which assumes a proportional hazards model for the residual life at a given landmark point. In this paper, we use this modeling framework to develop procedures to assess how a patient's long‐term survival trajectory may change over time given good short‐term outcome indications along with prognosis on the basis of baseline markers. We first propose time‐varying accuracy measures to quantify the predictive performance of landmark prediction rules for residual life and provide resampling‐based procedures to make inference about such accuracy measures. Simulation studies show that the proposed procedures perform well in finite samples. Throughout, we illustrate our proposed procedures by using a breast cancer dataset with information on time to metastasis and time to death. In addition to baseline clinical markers available for each patient, a chromosome instability genetic score, denoted by CIN25, is also available for each patient and has been shown to be predictive of survival for various types of cancer. We provide procedures to evaluate the incremental value of CIN25 for the prediction of residual life and examine how the residual life profile changes over time. This allows us to identify an informative landmark point, t0, such that accurate risk predictions of the residual life could be made for patients who survive past t0 without metastasis. Copyright © 2013 John Wiley & Sons, Ltd.

11.
In a typical case-control family study, detailed risk factor information is often collected on cases and controls, but not on their relatives for reasons of cost and logistical difficulty in locating the relatives. The impact of missing risk factor information for relatives on estimation of the strength of dependence between the disease risk of pairs of relatives is largely unknown. In this paper, we extend our earlier work on estimating the dependence of ages at onset between paired relatives from case-control family data to include covariates on cases and controls, and possibly relatives. Using population-based case-control families as our basic data structure, we study the effect of missing covariates for relatives and/or cases and controls on the bias of certain dependence parameter estimators via a simulation study. Finally we illustrate various analyses using a case-control family study of early onset prostate cancer.

12.
Although the value of the PSA (prostate-specific antigen) test as a cancer-screening instrument remains hotly contested, over the past two decades its usage has become commonplace. While most men diagnosed with prostate cancer will die with rather than of the disease, widespread PSA screening has led to an attendant increase in cancer diagnoses and the usage of aggressive treatments to ‘combat’ it. Despite the central (if controversial) role that PSA now plays in the diagnosis of prostate cancer and monitoring for recurrence, few studies have set out to explore its role in men's experiences of the disease. Drawing on ethnographic fieldwork at a prostate cancer support group in western Canada, we seek to delineate the meanings the PSA test holds for prostate cancer survivors. For many men in the study, their PSA levels were seen to provide an objective indicator of the presence or absence of cancer, with important implications for their subjective experience of cancer diagnosis and survivorship.

13.
This updated meta-analysis was performed to clarify the relationship between phytoestrogens and prostate cancer risk. Twenty-one case–control and two cohort studies were finally selected for this meta-analysis, totaling 11,346 cases and 140,177 controls. Analytical results showed that daidzein (OR = 0.85; 95% CI: 0.75–0.96), genistein (OR = 0.87; 95% CI: 0.78–0.98), and glycitein (OR = 0.89; 95% CI: 0.81–0.98) were associated with a reduction of prostate cancer risk, but total isoflavones (OR = 0.93; 95% CI: 0.84–1.04), equol (OR = 0.86; 95% CI: 0.66–1.14), total lignans (OR = 1.05; 95% CI: 0.54–2.04), secoisolariciresinol (OR = 1.02; 95% CI: 0.83–1.24), matairesinol (OR = 0.91; 95% CI: 0.75–1.11), enterolactone (OR = 0.94; 95% CI: 0.73–1.20), and coumestrol (OR = 0.89; 95% CI: 0.76–1.06) were not. Sensitivity and publication bias analyses demonstrated that the pooled estimates were stable and reliable. The results support the notion that some phytoestrogens may have a role in decreasing the risk of prostate cancer. Additional large and well-designed cohort studies are needed to confirm these relationships.

14.
Prostate cancer is one of the most common cancers in American men. The cancer could either be locally confined, or it could spread outside the organ. When locally confined, there are several options for treating and curing this disease. Otherwise, surgery is the only option, and in extreme cases of outside spread, it could very easily recur within a short time even after surgery and subsequent radiation therapy. Hence, it is important to know, based on pre-surgery biopsy results, how likely the cancer is to be organ-confined. The paper considers a hierarchical Bayesian neural network approach for posterior prediction probabilities of certain features indicative of non-organ-confined prostate cancer. In particular, we find such probabilities for margin positivity (MP) and seminal vesicle (SV) positivity jointly. The available training set consists of bivariate binary outcomes indicating the presence or absence of the two. In addition, we have certain covariates such as prostate-specific antigen (PSA), Gleason score and the indicator for the cancer to be unilateral or bilateral (i.e. spread on one or both sides) in one data set and gene expression microarrays in another data set. We take a hierarchical Bayesian neural network approach to find the posterior prediction probabilities for a test and validation set, and compare these with the actual outcomes for the first data set. In the case of the microarray data, we use leave-one-out cross-validation to assess the accuracy of our method. We also demonstrate the superiority of our method to the other competing methods through a simulation study. The Bayesian procedure is implemented by an application of the Markov chain Monte Carlo numerical integration technique. For the problem at hand, our Bayesian bivariate neural network procedure is shown to be superior to the classical neural network, Radford Neal's Bayesian neural network, and bivariate logistic models in predicting MP and SV jointly in a patient, both in the two data sets and in the simulation study.

15.
An increasingly important data source for the development of clinical risk prediction models is electronic health records (EHRs). One of their key advantages is that they contain data on many individuals collected over time. This allows one to incorporate more clinical information into a risk model. However, traditional methods for developing risk models are not well suited to these irregularly collected clinical covariates. In this paper, we compare a range of approaches for using longitudinal predictors in a clinical risk model. Using data from an EHR for patients undergoing hemodialysis, we incorporate five different clinical predictors into a risk model for patient mortality. We consider different approaches for treating the repeated measurements including use of summary statistics, machine learning methods, functional data analysis, and joint models. We follow up our empirical findings with a simulation study. Overall, our results suggest that simple approaches perform just as well as, if not better than, more complex analytic approaches. These results have important implications for development of risk prediction models with EHRs. Copyright © 2017 John Wiley & Sons, Ltd.

16.
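Objective: To construct a risk prediction model for intracranial infection after neurosurgical craniotomy using logistic regression analysis, and to evaluate its performance. Methods: Patients who underwent craniotomy in the neurosurgery department of a hospital between January 2019 and June 2021 were enrolled and divided into a case group and a control group according to whether intracranial infection occurred after surgery. Logistic regression was used to analyse the risk factors for intracranial infection after craniotomy and to build a risk prediction model, whose performance was evaluated with the Hosmer-Lemeshow goodness-of-fit test and the receiver operating characteristic (ROC) curve. Results: A total of 778 craniotomy patients were included, of whom 121 developed postoperative intracranial infection, an incidence of 15.55%. Multivariable logistic regression showed that infratentorial surgery, ventricular drainage duration ≥3 d, use of ≥3 pieces of gelatin sponge, blood loss ≥300 mL, and cerebrospinal fluid leakage at the incision were independent risk factors for intracranial infection after craniotomy (all P<0.05). The risk prediction model for intracranial infection after craniotomy was: logit(P) = 5.408 + 0.833×(infratentorial surgery) + 0.083×(ventricular drainage duration) + 1.059×(gelatin sponge use) + 0.456×(blood loss) + 2.821×(cerebrospinal fluid leakage at the incision). The Hosmer-Lemeshow goodness-of-fit test showed no statistically significant difference between the predicted probability of intracranial infection and the actual incidence (P=0.768). The validation accuracy of the logistic regression risk prediction model was 86.00%, and the area under the ROC curve was 0.847 (95% CI: 0.814–0.878). Conclusion: Infratentorial surgery, ventricular drainage duration ≥3 d, use of ≥3 pieces of gelatin sponge, blood loss ≥300 mL, and cerebrospinal fluid leakage at the incision are independent risk factors for intracranial infection after neurosurgical craniotomy, and the risk prediction model built with logistic regression analysis predicts postoperative intracranial infection well.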
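The reported logit model can be turned into a predicted probability with the inverse logit, as in the minimal sketch below. Two things are assumptions rather than facts from the abstract: each predictor is coded as a 0/1 indicator of the stated threshold, and the dictionary keys are hypothetical names. The intercept is taken exactly as printed (+5.408); with an observed incidence of 15.55% a negative intercept would be more usual, so the sign may have been lost in the source, and the sketch therefore leaves the intercept as a parameter.

```python
import math

# Coefficients as printed in the abstract; key names are hypothetical labels.
COEFFICIENTS = {
    "infratentorial_surgery": 0.833,
    "ventricular_drainage_ge_3d": 0.083,
    "gelatin_sponge_ge_3": 1.059,
    "blood_loss_ge_300ml": 0.456,
    "incisional_csf_leak": 2.821,
}
INTERCEPT_AS_PRINTED = 5.408


def linear_predictor(patient: dict, intercept: float = INTERCEPT_AS_PRINTED) -> float:
    """logit(P): intercept plus the sum of coefficient * indicator (assumed 0/1 coding)."""
    return intercept + sum(
        coef * patient.get(name, 0) for name, coef in COEFFICIENTS.items()
    )


def predicted_probability(patient: dict, intercept: float = INTERCEPT_AS_PRINTED) -> float:
    """Inverse logit of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(patient, intercept)))
```

Calling `predicted_probability({"incisional_csf_leak": 1}, intercept=-5.408)` shows how the prediction would change under the alternative sign; which sign the authors intended cannot be determined from the abstract.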

17.
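Analysis of environmental and genetic risk factors for gastric cancer and their attributable risk   Times cited: 6 (self-citations: 1; citations by others: 5)
Objective: To analyse the environmental and genetic risk factors for gastric cancer and to evaluate their attributable risk. Methods: In a case-control study, environmental risk factors were surveyed in 121 patients with primary gastric cancer in the Nanjing area, and polymorphisms in the genes of the related enzyme systems were analysed, in order to jointly evaluate the attributable risk of environmental risk factors and genetic risk in the development of gastric cancer. Results: In the Nanjing population, the combined population attributable risk of two environmental risk factors (smoking and consumption of pickled foods) together with the genotypes of the genetic risk factors cytochrome P450 2E1 (CYP2E1) and N-acetyltransferase 2 (NAT2) reached 69.7%. Gastric cancer arises mainly from the joint action of environmental risk factors and inherent genetic characteristics. Conclusion: Interventions for gastric cancer should consider both environmental risk factors and genetic risk; once an individual's genetic susceptibility is understood, the corresponding environmental risk factors should be targeted in order to achieve primary prevention.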

18.
Many prediction models have been developed for the risk assessment and the prevention of cardiovascular disease in primary care. Recent efforts have focused on improving the accuracy of these prediction models by adding novel biomarkers to a common set of baseline risk predictors. Few have considered incorporating repeated measures of the common risk predictors. Through application to the Atherosclerosis Risk in Communities study and simulations, we compare models that use simple summary measures of the repeat information on systolic blood pressure, such as (i) baseline only; (ii) last observation carried forward; and (iii) cumulative mean, against more complex methods that model the repeat information using (iv) ordinary regression calibration; (v) risk‐set regression calibration; and (vi) joint longitudinal and survival models. In comparison with the baseline‐only model, we observed modest improvements in discrimination and calibration using the cumulative mean of systolic blood pressure, but little further improvement from any of the complex methods. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
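The three simple summary measures (i) to (iii) can be sketched as follows. This is an illustrative reading of the terms, not the study's code: the function name, the `(time, value)` visit format, and the use of a prediction (landmark) time to decide which measurements count are assumptions.

```python
def summarize_repeated_measures(visits: list[tuple[float, float]], landmark: float) -> dict:
    """Summarise repeated measurements taken at or before a landmark time.

    visits: (time, value) pairs, e.g. (years since baseline, systolic blood pressure).
    Returns the baseline value, the last observation carried forward (LOCF),
    and the cumulative mean of all values observed up to the landmark.
    """
    observed = sorted((t, v) for t, v in visits if t <= landmark)
    if not observed:
        raise ValueError("no measurements at or before the landmark time")
    values = [v for _, v in observed]
    return {
        "baseline": values[0],                          # (i) first measurement only
        "locf": values[-1],                             # (ii) most recent measurement
        "cumulative_mean": sum(values) / len(values),   # (iii) mean of all so far
    }
```

Each summary would then enter the risk model as a single covariate, which is what distinguishes these approaches from regression calibration or joint models that describe the whole trajectory.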

19.
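Objective: To use data mining techniques to build lung cancer risk prediction models from combined epidemiological characteristics and clinical symptom data, to evaluate the performance of each model for lung cancer risk prediction, and to select the best model. Methods: A total of 460 patients with lung cancer and 560 patients with benign lung disease were enrolled, and 16 independent variables covering epidemiological characteristics and clinical symptoms were collected. Participants were randomly divided into a training set and a test set at a ratio of 3:1. Support vector machine (SVM), C5.0 decision tree, and artificial neural network (ANN) models were each built to predict lung cancer risk, and the predictive performance of the models was compared. Results: After feature extraction, nine variables, including blood in the sputum, fever with sweating, and smoking history, were selected as effective variables for building the lung cancer risk prediction models. In the test set, the sensitivities of the SVM, C5.0 decision tree, and ANN models were 74.1%, 62.5%, and 92.9%, respectively; the specificities were 76.2%, 80.4%, and 64.3%; the positive predictive values were 70.9%, 71.4%, and 67.1%; the negative predictive values were 79.0%, 73.2%, and 92.0%; the accuracies were 75.3%, 72.5%, and 76.9%; and the areas under the curve were 0.752 (95% CI: 0.694–0.803), 0.715 (95% CI: 0.655...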
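The performance measures compared above all derive from the test-set confusion matrix. The sketch below shows the standard definitions; the counts in the usage note are made up for illustration and are not the study's data.

```python
def classification_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: diseased patients predicted positive/negative.
    fp/tn: non-diseased patients predicted positive/negative.
    """
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }
```

For instance, `classification_metrics(tp=80, fn=20, fp=30, tn=70)` describes a hypothetical classifier with sensitivity 0.80 and specificity 0.70. A model can rank first on sensitivity and last on specificity, which is why the abstract reports all five measures alongside the area under the curve.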
