Similar literature
 20 similar articles found
1.
Leal SM, Heath SC. Genetic Epidemiology 1999, 17(Z1): S217-S222
Markov chain Monte Carlo (MCMC) methods offer a rapid parametric approach that can test for linkage throughout the entire genome. They share an advantage with nonparametric methods in that the model does not have to be completely specified a priori. However, unlike nonparametric methods, MCMC methods place no limitations on pedigree size and can also handle relatively complex pedigree structures. In addition, MCMC methods can be used to carry out segregation analysis in order to answer questions about the genetic components of a disease phenotype. Segregation analysis gave evidence for between two and eight alcoholism susceptibility loci, each having a modest effect on the phenotype. MCMC methods were used to map alcoholism loci using the phenotypes ALDX1 (DSM-III-R and Feighner criteria) and ALDX2 (World Health Organization diagnosis ICD-10 criteria). There was mild evidence for quantitative trait loci on chromosomes 2, 10, and 11.

2.
3.
In genetic counseling for cancer risk, the probability of carrying a mutation of a cancer-causing gene plays an important role. Family history of various cancers is important in calculating this probability. BRCAPRO is a widely used software package for calculating the probability of carrying mutations in the BRCA1 and BRCA2 genes given the family history of breast and ovarian cancer in first- and second-degree relatives. BRCAPRO uses an analytical (exact) calculation procedure. Using Markov chain Monte Carlo (MCMC) methods, we extend BRCAPRO to handle, in principle, any type of cancer, any family history, and any number of genes and of alleles that each gene may have. When the information used in this MCMC approach is the same as for BRCAPRO (two genes: BRCA1 and BRCA2; two cancers: breast and ovarian; first- and second-degree relatives only), the two approaches give essentially the same answer. Extending the model to include (1) prostate cancer, (2) two mutated alleles of BRCA2, namely, mutations in the Ovarian Cancer Cluster Region (OCCR) and in the non-OCCR region, and (3) relatives of degree greater than second-degree, leads to different carrier probabilities. The MCMC approach is a useful tool in building a comprehensive model to give accurate estimates of carrier probabilities. Such an approach will be even more important as additional information about the genetics of various cancers becomes available.

4.
Respondent‐driven sampling (RDS) is a recently introduced, and now widely used, technique for estimating disease prevalence in hidden populations. RDS data are collected through a snowball mechanism, in which current sample members recruit future sample members. In this paper we present RDS as Markov chain Monte Carlo importance sampling, and we examine the effects of community structure and the recruitment procedure on the variance of RDS estimates. Past work has assumed that the variance of RDS estimates is primarily affected by segregation between healthy and infected individuals. We examine an illustrative model to show that this is not necessarily the case, and that bottlenecks anywhere in the networks can substantially affect estimates. We also show that variance is inflated by a common design feature in which the sample members are encouraged to recruit multiple future sample members. The paper concludes with suggestions for implementing and evaluating RDS studies. Copyright © 2009 John Wiley & Sons, Ltd.
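The importance-sampling view of RDS can be illustrated with a small sketch. It assumes the usual idealization that recruitment behaves like a random walk on an undirected network, so a person is sampled with probability roughly proportional to their degree, and it reweights each respondent by the inverse of their reported degree (an inverse-degree-weighted estimator often called RDS-II or Volz-Heckathorn). The function name `rds_prevalence` and the toy data are hypothetical and are not taken from the paper.

```python
def rds_prevalence(infected, degrees):
    """Inverse-degree-weighted prevalence estimate.

    infected: list of 0/1 outcomes, one per respondent.
    degrees:  reported network degree of each respondent (must be > 0).
    """
    if len(infected) != len(degrees):
        raise ValueError("infected and degrees must have the same length")
    if any(d <= 0 for d in degrees):
        raise ValueError("degrees must be positive")
    # Importance weights: inverse of the (assumed) degree-proportional
    # sampling probability of a random walk on the network.
    weights = [1.0 / d for d in degrees]
    return sum(w * y for w, y in zip(weights, infected)) / sum(weights)


# Toy sample (hypothetical): the infected respondents have higher degree,
# so they are over-represented relative to the population.
infected = [1, 1, 0, 0]
degrees = [10, 10, 2, 2]
naive = sum(infected) / len(infected)        # unweighted sample mean
weighted = rds_prevalence(infected, degrees)  # down-weights high-degree people
```

The sketch covers only the point estimate. The variance effects discussed in the abstract (bottlenecks, multiple recruits per respondent) depend on the network and recruitment design and are not modeled here.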

5.
We describe a Bayesian approach to incorporate between-individual heterogeneity associated with parameters of complicated biological models. We emphasize the use of the Markov chain Monte Carlo (MCMC) method in this context and demonstrate the implementation and use of MCMC by analysis of simulated overdispersed Poisson counts and by analysis of an experimental data set on preneoplastic liver lesions (their number and sizes) in the presence of heterogeneity. These examples show that MCMC-based estimates, derived from the posterior distribution with uniform priors, may agree well with maximum likelihood estimates (if available). However, with heterogeneous parameters, maximum likelihood estimates can be difficult to obtain, involving many integrations. In this case, the MCMC method offers substantial computational advantages.
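The agreement between MCMC estimates under a uniform prior and maximum likelihood estimates can be seen in a deliberately simplified sketch. It uses a random-walk Metropolis sampler for a single Poisson rate with a Uniform(0, upper) prior, with no overdispersion or heterogeneity, so it is not the model of the paper. The function names, the prior bound, the tuning values, and the toy counts are all assumptions made for illustration.

```python
import math
import random


def log_posterior(lam, counts, upper=100.0):
    """Log posterior (up to a constant) of a Poisson rate, Uniform(0, upper) prior."""
    if lam <= 0.0 or lam >= upper:
        return -math.inf  # outside the prior support
    return sum(counts) * math.log(lam) - len(counts) * lam


def metropolis(counts, n_iter=20000, burn_in=2000, step=0.5, seed=1):
    """Random-walk Metropolis draws from the posterior of the Poisson rate."""
    rng = random.Random(seed)
    lam = max(sum(counts) / len(counts), 0.1)  # start near the MLE
    current = log_posterior(lam, counts)
    draws = []
    for i in range(n_iter):
        proposal = lam + rng.gauss(0.0, step)  # symmetric proposal
        candidate = log_posterior(proposal, counts)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            lam, current = proposal, candidate
        if i >= burn_in:
            draws.append(lam)
    return draws


counts = [3, 7, 4, 6, 5, 5, 2, 8, 4, 6]   # hypothetical data
mle = sum(counts) / len(counts)           # maximum likelihood estimate
draws = metropolis(counts)
posterior_mean = sum(draws) / len(draws)
```

With a flat prior the posterior here is close to a Gamma distribution whose mean is (sum of counts + 1) / n, so the posterior mean and the MLE differ only slightly. The computational advantage described in the abstract arises in heterogeneous models, where the likelihood involves integrals that this toy example does not have.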

6.
We tested a new computer program, LOKI, that implements a reversible jump Markov chain Monte Carlo (MCMC) technique for segregation and linkage analysis. Our objective was to determine whether this software, designed for use with continuously distributed phenotypes, has any efficacy when applied to the discrete disease states in the simulated Mordor data from GAW Problem 1. Although we were able to identify the genomic location of two of the three quantitative trait loci by repeated application of the software, the MCMC sampler experienced significant mixing problems, indicating that the method, as currently formulated in LOKI, was not suitable for the discrete phenotypes in this data set.

7.
Eberly LE, Carlin BP. Statistics in Medicine 2000, 19(17-18): 2279-2294
The marked increase in popularity of Bayesian methods in statistical practice over the last decade owes much to the simultaneous development of Markov chain Monte Carlo (MCMC) methods for the evaluation of requisite posterior distributions. However, along with this increase in computing power has come the temptation to fit models larger than the data can readily support, meaning that often the propriety of the posterior distributions for certain parameters depends on the propriety of the associated prior distributions. An important example arises in spatial modelling, wherein separate random effects for capturing unstructured heterogeneity and spatial clustering are of substantive interest, even though only their sum is well identified by the data. Increasing the informative content of the associated prior distributions offers an obvious remedy, but one that hampers parameter interpretability and may also significantly slow the convergence of the MCMC algorithm. In this paper we investigate the relationship among identifiability, Bayesian learning and MCMC convergence rates for a common class of spatial models, in order to provide guidance for prior selection and algorithm tuning. We are able to elucidate the key issues with relatively simple examples, and also illustrate the varying impacts of covariates, outliers and algorithm starting values on the resulting algorithms and posterior distributions.

8.
In statistical modelling, it is often important to know how much parameter estimates are influenced by particular observations. An attractive approach is to re-estimate the parameters with each observation deleted in turn, but this is computationally demanding when fitting models by using Markov chain Monte Carlo (MCMC), as obtaining complete sample estimates is often in itself a very time-consuming task. Here we propose two efficient ways to approximate the case-deleted estimates by using output from MCMC estimation. Our first proposal, which directly approximates the usual influence statistics in maximum likelihood analyses of generalised linear models (GLMs), is easy to implement and avoids any further evaluation of the likelihood. Hence, unlike the existing alternatives, it does not become more computationally intensive as the model complexity increases. Our second proposal, which utilises model perturbations, also has this advantage and does not require the form of the GLM to be specified. We show how our two proposed methods are related and evaluate them against the existing method of importance sampling and case deletion in a logistic regression analysis with missing covariates. We also provide practical advice for those implementing our procedures, so that they may be used in many situations where MCMC is used to fit statistical models.

9.
This article focuses on the modelling and prediction of costs due to disease accrued over time, to inform the planning of future services and budgets. It is well documented that the modelling of cost data is often problematic due to the distribution of such data; for example, it may be strongly right-skewed with a significant percentage of zero-cost observations. An additional problem associated with modelling costs over time is that cost observations measured on the same individual at different time points will usually be correlated. In this study we compare the performance of four different multilevel/hierarchical models (which allow for both the within-subject and between-subject variability) for analysing healthcare costs in a cohort of individuals with early inflammatory polyarthritis (IP) who were followed up annually over a 5-year time period from 1990/1991. The hierarchical models fitted included linear regression models and two-part models with log-transformed costs, and a two-part model with gamma regression and a log link. The cohort was split into a learning sample, to fit the different models, and a test sample, to assess the predictive ability of these models. To obtain predicted costs on the original cost scale (rather than the log-cost scale), two different retransformation factors were applied. All analyses were carried out using Bayesian Markov chain Monte Carlo (MCMC) simulation methods.

10.
Although extended pedigrees are often sampled through probands with extreme levels of a quantitative trait, Markov chain Monte Carlo (MCMC) methods for segregation and linkage analysis have not been able to perform ascertainment corrections. Further, the extent to which ascertainment of pedigrees leads to biases in the estimation of segregation and linkage parameters has not been previously studied for MCMC procedures. In this paper, we studied these issues with a Bayesian MCMC approach for joint segregation and linkage analysis, as implemented in the package Loki. We first simulated pedigrees ascertained through individuals with extreme values of a quantitative trait in the spirit of the sequential sampling theory of Cannings and Thompson ([1977] Clin. Genet. 12:208-212). Using our simulated data, we detected no bias in estimates of the trait locus location. However, bias was found in the estimates of allele frequencies and, when the ascertainment threshold was higher than or close to the true value of the highest genotypic mean, also in the estimate of that mean. When there were multiple trait loci, this bias destroyed the additivity of the effects of the trait loci, and caused biases in the estimation of all genotypic means when a purely additive model was used for analyzing the data. To account for pedigree ascertainment with sequential sampling, we developed a Bayesian ascertainment approach and implemented Metropolis-Hastings updates in the MCMC samplers used in Loki. Ascertainment correction greatly reduced biases in parameter estimates. Our method is designed for multiple trait loci, but the number of loci must be fixed.

11.
Lawson AB. Statistics in Medicine 2000, 19(17-18): 2361-2375
The spatial modelling of small area health data has, for some time, included spatial autocorrelation as a random effect. This effect is non-specific and global and does not address the location of clusters of disease (a specific task). This paper addresses the need for specific and non-specific random effects within spatial epidemiology. In addition, individual frailty is also considered important and a computational algorithm based on reversible jump Markov chain Monte Carlo (RJMCMC) methods is described.

12.
A Bayesian statistical model and estimation methodology based on forward projection adaptive Markov chain Monte Carlo is developed in order to perform the calibration of a high‐dimensional nonlinear system of ordinary differential equations representing an epidemic model for human papillomavirus types 6 and 11 (HPV‐6, HPV‐11). The model is compartmental and involves stratification by age, gender and sexual‐activity group. Developing this model and a means to calibrate it efficiently is relevant because HPV is a very multi‐typed and common sexually transmitted infection with more than 100 types currently known. The two types studied in this paper, types 6 and 11, are causing about 90% of anogenital warts. We extend the development of a sexual mixing matrix on the basis of a formulation first suggested by Garnett and Anderson, frequently used to model sexually transmitted infections. In particular, we consider a stochastic mixing matrix framework that allows us to jointly estimate unknown attributes and parameters of the mixing matrix along with the parameters involved in the calibration of the HPV epidemic model. This matrix describes the sexual interactions between members of the population under study and relies on several quantities that are a priori unknown. The Bayesian model developed allows one to estimate jointly the HPV‐6 and HPV‐11 epidemic model parameters as well as unknown sexual mixing matrix parameters related to assortativity. Finally, we explore the ability of an extension to the class of adaptive Markov chain Monte Carlo algorithms to incorporate a forward projection strategy for the ordinary differential equation state trajectories. Efficient exploration of the Bayesian posterior distribution developed for the ordinary differential equation parameters provides a challenge for any Markov chain sampling methodology, hence the interest in adaptive Markov chain methods. We conclude with simulation studies on synthetic and recent actual data. 
Copyright © 2012 John Wiley & Sons, Ltd.

13.
It is well known that the modeling of cost data is often problematic due to the distribution of such data. Commonly observed problems include 1) a strongly right-skewed data distribution and 2) a significant percentage of zero-cost observations. This article demonstrates how a hurdle model can be implemented from a Bayesian perspective by means of Markov chain Monte Carlo simulation methods using the freely available software WinBUGS. Assessment of model fit is addressed through the implementation of two cross-validation methods. The relative merits of this Bayesian approach compared to the classical equivalent are discussed in detail. To illustrate the methods described, patient-specific non-health-care resource-use data from a prospective longitudinal study and the Norfolk Arthritis Register (NOAR) are utilized for 218 individuals with early inflammatory polyarthritis (IP). The NOAR database also includes information on various patient-level covariates.

14.
Genetic data from founder populations are advantageous for studies of complex traits that are often plagued by the problem of genetic heterogeneity. However, the desire to analyze large and complex pedigrees that often arise from such populations, coupled with the need to handle many linked and highly polymorphic loci simultaneously, poses challenges to current standard approaches. A viable alternative for solving such problems is via Markov chain Monte Carlo (MCMC) procedures, where a Markov chain, defined on the state space of a latent variable (e.g., genotypic configuration or inheritance vector), is constructed. However, finding starting points for the Markov chains is a difficult problem when the pedigree is not single-locus peelable; methods proposed in the literature have not yielded completely satisfactory solutions. We propose a generalization of the heated Gibbs sampler with relaxed penetrances (HGRP) of Lin et al. ([1993] IMA J. Math. Appl. Med. Biol. 10:1-17) to search for starting points. HGRP guarantees that a starting point will be found if there is no error in the data, but the chain usually needs to be run for a long time if the pedigree is extremely large and complex. By introducing a forcing step, the current algorithm substantially reduces the state space, and hence effectively speeds up the process of finding a starting point. Our algorithm also has a built-in preprocessing procedure for Mendelian error detection. The algorithm has been applied to both simulated and real data on two large and complex Hutterite pedigrees under many settings, and good results are obtained. The algorithm has been implemented in a user-friendly package called START.

15.
A Bayesian method for multipoint mapping of disease genes based on Markov chain Monte Carlo algorithms was applied to the simulated GAW11 data (Study 2). The method is based on repeated Gibbs and more general Metropolis-Hastings steps. For simplicity we assumed a single disease locus model with two alleles. A normal distribution for the underlying latent variable of the qualitative phenotype was assumed. Based on a single replicate of the data no clear evidence of any of the genes underlying the simulated disease was found. However, when three replicates were combined the method was able to locate the locus C correctly on chromosome 3.

16.
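Objective: To evaluate the estimators of exposure effect obtained by the propensity score method and their statistical properties, and to explore the practicality of the method. Methods: Computer simulation was used to analyze the bias and precision of the propensity score method with and without model misspecification, and the results were compared with simulation results for model-based methods. Results: When model misspecification is present, the propensity score method is more robust than model-based methods. Conclusion: For large data sets with complex relationships, the propensity score method offers considerable flexibility.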

17.
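Objective: The coronavirus disease 2019 (COVID-19) epidemic has swept the globe. Until the epidemic ends, estimates of its case fatality rate are affected by the currently confirmed cases and by the distribution of time from onset to death, and conclusions remain unclear. This study aims to estimate the age-specific case fatality rate of COVID-19. Methods: COVID-19 epidemic data released by the National Health Commission and the CDC were collected. A Gamma distribution was fitted to the time from onset to death, and Markov chain Monte Carlo simulation was used to estimate the age-specific case fatality rate. Results: The median time from onset to death for COVID-19 was 13.77 days (P25-P75: 9.03-21.02 days). The overall case fatality rate was 4.1% (95% CI: 3.7%-4.4%), and the case fatality rates in the 0-, 10-, 20-, 30-, 40-, 50-, 60-, 70- and ≥80-year age groups were 0.1%, 0.4%, 0.4%, 0.4%, 0.8%, 2.3%, 6.4%, 14.0% and 25.8%, respectively. Conclusion: The censoring-adjusted Markov chain Monte Carlo simulation method is suitable for estimating the case fatality rate during outbreaks of emerging infectious diseases, and determining the case fatality rate of COVID-19 as early as possible helps epidemic prevention and control.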

18.
We report a Markov chain Monte Carlo analysis of the five simulated quantitative traits in Genetic Analysis Workshop 12 using the Loki software. Our objectives were to determine the efficacy of the Markov chain Monte Carlo method and to test a new scoring technique. Our initial blind analysis, on replicate 42 (the “best replicate”), successfully detected four out of the five disease loci and found no false positives. A power analysis shows that the software could usually detect 4 of the 10 trait/gene combinations at an empirical point‐wise p‐value of 1.5 × 10⁻⁴. © 2001 Wiley‐Liss, Inc.

19.
We provide an overview of the use of kernel smoothing to summarize the quantitative trait locus posterior distribution from a Markov chain Monte Carlo sample. More traditional distributional summary statistics based on the histogram depend both on the bin width and on the sideways shift of the bin grid used. These factors influence both the overall mapping accuracy and the estimated location of the mode of the distribution. Replacing the histogram by kernel smoothing helps to alleviate these problems. Using simulated data, we performed numerical comparisons between the two approaches. The results clearly illustrate the superiority of the kernel method. The kernel approach is particularly efficient when one needs to point out the best putative quantitative trait locus position on the marker map. In such situations, the smoothness of the posterior estimate is especially important because rough posterior estimates easily produce biased mode estimates. Different kernel implementations are available from Rolf Nevanlinna Institute's web page (http://www.rni.helsinki.fi/;fjh).
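The dependence of a histogram mode on the shift of the bin grid, and the way a kernel estimate avoids it, can be sketched in a few lines. This is a minimal illustration with a fixed-bandwidth Gaussian kernel; the function names, the bandwidth, and the toy "posterior sample" of locus positions are assumptions and do not come from the paper or from the implementations it mentions.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function x -> Gaussian kernel density estimate at x."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def histogram_mode(samples, bin_width, origin):
    """Midpoint of the fullest bin for a bin grid that starts at origin."""
    counts = {}
    for s in samples:
        k = math.floor((s - origin) / bin_width)
        counts[k] = counts.get(k, 0) + 1
    best = max(counts, key=counts.get)
    return origin + (best + 0.5) * bin_width


def kde_mode(samples, bandwidth, grid):
    """Grid point at which the kernel density estimate is largest."""
    return max(grid, key=gaussian_kde(samples, bandwidth))


# Hypothetical MCMC draws of a locus position (in cM).
positions = [48, 49, 50, 50, 51, 52, 53, 70]
mode_a = histogram_mode(positions, bin_width=5, origin=0.0)  # grid 0, 5, 10, ...
mode_b = histogram_mode(positions, bin_width=5, origin=2.5)  # same bins, shifted
grid = [40 + 0.1 * i for i in range(401)]
mode_kde = kde_mode(positions, bandwidth=2.0, grid=grid)
```

Shifting the bin grid by half a bin moves the histogram mode between neighbouring bin midpoints, while the kernel estimate has no grid origin to choose. It still depends on the bandwidth, which plays the role of the bin width.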

20.
We present a reversible jump Bayesian piecewise log-linear hazard model that extends the Bayesian piecewise exponential hazard to a continuous function of piecewise linear log hazards. A simulation study encompassing several different hazard shapes, accrual rates, censoring proportions, and sample sizes showed that the Bayesian piecewise linear log-hazard model estimated the true mean survival time and survival distributions better than the piecewise exponential hazard. Survival data from Wake Forest Baptist Medical Center are analyzed by both methods and the posterior results are compared.
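For readers unfamiliar with the baseline model, the piecewise exponential hazard is a hazard that is constant within each time interval, so the survival function is S(t) = exp(-H(t)), where H(t) sums hazard times time spent in each interval. The sketch below evaluates S(t) for fixed cut points and hazards. It does not implement the reversible jump sampler or the piecewise linear log-hazard extension, and the function name and example values are hypothetical.

```python
import math


def piecewise_exponential_survival(t, cut_points, hazards):
    """Survival probability S(t) under a piecewise constant hazard.

    cut_points: increasing interior cut points, e.g. [1.0, 3.0].
    hazards:    one hazard per interval, so len(cut_points) + 1 values.
    """
    if len(hazards) != len(cut_points) + 1:
        raise ValueError("need exactly one hazard per interval")
    if t < 0:
        raise ValueError("t must be non-negative")
    edges = [0.0] + list(cut_points) + [math.inf]
    cumulative = 0.0  # cumulative hazard H(t)
    for lo, hi, h in zip(edges[:-1], edges[1:], hazards):
        if t <= lo:
            break  # t lies before this interval; nothing more to add
        cumulative += h * (min(t, hi) - lo)  # time spent in [lo, hi)
    return math.exp(-cumulative)


# Hazard 0.5 on [0, 1) and 1.0 afterwards (hypothetical values).
s_two = piecewise_exponential_survival(2.0, [1.0], [0.5, 1.0])
```

In the example the cumulative hazard at t = 2 is 0.5 × 1 + 1.0 × 1 = 1.5, so `s_two` equals exp(-1.5). The hazard jumps at each cut point, which is the discontinuity that a continuous piecewise linear log hazard is meant to remove.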
