Towards Data Warehousing and Mining of Protein Unfolding Simulation Data
Authors:Daniel Berrar  Frederic Stahl  Candida Silva  J. Rui Rodrigues  Rui M. M. Brito  Werner Dubitzky
Affiliation:(1) School of Biomedical Sciences, University of Ulster, Cromore Road, Coleraine BT52 1SA, Northern Ireland;(2) School of Biomedical Sciences, University of Ulster, Coleraine, Northern Ireland;(3) Weihenstephan University of Applied Sciences, Freising, Germany;(4) Departamento de Química, Faculdade de Ciências e Tecnologia, and Centro de Neurociências de Coimbra, Universidade de Coimbra, Coimbra, Portugal;(5) School of Biomedical Sciences, University of Ulster, Coleraine, Northern Ireland
Abstract:
Objectives. The prediction of protein structure and the precise understanding of protein folding and unfolding processes remain among the greatest challenges in structural biology and bioinformatics. Computer simulations based on molecular dynamics (MD) are at the forefront of the effort to gain a deeper understanding of these complex processes. Currently, these MD simulations typically span tens of nanoseconds, generate large amounts of conformational data, and are computationally expensive. More and more groups run such simulations and generate large volumes of data, which raises new challenges in managing and analyzing these data. Because of the vast range of proteins researchers want to study and simulate, the computational effort needed to generate data, the large data volumes involved, and the different types of analyses scientists need to perform, it is desirable to provide a public repository allowing researchers to pool and share protein unfolding data. Methods. To adequately organize, manage, and analyze the data generated by unfolding simulation studies, we designed a data warehouse system that is embedded in a grid environment to facilitate the seamless sharing of available computer resources and thus enable many groups to share complex molecular dynamics simulations on a more regular basis. Results. To gain insight into the conformational fluctuations and stability of the monomeric forms of the amyloidogenic protein transthyretin (TTR), molecular dynamics unfolding simulations of the monomer of human TTR have been conducted. Trajectory data and meta-data of the wild-type (WT) protein and the highly amyloidogenic variant L55P-TTR represent the test case for the data warehouse. Conclusions. Web and grid services, especially pre-defined data mining services that can run on or ‘near’ the data repository of the data warehouse, are likely to play a pivotal role in the analysis of molecular dynamics unfolding data.
Based on “Grid Warehousing of Molecular Dynamics Protein Unfolding Data”, by Frederic Stahl, Daniel Berrar, Candida Silva, J. Rui Rodrigues, Rui M.M. Brito, and Werner Dubitzky, which appeared in Proceedings of the IEEE/ACM International Symposium on Cluster Computing and the Grid, Cardiff, UK, May 9–12, 2005.
Keywords:data warehousing  grid  molecular dynamics simulation  data mining  protein unfolding  transthyretin
This article is indexed in PubMed, SpringerLink, and other databases.