Prediction of spurious HLA class II typing results using probabilistic classification
Affiliation: 1. DKMS Life Science Lab, Blasewitzerstraße 43, 01307 Dresden, Germany; 2. DKMS German Bone Marrow Donor Center, Tübingen, Germany
Abstract: While modern high-throughput sequence-based HLA genotyping methods generally provide highly accurate typing results, artefacts may nonetheless arise for numerous reasons, such as sample contamination, sequencing errors, read misalignments, or PCR amplification biases. To help detect spurious typing results, we tested the performance of two probabilistic classifiers (binary logistic regression and random forest models) based on population-specific genotype frequencies. We trained the models using high-resolution typing results for HLA-DRB1, DQB1, and DPB1 from large samples of German, Polish and UK-based donors. The high predictive capacity of the best models was replicated both in 10-fold cross-validation for each gene and on independent evaluation data (AUC 0.820–0.893). While genotype frequencies alone provide enough predictive power to render the model generally useful for highlighting potentially spurious typing results, the inclusion of workflow-specific predictors substantially increases prediction specificity. Low initial DNA concentrations in combination with low-volume PCR reactions form a major source of stochastic error specific to the Fluidigm chip-based workflow at DKMS Life Science Lab. The addition of DNA concentration as a predictor variable thus substantially increased AUC (0.947–0.959) over purely frequency-based models.
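The AUC values quoted in the abstract measure how well a score separates spurious from correct typing results. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a hypothetical `rarity_score` (the negative log of a genotype frequency) in place of the logistic regression and random forest models, and it uses made-up toy frequencies and labels that are not data from the study.

```python
import math


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 = spurious result, 0 = correct result.
    A tie between a positive and a negative counts as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rarity_score(genotype_frequency):
    """Hypothetical score (an assumption): rarer genotypes score higher."""
    return -math.log(genotype_frequency)


# Toy values for illustration only; NOT taken from the study.
toy_frequencies = [0.05, 0.02, 0.0001, 0.01, 0.00005, 0.03]
toy_labels = [0, 0, 1, 0, 1, 1]  # 1 = spurious, 0 = correct

toy_scores = [rarity_score(f) for f in toy_frequencies]
toy_auc = auc(toy_scores, toy_labels)
print(f"toy AUC: {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to a score that ranks spurious and correct results no better than chance, and 1.0 to perfect separation. A fitted classifier would replace `rarity_score` with its predicted probability, and further predictors such as DNA concentration would enter as additional model inputs.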
Keywords: Random forest; HLA class II; Genotyping error; Allelic dropout; Classification
This article is indexed in ScienceDirect and other databases.