PreCimp: Pre‐collapsing imputation approach increases imputation accuracy of rare variants in terms of collapsed variables |
| |
Authors: | Young Jin Kim, Juyoung Lee, Bong‐Jo Kim, TD‐Genes Consortium, Taesung Park |
| |
Affiliation: | 1. Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Korea; 2. Division of Structural and Functional Genomics, Center for Genome Science, Korean National Institute of Health, Osong, Chungchungbuk‐do, Korea; 3. Department of Statistics, Seoul National University, Seoul, Korea |
| |
Abstract: | Imputation is widely used to obtain information about rare variants. However, imputed rare variants often have low accuracy, and inaccurately imputed rare variants may distort the results of region‐based association tests. We therefore developed a pre‐collapsing imputation method (PreCimp) that improves imputation accuracy by using collapsed variables. Briefly, collapsed variables are generated from rare variants in the reference panel, and a new reference panel is constructed by inserting these pre‐collapsed variables into the original reference panel. Subsequent imputation analysis then provides the imputed genotypes of the collapsed variables. We demonstrated the performance of PreCimp on 5,349 genotyped samples using a Korean population‐specific reference panel of 848 samples with exome sequencing, Affymetrix 5.0, and exome chip data. PreCimp outperformed a traditional post‐collapsing method, which collapses imputed variants after each rare variant has been imputed individually. Compared with the post‐collapsing method, PreCimp showed a relative increase in imputation accuracy of about 3.4–6.3% when dosage r² is between 0.6 and 0.8, 10.9–16.1% when dosage r² is between 0.4 and 0.6, and 21.4–129.4% when dosage r² is below 0.4. |
| |
Keywords: | genotyping, imputation, next generation sequencing, population genetics, SNPs |
|
|
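The pre‐collapsing step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the collapsing rule or the rare‐variant threshold, so the code below assumes a simple binary "carries any rare allele" indicator, a hypothetical frequency cutoff `maf_threshold`, and haplotypes coded as 0/1 lists with 1 marking the minor allele. The function names are illustrative and are not taken from PreCimp.

```python
def collapse_rare_variants(haplotypes, maf_threshold=0.01):
    """Build one collapsed variable from the rare variants of a region.

    haplotypes: list of reference haplotypes, each a list of 0/1 alleles
                (assumption: 1 codes the minor allele).
    Returns (collapsed, rare_idx), where collapsed[i] is 1 if haplotype i
    carries a minor allele at any rare variant, and rare_idx lists the
    column indices treated as rare.
    """
    n_hap = len(haplotypes)
    n_var = len(haplotypes[0])
    # Allele frequency of each variant across the reference haplotypes.
    freqs = [sum(h[j] for h in haplotypes) / n_hap for j in range(n_var)]
    # Assumed rule: a variant is rare if it is polymorphic and its
    # frequency does not exceed the threshold.
    rare_idx = [j for j, f in enumerate(freqs) if 0 < f <= maf_threshold]
    collapsed = [1 if any(h[j] for j in rare_idx) else 0 for h in haplotypes]
    return collapsed, rare_idx


def insert_collapsed(haplotypes, collapsed):
    """Return a new reference panel with the collapsed variable appended
    as an extra pseudo-variant column; the input panel is left unchanged."""
    return [h + [c] for h, c in zip(haplotypes, collapsed)]


# Toy panel: 4 haplotypes, 3 variants; the threshold is raised to 0.25
# only so that this tiny example contains rare variants.
panel = [[1, 0, 1], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
collapsed, rare_idx = collapse_rare_variants(panel, maf_threshold=0.25)
new_panel = insert_collapsed(panel, collapsed)
```

In this sketch the augmented panel `new_panel` would then be passed to a standard imputation tool, which treats the appended column like any other variant and returns its imputed dosage directly. The post‐collapsing approach would instead impute each rare variant separately and combine the imputed values afterwards.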