Generalized Functional Linear Models for Gene‐Based Case‐Control Association Studies |
| |
Authors: | Ruzong Fan, Yifan Wang, James L Mills, Tonia C Carter, Iryna Lobach, Alexander F Wilson, Joan E Bailey‐Wilson, Daniel E Weeks, Momiao Xiong |
| |
Institution: | 1. Biostatistics and Bioinformatics Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, United States of America; 2. Epidemiology Branch, Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Rockville, Maryland, United States of America; 3. Center for Human Genetics, Marshfield Clinic, Marshfield, Wisconsin, United States of America; 4. Division of Biostatistics, Department of Epidemiology and Biostatistics, School of Medicine, University of California, San Francisco, California, United States of America; 5. Statistical Genetics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, Maryland, United States of America; 6. Departments of Human Genetics and Biostatistics, Graduate School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America; 7. Human Genetics Center, University of Texas ‐ Houston, Houston, Texas, United States of America |
| |
Abstract: | Using functional data analysis techniques, we developed generalized functional linear models for testing association between a dichotomous trait and multiple genetic variants in a genetic region while adjusting for covariates. Both fixed and mixed effect models are developed and compared. Extensive simulations show that Rao's efficient score tests of the fixed effect models are very conservative, yielding type I error rates below the nominal levels, whereas global tests of the mixed effect models yield accurate type I error rates. Furthermore, we found that Rao's efficient score test statistics of the fixed effect models have higher power than the sequence kernel association test (SKAT) and its optimal unified version (SKAT‐O) in most cases when the causal variants are both rare and common. When the causal variants are all rare (i.e., minor allele frequencies less than 0.03), Rao's efficient score test statistics and the global tests have similar or slightly lower power than SKAT and SKAT‐O. In practice, it is not known whether rare variants or common variants in a gene region are disease related. All we can assume is that a combination of rare and common variants influences disease susceptibility. Thus, the improved performance of our models when the causal variants are both rare and common shows that the proposed models can be very useful in dissecting complex traits. We compare the performance of our methods with SKAT and SKAT‐O on real neural tube defects and Hirschsprung's disease datasets. Rao's efficient score test statistics and the global tests are more sensitive than SKAT and SKAT‐O in the real data analysis. Our methods can be used in either gene‐disease genome‐wide/exome‐wide association studies or candidate gene analyses. |
| |
Keywords: | rare variants; common variants; case‐control association studies; complex diseases; logistic regression; functional data analysis; generalized functional linear models |