On optimal gene-based analysis of genome scans |
| |
Authors: | Bacanu Silviu-Alin |
| |
Affiliation: | Virginia Commonwealth University, Richmond, Virginia 23219, USA. sabacanu@vcu.edu |
| |
Abstract: | Univariate analysis of markers has modest power when there are multiple causal variants within a gene. Under this scenario, combining the effects of all variants from a gene in a gene-wide statistic is thought to increase power. However, it is unclear (1) how well the most commonly used gene-wide methods perform in whole-genome scans and (2) how well these methods scale to more computationally intensive analyses, e.g. analysis of genome-wide sequence data. We attempt to answer these questions by using realistic simulations to assess the performance of a range of gene-based methods: (1) commonly used, e.g. VEGAS and GATES; (2) less commonly used, e.g. Simes, adaptive sum (aSUM), and kernel methods; and (3) a combination of univariate and multivariate tests that we proposed for the analysis of markers in linkage disequilibrium. Simes is the fastest method and has good power for single causal variant models. The aSUM method has good power for multiple causal variant models, especially at lower gene lengths. Our proposed statistic yields good power for all causal models. Given the extreme data volumes coming from sequencing studies, we recommend a two-step analysis of genome scans. The initial step uses the very fast Simes procedure to flag possibly interesting genes. The second step refines interesting signals by using more computationally intensive methods, e.g. (1) aSUM for shorter genes and (2) VEGAS for longer genes. Alternatively, genome scans can be analyzed using only our proposed method, sacrificing only a modest amount of power. |
| |
Keywords: | VEGAS; GATES; Simes; sequencing; principal components |
This article is indexed in PubMed and other databases. |
|
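The abstract recommends the Simes procedure as a fast first-pass screen. As a rough illustration only, the sketch below applies the standard Simes combination rule (the minimum over ranks i of n * p_(i) / i, taken over the sorted per-marker p-values) to obtain one gene-level p-value, and then flags genes below a threshold. It is not the author's implementation: the function names `simes_pvalue` and `flag_genes`, the gene labels, the p-values, and the 0.05 threshold are all hypothetical and chosen for the example.

```python
def simes_pvalue(pvalues):
    """Combine per-marker p-values into one gene-level p-value (Simes rule)."""
    if not pvalues:
        raise ValueError("need at least one p-value")
    ordered = sorted(pvalues)
    n = len(ordered)
    # Simes statistic: minimum over ranks i of n * p_(i) / i, capped at 1.
    combined = min(n * p / rank for rank, p in enumerate(ordered, start=1))
    return min(1.0, combined)


def flag_genes(gene_pvalues, threshold):
    """First screening step: keep genes whose Simes p-value is <= threshold.

    gene_pvalues maps a gene label to the list of its marker p-values.
    The threshold is an assumption of this sketch, not a value from the paper.
    """
    flagged = {}
    for gene, pvals in gene_pvalues.items():
        p = simes_pvalue(pvals)
        if p <= threshold:
            flagged[gene] = p
    return flagged


# Hypothetical marker p-values for two made-up genes.
example = {
    "GENE_A": [0.01, 0.04, 0.5],
    "GENE_B": [0.6, 0.7],
}
candidates = flag_genes(example, threshold=0.05)
print(sorted(candidates))
```

Genes retained by such a screen would then be passed to the slower second-step methods named in the abstract (aSUM or VEGAS), which are not sketched here.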