TIGAR: An Improved Bayesian Tool for Transcriptomic Data Imputation Enhances Gene Mapping of Complex Traits

The transcriptome-wide association studies (TWASs) that test for association between the study trait and the imputed gene expression levels from cis-acting expression quantitative trait loci (cis-eQTL) genotypes have successfully enhanced the discovery of genetic risk loci for complex traits. By using the gene expression imputation models fitted from reference datasets that have both genetic and transcriptomic data, TWASs facilitate gene-based tests with GWAS data while accounting for the reference transcriptomic data. The existing TWAS tools like PrediXcan and FUSION use parametric imputation models that have limitations for modeling the complex genetic architecture of transcriptomic data. Therefore, to improve on this, we employ a nonparametric Bayesian method that was originally proposed for genetic prediction of complex traits, which assumes a data-driven nonparametric prior for cis-eQTL effect sizes. The nonparametric Bayesian method is flexible and general because it includes both of the parametric imputation models used by PrediXcan and FUSION as special cases. Our simulation studies showed that the nonparametric Bayesian model improved both imputation R2 for transcriptomic data and the TWAS power over PrediXcan when ≥1% cis-SNPs co-regulate gene expression and gene expression heritability ≤0.2. In real applications, the nonparametric Bayesian method fitted transcriptomic imputation models for 57.8% more genes over PrediXcan, thus improving the power of follow-up TWASs. We implement both parametric PrediXcan and nonparametric Bayesian methods in a convenient software tool “TIGAR” (Transcriptome-Integrated Genetic Association Resource), which imputes transcriptomic data and performs subsequent TWASs using individual-level or summary-level GWAS data.