Exploiting family history in aggregation unit-based genetic association tests

The development of sequencing technology calls for new, powerful methods to detect disease associations and lower the cost of sequencing studies. Family history (FH) records the disease status of relatives and thus adds valuable information about probands' health problems and disease risk. Incorporating FH data is a cost-effective way to improve statistical evidence in genetic studies, and it also helps overcome limitations in study designs with insufficient cases or missing genotype information for association analysis. We proposed the family history aggregation unit-based test (FHAT) and the optimal FHAT (FHAT-O) to exploit available FH for rare variant association analysis. We also extended the liability threshold model of case-control status and FH (LT-FH) method to aggregation unit-based tests and compared it with FHAT and FHAT-O. The computational efficiency and flexibility of FHAT and FHAT-O were demonstrated through both simulations and applications. We showed that FHAT, FHAT-O, and LT-FH offer reasonable control of the type I error unless the case/control ratio is unbalanced, in which case they show less inflation than conventional methods that exclude FH. We also demonstrated that FHAT and FHAT-O are more powerful than LT-FH and conventional methods in many scenarios. By applying FHAT and FHAT-O to the analysis of all-cause dementia and hypertension using exome sequencing data from the UK Biobank, we showed that our methods can improve the significance of associations for known regions. Furthermore, in the exome-wide analysis we replicated previously reported associations with all-cause dementia and hypertension and detected novel regions.