ADSP and Affiliates Whole Genome Sequencing Report

Introduction

Studies conducted primarily in non-Hispanic White populations have shown that genetic variants that are observed infrequently in populations are important to the development of Alzheimer’s disease (AD). Research has also shown that genetic variation that increases the risk of, or protects against, development of AD can be shared across ancestral backgrounds but may also differ among them. Therefore, it is important to study large numbers of individuals from different ancestral backgrounds in order to fully understand the genetic underpinnings of AD, and to ensure that any prevention or treatment strategies based on genetics work for everyone.

To increase researchers’ ability to find variants important for AD across and within different populations, the ADSP is performing whole genome sequencing (WGS) on large numbers of participants across the four major ancestral populations of the United States. Foreign studies that include ancestral populations of the United States are also included in the ADSP in order to capture the most genetic variation possible and to allow important ancestral, social, cultural, and environmental questions related to the development of AD to be investigated. Here, ancestry/ethnicity population categories (Asian, Black/African American, Hispanic/Latino, and Non-Hispanic White) are based on self-reported or ascribed race or ethnicity as defined by the Office of Management and Budget (OMB) standards (https://orwh.od.nih.gov/toolkit/other-relevant-federal-policies/OMB-standards) and only apply to populations within the United States. Importantly, the genetics field is currently assessing the appropriate use of terminology for population descriptors and genetic variation (see, for example, Byeon et al., AJHG, 2021 and Kamariza et al., Nat Genet, 2021). To this end, while we follow OMB standards for our population categories, we substitute “ancestry” for “race” in our description of population categories. Generally, “race” and “ethnicity” refer to social or cultural categories and have no biological meaning, whereas “ancestry” refers to a person’s biological ancestors from whom their DNA was inherited. Ancestry can thus also refer to where a majority of a person’s ancestors originated (e.g., Africa, Asia, Europe) and is often described as “continental ancestry”. Any reference to ancestry is based on the genetically determined ancestry of a population and is designated separately from ethnicity by study investigators. These designations often follow historically defined continental population definitions.
ADSP datasets from foreign countries are included in the ancestry/ethnicity categories that most closely align with their genetically determined ancestry.

To reach the study sample size necessary to detect associations with genetic variants that are not frequently seen in a population, ~18,500 cases and ~18,500 controls per ancestry/ethnic population are being included and sequenced as part of the ADSP Follow-Up Study (FUS). The following tables document the progress being made toward this recruitment goal.
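As a rough illustration of what this goal implies overall, the per-population targets can be multiplied out across the four population categories. This is only a sketch: the figures are the approximate targets quoted above, and the function names are hypothetical helpers introduced for this example, not part of any ADSP tooling.

```python
# Approximate per-population targets quoted in the text ("~18,500").
CASES_PER_POPULATION = 18_500
CONTROLS_PER_POPULATION = 18_500

# The four population categories named in this report.
POPULATIONS = [
    "Asian",
    "Black/African American",
    "Hispanic/Latino",
    "Non-Hispanic White",
]


def recruitment_target(n_populations: int = len(POPULATIONS)) -> int:
    """Total genomes implied by the per-population case and control targets."""
    return n_populations * (CASES_PER_POPULATION + CONTROLS_PER_POPULATION)


def remaining(sequenced: int, target: int) -> int:
    """Genomes still needed to reach a target (never negative)."""
    return max(target - sequenced, 0)


print(recruitment_target())  # 148000
```

Under these assumptions the combined goal is about 148,000 genomes, or 37,000 per population category.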

Byeon YJJ, Islamaj R, Yeganova L, Wilbur WJ, Lu Z, Brody LC, Bonham VL. Evolving use of ancestry, ethnicity, and race in genetics research – A survey spanning seven decades. AJHG. 108(12):2215-2223. 2021.

Kamariza M, Crawford L, Jones D, Finucane H. Misuse of the term ‘trans-ethnic’ in genomics research. Nat Genet. 53:1520-1521. 2021.

Sequencing Overview by Case/Control Status and Self-Reported Ancestry/Ethnicity*

Presented here are whole genome sequencing (WGS) totals by self-reported ancestry/ethnicity and case/control status for completed and proposed or planned projects in the Alzheimer’s Disease Sequencing Project (ADSP) Follow-Up Study (FUS). Note that self-reported or ascribed ancestry/ethnicity categories follow the OMB standards and only apply to US populations. ADSP datasets from foreign countries are included in the ancestry/ethnicity categories that most closely align with their genetically determined ancestry. The far-right columns of the table show the total numbers needed per ancestry/ethnicity to reach the 18,500 case and 18,500 control requirements. The first three sections represent sequencing that is funded by NIA and is either released (Release 3) or planned for release in 2022 (Release 4) and 2023 (Release 5) by the Genome Center for Alzheimer’s Disease (GCAD). These data, once released, are available for use by all qualified researchers with an approved usage plan. The row labeled “Datasets under review for funding” contains totals for datasets pending funding, and the row labeled “Additional datasets with cognitive data” includes additional datasets that have cognitive data and either may be available for WGS or already have WGS and may be incorporated into the ADSP pending agreements.

*Following OMB ancestral and ethnic category standards for US populations:
https://orwh.od.nih.gov/toolkit/other-relevant-federal-policies/OMB-standards

**Numbers are cumulative from previous sections

FUS Currently Funded and Sequenced/Pending Sequencing WGS Datasets

Self-identified: Black/African American*

Presented here are Black/African American datasets currently funded for WGS under the ADSP FUS (see PAR-17-214, PAR-18-890, and PAR-19-234). WGS is complete for all currently funded datasets in this ancestral/ethnic group. An additional dataset, the Ibadan study, is composed of individuals from Nigeria.

*Applies to US populations only

Datasets with WGS complete

**Mild Cognitive Impairment/ADRD/Unknown

Self-identified: Asian*

Presented here are Asian ancestry datasets currently funded for WGS. WGS is complete for LASI-DAD from India and GARD from Korea. Sequencing is underway for the KBASE study from Korea. Additional datasets recruited from Asian American and Canadian populations are also planned for inclusion in the ADSP FUS.

*Applies to US populations only

Datasets with WGS complete

Datasets with WGS pending

**Mild Cognitive Impairment/ADRD/Unknown

Self-identified: Non-Hispanic White*

Presented here are non-Hispanic White datasets currently funded for WGS under the ADSP FUS (see PAR-17-214, PAR-18-890, and PAR-19-234) and other partners with available WGS. WGS is complete for all currently funded datasets in this ancestral/ethnic group.

*Applies to US populations only

Datasets with WGS complete

**Mild Cognitive Impairment/ADRD/Unknown
***Founder/Familial

Self-identified: Hispanic/Latino and Amerindian*

Presented here are Hispanic/Latino datasets currently funded for WGS under the ADSP FUS (see PAR-17-214, PAR-18-890, and PAR and other partners with available WGS. WGS is complete for the datasets in the top table and is funded and planned for the two datasets in the bottom table. We also include studies with primarily Amerindian ancestry from countries such as Peru in this ancestral/ethnic category.

*Applies to US populations only

Datasets with WGS complete

Datasets with WGS pending

**Mild Cognitive Impairment/ADRD/Unknown

ADSP and Affiliates Currently Funded WGS by Ancestry/Ethnicity (through 2023)

This table presents the number of genomes the ADSP will have sequenced by 2023, counted as study participants across the four major ancestry groups. We will have over 55,000 genomes sequenced. However, nearly half of these (24,911) will be from non-Hispanic White populations. Future planned sequencing will continue to reduce this disparity.
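The "nearly half" figure can be checked with a short calculation. The sketch below assumes a total of exactly 55,000 genomes. Because the text says only "over 55,000", the result is an upper bound on the non-Hispanic White share rather than an exact value. The constant and function names are hypothetical.

```python
# Figures quoted in the text; the exact total is not given ("over 55,000"),
# so 55,000 is used as a lower bound on the total (an assumption).
TOTAL_GENOMES_LOWER_BOUND = 55_000
NON_HISPANIC_WHITE_GENOMES = 24_911


def share(part: int, whole: int) -> float:
    """Return part / whole as a fraction, rejecting a non-positive whole."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# A larger true total would make this share smaller, so this is an upper bound.
upper_bound = share(NON_HISPANIC_WHITE_GENOMES, TOTAL_GENOMES_LOWER_BOUND)
print(f"{upper_bound:.1%}")  # 45.3%
```

On these assumptions, non-Hispanic White participants account for at most about 45% of the genomes sequenced by 2023.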