The Digestive Disease Data Science Commons: Host Datasets

CSIBD investigators focus on a comprehensive approach for investigating the basis of human disease, going beyond massively parallel sequencing to dissect cellular circuitry in order to make functional genotype-phenotype connections. Studies are geared toward identifying the role of disease-linked functional variants in order to uncover therapeutic targets.

Raw data links will take you to public-access (e.g. GEO, SRA) and protected (e.g. DUOS, dbGaP) repositories. Please note that DUOS links on this page will direct you to the DUOS landing page. Once you sign in to DUOS, you should arrive at the data library page, where you can use the search bar to look up the study you are interested in (e.g. search ‘000146’). Like dbGaP, DUOS hosts protected clinical data, and additional permissions are required for access, as explained in their step-by-step tutorial.

Understanding disease at the single cell level
Featured: Bringing diversity into genetics
  • Genetic architecture of the inflammatory bowel diseases across East Asian and European ancestries

    Liu Z, Liu R, Gao H, Jung S, Gao X, Sun R, Liu X, Kim Y, Lee HS, Kawai Y, Nagasaki M, Umeno J, Tokunaga K, Kinouchi Y, Masamune A, Shi W, Shen C, Guo Z, Yuan K; FinnGen; International Inflammatory Bowel Disease Genetics Consortium; Chinese Inflammatory Bowel Disease Genetics Consortium; Zhu S, Li D, Liu J, Ge T, Cho J, Daly MJ, McGovern DPB, Ye BD, Song K, Kakuta Y, Li M, Huang H.

    In a study led by PFS award recipient Dr. Hailiang Huang, CSIBD investigators and collaborators assembled the first sizable sample of East Asian ancestries for studying IBD genetics. Eighty-one novel IBD associations were identified, including many new coding variants enriched in East Asian ancestries. These new coding variants deepen the IBD allelic series to facilitate target modulation in drug discovery. The study also found that while IBD genetic effects are generally consistent across ancestries, the genetics underlying CD appear more ancestry-dependent than those underlying UC, driven by both allele frequency and genetic effect.

    Individual-level genotype data for EAS samples are available upon request: SHA1, Z.L. (zhanjuliu@tongji.edu.cn); KOR1, K.S. (kysong@amc.seoul.kr); JPN1, Y. Kakuta (ykakuta@med.tohoku.ac.jp); and ICH1, IIBDGC (ibdgc-dcc@mssm.edu). Please note that access to individual-level genotypes from samples recruited within mainland China is subject to the policies and approvals of the Human Genetic Resource Administration, Ministry of Science and Technology of the People’s Republic of China.

    Data - CaVEMaN and DAP-G GTEx v8 Fine-Mapping cis-eQTL Data; 1000 Genomes Project Phase 3; TOPMed reference panel R2; Human Genome Diversity Project; Simons Genome Diversity Project; Korean Personal Genome Diversity Project; NBDC human database (accession ID: JGAS000114).

    STRING functional protein association networks.

    NFE summary statistics; FIN summary statistics are from FinnGen R7; and more.

    PRS weights and genome-wide summary statistics for the meta-analyzed EAS samples and across all study samples (EAS and EUR) can be downloaded here.

CSIBD integrative studies