The Digestive Disease Data Science Commons: Host Datasets

CSIBD investigators take a comprehensive approach to investigating the basis of human disease, going beyond massively parallel sequencing to dissect cellular circuitry and make functional genotype-phenotype connections. Studies are geared toward identifying the role of disease-linked functional variants in order to uncover therapeutic targets.

Raw data links will take you to public-access (e.g., GEO, SRA) and protected (e.g., DUOS, dbGaP) repositories. Please note that DUOS links on this page will direct you to the DUOS landing page. Once you sign in to DUOS, you should arrive at the data library page, where you can use the search bar to look up the study you are interested in (e.g., search ‘000146’). Like dbGaP, DUOS hosts protected clinical data, and additional permissions are required for access, as explained in their step-by-step tutorial.
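For readers who want to sort the accession identifiers listed for the studies on this page by repository, the following is a minimal sketch, not an official CSIBD tool. It assumes that the `GSE` prefix denotes GEO series and the `SCP` prefix denotes Single Cell Portal studies, as the listings on this page indicate; treating `SDY` as an ImmPort study accession is an assumption. The function names `group_accessions` and `geo_url` are hypothetical, and protected DUOS or dbGaP identifiers are not handled.

```python
import re

# Prefix-to-repository mapping. GEO and the Single Cell Portal are named on
# this page; mapping SDY to ImmPort is an assumption made for illustration.
REPOSITORY_BY_PREFIX = {
    "GSE": "GEO",
    "SCP": "Single Cell Portal",
    "SDY": "ImmPort (assumed)",
}

# Matches identifiers such as GSE134809, SCP259, or SDY1765.
ACCESSION_PATTERN = re.compile(r"\b(GSE|SCP|SDY)\d+\b")


def group_accessions(text):
    """Group accession IDs found in free text by repository.

    IDs keep their order of first appearance and duplicates are dropped.
    """
    grouped = {}
    for match in ACCESSION_PATTERN.finditer(text):
        repository = REPOSITORY_BY_PREFIX[match.group(1)]
        accession = match.group(0)
        bucket = grouped.setdefault(repository, [])
        if accession not in bucket:
            bucket.append(accession)
    return grouped


def geo_url(accession):
    """Build the GEO accession viewer URL for a GSE identifier."""
    if not re.fullmatch(r"GSE\d+", accession):
        raise ValueError(f"not a GEO series accession: {accession!r}")
    return f"https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc={accession}"


if __name__ == "__main__":
    listing = "scRNA-seq data: GSE134809, SCP259, SCP1884, and SDY1765."
    for repository, accessions in group_accessions(listing).items():
        print(repository, accessions)
    print(geo_url("GSE134809"))
```

Grouping by prefix keeps the sketch independent of any repository API; it only organizes identifiers so that each can be looked up in the appropriate portal.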

Featured Studies

  • Bidirectional CRISPR screens decode a GLIS3-dependent fibrotic cell circuit

    Pokatayev V, Jaiswal A, Shih AR, Segerstolpe Å, Li B, Creasey EA, Zhao Y, Lin C, Murphy S, Chou CH, Graham DB, Xavier RJ.

    Fibrosis, or pathological tissue scarring, occurs across many inflammatory diseases and can progress to organ dysfunction or failure. Although fibrosis contributes to ~50% of disease-related deaths, effective anti-fibrotic therapies remain limited. In this study, Pokatayev et al. integrate single-cell and spatial transcriptomics with bidirectional CRISPR screens to identify GLIS3 as a previously unrecognized regulator of an immune–stromal cell circuit that drives intestinal fibrosis during chronic inflammation. Their findings establish GLIS3 as a novel player in fibrotic pathology and suggest that targeting this cellular pathway could offer new therapeutic strategies to prevent or attenuate fibrosis in inflammatory bowel disease and other chronic inflammatory conditions.

    scRNA-seq data: GSE134809, SCP259, SCP1884, and SDY1765. scRNA-seq data of stimulated fibroblasts: GSE250516. scRNA-seq data for PDGFRA+ fibroblasts from mouse large intestine: GSE288481. Bulk RNA-seq data: GSE250515. ChIP–seq data: GSE250514. CRISPR screen data: Supplementary Data 2. Publicly available RNA-seq data (PROTECT): GSE109142. Xenium-based spatial transcriptomics profiling: SCP2927 (human intestinal tissue) and SCP3384 (mouse intestinal tissue). Raw H&E staining images after spatial profiling: Zenodo.

    Code: no new software or code was generated in this study.

  • Regional encoding of enteric nervous system responses to microbiota and type 2 inflammation

    Tan P, Jaiswal A, Murphy SP, Brown EM, Wheeler H, Su CW, Finan EP, Jasso GJ, Shi HN, Graham DB, Delorey TM, Deguine J, Xavier RJ.

    The intestine harbors a dedicated population of neurons, the enteric nervous system (ENS), that regulates intestinal physiology by integrating environmental and mechanical cues. To learn more about how microbial signals and inflammation affect this system, a team led by CSIBD investigators Ramnik Xavier, Jacques Deguine and Daniel Graham profiled more than 7,600 single enteric neurons across intestinal regions from mice with defined microbiomes, parasitic infections, or food allergy. They found shifts in transcriptional states and neuronal subsets associated with specific perturbations. Using in vivo AAV-Perturb-seq, the team identified Edf1 and Mitf as regulators of inhibitory motor neuron programs and total gastrointestinal transit time.

    These studies provide a detailed transcriptomic road map of the ENS during distinct environmental challenges and reveal a set of coordinated responses to perturbations. Furthermore, they directly link ENS cell states to changes in intestinal physiology and provide a blueprint for future functional studies of the ENS in health and disease.

    Raw sequencing files are accessible on NCBI’s GEO database under the following accession numbers: plate-based enteric neuron single-cell atlas (GSE302208), droplet-based AAV-Perturb-seq data (GSE302051), and droplet-based intestine single-cell atlas (GSE301963). Processed sequencing data are available through the Broad’s Single Cell Portal as SCP2971. Images analyzed and quantified as part of this study are available through Zenodo.

    Code

Understanding disease at the single cell level

CSIBD integrative studies