ISMMS

Knowledge Management Center for Illuminating the Druggable Genome

Avi Ma’ayan, PhD
Principal Investigator
Professor, Department of Pharmacological Sciences
Director, Mount Sinai Center for Bioinformatics
Icahn School of Medicine at Mount Sinai

Overview

To better understand the function of the understudied protein targets that are the focus of the implementation phase of the Illuminating the Druggable Genome (IDG) project, we impute knowledge using machine learning strategies. To enable this imputation, we organize data from many omics- and literature-based resources into attribute tables in which genes are the rows and their attributes are the columns. Examples of such attribute tables include gene or protein expression in cancer cell lines (CCLE) or human tissues (GTEx), changes in expression in response to drug perturbations or single-gene knockdowns (LINCS), regulation by transcription factors based on ChIP-seq data (ENCODE), and phenotypes observed in mice when single genes are knocked out (KOMP). In total, we process and abstract data from over 100 resources. We then predict target function, target association with pathways, small molecules and drugs that modulate the activity and expression of the target, and target relevance to human disease. Overall, the KMC-ISMMS develops a resource intended to accelerate target and drug discovery.
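As a rough illustration of the attribute-table idea described above, the following Python sketch builds a tiny gene-by-attribute matrix and scores an unlabeled gene by the similarity-weighted labels of the other genes. It is a minimal sketch only: the gene names, attribute names, values, and the cosine-similarity scoring rule are all hypothetical choices made for this example and are not taken from the KMC-ISMMS pipeline, whose actual machine learning methods are not specified here.

```python
import numpy as np

# Hypothetical toy attribute table: rows are genes, columns are attributes.
# Names and values are invented for illustration only.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
attributes = ["tissue_liver", "tissue_brain", "tf_target_X", "ko_phenotype_Y"]
table = np.array([
    [1, 0, 1, 1],  # GENE_A
    [1, 0, 1, 0],  # GENE_B
    [0, 1, 0, 1],  # GENE_C
    [1, 0, 1, 1],  # GENE_D (treated as the understudied gene)
], dtype=float)

# Known membership in a hypothetical functional class; NaN marks "unknown".
labels = np.array([1.0, 1.0, 0.0, np.nan])


def cosine_similarity(matrix):
    """Return the gene-by-gene cosine similarity of the rows of `matrix`."""
    norms = np.linalg.norm(matrix, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid division by zero for empty rows
    unit = matrix / norms
    return unit @ unit.T


def impute_scores(matrix, labels):
    """Fill unknown labels with a similarity-weighted average of known ones."""
    sim = cosine_similarity(matrix)
    np.fill_diagonal(sim, 0.0)  # a gene should not vote for itself
    known = ~np.isnan(labels)
    scores = labels.copy()
    for i in np.where(~known)[0]:
        weights = sim[i, known]
        total = weights.sum()
        scores[i] = (weights @ labels[known]) / total if total > 0 else 0.0
    return scores


scores = impute_scores(table, labels)
for gene, score in zip(genes, scores):
    print(f"{gene}: {score:.3f}")
```

In this toy setup, GENE_D shares most of its attributes with the two labeled genes, so it receives a high score for the hypothetical class. A real system would combine many attribute tables and would likely use a trained classifier rather than this simple weighting.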

NIH grant number: U24CA224260-01

ARCHS4 user interface

Diverse datasets from different resources are organized into attribute tables, to which machine learning strategies are applied to impute knowledge about the gene function of the understudied IDG targets.

Screenshot from the ARCHS4 user interface: To develop the ARCHS4 resource, all available FASTQ files from RNA-seq experiments were retrieved from the Gene Expression Omnibus (GEO) and aligned using a cloud-based infrastructure. In total, 137,792 samples are accessible through ARCHS4: 72,363 mouse samples and 65,429 human samples. Through efficient use of cloud resources and dockerized deployment of the sequencing pipeline, the alignment cost per sample is reduced to less than one cent. The ARCHS4 web interface supports exploration of the processed data through querying tools, interactive visualization, and gene landing pages. For each gene, including all the IDG targets of interest, the landing page provides average expression across cell lines and tissues, top co-expressed genes, and predicted biological functions and protein-protein interactions based on prior knowledge combined with co-expression data.
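To make the notion of "top co-expressed genes" concrete, the sketch below ranks genes by the Pearson correlation of their expression profiles with a query gene. It is an illustrative sketch under stated assumptions: the gene names and expression values are invented, and Pearson correlation is used here as one common co-expression measure, not as a description of how ARCHS4 computes its results.

```python
import numpy as np

# Hypothetical expression matrix: rows are genes, columns are samples.
gene_names = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
expression = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],   # GENE_A
    [2.1, 3.9, 6.2, 8.1, 9.8],   # GENE_B, rises with GENE_A
    [5.0, 4.0, 3.0, 2.0, 1.0],   # GENE_C, falls as GENE_A rises
    [3.0, 1.0, 4.0, 1.0, 5.0],   # GENE_D, weakly related to GENE_A
])


def top_coexpressed(expression, gene_names, query, k=2):
    """Return the k genes most positively correlated with `query`."""
    corr = np.corrcoef(expression)  # rows are treated as variables
    q = gene_names.index(query)
    order = np.argsort(-corr[q])    # indices sorted by descending correlation
    ranked = [(gene_names[j], float(corr[q, j])) for j in order if j != q]
    return ranked[:k]


top = top_coexpressed(expression, gene_names, "GENE_A")
for name, r in top:
    print(f"{name}: r = {r:.3f}")
```

With these invented values, GENE_B ranks first because its profile tracks GENE_A almost linearly, while the anti-correlated GENE_C falls outside the top two.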

Web

Ma’ayan Laboratory: https://labs.icahn.mssm.edu/maayanlab/
Mount Sinai Center for Bioinformatics: https://icahn.mssm.edu/research/bioinformatics
Harmonizome: https://maayanlab.cloud/Harmonizome/
Geneshot: https://amp.pharm.mssm.edu/geneshot/
CREEDS: https://amp.pharm.mssm.edu/CREEDS/
Enrichr: https://maayanlab.cloud/Enrichr/
Clustergrammer: https://maayanlab.cloud/clustergrammer/
ARCHS4: https://amp.pharm.mssm.edu/archs4

Page reviewed March 11, 2024

CONTACT: idg.rdoc@gmail.com