Inferring Gene Links with Unsupervised Methods

Collaboration with the Pasteur Institute on Gene Expression Analysis

Posted by Clement Wang on June 15, 2021

Project Overview

This one-week project was conducted in collaboration with the Pasteur Institute, focusing on inferring gene relationships from expression data using unsupervised learning methods in R.

Logo of the Pasteur Institute

Our team of six students worked with a dataset of gene expression measurements from 2,000 patients across 5 different stimuli. The objective was to identify meaningful gene interactions and relationships from this data.

Technical Approach

We imported and processed the data in RStudio, which presented its own set of challenges, as this was my first experience with R.

For the core analysis, we focused on two main unsupervised learning techniques:

  • Correlation analysis to identify linear relationships between gene expressions
  • Joint Graphical Lasso for sparse precision matrix estimation and network inference
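
To make the first technique concrete, here is a minimal sketch of correlation-based link inference. It is written in Python purely for illustration (the project itself used R) and is not the code from the project. The gene names, the toy values, and the 0.8 threshold are all made-up assumptions. The idea is to compute the Pearson correlation for every pair of genes and keep the pairs whose absolute correlation is high enough to count as a candidate link.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_edges(expression, threshold=0.8):
    """Return (gene_a, gene_b, r) for pairs with |r| >= threshold.

    `expression` maps a gene name to its expression values across samples.
    The threshold is an arbitrary illustrative choice, not a recommended value.
    """
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= threshold:
            edges.append((gene_a, gene_b, round(r, 3)))
    return edges


# Toy data (made up): three hypothetical genes measured in five samples.
toy_expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_B": [2.1, 3.9, 6.2, 8.1, 9.8],  # roughly twice GENE_A
    "GENE_C": [5.0, 1.0, 4.0, 2.0, 3.0],  # no clear linear trend with the others
}

edges = correlation_edges(toy_expression, threshold=0.8)
for gene_a, gene_b, r in edges:
    print(f"{gene_a} -- {gene_b}: r = {r}")
```

On this toy data only the GENE_A and GENE_B pair passes the threshold. A correlation network like this captures marginal linear association only: two genes can appear linked simply because both respond to a third. Sparse precision matrix methods such as the Joint Graphical Lasso target that limitation by estimating conditional dependencies instead.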