


DATA SCIENCE WITH PANCREATIC CANCER @ THE BROAD INSTITUTE

January 2020 - Present

  



Pancreatic cancer is a rare but deadly form of cancer, largely because it is difficult to detect and diagnose. Given that it's especially influenced by the nervous system, our hypothesis was that mutations in genes from different neurotransmitter families could signal a precursor to pancreatic cancer.

Using the cBioPortal for Cancer Genomics, I obtained RNA expression data for neurotransmitter receptors from 8 neurotransmitter families. We began by creating a few heatmaps to visualize the expression data.

These heatmaps are made using seaborn and matplotlib, two useful data visualization libraries in Python.
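As a rough illustration of how such a heatmap can be drawn, here is a minimal sketch. The gene names, sample names, and values are random placeholders, not the cBioPortal data, and the log2(x + 1) transform and color map are assumptions chosen for readability rather than the project's actual settings.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Placeholder expression matrix: rows are genes, columns are samples.
# These names and values are made up for illustration only.
rng = np.random.default_rng(0)
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
samples = [f"sample_{i}" for i in range(1, 7)]
expression = pd.DataFrame(
    rng.gamma(2.0, 50.0, size=(len(genes), len(samples))),
    index=genes,
    columns=samples,
)

# log2(x + 1) compresses the wide range typical of expression values
# so that a few highly expressed genes do not dominate the color scale.
log_expression = np.log2(expression + 1)

fig, ax = plt.subplots(figsize=(8, 4))
sns.heatmap(
    log_expression,
    cmap="viridis",
    ax=ax,
    cbar_kws={"label": "log2(expression + 1)"},
)
ax.set_xlabel("Sample")
ax.set_ylabel("Receptor gene")
fig.tight_layout()
```

Calling `plt.show()` or `fig.savefig("heatmap.png")` afterwards displays or saves the figure.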








It was necessary to scale the expression data to TPM values. TPM stands for “transcripts per million”: each gene's read count is divided by the gene's length, and the results are then rescaled so that the values in every sample sum to one million. I’ve provided a code sample on converting the expression values to TPM.
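A minimal sketch of the standard TPM calculation is shown here. It assumes the input is a table of raw read counts (genes as rows, samples as columns) and that gene lengths are given in base pairs. The function name `counts_to_tpm` and the toy numbers are hypothetical and are not taken from the project code.

```python
import pandas as pd


def counts_to_tpm(counts: pd.DataFrame, gene_lengths_bp: pd.Series) -> pd.DataFrame:
    """Convert raw read counts (genes x samples) to TPM."""
    # Align the lengths to the rows of the count table and convert to kilobases.
    lengths_kb = gene_lengths_bp.reindex(counts.index) / 1_000
    # Reads per kilobase: removes the bias toward longer genes.
    rpk = counts.div(lengths_kb, axis=0)
    # Per-sample scaling factor: total RPK divided by one million.
    per_million = rpk.sum(axis=0) / 1_000_000
    # After this division every sample (column) sums to one million.
    return rpk.div(per_million, axis=1)


# Toy example with made-up counts and lengths.
counts = pd.DataFrame(
    {"sample_1": [100, 200, 300], "sample_2": [50, 50, 400]},
    index=["GENE_A", "GENE_B", "GENE_C"],
)
lengths = pd.Series({"GENE_A": 1000, "GENE_B": 2000, "GENE_C": 3000})

tpm = counts_to_tpm(counts, lengths)
```

Because every column sums to one million, TPM values can be compared across samples more directly than raw counts.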

           

Now the expression heatmaps can be adjusted for TPM as well:



Following these initial heatmaps, we can look at the differences between the tumor cohort and the normal cohort. 
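One simple way to summarize such a difference, sketched here under assumptions, is a per-gene log2 fold change between the mean expression of the two cohorts. This is an illustrative choice and not necessarily the comparison used in the project. The function name `log2_fold_change`, the pseudocount of 1, and the toy values are all hypothetical.

```python
import numpy as np
import pandas as pd


def log2_fold_change(
    tumor: pd.DataFrame, normal: pd.DataFrame, pseudocount: float = 1.0
) -> pd.Series:
    """Per-gene log2 ratio of mean tumor expression to mean normal expression."""
    tumor_mean = tumor.mean(axis=1)
    normal_mean = normal.mean(axis=1)
    # The pseudocount avoids division by zero for genes with no expression.
    return np.log2((tumor_mean + pseudocount) / (normal_mean + pseudocount))


# Toy cohorts with made-up values (genes as rows, samples as columns).
gene_index = ["GENE_A", "GENE_B"]
tumor = pd.DataFrame({"t1": [7.0, 1.0], "t2": [7.0, 1.0]}, index=gene_index)
normal = pd.DataFrame({"n1": [1.0, 3.0], "n2": [1.0, 3.0]}, index=gene_index)

lfc = log2_fold_change(tumor, normal)
# Positive values mean higher expression in the tumor cohort,
# negative values mean higher expression in the normal cohort.
```

The resulting series can be passed to a seaborn heatmap or bar plot to show which receptor genes differ most between cohorts.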


This project is still ongoing. For more visualizations and reference code, please visit the GitHub project. :)