A quick recap of the course: the Systems Biology course has three main modules, and I covered the first module, Statistics, in the previous blog post. Today I will tell you more about the second and third modules.
Module 2: Metabolic modelling
As the name suggests, this module is about modelling complex metabolic networks and other types of regulatory networks. In the course we focused on metabolic networks and touched on integrating various forms of omics data into a metabolic network to build different kinds of predictive models.
Studies of this kind are useful in drug development and in gene knockout studies. It is not experimentally feasible to test hundreds of drugs under multiple conditions because of time and cost constraints. Instead, computationally generated metabolic networks are used to predict, for example, the top 10 drugs, or genes to knock out, that are most likely to give the desired phenotype. These top candidates can then be tested experimentally, and the one with the best results can be selected.
We learned the theoretical basics of metabolic modelling and various methods used to improve predictions, such as Flux Balance Analysis (FBA), parsimonious FBA (pFBA) and Flux Variability Analysis (FVA) of genome-scale metabolic models. Later, we implemented these methods in the practical lab exercises, where we were given a metabolic model of a liver cancer cell line and predicted the top 10 candidates for gene knockout studies.
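To give a flavour of what FBA does, here is a minimal sketch on a made-up four-reaction network. This is not the liver cancer model or the code from the course: the reaction names, bounds and the helper functions `fba` and `knockout_scan` are all assumptions for illustration, and I use SciPy's `linprog` rather than a dedicated metabolic modelling package. FBA looks for fluxes v that keep every metabolite at steady state (S v = 0) within the flux bounds while maximising an objective flux, here a stand-in "biomass" reaction. A knockout is simulated by forcing one reaction's flux to zero and solving again.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network (not a real model). Columns are reactions,
# rows are the internal metabolites A and B.
REACTIONS = ["uptake_A", "A_to_B", "biomass", "secrete_A"]
S = np.array([
    [1.0, -1.0,  0.0, -1.0],  # A: made by uptake, used by A_to_B and secrete_A
    [0.0,  1.0, -1.0,  0.0],  # B: made by A_to_B, consumed by biomass
])
# Assumed flux bounds (lower, upper); uptake is capped at 10 units.
BOUNDS = [(0.0, 10.0), (0.0, 1000.0), (0.0, 1000.0), (0.0, 1000.0)]
OBJECTIVE = "biomass"


def fba(bounds=BOUNDS):
    """Maximise the objective flux subject to S v = 0 and the flux bounds."""
    c = np.zeros(len(REACTIONS))
    c[REACTIONS.index(OBJECTIVE)] = -1.0  # linprog minimises, so negate
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]), bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return -res.fun


def knockout_scan():
    """Knock out each non-objective reaction in turn and re-run FBA."""
    results = {}
    for i, name in enumerate(REACTIONS):
        if name == OBJECTIVE:
            continue
        ko_bounds = list(BOUNDS)
        ko_bounds[i] = (0.0, 0.0)  # a knocked-out reaction carries no flux
        results[name] = fba(ko_bounds)
    return results


if __name__ == "__main__":
    print("wild type objective:", fba())
    # Knockouts with the lowest remaining objective come first.
    for name, value in sorted(knockout_scan().items(), key=lambda kv: kv[1]):
        print(f"knockout {name}: objective = {value:.2f}")
```

In this toy case, knocking out `A_to_B` or `uptake_A` removes all biomass flux, while knocking out `secrete_A` changes nothing. Ranking knockouts by their effect on the objective is the same idea, at a very small scale, as picking top knockout candidates from a genome-scale model.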
Module 3: Models of Gene Regulation
The third module focused on the integration of various omics data sets. Before jumping into the integration itself, the theory classes covered why different types of omics data are needed and the caveats, biases and problems of each type. We then learned about various proteomics methods and the advantages and disadvantages of specific methods in relation to the sample size and the data quality expected for the study.
Finally, after understanding the basics, we dived into the lab exercises, where we were given transcriptomics and proteomics data from a time-series experiment comparing drug-treated and untreated samples. We had to integrate these two omics data sets using the PECA tool, answer questions about how gene expression changed relative to protein expression at various time points, and link the overall changes caused by the drugs to the general mechanistic pathways affected by the treatment.
While analysing these real data sets we ran into different problems and caveats for each data type. For example, proteomics data sets are generally sparse and tend to have a large number of missing values, so it is important to deal with these missing values before proceeding to the analysis; otherwise the data are of little use.
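As a rough illustration of what "dealing with missing values" can look like, here is a small sketch of one simple strategy: drop proteins that are missing in too many samples, then fill the remaining gaps with the lowest value observed for that protein. This is only an assumption about a reasonable approach, not the method used in the course. The function name `filter_and_impute`, the 50% threshold and the example intensities are all made up.

```python
import math


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def filter_and_impute(table, max_missing_fraction=0.5):
    """Drop sparse proteins, then fill remaining gaps with the row minimum.

    `table` maps a protein ID to its intensities, one value per sample.
    """
    cleaned = {}
    for protein, values in table.items():
        observed = [v for v in values if not is_missing(v)]
        missing_fraction = 1 - len(observed) / len(values)
        if not observed or missing_fraction > max_missing_fraction:
            continue  # too sparse to be informative, so drop the protein
        fill = min(observed)  # assumes values are missing because they are low
        cleaned[protein] = [fill if is_missing(v) else v for v in values]
    return cleaned


# Made-up log-intensities for three proteins across four samples.
example = {
    "P1": [20.1, None, 19.8, 20.5],
    "P2": [None, None, None, 18.0],
    "P3": [15.0, 15.2, 14.9, 15.1],
}

if __name__ == "__main__":
    for protein, values in filter_and_impute(example).items():
        print(protein, values)
```

Filling with the row minimum assumes that a value is missing because the protein was below the detection limit. That is not always true, so the right choice depends on the data set.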
Learning and exploring real datasets while tackling the problems faced by researchers made the course very exciting!