Week 5 - Homework

Import Data

We need two data sets for this exercise: the DE analysis results (toy_DEanalysis.csv) and the read counts (read-counts.csv).

Reminder: the DE results were obtained by comparing SET1 samples to WT samples, using the data in read-counts.csv.

  1. Import the DE analysis results (toy_DEanalysis.csv) and name the object de_res.

(You can use either the base R function read.csv() or the function read_csv() from the {readr} package.)

  2. Import the read-counts.csv file and name it counts.

You can either use the first column (Feature) as the row names of your data frame or keep the first column as is. The choice only changes how you filter the data later.
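
To illustrate how the two import options change later filtering, here is a minimal sketch using base R. It writes a small made-up file in place of read-counts.csv, so the gene names, sample column names, and values are all assumptions, not the real data.

```r
# Hypothetical stand-in for read-counts.csv (made-up genes, samples, values).
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Feature,WT_1,WT_2,SET1_1,SET1_2",
  "geneA,10,12,30,28",
  "geneB,5,7,6,4"
), tmp)

# Option 1: keep Feature as an ordinary column.
counts_col <- read.csv(tmp)

# Option 2: use the first column (Feature) as row names.
counts_rn <- read.csv(tmp, row.names = 1)

# Selecting one gene differs between the two layouts.
counts_col[counts_col$Feature == "geneA", ]  # filter on the column
counts_rn["geneA", ]                         # index by row name
```

With row names, every remaining column is numeric, which can make it easier to extract one gene's counts as a plain vector later.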

Find Genes of Interest

  3. Find the genes that satisfy both of the following conditions:
  • log2 fold change < -1 or > 1 (i.e., the absolute log2 fold change is greater than 1)
  • adjusted p-value < 0.05

Store the results in a variable target_genes.
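
The two conditions can be combined in a single logical filter. The sketch below uses a toy data frame in place of de_res; the column names log2FoldChange and padj are assumptions, so check colnames(de_res) on the real file and adapt them.

```r
# Toy stand-in for de_res; column names and values are assumptions.
de_res <- data.frame(
  Feature        = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(2.3, -0.4, -1.8, 1.5),
  padj           = c(0.001, 0.02, 0.03, 0.20)
)

# Both conditions must hold: |log2 fold change| > 1 AND adjusted p-value < 0.05.
keep <- abs(de_res$log2FoldChange) > 1 & de_res$padj < 0.05

# which() drops rows where the condition is NA (e.g. a missing adjusted p-value).
target_genes <- de_res[which(keep), ]
target_genes
```

In this toy example geneB fails the fold-change condition and geneD fails the p-value condition, so only geneA and geneC remain.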

Draw Boxplots

  4. Load the {ggplot2} package.

  5. Draw a boxplot for the first gene in target_genes to compare its expression level between SET1 and WT samples.

Hint: you need to extract the expression data for the gene from counts and build a data frame for the boxplot.
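
One possible way to build that data frame is sketched below. It assumes counts has genes as row names and sample columns named like WT_1 or SET1_1, so the group can be derived from the column name; both the naming scheme and the values are hypothetical.

```r
library(ggplot2)

# Toy stand-in for counts (genes as row names; made-up sample names and values).
counts <- data.frame(
  WT_1 = c(10, 5), WT_2 = c(12, 7), WT_3 = c(11, 6),
  SET1_1 = c(30, 6), SET1_2 = c(28, 4), SET1_3 = c(33, 5),
  row.names = c("geneA", "geneB")
)
gene <- "geneA"  # in the homework, this would be the first gene of target_genes

# One row per sample: the sample name, its group, and the gene's count.
plot_df <- data.frame(
  sample     = colnames(counts),
  group      = factor(sub("_.*$", "", colnames(counts)), levels = c("WT", "SET1")),
  expression = unlist(counts[gene, ], use.names = FALSE)
)

p <- ggplot(plot_df, aes(x = group, y = expression)) +
  geom_boxplot() +
  labs(title = gene, x = "Group", y = "Read count")
p
```

Setting the factor levels explicitly puts WT first on the x-axis; without it, ggplot2 orders the groups alphabetically.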

  6. Refine the boxplot from question 5 to include the following customizations:
  • A subtitle showing the gene’s log2 fold change.
  • Fill the boxplot with different colors for the “WT” and “SET1” groups.
  • Apply the theme_minimal() theme.
  • Hide the legend.
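
The four customizations above could be layered as in the sketch below. The data frame, gene name, and log2 fold change value are made up for illustration; in the homework they would come from counts and target_genes.

```r
library(ggplot2)

# Toy plotting data (hypothetical values).
plot_df <- data.frame(
  group      = factor(rep(c("WT", "SET1"), each = 3), levels = c("WT", "SET1")),
  expression = c(10, 12, 11, 30, 28, 33)
)
gene <- "geneA"  # hypothetical gene name
lfc  <- 1.46     # hypothetical log2 fold change for this gene

p <- ggplot(plot_df, aes(x = group, y = expression, fill = group)) +  # fill by group
  geom_boxplot() +
  labs(
    title    = gene,
    subtitle = paste("log2 fold change:", round(lfc, 2)),
    x = "Group", y = "Read count"
  ) +
  theme_minimal() +
  theme(legend.position = "none")  # hide the legend
p
```

The call to theme(legend.position = "none") comes after theme_minimal() because a complete theme replaces any theme settings added before it.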

The homework correction is available here: link