Week 4 - Homework
- Import the `read-counts.csv` file.
Quick reminder: this data file contains gene expression values for samples from four groups. Sample names are prefixed with “WT”, “SET1”, “SET1.RRP6” or “RRP6”, and each group has 10 samples.
- Calculate the average gene expression per gene across the 10 samples in the “WT” group.
- Now repeat the previous step to calculate the average expression for the remaining three groups: “SET1”, “SET1.RRP6” and “RRP6”.
- Store the four vectors of average values in a list named `avg_list`, using the group names as the names of the list.
- Display the first 5 average values for the “SET1.RRP6” group.
- Transform the list obtained in question 3 into a data frame using `as.data.frame()`. Show the first rows of your data frame.
Tip
A data frame can be considered as a list of equal-length vectors.
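The steps up to this point can be sketched on a small toy table. This is only an illustration under stated assumptions, not the official correction: the object names `counts`, `sample_group` and `avg_df`, the toy gene names, and the column naming scheme (`WT.1`, `SET1.RRP6.2`, ...) are all hypothetical, so adapt them to the real `read-counts.csv`.

```r
# Toy stand-in for read-counts.csv: 4 genes, 2 samples per group
# (the real file has 10 samples per group).
# With the real file you might start from something like
#   counts <- read.csv("read-counts.csv", row.names = 1)
# (assumption: the first column holds the gene identifiers).
set.seed(1)
groups <- c("WT", "SET1", "SET1.RRP6", "RRP6")
sample_names <- paste(rep(groups, each = 2), 1:2, sep = ".")  # assumed naming scheme
counts <- as.data.frame(
  matrix(rpois(4 * 8, lambda = 100), nrow = 4,
         dimnames = list(paste0("gene", 1:4), sample_names))
)

# Recover the group of each column by stripping the trailing ".<number>".
# Comparing whole group names avoids "SET1" or "RRP6" also matching "SET1.RRP6".
sample_group <- sub("\\.[0-9]+$", "", colnames(counts))

# One vector of per-gene averages per group, stored in a named list.
avg_list <- lapply(groups, function(g) {
  rowMeans(counts[, sample_group == g, drop = FALSE])
})
names(avg_list) <- groups

# First values for one group (the toy table only has 4 genes).
head(avg_list[["SET1.RRP6"]], 5)

# A list of equal-length vectors converts directly to a data frame.
avg_df <- as.data.frame(avg_list)
head(avg_df)
```

Selecting columns by exact group name rather than by a loose pattern matters here because two of the group names are contained in a third one.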
- Which genes have an average greater than 10000 in the WT samples, and which in the SET1 samples? Check whether the two groups have genes in common, using an operator you have learned or the `intersect()` function.
- Check whether the average expression of the “RRP6” group is normally distributed (`?shapiro.test()`) at a significance level of 5%. What is the p-value of the normality test? If the values are normally distributed, draw a histogram (`?hist()`) of them directly. Otherwise, draw a histogram of the log-transformed values.
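As a rough illustration of the last two questions, here is a minimal sketch on made-up values. The vectors `wt_avg`, `set1_avg` and `x` and the gene names are hypothetical and only stand in for the averages computed from the real data.

```r
# Hypothetical named vectors of group averages (gene names are made up).
wt_avg   <- c(geneA = 15000, geneB = 800,   geneC = 12000, geneD = 30)
set1_avg <- c(geneA = 9000,  geneB = 11000, geneC = 20000, geneD = 45)

# Genes above the threshold in each group.
wt_high   <- names(wt_avg)[wt_avg > 10000]
set1_high <- names(set1_avg)[set1_avg > 10000]

# Two equivalent ways to find the genes in common.
common_in  <- wt_high[wt_high %in% set1_high]  # with the %in% operator
common_set <- intersect(wt_high, set1_high)    # with intersect()

# Normality check on right-skewed toy values (log-normal by construction).
set.seed(42)
x <- rlnorm(200, meanlog = 5, sdlog = 1)
res <- shapiro.test(x)
res$p.value

# Plot the raw values if normality is not rejected at 5%,
# otherwise plot the log-transformed values.
if (res$p.value >= 0.05) {
  hist(x)
} else {
  hist(log(x))
}
```

Note that `shapiro.test()` only accepts between 3 and 5000 values, so a vector with more genes than that would need to be subsampled or tested another way.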
The homework correction is available here: link