Exercises for the course RNA-seq data analysis with Chipster
Keywords
FASTQ, QC, Pre-processing, Alignment, BAM, Expression-estimation, Feature-summarisation, Differential-expression, Statistical-model, Exploratory-analysis
Authors
- Eija Korpelainen @eija, ekorpelainen@gmail.com
Type
- Practical
Description
This practical covers the whole RNA-seq data analysis pipeline, from quality control of raw reads to differential expression analysis, using the free Chipster software. Material updated in Dec 2015.
Aims
- Performing RNA-seq analysis
- Recognizing and troubleshooting issues with the data
Prerequisites
- As the user-friendly Chipster software is used in the exercises, no command line or R experience is required.
Target audience
- The course is suitable for any researcher interested in learning RNA-seq data analysis.
Learning objectives
- Applying FastQC quality control software and interpreting the output
- Performing preprocessing with Trimmomatic software
- Producing alignment with TopHat2
- Interpreting the aligner output
- Visualising alignments with the Chipster genome browser
- Applying RSeQC software for alignment-level QC and interpreting the output
- Producing a table of read counts with HTSeq software
- Identifying confounding effects with PCA and MDS plots and taking necessary action
- Performing DE analysis with edgeR and DESeq2 and interpreting the output
- Understanding and performing multifactor analysis
- Comparing gene lists with a Venn diagram
- Producing plots with DESeq2: normalized counts for a gene, dispersion plot, MA plot, p-value distribution
- Operating Chipster software
Materials
- Practicals on RNA-seq data analysis
Data
The datasets for the exercises are available on the Chipster server as example sessions. Two datasets are used:
* Raw reads from human hESC1 and GM12878 cells produced by the ENCODE project.
* Table of per-gene read counts from an experiment by Brooks et al., which studied the effect of RNAi knockdown of the splicing factor Pasilla in Drosophila melanogaster. The read counts were obtained from the pasilla Bioconductor package.
Timing
The lecture and practicals can be completed in one day.
Content stability
The content is updated approximately every 3 months.
Technical requirements
- Chipster software v3.6.3 or later
Literature references
- Suitable reading includes the book RNA-seq Data Analysis: A Practical Approach