rkhetani/Intro-to-R-with-DGE

 
 

Introduction to R and differential gene expression (DGE) analysis

Description

This repository contains teaching materials for a 3-day, hands-on workshop on Introduction to R and differential gene expression (DGE) analysis. The workshop introduces participants to the basics of R and RStudio and their application to differential gene expression analysis on RNA-seq count data.

R is a simple programming environment that enables the effective handling of data, while providing excellent graphical support. RStudio is a tool that provides a user-friendly environment for working with R. Together, R and RStudio allow participants to wrangle data, plot, and use DESeq2 to obtain lists of differentially expressed genes from RNA-seq count data.

This workshop is intended to provide both basic R programming knowledge AND its application. Participants should be interested in:

  • using R to make their data analysis more efficient
  • visualizing data using R (ggplot2)
  • using R to perform statistical analysis on RNA-seq count data to obtain differentially expressed gene lists

Learning Objectives

  • R syntax: Understanding the different 'parts of speech' in R; introducing variables and functions, demonstrating how functions work, and modifying arguments for specific use cases.
  • Data structures in R: Getting a handle on the classes of data structures and the types of data used by R.
  • Data inspection and wrangling: Reading in data from files. Using indices and various functions to subset, merge, and create datasets.
  • Visualizing data: Visualizing data using plotting functions in base R as well as from external packages such as ggplot2.
  • Exporting data and graphics: Generating new data tables and plots for use outside of the R environment.
  • Differential expression analysis for RNA-seq data:
    • QC on count data
    • Using DESeq2 to obtain a list of significantly different genes
    • Visualizing expression patterns of differentially expressed genes
    • Performing functional analysis on gene lists with R-based tools
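
As a rough illustration of the first few objectives (syntax, data structures, subsetting, matching and exporting), here is a minimal base R sketch. All object names and values are hypothetical toy data invented for this example, not taken from the workshop lessons. Plotting with ggplot2 is left out so the sketch needs no external packages.

```r
# Hypothetical toy data: a named numeric vector of counts for four genes.
counts <- c(geneA = 120, geneB = 0, geneC = 35, geneD = 810)

# Functions and arguments: round() accepts an optional 'digits' argument.
mean_count <- round(mean(counts), digits = 1)

# Data structures: a factor stores categorical data with defined levels.
sampletype <- factor(c("control", "treated", "control", "treated"),
                     levels = c("control", "treated"))

# A data frame combines columns of different types.
meta <- data.frame(sample = c("s1", "s2", "s3", "s4"),
                   sampletype = sampletype,
                   stringsAsFactors = FALSE)

# Subsetting a vector with a logical index: keep genes with non-zero counts.
expressed <- counts[counts > 0]

# Subsetting a data frame by rows: keep only the treated samples.
treated <- meta[meta$sampletype == "treated", ]

# Matching and reordering: put the metadata rows in the same order as a
# (hypothetical) set of count-matrix column names.
count_cols <- c("s3", "s1", "s4", "s2")
idx <- match(count_cols, meta$sample)
meta_reordered <- meta[idx, ]

# Exporting data: write the reordered table to a temporary CSV file.
out_file <- file.path(tempdir(), "meta_reordered.csv")
write.csv(meta_reordered, file = out_file, row.names = FALSE)
```

Reordering metadata to match the columns of a count matrix is the kind of step that matters later, because DESeq2 expects sample information and count columns to be in the same order.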

These materials were developed for a trainer-led workshop, but are also amenable to self-guided learning.

Contents

Introduction to R

Lessons | Estimated duration
Introduction to R and RStudio | 40 min
Syntax and data structures | 80 min
Functions and arguments | 45 min
Data wrangling: subsetting vectors and factors | 65 min
Data wrangling: subsetting data frames, matrices and lists | 75 min
Matching and reordering | 90 min
Data visualization with ggplot2 | 60 min

Differential Gene Expression (DGE) using RNA-seq raw counts data

Lessons | Estimated duration
Setting up and DGE overview | 70 min
Introduction to count normalization | 60 min
QC using principal component analysis (PCA) and hierarchical clustering | 90 min
Getting started with DESeq2 | 70 min
Pairwise comparisons with DESeq2 | 45 min
Visualization of DGE analysis results | 45 min
Summary of DGE workflow | 15 min
Complex designs with DESeq2 (LRT) | 30 min
Functional Analysis | 85 min
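
For orientation, the overall shape of the DGE lessons listed above might look roughly like the sketch below. It assumes the DESeq2 package from Bioconductor is installed. It uses simulated counts from `makeExampleDESeqDataSet()` rather than the workshop's own dataset, so the `condition` factor with levels `A` and `B` comes from the simulation and is not part of the lesson material. The 0.05 threshold is likewise an assumption chosen for illustration.

```r
library(DESeq2)

# Simulated data (assumption): 1000 genes, 6 samples, two conditions A and B.
set.seed(1)
dds <- makeExampleDESeqDataSet(n = 1000, m = 6, betaSD = 1)

# QC: transform the counts, then inspect samples by PCA and
# hierarchical clustering.
rld <- rlog(dds, blind = TRUE)
plotPCA(rld, intgroup = "condition")
hc <- hclust(dist(t(assay(rld))))

# Fit the model (the simulated design is ~ condition) and extract a
# pairwise comparison of B versus A using Wald tests.
dds <- DESeq(dds)
res <- results(dds, contrast = c("condition", "B", "A"), alpha = 0.05)

# Keep genes whose adjusted p-value is below the chosen threshold.
res_df <- as.data.frame(res)
sig <- res_df[!is.na(res_df$padj) & res_df$padj < 0.05, ]

# Likelihood ratio test (LRT): compare the full design against a
# reduced, intercept-only model.
dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ 1)
res_lrt <- results(dds_lrt)
```

With real data, the `DESeqDataSet` would typically be built from a count matrix and a sample metadata table using `DESeqDataSetFromMatrix()` instead of being simulated.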

These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
