Functional analysis of DE genes

Almost all the tools below allow you to define a custom background. Use a custom background whenever you can.
For single-cell RNA-seq analysis, the background should be the list of genes present in the Seurat object. For more info on the background, see the Functional analysis course on the e-learning system.
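
To see why the background matters: an ORA test is typically a hypergeometric (one-sided Fisher) test, and the background size goes straight into it. The sketch below uses made-up counts, purely for illustration, and compares the same marker list against a whole-genome background and against a smaller background of genes actually present in the data. The function name ora_pvalue() is my own, not part of any package.

```r
# Over-representation p-value: probability of seeing at least `overlap`
# gene-set members among the markers if the markers were drawn at random
# from the background (upper tail of the hypergeometric distribution).
ora_pvalue <- function(overlap, n_markers, n_set, n_background) {
  phyper(overlap - 1,
         m = n_set,                  # gene-set members in the background
         n = n_background - n_set,   # background genes outside the gene set
         k = n_markers,              # number of marker genes
         lower.tail = FALSE)
}

# Illustrative, made-up counts: 40 markers, 10 of them in the gene set
p_genome <- ora_pvalue(10, 40, n_set = 200, n_background = 20000)
p_custom <- ora_pvalue(10, 40, n_set = 150, n_background = 5000)

signif(c(whole_genome = p_genome, custom_background = p_custom), 3)
```

With the whole genome as background the p-value is smaller, so the enrichment looks more significant than it is: genes that could never have been detected inflate the background.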

If you’re a VIB scientist who works on human, mouse or rat, check out Ingenuity Pathway Analysis. VIB offers this software for free to VIB scientists. You can find more info in the Functional analysis course on the e-learning system. Contact janick.mathys@vib.be if you want us to organize a training on this software.

Other useful websites for ORA and GSEA:

  • gProfiler: many organisms, very nice visuals, limited resources, custom background. A tutorial on how to use gProfiler for functional characterization of marker genes can be found in the Functional analysis course on the e-learning system.
  • DAVID: all organisms, boring tables, a lot of resources, custom background
  • EnrichR: limited set of organisms, tables and charts, a lot of resources, no custom background

Easy to use software to install on your computer for GSEA:

More information on these tools can be found on:

R/Bioconductor packages for ORA and GSEA:

  • goana() from the limma package
  • enrichGO() from the clusterProfiler package
  • enrichplot package for visualization of the results

Example code for goana()

Simple R script for goana analysis
List of marker genes for each cluster
List of all genes in Seurat object (background)
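
The script outlined above could look roughly like this. It is a minimal sketch on toy data: the markers table and the all_genes vector are stand-ins I made up, and in practice they would come from your own marker analysis and from the gene names in the Seurat object. goana() works on Entrez Gene IDs, so the sketch maps the symbols first with mapIds() from AnnotationDbi (attached together with org.Hs.eg.db). The helper to_entrez() is hypothetical.

```r
library(limma)          # goana(), topGO()
library(org.Hs.eg.db)   # human annotation; goana() also needs the GO.db package

# Toy stand-ins (assumptions): replace by your own marker table and by the
# gene names of your Seurat object, e.g. rownames(seurat_obj)
markers <- data.frame(
  cluster = c("0", "0", "0", "1", "1", "1"),
  gene    = c("CD3D", "CD3E", "IL7R", "CD79A", "CD79B", "MS4A1")
)
all_genes <- c(markers$gene, "LYZ", "CD14", "GNLY", "NKG7", "GAPDH", "ACTB")

# Map gene symbols to Entrez Gene IDs, dropping symbols without a match
to_entrez <- function(symbols) {
  ids <- mapIds(org.Hs.eg.db, keys = symbols,
                column = "ENTREZID", keytype = "SYMBOL")
  unname(ids[!is.na(ids)])
}

# List of marker genes for each cluster
marker_list <- lapply(split(markers$gene, markers$cluster), to_entrez)
names(marker_list) <- paste0("cluster", names(marker_list))

# List of all genes in the Seurat object (background)
background <- to_entrez(all_genes)

# GO over-representation test for every cluster against the custom background
go_res <- goana(de = marker_list, universe = background, species = "Hs")

# Top biological process terms for cluster 0
topGO(go_res, ontology = "BP", sort = "cluster0", number = 10)
```

With a list as input, goana() returns one count column and one p-value column per cluster, so a single call covers all clusters.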

Example code for enrichGO()

Load required packages

> library(org.Hs.eg.db)
> library(clusterProfiler)
> library(AnnotationHub)

enrichGO() expects Entrez IDs by default. Gene symbols can be used directly by setting keyType="SYMBOL"; otherwise map them to Entrez IDs first (see above).

For the GO enrichment analysis you need:

  • gene: your list of markers
  • universe: background (2000 HVGs)
  • keyType: type of gene ID you supply (e.g. ENTREZID or SYMBOL)
  • OrgDb: organism annotation database
  • ont: which GO root ontology
  • pAdjustMethod: method for multiple testing correction
  • qvalueCutoff: threshold on the q-value (pvalueCutoff sets the threshold on the adjusted p-value)
  • readable: map ID to gene symbol?

> GO2 <- enrichGO(gene=MarkersList, universe=Background, keyType="ENTREZID", OrgDb=org.Hs.eg.db, ont="BP", pAdjustMethod="BH", qvalueCutoff=0.05, readable=TRUE)

Use keyType="SYMBOL" instead if your lists contain gene symbols.

Output results from GO analysis to a table

> go_summary <-  data.frame(GO2)
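
To try the whole flow without your own data, here is a minimal sketch that uses the geneList example data from the DOSE package (installed together with clusterProfiler) as a stand-in for the marker list and background. It runs enrichGO(), writes the table to a file and draws a dot plot with enrichplot. The file name and the fold change cutoff of 2 are arbitrary choices for illustration.

```r
library(clusterProfiler)
library(enrichplot)
library(org.Hs.eg.db)

# Example data from DOSE: a named numeric vector, names are Entrez IDs
data(geneList, package = "DOSE")
MarkersList <- names(geneList)[abs(geneList) > 2]   # stand-in for your markers
Background  <- names(geneList)                      # stand-in for your background

GO2 <- enrichGO(gene = MarkersList, universe = Background,
                keyType = "ENTREZID", OrgDb = org.Hs.eg.db, ont = "BP",
                pAdjustMethod = "BH", qvalueCutoff = 0.05, readable = TRUE)

# Table of results, one row per enriched GO term
go_summary <- as.data.frame(GO2)
write.csv(go_summary, "GO_enrichment_BP.csv", row.names = FALSE)

# Dot plot of the 15 most significant terms
dotplot(GO2, showCategory = 15)
```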

clusterProfiler also has a function enrichKEGG() for KEGG pathways. For GSEA it offers gseGO() and gseKEGG(), but then your input needs to contain a value for each gene (log fold change, p-value…), and not only gene names.
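
As a rough sketch of what such an input looks like: gseGO() takes a named numeric vector, one value per gene, sorted in decreasing order. The example below again borrows the geneList data from DOSE and first reshapes it into a table to mimic a DE result. The column names entrez and log2FC are made up for this sketch, so adapt them to your own table.

```r
library(clusterProfiler)
library(org.Hs.eg.db)

# Example data from DOSE, reshaped into a table to mimic a DE result
data(geneList, package = "DOSE")
de_table <- data.frame(entrez = names(geneList),
                       log2FC = as.numeric(geneList))

# GSEA input: one value per gene, named by gene ID, sorted in decreasing order
ranks <- setNames(de_table$log2FC, de_table$entrez)
ranks <- sort(ranks, decreasing = TRUE)

gsea_res <- gseGO(geneList = ranks, OrgDb = org.Hs.eg.db, ont = "BP",
                  keyType = "ENTREZID", pAdjustMethod = "BH",
                  pvalueCutoff = 0.05)

# NES > 0: gene set enriched at the top of the ranking, NES < 0: at the bottom
head(as.data.frame(gsea_res)[, c("ID", "Description", "NES", "p.adjust")])
```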