SBGNview Based Pathway Analysis and Visualization Workflow

Introduction

SBGNview has collected pathway data and gene sets from the following databases: Reactome, PANTHER Pathway, SMPDB, MetaCyc and MetaCrop. These gene sets can be used for pathway enrichment analysis.

In this vignette, we will show you a complete pathway analysis workflow based on GAGE + SBGNview. Similar workflows have been documented in the gage package using GAGE + Pathview.

Citation

Please cite the following papers when using the open-source SBGNview package. This will help the project and our team:

Luo W, Brouwer C. Pathview: an R/Bioconductor package for pathway-based data integration and visualization. Bioinformatics, 2013, 29(14):1830-1831, doi: 10.1093/bioinformatics/btt285

Please also cite the GAGE paper when using the gage package:

Luo W, Friedman M, et al. GAGE: generally applicable gene set enrichment for pathway analysis. BMC Bioinformatics, 2009, 10:161, doi: 10.1186/1471-2105-10-161

Installation and quick start

Please see the Quick Start tutorial for installation instructions and quick start examples.

Complete pathway analysis + visualization workflow

In this example, we analyze an RNA-seq dataset of IFNg KO mice vs. wild-type mice. It contains the normalized RNA-seq gene expression data described in Greer, Renee L., Xiaoxi Dong, et al., 2016.

Load the gene (RNA-seq) data

The RNA abundance data were quantile normalized, log2 transformed, and stored in a “SummarizedExperiment” object. SBGNview input user data (gene.data or cpd.data) can be either a numeric matrix or a vector, like those in pathview. It can also be a “SummarizedExperiment” object, which is commonly used in Bioconductor packages.

library(SBGNview)
library(SummarizedExperiment)
data("IFNg", "pathways.info")
count.data <- assays(IFNg)$counts
head(count.data)
wt.cols <- which(IFNg$group == "wt")
ko.cols <- which(IFNg$group == "ko")
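
The three input shapes described above can be sketched with toy data. This is a minimal illustration, not part of the IFNg analysis: the gene IDs, sample names, and values are placeholders, and the SummarizedExperiment part only runs if that package is installed.

```r
# Toy log2 expression matrix: rows are genes, columns are samples.
# Gene IDs, sample names, and values are placeholders, not real data.
toy.expr <- matrix(
    c(5.1, 5.3, 6.0, 6.2,
      7.4, 7.2, 6.1, 6.0,
      3.3, 3.1, 3.2, 3.4),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("geneA", "geneB", "geneC"),
                    c("wt1", "wt2", "ko1", "ko2"))
)

# Vector input: one value per gene, named by gene ID.
toy.vec <- toy.expr[, "ko1"] - toy.expr[, "wt1"]
print(toy.vec)

# SummarizedExperiment input (built only if the package is available).
if (requireNamespace("SummarizedExperiment", quietly = TRUE)) {
    toy.se <- SummarizedExperiment::SummarizedExperiment(
        assays = list(counts = toy.expr),
        colData = S4Vectors::DataFrame(
            group = c("wt", "wt", "ko", "ko"),
            row.names = colnames(toy.expr)
        )
    )
    # Sample annotations are reachable with `$`, as with IFNg$group.
    print(which(toy.se$group == "ko"))
}
```

In all three cases the gene IDs travel with the data (as names or row names), which is what lets them be matched against pathway nodes.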

Gene sets from SBGNview pathway collection

Load gene sets for mouse with ENSEMBL gene IDs

ensembl.pathway <- sbgn.gsets(id.type = "ENSEMBL",
                              species = "mmu",
                              mol.type = "gene",
                              output.pathway.name = TRUE
                              )
head(ensembl.pathway[[2]])
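
Before running an enrichment analysis, it can help to check how many gene set members actually appear in the expression data. The sketch below assumes the gene sets form a named list of gene ID vectors, which is the shape gage accepts for its gsets argument. The set names and gene IDs are hypothetical.

```r
# Hypothetical gene sets: a named list of gene ID vectors.
toy.gsets <- list(
    "pathwayA::Example pathway A" = c("gene1", "gene2", "gene3"),
    "pathwayB::Example pathway B" = c("gene3", "gene4")
)

# Hypothetical gene IDs present in the expression matrix (its row names).
toy.ids <- c("gene1", "gene3", "gene5")

# Fraction of each gene set that is covered by the data.
coverage <- sapply(toy.gsets, function(gs) mean(gs %in% toy.ids))
print(round(coverage, 2))
```

With the real objects, the same check would use ensembl.pathway in place of toy.gsets and rownames(count.data) in place of toy.ids. Very low coverage usually points to a mismatch in ID type or species.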

Pathway or gene set analysis using GAGE

if(!requireNamespace("gage", quietly = TRUE)) {
  BiocManager::install("gage", update = FALSE)
}

library(gage)
degs <- gage(exprs = count.data,
             gsets = ensembl.pathway,
             ref = wt.cols,
             samp = ko.cols,
             compare = "paired" # alternative: "as.group"
             )
head(degs$greater)[,3:5]
head(degs$less)[,3:5]
down.pathways <- row.names(degs$less)[1:10]
head(down.pathways)
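
Instead of taking a fixed number of top rows, pathways can also be selected by a q-value cutoff. The sketch below uses a mock matrix with the p.val, q.val, and set.size columns found in gage results. The values are made up, and the 0.1 cutoff is an assumption that should be adapted to the study.

```r
# Mock result table mimicking the p.val, q.val and set.size columns
# of a gage result; the numbers are made up for illustration.
mock.less <- matrix(
    c(0.001, 0.01, 25,
      0.010, 0.05, 40,
      0.200, 0.40, 12),
    nrow = 3, byrow = TRUE,
    dimnames = list(
        c("pathwayA::Example A", "pathwayB::Example B", "pathwayC::Example C"),
        c("p.val", "q.val", "set.size")
    )
)

# Keep pathways below the (assumed) q-value cutoff; skip NA rows.
q.cutoff <- 0.1
keep <- !is.na(mock.less[, "q.val"]) & mock.less[, "q.val"] < q.cutoff
sig.pathways <- rownames(mock.less)[keep]
print(sig.pathways)
```

Applied to the real analysis, mock.less would be replaced by degs$less or degs$greater.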

Visualize perturbations in top SBGN pathways

Calculate fold changes or gene perturbations

The abundance values are log2 transformed, so the log2 fold change of the IFNg KO group vs. the WT group is the difference between their values.

ensembl.koVsWt <- count.data[,ko.cols]-count.data[,wt.cols]
head(ensembl.koVsWt)

# Alternatively, calculate the mean fold change per gene, which corresponds to the gage analysis above with compare = "as.group"
mean.wt <- rowMeans(count.data[, wt.cols])
head(mean.wt)
mean.ko <- rowMeans(count.data[, ko.cols])
head(mean.ko)
# The abundance values are on the log2 scale, so the log2 fold change is the difference of the group means.
ensembl.koVsWt.m <- mean.ko - mean.wt
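
The two fold change calculations can be compared on a toy matrix. This is a small worked example with made-up log2 values. It shows that subtracting log2 values gives log2 fold changes, and that with equal numbers of KO and WT samples the per-gene mean of the paired differences equals the difference of the group means.

```r
# Toy log2 abundances for two genes; sample names and values are made up.
toy <- matrix(
    c(5, 7, 6, 9,
      6, 8, 5, 7),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("g1", "g2"), c("wt1", "wt2", "ko1", "ko2"))
)
toy.wt <- c(1, 2)
toy.ko <- c(3, 4)

# Paired log2 fold changes: one column per KO/WT pair.
paired.fc <- toy[, toy.ko] - toy[, toy.wt]
print(paired.fc)

# Group-level log2 fold change: difference of group means.
group.fc <- rowMeans(toy[, toy.ko]) - rowMeans(toy[, toy.wt])
print(group.fc)

# Back on the linear scale, a log2 fold change x is a 2^x-fold change.
print(2^group.fc)
```

Here g1 has a group-level log2 fold change of 1.5 and g2 of -1, matching the row means of the paired differences.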

Visualize pathway perturbations with SBGNview

# Load the SBGNview pathway collection, which may take a few seconds.
data(sbgn.xmls)
down.pathways <- sapply(strsplit(down.pathways,"::"), "[", 1)
head(down.pathways)
sbgnview.obj <- SBGNview(
    gene.data = ensembl.koVsWt,
    gene.id.type = "ENSEMBL",
    input.sbgn = down.pathways[1:2],#can be more than 2 pathways
    output.file = "ifn.sbgnview.less",
    show.pathway.name = TRUE,
    max.gene.value = 2,
    min.gene.value = -2,
    mid.gene.value = 0,
    node.sum = "mean",
    output.format = c("png"),
    
    font.size = 2.3,
    org = "mmu",
    
    text.length.factor.complex = 3,
    if.scale.compartment.font.size = TRUE,
    node.width.adjust.factor.compartment = 0.04 
)
sbgnview.obj

SBGNview graph of the most down-regulated pathway in the IFNg KO experiment

SBGNview graph of the second most down-regulated pathway in the IFNg KO experiment
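
The workflow above recovers pathway IDs by splitting the gene set names at "::" and keeping the first part. The sketch below isolates that step on toy names. The descriptive parts of the names are hypothetical, and vapply is used here in place of sapply only so that the result is guaranteed to be a character vector.

```r
# Toy gene set names in "ID::name" form; the descriptive parts are made up.
toy.names <- c("R-HSA-877300::Example pathway name",
               "pathwayB::Another example pathway")

# Split each name at the literal "::" and keep the first piece (the ID).
toy.pathway.ids <- vapply(
    strsplit(toy.names, "::", fixed = TRUE),
    `[`, character(1), 1
)
print(toy.pathway.ids)
```

The resulting IDs are in the form passed to the input.sbgn argument in the call above.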

SBGNview with SummarizedExperiment object

‘cancer.ds’ is a microarray dataset from a breast cancer study. It was adapted from the gage package and processed into a SummarizedExperiment object. Here it is used to demonstrate SBGNview’s visualization capabilities.

data("cancer.ds")
sbgnview.obj <- SBGNview(
    gene.data = cancer.ds,
    gene.id.type = "ENTREZID",
    input.sbgn = "R-HSA-877300",
    output.file = "demo.SummarizedExperiment",
    show.pathway.name = TRUE,
    max.gene.value = 1,
    min.gene.value = -1,
    mid.gene.value = 0,
    node.sum = "mean",
    output.format = c("png"),
    
    font.size = 2.3,
    org = "hsa",
    
    text.length.factor.complex = 3,
    if.scale.compartment.font.size = TRUE,
    node.width.adjust.factor.compartment = 0.04
   )
sbgnview.obj
SBGNview of a cancer dataset gse16873

Session Info

sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] gage_2.57.0                 SummarizedExperiment_1.37.0
##  [3] Biobase_2.67.0              GenomicRanges_1.59.1       
##  [5] GenomeInfoDb_1.43.1         IRanges_2.41.1             
##  [7] S4Vectors_0.45.2            BiocGenerics_0.53.3        
##  [9] generics_0.1.3              MatrixGenerics_1.19.0      
## [11] matrixStats_1.4.1           SBGNview_1.21.0            
## [13] SBGNview.data_1.20.0        pathview_1.47.0            
## [15] knitr_1.49                  bookdown_0.41              
## 
## loaded via a namespace (and not attached):
##  [1] KEGGREST_1.47.0         xfun_0.49               bslib_0.8.0            
##  [4] lattice_0.22-6          vctrs_0.6.5             tools_4.4.2            
##  [7] Rdpack_2.6.2            bitops_1.0-9            AnnotationDbi_1.69.0   
## [10] RSQLite_2.3.8           blob_1.2.4              pkgconfig_2.0.3        
## [13] Matrix_1.7-1            graph_1.85.0            lifecycle_1.0.4        
## [16] GenomeInfoDbData_1.2.13 compiler_4.4.2          Biostrings_2.75.1      
## [19] htmltools_0.5.8.1       sys_3.4.3               buildtools_1.0.0       
## [22] sass_0.4.9              RCurl_1.98-1.16         yaml_2.3.10            
## [25] GO.db_3.20.0            crayon_1.5.3            jquerylib_0.1.4        
## [28] DelayedArray_0.33.2     cachem_1.1.0            org.Hs.eg.db_3.20.0    
## [31] abind_1.4-8             digest_0.6.37           maketools_1.3.1        
## [34] rsvg_2.6.1              fastmap_1.2.0           grid_4.4.2             
## [37] SparseArray_1.7.2       cli_3.6.3               magrittr_2.0.3         
## [40] S4Arrays_1.7.1          XML_3.99-0.17           UCSC.utils_1.3.0       
## [43] bit64_4.5.2             rmarkdown_2.29          XVector_0.47.0         
## [46] httr_1.4.7              igraph_2.1.1            bit_4.5.0              
## [49] png_0.1-8               memoise_2.0.1           evaluate_1.0.1         
## [52] rbibutils_2.3           rlang_1.1.4             DBI_1.2.3              
## [55] Rgraphviz_2.51.0        xml2_1.3.6              KEGGgraph_1.67.0       
## [58] jsonlite_1.8.9          R6_2.5.1                zlibbioc_1.52.0