--- title: "BulkSignalR : Inference of ligand-receptor interactions from bulk data or spatial transcriptomics" author: - name: Jean-Philippe Villemin affiliation: - Institut de Recherche en Cancérologie de Montpellier, Inserm, Montpellier, France email: jean-philippe.villemin@inserm.fr - name: Jacques Colinge affiliation: - Institut de Recherche en Cancérologie de Montpellier, Inserm, Montpellier, France email: jacques.colinge@inserm.fr date: "`r format(Sys.Date(), '%m/%d/%Y')`" abstract: >
BulkSignalR is used to infer ligand-receptor (L-R) interactions from bulk
expression (transcriptomics/proteomics) data, or spatial
transcriptomics. Potential L-R interactions are taken from the
LR*db* database, which is included in our other package SingleCellSignalR,
available from
Bioconductor. Inferences rely on a statistical model linking potential
L-R interactions with biological pathways from Reactome or biological
processes from GO.
A number of visualization and data summary functions are proposed to
help navigating the predicted interactions.
BulkSignalR package version: `r packageVersion("BulkSignalR")`
output:
rmarkdown::html_vignette:
self_contained: true
toc: true
toc_depth: 4
highlight: pygments
fig_height: 3
fig_width: 3
fig_caption: no
code_folding: show
package: BulkSignalR
vignette: >
%\VignetteIndexEntry{BulkSignalR-Main}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r setup, include = FALSE}
knitr::knit_hooks$set(optipng = knitr::hook_optipng)
```
```{r load-libs, message = FALSE, warning = FALSE, results = FALSE}
library(BulkSignalR)
library(igraph)
library(dplyr)
library(STexampleData)
```
# Introduction
## What is it for?
`BulkSignalR` is a tool that enables the inference of L-R
interactions from bulk expression data, *i.e.*, from transcriptomics
(RNA-seq or microarrays) or expression proteomics.
`BulkSignalR` also applies to spatial transcriptomics such as
10x Genomics VISIUM (TM), and a set of
functions dedicated to spatial data have been added to better support
this particular use of the package.
## Starting point
There are a variety of potential data sources prior to using `BulkSignalR`
(proteomics, sequencing, etc.) that result
in the generation of a matrix of numbers representing the expression
levels of genes or proteins. This matrix may be normalized already or not.
In every case, the latter matrix is the starting point of using
`BulkSignalR`.
It is mandatory that genes/proteins are represented as rows of the
expression matrix and the samples as columns. HUGO gene symbols
(even for proteins) must be used to ensure matching LR*db*,
Reactome, and GOBP contents.
You can also directly work with an object from the `SummarizedExperiment` or
`SpatialExperiment` Bioconductor classes.
## How does it work?
As represented in the figure below, only a few steps are required in order to
perform a `BulkSignalR` analysis.
Three S4 objects will be sequentially constructed:
* **BSR-DataModel**, denoted `bsrdm`
* **BSR-Inference**, denoted `bsrinf`
* **BSR-Signature**, denoted `bsrsig`
\
**BSRDataModel** integrates the expression data matrix and the
parameters of the statistical model learned on this matrix.
**BSRInference** provides a list triples (ligand, receptor, pathway downstream
the receptor) with their statistical significance. If a receptor occurs
in multiple pathways, the corresponding number of triples will be described
in the BSR-Inference object. Genes (or proteins) targeted by the pathway
are also described in this object.
**BSRSignature** contains gene signatures associated with the triples
(ligand, receptor, downstream pathway)
stored in a `BSRInference` object. Those signatures are comprised
of the ligand and the receptor obviously, but also all the pathway
target genes that were used by the statistical model. Gene signatures
are meant to report the L-R interaction as a global phenomenon integrating
its downstream effect. Indeed, signatures can be scored with a dedicated
function allowing the user to set an arbitrary weight on the downtream
component. Those scores represent the activity of the L-R interactions
across the samples, they are returned as a matrix.
Because of the occurrence of certain receptors in multiple pathways,
but also because some ligands may bind several receptors, or *vice versa*,
BSR-Inference objects may contain redundant data depending on how
the user want to look at them. We therefore provide a range of
reduction operators meant to obtain reduced BSR-Inference objects
(see below).
\
Furthermore, we provide several handy functions to explore the data
through different plots (heatmaps, alluvial plots, chord diagrams or networks).
`BulkSignalR` package functions have many parameters that can be
changed by the user to fit specific needs (see Reference Manual for details).
## Parallel mode settings
Users can reduce compute time
by using several processors in parallel.
```{r parallel, message=FALSE, warnings=FALSE}
library(doParallel)
n.proc <- 1
cl <- makeCluster(n.proc)
registerDoParallel(cl)
# To add at the end of your script
# stopCluster(cl)
```
**Notes : **
For operating systems that can fork such as all the UNIX-like systems,
it might be preferable to use the library `doMC` that is faster (less
overhead). This is transparent to `BulkSignalR`.
In case you need to reproduce results exactly and since statistical
model parameter learning involves the generation of randomized expression
matrices, you might want to use `set.seed()`. In a parallel mode,
`iseed` that is a parameter of `clusterSetRNGStream` must be used.
# First Example
## Loading the data
Here, we load salivary duct carcinoma transcriptomes integrated as
`sdc` in `BulkSignalR`.
```{r loading,eval=TRUE}
data(sdc)
head(sdc)
```
## Building a BSRDataModel object
`BSRDataModel` creates the first object with information
relative to your bulk expression data.
```{r BSRDataModel, eval=TRUE,cache=FALSE}
bsrdm <- BSRDataModel(counts = sdc)
bsrdm
```
`learnParameters` updates a BSR-DataModel object with the parameters
necessary for `BulkSignalR` statistical model.
```{r learnParameters,eval=TRUE,warning=FALSE,fig.dim = c(7,3)}
bsrdm <- learnParameters(bsrdm,quick=TRUE)
bsrdm
```
## Building a BSRInference object
From the previous object `bsrdm`, you can generate inferences by calling its
method
`BSRInference`. The resulting BSR-Inference object, `bsrinf`,
contains all the inferred L-R interactions
with their associated pathways and corrected p-values.
From here, you can already access LR interactions using: `LRinter(bsrinf)`.
```{r BSRInference,eval=TRUE}
# We use a subset of the reference to speed up
# inference in the context of the vignette.
subset <- c("REACTOME_BASIGIN_INTERACTIONS",
"REACTOME_SYNDECAN_INTERACTIONS",
"REACTOME_ECM_PROTEOGLYCANS",
"REACTOME_CELL_JUNCTION_ORGANIZATION")
reactSubset <- BulkSignalR:::.SignalR$BulkSignalR_Reactome[
BulkSignalR:::.SignalR$BulkSignalR_Reactome$`Reactome name` %in% subset,]
resetPathways(dataframe = reactSubset,
resourceName = "Reactome")
bsrinf <- BSRInference(bsrdm,
min.cor = 0.3,
reference="REACTOME")
LRinter.dataframe <- LRinter(bsrinf)
head(LRinter.dataframe[
order(LRinter.dataframe$qval),
c("L", "R", "LR.corr", "pw.name", "qval")])
```
You can finally filter out non-significant L-R interactions and order them
by best Q-value before saving them to a file for instance.
```{r BSRInferenceTfile, eval=FALSE}
write.table(LRinter.dataframe[order(LRinter.dataframe$qval), ],
"./sdc_LR.tsv",
row.names = FALSE,
sep = "\t",
quote = FALSE
)
```
## Reduction strategies
The output of `BSRInference` is exhaustive and can thus contain
redundancy due to intrinsic redundancy in the reference databases (Reactome,
KEGG, GOBP) and multilateral interactions in LR*db*.
To address this phenomenon, we propose several strategies.
### Reducing a BSRInference object to pathway
With `reduceToPathway`, all the L-R interactions with the receptor included
in a certain pathway are aggregated to only report each downstream pathway
once. For a given pathway, the reported P-values and target genes are those
of best (minimum P-value) L-R interaction that was part of the aggregation.
Nothing is recomputed, we simply merge.
```{r ReduceToPathway, eval=TRUE}
bsrinf.redP <- reduceToPathway(bsrinf)
```
### Reducing a BSRInference object to best pathway
With ` reduceToBestPathway`, a BSR-Inference object is reduced to only report
one pathway per L-R interaction. The pathway with the smallest P-value
is selected.
A same pathways might occur multiple times with with different L-R interactions.
```{r ReduceToBestPathway, eval=TRUE}
bsrinf.redBP <- reduceToBestPathway(bsrinf)
```
### Reducing to ligands or receptors
As already mentioned, several ligands might bind a single receptor (or
several shared receptors) and the converse is true as well. Two
reduction operators enable users to either aggregate all the ligands
of a same receptor or all the receptors bound by a same ligand:
```{r ReduceToLigand,eval=TRUE}
bsrinf.L <- reduceToLigand(bsrinf)
bsrinf.R <- reduceToReceptor(bsrinf)
```
### Combined reductions
Combinations are possible.
For instance, users can apply `reduceToPathway` and `reduceToBestPathway`
reductions sequentially to maximize the reduction effect. In case the exact
same sets of aggregated ligands and receptors obtained with `reduceToPathway`
was associated with several pathways, the pathway with the best P-value
would be kept by `reduceToBestPathway`.
```{r doubleReduction, eval=TRUE}
bsrinf.redP <- reduceToPathway(bsrinf)
bsrinf.redPBP <- reduceToBestPathway(bsrinf.redP)
```
## Building a BSRSignature object
Gene signatures for a given, potentially reduced BSR-Inference object
are generated by `BSRSignature`, which returns a BSRSignature
object.
To follow the activity of L-R interactions across the samples of
the dataset, `scoreLRGeneSignatures` computes a score for each gene signature.
Then, heatmaps can be generated to represent differences, *e.g.*, using
the utility function a `simpleHeatmap`.
Hereafter, we show different workflows of reductions combined with gene
signature scoring and display.
### Scoring by ligand-receptor
```{r scoringLR,eval=TRUE,fig.dim = c(5,4)}
bsrsig.redBP <- BSRSignature(bsrinf.redBP, qval.thres = 0.001)
scoresLR <- scoreLRGeneSignatures(bsrdm, bsrsig.redBP,
name.by.pathway = FALSE
)
simpleHeatmap(scoresLR[1:20, ],
hcl.palette = "Cividis",
pointsize=8,width=4,height=3)
```
### Scoring by pathway
```{r scoringPathway,eval=TRUE,fig.dim = c(7,3)}
bsrsig.redPBP <- BSRSignature(bsrinf.redPBP, qval.thres = 0.01)
scoresPathway <- scoreLRGeneSignatures(bsrdm, bsrsig.redPBP,
name.by.pathway = TRUE
)
simpleHeatmap(scoresPathway,
hcl.palette = "Blue-Red 2",
pointsize=8,width=6,height=2)
```
## Other visualization utilities
### Heatmap of ligand-receptor-target genes expression
After computing gene signatures score, one may wish to look at the
expression of the genes involved in that signature. For instance,
we can display three heatmaps corresponding to the scaled (z-scores)
expression of ligands (pink), receptors (green), and target genes (blue).
On the top of each heatmap, the whole signature score from
`scoreLRGeneSignatures` is reported for reference.
```{r heatmapMulti,eval=TRUE,fig.dim = c(7,14)}
pathway1 <- pathways(bsrsig.redPBP)[1]
signatureHeatmaps(
pathway = pathway1,
bsrdm = bsrdm,
bsrsig = bsrsig.redPBP,
h.width = 6,
h.height = 8,
fontsize = 4,
show_column_names = TRUE
)
```
### AlluvialPlot
`alluvial.plot` is a function that enable users to represent the different
interactions between ligands, receptors, and pathways stored in a
BSRInference object.
Obviously, it is possible to filter by ligand, receptor, or pathway. This
is achieved by specifying a key word on the chosen category. A filter on
L-R interaction Q-values can be applied in addition.
```{r AlluvialPlot,eval=TRUE,fig.dim = c(8,3)}
alluvialPlot(bsrinf,
keywords = c("LAMC1"),
type = "L",
qval.thres = 0.01
)
```
### BubblePlot
`bubblePlotPathwaysLR` is a handy way to visualize the strengths of several
L-R interactions in relation with their receptor downstream pathways.
A vector of pathways of interest can be provided to limit the complexity
of the plot.
```{r BubblePlot,eval=TRUE,fig.dim = c(8,4)}
pathways <- LRinter(bsrinf)[1,c("pw.name")]
bubblePlotPathwaysLR(bsrinf,
pathways = pathways,
qval.thres = 0.001,
color = "red",
pointsize = 8
)
```
### Chordiagram
`chord.diagram.LR` is a function that enable users to feature the different
L-R interactions involved in a specific pathway.
L-R correlations strengths are drawn in a yellow color-scale.
Ligands are in grey, whereas receptors are in green.
You can also highlight in red one specific interaction by passing values
of a L-R pair as follows `ligand="FYN", receptor="SPN"`.
```{r Chordiagram,eval=TRUE,fig.dim = c(6,4.5)}
chordDiagramLR(bsrinf,
pw.id.filter = "R-HSA-210991",
limit = 20,
ligand="FYN",
receptor="SPN"
)
```
# Network Analysis
Since `BulkSignalR` relies on intracellular networks to estimate the statistical
significance of (ligand, receptor, pathway triples), links from receptors to
target genes are naturally accessible. Different functions enable users to
exploit this graphical data for plotting or further data analysis.
Furthermore, networks can be exported in text files and graphML objects
to be further explored with Cytoscape (www.cytoscape.org),
yEd (www.yworks.com), or similar software tools.
```{r network1, eval=TRUE}
# Generate a ligand-receptor network and export it in .graphML
# for Cytoscape or similar tools
gLR <- getLRNetwork(bsrinf.redBP, qval.thres = 1e-3)
# save to file
# write.graph(gLR,file="SDC-LR-network.graphml",format="graphml")
# As an alternative to Cytoscape, you can play with igraph package functions.
plot(gLR,
edge.arrow.size = 0.1,
vertex.label.color = "black",
vertex.label.family = "Helvetica",
vertex.label.cex = 0.1
)
```
```{r network2, eval=TRUE}
# You can apply other functions.
# Community detection
u.gLR <- as_undirected(gLR) # most algorithms work for undirected graphs only
comm <- cluster_edge_betweenness(u.gLR)
# plot(comm,u.gLR,
# vertex.label.color="black",
# vertex.label.family="Helvetica",
# vertex.label.cex=0.1)
# Cohesive blocks
cb <- cohesive_blocks(u.gLR)
plot(cb, u.gLR,
vertex.label.color = "black",
vertex.label.family = "Helvetica",
vertex.label.cex = 0.1,
edge.color = "black"
)
```
```{r network3, eval=FALSE,warning=FALSE}
# For the next steps, we just share the code below but graph generation function
# are commented to lighten the vignette.
# Generate a ligand-receptor network complemented with intra-cellular,
# receptor downstream pathways [computations are a bit longer here]
#
# You can save to a file for cystoscape or plot with igraph.
gLRintra <- getLRIntracellNetwork(bsrinf.redBP, qval.thres = 1e-3)
lay <- layout_with_kk(gLRintra)
# plot(gLRintra,
# layout=lay,
# edge.arrow.size=0.1,
# vertex.label.color="black",
# vertex.label.family="Helvetica",
# vertex.label.cex=0.1)
# Reduce complexity by focusing on strongly targeted pathways
pairs <- LRinter(bsrinf.redBP)
top <- unique(pairs[pairs$pval < 1e-3, c("pw.id", "pw.name")])
top
gLRintra.res <- getLRIntracellNetwork(bsrinf.redBP,
qval.thres = 0.01,
restrict.pw = top$pw.id
)
lay <- layout_with_fr(gLRintra.res)
# plot(gLRintra.res,
# layout=lay,
# edge.arrow.size=0.1,
# vertex.label.color="black",
# vertex.label.family="Helvetica",
# vertex.label.cex=0.4)
```
# Non human data
\
In order to process data from nonhuman organisms,
users only need to specify a few additional parameters and all the
other steps of the analysis remain unchanged.
By default, `BulksignalR` works with *Homo sapiens*.
We implemented a strategy using ortholog genes (mapped by the
orthogene BioConductor package) in BulkSignalR directly.
The function `findOrthoGenes` creates a correspondence table between
human and another organism.
`convertToHuman` then converts an initial expression matrix to
a *Homo sapiens* equivalent.
When calling `BSRDataModel`, the user only needs to pass this transformed
matrix, the actual nonhuman organism, and the correspondence table.
Then, L-R interaction inference is performed as for human data.
Finally, users can switch back to gene names relative to the original organism
via `resetToInitialOrganism`.
The rest of the workflow is executed as usual for
computing gene signatures and visualizing.
\
```{r mouse,eval=TRUE,warning=FALSE}
data(bodyMap.mouse)
ortholog.dict <- findOrthoGenes(
from_organism = "mmusculus",
from_values = rownames(bodyMap.mouse)
)
matrix.expression.human <- convertToHuman(
counts = bodyMap.mouse,
dictionary = ortholog.dict
)
bsrdm <- BSRDataModel(
counts = matrix.expression.human,
species = "mmusculus",
conversion.dict = ortholog.dict
)
bsrdm <- learnParameters(bsrdm,quick=TRUE)
bsrinf <- BSRInference(bsrdm,reference="REACTOME")
bsrinf <- resetToInitialOrganism(bsrinf, conversion.dict = ortholog.dict)
# For example, if you want to explore L-R interactions
# you can proceed as shown above for a human dataset.
# bsrinf.redBP <- reduceToBestPathway(bsrinf)
# bsrsig.redBP <- BSRSignature(bsrinf.redBP, qval.thres=0.001)
# scoresLR <- scoreLRGeneSignatures(bsrdm,bsrsig.redBP,name.by.pathway=FALSE)
# simpleHeatmap(scoresLR[1:20,],column.names=TRUE,
# width=9, height=5, pointsize=8)
```
# Spatial Transcriptomics
\
`BulkSignalR` workflow can be applied to spatial transcriptomics (ST) to find
significant L-R interactions occurring in a tissue. Additional functions
have been introduced to facilitate the visualization and analysis of the
results in a spatial context. The only necessary change is to adapt the
training of the statistical model to shallower data featuring dropouts
and reduced dynamic range. This is achieved by imposing a minimum correlation
at -1 in the training and requiring at least two target genes in a
pathway instead of 4. Also, thresholds on L-R interaction Q-values should
be released slightly such as 1% instead of 0.1%.
A basic spatial function is `spatialPlot` that enables visualizing
L-R interaction gene signature scores at their spatial coordinates with a
potential reference plot (raw tissue image or user-defined areas) on the side.
When the papaer was published, we provided scripts
[BulkSignalR github companion](
https://github.com/jcolinge/BulkSignalR_companion),
to retrieve ST raw data, from tabulated files, that was the
starting point to execute our workflow.
We can also directly work with an object from the `SpatialExperiment`
Bioconductor class.
Also you can check out the `VisiumIO` package that allows users
to readily import Visium data from the 10X Space Ranger pipeline.
`VisiumIO` package provides a convenient function to easily retrieve
SpatialExperiment object.
\
```{r spatial1,eval=TRUE,message=FALSE}
# load data =================================================
# We re-initialise the environment variable of Reactome
# whith the whole set. (because we previously made a subset before)
reactSubset <- getResource(resourceName = "Reactome",
cache = TRUE)
resetPathways(dataframe = reactSubset,
resourceName = "Reactome")
# Few steps of pre-process to subset a spatialExperiment object
# from STexampleData package ==================================
spe <- Visium_humanDLPFC()
set.seed(123)
speSubset <- spe[, colData(spe)$ground_truth%in%c("Layer1","Layer2")]
idx <- sample(ncol(speSubset), 10)
speSubset <- speSubset[, idx]
my.image.as.raster <- SpatialExperiment::imgRaster(speSubset,
sample_id = imgData(spe)$sample_id[1], image_id = "lowres")
colData(speSubset)$idSpatial <- paste(colData(speSubset)[[4]],
colData(speSubset)[[5]],sep = "x")
annotation <- colData(speSubset)
```
```{r spatial2,eval=TRUE,warning=FALSE}
# prepare data =================================================
bsrdm <- BSRDataModel(speSubset,
min.count = 1,
prop = 0.01,
method = "TC",
symbol.col = 2,
x.col = 4,
y.col = 5,
barcodeID.col = 1)
bsrdm <- learnParameters(bsrdm,
quick = TRUE,
min.positive = 2,
verbose = TRUE)
bsrinf <- BSRInference(bsrdm, min.cor = -1,reference="REACTOME")
# spatial analysis ============================================
bsrinf.red <- reduceToBestPathway(bsrinf)
pairs.red <- LRinter(bsrinf.red)
thres <- 0.01
min.corr <- 0.01
pairs.red <- pairs.red[pairs.red$qval < thres & pairs.red$LR.corr > min.corr,]
head(pairs.red[
order(pairs.red$qval),
c("L", "R", "LR.corr", "pw.name", "qval")])
s.red <- BSRSignature(bsrinf.red, qval.thres=thres)
scores.red <- scoreLRGeneSignatures(bsrdm,s.red)
head(scores.red)
```
From here, one can start to explore the data
through different plots.
**Note :** As we work on a data subset to increase
the vignette generation, only a few point are
displayed on the slide.
```{r spatialPlot3,eval=TRUE,fig.dim = c(6,4.5)}
# Visualization ============================================
# plot one specific interaction
# we have to follow the syntax with {}
# to be compatible with reduction operations
inter <- "{SLIT2} / {GPC1}"
# with raw tissue reference
spatialPlot(scores.red[inter, ], annotation, inter,
ref.plot = TRUE, ref.plot.only = FALSE,
image.raster = NULL, dot.size = 1,
label.col = "ground_truth"
)
# or with synthetic image reference
spatialPlot(scores.red[inter, ], annotation, inter,
ref.plot = TRUE, ref.plot.only = FALSE,
image.raster = my.image.as.raster, dot.size = 1,
label.col = "ground_truth"
)
```
You can dissect one interaction to visualise both ligand and
receptor expression of the interaction.
```{r spatialPlot4,eval=TRUE,fig.dim = c(6,4.5)}
separatedLRPlot(scores.red, "SLIT2", "GPC1",
ncounts(bsrdm),
annotation,
label.col = "ground_truth")
```
```{r spatialPlot5,eval=TRUE}
# generate visual index on disk in pdf file
spatialIndexPlot(scores.red, annotation,
label.col = "ground_truth",
out.file="spatialIndexPlot")
```
\
Finally, we provide function to assess statistical associations of L-R gene
signature scores with the different user-defined areas of a sample.
Based on these associations, a visualization tool can represent the
latter in the form of a heatmap.
\
```{r spatialPlot6,eval=TRUE,fig.dim = c(6,4.5)}
# statistical association with tissue areas based on correlations
# For display purpose, we only use a subset here
assoc.bsr.corr <- spatialAssociation(scores.red[c(1:17), ],
annotation, label.col = "ground_truth",test = "Spearman")
head(assoc.bsr.corr)
spatialAssociationPlot(assoc.bsr.corr)
```
We also provide 2D-projections (see `spatialDiversityPlot` function) to
assess diversity among L-R interaction spatial distributions over
an entire dataset. Other function as `generateSpatialPlots` can generate on disk
multiple individual spatial plots.
\
Note that we also describe more diverse use cases in the
[BulkSignalR github companion](
https://github.com/jcolinge/BulkSignalR_companion).
\
See the reference manual for all the details.
```{r inferCells,eval=FALSE,include=FALSE}
## Additional functions
#This is not part of the main workflow for analyzing LR interactions but
#we offer convenient functions for inferring cell types. However
#we recommend using other softwares specifically dedicated to this purpose.
# Inferring cell types
data(sdc, package = "BulkSignalR")
bsrdm <- BSRDataModel(counts = sdc)
bsrdm <- learnParameters(bsrdm)
bsrinf <- BSRInference(bsrdm)
# Common TME cell type signatures
data(immune.signatures, package = "BulkSignalR")
unique(immune.signatures$signature)
immune.signatures <- immune.signatures[immune.signatures$signature %in% c(
"B cells", "Dentritic cells", "Macrophages",
"NK cells", "T cells", "T regulatory cells"
), ]
data("tme.signatures", package = "BulkSignalR")
signatures <- rbind(immune.signatures,
tme.signatures[tme.signatures$signature %in%
c("Endothelial cells", "Fibroblasts"), ])
tme.scores <- scoreSignatures(bsrdm, signatures)
# assign cell types to interactions
lr2ct <- assignCellTypesToInteractions(bsrdm, bsrinf, tme.scores)
head(lr2ct)
# cellular network computation and plot
g.table <- cellularNetworkTable(lr2ct)
gCN <- cellularNetwork(g.table)
plot(gCN, edge.width = 5 * E(gCN)$score)
gSummary <- summarizedCellularNetwork(g.table)
plot(gSummary, edge.width = 1 + 30 * E(gSummary)$score)
# relationship with partial EMT---
# Should be tested HNSCC data instead of SDC!!
# find the ligands
data(p.EMT, package = "BulkSignalR")
gs <- p.EMT$gene
triggers <- relateToGeneSet(bsrinf, gs)
triggers <- triggers[triggers$n.genes > 1, ] # at least 2 target genes in the gs
ligands.in.gs <- intersect(triggers$L, gs)
triggers <- triggers[!(triggers$L %in% ligands.in.gs), ]
ligands <- unique(triggers$L)
# link to cell types
cf <- cellTypeFrequency(triggers, lr2ct, min.n.genes = 2)
missing <- setdiff(rownames(tme.scores), names(cf$s))
cf$s[missing] <- 0
cf$t[missing] <- 0
op <- par(mar = c(2, 10, 2, 2))
barplot(cf$s, col = "lightgray", horiz = T, las = 2)
par(op)
# random selections based on random gene sets
qval.thres <- 1e-3
inter <- LRinter(bsrinf)
tg <- tgGenes(bsrinf)
tcor <- tgCorr(bsrinf)
good <- inter$qval <= qval.thres
inter <- inter[good, ]
tg <- tg[good]
tcor <- tcor[good]
all.targets <- unique(unlist(tg))
r.cf <- list()
for (k in 1:100) { # should 1000 or more
r.gs <- sample(all.targets, length(intersect(gs, all.targets)))
r.triggers <- relateToGeneSet(bsrinf, r.gs, qval.thres = qval.thres)
r.triggers <- r.triggers[r.triggers$n.genes > 1, ]
r.ligands.in.gs <- intersect(r.triggers$L, r.gs)
r.triggers <- r.triggers[!(r.triggers$L %in% r.ligands.in.gs), ]
r <- cellTypeFrequency(r.triggers, lr2ct, min.n.genes = 2)
missing <- setdiff(rownames(tme.scores), names(r$s))
r$s[missing] <- 0
r$t[missing] <- 0
o <- order(names(r$t))
r$s <- r$s[o]
r$t <- r$t[o]
r.cf <- c(r.cf, list(r))
}
r.m.s <- foreach(i = seq_len(length(r.cf)), .combine = rbind) %do% {
r.cf[[i]]$s
}
# plot results
op <- par(mar = c(2, 10, 2, 2))
boxplot(r.m.s, col = "lightgray", horizontal = T, las = 2)
pts <- data.frame(x = as.numeric(cf$s[colnames(r.m.s)]), cty = colnames(r.m.s))
stripchart(x ~ cty, data = pts, add = TRUE, pch = 19, col = "red")
par(op)
for (cty in rownames(tme.scores)) {
cat(cty, ": P=", sum(r.m.s[, cty] >= cf$s[cty]) / nrow(r.m.s),
"\n", sep = "")
}
```
# Acknowledgements
We thank Guillaume Tosato for his help with the figures and
Gauthier Gadouas for testing the software on different platforms.
\
Thank you for reading this guide and for using `BulkSignalR`.
# Session Information
```{r session-info}
sessionInfo()
```