PROGENy is a resource that leverages a large compendium of publicly available signaling perturbation experiments to yield a common core of pathway-responsive genes. For each pathway, a collection of genes is available, along with each gene's contribution (weight) and significance (p-value) for that pathway.
Inside PROGENy, one can find gene signatures for 14 different pathways.
These signatures, coupled with any statistical method, can be used to infer pathway activities from bulk or single-cell transcriptomics. This vignette only shows how to access the signatures and inspect some of their properties. To infer pathway activities, please check out decoupleR, available in R and Python.
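As a rough illustration of what "coupled with a statistical method" can mean, the sketch below scores each pathway as a weighted sum of gene expression. It is a minimal base R example on made-up data, not the method implemented in decoupleR; the objects `net`, `expr`, `W` and `scores` and all values in them are hypothetical.

```r
# Toy signature table with the same kind of columns as the PROGENy model
# (gene, pathway, weight). All values are invented for illustration.
net <- data.frame(
  gene    = c("G1", "G2", "G3", "G2", "G4"),
  pathway = c("P1", "P1", "P1", "P2", "P2"),
  weight  = c(1.5, -0.5, 0.2, 1.0, -2.0),
  stringsAsFactors = FALSE
)

# Toy expression matrix: genes in rows, samples in columns.
expr <- matrix(
  c(2, 0,
    1, 1,
    0, 3,
    1, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("G1", "G2", "G3", "G4"), c("S1", "S2"))
)

# Keep only signature genes that are present in the expression matrix.
net <- net[net$gene %in% rownames(expr), ]

# Gene x pathway weight matrix, 0 where a gene is not part of a signature.
pathways <- unique(net$pathway)
W <- matrix(0, nrow = nrow(expr), ncol = length(pathways),
            dimnames = list(rownames(expr), pathways))
W[cbind(net$gene, net$pathway)] <- net$weight

# Weighted sum: one score per pathway (rows) and sample (columns).
scores <- t(W) %*% expr
scores
```

With real data, `net` would be replaced by a filtered PROGENy model and `expr` by a normalized expression matrix; the scores are usually scaled or compared against a null distribution before interpretation.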
First we load the necessary packages:
library(progeny)
library(dplyr)
library(ggplot2)
Here is how to retrieve all genes from each pathway in human:
model <- progeny::model_human_full
head(model)
#>       gene pathway      weight     p.value
#> 1     RFC2    EGFR  1.47064662 0.001655275
#> 2    ESRRA    EGFR  0.17858956 0.211837604
#> 3   HNRNPK    EGFR  0.30669860 0.084560061
#> 4     CBX6    EGFR -0.67550734 0.017641119
#> 5   ASRGL1    EGFR -0.25232814 0.295429969
#> 6 FLJ30679    MAPK -0.06047373 0.628461747
Here we can observe how individual genes respond to each pathway. For example, the gene CBX6 has a negative weight for EGFR, meaning that it is down-regulated when EGFR signaling is active. In contrast, the gene RFC2 has a positive weight for EGFR, meaning that it is up-regulated when EGFR signaling is active. The p.value column gives the significance of each gene for each pathway. To better estimate pathway activities, we recommend selecting the top 100 most significant genes for each pathway, or filtering by significance.
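Filtering by significance and reading the direction of the response can be sketched as follows. To stay self-contained, the example re-types the six rows printed above into a small data frame; the object names `model_head`, `model_sig` and `p_cutoff`, and the cutoff of 0.05, are assumptions made for illustration rather than recommendations from PROGENy.

```r
# The six rows shown by head(model), re-typed for a self-contained example.
model_head <- data.frame(
  gene    = c("RFC2", "ESRRA", "HNRNPK", "CBX6", "ASRGL1", "FLJ30679"),
  pathway = c("EGFR", "EGFR", "EGFR", "EGFR", "EGFR", "MAPK"),
  weight  = c(1.47064662, 0.17858956, 0.30669860,
              -0.67550734, -0.25232814, -0.06047373),
  p.value = c(0.001655275, 0.211837604, 0.084560061,
              0.017641119, 0.295429969, 0.628461747),
  stringsAsFactors = FALSE
)

# Keep genes below a significance cutoff (0.05 is an arbitrary assumption).
p_cutoff <- 0.05
model_sig <- model_head[model_head$p.value < p_cutoff, ]

# The sign of the weight gives the direction of the response.
model_sig$direction <- ifelse(model_sig$weight > 0, "up", "down")
model_sig
```

On these six rows only RFC2 (up) and CBX6 (down) pass the cutoff, which matches the interpretation given above. The same subsetting applies to the full `model` object.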
We can visualize the distribution of weights for the top 100 genes per pathway:
# Get top 100 significant genes per pathway
model_100 <- model %>%
  group_by(pathway) %>%
  slice_min(order_by = p.value, n = 100)
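# Plot the weight distribution of each pathway in its own panel
ggplot(data = model_100, aes(x = weight, color = pathway, fill = pathway)) +
  geom_density() +
  facet_wrap(~ pathway, scales = 'free') +
  xlab('scores') +
  ylab('densities') +
  # theme_bw() is a complete theme and would override earlier theme() calls,
  # so it is applied before the custom settings
  theme_bw() +
  theme(text = element_text(size = 12), legend.position = "none")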
Each pathway shows a different distribution of weights. Some pathways up-regulate more genes than they down-regulate, like NFkB or TNFa, while others do the opposite, like PI3K or VEGF.
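The balance between up- and down-regulated genes can also be checked numerically by counting the sign of the weights per pathway. The sketch below uses a small made-up table (`toy`, with hypothetical pathways "A" and "B"); the same two lines should work on `model_100`, assuming it has `pathway` and `weight` columns as shown earlier.

```r
# Made-up weights for two hypothetical pathways.
toy <- data.frame(
  pathway = c("A", "A", "A", "B", "B", "B"),
  weight  = c(1.2, 0.4, -0.3, -2.0, -0.7, 0.5),
  stringsAsFactors = FALSE
)

# Classify each gene by the sign of its weight.
toy$direction <- factor(ifelse(toy$weight > 0, "up", "down"),
                        levels = c("down", "up"))

# Cross-tabulate: rows are pathways, columns are directions.
counts <- table(toy$pathway, toy$direction)
counts
```

A pathway whose "up" count clearly exceeds its "down" count mostly activates its target genes, and vice versa.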