Title: | Analyze and interpret RNA-Seq results |
---|---|
Description: | pathlinkR is an R package designed to facilitate analysis of RNA-Seq results. Specifically, our aim with pathlinkR was to provide a number of tools which take a list of DE genes and perform different analyses on them, aiding with the interpretation of results. Functions are included to perform pathway enrichment, with multiple databases supported, and tools for visualizing these results. Genes can also be used to create and plot protein-protein interaction networks, all from inside of R. |
Authors: | Travis Blimkie [cre], Andy An [aut] |
Maintainer: | Travis Blimkie <[email protected]> |
License: | GPL-3 + file LICENSE |
Version: | 1.3.7 |
Built: | 2024-12-24 03:30:01 UTC |
Source: | https://github.com/bioc/pathlinkR |
Internal function which is used to create even breaks for volcano plots produced by eruption.
.eruptionBreaks(x)
x |
Length-two numeric vector to manually specify limits of the x-axis in log2 fold change; defaults to NA which lets ggplot2 determine the best values. |
ggplot scale object
https://github.com/hancockinformatics/pathlinkR
Helper function to handle heatmap legends without cluttering up the main function.
.plotFoldChangeLegend(.matFC, .log2FoldChange, .cellColours)
.matFC |
Matrix of fold change values |
.log2FoldChange |
Boolean denoting if values will be in log2 |
.cellColours |
Colours for fold change values |
A list containing heatmap legend parameters and colour function
https://github.com/hancockinformatics/pathlinkR
Internal wrapper function to run Sigora and return the results with desired columns
.runSigora(enrichGenes, gpsRepo, gpsLevel, pValFilter = NA)
enrichGenes |
Vector of genes to enrich |
gpsRepo |
GPS object to use for testing pathways |
gpsLevel |
Level to use for enrichment testing |
pValFilter |
Desired threshold for filtering results |
A "data.frame" (tibble) of results from Sigora
https://cran.r-project.org/package=sigora
https://github.com/hancockinformatics/pathlinkR
Trims a character string to the desired length, without breaking in the middle of a word (i.e. chops at the nearest space). Appends an ellipsis at the end to indicate some text has been removed.
.truncNeatly(x, l = 60)
x |
Character to be truncated |
l |
Desired maximum length for the output character |
Character vector
https://github.com/hancockinformatics/pathlinkR
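The word-boundary trimming described for .truncNeatly can be illustrated with a small base R sketch. The helper name truncAtWord is hypothetical and this is not the package's internal code; it only shows one way to cut at the nearest preceding space and append an ellipsis. In this sketch the ellipsis is added after trimming, so the output can be up to three characters longer than l.

```r
# Hypothetical helper illustrating word-boundary truncation;
# not the internal implementation used by the package
truncAtWord <- function(x, l = 60) {
  vapply(x, function(s) {
    # Leave missing values and short strings untouched
    if (is.na(s) || nchar(s) <= l) return(s)

    head <- substr(s, 1, l)
    nextChar <- substr(s, l + 1, l + 1)

    # If the cut falls inside a word, drop the trailing partial word
    if (!grepl("\\s", nextChar)) {
      head <- sub("\\s+\\S*$", "", head)
    }
    paste0(trimws(head, which = "right"), "...")
  }, character(1), USE.NAMES = FALSE)
}

truncAtWord("Regulation of innate immune responses to cytosolic DNA", l = 30)
```

A string with no whitespace in its first l characters is simply cut at l in this sketch, since there is no word boundary to fall back on.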
Creates a volcano plot of genes from RNA-Seq results, with various options for tweaking the appearance. Ensembl gene IDs should be the rownames of the input object.
eruption( rnaseqResult, columnFC = NA, columnP = NA, pCutoff = 0.05, fcCutoff = 1.5, labelCutoffs = FALSE, baseColour = "steelblue4", nonsigColour = "lightgrey", alpha = 0.5, pointSize = 1, title = NA, nonlog2 = FALSE, xaxis = NA, yaxis = NA, highlightGenes = c(), highlightColour = "red", highlightName = "Selected", label = "auto", n = 10, manualGenes = c(), removeUnannotated = TRUE, labelSize = 3.5, pad = 1.4 )
rnaseqResult |
Data frame of RNASeq results, with Ensembl gene IDs as rownames. Can be a "DESeqResults" or "TopTags" object, or a simple data frame. See "Details" for more information. |
columnFC |
Character; Column to plot along the x-axis, typically log2
fold change values. Only required when |
columnP |
Character; Column to plot along the y-axis, typically nominal
or adjusted p values. Only required when |
pCutoff |
Adjusted p value cutoff, defaults to < 0.05 |
fcCutoff |
Absolute fold change cutoff, defaults to > 1.5 |
labelCutoffs |
Logical; Should cutoff lines for p value and fold change
be labeled? Size of the label is controlled by |
baseColour |
Colour of points for all significant DE genes ("steelblue4") |
nonsigColour |
Colour of non-significant DE genes ("lightgrey") |
alpha |
Transparency of the points (0.5) |
pointSize |
Size of the points (1) |
title |
Title of the plot |
nonlog2 |
Show non-log2 fold changes instead of log2 fold change (FALSE) |
xaxis |
Length-two numeric vector to manually specify limits of the x-axis in log2 fold change; defaults to NA which lets ggplot2 determine the best values. |
yaxis |
Length-two numeric vector to manually specify limits of the y-axis (in -log10). Defaults to NA which lets ggplot2 determine the best values. |
highlightGenes |
Vector of genes to emphasize by colouring differently (e.g. genes of interest). Must be Ensembl IDs. |
highlightColour |
Colour for the genes specified in |
highlightName |
Optional name to call the |
label |
When set to "auto" (default), label the top |
n |
number of top up- and down-regulated genes to label. Applies when
|
manualGenes |
If |
removeUnannotated |
Boolean (TRUE): Remove genes without annotations (no HGNC symbol). |
labelSize |
Size of font for labels (3.5) |
pad |
Padding of labels; adjust this if the labels overlap |
The input to eruption() can be of class "DESeqResults" (from DESeq2), "TopTags" (edgeR), or a simple data frame. When providing either of the former, the columns to plot are automatically pulled ("log2FoldChange" and "padj" for DESeqResults, or "logFC" and "FDR" for TopTags). Otherwise, the arguments "columnFC" and "columnP" must be specified. If one wishes to override the default behaviour for "DESeqResults" or "TopTags" (e.g. plot nominal p values on the y-axis), convert those objects to data frames, then supply "columnFC" and "columnP".
The argument highlightGenes can be used to draw attention to a specific set of genes, e.g. those from a pathway of interest. Setting the argument label="highlight" will also mean those same genes (at least some of them) will be given labels, further emphasizing them in the volcano plot.
Since this function returns a ggplot object, further custom changes could be applied using the standard ggplot2 functions (labs(), theme(), etc.).
Volcano plot of genes from an RNA-Seq experiment; a "ggplot" object
https://github.com/hancockinformatics/pathlinkR
data("exampleDESeqResults")
eruption(rnaseqResult=exampleDESeqResults[[1]])
List of example results from DESeq2
data(exampleDESeqResults)
A list of two "DESeqResults" objects, each with 5000 rows and 6 columns:
A combined score for the gene
Fold change value for the gene
Standard error for the fold change value
The statistic value
The nominal p value for the gene
The adjusted p value for the gene
An object of class "list"
For details on DESeq2 and its data structures/methods, please see https://bioconductor.org/packages/DESeq2/
Given a data frame of pathways and their member genes, calculate the pairwise distances using a constructed identity matrix. Zero means two pathways are identical, while one means two pathways share no genes in common.
getPathwayDistances(pathwayData = sigoraDatabase, distMethod = "jaccard")
pathwayData |
Three column data frame of pathways and their constituent
genes. Defaults to the provided |
distMethod |
Character; method used to determine pairwise pathway
distances. Can be any option supported by |
Matrix of the pairwise pathway distances (dissimilarity) based on overlap of their constituent genes; object of class "matrix".
None.
https://github.com/hancockinformatics/pathlinkR
# Here we'll use a subset of all the pathways, to save time
data("sigoraDatabase")
getPathwayDistances(
    pathwayData=dplyr::slice_head(
        dplyr::arrange(sigoraDatabase, pathwayId),
        prop=0.05
    ),
    distMethod="jaccard"
)
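To make the distance calculation concrete, the following base R sketch builds a pathway-by-gene identity matrix from a toy table and computes Jaccard distances by hand. The toy data, its column names, and the manual arithmetic are assumptions for illustration only; getPathwayDistances itself may compute the distances differently.

```r
# Toy pathway table; column names are illustrative assumptions
toyPathways <- data.frame(
  pathwayId = c("P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3"),
  gene      = c("A", "B", "C", "A", "B", "C", "D", "E")
)

# Identity matrix: rows are pathways, columns are genes, 1 = member
identityMat <- unclass(table(toyPathways$pathwayId, toyPathways$gene)) > 0
storage.mode(identityMat) <- "numeric"

# Number of shared genes for every pair of pathways
shared <- identityMat %*% t(identityMat)

# Size of the union = |A| + |B| - |A and B|
sizes <- rowSums(identityMat)
unionSize <- outer(sizes, sizes, "+") - shared

# Jaccard distance: 0 = identical gene sets, 1 = no genes in common
jaccardDist <- 1 - shared / unionSize
jaccardDist
```

Here P1 and P2 contain the same genes, so their distance is 0, while P3 shares nothing with either and sits at a distance of 1.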
Colour assignments for grouped pathways
data(groupedPathwayColours)
A length 8 named vector of hex colour values
An object of class "character"
Table of Hallmark gene sets and their genes
data(hallmarkDatabase)
A data frame (tibble) with 8,209 rows and 2 columns
Name of the Hallmark Gene Set
Ensembl gene IDs
An object of class "tbl", "tbl_df", "data.frame"
For more information on the MSigDB Hallmark gene sets, please see https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp
A data frame containing human PPI data from InnateDB, from the entry "All Experimentally Validated Interactions (updated weekly)" at https://innatedb.com/redirect.do?go=downloadImported. A few important steps have been taken to filter the data, namely the removal of duplicate interactions, and removing interactions that have the same components but are swapped between A and B.
data(innateDbPPI)
A data frame (tibble) with 152,256 rows and 2 columns:
Ensembl gene ID for the first gene/protein in the interaction
Ensembl gene ID for the second gene/protein in the interaction
An object of class "tbl", "tbl_df", "data.frame"
For more details on the data sourced from InnateDB, please see their website: https://www.innatedb.com
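The two filtering steps mentioned above (dropping duplicate interactions and dropping pairs that differ only in the order of A and B) can be sketched in base R as follows. The toy identifiers and the column names geneA and geneB are hypothetical; this is not the code used to build innateDbPPI.

```r
# Toy interaction table with an exact duplicate (rows 1 and 3)
# and a swapped pair (rows 1 and 2); identifiers are made up
toyPPI <- data.frame(
  geneA = c("ENSG_A", "ENSG_B", "ENSG_A", "ENSG_C"),
  geneB = c("ENSG_B", "ENSG_A", "ENSG_B", "ENSG_D"),
  stringsAsFactors = FALSE
)

# Sort each pair alphabetically so A-B and B-A share the same key
first  <- pmin(toyPPI$geneA, toyPPI$geneB)
second <- pmax(toyPPI$geneA, toyPPI$geneB)

# Keep one row per unordered pair
uniquePPI <- unique(data.frame(geneA = first, geneB = second))
uniquePPI
```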
Table of KEGG pathways and genes
data(keggDatabase)
A data frame (tibble) with 32883 rows and 4 columns
KEGG pathway ID
Name of the KEGG pathway
Ensembl gene ID
HGNC gene symbol
An object of class "tbl", "tbl_df", "data.frame"
See https://kegg.jp for more information.
A data frame to aid in mapping human gene IDs between different formats, including Ensembl IDs, HGNC symbols, and Entrez IDs. Mapping information was sourced using biomaRt and AnnotationDbi.
data(mappingFile)
A data frame (tibble) with 43,993 rows and 3 columns
Ensembl IDs
HGNC symbols
NCBI Entrez IDs
An object of class "tbl", "tbl_df", "data.frame"
See https://bioconductor.org/packages/biomaRt/ and https://bioconductor.org/packages/AnnotationDbi/ for information on each of the utilized packages and functions.
Creates a tidygraph network object from the provided pathway information, ready to be visualized with pathnetGGraph or pathnetVisNetwork.
pathnetCreate( pathwayEnrichmentResult, columnId = "pathwayId", columnP = "pValueAdjusted", foundation, trim = TRUE, trimOrder = 1 )
pathwayEnrichmentResult |
Data frame of results from
|
columnId |
Character; column containing the Reactome pathway IDs. Defaults to "pathwayId". |
columnP |
Character; column containing the adjusted p values. Defaults to "pValueAdjusted". |
foundation |
List of pathway pairs to use in constructing a network.
Typically this will be the output from |
trim |
Remove independent subgraphs which don't contain any enriched
pathways (default is |
trimOrder |
Order to use when removing subgraphs; Higher values will
keep more non-enriched pathway nodes. Defaults to |
With the "trim" option enabled, nodes (pathways) and subgraphs which are not sufficiently connected to enriched pathways will be removed. How aggressively this is done can be controlled via the trimOrder argument, and the optimal value will depend on the number of enriched pathways and the number of interacting pathways (i.e. number of rows in "foundation").
A pathway network as a "tidygraph" object, with the following columns for nodes:
pathwayId |
Reactome pathway ID |
pathwayName |
Reactome pathway name |
comparison |
Name of source comparison, if this pathway was enriched |
direction |
Whether an enriched pathway was found in all genes or up- or down-regulated genes |
pValue |
Nominal p-value from the enrichment result |
pValueAdjusted |
Corrected p-value from the enrichment |
genes |
Candidate genes for the given pathway if it was enriched |
numCandidateGenes |
Number of candidate genes |
numBgGenes |
Number of background genes |
geneRatio |
Ratio of candidate and background genes |
totalGenes |
Total number of DE genes tested, for an enriched pathway |
topLevelPathway |
Highest level Reactome term for a given pathway |
groupedPathway |
Custom pathway category used in visualizations |
For edges, the following information is also included:
from |
Starting node (row number) for the edge |
to |
Ending node (row number) for the edge |
similarity |
Similarity of two nodes/pathways |
distance |
Inverse of similarity |
https://github.com/hancockinformatics/pathlinkR
data("sigoraDatabase", "sigoraExamples")

pathwayDistancesJaccard <- getPathwayDistances(
    pathwayData=dplyr::slice_head(
        dplyr::arrange(sigoraDatabase, pathwayId),
        prop=0.05
    ),
    distMethod="jaccard"
)

startingPathways <- pathnetFoundation(
    mat=pathwayDistancesJaccard,
    maxDistance=0.8
)

pathnetCreate(
    pathwayEnrichmentResult=sigoraExamples[grepl(
        "Pos",
        sigoraExamples$comparison
    ), ],
    foundation=startingPathways,
    trim=TRUE,
    trimOrder=1
)
From a "n by n" distance matrix, generate a table of interacting pathways to use in constructing a pathway network. The cutoff can be adjusted to have more or fewer edges in the final network, depending on the number of pathways involved, i.e. the number of enriched pathways you're trying to visualize.
The desired cutoff will also vary based on the distance measure used, so some trial-and-error may be needed to find an appropriate value.
pathnetFoundation(mat, maxDistance = NA, propToKeep = NA)
mat |
Matrix of distances between pathways, i.e. 0 means two pathways
are identical. Should match the output from |
maxDistance |
Numeric distance cutoff (less than or equal) used to
determine if two pathways should share an edge. Pathway pairs with a
distance of 0 are always removed. One of |
propToKeep |
Top proportion of pathway pairs to keep as edges, ranked
based on distance. One of |
A "data.frame" (tibble) of interacting pathway pairs with the following columns:
pathwayName1 |
Name of the first pathway in the pair |
pathwayName2 |
Name of the second pathway in the pair |
distance |
Distance measure for the two pathways |
pathway1 |
Reactome ID for the first pathway in the pair |
pathway2 |
Reactome ID for the second pathway in the pair |
None.
https://github.com/hancockinformatics/pathlinkR
data("sigoraDatabase")

pathwayDistancesJaccard <- getPathwayDistances(
    pathwayData=dplyr::slice_head(
        dplyr::arrange(sigoraDatabase, pathwayId),
        prop=0.05
    ),
    distMethod="jaccard"
)

startingPathways <- pathnetFoundation(
    mat=pathwayDistancesJaccard,
    maxDistance=0.8
)
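The effect of the maxDistance cutoff can be seen on a small hand-made matrix. The base R sketch below is only an illustration under stated assumptions: it takes each unordered pair once from the upper triangle, drops pairs with a distance of 0, and keeps pairs at or below the cutoff. The actual function may differ in details such as how pairs are enumerated or which columns are returned.

```r
# Toy 4 x 4 distance matrix; P1 and P2 are identical (distance 0)
distMat <- matrix(
  c(0,   0,   0.9, 0.4,
    0,   0,   0.9, 0.4,
    0.9, 0.9, 0,   0.7,
    0.4, 0.4, 0.7, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("P", 1:4), paste0("P", 1:4))
)

# Take each unordered pair once, from the upper triangle
idx <- which(upper.tri(distMat), arr.ind = TRUE)
pairs <- data.frame(
  pathway1 = rownames(distMat)[idx[, "row"]],
  pathway2 = colnames(distMat)[idx[, "col"]],
  distance = distMat[idx]
)

# Keep pairs that are close enough, excluding identical pathways
maxDistance <- 0.8
edges <- pairs[pairs$distance > 0 & pairs$distance <= maxDistance, ]
edges
```

Raising maxDistance to 0.9 would add the P1-P3 and P2-P3 pairs, which shows why some trial and error is usually needed to get a readable number of edges.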
Plots the network object generated from pathnetCreate, creating a visual representation of pathway similarity/interactions based on overlapping genes.
pathnetGGraph( network, networkLayout = "nicely", nodeSizeRange = c(4, 8), nodeBorderWidth = 1.5, nodeLabelSize = 5, nodeLabelColour = "black", nodeLabelAlpha = 0.67, nodeLabelOverlaps = 6, nodeLabelLength = 40, nodeLabelWrap = 20, labelProp = 0.25, segColour = "black", edgeColour = "grey30", edgeWidthRange = c(0.33, 3), edgeAlpha = 1, themeBaseSize = 16 )
network |
Tidygraph network object, output from |
networkLayout |
Desired layout for the network visualization. Defaults
to "nicely", but supports any method found in |
nodeSizeRange |
Size range for nodes, mapped to significance (Bonferroni
p-value). Defaults to |
nodeBorderWidth |
Width of borders on nodes, defaults to 1.5 |
nodeLabelSize |
Size of node labels; defaults to 5. |
nodeLabelColour |
Colour of the node labels; defaults to "black". |
nodeLabelAlpha |
Transparency of node labels. Defaults to |
nodeLabelOverlaps |
Max overlaps for node labels, from |
nodeLabelLength |
Length of the pathway name displayed before
truncation. Defaults to |
nodeLabelWrap |
Line length before pathway name is wrapped onto a new
line. Defaults to |
labelProp |
Proportion of "interactor" (i.e. non-enriched) pathways that
the function will attempt to label. E.g. setting this to 0.5 means half of
the non-enriched pathways will potentially be labeled (the default is 0.25) - it
won't be exact because the node labeling is done with |
segColour |
Colour of line segments connecting labels to nodes. Defaults to "black". |
edgeColour |
Colour of network edges; defaults to "grey30". |
edgeWidthRange |
Range of edge widths, mapped to |
edgeAlpha |
Alpha value for edges; defaults to |
themeBaseSize |
Base font size for all plot elements. Defaults
to |
A note regarding node labels: The function tries to prioritize labeling enriched pathways (filled nodes), with the labelProp argument determining roughly how many of the remaining interactor pathways might get labels. You'll likely need to tweak this value, and try different seeds, to get the desired effect.
A pathway network or "pathnet"; a plot object of class "ggplot"
None.
https://github.com/hancockinformatics/pathlinkR
data("sigoraDatabase", "sigoraExamples")

pathwayDistancesJaccard <- getPathwayDistances(
    pathwayData=dplyr::slice_head(
        dplyr::arrange(sigoraDatabase, pathwayId),
        prop=0.05
    ),
    distMethod="jaccard"
)

startingPathways <- pathnetFoundation(
    mat=pathwayDistancesJaccard,
    maxDistance=0.8
)

exPathnet <- pathnetCreate(
    pathwayEnrichmentResult=sigoraExamples[grepl(
        "Pos",
        sigoraExamples$comparison
    ), ],
    foundation=startingPathways,
    trim=TRUE,
    trimOrder=1
)

pathnetGGraph(
    exPathnet,
    labelProp=0.1,
    nodeLabelSize=4,
    nodeLabelOverlaps=8,
    segColour="red"
)
Plots the network object generated from pathnetCreate, creating a visual and interactive representation of similarities/interactions between pathways using their overlapping genes.
pathnetVisNetwork( network, networkLayout = "layout_nicely", nodeSizeRange = c(20, 50), nodeBorderWidth = 2.5, labelNodes = TRUE, nodeLabelSize = 60, nodeLabelColour = "black", nodeLabelLength = 40, edgeColour = "#848484", edgeWidthRange = c(5, 20), highlighting = TRUE )
network |
Tidygraph network object as output by |
networkLayout |
Desired layout for the network visualization. Defaults
to "layout_nicely", and should support most igraph layouts. See
|
nodeSizeRange |
Node size is mapped to the negative log of the
Bonferroni-adjusted p value, and this length-two numeric vector controls
the minimum and maximum. Defaults to |
nodeBorderWidth |
Size of the node border, defaults to 2.5 |
labelNodes |
Boolean determining if nodes should be labeled. Note it will only ever label enriched nodes/pathways. |
nodeLabelSize |
Size of the node labels in pixels; defaults to 60. |
nodeLabelColour |
Colour of the node labels; defaults to "black". |
nodeLabelLength |
Length of the pathway name displayed before
truncation. Defaults to |
edgeColour |
Colour of network edges; defaults to "#848484". |
edgeWidthRange |
Edge width is mapped to the similarity measure (one
over distance). This length-two numeric vector controls the minimum and
maximum width of edges. Defaults to |
highlighting |
When clicking on a node, should directly neighbouring nodes be highlighted (other nodes are dimmed)? Defaults to TRUE. |
This function makes use of the visNetwork library, which allows for various forms of interactivity, such as including text when hovering over nodes, node selection and dragging (including multiple selections), and highlighting nodes belonging to a larger group (e.g. top-level Reactome category).
An interactive pathway network or "pathnet"; object of class "visNetwork"
https://datastorm-open.github.io/visNetwork/
https://github.com/hancockinformatics/pathlinkR
data("sigoraDatabase", "sigoraExamples")

pathwayDistancesJaccard <- getPathwayDistances(
    pathwayData=dplyr::slice_head(
        dplyr::arrange(sigoraDatabase, pathwayId),
        prop=0.05
    ),
    distMethod="jaccard"
)

startingPathways <- pathnetFoundation(
    mat=pathwayDistancesJaccard,
    maxDistance=0.8
)

exPathnet <- pathnetCreate(
    pathwayEnrichmentResult=sigoraExamples[grepl(
        "Pos",
        sigoraExamples$comparison
    ), ],
    foundation=startingPathways,
    trim=TRUE,
    trimOrder=1
)

pathnetVisNetwork(exPathnet)
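The nodeSizeRange and edgeWidthRange arguments both describe a mapping from a statistic onto a minimum-to-maximum range. A minimal base R sketch of such a mapping is shown below. It assumes a base-10 logarithm and a simple linear rescale, neither of which is confirmed by the text above, and the helper rescaleToRange is hypothetical rather than part of the package.

```r
# Hypothetical helper: linearly rescale values onto c(min, max)
rescaleToRange <- function(values, range) {
  lo <- min(values)
  hi <- max(values)
  # With no spread, place every value in the middle of the range
  if (hi == lo) return(rep(mean(range), length(values)))
  range[1] + (values - lo) / (hi - lo) * (range[2] - range[1])
}

# Node size from the negative log of adjusted p values
# (log base 10 is an assumption made for this sketch)
pAdj <- c(0.05, 1e-3, 1e-6)
nodeSize <- rescaleToRange(-log10(pAdj), c(20, 50))

# Edge width from similarity, i.e. one over distance
distance <- c(0.5, 0.8, 0.25)
edgeWidth <- rescaleToRange(1 / distance, c(5, 20))

nodeSize
edgeWidth
```

The most significant pathway receives the largest node, and the closest pair of pathways receives the widest edge.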
A data frame containing all Reactome, Hallmark, and KEGG pathways/terms, along with a manually-curated top-level category for each entry.
data(pathwayCategories)
A data frame (tibble) with 3326 rows and 5 columns
Reactome, Hallmark, or KEGG pathway identifier
Pathway name
Top hierarchy pathway term, shortened in some cases
Top grouped pathway
Original top pathway name
An object of class "tbl", "tbl_df", "data.frame"
See https://reactome.org/, https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp, and https://kegg.jp for information on each of these databases.
This function provides a simple and consistent interface to three different pathway enrichment tools: Sigora and ReactomePA (which both test for Reactome pathways), and MSigDB Hallmark gene set enrichment.
pathwayEnrichment( inputList, columnFC = NA, columnP = NA, filterInput = TRUE, pCutoff = 0.05, fcCutoff = 1.5, split = TRUE, analysis = "sigora", filterResults = "default", gpsRepo = "reaH", gpsLevel = "default", geneUniverse = NULL, verbose = FALSE )
inputList |
A list, with each element containing RNA-Seq results as a "DESeqResults", "TopTags", or "data.frame" object. Rownames of each table must contain Ensembl Gene IDs. The list names are used as the comparison name for each element (e.g. "COVID vs Healthy"). See Details for more information on supported input types. |
columnFC |
Character; Column to plot along the x-axis, typically log2
fold change values. Only required when |
columnP |
Character; Column to plot along the y-axis, typically nominal
or adjusted p values. Only required when |
filterInput |
When providing list of data frames containing the
unfiltered RNA-Seq results (i.e. not all genes are significant), set this
to |
pCutoff |
Adjusted p value cutoff when filtering. Defaults to < 0.05. |
fcCutoff |
Minimum absolute fold change value when filtering. Defaults to > 1.5 |
split |
Boolean (TRUE); Split into up- and down-regulated DE genes using the fold change column, and do enrichment independently on each. Results are combined at the end, with an added "direction" column. |
analysis |
Method/database to use for enrichment analysis. The default is "sigora", but can also be "reactome"/"reactomepa", "hallmark", "kegg", "fgsea_reactome" or "fgsea_hallmark". |
filterResults |
Should the output be filtered for significance? Use |
gpsRepo |
Only applies to |
gpsLevel |
Only applies to |
geneUniverse |
Only applies when |
verbose |
Logical; If FALSE (the default), don't print info/progress messages. |
inputList must be a named list of RNA-Seq results, with each element being of class "DESeqResults" from DESeq2, "TopTags" from edgeR, or a simple data frame. For the first two cases, column names are expected to be the standard defined by each class ("log2FoldChange" and "padj" for "DESeqResults", and "logFC" and "FDR" for "TopTags"). Hence for these two cases the arguments columnFC and columnP can be left as NA.
In the last case (elements are "data.frame"), both columnFC and columnP must be supplied when filterInput=TRUE, and columnFC must be given if split=TRUE.
Setting analysis to any of "reactome", "reactomepa", "hallmark", or "kegg" will execute traditional over-representation analysis, the only difference being the database used ("reactome" and "reactomepa" are treated the same). Setting analysis="sigora" will use a gene pair-based approach, which can be performed on either Reactome data when gpsRepo="reaH" or KEGG data with gpsRepo="kegH".
A "data.frame" (tibble) of pathway enrichment results for all input comparisons, with the following columns:
comparison |
Source comparison from the names of |
direction |
Whether the pathway was enriched in all genes
( |
pathwayId |
Pathway identifier |
pathwayName |
Pathway name |
pValue |
Nominal p value for the pathway |
pValueAdjusted |
p value, corrected for multiple testing |
genes |
Candidate genes, which were DE for the comparison and also in the pathway |
numCandidateGenes |
Number of candidate genes |
numBgGenes |
Number of background genes for the pathway |
geneRatio |
Ratio of candidate and background genes |
totalGenes |
Number of DE genes which were tested for enriched pathways |
topLevelPathway |
High level Reactome term which serves to group similar pathways |
Sigora: https://cran.r-project.org/package=sigora ReactomePA: https://www.bioconductor.org/packages/ReactomePA/ Reactome: https://reactome.org/ MSigDB/Hallmark: https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp KEGG: https://www.kegg.jp/
https://github.com/hancockinformatics/pathlinkR
data("exampleDESeqResults")

pathwayEnrichment(
    inputList=exampleDESeqResults[1],
    filterInput=TRUE,
    split=TRUE,
    analysis="hallmark",
    filterResults="default"
)
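The filterInput and split options can be pictured with a short base R sketch on a toy data frame. This is an illustration, not the package's code: the toy gene names are made up, and comparing the log2 fold change against log2(fcCutoff) is an assumption about how the absolute fold change cutoff is applied.

```r
# Toy results with gene IDs as rownames; values are made up
toyResults <- data.frame(
  log2FoldChange = c(2.1, -1.8, 0.2, 1.2, -0.9),
  padj           = c(0.001, 0.01, 0.5, 0.2, 0.03),
  row.names      = paste0("gene", 1:5)
)

pCutoff  <- 0.05
fcCutoff <- 1.5

# Filtering step: keep genes passing both cutoffs
# (assumption: the fold change cutoff is applied on the log2 scale)
isSig <- toyResults$padj < pCutoff &
  abs(toyResults$log2FoldChange) > log2(fcCutoff)
sigGenes <- toyResults[isSig, ]

# Splitting step: separate by the sign of the fold change
upGenes   <- rownames(sigGenes)[sigGenes$log2FoldChange > 0]
downGenes <- rownames(sigGenes)[sigGenes$log2FoldChange < 0]

upGenes
downGenes
```

Each of the two gene vectors would then be tested for enrichment separately, which is what the "direction" column in the combined output records.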
Creates a plot to visualize and compare pathway enrichment results from multiple DE comparisons. Can automatically assign each pathway into an informative top-level category.
pathwayPlots( pathwayEnrichmentResults, columns = 1, specificTopPathways = "any", specificPathways = "any", colourValues = c("blue", "red"), nameWidth = 35, nameRows = 1, xAngle = "angled", maxPVal = 50, intercepts = NA, includeGeneRatio = FALSE, size = 4, legendMultiply = 1, showNumGenes = FALSE, pathwayPosition = "right", newGroupNames = NA, fontSize = 12 )
pathwayEnrichmentResults |
Data frame of results from the function
|
columns |
Number of columns to split the pathways across, particularly relevant if there are many significant pathways. Can specify up to 3 columns, with a default of 1. |
specificTopPathways |
Only plot pathways from a specific vector of
"topLevelPathway". Defaults to "any" which includes all pathway results, or
see |
specificPathways |
Only plot specific pathways. Defaults to "any". |
colourValues |
Length-two character vector of colours to use for the
scale. Defaults to |
nameWidth |
How many characters to show for pathway name before truncating? Defaults to 35. |
nameRows |
For pathway names (y axis), how many rows (lines) should names wrap across when they're too long? Defaults to 1. |
xAngle |
Angle of x axis labels, set to "angled" (45 degrees), "horizontal" (0 degrees), or "vertical" (90 degrees). |
maxPVal |
P values below |
intercepts |
Add vertical lines to separate different groupings, by
providing a vector of intercepts (e.g. |
includeGeneRatio |
Boolean (FALSE). Should the gene ratio be included as an aesthetic mapping? If so, then it is mapped to the size of the triangles. |
size |
Size of points if not scaling to gene ratio. Defaults to 4. |
legendMultiply |
Size of the legend, e.g. increase if there are a lot of pathways which makes the legend small and unreadable by comparison. Defaults to 1, i.e. no increase in legend size. |
showNumGenes |
Boolean, defaults to FALSE. Show the number of genes for each comparison as brackets under the comparison's name. |
pathwayPosition |
Whether to have the y-axis labels (pathway names) on the left or right side. Default is "right". |
newGroupNames |
Optional vector of new names for the comparisons, supplied in the order in which the comparisons appear. |
fontSize |
Base font size for all text elements of the plot. Defaults to 12. |
A plot of enriched pathways; a "ggplot" object
https://github.com/hancockinformatics/pathlinkR https://bioconductor.org/packages/fgsea/
data("sigoraExamples")
pathwayPlots(sigoraExamples, columns=2)
Creates a heatmap of fold change values from RNA-Seq results, with various parameters to tweak the appearance.
plotFoldChange( inputList, columnFC = NA, columnP = NA, pathName = NA, pathId = NA, genesToPlot = NA, manualTitle = NA, titleSize = 14, geneFormat = "ensembl", pCutoff = 0.05, fcCutoff = 1.5, cellColours = c("blue", "white", "red"), cellBorder = gpar(col = "grey"), plotSignificantOnly = TRUE, showStars = TRUE, hideNonsigFC = TRUE, vjust = 0.75, rot = 0, invert = FALSE, log2FoldChange = FALSE, colSplit = NA, clusterRows = TRUE, clusterColumns = FALSE, colAngle = 90, colCenter = TRUE, rowAngle = 0, rowCenter = FALSE )
inputList |
A list, with each element containing RNA-Seq results as a "DESeqResults", "TopTags", or "data.frame" object, with Ensembl gene IDs in the rownames. The list names are used as the comparison name for each dataframe (e.g. "COVID vs Healthy"). See Details for more information on supported input types. |
columnFC |
Character; Column to plot along the x-axis, typically log2
fold change values. Only required when |
columnP |
Character; Column to plot along the y-axis, typically nominal
or adjusted p values. Only required when |
pathName |
The name of a Reactome pathway to pull genes from, also used
for the plot title. Alternative to |
pathId |
ID of a Reactome pathway to pull genes from. Alternative to
|
genesToPlot |
Vector of Ensembl gene IDs you want to plot, instead of
pulling the genes from a pathway, i.e. this option and
|
manualTitle |
Provide your own title, overriding the use of a pathway name as the title. |
titleSize |
Font size for the title (14). |
geneFormat |
Type of genes given in |
pCutoff |
P value cutoff, default is <0.05 |
fcCutoff |
Absolute fold change cutoff, default is >1.5 |
cellColours |
Vector specifying desired colours to use for the cells in
the heatmap. Defaults to |
cellBorder |
A call to |
plotSignificantOnly |
Boolean (TRUE). Only plot genes that are
differentially expressed (i.e. they pass |
showStars |
Boolean (TRUE) show significance stars on the heatmap |
hideNonsigFC |
Boolean (TRUE). If a gene is significant in one
comparison but not in another, this will set the colour of the non-
significant gene as grey to visually emphasize the significant genes. If
set to FALSE, the colour will be set according to the fold change, and if the p
value passes |
vjust |
Adjustment of the position of the significance stars. Default is 0.75. May need to adjust if there are many genes. |
rot |
Rotation of the position of the significance stars. Default is 0. |
invert |
Boolean (FALSE). The default setting plots genes as rows and
comparisons as columns. Setting this to |
log2FoldChange |
Boolean (FALSE). Default plots the fold changes in the legend as the true fold change. Set to TRUE if you want log2 fold change. |
colSplit |
A vector, with the same length as |
clusterRows |
Boolean (TRUE). Whether to cluster the rows (genes). May
need to change if |
clusterColumns |
Boolean (FALSE). Whether to cluster the columns
(comparisons). Will override order of |
colAngle |
Angle of column text. Defaults to 90. |
colCenter |
Whether to center column text. Default is TRUE, but it
should be set to FALSE if the column name is angled (e.g. |
rowAngle |
Angle of row text, defaults to 0. |
rowCenter |
Whether to center row text. The default is FALSE, but it
should be set to TRUE if the row names are vertical (e.g. |
All elements of inputList
should belong to one of the following
classes: "DESeqResults" from DESeq2
, "TopTags" from edgeR
,
or a simple "data.frame". In the first two cases, the proper columns for
fold change and p values are detected automatically ("log2FoldChange" and
"padj" for "DESeqResults", or "logFC" and "FDR" for "TopTags"). In the
third case, the arguments columnFC
and columnP
must be
supplied. Additionally, if one wished to override the default columns for
either "DESeqResults" or "TopTags" objects, simply coerce the object to a
simple "data.frame" and supply columnFC
and columnP
as desired.
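As a minimal sketch of the third case, the plain data frames below use hypothetical column names ("myLogFC" and "myPadj") together with placeholder gene IDs and made-up values. The call to plotFoldChange is left commented out, since it assumes pathlinkR is installed and that real Ensembl IDs are supplied.

```r
# Helper for this sketch only: build a plain data frame of made-up results,
# with (placeholder) Ensembl gene IDs stored in the rownames
makeResult <- function(fc, p) {
    data.frame(
        myLogFC = fc,
        myPadj = p,
        row.names = c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003")
    )
}

# The list names become the comparison names
inputList <- list(
    "Treatment A vs Control" = makeResult(c(2.1, -1.8, 0.2), c(0.001, 0.020, 0.700)),
    "Treatment B vs Control" = makeResult(c(1.4, -0.3, 1.9), c(0.030, 0.500, 0.004))
)

# With plain data frames, both column arguments must be supplied:
# plotFoldChange(
#     inputList,
#     columnFC = "myLogFC",
#     columnP = "myPadj",
#     genesToPlot = rownames(inputList[[1]])
# )
```

For a "DESeqResults" or "TopTags" object, as.data.frame() is one plausible way to perform the coercion described above before choosing different columns.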
The cellColours
argument is designed to map a range of negative
and positive values to the three provided colours, with zero as the middle
colour. If the plotted matrix contains only positive (or negative) values,
then it will become a two-colour scale, white-to-red (or blue-to-white).
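The mapping described above can be illustrated with base R. This is only a sketch: mapCellColours is a hypothetical helper, not part of pathlinkR, and the symmetric range around zero is an assumption made for the illustration rather than the package's documented behaviour.

```r
# Hypothetical helper (not pathlinkR code): map numeric values to colours.
# Assumes at least one non-zero value is present.
mapCellColours <- function(values, colours = c("blue", "white", "red")) {
    if (all(values >= 0)) {
        # Only positive values: two-colour scale, middle colour to high colour
        ramp <- grDevices::colorRamp(colours[2:3])
        scaled <- values / max(values)
    } else if (all(values <= 0)) {
        # Only negative values: two-colour scale, low colour to middle colour
        ramp <- grDevices::colorRamp(colours[1:2])
        scaled <- 1 - values / min(values)
    } else {
        # Mixed signs: rescale [-limit, limit] to [0, 1] so zero lands on 0.5
        limit <- max(abs(values))
        ramp <- grDevices::colorRamp(colours)
        scaled <- (values + limit) / (2 * limit)
    }
    grDevices::rgb(ramp(scaled), maxColorValue = 255)
}

mixedColours <- mapCellColours(c(-2, 0, 2))
positiveColours <- mapCellColours(c(0, 2))
```

With mixed signs, zero maps to the middle colour; with only positive values, the smallest value maps to the middle colour and the largest to the high colour.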
The colSplit
argument can be used to define larger groups represented in
inputList
. For example, consider an experiment comparing two different
treatments to an untreated control, in both wild type and mutant cells.
This would give the following comparisons:
"wildtype_treatment1_vs_untreated", "wildtype_treatment2_vs_untreated",
"mutant_treatment1_vs_untreated", and "mutant_treatment2_vs_untreated".
One could then specify colSplit
as
c("Wild type", "Wild type", "Mutant", "Mutant")
to make the wild type
and mutant results more visually distinct.
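Rather than typing the groups by hand, the vector could be derived from the comparison names. A minimal sketch in base R, assuming the names follow the pattern from the example above:

```r
# Comparison names, in the same order as the elements of inputList
comparisons <- c(
    "wildtype_treatment1_vs_untreated",
    "wildtype_treatment2_vs_untreated",
    "mutant_treatment1_vs_untreated",
    "mutant_treatment2_vs_untreated"
)

# One group label per comparison, so the lengths always match
colSplit <- ifelse(startsWith(comparisons, "wildtype"), "Wild type", "Mutant")
```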
A heatmap of fold changes for genes of interest; a "ggplot" class object
https://bioconductor.org/packages/ComplexHeatmap/
https://github.com/hancockinformatics/pathlinkR
data("exampleDESeqResults")
plotFoldChange(
    exampleDESeqResults,
    pathName="Generation of second messenger molecules"
)
Creates a protein-protein interaction (PPI) network using data from InnateDB, with options for network order, and filtering input.
ppiBuildNetwork( rnaseqResult, filterInput = TRUE, columnFC = NA, columnP = NA, pCutoff = 0.05, fcCutoff = 1.5, order = "zero", hubMeasure = "betweenness", ppiData = innateDbPPI )
rnaseqResult |
An object of class "DESeqResults", "TopTags", or a simple data frame. See Details for more information on input types. |
filterInput |
If providing list of data frames containing the
unfiltered output from |
columnFC |
Character; optional column containing fold change values,
used only when |
columnP |
Character; optional column containing p values, used only
when |
pCutoff |
Adjusted p value cutoff, defaults to <0.05 |
fcCutoff |
Absolute fold change cutoff, defaults to an absolute value of >1.5 |
order |
Desired network order. Possible options are "zero" (default), "first", and "minSimple". |
hubMeasure |
Character denoting what measure should be used in
determining which nodes to highlight as hubs when plotting the network.
Options include "betweenness" (default), "degree", and "hubscore". These
represent network statistics calculated by their respective
|
ppiData |
Data frame of PPI data; must contain rows of interactions as pairs of Ensembl gene IDs, with columns named "ensemblGeneA" and "ensemblGeneB". Defaults to pre-packaged InnateDB PPI data. |
The input to ppiBuildNetwork()
can be a "DESeqResults" object
(from DESeq2
), "TopTags" (edgeR
), or a simple data frame.
When not providing a basic data frame, the columns for filtering are
automatically pulled ("log2FoldChange" and "padj" for DESeqResults, or
"logFC" and "FDR" for TopTags). Otherwise, the arguments "columnFC" and
"columnP" must be specified.
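To make the filtering step concrete, here is a sketch in base R using a plain data frame with made-up values. It assumes the fold change column is on a log2 scale, so that the absolute fold change cutoff is converted with log2() before comparison; that conversion is an assumption of this sketch, not something stated above.

```r
# Made-up results with edgeR-style column names and placeholder genes
results <- data.frame(
    gene = c("geneA", "geneB", "geneC", "geneD"),
    logFC = c(2.0, 0.3, -1.2, 1.5),
    FDR = c(0.001, 0.010, 0.200, 0.040),
    stringsAsFactors = FALSE
)

pCutoff <- 0.05
fcCutoff <- 1.5

# Keep genes passing both thresholds (log2 conversion is assumed here)
keep <- results$FDR < pCutoff & abs(results$logFC) > log2(fcCutoff)
filtered <- results[keep, ]
```

In this sketch "geneB" fails the fold change threshold and "geneC" fails the p value threshold, leaving "geneA" and "geneD".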
The "hubMeasure" argument determines how ppiBuildNetwork
assesses
connectedness of nodes in the network, which will be used to highlight
nodes when visualizing with ppiPlotNetwork
. The options are "degree",
"betweenness", or "hubscore". This last option uses the igraph
implementation of the Kleinberg hub centrality score - details on this
method can be found at ?igraph::hub_score
.
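Of the three measures, "degree" is the simplest to reason about. The sketch below counts interactions per node in base R from a small made-up interaction table; the column names follow those required by ppiData, while the gene names are placeholders. It is an illustration only, not how the package computes its statistics.

```r
# Made-up interaction pairs, one row per interaction
ppi <- data.frame(
    ensemblGeneA = c("gene1", "gene1", "gene1", "gene2"),
    ensemblGeneB = c("gene2", "gene3", "gene4", "gene3"),
    stringsAsFactors = FALSE
)

# Degree: how many interactions each gene takes part in
degree <- table(c(ppi$ensemblGeneA, ppi$ensemblGeneB))
degree <- sort(degree, decreasing = TRUE)
```

Here "gene1" has the highest degree, so it would be the most likely candidate for highlighting as a hub under this measure.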
A Protein-Protein Interaction (PPI) network; a "tidygraph" object for plotting or further analysis, with the minimum set of columns for nodes (additional columns from the input will also be included):
name |
Ensembl gene ID for the node |
degree |
Degree of the node, i.e. the number of interactions |
betweenness |
Betweenness measure for the node |
seed |
TRUE when the node was part of the input list of genes |
hubScore |
Special hubScore for each node. The suffix denotes the measure being used; e.g. "hubScoreBtw" is for betweenness |
hgncSymbol |
HGNC gene name for the node |
Additionally the following columns are provided for edges:
from |
Starting node for the interaction/edge as a row number |
to |
Ending node for the interaction/edge as a row number |
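Because "from" and "to" hold row numbers rather than identifiers, they need to be looked up in the node table to recover gene names. A minimal base R sketch with placeholder node and edge tables (not extracted from a real network object):

```r
# Placeholder node and edge tables mirroring the columns described above
nodes <- data.frame(
    name = c("geneA", "geneB", "geneC"),
    stringsAsFactors = FALSE
)
edges <- data.frame(from = c(1L, 1L, 2L), to = c(2L, 3L, 3L))

# Translate row numbers into node names
edgesNamed <- data.frame(
    from = nodes$name[edges$from],
    to = nodes$name[edges$to],
    stringsAsFactors = FALSE
)
```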
InnateDB: https://www.innatedb.com/
https://github.com/hancockinformatics/pathlinkR/
data("exampleDESeqResults")
ppiBuildNetwork(
    rnaseqResult=exampleDESeqResults[[1]],
    filterInput=TRUE,
    order="zero"
)
Takes a network file (GraphML or JSON) and processes it into a tidygraph object, adding network statistics along the way.
ppiCleanNetwork(network)
network |
|
This function was designed so that networks created by other
packages or websites (e.g. https://networkanalyst.ca) could be imported
and visualized with ppiPlotNetwork
.
A Protein-Protein Interaction (PPI) network; a "tidygraph" object, with the minimal set of columns (other columns from the input are also included):
name |
Identifier for the node |
degree |
Degree of the node, i.e. the number of interactions |
betweenness |
Betweenness measure for the node |
seed |
TRUE when the node was part of the input list of genes |
hubScore |
Special hubScore for each node. The suffix denotes the measure being used; e.g. "hubScoreBtw" is for betweenness |
hgncSymbol |
HGNC gene name for the node |
Additionally the following columns are provided for edges:
from |
Starting node for the interaction/edge as a row number |
to |
Ending node for the interaction/edge as a row number |
https://github.com/hancockinformatics/pathlinkR/
tj1 <- jsonlite::read_json(
    system.file("extdata/networkAnalystExample.json", package="pathlinkR"),
    simplifyVector=TRUE
)
tj2 <- igraph::graph_from_data_frame(
    d=dplyr::select(tj1$edges, source, target),
    directed=FALSE,
    vertices=dplyr::select(
        tj1$nodes, id, label, x, y, "types"=molType, expr
    )
)
tj3 <- ppiCleanNetwork(tidygraph::as_tbl_graph(tj2))
Test a PPI network for enriched pathways
ppiEnrichNetwork( network, analysis = "sigora", filterResults = "default", gpsRepo = "default", geneUniverse = NULL )
network |
A "tidygraph" network object, with Ensembl IDs in the first column of the node table |
analysis |
Default is "sigora", but can also be "reactomepa" or "hallmark" |
filterResults |
Should the output be filtered for significance? Use
|
gpsRepo |
Only applies to |
geneUniverse |
Only applies when |
A "data.frame" (tibble) of enriched pathways, with the following columns:
pathwayId |
Pathway identifier |
pathwayName |
Pathway name |
pValue |
Nominal p value for the pathway |
pValueAdjusted |
p value corrected for multiple testing |
genes |
Candidate genes, which were DE for the comparison and also in the pathway |
numCandidateGenes |
Number of candidate genes |
numBgGenes |
Number of background genes for the pathway |
geneRatio |
Ratio of candidate and background genes |
totalGenes |
Number of DE genes which were tested for enriched pathways |
topLevelPathway |
High level Reactome term which serves to group similar pathways |
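As an illustration of how these columns relate, the sketch below builds a small made-up table and derives "geneRatio" from the two count columns. It assumes the ratio is candidate genes divided by background genes, and the 0.05 threshold is chosen purely for the example.

```r
# Made-up enrichment results using the column names listed above
enriched <- data.frame(
    pathwayName = c("Pathway one", "Pathway two"),
    pValueAdjusted = c(0.003, 0.2),
    numCandidateGenes = c(12, 4),
    numBgGenes = c(48, 80),
    stringsAsFactors = FALSE
)

# Assumed definition: candidate genes divided by background genes
enriched$geneRatio <- enriched$numCandidateGenes / enriched$numBgGenes

# Keep pathways passing an illustrative adjusted p value threshold
significant <- enriched[enriched$pValueAdjusted < 0.05, ]
```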
Sigora: https://cran.r-project.org/package=sigora ReactomePA: https://www.bioconductor.org/packages/ReactomePA/ MSigDB/Hallmark: https://www.gsea-msigdb.org/gsea/msigdb/collections.jsp
https://github.com/hancockinformatics/pathlinkR
data("exampleDESeqResults")
exNetwork <- ppiBuildNetwork(
    rnaseqResult=exampleDESeqResults[[1]],
    filterInput=TRUE,
    order="zero"
)
ppiEnrichNetwork(
    network=exNetwork,
    analysis="hallmark"
)
Extract a subnetwork based on pathway genes
ppiExtractSubnetwork( network, genes = NULL, pathwayEnrichmentResult = NULL, pathwayToExtract )
network |
Input network object; output from |
genes |
Character vector of Ensembl gene IDs to use as the starting
point to extract a subnetwork from the initial network. You must provide
either the |
pathwayEnrichmentResult |
Pathway enrichment result, output from
|
pathwayToExtract |
Name of the pathway determining what genes (nodes)
are pulled from the input network. Must be present in the "pathwayName"
column of |
Uses functions from the igraph package to extract a minimally connected subnetwork from the starting network, using either a list of Ensembl genes or genes from an enriched pathway as the basis. To see what genes were pulled out for the pathway, see the "starters" attribute of the output network.
A Protein-Protein Interaction (PPI) network; a "tidygraph" object for plotting or further analysis, with the minimum set of columns for nodes (additional columns from the input will also be included):
name |
Ensembl gene ID for the node |
degree |
Degree of the node, i.e. the number of interactions |
betweenness |
Betweenness measure for the node |
seed |
TRUE when the node was part of the input list of genes |
hubScore |
Special hubScore for each node. The suffix denotes the measure being used; e.g. "hubScoreBtw" is for betweenness |
hgncSymbol |
HGNC gene name for the node |
Additionally the following columns are provided for edges:
from |
Starting node for the interaction/edge as a row number |
to |
Ending node for the interaction/edge as a row number |
Code for network module (subnetwork) extraction was based on that used in "jboktor/NetworkAnalystR" on GitHub.
https://github.com/hancockinformatics/pathlinkR
data("exampleDESeqResults")
exNetwork <- ppiBuildNetwork(
    rnaseqResult=exampleDESeqResults[[1]],
    filterInput=TRUE,
    order="zero"
)
exPathways <- ppiEnrichNetwork(
    network=exNetwork,
    analysis="hallmark"
)
ppiExtractSubnetwork(
    network=exNetwork,
    pathwayEnrichmentResult=exPathways,
    pathwayToExtract="INTERFERON ALPHA RESPONSE"
)
Visualize a protein-protein interaction (PPI) network using
ggraph
functions, output from ppiBuildNetwork
.
ppiPlotNetwork( network, networkLayout = "nicely", title = NA, nodeSize = c(2, 6), fillColumn, fillType, catFillColours = "Set1", foldChangeColours = c("firebrick3", "#188119"), intColour = "grey70", nodeBorder = "grey30", hubColour = "blue2", subnetwork = TRUE, legend = FALSE, legendTitle = NULL, edgeColour = "grey40", edgeAlpha = 0.5, edgeWidth = 0.5, label = FALSE, labelColumn, labelFilter = 5, labelSize = 4, labelColour = "black", labelFace = "bold", labelPadding = 0.25, minSegLength = 0.25 )
network |
A |
networkLayout |
Layout of nodes in the network. Supports all layouts
from |
title |
Optional title for the plot (NA) |
nodeSize |
Length-two numeric vector, specifying the range of node
sizes (maps to node degree). Default is |
fillColumn |
Tidy-select column for mapping node colour. Designed to
handle continuous numeric mappings (either positive/negative only, or
both), and categorical mappings, plus a special case for displaying fold
changes from, for example, RNA-Seq data. See |
fillType |
String denoting type of fill mapping to perform for nodes. Options are: "foldChange", "twoSided", "oneSided", or "categorical". |
catFillColours |
Colour palette to be used when |
foldChangeColours |
A two-length character vector containing colours
for up and down regulated genes. Defaults to |
intColour |
Fill colour for non-seed nodes, i.e. interactors. Defaults to "grey70". |
nodeBorder |
Colour (stroke or outline) of all nodes in the network. Defaults to "grey30". |
hubColour |
Colour of node labels for hubs. The top 2% of nodes
(based on calculated hub score) are highlighted with this colour, if
|
subnetwork |
Logical determining if networks from
|
legend |
Should a legend be included? Defaults to FALSE. |
legendTitle |
Optional title for the legend, defaults to |
edgeColour |
Edge colour, defaults to "grey40" |
edgeAlpha |
Transparency of edges, defaults to 0.5 |
edgeWidth |
Thickness of edges connecting nodes. Defaults to 0.5 |
label |
Boolean, whether labels should be added to nodes. Defaults to FALSE. |
labelColumn |
Tidy-select column of the network/data to be used in
labeling nodes. Recommend setting to |
labelFilter |
Degree filter used to determine which nodes should be labeled. Defaults to 5. This value can be increased to reduce the number of node labels, to prevent the network from being too crowded. |
labelSize |
Size of node labels, defaults to 4. |
labelColour |
Colour of node labels, defaults to "black" |
labelFace |
Font face for node labels, defaults to "bold" |
labelPadding |
Padding around the label, defaults to 0.25 lines. |
minSegLength |
Minimum length of lines to be drawn from labels to points. The default specified here is 0.25, half of the normal default value. |
Any layout supported by ggraph can be specified here - see
?layout_tbl_graph_igraph
for a list of options. Or you can supply a data
frame containing coordinates for each node. The first and second columns
will be used for x and y, respectively. Note that having columns named "x"
and "y" in the input network will generate a warning message when supplying
custom coordinates.
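A data frame of custom coordinates might be built as follows. This is a sketch only: it places four placeholder nodes on a circle, and it assumes one row per node in the same order as the node table. The ppiPlotNetwork call is commented out because it needs pathlinkR and a real network object.

```r
# Placeholder node names; a real network would supply its own
nodeNames <- c("geneA", "geneB", "geneC", "geneD")

# Evenly spaced angles around a circle, one per node
angles <- 2 * pi * (seq_along(nodeNames) - 1) / length(nodeNames)

# First column is used for x, second column for y
customLayout <- data.frame(x = cos(angles), y = sin(angles))

# ppiPlotNetwork(
#     network = exNetwork,
#     networkLayout = customLayout,
#     fillColumn = log2FoldChange,
#     fillType = "foldChange"
# )
```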
Since this function returns a standard ggplot object, you can further customize the final appearance using the usual ggplot2 functions, e.g. labs() and theme().
The fillType
argument will determine how the node colour is mapped to
the desired column. "foldChange" represents a special case, where the fill
column is numeric and whose values should be mapped to up (> 0) or down (<
0). "twoSided" and "oneSided" are designed for numeric data that contains
either positive and negative values, or only positive/negative values,
respectively. "categorical" handles any other non-numeric colour mapping,
and uses "Set1" from RColorBrewer.
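The choice between these options depends on the contents of the fill column. The hypothetical helper below (suggestFillType, not part of pathlinkR) sketches that decision in base R. It cannot detect the "foldChange" special case, which depends on what the numbers mean rather than on their sign.

```r
# Hypothetical helper: suggest a fillType from the values of a column
suggestFillType <- function(x) {
    # Anything non-numeric is treated as a categorical mapping
    if (!is.numeric(x)) {
        return("categorical")
    }
    x <- x[!is.na(x)]
    # Both signs present: two-sided scale; otherwise one-sided
    if (any(x > 0) && any(x < 0)) {
        "twoSided"
    } else {
        "oneSided"
    }
}

suggestedMixed <- suggestFillType(c(-1.2, 0.4, 2.0))
suggestedPositive <- suggestFillType(c(0.1, 3.0))
suggestedLabels <- suggestFillType(c("kinase", "receptor"))
```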
Node statistics (degree, betweenness, and hub score) are calculated using
the respective functions from the tidygraph
package.
A Protein-Protein Interaction (PPI) network plot; an object of class "ggplot"
https://github.com/hancockinformatics/pathlinkR/
data("exampleDESeqResults")
exNetwork <- ppiBuildNetwork(
    rnaseqResult=exampleDESeqResults[[1]],
    filterInput=TRUE,
    order="zero"
)
ppiPlotNetwork(
    network=exNetwork,
    title="COVID positive over time",
    fillColumn=log2FoldChange,
    fillType="foldChange",
    legend=TRUE,
    label=FALSE
)
INTERNAL Find and return the largest subnetwork
ppiRemoveSubnetworks(network)
network |
Graph object |
Largest subnetwork from the input network list as an "igraph" object
https://github.com/hancockinformatics/pathlinkR/
Table of all Reactome pathways and genes
data(reactomeDatabase)
A data frame (tibble) with 123574 rows and 3 columns
Reactome pathway ID
Entrez gene ID
Name of the Reactome pathway
An object of class "tbl_df", "tbl", "data.frame"
See https://reactome.org/ for more information.
Table of all Sigora pathways and their constituent genes
data(sigoraDatabase)
A data frame (tibble) with 60775 rows and 4 columns
Reactome pathway identifier
Reactome pathway description
Ensembl gene identifier
HGNC gene symbol
An object of class "tbl_df", "tbl", "data.frame"
Please refer to the Sigora package for more details: https://cran.r-project.org/package=sigora
Example Sigora output from running pathwayEnrichment()
on
"exampleDESeqResults"
data(sigoraExamples)
A data frame (tibble) with 66 rows and 12 columns
Comparison from which results are derived; names of the input list
Was the pathway enriched in up or down regulated genes
Reactome pathway identifier
Description of the pathway
Nominal p value for the enrichment
p value adjusted for multiple testing
Genes in the pathway/input
Analyzed genes found in the pathway of interest
All genes from the pathway database
Quotient of the number of candidate and background genes
Total number of input genes
Pathway category
An object of class "tbl_df", "tbl", "data.frame"
Please refer to the Sigora package for more details on that method: https://cran.r-project.org/package=sigora