Title: | KEGGgraph: A graph approach to KEGG PATHWAY in R and Bioconductor |
---|---|
Description: | KEGGgraph is an interface between KEGG pathways and graph objects, as well as a collection of tools to analyze, dissect and visualize these graphs. It parses the regularly updated KGML (KEGG XML) files into graph models while maintaining all essential pathway attributes. The package offers functionality including parsing, graph operations and visualization. |
Authors: | Jitao David Zhang, with inputs from Paul Shannon and Hervé Pagès |
Maintainer: | Jitao David Zhang <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.67.0 |
Built: | 2024-10-30 07:35:20 UTC |
Source: | https://github.com/bioc/KEGGgraph |
Colorectal cancer dataset provided by the SPIA package. It is just a copy made during the development of the SPIA package in case the package is not available. It will be removed when the SPIA package is stable.
See the description of the SPIA package.
See the format of the SPIA package.
Yi Hong and Kok Sun Ho and Kong Weng Eu and Peh Yean Cheah, A susceptibility gene set for early onset colorectal cancer that integrates diverse signaling pathways: implication for tumorigenesis, Clin Cancer Res, 2007, 13(4),1107-14.
The function expands a KEGG node containing paralogues and is mainly used internally. End users are not expected to call it unless they know exactly what they are doing.
expandKEGGNode(node)
node |
An object of |
Jitao David Zhang mailto:[email protected]
The function expands paralogue nodes in a KEGG pathway and returns the expanded KEGG pathway; KEGG node and edge data are maintained.
expandKEGGPathway(pathway)
pathway |
An object of |
The function expands nodes with paralogues in a KEGG pathway and copies the necessary edges.
An object of KEGGPathway-class
Jitao David Zhang mailto:[email protected]
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
kegg.pathway <- parseKGML(sfile)
kegg.expandpathway <- expandKEGGPathway(kegg.pathway)
In KGML files, the 'graphics' element has a 'name' attribute that stores the display name of a node, which is straightforward for end users. For example, the node 'hsa:1432' means little to most biologists, but its display name 'MAPK14' helps them link the node to their knowledge. This method extracts the display name from KEGGNode and graph objects; the method for graphs returns the display names of its nodes.
An object of KEGGNode-class
A KEGG graph object
Jitao David Zhang mailto:[email protected]
KGML Document Manual https://www.genome.jp/kegg/docs/xml/
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
pathway <- parseKGML(sfile)
nodes <- nodes(pathway)
subnodes <- nodes[10:15]
sapply(subnodes, getDisplayName)
## compare them with getName, one 'displayName' may correspond to many paralogues
sapply(subnodes, getName)
The method extracts EntryIDs from KEGGNode-class or KEGGEdge-class object(s).
In the case of KEGGEdge-class objects, the entryIDs of the nodes involved in the binary relation are returned as a vector in the order specified by the direction of the relation; that is, if the edge is defined as A->B, then the entryID returned from the edge equals c(getEntryID(A), getEntryID(B)).
Object of KEGGEdge-class
A wrapper for a list of KEGGNode-class or KEGGEdge-class objects
Jitao David Zhang mailto:[email protected]
KGML Document Manual https://www.genome.jp/kegg/docs/xml/
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
pathway <- parseKGML(sfile)
nodes <- nodes(pathway)
node <- nodes[[7]]
getEntryID(node)
edges <- edges(pathway)
edge <- edges[[7]]
getEntryID(edge)
getEntryID(nodes[1:4])
getEntryID(edges[1:4])
Translate a KEGG ID into a link pointing to the gene on the KEGG website.
This method complies with the Gene link rule of the KEGG website.
A KEGGID, for example 'hsa:1423'
getKEGGgeneLink("hsa:1423")
Get KEGGID from a KEGGNode-class object.
A KEGGNode-class object can represent either another pathway (KEGGID in a form like 'hsa\d*'), a KEGG gene ('hsa:\d*') or a compound ('cpd:C\d*'). In the case of a KEGG gene ID, the organism prefix is removed when the value is returned.
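The prefix handling described above can be illustrated with a small base R sketch. The helper `stripOrganismPrefix` is hypothetical and not part of KEGGgraph; it only mimics the documented behaviour for gene identifiers, and leaving all other identifiers untouched is an assumption of this sketch.

```r
# Hypothetical helper (not a KEGGgraph function): drop the organism prefix
# from KEGG gene identifiers such as "hsa:1432". Identifiers that do not
# look like "<organism>:<digits>" are passed through unchanged (assumption).
stripOrganismPrefix <- function(ids) {
  isGene <- grepl("^[a-z]+:[0-9]+$", ids)
  ids[isGene] <- sub("^[a-z]+:", "", ids[isGene])
  ids
}

geneIds <- stripOrganismPrefix(c("hsa:1432", "cpd:C00022", "hsa04010"))
print(geneIds)
```

For objects parsed from a KGML file, getKEGGID itself should be used; the sketch is only meant to make the identifier forms concrete.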
An object of KEGGNode-class
wntfile <- system.file("extdata/hsa04310.xml", package="KEGGgraph")
wnt <- parseKGML(wntfile)
nodes <- nodes(wnt)
getKEGGID(nodes[[1]])
getKEGGID(nodes[[26]])
The 'get' methods extract KEGG node (edge) attributes from a graph produced by calling parseKGML2Graph or KEGGpathway2Graph. The 'set' methods write a list into the edge or node data.
getKEGGnodeData(graph, n)
getKEGGedgeData(graph, n)
graph |
a graph object by parsing KGML file, where KEGG node and edge attributes are maintained |
n |
optional character string, name of the desired node or edge. If it is missing, all node data is returned |
Node and edge data are stored as lists within environments in graphs to save memory and speed up graph manipulations. When getKEGGnodeData or getKEGGedgeData is called, the list is extracted from the environment and returned.
Either a list or a single item of KEGGNode-class or KEGGEdge-class object(s).
These functions will be unified into 'KEGGnodeData' and 'KEGGnodeData<-' forms.
Jitao David Zhang mailto:[email protected]
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
gR <- parseKGML2Graph(sfile, expandGenes=TRUE)
getKEGGnodeData(gR, "hsa:4214")
getKEGGedgeData(gR, "hsa:4214~hsa:5605")
The function simply returns the KGML file URL for a given KEGG PATHWAY ID. If the KEGG PATHWAY ID contains no organism prefix, the user can specify the 'organism' parameter. Otherwise the 'organism' option is ignored.
retrieveKGML is a simple wrapper around getKGMLurl that downloads the KGML file with download.file from the utils package.
getKGMLurl(pathwayid, organism = "hsa")
retrieveKGML(pathwayid, organism, destfile, method="auto", ...)
kgmlNonmetabolicName2MetabolicName(destfile)
getCategoryIndepKGMLurl(pathwayid, organism="hsa", method="auto", ...)
pathwayid |
KEGG PATHWAY ID, e.g. 'hsa00020' |
organism |
three-letter organism code; if pathwayid contains the code, this option is ignored |
destfile |
Destination file, to which the remote KGML file should be saved |
method |
Method to be used for downloading files, passed to |
... |
Parameters passed to download.file |
The function getKGMLurl takes the pathway identifier (either in the form 'hsa00020' or with the 'path' prefix, for example 'path:hsa00020') and returns the URL to download the KGML file.
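The accepted identifier forms can be made concrete with a short base R sketch. The helper `normalizePathwayId` is hypothetical and is not how getKGMLurl is necessarily implemented; it only shows one way to reduce the forms listed above to a single canonical identifier before a URL is built. No URL is constructed here.

```r
# Hypothetical helper (not part of KEGGgraph): bring a pathway identifier
# into the 'hsa00020' form.
normalizePathwayId <- function(pathwayid, organism = "hsa") {
  id <- sub("^path:", "", pathwayid)        # drop the optional 'path:' prefix
  noPrefix <- grepl("^[0-9]+$", id)         # bare numbers carry no organism code
  id[noPrefix] <- paste0(organism, id[noPrefix])
  id
}

normalized <- normalizePathwayId(c("hsa00020", "path:hsa00020", "00020"))
print(normalized)
```

As in the documented behaviour, the `organism` argument only matters for identifiers that lack an organism code.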
The mapping between pathway identifier and pathway name can be found by KEGGPATHNAME2ID (or reversed mappings) in KEGG.db package. See vignette for example.
retrieveKGML calls download.file to download the KGML file remotely from the KEGG REST API.
Before July 2017, KEGG FTP server was used to download the KGML files. Since then the REST API service of KEGG is used instead.
KGML File URL of the given pathway.
So far the function does not check the correctness of the 'organism' prefix; it is the responsibility of the user to guarantee the right spelling.
There was a period when metabolic and non-metabolic pathways were saved separately in different directories, and KEGGgraph was able to handle them. kgmlNonmetabolicName2MetabolicName is used to translate the KGML URL of a non-metabolic pathway to that of a metabolic pathway. getCategoryIndepKGMLurl determines the correct URL to download by attempting both possibilities. They were mainly called internally. Since the KGML file is now downloaded from each pathway's main page instead of from the FTP server, these functions are no longer needed and will be removed in the next release.
Jitao David Zhang mailto:[email protected]
Plea from KEGG (available as of Aug 2011) https://www.genome.jp/kegg/docs/plea.html
getKGMLurl("hsa00020")
getKGMLurl("path:hsa00020")
getKGMLurl("00020", organism="hsa")
getKGMLurl(c("00460", "hsa:00461", "path:hsa00453", "path:00453"))
hasConnection <- RCurl::url.exists(getKGMLurl("cel00010"))
if (hasConnection) {
  tmp <- tempfile()
  retrieveKGML(pathwayid='00010', organism='cel', destfile=tmp, method="auto")
} else {
  warning("No connection to KEGG webservice")
}
Get the 'name' attribute of a given object. This method can be used for almost all objects implemented in the KEGGgraph package to extract their name slot. See the manual pages of individual objects for examples.
An object of KEGGEdgeSubType-class
An object of KEGGNode-class
An object of KEGGPathway-class
An object of KEGGPathwayInfo-class
An object of KEGGReaction-class
Jitao David Zhang mailto:[email protected]
KGML Document Manual https://www.genome.jp/kegg/docs/xml/
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
pathway <- parseKGML(sfile)
## get pathway name
getName(pathway)
## get node name
nodes <- nodes(pathway)
getName(nodes[[2]])
## get edge name: it is not informative since the nodes are identified
## with file-dependent indices
edges <- edges(pathway)
getName(edges[[7]])
## get subtype name
subtype <- getSubtype(edges[[2]])[[1]]
getName(subtype)
The function extracts the value(s) in a named vector by the given name(s). If no element is found with the given name, NA is returned.
getNamedElement(vector, name)
vector |
A named vector of any data type |
name |
Wanted name |
The elements with the given name, or 'NA' if none was found
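The lookup semantics described here (return the element, or NA when the name is absent) can be sketched in base R. `lookupNamedElement` is a hypothetical stand-in written for illustration only; it is not the package's implementation, and whether getNamedElement keeps the names on its result is not claimed here.

```r
# Hypothetical re-implementation for illustration only.
lookupNamedElement <- function(vector, name) {
  res <- vector[match(name, names(vector))]  # match() yields NA for unknown names
  names(res) <- name
  res
}

vec <- c(first = "Hamburg", second = "Hoffenheim", third = "Bremen")
found <- lookupNamedElement(vec, "third")
missingElement <- lookupNamedElement(vec, "last")
print(found)
print(missingElement)
```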
Jitao David Zhang mailto:[email protected]
vec <- c(first="Hamburg", second="Hoffenheim", third="Bremen")
getNamedElement(vec, "third")
getNamedElement(vec, "last")
KEGG stores additional information of the pathways in their KGML files, which can be extracted by this function.
The method returns the attributes of the pathway including its full title, short name, organism, image file link (which can be downloaded from KEGG website) and web link.
An object of KEGGPathway-class
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
pathway <- parseKGML(sfile)
getPathwayInfo(pathway)
In KGML, the pathway element specifies one graph object with the entry elements as its nodes and the relation and reaction elements as its edges. The relation elements are saved as edges in objects of KEGGPathway-class, and the reaction elements are saved as a slot of the object, which can be retrieved with the function getReactions.
Regulatory pathways are always viewed as protein networks, so there is no 'reaction' information saved in their KGML files. Metabolic pathways are viewed as both protein networks and chemical networks, hence the KEGGPathway-class object may have reaction information.
An object of KEGGPathway-class
Jitao David Zhang mailto:[email protected]
KGML Document manual https://www.genome.jp/kegg/docs/xml/
mapfile <- system.file("extdata/map00260.xml", package="KEGGgraph")
maptest <- parseKGML(mapfile)
maptest
mapReactions <- getReactions(maptest)
mapReactions[1:3]
Get Rgraphviz-compatible edge names, where the out- and in-nodes sharing an edge are concatenated by "~".
getRgraphvizEdgeNames(graph)
graph |
A graph object |
A list of names, the order is determined by the edge order.
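The 'out~in' naming convention can be illustrated without a graph object. The base R sketch below builds such names from a plain adjacency list; `edgeNamesFromList` is a hypothetical helper and not part of KEGGgraph or Rgraphviz, and it makes no claim about the exact output of getRgraphvizEdgeNames.

```r
# Hypothetical helper: build "from~to" edge names from an adjacency list,
# keeping the order in which the edges are listed.
edgeNamesFromList <- function(edgeL) {
  unlist(lapply(names(edgeL), function(from) {
    to <- edgeL[[from]]
    # nodes without outgoing edges contribute no names
    if (length(to) == 0) character(0) else paste(from, to, sep = "~")
  }))
}

tedges <- list(Hamburg  = c("Dortmund", "Bremen"),
               Dortmund = "Hamburg",
               Bremen   = "Hamburg",
               Paris    = character(0))
edgeNames <- edgeNamesFromList(tedges)
print(edgeNames)
```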
Jitao David Zhang mailto:[email protected]
Rgraphviz package
tnodes <- c("Hamburg", "Dortmund", "Bremen", "Paris")
tedges <- list("Hamburg"=c("Dortmund", "Bremen"),
               "Dortmund"=c("Hamburg"),
               "Bremen"=c("Hamburg"),
               "Paris"=c())
tgraph <- new("graphNEL", nodes = tnodes, edgeL = tedges)
getRgraphvizEdgeNames(tgraph)
KEGG stores sub-type of interactions between entities in the KGML files, which can be extracted with this method. The descriptions for the subtypes can be explored at the KGML document manual in the references.
See KEGGEdge-class for examples. The method for graphs is a wrapper to extract all subtype information from one graph.
A graph object of KEGGgraph. The method returns a list of subtypes in the same order of edges
An object of KEGGEdge, which stores the subtype information
Jitao David Zhang mailto:[email protected]
KGML Document manual https://www.genome.jp/kegg/docs/xml/
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
pathway <- parseKGML(sfile)
edges <- edges(pathway)
subtype <- getSubtype(edges[[1]])
subtype
The methods get the title attribute of a given KGML element, for example for objects of KEGGPathway-class or KEGGPathwayInfo-class.
An object of KEGGPathway-class
An object of KEGGPathwayInfo-class
Jitao David Zhang mailto:[email protected]
KGML Document manual https://www.genome.jp/kegg/docs/xml/
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
pathway <- parseKGML(sfile)
getTitle(pathway)
pi <- getPathwayInfo(pathway)
getTitle(pi)
This method can be used to extract generic type attribute from several objects implemented in KEGGgraph package.
The meanings and descriptions of the types can be found at KGML manual listed in the reference.
An object of KEGGEdge-class
An object of KEGGNode-class
An object of KEGGReaction-class
Jitao David Zhang mailto:[email protected]
KGML Manual https://www.genome.jp/kegg/docs/xml/
mapfile <- system.file("extdata/map00260.xml", package="KEGGgraph")
maptest <- parseKGML(mapfile)
## node type
node <- nodes(maptest)[[3]]
getType(node)
## edge type
edge <- edges(maptest)[[5]]
getType(edge)
## reaction type
reaction <- getReactions(maptest)[[5]]
getType(reaction)
Get 'value' attribute, mainly used internally and is not expected to be called by users.
An object of KEGGEdgeSubType-class
The graph density is defined as d = E/(V*(V-1)/2) where E is the number of edges and V of nodes.
graphDensity(graph)
graph |
A graph object |
The density of a graph lies between [0,1]
A value between [0,1]
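The density formula given above, d = E/(V*(V-1)/2), can be evaluated directly from edge and node counts. The base R sketch below uses a hypothetical helper, `densityFromCounts`, which is not part of KEGGgraph and does not inspect a graph object; how graphDensity counts the edges of a directed graph is not assumed here.

```r
# Density as defined above: d = E / (V * (V - 1) / 2),
# where E is the number of edges and V the number of nodes.
densityFromCounts <- function(nEdges, nNodes) {
  stopifnot(nNodes > 1)
  nEdges / (nNodes * (nNodes - 1) / 2)
}

sparse <- densityFromCounts(nEdges = 3, nNodes = 4)    # 3 of 6 possible edges
complete <- densityFromCounts(nEdges = 6, nNodes = 4)  # all 6 possible edges
print(c(sparse = sparse, complete = complete))
```

With four nodes there are 4*3/2 = 6 possible undirected edges, so three edges give a density of 0.5 and six edges give the maximum of 1.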
Jitao David Zhang [email protected]
Aittokallio and Schwikowski (2006), Graph-based methods for analysing networks in cell biology, Briefings in Bioinformatics, 7, 243-255.
tnodes <- c("Hamburg", "Dortmund", "Bremen", "Paris")
tedges <- list("Hamburg"=c("Dortmund", "Bremen"),
               "Dortmund"=c("Hamburg"),
               "Bremen"=c("Hamburg"),
               "Paris"=c())
tgraph <- new("graphNEL", nodes = tnodes, edgeL = tedges)
graphDensity(tgraph)
If all objects in a list are of the given class, we call it a homogeneous list and the function returns TRUE; otherwise it returns FALSE.
isHomoList(list, class)
list |
A list |
class |
The class name to be validated |
logical
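The homogeneity check can be sketched in a few lines of base R. `isHomogeneous` is a hypothetical helper written for illustration; using `inherits` for the class test is an assumption of this sketch, and the package's own isHomoList may test class membership differently.

```r
# Hypothetical sketch: TRUE if every element of 'x' inherits from 'className'.
isHomogeneous <- function(x, className) {
  all(vapply(x, inherits, logical(1), what = className))
}

testlist <- list(home1 = "Hamburg", home2 = "Heidelberg", home3 = "Tianjin")
before <- isHomogeneous(testlist, "character")
testlist$lucky <- 16            # a numeric element breaks homogeneity
after <- isHomogeneous(testlist, "character")
print(c(before = before, after = after))
```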
Jitao David Zhang mailto:[email protected]
testlist <- list("home1"="Hamburg", "home2"="Heidelberg", "home3"="Tianjin")
isHomoList(testlist, "character")
testlist$lucky <- 16
isHomoList(testlist, "character")
A class to represent 'relation' elements in KGML files and edge objects in a KEGG graph
Objects are normally created by the parseRelation function, which is not intended to be called by users directly.
entry1ID: The entryID of the first KEGGNode
entry2ID: The entryID of the second KEGGNode
type: The type of the relation, see getType-methods
subtype: The subtype(s) of the edge, a list of KEGGEdgeSubType
signature(obj = "KEGGEdge"): Get entryIDs of the edge in the order specified by the direction of the edge
signature(object = "KEGGEdge"): Get the relation type
signature(object = "KEGGEdge"): Get the names of edges in the convention of Rgraphviz, 'node1~node2'
signature(object = "KEGGEdge"): Show method
Jitao David Zhang mailto:[email protected]
KGML Manual https://www.genome.jp/kegg/docs/xml/
mapfile <- system.file("extdata/map00260.xml", package="KEGGgraph")
maptest <- parseKGML(mapfile)
x <- edges(maptest)[[1]]
class(x)
## examples to extract information from KEGGEdge
getName(x)
getEntryID(x)
getType(x)
getSubtype(x)
subtype <- getSubtype(x)[[1]]
getName(subtype)
Edge subtypes defined by the KEGG database.
A data.frame with 17 rows and 11 columns
A class to represent subtype in KEGG
Objects can be created by calls of the form new("KEGGEdgeSubType", ...).
name: Object of class "character", name of the subtype
value: Object of class "character", value of the subtype
signature(object = "KEGGEdgeSubType"): getting subtype name
signature(object = "KEGGEdgeSubType"): getting subtype value
signature(object = "KEGGEdgeSubType"): show method
Please note that 'KEGGEdgeSubtype' (with a lowercase 't' in 'type') is a data frame storing the predefined subtypes, whereas 'KEGGEdgeSubType' is a class representing these subtypes.
Jitao David Zhang mailto:[email protected]
showClass("KEGGEdgeSubType")
## use example(KEGGEdge-class) for more examples
Edge types defined by the KEGG database.
A data.frame with values and explanations of edge types.
A class to represent 'graphics' element in KGML files
This class is mainly used to extract visualization information from KGML files.
Objects can be created by calling parseGraphics
name: Object of class "character", graphics name
x: Object of class "integer", x coordinate in KEGG figure
y: Object of class "integer", y coordinate in KEGG figure
type: Object of class "character", graphics type (shape)
width: Object of class "integer", width of the symbol
height: Object of class "integer", height of the symbol
fgcolor: Object of class "character", foreground color
bgcolor: Object of class "character", background color
Jitao David Zhang mailto:[email protected]
KGML Manual https://www.genome.jp/kegg/docs/xml/
showClass("KEGGGraphics")
Class to represent 'group' nodes in KEGG pathways
The objects are usually created by the parseEntry function, which is not intended to be called directly by users.
component: Component of the group
entryID: see the slot of KEGGNode-class
graphics: see the slot of KEGGNode-class
link: see the slot of KEGGNode-class
map: see the slot of KEGGNode-class
name: see the slot of KEGGNode-class
reaction: see the slot of KEGGNode-class
type: see the slot of KEGGNode-class
Class "KEGGNode", directly.
signature(object = "KEGGNode"): returns components of the group, in a vector of strings
Jitao David Zhang mailto:[email protected]
showClass("KEGGGroup")
The class to represent 'entry' elements in KGML files and nodes in KEGG graphs
Objects can be created by calls of the function parseEntry and are not intended to be created directly by users.
entryID: entryID, the 'id' attribute of 'entry' elements in KGML files. In each KGML file the entryID is specified by auto-increment integers, therefore entryIDs are not unique across different KGML files. However, if the 'expandGenes' option is specified in the KEGGpathway2Graph function, the unique KEGGID replaces the default integer as the new entryID, which is unique in the biological context
name: Name of the node
type: Type of the node, use data(KEGGNodeType) to see available values
link: URL link of the node
reaction: Reaction of the node
map: Map of the node
graphics: Graphic details (including display name) of the node, an object of KEGGGraphics
signature(object = "KEGGNode"): get display name
signature(obj = "KEGGNode"): get entryID, in case of gene-expanded graphs this is the same as getKEGGID
signature(object = "KEGGNode"): get KEGGID
signature(object = "KEGGNode"): get the type of the node
signature(object = "KEGGNode"): replace name
signature(obj = "KEGGNode"): returns entryID (the same as getEntryID), for compatibility with KEGGGroup-class
signature(object = "KEGGNode"): show method
Jitao David Zhang mailto:[email protected]
KGML Document manual https://www.genome.jp/kegg/docs/xml/
## We show how to extract information from KEGGNode object
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
pathway <- parseKGML(sfile)
ns <- nodes(pathway)
node <- ns[[1]]
show(node)
getName(node)
getDisplayName(node)
getEntryID(node)
getKEGGID(node)
Node types defined by the KEGG database.
A data.frame with values and explanations of KEGG node types.
The data provides a translation mechanism between KEGG pathway identifiers, for instance hsa04010, and pathway names, for instance MAPK signaling pathway.
An AnnDbBiMap
A class to represent KEGG pathway
Objects can be created by calls of the form new("KEGGPathway", ...). Normally they are created by parseKGML.
pathwayInfo: An object of KEGGPathwayInfo-class
nodes: List of objects of KEGGNode-class
edges: List of objects of KEGGEdge-class
reactions: List of objects of KEGGReaction-class
signature(object = "KEGGPathway", which = "ANY"): KEGGEdges of the pathway
signature(object = "KEGGPathway"): setting edges
signature(object = "KEGGPathway"): getting pathway name
signature(object = "KEGGPathway"): getting pathway title
signature(object = "KEGGPathway", value = "ANY"): setting nodes
signature(object = "KEGGPathway"): KEGGNodes of the pathway
signature(object = "KEGGPathway"): getting KEGGPathwayInfo
signature(object = "KEGGPathway"): getting title of the pathway
signature(object = "KEGGPathway"): display method
Jitao David Zhang mailto:[email protected]
KGML Document manual https://www.genome.jp/kegg/docs/xml/
parseKGML, KEGGEdge-class, KEGGNode-class, KEGGReaction-class
## We show how to extract information from KEGGPathway objects
## Parse KGML file into a 'KEGGPathway' object
mapfile <- system.file("extdata/map00260.xml", package="KEGGgraph")
maptest <- parseKGML(mapfile)
## short summary of the pathway
maptest
## get information of the pathway
getPathwayInfo(maptest)
## nodes of the pathway
nodes <- nodes(maptest)
node <- nodes[[3]]
getName(node)
getType(node)
getDisplayName(node)
## edges of the pathway
edges <- edges(maptest)
edge <- edges[[3]]
getEntryID(edge)
getSubtype(edge)
The function parses an object of KEGGPathway-class into a graph.
KEGGpathway2Graph(pathway, genesOnly = TRUE, expandGenes = TRUE)
pathway |
An instance of |
genesOnly |
logical, should only the genes be maintained and other types of nodes (compounds, etc.) be neglected? TRUE by default |
expandGenes |
logical, should homologue proteins be expanded? TRUE by default |
When 'expandGenes=TRUE', the nodes have unique names given by their KEGGID (in the form of 'org:xxxx', for example 'hsa:1432'); otherwise an auto-increment index given by KEGG is used as the node name. In the latter case the node names are not unique across pathways, so graphs cannot simply be merged until the node names are made unique.
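The following base R sketch illustrates why file-local indices prevent a simple merge while KEGG IDs do not. Both pathways and the assignment of genes to indices are made up for illustration; no KEGGgraph function is involved.

```r
# Two made-up pathways: names are file-local entry indices (restarting at 1
# in every file), values are the KEGG IDs of the genes behind them.
pathwayA <- c("1" = "hsa:1432", "2" = "hsa:5605")
pathwayB <- c("1" = "hsa:4214", "2" = "hsa:1432")

# File-local indices collide although they denote different genes ...
sharedIndex <- intersect(names(pathwayA), names(pathwayB))
# ... whereas KEGG IDs identify the same gene in both pathways.
sharedGenes <- intersect(unname(pathwayA), unname(pathwayB))
# Node set of a merged graph when KEGG IDs are used as node names.
mergedNodes <- union(unname(pathwayA), unname(pathwayB))

print(sharedIndex)
print(sharedGenes)
print(mergedNodes)
```

Index "1" refers to different genes in the two files, whereas 'hsa:1432' appears in both and collapses to one node in the merged node set.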
KEGG node and edge data are stored in the 'nodeData' and 'edgeData' slots respectively, which can be extracted by getKEGGnodeData and getKEGGedgeData.
A directed graph.
Jitao David Zhang mailto:[email protected]
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
kegg.pathway <- parseKGML(sfile)
gR.compact <- KEGGpathway2Graph(kegg.pathway, expandGenes=FALSE)
Regulatory pathways are always viewed as protein networks, so there is no 'reaction' information saved in their KGML files. Metabolic pathways are viewed as both protein networks and chemical networks, hence the KEGGPathway-class object may have reaction information among chemical compounds.
This function extracts reaction information from a KEGG pathway and converts the chemical compound reaction network into a directed graph.
KEGGpathway2reactionGraph(pathway)
pathway |
A |
The direction of the graph is specified by the role of the compound in the reaction: edges always go out of the 'substrate' and point to the 'product'.
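The substrate-to-product direction can be illustrated with a small base R sketch. The reaction table and the compound names below are placeholders invented for this example, and the adjacency list is built with base R only; it is not the output format of KEGGpathway2reactionGraph.

```r
# Made-up reactions; compound names are placeholders, not KEGG identifiers.
reactions <- data.frame(
  name      = c("reaction1", "reaction2"),
  substrate = c("compoundA", "compoundB"),
  product   = c("compoundB", "compoundC"),
  stringsAsFactors = FALSE
)

# Each reaction contributes one directed edge: substrate -> product.
# The result maps every substrate to the products it points to.
edgeList <- split(reactions$product, reactions$substrate)
print(edgeList)
```

Here compoundA points to compoundB and compoundB points to compoundC; compoundC has no outgoing edge because it never acts as a substrate.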
For now there is no wrapper to parse the KGML file directly into a reaction graph. In the future there may be one, but we do not want to confuse users with two similar functions to parse the file into a graph (we assume that most users will need the protein graph, which can be conveniently parsed by parseKGML2Graph).
From version 1.18.0, reaction graphs returned by KEGGpathway2reactionGraph can be merged with other reaction graphs or pathway graphs. Thus users can combine the pathway and reaction graphs of one KGML file into a single graph.
A directed graph with compounds as nodes and reactions as edges.
If the pathway does not contain any chemical reactions, a warning message will be printed and NULL is returned.
Jitao David Zhang mailto:[email protected]
KGML Document manual https://www.genome.jp/kegg/docs/xml/
mapfile <- system.file("extdata/map00260.xml", package="KEGGgraph")
map <- parseKGML(mapfile)
cg <- KEGGpathway2reactionGraph(map)
cg
nodes(cg)[1:3]
edges(cg)[1:3]
A class to represent information of a KEGG pathway
Objects can be created by calls of the function parsePathwayInfo.
name: Object of class "character", pathway name
org: Object of class "character", organism
number: Object of class "character", number
title: Object of class "character", title of the pathway
image: Object of class "character", image URL
link: Object of class "character", URL link
signature(object = "KEGGPathwayInfo"): get title of the pathway
signature(object = "KEGGPathwayInfo"): show method
Jitao David Zhang mailto:[email protected]
KGML Document Manual https://www.genome.jp/kegg/docs/xml/
sfile <- system.file("extdata/hsa04010.xml", package="KEGGgraph")
pathway <- parseKGML(sfile)
pi <- getPathwayInfo(pathway)
class(pi)
getTitle(pi)
A class to represent 'reaction' elements in KGML files
Objects can be created by calls of the function parseReaction.
name: Object of class "character", the KEGGID of this reaction, e.g. "rn:R02749"
type: Object of class "character", the type of this reaction, either 'reversible' or 'irreversible'
substrateName: Object of class "character", KEGG identifier of the COMPOUND database or the GLYCAN database, e.g. "cpd:C05378"
substrateAltName: Object of class "character", alternative name of its parent substrate element
productName: Object of class "character", specifies the KEGGID of the product
productAltName: Object of class "character", alternative name of its parent product element
signature(object = "KEGGReaction"): show method
signature(object = "KEGGReaction"): get the KEGGID of the reaction
signature(object = "KEGGReaction"): get the type of the reaction
signature(object = "KEGGReaction"): get the name of substrate
signature(object = "KEGGReaction"): get the name of product
Jitao David Zhang mailto:[email protected]
KGML Document Manual https://www.genome.jp/kegg/docs/xml/
## We show how to extract reactions from a 'KEGGPathway' object
mapfile <- system.file("extdata/map00260.xml", package="KEGGgraph")
maptest <- parseKGML(mapfile)
mapReactions <- getReactions(maptest)
## More details about reaction
reaction <- mapReactions[[1]]
getName(reaction)
getType(reaction)
getSubstrate(reaction)
getProduct(reaction)
The function uses the KEGG annotation package to convert a KGML file name into a human-readable pathway name.
kgmlFileName2PathwayName(filename)
filename |
A KGML file name |
So far it only supports KGML files organized by species.
NOTE: there is an issue with the package loading order when using this function: 'KEGG.db' must be loaded before 'KEGGgraph' for it to work properly. Otherwise mget returns an error stating that 'KEGGPATHID2NAME' is not an environment. The cause of this bug is not yet known, so the examples are commented out.
A character string of pathway name
Jitao David Zhang mailto:[email protected]
The function merges a list of KEGG graphs into one graph object. The merged graph has unique nodes, and edges are merged into non-duplicate sets.
For reasons of speed, mergeGraphs discards KEGG node and edge information. To maintain it while merging graphs, please use mergeKEGGgraphs.
mergeGraphs(list, edgemode = "directed")
list |
A list of graph objects, which can be created by |
edgemode |
Edge mode of the graph product, by default 'directed' |
The function takes a list of graphs and merges them into a new graph. The nodes of individual graphs must be unique. The function takes care of the removal of duplicated edges.
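The merging rule described above (unique nodes, duplicated edges removed) can be sketched in base R on plain edge lists. This is only an illustration under stated assumptions: the data frames `g1` and `g2` and their node names are made up, and the code does not call `mergeGraphs` or use graph objects.

```r
# Two toy edge lists standing in for two graphs (hypothetical data)
g1 <- data.frame(from = c("A", "B"), to = c("B", "C"), stringsAsFactors = FALSE)
g2 <- data.frame(from = c("B", "C"), to = c("C", "D"), stringsAsFactors = FALSE)

# Stack the edges and drop duplicated rows (the edge B -> C occurs in both)
merged_edges <- unique(rbind(g1, g2))

# The node set of the merged graph is the union of all edge end points
merged_nodes <- unique(c(merged_edges$from, merged_edges$to))

merged_edges
merged_nodes
```

A sketch like this ignores isolated nodes, which a real graph object would keep.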
A directed graph
It is known that graphs from C. elegans pathways have problems when merging, because the node names are not consistent between edge records and entry IDs.
Jitao David Zhang <[email protected]>
mergeKEGGgraphs extends function mergeGraphs and merges a list of KEGG graphs. Both mergeGraphs and mergeKEGGgraphs can be used to merge graphs, but the latter also merges the node and edge attributes from KEGG, so that the nodes and edges have a one-to-one mapping to the results of getKEGGnodeData and getKEGGedgeData.
See details below.
mergeKEGGgraphs(list, edgemode = "directed")
list |
A list of named KEGG graphs |
edgemode |
character, 'directed' by default |
mergeGraphs discards the node and edge attributes, hence getKEGGnodeData or getKEGGedgeData will return NULL on the resulting graph.
mergeKEGGgraphs calls mergeGraphs first to merge the graphs, then also merges the KEGGnodeData and KEGGedgeData, so that they are one-to-one mapped to the nodes and edges in the merged graph.
A graph with nodeData and edgeData
From version 1.21.1, lists containing NULL should also work.
Jitao David Zhang mailto:[email protected]
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gR <- parseKGML2Graph(sfile,expandGenes=TRUE)
wntfile <- system.file("extdata/hsa04310.xml",package="KEGGgraph")
wntR <- parseKGML2Graph(wntfile, expandGenes=TRUE)
graphlist <- list(mapkG=gR, wntG=wntR)
mergedKEGG <- mergeKEGGgraphs(graphlist)
mergedKEGG
## list containing NULL works also
nlist <- list(gR, wntR, NULL)
nmergedKEGG <- mergeKEGGgraphs(nlist)
The function returns the neighborhood sets of the given vertices as a list. Optionally, the user can choose to include the given vertices themselves in the list.
neighborhood(graph, index, return.self = FALSE)
graph |
An object of |
index |
Names of nodes, whose neighborhood set should be returned |
return.self |
logical, should the vertex itself also be returned? |
Let v be a vertex in a (di)graph G. The out-neighborhood or successor set N+(v) (all x in V(G) with v->x) and the in-neighborhood or predecessor set N-(v) (all x in V(G) with x->v) are jointly returned.
The returned list is indexed by the given node indices; NULL is returned for a non-existing node.
The nodes are unique, that is, duplicated nodes are removed in results.
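The definition above can be traced by hand with a base-R sketch on an adjacency list. The helper `neighborhood_sketch` is hypothetical and is not the package's implementation; it only illustrates how successors and predecessors are combined, how duplicates are dropped, and how a non-existing node yields NULL.

```r
# Adjacency list of a small directed graph: each entry lists the successors
adj <- list(
  Hamburg   = c("Berlin", "Bremen"),
  Stuttgart = c("Berlin", "Paris"),
  Berlin    = c("Stuttgart", "Bremen"),
  Paris     = c("Stuttgart"),
  Bremen    = c("Hamburg", "Berlin")
)

# Hypothetical helper, written for illustration only
neighborhood_sketch <- function(adj, index, return.self = FALSE) {
  res <- lapply(index, function(v) {
    if (!v %in% names(adj)) return(NULL)               # non-existing node
    succ <- adj[[v]]                                    # N+(v): v -> x
    is_pred <- vapply(adj, function(x) v %in% x, logical(1))
    pred <- names(adj)[is_pred]                         # N-(v): x -> v
    out <- unique(c(succ, pred))                        # drop duplicated nodes
    if (return.self) out <- unique(c(v, out))
    out
  })
  names(res) <- index
  res
}

nb <- neighborhood_sketch(adj, c("Hamburg", "Ulm"))
nb
```

Here Bremen is both a successor and a predecessor of Hamburg, so it appears only once in the result.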
A list indexed by the given node indices, each entry containing the neighborhood set of that node (or furthermore including that node).
Jitao David Zhang <[email protected]>
D.B. West. Introduction to Graph Theory, Second Edition. Prentice Hall, 2001
V <- c("Hamburg","Stuttgart","Berlin","Paris","Bremen")
E <- list("Hamburg"=c("Berlin","Bremen"),
          "Stuttgart"=c("Berlin","Paris"),
          "Berlin"=c("Stuttgart","Bremen"),
          "Paris"=c("Stuttgart"),
          "Bremen"=c("Hamburg","Berlin"))
g <- new("graphNEL", nodes=V, edgeL=E, edgemode="directed")
if(require(Rgraphviz) && interactive()) {
  plot(g, "neato")
}
## simple uses
neighborhood(g, "Hamburg")
neighborhood(g, c("Hamburg", "Berlin","Paris"))
## in case of non-existing nodes
neighborhood(g, c("Stuttgart","Ulm"))
## also applicable to non-directed graphs
neighborhood(ugraph(g), c("Stuttgart","Berlin"))
ENTRY elements contain information about nodes (proteins, enzymes, compounds, maps, etc.) in KEGG pathways. The 'parseEntry' function parses these elements into KEGGNode-class or KEGGGroup-class objects. It is not expected to be called directly by the user.
parseEntry(entry)
entry |
XML node of KGML file |
See https://www.genome.jp/kegg/docs/xml/ for more details about 'entry' as well as other elements in KGML files.
An object of KEGGNode-class or (in case of a group node) KEGGGroup-class
Jitao David Zhang <[email protected]>
https://www.genome.jp/kegg/docs/xml/
parseGraphics, parseKGML, KEGGNode-class, KEGGGroup-class
The function parses 'graphics' elements in KGML files, and it is mainly used internally.
parseGraphics(graphics)
graphics |
XML node |
The function is called by other parsing functions and is not intended to be called directly by the user.
An object of KEGGGraphics-class.
Jitao David Zhang mailto:[email protected]
KGML Document manual https://www.genome.jp/kegg/docs/xml/
The function parses KGML files according to the KGML XML documentation.
parseKGML(file)
file |
Name of KGML file |
The function parses a KGML file; it depends on the XML package.
An object of KEGGPathway-class.
Jitao David Zhang mailto:[email protected]
KGML Manual https://www.genome.jp/kegg/docs/xml/
parseEntry, parseRelation, parseReaction, KEGGPathway-class
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
kegg.pathway <- parseKGML(sfile)
kegg.pathway
This function extends the parseKGML2Graph function by converting the resulting graph into a four-column data frame representing out-nodes (the from column in the data frame), in-nodes (to), and the types and subtypes of the edges that connect them (type and subtype, respectively). It can be used, for example, to export KEGG pathway networks to plain text files.
parseKGML2DataFrame(file, reactions=FALSE,...)
file |
A KGML file |
reactions |
Logical, whether metabolic reactions should be parsed
and returned as part of the data frame. Default: |
... |
Other parameters passed to |
The out- and in-nodes are represented in the form of KEGG identifiers. To obtain human Entrez Gene IDs, the function translateKEGGID2GeneID can be used.
Multiple edges are supported: in case more than one subtype of edge exists between two nodes, they are all listed in the resulting data frame.
A four-column data frame, representing the graph structure: out-nodes (the from column), in-nodes (to), edge type (type) and subtype (subtype).
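Because the result is an ordinary data frame, exporting it to a plain text file needs only base R. The sketch below uses a toy data frame `edges_df` as a stand-in for the output of parseKGML2DataFrame; the identifiers `org:A` and `org:B` are placeholders and the rows are illustrative, not taken from any real pathway.

```r
# Toy stand-in with the four documented columns (hypothetical rows).
# Two rows share the same node pair to mimic multiple subtypes per edge.
edges_df <- data.frame(
  from    = c("org:A", "org:A"),
  to      = c("org:B", "org:B"),
  type    = c("PPrel", "PPrel"),
  subtype = c("activation", "phosphorylation"),
  stringsAsFactors = FALSE
)

# Write a tab-separated file without quotes or row names
out <- tempfile(fileext = ".txt")
write.table(edges_df, file = out, sep = "\t", quote = FALSE, row.names = FALSE)

# Read it back to check that the structure survived the round trip
back <- read.delim(out, stringsAsFactors = FALSE)
back
```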
Jitao David Zhang
parseKGML2Graph, KEGGpathway2Graph and translateKEGGID2GeneID.
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gdf <- parseKGML2DataFrame(sfile)
head(gdf)
dim(gdf)
rfile <- system.file("extdata/hsa00020.xml",package="KEGGgraph")
dim(dfWr <- parseKGML2DataFrame(rfile, reactions=TRUE))
dim(dfWOr <- parseKGML2DataFrame(rfile, reactions=FALSE))
stopifnot(nrow(dfWr)>nrow(dfWOr))
## not expanding genes: only the KGML-specific identifiers are used then
## only for expert use
## NOT RUN
gdf.ne <- parseKGML2DataFrame(sfile, expandGenes=FALSE)
dim(gdf.ne)
head(gdf.ne)
## NOT RUN
This function is a wrapper for parseKGML and KEGGpathway2Graph. It takes two actions: first it reads in the KGML file and parses it into an object of KEGGPathway-class, then it calls the KEGGpathway2Graph function to return the graph model.
parseKGML2Graph(file, ...)
file |
Name of KGML file |
... |
other parameters passed to KEGGpathway2Graph, see
|
Note that groups of genes will be split into single genes by calling the KEGGpathway2Graph function. Edges connected to groups will be duplicated to connect each member of the group.
A graph object.
Jitao David Zhang mailto:[email protected]
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gR <- parseKGML2Graph(sfile,expandGenes=TRUE)
gR
The function does several tasks implemented in the KEGGgraph package in sequence to make expanding maps easier.
parseKGMLexpandMaps(file, downloadmethod = "auto", genesOnly = TRUE, localdir,...)
file |
A KGML file |
downloadmethod |
passed to |
genesOnly |
logical, should only the genes nodes remain in the returned graph object? |
localdir |
character string, if specified, the function tries to read files with the same base name from a local directory, useful when there are file copies on the client. |
... |
Other parameters passed to |
KEGG pathways often contain ('cross-link') other pathways; for example, see https://www.genome.jp/kegg/pathway/hsa/hsa04115.html, where the p53 signalling pathway contains two other pathways, 'apoptosis' and 'cell cycle'. This function parses these pathways (referred to as 'maps' in the KGML manual), downloads their KGML files from the KEGG REST API, parses them individually, and merges all the child pathway graphs with the parent pathway into one graph object. The graph is returned as the function value.
Since different graphs do not have unique node identifiers unless the genes are expanded, the user has to expand the genes when using this function. Another disadvantage is that, due to the current implementation, the KEGGnodeData and KEGGedgeData are lost during the merging. This will probably change in a future version.
A directed graph object
Jitao David Zhang [email protected]
KGML Document manual https://www.genome.jp/kegg/docs/xml/
For most users it is enough to use parseKGML2Graph.
The function parses the information of the given pathway from KGML files into an object of KEGGPathwayInfo-class. It is used internally and is not expected to be called by users directly.
parsePathwayInfo(root)
root |
Root element of the KGML file |
An object of KEGGPathwayInfo-class
Jitao David Zhang mailto:[email protected]
KGML Document Manual https://www.genome.jp/kegg/docs/xml/
The function parses 'reaction' elements in KGML files. It is used internally and is not expected to be called by users.
parseReaction(reaction)
reaction |
A node of the type 'reaction' in KGML files |
See the reference manual for more information about 'reaction' type
An object of KEGGReaction-class
Jitao David Zhang mail:[email protected]
KGML Document Manual https://www.genome.jp/kegg/docs/xml/
RELATION elements in KGML files record the binary relationships between ENTRY elements, corresponding to (directed) edges in a graph. The 'parseRelation' function parses RELATION elements from KGML files into KEGGEdge-class objects. It is not expected to be called directly by the user.
parseRelation(relation)
relation |
XML node of KGML file |
See https://www.genome.jp/kegg/docs/xml/ for more details about 'relation' as well as other elements in KGML files.
An object of KEGGEdge-class.
Jitao David Zhang <[email protected]>
https://www.genome.jp/kegg/docs/xml/
The function parses the subtype of a KGML relation. It is called internally and is not intended to be used by end users.
parseSubType(subtype)
subtype |
KGML subtype node |
An object of KEGGEdgeSubType-class
Jitao David Zhang mailto:[email protected]
The function provides a simple interface to Rgraphviz to render KEGG graphs with custom styles.
KEGGgraphLegend draws the legend of KEGG graphs.
plotKEGGgraph(graph, y = "neato", shortLabel = TRUE, useDisplayName=TRUE, nodeRenderInfos, ...)
KEGGgraphLegend()
graph |
A KEGG graph, by calling |
y |
the layout method, |
shortLabel |
logical, should short labels be used instead of full node names? |
useDisplayName |
logical, should the node labels be rendered as the 'display name' specified in the KGML file, or simply as the node names? |
nodeRenderInfos |
List of node rendering info |
... |
Other functions passed to renderGraph, not implemented for now |
Users are not restricted to this function, alternatively you can choose other rendering functions.
The graph after layout and rendering is returned.
Jitao David Zhang mailto:[email protected]
opar <- par(ask=TRUE)
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gR <- parseKGML2Graph(sfile,expandGenes=TRUE)
subs <- c("hsa:1432",edges(gR)$`hsa:1432`, "hsa:5778","hsa:5801","hsa:84867",
          "hsa:11072","hsa:5606","hsa:5608", "hsa:5494","hsa:5609")
gR.sub <- subGraph(subs, gR)
if(require(Rgraphviz))
  plotKEGGgraph(gR.sub)
KEGGgraphLegend()
par(opar)
P-value thresholds of 0.05, 0.01 and 0.001 correspond to one, two and three asterisks, respectively. If 'sig.1' is set to TRUE, the threshold of 0.1 returns '.'.
pvalue2asterisk(pvalues, sig.1 = FALSE)
pvalues |
A numeric value |
sig.1 |
logical, whether the significance sign of 0.1 should be returned |
A character string containing the signs
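The threshold rule can be written down as a short base-R sketch. The function `p2stars` is a hypothetical re-implementation for illustration, not the package's code; in particular, the strict inequalities at the thresholds and the empty string returned for non-significant values are assumptions.

```r
# Hypothetical sketch of the asterisk rule for a single p-value.
# Assumptions: thresholds are exclusive, non-significant values give "".
p2stars <- function(p, sig.1 = FALSE) {
  if (p < 0.001) return("***")
  if (p < 0.01)  return("**")
  if (p < 0.05)  return("*")
  if (sig.1 && p < 0.1) return(".")
  ""
}

p2stars(0.03)
p2stars(0.007)
p2stars(3e-5)
p2stars(0.07, sig.1 = TRUE)
```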
Jitao David Zhang mailto:[email protected]
pvalue2asterisk(0.03)
pvalue2asterisk(0.007)
pvalue2asterisk(3e-5)
pvalue2asterisk(0.55)
Given a list of genes (identified by Entrez GeneID), the function subsets the given KEGG graph to the subgraph that has these genes as nodes, maintaining all the edges between them.
queryKEGGsubgraph(geneids, graph, organism = "hsa", addmissing = FALSE)
geneids |
A vector of Entrez GeneIDs |
graph |
A KEGG graph |
organism |
a three-letter organism code |
addmissing |
logical, in case the given gene is not found in the graph, should it be added as single node to the subgraph? |
This function answers questions like 'How do the genes in this list interact with each other in the context of pathways?'
Limited by translateKEGGID2GeneID, this function supports only human for now. We are working to include other organisms.
If 'addmissing' is set to TRUE, genes in the given list that are missing from the graph will be added to the returned subgraph as single nodes.
A subgraph with nodes representing genes and edges representing interactions.
Jitao David Zhang <[email protected]>
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gR <- parseKGML2Graph(sfile,expandGenes=TRUE)
geneids <- c(5594, 5595, 6197, 5603, 1843,5530, 5603)
sub <- queryKEGGsubgraph(geneids, gR)
if(require(Rgraphviz) && interactive()) {
  plot(sub, "neato")
}
## add missing nodes
list2 <- c(geneids, 81029)
sub2 <- queryKEGGsubgraph(list2, gR,addmissing=TRUE)
if(require(Rgraphviz) && interactive()) {
  plot(sub2, "neato")
}
The function is intended to be a test tool. It subsets the given graph repeatedly.
randomSubGraph(graph, per = 0.25, N = 10)
graph |
A graph object |
per |
numeric, the percentage of the nodes to be sampled, value between (0,1) |
N |
Number of repetitions |
The function is called for its side effect; NULL is returned.
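The repeated sampling can be pictured with a base-R sketch that only draws node names. This is not the package's code: the names `nodes` and `draws` are hypothetical, and rounding `per` times the number of nodes with `round` is an assumption about how the sample size is derived.

```r
set.seed(1)  # make the random draws reproducible

nodes <- c("Hamburg", "Dortmund", "Bremen", "Paris")
per <- 0.5   # fraction of nodes to sample
N <- 10      # number of repetitions

# Assumed sample size: per * number of nodes, rounded
size <- round(per * length(nodes))

# Draw N random node subsets; each would define one subgraph
draws <- replicate(N, sample(nodes, size = size), simplify = FALSE)
draws[[1]]
```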
Jitao David Zhang mailto:[email protected]
tnodes <- c("Hamburg","Dortmund","Bremen", "Paris")
tedges <- list("Hamburg"=c("Dortmund", "Bremen"),
               "Dortmund"=c("Hamburg"),
               "Bremen"=c("Hamburg"),
               "Paris"=c())
tgraph <- new("graphNEL", nodes = tnodes, edgeL = tedges)
randomSubGraph(tgraph, 0.5, 10)
The function splits 'group' entries in KGML files. In most cases they are complexes. During the splitting the function copies the edges between groups and nodes (or between groups and groups) correspondingly, so that the existing edges also exist after the groups are split.
splitKEGGgroup(pathway)
pathway |
An object of |
By default the groups (complexes) in KEGG pathways are split.
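How edges are copied when a group is split can be illustrated in base R on a toy edge list. The group `G`, its members and the edges are made up, and the helper `expand_id` is hypothetical; the sketch shows the idea of duplicating edges for each group member, not the package's implementation.

```r
# A toy group (complex) G with two members, and two edges (hypothetical data)
groups <- list(G = c("A", "B"))
edges <- data.frame(from = c("G", "C"), to = c("C", "D"),
                    stringsAsFactors = FALSE)

# Replace a group identifier by its members; leave other nodes unchanged
expand_id <- function(id) if (id %in% names(groups)) groups[[id]] else id

# For every edge, connect all expansions of 'from' to all expansions of 'to'
split_edges <- do.call(rbind, lapply(seq_len(nrow(edges)), function(i) {
  expand.grid(from = expand_id(edges$from[i]),
              to   = expand_id(edges$to[i]),
              stringsAsFactors = FALSE)
}))
split_edges
```

The edge G -> C becomes A -> C and B -> C, while C -> D is kept as it is.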
An object of KEGGPathway-class
Jitao David Zhang mailto:[email protected]
KGML Manual https://www.genome.jp/kegg/docs/xml/
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
kegg.pathway <- parseKGML(sfile)
kegg.pathway.split <- splitKEGGgroup(kegg.pathway)
## compare the different number of edges
length(edges(kegg.pathway))
length(edges(kegg.pathway.split))
The function subsets KEGG graph by node types, mostly used in extracting gene networks.
subGraphByNodeType(graph, type = "gene", kegg=TRUE)
graph |
A KEGG graph object produced by calling |
type |
node type, see |
kegg |
logical, should the KEGG Node and Edge attributes be maintained during the subsetting? By default set to 'TRUE' |
A subgraph of the original graph
Jitao David Zhang mailto:[email protected]
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
sGraph <- parseKGML2Graph(sfile,expandGenes=TRUE, genesOnly=FALSE)
sGraphGene <- subGraphByNodeType(sGraph, type="gene")
subKEGGgraph extends generic method subGraph and subsets the KEGG graph. Both 'subKEGGgraph' and 'subGraph' can be used to subset the graph; the difference lies in whether the node and edge attributes from KEGG are also subset (subKEGGgraph) or not (subGraph).
See details below.
subKEGGgraph(nodes, graph)
nodes |
Node names to subset |
graph |
A graph parsed from KGML files, produced by
|
subGraph does not subset the node or edge attributes, hence the results of getKEGGnodeData and getKEGGedgeData do not map to the nodes and edges in the subgraph in a one-to-one manner: attributes of removed nodes and edges still remain in the subgraph.
subKEGGgraph calls subGraph first to subset the graph, and then also subsets the KEGGnodeData and KEGGedgeData so that they are one-to-one mapped to the nodes and edges in the subgraph.
A graph with nodeData and edgeData.
Jitao David Zhang mailto:[email protected]
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gR <- parseKGML2Graph(sfile,expandGenes=TRUE)
subs <- c("hsa:1432",edges(gR)$`hsa:1432`,"hsa:5778","hsa:5801",
          "hsa:84867","hsa:11072","hsa:5606","hsa:5608","hsa:5494","hsa:5609")
gR.keggsub <- subKEGGgraph(subs, gR)
gR
gR.keggsub
To render KEGG pathway graphs, we have created a custom style of edges to represent their subtypes. 'subtypeDisplay' extracts this information.
A KEGG graph
An object of KEGGEdge-class
An object of KEGGEdgeSubType-class
The function translates the KEGG graph into a graph of equivalent topology that is indexed with unique identifiers given by the user. The new identifiers could be, for example, gene symbols or other identifiers mapped to KEGGID.
translateKEGGgraph(graph, newNodes)
graph |
A KEGG graph |
newNodes |
A character vector giving the new nodes; it must be of the same length and in the same order as the nodes of the given graph |
The function is still experimental and users are welcome to report any difficulties.
Another graph indexed by the given identifier
Jitao David Zhang <[email protected]>
sfile <- system.file("extdata/hsa04010.xml",package="KEGGgraph")
gR <- parseKGML2Graph(sfile,expandGenes=TRUE)
subG <- subKEGGgraph(c("hsa:1848","hsa:1432","hsa:2002","hsa:8986"),gR)
symbols <- c("DUSP6","MAPK14","ELK1","RPS6KA4")
sub2G <- translateKEGGgraph(subG, symbols)
sub2G
nodes(sub2G)
if(require(Rgraphviz) && interactive()) {
  plot(sub2G, "neato")
}
translateKEGGID2GeneID translates KEGGID to NCBI Entrez Gene ID, and translateGeneID2KEGGID translates Entrez Gene ID back to KEGGID.
translateKEGGID2GeneID(x, organism="hsa")
translateGeneID2KEGGID(x, organism="hsa")
x |
KEGGID, e.g. 'hsa:1432', or Entrez Gene ID, e.g. '1432' |
organism |
Three-letter code for organisms. The mapping between organisms and codes can be found at https://www.genome.jp/kegg/kegg3.html |
KEGGIDs are unique identifiers used by KEGG PATHWAY to identify gene products. After the KEGG pathway is parsed into a graph, the graph uses KEGGIDs as its node names.
translateKEGGID2GeneID converts KEGGIDs into Entrez GeneIDs, which can be translated to other types of identifiers, for example with the biomaRt package or organism-specific annotation packages. See the vignette for examples.
translateKEGG2GeneID is maintained for backward compatibility and wraps translateKEGGID2GeneID.
Entrez GeneID of the given KEGG ID(s)
So far this function works only with human KEGGIDs, since for them the Entrez GeneID can be derived easily by removing the organism prefix.
A fully functional version will be implemented in a later release of the package.
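The prefix rule for human identifiers amounts to simple string handling, sketched here in base R. The helpers `kegg_to_entrez` and `entrez_to_kegg` are hypothetical names used for illustration only; they are not the package's functions and assume the 'organism:ID' layout described above.

```r
# Hypothetical helpers illustrating the prefix rule (not the package's code)
kegg_to_entrez <- function(x, organism = "hsa") {
  # drop a leading "hsa:" (or other organism prefix) from each identifier
  sub(paste0("^", organism, ":"), "", x)
}
entrez_to_kegg <- function(x, organism = "hsa") {
  # prepend the organism prefix again
  paste0(organism, ":", x)
}

ids <- kegg_to_entrez(c("hsa:1432", "hsa:11072"))
back <- entrez_to_kegg(ids)
ids
back
```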
Jitao David Zhang
egNodes <- c("hsa:1432", "hsa:11072")
translateKEGGID2GeneID(egNodes)
translateGeneID2KEGGID("1432")