Title: | R wrapper for the OMA REST API |
---|---|
Description: | A package for the orthology prediction data download from OMA database. |
Authors: | Klara Kaleb |
Maintainer: | Klara Kaleb <[email protected]>, Adrian Altenhoff <[email protected]> |
License: | GPL-3 |
Version: | 2.23.0 |
Built: | 2024-10-30 08:28:16 UTC |
Source: | https://github.com/bioc/OmaDB |
OmaDB is a wrapper for the REST API for the Orthologous MAtrix project (OMA) which is a database for the inference of orthologs among complete genomes. For more details on the OMA project, see https://omabrowser.org/.
The package contains a range of functions that are used to query the database. Some of the main functions are listed below:
getProtein()
getHOG()
getOMAGroup()
getGenomePairs()
getTaxonomy()
mapSequence()
annotateSequence()
searchProtein()
In addition to these, OmaDB features a range of functions that are used to format the retrieved data into some commonly used Bioconductor objects using packages such as GenomicRanges, Biostrings, topGO and ggtree. Some of them are listed below:
formatTopGO()
getGRanges()
The above functions are described in more detail in the package vignette's listed below:
Get started with OmaDB
Exploring Hierarchical orthologous groups with OmaDB
Exploring Taxonomic trees with OmaDB
Sequence Analysis with OmaDB
The function to obtain further information from a given url.
## S3 method for class 'omadb_obj' x$name
## S3 method for class 'omadb_obj' x$name
x |
object |
name |
attribute |
API response behind the URL
This function obtain Gene Ontology annotation for a given sequence that does not need to exist in the OMA Browser so far. The query sequence will analysed and a fast homology detection approach based on kmers will be used to detect the closest sequences in OMA. GO annotations for these top hits will be used to annotated the query sequence.
annotateSequence(query)
annotateSequence(query)
query |
the sequence to be annotated, it can be either a string or an AAString object from the Biostrings package |
a data.frame containing the the GO annotation information of the most similar protein to the query sequence
annotateSequence(query='MNDPSLLGYPNVGPQQQQQQQQQQHAGLLGKGTPNALQQQLHMNQLTGIPPPGLMNNSDVHTSSNNNSRQLLDQLANGNANMLNMNMDNNNNNNNNNNNNNNNGGGSGVMMNASTAAVNSIGMVPTVGTPVNINVNASNPLLHPHLDDPSLLNNPIWKLQLHLAAVSAQSLGQPNIYARQNAMKKYLATQQAQQAQQQAQQQAQQQVPGPFGPGPQAAPPALQPTDFQQSHIAEASKSLVDCTKQALMEMADTLTDSKTAKKQQPTGDSTPSGTATNSAVSTPLTPKIELFANGKDEANQALLQHKKLSQYSIDEDDDIENRMVMPKDSKYDDQLWHALDLSNLQIFNISANIFKYDFLTRLYLNGNSLTELPAEIKNLSNLRVLDLSHNRLTSLPAELGSCFQLKYFYFFDNMVTTLPWEFGNLCNLQFLGVEGNPLEKQFLKILTEKSVTGLIFYLRDNRPEIPLPHERRFIEINTDGEPQREYDSLQQSTEHLATDLAKRTFTVLSYNTLCQHYATPKMYRYTPSWALSWDYRRNKLKEQILSYDSDLLCLQEVESKTFEEYWVPLLDKHGYTGIFHAKARAKTMHSKDSKKVDGCCIFFKRDQFKLITKDAMDFSGAWMKHKKFQRTEDYLNRAMNKDNVALFLKLQHIPSGDTIWAVTTHLHWDPKFNDVKTFQVGVLLDHLETLLKEETSHNFRQDIKKFPVLICGDFNSYINSAVYELINTGRVQIHQEGNGRDFGYMSEKNFSHNLALKSSYNCIGELPFTNFTPSFTDVIDYIWFSTHALRVRGLLGEVDPEYVSKFIGFPNDKFPSDHIPLLARFEFMKTNTGSKKV')
annotateSequence(query='MNDPSLLGYPNVGPQQQQQQQQQQHAGLLGKGTPNALQQQLHMNQLTGIPPPGLMNNSDVHTSSNNNSRQLLDQLANGNANMLNMNMDNNNNNNNNNNNNNNNGGGSGVMMNASTAAVNSIGMVPTVGTPVNINVNASNPLLHPHLDDPSLLNNPIWKLQLHLAAVSAQSLGQPNIYARQNAMKKYLATQQAQQAQQQAQQQAQQQVPGPFGPGPQAAPPALQPTDFQQSHIAEASKSLVDCTKQALMEMADTLTDSKTAKKQQPTGDSTPSGTATNSAVSTPLTPKIELFANGKDEANQALLQHKKLSQYSIDEDDDIENRMVMPKDSKYDDQLWHALDLSNLQIFNISANIFKYDFLTRLYLNGNSLTELPAEIKNLSNLRVLDLSHNRLTSLPAELGSCFQLKYFYFFDNMVTTLPWEFGNLCNLQFLGVEGNPLEKQFLKILTEKSVTGLIFYLRDNRPEIPLPHERRFIEINTDGEPQREYDSLQQSTEHLATDLAKRTFTVLSYNTLCQHYATPKMYRYTPSWALSWDYRRNKLKEQILSYDSDLLCLQEVESKTFEEYWVPLLDKHGYTGIFHAKARAKTMHSKDSKKVDGCCIFFKRDQFKLITKDAMDFSGAWMKHKKFQRTEDYLNRAMNKDNVALFLKLQHIPSGDTIWAVTTHLHWDPKFNDVKTFQVGVLLDHLETLLKEETSHNFRQDIKKFPVLICGDFNSYINSAVYELINTGRVQIHQEGNGRDFGYMSEKNFSHNLALKSSYNCIGELPFTNFTPSFTDVIDYIWFSTHALRVRGLLGEVDPEYVSKFIGFPNDKFPSDHIPLLARFEFMKTNTGSKKV')
The function to create a list of GO annotations that is compatible with topGO from protein objects in roma
formatTopGO(geneList, format)
formatTopGO(geneList, format)
geneList |
the list of OmaDB protein objects or a dataframe of ontologies to be included in the analysis - this is where the GO annotations are extracted from. |
format |
format for the data to be returned in - either 'GO2geneID' or 'geneID2GO' |
a list containing the GO2geneID or geneID2GO information
geneList = list(getProtein(id='YEAST01'),getProtein(id='YEAST03')) annotations = formatTopGO(geneList,format='geneID2GO')
geneList = list(getProtein(id='YEAST01'),getProtein(id='YEAST03')) annotations = formatTopGO(geneList,format='geneID2GO')
The function to obtain the value for an object attribute.
getAttribute(obj, attribute)
getAttribute(obj, attribute)
obj |
the object of interest |
attribute |
the attribute of interest |
an value for a given object attribute
members = getAttribute(getOMAGroup(id ='YEAST58'),'members')
members = getAttribute(getOMAGroup(id ='YEAST58'),'members')
This function obtains the basic information for one specific genome available on the OMA Browser, or - if no id is provided - a dataframe with all available genomes.
getGenome(id = NULL, attribute = NULL)
getGenome(id = NULL, attribute = NULL)
id |
A genome identifier. By default, all available genomes will be returned. |
attribute |
An extra attribute to be returned (proteins) |
Ids can be either the scientific name of a species, the NCBI taxonomy id or the UniProtKB mnemonic species code.
The optional argument attribute can be used to directly load the proteins belonging to the genome. Alternatively, you can access the proteins attribute of the result which will transparently load the proteins from the OMA Browser.
an object containing the JSON keys as attributes or a dataframe
getGenome() getGenome(id='HUMAN') getGenome(id=9606) getGenome(id='HUMAN',attribute='proteins')
getGenome() getGenome(id='HUMAN') getGenome(id=9606) getGenome(id='HUMAN',attribute='proteins')
This function retrieves the pairwise relations among two genomes from the OMA Browser database. The relations are orthologs in case the genomes are different and "close paralogs" and "homoeologs" in case they are the same.
getGenomePairs(genome_id1, genome_id2, chr1 = NULL, chr2 = NULL, rel_type = NULL, ...)
getGenomePairs(genome_id1, genome_id2, chr1 = NULL, chr2 = NULL, rel_type = NULL, ...)
genome_id1 |
an identifier for the first genome, which can be either its taxon id or UniProt species code |
genome_id2 |
an an identifier for the second genome, which can be either its taxon id or UniProt species code |
chr1 |
the chromosome of interest for the first genome |
chr2 |
the chromosome of interest for the second genome |
rel_type |
the pairs relationship type |
... |
qwargs |
By using the parameters chr1 and chr2, one can limit the relations
to a certain chromosome for one or both genomes. The id of the
chromosome corresponds to the chromosome ids from the
getGenome
result.
The rel_type parameter further limits the returned relations to a specific subtype of orthologs (i.e. "1:1", "1:n", "m:1", "m:n") or - within a genome to either "close paralogs" or "homeologs".
a dataframe containing information about both the entries in the orthologous pair and their relationship
getGenomePairs(genome_id1='YEAST',genome_id2='ASHGO')
getGenomePairs(genome_id1='YEAST',genome_id2='ASHGO')
The function retrieves a specific Hierarchical Orthologous Group (HOG) from the OMA Browser database. A HOG is a set of genes that have all decendet from a single ancestral gene at a specific taxonomic level.
getHOG(id, level = NULL, members = FALSE)
getHOG(id, level = NULL, members = FALSE)
id |
an identifier for the HOG to be returned - either its HOG ID or a protein id. |
level |
a specific level for the HOG to be restricted to. level can either be 'root', or the name of a taxonomic level that is part of the HOG, e.g. 'Fungi'. By default it will retrieve the depest level of the most specific subhog for the given ID. |
members |
boolean that when set to TRUE returns a dataframe containg the protein members at a given hog level |
A HOG can be identified by its member proteins and a taxonomic level, or a HOG ID. As a taxonomic level, you can use either 'root' to retrieve the HOG at its deepest level, or the name of NCBI taxonomy level, or leave it out in which case the deepest level that doesn't include a duplication node is used.
The function either returns a single hog object or a list of hog objects. The later happens if the HOG ID you provide has already split into several sub-hogs at the level you indicate.
an object containing HOG attributes, or a list of those
getHOG(id = 'YEAST590') getHOG(id = 'YEAST590', level='root') getHOG(id = 'YEAST590', level='Saccharomycetaceae', members=TRUE)
getHOG(id = 'YEAST590') getHOG(id = 'YEAST590', level='root') getHOG(id = 'YEAST590', level='Saccharomycetaceae', members=TRUE)
Function to obtain loci in genomic range format for a given list of proteins
getLocus(proteins)
getLocus(proteins)
proteins |
the dataframe or a list of dataframes containing the protein data of interest. this can either be the members df or a list of protein ids. |
genomic range object from the GenomicRanges package in Bioconductor
loci = getLocus(proteins = getOMAGroup('YEAST58')['members'])
loci = getLocus(proteins = getOMAGroup('YEAST58')['members'])
The function to obtain the attributes and their data types for the object created.
getObjectAttributes(obj)
getObjectAttributes(obj)
obj |
the object of interest |
an list of object attributes and their data classes
attributes = getObjectAttributes(getOMAGroup(id ='YEAST58'))
attributes = getObjectAttributes(getOMAGroup(id ='YEAST58'))
This function obtains an OMA Group from the OMA Browser database. An OMA Group is defined to be a clique of proteins that are all orthologous to each other, i.e. they are all related through speciation events only. An OMA Group can thus by definition not contain any inparalogs. It is a very stringent orthology grouping approach. OMA Groups are mostly useful to infer phylogenetic species tree where they can be used as marker genes.
getOMAGroup(id, attribute = NULL)
getOMAGroup(id, attribute = NULL)
id |
An identifier for the group. See above for possible types of IDs. |
attribute |
an extra attribute to be returned (close_groups) |
Retrieving an OMA Group can be done using a group nr as id, its fingerprint (a 7mer AA sequence which is unique to proteins in that group), a member protein id or any sequence pattern that is unique to the group.
an object containing the JSON keys as attributes or a dataframe
getOMAGroup(id='58') getOMAGroup(id='P12345') getOMAGroup(id='NNRRGRI') getOMAGroup(id='58', attribute='close_groups')
getOMAGroup(id='58') getOMAGroup(id='P12345') getOMAGroup(id='NNRRGRI') getOMAGroup(id='58', attribute='close_groups')
This function enables to retrieve information on one or several proteins from the OMA Browser database.
getProtein(id, attribute = NULL)
getProtein(id, attribute = NULL)
id |
Identifier(s) for the entry or entries to be returned. a character string if single entry or a vector if multiple. |
attribute |
Instead of the protein, return the attribute property of the protein. Attriute needs to be one of 'domains', 'orthologs', 'gene_ontology', 'locus', or 'homoeologs'. |
In its simplest form the function returns the base data of the query protein. The query protein can be selected with any unique id, for example with a UniProtKB accession (P12345), an OMA id (YEAST00012), or a RefSeq id (NP_001226). To retrieve more than one protein, you should pass a vector of IDs.
Non-scalar properties of proteins such as their domains, GO annotations, orthologs or homeologs will get loaded upon accessing them, or if you only need this information you can set the attribute parameter to the property name and retrieve this information directly.
An object containing the JSON keys as attributes or a dataframe containing the non-scalar protein property.
For non-unique non-unique IDs or partial ID lookup, use
searchProtein
instead.
getProtein(id='YEAST00001') getProtein(id='YEAST00001', attribute='orthologs') getProtein(id=c('YEAST00001','YEAST00002','YEAST00012')) getProtein(id=c('YEAST00001','YEAST00002','YEAST00012'), attribute='gene_ontology')
getProtein(id='YEAST00001') getProtein(id='YEAST00001', attribute='orthologs') getProtein(id=c('YEAST00001','YEAST00002','YEAST00012')) getProtein(id=c('YEAST00001','YEAST00002','YEAST00012'), attribute='gene_ontology')
The function to obtain the taxonomic tree from the database in the newick format that can be plugged into phylo.io for visualisation.
getTaxonomy(root = NULL, members = NULL, newick = TRUE)
getTaxonomy(root = NULL, members = NULL, newick = TRUE)
root |
optional parameter, the root of the node of interest |
members |
optional parameter, list of member ncbi taxon or UniProt IDs that should be included in the induced taxonomy. |
newick |
optional parameter, boolean default set to TRUE |
an object containing the JSON keys as attributes
getTaxonomy() getTaxonomy(members='YEAST,ASHGO') getTaxonomy(root='Alveolata')
getTaxonomy() getTaxonomy(members='YEAST,ASHGO') getTaxonomy(root='Alveolata')
The function to create a topGO object containing the GO annotations for the given protein list.
getTopGO(annotations, format, foregroundGenes, ontology)
getTopGO(annotations, format, foregroundGenes, ontology)
annotations |
list of GO annoatations obtained from the formatTopGO() |
format |
Format for the data to be returned in - either 'GO2geneID' or 'geneID2GO' |
foregroundGenes |
List of identifiers for the foreground genes |
ontology |
The ontology for which the enrichment should be done. This parameter is passed directly to the topGOdata constructor. |
topGO object
geneList = list(getProtein(id='YEAST58'),getProtein(id='YEAST00059')) annotations = formatTopGO(geneList,format='geneID2GO') library(topGO) getTopGO(annotations, foregroundGenes = list('YEAST00058'), format = 'geneID2GO', ontology = 'BP')
geneList = list(getProtein(id='YEAST58'),getProtein(id='YEAST00059')) annotations = formatTopGO(geneList,format='geneID2GO') library(topGO) getTopGO(annotations, foregroundGenes = list('YEAST00058'), format = 'geneID2GO', ontology = 'BP')
A convenience function to obtain a tree object from newick tree, essentially wraps read.tree from the ape package.
getTree(newick)
getTree(newick)
newick |
The newick tree to be instantiated. |
a tree object
taxonomy = getTaxonomy(root='Alveolata') getTree(newick=taxonomy$newick)
taxonomy = getTaxonomy(root='Alveolata') getTree(newick=taxonomy$newick)
The function to obtain the API and database version that the package is using.
getVersion()
getVersion()
S3 object
getVersion()
getVersion()
An object containing information for the OMA group number 737636.
group
group
An S3 object with 4 variables:
group number, not stable across releases
fingerprint of the oma group, stable across releases
url to the endpoint containing the list of oma groups that share some of the orthologs with this oma group
list of protein members of this oma group
...
https://omabrowser.org/api/group/YEAST58/
An object containing information for the HOG:0273533.1b.
hog
hog
An S3 object with 8 variables:
hog identifier
the taxonomic level of this hog
url pointer to the hog information at a given level
url pointer to the list of gene members for this hog
a dataframe object containing the rest of the taxonomic levels in this hog
the root taxonomic level of this hog
a dataframe containing information on the parent hogs to the current hogs
a dataframe containing information on the children hogs to the current hogs
...
https://omabrowser.org/api/hog/HOG:0273533.1b/
The function to identify a sequence.
mapSequence(query, search = NULL, full_length = FALSE)
mapSequence(query, search = NULL, full_length = FALSE)
query |
the sequence to be searched, it can be either a string or an AAString object from the Biostrings package |
search |
argument to choose search strategy. Can be set to 'exact', 'approximate' or 'mixed'. Defaults to 'mixed', meaning first tries to find exact match. If no target can be found, uses approximate search strategy to identify query sequence in database. |
full_length |
a boolean indicating whether or not for exact matches, the query sequence must be matching the full target sequence. By default, a partial exact match is also reported as exact match. |
a data.frame containing the information of matches for the query sequence
mapSequence(query='MNDPSLLGYPNVGPQQQQQQQQQQHAGLLGKGTPNALQQQLHMNQLTGIPPPGLMNNSDVHTSSNNNSRQLLDQLANGNANMLNMNMDNNNNNNNNNNNNNNNGGGSGVMMNASTAAVNSIGMVPTVGTPVNINVNASNPLLHPHLDDPSLLNNPIWKLQLHLAAVSAQSLGQPNIYARQNAMKKYLATQQAQQAQQQAQQQAQQQVPGPFGPGPQAAPPALQPTDFQQSHIAEASKSLVDCTKQALMEMADTLTDSKTAKKQQPTGDSTPSGTATNSAVSTPLTPKIELFANGKDEANQALLQHKKLSQYSIDEDDDIENRMVMPKDSKYDDQLWHALDLSNLQIFNISANIFKYDFLTRLYLNGNSLTELPAEIKNLSNLRVLDLSHNRLTSLPAELGSCFQLKYFYFFDNMVTTLPWEFGNLCNLQFLGVEGNPLEKQFLKILTEKSVTGLIFYLRDNRPEIPLPHERRFIEINTDGEPQREYDSLQQSTEHLATDLAKRTFTVLSYNTLCQHYATPKMYRYTPSWALSWDYRRNKLKEQILSYDSDLLCLQEVESKTFEEYWVPLLDKHGYTGIFHAKARAKTMHSKDSKKVDGCCIFFKRDQFKLITKDAMDFSGAWMKHKKFQRTEDYLNRAMNKDNVALFLKLQHIPSGDTIWAVTTHLHWDPKFNDVKTFQVGVLLDHLETLLKEETSHNFRQDIKKFPVLICGDFNSYINSAVYELINTGRVQIHQEGNGRDFGYMSEKNFSHNLALKSSYNCIGELPFTNFTPSFTDVIDYIWFSTHALRVRGLLGEVDPEYVSKFIGFPNDKFPSDHIPLLARFEFMKTNTGSKKV') mapSequence(search='mixed',query='NKLLQPTDFQQSHIAEASKSLVDCTKQALMEMADTLTDSKTAKKQQPTGDSTPSGTATNSAVSTPLTPKIELFANGKDEANQALLQHKKLSQYSIDEDDDIENRMVMPKDSKYDDQLWHALDLSNLQIFNISANIFKYDFLTRLYLNGNSLTELPAEIKNLSNLRVLDLSHNRLTSLPAELGSCFQLKYFYFFDNMVTTLPWEFGNLCNLQFLGVEGNPLEKQFLKILTEKSVTGLIFYLRDNRPEIPLPHERRFIEINTDGEPQREYDSLQQSTEHLATDLAKRTFTVLSYNTLCQHYATPKMYRYTPSWALSWDYRRNKLKEQILSYDSDLLCLQEVESKTFEEYWVPLLDKHGYTGIFHAKARAKTMHSKDSKKVDGCCIFFKRDQFKLITKDAMDFSGAWMKHKKFQRTEDYLNRAMNKDNVALFLKLQHIPSGDTIWAVTTHLHWDPKFNDVKTFQVGVLLDHLETLLKEETSHNFRQDIKKFPVLICGDFNSYINSAVYELINTGRVQIHQEGNGRDFGYMSEKNFSHNLALKSSYNCIGELPFTNFTPSFTDVIDYIWFSTHALRVRGLLGEVDPEYVSKFIGFPNDKFPSDHIPLLARFEFMKTNTGSKKV')
mapSequence(query='MNDPSLLGYPNVGPQQQQQQQQQQHAGLLGKGTPNALQQQLHMNQLTGIPPPGLMNNSDVHTSSNNNSRQLLDQLANGNANMLNMNMDNNNNNNNNNNNNNNNGGGSGVMMNASTAAVNSIGMVPTVGTPVNINVNASNPLLHPHLDDPSLLNNPIWKLQLHLAAVSAQSLGQPNIYARQNAMKKYLATQQAQQAQQQAQQQAQQQVPGPFGPGPQAAPPALQPTDFQQSHIAEASKSLVDCTKQALMEMADTLTDSKTAKKQQPTGDSTPSGTATNSAVSTPLTPKIELFANGKDEANQALLQHKKLSQYSIDEDDDIENRMVMPKDSKYDDQLWHALDLSNLQIFNISANIFKYDFLTRLYLNGNSLTELPAEIKNLSNLRVLDLSHNRLTSLPAELGSCFQLKYFYFFDNMVTTLPWEFGNLCNLQFLGVEGNPLEKQFLKILTEKSVTGLIFYLRDNRPEIPLPHERRFIEINTDGEPQREYDSLQQSTEHLATDLAKRTFTVLSYNTLCQHYATPKMYRYTPSWALSWDYRRNKLKEQILSYDSDLLCLQEVESKTFEEYWVPLLDKHGYTGIFHAKARAKTMHSKDSKKVDGCCIFFKRDQFKLITKDAMDFSGAWMKHKKFQRTEDYLNRAMNKDNVALFLKLQHIPSGDTIWAVTTHLHWDPKFNDVKTFQVGVLLDHLETLLKEETSHNFRQDIKKFPVLICGDFNSYINSAVYELINTGRVQIHQEGNGRDFGYMSEKNFSHNLALKSSYNCIGELPFTNFTPSFTDVIDYIWFSTHALRVRGLLGEVDPEYVSKFIGFPNDKFPSDHIPLLARFEFMKTNTGSKKV') mapSequence(search='mixed',query='NKLLQPTDFQQSHIAEASKSLVDCTKQALMEMADTLTDSKTAKKQQPTGDSTPSGTATNSAVSTPLTPKIELFANGKDEANQALLQHKKLSQYSIDEDDDIENRMVMPKDSKYDDQLWHALDLSNLQIFNISANIFKYDFLTRLYLNGNSLTELPAEIKNLSNLRVLDLSHNRLTSLPAELGSCFQLKYFYFFDNMVTTLPWEFGNLCNLQFLGVEGNPLEKQFLKILTEKSVTGLIFYLRDNRPEIPLPHERRFIEINTDGEPQREYDSLQQSTEHLATDLAKRTFTVLSYNTLCQHYATPKMYRYTPSWALSWDYRRNKLKEQILSYDSDLLCLQEVESKTFEEYWVPLLDKHGYTGIFHAKARAKTMHSKDSKKVDGCCIFFKRDQFKLITKDAMDFSGAWMKHKKFQRTEDYLNRAMNKDNVALFLKLQHIPSGDTIWAVTTHLHWDPKFNDVKTFQVGVLLDHLETLLKEETSHNFRQDIKKFPVLICGDFNSYINSAVYELINTGRVQIHQEGNGRDFGYMSEKNFSHNLALKSSYNCIGELPFTNFTPSFTDVIDYIWFSTHALRVRGLLGEVDPEYVSKFIGFPNDKFPSDHIPLLARFEFMKTNTGSKKV')
A dataframe containing information for the orthologs of protein YEAST00058.
orthologs
orthologs
A dataframe object with 15 variables:
entry number of the ortholog
oma identifier of the ortholog
canonicalid of the ortholog
sequence_md5 of the ortholog
oma_group of the ortholog
hog id of the ortholog
chromosomal location of the ortholog
start locus of the ortholog
end locus of the ortholog
locus strand of the ortholog
true/false
relationship type of the ortholog to the gene
ortholog distance
ortholog score
...
https://omabrowser.org/api/protein/YEAST00058/orthologs
A dataframe containing information for the whole genome aligment of YEAST and ASHGO.
pairs
pairs
A dataframe object with 12 variables for each member of the pair, as well some 3 additional variables:
entry number of the ortholog
oma identifier of the ortholog
canonicalid of the ortholog
sequence_md5 of the ortholog
oma_group of the ortholog
hog id of the ortholog
chromosomal location of the ortholog
start locus of the ortholog
end locus of the ortholog
locus strand of the ortholog
true/false
relationship type of the ortholog to the gene
ortholog distance
ortholog score
...
https://omabrowser.org/api/pairs/YEAST/ASHGO/
An object containing information for the YEAST00058 protein.
protein
protein
A S3 object with 23 variables:
entry number of the protein
url pointer to the protein
oma identifier of the protein
canonicalid of the protein
sequence_md5 of the protein
oma_group of the protein
hog id of the protein
chromosomal location of the protein
GRanges object with the locus information for the protein
true/false
root taxonomic level of the relevant hog
taxonomic levels of the hog in which the protein is present
length of the protein sequence
AAString of the protein sequence
DNAString of the protein sequence
url pointer to the list of protein domains
url pointer to the list of protein cross references
url pointer to the list of protein orthologs
url pointer to the list of protein homeologs
url pointer to the list of protein GO ontologies
url pointer to the protein oma group
url pointer to the protein hog members
list of url pointers to the protein isoforms
...
https://omabrowser.org/api/protein/6633022/
This function is usualy not needed by users. In most circumstances an attribute containing a URL is automatically loaded when accessed. However, in case the data is transformed into a dataframe, this will no longer be true, in which case one can access the data behind this attribute using this function.
resolveURL(url)
resolveURL(url)
url |
The url of interest |
a data.frame containing the information behind an URL
resolveURL('http://omabrowser.org/api/protein/YEAST58/gene_ontology/')
resolveURL('http://omabrowser.org/api/protein/YEAST58/gene_ontology/')
The function to list all the crossreferences that match a certain defined pattern.
searchProtein(pattern)
searchProtein(pattern)
pattern |
the pattern to query the OMA database with - needs to be at least 3 characters long |
a data.frame containing information on the cross references for a given pattern
searchProtein(pattern='MAL')
searchProtein(pattern='MAL')
An example dataframe containing GO annotations identified from a given sequence.
sequence_annotation
sequence_annotation
A dataframe with 13 variables:
qualifier of the annotation
GO term for the annotation
GO term for the annotation
evidence for the annotation
date
identified object type
identified object name
aspect
assignment of the annotation
GO term name
database
database reference
synonym
...
An example dataframe containing proteins identified from a given sequence.
sequence_map
sequence_map
A dataframe with 3 variables:
sequence that was queried
type of identification
list of protein targets identified
...
Function to set the base url to the OMA Browser API. If no url is specified, the default OMA Browser API url is used.
setAPI(url)
setAPI(url)
url |
Base url to the API |
An example newick format taxonomy object.
taxonomy
taxonomy
An S3 with 2 variables:
sequence that was queried
taxonomy newick
...
https://omabrowser.org/api/taxonomy/Alveolata/?type=newick
An example xref object.
xref
xref
A dataframe with 8 variables:
cross reference
source of the cross reference
oma database entry number
oma id of the cross reference
genome_id of the cross reference
taxon_id of the cross reference
species of the cross reference
genome url pointer of the cross reference
...
https://omabrowser.org/api/xref/?search=MAL