Title: | Storing and accessing epitranscriptomic information using the AnnotationDbi interface |
---|---|
Description: | EpiTxDb facilitates the storage of epitranscriptomic information. More specifically, it can keep track of modification identity, position, the enzyme for introducing it on the RNA, a specifier which determines the position on the RNA to be modified and the literature references each modification is associated with. |
Authors: | Felix G.M. Ernst [aut, cre] |
Maintainer: | Felix G.M. Ernst <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.19.0 |
Built: | 2024-10-30 07:21:25 UTC |
Source: | https://github.com/bioc/EpiTxDb |
The EpiTxDb class is an AnnotationDb type container for storing epitranscriptomic information.
The information is typically stored per transcript rather than as genomic coordinates, but the EpiTxDb class is agnostic to this. In the case of genomic coordinates, transcriptsBy will return modifications per chromosome.
## S4 method for signature 'EpiTxDb'
organism(object)

## S4 method for signature 'EpiTxDb'
seqinfo(x)

## S4 method for signature 'EpiTxDb'
seqlevels(x)

## S4 method for signature 'EpiTxDb'
as.list(x)
x, object | an EpiTxDb object |
For organism() and seqlevels(), a character vector.
For seqinfo(), a Seqinfo object.
For as.list(), a list.
makeEpiTxDbFromGRanges for creating an EpiTxDb object from a GRanges object and its metadata columns
makeEpiTxDbFromRMBase for creating an EpiTxDb object from RMBase online resources
makeEpiTxDbFromtRNAdb for creating an EpiTxDb object from tRNAdb online resources
makeEpiTxDb for creating an EpiTxDb object from data.frames
modifications, modificationsBy for getting epitranscriptomic modification locations
select for using the default interface of AnnotationDb objects
shiftGenomicToTranscript and shiftTranscriptToGenomic for transferring genomic to transcript coordinates and back again
etdb_file <- system.file("extdata", "EpiTxDb.Hs.hg38.snoRNAdb.sqlite",
                         package="EpiTxDb")
etdb <- loadDb(etdb_file)
etdb

# general methods
seqinfo(etdb)
seqlevels(etdb) # easy access to all transcript names
EpiTxDb internal data
data(rmbase_data)
data.frame
EpiTxDb - Storing and accessing epitranscriptomic information using the AnnotationDbi interface
Felix G M Ernst [aut]
Jia-Jia Xuan, Wen-Ju Sun, Ke-Ren Zhou, Shun Liu, Peng-Hui Lin, Ling-Ling Zheng, Liang-Hu Qu, Jian-Hua Yang (2017): "RMBase v2.0: Deciphering the Map of RNA Modifications from Epitranscriptome Sequencing Data." Nucleic Acids Research, Volume 46, Issue D1, 4 January 2018, Pages D327–D334. doi: 10.1093/nar/gkx934
Jühling, Frank; Mörl, Mario; Hartmann, Roland K.; Sprinzl, Mathias; Stadler, Peter F.; Pütz, Joern (2009): "TRNAdb 2009: Compilation of tRNA Sequences and tRNA Genes." Nucleic Acids Research 37 (suppl_1): D159–D162. doi: 10.1093/nar/gkn772
Sprinzl, Mathias; Vassilenko, Konstantin S. (2005): "Compilation of tRNA Sequences and Sequences of tRNA Genes." Nucleic Acids Research 33 (suppl_1): D139–D140. doi: 10.1093/nar/gki012
EpiTxDb from user supplied annotations as data.frames
makeEpiTxDb is a low-level constructor for creating an EpiTxDb object from user-supplied annotations.
This function will typically not be used by regular users.
makeEpiTxDb(
  modifications,
  reactions = NULL,
  specifiers = NULL,
  references = NULL,
  metadata = NULL,
  reassign.ids = FALSE
)
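modifications | A data.frame of modification annotations; in the example for this function it has the columns mod_id, mod_type, mod_name, mod_start, mod_end, mod_strand, sn_id and sn_name. The first six are mandatory, whereas one of the last two has to be set. |
reactions | An optional data.frame (default: reactions = NULL). |
specifiers | An optional data.frame (default: specifiers = NULL). |
references | An optional data.frame (default: references = NULL). |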
metadata |
An optional
This dataframe will be returned
by |
reassign.ids |
|
an EpiTxDb object.
makeEpiTxDbFromGRanges for creating an EpiTxDb object from a GRanges object and its metadata columns
makeEpiTxDbFromRMBase for creating an EpiTxDb object from RMBase online resources
makeEpiTxDbFromtRNAdb for creating an EpiTxDb object from tRNAdb online resources
shortName and ModRNAString for information on ModRNAString objects
mod <- data.frame("mod_id" = 1L,
                  "mod_type" = "m1A",
                  "mod_name" = "m1A_1",
                  "mod_start" = 1L,
                  "mod_end" = 1L,
                  "mod_strand" = "+",
                  "sn_id" = 1L,
                  "sn_name" = "test")
rx <- data.frame(mod_id = 1L,
                 rx_genename = "test",
                 rx_rank = 1L,
                 rx_ensembl = "test",
                 rx_ensembltrans = "test",
                 rx_entrezid = "test")
spec <- data.frame(mod_id = 1L,
                   spec_type = "test",
                   spec_genename = "test",
                   spec_ensembl = "test",
                   spec_ensembltrans = "test",
                   spec_entrezid = "test")
ref <- data.frame(mod_id = 1L,
                  ref_type = "test",
                  ref = "test")
etdb <- makeEpiTxDb(mod,rx,spec,ref)
EpiTxDb object from a GRanges object
makeEpiTxDbFromGRanges extracts information from a GRanges object. The following metadata columns can be used:
mod_id, mod_type, mod_name and tx_ensembl. The first three are mandatory, whereas tx_ensembl is optional.
rx_genename, rx_rank, rx_ensembl, rx_ensembltrans and rx_entrezid
spec_type, spec_genename, spec_ensembl, spec_ensembltrans and spec_entrezid
ref_type and ref
... and passed on to makeEpiTxDb.
makeEpiTxDbFromGRanges(gr, metadata = NULL, reassign.ids = FALSE)
gr | A GRanges object |
metadata |
A 2-column |
reassign.ids |
= FALSE |
an EpiTxDb object.
library(GenomicRanges)
gr <- GRanges(seqnames = "test",
              ranges = IRanges::IRanges(1,1),
              strand = "+",
              DataFrame(mod_id = 1L,
                        mod_type = "Am",
                        mod_name = "Am_1"))
etdb <- makeEpiTxDbFromGRanges(gr)
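As a quick check of the round trip, the object built from the GRanges can be queried again with modifications(). The following is a minimal sketch, assuming the EpiTxDb and GenomicRanges packages are installed; it reuses the single-modification GRanges from the example and only calls functions documented in this manual.

```r
library(EpiTxDb)
library(GenomicRanges)

# one modification on a sequence called "test", as in the example above
gr <- GRanges(seqnames = "test",
              ranges = IRanges::IRanges(1, 1),
              strand = "+",
              S4Vectors::DataFrame(mod_id = 1L,
                                   mod_type = "Am",
                                   mod_name = "Am_1"))
etdb <- makeEpiTxDbFromGRanges(gr)

# read the stored annotation back as a GRanges object
mods <- modifications(etdb)
mods
```

The metadata columns of the returned object should correspond to the default columns of modifications(), that is mod_id, mod_type and mod_name.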
EpiTxDb object from RMBase v2.0 online resources
makeEpiTxDbFromRMBase will make use of the RMBase v2.0 online resources.
EPITXDB_RMBASE_URL

downloadRMBaseFiles(organism, genome, modtype)

makeEpiTxDbFromRMBase(
  organism,
  genome,
  modtype,
  tx = NULL,
  sequences = NULL,
  metadata = NULL,
  reassign.ids = FALSE,
  verbose = FALSE
)

getRMBaseDataAsGRanges(files, verbose = FALSE)

makeEpiTxDbFromRMBaseFiles(
  files,
  tx = NULL,
  sequences = NULL,
  metadata = NULL,
  reassign.ids = FALSE,
  verbose = FALSE
)

listAvailableOrganismsFromRMBase()

listAvailableGenomesFromRMBase(organism)

listAvailableModFromRMBase(organism, genome)
organism |
A |
genome |
A |
modtype |
A |
tx |
A |
sequences |
A named |
metadata , reassign.ids
|
See |
verbose |
|
files |
From |
An object of class character of length 1.
an EpiTxDb object.
EpiTxDb object from tRNAdb resources
makeEpiTxDbFromtRNAdb will make use of the tRNAdb online resources and extract the modification information from the RNA database.
If a named DNAStringSet is provided as sequences, the result from the tRNAdb will be matched against the sequences. Valid matches will be used as transcript identifiers and returned after a check of modification compatibility with the provided sequence. Through this process, multiple copies of a transcript can be associated with a single modification.
makeEpiTxDbFromtRNAdb uses the functions provided by the tRNAdbImport package. import.tRNAdb will be used with database = "RNA" and the three different values for origin.
gettRNAdbDataAsGRanges(
  organism,
  sequences = NULL,
  dbURL = tRNAdbImport::TRNA_DB_URL
)

makeEpiTxDbFromtRNAdb(
  organism,
  sequences = NULL,
  metadata = NULL,
  dbURL = tRNAdbImport::TRNA_DB_URL
)

listAvailableOrganismsFromtRNAdb()
organism |
A |
sequences |
A named |
dbURL |
The URL to the tRNA db website. |
metadata |
See |
an EpiTxDb object.
Juehling F, Moerl M, Hartmann RK, Sprinzl M, Stadler PF, Puetz J. 2009. "tRNAdb 2009: compilation of tRNA sequences and tRNA genes." Nucleic Acids Research, Volume 37 (suppl_1): D159–162. doi:10.1093/nar/gkn772.
## Not run:
# getting just the annotation data
etdb <- makeEpiTxDbFromtRNAdb("Saccharomyces cerevisiae")

# For associating the result with transcripts, provide an additional
# named DNAStringSet object. Matching will be done against each sequence
# allowing 5 mismatches and indels. The final result will be checked for
# validity regarding the identity of the modifications
etdb <- makeEpiTxDbFromtRNAdb("Saccharomyces cerevisiae",
                              some_transcript_sequences)

## End(Not run)
EpiTxDb-object
modifications and modificationsBy are functions to extract modification annotation from an EpiTxDb object.
modifiedSeqsByTranscript returns a ModRNAStringSet from an EpiTxDb object and a compatible RNAStringSet object. This uses the combineIntoModstrings() function from the Modstrings package.
modifications(
  x,
  columns = c("mod_id", "mod_type", "mod_name"),
  filter = NULL,
  use.names = FALSE,
  ...
)

modificationsBy(
  x,
  by = c("seqnames", "mod_type", "reaction", "specifier", "specifier_type"),
  ...
)

modifiedSeqsByTranscript(x, sequences, ...)

## S4 method for signature 'EpiTxDb'
modifications(
  x,
  columns = c("mod_id", "mod_type", "mod_name"),
  filter = NULL,
  use.names = FALSE
)

## S4 method for signature 'EpiTxDb'
modificationsBy(
  x,
  by = c("seqnames", "modtype", "reaction", "specifier", "specifiertype")
)

## S4 method for signature 'EpiTxDb,DNAStringSet'
modifiedSeqsByTranscript(x, sequences)
x | an EpiTxDb object |
columns | Columns to include in the result. If the vector is named, those names are used for the corresponding column in the element metadata of the returned object. (default: columns = c("mod_id", "mod_type", "mod_name")) |
filter | Either NULL or a named list of vectors to be used to restrict the output. Valid names for this list are: "mod_id", "mod_type", "mod_name", "sn_id", "sn_name", "rx_genename", "rx_ensembl", "rx_ensembltrans", "rx_entrezid", "spec_genename", "spec_type", "spec_ensembl", "spec_ensembltrans", "spec_entrezid", "ref_type" and "ref". (default: filter = NULL) |
use.names |
|
... |
Not used. |
by |
By which information type should the result be split into? A
|
sequences |
A |
a GRanges object for modifications and a GRangesList for modificationsBy.
etdb_file <- system.file("extdata", "EpiTxDb.Hs.hg38.snoRNAdb.sqlite",
                         package="EpiTxDb")
etdb <- loadDb(etdb_file)
etdb
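Once the database is loaded, the extraction functions can be called on it. The sketch below assumes the EpiTxDb package and its bundled snoRNAdb file are installed. To avoid assuming which modification types the file contains, the value used for the filter is taken from the data itself, and by = "seqnames" is chosen because it appears in both the generic and the method signature shown in the usage section.

```r
library(EpiTxDb)

etdb_file <- system.file("extdata", "EpiTxDb.Hs.hg38.snoRNAdb.sqlite",
                         package = "EpiTxDb")
etdb <- loadDb(etdb_file)

# all modifications as a GRanges object with the default columns
mods <- modifications(etdb, columns = c("mod_id", "mod_type", "mod_name"))

# restrict the output to one modification type, taken from the data itself
first_type <- as.character(mods$mod_type[1])
mods_one_type <- modifications(etdb, filter = list(mod_type = first_type))

# modifications split by sequence name, returned as a GRangesList
mods_by_seq <- modificationsBy(etdb, by = "seqnames")
```

Passing a named list to filter narrows the result before it is returned, which is usually more convenient than subsetting the full GRanges afterwards.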
Ranges
positionSequence generates sequences of integer values along the range information of x. This can be used for navigating specific positions within the range information.
positionSequence(x, order = FALSE, decreasing = FALSE)

## S4 method for signature 'Ranges'
positionSequence(x, order = FALSE, decreasing = FALSE)

## S4 method for signature 'RangesList'
positionSequence(x, order = FALSE, decreasing = FALSE)

## S4 method for signature 'Ranges'
as.integer(x)
x |
a |
order |
|
decreasing |
|
an integer vector if x is a GRanges object and an IntegerList if x is a GRangesList
library(GenomicRanges)
# Returns an integer vector
gr <- GRanges("chr1:1-5:+")
positionSequence(gr)
gr2 <- GRanges("chr1:1-5:-")
positionSequence(gr2)
# returns an IntegerList
grl <- GRangesList("1" = gr,"2" = gr,"3" = gr2) # must be named
positionSequence(grl)
Ranges object
rescale() rescales IRanges, GRanges, IRangesList and GRangesList by using minima and maxima derived from to and from.
rescale(x, to = 1L, from = 1L)

## S4 method for signature 'IRanges'
rescale(x, to = 1L, from = 1L)

## S4 method for signature 'IRangesList'
rescale(x, to = 1L, from = 1L)

## S4 method for signature 'GRanges'
rescale(x, to = 1L, from = 1L)

## S4 method for signature 'GRangesList'
rescale(x, to = 1L, from = 1L)
x |
a |
to , from
|
an |
an object of the same type and dimensions as x
H. Pagès, F. Ernst
IRanges for details on character vectors coercible to IRanges.
x <- IRanges("5-10")
# widen the ranges
rescale(x, 100, 10)
# widen and shift
rescale(x, "31-60", "5-14")
EpiTxDb objects
As expected for an AnnotationDb object, the general accessors select, keys, columns and keytypes can be used to get information from an EpiTxDb object.
## S4 method for signature 'EpiTxDb'
select(x, keys, columns, keytype, ...)

## S4 method for signature 'EpiTxDb'
columns(x)

## S4 method for signature 'EpiTxDb'
keys(x, keytype, ...)

## S4 method for signature 'EpiTxDb'
keytypes(x)
x | an EpiTxDb object |
keys , columns , keytype , ...
|
See
|
a data.frame object for select() and a character vector for keytypes(), keys() and columns().
etdb_file <- system.file("extdata", "EpiTxDb.Hs.hg38.snoRNAdb.sqlite",
                         package="EpiTxDb")
etdb <- loadDb(etdb_file)
etdb
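The four accessors can be combined into a small lookup. This is a minimal sketch, assuming the EpiTxDb package and its bundled snoRNAdb file are installed. Rather than hard-coding key type or column names, which are not listed on this page, it takes whatever keytypes() and columns() report for the database.

```r
library(EpiTxDb)

etdb_file <- system.file("extdata", "EpiTxDb.Hs.hg38.snoRNAdb.sqlite",
                         package = "EpiTxDb")
etdb <- loadDb(etdb_file)

# which key types and columns does this database offer?
kt <- keytypes(etdb)
cols <- columns(etdb)

# take a few keys of the first key type and look them up
first_keys <- head(keys(etdb, keytype = kt[1]))
res <- select(etdb, keys = first_keys, columns = head(cols, 3),
              keytype = kt[1])
head(res)
```

Inspecting kt and cols first is a reasonable habit with any AnnotationDb object, since the available names differ between database types.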
GRanges coordinates based on another GRanges object
shiftGenomicToTranscript shifts positions of a GRanges object based on coordinates of another GRanges object. The most common application is to shift genomic coordinates to transcript coordinates, which is reflected in the name. shiftTranscriptToGenomic implements the reverse operation.
Matches are determined by findOverlaps for shiftGenomicToTranscript and by findMatches for shiftTranscriptToGenomic using the seqnames of the subject and the names of tx.
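The coordinate arithmetic behind such a shift can be sketched in base R. This is only an illustration, assuming a transcript made of a single contiguous range. The helper names genomic_to_tx and tx_to_genomic are hypothetical and not part of EpiTxDb, whose methods operate on GRanges and GRangesList objects and also take care of matching subjects to transcripts.

```r
# Hypothetical helpers (not part of EpiTxDb): shift a genomic position onto
# a transcript that consists of one contiguous range, and back again.
genomic_to_tx <- function(pos, tx_start, tx_end, tx_strand = "+") {
  stopifnot(all(pos >= tx_start), all(pos <= tx_end))
  if (tx_strand == "-") {
    # on the minus strand, transcript position 1 is the genomic end
    tx_end - pos + 1L
  } else {
    pos - tx_start + 1L
  }
}

tx_to_genomic <- function(pos, tx_start, tx_end, tx_strand = "+") {
  stopifnot(all(pos >= 1L), all(pos <= tx_end - tx_start + 1L))
  if (tx_strand == "-") {
    tx_end - pos + 1L
  } else {
    pos + tx_start - 1L
  }
}

# genomic positions 17 to 19 on a plus-strand transcript spanning 10 to 30
on_tx <- genomic_to_tx(17:19, tx_start = 10L, tx_end = 30L)
on_tx
# and back to genomic coordinates
back <- tx_to_genomic(on_tx, tx_start = 10L, tx_end = 30L)
back
```

For a minus-strand transcript the order of positions flips, because the first transcript position corresponds to the highest genomic coordinate.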
shiftTranscriptToGenomic(subject, tx)

shiftGenomicToTranscript(subject, tx)

## S4 method for signature 'GRanges,GRangesList'
shiftTranscriptToGenomic(subject, tx)

## S4 method for signature 'GRangesList,GRangesList'
shiftTranscriptToGenomic(subject, tx)

## S4 method for signature 'GRanges,GRangesList'
shiftGenomicToTranscript(subject, tx)

## S4 method for signature 'GRangesList,GRangesList'
shiftGenomicToTranscript(subject, tx)
subject |
a |
tx |
a named |
a GRanges or GRangesList object depending on the type of subject
library(GenomicRanges)

# Construct some example data
subject1 <- GRanges("chr1", IRanges(3, 6), strand = "+")
subject2 <- GRanges("chr1", IRanges(c(17,23), width=3), strand = c("+","-"))
subject3 <- GRanges("chr2", IRanges(c(51, 54), c(53, 59)), strand = "-")
subject <- GRangesList(a=subject1, b=subject2, c=subject3)
tx1 <- GRanges("chr1", IRanges(1, 40), strand="+")
tx2 <- GRanges("chr1", IRanges(10, 30), strand="+")
tx3 <- GRanges("chr2", IRanges(50, 60), strand="-")
tx <- GRangesList(a=tx1, b=tx2, c=tx3)

# shift to transcript coordinates. Since the third subject does not have
# a match in tx it is dropped with a warning
shifted_grl <- shiftGenomicToTranscript(subject,tx)

# ... and back
shifted_grl2 <- shiftTranscriptToGenomic(shifted_grl,tx)

# comparison of ranges work. However the seqlevels differ
ranges(shifted_grl2) == ranges(subject[list(1,c(1,1),c(1,2))])