| Title: | Detection of post-transcriptional modifications in high throughput sequencing data |
|---|---|
| Description: | RNAmodR provides classes and workflows for loading and aggregating data from high throughput sequencing aimed at detecting post-transcriptional modifications through analysis of specific patterns. In addition, utilities are provided to validate and visualize the results. The RNAmodR package provides core functionality from which specific analysis strategies can be easily implemented as a separate package. |
| Authors: | Felix G.M. Ernst [aut, cre] (ORCID: <https://orcid.org/0000-0001-5064-0928>), Denis L.J. Lafontaine [ctb, fnd] |
| Maintainer: | Felix G.M. Ernst <[email protected]> |
| License: | Artistic-2.0 |
| Version: | 1.27.0 |
| Built: | 2026-05-30 06:59:51 UTC |
| Source: | https://github.com/bioc/RNAmodR |
The aggregate function is defined for each
SequenceData object and can be used
directly on a SequenceData object or
indirectly via a Modifier object.
For the latter, the call is redirected to the
SequenceData object, the result is summarized
as defined for the individual Modifier class and stored in the
aggregate slot of the Modifier object. The data is then used
for subsequent tasks, such as the search for modifications and the
visualization of the results.
The summarization is implemented in the aggregateData function for each
type of Modifier class. The stored data from the aggregate slot can be
retrieved using the getAggregateData function.
Whether the aggregated data is already present in the aggregate slot
can be checked using the hasAggregateData function.
For the SequenceDataSet, SequenceDataList and ModifierSet
classes, wrappers of the aggregate function exist as well.
    aggregate(x, ...)
    aggregateData(x, ...)
    getAggregateData(x)
    hasAggregateData(x)
    ## S4 method for signature 'SequenceData'
    aggregate(x, condition = c())
    ## S4 method for signature 'SequenceData'
    aggregateData(x, condition)
    ## S4 method for signature 'SequenceDataSet'
    aggregate(x, condition = "Treated")
    ## S4 method for signature 'SequenceDataList'
    aggregate(x, condition = "Treated")
    ## S4 method for signature 'Modifier'
    aggregate(x, force = FALSE)
    ## S4 method for signature 'Modifier'
    aggregateData(x)
    ## S4 method for signature 'Modifier'
    getAggregateData(x)
    ## S4 method for signature 'Modifier'
    hasAggregateData(x)
    ## S4 method for signature 'ModifierSet'
    aggregate(x, force = FALSE)
| Argument | Description |
|---|---|
| x | a |
| ... | additional arguments |
| condition | character value, which selects, for which condition the data should be aggregated. One of the following values: |
| force | whether to recreate the aggregated data, if it is already stored inside the |
aggregate: for a SequenceData object the aggregated data
is returned as a SplitDataFrameList with an element per transcript,
whereas for a Modifier the modified input object is returned,
containing the aggregated data, which can be accessed using
getAggregateData.
getAggregateData: only for Modifier: a
SplitDataFrameList with an element per transcript is returned. If the
aggregated data is not stored in the object, it is generated on the fly, but
does not persist.
hasAggregateData: TRUE or FALSE. Does the Modifier
object already contain aggregated data?
If 'x' is a
SequenceData: a SplitDataFrameList with elements per transcript.
SequenceDataSet or SequenceDataList: a SimpleList with SplitDataFrameList as elements.
Modifier or ModifierSet: an updated Modifier object. The data can be accessed by using the getAggregateData function.
    data(e5sd,package="RNAmodR")
    data(msi,package="RNAmodR")
    # aggregate() on a SequenceData object returns the aggregated data
    sdfl <- aggregate(e5sd)
    # aggregate() on a Modifier object returns the updated Modifier object
    mi <- aggregate(msi[[1]])
To compare data of different samples, a
ModifierSet can be used. To select the data
alongside the transcripts and their positions a
GRanges or a
GRangesList needs to be provided.
In case of a GRanges object, the parent column must match the
transcript names as defined by the output of ranges(x), whereas in
case of a GRangesList the element names must match the transcript
names.
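The matching requirement can be illustrated without the package. The sketch below uses plain base R stand-ins: `tx_names` plays the role of the transcript names reported for `ranges(x)`, and a data.frame with a `parent` column mimics the metadata of a GRanges object. All values and object names are invented for illustration; this is not how RNAmodR performs the check internally.

```r
# Stand-ins for the real objects (invented values): the transcript names
# and a coordinate table with a parent column.
tx_names <- c("1", "2", "3")
coord <- data.frame(parent = c("1", "1", "4"), pos = c(12L, 40L, 7L))

# Coordinates whose parent does not match a transcript name cannot be
# assigned to any transcript.
matched <- coord[coord$parent %in% tx_names, ]
unmatched <- coord[!coord$parent %in% tx_names, ]
```

Running such a check before calling the comparison functions makes it easier to spot misnamed transcripts early.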
    compare(x, name, pos = 1L, ...)
    compareByCoord(x, coord, ...)
    plotCompare(x, name, pos = 1L, normalize, ...)
    plotCompareByCoord(x, coord, normalize, ...)
    ## S4 method for signature 'ModifierSet'
    compare(x, name, pos = 1L, normalize, ...)
    ## S4 method for signature 'ModifierSet,GRanges'
    compareByCoord(x, coord, normalize, ...)
    ## S4 method for signature 'ModifierSet,GRangesList'
    compareByCoord(x, coord, normalize, ...)
    ## S4 method for signature 'ModifierSet'
    plotCompare(x, name, pos = 1L, normalize, ...)
    ## S4 method for signature 'ModifierSet,GRanges'
    plotCompareByCoord(x, coord, normalize, ...)
    ## S4 method for signature 'ModifierSet,GRangesList'
    plotCompareByCoord(x, coord, normalize, ...)
| Argument | Description |
|---|---|
| x | a |
| name | Only for |
| pos | Only for |
| ... | optional parameters: |
| coord | coordinates of position to subset to. Either a |
| normalize | either a single logical or character value. If it is a character, it must match one of the names in the |
compareByCoord returns a
DataFrame and
plotCompareByCoord returns a ggplot object, which can be
modified further. The DataFrame contains columns per sample as well
as the columns names, positions and mod incorporated
from the coord input. If coord contains a column
Activity this is included in the results as well.
    data(msi,package="RNAmodR")
    # constructing a GRanges object to mark positive positions
    mod <- modifications(msi)
    coord <- unique(unlist(mod))
    coord$score <- NULL
    coord$sd <- NULL
    # return a DataFrame
    compareByCoord(msi,coord)
    # plot the comparison as a heatmap
    plotCompareByCoord(msi,coord)
CoverageSequenceData implements
SequenceData to contain and aggregate the
coverage of reads per position along the transcripts.
CoverageSequenceData contains one column per data file named using the
following naming convention: coverage.condition.replicate.
aggregate calculates the mean and sd for samples in the control
and treated condition separately.
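The naming convention and the per-condition summary can be illustrated with a small base R sketch. The matrix `cov`, its values and the helper `summarize_condition` are invented for illustration; this is not the package's implementation, only the arithmetic the description implies (mean and sd across replicates, per position and per condition).

```r
# Toy coverage values: one row per position, one column per data file,
# named coverage.<condition>.<replicate> (values are invented).
cov <- cbind(
  coverage.treated.1 = c(10, 20, 30),
  coverage.treated.2 = c(14, 20, 26),
  coverage.control.1 = c(8, 16, 24),
  coverage.control.2 = c(12, 16, 20)
)

# Hypothetical helper: mean and sd across replicates for one condition.
summarize_condition <- function(m, cond) {
  # The condition is the second dot-separated field of each column name.
  fields <- strsplit(colnames(m), ".", fixed = TRUE)
  condition <- vapply(fields, function(parts) parts[2], character(1))
  sub <- m[, condition == cond, drop = FALSE]
  data.frame(mean = rowMeans(sub), sd = apply(sub, 1, stats::sd))
}

agg_treated <- summarize_condition(cov, "treated")
agg_control <- summarize_condition(cov, "control")
```

Each result has one row per position, which mirrors the idea of a per-transcript table of summary columns.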
    CoverageSequenceDataFrame(df, ranges, sequence, replicate, condition, bamfiles, seqinfo)
    CoverageSequenceData(bamfiles, annotation, sequences, seqinfo, ...)
    ## S4 method for signature
    ## 'CoverageSequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
    getData(x, bamfiles, grl, sequences, param, args)
    ## S4 method for signature 'CoverageSequenceData'
    aggregateData(x, condition = c("Both", "Treated", "Control"))
    ## S4 method for signature 'CoverageSequenceData'
    getDataTrack(x, name, ...)
| Argument | Description |
|---|---|
| df, ranges, sequence, replicate | inputs for creating a |
| condition | For |
| bamfiles, annotation, seqinfo, grl, sequences, param, args, ... | See |
| x | a |
| name | For |
a CoverageSequenceData object
    # Construction of a CoverageSequenceData object
    library(RNAmodR.Data)
    library(rtracklayer)
    annotation <- GFF3File(RNAmodR.Data.example.man.gff3())
    sequences <- RNAmodR.Data.example.man.fasta()
    files <- c(treated = RNAmodR.Data.example.wt.1())
    csd <- CoverageSequenceData(files, annotation = annotation,
                                sequences = sequences)
The End5SequenceData/End3SequenceData/EndSequenceData
classes aggregate the counts of read ends at each position along a
transcript. End5SequenceData/End3SequenceData classes aggregate
either the 5'-end or 3'-end, the EndSequenceData aggregates both.
All three classes contain one column per data file named using the following
naming convention: (end5/end3/end).condition.replicate.
aggregate calculates the mean and sd for samples in the control
and treated condition separately.
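What "counts of read ends at each position" means can be sketched in base R. The read coordinates below are invented, all reads are assumed to lie on the same strand as the transcript, and treating "both" as the sum of 5'- and 3'-end counts is an assumption made for this illustration, not a statement about the package's internals.

```r
# Toy alignments on a transcript of length 10 (invented values), all assumed
# to be on the same strand as the transcript.
tx_length <- 10L
reads <- data.frame(start = c(1L, 1L, 3L, 3L, 3L, 6L),
                    end   = c(5L, 7L, 8L, 9L, 10L, 10L))

# Count how many reads start (5'-end) or stop (3'-end) at each position.
end5 <- tabulate(reads$start, nbins = tx_length)
end3 <- tabulate(reads$end, nbins = tx_length)

# Assumption: counting both ends corresponds to the sum of the two vectors.
end_both <- end5 + end3
```

Each vector has one entry per transcript position, analogous to one column of the per-file data.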
    End5SequenceDataFrame(df, ranges, sequence, replicate, condition, bamfiles, seqinfo)
    End3SequenceDataFrame(df, ranges, sequence, replicate, condition, bamfiles, seqinfo)
    EndSequenceDataFrame(df, ranges, sequence, replicate, condition, bamfiles, seqinfo)
    End5SequenceData(bamfiles, annotation, sequences, seqinfo, ...)
    End3SequenceData(bamfiles, annotation, sequences, seqinfo, ...)
    EndSequenceData(bamfiles, annotation, sequences, seqinfo, ...)
    ## S4 method for signature
    ## 'End5SequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
    getData(x, bamfiles, grl, sequences, param, args)
    ## S4 method for signature
    ## 'End3SequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
    getData(x, bamfiles, grl, sequences, param, args)
    ## S4 method for signature
    ## 'EndSequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
    getData(x, bamfiles, grl, sequences, param, args)
    ## S4 method for signature 'End5SequenceData'
    aggregateData(x, condition = c("Both", "Treated", "Control"))
    ## S4 method for signature 'End3SequenceData'
    aggregateData(x, condition = c("Both", "Treated", "Control"))
    ## S4 method for signature 'EndSequenceData'
    aggregateData(x, condition = c("Both", "Treated", "Control"))
    ## S4 method for signature 'EndSequenceData'
    getDataTrack(x, name, ...)
    ## S4 method for signature 'End5SequenceData'
    getDataTrack(x, name, ...)
    ## S4 method for signature 'End3SequenceData'
    getDataTrack(x, name, ...)
| Argument | Description |
|---|---|
| df, ranges, sequence, replicate | inputs for creating a |
| condition | For |
| bamfiles, annotation, seqinfo, grl, sequences, param, args, ... | See |
| x | a |
| name | For |
an End5SequenceData, an End3SequenceData or an
EndSequenceData object
    # Construction of an End5SequenceData object
    library(RNAmodR.Data)
    library(rtracklayer)
    annotation <- GFF3File(RNAmodR.Data.example.man.gff3())
    sequences <- RNAmodR.Data.example.man.fasta()
    files <- c(treated = RNAmodR.Data.example.wt.1())
    e5sd <- End5SequenceData(files, annotation = annotation,
                             sequences = sequences)
The Modifier class is a virtual class, which provides the central
functionality to search for post-transcriptional RNA modification patterns in
high throughput sequencing data.
Each subclass has to implement the following slot and functions:
Slot nucleotide: Either "RNA" or "DNA". For convenience the subclasses RNAModifier and DNAModifier are already available and can be inherited from.
Function aggregateData: used for specific data aggregation
Function findMod: used for specific search for modifications
Optionally the function settings<- can be implemented to store additional
arguments, which the base class does not recognize.
Modifier objects are constructed centrally by calling
Modifier() with a className matching the specific class to be
constructed. This will trigger the immediate analysis, if find.mod is
not set to FALSE.
    Modifier(className, x, annotation, sequences, seqinfo, ...)
    ## S4 method for signature 'SequenceData'
    Modifier(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
    ## S4 method for signature 'SequenceDataSet'
    Modifier(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
    ## S4 method for signature 'SequenceDataList'
    Modifier(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
    ## S4 method for signature 'character'
    Modifier(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
    ## S4 method for signature 'list'
    Modifier(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
    ## S4 method for signature 'BamFileList'
    Modifier(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
| Argument | Description |
|---|---|
| className | The name of the class which should be constructed. |
| x | the input which can be of the following types |
| annotation | annotation data, which must match the information contained in the BAM files. This parameter is only required if |
| sequences | sequences matching the target sequences the reads were mapped onto. This must match the information contained in the BAM files. This parameter is only required if |
| seqinfo | An optional |
| ... | Additional optional parameters: All additional options must be named and will be passed to the |
a Modifier object of type className
nucleotide: a character value, which needs to contain "RNA" or "DNA"
mod: a character value, which needs to contain one or more elements from the alphabet of a ModRNAString or ModDNAString class.
score: the main score identifier used for visualizations
dataType: the class name(s) of the SequenceData class used
bamfiles: the input bam files as BamFileList
condition: conditions along the BamFileList: Either control or treated
replicate: replicate number along the BamFileList for each of the condition types.
data: The sequence data object: Either a SequenceData, SequenceDataSet or a SequenceDataList object, if more than one dataType is used.
aggregate: the aggregated data as a SplitDataFrameList
modifications: the found modifications as a GRanges object
settings: arguments used for the analysis as a list
aggregateValidForCurrentArguments: TRUE or FALSE whether the aggregate data was constructed with the current arguments
modificationsValidForCurrentArguments: TRUE or FALSE whether the modifications were found with the current arguments
Modifier objects can be created in two ways, either by providing a
list of bamfiles or
SequenceData/SequenceDataSet/SequenceDataList objects,
which match the structure in dataType().
dataType() can be a character vector or a list of
character vectors and depending on this the input files have to
follow this structure:
a single character: a SequenceData is
constructed/expected.
a character vector: a SequenceDataSet is
constructed/expected.
a list of character vectors: a SequenceDataList
is constructed/expected.
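The three cases above can be restated as a small base R function. `expected_container` is a hypothetical helper written only for this illustration and is not part of RNAmodR; the class names are taken from this manual, while the combinations passed in are arbitrary examples.

```r
# Hypothetical helper (not part of RNAmodR): map the shape of a dataType
# value to the container described in the text.
expected_container <- function(data_type) {
  if (is.list(data_type)) {
    # a list of character vectors
    stopifnot(all(vapply(data_type, is.character, logical(1))))
    "SequenceDataList"
  } else if (is.character(data_type) && length(data_type) == 1L) {
    # a single character value
    "SequenceData"
  } else if (is.character(data_type)) {
    # a character vector with several elements
    "SequenceDataSet"
  } else {
    stop("dataType must be a character vector or a list of character vectors")
  }
}

single <- expected_container("PileupSequenceData")
several <- expected_container(c("PileupSequenceData", "CoverageSequenceData"))
nested <- expected_container(list(c("PileupSequenceData", "CoverageSequenceData"),
                                  "End5SequenceData"))
```

The input provided to the constructor has to follow the same nesting as the value returned by dataType().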
The cases for a SequenceData or SequenceDataSet are straightforward,
since the input remains the same. The last case is special, since it
is a hypothetical option, in which bam files from two or more different
methods have to be combined to reliably detect a single modification (the
elements of a SequenceDataList don't have to be created from the same
bamfiles, whereas the elements of a SequenceDataSet have to be).
In this case, a list of character vectors is expected.
Each element must be named according to the names of dataType() and
contain a character vector for creating a SequenceData object.
All additional options must be named and will be passed to the
settings function and onto the SequenceData
objects, if x is not a SequenceData object or a list of
SequenceData objects.
For the Modifier and ModifierSet classes a number of functions
are implemented to access the data stored by the object.
The validAggregate and validModification functions check whether the
settings have been modified after the data was loaded. Such a change
potentially invalidates the aggregated data and the found modifications.
To update the data, run the aggregate or the modify function.
    bamfiles(x)
    mainScore(x)
    modifierType(x)
    modType(x)
    dataType(x)
    sequenceData(x)
    sequences(x, ...)
    validAggregate(x)
    validModification(x)
    ## S4 method for signature 'Modifier'
    show(object)
    ## S4 method for signature 'Modifier'
    bamfiles(x)
    ## S4 method for signature 'Modifier'
    conditions(object)
    ## S4 method for signature 'Modifier'
    mainScore(x)
    ## S4 method for signature 'Modifier'
    modifierType(x)
    ## S4 method for signature 'Modifier'
    modType(x)
    ## S4 method for signature 'Modifier'
    dataType(x)
    ## S4 method for signature 'Modifier'
    names(x)
    ## S4 method for signature 'Modifier'
    ranges(x)
    ## S4 method for signature 'Modifier'
    replicates(x)
    ## S4 method for signature 'Modifier'
    seqinfo(x)
    ## S4 method for signature 'Modifier'
    seqtype(x)
    ## S4 method for signature 'Modifier'
    sequenceData(x)
    ## S4 method for signature 'Modifier'
    sequences(x, modified = FALSE)
    ## S4 method for signature 'Modifier'
    validAggregate(x)
    ## S4 method for signature 'Modifier'
    validModification(x)
    ## S4 method for signature 'ModifierSet'
    show(object)
    ## S4 method for signature 'ModifierSet'
    bamfiles(x)
    ## S4 method for signature 'ModifierSet'
    conditions(object)
    ## S4 method for signature 'ModifierSet'
    mainScore(x)
    ## S4 method for signature 'ModifierSet'
    modifications(x, perTranscript = FALSE)
    ## S4 method for signature 'ModifierSet'
    modifierType(x)
    ## S4 method for signature 'ModifierSet'
    modType(x)
    ## S4 method for signature 'ModifierSet'
    dataType(x)
    ## S4 method for signature 'ModifierSet'
    ranges(x)
    ## S4 method for signature 'ModifierSet'
    replicates(x)
    ## S4 method for signature 'ModifierSet'
    seqinfo(x)
    ## S4 method for signature 'ModifierSet'
    seqtype(x)
    ## S4 method for signature 'ModifierSet'
    sequences(x, modified = FALSE)
| Argument | Description |
|---|---|
| x, object | a |
| ... | Additional arguments. |
| modified | For |
| perTranscript | |
modifierType: a character vector with the appropriate class
name of a Modifier.
modType: a character vector with the modifications detected by
the Modifier class.
seqtype: a single character value defining if either
"RNA" or "DNA" modifications are detected by the Modifier class.
mainScore: a character vector.
sequenceData: a SequenceData object.
modifications: a GRanges or GRangesList object
describing the found modifications.
seqinfo: a Seqinfo object.
sequences: a RNAStringSet object.
ranges: a GRangesList object with one element per
transcript.
bamfiles: a BamFileList object.
validAggregate: TRUE or FALSE. Checks if the current
settings are the same as those with which the data was aggregated
validModification: TRUE or FALSE. Checks if the
current settings are the same as those with which the modifications were found
    data(msi,package="RNAmodR")
    mi <- msi[[1]]
    modifierType(mi) # The class name of the Modifier object
    modifierType(msi)
    seqtype(mi)
    modType(mi)
    mainScore(mi)
    sequenceData(mi)
    modifications(mi)
    # general accessors
    seqinfo(mi)
    sequences(mi)
    ranges(mi)
    bamfiles(mi)
The ModifierSet class allows multiple
Modifier objects to be created from the same
annotation and sequence data varying only the bam input files.
In addition, the comparison of samples is done by calling functions on
the ModifierSet objects.
The ModifierSet is a virtual class, which derives from the
SimpleList class with the slot elementType = "Modifier". The
ModifierSet class has to be implemented for each specific analysis.
    ModifierSet(className, x, annotation, sequences, seqinfo, ...)
    ## S4 method for signature 'list'
    ModifierSet(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
    ## S4 method for signature 'character'
    ModifierSet(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
    ## S4 method for signature 'BamFileList'
    ModifierSet(className, x, annotation = NULL, sequences = NULL, seqinfo = NULL, ...)
    ## S4 method for signature 'Modifier'
    ModifierSet(className, x, annotation, sequences, seqinfo, ...)
| Argument | Description |
|---|---|
| className | The name of the class which should be constructed. |
| x | the input which can be of the following types |
| annotation | annotation data, which must match the information contained in the BAM files. This parameter is only required, if |
| sequences | sequences matching the target sequences the reads were mapped onto. This must match the information contained in the BAM files. This parameter is only required, if |
| seqinfo | An optional |
| ... | Additional optional parameters: All other arguments will be passed onto the |
a ModifierSet object of type className
The input files have to be provided as a list of elements. Each
element in itself must be valid for the creation of a Modifier
object (have a look at the man page for more details) and must be named.
SequenceData
The modify function executes the search for modifications for a
Modifier class. Usually this is done
automatically during construction of a Modifier object.
When the modify function is called, the aggregated data is checked
for validity for the current settings and the search for modifications is
performed using findMod. The results are stored in the
modifications slot of the Modifier object, which is returned by
modify. The results can be accessed via the modifications()
function.
findMod returns the found modifications as a GRanges
object and has to be implemented for each individual Modifier class.
    modifications(x, ...)
    modify(x, ...)
    findMod(x)
    ## S4 method for signature 'Modifier'
    modifications(x, perTranscript = FALSE)
    ## S4 method for signature 'Modifier'
    modify(x, force = FALSE)
    ## S4 method for signature 'Modifier'
    findMod(x)
    ## S4 method for signature 'ModifierSet'
    modify(x, force = FALSE)
| Argument | Description |
|---|---|
| x | a |
| ... | additional arguments |
| perTranscript | For |
| force | force to run |
modify: the updated Modifier object.
modifications: the modifications found as a GRanges
object.
    data(msi,package="RNAmodR")
    # modify() triggers the search for modifications in the data contained in
    # the Modifier or ModifierSet object
    mi <- modify(msi[[1]])
Inosine can be detected in RNA-Seq data by the conversion of A positions to
G. This conversion is detected by ModInosine and used to search for
Inosine positions. dataType is "PileupSequenceData".
Only samples labeled with the condition treated are used for this
analysis, since the A to G conversion is a common feature among the reverse
transcriptases usually employed. Let us know if that is not the case and
the class needs to be modified.
Further information on Functions of
ModInosine.
    ModInosine(x, annotation, sequences, seqinfo, ...)
    ModSetInosine(x, annotation = NA, sequences = NA, seqinfo = NA, ...)
| Argument | Description |
|---|---|
| x | the input, which can be of different types depending on whether a |
| annotation | annotation data, which must match the information contained in the BAM files. This parameter is only required, if |
| sequences | sequences matching the target sequences the reads were mapped onto. This must match the information contained in the BAM files. This parameter is only required, if |
| seqinfo | An optional |
| ... | Optional arguments overwriting default values, which are |
ModInosine score: the scores for reported Inosine positions are
between 0 and 1. They are calculated as the relative amount of called G bases
(G / N) and only saved for genomic A positions.
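The score arithmetic can be sketched in base R on invented pileup counts. Here N is assumed to be the total number of called bases at a position, which the text does not spell out; the data.frame `pileup` and the vector `reference` are hypothetical stand-ins and not the package's data structures.

```r
# Toy pileup counts per position (invented values); one column per called base.
pileup <- data.frame(
  G = c(0, 45, 2, 30),
  A = c(50, 5, 48, 0),
  U = c(0, 0, 0, 0),
  C = c(0, 0, 0, 10)
)
reference <- c("A", "A", "A", "G")  # reference base at each position

# Relative amount of called G bases: G / N, with N assumed to be the total
# number of calls at the position.
n <- rowSums(pileup)
score <- pileup$G / n

# Scores are only kept for reference A positions.
score[reference != "A"] <- NA
```

In this toy data the second position, with 45 of 50 calls being G, would be the obvious Inosine candidate.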
a ModInosine or ModSetInosine object
Felix G.M. Ernst [aut]
    # construction of ModInosine object
    library(RNAmodR.Data)
    library(rtracklayer)
    annotation <- GFF3File(RNAmodR.Data.example.man.gff3())
    sequences <- RNAmodR.Data.example.man.fasta()
    files <- c(treated = RNAmodR.Data.example.wt.1())
    mi <- ModInosine(files, annotation = annotation, sequences = sequences)
    # construction of ModSetInosine object
    ## Not run:
    files <- list("SampleSet1" = c(treated = RNAmodR.Data.example.wt.1(),
                                   treated = RNAmodR.Data.example.wt.2(),
                                   treated = RNAmodR.Data.example.wt.3()),
                  "SampleSet2" = c(treated = RNAmodR.Data.example.bud23.1(),
                                   treated = RNAmodR.Data.example.bud23.2()),
                  "SampleSet3" = c(treated = RNAmodR.Data.example.trm8.1(),
                                   treated = RNAmodR.Data.example.trm8.2()))
    msi <- ModSetInosine(files, annotation = annotation, sequences = sequences)
    ## End(Not run)
All of the functions of Modifier and
the ModifierSet classes are
inherited by the ModInosine and ModSetInosine classes.
Check below for the specifically implemented functions.
    ## S4 replacement method for signature 'ModInosine'
    settings(x) <- value
    ## S4 method for signature 'ModInosine'
    aggregateData(x)
    ## S4 method for signature 'ModInosine'
    findMod(x)
    ## S4 method for signature 'ModInosine'
    getDataTrack(x, name, type, ...)
    ## S4 method for signature 'ModInosine,GRanges'
    plotDataByCoord(x, coord, type = "score", window.size = 15L, ...)
    ## S4 method for signature 'ModInosine'
    plotData(x, name, from = 1L, to = 30L, type = "score", ...)
    ## S4 method for signature 'ModSetInosine,GRanges'
    plotDataByCoord(x, coord, type = "score", window.size = 15L, ...)
    ## S4 method for signature 'ModSetInosine'
    plotData(x, name, from = 1L, to = 30L, type = "score", ...)
x |
a |
value |
See |
coord, name, from, to, type, window.size, ...
|
See
|
ModInosine specific arguments for plotData:
colour.bases - a named character vector of length = 4
for the colours of the individual bases. The names are expected to be
c("G","A","U","C")
settings See settings.
aggregate See aggregate.
modify See modify.
getDataTrack a list of
DataTrack objects. See
plotDataByCoord.
plotData See plotDataByCoord.
plotDataByCoord See plotDataByCoord.
data(msi,package="RNAmodR")
mi <- msi[[1]]
settings(mi)
## Not run: 
aggregate(mi)
modify(mi)
## End(Not run)
getDataTrack(mi, "1", mainScore(mi))
These functions are not intended for general use, but are used for additional package development.
x, data, seqdata, sequence, args
|
internally used arguments |
The NormEnd5SequenceData/NormEnd3SequenceData classes
aggregate the counts of read ends (either 5' or 3') at each position along a
transcript. In addition, the counts are normalized to the
length of the transcript and to the overlapping reads.
Both classes contain three columns per data file named using the
following naming convention (normend5/normend3).condition.replicate.
The three columns are distinguished by additional identifiers ends,
norm.tx and norm.ol.
aggregate calculates the mean and sd for samples in the control
and treated condition separately. Similar to the stored results, six
columns are returned for each of the two conditions (three for mean and sd
each) ending in ends, tx and ol.
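As a rough illustration of the summarization described above, the following base R sketch computes the mean and sd across replicates for a toy matrix whose columns follow the naming convention. It does not call RNAmodR: the helper aggregate_replicates, the toy values and the output column names (means.ends, sds.ends and so on) are assumptions made for this example only.

```r
# Toy data: 4 positions, 2 treated replicates, 3 columns per replicate.
# Column names follow normend5.condition.replicate plus an identifier.
ids <- c("ends", "norm.tx", "norm.ol")
cols <- paste0("normend5.treated.", rep(1:2, each = 3), ".", rep(ids, times = 2))
counts <- matrix(1:24, nrow = 4, dimnames = list(NULL, cols))

# Hypothetical helper: mean and sd across replicates per identifier.
aggregate_replicates <- function(m, ids) {
  res <- list()
  for (id in ids) {
    # select the replicate columns belonging to this identifier
    sel <- m[, endsWith(colnames(m), paste0(".", id)), drop = FALSE]
    short <- sub("^norm\\.", "", id)  # ends, tx, ol
    res[[paste0("means.", short)]] <- rowMeans(sel)
    res[[paste0("sds.", short)]] <- apply(sel, 1, sd)
  }
  as.data.frame(res)
}

agg <- aggregate_replicates(counts, ids)
colnames(agg)
```

The result has six columns for the single condition used here, three for the means and three for the sd values, which mirrors the layout described above.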
NormEnd5SequenceDataFrame(
  df, ranges, sequence, replicate, condition, bamfiles, seqinfo
)

NormEnd3SequenceDataFrame(
  df, ranges, sequence, replicate, condition, bamfiles, seqinfo
)

NormEnd5SequenceData(bamfiles, annotation, sequences, seqinfo, ...)

NormEnd3SequenceData(bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature
## 'NormEnd5SequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
getData(x, bamfiles, grl, sequences, param, args)

## S4 method for signature
## 'NormEnd3SequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
getData(x, bamfiles, grl, sequences, param, args)

## S4 method for signature 'NormEnd5SequenceData'
aggregateData(x, condition = c("Both", "Treated", "Control"))

## S4 method for signature 'NormEnd3SequenceData'
aggregateData(x, condition = c("Both", "Treated", "Control"))

## S4 method for signature 'NormEnd5SequenceData'
getDataTrack(x, name, ...)

## S4 method for signature 'NormEnd3SequenceData'
getDataTrack(x, name, ...)
df, ranges, sequence, replicate
|
inputs for creating a
|
condition |
For |
bamfiles, annotation, seqinfo, grl, sequences, param, args, ...
|
See
|
x |
a |
name |
For |
a NormEnd5SequenceData or NormEnd3SequenceData object
# Construction of a NormEnd5SequenceData object
## Not run: 
library(RNAmodR.Data)
library(rtracklayer)
annotation <- GFF3File(RNAmodR.Data.example.man.gff3())
sequences <- RNAmodR.Data.example.man.fasta()
files <- c(treated = RNAmodR.Data.example.wt.1())
ne5sd <- NormEnd5SequenceData(files, annotation = annotation,
                              sequences = sequences)
## End(Not run)
The PileupSequenceData aggregates the pileup of called bases per
position.
PileupSequenceData contains five columns per data file named using the
following naming convention pileup.condition.replicate. The five
columns are distinguished by additional identifiers -, G,
A, T and C.
aggregate calculates the mean and sd for each nucleotide in the
control and treated condition separately. The results are then
normalized to a row sum of 1.
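The following base R sketch illustrates the idea of averaging pileup counts over replicates and scaling each position to a row sum of 1. It is not the package's implementation: the toy matrices rep1 and rep2, the guard against empty rows and the restriction to the mean values are assumptions made for this example.

```r
# Toy pileup counts for 3 positions in two treated replicates.
nuc <- c("-", "G", "A", "T", "C")
rep1 <- matrix(c(0, 10, 0, 0, 0,
                 0,  0, 8, 2, 0,
                 1,  0, 0, 0, 9),
               nrow = 3, byrow = TRUE, dimnames = list(NULL, nuc))
rep2 <- matrix(c(0, 12, 0, 0, 0,
                 0,  0, 6, 4, 0,
                 1,  0, 0, 0, 9),
               nrow = 3, byrow = TRUE, dimnames = list(NULL, nuc))

# Mean count per nucleotide across the replicates.
means <- (rep1 + rep2) / 2

# Scale each position (row) to a sum of 1; rows without any counts are
# left at 0 instead of dividing by zero (an assumption of this sketch).
totals <- rowSums(means)
freq <- means / ifelse(totals > 0, totals, 1)
freq
```

After this scaling each row can be read as the fraction of called bases per nucleotide at that position.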
PileupSequenceDataFrame(
  df, ranges, sequence, replicate, condition, bamfiles, seqinfo
)

PileupSequenceData(bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature
## 'PileupSequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
getData(x, bamfiles, grl, sequences, param, args)

## S4 method for signature 'PileupSequenceData'
aggregateData(x, condition = c("Both", "Treated", "Control"))

## S4 method for signature 'PileupSequenceData'
getDataTrack(x, name, ...)

pileupToCoverage(x)

## S4 method for signature 'PileupSequenceData'
pileupToCoverage(x)
df, ranges, sequence, replicate
|
inputs for creating a
|
condition |
For |
bamfiles, annotation, seqinfo, grl, sequences, param, args, ...
|
See
|
x |
a |
name |
For |
a PileupSequenceData object
# Construction of a PileupSequenceData object
library(RNAmodR.Data)
library(rtracklayer)
annotation <- GFF3File(RNAmodR.Data.example.man.gff3())
sequences <- RNAmodR.Data.example.man.fasta()
files <- c(treated = RNAmodR.Data.example.wt.1())
psd <- PileupSequenceData(files, annotation = annotation,
                          sequences = sequences)
SequenceData,
SequenceDataSet, SequenceDataList, Modifier or
ModifierSet object.
With the plotData and plotDataByCoord functions, data
from a SequenceData, SequenceDataSet, SequenceDataList,
Modifier or ModifierSet object can be visualized.
Internally the functionality of the Gviz package is used. For each
SequenceData and Modifier class the getDataTrack is
implemented returning a DataTrack object
from the Gviz package.
Positions to be visualized are selected by defining a genomic coordinate,
for which x has to contain data.
plotData(x, name, from = 1L, to = 30L, type, ...)

plotDataByCoord(x, coord, type, window.size = 15L, ...)

getDataTrack(x, name, ...)

## S4 method for signature 'Modifier,GRanges'
plotDataByCoord(x, coord, type = NA, window.size = 15L, ...)

## S4 method for signature 'Modifier'
plotData(
  x, name, from, to,
  type = NA,
  showSequenceData = FALSE,
  showSequence = TRUE,
  showAnnotation = FALSE,
  ...
)

## S4 method for signature 'Modifier'
getDataTrack(x, name = name, ...)

## S4 method for signature 'ModifierSet,GRanges'
plotDataByCoord(x, coord, type = NA, window.size = 15L, ...)

## S4 method for signature 'ModifierSet'
plotData(
  x, name, from, to,
  type = NA,
  showSequenceData = FALSE,
  showSequence = TRUE,
  showAnnotation = FALSE,
  ...
)

## S4 method for signature 'SequenceData,GRanges'
plotDataByCoord(x, coord, type = NA, window.size = 15L, ...)

## S4 method for signature 'SequenceData'
plotData(
  x, name, from, to,
  perTranscript = FALSE,
  showSequence = TRUE,
  showAnnotation = FALSE,
  ...
)

## S4 method for signature 'SequenceData'
getDataTrack(x, name = name, ...)

## S4 method for signature 'SequenceDataList'
getDataTrack(x, name = name, ...)

## S4 method for signature 'SequenceDataList,GRanges'
plotDataByCoord(x, coord, type = NA, window.size = 15L, ...)

## S4 method for signature 'SequenceDataList'
plotData(
  x, name, from, to,
  perTranscript = FALSE,
  showSequence = TRUE,
  showAnnotation = FALSE,
  ...
)

## S4 method for signature 'SequenceDataSet'
getDataTrack(x, name = name, ...)

## S4 method for signature 'SequenceDataSet,GRanges'
plotDataByCoord(x, coord, type = NA, window.size = 15L, ...)

## S4 method for signature 'SequenceDataSet'
plotData(
  x, name, from, to,
  perTranscript = FALSE,
  showSequence = TRUE,
  showAnnotation = FALSE,
  ...
)
x |
a |
name |
Only for |
from |
Only for |
to |
Only for |
type |
the type of data shown as data tracks. |
... |
optional parameters:
|
coord |
coordinates of positions to subset to as a
|
window.size |
integer value for the number of positions on the left and
right side of the selected positions included in the plotting (default:
|
showSequenceData |
|
showSequence |
|
showAnnotation |
|
perTranscript |
|
a plot sent to the active graphics device
data(msi,package="RNAmodR")
plotData(msi[[1]], "2", from = 10L, to = 45L)
## Not run: 
plotData(msi, "2", from = 10L, to = 45L)
## End(Not run)
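To relate the from/to arguments of plotData to the coord and window.size arguments of plotDataByCoord, the following base R sketch derives a plotting window around a single position. The helper plot_window and the clipping to the transcript boundaries are assumptions made for illustration; the package may handle boundaries differently.

```r
# Hypothetical helper: positions shown when plotting around `position`
# with `window.size` positions on each side, clipped to the transcript.
plot_window <- function(position, tx_length, window.size = 15L) {
  from <- max(1L, position - window.size)
  to <- min(tx_length, position + window.size)
  c(from = from, to = to)
}

# Position 20 on a transcript of length 100 with the default window.
w1 <- plot_window(20L, tx_length = 100L)
# Position 5: the window is cut off at the transcript start.
w2 <- plot_window(5L, tx_length = 100L)
rbind(w1, w2)
```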
Modifier and ModifierSet objects
plotROC streamlines labeling, prediction, performance and plotting
functions to test the performance of a Modifier object and the data
analyzed via the functionality from the ROCR package.
The data from x will be labeled as positive using the coord
arguments. The other arguments will be passed on to the specific ROCR
functions.
By default the prediction.args include three values:
measure = "tpr"
x.measure = "fpr"
score = mainScore(x)
The remaining arguments are not predefined.
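To make the two default measures concrete, the following base R sketch computes the true positive rate (tpr) and false positive rate (fpr) for a small set of scored positions. It does not use ROCR and is not how plotROC is implemented: the helper roc_points and the toy scores and labels are assumptions, and tied scores are not treated specially.

```r
# Hypothetical helper: tpr and fpr when every score is used as a cutoff.
roc_points <- function(score, positive) {
  ord <- order(score, decreasing = TRUE)
  positive <- positive[ord]
  data.frame(
    threshold = score[ord],
    fpr = cumsum(!positive) / sum(!positive),
    tpr = cumsum(positive) / sum(positive)
  )
}

# Toy data: scores per position and whether the position was labeled
# as positive (for plotROC this labeling comes from coord).
score <- c(0.9, 0.8, 0.4, 0.3, 0.1)
positive <- c(TRUE, TRUE, FALSE, TRUE, FALSE)
roc <- roc_points(score, positive)
roc
```

Plotting tpr against fpr from such a table gives the kind of curve that plotROC produces from the scores of a Modifier object.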
plotROC(x, coord, ...)

## S4 method for signature 'Modifier'
plotROC(
  x, coord,
  score = NULL,
  prediction.args = list(),
  performance.args = list(),
  plot.args = list()
)

## S4 method for signature 'ModifierSet'
plotROC(
  x, coord,
  score = NULL,
  prediction.args = list(),
  performance.args = list(),
  plot.args = list()
)
x |
a |
coord |
coordinates of position to label as positive. Either a
|
... |
additional arguments |
score |
the score identifier to subset to, if multiple scores are available. |
prediction.args |
arguments which will be used for calling
|
performance.args |
arguments which will be used for calling
|
plot.args |
arguments which will be used for calling |
a plot sent to the active graphics device
Tobias Sing, Oliver Sander, Niko Beerenwinkel, Thomas Lengauer (2005): "ROCR: visualizing classifier performance in R." Bioinformatics 21(20):3940-3941 DOI: 10.1093/bioinformatics/bti623
data(msi,package="RNAmodR")
# constructing a GRanges object to mark positive positions
mod <- modifications(msi)
coord <- unique(unlist(mod))
coord$score <- NULL
coord$sd <- NULL
# plotting a TPR vs. FPR plot per ModInosine object
plotROC(msi[[1]],coord)
# plotting a TPR vs. FPR plot per ModSetInosine object
plotROC(msi,coord)
ProtectedEndSequenceData implements
SequenceData to contain and aggregate the
starts and ends of reads per position along a transcript.
ProtectedEndSequenceData offsets the start position by -1 to align the
information on the 5'-3'-phosphate bonds to one position. The
ProtectedEndSequenceData class is implemented specifically as required
for the RiboMethSeq method.
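The effect of the -1 offset can be sketched in base R. The read coordinates below are toy values, and counting shifted starts and ends together in a single vector is an assumption made for illustration, not the package's implementation.

```r
# Toy reads on a transcript of length 10 (start and end per read).
tx_length <- 10L
read_start <- c(3L, 3L, 6L)
read_end <- c(5L, 8L, 8L)

# Shift the starts by -1, so that a read starting at position n + 1 and
# a read ending at position n point to the same position n.
shifted_start <- read_start - 1L

# Count shifted starts and ends per position; tabulate() ignores values
# outside 1..nbins, e.g. a read starting at the first position.
counts <- tabulate(c(shifted_start, read_end), nbins = tx_length)
counts
```

In this toy example the read starting at position 6 and the read ending at position 5 are both counted at position 5.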
The objects of type ProtectedEndSequenceData contain three columns per
data file named using the following naming convention
protectedend.condition.replicate.
aggregate calculates the mean and sd for samples in the control
and treated condition separately.
ProtectedEndSequenceDataFrame(
  df, ranges, sequence, replicate, condition, bamfiles, seqinfo
)

ProtectedEndSequenceData(bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature
## 'ProtectedEndSequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
getData(x, bamfiles, grl, sequences, param, args)

## S4 method for signature 'ProtectedEndSequenceData'
aggregateData(x, condition = c("Both", "Treated", "Control"))

## S4 method for signature 'ProtectedEndSequenceData'
getDataTrack(x, name, ...)
df, ranges, sequence, replicate
|
inputs for creating a
|
condition |
For |
bamfiles, annotation, seqinfo, grl, sequences, param, args, ...
|
See
|
x |
a |
name |
For |
a ProtectedEndSequenceData object
# Construction of a ProtectedEndSequenceData object
library(RNAmodR.Data)
library(rtracklayer)
annotation <- GFF3File(RNAmodR.Data.example.man.gff3())
sequences <- RNAmodR.Data.example.man.fasta()
files <- c(treated = RNAmodR.Data.example.wt.1())
pesd <- ProtectedEndSequenceData(files, annotation = annotation,
                                 sequences = sequences)
Post-transcriptional modifications can be found abundantly in rRNA and tRNA and can be detected classically via several strategies. However, difficulties arise if the identity and the position of the modified nucleotides are to be determined at the same time. Classically, a primer extension, a form of reverse transcription (RT), would allow certain modifications to be accessed by blocks during the RT or changes in the cDNA sequences. Other modifications would need to be selectively treated by chemical reactions to influence the outcome of the reverse transcription.
With the increased availability of high throughput sequencing, these classical methods were adapted to high throughput methods allowing more RNA molecules to be accessed at the same time. With these advances post-transcriptional modifications were also detected on mRNA. Among these high throughput techniques are for example Pseudo-Seq (Carlile et al. 2014), RiboMethSeq (Birkedal et al. 2015) and AlkAnilineSeq (Marchand et al. 2018) each able to detect a specific type of modification from footprints in RNA-Seq data prepared with the selected methods.
Since similar patterns can be observed with some of these techniques, overlaps between the bioinformatic pipelines already exist and will become more frequent with newly emerging sequencing techniques.
RNAmodR implements classes and a workflow to detect
post-transcriptional RNA modifications in high throughput sequencing data. It
is easily adaptable to new methods and can help during the phase of initial
method development as well as more complex screenings.
Briefly, from the SequenceData, specific subclasses are derived for
accessing specific aspects of aligned reads, e.g. 5’-end positions or pileup
data. With this a Modifier class can be used to detect specific
patterns for individual types of modifications. The SequenceData
classes can be shared by different Modifier classes allowing easy
adaptation to new methods.
Felix G M Ernst [aut], Denis L.J. Lafontaine [ctb]
- Carlile TM, Rojas-Duran MF, Zinshteyn B, Shin H, Bartoli KM, Gilbert WV (2014): "Pseudouridine profiling reveals regulated mRNA pseudouridylation in yeast and human cells." Nature 515 (7525), P. 143–146. DOI: 10.1038/nature13802.
- Birkedal U, Christensen-Dalsgaard M, Krogh N, Sabarinathan R, Gorodkin J, Nielsen H (2015): "Profiling of ribose methylations in RNA by high-throughput sequencing." Angewandte Chemie (International ed. in English) 54 (2), P. 451–455. DOI: 10.1002/anie.201408362.
- Marchand V, Ayadi L, Ernst FGM, Hertler J, Bourguignon-Igel V, Galvanin A, Kotter A, Helm M, Lafontaine DLJ, Motorin Y (2018): "AlkAniline-Seq: Profiling of m7G and m3C RNA Modifications at Single Nucleotide Resolution." Angewandte Chemie (International ed. in English) 57 (51), P. 16785–16790. DOI: 10.1002/anie.201810946.
The RNAmodR.RiboMethSeq and RNAmodR.AlkAnilineSeq
packages.
The following datasets are contained in the RNAmodR package. They are used in the man page examples.
data(msi)
data(sds)
data(sdl)
data(psd)
data(e5sd)
data(e3sd)
data(esd)
data(csd)
data(ne3sd)
data(ne5sd)
data(pesd)
msi a ModSetInosine instance
sds a SequenceDataSet instance
sdl a SequenceDataList instance
psd a PileupSequenceData instance
e5sd an End5SequenceData instance
e3sd an End3SequenceData instance
esd an EndSequenceData instance
csd a CoverageSequenceData instance
ne3sd a NormEnd3SequenceData instance
ne5sd a NormEnd5SequenceData instance
pesd a ProtectedEndSequenceData instance
An object of class SequenceDataSet of length 2.
An object of class SequenceDataList of length 3.
An object of class PileupSequenceData of dimension 100 x 101 x 15 x 15.
An object of class End5SequenceData of dimension 100 x 101 x 3 x 3.
An object of class End3SequenceData of dimension 100 x 101 x 3 x 3.
An object of class EndSequenceData of dimension 100 x 101 x 3 x 3.
An object of class CoverageSequenceData of dimension 100 x 101 x 3 x 3.
An object of class NormEnd3SequenceData of dimension 100 x 101 x 9 x 9.
An object of class NormEnd5SequenceData of dimension 100 x 101 x 9 x 9.
An object of class ProtectedEndSequenceData of dimension 100 x 101 x 3 x 3.
These functions are not intended for general use, but are used for additional package development.
getData is used to load data into a
SequenceData object and must be
implemented for all SequenceData classes. The results must match the
requirements outlined in the value section.
In addition the following functions should be implemented for complete functionality:
aggregateData for each SequenceData and Modifier class.
See also aggregateData
findMod for each Modifier class. See also
findMod.
plotData/plotDataByCoord for each Modifier
and ModifierSet class. See also
plotData.
The following helper function can be called from within findMod to
construct a coordinate for each modification found:
constructModRanges constructs a GRanges object describing the
location, type and associated scores of a modification.
constructModRanges is typically called from the modify
function, which must be implemented for all
Modifier classes.
constructModRanges(range, data, modType, scoreFun, source, type)

getData(x, bamfiles, grl, sequences, param, args)

## S4 method for signature 'GRanges,DataFrame'
constructModRanges(range, data, modType, scoreFun, source, type)
range |
for |
data |
for |
modType |
for |
scoreFun |
for |
source |
for |
type |
for |
x |
for |
bamfiles |
for |
grl |
for |
sequences |
for |
param |
for |
args |
for |
getData: returns a list with elements per BamFile in
bamfiles. Elements can be
IntegerList,
NumericList or a
CompressedSplitDataFrameList. The
data in the elements must be ordered by increasing position numbers. However,
names and rownames will be discarded.
constructModRanges: returns a GRanges object with
genomic coordinates of modified nucleotides in the associated transcripts.
# new SequenceData class
setClass(Class = "ExampleSequenceData",
         contains = "SequenceData",
         prototype = list(minQuality = 5L))
ExampleSequenceData <- function(bamfiles, annotation, sequences, seqinfo, ...){
  RNAmodR:::SequenceData("Example", bamfiles = bamfiles,
                         annotation = annotation, sequences = sequences,
                         seqinfo = seqinfo, ...)
}
setMethod("getData",
          signature = c(x = "ExampleSequenceData",
                        bamfiles = "BamFileList",
                        grl = "GRangesList",
                        sequences = "XStringSet",
                        param = "ScanBamParam"),
          definition = function(x, bamfiles, grl, sequences, param, args){
            ###
          }
)
setMethod("aggregateData",
          signature = c(x = "ExampleSequenceData"),
          function(x, condition = c("Both","Treated","Control")){
            ###
          }
)
setMethod(
  f = "getDataTrack",
  signature = c(x = "ExampleSequenceData"),
  definition = function(x, name, ...) {
    ###
  }
)

# new Modifier class
setClass("ModExample",
         contains = "Modifier",
         prototype = list(mod = "X",
                          score = "score",
                          dataType = "ExampleSequenceData"))
ModExample <- function(x, annotation, sequences, seqinfo, ...){
  RNAmodR:::Modifier("ModExample", x = x, annotation = annotation,
                     sequences = sequences, seqinfo = seqinfo, ...)
}
setMethod(f = "aggregateData",
          signature = c(x = "ModExample"),
          definition = function(x, force = FALSE){
            # Some data with element per transcript
          }
)
setMethod("findMod",
          signature = c(x = "ModExample"),
          function(x){
            # an element per modification found.
          }
)
setMethod(
  f = "getDataTrack",
  signature = signature(x = "ModExample"),
  definition = function(x, name, type, ...) {
  }
)
setMethod(
  f = "plotDataByCoord",
  signature = signature(x = "ModExample", coord = "GRanges"),
  definition = function(x, coord, type = "score", window.size = 15L, ...) {
  }
)
setMethod(
  f = "plotData",
  signature = signature(x = "ModExample"),
  definition = function(x, name, from, to, type = "score", ...) {
  }
)

# new ModifierSet class
setClass("ModSetExample",
         contains = "ModifierSet",
         prototype = list(elementType = "ModExample"))
ModSetExample <- function(x, annotation, sequences, seqinfo, ...){
  RNAmodR:::ModifierSet("ModExample", x = x, annotation = annotation,
                        sequences = sequences, seqinfo = seqinfo, ...)
}
setMethod(
  f = "plotDataByCoord",
  signature = signature(x = "ModSetExample", coord = "GRanges"),
  definition = function(x, coord, type = "score", window.size = 15L, ...) {
  }
)
setMethod(
  f = "plotData",
  signature = signature(x = "ModSetExample"),
  definition = function(x, name, from, to, type = "score", ...) {
  }
)
The SequenceData class is implemented to contain data on each position
along transcripts and holds the corresponding annotation data and
nucleotide sequence of these transcripts. To access this data several
functions are available. The
SequenceData class is a virtual class, from which specific classes can
be extended. Currently the following classes are implemented:
The annotation and sequence data can be accessed through the functions
ranges and sequences, respectively. Be aware that the data is
always provided according to genomic positions with increasing
rownames, but the sequence is given as the actual sequence of the
transcript. Therefore, it is necessary to treat the minus strand accordingly.
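What treating the minus strand accordingly can mean in practice is sketched below in base R. The toy positions and sequence are assumptions, and the sketch relies only on the statement above that rows follow increasing genomic positions while the sequence is given in transcript orientation.

```r
# Toy transcript on the minus strand covering genomic positions 101-105.
genomic_pos <- 101:105  # rows ordered by increasing genomic position
tx_seq <- "GAUCC"       # sequence of the transcript, 5' to 3'
strand <- "-"

# Split the sequence into single nucleotides in transcript order.
bases <- strsplit(tx_seq, "")[[1]]

# On the minus strand the 5' end of the transcript lies at the highest
# genomic position, so the nucleotides are reversed to match the rows.
if (strand == "-") {
  bases <- rev(bases)
}
aligned <- data.frame(pos = genomic_pos, base = bases)
aligned
```

Here the first nucleotide of the transcript (G) ends up in the row of the highest genomic position.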
The SequenceData class is derived from the
CompressedSplitDataFrameList class
with additional slots for annotation and sequence data. Some functionality is
not inherited and might not be available to its full extent, e.g. relist.
SequenceDataFrame
The SequenceDataFrame class is a virtual class and contains data for
positions along a single transcript. In addition to being used for returning
elements from a SequenceData object, the SequenceDataFrame class is
used to store the unlisted data within a
SequenceData object. Therefore, a matching
SequenceData and SequenceDataFrame class must be implemented.
The SequenceDataFrame class is derived from the
DataFrame class.
Subsetting of a SequenceDataFrame returns a SequenceDataFrame or
DataFrame, if it is subset by a column or row, respectively. The
drop argument is ignored for column subsetting.
## S4 method for signature 'SequenceData'
cbind(..., deparse.level = 1)

## S4 method for signature 'SequenceData'
rbind(..., deparse.level = 1)

SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'character,character'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'character,BSgenome'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'TxDb,character'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'TxDb,BSgenome'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'GRangesList,character'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'GRangesList,BSgenome'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'GFF3File,BSgenome'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'GFF3File,character'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'character,FaFile'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'GFF3File,FaFile'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'TxDb,FaFile'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)

## S4 method for signature 'GRangesList,FaFile'
SequenceData(dataType, bamfiles, annotation, sequences, seqinfo, ...)
... |
Optional arguments overwriting default values. Not all
|
deparse.level |
See |
dataType |
The prefix for constructing the class name of the
|
bamfiles |
the input which can be of the following types
|
annotation |
annotation data, which must match the information contained in the BAM files. |
sequences |
sequences matching the target sequences the reads were mapped onto. This must match the information contained in the BAM files. |
seqinfo |
optional |
A SequenceData object
sequencesType: a character value for the class name of
sequences. Either RNAStringSet, ModRNAStringSet,
DNAStringSet or ModDNAStringSet.
minQuality: an integer value describing a threshold of the minimum
quality of reads to be used.
The SequenceData, SequenceDataSet, SequenceDataList and
SequenceDataFrame classes share functionality. Have a look at the
elements listed directly below.
replicates(x)

## S4 method for signature 'SequenceDataFrame'
show(object)

## S4 method for signature 'SequenceDataFrame'
conditions(object)

## S4 method for signature 'SequenceDataFrame'
bamfiles(x)

## S4 method for signature 'SequenceDataFrame'
dataType(x)

## S4 method for signature 'SequenceDataFrame'
ranges(x)

## S4 method for signature 'SequenceDataFrame'
replicates(x)

## S4 method for signature 'SequenceDataFrame'
seqinfo(x)

## S4 method for signature 'SequenceDataFrame'
seqtype(x)

## S4 replacement method for signature 'SequenceDataFrame'
seqtype(x) <- value

## S4 method for signature 'SequenceDataFrame'
sequences(x)

## S4 method for signature 'SequenceData'
show(object)

## S4 method for signature
## 'SequenceData,BamFileList,GRangesList,XStringSet,ScanBamParam'
getData(x, bamfiles, grl, sequences, param, args)

## S4 method for signature 'SequenceData'
bamfiles(x)

## S4 method for signature 'SequenceData'
conditions(object)

## S4 method for signature 'SequenceData'
ranges(x)

## S4 method for signature 'SequenceData'
replicates(x)

## S4 method for signature 'SequenceData'
seqinfo(x)

## S4 method for signature 'SequenceData'
sequences(x)

## S4 method for signature 'SequenceData'
seqtype(x)

## S4 replacement method for signature 'SequenceData'
seqtype(x) <- value

## S4 method for signature 'SequenceData'
dataType(x)

## S4 method for signature 'SequenceDataSet'
show(object)

## S4 method for signature 'SequenceDataSet'
bamfiles(x)

## S4 method for signature 'SequenceDataSet'
conditions(object)

## S4 method for signature 'SequenceDataSet'
names(x)

## S4 method for signature 'SequenceDataSet'
ranges(x)

## S4 method for signature 'SequenceDataSet'
replicates(x)

## S4 method for signature 'SequenceDataSet'
seqinfo(x)

## S4 method for signature 'SequenceDataSet'
seqtype(x)

## S4 replacement method for signature 'SequenceDataSet'
seqtype(x) <- value

## S4 method for signature 'SequenceDataSet'
sequences(x)

## S4 method for signature 'SequenceDataList'
show(object)

## S4 method for signature 'SequenceDataList'
bamfiles(x)

## S4 method for signature 'SequenceDataList'
conditions(object)

## S4 method for signature 'SequenceDataList'
names(x)

## S4 method for signature 'SequenceDataList'
ranges(x)

## S4 method for signature 'SequenceDataList'
replicates(x)

## S4 method for signature 'SequenceDataList'
seqinfo(x)

## S4 method for signature 'SequenceDataList'
seqtype(x)

## S4 replacement method for signature 'SequenceDataList'
seqtype(x) <- value

## S4 method for signature 'SequenceDataList'
sequences(x)
x, object
|
a |
value |
a new |
bamfiles |
a |
grl |
a |
sequences |
a |
param |
a |
args |
a list of additional arguments |
seqinfo: a Seqinfo object.
sequences: a RNAStringSet object or a RNAString
object for a SequenceDataFrame.
ranges: a GRangesList object with each element per
transcript or a GRanges object for a SequenceDataFrame.
bamfiles: a BamFileList object or a SimpleList of
BamFileList objects for a SequenceDataList.
data(e5sd,package="RNAmodR")
# general accessors
seqinfo(e5sd)
sequences(e5sd)
ranges(e5sd)
bamfiles(e5sd)
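The return types listed for the accessors can be inspected directly. The following is a minimal sketch, assuming the bundled e5sd example data is available; the requireNamespace() guard and the variable name sdf are illustrative additions, and the classes named in the comments are those stated in this documentation rather than verified output.

```r
# Sketch only: requires the RNAmodR package and its example data.
if (requireNamespace("RNAmodR", quietly = TRUE)) {
  suppressPackageStartupMessages(library(RNAmodR))
  data(e5sd, package = "RNAmodR")

  # Accessors on the full SequenceData object
  print(class(ranges(e5sd)))     # documented as a GRangesList
  print(class(sequences(e5sd)))  # documented as an RNAStringSet
  print(class(bamfiles(e5sd)))   # documented as a BamFileList

  # The same accessors on a single transcript (SequenceDataFrame)
  sdf <- e5sd[[1]]
  print(class(ranges(sdf)))      # documented as a GRanges
  print(class(sequences(sdf)))   # documented as an RNAString
}
```

Printing the class instead of the object keeps the output short and makes the difference between the per-object and per-transcript accessors easy to see.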
The SequenceDataFrame class is a virtual class and contains data for
positions along a single transcript. In addition to being used for returning
elements from a SequenceData object, the SequenceDataFrame class is
used to store the unlisted data within a
SequenceData object. Therefore, a matching
SequenceData and SequenceDataFrame class must be implemented.
The SequenceDataFrame class is derived from the
DataFrame class. To follow the
functionality in the S4Vectors package, SequenceDataFrame
implements the concept, whereas SequenceDFrame is the implementation
for in-memory data representation, from which specific
*SequenceDataFrame classes derive, e.g.
CoverageSequenceData.
Subsetting of a SequenceDataFrame returns a SequenceDataFrame or
DataFrame, if it is subset by a column or row, respectively. The
drop argument is ignored for column subsetting.
## S4 method for signature 'SequenceDataFrame'
cbind(..., deparse.level = 1)

## S4 method for signature 'SequenceDataFrame,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]
x, i, j, ..., drop, deparse.level
|
arguments used for
|
A SequenceDataFrame object or, if subset by row, a
DataFrame
ranges: a GRanges
object, each element describing a transcript including its element. The
GRanges is constructed from the unlisted results of the
exonsBy(x, by="tx") function.
If during construction a GRangesList is provided instead of a
character value pointing to a gff3 file or a TxDb object, it must have
a comparable structure.
sequence: an XString of
type sequencesType from the parent
SequenceData object.
condition: conditions along the
BamFileList: either control
or treated.
replicate: replicate number along the BamFileList for each of the
condition types.
bamfiles: the input bam files as
BamFileList.
seqinfo: a Seqinfo describing
the available/used chromosomes.
For an example see
ProtectedEndSequenceData
and for more information see SequenceData.
data(e5sd,package="RNAmodR")
# A SequenceDataFrame is usually constructed by subsetting from
# a SequenceData object
sdf <- e5sd[[1]]
# It is also used to store the unlisted data in a SequenceData object
sdf <- unlist(e5sd) # should probably only be used internally
e5sd <- relist(sdf,e5sd)
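The subsetting rule stated for SequenceDataFrame (column subsetting keeps the class, row subsetting returns a DataFrame) can be checked with a short sketch. It assumes the e5sd example data; by_col and by_row are hypothetical variable names, and the requireNamespace() guard is an added convenience rather than part of the package examples.

```r
# Sketch only: requires the RNAmodR package and its example data.
if (requireNamespace("RNAmodR", quietly = TRUE)) {
  suppressPackageStartupMessages(library(RNAmodR))
  data(e5sd, package = "RNAmodR")

  # Data for the first transcript as a SequenceDataFrame
  sdf <- e5sd[[1]]

  # Column subsetting is documented to keep the SequenceDataFrame class
  # (the drop argument is ignored here)
  by_col <- sdf[, 1]

  # Row subsetting is documented to return a plain DataFrame
  by_row <- sdf[1:3, ]

  print(class(by_col))
  print(class(by_row))
}
```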
The SequenceDataList class is used to hold SequenceData or
SequenceDataSet objects as its elements. It is derived from the
List class.
The SequenceDataList is used to hold data from different sets of
aligned reads. This allows multiple methods to be aggregated into one
modification detection strategy. Annotation and sequence data must be the
same for all elements, however the bam files can be different.
SequenceDataList(...)
... |
The elements to be included in the |
a SequenceDataList
data(psd,package="RNAmodR")
data(e5sd,package="RNAmodR")
sdl <- SequenceDataList(SequenceDataSet(psd,e5sd),e5sd)
The SequenceDataSet class is used to hold SequenceData objects
as its elements. It is derived from the
List class.
The SequenceDataSet is used to hold different data types from the
same aligned reads. The same dataset can be used to generate multiple sets of
data types. Bam files, annotation and sequence data must be the same for all
elements.
SequenceDataSet(...)
... |
The elements to be included in the |
a SequenceDataSet
data(psd,package="RNAmodR")
data(e5sd,package="RNAmodR")
sdl <- SequenceDataSet(psd,e5sd)
A Gviz compatible
SequenceTrack for showing modified
DNA sequences.
ModDNASequenceTrack(sequence, chromosome, genome, name = "SequenceTrack", ...)

## S4 method for signature 'SequenceModDNAStringSetTrack'
seqnames(x)

## S4 method for signature 'SequenceModDNAStringSetTrack'
seqlevels(x)
sequence |
A |
chromosome, genome, name, ...
|
See
|
x |
A |
a SequenceModDNAStringSetTrack object
sequence: A ModDNAStringSet object
seq <- ModDNAStringSet(c(chr1 = paste0(alphabet(ModDNAString()), collapse = "")))
st <- ModDNASequenceTrack(seq)
Gviz::plotTracks(st, chromosome = "chr1",from = 1L, to = 20L)
A Gviz compatible
SequenceTrack for showing modified
RNA sequences.
ModRNASequenceTrack(sequence, chromosome, genome, name = "SequenceTrack", ...)

## S4 method for signature 'SequenceModRNAStringSetTrack'
seqnames(x)

## S4 method for signature 'SequenceModRNAStringSetTrack'
seqlevels(x)
sequence |
A |
chromosome, genome, name, ...
|
See
|
x |
A |
a SequenceModRNAStringSetTrack object
sequence: A ModRNAStringSet object
seq <- ModRNAStringSet(c(chr1 = paste0(alphabet(ModRNAString()), collapse = "")))
st <- ModRNASequenceTrack(seq)
# on some systems, character encoding during printing is not handled correctly
## Not run: 
Gviz::plotTracks(st, chromosome = "chr1",from = 1L, to = 20L)
## End(Not run)
Modifier objects

Depending on data preparation, quality and desired stringency of a modification
strategy, settings for cut-off parameters or other variables may need to be
adjusted. This should rarely be the case, but a function for changing these
settings is implemented as the settings function.
For changing values the input can be either a list or something
coercible to a list. Upon changing a setting, the validity of the
value in terms of type(!) and dimensions will be checked.
If settings have been modified after the data was loaded, the data is
potentially invalid. To update the data, run the aggregate or the
modify function.
settings(x, name = NULL)

settings(x, name) <- value

## S4 method for signature 'Modifier'
settings(x, name = NULL)

## S4 replacement method for signature 'Modifier'
settings(x) <- value

## S4 method for signature 'ModifierSet'
settings(x, name = NULL)

## S4 replacement method for signature 'ModifierSet'
settings(x) <- value
x |
a |
name |
name of the setting to be returned or set |
value |
value of the setting to be set |
If name is omitted, settings returns a list of all settings.
If name is set, settings returns a single setting or
NULL, if a value for name is not available.
data(msi,package="RNAmodR")
mi <- msi[[1]]
# returns a list of all settings
settings(mi)
# accesses a specific setting
settings(mi,"minCoverage")
# modification of setting
settings(mi) <- list(minCoverage = 11L)
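Because data computed before a settings change is potentially invalid, a typical follow-up is to recompute the aggregated data. The sketch below assumes the msi example data and relies on the documented signature aggregate(x, force = FALSE) for Modifier objects; that force = TRUE triggers recomputation of already stored data, and that aggregate returns the updated Modifier, are assumptions here. The requireNamespace() guard is an illustrative addition.

```r
# Sketch only: requires the RNAmodR package and its example data.
if (requireNamespace("RNAmodR", quietly = TRUE)) {
  suppressPackageStartupMessages(library(RNAmodR))
  data(msi, package = "RNAmodR")
  mi <- msi[[1]]

  # Change a setting; the value is validated on assignment
  settings(mi) <- list(minCoverage = 11L)
  print(settings(mi, "minCoverage"))

  # The stored aggregate was computed with the previous settings.
  # Assumption: force = TRUE recomputes it even though aggregated
  # data is already present in the object.
  mi <- aggregate(mi, force = TRUE)
  print(hasAggregateData(mi))
}
```

The modify function mentioned above can be used in the same place when the search for modifications should be repeated as well.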
stats returns information about reads used in the RNAmodR analysis.
Three modes are available depending on which type of object is provided. If a
SequenceData object is provided, a
BamFile or
BamFileList must be provided as well. If a
Modifier object is used, the bam files
returned from the bamfiles function are used. This is also the case,
if a ModifierSet object is used.
stats(x, file, ...)

## S4 method for signature 'SequenceData,BamFile'
stats(x, file, ...)

## S4 method for signature 'SequenceData,BamFileList'
stats(x, file, ...)

## S4 method for signature 'Modifier,missing'
stats(x)

## S4 method for signature 'ModifierSet,missing'
stats(x)
x |
a |
file |
a |
... |
optional parameters used as stated
|
a DataFrame, DataFrameList or SimpleList with
the results in aggregated form
library(RNAmodR.Data)
library(rtracklayer)
sequences <- RNAmodR.Data.example.AAS.fasta()
annotation <- GFF3File(RNAmodR.Data.example.AAS.gff3())
files <- list("SampleSet1" = c(treated = RNAmodR.Data.example.wt.1(),
                               treated = RNAmodR.Data.example.wt.2(),
                               treated = RNAmodR.Data.example.wt.3()),
              "SampleSet2" = c(treated = RNAmodR.Data.example.bud23.1(),
                               treated = RNAmodR.Data.example.bud23.2()),
              "SampleSet3" = c(treated = RNAmodR.Data.example.trm8.1(),
                               treated = RNAmodR.Data.example.trm8.2()))
msi <- ModSetInosine(files, annotation = annotation, sequences = sequences)
# smallest chunk of information
stats(sequenceData(msi[[1L]]),bamfiles(msi[[1L]])[[1L]])
# partial information
stats(sequenceData(msi[[1L]]),bamfiles(msi[[1L]]))
# the whole stats
stats(msi)
SequenceData, SequenceDataSet,
SequenceDataList, Modifier or ModifierSet object.

With the subsetByCoord function, data from a SequenceData,
SequenceDataSet, SequenceDataList, Modifier or
ModifierSet object can be subset to positions as defined in
coord.
If coord contains a column mod and x is a
Modifier object, it will be filtered to identifiers matching the
modType of x. To disable this
behaviour, remove the column mod from coord or set type =
NA.
labelByCoord functions similarly. It will return a
SplitDataFrameList, which matches the dimensions of the aggregated
data plus the labels column, which contains logical values to indicate
selected positions.
subsetByCoord(x, coord, ...)

labelByCoord(x, coord, ...)

## S4 method for signature 'Modifier,GRanges'
subsetByCoord(x, coord, ...)

## S4 method for signature 'Modifier,GRangesList'
subsetByCoord(x, coord, ...)

## S4 method for signature 'ModifierSet'
subset(x, name, pos = 1L, ...)

## S4 method for signature 'ModifierSet,GRanges'
subsetByCoord(x, coord, ...)

## S4 method for signature 'ModifierSet,GRangesList'
subsetByCoord(x, coord, ...)

## S4 method for signature 'Modifier,GRanges'
labelByCoord(x, coord, ...)

## S4 method for signature 'Modifier,GRangesList'
labelByCoord(x, coord, ...)

## S4 method for signature 'ModifierSet,GRanges'
labelByCoord(x, coord, ...)

## S4 method for signature 'ModifierSet,GRangesList'
labelByCoord(x, coord, ...)

## S4 method for signature 'SplitDataFrameList,GRanges'
subsetByCoord(x, coord, ...)

## S4 method for signature 'SequenceData'
subset(x, name, pos = 1L, ...)

## S4 method for signature 'SequenceData,GRanges'
subsetByCoord(x, coord, ...)

## S4 method for signature 'SequenceData,GRangesList'
subsetByCoord(x, coord, ...)

## S4 method for signature 'SequenceDataSet'
subset(x, name, pos = 1L, ...)

## S4 method for signature 'SequenceDataSet,GRanges'
subsetByCoord(x, coord, ...)

## S4 method for signature 'SequenceDataSet,GRangesList'
subsetByCoord(x, coord, ...)

## S4 method for signature 'SequenceDataList'
subset(x, name, pos = 1L, ...)

## S4 method for signature 'SequenceDataList,GRanges'
subsetByCoord(x, coord, ...)

## S4 method for signature 'SequenceDataList,GRangesList'
subsetByCoord(x, coord, ...)

## S4 method for signature 'SequenceData,GRanges'
labelByCoord(x, coord, ...)

## S4 method for signature 'SequenceData,GRangesList'
labelByCoord(x, coord, ...)

## S4 method for signature 'SequenceDataSet,GRanges'
labelByCoord(x, coord, ...)

## S4 method for signature 'SequenceDataSet,GRangesList'
labelByCoord(x, coord, ...)

## S4 method for signature 'SequenceDataList,GRanges'
labelByCoord(x, coord, ...)

## S4 method for signature 'SequenceDataList,GRangesList'
labelByCoord(x, coord, ...)
x |
a |
coord |
coordinates of position to subset to. Either a |
... |
Optional parameters:
|
name |
Optional: Limit results to one specific transcript. |
pos |
Optional: Limit results to a specific position. |
If 'x' is a
SequenceData or
Modifier: a SplitDataFrameList
with elements per transcript.
SequenceDataSet,
SequenceDataList or
ModifierSet: a SimpleList of
SplitDataFrameList with elements per transcript.
data(msi,package="RNAmodR")
mod <- modifications(msi)
coord <- unique(unlist(mod))
coord$score <- NULL
coord$sd <- NULL
subsetByCoord(msi,coord)
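labelByCoord takes the same inputs as subsetByCoord but keeps all positions and marks the selected ones. The following is a minimal sketch, assuming the msi example data and reusing the coordinate preparation from the example above; the variable name lbl and the requireNamespace() guard are illustrative additions.

```r
# Sketch only: requires the RNAmodR package and its example data.
if (requireNamespace("RNAmodR", quietly = TRUE)) {
  suppressPackageStartupMessages(library(RNAmodR))
  data(msi, package = "RNAmodR")

  # Build coordinates from the detected modifications, as in the
  # subsetByCoord example
  coord <- unique(unlist(modifications(msi)))
  coord$score <- NULL
  coord$sd <- NULL

  # Label instead of subset: per the description, the aggregated data
  # is returned with an additional logical 'labels' column marking
  # the positions found in coord
  lbl <- labelByCoord(msi[[1]], coord)
  print(lbl)
}
```

A single Modifier (msi[[1]]) is used here so that the result is one SplitDataFrameList instead of a list of them.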