Title: | LC-MS and GC-MS Data Analysis |
---|---|
Description: | Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling. |
Authors: | Colin A. Smith [aut], Ralf Tautenhahn [aut], Steffen Neumann [aut, cre] , Paul Benton [aut], Christopher Conley [aut], Johannes Rainer [aut] , Michael Witting [ctb], William Kumler [aut] , Philippine Louail [aut] , Pablo Vangeenderhuysen [ctb] , Carl Brunius [ctb] |
Maintainer: | Steffen Neumann <[email protected]> |
License: | GPL (>= 2) + file LICENSE |
Version: | 4.5.2 |
Built: | 2024-12-17 03:20:49 UTC |
Source: | https://github.com/bioc/xcms |
The methods listed on this page allow filtering and subsetting of XCMSnExp objects. Most of them are inherited from the OnDiskMSnExp object defined in the MSnbase package and have been adapted for XCMSnExp to enable correct subsetting of preprocessing results.
[: subsets an XCMSnExp object by spectra. Be aware that this removes all preprocessing results, except adjusted retention times if keepAdjustedRtime = TRUE is passed to the method.
[[: extracts a single Spectrum object (defined in MSnbase). The reported retention time is the adjusted retention time if alignment has been performed.
filterChromPeaks: subsets the chromPeaks matrix in object. Parameter method specifies how the chromatographic peaks should be filtered. Currently, only method = "keep" is supported, which selects the chromatographic peaks to keep with parameter keep (i.e. a logical, integer or character vector defining which chromatographic peaks to keep). Feature definitions (if present) are updated correspondingly.
filterFeatureDefinitions: subsets the feature definitions of an XCMSnExp object. Parameter features defines which features to keep. It can be a logical, integer (index of features to keep) or character (feature IDs) vector.
filterFile: reduces the XCMSnExp to data from only the selected files. Identified chromatographic peaks for these files are retained while correspondence results (feature definitions) are removed by default. To keep feature definitions use keepFeatures = TRUE. Adjusted retention times (if present) are retained by default. Use keepAdjustedRtime = FALSE to drop them.
filterMsLevel: reduces the XCMSnExp object to spectra of the specified MS level(s). Chromatographic peaks and identified features are also subset to the respective MS level. See also the filterMsLevel documentation in MSnbase for details and examples.
filterMz: filters the data set based on the provided m/z value range. All chromatographic peaks and features (grouped peaks) with their apex falling within the provided m/z range are retained (i.e. if chromPeaks(object)[, "mz"] is >= mz[1] and <= mz[2]). Adjusted retention times, if present, are kept.
filterRt: filters the data set based on the provided retention time range. All chromatographic peaks and features (grouped peaks) within the specified retention time window are retained (i.e. if the retention time corresponding to the peak's apex is within the specified rt range). If retention time correction has been performed, the method will by default filter the object by adjusted retention times. The argument adjusted specifies manually whether filtering should be performed on raw or adjusted retention times. Filtering by retention time does not drop any preprocessing results, nor does it remove or change alignment results (i.e. adjusted retention times). The method returns an empty object if no spectrum or feature is within the specified retention time range.
split: splits an XCMSnExp object into a list of XCMSnExp objects based on the provided parameter f. Note that by default all preprocessing results are removed by the splitting. Adjusted retention times are kept only if the optional argument keepAdjustedRtime = TRUE is provided.
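The apex-based range rule described for filterMz and filterRt can be illustrated without xcms. The following is a minimal base R sketch: the toy matrix peaks stands in for chromPeaks(object), and the helper keep_in_range() is hypothetical and not part of xcms. Treating the retention time bounds as inclusive is an assumption; the text above only states this explicitly for m/z.

```r
## Toy stand-in for chromPeaks(object): one row per chromatographic peak,
## with the m/z and retention time of the peak apex.
peaks <- cbind(
    mz = c(250.1, 310.2, 399.9, 400.0, 455.3),
    rt = c(2650, 2710, 2895, 2900, 3100)
)

## Hypothetical helper: TRUE for values inside the closed interval
## [range[1], range[2]], mirroring the ">= mz[1] and <= mz[2]" rule.
keep_in_range <- function(x, range) {
    stopifnot(length(range) == 2, range[1] <= range[2])
    x >= range[1] & x <= range[2]
}

## Peaks whose apex m/z lies within 300-400 (bounds included).
by_mz <- peaks[keep_in_range(peaks[, "mz"], c(300, 400)), , drop = FALSE]

## Peaks whose apex retention time lies within 2700-2900 seconds
## (inclusive bounds are an assumption here).
by_rt <- peaks[keep_in_range(peaks[, "rt"], c(2700, 2900)), , drop = FALSE]

nrow(by_mz)
nrow(by_rt)
```

Using drop = FALSE keeps the result a matrix even when a single peak (or none) survives the filter, which matches the idea that an empty result is still a valid object.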
## S4 method for signature 'XCMSnExp,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'XCMSnExp,ANY,ANY'
x[[i, j, drop = FALSE]]

## S4 method for signature 'XCMSnExp'
filterMsLevel(object, msLevel., keepAdjustedRtime = hasAdjustedRtime(object))

## S4 method for signature 'XCMSnExp'
filterFile(
  object,
  file,
  keepAdjustedRtime = hasAdjustedRtime(object),
  keepFeatures = FALSE
)

## S4 method for signature 'XCMSnExp'
filterMz(object, mz, msLevel., ...)

## S4 method for signature 'XCMSnExp'
filterRt(object, rt, msLevel., adjusted = hasAdjustedRtime(object))

## S4 method for signature 'XCMSnExp,ANY'
split(x, f, drop = FALSE, ...)

## S4 method for signature 'XCMSnExp'
filterChromPeaks(
  object,
  keep = rep(TRUE, nrow(chromPeaks(object))),
  method = "keep",
  ...
)

## S4 method for signature 'XCMSnExp'
filterFeatureDefinitions(object, features = integer())
x |
For |
i |
For |
j |
For |
... |
Optional additional arguments. |
drop |
For |
object |
A XCMSnExp object. |
msLevel. |
For |
keepAdjustedRtime |
For |
file |
For |
keepFeatures |
For |
mz |
For |
rt |
For |
adjusted |
For |
f |
For |
keep |
For |
method |
For |
features |
For |
All subsetting methods try to ensure that the returned data is consistent. Correspondence results, for example, are removed by default if the data set is subset by file, since the correspondence results depend on the files on which correspondence was performed. This can be changed by setting keepFeatures = TRUE.
For adjusted retention times, most subsetting methods support the argument keepAdjustedRtime (even the [ method), which forces the adjusted retention times to be retained even if the default would be to drop them.
All methods return an XCMSnExp object.
The filterFile method also removes process history steps not related to the files to which the object is subset and updates the fileIndex attribute accordingly. In addition, the method does not allow arbitrary ordering or re-ordering of the files within the object.
Note also that most of the filtering methods, as well as the subsetting operation [, drop all or selected preprocessing results. To consolidate the alignment results, i.e. ensure that adjusted retention times are always preserved, use the applyAdjustedRtime() function on the object that contains the alignment results. This replaces the raw retention times with the adjusted ones.
Johannes Rainer
XCMSnExp for base class documentation.
XChromatograms() for similar filter functions on XChromatograms objects.
## Loading a test data set with identified chromatographic peaks
library(xcms)
library(MSnbase)
data(faahko_sub)
## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

## Subset the dataset to the first and third file.
xod_sub <- filterFile(faahko_sub, file = c(1, 3))

## The number of chromatographic peaks per file for the full object
table(chromPeaks(faahko_sub)[, "sample"])

## The number of chromatographic peaks per file for the subset
table(chromPeaks(xod_sub)[, "sample"])

basename(fileNames(faahko_sub))
basename(fileNames(xod_sub))

## Filter on mz values; chromatographic peaks and features within the
## mz range are retained (as well as adjusted retention times).
xod_sub <- filterMz(faahko_sub, mz = c(300, 400))
head(chromPeaks(xod_sub))
nrow(chromPeaks(xod_sub))
nrow(chromPeaks(faahko_sub))

## Filter on rt values. All chromatographic peaks and features within the
## retention time range are retained. Filtering is performed by default on
## adjusted retention times, if present.
xod_sub <- filterRt(faahko_sub, rt = c(2700, 2900))

range(rtime(xod_sub))
head(chromPeaks(xod_sub))
range(chromPeaks(xod_sub)[, "rt"])

nrow(chromPeaks(faahko_sub))
nrow(chromPeaks(xod_sub))

## Extract a single Spectrum
faahko_sub[[4]]

## Subsetting using [ removes all preprocessing results - using
## keepAdjustedRtime = TRUE would keep adjusted retention times, if present.
xod_sub <- faahko_sub[fromFile(faahko_sub) == 1]
xod_sub

## Using split does also remove preprocessing results, but it supports the
## optional parameter keepAdjustedRtime.
## Split the object into a list of XCMSnExp objects, one per file
xod_list <- split(faahko_sub, f = fromFile(faahko_sub))
xod_list
Subset an xcmsRaw object by scans. The returned xcmsRaw object contains values for all scans specified with argument i. Note that the scanrange slot of the returned xcmsRaw will be c(1, length(object@scantime)) and hence not range(i).
## S4 method for signature 'xcmsRaw,logicalOrNumeric,missing,missing'
x[i, j, drop]
x |
The |
i |
Integer or logical vector specifying the scans/spectra to which
|
j |
Not supported. |
drop |
Not supported. |
Only subsetting by scan index in increasing order or by a logical vector is supported. If not ordered, argument i is sorted automatically. Indices larger than the total number of scans are discarded.
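The index handling described above (logical or integer input, automatic sorting, out-of-range indices dropped) can be sketched in base R. The helper normalize_scan_index() below is hypothetical and only mimics the documented behaviour; it is not the xcms implementation.

```r
## Hypothetical helper mimicking the documented handling of argument i.
## n_scans plays the role of length(object@scantime).
normalize_scan_index <- function(i, n_scans) {
    if (is.logical(i)) {
        ## A logical vector selects the scans flagged TRUE.
        stopifnot(length(i) == n_scans)
        i <- which(i)
    }
    ## Unordered indices are sorted into increasing order.
    i <- sort(as.integer(i))
    ## Indices beyond the number of scans are discarded.
    i[i >= 1L & i <= n_scans]
}

## Unordered input with one index (12) beyond the 10 available scans.
normalize_scan_index(c(7, 3, 12, 5), n_scans = 10)

## Logical input.
normalize_scan_index(c(TRUE, FALSE, TRUE), n_scans = 3)
```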
The subset xcmsRaw object.
Johannes Rainer
## Load a test file
library(xcms)
file <- system.file('cdf/KO/ko15.CDF', package = "faahKO")
xraw <- xcmsRaw(file, profstep = 0)

## The number of scans/spectra:
length(xraw@scantime)

## Subset the object to scans with a scan time from 3500 to 4000.
xsub <- xraw[xraw@scantime >= 3500 & xraw@scantime <= 4000]
range(xsub@scantime)

## The number of scans:
length(xsub@scantime)

## The number of values of the subset:
length(xsub@env$mz)
Determine which peaks are absent / present in a sample class
object |
|
class |
Name of a sample class from |
minfrac |
minimum fraction of samples necessary in the class to be absent/present |
The functions treat peaks that are only present because of fillPeaks correctly, i.e. they do not count them as present.
A logical vector with the same length as nrow(groups(object)).
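The minfrac rule can be illustrated with a small base R sketch. The logical matrix detected and the helpers present_in_class() and absent_in_class() are hypothetical and only follow the description above; peaks that exist only because of gap filling are assumed to be coded as FALSE already.

```r
## Toy detection table: rows are features (peak groups), columns are samples.
## TRUE means a genuinely detected peak; filled-in peaks are assumed to be
## FALSE so that they are not counted as present.
detected <- rbind(
    f1 = c(TRUE,  TRUE,  TRUE,  FALSE),
    f2 = c(TRUE,  FALSE, FALSE, FALSE),
    f3 = c(FALSE, FALSE, TRUE,  TRUE)
)
sample_class <- c("KO", "KO", "WT", "WT")

## Hypothetical helpers (not the xcms functions): a feature is present
## (absent) in a class if the number of class samples with (without) a peak
## reaches minfrac times the class size.
present_in_class <- function(detected, sample_class, class, minfrac) {
    in_class <- sample_class == class
    rowSums(detected[, in_class, drop = FALSE]) >= minfrac * sum(in_class)
}
absent_in_class <- function(detected, sample_class, class, minfrac) {
    in_class <- sample_class == class
    rowSums(!detected[, in_class, drop = FALSE]) >= minfrac * sum(in_class)
}

present_in_class(detected, sample_class, "KO", minfrac = 1)
absent_in_class(detected, sample_class, "KO", minfrac = 1)
```

With minfrac below 1 the two results are not complements of each other: in this toy data, feature f2 counts as both present and absent in class "KO" for minfrac = 0.5.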
absent(object, ...)
present(object, ...)
The adjustRtime method(s) perform retention time correction (alignment) between chromatograms of different samples/data sets. Alignment is performed by default on MS level 1 data. Retention times of spectra from other MS levels, if present, are subsequently adjusted based on the adjusted retention times of the MS1 spectra. Note that calling adjustRtime on an xcms result object will remove any existing alignment results as well as any correspondence analysis results. To run a second round of alignment, raw retention times need to be replaced with adjusted ones using the applyAdjustedRtime() function.
The alignment method can be specified (and configured) using a dedicated param argument.
Supported param objects are:
ObiwarpParam: performs retention time adjustment based on the full m/z - rt data using the obiwarp method (Prince 2006). It is based on the original code but additionally supports alignment of multiple samples by aligning each against a center sample. The alignment is performed directly on the profile matrix and can hence be performed independently of the peak detection or peak grouping.
PeakGroupsParam: performs retention time correction based on the alignment of features defined in all/most samples (corresponding to housekeeping compounds or marker compounds) (Smith 2006). First, the retention time deviation of these features is described by fitting either a polynomial (smooth = "loess") or a linear (smooth = "linear") function to the data points. The fitted functions are then used to adjust the retention time of each spectrum in each sample (including spectra of MS levels other than MS 1). Since the method is based on features (i.e. chromatographic peaks grouped across samples), an initial correspondence analysis has to be performed beforehand with the groupChromPeaks() function. Alternatively, it is also possible to manually define a numeric matrix with retention times of markers in each sample that should be used for alignment. Such a matrix can be passed to the alignment function using the peakGroupsMatrix parameter of the PeakGroupsParam parameter object. By default, the adjustRtimePeakGroups function is used to define this matrix. This function identifies peak groups (features) for alignment in object based on the parameters defined in param. See also do_adjustRtime_peakGroups() for the core API function.
LamaParama: performs retention time correction by aligning chromatographic data to an external reference dataset (concept and initial implementation by Carl Brunius). The process involves identifying and aligning peaks within the experimental chromatographic data, represented as an XcmsExperiment object, to a predefined set of landmark features called "lamas". These landmark features are characterized by their mass-to-charge ratio (m/z) and retention time. See LamaParama() for more information on the method.
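The idea behind the peak-groups approach (fit the retention time deviation of marker features, then subtract the fitted deviation from every retention time) can be sketched in base R. This is a simplified illustration corresponding to smooth = "linear" only; the matrix markers and the helper adjust_rt_linear() are hypothetical, and the real do_adjustRtime_peakGroups() function handles additional settings that are ignored here.

```r
## Toy peak-groups matrix: rows are marker features, columns are samples,
## values are the retention times (seconds) of each marker in each sample.
markers <- cbind(
    s1 = c(100, 200, 300, 400, 500),
    s2 = c(103, 206, 309, 412, 515),   # drifts by 3% relative to s1
    s3 = c( 98, 198, 298, 398, 498)    # constant shift of -2 s
)

## Reference retention time of each marker: the median across samples.
reference <- apply(markers, 1, median)

## Hypothetical helper: model the deviation of one sample from the reference
## with a linear fit and subtract the predicted deviation from any
## retention time rt of that sample.
adjust_rt_linear <- function(rt, sample_rt, reference_rt) {
    deviation <- sample_rt - reference_rt
    fit <- lm(deviation ~ sample_rt)
    rt - predict(fit, newdata = data.frame(sample_rt = rt))
}

## Adjust three spectrum retention times measured in sample s2.
adjusted_s2 <- adjust_rt_linear(c(51.5, 257.5, 463.5),
                                markers[, "s2"], reference)
adjusted_s2
```

Because the fitted model is a function of retention time, it can be applied to every spectrum of the sample and not only to the marker peaks themselves.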
adjustRtime(object, param, ...)

## S4 method for signature 'MsExperiment,ObiwarpParam'
adjustRtime(object, param, chunkSize = 2L, BPPARAM = bpparam())

## S4 method for signature 'MsExperiment,PeakGroupsParam'
adjustRtime(object, param, msLevel = 1L, ...)

PeakGroupsParam(
  minFraction = 0.9,
  extraPeaks = 1,
  smooth = "loess",
  span = 0.2,
  family = "gaussian",
  peakGroupsMatrix = matrix(nrow = 0, ncol = 0),
  subset = integer(),
  subsetAdjust = c("average", "previous")
)

ObiwarpParam(
  binSize = 1,
  centerSample = integer(),
  response = 1L,
  distFun = "cor_opt",
  gapInit = numeric(),
  gapExtend = numeric(),
  factorDiag = 2,
  factorGap = 1,
  localAlignment = FALSE,
  initPenalty = 0,
  subset = integer(),
  subsetAdjust = c("average", "previous"),
  rtimeDifferenceThreshold = 5
)

adjustRtimePeakGroups(object, param = PeakGroupsParam(), msLevel = 1L)

## S4 method for signature 'OnDiskMSnExp,ObiwarpParam'
adjustRtime(object, param, msLevel = 1L)

## S4 method for signature 'PeakGroupsParam'
minFraction(object)

## S4 replacement method for signature 'PeakGroupsParam'
minFraction(object) <- value

## S4 method for signature 'PeakGroupsParam'
extraPeaks(object)

## S4 replacement method for signature 'PeakGroupsParam'
extraPeaks(object) <- value

## S4 method for signature 'PeakGroupsParam'
smooth(x)

## S4 replacement method for signature 'PeakGroupsParam'
smooth(object) <- value

## S4 method for signature 'PeakGroupsParam'
span(object)

## S4 replacement method for signature 'PeakGroupsParam'
span(object) <- value

## S4 method for signature 'PeakGroupsParam'
family(object)

## S4 replacement method for signature 'PeakGroupsParam'
family(object) <- value

## S4 method for signature 'PeakGroupsParam'
peakGroupsMatrix(object)

## S4 replacement method for signature 'PeakGroupsParam'
peakGroupsMatrix(object) <- value

## S4 method for signature 'PeakGroupsParam'
subset(x)

## S4 replacement method for signature 'PeakGroupsParam'
subset(object) <- value

## S4 method for signature 'PeakGroupsParam'
subsetAdjust(object)

## S4 replacement method for signature 'PeakGroupsParam'
subsetAdjust(object) <- value

## S4 method for signature 'ObiwarpParam'
binSize(object)

## S4 replacement method for signature 'ObiwarpParam'
binSize(object) <- value

## S4 method for signature 'ObiwarpParam'
centerSample(object)

## S4 replacement method for signature 'ObiwarpParam'
centerSample(object) <- value

## S4 method for signature 'ObiwarpParam'
response(object)

## S4 replacement method for signature 'ObiwarpParam'
response(object) <- value

## S4 method for signature 'ObiwarpParam'
distFun(object)

## S4 replacement method for signature 'ObiwarpParam'
distFun(object) <- value

## S4 method for signature 'ObiwarpParam'
gapInit(object)

## S4 replacement method for signature 'ObiwarpParam'
gapInit(object) <- value

## S4 method for signature 'ObiwarpParam'
gapExtend(object)

## S4 replacement method for signature 'ObiwarpParam'
gapExtend(object) <- value

## S4 method for signature 'ObiwarpParam'
factorDiag(object)

## S4 replacement method for signature 'ObiwarpParam'
factorDiag(object) <- value

## S4 method for signature 'ObiwarpParam'
factorGap(object)

## S4 replacement method for signature 'ObiwarpParam'
factorGap(object) <- value

## S4 method for signature 'ObiwarpParam'
localAlignment(object)

## S4 replacement method for signature 'ObiwarpParam'
localAlignment(object) <- value

## S4 method for signature 'ObiwarpParam'
initPenalty(object)

## S4 replacement method for signature 'ObiwarpParam'
initPenalty(object) <- value

## S4 method for signature 'ObiwarpParam'
subset(x)

## S4 replacement method for signature 'ObiwarpParam'
subset(object) <- value

## S4 method for signature 'ObiwarpParam'
subsetAdjust(object)

## S4 replacement method for signature 'ObiwarpParam'
subsetAdjust(object) <- value

## S4 method for signature 'XCMSnExp,PeakGroupsParam'
adjustRtime(object, param, msLevel = 1L)

## S4 method for signature 'XCMSnExp,ObiwarpParam'
adjustRtime(object, param, msLevel = 1L)
object |
For |
param |
The parameter object defining the alignment method (and its setting). |
... |
ignored. |
chunkSize |
For |
BPPARAM |
parallel processing setup. Defaults to |
msLevel |
For |
minFraction |
For |
extraPeaks |
For |
smooth |
For |
span |
For |
family |
For |
peakGroupsMatrix |
For |
subset |
For |
subsetAdjust |
For |
binSize |
|
centerSample |
|
response |
For |
distFun |
For |
gapInit |
For |
gapExtend |
For |
factorDiag |
For |
factorGap |
For |
localAlignment |
For |
initPenalty |
For |
rtimeDifferenceThreshold |
For |
value |
The value for the slot. |
x |
An |
adjustRtime on an OnDiskMSnExp or XCMSnExp object will return an XCMSnExp object with the alignment results.
adjustRtime on an MsExperiment or XcmsExperiment will return an XcmsExperiment with the adjusted retention times stored in a new spectra variable rtime_adjusted in the object's spectra.
ObiwarpParam, PeakGroupsParam and LamaParama return the respective parameter object.
adjustRtimePeakGroups returns a matrix with the retention times of marker features in each sample (each row one feature, each column one sample).
All alignment methods allow performing the retention time correction on a user-selected subset of samples (e.g. QC samples), after which all samples not part of that subset are adjusted based on the adjusted retention times of the closest subset sample (close in terms of index within object and hence possibly injection index). It is thus suggested to load MS data files in the order in which their samples were injected in the measurement run(s).
How the non-subset samples are adjusted also depends on the parameter subsetAdjust: with subsetAdjust = "previous", each non-subset sample is adjusted based on the closest previous subset sample, which in most cases results in the adjusted retention times of the non-subset sample being identical to those of the subset sample on which the adjustment is based. The second, default, option is subsetAdjust = "average", in which case each non-subset sample is adjusted based on the average retention time adjustment from the previous and following subset sample. For the average, a weighted mean is used with weights being the inverse of the distance of the non-subset sample to the subset samples used for alignment.
See also section Alignment of experiments including blanks in the xcms vignette for more details.
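The inverse-distance weighting used for subsetAdjust = "average" can be made concrete with a small base R sketch. The helper adjust_non_subset() is hypothetical; for simplicity it works on a single adjustment value (in seconds) per subset sample rather than on full retention time vectors, and it assumes the non-subset sample lies between two subset samples.

```r
## Hypothetical helper (not part of xcms): adjustment for the non-subset
## sample at position idx, given the positions of the subset samples and
## one retention time adjustment (in seconds) per subset sample.
adjust_non_subset <- function(idx, subset_idx, subset_adjustment,
                              subsetAdjust = c("average", "previous")) {
    subsetAdjust <- match.arg(subsetAdjust)
    prev_i <- max(subset_idx[subset_idx < idx])
    adj_prev <- subset_adjustment[match(prev_i, subset_idx)]
    if (subsetAdjust == "previous")
        return(adj_prev)
    next_i <- min(subset_idx[subset_idx > idx])
    adj_next <- subset_adjustment[match(next_i, subset_idx)]
    ## Weights are the inverse of the index distance to each subset sample.
    weighted.mean(c(adj_prev, adj_next),
                  w = 1 / c(idx - prev_i, next_i - idx))
}

## Samples 1 and 5 are QC (subset) samples adjusted by +2 s and +6 s.
## Sample 2 is three times closer to sample 1 than to sample 5.
adjust_non_subset(2, subset_idx = c(1, 5), subset_adjustment = c(2, 6))
adjust_non_subset(2, subset_idx = c(1, 5), subset_adjustment = c(2, 6),
                  subsetAdjust = "previous")
```

For sample 2 the weights are 1 and 1/3, so the weighted mean is (2 * 1 + 6 * 1/3) / (1 + 1/3) = 3 seconds, closer to the adjustment of the nearer subset sample.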
Colin Smith, Johannes Rainer, Philippine Louail, Carl Brunius
Prince, J. T., and Marcotte, E. M. (2006) "Chromatographic Alignment of ESI-LC-MS Proteomic Data Sets by Ordered Bijective Interpolated Warping" Anal. Chem., 78 (17), 6140-6152.
Smith, C.A., Want, E.J., O'Maille, G., Abagyan, R. and Siuzdak, G. (2006). "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 78:779-787.
plotAdjustedRtime() for visualization of alignment results.
Alignment is achieved using the 'adjustRtime()' method with a 'param' of class 'LamaParama'. This method corrects retention time by aligning chromatographic data with an external reference dataset.
Chromatographic peaks in the experimental data are first matched to predefined (external) landmark features based on their mass-to-charge ratio and retention time. Subsequently, the data is aligned by minimizing the differences in retention times between the matched chromatographic peaks and lamas. This adjustment is performed file by file.
Adjustable parameters such as 'ppm', 'tolerance', and 'toleranceRt' define acceptable deviations during the matching process. It is crucial to note that only lamas and chromatographic peaks exhibiting a one-to-one mapping are considered when estimating retention time shifts. If a file has no peaks matching with lamas, no adjustment will be performed, and the retention times will be returned as-is. Users can evaluate this matching, for example by checking the number of matches and ranges of the matching peaks, by first running 'matchLamasChromPeaks()'.
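The matching step can be sketched in base R to show how the tolerances and the one-to-one requirement interact. The helper match_lamas() and the toy matrices are hypothetical and not the xcms implementation; in particular, adding 'tolerance' and the 'ppm' window together into a single m/z window is an assumption made for this sketch.

```r
## Toy reference landmarks (lamas) and observed chromatographic peaks.
lamas <- cbind(mz = c(150.0000, 200.0000, 300.0000),
               rt = c(60, 120, 240))
peaks <- cbind(mz = c(150.0010, 200.0030, 200.0035, 300.1000),
               rt = c(62, 118, 121, 241))

## Hypothetical helper: for every lama, find the peaks within the m/z and
## retention time tolerances, then keep only lama/peak pairs that are
## unique in both directions (one-to-one mapping).
match_lamas <- function(peaks, lamas, ppm = 20, tolerance = 0,
                        toleranceRt = 5) {
    hits <- do.call(rbind, lapply(seq_len(nrow(lamas)), function(l) {
        ## Assumption: absolute and relative m/z tolerances are summed.
        mz_tol <- tolerance + lamas[l, "mz"] * ppm / 1e6
        mz_ok <- abs(peaks[, "mz"] - lamas[l, "mz"]) <= mz_tol
        rt_ok <- abs(peaks[, "rt"] - lamas[l, "rt"]) <= toleranceRt
        p <- which(mz_ok & rt_ok)
        cbind(lama = rep(l, length(p)), peak = p)
    }))
    ambiguous_lama <- hits[duplicated(hits[, "lama"]), "lama"]
    ambiguous_peak <- hits[duplicated(hits[, "peak"]), "peak"]
    one_to_one <- !(hits[, "lama"] %in% ambiguous_lama) &
        !(hits[, "peak"] %in% ambiguous_peak)
    hits <- hits[one_to_one, , drop = FALSE]
    data.frame(ref = lamas[hits[, "lama"], "rt"],
               obs = peaks[hits[, "peak"], "rt"])
}

matched <- match_lamas(peaks, lamas)
matched
```

In this toy data the second lama matches two peaks and is therefore dropped, and the third lama has no peak within the m/z window, so only the first lama contributes a (ref, obs) pair.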
Different warping methods are available; users can choose to fit a *loess* ('method = "loess"', the default) or a *gam* ('method = "gam"') between the reference data points and the observed matching chromatographic peaks. Additional parameters such as 'span', 'outlierTolerance', 'zeroWeight', and 'bs' are specific to these models. These parameters offer flexibility in fine-tuning how the matching chromatographic peaks are fitted to the lamas, thereby generating a model to align the overall retention time for a single file.
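How 'span', 'outlierTolerance' and 'zeroWeight' might interact in a loess fit can be sketched with base R. The helper fit_rt_model() below is hypothetical and only follows the parameter descriptions given on this page (up-weight the first pair, fit, drop points whose absolute residual exceeds 'outlierTolerance' times the mean absolute residual, refit); the actual xcms code may differ in detail.

```r
## Simulated matches: observed retention times (obs) versus the reference
## retention times of the lamas (ref), with one grossly mismatched pair.
rt_map <- data.frame(obs = seq(10, 200, by = 10))
rt_map$ref <- rt_map$obs + 2
rt_map$ref[10] <- rt_map$ref[10] + 60

## Hypothetical helper following the documented parameters.
fit_rt_model <- function(rt_map, span = 0.5, outlierTolerance = 3,
                         zeroWeight = 10) {
    rt_map <- rt_map[order(rt_map$obs), ]
    w <- rep(1, nrow(rt_map))
    w[1] <- zeroWeight                 # up-weight the first matched pair
    ## Initial fit on all matched pairs.
    fit <- loess(ref ~ obs, data = rt_map, span = span, weights = w)
    res <- abs(residuals(fit))
    ## Keep points whose absolute residual is not larger than
    ## outlierTolerance times the mean absolute residual.
    keep <- res <= outlierTolerance * mean(res)
    refit <- loess(ref ~ obs, data = rt_map[keep, ], span = span,
                   weights = w[keep])
    list(model = refit, removed = rt_map$obs[!keep])
}

fit <- fit_rt_model(rt_map)
fit$removed
## Map an observed retention time onto the reference time scale.
predict(fit$model, data.frame(obs = 100))
```

Because the threshold is relative to the mean residual of the first fit, points next to a large outlier can also be removed; checking fit$removed shows which pairs were excluded from the final model.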
Other functions related to this method:
- 'LamaParama()': returns the parameter object for alignment using the 'adjustRtime()' function. It is also the input for the functions listed below.
- 'matchLamasChromPeaks()': quickly matches each file's chromatographic peaks to lamas, allowing the user to evaluate the matches for each file.
- 'summarizeLamaMatch()': generates a summary of the 'LamaParama' method. See below for the details of the return object.
- 'matchedRtimes()': accesses the list of 'data.frame' saved in the 'LamaParama' object, generated by the 'matchLamasChromPeaks()' function.
- 'plot()': plots the chromatographic peaks versus the reference lamas as well as the fitting line for the chosen model type. The user can decide which file to inspect by specifying its number with the parameter 'index'.
## S4 method for signature 'XcmsExperiment,LamaParama'
adjustRtime(object, param, BPPARAM = bpparam(), ...)

matchLamasChromPeaks(object, param, BPPARAM = bpparam())

summarizeLamaMatch(param)

matchedRtimes(param)

LamaParama(
  lamas = matrix(ncol = 2, nrow = 0, dimnames = list(NULL, c("mz", "rt"))),
  method = c("loess", "gam"),
  span = 0.5,
  outlierTolerance = 3,
  zeroWeight = 10,
  ppm = 20,
  tolerance = 0,
  toleranceRt = 5,
  bs = "tp"
)

## S4 method for signature 'LamaParama,ANY'
plot(
  x,
  index = 1L,
  colPoints = "#00000060",
  colFit = "#00000080",
  xlab = "Matched Chromatographic peaks",
  ylab = "Lamas",
  ...
)
object |
An object of class 'XcmsExperiment' with defined ChromPeaks. |
param |
An object of class 'LamaParama' that will later be used for adjustment using the '[adjustRtime()]' function. |
BPPARAM |
For 'matchLamasChromPeaks()': parallel processing setup. Defaults to 'BPPARAM = bpparam()'. See [bpparam()] for more information. |
... |
For 'plot()': extra parameters to be passed to the function. |
lamas |
For 'LamaParama': 'matrix' or 'data.frame' with the m/z and retention times values of features (as first and second column) from the external dataset on which the alignment will be based on. |
method |
For 'LamaParama':'character(1)' with the type of warping. Either 'method = "gam"' or 'method = "loess"' (default). |
span |
For 'LamaParama': 'numeric(1)' defining the degree of smoothing ('method = "loess"'). This parameter is passed to the internal call to [loess()]. |
outlierTolerance |
For 'LamaParama': 'numeric(1)' defining the settings for outlier removal during the fitting. By default (with 'outlierTolerance = 3'), all data points with absolute residuals larger than 3 times the mean absolute residual of all data points from the first, initial fit, are removed from the final model fit. |
zeroWeight |
For 'LamaParama': 'numeric(1)': defines the weight of the first data point (i.e. retention times of the first lama-chromatographic peak pair). Values larger than 1 reduce warping problems in the early RT range. |
ppm |
For 'LamaParama': 'numeric(1)' defining the m/z-relative maximal allowed difference in m/z between 'lamas' and chromatographic peaks. Used for the mapping of identified chromatographic peaks and lamas. |
tolerance |
For 'LamaParama': 'numeric(1)' defining the absolute acceptable difference in m/z between lamas and chromatographic peaks. Used for the mapping of identified chromatographic peaks and 'lamas'. |
toleranceRt |
For 'LamaParama': 'numeric(1)' defining the absolute acceptable difference in retention time between lamas and chromatographic peaks. Used for the mapping of identified chromatographic peaks and 'lamas'. |
bs |
For 'LamaParama()': 'character(1)' defining the GAM smoothing method. (defaults to thin plate, 'bs = "tp"') |
x |
For 'plot()': object of class 'LamaParama' to be plotted. |
index |
For 'plot()': 'numeric(1)' index of the file that should be plotted. |
colPoints |
For 'plot()': color for the plotting of the datapoint. |
colFit |
For 'plot()': color of the fitting line. |
xlab, ylab |
For 'plot()': x- and y-axis labels. |
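The `ppm`, `tolerance` and `toleranceRt` parameters together decide which chromatographic peaks can be paired with a lama. The base R sketch below only illustrates that idea. It assumes that the absolute and m/z-relative tolerances are added and that each lama is paired with the candidate peak closest in retention time. The helper `match_lamas()` and the toy values are hypothetical and are not the package's internal matching code.

```r
## Hypothetical helper (not part of xcms): pair each lama with at most one
## chromatographic peak lying within the m/z and retention time tolerances.
match_lamas <- function(lamas, peaks, ppm = 20, tolerance = 0,
                        toleranceRt = 5) {
    ref <- obs <- numeric()
    for (i in seq_len(nrow(lamas))) {
        ## Assumed acceptance window: absolute plus m/z-relative tolerance.
        mz_tol <- tolerance + lamas[i, "mz"] * ppm / 1e6
        d_rt <- abs(peaks[, "rt"] - lamas[i, "rt"])
        ok <- abs(peaks[, "mz"] - lamas[i, "mz"]) <= mz_tol &
            d_rt <= toleranceRt
        if (any(ok)) {
            ## Keep the candidate closest in retention time.
            best <- which(ok)[which.min(d_rt[ok])]
            ref <- c(ref, lamas[i, "rt"])
            obs <- c(obs, peaks[best, "rt"])
        }
    }
    data.frame(ref = unname(ref), obs = unname(obs))
}

## Toy data: the second peak is too far away in m/z to be matched.
lamas <- cbind(mz = c(200.1000, 300.2000, 400.3000),
               rt = c(100, 250, 400))
peaks <- cbind(mz = c(200.1010, 300.2300, 400.3020),
               rt = c(103, 251, 395))
matched <- match_lamas(lamas, peaks, ppm = 20, toleranceRt = 5)
matched
```

Each row of `matched` is one lama/peak pair, which mirrors the ref/obs layout described for the 'rtMap' slot.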
For 'matchLamasChromPeaks()': A 'LamaParama' object with new slot 'rtMap' composed of a list of matrices representing the 1:1 matches between Lamas (ref) and ChromPeaks (obs). To access this, 'matchedRtimes()' can be used.
For 'matchedRtimes()': A list of 'data.frame' representing matches between chromPeaks and 'lamas' for each file.
For 'summarizeLamaMatch()': A 'data.frame' with:
- "Total_peaks": total number of chromatographic peaks in the file.
- "Matched_peaks": the number of peaks matched to Lamas.
- "Total_Lamas": total number of Lamas.
- "Model_summary": 'summary.loess' or 'summary.gam' object for each file.
If there are no matches when using 'matchLamasChromPeaks()', the retention times of that file will not be adjusted when calling [adjustRtime()] with the same 'LamaParama' and 'XcmsExperiment' object.
For examples of how to use these methods and their functionality, see the vignette.
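The outlier handling described for `outlierTolerance` can be pictured as a two-pass fit. The following base R sketch uses simulated lama/peak retention time pairs and `stats::loess()`. It is an illustration under the assumption that the rule is applied exactly as worded above (absolute residuals compared against a multiple of their mean), not a copy of the package code. The `zeroWeight` weighting is left out.

```r
set.seed(123)
## Simulated matches: observed peak retention times and reference (lama)
## retention times with a smooth drift plus noise.
obs <- seq(10, 600, length.out = 60)
ref <- obs + 5 * sin(obs / 100) + rnorm(60, sd = 0.3)
ref[30] <- ref[30] + 40          # one artificial mismatch (outlier)
d <- data.frame(obs = obs, ref = ref)

outlierTolerance <- 3

## First, initial fit on all data points.
fit1 <- loess(ref ~ obs, data = d, span = 0.5)
res <- abs(residuals(fit1))

## Keep points whose absolute residual is at most outlierTolerance times
## the mean absolute residual of the initial fit.
keep <- res <= outlierTolerance * mean(res)

## Final fit without the flagged points.
fit2 <- loess(ref ~ obs, data = d[keep, ], span = 0.5)
which(!keep)
```

The final model `fit2` is the one that would then be used to predict adjusted retention times for all spectra of the file.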
Carl Brunius, Philippine Louail
## load test and reference datasets
ref <- loadXcmsData("xmse")
tst <- loadXcmsData("faahko_sub2")

## create lamas input from the reference dataset
library(MsExperiment)
f <- sampleData(ref)$sample_type
f[f == "QC"] <- NA
ref <- filterFeatures(ref, PercentMissingFilter(threshold = 0, f = f))
ref_mz_rt <- featureDefinitions(ref)[, c("mzmed","rtmed")]

## Set up the LamaParama object
param <- LamaParama(lamas = ref_mz_rt, method = "loess", span = 0.5,
                    outlierTolerance = 3, zeroWeight = 10, ppm = 20,
                    tolerance = 0, toleranceRt = 20, bs = "tp")

## input into `adjustRtime()`
tst_adjusted <- adjustRtime(tst, param = param)

## run diagnostic functions to pre-evaluate alignment
param <- matchLamasChromPeaks(tst, param = param)
mtch <- matchedRtimes(param)

## Access summary of matches and model information
summary <- summarizeLamaMatch(param)

## coverage for each file
summary$Matched_peaks / summary$Total_peaks * 100

## Access the information on the model for the first file
summary$Model_summary[[1]]
Replaces the raw retention times with the adjusted retention times or returns the object unchanged if none are present.
applyAdjustedRtime(object)
object |
An XCMSnExp or XcmsExperiment object. |
Adjusted retention times are stored in parallel to the raw retention times in XCMSnExp or XcmsExperiment objects. The applyAdjustedRtime function replaces the raw (original) retention times with the adjusted retention times.
An XCMSnExp or XcmsExperiment object with the raw (original) retention times being replaced with the adjusted retention times.
Replacing the raw retention times with adjusted retention times makes it impossible to restore the raw retention times using the dropAdjustedRtime() method. This function does not remove the retention time processing step with the settings of the alignment from the processHistory() of the object, so that the processing history is preserved.
Johannes Rainer
adjustRtime() for the function to perform the alignment (retention time correction).
adjustedRtime() for the method to extract adjusted retention times from an XCMSnExp object. dropAdjustedRtime() for the method to delete alignment results and to restore the raw retention times.
## Load a test data set with detected peaks
library(MSnbase)
data(faahko_sub)
## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

xod <- adjustRtime(faahko_sub, param = ObiwarpParam())

hasAdjustedRtime(xod)

## Replace raw retention times with adjusted retention times.
xod <- applyAdjustedRtime(xod)

## No adjusted retention times present
hasAdjustedRtime(xod)

## Raw retention times have been replaced with adjusted retention times
plot(split(rtime(faahko_sub), fromFile(faahko_sub))[[1]] -
     split(rtime(xod), fromFile(xod))[[1]], type = "l")

## And the process history still contains the settings for the alignment
processHistory(xod)
AutoLockMass
AutoLockMass: this function determines where the lock mass scans are in the xcmsRaw object, based on the scan time differences.
object |
An |
AutoLockMass
A numeric vector of scan locations corresponding to lock mass scans
signature(object = "xcmsRaw")
Paul Benton, [email protected]
## Not run:
library(xcms)
library(faahKO)
## These files do not have this problem
## to correct for but just for an example
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
xr <- xcmsRaw(cdffiles[1])
xr
## Let's assume that the lock mass starts at 1 and is every 100 scans
lockMass <- xcms:::makeacqNum(xr, freq = 100, start = 1)
## these are equivalent
lockMass2 <- AutoLockMass(xr)
all((lockMass == lockMass2) == TRUE)

ob <- stitch(xr, lockMass)

## End(Not run)
The methods listed on this page are XCMSnExp methods inherited from its parent, the OnDiskMSnExp class from the MSnbase package, that alter the raw data or are related to data subsetting. Thus calling any of these methods causes all xcms pre-processing results to be removed from the XCMSnExp object to ensure its data integrity.
bin: bins spectra. See bin documentation in the MSnbase package for more details and examples.
clean: removes unused 0 intensity data points. See clean documentation in the MSnbase package for details and examples.
filterAcquisitionNum: filters the XCMSnExp object keeping only spectra with the provided acquisition numbers. See filterAcquisitionNum for details and examples.
The normalize method performs basic normalization of spectra intensities. See normalize documentation in the MSnbase package for details and examples.
The pickPeaks method performs peak picking. See pickPeaks documentation for details and examples.
The removePeaks method removes mass peaks (intensities) lower than a threshold. Note that these peaks refer to mass peaks, which are different from the chromatographic peaks detected and analyzed in a metabolomics experiment! See removePeaks documentation for details and examples.
The smooth method smooths spectra. See smooth documentation in MSnbase for details and examples.
## S4 method for signature 'XCMSnExp'
bin(x, binSize = 1L, msLevel.)

## S4 method for signature 'XCMSnExp'
clean(object, all = FALSE, verbose = FALSE, msLevel.)

## S4 method for signature 'XCMSnExp'
filterAcquisitionNum(object, n, file)

## S4 method for signature 'XCMSnExp'
normalize(object, method = c("max", "sum"), ...)

## S4 method for signature 'XCMSnExp'
pickPeaks(
  object,
  halfWindowSize = 3L,
  method = c("MAD", "SuperSmoother"),
  SNR = 0L,
  ...
)

## S4 method for signature 'XCMSnExp'
removePeaks(object, t = "min", verbose = FALSE, msLevel.)

## S4 method for signature 'XCMSnExp'
smooth(
  x,
  method = c("SavitzkyGolay", "MovingAverage"),
  halfWindowSize = 2L,
  verbose = FALSE,
  ...
)
x |
|
binSize |
|
msLevel. |
For |
object |
|
all |
For |
verbose |
|
n |
For |
file |
For |
method |
For |
... |
Optional additional arguments. |
halfWindowSize |
For |
SNR |
For |
t |
For |
For all methods: an XCMSnExp object.
Johannes Rainer
XCMSnExp-filter for methods to filter and subset XCMSnExp objects.
XCMSnExp for base class documentation.
OnDiskMSnExp for the documentation of the parent class.
This function takes two same-sized numeric vectors x and y, bins/cuts x into bins (either a pre-defined number of equal-sized bins or bins of a pre-defined size) and aggregates values in y corresponding to x values falling within each bin. By default (i.e. method = "max") the maximal y value for the corresponding x values is identified. x is expected to be sorted in increasing order and, if not, it will be sorted internally (in which case y will also be ordered according to the order of x).
binYonX(
  x,
  y,
  breaks,
  nBins,
  binSize,
  binFromX,
  binToX,
  fromIdx = 1L,
  toIdx = length(x),
  method = "max",
  baseValue,
  sortedX = !is.unsorted(x),
  shiftByHalfBinSize = FALSE,
  returnIndex = FALSE,
  returnX = TRUE
)
x |
Numeric vector to be used for binning. |
y |
Numeric vector (same length than |
breaks |
Numeric vector defining the breaks for the bins, i.e. the lower and upper values for each bin. See examples below. |
nBins |
integer(1) defining the number of desired bins. |
binSize |
numeric(1) defining the desired bin size. |
binFromX |
Optional numeric(1) allowing to manually specify
the range of x-values to be used for binning.
This will affect only the calculation of the breaks for the bins
(i.e. if |
binToX |
Same as |
fromIdx |
Integer vector defining the start position of one or multiple
sub-sets of input vector |
toIdx |
Same as |
method |
A character string specifying the method that should be used to
aggregate values in |
baseValue |
The base value for empty bins (i.e. bins into which either
no values in |
sortedX |
Whether |
shiftByHalfBinSize |
Logical specifying whether the bins should be
shifted by half the bin size to the left. Thus, the first bin will have
its center at |
returnIndex |
Logical indicating whether the index of the max (if
|
returnX |
|
The breaks defining the boundary of each bin can be either passed directly to the function with the argument breaks, or are calculated on the data based on arguments nBins or binSize along with fromIdx, toIdx and optionally binFromX and binToX.
Arguments fromIdx and toIdx allow to specify subset(s) of the input vector x on which bins should be calculated. By default the full x vector is considered. Also, if not specified otherwise with arguments binFromX and binToX, the range of the bins within each of the sub-sets will be from x[fromIdx] to x[toIdx]. Arguments binFromX and binToX allow to overwrite this by manually defining a range on which the breaks should be calculated. See examples below for more details.
Calculation of breaks: for nBins the breaks correspond to seq(min(x[fromIdx]), max(x[toIdx]), length.out = (nBins + 1)). For binSize the breaks correspond to seq(min(x[fromIdx]), max(x[toIdx]), by = binSize) with the exception that the last break value is forced to be equal to max(x[toIdx]). This ensures that all values from the specified range are covered by the breaks defining the bins. The last bin could however in some instances be slightly larger than binSize. See breaks_on_binSize and breaks_on_nBins for more details.
Returns a list of length 2, the first element (named "x") contains the bin mid-points, the second element (named "y") the aggregated values from input vector y within each bin. For returnIndex = TRUE the list contains an additional element "index" with the index of the max or min (depending on whether method = "max" or method = "min") value within each bin in input vector x.
The function ensures that all values within the range used to define the breaks are considered in the binning (and assigned to a bin). This means that for all bins except the last one values in x have to be >= xlower and < xupper (with xlower and xupper being the lower and upper boundary, respectively). For the last bin the condition is x >= xlower & x <= xupper.
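The interval rule above (left-closed bins, with a closed upper boundary only for the last bin) can be reproduced in base R with `findInterval()`. The sketch below illustrates that rule and the default `method = "max"` aggregation on toy data. It is not the implementation used by binYonX.

```r
x <- 1:16
y <- 1:16
## Breaks for 5 bins: [2,4), [4,6), [6,8), [8,10) and [10,12].
brks <- seq(2, 12, length.out = 6)
n_bins <- length(brks) - 1

## rightmost.closed = TRUE assigns x == 12 to the last bin.
bin <- findInterval(x, brks, rightmost.closed = TRUE)

## Values below the first or above the last break belong to no bin.
inside <- bin >= 1 & bin <= n_bins

## Maximal y per bin; a bin without values would yield NA here.
binned <- tapply(y[inside],
                 factor(bin[inside], levels = seq_len(n_bins)),
                 max)

## Bin mid-points, as reported in element "x" of the result list.
mids <- (brks[-1] + brks[-length(brks)]) / 2
data.frame(x = mids, y = as.numeric(binned))
```

The value 12 ends up in the last bin although it equals the upper boundary, while 1 and 13 to 16 are ignored because they lie outside the breaks.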
Note also that if shiftByHalfBinSize is TRUE the range of values that is used for binning is expanded by binSize (i.e. the lower boundary will be fromX - binSize/2, the upper toX + binSize/2). Setting this argument to TRUE resembles the binning that is/was used in the profBin function from xcms < 1.51.
NA handling: by default the function ignores NA values in y (thus inherently assumes na.rm = TRUE). No NA values are allowed in x.
Johannes Rainer
########
## Simple example illustrating the breaks and the binning.
##
## Define breaks for 5 bins:
brks <- seq(2, 12, length.out = 6)
## The first bin is then [2,4), the second [4,6) and so on.
brks
## Get the max value falling within each bin.
binYonX(x = 1:16, y = 1:16, breaks = brks)
## Thus, the largest value in x = 1:16 falling into the bin [2,4) (i.e. being
## >= 2 and < 4) is 3, the largest one falling into [4,6) is 5 and so on.
## Note however the function ensures that the minimal and maximal break value
## (in this example 2 and 12) fall within a bin, i.e. 12 is considered for
## the last bin.

#######
## Performing the binning on a sub-set of x
##
X <- 1:16
## Bin X from element 4 to 10 into 5 bins.
X[4:10]
binYonX(X, X, nBins = 5L, fromIdx = 4, toIdx = 10)
## This defines breaks for 5 bins on the values from 4 to 10 and bins
## the values into these 5 bins. Alternatively, we could manually specify
## the range for the binning, i.e. the minimal and maximal value for the
## breaks:
binYonX(X, X, nBins = 5L, fromIdx = 4, toIdx = 10, binFromX = 1, binToX = 16)
## In this case the breaks for 5 bins were defined from a value 1 to 16 and
## the values 4 to 10 were binned based on these breaks.

#######
## Bin values within a sub-set of x, second example
##
## This example illustrates how the fromIdx and toIdx parameters can be used.
## x defines 3 times the sequence from 1 to 10, while y is the sequence from
## 1 to 30. In this very simple example x is supposed to represent M/Z values
## from 3 consecutive scans and y the intensities measured for each M/Z in
## each scan. We want to get the maximum intensities for M/Z value bins only
## for the second scan, and thus we use fromIdx = 11 and toIdx = 20. The breaks
## for the bins are defined with the nBins, binFromX and binToX.
X <- rep(1:10, 3)
Y <- 1:30
## Bin the M/Z values in the second scan into 5 bins and get the maximum
## intensity for each bin. Note that we have to specify sortedX = TRUE as
## the x and y vectors would be sorted otherwise.
binYonX(X, Y, nBins = 5L, sortedX = TRUE, fromIdx = 11, toIdx = 20)

#######
## Bin in overlapping sub-sets of X
##
## In this example we define overlapping sub-sets of X and perform the binning
## within these.
X <- 1:30
## Define the start and end indices of the sub-sets.
fIdx <- c(2, 8, 21)
tIdx <- c(10, 25, 30)
binYonX(X, nBins = 5L, fromIdx = fIdx, toIdx = tIdx)
## The same, but pre-defining also the desired range of the bins.
binYonX(X, nBins = 5L, fromIdx = fIdx, toIdx = tIdx, binFromX = 4, binToX = 28)
## The same bins are thus used for each sub-set.
The 'BlankFlag' class and method enable users to flag features of an 'XcmsExperiment' or 'SummarizedExperiment' object based on the relationship between the intensity of a feature in blanks compared to the intensity in the samples.
This class and method are part of the possible dispatch of the generic function 'filterFeatures'. Features *below* ('<') the user-input threshold will be flagged by calling the 'filterFeatures' function. This means that an extra column will be created in 'featureDefinitions' or 'rowData' called 'possible_contaminants' with a logical value for each feature.
BlankFlag(
  threshold = 2,
  blankIndex = integer(),
  qcIndex = integer(),
  na.rm = TRUE
)

## S4 method for signature 'XcmsResult,BlankFlag'
filterFeatures(object, filter, ...)

## S4 method for signature 'SummarizedExperiment,BlankFlag'
filterFeatures(object, filter, assay = 1)
threshold |
'numeric' indicates the minimum difference required between the mean abundance of a feature in samples compared to the mean abundance of the same feature in blanks for it to not be considered a possible contaminant. For example, the default threshold of 2 signifies that the mean abundance of the features in samples has to be at least twice the mean abundance in blanks for it to not be flagged as a possible contaminant. |
blankIndex |
'integer' (or 'logical') vector corresponding to the indices of blank samples. |
qcIndex |
'integer' (or 'logical') vector corresponding to the indices of quality control (QC) samples. |
na.rm |
'logical' indicates whether missing values ('NA') should be removed prior to the calculations. |
object |
|
filter |
The parameter object selecting and configuring the type of
filtering. It can be one of the following classes: |
... |
Optional parameters. For |
assay |
For filtering of |
For 'BlankFlag': a 'BlankFlag' class. 'filterFeatures' returns the input object with an added column in the features metadata called 'possible_contaminants' with a logical value for each feature. This is added to 'featureDefinitions' for 'XcmsExperiment' objects and 'rowData' for 'SummarizedExperiment' objects.
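The flagging rule can be illustrated on a plain abundance matrix. The base R sketch below assumes that the sample mean is taken over the columns given by `qcIndex` and the blank mean over those given by `blankIndex`, and that a feature is flagged when the ratio of the two means is below `threshold`. The matrix `vals` and its values are made up for illustration. This is not the package's internal code.

```r
## Toy feature abundance matrix: rows are features, columns are samples.
vals <- cbind(blank1 = c(100, 10, 50),
              blank2 = c(120, 12, 50),
              qc1 = c(150, 400, NA),
              qc2 = c(170, 420, 300))
rownames(vals) <- c("FT1", "FT2", "FT3")

threshold <- 2
blankIndex <- 1:2
qcIndex <- 3:4

## Mean abundance per feature in blanks and in QC samples, ignoring NA
## (corresponding to na.rm = TRUE).
mean_blank <- rowMeans(vals[, blankIndex, drop = FALSE], na.rm = TRUE)
mean_qc <- rowMeans(vals[, qcIndex, drop = FALSE], na.rm = TRUE)

## Features whose sample/blank ratio is below the threshold are flagged.
possible_contaminants <- (mean_qc / mean_blank) < threshold
possible_contaminants
```

Only the first feature is flagged: its mean abundance in the QC samples (160) is less than twice its mean abundance in the blanks (110).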
Philippine Louail
Other Filter features in xcms: DratioFilter, PercentMissingFilter, RsdFilter
Defines breaks for binSize sized bins for values ranging from fromX to toX.
breaks_on_binSize(fromX, toX, binSize)
fromX |
numeric(1) specifying the lowest value for the bins. |
toX |
numeric(1) specifying the largest value for the bins. |
binSize |
numeric(1) defining the size of a bin. |
This function creates breaks for bins of size binSize. The function ensures that the full data range is included in the bins, i.e. the last value (upper boundary of the last bin) is always equal to toX. This however means that the size of the last bin will not always be equal to the desired bin size.
See examples for more details and a comparison to R's seq function.
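One way to obtain breaks with this property in base R is sketched below. The helper `breaks_sketch()` is hypothetical. The choice to round the number of bins is an assumption inferred from the behaviour described in the examples of this page (a last bin that is either somewhat larger or somewhat smaller than binSize), and the actual implementation may differ.

```r
## Hypothetical re-implementation for illustration only.
breaks_sketch <- function(fromX, toX, binSize) {
    ## Assumed: the number of bins is the rounded ratio of range and size.
    n <- max(1L, round((toX - fromX) / binSize))
    brks <- fromX + (0:n) * binSize
    ## Force the last break to the upper limit of the data range.
    brks[n + 1L] <- toX
    brks
}

b1 <- breaks_sketch(1, 10, 0.13)
b2 <- breaks_sketch(1, 10, 0.51)

## Size of the last bin relative to the requested binSize.
tail(diff(b1), 1)
tail(diff(b2), 1)
```

In both cases the last break equals toX, which a plain `seq(fromX, toX, by = binSize)` does not guarantee.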
A numeric vector defining the lower and upper bounds of the bins.
Johannes Rainer
binYonX for a binning function.
Other functions to define bins: breaks_on_nBins()
## Define breaks with a size of 0.13 for a data range from 1 to 10:
breaks_on_binSize(1, 10, 0.13)
## The size of the last bin is however larger than 0.13:
diff(breaks_on_binSize(1, 10, 0.13))
## If we would use seq, the max value would not be included:
seq(1, 10, by = 0.13)

## In the next example we use binSize that leads to an additional last bin with
## a smaller binSize:
breaks_on_binSize(1, 10, 0.51)
## Again, the max value is included, but the size of the last bin is < 0.51.
diff(breaks_on_binSize(1, 10, 0.51))
## Using just seq would result in the following bin definition:
seq(1, 10, by = 0.51)
## Thus it defines one bin (break) less.
Calculate breaks for same-sized bins for data values from fromX to toX.
breaks_on_nBins(fromX, toX, nBins, shiftByHalfBinSize = FALSE)
fromX |
numeric(1) specifying the lowest value for the bins. |
toX |
numeric(1) specifying the largest value for the bins. |
nBins |
numeric(1) defining the number of bins. |
shiftByHalfBinSize |
Logical indicating whether the bins should be shifted
left by half bin size. This results in centered bins, i.e. the first bin being
centered at |
This generates bins such as a call to seq(fromX, toX, length.out = nBins + 1) would. The first and second element in the result vector thus define the lower and upper boundary for the first bin, the second and third value for the second bin and so on.
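Because nBins bins need nBins + 1 boundaries, the relation between the break vector and the bins can be written out in base R as follows. This is a small illustration with `seq()` for the default `shiftByHalfBinSize = FALSE` case, not the function's own code.

```r
fromX <- 3
toX <- 20
nBins <- 20

## nBins equal-sized bins require nBins + 1 break points.
brks <- seq(fromX, toX, length.out = nBins + 1)

## Consecutive break pairs are the lower and upper boundary of each bin.
bins <- cbind(lower = brks[-length(brks)], upper = brks[-1])
head(bins, 3)
```

All rows of `bins` have the same width, (toX - fromX) / nBins.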
A numeric vector of length nBins + 1 defining the lower and upper bounds of the bins.
Johannes Rainer
binYonX for a binning function.
Other functions to define bins: breaks_on_binSize()
## Create breaks to bin values from 3 to 20 into 20 bins
breaks_on_nBins(3, 20, nBins = 20)
## The same call but using shiftByHalfBinSize
breaks_on_nBins(3, 20, nBins = 20, shiftByHalfBinSize = TRUE)
Combines the samples and peaks from multiple xcmsSet objects into a single object. Group and retention time correction data are discarded. The profinfo list is set to be equal to the first object.
xs1 |
|
... |
|
An xcmsSet object.
c(xs1, ...)
Colin A. Smith, [email protected]
Calibrate peaks using mz values of known masses/calibrants. mz values of identified peaks are adjusted based on peaks that are close to the provided mz values. See details below for more information.
The isCalibrated function returns TRUE if chromatographic peaks of the XCMSnExp object were calibrated and FALSE otherwise.
CalibrantMassParam(
  mz = list(),
  mzabs = 1e-04,
  mzppm = 5,
  neighbors = 3,
  method = "linear"
)

isCalibrated(object)

## S4 method for signature 'XCMSnExp'
calibrate(object, param)
mz |
a |
mzabs |
|
mzppm |
|
neighbors |
|
method |
|
object |
An XCMSnExp object. |
param |
The |
The method first identifies peaks that are close to the provided mz values and, given that their difference to the calibrants is smaller than the user provided cut off (based on arguments mzabs and mzppm), their mz values are replaced with the provided mz values. The mz values of all other peaks are either globally shifted (for method = "shift") or estimated by a linear model through all calibrants.
Peaks are considered close to a calibrant mz if the difference between the calibrant and its mz is <= mzabs + mz * mzppm / 1e6.
Adjustment methods: adjustment function/factor is estimated using the difference between calibrant and peak mz values only for peaks that are close enough to the calibrants. The available methods are:
shift: shifts the m/z of each peak by a global factor which corresponds to the average difference between peak mz and calibrant mz.
linear: fits a linear model through the differences between calibrant and peak mz values and adjusts the mz values of all peaks using this.
edgeshift: performs the same adjustment as linear for peaks that are within the mz range of the calibrants and shift outside of it.
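The closeness criterion and the shift and linear adjustments can be sketched in a few lines of base R. The example uses made-up peak and calibrant m/z values, pairs each calibrant with its nearest peak (the `neighbors` parameter is ignored) and uses a wider `mzppm` than the default so that all toy peaks match. It illustrates the description above under these assumptions and is not the package's implementation.

```r
## Toy data: m/z of detected peaks and known calibrant masses.
mz <- c(100.0012, 150.0009, 200.0021, 250.0018, 300.0033)
calibrants <- c(100, 200, 300)
mzabs <- 1e-04
mzppm <- 20   # wider than the default of 5, for illustration

## Nearest peak for each calibrant (simplification, see lead-in).
idx <- vapply(calibrants, function(z) which.min(abs(mz - z)), integer(1))

## A peak is close if |peak - calibrant| <= mzabs + mz * mzppm / 1e6.
close <- abs(mz[idx] - calibrants) <= mzabs + mz[idx] * mzppm / 1e6

## Differences between matched peaks and their calibrants.
d <- mz[idx[close]] - calibrants[close]

## method = "shift": subtract the average difference from all peaks.
mz_shift <- mz - mean(d)

## method = "linear": model the difference as a linear function of m/z
## and subtract the predicted difference from all peaks.
fit <- lm(d ~ m, data = data.frame(d = d, m = mz[idx[close]]))
mz_linear <- mz - predict(fit, newdata = data.frame(m = mz))

## Remaining error at the calibrant peaks for both methods.
cbind(shift = mz_shift[idx] - calibrants,
      linear = mz_linear[idx] - calibrants)
```

In this toy case the mass error grows with m/z, so the linear model leaves a smaller residual error at the calibrant peaks than the global shift.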
For more information, details and examples refer to the xcms-direct-injection vignette.
For CalibrantMassParam: a CalibrantMassParam instance.
For calibrate: an XCMSnExp object with chromatographic peaks being calibrated. Be aware that the actual raw mz values are not (yet) calibrated, but only the identified chromatographic peaks.
The CalibrantMassParam function returns an instance of the CalibrantMassParam class with all settings and properties set.
The calibrate method returns an XCMSnExp object with the chromatographic peaks being calibrated. Note that only the detected peaks are calibrated, but not the individual mz values in each spectrum.
CalibrantMassParam classes don't have exported getter or setter methods.
Joachim Bargsten, Johannes Rainer
Calibrate peaks of a xcmsSet via a set of known masses
object |
a |
calibrants |
a vector or a list of vectors with reference m/z-values |
method |
the used calibrating-method, see below |
mzppm |
the relative error used for matching peaks in ppm (parts per million) |
mzabs |
the absolute error used for matching peaks in Da |
neighbours |
the number of neighbours from which the one with the highest intensity is used (instead of the nearest) |
plotres |
set to TRUE to produce a result plot showing the found m/z with the distances and the regression |
object |
a |
calibrants |
for each sample different calibrants can be used, if a list of m/z-vectors is given. The length of the list must be the same as the number of samples, alternatively a single vector of masses can be given which is used for all samples. |
method |
"shift" for shifting each m/z, "linear" does a linear regression and adds a linear term to each m/z. "edgeshift" does a linear regression within the range of the mz-calibrants and a shift outside. |
calibrate(object, calibrants,method="linear",
mzabs=0.0001, mzppm=5,
neighbours=3, plotres=FALSE)
chromatogram: extract chromatographic data (such as an extracted ion chromatogram, a base peak chromatogram or total ion chromatogram) from an OnDiskMSnExp or XCMSnExp object. See also the help page of the chromatogram function in the MSnbase package.
## S4 method for signature 'XCMSnExp'
chromatogram(
  object,
  rt,
  mz,
  aggregationFun = "sum",
  missing = NA_real_,
  msLevel = 1L,
  BPPARAM = bpparam(),
  adjustedRtime = hasAdjustedRtime(object),
  filled = FALSE,
  include = c("apex_within", "any", "none"),
  ...
)
object |
Either a OnDiskMSnExp or XCMSnExp object from which the chromatograms should be extracted. |
rt |
|
mz |
|
aggregationFun |
|
missing |
|
msLevel |
|
BPPARAM |
Parallelisation backend to be used, which will
depend on the architecture. Default is
|
adjustedRtime |
For |
filled |
|
include |
|
... |
optional parameters - currently ignored. |
Arguments rt
and mz
allow to specify the MS data slice (i.e. the m/z
range and retention time window) from which the chromatogram should be
extracted. These parameters can be either a numeric
of length 2 with the
lower and upper limit, or a matrix
with two columns with the lower and
upper limits to extract multiple EICs at once.
The parameter aggregationFun
specifies the function
used to aggregate the intensities across the m/z range for the same
retention time. Setting aggregationFun = "sum"
would e.g.
calculate the total ion chromatogram (TIC),
aggregationFun = "max"
the base peak chromatogram (BPC).
If no intensity is measured in a spectrum for a given retention time, an
NA
intensity value is returned by default. This can be changed with the
parameter missing
; setting missing = 0
results in a 0
intensity
being returned in these cases.
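The effect of aggregationFun and missing can be sketched in base R on toy data. The spectra, the retention times and the helper `extract_chrom()` below are hypothetical and only mimic the behaviour described above; they are not the xcms implementation.

```r
## Hypothetical toy data: three spectra (one per retention time), each a
## two-column matrix of m/z and intensity values.
spectra <- list(
    cbind(mz = c(100.1, 335.0, 335.2), intensity = c(10, 200, 50)),
    cbind(mz = c(100.1, 150.3),        intensity = c(12, 7)),
    cbind(mz = c(334.9, 335.1),        intensity = c(80, 120))
)
rtime <- c(2700, 2701.5, 2703)

## Hypothetical helper; NOT the xcms implementation.
extract_chrom <- function(spectra, mz_range, aggregationFun = "sum",
                          missing = NA_real_) {
    fun <- match.fun(aggregationFun)
    vapply(spectra, function(s) {
        keep <- s[, "mz"] >= mz_range[1] & s[, "mz"] <= mz_range[2]
        ## Aggregate all intensities within the m/z range of one spectrum;
        ## report `missing` if the spectrum has no signal in that range.
        if (any(keep)) fun(s[keep, "intensity"]) else missing
    }, numeric(1))
}

mzr <- c(334.5, 335.5)
data.frame(rtime = rtime,
           sum = extract_chrom(spectra, mzr, "sum"),
           max = extract_chrom(spectra, mzr, "max"),
           sum_missing0 = extract_chrom(spectra, mzr, "sum", missing = 0))
```

The second spectrum has no signal in the m/z range, so it yields NA by default and 0 with missing = 0.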
chromatogram
returns a XChromatograms object with
the number of columns corresponding to the number of files in
object
and number of rows the number of specified ranges (i.e.
number of rows of matrices provided with arguments mz
and/or
rt
). All chromatographic peaks with their apex position within the
m/z and retention time range are also retained as well as all feature
definitions for these peaks.
For XCMSnExp objects, if adjusted retention times are
available, the chromatogram
method will by default report
and use these (for the subsetting based on the provided parameter
rt
). This can be changed by setting adjustedRtime = FALSE
.
Johannes Rainer
XCMSnExp for the data object. Chromatogram for the object representing chromatographic data.
XChromatograms for the object allowing to arrange multiple XChromatogram objects. plot to plot an XChromatogram or MChromatograms object. as (as(x, "data.frame")) in MSnbase for a method to extract the MS data as a data.frame.
## Load a test data set with identified chromatographic peaks
library(xcms)
library(MSnbase)
data(faahko_sub)
## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

## Extract the ion chromatogram for one chromatographic peak in the data.
chrs <- chromatogram(faahko_sub, rt = c(2700, 2900), mz = 335)
chrs

## Identified chromatographic peaks
chromPeaks(chrs)

## Plot the chromatogram
plot(chrs)

## Extract chromatograms for multiple ranges.
mzr <- matrix(c(335, 335, 344, 344), ncol = 2, byrow = TRUE)
rtr <- matrix(c(2700, 2900, 2600, 2750), ncol = 2, byrow = TRUE)
chrs <- chromatogram(faahko_sub, mz = mzr, rt = rtr)
chromPeaks(chrs)
plot(chrs)

## Get access to all chromatograms for the first mz/rt range
chrs[1, ]

## Plot just that one
plot(chrs[1, , drop = FALSE])
Extract an ion chromatogram (EIC) for each chromatographic peak in an
XcmsExperiment()
object. The result is returned as an XChromatograms()
of length equal to the number of chromatographic peaks (and one column).
chromPeakChromatograms(object, ...)

## S4 method for signature 'XcmsExperiment'
chromPeakChromatograms(
  object,
  expandRt = 0,
  expandMz = 0,
  aggregationFun = "max",
  peaks = character(),
  return.type = c("XChromatograms", "MChromatograms"),
  ...,
  progressbar = TRUE
)
object |
An |
... |
currently ignored. |
expandRt |
|
expandMz |
|
aggregationFun |
|
peaks |
optional |
return.type |
|
progressbar |
|
Johannes Rainer
featureChromatograms()
to extract an EIC for each feature.
## Load a test data set with detected peaks
library(MSnbase)
library(xcms)
library(MsExperiment)
faahko_sub <- loadXcmsData("faahko_sub2")

## Get EICs for every detected chromatographic peak
chrs <- chromPeakChromatograms(faahko_sub)
chrs

## Order of EICs matches the order in chromPeaks
chromPeaks(faahko_sub) |> head()

## Variable "sample_index" provides the index of the sample the EIC was
## extracted from
fData(chrs)$sample_index

## Get the EIC for selected peaks only.
pks <- rownames(chromPeaks(faahko_sub))[c(6, 12)]
pks

## Expand the data on retention time dimension by 5 seconds (on each side)
res <- chromPeakChromatograms(faahko_sub, peaks = pks, expandRt = 5)
plot(res[1, ])
Extract (MS1 or MS2) spectra from an XcmsExperiment or XCMSnExp object
for identified chromatographic peaks. To return spectra for selected
chromatographic peaks, their peak ID (i.e., row name in the chromPeaks
matrix) can be provided with parameter peaks
.
For msLevel = 1L
(only supported for return.type = "Spectra"
or
return.type = "List"
) MS1 spectra within the retention time boundaries
(in the file in which the peak was detected) are returned. For
msLevel = 2L
MS2 spectra are returned for a chromatographic
peak if their retention time and precursor m/z are within the retention time and m/z range of
the chromatographic peak. Parameter method
defines whether all
or a single spectrum should be returned:
method = "all"
: (default): return all spectra for each chromatographic
peak.
method = "closest_rt"
: return the spectrum with the retention time
closest to the peak's retention time (at apex).
method = "closest_mz"
: return the spectrum with the precursor m/z
closest to the peak's m/z (at apex); only supported for msLevel > 1
.
method = "largest_tic"
: return the spectrum with the largest total
signal (sum of peak intensities).
method = "largest_bpi"
: return the spectrum with the largest peak
intensity (maximal peak intensity).
method = "signal"
: only for object
being a XCMSnExp
: return the
spectrum with the sum of intensities most similar to the peak's apex
signal ("maxo"
); only supported for msLevel = 2L
.
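The selection rules listed above can be sketched in base R for the candidate spectra of a single chromatographic peak. The data frame `candidates`, the `peak` vector and the helper `select_spectrum()` are hypothetical and only illustrate the criteria; the "signal" method is left out.

```r
## Hypothetical candidate MS2 spectra for one chromatographic peak.
candidates <- data.frame(
    rtime       = c(2710, 2722, 2731),
    precursorMz = c(335.01, 334.96, 335.10),
    tic         = c(5000, 12000, 9000),   # sum of peak intensities
    bpi         = c(1500, 3000, 4200)     # maximal peak intensity
)
peak <- c(rt = 2720, mz = 335.00)         # apex position of the peak

## Hypothetical helper returning the row index (or indices) of the selected
## spectra; NOT the xcms implementation.
select_spectrum <- function(x, peak, method) {
    switch(method,
           all         = seq_len(nrow(x)),
           closest_rt  = which.min(abs(x$rtime - peak[["rt"]])),
           closest_mz  = which.min(abs(x$precursorMz - peak[["mz"]])),
           largest_tic = which.max(x$tic),
           largest_bpi = which.max(x$bpi),
           stop("unsupported method: ", method))
}

methods <- c("closest_rt", "closest_mz", "largest_tic", "largest_bpi")
vapply(methods, function(m) select_spectrum(candidates, peak, m), integer(1))
```

Each criterion can pick a different spectrum for the same peak, which is why the choice of method matters for downstream annotation.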
Parameter return.type
allows to specify the type of the result object.
With return.type = "Spectra"
(the default) a Spectra object with all
matching spectra is returned. With return.type = "List"
a List
of
Spectra
is returned. The length of the list is equal to the number of rows
of chromPeaks
. Each element of the list contains thus a Spectra
with all
spectra for one chromatographic peak (or a Spectra
of length 0 if no
spectrum was found for the respective chromatographic peak).
Parameter chromPeakColumns
allows the user to add specific metadata
columns from the chromatographic peaks (chromPeaks
) to the returned
spectra object. This can be useful to keep information such as retention
time (rt
), m/z (mz
). The columns will be named as they are written in the
chromPeaks
object with the prefix "chrom_peak_"
. The peak ID
(i.e., the row name of the peak in the chromPeaks
matrix) is always added
to the spectra object as a metadata column named "chrom_peak_id"
.
See also the LC-MS/MS data analysis vignette for more details and examples.
chromPeakSpectra(object, ...)

## S4 method for signature 'XcmsExperiment'
chromPeakSpectra(
  object,
  method = c("all", "closest_rt", "closest_mz", "largest_tic", "largest_bpi"),
  msLevel = 2L,
  expandRt = 0,
  expandMz = 0,
  ppm = 0,
  skipFilled = FALSE,
  peaks = character(),
  chromPeakColumns = c("rt", "mz"),
  return.type = c("Spectra", "List"),
  BPPARAM = bpparam()
)

## S4 method for signature 'XCMSnExp'
chromPeakSpectra(
  object,
  msLevel = 2L,
  expandRt = 0,
  expandMz = 0,
  ppm = 0,
  method = c("all", "closest_rt", "closest_mz", "signal", "largest_tic", "largest_bpi"),
  skipFilled = FALSE,
  return.type = c("Spectra", "MSpectra", "List", "list"),
  peaks = character()
)
object |
XcmsExperiment or XCMSnExp object with identified chromatographic peaks for which spectra should be returned. |
... |
ignored. |
method |
|
msLevel |
|
expandRt |
|
expandMz |
|
ppm |
|
skipFilled |
|
peaks |
|
chromPeakColumns |
|
return.type |
|
BPPARAM |
parallel processing setup. Defaults to |
Parameter return.type
specifies the type of the returned object:
return.type = "Spectra"
(default): a Spectra
object (defined in the
Spectra
package). The result contains all spectra for all peaks.
Metadata column "peak_id"
provides the ID of the respective peak
(i.e. its rowname in chromPeaks()
).
return.type = "List"
: List
of length equal to the number of
chromatographic peaks is returned, each element being a Spectra
with
the spectra for one chromatographic peak.
For backward compatibility options "MSpectra"
and "list"
are also
supported but are not recommended.
return.type = "MSpectra"
(deprecated): a MSpectra object with elements being
Spectrum objects. The result object contains all spectra
for all peaks. Metadata column "peak_id"
provides the ID of the
respective peak (i.e. its rowname in chromPeaks()
).
return.type = "list"
: list
of list
s that are either of length
0 or contain Spectrum2 object(s) within the m/z-rt range. The
length of the list matches the number of peaks.
Johannes Rainer
## Read a file with DDA LC-MS/MS data
library(xcms)
library(MsExperiment)
fl <- system.file("TripleTOF-SWATH/PestMix1_DDA.mzML", package = "msdata")
dda <- readMsExperiment(fl)

## Perform MS1 peak detection
dda <- findChromPeaks(dda, CentWaveParam(peakwidth = c(5, 15),
                                         prefilter = c(5, 1000)))

## Return all MS2 spectra for each chromatographic peak as a Spectra object
ms2_sps <- chromPeakSpectra(dda)
ms2_sps

## Spectra variable *chrom_peak_id* contains the row names of the peaks in the
## chromPeaks matrix and thus allows to map chromatographic peaks to the
## returned MS2 spectra
ms2_sps$chrom_peak_id
chromPeaks(dda)

## Alternatively, return the result as a List of Spectra objects. This list
## is parallel to chromPeaks hence the mapping between chromatographic peaks
## and MS2 spectra is easier.
ms2_sps <- chromPeakSpectra(dda, return.type = "List")
names(ms2_sps)
rownames(chromPeaks(dda))
ms2_sps[[1L]]

## Parameter `msLevel` allows to define from which MS level spectra should
## be returned. By default `msLevel = 2L` but with `msLevel = 1L` all
## MS1 spectra with a retention time within the retention time range of
## a chromatographic peak can be returned. Alternatively, selected
## spectra can be returned by specifying the selection criteria/method
## with the `method` parameter. Below we extract for each chromatographic
## peak the MS1 spectrum with a retention time closest to the
## chromatographic peak's apex position. Alternatively it would also be
## possible to select the spectrum with the highest total signal or
## highest (maximal) intensity.
ms1_sps <- chromPeakSpectra(dda, msLevel = 1L, method = "closest_rt")
ms1_sps

## Parameter `peaks` allows to extract spectra for specific peaks only.
## It can be either an `integer` with the index of the peak in the
## `chromPeaks` matrix or a `character` with its rowname in `chromPeaks`.
chromPeakSpectra(dda, msLevel = 1L, method = "closest_rt", peaks = c(3, 5))
Collects peaks into xcmsFragments objects from several MS runs using xcmsSet and xcmsRaw.
object |
(empty) |
xs |
A |
compMethod |
("floor", "round", "none"): compare-method which is used to find the parent peak of a MSnpeak through comparing the MZ-values of the MS1peaks with the MSnParentPeaks. |
snthresh , mzgap , uniq
|
these are the parameters for the getspec-peakpicker included in xcmsRaw. |
After running collect(xFragments, xSet), the peak table of the xcmsFragments object includes the MS1 peaks from all experiments stored in the xcmsSet object. It further contains the relevant MSn peaks from the xcmsRaw objects, which are created temporarily from the file paths in the xcmsSet.
A matrix with columns:
peakID |
unique identifier of every peak |
MSnParentPeakID |
PeakID of the parent peak of a msLevel>1 - peak, it is 0 if the peak is msLevel 1. |
msLevel |
The msLevel of the peak. |
rt |
retention time of the peak midpoint |
mz |
the mz-Value of the peak |
intensity |
the intensity of the peak |
sample |
the number of the sample from the xcmsSet |
GroupPeakMSn |
Used for grouped xcmsSet groups |
CollisionEnergy |
The collision energy of the fragment |
collect(object, ...)
For xcms >= 3.15.3 please use compareChromatograms() instead of correlate.
Correlate intensities of two chromatograms with each other. If the two
Chromatogram
objects have different retention times they are first
aligned to match data points in the first to data points in the second
chromatogram. See help on alignRt
in MSnbase::Chromatogram()
for more
details.
If correlate
is called on a single MChromatograms()
object a pairwise
correlation of each chromatogram with each other is performed and a matrix
with the correlation coefficients is returned.
Note that the correlation of two chromatograms depends also on their order,
e.g. correlate(chr1, chr2)
might not be identical to
correlate(chr2, chr1)
. The lower and upper triangular part of the
correlation matrix might thus be different.
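Why the order of the two chromatograms matters can be seen in a simplified base R sketch. The helper `correlate_closest()` is hypothetical: it matches every data point of the first chromatogram to the data point of the second with the closest retention time and then correlates the paired intensities. The actual alignment performed by alignRt in MSnbase may apply additional rules.

```r
## Hypothetical helper; a simplified stand-in for the alignment, NOT the
## MSnbase/xcms implementation.
correlate_closest <- function(x, y, method = "pearson") {
    ## For every data point in x pick the data point in y with the closest
    ## retention time.
    idx <- vapply(x$rtime, function(rt) which.min(abs(y$rtime - rt)),
                  integer(1))
    cor(x$intensity, y$intensity[idx], use = "pairwise.complete.obs",
        method = method)
}

## Two toy chromatograms with different retention times and lengths.
chr_a <- list(rtime = 1:10,
              intensity = c(5, 29, 50, NA, 100, 12, 3, 4, 1, 3))
chr_b <- list(rtime = c(2.6, 4.1, 5.0, 6.4, 8.2),
              intensity = c(40, 95, 30, 5, 2))

r_ab <- correlate_closest(chr_a, chr_b)   # uses the data points of chr_a
r_ba <- correlate_closest(chr_b, chr_a)   # uses the data points of chr_b
c(r_ab = r_ab, r_ba = r_ba)
```

Because the first chromatogram defines which data points are paired, the two calls correlate different sets of intensity pairs and return different coefficients.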
## S4 method for signature 'Chromatogram,Chromatogram'
correlate(
  x,
  y,
  use = "pairwise.complete.obs",
  method = c("pearson", "kendall", "spearman"),
  align = c("closest", "approx"),
  ...
)

## S4 method for signature 'MChromatograms,missing'
correlate(
  x,
  y = NULL,
  use = "pairwise.complete.obs",
  method = c("pearson", "kendall", "spearman"),
  align = c("closest", "approx"),
  ...
)

## S4 method for signature 'MChromatograms,MChromatograms'
correlate(
  x,
  y = NULL,
  use = "pairwise.complete.obs",
  method = c("pearson", "kendall", "spearman"),
  align = c("closest", "approx"),
  ...
)
x |
|
y |
|
use |
|
method |
|
align |
|
... |
optional parameters passed along to the |
numeric(1)
or matrix
(if called on MChromatograms
objects)
with the correlation coefficient. If a matrix
is returned, the rows
represent the chromatograms in x
and the columns the chromatograms in
y
.
Michael Witting, Johannes Rainer
library(MSnbase)
chr1 <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
                     intensity = c(5, 29, 50, NA, 100, 12, 3, 4, 1, 3))
chr2 <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
                     intensity = c(80, 50, 20, 10, 9, 4, 3, 4, 1, 3))
chr3 <- Chromatogram(rtime = 3:9 + rnorm(7, sd = 0.3),
                     intensity = c(53, 80, 130, 15, 5, 3, 2))
chrs <- MChromatograms(list(chr1, chr2, chr3))

## Using `compareChromatograms` instead of `correlate`.
compareChromatograms(chr1, chr2)
compareChromatograms(chr2, chr1)
compareChromatograms(chrs, chrs)
Create a report showing the most significant differences between two sets of samples. Optionally create extracted ion chromatograms for the most significant differences.
object |
the |
class1 |
character vector with the first set of sample classes to be compared |
class2 |
character vector with the second set of sample classes to be compared |
filebase |
base file name to save report, |
eicmax |
number of the most significantly different analytes to create EICs for |
eicwidth |
width (in seconds) of EICs produced |
sortpval |
logical indicating whether the reports should be sorted by p-value |
classeic |
character vector with the sample classes to include in the EICs |
value |
intensity values to be used for the diffreport. |
metlin |
mass uncertainty to use for generating the link to the Metlin metabolite database. The sign of the uncertainty indicates negative or positive mode data for M+H or M-H calculation. A value of FALSE or 0 removes the column |
h |
Numeric variable for the height of the EICs and boxplots that are printed out. |
w |
Numeric variable for the width of the EICs and boxplots that are printed out. |
mzdec |
Number of decimal places of title m/z values in the eic plot. |
missing |
|
... |
optional arguments to be passed to |
This method handles creation of summary reports with statistics about which analytes were most significantly different between two sets of samples. It computes Welch's two-sample t-statistic for each analyte and ranks them by p-value. It returns a summary report that can optionally be written out to a tab-separated file.
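The ranking step can be sketched with base R. The intensity matrix, the class labels and the resulting `report` data frame below are hypothetical toy data and do not reproduce the exact columns or sign conventions of diffreport; the sketch only shows Welch's two-sample t-test (`t.test()` with `var.equal = FALSE`) applied per analyte and the sorting by p-value.

```r
## Hypothetical intensity matrix: rows are analytes, columns are samples.
set.seed(42)
intensities <- rbind(
    analyte1 = c(rnorm(4, 1000, 50), rnorm(4, 1000, 50)),  # no difference
    analyte2 = c(rnorm(4, 1000, 50), rnorm(4, 3000, 50)),  # clear difference
    analyte3 = c(rnorm(4, 2000, 50), rnorm(4, 1500, 50))   # clear difference
)
sample_class <- rep(c("KO", "WT"), each = 4)

report <- do.call(rbind, lapply(rownames(intensities), function(a) {
    x <- intensities[a, sample_class == "KO"]
    y <- intensities[a, sample_class == "WT"]
    tt <- t.test(x, y, var.equal = FALSE)   # Welch's two-sample t-test
    data.frame(name = a,
               ## fold change reported so that it is always >= 1
               fold = max(mean(x) / mean(y), mean(y) / mean(x)),
               tstat = unname(tt$statistic),
               pvalue = tt$p.value)
}))

## Rank analytes by p-value, most significant first.
report <- report[order(report$pvalue), ]
report
```

Writing such a table to a tab-separated file could then be done with `write.table(report, sep = "\t")`.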
Additionally, it does all the heavy lifting involved in creating superimposed extracted ion chromatograms for a given number of analytes. It does so by reading the raw data files associated with the samples of interest one at a time. As it does so, it prints the name of the sample it is currently reading. Depending on the number and size of the samples, this process can take a long time.
If a base file name is provided, the report (see Value section) will be saved to a tab-separated file. If EICs are generated, they will be saved by default as 640x480 PNG files in a newly created subdirectory; the image size can be changed with the h and w arguments. The numbered file names correspond to the rows in the report.
Chromatographic traces in the EICs are colored and labeled by
their sample class. Sample classes take their color from the
current palette. The color assigned to a sample class depends on
its order in the xcmsSet
object, not the order given in
the class arguments. Thus levels(sampclass(object))[1]
would use color palette()[1]
and so on. In that way, sample
classes maintain the same color across any number of different
generated reports.
When there are multiple sample classes, xcms will produce boxplots of the different classes and will generate a single ANOVA p-value statistic. As with the EICs, the plot number corresponds to the row number in the report.
A data frame with the following columns:
fold |
mean fold change (always greater than 1, see |
tstat |
Welch's two sample t-statistic, positive for analytes having
greater intensity in |
pvalue |
p-value of t-statistic |
anova |
p-value of the anova statistic if there are multiple classes |
mzmed |
median m/z of peaks in the group |
mzmin |
minimum m/z of peaks in the group |
mzmax |
maximum m/z of peaks in the group |
rtmed |
median retention time of peaks in the group |
rtmin |
minimum retention time of peaks in the group |
rtmax |
maximum retention time of peaks in the group |
npeaks |
number of peaks assigned to the group |
Sample Classes |
number of samples from each sample class represented in the group |
metlin |
A URL to metlin for that mass |
... |
one column for every sample class |
Sample Names |
integrated intensity value for every sample |
... |
one column for every sample |
diffreport(object, class1 = levels(sampclass(object))[1],
class2 = levels(sampclass(object))[2],
filebase = character(), eicmax = 0, eicwidth = 200,
sortpval = TRUE, classeic = c(class1,class2),
value=c("into","maxo","intb"), metlin = FALSE,
h=480,w=640, mzdec=2, missing =
numeric(), ...)
dirname
gets and sets the path to the directory containing the
source files of the OnDiskMSnExp (or XCMSnExp) object.
## S4 method for signature 'OnDiskMSnExp'
dirname(path)

## S4 replacement method for signature 'OnDiskMSnExp'
dirname(path) <- value
path |
|
value |
|
Johannes Rainer
The function performs retention time correction by assessing
the retention time deviation across all samples using peak groups
(features) containing chromatographic peaks present in most/all samples.
The retention time deviation for these features in each sample is
described by fitting either a polynomial (smooth = "loess"
) or
a linear (smooth = "linear"
) model to the data points. The
models are subsequently used to adjust the retention time for each
spectrum in each sample.
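The two model types can be sketched in base R for a single sample. All data below are hypothetical, and two assumptions are made that the text above does not state: the deviation is defined as the peak's retention time minus the feature's median retention time, and the adjusted retention time is the raw retention time minus the modelled deviation. The loess settings (span, degree) are chosen for the toy data and are not the package defaults.

```r
## Hypothetical data for ONE sample: retention times of the peaks of 20
## well-behaved features and their deviation from the respective feature's
## median retention time across samples.
peak_rt   <- seq(100, 2000, length.out = 20)
deviation <- 2 + 0.001 * peak_rt + sin(peak_rt / 300)

## Raw retention times of all spectra in that sample.
rtime_raw <- seq(100, 2000, by = 1)

## smooth = "loess": local polynomial fit of the deviation along the
## retention time axis.
fit_loess <- loess(deviation ~ peak_rt, span = 0.5, degree = 1,
                   family = "gaussian",
                   control = loess.control(surface = "direct"))
rtime_loess <- rtime_raw -
    predict(fit_loess, newdata = data.frame(peak_rt = rtime_raw))

## smooth = "linear": a straight line fitted to the same data points.
fit_lm <- lm(deviation ~ peak_rt)
rtime_linear <- rtime_raw -
    predict(fit_lm, newdata = data.frame(peak_rt = rtime_raw))

## Size of the applied correction (in seconds) for both models.
range(rtime_raw - rtime_loess)
range(rtime_raw - rtime_linear)
```

The linear model can only capture a constant drift, while the loess model follows the non-linear part of the deviation as well.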
do_adjustRtime_peakGroups(
  peaks,
  peakIndex,
  rtime = list(),
  minFraction = 0.9,
  extraPeaks = 1,
  smooth = c("loess", "linear"),
  span = 0.2,
  family = c("gaussian", "symmetric"),
  peakGroupsMatrix = matrix(ncol = 0, nrow = 0),
  subset = integer(),
  subsetAdjust = c("average", "previous")
)
peaks |
a |
peakIndex |
a |
rtime |
a |
minFraction |
For |
extraPeaks |
For |
smooth |
For |
span |
For |
family |
For |
peakGroupsMatrix |
optional |
subset |
For |
subsetAdjust |
For |
The alignment is based on the presence of compounds that can be found
in all/most samples of an experiment. The retention times of individual
spectra are then adjusted based on the alignment of the features
corresponding to these house keeping compounds. The parameters
minFraction
and extraPeaks
can be used to fine tune which
features should be used for the alignment (i.e. which features
most likely correspond to the above mentioned house keeping compounds).
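How minFraction and extraPeaks might select features can be sketched in base R. The rule used here is an assumption, since the parameter descriptions are not reproduced above: minFraction is taken as the minimum fraction of samples in which a feature must have at least one peak, and extraPeaks as the maximal number of additional peaks (beyond one per sample) tolerated for a feature. The matrix `peak_counts` is hypothetical.

```r
## Hypothetical input: number of chromatographic peaks per feature (rows)
## and sample (columns).
peak_counts <- rbind(
    FT1 = c(1, 1, 1, 1, 1, 1),   # one peak in every sample
    FT2 = c(1, 1, 0, 1, 1, 1),   # missing in one sample
    FT3 = c(1, 2, 1, 2, 1, 1),   # two additional peaks
    FT4 = c(1, 0, 0, 1, 0, 1)    # present in half of the samples
)
minFraction <- 0.9
extraPeaks <- 1

n_samples <- ncol(peak_counts)
present_fraction <- rowMeans(peak_counts > 0)
total_peaks <- rowSums(peak_counts)

## Assumed selection rule; see the text above the code.
use_for_alignment <- present_fraction >= minFraction &
    total_peaks <= n_samples + extraPeaks
use_for_alignment
```

Under these assumptions only FT1 would be used: FT2 and FT4 are present in too few samples and FT3 has too many additional peaks.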
Parameter subset
allows to define a subset of samples within the
experiment that should be aligned. All samples not being part of the subset
will be aligned based on the adjustment of the closest sample within the
subset. This allows to e.g. exclude blank samples from the alignment process
with their retention times being still adjusted based on the alignment
results of the real samples.
A list
with numeric
vectors with the adjusted
retention times grouped by sample.
The method ensures that returned adjusted retention times are increasingly ordered, just as the raw retention times.
Colin Smith, Johannes Rainer
Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.
This function performs peak density and wavelet based chromatographic peak detection for high resolution LC/MS data in centroid mode [Tautenhahn 2008].
do_findChromPeaks_centWave(
  mz,
  int,
  scantime,
  valsPerSpect,
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  roiList = list(),
  firstBaselineCheck = TRUE,
  roiScales = NULL,
  sleep = 0,
  extendLengthMSW = FALSE,
  verboseBetaColumns = FALSE
)
mz |
Numeric vector with the individual m/z values from all scans/spectra of one file/sample. |
int |
Numeric vector with the individual intensity values from all scans/spectra of one file/sample. |
scantime |
Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan. |
valsPerSpect |
Numeric vector with the number of values for each spectrum. |
ppm |
|
peakwidth |
|
snthresh |
|
prefilter |
|
mzCenterFun |
Name of the function to calculate the m/z center of the
chromatographic peak. Allowed are: |
integrate |
Integration method. For |
mzdiff |
|
fitgauss |
|
noise |
|
verboseColumns |
|
roiList |
An optional list of regions-of-interest (ROI) representing
detected mass traces. If ROIs are submitted the first analysis step is
omitted and chromatographic peak detection is performed on the submitted
ROIs. Each ROI is expected to have the following elements specified:
|
firstBaselineCheck |
|
roiScales |
Optional numeric vector with length equal to |
sleep |
|
extendLengthMSW |
Option to force centWave to use all scales when
running centWave rather than truncating with the EIC length. Uses the "open"
method to extend the EIC to a integer base-2 length prior to being passed to
|
verboseBetaColumns |
Option to calculate two additional metrics of peak
quality via comparison to an idealized bell curve. Adds |
This algorithm is most suitable for high resolution
LC/{TOF,OrbiTrap,FTICR}-MS data in centroid mode. In the first phase
the method identifies regions of interest (ROIs) representing
mass traces that are characterized as regions with less than ppm
m/z deviation in consecutive scans in the LC/MS map. In detail, starting
with a single m/z, a ROI is extended if a m/z can be found in the next scan
(spectrum) for which the difference to the mean m/z of the ROI is smaller
than the user defined ppm
of the m/z. The mean m/z of the ROI is then
updated considering also the newly included m/z value.
These ROIs are then, after some cleanup, analyzed using continuous wavelet
transform (CWT) to locate chromatographic peaks on different scales. The
first analysis step is skipped, if regions of interest are passed with
the roiList
parameter.
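The ROI extension rule of the first phase can be sketched in base R. The helper `trace_roi()` and the toy scans are hypothetical: the sketch follows a single mass trace from a seed m/z and stops at the first scan without a matching m/z, whereas the real implementation traces all ROIs at once and applies further cleanup that is not shown here.

```r
## Hypothetical centroided scans: each element holds the m/z values of one
## spectrum.
scans <- list(c(200.1000, 335.0000),
              c(200.1010, 335.0030),
              c(335.0060, 410.2000),
              c(200.3000, 335.0500),
              c(335.0040))

## Hypothetical helper tracing a single ROI; NOT the xcms implementation.
trace_roi <- function(scans, seed_mz, ppm) {
    roi_mz <- seed_mz
    for (i in seq_along(scans)[-1]) {
        mean_mz <- mean(roi_mz)
        delta <- abs(scans[[i]] - mean_mz)
        ## m/z values within `ppm` of the current mean m/z of the ROI
        hit <- which(delta <= ppm * mean_mz / 1e6)
        if (length(hit) == 0)
            break                      # no matching m/z: the trace ends
        ## Extend the ROI with the closest m/z; the mean is recomputed at
        ## the start of the next iteration.
        roi_mz <- c(roi_mz, scans[[i]][hit[which.min(delta[hit])]])
    }
    list(mz = mean(roi_mz), scans = length(roi_mz))
}

res25 <- trace_roi(scans, seed_mz = 335.0000, ppm = 25)
res5  <- trace_roi(scans, seed_mz = 335.0000, ppm = 5)
res25
res5
```

With ppm = 25 the trace spans the first three scans and ends at the fourth, where the closest m/z (335.05) is too far from the ROI mean; with ppm = 5 it already ends after the seed scan.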
A matrix, each row representing an identified chromatographic peak, with columns:
Intensity weighted mean of m/z values of the peak across scans.
Minimum m/z of the peak.
Maximum m/z of the peak.
Retention time of the peak's midpoint.
Minimum retention time of the peak.
Maximum retention time of the peak.
Integrated (original) intensity of the peak.
Per-peak baseline corrected integrated peak intensity.
Maximum intensity of the peak.
Signal to noise ratio, defined as (maxo - baseline)/sd, sd being the standard deviation of local chromatographic noise.
RMSE of Gaussian fit.
Additional columns for verboseColumns = TRUE
:
Gaussian parameter mu.
Gaussian parameter sigma.
Gaussian parameter h.
Region number of the m/z ROI where the peak was localized.
m/z deviation of mass trace across scans in ppm.
Scale on which the peak was localized.
Peak position found by wavelet analysis (scan number).
Left peak limit found by wavelet analysis (scan number).
Right peak limit found by wavelet analysis (scan number).
Additional columns for verboseBetaColumns = TRUE
:
Correlation between an "ideal" bell curve and the raw data
Signal-to-noise residuals calculated from the beta_cor fit
The centWave method was designed for data in centroid mode, so such data is expected as input to the function.
This function exposes core chromatographic peak detection functionality of the centWave method. While this function can be called directly, users will generally call the corresponding method for the data object instead.
Ralf Tautenhahn, Johannes Rainer
Ralf Tautenhahn, Christoph Böttcher, and Steffen Neumann "Highly sensitive feature detection for high resolution LC/MS" BMC Bioinformatics 2008, 9:504
centWave
for the standard user interface method.
Other core peak detection functions:
do_findChromPeaks_centWaveWithPredIsoROIs()
,
do_findChromPeaks_massifquant()
,
do_findChromPeaks_matchedFilter()
,
do_findPeaks_MSW()
## Load the test file
library(xcms)
faahko_sub <- loadXcmsData("faahko_sub")

## Subset to one file and restrict to a certain retention time range
data <- filterRt(filterFile(faahko_sub, 1), c(2500, 3000))

## Get m/z and intensity values
mzs <- mz(data)
ints <- intensity(data)

## Define the values per spectrum:
valsPerSpect <- lengths(mzs)

## Calling the function. We're using a large value for noise and prefilter
## to speed up the call in the example - in a real use case we would either
## set the value to a reasonable value or use the default value.
res <- do_findChromPeaks_centWave(mz = unlist(mzs), int = unlist(ints),
                                  scantime = rtime(data),
                                  valsPerSpect = valsPerSpect,
                                  noise = 10000, prefilter = c(3, 10000))
head(res)
## Load the test file faahko_sub <- loadXcmsData("faahko_sub") ## Subset to one file and restrict to a certain retention time range data <- filterRt(filterFile(faahko_sub, 1), c(2500, 3000)) ## Get m/z and intensity values mzs <- mz(data) ints <- intensity(data) ## Define the values per spectrum: valsPerSpect <- lengths(mzs) ## Calling the function. We're using a large value for noise and prefilter ## to speed up the call in the example - in a real use case we would either ## set the value to a reasonable value or use the default value. res <- do_findChromPeaks_centWave(mz = unlist(mzs), int = unlist(ints), scantime = rtime(data), valsPerSpect = valsPerSpect, noise = 10000, prefilter = c(3, 10000)) head(res)
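The core functions take all m/z and intensity values of a file as flat vectors, with valsPerSpect recording how many values belong to each spectrum. The following base R sketch, using made-up toy values rather than real data, illustrates how the flat vectors relate to the per-spectrum lists; the helper names scanIndex and mzBySpectrum are hypothetical and not part of xcms.

```r
## Toy data: three spectra with 2, 3 and 1 centroids (values are made up).
mzs <- list(c(100.1, 200.2), c(100.1, 150.0, 200.3), c(100.2))
ints <- list(c(10, 20), c(12, 5, 25), c(9))

## Flat input vectors in the layout expected by the do_ functions.
mz <- unlist(mzs)
int <- unlist(ints)
valsPerSpect <- lengths(mzs)

## Map every value back to the index of the spectrum (scan) it came from.
scanIndex <- rep(seq_along(valsPerSpect), valsPerSpect)

## Recover the per-spectrum m/z values from the flat vector.
mzBySpectrum <- split(mz, scanIndex)
```

The sum of valsPerSpect always equals the length of mz and int, which is a quick consistency check before calling any of the core functions.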
The do_findChromPeaks_centWaveWithPredIsoROIs
performs a
two-step centWave-based peak detection: chromatographic peaks are
identified using centWave followed by a prediction of the location of
the identified peaks' isotopes in the mz-retention time space. These
locations are fed as regions of interest (ROIs) to a subsequent
centWave run. All non-overlapping peaks from these two peak detection
runs are reported as the final list of identified peaks.
The do_findChromPeaks_addPredIsoROIs
performs
centWave-based peak detection in regions of interest (ROIs)
representing predicted isotopes for the peaks submitted with argument
peaks.
. The function returns a matrix with the identified peaks
consisting of all input peaks and peaks representing predicted isotopes
of these (if found by the centWave algorithm).
do_findChromPeaks_centWaveWithPredIsoROIs(
  mz, int, scantime, valsPerSpect,
  ppm = 25, peakwidth = c(20, 50), snthresh = 10,
  prefilter = c(3, 100), mzCenterFun = "wMean", integrate = 1,
  mzdiff = -0.001, fitgauss = FALSE, noise = 0,
  verboseColumns = FALSE, roiList = list(),
  firstBaselineCheck = TRUE, roiScales = NULL,
  snthreshIsoROIs = 6.25, maxCharge = 3, maxIso = 5,
  mzIntervalExtension = TRUE, polarity = "unknown",
  extendLengthMSW = FALSE, verboseBetaColumns = FALSE
)

do_findChromPeaks_addPredIsoROIs(
  mz, int, scantime, valsPerSpect,
  ppm = 25, peakwidth = c(20, 50), snthresh = 6.25,
  prefilter = c(3, 100), mzCenterFun = "wMean", integrate = 1,
  mzdiff = -0.001, fitgauss = FALSE, noise = 0,
  verboseColumns = FALSE, peaks. = NULL,
  maxCharge = 3, maxIso = 5,
  mzIntervalExtension = TRUE, polarity = "unknown"
)
mz |
Numeric vector with the individual m/z values from all scans/spectra of one file/sample. |
int |
Numeric vector with the individual intensity values from all scans/spectra of one file/sample. |
scantime |
Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan. |
valsPerSpect |
Numeric vector with the number of values for each spectrum. |
ppm |
|
peakwidth |
|
snthresh |
For |
prefilter |
|
mzCenterFun |
Name of the function to calculate the m/z center of the
chromatographic peak. Allowed are: |
integrate |
Integration method. For |
mzdiff |
|
fitgauss |
|
noise |
|
verboseColumns |
|
roiList |
An optional list of regions-of-interest (ROI) representing
detected mass traces. If ROIs are submitted the first analysis step is
omitted and chromatographic peak detection is performed on the submitted
ROIs. Each ROI is expected to have the following elements specified:
|
firstBaselineCheck |
|
roiScales |
Optional numeric vector with length equal to |
snthreshIsoROIs |
|
maxCharge |
|
maxIso |
|
mzIntervalExtension |
|
polarity |
|
extendLengthMSW |
Option to force centWave to use all scales when
running centWave rather than truncating with the EIC length. Uses the "open"
method to extend the EIC to an integer base-2 length prior to being passed to
|
verboseBetaColumns |
Option to calculate two additional metrics of peak
quality via comparison to an idealized bell curve. Adds |
peaks. |
A matrix or |
For more details on the centWave algorithm see
centWave
.
A matrix, each row representing an identified chromatographic peak. All non-overlapping peaks identified in both centWave runs are reported. The matrix columns are:
Intensity weighted mean of m/z values of the peaks across scans.
Minimum m/z of the peaks.
Maximum m/z of the peaks.
Retention time of the peak's midpoint.
Minimum retention time of the peak.
Maximum retention time of the peak.
Integrated (original) intensity of the peak.
Per-peak baseline corrected integrated peak intensity.
Maximum intensity of the peak.
Signal to noise ratio, defined as (maxo - baseline)/sd
,
sd
being the standard deviation of local chromatographic noise.
RMSE of Gaussian fit.
Additional columns for verboseColumns = TRUE
:
Gaussian parameter mu.
Gaussian parameter sigma.
Gaussian parameter h.
Region number of the m/z ROI where the peak was localized.
m/z deviation of mass trace across scans in ppm.
Scale on which the peak was localized.
Peak position found by wavelet analysis (scan number).
Left peak limit found by wavelet analysis (scan number).
Right peak limit found by wavelet analysis (scan number).
Additional columns for verboseBetaColumns = TRUE
:
Correlation between an "ideal" bell curve and the raw data
Signal-to-noise residuals calculated from the beta_cor fit
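As an illustration of the signal-to-noise definition given above, (maxo - baseline)/sd, the following base R sketch computes the ratio for a made-up mass trace. Estimating the baseline as the mean of the flanking noise region is an assumption made for this example only; centWave estimates baseline and local noise internally and may do so differently.

```r
## Hypothetical intensities of one mass trace: a peak flanked by noise.
noiseLeft <- c(102, 98, 101, 99, 100)
peak <- c(150, 400, 900, 420, 160)
noiseRight <- c(101, 97, 103, 99, 100)

## Assumption: baseline and noise are estimated from the flanking regions.
noise <- c(noiseLeft, noiseRight)
baseline <- mean(noise)
sdNoise <- sd(noise)

## Maximum intensity of the peak (column "maxo").
maxo <- max(peak)

## Signal-to-noise ratio as defined for column "sn".
sn <- (maxo - baseline) / sdNoise
sn
```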
Hendrik Treutler, Johannes Rainer
Other core peak detection functions:
do_findChromPeaks_centWave()
,
do_findChromPeaks_massifquant()
,
do_findChromPeaks_matchedFilter()
,
do_findPeaks_MSW()
Massifquant is a Kalman filter (KF)-based chromatographic peak
detection method for XC-MS data in centroid mode. The identified peaks
can be further refined with the centWave method (see
do_findChromPeaks_centWave
for details on centWave)
by specifying withWave = TRUE
.
do_findChromPeaks_massifquant( mz, int, scantime, valsPerSpect, ppm = 10, peakwidth = c(20, 50), snthresh = 10, prefilter = c(3, 100), mzCenterFun = "wMean", integrate = 1, mzdiff = -0.001, fitgauss = FALSE, noise = 0, verboseColumns = FALSE, criticalValue = 1.125, consecMissedLimit = 2, unions = 1, checkBack = 0, withWave = FALSE )
mz |
Numeric vector with the individual m/z values from all scans/spectra of one file/sample. |
int |
Numeric vector with the individual intensity values from all scans/spectra of one file/sample. |
scantime |
Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan. |
valsPerSpect |
Numeric vector with the number of values for each spectrum. |
ppm |
|
peakwidth |
|
snthresh |
|
prefilter |
|
mzCenterFun |
Name of the function to calculate the m/z center of the
chromatographic peak. Allowed are: |
integrate |
Integration method. For |
mzdiff |
|
fitgauss |
|
noise |
|
verboseColumns |
|
criticalValue |
|
consecMissedLimit |
|
unions |
|
checkBack |
|
withWave |
|
This algorithm's performance has been tested rigorously
on high resolution LC/(OrbiTrap, TOF)-MS data in centroid mode.
Simultaneous Kalman filters identify peaks and calculate their
area under the curve. The default parameters are set to operate on
a complex LC-MS Orbitrap sample. Users will find it useful to do some
simple exploratory data analysis to find out where to set a minimum
intensity, and identify how many scans an average peak spans. The
consecMissedLimit
parameter has yielded good performance on
Orbitrap data when set to (2
) and on TOF data it was found best
to be at (1
). This may change as the algorithm has yet to be
tested on many samples. The criticalValue
parameter is perhaps
most difficult to tune appropriately, and visual inspection of peak
identification is the best suggested tool for quick optimization.
The ppm
and checkBack
parameters have shown less influence
than the other parameters and exist to give users flexibility and
better accuracy.
A matrix, each row representing an identified chromatographic peak, with columns:
Intensity weighted mean of m/z values of the peaks across scans.
Minimum m/z of the peak.
Maximum m/z of the peak.
Minimum retention time of the peak.
Maximum retention time of the peak.
Retention time of the peak's midpoint.
Integrated (original) intensity of the peak.
Maximum intensity of the peak.
If withWave
is set to TRUE
, the result is the same as
returned by the do_findChromPeaks_centWave
method.
Christopher Conley
Conley CJ, Smith R, Torgrip RJ, Taylor RM, Tautenhahn R and Prince JT "Massifquant: open-source Kalman filter-based XC-MS isotope trace feature detection" Bioinformatics 2014, 30(18):2636-43.
massifquant
for the standard user interface method.
Other core peak detection functions:
do_findChromPeaks_centWave()
,
do_findChromPeaks_centWaveWithPredIsoROIs()
,
do_findChromPeaks_matchedFilter()
,
do_findPeaks_MSW()
## Load the test file
faahko_sub <- loadXcmsData("faahko_sub")

## Subset to one file and restrict to a certain retention time range
data <- filterRt(filterFile(faahko_sub, 1), c(2500, 3000))

## Get m/z and intensity values
mzs <- mz(data)
ints <- intensity(data)

## Define the values per spectrum:
valsPerSpect <- lengths(mzs)

## Perform the peak detection using massifquant - setting prefilter to
## a high value to speed up the call for the example
res <- do_findChromPeaks_massifquant(mz = unlist(mzs), int = unlist(ints),
    scantime = rtime(data), valsPerSpect = valsPerSpect,
    prefilter = c(3, 10000))
head(res)
This function identifies peaks in the chromatographic
time domain as described in [Smith 2006]. The intensity values are
binned by cutting the LC/MS data into slices (bins) of a mass unit
(binSize
m/z) wide. Within each bin the maximal intensity is
selected. The peak detection is then performed in each bin by
extending it based on the steps
parameter to generate slices
comprising bins current_bin - steps + 1
to
current_bin + steps - 1
.
Each of these slices is then filtered with matched filtration using
a second-derivative Gaussian as the model peak shape. After filtration
peaks are detected using a signal-to-noise ratio cut-off. For more details
and illustrations see [Smith 2006].
do_findChromPeaks_matchedFilter( mz, int, scantime, valsPerSpect, binSize = 0.1, impute = "none", baseValue, distance, fwhm = 30, sigma = fwhm/2.3548, max = 5, snthresh = 10, steps = 2, mzdiff = 0.8 - binSize * steps, index = FALSE, sleep = 0 )
mz |
Numeric vector with the individual m/z values from all scans/spectra of one file/sample. |
int |
Numeric vector with the individual intensity values from all scans/spectra of one file/sample. |
scantime |
Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan. |
valsPerSpect |
Numeric vector with the number of values for each spectrum. |
binSize |
|
impute |
Character string specifying the method to be used for missing
value imputation. Allowed values are |
baseValue |
The base value to which empty elements should be set. This
is only considered for |
distance |
For |
fwhm |
|
sigma |
|
max |
|
snthresh |
|
steps |
|
mzdiff |
|
index |
|
sleep |
|
The intensities are binned by the provided m/z values within each
spectrum (scan). Binning is performed such that the bins are centered
around the m/z values (i.e. the first bin includes all m/z values between
min(mz) - bin_size/2
and min(mz) + bin_size/2
).
For more details on binning and missing value imputation see
binYonX
and imputeLinInterpol
methods.
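The binning described above can be sketched in base R. This is a minimal illustration with made-up values, assuming bins of width binSize whose first bin is centered on min(mz) and that the maximal intensity is reported per bin; it does not call binYonX, and the variable names are hypothetical.

```r
binSize <- 0.1

## Made-up m/z and intensity values of a single spectrum.
mz <- c(100.00, 100.04, 100.11, 100.29, 100.31)
int <- c(10, 40, 25, 5, 30)

## Bin edges: the first bin spans min(mz) - binSize/2 to min(mz) + binSize/2.
breaks <- seq(min(mz) - binSize / 2, max(mz) + binSize, by = binSize)

## Assign each m/z value to its bin.
bin <- findInterval(mz, breaks)

## Keep the maximal intensity within each non-empty bin.
binMax <- tapply(int, bin, max)

## Centers of the non-empty bins.
binCenter <- breaks[as.integer(names(binMax))] + binSize / 2
```

Bins without any value (the third bin in this toy example) do not appear in binMax; these are the empty elements that the impute and baseValue parameters are concerned with.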
A matrix, each row representing an identified chromatographic peak, with columns:
Intensity weighted mean of m/z values of the peak across scans.
Minimum m/z of the peak.
Maximum m/z of the peak.
Retention time of the peak's midpoint.
Minimum retention time of the peak.
Maximum retention time of the peak.
Integrated (original) intensity of the peak.
Integrated intensity of the filtered peak.
Maximum intensity of the peak.
Maximum intensity of the filtered peak.
Rank of peak in merged EIC (<= max
).
Signal to noise ratio of the peak
This function exposes core peak detection functionality of
the matchedFilter method. While this function can be called
directly, users will generally call the corresponding method for the
data object instead (e.g. the findPeaks.matchedFilter
method).
Colin A Smith, Johannes Rainer
Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.
binYonX
for a binning function,
imputeLinInterpol
for the interpolation of missing values.
matchedFilter
for the standard user interface method.
Other core peak detection functions:
do_findChromPeaks_centWave()
,
do_findChromPeaks_centWaveWithPredIsoROIs()
,
do_findChromPeaks_massifquant()
,
do_findPeaks_MSW()
## Load the test file
faahko_sub <- loadXcmsData("faahko_sub")

## Subset to one file and restrict to a certain retention time range
data <- filterRt(filterFile(faahko_sub, 1), c(2500, 3000))

## Get m/z and intensity values
mzs <- mz(data)
ints <- intensity(data)

## Define the values per spectrum:
valsPerSpect <- lengths(mzs)

res <- do_findChromPeaks_matchedFilter(mz = unlist(mzs), int = unlist(ints),
    scantime = rtime(data), valsPerSpect = valsPerSpect)
head(res)
This function performs peak detection in direct injection mass spectrometry spectra using a wavelet-based algorithm.
do_findPeaks_MSW( mz, int, snthresh = 3, verboseColumns = FALSE, scantime = numeric(), valsPerSpect = integer(), ... )
mz |
Numeric vector with the individual m/z values from all scans/spectra of one file/sample. |
int |
Numeric vector with the individual intensity values from all scans/spectra of one file/sample. |
snthresh |
|
verboseColumns |
|
scantime |
ignored. |
valsPerSpect |
ignored. |
... |
Additional parameters to be passed to the
|
This is a wrapper around the peak picker in Bioconductor's
MassSpecWavelet
package calling
peakDetectionCWT
and
tuneInPeakInfo
functions. See the
xcmsDirect vignette for more information.
A matrix, each row representing an identified peak, with columns:
m/z value of the peak at the centroid position.
Minimum m/z of the peak.
Maximum m/z of the peak.
Always -1
.
Always -1
.
Always -1
.
Integrated (original) intensity of the peak.
Maximum intensity of the peak.
Always NA
.
Maximum MSW-filter response of the peak.
Signal to noise ratio.
Joachim Kutzera, Steffen Neumann, Johannes Rainer
MSW
for the standard user interface
method. peakDetectionCWT
from the
MassSpecWavelet
package.
Other core peak detection functions:
do_findChromPeaks_centWave()
,
do_findChromPeaks_centWaveWithPredIsoROIs()
,
do_findChromPeaks_massifquant()
,
do_findChromPeaks_matchedFilter()
The do_groupChromPeaks_density
function performs chromatographic peak
grouping based on the density (distribution) of peaks, found in different
samples, along the retention time axis in slices of overlapping m/z ranges.
By default (with parameter ppm = 0
) these m/z ranges have all the same
(constant) size (depending on parameter binSize
). For values of ppm
larger than 0 the m/z bins (ranges or slices) will have increasing sizes
depending on the m/z value. This better models the m/z-dependent
measurement error/precision seen on some MS instruments.
do_groupChromPeaks_density( peaks, sampleGroups, bw = 30, minFraction = 0.5, minSamples = 1, binSize = 0.25, maxFeatures = 50, sleep = 0, index = seq_len(nrow(peaks)), ppm = 0 )
peaks |
A |
sampleGroups |
For |
bw |
For |
minFraction |
For |
minSamples |
For |
binSize |
For |
maxFeatures |
For |
sleep |
|
index |
An optional |
ppm |
For |
For overlapping slices along the mz dimension, the function calculates the density distribution of identified peaks along the retention time axis and groups peaks from the same or different samples that are close to each other. See (Smith 2006) for more details.
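The density-based idea can be illustrated with a strongly simplified base R sketch for a single m/z slice. It uses made-up retention times, estimates their density with stats::density() and assigns each peak to the closest density maximum. This is not the implementation used by do_groupChromPeaks_density: it ignores sample assignment and parameters such as minFraction and minSamples, and the nearest-maximum assignment is an assumption made for illustration.

```r
## Retention times (in seconds) of peaks within one m/z slice (made up).
rt <- c(98, 100, 102, 198, 200, 203)
bw <- 30

## Density of the peaks along the retention time axis.
den <- density(rt, bw = bw, from = min(rt) - 3 * bw,
               to = max(rt) + 3 * bw, n = 512)

## Local maxima of the density curve mark candidate features.
isMax <- diff(sign(diff(den$y))) == -2
apex <- den$x[which(isMax) + 1]

## Simplification: assign each peak to the closest density maximum.
group <- vapply(rt, function(z) which.min(abs(apex - z)), integer(1))
split(rt, group)
```

With a smaller bw the density curve resolves peaks that are closer together in retention time, while a larger bw merges neighboring groups, which is why bw usually needs to be adapted to the chromatography.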
A data.frame
, each row representing a (mz-rt) feature (i.e. a peak group)
with columns:
"mzmed"
: median of the peaks' apex mz values.
"mzmin"
: smallest mz value of all peaks' apex within the feature.
"mzmax"
: largest mz value of all peaks' apex within the feature.
"rtmed"
: the median of the peaks' retention times.
"rtmin"
: the smallest retention time of the peaks in the group.
"rtmax"
: the largest retention time of the peaks in the group.
"npeaks"
: the total number of peaks assigned to the feature.
"peakidx"
: a list
with the indices of all peaks in a feature in the
peaks
input matrix.
Note that npeaks can be larger than the total number of samples, since multiple peaks from the same sample could be assigned to a feature.
The default settings might not be appropriate for all LC/GC-MS setups,
especially the bw
and binSize
parameters should be adjusted
accordingly.
Colin Smith, Johannes Rainer
Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.
Other core peak grouping algorithms:
do_groupChromPeaks_nearest()
,
do_groupPeaks_mzClust()
## Load the test file
library(xcms)
library(MsExperiment)
faahko_sub <- loadXcmsData("faahko_sub2")

## Disable parallel processing for this example
register(SerialParam())

## Extract the matrix with the identified peaks from the result object:
pks <- chromPeaks(faahko_sub)

## Perform the peak grouping with default settings:
res <- do_groupChromPeaks_density(pks, sampleGroups = rep(1, 3))

## The feature definitions:
head(res)
The do_groupChromPeaks_nearest
function groups peaks across samples by
creating a master peak list and assigning corresponding peaks from all
samples to each peak group (i.e. feature). The method is inspired by the
correspondence algorithm of mzMine (Katajamaa 2006).
do_groupChromPeaks_nearest( peaks, sampleGroups, mzVsRtBalance = 10, absMz = 0.2, absRt = 15, kNN = 10 )
peaks |
A |
sampleGroups |
For |
mzVsRtBalance |
For |
absMz |
For |
absRt |
For |
kNN |
For |
A list
with elements "featureDefinitions"
and
"peakIndex"
. "featureDefinitions"
is a matrix
, each row
representing an (mz-rt) feature (i.e. peak group) with columns:
"mzmed"
: median of the peaks' apex mz values.
"mzmin"
: smallest mz value of all peaks' apex within the feature.
"mzmax"
: largest mz value of all peaks' apex within the feature.
"rtmed"
: the median of the peaks' retention times.
"rtmin"
: the smallest retention time of the peaks in the feature.
"rtmax"
: the largest retention time of the peaks in the feature.
"npeaks"
: the total number of peaks assigned to the feature.
"peakIndex"
is a list
with the indices of all peaks in a feature in the
peaks
input matrix.
Katajamaa M, Miettinen J, Oresic M: MZmine: Toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 2006, 22:634-636.
Other core peak grouping algorithms:
do_groupChromPeaks_density()
,
do_groupPeaks_mzClust()
The do_groupPeaks_mzClust
function performs high resolution
correspondence on single spectra samples.
do_groupPeaks_mzClust( peaks, sampleGroups, ppm = 20, absMz = 0, minFraction = 0.5, minSamples = 1 )
peaks |
A |
sampleGroups |
For |
ppm |
For |
absMz |
For |
minFraction |
For |
minSamples |
For |
A list
with elements "featureDefinitions"
and
"peakIndex"
. "featureDefinitions"
is a matrix
, each row
representing an (mz-rt) feature (i.e. peak group) with columns:
"mzmed"
: median of the peaks' apex mz values.
"mzmin"
: smallest mz value of all peaks' apex within the feature.
"mzmax"
: largest mz value of all peaks' apex within the feature.
"rtmed"
: always -1
.
"rtmin"
: always -1
.
"rtmax"
: always -1
.
"npeaks"
: the total number of peaks assigned to the feature. Note that
this number can be larger than the total number of samples, since
multiple peaks from the same sample could be assigned to a group.
"peakIndex"
is a list
with the indices of all peaks in a peak group in
the peaks
input matrix.
Saira A. Kazmi, Samiran Ghosh, Dong-Guk Shin, Dennis W. Hill
and David F. Grant
Alignment of high resolution mass spectra:
development of a heuristic approach for metabolomics.
Metabolomics,
Vol. 2, No. 2, 75-83 (2006)
Other core peak grouping algorithms:
do_groupChromPeaks_density()
,
do_groupChromPeaks_nearest()
The 'DratioFilter' class and method enable users to filter features from an 'XcmsExperiment' or 'SummarizedExperiment' object based on the D-ratio or *dispersion ratio*. This is defined as the standard deviation for QC samples divided by the standard deviation for biological test samples, for each feature of the object (Broadhurst et al.).
This 'filter' is part of the possible dispatch of the generic function 'filterFeatures'. Features *above* ('>') the user-input threshold will be removed from the entire dataset.
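The D-ratio definition above can be sketched in base R on a plain feature-by-sample matrix. The matrix, the feature names and the variable names below are made up for illustration; the sketch does not use 'DratioFilter' or 'filterFeatures' and only mirrors the stated definition (standard deviation of QC samples divided by standard deviation of study samples, removing features with a ratio above the threshold).

```r
## Hypothetical abundance matrix: 3 features (rows) x 6 samples (columns).
vals <- rbind(
    FT1 = c(100, 102, 98, 50, 150, 100),
    FT2 = c(100, 140, 60, 90, 110, 100),
    FT3 = c(10, 11, NA, 5, 20, 12)
)
qcIndex <- 1:3      # columns holding QC samples
studyIndex <- 4:6   # columns holding study samples
threshold <- 0.5

## D-ratio per feature: sd of QC samples / sd of study samples.
dratio <- apply(vals, 1, function(z) {
    sd(z[qcIndex], na.rm = TRUE) / sd(z[studyIndex], na.rm = TRUE)
})

## Features with a D-ratio strictly above the threshold are removed.
keep <- dratio <= threshold
filtered <- vals[keep, , drop = FALSE]
rownames(filtered)
```

Replacing sd() with stats::mad() in the sketch corresponds to the idea behind the mad = TRUE option for data that is not normally distributed.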
DratioFilter(
  threshold = 0.5,
  qcIndex = integer(),
  studyIndex = integer(),
  na.rm = TRUE,
  mad = FALSE
)

## S4 method for signature 'XcmsResult,DratioFilter'
filterFeatures(object, filter, ...)

## S4 method for signature 'SummarizedExperiment,DratioFilter'
filterFeatures(object, filter, assay = 1)
threshold |
'numeric' value representing the threshold. Features with a D-ratio *strictly higher* ('>') than this will be removed from the entire dataset. |
qcIndex |
'integer' (or 'logical') vector corresponding to the indices of QC samples. |
studyIndex |
'integer' (or 'logical') vector corresponding to the indices of study samples. |
na.rm |
'logical' Indicates whether missing values ('NA') should be removed prior to the calculations. |
mad |
'logical' Indicates whether the *Median Absolute Deviation* (MAD) should be used instead of the standard deviation. This is suggested for non-Gaussian distributed data. |
object |
|
filter |
The parameter object selecting and configuring the type of
filtering. It can be one of the following classes: |
... |
Optional parameters. For |
assay |
For filtering of |
For 'DratioFilter': a 'DratioFilter' class. 'filterFeatures' returns the input object minus the features that did not meet the user-defined threshold.
Philippine Louail
Broadhurst D, Goodacre R, Reinke SN, Kuligowski J, Wilson ID, Lewis MR, Dunn WB. Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies. Metabolomics. 2018;14(6):72. doi: 10.1007/s11306-018-1367-3. Epub 2018 May 18. PMID: 29805336; PMCID: PMC5960010.
Other Filter features in xcms:
BlankFlag
,
PercentMissingFilter
,
RsdFilter
estimatePrecursorIntensity()
determines the precursor intensity for an MS2
spectrum based on the intensity of the respective signal from the
neighboring MS1 spectra (i.e. based on the peak with the m/z matching the
precursor m/z of the MS2 spectrum). Based on parameter method
either the
intensity of the peak from the previous MS1 scan is used
(method = "previous"
) or an interpolation between the intensity from the
previous and subsequent MS1 scan is used (method = "interpolation"
, which
considers also the retention times of the two MS1 scans and the retention
time of the MS2 spectrum).
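The two estimation modes can be illustrated with a base R sketch on made-up values. Two points are assumptions of this example and not confirmed by the text above: that the m/z match is accepted within tolerance + ppm * precursor m/z / 1e6, and that the interpolation is linear in retention time. All variable names are hypothetical.

```r
## Made-up MS1 peaks (m/z values) and the precursor m/z of an MS2 spectrum.
ms1Mz <- c(200.000, 300.001, 400.000)
precursorMz <- 300.000
ppm <- 10
tolerance <- 0

## Assumption: accepted m/z difference is tolerance + ppm of precursor m/z.
matches <- abs(ms1Mz - precursorMz) <= tolerance + ppm * precursorMz / 1e6

## Made-up intensities of the matching peak in the flanking MS1 scans.
rtPrev <- 10.0; intPrev <- 1000   # previous MS1 scan
rtNext <- 12.0; intNext <- 2000   # subsequent MS1 scan
rtMs2 <- 10.5                     # retention time of the MS2 spectrum

## method = "previous": use the intensity from the previous MS1 scan.
estPrevious <- intPrev

## method = "interpolation": assumed linear in retention time.
estInterpolated <- approx(x = c(rtPrev, rtNext), y = c(intPrev, intNext),
                          xout = rtMs2)$y
```

In this toy case the MS2 spectrum lies a quarter of the way between the two MS1 scans, so the interpolated estimate sits a quarter of the way between the two intensities.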
## S4 method for signature 'MsExperiment'
estimatePrecursorIntensity(
  object,
  ppm = 10,
  tolerance = 0,
  method = c("previous", "interpolation"),
  BPPARAM = bpparam()
)

## S4 method for signature 'OnDiskMSnExp'
estimatePrecursorIntensity(
  object,
  ppm = 10,
  tolerance = 0,
  method = c("previous", "interpolation"),
  BPPARAM = bpparam()
)
object |
|
ppm |
|
tolerance |
|
method |
|
BPPARAM |
parallel processing setup. See |
numeric
with length equal to the number of spectra in object
. NA
is
returned for MS1 spectra or if no matching peak in an MS1 scan can be
found for an MS2 spectrum.
Johannes Rainer with feedback and suggestions from Corey Broeckling
A general function for asymmetric chromatographic peaks.
etg(x, H, t1, tt, k1, kt, lambda1, lambdat, alpha, beta)
x |
times to evaluate function at |
H |
peak height |
t1 |
time of leading edge inflection point |
tt |
time of trailing edge inflection point |
k1 |
leading edge parameter |
kt |
trailing edge parameter |
lambda1 |
leading edge parameter |
lambdat |
trailing edge parameter |
alpha |
leading edge parameter |
beta |
trailing edge parameter |
The function evaluated at times x
.
Colin A. Smith
Jianwei Li. Development and Evaluation of Flexible Empirical Peak Functions for Processing Chromatographic Peaks. Anal. Chem., 69 (21), 4452-4462, 1997. http://dx.doi.org/10.1021/ac970481d
Export the feature table for further analysis in the MetaboAnalyst
software (or the MetaboAnalystR
R package).
exportMetaboAnalyst( x, file = NULL, label, value = "into", digits = NULL, groupnames = FALSE, ... )
x |
XCMSnExp object with identified chromatographic peaks grouped across samples. |
file |
|
label |
either |
value |
|
digits |
|
groupnames |
|
... |
additional parameters to be passed to the |
If file
is not specified, the function returns the matrix
in
the format supported by MetaboAnalyst.
Johannes Rainer
data.frame
containing MS data
UPDATE: the extractMsData
and plotMsData
functions are deprecated
and as(x, "data.frame")
and plot(x, type = "XIC")
(x
being an
OnDiskMSnExp
or XCMSnExp
object) should be used instead. See examples
below. Be aware, however, that filtering the raw object might drop the
adjusted retention times. In such cases it is advisable to use the
applyAdjustedRtime()
function prior to filtering.
Extract a data.frame
of retention time, mz and intensity
values from each file/sample in the provided rt-mz range (or for the full
data range if rt
and mz
are not defined).
## S4 method for signature 'OnDiskMSnExp'
extractMsData(object, rt, mz, msLevel = 1L)

## S4 method for signature 'XCMSnExp'
extractMsData(
  object,
  rt,
  mz,
  msLevel = 1L,
  adjustedRtime = hasAdjustedRtime(object)
)
object |
A |
rt |
|
mz |
|
msLevel |
|
adjustedRtime |
(for |
A list
of length equal to the number of samples/files in
object
. Each element being a data.frame
with columns
"rt"
, "mz"
and "i"
with the retention time, mz and
intensity tuples of a file. If no data is available for the mz-rt range
in a file a data.frame
with 0 rows is returned for that file.
Johannes Rainer
XCMSnExp
for the data object.
## Load a test data set with detected peaks
library(MSnbase)
data(faahko_sub)

## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

## Extract the full MS data for a certain retention time range
## as a data.frame
tmp <- filterRt(faahko_sub, rt = c(2800, 2900))
ms_all <- as(tmp, "data.frame")
head(ms_all)
nrow(ms_all)
Feature compounding aims at identifying and grouping LC-MS features
representing different ions or adducts (including isotopes) of the same
originating compound.
The MsFeatures package
provides a general framework and functionality to group features based on
different properties. The groupFeatures
methods for XcmsExperiment()
or
XCMSnExp objects implemented in xcms
extend these to enable
the compounding of LC-MS data considering also e.g. feature peak shape.
Note that these functions simply define feature groups but don't
actually aggregate or combine the features.
See MsFeatures::groupFeatures()
for an overview on the general feature
grouping concept as well as details on the individual settings and
parameters.
The available options for groupFeatures on xcms preprocessing results
(i.e. on XcmsExperiment or XCMSnExp objects after correspondence
analysis with groupChromPeaks()) are:
Grouping by similar retention times: groupFeatures-similar-rtime().
Grouping by similar feature values across samples:
AbundanceSimilarityParam().
Grouping by similar peak shape of extracted ion chromatograms:
EicSimilarityParam().
An ideal feature grouping workflow should apply the above methods sequentially (in the listed order).
Compounded feature groups can be accessed with the featureGroups
function.
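The idea of sequentially refining feature groups can be sketched in base R. The example below is a strongly simplified illustration on made-up data: the helpers `group_rt` and `refine_cor` are hypothetical and do not reproduce the grouping algorithms implemented in MsFeatures or xcms. Features are first grouped by retention time and each group is then split based on the correlation of feature abundances across samples.

```r
## Hypothetical feature retention times and abundances (features x samples).
rt <- c(FT1 = 100, FT2 = 101, FT3 = 101.5, FT4 = 150)
vals <- rbind(FT1 = c(10, 20, 30, 40),
              FT2 = c(11, 19, 33, 41),
              FT3 = c(40, 30, 20, 10),
              FT4 = c(5, 6, 5, 7))

## Step 1: features (ordered by retention time) are put into the same group
## as long as the difference to the preceding feature is at most `diffRt`.
group_rt <- function(x, diffRt = 2) {
    o <- order(x)
    grp <- integer(length(x))
    grp[o] <- cumsum(c(TRUE, diff(x[o]) > diffRt))
    grp
}

## Step 2: within each existing group keep together only features whose
## abundances correlate with those of the first feature of the group.
refine_cor <- function(vals, grp, threshold = 0.9) {
    out <- character(length(grp))
    for (g in unique(grp)) {
        idx <- which(grp == g)
        r <- cor(t(vals[idx, , drop = FALSE]))[1, ]
        out[idx] <- paste0(g, ".", ifelse(r >= threshold, 1, 2))
    }
    out
}

g1 <- group_rt(rt)
g2 <- refine_cor(vals, g1)
data.frame(feature = names(rt), rt_group = g1, final_group = g2)
```

Each step can only split groups defined by the previous one, which is why a cheap criterion (retention time) is applied before more specific ones.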
## S4 method for signature 'XcmsResult'
featureGroups(object)

## S4 replacement method for signature 'XcmsResult'
featureGroups(object) <- value
object |
an |
value |
for |
Johannes Rainer, Mar Garcia-Aloy, Vinicius Veri Hernandes
plotFeatureGroups()
for visualization of grouped features.
Extract ion chromatograms for features in an XcmsExperiment or
XCMSnExp object. The function returns for each feature the
extracted ion chromatograms (along with all associated chromatographic
peaks) in each sample. The chromatogram is extracted from the m/z - rt
region that includes all chromatographic peaks of a feature. By default,
this region is defined using the range of the chromatographic peaks' m/z
and retention times (with mzmin = min, mzmax = max, rtmin = min and
rtmax = max). For some features, and depending on the data, the m/z and
rt range can thus be relatively large. The boundaries of the m/z - rt
region can also be restricted by changing parameters mzmin, mzmax, rtmin
and rtmax to a different function, such as median.
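The effect of these summary functions can be illustrated with a base R sketch that derives the m/z - rt region of one feature from a small, made-up matrix of chromatographic peak boundaries. The matrix `pks` and the helper `feature_region` are hypothetical and only mimic the idea; they are not part of xcms.

```r
## Hypothetical peak boundaries of one feature in three samples.
pks <- cbind(
    mzmin = c(200.10, 200.11, 200.09),
    mzmax = c(200.13, 200.14, 200.16),
    rtmin = c(100, 102, 95),
    rtmax = c(110, 113, 120))

## Summarize each boundary column with the supplied function.
feature_region <- function(x, mzmin = min, mzmax = max,
                           rtmin = min, rtmax = max) {
    c(mzmin = mzmin(x[, "mzmin"]), mzmax = mzmax(x[, "mzmax"]),
      rtmin = rtmin(x[, "rtmin"]), rtmax = rtmax(x[, "rtmax"]))
}

## Default: the region spans all peaks of the feature.
region_full <- feature_region(pks)

## Restricted region based on the median boundaries.
region_median <- feature_region(pks, mzmin = median, mzmax = median,
                                rtmin = median, rtmax = median)
rbind(region_full, region_median)
```

With the median, a single unusually wide peak (here the third one) no longer determines the extracted region.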
By default only chromatographic peaks associated with a feature are
included in the returned XChromatograms object. If object is an
XCMSnExp, parameter include additionally allows returning all
chromatographic peaks with their apex position within the selected
region (include = "apex_within") or any chromatographic peak overlapping
the m/z and retention time range (include = "any").
featureChromatograms(object, ...)

## S4 method for signature 'XcmsExperiment'
featureChromatograms(
  object,
  expandRt = 0,
  expandMz = 0,
  aggregationFun = "max",
  features = character(),
  return.type = "XChromatograms",
  chunkSize = 2L,
  mzmin = min,
  mzmax = max,
  rtmin = min,
  rtmax = max,
  ...,
  progressbar = TRUE,
  BPPARAM = bpparam()
)

## S4 method for signature 'XCMSnExp'
featureChromatograms(
  object,
  expandRt = 0,
  aggregationFun = "max",
  features,
  include = c("feature_only", "apex_within", "any", "all"),
  filled = FALSE,
  n = length(fileNames(object)),
  value = c("maxo", "into"),
  expandMz = 0,
  ...
)
object |
|
... |
optional arguments to be passed along to the |
expandRt |
|
expandMz |
|
aggregationFun |
|
features |
|
return.type |
|
chunkSize |
For |
mzmin |
|
mzmax |
|
rtmin |
|
rtmax |
|
progressbar |
|
BPPARAM |
For |
include |
Only for |
filled |
Only for |
n |
Only for |
value |
Only for |
XChromatograms()
object. In future, depending on parameter
return.type
, the data might be returned as a different object.
The EIC data of a feature is extracted from every sample using the same
m/z - rt area. The EIC in a sample does thus not exactly represent the
signal of the actually identified chromatographic peak in that sample.
The chromPeakChromatograms() function allows extracting the actual EIC
of the chromatographic peak in a specific sample. See also examples
below.
Parameters include, filled, n and value are only supported for object
being an XCMSnExp.
When extracting EICs from only the top n samples it can happen that one
or more of the features specified with features are dropped because they
have no detected peak in the top n samples. The chance for this to happen
is smaller if object also contains filled-in peaks (added with
fillChromPeaks).
Johannes Rainer
filterColumnsKeepTop()
to filter the extracted EICs keeping only
the top n columns (samples) with the highest intensity.
chromPeakChromatograms()
for a function to extract an EIC for each
chromatographic peak.
## Load a test data set with detected peaks
library(xcms)
library(MsExperiment)
faahko_sub <- loadXcmsData("faahko_sub2")

## Disable parallel processing for this example
register(SerialParam())

## Perform correspondence analysis
xdata <- groupChromPeaks(faahko_sub,
    param = PeakDensityParam(minFraction = 0.8, sampleGroups = rep(1, 3)))

## Get the feature definitions
featureDefinitions(xdata)

## Extract ion chromatograms for the first 3 features. Parameter
## `features` can be either the feature IDs or feature indices.
chrs <- featureChromatograms(xdata,
    features = rownames(featureDefinitions(xdata))[1:3])

## Plot the EIC for the first feature using different colors for each file.
plot(chrs[1, ], col = c("red", "green", "blue"))

## The EICs for all 3 samples use the same m/z and retention time range,
## which was defined using the `featureArea` function:
featureArea(xdata, features = rownames(featureDefinitions(xdata))[1:3],
    mzmin = min, mzmax = max, rtmin = min, rtmax = max)

## To extract the actual (exact) EICs for each chromatographic peak of
## a feature in each sample, the `chromPeakChromatograms` function would
## need to be used instead. Below we extract the EICs for all
## chromatographic peaks of the first feature. We need to first get the
## IDs of all chromatographic peaks assigned to the first feature:
peak_ids <- rownames(chromPeaks(xdata))[featureDefinitions(xdata)$peakidx[[1L]]]

## We can now pass these to the `chromPeakChromatograms` function with
## parameter `peaks`:
eic_1 <- chromPeakChromatograms(xdata, peaks = peak_ids)

## To plot these into a single plot we need to use the
## `plotChromatogramsOverlay` function:
plotChromatogramsOverlay(eic_1)
This function returns spectra associated with the identified features in
the input object. By default, spectra are returned for all features (from
all MS levels), but parameter features allows to specify/select features
for which the result should be returned.
Parameter msLevel allows to define whether MS level 1 or 2 spectra
should be returned. For msLevel = 1L all MS1 spectra within the
retention time range of each chromatographic peak (in that respective
data file) associated with a feature are returned. Note that for samples
in which no peak was identified (or even filled-in) no spectra are
returned. For msLevel = 2L all MS2 spectra with a retention time within
the retention time range and their precursor m/z within the m/z range of
any chromatographic peak of a feature are returned.
See also chromPeakSpectra() (used internally to extract spectra for
each chromatographic peak of a feature) for additional information,
specifically also on parameter method. By default (method = "all")
all spectra associated with any of the chromatographic peaks of a
feature are returned. With any other option for method, a single
spectrum per chromatographic peak will be returned (hence multiple
spectra per feature).
The information from featureDefinitions for each feature can be included
in the returned Spectra() object using the featureColumns parameter.
This is useful for keeping details such as the median retention time
(rtmed) or median m/z (mzmed). The columns will retain their names as
specified in the featureDefinitions object, prefixed by "feature_"
(e.g., "feature_mzmed"). Additionally, the feature ID (i.e., the row
name of the feature in the featureDefinitions data.frame) is always added
as a metadata column named "feature_id".
See also chromPeakSpectra(), as it supports a similar parameter for
including columns from the chromatographic peaks in the returned spectra
object. These parameters can be used in combination to include
information from both the chromatographic peaks and the features in the
returned Spectra(). The peak ID (i.e., the row name of the peak in the
chromPeaks matrix) is added as a metadata column named "chrom_peak_id".
featureSpectra(object, ...)

## S4 method for signature 'XcmsExperiment'
featureSpectra(
  object,
  msLevel = 2L,
  expandRt = 0,
  expandMz = 0,
  ppm = 0,
  skipFilled = FALSE,
  return.type = c("Spectra", "List"),
  features = character(),
  featureColumns = c("rtmed", "mzmed"),
  ...
)

## S4 method for signature 'XCMSnExp'
featureSpectra(
  object,
  msLevel = 2L,
  expandRt = 0,
  expandMz = 0,
  ppm = 0,
  skipFilled = FALSE,
  return.type = c("MSpectra", "Spectra", "list", "List"),
  features = character(),
  ...
)
object |
XcmsExperiment or XCMSnExp object with feature definitions. |
... |
additional arguments to be passed along to |
msLevel |
|
expandRt |
|
expandMz |
|
ppm |
|
skipFilled |
|
return.type |
|
features |
|
featureColumns |
|
The function returns either a Spectra() (for return.type = "Spectra")
or a List of Spectra (for return.type = "List"). For the latter,
the order and the length match parameter features (or, if no features
is defined, the order of the features in featureDefinitions(object)).
Spectra variables "chrom_peak_id" and "feature_id" define with which
chromatographic peak or feature each individual spectrum is associated.
Johannes Rainer
Simple function to calculate feature summaries. These include counts and percentages of samples in which a chromatographic peak is present for each feature and counts and percentages of samples in which more than one chromatographic peak was annotated to the feature. Also relative standard deviations (RSD) are calculated for the integrated peak areas per feature across samples. For 'perSampleCounts = TRUE' also the individual chromatographic peak counts per sample are returned.
featureSummary(
  x,
  group,
  perSampleCounts = FALSE,
  method = "maxint",
  skipFilled = TRUE
)
x |
[XcmsExperiment()] or [XCMSnExp()] object with correspondence results. |
group |
'numeric', 'logical', 'character' or 'factor' with a length equal to the number of samples in 'x', used to aggregate counts by the groups defined in 'group'. |
perSampleCounts |
'logical(1)' whether feature wise individual peak counts per sample should be returned too. |
method |
'character' passed to the [featureValues()] function. See respective help page for more information. |
skipFilled |
'logical(1)' whether filled-in peaks should be excluded (default) or included in the summary calculation. |
'matrix' with one row per feature and columns:
- '"count"': the total number of samples in which a peak was found.
- '"perc"': the percentage of samples in which a peak was found.
- '"multi_count"': the total number of samples in which more than one peak was assigned to the feature.
- '"multi_perc"': the percentage of those samples in which a peak was found, that have also multiple peaks annotated to the feature. Example: for a feature, at least one peak was detected in 50 samples. In 5 of them 2 peaks were assigned to the feature. '"multi_perc"' is in this case 10.
- '"rsd"': relative standard deviation (coefficient of variation) of the integrated peak area of the feature's peaks.
- The same columns are repeated for each unique element (level) in 'group' if 'group' was provided.
If 'perSampleCounts = TRUE' also one column for each sample is returned with the peak counts per sample.
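The relation between these columns can be reproduced with a few lines of base R. The sketch below uses made-up peak counts and peak areas for a single feature and a hypothetical helper `summarize_feature`; it is not the xcms implementation and assumes that the RSD is reported as the ratio of standard deviation to mean (not multiplied by 100).

```r
## Summary values for one feature from the number of peaks per sample
## (`n_peaks`) and the integrated peak areas (`areas`, NA if no peak).
summarize_feature <- function(n_peaks, areas) {
    count <- sum(n_peaks > 0)
    multi_count <- sum(n_peaks > 1)
    c(count = count,
      perc = count / length(n_peaks) * 100,
      multi_count = multi_count,
      multi_perc = multi_count / count * 100,
      rsd = sd(areas, na.rm = TRUE) / mean(areas, na.rm = TRUE))
}

## 100 samples: 45 with one peak, 5 with two peaks and 50 without a peak.
n_peaks <- c(rep(1, 45), rep(2, 5), rep(0, 50))
## Hypothetical peak areas for the samples with a detected peak.
areas <- ifelse(n_peaks > 0, 1000, NA_real_)
areas[1:2] <- c(900, 1100)

smry <- summarize_feature(n_peaks, areas)
smry
```

As in the example given above, 5 of the 50 samples with a peak have multiple peaks, so '"multi_perc"' is 10 while '"perc"' is 50.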
Johannes Rainer
Gap filling integrates signal in the m/z-rt area of a feature (i.e., a
chromatographic peak group) for samples in which no chromatographic
peak for this feature was identified and adds it to the chromPeaks()
matrix. Such filled-in peaks are indicated with a TRUE in column
"is_filled" in the result object's chromPeakData() data frame.
The method for gap filling along with its settings can be defined with
the param argument. Two different approaches are available:
param = FillChromPeaksParam(): the default of the original xcms
code. Signal is integrated from the m/z and retention time range as
defined in the featureDefinitions() data frame, i.e. from the
"rtmin", "rtmax", "mzmin" and "mzmax". This method is not
suggested as it underestimates the actual peak area and it is also
not available for object being an XcmsExperiment object. See
details below for more information and settings for this method.
param = ChromPeakAreaParam(): the area from which the signal for a
feature is integrated is defined based on the feature's chromatographic
peak areas. The m/z range is by default defined as the lower quartile
of the chromatographic peaks' "mzmin" values to the upper quartile of the
chromatographic peaks' "mzmax" values. The retention time range for the
area is defined analogously. Alternatively, by setting mzmin = median,
mzmax = median, rtmin = median and rtmax = median in
ChromPeakAreaParam, the median "mzmin", "mzmax", "rtmin" and
"rtmax" values from all detected chromatographic peaks of a feature
would be used instead.
In contrast to the FillChromPeaksParam approach this method uses (all)
identified chromatographic peaks of a feature to define the area
from which the signal should be integrated.
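The quartile-based definition of the integration area can be sketched in base R. The data frame `pk` below contains made-up boundaries of the detected peaks of one feature; the two helper functions are the defaults of ChromPeakAreaParam as listed in the usage section, everything else is hypothetical.

```r
## Hypothetical boundaries of the detected peaks of one feature.
pk <- data.frame(
    mzmin = c(300.10, 300.12, 300.11, 300.09, 300.13),
    mzmax = c(300.20, 300.22, 300.21, 300.19, 300.25),
    rtmin = c(50, 52, 51, 48, 55),
    rtmax = c(60, 63, 61, 59, 70))

## Default summary functions: lower and upper quartile.
lower <- function(z) quantile(z, probs = 0.25, names = FALSE)
upper <- function(z) quantile(z, probs = 0.75, names = FALSE)

## Area defined from the quartiles of the peak boundaries.
area_default <- c(mzmin = lower(pk$mzmin), mzmax = upper(pk$mzmax),
                  rtmin = lower(pk$rtmin), rtmax = upper(pk$rtmax))

## Alternative: area defined from the median of each boundary.
area_median <- vapply(pk, median, numeric(1))
rbind(area_default, area_median)
```

Using quartiles makes the area less sensitive to single unusually wide peaks (such as the last one, ending at 70) than using the full range of the peaks.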
expandMz, expandMz<-: getter and setter for the expandMz slot of the
object.
expandRt, expandRt<-: getter and setter for the expandRt slot of the
object.
ppm, ppm<-: getter and setter for the ppm slot of the object.
fillChromPeaks(object, param, ...)

## S4 method for signature 'XcmsExperiment,ChromPeakAreaParam'
fillChromPeaks(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  BPPARAM = bpparam()
)

FillChromPeaksParam(
  expandMz = 0,
  expandRt = 0,
  ppm = 0,
  fixedMz = 0,
  fixedRt = 0
)

fixedRt(object)

fixedMz(object)

ChromPeakAreaParam(
  mzmin = function(z) quantile(z, probs = 0.25, names = FALSE),
  mzmax = function(z) quantile(z, probs = 0.75, names = FALSE),
  rtmin = function(z) quantile(z, probs = 0.25, names = FALSE),
  rtmax = function(z) quantile(z, probs = 0.75, names = FALSE)
)

## S4 method for signature 'FillChromPeaksParam'
expandMz(object)

## S4 replacement method for signature 'FillChromPeaksParam'
expandMz(object) <- value

## S4 method for signature 'FillChromPeaksParam'
expandRt(object)

## S4 replacement method for signature 'FillChromPeaksParam'
expandRt(object) <- value

## S4 method for signature 'FillChromPeaksParam'
ppm(object)

## S4 replacement method for signature 'FillChromPeaksParam'
ppm(object) <- value

## S4 method for signature 'XCMSnExp,FillChromPeaksParam'
fillChromPeaks(object, param, msLevel = 1L, BPPARAM = bpparam())

## S4 method for signature 'XCMSnExp,ChromPeakAreaParam'
fillChromPeaks(object, param, msLevel = 1L, BPPARAM = bpparam())

## S4 method for signature 'XCMSnExp,missing'
fillChromPeaks(object, param, BPPARAM = bpparam(), msLevel = 1L)
object |
|
param |
|
... |
currently ignored. |
msLevel |
|
chunkSize |
For |
BPPARAM |
Parallel processing settings. |
expandMz |
for |
expandRt |
for |
ppm |
for |
fixedMz |
for |
fixedRt |
for |
mzmin |
|
mzmax |
|
rtmin |
|
rtmax |
|
value |
The value for the slot. |
After correspondence (i.e. grouping of chromatographic peaks across
samples) there will always be features (peak groups) that do not include
peaks from every sample. The fillChromPeaks method defines
intensity values for such features in the missing samples by integrating
the signal in the m/z-rt region of the feature. Two different approaches
to define this region are available: with ChromPeakAreaParam the region
is defined based on the detected chromatographic peaks of a feature,
while with FillChromPeaksParam the region is defined based on the m/z and
retention times of the feature (which represent the m/z and retention
times of the apex position of the associated chromatographic peaks). For the
latter approach various parameters are available to increase the area from
which signal is to be integrated, either by a constant value (fixedMz and
fixedRt) or by a feature-relative amount (expandMz and expandRt).
Adjusted retention times will be used if available.
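The difference between a constant and a feature-relative expansion can be sketched in base R. The helper `expand_range` is hypothetical and rests on an assumption about how the parameters act: the relative factor is taken to grow the range by `expand` times its width (half of it on each side), and the constant is taken to be subtracted from the lower and added to the upper boundary. The parameter documentation of FillChromPeaksParam is authoritative for the actual behavior.

```r
## Expand a range [lower, upper] either relative to its width (`expand`)
## or by a constant amount on both sides (`fixed`). Assumed semantics,
## not the xcms implementation.
expand_range <- function(lower, upper, expand = 0, fixed = 0) {
    half <- (upper - lower) * expand / 2
    c(lower = lower - half - fixed, upper = upper + half + fixed)
}

## Retention time range of a hypothetical feature: 100 to 110 seconds.
rel <- expand_range(100, 110, expand = 1)  # doubles the width
fix <- expand_range(100, 110, fixed = 2)   # adds 2 seconds on each side
rbind(rel, fix)
```

A relative expansion scales with the width of each feature, while a constant expansion widens narrow and broad features by the same amount.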
Based on the peak finding algorithm that was used to identify the
(chromatographic) peaks, different internal functions are used to
guarantee that the integrated peak signal matches as much as possible
the peak signal integration used during the peak detection. For peaks
identified with the matchedFilter() method, signal integration is
performed on the profile matrix generated with the same settings used
also during peak finding (using the same bin size for example). For
direct injection data and peaks identified with the MSW algorithm,
signal is integrated only along the mz dimension. For all other methods
the complete (raw) signal within the area is used.
An XcmsExperiment or XCMSnExp object with previously missing
chromatographic peaks for features filled into its chromPeaks() matrix.
The FillChromPeaksParam function returns a FillChromPeaksParam object.
expandMz,expandRt,ppm,fixedMz,fixedRt
See corresponding parameter above.
rtmin,rtmax,mzmin,mzmax
See corresponding parameter above.
The reported "mzmin", "mzmax", "rtmin" and "rtmax" for the filled peaks
represent the actual MS area from which the signal was integrated.
No peak is filled in if no signal was present in a file/sample
in the respective mz-rt area. These samples will still show an NA
in the matrix returned by the featureValues() method.
Johannes Rainer
groupChromPeaks()
for methods to perform the correspondence.
featureArea for the function to define the m/z-retention time region for each feature.
## Load a test data set with identified chromatographic peaks
library(xcms)
library(MsExperiment)
res <- loadXcmsData("faahko_sub2")

## Disable parallel processing for this example
register(SerialParam())

## Perform the correspondence. We assign all samples to the same group.
res <- groupChromPeaks(res,
    param = PeakDensityParam(sampleGroups = rep(1, length(res))))

## For how many features do we lack an integrated peak signal?
sum(is.na(featureValues(res)))

## Filling missing peak data using the peak area from identified
## chromatographic peaks.
res <- fillChromPeaks(res, param = ChromPeakAreaParam())

## How many missing values do we have after peak filling?
sum(is.na(featureValues(res)))

## Get the peaks that have been filled in:
fp <- chromPeaks(res)[chromPeakData(res)$is_filled, ]
head(fp)

## Get the process history step along with the parameters used to perform
## the peak filling:
ph <- processHistory(res, type = "Missing peak filling")[[1]]
ph

## The parameter class:
ph@param

## It is also possible to remove filled-in peaks:
res <- dropFilledChromPeaks(res)

sum(is.na(featureValues(res)))
For each sample, identify peak groups where that sample is not represented. For each of those peak groups, integrate the signal in the region of that peak group and create a new peak.
object |
the |
method |
the filling method |
After peak grouping, there will always be peak groups that do not include peaks from every sample. This method produces intensity values for those missing samples by integrating raw data in the peak group region. Depending on the type of raw data there are 2 different methods available: for filling GC-MS/LC-MS data the method "chrom" integrates raw data in the chromatographic domain, whereas "MSW" is used for peak lists without retention time information, like those from direct-infusion spectra.
An xcmsSet object with filled-in peak groups.
fillPeaks(object, method="")
For each sample, identify peak groups where that sample is not represented. For each of those peak groups, integrate the signal in the region of that peak group and create a new peak.
object |
the |
nSlaves |
(DEPRECATED): number of slaves/cores to be used for
parallel peak filling.
MPI is used if installed, otherwise the snow package is employed for
multicore support. If none of the two packages is available it uses
the parallel package for parallel processing on multiple CPUs of the
current machine. Users are advised to use the |
expand.mz |
Expansion factor for the m/z range used for integration. |
expand.rt |
Expansion factor for the retention time range used for integration. |
BPPARAM |
allows to define a specific parallel processing setup
for the current task (see |
After peak grouping, there will always be peak groups that do not include peaks from every sample. This method produces intensity values for those missing samples by integrating raw data in the peak group region. In a given group, the start and ending retention time points for integration are defined by the median start and end points of the other detected peaks. The start and end m/z values are similarly determined. Intensities can still be zero, which is a rather unusual intensity for a peak. This is the case if e.g. the raw data was thresholded and the integration area contains no actual raw intensities, or if one sample is miscalibrated such that the raw data points are (just) outside the integration area.
Importantly, if retention time correction data is available, the alignment information is used to more precisely integrate the proper region of the raw data. If the corrected retention time is beyond the end of the raw data, the value will be not-a-number (NaN).
An xcmsSet object with filled-in peak groups (into and maxo).
fillPeaks.chrom(object, nSlaves=0,expand.mz=1,expand.rt=1,
BPPARAM = bpparam())
xcmsSet-class
,
getPeaks
fillPeaks
For each sample, identify peak groups where that sample is not represented. For each of those peak groups, integrate the signal in the region of that peak group and create a new peak.
object |
the |
After peak grouping, there will always be peak groups that do not include peaks from every sample. This method produces intensity values for those missing samples by integrating raw data in peak group region. In a given group, the start and ending m/z values for integration are defined by the median start and end points of the other detected peaks.
An xcmsSet object with filled-in peak groups.
fillPeaks.MSW(object)
In contrast to the fillPeaks.chrom method the maximum intensity reported
in column "maxo" is not the maximum intensity measured in the expected
peak area (defined by columns "mzmin" and "mzmax"), but the largest
intensity of mz value(s) closest to the "mzmed" of the feature.
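The difference between the two definitions of "maxo" can be illustrated with a small base R sketch on a made-up spectrum. All values and variable names below are hypothetical, and the code only mimics the described behavior; it is not the xcms implementation.

```r
## Hypothetical m/z and intensity values of a direct-infusion spectrum.
mz <- c(500.00, 500.02, 500.05, 500.08)
int <- c(10, 40, 90, 20)

## Expected peak area and median m/z of the feature (made-up values).
mzmin <- 500.00
mzmax <- 500.08
mzmed <- 500.02

in_area <- mz >= mzmin & mz <= mzmax

## Maximum intensity within the expected peak area.
max_in_area <- max(int[in_area])

## Largest intensity of the m/z value(s) closest to "mzmed".
d <- abs(mz[in_area] - mzmed)
maxo_closest <- max(int[in_area][d == min(d)])

c(max_in_area = max_in_area, maxo_closest = maxo_closest)
```

In this example the most intense signal in the area (90) is not the one closest to "mzmed", hence the two values differ.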
xcmsSet-class
,
getPeaks
fillPeaks
These functions allow to filter (subset) MChromatograms()
or
XChromatograms()
objects, i.e. sets of chromatographic data, without
changing the data (intensity and retention times) within the individual
chromatograms (Chromatogram()
objects).
filterColumnsIntensityAbove: subsets a MChromatograms object keeping
only columns (samples) for which value is larger than the provided
threshold in which rows (i.e. if which = "any" a column is kept if any
of the chromatograms in that column has a value larger than threshold,
or with which = "all" all chromatograms in that column have to fulfill
this criterion). Parameter value allows to define on which value the
comparison should be performed: with value = "bpi" the maximum intensity
of each chromatogram is compared to threshold, with value = "tic" the
total sum of intensities of each chromatogram is compared to threshold.
For XChromatograms objects, value = "maxo" and value = "into" are also
supported, which compare the largest intensity of all identified
chromatographic peaks in the chromatogram, or their integrated peak
area, respectively, with threshold.
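The "any" and "all" logic can be sketched in base R on a plain matrix of per-chromatogram values. The matrix `bpi` and the helper `keep_columns` are hypothetical stand-ins for a MChromatograms object and the filter function; they only illustrate the column selection.

```r
## Hypothetical base peak intensities: rows are EICs, columns are samples.
bpi <- matrix(c(100, 80, 100, 130, 80, 130), nrow = 2,
              dimnames = list(c("EIC1", "EIC2"), c("S1", "S2", "S3")))

## Return for each column whether any (or all) of its values are larger
## than `threshold`.
keep_columns <- function(x, threshold = 0, which = c("any", "all")) {
    which <- match.arg(which)
    test <- if (which == "any") any else all
    apply(x > threshold, 2, test)
}

keep_any <- keep_columns(bpi, threshold = 90)
keep_all <- keep_columns(bpi, threshold = 90, which = "all")

## Subset the matrix to the columns passing the stricter criterion.
bpi[, keep_all, drop = FALSE]
```

With a threshold of 90 every column has at least one value above the threshold, but only in column S2 do all values exceed it.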
filterColumnsKeepTop: subsets a MChromatograms object keeping the top
n columns sorted by the value specified with sortBy. In detail, for
each column the value defined by sortBy is extracted from each
chromatogram and aggregated using the aggregationFun. Thus, by default,
for each chromatogram the maximum intensity is determined
(sortBy = "bpi") and these values are summed up for chromatograms in the
same column (aggregationFun = sum). The columns are then sorted by these
values and the top n columns are retained in the returned
MChromatograms. Similar to the filterColumnsIntensityAbove function,
for XChromatograms objects this function also allows sorting the columns
by column "maxo" (sortBy = "maxo") or "into" (sortBy = "into") of the
chromPeaks matrix.
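The aggregation and ranking of columns can be sketched in base R on a plain matrix. The matrix `bpi` and the helper `keep_top` are hypothetical; rounding n up to the next integer follows the note in the examples of this help page, while returning the selected columns in their original order is an assumption of this sketch.

```r
## Hypothetical base peak intensities: rows are EICs, columns are samples.
bpi <- matrix(c(100, 80, 100, 130, 80, 130), nrow = 2,
              dimnames = list(c("EIC1", "EIC2"), c("S1", "S2", "S3")))

## Keep the `n` columns with the highest aggregated value.
keep_top <- function(x, n = 1L, aggregationFun = sum) {
    n <- min(ceiling(n), ncol(x))
    score <- apply(x, 2, aggregationFun)
    idx <- order(score, decreasing = TRUE)[seq_len(n)]
    x[, sort(idx), drop = FALSE]
}

top1 <- keep_top(bpi)
## 50 percent of 3 columns is 1.5, which is rounded up to 2 columns.
top2 <- keep_top(bpi, n = 0.5 * ncol(bpi))
colnames(top2)
```

The aggregated values per column are 180, 230 and 210, so S2 is ranked first and S3 second.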
## S4 method for signature 'MChromatograms'
filterColumnsIntensityAbove(
  object,
  threshold = 0,
  value = c("bpi", "tic"),
  which = c("any", "all")
)

## S4 method for signature 'MChromatograms'
filterColumnsKeepTop(
  object,
  n = 1L,
  sortBy = c("bpi", "tic"),
  aggregationFun = sum
)

## S4 method for signature 'XChromatograms'
filterColumnsIntensityAbove(
  object,
  threshold = 0,
  value = c("bpi", "tic", "maxo", "into"),
  which = c("any", "all")
)

## S4 method for signature 'XChromatograms'
filterColumnsKeepTop(
  object,
  n = 1L,
  sortBy = c("bpi", "tic", "maxo", "into"),
  aggregationFun = sum
)
object |
|
threshold |
for |
value |
|
which |
for |
n |
for |
sortBy |
for |
aggregationFun |
for |
a filtered MChromatograms (or XChromatograms) object with the
same number of rows (EICs) but possibly a lower number of columns
(samples).
Johannes Rainer
library(xcms)
library(MSnbase)

chr1 <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
    intensity = c(5, 29, 50, NA, 100, 12, 3, 4, 1, 3))
chr2 <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
    intensity = c(80, 50, 20, 10, 9, 4, 3, 4, 1, 3))
chr3 <- Chromatogram(rtime = 3:9 + rnorm(7, sd = 0.3),
    intensity = c(53, 80, 130, 15, 5, 3, 2))

chrs <- MChromatograms(list(chr1, chr2, chr1, chr3, chr2, chr3),
    ncol = 3, byrow = FALSE)
chrs

#### filterColumnsIntensityAbove
##
## Keep all columns for which the maximum intensity of any of its
## chromatograms is larger than 90
filterColumnsIntensityAbove(chrs, threshold = 90)

## Require that ALL chromatograms in a column have a value larger than 90
filterColumnsIntensityAbove(chrs, threshold = 90, which = "all")

## If none of the columns fulfills the criteria no columns are returned
filterColumnsIntensityAbove(chrs, threshold = 900)

## Filtering XChromatograms allows in addition to filter on the columns
## "maxo" or "into" of the identified chromatographic peaks within each
## chromatogram.

#### filterColumnsKeepTop
##
## Keep the column with the highest sum of maximal intensities in its
## chromatograms
filterColumnsKeepTop(chrs, n = 1)

## Keep the 50 percent of columns with the highest total sum of signal. Note
## that n will be rounded to the next larger integer value
filterColumnsKeepTop(chrs, n = 0.5 * ncol(chrs), sortBy = "tic")
xcms Result Object
The XcmsExperiment is a data container for xcms preprocessing results
(i.e. results from chromatographic peak detection, alignment and
correspondence analysis).
It provides the same functionality as the XCMSnExp object, but uses the
more advanced and modern MS infrastructure provided by the MsExperiment
and Spectra Bioconductor packages. This brings greater flexibility in
how and where the data is stored.
Documentation of the various functions for XcmsExperiment objects is
grouped by topic and provided in the sections below.
The default xcms workflow is to:
1. perform chromatographic peak detection using findChromPeaks().
2. optionally refine identified chromatographic peaks using refineChromPeaks().
3. perform an alignment (retention time adjustment) using adjustRtime(). Depending on the method used, this requires running a correspondence analysis first.
4. perform a correspondence analysis using the groupChromPeaks() function to group chromatographic peaks across samples and thus define the LC-MS features.
5. optionally perform gap filling to rescue signal in samples in which no chromatographic peak was identified and for which a missing value would hence be reported. This can be performed using the fillChromPeaks() function.
filterFeatureDefinitions(object, ...)

## S4 method for signature 'MsExperiment'
filterRt(object, rt = numeric(), ...)

## S4 method for signature 'MsExperiment'
filterMzRange(object, mz = numeric(), msLevel. = uniqueMsLevels(object))

## S4 method for signature 'MsExperiment'
filterMz(object, mz = numeric(), msLevel. = uniqueMsLevels(object))

## S4 method for signature 'MsExperiment'
filterMsLevel(object, msLevel. = uniqueMsLevels(object))

## S4 method for signature 'MsExperiment'
uniqueMsLevels(object)

## S4 method for signature 'MsExperiment'
filterFile(object, file = integer(), ...)

## S4 method for signature 'MsExperiment'
rtime(object)

## S4 method for signature 'MsExperiment'
fromFile(object)

## S4 method for signature 'MsExperiment'
fileNames(object)

## S4 method for signature 'MsExperiment'
polarity(object)

## S4 method for signature 'MsExperiment'
filterIsolationWindow(object, mz = numeric())

## S4 method for signature 'MsExperiment'
chromatogram(
  object,
  rt = matrix(nrow = 0, ncol = 2),
  mz = matrix(nrow = 0, ncol = 2),
  aggregationFun = "sum",
  msLevel = 1L,
  isolationWindowTargetMz = NULL,
  chunkSize = 2L,
  return.type = "MChromatograms",
  BPPARAM = bpparam()
)

featureArea(
  object,
  mzmin = min,
  mzmax = max,
  rtmin = min,
  rtmax = max,
  features = character()
)

## S4 method for signature 'MsExperiment,missing'
plot(x, y, msLevel = 1L, peakCol = "#ff000060", ...)

## S4 method for signature 'XcmsExperiment,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'XcmsExperiment'
filterIsolationWindow(object, mz = numeric())

## S4 method for signature 'XcmsExperiment'
filterRt(object, rt, msLevel.)

## S4 method for signature 'XcmsExperiment'
filterMzRange(object, mz = numeric(), msLevel. = uniqueMsLevels(object))

## S4 method for signature 'XcmsExperiment'
filterMsLevel(object, msLevel. = uniqueMsLevels(object))

## S4 method for signature 'XcmsExperiment'
hasChromPeaks(object, msLevel = integer())

## S4 method for signature 'XcmsExperiment'
dropChromPeaks(object, keepAdjustedRtime = FALSE)

## S4 replacement method for signature 'XcmsExperiment'
chromPeaks(object) <- value

## S4 method for signature 'XcmsExperiment'
chromPeaks(
  object,
  rt = numeric(),
  mz = numeric(),
  ppm = 0,
  msLevel = integer(),
  type = c("any", "within", "apex_within"),
  isFilledColumn = FALSE
)

## S4 replacement method for signature 'XcmsExperiment'
chromPeakData(object) <- value

## S4 method for signature 'XcmsExperiment'
chromPeakData(
  object,
  msLevel = integer(),
  return.type = c("DataFrame", "data.frame")
)

## S4 method for signature 'XcmsExperiment'
filterChromPeaks(
  object,
  keep = rep(TRUE, nrow(.chromPeaks(object))),
  method = "keep",
  ...
)

## S4 method for signature 'XcmsExperiment'
dropAdjustedRtime(object)

## S4 method for signature 'MsExperiment'
hasAdjustedRtime(object)

## S4 method for signature 'XcmsExperiment'
rtime(object, adjusted = hasAdjustedRtime(object))

## S4 method for signature 'XcmsExperiment'
adjustedRtime(object)

## S4 method for signature 'XcmsExperiment'
hasFeatures(object, msLevel = integer())

## S4 replacement method for signature 'XcmsExperiment'
featureDefinitions(object) <- value

## S4 method for signature 'XcmsExperiment'
featureDefinitions(
  object,
  mz = numeric(),
  rt = numeric(),
  ppm = 0,
  type = c("any", "within", "apex_within"),
  msLevel = integer()
)

## S4 method for signature 'XcmsExperiment'
dropFeatureDefinitions(object, keepAdjustedRtime = FALSE)

## S4 method for signature 'XcmsExperiment'
filterFeatureDefinitions(object, features = integer())

## S4 method for signature 'XcmsExperiment'
hasFilledChromPeaks(object)

## S4 method for signature 'XcmsExperiment'
dropFilledChromPeaks(object)

## S4 method for signature 'XcmsExperiment'
quantify(object, ...)

## S4 method for signature 'XcmsExperiment'
featureValues(
  object,
  method = c("medret", "maxint", "sum"),
  value = "into",
  intensity = "into",
  filled = TRUE,
  missing = NA_real_,
  msLevel = integer()
)

## S4 method for signature 'XcmsExperiment'
chromatogram(
  object,
  rt = matrix(nrow = 0, ncol = 2),
  mz = matrix(nrow = 0, ncol = 2),
  aggregationFun = "sum",
  msLevel = 1L,
  chunkSize = 2L,
  isolationWindowTargetMz = NULL,
  return.type = c("XChromatograms", "MChromatograms"),
  include = character(),
  chromPeaks = c("apex_within", "any", "none"),
  BPPARAM = bpparam()
)

## S4 method for signature 'XcmsExperiment'
processHistory(object, type)

## S4 method for signature 'XcmsExperiment'
filterFile(
  object,
  file,
  keepAdjustedRtime = hasAdjustedRtime(object),
  keepFeatures = FALSE,
  ...
)
object |
An |
... |
Additional optional parameters. For |
rt |
For |
mz |
For |
msLevel. |
For |
file |
For |
aggregationFun |
For |
msLevel |
|
isolationWindowTargetMz |
For |
chunkSize |
For |
return.type |
For |
BPPARAM |
For |
mzmin |
For |
mzmax |
For |
rtmin |
For |
rtmax |
For |
features |
For |
x |
An |
y |
For |
peakCol |
For |
i |
For |
j |
For |
drop |
For |
keepAdjustedRtime |
|
value |
For |
ppm |
For |
type |
For |
isFilledColumn |
For |
keep |
For |
method |
For |
adjusted |
For |
intensity |
For |
filled |
For |
missing |
For |
include |
For |
chromPeaks |
For |
keepFeatures |
for most subsetting functions ( |
[: subset an XcmsExperiment by sample (parameter i). Subsetting
will by default drop correspondence results (as subsetting by samples will
obviously affect the feature definition) and alignment results (adjusted
retention times) while identified chromatographic peaks (for the selected
samples) will be retained. Which preprocessing results should be
kept or dropped can also be configured with optional parameters
keepChromPeaks (by default TRUE), keepAdjustedRtime (by default FALSE)
and keepFeatures (by default FALSE).
filterChromPeaks: filter chromatographic peaks of an XcmsExperiment
keeping only those specified with parameter keep. Returns the
XcmsExperiment with the filtered data. Chromatographic peaks to
retain can be specified either by providing their index in the
chromPeaks matrix, their ID (rowname in chromPeaks) or with a
logical vector with the same length as the number of rows of
chromPeaks. After filtering, the assignment of chromatographic peaks to
feature definitions (if present) is updated.
filterFeatureDefinitions: filter feature definitions of an
XcmsExperiment keeping only those defined with parameter features,
which can be a logical of length equal to the number of features,
an integer with the index of the features in
featureDefinitions(object) to keep or a character with the feature
IDs (i.e. row names in featureDefinitions(object)).
filterFile: filter an XcmsExperiment (or MsExperiment) by file
(sample). The index of the samples to which the data should be subset
can be specified with parameter file. The sole purpose of this function
is to provide backward compatibility with the MSnbase package. Wherever
possible, the [ function should be used instead for any sample-based
subsetting. Parameters keepChromPeaks, keepAdjustedRtime and
keepFeatures can be passed through the ... argument.
Note also that in contrast to [, filterFile does not support subsetting
in arbitrary order.
filterIsolationWindow: filter the spectra within an MsExperiment
or XcmsExperiment object keeping only those with an isolation window
containing the specified m/z (i.e., keeping spectra with an
"isolationWindowLowerMz" smaller than the user-provided mz and an
"isolationWindowUpperMz" larger than mz). For an XcmsExperiment also
all chromatographic peaks (and subsequently also features) are removed for
which the range of their "isolationWindowLowerMz" and
"isolationWindowUpperMz" (columns in chromPeakData) do not contain
the user-provided mz.
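The isolation window condition described for filterIsolationWindow can be written down directly. This is a minimal base R sketch of that condition on a hypothetical table of spectra; the data frame `spectra_tbl` and its values are made up for illustration and the real function operates on the spectra of the object instead.

```r
## Hypothetical spectra annotations: one row per MS2 spectrum with the lower
## and upper m/z boundary of its isolation window.
spectra_tbl <- data.frame(
  spectrum_id = c("s1", "s2", "s3", "s4"),
  isolationWindowLowerMz = c(200, 225, 250, 275),
  isolationWindowUpperMz = c(225, 250, 275, 300)
)

## User-provided m/z of interest.
mz <- 244.05

## Keep spectra whose isolation window contains mz: lower boundary smaller
## than mz and upper boundary larger than mz.
keep <- spectra_tbl$isolationWindowLowerMz < mz &
  spectra_tbl$isolationWindowUpperMz > mz
kept <- spectra_tbl$spectrum_id[keep]
kept
```

The same test, applied to the "isolationWindowLowerMz" and "isolationWindowUpperMz" columns of chromPeakData, describes which chromatographic peaks would be retained for an XcmsExperiment.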
filterMsLevel: filter the data of the XcmsExperiment or MsExperiment
to keep only data of the MS level(s) specified with parameter msLevel..
filterMz, filterMzRange: filter the spectra within an
XcmsExperiment or MsExperiment to the specified m/z range (parameter
mz). For XcmsExperiment also identified chromatographic peaks and
features are filtered keeping only those that are within the specified
m/z range (i.e. for which the m/z of the peak apex is within the m/z
range). Parameter msLevel. allows restricting the filtering to
only specified MS levels. By default data from all MS levels are
filtered.
filterRt: filter an XcmsExperiment keeping only data within the
specified retention time range (parameter rt). This function will keep
all preprocessing results present within the retention time range: all
identified chromatographic peaks with the retention time of the apex
position within the retention time range rt are retained along with the
associated features, if present.
Parameter msLevel. is currently ignored, i.e. filtering will always
be performed on all MS levels of the object.
chromatogram: extract chromatographic data from a data set. Parameters
mz and rt allow defining specific m/z - retention time regions to
extract the data from (e.g. for extracted ion chromatograms, EICs).
Both parameters are expected to be numerical two-column matrices with
the first column defining the lower and the second the upper margin.
Each row can define a separate m/z - retention time region. Currently
the function returns a MChromatograms() object for object being a
MsExperiment or, for object being an XcmsExperiment, either a
MChromatograms or XChromatograms() depending on parameter
return.type (can be either "MChromatograms" or "XChromatograms").
For the latter also chromatographic peaks detected within the provided
m/z and retention times are returned. Parameter chromPeaks allows
specifying which chromatographic peaks should be reported. See
documentation on the chromPeaks parameter for more information.
If the XcmsExperiment contains correspondence results, also the
associated feature definitions will be included in the returned
XChromatograms. By default the function returns chromatograms from MS1
data, but by setting parameter msLevel = 2L it is possible to e.g.
extract also MS2 chromatograms. By default, with parameter
isolationWindowTargetMz = NULL or isolationWindowTargetMz = NA_real_,
data from all MS2 spectra will be considered in the chromatogram
extraction. If MS2 data was generated within different m/z isolation
windows (such as e.g. with Sciex SWATH data), the parameter
isolationWindowTargetMz should be used to ensure signal is only extracted
from the respective isolation window. The isolationWindowTargetMz()
function on the Spectra object can be used to inspect/list available
isolation windows of a data set. See also the xcms LC-MS/MS vignette for
examples and details.
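The two-column matrix layout expected for mz and rt can be sketched in base R as follows. The region boundaries are hypothetical values chosen only for illustration, and the column names are optional labels rather than a requirement stated in this page.

```r
## Two hypothetical m/z - retention time regions, one per row. The first
## column holds the lower and the second column the upper boundary.
mz_regions <- rbind(c(335.0, 335.2),
                    c(344.0, 344.2))
rt_regions <- rbind(c(2700, 2900),
                    c(2600, 2750))
colnames(mz_regions) <- c("mzmin", "mzmax")
colnames(rt_regions) <- c("rtmin", "rtmax")

## Sanity checks before passing the matrices on: two columns each, the same
## number of regions in both, and lower boundaries not above upper ones.
stopifnot(ncol(mz_regions) == 2L, ncol(rt_regions) == 2L,
          nrow(mz_regions) == nrow(rt_regions),
          all(mz_regions[, 1] <= mz_regions[, 2]),
          all(rt_regions[, 1] <= rt_regions[, 2]))

## With xcms loaded and `xmse` being an XcmsExperiment (as in the examples
## of this page), the matrices would be passed as:
## chrs <- chromatogram(xmse, mz = mz_regions, rt = rt_regions)
```

Row i of `mz_regions` pairs with row i of `rt_regions`, so the result would hold one row of chromatograms per region and one column per sample.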
chromPeaks: returns a numeric matrix with the identified
chromatographic peaks. Each row represents a chromatographic peak
identified in one sample (file). The number of columns depends on the
peak detection algorithm (see findChromPeaks()) but most methods return
the following columns: "mz" (intensity-weighted mean of the m/z values
of all mass peaks included in the chromatographic peak), "mzmin"
(smallest m/z value of any mass peak in the chromatographic peak), "mzmax"
(largest m/z value of any mass peak in the chromatographic peak), "rt"
(retention time of the peak apex), "rtmin" (retention time of the first
scan/mass peak of the chromatographic peak), "rtmax" (retention time of
the last scan/mass peak of the chromatographic peak), "into" (integrated
intensity of the chromatographic peak), "maxo" (maximal intensity of any
mass peak of the chromatographic peak), "sample" (index of the sample
in object in which the peak was identified). Parameters rt, mz,
ppm, msLevel and type allow extracting subsets of identified
chromatographic peaks from the object. See the parameter descriptions
for details.
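How parameter type changes a range query can be illustrated on a toy peak matrix. This base R sketch assumes the usual reading of the three values ("any": the peak overlaps the range at least partially, "within": the peak lies completely inside the range, "apex_within": the peak apex lies inside the range); the matrix `peaks` and its values are hypothetical and the sketch is not the package implementation.

```r
## Hypothetical chromatographic peaks with apex ("rt") and retention time
## boundaries ("rtmin", "rtmax"), named like rows of the chromPeaks matrix.
peaks <- cbind(rt    = c(2990, 3100, 3290, 3400),
               rtmin = c(2960, 3080, 3270, 3380),
               rtmax = c(3010, 3120, 3320, 3420))
rownames(peaks) <- c("CP1", "CP2", "CP3", "CP4")

rt_range <- c(3000, 3300)

## type = "any": peak range overlaps the requested range.
sel_any <- peaks[, "rtmax"] >= rt_range[1] & peaks[, "rtmin"] <= rt_range[2]

## type = "within": peak range lies completely inside the requested range.
sel_within <- peaks[, "rtmin"] >= rt_range[1] &
  peaks[, "rtmax"] <= rt_range[2]

## type = "apex_within": only the apex position has to be inside the range.
sel_apex <- peaks[, "rt"] >= rt_range[1] & peaks[, "rt"] <= rt_range[2]

ids_any <- rownames(peaks)[sel_any]
ids_within <- rownames(peaks)[sel_within]
ids_apex <- rownames(peaks)[sel_apex]
```

In this toy data "within" is the strictest selection and "any" the most permissive one, with "apex_within" in between.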
chromPeakData: returns a DataFrame with potential additional
annotations for the identified chromatographic peaks. Each row in this
DataFrame corresponds to a row (same index and row name) in the
chromPeaks matrix. The default annotations are "ms_level" (the MS
level in which the peak was identified) and "is_filled" (whether the
chromatographic peak was detected (by findChromPeaks) or filled-in
(by fillChromPeaks)).
chromPeakSpectra: extract MS spectra for identified chromatographic
peaks. This can be either all (full scan) MS1 spectra with retention
times within the retention time range of a chromatographic peak, all
MS2 spectra (if present) with a retention time within the retention
time range of a (MS1) chromatographic peak and a precursor m/z within
the m/z range of the chromatographic peak or single, selected spectra
depending on their total signal or highest signal. Parameter msLevel
allows defining from which MS level spectra should be extracted,
parameter method allows defining whether all or selected spectra should
be returned. See chromPeakSpectra() for details.
dropChromPeaks: removes (all) chromatographic peak detection results
from object. This will also remove any correspondence results (i.e.
features) and adjusted retention times, if present, from the object
if the alignment was performed after the peak detection.
Alignment results (adjusted retention times) can be retained if parameter
keepAdjustedRtime is set to TRUE.
dropFilledChromPeaks: removes chromatographic peaks added by gap filling
with fillChromPeaks.
fillChromPeaks: perform gap filling to integrate signal for missing
values in samples in which no chromatographic peak was found. This
depends on correspondence results, hence groupChromPeaks needs to be
called first. For details and options see fillChromPeaks().
findChromPeaks: perform chromatographic peak detection. See
findChromPeaks() for details.
hasChromPeaks: whether the object contains peak detection results.
Parameter msLevel allows checking whether peak detection results are
available for the specified MS level(s).
hasFilledChromPeaks: whether gap-filling results (i.e., filled-in
chromatographic peaks) are present.
manualChromPeaks: manually add chromatographic peaks by defining
their m/z and retention time ranges. See manualChromPeaks() for
details and examples.
plotChromPeakImage: show the density of identified chromatographic
peaks per file along the retention time. See plotChromPeakImage() for
details.
plotChromPeaks: indicate identified chromatographic peaks from one
sample in the RT-m/z space. See plotChromPeaks() for details.
plotPrecursorIons: general visualization of precursor ions of
LC-MS/MS data. See plotPrecursorIons() for details.
refineChromPeaks: refines identified chromatographic peaks in object.
See refineChromPeaks() for details.
adjustedRtime: extract adjusted retention times. This is just an
alias for rtime(object, adjusted = TRUE).
adjustRtime: performs retention time adjustment (alignment) of the data.
See adjustRtime() for details.
applyAdjustedRtime: replaces the original (raw) retention times with the
adjusted ones. See applyAdjustedRtime() for more information.
dropAdjustedRtime: drops alignment results (adjusted retention time) from
the result object. This also reverts the retention times of identified
chromatographic peaks if present in the result object. Note that any
results from a correspondence analysis (i.e. feature definitions) will be
dropped too (if the correspondence analysis was performed after the
alignment). This can be overruled with keepAdjustedRtime = TRUE.
hasAdjustedRtime: whether alignment was performed on the object (i.e.,
the object contains alignment results).
plotAdjustedRtime: plot the alignment results; see plotAdjustedRtime()
for more information.
dropFeatureDefinitions: removes any correspondence analysis results from
object as well as any filled-in chromatographic peaks. By default
(with parameter keepAdjustedRtime = FALSE) also all alignment results
will be removed if alignment was performed after the correspondence
analysis. This can be overruled with keepAdjustedRtime = TRUE.
featureArea: returns a matrix with columns "mzmin", "mzmax",
"rtmin" and "rtmax" with the m/z and retention time range for each
feature (row) in object. By default these represent the minimal m/z
and retention times as well as maximal m/z and retention times for
all chromatographic peaks assigned to that feature. Parameter
features allows extracting these values for selected features only.
Parameters mzmin, mzmax, rtmin and rtmax allow defining
the function used to calculate the reported "mzmin", "mzmax", "rtmin"
and "rtmax" values.
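The per-feature aggregation described for featureArea can be sketched in base R. The helper `toy_feature_area` and the data frame `peaks` are hypothetical and only mimic the idea that the parameters mzmin, mzmax, rtmin and rtmax are functions applied to the respective columns of all chromatographic peaks assigned to a feature; this is not the package implementation.

```r
## Hypothetical chromatographic peaks with the feature they are assigned to.
peaks <- data.frame(
  feature = c("FT1", "FT1", "FT1", "FT2", "FT2"),
  mzmin = c(300.10, 300.11, 300.09, 410.20, 410.21),
  mzmax = c(300.14, 300.15, 300.13, 410.24, 410.26),
  rtmin = c(100, 102, 98, 250, 252),
  rtmax = c(110, 113, 109, 262, 260)
)

## Toy analogue of featureArea: apply one summary function per column to
## the peaks of each feature. Defaults mirror the documented ones.
toy_feature_area <- function(peaks, mzmin = min, mzmax = max,
                             rtmin = min, rtmax = max) {
  by_feature <- split(peaks, peaks$feature)
  t(vapply(by_feature, function(p) {
    c(mzmin = mzmin(p$mzmin), mzmax = mzmax(p$mzmax),
      rtmin = rtmin(p$rtmin), rtmax = rtmax(p$rtmax))
  }, c(mzmin = 0, mzmax = 0, rtmin = 0, rtmax = 0)))
}

## Full extent of each feature (default functions).
area_default <- toy_feature_area(peaks)

## A narrower retention time window based on the median boundaries.
area_median <- toy_feature_area(peaks, rtmin = median, rtmax = median)

area_default
area_median
```

Replacing min and max with a function such as median yields a region that is less affected by single unusually broad peaks.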
featureChromatograms: extract ion chromatograms (EICs) for each
feature in object. See featureChromatograms() for more details.
featureDefinitions: returns a data.frame with feature definitions or
an empty data.frame if no correspondence analysis results are present.
Parameters msLevel, mz, ppm and rt allow defining subsets of
feature definitions that should be returned with the parameter type
defining how these parameters should be used to subset the returned
data.frame. See parameter descriptions for details.
featureSpectra: returns a Spectra() or List of Spectra with
(MS1 or MS2) spectra associated to each feature. See featureSpectra()
for more details and available parameters.
featureSummary: calculate a simple summary on features. See
featureSummary() for details.
groupChromPeaks: performs the correspondence analysis (i.e., grouping
of chromatographic peaks into LC-MS features). See groupChromPeaks()
for details.
hasFeatures: whether correspondence analysis results are present in
object. The optional parameter msLevel allows defining the MS
level(s) for which it should be determined if feature definitions are
available.
overlappingFeatures: identify features that are overlapping or close in
m/z - rt dimension. See overlappingFeatures() for more information.
XcmsExperiment
Preprocessing results can be extracted using the following functions:
chromPeaks: extract identified chromatographic peaks. See section on
chromatographic peak detection for details.
featureDefinitions: extract the definition of features (chromatographic
peaks grouped across samples). See section on correspondence analysis for
details.
featureValues: extract a matrix of values for features from each
sample (file). Rows are features, columns samples. Which value should be
returned can be defined with parameter value, which can be any column of
the chromPeaks matrix. By default (value = "into") the integrated
chromatographic peak intensities are returned. With parameter msLevel it
is possible to extract values for features from certain MS levels.
During correspondence analysis, more than one chromatographic peak per
sample can be assigned to the same feature (e.g. if they are very close in
retention time). Parameter method allows defining the strategy to deal
with such cases:
method = "medret": report the value from the chromatographic peak with the apex position closest to the feature's median retention time.
method = "maxint": report the value from the chromatographic peak with the largest signal (parameter intensity allows defining the column in chromPeaks that should be selected; defaults to intensity = "into").
method = "sum": sum the values for all chromatographic peaks assigned to the feature in the same sample.
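The three strategies for parameter method can be compared on a toy case in which two chromatographic peaks from the same sample are assigned to one feature. This is a minimal base R sketch with hypothetical values; `peaks` and `feature_rtmed` are made-up names and the code only mimics the documented behaviour.

```r
## Two hypothetical peaks of one sample assigned to the same feature.
peaks <- data.frame(rt = c(118, 125),
                    into = c(5000, 12000),
                    maxo = c(900, 800))

## Hypothetical median retention time of the feature across samples.
feature_rtmed <- 120

value <- "into"      # column to report
intensity <- "into"  # column used to pick the peak for method = "maxint"

## method = "medret": peak with the apex closest to the feature's median
## retention time.
val_medret <- peaks[which.min(abs(peaks$rt - feature_rtmed)), value]

## method = "maxint": peak with the largest signal in column `intensity`.
val_maxint <- peaks[which.max(peaks[, intensity]), value]

## method = "sum": sum of the values of all peaks.
val_sum <- sum(peaks[, value])

c(medret = val_medret, maxint = val_maxint, sum = val_sum)
```

The choice only matters for features with more than one peak per sample; for all other features the three strategies report the same value.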
quantify: extract the correspondence analysis results as a
SummarizedExperiment(). The feature values are used as assay in
the returned SummarizedExperiment, rowData contains the
featureDefinitions (without column "peakidx") and colData the
sampleData of object. Additional parameters to the featureValues
function (that is used to extract the feature value matrix) can be
passed via the ... argument.
plot: plot for each file the position of individual peaks in the m/z -
retention time space (with color-coded intensity) and a base peak
chromatogram. This function should ideally be called only on a data subset
(i.e. after using filterRt and filterMz to restrict to a region of
interest). Parameter msLevel allows defining from which MS level the
plot should be created. If x is an XcmsExperiment with available
identified chromatographic peaks, the regions defining the peaks
are also indicated with rectangles. Parameter peakCol allows defining the
color of the border for these rectangles.
plotAdjustedRtime: plot the alignment results; see plotAdjustedRtime()
for more information.
plotChromPeakImage: show the density of identified chromatographic
peaks per file along the retention time. See plotChromPeakImage() for
details.
plotChromPeaks: indicate identified chromatographic peaks from one
sample in the RT-m/z space. See plotChromPeaks() for details.
uniqueMsLevels: returns the unique MS levels of the spectra in object.
The functions listed below ensure compatibility with the older
XCMSnExp() xcms result object.
fileNames: returns the original data file names for the spectra data.
Ideally, the dataOrigin or dataStorage spectra variables from the
object's spectra should be used instead.
fromFile: returns the file (sample) index for each spectrum within
object. Generally, subsetting by sample using [ is the preferred
way to get spectra from a specific sample.
polarity: returns the polarity information for each spectrum in
object.
processHistory: returns a list with ProcessHistory objects that also
contain the parameter object used for the different processing steps.
Optional parameter type allows querying for specific processing steps.
rtime: extract retention times of the spectra from the
MsExperiment or XcmsExperiment object. It is thus a shortcut for
rtime(spectra(object)) which would be the preferred way to extract
retention times from an MsExperiment. The rtime method for
XcmsExperiment has an additional parameter adjusted which allows
defining whether adjusted retention times (if present - adjusted = TRUE)
or raw retention times (adjusted = FALSE) should be returned. By
default adjusted retention times are returned if available.
XCMSnExp() object
Subsetting by [ supports arbitrary ordering.
Johannes Rainer
## Creating a MsExperiment object representing the data from an LC-MS
## experiment.
library(MsExperiment)
library(xcms)

## Defining the raw data files
fls <- c(system.file('cdf/KO/ko15.CDF', package = "faahKO"),
         system.file('cdf/KO/ko16.CDF', package = "faahKO"),
         system.file('cdf/KO/ko18.CDF', package = "faahKO"))

## Defining a data frame with the sample characterization
df <- data.frame(mzML_file = basename(fls),
                 sample = c("ko15", "ko16", "ko18"))

## Importing the data. This will initialize a `Spectra` object representing
## the raw data and assign these to the individual samples.
mse <- readMsExperiment(spectraFiles = fls, sampleData = df)

## Extract a total ion chromatogram and base peak chromatogram
## from the data
bpc <- chromatogram(mse, aggregationFun = "max")
tic <- chromatogram(mse)

## Plot them
par(mfrow = c(2, 1))
plot(bpc, main = "BPC")
plot(tic, main = "TIC")

## Extracting MS2 chromatographic data
##
## To show how MS2 chromatograms can be extracted we first load a DIA
## (SWATH) data set.
mse_dia <- readMsExperiment(system.file("TripleTOF-SWATH",
                                        "PestMix1_SWATH.mzML",
                                        package = "msdata"))

## Extracting MS2 chromatogram requires also to specify the isolation
## window from which to extract the data. Without that chromatograms
## will be empty:
chr_ms2 <- chromatogram(mse_dia, msLevel = 2L)
intensity(chr_ms2[[1L]])

## First we list available isolation windows
table(isolationWindowTargetMz(spectra(mse_dia)))

## We can then extract the TIC of MS2 data for a specific isolation window
chr_ms2 <- chromatogram(mse_dia, msLevel = 2L,
                        isolationWindowTargetMz = 244.05)
plot(chr_ms2)

####
## Chromatographic peak detection

## Perform peak detection on the data using the centWave algorithm. Note
## that the parameters are chosen to reduce the run time of the example.
p <- CentWaveParam(noise = 10000, snthresh = 40, prefilter = c(3, 10000))
xmse <- findChromPeaks(mse, param = p)
xmse

## Have a quick look at the identified chromatographic peaks
head(chromPeaks(xmse))

## Extract chromatographic peaks identified between 3000 and 3300 seconds
chromPeaks(xmse, rt = c(3000, 3300), type = "within")

## Extract ion chromatograms (EIC) for the first two chromatographic
## peaks.
chrs <- chromatogram(xmse,
                     mz = chromPeaks(xmse)[1:2, c("mzmin", "mzmax")],
                     rt = chromPeaks(xmse)[1:2, c("rtmin", "rtmax")])

## An EIC for each sample and each of the two regions was extracted.
## Identified chromatographic peaks in the defined regions are extracted
## as well.
chrs

## Plot the EICs for the second defined region
plot(chrs[2, ])

## Subsetting the data to the results (and data) for the second sample
a <- xmse[2]
nrow(chromPeaks(xmse))
nrow(chromPeaks(a))

## Filtering the result by retention time: keeping all spectra and
## chromatographic peaks within 3000 and 3500 seconds.
xmse_sub <- filterRt(xmse, rt = c(3000, 3500))
xmse_sub
nrow(chromPeaks(xmse_sub))

## Perform an initial feature grouping to allow alignment using the
## peak groups method:
pdp <- PeakDensityParam(sampleGroups = rep(1, 3))
xmse <- groupChromPeaks(xmse, param = pdp)

## Perform alignment using the peak groups method.
pgp <- PeakGroupsParam(span = 0.4)
xmse <- adjustRtime(xmse, param = pgp)

## Visualizing the alignment results
plotAdjustedRtime(xmse)

## Performing the final correspondence analysis
xmse <- groupChromPeaks(xmse, param = pdp)

## Show the definition of the first 6 features
featureDefinitions(xmse) |> head()

## Extract the feature values; show the results for the first 6 rows.
featureValues(xmse) |> head()

## The full results can also be extracted as a `SummarizedExperiment`,
## which can simplify subsequent analyses with other packages.
## Any additional parameters passed to the function are passed to the
## `featureValues` function that is called to generate the feature value
## matrix.
se <- quantify(xmse, method = "sum")

## EICs for all features can be extracted with the `featureChromatograms`
## function. Note that, depending on the data set, extracting this for
## all features might take some time. Below we extract EICs for the
## first 10 features by providing the feature IDs.
chrs <- featureChromatograms(xmse,
                             features = rownames(featureDefinitions(xmse))[1:10])
chrs
plot(chrs[3, ])
When dealing with metabolomics results, it is often necessary to filter
features based on certain criteria. These criteria are typically derived
from statistics computed over full rows of data, where each row
represents a feature and its signal abundance in each sample.
The filterFeatures function filters features based on these conventional
quality assessment criteria. Several types of filtering are implemented and
can be selected with the filter argument.
Supported filter arguments are:
RsdFilter: Calculates the relative standard deviation
(i.e. coefficient of variation) in abundance for each feature in QC
(Quality Control) samples and filters them in the input object according to
a provided threshold.
DratioFilter: Computes the D-ratio or dispersion ratio, defined as
the standard deviation in abundance for QC samples divided by the standard
deviation for biological test samples, for each feature and filters them
according to a provided threshold.
PercentMissingFilter: Determines the percentage of missing values for
each feature in the various sample groups and filters them according to a
provided threshold.
BlankFlag: Identifies features where the mean abundance in test samples
is lower than a specified multiple of the mean abundance of blank samples.
This can be used to flag features that result from contamination in the
solvent of the samples. A new column possible_contaminants is added to the
featureDefinitions (XcmsExperiment object) or rowData
(SummarizedExperiment object) reflecting this.
For specific examples, see the help pages of the individual parameter classes listed above.
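The four criteria above reduce to simple row-wise statistics on the feature abundance matrix. The base R sketch below illustrates them on a small made-up matrix. It does not call xcms: the matrix vals, the sample grouping and the thresholds are assumptions chosen for illustration, and the actual filters may differ in detail (for example in how missing values are handled).

```r
## Toy feature-by-sample abundance matrix (hypothetical values).
vals <- rbind(
    FT1 = c(100, 102,  98,  50, 150, 200,  5),
    FT2 = c(100, 160,  40,  90, 110, 100, 80),
    FT3 = c( 10,  11,  NA,  NA,  NA,  12,  9)
)
colnames(vals) <- c("QC1", "QC2", "QC3", "S1", "S2", "S3", "BL1")
sample_type <- c("QC", "QC", "QC", "study", "study", "study", "blank")
is_qc <- sample_type == "QC"
is_study <- sample_type == "study"
is_blank <- sample_type == "blank"

## Relative standard deviation (coefficient of variation) in QC samples.
rsd <- apply(vals[, is_qc, drop = FALSE], 1, function(x)
    sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE))

## D-ratio: standard deviation in QC samples divided by the standard
## deviation in study samples.
dratio <- apply(vals, 1, function(x)
    sd(x[is_qc], na.rm = TRUE) / sd(x[is_study], na.rm = TRUE))

## Percentage of missing values per feature within each sample group.
pct_missing <- sapply(split(seq_along(sample_type), sample_type),
    function(i) rowMeans(is.na(vals[, i, drop = FALSE])) * 100)

## Flag features whose mean abundance in study samples is lower than
## `threshold` times the mean abundance in blank samples.
threshold <- 2
possible_contaminants <-
    rowMeans(vals[, is_study, drop = FALSE], na.rm = TRUE) <
    threshold * rowMeans(vals[, is_blank, drop = FALSE], na.rm = TRUE)

## Features that would pass an RSD cut-off of 0.3.
keep_rsd <- rsd <= 0.3

rsd
dratio
pct_missing
possible_contaminants
```

Note that the D-ratio is NA for FT3 because only one study sample has a value, which is one reason why missing values are usually assessed before variance-based filters are applied.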
object |
|
filter |
The parameter object selecting and configuring the type of
filtering. It can be one of the following classes: |
assay |
For filtering of |
... |
Optional parameters. For |
Philippine Louail
Broadhurst D, Goodacre R, Reinke SN, Kuligowski J, Wilson ID, Lewis MR, Dunn WB. Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies. Metabolomics. 2018;14(6):72. doi: 10.1007/s11306-018-1367-3. Epub 2018 May 18. PMID: 29805336; PMCID: PMC5960010.
## See the vignettes for more detailed examples
library(xcms)
library(MsExperiment)

## Load a test data set with features defined.
test_xcms <- loadXcmsData()

## Set up parameter to filter based on coefficient of variation. With the
## settings below, features with a coefficient of variation above 0.3 in
## QC samples are removed from the object `test_xcms` when calling the
## `filterFeatures` function.
rsd_filter <- RsdFilter(threshold = 0.3,
    qcIndex = sampleData(test_xcms)$sample_type == "QC")
filtered_data_rsd <- filterFeatures(object = test_xcms, filter = rsd_filter)

## Set up parameter to filter based on D-ratio. With the settings below,
## features with a D-ratio (computed from their abundance in QC and study
## samples) above 0.5 are removed from the object `test_xcms`.
dratio_filter <- DratioFilter(threshold = 0.5,
    qcIndex = sampleData(test_xcms)$sample_type == "QC",
    studyIndex = sampleData(test_xcms)$sample_type == "study")
filtered_data_dratio <- filterFeatures(object = test_xcms,
    filter = dratio_filter)

## Set up parameter to filter based on the percentage of missing data.
## Parameter f should define the sample groups within which the
## percentage of missing values is evaluated. With the settings below,
## a feature with 30% or fewer missing values in at least one sample
## group is kept in the `test_xcms` object.
missing_data_filter <- PercentMissingFilter(threshold = 30,
    f = sampleData(test_xcms)$sample_type)
filtered_data_missing <- filterFeatures(object = test_xcms,
    filter = missing_data_filter)

## Set up parameter to flag possible contaminants based on the abundance
## in blank samples. With the settings below, features whose mean
## abundance in QC samples is less than 2 times their mean abundance in
## blank samples (study samples stand in for blanks in this example) are
## marked as `TRUE` in an extra column named `possible_contaminants` in
## the `featureDefinitions` table of the object `test_xcms`.
filter <- BlankFlag(threshold = 2,
    qcIndex = sampleData(test_xcms)$sample_type == "QC",
    blankIndex = sampleData(test_xcms)$sample_type == "study")
filtered_xmse <- filterFeatures(test_xcms, filter)
The findChromPeaks method performs chromatographic peak detection on
LC/GC-MS data. The peak detection algorithm can be selected, and configured,
using the param argument.
Supported param objects are:
CentWaveParam(): chromatographic peak detection using the centWave
algorithm.
CentWavePredIsoParam(): centWave with predicted isotopes. Peak
detection uses a two-step centWave-based approach considering also feature
isotopes.
MatchedFilterParam(): peak detection using the matched filter
algorithm.
MassifquantParam(): peak detection using the Kalman filter-based
massifquant method.
MSWParam(): single-spectrum non-chromatography MS data peak detection.
For specific examples see the help pages of the individual parameter classes listed above.
findChromPeaks(object, param, ...)

## S4 method for signature 'MsExperiment,Param'
findChromPeaks(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  ...,
  BPPARAM = bpparam()
)

## S4 method for signature 'XcmsExperiment,Param'
findChromPeaks(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  add = FALSE,
  ...,
  BPPARAM = bpparam()
)
object |
The data object on which to perform the peak detection. Can be
an |
param |
The parameter object selecting and configuring the algorithm. |
... |
Optional parameters. |
msLevel |
|
chunkSize |
|
BPPARAM |
Parallel processing setup. Uses by default the system-wide
default setup. See |
add |
|
Johannes Rainer
plotChromPeaks() to plot identified chromatographic peaks for one file.
refineChromPeaks() for methods to refine or clean identified
chromatographic peaks.
manualChromPeaks() to manually add/define chromatographic peaks.
Other peak detection methods:
findChromPeaks-centWave, findChromPeaks-centWaveWithPredIsoROIs,
findChromPeaks-massifquant, findChromPeaks-matchedFilter, findPeaks-MSW
The centWave algorithm performs peak density and wavelet based chromatographic peak detection for high resolution LC/MS data in centroid mode [Tautenhahn 2008].
The CentWaveParam class allows specifying all settings
for a chromatographic peak detection using the centWave method. Instances
should be created with the CentWaveParam constructor.
The findChromPeaks,OnDiskMSnExp,CentWaveParam method
performs chromatographic peak detection using the centWave
algorithm on all samples from an OnDiskMSnExp object. OnDiskMSnExp
objects encapsulate all experiment specific data and load the spectra
data (mz and intensity values) on the fly from the original files,
applying also any data manipulations.
ppm, ppm<-: getter and setter for the ppm slot of the object.
peakwidth, peakwidth<-: getter and setter for the peakwidth slot of the object.
snthresh, snthresh<-: getter and setter for the snthresh slot of the object.
prefilter, prefilter<-: getter and setter for the prefilter slot of the object.
mzCenterFun, mzCenterFun<-: getter and setter for the mzCenterFun slot of the object.
integrate, integrate<-: getter and setter for the integrate slot of the object.
mzdiff, mzdiff<-: getter and setter for the mzdiff slot of the object.
fitgauss, fitgauss<-: getter and setter for the fitgauss slot of the object.
noise, noise<-: getter and setter for the noise slot of the object.
verboseColumns, verboseColumns<-: getter and setter for the verboseColumns slot of the object.
roiList, roiList<-: getter and setter for the roiList slot of the object.
firstBaselineCheck, firstBaselineCheck<-: getter and setter for the firstBaselineCheck slot of the object.
roiScales, roiScales<-: getter and setter for the roiScales slot of the object.
CentWaveParam(
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1L,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  roiList = list(),
  firstBaselineCheck = TRUE,
  roiScales = numeric(),
  extendLengthMSW = FALSE,
  verboseBetaColumns = FALSE
)

## S4 method for signature 'OnDiskMSnExp,CentWaveParam'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  ...
)

## S4 method for signature 'CentWaveParam'
ppm(object)

## S4 replacement method for signature 'CentWaveParam'
ppm(object) <- value

## S4 method for signature 'CentWaveParam'
peakwidth(object)

## S4 replacement method for signature 'CentWaveParam'
peakwidth(object) <- value

## S4 method for signature 'CentWaveParam'
snthresh(object)

## S4 replacement method for signature 'CentWaveParam'
snthresh(object) <- value

## S4 method for signature 'CentWaveParam'
prefilter(object)

## S4 replacement method for signature 'CentWaveParam'
prefilter(object) <- value

## S4 method for signature 'CentWaveParam'
mzCenterFun(object)

## S4 replacement method for signature 'CentWaveParam'
mzCenterFun(object) <- value

## S4 method for signature 'CentWaveParam'
integrate(f)

## S4 replacement method for signature 'CentWaveParam'
integrate(object) <- value

## S4 method for signature 'CentWaveParam'
mzdiff(object)

## S4 replacement method for signature 'CentWaveParam'
mzdiff(object) <- value

## S4 method for signature 'CentWaveParam'
fitgauss(object)

## S4 replacement method for signature 'CentWaveParam'
fitgauss(object) <- value

## S4 method for signature 'CentWaveParam'
noise(object)

## S4 replacement method for signature 'CentWaveParam'
noise(object) <- value

## S4 method for signature 'CentWaveParam'
verboseColumns(object)

## S4 replacement method for signature 'CentWaveParam'
verboseColumns(object) <- value

## S4 method for signature 'CentWaveParam'
roiList(object)

## S4 replacement method for signature 'CentWaveParam'
roiList(object) <- value

## S4 method for signature 'CentWaveParam'
firstBaselineCheck(object)

## S4 replacement method for signature 'CentWaveParam'
firstBaselineCheck(object) <- value

## S4 method for signature 'CentWaveParam'
roiScales(object)

## S4 replacement method for signature 'CentWaveParam'
roiScales(object) <- value

## S4 method for signature 'CentWaveParam'
as.list(x, ...)
ppm |
|
peakwidth |
|
snthresh |
|
prefilter |
|
mzCenterFun |
Name of the function to calculate the m/z center of the
chromatographic peak. Allowed are: |
integrate |
Integration method. For |
mzdiff |
|
fitgauss |
|
noise |
|
verboseColumns |
|
roiList |
An optional list of regions-of-interest (ROI) representing
detected mass traces. If ROIs are submitted the first analysis step is
omitted and chromatographic peak detection is performed on the submitted
ROIs. Each ROI is expected to have the following elements specified:
|
firstBaselineCheck |
|
roiScales |
Optional numeric vector with length equal to |
extendLengthMSW |
Option to force centWave to use all scales when
running centWave rather than truncating with the EIC length. Uses the "open"
method to extend the EIC to a integer base-2 length prior to being passed to
|
verboseBetaColumns |
Option to calculate two additional metrics of peak
quality via comparison to an idealized bell curve. Adds |
object |
For For all other methods: a parameter object. |
param |
An |
BPPARAM |
A parameter class specifying if and how parallel processing
should be performed. It defaults to |
return.type |
Character specifying what type of object the method should
return. Can be either |
msLevel |
|
... |
ignored. |
value |
The value for the slot. |
f |
For |
x |
The parameter object. |
The centWave algorithm is most suitable for high resolution
LC/{TOF,OrbiTrap,FTICR}-MS data in centroid mode. In the first phase
the method identifies regions of interest (ROIs) representing
mass traces that are characterized as regions with less than ppm
m/z deviation in consecutive scans in the LC/MS map. In detail, starting
with a single m/z, a ROI is extended if a m/z can be found in the next scan
(spectrum) for which the difference to the mean m/z of the ROI is smaller
than the user defined ppm of the m/z. The mean m/z of the ROI is then
updated considering also the newly included m/z value.
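The ROI extension rule described above can be sketched in a few lines of base R. This is an illustration under simplifying assumptions, not the xcms implementation: the function extend_roi and the input scans (a list with one numeric m/z vector per consecutive scan) are hypothetical, only a single ROI is followed, and intensity-based prefiltering is ignored.

```r
## Extend a single ROI across consecutive scans: a centroid joins the ROI
## if its m/z differs from the current mean m/z of the ROI by less than
## `ppm` parts per million. The mean is updated after each extension.
extend_roi <- function(scans, start_mz, ppm = 25) {
    roi_mz <- start_mz
    for (mzs in scans) {
        mean_mz <- mean(roi_mz)
        tol <- mean_mz * ppm / 1e6
        diffs <- abs(mzs - mean_mz)
        if (!length(mzs) || min(diffs) >= tol)
            break                      # no matching m/z: mass trace ends
        roi_mz <- c(roi_mz, mzs[which.min(diffs)])
    }
    roi_mz
}

## Hypothetical centroids of four consecutive scans.
scans <- list(c(150.2, 300.1005),
              c(300.1010, 410.7),
              c(300.1200, 512.3),
              c(300.1008))
roi <- extend_roi(scans, start_mz = 300.1000)
roi
mean(roi)
```

In this toy input the third scan has no m/z within 25 ppm of the running mean, so the ROI ends after the second scan even though the fourth scan would match again.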
These ROIs are then, after some cleanup, analyzed using continuous wavelet
transform (CWT) to locate chromatographic peaks on different scales.
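A minimal base R sketch of the wavelet step is given below. It assumes a Mexican hat mother wavelet and a simulated intensity trace. The helper names mexican_hat and cwt_at_scale, the zero padding and the chosen scales are illustrative assumptions and do not reproduce the xcms code.

```r
## Mexican hat (Ricker) wavelet, the mother wavelet assumed here.
mexican_hat <- function(t) (1 - t^2) * exp(-t^2 / 2)

## CWT coefficients of the intensity vector `x` at one `scale` (in scans).
cwt_at_scale <- function(x, scale) {
    half <- ceiling(5 * scale)
    w <- mexican_hat((-half:half) / scale) / sqrt(scale)
    ## Pad with zeros so that coefficients are defined up to the borders.
    padded <- c(numeric(half), x, numeric(half))
    ## The wavelet is symmetric, so filtering equals convolution.
    coefs <- stats::filter(padded, w, sides = 2)
    as.numeric(coefs)[half + seq_along(x)]
}

## Simulated EIC: Gaussian peak centered at scan 50 on a flat baseline.
scan <- 1:100
eic <- 1000 * exp(-(scan - 50)^2 / (2 * 5^2)) + 20

## Coefficients at several scales, one column per scale.
scales <- c(2, 5, 10)
coefs <- sapply(scales, function(s) cwt_at_scale(eic, s))
colnames(coefs) <- paste0("scale_", scales)

## Scan with the largest coefficient at each scale.
apex <- apply(coefs, 2, which.max)
apex
```

Because the wavelet has (approximately) zero mean, the flat baseline contributes little to the coefficients, while a peak whose width matches the scale gives a strong local maximum at its apex.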
The first analysis step is skipped if regions of interest are passed
via the param parameter.
Parallel processing (one process per sample) is supported and can
be configured either by the BPPARAM parameter or by globally
defining the parallel processing mode using the
register method from the BiocParallel package.
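The two ways of configuring parallel processing can be sketched as follows, assuming the BiocParallel package is installed. The choice of backends and the number of workers are arbitrary examples and not a recommendation.

```r
library(BiocParallel)

## Option 1: register a backend globally. Here a serial (non-parallel)
## backend is registered, which `bpparam()` then returns as the default.
register(SerialParam())
default_bp <- bpparam()
default_bp

## Option 2: create a backend object and pass it to a single call via the
## BPPARAM argument, e.g. `findChromPeaks(..., BPPARAM = bp)`.
## SnowParam works on all platforms; MulticoreParam is an alternative on
## Unix-like systems.
bp <- SnowParam(workers = 2)
bp
```

Registering SerialParam() globally is also a convenient way to disable parallel processing while debugging.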
The CentWaveParam function returns a CentWaveParam
class instance with all of the settings specified for chromatographic
peak detection by the centWave method.
For findChromPeaks: if return.type = "XCMSnExp" an
XCMSnExp object with the results of the peak detection.
If return.type = "list" a list of length equal to the number of
samples with matrices specifying the identified peaks.
If return.type = "xcmsSet" an xcmsSet object
with the results of the peak detection.
ppm,peakwidth,snthresh,prefilter,mzCenterFun,integrate,mzdiff,fitgauss,noise,verboseColumns,roiList,firstBaselineCheck,roiScales,extendLengthMSW,verboseBetaColumns
See corresponding parameter above. Slot values should be accessed exclusively via the corresponding getter and setter methods listed above.
These methods and classes are part of the updated and modernized
xcms user interface which will eventually replace the
findPeaks methods. It supports peak detection on
OnDiskMSnExp objects (defined in the MSnbase
package). All of the settings to the centWave algorithm can be passed
with a CentWaveParam object.
Ralf Tautenhahn, Johannes Rainer
Ralf Tautenhahn, Christoph Böttcher, and Steffen Neumann "Highly sensitive feature detection for high resolution LC/MS" BMC Bioinformatics 2008, 9:504
The do_findChromPeaks_centWave core API function and
findPeaks.centWave for the old user interface.
peaksWithCentWave for functions to perform centWave peak
detection in purely chromatographic data.
XCMSnExp for the object containing the results of
the peak detection.
Other peak detection methods:
findChromPeaks(), findChromPeaks-centWaveWithPredIsoROIs,
findChromPeaks-massifquant, findChromPeaks-matchedFilter, findPeaks-MSW
library(faahKO)
library(xcms)
library(MsExperiment)

## Create a CentWaveParam object. Note that the noise is set to 10000 to
## speed up the execution of the example - in a real use case the default
## value should be used, or it should be set to a reasonable value.
cwp <- CentWaveParam(ppm = 20, noise = 10000, prefilter = c(3, 10000))

## Change snthresh parameter
snthresh(cwp) <- 25
cwp

## Perform the peak detection using centWave on one of the files from the
## faahKO package. Files are read using the `readMsExperiment` function
## from the MsExperiment package
fls <- dir(system.file("cdf/KO", package = "faahKO"), recursive = TRUE,
           full.names = TRUE)
raw_data <- readMsExperiment(fls[1])

## Perform the peak detection using the settings defined above.
res <- findChromPeaks(raw_data, param = cwp)
head(chromPeaks(res))
This method performs a two-step centWave-based chromatographic peak detection: in a first centWave run peaks are identified, and the location of their potential isotopes in the m/z-retention time space is then predicted. A second centWave run is performed on these regions of interest (ROIs). The final list of chromatographic peaks comprises all non-overlapping peaks from both centWave runs.
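To make the isotope prediction step more concrete, the base R sketch below computes candidate m/z positions for the isotope peaks of one detected peak. It assumes that isotope peaks are spaced by about 1.00336 (the mass difference between 13C and 12C) divided by the charge state. The function predict_isotope_mz and its iso_dist argument are hypothetical, and the actual xcms implementation additionally defines m/z and retention time intervals around these positions.

```r
## Predict m/z positions of potential isotope peaks for a peak at `mz`,
## for all charge states up to `maxCharge` and up to `maxIso` isotopes.
## Assumption: isotope spacing of ~1.00336 divided by the charge.
predict_isotope_mz <- function(mz, maxCharge = 3, maxIso = 5,
                               iso_dist = 1.00336) {
    grid <- expand.grid(isotope = seq_len(maxIso),
                        charge = seq_len(maxCharge))
    grid$mz <- mz + grid$isotope * iso_dist / grid$charge
    grid
}

## Candidate isotope positions for a peak at m/z 300.1, considering up to
## two isotopes and charge states 1 and 2.
iso <- predict_isotope_mz(300.1, maxCharge = 2, maxIso = 2)
iso
```

The argument names maxCharge and maxIso mirror the corresponding CentWavePredIsoParam settings, so larger values produce more candidate ROIs for the second centWave run.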
The CentWavePredIsoParam class allows specifying all
settings for the two-step centWave-based peak detection considering also
predicted isotopes of peaks identified in the first centWave run.
Instances should be created with the CentWavePredIsoParam
constructor. See also the documentation of the
CentWaveParam for all methods and arguments this class inherits.
The findChromPeaks,OnDiskMSnExp,CentWavePredIsoParam
method performs a two-step centWave-based chromatographic peak detection
on all samples from an OnDiskMSnExp object.
OnDiskMSnExp objects encapsulate all experiment
specific data and load the spectra data (mz and intensity values) on the
fly from the original files, applying also any data manipulations.
snthreshIsoROIs, snthreshIsoROIs<-: getter and setter for the snthreshIsoROIs slot of the object.
maxCharge, maxCharge<-: getter and setter for the maxCharge slot of the object.
maxIso, maxIso<-: getter and setter for the maxIso slot of the object.
mzIntervalExtension, mzIntervalExtension<-: getter and setter for the mzIntervalExtension slot of the object.
polarity, polarity<-: getter and setter for the polarity slot of the object.
CentWavePredIsoParam(
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1L,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  roiList = list(),
  firstBaselineCheck = TRUE,
  roiScales = numeric(),
  extendLengthMSW = FALSE,
  verboseBetaColumns = FALSE,
  snthreshIsoROIs = 6.25,
  maxCharge = 3,
  maxIso = 5,
  mzIntervalExtension = TRUE,
  polarity = "unknown"
)

## S4 method for signature 'OnDiskMSnExp,CentWavePredIsoParam'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  ...
)

## S4 method for signature 'CentWavePredIsoParam'
snthreshIsoROIs(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
snthreshIsoROIs(object) <- value

## S4 method for signature 'CentWavePredIsoParam'
maxCharge(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
maxCharge(object) <- value

## S4 method for signature 'CentWavePredIsoParam'
maxIso(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
maxIso(object) <- value

## S4 method for signature 'CentWavePredIsoParam'
mzIntervalExtension(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
mzIntervalExtension(object) <- value

## S4 method for signature 'CentWavePredIsoParam'
polarity(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
polarity(object) <- value
ppm |
|
peakwidth |
|
snthresh |
|
prefilter |
|
mzCenterFun |
Name of the function to calculate the m/z center of the
chromatographic peak. Allowed are: |
integrate |
Integration method. For |
mzdiff |
|
fitgauss |
|
noise |
|
verboseColumns |
|
roiList |
An optional list of regions-of-interest (ROI) representing
detected mass traces. If ROIs are submitted the first analysis step is
omitted and chromatographic peak detection is performed on the submitted
ROIs. Each ROI is expected to have the following elements specified:
|
firstBaselineCheck |
|
roiScales |
Optional numeric vector with length equal to |
extendLengthMSW |
Option to force centWave to use all scales when
running centWave rather than truncating with the EIC length. Uses the "open"
method to extend the EIC to a integer base-2 length prior to being passed to
|
verboseBetaColumns |
Option to calculate two additional metrics of peak
quality via comparison to an idealized bell curve. Adds |
snthreshIsoROIs |
|
maxCharge |
|
maxIso |
|
mzIntervalExtension |
|
polarity |
|
object |
For For all other methods: a parameter object. |
param |
An |
BPPARAM |
A parameter class specifying if and how parallel processing
should be performed. It defaults to |
return.type |
Character specifying what type of object the method should
return. Can be either |
msLevel |
|
... |
ignored. |
value |
The value for the slot. |
See centWave for details on the centWave method.
Parallel processing (one process per sample) is supported and can
be configured either by the BPPARAM parameter or by globally
defining the parallel processing mode using the
register method from the BiocParallel package.
The CentWavePredIsoParam function returns a
CentWavePredIsoParam class instance with all of the settings
specified for the two-step centWave-based peak detection considering also
isotopes.
For findChromPeaks: if return.type = "XCMSnExp" an
XCMSnExp object with the results of the peak detection.
If return.type = "list" a list of length equal to the number of
samples with matrices specifying the identified peaks.
If return.type = "xcmsSet" an xcmsSet object
with the results of the peak detection.
ppm,peakwidth,snthresh,prefilter,mzCenterFun,integrate,mzdiff,fitgauss,noise,verboseColumns,roiList,firstBaselineCheck,roiScales,extendLengthMSW,verboseBetaColumns,snthreshIsoROIs,maxCharge,maxIso,mzIntervalExtension,polarity
See corresponding parameter above.
These methods and classes are part of the updated and modernized
xcms
user interface which will eventually replace the
findPeaks
methods. It supports chromatographic peak
detection on
OnDiskMSnExp
objects (defined in the
MSnbase
package). All of the settings to the algorithm can be
passed with a CentWavePredIsoParam
object.
Hendrik Treutler, Johannes Rainer
The do_findChromPeaks_centWaveWithPredIsoROIs core API function and findPeaks.centWave for the old user interface. CentWaveParam for the class the CentWavePredIsoParam extends.
XCMSnExp for the object containing the results of the peak detection.
Other peak detection methods: findChromPeaks(), findChromPeaks-centWave, findChromPeaks-massifquant, findChromPeaks-matchedFilter, findPeaks-MSW
## Create a param object
p <- CentWavePredIsoParam(maxCharge = 4)
## Change snthresh parameter
snthresh(p) <- 25
p
Massifquant is a Kalman filter (KF)-based chromatographic peak detection method for XC-MS data in centroid mode. The identified peaks can be further refined with the centWave method (see findChromPeaks-centWave for details on centWave) by specifying withWave = TRUE.
The MassifquantParam class allows specifying all settings for a chromatographic peak detection using the massifquant method, optionally in combination with the centWave algorithm. Instances should be created with the MassifquantParam constructor.
The findChromPeaks,OnDiskMSnExp,MassifquantParam method performs chromatographic peak detection using the massifquant algorithm on all samples from an OnDiskMSnExp object. OnDiskMSnExp objects encapsulate all experiment-specific data and load the spectra data (m/z and intensity values) on the fly from the original files, applying any pending data manipulations along the way.
ppm, ppm<-: getter and setter for the ppm slot of the object.
peakwidth, peakwidth<-: getter and setter for the peakwidth slot of the object.
snthresh, snthresh<-: getter and setter for the snthresh slot of the object.
prefilter, prefilter<-: getter and setter for the prefilter slot of the object.
mzCenterFun, mzCenterFun<-: getter and setter for the mzCenterFun slot of the object.
integrate, integrate<-: getter and setter for the integrate slot of the object.
mzdiff, mzdiff<-: getter and setter for the mzdiff slot of the object.
fitgauss, fitgauss<-: getter and setter for the fitgauss slot of the object.
noise, noise<-: getter and setter for the noise slot of the object.
verboseColumns, verboseColumns<-: getter and setter for the verboseColumns slot of the object.
criticalValue, criticalValue<-: getter and setter for the criticalValue slot of the object.
consecMissedLimit, consecMissedLimit<-: getter and setter for the consecMissedLimit slot of the object.
unions, unions<-: getter and setter for the unions slot of the object.
checkBack, checkBack<-: getter and setter for the checkBack slot of the object.
withWave, withWave<-: getter and setter for the withWave slot of the object.
MassifquantParam(
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1L,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  criticalValue = 1.125,
  consecMissedLimit = 2,
  unions = 1,
  checkBack = 0,
  withWave = FALSE
)

## S4 method for signature 'OnDiskMSnExp,MassifquantParam'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  ...
)

## S4 method for signature 'MassifquantParam'
ppm(object)

## S4 replacement method for signature 'MassifquantParam'
ppm(object) <- value

## S4 method for signature 'MassifquantParam'
peakwidth(object)

## S4 replacement method for signature 'MassifquantParam'
peakwidth(object) <- value

## S4 method for signature 'MassifquantParam'
snthresh(object)

## S4 replacement method for signature 'MassifquantParam'
snthresh(object) <- value

## S4 method for signature 'MassifquantParam'
prefilter(object)

## S4 replacement method for signature 'MassifquantParam'
prefilter(object) <- value

## S4 method for signature 'MassifquantParam'
mzCenterFun(object)

## S4 replacement method for signature 'MassifquantParam'
mzCenterFun(object) <- value

## S4 method for signature 'MassifquantParam'
integrate(f)

## S4 replacement method for signature 'MassifquantParam'
integrate(object) <- value

## S4 method for signature 'MassifquantParam'
mzdiff(object)

## S4 replacement method for signature 'MassifquantParam'
mzdiff(object) <- value

## S4 method for signature 'MassifquantParam'
fitgauss(object)

## S4 replacement method for signature 'MassifquantParam'
fitgauss(object) <- value

## S4 method for signature 'MassifquantParam'
noise(object)

## S4 replacement method for signature 'MassifquantParam'
noise(object) <- value

## S4 method for signature 'MassifquantParam'
verboseColumns(object)

## S4 replacement method for signature 'MassifquantParam'
verboseColumns(object) <- value

## S4 method for signature 'MassifquantParam'
criticalValue(object)

## S4 replacement method for signature 'MassifquantParam'
criticalValue(object) <- value

## S4 method for signature 'MassifquantParam'
consecMissedLimit(object)

## S4 replacement method for signature 'MassifquantParam'
consecMissedLimit(object) <- value

## S4 method for signature 'MassifquantParam'
unions(object)

## S4 replacement method for signature 'MassifquantParam'
unions(object) <- value

## S4 method for signature 'MassifquantParam'
checkBack(object)

## S4 replacement method for signature 'MassifquantParam'
checkBack(object) <- value

## S4 method for signature 'MassifquantParam'
withWave(object)

## S4 replacement method for signature 'MassifquantParam'
withWave(object) <- value
ppm |
|
peakwidth |
|
snthresh |
|
prefilter |
|
mzCenterFun |
Name of the function to calculate the m/z center of the
chromatographic peak. Allowed are: |
integrate |
Integration method. For |
mzdiff |
|
fitgauss |
|
noise |
|
verboseColumns |
|
criticalValue |
|
consecMissedLimit |
|
unions |
|
checkBack |
|
withWave |
|
object |
For findChromPeaks: an OnDiskMSnExp object. For all other methods: a parameter object. |
param |
An |
BPPARAM |
A parameter class specifying if and how parallel processing
should be performed. It defaults to |
return.type |
Character specifying what type of object the method should
return. Can be either |
msLevel |
|
... |
ignored. |
value |
The value for the slot. |
f |
For |
This algorithm's performance has been tested rigorously on high resolution LC/(OrbiTrap, TOF)-MS data in centroid mode. Simultaneous Kalman filters identify chromatographic peaks and calculate their area under the curve. The default parameters are set to operate on a complex LC-MS Orbitrap sample. Users will find it useful to do some simple exploratory data analysis to find out where to set a minimum intensity, and to identify how many scans an average peak spans. The consecMissedLimit parameter has yielded good performance on Orbitrap data when set to 2, and on TOF data it was found to work best at 1. This may change as the algorithm has yet to be tested on many samples. The criticalValue parameter is perhaps the most difficult to dial in appropriately, and visual inspection of peak identification is the best suggested tool for quick optimization. The ppm and checkBack parameters have shown less influence than the other parameters and exist to give users flexibility and better accuracy.
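The exploratory step mentioned above (choosing a minimum intensity and checking how many scans a peak spans) can be sketched in base R. The intensity trace and the cut-off below are made up for illustration; in practice they would come from an extracted ion chromatogram of your own data.

```r
## Hypothetical intensities of one mass trace over consecutive scans.
trace <- c(0, 5, 120, 340, 560, 310, 90, 4, 0, 0, 150, 420, 380, 130, 3)

## Assumed minimum intensity, chosen by eye for this toy trace.
min_intensity <- 50

## Runs of consecutive scans at or above the cut-off approximate peaks.
runs <- rle(trace >= min_intensity)
scans_per_peak <- runs$lengths[runs$values]

scans_per_peak        # number of scans spanned by each candidate peak
mean(scans_per_peak)  # average peak width in scans
```

Multiplying the average number of scans by the scan interval gives a rough starting point for a peak width in seconds.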
Parallel processing (one process per sample) is supported and can
be configured either by the BPPARAM
parameter or by globally
defining the parallel processing mode using the
register
method from the BiocParallel
package.
The MassifquantParam function returns a MassifquantParam class instance with all of the settings specified for chromatographic peak detection by the massifquant method.
For findChromPeaks: if return.type = "XCMSnExp", an XCMSnExp object with the results of the peak detection. If return.type = "list", a list of length equal to the number of samples, with matrices specifying the identified peaks. If return.type = "xcmsSet", an xcmsSet object with the results of the peak detection.
ppm,peakwidth,snthresh,prefilter,mzCenterFun,integrate,mzdiff,fitgauss,noise,verboseColumns,criticalValue,consecMissedLimit,unions,checkBack,withWave
See corresponding parameter above. Slot values should exclusively be accessed via the corresponding getter and setter methods listed above.
These methods and classes are part of the updated and modernized
xcms
user interface which will eventually replace the
findPeaks
methods. It supports chromatographic peak
detection on
OnDiskMSnExp
objects (defined in the
MSnbase
package). All of the settings to the massifquant and
centWave algorithm can be passed with a MassifquantParam
object.
Christopher Conley, Johannes Rainer
Conley CJ, Smith R, Torgrip RJ, Taylor RM, Tautenhahn R and Prince JT "Massifquant: open-source Kalman filter-based XC-MS isotope trace feature detection" Bioinformatics 2014, 30(18):2636-43.
The do_findChromPeaks_massifquant core API function and findPeaks.massifquant for the old user interface.
XCMSnExp for the object containing the results of the peak detection.
Other peak detection methods: findChromPeaks(), findChromPeaks-centWave, findChromPeaks-centWaveWithPredIsoROIs, findChromPeaks-matchedFilter, findPeaks-MSW
## Create a MassifquantParam object.
mqp <- MassifquantParam()
## Change the snthresh and prefilter parameters
snthresh(mqp) <- 30
prefilter(mqp) <- c(6, 10000)
mqp

## Perform the peak detection using massifquant on the files from the
## faahKO package. Files are read using readMSData from the MSnbase
## package
library(faahKO)
library(MSnbase)
fls <- dir(system.file("cdf/KO", package = "faahKO"), recursive = TRUE,
           full.names = TRUE)
raw_data <- readMSData(fls[1], mode = "onDisk")
## Perform the peak detection using the settings defined above.
res <- findChromPeaks(raw_data, param = mqp)
head(chromPeaks(res))
The matchedFilter algorithm identifies peaks in the chromatographic time domain as described in [Smith 2006]. The intensity values are binned by cutting the LC/MS data into slices (bins) of a mass unit (binSize m/z) wide. Within each bin the maximal intensity is selected. The chromatographic peak detection is then performed in each bin by extending it based on the steps parameter to generate slices comprising bins current_bin - steps + 1 to current_bin + steps - 1. Each of these slices is then filtered with matched filtration using a second-derivative Gaussian as the model peak shape. After filtration, peaks are detected using a signal-to-noise ratio cut-off. For more details and illustrations see [Smith 2006].
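The filtering step can be illustrated with a small base R sketch. It is not the xcms implementation: the synthetic chromatogram, the kernel truncation at four standard deviations and the use of stats::filter() are assumptions made for this example. Only the relation sigma = fwhm / 2.3548 is taken from the MatchedFilterParam defaults.

```r
## Synthetic chromatogram: one Gaussian peak at 100 s on a flat baseline.
rt <- seq(0, 200, by = 1)
fwhm <- 30
sigma <- fwhm / 2.3548
intensity <- 50 + 1000 * exp(-(rt - 100)^2 / (2 * sigma^2))

## Model peak shape: negated second derivative of a Gaussian, truncated at
## +/- 4 sigma (assumed) and shifted to sum to zero, so that a constant
## baseline is filtered to zero.
k <- ceiling(4 * sigma)
x <- seq(-k, k)
kernel <- (1 - x^2 / sigma^2) * exp(-x^2 / (2 * sigma^2))
kernel <- kernel - mean(kernel)

## Matched filtration: convolve the trace with the (symmetric) kernel.
## Values closer than k scans to either end are NA.
filtered <- as.numeric(stats::filter(intensity, kernel, sides = 2))

## The filter response is largest where the trace matches the model peak.
rt[which.max(filtered)]
```

A peak would then be reported where the filtered signal exceeds a noise estimate by the chosen signal-to-noise threshold.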
The MatchedFilterParam class allows specifying all settings for a chromatographic peak detection using the matchedFilter method. Instances should be created with the MatchedFilterParam constructor.
The findChromPeaks,OnDiskMSnExp,MatchedFilterParam method performs peak detection using the matchedFilter algorithm on all samples from an OnDiskMSnExp object. OnDiskMSnExp objects encapsulate all experiment-specific data and load the spectra data (m/z and intensity values) on the fly from the original files, applying any pending data manipulations along the way.
binSize, binSize<-: getter and setter for the binSize slot of the object.
impute, impute<-: getter and setter for the impute slot of the object.
baseValue, baseValue<-: getter and setter for the baseValue slot of the object.
distance, distance<-: getter and setter for the distance slot of the object.
fwhm, fwhm<-: getter and setter for the fwhm slot of the object.
sigma, sigma<-: getter and setter for the sigma slot of the object.
max, max<-: getter and setter for the max slot of the object.
snthresh, snthresh<-: getter and setter for the snthresh slot of the object.
steps, steps<-: getter and setter for the steps slot of the object.
mzdiff, mzdiff<-: getter and setter for the mzdiff slot of the object.
index, index<-: getter and setter for the index slot of the object.
MatchedFilterParam(
  binSize = 0.1,
  impute = "none",
  baseValue = numeric(),
  distance = numeric(),
  fwhm = 30,
  sigma = fwhm/2.3548,
  max = 5,
  snthresh = 10,
  steps = 2,
  mzdiff = 0.8 - binSize * steps,
  index = FALSE
)

## S4 method for signature 'OnDiskMSnExp,MatchedFilterParam'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  ...
)

## S4 method for signature 'MatchedFilterParam'
binSize(object)

## S4 replacement method for signature 'MatchedFilterParam'
binSize(object) <- value

## S4 method for signature 'MatchedFilterParam'
impute(object)

## S4 replacement method for signature 'MatchedFilterParam'
impute(object) <- value

## S4 method for signature 'MatchedFilterParam'
baseValue(object)

## S4 replacement method for signature 'MatchedFilterParam'
baseValue(object) <- value

## S4 method for signature 'MatchedFilterParam'
distance(object)

## S4 replacement method for signature 'MatchedFilterParam'
distance(object) <- value

## S4 method for signature 'MatchedFilterParam'
fwhm(object)

## S4 replacement method for signature 'MatchedFilterParam'
fwhm(object) <- value

## S4 method for signature 'MatchedFilterParam'
sigma(object)

## S4 replacement method for signature 'MatchedFilterParam'
sigma(object) <- value

## S4 method for signature 'MatchedFilterParam'
max(x)

## S4 replacement method for signature 'MatchedFilterParam'
max(object) <- value

## S4 method for signature 'MatchedFilterParam'
snthresh(object)

## S4 replacement method for signature 'MatchedFilterParam'
snthresh(object) <- value

## S4 method for signature 'MatchedFilterParam'
steps(object)

## S4 replacement method for signature 'MatchedFilterParam'
steps(object) <- value

## S4 method for signature 'MatchedFilterParam'
mzdiff(object)

## S4 replacement method for signature 'MatchedFilterParam'
mzdiff(object) <- value

## S4 method for signature 'MatchedFilterParam'
index(object)

## S4 replacement method for signature 'MatchedFilterParam'
index(object) <- value
binSize |
|
impute |
Character string specifying the method to be used for missing
value imputation. Allowed values are |
baseValue |
The base value to which empty elements should be set. This
is only considered for |
distance |
For |
fwhm |
|
sigma |
|
max |
|
snthresh |
|
steps |
|
mzdiff |
|
index |
|
object |
For findChromPeaks: an OnDiskMSnExp object. For all other methods: a parameter object. |
param |
An |
BPPARAM |
A parameter class specifying if and how parallel processing
should be performed. It defaults to |
return.type |
Character specifying what type of object the method should
return. Can be either |
msLevel |
|
... |
ignored. |
value |
The value for the slot. |
x |
For |
The intensities are binned by the provided m/z values within each
spectrum (scan). Binning is performed such that the bins are centered
around the m/z values (i.e. the first bin includes all m/z values between
min(mz) - bin_size/2
and min(mz) + bin_size/2
).
For more details on binning and missing value imputation see
binYonX
and imputeLinInterpol
methods.
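The centered binning described above can be sketched in base R. This is only an illustration with made-up values, using findInterval() and tapply() as stand-ins; it does not call binYonX and makes no claim about how that function is implemented.

```r
## Made-up m/z and intensity values of a single spectrum.
mz <- c(100.02, 100.04, 100.11, 100.29, 100.31)
intensity <- c(10, 40, 25, 5, 60)
bin_size <- 0.1

## Shift the bin edges by half a bin so that the first bin covers
## min(mz) - bin_size/2 to min(mz) + bin_size/2.
breaks <- seq(min(mz) - bin_size / 2, max(mz) + bin_size, by = bin_size)
bin <- findInterval(mz, breaks)

## Keep the maximal intensity within each bin.
binned <- tapply(intensity, bin, max)
binned
```

Bin 3 receives no value in this toy spectrum; such empty bins are where the missing value imputation selected with impute would apply.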
Parallel processing (one process per sample) is supported and can
be configured either by the BPPARAM
parameter or by globally
defining the parallel processing mode using the
register
method from the BiocParallel
package.
The MatchedFilterParam function returns a MatchedFilterParam class instance with all of the settings specified for chromatographic peak detection by the matchedFilter method.
For findChromPeaks: if return.type = "XCMSnExp", an XCMSnExp object with the results of the peak detection. If return.type = "list", a list of length equal to the number of samples, with matrices specifying the identified peaks. If return.type = "xcmsSet", an xcmsSet object with the results of the peak detection.
binSize,impute,baseValue,distance,fwhm,sigma,max,snthresh,steps,mzdiff,index
See corresponding parameter above. Slot values should exclusively be accessed via the corresponding getter and setter methods listed above.
These methods and classes are part of the updated and modernized
xcms
user interface which will eventually replace the
findPeaks
methods. It supports chromatographic peak
detection on
OnDiskMSnExp
objects (defined in the
MSnbase
package). All of the settings to the matchedFilter
algorithm can be passed with a MatchedFilterParam
object.
Colin A Smith, Johannes Rainer
Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.
The do_findChromPeaks_matchedFilter core API function and findPeaks.matchedFilter for the old user interface. peaksWithMatchedFilter for functions to perform matchedFilter peak detection in purely chromatographic data.
XCMSnExp for the object containing the results of the chromatographic peak detection.
Other peak detection methods: findChromPeaks(), findChromPeaks-centWave, findChromPeaks-centWaveWithPredIsoROIs, findChromPeaks-massifquant, findPeaks-MSW
## Create a MatchedFilterParam object. Note that we use an unnecessarily large
## binSize parameter to reduce the run-time of the example.
mfp <- MatchedFilterParam(binSize = 5)
## Change snthresh parameter
snthresh(mfp) <- 15
mfp

## Perform the peak detection using matchedFilter on the files from the
## faahKO package. Files are read using readMSData from the MSnbase
## package
library(faahKO)
library(MSnbase)
fls <- dir(system.file("cdf/KO", package = "faahKO"), recursive = TRUE,
           full.names = TRUE)
raw_data <- readMSData(fls[1], mode = "onDisk")
## Perform the chromatographic peak detection using the settings defined
## above. Parallel processing can be disabled beforehand with
## register(SerialParam()).
res <- findChromPeaks(raw_data, param = mfp)
head(chromPeaks(res))
findChromPeaks
on a Chromatogram or MChromatograms object with a
CentWaveParam parameter object performs centWave-based peak detection
on purely chromatographic data. See centWave for details on the method
and CentWaveParam for details on the parameter class.
Note that not all settings from the CentWaveParam
will be used.
See peaksWithCentWave()
for the arguments used for peak detection
on purely chromatographic data.
After chromatographic peak detection, identified peaks can also be refined
with the refineChromPeaks()
method, which can help to reduce peak
detection artifacts.
## S4 method for signature 'Chromatogram,CentWaveParam'
findChromPeaks(object, param, ...)

## S4 method for signature 'MChromatograms,CentWaveParam'
findChromPeaks(object, param, BPPARAM = bpparam(), ...)

## S4 method for signature 'MChromatograms,MatchedFilterParam'
findChromPeaks(object, param, BPPARAM = BPPARAM, ...)
object |
a Chromatogram or MChromatograms object. |
param |
a CentWaveParam object specifying the settings for the
peak detection. See |
... |
currently ignored. |
BPPARAM |
a parameter class specifying if and how parallel processing
should be performed (only for |
If called on a Chromatogram
object, the method returns an XChromatogram
object with the identified peaks. See peaksWithCentWave()
for details on
the peak matrix content.
Johannes Rainer
peaksWithCentWave()
for the downstream function and centWave
for details on the method.
library(MSnbase)
## Loading a test data set with identified chromatographic peaks
faahko_sub <- loadXcmsData("faahko_sub2")
faahko_sub <- filterRt(faahko_sub, c(2500, 3700))

## Restrict to the first file and convert to an MsExperiment object
od <- as(filterFile(faahko_sub, 1L), "MsExperiment")

## Extract chromatographic data for a small m/z range
chr <- chromatogram(od, mz = c(272.1, 272.3))[1, 1]

## Identify peaks with default settings
xchr <- findChromPeaks(chr, CentWaveParam())
xchr

## Plot data and identified peaks.
plot(xchr)

library(MsExperiment)
library(xcms)
## Perform peak detection on an MChromatograms object
fls <- c(system.file("cdf/KO/ko15.CDF", package = "faahKO"),
         system.file("cdf/KO/ko16.CDF", package = "faahKO"),
         system.file("cdf/KO/ko18.CDF", package = "faahKO"))
od3 <- readMsExperiment(fls)

## Disable parallel processing for this example
register(SerialParam())

## Extract chromatograms for a m/z - retention time slice
chrs <- chromatogram(od3, mz = 344, rt = c(2500, 3500))

## Perform peak detection using CentWave
xchrs <- findChromPeaks(chrs, param = CentWaveParam())
xchrs

## Extract the identified chromatographic peaks
chromPeaks(xchrs)

## plot the result
plot(xchrs)
findChromPeaks
on a Chromatogram or MChromatograms object with a
MatchedFilterParam parameter object performs matchedFilter-based peak
detection on purely chromatographic data. See matchedFilter for details
on the method and MatchedFilterParam for details on the parameter class.
Note that not all settings from the MatchedFilterParam
will be used.
See peaksWithMatchedFilter()
for the arguments used for peak detection
on purely chromatographic data.
## S4 method for signature 'Chromatogram,MatchedFilterParam'
findChromPeaks(object, param, ...)
object |
a Chromatogram or MChromatograms object. |
param |
a MatchedFilterParam object specifying the settings for the
peak detection. See |
... |
currently ignored. |
If called on a Chromatogram
object, the method returns a matrix
with
the identified peaks. See peaksWithMatchedFilter()
for details on the
matrix content.
Johannes Rainer
peaksWithMatchedFilter()
for the downstream function and
matchedFilter for details on the method.
## Loading a test data set with identified chromatographic peaks
faahko_sub <- loadXcmsData("faahko_sub2")
faahko_sub <- filterRt(faahko_sub, c(2500, 3700))

## Restrict to the first file and convert to an MsExperiment object
od <- as(filterFile(faahko_sub, 1L), "MsExperiment")

## Extract chromatographic data for a small m/z range
chr <- chromatogram(od, mz = c(272.1, 272.3))[1, 1]

## Identify peaks with default settings
xchr <- findChromPeaks(chr, MatchedFilterParam())

## Plot the identified peaks
plot(xchr)
The findChromPeaksIsolationWindow function performs chromatographic peak detection in MS level > 1 spectra of certain isolation windows (e.g. SWATH pockets). Peak detection is performed separately for all spectra belonging to the same isolation window, and the identified peaks are added to the chromPeaks() matrix of the result object. Information about the isolation window in which they were detected is added to the chromPeakData() data frame.
Note that peak detection with this method does not remove previously identified chromatographic peaks (e.g. on MS1 level using the findChromPeaks() function) but adds newly identified peaks to the existing chromPeaks() matrix.
Isolation windows can be defined with the isolationWindow parameter, which by default uses the definition of isolationWindowTargetMz(), i.e. chromatographic peak detection is performed for all spectra with the same isolation window target m/z (separately for each file). The parameter param allows defining and configuring the peak detection algorithm (see findChromPeaks() for more information).
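The grouping described above (same isolation window target m/z, separately for each file) can be pictured with a base R sketch. The data frame spectra_info and its column names are hypothetical stand-ins for spectrum-level metadata; the code does not use xcms and only illustrates how spectra would be partitioned before peak detection runs on each group.

```r
## Hypothetical spectrum-level metadata for two files (illustration only).
spectra_info <- data.frame(
    file = c(1, 1, 1, 1, 2, 2, 2, 2),
    msLevel = c(1, 2, 2, 2, 1, 2, 2, 2),
    isolationWindowTargetMz = c(NA, 163.75, 208.95, 163.75,
                                NA, 163.75, 208.95, 208.95)
)

## Keep MS level 2 spectra and group their row indices by file and by
## isolation window target m/z.
ms2 <- which(spectra_info$msLevel == 2)
groups <- split(
    ms2,
    list(spectra_info$file[ms2],
         spectra_info$isolationWindowTargetMz[ms2]),
    drop = TRUE
)

## Number of spectra in each file/window combination.
lengths(groups)
```

Each element of groups corresponds to one set of spectra on which a separate peak detection would be run.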
findChromPeaksIsolationWindow(object, ...)

## S4 method for signature 'MsExperiment'
findChromPeaksIsolationWindow(
  object,
  param,
  msLevel = 2L,
  isolationWindow = isolationWindowTargetMz(spectra(object)),
  chunkSize = 2L,
  ...,
  BPPARAM = bpparam()
)

## S4 method for signature 'OnDiskMSnExp'
findChromPeaksIsolationWindow(
  object,
  param,
  msLevel = 2L,
  isolationWindow = isolationWindowTargetMz(object),
  ...
)
object |
|
... |
currently not used. |
param |
Peak detection parameter object, such as a
CentWaveParam object defining and configuring the chromatographic
peak detection algorithm.
See also |
msLevel |
|
isolationWindow |
|
chunkSize |
if |
BPPARAM |
if |
An XcmsExperiment
or XCMSnExp
object with the chromatographic peaks
identified in spectra of each isolation window from each file added to the
chromPeaks
matrix.
The isolation window definition for each identified peak is stored as additional columns in chromPeakData().
Johannes Rainer, Michael Witting
reconstructChromPeakSpectra()
for the function to reconstruct
MS2 spectra for each MS1 chromatographic peak.
This is a method to find a fragment mass within a ppm window in an xcmsFragment object
findMZ(object, find, ppmE=25, print=TRUE)
object |
xcmsFragment object type |
find |
The fragment ion to be found |
ppmE |
the ppm error window for searching |
print |
If we should print a nice little report |
The method simply searches for a given fragment ion in an xcmsFragment object type given a certain ppm error window
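As an illustration of what a ppm error window means, the following base R sketch matches a target m/z against a vector of fragment m/z values. It is not the findMZ implementation; the helper match_ppm, the symmetric window and the example values are assumptions made for this sketch.

```r
# Return the indices of m/z values within +/- ppm_e ppm of a target m/z.
# Hypothetical helper, not part of xcms.
match_ppm <- function(mz, find, ppm_e = 25) {
  tol <- find * ppm_e / 1e6   # absolute tolerance in m/z units
  which(abs(mz - find) <= tol)
}

fragment_mz <- c(657.3300, 657.3433, 657.3600, 658.3500)  # made-up values

# A 50 ppm window around 657.3433 is about 0.033 m/z units wide on each side.
hits <- match_ppm(fragment_mz, find = 657.3433, ppm_e = 50)
fragment_mz[hits]
```

Tightening ppm_e to 25 halves the absolute tolerance, so fewer of the made-up fragments fall inside the window.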
A data frame with the following columns:
PrecursorMz |
The precursor m/z of the fragment |
MSnParentPeakID |
An index ID of the location of the precursor peak in the xcmsFragment object |
msLevel |
The level of the found fragment ion |
rt |
the Retention time of the found ion |
mz |
the actual m/z of the found fragment ion |
intensity |
The intensity of the fragment ion |
sample |
Which sample the fragment ion came from |
GroupPeakMSn |
an ID if the peaks were grouped by an xcmsSet grouping |
CollisionEnergy |
The collision energy of the precursor scan |
H. Paul Benton, [email protected]
H. Paul Benton, D.M. Wong, S.A.Strauger, G. Siuzdak "XC"
Analytical Chemistry 2008
## Not run: 
library(msdata)
mzMLpath <- system.file("iontrap", package = "msdata")
mzMLfiles <- list.files(mzMLpath, pattern = "extracted.mzML",
                        recursive = TRUE, full.names = TRUE)
xs <- xcmsSet(mzMLfiles, method = "MS1")
## takes only one file from the file set
xfrag <- xcmsFragments(xs)
found <- findMZ(xfrag, 657.3433, 50)

## End(Not run)
This is a method to find a neutral loss within a ppm window in an xcmsFragment object
findneutral(object, find, ppmE=25, print=TRUE)
object |
xcmsFragment object type |
find |
The neutral loss to be found |
ppmE |
the ppm error window for searching |
print |
If we should print a nice little report |
The method searches for a given neutral loss in an xcmsFragment object type given a certain ppm error window. The neutral losses are generated between neighbouring ions. The resulting data frame shows the whole scan in which the neutral loss was found.
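The idea of neutral losses between neighbouring ions can be sketched in base R as follows. This is only an illustration under stated assumptions: find_neutral_loss is a hypothetical helper, the ppm tolerance is taken relative to the neutral loss mass, and the m/z values are made up. The real method may define the window differently.

```r
# Compute mass differences between neighbouring ions of one scan and report
# those matching the requested neutral loss. Hypothetical helper, not xcms.
find_neutral_loss <- function(mz, find, ppm_e = 25) {
  mz <- sort(mz)
  losses <- diff(mz)              # differences between neighbouring ions
  tol <- find * ppm_e / 1e6       # assumed: tolerance relative to the loss
  hit <- which(abs(losses - find) <= tol)
  data.frame(from_mz = mz[hit + 1], to_mz = mz[hit], loss = losses[hit])
}

scan_mz <- c(100.0000, 118.0106, 150.0000, 168.0104)  # made-up fragment ions
nl <- find_neutral_loss(scan_mz, find = 18.0106, ppm_e = 50)
nl
```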
A data frame with the following columns:
PrecursorMz |
The precursor m/z of the neutral losses |
MSnParentPeakID |
An index ID of the location of the precursor peak in the xcmsFragment object |
msLevel |
The level of the found fragment ion |
rt |
the Retention time of the found ion |
mz |
the actual m/z of the found fragment ion |
intensity |
The intensity of the fragment ion |
sample |
Which sample the fragment ion came from |
GroupPeakMSn |
an ID if the peaks were grouped by an xcmsSet grouping |
CollisionEnergy |
The collision energy of the precursor scan |
H. Paul Benton, [email protected]
H. Paul Benton, D.M. Wong, S.A.Strauger, G. Siuzdak "XC"
Analytical Chemistry 2008
## Not run: 
library(msdata)
mzMLpath <- system.file("iontrap", package = "msdata")
mzMLfiles <- list.files(mzMLpath, pattern = "extracted.mzML",
                        recursive = TRUE, full.names = TRUE)
xs <- xcmsSet(mzMLfiles, method = "MS1")
## takes only one file from the file set
xfrag <- xcmsFragments(xs)
found <- findneutral(xfrag, 58.1455, 50)

## End(Not run)
A number of peak pickers exist in XCMS. findPeaks
is the generic method.
object |
|
method |
Method to use for peak detection. See details. |
... |
Optional arguments to be passed along |
Different algorithms can be used by specifying them with the method argument. For example to use the matched filter approach described by Smith et al (2006) one would use: findPeaks(object, method="matchedFilter"). This is also the default.
Further arguments given by ... are passed through to the function implementing the method.
A character vector of nicknames for the algorithms available is returned by getOption("BioC")$xcms$findPeaks.methods.
If the nickname of a method is called "centWave", the help page for that specific method can be accessed with ?findPeaks.centWave.
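The nickname-based selection described above can be mimicked with a small self-contained base R sketch. The functions pick, pick.matchedFilter and pick.centWave are toy stand-ins invented for this illustration; they only show how a method nickname can be turned into a function name and how extra arguments are passed through.

```r
# Toy stand-ins for the real peak pickers; names and return values are
# hypothetical and exist only for this sketch.
pick.matchedFilter <- function(x, ...) "matchedFilter result"
pick.centWave <- function(x, ...) "centWave result"

pick <- function(x, method = c("matchedFilter", "centWave"), ...) {
  method <- match.arg(method)              # the first nickname is the default
  fun <- paste("pick", method, sep = ".")  # nickname -> function name
  do.call(fun, list(x, ...))               # extra arguments are passed through
}

pick(1)                       # uses the default nickname
pick(1, method = "centWave")  # selects the other implementation
```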
A matrix with columns:
mz |
weighted (by intensity) mean of peak m/z across scans |
mzmin |
m/z of minimum step |
mzmax |
m/z of maximum step |
rt |
retention time of peak midpoint |
rtmin |
leading edge of peak retention time |
rtmax |
trailing edge of peak retention time |
into |
integrated area of original (raw) peak |
maxo |
maximum intensity of original (raw) peak |
and additional columns depending on the chosen method.
findPeaks(object, ...)
findPeaks.matchedFilter
findPeaks.centWave
findPeaks.addPredictedIsotopeFeatures
findPeaks.centWaveWithPredictedIsotopeROIs
xcmsRaw-class
Performs peak detection in direct injection mass spectrometry spectra using a wavelet-based algorithm.
The MSWParam class holds all settings for a peak detection using the MSW method. Instances should be created with the MSWParam constructor.
The findChromPeaks,OnDiskMSnExp,MSWParam
method performs peak detection in single-spectrum non-chromatography MS
data using functionality from the MassSpecWavelet
package on all
samples from an OnDiskMSnExp
object.
OnDiskMSnExp objects encapsulate all experiment-specific data and load the spectra data (m/z and intensity values) on the fly from the original files, applying any pending data manipulations.
snthresh, snthresh<-: getter and setter for the snthresh slot of the object.
verboseColumns, verboseColumns<-: getter and setter for the verboseColumns slot of the object.
scales, scales<-: getter and setter for the scales slot of the object.
nearbyPeak, nearbyPeak<-: getter and setter for the nearbyPeak slot of the object.
peakScaleRange, peakScaleRange<-: getter and setter for the peakScaleRange slot of the object.
ampTh, ampTh<-: getter and setter for the ampTh slot of the object.
minNoiseLevel, minNoiseLevel<-: getter and setter for the minNoiseLevel slot of the object.
ridgeLength, ridgeLength<-: getter and setter for the ridgeLength slot of the object.
peakThr, peakThr<-: getter and setter for the peakThr slot of the object.
tuneIn, tuneIn<-: getter and setter for the tuneIn slot of the object.
addParams, addParams<-: getter and setter for the addParams slot of the object. This slot stores optional additional parameters to be passed to the identifyMajorPeaks and peakDetectionCWT functions from the MassSpecWavelet package.
MSWParam(
  snthresh = 3,
  verboseColumns = FALSE,
  scales = c(1, seq(2, 30, 2), seq(32, 64, 4)),
  nearbyPeak = TRUE,
  peakScaleRange = 5,
  ampTh = 0.01,
  minNoiseLevel = ampTh/snthresh,
  ridgeLength = 24,
  peakThr = NULL,
  tuneIn = FALSE,
  ...
)

## S4 method for signature 'OnDiskMSnExp,MSWParam'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  ...
)

## S4 method for signature 'MSWParam'
snthresh(object)

## S4 replacement method for signature 'MSWParam'
snthresh(object) <- value

## S4 method for signature 'MSWParam'
verboseColumns(object)

## S4 replacement method for signature 'MSWParam'
verboseColumns(object) <- value

## S4 method for signature 'MSWParam'
scales(object)

## S4 replacement method for signature 'MSWParam'
scales(object) <- value

## S4 method for signature 'MSWParam'
nearbyPeak(object)

## S4 replacement method for signature 'MSWParam'
nearbyPeak(object) <- value

## S4 method for signature 'MSWParam'
peakScaleRange(object)

## S4 replacement method for signature 'MSWParam'
peakScaleRange(object) <- value

## S4 method for signature 'MSWParam'
ampTh(object)

## S4 replacement method for signature 'MSWParam'
ampTh(object) <- value

## S4 method for signature 'MSWParam'
minNoiseLevel(object)

## S4 replacement method for signature 'MSWParam'
minNoiseLevel(object) <- value

## S4 method for signature 'MSWParam'
ridgeLength(object)

## S4 replacement method for signature 'MSWParam'
ridgeLength(object) <- value

## S4 method for signature 'MSWParam'
peakThr(object)

## S4 replacement method for signature 'MSWParam'
peakThr(object) <- value

## S4 method for signature 'MSWParam'
tuneIn(object)

## S4 replacement method for signature 'MSWParam'
tuneIn(object) <- value

## S4 method for signature 'MSWParam'
addParams(object)

## S4 replacement method for signature 'MSWParam'
addParams(object) <- value
snthresh |
|
verboseColumns |
|
scales |
Numeric defining the scales of the continuous wavelet transform (CWT). |
nearbyPeak |
logical(1) whether to include nearby peaks of major peaks. |
peakScaleRange |
numeric(1) defining the scale range of the peak (larger than 5 by default). |
ampTh |
numeric(1) defining the minimum required relative amplitude of the peak (ratio of the maximum of CWT coefficients). |
minNoiseLevel |
numeric(1) defining the minimum noise level used in computing the SNR. |
ridgeLength |
numeric(1) defining the minimum highest scale of the peak in 2-D CWT coefficient matrix. |
peakThr |
numeric(1) with the minimum absolute intensity
(above baseline) of peaks to be picked. If provided, the smoothing
Savitzky-Golay filter is used (in the |
tuneIn |
logical(1) whether to tune in the parameter estimation of the detected peaks. |
... |
Additional parameters to be passed to the
|
object |
For findChromPeaks: an OnDiskMSnExp object. For all other methods: a parameter object. |
param |
An |
BPPARAM |
A parameter class specifying if and how parallel processing
should be performed. It defaults to |
return.type |
Character specifying what type of object the method should
return. Can be either |
msLevel |
|
value |
The value for the slot. |
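The default values in the MSWParam usage can be evaluated directly in base R to see what they expand to. The sketch below only reproduces the default expressions shown in the usage; the variable names are chosen for this illustration.

```r
# Defaults copied from the MSWParam usage.
snthresh <- 3
amp_th <- 0.01

# Default CWT scales: 1, then 2..30 in steps of 2, then 32..64 in steps of 4.
scales <- c(1, seq(2, 30, 2), seq(32, 64, 4))

# Default minimum noise level is derived from ampTh and snthresh.
min_noise_level <- amp_th / snthresh

length(scales)
range(scales)
min_noise_level
```

Because minNoiseLevel defaults to ampTh/snthresh, changing either of those two settings in the constructor call also changes the default noise level unless minNoiseLevel is set explicitly.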
This is a wrapper for the peak picker in Bioconductor's
MassSpecWavelet
package calling
peakDetectionCWT
and
tuneInPeakInfo
functions. See the
xcmsDirect vignette for more information.
Parallel processing (one process per sample) is supported and can
be configured either by the BPPARAM
parameter or by globally
defining the parallel processing mode using the
register
method from the BiocParallel
package.
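A minimal sketch of setting the parallel processing mode globally is given below. It assumes the BiocParallel package is installed and falls back to a plain value otherwise; SerialParam is used here only because it behaves the same on every platform, not because it is a recommended setting.

```r
# Register a global default backend; findChromPeaks would pick it up through
# its BPPARAM = bpparam() default. The guard keeps the sketch runnable when
# BiocParallel is not installed.
if (requireNamespace("BiocParallel", quietly = TRUE)) {
  BiocParallel::register(BiocParallel::SerialParam())
  bp <- BiocParallel::bpparam()            # currently registered default
  n_workers <- BiocParallel::bpnworkers(bp)
} else {
  n_workers <- 1L
}
n_workers
```

Passing a parameter object through the BPPARAM argument instead would affect only that single call.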
The MSWParam
function returns a MSWParam
class instance with all of the settings specified for peak detection by
the MSW method.
For findChromPeaks
: if return.type = "XCMSnExp"
an
XCMSnExp
object with the results of the peak detection.
If return.type = "list"
a list of length equal to the number of
samples with matrices specifying the identified peaks.
If return.type = "xcmsSet"
an xcmsSet
object
with the results of the detection.
snthresh,verboseColumns,scales,nearbyPeak,peakScaleRange,ampTh,minNoiseLevel,ridgeLength,peakThr,tuneIn,addParams
See corresponding parameter above.
These methods and classes are part of the updated and modernized
xcms
user interface which will eventually replace the
findPeaks
methods. It supports peak detection on
OnDiskMSnExp
objects (defined in the MSnbase
package). All of the settings
to the algorithm can be passed with a MSWParam
object.
Joachim Kutzera, Steffen Neumann, Johannes Rainer
The do_findPeaks_MSW
core API function
and findPeaks.MSW
for the old user interface.
XCMSnExp
for the object containing the results of
the peak detection.
Other peak detection methods:
findChromPeaks()
,
findChromPeaks-centWave
,
findChromPeaks-centWaveWithPredIsoROIs
,
findChromPeaks-massifquant
,
findChromPeaks-matchedFilter
library(MSnbase)

## Create a MSWParam object
mp <- MSWParam()
## Change snthresh parameter
snthresh(mp) <- 15
mp

## Loading a small subset of direct injection, single spectrum files
library(msdata)
fticrf <- list.files(system.file("fticr-mzML", package = "msdata"),
                     recursive = TRUE, full.names = TRUE)
fticr <- readMSData(fticrf[1], msLevel. = 1, mode = "onDisk")

## Perform the MSW peak detection on these:
p <- MSWParam(scales = c(1, 7), peakThr = 80000, ampTh = 0.005,
              SNR.method = "data.mean", winSize.noise = 500)
fticr <- findChromPeaks(fticr, param = p)

head(chromPeaks(fticr))
Peak density and wavelet based feature detection aiming at isotope peaks for high resolution LC/MS data in centroid mode
object |
|
ppm |
maximal tolerated m/z deviation in consecutive scans, in ppm (parts per million) |
peakwidth |
Chromatographic peak width, given as range (min,max) in seconds |
prefilter |
|
mzCenterFun |
Function to calculate the m/z center of the feature: |
integrate |
Integration method. If |
mzdiff |
minimum difference in m/z for peaks with overlapping retention times, can be negative to allow overlap |
fitgauss |
logical, if TRUE a Gaussian is fitted to each peak |
scanrange |
scan range to process |
noise |
optional argument which is useful for data that was centroided without any intensity threshold,
centroids with intensity < |
sleep |
number of seconds to pause between plotting peak finding cycles |
verbose.columns |
logical, if TRUE additional peak meta data columns are returned |
xcmsPeaks |
peak list picked using the |
snthresh |
signal to noise ratio cutoff, definition see below. |
maxcharge |
max. number of the isotope charge. |
maxiso |
max. number of the isotope peaks to predict for each detected feature. |
mzIntervalExtension |
logical, if TRUE predicted isotope ROIs (regions of interest) are extended in the m/z dimension to increase the detection of low intensity and hence noisy peaks. |
This algorithm is most suitable for high resolution LC/{TOF,OrbiTrap,FTICR}-MS data in centroid mode. In the first phase of the method isotope ROIs (regions of interest) in the LC/MS map are predicted.
In the second phase these mass traces are further analysed.
Continuous wavelet transform (CWT) is used to locate chromatographic peaks on different scales.
The resulting peak list and the given peak list (xcmsPeaks
) are merged and redundant peaks are removed.
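How isotope positions might be predicted from a detected peak can be sketched in base R. This is an illustration, not the xcms algorithm: predict_isotope_mz is a hypothetical helper, and the spacing of about 1.0033548 (the 13C to 12C mass difference) divided by the charge is an assumption of this sketch.

```r
# Predict candidate isotope m/z values for one detected peak.
# iso_dist is the assumed mass spacing between isotope peaks.
predict_isotope_mz <- function(mz, maxcharge = 3, maxiso = 5,
                               iso_dist = 1.0033548) {
  grid <- expand.grid(isotope = seq_len(maxiso), charge = seq_len(maxcharge))
  grid$mz <- mz + grid$isotope * iso_dist / grid$charge
  grid
}

iso <- predict_isotope_mz(200)
head(iso)
```

With the defaults maxcharge = 3 and maxiso = 5 this yields 15 candidate positions per peak, each of which would define a region of interest for the second detection phase.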
A matrix with columns:
mz |
weighted (by intensity) mean of peak m/z across scans |
mzmin |
m/z peak minimum |
mzmax |
m/z peak maximum |
rt |
retention time of peak midpoint |
rtmin |
leading edge of peak retention time |
rtmax |
trailing edge of peak retention time |
into |
integrated peak intensity |
intb |
baseline corrected integrated peak intensity |
maxo |
maximum peak intensity |
sn |
Signal/Noise ratio, defined as |
egauss |
RMSE of Gaussian fit |
if verbose.columns
is TRUE
additionally :
mu |
Gaussian parameter mu |
sigma |
Gaussian parameter sigma |
h |
Gaussian parameter h |
f |
Region number of m/z ROI where the peak was localised |
dppm |
m/z deviation of mass trace across scans in ppm |
scale |
Scale on which the peak was localised |
scpos |
Peak position found by wavelet analysis |
scmin |
Left peak limit found by wavelet analysis (scan number) |
scmax |
Right peak limit found by wavelet analysis (scan number) |
findPeaks.centWave(object, ppm=25, peakwidth=c(20,50),
prefilter=c(3,100), mzCenterFun="wMean", integrate=1, mzdiff=-0.001, fitgauss=FALSE,
scanrange= numeric(), noise=0, sleep=0, verbose.columns=FALSE, xcmsPeaks, snthresh=6.25, maxcharge=3, maxiso=5, mzIntervalExtension=TRUE)
Ralf Tautenhahn
Ralf Tautenhahn, Christoph Böttcher, and Steffen Neumann "Highly sensitive feature detection for high resolution LC/MS" BMC Bioinformatics 2008, 9:504
Hendrik Treutler and Steffen Neumann. "Prediction, detection, and validation of isotope clusters in mass spectrometry data" Submitted to Metabolites 2016, Special Issue "Bioinformatics and Data Analysis"
findPeaks.centWave
findPeaks-methods
xcmsRaw-class
Peak density and wavelet based feature detection for high resolution LC/MS data in centroid mode
object |
|
ppm |
maximal tolerated m/z deviation in consecutive scans, in ppm (parts per million) |
peakwidth |
Chromatographic peak width, given as range (min,max) in seconds |
snthresh |
signal to noise ratio cutoff, definition see below. |
prefilter |
|
mzCenterFun |
Function to calculate the m/z center of the feature: |
integrate |
Integration method. If |
mzdiff |
minimum difference in m/z for peaks with overlapping retention times, can be negative to allow overlap |
fitgauss |
logical, if TRUE a Gaussian is fitted to each peak |
scanrange |
scan range to process |
noise |
optional argument which is useful for data that was centroided without any intensity threshold,
centroids with intensity < |
sleep |
number of seconds to pause between plotting peak finding cycles |
verbose.columns |
logical, if TRUE additional peak meta data columns are returned |
ROI.list |
An optional list of ROIs that represents detected mass traces (ROIs). If this list is empty (default) then centWave detects the mass trace ROIs,
otherwise this step is skipped and the supplied ROIs are used in the peak detection phase. Each ROI object in the list has the following slots:
|
firstBaselineCheck |
logical, if TRUE continuous data within ROI is checked to be above 1st baseline |
roiScales |
numeric, optional vector of scales for each ROI in |
This algorithm is most suitable for high resolution LC/{TOF,OrbiTrap,FTICR}-MS data in centroid mode. In the first phase of the method mass traces (characterised as regions with less than ppm
m/z deviation in consecutive scans) in the LC/MS map are located.
In the second phase these mass traces are further analysed.
Continuous wavelet transform (CWT) is used to locate chromatographic peaks on different scales.
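The notion of a mass trace with less than ppm m/z deviation in consecutive scans can be illustrated with a small base R sketch. It is not the centWave ROI detection, which is implemented differently; extend_trace is a hypothetical helper that compares each new centroid with the running mean m/z of the trace, which is an assumption of this sketch.

```r
# Extend a mass trace scan by scan while the next centroid stays within
# `ppm` of the running mean m/z of the trace. Hypothetical helper.
extend_trace <- function(mz_by_scan, ppm = 25) {
  trace <- mz_by_scan[1]
  for (mz in mz_by_scan[-1]) {
    centre <- mean(trace)
    if (abs(mz - centre) / centre * 1e6 > ppm) break  # deviation too large
    trace <- c(trace, mz)
  }
  trace
}

# Made-up centroids from consecutive scans; the fourth deviates by ~75 ppm.
centroids <- c(500.0000, 500.0050, 500.0020, 500.0400, 500.0010)
trace <- extend_trace(centroids, ppm = 25)
trace
```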
A matrix with columns:
mz |
weighted (by intensity) mean of peak m/z across scans |
mzmin |
m/z peak minimum |
mzmax |
m/z peak maximum |
rt |
retention time of peak midpoint |
rtmin |
leading edge of peak retention time |
rtmax |
trailing edge of peak retention time |
into |
integrated peak intensity |
intb |
baseline corrected integrated peak intensity |
maxo |
maximum peak intensity |
sn |
Signal/Noise ratio, defined as |
egauss |
RMSE of Gaussian fit |
if verbose.columns
is TRUE
additionally :
mu |
Gaussian parameter mu |
sigma |
Gaussian parameter sigma |
h |
Gaussian parameter h |
f |
Region number of m/z ROI where the peak was localised |
dppm |
m/z deviation of mass trace across scans in ppm |
scale |
Scale on which the peak was localised |
scpos |
Peak position found by wavelet analysis |
scmin |
Left peak limit found by wavelet analysis (scan number) |
scmax |
Right peak limit found by wavelet analysis (scan number) |
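Several of the columns listed above are simple summaries of the centroids that make up a peak. The following base R sketch computes a few of them for made-up data; it covers only the columns whose definition is stated in the table (intensity-weighted mean m/z, m/z and retention time limits, maximum intensity) and makes no claim about how xcms computes the remaining ones.

```r
# Made-up centroids belonging to one chromatographic peak.
peak <- data.frame(
  rt        = c(100, 101, 102, 103, 104),
  mz        = c(300.001, 300.002, 300.000, 300.001, 300.003),
  intensity = c(100, 400, 1000, 400, 100)
)

summary_row <- c(
  mz    = weighted.mean(peak$mz, peak$intensity),  # weighted by intensity
  mzmin = min(peak$mz),
  mzmax = max(peak$mz),
  rtmin = min(peak$rt),
  rtmax = max(peak$rt),
  maxo  = max(peak$intensity)
)
summary_row
```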
findPeaks.centWave(object, ppm=25, peakwidth=c(20,50), snthresh=10,
prefilter=c(3,100), mzCenterFun="wMean", integrate=1, mzdiff=-0.001, fitgauss=FALSE,
scanrange= numeric(), noise=0, sleep=0, verbose.columns=FALSE, ROI.list=list(),
firstBaselineCheck=TRUE, roiScales=NULL)
Ralf Tautenhahn
Ralf Tautenhahn, Christoph Böttcher, and Steffen Neumann "Highly sensitive feature detection for high resolution LC/MS" BMC Bioinformatics 2008, 9:504
centWave
for the new user interface.
findPeaks-methods
xcmsRaw-class
Peak density and wavelet based feature detection for high resolution LC/MS data in centroid mode with additional peak picking of isotope features on basis of isotope peak predictions
object |
|
ppm |
maximal tolerated m/z deviation in consecutive scans, in ppm (parts per million) |
peakwidth |
Chromatographic peak width, given as range (min,max) in seconds |
snthresh |
signal to noise ratio cutoff, definition see below. |
prefilter |
|
mzCenterFun |
Function to calculate the m/z center of the feature: |
integrate |
Integration method. If |
mzdiff |
minimum difference in m/z for peaks with overlapping retention times, can be negative to allow overlap |
fitgauss |
logical, if TRUE a Gaussian is fitted to each peak |
scanrange |
scan range to process |
noise |
optional argument which is useful for data that was centroided without any intensity threshold,
centroids with intensity < |
sleep |
number of seconds to pause between plotting peak finding cycles |
verbose.columns |
logical, if TRUE additional peak meta data columns are returned |
ROI.list |
An optional list of ROIs that represents detected mass traces (ROIs). If this list is empty (default) then centWave detects the mass trace ROIs,
otherwise this step is skipped and the supplied ROIs are used in the peak detection phase. Each ROI object in the list has the following slots:
|
firstBaselineCheck |
logical, if TRUE continuous data within ROI is checked to be above 1st baseline |
roiScales |
numeric, optional vector of scales for each ROI in |
snthreshIsoROIs |
signal to noise ratio cutoff for predicted isotope ROIs, definition see below. |
maxcharge |
max. number of the isotope charge. |
maxiso |
max. number of the isotope peaks to predict for each detected feature. |
mzIntervalExtension |
logical, if TRUE predicted isotope ROIs (regions of interest) are extended in the m/z dimension to increase the detection of low intensity and hence noisy peaks. |
This algorithm is most suitable for high resolution LC/{TOF,OrbiTrap,FTICR}-MS data in centroid mode.
The centWave
algorithm is applied in two peak picking steps as follows. In the first peak picking step ROIs (regions of interest, characterised as regions with less than ppm
m/z deviation in consecutive scans) in the LC/MS map are located and further analysed using continuous wavelet transform (CWT) for the localization of chromatographic peaks on different scales.
In the second peak picking step isotope ROIs in the LC/MS map are predicted and further analysed using continuous wavelet transform (CWT) for the localization of chromatographic peaks on different scales.
The peak lists resulting from both peak picking steps are merged and redundant peaks are removed.
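Merging two peak lists and dropping redundant entries can be sketched in base R as follows. The redundancy criterion used here (a peak is dropped when both its m/z range and its retention time range overlap an already kept peak) is an assumption made for this illustration; the criterion actually applied by xcms is not described here and may differ. merge_peaks is a hypothetical helper.

```r
# Combine two peak matrices and keep a peak only if it does not overlap,
# in both m/z and retention time, with a peak that was already kept.
merge_peaks <- function(a, b) {
  all_peaks <- rbind(a, b)
  keep <- rep(TRUE, nrow(all_peaks))
  for (i in seq_len(nrow(all_peaks))[-1]) {
    idx <- which(keep[seq_len(i - 1)])
    prev <- all_peaks[idx, , drop = FALSE]
    overlap <- prev[, "mzmin"] <= all_peaks[i, "mzmax"] &
      prev[, "mzmax"] >= all_peaks[i, "mzmin"] &
      prev[, "rtmin"] <= all_peaks[i, "rtmax"] &
      prev[, "rtmax"] >= all_peaks[i, "rtmin"]
    keep[i] <- !any(overlap)
  }
  all_peaks[keep, , drop = FALSE]
}

# Made-up results of the first and second peak picking step.
step1 <- rbind(c(mzmin = 200.000, mzmax = 200.010, rtmin = 50, rtmax = 60),
               c(mzmin = 300.000, mzmax = 300.010, rtmin = 70, rtmax = 80))
step2 <- rbind(c(mzmin = 200.005, mzmax = 200.012, rtmin = 52, rtmax = 58),
               c(mzmin = 201.003, mzmax = 201.010, rtmin = 50, rtmax = 60))

merged <- merge_peaks(step1, step2)
merged
```

In this toy input the first peak of step2 overlaps the first peak of step1 and is dropped, while the predicted isotope peak near m/z 201 is kept.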
A matrix with columns:
mz |
weighted (by intensity) mean of peak m/z across scans |
mzmin |
m/z peak minimum |
mzmax |
m/z peak maximum |
rt |
retention time of peak midpoint |
rtmin |
leading edge of peak retention time |
rtmax |
trailing edge of peak retention time |
into |
integrated peak intensity |
intb |
baseline corrected integrated peak intensity |
maxo |
maximum peak intensity |
sn |
Signal/Noise ratio, defined as |
egauss |
RMSE of Gaussian fit |
if verbose.columns
is TRUE
additionally :
mu |
Gaussian parameter mu |
sigma |
Gaussian parameter sigma |
h |
Gaussian parameter h |
f |
Region number of m/z ROI where the peak was localised |
dppm |
m/z deviation of mass trace across scans in ppm |
scale |
Scale on which the peak was localised |
scpos |
Peak position found by wavelet analysis |
scmin |
Left peak limit found by wavelet analysis (scan number) |
scmax |
Right peak limit found by wavelet analysis (scan number) |
findPeaks.centWaveWithPredictedIsotopeROIs(object, ppm=25, peakwidth=c(20,50), snthresh=10,
prefilter=c(3,100), mzCenterFun="wMean", integrate=1, mzdiff=-0.001, fitgauss=FALSE,
scanrange= numeric(), noise=0, sleep=0, verbose.columns=FALSE, ROI.list=list(),
firstBaselineCheck=TRUE, roiScales=NULL, snthreshIsoROIs=6.25, maxcharge=3, maxiso=5, mzIntervalExtension=TRUE)
Ralf Tautenhahn
Ralf Tautenhahn, Christoph Böttcher, and Steffen Neumann "Highly sensitive feature detection for high resolution LC/MS" BMC Bioinformatics 2008, 9:504
Hendrik Treutler and Steffen Neumann. "Prediction, detection, and validation of isotope clusters in mass spectrometry data" Submitted to Metabolites 2016, Special Issue "Bioinformatics and Data Analysis"
do_findChromPeaks_centWaveWithPredIsoROIs
for the
corresponding core API function.
findPeaks.addPredictedIsotopeFeatures
findPeaks.centWave
findPeaks-methods
xcmsRaw-class
Massifquant is a Kalman filter (KF) based feature detection for XC-MS data in centroid mode (currently in experimental stage). Optionally allows for calling the method "centWave" on features discovered by Massifquant to further refine the feature detection; to do so, supply any additional parameters specific to centWave (even more experimental). The method may be conveniently called through the xcmsSet(...) method.
The following arguments are specific to Massifquant. Any additional arguments supplied must correspond as specified by the method findPeaks.centWave.
object |
An xcmsRaw object. |
criticalValue |
Numeric: Suggested values: (0.1-3.0). This setting helps determine the Kalman Filter prediction margin of error. A real centroid belonging to a bona fide feature must fall within the KF prediction margin of error. Much like in the construction of a confidence interval, criticalValue loosely translates to a multiplier of the standard error of the prediction reported by the Kalman Filter. If the features in the XC-MS sample have a small mass deviance in ppm error, a smaller critical value might be better and vice versa. |
consecMissedLimit |
Integer: Suggested values: (1,2,3). While a feature is in the process of being detected by a Kalman Filter, the Kalman Filter may not find a predicted centroid in every scan. After 1 or more consecutive failed predictions, this setting informs Massifquant when to stop a Kalman Filter from following a candidate feature. |
prefilter |
Numeric Vector: (Positive Integer, Positive Numeric): The first argument is only used if (withWave = 1); see centWave for details. The second argument specifies the minimum threshold for the maximum intensity of a feature that must be met. |
peakwidth |
Integer Vector: (Positive Integer, Positive Integer): Only the first argument is used for Massifquant, which specifies the minimum feature length in time scans. If centWave is used, then the second argument is the maximum feature length subject to being greater than the minimum feature length. |
ppm |
The minimum estimated parts per million mass resolution a feature must possess. |
unions |
Integer: set to 1 to apply the t-test union on segmentation; set to 0 if no t-test is to be applied on chromatographically continuous features sharing the same m/z range. Explanation: With very few data points, sometimes a Kalman Filter stops tracking a feature prematurely. Another Kalman Filter is instantiated and begins following the rest of the signal. Because tracking is done backwards to forwards, this algorithmic defect leaves a real feature divided into two segments or more. With this option turned on, the program identifies segmented features and combines them (merges them) into one with a two sample t-test. The potential danger of this option is that some truly distinct features may be merged. |
withWave |
Integer: set to 1 if turned on; set to 0 if turned off. Allows the user to find features first with Massifquant and then filter those features with the second phase of centWave, which includes wavelet estimation. |
checkBack |
Integer: set to 1 if turned on; set to 0 if turned off. The convergence of a Kalman Filter to a feature's precise m/z mapping is very fast, but sometimes it incorporates erroneous centroids as part of a feature (especially early on). The "scanBack" option is an attempt to remove the occasional outlier that lies beyond the converged bounds of the Kalman Filter. The option does not directly affect identification of a feature because it is a postprocessing measure; it has not been shown to be extremely useful thus far and is turned off by default. |
This algorithm's performance has been tested rigorously on high resolution LC/{OrbiTrap, TOF}-MS data in centroid mode. Simultaneous Kalman filters identify features and calculate their area under the curve. The default parameters are set to operate on a complex LC-MS Orbitrap sample. Users will find it useful to do some simple exploratory data analysis to find out where to set a minimum intensity, and identify how many scans an average feature spans. The "consecMissedLimit" parameter has yielded good performance on Orbitrap data when set to (2) and on TOF data it was found best to be at (1). This may change as the algorithm has yet to be tested on many samples. The "criticalValue" parameter is perhaps most difficult to dial in appropriately and visual inspection of peak identification is the best suggested tool for quick optimization. The "ppm" and "checkBack" parameters have shown less influence than the other parameters and exist to give users flexibility and better accuracy.
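To make the roles of criticalValue and consecMissedLimit more concrete, here is a toy scalar Kalman filter in base R that follows a single m/z trace across scans. It is not the Massifquant implementation: track_mz is a hypothetical helper, the constant m/z model and the noise settings q and r are invented for this sketch, and stopping once the number of consecutive misses reaches the limit is an assumed interpretation of consecMissedLimit.

```r
# Toy scalar Kalman filter tracking one m/z trace (first value must not be NA).
# q: assumed process noise variance, r: assumed measurement noise variance.
track_mz <- function(mz_obs, critical_value = 1.125, consec_missed_limit = 2,
                     q = 1e-8, r = 1e-6) {
  x <- mz_obs[1]     # current m/z estimate
  p <- r             # variance of the estimate
  accepted <- 1L     # scans contributing to the feature
  missed <- 0L       # consecutive scans without an accepted centroid
  for (i in seq_along(mz_obs)[-1]) {
    p_pred <- p + q                                # predict (constant m/z)
    margin <- critical_value * sqrt(p_pred + r)    # prediction margin of error
    z <- mz_obs[i]
    if (!is.na(z) && abs(z - x) <= margin) {
      k <- p_pred / (p_pred + r)                   # Kalman gain
      x <- x + k * (z - x)                         # update the estimate
      p <- (1 - k) * p_pred
      accepted <- c(accepted, i)
      missed <- 0L
    } else {
      p <- p_pred
      missed <- missed + 1L
      if (missed >= consec_missed_limit) break     # stop following the trace
    }
  }
  list(mz = x, scans = accepted)
}

# Made-up centroids: one missing scan, then an unrelated ion at m/z 305.
mz_obs <- c(300.0000, 300.0005, 300.0003, NA, 300.0004, 305, 305, 305)
res <- track_mz(mz_obs)
res
```

A larger critical_value widens the acceptance margin, and a larger consec_missed_limit lets the filter bridge longer gaps before it gives up on the candidate feature.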
If the method findPeaks.massifquant(...) is used, then a matrix is returned with rows corresponding to features, and properties of the features listed with the following column names. Otherwise, if centWave feature is used also (withWave = 1), or Massifquant is called through the xcmsSet(...) method, then their corresponding return values are used.
mz |
weighted m/z mean (weighted by intensity) of the feature |
mzmin |
m/z lower boundary of the feature |
mzmax |
m/z upper boundary of the feature |
rtmin |
starting scan time of the feature |
rtmax |
ending scan time of the feature |
into |
the raw quantitation (area under the curve) of the feature. |
area |
feature area that is not normalized by the scan rate. |
findPeaks.massifquant(object, ppm=10, peakwidth=c(20,50), snthresh=10,
prefilter=c(3,100), mzCenterFun="wMean", integrate=1, mzdiff=-0.001,
fitgauss=FALSE, scanrange= numeric(), noise=0,
sleep=0, verbose.columns=FALSE, criticalValue = 1.125, consecMissedLimit = 2,
unions = 1, checkBack = 0, withWave = 0)
Christopher Conley
Submitted for review. Christopher Conley, Ralf J. O. Torgrip, Ryan Taylor, and John T. Prince. "Massifquant: open-source Kalman filter based XC-MS feature detection". August 2013.
centWave
for the new user interface.
findPeaks-methods
xcmsSet
xcmsRaw
xcmsRaw-class
library(faahKO)
library(xcms)
## Load all the wild type and knock out samples
cdfpath <- system.file("cdf", package = "faahKO")
## Subset to only the first 2 files.
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)[1:2]

## Run the massifquant analysis. Setting the noise level to 10000 to speed up
## execution of the examples - in a real use case it should be set to a
## reasonable value.
xset <- xcmsSet(cdffiles, method = "massifquant",
                consecMissedLimit = 1,
                snthresh = 10,
                criticalValue = 1.73,
                ppm = 10,
                peakwidth = c(30, 60),
                prefilter = c(1, 3000),
                noise = 10000,
                withWave = 0)
Find peaks in the chromatographic time domain of the
profile matrix. For more details see
do_findChromPeaks_matchedFilter
.
## S4 method for signature 'xcmsRaw'
findPeaks.matchedFilter(
  object,
  fwhm = 30,
  sigma = fwhm/2.3548,
  max = 5,
  snthresh = 10,
  step = 0.1,
  steps = 2,
  mzdiff = 0.8 - step * steps,
  index = FALSE,
  sleep = 0,
  scanrange = numeric()
)
object |
The |
fwhm |
|
sigma |
|
max |
|
snthresh |
|
step |
numeric(1) specifying the width of the bins/slices in m/z dimension. |
steps |
|
mzdiff |
|
index |
|
sleep |
(DEPRECATED). The use of this parameter is highly discouraged, as it could cause problems in parallel processing mode. |
scanrange |
Numeric vector defining the range of scans to which the
original |
A matrix, each row representing an identified chromatographic peak, with columns:
Intensity weighted mean of m/z values of the peak across scans.
Minimum m/z of the peak.
Maximum m/z of the peak.
Retention time of the peak's midpoint.
Minimum retention time of the peak.
Maximum retention time of the peak.
Integrated (original) intensity of the peak.
Integrated intensity of the filtered peak.
Maximum intensity of the peak.
Maximum intensity of the filtered peak.
Rank of peak in merged EIC (<= max
).
Signal to noise ratio of the peak.
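The default arguments in the usage section are linked to each other: `sigma` is derived from `fwhm`, and `mzdiff` from `step` and `steps`. The short base R sketch below only reproduces this arithmetic; the constant 2.3548 is the usual conversion factor between the full width at half maximum and the standard deviation of a Gaussian.

```r
## Default arguments as listed in the usage section.
fwhm  <- 30
step  <- 0.1
steps <- 2

## For a Gaussian, FWHM = 2 * sqrt(2 * log(2)) * sigma, hence the factor 2.3548.
fwhm_factor <- 2 * sqrt(2 * log(2))
sigma <- fwhm / 2.3548

## Default mzdiff follows from step and steps.
mzdiff <- 0.8 - step * steps

c(fwhm_factor = fwhm_factor, sigma = sigma, mzdiff = mzdiff)
```

Changing `step` or `steps` therefore also changes the default `mzdiff` unless it is set explicitly.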
Colin A. Smith
Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.
matchedFilter
for the new user interface.
xcmsRaw
,
do_findChromPeaks_matchedFilter
for the core function
performing the peak detection.
Collect tandem MS or MS$^n$ precursor peaks as annotated in the XML raw file.
object |
|
Some mass spectrometers can acquire MS1 and MS2 (or MS$^n$ scans) quasi simultaneously, e.g. in data dependent tandem MS or DDIT mode.
Since xcmsFragments attaches all MS$^n$ peaks to MS1 peaks in xcmsSet, it is important that findPeaks and xcmsSet do not miss any MS1 precursor peak.
To be sure that all MS1 precursor peaks are in an xcmsSet, findPeaks.MS1 does not do an actual peak picking, but simply uses the annotation stored in mzXML, mzData or mzML raw files.
This relies on the following XML tags:
mzData:
<spectrum id="463">
<spectrumInstrument msLevel="2">
<cvParam cvLabel="psi" accession="PSI:1000039" name="TimeInSeconds" value="92.7743"/>
</spectrumInstrument>
<precursor msLevel="1" spectrumRef="461">
<cvParam cvLabel="psi" accession="PSI:1000040" name="MassToChargeRatio" value="462.091"/>
<cvParam cvLabel="psi" accession="PSI:1000042" name="Intensity" value="366.674"/>
</precursor>
</spectrum>
mzXML:
<scan num="17" msLevel="2" retentionTime="PT1.5224S">
<precursorMz precursorIntensity="125245">220.1828003</precursorMz>
</scan>
Several mzXML and mzData converters are known to create incomplete files, either without intensities (these will be set to 0) or without the precursor retention time (a reasonably close rt should then be chosen; this is not yet implemented).
A matrix with columns:
mz , mzmin , mzmax
|
annotated MS1 precursor selection mass |
rt , rtmin , rtmax
|
annotated MS1 precursor retention time |
into , maxo , sn
|
annotated MS1 precursor intensity |
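To make the mapping from annotation to result columns concrete, the following base R sketch extracts the precursor information from the mzXML fragment shown above and arranges it in the described columns. It is an illustration only: `capture` is a hypothetical helper, and regular expressions stand in for the XML parsing that findPeaks.MS1 actually relies on.

```r
## The mzXML fragment from the text above, stored as a single string.
scan_xml <- paste0(
    '<scan num="17" msLevel="2" retentionTime="PT1.5224S">',
    '<precursorMz precursorIntensity="125245">220.1828003</precursorMz>',
    '</scan>')

## Hypothetical helper: return the first capture group of `pattern` in `x`.
capture <- function(pattern, x) {
    m <- regmatches(x, regexec(pattern, x))[[1]]
    if (length(m) < 2) NA_character_ else m[2]
}

rt   <- as.numeric(capture('retentionTime="PT([0-9.]+)S"', scan_xml))
mz   <- as.numeric(capture('<precursorMz[^>]*>([0-9.]+)</precursorMz>', scan_xml))
into <- as.numeric(capture('precursorIntensity="([0-9.]+)"', scan_xml))

## Missing intensities are set to 0, as described above.
if (is.na(into)) into <- 0

## One row of the result: each annotated value fills three columns.
ms1_peak <- c(mz = mz, mzmin = mz, mzmax = mz,
              rt = rt, rtmin = rt, rtmax = rt,
              into = into, maxo = into, sn = into)
ms1_peak
```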
findPeaks.MS1(object)
Steffen Neumann, [email protected]
findPeaks-methods
xcmsRaw-class
This method performs peak detection in direct injection mass spectrometry spectra using a wavelet-based algorithm.
## S4 method for signature 'xcmsRaw'
findPeaks.MSW(object, snthresh = 3, verbose.columns = FALSE, ...)
object |
The |
snthresh |
|
verbose.columns |
Logical whether additional peak meta data columns should be returned. |
... |
Additional parameters to be passed to the
|
This is a wrapper around the peak picker in Bioconductor's
MassSpecWavelet
package, calling the
peakDetectionCWT
and
tuneInPeakInfo
functions.
A matrix, each row representing an identified peak, with columns:
m/z value of the peak at the centroid position.
Minimum m/z of the peak.
Maximum m/z of the peak.
Always -1
.
Always -1
.
Always -1
.
Integrated (original) intensity of the peak.
Maximum intensity of the peak.
Always NA
.
Maximum MSW-filter response of the peak.
Signal to noise ratio.
Joachim Kutzera, Steffen Neumann, Johannes Rainer
MSW
for the new user interface,
do_findPeaks_MSW
for the downstream analysis
function or peakDetectionCWT
from the
MassSpecWavelet
package for details on the algorithm and additionally
supported parameters.
The GenericParam
class allows storing generic parameter
information such as the name of the function that was/has to be called
(slot fun
) and its arguments (slot args
). This object is
used to track the process history of the data processing steps of an
XCMSnExp
object. This is in contrast to e.g. the
CentWaveParam
object that is passed to the actual
processing method.
GenericParam(fun = character(), args = list())
fun |
|
args |
|
The GenericParam
function returns a GenericParam
object.
fun
character
specifying the function name.
args
list
(ideally named) with the arguments to the
function.
Johannes Rainer
processHistory
for how to access the process history
of an XCMSnExp
object.
prm <- GenericParam(fun = "mean")

prm <- GenericParam(fun = "mean", args = list(na.rm = TRUE))
Generate multiple extracted ion chromatograms for m/z values of
interest. For xcmsSet
objects, reread original raw data
and apply precomputed retention time correction, if applicable.
Note that this method always returns profile data, not raw data (profile data being the data binned along the m/z dimension). See details for further information.
object |
the |
mzrange |
Either a two column matrix with minimum or maximum m/z or a
matrix of any dimensions containing columns For |
rtrange |
A two column matrix the same size as For |
step |
step (bin) size to use for profile generation. Note that a
value of |
groupidx |
either character vector with names or integer vector with indices of peak groups for which to get EICs |
sampleidx |
either character vector with names or integer vector with indices of samples for which to get EICs |
rt |
|
In contrast to the rawEIC
method, that extracts the
actual raw values, this method extracts them from the object's profile
matrix (or if the provided step
argument does not match the
profStep
of the object
the profile matrix is calculated
on the fly and the values returned).
For xcmsSet
and xcmsRaw
objects, an xcmsEIC
object.
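The difference between raw values and profile data can be sketched in base R as follows. The m/z and intensity values are invented, and keeping the maximum intensity per bin is an assumption made for this illustration; the profile generation method actually used depends on the object's settings.

```r
## Raw data points of one scan (values invented for the example).
mz        <- c(100.02, 100.04, 100.13, 100.31)
intensity <- c(10, 30, 5, 20)

## Bin the m/z dimension with a step (bin) size of 0.1.
step   <- 0.1
breaks <- seq(100, 100.4, by = step)
bin    <- findInterval(mz, breaks)

## One profile value per bin; here the maximum intensity within the bin.
profile <- tapply(intensity, factor(bin, levels = seq_along(breaks[-1])), max)
## Bins without any data point are reported as 0.
profile[is.na(profile)] <- 0
profile
```

The two data points falling into the first bin are collapsed into a single profile value, which is why profile-based EICs can differ from those of rawEIC.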
getEIC(object, mzrange, rtrange = NULL, step = 0.1)
getEIC(object, mzrange, rtrange = 200, groupidx,
sampleidx = sampnames(object), rt = c("corrected", "raw"))
xcmsRaw-class
,
xcmsSet-class
,
xcmsEIC-class
,
rawEIC
Integrate extracted ion chromatograms in pre-defined
regions. Return output similar to findPeaks
.
object |
the |
peakrange |
matrix or data frame with 4 columns: |
step |
step size to use for profile generation |
A matrix with columns:
i |
rank of peak identified in merged EIC (<= |
mz |
weighted (by intensity) mean of peak m/z across scans |
mzmin |
m/z of minimum step |
mzmax |
m/z of maximum step |
ret |
retention time of peak midpoint |
retmin |
leading edge of peak retention time |
retmax |
trailing edge of peak retention time |
into |
integrated area of original (raw) peak |
intf |
integrated area of filtered peak, always |
maxo |
maximum intensity of original (raw) peak |
maxf |
maximum intensity of filtered peak, always |
getPeaks(object, peakrange, step = 0.1)
Return the data from a single mass scan using the numeric index of the scan as a reference.
object |
the |
scan |
integer index of the scan. If negative, the index is counted from the end |
mzrange |
limit data points returned to those in the range,
|
A matrix with two columns:
mz |
m/z values |
intensity |
intensity values |
getScan(object, scan, mzrange = numeric())
getMsnScan(object, scan, mzrange = numeric())
Return full-resolution averaged data from multiple mass scans.
object |
the |
... |
arguments passed to |
Based on the mass points from the selected spectra, a master list of unique masses is generated. Each spectrum is interpolated at those masses and the interpolated spectra are then averaged.
A matrix with two columns:
mz |
m/z values |
intensity |
intensity values |
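The averaging described in the details can be sketched in base R as below. The two spectra are invented, and linear interpolation with `approx` is an assumption made for this illustration rather than a statement about the exact interpolation used by getSpec.

```r
## Two hypothetical spectra with slightly different m/z sampling.
spectra <- list(
    cbind(mz = c(100, 101, 102),   intensity = c(0, 10, 0)),
    cbind(mz = c(100, 101.5, 102), intensity = c(0, 6, 0)))

## Master list of unique masses across all selected spectra.
master_mz <- sort(unique(unlist(lapply(spectra, function(s) s[, "mz"]))))

## Interpolate every spectrum at the master masses (one column per spectrum).
interp <- sapply(spectra, function(s)
    approx(s[, "mz"], s[, "intensity"], xout = master_mz)$y)

## Average the interpolated spectra.
avg <- cbind(mz = master_mz, intensity = rowMeans(interp))
avg
```

The result has one row per master mass, so it keeps the full m/z resolution of the input spectra instead of binning them.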
getSpec(object, ...)
xcmsRaw-class
,
profRange
,
getScan
Reads the raw data, applies any retention time corrections and
Waters lock mass correction, and
returns it as an xcmsRaw
object (or list of xcmsRaw
objects) for one or more files of the xcmsSet
object.
object |
the |
sampleidx |
The index of the sample for which the raw data should be returned. Can be a single number or a numeric vector with the indices. Alternatively, the file name can be specified. |
profmethod |
The profile method. |
profstep |
The profile step. |
rt |
Whether corrected or raw retention times should be returned. |
... |
Additional arguments submitted to the |
A single xcmsRaw
object or a list of xcmsRaw
objects.
getXcmsRaw(object, sampleidx=1,
profmethod=profinfo(object)$method, profstep=profinfo(object)$step,
rt=c("corrected", "raw"), ...
)
Johannes Rainer, [email protected]
A number of grouping (or alignment) methods exist in XCMS. group
is the generic method.
object |
|
method |
Method to use for grouping. See details. |
... |
Optional arguments to be passed along |
Different algorithms can be used by specifying them with the
method
argument. For example to use the density-based
approach described by Smith et al (2006) one would use:
group(object, method="density")
. This is also the default.
Further arguments given by ...
are
passed through to the function implementing
the method
.
A character vector of nicknames for the
algorithms available is returned by
getOption("BioC")$xcms$group.methods
.
If the nickname of a method is called "mzClust",
the help page for that specific method can
be accessed with ?group.mzClust
.
An xcmsSet
object with peak group assignments and statistics.
group(object, ...)
group.density
group.mzClust
group.nearest
xcmsSet-class
Group peaks together across samples using overlapping m/z bins and calculation of smoothed peak distributions in chromatographic time.
object |
the |
minfrac |
minimum fraction of samples necessary in at least one of the sample groups for it to be a valid group |
minsamp |
minimum number of samples necessary in at least one of the sample groups for it to be a valid group |
bw |
bandwidth (standard deviation or half width at half maximum) of the Gaussian smoothing kernel to apply to the peak density chromatogram |
mzwid |
width of overlapping m/z slices to use for creating peak density chromatograms and grouping peaks across samples |
max |
maximum number of groups to identify in a single m/z slice |
sleep |
seconds to pause between plotting successive steps of the peak grouping algorithm. Peaks are plotted as points showing relative intensity. Identified groups are flanked by dotted vertical lines. |
An xcmsSet
object with peak group assignments and statistics.
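The idea of grouping peaks by their smoothed density along retention time can be sketched with base R's `density` function. This is a simplified illustration for a single m/z slice with invented retention times; `descend` is a hypothetical helper, and the sample-based `minfrac`/`minsamp` checks are left out.

```r
## Retention times (seconds) of peaks from all samples within one m/z slice
## (values invented for the example).
rt <- c(2500, 2505, 2510, 2800, 2804)
bw <- 30

## Smoothed peak density along retention time (Gaussian kernel, sd = bw).
den <- density(rt, bw = bw, from = min(rt) - 3 * bw, to = max(rt) + 3 * bw,
               n = 512)

## Hypothetical helper: walk downhill from index i to both neighbouring minima.
descend <- function(y, i) {
    l <- i
    while (l > 1 && y[l - 1] < y[l]) l <- l - 1
    r <- i
    while (r < length(y) && y[r + 1] < y[r]) r <- r + 1
    c(l, r)
}

## Peaks between the minima around the highest density maximum
## form one candidate group.
bounds   <- den$x[descend(den$y, which.max(den$y))]
in_group <- rt >= bounds[1] & rt <= bounds[2]
in_group
```

A larger `bw` smooths the density more strongly, so peaks that are further apart in retention time end up under the same density maximum.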
group(object, bw = 30, minfrac = 0.5, minsamp = 1,
mzwid = 0.25, max = 50, sleep = 0)
do_groupChromPeaks_density
for the core API function
performing the analysis.
xcmsSet-class
,
density
Runs high resolution alignment on single spectra samples stored in a given xcmsSet.
object |
an xcmsSet with peaks |
mzppm |
the relative error used for clustering/grouping in ppm (parts per million) |
mzabs |
the absolute error used for clustering/grouping |
minsamp |
set the minimum number of samples in one bin |
minfrac |
set the minimum fraction of each class in one bin |
Returns an xcmsSet with the slots groups and groupidx set.
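Since `mzppm` is a relative error, the absolute m/z tolerance it corresponds to grows with m/z. The helper below (`ppm_to_da`, a hypothetical name) only shows this conversion; how mzClust combines `mzppm` and `mzabs` internally is not covered here.

```r
## Hypothetical helper: absolute m/z tolerance for a relative error in ppm.
ppm_to_da <- function(mz, ppm) mz * ppm / 1e6

## With the default mzppm = 20, the tolerance scales with m/z.
tol <- ppm_to_da(c(100, 500, 1000), ppm = 20)
tol
```

At m/z 500, for example, 20 ppm corresponds to an absolute difference of 0.01.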
group(object, method="mzClust", mzppm = 20, mzabs = 0, minsamp = 1, minfrac=0)
Saira A. Kazmi, Samiran Ghosh, Dong-Guk Shin,
Dennis W. Hill and David F. Grant
Alignment of high resolution mass spectra: development of a heuristic
approach for metabolomics.
Metabolomics, Vol. 2, No. 2, 75-83 (2006)
## Not run: 
library(msdata)
mzMLpath <- system.file("fticr-mzML", package = "msdata")
mzMLfiles <- list.files(mzMLpath, recursive = TRUE, full.names = TRUE)

xs <- xcmsSet(method = "MSW", files = mzMLfiles, scales = c(1, 7),
              SNR.method = 'data.mean', winSize.noise = 500,
              peakThr = 80000, amp.Th = 0.005)

xsg <- group(xs, method = "mzClust")

## End(Not run)
Group peaks together across samples by creating a master peak list and assigning corresponding peaks from all samples. It is inspired by the alignment algorithm of MZmine. For further details check http://mzmine.sourceforge.net/ and
Katajamaa M, Miettinen J, Oresic M: MZmine: Toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics (Oxford, England) 2006, 22:634-636.
Currently, there is no equivalent to minfrac
or minsamp
.
object |
the |
mzVsRTbalance |
Multiplier applied to the m/z value before calculating the (Euclidean) distance between two peaks. |
mzCheck |
Maximum tolerated distance for mz. |
rtCheck |
Maximum tolerated distance for RT. |
kNN |
Number of nearest Neighbours to check |
An xcmsSet
object with peak group assignments and statistics.
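The role of `mzVsRTbalance`, `mzCheck` and `rtCheck` can be sketched in base R as below. `peak_distance` and `within_limits` are hypothetical helpers written from the argument descriptions above, not functions of the package, and the peak values are invented.

```r
## Hypothetical helper: Euclidean distance between two peaks, with the m/z
## difference multiplied by mzVsRTbalance before it is combined with rt.
peak_distance <- function(mz1, rt1, mz2, rt2, mzVsRTbalance = 10) {
    sqrt((mzVsRTbalance * (mz1 - mz2))^2 + (rt1 - rt2)^2)
}

## Hypothetical helper: are both differences within the tolerated maxima?
within_limits <- function(mz1, rt1, mz2, rt2, mzCheck = 0.2, rtCheck = 15) {
    abs(mz1 - mz2) <= mzCheck && abs(rt1 - rt2) <= rtCheck
}

## Two invented peaks differing by 0.1 in m/z and 3 seconds in rt.
d  <- peak_distance(300.1, 1000, 300.2, 1003)
ok <- within_limits(300.1, 1000, 300.2, 1003)
c(distance = d, candidate = ok)
```

A larger `mzVsRTbalance` makes differences in m/z weigh more heavily than differences in retention time when the nearest peaks are selected.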
group(object, mzVsRTbalance=10, mzCheck=0.2, rtCheck=15, kNN=10)
xcmsSet-class
,
group.density
and
group.mzClust
## Not run: 
library(xcms)
library(faahKO)
## These files do not have this problem to correct for
## but just for an example
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)

xset <- xcmsSet(cdffiles)

gxset <- group(xset, method = "nearest")
nrow(gxset@groups) == 1096 ## the number of features before minFrac

post.minFrac <- function(object, minFrac = 0.5) {
    ix.minFrac <- sapply(1:length(unique(sampclass(object))),
                         function(x, object, mf) {
                             meta <- groups(object)
                             minFrac.idx <- numeric(length = nrow(meta))
                             idx <- which(
                                 meta[, levels(sampclass(object))[x]] >=
                                 mf * length(which(levels(sampclass(object))[x] == sampclass(object))))
                             minFrac.idx[idx] <- 1
                             return(minFrac.idx)
                         }, object, minFrac)
    ix.minFrac <- as.logical(apply(ix.minFrac, 1, sum))
    ix <- which(ix.minFrac == TRUE)
    return(ix)
}

## using the above function we can get a post processing minFrac
idx <- post.minFrac(gxset)

gxset.post <- gxset ## copy the xcmsSet object
gxset.post@groupidx <- gxset@groupidx[idx]
gxset.post@groups <- gxset@groups[idx, ]

nrow(gxset.post@groups) == 465 ## number of features after minFrac

## End(Not run)
The groupChromPeaks
method performs a correspondence analysis i.e., it
groups chromatographic peaks across samples to define the LC-MS features.
The correspondence algorithm can be selected, and configured, using the
param
argument. See documentation of XcmsExperiment()
and XCMSnExp()
for information on how to access and extract correspondence results.
The correspondence analysis can be performed on chromatographic peaks of
any MS level (if present and if chromatographic peak detection has been
performed for that MS level) defining features combining these peaks. The
MS level can be selected with the parameter msLevel
. By default, calling
groupChromPeaks
will remove any previous correspondence results. This can
be disabled with add = TRUE
, which will add newly defined features to
already present feature definitions.
Supported param
objects are:
PeakDensityParam
: correspondence using the peak density method
(Smith 2006) that groups chromatographic peaks along the retention time
axis within slices of (partially overlapping) m/z ranges. By default,
these m/z ranges (bins) have a constant size. By setting ppm
to a value
larger than 0, m/z dependent bin sizes can be used instead (better
representing the m/z dependent measurement error of some MS instruments).
All peaks (from the same or from different samples) with their apex
position being close on the retention time axis are grouped into a LC-MS
feature. Only samples with non-missing sample group assignment (i.e. for
which the value provided with parameter sampleGroups
is different than
NA
) are considered and counted for the feature definition. This allows
excluding certain samples or groups (e.g. blanks) from the feature
definition, thus avoiding features with peaks detected only in these. Note
that this affects only the definition of new features.
Chromatographic peaks in these samples will still be assigned to features
which were defined based on the other samples.
See in addition do_groupChromPeaks_density()
for the core API
function.
NearestPeaksParam
: performs peak grouping based on the proximity of
chromatographic peaks from different samples in the m/z - rt space similar
to the correspondence method of MZmine (Katajamaa 2006). The method
first creates a master peak list consisting of all chromatographic peaks
from the sample with the most detected peaks and iteratively calculates
distances to peaks from the sample with the next highest number of peaks,
grouping peaks together if their distance is smaller than the provided
thresholds.
See in addition do_groupChromPeaks_nearest()
for the core API function.
MzClustParam
: performs high resolution peak grouping for
single spectrum metabolomics data (Kazmi 2006). This method should
only be used for such data as the retention time is not considered
in the correspondence analysis.
See in addition do_groupPeaks_mzClust()
for the core API function.
For specific examples and description of the method and settings see the help pages of the individual parameter classes listed above.
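How sample groups and a minimum fraction interact when a candidate feature is evaluated can be sketched in base R as follows. The data are invented, and the rule is written from the description of the peak density method (a feature needs the required fraction of samples in at least one sample group, ignoring samples with an `NA` group); it is an illustration, not the package's code.

```r
## One entry per sample; NA marks samples (e.g. blanks) that should not
## count towards the definition of a feature.
sampleGroups <- c("KO", "KO", "KO", "WT", "WT", "WT", NA)
## Whether a peak from each sample falls into the candidate feature.
has_peak <- c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE)
minFraction <- 0.5

## Only samples with a non-missing group assignment are counted.
counted <- !is.na(sampleGroups)
frac <- tapply(has_peak[counted], sampleGroups[counted], mean)

## The candidate is kept if at least one sample group reaches minFraction.
keep <- any(frac >= minFraction)
frac
keep
```

Here the candidate is kept because two of three KO samples contribute a peak, while the peak in the blank does not influence the decision.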
groupChromPeaks(object, param, ...)

## S4 method for signature 'XcmsExperiment,Param'
groupChromPeaks(object, param, msLevel = 1L, add = FALSE)

PeakDensityParam(
  sampleGroups = numeric(),
  bw = 30,
  minFraction = 0.5,
  minSamples = 1,
  binSize = 0.25,
  ppm = 0,
  maxFeatures = 50
)

MzClustParam(
  sampleGroups = numeric(),
  ppm = 20,
  absMz = 0,
  minFraction = 0.5,
  minSamples = 1
)

NearestPeaksParam(
  sampleGroups = numeric(),
  mzVsRtBalance = 10,
  absMz = 0.2,
  absRt = 15,
  kNN = 10
)

## S4 method for signature 'PeakDensityParam'
sampleGroups(object)
## S4 replacement method for signature 'PeakDensityParam'
sampleGroups(object) <- value
## S4 method for signature 'PeakDensityParam'
bw(object)
## S4 replacement method for signature 'PeakDensityParam'
bw(object) <- value
## S4 method for signature 'PeakDensityParam'
minFraction(object)
## S4 replacement method for signature 'PeakDensityParam'
minFraction(object) <- value
## S4 method for signature 'PeakDensityParam'
minSamples(object)
## S4 replacement method for signature 'PeakDensityParam'
minSamples(object) <- value
## S4 method for signature 'PeakDensityParam'
binSize(object)
## S4 replacement method for signature 'PeakDensityParam'
binSize(object) <- value
## S4 method for signature 'PeakDensityParam'
maxFeatures(object)
## S4 replacement method for signature 'PeakDensityParam'
maxFeatures(object) <- value
## S4 method for signature 'PeakDensityParam'
ppm(object)

## S4 method for signature 'MzClustParam'
sampleGroups(object)
## S4 replacement method for signature 'MzClustParam'
sampleGroups(object) <- value
## S4 method for signature 'MzClustParam'
ppm(object)
## S4 replacement method for signature 'MzClustParam'
ppm(object) <- value
## S4 method for signature 'MzClustParam'
absMz(object)
## S4 replacement method for signature 'MzClustParam'
absMz(object) <- value
## S4 method for signature 'MzClustParam'
minFraction(object)
## S4 replacement method for signature 'MzClustParam'
minFraction(object) <- value
## S4 method for signature 'MzClustParam'
minSamples(object)
## S4 replacement method for signature 'MzClustParam'
minSamples(object) <- value

## S4 method for signature 'NearestPeaksParam'
sampleGroups(object)
## S4 replacement method for signature 'NearestPeaksParam'
sampleGroups(object) <- value
## S4 method for signature 'NearestPeaksParam'
mzVsRtBalance(object)
## S4 replacement method for signature 'NearestPeaksParam'
mzVsRtBalance(object) <- value
## S4 method for signature 'NearestPeaksParam'
absMz(object)
## S4 replacement method for signature 'NearestPeaksParam'
absMz(object) <- value
## S4 method for signature 'NearestPeaksParam'
absRt(object)
## S4 replacement method for signature 'NearestPeaksParam'
absRt(object) <- value
## S4 method for signature 'NearestPeaksParam'
kNN(object)
## S4 replacement method for signature 'NearestPeaksParam'
kNN(object) <- value

## S4 method for signature 'PeakDensityParam'
as.list(x, ...)

## S4 method for signature 'XCMSnExp,PeakDensityParam'
groupChromPeaks(object, param, msLevel = 1L, add = FALSE)

## S4 method for signature 'XCMSnExp,MzClustParam'
groupChromPeaks(object, param, msLevel = 1L)

## S4 method for signature 'XCMSnExp,NearestPeaksParam'
groupChromPeaks(object, param, msLevel = 1L, add = FALSE)
object |
The data object on which the correspondence analysis should be
performed. Can be an |
param |
The parameter object selecting and configuring the algorithm. |
... |
Optional parameters. |
msLevel |
|
add |
|
sampleGroups |
For |
bw |
For |
minFraction |
For |
minSamples |
For |
binSize |
For |
ppm |
For |
maxFeatures |
For |
absMz |
For |
mzVsRtBalance |
For |
absRt |
For |
kNN |
For |
value |
The value for the slot. |
x |
The parameter object. |
For groupChromPeaks
: either an XcmsExperiment()
or XCMSnExp()
object with the correspondence result.
Colin Smith, Johannes Rainer
Smith, C.A., Want E.J., O'Maille G., Abagyan R., and Siuzdak G. (2006) "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 78:779-787.
Katajamaa, M., Miettinen, J., Oresic, M. (2006) "MZmine: Toolbox for processing and visualization of mass spectrometry based molecular profile data". Bioinformatics, 22:634-636.
Kazmi S. A., Ghosh, S., Shin, D., Hill, D.W., and Grant, D.F. (2006) "Alignment of high resolution mass spectra: development of a heuristic approach for metabolomics". Metabolomics Vol. 2, No. 2, 75-83.
Features from the same originating compound are expected to have similar
intensities across samples. This method thus groups features based on
similarity of abundances (i.e. feature values) across samples in a
data set.
See also AbundanceSimilarityParam()
for additional information and
details.
This help page lists parameters specific for xcms
result objects (i.e.
XcmsExperiment()
and XCMSnExp()
objects). Documentation of the
parameters for the similarity calculation is available in the
AbundanceSimilarityParam()
help page in the MsFeatures
package.
## S4 method for signature 'XcmsResult,AbundanceSimilarityParam'
groupFeatures(
  object,
  param,
  msLevel = 1L,
  method = c("medret", "maxint", "sum"),
  value = "into",
  intensity = "into",
  filled = TRUE,
  ...
)
object |
|
param |
|
msLevel |
|
method |
|
value |
|
intensity |
|
filled |
|
... |
additional parameters passed to the |
input object with feature group definitions added to (or updated
in) a column "feature_group"
in its featureDefinitions
data frame.
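The principle behind grouping by abundance similarity can be sketched with a small, invented feature value matrix. Using Pearson correlation via `cor` and a threshold of 0.8 is an assumption made for this illustration; the similarity function and threshold actually used are configured through the parameter object.

```r
## Hypothetical feature values (rows: features, columns: samples).
fvals <- rbind(FT1 = c(10, 20, 30, 40),
               FT2 = c(11, 19, 32, 41),
               FT3 = c(40, 10, 35, 5))

## Pairwise similarity of abundances across samples.
sim <- cor(t(fvals), use = "pairwise.complete.obs")

## Feature pairs whose similarity reaches the threshold are grouping candidates.
threshold <- 0.8
candidates <- sim >= threshold
candidates
```

In this toy example FT1 and FT2 follow the same abundance pattern and would be grouped, while FT3 would stay on its own.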
Johannes Rainer
feature-grouping for a general overview.
Other feature grouping methods:
groupFeatures-eic-similarity
,
groupFeatures-similar-rtime
library(MsFeatures)
library(MsExperiment)
## Load a test data set with detected peaks
faahko_sub <- loadXcmsData("faahko_sub2")

## Disable parallel processing for this example
register(SerialParam())

## Group chromatographic peaks across samples
xodg <- groupChromPeaks(faahko_sub,
    param = PeakDensityParam(sampleGroups = rep(1, 3)))

## Group features based on correlation of feature values (integrated
## peak area) across samples. Note that there are many missing values
## in the feature value which influence grouping of features in the present
## data set.
xodg_grp <- groupFeatures(xodg,
    param = AbundanceSimilarityParam(threshold = 0.8))
table(featureDefinitions(xodg_grp)$feature_group)

## Group based on the maximal peak intensity per feature
xodg_grp <- groupFeatures(xodg,
    param = AbundanceSimilarityParam(threshold = 0.8, value = "maxo"))
table(featureDefinitions(xodg_grp)$feature_group)
Features from the same originating compound are expected to share their
elution pattern (i.e. chromatographic peak shape) with it.
Thus, this method allows grouping features based on similarity of their
extracted ion chromatograms (EICs). The similarity calculation is performed
separately for each sample with the similarity score being aggregated across
samples for the final generation of the similarity matrix on which the
grouping (considering parameter threshold
) will be performed.
The compareChromatograms() function is used for the similarity calculation,
which by default computes Pearson's correlation coefficient. The settings
for compareChromatograms() can be specified with parameters ALIGNFUN,
ALIGNFUNARGS, FUN and FUNARGS. ALIGNFUN defaults to alignRt() and is the
function used to align the chromatograms before comparison. ALIGNFUNARGS
specifies additional arguments for the ALIGNFUN function. It defaults to
ALIGNFUNARGS = list(tolerance = 0, method = "closest"), which ensures that
data points from the same spectrum (scan, i.e. with the same retention time)
are compared between the EICs from the same sample. Parameter FUN defines
the function used to calculate the similarity score and defaults to
FUN = cor. FUNARGS passes additional arguments to this function (defaults
to FUNARGS = list(use = "pairwise.complete.obs")). See also
compareChromatograms() for more information.
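As a minimal illustration of the default similarity score, the base R sketch below correlates two made-up intensity vectors that stand in for already aligned EICs of two features in one sample. The vectors eic_a and eic_b are hypothetical; only cor() with use = "pairwise.complete.obs" is taken from the defaults described above, and the aggregation across samples is not shown.

```r
## Hypothetical intensities of two features at the same six retention times
## (NA = no signal recorded for that scan).
eic_a <- c(10, 50, 120, 60, NA, 5)
eic_b <- c(22, 98, 245, 118, 30, NA)

## Default FUN and FUNARGS: Pearson correlation on the scans for which both
## EICs have a value (here scans 1 to 4).
score <- cor(eic_a, eic_b, use = "pairwise.complete.obs")

## Compare against the default threshold of EicSimilarityParam
score >= 0.9
```

Scans with a missing value in either EIC are dropped pairwise, so a few gaps do not turn the whole score into NA.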
The grouping of features based on the EIC similarity matrix is performed
with the function specified with parameter groupFun, which defaults to
groupFun = groupSimilarityMatrix. This function groups rows (features) of
the similarity matrix whose similarity score is >= threshold into the same
cluster, creating clusters in which every feature has a similarity score
>= threshold with any other feature in that cluster. See
groupSimilarityMatrix() for details. Additional parameters to that function
can be passed with the ... argument.
This feature grouping should be called after an initial feature
grouping by retention time (see SimilarRtimeParam()). The feature groups
defined in column "feature_group" of featureDefinitions(object) (for
features matching msLevel) will be used and refined by this method.
Features with a value of NA in featureDefinitions(object)$feature_group
will be skipped, i.e. not considered for feature grouping.
EicSimilarityParam(
  threshold = 0.9,
  n = 1,
  onlyPeak = TRUE,
  value = c("maxo", "into"),
  groupFun = groupSimilarityMatrix,
  ALIGNFUN = alignRt,
  ALIGNFUNARGS = list(tolerance = 0, method = "closest"),
  FUN = cor,
  FUNARGS = list(use = "pairwise.complete.obs"),
  ...
)
## S4 method for signature 'XcmsResult,EicSimilarityParam'
groupFeatures(object, param, msLevel = 1L)
threshold |
|
n |
|
onlyPeak |
|
value |
|
groupFun |
|
ALIGNFUN |
|
ALIGNFUNARGS |
named |
FUN |
|
FUNARGS |
named |
... |
for |
object |
|
param |
|
msLevel |
|
input object with feature groups added (i.e. in column "feature_group"
of its featureDefinitions data frame).
At present the featureChromatograms() function is used to extract the
EICs for each feature. It uses, however, a single m/z and rt range per
feature, so the EICs do not exactly represent the identified
chromatographic peaks of each sample (i.e. their specific m/z and
retention time ranges).
While this method can be run on the full data set without prior
feature grouping, this is not suggested for the following reasons: I) the
selection of the top n samples with the highest signal for the
feature group will be biased by very abundant compounds as this is
performed on the full data set (i.e. the samples with the highest overall
intensities are used for correlation of all features) and II) it is
computationally much more expensive because a pairwise correlation between
all features has to be performed.
It is also suggested to perform the correlation on a subset of samples
per feature with the highest intensities of the peaks (for that feature),
although it would also be possible to run the correlation on all samples by
setting n equal to the total number of samples in the data set. EIC
correlation should however ideally be performed on samples in which the
original compound is highly abundant, to avoid correlation of missing values
or noisy peak shapes as much as possible.
By default, signal outside identified chromatographic peaks is also excluded from the correlation.
Johannes Rainer
feature-grouping for a general overview.
Other feature grouping methods:
groupFeatures-abundance-correlation
,
groupFeatures-similar-rtime
library(MsFeatures)
library(MsExperiment)
## Load a test data set with detected peaks
faahko_sub <- loadXcmsData("faahko_sub2")
## Disable parallel processing for this example
register(SerialParam())
## Group chromatographic peaks across samples
xodg <- groupChromPeaks(faahko_sub, param = PeakDensityParam(sampleGroups = rep(1, 3)))
## Performing a feature grouping based on EIC similarities on a single
## sample
xodg_grp <- groupFeatures(xodg, param = EicSimilarityParam(n = 1))
table(featureDefinitions(xodg_grp)$feature_group)
## Usually it is better to perform this correlation on pre-grouped features
## e.g. based on similar retention time.
xodg_grp <- groupFeatures(xodg, param = SimilarRtimeParam(diffRt = 4))
xodg_grp <- groupFeatures(xodg_grp, param = EicSimilarityParam(n = 1))
table(featureDefinitions(xodg_grp)$feature_group)
Group features based on similar retention time. This method is intended as
an initial crude grouping of features based on the median retention
time of all their chromatographic peaks. All features whose retention times
differ by <= parameter diffRt of the parameter object are grouped together.
If a column "feature_group" is found in featureDefinitions(), its groups
are further sub-grouped by this method.
See MsFeatures::SimilarRtimeParam() in MsFeatures for more details.
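The idea of grouping by retention time difference can be sketched in base R as follows. This is a simplification and not necessarily what the default grouping function of SimilarRtimeParam does: the retention times are made up, and the sketch simply starts a new group wherever the gap between neighboring (sorted) retention times exceeds diffRt.

```r
## Hypothetical median retention times (seconds) of six features.
rt <- c(251.5, 100.2, 400.0, 101.1, 250.0, 253.2)
diffRt <- 2

## Sort, start a new group wherever the gap to the previous feature
## exceeds diffRt, then map the group ids back to the input order.
ord <- order(rt)
grp <- integer(length(rt))
grp[ord] <- cumsum(c(TRUE, diff(rt[ord]) > diffRt))
grp
```

Note that this chained variant puts the features at 250.0 and 253.2 seconds into the same group although they differ by more than diffRt, because each is within diffRt of the feature at 251.5 seconds. A stricter grouping function would split them.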
## S4 method for signature 'XcmsResult,SimilarRtimeParam'
groupFeatures(object, param, msLevel = 1L, ...)
object |
|
param |
|
msLevel |
|
... |
passed to the |
the input object with feature groups added (i.e. in column "feature_group"
of its featureDefinitions data frame).
Johannes Rainer
Other feature grouping methods:
groupFeatures-abundance-correlation
,
groupFeatures-eic-similarity
library(MsFeatures)
library(MsExperiment)
## Load a test data set with detected peaks
faahko_sub <- loadXcmsData("faahko_sub2")
## Disable parallel processing for this example
register(SerialParam())
## Group chromatographic peaks across samples
xodg <- groupChromPeaks(faahko_sub, param = PeakDensityParam(sampleGroups = rep(1, 3)))
## Group features based on similar retention time (i.e. difference <= 2 seconds)
xodg_grp <- groupFeatures(xodg, param = SimilarRtimeParam(diffRt = 2))
## Feature grouping gets added to the featureDefinitions in column "feature_group"
head(featureDefinitions(xodg_grp)$feature_group)
table(featureDefinitions(xodg_grp)$feature_group)
length(unique(featureDefinitions(xodg_grp)$feature_group))
## Using an alternative grouping method that creates larger groups
xodg_grp <- groupFeatures(xodg, param = SimilarRtimeParam(diffRt = 2, groupFun = MsCoreUtils::group))
length(unique(featureDefinitions(xodg_grp)$feature_group))
Allow linking of peak group data between classes using unique group names that remain the same as long as no re-grouping occurs.
object |
the |
mzdec |
number of decimal places to use for m/z |
rtdec |
number of decimal places to use for retention time |
template |
a character vector with existing group names whose format should be emulated |
A character vector with unique names for each peak group in the
object. The format is M[m/z]T[time in seconds]
.
(object, mzdec = 0, rtdec = 0, template = NULL)
(object)
groupnames generates names for the features identified by the
correspondence analysis, based on their mass and retention time. These
feature names are equivalent to the group names of the old
user interface (aka xcms1).
## S4 method for signature 'XCMSnExp'
groupnames(object, mzdec = 0, rtdec = 0, template = NULL)
object |
|
mzdec |
|
rtdec |
|
template |
|
character with unique names for each feature in object. The
format is M(m/z)T(time in seconds).
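The naming scheme can be illustrated with a short base R sketch. The helper make_names and the m/z and retention time values are hypothetical, and the sketch does not handle name collisions or the template argument; it only shows how mzdec and rtdec shape the M(m/z)T(time in seconds) format.

```r
## Hypothetical median m/z and retention time (seconds) of two features.
mzmed <- c(205.04, 301.2)
rtmed <- c(2785.3, 1199.6)

## Round to the requested number of decimals and paste into M<mz>T<rt>.
make_names <- function(mz, rt, mzdec = 0, rtdec = 0)
    paste0("M", round(mz, mzdec), "T", round(rt, rtdec))

nms <- make_names(mzmed, rtmed)
nms
```

With the default of zero decimals, two features with nearly identical m/z and retention time would receive the same name in this sketch, which is why the real function has to take extra care to keep names unique.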
groupOverlaps identifies overlapping ranges in the input data and groups
them by returning their indices in xmin and xmax.
groupOverlaps(xmin, xmax)
xmin |
|
xmax |
|
list
with the indices of grouped elements.
Johannes Rainer
x <- c(2, 12, 34.2, 12.4)
y <- c(3, 16, 35, 36)
groupOverlaps(x, y)
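For readers who want to see how such a grouping can be derived, here is a base R sketch of one possible approach (a sorted sweep over the ranges). The helper group_ranges is hypothetical and is not the xcms implementation; in particular, treating ranges that merely touch as overlapping is an assumption of this sketch.

```r
group_ranges <- function(xmin, xmax) {
    ord <- order(xmin)
    ## Largest upper bound seen so far while walking through the sorted ranges
    reach <- cummax(xmax[ord])
    ## A range opens a new group if it starts after all earlier ranges ended
    starts_new <- c(TRUE, xmin[ord][-1] > reach[-length(reach)])
    unname(lapply(split(ord, cumsum(starts_new)), sort))
}

x <- c(2, 12, 34.2, 12.4)
y <- c(3, 16, 35, 36)
res <- group_ranges(x, y)
res
```

Here the fourth range (12.4 to 36) bridges the second and third ranges, so ranges 2, 3 and 4 end up in one group while range 1 stays alone.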
Generate a matrix of peak values with rows for every group and
columns for every sample. The value included in the matrix can
be any of the columns from the xcmsSet
peaks
slot
matrix. Collisions where more than one peak from a single sample
are in the same group get resolved with one of several user-selectable
methods.
object |
the |
method |
conflict resolution method, |
value |
name of peak column to enter into returned matrix, or |
intensity |
if |
A matrix with rows for every group and columns for every
sample. Missing peaks have NA values.
groupval(object, method = c("medret", "maxint"),
value = "index", intensity = "into")
The highlightChromPeaks
function adds chromatographic
peak definitions to an existing plot, such as one created by the
plot
method on a Chromatogram
or
MChromatograms
object.
highlightChromPeaks(
  x,
  rt,
  mz,
  peakIds = character(),
  border = rep("00000040", length(fileNames(x))),
  lwd = 1,
  col = NA,
  type = c("rect", "point", "polygon"),
  whichPeaks = c("any", "within", "apex_within"),
  ...
)
x |
For |
rt |
For |
mz |
|
peakIds |
|
border |
colors to be used to color the border of the rectangles/peaks.
Has to be equal to the number of samples in |
lwd |
|
col |
For |
type |
the plotting type. See |
whichPeaks |
|
... |
additional parameters to the |
Johannes Rainer
## Load a test data set with detected peaks
library(MSnbase)
data(faahko_sub)
## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")
## Disable parallel processing for this example
register(SerialParam())
## Extract the ion chromatogram for one chromatographic peak in the data.
chrs <- chromatogram(faahko_sub, rt = c(2700, 2900), mz = 335)
plot(chrs)
## Extract chromatographic peaks for the mz/rt range (if any).
chromPeaks(faahko_sub, rt = c(2700, 2900), mz = 335)
## Highlight the chromatographic peaks in the area
## Show the peak definition with a rectangle
highlightChromPeaks(faahko_sub, rt = c(2700, 2900), mz = 335)
## Color the actual peak
highlightChromPeaks(faahko_sub, rt = c(2700, 2900), mz = 335,
    col = c("#ff000020", "#00ff0020"), type = "polygon")
Create log intensity false-color image of a xcmsRaw object plotted with m/z and retention time axes
x |
xcmsRaw object |
col |
vector of colors to use for the image |
... |
arguments for |
image(x, col = rainbow(256), ...)
Colin A. Smith, [email protected]
This function provides missing value imputation based on linear
interpolation and resembles some of the functionality of the
profBinLin and profBinLinBase functions, which are deprecated from
version 1.51 on.
imputeLinInterpol(
  x,
  baseValue,
  method = "lin",
  distance = 1L,
  noInterpolAtEnds = FALSE
)
x |
A numeric vector with eventual missing ( |
baseValue |
The base value to which empty elements should be set. This
is only considered for |
method |
One of |
distance |
For |
noInterpolAtEnds |
For |
Values for NAs in input vector x can be imputed using methods
"lin" and "linbase":
method = "lin" uses simple linear imputation to derive a value
for an empty element in input vector x from its neighboring
non-empty elements. This method is equivalent to the linear
interpolation in the profBinLin method. Whether interpolation is
performed if missing values are present at the beginning and end of
x can be set with argument noInterpolAtEnds. By default
interpolation is also performed at the ends, interpolating from 0
at the beginning and towards 0 at the end. For
noInterpolAtEnds = TRUE no interpolation is performed at either
end, and the missing values at the beginning and/or the end of
x are replaced with 0.
method = "linbase" uses linear interpolation to impute values for
empty elements within a user-definable proximity to non-empty elements
and sets the element's value to baseValue otherwise. The
default for baseValue is half of the smallest value in
x (NAs being removed). Whether linear interpolation based
imputation is performed for a missing value depends on the
distance argument. Interpolation is only performed if one of the
next distance closest neighbors to the current empty element has
a value other than NA. No interpolation takes place for
distance = 0, while distance = 1 means that the value for
an empty element is interpolated from directly adjacent non-empty
elements; if the next neighbors of the current empty element are
also NA, its value is set to baseValue.
This corresponds to the linear interpolation performed by the
profBinLinBase method. For more details see examples below.
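The core idea of linear interpolation between neighboring non-empty elements can be reproduced with base R's approx(). This is only a sketch of the concept, not the xcms implementation: it covers interior NAs and deliberately leaves out the handling of missing values at the ends of x as well as the "linbase" variant.

```r
## Input with NAs only between non-empty elements
x <- c(3, NA, 1, 2, NA, NA, 4)

## Positions that carry a value serve as interpolation anchors
known <- which(!is.na(x))

## Interpolate linearly at every position of x
imputed <- approx(x = known, y = x[known], xout = seq_along(x))$y
imputed
```

The two NAs between 2 and 4 are filled with evenly spaced values (2 + 2/3 and 2 + 4/3), and the original non-missing values are returned unchanged.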
A numeric vector with empty values imputed based on the selected
method
.
Johannes Rainer
#######
## Impute missing values by linearly interpolating from neighboring
## non-empty elements
x <- c(3, NA, 1, 2, NA, NA, 4, NA, NA, NA, 3, NA, NA, NA, NA, 2)
imputeLinInterpol(x, method = "lin")
## visualize the interpolation:
plot(x = 1:length(x), y = x)
points(x = 1:length(x), y = imputeLinInterpol(x, method = "lin"), type = "l", col = "grey")
## If the first or last elements are NA, interpolation is performed from 0
## to the first non-empty element.
x <- c(NA, 2, 1, 4, NA)
imputeLinInterpol(x, method = "lin")
## visualize the interpolation:
plot(x = 1:length(x), y = x)
points(x = 1:length(x), y = imputeLinInterpol(x, method = "lin"), type = "l", col = "grey")
## If noInterpolAtEnds is TRUE no interpolation is performed at both ends
imputeLinInterpol(x, method = "lin", noInterpolAtEnds = TRUE)
######
## method = "linbase"
## "linbase" performs imputation by interpolation for empty elements based on
## 'distance' adjacent non-empty elements, setting all remaining empty elements
## to the baseValue
x <- c(3, NA, 1, 2, NA, NA, 4, NA, NA, NA, 3, NA, NA, NA, NA, 2)
## Setting distance = 0 skips imputation by linear interpolation
imputeLinInterpol(x, method = "linbase", distance = 0)
## With distance = 1 for all empty elements next to a non-empty element the value
## is imputed by linear interpolation.
xInt <- imputeLinInterpol(x, method = "linbase", distance = 1L)
xInt
plot(x = 1:length(x), y = x, ylim = c(0, max(x, na.rm = TRUE)))
points(x = 1:length(x), y = xInt, type = "l", col = "grey")
## Setting distance = 2L would cause that for all empty elements for which the
## distance to the next non-empty element is <= 2 the value is imputed by
## linear interpolation:
xInt <- imputeLinInterpol(x, method = "linbase", distance = 2L)
xInt
plot(x = 1:length(x), y = x, ylim = c(0, max(x, na.rm = TRUE)))
points(x = 1:length(x), y = xInt, type = "l", col = "grey")
imputeRowMin imputes missing values in x by replacing NAs in each row
with a proportion of the minimal value for that row (i.e.
min_fraction * min(x[i, ])).
imputeRowMin(x, min_fraction = 1/2)
x |
|
min_fraction |
|
Johannes Rainer
imputeLCMD
package for more left censored imputation functions.
Other imputation functions:
imputeRowMinRand()
library(MSnbase)
library(faahKO)
data("faahko")
xset <- group(faahko)
mat <- groupval(xset, value = "into")
mat_imp <- imputeRowMin(mat)
head(mat)
head(mat_imp)
## Replace with 1/8 of the row minimum
head(imputeRowMin(mat, min_fraction = 1/8))
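The rule min_fraction * min(x[i, ]) can also be tried without any test data. The following base R sketch uses a hypothetical helper, impute_row_min, and a made-up matrix; it mirrors the description above and is not the xcms code. Leaving rows that contain only NAs untouched is a choice of this sketch.

```r
impute_row_min <- function(x, min_fraction = 1/2) {
    for (i in seq_len(nrow(x))) {
        nas <- is.na(x[i, ])
        ## Rows without NAs, or with only NAs, are left unchanged
        if (any(nas) && !all(nas))
            x[i, nas] <- min_fraction * min(x[i, ], na.rm = TRUE)
    }
    x
}

## Two features (rows) measured in three samples (columns)
mat <- rbind(c(10, NA, 30),
             c(4, 8, NA))
mat_imp <- impute_row_min(mat)
mat_imp
```

The NA in the first row becomes 5 (half of 10) and the NA in the second row becomes 2 (half of 4).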
Replace missing values with random numbers.
When using method = "mean_sd", random numbers are generated
from a normal distribution whose mean is (a fraction of) the row minimum
and whose standard deviation is estimated from the linear relationship
between row standard deviation and row mean of the full data
set. Parameter sd_fraction allows to further reduce the estimated
standard deviation.
When using method = "from_to", random numbers between 2 specific values
are generated.
imputeRowMinRand(
  x,
  method = c("mean_sd", "from_to"),
  min_fraction = 1/2,
  min_fraction_from = 1/1000,
  sd_fraction = 1,
  abs = TRUE
)
x |
|
method |
method |
min_fraction |
|
min_fraction_from |
|
sd_fraction |
|
abs |
|
For method mean_sd, imputed values are taken from a normal distribution
with the mean being a user defined fraction of the row minimum and the
standard deviation estimated for that mean based on the linear relationship
between row standard deviations and row means in the full matrix x.
To largely avoid imputed values being negative or larger than the real
values, the standard deviation for the random number generation is estimated
ignoring the intercept of the linear model estimating the relationship
between standard deviation and mean. If abs = TRUE, NA values are
replaced with the absolute value of the random values.
For method from_to, imputed values are taken between 2 user defined fractions of the row minimum.
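The from_to idea can be sketched in base R as follows. The helper impute_from_to and the example matrix are hypothetical, and drawing from a uniform distribution with runif() is an assumption of this sketch; the text above only states that values are taken between the two fractions of the row minimum.

```r
impute_from_to <- function(x, min_fraction = 1/2, min_fraction_from = 1/1000) {
    for (i in seq_len(nrow(x))) {
        nas <- is.na(x[i, ])
        if (any(nas) && !all(nas)) {
            row_min <- min(x[i, ], na.rm = TRUE)
            ## Assumption: uniform draw between the two fractions of the row minimum
            x[i, nas] <- runif(sum(nas),
                               min = row_min * min_fraction_from,
                               max = row_min * min_fraction)
        }
    }
    x
}

set.seed(123)
mat <- rbind(c(1000, NA, 3000),
             c(400, 800, NA))
res <- impute_from_to(mat)
res
```

With the default fractions, the imputed value of the first row falls between 1 and 500 and that of the second row between 0.4 and 200, so imputed values always stay below the smallest measured value of their row.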
Johannes Rainer, Mar Garcia-Aloy
imputeLCMD
package for more left censored imputation functions.
Other imputation functions:
imputeRowMin()
library(faahKO)
library(MSnbase)
data("faahko")
xset <- group(faahko)
mat <- groupval(xset, value = "into")
## Estimate the relationship between row sd and mean. The standard deviation
## of the random distribution is estimated on this relationship.
mns <- rowMeans(mat, na.rm = TRUE)
sds <- apply(mat, MARGIN = 1, sd, na.rm = TRUE)
plot(mns, sds)
abline(lm(sds ~ mns))
mat_imp_meansd <- imputeRowMinRand(mat, method = "mean_sd")
mat_imp_fromto <- imputeRowMinRand(mat, method = "from_to")
head(mat)
head(mat_imp_meansd)
head(mat_imp_fromto)
isolationWindowTargetMz
extracts the isolation window target m/z definition
for each spectrum in object
.
## S4 method for signature 'OnDiskMSnExp'
isolationWindowTargetMz(object)
object |
OnDiskMSnExp object. |
a numeric
of length equal to the number of spectra in object
with
the isolation window target m/z or NA
if not specified/available.
Johannes Rainer
Create an image of the raw (profile) data m/z against retention time, with the intensity color coded.
x |
xcmsRaw object. |
log |
Whether the intensity should be log transformed. |
col.regions |
The color ramp that should be used for encoding of the intensity. |
rt |
whether the original ( |
... |
Arguments for |
levelplot(x, log=TRUE, col.regions=colorRampPalette(brewer.pal(9,
"YlOrRd"))(256), ...)
levelplot(x, log=TRUE, col.regions=colorRampPalette(brewer.pal(9,
"YlOrRd"))(256), rt="raw", ...)
Johannes Rainer, [email protected]
This function extracts the raw data which will be used in an
xcmsRaw object. Further processing of the data is
done in the xcmsRaw constructor.
object |
Specification of a data source (such as a file name or database query) |
The implementing methods decide how to gather the data.
A list containing elements describing the data source. The rt,
scanindex, tic, and acquisitionNum components
each have one entry per scan. They are parallel in the sense that
rt[1], scanindex[1], and acquisitionNum[1] all
refer to the same scan. The list contains the following components:
rt |
Numeric vector with acquisition time (in seconds) for each scan |
tic |
Numeric vector with Total Ion Count for each scan |
scanindex |
Integer vector with starting positions of each scan in the |
mz |
Concatenated vector of m/z values for all scans |
intensity |
Concatenated vector of intensity values for all scans |
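To make the relationship between scanindex and the concatenated mz and intensity vectors concrete, here is a base R sketch with a made-up list of three scans. The helper get_scan is hypothetical, and the sketch assumes that scanindex holds zero-based offsets (the number of data points before each scan); if the offsets were one-based, the arithmetic would need to shift by one.

```r
## Made-up data: scan 1 has 3 data points, scan 2 has 2, scan 3 has 1
raw <- list(
    rt = c(1.0, 2.0, 3.0),
    scanindex = c(0L, 3L, 5L),   # assumed zero-based start offsets
    mz = c(100, 101, 102, 200, 201, 300),
    intensity = c(10, 20, 30, 40, 50, 60)
)

## Extract the data points of scan i from the concatenated vectors
get_scan <- function(raw, i) {
    from <- raw$scanindex[i] + 1L
    to <- if (i < length(raw$scanindex)) raw$scanindex[i + 1L] else length(raw$mz)
    idx <- seq.int(from, length.out = to - from + 1L)
    data.frame(mz = raw$mz[idx], intensity = raw$intensity[idx])
}

get_scan(raw, 2)
```

Storing all scans in two long vectors plus an index avoids a list of many small objects, at the cost of this small amount of offset bookkeeping.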
signature(object = "xcmsSource")
Uses loadRaw,xcmsSource-method
to extract raw data.
Subclasses of xcmsSource
can provide different
ways of fetching data.
Daniel Hackney, [email protected]
Data sets with 'xcms' preprocessing results are provided within the 'xcms' package and can be loaded with the 'loadXcmsData' function. The available test data sets are:
- 'xdata': an [XCMSnExp()] object with the results from an 'xcms'-based pre-processing of an LC-MS untargeted metabolomics data set. The raw data files are provided in the 'faahKO' R package.
- 'xmse': an [XcmsExperiment()] object with the results from an 'xcms'-based pre-processing of an LC-MS untargeted metabolomics data set (same original data set and pre-processing settings as for the 'xdata' data set). The pre-processing of this data set is described in detail in the *xcms* vignette of the 'xcms' package.
- 'faahko_sub': an [XCMSnExp()] object with identified chromatographic peaks in 3 samples from the data files in the 'faahKO' R package.
- 'faahko_sub2': an [XcmsExperiment()] object with identified chromatographic peaks in 3 samples from the data files in the 'faahKO' R package.
Data sets can also be loaded using 'data', which would however require updating the objects to point to the location of the raw data files. The 'loadXcmsData' function loads the data and ensures that all paths are updated accordingly.
loadXcmsData(x = c("xmse", "xdata", "faahko_sub", "faahko_sub2"))
x |
For 'loadXcmsData': 'character(1)' with the name of the data file (object) to load. |
library(xcms)
xdata <- loadXcmsData()
The manualChromPeaks function allows the user to manually define
chromatographic peaks, integrates the intensities within the specified
peak area and adds them to the object's chromPeaks matrix. A peak is not
added for a sample if no signal was found in the respective data file.
Because the manually defined peaks are added to any previously identified
peaks, it is suggested to run refineChromPeaks() with the
MergeNeighboringPeaksParam() approach afterwards to merge potentially
overlapping peaks.
The manualFeatures function allows the user to manually group identified
chromatographic peaks into features by providing their index in the
object's chromPeaks matrix.
manualChromPeaks(object, ...)
manualFeatures(object, ...)
## S4 method for signature 'MsExperiment'
manualChromPeaks(
  object,
  chromPeaks = matrix(numeric()),
  samples = seq_along(object),
  msLevel = 1L,
  chunkSize = 2L,
  BPPARAM = bpparam()
)
## S4 method for signature 'XcmsExperiment'
manualChromPeaks(
  object,
  chromPeaks = matrix(numeric()),
  samples = seq_along(object),
  msLevel = 1L,
  chunkSize = 2L,
  BPPARAM = bpparam()
)
## S4 method for signature 'XcmsExperiment'
manualFeatures(object, peakIdx = list(), msLevel = 1L)
## S4 method for signature 'OnDiskMSnExp'
manualChromPeaks(
  object,
  chromPeaks = matrix(),
  samples = seq_along(fileNames(object)),
  msLevel = 1L,
  BPPARAM = bpparam()
)
## S4 method for signature 'XCMSnExp'
manualChromPeaks(
  object,
  chromPeaks = matrix(),
  samples = seq_along(fileNames(object)),
  msLevel = 1L,
  BPPARAM = bpparam()
)
## S4 method for signature 'XCMSnExp'
manualFeatures(object, peakIdx = list(), msLevel = 1L)
object |
XcmsExperiment, XCMSnExp or OnDiskMSnExp object. |
... |
ignored. |
chromPeaks |
For |
samples |
For |
msLevel |
|
chunkSize |
|
BPPARAM |
parallel processing settings (see |
peakIdx |
For |
XcmsExperiment
or XCMSnExp
with the manually added
chromatographic peaks or features.
Johannes Rainer
## Read a test dataset.
fls <- c(system.file("microtofq/MM14.mzML", package = "msdata"),
         system.file("microtofq/MM8.mzML", package = "msdata"))
## Define a data frame with some sample annotations
ann <- data.frame(
    injection_index = 1:2,
    sample_id = c("MM14", "MM8"))
## Import the data
library(MsExperiment)
mse <- readMsExperiment(fls)
## Define some arbitrary peak areas
pks <- cbind(
    mzmin = c(512, 234.3), mzmax = c(513, 235),
    rtmin = c(10, 33), rtmax = c(19, 50)
)
pks
res <- manualChromPeaks(mse, pks)
chromPeaks(res)
## Peaks were only found in the second file.
For each element in a matrix, replace it with the median of the values around it.
medianFilter(x, mrad, nrad)
x |
numeric matrix to median filter |
mrad |
number of rows on either side of the value to use for median calculation |
nrad |
number of columns on either side of the value to use for median calculation |
A matrix whose values have been median filtered
Colin A. Smith, [email protected]
mat <- matrix(1:25, nrow=5)
mat
medianFilter(mat, 1, 1)
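What a median filter with separate row and column radii does can be sketched in plain R. The helper median_filter below is hypothetical and much slower than a compiled implementation; it assumes that mrad counts rows, that nrad counts columns, and that the window is simply truncated at the matrix borders, none of which is confirmed by the text above.

```r
median_filter <- function(x, mrad, nrad) {
    out <- matrix(NA_real_, nrow = nrow(x), ncol = ncol(x))
    for (i in seq_len(nrow(x))) {
        for (j in seq_len(ncol(x))) {
            ## Window around (i, j), truncated at the matrix borders
            rows <- max(1, i - mrad):min(nrow(x), i + mrad)
            cols <- max(1, j - nrad):min(ncol(x), j + nrad)
            out[i, j] <- median(x[rows, cols])
        }
    }
    out
}

mat <- matrix(1:25, nrow = 5)
filtered <- median_filter(mat, 1, 1)
filtered
```

In the interior, each value is replaced by the median of its 3 x 3 neighborhood; at the corners only a 2 x 2 window is available in this sketch, so the top-left value becomes the median of 1, 2, 6 and 7.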
The MS2 and MSn data are stored in separate slots and cannot be used directly by e.g. findPeaks(). msn2xcmsRaw() copies the MSn spectra into the "normal" xcmsRaw slots.
msn2xcmsRaw(xmsn)
xmsn |
an object of class |
The default gap value is determined from the 90th percentile of the pair-wise differences between adjacent mass values.
An xcmsRaw object
Steffen Neumann [email protected]
msnfile <- system.file("microtofq/MSMSpos20_6.mzML", package = "msdata") xrmsn <- xcmsRaw(msnfile, includeMSn=TRUE) xr <- msn2xcmsRaw(xrmsn) p <- findPeaks(xr, method="centWave")
overlappingFeatures
identifies features that are overlapping or close in
the m/z - rt space.
overlappingFeatures(x, expandMz = 0, expandRt = 0, ppm = 0)
x |
|
expandMz |
|
expandRt |
|
ppm |
|
list
with indices of features (in featureDefinitions()
) that
are overlapping.
Johannes Rainer
## Load a test data set with detected peaks library(MSnbase) data(faahko_sub) ## Update the path to the files for the local system dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO") ## Disable parallel processing for this example register(SerialParam()) ## Correspondence analysis xdata <- groupChromPeaks(faahko_sub, param = PeakDensityParam(sampleGroups = c(1, 1, 1))) ## Identify overlapping features overlappingFeatures(xdata) ## Identify features that are separated on retention time by less than ## 2 minutes overlappingFeatures(xdata, expandRt = 60)
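The idea of overlap in the m/z - rt space can be illustrated with a small base R sketch. It is a simplification under stated assumptions: `overlap_sketch` and the data frame `ft` are hypothetical, the ppm parameter is left out, and the sketch reports overlapping pairs, whereas overlappingFeatures returns a list of feature indices whose exact grouping may differ.

```r
## Hypothetical helper (not part of xcms): report pairs of features whose
## expanded m/z and retention time ranges overlap.
overlap_sketch <- function(ft, expandMz = 0, expandRt = 0) {
    ## Expand each feature's ranges on both sides
    mzmin <- ft$mzmin - expandMz
    mzmax <- ft$mzmax + expandMz
    rtmin <- ft$rtmin - expandRt
    rtmax <- ft$rtmax + expandRt
    n <- nrow(ft)
    res <- list()
    if (n < 2) return(res)
    for (i in seq_len(n - 1)) {
        for (j in (i + 1):n) {
            ## Two intervals overlap if each starts before the other ends
            mz_ovlp <- mzmin[i] <= mzmax[j] && mzmin[j] <= mzmax[i]
            rt_ovlp <- rtmin[i] <= rtmax[j] && rtmin[j] <= rtmax[i]
            if (mz_ovlp && rt_ovlp)
                res[[length(res) + 1]] <- c(i, j)
        }
    }
    res
}

## Three made-up features; the first two are close in m/z and rt
ft <- data.frame(mzmin = c(100, 100.05, 300),
                 mzmax = c(100.1, 100.2, 300.1),
                 rtmin = c(10, 25, 10),
                 rtmax = c(20, 35, 20))
no_expand <- overlap_sketch(ft)
with_expand <- overlap_sketch(ft, expandRt = 3)
```

Without expansion no pair overlaps, because the first two features are separated by 5 seconds in retention time. Expanding each feature by 3 seconds on both sides closes that gap, so features 1 and 2 are reported.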
Plot extracted ion chromatograms for many peaks simultaneously, indicating peak integration start and end points with vertical grey lines.
object |
the |
peaks |
matrix with peak information as produced by |
figs |
two-element vector describing the number of rows and the number of columns of peaks to plot, if missing then an approximately square grid that will fit the number of peaks supplied |
width |
width of chromatogram retention time to plot for each peak |
This function is intended to help graphically analyze the results of peak picking. It can help estimate the number of false positives and improper integration start and end points. Its output is very compact and tries to waste as little space as possible. Each plot is labeled with rounded m/z and retention time separated by a space.
signature(object = "xcmsSet")
plotPeaks(object, peaks, figs, width = 200)
xcmsRaw-class
,
findPeaks
,
split.screen
peaksWithCentWave
identifies (chromatographic) peaks in purely
chromatographic data, i.e. based on intensity and retention time values
without m/z values.
peaksWithCentWave( int, rt, peakwidth = c(20, 50), snthresh = 10, prefilter = c(3, 100), integrate = 1, fitgauss = FALSE, noise = 0, verboseColumns = FALSE, firstBaselineCheck = TRUE, extendLengthMSW = FALSE, ... )
int |
|
rt |
|
peakwidth |
|
snthresh |
|
prefilter |
|
integrate |
|
fitgauss |
|
noise |
|
verboseColumns |
|
firstBaselineCheck |
|
extendLengthMSW |
|
... |
currently ignored. |
The method uses the same peak detection algorithm as centWave, but employs a different approach to identify the initial regions in which the peak detection is performed (i.e. the regions of interest, ROIs). The method first identifies all local maxima in the chromatographic data and defines the corresponding positions +/- peakwidth[2] as the ROIs. Noise estimation is also based on these ROIs and can thus differ from centWave, resulting in different signal to noise ratios.
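The ROI definition described above can be sketched in base R. This is a rough illustration only: `roi_sketch` is a hypothetical helper, it detects strict local maxima by sign changes of the first difference, and the way xcms finds local maxima internally is assumed to be more involved.

```r
## Hypothetical sketch (not the xcms code): define regions of interest as
## local maxima +/- a half width, here standing in for peakwidth[2].
roi_sketch <- function(int, rt, halfwidth) {
    ## A strict local maximum is a point where the slope changes from
    ## positive to negative
    idx <- which(diff(sign(diff(int))) == -2) + 1
    data.frame(apex = rt[idx],
               rtmin = rt[idx] - halfwidth,
               rtmax = rt[idx] + halfwidth)
}

## Made-up chromatographic signal with two maxima
int <- c(0, 1, 5, 1, 0, 2, 8, 2, 0)
rt <- 1:9
rois <- roi_sketch(int, rt, halfwidth = 2)
```

Each row of `rois` is one candidate region in which peak detection would then be performed.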
A matrix, each row representing an identified chromatographic peak, with columns:
"rt"
: retention time of the peak's midpoint (time of the maximum signal).
"rtmin"
: minimum retention time of the peak.
"rtmax"
: maximum retention time of the peak.
"into"
: integrated (original) intensity of the peak.
"intb"
: per-peak baseline corrected integrated peak intensity.
"maxo"
: maximum (original) intensity of the peak.
"sn"
: signal to noise ratio of the peak defined as
(maxo - baseline)/sd
with sd
being the standard deviation of the local
chromatographic noise.
Additional columns for verboseColumns = TRUE
:
"mu"
: gaussian parameter mu.
"sigma"
: gaussian parameter sigma.
"h"
: gaussian parameter h.
"f"
: region number of the m/z ROI where the peak was localized.
"dppm"
: m/z deviation of mass trace across scans in ppm (always NA
).
"scale"
: scale on which the peak was localized.
"scpos"
: peak position found by wavelet analysis (index in int
).
"scmin"
: left peak limit found by wavelet analysis (index in int
).
"scmax"
: right peak limit found by wavelet analysis (index in int
).
Johannes Rainer
centWave for a detailed description of the peak detection method.
Other peak detection functions for chromatographic data:
peaksWithMatchedFilter()
## Reading a file library(MsExperiment) library(xcms) od <- readMsExperiment(system.file("cdf/KO/ko15.CDF", package = "faahKO")) ## Extract chromatographic data for a small m/z range mzr <- c(272.1, 272.2) chr <- chromatogram(od, mz = mzr, rt = c(3000, 3300))[1, 1] int <- intensity(chr) rt <- rtime(chr) ## Plot the region plot(chr, type = "h") ## Identify peaks in the chromatographic data pks <- peaksWithCentWave(intensity(chr), rtime(chr)) pks ## Highlight the peaks rect(xleft = pks[, "rtmin"], xright = pks[, "rtmax"], ybottom = rep(0, nrow(pks)), ytop = pks[, "maxo"], col = "#ff000040", border = "#00000040")
The function performs peak detection using the matchedFilter algorithm on chromatographic data (i.e. with only intensities and retention time).
peaksWithMatchedFilter( int, rt, fwhm = 30, sigma = fwhm/2.3548, max = 20, snthresh = 10, ... )
int |
|
rt |
|
fwhm |
|
sigma |
|
max |
|
snthresh |
|
... |
currently ignored. |
A matrix, each row representing an identified chromatographic peak, with columns:
"rt"
: retention time of the peak's midpoint (time of the maximum signal).
"rtmin"
: minimum retention time of the peak.
"rtmax"
: maximum retention time of the peak.
"into"
: integrated (original) intensity of the peak.
"intf"
: integrated intensity of the filtered peak.
"maxo"
: maximum (original) intensity of the peak.
"maxf"
: maximum intensity of the filtered peak.
"sn"
: signal to noise ratio of the peak.
Johannes Rainer
matchedFilter for a detailed description of the peak detection method.
Other peak detection functions for chromatographic data:
peaksWithCentWave()
## Load the test file faahko_sub <- loadXcmsData("faahko_sub") ## Subset to one file and drop identified chromatographic peaks data <- dropChromPeaks(filterFile(faahko_sub, 1)) ## Extract chromatographic data for a small m/z range chr <- chromatogram(data, mz = c(272.1, 272.3), rt = c(3000, 3200))[1, 1] pks <- peaksWithMatchedFilter(intensity(chr), rtime(chr)) pks ## Plotting the data plot(rtime(chr), intensity(chr), type = "h") rect(xleft = pks[, "rtmin"], xright = pks[, "rtmax"], ybottom = c(0, 0), ytop = pks[, "maxo"], border = "red")
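The default sigma = fwhm/2.3548 in the usage above corresponds to the standard relation between the full width at half maximum and the standard deviation of a Gaussian, FWHM = 2 * sqrt(2 * log(2)) * sigma. A short check in base R, with variable names chosen for illustration only:

```r
## Relation between FWHM and standard deviation of a Gaussian
fwhm <- 30
sigma_exact <- fwhm / (2 * sqrt(2 * log(2)))
## Value obtained with the rounded constant used as default
sigma_default <- fwhm / 2.3548
```

Both values agree to within rounding of the constant, so supplying either fwhm or sigma describes the same expected peak width.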
Create a report showing all aligned peaks.
object |
the |
filebase |
base file name to save report, |
... |
arguments passed down to |
This method handles creation of summary reports similar to
diffreport
. It returns a summary report that can
optionally be written out to a tab-separated file.
If a base file name is provided, the report (see Value section) will be saved to a tab separated file.
A data frame with the following columns:
mz |
median m/z of peaks in the group |
mzmin |
minimum m/z of peaks in the group |
mzmax |
maximum m/z of peaks in the group |
rt |
median retention time of peaks in the group |
rtmin |
minimum retention time of peaks in the group |
rtmax |
maximum retention time of peaks in the group |
npeaks |
number of peaks assigned to the group |
Sample Classes |
number of samples from each sample class represented in the group |
... |
one column for every sample class |
Sample Names |
integrated intensity value for every sample |
... |
one column for every sample |
peakTable(object, filebase = character(), ...)
## Not run:
library(faahKO)
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
xs <- xcmsSet(cdffiles)
xs <- group(xs)
peakTable(xs, filebase = "peakList")
## End(Not run)
The 'PercentMissingFilter' class and method enable users to filter features from an 'XcmsExperiment' or 'SummarizedExperiment' object based on the percentage (0 to 100) of missing values for each feature within each sample group, compared against a user-provided threshold.
This filter is one of the possible dispatch targets of the generic function 'filterFeatures'. Features with a percentage of missing values *higher* ('>') than the threshold in all sample groups are removed, i.e. features for which the percentage of missing values is at or below ('<=') the threshold in at least one sample group are retained.
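The retention rule can be sketched on a plain numeric matrix in base R. This is an illustration of the logic only: `percent_missing_keep`, `vals` and `grp` are hypothetical names, and the sketch does not use or reproduce the xcms classes.

```r
## Hypothetical sketch (not the xcms code): keep a feature (row) if its
## percentage of missing values is <= threshold in at least one group.
percent_missing_keep <- function(vals, f, threshold = 30) {
    f <- factor(f)
    pct <- vapply(levels(f), function(g) {
        ## Samples with NA in f are never selected by which()
        cols <- which(f == g)
        rowMeans(is.na(vals[, cols, drop = FALSE])) * 100
    }, numeric(nrow(vals)))
    pct <- matrix(pct, nrow = nrow(vals))
    rowSums(pct <= threshold) > 0
}

## Three made-up features measured in two QC and two study samples
vals <- rbind(c(10, 12, 9, 11),
              c(NA, NA, 5, 6),
              c(NA, 4, NA, 3))
grp <- c("QC", "QC", "study", "study")
keep <- percent_missing_keep(vals, grp, threshold = 30)
```

The second feature is fully missing in the QC samples but complete in the study samples, so it is retained. The third feature has 50% missing values in both groups and is removed.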
PercentMissingFilter(threshold = 30, f = factor()) ## S4 method for signature 'XcmsResult,PercentMissingFilter' filterFeatures(object, filter, ...) ## S4 method for signature 'SummarizedExperiment,PercentMissingFilter' filterFeatures(object, filter, assay = 1)
threshold |
'numeric' percentage (between 0 and 100) of accepted missing values for a feature in one sample group. |
f |
'vector' of the same length as the 'object', specifying the sample type for each sample in the dataset. The percentage of missing values per feature will be computed within each of these sample groups. Parameter 'f', if not already a 'factor', will be converted to one using the factor function. Samples with an 'NA' as their value in 'f' will be excluded from calculation. |
object |
|
filter |
The parameter object selecting and configuring the type of
filtering. It can be one of the following classes: |
... |
Optional parameters. For |
assay |
For filtering of |
For 'PercentMissingFilter': a 'PercentMissingFilter' object. 'filterFeatures' returns the input object minus the features that did not meet the user-defined threshold.
Philippine Louail
Other Filter features in xcms:
BlankFlag
,
DratioFilter
,
RsdFilter
The phenoDataFromPaths
function builds a data.frame
representing the experimental design from the folder structure in which
the files of the experiment are located.
phenoDataFromPaths(paths)
paths |
|
This function is used by the old xcmsSet
function to guess
the experimental design (i.e. group assignment of the files) from the
folders in which the files of the experiment can be found.
## List the files available in the faahKO package base_dir <- system.file("cdf", package = "faahKO") cdf_files <- list.files(base_dir, recursive = TRUE, full.names = TRUE)
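The idea of guessing the experimental design from the folder structure can be sketched in base R. The paths below are made up, and the layout of the resulting data frame (a single class column, file names without extension as row names) is an assumption for illustration, not necessarily what phenoDataFromPaths returns.

```r
## Hypothetical file paths; the parent folder encodes the sample group
paths <- c("/data/cdf/KO/ko15.CDF",
           "/data/cdf/KO/ko16.CDF",
           "/data/cdf/WT/wt15.CDF")

## Use the name of the parent folder as group assignment and the file
## name without extension as sample name
pheno_sketch <- data.frame(
    class = basename(dirname(paths)),
    row.names = sub("\\.[^.]*$", "", basename(paths))
)
```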
Batch plot a list of extracted ion chromatograms to the current graphics device.
x |
the |
y |
optional |
groupidx |
either character vector with names or integer vector with indices of peak groups for which to plot EICs |
sampleidx |
either character vector with names or integer vector with indices of samples for which to plot EICs |
rtrange |
a two column matrix with minimum and maximum retention times between which to return EIC data points if it has the same number of rows as the number groups in the
it may also be a single number specifying the time window around the peak for which to plot EIC data |
col |
color to use for plotting extracted ion chromatograms. if missing
and if it is the same length as the number groups in the |
legtext |
text to use for legend. if |
peakint |
logical, plot integrated peak area with darkened lines (requires
that |
sleep |
seconds to pause between plotting EICs |
... |
other graphical parameters |
A xcmsSet
object.
plot.xcmsEIC(x, y, groupidx = groupnames(x), sampleidx = sampnames(x), rtrange = x@rtrange,
col = rep(1, length(sampleidx)), legtext = NULL, peakint = TRUE, sleep = 0, ...)
Colin A. Smith, [email protected]
xcmsEIC-class, png, pdf, postscript
The 'plotAdjustedRtime' function plots the difference between the adjusted and *raw* retention times on the y-axis against the raw retention times on the x-axis. Each line represents the results for one sample (file). If alignment was performed using the *peak groups* method (see [adjustRtime()] for more information), the peak groups used in the alignment are also visualized.
plotAdjustedRtime( object, col = "#00000080", lty = 1, lwd = 1, type = "l", adjustedRtime = TRUE, xlab = ifelse(adjustedRtime, yes = expression(rt[adj]), no = expression(rt[raw])), ylab = expression(rt[adj] - rt[raw]), peakGroupsCol = "#00000060", peakGroupsPch = 16, peakGroupsLty = 3, ylim, ... )
object |
A [XcmsExperiment()] or [XCMSnExp()] object with the alignment results. |
col |
color(s) for the individual lines. Has to be of length 1 or equal to the number of samples. |
lty |
line type for the lines of the individual samples. |
lwd |
line width for the lines of the individual samples. |
type |
plot *type* (see [par()] for options; defaults to 'type = "l"'). |
adjustedRtime |
'logical(1)' whether adjusted or raw retention times should be shown on the x-axis. |
xlab |
the label for the x-axis. |
ylab |
the label for the y-axis. |
peakGroupsCol |
color to be used for the peak groups (only if alignment was performed using the *peak groups* method). |
peakGroupsPch |
point character ('pch') to be used for the peak groups (only if alignment was performed using the *peak groups* method). |
peakGroupsLty |
line type ('lty') to be used to connect points for each peak group (only if alignment was performed using the *peak groups* method). |
ylim |
optional 'numeric(2)' with the lower and upper limits on the y-axis. |
... |
Additional arguments to be passed down to the 'plot' function. |
Johannes Rainer
## Load a test data set with detected peaks
faahko_sub <- loadXcmsData("faahko_sub2")
## Disable parallel processing for this example
register(SerialParam())
## Performing the peak grouping using the "peak density" method.
p <- PeakDensityParam(sampleGroups = c(1, 1, 1))
res <- groupChromPeaks(faahko_sub, param = p)
## Perform the retention time adjustment using peak groups found in all
## files.
fgp <- PeakGroupsParam(minFraction = 1)
res <- adjustRtime(res, param = fgp)
## Visualize the impact of the alignment.
plotAdjustedRtime(res, adjustedRtime = FALSE)
grid()
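The quantity shown on the y-axis of this plot is simply the per-sample difference between adjusted and raw retention times. A minimal base R sketch with simulated values, in which `rt_raw`, `rt_adj` and both shift patterns are made up for illustration:

```r
## Simulated raw retention times (seconds), shared by both samples
rt_raw <- seq(0, 600, by = 60)

## Made-up adjusted retention times: a constant shift for sample 1 and a
## shift that grows with retention time for sample 2
rt_adj <- list(sample1 = rt_raw + 2,
               sample2 = rt_raw - 0.005 * rt_raw)

## One column per sample with the difference rt_adj - rt_raw
diffs <- sapply(rt_adj, function(z) z - rt_raw)
```

Drawing the columns of `diffs` against `rt_raw`, for example with matplot(rt_raw, diffs, type = "l"), would give one line per sample as described above.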
Uses the pre-generated profile mode matrix to plot averaged or base peak extracted ion chromatograms over a specified mass range.
object |
the |
base |
logical, plot a base-peak chromatogram |
ident |
logical, use mouse to identify and label peaks |
fitgauss |
logical, fit a gaussian to the largest peak |
vline |
numeric vector with locations of vertical lines |
... |
arguments passed to |
If ident == TRUE, an integer vector with the indices of the points that were identified. If fitgauss == TRUE, a nls model with the fitted gaussian. Otherwise a two-column matrix with the plotted points.
plotChrom(object, base = FALSE, ident = FALSE,
fitgauss = FALSE, vline = numeric(0), ...)
plotChromatogramsOverlay draws chromatographic peak data from multiple (different) extracted ion chromatograms (EICs) into the same plot. This allows direct comparison of the peak shapes of these EICs in the same sample. In contrast to the plot function for MChromatograms() objects, which draws the data from the same EIC across multiple samples in the same plot, this function draws the different EICs from the same sample into the same plot.
If plotChromatogramsOverlay is called on an XChromatograms object, any chromatographic peaks present will also be highlighted/drawn depending on the parameters peakType, peakCol, peakBg and peakPch (see also the help on the plot function for XChromatogram() objects for details).
## S4 method for signature 'MChromatograms' plotChromatogramsOverlay( object, col = "#00000060", type = "l", main = NULL, xlab = "rtime", ylab = "intensity", xlim = numeric(), ylim = numeric(), stacked = 0, transform = identity, ... ) ## S4 method for signature 'XChromatograms' plotChromatogramsOverlay( object, col = "#00000060", type = "l", main = NULL, xlab = "rtime", ylab = "intensity", xlim = numeric(), ylim = numeric(), peakType = c("polygon", "point", "rectangle", "none"), peakBg = NULL, peakCol = NULL, peakPch = 1, stacked = 0, transform = identity, ... )
object |
|
col |
definition of the color in which the chromatograms should be
drawn. Can be of length 1 or equal to |
type |
|
main |
optional title of the plot. If not defined, the range of m/z values is used. |
xlab |
|
ylab |
|
xlim |
optional |
ylim |
optional |
stacked |
|
transform |
|
... |
optional arguments to be passed to the plotting functions (see
help on the base R |
peakType |
if |
peakBg |
if |
peakCol |
if |
peakPch |
if |
silently returns a list (length equal to ncol(object)) of numeric (length equal to nrow(object)) with the y position of each EIC.
Johannes Rainer
## Load preprocessed data and extract EICs for some features. library(xcms) library(MSnbase) xdata <- loadXcmsData() data(xdata) ## Update the path to the files for the local system dirname(xdata) <- c(rep(system.file("cdf", "KO", package = "faahKO"), 4), rep(system.file("cdf", "WT", package = "faahKO"), 4)) ## Subset to the first 3 files. xdata <- filterFile(xdata, 1:3, keepFeatures = TRUE) ## Define features for which to extract EICs fts <- c("FT097", "FT163", "FT165") chrs <- featureChromatograms(xdata, features = fts) plotChromatogramsOverlay(chrs) ## plot the overlay of EICs in the first sample plotChromatogramsOverlay(chrs[, 1]) ## Define a different color for each feature (row in chrs). By default, also ## all chromatographic peaks of a feature is labeled in the same color. plotChromatogramsOverlay(chrs[, 1], col = c("#ff000040", "#00ff0040", "#0000ff40")) ## Alternatively, we can define a color for each individual chromatographic ## peak and provide this with the `peakBg` and `peakCol` parameters. chromPeaks(chrs[, 1]) ## Use a color for each of the two identified peaks in that sample plotChromatogramsOverlay(chrs[, 1], col = c("#ff000040", "#00ff0040", "#0000ff40"), peakBg = c("#ffff0020", "#00ffff20")) ## Plotting the data in all samples. plotChromatogramsOverlay(chrs, col = c("#ff000040", "#00ff0040", "#0000ff40")) ## Creating a "stacked" EIC plot: the EICs are placed along the y-axis ## relative to their m/z value. With `stacked = 1` the y-axis is split in ## half, the lower half being used for the stacking of the EICs, the upper ## half being used for the *original* intensity axis. 
res <- plotChromatogramsOverlay(chrs[, 1], stacked = 1, col = c("#ff000040", "#00ff0040", "#0000ff40")) ## add horizontal lines for the m/z values of each EIC abline(h = res[[1]], col = "grey", lty = 2) ## Note that this type of visualization is different than the conventional ## plot function for chromatographic data, which will draw the EICs for ## multiple samples into the same plot plot(chrs) ## Converting the object to a MChromatograms without detected peaks chrs <- as(chrs, "MChromatograms") plotChromatogramsOverlay(chrs, col = c("#ff000040", "#00ff0040", "#0000ff40"))
Plot the density of chromatographic peaks along the retention time axis and indicate which peaks would be (or were) grouped into the same feature using the peak density correspondence method. Settings for the peak density method can be passed with a PeakDensityParam object to parameter param. If the object contains correspondence results and the correspondence was performed with the peak density method, the results from that correspondence can be visualized by setting simulate = FALSE.
## S4 method for signature 'XCMSnExp' plotChromPeakDensity( object, mz, rt, param, simulate = TRUE, col = "#00000080", xlab = "retention time", ylab = "sample", xlim = range(rt), main = NULL, type = c("any", "within", "apex_within"), ... )
object |
A XCMSnExp object with identified chromatographic peaks. |
mz |
|
rt |
|
param |
PeakDensityParam from which parameters for the
peak density correspondence algorithm can be extracted. If not provided
and if |
simulate |
|
col |
Color to be used for the individual samples. Length has to be 1
or equal to the number of samples in |
xlab |
|
ylab |
|
xlim |
|
main |
|
type |
|
... |
Additional parameters to be passed to the |
The plotChromPeakDensity function allows evaluating different settings for the peak density correspondence method on an mz slice of interest (e.g. containing chromatographic peaks corresponding to a known metabolite).
The plot shows the individual peaks that were detected within the
specified mz
slice at their retention time (x-axis) and sample in
which they were detected (y-axis). The density function is plotted as a
black line. Parameters for the density
function are taken from the
param
object. Grey rectangles indicate which chromatographic peaks
would be grouped into a feature by the peak density
correspondence
method. Parameters for the algorithm are also taken from param
.
See groupChromPeaks()
for more information about the
algorithm and its supported settings.
The function is called for its side effect, i.e. to create a plot.
Johannes Rainer
groupChromPeaks()
for details on the
peak density correspondence method and supported settings.
## Load a test data set with detected peaks library(MSnbase) data(faahko_sub) ## Update the path to the files for the local system dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO") ## Plot the chromatographic peak density for a specific mz range to evaluate ## different peak density correspondence settings. mzr <- c(305.05, 305.15) plotChromPeakDensity(faahko_sub, mz = mzr, pch = 16, param = PeakDensityParam(sampleGroups = rep(1, length(fileNames(faahko_sub)))))
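The density curve shown in the plot can be approximated with the base R density function. The sketch below uses made-up peak retention times and a bandwidth of 30 chosen for illustration. How the peak density method derives feature boundaries from the curve is not reproduced here.

```r
## Made-up retention times (seconds) of peaks within one mz slice and
## the index of the sample in which each peak was detected
peak_rt <- c(100, 102, 105, 300, 303)
peak_sample <- c(1, 2, 3, 1, 2)

## Kernel density estimate along the retention time axis; bw plays the
## role of the smoothing bandwidth of the peak density method
d <- density(peak_rt, bw = 30, from = 0, to = 400)

## Retention time at which the density is highest
apex <- d$x[which.max(d$y)]
```

With these values the density has its highest mode around the three peaks near 100 seconds, which would be candidates for grouping into one feature. Increasing bw merges neighboring groups, decreasing it splits them.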
'plotChromPeaks' plots the identified chromatographic peaks from one file into the plane spanned by the retention time (x-axis) and m/z (y-axis) dimension. Each chromatographic peak is plotted as a rectangle representing its width in RT and m/z dimension.
'plotChromPeakImage' plots the number of detected peaks for each sample along the retention time axis as an *image* plot, i.e. with the number of peaks detected in each bin along the retention time represented with the color of the respective cell.
plotChromPeaks( x, file = 1, xlim = NULL, ylim = NULL, add = FALSE, border = "#00000060", col = NA, xlab = "retention time", ylab = "mz", main = NULL, msLevel = 1L, ... ) plotChromPeakImage( x, binSize = 30, xlim = NULL, log = FALSE, xlab = "retention time", yaxt = par("yaxt"), main = "Chromatographic peak counts", msLevel = 1L, ... )
x |
A [XcmsExperiment()] or [XCMSnExp()] object. |
file |
For 'plotChromPeaks': 'integer(1)' specifying the index of the file within 'x' for which the plot should be created. Defaults to 'file = 1'. |
xlim |
'numeric(2)' specifying the x-axis limits (retention time dimension). Defaults to 'xlim = NULL' in which case the full retention time range of the file is used. |
ylim |
For 'plotChromPeaks': 'numeric(2)' specifying the y-axis limits (m/z dimension). Defaults to 'ylim = NULL' in which case the full m/z range of the file is used. |
add |
For 'plotChromPeaks': 'logical(1)' whether the plot should be added to an existing plot or if a new plot should be created. |
border |
For 'plotChromPeaks': the color for the rectangles' border. |
col |
For 'plotChromPeaks': the color to be used to fill the rectangles. |
xlab |
'character(1)' defining the x-axis label. |
ylab |
For 'plotChromPeaks': 'character(1)' defining the y-axis label. |
main |
'character(1)' defining the plot title. By default (i.e. 'main = NULL') the name of the file will be used as title. |
msLevel |
'integer(1)' defining the MS level from which the peaks should be visualized. |
... |
Additional arguments passed to the 'plot' (for 'plotChromPeaks') and 'image' (for 'plotChromPeakImage') functions. Ignored for 'add = TRUE'. |
binSize |
For 'plotChromPeakImage': 'numeric(1)' defining the size of the bins along the x-axis (retention time). Defaults to 'binSize = 30'; peaks within each 30 second bin will thus be counted and plotted. |
log |
For 'plotChromPeakImage': 'logical(1)' whether the peak counts should be log2 transformed before plotting. |
yaxt |
For 'plotChromPeakImage': 'character(1)' defining whether y-axis labels should be added. To disable the y-axis use 'yaxt = "n"'. For any other value of 'yaxt' the axis will be drawn. See [par()] help page for more details. |
The width and line type of the rectangles indicating the detected chromatographic peaks for the 'plotChromPeaks' function can be specified using the 'par' function, i.e. with 'par(lwd = 3)' and 'par(lty = 2)', respectively.
Johannes Rainer
## Load a test data set with detected peaks
faahko_sub <- loadXcmsData("faahko_sub2")

## plotChromPeakImage: plot an image for the identified peaks per file
plotChromPeakImage(faahko_sub)

## Show all detected chromatographic peaks from the first file
plotChromPeaks(faahko_sub)

## Plot all detected peaks from the second file and restrict the plot to a
## mz-rt slice
plotChromPeaks(faahko_sub, file = 2, xlim = c(3500, 3600), ylim = c(400, 600))
Plot extracted ion chromatogram for m/z values of interest. The raw data is used in contrast to plotChrom which uses data from the profile matrix.
object |
|
mzrange |
m/z range for EIC. Uses the full m/z range by default. |
rtrange |
retention time range for EIC. Uses the full retention time range by default. |
scanrange |
scan range for EIC |
mzdec |
Number of decimal places of title m/z values in the eic plot. |
type |
Specifies how the data should be plotted (by default as a line). |
add |
If the EIC should be added to an existing plot. |
... |
Additional parameters passed to the plotting function
(e.g. |
A two-column matrix with the plotted points.
plotEIC(object, mzrange = numeric(), rtrange = numeric(),
scanrange = numeric(), mzdec=2, type="l", add=FALSE, ...)
Ralf Tautenhahn
plotFeatureGroups
visualizes defined feature groups in the m/z by
retention time space. Features are indicated by points with features from
the same feature group being connected by a line. See featureGroups()
for details on and options for feature grouping.
plotFeatureGroups( x, xlim = numeric(), ylim = numeric(), xlab = "retention time", ylab = "m/z", pch = 4, col = "#00000060", type = "o", main = "Feature groups", featureGroups = character(), ... )
x |
XcmsExperiment or |
xlim |
|
ylim |
|
xlab |
|
ylab |
|
pch |
the plotting character. Defaults to |
col |
color to be used to draw the features. At present only a single color is supported. |
type |
plotting type (see |
main |
|
featureGroups |
optional |
... |
additional parameters to be passed to the |
Johannes Rainer
UPDATE: please use plot() from the MsExperiment or plot(x, type = "XIC") from the MSnbase package instead. See examples in the vignette for more information.
The plotMsData function creates a plot that combines a (base peak) extracted ion chromatogram on top (rt against intensity) and a plot of rt against m/z values at the bottom.
plotMsData( x, main = "", cex = 1, mfrow = c(2, 1), grid.color = "lightgrey", colramp = colorRampPalette(rev(brewer.pal(9, "YlGnBu"))) )
x |
|
main |
|
cex |
|
mfrow |
|
grid.color |
a color definition for the grid line (or |
colramp |
a color ramp palette to be used to color the data points
based on their intensity. See argument |
Johannes Rainer
Plot extracted ion chromatograms for many peaks simultaneously, indicating peak integration start and end points with vertical grey lines.
object |
the |
peaks |
matrix with peak information as produced by |
figs |
two-element vector describing the number of rows and the number of columns of peaks to plot; if missing, an approximately square grid that fits the number of peaks supplied is used |
width |
width of chromatogram retention time to plot for each peak |
This function is intended to help graphically analyze the results of peak picking. It can help estimate the number of false positives and improper integration start and end points. Its output is very compact and tries to waste as little space as possible. Each plot is labeled with rounded m/z and retention time separated by a space.
plotPeaks(object, peaks, figs, width = 200)
xcmsRaw-class, findPeaks, split.screen
Simple visualization of the position of fragment spectra's precursor ion in the MS1 retention time by m/z area.
plotPrecursorIons( x, pch = 21, col = "#00000080", bg = "#00000020", xlab = "retention time", ylab = "m/z", main = character(), ... )
x |
|
pch |
|
col |
the color to be used for all data points. Defines the border
color if |
bg |
the background color (if |
xlab |
|
ylab |
|
main |
Optional |
... |
additional parameters to be passed to the |
Johannes Rainer
## Load a test data file with DDA LC-MS/MS data
library(MsExperiment)
fl <- system.file("TripleTOF-SWATH", "PestMix1_DDA.mzML", package = "msdata")
pest_dda <- readMsExperiment(fl)
plotPrecursorIons(pest_dda)
grid()

## Subset the data object to plot the data specifically for one or
## selected file/sample:
plotPrecursorIons(pest_dda[1L])
Use "democracy" to determine the average m/z and RT deviations for a grouped xcmsSet, and dependency on sample or absolute m/z
plotQC(object, sampNames, sampColors, sampOrder, what)
object |
A grouped |
sampNames |
Override sample names (e.g. with simplified names) |
sampColors |
Provide a set of colors (default: monochrome ?) |
sampOrder |
Override the order of samples, e.g. to bring them in order of measurement to detect time drift |
what |
A vector of which QC plots to generate.
"mzdevhist": histogram of m/z deviations. Should be Gaussian shaped. If it is multimodal, then some peaks seem to have a systematically higher m/z deviation.
"rtdevhist": histogram of RT deviations. Should be Gaussian shaped. If it is multimodal, then some peaks seem to have a systematically higher RT deviation.
"mzdevmass": shows whether m/z deviations are absolute m/z dependent, could indicate miscalibration.
"mzdevtime": shows whether m/z deviations are RT dependent, could indicate instrument drift.
"mzdevsample": median m/z deviation for each sample, indicates outliers.
"rtdevsample": median RT deviation for each sample, indicates outliers. |
plotQC() is a wrapper to create a set of diagnostic plots. For the m/z deviations, the median of all m/z values within one group is taken as the reference.
List with four matrices, each of dimension features * samples:
"mz": median mz deviation for each sample
"mzdev": median mz deviation for each sample
"rt": median RT deviation for each sample
"rtdev": median RT deviation for each sample
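The idea of measuring deviations against the per-group median can be sketched in base R. This is an illustration under stated assumptions, not the code used by plotQC(): the matrix `mz` and its dimnames are made up, and the exact summaries plotQC() computes may differ.

```r
## Hypothetical m/z values: rows are features (peak groups), columns samples.
mz <- matrix(c(200.001, 200.003, 199.999,
               350.100, 350.104, 350.102),
             nrow = 2, byrow = TRUE,
             dimnames = list(c("FT1", "FT2"), c("s1", "s2", "s3")))
## Reference m/z per feature: the median across samples.
mz_ref <- apply(mz, 1, median, na.rm = TRUE)
## Deviation of each peak from the median m/z of its feature.
mz_dev <- sweep(mz, 1, mz_ref, "-")
## Median deviation per sample, the kind of summary that can flag outliers.
mz_dev_sample <- apply(mz_dev, 2, median, na.rm = TRUE)
mz_dev_sample
```

A sample whose median deviation sits far from zero relative to the others would be a candidate for a calibration problem.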
Michael Wenk <[email protected]>
library(faahKO)
xsg <- group(faahko)
plotQC(xsg, what="mzdevhist")
plotQC(xsg, what="rtdevhist")
plotQC(xsg, what="mzdevmass")
plotQC(xsg, what="mzdevtime")
plotQC(xsg, what="mzdevsample")
plotQC(xsg, what="rtdevsample")
Produce a scatterplot showing raw data point location in retention time and m/z. This plot is more useful for centroided data than continuum data.
object |
the |
mzrange |
numeric vector of length >= 2 whose range will be used to select the masses to plot |
rtrange |
numeric vector of length >= 2 whose range will be used to select the retention times to plot |
scanrange |
numeric vector of length >= 2 whose range will be used to select scans to plot |
log |
logical, log transform intensity |
title |
main title of the plot |
A matrix with the points plotted.
plotRaw(object, mzrange = numeric(), rtrange = numeric(),
scanrange = numeric(), log=FALSE, title='Raw Data')
Use corrected retention times for each sample to calculate retention time deviation profiles and plot each on the same graph.
object |
the |
col |
vector of colors for plotting each sample |
ty |
vector of line and point types for plotting each sample |
leg |
logical, plot legend with sample labels |
densplit |
logical, also plot overall peak density |
plotrt(object, col = NULL, ty = NULL, leg = TRUE,
densplit = FALSE)
Plot a single mass scan using the impulse representation. Most useful for centroided data.
object |
the |
scan |
integer with number of scan to plot |
mzrange |
numeric vector of length >= 2 whose range will be used to select masses to plot |
ident |
logical, use mouse to interactively identify and label individual masses |
plotScan(object, scan, mzrange = numeric(), ident = FALSE)
Uses the pre-generated profile mode matrix to plot mass spectra over a specified retention time range.
object |
the |
ident |
logical, use mouse to identify and label peaks |
vline |
numeric vector with locations of vertical lines |
... |
arguments passed to |
If ident == TRUE, an integer vector with the indices of the points that were identified. Otherwise a two-column matrix with the plotted points.
plotSpec(object, ident = FALSE, vline = numeric(0), ...)
This method uses the rgl package to create interactive three-dimensional representations of the profile matrix. It uses the terrain color scheme.
object |
the |
log |
logical, log transform intensity |
aspect |
numeric vector with aspect ratio of the m/z, retention time and intensity components of the plot |
... |
arguments passed to |
The rgl package is still in development and imposes some limitations on the output format. A bug in the axis label code means that the axis labels only go from 0 to the aspect ratio constant of that axis. Additionally the axes are not labeled with what they are.
It is important to only plot a small portion of the profile matrix. Large portions can quickly overwhelm your CPU and memory.
plotSurf(object, log = FALSE, aspect = c(1, 1, .5), ...)
Plot chromatogram of total ion count. Optionally allow identification of target peaks and viewing/identification of individual spectra.
object |
the |
ident |
logical, use mouse to identify and label chromatographic peaks |
msident |
logical, use mouse to identify and label spectral peaks |
If ident == TRUE, an integer vector with the indices of the points that were identified. Otherwise a two-column matrix with the plotted points.
plotTIC(object, ident = FALSE, msident = FALSE)
Objects of the type ProcessHistory allow to keep track of any data processing step in a metabolomics experiment. They are created by the data processing methods, such as findChromPeaks, and added to the corresponding results objects. Thus, usually, users don't need to create them.
The XProcessHistory extends the ProcessHistory by adding a slot param that allows to store the actual parameter class of the processing step.
processParam, processParam<-: get or set the parameter class from an XProcessHistory object.
msLevel: returns the MS level on which a certain analysis has been performed, or NA if not defined.
The processType method returns a character specifying the processing step type.
The processDate extracts the start date of the processing step.
The processInfo extracts optional additional information on the processing step.
The fileIndex extracts the indices of the files on which the processing step was applied.
## S4 method for signature 'ProcessHistory'
show(object)

## S4 method for signature 'XProcessHistory'
show(object)

## S4 method for signature 'XProcessHistory'
processParam(object)

## S4 method for signature 'XProcessHistory'
msLevel(object)

## S4 method for signature 'ProcessHistory'
processType(object)

## S4 method for signature 'ProcessHistory'
processDate(object)

## S4 method for signature 'ProcessHistory'
processInfo(object)

## S4 method for signature 'ProcessHistory'
fileIndex(object)
object |
A |
For processParam: a parameter object extending the Param class.
The processType method returns a character string with the processing step type.
The processDate method returns a character string with the time stamp of the processing step start.
The processInfo method returns a character string with optional additional information.
The fileIndex method returns an integer vector with the index of the files/samples on which the processing step was applied.
type
character(1): string defining the type of the processing step. This string has to match predefined values. Use processHistoryTypes to list them.
date
character(1): date time stamp when the processing step was started.
info
character(1): optional additional information.
fileIndex
integer of length 1 or > 1 to specify on which samples of the object the processing was performed.
error
(ANY): used to store potential calculation errors.
param
(Param): an object of type Param (e.g. CentWaveParam) specifying the settings of the processing step.
msLevel
integer defining the MS level(s) on which the analysis was performed.
Johannes Rainer
The profile matrix is an n x m matrix, n (rows) representing equally spaced m/z values (bins) and m (columns) the retention time of the corresponding scans. Each cell contains the maximum intensity measured for the specific scan and m/z values falling within the m/z bin.
The `profMat` method creates a new profile matrix or returns the profile matrix within the object's `@env` slot, if available. Settings for the profile matrix generation, such as `step` (the bin size), `method` or additional settings are extracted from the respective slots of the `xcmsRaw` object. Alternatively it is possible to specify all of the settings as additional parameters. For [MsExperiment()] or [XcmsExperiment()] objects, the method returns a `list` of profile matrices, one for each sample in `object`. Using parameter `fileIndex` it is also possible to create a profile matrix only for selected samples (files).
## S4 method for signature 'MsExperiment'
profMat(
  object,
  method = "bin",
  step = 0.1,
  baselevel = NULL,
  basespace = NULL,
  mzrange. = NULL,
  fileIndex = seq_along(object),
  chunkSize = 1L,
  msLevel = 1L,
  BPPARAM = bpparam(),
  ...
)

## S4 method for signature 'xcmsRaw'
profMat(object, method, step, baselevel, basespace, mzrange.)
object |
An |
method |
|
step |
|
baselevel |
|
basespace |
|
mzrange. |
Optional |
fileIndex |
For |
chunkSize |
For |
msLevel |
For |
BPPARAM |
For |
... |
ignored. |
Profile matrix generation methods:
"bin": The default profile matrix generation method that does a simple binning, i.e. aggregating of intensity values falling within an m/z bin.
"binlin": Binning followed by linear interpolation to impute missing values. The value for m/z bins without a measured intensity are inferred by a linear interpolation between neighboring bins with a measured intensity.
"binlinbase": Binning followed by a linear interpolation to impute values for empty elements (m/z bins) within a user-definable proximity to non-empty elements while setting the element's value to the baselevel otherwise. See impute = "linbase" parameter of imputeLinInterpol() for more details.
"intlin": Set the elements' values to the integral of the linearly interpolated data from plus to minus half the step size.
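The difference between plain binning and binning with linear interpolation can be sketched for a single scan in base R. This is a simplified illustration, not the xcms code: the `mz` and `intensity` vectors are made up, empty bins are represented as `NA` here (how xcms represents them is not shown in this text), and the bin boundaries are an assumption.

```r
## Hypothetical centroided scan: m/z values and matching intensities.
mz <- c(100.02, 100.07, 100.61, 100.95)
intensity <- c(10, 40, 30, 8)
step <- 0.3
## Equally spaced bin boundaries (5 boundaries define 4 bins).
brks <- seq(floor(min(mz)), by = step, length.out = 5)
bin <- findInterval(mz, brks)
## "bin"-like aggregation: keep the maximum intensity per m/z bin.
binned <- rep(NA_real_, length(brks) - 1)
mx <- tapply(intensity, bin, max)
binned[as.integer(names(mx))] <- mx
## "binlin"-like imputation: fill empty bins by linear interpolation
## between neighboring bins with a measured intensity.
idx <- seq_along(binned)
binlin <- approx(idx[!is.na(binned)], binned[!is.na(binned)], xout = idx)$y
binned
binlin
```

Repeating this per scan and binding the results as columns would give a matrix with m/z bins in rows and scans in columns.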
profMat returns the profile matrix (rows representing equally spaced m/z values, columns scans). For object being a MsExperiment or XcmsExperiment, the method returns a list of profile matrices, one for each file (sample).
Johannes Rainer
file <- system.file('cdf/KO/ko15.CDF', package = "faahKO")

## Load the data without generating the profile matrix (profstep = 0)
xraw <- xcmsRaw(file, profstep = 0)

## Extract the profile matrix
profmat <- profMat(xraw, step = 0.3)
dim(profmat)

## If not otherwise specified, the settings from the xraw object are used:
profinfo(xraw)

## To extract a profile matrix with linear interpolation use
profmat <- profMat(xraw, step = 0.3, method = "binlin")

## Alternatively, the profMethod of the xraw objects could be changed
profMethod(xraw) <- "binlin"
profmat_2 <- profMat(xraw, step = 0.3)
all.equal(profmat, profmat_2)
Apply a median filter of given size to a profile matrix.
object |
the |
massrad |
number of m/z grid points on either side to use for median calculation |
scanrad |
number of scan grid points on either side to use for median calculation |
profMedFilt(object, massrad = 0, scanrad = 0)
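What a median filter with `massrad` and `scanrad` does to a matrix can be sketched in base R. The helper `med_filt` below is hypothetical and is not the profMedFilt implementation; in particular, truncating the window at the matrix edges is an assumption.

```r
## Hypothetical profile matrix: rows are m/z bins, columns are scans.
prof <- matrix(c(1, 1, 1, 1,
                 1, 9, 1, 1,
                 1, 1, 1, 1), nrow = 3, byrow = TRUE)
## Median over a window of +/- massrad rows and +/- scanrad columns,
## truncated at the matrix edges (an assumption of this sketch).
med_filt <- function(x, massrad = 0, scanrad = 0) {
    out <- x
    for (i in seq_len(nrow(x))) {
        for (j in seq_len(ncol(x))) {
            rows <- max(1, i - massrad):min(nrow(x), i + massrad)
            cols <- max(1, j - scanrad):min(ncol(x), j + scanrad)
            out[i, j] <- median(x[rows, cols])
        }
    }
    out
}
filtered <- med_filt(prof, massrad = 1, scanrad = 1)
filtered
```

The isolated spike in the toy matrix is removed, while a radius of 0 in both dimensions leaves the matrix unchanged.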
These methods get and set the method for generating profile (matrix) data from raw mass spectral data. It can currently be bin, binlin, binlinbase, or intlin.
profMethod(object)
xcmsRaw-class, profMethod, profBin, plotSpec, plotChrom, findPeaks
Specify a subset of the profile mode matrix given a mass, time, or scan range. Allow flexible user entry for other functions.
object |
the |
mzrange |
single numeric mass or vector of masses |
rtrange |
single numeric time (in seconds) or vector of times |
scanrange |
single integer scan index or vector of indices |
... |
arguments to other functions |
This function handles selection of mass/time subsets of the profile matrix for other functions. It allows the user to specify such subsets in a variety of flexible ways with minimal typing.
Because R does partial argument matching, mzrange, scanrange, and rtrange can be specified in short form using m=, s=, and t=, respectively. If both a scanrange and rtrange are specified, then the rtrange specification takes precedence.
When specifying ranges, you may either enter a single number or a numeric vector. If a single number is entered, then the closest single scan or mass value is selected. If a vector is entered, then the range is set to the range() of the values entered. That allows specification of ranges using shortened, slightly non-standard syntax. For example, one could specify 400 to 500 seconds using any of the following: t=c(400,500), t=c(500,400), or t=400:500. Use of the sequence operator (:) can save several keystrokes when specifying ranges. However, while the sequence operator works well for specifying integer ranges, fractional ranges do not always work as well.
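The behaviour of range() on these inputs can be checked directly in base R. The variable names below are only for illustration; the last call shows one way a fractional range built with the sequence operator can fall short of the intended upper limit.

```r
## All three spellings describe the same 400 to 500 second window.
r1 <- range(c(400, 500))
r2 <- range(c(500, 400))
r3 <- range(400:500)
## With fractional limits the sequence operator steps by 1 from the start
## value, so the last element is 499.5 rather than 500.2.
r4 <- range(400.5:500.2)
r4
```

For fractional limits, passing the two end points explicitly with c() avoids this surprise.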
A list with the following items:
mzrange |
numeric vector with start and end mass |
masslab |
textual label of mass range |
massidx |
integer vector of mass indices |
scanrange |
integer vector with start and end scans |
scanlab |
textual label of scan range |
scanidx |
integer vector of scan range |
rtrange |
numeric vector of start and end times |
timelab |
textual label of time range |
profRange(object, mzrange = numeric(),
rtrange = numeric(), scanrange = numeric(),
...)
These methods get and set the m/z step for generating profile (matrix) data from raw mass spectral data. Smaller steps yield more precision at the cost of greater memory usage.
profStep(object)
## Not run:
library(faahKO)
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
xset <- xcmsRaw(cdffiles[1])
xset
plotSurf(xset, mass=c(200,500))
profStep(xset)<-0.1 ## decrease the bin size to get better resolution
plotSurf(xset, mass=c(200, 500))
##works nicer on high resolution data.
## End(Not run)
featureValues,XCMSnExp: extract a matrix for feature values with rows representing features and columns samples. Parameter value allows to define which column from the chromPeaks matrix should be returned. Multiple chromatographic peaks from the same sample can be assigned to a feature. Parameter method allows to specify the method to be used in such cases to choose from which of the peaks the value should be returned. Parameter 'msLevel' allows to choose a specific MS level for which feature values should be returned (given that features have been defined for that MS level).
quantify,XCMSnExp: return the preprocessing results as a SummarizedExperiment object containing the feature abundances as assay matrix, the feature definitions (returned by featureDefinitions) as rowData and the phenotype information as colData. This is an ideal container for further processing of the data. Internally, the featureValues method is used to extract the feature abundances; parameters for that method can be passed to quantify with ....
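How the `method` argument of featureValues could resolve several peaks from the same sample can be sketched in base R. This is an illustration only: the `peaks` data frame is made up, the reading of "medret" (peak closest to the median retention time of the feature), "maxint" (peak with the largest signal) and "sum" (sum of the values) is an assumption not spelled out in the text above, and the xcms internals may differ.

```r
## Hypothetical peaks assigned to one feature: two peaks in sample 1.
peaks <- data.frame(
    sample = c(1, 1, 2, 3),
    rt     = c(100.2, 103.9, 100.6, 100.4),
    into   = c(5000, 1200, 4800, 5100)
)
## "maxint"-like: per sample, report the value of the largest peak.
val_maxint <- tapply(peaks$into, peaks$sample, max)
## "sum"-like: per sample, sum the values of all assigned peaks.
val_sum <- tapply(peaks$into, peaks$sample, sum)
## "medret"-like: per sample, pick the peak closest to the median rt.
med_rt <- median(peaks$rt)
val_medret <- sapply(split(peaks, peaks$sample), function(p)
    p$into[which.min(abs(p$rt - med_rt))])
rbind(val_maxint, val_sum, val_medret)
```

Only sample 1 has a conflict in this toy data, so the three strategies differ only in its column.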
## S4 method for signature 'XCMSnExp'
quantify(object, ...)

## S4 method for signature 'XCMSnExp'
featureValues(
  object,
  method = c("medret", "maxint", "sum"),
  value = "into",
  intensity = "into",
  filled = TRUE,
  missing = NA,
  msLevel = integer()
)
object |
A |
... |
For |
method |
|
value |
|
intensity |
|
filled |
|
missing |
how missing values should be reported. Allowed values are
|
msLevel |
for 'featureValues': 'integer' defining the MS level(s) for which feature values should be returned. By default, values for features defined for all MS levels are returned. |
For featureValues: a matrix with feature values, columns representing samples, rows features. The order of the features matches the order found in the featureDefinitions(object) DataFrame. The rownames of the matrix are the same as those of the featureDefinitions DataFrame. NA is reported for features without corresponding chromatographic peak in the respective sample(s).
For quantify: a SummarizedExperiment representing the preprocessing results.
This method is equivalent to the groupval method for xcmsSet objects. Note that missing = 0 should be used to get the same behaviour as groupval, i.e. report missing values as 0 after a call to fillPeaks.
Johannes Rainer
XCMSnExp for information on the data object.
featureDefinitions to extract the DataFrame with the feature definitions.
featureChromatograms to extract ion chromatograms for each feature.
hasFeatures to evaluate whether the XCMSnExp provides feature definitions.
groupval for the equivalent method on xcmsSet objects.
Generate extracted ion chromatogram for m/z values of interest. The raw data is used in contrast to getEIC which uses data from the profile matrix (i.e. values binned along the m/z dimension).
object |
|
mzrange |
m/z range for EIC |
rtrange |
retention time range for EIC |
scanrange |
scan range for EIC |
A list of:
scan |
scan number |
intensity |
added intensity values |
rawEIC(object, mzrange = numeric(), rtrange = numeric(), scanrange = numeric())
Ralf Tautenhahn
Returns a matrix with columns for time, m/z, and intensity that represents the raw data from a chromatography mass spectrometry experiment.
object |
The container of the raw data |
mzrange |
Subset by m/z range |
rtrange |
Subset by retention time range |
scanrange |
Subset by scan index range |
log |
Whether to log transform the intensities |
A numeric matrix with three columns: time, mz and intensity.
rawMat(object, mzrange = numeric(), rtrange = numeric(),
scanrange = numeric(), log=FALSE)
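The relation between a three-column raw data matrix and an extracted ion chromatogram with added intensities can be sketched in base R. The matrix `raw` below is made-up data in the time, mz, intensity layout described above; the code is an illustration and not the rawMat or rawEIC implementation.

```r
## Hypothetical raw data in the three-column layout described above.
raw <- cbind(
    time      = c(1, 1, 1, 2, 2, 3),
    mz        = c(100.1, 200.2, 200.3, 100.1, 200.25, 200.2),
    intensity = c(10, 50, 5, 12, 70, 40)
)
mzrange <- c(200, 201)
## Keep data points inside the m/z window ...
keep <- raw[, "mz"] >= mzrange[1] & raw[, "mz"] <= mzrange[2]
## ... and add up their intensities per time point.
eic <- tapply(raw[keep, "intensity"], raw[keep, "time"], sum)
eic
```

Each element of `eic` is the summed signal of all data points within the m/z window at one time point.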
Michael Lawrence
plotRaw
for plotting the raw intensities
Reconstructs MS2 spectra for each MS1 chromatographic peak (if possible) for data independent acquisition (DIA) data (such as SWATH). See the LC-MS/MS analysis vignette for more details and examples.
reconstructChromPeakSpectra(object, ...)

## S4 method for signature 'XcmsExperiment'
reconstructChromPeakSpectra(
  object,
  expandRt = 0,
  diffRt = 2,
  minCor = 0.8,
  intensity = "maxo",
  peakId = rownames(chromPeaks(object, msLevel = 1L)),
  BPPARAM = bpparam()
)

## S4 method for signature 'XCMSnExp'
reconstructChromPeakSpectra(
  object,
  expandRt = 0,
  diffRt = 2,
  minCor = 0.8,
  intensity = "maxo",
  peakId = rownames(chromPeaks(object, msLevel = 1L)),
  BPPARAM = bpparam(),
  return.type = c("Spectra", "MSpectra")
)
object |
|
... |
ignored. |
expandRt |
|
diffRt |
|
minCor |
|
intensity |
|
peakId |
optional |
BPPARAM |
parallel processing setup. See |
return.type |
|
In detail, the function performs the following steps for each MS1 chromatographic peak:
Identify all MS2 chromatographic peaks from the isolation window containing the m/z of the ion (i.e. the MS1 chromatographic peak) with approximately the same retention time as the MS1 peak (accepted rt shift can be specified with the diffRt parameter).
Correlate the peak shapes of the candidate MS2 chromatographic peaks with the peak shape of the MS1 peak retaining only MS2 chromatographic peaks for which the correlation is > minCor.
Reconstruct the MS2 spectrum using the m/z of all above selected MS2 chromatographic peaks and their intensity (either "maxo" or "into"). Each MS2 chromatographic peak selected for an MS1 peak will thus represent one mass peak in the reconstructed spectrum.
The resulting Spectra() object also provides the peak IDs of the MS2 chromatographic peaks for each spectrum as well as their correlation value with spectra variables ms2_peak_id and ms2_peak_cor.
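The correlation and reconstruction steps can be sketched in base R on toy data. This is a simplified illustration, not the xcms implementation: all intensities, m/z values and peak names are made up, the candidates are assumed to have already passed the isolation window and `diffRt` checks, and peak shapes are assumed to be sampled at the same retention times.

```r
## Hypothetical intensities along the same retention time points.
ms1 <- c(1, 4, 9, 4, 1)            # MS1 peak shape
ms2 <- rbind(
    a = c(2, 8, 18, 8, 2),         # same shape as the MS1 peak
    b = c(9, 7, 5, 3, 1),          # different shape
    c = c(1, 5, 8, 5, 0)           # similar shape
)
ms2_mz <- c(a = 85.03, b = 99.01, c = 121.05)
## Apex intensity of each candidate, standing in for "maxo".
ms2_maxo <- apply(ms2, 1, max)
minCor <- 0.8
## Correlate each candidate MS2 peak shape with the MS1 peak shape.
cors <- apply(ms2, 1, cor, y = ms1)
keep <- cors > minCor
## Each retained MS2 chromatographic peak contributes one mass peak.
spectrum <- cbind(mz = ms2_mz[keep], intensity = ms2_maxo[keep])
spectrum
```

In this toy example candidate `b` is dropped because its shape does not follow the MS1 peak, so the reconstructed spectrum has two mass peaks.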
Spectra() object (defined in the Spectra package) with the reconstructed MS2 spectra for all MS1 peaks in object. Contains empty spectra (i.e. without m/z and intensity values) for MS1 peaks for which reconstruction was not possible (either no MS2 signal was recorded or the correlation of the MS2 chromatographic peaks with the MS1 chromatographic peak was below threshold minCor). Spectra variables "ms2_peak_id" and "ms2_peak_cor" (of type CharacterList() and NumericList() with length equal to the number of peaks per reconstructed MS2 spectrum) provide the IDs and the correlation of the MS2 chromatographic peaks from which the MS2 spectrum was reconstructed. The median retention time of all MS2 chromatographic peaks used for the spectrum reconstruction is reported as retention time. The MS1 chromatographic peak intensity is reported as the reconstructed spectrum's precursorIntensity value (see parameter intensity above).
Johannes Rainer, Michael Witting
findChromPeaksIsolationWindow() for the function to perform MS2 peak detection in DIA isolation windows and for examples.
The refineChromPeaks method performs a post-processing of the chromatographic peak detection step to clean and improve the results. The function can be applied to an XcmsExperiment() or XCMSnExp() object after peak detection with findChromPeaks(). The type of peak refinement and cleaning can be defined, along with all its settings, using one of the following parameter objects:
CleanPeaksParam: remove chromatographic peaks with a retention time range larger than the provided maximal acceptable width (maxPeakwidth).
FilterIntensityParam: remove chromatographic peaks with intensities below the specified threshold. By default (with nValues = 1) values in the chromPeaks matrix are evaluated: all peaks with a value in the column defined with parameter value that are >= a threshold (defined with parameter threshold) are retained. If nValues is larger than 1, the individual peak intensities from the raw MS files are evaluated: chromatographic peaks with at least nValues mass peaks >= threshold are retained.
MergeNeighboringPeaksParam: peak detection sometimes fails to identify a chromatographic peak correctly, especially for broad peaks and if the peak shape is irregular (mostly for HILIC data). In such cases several smaller peaks are reported. Also, peak detection with centWave can result in partially or completely overlapping peaks. This method aims to reduce such peak detection artifacts by merging chromatographic peaks that are overlapping or close in RT and m/z dimension (considering also the measured signal between them). See section Details for MergeNeighboringPeaksParam for details and a comprehensive description of the approach.
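The default FilterIntensityParam rule (with nValues = 1) amounts to a row filter on the peak matrix, which can be sketched in base R. The matrix `pks`, its values and its row names are made up for illustration; this is not the xcms implementation and does not cover the nValues > 1 case, which needs the raw MS data.

```r
## Hypothetical subset of a chromPeaks-like matrix.
pks <- cbind(
    mz   = c(150.1, 220.4, 310.2, 415.9),
    rt   = c(35.2, 80.1, 120.7, 300.3),
    maxo = c(800, 15000, 420, 9100)
)
rownames(pks) <- c("CP1", "CP2", "CP3", "CP4")
threshold <- 1000
value <- "maxo"
## Retain peaks whose entry in the 'value' column is >= threshold.
retained <- pks[pks[, value] >= threshold, , drop = FALSE]
retained
```

Using `drop = FALSE` keeps the result a matrix even if only one peak passes the threshold.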
refineChromPeaks methods will always remove feature definitions, because a call to this method can change or remove identified chromatographic peaks, which may be part of features.
refineChromPeaks(object, param, ...)

## S4 method for signature 'XcmsExperiment,CleanPeaksParam'
refineChromPeaks(object, param = CleanPeaksParam(), msLevel = 1L)

## S4 method for signature 'XcmsExperiment,MergeNeighboringPeaksParam'
refineChromPeaks(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  BPPARAM = bpparam()
)

## S4 method for signature 'XcmsExperiment,FilterIntensityParam'
refineChromPeaks(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  BPPARAM = bpparam()
)

CleanPeaksParam(maxPeakwidth = 10)

MergeNeighboringPeaksParam(
  expandRt = 2,
  expandMz = 0,
  ppm = 10,
  minProp = 0.75
)

FilterIntensityParam(threshold = 0, nValues = 1L, value = "maxo")

## S4 method for signature 'XCMSnExp,CleanPeaksParam'
refineChromPeaks(object, param = CleanPeaksParam(), msLevel = 1L)

## S4 method for signature 'XCMSnExp,MergeNeighboringPeaksParam'
refineChromPeaks(
  object,
  param = MergeNeighboringPeaksParam(),
  msLevel = 1L,
  BPPARAM = bpparam()
)

## S4 method for signature 'XCMSnExp,FilterIntensityParam'
refineChromPeaks(
  object,
  param = FilterIntensityParam(),
  msLevel = 1L,
  BPPARAM = bpparam()
)
object |
XCMSnExp or XcmsExperiment object with identified chromatographic peaks. |
param |
Object defining the refinement method and its settings. |
... |
ignored. |
msLevel |
|
chunkSize |
For |
BPPARAM |
parameter object to set up parallel processing. Uses the
default parallel processing setup returned by |
maxPeakwidth |
For |
expandRt |
For |
expandMz |
For |
ppm |
For |
minProp |
For |
threshold |
For |
nValues |
For |
value |
For |
XCMSnExp
or XcmsExperiment object with the refined
chromatographic peaks.
For peak refinement using the MergeNeighboringPeaksParam, chromatographic peaks are first expanded in m/z and retention time dimension (based on parameters expandMz, ppm and expandRt) and subsequently grouped into sets of merge candidates if they are (after expansion) overlapping in both m/z and rt (within the same sample). Note that each peak gets expanded by expandRt and expandMz, thus peaks differing by less than 2 * expandMz (or 2 * expandRt) will be evaluated for merging.
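The expansion step can be illustrated with a small base R sketch. The helpers `expand_peak` and `overlaps` are hypothetical names introduced for this example, and the way `ppm` is added to the m/z bounds is an assumption, not the xcms implementation.

```r
## Hypothetical helper: widen the boundaries of one peak.
## ASSUMPTION: ppm is applied relative to each m/z bound.
expand_peak <- function(pk, expandRt = 2, expandMz = 0, ppm = 10) {
    c(mzmin = pk[["mzmin"]] - expandMz - pk[["mzmin"]] * ppm / 1e6,
      mzmax = pk[["mzmax"]] + expandMz + pk[["mzmax"]] * ppm / 1e6,
      rtmin = pk[["rtmin"]] - expandRt,
      rtmax = pk[["rtmax"]] + expandRt)
}

## Hypothetical helper: do two (expanded) peaks overlap in both m/z and rt?
overlaps <- function(a, b) {
    a[["rtmin"]] <= b[["rtmax"]] && b[["rtmin"]] <= a[["rtmax"]] &&
        a[["mzmin"]] <= b[["mzmax"]] && b[["mzmin"]] <= a[["mzmax"]]
}

## Two peaks with the same m/z range, separated by 3 seconds
p1 <- c(mzmin = 305.09, mzmax = 305.11, rtmin = 100, rtmax = 110)
p2 <- c(mzmin = 305.09, mzmax = 305.11, rtmin = 113, rtmax = 120)

## Gap of 3 s is below 2 * expandRt = 4 s: merge candidates
cand_wide <- overlaps(expand_peak(p1, expandRt = 2),
                      expand_peak(p2, expandRt = 2))
## Gap of 3 s is above 2 * expandRt = 2 s: not evaluated for merging
cand_narrow <- overlaps(expand_peak(p1, expandRt = 1),
                        expand_peak(p2, expandRt = 1))
```

Because both peaks are widened, the effective tolerance between neighbours is twice the value of the parameter, which is why the second call no longer reports an overlap.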
Peak merging is performed along the retention time axis, i.e., the peaks are first ordered by their "rtmin" and merge candidates are defined iteratively starting with the first peak. Candidate peaks are merged if the average intensity of the 3 data points in the middle position between them (i.e., at half the distance between "rtmax" of the first and "rtmin" of the second peak) is larger than a certain proportion (minProp) of the smaller ("maxo") intensity of both peaks. In cases in which this calculated mid point is not located between the apexes of the two peaks (e.g., if the peaks are largely overlapping) the average signal intensity at half way between the apexes is used instead. Candidate peaks are not merged if all 3 data points between them have NA intensities.
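The merge criterion described above can be sketched on a plain intensity trace. `should_merge` is a hypothetical helper written for this illustration; choosing the 3 data points as the one closest to the mid point plus its two neighbours is an assumption about how "the 3 data points in the middle position" are selected.

```r
## Hypothetical helper: decide whether two candidate peaks get merged.
## rt, int : retention times and intensities of the extracted signal
## pk1, pk2: named vectors with "rtmin", "rtmax", "rt" (apex) and "maxo"
should_merge <- function(rt, int, pk1, pk2, minProp = 0.75) {
    mid <- (pk1[["rtmax"]] + pk2[["rtmin"]]) / 2
    ## Use the position half way between the apexes if the mid point is
    ## not located between them
    if (mid < pk1[["rt"]] || mid > pk2[["rt"]])
        mid <- (pk1[["rt"]] + pk2[["rt"]]) / 2
    ## ASSUMPTION: closest data point plus its two neighbours
    idx <- which.min(abs(rt - mid))
    idx <- intersect(seq(idx - 1, idx + 1), seq_along(rt))
    vals <- int[idx]
    if (all(is.na(vals)))
        return(FALSE)
    mean(vals, na.rm = TRUE) > minProp * min(pk1[["maxo"]], pk2[["maxo"]])
}

rt <- 1:21
pk1 <- c(rtmin = 1, rtmax = 10, rt = 6, maxo = 100)
pk2 <- c(rtmin = 12, rtmax = 21, rt = 16, maxo = 100)

## Shallow valley between the apexes: signal stays above 75% of maxo
int_split <- c(0, 10, 40, 80, 95, 100, 95, 90, 85, 82, 80,
               82, 85, 90, 95, 100, 95, 80, 40, 10, 0)
## Deep valley: signal drops close to baseline between the peaks
int_sep <- c(0, 10, 40, 80, 95, 100, 80, 40, 20, 10, 5,
             10, 20, 40, 80, 100, 95, 80, 40, 10, 0)
## No signal recorded between the peaks
int_na <- int_sep
int_na[10:12] <- NA

merge_split <- should_merge(rt, int_split, pk1, pk2)
merge_sep <- should_merge(rt, int_sep, pk1, pk2)
merge_na <- should_merge(rt, int_na, pk1, pk2)
```

Only the first trace fulfils the criterion; lowering minProp would make the test more permissive.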
Merged peaks get the "mz", "rt", "sn" and "maxo" values from the peak with the largest signal ("maxo") as well as its row in the metadata of the peak (chromPeakData). The "rtmin" and "rtmax" of the merged peaks are updated and "into" is recalculated based on all signal between "rtmin" and "rtmax" and the newly defined "mzmin" and "mzmax" (which is the range of "mzmin" and "mzmax" of the merged peaks after expanding by expandMz and ppm). The reported "mzmin" and "mzmax" for the merged peak represent the m/z range of all non-NA intensities used for the calculation of the peak signal ("into").
Johannes Rainer, Mar Garcia-Aloy
## Load a test data set with detected peaks
library(xcms)
library(MsExperiment)
faahko_sub <- loadXcmsData("faahko_sub2")

## Disable parallel processing for this example
register(SerialParam())

####
## CleanPeaksParam:

## Distribution of chromatographic peak widths
quantile(chromPeaks(faahko_sub)[, "rtmax"] - chromPeaks(faahko_sub)[, "rtmin"])

## Remove all chromatographic peaks with a width larger 60 seconds
data <- refineChromPeaks(faahko_sub, param = CleanPeaksParam(60))

quantile(chromPeaks(data)[, "rtmax"] - chromPeaks(data)[, "rtmin"])

####
## FilterIntensityParam:

## Remove all peaks with a maximal intensity below 50000
res <- refineChromPeaks(faahko_sub,
    param = FilterIntensityParam(threshold = 50000))

nrow(chromPeaks(faahko_sub))
nrow(chromPeaks(res))

####
## MergeNeighboringPeaksParam:

## Subset to a single file
xd <- filterFile(faahko_sub, file = 1)

## Example of a split peak that will be merged
mzr <- 305.1 + c(-0.01, 0.01)
chr <- chromatogram(xd, mz = mzr, rt = c(2700, 3700))
plot(chr)

## Combine the peaks
res <- refineChromPeaks(xd, param = MergeNeighboringPeaksParam(expandRt = 4))
chr_res <- chromatogram(res, mz = mzr, rt = c(2700, 3700))
plot(chr_res)

## Example of a peak that was not merged, because the signal between them
## is lower than the cut-off minProp
mzr <- 496.2 + c(-0.01, 0.01)
chr <- chromatogram(xd, mz = mzr, rt = c(3200, 3500))
plot(chr)
chr_res <- chromatogram(res, mz = mzr, rt = c(3200, 3500))
plot(chr_res)
removeIntensity allows to remove intensities from chromatographic data matching certain conditions (depending on parameter which). The intensities are actually not removed but replaced with NA_real_. To actually remove the intensities (and the associated retention times) use clean() afterwards.
Parameter which allows to specify which intensities should be replaced by NA_real_. By default (which = "below_threshold") intensities below threshold are removed. If object is a XChromatogram or XChromatograms object (and hence also provides chromatographic peak definitions within the object), which = "outside_chromPeak" can be selected, which removes any intensity that is outside the boundaries of identified chromatographic peak(s) in the chromatographic data.
Note that filterIntensity() might be a better approach to subset/filter chromatographic data.
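The two-step behaviour described above (mask first, drop afterwards) can be sketched in base R on plain vectors. This sketch does not call removeIntensity() or clean(); `rt` and `int` are made-up example data and the second step only mimics what the text attributes to clean().

```r
## Made-up chromatographic data
rt <- 1:10
int <- c(5, 29, 50, NA, 100, 12, 3, 4, 1, 3)
threshold <- 20

## Step 1: replace intensities below the threshold with NA_real_;
## the retention times are left untouched
masked <- int
masked[which(masked < threshold)] <- NA_real_

## Step 2: drop the masked data points together with their retention times
keep <- !is.na(masked)
cleaned <- data.frame(rtime = rt[keep], intensity = masked[keep])
```

After step 1 the vector still has 10 elements, which keeps it aligned with the retention times; only step 2 shortens the data.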
## S4 method for signature 'Chromatogram'
removeIntensity(object, which = "below_threshold", threshold = 0)

## S4 method for signature 'MChromatograms'
removeIntensity(object, which = "below_threshold", threshold = 0)

## S4 method for signature 'XChromatogram'
removeIntensity(
  object,
  which = c("below_threshold", "outside_chromPeak"),
  threshold = 0
)
object |
an object representing chromatographic data. Can be a
|
which |
|
threshold |
|
the input object with matching intensities being replaced by NA
.
Johannes Rainer
library(MSnbase)
chr <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
    intensity = c(5, 29, 50, NA, 100, 12, 3, 4, 1, 3))

## Remove all intensities below 20
res <- removeIntensity(chr, threshold = 20)
intensity(res)
To correct differences in retention times between different samples, a number of methods exist in XCMS. retcor is the generic method.
object |
|
method |
Method to use for retention time correction. See details. |
... |
Optional arguments to be passed along |
Different algorithms can be used by specifying them with the
method
argument. For example to use the approach described by
Smith et al (2006) one would use: retcor(object,
method="loess")
. This is also the default.
Further arguments given by ...
are
passed through to the function implementing
the method
.
A character vector of nicknames for the
algorithms available is returned by
getOption("BioC")$xcms$retcor.methods
.
If the nickname of a method is called "loess",
the help page for that specific method can
be accessed with ?retcor.loess
.
An xcmsSet object with corrected retention times.
retcor(object, ...)
retcor.loess
retcor.obiwarp
xcmsSet-class
,
Calculate retention time deviations for each sample. It is based on the code at http://obi-warp.sourceforge.net/. However, this function is able to align multiple samples, by a center-star strategy.
For the original publication see
Chromatographic Alignment of ESI-LC-MS Proteomics Data Sets by Ordered Bijective Interpolated Warping. John T. Prince and Edward M. Marcotte. Analytical Chemistry 2006 78 (17), 6140-6152
object |
the |
plottype |
if |
profStep |
step size (in m/z) to use for profile generation from the raw data files |
center |
the index of the sample all others will be aligned to. If center==NULL, the sample with the most peaks is chosen as default. |
col |
vector of colors for plotting each sample |
ty |
vector of line and point types for plotting each sample |
response |
Responsiveness of warping. 0 will give a linear warp based on start and end points. 100 will use all bijective anchors |
distFunc |
DistFunc function: cor (Pearson's R) or cor_opt (default, calculate only 10% diagonal band of distance matrix, better runtime), cov (covariance), prd (product), euc (Euclidean distance) |
gapInit |
Penalty for Gap opening, see below |
gapExtend |
Penalty for Gap enlargement, see below |
factorDiag |
Local weighting applied to diagonal moves in alignment. |
factorGap |
Local weighting applied to gap moves in alignment. |
localAlignment |
Local rather than global alignment |
initPenalty |
Penalty for initiating alignment (for local alignment only) Default: 0 |
Default gap penalties: (gapInit, gapExtend) [by distFunc type]: 'cor' = '0.3,2.4' 'cov' = '0,11.7' 'prd' = '0,7.8' 'euc' = '0.9,1.8'
An xcmsSet
object
retcor(object, method="obiwarp", plottype = c("none", "deviation"), profStep=1, center=NULL, col = NULL, ty = NULL, response=1, distFunc="cor_opt", gapInit=NULL, gapExtend=NULL, factorDiag=2, factorGap=1, localAlignment=0, initPenalty=0)
These two methods use “well behaved” peak groups to calculate retention time deviations for every time point of each sample. Use smoothed deviations to align retention times.
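The idea of smoothing retention time deviations and subtracting them can be sketched with base R's loess() on simulated data. This is only an illustration of the principle under stated assumptions: the simulated peak groups, the shape of the drift and the span value are made up, and the sketch does not reproduce how xcms selects peak groups or applies the correction.

```r
set.seed(1)

## Simulated "well behaved" peak groups: reference retention time and the
## retention time observed in one sample (smooth drift plus small noise)
rt_ref <- seq(100, 3000, length.out = 40)
rt_obs <- rt_ref + 5 * sin(rt_ref / 1000) + rnorm(40, sd = 0.2)

## Deviation of the sample from the reference at each peak group
rt_dev <- rt_obs - rt_ref

## Smooth the deviations by local polynomial regression
fit <- loess(rt_dev ~ rt_obs, span = 0.5, degree = 2, family = "gaussian")

## Subtract the smoothed deviation to obtain adjusted retention times
rt_adj <- rt_obs - predict(fit, newdata = data.frame(rt_obs = rt_obs))

## Mean absolute deviation from the reference before and after
err_before <- mean(abs(rt_obs - rt_ref))
err_after <- mean(abs(rt_adj - rt_ref))
```

A smaller span follows local deviations more closely but is more sensitive to noisy peak groups; family = "symmetric" would make the fit more robust against outliers.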
object |
the |
missing |
number of missing samples to allow in retention time correction groups |
extra |
number of extra peaks to allow in retention time correction groups |
smooth |
either |
span |
degree of smoothing for local polynomial regression fitting |
family |
if |
plottype |
if |
col |
vector of colors for plotting each sample |
ty |
vector of line and point types for plotting each sample |
An xcmsSet
object
retcor(object, missing = 1, extra = 1,
smooth = c("loess", "linear"), span = .2,
family = c("gaussian", "symmetric"),
plottype = c("none", "deviation", "mdevden"),
col = NULL, ty = NULL)
xcmsSet-class
,
loess
retcor.obiwarp
Expands (or contracts) the retention time window in each row of
a matrix as defined by the retmin
and retmax
columns.
retexp(peakrange, width = 200)
peakrange |
matrix with columns |
width |
new width for the window |
The altered matrix.
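A minimal sketch of such a window adjustment is shown below. `retexp_sketch` is a hypothetical stand-in, and it assumes that each window is re-centred on its midpoint and given the requested width; the column names follow the description above.

```r
## Hypothetical stand-in for the window adjustment.
## ASSUMPTION: the new window is centred on the midpoint of the old one.
retexp_sketch <- function(peakrange, width = 200) {
    centre <- rowMeans(peakrange[, c("retmin", "retmax"), drop = FALSE])
    peakrange[, "retmin"] <- centre - width / 2
    peakrange[, "retmax"] <- centre + width / 2
    peakrange
}

## One narrow window (expanded) and one wide window (contracted)
pr <- cbind(retmin = c(100, 500), retmax = c(120, 900))
pr_new <- retexp_sketch(pr, width = 200)
```

The first window grows from 20 to 200 seconds while the second shrinks from 400 to 200 seconds, both keeping their original centre.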
Colin A. Smith, [email protected]
rla
calculates the relative log abundances (RLA, see reference) on a
numeric
vector.
rla(x, group, log.transform = TRUE)

rowRla(x, group, log.transform = TRUE)
x |
|
group |
|
log.transform |
|
The RLA is defined as the (log) abundance of an analyte relative to the median across all abundances of the same group.
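The definition can be written out in a few lines of base R. `rla_sketch` and `rowRla_sketch` are hypothetical names used for this illustration, and the use of log2 is an assumption, since the text does not state the base of the logarithm.

```r
## Hypothetical re-implementation of the definition above
rla_sketch <- function(x, group, log.transform = TRUE) {
    if (log.transform)
        x <- log2(x)              # ASSUMPTION: base 2 logarithm
    ## subtract the per-group median from each value
    x - ave(x, group, FUN = function(v) median(v, na.rm = TRUE))
}

## Row-wise variant for a matrix (rows = analytes, columns = samples)
rowRla_sketch <- function(x, group, log.transform = TRUE) {
    t(apply(x, 1, rla_sketch, group = group, log.transform = log.transform))
}

x <- c(3, 4, 5, 1, 2, 3, 7, 8, 9)
grp <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)
res <- rla_sketch(x, grp)

m <- rbind(x, x * 2)
res_m <- rowRla_sketch(m, grp)
```

The median value of each group ends up at exactly 0, so samples whose RLA distribution is shifted away from 0 stand out in a box plot.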
numeric of the same length as x (for rla) or matrix with the same dimensions as x (for rowRla).
Johannes Rainer
De Livera AM, Dias DA, De Souza D, Rupasinghe T, Pyke J, Tull D, Roessner U, McConville M, Speed TP. Normalizing and integrating metabolomics data. Anal Chem 2012 Dec 18;84(24):10768-76.
x <- c(3, 4, 5, 1, 2, 3, 7, 8, 9)
grp <- c(1, 1, 1, 2, 2, 2, 3, 3, 3)

rla(x, grp)
The 'RsdFilter' class and methods enable users to filter features from an 'XcmsExperiment' or 'SummarizedExperiment' object based on their relative standard deviation (coefficient of variation) for a specified threshold.
This 'filter' is part of the possible dispatch of the generic function 'filterFeatures'. Features *above* ('>') the user-input threshold will be removed from the entire dataset.
RsdFilter(threshold = 0.3, qcIndex = integer(), na.rm = TRUE, mad = FALSE)

## S4 method for signature 'XcmsResult,RsdFilter'
filterFeatures(object, filter, ...)

## S4 method for signature 'SummarizedExperiment,RsdFilter'
filterFeatures(object, filter, assay = 1)
threshold |
'numeric' value representing the threshold. Features with a coefficient of variation *strictly higher* ('>') than this will be removed from the entire dataset. |
qcIndex |
'integer' (or 'logical') vector corresponding to the indices of QC samples. |
na.rm |
'logical' indicates whether missing values ('NA') should be removed prior to the calculations. |
mad |
'logical' indicates whether the *Median Absolute Deviation* (MAD) should be used instead of the standard deviation. This is suggested for non-gaussian distributed data. |
object |
|
filter |
The parameter object selecting and configuring the type of
filtering. It can be one of the following classes: |
... |
Optional parameters. For |
assay |
For filtering of |
For 'RsdFilter': a 'RsdFilter' class. 'filterFeatures' returns the input object minus the features that did not meet the user input threshold.
It is assumed that the abundance values are in natural scale. Abundances in log scale should be first transformed to natural scale before calculating the RSD.
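The filtering rule can be sketched on a plain abundance matrix in base R. `row_rsd` is a hypothetical helper for this example; dividing the MAD by the median when mad = TRUE is an assumption of the sketch, and the matrix `abund` is made-up data in natural scale.

```r
## Hypothetical helper: relative standard deviation per feature (row)
row_rsd <- function(x, na.rm = TRUE, mad = FALSE) {
    if (mad) {
        ## ASSUMPTION: robust variant relates the MAD to the median
        apply(x, 1, stats::mad, na.rm = na.rm) /
            apply(x, 1, median, na.rm = na.rm)
    } else {
        apply(x, 1, sd, na.rm = na.rm) / rowMeans(x, na.rm = na.rm)
    }
}

## Made-up abundances: rows are features, columns are QC samples
abund <- rbind(f1 = c(100, 102, 98, 101),
               f2 = c(100, 200, 50, 150),
               f3 = c(10, 11, NA, 9))
qcIndex <- 1:4
threshold <- 0.3

rsd <- row_rsd(abund[, qcIndex, drop = FALSE])

## Features with an RSD strictly above the threshold are removed
filtered <- abund[which(rsd <= threshold), , drop = FALSE]
```

Feature f2 varies strongly across the QC samples (RSD of about 0.52) and is dropped, while f1 and f3 are kept.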
Philippine Louail
Other Filter features in xcms:
BlankFlag
,
DratioFilter
,
PercentMissingFilter
Return sample names for an object
A character vector with sample names.
sampnames(object)
If peak detection is performed with findPeaks and argument stopOnError = FALSE, errors that occur during processing do not stop it but are recorded in the resulting xcmsSet object. These errors can be accessed with the showError method.
## S4 method for signature 'xcmsSet'
showError(object, message. = TRUE, ...)
object |
An |
message. |
Logical indicating whether only the error message, or the error itself should be returned. |
... |
Additional arguments. |
A list of error messages (if message. = TRUE
) or errors or an
empty list if no errors are present.
Johannes Rainer
There are several methods for calculating a distance between two sets of peaks in xcms. specDist
is the generic method.
object |
a xcmsSet or xcmsRaw. |
method |
Method to use for distance calculation. See details. |
... |
mzabs, mzppm and parameters for the distance function. |
Different algorithms can be used by specifying them with the
method
argument. For example to use the "meanMZmatch"
approach with xcmsSet one would use:
specDist(object, peakIDs1, peakIDs2, method="meanMZmatch")
. This is also
the default.
Further arguments given by ...
are
passed through to the function implementing
the method
.
A character vector of nicknames for the
algorithms available is returned by
getOption("BioC")$xcms$specDist.methods
.
If the nickname of a method is called "meanMZmatch",
the help page for that specific method can
be accessed with ?specDist.meanMZmatch
.
mzabs |
maximum absolute deviation for two matching peaks |
mzppm |
relative deviations in ppm for two matching peaks |
symmetric |
use symmetric pairwise m/z-matches only, or each match |
specDist(object, peakIDs1, peakIDs2,...)
specDist(object, PSpec1, PSpec2,...)
Joachim Kutzera, [email protected]
This method calculates the distance of two sets of peaks using the cosine-distance.
specDist.cosine(peakTable1, peakTable2, mzabs=0.001, mzppm=10, mzExp=0.6, intExp=3, nPdiff=2, nPmin=8, symmetric=FALSE)
peakTable1 |
a Matrix containing at least m/z-values, row must be called "mz" |
peakTable2 |
the matrix for the other mz-values |
mzabs |
maximum absolute deviation for two matching peaks |
mzppm |
relative deviations in ppm for two matching peaks |
symmetric |
use symmetric pairwise m/z-matches only, or each match |
mzExp |
the exponent used for mz |
intExp |
the exponent used for intensity |
nPdiff |
the maximum nrow-difference of the two peaktables |
nPmin |
the minimum absolute sum of peaks from both peaktables |
The result is the cosine-distance of the product from weighted factors of mz and intensity from matching peaks in the two peaktables. The factors are calculated as wFact = mz^mzExp * int^intExp. If no distance is calculated (for example because no matching peaks were found) the return-value is NA.
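The weighting formula can be illustrated with a short base R sketch. `weighted_cosine` is a hypothetical helper that assumes the peaks of both tables have already been matched (so both intensity vectors refer to the same m/z values); it returns the cosine of the angle between the two weight vectors and makes no claim about how xcms turns this into the reported distance.

```r
## Hypothetical helper: cosine between weighted factors of matched peaks.
## ASSUMPTION: int1[i] and int2[i] belong to the same matched m/z value.
weighted_cosine <- function(mz, int1, int2, mzExp = 0.6, intExp = 3) {
    w1 <- mz^mzExp * int1^intExp   # wFact for the first peak table
    w2 <- mz^mzExp * int2^intExp   # wFact for the second peak table
    sum(w1 * w2) / sqrt(sum(w1^2) * sum(w2^2))
}

mz <- c(100, 150, 200)
a <- c(1, 0.5, 0.2)     # relative intensities in spectrum 1
b <- c(0.2, 0.5, 1)     # relative intensities in spectrum 2

same <- weighted_cosine(mz, a, a)
diff_ab <- weighted_cosine(mz, a, b)
```

With intExp = 3 the most intense peaks dominate the weights, so two spectra whose base peaks differ score much lower than the raw intensities alone would suggest.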
specDist.cosine(peakTable1, peakTable2, mzabs = 0.001, mzppm = 10,
mzExp = 0.6, intExp = 3, nPdiff = 2, nPmin = 8,
symmetric = FALSE)
Joachim Kutzera, [email protected]
This method calculates the distance of two sets of peaks.
specDist.meanMZmatch(peakTable1, peakTable2, matchdist=1, matchrate=1, mzabs=0.001, mzppm=10, symmetric=TRUE)
peakTable1 |
a Matrix containing at least m/z-values, row must be called "mz" |
peakTable2 |
the matrix for the other mz-values |
mzabs |
maximum absolute deviation for two matching peaks |
mzppm |
relative deviations in ppm for two matching peaks |
symmetric |
use symmetric pairwise m/z-matches only, or each match |
matchdist |
the weight for value one (see details) |
matchrate |
the weight for value two |
The result of the calculation is a weighted sum of two values. Value one is the mean absolute difference of the matching peaks, value two is the relation of matching peaks and non matching peaks. If no distance is calculated (for example because no matching peaks were found) the return-value is NA.
specDist.meanMZmatch(peakTable1, peakTable2,
matchdist=1, matchrate=1,
mzabs=0.001, mzppm=10, symmetric=TRUE)
Joachim Kutzera, [email protected]
This method calculates the distance of two sets of peaks by just returning the number of matching peaks (m/z-values).
specDist.peakCount(peakTable1, peakTable2, mzabs=0.001, mzppm=10, symmetric=FALSE)
peakTable1 |
a Matrix containing at least m/z-values, row must be called "mz" |
peakTable2 |
the matrix for the other mz-values |
mzabs |
maximum absolute deviation for two matching peaks |
mzppm |
relative deviations in ppm for two matching peaks |
symmetric |
use symmetric pairwise m/z-matches only, or each match |
specDist.peakCount(peakTable1, peakTable2, mzppm=10,symmetric=FALSE )
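Counting matching m/z values within a tolerance can be sketched as follows. `count_matches` is a hypothetical helper; combining mzabs and mzppm by adding the absolute and the relative tolerance is an assumption of this sketch, and the symmetric option is not modelled.

```r
## Hypothetical helper: number of m/z values in mz1 with a match in mz2.
## ASSUMPTION: tolerance = absolute part + ppm-dependent part.
count_matches <- function(mz1, mz2, mzabs = 0.001, mzppm = 10) {
    tol <- mzabs + mz1 * mzppm / 1e6
    hits <- vapply(seq_along(mz1),
                   function(i) any(abs(mz2 - mz1[i]) <= tol[i]),
                   logical(1))
    sum(hits)
}

mz1 <- c(100, 200, 300)
mz2 <- c(100.0005, 200.5, 300.003)

n_match <- count_matches(mz1, mz2)
```

Here the first and third values match within tolerance, the second one (0.5 m/z units apart) does not.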
Joachim Kutzera, [email protected]
Given a sparse continuum mass spectrum, determine regions where no signal is present, substituting half of the minimum intensity for those regions. Calculate the noise level as the weighted mean of the regions with signal and the regions without signal. If there is only one raw peak, return zero.
specNoise(spec, gap = quantile(diff(spec[, "mz"]), 0.9))
spec |
matrix with named columns |
gap |
threshold above which two data points are considered to be separated by a blank region and not bridged by an interpolating line |
The default gap value is determined from the 90th percentile of the pair-wise differences between adjacent mass values.
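The default gap and the blank regions it implies can be reproduced on a small made-up spectrum. The sketch covers only the gap detection, not the noise calculation itself; the m/z values are chosen to be exactly representable so that the spacing within each signal region is constant.

```r
## Made-up sparse continuum spectrum: three signal regions with a regular
## spacing of 0.125 m/z units, separated by two regions without data
mz <- c(100 + 0.125 * (0:9),
        105 + 0.125 * (0:9),
        112 + 0.125 * (0:9))

## Default gap: 90th percentile of the differences between adjacent values
gap <- quantile(diff(mz), 0.9)

## Positions after which a blank region starts
blank_after <- which(diff(mz) > gap)
```

Because most differences equal the regular spacing, the 90th percentile equals that spacing and only the two large jumps are treated as blank regions.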
A numeric noise level
Colin A. Smith, [email protected]
Given a spectrum, identify and list significant peaks as determined by several criteria.
specPeaks(spec, sn = 20, mzgap = 0.2)
spec |
matrix with named columns |
sn |
minimum signal to noise ratio |
mzgap |
minimal distance between adjacent peaks, with smaller peaks being excluded |
Peaks must meet two criteria to be considered peaks: 1) Their s/n ratio must exceed a certain threshold. 2) They must not be within a given distance of any greater intensity peaks.
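The two criteria can be applied to a list of candidate peaks as in the sketch below. `filter_peaks` is a hypothetical helper; it assumes that candidate apexes and a single noise level are already available, and treating the signal to noise threshold as inclusive is also an assumption.

```r
## Hypothetical helper: apply the two criteria to candidate peaks.
## mz, intensity: apex position and height of each candidate
## noise        : ASSUMED single noise level for the whole spectrum
filter_peaks <- function(mz, intensity, noise, sn = 20, mzgap = 0.2) {
    ## 1) signal to noise ratio must reach the threshold
    ok_sn <- intensity / noise >= sn
    ## 2) no peak of greater intensity within mzgap
    near_greater <- vapply(seq_along(mz), function(i)
        any(intensity > intensity[i] & abs(mz - mz[i]) < mzgap),
        logical(1))
    which(ok_sn & !near_greater)
}

mz <- c(100.0, 100.1, 150.0, 200.0)
intensity <- c(1000, 400, 50, 800)

kept <- filter_peaks(mz, intensity, noise = 10)
```

The second candidate is dropped because a more intense peak lies 0.1 m/z units away, and the third because its signal to noise ratio is only 5.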
A matrix with columns:
mz |
m/z at maximum peak intensity |
intensity |
maximum intensity of the peak |
fwhm |
full width at half max of the peak |
Colin A. Smith, [email protected]
Divides the scans from a xcmsRaw
object into
a list of multiple objects. MS$^n$ data is discarded.
x |
|
f |
factor such that |
drop |
logical indicating if levels that do not occur should be dropped (if 'f' is a 'factor' or a list). |
... |
further potential arguments passed to methods. |
A list of xcmsRaw
objects.
split(x, f, drop = TRUE, ...)
Steffen Neumann, [email protected]
Divides the samples and peaks from a xcmsSet
object into
a list of multiple objects. Group data is discarded.
xs |
|
f |
factor such that |
drop |
logical indicating if levels that do not occur should be dropped (if 'f' is a 'factor' or a list). |
... |
further potential arguments passed to methods. |
A list of xcmsSet
objects.
split(x, f, drop = TRUE, ...)
Colin A. Smith, [email protected]
This selfStart model evaluates the Gaussian model and its gradient. It has an initial attribute that will evaluate the initial estimates of the parameters mu, sigma, and h.
SSgauss(x, mu, sigma, h)
x |
a numeric vector of values at which to evaluate the model |
mu |
mean of the distribution function |
sigma |
standard deviation of the distribution function |
h |
height of the distribution function |
Initial values for mu
and h
are chosen from the
maximal value of x
. The initial value for sigma
is
determined from the area under x
divided by h*sqrt(2*pi)
.
A numeric vector of the same length as x. It is the value of the expression h*exp(-(x-mu)^2/(2*sigma^2)), which is a modified gaussian function where the maximum height is treated as a separate parameter not dependent on sigma. If arguments mu, sigma, and h are names of objects, the gradient matrix with respect to these names is attached as an attribute named gradient.
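The model expression and the initial estimates described above can be checked numerically in base R. The function `gauss` and the trapezoidal area calculation are written for this illustration only and are not taken from the SSgauss code.

```r
## The modified Gaussian with an independent height parameter
gauss <- function(x, mu, sigma, h) h * exp(-(x - mu)^2 / (2 * sigma^2))

## Simulated noise-free signal
x <- seq(-5, 5, by = 0.1)
y <- gauss(x, mu = 1, sigma = 0.8, h = 10)

## Initial estimates: height and position from the maximum of the signal
h0 <- max(y)
mu0 <- x[which.max(y)]

## Area under the curve (trapezoidal rule, an assumption of this sketch)
area <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

## For a Gaussian the area equals h * sigma * sqrt(2 * pi),
## hence sigma can be estimated from area and height
sigma0 <- area / (h0 * sqrt(2 * pi))
```

On clean data the estimates recover the true parameters almost exactly, which makes them useful starting values for a non-linear fit on noisy data.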
Colin A. Smith, [email protected]
Fixes gaps in data due to calibration scans or lock mass. Automatically detects file type and calls the relevant method. The mzXML file keeps the data the same length in time but overwrites the lock mass scans. The netCDF version adds the scans back into the data thereby increasing the length of the data and correcting for the unseen gap.
object |
An |
lockMass |
A dataframe of locations of the gaps |
freq |
The intervals of the lock mass scans |
start |
The starting lock mass scan location, default is 1 |
makeacqNum locates the gaps using the starting lock mass scan and its interval. This data frame is then used in stitch to correct for the gap caused by the lock mass. Correction works by using scans from either side of the gap to fill it in.
stitch
A corrected xcmsRaw-class
object
makeacqNum
A numeric vector of scan locations corresponding to lock Mass scans
stitch(object, lockMass=numeric())
makeacqNum(object, freq=numeric(), start=1)
Paul Benton, [email protected]
## Not run:
library(xcms)
library(faahKO)
## These files do not have this problem to correct for but just
## for an example
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
xr<-xcmsRaw(cdffiles[1])
xr
## Let's assume that the lockmass starts at 1 and is every 100 scans
lockMass<-xcms:::makeacqNum(xr, freq=100, start=1)
## these are equal
lockmass<-AutoLockMass(xr)
ob<-stitch(xr, lockMass)
ob

## plot the old data before correction
foo<-rawEIC(xr, m=c(200,210), scan=c(80,140))
plot(foo$scan, foo$intensity, type="h")

## plot the new corrected data to see what changed
foo<-rawEIC(ob, m=c(200,210), scan=c(80,140))
plot(foo$scan, foo$intensity, type="h")

## End(Not run)
xcmsSet object
This method updates an old xcmsSet object to the latest definition.
## S4 method for signature 'xcmsSet'
updateObject(object, ..., verbose = FALSE)
object |
The |
... |
Optional additional arguments. Currently ignored. |
verbose |
Currently ignored. |
An updated xcmsSet
containing all data from
the input object.
Johannes Rainer
This function allows to enable the usage of old, partially deprecated code from xcms by setting a corresponding global option. See details for functions affected.
useOriginalCode(x)
x |
|
The functions/methods that are affected by this option are:
do_findChromPeaks_matchedFilter: use the original code that iteratively creates a subset of the binned (profile) matrix. This is helpful for computers with limited memory or matchedFilter settings with a very small bin size.
logical(1)
indicating whether old code is being used.
For parallel processing using the SOCKS method (e.g. by SnowParam()
on
Windows computers) this option might not be passed to the individual R
processes performing the calculations. In such cases it is suggested to
specify the option manually and system-wide by adding the line
options(XCMSuseOriginalCode = TRUE)
in a file called .Rprofile in the
folder in which new R processes are started (usually the user's
home directory; to ensure that the option is correctly read add a new line
to the file too). See also Startup from the base R documentation on how to
specify system-wide options for R.
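A minimal sketch of the manual setting mentioned above, together with a check that can be run inside a worker process to confirm that the option was picked up. Only base R's options() and getOption() are used; the option name is the one given in the text.

```r
## Line to place in the .Rprofile file (followed by a newline)
options(XCMSuseOriginalCode = TRUE)

## Check, e.g. from within a worker process, whether the option is set
orig_code_enabled <- isTRUE(getOption("XCMSuseOriginalCode"))
orig_code_enabled
```

If the check returns FALSE in a worker although the option was set in the main session, the worker did not read the profile file.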
Usage of old code is strongly discouraged. This function is intended to be used mainly in the transition phase from xcms to xcms version 3.
Johannes Rainer
Export in XML data formats: verify the written data
verify.mzQuantML(filename, xsdfilename)
filename |
filename (may include full path) for the output file. Pipes or URLs are not allowed. |
xsdfilename |
Filename of the XSD to verify against (may include full path) |
The verify.mzQuantML() function will verify a PSI standard format mzQuantML document against the XSD schema, see http://www.psidev.info/mzquantml
None.
Write the raw data to a (simple) CDF file.
object |
the |
filename |
filename (may include full path) for the CDF file. Pipes or URLs are not allowed. |
Currently the only application known to read the resulting file is XCMS. Others, especially those which build on the AndiMS library, will refuse to load the output.
None.
write.cdf(object, filename)
Write the raw data to a (simple) mzData file.
object |
the |
filename |
filename (may include full path) for the mzData file. Pipes or URLs are not allowed. |
This function will export a given xcmsRaw object to an mzData file. The mzData file will contain a <spectrumList> containing the <spectrum> with mass and intensity values in 32 bit precision. Other formats are currently not supported. Any header information (e.g. additional <software> information or <cvParams>) will be lost. Currently, also any MSn information will not be stored.
None.
write.mzdata(object, filename)
Export in XML data formats: Write the processed data in an xcmsSet to mzQuantML.
object |
the |
filename |
filename (may include full path) for the output file. Pipes or URLs are not allowed. |
The write.mzQuantML() function will write a (grouped) xcmsSet into the PSI standard format mzQuantML, see http://www.psidev.info/mzquantml
None.
write.mzQuantML(object, filename)
xcmsSet-class
,
xcmsSet
,
verify.mzQuantML
,
writeMSData
exports mass spectrometry data in mzML or mzXML format.
If adjusted retention times are present, these are used as retention time of
the exported spectra.
## S4 method for signature 'XCMSnExp,character'
writeMSData(
  object,
  file,
  outformat = c("mzml", "mzxml"),
  copy = FALSE,
  software_processing = NULL,
  ...
)
object |
XCMSnExp object with the mass spectrometry data. |
file |
|
outformat |
|
copy |
|
software_processing |
optionally provide specific data processing steps.
See documentation of the |
... |
Additional parameters to pass down to the |
Johannes Rainer
writeMSData()
function in the MSnbase
package.
Write the grouped xcmsSet to an mzTab file.
object |
the |
filename |
filename (may include full path) for the mzTab file. Pipes or URLs are not allowed. |
The mzTab file format for MS-based metabolomics (and proteomics) is a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files from xcms contain small molecule sections together with experimental metadata and basic quantitative information. The format is intended to store a simple summary of the final results.
None.
writeMzTab(object, filename)
library(faahKO)
xs <- group(faahko)

mzt <- data.frame(character(0))
mzt <- xcms:::mzTabHeader(mzt,
    version="1.1.0", mode="Complete", type="Quantification",
    description="faahKO", xset=xs)
mzt <- xcms:::mzTabAddSME(mzt, xs)

xcms:::writeMzTab(mzt, "faahKO.mzTab")
The XChromatogram
object allows to store chromatographic data (e.g.
an extracted ion chromatogram) along with identified chromatographic peaks
within that data. The object inherits all functions from the Chromatogram()
object in the MSnbase
package.
Multiple XChromatogram
objects can be stored in a XChromatograms
object.
This class extends MChromatograms()
from the MSnbase
package and allows
thus to arrange chromatograms in a matrix-like structure, columns
representing samples and rows m/z-retention time ranges.
All functions are described (grouped into topic-related sections) after the Arguments section.
XChromatograms(data, phenoData, featureData, chromPeaks, chromPeakData, ...)

XChromatogram(
  rtime = numeric(),
  intensity = numeric(),
  mz = c(NA_real_, NA_real_),
  filterMz = c(NA_real_, NA_real_),
  precursorMz = c(NA_real_, NA_real_),
  productMz = c(NA_real_, NA_real_),
  fromFile = integer(),
  aggregationFun = character(),
  msLevel = 1L,
  chromPeaks,
  chromPeakData
)

## S4 method for signature 'XChromatogram'
show(object)

## S4 method for signature 'XChromatogram'
chromPeaks(
  object,
  rt = numeric(),
  mz = numeric(),
  ppm = 0,
  type = c("any", "within", "apex_within"),
  msLevel
)

## S4 replacement method for signature 'XChromatogram'
chromPeaks(object) <- value

## S4 method for signature 'XChromatogram,ANY'
plot(
  x,
  col = "#00000060",
  lty = 1,
  type = "l",
  xlab = "retention time",
  ylab = "intensity",
  main = NULL,
  peakType = c("polygon", "point", "rectangle", "none"),
  peakCol = "#00000060",
  peakBg = "#00000020",
  peakPch = 1,
  ...
)

## S4 method for signature 'XChromatogram'
filterMz(object, mz, ...)

## S4 method for signature 'XChromatogram'
filterRt(object, rt, ...)

## S4 method for signature 'XChromatogram'
hasChromPeaks(object)

## S4 method for signature 'XChromatogram'
dropFilledChromPeaks(object)

## S4 method for signature 'XChromatogram'
chromPeakData(object)

## S4 replacement method for signature 'XChromatogram'
chromPeakData(object) <- value

## S4 method for signature 'XChromatogram,MergeNeighboringPeaksParam'
refineChromPeaks(object, param = MergeNeighboringPeaksParam())

## S4 method for signature 'XChromatogram'
filterChromPeaks(object, method = c("keepTop"), ...)

## S4 method for signature 'XChromatogram'
transformIntensity(object, FUN = identity)

## S4 method for signature 'XChromatograms'
show(object)

## S4 method for signature 'XChromatograms'
hasChromPeaks(object)

## S4 method for signature 'XChromatograms'
hasFilledChromPeaks(object)

## S4 method for signature 'XChromatograms'
chromPeaks(
  object,
  rt = numeric(),
  mz = numeric(),
  ppm = 0,
  type = c("any", "within", "apex_within"),
  msLevel
)

## S4 method for signature 'XChromatograms'
chromPeakData(object)

## S4 method for signature 'XChromatograms'
filterMz(object, mz, ...)

## S4 method for signature 'XChromatograms'
filterRt(object, rt, ...)

## S4 method for signature 'XChromatograms,ANY'
plot(
  x,
  col = "#00000060",
  lty = 1,
  type = "l",
  xlab = "retention time",
  ylab = "intensity",
  main = NULL,
  peakType = c("polygon", "point", "rectangle", "none"),
  peakCol = "#00000060",
  peakBg = "#00000020",
  peakPch = 1,
  ...
)

## S4 method for signature 'XChromatograms'
processHistory(object, fileIndex, type)

## S4 method for signature 'XChromatograms'
hasFeatures(object, ...)

## S4 method for signature 'XChromatograms'
dropFeatureDefinitions(object, ...)

## S4 method for signature 'XChromatograms,PeakDensityParam'
groupChromPeaks(object, param)

## S4 method for signature 'XChromatograms'
featureDefinitions(
  object,
  mz = numeric(),
  rt = numeric(),
  ppm = 0,
  type = c("any", "within", "apex_within")
)

## S4 method for signature 'XChromatograms,ANY,ANY,ANY'
x[i, j, drop = TRUE]

## S4 method for signature 'XChromatograms'
featureValues(
  object,
  method = c("medret", "maxint", "sum"),
  value = "into",
  intensity = "into",
  missing = NA,
  ...
)

## S4 method for signature 'XChromatograms'
plotChromPeakDensity(
  object,
  param,
  col = "#00000060",
  xlab = "retention time",
  main = NULL,
  peakType = c("polygon", "point", "rectangle", "none"),
  peakCol = "#00000060",
  peakBg = "#00000020",
  peakPch = 1,
  simulate = TRUE,
  ...
)

## S4 method for signature 'XChromatograms'
dropFilledChromPeaks(object)

## S4 method for signature 'XChromatograms,MergeNeighboringPeaksParam'
refineChromPeaks(object, param = MergeNeighboringPeaksParam())

## S4 method for signature 'XChromatograms'
filterChromPeaks(object, method = c("keepTop"), ...)

## S4 method for signature 'XChromatograms'
transformIntensity(object, FUN = identity)
data |
For |
phenoData |
For |
featureData |
For |
chromPeaks |
For |
chromPeakData |
For |
... |
For |
rtime |
For |
intensity |
For `featureValues`: `character(1)` specifying the name of the column in `chromPeaks(object)` containing the intensity value of the peak that should be used for the `method = "maxint"` conflict resolution. |
mz |
For |
filterMz |
For |
precursorMz |
For |
productMz |
For |
fromFile |
For |
aggregationFun |
For |
msLevel |
For |
object |
An |
rt |
For |
ppm |
For |
type |
For `plot`: what type of plot should be used for the chromatogram (such as `"l"` for lines, `"p"` for points etc.), see help of [plot()] in the `graphics` package for more details. For `processHistory`: restrict returned processing steps to specific types. Use [processHistoryTypes()] to list all supported values. |
value |
For `featureValues`: `character(1)` specifying the name of the column in `chromPeaks(object)` that should be returned, or `"index"` to return the index of the peak associated with the feature in each sample. The default shown in the usage above, `value = "into"`, returns the integrated peak area. |
x |
For |
col |
For |
lty |
For |
xlab |
For |
ylab |
For |
main |
For |
peakType |
For |
peakCol |
For |
peakBg |
For |
peakPch |
For |
param |
For |
method |
For |
FUN |
For |
fileIndex |
For |
i |
For |
j |
For |
drop |
For |
missing |
For |
simulate |
For |
See help of the individual functions.
Objects can be created with the constructor functions XChromatogram
and
XChromatograms
, respectively. Also, they can be coerced from
Chromatogram or MChromatograms()
objects using
as(object, "XChromatogram")
or as(object, "XChromatograms")
.
Besides classical subsetting with [
, specific filter operations on
MChromatograms()
and XChromatograms
objects are available. See
filterColumnsIntensityAbove()
for more details.
[
allows to subset a XChromatograms
object by row (i
) and column
(j
), with i
and j
being of type integer
. The featureDefinitions
will also be subset accordingly and the peakidx
column updated.
filterMz
filters the chromatographic peaks within an XChromatogram
or
XChromatograms
, if a column "mz"
is present in the chromPeaks
matrix.
This would be the case if the XChromatogram
was extracted from an
XCMSnExp()
object with the chromatogram()
function. All
chromatographic peaks with their m/z within the m/z range defined by mz
will be retained. Also feature definitions (if present) will be subset
accordingly. The function returns a filtered XChromatogram
or
XChromatograms
object.
filterRt
filters chromatogram(s) by the provided retention time range.
Any chromatographic peaks present with their apex within the
retention time range specified with rt
will be retained. Also feature
definitions, if present, will be filtered accordingly. The function
returns a filtered XChromatogram
or XChromatograms
object.
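The retention time filtering described above can be sketched in base R on plain vectors and a plain peak matrix. This is only an illustration of the stated rule (data points inside the range are kept, and peaks are kept if their apex lies inside the range), not the xcms implementation. The toy values and the inclusive range boundaries are assumptions made for this sketch.

```r
## Toy chromatogram and peak matrix (values are illustrative only).
rtime <- 1:10
intensity <- c(4, 8, 14, 19, 18, 12, 9, 8, 5, 2)
pks <- cbind(rt = c(4, 7), rtmin = c(2, 7), rtmax = c(9, 10))

rt_range <- c(4, 6)

## Keep data points with a retention time inside the range
## (inclusive boundaries are an assumption of this sketch).
in_range <- rtime >= rt_range[1] & rtime <= rt_range[2]
rtime_sub <- rtime[in_range]
intensity_sub <- intensity[in_range]

## Keep peaks whose apex ("rt" column) lies inside the range.
apex_in <- pks[, "rt"] >= rt_range[1] & pks[, "rt"] <= rt_range[2]
pks_sub <- pks[apex_in, , drop = FALSE]
pks_sub
```

Only the first peak is retained here: its apex (4) is inside the range, while the apex of the second peak (7) is not, even though that peak's signal is partly outside the filtered region anyway.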
See also help of Chromatogram in the MSnbase
package for general
information and data access. The methods listed here are specific for
XChromatogram
and XChromatograms
objects.
chromPeaks
, chromPeaks<-
: extract or set the matrix with the
chromatographic peak definitions. Parameter rt
allows to specify a
retention time range for which peaks should be returned along with
parameter type
that specifies how overlap is determined (see the parameter
description for details). For XChromatogram
objects the function returns
a matrix
with columns "rt"
(retention time of the peak apex),
"rtmin"
(the lower peak boundary), "rtmax"
(the upper peak boundary),
"into"
(the integrated peak signal/area of the peak), "maxo"
(the
maximum intensity of the peak) and "sn"
(the signal to noise ratio).
Note that, depending on the peak detection algorithm, the matrix may
contain additional columns.
For XChromatograms
objects the matrix
contains also columns "row"
and "column"
specifying in which chromatogram of object
the peak was
identified. Chromatographic peaks are ordered by row.
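The three values of type can be illustrated with a small base R sketch on a peak matrix. It assumes the usual xcms meaning of the options ("any": the peak overlaps the range at least partially, "within": the peak lies completely inside the range, "apex_within": the peak apex lies inside the range). The helper select_peaks and the toy values are hypothetical and not part of xcms.

```r
## Toy peak matrix with apex and peak boundaries.
pks <- cbind(rt = c(3, 5, 9), rtmin = c(2, 4, 8), rtmax = c(4, 7, 10))

## Hypothetical helper mimicking the three overlap definitions.
## Returns the row indices of the selected peaks.
select_peaks <- function(peaks, rt,
                         type = c("any", "within", "apex_within")) {
    type <- match.arg(type)
    rt <- range(rt)
    keep <- switch(
        type,
        any = peaks[, "rtmin"] <= rt[2] & peaks[, "rtmax"] >= rt[1],
        within = peaks[, "rtmin"] >= rt[1] & peaks[, "rtmax"] <= rt[2],
        apex_within = peaks[, "rt"] >= rt[1] & peaks[, "rt"] <= rt[2]
    )
    which(keep)
}

select_peaks(pks, rt = c(2.5, 8.5), type = "any")          # peaks 1, 2, 3
select_peaks(pks, rt = c(2.5, 8.5), type = "within")       # peak 2
select_peaks(pks, rt = c(2.5, 8.5), type = "apex_within")  # peaks 1, 2
```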
chromPeakData
, chromPeakData<-
: extract or set the DataFrame()
with
optional chromatographic peak annotations.
hasChromPeaks
: infer whether a XChromatogram
(or XChromatograms
)
has chromatographic peaks. For XChromatogram
: returns a logical(1)
,
for XChromatograms
: returns a matrix
, with the same dimensions as object,
containing TRUE
or FALSE
depending on whether chromatographic peaks are available in
the chromatogram at the respective position.
hasFilledChromPeaks
: whether a XChromatogram
(or a XChromatogram
in
a XChromatograms
) has filled-in chromatographic peaks.
For XChromatogram
: returns a logical(1)
,
for XChromatograms
: returns a matrix
, with the same dimensions as object,
containing TRUE
or FALSE
depending on whether filled-in chromatographic peaks are available in
the chromatogram at the respective position.
dropFilledChromPeaks
: removes filled-in chromatographic peaks. See
dropFilledChromPeaks()
help for XCMSnExp()
objects for more
information.
hasFeatures
: for XChromatograms
objects only: if correspondence
analysis has been performed and m/z-rt feature definitions are present.
Returns a logical(1)
.
dropFeatureDefinitions
: for XChromatograms
objects only: delete any
correspondence analysis results (and related process history).
featureDefinitions
: for XChromatograms
objects only. Extract the
results from the correspondence analysis (performed with
groupChromPeaks
). Returns a DataFrame
with the properties of the
defined m/z-rt features: their m/z and retention time range. Columns
peakidx
and row
contain the index of the chromatographic peaks in the
chromPeaks
matrix associated with the feature and the row in the
XChromatograms
object in which the feature was defined. Similar to the
chromPeaks
method it is possible to filter the returned feature matrix
with the mz
, rt
and ppm
parameters.
featureValues
: for XChromatograms
objects only. Extract the abundance
estimates for the individual features. With
parameter value = "index"
a matrix
of indices of the peaks in the
chromPeaks
matrix associated with the feature is returned. To extract the
integrated peak area use value = "into"
(the default in the usage shown above). The function returns a matrix
with one row per feature (in featureDefinitions
) and each column being
a sample (i.e. column of object
). For features without a peak associated
in a certain sample NA
is returned. This can be changed with the
missing
argument of the function.
filterChromPeaks
: filters chromatographic peaks in object
depending
on parameter method
and method-specific parameters passed as additional
arguments with ...
. Available methods are:
method = "keepTop"
: keep top n
(default n = 1L
) peaks in each
chromatogram ordered by column order
(defaults to order = "maxo"
).
Parameter decreasing
(default decreasing = TRUE
) can be used to
order peaks in descending (decreasing = TRUE
) or ascending (
decreasing = FALSE
) order to keep the top n
peaks with largest or
smallest values, respectively.
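The "keepTop" selection can be pictured with a short base R sketch on a peak matrix. The helper keep_top is hypothetical; it only mirrors the parameters n, order and decreasing described above and is not the xcms code. Returning the kept peaks in their original row order is a choice made for this sketch.

```r
## Toy peak matrix for a single chromatogram.
pks <- cbind(rt = c(3, 5, 9), maxo = c(12, 43, 7), into = c(30, 150, 20))

## Hypothetical helper: rank peaks by column `order` and keep the first n.
keep_top <- function(peaks, n = 1L, order = "maxo", decreasing = TRUE) {
    idx <- base::order(peaks[, order], decreasing = decreasing)
    idx <- sort(head(idx, n))   # report kept peaks in original row order
    peaks[idx, , drop = FALSE]
}

keep_top(pks)                               # peak with the largest "maxo"
keep_top(pks, n = 2L, decreasing = FALSE)   # two peaks with smallest "maxo"
```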
processHistory
: returns a list
of ProcessHistory objects representing
the individual performed processing steps. Optional parameters type
and
fileIndex
allow to further specify which processing steps to return.
transformIntensity
: transforms the intensity values of the chromatograms
with provided function FUN
. See transformIntensity()
in the MSnbase
package for details. For XChromatogram
and XChromatograms
in addition
to the intensity values also columns "into"
and "maxo"
in the object's
chromPeaks
matrix are transformed by the same function.
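As a rough base R picture of what this means for the stored values, the sketch below applies one function both to the intensity vector and to the "into" and "maxo" columns of a peak matrix. The helper transform_sketch and the toy values are assumptions for illustration; the real method operates on XChromatogram and XChromatograms objects.

```r
## Toy intensities and a one-row peak matrix.
intensity <- c(4, 8, 16, 8, 4)
pks <- cbind(rt = 3, rtmin = 1, rtmax = 5, into = 32, maxo = 16)

## Hypothetical helper: apply FUN to the signal and to the two
## intensity-derived peak columns; retention time columns are untouched.
transform_sketch <- function(intensity, peaks, FUN = identity) {
    peaks[, c("into", "maxo")] <- FUN(peaks[, c("into", "maxo")])
    list(intensity = FUN(intensity), chromPeaks = peaks)
}

res <- transform_sketch(intensity, pks, FUN = log2)
res$intensity
res$chromPeaks
```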
plot
draws the chromatogram and highlights in addition any
chromatographic peaks present in the XChromatogram
or XChromatograms
(unless peakType = "none"
was specified). To draw peaks in different
colors a vector of color definitions with length equal to
nrow(chromPeaks(x))
has to be submitted with peakCol
and/or peakBg
defining one color for each peak (in the order as peaks are in
chromPeaks(x))
. For base peak chromatograms or total ion chromatograms
it might be better to set peakType = "none"
to avoid generating busy
plots.
plotChromPeakDensity
: visualize peak density-based correspondence
analysis results. See section Correspondence analysis for more details.
See findChromPeaks-Chromatogram-CentWaveParam for information.
After chromatographic peak detection it is also possible to refine
identified chromatographic peaks with the refineChromPeaks
method (e.g. to
reduce peak detection artifacts). Currently, only peak refinement using the
merge neighboring peaks method is available (see
MergeNeighboringPeaksParam()
for a detailed description of the approach).
Identified chromatographic peaks in an XChromatograms
object can be grouped
into features with the groupChromPeaks
function. Currently, such a
correspondence analysis can be performed with the peak density method
(see groupChromPeaks for more details) specifying the algorithm settings
with a PeakDensityParam()
object. A correspondence analysis is performed
separately for each row in the XChromatograms
object grouping
chromatographic peaks across samples (columns).
The analysis results are stored in the returned XChromatograms
object
and can be accessed with the featureDefinitions
method which returns a
DataFrame
with one row for each feature. Column "row"
specifies in
which row of the XChromatograms
object the feature was identified.
The plotChromPeakDensity
method can be used to visualize peak density
correspondence results, or to simulate a peak density correspondence
analysis on chromatographic data. The resulting plot consists of two panels,
the upper panel showing the chromatographic data as well as the identified
chromatographic peaks, the lower panel the distribution of peaks (the peak
density) along the retention time axis. This plot shows each peak as a point
with its retention time on the x-axis and the sample in which it
was found on the y-axis. The distribution of peaks along the retention time
axis is visualized with a density estimate. Grouped chromatographic peaks
are indicated with grey shaded rectangles. Parameter simulate
allows to
define whether the correspondence analysis should be simulated (
simulate=TRUE
, based on the available data and the provided
PeakDensityParam()
parameter class) or not (simulate=FALSE
). For the
latter it is assumed that a correspondence analysis has been performed with
the peak density method on the object
.
See examples below.
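The effect of the density bandwidth on peak grouping can be explored outside xcms with the density function from the stats package. The sketch below uses made-up peak retention times and is not the xcms grouping code; it only shows how a larger bandwidth smooths the estimate so that peaks further apart fall under a single density maximum.

```r
## Apex retention times of peaks from several samples (toy values):
## one cluster around 102 and a second one around 201.
peak_rt <- c(100, 102, 104, 200, 203)

## Kernel density along the retention time axis for two bandwidths.
d_narrow <- density(peak_rt, bw = 5)
d_wide <- density(peak_rt, bw = 60)

## Position of the highest density value for each setting.
apex_narrow <- d_narrow$x[which.max(d_narrow$y)]
apex_wide <- d_wide$x[which.max(d_wide$y)]

## With the narrow bandwidth the maximum sits on the first cluster;
## with the wide bandwidth both clusters merge into one broad maximum
## located between them.
c(narrow = apex_narrow, wide = apex_wide)
```

Plotting both estimates, for example with plot(d_narrow) followed by lines(d_wide), makes the difference visible.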
Abundance estimates for each feature can be extracted with the
featureValues
function using parameter value = "into"
to extract the
integrated peak area for each feature. The result is a matrix
, columns
being samples and rows features.
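How such a feature-by-sample matrix relates to the peak matrix and the feature definitions can be sketched in base R. The helper feature_matrix, the toy peak matrix and the peakidx list are hypothetical. When a feature has several peaks in one sample the sketch reports the largest value, which is loosely analogous to method = "maxint" when value and intensity refer to the same column.

```r
## Toy peak matrix: "column" is the sample in which the peak was found.
pks <- cbind(rt = c(50, 51, 52, 120, 121),
             into = c(30, 45, 10, 80, 60),
             column = c(1, 2, 2, 1, 3))

## Toy feature definitions: indices of the peaks grouped into each feature.
peakidx <- list(FT1 = c(1, 2, 3), FT2 = c(4, 5))

## Hypothetical helper building a feature x sample matrix.
feature_matrix <- function(peaks, peakidx, n_samples, value = "into",
                           missing = NA) {
    res <- matrix(missing, nrow = length(peakidx), ncol = n_samples,
                  dimnames = list(names(peakidx), NULL))
    for (i in seq_along(peakidx)) {
        ft <- peaks[peakidx[[i]], , drop = FALSE]
        ## One value per sample; the largest value wins if a feature
        ## has more than one peak in the same sample.
        vals <- tapply(ft[, value], ft[, "column"], max)
        res[i, as.integer(names(vals))] <- vals
    }
    res
}

fvals <- feature_matrix(pks, peakidx, n_samples = 3)
fvals
```

Cells without an assigned peak keep the value passed as missing, mirroring the role of the missing argument described above.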
Highlighting the peak area(s) in an XChromatogram
or XChromatograms
object (plot
with peakType = "polygon"
) draws a polygon representing
the displayed chromatogram from the peak's minimal retention time to the
maximal retention time. If the XChromatograms
was extracted from an
XCMSnExp()
object with the chromatogram()
function this might not
represent the actual identified peak area if the m/z range that was
used to extract the chromatogram was larger than the peak's m/z.
Johannes Rainer
findChromPeaks-centWave for peak
detection on MChromatograms()
objects.
## ---- Creation of XChromatograms ----
##
## Create a XChromatograms from Chromatogram objects
library(MSnbase)
dta <- list(Chromatogram(rtime = 1:7, c(3, 4, 6, 12, 8, 3, 2)),
            Chromatogram(1:10, c(4, 6, 3, 4, 7, 13, 43, 34, 23, 9)))

## Create an XChromatograms without peak data
xchrs <- XChromatograms(dta)

## Create an XChromatograms with peaks data
pks <- list(matrix(c(4, 2, 5, 30, 12, NA), nrow = 1,
                   dimnames = list(NULL, c("rt", "rtmin", "rtmax",
                                           "into", "maxo", "sn"))),
            NULL)
xchrs <- XChromatograms(dta, chromPeaks = pks)

## Create an XChromatograms from XChromatogram objects
dta <- lapply(dta, as, "XChromatogram")
chromPeaks(dta[[1]]) <- pks[[1]]

xchrs <- XChromatograms(dta, nrow = 1)

hasChromPeaks(xchrs)

## Loading a test data set with identified chromatographic peaks
faahko_sub <- loadXcmsData("faahko_sub2")

## Subset the dataset to the first and third file.
xod_sub <- filterFile(faahko_sub, file = c(1, 3))

od <- as(xod_sub, "MsExperiment")

## Extract chromatograms for a m/z - retention time slice
chrs <- chromatogram(od, mz = 344, rt = c(2500, 3500))
chrs

## --------------------------------------------------- ##
##       Chromatographic peak detection                ##
## --------------------------------------------------- ##
## Perform peak detection using CentWave
xchrs <- findChromPeaks(chrs, param = CentWaveParam())
xchrs

## Do we have chromatographic peaks?
hasChromPeaks(xchrs)

## Process history
processHistory(xchrs)

## The chromatographic peaks, columns "row" and "column" provide information
## in which sample the peak was identified.
chromPeaks(xchrs)

## Specifically extract chromatographic peaks for one sample/chromatogram
chromPeaks(xchrs[1, 2])

## Plot the results
plot(xchrs)

## Plot the results using a different color for each sample
sample_colors <- c("#ff000040", "#00ff0040", "#0000ff40")
cols <- sample_colors[chromPeaks(xchrs)[, "column"]]
plot(xchrs, col = sample_colors, peakBg = cols)

## Indicate the peaks with a rectangle
plot(xchrs, col = sample_colors, peakCol = cols, peakType = "rectangle",
     peakBg = NA)

## --------------------------------------------------- ##
##       Correspondence analysis                       ##
## --------------------------------------------------- ##
## Group chromatographic peaks across samples
prm <- PeakDensityParam(sampleGroup = rep(1, 2))
res <- groupChromPeaks(xchrs, param = prm)

hasFeatures(res)
featureDefinitions(res)

## Plot the correspondence results. Use simulate = FALSE to show the
## actual results. Grouped chromatographic peaks are indicated with
## grey shaded rectangles.
plotChromPeakDensity(res, simulate = FALSE)

## Simulate a correspondence analysis based on different settings. Larger
## bw will increase the smoothing of the density estimate hence grouping
## chromatographic peaks that are more apart on the retention time axis.
prm <- PeakDensityParam(sampleGroup = rep(1, 3), bw = 60)
plotChromPeakDensity(res, param = prm)

## Delete the identified feature definitions
res <- dropFeatureDefinitions(res)
hasFeatures(res)

library(MSnbase)

## Create a XChromatogram object
pks <- matrix(nrow = 1, ncol = 6)
colnames(pks) <- c("rt", "rtmin", "rtmax", "into", "maxo", "sn")
pks[, "rtmin"] <- 2
pks[, "rtmax"] <- 9
pks[, "rt"] <- 4
pks[, "maxo"] <- 19
pks[, "into"] <- 93

xchr <- XChromatogram(rtime = 1:10,
                      intensity = c(4, 8, 14, 19, 18, 12, 9, 8, 5, 2),
                      chromPeaks = pks)
xchr

## Add arbitrary peak annotations
df <- DataFrame(peak_id = c("a"))
xchr <- XChromatogram(rtime = 1:10,
                      intensity = c(4, 8, 14, 19, 18, 12, 9, 8, 5, 2),
                      chromPeaks = pks, chromPeakData = df)
xchr
chromPeakData(xchr)

## Extract the chromatographic peaks
chromPeaks(xchr)

## Plotting of a single XChromatogram object
## o Don't highlight chromatographic peaks
plot(xchr, peakType = "none")

## o Indicate peaks with a polygon
plot(xchr)

## Add a second peak to the data.
pks <- rbind(chromPeaks(xchr), c(7, 7, 10, NA, 15, NA))
chromPeaks(xchr) <- pks

## Plot the peaks in different colors
plot(xchr, peakCol = c("#ff000080", "#0000ff80"),
     peakBg = c("#ff000020", "#0000ff20"))

## Indicate the peaks as rectangles
plot(xchr, peakCol = c("#ff000060", "#0000ff60"), peakBg = NA,
     peakType = "rectangle")

## Filter the XChromatogram by retention time
xchr_sub <- filterRt(xchr, rt = c(4, 6))
xchr_sub
plot(xchr_sub)
These functions are provided for compatibility with older versions of ‘xcms’ only, and will be defunct at the next release.
The following functions/methods are deprecated.
profBin
, profBinM
, profBinLin
,
profBinLinM
, profBinLinBase
, profBinLinBaseM
have been deprecated and binYonX
in combination
with imputeLinInterpol
should be used instead.
extractMsData
: replaced by as(x, "data.frame")
.
plotMsData
: replaced by plot(x, type = "XIC")
.
This class is used to store and plot parallel extracted ion
chromatograms from multiple sample files. It integrates with the
xcmsSet
class to display peak area integrated during peak
identification or fill-in.
Objects can be created with the getEIC
method of
the xcmsSet
class. Objects can also be created by calls
of the form new("xcmsEIC", ...)
.
eic
:list containing named entries for every sample. Each entry is a list of two-column EIC matrices with retention time and intensity
mzrange
:two column matrix containing starting and ending m/z for each EIC
rtrange
:two column matrix containing starting and ending time for each EIC
rt
:either "raw"
or "corrected"
to specify retention
times contained in the object
groupnames
:group names from xcmsSet
object used to generate EICs
signature(object = "xcmsEIC")
: get groupnames
slot
signature(object = "xcmsEIC")
: get mzrange
slot
signature(x = "xcmsEIC")
: plot the extracted ion
chromatograms
signature(object = "xcmsEIC")
: get rtrange
slot
signature(object = "xcmsEIC")
: get sample names
No notes yet.
Colin A. Smith, [email protected]
Data sources which read data from a file should inherit from this
class. The xcms
package provides classes to read from
netCDF
, mzData
, mzXML
, and mzML
files
using xcmsFileSource
.
This class should be considered virtual and will not work if passed to
loadRaw-methods
. The reason it is not explicitly
virtual is that there does not appear to be a way for a class to be
both virtual and have a data part (which lets functions treat objects
as if they were character strings).
This class validates that a file exists at the path given.
xcmsFileSource
objects should not be instantiated directly.
Instead, create subclasses and instantiate those.
.Data
:Object of class "character"
. File path
of a file from which to read raw data as the object's data part
Class "character"
, from data part.
Class "xcmsSource"
, directly.
xcmsSource
signature(object = "character")
: Create an
xcmsFileSource
object referencing the given file name.
Daniel Hackney [email protected]
EXPERIMENTAL FEATURE
xcmsFragments is an object similar to xcmsSet, which holds peaks picked (or collected) from one or several xcmsRaw objects.
There are still discussions going on about the exact API for MS$^n$ data, so this is likely to change in the future. The code is not yet pipeline-ified.
xcmsFragments(xs, ...)
xs |
A |
... |
further arguments to the |
After running collect(xFragments, xSet), the peak table of the xcmsFragments object includes the MS1 peaks from all experiments stored in the xcmsSet object. It also contains the relevant MSn peaks from the xcmsRaw objects, which are created temporarily from the file paths in the xcmsSet.
An xcmsFragments
object.
Joachim Kutzera, Steffen Neumann, [email protected]
This class is similar to xcmsSet
because it stores peaks
from a number of individual files. However, xcmsFragments keeps
Tandem MS and e.g. Ion Trap or Orbitrap MS$^n$ peaks, including the
parent ion relationships.
Objects can be created with the xcmsFragments
constructor and filled with peaks using the collect method.
peaks
:matrix with columns peakID (MS1 parent in corresponding xcmsSet),
MSnParentPeakID (parent peak within this xcmsFragments), msLevel
(e.g. 2 for Tandem MS), rt (retention time in case of LC data), mz
(fragment mass-to-charge), intensity (peak intensity extracted
from the original xcmsSet
), sample (the index of the rawData-file).
MS2spec
:This is a list of matrices. Each matrix in the list is a single spectrum collected by collect
. The column IDs are mz, intensity, and full width at half maximum (fwhm). The fwhm column is only relevant if the spectra came from profile data.
specinfo
:This is a matrix with reference data for the spectra in MS2spec. The column IDs are preMZ, AccMZ, rtmin, rtmax, ref, CollisionEnergy. preMZ is the precursor mass from the MS1 scan, as given in the XML file. With some instruments this mass is only given as a nominal mass, so AccMZ provides a weighted average mass from the MS1 scan of the collected spectra. The retention time range is given by rtmin and rtmax. The ref column is a pointer to the corresponding spectrum matrix in MS2spec. The CollisionEnergy column is the collision energy for the spectrum.
signature(object = "xcmsFragments")
: takes an xcmsSet object, collects the MS1 peaks from it and the MSn peaks from the corresponding xcmsRaw files.
signature(object = "xcmsFragments")
: prints a (text based) pseudo-tree of the peaktable to display the dependencies of the peaks among each other.
signature(object = "xcmsFragments")
: print a human-readable
description of this object to the console.
S. Neumann, J. Kutzera
The XCMSnExp
object is a container for the results of a G/LC-MS
data preprocessing that comprises chromatographic peak detection, alignment
and correspondence. These results can be accessed with the chromPeaks
,
adjustedRtime
and featureDefinitions
functions; see below
(after the Usage, Arguments, Value and Slots sections) for more details.
Along with the results, the object contains the processing history that
allows to track each processing step along with the used settings. This
can be extracted with the processHistory
method.
XCMSnExp
objects, by directly extending the
OnDiskMSnExp
object from the MSnbase
package, inherit
all of its functionality and thus allow easy access to the full raw
data at any stage of an analysis.
To support interaction with packages requiring the old objects,
XCMSnExp
objects can be coerced into xcmsSet
objects using the as
method (see examples below). All
preprocessing results will be passed along to the resulting
xcmsSet
object.
General functions for XCMSnExp
objects are (see further below for
specific functions to handle chromatographic peak data, alignment and
correspondence results):
processHistoryTypes
returns the available types of
process histories. These can be passed with argument type
to the
processHistory
method to extract specific process step(s).
hasFilledChromPeaks
: whether filled-in peaks are present or not.
profMat
: creates a profile matrix, which
is an n x m matrix, n (rows) representing equally spaced m/z values (bins)
and m (columns) the retention time of the corresponding scans. Each cell
contains the maximum intensity measured for the specific scan and m/z
values. See profMat
for more details and description of
the various binning methods.
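To make the layout of the profile matrix concrete, the following base R sketch bins a few made-up (m/z, intensity) pairs from three scans into equally spaced m/z bins and keeps the maximum intensity per bin and scan. The data, the helper name toy_profile_matrix and the simple binning rule are illustrative assumptions; this is not the code used by profMat.

```r
## Toy data: three scans, each a set of (mz, intensity) pairs (made-up values).
scans <- list(
    data.frame(mz = c(100.02, 100.07, 100.31), intensity = c(10, 40, 5)),
    data.frame(mz = c(100.05, 100.22), intensity = c(20, 15)),
    data.frame(mz = c(100.33, 100.38), intensity = c(7, 30))
)

## Hypothetical helper, not part of xcms: bin m/z values into equally
## spaced bins of width `step` and keep the maximum intensity per bin.
toy_profile_matrix <- function(scans, mz_min, mz_max, step = 0.1) {
    breaks <- seq(mz_min, mz_max, by = step)
    n_bins <- length(breaks) - 1L
    ## rows = m/z bins, columns = scans
    res <- matrix(0, nrow = n_bins, ncol = length(scans))
    for (j in seq_along(scans)) {
        bin <- findInterval(scans[[j]]$mz, breaks, rightmost.closed = TRUE)
        for (i in seq_along(bin)) {
            b <- bin[i]
            if (b >= 1L && b <= n_bins)
                res[b, j] <- max(res[b, j], scans[[j]]$intensity[i])
        }
    }
    res
}

pm <- toy_profile_matrix(scans, mz_min = 100, mz_max = 100.4, step = 0.1)
dim(pm)
pm
```

In this sketch, bins without any signal are left at 0; the binning methods of the package may handle empty bins differently.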
hasAdjustedRtime
: whether the object provides adjusted
retention times.
hasFeatures
: whether the object contains correspondence
results (i.e. features).
hasChromPeaks
: whether the object contains peak
detection results.
hasFilledChromPeaks
: whether the object contains any filled-in
chromatographic peaks.
adjustedRtime
,adjustedRtime<-
:
extract/set adjusted retention times. adjustedRtime<-
should not
be called manually; it is called internally by the
adjustRtime
methods. For XCMSnExp
objects,
adjustedRtime<-
also applies the retention time adjustments to
any chromatographic peaks that are present. The bySample
parameter
specifies whether the adjusted retention times should be grouped
by sample (file).
featureDefinitions
, featureDefinitions<-
: extract
or set the correspondence results, i.e. the mz-rt features (peak groups).
Similar to chromPeaks
, it is possible to extract features for
specified m/z and/or rt ranges. The function also supports the parameter
type
, which specifies which features are returned if either rt
or mz
is specified. For details see the help of
chromPeaks
.
See also featureSummary
for a function to calculate simple
feature summaries.
chromPeaks
, chromPeaks<-
: extract or set
the matrix containing the information on identified chromatographic
peaks. Rownames of the matrix represent unique IDs of the respective peaks
within the experiment.
Parameter bySample
specifies whether peaks should
be returned ungrouped (default bySample = FALSE
) or grouped by
sample (bySample = TRUE
). The chromPeaks<-
method for
XCMSnExp
objects also removes all correspondence (peak grouping)
and retention time correction (alignment) results. The optional
arguments rt
, mz
, ppm
and type
restrict the result to
chromatographic peaks overlapping the defined retention time and/or
m/z ranges. Argument type
defines how overlap is
determined: for type == "any"
(the default), all peaks that are even
partially overlapping the region are returned (i.e. for which either
"mzmin"
or "mzmax"
of the chromPeaks
or
featureDefinitions
matrix are within the provided m/z range), for
type == "within"
the full peak has to be within the region (i.e.
both "mzmin"
and "mzmax"
have to be within the m/z range) and
for type == "apex_within"
the peak's apex position (highest signal
of the peak) has to be within the region (i.e. the peak's or feature's m/z
has to be within the m/z range).
See description of the return value for details on the returned matrix.
Users usually don't have to use the chromPeaks<-
method directly
as detected chromatographic peaks are added to the object by the
findChromPeaks
method. Also, chromPeaks<-
will replace
any existing chromPeakData
.
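The three values of type can be illustrated with a small base R sketch on a made-up peak table. The helper select_by_mz and the helper in_range are hypothetical and follow the wording of the description above literally; they are not taken from the package source, which may determine overlap differently.

```r
## Made-up peak table with the m/z columns described above.
pks <- cbind(
    mz    = c(100.0, 105.0, 110.0, 103.0, 106.3),
    mzmin = c( 99.5, 104.0, 109.8, 101.0, 105.5),
    mzmax = c(100.5, 105.5, 110.2, 104.5, 107.0)
)
rownames(pks) <- paste0("CP", 1:5)

## Hypothetical helper: is x inside the closed range r?
in_range <- function(x, r) x >= r[1] & x <= r[2]

## Hypothetical helper, not part of xcms: return the IDs of the peaks
## selected for an m/z range under each overlap type.
select_by_mz <- function(pks, mz, type = c("any", "within", "apex_within")) {
    type <- match.arg(type)
    keep <- switch(
        type,
        ## either "mzmin" or "mzmax" lies within the range
        any = in_range(pks[, "mzmin"], mz) | in_range(pks[, "mzmax"], mz),
        ## both "mzmin" and "mzmax" lie within the range
        within = in_range(pks[, "mzmin"], mz) & in_range(pks[, "mzmax"], mz),
        ## the apex m/z lies within the range
        apex_within = in_range(pks[, "mz"], mz)
    )
    rownames(pks)[keep]
}

mz_range <- c(102, 106)
sel_any <- select_by_mz(pks, mz_range, "any")
sel_within <- select_by_mz(pks, mz_range, "within")
sel_apex <- select_by_mz(pks, mz_range, "apex_within")
```

With this toy table, "within" is the strictest selection and "any" the most permissive one.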
chromPeakData
and chromPeakData<-
allow to get or set arbitrary
chromatographic peak annotations. These are returned or ar returned as a
DataFrame
. Note that the number of rows and the rownames of the
DataFrame
have to match those of chromPeaks
.
rtime
: extracts the retention time for each
scan. The bySample
parameter returns the values grouped
by sample/file and adjusted
specifies whether adjusted or raw retention
times should be returned. By default the method returns adjusted
retention times, if they are available (i.e. if retention times were
adjusted using the adjustRtime
method).
mz
: extracts the mz values from each scan of
all files within an XCMSnExp
object. These values are extracted
from the original data files and any processing steps are applied
on the fly. Using the bySample
parameter it is possible to
switch from the default grouping of mz values by spectrum/scan to a
grouping by sample/file.
intensity
: extracts the intensity values from
each scan of all files within an XCMSnExp
object. These values are
extracted from the original data files and any processing steps are
applied on the fly. Using the bySample
parameter it is
possible to switch from the default grouping of intensity values by
spectrum/scan to a grouping by sample/file.
spectra
: extracts the
Spectrum
objects containing all data from
object
. The values are extracted from the original data files and
any processing steps are applied on the fly. By setting
bySample = TRUE
, the spectra are returned grouped by sample/file.
If the XCMSnExp
object contains adjusted retention times, these
are returned by default in the Spectrum
objects (this can be
overridden by setting adjusted = FALSE
).
processHistory
: returns a list
of
ProcessHistory
objects (or objects inheriting from this
base class) representing the individual processing steps that have been
performed, along with their settings where available (Param
parameter
class). Optional arguments fileIndex
, type
and
msLevel
restrict the result to process steps of a certain type or
performed on a certain file or MS level.
dropChromPeaks
: drops any identified chromatographic
peaks and returns the object without that information. Note that for
XCMSnExp
objects the method by default also drops the results from a
correspondence (peak grouping) analysis. Adjusted retention times are
removed if the alignment has been performed after peak detection.
This can be overridden with keepAdjustedRtime = TRUE
.
dropFeatureDefinitions
: drops the results from a
correspondence (peak grouping) analysis, i.e. the definition of the mz-rt
features and returns the object without that information. Note that for
XCMSnExp
objects the method will by default also drop retention
time adjustment results if these were performed after the last peak
grouping (i.e. if they are based on the peak grouping results that are
being removed). All related process history steps are
removed as well, along with any filled-in peaks
(by fillChromPeaks
). The parameter keepAdjustedRtime
can be used to avoid removal of adjusted retention times.
dropAdjustedRtime
: drops any retention time
adjustment information and returns the object without adjusted retention
time. For XCMSnExp
objects, this also reverts the retention times
reported for the chromatographic peaks in the peak matrix to the
original, raw, ones (after chromatographic peak detection). Note that
for XCMSnExp
objects the method also drops all peak grouping
results if these were performed after the retention time
adjustment. All related process history steps are removed too.
findChromPeaks
performs chromatographic peak detection
on the provided XCMSnExp
objects. For more details see the method
for XCMSnExp
.
Note that by default (with parameter add = FALSE
) previous peak
detection results are removed. Use add = TRUE
to perform a second
round of peak detection and add the newly identified peaks to the previous
peak detection results. Correspondence results (features) are always removed
prior to peak detection. Previous alignment (retention
time adjustment) results are kept, i.e. chromatographic peak detection
is performed using adjusted retention times if the data was first
aligned using e.g. obiwarp (adjustRtime
).
dropFilledChromPeaks
: drops any filled-in chromatographic
peaks (filled in by the fillChromPeaks
method) and all
related process history steps.
spectrapply
applies the provided function to each
Spectrum
in the object and returns its
results. If no function is specified the function simply returns the
list
of Spectrum
objects.
XCMSnExp
objects can be combined with the c
function. This
combines identified chromatographic peaks and the objects' pheno data but
discards alignment results or feature definitions.
plot
plots the spectrum data (see plot
for
MSnExp
objects in the MSnbase
package for more details).
For type = "XIC"
, identified chromatographic peaks will be indicated
as rectangles with border color peakCol
.
processHistoryTypes()

## S4 method for signature 'XCMSnExp'
hasFilledChromPeaks(object)

## S4 method for signature 'OnDiskMSnExp'
profMat(
  object,
  method = "bin",
  step = 0.1,
  baselevel = NULL,
  basespace = NULL,
  mzrange. = NULL,
  fileIndex,
  ...
)

## S4 method for signature 'XCMSnExp'
show(object)

## S4 method for signature 'XCMSnExp'
hasAdjustedRtime(object)

## S4 method for signature 'XCMSnExp'
hasFeatures(object, msLevel = integer())

## S4 method for signature 'XCMSnExp'
hasChromPeaks(object, msLevel = integer())

## S4 method for signature 'XCMSnExp'
hasFilledChromPeaks(object)

## S4 method for signature 'XCMSnExp'
adjustedRtime(object, bySample = FALSE)

## S4 replacement method for signature 'XCMSnExp'
adjustedRtime(object) <- value

## S4 method for signature 'XCMSnExp'
featureDefinitions(
  object,
  mz = numeric(),
  rt = numeric(),
  ppm = 0,
  type = c("any", "within", "apex_within"),
  msLevel = integer()
)

## S4 replacement method for signature 'XCMSnExp'
featureDefinitions(object) <- value

## S4 method for signature 'XCMSnExp'
chromPeaks(
  object,
  bySample = FALSE,
  rt = numeric(),
  mz = numeric(),
  ppm = 0,
  msLevel = integer(),
  type = c("any", "within", "apex_within"),
  isFilledColumn = FALSE
)

## S4 replacement method for signature 'XCMSnExp'
chromPeaks(object) <- value

## S4 method for signature 'XCMSnExp'
rtime(object, bySample = FALSE, adjusted = hasAdjustedRtime(object))

## S4 method for signature 'XCMSnExp'
mz(object, bySample = FALSE, BPPARAM = bpparam())

## S4 method for signature 'XCMSnExp'
intensity(object, bySample = FALSE, BPPARAM = bpparam())

## S4 method for signature 'XCMSnExp'
spectra(
  object,
  bySample = FALSE,
  adjusted = hasAdjustedRtime(object),
  BPPARAM = bpparam()
)

## S4 method for signature 'XCMSnExp'
processHistory(object, fileIndex, type, msLevel)

## S4 method for signature 'XCMSnExp'
dropChromPeaks(object, keepAdjustedRtime = FALSE)

## S4 method for signature 'XCMSnExp'
dropFeatureDefinitions(object, keepAdjustedRtime = FALSE, dropLastN = -1)

## S4 method for signature 'XCMSnExp'
dropAdjustedRtime(object)

## S4 method for signature 'XCMSnExp'
profMat(
  object,
  method = "bin",
  step = 0.1,
  baselevel = NULL,
  basespace = NULL,
  mzrange. = NULL,
  fileIndex,
  ...
)

## S4 method for signature 'XCMSnExp,Param'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  add = FALSE
)

## S4 method for signature 'XCMSnExp'
dropFilledChromPeaks(object)

## S4 method for signature 'XCMSnExp'
spectrapply(object, FUN = NULL, BPPARAM = bpparam(), ...)

## S3 method for class 'XCMSnExp'
c(...)

## S4 method for signature 'XCMSnExp'
chromPeakData(object, ...)

## S4 replacement method for signature 'XCMSnExp'
chromPeakData(object) <- value

## S4 method for signature 'XCMSnExp,missing'
plot(x, y, type = c("spectra", "XIC"), peakCol = "#ff000060", ...)
object | For |
method | |
step | |
baselevel | |
basespace | |
mzrange. | Optional |
fileIndex | For |
... | Additional parameters. |
msLevel | |
bySample | logical(1) specifying whether results should be grouped by sample. |
value | For For For |
mz | optional |
rt | optional |
ppm | optional |
type | For |
isFilledColumn | |
adjusted | logical(1) whether adjusted or raw (i.e. the original retention times reported in the files) should be returned. |
BPPARAM | Parameter class for parallel processing. See |
keepAdjustedRtime | For |
dropLastN | For |
param | A |
return.type | Character specifying what type of object the method should return. Can be either |
add | For |
FUN | For |
x | For |
y | For |
peakCol | For |
For profMat
: a list
with the profile matrix
matrix
(or matrices if fileIndex
was not specified or if
length(fileIndex) > 1
). See profile-matrix
for
general help and information about the profile matrix.
For adjustedRtime
: if bySample = FALSE
a numeric
vector with the adjusted retention time for each spectrum of all files/samples
within the object. If bySample = TRUE
a list
(length equal
to the number of samples) with adjusted retention times grouped by
sample. Returns NULL
if no adjusted retention times are present.
For featureDefinitions
: a DataFrame
with peak grouping
information, each row corresponding to one mz-rt feature (grouped peaks
within and across samples) and columns "mzmed"
(median mz value),
"mzmin"
(minimal mz value), "mzmax"
(maximum mz value),
"rtmed"
(median retention time), "rtmin"
(minimal retention
time), "rtmax"
(maximal retention time) and "peakidx"
.
Column "peakidx"
contains a list
with indices of
chromatographic peaks (rows) in the matrix returned by the
chromPeaks
method that belong to that feature group. The method
returns NULL
if no feature definitions are present.
featureDefinitions
supports also parameters mz
, rt
,
ppm
and type
to return only features within certain ranges (see
description of chromPeaks
for details).
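The relation between the "peakidx" column and the rows of the peak matrix can be sketched in base R. The peak matrix and feature table below are made up, and a plain data.frame with a list column stands in for the DataFrame returned by featureDefinitions; both are assumptions for illustration only.

```r
## Made-up peak matrix (rows = chromatographic peaks).
pks <- cbind(mz = c(100.1, 100.1, 250.3, 250.3, 250.4),
             rt = c(30, 31, 120, 121, 119),
             into = c(1e4, 2e4, 5e3, 6e3, 7e3),
             sample = c(1, 2, 1, 2, 3))
rownames(pks) <- paste0("CP", 1:5)

## Made-up feature table; each "peakidx" element holds row indices into pks.
fts <- data.frame(mzmed = c(100.1, 250.3), rtmed = c(30.5, 120))
fts$peakidx <- list(c(1L, 2L), c(3L, 4L, 5L))
rownames(fts) <- c("FT1", "FT2")

## Peaks assigned to the second feature.
ft2_peaks <- pks[fts$peakidx[[2]], ]
ft2_peaks

## Number of peaks and summed "into" per feature.
n_peaks <- lengths(fts$peakidx)
sum_into <- vapply(fts$peakidx, function(i) sum(pks[i, "into"]), numeric(1))
```

Because "peakidx" stores row indices, it is only valid for the peak matrix it was computed on; subsetting or reordering the peaks requires updating the indices.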
For chromPeaks
: if bySample = FALSE
a matrix
(each row
being a chromatographic peak, rownames representing unique IDs of the peaks)
with at least the following columns:
"mz"
(intensity-weighted mean of mz values of the peak across
scans/retention times),
"mzmin"
(minimal mz value),
"mzmax"
(maximal mz value),
"rt"
(retention time of the peak apex),
"rtmin"
(minimal retention time),
"rtmax"
(maximal retention time),
"into"
(integrated, original, intensity of the peak),
"maxo"
(maximum intensity of the peak) and
"sample"
(sample index in which the peak was identified).
Depending on the peak detection algorithm used and its
verboseColumns
parameter, additional columns might be
returned. If parameter isFilledColumn
was set to TRUE
a column
named "is_filled"
is also returned.
For bySample = TRUE
the chromatographic peaks are
returned as a list
of matrices, each containing the
chromatographic peaks of a specific sample. For samples in which no
peaks were detected a matrix with 0 rows is returned.
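The grouping by sample, including the zero-row matrix for samples without peaks, can be sketched in base R as follows. The peak matrix and the helper split_by_sample are hypothetical and only illustrate the shape of the result described above.

```r
## Made-up peak matrix: sample 2 has no detected peaks.
pks <- cbind(mz = c(100.1, 250.3, 300.2),
             rt = c(30, 120, 45),
             sample = c(1, 1, 3))
rownames(pks) <- paste0("CP", 1:3)

## Hypothetical helper, not part of xcms: one matrix per sample index,
## keeping the matrix structure (drop = FALSE) even for 0 or 1 rows.
split_by_sample <- function(pks, n_samples) {
    lapply(seq_len(n_samples), function(i)
        pks[pks[, "sample"] == i, , drop = FALSE])
}

by_sample <- split_by_sample(pks, n_samples = 3L)
peaks_per_sample <- vapply(by_sample, nrow, integer(1))
peaks_per_sample
```

Using drop = FALSE keeps the column names for every element, so downstream code can index columns such as "mz" without checking for empty samples first.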
For rtime
: if bySample = FALSE
a numeric vector with
the retention times of each scan, if bySample = TRUE
a
list
of numeric vectors with the retention times per sample.
For mz
: if bySample = FALSE
a list
with the mz
values (numeric vectors) of each scan. If bySample = TRUE
a
list
with the mz values per sample.
For intensity
: if bySample = FALSE
a list
with
the intensity values (numeric vectors) of each scan. If
bySample = TRUE
a list
with the intensity values per
sample.
For spectra
: if bySample = FALSE
a list
with
Spectrum
objects. If bySample = TRUE
the
result is grouped by sample, i.e. as a list
of lists
, each
element in the outer list
being the list
of spectra
of the specific file.
For processHistory
: a list
of
ProcessHistory
objects providing the details of the
individual data processing steps that have been performed.
.processHistory
list
with XProcessHistory
objects
tracking all individual analysis steps that have been performed.
msFeatureData
MsFeatureData
class extending environment
and containing the results from a chromatographic peak detection (element
"chromPeaks"
), peak grouping (element "featureDefinitions"
)
and retention time correction (element "adjustedRtime"
) steps.
This object should not be manipulated directly.
Chromatographic peak data is added to an XCMSnExp
object by the
findChromPeaks
function. Functions to access chromatographic
peak data are:
hasChromPeaks
whether chromatographic peak data is available,
see below for help of the function.
chromPeaks
access chromatographic peaks (see below for help).
dropChromPeaks
remove chromatographic peaks (see below for
help).
dropFilledChromPeaks
remove filled-in peaks (see below for
help).
fillChromPeaks
fill-in missing peaks (see respective
help page).
plotChromPeaks
plot identified peaks for a file (see
respective help page).
plotChromPeakImage
plot distribution of peaks along the
retention time axis (see respective help page).
highlightChromPeaks
add chromatographic peaks to an
existing plot of a Chromatogram
(see respective help page).
Adjusted retention times are stored in an XCMSnExp
object besides the
original, raw, retention times, which makes it possible to switch between raw and adjusted
times. It is also possible to replace the raw retention times with the
adjusted ones using applyAdjustedRtime
. The adjusted
retention times are added to an XCMSnExp
by the
adjustRtime
function. All functions related to the access of
adjusted retention times are:
hasAdjustedRtime
whether adjusted retention times are available
(see below for help).
dropAdjustedRtime
remove adjusted retention times (see below
for help).
applyAdjustedRtime
replace the raw retention times with
the adjusted ones (see respective help page).
plotAdjustedRtime
plot differences between adjusted and
raw retention times (see respective help page).
The correspondence analysis (groupChromPeaks
) adds the feature
definitions to an XCMSnExp
object. All functions related to these are
listed below:
hasFeatures
whether correspondence results are available (see
below for help).
featureDefinitions
access the definitions of the features (see
below for help).
dropFeatureDefinitions
remove correspondence results (see below
for help).
featureValues
access values for features (see respective
help page).
featureSummary
perform a simple summary of the defined
features (see respective help page).
overlappingFeatures
identify features that are
overlapping or close in the m/z - rt space (see respective help page).
quantify
extract feature intensities and put them, along
with feature definitions and phenodata information, into a
SummarizedExperiment
. See help page for details.
The "chromPeaks"
element in the msFeatureData
slot is
equivalent to the @peaks
slot of the xcmsSet
object, the
"featureDefinitions"
contains information from the @groups
and @groupidx
slots from an xcmsSet
object.
Johannes Rainer
xcmsSet
for the old implementation.
OnDiskMSnExp
, MSnExp
and pSet
for a complete list of inherited methods.
findChromPeaks
for available peak detection methods
returning an XCMSnExp
object as a result.
groupChromPeaks
for available peak grouping
methods and featureDefinitions
for the method to extract
the feature definitions representing the peak grouping results.
adjustRtime
for retention time adjustment methods.
chromatogram
to extract MS data as
Chromatogram
objects.
as
(as(x, "data.frame")
) in the MSnbase
package for the method to extract MS data as data.frame
s.
featureSummary
to calculate basic feature summaries.
featureChromatograms
to extract chromatograms for each
feature.
chromPeakSpectra
to extract MS2 spectra with the m/z of
the precursor ion within the m/z range of a peak and a retention time
within its retention time range.
featureSpectra
to extract MS2 spectra associated with
identified features.
fillChromPeaks
for the method to fill in potentially
missing chromatographic peaks for a feature in some samples.
## Load a test data set with detected peaks
library(xcms)
library(MSnbase)
data(faahko_sub)

## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

## The results from the peak detection are now stored in the XCMSnExp
## object
faahko_sub

## The detected peaks can be accessed with the chromPeaks method.
head(chromPeaks(faahko_sub))

## The settings of the chromatographic peak detection can be accessed with
## the processHistory method
processHistory(faahko_sub)

## Also the parameter class for the peak detection can be accessed
processParam(processHistory(faahko_sub)[[1]])

## The XCMSnExp inherits all methods from the pSet and OnDiskMSnExp classes
## defined in Bioconductor's MSnbase package. To access the (raw) retention
## time for each spectrum we can use the rtime method. Setting bySample = TRUE
## would cause the retention times to be grouped by sample
head(rtime(faahko_sub))

## Similarly it is possible to extract the mz values or the intensity values
## using the mz and intensity method, respectively, also with the option to
## return the results grouped by sample instead of the default, which is
## grouped by spectrum. Finally, to extract all of the data we can use the
## spectra method which returns Spectrum objects containing all raw data.
## Note that all these methods read the information from the original input
## files and subsequently apply any data processing steps to them.
mzs <- mz(faahko_sub, bySample = TRUE)
length(mzs)
lengths(mzs)

## The full data could also be read using the spectra method, which returns
## a list of Spectrum objects containing the mz, intensity and rt values.
## spctr <- spectra(faahko_sub)
## To get all spectra of the first file we can split them by file
## head(split(spctr, fromFile(faahko_sub))[[1]])

############
## Filtering
##
## XCMSnExp objects can be filtered by file, retention time, mz values or
## MS level. For some of these filters, preprocessing results (mostly
## retention time correction and peak grouping results) will be dropped.
## Below we filter the XCMSnExp object by file to extract the results for
## only the second file.
xod_2 <- filterFile(faahko_sub, file = 2)
xod_2

## Now the object contains only the identified peaks for the second file
head(chromPeaks(xod_2))

##########
## Coercing to an xcmsSet object
##
## We can also coerce the XCMSnExp object into an xcmsSet object:
xs <- as(faahko_sub, "xcmsSet")
head(peaks(xs))
A matrix of peak information. The actual columns depend on
how it is generated (i.e. the findPeaks
method).
Objects can be created by calls of the form new("xcmsPeaks", ...)
.
.Data
:The matrix holding the peak information
Class "matrix"
, from data part.
Class "array"
, by class "matrix", distance 2.
Class "structure"
, by class "matrix", distance 3.
Class "vector"
, by class "matrix", distance 4, with explicit coerce.
None yet. Some utilities for working with peak data would be nice.
Michael Lawrence
findPeaks
for detecting peaks in an
xcmsRaw
.
This function handles the task of reading a NetCDF/mzXML file containing
LC/MS or GC/MS data into a new xcmsRaw
object. It also
transforms the data into profile (matrix) mode for efficient
plotting and data exploration.
xcmsRaw(filename, profstep = 1, profmethod = "bin", profparam = list(),
        includeMSn = FALSE, mslevel = NULL, scanrange = NULL)

deepCopy(object)
filename | path name of the NetCDF or mzXML file to read |
profstep | step size (in m/z) to use for profile generation |
profmethod | method to use for profile generation. See |
profparam | extra parameters to use for profile generation |
includeMSn | only for XML file formats: also read MS^n (tandem MS of ion trap or Orbitrap spectra) |
mslevel | move data from mslevel into normal MS1 slots, e.g. for peak picking and visualisation |
scanrange | scan range to read |
object | An xcmsRaw object |
See profile-matrix
for details on profile matrix
generation methods and settings.
The scanrange to import can be restricted, otherwise all MS1 data
is read. If profstep
is set to 0, no profile matrix is generated.
Unless includeMSn = TRUE
only first level MS data is read, not MS/MS,
etc.
deepCopy(xraw) will create a copy of the xcmsRaw object with its own
copy of mz and intensity data in xraw@env
.
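The reason a dedicated copy function is useful is that R environments have reference semantics: assigning an object that holds an environment copies only the reference. The base R sketch below illustrates this with a bare environment and made-up values; the list2env call is a hypothetical stand-in for a deep copy and is not the implementation of deepCopy.

```r
## An environment holding data, similar in spirit to the env slot of an
## xcmsRaw object (the values here are made up).
orig <- new.env()
orig$mz <- c(100.1, 200.2, 300.3)

## Plain assignment copies only the reference to the environment, so a
## change through the alias is visible in the original.
alias <- orig
alias$mz[1] <- 0
orig$mz[1]

## Hypothetical stand-in for a deep copy: a new environment with its own
## copies of all variables. Changes no longer affect the original.
deep <- list2env(as.list(orig, all.names = TRUE), envir = new.env())
deep$mz[2] <- -1
orig$mz[2]
```

The first printed value shows the change made through alias, while the second shows that the original is untouched by the change made to deep.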
An xcmsRaw
object.
Colin A. Smith, [email protected]
NetCDF file format: https://www.unidata.ucar.edu/software/netcdf/ http://www.astm.org/Standards/E2077.htm http://www.astm.org/Standards/E2078.htm
mzXML file format: http://sashimi.sourceforge.net/software_glossolalia.html
PSI-MS working group who developed mzData and mzML file formats: http://www.psidev.info/index.php?q=node/80
Parser used for XML file formats: http://tools.proteomecenter.org/wiki/index.php?title=Software:RAMP
xcmsRaw-class
,
profStep
,
profMethod
xcmsFragments
## Not run:
library(xcms)
library(faahKO)
cdfpath <- system.file("cdf", package = "faahKO")
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
xr <- xcmsRaw(cdffiles[1])
xr
## This gives some information about the file
names(attributes(xr))
## Let's have a look at the structure of the object
str(xr)
## same but with a preview of each slot in the object

## So... let's have a look at how this works
head(xr@scanindex)
## [1] 0 429 860 1291 1718 2140
xr@env$mz[425:430]
## [1] 596.3 597.0 597.3 598.1 599.3 200.1

## We can see that index 429 is the last mz of scan 1, therefore...
mz.scan1 <- xr@env$mz[(1 + xr@scanindex[1]):xr@scanindex[2]]
intensity.scan1 <- xr@env$intensity[(1 + xr@scanindex[1]):xr@scanindex[2]]
plot(mz.scan1, intensity.scan1, type = "h",
     main = paste("Scan 1 of file", basename(cdffiles[1])))

## the easier way :p
scan1 <- getScan(xr, 1)
head(scan1)
plotScan(xr, 1)
## End(Not run)
This class handles processing and visualization of the raw data from a single LC/MS or GC/MS run. It includes methods for producing a standard suite of plots including individual spectra, multi-scan average spectra, TIC, and EIC. It will also produce a feature list of significant peaks using matched filtration.
Objects can be created with the xcmsRaw
constructor
which reads data from a NetCDF file into a new object.
acquisitionNum: Numeric representing the acquisition number of the individual scans/spectra. The length of acquisitionNum is equal to the number of spectra/scans in the object and hence to the length of the scantime slot. Note however that this information is only available in mzML files.
env: environment with three variables: mz - concatenated m/z values for all scans, intensity - corresponding signal intensity for each m/z value, and profile - matrix representation of the intensity values with columns representing scans and rows representing equally spaced m/z values. The profile matrix should be extracted with the profMat method.
filepath: Path to the raw data file
gradient: matrix with first row, time, containing the time point for interpolation and successive columns representing solvent fractions at each point
msnAcquisitionNum: for each scan a unique acquisition number as reported via "spectrum id" (mzData) or "<scan num=...>" and "<scanOrigin num=...>" (mzXML)
msnCollisionEnergy: "CollisionEnergy" (mzData) or "collisionEnergy" (mzXML)
msnLevel: for each scan the "msLevel" (both mzData and mzXML)
msnPrecursorCharge: "ChargeState" (mzData) and "precursorCharge" (mzXML)
msnPrecursorIntensity: "Intensity" (mzData) or "precursorIntensity" (mzXML)
msnPrecursorMz: "MassToChargeRatio" (mzData) or "precursorMz" (mzXML)
msnPrecursorScan: "spectrumRef" (both mzData and mzXML)
msnRt: Retention time of the scan
msnScanindex: msnScanindex
mzrange: numeric vector of length 2 with minimum and maximum m/z values represented in the profile matrix
polarity: polarity
profmethod: character value with the name of the method used for generating the profile matrix.
profparam: list to store additional profile matrix generation settings. Use the profinfo method to extract all information relevant to profile matrix creation.
scanindex: integer vector with starting positions of each scan in the mz and intensity variables (note that index values are based on a 0 initial position instead of 1).
scantime: numeric vector with acquisition time (in seconds) for each scan.
tic: numeric vector with total ion count (intensity) for each scan
mslevel: Numeric representing the MS level that is present in MS1 slot. This slot should be accessed through its getter method mslevel.
scanrange: Numeric of length 2 specifying the scan range (or NULL for the full range). This slot should be accessed through its getter method scanrange. Note that the scanrange will always be 1 to the number of scans within the xcmsRaw object, which does not necessarily match the scan index in the original mzML file (e.g. if the original data was subsetted). The acquisitionNum information can be used to track the original position of each scan in the mzML file.
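The 0-based scanindex convention described above can be illustrated with a small base R sketch. It uses made-up toy vectors rather than a real xcmsRaw object, and the helper toyGetScan is hypothetical (it is not part of xcms); it only mirrors how the concatenated mz and intensity variables are sliced per scan and how a per-scan total ion count could be derived.

```r
## Toy data (assumed values): three scans stored back to back,
## mimicking the concatenated `mz` and `intensity` variables in `env`.
mz        <- c(100.1, 150.2, 200.3, 100.0, 180.5, 120.7, 160.2, 190.9, 250.4)
intensity <- c(10, 50, 5, 20, 30, 8, 12, 40, 4)

## 0-based start offset of each scan, following the `scanindex` convention.
scanindex <- c(0L, 3L, 5L)

## Hypothetical helper: return the m/z-intensity pairs of scan `i`.
toyGetScan <- function(i) {
    start <- scanindex[i] + 1L                      # convert to 1-based
    end <- if (i < length(scanindex)) scanindex[i + 1L] else length(mz)
    idx <- seq.int(start, end)
    cbind(mz = mz[idx], intensity = intensity[idx])
}

scan2 <- toyGetScan(2)

## Total ion count per scan, analogous to the `tic` slot.
scanId <- rep(seq_along(scanindex), diff(c(scanindex, length(mz))))
tic <- as.numeric(tapply(intensity, scanId, sum))
```

On a real object the getScan method should be preferred over manual indexing.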
signature(object = "xcmsRaw"): feature detection using matched filtration in the chromatographic time domain
signature(object = "xcmsRaw"): get extracted ion chromatograms in specified m/z ranges. This will return the total ion chromatogram (TIC) if the m/z range corresponds to the full m/z range (i.e. sum of all signals per retention time across all m/z).
signature(object = "xcmsRaw"): get data for peaks in specified m/z and time ranges
signature(object = "xcmsRaw"): get m/z and intensity values for a single mass scan
signature(object = "xcmsRaw"): get average m/z and intensity values for multiple mass scans
signature(x = "xcmsRaw"): get data for peaks in specified m/z and time ranges
Create an image of the raw (profile) data m/z against retention time, with the intensity color coded.
Getter method for the mslevel slot.
signature(object = "xcmsRaw"): plot a chromatogram from profile data
signature(object = "xcmsRaw"): plot locations of raw intensity data points
signature(object = "xcmsRaw"): plot a mass spectrum of an individual scan from the raw data
signature(object = "xcmsRaw"): plot a mass spectrum from profile data
signature(object = "xcmsRaw"): experimental method for plotting 3D surface of profile data with rgl.
signature(object = "xcmsRaw"): plot total ion count chromatogram
signature(object = "xcmsRaw"): returns a list containing the profile generation method and step (profile m/z step size) and any additional parameters to the profile function.
signature(object = "xcmsRaw"): median filter profile data in time and m/z dimensions
signature(object = "xcmsRaw"): change the method of generating the profile matrix
signature(object = "xcmsRaw"): get the method of generating the profile matrix
signature(object = "xcmsRaw"): get vector of m/z values for each row of the profile matrix
signature(object = "xcmsRaw"): interpret flexible ways of specifying subsets of the profile matrix
signature(object = "xcmsRaw"): change the m/z step used for generating the profile matrix
signature(object = "xcmsRaw"): get the m/z step used for generating the profile matrix
signature(object = "xcmsRaw"): reverse the order of the data points for each scan
Getter method for the scanrange slot. See slot description above for more information.
signature(object = "xcmsRaw"): sort the data points by increasing m/z for each scan
signature(object = "xcmsRaw"): Raw data correction for lock mass calibration gaps.
signature(object = "xcmsRaw"): internal function to identify regions of interest in the raw data as part of the first step of centWave-based peak detection.
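The relation between an extracted ion chromatogram and the total ion chromatogram mentioned above can be sketched in base R. This is a toy under stated assumptions, not the xcms implementation: the data are invented, the function toyEIC is hypothetical, and intensities are simply summed per scan within the requested m/z range.

```r
## Toy data (assumed values): data points of three scans, with the
## scan each point belongs to.
mz        <- c(100.1, 150.2, 200.3, 100.0, 180.5, 120.7, 160.2, 190.9, 250.4)
intensity <- c(10, 50, 5, 20, 30, 8, 12, 40, 4)
scanId    <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)

## Hypothetical helper: sum the intensities per scan for all data points
## whose m/z lies within `mzrange`.
toyEIC <- function(mzrange) {
    keep <- mz >= mzrange[1] & mz <= mzrange[2]
    ## Explicit factor levels keep scans without any matching data point.
    sums <- tapply(intensity[keep], factor(scanId[keep], levels = 1:3), sum)
    sums[is.na(sums)] <- 0
    as.numeric(sums)
}

eic <- toyEIC(c(150, 200))   # chromatogram for a narrow m/z window
tic <- toyEIC(range(mz))     # full m/z range: equals the TIC
```

With the full m/z range the result is the sum of all signals per scan, which is the TIC case described for the extracted ion chromatogram method.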
Colin A. Smith, [email protected], Johannes Rainer [email protected]
xcmsRaw, subset-xcmsRaw for subsetting by spectra.
This function handles the construction of xcmsSet objects. It finds peaks in batch mode and pre-sorts files from subdirectories into different classes suitable for grouping.
xcmsSet(files = NULL, snames = NULL, sclass = NULL, phenoData = NULL, profmethod = "bin", profparam = list(), polarity = NULL, lockMassFreq=FALSE, mslevel=NULL, nSlaves=0, progressCallback=NULL, scanrange = NULL, BPPARAM = bpparam(), stopOnError = TRUE, ...)
files | path names of the NetCDF/mzXML files to read |
snames | sample names. By default the file name without extension is used. |
sclass | sample classes. |
phenoData | |
profmethod | Method to use for profile generation. Supported values are |
profparam | parameters to use for profile generation. |
polarity | filter raw data for positive/negative scans |
lockMassFreq | Performs correction for Waters LockMass function |
mslevel | perform peak picking on data of given mslevel |
nSlaves | DEPRECATED, use |
progressCallback | function to be called when progressInfo changes (useful for GUIs) |
scanrange | scan range to read |
BPPARAM | a |
stopOnError | Logical specifying whether the feature detection call should stop on the first encountered error (the default), or whether feature detection is performed in all files regardless of failures for individual files, in which case all errors are reported as |
... | further arguments to the |
The default values of the files, snames, sclass, and phenoData arguments cause the function to recursively search for readable files. The filename without extension is used for the sample name. The subdirectory path is used for the sample class.
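How sample names and classes can be derived from file paths is sketched below in base R. The paths are invented and the logic is only an approximation for illustration: it takes the immediate parent directory as the class, whereas the paragraph above speaks of the subdirectory path, so the real function may behave differently for deeper directory trees.

```r
## Invented example paths; no files need to exist for this sketch.
files <- c("data/KO/ko15.CDF", "data/KO/ko16.CDF", "data/WT/wt15.CDF")

## Sample name: file name without directory and without extension.
snames <- tools::file_path_sans_ext(basename(files))

## Sample class (simplifying assumption): name of the parent directory.
sclass <- basename(dirname(files))

data.frame(file = files, name = snames, class = sclass)
```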
If the files contain both positive and negative spectra, the polarity
can be selected explicitly. The default (NULL) is to read all scans.
If phenoData is provided, it is stored in the phenoData slot of the returned xcmsSet object. If that data.frame contains a column named “class”, its content will be returned by the sampclass method and thus be used for the group/class assignment of the individual files (e.g. for peak grouping etc.). For more details see the help of the xcmsSet-class.
The step size (in m/z) to use for profile generation can be submitted either using the profparam argument (e.g. profparam=list(step=0.1)) or by submitting step=0.1. By specifying a value of 0 the profile matrix generation can be skipped.
The feature/peak detection algorithm can be specified with the method argument which defaults to the "matchedFilter" method (findPeaks.matchedFilter). Possible values are returned by getOption("BioC")$xcms$findPeaks.methods.
The lock mass correction allows for the lock mass scan to be added back in with the last working scan. This correction gives better reproducibility between sample sets.
An xcmsSet object.
The arguments profmethod and profparam have no influence on the feature/peak detection. The step size parameter step for the profile generation in the findPeaks.matchedFilter peak detection algorithm can be passed using the ... argument.
Colin A. Smith, [email protected]
xcmsSet-class, findPeaks, profStep, profMethod, profBin
This class transforms a set of peaks from multiple LC/MS or GC/MS samples into a matrix of preprocessed data. It groups the peaks and does nonlinear retention time correction without internal standards. It fills in missing peak values from raw data. Lastly, it generates extracted ion chromatograms for ions of interest.
The phenoData slot (and phenoData parameter in the xcmsSet function) is intended to contain a data.frame describing all experimental factors, i.e. the samples along with their properties. If this data.frame contains a column named “class”, this will be returned by the sampclass method and will thus be used by all methods to determine the sample grouping/class assignment (e.g. to define the colors in various plots or for the group method).
The sampclass<- method adds or replaces the “class” column in the phenoData slot. If a data.frame is submitted to this method, the interaction of its columns will be stored into the “class” column.
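What "the interaction of its columns" means can be shown with base R's interaction function on a toy data.frame. The column names genotype and treatment and the sample names are assumptions made up for this sketch; it does not call sampclass<- itself.

```r
## Toy phenotype table (assumed columns and sample names).
pd <- data.frame(
    genotype  = c("KO", "KO", "WT", "WT"),
    treatment = c("ctrl", "trt", "ctrl", "trt"),
    row.names = c("ko15", "ko16", "wt15", "wt16")
)

## interaction() accepts a list (here a data.frame) of factors and
## combines their levels, one combined level per observed row.
pd$class <- interaction(pd[, c("genotype", "treatment")], drop = TRUE)

as.character(pd$class)
```

Each sample ends up in a class defined by the combination of all its experimental factors, so a two-factor design yields one class per factor combination.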
Also, similar to other classes in Bioconductor, the $ method can be used to directly access all columns in the phenoData slot (e.g. use xset$name on a xcmsSet object called “xset” to extract the values from a column named “name” in the phenoData slot).
Objects can be created with the xcmsSet constructor which gathers peaks from a set of NetCDF files. Objects can also be created by calls of the form new("xcmsSet", ...).
matrix containing peak data.
A vector with peak indices of peaks which have been added by a fillPeaks method.
Matrix containing statistics about peak groups.
List containing indices of peaks in each group.
A data.frame containing the experimental design factors.
list containing two lists, raw and corrected, each containing retention times for every scan of every sample.
Character vector with absolute path name of each NetCDF file.
list containing the values method - profile generation method, and step - profile m/z step size and any additional parameters to the profile function.
logical vector filled if the Waters lock mass correction parameter is used.
A string ("positive" or "negative" or NULL) describing whether only positive or negative scans have been used reading the raw data.
Progress information for some xcms functions (for GUI).
Function to be called when progressInfo changes (for GUI).
Numeric representing the MS level on which the peak picking was performed (by default on MS1). This slot should be accessed through its getter method mslevel.
Numeric of length 2 specifying the scan range (or NULL for the full range). This slot should be accessed through its getter method scanrange. The scan range provided in this slot represents the scans to which the whole raw data is subsetted.
Internal slot to be used to keep track of performed processing steps. This slot should not be directly accessed by the user.
signature("xcmsSet"): combine objects together
signature(object = "xcmsSet"): set filepaths slot
signature(object = "xcmsSet"): get filepaths slot
signature(object = "xcmsSet"): create report of differentially regulated ions including EICs
signature(object = "xcmsSet"): fill in peak data for groups with missing peaks
signature(object = "xcmsSet"): get list of EICs for each sample in the set
signature(object = "xcmsSet", sampleidx = 1, profmethod = profMethod(object), profstep = profStep(object), profparam=profinfo(object), mslevel = NULL, scanrange = NULL, rt=c("corrected", "raw"), BPPARAM = bpparam()): read the raw data for one or more files in the xcmsSet and return it. With the default parameters, all settings used in the original xcmsSet call that generated the xcmsSet object are also applied to the raw data. Parameter sampleidx specifies which raw file(s) should be loaded. Argument BPPARAM allows parallel processing to be set up.
signature(object = "xcmsSet"): set groupidx slot
signature(object = "xcmsSet"): get groupidx slot
signature(object = "xcmsSet"): get textual names for peak groups
signature(object = "xcmsSet"): set groups slot
signature(object = "xcmsSet"): get groups slot
signature(object = "xcmsSet"): get matrix of values from peak data with a row for each peak group
signature(object = "xcmsSet"): find groups of peaks across samples that share similar m/z and retention times
Getter method for the mslevel slot.
signature(object = "xcmsSet"): set peaks slot
signature(object = "xcmsSet"): get peaks slot
signature(object = "xcmsSet"): plot retention time deviation profiles
signature(object = "xcmsSet"): set profinfo slot
signature(object = "xcmsSet"): get profinfo slot
signature(object = "xcmsSet"): extract the method used to generate the profile matrix.
signature(object = "xcmsSet"): extract the profile step used for the generation of the profile matrix.
signature(object = "xcmsSet"): use initial grouping of peaks to do nonlinear loess retention time correction
signature(object = "xcmsSet"): Replaces the column “class” in the phenoData slot. See details for more information.
signature(object = "xcmsSet"): Returns the content of the column “class” from the phenoData slot or, if not present, the interaction of the experimental design factors (i.e. of the phenoData data.frame). See details for more information.
signature(object = "xcmsSet"): set the phenoData slot
signature(object = "xcmsSet"): get the phenoData slot
signature(object = "xcmsSet"): set the progressCallback slot
signature(object = "xcmsSet"): get the progressCallback slot
Getter method for the scanrange slot. See scanrange slot description above for more details.
signature(object = "xcmsSet"): set rownames in the phenoData slot
signature(object = "xcmsSet"): get rownames in the phenoData slot
signature("xcmsSet"): divide the xcmsSet into a list of xcmsSet objects depending on the provided factor. Note that only peak data will be preserved, i.e. any peak grouping information will be lost.
object$name, object$name<-value: Access and set name column in phenoData
object[, i]: Conducts subsetting of a xcmsSet instance. Only subsetting on columns, i.e. samples, is supported. Subsetting is performed on all slots, also on groups and groupidx. Parameter i can be an integer vector, a logical vector or a character vector of sample names (matching sampnames).
Colin A. Smith, [email protected], Johannes Rainer [email protected]
This virtual class provides an implementation-independent way to load mass spectrometer data from various sources for use in an xcmsRaw object. Subclasses can be defined to enable data to be loaded from user-specified sources. The virtual class xcmsFileSource, which contains a file name as a character string, is included out of the box.
When implementing child classes of xcmsSource, a corresponding loadRaw-methods method must be provided which accepts the xcmsSource child class and returns a list in the format described in loadRaw-methods.
A virtual Class: No objects may be created from it.
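The pattern of a virtual source class with child classes and a loader method can be sketched with plain S4 from the methods package. All names below (toySource, toyMemorySource, toyLoadRaw) are hypothetical stand-ins chosen so that nothing from xcms is redefined, and the fields of the returned list are an assumption; the authoritative format is the one described in loadRaw-methods.

```r
library(methods)

## Hypothetical virtual parent class: it cannot be instantiated.
setClass("toySource", representation("VIRTUAL"))

## Hypothetical child class holding the scans in memory.
setClass("toyMemorySource", contains = "toySource",
         representation(mz = "list", intensity = "list", rt = "numeric"))

## Hypothetical loader generic standing in for a loadRaw-like method.
setGeneric("toyLoadRaw", function(object) standardGeneric("toyLoadRaw"))

setMethod("toyLoadRaw", "toyMemorySource", function(object) {
    n <- lengths(object@mz)
    ## Assumed list layout: concatenated values plus 0-based scan starts.
    list(rt = object@rt,
         tic = vapply(object@intensity, sum, numeric(1)),
         scanindex = as.integer(cumsum(c(0L, n[-length(n)]))),
         mz = unlist(object@mz),
         intensity = unlist(object@intensity))
})

src <- new("toyMemorySource",
           mz = list(c(100, 200), c(150, 250, 300)),
           intensity = list(c(5, 10), c(1, 2, 3)),
           rt = c(1.5, 3.0))
raw <- toyLoadRaw(src)

## Creating an object of the virtual class fails, as stated above.
virtualFails <- inherits(try(new("toySource"), silent = TRUE), "try-error")
```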
Daniel Hackney, [email protected]
xcmsSource-methods for creating xcmsSource objects in various ways.
xcmsSource object in a flexible way
Users can define alternate means of reading data for xcmsRaw objects by creating new implementations of this method.
signature(object = "xcmsSource"): Pass the object through unmodified.
Daniel Hackney, [email protected]