Package 'xcms'

Title:	LC-MS and GC-MS Data Analysis
Description:	Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.
Authors:	Colin A. Smith [aut], Ralf Tautenhahn [aut], Steffen Neumann [aut, cre], Paul Benton [aut], Christopher Conley [aut], Johannes Rainer [aut], Michael Witting [ctb], William Kumler [aut], Philippine Louail [aut], Pablo Vangeenderhuysen [ctb], Carl Brunius [ctb]
Maintainer:	Steffen Neumann <[email protected]>
License:	GPL (>= 2) + file LICENSE
Version:	4.5.3
Built:	2025-02-15 06:16:31 UTC
Source:	https://github.com/bioc/xcms

Help Index

XCMSnExp filtering and subsetting
Subset an xcmsRaw object by scans
Determine which peaks are absent / present in a sample class
Alignment: Retention time correction methods.
Landmark-based alignment: aligning a dataset against an external reference
Replace raw with adjusted retention times
Automatic parameter for Lock mass fixing AutoLockMass
XCMSnExp data manipulation methods inherited from MSnbase
Aggregate values in y for bins defined on x
Flag features based on the intensity in blank samples
Generate breaks for binning using a defined bin size.
Generate breaks for binning
Combine xcmsSet objects
Calibrant mass based calibration of chromatographic peaks
Calibrate peaks for correcting imprecise m/z values
Extracting chromatograms
Extract an ion chromatogram for each chromatographic peak
Extract spectra associated with chromatographic peaks
Chromatographic peak summaries
Collect MS^n peaks into xcmsFragments
Correlate chromatograms
Create report of analyte differences
Change the file path of an OnDiskMSnExp object
Align spectrum retention times across samples using peak groups found in most samples
Core API function for centWave peak detection
Core API function for two-step centWave peak detection with isotopes
Core API function for massifquant peak detection
Core API function for matchedFilter peak detection
Core API function for single-spectrum non-chromatography MS data peak detection
Core API function for peak density based chromatographic peak grouping
Core API function for chromatographic peak grouping using a nearest neighbor approach
Core API function for peak grouping using mzClust
Filter features based on the dispersion ratio
Estimate precursor intensity for MS level 2 spectra
Empirically Transformed Gaussian function
Export data for use in MetaboAnalyst
DEPRECATED: Extract a data.frame containing MS data
Compounding of LC-MS features
Extract ion chromatograms for each feature
Extract spectra associated with features
Simple feature summaries
Gap Filling
Integrate areas of missing peaks
Integrate areas of missing peaks
Integrate areas of missing peaks in FTICR-MS data
Filtering sets of chromatographic data
Next Generation xcms Result Object
Filtering of features based on conventional quality assessment
Chromatographic Peak Detection
Chromatographic peak detection using the centWave method
Two-step centWave peak detection considering also isotopes
Chromatographic peak detection using the massifquant method
Peak detection in the chromatographic time domain
centWave-based peak detection in purely chromatographic data
matchedFilter-based peak detection in purely chromatographic data
Data independent acquisition (DIA): peak detection in isolation windows
Find fragment ions in xcmsFragment objects
Find neutral losses in xcmsFragment objects
Feature detection for GC/MS and LC/MS Data - methods
Single-spectrum non-chromatography MS data peak detection
Feature detection based on predicted isotope features for high resolution LC/MS data
Feature detection for high resolution LC/MS data
Feature detection with centWave and additional isotope features
Feature detection for XC-MS data.
Peak detection in the chromatographic time domain
Collecting MS1 precursor peaks
Peak detection for single-spectrum non-chromatography MS data
Generic parameter class
Get extracted ion chromatograms for specified m/z ranges
Get peak intensities for specified regions
Get m/z and intensity values for a single mass scan
Get average m/z and intensity values for multiple mass scans
Load the raw data for one or more files in the xcmsSet
Group peaks from different samples together
Group peaks from different samples together
Group Peaks via High Resolution Alignment
Group peaks from different samples together
Correspondence: group chromatographic peaks across samples
Compounding/feature grouping based on similarity of abundances across samples
Compounding/feature grouping based on similarity of extracted ion chromatograms
Compounding/feature grouping based on similar retention times
Generate unique names for peak groups
Generate unique group (feature) names based on mass and retention time
Group overlapping ranges
Extract a matrix of peak values for each group
Add definition of chromatographic peaks to an extracted chromatogram plot
Plot log intensity image of a xcmsRaw object
Impute values for empty elements in a vector using linear interpolation
Replace missing values with a proportion of the row minimum
Impute missing values with random numbers based on the row minimum
Extract isolation window target m/z definition
Plot log intensity image of a xcmsRaw object
Read binary data from a source
LC-MS preprocessing result test data sets
Manual peak integration and feature definition
Apply a median filter to a matrix
Copy MSn data in an xcmsRaw to the MS slots
Identify overlapping features
Plot a grid of a large number of peaks
Identify peaks in chromatographic data using centWave
Identify peaks in chromatographic data using matchedFilter
Create report of aligned peak intensities
Filter features based on the percentage of missing data
Derive experimental design from file paths
Plot extracted ion chromatograms from multiple files
Visualization of Alignment Results
Plot extracted ion chromatograms from the profile matrix
Plot multiple chromatograms into the same plot
Plot chromatographic peak density along the retention time axis
General visualizations of peak detection results
Plot extracted ion chromatograms for specified m/z range
Plot feature groups in the m/z-retention time space
DEPRECATED: Create a plot that combines a XIC and a mz/rt 2D plot for one sample
Plot a grid of a large number of peaks
General visualization of precursor ions of LC-MS/MS data
Plot m/z and RT deviations for QC purposes without external reference data
Scatterplot of raw data points
Plot retention time deviation profiles
Plot a single mass scan
Plot mass spectra from the profile matrix
Plot profile matrix 3D surface using OpenGL
Plot total ion count
Tracking data processing
The profile matrix
Median filtering of the profile matrix
Get and set method for generating profile data
Specify a subset of profile mode data
Get and set m/z step for generating profile data
Accessing mz-rt feature data values
Get extracted ion chromatograms for specified m/z range
Get a raw data matrix
Data independent acquisition (DIA): reconstruct MS2 spectra
Refine Identified Chromatographic Peaks
Remove intensities from chromatographic data
Correct retention time from different samples
Align retention times across samples with Obiwarp
Align retention times across samples
Set retention time window to a specified width
Calculate relative log abundances
Filter features based on their coefficient of variation
Get sample names
Extract processing errors
Distance methods for xcmsSet, xcmsRaw and xsAnnotate
a Distance function based on matching peaks
a Distance function based on matching peaks
a Distance function based on matching peaks
Calculate noise for a sparse continuum mass spectrum
Identify peaks in a sparse continuum mode spectrum
Divide an xcmsRaw object
Divide an xcmsSet object
Gaussian Model
Correct gaps in data
Update an xcmsSet object
Enable usage of old xcms code
Verify an mzQuantML file
Save an xcmsRaw object to file
Save an xcmsRaw object to a file
Save an xcmsSet object to a PSI mzQuantML file
Export MS data to mzML/mzXML files
Save a grouped xcmsSet object in mzTab-1.1 format file
Containers for chromatographic and peak detection data
Deprecated functions in package ‘xcms’
Class xcmsEIC, a class for multi-sample extracted ion chromatograms
Base class for loading raw data from a file
Constructor for xcmsFragments objects which holds Tandem MS peaks
Class xcmsFragments, a class for handling Tandem MS and MS^n data
Data container storing xcms preprocessing results
A matrix of peaks
Constructor for xcmsRaw objects which reads NetCDF/mzXML files
Class xcmsRaw, a class for handling raw data
Constructor for xcmsSet objects which finds peaks in NetCDF/mzXML files
Class xcmsSet, a class for preprocessing peak data
Virtual class for raw data sources
Create an xcmsSource object in a flexible way

XCMSnExp filtering and subsetting

Description

The methods listed on this page allow filtering and subsetting of XCMSnExp objects. Most of them are inherited from the OnDiskMSnExp object defined in the MSnbase package and have been adapted for XCMSnExp to enable correct subsetting of preprocessing results.

[: subsets an XCMSnExp object by spectra. Be aware that this removes all preprocessing results, except adjusted retention times if keepAdjustedRtime = TRUE is passed to the method.
[[: extracts a single Spectrum object (defined in MSnbase). The reported retention time is the adjusted retention time if alignment has been performed.
filterChromPeaks: subsets the chromPeaks matrix in object. Parameter method specifies how the chromatographic peaks should be filtered. Currently, only method = "keep" is supported, which retains the chromatographic peaks selected with parameter keep (a logical, integer or character vector defining which chromatographic peaks to keep). Feature definitions (if present) are updated correspondingly.
filterFeatureDefinitions: subsets the feature definitions of an XCMSnExp object. Parameter features defines which features to keep. It can be a logical, integer (index of features to keep) or character (feature IDs) vector.
filterFile: reduces the XCMSnExp to data from only the selected files. Identified chromatographic peaks for these files are retained while correspondence results (feature definitions) are removed by default. To force keeping feature definitions use keepFeatures = TRUE. Adjusted retention times (if present) are retained by default. Use keepAdjustedRtime = FALSE to drop them.
filterMsLevel: reduces the XCMSnExp object to spectra of the specified MS level(s). Chromatographic peaks and identified features are also subset to the respective MS level. See also the filterMsLevel documentation in MSnbase for details and examples.
filterMz: filters the data set based on the provided m/z value range. All chromatographic peaks and features (grouped peaks) with their apex falling within the provided m/z value range are retained (i.e. if chromPeaks(object)[, "mz"] is >= mz[1] and <= mz[2]). Adjusted retention times, if present, are kept.
filterRt: filters the data set based on the provided retention time range. All chromatographic peaks and features (grouped peaks) within the specified retention time window are retained (i.e. if the retention time corresponding to the peak's apex is within the specified rt range). If retention time correction has been performed, the method will by default filter the object by adjusted retention times. The argument adjusted specifies whether filtering should be performed on raw or adjusted retention times. Filtering by retention time neither drops any preprocessing results nor removes or changes alignment results (i.e. adjusted retention times). The method returns an empty object if no spectrum or feature is within the specified retention time range.
split: splits an XCMSnExp object into a list of XCMSnExp objects based on the provided parameter f. Note that by default all preprocessing results are removed by the splitting, except adjusted retention times if the optional argument keepAdjustedRtime = TRUE is provided.
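
The retention rule described for filterMz and filterRt can be sketched in base R on a toy matrix. This is only an illustration under stated assumptions, not the xcms implementation: the matrix peaks is a made-up stand-in for chromPeaks(object), and treating both retention time bounds as inclusive is an assumption (the text above states inclusive bounds explicitly only for m/z).

```r
## Toy stand-in for a chromPeaks() matrix; all values are made up.
peaks <- cbind(mz = c(250.1, 310.2, 355.0, 401.7),
               rt = c(2650, 2710, 2895, 3010))

mz_range <- c(300, 400)    # plays the role of argument `mz`
rt_range <- c(2700, 2900)  # plays the role of argument `rt`

## Keep peaks whose apex m/z lies within [mz[1], mz[2]].
keep_mz <- peaks[, "mz"] >= mz_range[1] & peaks[, "mz"] <= mz_range[2]
peaks_mz <- peaks[keep_mz, , drop = FALSE]

## Keep peaks whose apex retention time lies within the rt window
## (inclusive bounds are an assumption here).
keep_rt <- peaks[, "rt"] >= rt_range[1] & peaks[, "rt"] <= rt_range[2]
peaks_rt <- peaks[keep_rt, , drop = FALSE]

nrow(peaks_mz)
nrow(peaks_rt)
```

In both cases only the second and third toy peaks remain; drop = FALSE keeps the result a matrix even if a single row is left.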

Usage

## S4 method for signature 'XCMSnExp,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'XCMSnExp,ANY,ANY'
x[[i, j, drop = FALSE]]

## S4 method for signature 'XCMSnExp'
filterMsLevel(object, msLevel., keepAdjustedRtime = hasAdjustedRtime(object))

## S4 method for signature 'XCMSnExp'
filterFile(
  object,
  file,
  keepAdjustedRtime = hasAdjustedRtime(object),
  keepFeatures = FALSE
)

## S4 method for signature 'XCMSnExp'
filterMz(object, mz, msLevel., ...)

## S4 method for signature 'XCMSnExp'
filterRt(object, rt, msLevel., adjusted = hasAdjustedRtime(object))

## S4 method for signature 'XCMSnExp,ANY'
split(x, f, drop = FALSE, ...)

## S4 method for signature 'XCMSnExp'
filterChromPeaks(
  object,
  keep = rep(TRUE, nrow(chromPeaks(object))),
  method = "keep",
  ...
)

## S4 method for signature 'XCMSnExp'
filterFeatureDefinitions(object, features = integer())

Arguments

`x`	For `[` and `[[`: an `XCMSnExp` object.
`i`	For `[`: `numeric` or `logical` vector specifying to which spectra the data set should be reduced. For `[[`: a single integer or character.
`j`	For `[` and `[[`: not supported.
`...`	Optional additional arguments.
`drop`	For `[` and `[[`: not supported.
`object`	An XCMSnExp object.
`msLevel.`	For `filterMz`, `filterRt`: `numeric` defining the MS level(s) to which operations should be applied or to which the object should be subsetted.
`keepAdjustedRtime`	For `filterFile`, `filterMsLevel`, `[`, `split`: `logical(1)` defining whether the adjusted retention times should be kept, even if e.g. features are being removed (and the retention time correction was performed on these features).
`file`	For `filterFile`: `integer` defining the file index within the object to subset the object by file or `character` specifying the file names to subset. The indices are expected to be in increasing order; if not, they are ordered internally.
`keepFeatures`	For `filterFile`: `logical(1)` whether correspondence results (feature definitions) should be kept or dropped. Defaults to `keepFeatures = FALSE` hence feature definitions are removed from the returned object by default.
`mz`	For `filterMz`: `numeric(2)` defining the lower and upper mz value for the filtering.
`rt`	For `filterRt`: `numeric(2)` defining the retention time window (lower and upper bound) for the filtering.
`adjusted`	For `filterRt`: `logical` indicating whether the object should be filtered by original (`adjusted = FALSE`) or adjusted retention times (`adjusted = TRUE`). For `spectra`: whether the retention times in the individual `Spectrum` objects should be the adjusted or raw retention times.
`f`	For `split`: a vector of length equal to the length of `x` defining how `x` should be split. It is converted internally to a `factor`.
`keep`	For `filterChromPeaks`: `logical`, `integer` or `character` defining which chromatographic peaks should be retained.
`method`	For `filterChromPeaks`: `character(1)` allowing to specify the method by which chromatographic peaks should be filtered. Currently only `method = "keep"` is supported (i.e. specify with parameter `keep` which chromatographic peaks should be retained).
`features`	For `filterFeatureDefinitions`: either an `integer` specifying the indices of the features (rows) to keep, a `logical` with a length matching the number of rows of `featureDefinitions` or a `character` with the feature (row) names.

Details

All subsetting methods try to ensure that the returned data is consistent. Correspondence results, for example, are removed by default if the data set is subset by file, since the correspondence results depend on the files on which correspondence was performed. This can be changed by setting keepFeatures = TRUE. For adjusted retention times, most subsetting methods (even the [ method) support the argument keepAdjustedRtime, which forces the adjusted retention times to be retained even if the default would be to drop them.
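
To make the consistency requirement concrete, the following base R sketch shows why feature definitions have to be updated when chromatographic peaks are removed. It is a hedged illustration, not the xcms code: peaks and features are made-up toy structures, the names CP1 to CP4 and FT1, FT2 are hypothetical, and dropping features that end up without peaks is an assumption.

```r
## Toy peak matrix with row names acting as peak IDs (all values made up).
peaks <- matrix(c(100.1, 200.2, 300.3, 400.4), ncol = 1,
                dimnames = list(c("CP1", "CP2", "CP3", "CP4"), "mz"))
## Hypothetical feature definitions: each feature lists the rows of its peaks.
features <- list(FT1 = c(1L, 2L), FT2 = c(3L, 4L))

## The same selection expressed as logical, integer and character vector.
keep_lgl <- c(TRUE, FALSE, TRUE, TRUE)
keep_int <- c(1L, 3L, 4L)
keep_chr <- c("CP1", "CP3", "CP4")
keep_idx <- match(keep_chr, rownames(peaks))
stopifnot(identical(which(keep_lgl), keep_int), identical(keep_idx, keep_int))

peaks_sub <- peaks[keep_idx, , drop = FALSE]

## Map old row indices to positions in the subset; removed peaks become NA.
new_pos <- match(seq_len(nrow(peaks)), keep_idx)
features_sub <- lapply(features, function(idx) {
  out <- new_pos[idx]
  out[!is.na(out)]
})
## Assumption: features left without any peak are dropped.
features_sub <- features_sub[lengths(features_sub) > 0]
features_sub
```

Without the remapping step, FT2 would still point at rows 3 and 4 of the original matrix, which no longer exist in the three-row subset.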

Value

All methods return an XCMSnExp object.

Note

The filterFile method also removes process history steps not related to the files to which the object is subset and updates the fileIndex attribute accordingly. Also, the method does not allow arbitrary ordering of the files or re-ordering of the files within the object.

Note also that most of the filtering methods, as well as the subsetting operation [, drop all or selected preprocessing results. To consolidate the alignment results, i.e. ensure that adjusted retention times are always preserved, use the applyAdjustedRtime() function on the object that contains the alignment results. This replaces the raw retention times with the adjusted ones.

Author(s)

Johannes Rainer

Examples


## Loading a test data set with identified chromatographic peaks
library(MSnbase)
data(faahko_sub)
## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

## Subset the dataset to the first and third file.
xod_sub <- filterFile(faahko_sub, file = c(1, 3))

## The number of chromatographic peaks per file for the full object
table(chromPeaks(faahko_sub)[, "sample"])

## The number of chromatographic peaks per file for the subset
table(chromPeaks(xod_sub)[, "sample"])

basename(fileNames(faahko_sub))
basename(fileNames(xod_sub))

## Filter on mz values; chromatographic peaks and features within the
## mz range are retained (as well as adjusted retention times).
xod_sub <- filterMz(faahko_sub, mz = c(300, 400))
head(chromPeaks(xod_sub))
nrow(chromPeaks(xod_sub))
nrow(chromPeaks(faahko_sub))

## Filter on rt values. All chromatographic peaks and features within the
## retention time range are retained. Filtering is performed by default on
## adjusted retention times, if present.
xod_sub <- filterRt(faahko_sub, rt = c(2700, 2900))

range(rtime(xod_sub))
head(chromPeaks(xod_sub))
range(chromPeaks(xod_sub)[, "rt"])

nrow(chromPeaks(faahko_sub))
nrow(chromPeaks(xod_sub))

## Extract a single Spectrum
faahko_sub[[4]]

## Subsetting using [ removes all preprocessing results - using
## keepAdjustedRtime = TRUE would keep adjusted retention times, if present.
xod_sub <- faahko_sub[fromFile(faahko_sub) == 1]
xod_sub

## Using split does also remove preprocessing results, but it supports the
## optional parameter keepAdjustedRtime.
## Split the object into a list of XCMSnExp objects, one per file
xod_list <- split(faahko_sub, f = fromFile(faahko_sub))
xod_list

Subset an xcmsRaw object by scans

Description

Subset an xcmsRaw object by scans. The returned xcmsRaw object contains values for all scans specified with argument i. Note that the scanrange slot of the returned xcmsRaw will be c(1, length(object@scantime)) and hence not range(i).

Usage

## S4 method for signature 'xcmsRaw,logicalOrNumeric,missing,missing'
x[i, j, drop]

Arguments

`x`	The `xcmsRaw` object that should be subset.
`i`	Integer or logical vector specifying the scans/spectra to which `x` should be subset.
`j`	Not supported.
`drop`	Not supported.

Details

Only subsetting by scan index in increasing order or by a logical vector is supported. If argument i is not ordered, it is sorted automatically. Indices larger than the total number of scans are discarded.
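
The index handling described above can be sketched in base R as follows. This is an illustration of the documented behaviour, not the xcms implementation; n_scans and the index vectors are made-up values.

```r
## Hypothetical number of scans in the object.
n_scans <- 10L

## Unordered numeric index with one value beyond the last scan.
i <- c(7, 3, 12, 5)
i_norm <- sort(i)                    # unordered indices are sorted
i_norm <- i_norm[i_norm <= n_scans]  # indices beyond the last scan are discarded
i_norm

## A logical index selects the scans at its TRUE positions.
l <- rep(c(FALSE, TRUE), times = 5)
which(l)
```

The practical consequence is that the scans in the returned object are always in acquisition order, regardless of the order in which they were requested.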

Value

The subset xcmsRaw object.

Author(s)

Johannes Rainer

Examples

## Load a test file
file <- system.file('cdf/KO/ko15.CDF', package = "faahKO")
xraw <- xcmsRaw(file, profstep = 0)
## The number of scans/spectra:
length(xraw@scantime)

## Subset the object to scans with a scan time from 3500 to 4000.
xsub <- xraw[xraw@scantime >= 3500 & xraw@scantime <= 4000]
range(xsub@scantime)
## The number of scans:
length(xsub@scantime)
## The number of values of the subset:
length(xsub@env$mz)

Determine which peaks are absent / present in a sample class

Description

Determine which peaks are absent / present in a sample class

Arguments

`object`	`xcmsSet-class` object
`class`	Name of a sample class from `sampclass`
`minfrac`	minimum fraction of samples necessary in the class to be absent/present

Details

Determine which peaks are absent / present in a sample class. The functions treat peaks that are only present because of fillPeaks correctly, i.e. they do not count them as present.
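
One plausible reading of the minfrac criterion can be sketched in base R on toy data. This is not the xcms implementation, and the exact comparison used by absent() and present() is an assumption here: a peak group counts as present if at least minfrac of the class samples have a detected peak, and as absent if at least minfrac of the class samples lack one. The matrix detected and the class labels are made up.

```r
## Rows are peak groups, columns are samples. TRUE marks a detected peak;
## gap-filled peaks would be recorded as FALSE (not counted as present).
detected <- rbind(g1 = c(TRUE,  TRUE,  TRUE,  FALSE),
                  g2 = c(FALSE, FALSE, TRUE,  TRUE),
                  g3 = c(TRUE,  FALSE, FALSE, FALSE))
sample_class <- c("KO", "KO", "WT", "WT")
minfrac <- 0.75

## Fraction of samples of class "KO" in which each group was detected.
in_class <- sample_class == "KO"
frac_present <- rowMeans(detected[, in_class, drop = FALSE])

is_present <- frac_present >= minfrac
is_absent  <- (1 - frac_present) >= minfrac
cbind(frac_present, is_present, is_absent)
```

With minfrac = 0.75, group g3 (detected in one of two KO samples) is neither present nor absent under this reading, which shows that the two results are not simply complements of each other.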

Value

A logical vector with the same length as nrow(groups(object)).

Methods

object = "xcmsSet": absent(object, ...) present(object, ...)

Alignment: Retention time correction methods.

Description

The adjustRtime method(s) perform retention time correction (alignment) between chromatograms of different samples/data sets. Alignment is performed by default on MS level 1 data. Retention times of spectra from other MS levels, if present, are subsequently adjusted based on the adjusted retention times of the MS1 spectra. Note that calling adjustRtime on an xcms result object will remove any previously present alignment results as well as any correspondence analysis results. To run a second round of alignment, raw retention times need to be replaced with adjusted ones using the applyAdjustedRtime() function.

The alignment method can be specified (and configured) using a dedicated param argument.

Supported param objects are:

ObiwarpParam: performs retention time adjustment based on the full m/z - rt data using the obiwarp method (Prince (2006)). It is based on the original code but supports in addition alignment of multiple samples by aligning each against a center sample. The alignment is performed directly on the profile-matrix and can hence be performed independently of the peak detection or peak grouping.
PeakGroupsParam: performs retention time correction based on the alignment of features defined in all/most samples (corresponding to housekeeping compounds or marker compounds) (Smith 2006). First, the retention time deviation of these features is described by fitting either a polynomial (smooth = "loess") or a linear (smooth = "linear") function to the data points. These fits are then used to adjust the retention time of each spectrum in each sample (even for spectra of MS levels other than MS 1). Since the function is based on features (i.e. chromatographic peaks grouped across samples), an initial correspondence analysis has to be performed beforehand using the groupChromPeaks() function. Alternatively, it is also possible to manually define a numeric matrix with retention times of markers in each sample that should be used for alignment. Such a matrix can be passed to the alignment function using the peakGroupsMatrix parameter of the PeakGroupsParam parameter object. By default the adjustRtimePeakGroups function is used to define this matrix. This function identifies peak groups (features) for alignment in object based on the parameters defined in param. See also do_adjustRtime_peakGroups() for the core API function.
LamaParama: performs retention time correction by aligning chromatographic data to an external reference dataset (concept and initial implementation by Carl Brunius). The process involves identifying and aligning peaks within the experimental chromatographic data, represented as an XcmsExperiment object, to a predefined set of landmark features called "lamas". These landmark features are characterized by their mass-to-charge ratio (m/z) and retention time. See LamaParama() for more information on the method.
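
The idea behind the peak groups approach with smooth = "linear" can be sketched in base R. This is a simplified illustration under stated assumptions, not the xcms implementation: the marker retention times are made up, the reference retention time of each marker is assumed to be given (e.g. a median across samples), and the deviation is taken as sample minus reference.

```r
## Retention times of three hypothetical marker features in one sample and
## their assumed reference retention times.
rt_sample <- c(100, 200, 300)
rt_ref    <- c(102, 203, 304)

## Deviation of the sample from the reference at each marker.
rt_dev <- rt_sample - rt_ref

## Describe the deviation as a linear function of retention time.
fit <- lm(rt_dev ~ rt_sample)

## Adjust the retention times of all spectra in the sample by subtracting
## the predicted deviation.
spectra_rt <- c(50, 150, 250)
pred_dev <- predict(fit, newdata = data.frame(rt_sample = spectra_rt))
adjusted_rt <- spectra_rt - pred_dev
unname(adjusted_rt)
```

Replacing lm() with loess() would give the analogue of smooth = "loess", at the cost of needing more marker features for a stable fit and of not extrapolating beyond the range of the markers by default.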

Usage

adjustRtime(object, param, ...)

## S4 method for signature 'MsExperiment,ObiwarpParam'
adjustRtime(object, param, chunkSize = 2L, BPPARAM = bpparam())

## S4 method for signature 'MsExperiment,PeakGroupsParam'
adjustRtime(object, param, msLevel = 1L, ...)

PeakGroupsParam(
  minFraction = 0.9,
  extraPeaks = 1,
  smooth = "loess",
  span = 0.2,
  family = "gaussian",
  peakGroupsMatrix = matrix(nrow = 0, ncol = 0),
  subset = integer(),
  subsetAdjust = c("average", "previous")
)

ObiwarpParam(
  binSize = 1,
  centerSample = integer(),
  response = 1L,
  distFun = "cor_opt",
  gapInit = numeric(),
  gapExtend = numeric(),
  factorDiag = 2,
  factorGap = 1,
  localAlignment = FALSE,
  initPenalty = 0,
  subset = integer(),
  subsetAdjust = c("average", "previous"),
  rtimeDifferenceThreshold = 5
)

adjustRtimePeakGroups(object, param = PeakGroupsParam(), msLevel = 1L)

## S4 method for signature 'OnDiskMSnExp,ObiwarpParam'
adjustRtime(object, param, msLevel = 1L)

## S4 method for signature 'PeakGroupsParam'
minFraction(object)

## S4 replacement method for signature 'PeakGroupsParam'
minFraction(object) <- value

## S4 method for signature 'PeakGroupsParam'
extraPeaks(object)

## S4 replacement method for signature 'PeakGroupsParam'
extraPeaks(object) <- value

## S4 method for signature 'PeakGroupsParam'
smooth(x)

## S4 replacement method for signature 'PeakGroupsParam'
smooth(object) <- value

## S4 method for signature 'PeakGroupsParam'
span(object)

## S4 replacement method for signature 'PeakGroupsParam'
span(object) <- value

## S4 method for signature 'PeakGroupsParam'
family(object)

## S4 replacement method for signature 'PeakGroupsParam'
family(object) <- value

## S4 method for signature 'PeakGroupsParam'
peakGroupsMatrix(object)

## S4 replacement method for signature 'PeakGroupsParam'
peakGroupsMatrix(object) <- value

## S4 method for signature 'PeakGroupsParam'
subset(x)

## S4 replacement method for signature 'PeakGroupsParam'
subset(object) <- value

## S4 method for signature 'PeakGroupsParam'
subsetAdjust(object)

## S4 replacement method for signature 'PeakGroupsParam'
subsetAdjust(object) <- value

## S4 method for signature 'ObiwarpParam'
binSize(object)

## S4 replacement method for signature 'ObiwarpParam'
binSize(object) <- value

## S4 method for signature 'ObiwarpParam'
centerSample(object)

## S4 replacement method for signature 'ObiwarpParam'
centerSample(object) <- value

## S4 method for signature 'ObiwarpParam'
response(object)

## S4 replacement method for signature 'ObiwarpParam'
response(object) <- value

## S4 method for signature 'ObiwarpParam'
distFun(object)

## S4 replacement method for signature 'ObiwarpParam'
distFun(object) <- value

## S4 method for signature 'ObiwarpParam'
gapInit(object)

## S4 replacement method for signature 'ObiwarpParam'
gapInit(object) <- value

## S4 method for signature 'ObiwarpParam'
gapExtend(object)

## S4 replacement method for signature 'ObiwarpParam'
gapExtend(object) <- value

## S4 method for signature 'ObiwarpParam'
factorDiag(object)

## S4 replacement method for signature 'ObiwarpParam'
factorDiag(object) <- value

## S4 method for signature 'ObiwarpParam'
factorGap(object)

## S4 replacement method for signature 'ObiwarpParam'
factorGap(object) <- value

## S4 method for signature 'ObiwarpParam'
localAlignment(object)

## S4 replacement method for signature 'ObiwarpParam'
localAlignment(object) <- value

## S4 method for signature 'ObiwarpParam'
initPenalty(object)

## S4 replacement method for signature 'ObiwarpParam'
initPenalty(object) <- value

## S4 method for signature 'ObiwarpParam'
subset(x)

## S4 replacement method for signature 'ObiwarpParam'
subset(object) <- value

## S4 method for signature 'ObiwarpParam'
subsetAdjust(object)

## S4 replacement method for signature 'ObiwarpParam'
subsetAdjust(object) <- value

## S4 method for signature 'XCMSnExp,PeakGroupsParam'
adjustRtime(object, param, msLevel = 1L)

## S4 method for signature 'XCMSnExp,ObiwarpParam'
adjustRtime(object, param, msLevel = 1L)

## S4 method for signature 'ObiwarpParam'
factorDiag(object)

## S4 replacement method for signature 'ObiwarpParam'
factorDiag(object) <- value

## S4 method for signature 'ObiwarpParam'
factorGap(object)

## S4 replacement method for signature 'ObiwarpParam'
factorGap(object) <- value

## S4 method for signature 'ObiwarpParam'
localAlignment(object)

## S4 replacement method for signature 'ObiwarpParam'
localAlignment(object) <- value

## S4 method for signature 'ObiwarpParam'
initPenalty(object)

## S4 replacement method for signature 'ObiwarpParam'
initPenalty(object) <- value

## S4 method for signature 'ObiwarpParam'
subset(x)

## S4 replacement method for signature 'ObiwarpParam'
subset(object) <- value

## S4 method for signature 'ObiwarpParam'
subsetAdjust(object)

## S4 replacement method for signature 'ObiwarpParam'
subsetAdjust(object) <- value

## S4 method for signature 'XCMSnExp,PeakGroupsParam'
adjustRtime(object, param, msLevel = 1L)

## S4 method for signature 'XCMSnExp,ObiwarpParam'
adjustRtime(object, param, msLevel = 1L)

Arguments

`object`	For `adjustRtime`: an `MSnbase::OnDiskMSnExp()`, `XCMSnExp()`, `MsExperiment::MsExperiment()` or `XcmsExperiment()` object.
`param`	The parameter object defining the alignment method (and its setting).
`...`	ignored.
`chunkSize`	For `adjustRtime` if `object` is either an `MsExperiment` or `XcmsExperiment`: `integer(1)` defining the number of files (samples) that should be loaded into memory and processed at the same time. Alignment is then performed in parallel (per sample) on this subset of loaded data. This setting thus allows balancing memory demand against speed (due to parallel processing). Because parallel processing can only be performed on the subset of data currently loaded into memory in each iteration, the value for `chunkSize` should match the defined parallel processing setup. A parallel processing setup with 4 CPUs (separate processes) combined with `chunkSize = 1` will not perform any parallel processing, as only the data from one sample is loaded into memory at a time. On the other hand, setting `chunkSize` to the total number of samples in an experiment will load the full MS data into memory and will thus in most settings cause an out-of-memory error.
`BPPARAM`	parallel processing setup. Defaults to `BPPARAM = bpparam()`. See `BiocParallel::bpparam()` for details.
`msLevel`	For `adjustRtime`: `integer(1)` defining the MS level on which the alignment should be performed.
`minFraction`	For `PeakGroupsParam`: `numeric(1)` between 0 and 1 defining the minimum required proportion of samples in which peaks for the peak group were identified. Peak groups passing this criterion will be aligned across samples and retention times of individual spectra will be adjusted based on this alignment. For `minFraction = 1` the peak group has to contain peaks in all samples of the experiment. Note that if `subset` is provided, the specified fraction is relative to the defined subset of samples and not to the total number of samples within the experiment (i.e., a peak has to be present in the specified proportion of subset samples).
`extraPeaks`	For `PeakGroupsParam`: `numeric(1)` defining the maximal number of additional peaks for all samples to be assigned to a peak group (feature) for retention time correction. For a data set with 6 samples, `extraPeaks = 1` uses all peak groups with a total peak count `<= 6 + 1`. The total peak count is the total number of peaks assigned to a peak group and also considers multiple peaks within a sample that are assigned to the group.
`smooth`	For `PeakGroupsParam`: `character(1)` defining the function to be used to interpolate corrected retention times for all peak groups. Can be either `"loess"` or `"linear"`.
`span`	For `PeakGroupsParam`: `numeric(1)` defining the degree of smoothing (if `smooth = "loess"`). This parameter is passed to the internal call to `stats::loess()`.
`family`	For `PeakGroupsParam`: `character(1)` defining the method for loess smoothing. Allowed values are `"gaussian"` and `"symmetric"`. See `stats::loess()` for more information.
`peakGroupsMatrix`	For `PeakGroupsParam`: optional `matrix` of (raw) retention times for the (marker) peak groups on which the alignment should be performed. Each column represents a sample, each row a feature/peak group. The `adjustRtimePeakGroups` method is used by default to determine this matrix on the provided `object`.
`subset`	For `ObiwarpParam` and `PeakGroupsParam`: `integer` with the indices of samples within the experiment on which the alignment models should be estimated. Samples not part of the subset are adjusted based on the closest subset sample. See Subset-based alignment section for details.
`subsetAdjust`	For `ObiwarpParam` and `PeakGroupsParam`: `character(1)` specifying the method with which non-subset samples should be adjusted. Supported options are `"previous"` and `"average"` (default). See Subset-based alignment section for details.
`binSize`	`numeric(1)` defining the bin size (in mz dimension) to be used for the profile matrix generation. See `step` parameter in `profile-matrix` documentation for more details.
`centerSample`	`integer(1)` defining the index of the center sample in the experiment. It defaults to `floor(median(1:length(fileNames(object))))`. Note that if `subset` is used, the index passed with `centerSample` is within these subset samples.
`response`	For `ObiwarpParam`: `numeric(1)` defining the responsiveness of warping with `response = 0` giving linear warping on start and end points and `response = 100` warping using all bijective anchors.
`distFun`	For `ObiwarpParam`: `character(1)` defining the distance function to be used. Allowed values are `"cor"` (Pearson's correlation), `"cor_opt"` (calculate only 10% diagonal band of distance matrix; better runtime), `"cov"` (covariance), `"prd"` (product) and `"euc"` (Euclidean distance). The default value is `distFun = "cor_opt"`.
`gapInit`	For `ObiwarpParam`: `numeric(1)` defining the penalty for gap opening. The default value for `gapInit` depends on the value of `distFun`: for `distFun = "cor"` and `distFun = "cor_opt"` it is `0.3`, for `distFun = "cov"` and `distFun = "prd"` `0.0` and for `distFun = "euc"` `0.9`.
`gapExtend`	For `ObiwarpParam`: `numeric(1)` defining the penalty for gap enlargement. The default value for `gapExtend` depends on the value of `distFun`: for `distFun = "cor"` and `distFun = "cor_opt"` it is `2.4`, for `distFun = "cov"` `11.7`, for `distFun = "euc"` `1.8` and for `distFun = "prd"` `7.8`.
`factorDiag`	For `ObiwarpParam`: `numeric(1)` defining the local weight applied to diagonal moves in the alignment.
`factorGap`	For `ObiwarpParam`: `numeric(1)` defining the local weight for gap moves in the alignment.
`localAlignment`	For `ObiwarpParam`: `logical(1)` whether a local alignment should be performed instead of the default global alignment.
`initPenalty`	For `ObiwarpParam`: `numeric(1)` defining the penalty for initiating an alignment (for local alignment only).
`rtimeDifferenceThreshold`	For `ObiwarpParam`: `numeric(1)` defining the threshold to identify a gap in the sequence of retention times of (MS1) spectra of a sample/file. A gap is defined if the difference in retention times between consecutive spectra is `> rtimeDifferenceThreshold` of the median observed difference of retention times of that data sample/file. Spectra with a retention time after such a gap will not be adjusted. The default for this parameter is `rtimeDifferenceThreshold = 5`. For Waters data with lockmass scans or LC-MS/MS data this threshold might however be too low and should be increased. See also issue #739.
`value`	The value for the slot.
`x`	An `ObiwarpParam`, `PeakGroupsParam` or `LamaParama` object.
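
The `minFraction` and `extraPeaks` criteria described above can be illustrated with a small base R sketch. The count matrix `peak_counts` and all variable names are hypothetical and made up for this illustration; this is not the code xcms runs internally.

```r
## Hypothetical peak counts: rows are peak groups (features), columns samples.
peak_counts <- rbind(
    f1 = c(1, 1, 1, 1, 1, 1),  # one peak in every sample
    f2 = c(1, 1, 0, 1, 0, 0),  # present in only 3 of 6 samples
    f3 = c(2, 1, 1, 1, 1, 1),  # 7 peaks in total: one extra peak
    f4 = c(2, 2, 1, 1, 1, 1)   # 8 peaks in total: two extra peaks
)
min_fraction <- 0.9
extra_peaks <- 1
n_samples <- ncol(peak_counts)

## Fraction of samples with at least one peak in the group.
frac_present <- rowSums(peak_counts > 0) / n_samples
## Total number of peaks assigned to the group.
total_peaks <- rowSums(peak_counts)

## Groups usable for alignment under both criteria.
keep <- frac_present >= min_fraction & total_peaks <= n_samples + extra_peaks
names(keep)[keep]
```

With these toy values only `f1` and `f3` pass: `f2` is found in too few samples and `f4` exceeds the allowed total peak count of `6 + 1`.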

Value

adjustRtime on an OnDiskMSnExp or XCMSnExp object will return an XCMSnExp object with the alignment results.

adjustRtime on an MsExperiment or XcmsExperiment will return an XcmsExperiment with the adjusted retention times stored in a new spectra variable rtime_adjusted in the object's spectra.

ObiwarpParam, PeakGroupsParam and LamaParama return the respective parameter object.

adjustRtimePeakGroups returns a matrix with the retention times of marker features in each sample (each row one feature, each column one sample).

Subset-based alignment

All alignment methods allow the retention time correction to be performed on a user-selected subset of samples (e.g. QC samples), after which all samples not part of that subset are adjusted based on the adjusted retention times of the closest subset sample (close in terms of index within object and hence possibly injection index). It is thus suggested to load MS data files in the order in which their samples were injected in the measurement run(s).

How the non-subset samples are adjusted also depends on the parameter subsetAdjust: with subsetAdjust = "previous", each non-subset sample is adjusted based on the closest previous subset sample, which in most cases results in the adjusted retention times of the non-subset sample being identical to those of the subset sample on which the adjustment is based. The second, default, option is subsetAdjust = "average", in which case each non-subset sample is adjusted based on the average retention time adjustment from the previous and following subset sample. For the average, a weighted mean is used with weights being the inverse of the distance of the non-subset sample to the subset samples used for alignment.
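
The two subsetAdjust options can be sketched in base R as follows. This is only an illustration of the weighting described above, not the xcms implementation: the helper adjust_non_subset(), the sample layout and the retention times are hypothetical, and the sketch assumes that a previous and a following subset sample both exist.

```r
## Hypothetical setup: 5 samples in injection order; samples 1 and 4 form
## the subset on which the alignment was estimated.
subset_idx <- c(1L, 4L)
## Hypothetical adjusted retention times (seconds) of three spectra in the
## two subset samples.
rt_adj <- list(`1` = c(10, 20, 30),
               `4` = c(13, 23, 33))

## Illustrative helper (not part of xcms): adjust non-subset sample `i`.
adjust_non_subset <- function(i, subset_idx, rt_adj,
                              method = c("average", "previous")) {
    method <- match.arg(method)
    prev_idx <- max(subset_idx[subset_idx < i])
    if (method == "previous")
        return(rt_adj[[as.character(prev_idx)]])
    next_idx <- min(subset_idx[subset_idx > i])
    ## Weights are the inverse of the index distance to each subset sample.
    w <- 1 / c(i - prev_idx, next_idx - i)
    (w[1] * rt_adj[[as.character(prev_idx)]] +
     w[2] * rt_adj[[as.character(next_idx)]]) / sum(w)
}

rt_average <- adjust_non_subset(2L, subset_idx, rt_adj, "average")
rt_previous <- adjust_non_subset(2L, subset_idx, rt_adj, "previous")
```

Sample 2 is one position away from sample 1 and two positions away from sample 4, so the weights are 1 and 0.5 and the averaged values lie closer to those of sample 1.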

See also section Alignment of experiments including blanks in the xcms vignette for more details.

Author(s)

Colin Smith, Johannes Rainer, Philippine Louail, Carl Brunius

References

Prince, J. T., and Marcotte, E. M. (2006) "Chromatographic Alignment of ESI-LC-MS Proteomic Data Sets by Ordered Bijective Interpolated Warping" Anal. Chem., 78 (17), 6140-6152.

Smith, C.A., Want, E.J., O'Maille, G., Abagyan, R. and Siuzdak, G. (2006). "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 78:779-787.

Landmark-based alignment: aligning a dataset against an external reference

Description

Alignment is achieved using the 'adjustRtime()' method with a 'param' of class 'LamaParama'. This method corrects retention times by aligning chromatographic data with an external reference dataset.

Chromatographic peaks in the experimental data are first matched to predefined (external) landmark features (lamas) based on their mass-to-charge ratio and retention time. Subsequently, the data is aligned by minimizing the differences in retention times between the matched chromatographic peaks and lamas. This adjustment is performed file by file.

Adjustable parameters such as 'ppm', 'tolerance', and 'toleranceRt' define acceptable deviations during the matching process. It's crucial to note that only lamas and chromatographic peaks exhibiting a one-to-one mapping are considered when estimating retention time shifts. If a file has no peaks matching with lamas, no adjustment will be performed, and the retention times will be returned as-is. Users can evaluate this matching, for example, by checking the number of matches and ranges of the matching peaks, by first running 'matchLamasChromPeaks()'.
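
The matching step can be pictured with the following base R sketch. It is an illustration under stated assumptions, not the xcms code: the helper 'match_lamas()' and the example values are hypothetical, and the sketch assumes that the allowed m/z difference is 'tolerance' plus 'ppm' of the lama m/z.

```r
## Hypothetical lamas (reference) and chromatographic peaks of one file.
lamas <- data.frame(mz = c(200.1000, 300.2000, 400.3000),
                    rt = c(100, 250, 400))
peaks <- data.frame(mz = c(200.1010, 300.2001, 300.2003, 400.3040, 500.0000),
                    rt = c(103, 251, 253, 396, 120))

## Illustrative helper (not part of xcms).
match_lamas <- function(lamas, peaks, ppm = 20, tolerance = 0,
                        toleranceRt = 5) {
    hits <- lapply(seq_len(nrow(lamas)), function(i) {
        ## Assumed rule: allowed m/z difference = tolerance + ppm of lama m/z.
        max_mz_diff <- tolerance + lamas$mz[i] * ppm / 1e6
        which(abs(peaks$mz - lamas$mz[i]) <= max_mz_diff &
              abs(peaks$rt - lamas$rt[i]) <= toleranceRt)
    })
    ## Keep lamas matching exactly one peak ...
    one_peak <- which(lengths(hits) == 1L)
    peak_idx <- unlist(hits[one_peak])
    ## ... and drop peaks that are claimed by more than one lama.
    unique_peak <- !(peak_idx %in% peak_idx[duplicated(peak_idx)])
    data.frame(ref = lamas$rt[one_peak][unique_peak],
               obs = peaks$rt[peak_idx][unique_peak])
}

rt_map <- match_lamas(lamas, peaks)
rt_map
```

In this toy data the second lama matches two peaks and is therefore discarded, leaving two one-to-one pairs of reference ('ref') and observed ('obs') retention times.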

Different warping methods are available; users can choose to fit a *loess* ('method = "loess"', the default) or a *gam* ('method = "gam"') between the reference data points and observed matching ChromPeaks. Additional parameters such as 'span', 'weight', 'outlierTolerance', 'zeroWeight', and 'bs' are specific to these models. These parameters offer flexibility in fine-tuning how the matching chromatographic peaks are fitted to the lamas, thereby generating a model to align the overall retention time for a single file.
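
As a minimal sketch of the loess variant, the code below fits reference against observed retention times with 'stats::loess()' and uses the model to map raw retention times onto the reference time scale. The data are simulated and the variable names are hypothetical; how xcms builds and applies its model internally may differ.

```r
## Hypothetical matched retention times (seconds) for one file:
## `obs` = chromatographic peaks, `ref` = lamas.
obs <- seq(60, 900, length.out = 30)
ref <- 1.01 * obs + 2          # simulated linear drift, illustration only
rt_map <- data.frame(obs = obs, ref = ref)

## Up-weight the first data point (in the spirit of `zeroWeight`).
w <- rep(1, nrow(rt_map))
w[1] <- 10

## Fit ref ~ obs with loess (in the spirit of `method = "loess"` and `span`).
fit <- loess(ref ~ obs, data = rt_map, span = 0.5, weights = w)

## Map raw retention times of the file onto the reference time scale.
## loess() does not extrapolate: values outside range(obs) become NA.
rt_raw <- c(100, 450, 800)
rt_adjusted <- predict(fit, newdata = data.frame(obs = rt_raw))
```

Because 'loess()' returns 'NA' outside the fitted range, retention times before the first or after the last matched peak need separate handling in any real use of such a model.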

Other functions related to this method:

- 'LamaParama()': returns the parameter object for alignment with the 'adjustRtime()' function. It is also the input for the functions listed below.

- 'matchLamasChromPeaks()': quickly matches each file's ChromPeaks to Lamas, allowing the user to evaluate the matches for each file.

- 'summarizeLamaMatch()': generates a summary of the 'LamaParama' method. See below for the details of the return object.

- 'matchedRtimes()': accesses the list of 'data.frame' saved in the 'LamaParama' object, generated by the 'matchLamasChromPeaks()' function.

- 'plot()': plots the chromatographic peaks versus the reference lamas as well as the fitting line for the chosen model type. The user can decide which file to inspect by specifying its index with the parameter 'index'.

Usage

## S4 method for signature 'XcmsExperiment,LamaParama'
adjustRtime(object, param, BPPARAM = bpparam(), ...)

matchLamasChromPeaks(object, param, BPPARAM = bpparam())

summarizeLamaMatch(param)

matchedRtimes(param)

LamaParama(
  lamas = matrix(ncol = 2, nrow = 0, dimnames = list(NULL, c("mz", "rt"))),
  method = c("loess", "gam"),
  span = 0.5,
  outlierTolerance = 3,
  zeroWeight = 10,
  ppm = 20,
  tolerance = 0,
  toleranceRt = 5,
  bs = "tp"
)

## S4 method for signature 'LamaParama,ANY'
plot(
  x,
  index = 1L,
  colPoints = "#00000060",
  colFit = "#00000080",
  xlab = "Matched Chromatographic peaks",
  ylab = "Lamas",
  ...
)

Arguments

`object`	An object of class 'XcmsExperiment' with defined ChromPeaks.
`param`	An object of class 'LamaParama' that will later be used for adjustment using the 'adjustRtime()' function.
`BPPARAM`	For 'matchLamasChromPeaks()': parallel processing setup. Defaults to 'BPPARAM = bpparam()'. See 'bpparam()' for more information.
`...`	For 'plot()': extra parameters to be passed to the function.
`lamas`	For 'LamaParama': 'matrix' or 'data.frame' with the m/z and retention time values of features (as first and second column) from the external dataset on which the alignment will be based.
`method`	For 'LamaParama': 'character(1)' with the type of warping. Either 'method = "gam"' or 'method = "loess"' (default).
`span`	For 'LamaParama': 'numeric(1)' defining the degree of smoothing ('method = "loess"'). This parameter is passed to the internal call to 'loess()'.
`outlierTolerance`	For 'LamaParama': 'numeric(1)' defining the settings for outlier removal during the fitting. By default (with 'outlierTolerance = 3'), all data points with absolute residuals larger than 3 times the mean absolute residual of all data points from the first, initial fit, are removed from the final model fit.
`zeroWeight`	For 'LamaParama': 'numeric(1)' defining the weight of the first data point (i.e. retention times of the first lama-chromatographic peak pair). Values larger than 1 reduce warping problems in the early RT range.
`ppm`	For 'LamaParama': 'numeric(1)' defining the m/z-relative maximal allowed difference in m/z between 'lamas' and chromatographic peaks. Used for the mapping of identified chromatographic peaks and lamas.
`tolerance`	For 'LamaParama': 'numeric(1)' defining the absolute acceptable difference in m/z between lamas and chromatographic peaks. Used for the mapping of identified chromatographic peaks and 'lamas'.
`toleranceRt`	For 'LamaParama': 'numeric(1)' defining the absolute acceptable difference in retention time between lamas and chromatographic peaks. Used for the mapping of identified chromatographic peaks and 'lamas'.
`bs`	For 'LamaParama()': 'character(1)' defining the GAM smoothing method (defaults to thin plate, 'bs = "tp"').
`x`	For 'plot()': object of class 'LamaParama' to be plotted.
`index`	For 'plot()': 'numeric(1)' index of the file that should be plotted.
`colPoints`	For 'plot()': color of the data points.
`colFit`	For 'plot()': color of the fitting line.
`xlab`, `ylab`	For 'plot()': x- and y-axis labels.
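
The outlier rule described for 'outlierTolerance' can be sketched in base R. To keep the example short and deterministic, a straight-line fit with 'lm()' stands in for the loess or gam model; the data and variable names are hypothetical and the sketch is not taken from the xcms sources.

```r
## Hypothetical matched retention times with one obvious mismatch.
obs <- 1:20
ref <- as.numeric(obs)
ref[10] <- ref[10] + 50
outlier_tolerance <- 3

## Initial fit; a straight line stands in for the loess or gam model here.
fit_initial <- lm(ref ~ obs)
abs_res <- abs(residuals(fit_initial))

## Keep points whose absolute residual is at most `outlier_tolerance`
## times the mean absolute residual of the initial fit.
keep <- abs_res <= outlier_tolerance * mean(abs_res)

## Final fit on the retained points only.
fit_final <- lm(ref ~ obs, subset = keep)
which(!keep)
```

Only the tenth pair is flagged here, and the final fit recovers the undisturbed relationship between 'obs' and 'ref'.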

Value

For 'matchLamasChromPeaks()': A 'LamaParama' object with new slot 'rtMap' composed of a list of matrices representing the 1:1 matches between Lamas (ref) and ChromPeaks (obs). To access this, 'matchedRtimes()' can be used.

For 'matchedRtimes()': A list of 'data.frame' representing matches between chromPeaks and 'lamas' for each file.

For 'summarizeLamaMatch()': A 'data.frame' with:

- "Total_peaks": total number of chromatographic peaks in the file.

- "Matched_peaks": The number of matched peaks to Lamas.

- "Total_Lamas": Total number of Lamas.

- "Model_summary": 'summary.loess' or 'summary.gam' object for each file.

Note

If there are no matches when using 'matchLamasChromPeaks()', the retention times of the file will not be adjusted when calling 'adjustRtime()' with the same 'LamaParama' and 'XcmsExperiment' object.

For examples of how to use this method and its functionality, see the vignette.

Author(s)

Carl Brunius, Philippine Louail

Examples

## load test and reference datasets
ref <- loadXcmsData("xmse")
tst <- loadXcmsData("faahko_sub2")

## create lamas input from the reference dataset
library(MsExperiment)
f <- sampleData(ref)$sample_type
f[f == "QC"] <- NA
ref <- filterFeatures(ref, PercentMissingFilter(threshold = 0, f = f))
ref_mz_rt <- featureDefinitions(ref)[, c("mzmed","rtmed")]

## Set up the LamaParama object
param <- LamaParama(lamas = ref_mz_rt, method = "loess", span = 0.5,
                     outlierTolerance = 3, zeroWeight = 10, ppm = 20,
                     tolerance = 0, toleranceRt = 20, bs = "tp")

## input into `adjustRtime()`
tst_adjusted <- adjustRtime(tst, param = param)

## run diagnostic functions to pre-evaluate alignment
param <- matchLamasChromPeaks(tst, param = param)
mtch <- matchedRtimes(param)

## Access summary of matches and model information
summary <- summarizeLamaMatch(param)

## Coverage (percentage of matched peaks) for each file
summary$Matched_peaks / summary$Total_peaks * 100

## Access the information on the model for the first file
summary$Model_summary[[1]]

Replace raw with adjusted retention times

Description

Replaces the raw retention times with the adjusted retention time or returns the object unchanged if none are present.

Usage

applyAdjustedRtime(object)

Arguments

object

An XCMSnExp or XcmsExperiment object.

Details

Adjusted retention times are stored in parallel to the raw retention times in XCMSnExp or XcmsExperiment objects. applyAdjustedRtime replaces the raw (original) retention times with the adjusted retention times.

Value

An XCMSnExp or XcmsExperiment object with the raw (original) retention times being replaced with the adjusted retention time.

Note

Replacing the raw retention times with adjusted retention times makes it impossible to restore the raw retention times using the dropAdjustedRtime() method. This function does not remove the retention time processing step with the settings of the alignment from the processHistory() of the object, so that the processing history is preserved.

Author(s)

Johannes Rainer

Examples


## Load a test data set with detected peaks
library(MSnbase)
data(faahko_sub)
## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

xod <- adjustRtime(faahko_sub, param = ObiwarpParam())

hasAdjustedRtime(xod)

## Replace raw retention times with adjusted retention times.
xod <- applyAdjustedRtime(xod)

## No adjusted retention times present
hasAdjustedRtime(xod)

## Raw retention times have been replaced with adjusted retention times
plot(split(rtime(faahko_sub), fromFile(faahko_sub))[[1]] -
    split(rtime(xod), fromFile(xod))[[1]], type = "l")

## And the process history still contains the settings for the alignment
processHistory(xod)

Automatic parameter for Lock mass fixing `AutoLockMass`

Description

AutoLockMass - This function decides where the lock mass scans are in the xcmsRaw object. This is done by using the scan time differences.

Arguments

object

An xcmsRaw-class object

Value

AutoLockMass: a numeric vector of scan locations corresponding to lock mass scans.

Methods

object = "xcmsRaw": signature(object = "xcmsRaw")

Author(s)

Paul Benton, [email protected]

Examples

## Not run: library(xcms)
    library(faahKO)
    ## These files do not have this problem
    ## to correct for but just for an example
    cdfpath <- system.file("cdf", package = "faahKO")
    cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)
    xr<-xcmsRaw(cdffiles[1])
    xr
    ## Let's assume that the lockmass starts at 1 and is every 100 scans
    lockMass<-xcms:::makeacqNum(xr, freq=100, start=1)
    ## these are equivalent
    lockmass2<-AutoLockMass(xr)
    all((lockMass == lockmass2) == TRUE)

    ob<-stitch(xr, lockMass)

## End(Not run)

XCMSnExp data manipulation methods inherited from MSnbase

Description

The methods listed on this page are XCMSnExp methods inherited from its parent, the OnDiskMSnExp class from the MSnbase package, that alter the raw data or are related to data subsetting. Thus calling any of these methods causes all xcms pre-processing results to be removed from the XCMSnExp object to ensure its data integrity.

bin: bins spectra. See bin documentation in the MSnbase package for more details and examples.

clean: removes unused 0 intensity data points. See clean documentation in the MSnbase package for details and examples.

filterAcquisitionNum: filters the XCMSnExp object keeping only spectra with the provided acquisition numbers. See filterAcquisitionNum for details and examples.

The normalize method performs basic normalization of spectra intensities. See normalize documentation in the MSnbase package for details and examples.

The pickPeaks method performs peak picking. See pickPeaks documentation for details and examples.

The removePeaks method removes mass peaks (intensities) lower than a threshold. Note that these peaks refer to mass peaks, which are different to the chromatographic peaks detected and analyzed in a metabolomics experiment! See removePeaks documentation for details and examples.

The smooth method smooths spectra. See smooth documentation in MSnbase for details and examples.

Usage

## S4 method for signature 'XCMSnExp'
bin(x, binSize = 1L, msLevel.)

## S4 method for signature 'XCMSnExp'
clean(object, all = FALSE, verbose = FALSE, msLevel.)

## S4 method for signature 'XCMSnExp'
filterAcquisitionNum(object, n, file)

## S4 method for signature 'XCMSnExp'
normalize(object, method = c("max", "sum"), ...)

## S4 method for signature 'XCMSnExp'
pickPeaks(
  object,
  halfWindowSize = 3L,
  method = c("MAD", "SuperSmoother"),
  SNR = 0L,
  ...
)

## S4 method for signature 'XCMSnExp'
removePeaks(object, t = "min", verbose = FALSE, msLevel.)

## S4 method for signature 'XCMSnExp'
smooth(
  x,
  method = c("SavitzkyGolay", "MovingAverage"),
  halfWindowSize = 2L,
  verbose = FALSE,
  ...
)

Arguments

`x`	`XCMSnExp` or `OnDiskMSnExp` object.
`binSize`	`numeric(1)` defining the size of a bin (in Dalton).
`msLevel.`	For `bin`, `clean`, `filterMsLevel`, `removePeaks`: `numeric(1)` defining the MS level(s) to which operations should be applied or to which the object should be subsetted.
`object`	`XCMSnExp` or `OnDiskMSnExp` object.
`all`	For `clean`: `logical(1)`, if `TRUE` all zeros are removed.
`verbose`	`logical(1)` whether progress information should be displayed.
`n`	For `filterAcquisitionNum`: `integer` defining the acquisition numbers of the spectra to which the data set should be sub-setted.
`file`	For `filterAcquisitionNum`: `integer` defining the file index within the object to subset the object by file.
`method`	For `normalize`: `character(1)` specifying the normalization method. See `normalize` in the `MSnbase` package for details. For `pickPeaks`: `character(1)` defining the method. See `pickPeaks` for options. For `smooth`: `character(1)` defining the method. See `smooth` in the `MSnbase` package for options and details.
`...`	Optional additional arguments.
`halfWindowSize`	For `pickPeaks` and `smooth`: `integer(1)` defining the window size for the peak picking. See `pickPeaks` and `smooth` in the `MSnbase` package for details and options.
`SNR`	For `pickPeaks`: `numeric(1)` defining the signal to noise ratio to be considered. See `pickPeaks` documentation for details.
`t`	For `removePeaks`: either a `numeric(1)` or `"min"` defining the threshold (method) to be used. See `removePeaks` for details.

Value

For all methods: an XCMSnExp object.

Author(s)

Johannes Rainer

Aggregate values in y for bins defined on x

Description

This function takes two same-sized numeric vectors x and y, bins/cuts x into bins (either a pre-defined number of equal-sized bins or bins of a pre-defined size) and aggregates values in y corresponding to x values falling within each bin. By default (i.e. method = "max") the maximal y value for the corresponding x values is identified. x is expected to be incrementally sorted and, if not, it will be internally sorted (in which case also y will be ordered according to the order of x).
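
The idea can be illustrated with base R alone. The sketch below is not binYonX itself: it uses findInterval() and tapply(), the example vectors are made up, and it assumes that bins include their lower boundary and that only the last bin also includes its upper boundary.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
y <- c(5, 1, 8, 2, 9, 3, 7, 4, 6, 0)

## Five equal-sized bins covering the range of x.
breaks <- seq(min(x), max(x), length.out = 6)

## Assign each x to a bin; the last bin also includes its upper boundary.
bin <- findInterval(x, breaks, rightmost.closed = TRUE)

## Aggregate y per bin with the maximum (mirrors method = "max").
## Empty bins would yield NA.
y_max <- as.numeric(tapply(y, factor(bin, levels = 1:5), max))

## Mid-points of the bins.
x_mid <- (breaks[-1] + breaks[-length(breaks)]) / 2
```

Each bin here holds two consecutive x values, so y_max contains the larger of the two corresponding y values for every bin.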

Usage

binYonX(
  x,
  y,
  breaks,
  nBins,
  binSize,
  binFromX,
  binToX,
  fromIdx = 1L,
  toIdx = length(x),
  method = "max",
  baseValue,
  sortedX = !is.unsorted(x),
  shiftByHalfBinSize = FALSE,
  returnIndex = FALSE,
  returnX = TRUE
)

Arguments

`x`	Numeric vector to be used for binning.
`y`	Numeric vector (same length as `x`) from which the maximum values for each bin should be defined. If not provided, `x` will be used.
`breaks`	Numeric vector defining the breaks for the bins, i.e. the lower and upper values for each bin. See examples below.
`nBins`	integer(1) defining the number of desired bins.
`binSize`	numeric(1) defining the desired bin size.
`binFromX`	Optional numeric(1) allowing to manually specify the range of x-values to be used for binning. This will affect only the calculation of the breaks for the bins (i.e. if `nBins` or `binSize` is provided). If not provided the minimal value in the sub-set `fromIdx`-`toIdx` in input vector `x` will be used.
`binToX`	Same as `binFromX`, but defining the maximum x-value to be used for binning.
`fromIdx`	Integer vector defining the start position of one or multiple sub-sets of input vector `x` that should be used for binning.
`toIdx`	Same as `fromIdx`, but defining the maximum index (or indices) in x to be used for binning.
`method`	A character string specifying the method that should be used to aggregate values in `y`. Allowed are `"max"`, `"min"`, `"sum"` and `"mean"` to identify the maximal or minimal value or to sum all values within a bin or calculate their mean value.
`baseValue`	The base value for empty bins (i.e. bins into which either no values in `x` did fall, or to which only `NA` values in `y` were assigned). By default (i.e. if not specified), `NA` is assigned to such bins.
`sortedX`	Whether `x` is sorted.
`shiftByHalfBinSize`	Logical specifying whether the bins should be shifted by half the bin size to the left. Thus, the first bin will have its center at `fromX` and its lower and upper boundary are `fromX - binSize/2` and `fromX + binSize/2`. This argument is ignored if `breaks` are provided.
`returnIndex`	Logical indicating whether the index of the max (if `method = "max"`) or min (if `method = "min"`) value within each bin in input vector `x` should also be reported. For methods other than `"max"` or `"min"` this argument is ignored.
`returnX`	`logical` allowing to avoid returning `$x`, i.e. the mid-points of the bins. `returnX = FALSE` might be useful in cases where `breaks` are pre-defined as it considerably reduces the memory demand.

Details

The breaks defining the boundary of each bin can either be passed directly to the function with the argument breaks, or are calculated on the data based on arguments nBins or binSize along with fromIdx, toIdx and optionally binFromX and binToX. Arguments fromIdx and toIdx allow to specify subset(s) of the input vector x on which bins should be calculated. By default the full x vector is considered. Also, if not specified otherwise with arguments binFromX and binToX, the range of the bins within each of the sub-sets will be from x[fromIdx] to x[toIdx]. Arguments binFromX and binToX allow to override this by manually defining a range on which the breaks should be calculated. See examples below for more details.

Calculation of breaks: for nBins the breaks correspond to seq(min(x[fromIdx]), max(x[toIdx]), length.out = (nBins + 1)). For binSize the breaks correspond to seq(min(x[fromIdx]), max(x[toIdx]), by = binSize) with the exception that the last break value is forced to be equal to max(x[toIdx]). This ensures that all values from the specified range are covered by the breaks defining the bins. The last bin could however in some instances be slightly larger than binSize. See breaks_on_binSize and breaks_on_nBins for more details.

Value

Returns a list of length 2, the first element (named "x") contains the bin mid-points, the second element (named "y") the aggregated values from input vector y within each bin. For returnIndex = TRUE the list contains an additional element "index" with the index of the max or min (depending on whether method = "max" or method = "min") value within each bin in input vector x.

Note

The function ensures that all values within the range used to define the breaks are considered in the binning (and assigned to a bin). This means that for all bins except the last one values in x have to be >= xlower and < xupper (with xlower and xupper being the lower and upper boundary, respectively). For the last bin the condition is x >= xlower & x <= xupper. Note also that if shiftByHalfBinSize is TRUE the range of values that is used for binning is expanded by binSize (i.e. the lower boundary will be fromX - binSize/2, the upper toX + binSize/2). Setting this argument to TRUE resembles the binning that is/was used in profBin function from xcms < 1.51.

NA handling: by default the function ignores NA values in y (thus inherently assumes na.rm = TRUE). No NA values are allowed in x.
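The interval rules and the NA handling described in this note can be illustrated with base R. The following is only a minimal sketch under stated assumptions, not the implementation used by binYonX: the helper name bin_max_sketch is hypothetical, it supports only the maximum as aggregation, and it assigns NA to empty bins.

```r
## Hypothetical helper (not part of xcms): bin x on pre-defined breaks and
## report the maximal y per bin, ignoring NA values in y.
bin_max_sketch <- function(x, y, breaks) {
    ## All bins are [lower, upper) except the last one, which is [lower, upper].
    bin <- findInterval(x, breaks, rightmost.closed = TRUE)
    ## Values outside the range of the breaks are not assigned to any bin.
    keep <- bin >= 1 & bin < length(breaks) & !is.na(y)
    res <- rep(NA_real_, length(breaks) - 1)
    mx <- tapply(y[keep], bin[keep], max)
    res[as.integer(names(mx))] <- mx
    ## Return bin mid-points together with the aggregated values.
    list(x = breaks[-length(breaks)] + diff(breaks) / 2, y = res)
}

brks <- seq(2, 12, length.out = 6)
bin_max_sketch(x = 1:16, y = 1:16, breaks = brks)
```

Under these rules the aggregated values for the five bins are 3, 5, 7, 9 and 12: the value 12 equals the upper boundary of the last bin and is therefore included in it.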

Author(s)

Johannes Rainer

Examples

########
## Simple example illustrating the breaks and the binning.
##
## Define breaks for 5 bins:
brks <- seq(2, 12, length.out = 6)
## The first bin is then [2,4), the second [4,6) and so on.
brks
## Get the max value falling within each bin.
binYonX(x = 1:16, y = 1:16, breaks = brks)
## Thus, the largest value in x = 1:16 falling into the bin [2,4) (i.e. being
## >= 2 and < 4) is 3, the largest one falling into [4,6) is 5 and so on.
## Note however the function ensures that the minimal and maximal x-value
## (in this example 1 and 12) fall within a bin, i.e. 12 is considered for
## the last bin.

#######
## Performing the binning on a sub-set of x
##
X <- 1:16
## Bin X from element 4 to 10 into 5 bins.
X[4:10]
binYonX(X, X, nBins = 5L, fromIdx = 4, toIdx = 10)
## This defines breaks for 5 bins on the values from 4 to 10 and bins
## the values into these 5 bins. Alternatively, we could manually specify
## the range for the binning, i.e. the minimal and maximal value for the
## breaks:
binYonX(X, X, nBins = 5L, fromIdx = 4, toIdx = 10, binFromX = 1, binToX = 16)
## In this case the breaks for 5 bins were defined from a value 1 to 16 and
## the values 4 to 10 were binned based on these breaks.

#######
## Bin values within a sub-set of x, second example
##
## This example illustrates how the fromIdx and toIdx parameters can be used.
## x defines 3 times the sequence from 1 to 10, while y is the sequence from
## 1 to 30. In this very simple example x is supposed to represent M/Z values
## from 3 consecutive scans and y the intensities measured for each M/Z in
## each scan. We want to get the maximum intensities for M/Z value bins only
## for the second scan, and thus we use fromIdx = 11 and toIdx = 20. The breaks
## for the bins are defined with the nBins, binFromX and binToX.
X <- rep(1:10, 3)
Y <- 1:30
## Bin the M/Z values in the second scan into 5 bins and get the maximum
## intensity for each bin. Note that we have to specify sortedX = TRUE as
## the x and y vectors would be sorted otherwise.
binYonX(X, Y, nBins = 5L, sortedX = TRUE, fromIdx = 11, toIdx = 20)

#######
## Bin in overlapping sub-sets of X
##
## In this example we define overlapping sub-sets of X and perform the binning
## within these.
X <- 1:30
## Define the start and end indices of the sub-sets.
fIdx <- c(2, 8, 21)
tIdx <- c(10, 25, 30)
binYonX(X, nBins = 5L, fromIdx = fIdx, toIdx = tIdx)
## The same, but pre-defining also the desired range of the bins.
binYonX(X, nBins = 5L, fromIdx = fIdx, toIdx = tIdx, binFromX = 4, binToX = 28)
## The same bins are thus used for each sub-set.

Flag features based on the intensity in blank samples

Description

The 'BlankFlag' class and method enable users to flag features of an 'XcmsExperiment' or 'SummarizedExperiment' object based on the relationship between the intensity of a feature in blanks compared to the intensity in the samples.

This class and method are part of the possible dispatch of the generic function 'filterFeatures'. Features *below* ('<') the user-input threshold will be flagged by calling the 'filterFeatures' function. This means that an extra column will be created in 'featureDefinitions' or 'rowData' called 'possible_contaminants' with a logical value for each feature.

Usage

BlankFlag(
  threshold = 2,
  blankIndex = integer(),
  qcIndex = integer(),
  na.rm = TRUE
)

## S4 method for signature 'XcmsResult,BlankFlag'
filterFeatures(object, filter, ...)

## S4 method for signature 'SummarizedExperiment,BlankFlag'
filterFeatures(object, filter, assay = 1)

Arguments

`threshold`	'numeric' indicates the minimum difference required between the mean abundance of a feature in samples compared to the mean abundance of the same feature in blanks for it to not be considered a possible contaminant. For example, the default threshold of 2 signifies that the mean abundance of the features in samples has to be at least twice the mean abundance in blanks for it to not be flagged as a possible contaminant.
`blankIndex`	'integer' (or 'logical') vector corresponding to the indices of blank samples.
`qcIndex`	'integer' (or 'logical') vector corresponding to the indices of quality control (QC) samples.
`na.rm`	'logical' indicates whether missing values ('NA') should be removed prior to the calculations.
`object`	`XcmsExperiment` or `SummarizedExperiment`. For an `XcmsExperiment` object, the `featureValues(object)` will be evaluated, and for a `SummarizedExperiment` the `assay(object, assay)`. The object will be filtered.
`filter`	The parameter object selecting and configuring the type of filtering. It can be one of the following classes: `RsdFilter`, `DratioFilter`, `PercentMissingFilter` or `BlankFlag`.
`...`	Optional parameters. For `object` being an `XcmsExperiment`: parameters for the `featureValues()` call.
`assay`	For filtering of `SummarizedExperiment` objects only. Indicates which assay the filtering will be based on. Note that the features for the entire object will be removed, but the computations are performed on a single assay. Default is 1, which means the first assay of the `object` will be evaluated.

Value

For 'BlankFlag': a 'BlankFlag' class. 'filterFeatures' returns the input object with an added column in the features metadata called 'possible_contaminants' with a logical value for each feature. This is added to 'featureDefinitions' for 'XcmsExperiment' objects and 'rowData' for 'SummarizedExperiment' objects.
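The flagging rule can be sketched in base R on a plain abundance matrix. This is an illustration under assumptions, not the code run by 'filterFeatures': the matrix 'vals' and the variable 'sampleIndex' are hypothetical, and which columns are averaged as "samples" (for example the ones given by 'qcIndex') is an assumption of this sketch.

```r
## Hypothetical feature abundance matrix: rows are features, columns samples.
vals <- rbind(
    FT1 = c(100, 120, 110, 90, 95),
    FT2 = c(500, 520, 480, 40, 60),
    FT3 = c(NA, 30, 34, 20, 22)
)
colnames(vals) <- c("S1", "S2", "S3", "B1", "B2")

threshold <- 2
blankIndex <- 4:5   # columns with the blank samples
sampleIndex <- 1:3  # assumption: columns that are compared against the blanks

mean_samples <- rowMeans(vals[, sampleIndex, drop = FALSE], na.rm = TRUE)
mean_blanks <- rowMeans(vals[, blankIndex, drop = FALSE], na.rm = TRUE)

## Flag features whose mean abundance in samples is below
## threshold * mean abundance in blanks.
possible_contaminants <- mean_samples < threshold * mean_blanks
possible_contaminants
```

With these made-up values FT1 and FT3 are flagged, while FT2 (mean 500 in samples against 50 in blanks) is not.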

Author(s)

Philippine Louail

Generate breaks for binning using a defined bin size.

Description

Defines breaks for binSize sized bins for values ranging from fromX to toX.

Usage

breaks_on_binSize(fromX, toX, binSize)

Arguments

`fromX`	numeric(1) specifying the lowest value for the bins.
`toX`	numeric(1) specifying the largest value for the bins.
`binSize`	numeric(1) defining the size of a bin.

Details

This function creates breaks for bins of size binSize. The function ensures that the full data range is included in the bins, i.e. the last value (upper boundary of the last bin) is always equal to toX. This however means that the size of the last bin will not always be equal to the desired bin size. See examples for more details and a comparison to R's seq function.
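One way to obtain breaks that always end exactly at toX can be sketched in base R. The helper breaks_sketch below is hypothetical and is not the implementation of breaks_on_binSize; in particular, deriving the number of bins by rounding the ratio of range to bin size is an assumption of this sketch, so its output may differ from that of the package function.

```r
## Hypothetical helper (not the xcms implementation): breaks of approximately
## binSize that always end exactly at toX.
breaks_sketch <- function(fromX, toX, binSize) {
    ## Assumption: the number of bins is the rounded ratio of range to binSize.
    nBins <- max(1L, as.integer(round((toX - fromX) / binSize)))
    ## Regular lower boundaries, with toX as the final upper boundary.
    c(fromX + (seq_len(nBins) - 1) * binSize, toX)
}

b1 <- breaks_sketch(1, 10, 0.13)
tail(diff(b1), 1)   # last bin is wider than 0.13
b2 <- breaks_sketch(1, 10, 0.51)
tail(diff(b2), 1)   # last bin is narrower than 0.51
```

In both cases the last break equals toX, and only the last bin deviates from the requested size.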

Value

A numeric vector defining the lower and upper bounds of the bins.

Author(s)

Johannes Rainer

Examples

## Define breaks with a size of 0.13 for a data range from 1 to 10:
breaks_on_binSize(1, 10, 0.13)
## The size of the last bin is however larger than 0.13:
diff(breaks_on_binSize(1, 10, 0.13))
## If we would use seq, the max value would not be included:
seq(1, 10, by = 0.13)

## In the next example we use binSize that leads to an additional last bin with
## a smaller binSize:
breaks_on_binSize(1, 10, 0.51)
## Again, the max value is included, but the size of the last bin is < 0.51.
diff(breaks_on_binSize(1, 10, 0.51))
## Using just seq would result in the following bin definition:
seq(1, 10, by = 0.51)
## Thus it defines one bin (break) less.

Generate breaks for binning

Description

Calculate breaks for same-sized bins for data values from fromX to toX.

Usage

breaks_on_nBins(fromX, toX, nBins, shiftByHalfBinSize = FALSE)

Arguments

`fromX`	numeric(1) specifying the lowest value for the bins.
`toX`	numeric(1) specifying the largest value for the bins.
`nBins`	numeric(1) defining the number of bins.
`shiftByHalfBinSize`	Logical indicating whether the bins should be shifted left by half bin size. This results in centered bins, i.e. the first bin being centered at `fromX` and the last around `toX`.

Details

This generates breaks such as a call to seq(fromX, toX, length.out = nBins + 1) would. The first and second elements in the result vector thus define the lower and upper boundary of the first bin, the second and third values those of the second bin and so on.

Value

A numeric vector of length nBins + 1 defining the lower and upper bounds of the bins.

Author(s)

Johannes Rainer

Examples

## Create breaks to bin values from 3 to 20 into 20 bins
breaks_on_nBins(3, 20, nBins = 20)
## The same call but using shiftByHalfBinSize
breaks_on_nBins(3, 20, nBins = 20, shiftByHalfBinSize = TRUE)

Combine xcmsSet objects

Description

Combines the samples and peaks from multiple xcmsSet objects into a single object. Group and retention time correction data are discarded. The profinfo list is set to be equal to the first object.

Arguments

`xs1`	`xcmsSet` object
`...`	`xcmsSet` objects

Value

An xcmsSet object.

Methods

xs1 = "xcmsSet": c(xs1, ...)

Author(s)

Colin A. Smith, [email protected]

Calibrant mass based calibration of chromatographic peaks

Description

Calibrate peaks using mz values of known masses/calibrants. mz values of identified peaks are adjusted based on peaks that are close to the provided mz values. See details below for more information.

The isCalibrated function returns TRUE if chromatographic peaks of the XCMSnExp object x were calibrated and FALSE otherwise.

Usage

CalibrantMassParam(
  mz = list(),
  mzabs = 1e-04,
  mzppm = 5,
  neighbors = 3,
  method = "linear"
)

isCalibrated(object)

## S4 method for signature 'XCMSnExp'
calibrate(object, param)

Arguments

`mz`	a `numeric` or `list` of `numeric` vectors with reference mz values. If a `numeric` vector is provided, this is used for each sample in the `XCMSnExp` object. If a `list` is provided, its length has to be equal to the number of samples in the experiment.
`mzabs`	`numeric(1)` the absolute error/deviation for matching peaks to calibrants (in Da).
`mzppm`	`numeric(1)` the relative error for matching peaks to calibrants in ppm (parts per million).
`neighbors`	`integer(1)` with the maximal number of peaks within the permitted distance to the calibrants that are considered. Among these the mz value of the peak with the largest intensity is used in the calibration function estimation.
`method`	`character(1)` defining the method that should be used to estimate the calibration function. Can be `"shift"`, `"linear"` (default) or `"edgeshift"`.
`object`	An XCMSnExp object.
`param`	The `CalibrantMassParam` object with the calibration settings.

Details

The method first identifies peaks that are close to the provided mz values and, given that their difference to the calibrants is smaller than the user provided cut off (based on arguments mzabs and mzppm), their mz values are replaced with the provided mz values. The mz values of all other peaks are either globally shifted (for method = "shift") or estimated by a linear model through all calibrants. Peaks are considered close to a calibrant mz if the difference between the calibrant and its mz is <= mzabs + mz * mzppm / 1e6.
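The matching criterion can be illustrated with base R. The values below are made up, and the sketch makes two simplifying assumptions: the tolerance is computed from the calibrant m/z, and the closest peak is used for each calibrant (the neighbors argument, which selects the most intense among several candidate peaks, is not modelled).

```r
## Hypothetical peak m/z values and calibrant m/z values.
peak_mz <- c(100.0003, 200.0020, 300.0010, 400.0500)
calibrants <- c(100, 200, 300)
mzabs <- 1e-04
mzppm <- 5

## Permitted deviation for each calibrant (assumption: based on calibrant m/z).
tol <- mzabs + calibrants * mzppm / 1e6

## For each calibrant: index of the closest peak and whether it is close enough.
closest <- vapply(calibrants, function(z) which.min(abs(peak_mz - z)),
                  integer(1))
is_close <- abs(peak_mz[closest] - calibrants) <= tol
data.frame(calibrant = calibrants, peak = peak_mz[closest], tol, is_close)
```

Here the peaks near 100 and 300 are within the permitted deviation, while the peak at 200.0020 deviates by 0.002, which exceeds the tolerance of 0.0011 for that calibrant.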

Adjustment methods: the adjustment function/factor is estimated using the difference between calibrant and peak mz values only for peaks that are close enough to the calibrants. The available methods are:

shift: shifts the m/z of each peak by a global factor which corresponds to the average difference between peak mz and calibrant mz.
linear: fits a linear model through the differences between calibrant and peak mz values and adjusts the mz values of all peaks using this.
edgeshift: performs same adjustment as linear for peaks that are within the mz range of the calibrants and shift outside of it.
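The difference between the shift and linear adjustments can be sketched with base R. This is a hedged illustration on made-up values, not the code used by calibrate; in particular, modelling the difference as a function of the peak m/z with lm is an assumption of this sketch.

```r
## Hypothetical matched pairs of peak and calibrant m/z values.
matched_peak <- c(100.0010, 200.0020, 300.0030)
matched_cal <- c(100, 200, 300)
## Hypothetical m/z values of other peaks that should be adjusted.
all_peaks <- c(150.0015, 250.0025)

## "shift": subtract the average difference between peak and calibrant m/z.
shift <- mean(matched_peak - matched_cal)
adj_shift <- all_peaks - shift

## "linear": model the difference as a linear function of the peak m/z
## and subtract the predicted difference.
fit <- lm(d ~ mz, data = data.frame(d = matched_peak - matched_cal,
                                    mz = matched_peak))
adj_linear <- all_peaks - predict(fit, newdata = data.frame(mz = all_peaks))

adj_shift
adj_linear
```

Because the deviation in this example grows proportionally with m/z, the global shift over- or under-corrects individual peaks, whereas the linear model brings both peaks back to approximately 150 and 250.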

For more information, details and examples refer to the xcms-direct-injection vignette.

Value

For CalibrantMassParam: a CalibrantMassParam instance. For calibrate: an XCMSnExp object with chromatographic peaks being calibrated. Be aware that the actual raw mz values are not (yet) calibrated, but only the identified chromatographic peaks.

The CalibrantMassParam function returns an instance of the CalibrantMassParam class with all settings and properties set.

The calibrate method returns an XCMSnExp object with the chromatographic peaks being calibrated. Note that only the detected peaks are calibrated, but not the individual mz values in each spectrum.

Note

CalibrantMassParam classes don't have exported getter or setter methods.

Author(s)

Joachim Bargsten, Johannes Rainer

Calibrate peaks for correcting imprecise m/z values

Description

Calibrate peaks of a xcmsSet via a set of known masses

Arguments

`object`	a `xcmsSet` object with uncalibrated mz
`calibrants`	a vector or a list of vectors with reference m/z-values
`method`	the used calibrating-method, see below
`mzppm`	the relative error used for matching peaks in ppm (parts per million)
`mzabs`	the absolute error used for matching peaks in Da
`neighbours`	the number of neighbours from which the one with the highest intensity is used (instead of the nearest)
`plotres`	can be set to TRUE to produce a result plot showing the found m/z values with their distances and the regression

Value

`object`	a `xcmsSet` with one or more samples
`calibrants`	for each sample different calibrants can be used, if a list of m/z-vectors is given. The length of the list must be the same as the number of samples, alternatively a single vector of masses can be given which is used for all samples.
`method`	"shift" for shifting each m/z, "linear" does a linear regression and adds a linear term to each m/z. "edgeshift" does a linear regression within the range of the mz-calibrants and a shift outside.

Methods

object = "xcmsSet": calibrate(object, calibrants,method="linear", mzabs=0.0001, mzppm=5, neighbours=3, plotres=FALSE)

Extracting chromatograms

Description

chromatogram: extract chromatographic data (such as an extracted ion chromatogram, a base peak chromatogram or total ion chromatogram) from an OnDiskMSnExp or XCMSnExp object. See also the help page of the chromatogram function in the MSnbase package.

Usage

## S4 method for signature 'XCMSnExp'
chromatogram(
  object,
  rt,
  mz,
  aggregationFun = "sum",
  missing = NA_real_,
  msLevel = 1L,
  BPPARAM = bpparam(),
  adjustedRtime = hasAdjustedRtime(object),
  filled = FALSE,
  include = c("apex_within", "any", "none"),
  ...
)

Arguments

`object`	Either an OnDiskMSnExp or XCMSnExp object from which the chromatograms should be extracted.
`rt`	`numeric(2)` or two-column `matrix` defining the lower and upper boundary for the retention time range(s). If not specified, the full retention time range of the original data will be used.
`mz`	`numeric(2)` or two-column `matrix` defining the lower and upper mz value for the MS data slice(s). If not specified, the chromatograms will be calculated on the full mz range.
`aggregationFun`	`character(1)` specifying the function to be used to aggregate intensity values across the mz value range for the same retention time. Allowed values are `"sum"` (the default), `"max"`, `"mean"` and `"min"`.
`missing`	`numeric(1)` allowing to specify the intensity value to be used if for a given retention time no signal was measured within the mz range of the corresponding scan. Defaults to `NA_real_` (see also Details and Notes sections below). Use `missing = 0` to resemble the behaviour of the `getEIC` from the old user interface.
`msLevel`	`integer(1)` specifying the MS level from which the chromatogram should be extracted. Defaults to `msLevel = 1L`.
`BPPARAM`	Parallelisation backend to be used, which will depend on the architecture. Default is `BiocParallel::bpparam()`.
`adjustedRtime`	For `chromatogram,XCMSnExp`: whether the adjusted (`adjustedRtime = TRUE`) or raw retention times (`adjustedRtime = FALSE`) should be used for filtering and returned in the resulting MChromatograms object. Adjusted retention times are used by default if available.
`filled`	`logical(1)` whether filled-in peaks should also be returned. Defaults to `filled = FALSE`, i.e. returns only detected chromatographic peaks in the result object.
`include`	`character(1)` defining which chromatographic peaks should be returned. Supported are `include = "apex_within"` (the default) which returns chromatographic peaks that have their apex within the `mz` and `rt` range, `include = "any"` to return all chromatographic peaks whose m/z and rt ranges overlap the `mz` and `rt` ranges, or `include = "none"` to not include any chromatographic peaks.
`...`	optional parameters - currently ignored.

Details

Arguments rt and mz allow to specify the MS data slice (i.e. the m/z range and retention time window) from which the chromatogram should be extracted. These parameters can be either a numeric of length 2 with the lower and upper limit, or a matrix with two columns with the lower and upper limits to extract multiple EICs at once. The parameter aggregationFun allows to specify the function to be used to aggregate the intensities across the m/z range for the same retention time. Setting aggregationFun = "sum" would e.g. allow to calculate the total ion chromatogram (TIC), aggregationFun = "max" the base peak chromatogram (BPC).

If for a given retention time no intensity is measured in that spectrum a NA intensity value is returned by default. This can be changed with the parameter missing, setting missing = 0 would result in a 0 intensity being returned in these cases.
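The aggregation and the handling of missing signal can be sketched in base R on a small table of mass peaks. This is an illustration only: the data frame ms and the helper eic_sketch are hypothetical, and the helper takes the aggregation function itself rather than its name as a character string.

```r
## Hypothetical centroided MS data: one row per mass peak.
ms <- data.frame(
    rt = c(1, 1, 1, 2, 2, 3, 3),
    mz = c(334.9, 335.0, 400.0, 335.1, 500.0, 250.0, 600.0),
    intensity = c(10, 50, 7, 20, 3, 8, 9)
)

## Hypothetical helper: aggregate intensities within an m/z range for each
## retention time; retention times without signal in the range get `missing`.
eic_sketch <- function(ms, mz, aggregationFun = sum, missing = NA_real_) {
    rts <- sort(unique(ms$rt))
    in_range <- ms$mz >= mz[1] & ms$mz <= mz[2]
    vapply(rts, function(r) {
        ints <- ms$intensity[in_range & ms$rt == r]
        if (length(ints)) aggregationFun(ints) else missing
    }, numeric(1))
}

eic_sketch(ms, mz = c(334.8, 335.2))                        # sum per scan
eic_sketch(ms, mz = c(334.8, 335.2), aggregationFun = max)  # max per scan
eic_sketch(ms, mz = c(334.8, 335.2), missing = 0)           # 0 instead of NA
```

The third scan contains no mass peak within the m/z range, so its intensity is NA by default and 0 with missing = 0.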

Value

chromatogram returns a XChromatograms object with the number of columns corresponding to the number of files in object and number of rows the number of specified ranges (i.e. number of rows of matrices provided with arguments mz and/or rt). All chromatographic peaks with their apex position within the m/z and retention time range are also retained as well as all feature definitions for these peaks.

Note

For XCMSnExp objects, if adjusted retention times are available, the chromatogram method will by default report and use these (for the subsetting based on the provided parameter rt). This can be changed by setting adjustedRtime = FALSE.

Author(s)

Johannes Rainer

Examples


## Load a test data set with identified chromatographic peaks
library(MSnbase)
data(faahko_sub)
## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

## Extract the ion chromatogram for one chromatographic peak in the data.
chrs <- chromatogram(faahko_sub, rt = c(2700, 2900), mz = 335)

chrs

## Identified chromatographic peaks
chromPeaks(chrs)

## Plot the chromatogram
plot(chrs)

## Extract chromatograms for multiple ranges.
mzr <- matrix(c(335, 335, 344, 344), ncol = 2, byrow = TRUE)
rtr <- matrix(c(2700, 2900, 2600, 2750), ncol = 2, byrow = TRUE)
chrs <- chromatogram(faahko_sub, mz = mzr, rt = rtr)

chromPeaks(chrs)

plot(chrs)

## Get access to all chromatograms for the first mz/rt range
chrs[1, ]

## Plot just that one
plot(chrs[1, , drop = FALSE])

Extract an ion chromatogram for each chromatographic peak

Description

Extract an ion chromatogram (EIC) for each chromatographic peak in an XcmsExperiment() object. The result is returned as an XChromatograms() of length equal to the number of chromatographic peaks (and one column).

Usage

chromPeakChromatograms(object, ...)

## S4 method for signature 'XcmsExperiment'
chromPeakChromatograms(
  object,
  expandRt = 0,
  expandMz = 0,
  aggregationFun = "max",
  peaks = character(),
  return.type = c("XChromatograms", "MChromatograms"),
  ...,
  progressbar = TRUE
)

Arguments

`object`	An `XcmsExperiment()` with identified chromatographic peaks.
`...`	currently ignored.
`expandRt`	`numeric(1)` to optionally expand the retention time range from which the signal should be integrated. The chromatogram will contain signal from `chromPeaks[, "rtmin"] - expandRt` to `chromPeaks[, "rtmax"] + expandRt`. The default is `expandRt = 0`.
`expandMz`	`numeric(1)` to optionally expand the m/z range from which the signal should be integrated. The chromatogram will contain signal from `chromPeaks[, "mzmin"] - expandMz` to `chromPeaks[, "mzmax"] + expandMz`. The default is `expandMz = 0`.
`aggregationFun`	`character(1)` defining the function how signals within the m/z range in each spectrum (i.e. for each discrete retention time) should be aggregated. The default (`aggregationFun = "max"`) reports the largest signal for each spectrum.
`peaks`	optional `character` providing the IDs of the chromatographic peaks (i.e. the row names of the peaks in `chromPeaks(object)`) for which chromatograms should be returned.
`return.type`	`character(1)` specifying the type of the returned object. Can be either `return.type = "XChromatograms"` (the default) or `return.type = "MChromatograms"` to return either a chromatographic object with or without the identified chromatographic peaks, respectively.
`progressbar`	`logical(1)` whether the progress of the extraction process should be displayed.
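The effect of `expandRt` and `expandMz` on the extraction region can be sketched with base R. The matrix below only mimics the relevant columns of a `chromPeaks` matrix; its values and peak IDs are made up.

```r
## Hypothetical chromPeaks-like matrix with two peaks.
pks <- cbind(mzmin = c(335.0, 344.0), mzmax = c(335.1, 344.2),
             rtmin = c(2700, 2600), rtmax = c(2750, 2680))
rownames(pks) <- c("CP01", "CP02")
expandRt <- 5
expandMz <- 0.01

## Region from which the signal of each peak would be extracted.
region <- cbind(
    mzmin = pks[, "mzmin"] - expandMz,
    mzmax = pks[, "mzmax"] + expandMz,
    rtmin = pks[, "rtmin"] - expandRt,
    rtmax = pks[, "rtmax"] + expandRt
)
region
```

Each region is widened symmetrically, so `expandRt = 5` adds 5 seconds before `rtmin` and 5 seconds after `rtmax`.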

Author(s)

Johannes Rainer

Examples


## Load a test data set with detected peaks
library(MSnbase)
library(xcms)
library(MsExperiment)
faahko_sub <- loadXcmsData("faahko_sub2")

## Get EICs for every detected chromatographic peak
chrs <- chromPeakChromatograms(faahko_sub)
chrs

## Order of EICs matches the order in chromPeaks
chromPeaks(faahko_sub) |> head()

## variable "sample_index" provides the index of the sample the EIC was
## extracted from
fData(chrs)$sample_index

## Get the EIC for selected peaks only.
pks <- rownames(chromPeaks(faahko_sub))[c(6, 12)]
pks

## Expand the data on retention time dimension by 5 seconds (on each side)
res <- chromPeakChromatograms(faahko_sub, peaks = pks, expandRt = 5)
plot(res[1, ])

Extract spectra associated with chromatographic peaks

Description

Extract (MS1 or MS2) spectra from an XcmsExperiment or XCMSnExp object for identified chromatographic peaks. To return spectra for selected chromatographic peaks, their peak ID (i.e., row name in the chromPeaks matrix) can be provided with parameter peaks. For msLevel = 1L (only supported for return.type = "Spectra" or return.type = "List") MS1 spectra within the retention time boundaries (in the file in which the peak was detected) are returned. For msLevel = 2L MS2 spectra are returned for a chromatographic peak if their retention time and precursor m/z are within the retention time and m/z range of the chromatographic peak. Parameter method allows to define whether all or a single spectrum should be returned:

method = "all" (default): return all spectra for each chromatographic peak.
method = "closest_rt": return the spectrum with the retention time closest to the peak's retention time (at apex).
method = "closest_mz": return the spectrum with the precursor m/z closest to the peak's m/z (at apex); only supported for msLevel > 1.
method = "largest_tic": return the spectrum with the largest total signal (sum of peak intensities).
method = "largest_bpi": return the spectrum with the largest peak intensity (maximal peak intensity).
method = "signal": only for XCMSnExp objects: return the spectrum with the sum of intensities most similar to the peak's apex signal ("maxo"); only supported for msLevel = 2L.

Parameter return.type specifies the type of the result object. With return.type = "Spectra" (the default) a Spectra object with all matching spectra is returned. With return.type = "List" a List of Spectra is returned. The length of the list is equal to the number of rows of chromPeaks. Each element of the list thus contains a Spectra with all spectra for one chromatographic peak (or a Spectra of length 0 if no spectrum was found for the respective chromatographic peak).

Parameter chromPeakColumns allows the user to add specific metadata columns from the chromatographic peaks (chromPeaks) to the returned spectra object. This can be useful to keep information such as retention time (rt) and m/z (mz). The columns are named as in the chromPeaks object, with the prefix "chrom_peak_". The peak ID (i.e., the row name of the peak in the chromPeaks matrix) is always added to the spectra object as a metadata column named "chrom_peak_id".

See also the LC-MS/MS data analysis vignette for more details and examples.
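
The selection rules of parameter method can be illustrated on a toy table of candidate spectra. The following base R sketch is not the xcms implementation: the data frame, its column names and the helper `select_spectrum()` are hypothetical and only mimic the rules listed above for a single chromatographic peak.

```r
## Toy table of candidate MS2 spectra for one chromatographic peak.
## All values are invented for illustration.
candidates <- data.frame(
    rtime        = c(100.2, 101.5, 103.1),
    precursor_mz = c(250.101, 250.098, 250.110),
    tic          = c(5200, 8100, 4300),   # sum of peak intensities
    bpi          = c(900, 700, 1500)      # maximal peak intensity
)
peak_rt <- 101.0    # retention time of the peak apex
peak_mz <- 250.100  # m/z of the peak apex

## Hypothetical helper: returns the row index (or indices) of the selected
## spectra for the given selection method.
select_spectrum <- function(x, method, rt, mz) {
    switch(method,
           all         = seq_len(nrow(x)),
           closest_rt  = which.min(abs(x$rtime - rt)),
           closest_mz  = which.min(abs(x$precursor_mz - mz)),
           largest_tic = which.max(x$tic),
           largest_bpi = which.max(x$bpi),
           stop("unsupported method: ", method))
}

select_spectrum(candidates, "closest_rt", peak_rt, peak_mz)
select_spectrum(candidates, "largest_bpi", peak_rt, peak_mz)
```

Each single-spectrum method reduces the candidates to one row, while "all" keeps every candidate.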

Usage

chromPeakSpectra(object, ...)

## S4 method for signature 'XcmsExperiment'
chromPeakSpectra(
  object,
  method = c("all", "closest_rt", "closest_mz", "largest_tic", "largest_bpi"),
  msLevel = 2L,
  expandRt = 0,
  expandMz = 0,
  ppm = 0,
  skipFilled = FALSE,
  peaks = character(),
  chromPeakColumns = c("rt", "mz"),
  return.type = c("Spectra", "List"),
  BPPARAM = bpparam()
)

## S4 method for signature 'XCMSnExp'
chromPeakSpectra(
  object,
  msLevel = 2L,
  expandRt = 0,
  expandMz = 0,
  ppm = 0,
  method = c("all", "closest_rt", "closest_mz", "signal", "largest_tic", "largest_bpi"),
  skipFilled = FALSE,
  return.type = c("Spectra", "MSpectra", "List", "list"),
  peaks = character()
)

Arguments

`object`	XcmsExperiment or XCMSnExp object with identified chromatographic peaks for which spectra should be returned.
`...`	ignored.
`method`	`character(1)` specifying which spectra to include in the result. Defaults to `method = "all"`. See function description for details.
`msLevel`	`integer(1)` defining the MS level of the spectra that should be returned.
`expandRt`	`numeric(1)` to expand the retention time range of each peak by a constant value on each side.
`expandMz`	`numeric(1)` to expand the m/z range of each peak by a constant value on each side.
`ppm`	`numeric(1)` to expand the m/z range of each peak (on each side) by a value dependent on the peak's m/z.
`skipFilled`	`logical(1)` whether spectra for filled-in peaks should be reported or not.
`peaks`	`character`, `logical` or `integer` specifying a subset of chromatographic peaks in `chromPeaks` for which spectra should be returned (providing either their ID, a logical vector of the same length as `nrow(chromPeaks(object))` or their index in `chromPeaks(object)`). This parameter overrides `skipFilled`.
`chromPeakColumns`	`character` vector with the names of the columns from `chromPeaks` that should be added to the returned spectra object. The columns will be named as they are written in the `chromPeaks` object with a prefix `"chrom_peak_"`. Defaults to `c("rt", "mz")`.
`return.type`	`character(1)` defining the type of result object that should be returned.
`BPPARAM`	parallel processing setup. Defaults to `BiocParallel::bpparam()`.
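
How expandRt, expandMz and ppm widen the search window of a peak can be sketched in base R. This is an illustration under the assumption that the ppm value is converted to an absolute m/z offset (peak m/z times ppm / 1e6) and added to expandMz on each side; the peak boundaries and the helper `expand_peak()` are hypothetical.

```r
## Boundaries of one hypothetical chromatographic peak.
peak <- c(mz = 300, mzmin = 299.99, mzmax = 300.01, rtmin = 120, rtmax = 135)

## Hypothetical helper: widen the m/z and retention time range of a peak.
expand_peak <- function(pk, expandRt = 0, expandMz = 0, ppm = 0) {
    ## ppm becomes an absolute m/z offset that depends on the peak's m/z
    mz_offset <- expandMz + pk[["mz"]] * ppm / 1e6
    c(mzmin = pk[["mzmin"]] - mz_offset,
      mzmax = pk[["mzmax"]] + mz_offset,
      rtmin = pk[["rtmin"]] - expandRt,
      rtmax = pk[["rtmax"]] + expandRt)
}

expanded <- expand_peak(peak, expandRt = 2, expandMz = 0.005, ppm = 10)
expanded
```

With these settings the m/z range grows by 0.008 on each side (0.005 constant plus 0.003 from 10 ppm at m/z 300) and the retention time range by 2 seconds on each side.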

Value

Parameter return.type specifies the type of the returned object:

return.type = "Spectra" (default): a Spectra object (defined in the Spectra package). The result contains all spectra for all peaks. Metadata column "peak_id" provides the ID of the respective peak (i.e. its rowname in chromPeaks()).
return.type = "List": a List of length equal to the number of chromatographic peaks, each element being a Spectra with the spectra for one chromatographic peak.

For backward compatibility, options "MSpectra" and "list" are also supported but are not recommended.

return.type = "MSpectra" (deprecated): a MSpectra object with elements being Spectrum objects. The result object contains all spectra for all peaks. Metadata column "peak_id" provides the ID of the respective peak (i.e. its rowname in chromPeaks()).
return.type = "list": list of lists that are either of length 0 or contain Spectrum2 object(s) within the m/z-rt range. The length of the list matches the number of peaks.

Author(s)

Johannes Rainer

Examples


## Read a file with DDA LC-MS/MS data
library(xcms)
library(MsExperiment)
fl <- system.file("TripleTOF-SWATH/PestMix1_DDA.mzML", package = "msdata")

dda <- readMsExperiment(fl)

## Perform MS1 peak detection
dda <- findChromPeaks(dda, CentWaveParam(peakwidth = c(5, 15),
    prefilter = c(5, 1000)))

## Return all MS2 spectra for each chromatographic peak as a Spectra object
ms2_sps <- chromPeakSpectra(dda)
ms2_sps

## Spectra variable *chrom_peak_id* contains the row names of the peaks in the
## chromPeaks matrix and thus allows mapping chromatographic peaks to the
## returned MS2 spectra
ms2_sps$chrom_peak_id
chromPeaks(dda)

## Alternatively, return the result as a List of Spectra objects. This list
## is parallel to chromPeaks hence the mapping between chromatographic peaks
## and MS2 spectra is easier.
ms2_sps <- chromPeakSpectra(dda, return.type = "List")
names(ms2_sps)
rownames(chromPeaks(dda))
ms2_sps[[1L]]

## Parameter `msLevel` allows to define from which MS level spectra should
## be returned. By default `msLevel = 2L` but with `msLevel = 1L` all
## MS1 spectra with a retention time within the retention time range of
## a chromatographic peak can be returned. Alternatively, selected
## spectra can be returned by specifying the selection criteria/method
## with the `method` parameter. Below we extract for each chromatographic
## peak the MS1 spectra with a retention time closest to the
## chromatographic peak's apex position. Alternatively it would also be
## possible to select the spectrum with the highest total signal or
## highest (maximal) intensity.
ms1_sps <- chromPeakSpectra(dda, msLevel = 1L, method = "closest_rt")
ms1_sps

## Parameter `peaks` allows extracting spectra for specific peaks only. It
## can be either an `integer` with the index of the peak in the `chromPeaks`
## matrix or a `character` with its rowname in `chromPeaks`.
chromPeakSpectra(dda, msLevel = 1L, method = "closest_rt", peaks = c(3, 5))

Chromatographic peak summaries

Description

The chromPeakSummary() method calculates summary statistics or other metrics for each of the identified chromatographic peaks in an xcms result object, such as an XcmsExperiment(). The metric to calculate is selected and configured with a dedicated parameter class. The method returns a matrix or data.frame with one row per chromatographic peak; each column contains calculated values, depending on the parameter class used.

Currently implemented methods/parameter classes are:

BetaDistributionParam: calculates the beta_cor and beta_snr quality metrics as described in Kumler 2023 representing the result from a (correlation) test of similarity (using Pearson's correlation coefficient) to a bell curve and the signal-to-noise ratio calculated on the residuals of this test.
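
The idea behind the two metrics can be sketched in base R on a simulated peak. This is only an illustration and not the xcms code: the beta shape parameters (5, 5), the residual-based signal-to-noise definition and all variable names are assumptions of this sketch.

```r
## Simulated peak: a slightly noisy bell-shaped intensity profile.
set.seed(42)
rt  <- seq(200, 230, length.out = 31)
int <- 1e4 * dnorm(rt, mean = 215, sd = 4) + rnorm(31, sd = 5)

## Rescale retention times to [0, 1], the support of the beta distribution.
rt01 <- (rt - min(rt)) / (max(rt) - min(rt))

## Idealized bell curve. Shape parameters (5, 5) give a symmetric curve and
## are an assumption of this sketch.
bell <- dbeta(rt01, shape1 = 5, shape2 = 5)

## beta_cor-like metric: Pearson correlation between peak and bell curve.
beta_cor <- cor(int, bell, method = "pearson")

## beta_snr-like metric: intensity range of the peak divided by the standard
## deviation of the residuals, after scaling the bell curve to the peak's range.
scale01     <- function(x) (x - min(x)) / (max(x) - min(x))
bell_scaled <- min(int) + scale01(bell) * (max(int) - min(int))
beta_snr    <- (max(int) - min(int)) / sd(int - bell_scaled)

c(beta_cor = beta_cor, beta_snr = beta_snr)
```

A well-shaped peak yields a correlation close to 1 and a large signal-to-noise value; noisy or irregular signals score lower on both.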

Usage

chromPeakSummary(object, param, ...)

## S4 method for signature 'XcmsExperiment,BetaDistributionParam'
chromPeakSummary(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  BPPARAM = bpparam()
)

BetaDistributionParam()

Arguments

`object`	an xcms result object containing information on identified chromatographic peaks.
`param`	a parameter object defining the method/summaries that should be calculated (see description above for supported parameter classes).
`...`	additional arguments passed to the method implementation.
`msLevel`	`integer(1)` with the MS level of the chromatographic peaks on which the metric should be calculated.
`chunkSize`	`integer(1)` defining the number of samples from which data should be loaded and processed at a time.
`BPPARAM`	Parallel processing setup. See `BiocParallel::bpparam()` for details.

Value

A matrix or data.frame with the same number of rows as there are chromatographic peaks. Columns contain the calculated values. The number of columns, their names and content depend on the used parameter object. See the respective documentation above for more details.

Author(s)

Pablo Vangeenderhuysen, Johannes Rainer, William Kumler

References

Kumler W, Hazelton B J and Ingalls A E (2023) "Picky with peakpicking: assessing chromatographic peak quality with simple metrics in metabolomics" BMC Bioinformatics 24(1):404. doi: 10.1186/s12859-023-05533-4

Collect MS^n peaks into xcmsFragments

Description

Collect peaks into an xcmsFragments object from several MS runs using xcmsSet and xcmsRaw.

Arguments

`object`	(empty) `xcmsFragments-class` object
`xs`	A `xcmsSet-class` object which contains picked ms1-peaks from several experiments
`compMethod`	("floor", "round", "none"): compare-method which is used to find the parent peak of a MSnpeak through comparing the MZ-values of the MS1peaks with the MSnParentPeaks.
`snthresh`, `mzgap`, `uniq`	these are the parameters for the getspec-peakpicker included in xcmsRaw.

Details

After running collect(xFragments, xSet), the peak table of the xcmsFragments object includes the MS1 peaks from all experiments stored in the xcmsSet object. It also contains the relevant MSn peaks from the xcmsRaw objects, which are created temporarily from the file paths in the xcmsSet.

Value

A matrix with columns:

`peakID`	unique identifier of every peak
`MSnParentPeakID`	PeakID of the parent peak of a msLevel>1 - peak, it is 0 if the peak is msLevel 1.
`msLevel`	The msLevel of the peak.
`rt`	retention time of the peak midpoint
`mz`	the mz-Value of the peak
`intensity`	the intensity of the peak
`sample`	the number of the sample from the xcmsSet
`GroupPeakMSn`	Used for grouped xcmsSet groups
`CollisionEnergy`	The collision energy of the fragment

Methods

object = "xcmsFragments": collect(object, ...)

Correlate chromatograms

Description

For xcms >= 3.15.3 please use compareChromatograms() instead of correlate.

Correlate intensities of two chromatograms with each other. If the two Chromatogram objects have different retention times they are first aligned to match data points in the first to data points in the second chromatogram. See help on alignRt in MSnbase::Chromatogram() for more details.

If correlate is called on a single MSnbase::MChromatograms() object a pairwise correlation of each chromatogram with each other is performed and a matrix with the correlation coefficients is returned.

Note that the correlation of two chromatograms depends also on their order, e.g. correlate(chr1, chr2) might not be identical to correlate(chr2, chr1). The lower and upper triangular part of the correlation matrix might thus be different.
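
Why the result depends on the order can be seen in a small base R sketch. It does not use the alignRt implementation from MSnbase: the helper `align_closest()` is hypothetical and simply picks, for each retention time of the first chromatogram, the intensity of the closest data point in the second one. All values are invented.

```r
## Two toy chromatograms with different retention time grids.
rt_x  <- c(1, 2, 3, 4, 5, 6)
int_x <- c(5, 30, 100, 40, 10, 2)
rt_y  <- c(1.3, 2.6, 3.3, 4.6, 5.6)
int_y <- c(8, 60, 90, 20, 4)

## Hypothetical helper: for every retention time in `rt_ref`, take the
## intensity of the data point in the other chromatogram that is closest
## in retention time.
align_closest <- function(rt_ref, rt_other, int_other) {
    idx <- vapply(rt_ref, function(r) which.min(abs(rt_other - r)),
                  integer(1))
    int_other[idx]
}

## Correlate x against y aligned to x, and y against x aligned to y.
cor_xy <- cor(int_x, align_closest(rt_x, rt_y, int_y),
              use = "pairwise.complete.obs", method = "pearson")
cor_yx <- cor(int_y, align_closest(rt_y, rt_x, int_x),
              use = "pairwise.complete.obs", method = "pearson")
c(x_vs_y = cor_xy, y_vs_x = cor_yx)
```

The first call compares six data points and the second only five, so the two coefficients differ even though the same pair of chromatograms is compared.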

Usage

## S4 method for signature 'Chromatogram,Chromatogram'
correlate(
  x,
  y,
  use = "pairwise.complete.obs",
  method = c("pearson", "kendall", "spearman"),
  align = c("closest", "approx"),
  ...
)

## S4 method for signature 'MChromatograms,missing'
correlate(
  x,
  y = NULL,
  use = "pairwise.complete.obs",
  method = c("pearson", "kendall", "spearman"),
  align = c("closest", "approx"),
  ...
)

## S4 method for signature 'MChromatograms,MChromatograms'
correlate(
  x,
  y = NULL,
  use = "pairwise.complete.obs",
  method = c("pearson", "kendall", "spearman"),
  align = c("closest", "approx"),
  ...
)

Arguments

`x`	`MSnbase::Chromatogram()` or `MSnbase::MChromatograms()` object.
`y`	`MSnbase::Chromatogram()` or `MSnbase::MChromatograms()` object.
`use`	`character(1)` passed to the `cor` function. See `cor()` for details.
`method`	`character(1)` passed to the `cor` function. See `stats::cor()` for details.
`align`	`character(1)` defining the alignment method to be used. See help on `alignRt` in `MSnbase::Chromatogram()` for details. The value of this parameter is passed to the `method` parameter of `alignRt`.
`...`	optional parameters passed along to the `alignRt` method such as `tolerance` that, if set to `0` requires the retention times to be identical.

Value

numeric(1) or matrix (if called on MChromatograms objects) with the correlation coefficient. If a matrix is returned, the rows represent the chromatograms in x and the columns the chromatograms in y.

Author(s)

Michael Witting, Johannes Rainer

Examples


library(MSnbase)
chr1 <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
    intensity = c(5, 29, 50, NA, 100, 12, 3, 4, 1, 3))
chr2 <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
    intensity = c(80, 50, 20, 10, 9, 4, 3, 4, 1, 3))
chr3 <- Chromatogram(rtime = 3:9 + rnorm(7, sd = 0.3),
    intensity = c(53, 80, 130, 15, 5, 3, 2))

chrs <- MChromatograms(list(chr1, chr2, chr3))

## Using `compareChromatograms` instead of `correlate`.
compareChromatograms(chr1, chr2)
compareChromatograms(chr2, chr1)

compareChromatograms(chrs, chrs)

Create report of analyte differences

Description

Create a report showing the most significant differences between two sets of samples. Optionally create extracted ion chromatograms for the most significant differences.

Arguments

`object`	the `xcmsSet` object
`class1`	character vector with the first set of sample classes to be compared
`class2`	character vector with the second set of sample classes to be compared
`filebase`	base file name to save the report; `.tsv` and `_eic` will be appended to this name for the tabular report and EIC directory, respectively. If blank, nothing will be saved.
`eicmax`	number of the most significantly different analytes to create EICs for
`eicwidth`	width (in seconds) of EICs produced
`sortpval`	logical indicating whether the reports should be sorted by p-value
`classeic`	character vector with the sample classes to include in the EICs
`value`	intensity values to be used for the diffreport. If `value="into"`, integrated peak intensities are used. If `value="maxo"`, maximum peak intensities are used. If `value="intb"`, baseline corrected integrated peak intensities are used (only available if peak detection was done by `findPeaks.centWave`).
`metlin`	mass uncertainty to use for generating links to the Metlin metabolite database. The sign of the uncertainty indicates negative or positive mode data for M+H or M-H calculation. A value of FALSE or 0 removes the column.
`h`	Numeric variable for the height of the eic and boxplots that are printed out.
`w`	Numeric variable for the width of the eic and boxplots print out made.
`mzdec`	Number of decimal places of title m/z values in the eic plot.
`missing`	`numeric(1)` defining an optional value for missing values. `missing = 0` would e.g. replace all `NA` values in the feature matrix with `0`. Note that also a call to `fillPeaks` results in a feature matrix in which `NA` values are replaced by `0`.
`...`	optional arguments to be passed to `mt.teststat` from the `multtest` package.

Details

This method handles creation of summary reports with statistics about which analytes were most significantly different between two sets of samples. It computes Welch's two-sample t-statistic for each analyte and ranks them by p-value. It returns a summary report that can optionally be written out to a tab-separated file.
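
The per-analyte statistics can be sketched with base R. Here `t.test()` stands in for `mt.teststat` from the multtest package, so this is an approximation of the report and not the diffreport implementation; the feature matrix, class labels and variable names are invented.

```r
## Toy feature matrix: rows are analytes, columns are samples.
intensities <- rbind(
    analyte_A = c(100, 110, 95, 400, 420, 390),
    analyte_B = c(800, 790, 820, 210, 190, 205),
    analyte_C = c(50, 55, 45, 52, 48, 51)
)
sample_class <- c("KO", "KO", "KO", "WT", "WT", "WT")
class1 <- "KO"
class2 <- "WT"

report <- t(apply(intensities, 1, function(v) {
    x1 <- v[sample_class == class1]
    x2 <- v[sample_class == class2]
    ## Welch's test (unequal variances); positive when class2 is higher
    tt <- t.test(x2, x1, var.equal = FALSE)
    ratio <- mean(x2) / mean(x1)
    c(fold   = max(ratio, 1 / ratio),   # fold change is reported as >= 1
      tstat  = unname(tt$statistic),
      pvalue = tt$p.value)
}))
report <- as.data.frame(report)

## Rank analytes by p-value, most significant first
report[order(report$pvalue), ]
```

The sign of `tstat` tells which class was higher, which is why the fold change itself can always be reported as a value of at least 1.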

Additionally, it does all the heavy lifting involved in creating superimposed extracted ion chromatograms for a given number of analytes. It does so by reading the raw data files associated with the samples of interest one at a time. As it does so, it prints the name of the sample it is currently reading. Depending on the number and size of the samples, this process can take a long time.

If a base file name is provided, the report (see Value section) will be saved to a tab-separated file. If EICs are generated, they are saved by default as 640x480 PNG files in a newly created subdirectory; the image size can be changed with the w and h arguments. The numbered file names correspond to the rows in the report.

Chromatographic traces in the EICs are colored and labeled by their sample class. Sample classes take their color from the current palette. The color assigned to a sample class depends on its order in the xcmsSet object, not the order given in the class arguments. Thus levels(sampclass(object))[1] would use color palette()[1] and so on. In that way, sample classes maintain the same color across any number of different generated reports.

When there are multiple sample classes, xcms will produce boxplots of the different classes and will generate a single ANOVA p-value statistic. As with the EICs, the plot number corresponds to the row number in the report.

Value

A data frame with the following columns:

`fold`	mean fold change (always greater than 1, see `tstat` for which set of sample classes was higher)
`tstat`	Welch's two sample t-statistic, positive for analytes having greater intensity in `class2`, negative for analytes having greater intensity in `class1`
`pvalue`	p-value of t-statistic
`anova`	p-value of the anova statistic if there are multiple classes
`mzmed`	median m/z of peaks in the group
`mzmin`	minimum m/z of peaks in the group
`mzmax`	maximum m/z of peaks in the group
`rtmed`	median retention time of peaks in the group
`rtmin`	minimum retention time of peaks in the group
`rtmax`	maximum retention time of peaks in the group
`npeaks`	number of peaks assigned to the group
`Sample Classes`	number of samples from each sample class represented in the group
`metlin`	A URL to metlin for that mass
`...`	one column for every sample class
`Sample Names`	integrated intensity value for every sample
`...`	one column for every sample

Methods

object = "xcmsSet": diffreport(object, class1 = levels(sampclass(object))[1], class2 = levels(sampclass(object))[2], filebase = character(), eicmax = 0, eicwidth = 200, sortpval = TRUE, classeic = c(class1,class2), value=c("into","maxo","intb"), metlin = FALSE, h=480,w=640, mzdec=2, missing = numeric(), ...)

Change the file path of an `OnDiskMSnExp` object

Description

dirname gets or sets the path to the directory containing the source files of the OnDiskMSnExp (or XCMSnExp) object.

Usage

## S4 method for signature 'OnDiskMSnExp'
dirname(path)

## S4 replacement method for signature 'OnDiskMSnExp'
dirname(path) <- value

Arguments

`path`	OnDiskMSnExp.
`value`	`character` of length 1 or length equal to the number of files defining the new path to the files.

Author(s)

Johannes Rainer

Align spectrum retention times across samples using peak groups found in most samples

Description

The function performs retention time correction by assessing the retention time deviation across all samples using peak groups (features) containing chromatographic peaks present in most/all samples. The retention time deviation for these features in each sample is described by fitting either a polynomial (smooth = "loess") or a linear (smooth = "linear") model to the data points. The models are subsequently used to adjust the retention time for each spectrum in each sample.

Usage

do_adjustRtime_peakGroups(
  peaks,
  peakIndex,
  rtime = list(),
  minFraction = 0.9,
  extraPeaks = 1,
  smooth = c("loess", "linear"),
  span = 0.2,
  family = c("gaussian", "symmetric"),
  peakGroupsMatrix = matrix(ncol = 0, nrow = 0),
  subset = integer(),
  subsetAdjust = c("average", "previous")
)

Arguments

`peaks`	a `matrix` or `data.frame` with the identified chromatographic peaks in the samples.
`peakIndex`	a `list` of indices that provides the grouping information of the chromatographic peaks (across and within samples).
`rtime`	a `list` of `numeric` vectors with the retention times per file/sample.
`minFraction`	For `PeakGroupsParam`: `numeric(1)` between 0 and 1 defining the minimum required proportion of samples in which peaks for the peak group were identified. Peak groups passing this criterion will be aligned across samples and retention times of individual spectra will be adjusted based on this alignment. For `minFraction = 1` the peak group has to contain peaks in all samples of the experiment. Note that if `subset` is provided, the specified fraction is relative to the defined subset of samples and not to the total number of samples within the experiment (i.e., a peak has to be present in the specified proportion of subset samples).
`extraPeaks`	For `PeakGroupsParam`: `numeric(1)` defining the maximal number of additional peaks for all samples to be assigned to a peak group (feature) for retention time correction. For a data set with 6 samples, `extraPeaks = 1` uses all peak groups with a total peak count `<= 6 + 1`. The total peak count is the total number of peaks being assigned to a peak group and considers also multiple peaks within a sample that are assigned to the group.
`smooth`	For `PeakGroupsParam`: `character(1)` defining the function to be used to interpolate corrected retention times for all peak groups. Can be either `"loess"` or `"linear"`.
`span`	For `PeakGroupsParam`: `numeric(1)` defining the degree of smoothing (if `smooth = "loess"`). This parameter is passed to the internal call to `stats::loess()`.
`family`	For `PeakGroupsParam`: `character(1)` defining the method for loess smoothing. Allowed values are `"gaussian"` and `"symmetric"`. See `stats::loess()` for more information.
`peakGroupsMatrix`	optional `matrix` of (raw) retention times for peak groups on which the alignment should be performed. Each column represents a sample, each row a feature/peak group. If not provided, this matrix will be determined depending on parameters `minFraction` and `extraPeaks`. If provided, `minFraction` and `extraPeaks` will be ignored.
`subset`	For `ObiwarpParam` and `PeakGroupsParam`: `integer` with the indices of samples within the experiment on which the alignment models should be estimated. Samples not part of the subset are adjusted based on the closest subset sample. See Subset-based alignment section for details.
`subsetAdjust`	For `ObiwarpParam` and `PeakGroupsParam`: `character(1)` specifying the method with which non-subset samples should be adjusted. Supported options are `"previous"` and `"average"` (default). See Subset-based alignment section for details.
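
The effect of `minFraction` and `extraPeaks` on which peak groups qualify for the alignment can be sketched in base R. The sketch follows the wording of the argument descriptions above and is not the internal xcms code; the count matrix and variable names are invented.

```r
## Number of peaks per feature (rows) and sample (columns); invented values.
peak_counts <- rbind(
    FT1 = c(1, 1, 1, 1, 1, 1),   # one peak in every sample
    FT2 = c(1, 1, 0, 1, 1, 1),   # missing in one sample
    FT3 = c(2, 1, 1, 2, 1, 2),   # several peaks per sample
    FT4 = c(1, 0, 0, 1, 0, 1)    # present in half of the samples
)
minFraction <- 0.8
extraPeaks  <- 1

n_samples    <- ncol(peak_counts)
frac_present <- rowSums(peak_counts > 0) / n_samples  # share of samples with a peak
total_peaks  <- rowSums(peak_counts)                  # total peak count per group

## A peak group is used if it is present in enough samples and does not
## contain too many additional peaks.
usable <- frac_present >= minFraction & total_peaks <= n_samples + extraPeaks
names(usable)[usable]
```

With 6 samples, FT3 is dropped because its 9 peaks exceed 6 + 1, and FT4 because it is found in only half of the samples.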

Details

The alignment is based on the presence of compounds that can be found in all/most samples of an experiment. The retention times of individual spectra are then adjusted based on the alignment of the features corresponding to these housekeeping compounds. The parameters minFraction and extraPeaks can be used to fine-tune which features should be used for the alignment (i.e. which features most likely correspond to the above-mentioned housekeeping compounds).

Parameter subset allows to define a subset of samples within the experiment that should be aligned. All samples not being part of the subset will be aligned based on the adjustment of the closest sample within the subset. This allows to e.g. exclude blank samples from the alignment process with their retention times being still adjusted based on the alignment results of the real samples.
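
The model fitting step for a single sample can be sketched in base R. The sketch assumes that the deviation of each housekeeping feature is its retention time in the sample minus a reference (for example the median across samples) and that adjusted times are raw times minus the predicted deviation; the simulated data, the `degree = 1` setting and all variable names are assumptions of this illustration.

```r
## Raw retention times of housekeeping features in one sample and their
## deviation from the reference (simulated values).
set.seed(1)
anchor_rt  <- seq(100, 1000, by = 30)
anchor_dev <- 0.5 + 0.002 * anchor_rt + rnorm(length(anchor_rt), sd = 0.1)
anchors    <- data.frame(rt = anchor_rt, dev = anchor_dev)

## Retention times of all spectra of that sample
spectra_rt <- seq(100, 1000, by = 2.5)
new_data   <- data.frame(rt = spectra_rt)

## smooth = "linear": a straight line through the deviations
fit_lin <- lm(dev ~ rt, data = anchors)
adj_lin <- spectra_rt - predict(fit_lin, newdata = new_data)

## smooth = "loess": a local regression. A wider span than the default of
## 0.2 is used here because the toy data has few points.
fit_lo <- loess(dev ~ rt, data = anchors, span = 0.5, degree = 1,
                family = "gaussian")
adj_lo <- spectra_rt - predict(fit_lo, newdata = new_data)

head(cbind(raw = spectra_rt, linear = adj_lin, loess = adj_lo))
```

Both models predict a deviation for every spectrum, not only for the anchor features, which is what allows the retention times of all spectra to be adjusted.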

Value

A list with numeric vectors with the adjusted retention times grouped by sample.

Note

The method ensures that returned adjusted retention times are increasingly ordered, just as the raw retention times.

Author(s)

Colin Smith, Johannes Rainer

References

Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak. "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.

Core API function for centWave peak detection

Description

This function performs peak density and wavelet based chromatographic peak detection for high resolution LC/MS data in centroid mode [Tautenhahn 2008].

Usage

do_findChromPeaks_centWave(
  mz,
  int,
  scantime,
  valsPerSpect,
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  roiList = list(),
  firstBaselineCheck = TRUE,
  roiScales = NULL,
  sleep = 0,
  extendLengthMSW = FALSE,
  verboseBetaColumns = FALSE
)

Arguments

`mz`	Numeric vector with the individual m/z values from all scans/ spectra of one file/sample.
`int`	Numeric vector with the individual intensity values from all scans/spectra of one file/sample.
`scantime`	Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan.
`valsPerSpect`	Numeric vector with the number of values for each spectrum.
`ppm`	`numeric(1)` defining the maximal tolerated m/z deviation in consecutive scans in parts per million (ppm) for the initial ROI definition.
`peakwidth`	`numeric(2)` with the expected approximate peak width in chromatographic space. Given as a range (min, max) in seconds.
`snthresh`	`numeric(1)` defining the signal to noise ratio cutoff.
`prefilter`	`numeric(2)`: `c(k, I)` specifying the prefilter step for the first analysis step (ROI detection). Mass traces are only retained if they contain at least `k` peaks with intensity `>= I`.
`mzCenterFun`	Name of the function to calculate the m/z center of the chromatographic peak. Allowed are: `"wMean"`: intensity weighted mean of the peak's m/z values, `"mean"`: mean of the peak's m/z values, `"apex"`: use the m/z value at the peak apex, `"wMeanApex3"`: intensity weighted mean of the m/z value at the peak apex and the m/z values left and right of it and `"meanApex3"`: mean of the m/z value of the peak apex and the m/z values left and right of it.
`integrate`	Integration method. For `integrate = 1` peak limits are found through descent on the mexican hat filtered data, for `integrate = 2` the descent is done on the real data. The latter method is more accurate but prone to noise, while the former is more robust, but less exact.
`mzdiff`	`numeric(1)` representing the minimum difference in m/z dimension required for peaks with overlapping retention times; can be negative to allow overlap. During peak post-processing, peaks defined to be overlapping are reduced to the one peak with the largest signal.
`fitgauss`	`logical(1)` whether or not a Gaussian should be fitted to each peak. This affects mostly the retention time position of the peak.
`noise`	`numeric(1)` allowing to set a minimum intensity required for centroids to be considered in the first analysis step (centroids with intensity `< noise` are omitted from ROI detection).
`verboseColumns`	`logical(1)` whether additional peak meta data columns should be returned.
`roiList`	An optional list of regions-of-interest (ROI) representing detected mass traces. If ROIs are submitted the first analysis step is omitted and chromatographic peak detection is performed on the submitted ROIs. Each ROI is expected to have the following elements specified: `scmin` (start scan index), `scmax` (end scan index), `mzmin` (minimum m/z), `mzmax` (maximum m/z), `length` (number of scans), `intensity` (summed intensity). Each ROI should be represented by a `list` of elements or a single row `data.frame`.
`firstBaselineCheck`	`logical(1)`. If `TRUE` continuous data within regions of interest is checked to be above the first baseline. In detail, a first rough estimate of the noise is calculated and peak detection is performed only in regions in which multiple sequential signals are higher than this first estimated baseline/noise level.
`roiScales`	Optional numeric vector with length equal to `roiList` defining the scale for each region of interest in `roiList` that should be used for the centWave-wavelets.
`sleep`	`numeric(1)` defining the number of seconds to wait between iterations. Defaults to `sleep = 0`. If `> 0` a plot is generated visualizing the identified chromatographic peak. Note: this argument is for backward compatibility only and will be removed in future.
`extendLengthMSW`	Option to force centWave to use all scales rather than truncating with the EIC length. Uses the "open" method to extend the EIC to an integer base-2 length prior to being passed to `convolve` rather than the default "reflect" method. See https://github.com/sneumann/xcms/issues/445 for more information.
`verboseBetaColumns`	Option to calculate two additional metrics of peak quality via comparison to an idealized bell curve. Adds `beta_cor` and `beta_snr` to the `chromPeaks` output, corresponding to a Pearson correlation coefficient to a bell curve with several degrees of skew as well as an estimate of signal-to-noise using the residuals from the best-fitting bell curve. See https://github.com/sneumann/xcms/pull/685 and https://doi.org/10.1186/s12859-023-05533-4 for more information.
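
The layout expected for `roiList` can be sketched as follows. This is only an illustration: the helper `make_roi` and all numeric values are invented, `length` is assumed to equal the number of contiguous scans between `scmin` and `scmax`, and only the six element names come from the argument description above.

```r
## Hypothetical helper; all values below are invented for illustration.
make_roi <- function(scmin, scmax, mzmin, mzmax, intensity) {
    list(scmin = scmin, scmax = scmax, mzmin = mzmin, mzmax = mzmax,
         length = scmax - scmin + 1L,   # assumed: contiguous scans
         intensity = intensity)         # summed intensity
}
roiList <- list(
    make_roi(10L, 25L, 300.100, 300.104, 52000),
    make_roi(40L, 52L, 453.200, 453.206, 18000)
)
## One scale per ROI, matching the length of roiList (values invented).
roiScales <- c(8, 6)
```

Both objects could then be passed through the `roiList` and `roiScales` arguments, in which case the ROI detection step is skipped.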

Details

This algorithm is most suitable for high resolution LC/{TOF,OrbiTrap,FTICR}-MS data in centroid mode. In the first phase the method identifies regions of interest (ROIs) representing mass traces, characterized as regions with less than ppm m/z deviation in consecutive scans in the LC/MS map. In detail, starting with a single m/z, a ROI is extended if an m/z can be found in the next scan (spectrum) for which the difference to the mean m/z of the ROI is smaller than the user defined ppm of the m/z. The mean m/z of the ROI is then updated to include the newly added m/z value.

These ROIs are then, after some cleanup, analyzed using continuous wavelet transform (CWT) to locate chromatographic peaks on different scales. The first analysis step is skipped, if regions of interest are passed with the roiList parameter.
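
The ROI extension rule described above can be sketched in base R. This is a simplified illustration, not the package's implementation: the function `extend_roi` is hypothetical, the toy scans are invented, and picking the closest candidate when several m/z values match is an assumption.

```r
## Illustrative only: extend a single ROI scan by scan using a ppm tolerance.
extend_roi <- function(scans, start_mz, ppm = 25) {
    roi_mz <- start_mz                    # m/z values collected so far
    for (sc in scans) {
        center <- mean(roi_mz)            # current mean m/z of the ROI
        hit <- sc[abs(sc - center) < center * ppm / 1e6]
        if (length(hit) == 0L) break      # no matching m/z: the ROI ends
        ## assumed tie-break: keep the candidate closest to the mean
        roi_mz <- c(roi_mz, hit[which.min(abs(hit - center))])
    }
    roi_mz
}

## Three consecutive toy scans, each a vector of centroid m/z values.
scans <- list(c(200.0010, 350.2), c(200.0030, 410.7), c(200.0200, 350.2))
roi <- extend_roi(scans, start_mz = 200.0000)
```

With `ppm = 25` the tolerance at m/z 200 is about 0.005, so the first two scans extend the ROI while the third, 0.02 away from the running mean, ends it.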

Value

A matrix, each row representing an identified chromatographic peak, with columns:

mz: Intensity weighted mean of m/z values of the peak across scans.
mzmin: Minimum m/z of the peak.
mzmax: Maximum m/z of the peak.
rt: Retention time of the peak's midpoint.
rtmin: Minimum retention time of the peak.
rtmax: Maximum retention time of the peak.
into: Integrated (original) intensity of the peak.
intb: Per-peak baseline corrected integrated peak intensity.
maxo: Maximum intensity of the peak.
sn: Signal to noise ratio, defined as (maxo - baseline)/sd, sd being the standard deviation of local chromatographic noise.
egauss: RMSE of Gaussian fit.

Additional columns for verboseColumns = TRUE:

mu: Gaussian parameter mu.
sigma: Gaussian parameter sigma.
h: Gaussian parameter h.
f: Region number of the m/z ROI where the peak was localized.
dppm: m/z deviation of mass trace across scans in ppm.
scale: Scale on which the peak was localized.
scpos: Peak position found by wavelet analysis (scan number).
scmin: Left peak limit found by wavelet analysis (scan number).
scmax: Right peak limit found by wavelet analysis (scan number).

Additional columns for verboseBetaColumns = TRUE:

beta_cor: Correlation between an "ideal" bell curve and the raw data
beta_snr: Signal-to-noise residuals calculated from the beta_cor fit

Note

centWave was designed to work on centroided data, so the function expects data in centroid mode.

This function exposes core chromatographic peak detection functionality of the centWave method. While this function can be called directly, users will generally call the corresponding method for the data object instead.

Author(s)

Ralf Tautenhahn, Johannes Rainer

References

Ralf Tautenhahn, Christoph Böttcher, and Steffen Neumann "Highly sensitive feature detection for high resolution LC/MS" BMC Bioinformatics 2008, 9:504

Examples

## Load the test file
faahko_sub <- loadXcmsData("faahko_sub")

## Subset to one file and restrict to a certain retention time range
data <- filterRt(filterFile(faahko_sub, 1), c(2500, 3000))

## Get m/z and intensity values
mzs <- mz(data)
ints <- intensity(data)

## Define the values per spectrum:
valsPerSpect <- lengths(mzs)

## Calling the function. We're using a large value for noise and prefilter
## to speed up the call in the example - in a real use case we would either
## set the value to a reasonable value or use the default value.
res <- do_findChromPeaks_centWave(mz = unlist(mzs), int = unlist(ints),
    scantime = rtime(data), valsPerSpect = valsPerSpect, noise = 10000,
    prefilter = c(3, 10000))
head(res)

Core API function for two-step centWave peak detection with isotopes

Description

The do_findChromPeaks_centWaveWithPredIsoROIs function performs a two-step centWave based peak detection: chromatographic peaks are identified using centWave, followed by a prediction of the location of the identified peaks' isotopes in the m/z-retention time space. These locations are fed as regions of interest (ROIs) to a subsequent centWave run. All non-overlapping peaks from these two peak detection runs are reported as the final list of identified peaks.

The do_findChromPeaks_addPredIsoROIs function performs centWave based peak detection in regions of interest (ROIs) representing predicted isotopes for the peaks submitted with the peaks. argument. The function returns a matrix with the identified peaks consisting of all input peaks and peaks representing predicted isotopes of these (if found by the centWave algorithm).

Usage

do_findChromPeaks_centWaveWithPredIsoROIs(
  mz,
  int,
  scantime,
  valsPerSpect,
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  roiList = list(),
  firstBaselineCheck = TRUE,
  roiScales = NULL,
  snthreshIsoROIs = 6.25,
  maxCharge = 3,
  maxIso = 5,
  mzIntervalExtension = TRUE,
  polarity = "unknown",
  extendLengthMSW = FALSE,
  verboseBetaColumns = FALSE
)

do_findChromPeaks_addPredIsoROIs(
  mz,
  int,
  scantime,
  valsPerSpect,
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 6.25,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  peaks. = NULL,
  maxCharge = 3,
  maxIso = 5,
  mzIntervalExtension = TRUE,
  polarity = "unknown"
)

Arguments

`mz`	Numeric vector with the individual m/z values from all scans/spectra of one file/sample.
`int`	Numeric vector with the individual intensity values from all scans/spectra of one file/sample.
`scantime`	Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan.
`valsPerSpect`	Numeric vector with the number of values for each spectrum.
`ppm`	`numeric(1)` defining the maximal tolerated m/z deviation in consecutive scans in parts per million (ppm) for the initial ROI definition.
`peakwidth`	`numeric(2)` with the expected approximate peak width in chromatographic space. Given as a range (min, max) in seconds.
`snthresh`	For `do_findChromPeaks_addPredIsoROIs`: numeric(1) defining the signal to noise threshold for the centWave algorithm. For `do_findChromPeaks_centWaveWithPredIsoROIs`: numeric(1) defining the signal to noise threshold for the initial (first) centWave run.
`prefilter`	`numeric(2)`: `c(k, I)` specifying the prefilter step for the first analysis step (ROI detection). Mass traces are only retained if they contain at least `k` peaks with intensity `>= I`.
`mzCenterFun`	Name of the function to calculate the m/z center of the chromatographic peak. Allowed are: `"wMean"`: intensity weighted mean of the peak's m/z values, `"mean"`: mean of the peak's m/z values, `"apex"`: use the m/z value at the peak apex, `"wMeanApex3"`: intensity weighted mean of the m/z value at the peak apex and the m/z values left and right of it and `"meanApex3"`: mean of the m/z value of the peak apex and the m/z values left and right of it.
`integrate`	Integration method. For `integrate = 1` peak limits are found through descent on the mexican hat filtered data, for `integrate = 2` the descent is done on the real data. The latter method is more accurate but prone to noise, while the former is more robust, but less exact.
`mzdiff`	`numeric(1)` representing the minimum difference in m/z dimension required for peaks with overlapping retention times; can be negative to allow overlap. During peak post-processing, peaks defined to be overlapping are reduced to the one peak with the largest signal.
`fitgauss`	`logical(1)` whether or not a Gaussian should be fitted to each peak. This affects mostly the retention time position of the peak.
`noise`	`numeric(1)` allowing to set a minimum intensity required for centroids to be considered in the first analysis step (centroids with intensity `< noise` are omitted from ROI detection).
`verboseColumns`	`logical(1)` whether additional peak meta data columns should be returned.
`roiList`	An optional list of regions-of-interest (ROI) representing detected mass traces. If ROIs are submitted the first analysis step is omitted and chromatographic peak detection is performed on the submitted ROIs. Each ROI is expected to have the following elements specified: `scmin` (start scan index), `scmax` (end scan index), `mzmin` (minimum m/z), `mzmax` (maximum m/z), `length` (number of scans), `intensity` (summed intensity). Each ROI should be represented by a `list` of elements or a single row `data.frame`.
`firstBaselineCheck`	`logical(1)`. If `TRUE` continuous data within regions of interest is checked to be above the first baseline. In detail, a first rough estimate of the noise is calculated and peak detection is performed only in regions in which multiple sequential signals are higher than this first estimated baseline/noise level.
`roiScales`	Optional numeric vector with length equal to `roiList` defining the scale for each region of interest in `roiList` that should be used for the centWave-wavelets.
`snthreshIsoROIs`	`numeric(1)` defining the signal to noise ratio cutoff to be used in the second centWave run to identify peaks for predicted isotope ROIs.
`maxCharge`	`integer(1)` defining the maximal isotope charge. Isotopes will be defined for charges `1:maxCharge`.
`maxIso`	`integer(1)` defining the number of isotope peaks that should be predicted for each peak identified in the first centWave run.
`mzIntervalExtension`	`logical(1)` whether the mz range for the predicted isotope ROIs should be extended to increase detection of low intensity peaks.
`polarity`	`character(1)` specifying the polarity of the data. Currently not used, but has to be `"positive"`, `"negative"` or `"unknown"` if provided.
`extendLengthMSW`	Option to force centWave to use all scales rather than truncating with the EIC length. Uses the "open" method to extend the EIC to an integer base-2 length prior to being passed to `convolve` rather than the default "reflect" method. See https://github.com/sneumann/xcms/issues/445 for more information.
`verboseBetaColumns`	Option to calculate two additional metrics of peak quality via comparison to an idealized bell curve. Adds `beta_cor` and `beta_snr` to the `chromPeaks` output, corresponding to a Pearson correlation coefficient to a bell curve with several degrees of skew as well as an estimate of signal-to-noise using the residuals from the best-fitting bell curve. See https://github.com/sneumann/xcms/pull/685 and https://doi.org/10.1186/s12859-023-05533-4 for more information.
`peaks.`	A matrix or `xcmsPeaks` object such as one returned by a call to `do_findChromPeaks_centWave` or `findPeaks.centWave` (both with `verboseColumns = TRUE`) with the peaks for which isotopes should be predicted and used for an additional peak detection using the centWave method. Required columns are: `"mz"`, `"mzmin"`, `"mzmax"`, `"scmin"`, `"scmax"`, `"scale"` and `"into"`.

Details

For more details on the centWave algorithm see centWave.
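
How `maxCharge` and `maxIso` span the candidate isotope positions can be sketched as follows. This is a rough illustration under stated assumptions: the helper `predict_iso_mz` is hypothetical, the isotope spacing is approximated by the 13C - 12C mass difference, and the way the package derives the m/z and scan ranges of the resulting ROIs is not shown.

```r
## Assumption: isotope spacing approximated by the 13C - 12C mass difference.
iso_delta <- 1.0033548

## Hypothetical helper: candidate isotope m/z values for one detected peak.
predict_iso_mz <- function(mz, maxCharge = 3, maxIso = 5) {
    grid <- expand.grid(iso = seq_len(maxIso), charge = seq_len(maxCharge))
    grid$mz <- mz + grid$iso * iso_delta / grid$charge
    grid
}

## Invented m/z of a peak from the first centWave run.
iso <- predict_iso_mz(301.1234)
```

Each row of `iso` is one isotope number and charge combination, so the defaults yield `maxIso * maxCharge` candidates per peak, some of which coincide (for example isotope 2 at charge 2 and isotope 1 at charge 1).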

Value

A matrix, each row representing an identified chromatographic peak. All non-overlapping peaks identified in both centWave runs are reported. The matrix columns are:

mz: Intensity weighted mean of m/z values of the peaks across scans.
mzmin: Minimum m/z of the peaks.
mzmax: Maximum m/z of the peaks.
rt: Retention time of the peak's midpoint.
rtmin: Minimum retention time of the peak.
rtmax: Maximum retention time of the peak.
into: Integrated (original) intensity of the peak.
intb: Per-peak baseline corrected integrated peak intensity.
maxo: Maximum intensity of the peak.
sn: Signal to noise ratio, defined as (maxo - baseline)/sd, sd being the standard deviation of local chromatographic noise.
egauss: RMSE of Gaussian fit.

Additional columns for verboseColumns = TRUE:

mu: Gaussian parameter mu.
sigma: Gaussian parameter sigma.
h: Gaussian parameter h.
f: Region number of the m/z ROI where the peak was localized.
dppm: m/z deviation of mass trace across scans in ppm.
scale: Scale on which the peak was localized.
scpos: Peak position found by wavelet analysis (scan number).
scmin: Left peak limit found by wavelet analysis (scan number).
scmax: Right peak limit found by wavelet analysis (scan number).

Additional columns for verboseBetaColumns = TRUE:

beta_cor: Correlation between an "ideal" bell curve and the raw data
beta_snr: Signal-to-noise residuals calculated from the beta_cor fit

Author(s)

Hendrik Treutler, Johannes Rainer

Core API function for massifquant peak detection

Description

Massifquant is a Kalman filter (KF)-based chromatographic peak detection for XC-MS data in centroid mode. The identified peaks can be further refined with the centWave method (see do_findChromPeaks_centWave for details on centWave) by specifying withWave = TRUE.

Usage

do_findChromPeaks_massifquant(
  mz,
  int,
  scantime,
  valsPerSpect,
  ppm = 10,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  criticalValue = 1.125,
  consecMissedLimit = 2,
  unions = 1,
  checkBack = 0,
  withWave = FALSE
)

Arguments

`mz`	Numeric vector with the individual m/z values from all scans/spectra of one file/sample.
`int`	Numeric vector with the individual intensity values from all scans/spectra of one file/sample.
`scantime`	Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan.
`valsPerSpect`	Numeric vector with the number of values for each spectrum.
`ppm`	`numeric(1)` defining the maximal tolerated m/z deviation in consecutive scans in parts per million (ppm) for the initial ROI definition.
`peakwidth`	`numeric(2)` with the expected approximate peak width in chromatographic space. Given as a range (min, max) in seconds.
`snthresh`	`numeric(1)` defining the signal to noise ratio cutoff.
`prefilter`	`numeric(2)`: `c(k, I)` specifying the prefilter step for the first analysis step (ROI detection). Mass traces are only retained if they contain at least `k` peaks with intensity `>= I`.
`mzCenterFun`	Name of the function to calculate the m/z center of the chromatographic peak. Allowed are: `"wMean"`: intensity weighted mean of the peak's m/z values, `"mean"`: mean of the peak's m/z values, `"apex"`: use the m/z value at the peak apex, `"wMeanApex3"`: intensity weighted mean of the m/z value at the peak apex and the m/z values left and right of it and `"meanApex3"`: mean of the m/z value of the peak apex and the m/z values left and right of it.
`integrate`	Integration method. For `integrate = 1` peak limits are found through descent on the mexican hat filtered data, for `integrate = 2` the descent is done on the real data. The latter method is more accurate but prone to noise, while the former is more robust, but less exact.
`mzdiff`	`numeric(1)` representing the minimum difference in m/z dimension required for peaks with overlapping retention times; can be negative to allow overlap. During peak post-processing, peaks defined to be overlapping are reduced to the one peak with the largest signal.
`fitgauss`	`logical(1)` whether or not a Gaussian should be fitted to each peak. This affects mostly the retention time position of the peak.
`noise`	`numeric(1)` allowing to set a minimum intensity required for centroids to be considered in the first analysis step (centroids with intensity `< noise` are omitted from ROI detection).
`verboseColumns`	`logical(1)` whether additional peak meta data columns should be returned.
`criticalValue`	`numeric(1)`. Suggested values: (`0.1-3.0`). This setting helps determine the Kalman Filter prediction margin of error. A real centroid belonging to a bona fide peak must fall within the KF prediction margin of error. Much like in the construction of a confidence interval, `criticalValue` loosely translates to a multiplier of the standard error of the prediction reported by the Kalman Filter. If the peaks in the XC-MS sample have a small mass deviance in ppm error, a smaller critical value might be better and vice versa.
`consecMissedLimit`	`integer(1)`. Suggested values: (`1,2,3`). While a peak is in the process of being detected by a Kalman Filter, the Kalman Filter may not find a predicted centroid in every scan. After 1 or more consecutive failed predictions, this setting informs Massifquant when to stop a Kalman Filter from following a candidate peak.
`unions`	`integer(1)`. Set to `1` to apply the t-test union on segmentation; set to `0` if no t-test should be applied on chromatographically continuous peaks sharing the same m/z range. Explanation: With very few data points, sometimes a Kalman Filter stops tracking a peak prematurely. Another Kalman Filter is instantiated and begins following the rest of the signal. Because tracking is done backwards to forwards, this algorithmic defect leaves a real peak divided into two segments or more. With this option turned on, the program identifies segmented peaks and combines them (merges them) into one with a two sample t-test. The potential danger of this option is that some truly distinct peaks may be merged.
`checkBack`	`integer(1)`. Set to `1` if turned on; set to `0` if turned off. The convergence of a Kalman Filter to a peak's precise m/z mapping is very fast, but sometimes it incorporates erroneous centroids as part of a peak (especially early on). The `checkBack` option is an attempt to remove the occasional outlier that lies beyond the converged bounds of the Kalman Filter. The option does not directly affect identification of a peak because it is a postprocessing measure; it has not been shown to be extremely useful thus far and the default is set to being turned off.
`withWave`	`logical(1)` if `TRUE`, the peaks identified first with Massifquant are subsequently filtered with the second step of the centWave algorithm, which includes wavelet estimation.
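
The five `mzCenterFun` options listed above can be illustrated on a single toy peak. This is a sketch only: the function `mz_center` is hypothetical, the data are invented, and clipping the three-point window at the edges of the peak is an assumption.

```r
## Hypothetical re-implementation of the m/z center options for one peak.
mz_center <- function(mz, int, method = "wMean") {
    apex <- which.max(int)
    ## apex plus its left and right neighbours (assumed: clipped at the edges)
    idx3 <- max(1L, apex - 1L):min(length(mz), apex + 1L)
    switch(method,
           wMean = weighted.mean(mz, int),
           mean = mean(mz),
           apex = mz[apex],
           wMeanApex3 = weighted.mean(mz[idx3], int[idx3]),
           meanApex3 = mean(mz[idx3]),
           stop("unknown method: ", method))
}

## Invented m/z and intensity values of one chromatographic peak.
mz <- c(100.001, 100.002, 100.003, 100.004, 100.005)
int <- c(10, 50, 200, 60, 5)
centers <- sapply(c("wMean", "mean", "apex", "wMeanApex3", "meanApex3"),
                  function(m) mz_center(mz, int, m))
```

The intensity weighted variants are pulled towards the more intense side of the peak, while `"apex"` ignores every value except the one at the maximum.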

Details

This algorithm's performance has been tested rigorously on high resolution LC/(OrbiTrap, TOF)-MS data in centroid mode. Simultaneous Kalman filters identify peaks and calculate their area under the curve. The default parameters are set to operate on a complex LC-MS Orbitrap sample. Users will find it useful to do some simple exploratory data analysis to find out where to set a minimum intensity, and identify how many scans an average peak spans. The consecMissedLimit parameter has yielded good performance on Orbitrap data when set to (2) and on TOF data it was found best to be at (1). This may change as the algorithm has yet to be tested on many samples. The criticalValue parameter is perhaps most difficult to dial in appropriately and visual inspection of peak identification is the best suggested tool for quick optimization. The ppm and checkBack parameters have shown less influence than the other parameters and exist to give users flexibility and better accuracy.
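
How criticalValue and consecMissedLimit interact can be sketched with a scalar Kalman filter that tracks a constant m/z. This is a conceptual illustration, not Massifquant's implementation: the function `track_mz`, the variances `P`, `q` and `r`, the toy observations, and the rule of stopping once the number of consecutive misses reaches consecMissedLimit are all assumptions.

```r
## Hypothetical scalar Kalman filter following one m/z trace across scans.
track_mz <- function(obs, x0, criticalValue = 1.125, consecMissedLimit = 2,
                     P = 1e-6, q = 1e-8, r = 1e-6) {
    x <- x0
    missed <- 0L
    kept <- logical(length(obs))
    for (i in seq_along(obs)) {
        P <- P + q                              # predict (constant m/z model)
        margin <- criticalValue * sqrt(P + r)   # prediction margin of error
        if (!is.na(obs[i]) && abs(obs[i] - x) <= margin) {
            K <- P / (P + r)                    # Kalman gain
            x <- x + K * (obs[i] - x)           # update the m/z estimate
            P <- (1 - K) * P
            kept[i] <- TRUE
            missed <- 0L
        } else {
            missed <- missed + 1L
            if (missed >= consecMissedLimit) break   # assumed stopping rule
        }
    }
    list(mz = x, kept = kept)
}

## Invented centroids: one outlier (scan 3) and two empty scans (NA).
obs <- c(500.0005, 500.0010, 500.0100, 500.0008, NA, NA, 500.0006)
trk <- track_mz(obs, x0 = 500.0000)
```

A larger criticalValue widens the acceptance margin and would eventually accept the outlier in scan 3; a larger consecMissedLimit would let the filter survive the two empty scans and pick up the last centroid.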

Value

A matrix, each row representing an identified chromatographic peak, with columns:

mz: Intensity weighted mean of m/z values of the peaks across scans.
mzmin: Minimum m/z of the peak.
mzmax: Maximum m/z of the peak.
rtmin: Minimum retention time of the peak.
rtmax: Maximum retention time of the peak.
rt: Retention time of the peak's midpoint.
into: Integrated (original) intensity of the peak.
maxo: Maximum intensity of the peak.

If withWave is set to TRUE, the result is the same as returned by the do_findChromPeaks_centWave method.

Author(s)

Christopher Conley

References

Conley CJ, Smith R, Torgrip RJ, Taylor RM, Tautenhahn R and Prince JT "Massifquant: open-source Kalman filter-based XC-MS isotope trace feature detection" Bioinformatics 2014, 30(18):2636-43.

Examples


## Load the test file
faahko_sub <- loadXcmsData("faahko_sub")

## Subset to one file and restrict to a certain retention time range
data <- filterRt(filterFile(faahko_sub, 1), c(2500, 3000))

## Get m/z and intensity values
mzs <- mz(data)
ints <- intensity(data)

## Define the values per spectrum:
valsPerSpect <- lengths(mzs)

## Perform the peak detection using massifquant - setting prefilter to
## a high value to speed up the call for the example
res <- do_findChromPeaks_massifquant(mz = unlist(mzs), int = unlist(ints),
    scantime = rtime(data), valsPerSpect = valsPerSpect,
    prefilter = c(3, 10000))
head(res)

Core API function for matchedFilter peak detection

Description

This function identifies peaks in the chromatographic time domain as described in [Smith 2006]. The intensity values are binned by cutting the LC/MS data into slices (bins) of a mass unit (binSize m/z) wide. Within each bin the maximal intensity is selected. The peak detection is then performed in each bin by extending it based on the steps parameter to generate slices comprising bins current_bin - steps + 1 to current_bin + steps - 1. Each of these slices is then filtered with matched filtration using a second-derivative Gaussian as the model peak shape. After filtration peaks are detected using a signal-to-noise ratio cut-off. For more details and illustrations see [Smith 2006].
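
The matched filtration step can be sketched in base R on a synthetic chromatogram. This is a simplified illustration rather than the package's code: a scan spacing of 1 second, the kernel half-width `hw` and the noise-free toy signal are assumptions, and only the relation `sigma = fwhm/2.3548` is taken from the function's defaults.

```r
fwhm <- 30
sigma <- fwhm / 2.3548            # 2.3548 is approximately 2 * sqrt(2 * log(2))

## Negated second derivative of a Gaussian, up to a constant factor.
hw <- ceiling(4 * sigma)          # kernel half-width in scans (assumed)
tk <- -hw:hw
kern <- (1 - tk^2 / sigma^2) * exp(-tk^2 / (2 * sigma^2))

## Synthetic, noise-free chromatogram: one peak at 150 s on a flat baseline.
rt <- 0:300
eic <- 1000 * exp(-(rt - 150)^2 / (2 * sigma^2)) + 50

## Centered filtering; values are NA where the kernel overhangs the ends.
filt <- as.numeric(stats::filter(eic, kern, sides = 2))
apex_rt <- rt[which.max(filt)]
```

Because the kernel sums to approximately zero, the constant baseline contributes almost nothing to `filt`, and the filter response is largest where the signal matches the model peak shape.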

Usage

do_findChromPeaks_matchedFilter(
  mz,
  int,
  scantime,
  valsPerSpect,
  binSize = 0.1,
  impute = "none",
  baseValue,
  distance,
  fwhm = 30,
  sigma = fwhm/2.3548,
  max = 5,
  snthresh = 10,
  steps = 2,
  mzdiff = 0.8 - binSize * steps,
  index = FALSE,
  sleep = 0
)

Arguments

`mz`	Numeric vector with the individual m/z values from all scans/spectra of one file/sample.
`int`	Numeric vector with the individual intensity values from all scans/spectra of one file/sample.
`scantime`	Numeric vector of length equal to the number of spectra/scans of the data representing the retention time of each scan.
`valsPerSpect`	Numeric vector with the number of values for each spectrum.
`binSize`	`numeric(1)` specifying the width of the bins/slices in m/z dimension.
`impute`	Character string specifying the method to be used for missing value imputation. Allowed values are `"none"` (no linear interpolation), `"lin"` (linear interpolation), `"linbase"` (linear interpolation within a certain bin-neighborhood) and `"intlin"`. See `imputeLinInterpol` for more details.
`baseValue`	The base value to which empty elements should be set. This is only considered for `impute = "linbase"` and corresponds to the `profBinLinBase`'s `baselevel` argument.
`distance`	For `impute = "linbase"`: number of non-empty neighboring elements of an empty element that should be considered for linear interpolation. See details section for more information.
`fwhm`	`numeric(1)` specifying the full width at half maximum of matched filtration gaussian model peak. Only used to calculate the actual sigma, see below.
`sigma`	`numeric(1)` specifying the standard deviation (width) of the matched filtration model peak.
`max`	`numeric(1)` representing the maximum number of peaks that are expected/will be identified per slice.
`snthresh`	`numeric(1)` defining the signal to noise ratio cutoff.
`steps`	`numeric(1)` defining the number of bins to be merged before filtration (i.e. the number of neighboring bins that will be joined to the slice in which filtration and peak detection will be performed).
`mzdiff`	`numeric(1)` representing the minimum difference in m/z dimension required for peaks with overlapping retention times; can be negative to allow overlap. During peak post-processing, peaks defined to be overlapping are reduced to the one peak with the largest signal.
`index`	`logical(1)` specifying whether indices should be returned instead of values for m/z and retention times.
`sleep`	`numeric(1)` defining the number of seconds to wait between iterations. Defaults to `sleep = 0`. If `> 0` a plot is generated visualizing the identified chromatographic peak. Note: this argument is for backward compatibility only and will be removed in future.
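
The relation between the `mz`, `int`, `scantime` and `valsPerSpect` arguments can be illustrated with three toy spectra. The values are invented; the sketch only shows how per-spectrum lists are flattened and how the per-spectrum structure can be recovered.

```r
## Three toy spectra; m/z and intensity values are invented.
mzs <- list(c(100.1, 200.2), c(100.1, 150.3, 200.2), c(200.2))
ints <- list(c(10, 20), c(12, 5, 25), c(30))

mz <- unlist(mzs)                 # all m/z values, spectrum after spectrum
int <- unlist(ints)               # matching intensities, same order
valsPerSpect <- lengths(mzs)      # number of values in each spectrum
scantime <- c(1.0, 1.5, 2.0)      # one retention time per spectrum

## The per-spectrum structure can be recovered from valsPerSpect.
spectrum_id <- rep(seq_along(valsPerSpect), valsPerSpect)
by_spectrum <- split(mz, spectrum_id)
```

The invariants are that `length(mz)` equals `length(int)` and `sum(valsPerSpect)`, and that `scantime` has one entry per spectrum.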

Details

The intensities are binned by the provided m/z values within each spectrum (scan). Binning is performed such that the bins are centered around the m/z values (i.e. the first bin includes all m/z values between min(mz) - bin_size/2 and min(mz) + bin_size/2).

For more details on binning and missing value imputation see binYonX and imputeLinInterpol methods.
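
The centered binning described above can be sketched in base R. This is an approximation for illustration and not a substitute for binYonX: the toy values are invented and the handling of values falling exactly on a bin edge may differ from the package.

```r
binSize <- 0.1
## Invented m/z and intensity values of one spectrum.
mz <- c(100.02, 100.04, 100.11, 100.29, 100.31)
int <- c(10, 40, 25, 5, 60)

## Breaks chosen so that the first bin is centered on min(mz).
n_bins <- ceiling((max(mz) - min(mz)) / binSize + 0.5)
breaks <- min(mz) - binSize / 2 + binSize * (0:n_bins)
bin <- findInterval(mz, breaks)

## Keep the maximal intensity per bin, as described for matchedFilter.
binned <- tapply(int, bin, max)
```

The third bin stays empty in this toy spectrum; such gaps are what the `impute` argument addresses.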

Value

A matrix, each row representing an identified chromatographic peak, with columns:

mz: Intensity weighted mean of m/z values of the peak across scans.
mzmin: Minimum m/z of the peak.
mzmax: Maximum m/z of the peak.
rt: Retention time of the peak's midpoint.
rtmin: Minimum retention time of the peak.
rtmax: Maximum retention time of the peak.
into: Integrated (original) intensity of the peak.
intf: Integrated intensity of the filtered peak.
maxo: Maximum intensity of the peak.
maxf: Maximum intensity of the filtered peak.
i: Rank of peak in merged EIC (<= max).
sn: Signal to noise ratio of the peak

Note

This function exposes core peak detection functionality of the matchedFilter method. While this function can be called directly, users will generally call the corresponding method for the data object instead (e.g. the findPeaks.matchedFilter method).

Author(s)

Colin A Smith, Johannes Rainer

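References

Colin A. Smith, Elizabeth J. Want, Grace O'Maille, Ruben Abagyan and Gary Siuzdak "XCMS: Processing Mass Spectrometry Data for Metabolite Profiling Using Nonlinear Peak Alignment, Matching, and Identification" Anal. Chem. 2006, 78:779-787.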

Examples


## Load the test file
faahko_sub <- loadXcmsData("faahko_sub")

## Subset to one file and restrict to a certain retention time range
data <- filterRt(filterFile(faahko_sub, 1), c(2500, 3000))

## Get m/z and intensity values
mzs <- mz(data)
ints <- intensity(data)

## Define the values per spectrum:
valsPerSpect <- lengths(mzs)

res <- do_findChromPeaks_matchedFilter(mz = unlist(mzs), int = unlist(ints),
    scantime = rtime(data), valsPerSpect = valsPerSpect)
head(res)

Core API function for single-spectrum non-chromatography MS data peak detection

Description

This function performs peak detection in a mass spectrometry direct injection spectrum using a wavelet based algorithm.

Usage

do_findPeaks_MSW(
  mz,
  int,
  snthresh = 3,
  verboseColumns = FALSE,
  scantime = numeric(),
  valsPerSpect = integer(),
  ...
)

Arguments

`mz`	Numeric vector with the individual m/z values from all scans/spectra of one file/sample.
`int`	Numeric vector with the individual intensity values from all scans/spectra of one file/sample.
`snthresh`	`numeric(1)` defining the signal to noise ratio cutoff.
`verboseColumns`	`logical(1)` whether additional peak meta data columns should be returned.
`scantime`	ignored.
`valsPerSpect`	ignored.
`...`	Additional parameters to be passed to the `peakDetectionCWT` function.

Details

This is a wrapper around the peak picker in Bioconductor's MassSpecWavelet package that calls its peakDetectionCWT and tuneInPeakInfo functions. See the xcmsDirect vignette for more information.

Value

A matrix, each row representing an identified peak, with columns:

mz: m/z value of the peak at the centroid position.
mzmin: Minimum m/z of the peak.
mzmax: Maximum m/z of the peak.
rt: Always -1.
rtmin: Always -1.
rtmax: Always -1.
into: Integrated (original) intensity of the peak.
maxo: Maximum intensity of the peak.
intf: Always NA.
maxf: Maximum MSW-filter response of the peak.
sn: Signal to noise ratio.

Author(s)

Joachim Kutzera, Steffen Neumann, Johannes Rainer

Core API function for peak density based chromatographic peak grouping

Description

The do_groupChromPeaks_density function performs chromatographic peak grouping based on the density (distribution) of peaks, found in different samples, along the retention time axis in slices of overlapping m/z ranges. By default (with parameter ppm = 0) these m/z ranges have all the same (constant) size (depending on parameter binSize). For values of ppm larger than 0 the m/z bins (ranges or slices) will have increasing sizes depending on the m/z value. This better models the m/z-dependent measurement error/precision seen on some MS instruments.

Usage

do_groupChromPeaks_density(
  peaks,
  sampleGroups,
  bw = 30,
  minFraction = 0.5,
  minSamples = 1,
  binSize = 0.25,
  maxFeatures = 50,
  sleep = 0,
  index = seq_len(nrow(peaks)),
  ppm = 0
)

Arguments

`peaks`	A `matrix` or `data.frame` with the mz values and retention times of the identified chromatographic peaks in all samples of an experiment. Required columns are `"mz"`, `"rt"` and `"sample"`. The latter should contain `numeric` values representing the index of the sample in which the peak was found.
`sampleGroups`	For `PeakDensityParam`: A vector with the same length as the number of samples, defining the sample group assignments (i.e. which samples belong to which sample group). This parameter is mandatory for `PeakDensityParam` and has to be defined even if there is no sample grouping in the experiment (in which case all samples should be assigned to the same group). Samples for which `NA` is provided will not be considered in the feature definition step. Providing `NA` for all blanks in an experiment will, for example, prevent features from being defined for signals (chromatographic peaks) present only in blank samples.
`bw`	For `PeakDensityParam`: `numeric(1)` defining the bandwidth (standard deviation of the smoothing kernel) to be used. This argument is passed to the `density()` function.
`minFraction`	For `PeakDensityParam`: `numeric(1)` defining the minimum fraction of samples in at least one sample group in which the peaks have to be present to be considered as a peak group (feature).
`minSamples`	For `PeakDensityParam`: `numeric(1)` with the minimum number of samples in at least one sample group in which the peaks have to be detected to be considered a peak group (feature).
`binSize`	For `PeakDensityParam`: `numeric(1)` defining the size of the overlapping slices in m/z dimension.
`maxFeatures`	For `PeakDensityParam`: `numeric(1)` with the maximum number of peak groups to be identified in a single mz slice.
`sleep`	`numeric(1)` defining the time to sleep between iterations and plot the result from the current iteration.
`index`	An optional `integer` providing the indices of the peaks in the original peak matrix.
`ppm`	For `MzClustParam`: `numeric(1)` representing the relative m/z error for the clustering/grouping (in parts per million). For `PeakDensityParam`: `numeric(1)` to define m/z-dependent, increasing m/z bin sizes. If `ppm = 0` (the default) m/z bins are defined by the sequence of values from the smallest to the largest m/z value with a constant bin size of `binSize`. For `ppm` > 0 the size of each bin is additionally increased by `ppm` parts-per-million of the (upper) m/z boundary of the bin. The maximal bin size (used for the largest m/z values) would then be `binSize` plus `ppm` parts-per-million of the largest m/z value of all peaks in the data set.
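
To make the effect of `ppm` on the bin size concrete, the following base R sketch computes the bin size at a few m/z values as `binSize` plus `ppm` parts-per-million of the m/z. The helper `bin_size_at` and the example m/z values are hypothetical and only illustrate the arithmetic described for `ppm`; they are not the package's internal implementation.

```r
## Hypothetical helper (not part of xcms): size of an m/z bin whose
## upper boundary is at `mz`.
bin_size_at <- function(mz, binSize = 0.25, ppm = 0) {
    binSize + ppm * mz / 1e6
}

## Example (made-up) upper m/z boundaries
mz_bounds <- c(100, 500, 1000)

## ppm = 0: the bin size is constant
constant_sizes <- bin_size_at(mz_bounds)
constant_sizes

## ppm = 20: the bin size grows with m/z
sizes <- bin_size_at(mz_bounds, ppm = 20)
sizes
```

With `ppm = 20` the bins at m/z 1000 are wider than those at m/z 100, which is the m/z-dependent behaviour the parameter is meant to model.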

Details

For overlapping slices along the mz dimension, the function calculates the density distribution of identified peaks along the retention time axis and groups peaks from the same or different samples that are close to each other. See (Smith 2006) for more details.
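
The idea of grouping peaks by their density along the retention time axis can be sketched with base R's `density()` function. This is a simplified illustration under stated assumptions, not the algorithm as implemented in xcms: the toy retention times, the bandwidth of 5 and the rule of assigning each peak to the closest density maximum are all chosen for this example only.

```r
## Toy data: retention times (in seconds) of peaks from three samples
## that fall into the same m/z slice.
rt <- c(100, 102, 101, 160, 158, 161)
smpl <- c(1, 2, 3, 1, 2, 3)

## Smooth the retention time distribution. `bw` is the standard
## deviation of the Gaussian smoothing kernel.
den <- density(rt, bw = 5, from = min(rt) - 15, to = max(rt) + 15, n = 512)

## Local maxima of the density are candidate feature positions. Maxima
## below half of the highest one are ignored in this sketch.
idx <- which(diff(sign(diff(den$y))) == -2) + 1
idx <- idx[den$y[idx] > 0.5 * max(den$y)]
centers <- den$x[idx]

## Assign each peak to the closest density maximum (a simplification).
grp <- vapply(rt, function(z) which.min(abs(centers - z)), integer(1))
split(data.frame(rt = rt, sample = smpl), grp)
```

A larger bandwidth would merge the two groups into one, which is why `bw` usually needs to be adapted to the retention time variation of the data.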

Value

A data.frame, each row representing a (mz-rt) feature (i.e. a peak group) with columns:

"mzmed": median of the peaks' apex mz values.
"mzmin": smallest mz value of all peaks' apex within the feature.
"mzmax": largest mz value of all peaks' apex within the feature.
"rtmed": the median of the peaks' retention times.
"rtmin": the smallest retention time of the peaks in the group.
"rtmax": the largest retention time of the peaks in the group.
"npeaks": the total number of peaks assigned to the feature. Note that this number can be larger than the total number of samples, since multiple peaks from the same sample could be assigned to a feature.
"peakidx": a list with the indices of all peaks in a feature in the peaks input matrix.

Note

The default settings might not be appropriate for all LC/GC-MS setups. In particular, the bw and binSize parameters should be adjusted to the data.

Author(s)

Colin Smith, Johannes Rainer

References

Examples

## Load the test file
library(xcms)
library(MsExperiment)
faahko_sub <- loadXcmsData("faahko_sub2")

## Disable parallel processing for this example
register(SerialParam())

## Extract the matrix with the identified peaks from the result object:
pks <- chromPeaks(faahko_sub)

## Perform the peak grouping with default settings:
res <- do_groupChromPeaks_density(pks, sampleGroups = rep(1, 3))

## The feature definitions:
head(res)

Core API function for chromatographic peak grouping using a nearest neighbor approach

Description

The do_groupChromPeaks_nearest function groups peaks across samples by creating a master peak list and assigning corresponding peaks from all samples to each peak group (i.e. feature). The method is inspired by the correspondence algorithm of MZmine (Katajamaa 2006).

Usage

do_groupChromPeaks_nearest(
  peaks,
  sampleGroups,
  mzVsRtBalance = 10,
  absMz = 0.2,
  absRt = 15,
  kNN = 10
)

Arguments

`peaks`	A `matrix` or `data.frame` with the mz values and retention times of the identified chromatographic peaks in all samples of an experiment. Required columns are `"mz"`, `"rt"` and `"sample"`. The latter should contain `numeric` values representing the index of the sample in which the peak was found.
`sampleGroups`	For `PeakDensityParam`: A vector with the same length as the number of samples, defining the sample group assignments (i.e. which samples belong to which sample group). This parameter is mandatory for `PeakDensityParam` and has to be defined even if there is no sample grouping in the experiment (in which case all samples should be assigned to the same group). Samples for which `NA` is provided will not be considered in the feature definition step. Providing `NA` for all blanks in an experiment will, for example, prevent features from being defined for signals (chromatographic peaks) present only in blank samples.
`mzVsRtBalance`	For `NearestPeaksParam`: `numeric(1)` representing the factor by which m/z values are multiplied before calculating the (Euclidean) distance between two peaks.
`absMz`	For `NearestPeaksParam` and `MzClustParam`: `numeric(1)` maximum tolerated distance for m/z values.
`absRt`	For `NearestPeaksParam`: `numeric(1)` maximum tolerated distance for retention times.
`kNN`	For `NearestPeaksParam`: `integer(1)` representing the number of nearest neighbors to check.
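
The role of `mzVsRtBalance`, `absMz` and `absRt` can be illustrated with a small base R sketch. The helpers `peak_distance` and `within_tolerance` are hypothetical and reflect only one reading of the argument descriptions above (m/z differences are scaled before a Euclidean distance is calculated, and both tolerances must be met); how xcms combines these values internally is not described in this section.

```r
## Hypothetical helper: distance between two peaks, each given as a
## named vector c(mz = , rt = ). m/z differences are scaled by
## `mzVsRtBalance` so that they are comparable to retention time
## differences.
peak_distance <- function(a, b, mzVsRtBalance = 10) {
    sqrt((mzVsRtBalance * (a[["mz"]] - b[["mz"]]))^2 +
         (a[["rt"]] - b[["rt"]])^2)
}

## Hypothetical helper: are two peaks within both absolute tolerances?
within_tolerance <- function(a, b, absMz = 0.2, absRt = 15) {
    abs(a[["mz"]] - b[["mz"]]) <= absMz &&
        abs(a[["rt"]] - b[["rt"]]) <= absRt
}

p1 <- c(mz = 200.0, rt = 300)
p2 <- c(mz = 200.1, rt = 304)
p3 <- c(mz = 200.1, rt = 330)

peak_distance(p1, p2)      # sqrt(1^2 + 4^2)
within_tolerance(p1, p2)   # both differences are within the tolerances
within_tolerance(p1, p3)   # retention times differ by 30 seconds
```

Increasing `mzVsRtBalance` gives m/z differences more weight relative to retention time differences when candidate matches are ranked.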

Value

A list with elements "featureDefinitions" and "peakIndex". "featureDefinitions" is a matrix, each row representing an (mz-rt) feature (i.e. peak group) with columns:

"mzmed": median of the peaks' apex mz values.
"mzmin": smallest mz value of all peaks' apex within the feature.
"mzmax": largest mz value of all peaks' apex within the feature.
"rtmed": the median of the peaks' retention times.
"rtmin": the smallest retention time of the peaks in the feature.
"rtmax": the largest retention time of the peaks in the feature.
"npeaks": the total number of peaks assigned to the feature.

"peakIndex" is a list with the indices of all peaks in a feature in the peaks input matrix.

References

Katajamaa M, Miettinen J, Oresic M: MZmine: Toolbox for processing and visualization of mass spectrometry based molecular profile data. Bioinformatics 2006, 22:634-636.

Core API function for peak grouping using mzClust

Description

The do_groupPeaks_mzClust function performs high resolution correspondence on single spectra samples.

Usage

do_groupPeaks_mzClust(
  peaks,
  sampleGroups,
  ppm = 20,
  absMz = 0,
  minFraction = 0.5,
  minSamples = 1
)

Arguments

`peaks`	A `matrix` or `data.frame` with the mz values and retention times of the identified chromatographic peaks in all samples of an experiment. Required columns are `"mz"`, `"rt"` and `"sample"`. The latter should contain `numeric` values representing the index of the sample in which the peak was found.
`sampleGroups`	For `PeakDensityParam`: A vector with the same length as the number of samples, defining the sample group assignments (i.e. which samples belong to which sample group). This parameter is mandatory for `PeakDensityParam` and has to be defined even if there is no sample grouping in the experiment (in which case all samples should be assigned to the same group). Samples for which `NA` is provided will not be considered in the feature definition step. Providing `NA` for all blanks in an experiment will, for example, prevent features from being defined for signals (chromatographic peaks) present only in blank samples.
`ppm`	For `MzClustParam`: `numeric(1)` representing the relative m/z error for the clustering/grouping (in parts per million). For `PeakDensityParam`: `numeric(1)` to define m/z-dependent, increasing m/z bin sizes. If `ppm = 0` (the default) m/z bins are defined by the sequence of values from the smallest to the largest m/z value with a constant bin size of `binSize`. For `ppm` > 0 the size of each bin is additionally increased by `ppm` parts-per-million of the (upper) m/z boundary of the bin. The maximal bin size (used for the largest m/z values) would then be `binSize` plus `ppm` parts-per-million of the largest m/z value of all peaks in the data set.
`absMz`	For `NearestPeaksParam` and `MzClustParam`: `numeric(1)` maximum tolerated distance for m/z values.
`minFraction`	For `PeakDensityParam`: `numeric(1)` defining the minimum fraction of samples in at least one sample group in which the peaks have to be present to be considered as a peak group (feature).
`minSamples`	For `PeakDensityParam`: `numeric(1)` with the minimum number of samples in at least one sample group in which the peaks have to be detected to be considered a peak group (feature).
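
The `minFraction` and `minSamples` criteria can be sketched in base R as follows. The helper `passes_filter` is hypothetical and encodes one reading of the argument descriptions above: a candidate feature is kept if, in at least one sample group, the number of samples with a peak reaches both the required fraction of the group size and the required absolute count. It is an illustration, not the package's implementation.

```r
## Hypothetical helper: does a candidate feature pass the filter?
## `peak_samples` are the sample indices of the peaks in the candidate.
passes_filter <- function(peak_samples, sampleGroups,
                          minFraction = 0.5, minSamples = 1) {
    grp <- factor(sampleGroups)
    group_size <- as.integer(table(grp))
    ## Count each sample once, even if it contributes several peaks.
    present <- as.integer(table(grp[unique(peak_samples)]))
    any(present >= minFraction * group_size & present >= minSamples)
}

sampleGroups <- c("KO", "KO", "KO", "WT", "WT", "WT")

## Peaks found in 2 of 3 KO samples (sample 2 contributes two peaks)
passes_filter(c(1, 2, 2, 4), sampleGroups)

## Peaks found in only 1 of 3 samples of each group
passes_filter(c(1, 4), sampleGroups)
```

Because the criteria are evaluated per sample group, a signal present in most samples of a single group can still define a feature even if it is absent from all other groups.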

Value

A list with elements "featureDefinitions" and "peakIndex". "featureDefinitions" is a matrix, each row representing an (mz-rt) feature (i.e. peak group) with columns:

"mzmed": median of the peaks' apex mz values.
"mzmin": smallest mz value of all peaks' apex within the feature.
"mzmax": largest mz value of all peaks' apex within the feature.
"rtmed": always -1.
"rtmin": always -1.
"rtmax": always -1.
"npeaks": the total number of peaks assigned to the feature. Note that this number can be larger than the total number of samples, since multiple peaks from the same sample could be assigned to a group.

"peakIndex" is a list with the indices of all peaks in a peak group in the peaks input matrix.

References

Saira A. Kazmi, Samiran Ghosh, Dong-Guk Shin, Dennis W. Hill and David F. Grant
Alignment of high resolution mass spectra: development of a heuristic approach for metabolomics.
Metabolomics, Vol. 2, No. 2, 75-83 (2006)

Filter features based on the dispersion ratio

Description

The 'DratioFilter' class and method enable users to filter features from an 'XcmsExperiment' or 'SummarizedExperiment' object based on the D-ratio or *dispersion ratio*. This is defined as the standard deviation for QC samples divided by the standard deviation for biological test samples, for each feature of the object (Broadhurst et al.).

This 'filter' is part of the possible dispatch of the generic function 'filterFeatures'. Features *above* ('>') the user-input threshold will be removed from the entire dataset.
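
The D-ratio calculation described above can be illustrated with a small base R sketch. The feature matrix, sample indices and threshold are made-up toy values, and the code only mirrors the definition given here (standard deviation in QC samples divided by standard deviation in study samples); it does not use or reproduce the `filterFeatures` implementation.

```r
## Toy feature abundance matrix: rows are features, columns are samples.
vals <- rbind(
    FT1 = c(100, 102, 98, 150, 60, 120),
    FT2 = c(100, 140, 60, 105, 95, 100)
)
qcIndex <- 1:3      # columns with QC samples
studyIndex <- 4:6   # columns with study samples
threshold <- 0.5

## D-ratio per feature: sd in QC samples / sd in study samples.
dratio <- apply(vals, 1, function(z) {
    sd(z[qcIndex], na.rm = TRUE) / sd(z[studyIndex], na.rm = TRUE)
})
dratio

## Features with a D-ratio above the threshold would be removed.
keep <- dratio <= threshold
vals[keep, , drop = FALSE]
```

In this toy example FT2 varies more among the QC samples than among the study samples, so its D-ratio exceeds the threshold and it would be dropped.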

Usage

DratioFilter(
  threshold = 0.5,
  qcIndex = integer(),
  studyIndex = integer(),
  na.rm = TRUE,
  mad = FALSE
)

## S4 method for signature 'XcmsResult,DratioFilter'
filterFeatures(object, filter, ...)

## S4 method for signature 'SummarizedExperiment,DratioFilter'
filterFeatures(object, filter, assay = 1)

Arguments

`threshold`	'numeric' value representing the threshold. Features with a D-ratio strictly higher ('>') than this will be removed from the entire dataset.
`qcIndex`	'integer' (or 'logical') vector corresponding to the indices of QC samples.
`studyIndex`	'integer' (or 'logical') vector corresponding to the indices of study samples.
`na.rm`	'logical' Indicates whether missing values ('NA') should be removed prior to the calculations.
`mad`	'logical' Indicates whether the Median Absolute Deviation (MAD) should be used instead of the standard deviation. This is suggested for non-gaussian distributed data.
`object`	`XcmsExperiment` or `SummarizedExperiment`. For an `XcmsExperiment` object, the `featureValues(object)` will be evaluated, and for a `SummarizedExperiment` the `assay(object, assay)`. The object will be filtered.
`filter`	The parameter object selecting and configuring the type of filtering. It can be one of the following classes: `RsdFilter`, `DratioFilter`, `PercentMissingFilter` or `BlankFlag`.
`...`	Optional parameters. For `object` being an `XcmsExperiment`: parameters for the `featureValues()` call.
`assay`	For filtering of `SummarizedExperiment` objects only. Indicates which assay the filtering will be based on. Note that the features for the entire object will be removed, but the computations are performed on a single assay. Default is 1, which means the first assay of the `object` will be evaluated.

Value

For 'DratioFilter': a 'DratioFilter' object. 'filterFeatures' returns the input object minus the features that did not meet the user-defined threshold.

Author(s)

Philippine Louail

References

Broadhurst D, Goodacre R, Reinke SN, Kuligowski J, Wilson ID, Lewis MR, Dunn WB. Guidelines and considerations for the use of system suitability and quality control samples in mass spectrometry assays applied in untargeted clinical metabolomic studies. Metabolomics. 2018;14(6):72. doi: 10.1007/s11306-018-1367-3. Epub 2018 May 18. PMID: 29805336; PMCID: PMC5960010.

Estimate precursor intensity for MS level 2 spectra

Description

estimatePrecursorIntensity() determines the precursor intensity for a MS 2 spectrum based on the intensity of the respective signal from the neighboring MS 1 spectra (i.e. based on the peak with the m/z matching the precursor m/z of the MS 2 spectrum). Based on parameter method either the intensity of the peak from the previous MS 1 scan is used (method = "previous") or an interpolation between the intensity from the previous and subsequent MS1 scan is used (method = "interpolation", which considers also the retention times of the two MS1 scans and the retention time of the MS2 spectrum).
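
The difference between the two methods can be shown with toy numbers in base R. The retention times and intensities below are invented, and the interpolation is assumed here to be linear in retention time; this is a sketch of the concept, not the code used by `estimatePrecursorIntensity()`.

```r
## Intensities of the matching MS1 peak in the scans before and after
## the MS2 spectrum, with their retention times (in seconds). Toy values.
rt_ms1 <- c(previous = 10, following = 12)
int_ms1 <- c(previous = 1000, following = 2000)
rt_ms2 <- 10.5

## method = "previous": use the intensity from the preceding MS1 scan.
int_previous <- int_ms1[["previous"]]

## method = "interpolation": assumed here to be linear in retention time.
int_interpolated <- approx(x = rt_ms1, y = int_ms1, xout = rt_ms2)$y

c(previous = int_previous, interpolation = int_interpolated)
```

On the rising or falling edge of a chromatographic peak the two estimates can differ noticeably, which is the situation in which interpolation is expected to help.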

Usage

## S4 method for signature 'MsExperiment'
estimatePrecursorIntensity(
  object,
  ppm = 10,
  tolerance = 0,
  method = c("previous", "interpolation"),
  BPPARAM = bpparam()
)

## S4 method for signature 'OnDiskMSnExp'
estimatePrecursorIntensity(
  object,
  ppm = 10,
  tolerance = 0,
  method = c("previous", "interpolation"),
  BPPARAM = bpparam()
)

Arguments

`object`	`MsExperiment`, `XcmsExperiment`, `OnDiskMSnExp` or `XCMSnExp` object.
`ppm`	`numeric(1)` defining the maximal acceptable difference (in ppm) of the precursor m/z and the m/z of the corresponding peak in the MS 1 scan.
`tolerance`	`numeric(1)` with the maximal allowed difference of m/z values between the precursor m/z of a spectrum and the m/z of the respective ion on the MS1 scan.
`method`	`character(1)` defining how the precursor intensity should be determined (see description above for details). Defaults to `method = "previous"`.
`BPPARAM`	parallel processing setup. See `bpparam()` for details.

Value

numeric with length equal to the number of spectra in object. NA is returned for MS 1 spectra or if no matching peak in an MS 1 scan can be found for an MS 2 spectrum.

Author(s)

Johannes Rainer with feedback and suggestions from Corey Broeckling

Empirically Transformed Gaussian function

Description

A general function for asymmetric chromatographic peaks.

Usage

etg(x, H, t1, tt, k1, kt, lambda1, lambdat, alpha, beta)

Arguments

`x`	times to evaluate function at
`H`	peak height
`t1`	time of leading edge inflection point
`tt`	time of trailing edge inflection point
`k1`	leading edge parameter
`kt`	trailing edge parameter
`lambda1`	leading edge parameter
`lambdat`	trailing edge parameter
`alpha`	leading edge parameter
`beta`	trailing edge parameter

Value

The function evaluated at times x.

Author(s)

Colin A. Smith, [email protected]

References

Jianwei Li. Development and Evaluation of Flexible Empirical Peak Functions for Processing Chromatographic Peaks. Anal. Chem., 69 (21), 4452-4462, 1997. http://dx.doi.org/10.1021/ac970481d

Export data for use in MetaboAnalyst

Description

Export the feature table for further analysis in the MetaboAnalyst software (or the MetaboAnalystR R package).

Usage

exportMetaboAnalyst(
  x,
  file = NULL,
  label,
  value = "into",
  digits = NULL,
  groupnames = FALSE,
  ...
)

Arguments

`x`	XCMSnExp object with identified chromatographic peaks grouped across samples.
`file`	`character(1)` defining the file name. If not specified, the `matrix` with the content is returned.
`label`	either `character(1)` specifying the phenodata column in `x` defining the sample grouping or a vector with the same length as the number of samples in `x` defining the group assignment of the samples.
`value`	`character(1)` specifying the value to be returned for each feature. See `featureValues()` for more details.
`digits`	`integer(1)` defining the number of significant digits to be used for numeric. The default `NULL` uses `getOption("digits")`. See `format()` for more information.
`groupnames`	`logical(1)` whether row names of the resulting matrix should be the feature IDs (`groupnames = FALSE`; default) or IDs that are composed of the m/z and retention time of the features, in the format `M<m/z>T<rt>` (`groupnames = TRUE`). See help of the groupnames function for details.
`...`	additional parameters to be passed to the `featureValues()` function.
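
As an illustration of the `M<m/z>T<rt>` ID format mentioned for `groupnames = TRUE`, the following base R sketch builds such IDs from toy m/z and retention time values. The rounding to whole numbers is an assumption made for this example; the number of decimals used by the groupnames function may differ, so its help page remains the reference.

```r
## Toy feature m/z and retention time (in seconds) values
mzmed <- c(205.4, 301.1)
rtmed <- c(2790.2, 3100.7)

## Compose IDs in the format M<m/z>T<rt> (rounding is an assumption)
ids <- paste0("M", round(mzmed), "T", round(rtmed))
ids
```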

Value

If file is not specified, the function returns the matrix in the format supported by MetaboAnalyst.

Author(s)

Johannes Rainer

DEPRECATED: Extract a `data.frame` containing MS data

Description

UPDATE: the extractMsData and plotMsData functions are deprecated and as(x, "data.frame") and plot(x, type = "XIC") (x being an OnDiskMSnExp or XCMSnExp object) should be used instead. See examples below. Be aware that filtering the raw object might however drop the adjusted retention times. In such cases it is advisable to use the applyAdjustedRtime() function prior to filtering.

Extract a data.frame of retention time, mz and intensity values from each file/sample in the provided rt-mz range (or for the full data range if rt and mz are not defined).

Usage

## S4 method for signature 'OnDiskMSnExp'
extractMsData(object, rt, mz, msLevel = 1L)

## S4 method for signature 'XCMSnExp'
extractMsData(
  object,
  rt,
  mz,
  msLevel = 1L,
  adjustedRtime = hasAdjustedRtime(object)
)

Arguments

`object`	A `XCMSnExp` or `OnDiskMSnExp` object.
`rt`	`numeric(2)` with the retention time range from which the data should be extracted.
`mz`	`numeric(2)` with the mz range.
`msLevel`	`integer` defining the MS level(s) to which the data should be sub-setted prior to extraction; defaults to `msLevel = 1L`.
`adjustedRtime`	(for `extractMsData,XCMSnExp`): `logical(1)` specifying if adjusted or raw retention times should be reported. Defaults to adjusted retention times, if these are present in `object`.

Value

A list of length equal to the number of samples/files in object. Each element is a data.frame with columns "rt", "mz" and "i" containing the retention time, mz and intensity tuples of a file. If no data is available for the mz-rt range in a file, a data.frame with 0 rows is returned for that file.

Author(s)

Johannes Rainer

Examples


## Load a test data set with detected peaks
library(MSnbase)
data(faahko_sub)
## Update the path to the files for the local system
dirname(faahko_sub) <- system.file("cdf/KO", package = "faahKO")

## Disable parallel processing for this example
register(SerialParam())

## Extract the full MS data for a certain retention time range
## as a data.frame
tmp <- filterRt(faahko_sub, rt = c(2800, 2900))
ms_all <- as(tmp, "data.frame")
head(ms_all)
nrow(ms_all)

Compounding of LC-MS features

Description

Feature compounding aims at identifying and grouping LC-MS features representing different ions or adducts (including isotopes) of the same originating compound. The MsFeatures package provides a general framework and functionality to group features based on different properties. The groupFeatures methods for XcmsExperiment() or XCMSnExp objects implemented in xcms extend these to enable the compounding of LC-MS data, also considering e.g. feature peak shapes. Note that these functions simply define feature groups but don't actually aggregate or combine the features.

See MsFeatures::groupFeatures() for an overview on the general feature grouping concept as well as details on the individual settings and parameters.

The available options for groupFeatures on xcms preprocessing results (i.e. on XcmsExperiment or XCMSnExp objects after correspondence analysis with groupChromPeaks()) are:

Grouping by similar retention times: SimilarRtimeParam().
Grouping by similar feature values across samples: AbundanceSimilarityParam().
Grouping by similar peak shape of extracted ion chromatograms: EicSimilarityParam().

An ideal workflow grouping features should sequentially perform the above methods (in the listed order).

Compounded feature groups can be accessed with the featureGroups function.

Usage

## S4 method for signature 'XcmsResult'
featureGroups(object)

## S4 replacement method for signature 'XcmsResult'
featureGroups(object) <- value

Arguments

`object`	an `XcmsExperiment()` or `XCMSnExp()` object with LC-MS pre-processing results.
`value`	for `featureGroups<-`: replacement for the feature groups in `object`. Has to be of length 1 or length equal to the number of features in `object`.

Author(s)

Johannes Rainer, Mar Garcia-Aloy, Vinicius Veri Hernandes

Extract ion chromatograms for each feature

Description

Extract ion chromatograms for features in an XcmsExperiment or XCMSnExp object. The function returns for each feature the extracted ion chromatograms (along with all associated chromatographic peaks) in each sample. The chromatogram is extracted from the m/z - rt region that includes all chromatographic peaks of a feature. By default, this region is defined using the range of the chromatographic peaks' m/z and retention times (with mzmin = min, mzmax = max, rtmin = min and rtmax = max). For some features, and depending on the data, the m/z and rt range can thus be relatively large. The boundaries of the m/z - rt region can also be restricted by changing parameters mzmin, mzmax, rtmin and rtmax to a different function, such as median.

By default only chromatographic peaks associated with a feature are included in the returned XChromatograms object. For object being an XCMSnExp object, parameter include also allows returning all chromatographic peaks with their apex position within the selected region (include = "apex_within") or any chromatographic peak overlapping the m/z and retention time range (include = "any").
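
How the choice of functions for mzmin, mzmax, rtmin and rtmax changes the extracted region can be sketched in base R. The peak matrix and the helper `feature_region` are hypothetical and only mimic the idea of applying one function per boundary column; they are not the xcms implementation.

```r
## Toy chromatographic peaks of one feature (one row per peak).
pks <- cbind(
    mzmin = c(200.00, 200.02, 199.98),
    mzmax = c(200.10, 200.12, 200.20),
    rtmin = c(295, 298, 290),
    rtmax = c(310, 312, 330)
)

## Hypothetical helper mirroring the mzmin, mzmax, rtmin and rtmax
## parameters: each is a function applied to the respective column.
feature_region <- function(pks, mzmin = min, mzmax = max,
                           rtmin = min, rtmax = max) {
    c(mzmin = mzmin(pks[, "mzmin"]), mzmax = mzmax(pks[, "mzmax"]),
      rtmin = rtmin(pks[, "rtmin"]), rtmax = rtmax(pks[, "rtmax"]))
}

## Default: full range over all peaks of the feature
region_full <- feature_region(pks)
region_full

## Tighter region using the median of the peak boundaries
region_median <- feature_region(pks, mzmin = median, mzmax = median,
                                rtmin = median, rtmax = median)
region_median
```

In this toy case the third peak is much wider than the other two, so the median-based region is clearly narrower than the default one.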

Usage

featureChromatograms(object, ...)

## S4 method for signature 'XcmsExperiment'
featureChromatograms(
  object,
  expandRt = 0,
  expandMz = 0,
  aggregationFun = "max",
  features = character(),
  return.type = "XChromatograms",
  chunkSize = 2L,
  mzmin = min,
  mzmax = max,
  rtmin = min,
  rtmax = max,
  ...,
  progressbar = TRUE,
  BPPARAM = bpparam()
)

## S4 method for signature 'XCMSnExp'
featureChromatograms(
  object,
  expandRt = 0,
  aggregationFun = "max",
  features,
  include = c("feature_only", "apex_within", "any", "all"),
  filled = FALSE,
  n = length(fileNames(object)),
  value = c("maxo", "into"),
  expandMz = 0,
  ...
)

Arguments

`object`	`XcmsExperiment` or `XCMSnExp` object with grouped chromatographic peaks.
`...`	optional arguments to be passed along to the `chromatogram()` function.
`expandRt`	`numeric(1)` to expand the retention time range for each chromatographic peak by a constant value on each side.
`expandMz`	`numeric(1)` to expand the m/z range for each chromatographic peak by a constant value on each side. Be aware that by extending the m/z range the extracted EIC might no longer represent the actual identified chromatographic peak because intensities of potential additional mass peaks within each spectra would be aggregated into the final reported intensity value per spectrum (retention time).
`aggregationFun`	`character(1)` specifying the name of the function that should be used to aggregate intensity values across the m/z value range for the same retention time. The default `"max"` returns a base peak chromatogram.
`features`	`integer`, `character` or `logical` defining a subset of features for which chromatograms should be returned. Can be the index of the features in `featureDefinitions`, feature IDs (row names of `featureDefinitions`) or a logical vector.
`return.type`	`character(1)` defining how the result should be returned. At present only `return.type = "XChromatograms"` is supported and the results are thus returned as an `XChromatograms()` object.
`chunkSize`	For `object` being an `XcmsExperiment`: `integer(1)` defining the number of files from which the data should be loaded at a time into memory. Defaults to `chunkSize = 2L`.
`mzmin`	`function` defining how the lower boundary of the m/z region from which the EIC is integrated should be defined. Defaults to `mzmin = min` thus the smallest `"mzmin"` value for all chromatographic peaks of a feature will be used.
`mzmax`	`function` defining how the upper boundary of the m/z region from which the EIC is integrated should be defined. Defaults to `mzmax = max` thus the largest `"mzmax"` value for all chromatographic peaks of a feature will be used.
`rtmin`	`function` defining how the lower boundary of the rt region from which the EIC is integrated should be defined. Defaults to `rtmin = min` thus the smallest `"rtmin"` value for all chromatographic peaks of a feature will be used.
`rtmax`	`function` defining how the upper boundary of the rt region from which the EIC is integrated should be defined. Defaults to `rtmax = max` thus the largest `"rtmax"` value for all chromatographic peaks of a feature will be used.
`progressbar`	`logical(1)` defining whether a progress bar is shown.
`BPPARAM`	For `object` being an `XcmsExperiment`: parallel processing setup. Defaults to `BPPARAM = bpparam()`. See `BiocParallel::bpparam()` for more information.
`include`	Only for `object` being an `XCMSnExp`: `character(1)` defining which chromatographic peaks (and related feature definitions) should be included in the returned `XChromatograms()`. Defaults to `"feature_only"`; See description above for options and details.
`filled`	Only for `object` being an `XCMSnExp`: `logical(1)` whether filled-in peaks should be included in the result object. The default is `filled = FALSE`, i.e. only detected peaks are reported.
`n`	Only for `object` being an `XCMSnExp`: `integer(1)` to optionally specify the number of top n samples from which the EIC should be extracted.
`value`	Only for `object` being an `XCMSnExp`: `character(1)` specifying the column to be used to sort the samples. Can be either `"maxo"` (the default) or `"into"` to use the maximal peak intensity or the integrated peak area, respectively.

Value

XChromatograms() object. In future, depending on parameter return.type, the data might be returned as a different object.

Note

The EIC data of a feature is extracted from every sample using the same m/z - rt area. The EIC in a sample does thus not exactly represent the signal of the actually identified chromatographic peak in that sample. The chromPeakChromatograms() function would allow to extract the actual EIC of the chromatographic peak in a specific sample. See also examples below.

Parameters include, filled, n and value are only supported for object being an XCMSnExp.

When extracting EICs from only the top n samples, one or more of the features specified with features may be dropped because they have no detected peak in the top n samples. This is less likely to happen if object also contains filled-in peaks (added with fillChromPeaks).

Author(s)

Johannes Rainer

Examples


## Load a test data set with detected peaks
library(xcms)
library(MsExperiment)
faahko_sub <- loadXcmsData("faahko_sub2")

## Disable parallel processing for this example
register(SerialParam())

## Perform correspondence analysis
xdata <- groupChromPeaks(faahko_sub,
    param = PeakDensityParam(minFraction = 0.8, sampleGroups = rep(1, 3)))

## Get the feature definitions
featureDefinitions(xdata)

## Extract ion chromatograms for the first 3 features. Parameter
## `features` can be either the feature IDs or feature indices.
chrs <- featureChromatograms(xdata,
    features = rownames(featureDefinitions(xdata))[1:3])

## Plot the EIC for the first feature using different colors for each file.
plot(chrs[1, ], col = c("red", "green", "blue"))

## The EICs for all 3 samples use the same m/z and retention time range,
## which was defined using the `featureArea` function:
featureArea(xdata, features = rownames(featureDefinitions(xdata))[1:3],
    mzmin = min, mzmax = max, rtmin = min, rtmax = max)

## To extract the actual (exact) EICs for each chromatographic peak of
## a feature in each sample, the `chromPeakChromatograms` function would
## need to be used instead. Below we extract the EICs for all
## chromatographic peaks of the first feature. We need to first get the
## IDs of all chromatographic peaks assigned to the first feature:
peak_ids <- rownames(chromPeaks(xdata))[featureDefinitions(xdata)$peakidx[[1L]]]

## We can now pass these to the `chromPeakChromatograms` function with
## parameter `peaks`:
eic_1 <- chromPeakChromatograms(xdata, peaks = peak_ids)

## To plot these into a single plot we need to use the
## `plotChromatogramsOverlay` function:
plotChromatogramsOverlay(eic_1)

Extract spectra associated with features

Description

This function returns spectra associated with the identified features in the input object. By default, spectra are returned for all features (from all MS levels), but parameter features allows to specify/select features for which the result should be returned. Parameter msLevel allows to define whether MS level 1 or 2 spectra should be returned. For msLevel = 1L all MS1 spectra within the retention time range of each chromatographic peak (in that respective data file) associated with a feature are returned. Note that for samples in which no peak was identified (or even filled-in) no spectra are returned. For msLevel = 2L all MS2 spectra with a retention time within the retention time range and their precursor m/z within the m/z range of any chromatographic peak of a feature are returned.

See also chromPeakSpectra() (used internally to extract spectra for each chromatographic peak of a feature) for additional information, specifically also on parameter method. By default (method = "all") all spectra associated with any of the chromatographic peaks of a feature are returned. With any other option for method, a single spectrum per chromatographic peak will be returned (hence multiple spectra per feature).

The information from featureDefinitions for each feature can be included in the returned Spectra::Spectra() object using the featureColumns parameter. This is useful for keeping details such as the median retention time (rtmed) or median m/z (mzmed). The columns will retain their names as specified in the featureDefinitions object, prefixed by "feature_" (e.g., "feature_mzmed"). Additionally, the feature ID (i.e., the row name of the feature in the featureDefinitions data.frame) is always added as a metadata column named "feature_id".

See also chromPeakSpectra(), as it supports a similar parameter for including columns from the chromatographic peaks in the returned spectra object. These parameters can be used in combination to include information from both the chromatographic peaks and the features in the returned Spectra::Spectra(). The peak ID (i.e., the row name of the peak in the chromPeaks matrix) is added as a metadata column named "chrom_peak_id".
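The MS2 selection rule described above (retention time within the peak's retention time range and precursor m/z within the peak's m/z range) can be illustrated with a small base R sketch. This is not the xcms implementation: the data, the helper `match_ms2` and the inclusive treatment of the range boundaries are assumptions made for illustration only, and the `ppm` parameter is left out.

```r
## Hypothetical MS2 spectra metadata (values are made up).
ms2 <- data.frame(
    spectrum_id = c("sp1", "sp2", "sp3", "sp4"),
    rtime = c(100, 105, 105, 130),
    precursorMz = c(300.10, 300.10, 310.00, 300.10),
    stringsAsFactors = FALSE
)

## Hypothetical m/z - rt boundaries of one chromatographic peak.
peak <- c(rtmin = 102, rtmax = 110, mzmin = 300.05, mzmax = 300.15)

## Hypothetical helper: IDs of the spectra whose retention time and
## precursor m/z both fall into the (optionally expanded) peak area.
## Boundaries are treated as inclusive here, which is an assumption.
match_ms2 <- function(ms2, peak, expandRt = 0, expandMz = 0) {
    in_rt <- ms2$rtime >= peak[["rtmin"]] - expandRt &
        ms2$rtime <= peak[["rtmax"]] + expandRt
    in_mz <- ms2$precursorMz >= peak[["mzmin"]] - expandMz &
        ms2$precursorMz <= peak[["mzmax"]] + expandMz
    ms2$spectrum_id[in_rt & in_mz]
}

hits <- match_ms2(ms2, peak)                     # only "sp2" matches
hits_wide <- match_ms2(ms2, peak, expandRt = 2)  # "sp1" is now included too
```

Expanding the retention time range by 2 on each side is what brings the spectrum at retention time 100 into the selection, which mirrors the described effect of `expandRt`.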

Usage

featureSpectra(object, ...)

## S4 method for signature 'XcmsExperiment'
featureSpectra(
  object,
  msLevel = 2L,
  expandRt = 0,
  expandMz = 0,
  ppm = 0,
  skipFilled = FALSE,
  return.type = c("Spectra", "List"),
  features = character(),
  featureColumns = c("rtmed", "mzmed"),
  ...
)

## S4 method for signature 'XCMSnExp'
featureSpectra(
  object,
  msLevel = 2L,
  expandRt = 0,
  expandMz = 0,
  ppm = 0,
  skipFilled = FALSE,
  return.type = c("MSpectra", "Spectra", "list", "List"),
  features = character(),
  ...
)

Arguments

`object`	XcmsExperiment or XCMSnExp object with feature definitions.
`...`	additional arguments to be passed along to `chromPeakSpectra()`, such as `method`.
`msLevel`	`integer(1)` defining the MS level of the spectra that should be returned.
`expandRt`	`numeric(1)` to expand the retention time range of each peak by a constant value on each side.
`expandMz`	`numeric(1)` to expand the m/z range of each peak by a constant value on each side.
`ppm`	`numeric(1)` to expand the m/z range of each peak (on each side) by a value dependent on the peak's m/z.
`skipFilled`	`logical(1)` whether spectra for filled-in peaks should be reported or not.
`return.type`	`character(1)` defining the type of result object that should be returned.
`features`	`character`, `logical` or `integer` specifying a subset of features in `featureDefinitions` for which spectra should be returned (providing either their ID, a logical vector of the same length as `nrow(featureDefinitions(x))` or their index in `featureDefinitions(x)`). This parameter overrides `skipFilled` and is only supported for `return.type` being either `"Spectra"` or `"List"`.
`featureColumns`	`character` vector with the names of the columns from `featureDefinitions` that should be added to the returned spectra object. The columns will be named as they are written in the `featureDefinitions` object with the prefix `"feature_"`. Defaults to `c("rtmed", "mzmed")`.

Value

The function returns either a Spectra::Spectra() (for return.type = "Spectra") or a List of Spectra (for return.type = "List"). For the latter, the order and the length matches parameter features (or if no features is defined the order of the features in featureDefinitions(object)).

Spectra variables "chrom_peak_id" and "feature_id" define the chromatographic peak and the feature with which each individual spectrum is associated.

Author(s)

Johannes Rainer

Simple feature summaries

Description

Simple function to calculate feature summaries. These include counts and percentages of samples in which a chromatographic peak is present for each feature and counts and percentages of samples in which more than one chromatographic peak was annotated to the feature. Also relative standard deviations (RSD) are calculated for the integrated peak areas per feature across samples. For 'perSampleCounts = TRUE' also the individual chromatographic peak counts per sample are returned.

Usage

featureSummary(
  x,
  group,
  perSampleCounts = FALSE,
  method = "maxint",
  skipFilled = TRUE
)

Arguments

`x`	[XcmsExperiment()] or [XCMSnExp()] object with correspondence results.
`group`	'numeric', 'logical', 'character' or 'factor' with a length equal to the number of samples in 'x', used to aggregate counts by the groups defined in 'group'.
`perSampleCounts`	'logical(1)' whether feature wise individual peak counts per sample should be returned too.
`method`	'character' passed to the [featureValues()] function. See respective help page for more information.
`skipFilled`	'logical(1)' whether filled-in peaks should be excluded (default) or included in the summary calculation.

Value

'matrix' with one row per feature and columns:

- '"count"': the total number of samples in which a peak was found.
- '"perc"': the percentage of samples in which a peak was found.
- '"multi_count"': the total number of samples in which more than one peak was assigned to the feature.
- '"multi_perc"': the percentage of those samples in which a peak was found, that have also multiple peaks annotated to the feature. Example: for a feature, at least one peak was detected in 50 samples. In 5 of them 2 peaks were assigned to the feature. '"multi_perc"' is in this case 10.
- '"rsd"': relative standard deviation (coefficient of variation) of the integrated peak area of the feature's peaks.
- The same 4 columns are repeated for each unique element (level) in 'group' if 'group' was provided.

If 'perSampleCounts = TRUE' also one column for each sample is returned with the peak counts per sample.
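
The summary columns can be reproduced by hand for a single feature. The following base R sketch uses made-up per-sample peak counts that match the example above (a peak in 50 samples, 5 of them with two peaks). It only illustrates the arithmetic and is not the code used by featureSummary. That the RSD is reported as a fraction (standard deviation divided by mean) rather than a percentage is an assumption.

```r
## Hypothetical number of chromatographic peaks assigned to one feature in
## each of 100 samples: 45 samples with one peak, 5 with two, 50 with none.
peak_counts <- c(rep(1L, 45), rep(2L, 5), rep(0L, 50))

count <- sum(peak_counts > 0)                  # samples with a peak
perc <- 100 * count / length(peak_counts)      # ... as percentage of all samples
multi_count <- sum(peak_counts > 1)            # samples with more than one peak
multi_perc <- 100 * multi_count / count        # ... relative to samples with a peak

## Hypothetical integrated peak areas of the feature across samples; NA
## marks samples without a peak. RSD as sd / mean (assumed not in percent).
into <- c(1200, 1500, NA, 900, 1350)
rsd <- sd(into, na.rm = TRUE) / mean(into, na.rm = TRUE)

c(count = count, perc = perc, multi_count = multi_count,
  multi_perc = multi_perc, rsd = rsd)
```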

Author(s)

Johannes Rainer

Gap Filling

Description

Gap filling integrates the signal in the m/z-rt area of a feature (i.e., a chromatographic peak group) for samples in which no chromatographic peak for this feature was identified and adds it to the chromPeaks() matrix. Such filled-in peaks are indicated with a TRUE in column "is_filled" in the result object's chromPeakData() data frame.

The method for gap filling along with its settings can be defined with the param argument. Two different approaches are available:

param = FillChromPeaksParam(): the default of the original xcms code. Signal is integrated from the m/z and retention time range as defined in the featureDefinitions() data frame, i.e. from the "rtmin", "rtmax", "mzmin" and "mzmax". This method is not suggested as it underestimates the actual peak area and it is also not available for object being an XcmsExperiment object. See details below for more information and settings for this method.
param = ChromPeakAreaParam(): the area from which the signal for a feature is integrated is defined based on the feature's chromatographic peak areas. The m/z range is by default defined as the lower quartile of the chromatographic peaks' "mzmin" values to the upper quartile of the chromatographic peaks' "mzmax" values. The retention time range for the area is defined analogously. Alternatively, by setting mzmin = median, mzmax = median, rtmin = median and rtmax = median in ChromPeakAreaParam, the median "mzmin", "mzmax", "rtmin" and "rtmax" values from all detected chromatographic peaks of a feature would be used instead. In contrast to the FillChromPeaksParam approach this method uses (all) identified chromatographic peaks of a feature to define the area from which the signal should be integrated.
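
How the choice of summary functions changes the integration area can be seen in a small base R sketch. The peak boundaries below are made up and the helper `feature_area` is hypothetical. The sketch only mimics the described behaviour of the mzmin, mzmax, rtmin and rtmax functions and is not the internal xcms code.

```r
## Hypothetical boundaries of the detected chromatographic peaks of one
## feature (one row per peak; values are made up).
peaks <- data.frame(
    mzmin = c(200.010, 200.012, 200.011, 200.009),
    mzmax = c(200.020, 200.022, 200.021, 200.025),
    rtmin = c(100, 102, 101, 99),
    rtmax = c(110, 112, 111, 115)
)

## Hypothetical helper: apply one summary function per boundary column.
feature_area <- function(peaks, mzmin, mzmax, rtmin, rtmax) {
    c(mzmin = mzmin(peaks$mzmin), mzmax = mzmax(peaks$mzmax),
      rtmin = rtmin(peaks$rtmin), rtmax = rtmax(peaks$rtmax))
}

## The documented defaults: lower and upper quartile.
lower <- function(z) quantile(z, probs = 0.25, names = FALSE)
upper <- function(z) quantile(z, probs = 0.75, names = FALSE)

area_quartile <- feature_area(peaks, lower, upper, lower, upper)
area_median <- feature_area(peaks, median, median, median, median)
area_full <- feature_area(peaks, min, max, min, max)

rbind(quartile = area_quartile, median = area_median, full = area_full)
```

With these values the quartile-based area is wider than the median-based one and narrower than the full min/max area, so a single unusually broad peak (the fourth row) has less influence on the region used for integration.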

expandMz,expandMz<-: getter and setter for the expandMz slot of the object.

expandRt,expandRt<-: getter and setter for the expandRt slot of the object.

ppm,ppm<-: getter and setter for the ppm slot of the object.

Usage

fillChromPeaks(object, param, ...)

## S4 method for signature 'XcmsExperiment,ChromPeakAreaParam'
fillChromPeaks(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  BPPARAM = bpparam()
)

FillChromPeaksParam(
  expandMz = 0,
  expandRt = 0,
  ppm = 0,
  fixedMz = 0,
  fixedRt = 0
)

fixedRt(object)

fixedMz(object)

ChromPeakAreaParam(
  mzmin = function(z) quantile(z, probs = 0.25, names = FALSE),
  mzmax = function(z) quantile(z, probs = 0.75, names = FALSE),
  rtmin = function(z) quantile(z, probs = 0.25, names = FALSE),
  rtmax = function(z) quantile(z, probs = 0.75, names = FALSE)
)

## S4 method for signature 'FillChromPeaksParam'
expandMz(object)

## S4 replacement method for signature 'FillChromPeaksParam'
expandMz(object) <- value

## S4 method for signature 'FillChromPeaksParam'
expandRt(object)

## S4 replacement method for signature 'FillChromPeaksParam'
expandRt(object) <- value

## S4 method for signature 'FillChromPeaksParam'
ppm(object)

## S4 replacement method for signature 'FillChromPeaksParam'
ppm(object) <- value

## S4 method for signature 'XCMSnExp,FillChromPeaksParam'
fillChromPeaks(object, param, msLevel = 1L, BPPARAM = bpparam())

## S4 method for signature 'XCMSnExp,ChromPeakAreaParam'
fillChromPeaks(object, param, msLevel = 1L, BPPARAM = bpparam())

## S4 method for signature 'XCMSnExp,missing'
fillChromPeaks(object, param, BPPARAM = bpparam(), msLevel = 1L)

Arguments

`object`	`XcmsExperiment` or `XCMSnExp` object with identified and grouped chromatographic peaks.
`param`	`ChromPeakAreaParam` or `FillChromPeaksParam` object defining which approach should be used (see details section).
`...`	currently ignored.
`msLevel`	`integer(1)` defining the MS level on which peak filling should be performed (defaults to `msLevel = 1L`). Only peak filling on one MS level at a time is supported, to fill in peaks for MS level 1 and 2 run first using `msLevel = 1` and then (on the returned result object) again with `msLevel = 2`.
`chunkSize`	For `fillChromPeaks` if `object` is an `XcmsExperiment`: `integer(1)` defining the number of files (samples) that should be loaded into memory and processed at the same time. This setting thus allows to balance between memory demand and speed (due to parallel processing). Because parallel processing can only be performed on the subset of data currently loaded into memory in each iteration, the value for `chunkSize` should match the defined parallel processing setup. Using a parallel processing setup with 4 CPUs (separate processes) but `chunkSize = 1` will not perform any parallel processing, as only the data from one sample is loaded in memory at a time. On the other hand, setting `chunkSize` to the total number of samples in an experiment will load the full MS data into memory and will thus in most settings cause an out-of-memory error.
`BPPARAM`	Parallel processing settings.
`expandMz`	for `FillChromPeaksParam`: `numeric(1)` defining the value by which the mz width of peaks should be expanded. Each peak is expanded in mz direction by `expandMz *` their original m/z width. A value of `0` means no expansion, a value of `1` grows each peak by `1 *` the m/z width of the peak resulting in peaks with twice their original size in m/z direction (expansion by half m/z width to both sides).
`expandRt`	for `FillChromPeaksParam`: `numeric(1)`, same as `expandMz` but for the retention time width.
`ppm`	for `FillChromPeaksParam`: `numeric(1)` optionally specifying a ppm by which the m/z width of the peak region should be expanded. For peaks with an m/z width smaller than `mean(c(mzmin, mzmax)) * ppm / 1e6`, the `mzmin` will be replaced by `mean(c(mzmin, mzmax)) - (mean(c(mzmin, mzmax)) * ppm / 2 / 1e6)` and `mzmax` by `mean(c(mzmin, mzmax)) + (mean(c(mzmin, mzmax)) * ppm / 2 / 1e6)`. This is applied before eventually expanding the m/z width using the `expandMz` parameter.
`fixedMz`	for `FillChromPeaksParam`: `numeric(1)` defining a constant factor by which the m/z width of each feature is to be expanded. The m/z width is expanded on both sides by `fixedMz` (i.e. `fixedMz` is subtracted from the lower m/z and added to the upper m/z). This expansion is applied after `expandMz` and `ppm`.
`fixedRt`	for `FillChromPeaksParam`: `numeric(1)` defining a constant factor by which the retention time width of each feature is to be expanded. The rt width is expanded on both sides by `fixedRt` (i.e. `fixedRt` is subtracted from the lower rt and added to the upper rt). This expansion is applied after `expandRt`.
`mzmin`	`function` to be applied to values in the `"mzmin"` column of all chromatographic peaks of a feature to define the lower m/z value of the area from which signal for the feature should be integrated. Defaults to `mzmin = function(z) quantile(z, probs = 0.25)` hence using the 25% quantile of all values.
`mzmax`	`function` to be applied to values in the `"mzmax"` column of all chromatographic peaks of a feature to define the upper m/z value of the area from which signal for the feature should be integrated. Defaults to `mzmax = function(z) quantile(z, probs = 0.75)` hence using the 75% quantile of all values.
`rtmin`	`function` to be applied to values in the `"rtmin"` column of all chromatographic peaks of a feature to define the lower rt value of the area from which signal for the feature should be integrated. Defaults to `rtmin = function(z) quantile(z, probs = 0.25)` hence using the 25% quantile of all values.
`rtmax`	`function` to be applied to values in the `"rtmax"` column of all chromatographic peaks of a feature to define the upper rt value of the area from which signal for the feature should be integrated. Defaults to `rtmax = function(z) quantile(z, probs = 0.75)` hence using the 75% quantile of all values.
`value`	The value for the slot.

Details

After correspondence (i.e. grouping of chromatographic peaks across samples) there will always be features (peak groups) that do not include peaks from every sample. The fillChromPeaks method defines intensity values for such features in the missing samples by integrating the signal in the m/z-rt region of the feature. Two different approaches to define this region are available: with ChromPeakAreaParam the region is defined based on the detected chromatographic peaks of a feature, while with FillChromPeaksParam the region is defined based on the m/z and retention times of the feature (which represent the m/z and retention times of the apex position of the associated chromatographic peaks). For the latter approach various parameters are available to increase the area from which signal is to be integrated, either by a constant value (fixedMz and fixedRt) or by a feature-relative amount (expandMz and expandRt).
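
The order in which the FillChromPeaksParam settings widen the m/z range (first `ppm`, then `expandMz`, then `fixedMz`, as stated in the argument descriptions) can be sketched in base R. The helper `expand_mz_range` is hypothetical and only follows the formulas given above. It is not the function used internally by xcms. The retention time range would be treated analogously with `expandRt` and `fixedRt`, without a ppm step.

```r
## Hypothetical helper following the documented order of the expansions.
expand_mz_range <- function(mzmin, mzmax, ppm = 0, expandMz = 0, fixedMz = 0) {
    mzmean <- mean(c(mzmin, mzmax))
    ## Step 1: enforce a minimal, ppm-dependent m/z width.
    if ((mzmax - mzmin) < mzmean * ppm / 1e6) {
        mzmin <- mzmean - mzmean * ppm / 2 / 1e6
        mzmax <- mzmean + mzmean * ppm / 2 / 1e6
    }
    ## Step 2: relative expansion, half of `expandMz * width` on each side.
    width <- mzmax - mzmin
    mzmin <- mzmin - width * expandMz / 2
    mzmax <- mzmax + width * expandMz / 2
    ## Step 3: constant expansion on both sides.
    c(mzmin = mzmin - fixedMz, mzmax = mzmax + fixedMz)
}

## expandMz = 1 doubles the original width of 0.01.
r_expand <- expand_mz_range(200.00, 200.01, expandMz = 1)

## A zero-width range is widened to 10 ppm of m/z 500, i.e. 0.005.
r_ppm <- expand_mz_range(500, 500, ppm = 10)

## fixedMz is subtracted from the lower and added to the upper boundary.
r_fixed <- expand_mz_range(200.00, 200.01, fixedMz = 0.01)
```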

Adjusted retention times will be used if available.

Based on the peak finding algorithm that was used to identify the (chromatographic) peaks, different internal functions are used to guarantee that the integrated peak signal matches as much as possible the peak signal integration used during the peak detection. For peaks identified with the matchedFilter() method, signal integration is performed on the profile matrix generated with the same settings used also during peak finding (using the same bin size for example). For direct injection data and peaks identified with the MSW algorithm signal is integrated only along the mz dimension. For all other methods the complete (raw) signal within the area is used.

Value

An XcmsExperiment or XCMSnExp object with previously missing chromatographic peaks for features filled into its chromPeaks() matrix.

The FillChromPeaksParam function returns a FillChromPeaksParam object.

Slots

expandMz,expandRt,ppm,fixedMz,fixedRt: See corresponding parameter above.
rtmin,rtmax,mzmin,mzmax: See corresponding parameter above.

Note

The reported "mzmin", "mzmax", "rtmin" and "rtmax" for the filled peaks represent the actual MS area from which the signal was integrated.

No peak is filled in if no signal was present in a file/sample in the respective mz-rt area. These samples will still show an NA in the matrix returned by the featureValues() method.

Author(s)

Johannes Rainer

Examples


## Load a test data set with identified chromatographic peaks
library(xcms)
library(MsExperiment)
res <- loadXcmsData("faahko_sub2")

## Disable parallel processing for this example
register(SerialParam())

## Perform the correspondence. We assign all samples to the same group.
res <- groupChromPeaks(res,
    param = PeakDensityParam(sampleGroups = rep(1, length(res))))

## For how many features do we lack an integrated peak signal?
sum(is.na(featureValues(res)))

## Filling missing peak data using the peak area from identified
## chromatographic peaks.
res <- fillChromPeaks(res, param = ChromPeakAreaParam())

## How many missing values do we have after peak filling?
sum(is.na(featureValues(res)))

## Get the peaks that have been filled in:
fp <- chromPeaks(res)[chromPeakData(res)$is_filled, ]
head(fp)

## Get the process history step along with the parameters used to perform
## The peak filling:
ph <- processHistory(res, type = "Missing peak filling")[[1]]
ph

## The parameter class:
ph@param

## It is also possible to remove filled-in peaks:
res <- dropFilledChromPeaks(res)

sum(is.na(featureValues(res)))

Integrate areas of missing peaks

Description

For each sample, identify peak groups where that sample is not represented. For each of those peak groups, integrate the signal in the region of that peak group and create a new peak.

Arguments

`object`	the `xcmsSet` object
`method`	the filling method

Details

After peak grouping, there will always be peak groups that do not include peaks from every sample. This method produces intensity values for those missing samples by integrating raw data in the peak group region. Depending on the type of raw data, two different methods are available. For filling GC-MS/LC-MS data the method "chrom" integrates raw data in the chromatographic domain, whereas "MSW" is used for peak lists without retention time information, like those from direct-infusion spectra.

Value

An xcmsSet object with filled in peak groups.

Methods

object = "xcmsSet": fillPeaks(object, method="")

Integrate areas of missing peaks

Description

For each sample, identify peak groups where that sample is not represented. For each of those peak groups, integrate the signal in the region of that peak group and create a new peak.

Arguments

`object`	the `xcmsSet` object
`nSlaves`	(DEPRECATED): number of slaves/cores to be used for parallel peak filling. MPI is used if installed, otherwise the snow package is employed for multicore support. If none of the two packages is available it uses the parallel package for parallel processing on multiple CPUs of the current machine. Users are advised to use the `BPPARAM` parameter instead.
`expand.mz`	Expansion factor for the m/z range used for integration.
`expand.rt`	Expansion factor for the retention time range used for integration.
`BPPARAM`	allows to define a specific parallel processing setup for the current task (see the help of `bpparam` from the `BiocParallel` package for more information). The default uses the globally defined parallel setup.

Details

After peak grouping, there will always be peak groups that do not include peaks from every sample. This method produces intensity values for those missing samples by integrating raw data in the peak group region. In a given group, the start and ending retention time points for integration are defined by the median start and end points of the other detected peaks. The start and end m/z values are similarly determined. Intensities can still be zero, which is a rather unusual intensity for a peak. This is the case if e.g. the raw data was thresholded and the integration area contains no actual raw intensities, or if one sample is miscalibrated such that the raw data points are (just) outside the integration area.

Importantly, if retention time correction data is available, the alignment information is used to more precisely integrate the proper region of the raw data. If the corrected retention time is beyond the end of the raw data, the value will be not-a-number (NaN).

Value

An xcmsSet object with filled in peak groups (into and maxo).

Methods

object = "xcmsSet": fillPeaks.chrom(object, nSlaves=0,expand.mz=1,expand.rt=1, BPPARAM = bpparam())

Integrate areas of missing peaks in FTICR-MS data

Description

For each sample, identify peak groups where that sample is not represented. For each of those peak groups, integrate the signal in the region of that peak group and create a new peak.

Arguments

`object`	the `xcmsSet` object

Details

Value

An xcmsSet object with filled in peak groups.

Methods

object = "xcmsSet": fillPeaks.MSW(object)

Note

In contrast to the fillPeaks.chrom method the maximum intensity reported in column "maxo" is not the maximum intensity measured in the expected peak area (defined by columns "mzmin" and "mzmax"), but the largest intensity of mz value(s) closest to the "mzmed" of the feature.

Filtering sets of chromatographic data

Description

These functions filter (subset) MSnbase::MChromatograms() or XChromatograms() objects, i.e. sets of chromatographic data, without changing the data (intensity and retention times) within the individual chromatograms (MSnbase::Chromatogram() objects).

filterColumnsIntensityAbove: subsets a MChromatograms object keeping only columns (samples) for which value is larger than the provided threshold in which rows (i.e. with which = "any" a column is kept if any of the chromatograms in that column has a value larger than threshold, while with which = "all" all chromatograms in that column have to fulfill this criterion). Parameter value defines on which value the comparison should be performed: with value = "bpi" the maximum intensity of each chromatogram is compared to threshold, with value = "tic" the total sum of intensities of each chromatogram is compared to threshold. For XChromatograms objects, value = "maxo" and value = "into" are also supported, which compare the largest intensity of all identified chromatographic peaks in the chromatogram, or the integrated peak area, respectively, with threshold.
filterColumnsKeepTop: subsets a MChromatograms object keeping the top n columns sorted by the value specified with sortBy. In detail, for each column the value defined by sortBy is extracted from each chromatogram and aggregated using the aggregationFun. Thus, by default, for each chromatogram the maximum intensity is determined (sortBy = "bpi") and these values are summed up for chromatograms in the same column (aggregationFun = sum). The columns are then sorted by these values and the top n columns are retained in the returned MChromatograms. Similar to the filterColumnsIntensityAbove function, for XChromatograms objects the columns can also be sorted by column "maxo" or "into" of the chromPeaks matrix (sortBy = "maxo" or sortBy = "into").
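
The two filters can be mimicked on a plain matrix of per-chromatogram values. This base R sketch is only an illustration of the described logic: the values and the helpers `keep_above` and `keep_top` are hypothetical, and returning the retained columns in their original order is an assumption rather than documented behaviour.

```r
## Hypothetical base peak intensities: rows are EICs, columns are samples.
bpi <- matrix(c(10, 50, 30,
                20,  5, 40),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("eic1", "eic2"), c("s1", "s2", "s3")))

## Logic of filterColumnsIntensityAbove: keep a column if any (or all) of
## its chromatograms have a value above the threshold.
keep_above <- function(x, threshold = 0, which = c("any", "all")) {
    which <- match.arg(which)
    test <- if (which == "any") any else all
    apply(x > threshold, 2, test)
}
keep_any <- keep_above(bpi, threshold = 25, which = "any")
keep_all <- keep_above(bpi, threshold = 25, which = "all")

## Logic of filterColumnsKeepTop: aggregate the values per column, rank the
## columns and keep the n highest ranking ones (original order assumed).
keep_top <- function(x, n = 1L, aggregationFun = sum) {
    score <- apply(x, 2, aggregationFun)
    idx <- order(score, decreasing = TRUE)[seq_len(min(n, ncol(x)))]
    sort(idx)
}
top2 <- colnames(bpi)[keep_top(bpi, n = 2L)]
```

Here sample s2 passes the filter with which = "any" but not with which = "all", because only one of its two chromatograms exceeds the threshold of 25.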

Usage

## S4 method for signature 'MChromatograms'
filterColumnsIntensityAbove(
  object,
  threshold = 0,
  value = c("bpi", "tic"),
  which = c("any", "all")
)

## S4 method for signature 'MChromatograms'
filterColumnsKeepTop(
  object,
  n = 1L,
  sortBy = c("bpi", "tic"),
  aggregationFun = sum
)

## S4 method for signature 'XChromatograms'
filterColumnsIntensityAbove(
  object,
  threshold = 0,
  value = c("bpi", "tic", "maxo", "into"),
  which = c("any", "all")
)

## S4 method for signature 'XChromatograms'
filterColumnsKeepTop(
  object,
  n = 1L,
  sortBy = c("bpi", "tic", "maxo", "into"),
  aggregationFun = sum
)

Arguments

`object`	`MSnbase::MChromatograms()` or `XChromatograms()` object.
`threshold`	for `filterColumnsIntensityAbove`: `numeric(1)` with the threshold value to compare against.
`value`	`character(1)` defining which value should be used in the comparison or sorting. Can be `value = "bpi"` (default) to use the maximum intensity per chromatogram or `value = "tic"` to use the sum of intensities per chromatogram. For `XChromatograms()` objects `value = "maxo"` and `value = "into"` are also supported to use the maximum intensity or the integrated area of identified chromatographic peaks in each chromatogram.
`which`	for `filterColumnsIntensityAbove`: `character(1)` defining whether any (`which = "any"`, default) or all (`which = "all"`) chromatograms in a column have to fulfill the criteria for the column to be kept.
`n`	for `filterColumnsKeepTop`: `integer(1)` specifying the number of columns that should be returned. Non-integer values of `n` are rounded up to the next larger integer value.
`sortBy`	for `filterColumnsKeepTop`: the value by which columns should be ordered to determine the top n columns. Can be either `sortBy = "bpi"` (the default), in which case the maximum intensity of each column's chromatograms is used, or `sortBy = "tic"` to use the total intensity sum of all chromatograms. For `XChromatograms()` objects `sortBy = "maxo"` and `sortBy = "into"` are also supported to use the maximum intensity or the integrated area of identified chromatographic peaks in each chromatogram.
`aggregationFun`	for `filterColumnsKeepTop`: function to be used to aggregate (combine) the values from all chromatograms in each column. Defaults to `aggregationFun = sum` in which case the sum of the values is used to rank the columns. Alternatively the `mean`, `median` or similar function can be used.

Value

a filtered MChromatograms (or XChromatograms) object with the same number of rows (EICs) but possibly a lower number of columns (samples).

Author(s)

Johannes Rainer

Examples


library(MSnbase)
chr1 <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
    intensity = c(5, 29, 50, NA, 100, 12, 3, 4, 1, 3))
chr2 <- Chromatogram(rtime = 1:10 + rnorm(n = 10, sd = 0.3),
    intensity = c(80, 50, 20, 10, 9, 4, 3, 4, 1, 3))
chr3 <- Chromatogram(rtime = 3:9 + rnorm(7, sd = 0.3),
    intensity = c(53, 80, 130, 15, 5, 3, 2))

chrs <- MChromatograms(list(chr1, chr2, chr1, chr3, chr2, chr3),
    ncol = 3, byrow = FALSE)
chrs

#### filterColumnsIntensityAbove
##
## Keep all columns for which the maximum intensity of any of its
## chromatograms is larger than 90
filterColumnsIntensityAbove(chrs, threshold = 90)

## Require that ALL chromatograms in a column have a value larger than 90
filterColumnsIntensityAbove(chrs, threshold = 90, which = "all")

## If none of the columns fulfills the criteria no columns are returned
filterColumnsIntensityAbove(chrs, threshold = 900)

## For XChromatograms it is in addition possible to filter on the columns
## "maxo" or "into" of the identified chromatographic peaks within each
## chromatogram.

#### filterColumnsKeepTop
##
## Keep the column with the highest sum of maximal intensities in its
## chromatograms
filterColumnsKeepTop(chrs, n = 1)

## Keep the 50 percent of columns with the highest total sum of signal. Note
## that n will be rounded to the next larger integer value
filterColumnsKeepTop(chrs, n = 0.5 * ncol(chrs), sortBy = "tic")
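
The selection rules described above can be sketched in base R on a plain matrix. This is only an illustration and does not call the xcms or MSnbase code: the helper names keep_above and keep_top are hypothetical, the matrix holds the per-chromatogram maximum intensities ("bpi") of the chrs object from the examples, and returning the retained columns in their original order is an assumption of the sketch.

```r
## Rows are chromatograms (EICs), columns are samples. Values are the
## maximum intensities of chr1 (100), chr2 (80) and chr3 (130) arranged
## as in `chrs` above (column-wise: chr1, chr2, chr1, chr3, chr2, chr3).
bpi <- matrix(c(100, 80, 100, 130, 80, 130), nrow = 2,
              dimnames = list(c("eic1", "eic2"), c("s1", "s2", "s3")))

## Hypothetical helper mimicking the rule of filterColumnsIntensityAbove
keep_above <- function(vals, threshold = 0, which = c("any", "all")) {
    which <- match.arg(which)
    test <- if (which == "any") any else all
    keep <- apply(vals > threshold, 2, test)
    vals[, keep, drop = FALSE]
}

## Hypothetical helper mimicking the rule of filterColumnsKeepTop
keep_top <- function(vals, n = 1L, aggregationFun = sum) {
    n <- min(ceiling(n), ncol(vals))          # round n up
    score <- apply(vals, 2, aggregationFun)   # aggregate per column
    idx <- order(score, decreasing = TRUE)[seq_len(n)]
    vals[, sort(idx), drop = FALSE]           # assumption: keep column order
}

colnames(keep_above(bpi, threshold = 90))                 # "s1" "s2" "s3"
colnames(keep_above(bpi, threshold = 90, which = "all"))  # "s2"
ncol(keep_above(bpi, threshold = 900))                    # 0
colnames(keep_top(bpi, n = 1))                            # "s2"
colnames(keep_top(bpi, n = 0.5 * ncol(bpi)))              # "s2" "s3"
```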

Next Generation `xcms` Result Object

Description

The XcmsExperiment is a data container for xcms preprocessing results (i.e. results from chromatographic peak detection, alignment and correspondence analysis).

It provides the same functionality as the XCMSnExp object, but uses the more advanced and modern MS infrastructure provided by the MsExperiment and Spectra Bioconductor packages. With this comes a higher flexibility on how and where to store the data.

Documentation of the various functions for XcmsExperiment objects is grouped by topic and provided in the sections below.

The default xcms workflow is to perform

chromatographic peak detection using findChromPeaks()
optionally refine identified chromatographic peaks using refineChromPeaks()
perform an alignment (retention time adjustment) using adjustRtime(). Depending on the method used, this requires running a correspondence analysis first
perform a correspondence analysis using the groupChromPeaks() function to group chromatographic peaks across samples to define the LC-MS features.
optionally perform a gap-filling to rescue signal in samples in which no chromatographic peak was identified and hence a missing value would be reported. This can be performed using the fillChromPeaks() function.
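
The workflow steps listed above could be chained roughly as in the following sketch. The wrapper run_default_workflow is a hypothetical helper, not part of xcms, and all parameter objects are used with default or arbitrary settings purely for illustration; suitable settings depend on the data. The peak groups alignment method is chosen here only because it shows the case in which a correspondence analysis has to be run before the alignment.

```r
## Hypothetical wrapper around the default xcms workflow (sketch only).
## Requires the xcms and MsExperiment packages when it is called.
run_default_workflow <- function(files) {
    ## Import the MS data files into an MsExperiment
    mse <- MsExperiment::readMsExperiment(
        spectraFiles = files,
        sampleData = data.frame(sample_index = seq_along(files)))
    ## 1) chromatographic peak detection
    x <- xcms::findChromPeaks(mse, param = xcms::CentWaveParam())
    ## 2) optional refinement of the identified peaks
    x <- xcms::refineChromPeaks(
        x, param = xcms::MergeNeighboringPeaksParam())
    ## 3) alignment; the peak groups method needs an initial
    ##    correspondence analysis (assumption: a single sample group)
    pdp <- xcms::PeakDensityParam(sampleGroups = rep(1, length(files)))
    x <- xcms::groupChromPeaks(x, param = pdp)
    x <- xcms::adjustRtime(x, param = xcms::PeakGroupsParam())
    ## 4) correspondence analysis on the adjusted retention times
    x <- xcms::groupChromPeaks(x, param = pdp)
    ## 5) optional gap filling
    xcms::fillChromPeaks(x, param = xcms::ChromPeakAreaParam())
}
```

Called with a character vector of MS data file paths, the function would return an XcmsExperiment from which the feature abundance matrix can be extracted with featureValues().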

Usage

filterFeatureDefinitions(object, ...)

## S4 method for signature 'MsExperiment'
filterRt(object, rt = numeric(), ...)

## S4 method for signature 'MsExperiment'
filterMzRange(object, mz = numeric(), msLevel. = uniqueMsLevels(object))

## S4 method for signature 'MsExperiment'
filterMz(object, mz = numeric(), msLevel. = uniqueMsLevels(object))

## S4 method for signature 'MsExperiment'
filterMsLevel(object, msLevel. = uniqueMsLevels(object))

## S4 method for signature 'MsExperiment'
uniqueMsLevels(object)

## S4 method for signature 'MsExperiment'
filterFile(object, file = integer(), ...)

## S4 method for signature 'MsExperiment'
rtime(object)

## S4 method for signature 'MsExperiment'
fromFile(object)

## S4 method for signature 'MsExperiment'
fileNames(object)

## S4 method for signature 'MsExperiment'
polarity(object)

## S4 method for signature 'MsExperiment'
filterIsolationWindow(object, mz = numeric())

## S4 method for signature 'MsExperiment'
chromatogram(
  object,
  rt = matrix(nrow = 0, ncol = 2),
  mz = matrix(nrow = 0, ncol = 2),
  aggregationFun = "sum",
  msLevel = 1L,
  isolationWindowTargetMz = NULL,
  chunkSize = 2L,
  return.type = "MChromatograms",
  BPPARAM = bpparam()
)

featureArea(
  object,
  mzmin = min,
  mzmax = max,
  rtmin = min,
  rtmax = max,
  features = character()
)

## S4 method for signature 'MsExperiment,missing'
plot(x, y, msLevel = 1L, peakCol = "#ff000060", ...)

## S3 method for class 'XcmsExperiment'
c(...)

## S4 method for signature 'XcmsExperiment,ANY,ANY,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'XcmsExperiment'
filterIsolationWindow(object, mz = numeric())

## S4 method for signature 'XcmsExperiment'
filterRt(object, rt, msLevel.)

## S4 method for signature 'XcmsExperiment'
filterMzRange(object, mz = numeric(), msLevel. = uniqueMsLevels(object))

## S4 method for signature 'XcmsExperiment'
filterMsLevel(object, msLevel. = uniqueMsLevels(object))

## S4 method for signature 'XcmsExperiment'
hasChromPeaks(object, msLevel = integer())

## S4 method for signature 'XcmsExperiment'
dropChromPeaks(object, keepAdjustedRtime = FALSE)

## S4 replacement method for signature 'XcmsExperiment'
chromPeaks(object) <- value

## S4 method for signature 'XcmsExperiment'
chromPeaks(
  object,
  rt = numeric(),
  mz = numeric(),
  ppm = 0,
  msLevel = integer(),
  type = c("any", "within", "apex_within"),
  isFilledColumn = FALSE
)

## S4 replacement method for signature 'XcmsExperiment'
chromPeakData(object) <- value

## S4 method for signature 'XcmsExperiment'
chromPeakData(
  object,
  msLevel = integer(),
  return.type = c("DataFrame", "data.frame")
)

## S4 method for signature 'XcmsExperiment'
filterChromPeaks(
  object,
  keep = rep(TRUE, nrow(.chromPeaks(object))),
  method = "keep",
  ...
)

## S4 method for signature 'XcmsExperiment'
dropAdjustedRtime(object)

## S4 method for signature 'MsExperiment'
hasAdjustedRtime(object)

## S4 method for signature 'XcmsExperiment'
rtime(object, adjusted = hasAdjustedRtime(object))

## S4 method for signature 'XcmsExperiment'
adjustedRtime(object)

## S4 method for signature 'XcmsExperiment'
hasFeatures(object, msLevel = integer())

## S4 replacement method for signature 'XcmsExperiment'
featureDefinitions(object) <- value

## S4 method for signature 'XcmsExperiment'
featureDefinitions(
  object,
  mz = numeric(),
  rt = numeric(),
  ppm = 0,
  type = c("any", "within", "apex_within"),
  msLevel = integer()
)

## S4 method for signature 'XcmsExperiment'
dropFeatureDefinitions(object, keepAdjustedRtime = FALSE)

## S4 method for signature 'XcmsExperiment'
filterFeatureDefinitions(object, features = integer())

## S4 method for signature 'XcmsExperiment'
hasFilledChromPeaks(object)

## S4 method for signature 'XcmsExperiment'
dropFilledChromPeaks(object)

## S4 method for signature 'XcmsExperiment'
quantify(object, ...)

## S4 method for signature 'XcmsExperiment'
featureValues(
  object,
  method = c("medret", "maxint", "sum"),
  value = "into",
  intensity = "into",
  filled = TRUE,
  missing = NA_real_,
  msLevel = integer()
)

## S4 method for signature 'XcmsExperiment'
chromatogram(
  object,
  rt = matrix(nrow = 0, ncol = 2),
  mz = matrix(nrow = 0, ncol = 2),
  aggregationFun = "sum",
  msLevel = 1L,
  chunkSize = 2L,
  isolationWindowTargetMz = NULL,
  return.type = c("XChromatograms", "MChromatograms"),
  include = character(),
  chromPeaks = c("apex_within", "any", "none"),
  BPPARAM = bpparam()
)

## S4 method for signature 'XcmsExperiment'
processHistory(object, type)

## S4 method for signature 'XcmsExperiment'
filterFile(
  object,
  file,
  keepAdjustedRtime = hasAdjustedRtime(object),
  keepFeatures = FALSE,
  ...
)

Arguments

`object`	An `XcmsExperiment` object.
`...`	Additional optional parameters. For `quantify`: any parameter for the `featureValues` call used to extract the feature value matrix.
`rt`	For `chromPeaks` and `featureDefinitions`: `numeric(2)` defining the retention time range for which chromatographic peaks or features should be returned. The full range is used by default. For `chromatogram`: two column numerical `matrix` with each row representing the lower and upper retention time window(s) for the chromatograms. If not provided the full retention time range is used.
`mz`	For `chromPeaks` and `featureDefinitions`: `numeric(2)` optionally defining the m/z range for which chromatographic peaks or feature definitions should be returned. The full m/z range is used by default. For `chromatogram`: two-column numerical `matrix` with each row representing m/z range that should be aggregated into a chromatogram. If not provided the full m/z range of the data will be used (and hence a total ion chromatogram will be returned if `aggregationFun = "sum"` is used). For `filterIsolationWindow`: `numeric(1)` defining the m/z that should be contained within the spectra's isolation window.
`msLevel.`	For `filterRt`: ignored. `filterRt` will always filter by retention times on all MS levels regardless of this parameter. For `chromatogram`: `integer` with the MS level from which the chromatogram(s) should be extracted. Has to be either of length 1 or of length equal to the number of rows of the parameters `mz` and `rt` defining the m/z and rt regions from which the chromatograms should be created. Defaults to `msLevel = 1L`. For `filterMsLevel`: `integer` defining the MS level(s) to which the data should be subset.
`file`	For `filterFile`: `integer` with the indices of the samples (files) to which the data should be subsetted.
`aggregationFun`	For `chromatogram`: `character(1)` defining the function that should be used to aggregate intensities for retention time (i.e. each spectrum) along the specified m/z range (parameter `mz`). Defaults to `aggregationFun = "sum"` and hence all intensities will be summed up. Alternatively, use `aggregationFun = "max"` to use the maximal intensity per m/z range to create a base peak chromatogram (BPC).
`msLevel`	`integer` defining the MS level (or multiple MS level if the function supports it).
`isolationWindowTargetMz`	For `chromatogram`: `numeric` (of length equal to the number of rows of `rt` and `mz`) with the isolation window target m/z of the MS2 spectra from which the chromatogram should be generated. For MS1 data (`msLevel = 1L`, the default), this parameter is ignored. See examples on `chromatogram` below for further information.
`chunkSize`	For `chromatogram`: `integer(1)` defining the number of files from which the data should be loaded at a time into memory. Defaults to `chunkSize = 2L`.
`return.type`	For `chromPeakData`: `character(1)` defining the class of the returned object. Can be either `"DataFrame"` (the default) or `"data.frame"`. For `chromatogram`: `character(1)` defining the type of the returned object. For `MsExperiment` objects only `return.type = "MChromatograms"` is supported; for `XcmsExperiment` objects it can be either `"XChromatograms"` (the default) or `"MChromatograms"`.
`BPPARAM`	For `chromatogram`: parallel processing setup. Defaults to `BPPARAM = bpparam()`. See `BiocParallel::bpparam()` for more information.
`mzmin`	For `featureArea`: function to calculate the `"mzmin"` of a feature based on the `"mzmin"` values of the individual chromatographic peaks assigned to that feature. Defaults to `mzmin = min`.
`mzmax`	For `featureArea`: function to calculate the `"mzmax"` of a feature based on the `"mzmax"` values of the individual chromatographic peaks assigned to that feature. Defaults to `mzmax = max`.
`rtmin`	For `featureArea`: function to calculate the `"rtmin"` of a feature based on the `"rtmin"` values of the individual chromatographic peaks assigned to that feature. Defaults to `rtmin = min`.
`rtmax`	For `featureArea`: function to calculate the `"rtmax"` of a feature based on the `"rtmax"` values of the individual chromatographic peaks assigned to that feature. Defaults to `rtmax = max`.
`features`	For `filterFeatureDefinitions` and `featureArea`: `logical`, `integer` or `character` defining the features to keep or from which to extract the feature area, respectively. See function description for more information.
`x`	An `XcmsExperiment` object.
`y`	For `plot`: should not be defined as it is not supported.
`peakCol`	For `plot`: defines the border color of the rectangles indicating the identified chromatographic peaks. Only a single color is supported. Defaults to `peakCol = "#ff000060"`.
`i`	For `[`: `integer` or `logical` defining the samples/files to subset.
`j`	For `[`: not supported.
`drop`	For `[`: ignored.
`keepAdjustedRtime`	`logical(1)`: whether adjusted retention times (if present) should be retained.
`value`	For `featureValues`: `character(1)` defining which value should be reported for each feature in each sample. Can be any column of the `chromPeaks` matrix or `"index"` if simply the index of the assigned peak should be returned. Defaults to `value = "into"` thus the integrated peak area is reported.
`ppm`	For `chromPeaks` and `featureDefinitions`: optional `numeric(1)` specifying the ppm by which the m/z range (defined by `mz`) should be extended. For a value of `ppm = 10`, all peaks within `mz[1] - ppm / 1e6` and `mz[2] + ppm / 1e6` are returned.
`type`	For `chromPeaks` and `featureDefinitions`, and only if `mz` and/or `rt` are defined: `character(1)` defining which peaks (or features) should be returned. For `type = "any"`: returns all chromatographic peaks or features, including those only partially overlapping any of the provided ranges. For `type = "within"`: returns only peaks or features completely within the region defined by `mz` and/or `rt`. For `type = "apex_within"`: returns peaks or features for which the m/z and retention time of the peak's apex is within the region defined by `mz` and/or `rt`. For `processHistory`: restrict returned processing steps to specific types. Use `processHistoryTypes()` to list all supported values.
`isFilledColumn`	For `chromPeaks`: `logical(1)` whether a column `"is_filled"` should be included in the returned `matrix` with the information whether a peak was detected or only filled-in. Note that this information is also provided in the `chromPeakData` data frame.
`keep`	For `filterChromPeaks`: `logical`, `integer` or `character` specifying which chromatographic peaks to keep. If `logical` the length of `keep` needs to match the number of rows of `chromPeaks`. Alternatively, `keep` allows to specify the `index` (row) of peaks to keep or their ID (i.e. row name in `chromPeaks`).
`method`	For `featureValues`: `character(1)` specifying the method to resolve multi-peak mappings within the same sample (correspondence analysis can assign more than one chromatographic peak within a sample to the same feature, e.g. if they are close in retention time). Options: `method = "medret"`: report the value for the chromatographic peak closest to the feature's median retention time. `method = "maxint"`: report the value for the chromatographic peak with the largest signal (parameter `intensity` allows to select the column in `chromPeaks` that should be used for signal). `method = "sum"`: sum the value for all chromatographic peaks in a sample assigned to the same feature. The default is `method = "medret"`. For `filterChromPeaks`: currently only `method = "keep"` is supported.
`adjusted`	For `rtime,XcmsExperiment`: whether adjusted or raw retention times should be returned. The default is to return adjusted retention times, if available.
`intensity`	For `featureValues`: `character(1)` specifying the name of the column in the `chromPeaks(object)` matrix containing the intensity value of the peak that should be used for the conflict resolution if `method = "maxint"`.
`filled`	For `featureValues`: `logical(1)` specifying whether values for filled-in peaks should be reported. For `filled = TRUE` (the default) filled peak values are returned, otherwise `NA` is reported for the respective features in the samples in which no peak was detected.
`missing`	For `featureValues`: default value for missing values. Allows to define the value that should be reported for a missing peak intensity. Defaults to `missing = NA_real_`.
`include`	For `chromatogram`: deprecated; use parameter `chromPeaks` instead.
`chromPeaks`	For `chromatogram`: `character(1)` defining which chromatographic peaks should be returned. Can be either `chromPeaks = "apex_within"` (default) to return all chromatographic peaks with the m/z and RT of their apex within the m/z and retention time window, `chromPeaks = "any"` for all chromatographic peaks that are overlapping with the m/z - retention time window or `chromPeaks = "none"` to not include any chromatographic peaks. See also parameter `type` for additional information.
`keepFeatures`	for most subsetting functions (`[`, `filterFile`): `logical(1)`: whether feature definitions, if present, should be retained in the returned (filtered) object.
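
The selection rules of parameters `type` and `method` described above can be illustrated with a small base R sketch. It does not call xcms: the peak table, the helper names `match_rt` and `resolve_value`, and the inclusive range boundaries are assumptions made for this illustration, and only the retention time dimension is considered for `type`.

```r
## Hypothetical chromPeaks-like table, one row per chromatographic peak
peaks <- data.frame(
    rt    = c(120, 250, 400),   # retention time of the peak apex
    rtmin = c(110, 235, 390),
    rtmax = c(130, 262, 415),
    into  = c(1500, 4200, 900),
    maxo  = c(300, 250, 120),
    row.names = c("CP1", "CP2", "CP3"))

## `type`: which peaks match the retention time range rt = c(lower, upper)?
match_rt <- function(p, rt, type = c("any", "within", "apex_within")) {
    type <- match.arg(type)
    keep <- switch(type,
        any         = p$rtmin <= rt[2] & p$rtmax >= rt[1],  # overlapping
        within      = p$rtmin >= rt[1] & p$rtmax <= rt[2],  # fully inside
        apex_within = p$rt >= rt[1] & p$rt <= rt[2])        # apex inside
    rownames(p)[keep]
}

## `method`: resolve several peaks of ONE sample assigned to one feature;
## rtmed is the median retention time of the feature
resolve_value <- function(p, rtmed, method = c("medret", "maxint", "sum"),
                          value = "into", intensity = "into") {
    method <- match.arg(method)
    switch(method,
        medret = p[[value]][which.min(abs(p$rt - rtmed))],
        maxint = p[[value]][which.max(p[[intensity]])],
        sum    = sum(p[[value]]))
}

match_rt(peaks, c(125, 405), type = "any")          # "CP1" "CP2" "CP3"
match_rt(peaks, c(125, 405), type = "within")       # "CP2"
match_rt(peaks, c(125, 405), type = "apex_within")  # "CP2" "CP3"

two <- peaks[1:2, ]   # pretend CP1 and CP2 belong to the same feature
resolve_value(two, rtmed = 130, method = "medret")  # 1500
resolve_value(two, rtmed = 130, method = "maxint")  # 4200
resolve_value(two, rtmed = 130, method = "sum")     # 5700
```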

Subset, filter and combine

[: subset an XcmsExperiment by sample (parameter i). Subsetting will by default drop correspondence results (as subsetting by samples will obviously affect the feature definition) and alignment results (adjusted retention times) while identified chromatographic peaks (for the selected samples) will be retained. Which preprocessing results should be kept or dropped can also be configured with optional parameters keepChromPeaks (by default TRUE), keepAdjustedRtime (by default FALSE) and keepFeatures (by default FALSE).
c: multiple XcmsExperiment objects can be combined into one using the c() function. This requires however that all the XcmsExperiments' Spectra objects use the same type of MsBackend and that their processing queues are empty. Also, only combining of peak detection results is supported. Any alignment or correspondence results present will be dropped before combining the XcmsExperiment objects. Finally, at present, only the MS data of the individual XcmsExperiment objects is combined and any data present in the @qdata, @otherData and @experimentFiles slots is ignored. The function returns an XcmsExperiment object with the combined MS data (Spectra objects) and chromatographic peak detection results.
filterChromPeaks: filter chromatographic peaks of an XcmsExperiment keeping only those specified with parameter keep. Returns the XcmsExperiment with the filtered data. Chromatographic peaks to retain can be specified either by providing their index in the chromPeaks matrix, their ID (rowname in chromPeaks) or with a logical vector with the same length as the number of rows of chromPeaks. The assignment of chromatographic peaks to any feature definitions present is updated after filtering.
filterFeatureDefinitions: filter feature definitions of an XcmsExperiment keeping only those defined with parameter features, which can be a logical of length equal to the number of features, an integer with the index of the features in featureDefinitions(object) to keep or a character with the feature IDs (i.e. row names in featureDefinitions(object)).
filterFile: filter an XcmsExperiment (or MsExperiment) by file (sample). The index of the samples to which the data should be subsetted can be specified with parameter file. The sole purpose of this function is to provide backward compatibility with the MSnbase package. Wherever possible, the [ function should be used instead for any sample-based subsetting. Parameters keepChromPeaks, keepAdjustedRtime and keepFeatures can be passed using .... Note also that in contrast to [, filterFile does not support subsetting in arbitrary order.
filterIsolationWindow: filter the spectra within an MsExperiment or XcmsExperiment object keeping only those with an isolation window containing the specified m/z (i.e., keeping spectra with an "isolationWindowLowerMz" smaller than the user-provided mz and an "isolationWindowUpperMz" larger than mz). For an XcmsExperiment also all chromatographic peaks (and subsequently also features) are removed for which the range of their "isolationWindowLowerMz" and "isolationWindowUpperMz" (columns in chromPeakData) does not contain the user-provided mz.
filterMsLevel: filter the data of the XcmsExperiment or MsExperiment to keep only data of the MS level(s) specified with parameter msLevel..
filterMz, filterMzRange: filter the spectra within an XcmsExperiment or MsExperiment to the specified m/z range (parameter mz). For XcmsExperiment also identified chromatographic peaks and features are filtered keeping only those that are within the specified m/z range (i.e. for which the m/z of the peak apex is within the m/z range). Parameter msLevel. allows to restrict the filtering to only specified MS levels. By default data from all MS levels are filtered.
filterRt: filter an XcmsExperiment keeping only data within the specified retention time range (parameter rt). This function will keep all preprocessing results present within the retention time range: all identified chromatographic peaks with the retention time of the apex position within the retention time range rt are retained along, if present, with the associated features. Parameter msLevel. is currently ignored, i.e. filtering will always be performed on all MS levels of the object.
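
How the range-based filters above decide which preprocessing results to retain can be sketched in base R. The tables and helper functions below are hypothetical stand-ins, not xcms code. Treating the range boundaries as inclusive for filterRt and filterMzRange is an assumption of the sketch, while the strict comparison for the isolation window follows the "smaller than" / "larger than" wording above.

```r
## Hypothetical peak table; "mz" and "rt" are the apex positions
peaks <- data.frame(
    mz    = c(200.1, 305.2, 410.0),
    rt    = c(120, 250, 400),
    rtmax = c(130, 262, 415),
    row.names = c("CP1", "CP2", "CP3"))

## filterRt-like rule: keep peaks with the apex retention time within rt
keep_rt <- function(p, rt) p[p$rt >= rt[1] & p$rt <= rt[2], , drop = FALSE]

## filterMzRange-like rule: keep peaks with the apex m/z within mz
keep_mz <- function(p, mz) p[p$mz >= mz[1] & p$mz <= mz[2], , drop = FALSE]

## filterIsolationWindow-like rule on hypothetical spectra metadata
spectra <- data.frame(
    isolationWindowLowerMz = c(100, 200, 300),
    isolationWindowUpperMz = c(200, 300, 400))
keep_window <- function(s, mz)
    s[s$isolationWindowLowerMz < mz & s$isolationWindowUpperMz > mz, ,
      drop = FALSE]

## CP3 is kept although its rtmax (415) exceeds the upper limit, because
## only the apex position is evaluated
rownames(keep_rt(peaks, rt = c(200, 405)))   # "CP2" "CP3"
rownames(keep_mz(peaks, mz = c(300, 500)))   # "CP2" "CP3"
nrow(keep_window(spectra, mz = 250))         # 1
```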

Functionality related to chromatographic peaks

chromatogram: extract chromatographic data from a data set. Parameters mz and rt allow to define specific m/z - retention time regions to extract the data from (e.g. to create extracted ion chromatograms, EICs). Both parameters are expected to be numerical two-column matrices with the first column defining the lower and the second the upper margin. Each row can define a separate m/z - retention time region. Currently the function returns an MSnbase::MChromatograms() object for object being a MsExperiment or, for object being an XcmsExperiment, either a MChromatograms or XChromatograms() depending on parameter return.type (can be either "MChromatograms" or "XChromatograms"). For the latter also chromatographic peaks detected within the provided m/z and retention time ranges are returned. Parameter chromPeaks allows to specify which chromatographic peaks should be reported. See documentation on the chromPeaks parameter for more information. If the XcmsExperiment contains correspondence results, also the associated feature definitions will be included in the returned XChromatograms. By default the function returns chromatograms from MS1 data, but by setting parameter msLevel = 2L it is possible to e.g. extract also MS2 chromatograms. By default, with parameter isolationWindowTargetMz = NULL or isolationWindowTargetMz = NA_real_, data from all MS2 spectra will be considered in the chromatogram extraction. If MS2 data was generated within different m/z isolation windows (such as e.g. with Sciex SWATH data), the parameter isolationWindowTargetMz should be used to ensure signal is only extracted from the respective isolation window. The isolationWindowTargetMz() function on the Spectra object can be used to inspect/list available isolation windows of a data set. See also the xcms LC-MS/MS vignette for examples and details.
chromPeaks: returns a numeric matrix with the identified chromatographic peaks. Each row represents a chromatographic peak identified in one sample (file). The number of columns depends on the peak detection algorithm (see findChromPeaks()) but most methods return the following columns: "mz" (intensity-weighted mean of the m/z values of all mass peaks included in the chromatographic peak), "mzmin" ( smallest m/z value of any mass peak in the chromatographic peak), "mzmax" (largest m/z value of any mass peak in the chromatographic peak), "rt" (retention time of the peak apex), "rtmin" (retention time of the first scan/mass peak of the chromatographic peak), "rtmax" (retention time of the last scan/mass peak of the chromatographic peak), "into" (integrated intensity of the chromatographic peak), "maxo" (maximal intensity of any mass peak of the chromatographic peak), "sample" (index of the sample in object in which the peak was identified). Parameters rt, mz, ppm, msLevel and type allow to extract subsets of identified chromatographic peaks from the object. See parameter description below for details.
chromPeakData: returns a DataFrame with potential additional annotations for the identified chromatographic peaks. Each row in this DataFrame corresponds to a row (same index and row name) in the chromPeaks matrix. The default annotations are "ms_level" (the MS level in which the peak was identified) and "is_filled" (whether the chromatographic peak was detected by findChromPeaks or filled-in by fillChromPeaks).
chromPeakSpectra: extract MS spectra for identified chromatographic peaks. This can be either all (full scan) MS1 spectra with retention times within the retention time range of a chromatographic peak, all MS2 spectra (if present) with a retention time within the retention time range of a (MS1) chromatographic peak and a precursor m/z within the m/z range of the chromatographic peak, or single, selected spectra depending on their total signal or highest signal. Parameter msLevel defines from which MS level spectra should be extracted, parameter method defines whether all or selected spectra should be returned. See chromPeakSpectra() for details.
dropChromPeaks: removes (all) chromatographic peak detection results from object. This will also remove any correspondence results (i.e. features) and, if the alignment was performed after the peak detection, any adjusted retention times present in the object. Alignment results (adjusted retention times) can be retained if parameter keepAdjustedRtime is set to TRUE.
dropFilledChromPeaks: removes chromatographic peaks added by gap filling with fillChromPeaks.
fillChromPeaks: perform gap filling, i.e. integrate signal to replace missing values in samples in which no chromatographic peak was found. This depends on correspondence results, hence groupChromPeaks needs to be called first. For details and options see fillChromPeaks().
findChromPeaks: perform chromatographic peak detection. See findChromPeaks() for details.
hasChromPeaks: whether the object contains peak detection results. Parameter msLevel allows to check whether peak detection results are available for the specified MS level(s).
hasFilledChromPeaks: whether gap-filling results (i.e., filled-in chromatographic peaks) are present.
manualChromPeaks: manually add chromatographic peaks by defining their m/z and retention time ranges. See manualChromPeaks() for details and examples.
plotChromPeakImage: show the density of identified chromatographic peaks per file along the retention time. See plotChromPeakImage() for details.
plotChromPeaks: indicate identified chromatographic peaks from one sample in the RT-m/z space. See plotChromPeaks() for details.
plotPrecursorIons: general visualization of precursor ions of LC-MS/MS data. See plotPrecursorIons() for details.
refineChromPeaks: refines identified chromatographic peaks in object. See refineChromPeaks() for details.
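
The mz and rt matrices expected by chromatogram can be built with base R. The sketch below is only an illustration: the target m/z values, the ppm tolerance and the retention time windows are made-up numbers, and the final call is shown as a comment because it assumes an existing MsExperiment or XcmsExperiment named `object`. It also assumes that row i of mz is paired with row i of rt.

```r
## Hypothetical target m/z values and a hypothetical ppm tolerance.
target_mz <- c(335.1, 344.0)
ppm <- 20

## Two-column matrix: lower and upper m/z margin, one row per region.
mz <- cbind(mzmin = target_mz - target_mz * ppm / 1e6,
            mzmax = target_mz + target_mz * ppm / 1e6)

## Two-column matrix with the retention time range (seconds) per region.
rt <- rbind(c(2700, 2900),
            c(3300, 3500))
colnames(rt) <- c("rtmin", "rtmax")

## One row in `mz` per row in `rt`.
stopifnot(nrow(mz) == nrow(rt))

## With an existing result object the regions would be used as:
## chrs <- chromatogram(object, mz = mz, rt = rt)
```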

Functionality related to alignment

adjustedRtime: extract adjusted retention times. This is just an alias for rtime(object, adjusted = TRUE).
adjustRtime: performs retention time adjustment (alignment) of the data. See adjustRtime() for details.
applyAdjustedRtime: replaces the original (raw) retention times with the adjusted ones. See applyAdjustedRtime() for more information.
dropAdjustedRtime: drops alignment results (adjusted retention time) from the result object. This also reverts the retention times of identified chromatographic peaks if present in the result object. Note that any results from a correspondence analysis (i.e. feature definitions) will be dropped too (if the correspondence analysis was performed after the alignment). This can be overruled with keepAdjustedRtime = TRUE.
hasAdjustedRtime: whether alignment was performed on the object (i.e., the object contains alignment results).
plotAdjustedRtime: plot the alignment results; see plotAdjustedRtime() for more information.

Functionality related to correspondence analysis

dropFeatureDefinitions: removes any correspondence analysis results from object as well as any filled-in chromatographic peaks. By default (with parameter keepAdjustedRtime = FALSE) also all alignment results will be removed if alignment was performed after the correspondence analysis. This can be overruled with keepAdjustedRtime = TRUE.
featureArea: returns a matrix with columns "mzmin", "mzmax", "rtmin" and "rtmax" with the m/z and retention time range for each feature (row) in object. By default these represent the minimal m/z and retention times as well as maximal m/z and retention times for all chromatographic peaks assigned to that feature. Parameter features allows to extract these values for selected features only. Parameters mzmin, mzmax, rtmin and rtmax allow to define the function to calculate the reported "mzmin", "mzmax", "rtmin" and "rtmax" values.
featureChromatograms: extract ion chromatograms (EICs) for each feature in object. See featureChromatograms() for more details.
featureDefinitions: returns a data.frame with feature definitions or an empty data.frame if no correspondence analysis results are present. Parameters msLevel, mz, ppm and rt allow to define subsets of feature definitions that should be returned with the parameter type defining how these parameters should be used to subset the returned data.frame. See parameter descriptions for details.
featureSpectra: returns a Spectra::Spectra() or List of Spectra with (MS1 or MS2) spectra associated to each feature. See featureSpectra() for more details and available parameters.
featureSummary: calculate a simple summary on features. See featureSummary() for details.
groupChromPeaks: performs the correspondence analysis (i.e., grouping of chromatographic peaks into LC-MS features). See groupChromPeaks() for details.
hasFeatures: whether correspondence analysis results are present in object. The optional parameter msLevel allows defining the MS level(s) for which it should be determined if feature definitions are available.
overlappingFeatures: identify features that are overlapping or close in the m/z - rt dimension. See overlappingFeatures() for more information.

Extracting data and results from an `XcmsExperiment`

Preprocessing results can be extracted using the following functions:

chromPeaks: extract identified chromatographic peaks. See section on chromatographic peak detection for details.
featureDefinitions: extract the definition of features (chromatographic peaks grouped across samples). See section on correspondence analysis for details.
featureValues: extract a matrix of values for features from each sample (file). Rows are features, columns samples. Which value should be returned can be defined with parameter value, which can be any column of the chromPeaks matrix. By default (value = "into") the integrated chromatographic peak intensities are returned. With parameter msLevel it is possible to extract values for features from certain MS levels. During correspondence analysis, more than one chromatographic peak per sample can be assigned to the same feature (e.g. if they are very close in retention time). Parameter method defines the strategy to deal with such cases: method = "medret": report the value from the chromatographic peak with the apex position closest to the feature's median retention time; method = "maxint": report the value from the chromatographic peak with the largest signal (parameter intensity defines the column in chromPeaks that should be used; defaults to intensity = "into"); method = "sum": sum the values for all chromatographic peaks assigned to the feature in the same sample.
quantify: extract the correspondence analysis results as a SummarizedExperiment::SummarizedExperiment(). The feature values are used as assay in the returned SummarizedExperiment, rowData contains the featureDefinitions (without column "peakidx") and colData the sampleData of object. Additional parameters to the featureValues function (that is used to extract the feature value matrix) can be passed via ....
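
The three strategies of parameter method in featureValues can be illustrated on a toy example in base R. This is a minimal sketch and not the xcms implementation: the `peaks` table, the helper `pick` and all values are hypothetical, and the feature's median retention time is simply taken as the median over the listed peaks.

```r
## Toy peaks assigned to one feature; columns mimic the chromPeaks matrix.
## Sample 1 contributes two peaks to the same feature.
peaks <- data.frame(
    sample = c(1, 1, 2, 3),
    rt     = c(300.2, 303.9, 301.0, 302.5),
    into   = c(1500, 4200, 3900, 4100)
)
feature_rtmed <- median(peaks$rt)

## Hypothetical helper: one value per sample for the chosen strategy.
pick <- function(x, method) {
    switch(method,
           medret = x$into[which.min(abs(x$rt - feature_rtmed))],
           maxint = max(x$into),
           sum    = sum(x$into))
}

## Rows are samples, columns the three strategies.
res <- sapply(c("medret", "maxint", "sum"), function(m)
    sapply(split(peaks, peaks$sample), pick, method = m))
res
```

Only the row for sample 1 differs between the columns, because it is the only sample with more than one peak assigned to the feature.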

Visualization

plot: plot for each file the position of individual peaks in the m/z - retention time space (with color-coded intensity) and a base peak chromatogram. This function should ideally be called only on a data subset (i.e. after using filterRt and filterMz to restrict to a region of interest). Parameter msLevel defines from which MS level the plot should be created. If x is an XcmsExperiment with identified chromatographic peaks, the regions defining the peaks are also indicated with rectangles. Parameter peakCol defines the color of the border of these rectangles.
plotAdjustedRtime: plot the alignment results; see plotAdjustedRtime() for more information.
plotChromPeakImage: show the density of identified chromatographic peaks per file along the retention time. See plotChromPeakImage() for details.
plotChromPeaks: indicate identified chromatographic peaks from one sample in the RT-m/z space. See plotChromPeaks() for details.

General functionality and functions for backward compatibility

uniqueMsLevels: returns the unique MS levels of the spectra in object.

The functions listed below ensure compatibility with the older XCMSnExp() xcms result object. Also, an XcmsExperiment can be coerced to the older XCMSnExp class using as(object, "XCMSnExp"), just as an XCMSnExp object can be coerced to XcmsExperiment using as(object, "XcmsExperiment").

fileNames: returns the original data file names for the spectra data. Ideally, the dataOrigin or dataStorage spectra variables from the object's spectra should be used instead.
fromFile: returns the file (sample) index for each spectrum within object. Generally, subsetting by sample using [ is the preferred way to get spectra from a specific sample.
polarity: returns the polarity information for each spectrum in object.
processHistory: returns a list with ProcessHistory process history objects that contain also the parameter object used for the different processings. Optional parameter type allows to query for specific processing steps.
rtime: extract retention times of the spectra from the MsExperiment or XcmsExperiment object. It is thus a shortcut for rtime(spectra(object)) which would be the preferred way to extract retention times from an MsExperiment. The rtime method for XcmsExperiment has an additional parameter adjusted which allows to define whether adjusted retention times (if present - adjusted = TRUE) or raw retention times (adjusted = FALSE) should be returned. By default adjusted retention times are returned if available.

Differences compared to the `XCMSnExp()` object

Subsetting by [ supports arbitrary ordering.

Author(s)

Johannes Rainer

Examples


## Creating a MsExperiment object representing the data from an LC-MS
## experiment.
library(MsExperiment)

## Defining the raw data files
fls <- c(system.file('cdf/KO/ko15.CDF', package = "faahKO"),
         system.file('cdf/KO/ko16.CDF', package = "faahKO"),
         system.file('cdf/KO/ko18.CDF', package = "faahKO"))

## Defining a data frame with the sample characterization
df <- data.frame(mzML_file = basename(fls),
                sample = c("ko15", "ko16", "ko18"))
## Importing the data. This will initialize a `Spectra` object representing
## the raw data and assign these to the individual samples.
mse <- readMsExperiment(spectraFiles = fls, sampleData = df)

## Extract a total ion chromatogram and base peak chromatogram
## from the data
bpc <- chromatogram(mse, aggregationFun = "max")
tic <- chromatogram(mse)

## Plot them
par(mfrow = c(2, 1))
plot(bpc, main = "BPC")
plot(tic, main = "TIC")

## Extracting MS2 chromatographic data
##
## To show how MS2 chromatograms can be extracted we first load a DIA
## (SWATH) data set.
mse_dia <- readMsExperiment(system.file("TripleTOF-SWATH",
    "PestMix1_SWATH.mzML", package = "msdata"))

## Extracting MS2 chromatogram requires also to specify the isolation
## window from which to extract the data. Without that chromatograms
## will be empty:
chr_ms2 <- chromatogram(mse_dia, msLevel = 2L)
intensity(chr_ms2[[1L]])

## First we list available isolation windows
table(isolationWindowTargetMz(spectra(mse_dia)))

## We can then extract the TIC of MS2 data for a specific isolation window
chr_ms2 <- chromatogram(mse_dia, msLevel = 2L,
    isolationWindowTargetMz = 244.05)
plot(chr_ms2)

####
## Chromatographic peak detection

## Perform peak detection on the data using the centWave algorithm. Note
## that the parameters are chosen to reduce the run time of the example.
p <- CentWaveParam(noise = 10000, snthresh = 40, prefilter = c(3, 10000))
xmse <- findChromPeaks(mse, param = p)
xmse

## Have a quick look at the identified chromatographic peaks
head(chromPeaks(xmse))

## Extract chromatographic peaks identified between 3000 and 3300 seconds
chromPeaks(xmse, rt = c(3000, 3300), type = "within")

## Extract ion chromatograms (EIC) for the first two chromatographic
## peaks.
chrs <- chromatogram(xmse,
    mz = chromPeaks(xmse)[1:2, c("mzmin", "mzmax")],
    rt = chromPeaks(xmse)[1:2, c("rtmin", "rtmax")])

## An EIC for each sample and each of the two regions was extracted.
## Identified chromatographic peaks in the defined regions are extracted
## as well.
chrs

## Plot the EICs for the second defined region
plot(chrs[2, ])

## Subsetting the data to the results (and data) for the second sample
a <- xmse[2]
nrow(chromPeaks(xmse))
nrow(chromPeaks(a))

## Filtering the result by retention time: keeping all spectra and
## chromatographic peaks within 3000 and 3500 seconds.
xmse_sub <- filterRt(xmse, rt = c(3000, 3500))
xmse_sub
nrow(chromPeaks(xmse_sub))

## Perform an initial feature grouping to allow alignment using the
## peak groups method:
pdp <- PeakDensityParam(sampleGroups = rep(1, 3))
xmse <- groupChromPeaks(xmse, param = pdp)

## Perform alignment using the peak groups method.
pgp <- PeakGroupsParam(span = 0.4)
xmse <- adjustRtime(xmse, param = pgp)

## Visualizing the alignment results
plotAdjustedRtime(xmse)

## Performing the final correspondence analysis
xmse <- groupChromPeaks(xmse, param = pdp)

## Show the definition of the first 6 features
featureDefinitions(xmse) |> head()

## Extract the feature values; show the results for the first 6 rows.
featureValues(xmse) |> head()

## The full results can also be extracted as a `SummarizedExperiment`
## that would eventually simplify subsequent analyses with other packages.
## Any additional parameters passed to the function are passed to the
## `featureValues` function that is called to generate the feature value
## matrix.
se <- quantify(xmse, method = "sum")

## EICs for all features can be extracted with the `featureChromatograms`
## function. Note that, depending on the data set, extracting this for
## all features might take some time. Below we extract EICs for the
## first 10 features by providing the feature IDs.
chrs <- featureChromatograms(xmse,
    features = rownames(featureDefinitions(xmse))[1:10])
chrs

plot(chrs[3, ])

Filtering of features based on conventional quality assessment

Description

When dealing with metabolomics results, it is often necessary to filter features based on certain criteria. These criteria are typically derived from statistical formulas applied to full rows of data, where each row represents a feature and its signal abundance in each sample. The filterFeatures function filters features based on these conventional quality assessment criteria. Multiple types of filtering are implemented and can be selected with the filter argument.

Supported filter arguments are:

RsdFilter: Calculates the relative standard deviation (i.e. coefficient of variation) in abundance for each feature in QC (Quality Control) samples and filters them in the input object according to a provided threshold.
DratioFilter: Computes the D-ratio or dispersion ratio, defined as the standard deviation in abundance for QC samples divided by the standard deviation for biological test samples, for each feature and filters them according to a provided threshold.
PercentMissingFilter: Determines the percentage of missing values for each feature in the various sample groups and filters them according to a provided threshold.
BlankFlag: Identifies features where the mean abundance in test samples is lower than a specified multiple of the mean abundance of blank samples. This can be used to flag features that result from contamination in the solvent of the samples. A new column possible_contaminants is added to the featureDefinitions (XcmsExperiment object) or rowData (SummarizedExperiment object) reflecting this.

For specific examples, see the help pages of the individual parameter classes listed above.
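
The quantities behind RsdFilter, DratioFilter and PercentMissingFilter can be computed by hand on a small matrix. The base R sketch below only illustrates the definitions given above; it is not the code used by filterFeatures, and the matrix `vals`, the sample names and the 0.3 threshold are hypothetical.

```r
## Toy abundance matrix: rows are features, columns are samples.
vals <- rbind(
    FT1 = c(100, 105,  98, 400, 150, 900),
    FT2 = c(100, 300,  20, 210, 190, 205),
    FT3 = c( 50,  52,  NA,  NA, 280, 300)
)
colnames(vals) <- c("QC1", "QC2", "QC3", "S1", "S2", "S3")
is_qc <- grepl("^QC", colnames(vals))

## Relative standard deviation (coefficient of variation) in QC samples.
rsd <- apply(vals[, is_qc], 1, function(x)
    sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE))

## D-ratio: standard deviation in QC samples divided by the standard
## deviation in study samples.
dratio <- apply(vals[, is_qc], 1, sd, na.rm = TRUE) /
    apply(vals[, !is_qc], 1, sd, na.rm = TRUE)

## Percentage of missing values per feature and sample group.
grp <- ifelse(is_qc, "QC", "study")
pct_missing <- t(apply(vals, 1, function(x)
    tapply(is.na(x), grp, mean) * 100))

## Features that would pass an RSD threshold of 0.3.
keep_rsd <- rsd <= 0.3
keep_rsd
```

Here FT2 varies strongly between the QC injections and would therefore not pass the RSD threshold, while FT1 and FT3 would.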

Arguments

`object`	`XcmsExperiment` or `SummarizedExperiment`. For an `XcmsExperiment` object, the `featureValues(object)` will be evaluated, and for a `SummarizedExperiment` the `assay(object, assay)`. The object will be filtered.
`filter`	The parameter object selecting and configuring the type of filtering. It can be one of the following classes: `RsdFilter`, `DratioFilter`, `PercentMissingFilter` or `BlankFlag`.
`assay`	For filtering of `SummarizedExperiment` objects only. Indicates which assay the filtering will be based on. Note that the features for the entire object will be removed, but the computations are performed on a single assay. Default is 1, which means the first assay of the `object` will be evaluated.
`...`	Optional parameters. For `object` being an `XcmsExperiment`: parameters for the `featureValues()` call.

Author(s)

Philippine Louail

Examples

## See the vignettes for more detailed examples
library(MsExperiment)

## Load a test data set with features defined.
test_xcms <- loadXcmsData()
## Set up parameter to filter based on coefficient of variation. By setting
## the filter such as below, features that have a coefficient of variation
## superior to 0.3 in QC samples will be removed from the object `test_xcms`
## when calling the `filterFeatures` function.

rsd_filter <- RsdFilter(threshold = 0.3,
                        qcIndex = sampleData(test_xcms)$sample_type == "QC")

filtered_data_rsd <- filterFeatures(object = test_xcms, filter = rsd_filter)

## Set up parameter to filter based on D-ratio. By setting the filter such
## as below, features that have a D-ratio computed based on their abundance
## between QC and study samples superior to 0.5 will be removed from the
## object `test_xcms`.

dratio_filter <- DratioFilter(threshold = 0.5,
                 qcIndex = sampleData(test_xcms)$sample_type == "QC",
                 studyIndex = sampleData(test_xcms)$sample_type == "study")

filtered_data_dratio <- filterFeatures(object = test_xcms,
                                       filter = dratio_filter)

## Set up parameter to filter based on the percent of missing data.
## Parameter f should represent the sample group of samples, for which the
## percentage of missing values will be evaluated. As the setting is defined
## below, if a feature has less than (or equal to) 30% missing values in one
## sample group, it will be kept in the `test_xcms` object.

missing_data_filter <- PercentMissingFilter(threshold = 30,
                                       f = sampleData(test_xcms)$sample_type)

filtered_data_missing <- filterFeatures(object = test_xcms,
                                        filter = missing_data_filter)

## Set up parameter to flag possible contaminants based on blank samples'
## abundance. By setting the filter such as below, features that have mean
## abundance ratio between blank (here study is used as an example) and QC
## samples less than 2 will be marked as `TRUE` in an extra column named
## `possible_contaminants` in the `featureDefinitions` table of the object
## `test_xcms`.

filter <- BlankFlag(threshold = 2,
                    qcIndex = sampleData(test_xcms)$sample_type == "QC",
                    blankIndex = sampleData(test_xcms)$sample_type == "study")
filtered_xmse <- filterFeatures(test_xcms, filter)


Chromatographic Peak Detection

Description

The findChromPeaks method performs chromatographic peak detection on LC/GC-MS data. The peak detection algorithm can be selected, and configured, using the param argument.

Supported param objects are:

CentWaveParam(): chromatographic peak detection using the centWave algorithm.
CentWavePredIsoParam(): centWave with predicted isotopes. Peak detection uses a two-step centWave-based approach considering also feature isotopes.
MatchedFilterParam(): peak detection using the matched filter algorithm.
MassifquantParam(): peak detection using the Kalman filter-based massifquant method.
MSWParam(): single-spectrum non-chromatography MS data peak detection.

For specific examples see the help pages of the individual parameter classes listed above.

Usage

findChromPeaks(object, param, ...)

## S4 method for signature 'MsExperiment,Param'
findChromPeaks(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  ...,
  BPPARAM = bpparam()
)

## S4 method for signature 'XcmsExperiment,Param'
findChromPeaks(
  object,
  param,
  msLevel = 1L,
  chunkSize = 2L,
  add = FALSE,
  ...,
  BPPARAM = bpparam()
)

Arguments

`object`	The data object on which to perform the peak detection. Can be an `MSnbase::OnDiskMSnExp()`, `XCMSnExp()`, `MSnbase::MChromatograms()` or `MsExperiment::MsExperiment()` object.
`param`	The parameter object selecting and configuring the algorithm.
`...`	Optional parameters.
`msLevel`	`integer(1)` defining the MS level on which the chromatographic peak detection should be performed.
`chunkSize`	`integer(1)` for `object` being an `MsExperiment` or `XcmsExperiment()`: defines the number of files (samples) for which the full peaks data (m/z and intensity values) should be loaded into memory at the same time. Peak detection is then performed in parallel (per sample) on this subset of loaded data. This setting thus allows balancing memory demand against speed (due to parallel processing) of the peak detection. Because parallel processing can only be performed on the subset of data currently loaded into memory (in each iteration), the value for `chunkSize` should match the defined parallel processing setup. A parallel processing setup with 4 CPUs (separate processes) combined with `chunkSize = 1` will not perform any parallel processing, as only the data from one sample is loaded into memory at a time. On the other hand, setting `chunkSize` to the total number of samples in an experiment will load the full MS data into memory and will thus in most settings cause an out-of-memory error. By setting `chunkSize = -1` the peak detection will be performed separately, and in parallel, for each sample. This will however not work for all `Spectra` backends (see `Spectra::Spectra()` for details).
`BPPARAM`	Parallel processing setup. Uses by default the system-wide default setup. See `BiocParallel::bpparam()` for more details.
`add`	`logical(1)` (if `object` contains already chromatographic peaks, i.e. is either an `XCMSnExp` or `XcmsExperiment`) whether chromatographic peak detection results should be added to existing results. By default (`add = FALSE`) any additional `findChromPeaks` call on a result object will remove previous results.
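
The effect of `chunkSize` can be pictured as splitting the sample indices into consecutive groups that are loaded and processed one after the other. The base R sketch below only illustrates this idea with hypothetical numbers; it is not taken from the xcms code.

```r
## Hypothetical experiment with 7 samples, loaded 2 samples at a time.
n_samples <- 7L
chunk_size <- 2L

## Assign each sample index to a chunk of at most `chunk_size` samples.
idx <- seq_len(n_samples)
chunks <- split(idx, ceiling(idx / chunk_size))
chunks

## Number of samples per chunk, i.e. the maximum number of samples that
## could be processed in parallel in each iteration.
lengths(chunks)
```

With such a layout at most two workers can be busy at any time, which is why a `chunkSize` smaller than the number of parallel processes leaves some of them idle.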

Author(s)

Johannes Rainer

Chromatographic peak detection using the centWave method

Description

The centWave algorithm performs peak density and wavelet based chromatographic peak detection for high resolution LC/MS data in centroid mode [Tautenhahn 2008].

The CentWaveParam class allows to specify all settings for a chromatographic peak detection using the centWave method. Instances should be created with the CentWaveParam constructor.

The findChromPeaks,OnDiskMSnExp,CentWaveParam method performs chromatographic peak detection using the centWave algorithm on all samples from an OnDiskMSnExp object. OnDiskMSnExp objects encapsulate all experiment-specific data and load the spectra data (mz and intensity values) on the fly from the original files, also applying any data manipulations.

ppm,ppm<-: getter and setter for the ppm slot of the object.

peakwidth,peakwidth<-: getter and setter for the peakwidth slot of the object.

snthresh,snthresh<-: getter and setter for the snthresh slot of the object.

prefilter,prefilter<-: getter and setter for the prefilter slot of the object.

mzCenterFun,mzCenterFun<-: getter and setter for the mzCenterFun slot of the object.

integrate,integrate<-: getter and setter for the integrate slot of the object.

mzdiff,mzdiff<-: getter and setter for the mzdiff slot of the object.

fitgauss,fitgauss<-: getter and setter for the fitgauss slot of the object.

noise,noise<-: getter and setter for the noise slot of the object.

verboseColumns,verboseColumns<-: getter and setter for the verboseColumns slot of the object.

roiList,roiList<-: getter and setter for the roiList slot of the object.

firstBaselineCheck,firstBaselineCheck<-: getter and setter for the firstBaselineCheck slot of the object.

roiScales,roiScales<-: getter and setter for the roiScales slot of the object.

Usage

CentWaveParam(
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1L,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  roiList = list(),
  firstBaselineCheck = TRUE,
  roiScales = numeric(),
  extendLengthMSW = FALSE,
  verboseBetaColumns = FALSE
)

## S4 method for signature 'OnDiskMSnExp,CentWaveParam'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  ...
)

## S4 method for signature 'CentWaveParam'
ppm(object)

## S4 replacement method for signature 'CentWaveParam'
ppm(object) <- value

## S4 method for signature 'CentWaveParam'
peakwidth(object)

## S4 replacement method for signature 'CentWaveParam'
peakwidth(object) <- value

## S4 method for signature 'CentWaveParam'
snthresh(object)

## S4 replacement method for signature 'CentWaveParam'
snthresh(object) <- value

## S4 method for signature 'CentWaveParam'
prefilter(object)

## S4 replacement method for signature 'CentWaveParam'
prefilter(object) <- value

## S4 method for signature 'CentWaveParam'
mzCenterFun(object)

## S4 replacement method for signature 'CentWaveParam'
mzCenterFun(object) <- value

## S4 method for signature 'CentWaveParam'
integrate(f)

## S4 replacement method for signature 'CentWaveParam'
integrate(object) <- value

## S4 method for signature 'CentWaveParam'
mzdiff(object)

## S4 replacement method for signature 'CentWaveParam'
mzdiff(object) <- value

## S4 method for signature 'CentWaveParam'
fitgauss(object)

## S4 replacement method for signature 'CentWaveParam'
fitgauss(object) <- value

## S4 method for signature 'CentWaveParam'
noise(object)

## S4 replacement method for signature 'CentWaveParam'
noise(object) <- value

## S4 method for signature 'CentWaveParam'
verboseColumns(object)

## S4 replacement method for signature 'CentWaveParam'
verboseColumns(object) <- value

## S4 method for signature 'CentWaveParam'
roiList(object)

## S4 replacement method for signature 'CentWaveParam'
roiList(object) <- value

## S4 method for signature 'CentWaveParam'
firstBaselineCheck(object)

## S4 replacement method for signature 'CentWaveParam'
firstBaselineCheck(object) <- value

## S4 method for signature 'CentWaveParam'
roiScales(object)

## S4 replacement method for signature 'CentWaveParam'
roiScales(object) <- value

## S4 method for signature 'CentWaveParam'
as.list(x, ...)

Arguments

`ppm`	`numeric(1)` defining the maximal tolerated m/z deviation in consecutive scans in parts per million (ppm) for the initial ROI definition.
`peakwidth`	`numeric(2)` with the expected approximate peak width in chromatographic space. Given as a range (min, max) in seconds.
`snthresh`	`numeric(1)` defining the signal to noise ratio cutoff.
`prefilter`	`numeric(2)`: `c(k, I)` specifying the prefilter step for the first analysis step (ROI detection). Mass traces are only retained if they contain at least `k` peaks with intensity `>= I`.
`mzCenterFun`	Name of the function to calculate the m/z center of the chromatographic peak. Allowed are: `"wMean"`: intensity weighted mean of the peak's m/z values, `"mean"`: mean of the peak's m/z values, `"apex"`: use the m/z value at the peak apex, `"wMeanApex3"`: intensity weighted mean of the m/z value at the peak apex and the m/z values left and right of it and `"meanApex3"`: mean of the m/z value of the peak apex and the m/z values left and right of it.
`integrate`	Integration method. For `integrate = 1` peak limits are found through descent on the mexican hat filtered data, for `integrate = 2` the descent is done on the real data. The latter method is more accurate but prone to noise, while the former is more robust, but less exact.
`mzdiff`	`numeric(1)` representing the minimum difference in m/z dimension required for peaks with overlapping retention times; can be negative to allow overlap. During peak post-processing, peaks defined to be overlapping are reduced to the one peak with the largest signal.
`fitgauss`	`logical(1)` whether or not a Gaussian should be fitted to each peak. This affects mostly the retention time position of the peak.
`noise`	`numeric(1)` defining the minimum intensity required for centroids to be considered in the first analysis step (centroids with intensity `< noise` are omitted from ROI detection).
`verboseColumns`	`logical(1)` whether additional peak meta data columns should be returned.
`roiList`	An optional list of regions-of-interest (ROI) representing detected mass traces. If ROIs are submitted the first analysis step is omitted and chromatographic peak detection is performed on the submitted ROIs. Each ROI is expected to have the following elements specified: `scmin` (start scan index), `scmax` (end scan index), `mzmin` (minimum m/z), `mzmax` (maximum m/z), `length` (number of scans), `intensity` (summed intensity). Each ROI should be represented by a `list` of elements or a single row `data.frame`.
`firstBaselineCheck`	`logical(1)`. If `TRUE` continuous data within regions of interest is checked to be above the first baseline. In detail, a first rough estimate of the noise is calculated and peak detection is performed only in regions in which multiple sequential signals are higher than this first estimated baseline/noise level.
`roiScales`	Optional numeric vector with length equal to `roiList` defining the scale for each region of interest in `roiList` that should be used for the centWave-wavelets.
`extendLengthMSW`	Option to force centWave to use all scales when running centWave rather than truncating with the EIC length. Uses the "open" method to extend the EIC to an integer base-2 length prior to being passed to `convolve` rather than the default "reflect" method. See https://github.com/sneumann/xcms/issues/445 for more information.
`verboseBetaColumns`	Option to calculate two additional metrics of peak quality via comparison to an idealized bell curve. Adds `beta_cor` and `beta_snr` to the `chromPeaks` output, corresponding to a Pearson correlation coefficient to a bell curve with several degrees of skew as well as an estimate of signal-to-noise using the residuals from the best-fitting bell curve. See https://github.com/sneumann/xcms/pull/685 and https://doi.org/10.1186/s12859-023-05533-4 for more information.
`object`	For `findChromPeaks`: an `OnDiskMSnExp` object containing the MS- and all other experiment-relevant data. For all other methods: a parameter object.
`param`	A `CentWaveParam` object containing all settings for the centWave algorithm.
`BPPARAM`	A parameter class specifying if and how parallel processing should be performed. It defaults to `bpparam`. See the documentation of the `BiocParallel` package for more details. If parallel processing is enabled, peak detection is performed in parallel on several of the input samples.
`return.type`	Character specifying what type of object the method should return. Can be either `"XCMSnExp"` (default), `"list"` or `"xcmsSet"`.
`msLevel`	`integer(1)` defining the MS level on which the peak detection should be performed. Defaults to `msLevel = 1`.
`...`	ignored.
`value`	The value for the slot.
`f`	For `integrate`: a `CentWaveParam` object.
`x`	The parameter object.
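
The options listed for `mzCenterFun` can be illustrated with a small base R sketch. The helper `mz_center` and the centroid values below are hypothetical and only mirror the descriptions in the argument table; this is not the code xcms runs internally.

```r
## Hypothetical centroids of one chromatographic peak (values made up).
mz  <- c(300.1000, 300.1004, 300.1002, 300.0998, 300.1006)
int <- c(200, 1500, 4000, 1800, 300)

## Illustrative helper (an assumption, not part of xcms).
mz_center <- function(mz, int, method = c("wMean", "mean", "apex",
                                          "wMeanApex3", "meanApex3")) {
    method <- match.arg(method)
    apex <- which.max(int)
    ## Apex plus its direct neighbours, clipped to the valid index range.
    idx3 <- max(1, apex - 1):min(length(mz), apex + 1)
    switch(method,
           wMean      = weighted.mean(mz, int),
           mean       = mean(mz),
           apex       = mz[apex],
           wMeanApex3 = weighted.mean(mz[idx3], int[idx3]),
           meanApex3  = mean(mz[idx3]))
}

methods <- c("wMean", "mean", "apex", "wMeanApex3", "meanApex3")
centers <- vapply(methods, function(m) mz_center(mz, int, m), numeric(1))
print(centers, digits = 8)
```

The intensity-weighted variants pull the reported m/z towards the most intense centroids, whereas `"apex"` relies on a single measurement.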

Details

The centWave algorithm is most suitable for high resolution LC/{TOF,OrbiTrap,FTICR}-MS data in centroid mode. In the first phase the method identifies regions of interest (ROIs) representing mass traces, i.e. regions in the LC/MS map in which the m/z deviates by less than ppm between consecutive scans. In detail, starting with a single m/z, a ROI is extended if an m/z can be found in the next scan (spectrum) whose difference to the mean m/z of the ROI is smaller than the user-defined ppm of the m/z. The mean m/z of the ROI is then updated to include the newly added m/z value.
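
The ROI extension rule described above, followed by the `prefilter = c(k, I)` step from the argument table, can be sketched in base R roughly as follows. The helper `find_rois`, the toy scans and the choice to express the tolerance relative to the running mean m/z of the ROI are assumptions made for illustration; the actual xcms implementation is compiled code and differs in detail.

```r
## Hypothetical centroided scans (values made up): one data.frame per scan.
scans <- list(
    data.frame(mz = c(200.0000, 350.2000), intensity = c(500, 90)),
    data.frame(mz = c(200.0020, 350.2500), intensity = c(900, 80)),
    data.frame(mz = 200.0010,              intensity = 700),
    data.frame(mz = c(200.0030, 410.1000), intensity = c(300, 60))
)

## Illustrative helper (an assumption, not the xcms implementation).
find_rois <- function(scans, ppm = 25, prefilter = c(3, 100)) {
    open <- list()    # ROIs that were extended in the previous scan
    closed <- list()  # ROIs that could not be extended any further
    for (i in seq_along(scans)) {
        sc <- scans[[i]]
        extended <- logical(length(open))
        fresh <- list()
        for (j in seq_len(nrow(sc))) {
            ## Deviation (ppm) of this centroid from each open ROI's mean m/z.
            dev <- vapply(open, function(r) {
                abs(sc$mz[j] - mean(r$mz)) / mean(r$mz) * 1e6
            }, numeric(1))
            hit <- which(dev < ppm & !extended)
            if (length(hit)) {
                ## Extend the closest ROI; its mean m/z changes implicitly.
                h <- hit[which.min(dev[hit])]
                open[[h]]$mz <- c(open[[h]]$mz, sc$mz[j])
                open[[h]]$intensity <- c(open[[h]]$intensity, sc$intensity[j])
                open[[h]]$scmax <- i
                extended[h] <- TRUE
            } else {
                ## No matching ROI: start a new one at this scan.
                fresh[[length(fresh) + 1L]] <- list(
                    mz = sc$mz[j], intensity = sc$intensity[j],
                    scmin = i, scmax = i)
            }
        }
        closed <- c(closed, open[!extended])
        open <- c(open[extended], fresh)
    }
    rois <- c(closed, open)
    ## Prefilter: keep ROIs with at least k centroids of intensity >= I.
    keep <- vapply(rois, function(r) {
        sum(r$intensity >= prefilter[2]) >= prefilter[1]
    }, logical(1))
    rois[keep]
}

rois <- find_rois(scans, ppm = 25, prefilter = c(3, 100))
str(rois)
```

In this toy data only the trace around m/z 200.001 spans several consecutive scans within 25 ppm and passes the prefilter; the isolated centroids form single-scan ROIs that are discarded.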

Parallel processing (one process per sample) is supported and can be configured either by the BPPARAM parameter or by globally defining the parallel processing mode using the register method from the BiocParallel package.

Value

The CentWaveParam function returns a CentWaveParam class instance with all of the settings specified for chromatographic peak detection by the centWave method.

For findChromPeaks: if return.type = "XCMSnExp" an XCMSnExp object with the results of the peak detection. If return.type = "list" a list of length equal to the number of samples with matrices specifying the identified peaks. If return.type = "xcmsSet" an xcmsSet object with the results of the peak detection.

Slots

ppm,peakwidth,snthresh,prefilter,mzCenterFun,integrate,mzdiff,fitgauss,noise,verboseColumns,roiList,firstBaselineCheck,roiScales,extendLengthMSW,verboseBetaColumns: See corresponding parameter above. Slot values should exclusively be accessed via the corresponding getter and setter methods listed above.

Note

These methods and classes are part of the updated and modernized xcms user interface, which will eventually replace the findPeaks methods. They support peak detection on OnDiskMSnExp objects (defined in the MSnbase package). All of the settings for the centWave algorithm can be passed with a CentWaveParam object.

Author(s)

Ralf Tautenhahn, Johannes Rainer

References

Ralf Tautenhahn, Christoph Böttcher, and Steffen Neumann "Highly sensitive feature detection for high resolution LC/MS" BMC Bioinformatics 2008, 9:504

Examples


## Create a CentWaveParam object. Note that the noise is set to 10000 to
## speed up the execution of the example - in a real use case the default
## value should be used, or it should be set to a reasonable value.
cwp <- CentWaveParam(ppm = 20, noise = 10000, prefilter = c(3, 10000))
## Change snthresh parameter
snthresh(cwp) <- 25
cwp

## Perform the peak detection using centWave on some of the files from the
## faahKO package. Files are read using the `readMsExperiment` function
## from the MsExperiment package
library(faahKO)
library(xcms)
library(MsExperiment)
fls <- dir(system.file("cdf/KO", package = "faahKO"), recursive = TRUE,
           full.names = TRUE)
raw_data <- readMsExperiment(fls[1])

## Perform the peak detection using the settings defined above.
res <- findChromPeaks(raw_data, param = cwp)
head(chromPeaks(res))

Two-step centWave peak detection considering also isotopes

Description

This method performs a two-step centWave-based chromatographic peak detection: a first centWave run identifies peaks, for which the location of their potential isotopes in the m/z-retention time space is then predicted. A second centWave run is then performed on these regions of interest (ROIs). The final list of chromatographic peaks comprises all non-overlapping peaks from both centWave runs.

The CentWavePredIsoParam class allows specifying all settings for the two-step centWave-based peak detection that also considers predicted isotopes of peaks identified in the first centWave run. Instances should be created with the CentWavePredIsoParam constructor. See also the documentation of CentWaveParam for all methods and arguments this class inherits.

The findChromPeaks,OnDiskMSnExp,CentWavePredIsoParam method performs a two-step centWave-based chromatographic peak detection on all samples from an OnDiskMSnExp object. OnDiskMSnExp objects encapsulate all experiment-specific data and load the spectra data (mz and intensity values) on the fly from the original files, also applying any pending data manipulations.

snthreshIsoROIs,snthreshIsoROIs<-: getter and setter for the snthreshIsoROIs slot of the object.

maxCharge,maxCharge<-: getter and setter for the maxCharge slot of the object.

maxIso,maxIso<-: getter and setter for the maxIso slot of the object.

mzIntervalExtension,mzIntervalExtension<-: getter and setter for the mzIntervalExtension slot of the object.

polarity,polarity<-: getter and setter for the polarity slot of the object.

Usage

CentWavePredIsoParam(
  ppm = 25,
  peakwidth = c(20, 50),
  snthresh = 10,
  prefilter = c(3, 100),
  mzCenterFun = "wMean",
  integrate = 1L,
  mzdiff = -0.001,
  fitgauss = FALSE,
  noise = 0,
  verboseColumns = FALSE,
  roiList = list(),
  firstBaselineCheck = TRUE,
  roiScales = numeric(),
  extendLengthMSW = FALSE,
  verboseBetaColumns = FALSE,
  snthreshIsoROIs = 6.25,
  maxCharge = 3,
  maxIso = 5,
  mzIntervalExtension = TRUE,
  polarity = "unknown"
)

## S4 method for signature 'OnDiskMSnExp,CentWavePredIsoParam'
findChromPeaks(
  object,
  param,
  BPPARAM = bpparam(),
  return.type = "XCMSnExp",
  msLevel = 1L,
  ...
)

## S4 method for signature 'CentWavePredIsoParam'
snthreshIsoROIs(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
snthreshIsoROIs(object) <- value

## S4 method for signature 'CentWavePredIsoParam'
maxCharge(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
maxCharge(object) <- value

## S4 method for signature 'CentWavePredIsoParam'
maxIso(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
maxIso(object) <- value

## S4 method for signature 'CentWavePredIsoParam'
mzIntervalExtension(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
mzIntervalExtension(object) <- value

## S4 method for signature 'CentWavePredIsoParam'
polarity(object)

## S4 replacement method for signature 'CentWavePredIsoParam'
polarity(object) <- value

Arguments

`ppm`	`numeric(1)` defining the maximal tolerated m/z deviation in consecutive scans in parts per million (ppm) for the initial ROI definition.
`peakwidth`	`numeric(2)` with the expected approximate peak width in chromatographic space. Given as a range (min, max) in seconds.
`snthresh`	`numeric(1)` defining the signal to noise ratio cutoff.
`prefilter`	`numeric(2)`: `c(k, I)` specifying the prefilter step for the first analysis step (ROI detection). Mass traces are only retained if they contain at least `k` peaks with intensity `>= I`.
`mzCenterFun`	Name of the function to calculate the m/z center of the chromatographic peak. Allowed are: `"wMean"`: intensity weighted mean of the peak's m/z values, `"mean"`: mean of the peak's m/z values, `"apex"`: use the m/z value at the peak apex, `"wMeanApex3"`: intensity weighted mean of the m/z value at the peak apex and the m/z values left and right of it and `"meanApex3"`: mean of the m/z value of the peak apex and the m/z values left and right of it.
`integrate`	Integration method. For `integrate = 1` peak limits are found through descent on the mexican hat filtered data, for `integrate = 2` the descent is done on the real data. The latter method is more accurate but prone to noise, while the former is more robust, but less exact.
`mzdiff`	`numeric(1)` representing the minimum difference in m/z dimension required for peaks with overlapping retention times; can be negative to allow overlap. During peak post-processing, peaks defined to be overlapping are reduced to the one peak with the largest signal.
`fitgauss`	`logical(1)` whether or not a Gaussian should be fitted to each peak. This affects mostly the retention time position of the peak.
`noise`	`numeric(1)` defining the minimum intensity required for centroids to be considered in the first analysis step (centroids with intensity `< noise` are omitted from ROI detection).
`verboseColumns`	`logical(1)` whether additional peak meta data columns should be returned.
`roiList`	An optional list of regions-of-interest (ROI) representing detected mass traces. If ROIs are submitted the first analysis step is omitted and chromatographic peak detection is performed on the submitted ROIs. Each ROI is expected to have the following elements specified: `scmin` (start scan index), `scmax` (end scan index), `mzmin` (minimum m/z), `mzmax` (maximum m/z), `length` (number of scans), `intensity` (summed intensity). Each ROI should be represented by a `list` of elements or a single row `data.frame`.
`firstBaselineCheck`	`logical(1)`. If `TRUE` continuous data within regions of interest is checked to be above the first baseline. In detail, a first rough estimate of the noise is calculated and peak detection is performed only in regions in which multiple sequential signals are higher than this first estimated baseline/noise level.
`roiScales`	Optional numeric vector with length equal to `roiList` defining the scale for each region of interest in `roiList` that should be used for the centWave-wavelets.
`extendLengthMSW`	Option to force centWave to use all scales when running centWave rather than truncating with the EIC length. Uses the "open" method to extend the EIC to an integer base-2 length prior to being passed to `convolve` rather than the default "reflect" method. See https://github.com/sneumann/xcms/issues/445 for more information.
`verboseBetaColumns`	Option to calculate two additional metrics of peak quality via comparison to an idealized bell curve. Adds `beta_cor` and `beta_snr` to the `chromPeaks` output, corresponding to a Pearson correlation coefficient to a bell curve with several degrees of skew as well as an estimate of signal-to-noise using the residuals from the best-fitting bell curve. See https://github.com/sneumann/xcms/pull/685 and https://doi.org/10.1186/s12859-023-05533-4 for more information.
`snthreshIsoROIs`	`numeric(1)` defining the signal to noise ratio cutoff to be used in the second centWave run to identify peaks for predicted isotope ROIs.
`maxCharge`	`integer(1)` defining the maximal isotope charge. Isotopes will be defined for charges `1:maxCharge`.
`maxIso`	`integer(1)` defining the number of isotope peaks that should be predicted for each peak identified in the first centWave run.
`mzIntervalExtension`	`logical(1)` whether the mz range for the predicted isotope ROIs should be extended to increase detection of low intensity peaks.
`polarity`	`character(1)` specifying the polarity of the data. Currently not used, but has to be `"positive"`, `"negative"` or `"unknown"` if provided.
`object`	For `findChromPeaks`: an `OnDiskMSnExp` object containing the MS- and all other experiment-relevant data. For all other methods: a parameter object.
`param`	A `CentWavePredIsoParam` object with the settings for the chromatographic peak detection algorithm.
`BPPARAM`	A parameter class specifying if and how parallel processing should be performed. It defaults to `bpparam`. See the documentation of the `BiocParallel` package for more details. If parallel processing is enabled, peak detection is performed in parallel on several of the input samples.
`return.type`	Character specifying what type of object the method should return. Can be either `"XCMSnExp"` (default), `"list"` or `"xcmsSet"`.
`msLevel`	`integer(1)` defining the MS level on which the peak detection should be performed. Defaults to `msLevel = 1`.
`...`	ignored.
`value`	The value for the slot.
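
The structure expected for `roiList` (one `list` or single-row `data.frame` per ROI, with the elements `scmin`, `scmax`, `mzmin`, `mzmax`, `length` and `intensity`) can be built in base R as sketched below. The trace values are made up, and deriving `length` from inclusive scan indices is an assumption.

```r
## Hypothetical mass traces (values made up), one row per ROI.
traces <- data.frame(
    scmin = c(120L, 300L),
    scmax = c(150L, 342L),
    mzmin = c(200.0995, 315.2101),
    mzmax = c(200.1012, 315.2123),
    intensity = c(1.8e5, 4.2e4)
)
## Number of scans covered by each ROI, assuming inclusive scan indices.
traces$length <- traces$scmax - traces$scmin + 1L

## One named list per ROI holding the six expected elements.
roi_list <- lapply(seq_len(nrow(traces)), function(i) as.list(traces[i, ]))
str(roi_list[[1]])

## With xcms attached, such a list could then be supplied to the
## constructor, for example:
## cwp <- CentWaveParam(roiList = roi_list)
```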

Details

See centWave for details on the centWave method.
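
To give an idea of how `maxCharge` and `maxIso` span the candidate isotope positions for one peak from the first centWave run, here is a minimal base R sketch. The helper `predict_isotope_mz`, the example m/z and the use of the 13C - 12C mass difference (1.0033548) as isotope spacing are assumptions for illustration; how xcms derives the final ROI boundaries (including `mzIntervalExtension`) is not shown.

```r
## Illustrative helper (an assumption, not the xcms implementation).
## Isotope peaks are assumed to be spaced by the 13C - 12C mass
## difference divided by the charge.
predict_isotope_mz <- function(mz, maxCharge = 3, maxIso = 5,
                               iso_spacing = 1.0033548) {
    grid <- expand.grid(iso = seq_len(maxIso), charge = seq_len(maxCharge))
    grid$mz <- mz + grid$iso * iso_spacing / grid$charge
    grid[order(grid$mz), ]
}

## Candidate isotope m/z values for a hypothetical peak at m/z 301.1410.
iso <- predict_isotope_mz(301.1410, maxCharge = 2, maxIso = 3)
iso
```

Each row corresponds to one combination of isotope number and charge in `1:maxIso` and `1:maxCharge`, so the number of candidate positions per peak grows as `maxCharge * maxIso`.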

Value

The CentWavePredIsoParam function returns a CentWavePredIsoParam class instance with all of the settings specified for the two-step centWave-based peak detection considering also isotopes.

Slots

ppm,peakwidth,snthresh,prefilter,mzCenterFun,integrate,mzdiff,fitgauss,noise,verboseColumns,roiList,firstBaselineCheck,roiScales,extendLengthMSW,verboseBetaColumns,snthreshIsoROIs,maxCharge,maxIso,mzIntervalExtension,polarity: See corresponding parameter above.

Note

These methods and classes are part of the updated and modernized xcms user interface, which will eventually replace the findPeaks methods. They support chromatographic peak detection on OnDiskMSnExp objects (defined in the MSnbase package). All of the settings for the algorithm can be passed with a CentWavePredIsoParam object.

Author(s)

Hendrik Treutler, Johannes Rainer

Examples


## Create a param object
p <- CentWavePredIsoParam(maxCharge = 4)
## Change snthresh parameter
snthresh(p) <- 25
p


Chromatographic peak detection using the massifquant method

Description

Massifquant is a Kalman filter (KF)-based chromatographic peak detection method for LC-MS data in centroid mode. The identified peaks can be further refined with the centWave method (see findChromPeaks-centWave for details on centWave) by specifying withWave = TRUE.

The MassifquantParam class allows specifying all settings for a chromatographic peak detection using the massifquant method, optionally in combination with the centWave algorithm. Instances should be created with the MassifquantParam constructor.

The findChromPeaks,OnDiskMSnExp,MassifquantParam method performs chromatographic peak detection using the massifquant algorithm on all samples from an OnDiskMSnExp object. OnDiskMSnExp objects encapsulate all experiment-specific data and load the spectra data (mz and intensity values) on the fly from the original files, also applying any pending data manipulations.