Package 'cosmiq' reference manual

Title:	cosmiq - COmbining Single Masses Into Quantities
Description:	cosmiq is a tool for the preprocessing of liquid- or gas - chromatography mass spectrometry (LCMS/GCMS) data with a focus on metabolomics or lipidomics applications. To improve the detection of low abundant signals, cosmiq generates master maps of the mZ/RT space from all acquired runs before a peak detection algorithm is applied. The result is a more robust identification and quantification of low-intensity MS signals compared to conventional approaches where peak picking is performed in each LCMS/GCMS file separately. The cosmiq package builds on the xcmsSet object structure and can be therefore integrated well with the package xcms as an alternative preprocessing step.
Authors:	David Fischer [aut, cre], Christian Panse [aut] , Endre Laczko [ctb]
Maintainer:	David Fischer <[email protected]>
License:	GPL-3
Version:	1.41.0
Built:	2025-03-30 07:17:06 UTC
Source:	https://github.com/bioc/cosmiq

Combine mass spectra of each scan and each file into a single master spectrum

Description

combine_spectra imports each raw file using xcmsRaw and assigns each ion to a previously defined mass vector, which is created using bin size parameter mzbin. This process is repeated for each raw file.

Usage


    combine_spectra(xs, mzbin=0.003, linear=FALSE, continuum=FALSE)    

combine_spectra(xs, mzbin=0.003, linear=FALSE, continuum=FALSE)

Arguments

`xs`	xcmsSet object
`mzbin`	Bin size for the generation of mass vector
`linear`	logical. If TRUE, linear vector will be generated with `mzbin` increments. If FALSE, mass vector will be generated using a non-linear function. This option is recommended for TOF-type mass detectors
`continuum`	boolean flag. default value is FALSE.

Details

This processing step calculates a combined mass spectrum. Mass spectra not only from all scans of a single LCMS run alone are combined but from all acquired datasets. As a result, signal to noise ratio increases for each additional LCMS run.

Author(s)

David Fischer 2013

Examples


cdfpath <- file.path(find.package("faahKO"), "cdf")

my.input.files <- dir(c(paste(cdfpath, "WT", sep='/'),
    paste(cdfpath, "KO", sep='/')), full.names=TRUE)


# create xcmsSet object
xs <- new("xcmsSet")
xs@filepaths <- my.input.files

x<-combine_spectra(xs=xs, mzbin=0.25,
    linear=TRUE, continuum=FALSE)

plot(x$mz, x$intensity, type='l',
    xlab='m/Z', ylab='ion intensity')


cdfpath <- file.path(find.package("faahKO"), "cdf")

my.input.files <- dir(c(paste(cdfpath, "WT", sep='/'),
    paste(cdfpath, "KO", sep='/')), full.names=TRUE)


# create xcmsSet object
xs <- new("xcmsSet")
xs@filepaths <- my.input.files

x<-combine_spectra(xs=xs, mzbin=0.25,
    linear=TRUE, continuum=FALSE)

plot(x$mz, x$intensity, type='l',
    xlab='m/Z', ylab='ion intensity')

cosmiq - main wrapper function

Description

This is the main wrapper function for the package cosmiq. Every processing step of cosmiq will be calculated during this function, including mass spectra combination, detection of relevant masses, generation and combination of extracted ion chromatograms, detection of chromatographic peaks, Localisation and quantification of detected peaks.

Usage


    cosmiq (files, RTinfo=TRUE, mzbin=0.003, 
        center=0, linear=FALSE, profStep=1, retcorrect=TRUE, 
        continuum=FALSE, rtcombine=c(0,0), scales=c(1:10), 
        SNR.Th=10, SNR.area=20, RTscales=c(1:10, seq(12,32,2)), 
        RTSNR.Th=20, RTSNR.area=20, mintr=0.5)

cosmiq (files, RTinfo=TRUE, mzbin=0.003, 
        center=0, linear=FALSE, profStep=1, retcorrect=TRUE, 
        continuum=FALSE, rtcombine=c(0,0), scales=c(1:10), 
        SNR.Th=10, SNR.area=20, RTscales=c(1:10, seq(12,32,2)), 
        RTSNR.Th=20, RTSNR.area=20, mintr=0.5)

Arguments

`files`	String vector containing exact location of each file
`RTinfo`	Logical indicating whether retention time should be used or not. If FALSE, the quantitative information will be retrieved as a sum of ion intensities whithin a retention time window. This retention time window
`rtcombine`	Numerical, with two entries. If no retention time information is to be used, `rtcombine` provides the retention time window where ion intensities will be summed for quantification. If `rtcombine = c(0,0)`, All scans will be used
`mzbin`	Bin size for the generation of mass vector
`center`	Indicates number of file which will be used for retention time alignment. If zero, file will be automatically selected based on maximum summed ion intensity of the combined mass spectrum.
`linear`	logical. If TRUE, linear vector will be generated with `mzbin` increments. If FALSE, mass vector will be generated using a non-linear function. This option is recommended for TOF-type mass detectors
`profStep`	step size (in m/z) to use for profile generation from the raw data files.
`retcorrect`	Logical, should retention time correction be used? `default = TRUE`
`continuum`	Logical, is continuum data used? `default = FALSE`
`scales`	Peak Detection of Mass Peaks in the combined mass spectrum: scales for continuous wavelet transformation (CWT).
`SNR.Th`	Peak Detection of Mass Peaks in the combined mass spectrum: Signal to noise ratio threshold
`SNR.area`	Peak Detection of Mass Peaks in the combined mass spectrum: Area around the peak for noise determination. Indicates number of surrounding peaks on the first CWT scale. `default = 20`
`RTscales`	Peak Detection of Chromatographic peaks on the combined extracted ion chromatogram: scales for continuous wavelet transformation (CWT), see also the `peakDetectionCWT` function of the `MassSpecWavelet` package.
`RTSNR.Th`	Peak Detection of Chromatographic peaks on the combined extracted ion chromatogram: Signal to noise ratio threshold
`RTSNR.area`	Peak Detection of Chromatographic peaks on the combined extracted ion chromatogram: Area around the peak for noise determination. Indicates number of surrounding peaks on the first CWT scale. `default = 20`
`mintr`	Minimal peak width intensity treshold (in percentage), for which two overlapping peaks are considered as separated. The default calue is 0.5.

Details

Current data processing tools focus on a peak detection strategy where initially each LCMS run is treated independently from each other. Subsequent data processing for the alignment of different samples is then calculated on reduced peak tables. This function involves the merging of all LCMS datasets of a given experiment as a first step of raw data processing. The merged LCMS dataset contains an overlay and the sum of ion intensities from all LCMS runs. Peak detection is then performed only on this merged dataset and quantification of the signals in each sample is guided by the peak location in the merged data.

See also xcms::retcor.obiwarp and cosmiq::create_datamatrix which executes each step of the cosmiq function.

For running examples please read the package vignette or ?cosmiq::create_datamatrix.

Author(s)

Christan Panse, David Fischer 2014

Examples


    ## Not run: 
    # the following lines consume 2m17.877s  
    # on a Intel(R) Xeon(R) CPU X5650  @ 2.67GHz
    library(faahKO)

    cdfpath <- file.path(find.package("faahKO"), "cdf")
    my.input.files <- dir(c(paste(cdfpath, "WT", sep='/'), 
        paste(cdfpath, "KO", sep='/')), full.names=TRUE)

     x <- cosmiq(files=my.input.files, mzbin=0.25, SNR.Th=0, linear=TRUE)

     image(t(x$eicmatrix^(1/4)), 
        col=rev(gray(1:20/20)),
        xlab='rt', 
        ylab='m/z', axes=FALSE )

     head(xcms::peakTable(x$xcmsSet))
    
## End(Not run)

## Not run: 
    # the following lines consume 2m17.877s  
    # on a Intel(R) Xeon(R) CPU X5650  @ 2.67GHz
    library(faahKO)

    cdfpath <- file.path(find.package("faahKO"), "cdf")
    my.input.files <- dir(c(paste(cdfpath, "WT", sep='/'), 
        paste(cdfpath, "KO", sep='/')), full.names=TRUE)

     x <- cosmiq(files=my.input.files, mzbin=0.25, SNR.Th=0, linear=TRUE)

     image(t(x$eicmatrix^(1/4)), 
        col=rev(gray(1:20/20)),
        xlab='rt', 
        ylab='m/z', axes=FALSE )

     head(xcms::peakTable(x$xcmsSet))
    
## End(Not run)

Quantifying mz/RT intensities using peak locations from master map

Description

create_datamatrix identifies mz/RT peak boundaries in each raw file using the information from a master mass spectrum and master EIC. For each mz/RT location, the peak volume is calculated and stored in a report table.

Usage


    create_datamatrix(xs,rxy)

create_datamatrix(xs,rxy)

Arguments

`xs`	xcmsSet object
`rxy`	matrix containing mz and RT boundaries for each detected peak

Details

With the information about their position in the combined datasets, each individual mz/RT feature is located in the raw data.

Author(s)

David Fischer 2013

Examples

    ## Not run: 
    cdfpath <- file.path(find.package("faahKO"), "cdf")
    my.input.files <- dir(c(paste(cdfpath, "WT", sep='/'),
        paste(cdfpath, "KO", sep='/')), full.names=TRUE)

    #
    # create xcmsSet object
    # todo
    xs <- new("xcmsSet")
    # consider only two files!!!
    xs@filepaths <- my.input.files[1:2]

    class<-as.data.frame(c(rep("KO",2),rep("WT", 0)))
    rownames(class)<-basename(my.input.files[1:2])
    xs@phenoData<-class

    x<-combine_spectra(xs=xs, mzbin=0.25,
        linear=TRUE, continuum=FALSE)

    plot(x$mz, x$intensity, type='l',
        xlab='m/Z', ylab='ion intensity')

    xy <- peakdetection(x=x$mz, y=x$intensity, scales=1:10,
    SNR.Th=0.0, SNR.area=20, mintr=0.5)

    id.peakcenter<-xy[,4]

    plot(x$mz, x$intensity, type='l',
        xlim=c(440,460),
            xlab='m/Z', ylab='ion intensity')

    points(x$mz[id.peakcenter], x$intensity[id.peakcenter],
        col='red', type='h')


    # create dummy object
    xs@peaks<-matrix(c(rep(1, length(my.input.files) * 6),
        1:length(my.input.files)), ncol=7)

    colnames(xs@peaks) <- c("mz", "mzmin", "mzmax", "rt",
        "rtmin", "rtmax", "sample")

    xs<-xcms::retcor(xs, method="obiwarp", profStep=1,
        distFunc="cor", center=1)



    eicmat<-eicmatrix(xs=xs, xy=xy, center=1)

    #  process a reduced mz range for a better package build performance
    (eicmat.mz.range<-range(which(475 < xy[,1] & xy[,1] < 485)))

    eicmat.filter <- eicmat[eicmat.mz.range[1]:eicmat.mz.range[2],]
    xy.filter <- xy[eicmat.mz.range[1]:eicmat.mz.range[2],]

    #
    # determine the new range and plot the mz versus RT map
    (rt.range <- range(as.double(colnames(eicmat.filter))))
    (mz.range<-range(as.double(row.names(eicmat.filter))))

    image(log(t(eicmat.filter))/log(2), col=rev(gray(1:20/20)),
        xlab='rt [in seconds]', ylab='m/z', axes=FALSE,
        main='overlay of 12 samples using faahKO')

    axis(1, seq(0,1, length=6), round(seq(rt.range[1], rt.range[2], length=6)))
    axis(2, seq(0,1, length=4), seq(mz.range[1], mz.range[2], length=4))

    #
    # determine the chromatographic peaks
    rxy<-retention_time(xs=xs, RTscales=c(1:10, seq(12,32, by=2)),
        xy=xy.filter,
        eicmatrix=eicmat.filter,
        RTSNR.Th=120, RTSNR.area=20)

    rxy.rt <- (rxy[,4] - rt.range[1])/diff(rt.range)
    rxy.mz <- (rxy[,1] - mz.range[1])/diff(mz.range)

    points(rxy.rt, rxy.mz, pch="X", lwd=2, col="red")


    xs<-create_datamatrix(xs=xs, rxy=rxy)

    peaktable <- xcms::peakTable(xs)

    head(peaktable)
    
## End(Not run)
## Not run: 
    cdfpath <- file.path(find.package("faahKO"), "cdf")
    my.input.files <- dir(c(paste(cdfpath, "WT", sep='/'),
        paste(cdfpath, "KO", sep='/')), full.names=TRUE)

    #
    # create xcmsSet object
    # todo
    xs <- new("xcmsSet")
    # consider only two files!!!
    xs@filepaths <- my.input.files[1:2]

    class<-as.data.frame(c(rep("KO",2),rep("WT", 0)))
    rownames(class)<-basename(my.input.files[1:2])
    xs@phenoData<-class

    x<-combine_spectra(xs=xs, mzbin=0.25,
        linear=TRUE, continuum=FALSE)

    plot(x$mz, x$intensity, type='l',
        xlab='m/Z', ylab='ion intensity')

    xy <- peakdetection(x=x$mz, y=x$intensity, scales=1:10,
    SNR.Th=0.0, SNR.area=20, mintr=0.5)

    id.peakcenter<-xy[,4]

    plot(x$mz, x$intensity, type='l',
        xlim=c(440,460),
            xlab='m/Z', ylab='ion intensity')

    points(x$mz[id.peakcenter], x$intensity[id.peakcenter],
        col='red', type='h')


    # create dummy object
    xs@peaks<-matrix(c(rep(1, length(my.input.files) * 6),
        1:length(my.input.files)), ncol=7)

    colnames(xs@peaks) <- c("mz", "mzmin", "mzmax", "rt",
        "rtmin", "rtmax", "sample")

    xs<-xcms::retcor(xs, method="obiwarp", profStep=1,
        distFunc="cor", center=1)



    eicmat<-eicmatrix(xs=xs, xy=xy, center=1)

    #  process a reduced mz range for a better package build performance
    (eicmat.mz.range<-range(which(475 < xy[,1] & xy[,1] < 485)))

    eicmat.filter <- eicmat[eicmat.mz.range[1]:eicmat.mz.range[2],]
    xy.filter <- xy[eicmat.mz.range[1]:eicmat.mz.range[2],]

    #
    # determine the new range and plot the mz versus RT map
    (rt.range <- range(as.double(colnames(eicmat.filter))))
    (mz.range<-range(as.double(row.names(eicmat.filter))))

    image(log(t(eicmat.filter))/log(2), col=rev(gray(1:20/20)),
        xlab='rt [in seconds]', ylab='m/z', axes=FALSE,
        main='overlay of 12 samples using faahKO')

    axis(1, seq(0,1, length=6), round(seq(rt.range[1], rt.range[2], length=6)))
    axis(2, seq(0,1, length=4), seq(mz.range[1], mz.range[2], length=4))

    #
    # determine the chromatographic peaks
    rxy<-retention_time(xs=xs, RTscales=c(1:10, seq(12,32, by=2)),
        xy=xy.filter,
        eicmatrix=eicmat.filter,
        RTSNR.Th=120, RTSNR.area=20)

    rxy.rt <- (rxy[,4] - rt.range[1])/diff(rt.range)
    rxy.mz <- (rxy[,1] - mz.range[1])/diff(mz.range)

    points(rxy.rt, rxy.mz, pch="X", lwd=2, col="red")


    xs<-create_datamatrix(xs=xs, rxy=rxy)

    peaktable <- xcms::peakTable(xs)

    head(peaktable)
    
## End(Not run)

Generate matrix of combined extraced ion chromatograms (EICs)

Description

for each selected mass window, eicmatrix calculates EICs of every raw file and combines them together. It is recommended to correct the retention time first using retcor.obiwarp

Usage


    eicmatrix(xs, xy, center)

eicmatrix(xs, xy, center)

Arguments

`xs`	xcmsSet object
`xy`	table including mz location parameters
`center`	file number which is used as a template for retention time correction

Details

For each detected mass, an extracted ion chromatogram (EIC) is calculated. In order to determine the elution time for each detected mass, the EICs of every mass are combined between all acquired runs.

Make sure that xcms-retention time correction (centwave) was applied to the dataset. The output will be a matrix of EIC's

Author(s)

David Fischer 2013

Examples

      ## Not run: 
    #  see package vignette section 
    # 'Generation and combination of extracted ion chromatograms'
    xs<-create_datamatrix(xs=xs, rxy=rxy)
      
## End(Not run)
## Not run: 
    #  see package vignette section 
    # 'Generation and combination of extracted ion chromatograms'
    xs<-create_datamatrix(xs=xs, rxy=rxy)
      
## End(Not run)

An algorithm for the detection of peak locations and boundaries in mass spectra or ion chromatograms

Description

peakdetection uses Continuous wavelet transformation (CWT) to determine optimal peak location. A modified algorithm of Du et al. (2006) is used to localize peak positions.

Usage


    peakdetection(scales, y, x, SNR.Th, SNR.area, mintr)    

peakdetection(scales, y, x, SNR.Th, SNR.area, mintr)

Arguments

`scales`	vector with the scales to perform CWT
`y`	vector of ion intensities
`x`	vector of mz bins
`SNR.Th`	Signal-to-noise threshold
`SNR.area`	Window size for noise estimation
`mintr`	Minimal peak width intensity treshold (in percentage), for which two overlapping peaks are considered as separated. default is set to 0.5.

Details

A peak detection algorithm based on continuous wavelet transformation (CWT) is used for this step (modified from Du et al., 2006). Peak detection based on CWT has the advantage that a sliding scale of wavelets instead of a single filter function with fixed wavelength is used. This allows for a flexible and automatic approximation of the peak width. As a result it is possible to locate both narrow and broad peaks within a given dynamic range.

Author(s)

David Fischer 2013

References

Du, P., Kibbe, W. A., & Lin, S. M. (2006). Improved peak detection in mass spectrum by incorporating continuous wavelet transform-based pattern matching. Bioinformatics, 22(17), 2059-2065. doi:10.1093/bioinformatics/btl355

Examples

    
cdfpath <- file.path(find.package("faahKO"), "cdf")

my.input.files <- dir(c(paste(cdfpath, "WT", sep='/'),
    paste(cdfpath, "KO", sep='/')), full.names=TRUE)


# create xcmsSet object
xs <- new("xcmsSet")
xs@filepaths <- my.input.files

op<-par(mfrow=c(3,1))

x<-combine_spectra(xs=xs, mzbin=0.25,
    linear=TRUE, continuum=FALSE)

plot(x$mz, x$intensity, 
    type='h', 
    main='original',
    xlab='m/Z', ylab='ion intensity')

xy <- peakdetection(x=x$mz, y=x$intensity,
    scales=1:10, SNR.Th=1.0, SNR.area=20, mintr=0.5)

id.peakcenter<-xy[,4]

plot(x$mz[id.peakcenter], x$intensity[id.peakcenter], type='h', 
    main='filtered')


plot(x$mz, x$intensity, type='l',
    xlim=c(400,450),
    main='zoom',
    log='y',
    xlab='m/Z', ylab='ion intensity (log scale)')

points(x$mz[id.peakcenter], x$intensity[id.peakcenter],col='red', type='h')
    
cdfpath <- file.path(find.package("faahKO"), "cdf")

my.input.files <- dir(c(paste(cdfpath, "WT", sep='/'),
    paste(cdfpath, "KO", sep='/')), full.names=TRUE)


# create xcmsSet object
xs <- new("xcmsSet")
xs@filepaths <- my.input.files

op<-par(mfrow=c(3,1))

x<-combine_spectra(xs=xs, mzbin=0.25,
    linear=TRUE, continuum=FALSE)

plot(x$mz, x$intensity, 
    type='h', 
    main='original',
    xlab='m/Z', ylab='ion intensity')

xy <- peakdetection(x=x$mz, y=x$intensity,
    scales=1:10, SNR.Th=1.0, SNR.area=20, mintr=0.5)

id.peakcenter<-xy[,4]

plot(x$mz[id.peakcenter], x$intensity[id.peakcenter], type='h', 
    main='filtered')


plot(x$mz, x$intensity, type='l',
    xlim=c(400,450),
    main='zoom',
    log='y',
    xlab='m/Z', ylab='ion intensity (log scale)')

points(x$mz[id.peakcenter], x$intensity[id.peakcenter],col='red', type='h')

Generate report with combined ion intensities

Description

quantify_combined calculates a data table of mz intensities from a sum of ions within a selected mz window and from all MS scans.

Usage


    quantify_combined(xs,xy, rtcombine)

quantify_combined(xs,xy, rtcombine)

Arguments

`xs`	xcmsSet object
`xy`	table including mz location parameters
`rtcombine`	Numerical, with two entries. If no retention time information is to be used, `rtcombine` provides the retention time window where ion intensities will be summed for quantification. If `rtcombine = c(0,0)`, All scans will be used

Details

This function is only used, when no retention time information should be used, for example with direct infusion experiments.

Author(s)

David Fischer 2013

Examples

      ## Not run: 
    #  see package vignette
      
## End(Not run)
## Not run: 
    #  see package vignette
      
## End(Not run)

Detection of chromatographic peak locations from extracted ion chromatograms (EICs)

Description

For each EIC in the EIC matrix, retention_time localizes chromatographic peaks using peakdetection.

Usage


    retention_time(xy, xs, eicmatrix, RTscales, RTSNR.Th, 
RTSNR.area, mintr)

retention_time(xy, xs, eicmatrix, RTscales, RTSNR.Th, 
RTSNR.area, mintr)

Arguments

`xy`	table including mz location parameters
`xs`	xcmsSet object
`eicmatrix`	Matrix containing extracted ion chromatograms (EICs)
`RTscales`	Peak Detection of Chromatographic peaks on the combined extracted ion chromatogram: scales for continuous wavelet transformation (CWT), see also the `peakDetectionCWT` function of the `MassSpecWavelet` package.
`RTSNR.Th`	Peak Detection of Chromatographic peaks on the combined extracted ion chromatogram: Signal to noise ratio threshold
`RTSNR.area`	Peak Detection of Chromatographic peaks on the combined extracted ion chromatogram: Area around the peak for noise determination. Indicates number of surrounding peaks on the first CWT scale. `default = 20`
`mintr`	Minimal peak width intensity treshold (in percentage), for which two overlapping peaks are considered as separated (`default = 0.5`)

Details

Based on the combined extracted ion chromatograms, there is another peak detection step to be performed. The same algorithm as described for the peak picking of mass signals (continuous wavelet transformation) is also used for peak picking in the retention time domain.

Author(s)

David Fischer 2013

Examples

      ## Not run: 
    #  see package vignette
      
## End(Not run)
## Not run: 
    #  see package vignette
      
## End(Not run)

Package 'cosmiq'

Help Index

Combine mass spectra of each scan and each file into a single master spectrum

Description

Usage

Arguments

Details

Author(s)

Examples

cosmiq - main wrapper function

Description

Usage

Arguments

Details

Author(s)

Examples

Quantifying mz/RT intensities using peak locations from master map

Description

Usage

Arguments

Details

Author(s)

Examples

Generate matrix of combined extraced ion chromatograms (EICs)

Description

Usage

Arguments

Details

Author(s)

Examples

An algorithm for the detection of peak locations and boundaries in mass spectra or ion chromatograms

Description

Usage

Arguments

Details

Author(s)

References

Examples

Generate report with combined ion intensities

Description

Usage

Arguments

Details

Author(s)

Examples

Detection of chromatographic peak locations from extracted ion chromatograms (EICs)

Description

Usage

Arguments

Details

Author(s)

Examples