Package 'adductomicsR' reference manual

Title:	Processing of adductomic mass spectral datasets
Description:	Processes MS2 data to identify potentially adducted peptides from spectra that has been corrected for mass drift and retention time drift and quantifies MS1 level mass spectral peaks.
Authors:	Josie Hayes <[email protected]>
Maintainer:	Josie Hayes <[email protected]>
License:	Artistic-2.0
Version:	1.23.0
Built:	2025-03-29 06:19:33 UTC
Source:	https://github.com/bioc/adductomicsR

Adduct quantification for adductomicsR

Description

reads mzXML files from a directory, corrects RT according to RT correction model and quantifies peaks.

Usage

adductQuant(nCores = NULL, targTable = NULL, 
intStdRtDrift = NULL, rtDevModels = NULL, 
filePaths = NULL, quantObject = NULL, indivAdduct = NULL, maxPpm = 4,
minSimScore = 0.8, spikeScans = 2, minPeakHeight = 100, maxRtDrift = 20,
maxRtWindow = 120, isoWindow = 80, 
hkPeptide = "LVNEVTEFAK", gaussAlpha = 16)
adductQuant(nCores = NULL, targTable = NULL, 
intStdRtDrift = NULL, rtDevModels = NULL, 
filePaths = NULL, quantObject = NULL, indivAdduct = NULL, maxPpm = 4,
minSimScore = 0.8, spikeScans = 2, minPeakHeight = 100, maxRtDrift = 20,
maxRtWindow = 120, isoWindow = 80, 
hkPeptide = "LVNEVTEFAK", gaussAlpha = 16)

Arguments

`nCores`	number of cores to use for analysis. If NULL then 1 core will be used.
`targTable`	is the fullpath to the target table. See inst/extdata/examplePeptideTargetTable.csv for an example.
`intStdRtDrift`	the maximum drift for the internal standard in seconds. Default = NULL and therefore no RT correction is applied to the internal standard.
`rtDevModels`	is the full path to the rtDevModels.RData file from rtDevModels(). default is NULL and therefore has no RT correction.
`filePaths`	required list of mzXML files for analysis. If all files are in the same directory these can be accessed using list.files('J:\parentdirectory\directoryContainingfiles', pattern='.mzXML', all.files=FALSE, full.names=TRUE).
`quantObject`	character string for filepath to an AdductQuantif object to be integrated.
`indivAdduct`	numeric vector of AdductQuantif targets to re-integrate
`maxPpm`	numeric for the maximum parts per million to be used.
`minSimScore`	a numeric between 0
`spikeScans`	a numeric for the number of scans that a spike must be seen in for it to be integrated. Default is 2.
`minPeakHeight`	numeric to determine the minimum height for a peak to be integrated. Default is set low at 100.
`maxRtDrift`	numeric for the maximum retention time drift to be considered. Default is 20.
`maxRtWindow`	numeric in seconds for the retention time window (total window will be 2 times this value)
`isoWindow`	numeric for the pepide isotope window in seconds, default is 80
`hkPeptide`	is capitalized string for the housekeeping peptide. The default is 'LVNEVTEFAK' from human serum albumin.
`gaussAlpha`	numeric for the gaussian smoothing parameter to smooth the peaks. Default is 16. Output is an adductQuantf object saved to the working directory

Value

adductQuant object

Examples

## Not run: 
eh = ExperimentHub();
temp = query(eh, 'adductData');
adductQuant(nCores=2, targTable=paste0(system.file("extdata", 
package = "adductomicsR"),'/exampletargTable2.csv'), intStdRtDrift=30, 
rtDevModels=paste0(hubCache(temp),"/rtDevModels.RData"),
filePaths=list.files(hubCache(temp),pattern=".mzXML", all.files=FALSE,
full.names=TRUE)[1],quantObject=NULL,
indivAdduct=NULL,maxPpm=5,minSimScore=0.8,spikeScans=1,
minPeakHeight=100,maxRtDrift=20,maxRtWindow=240,isoWindow=80,
hkPeptide='LVNEVTEFAK', gaussAlpha=16)

## End(Not run)## Not run: 
eh = ExperimentHub();
temp = query(eh, 'adductData');
adductQuant(nCores=2, targTable=paste0(system.file("extdata", 
package = "adductomicsR"),'/exampletargTable2.csv'), intStdRtDrift=30, 
rtDevModels=paste0(hubCache(temp),"/rtDevModels.RData"),
filePaths=list.files(hubCache(temp),pattern=".mzXML", all.files=FALSE,
full.names=TRUE)[1],quantObject=NULL,
indivAdduct=NULL,maxPpm=5,minSimScore=0.8,spikeScans=1,
minPeakHeight=100,maxRtDrift=20,maxRtWindow=240,isoWindow=80,
hkPeptide='LVNEVTEFAK', gaussAlpha=16)

## End(Not run)

AdductQuantif class The AdductQuantif class contains a peak integral matrix, peak ranges and region of integration, the isotopic distribution identified for each integrated peak and the target table of peaks integrated.

Description

AdductQuantif class The AdductQuantif class contains a peak integral matrix, peak ranges and region of integration, the isotopic distribution identified for each integrated peak and the target table of peaks integrated.

Usage

x
x

Format

An object of class NULL of length 0.

Value

peak integral matrix, peak ranges and region of integration, the isotopic distribution identified for each integrated peak and the target table of peaks integrated and their corresponding MS1 scan isotopic patterns

Slots

peakQuantTable: a matrix containing the peak integration results and consisting of a row for each peak identified in each sample (e.g 200 samples and 50 targets 200 * 50 = 10,000 rows)
peakIdData: list of peak IDs
predIsoDist: list of predicted Iso distances
targTable: dataframe target table
file.paths: character path to file
Parameters: dataframe of specified parameters

Methods

c: signature(object = "AdductQuantif"): Concatenates the spectra information.
file.paths: signature(object = "AdductQuantif"): Accesses the file paths.
peakQuantTable: signature(object = "AdductQuantif"): Accesses the peak quantification data as a table.
peakIdData: signature(object = "AdductQuantif"): Accesses the ID data for the peaks.
predIsoDist: signature(object = "AdductQuantif"): Accesses the predicted isotopic distribution.
targTable: signature(object = "AdductQuantif"): Accesses the user provided target table.

Author(s)

JL Hayes [email protected]

AdductSpec class

Description

The AdductSpec class contains dynamic noise filtered composite MS/MS spectra and their corresponding MS1 scan isotopic patterns. Produced by adductSpecGen() from mzXML files.

Usage

x
x

Format

An object of class NULL of length 0.

Value

dynamic noise filtered composite MS/MS spectra and their corresponding MS1 scan isotopic patterns

Slots

adductMS2spec: list of adduct MS2 spectras
groupMS2spec: list of group MS2 spectras
metaData: dataframe of metadata from mzXML
aaResSeqs: matrix of amino acid sequences
specPepMatches: list of spectra peptide matches
specPepCompSpec: list of comp spectra peptide matches
sumAdductType: dataframe of adduct types
Peptides: dataframe of peptides under study
rtDevModels: list of rtDevModels
targetTable: dataframe target table
file.paths: character of file path
Parameters: dataframe of parameters

Methods

c: signature(object = "AdductSpec"): Concatenates the spectra information.
Specfile.paths: signature(object = "AdductSpec"): Accesses the file paths.
adductMS2spec: signature(object = "AdductSpec"): Accesses the adduct MS2 spectral information.
metaData: signature(object = "AdductSpec"): Accesses the scan metadata.
Parameters: signature(object = "AdductSpec"): Accesses the user parameters.
groupMS2spec: signature(object = "AdductSpec"): Accesses the spectral information for the grouped MS2 spectra.
rtDevModels: signature(object = "AdductSpec"): Accesses the retention time deviation models.
sumAdductType: signature(object = "AdductSpec"): Accesses the total adduct types.
Peptides: signature(object = "AdductSpec"): Accesses the peptide information.
specPepMatches: signature(object = "AdductSpec"): Accesses the peptide matches in the spectra.

Author(s)

JL Hayes [email protected]

Constructor of AdductSpec object deconvolute spectra MS2 and MS1 levels

Description

reads mzXML files from a directory extracts metadata info, groups ion signals with signalGrouping, filters noise dynamically dynamicNoiseFilter and identifies precursor ion charge state, by isotopic pattern.

Usage

adductSpecGen(mzXmlDir=NULL, runOrder=NULL, nCores=NULL,
intStdMass=834.77692,intStdPeakList=c(290.21, 403.30, 516.38,
587.42,849.40, 884.92, 958.46, 993.97,1050.52, 1107.06, 1209.73,
1337.79,1465.85),TICfilter=10000, DNF=2, minInt=300,
minPeaks=5,intStd_MaxMedRtDrift=360, intStd_MaxPpmDev=200,minSpecEx=40,
outputPlotDir=NULL)
adductSpecGen(mzXmlDir=NULL, runOrder=NULL, nCores=NULL,
intStdMass=834.77692,intStdPeakList=c(290.21, 403.30, 516.38,
587.42,849.40, 884.92, 958.46, 993.97,1050.52, 1107.06, 1209.73,
1337.79,1465.85),TICfilter=10000, DNF=2, minInt=300,
minPeaks=5,intStd_MaxMedRtDrift=360, intStd_MaxPpmDev=200,minSpecEx=40,
outputPlotDir=NULL)

Arguments

`mzXmlDir`	character a full path to a directory containing either .mzXML or .mzML data
`runOrder`	character a full path to a csv file specifying the runorder for each of the files the first column must contain the precise file name and the second column an integer representing the precise run order.
`nCores`	numeric the number of cores to use for parallel computation. The default is to 1 core
`intStdMass`	numeric vector of the mass of the internal standard. Default is the mass of
`intStdPeakList`	numeric vector of masses for the internal standard peaks
`TICfilter`	numeric minimimum total ion current of an MS/MS scan. Any MS/MS scan below this value will be filtered out (default=0).
`DNF`	dynamic noise filter minimum signal to noise threshold (default = 2), calculated as the ratio between the linear model predicted intensity value and the actual intensity.
`minInt`	numeric minimum intensity value
`minPeaks`	minimum number of signal peaks following dynamic noise filtration (default = 5).
`intStd_MaxMedRtDrift`	numeric the maximum retention time drift window (in seconds) to identify internal standard MS/MS spectrum scans (default = 600).
`intStd_MaxPpmDev`	numeric the maximum mass accuracy window (in ppm). to identify internal standard MS/MS spectrum scans (default = 200 ppm).
`minSpecEx`	numeric the minimum percentage of the total ion current explained by the internal standard fragments (default = 40). Sometime spectra are not identified due to this cutoff being set too high. If unexpected datapoints have been interpolated then reduce this value.
`outputPlotDir`	character string for the output directory for plots, default is working directory.

Value

AdductSpec object

modified `Digest` function (from OrgMassSpecR package)

Description

allows maxCharge to be set to calculate precursor m/z

Usage

digestMod(sequence, enzyme = "trypsin", missed = 0, 
maxCharge = 8,IAA = TRUE, N15 = FALSE, custom = list())
digestMod(sequence, enzyme = "trypsin", missed = 0, 
maxCharge = 8,IAA = TRUE, N15 = FALSE, custom = list())

Arguments

`sequence`	a character string representing the amino acid sequence.
`enzyme`	is the enzyme to perform in silico digestion with
`missed`	the maximum number of missed cleavages. Must be an integer of 0 (default) or greater. An error will result if the specified number of missed cleavages is greater than the maximum possible number of missed cleavages.
`maxCharge`	numeric max charge charge for predicted precursor m/z
`IAA`	logical. TRUE specifies iodoacetylated cysteine and FALSE specifies unmodified cysteine. Used only in determining the elemental formula, not the three letter codes.
`N15`	logical indicating if the nitrogen-15 isotope should be used in place of the default nitrogen-14 isotope. calculation
`custom`	list of custom masses

Details

see Digest for details of further function arguments.

Value

dataframe

Examples

digestMod('MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIA',
enzyme = "trypsin", missed = 0, maxCharge = 8,IAA = TRUE, N15 = FALSE, 
custom = list()) 
digestMod('MKWVTFISLLFLFSSAYSRGVFRRDAHKSEVAHRFKDLGEENFKALVLIA',
enzyme = "trypsin", missed = 0, maxCharge = 8,IAA = TRUE, N15 = FALSE, 
custom = list())

dot product matrix calculation

Description

dot product matrix calculation

Usage

dotProdMatrix(allSpectra = NULL, spectraNames = NULL, binSizeMS2 =
NULL)
dotProdMatrix(allSpectra = NULL, spectraNames = NULL, binSizeMS2 =
NULL)

Arguments

`allSpectra`	a numeric matrix consisting of two columns 1. mass and 2. intensity
`spectraNames`	character names of individual spectra to compare must equal number of rows of allSpectra
`binSizeMS2`	numeric the MS2 bin size to bin MS2 data prior to dot product calculation (default = 0.1 Da).

Value

a matrix of equal dimension corresponding to the number of unique spectrum names

dot product calculation

Description

hierarchical clustering (complete method see hclust). Dissimilarity metric based on 1-dot product spectral similarity. Retention time and mass groups are therefore further subdivided based on spectral similarity. If outlying mass spectra have been erroneously grouped then these will be reclassified.

Usage

dotProdSpectra(adductSpectra = NULL, nCores = NULL, 
minDotProdSpec = 0.8, maxGroups = 10)
dotProdSpectra(adductSpectra = NULL, nCores = NULL, 
minDotProdSpec = 0.8, maxGroups = 10)

Arguments

`adductSpectra`	AdductSpec object
`nCores`	numeric the number of cores to use for parallel computation. The default is to use 1 core.
`minDotProdSpec`	numeric minimum dot product score
`maxGroups`	numeric maximum number of groups to include from the dendrogram.

Value

adductSpectra AdductSpec object

Dynamic Noise filtration

Description

Dynamic Noise filtration

Usage

dynamicNoiseFilter(spectrum.df = NULL, DNF = 2, minPeaks = 5,
minInt = 100)
dynamicNoiseFilter(spectrum.df = NULL, DNF = 2, minPeaks = 5,
minInt = 100)

Arguments

`spectrum.df`	a dataframe or matrix with two columns: 1. Mass/ Mass-to-charge ratio 2. Intensity
`DNF`	dynamic noise filter minimum signal to noise threshold (default = 2), calculated as the ratio between the linear model predicted intensity value and the actual intensity.
`minPeaks`	minimum number of signal peaks following dynamic noise filtration (default = 5).
`minInt`	integer minimum dynamic noise filter

Details

Dynamic noise filter adapted from the method described in Xu H. and Frietas M. 'A Dynamic Noise Level Algorithm for Spectral Screening of Peptide MS/MS Spectra' 2010 BMC Bioinformatics. The function iteratively calculates linear models starting from the median value of the lower half of all intensities in the spectrum.df. The linear model is used to predict the next peak intensity and ratio is calculated between the predicted and actual intensity value. Assuming that all preceeding intensities included in the linear model are noise, the signal to noise ratio between the predicted and actual values should exceed the minimum signal to noise ratio (default DNF = 2). The function continues until either the DNF value minimum has been exceeded and is also below the maxPeaks or maximum number of peaks value. As the function must necessarily calculate potentially hundreds of linear models the RcppEigen package is used to increase the speed of computation.

Value

a list containing 3 objects: 1. Above.noise The dynamic noise filtered matrix/ dataframe 2. metaData a dataframe with the following column names: 1. Noise.level the noise level determined by the dynamic noise filter function. 2. IntCompSpec Total intensity composite spectrum. 3. TotalIntSNR Sparse ion signal to noise ratio (mean intensity/ stdev intensity) 4. nPeaks number of peaks in composite spectrum 3. aboveMinPeaks Logical are the number of signals above the minimum level

filter samples with low QC and features with large missing values Removes adducts that have not been integrated with many missing values and provides QC on samples

Description

filter samples with low QC and features with large missing values Removes adducts that have not been integrated with many missing values and provides QC on samples

Usage

filterAdductTable(adductTable = NULL, percMissing = 51, HKPmass = 
"575.3", quantPeptideMass = "811.7", remHKPzero = FALSE, remQuantPepzero 
=FALSE, remHKPlow = FALSE, outputDir = NULL)
filterAdductTable(adductTable = NULL, percMissing = 51, HKPmass = 
"575.3", quantPeptideMass = "811.7", remHKPzero = FALSE, remQuantPepzero 
=FALSE, remHKPlow = FALSE, outputDir = NULL)

Arguments

`adductTable`	character a full path to the peaktable with number of rows equal to the number of adducts from outputPeakTable() which starts with adductQuantif_peakList_
`percMissing`	numeric percentage threshold to remove adducts with missing values. Default is 51. It is recommended to use just over the number of samples in the smallest group of your study. 51 is used as default for a 50:50 case control study
`HKPmass`	numeric mass for the housekeeping peptide. Must be the same asthat in the adduct table. max 2 decimal places. default= 575.3 for the LVNEVTEFAK peptide
`quantPeptideMass`	numeric mass for the peptide for which adducts are being quantified, Default is 811.7 for the ALVLIAFAQYLQQCPFEDHVK peptide
`remHKPzero`	logical if TRUE removes all samples where the housekeeping peptide is 0. default= FALSE
`remQuantPepzero`	logical if TRUE removes all samples where the peptide under quantification is 0. default= FALSE
`remHKPlow`	logical if TRUE removes all samples where the housekeeping peptide has an area less than 100000. default= TRUE.This is recommended because this peak should be large. If the HKP has been mis-identified quantification of all adducts will be affected.
`outputDir`	character path to results directory output is a csv file with only adducts and samples that passed filter. Remaining adducts can be quantified manually however it is recommended to rescale the quantification results and include the quantification method as a covariate in downstream analysis.

Value

csv file

Examples

filterAdductTable(adductTable=paste0(system.file("extdata",
package="adductomicsR"),'/example_adductQuantif_peakList.csv'), percMissing
=51,HKPmass = "575.3", quantPeptideMass = "811.7",
remHKPzero=FALSE,remQuantPepzero = FALSE, remHKPlow = FALSE, outputDir =
NULL)
filterAdductTable(adductTable=paste0(system.file("extdata",
package="adductomicsR"),'/example_adductQuantif_peakList.csv'), percMissing
=51,HKPmass = "575.3", quantPeptideMass = "811.7",
remHKPzero=FALSE,remQuantPepzero = FALSE, remHKPlow = FALSE, outputDir =
NULL)

identify peaks

Description

identifies peaks in a vector of intensities.

Usage

findPeaks(x, m = 3)
findPeaks(x, m = 3)

Arguments

`x`	numeric vector of intensities.
`m`	number of peaks to identify

Value

string of peaks

Examples

findPeaks(c(200, 300,200, 200, 200 , 300, 200), m = 3)
findPeaks(c(200, 300,200, 200, 200 , 300, 200), m = 3)

Make a target table for adductomicsR quantificaton using specSimPep results

Description

Make a target table for adductomicsR quantificaton using specSimPep results

Usage

generateTargTable(allresultsFile = NULL, csvDir = NULL)
generateTargTable(allresultsFile = NULL, csvDir = NULL)

Arguments

`allresultsFile`	character a full path to the allResults file generated by specSimPepId
`csvDir`	character a full path to a directory to save the csv file to output is a csv file called targTable.csv which can be used in the adductQuant function

Value

cvs file

Examples

generateTargTable(paste0(system.file("extdata",package="adductomicsR"),
'/allResults_ALVLIAFAQYLQQCPFEDHVK_example.csv'),csvDir=getwd())
generateTargTable(paste0(system.file("extdata",package="adductomicsR"),
'/allResults_ALVLIAFAQYLQQCPFEDHVK_example.csv'),csvDir=getwd())

modified function from package OrgMassSpecR

Description

modified function from package OrgMassSpecR

Usage

IsotopicDistributionMod(formula = list(), charge = 1)
IsotopicDistributionMod(formula = list(), charge = 1)

Arguments

`formula`	list of character strings representing elemental formula
`charge`	numeric for charge of the element

Value

dataframe of a spectrum

Examples

IsotopicDistributionMod(formula=list("CH3CH2OH","H2O"),charge = 1)
IsotopicDistributionMod(formula=list("CH3CH2OH","H2O"),charge = 1)

wrapper script for loess modeling

Description

adapted from bisoreg package

Usage

loessWrapperMod(x, y, span.vals = seq(0.25, 1, by = 0.05),
folds = 5)
loessWrapperMod(x, y, span.vals = seq(0.25, 1, by = 0.05),
folds = 5)

Arguments

`x`	predictor values
`y`	response values
`span.vals`	values of the tuning parameter to evaluate using cross validation
`folds`	number of 'folds' for the cross-validation procedure

Value

LOESS model

Examples

loessWrapperMod (rnorm(200), rnorm(200), span.vals = 
seq(0.25, 1, by = 0.05),folds = 5)
loessWrapperMod (rnorm(200), rnorm(200), span.vals = 
seq(0.25, 1, by = 0.05),folds = 5)

group MS/MS precursor masses

Description

hierarchically cluster ms/ms precursor scans within and across samples, according to a m/z and retention time error.

Usage

ms2Group(adductSpectra = NULL, nCores = NULL, 
maxRtDrift = NULL, 
ms1mzError = 0.1, ms2mzError = 1, dotProdClust = TRUE, minDotProd = 0.8,
fclustMethod = "median", disMetric = "euclidean", compSpecGen = TRUE,
adjPrecursorMZ = TRUE)
ms2Group(adductSpectra = NULL, nCores = NULL, 
maxRtDrift = NULL, 
ms1mzError = 0.1, ms2mzError = 1, dotProdClust = TRUE, minDotProd = 0.8,
fclustMethod = "median", disMetric = "euclidean", compSpecGen = TRUE,
adjPrecursorMZ = TRUE)

Arguments

`adductSpectra`	AdductSpec object
`nCores`	numeric the number of cores to use for parallel computation. The default is to use 1 core.
`maxRtDrift`	numeric for the maximum rentention time drift to be considered. Default is 20.
`ms1mzError`	numeric maximum MS1 mass:charge error
`ms2mzError`	numeric maximum MS2 mass:charge error
`dotProdClust`	logical remove previous dot prod clustering results
`minDotProd`	numeric. Minimum mean dot product spectral similarity score to keep a spectrum within an MS/MS group (default = 0.8).
`fclustMethod`	method to use for the fclust function
`disMetric`	metric to use for distance in clustering
`compSpecGen`	logical for whether composite spectra generation is necessary
`adjPrecursorMZ`	logical for precursor mass:charge adjustment

Value

a list identical to adductSpectra containing an additional list element:

remove lower intensity adjacent peaks

Description

remove lower intensity adjacent peaks

Usage

nAdjPeaks(peaksTmp = NULL, troughsTmp = NULL, peakRangeTmp = NULL)
nAdjPeaks(peaksTmp = NULL, troughsTmp = NULL, peakRangeTmp = NULL)

Arguments

`peaksTmp`	character vector with indices of detected peaks from findPeaks
`troughsTmp`	character vector with indices of detected troughs from findPeaks
`peakRangeTmp`	matrix of the peak range data with at least 3 columns (1. mass-to-charge, 2. intensity, 3. retention time)

Value

peaksTmp but with lower intensity adjacent peaks between the same troughs removed

output peak table from AdductQuantif object

Description

output peak table from AdductQuantif object

Usage

outputPeakTable(object = NULL, outputDir = NULL)
outputPeakTable(object = NULL, outputDir = NULL)

Arguments

`object`	a 'AdductQuantif' class object
`outputDir`	character full path to a directory to output the peak to default is the current working directory

Value

a peaktable with number of rows equal to the number of adducts quantified and 14 peak group information columns plus a number of columns equal to the number of samples quantified. The peak table is saved as a csv file in the output directory named: adductQuantif_peakList_'todays date'.csv. The peak table is also returned to the R session and can be assigned to an object.

Examples

eh = ExperimentHub();
Temp = query(eh, c("adductData", "adductQuant", "Rda"))[[1]];
outputPeakTable(object=Temp)
eh = ExperimentHub();
Temp = query(eh, c("adductData", "adductQuant", "Rda"))[[1]];
outputPeakTable(object=Temp)

Adduct Peak quant

Description

peak must be at least 50 percent resolved from overlapping peaks. i.e. the peaks trough must be at least 50 percent of the peak apex intensity for the peak to be considered sufficiently resolved.

Usage

peakIdQuant_newMethod(mzTmp = NULL, rtTmp = NULL, 
peakRangeRtSub = NULL, rtDevModel = NULL, isoPat = NULL,
isoPatPred = NULL, minSimScore = 0.96, maxPpm = 4, 
gaussAlpha = 16, spikeScans = 2, minPeakHeight = 5000, 
maxRtDrift = 20, showPlots = FALSE, 
isoWindow = 10, maxGapMs1Scan = 5, intMaxPeak = FALSE)
peakIdQuant_newMethod(mzTmp = NULL, rtTmp = NULL, 
peakRangeRtSub = NULL, rtDevModel = NULL, isoPat = NULL,
isoPatPred = NULL, minSimScore = 0.96, maxPpm = 4, 
gaussAlpha = 16, spikeScans = 2, minPeakHeight = 5000, 
maxRtDrift = 20, showPlots = FALSE, 
isoWindow = 10, maxGapMs1Scan = 5, intMaxPeak = FALSE)

Arguments

`mzTmp`	expected mass to charge of target
`rtTmp`	expected retention time (in minutes) of target
`peakRangeRtSub`	matrix MS1 scans covering entire chromatographic range within which to identify peaks of interest. Contains the following three columns column 1 = mass, column 2 = intensity, column 3 = retention time, column 4 = scan number.
`rtDevModel`	loess retention time deviation model for the file.
`isoPat`	named numeric containing the expected mass differences between isotopes for the peptide of interest.
`isoPatPred`	matrix output from the `IsotopicDistribution` function with additional 'id' column.
`minSimScore`	numeric minimum dot product score for consideration (must be between 0-1, default = 0.96).
`maxPpm`	numeric ppm value for EIC extraction and integration.
`gaussAlpha`	numeric alpha value for `smth.gaussian` of smoother package. If supplied gaussian smoothing will be performed (suggested value = 16).
`spikeScans`	numeric number of scans that constitute a spike.
`minPeakHeight`	numeric minimum peak height, default 5000
`maxRtDrift`	numeric maximum retention time drift, default 20 secs
`showPlots`	boolean for whether plots should be produced
`isoWindow`	numeric isowindow size, default 10
`maxGapMs1Scan`	maximum MS1 scan gap, default 5
`intMaxPeak`	boolean integrate maximum peak

Value

list

integrate a peak from a peak table with peak start and peak end retention times

Description

integrate a peak from a peak table with peak start and peak end retention times

Usage

peakIntegrate(peakTable = NULL, peakStart = NULL, 
peakEnd = NULL, expMass = NULL, 
expRt = NULL)
peakIntegrate(peakTable = NULL, peakStart = NULL, 
peakEnd = NULL, expMass = NULL, 
expRt = NULL)

Arguments

`peakTable`	a table of at least 5 columns: mass-to-charge. intensity adjusted retention time raw retention time scan numbers
`peakStart`	retention time for peak start (in seconds).
`peakEnd`	retention time for peak end (in seconds).
`expMass`	expected mass-to-charge of target.
`expRt`	expected retention time of target (in seconds).

Value

list with peak and peak table

peak list Identification

Description

peak list Identification

Usage

peakListId(adductSpectra = NULL, peakList = c(290.21, 403.3, 
516.38, 587.42, 849.4, 884.92, 958.46, 993.97, 1050.52, 1107.06,
1209.73, 1337.79,
1465.85), exPeakMass = 834.7769, frag.delta = 1, minPeaksId = 7, 
minSpecEx = 50, maxRtDrift = 360, maxPpmDev = 200, allScans = TRUE, 
closestMassByFile = TRUE, outputPlotDir = NULL)
peakListId(adductSpectra = NULL, peakList = c(290.21, 403.3, 
516.38, 587.42, 849.4, 884.92, 958.46, 993.97, 1050.52, 1107.06,
1209.73, 1337.79,
1465.85), exPeakMass = 834.7769, frag.delta = 1, minPeaksId = 7, 
minSpecEx = 50, maxRtDrift = 360, maxPpmDev = 200, allScans = TRUE, 
closestMassByFile = TRUE, outputPlotDir = NULL)

Arguments

`adductSpectra`	AdductSpec object param peakList numeric vector of peak masses param exPeakMass numeric internal standard peak mass
`peakList`	numeric vector of peak masses
`exPeakMass`	numeric mass of explained peak
`frag.delta`	integer delta mass accuracy difference.
`minPeaksId`	numeric minimum number of peaks IDed
`minSpecEx`	numeric the minimum percentage of the total ion current explained by the internal standard fragments (default = 40). Sometime spectra are not identified due to this cutoff being set too high. If unexpected datapoints have been interpolated then reduce this value.
`maxRtDrift`	numeric the maximum retention time drift (in seconds) to identify MS/MS spectrum scans (default = 360). param outputPlotDir character string of output directory (e.g. internal standard IAA-T3 peak list = peakList= c(290.21, 403.30, 516.38, 587.42, 849.40, 884.92, 958.46, 993.97, 1050.52, 1107.06, 1209.73, 1337.79, 1465.85))
`maxPpmDev`	numeric ppm deviation
`allScans`	boolean include all scans
`closestMassByFile`	boolean closest mass in files
`outputPlotDir`	character string for output plot directory

Value

dataframe peak list

raw eic signal intensity and mass summation and spike removal.

Description

raw eic signal intensity and mass summation and spike removal.

Usage

peakRangeSum(peakRange = NULL, spikeScans = 2, rtDevModel = NULL,
gaussAlpha = NULL, 
maxEmptyRt = 7)
peakRangeSum(peakRange = NULL, spikeScans = 2, rtDevModel = NULL,
gaussAlpha = NULL, 
maxEmptyRt = 7)

Arguments

`peakRange`	matrix consisting of 5 columns: mass-to-charge values intensity retention time (in seconds) scan number
`spikeScans`	numeric number of scans <= a spike. Any peaks <= this value will be removed (default = 2).= FALSE
`rtDevModel`	loess model to correct retention times.
`gaussAlpha`	numeric alpha value for `smth.gaussian` of smoother package. If supplied gaussian smoothing will be performed (suggested value = 16).
`maxEmptyRt`	numeric maximum size of empty retention time beyond which missing values will be zero-filled

Value

matrix with masses and intensities summed by retention time and retention time correction based on the loess model supplied, the matrix has spikes removed (consecutive non-zero intensity values <= spikeScans in length), empty time segments are zero filled (> 3 seconds), optionally gaussian smoothed using the linksmth.gaussian function of the smoother package and is also subset based on the minimum and maximum retention time windows supplied (rtWin). The returned matrix consists of 5 columns:

average mass-to-charge values by unique retention time in supplied peakRange table
maximum intensity values by unique retention time in supplied peakRange table
loess model corrected retention times
original retention time values
scan number by unique retention time in supplied peakRange table

potentially problematic peak identification

Description

potentially problematic peak identification

Usage

probPeaks(object = NULL, nTimesMad = 3, 
metrics = c("nMadDotProdDistN", 
"nMadSkewness", "nMadKurtosis", "nMadRtGroupDev", 
"nMadPeakArea", "duplicates"))
probPeaks(object = NULL, nTimesMad = 3, 
metrics = c("nMadDotProdDistN", 
"nMadSkewness", "nMadKurtosis", "nMadRtGroupDev", 
"nMadPeakArea", "duplicates"))

Arguments

`object`	an 'AdductQuantif' class object
`nTimesMad`	numeric number of median absolute deviations to identify potential problem peaks.
`metrics`	character string column names of metrics with which to identify potential problem peaks or a list with individual nTimesMad arguments and with list element names corresponding to column names of metrics.
`...`	further arguments to `mad`

Value

'AdductQuantif' class object

loess-based retention time deviation correction

Description

loess-based retention time deviation correction

Usage

retentionCorr(adductSpectra = NULL, 
smoothingSpan = NULL, nMissing = 1, 
nExtra = 1, folds = 7, outputFileDir = NULL)
retentionCorr(adductSpectra = NULL, 
smoothingSpan = NULL, nMissing = 1, 
nExtra = 1, folds = 7, outputFileDir = NULL)

Arguments

`adductSpectra`	AdductSpec object
`smoothingSpan`	numeric. fixed smoothing span, argument to loess. If argument is not supplied then optimal smoothing span is calculated for each file seperately.
`nMissing`	numeric. maximum number of missing files for a MS/MS scan group to be utilized in the loess retention time deviation model. Roughly 15 percent missing values is a good starting point (e.g. nMissing=10 for 68 samples).
`nExtra`	numeric maximum number of extra scans above the total number of files for a MS/MS scan group to be utilized in the loess retention time deviation model. If a MS/MS scan group consists of many scans far in excess of the number of files then potentially MS/MS scans from large tailing peaks or isobars may be erroneously grouped together and used to adjust retention time incorrectly.
`folds`	numeric. number of cross validation steps to perform in identifying optimal smoothing span parameter (see: bisoreg package for more details)
`outputFileDir`	character full path to a directory to save the output images

Value

LOESS RT models as adductSpectra AdductSpec object

MS/MS spectrum grouping and retention time deviation modelling for adductomicsR

Description

MS/MS spectrum grouping and retention time deviation modelling for adductomicsR

Usage

rtDevModelling(MS2Dir = NULL, runOrder = NULL, 
nCores = NULL, TICfilter = 0, 
intStdPeakList=c(290.21, 403.30, 516.38, 587.42,849.40, 884.92, 958.46,
993.97,1050.52, 1107.06, 1209.73, 1337.79,1465.85), 
intStdMass = 834.77692, intStd_MaxMedRtDrift = 600, intStd_MaxPpmDev = 200,
minSpecEx = 40, 
minDotProd = 0.8, percMissing = 15, percExtra = 100, smoothingSpan = 0.8,
saveRtDev = 1, outputPlotDir = NULL)
rtDevModelling(MS2Dir = NULL, runOrder = NULL, 
nCores = NULL, TICfilter = 0, 
intStdPeakList=c(290.21, 403.30, 516.38, 587.42,849.40, 884.92, 958.46,
993.97,1050.52, 1107.06, 1209.73, 1337.79,1465.85), 
intStdMass = 834.77692, intStd_MaxMedRtDrift = 600, intStd_MaxPpmDev = 200,
minSpecEx = 40, 
minDotProd = 0.8, percMissing = 15, percExtra = 100, smoothingSpan = 0.8,
saveRtDev = 1, outputPlotDir = NULL)

Arguments

`MS2Dir`	character a full path to a directory containing either .mzXML or .mzML data
`runOrder`	character a full path to a csv file specifying the runorder for each of the files the first column must contain the precise file name and the second column an integer representing the precise run order.
`nCores`	numeric the number of cores to use for parallel computation. The default is to 1 core.
`TICfilter`	numeric minimimum total ion current of an MS/MS scan. Any MS/MS scan below this value will be filtered out (default=0).
`intStdPeakList`	character a comma seperated list of expected fragment ions for the internal standard spectrum (no white space).
`intStdMass`	numeric expected mass-to-charge ratio of internal standard precursor (default = 834.77692).
`intStd_MaxMedRtDrift`	numeric the maximum retention time drift window (in seconds) to identify internal standard MS/MS spectrum scans (default = 600).
`intStd_MaxPpmDev`	numeric the maximum mass accuracy window (in ppm) to identify internal standard MS/MS spectrum scans (default = 200 ppm).
`minSpecEx`	numeric the minimum percentage of the total ion current explained by the internal standard fragments (default = 40). Sometimes spectra are not identified due to this cutoff being set too high. If unexpected datapoints have been interpolated then reduce this value.
`minDotProd`	numeric. Minimum mean dot product spectral similarity score to keep a spectrum within an MS/MS group (default = 0.8).
`percMissing`	numeric. percentage of missing files for a MS/MS scan group to be utilized in the loess retention time deviation model. Roughly 15 percent missing values (default = 15%) is a good starting point (e.g. nMissing=10 for 68 samples).
`percExtra`	numeric percentage of extra scans above the total number of files for a MS/MS scan group to be utilized in the loess retention time deviation model. If a MS/MS scan group consists of many scans far in excess of the number of files then potentially MS/MS scans from large tailing peaks or isobars may be erroneously grouped together and used to adjust retention time incorrectly (default = 100% i.e. the peak group can only have one scan per file, this value can be increased if two or more consecutive scans for example can be considered).
`smoothingSpan`	numeric. fixed smoothing span, argument to `loess`. If argument is not supplied then optimal smoothing span is calculated for each file seperately using 7-fold CV.
`saveRtDev`	integer (default = 1) should just the retention time deviation model be saved (TRUE = 1) or the AdductSpec class object (FALSE = 0) as .RData workspace files.
`outputPlotDir`	character (default = NULL) output directory for plots.

Value

LOESS RT models as adductSpectra AdductSpec object

Examples

eh = ExperimentHub();
temp = query(eh, 'adductData');
temp[['EH2061']]; #first mzXML file
file.rename(cache(temp["EH2061"]), file.path(hubCache(temp), 
'data42_21221_2.mzXML'));
rtDevModelling(MS2Dir=hubCache(temp),nCores=2,runOrder=paste0(
system.file("extdata",package="adductomicsR"),
'/runOrder2.csv'), intStdPeakList=c(290.21, 403.30, 516.38, 
587.42,849.40, 884.92, 958.46, 993.97,1050.52, 1107.06, 1209.73, 
1337.79,1465.85))
eh = ExperimentHub();
temp = query(eh, 'adductData');
temp[['EH2061']]; #first mzXML file
file.rename(cache(temp["EH2061"]), file.path(hubCache(temp), 
'data42_21221_2.mzXML'));
rtDevModelling(MS2Dir=hubCache(temp),nCores=2,runOrder=paste0(
system.file("extdata",package="adductomicsR"),
'/runOrder2.csv'), intStdPeakList=c(290.21, 403.30, 516.38, 
587.42,849.40, 884.92, 958.46, 993.97,1050.52, 1107.06, 1209.73, 
1337.79,1465.85))

extract and save retention time deviation models from adductSpec class object

Description

extract and save retention time deviation models from adductSpec class object

Usage

rtDevModelSave(object = NULL, outputDir = NULL)
rtDevModelSave(object = NULL, outputDir = NULL)

Arguments

`object`	an 'adductSpec' class object or full path to a .RData file of the 'adductSpec' object
`outputDir`	character full path to a directory to save the .RData file (defaults to the current working directory if unsupplied).

Value

save a .RData file containing the rt deviation models and returns to the workspace.

Signal grouping

Description

Euclidean distances between m/z signals are hierarchically clustering using the average method and the composite spectrum groups determined by an absolute error cutoff

Usage

signalGrouping(spectrum.df = NULL, mzError = 0.8, minPeaks = 5)
signalGrouping(spectrum.df = NULL, mzError = 0.8, minPeaks = 5)

Arguments

`spectrum.df`	a dataframe or matrix with two or more columns: 1. Mass/ Mass-to-charge ratio 2. Intensity
`mzError`	interpeak absolute m/z error for signal grouping (Default = 0.001)
`minPeaks`	numeric minimum number of peaks to integrate

Value

dataframe of m/z grouped signals, the m/z values of the input dataframe/ matrix peak groups are averaged and the signal intensities summed.

spectral similarity based adducted peptide identification for adductomicsR

Description

spectral similarity based adducted peptide identification for adductomicsR

Usage

specSimPepId(MS2Dir=NULL,nCores=NULL,
rtDevModels=NULL, topIons=100, topIntIt=5,minDotProd=0.8, precCh=3,
minSNR=3,minRt=20, maxRt=35, minIdScore=0.4,minFixed=3, minMz=750, 
maxMz=1000,modelSpec=c('ALVLIAFAQYLQQCPFEDHVK','RHPYFYAPELLFFAK'),
groupMzabs=0.005, groupRtDev=0.5, possFormMzabs=0.01,
minMeanSpecSim=0.7,idPossForm=0, outputPlotDir= NULL)
specSimPepId(MS2Dir=NULL,nCores=NULL,
rtDevModels=NULL, topIons=100, topIntIt=5,minDotProd=0.8, precCh=3,
minSNR=3,minRt=20, maxRt=35, minIdScore=0.4,minFixed=3, minMz=750, 
maxMz=1000,modelSpec=c('ALVLIAFAQYLQQCPFEDHVK','RHPYFYAPELLFFAK'),
groupMzabs=0.005, groupRtDev=0.5, possFormMzabs=0.01,
minMeanSpecSim=0.7,idPossForm=0, outputPlotDir= NULL)

Arguments

`MS2Dir`	character a full path to a directory containing either .mzXML or .mzML data
`nCores`	numeric the number of cores to use for parallel computation. The default is to use 1 core.
`rtDevModels`	a list object or a full path to an RData file containing the retention time deviation models for the dataset.
`topIons`	numeric the number of most intense ions to consider for the basepeak to fragment mass difference calculation (default = 100). Larger values will slightly increase computation time, however when the modified/variable ions happen to be low abundance this value should be set high to ensure these fragment ions are considered.
`topIntIt`	numeric the number of most intense peaks to calculate the peak to peak mass differences from (default = 5 i.e. the base peak and the next 4 most intense ions greater than 10 daltons in mass from one another will be considered the multiple iterations increase computation time but in the case that the peptide spectrum is contaminated/chimeric or the variable ions are of lower intensity this parameter should be increased).
`minDotProd`	numeric minimum dot product similarity score (cosine) between the model spectra's variable ions and the corresponding intensities of the basepeak to fragment ion mass differences identified in the experimental spectrum scans (default = 0.8). Low values will greatly increase the potential for false positive peptide annotations.
`precCh`	integer charge state of precursors (default = 3).
`minSNR`	numeric the minimum signal to noise ratio for a fragment ion to be considered. The noise level for each fixed or variable ion is calculated by taking the median of the bottom half of ion intensities within the locality of the fragment ion. The locality is defined as within +/- 100 Daltons of the fragment ion.
`minRt`	numeric the minimum retention time (in minutes) within which to identify peptide spectra (default=20).
`maxRt`	numeric the maximum retention time (in minutes) within which to identify peptide spectra (default=45).
`minIdScore`	numeric the minimum identification score this is an average score of all of the 7 scoring metrics (default=0.4).
`minFixed`	numeric the minimum number of fixed fragment ions that must have been identified in a spectrum for it to be considered.
`minMz`	numeric the minimum mass-to-charge ratio of a precursor ion.
`maxMz`	numeric the maximum mass-to-charge ration of a precursor ion.
`modelSpec`	character full path to a model spectrum file (.csv). Alternatively built in model tables (in the extdata directory) can be used by just supplying the one letter amino acid code for the peptide (currently available are: "ALVLIAFAQYLQQCPFEDHVK" and "RHPYFYAPELLFFAK"). If supplying a custom table it must consist of the following mandatory columns ("mass", "intensity", "ionType" and "fixed or variable"). mass - m/z of fragment ions. intensity - intensity of fragment ions can be either relative or absolute intensity ionType - the identity of the B and Y fragments can optionally added here (e.g. [b6]2+, [y2]1+) or if not known such as for mixed disulfates this column can also contain empty fields. fixed or variable - this column contains whether a fragment ion should be considered either 'fixed', 'variable' (i.e. modified) or if it is an empty field it will not be considered. As default the following model spectra are included in the external data directory of the adductomics package: 'modelSpectrum_ALVLIAFAQYLQQCPFEDHVK.csv' 'modelSpectrum_RHPYFYAPELLFFAK.csv'
`groupMzabs`	numeric after hierarchical clustering of the spectra the dendrogram will be cut at this height (in Da) generating the mass groups.
`groupRtDev`	numeric after hierarchical clustering of the spectra the dendrogram will be cut at this height (in minutes) generating the retention time groups.
`possFormMzabs`	numeric the maximum absolute mass difference for matching adduct mass to possible formulae.
`minMeanSpecSim`	numeric minimum mean dot product similarity score (cosine) between the spectra of a group identified by hierarchical clustering. This parameter is set to prevent erroneous clustering of dissimilar spectra (default = 0.7).
`idPossForm`	integer if = 1 then the average adduct masses of each spectrum group will be matched against an internal database of possible formula to generate hypotheses. The default 0 mean this will not take place as the computation is potentially time consuming.
`outputPlotDir`	character (default = NULL) output directory for plots.

Value

dataframe of putative adducts

Examples

## Not run: 
eh = ExperimentHub();
temp = query(eh, 'adductData');
specSimPepId(MS2Dir=hubCache(temp),nCores=2,
rtDevModels=paste0(hubCache(temp),'/rtDevModels.RData'))

## End(Not run)## Not run: 
eh = ExperimentHub();
temp = query(eh, 'adductData');
specSimPepId(MS2Dir=hubCache(temp),nCores=2,
rtDevModels=paste0(hubCache(temp),'/rtDevModels.RData'))

## End(Not run)

Deconvolute both MS2 and MS1 levels scans adductomics

Description

Deconvolute both MS2 and MS1 levels scans adductomics

Usage

spectraCreate(MS2file = NULL, TICfilter = 10000, DNF = 2, minInt = 
100, minPeaks = 5)
spectraCreate(MS2file = NULL, TICfilter = 10000, DNF = 2, minInt = 
100, minPeaks = 5)

Arguments

`MS2file`	character vector of mzXML file locations
`TICfilter`	numeric minimimum total ion current of an MS/MS scan. Any MS/MS scan below this value will be filtered out (default=0).
`DNF`	dynamic noise filter minimum signal to noise threshold (default = 2), calculated as the ratio between the linear model predicted intensity value and the actual intensity.
`minInt`	numeric minimum intensity value
`minPeaks`	minimum number of signal peaks following dynamic noise filtration (default = 5).

Value

list of MS2 spectra

true peak and trough detection

Description

true peak and trough detection

Usage

truePeakTrough(peaksTmp = NULL, troughsTmp = NULL, peakRangeTmp = 
NULL, minRes = 50, lrRes = FALSE)
truePeakTrough(peaksTmp = NULL, troughsTmp = NULL, peakRangeTmp = 
NULL, minRes = 50, lrRes = FALSE)

Arguments

`peaksTmp`	character vector with indices of detected peaks from findPeaks
`troughsTmp`	character vector with indices of detected troughs from findPeaks
`peakRangeTmp`	matrix of the peak range data with at least 3 columns (1. mass-to-charge, 2. intensity, 3. retention time)
`minRes`	numeric minimum percentage left/right resolution
`lrRes`	logical if true both the left and right troughs must be above the minRes else the peak will be discounted. (default = FALSE i.e. if only the left or right trough is less than minRes then the peak will be retained)

Value

a named numeric of both the peaks and troughs fitting the criteria.

Package 'adductomicsR'

Help Index

Adduct quantification for adductomicsR

Description

Usage

Arguments

Value

Examples

AdductQuantif class The AdductQuantif class contains a peak integral matrix, peak ranges and region of integration, the isotopic distribution identified for each integrated peak and the target table of peaks integrated.

Description

Usage

Format

Value

Slots

Methods

Author(s)

AdductSpec class

Description

Usage

Format

Value

Slots

Methods

Author(s)

Constructor of AdductSpec object deconvolute spectra MS2 and MS1 levels

Description

Usage

Arguments

Value

modified Digest function (from OrgMassSpecR package)

Description

Usage

Arguments

Details

Value

Examples

dot product matrix calculation

Description

Usage

Arguments

Value

dot product calculation

Description

Usage

Arguments

Value

Dynamic Noise filtration

Description

Usage

Arguments

Details

Value

filter samples with low QC and features with large missing values Removes adducts that have not been integrated with many missing values and provides QC on samples

Description

Usage

Arguments

Value

Examples

identify peaks

Description

Usage

Arguments

Value

Examples

Make a target table for adductomicsR quantificaton using specSimPep results

Description

Usage

Arguments

Value

Examples

modified function from package OrgMassSpecR

Description

Usage

Arguments

Value

Examples

wrapper script for loess modeling

Description

Usage

Arguments

modified `Digest` function (from OrgMassSpecR package)