Title: | Mass Spectrometry Data Backend for Mascot Generic Format (mgf) Files |
---|---|
Description: | Mass spectrometry (MS) data backend supporting import and export of MS/MS spectra data from Mascot Generic Format (mgf) files. Objects defined in this package are supposed to be used with the Spectra Bioconductor package. This package thus adds mgf file support to the Spectra package. |
Authors: | RforMassSpectrometry Package Maintainer [cre], Laurent Gatto [aut] , Johannes Rainer [aut] , Sebastian Gibb [aut] , Michael Witting [ctb] , Adriano Rutz [ctb] |
Maintainer: | RforMassSpectrometry Package Maintainer <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.15.2 |
Built: | 2025-01-18 03:21:07 UTC |
Source: | https://github.com/bioc/MsBackendMgf |
The MsBackendMgf class supports import and export of MS/MS spectra data from/to files in Mascot Generic Format (mgf). After initial import, the full MS data is kept in memory. MsBackendMgf extends the Spectra::MsBackendDataFrame() backend directly and thus supports the Spectra::applyProcessing() function to make data manipulations persistent.
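As a rough illustration of what making data manipulations persistent can look like, the sketch below imports the test files shipped with the package, queues an intensity filter and then calls `applyProcessing()` to apply it to the data held in memory. It assumes that the Spectra and MsBackendMgf packages are installed (the `have_pkgs` guard is not part of the package and is only there so the snippet also runs without them). The intensity threshold of 100 is an arbitrary value chosen for illustration.

```r
## Guard (an assumption of this sketch): only run when both packages are
## installed.
have_pkgs <- requireNamespace("Spectra", quietly = TRUE) &&
    requireNamespace("MsBackendMgf", quietly = TRUE)

if (have_pkgs) {
    library(Spectra)
    library(MsBackendMgf)

    ## Test mgf files shipped with the package.
    fls <- dir(system.file("extdata", package = "MsBackendMgf"),
               full.names = TRUE, pattern = "mgf$")

    ## Import the data; the full MS data is now kept in memory.
    sps <- Spectra(backendInitialize(MsBackendMgf(), fls))
    peaks_before <- sum(lengths(sps))

    ## Queue a filter that keeps peaks with an intensity of at least 100
    ## (arbitrary threshold).
    sps <- filterIntensity(sps, intensity = c(100, Inf))

    ## Apply the queued processing steps to the data in the backend.
    sps <- applyProcessing(sps)
    peaks_after <- sum(lengths(sps))

    ## Total number of peaks before and after filtering.
    print(c(before = peaks_before, after = peaks_after))
}
```

After `applyProcessing()` the filtered peak data replaces the originally imported values in the backend, so the removed peaks can only be recovered by importing the files again.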
New objects are created with the MsBackendMgf function. The backendInitialize method must then be called to initialize the object and import MS/MS data from one or more mgf files.
The MsBackendMgf backend provides an export method that exports the data from a Spectra object (parameter x) to a file in mgf format. See the package vignette for details and examples.
Default mappings from fields in the MGF file to spectra variable names are provided by the spectraVariableMapping function. This function returns a named character vector where the names are the spectra variable names and the values are the respective field names in the MGF files. This named character vector is passed to the import and export functions with parameter mapping. It is also possible to pass custom mappings (e.g. for special MGF dialects) with the mapping parameter.
    ## S4 method for signature 'MsBackendMgf'
    backendInitialize(
      object,
      files,
      mapping = spectraVariableMapping(object),
      nlines = -1L,
      ...,
      BPPARAM = SerialParam()
    )

    MsBackendMgf()

    ## S4 method for signature 'MsBackendMgf'
    spectraVariableMapping(object, format = c("mgf"))

    ## S4 method for signature 'MsBackendMgf'
    export(
      object,
      x,
      file = tempfile(),
      mapping = spectraVariableMapping(object),
      exportTitle = TRUE,
      ...
    )
object | Instance of |
files | |
mapping | for |
nlines | for |
... | Currently ignored. |
BPPARAM | Parameter object defining the parallel processing setup. If parallel processing is enabled (with |
format | for |
x | for |
file | |
exportTitle | |
See description above.
Laurent Gatto and Johannes Rainer
    library(BiocParallel)
    library(Spectra)
    library(MsBackendMgf)

    fls <- dir(system.file("extdata", package = "MsBackendMgf"),
        full.names = TRUE, pattern = "mgf$")

    ## Create an MsBackendMgf backend and import data from test mgf files.
    be <- backendInitialize(MsBackendMgf(), fls)
    be

    be$msLevel
    be$intensity
    be$mz

    ## The spectra variables that are available; note that not all of them
    ## have been imported from the MGF files.
    spectraVariables(be)

    ## The variable "TITLE" represents the title of the spectrum defined in the
    ## MGF file
    be$TITLE

    ## The default mapping of MGF fields to spectra variables is provided by
    ## the spectraVariableMapping function
    spectraVariableMapping(MsBackendMgf())

    ## We can provide our own mapping e.g. to map the MGF field "TITLE" to a
    ## variable named "spectrumName":
    map <- c(spectrumName = "TITLE",
        spectraVariableMapping(MsBackendMgf()))
    map

    ## We can then pass this mapping with parameter `mapping` to the
    ## backendInitialize method:
    be <- backendInitialize(MsBackendMgf(), fls, mapping = map)

    ## The title is now available as variable named spectrumName
    be$spectrumName

    ## Next we create a Spectra object with this data
    sps <- Spectra(be)

    ## We can use the 'MsBackendMgf' also to export spectra data in mgf format.
    out_file <- tempfile()
    export(sps, backend = MsBackendMgf(), file = out_file, mapping = map)

    ## The first 20 lines of the generated file:
    readLines(out_file, n = 20)

    ## Next we add a new spectra variable to each spectrum
    sps$spectrum_idx <- seq_along(sps)

    ## This new spectra variable will also be exported to the mgf file:
    export(sps, backend = MsBackendMgf(), file = out_file, mapping = map)
    readLines(out_file, n = 20)
The readMgf function imports the data from a file in MGF format, reading all specified fields and returning the data as a S4Vectors::DataFrame().
For very large MGF files the readMgfSplit function might be used instead. In contrast to the readMgf function, readMgfSplit reads only nlines lines from an MGF file at once, thus reducing the memory demand (at the cost of lower performance compared to readMgf).
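The difference between the two functions can be sketched on a toy file. The example below writes two made-up spectra to a temporary MGF file and reads it once with `readMgf` and once with `readMgfSplit` in chunks of 10 lines. The file content and the tiny `nlines` value are assumptions for illustration only (real use cases would keep a much larger chunk size such as the default), and the `have_split` guard is not part of the package; it only ensures the snippet also runs when the package or the function is unavailable.

```r
## Write a small MGF file with two made-up spectra to a temporary location.
mgf_file <- tempfile(fileext = ".mgf")
writeLines(c(
    "BEGIN IONS",
    "TITLE=spectrum 1",
    "PEPMASS=500.25",
    "CHARGE=2+",
    "100.1 200",
    "150.2 350",
    "END IONS",
    "BEGIN IONS",
    "TITLE=spectrum 2",
    "PEPMASS=612.3",
    "CHARGE=3+",
    "110.4 90",
    "220.8 410",
    "END IONS"), mgf_file)

## Guard (an assumption of this sketch): only run when the package is
## installed and exports readMgfSplit.
have_split <- requireNamespace("MsBackendMgf", quietly = TRUE) &&
    "readMgfSplit" %in% getNamespaceExports("MsBackendMgf")

if (have_split) {
    library(MsBackendMgf)

    ## Read the whole file at once.
    res_full <- readMgf(mgf_file)

    ## Read the file in chunks of 10 lines; deliberately tiny for this toy
    ## file so that more than one chunk is needed.
    res_split <- readMgfSplit(mgf_file, nlines = 10)

    ## Number of spectra returned by each function.
    print(c(readMgf = nrow(res_full), readMgfSplit = nrow(res_split)))
}
```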
    readMgf(
      f,
      msLevel = 2L,
      mapping = spectraVariableMapping(MsBackendMgf()),
      ...,
      BPPARAM = SerialParam()
    )

    readMgfSplit(
      f,
      msLevel = 2L,
      mapping = spectraVariableMapping(MsBackendMgf()),
      nlines = 1e+05,
      BPPARAM = SerialParam(),
      ...
    )
f | |
msLevel | |
mapping | named |
... | Additional parameters, currently ignored. |
BPPARAM | parallel processing setup that should be used. Only the parsing of the imported MGF file is performed in parallel. |
nlines | for |
A DataFrame with each row containing the data from one spectrum in the MGF file. m/z and intensity values are available in columns "mz" and "intensity" in a list representation.
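To make the list representation more concrete, the following sketch reads a single made-up spectrum and extracts its peaks from the "mz" and "intensity" columns. The file content is invented for illustration, and the `have_pkg` guard is an assumption of this sketch (not part of the package) so that the snippet also runs when MsBackendMgf is not installed.

```r
## A temporary MGF file with one made-up spectrum of three peaks.
mgf_file <- tempfile(fileext = ".mgf")
writeLines(c(
    "BEGIN IONS",
    "TITLE=example spectrum",
    "PEPMASS=500.25",
    "100.1 200",
    "150.2 350",
    "175.3 120",
    "END IONS"), mgf_file)

## Guard (an assumption of this sketch): only run when the package is
## installed.
have_pkg <- requireNamespace("MsBackendMgf", quietly = TRUE)

if (have_pkg) {
    library(MsBackendMgf)

    res <- readMgf(mgf_file)

    ## Each element of the "mz" and "intensity" columns holds the values of
    ## one spectrum; extract those of the first (and only) spectrum.
    mz_first <- res$mz[[1L]]
    int_first <- res$intensity[[1L]]

    ## Combine them into a peak table for that spectrum.
    print(data.frame(mz = mz_first, intensity = int_first))
}
```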
Laurent Gatto, Johannes Rainer, Sebastian Gibb
    library(MsBackendMgf)

    fls <- dir(system.file("extdata", package = "MsBackendMgf"),
        full.names = TRUE, pattern = "mgf$")[1L]

    readMgf(fls)