Title: | Pre-process 1H-NMR FID signals |
---|---|
Description: | This package provides R functions for common pre-processing steps that are applied on 1H-NMR data. It also provides a function to read the FID signals directly in the Bruker format. |
Authors: | Manon Martin [aut, cre], Bernadette Govaerts [aut, ths], Benoît Legat [aut], Paul H.C. Eilers [aut], Pascal de Tullio [dtc], Bruno Boulanger [ctb], Julien Vanwinsberghe [ctb] |
Maintainer: | Manon Martin <[email protected]> |
License: | GPL-2 | file LICENSE |
Version: | 1.25.0 |
Built: | 2024-12-18 03:41:29 UTC |
Source: | https://github.com/bioc/PepsNMR |
This package provides R functions for classic and advanced pre-processing steps that are applied
on 1H NMR data.
It also provides the function ReadFids
to read the FID directly from the Bruker format.
These pre-processing steps are listed below in the advised order of their application:
GroupDelayCorrection
Correct the first order phase shift caused by the group delay.
SolventSuppression
Remove solvent signal from the FIDs.
Apodization
Increase the sensitivity/resolution of the FIDs.
ZeroFilling
Improve the visual representation of the spectra.
FourierTransform
Transform the FID into a spectrum and convert the frequency scale (Hertz -> ppm).
ZeroOrderPhaseCorrection
Apply the zero order phase correction.
InternalReferencing
Calibrate the spectra with internal compound referencing.
BaselineCorrection
Remove the spectral baseline.
NegativeValuesZeroing
Set negative values to 0.
Warping
Warp the samples according to a reference spectrum.
WindowSelection
Select the informative part of the spectrum.
Bucketing
Data reduction by integration.
RegionRemoval
Set intensities of a desired region to 0.
ZoneAggregation
Aggregate a region to a single peak.
Normalization
Normalize the spectra.
Package: | PepsNMR |
Type: | Package |
Version: | 0.99.0 |
License: | GPLv2 |
The FIDs are read using ReadFids
which also gives a matrix with meta-information about each FID.
The other functions apply different pre-processing steps on these signals, and some need the info matrix as outputted from ReadFids
.
During this pre-processing, the signal is transformed through Fourier transformation and the frequency scale is expressed in ppm.
For more details and illustrated explanations about those pre-treatment steps, see the documentation of each function and/or the chapter 1 of the reference below.
Benoît Legat, Bernadette Govaerts & Manon Martin
Maintainer: Manon Martin <[email protected]>
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
    library(PepsNMR)
    path <- system.file("extdata", package = "PepsNMRData")
    dir(path)
    fidList <- ReadFids(file.path(path, "HumanSerum"))
    Fid_data <- fidList[["Fid_data"]]
    Fid_info <- fidList[["Fid_info"]]
    Fid_data <- GroupDelayCorrection(Fid_data, Fid_info)
    Fid_data <- SolventSuppression(Fid_data)
    Fid_data <- Apodization(Fid_data, Fid_info)
    Fid_data <- ZeroFilling(Fid_data)
    Spectrum_data <- FourierTransform(Fid_data, Fid_info)
    Spectrum_data <- ZeroOrderPhaseCorrection(Spectrum_data)
    Spectrum_data <- InternalReferencing(Spectrum_data, Fid_info)
    Spectrum_data <- BaselineCorrection(Spectrum_data)
    Spectrum_data <- NegativeValuesZeroing(Spectrum_data)
    Spectrum_data <- Warping(Spectrum_data)
    Spectrum_data <- WindowSelection(Spectrum_data)
    Spectrum_data <- Bucketing(Spectrum_data)
    Spectrum_data <- RegionRemoval(Spectrum_data, typeofspectra = "serum")
    # Spectrum_data <- ZoneAggregation(Spectrum_data)
    Spectrum_data <- Normalization(Spectrum_data, type.norm = "mean")
The function multiplies the FID by a defined factor to increase the sensitivity and/or resolution of the spectra.
Apodization(Fid_data, Fid_info = NULL, DT = NULL, type.apod = c("exp","cos2", "blockexp","blockcos2","gauss","hanning","hamming"), phase = 0, rectRatio = 1/2, gaussLB = 1, expLB = 0.3, plotWindow = FALSE, returnFactor = FALSE, verbose=FALSE)
Fid_data |
Matrix containing the FIDs, one row per signal, as outputted by |
Fid_info |
Matrix containing the info about the FIDs, one row per signal, as outputted by |
DT |
If given, used instead of |
type.apod |
Type of apodization, see details. |
phase |
Phase at which the apodization window is maximum for |
rectRatio |
If there is a rectangular window, ratio between the width of the window and the width of the signal. |
gaussLB |
Line Broadening for the gaussian window, see details. |
expLB |
Line Broadening for the exponential window, see details. |
plotWindow |
If |
returnFactor |
If |
verbose |
If |
The apodization is usually performed in order to increase the sensitivity, i.e. the Signal-to-Noise Ratio (SNR) of the spectra. This is based on the fact that the signal intensity is decreasing over time unlike the noise that keeps a constant amplitude, leaving a noisy tail at the end of the FID. Multiplying the FID with a decaying signal will then increase the SNR. Since the area under the spectral peak remains unchanged, a faster decay will also result in a reduced peak height in spectra, lowering the spectral resolution. Optimal trade-off parameters for the apodization signal are thus needed to prevent high losses in sensitivity/resolution.
A FID of the form $s_0 e^{i 2\pi\nu t} e^{-t/\tau}$ has a peak in its spectrum at the frequency $\nu$, of width inversely proportional to $\tau$.
This peak is called a spectral line and its width a spectral width.
In the case of the exponential multiplication ("exp"), which is the default apodization, the decaying exponential becomes:
$$e^{-t/\tau} \, e^{-LB \cdot t} = e^{-t (1/\tau + LB)}$$
The new decay $\tau^*$, which satisfies
$$\frac{1}{\tau^*} = \frac{1}{\tau} + LB,$$
is therefore smaller, so the spectral line is broader.
That is why we call this parameter the Line Broadening.
If LB increases, the SNR increases but at the expense of the spectral resolution. Usual values for LB found in the proton NMR literature are 0.3 for the NOESY presat pulse sequence and -0.01 for the CPMG presat pulse sequence. LB should not exceed 1 to avoid information loss.
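The trade-off described above can be checked numerically. The following is a minimal base-R sketch, not the package's implementation: it simulates a single-resonance FID, multiplies it by a decaying exponential and compares the spectral line before and after. The values of `tau`, `nu` and `LB` (taken here in 1/s and much larger than the usual values, so that the effect is visible) and the helper `fwhm_hz` are illustrative assumptions.

```r
# Synthetic FID: one resonance at nu Hz with decay constant tau (seconds).
m   <- 4096                      # number of points
dt  <- 1e-3                      # dwell time in seconds (sweep width = 1/dt Hz)
t   <- (0:(m - 1)) * dt
tau <- 0.05
nu  <- 100
fid <- exp(2i * pi * nu * t) * exp(-t / tau)

# Exponential apodization with line broadening LB (assumed to be in 1/s).
LB       <- 20
fid_apod <- fid * exp(-LB * t)

# Hypothetical helper: full width at half maximum of the real part of the
# spectrum, in Hz, obtained by counting the points above half the peak height.
fwhm_hz <- function(x, dt) {
  spec <- Re(fft(x))
  sum(spec > max(spec) / 2) / (length(x) * dt)
}

width_raw   <- fwhm_hz(fid, dt)
width_apod  <- fwhm_hz(fid_apod, dt)
height_raw  <- max(Re(fft(fid)))
height_apod <- max(Re(fft(fid_apod)))

# The apodized line is broader and lower than the original one.
c(width_raw = width_raw, width_apod = width_apod)
c(height_raw = height_raw, height_apod = height_apod)
```

With a noisy FID the same multiplication also damps the noisy tail, which is where the gain in SNR comes from; the sketch only shows the cost in resolution.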
The different types of apodization are:
exp
The signal is multiplied by a decreasing exponential $e^{-LB \cdot t}$.
cos2
The signal is multiplied by the value of a $\cos^2$ from 0 (where its value is 1) until $\pi/2$ (where its value is 0).
blockexp
The first part of the signal (defined by rectRatio) is left unchanged and the second is multiplied by $e^{-LB \cdot t}$, starting at value 1.
blockcos2
The first part is left unchanged as with blockexp and the second part is multiplied by a $\cos^2$ whose value starts at 1 at the end of the block and ends at 0 at the end of the signal.
gauss
The signal is multiplied by a gaussian window centered at the beginning of the FID and with $\sigma = 1/LB$.
hanning
The signal is multiplied by a hanning window: $0.5 + 0.5\cos$.
hamming
The signal is multiplied by a hamming window: $0.54 + 0.46\cos$.
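The window shapes listed above can be sketched in base R as follows. This is an illustration written from the textual descriptions, assuming `phase = 0` and standard Hanning/Hamming coefficients; the function `apod_window` and its arguments are hypothetical and do not reproduce the PepsNMR source code.

```r
# Illustrative apodization windows for a FID of m points sampled every dt seconds.
apod_window <- function(m, dt, type = "exp", LB = 0.3, rectRatio = 1/2) {
  t <- (0:(m - 1)) * dt            # time axis in seconds
  x <- (0:(m - 1)) / (m - 1)       # relative position in the FID, from 0 to 1
  nblock <- floor(rectRatio * m)   # size of the block left unchanged
  ntail  <- m - nblock             # size of the part that is attenuated
  switch(type,
    exp       = exp(-LB * t),
    cos2      = cos(x * pi / 2)^2,
    blockexp  = c(rep(1, nblock), exp(-LB * t[seq_len(ntail)])),
    blockcos2 = c(rep(1, nblock),
                  cos(seq(0, 1, length.out = ntail) * pi / 2)^2),
    gauss     = exp(-(LB * t)^2 / 2),        # gaussian with sigma = 1 / LB
    hanning   = 0.5 + 0.5 * cos(pi * x),
    hamming   = 0.54 + 0.46 * cos(pi * x),
    stop("unknown window type: ", type)
  )
}

types   <- c("exp", "cos2", "blockexp", "blockcos2", "gauss", "hanning", "hamming")
windows <- sapply(types, function(tp) apod_window(512, 1e-3, type = tp))

# One column per window; every window starts at 1.
matplot(windows, type = "l", lty = 1, xlab = "index", ylab = "factor")
```

Multiplying a FID by one of these columns, element by element, gives the apodized FID.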
If returnFactor is TRUE, the function returns a list with the following elements: Fid_data and Factor. Otherwise, it returns only Fid_data.
Fid_data |
The apodized FIDs. |
Factor |
The apodization signal. |
Benoît Legat & Manon Martin
Inspired by the matNMR library.
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
    require(PepsNMRData)
    Apod_res <- Apodization(Data_HS_sp$FidData_HS_2, FidInfo_HS, plotWindow=FALSE)
    # or
    Apod_res <- Apodization(Data_HS_sp$FidData_HS_2, FidInfo_HS, plotWindow=FALSE, returnFactor=TRUE)
    Apod_fid = Apod_res[["Fid_data"]]
    plot(Apod_res[["Factor"]], type="l")
The function estimates and removes the smoothed baseline from the spectra.
BaselineCorrection(Spectrum_data, ptw.bc = TRUE, maxIter = 42, lambda.bc = 1e7, p.bc = 0.05, eps = 1e-8, ppm.bc = TRUE, exclude.bc = list(c(5.1,4.5)), returnBaseline = FALSE, verbose = FALSE)
Spectrum_data |
Matrix containing the spectra, one row per spectrum. |
ptw.bc |
If |
maxIter |
Maximum number of iterations for the R version (if |
lambda.bc |
Smoothing parameter (generally 1e5 – 1e8). See details. |
p.bc |
Asymmetry parameter. See details. |
eps |
Numerical precision for convergence when estimating the baseline. |
ppm.bc |
If |
exclude.bc |
If not |
returnBaseline |
If |
verbose |
If |
The signal should be an addition of positive peaks which represent metabolites from the samples.
These peaks are added to the baseline, which is the signal representing the absence of any metabolite and should therefore be uniformly zero. For each spectrum, its baseline is thus estimated and removed. Let $F$ be our initial spectrum and $Z$ be its baseline. Once $Z$ is approximated, the corrected spectrum is $F - Z$.
A negative signal doesn't make sense and creates problems with the statistical analysis. The estimated baseline should then not be such that $F - Z < 0$.
Hence, in the objective function to be minimized, the squared differences $(F - Z)^2$ are weighted by $p$ if $F - Z > 0$ or $1 - p$ if $F - Z < 0$.
$p$ is indeed taken very small, e.g. 0.05, to avoid negative intensities. The function NegativeValuesZeroing is used thereafter to set the remaining negative intensities to zero after the baseline correction.
With this function to minimize, we would simply have $Z = F$ as a solution, which would make $F - Z$ uniformly zero. Therefore, a roughness penalty term on $Z$ is applied so that it does not match exactly the peaks.
The importance of this smoothness constraint in the objective function is tuned by $\lambda$, which is typically equal to 1e7.
In summary, useful parameters are:
p.bc
The default value is 0.05. The smaller it is, the less the baseline $Z$ will try to follow peaks when it is under the function and the more it will try to be under the function.
lambda.bc
The default value is 1e7. The larger it is, the smoother the baseline $Z$ will be. With lambda = 0, the baseline will be equal to the signal and the corrected signal will be zero.
The algorithm used to find the baseline is iterative. In ptw
, the iteration is done until the baseline is found but if ptw.bc
is set to FALSE
, we stop after maxIter
iterations.
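The iterative estimation described above can be sketched in a few lines of base R. This is a dense-matrix illustration of asymmetric least squares smoothing in the spirit of Eilers and Boelens (2005), suitable only for short spectra; it is not the code used by ptw or by BaselineCorrection, and the function name `als_baseline` and the toy data are assumptions made for the example.

```r
# spec: real-valued spectrum (F). Returns the estimated baseline (Z).
als_baseline <- function(spec, lambda = 1e7, p = 0.05, maxIter = 42) {
  m <- length(spec)
  D <- diff(diag(m), differences = 2)      # second-order differences of Z
  P <- lambda * crossprod(D)               # roughness penalty on Z
  w <- rep(1, m)                           # start with equal weights
  for (iter in seq_len(maxIter)) {
    z     <- solve(diag(w) + P, w * spec)  # weighted, penalized least squares
    w_new <- ifelse(spec > z, p, 1 - p)    # weight p where F - Z > 0, 1 - p otherwise
    if (all(w_new == w)) break             # weights unchanged: baseline found
    w <- w_new
  }
  as.numeric(z)
}

# Toy spectrum: two positive peaks on top of a sloping baseline.
x         <- seq(0, 1, length.out = 300)
truth     <- 2 + x
peaks     <- 10 * exp(-(x - 0.3)^2 / (2 * 0.01^2)) +
              6 * exp(-(x - 0.7)^2 / (2 * 0.01^2))
spec      <- truth + peaks
baseline  <- als_baseline(spec, lambda = 1e6)
corrected <- spec - baseline

plot(x, spec, type = "l")
lines(x, baseline, col = "red")
```

Because the weights above the baseline are small (`p`), the peaks barely pull the estimate upwards, while points below it are penalized heavily, which keeps the corrected spectrum mostly non-negative.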
More details and motivations are given in the articles mentioned in the References.
If returnBaseline is TRUE, the function returns a list with the following elements: Spectrum_data and Baseline. Otherwise, it returns only Spectrum_data.
Spectrum_data |
The matrix of spectra with the baseline removed. |
Baseline |
Estimation of the baseline. |
Benoît Legat, Manon Martin & Paul H. C. Eilers
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
Eilers, PHC. and Boelens, HFM. (2005). Baseline correction with asymmetric least squares smoothing. Leiden University Medical Centre report, 2005.
See also SolventSuppression
which also uses the Whittaker smoother.
    require(PepsNMRData)
    BC_res <- BaselineCorrection(Data_HS_sp$Spectrum_data_HS_5, lambda.bc=5e+06, p.bc=0.05)
    # or
    BC_res <- BaselineCorrection(Data_HS_sp$Spectrum_data_HS_5, lambda.bc=5e+06, p.bc=0.05, returnBaseline=TRUE)
    BC_spec = BC_res[["Spectrum_data"]]
    plot(BC_res[["Baseline"]], type="l")
Reduces the number of data points by aggregating intensities into buckets.
Bucketing(Spectrum_data, width = FALSE, mb = 500, boundary = NULL, intmeth = c("r", "t"), tolbuck = 10^-4, verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
width |
If |
mb |
The number of buckets OR the buckets' width. If |
boundary |
Numeric vector of left and right boundaries for ppm integration. |
intmeth |
Type of bucketing: rectangular ( |
tolbuck |
Tolerance threshold to check if the buckets of the original spectra are of constant length. |
verbose |
If |
It is important to note that the input spectrum can have its ppm axis in increasing or decreasing order and it does not have to be equispaced.
Bucketing has two main interests:
Ease the statistical analysis
Decrease the impact of peak misalignments between different spectra that should be aligned, assuming we are in the ideal case where they fall in the same bucket.
Of course, the better the prior warping is, the larger the number of buckets mb can be without major misalignment and the more informative the spectra will be.
The ppm interval of Spectrum_data, let's say $[a, b]$ where $a > b$, is divided into mb buckets of size $(a - b)/mb$.
The new ppm scale contains the mb centers of these intervals.
The spectral intensity at these centers is the integral of the initial spectral intensity on this bucket using either trapezoidal or rectangular integration.
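As an illustration of the principle, a rectangular bucketing of a single spectrum can be sketched in base R as below. The sketch assumes an equispaced ppm axis and a real-valued spectrum, returns the new axis in increasing order, and uses a hypothetical function name `bucket_spectrum`; the Bucketing function itself handles more cases (non-equispaced axes, trapezoidal integration, user-defined boundaries).

```r
# Rectangular integration of one spectrum into mb buckets of equal width.
bucket_spectrum <- function(ppm, intensity, mb) {
  a <- max(ppm)
  b <- min(ppm)
  width <- (a - b) / mb
  edges <- seq(b, a, length.out = mb + 1)
  # Assign each point to a bucket; the lowest edge belongs to the first bucket.
  id   <- cut(ppm, breaks = edges, include.lowest = TRUE, labels = FALSE)
  step <- abs(mean(diff(ppm)))               # spacing of the original ppm axis
  # Sum of the intensities in each bucket times the spacing (rectangle rule).
  bucketed <- vapply(seq_len(mb),
                     function(k) sum(intensity[id == k]) * step,
                     numeric(1))
  list(ppm = edges[-1] - width / 2,          # centers of the buckets
       intensity = bucketed)
}

ppm       <- seq(10, 0, length.out = 1001)   # decreasing ppm axis
intensity <- rep(1, length(ppm))             # flat toy spectrum
res       <- bucket_spectrum(ppm, intensity, mb = 10)
res$ppm
```

The total integral is preserved by construction, since every original point contributes to exactly one bucket.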
Spectrum_data |
The matrix of spectra with their new ppm axis. |
Benoît Legat, Bernadette Govaerts & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
    require(PepsNMRData)
    Bucket.spec <- Bucketing(Data_HS_sp$Spectrum_data_HS_10, mb = 500)
Draws FIDs, spectra or their PCA scores/loadings.
Draw(Signal_data, type.draw = c("signal","pca"), output = c("default","window","png","pdf"), dirpath = ".",filename = "%003d", height = 480, width = 640, pdf.onefile = TRUE, ...)
Signal_data |
Matrix containing the FIDs or spectra, one line per FID/spectrum. |
type.draw |
Either "signal" or "pca", which calls respectively |
output |
Specifies how to display the drawings:
|
dirpath |
The path to the directory where the png or pdf are outputted. |
filename |
The filenames of the png and pdf, see argument |
height |
Height of the png and pdf in pixels. |
width |
Width of the png and pdf in pixels. |
pdf.onefile |
When |
... |
The remaining arguments are passed either to |
Depending on the type.draw
value, it can draw each row of Signal_data
in a way described by subtype
or the PCA scores or loadings (depending on the type.pca
value) of all the FIDs/spectra in Signal_data
.
Benoît Legat & Manon Martin
See Also DrawSignal
and DrawPCA
.
    # Draw each signal Real part and Mod in separate png with name end001 end002, ...
    # Draw the spectra
    require(PepsNMRData)
    Draw(FinalSpectra_HS, type.draw = "signal", output="window",subtype="together")
    # Draw a PCA
    Draw(FinalSpectra_HS, type.draw="pca",output="window")
The function draws the PCA scores or loadings of the FIDs/spectra given in the matrix Signal_data
.
Do not call this function directly but rather call Draw
to specify how the plot will be returned.
DrawPCA(Signal_data, drawNames = TRUE, main = "PCA score plot", Class = NULL, axes = c(1,2), type.pca = c("scores", "loadings"), loadingstype=c("l", "p"), num.stacked = 4, xlab = "rowname", createWindow)
Signal_data |
Matrix containing the FIDs or spectra, one line per FID/spectrum. |
drawNames |
If |
main |
Plot title. |
Class |
Vector (numeric or character) indicating the class of each spectra. Used for scores plot only. |
axes |
Vector of score or loading numbers to be plotted. If it represents the score's numbers, only the first two elements are used. |
type.pca |
The type of plot, either |
loadingstype |
The type of loadings plot, either a line plot ( |
num.stacked |
Number of stacked plots for the loadings plots. |
xlab |
Label of the x-axis of loadings plots. |
createWindow |
If |
Benoît Legat & Manon Martin
See also Draw
and DrawSignal
.
    require(PepsNMRData)
    # Draw loadings
    DrawPCA(FinalSpectra_HS, main = "PCA loadings plot", Class = NULL, axes = c(1, 3, 5),
            type.pca = "loadings", loadingstype = "l", num.stacked = 4, xlab = "ppm",
            createWindow = TRUE)
    # Draw scores
    class = substr(rownames(FinalSpectra_HS), 5, 5)
    DrawPCA(FinalSpectra_HS, drawNames = TRUE, main = "PCA scores plot", Class = class,
            axes = c(1, 2), type.pca = "scores", createWindow = TRUE)
Depending on the subtype
, will draw the different parts of the complex FIDs/spectra.
DrawSignal(Signal_data, subtype = c("stacked", "together", "separate", "diffmean", "diffmedian", "diffwith"), ReImModArg = c(TRUE, FALSE, FALSE, FALSE), vertical = TRUE , xlab = "index", RowNames = NULL, row = 1, num.stacked = 4, main = NULL, createWindow)
Signal_data |
Matrix containing the FIDs or spectra, one line per FID/spectrum. |
subtype |
Specifies the drawing array:
|
ReImModArg |
Specifies which of the real, imaginary, modulus, or argument part of the complex signal has to be plotted. Those plots are on the same page. |
vertical |
Specifies whether the parts of the complex signal have to be put vertically or horizontally on the page if there are only 2 parts. If more, there will be 2 horizontally and 2 vertically anyway. |
xlab |
Label of the x-axis. |
RowNames |
Strings to use instead of the rownames as labels for the plots if |
row |
|
num.stacked |
Number of stacked plots if |
main |
If not |
createWindow |
If |
Don't call this function directly but rather call Draw
to specify how the plot will be outputted.
Benoît Legat & Manon Martin
    require(PepsNMRData)
    plots <- DrawSignal(FinalSpectra_HS[1:4,], subtype = "together",
                        ReImModArg = c(TRUE, TRUE, FALSE, FALSE), createWindow = TRUE)
    grid::grid.draw(plots)
The function removes the group delay at the beginning of the FIDs.
FirstOrderPhaseCorrection(Fid_data, Fid_info = NULL, group_delay = NULL, verbose = FALSE)
Fid_data |
Matrix containing the FIDs, one row per signal, as outputted by |
Fid_info |
Matrix containing the info about the FIDs, one row per signal, as outputted by |
group_delay |
If given, it is used instead of |
verbose |
If |
The First Order Phase Correction step could also be called "removal of Bruker digital filter".
Due to Bruker's digital filter and to other technical reasons a first order phase shift caused by a group delay is present in the FID and needs to be removed.
Luckily, information about this delay is available when loading the FID with ReadFids
and is written in Fid_info
.
This function shifts circularly each FID in order to cancel this delay. By circularly, we mean that the starting portion of the FID becomes its ending portion when applied.
Each FID is shifted by the same amount, since the shift can be non-integer and the column names, which are the time coordinates, are shared between all the FIDs.
Fid_data |
The matrix of FIDs corrected for the first order phase shift. |
Benoît Legat & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
    require(PepsNMRData)
    Fopc.fid <- FirstOrderPhaseCorrection(Data_HS_sp$FidData_HS_0,FidInfo_HS)
The function takes the FIDs in the time domain and translates them into the frequency domain. It also converts the frequency scale from hertz to parts per million (ppm).
FourierTransform(Fid_data, Fid_info = NULL, SW_h = NULL, SW = NULL, O1 = NULL, reverse.axis = TRUE, verbose = FALSE)
Fid_data |
Matrix containing the FIDs, one row per signal, as outputted by |
Fid_info |
Matrix containing the info about the FIDs, one row per signal, as outputted by |
SW_h |
Sweep Width in hertz. If given, the value in |
SW |
Sweep width in ppm. If given, the value in |
O1 |
Spectrometer frequency offset. If given, the value in |
reverse.axis |
If |
verbose |
If |
The number of points $m$ doesn't change and the frequency interval is from $-SW_h/2$ to $SW_h/2 - SW_h/m$ (the $-SW_h/m$ is due to the fact that we only have $m$ points, not $m + 1$, and the Fourier transform is periodic with period $SW_h$, so it is the same at $-SW_h/2$ and $SW_h/2$ anyway).
SW_h, SW and O1 are usually taken from the Fid_info matrix. SW_h and SW are assumed to be the same for every FID since their column names are shared.
The frequency scale is dependent on the kind of spectrometer used, more precisely on its external magnetic field. We therefore translate it to a ppm (parts per million) scale, which is independent of this external magnetic field, thanks to the recovered transmitter frequency offset value (O1).
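The relation between the transformed FID and its frequency axis can be illustrated with base R's `fft`. This is a hedged sketch on a synthetic FID, not the package's code: the sweep width, the resonance frequency and the decay constant are arbitrary values chosen for the example, and the conversion from hertz to ppm is not shown.

```r
m    <- 1024                      # number of points
SW_h <- 2000                      # sweep width in Hz (illustrative value)
t    <- (0:(m - 1)) / SW_h        # time axis: one point every 1/SW_h seconds
nu   <- 250                       # resonance frequency in Hz
fid  <- exp(2i * pi * nu * t) * exp(-t / 0.05)

# fft() orders the frequencies as 0 ... SW_h/2, then -SW_h/2 ... 0;
# swapping the two halves puts the zero frequency in the middle.
spec <- fft(fid)
spec <- spec[c((m / 2 + 1):m, 1:(m / 2))]

# Frequency axis: m points from -SW_h/2 up to SW_h/2 - SW_h/m.
freq <- ((0:(m - 1)) - m / 2) * SW_h / m

peak_freq <- freq[which.max(Mod(spec))]
peak_freq
```

The peak is recovered at the frequency of the simulated resonance, and the axis contains the same number of points as the FID.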
RawSpect_data |
The matrix of spectra in ppm. |
Benoît Legat & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
    require(PepsNMRData)
    FT.spec <- FourierTransform(Data_HS_sp$FidData_HS_3,FidInfo_HS_sp, SW_h = 12019.23)
The function removes the group delay at the beginning of the FIDs.
GroupDelayCorrection(Fid_data, Fid_info = NULL, group_delay = NULL, verbose = FALSE)
Fid_data |
Matrix containing the FIDs, one row per signal, as outputted by |
Fid_info |
Matrix containing the info about the FIDs, one row per signal, as outputted by |
group_delay |
If given, it is used instead of |
verbose |
If |
The First Order Phase Correction step could also be called "removal of Bruker digital filter".
Due to Bruker's digital filter and to other technical reasons a first order phase shift caused by a group delay is present in the FID and needs to be removed.
Luckily, information about this delay is available when loading the FID with ReadFids
and is written in Fid_info
.
This function shifts circularly each FID in order to cancel this delay. By circularly, we mean that the starting portion of the FID becomes its ending portion when applied.
Each FID is shifted by the same amount, since the shift can be non-integer and the column names, which are the time coordinates, are shared between all the FIDs.
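The circular shift can be illustrated on a toy vector. The sketch below assumes an integer delay expressed in number of points, which is a simplification (the text above notes that the actual group delay can be non-integer); the function name `circular_shift` is hypothetical.

```r
# Circularly shift a FID to the left by an integer number of points:
# the first `delay` points are moved to the end of the signal.
circular_shift <- function(fid, delay) {
  m <- length(fid)
  delay <- delay %% m
  if (delay == 0) return(fid)
  c(fid[(delay + 1):m], fid[1:delay])
}

fid     <- c(0, 0, 0, 10, 8, 6, 4, 2)   # toy FID whose signal starts after 3 points
shifted <- circular_shift(fid, 3)
shifted
# 10 8 6 4 2 0 0 0
```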
Fid_data |
The matrix of FIDs corrected for the first order phase shift. |
Benoît Legat & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
    require(PepsNMRData)
    Fopc.fid <- GroupDelayCorrection(Data_HS_sp$FidData_HS_0, FidInfo_HS)
Chemical shifts are referenced against a Reference Compound (RC, e.g. TMSP).
InternalReferencing(Spectrum_data, Fid_info, method = c("max", "thres"), range = c("nearvalue", "all", "window"), ppm.value = 0, direction = "left", shiftHandling = c("zerofilling", "cut", "NAfilling", "circular"), c = 2, pc = 0.02, fromto.RC = NULL, ppm.ir = TRUE, rowindex_graph = NULL, verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
Fid_info |
Matrix containing the information for each spectrum, one row per spectrum, as returned by |
method |
Method used to find the RC peak in the spectra, see the details section. |
range |
How the search zone is defined. Either across the whole ppm axis ( |
ppm.value |
By default, the ppm value of the reference compound is set to 0, but any arbitrary value in the ppm interval of spectra can be used instead. |
direction |
If |
shiftHandling |
See the details section. |
c |
If |
pc |
If |
fromto.RC |
If |
ppm.ir |
If |
rowindex_graph |
If not |
verbose |
If |
Once the search zone is defined with range
, the RC is found depending on the method
. If method = "thres"
, RC is the first peak in the spectrum higher than a predefined threshold which is computed as: c*(cumulated_mean/cumulated_sd)
. If method = "max"
, the maximum intensity in the search zone is defined as the RC.
Since the spectra can be shifted differently, we need to handle misalignment of the left and right of the spectrum.
This can be illustrated here:
    | : TMSP peak

    before
      1  2  3  |  5  6  7  8  9
      1  2  3  4  5  |  7  8  9
      1  2  3  4  |  6  7  8  9

    shifted
     -5 -4 -3 -2 -1  0  1  2  3  4  5 : ppm scale
            1  2  3  |  5  6  7  8  9
      1  2  3  4  5  |  7  8  9
         1  2  3  4  |  6  7  8  9
The different shift handlings (shiftHandling
) are the following:
NAfilling
The extremities at which a spectrum is not defined are replaced by NA
. It is detected by WindowSelection
which produces a warning if there are NAs in the selected window.
     -5 -4 -3 -2 -1  0  1  2  3  4  5 : ppm scale
     NA NA  1  2  3  |  5  6  7  8  9
      1  2  3  4  5  |  7  8  9 NA NA
     NA  1  2  3  4  |  6  7  8  9 NA
zerofilling
The extremities at which a spectrum is not defined are replaced by 0
. It makes sense since in practice the spectrum is close to zero at the extremities.
     -5 -4 -3 -2 -1  0  1  2  3  4  5 : ppm scale
      0  0  1  2  3  |  5  6  7  8  9
      1  2  3  4  5  |  7  8  9  0  0
      0  1  2  3  4  |  6  7  8  9  0
circular
The spectra are shifted circularly, which means that the end of a spectrum is reproduced at the beginning. This makes sense because the spectrum, being the result of an FFT, is periodic.
     -5 -4 -3 -2 -1  0  1  2  3 : ppm scale
      8  9  1  2  3  |  5  6  7
      1  2  3  4  5  |  7  8  9
      9  1  2  3  4  |  6  7  8
cut
The ppm values for which some spectra are not defined are removed.
     -3 -2 -1  0  1  2  3 : ppm scale
      1  2  3  |  5  6  7
      3  4  5  |  7  8  9
      2  3  4  |  6  7  8
The difference between these shift handlings should not be critical in practice since the extremities of the spectra are not used most of the time and are removed in WindowSelection.
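Three of these shift handlings can be reproduced on a toy vector with the base-R sketch below. It shifts a single spectrum by a whole number of points on a fixed axis; the function name `shift_spectrum` and the argument `k` are assumptions made for the illustration, and the "cut" option is left out because it depends on the shifts of all the spectra together.

```r
# Shift a spectrum by k points (k > 0: to the right, k < 0: to the left).
shift_spectrum <- function(x, k,
                           shiftHandling = c("zerofilling", "NAfilling", "circular")) {
  shiftHandling <- match.arg(shiftHandling)
  m   <- length(x)
  idx <- seq_len(m) - k                  # source index of each output point
  if (shiftHandling == "circular") {
    return(x[(idx - 1) %% m + 1])        # wrap around the ends
  }
  fill <- if (shiftHandling == "zerofilling") 0 else NA_real_
  out  <- rep(fill, m)
  ok   <- idx >= 1 & idx <= m            # points that stay inside the axis
  out[ok] <- x[idx[ok]]
  out
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
shift_spectrum(x, 1, "circular")       # 9 1 2 3 4 5 6 7 8
shift_spectrum(x, -2, "zerofilling")   # 3 4 5 6 7 8 9 0 0
shift_spectrum(x, 1, "NAfilling")      # NA 1 2 3 4 5 6 7 8
```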
if rowindex_graph
is NULL
:
Spectrum_data |
The matrix of the spectral value in the ppm scale. |
if rowindex_graph
is not NULL
:
Spectrum_data |
The matrix of the spectral value in the ppm scale. |
plots |
The spectra that need to be plotted for inspection. |
Benoît Legat & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
    require(PepsNMRData)
    PpmConv.spec <- InternalReferencing(Data_HS_sp$Spectrum_data_HS_5, FidInfo_HS, shiftHandling = "zerofilling")
The function sets negative intensities to zero.
NegativeValuesZeroing(Spectrum_data, verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
verbose |
If |
As explained in BaselineCorrection
,
negative values do not make sense and can adversely affect our statistical analyses.
BaselineCorrection
does its best to avoid negative intensity values but there might be some remaining.
This filter simply sets them to zero.
After the BaselineCorrection
they should be close to zero anyway because of the high penalty given to negative values of the signal after the correction.
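The operation itself amounts to a single assignment. The lines below are a minimal base R sketch on a hypothetical real-valued matrix `spectra`; they only illustrate the principle and are not the package code.

```r
# Toy spectra with a few small negative intensities left after baseline
# correction (hypothetical values).
spectra <- rbind(S1 = c(0.2, -0.01, 3.5, -0.003),
                 S2 = c(-0.02, 1.1, 0, 2.4))

# Set every negative intensity to zero and leave the rest untouched.
zeroed <- spectra
zeroed[zeroed < 0] <- 0

print(zeroed)
```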
Spectrum_data |
The matrix of spectra with the negative values set to zero. |
Benoît Legat & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
require(PepsNMRData)
Nvz.spec <- NegativeValuesZeroing(Data_HS_sp$Spectrum_data_HS_7)
Spectra normalization to correct for the dilution factor common to all biofluid samples.
Normalization(Spectrum_data, type.norm, fromto.norm = c(3.05, 4.05), ref.norm = "median", returnFactor = FALSE, verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
type.norm |
Different types of normalization are available: |
fromto.norm |
Used if |
ref.norm |
The reference spectrum if |
returnFactor |
If |
verbose |
If |
Normalization of spectra before their warping or their statistical analysis is necessary in order to be able to efficiently compare their relative peak intensities.
It is therefore appropriate to call this filter at the end of the preprocessing workflow.
Normalization types can be:
mean
Each spectrum is divided by its mean so that its mean becomes 1.
median
Each spectrum is divided by its median so that its median becomes 1.
firstquartile
Each spectrum is divided by its first quartile so that its first quartile becomes 1.
peak
Each spectrum is divided by the value of its peak located in the interval fromto.norm, bounds included (i.e. the maximum value of the spectral intensities in that interval).
pqn
Probabilistic Quotient Normalization from Dieterle et al. (2006). If ref.norm is "median" or "mean", the median or the mean spectrum is used as the reference spectrum; if it is a single number, the spectrum located at that row of the spectral matrix is used; if ref.norm is a numeric vector of length equal to the number of spectral variables, it defines the reference spectrum manually.
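The normalization types listed above can be sketched in a few lines of base R. This is an illustrative sketch on toy data, not the code of Normalization: the objects `profile`, `dilution` and `spectra` and the helper `pqn` are hypothetical, and the PQN steps follow the usual description of the method (preliminary scaling, median reference spectrum, median of the quotients), which may differ in detail from the package.

```r
# Toy data: one common profile measured at three dilutions (hypothetical values).
profile  <- c(1, 4, 2, 8, 3, 6)
dilution <- c(1, 2, 0.5)
spectra  <- outer(dilution, profile)
colnames(spectra) <- c("4.5", "4", "3.5", "3", "2.5", "2")   # ppm scale

# mean / median: divide each row by its own mean / median.
norm_mean   <- spectra / rowMeans(spectra)
norm_median <- spectra / apply(spectra, 1, median)

# peak: divide each row by its maximum inside a ppm interval (bounds included).
ppm       <- as.numeric(colnames(spectra))
in_window <- ppm >= 3.05 & ppm <= 4.05
norm_peak <- spectra / apply(spectra[, in_window, drop = FALSE], 1, max)

# pqn with the median spectrum as reference.
pqn <- function(x) {
  x    <- x / rowMeans(x)             # preliminary scaling of each spectrum
  ref  <- apply(x, 2, median)         # median reference spectrum
  quot <- sweep(x, 2, ref, "/")       # quotients spectrum / reference
  x / apply(quot, 1, median)          # divide by the median quotient
}
norm_pqn <- pqn(spectra)

print(round(norm_pqn, 3))
```

Because the three toy spectra differ only by their dilution, every normalization type makes their rows identical.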
The choice of a proper normalisation method is a crucial although not straightforward step in a metabolomic analysis.
Applying CSN (constant sum normalization) is appropriate in the following situations:
When working on human/animal sera in the absence of serious pathology, given the homeostasis principle and since no dilution effect is present.
When working on biopsies, the “metabolome quantity” is set constant across the samples by adding a varying volume of a buffer and the same applies when working with cell media, where the quantity of cells is made constant.
To counteract all the dilution effects and the excretion differences between urine samples, the PQN approach is often recommended in the literature (Dieterle et al., 2006).
For any other situation (large difference between the groups, other kind of sample, etc.), the choice of the normalisation method is not straightforward. A solution is to refer to endogenous stable metabolites that are present in a constant quantity across samples and to use them as standards to normalize all spectral profiles. For urine samples, creatinine has been considered as such a standard (this option is also implemented in PepsNMR), even though it has been shown that the creatinine concentration can fluctuate depending on specific parameters (Tang et al., 2015). A review of normalization techniques for mass spectrometry metabolomics by Wu & Li (2016) provides some guidance on the choice of the normalization approach according to the type of sample analysed, and can be transposed to the normalisation of NMR spectra.
Spectrum_data |
The matrix of normalized spectra. |
Benoît Legat & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Wu, Y., & Li, L. (2016). Sample normalization methods in quantitative metabolomics. Journal of Chromatography A, 1430, 80-95.
Tang, K.W.A., Toh, Q.C., & Teo, B.W. (2015). Normalisation of urinary biomarkers to creatinine for clinical practice and research – when and why. Singapore Medical Journal, 56(1), 7-10.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
Dieterle, F., Ross, A., Schlotterbeck, G., & Senn, H. (2006). Probabilistic Quotient Normalization as Robust Method to Account for Dilution of Complex Biological Mixtures. Analytical Chemistry, 78(13), 4281-4290.
require(PepsNMRData)
Norm.spec <- Normalization(Data_HS_sp$Spectrum_data_HS_12, type.norm = "mean")
The function is a wrapper for all the preprocessing steps available in PepsNMR.
PreprocessingChain(Fid_data = NULL, Fid_info = NULL, data.path = NULL, readFids = TRUE,
                   groupDelayCorr = TRUE, solventSuppression = TRUE, apodization = TRUE,
                   zerofilling = TRUE, fourierTransform = TRUE, zeroOrderPhaseCorr = TRUE,
                   internalReferencing = TRUE, baselineCorrection = TRUE,
                   negativeValues0 = TRUE, warping = TRUE, windowSelection = TRUE,
                   bucketing = TRUE, regionRemoval = TRUE, zoneAggregation = TRUE,
                   normalization = TRUE, ..., export = FALSE,
                   format = c("Rdata", "csv", "txt"), out.path = ".",
                   filename = "filename", writeArg = c("none", "return", "txt"),
                   verbose = FALSE)
Fid_data |
If non |
Fid_info |
If non |
data.path |
A character string specifying the directory where the FIDs are searched. |
readFids |
If |
groupDelayCorr |
If |
solventSuppression |
If |
apodization |
If |
zerofilling |
If |
fourierTransform |
If |
zeroOrderPhaseCorr |
If |
internalReferencing |
If |
baselineCorrection |
If |
negativeValues0 |
If |
warping |
If |
windowSelection |
If |
bucketing |
If |
regionRemoval |
If |
zoneAggregation |
If |
normalization |
If |
... |
Other optional arguments of the above pre-processing functions. |
export |
If |
format |
Format chosen to export the spectral intensities and the acquisition parameters matrices. |
out.path |
Path used to export the spectral intensities and the acquisition parameters matrices if |
filename |
Name given to exported files. |
writeArg |
If not |
verbose |
If |
The function returns a list with the spectral intensities and the acquisition parameters matrices. If writeArg == "return", an additional list element (arguments) is returned.
Spectrum_data |
The pre-processed spectra. |
Fid_info |
The acquisition parameters. |
arguments |
The function arguments. |
Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
path <- system.file("extdata", package = "PepsNMRData")
data.path <- file.path(path, "HumanSerum")
res <- PreprocessingChain(Fid_data = NULL, Fid_info = NULL, data.path = data.path,
                          readFids = TRUE, type.norm = "mean", export = FALSE,
                          writeArg = "return")
Finds all directories of path which contain a valid FID (i.e. contain the files fid, acqu and acqus) and loads them in a matrix.
ReadFids(path, l = 1, subdirs = FALSE, dirs.names = FALSE, verbose = FALSE)
path |
A character string specifying the directory where the FIDs are searched. |
l |
A positive number indicating which line of the title file to use as spectra names. |
subdirs |
If |
dirs.names |
If |
verbose |
If |
The row names are the first line of the file "pdata/1/title" in the directory, or the directory name (and subdirectory name if subdirs == TRUE) if the title file does not exist or if the line l is blank. The column names are the time coordinates of the FID.
All the FIDs therefore need to have the same length and the same time interval between points.
Case 1: subdirs = FALSE
DIR1 => 1, 2, 3, ...
Case 2a: subdirs = TRUE
DIR1 => 1 ; DIR2 => 1 ; DIR3 => 1 ; ...
Case 2b: subdirs = TRUE
DIR1 => 1, 2, ... ; DIR2 => 1, 2, ... ; ...
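The validity test described above (a directory counts as a FID directory when it contains the files fid, acqu and acqus) can be sketched with base R file utilities. This is a hypothetical illustration on a temporary directory tree laid out as in Case 1; the helper `is_valid_fid_dir` is not part of the package and ReadFids may proceed differently.

```r
# Build a hypothetical Bruker-like tree: DIR1 => 1, 2, 3
root <- tempfile("DIR1_")
for (experiment in c("1", "2", "3")) {
  dir.create(file.path(root, experiment), recursive = TRUE)
}
# Only experiments 1 and 2 receive the three required files.
for (experiment in c("1", "2")) {
  file.create(file.path(root, experiment, c("fid", "acqu", "acqus")))
}

# A directory is considered valid when the three files are all present.
is_valid_fid_dir <- function(d) {
  all(file.exists(file.path(d, c("fid", "acqu", "acqus"))))
}

candidates <- list.dirs(root, recursive = FALSE)
valid      <- candidates[vapply(candidates, is_valid_fid_dir, logical(1))]

print(basename(valid))
```

Experiment 3 is skipped because it lacks the required files.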
Returns a list with the FIDs and their related information.
Fid_data |
The matrix containing the FIDs. |
Fid_info |
A matrix containing the information about the FIDs.
The rows are named in the same way as those of the FID matrix. The columns are:
|
Benoît Legat & Manon Martin
path <- system.file("extdata", package = "PepsNMRData")
dir(path)
fidList_HS <- ReadFids(file.path(path, "HumanSerum"))
FidData_HS_0 <- fidList_HS[["Fid_data"]]
FidInfo_HS <- fidList_HS[["Fid_info"]]
Removes the non-informative regions by setting the values of the spectra in these intervals to zero.
RegionRemoval(Spectrum_data, typeofspectra = c("manual", "serum", "urine"), type.rr = c( "zero", "NA"), fromto.rr = list(Water = c(4.5, 5.1)), verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
typeofspectra |
Type of spectra, if not |
type.rr |
Type of region removal method. If |
fromto.rr |
List containing the extremities of the intervals to be removed. |
verbose |
If |
The presence of non-informative regions can strongly bias the subsequent statistical analysis.
The inclusive ppm interval fromto.rr is set to zero or filled with NAs for every spectrum.
The ppm scale can be increasing or decreasing (i.e. from < to or from > to).
With typeofspectra = "manual", the areas to be removed are specified manually through fromto.rr. Otherwise they are set automatically: for typeofspectra = "serum", water (4.5 - 5.1 ppm), and for typeofspectra = "urine", water, urea and maleic acid (4.5 - 6.1 ppm).
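The removal of an inclusive ppm interval can be sketched as follows. This is a minimal base R illustration, assuming that the ppm values are stored in the column names of the spectral matrix; the helper `remove_region` is hypothetical and is not the RegionRemoval code.

```r
# Toy spectra on a decreasing ppm scale from 10 to 0 in steps of 0.5.
ppm     <- (20:0) / 2
spectra <- matrix(1, nrow = 2, ncol = length(ppm),
                  dimnames = list(c("S1", "S2"), as.character(ppm)))

# Fill every interval of `fromto` (bounds included) with `fill`.
# min() / max() make the helper independent of the order of the bounds.
remove_region <- function(x, fromto, fill = 0) {
  scale <- as.numeric(colnames(x))
  for (interval in fromto) {
    inside <- scale >= min(interval) & scale <= max(interval)
    x[, inside] <- fill
  }
  x
}

zeroed <- remove_region(spectra, list(Water = c(4.5, 5.1)))
with_na <- remove_region(spectra, list(Water = c(4.5, 5.1)), fill = NA)

print(zeroed[, c("5.5", "5", "4.5", "4")])
```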
Spectrum_data |
The matrix of spectra with the removed regions. |
Benoît Legat & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
# Remove the lactate and water regions for serum spectra
require(PepsNMRData)
fromto <- list(Water = c(4.5, 5.1), Lactate = c(1.32, 1.36))
Rr.spec <- RegionRemoval(Data_HS_sp$Spectrum_data_HS_11, fromto.rr = fromto)
Signal smoothing for water residuals resonance removal.
SolventSuppression(Fid_data, lambda.ss = 1e6, ptw.ss = TRUE, returnSolvent = FALSE, verbose = FALSE)
Fid_data |
Matrix containing the FIDs, one row per signal, as outputted by |
lambda.ss |
Penalty on roughness used to calculate the smoothed version of the FID. The higher lambda is, the smoother the estimated solvent signal will be. |
ptw.ss |
If |
returnSolvent |
If |
verbose |
If |
FIDs usually present a wavy shape. Under the assumption that water is the main compound of the analyzed samples, its signal can be modelled by the smoothing of the FIDs. We then subtract this wave, i.e. the solvent residuals resonance signal, from the original FIDs.
The smoothing is done with a Whittaker smoother, which is obtained by the minimization of $S + \lambda R$ (in the notation of Eilers, 2003), where $S$ is the sum of the squared differences between the original and the smoothed signal and $R$ measures the roughness of the estimated signal.
The larger $\lambda$ (argument lambda.ss) is, the smoother the solvent residuals resonance signal.
Eilers (2003) and Frasso & Eilers (2015) suggest different ways to tune $\lambda$ in order to optimise the smoothing: either visually, by cross-validation or using the V-curve procedure.
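The penalized least squares problem behind the Whittaker smoother has a closed-form solution, which the following base R sketch illustrates on a real-valued toy signal. It is not the SolventSuppression code: the function `whittaker`, the second-order difference penalty and the dense matrix solve are assumptions made for illustration (the package works on complex FIDs and returns real and imaginary solvent parts).

```r
# Whittaker smoother: minimise sum((y - z)^2) + lambda * sum((D z)^2),
# where D takes d-th order differences. The solution solves
# (I + lambda * D'D) z = y.
whittaker <- function(y, lambda, d = 2) {
  n <- length(y)
  D <- diff(diag(n), differences = d)
  solve(diag(n) + lambda * crossprod(D), y)
}

# Toy signal: a slow decay (standing in for the solvent wave) plus a fast
# oscillation (standing in for the signal of interest).
t    <- seq(0, 1, length.out = 200)
slow <- exp(-2 * t)
fast <- 0.2 * sin(2 * pi * 40 * t)
fid  <- slow + fast

solvent   <- whittaker(fid, lambda = 1e4)   # smooth estimate of the wave
corrected <- fid - solvent                  # signal with the wave removed

roughness <- function(z) sum(diff(z, differences = 2)^2)
print(c(original = roughness(fid), smoothed = roughness(solvent)))
```

Raising `lambda` gives a smoother estimate; with `lambda = 0` the input is returned unchanged.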
If returnSolvent = TRUE, the function returns a list with the following elements: Fid_data, SolventRe and SolventIm. Otherwise, it just returns Fid_data.
Fid_data |
The matrix of FIDs with the solvent residuals signal removed. |
SolventRe |
The real part of the solvent signal. |
SolventIm |
The imaginary part of the solvent signal. |
Benoît Legat, Manon Martin & Paul H. C. Eilers
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Frasso, G., & Eilers, P.H.C. (2015). L-and V-curves for optimal smoothing. Statistical Modelling, 15(1), 91-111.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy. PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium.
Eilers, P.H.C. (2003). A perfect smoother. Analytical Chemistry, 75(14), 3631-3636.
See also BaselineCorrection
which also uses the Whittaker smoother.
require(PepsNMRData)
Ss.fid <- SolventSuppression(Data_HS_sp$FidData_HS_1, returnSolvent = FALSE)
# or
Ss.res <- SolventSuppression(Data_HS_sp$FidData_HS_1, returnSolvent = TRUE)
Ss.fid <- Ss.res[["Fid_data"]]
SolventRe <- Ss.res[["SolventRe"]]
plot(SolventRe[1, ], type = "l")
Warps the frequency x-axis to minimize the pairwise distance between a sample spectrum and a reference spectrum.
Warping(Spectrum_data,
        normalization.type = c("median", "mean", "firstquartile", "peak", "none"),
        fromto.normW = c(3.05, 4.05),
        reference.choice = c("fixed", "before", "after", "manual"),
        reference = 1, optim.crit = c("RMS", "WCC"), ptw.wp = FALSE, K = 3, L = 40,
        lambda.smooth = 0, deg = 3, lambda.bspline = 0.01, kappa = 0.0001,
        max_it_Bspline = 10, returnReference = FALSE, returnWarpFunc = FALSE,
        verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
normalization.type |
Type of normalization applied to the spectra prior to warping.
See |
fromto.normW |
Used by |
reference.choice |
Specifies how the reference will be chosen:
|
reference |
The row number or name of the reference spectrum when |
optim.crit |
If |
ptw.wp |
If set to |
K |
It is the degree of the polynomial used for the warping (see details). |
L |
This is the number of B-splines that are used for the warping.
It should be either 0 or greater than |
lambda.smooth |
Nonnegative coefficient for the smoothing |
deg |
Degree of the B-splines. |
lambda.bspline |
Nonnegative second-order smoothness penalty coefficient for the B-splines warping. See the reference for more details. |
kappa |
Nonnegative ridge (zero-order) penalty coefficient for the B-splines warping. See the reference for more details. |
max_it_Bspline |
Maximum number of iterations for the B-splines warping. |
returnReference |
If |
returnWarpFunc |
If |
verbose |
If |
When reference.choice is "after", the spectrum with the minimum sum is taken as the reference and the spectra warped according to this reference (which have already been calculated at this stage) are returned. This is n times slower than the two other choices, where n is the number of spectra.
Principle:
We try to find a warping function between a reference spectrum and a sample.
This function is the sum of a polynomial of degree K and of L B-splines of degree deg.
The unknowns are the polynomial and B-spline coefficients.
No warping is equivalent to warping with the identity polynomial and with all the B-spline coefficients equal to 0. See the reference for details.
First, the polynomial is estimated on the reference and the sample, both smoothed with parameter lambda.smooth.
The B-splines are then estimated on the non-smoothed reference and sample, using the polynomial just found.
The higher lambda.bspline and kappa are, the less flexible the warping function will be.
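The role of the polynomial part of the warping function can be illustrated with a short base R sketch. It applies a warping polynomial with known coefficients to a toy sample and checks the distance to the reference; it does not estimate the coefficients and leaves out the B-spline part, so it is only an illustration of the principle. The names `peak`, `warp`, `rms` and the coefficient vectors are hypothetical.

```r
# Toy spectra: the sample peak is shifted by 6 points from the reference peak.
x    <- seq_len(200)
peak <- function(center) exp(-0.5 * ((x - center) / 4)^2)
reference   <- peak(100)
sample_spec <- peak(106)

# Evaluate the sample at the warped positions
# w(x) = coef[1] + coef[2] * x + coef[3] * x^2 + ... (linear interpolation).
warp <- function(y, coef) {
  w <- drop(outer(x, seq_along(coef) - 1, "^") %*% coef)
  approx(x, y, xout = w, rule = 2)$y
}

rms <- function(a, b) sqrt(mean((a - b)^2))

identity_coef <- c(0, 1, 0, 0)   # w(x) = x: no warping
shift_coef    <- c(6, 1, 0, 0)   # w(x) = x + 6: undoes the 6-point shift

print(c(before = rms(sample_spec, reference),
        after  = rms(warp(sample_spec, shift_coef), reference)))
```

With the identity polynomial the sample is returned unchanged, which matches the "no warping" case described above.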
If returnReference = TRUE, the function will return the name of the reference spectrum and if returnWarpFunc = TRUE, it will also return the warping functions.
Spectrum_data |
The warped spectra. |
Reference |
The name of the reference spectrum. |
Warpingfunc |
The warping functions. |
Benoît Legat, Manon Martin & Paul H. C. Eilers
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
require(PepsNMRData)
Warp.spec <- Warping(Data_HS_sp$Spectrum_data_HS_8, reference.choice = "fixed",
                     reference = row.names(Data_HS_sp$Spectrum_data_HS_8)[1],
                     returnReference = FALSE)
# or
Warp.res <- Warping(Data_HS_sp$Spectrum_data_HS_8, reference.choice = "fixed",
                    reference = row.names(Data_HS_sp$Spectrum_data_HS_8)[1],
                    returnReference = TRUE)
Warp.spec <- Warp.res[["Spectrum_data"]]
Warp.res[["Reference"]]
Selects an interval in the ppm scale and returns the value of the spectra in that interval.
WindowSelection(Spectrum_data, from.ws = 10, to.ws = 0.2, verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
from.ws |
The left ppm value of the interval. A typical value is 10. If NULL, default value is the first index without NA. |
to.ws |
The right ppm value of the interval. A typical value is 0.2. If NULL, default value is the last index without NA. |
verbose |
If |
If from.ws and/or to.ws are not specified, they are calculated so that the largest window without NA is kept.
Those NAs are typically produced by the InternalReferencing function.
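Both behaviours (explicit bounds and largest window without NA) can be sketched in base R. This is a hypothetical illustration that assumes the ppm values are stored in the column names and that the NAs sit only at the extremities of the spectra; it is not the WindowSelection code.

```r
# Toy spectra on a decreasing ppm scale from 12 to -2, with NAs at the
# extremities such as those left by a shift of the spectra.
ppm     <- seq(12, -2, by = -1)
spectra <- matrix(seq_along(ppm), nrow = 2, ncol = length(ppm), byrow = TRUE,
                  dimnames = list(c("S1", "S2"), as.character(ppm)))
spectra["S1", 1:2]         <- NA
spectra["S2", length(ppm)] <- NA

# Largest window without NA: from the first to the last complete column.
complete <- which(colSums(is.na(spectra)) == 0)
largest  <- spectra[, min(complete):max(complete), drop = FALSE]

# Explicit window, e.g. from 10 ppm down to 0.2 ppm.
keep   <- ppm <= 10 & ppm >= 0.2
window <- spectra[, keep, drop = FALSE]

print(range(as.numeric(colnames(largest))))
print(range(as.numeric(colnames(window))))
```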
Spectrum_data |
The matrix of spectral values in the specified interval. |
Benoît Legat & Manon Martin
require(PepsNMRData)
# The interval is chosen so that we have the largest interval without NA
Ws.spec <- WindowSelection(Data_HS_sp$Spectrum_data_HS_9)
# or
Ws.spec <- WindowSelection(Data_HS_sp$Spectrum_data_HS_9, from.ws = 10, to.ws = 0.2)
The function applies zero filling to the FIDs.
ZeroFilling(Fid_data, fn = ncol(Fid_data), verbose = FALSE)
Fid_data |
Matrix containing the FIDs, one row per signal, as outputted by |
fn |
Number of zeros to be added. |
verbose |
If |
Zero filling does not improve the spectral resolution but leads to better visually defined lines in the spectra.
During zero filling, fn zeros are appended at the end of the FIDs. This number is rounded to the nearest 2^x value to ease the upcoming Fourier transform of the FIDs.
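The effect of zero filling on the spectrum can be checked with a short base R sketch. It uses a hypothetical single-resonance toy FID of 512 points and appends as many zeros, so no rounding of fn is involved; it is not the ZeroFilling code.

```r
# Toy FID: one decaying complex exponential sampled on 512 points.
n   <- 512
t   <- seq_len(n) - 1
fid <- exp(-t / 100) * exp(2i * pi * 0.1003 * t)

# Append fn zeros at the end of the FID (here fn = n, a power of two).
fn     <- n
filled <- c(fid, rep(0, fn))

spec        <- fft(fid)
spec_filled <- fft(filled)

# Twice as many points describe the same spectrum: every other point of the
# zero-filled spectrum coincides with a point of the original one, and the
# points in between are interpolated.
print(c(original = length(spec), zero_filled = length(spec_filled)))
print(all.equal(spec_filled[seq(1, 2 * n, by = 2)], spec))
```

The additional points only interpolate the existing spectrum, which is why the lines look better defined while the resolution itself is unchanged.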
Fid_data |
The zero-filled FIDs. |
Manon Martin
require(PepsNMRData)
ZF_fid <- ZeroFilling(Data_HS_sp$FidData_HS_3, fn = ncol(Data_HS_sp$FidData_HS_3))
The function corrects the spectra in order to have their real part in an absorptive mode.
ZeroOrderPhaseCorrection(Spectrum_data, type.zopc = c("rms", "manual", "max"),
                         plot_rms = NULL, returnAngle = FALSE, createWindow = TRUE,
                         angle = NULL, plot_spectra = FALSE, ppm.zopc = TRUE,
                         exclude.zopc = list(c(5.1, 4.5)), verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
type.zopc |
Method used to select the angles to rotate the spectra. See details. |
plot_rms |
Contains a vector of row names for which a debug plot should be made showing the value of the function we try to minimize as a function of the phase. |
returnAngle |
If |
createWindow |
If |
angle |
If not |
plot_spectra |
If |
ppm.zopc |
If |
exclude.zopc |
If not |
verbose |
If |
We focus our optimization on the positiveness of the real part, which should be in absorptive mode.
When type.zopc is "rms", a positiveness criterion is measured for each spectrum. "manual" is used when a vector of angles is specified in angle, and "max" optimizes the maximum spectral intensity in the non-excluded window(s). Beware that if exclude.zopc is not NULL, the optimization only considers the non-excluded spectral window(s).
By default the water region (5.1 - 4.5 ppm) is ignored.
BaselineCorrection and NegativeValuesZeroing will take care of the last negative values of the real part of the spectra. See the reference for more details.
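The idea of rotating a spectrum until its real part is as positive as possible can be sketched in base R. The criterion `positiveness` and the grid search below are hypothetical stand-ins chosen for illustration; the criterion and optimizer used by ZeroOrderPhaseCorrection may differ, and no spectral window is excluded here.

```r
# Toy spectrum: one complex Lorentzian line whose real part is purely
# absorptive (positive), then spoiled by a known zero-order phase error.
freq  <- seq(-50, 50, length.out = 1001)
ideal <- 1 / (1 + 1i * freq)
phi0  <- 1.1                                  # hypothetical phase error (rad)
spec  <- ideal * exp(1i * phi0)

# Hypothetical positiveness criterion: share of the squared real
# intensities carried by positive values (1 means fully positive).
positiveness <- function(z) {
  re <- Re(z)
  sum(pmax(re, 0)^2) / sum(re^2)
}

# Grid search over candidate rotation angles in [0, 2 * pi].
angles <- seq(0, 2 * pi, length.out = 721)
crit   <- vapply(angles, function(a) positiveness(spec * exp(1i * a)),
                 numeric(1))
best   <- angles[which.max(crit)]

# Rotate the spectrum by the selected angle.
corrected <- spec * exp(1i * best)

print(c(selected = best, expected = 2 * pi - phi0))
```

The selected angle approximately cancels the phase error, and the real part of `corrected` is positive over the whole window.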
Spectrum_data |
The matrix of rotated spectra. |
Benoît Legat & Manon Martin
Martin, M., Legat, B., Leenders, J., Vanwinsberghe, J., Rousseau, R., Boulanger, B., & Govaerts, B. (2018). PepsNMR for 1H NMR metabolomic data pre-processing. Analytica chimica acta, 1019, 1-13.
Rousseau, R. (2011). Statistical contribution to the analysis of metabonomics data in 1H NMR spectroscopy (Doctoral dissertation, PhD thesis. Institut de statistique, biostatistique et sciences actuarielles, Université catholique de Louvain, Belgium).
require(PepsNMRData)
Zopc.res <- ZeroOrderPhaseCorrection(Data_HS_sp$Spectrum_data_HS_4, ppm.zopc = FALSE,
                                     exclude.zopc = list(c(5000, 15000)))
The function replaces the values in the specified intervals by triangular-shaped peaks with the same area as the original peaks.
ZoneAggregation(Spectrum_data, fromto.za = list(Citrate = c(2.5, 2.7)), verbose = FALSE)
Spectrum_data |
Matrix containing the spectra in ppm, one row per spectrum. |
fromto.za |
List containing the borders in ppm of the intervals to aggregate. |
verbose |
If |
The interval is specified in the unit of the column names (which should be ppm). This aggregation is usually performed on urine samples that contain citrate.
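The replacement of a zone by a triangular peak of equal area can be sketched as follows. This is a base R illustration under two assumptions: the ppm step is constant, so that the area is proportional to the sum of the intensities, and the triangle is symmetric. The helper `aggregate_zone` is hypothetical and is not the ZoneAggregation code.

```r
# Toy spectrum on a decreasing ppm scale from 3 to 2 in steps of 0.05,
# with an irregular multiplet between 2.5 and 2.7 ppm.
ppm      <- (60:40) / 20
spectrum <- rep(1, length(ppm))
inside   <- ppm >= 2.5 & ppm <= 2.7
spectrum[inside] <- c(2, 5, 1, 6, 3)

# Replace the points of the zone by a symmetric triangle with the same sum.
aggregate_zone <- function(y, inside) {
  idx  <- which(inside)
  m    <- length(idx)
  area <- sum(y[idx])
  w    <- pmin(seq_len(m), rev(seq_len(m)))   # 1, 2, ..., peak, ..., 2, 1
  y[idx] <- area * w / sum(w)
  y
}

aggregated <- aggregate_zone(spectrum, inside)

print(round(aggregated[inside], 3))
print(c(before = sum(spectrum[inside]), after = sum(aggregated[inside])))
```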
Spectrum_data |
The matrix of spectra with their zone aggregated. |
Benoît Legat & Manon Martin
require(PepsNMRData)
Spectrum_data <- ZoneAggregation(Data_HU_sp$Spectrum_data_HU_12,
                                 fromto.za = list(Citrate = c(2.5, 2.7)))