The MsCoreUtils
package defines low-level functions for mass
spectrometry data and is independent of any high-level data structures
[@rainer_modular_2022]. These functions
include mass spectra processing functions (noise estimation, smoothing,
binning), quantitative aggregation functions (median polish, robust
summarisation, …), missing data imputation, data normalisation
(quantiles, vsn, …), as well as miscellaneous helper functions that are used
across the high-level data structures within the R for Mass Spectrometry
packages.
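As a rough illustration of the imputation and normalisation helpers mentioned above, the following sketch applies `impute_matrix()` and `normalize_matrix()` to a small matrix. The toy matrix `m` and the choice of the `"zero"` and `"center.median"` methods are assumptions made for the example only; `imputeMethods()` and `normalizeMethods()` list the methods that are actually available.

```r
library(MsCoreUtils)

## Hypothetical toy quantitative matrix: 4 features (rows) x 3 samples
## (columns), with two missing values.
m <- matrix(c(10, 12, NA, 9,
              11, NA, 14, 10,
              12, 13, 15, 11),
            nrow = 4,
            dimnames = list(paste0("f", 1:4), paste0("s", 1:3)))

## Replace missing values by 0 (one of the simplest available methods).
m_imp <- impute_matrix(m, method = "zero")

## Centre each column (sample) on its median.
m_norm <- normalize_matrix(m_imp, method = "center.median")
m_norm
```

Zero imputation is used here only because it needs no additional packages; it is not a recommendation for real data.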
For a full list of functions, see
## Warning: multiple methods tables found for 'union'
## Warning: multiple methods tables found for 'intersect'
## Warning: multiple methods tables found for 'setdiff'
## Warning: multiple methods tables found for 'setequal'
## [1] "%between%" "aggregate_by_matrix"
## [3] "aggregate_by_vector" "asInteger"
## [5] "between" "bin"
## [7] "breaks_ppm" "closest"
## [9] "coefMA" "coefSG"
## [11] "coefWMA" "colCounts"
## [13] "colMeansMat" "colSumsMat"
## [15] "common" "common_path"
## [17] "entropy" "estimateBaseline"
## [19] "estimateBaselineConvexHull" "estimateBaselineMedian"
## [21] "estimateBaselineSnip" "estimateBaselineTopHat"
## [23] "force_sorted" "formatRt"
## [25] "getImputeMargin" "gnps"
## [27] "group" "i2index"
## [29] "imputeMethods" "impute_MinDet"
## [31] "impute_MinProb" "impute_QRILC"
## [33] "impute_RF" "impute_bpca"
## [35] "impute_fun" "impute_knn"
## [37] "impute_matrix" "impute_min"
## [39] "impute_mixed" "impute_mle"
## [41] "impute_neighbour_average" "impute_with"
## [43] "impute_zero" "isPeaksMatrix"
## [45] "join" "join_gnps"
## [47] "localMaxima" "maxi"
## [49] "medianPolish" "navdist"
## [51] "ndotproduct" "nentropy"
## [53] "neuclidean" "noise"
## [55] "normalizeMethods" "normalize_matrix"
## [57] "nspectraangle" "ppm"
## [59] "rbindFill" "refineCentroids"
## [61] "rla" "robustSummary"
## [63] "rowRla" "rt2character"
## [65] "rt2numeric" "smooth"
## [67] "sumi" "validPeaksMatrix"
## [69] "valleys" "vapply1c"
## [71] "vapply1d" "vapply1l"
## [73] "which.first" "which.last"
or the reference page on the package webpage.
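A listing like the one above can be obtained by attaching the package and listing the names it exports. This is a minimal sketch; the variable name `fns` is only illustrative.

```r
library(MsCoreUtils)

## List all objects exported by the attached package.
fns <- ls("package:MsCoreUtils")
fns
```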
The functions defined in this package utilise basic classes with the aim of being reused in packages that provide a more formal, high-level interface.
As an example, let’s take the robustSummary()
function,
which calculates a robust summary of the columns of a matrix:
## a b c d e f g
## A -0.92180684 0.5223034 -0.2358269 -2.1467099 -0.8221625 2.2228530 0.9694077
## B -0.01892672 0.6637892 -1.2126325 0.5120246 1.2407720 -0.1911302 0.4464932
## C -1.08724183 -1.1702591 1.4807216 0.5134675 0.7246756 -1.6342862 0.3668191
## h i j
## A 0.4380956 -0.3520584 -0.9330006
## B 1.3594074 -0.9588434 1.3680996
## C 1.8720488 0.4458023 -0.4076399
## a b c d e f
## -0.675991795 0.033090251 0.103145685 -0.235588978 0.381095049 -0.492666467
## g h i j
## 0.594240006 1.223183916 -0.288366461 0.009153058
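Output of the kind shown above (first the input matrix, then one summarised value per column) could be produced along the following lines. This is a sketch under assumptions: the data are random draws and the seed is arbitrary, so the values will differ from those printed above.

```r
library(MsCoreUtils)

## Assumption: random example data; the seed is hypothetical and only
## makes this sketch reproducible.
set.seed(123)
x <- matrix(rnorm(30), nrow = 3,
            dimnames = list(LETTERS[1:3], letters[1:10]))
x

## One robust summary value per column of x.
res <- robustSummary(x)
res
```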
This function is typically used to summarise peptide quantitation values into protein intensities[^1]. This functionality is available in
the MSnbase::combineFeatures()
function for MSnSet
objects and
the QFeatures::aggregateFeatures()
function for QFeatures
objects.
If you would like to contribute any low-level functionality, please open a GitHub issue to discuss it. Please note that any contributions should follow the style guide and will require an appropriate unit test.
If you wish to reuse any functions in this package, please just go ahead. If you would like advice or help, please open a GitHub issue.
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] MsCoreUtils_1.19.0 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] cli_3.6.3 knitr_1.49 rlang_1.1.4
## [4] xfun_0.49 generics_0.1.3 jsonlite_1.8.9
## [7] clue_0.3-66 S4Vectors_0.45.1 buildtools_1.0.0
## [10] htmltools_0.5.8.1 maketools_1.3.1 sys_3.4.3
## [13] sass_0.4.9 stats4_4.4.2 rmarkdown_2.29
## [16] evaluate_1.0.1 jquerylib_0.1.4 MASS_7.3-61
## [19] fastmap_1.2.0 yaml_2.3.10 lifecycle_1.0.4
## [22] BiocManager_1.30.25 cluster_2.1.6 compiler_4.4.2
## [25] digest_0.6.37 R6_2.5.1 bslib_0.8.0
## [28] tools_4.4.2 BiocGenerics_0.53.2 cachem_1.1.0
[^1]: See Sticker et al. Robust summarization and inference in proteome-wide label-free quantification. https://doi.org/10.1101/668863.