Title: | Maximum rank reproducibility |
---|---|
Description: | marr (Maximum Rank Reproducibility) is a nonparametric approach that detects reproducible signals using a maximal rank statistic for high-dimensional biological data. This R package implements functions that measure the reproducibility of features per sample pair and of sample pairs per feature in high-dimensional biological replicate experiments. The plot functions in this package draw histograms of the reproducibility of features per sample pair and of sample pairs per feature. Users can inspect these histograms to select filtering threshold values for identifying reproducible features and sample pairs. The package also provides the subset of data filtered by reproducible features and/or sample pairs. |
Authors: | Tusharkanti Ghosh [aut, cre], Max McGrath [aut], Daisy Philtron [aut], Katerina Kechris [aut], Debashis Ghosh [aut, cph] |
Maintainer: | Tusharkanti Ghosh <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.17.0 |
Built: | 2024-11-29 08:27:29 UTC |
Source: | https://github.com/bioc/marr |
Class union allowing the MarrData slot to be a data.frame or a SummarizedExperiment.
This function applies an Rcpp-based implementation of a computationally efficient method for assessing reproducibility in high-throughput experiments, called the Marr procedure. This function also defines the Marr class and constructor.
Marr(object, pSamplepairs = 0.75, pFeatures = 0.75, alpha = 0.05, featureVars = NULL)
object |
an object which is a |
pSamplepairs |
(Optional) a threshold value that lies between 0 and 1, used to assign a feature to be reproducible based on the reproducibility output of the sample pairs per feature. Default is 0.75. |
pFeatures |
(Optional) a threshold value that lies between 0 and 1, used to assign a sample pair to be reproducible based on the reproducibility output of the features per sample pair. Default is 0.75. |
alpha |
(Optional) level of significance to control the False Discovery Rate (FDR). Default is 0.05. |
featureVars |
(Optional) Vector of the columns which identify features. If a 'SummarizedExperiment' is used for 'object', row variables will be used. |
marr (Maximum Rank Reproducibility) is a nonparametric approach that assesses reproducibility in high-dimensional biological replicate experiments. Although it was originally developed for RNA-Seq data, it can be applied to many other kinds of high-dimensional biological data, including mass spectrometry-based metabolomics and ChIP-Seq. The Marr procedure uses a maximum rank statistic to separate reproducible signals from noise without making any distributional assumptions about the reproducible signals. Because it works on a rank scale, the procedure can be applied to a variety of measurement types.
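As a rough illustration of the maximum rank idea for a single pair of replicates, the base R sketch below ranks each feature within each replicate and keeps the larger of the two ranks. The vectors `rep1` and `rep2` are made-up values, and treating rank 1 as the strongest signal is an assumption. This is not the package's Rcpp implementation, and it leaves out the step that decides which features count as reproducible.

```r
# Hypothetical measurements of five features in two replicate samples
rep1 <- c(9.1, 7.4, 3.2, 5.0, 1.1)
rep2 <- c(8.7, 2.9, 6.5, 4.8, 0.9)

# Rank within each replicate; negating makes the largest value rank 1
# (assumed convention for this sketch)
r1 <- rank(-rep1)
r2 <- rank(-rep2)

# Maximum rank per feature: small only when the feature ranks
# highly in both replicates
max_rank <- pmax(r1, r2)
print(max_rank)
```

A feature that ranks highly in one replicate but poorly in the other gets a large maximum rank, which is why the statistic is sensitive to disagreement between replicates.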
This function computes the distributions of percent reproducible sample pairs (row-wise) per feature and percent reproducible features (column-wise) per sample pair. It also computes the percent of reproducible sample pairs and features based on a threshold value. See the vignette for more details.
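The thresholding step can be pictured with a small base R sketch. The vector `pct_pairs_per_feature` is hypothetical, and both the 0 to 100 percent scale and the strict comparison against the threshold are assumptions made for illustration; the package's own computation may differ in these details.

```r
# Hypothetical percent of reproducible sample pairs for five features
pct_pairs_per_feature <- c(90, 40, 80, 60, 10)

# Threshold on a 0-1 scale, as in the pSamplepairs argument
p_samplepairs <- 0.75

# Assumption: a feature counts as reproducible when its percentage
# exceeds the threshold expressed in percent
is_reproducible <- pct_pairs_per_feature > 100 * p_samplepairs

# Percent of features that pass the threshold
pct_reproducible_features <- 100 * sum(is_reproducible) / length(is_reproducible)
print(pct_reproducible_features)
```

Raising the threshold can only shrink the set of features that pass, so the histograms of these percentages are a natural guide for choosing it.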
An object of the class Marr that contains a numeric vector of the Marr sample pairs in the MarrSamplepairs slot, a numeric vector of the Marr features in the MarrFeatures slot, a numeric value of the Marr filtered features in the MarrSamplepairsfiltered slot, and a numeric value of the Marr filtered sample pairs in the MarrFeaturesfiltered slot.
Philtron, D., Lyu, Y., Li, Q. and Ghosh, D., 2018. Maximum Rank Reproducibility: A Nonparametric Approach to Assessing Reproducibility in Replicate Experiments. Journal of the American Statistical Association, 113(523), pp.1028-1039.
# Simulated data: 200 features (rows) by 12 samples (columns)
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)

# Example dataset shipped with the package
data("msprepCOPD")
data_Marr_COPD <- Marr(object = msprepCOPD, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
Objects of this class store needed information to work with a Marr object
MarrSamplepairs returns the distribution of percent reproducible features (column-wise) per sample pair, MarrFeatures returns the distribution of percent reproducible sample pairs (row-wise) per feature, MarrSamplepairsfiltered returns the percent of reproducible features based on a threshold value, and MarrFeaturesfiltered returns the percent of reproducible sample pairs based on a threshold value.
MarrSamplepairs
Marr sample pairs
MarrFeatures
Marr features
MarrSamplepairsfiltered
Marr sample pairs post filtering
MarrFeaturesfiltered
Marr metabolites post filtering
MarrData
Original data object passed to Marr
MarrPSamplepairs
Value of pSamplepairs argument passed to Marr
MarrPFeatures
Value of pFeatures argument passed to Marr
MarrAlpha
Value of alpha argument passed to Marr
MarrFeatureVars
Value of featureVars passed to Marr. NULL if featureVars was left blank
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
Given a Marr object, this function returns the Marr Alpha
Accessors for the 'MarrAlpha' slot of a Marr object.
MarrAlpha(object)

## S4 method for signature 'Marr'
MarrAlpha(object)
object |
an object of class |
Value of alpha argument passed to Marr
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrAlpha(data_Marr)
Given a Marr object, this function returns the Marr Data
Accessors for the 'MarrData' slot of a Marr object.
MarrData(object)

## S4 method for signature 'Marr'
MarrData(object)
object |
an object of class |
Original data object passed to Marr
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrData(data_Marr)
Given a Marr object, this function returns the Marr features
Accessors for the 'MarrFeatures' slot of a Marr object.
MarrFeatures(object)

## S4 method for signature 'Marr'
MarrFeatures(object)
object |
an object of class |
The distribution of percent reproducible sample pairs (row-wise) per feature after applying the maximum rank reproducibility.
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrFeatures(data_Marr)
Given a Marr object, this function returns the Marr filtered features
Accessors for the 'MarrFeaturesfiltered' slot of a Marr object.
MarrFeaturesfiltered(object)

## S4 method for signature 'Marr'
MarrFeaturesfiltered(object)
object |
an object of class |
The percent of reproducible sample pairs based on a threshold value after applying maximum rank reproducibility.
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrFeaturesfiltered(data_Marr)
Given a Marr object, this function returns the Marr Feature Vars
Accessors for the 'MarrFeatureVars' slot of a Marr object.
MarrFeatureVars(object)

## S4 method for signature 'Marr'
MarrFeatureVars(object)
object |
an object of class |
Value of featureVars passed to Marr. NULL if featureVars was left blank
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrFeatureVars(data_Marr)
Filters a Marr object according to the Maximum Rank Reproducibility of the features, sample pairs, or both. Features are removed if their reproducibility per sample pair is less than pFeatures. Samples are removed if their sample pair reproducibility per feature is less than pSamplepairs for all pairings of that sample with the other samples in the set.
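The sample-removal rule (a sample is dropped only when every pair it belongs to falls below the threshold) can be sketched in base R as follows. The sample names, the `pair_pct` values, and the 0 to 100 percent scale are all hypothetical; this is an illustration of the rule as worded above, not the code used by MarrFilterData.

```r
# All pairs among four hypothetical samples, one pair per row
samples <- c("S1", "S2", "S3", "S4")
pairs <- t(combn(samples, 2))

# Hypothetical reproducibility (in percent) for each pair, in row order:
# S1-S2, S1-S3, S1-S4, S2-S3, S2-S4, S3-S4
pair_pct <- c(90, 85, 30, 80, 20, 10)

# Threshold on a 0-1 scale (assumed to be compared against percent / 100)
p_samplepairs <- 0.75
pair_ok <- pair_pct >= 100 * p_samplepairs

# Keep a sample if at least one of its pairs passes the threshold
keep <- vapply(
  samples,
  function(s) any(pair_ok[pairs[, 1] == s | pairs[, 2] == s]),
  logical(1)
)
removed_samples <- names(keep)[!keep]
print(removed_samples)
```

Here S4 is the only sample whose pairings all fall below the threshold, so it is the only one removed; S3 is kept because its pairing with S1 passes.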
MarrFilterData(object, by = c("both", "features", "samplePairs"))
object |
a Marr object from |
by |
String specifying which reproducibility values to filter by. Options include "features" to filter features according to their reproducibility, "samplePairs" to filter samples according to the reproducibility of sample pairs, or "both" to filter both features and sample pairs according to their respective reproducibility. Default is "both". |
A list of data.frames or a SummarizedExperiment. If a data.frame was originally input into the Marr function, a list with three elements, filteredData, removedSamples, and removedFeatures, will be returned. If a SummarizedExperiment was originally input, the output will be a SummarizedExperiment with the assay filtered and with two metadata objects, removedSamples and removedFeatures, added.
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrFilterData(data_Marr, by = "both")
Given a Marr object, this function returns the Marr P Features
Accessors for the 'MarrPFeatures' slot of a Marr object.
MarrPFeatures(object)

## S4 method for signature 'Marr'
MarrPFeatures(object)
object |
an object of class |
Value of pFeatures argument passed to Marr
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrPFeatures(data_Marr)
This function plots a histogram of output from the Marr function, showing the features along the y-axis and percent reproducible sample pairs per feature on the x-axis.
MarrPlotFeatures(object, xLab = "Percent reproducible sample pairs per feature", yLab = "Feature")
object |
a Marr object from |
xLab |
label for x-axis. Default is 'Percent reproducible sample pairs per feature'. |
yLab |
label for y-axis. Default is 'Feature' |
A histogram will be created showing the features along the y-axis and percent reproducible sample pairs per feature on the x-axis.
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrPlotFeatures(data_Marr)
This function plots a histogram of output from the Marr function, showing the sample pairs along the y-axis and percent reproducible features per sample pair on the x-axis.
MarrPlotSamplepairs(object, xLab = "Percent reproducible features per sample pair", yLab = "Sample pair")
object |
a Marr object from |
xLab |
label for x-axis. Default is 'Percent reproducible features per sample pair'. |
yLab |
label for y-axis. Default is 'Sample pair' |
A histogram will be created showing the sample pairs along the y-axis and percent reproducible features per sample pair on the x-axis.
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrPlotSamplepairs(data_Marr)
This function is a helper function that computes distributions of reproducible sample pairs per feature and reproducible features per sample pair for the function Marr.
MarrProc(object, alpha = 0.05)
object |
an object which is a |
alpha |
(Optional) level of significance to control the False Discovery Rate (FDR). Default is 0.05. |
A list of percent reproducible statistics including
samplepairs |
the distribution of percent reproducible features (column-wise) per sample pair |
features |
the distribution of percent reproducible sample pairs (row-wise) per feature |
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_MarrProc <- MarrProc(object=data, alpha = 0.05)
Given a Marr object, this function returns the Marr P Sample Pairs
Accessors for the 'MarrPSamplepairs' slot of a Marr object.
MarrPSamplepairs(object)

## S4 method for signature 'Marr'
MarrPSamplepairs(object)
object |
an object of class |
Value of pSamplepairs argument passed to Marr
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrPSamplepairs(data_Marr)
Given a Marr object, this function returns the Marr sample pairs
Accessors for the 'MarrSamplepairs' slot of a Marr object.
MarrSamplepairs(object)

## S4 method for signature 'Marr'
MarrSamplepairs(object)
object |
an object of class |
The distribution of percent reproducible features (column-wise) per sample pair after applying the maximum rank reproducibility.
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrSamplepairs(data_Marr)
Given a Marr object, this function returns the Marr filtered sample pairs
Accessors for the 'MarrSamplepairsfiltered' slot of a Marr object.
MarrSamplepairsfiltered(object)

## S4 method for signature 'Marr'
MarrSamplepairsfiltered(object)
object |
an object of class |
The percent of reproducible features based on a threshold value after applying maximum rank reproducibility.
data <- matrix(rnorm(2400), nrow=200, ncol=12)
data_Marr <- Marr(object = data, pSamplepairs=0.75, pFeatures=0.75, alpha=0.05)
MarrSamplepairsfiltered(data_Marr)
Data contains LC-MS metabolite analysis for samples from 20 subjects and 662 metabolites. The raw data were preprocessed using the MSPrep method. Preprocessing includes three steps: filtering, missing value imputation, and normalization. Filtering: metabolites (columns) in the raw data were removed if they were missing in more than 80 percent of the samples. Missing value imputation: Bayesian Principal Component Analysis (BPCA) was applied to impute the missing values. Normalization: median normalization was applied to remove unwanted variation arising from various sources in metabolomics studies. The first three columns are "Mass" (the mass-to-charge ratio), "Retention.Time", and "Compound.Name" for each metabolite present. The remaining columns give the abundance of each of the 645 mass/retention-time combinations for each subject combination.
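The filtering step described above can be sketched in base R. The toy matrix `abund` and its metabolite names are hypothetical, and this is only an illustration of the "missing in more than 80 percent of samples" rule, not the MSPrep implementation.

```r
# Hypothetical raw abundances: 5 samples (rows) by 3 metabolites (columns)
abund <- cbind(
  m1 = c(1, NA, NA, NA, 2),   # missing in 60% of samples
  m2 = c(2, 3, NA, 4, 5),     # missing in 20% of samples
  m3 = rep(NA_real_, 5)       # missing in 100% of samples
)

# Fraction of samples in which each metabolite is missing
missing_frac <- colMeans(is.na(abund))

# Drop metabolites missing in more than 80 percent of the samples
keep <- missing_frac <= 0.8
filtered <- abund[, keep, drop = FALSE]
print(colnames(filtered))
```

Only m3 exceeds the 80 percent cutoff in this toy example, so m1 and m2 are retained for the later imputation and normalization steps.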
data(msprepCOPD)
SummarizedExperiment assay object containing 645 metabolites (features) of 20 subjects (samples).
Mass-to-charge ratio
Retention-time
Compound name for each mass/retention time combination
The columns indicate metabolite abundances found in each subject combination. Each column begins with an 'X', followed by the subject ID.
https://www.metabolomicsworkbench.org/data/DRCCMetadata.php?Mode=Project&ProjectID=PR000438
The raw data is available at the NIH Common Fund's National Metabolomics Data Repository (NMDR) website, the Metabolomics Workbench, https://www.metabolomicsworkbench.org, where it has been assigned Project ID PR000438. The raw data can be accessed directly via its Project DOI: 10.21228/M8FC7C. This work is supported by NIH grant U2C-DK119886.
Nichole Reisdorph. Untargeted LC-MS metabolomics analysis of human COPD plasma, HILIC & C18, metabolomics_workbench, V1.
Hughes, G., Cruickshank-Quinn, C., Reisdorph, R., Lutz, S., Petrache, I., Reisdorph, N., Bowler, R. and Kechris, K., 2014. MSPrep—Summarization, normalization and diagnostics for processing of mass spectrometry–based metabolomic data. Bioinformatics, 30(1), pp.133-134.
data(msprepCOPD)