Title: | An R package for quality control of hydrogen deuterium exchange mass spectrometry experiments |
---|---|
Description: | The hdxmsqc package enables us to analyse and visualise the quality of HDX-MS experiments, either as a final quality check before downstream analysis and publication or as part of an iterative procedure to determine the quality of the data. The package builds on the QFeatures and Spectra packages to integrate with other mass-spectrometry data. |
Authors: | Oliver M. Crook [aut, cre] |
Maintainer: | Oliver M. Crook <[email protected]> |
License: | file LICENSE |
Version: | 1.3.0 |
Built: | 2024-11-29 08:14:36 UTC |
Source: | https://github.com/bioc/hdxmsqc |
A small HDX-MS dataset for BRD4 in apo state and in complex with IBET151
A complete HDX-MS dataset for BRD4 in apo state and in complex with IBET151
Charge states should have correlated incorporation, but they need not be exactly the same.
chargeCorrelationHdx(object, experiment = NULL, timepoints = NULL)
object |
An object of class |
experiment |
A character vector indicating the experimental conditions |
timepoints |
A numeric vector indicating the experimental timepoints |
Oliver Crook
data("BRD4df_full")
BRD4df_filtered <- isMissingAtRandom(object = BRD4df_full)
BRD4df_full_imputed <- impute(BRD4df_filtered, method = "zero", i = 1)
experiment <- c("wt", "iBET")
timepoints <- rep(c(0, 15, 60, 600, 3600, 14000), each = 3)
monoStat <- chargeCorrelationHdx(object = BRD4df_full_imputed,
                                 experiment = experiment,
                                 timepoints = timepoints)
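The idea that charge states should agree in trend without matching exactly can be illustrated in base R. This is a minimal sketch on made-up uptake values; the variable names are hypothetical, and the use of a Pearson correlation is an assumption rather than a description of what chargeCorrelationHdx computes internally.

```r
# Hypothetical uptake values (Da) for one peptide at six timepoints,
# measured in two charge states. The numbers are made up for illustration.
uptake_z2 <- c(0.0, 1.1, 2.0, 3.2, 4.1, 4.8)
uptake_z3 <- c(0.1, 1.0, 2.2, 3.0, 4.3, 4.7)

# Pearson correlation between the two charge states across timepoints
# (assumed measure of agreement).
charge_cor <- cor(uptake_z2, uptake_z3)

# The two curves differ point by point, yet they track each other closely.
same_values <- identical(uptake_z2, uptake_z3)
print(charge_cor)
print(same_values)
```

A low correlation between charge states of the same peptide would be a reason to inspect the underlying spectra.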
Check whether deuterium uptakes are compatible across different overlapping sequences.
compatibleUptake(object, overlap = 5, experiment = NULL, timepoints = NULL)
object |
An object of class |
overlap |
How much overlap is required to check consistency. Default is sequences within 5 residues. |
experiment |
A character vector indicating the experimental conditions |
timepoints |
A numeric vector indicating the experimental timepoints |
Oliver Crook
data("BRD4df")
result <- compatibleUptake(BRD4df, experiment = 1, timepoints = 1)
Empirical versus theoretical mass errors
computeMassError(object, eCentroid = "Exp.Cent", tCentroid = "Theor.Cent")
object |
An object of class |
eCentroid |
character string indicating column identifier for experimental centroid |
tCentroid |
character string indicating column identifier for theoretical centroid |
The error difference between the empirical and theoretical centroid
Oliver Crook
data("BRD4df")
result <- computeMassError(BRD4df, "Exp.Cent", "Theor.Cent")
head(result)
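To make the comparison of empirical and theoretical centroids concrete, here is a minimal base R sketch. It assumes the error is expressed in parts per million relative to the theoretical centroid, which is a common convention but is not confirmed by this manual; the helper `mass_error_ppm` and the centroid values are hypothetical.

```r
# Relative mass error in parts per million (assumed convention).
mass_error_ppm <- function(experimental, theoretical) {
  (experimental - theoretical) / theoretical * 1e6
}

# Made-up centroids (m/z) for two peptides.
exp_cent   <- c(500.2510, 750.3805)
theor_cent <- c(500.2500, 750.3800)

errors <- mass_error_ppm(exp_cent, theor_cent)
print(round(errors, 2))
```

Peptides whose error is far from the bulk of the distribution are natural candidates for manual inspection.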
Monotonicity based outlier detection.
computeMonotoneStats(object, experiment = NULL, timepoints = NULL)
object |
An object of class |
experiment |
A character vector indicating the experimental conditions |
timepoints |
A numeric vector indicating the experimental timepoints |
Oliver Crook
data("BRD4df")
result <- computeMonotoneStats(BRD4df, experiment = 1, timepoints = 1)
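Monotonicity-based checks rest on the expectation that deuterium uptake does not decrease as exposure time grows. The following base R sketch shows one simple way to score departures from that expectation; the statistic used here (the summed size of all decreases) and the helper name `monotone_violation` are assumptions for illustration, not the statistic that computeMonotoneStats reports.

```r
# Sum of the magnitudes of all decreases between consecutive timepoints.
# A perfectly non-decreasing curve scores zero.
monotone_violation <- function(uptake) {
  steps <- diff(uptake)
  sum(abs(steps[steps < 0]))
}

# Made-up uptake curves ordered by increasing exposure time.
well_behaved <- c(0.0, 0.8, 1.5, 2.4, 3.0)
suspicious   <- c(0.0, 0.8, 0.3, 2.4, 2.0)

score_ok  <- monotone_violation(well_behaved)
score_bad <- monotone_violation(suspicious)
print(c(score_ok, score_bad))
```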
Computes the number of exchangeable amides based on the sequence.
exchangeableAmides(sequence)
sequence |
The sequence of the peptide |
Returns a numeric indicating the number of exchangeable amides
exchangeableAmides(sequence = "HDAEHAHEAPRKL")
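As a rough illustration of how such a count can be derived from a sequence, the sketch below counts backbone amides that can exchange: proline has no amide hydrogen, and the N-terminal residue is usually excluded. The helper `count_exchangeable` and its `n_terminal_skip` argument are hypothetical, and the package may use a different convention (for example, skipping the first two residues), so the numbers need not match exchangeableAmides.

```r
# Count non-proline residues after skipping the first n_terminal_skip
# residues (assumed convention; labs differ on skipping one or two).
count_exchangeable <- function(sequence, n_terminal_skip = 1) {
  residues <- strsplit(sequence, split = "")[[1]]
  backbone <- residues[seq_along(residues) > n_terminal_skip]
  sum(backbone != "P")
}

skip_one <- count_exchangeable("HDAEHAHEAPRKL")
skip_two <- count_exchangeable("HDAEHAHEAPRKL", n_terminal_skip = 2)
print(c(skip_one, skip_two))
```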
Fourier transform approach to computing the isotopic distribution.
fourierIsotope(
  elements,
  incorp = 0,
  num_exch_sites = 0,
  charge = 1,
  isotopes = NULL
)
elements |
A list of elements |
incorp |
The deuterium incorporation |
num_exch_sites |
The number of exchangeable amides. Default is 0. |
charge |
The charge state of the peptide |
isotopes |
The number of isotopes to compute. The default is NULL, in which case a default heuristic is used to make a good guess that covers the expected peaks. |
A list of mass and intensity values corresponding to the isotope distribution
Oliver Crook
fourierIsotope(c(C = 0, H = 2, N = 0, O = 1, S = 0, P = 0))
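The general idea behind a Fourier transform approach is that the isotope pattern of a molecule is the convolution of the per-atom isotope patterns, and convolution becomes multiplication in the Fourier domain. The base R sketch below illustrates this for water at nominal-mass resolution. It is not the package's implementation: the helper `isotope_pattern_fft`, the `n_peaks` argument, and the restriction to intensities only (no exact masses, charge, or deuterium incorporation) are simplifying assumptions.

```r
# Natural isotope abundances indexed by nominal mass shift (+0, +1, +2).
abundances <- list(
  H = c(0.999885, 0.000115),
  O = c(0.99757, 0.00038, 0.00205)
)

# Multiply the Fourier transforms of the per-element patterns, raised to the
# atom count, then transform back. n_peaks must exceed the largest possible
# mass shift so that the circular convolution does not wrap around.
isotope_pattern_fft <- function(counts, abundances, n_peaks = 8) {
  spectrum <- rep(1 + 0i, n_peaks)
  for (el in names(counts)) {
    padded <- numeric(n_peaks)
    padded[seq_along(abundances[[el]])] <- abundances[[el]]
    spectrum <- spectrum * fft(padded)^counts[[el]]
  }
  # R's inverse fft is unnormalised, hence the division by n_peaks.
  Re(fft(spectrum, inverse = TRUE)) / n_peaks
}

water <- isotope_pattern_fft(c(H = 2, O = 1), abundances)
print(round(water, 6))
```

The first entry is the relative intensity of the monoisotopic peak; later entries correspond to successive nominal mass shifts.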
Generate Spectra using a Fourier transform.
generateSpectra(
  sequences,
  incorps,
  charges,
  customs = list(code = NULL, elements = NULL)
)
sequences |
A vector of peptide sequences |
incorps |
A vector of deuterium incorporations |
charges |
A vector of charge states of the peptide |
customs |
Custom elements supplied as a list |
A Spectra object corresponding to the isotope distributions
Oliver Crook
generateSpectra(sequences = "HDAEHAHEAPRKL", incorps = c(0.5), charges = 2)
'hdxmsqc' provides functionality to assess the quality of HDX-MS experiments and to perform quality control on them. Raw and processed data can be visualized and analyzed to identify potential issues with the data. The package is designed to work with data from any HDX-MS platform. Typically, users will have exported results from either HDExaminer or DynamX software. There is no need to filter the data in either of those software systems.
Oliver Crook
Ion Mobility time based outlier analysis
imTimeOutlier(
  object,
  rightIMS = "rightIMS",
  leftIMS = "leftIMS",
  searchIMS = "Search.IMS"
)
object |
An object of class |
rightIMS |
A string indicating the right boundary of the ion mobility separation time. Default is "rightIMS". |
leftIMS |
A string indicating the left boundary of the ion mobility separation time. Default is "leftIMS". |
searchIMS |
A string indicating the actual ion mobility search time. Default is "Search.IMS". |
Oliver Crook
data("BRD4df_full")
BRD4df_filtered <- isMissingAtRandom(object = BRD4df_full)
BRD4df_full_imputed <- impute(BRD4df_filtered, method = "zero", i = 1)
imTimeOutlier(object = BRD4df_full_imputed)
Intensity based deviations
intensityOutliers(object, fcolIntensity = "Max.Inty")
object |
An object of class |
fcolIntensity |
A character string identifying the intensity columns. Default is "Max.Inty"; regular expressions are used to find the relevant columns |
The Cook's distance used to characterise outliers
Oliver Crook
data("BRD4df_full")
intensityOutliers(BRD4df_full)
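Cook's distance measures how much a fitted regression changes when a single observation is left out, so large values point to influential observations. The base R sketch below shows the mechanics on made-up data using `lm` and `cooks.distance` from the stats package. Which model intensityOutliers fits is not stated in this manual, so the simple linear model, the variable names, and the 4/n cut-off are all assumptions for illustration.

```r
# Made-up intensities that follow a linear trend, with one value inflated.
x <- 1:20
intensity <- 2 * x + sin(x)
intensity[20] <- intensity[20] + 40

# Fit a simple linear model and compute Cook's distance per observation.
fit <- lm(intensity ~ x)
cooks <- cooks.distance(fit)

# Flag observations above a common rule-of-thumb cut-off of 4 / n
# (an assumed threshold, not necessarily the package's).
flagged <- which(cooks > 4 / length(x))
most_influential <- unname(which.max(cooks))
print(most_influential)
```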
Missing at random versus missing not at random
isMissingAtRandom(object, threshold = NULL, filter = TRUE)
object |
An object of class |
threshold |
A threshold indicating how many missing values suggest that missingness is not at random. Default is NULL, which leads to a threshold of half the number of columns. |
filter |
A logical indicating whether to filter out data that is deemed missing not at random |
Adds a missing not at random indicator column
Oliver Crook
data("BRD4df_full")
isMissingAtRandom(BRD4df_full)
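The default threshold described above (half the number of columns) can be illustrated on a small matrix. This is a minimal base R sketch with made-up values; the object names are hypothetical, and whether the package compares with a strict or non-strict inequality is an assumption.

```r
# Made-up uptake matrix: rows are peptides, columns are measurements.
uptake <- matrix(c(1.2, 1.3,  NA, 1.1,
                    NA,  NA,  NA, 2.0,
                   0.5, 0.6, 0.4, 0.5),
                 nrow = 3, byrow = TRUE)

# Default rule from the argument description: half the number of columns.
threshold <- ncol(uptake) / 2

# Count missing values per peptide and flag rows above the threshold
# (strict inequality assumed).
n_missing <- rowSums(is.na(uptake))
not_at_random <- n_missing > threshold
print(not_at_random)
```

Only the second peptide, with three of four values missing, is flagged in this toy example.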
Fourier transform approach to computing the isotopic distribution.
isotopicDistributionHDXfourier(
  sequence,
  incorp = 0,
  charge = 1,
  custom = list(code = NULL, elements = NULL)
)
sequence |
A peptide |
incorp |
The deuterium incorporation |
charge |
The charge state of the peptide |
custom |
Custom amino acids can be provided here as a list of their elements. |
A list of mass and intensity values corresponding to the isotope distribution
Oliver Crook
isotopicDistributionHDXfourier(sequence = "HDAEHAHEAPRKL")
Ion Mobility time based outlier analysis
plotImTimeOutlier(
  object,
  rightIMS = "rightIMS",
  leftIMS = "leftIMS",
  searchIMS = "Search.IMS"
)
object |
An object of class |
rightIMS |
A string indicating the right boundary of the ion mobility separation time. Default is "rightIMS". |
leftIMS |
A string indicating the left boundary of the ion mobility separation time. Default is "leftIMS". |
searchIMS |
A string indicating the actual ion mobility search time. Default is "Search.IMS". |
Oliver Crook
library(RColorBrewer)
data("BRD4df_full")
BRD4df_filtered <- isMissingAtRandom(object = BRD4df_full)
BRD4df_full_imputed <- impute(BRD4df_filtered, method = "zero", i = 1)
plotImTimeOutlier(object = BRD4df_full_imputed)
Intensity based deviation plot
plotIntensityOutliers(object, fcolIntensity = "Max.Inty")
object |
An object of class |
fcolIntensity |
A character string identifying the intensity columns. Default is "Max.Inty"; regular expressions are used to find the relevant columns |
A ggplot2 object showing intensity based outliers
Oliver Crook
data("BRD4df_full")
library(RColorBrewer)
plotIntensityOutliers(BRD4df_full)
Mass error plot
plotMassError(object, eCentroid = "Exp.Cent", tCentroid = "Theor.Cent")
object |
An object of class |
eCentroid |
character string indicating column identifier for experimental centroid |
tCentroid |
character string indicating column identifier for theoretical centroid |
A ggplot2 object which can be used to visualise the mass error
Oliver Crook
library(RColorBrewer)
data("BRD4df")
result <- plotMassError(BRD4df, "Exp.Cent", "Theor.Cent")
Missing value plot
plotMissing(object, ...)
object |
An object of class |
... |
Additional arguments to pheatmap |
A pheatmap showing missing values
Oliver Crook
data("BRD4df_full")
library(pheatmap)
library(RColorBrewer)
plotMissing(BRD4df_full)
Monotonicity based outlier detection, plot.
plotMonotoneStat(object, experiment = NULL, timepoints = NULL)
object |
An object of class |
experiment |
A character vector indicating the experimental conditions |
timepoints |
A numeric vector indicating the experimental timepoints |
Oliver Crook
library("RColorBrewer")
data("BRD4df_full")
experiment <- c("wt", "iBET")
timepoints <- rep(c(0, 15, 60, 600, 3600, 14000), each = 3)
monoStat <- computeMonotoneStats(object = BRD4df_full,
                                 experiment = experiment,
                                 timepoints = timepoints)
Retention time based analysis
plotrTimeOutliers(
  object,
  leftRT = "leftRT",
  rightRT = "rightRT",
  searchRT = "Search.RT"
)
object |
An object of class |
leftRT |
A character string indicating the pattern associated with the left boundary of the retention time search. Default is "leftRT". |
rightRT |
A character string indicating the pattern associated with the right boundary of the retention time search. Default is "rightRT". |
searchRT |
The actual search retention time pattern. Default is "Search.RT". |
A ggplot2 object showing the distribution of retention time windows.
Oliver Crook
data("BRD4df_full")
library(RColorBrewer)
plotrTimeOutliers(BRD4df_full)
QFeatures
Function to curate an HDExaminer file so that it contains all the information
in a sensible format. The result can then be straightforwardly passed to
an object of class QFeatures.
processHDE(HDExaminerFile, proteinStates = NULL)
HDExaminerFile |
An object of class data.frame containing HDExaminer data |
proteinStates |
a character vector indicating the protein states |
A wide format data frame with HDExaminer data
Oliver Crook
sample_data <- data.frame(read.csv(
    system.file("extdata", "ELN55049_AllResultsTables_Uncurated.csv",
                package = "hdxmsqc", mustWork = TRUE),
    nrows = 10))
processHDE(sample_data)
Quality Control table function. Generate a table that collates quality control metrics
qualityControl(
  object,
  massError = NULL,
  intensityOutlier = NULL,
  retentionOutlier = NULL,
  monotonicityStat = NULL,
  mobilityOutlier = NULL,
  chargeCorrelation = NULL,
  replicateCorrelation = NULL,
  replicateOutlier = NULL,
  sequenceCheck = NULL,
  spectraCheck = NULL,
  experiment = NULL,
  timepoints = NULL,
  undeuterated = FALSE
)
object |
An object of class QFeatures, with the data used for the analysis |
massError |
The output of the |
intensityOutlier |
The output of the |
retentionOutlier |
The output of the |
monotonicityStat |
The output of the |
mobilityOutlier |
The output of the |
chargeCorrelation |
The output of the |
replicateCorrelation |
The output of the |
replicateOutlier |
The output of the |
sequenceCheck |
The output of the |
spectraCheck |
The output of the |
experiment |
The experimental conditions. |
timepoints |
The timepoints used in the analysis; these must include repeats for replicates |
undeuterated |
A logical indicating whether only the undeuterated data should be exported |
An object of class DataFrame containing a summary of the quality control results.
Oliver Crook
Correlation based checks
replicateCorrelation(object, experiment, timepoints)
object |
An object of class QFeatures. |
experiment |
A character vector indicating the experimental conditions |
timepoints |
A numeric vector indicating the experimental timepoints |
Returns a list of the same length as the number of experiments indicating outliers from the correlation analysis. Outliers are flagged if their deuterium uptake is highly variable.
Oliver Crook
data("BRD4df_full")
experiment <- c("wt", "iBET")
timepoints <- rep(c(0, 15, 60, 600, 3600, 14000), each = 3)
monoStat <- replicateCorrelation(object = BRD4df_full,
                                 experiment = experiment,
                                 timepoints = timepoints)
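A correlation-based replicate check can be sketched in base R as follows. The example builds a correlation matrix across three made-up replicates and picks out the one that agrees least with the others. The object names and the use of the mean pairwise Pearson correlation are assumptions for illustration; the manual does not specify how replicateCorrelation derives its flags.

```r
# Made-up uptake values for six peptides measured in three replicates.
replicates <- cbind(
  rep1 = c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0),
  rep2 = c(1.1, 2.0, 3.1, 3.9, 5.2, 6.0),
  rep3 = c(1.0, 2.0, 6.0, 4.0, 1.0, 6.0)
)

# Pairwise Pearson correlations between replicates.
cors <- cor(replicates)

# Mean correlation of each replicate with the others (diagonal excluded).
mean_cor <- (rowSums(cors) - 1) / (ncol(cors) - 1)

# The replicate with the lowest mean correlation is the least consistent.
least_consistent <- names(which.min(mean_cor))
print(least_consistent)
```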
Correlation based checks
replicateOutlier(object, experiment, timepoints)
object |
An object of class QFeatures. |
experiment |
A character vector indicating the experimental conditions |
timepoints |
A numeric vector indicating the experimental timepoints |
Returns a list of the same length as the number of experiments indicating outliers from the correlation analysis. Outliers are flagged if their deuterium uptake is highly variable.
Oliver Crook
data("BRD4df_full")
BRD4df_filtered <- isMissingAtRandom(object = BRD4df_full)
BRD4df_full_imputed <- impute(BRD4df_filtered, method = "zero", i = 1)
experiment <- c("wt", "iBET")
timepoints <- rep(c(0, 15, 60, 600, 3600, 14000), each = 3)
monoStat <- replicateOutlier(object = BRD4df_full_imputed,
                             experiment = experiment,
                             timepoints = timepoints)
Retention time based analysis
rTimeOutliers(
  object,
  leftRT = "leftRT",
  rightRT = "rightRT",
  searchRT = "Search.RT"
)
object |
An object of class |
leftRT |
A character string indicating the pattern associated with the left boundary of the retention time search. Default is "leftRT". |
rightRT |
A character string indicating the pattern associated with the right boundary of the retention time search. Default is "rightRT". |
searchRT |
The actual search retention time pattern. Default is "Search.RT". |
A list indicating the retention time based outliers.
Oliver Crook
data("BRD4df_full")
rTimeOutliers(BRD4df_full)
Spectral checking using data from HDsite
spectraSimilarity(
  peaks,
  object,
  experiment = NULL,
  mzCol = 14,
  startRT = "Start.RT",
  endRT = "End.RT",
  charge = "z",
  incorpD = "X.D.left",
  maxD = "maxD",
  numSpectra = NULL,
  ppm = 300,
  BPPARAM = bpparam()
)
peaks |
A data.frame containing data exported from HDsite |
object |
A data.frame obtained from HDExaminer data |
experiment |
A character vector indicating the experimental conditions |
mzCol |
The column in the peak information indicating the base mz value |
startRT |
The column indicating the start of the retention time. Default is "Start.RT". |
endRT |
The column indicating the end of the retention time. Default is "End.RT". |
charge |
The column indicating the charge information. Default is "z". |
incorpD |
The deuterium uptake value column. Default is "X.D.left". |
maxD |
The maximum allowed deuterium incorporation column. Default is "maxD". |
numSpectra |
The number of spectra to analyse. Default is NULL, in which case all spectra are analysed. |
ppm |
The ppm error |
BPPARAM |
Bioconductor parallel options. |
Two lists of spectra: the observed spectra and the matching theoretical spectra
Oliver Crook