Title: | Smooth quantile normalization |
---|---|
Description: | Smooth quantile normalization is a generalization of quantile normalization, which is average of the two types of assumptions about the data generation process: quantile normalization and quantile normalization between groups. |
Authors: | Stephanie C. Hicks [aut, cre] , Kwame Okrah [aut], Koen Van den Berge [ctb], Hector Corrada Bravo [aut] , Rafael Irizarry [aut] |
Maintainer: | Stephanie C. Hicks <[email protected]> |
License: | GPL-3 |
Version: | 1.23.0 |
Built: | 2024-12-30 03:37:11 UTC |
Source: | https://github.com/bioc/qsmooth |
This function applies a generalization of quantile normalization called smoothed quantile normalization. This function defines the qsmooth class and constructor.
qsmooth(object, group_factor, batch = NULL, norm_factors = NULL, window = 0.05)
qsmooth(object, group_factor, batch = NULL, norm_factors = NULL, window = 0.05)
object |
an object which is a |
group_factor |
a group level continuous or categorial
covariate associated with each sample or column in the
|
batch |
(Optional) batch covariate (multiple
batches are not allowed).
If batch covariate is provided, |
norm_factors |
optional normalization scaling factors. |
window |
window size for running median which is
a fraction of the number of rows in |
Quantile normalization is one of the most widely used normalization tools for data analysis in genomics. Although it was originally developed for gene expression microarrays it is now used across many different high-throughput applications including RNAseq and ChIPseq. The methodology relies on the assumption that observed changes in the empirical distribution of samples are due to unwanted variability. Because the data is transformed to remove these differences it has the potential to remove interesting biologically driven global variation. Therefore, applying quantile normalization, or other global normalization methods that rely on similar assumptions, may not be an appropriate depending on the type and source of variation.
This function computes a weight at every quantile that compares the variability between groups relative to within groups. In one extreme quantile normalization is applied and in the other extreme quantile normalization within each biological condition is applied. The weight shrinks the group-level quantile normalized data towards the overall reference quantiles if variability between groups is sufficiently smaller than the variability within groups. See the vignette for more details.
A object of the class qsmooth
that
contains a numeric vector of the qsmooth weights in
the qsmoothWeights
slot and a matrix of normalized
values after applying smoothed quantile normalization in
the qsmoothData
slot.
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10))
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10))
Objects of this class store all the values needed information to work with a qsmooth object
qsmoothWeights
returns the qsmooth weights and
qsmoothData
returns the qsmooth normalized data
qsmoothWeights
qsmooth weights
qsmoothData
qsmooth normalized data
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10))
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10))
Given a qsmooth object, this function returns the qsmooth normalized data
Accessors for the 'qsmoothData' slot of a qsmooth object.
qsmoothData(object) ## S4 method for signature 'qsmooth' qsmoothData(object)
qsmoothData(object) ## S4 method for signature 'qsmooth' qsmoothData(object)
object |
an object of class |
The normalized data after applying smoothed quantile normalization.
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10)) qsmoothData(dat_qs)
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10)) qsmoothData(dat_qs)
This function applies smoothed quantile normalization separately for groups of features that are binned according to their GC-content.
qsmoothGC(object, group_factor, gc, nGroups = 50, round = TRUE, ...)
qsmoothGC(object, group_factor, gc, nGroups = 50, round = TRUE, ...)
object |
an object which is a |
group_factor |
a group level continuous or categorial
covariate associated with each sample or column in the
|
gc |
GC-content of the features, ordered according to the features
in |
nGroups |
The number of equally-sized bins used to group the
GC-content values. Groups are created using |
round |
Should normalized values be rounded to integers? |
... |
(Optional) Additional arguments passed to |
A matrix of normalized counts.
Van den Berge K., Chou H., Roux de BĂ©zieux H., Street K., Risso D., Ngai J., Dudoit S. Normalization benchmark of ATAC-seq datasets shows the importance of accounting for GC-content effects. https://www.biorxiv.org/content/10.1101/2021.01.26.428252v2
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) gc <- runif(n=100, min=0.2, max=0.9) dat_qs <- qsmoothGC(object = dat, gc = gc, group_factor = rep(c(0,1), each=10))
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) gc <- runif(n=100, min=0.2, max=0.9) dat_qs <- qsmoothGC(object = dat, gc = gc, group_factor = rep(c(0,1), each=10))
qsmooth
function.This function plots a scatterplot
showing the qsmoothWeights
along the y-axis
and the quantiles on the x-axis.
qsmoothPlotWeights( object, xLab = "quantiles", yLab = "weights", mainLab = "qsmooth weights" )
qsmoothPlotWeights( object, xLab = "quantiles", yLab = "weights", mainLab = "qsmooth weights" )
object |
a qsmooth object from |
xLab |
label for x-axis. Default is "quantiles" |
yLab |
label for y-axis. Default is "weights" |
mainLab |
title of plot. Default is "qsmooth weights" |
A scatterplot will be created showing the
qsmoothWeights
along the y-axis and the
quantiles on the x-axis.
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10)) qsmoothPlotWeights(dat_qs)
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10)) qsmoothPlotWeights(dat_qs)
Given a qsmooth object, this function returns the qsmooth weights
Accessors for the 'qsmoothWeights' slot of a qsmooth object.
qsmoothWeights(object) ## S4 method for signature 'qsmooth' qsmoothWeights(object)
qsmoothWeights(object) ## S4 method for signature 'qsmooth' qsmoothWeights(object)
object |
an object of class |
The weights calculated for each feature after applying smoothed quantile normalization.
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10)) qsmoothWeights(dat_qs)
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) dat_qs <- qsmooth(object = dat, group_factor = rep(c(0,1), each=10)) qsmoothWeights(dat_qs)
This function is a helper function that
computes quantile statistics for the function
qsmooth
.
qstats(object, group_factor, window = 0.05)
qstats(object, group_factor, window = 0.05)
object |
an object which is a data frame or matrix with observations (e.g. probes or genes) on the rows and samples as the columns. |
group_factor |
a group level continuous or categorial
covariate associated with each sample or column in the
|
window |
window size for running median which is
a fraction of the number of rows in |
A list of quantile statistics including
Q |
sample quantiles |
Qref |
reference quantile |
Qhat |
linear model fit at each quantile |
SST |
total sum of squares |
SSB |
between sum of squares |
SSE |
within sum of squares |
roughWeights |
SSE / SST |
smoothWeights |
smoothed weights computed using a
running median with a given |
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) qs <- qstats(object = dat, group_factor = rep(c(0,1), each=10), window = 0.05)
dat <- cbind(matrix(rnorm(1000), nrow=100, ncol=10), matrix(rnorm(1000, .1, .7), nrow=100, ncol=10)) qs <- qstats(object = dat, group_factor = rep(c(0,1), each=10), window = 0.05)