Title: | Adaptive Robust Regression normalization for Illumina methylation data |
---|---|
Description: | Perform the Adaptive Robust Regression method (ARRm) for the normalization of methylation data from the Illumina Infinium HumanMethylation 450k assay. |
Authors: | Jean-Philippe Fortin, Celia M.T. Greenwood, Aurelie Labbe. |
Maintainer: | Jean-Philippe Fortin |
License: | Artistic-2.0 |
Version: | 1.47.0 |
Built: | 2024-10-30 03:30:08 UTC |
Source: | https://github.com/bioc/ARRmNormalization |
Normalize Illumina methylation data from the Infinium HumanMethylation 450k assay with the Adaptive Robust Regression method (ARRm). The normalization corrects for background intensity, dye bias, chip effects and on-chip spatial position. It can be applied to Beta values, M-values or other metrics.
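Since the normalization can be applied to Beta values or M-values, it may help to see how the two scales relate. The sketch below assumes the commonly used logit transform M = log2(Beta / (1 - Beta)); the helpers `beta_to_m` and `m_to_beta` are illustrative names and are not functions of this package.

```r
# Convert Beta values (strictly between 0 and 1) to M-values and back.
# beta_to_m() and m_to_beta() are hypothetical helpers, not package functions.
beta_to_m <- function(beta) log2(beta / (1 - beta))
m_to_beta <- function(m) 2^m / (2^m + 1)

beta <- c(0.2, 0.5, 0.8)
m <- beta_to_m(beta)          # approximately -2, 0, 2
round_trip <- m_to_beta(m)    # recovers the original Beta values
```

Beta values of exactly 0 or 1 map to infinite M-values, so real data usually needs an offset or clipping before the transform.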
This function estimates background intensity for the two colors by taking the median of the negative control probes in each color channel.
getBackground(greenControlMatrix, redControlMatrix)
greenControlMatrix | Matrix of negative control probe intensities in the green channel. Rows are probes, columns are samples. |
redControlMatrix | Matrix of negative control probe intensities in the red channel. Rows are probes, columns are samples. |
Returns a data.frame with two columns: "green" contains the background intensity in the green channel for each sample, and "red" contains the background intensity in the red channel for each sample.
library(ARRmNormalization)
data(greenControlMatrix)
data(redControlMatrix)
getBackground(greenControlMatrix, redControlMatrix)
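The per-channel median described above can be illustrated in base R without the package. This is a minimal sketch on made-up control intensities, not the package's implementation; the object names `green_ctrl`, `red_ctrl` and `background` are assumptions for illustration.

```r
# Toy negative-control intensities: rows are probes, columns are samples.
# matrix() fills column by column, so each group of three values is one sample.
green_ctrl <- matrix(c(100, 120, 110,
                       200, 220, 900),
                     nrow = 3,
                     dimnames = list(NULL, c("sampleA", "sampleB")))
red_ctrl <- matrix(c(50, 70, 60,
                     80, 95, 90),
                   nrow = 3,
                   dimnames = list(NULL, c("sampleA", "sampleB")))

# Background intensity per sample: the median over control probes in each channel.
background <- data.frame(
  green = apply(green_ctrl, 2, median),
  red   = apply(red_ctrl, 2, median)
)
```

The value 900 in `sampleB` barely affects the result, which is one reason a median is a reasonable summary for control probes.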
For each probe type, this function returns the coefficients of the linear model used in the ARRm normalization. Because the model is fitted to each percentile separately, a different set of coefficients is returned for every percentile. Residuals are returned as well.
getCoefficients(quantiles,designInfo,backgroundInfo,outliers.perc=0.02)
quantiles |
A |
designInfo |
matrix returned by |
backgroundInfo |
matrix returned by |
outliers.perc |
Proportion (between 0 and 1) of outliers to be removed from the regression. Defaults to 0.02. |
Returns a list containing three lists of coefficients, one per probe type: $green for Type I green probes, $red for Type I red probes and $II for Type II probes. Each list of coefficients contains five subfields: res is a matrix of residuals for the linear model across percentiles (a vector of residuals for each percentile); background.vector is a vector containing the regression coefficients for background intensity across percentiles; dyebias.vector is a vector containing the regression coefficients for dye bias across percentiles; chip.variations is a matrix of chip variations estimated by the linear model, where rows correspond to percentiles and columns to chips; position.variations is a matrix of position deviations from the chip mean estimated by the linear model, where rows correspond to percentiles and columns to positions.
library(ARRmNormalization)
data(greenControlMatrix)
data(redControlMatrix)
data(sampleNames)
data(betaMatrix)
backgroundInfo <- getBackground(greenControlMatrix, redControlMatrix)
designInfo <- getDesignInfo(sampleNames)
quantiles <- getQuantiles(betaMatrix)
coefficients <- getCoefficients(quantiles, designInfo, backgroundInfo)
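To make the idea of "one regression per percentile" concrete, here is a simplified base R sketch. It is not the package's algorithm: it uses ordinary least squares with a naive trimming step as a stand-in for the robust regression, it leaves out the chip and position terms, and the helper `fit_trimmed`, the covariate values and the "true" coefficients are all invented for illustration.

```r
# Hypothetical helper: least squares, drop the largest residuals, then refit.
fit_trimmed <- function(y, covariates, outliers_perc = 0.02) {
  dat <- data.frame(y = y, covariates)
  first_fit <- lm(y ~ ., data = dat)
  n_drop <- ceiling(outliers_perc * nrow(dat))
  keep <- order(abs(residuals(first_fit)))[seq_len(nrow(dat) - n_drop)]
  lm(y ~ ., data = dat[keep, , drop = FALSE])
}

# Made-up per-sample covariates for 24 samples.
covariates <- data.frame(
  background = seq(100, 330, by = 10),
  dyebias    = rep(c(-0.2, 0.1, 0.3, 0.0), 6)
)

# Made-up coefficients (intercept, background, dye bias) for three percentiles.
truth <- rbind(p25 = c(0.10,  2e-4,  0.05),
               p50 = c(0.45,  1e-4,  0.02),
               p75 = c(0.80, -1e-4, -0.03))

# Noise-free percentile values: rows are percentiles, columns are samples.
design <- cbind(1, covariates$background, covariates$dyebias)
quantile_values <- truth %*% t(design)
quantile_values["p50", 12] <- quantile_values["p50", 12] + 0.5  # one gross outlier

# Fit each percentile separately and collect the coefficients row by row.
estimated <- t(apply(quantile_values, 1,
                     function(y) coef(fit_trimmed(y, covariates))))
```

Because the toy data are noise-free apart from the single outlier, the trimmed fit recovers the coefficients used to build them; with real data the estimates would of course carry error.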
If a vector of sample names of the form "6793856729_R03C02" is given, the function builds a data frame containing chip and position indices for the samples. If no sample names are provided but explicit position and chip vectors are, the data frame is built from these explicit indices.
getDesignInfo(sampleNames = NULL, chipVector = NULL, positionVector = NULL)
sampleNames | Names of the samples of the form "6793856729_R03C02" (Chip ID, Row, Column) |
chipVector | Numeric vector of chip indices (one chip contains 12 samples) |
positionVector | Numeric vector of on-chip position indices (between 1 and 12) |
A data.frame containing a column named chipInfo with the chip indices, a column named positionInfo with the position indices, and a column named sampleNames if sample names were provided.
library(ARRmNormalization)
data(sampleNames)
getDesignInfo(sampleNames)
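The parsing of names such as "6793856729_R03C02" can be sketched in base R as follows. The helper `parse_sample_names` is hypothetical, and the mapping from row and column to a position index between 1 and 12 (column-major over a 6 x 2 layout) is an assumption; the package may number positions differently.

```r
# Hypothetical helper: split "<chip id>_R<row>C<column>" into chip and position indices.
parse_sample_names <- function(sample_names) {
  pattern <- "^([0-9]+)_R([0-9]{2})C([0-9]{2})$"
  stopifnot(all(grepl(pattern, sample_names)))
  chip_id <- sub(pattern, "\\1", sample_names)
  row_idx <- as.integer(sub(pattern, "\\2", sample_names))
  col_idx <- as.integer(sub(pattern, "\\3", sample_names))
  data.frame(
    sampleNames  = sample_names,
    # Chips are numbered in order of first appearance.
    chipInfo     = match(chip_id, unique(chip_id)),
    # Assumed layout: 6 rows by 2 columns, numbered down each column.
    positionInfo = (col_idx - 1L) * 6L + row_idx,
    stringsAsFactors = FALSE
  )
}

design <- parse_sample_names(
  c("6793856729_R03C02", "6793856729_R01C01", "1234567890_R06C02")
)
```

Under the assumed layout, the three samples get position indices 9, 1 and 12, and the two distinct chip IDs get chip indices 1 and 2.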
This function returns the percentiles of a betaMatrix for Type I green, Type I red and Type II probes. If no list of probes is provided, all probes are used to compute the percentiles.
getQuantiles(betaMatrix,goodProbes=NULL)
betaMatrix |
|
goodProbes |
IDs of the probes to be normalized (of the form "cg00000029"). |
Returns a list of three matrices of percentiles. For Type I green and Type I red probes, the corresponding matrices can be accessed by $green and $red. For Type II probes, the matrix can be accessed by $II.
library(ARRmNormalization)
data(greenControlMatrix)
data(redControlMatrix)
data(sampleNames)
data(betaMatrix)
quantiles <- getQuantiles(betaMatrix)
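The following base R sketch shows one way per-sample percentiles could be computed separately for each probe type. It is an illustration on random toy data, not the package's code: the helper `percentiles_by_type`, the choice of percentiles 1 to 100, and the orientation of the result (percentiles in rows, samples in columns) are assumptions. The `Probe_Name` and `Design_Type` columns follow the layout of the ProbesType data set described in this manual.

```r
# Hypothetical helper: per-sample percentiles for each probe design type.
percentiles_by_type <- function(beta, probe_types, good_probes = NULL,
                                probs = (1:100) / 100) {
  types <- c(green = "I Green", red = "I Red", II = "II")
  lapply(types, function(type) {
    ids <- probe_types$Probe_Name[probe_types$Design_Type == type]
    ids <- intersect(ids, rownames(beta))
    # Optionally restrict to a user-supplied set of probe IDs.
    if (!is.null(good_probes)) ids <- intersect(ids, good_probes)
    apply(beta[ids, , drop = FALSE], 2, quantile, probs = probs)
  })
}

# Toy Beta matrix: 30 probes (rows) by 4 samples (columns).
set.seed(1)
probe_ids <- sprintf("cg%08d", 1:30)
beta <- matrix(runif(30 * 4), nrow = 30,
               dimnames = list(probe_ids, paste0("sample", 1:4)))
probe_types <- data.frame(
  Probe_Name  = probe_ids,
  Design_Type = rep(c("I Green", "I Red", "II"), each = 10),
  stringsAsFactors = FALSE
)

quantiles <- percentiles_by_type(beta, probe_types)
```

Each element of `quantiles` is a 100 x 4 matrix here; row 50 holds the per-sample medians for that probe type.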
This function performs Adaptive Robust Regression method (ARRm) normalization on Beta values. The method corrects for background intensity, dye bias and spatial on-chip position. By default, chip mean correction is also performed.
normalizeARRm(betaMatrix, designInfo, backgroundInfo, outliers.perc = 0.02, goodProbes = NULL,chipCorrection=TRUE)
betaMatrix |
|
designInfo |
A |
backgroundInfo |
A |
outliers.perc |
Proportion (between 0 and 1) of outliers to be removed from the ARRm regression |
goodProbes |
IDs of the probes to be normalized (of the form "cg00000029") |
chipCorrection |
Logical; should the normalization correct for the chip mean? Defaults to TRUE. |
A matrix containing the normalized Beta values.
See getBackground for how to obtain background information from control probes, and getDesignInfo for how to obtain position and chip indices.
library(ARRmNormalization)
data(greenControlMatrix)
data(redControlMatrix)
data(sampleNames)
data(betaMatrix)
backgroundInfo <- getBackground(greenControlMatrix, redControlMatrix)
designInfo <- getDesignInfo(sampleNames)
normMatrix <- normalizeARRm(betaMatrix, designInfo, backgroundInfo, outliers.perc = 0.02)
For each probe type, and for each sample, deviations from the chip mean are computed for a given percentile. These deviations are plotted against on-chip position.
positionPlots(quantiles,designInfo,percentiles=c(25,50,75))
quantiles |
A |
designInfo |
|
percentiles |
Vector of percentiles to be plotted. By default, the 25th, 50th and 75th percentiles are plotted. |
Plots are saved as PDF files in the current working directory.
library(ARRmNormalization)
data(greenControlMatrix)
data(redControlMatrix)
data(sampleNames)
data(betaMatrix)
quantiles <- getQuantiles(betaMatrix)
backgroundInfo <- getBackground(greenControlMatrix, redControlMatrix)
designInfo <- getDesignInfo(sampleNames)
positionPlots(quantiles, designInfo, percentiles = c(25, 50, 75))
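The quantity behind these plots, the deviation of a percentile from its chip mean, can be computed in a few lines of base R. This is a sketch on invented numbers; the vectors `median_values`, `chip` and `position` are illustrative and do not come from the package data.

```r
# Toy 50th-percentile values for six samples spread over two chips.
median_values <- c(0.50, 0.52, 0.48, 0.60, 0.64, 0.56)
chip          <- c(1, 1, 1, 2, 2, 2)
position      <- c(1, 2, 3, 1, 2, 3)

# Deviation of each sample from the mean of its own chip.
deviation <- median_values - ave(median_values, chip)

# Average deviation at each on-chip position, across chips.
mean_by_position <- tapply(deviation, position, mean)
```

A systematic position effect would show up as average deviations that differ from zero at particular positions, as positions 2 and 3 do in this toy example.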
Probe design information for the Illumina Infinium HumanMethylation 450k array. Each probe is associated with its design type: Infinium I Green, Infinium I Red or Infinium II. Probe names follow Illumina's annotation (names of the form "cg00000029").
data(ProbesType)
A data frame containing two columns: $Probe_Name contains the names of the probes, and $Design_Type contains the design information ("I Green", "I Red" or "II").
data(ProbesType)
For each probe type, and for each sample, several percentiles are plotted against background intensity, and also against dye bias.
quantilePlots(quantiles,backgroundInfo,designInfo,percentilesI=NULL,percentilesII=NULL)
quantiles |
A |
designInfo |
|
backgroundInfo |
|
percentilesI |
List of percentiles to be plotted for Type I probes. Must be a vector of integers from 1 to 100. If set to |
percentilesII |
List of percentiles to be plotted for Type II probes. Must be a vector of integers from 1 to 100. If set to |
Plots are saved as PDF files in the current working directory.
library(ARRmNormalization)
data(greenControlMatrix)
data(redControlMatrix)
data(sampleNames)
data(betaMatrix)
quantiles <- getQuantiles(betaMatrix)
backgroundInfo <- getBackground(greenControlMatrix, redControlMatrix)
designInfo <- getDesignInfo(sampleNames)
quantilePlots(quantiles, backgroundInfo, designInfo)
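For readers who want to build a similar diagnostic by hand, the sketch below plots one percentile against background intensity with base graphics and writes the result to a PDF. The data are invented, the plot covers background intensity only (not dye bias), and the file is written to a temporary directory rather than the current directory; none of this reproduces the exact output of `quantilePlots`.

```r
# Toy per-sample values: green background intensity and 50th percentile of Beta.
background_green <- c(110, 150, 190, 230, 270, 310)
p50  <- c(0.42, 0.43, 0.45, 0.46, 0.48, 0.49)
chip <- c(1, 1, 1, 2, 2, 2)

# Write the scatter plot to a PDF file, coloring points by chip.
out_file <- file.path(tempdir(), "percentile_vs_background.pdf")
pdf(out_file)
plot(background_green, p50, col = chip, pch = 19,
     xlab = "Green background intensity",
     ylab = "50th percentile of Beta values")
invisible(dev.off())
```

A visible trend in such a plot suggests that the percentile depends on background intensity, which is the kind of dependence the normalization is meant to remove.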