Title: | Preprocessing of Illumina Infinium 450K data |
---|---|
Description: | Precise measurements are important for epigenome-wide studies investigating DNA methylation in whole blood samples, where effect sizes are expected to be small in magnitude. The 450K platform is often affected by batch effects and proper preprocessing is recommended. This package provides functions to read and normalize 450K '.idat' files. The normalization corrects for dye bias and biases related to signal intensity and methylation of probes using local regression. No adjustment for probe type bias is performed to avoid the trade-off of precision for accuracy of beta-values. |
Authors: | Jonathan Alexander Heiss |
Maintainer: | Jonathan Alexander Heiss <[email protected]> |
License: | BSD_2_clause + file LICENSE |
Version: | 1.35.0 |
Built: | 2024-10-30 09:02:52 UTC |
Source: | https://github.com/bioc/normalize450K |
Estimate leukocyte composition from whole blood DNA methylation
estimateLC(eSet)
eSet |
A Biobase eSet object as returned from a call of |
Cell proportions are estimated by two different models, both using the algorithm developed by Houseman et al. (2012). The first model was trained on a dataset of purified leukocytes (Reinius et al., 2012) and provides predictions for six cell types (granulocytes, monocytes, CD8+ T cells, CD4+ T cells, natural killer cells and CD19+ B cells). The second model was trained on whole blood samples from the LOLIPOP study as described by Heiss et al. (2016) and provides predictions for four cell types (neutrophils, eosinophils, lymphocytes and monocytes; ignore the prediction for basophils). Use this function only for normalized data (with normalize450K(...,tissue='Blood')).
Returns the eSet object with cell proportions estimates added to the phenoData slot.
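The idea behind reference-based estimation of cell proportions can be illustrated with a small simulation. The following is a minimal sketch in base R, not the implementation used by estimateLC: it assumes a hypothetical reference matrix `ref` of cell-type methylation profiles and recovers mixing weights by least squares with each weight bounded to [0, 1]. All names and values are made up for illustration, and the box constraints are a simplification of the constrained projection described by Houseman et al. (2012).

```r
set.seed(1)

n_cpg <- 200
cell_types <- c("Gran", "Mono", "Lymph")   # hypothetical labels

# Hypothetical reference profiles: rows are CpG sites, columns are cell types
ref <- matrix(runif(n_cpg * length(cell_types)), nrow = n_cpg,
              dimnames = list(NULL, cell_types))

# Simulated whole-blood beta-values: a known mixture plus small noise
true_w <- c(Gran = 0.6, Mono = 0.1, Lymph = 0.3)
y <- as.vector(ref %*% true_w) + rnorm(n_cpg, sd = 0.01)

# Residual sum of squares for a candidate weight vector
rss <- function(w) sum((y - ref %*% w)^2)

# Box-constrained least squares: every weight is kept within [0, 1]
fit <- optim(par = rep(1 / 3, 3), fn = rss, method = "L-BFGS-B",
             lower = 0, upper = 1)
est_w <- setNames(fit$par, cell_types)

print(round(est_w, 2))
```

With real data, the reference profiles come from the trained models rather than from random numbers, and the estimates are stored in the phenoData slot of the returned eSet.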
Jonathan A. Heiss
Houseman EA, et al. (2012) DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics, doi:10.1186/1471-2105-13-86
Reinius LE, et al. (2012) Differential DNA methylation in purified human blood cells: implications for cell lineage and studies on disease susceptibility. PloS ONE, doi:10.1371/journal.pone.0041361
Heiss JA, et al. (2016). Training a model for estimating leukocyte composition using whole-blood DNA methylation and cell counts as reference. Epigenomics, doi:10.2217/epi-2016-0091
Read 450K '.idat' files and compute raw or normalized beta-values.
read450K(idat_files)
normalize450K(intensities,tissue='')
dont_normalize450K(intensities)
idat_files |
a character vector containing the paths to the .idat files stripped from the '_Grn.idat' suffix with one entry for each sample (.idat files for green and red intensities have to be in the same folder). |
intensities |
List object containing raw signal intensities. Result of calling |
tissue |
If set to 'Blood', a set of prespecified reference values are used for normalization. This is recommended if you plan to use |
The function read450K reads .idat files and returns a list object containing raw signal intensities. dont_normalize450K returns an ExpressionSet containing beta-values without normalization. normalize450K performs dye bias correction using the extension control probes, followed by normalization by local regression (Heiss and Brenner, 2015), and likewise returns an ExpressionSet containing beta-values.
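As a reminder of what a beta-value is, the sketch below computes the conventional ratio of methylated signal to total signal from toy intensity matrices. The probe names, sample names and intensities are hypothetical, and the sketch makes no claim about the exact formula used inside this package (some pipelines, for example, add a small constant to the denominator).

```r
# Toy intensities: rows are CpG sites, columns are samples (made-up values)
meth <- matrix(c(9000,  500, 4000,
                 8000,  700, 3500),
               nrow = 3,
               dimnames = list(c("cpg_a", "cpg_b", "cpg_c"), c("s1", "s2")))
unmeth <- matrix(c(1000, 9500, 4000,
                   2000, 9300, 3500),
                 nrow = 3, dimnames = dimnames(meth))

# Beta-value: share of the total signal that comes from the methylated channel
beta <- meth / (meth + unmeth)

print(round(beta, 2))
```

Values near 1 indicate a mostly methylated site, values near 0 a mostly unmethylated site.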
For read450K: a list containing the methylated, unmethylated and control signal intensities.
For dont_normalize450K and normalize450K: an ExpressionSet containing beta-values, with rows corresponding to CpG sites (named) and columns to samples (in the same order as 'idat_files').
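Because columns follow the order of 'idat_files', it helps to build that vector deliberately. The sketch below shows one way to derive the suffix-free paths from a folder of .idat files using base R only. The folder and the Sentrix-style file names are hypothetical, and empty placeholder files stand in for real .idat files so the snippet runs anywhere.

```r
# Create a scratch folder with empty placeholder files (hypothetical names)
idat_dir <- file.path(tempdir(), "idat_demo")
dir.create(idat_dir, showWarnings = FALSE)
stems <- c("1234567890_R01C01", "1234567890_R02C01")
invisible(file.create(file.path(
  idat_dir,
  paste0(rep(stems, each = 2), c("_Grn.idat", "_Red.idat"))
)))

# List the green-channel files and strip the '_Grn.idat' suffix
grn <- list.files(idat_dir, pattern = "_Grn\\.idat$", full.names = TRUE)
idat_files <- sub("_Grn\\.idat$", "", grn)

# Each sample needs its red-channel file in the same folder
has_red <- file.exists(paste0(idat_files, "_Red.idat"))
stopifnot(all(has_red))

print(basename(idat_files))
```

The resulting `idat_files` vector has one entry per sample and could then be passed to read450K.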
A benchmark comparing the performance of this method with other normalization approaches is provided in the vignette.
Jonathan A. Heiss
Heiss JA, Brenner H (2015). Between-array normalization for 450K data. Frontiers in Genetics, doi:10.3389/fgene.2015.00092
## Not run:
library(minfiData) ## this package includes some .idat files
library(data.table)

path <- system.file("extdata",package="minfiData")
samples = fread(file.path(path, 'SampleSheet.csv'),integer64='character')
samples[,file:=file.path(path,Sentrix_ID,paste0(Sentrix_ID,'_',Sentrix_Position))]
## samples$file is a character vector containing the location of the
## .idat files, but without the suffixes "_Red.idat" or "_Grn.idat"

raw = read450K(samples$file)
none = dont_normalize450K(raw) ## no normalization
norm = normalize450K(raw)

## End(Not run)