Title: | A Hidden Markov Model Approach for Identifying Differentially Methylated Sites and Regions for Beta-Valued DNA Methylation Data |
---|---|
Description: | A novel approach utilizing a homogeneous hidden Markov model to effectively model untransformed beta values and identify DMCs while considering the spatial correlation of adjacent CpG sites. |
Authors: | Koyel Majumdar [cre, aut], Romina Silva [aut], Antoinette Sabrina Perry [aut], Ronald William Watson [aut], Isobel Claire Gorley [aut], Thomas Brendan Murphy [aut], Florence Jaffrezic [aut], Andrea Rau [aut] |
Maintainer: | Koyel Majumdar <[email protected]> |
License: | GPL-3 |
Version: | 1.3.1 |
Built: | 2024-12-18 08:35:21 UTC |
Source: | https://github.com/bioc/betaHMM |
Accessor methods for the metadata of betaHMMResults, dmcResults, dmrResults and threshold_Results objects.
annotatedData(object)
K(object)
N(object)
R(object)
A(object)
phi(object)
treatment_group(object)
llk(object)
tau(object)
hidden_states(object)
chromosome_number(object)
AUC(object)
uncertainty(object)
model_parameters(object)
threshold(object)

## S4 method for signature 'betaHMMResults'
annotatedData(object)
## S4 method for signature 'threshold_Results'
annotatedData(object)

## S4 method for signature 'RangedSummarizedExperiment'
K(object)
## S4 method for signature 'betaHMMResults'
K(object)
## S4 method for signature 'dmcResults'
K(object)
## S4 method for signature 'threshold_Results'
K(object)
## S4 method for signature 'NULL'
K(object)

## S4 method for signature 'RangedSummarizedExperiment'
N(object)
## S4 method for signature 'betaHMMResults'
N(object)
## S4 method for signature 'dmcResults'
N(object)
## S4 method for signature 'NULL'
N(object)

## S4 method for signature 'RangedSummarizedExperiment'
R(object)
## S4 method for signature 'betaHMMResults'
R(object)
## S4 method for signature 'dmcResults'
R(object)
## S4 method for signature 'NULL'
R(object)

## S4 method for signature 'RangedSummarizedExperiment'
A(object)
## S4 method for signature 'betaHMMResults'
A(object)
## S4 method for signature 'NULL'
A(object)

## S4 method for signature 'RangedSummarizedExperiment'
tau(object)
## S4 method for signature 'betaHMMResults'
tau(object)
## S4 method for signature 'NULL'
tau(object)

## S4 method for signature 'RangedSummarizedExperiment'
treatment_group(object)
## S4 method for signature 'betaHMMResults'
treatment_group(object)
## S4 method for signature 'dmcResults'
treatment_group(object)
## S4 method for signature 'NULL'
treatment_group(object)

## S4 method for signature 'RangedSummarizedExperiment'
llk(object)
## S4 method for signature 'betaHMMResults'
llk(object)
## S4 method for signature 'NULL'
llk(object)

## S4 method for signature 'RangedSummarizedExperiment'
phi(object)
## S4 method for signature 'betaHMMResults'
phi(object)
## S4 method for signature 'threshold_Results'
phi(object)
## S4 method for signature 'NULL'
phi(object)

## S4 method for signature 'RangedSummarizedExperiment'
hidden_states(object)
## S4 method for signature 'betaHMMResults'
hidden_states(object)
## S4 method for signature 'threshold_Results'
hidden_states(object)
## S4 method for signature 'NULL'
hidden_states(object)

## S4 method for signature 'RangedSummarizedExperiment'
chromosome_number(object)
## S4 method for signature 'betaHMMResults'
chromosome_number(object)
## S4 method for signature 'dmrResults'
chromosome_number(object)
## S4 method for signature 'NULL'
chromosome_number(object)

## S4 method for signature 'RangedSummarizedExperiment'
AUC(object)
## S4 method for signature 'dmcResults'
AUC(object)
## S4 method for signature 'NULL'
AUC(object)

## S4 method for signature 'RangedSummarizedExperiment'
uncertainty(object)
## S4 method for signature 'dmcResults'
uncertainty(object)
## S4 method for signature 'NULL'
uncertainty(object)

## S4 method for signature 'RangedSummarizedExperiment'
model_parameters(object)
## S4 method for signature 'threshold_Results'
model_parameters(object)
## S4 method for signature 'NULL'
model_parameters(object)

## S4 method for signature 'RangedSummarizedExperiment'
threshold(object)
## S4 method for signature 'threshold_Results'
threshold(object)
## S4 method for signature 'NULL'
threshold(object)
object |
a |
Output varies depending on the method.
## Use simulated data for the betaHMM workflow example
set.seed(12345)

## read files
data(sample_methylation_file)
data(sample_annotation_file)

# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50, ],
                    sample_annotation_file[1:50, ],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign", "Tumour"))

## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)

# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)

# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)

## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
A dataset containing a subset of the manifest data from the Illumina MethylationEPIC beadchip array. Only a subset of the complete dataset is included in the package, for testing purposes. The complete dataset is available on GitHub.
data(annotation_data)
A data frame with 100 rows and 9 columns.
IlmnID: The unique identifier from the Illumina CG database, i.e. the probe ID.
Genome_Build: The genome build referenced by the Infinium MethylationEPIC manifest.
CHR: The chromosome containing the CpG (Genome_Build = 37).
MAPINFO: The chromosomal coordinates of the CpG sites.
UCSC_RefGene_Name: The target gene name(s), from the UCSC database. Note: multiple listings of the same gene name indicate splice variants.
UCSC_RefGene_Accession: The UCSC accession numbers of the target transcripts. Accession numbers are in the same order as the target gene transcripts.
UCSC_RefGene_Group: Gene region feature category describing the CpG position, from UCSC. Features are listed in the same order as the target gene transcripts.
UCSC_CpG_Islands_Name: The chromosomal coordinates of the CpG Island from UCSC.
Relation_to_UCSC_CpG_Island: The location of the CpG relative to the CpG island.
A data frame containing the array design of Illumina’s Human Methylation EPIC microarray for chromosome 7, based on the v1.0b2 version of the manifest file.
This is the primary user interface for the betaHMM function.
Generic S4 methods are implemented to estimate the parameters of a homogeneous hidden Markov model for beta-valued DNA methylation data. The supported classes are matrix, data.frame, RangedSummarizedExperiment and GRanges. The output of the betaHMM method is an S4 object of class betaHMMResults.
betaHMM(methylation_data, annotation_file, ...)

## S4 method for signature 'matrix,matrix'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'data.frame,data.frame'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature
## 'RangedSummarizedExperiment,RangedSummarizedExperiment'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'GRanges,GRanges'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'RangedSummarizedExperiment,matrix'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'RangedSummarizedExperiment,data.frame'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'RangedSummarizedExperiment,GRanges'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'matrix,RangedSummarizedExperiment'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'data.frame,RangedSummarizedExperiment'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'data.frame,GRanges'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'matrix,data.frame'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'matrix,GRanges'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'GRanges,matrix'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'GRanges,data.frame'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'GRanges,RangedSummarizedExperiment'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)

## S4 method for signature 'data.frame,matrix'
betaHMM(methylation_data, annotation_file, M = 3, N = 4, R = 2,
        treatment_group = NULL, parallel_process = FALSE, seed = NULL,
        iterations = 100, ...)
methylation_data |
A dataframe of dimension |
annotation_file |
A dataframe containing the EPIC methylation annotation file. May be provided as a matrix, data.frame, GRanges or RangedSummarizedExperiment object. |
... |
Extra arguments |
M |
Number of methylation states to be identified in a single DNA sample. |
N |
Number of DNA samples (patients/replicates) collected for each treatment group. |
R |
Number of treatment groups (e.g. Benign and Tumour). |
treatment_group |
The names of the treatment groups/conditions. If no value is passed, default sample names (e.g. Sample 1, Sample 2) are used as legend text (default = NULL). |
parallel_process |
The 'TRUE' option results in parallel processing of the models for each chromosome for increased computational efficiency. The default option has been set as 'FALSE' due to package testing limitations. |
seed |
Seed to allow for reproducibility (default = NULL). |
iterations |
Number of iterations for algorithm convergence (default=100). |
An S4 object of class betaHMMResults, where the conditional probabilities of each CpG site belonging to a hidden state are stored as a SimpleList of assay data, and the corresponding estimated model parameters, log-likelihood values, and most probable hidden state sequence for each chromosome are stored as metadata.
Koyel Majumdar
## Use simulated data for the betaHMM workflow example
set.seed(12345)

## read files
data(sample_methylation_file)
data(sample_annotation_file)

# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50, ],
                    sample_annotation_file[1:50, ],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign", "Tumour"))

## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)

# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)

# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)

## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
betaHMMResults is a subclass of RangedSummarizedExperiment, used to store the betaHMM results as well as the annotated data useful for plotting.
betaHMMResults(SummarizedExperiment, annotatedData)
SummarizedExperiment |
a |
annotatedData |
The annotated data passed as an input argument to the betaHMM package. |
This constructor function would not typically be used by "end users".
This simple class extends the RangedSummarizedExperiment
class of the
SummarizedExperiment package
to allow other packages to write methods for results
objects from the betaHMM package. It is used by
to wrap up the results table.
a betaHMMResults object
## Use simulated data for the betaHMM workflow example
set.seed(12345)

## read files
data(sample_methylation_file)
data(sample_annotation_file)

# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50, ],
                    sample_annotation_file[1:50, ],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign", "Tumour"))

## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)

# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)

# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)

## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
A homogeneous hidden Markov model for the beta valued DNA methylation data.
betaHMMrun( methylation_data, annotation_file, M, N, R, treatment_group = NULL, parallel_process = FALSE, seed = NULL, iterations = 100, ... )
methylation_data |
A dataframe of dimension |
annotation_file |
A dataframe containing the EPIC methylation annotation file. |
M |
Number of methylation states to be identified in a single DNA sample. |
N |
Number of DNA samples (patients/replicates) collected for each treatment group. |
R |
Number of treatment groups (e.g. Benign and Tumour). |
treatment_group |
The names of the treatment groups/conditions. If no value is passed, default sample names (e.g. Sample 1, Sample 2) are used as legend text (default = NULL). |
parallel_process |
The 'TRUE' option results in parallel processing of the models for each chromosome for increased computational efficiency. The default option has been set as 'FALSE' due to package testing limitations. |
seed |
Seed to allow for reproducibility (default = NULL). |
iterations |
Number of iterations for algorithm convergence (default=100). |
... |
Extra arguments |
The betaHMM function starts from initial parameter values (obtained using a basic 3-state beta hidden Markov model) and estimates the parameters of the homogeneous hidden Markov model, adapted for beta-valued DNA methylation data, with the Baum-Welch algorithm. The estimated parameters are then used to determine the most probable sequence of hidden states with the Viterbi algorithm.
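The Viterbi decoding step can be illustrated with a small, self-contained sketch. This is not the package's implementation: it is a toy example with a single beta-valued observation per site, and the function name viterbi_beta, its arguments, and all parameter values are assumptions made purely for illustration.

```r
# Viterbi decoding in log space for a toy HMM with beta-distributed emissions.
# Hypothetical helper: names and parameter values are illustrative assumptions.
viterbi_beta <- function(x, tau, A, shape1, shape2) {
  n <- length(x)    # number of observations (e.g. CpG sites along a chromosome)
  K <- length(tau)  # number of hidden states

  # Log emission density of every observation under every state (n x K matrix)
  log_b <- matrix(
    vapply(seq_len(K),
           function(k) dbeta(x, shape1[k], shape2[k], log = TRUE),
           numeric(n)),
    nrow = n, ncol = K
  )

  delta <- matrix(-Inf, n, K)  # best log-probability of a path ending in state k
  psi <- matrix(0L, n, K)      # back-pointers to the best previous state

  delta[1, ] <- log(tau) + log_b[1, ]
  for (t in seq_len(n)[-1]) {
    for (k in seq_len(K)) {
      cand <- delta[t - 1, ] + log(A[, k])
      psi[t, k] <- which.max(cand)
      delta[t, k] <- max(cand) + log_b[t, k]
    }
  }

  # Backtrack from the most probable final state
  path <- integer(n)
  path[n] <- which.max(delta[n, ])
  for (t in rev(seq_len(n - 1))) {
    path[t] <- psi[t + 1, path[t + 1]]
  }
  path
}

# Toy data: three low beta values followed by three high ones
x <- c(0.05, 0.10, 0.08, 0.90, 0.85, 0.95)
tau <- c(0.5, 0.5)                                   # initial distribution
A <- matrix(c(0.9, 0.1, 0.1, 0.9), 2, byrow = TRUE)  # transition matrix
shape1 <- c(2, 20)                                   # state 1 low, state 2 high
shape2 <- c(20, 2)

states <- viterbi_beta(x, tau, A, shape1, shape2)
print(states)  # 1 1 1 2 2 2
```

Working in log space avoids numerical underflow when many sites are chained together.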
The function returns an object of the betaHMMResults class, which contains a SimpleList of assay data containing the posterior probability of each CpG site belonging to each of the hidden states, and the following values as metadata:
K - The number of hidden states identified using the betaHMM model.
C - The number of CpG sites analysed using the betaHMM model.
N - The number of DNA samples corresponding to each treatment group analysed using the betaHMM model.
R - The number of treatment groups analysed using the betaHMM model.
A - The transition matrix estimated for the betaHMM model.
tau - The initial distribution estimated for the betaHMM model.
treatment_group - The names of the treatment groups/conditions analysed.
phi - The shape parameters estimated for the observed data in the betaHMM model.
llk - A vector containing the log-likelihood values calculated for each iteration of the algorithm.
hidden_states - The vector containing the estimated hidden state for each CpG site.
## Use simulated data for the betaHMM workflow example
set.seed(12345)

## read files
data(sample_methylation_file)
data(sample_annotation_file)

# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50, ],
                    sample_annotation_file[1:50, ],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign", "Tumour"))

## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)

# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)

# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)

## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
This is the primary user interface for the dmc_identification function.
Generic S4 methods are implemented to identify the DMCs from the estimated betaHMM model parameters for each chromosome. The supported class is a betaHMMResults object. The output is an S4 object of class dmcResults.
dmc_identification(betaHMM_object, ...)

## S4 method for signature 'betaHMMResults'
dmc_identification(
  betaHMM_object,
  AUC_threshold = 0.8,
  uncertainty_threshold = 0.2,
  ...
)
betaHMM_object |
An S4 object of class
|
... |
extra arguments |
AUC_threshold |
The threshold for AUC metric for each chromosome. |
uncertainty_threshold |
The threshold for uncertainty of belonging to a particular hidden state, for each chromosome. |
An S4 object of class dmcResults.
## Use simulated data for the betaHMM workflow example
set.seed(12345)

## read files
data(sample_methylation_file)
data(sample_annotation_file)

# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50, ],
                    sample_annotation_file[1:50, ],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign", "Tumour"))

## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)

# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)

# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)

## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
The dissimilarity between the cumulative distributions estimated for each hidden state is measured using the area-under-curve (AUC) technique. User-defined thresholds on the AUC and on the uncertainty of membership in a hidden state are then used to select the most differentially methylated hidden states. This identifies the CpGs that exhibit the most notable differential methylation under the chosen threshold criteria.
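As a rough illustration of how an AUC can quantify the separation between two distributions, the sketch below computes it for a pair of beta distributions. This is not the package's code: the function name auc_beta, the definition of the AUC as P(X1 < X2), the direction-free max(p, 1 - p) step, and the shape parameters are all assumptions made for illustration. Only the threshold value 0.8 is taken from the documented default of AUC_threshold.

```r
# AUC between two beta distributions, written as P(X1 < X2) for independent
# X1 ~ Beta(a1, b1) and X2 ~ Beta(a2, b2). Hypothetical helper, illustrative only.
auc_beta <- function(a1, b1, a2, b2) {
  # P(X1 < X2) is the integral of F1(x) * f2(x) over (0, 1)
  integrand <- function(x) pbeta(x, a1, b1) * dbeta(x, a2, b2)
  p <- integrate(integrand, lower = 0, upper = 1)$value
  # Report the separation regardless of which distribution sits higher
  max(p, 1 - p)
}

# Identical distributions are indistinguishable (AUC near 0.5),
# well-separated ones give an AUC near 1.
auc_same <- auc_beta(2, 5, 2, 5)
auc_apart <- auc_beta(2, 20, 20, 2)

# Compare against a threshold equal to the documented default of AUC_threshold
AUC_threshold <- 0.8
print(c(same = auc_same >= AUC_threshold, apart = auc_apart >= AUC_threshold))
```

Under these assumptions, only the well-separated pair would pass the threshold. How the package combines the AUC with the uncertainty threshold is not shown here.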
dmc_identification_run( betaHMM_object, AUC_threshold = 0.8, uncertainty_threshold = 0.2, ... )
betaHMM_object |
A |
AUC_threshold |
The threshold for AUC metric for each chromosome. |
uncertainty_threshold |
The threshold for uncertainty of belonging to a particular hidden state, for each chromosome. |
... |
extra arguments |
The function returns an object of the dmcResults class, which contains a SimpleList of assay data with the following values:
CHR Chromosome number
MAPINFO The chromosomal coordinates of the CpG site
IlmnID The Illumina probe ID of the CpG site
N*R columns containing methylation states
hidden_state The assigned hidden_state
DMC The value is 1 if the CpG is a DMC else 0.
The object contains the following values as the metadata:
A list containing the AUC values for K hidden states for each chromosome and the conditions compared which resulted in the highest AUC value when more than 2 conditions are compared.
A list containing the conditional probability values for K hidden states for each chromosome.
The treatment group labels.
K The number of hidden states estimated.
N The number of DNA replicates/patients for each treatment group.
R The number of treatment groups to be compared.
## Use simulated data for the betaHMM workflow example
set.seed(12345)

## read files
data(sample_methylation_file)
data(sample_annotation_file)

# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50, ],
                    sample_annotation_file[1:50, ],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign", "Tumour"))

## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)

# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)

# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)

## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
dmcResults is a subclass of RangedSummarizedExperiment, used to store the DMCs identified.
dmcResults(SummarizedExperiment)
SummarizedExperiment |
a |
This constructor function would not typically be used by "end users". This simple class extends the RangedSummarizedExperiment class of the SummarizedExperiment package to allow other packages to write methods for results objects from the dmc_identification function. It is used internally to wrap up the results table.
a dmcResults object
## Use simulated data for the betaHMM workflow example
set.seed(12345)
## read files
data(sample_methylation_file)
data(sample_annotation_file)
# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50,],
                    sample_annotation_file[1:50,],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign","Tumour"))
## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)
# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)
# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)
## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
This is the primary user interface for the dmr_identification function. Generic S4 methods are implemented to identify the DMRs from the DMCs identified in each chromosome. The supported classes are data.frame, matrix and dmcResults. The output is an S4 object of class dmrResults.
dmr_identification(dmc_identification_object, ...)

## S4 method for signature 'dmcResults'
dmr_identification(dmc_identification_object, DMC_count = 2, ...)

## S4 method for signature 'matrix'
dmr_identification(dmc_identification_object, DMC_count = 2, ...)

## S4 method for signature 'data.frame'
dmr_identification(dmc_identification_object, DMC_count = 2, ...)
dmc_identification_object |
a
|
... |
extra arguments |
DMC_count |
The minimal number of consecutive CpGs in a DMR. |
An S4 object of class dmrResults, where the CpG site information for each DMR is stored as a SimpleList of assay data and the chromosomes analysed by the model are stored as the metadata.
## Use simulated data for the betaHMM workflow example
set.seed(12345)
## read files
data(sample_methylation_file)
data(sample_annotation_file)
# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50,],
                    sample_annotation_file[1:50,],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign","Tumour"))
## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)
# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)
# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)
## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
Function to identify the DMRs from the DMCs identified in each chromosome.
dmr_identification_run(
  dmc_identification_object,
  DMC_count = 2,
  parallel_process = FALSE,
  ...
)
dmc_identification_object |
a |
DMC_count |
The minimal number of consecutive CpGs in a DMR. |
parallel_process |
The 'TRUE' option results in parallel processing of the DMCs from each chromosome for increased computational efficiency. The default option has been set as 'FALSE' due to package testing limitations. |
... |
extra arguments |
A dmrResults object containing a SimpleList of assay data with the following fields:
start_CpG - The starting CpG site IlmnID in the particular DMR
end_CpG - The ending CpG site IlmnID in the particular DMR
DMR_size - Number of CpG sites identified in the DMR
chr_dmr - The chromosome corresponding to the CpG sites in the DMR.
map_start - MAPINFO of starting CpG site in the particular DMR
map_end - MAPINFO of the ending CpG site in the particular DMR
The object also returns the chromosomes analysed by the betaHMM model as the metadata.
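The role of DMC_count can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper find_dmr_runs() and the toy data are hypothetical, and the sketch assumes that a DMR is simply a run of at least DMC_count consecutive CpG sites flagged as DMCs on one chromosome, already sorted by MAPINFO.

```r
# Hypothetical helper (not part of betaHMM): collapse runs of consecutive
# DMC flags into candidate regions of at least `dmc_count` sites.
find_dmr_runs <- function(dmc_flag, ilmn_id, mapinfo, dmc_count = 2) {
  runs   <- rle(dmc_flag == 1)            # run-length encode the 0/1 flags
  ends   <- cumsum(runs$lengths)          # index of the last site in each run
  starts <- ends - runs$lengths + 1       # index of the first site in each run
  keep   <- runs$values & runs$lengths >= dmc_count
  data.frame(
    start_CpG = ilmn_id[starts[keep]],
    end_CpG   = ilmn_id[ends[keep]],
    DMR_size  = runs$lengths[keep],
    map_start = mapinfo[starts[keep]],
    map_end   = mapinfo[ends[keep]],
    stringsAsFactors = FALSE
  )
}

# Toy input (made-up values): 8 CpG sites on one chromosome, sorted by MAPINFO.
toy <- data.frame(
  IlmnID  = paste0("cg", 1:8),
  MAPINFO = c(100, 150, 210, 400, 450, 900, 950, 990),
  DMC     = c(1, 1, 0, 1, 0, 1, 1, 1),
  stringsAsFactors = FALSE
)

dmr_toy <- find_dmr_runs(toy$DMC, toy$IlmnID, toy$MAPINFO, dmc_count = 2)
print(dmr_toy)
```

With dmc_count = 2 the isolated DMC at cg4 is dropped, while the runs cg1 to cg2 and cg6 to cg8 are kept; raising dmc_count to 3 would keep only the second run.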
## Use simulated data for the betaHMM workflow example
set.seed(12345)
## read files
data(sample_methylation_file)
data(sample_annotation_file)
# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50,],
                    sample_annotation_file[1:50,],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign","Tumour"))
## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)
# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)
# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)
## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
dmrResults is a subclass of RangedSummarizedExperiment, used to store the DMRs identified.
dmrResults(SummarizedExperiment)
SummarizedExperiment |
a |
This constructor function would not typically be used by "end users". This simple class extends the RangedSummarizedExperiment class of the SummarizedExperiment package to allow other packages to write methods for results objects from the dmr_identification function. It is used internally to wrap up the results table.
a dmrResults object
## Use simulated data for the betaHMM workflow example
set.seed(12345)
## read files
data(sample_methylation_file)
data(sample_annotation_file)
# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50,],
                    sample_annotation_file[1:50,],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign","Tumour"))
## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)
# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)
# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)
## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
A subset of the dataset containing beta methylation values from 2 sample types (Benign and Tumour), collected from 4 patients in a prostate cancer study. The dataset contains methylation values corresponding to chromosome 7.
data(pca_methylation_data)
A data frame with 38672 rows and 9 columns. The data contain no missing values.
IlmnID: The unique identifier from the Illumina CG database, i.e. the probe ID.
Benign_Patient_1: Methylation values from benign tissue from patient 1.
Benign_Patient_2: Methylation values from benign tissue from patient 2.
Benign_Patient_3: Methylation values from benign tissue from patient 3.
Benign_Patient_4: Methylation values from benign tissue from patient 4.
Tumour_Patient_1: Methylation values from tumor tissue from patient 1.
Tumour_Patient_2: Methylation values from tumor tissue from patient 2.
Tumour_Patient_3: Methylation values from tumor tissue from patient 3.
Tumour_Patient_4: Methylation values from tumor tissue from patient 4.
The array data were normalized, and probes located outside of CpG sites and on the sex chromosomes were filtered out. CpG sites with missing values were removed from the resulting dataset. A subset of the complete dataset has been uploaded in the package for testing purposes. The complete dataset is available on GitHub.
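The filtering steps described above can be sketched in base R on toy data. This is an illustration only, not the authors' preprocessing pipeline: the values are made up, normalization is not shown, and the sketch assumes that CpG probes are those whose IlmnID starts with "cg" and that sex chromosomes are labelled "X" and "Y" in the CHR column.

```r
# Toy methylation values (made up); column names follow the package data sets.
meth <- data.frame(
  IlmnID = c("cg001", "cg002", "ch003", "cg004", "cg005"),
  Benign_Patient_1 = c(0.10, 0.85, 0.40, NA, 0.55),
  Tumour_Patient_1 = c(0.20, 0.80, 0.45, 0.30, 0.60),
  stringsAsFactors = FALSE
)

# Toy annotation (made up) with the probe ID, chromosome and coordinate.
annot <- data.frame(
  IlmnID  = c("cg001", "cg002", "ch003", "cg004", "cg005"),
  CHR     = c("7", "X", "7", "7", "7"),
  MAPINFO = c(500, 100, 300, 200, 400),
  stringsAsFactors = FALSE
)

df <- merge(annot, meth, by = "IlmnID")

# Assumption: probes targeting CpG sites have IDs starting with "cg".
is_cpg <- grepl("^cg", df$IlmnID)
# Assumption: sex chromosomes are coded as "X" and "Y".
is_autosome <- !(df$CHR %in% c("X", "Y"))

df <- df[is_cpg & is_autosome, ]
df <- df[complete.cases(df), ]            # drop CpG sites with missing values
df <- df[order(df$CHR, df$MAPINFO), ]     # sort by chromosome and coordinate
print(df)
```

Only cg005 and cg001 survive in this toy example: cg002 lies on chromosome X, ch003 is not a CpG probe, and cg004 has a missing value.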
A data frame containing a subset of methylation data from a real study.
Visualize results from the betaHMM / dmc_identification / threshold_identification functions.
plot(x, ...)

## S4 method for signature 'betaHMMResults'
plot(
  x,
  chromosome = NULL,
  what = c("fitted density", "kernel density", "uncertainty"),
  treatment_group = NULL,
  AUC = NULL,
  uncertainty_threshold = 0.2,
  title = NULL,
  ...
)

## S4 method for signature 'dmcResults'
plot(
  x,
  start_CpG = NULL,
  end_CpG = NULL,
  treatment_group = NULL,
  N = NULL,
  title = NULL,
  ...
)

## S4 method for signature 'threshold_Results'
plot(x, plot_threshold = TRUE, title = NULL, ...)
x |
An object of class
|
... |
Other graphics parameters. |
chromosome |
The chromosome number for which the plot is to be displayed. |
what |
The different plots that can be obtained are either 'fitted density', 'kernel density' or 'uncertainty' (default = 'fitted density'). |
treatment_group |
The names of the different treatment groups to be displayed in the plot. If no value is passed then the sample names
estimated by the |
AUC |
The AUC values for that chromosome. |
uncertainty_threshold |
The uncertainty threshold value used for DMC identification. |
title |
The title that the user wants to display. If no title is to be displayed the default is 'NULL'. |
start_CpG |
The IlmnID of starting CpG site when plotting the DMCs. |
end_CpG |
The IlmnID of the ending CpG site, or the total number of CpGs to be plotted excluding the starting CpG site, when plotting the DMCs. |
N |
The number of DNA samples corresponding to each
treatment group analysed using the betaHMM model. If 'NULL', the value from
|
plot_threshold |
The "TRUE" option displays the threshold points in the graph for the 3 state betaHMM model (default = "TRUE"). |
This function displays the following plots as requested by the user when analysing the betaHMMResults output:
fitted density estimates - Plot showing the fitted density estimates of the clustering solution under the optimal model selected.
kernel density estimates - Plot showing the kernel density estimates of the clustering solution under the optimal model selected.
uncertainty - A boxplot showing the uncertainties in the hidden state estimation.
The function displays the DMCs and DMRs plot from the dmcResults object.
The function displays the plot for the estimated shape parameters and threshold for the methylation states in a single DNA treatment condition from the threshold_Results object.
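To make the uncertainty plot more concrete, here is a minimal base R sketch. It rests on an assumption that is not confirmed by this manual: that the uncertainty of a CpG site is one minus its largest conditional probability across hidden states (the convention used in model-based clustering), and that sites below uncertainty_threshold are the ones treated as confidently assigned. The matrix cond_prob and its values are made up.

```r
# Toy conditional probabilities (made up): rows are CpG sites,
# columns are K = 3 hidden states, each row sums to 1.
cond_prob <- matrix(
  c(0.90, 0.05, 0.05,
    0.40, 0.35, 0.25,
    0.10, 0.85, 0.05,
    0.02, 0.03, 0.95),
  ncol = 3, byrow = TRUE
)

# Most probable hidden state per CpG site.
hidden_state <- max.col(cond_prob, ties.method = "first")

# Assumption: uncertainty = 1 - max conditional probability.
uncertainty <- 1 - apply(cond_prob, 1, max)

# Assumption: sites below the threshold count as confidently assigned.
uncertainty_threshold <- 0.2
confident <- uncertainty < uncertainty_threshold

# Per-state uncertainties, i.e. the groups a per-state boxplot would summarise.
print(split(uncertainty, hidden_state))
```

In this toy example the second site has an uncertainty of 0.6 and would fall above the default threshold of 0.2, while the other three sites fall below it.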
Koyel Majumdar
## Use simulated data for the betaHMM workflow example
set.seed(12345)
## read files
data(sample_methylation_file)
data(sample_annotation_file)
# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50,],
                    sample_annotation_file[1:50,],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign","Tumour"))
## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)
# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)
# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)
## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
A dataset containing a subset of the manifest data from the Illumina MethylationEPIC beadchip array. A subset of the complete dataset has been uploaded in the package for testing purposes. The complete dataset is available on GitHub.
data(sample_annotation_file)
A data frame with 100 rows and 9 columns.
IlmnID: The unique identifier from the Illumina CG database, i.e. the probe ID.
Genome_Build: The genome build referenced by the Infinium MethylationEPIC manifest.
CHR: The chromosome containing the CpG (Genome_Build = 37).
MAPINFO: The chromosomal coordinates of the CpG sites.
UCSC_RefGene_Name: The target gene name(s), from the UCSC database. Note: multiple listings of the same gene name indicate splice variants.
UCSC_RefGene_Accession: The UCSC accession numbers of the target transcripts. Accession numbers are in the same order as the target gene transcripts.
UCSC_RefGene_Group: Gene region feature category describing the CpG position, from UCSC. Features are listed in the same order as the target gene transcripts.
UCSC_CpG_Islands_Name: The chromosomal coordinates of the CpG Island from UCSC.
Relation_to_UCSC_CpG_Island: The location of the CpG relative to the CpG island.
A data frame containing the array design for Illumina’s Human Methylation EPIC microarray for the simulated CpG sites. Based on the v1.0b2 version of the manifest file.
A dataset containing simulated beta methylation values from 2 sample types (Benign and Tumour), collected from 4 patients.
data(sample_methylation_file)
A data frame with 100 rows and 9 columns. The data contain no missing values.
IlmnID: The unique identifier from the Illumina CG database, i.e. the probe ID.
Benign_Patient_1: Methylation values from benign tissue from patient 1.
Benign_Patient_2: Methylation values from benign tissue from patient 2.
Benign_Patient_3: Methylation values from benign tissue from patient 3.
Benign_Patient_4: Methylation values from benign tissue from patient 4.
Tumour_Patient_1: Methylation values from tumor tissue from patient 1.
Tumour_Patient_2: Methylation values from tumor tissue from patient 2.
Tumour_Patient_3: Methylation values from tumor tissue from patient 3.
Tumour_Patient_4: Methylation values from tumor tissue from patient 4.
The array data were normalized, and probes located outside of CpG sites and on the sex chromosomes were filtered out. CpG sites with missing values were removed from the resulting dataset. A subset of the complete dataset has been uploaded in the package for testing purposes. The complete dataset is available on GitHub.
The data frame containing simulated methylation values.
A function to summarize the betaHMMResults, dmcResults or dmrResults objects.
summary(object, ...)

## S4 method for signature 'betaHMMResults'
summary(object, ...)

## S4 method for signature 'dmcResults'
summary(object, ...)

## S4 method for signature 'dmrResults'
summary(object, ...)
object |
An object of class |
... |
Additional arguments |
Summary of the 'betaHMMResults', 'dmcResults' or 'dmrResults' object.
Koyel Majumdar
betaHMM, dmc_identification, dmr_identification
## Use simulated data for the betaHMM workflow example
set.seed(12345)
## read files
data(sample_methylation_file)
data(sample_annotation_file)
# Run betaHMM function
beta_out <- betaHMM(sample_methylation_file[1:50,],
                    sample_annotation_file[1:50,],
                    M = 3, N = 4, R = 2, iterations = 2,
                    parallel_process = FALSE, seed = 12345,
                    treatment_group = c("Benign","Tumour"))
## Run dmc_identification function
dmc_out <- dmc_identification(beta_out)
# Run dmr_identification function
dmr_out <- dmr_identification(dmc_out, parallel_process = FALSE)
# Plot functions
# Get the AUC values calculated for each hidden state
AUC_chr <- AUC(dmc_out)
## plot the uncertainty for each hidden state
plot(beta_out, chromosome = "1", what = "uncertainty")
The supported classes are matrix and data.frame. The output of threshold_identification is an S4 object of class threshold_Results.
threshold_identification(object1, ...)

## S4 method for signature 'matrix'
threshold_identification(
  object1,
  package_workflow = TRUE,
  annotation_file = NULL,
  M = 3,
  N = 4,
  parameter_estimation_only = FALSE,
  seed = NULL,
  ...
)

## S4 method for signature 'data.frame'
threshold_identification(
  object1,
  package_workflow = TRUE,
  annotation_file = NULL,
  M = 3,
  N = 4,
  parameter_estimation_only = FALSE,
  seed = NULL,
  ...
)
object1 |
Methylation data and IlmnID. May be provided as a matrix or data frame. |
... |
Extra parameters. |
package_workflow |
Flag set to TRUE if method called from package workflow. If set to FALSE then the parameter annotation_file needs to be supplied to the function. |
annotation_file |
A dataframe containing the EPIC methylation annotation file. |
M |
Number of methylation states to be identified in a single DNA sample. |
N |
Number of DNA samples (patients/replicates) collected for each treatment group. |
parameter_estimation_only |
If only model parameters are to be estimated then value is TRUE else FALSE. |
seed |
Seed to allow for reproducibility (default = NULL). |
An S4 object of class threshold_Results.
## Use simulated data for the betaHMM workflow example
set.seed(12345)
library(betaHMM)
## read files
data(sample_methylation_file)
head(sample_methylation_file)
data(sample_annotation_file)
head(sample_annotation_file)
## merge data
df <- merge(sample_annotation_file[, c('IlmnID', 'CHR', 'MAPINFO')],
            sample_methylation_file, by = 'IlmnID')
## sort data by chromosome and coordinate
df <- df[order(df$CHR, df$MAPINFO), ]
## keep IlmnID and the four Benign columns; N is named explicitly
thr_out <- threshold_identification(df[, c(1, 4:7)],
                                    package_workflow = TRUE,
                                    M = 3, N = 4,
                                    parameter_estimation_only = TRUE,
                                    seed = 12345)
HMM for beta valued DNA data for a single treatment condition
threshold_identification_run(
  data,
  package_workflow = TRUE,
  annotation_file = NULL,
  M,
  N,
  parameter_estimation_only = FALSE,
  seed = NULL,
  ...
)
data |
Methylation data and IlmnID. May be provided as a matrix or data frame. |
package_workflow |
Flag set to TRUE if method called from package workflow. If set to FALSE then the parameter annotation_file needs to be supplied to the function. |
annotation_file |
A dataframe containing the EPIC methylation annotation file. |
M |
Number of methylation states to be identified in a single DNA sample. |
N |
Number of DNA samples (patients/replicates) collected for each treatment group. |
parameter_estimation_only |
If only model parameters are to be estimated then value is TRUE else FALSE. |
seed |
Seed to allow for reproducibility (default = NULL). |
... |
Extra parameters. |
An S4 object of class threshold_Results, where the conditional probabilities of each CpG site belonging to one of the methylation states are stored as a SimpleList of assay data, and the corresponding estimated model parameters, the thresholds and the most probable hidden state sequence for each chromosome are stored as metadata.
## Use simulated data for the betaHMM workflow example
set.seed(12345)
library(betaHMM)
## read files
data(sample_methylation_file)
head(sample_methylation_file)
data(sample_annotation_file)
head(sample_annotation_file)
## merge data
df <- merge(sample_annotation_file[, c('IlmnID', 'CHR', 'MAPINFO')],
            sample_methylation_file, by = 'IlmnID')
## sort data by chromosome and coordinate
df <- df[order(df$CHR, df$MAPINFO), ]
## keep IlmnID and the four Benign columns; N is named explicitly
thr_out <- threshold_identification(df[, c(1, 4:7)],
                                    package_workflow = TRUE,
                                    M = 3, N = 4,
                                    parameter_estimation_only = TRUE,
                                    seed = 12345)
threshold_Results is a subclass of RangedSummarizedExperiment, used to store the threshold_identification results as well as the annotated data useful for plotting.
threshold_Results(SummarizedExperiment, annotatedData)
SummarizedExperiment |
a |
annotatedData |
The annotated data passed as an input argument to
the |
This constructor function would not typically be used by "end users". This simple class extends the RangedSummarizedExperiment class of the SummarizedExperiment package to allow other packages to write methods for results objects from the threshold_identification function. It is used internally to wrap up the results table.
a threshold_Results object
## Use simulated data for the betaHMM workflow example
set.seed(12345)
library(betaHMM)
## read files
data(sample_methylation_file)
head(sample_methylation_file)
data(sample_annotation_file)
head(sample_annotation_file)
## merge data
df <- merge(sample_annotation_file[, c('IlmnID', 'CHR', 'MAPINFO')],
            sample_methylation_file, by = 'IlmnID')
## sort data by chromosome and coordinate
df <- df[order(df$CHR, df$MAPINFO), ]
## keep IlmnID and the four Benign columns; N is named explicitly
thr_out <- threshold_identification(df[, c(1, 4:7)],
                                    package_workflow = TRUE,
                                    M = 3, N = 4,
                                    parameter_estimation_only = TRUE,
                                    seed = 12345)