Title: | Compare genomic intervals tool - Automated, complete, reproducible and clear report about genomic and epigenomic data sets |
---|---|
Description: | Biological studies often consist of multiple conditions which are examined with different laboratory setups such as RNA-sequencing or ChIP-sequencing. To give an overview of the whole resulting data set, Cogito provides an automated, complete, reproducible and clear report about all samples and basic comparisons between all different samples. This report can be used as documentation of the data set or as a starting point for further custom analysis. |
Authors: | Annika Bürger [cre, aut] |
Maintainer: | Annika Bürger <[email protected]> |
License: | LGPL-3 |
Version: | 1.13.0 |
Built: | 2024-10-30 05:21:06 UTC |
Source: | https://github.com/bioc/Cogito |
Aggregates multiple GRanges objects, together with their attached value columns (mcols), to the genes of a given organism or to the ranges of a given reference.
aggregateRanges(ranges, configfile = NULL, organism = NULL, referenceRanges = NULL, name = "", verbose = FALSE)
ranges |
list of GRanges, GRangesList or CompressedGRangesList with names in "RRBS|DNA|CNV|RNA|CHiP" |
configfile |
character, path to configuration file in json format |
organism |
TxDb or OrganismDb object, default value NULL |
referenceRanges |
list of GRanges, GRangesList or CompressedGRangesList with length one and name of reference, default value NULL |
name |
character, default value "" |
verbose |
logical, default value FALSE |
A list with three members: a GRanges object with one gene (or one range of the given reference) per row and one column per sample, the configuration information, and the name.
Annika Bürger
mm9 <- TxDb.Mmusculus.UCSC.mm9.knownGene::TxDb.Mmusculus.UCSC.mm9.knownGene

### small artificial example ###
ranges.RNA.control <-
  GRanges(seq = "chr10",
          IRanges(c(41023369, 41211825, 41528287, 41994926, 42301673,
                    43256520, 43618919, 49503584, 51349066, 52099001),
                  c(41023544, 41212385, 41528663, 41995357, 42302290,
                    43257075, 43619492, 49504033, 51349425, 52099521)),
          seqinfo = GenomeInfoDb::seqinfo(mm9),
          expr = runif(5, 0, 1))
ranges.RNA.condition <-
  GRanges(seq = "chr10",
          IRanges(c(41013942, 41208731, 41535166, 41999999, 42292275,
                    43256194, 43615562, 49497888, 51347046, 52092180),
                  c(41014274, 41209664, 41536039, 42000182, 42292965,
                    43256430, 43615866, 49498362, 51347969, 52092733)),
          seqinfo = GenomeInfoDb::seqinfo(mm9),
          expr = runif(5, 0, 1))
ranges.ChIP.control <-
  GRanges(seq = "chr10",
          IRanges(c(41022835, 41307587, 42197924, 42302387, 42893825,
                    43259749, 43620352, 43721891, 44248812, 45207572,
                    49508713, 51309978, 51348779, 52101900, 52265513),
                  c(41022954, 41307745, 42198201, 42302555, 42893974,
                    43259889, 43620604, 43722051, 44248920, 45207704,
                    49508859, 51310187, 51348921, 52102030, 52265689)),
          seqinfo = GenomeInfoDb::seqinfo(mm9),
          score = round(runif(15, 5, 90)))

example.dataset <- list(RNA = GRangesList(control = ranges.RNA.control,
                                          condition = ranges.RNA.condition),
                        ChIP = ranges.ChIP.control)

aggregated.ranges <- aggregateRanges(ranges = example.dataset,
                                     organism = mm9,
                                     name = "art.example",
                                     verbose = TRUE)
names(aggregated.ranges)
head(aggregated.ranges$genes)
This murine data from King et al. was downloaded from the NCBI GEO database under accession number GSE77004.
The available ChIP-seq data (GSE77002) was then processed as described in King et al.: after alignment with bowtie, using parameters that select uniquely mapped, best-matching reads with a maximum of two mismatches per read, peak calling was done with the HOMER findPeaks algorithm and an input control. Subsequently, the raw peaks were filtered with the following parameters: -F 8 for H3K4me3, -size 1000 -minDist 3000 -F 4 -tagThreshold 32 for H3K27me3, -F 4 for H3K27ac and -size 1000 -minDist 1000 -nfr for H3K4me1.
To reduce the storage size and complexity of the murine example data, only the data of chr5 was retained and four sample conditions (3x TFX and 1x mut) were removed.
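The restriction to a single chromosome described above can be sketched as follows. This is only an illustration on a small made-up GRanges object, assuming the GenomicRanges package is installed; it is not necessarily how the example data set was actually reduced.

```r
library(GenomicRanges)

# Toy ranges on two chromosomes; coordinates and scores are made up.
gr <- GRanges(
  seqnames = c("chr5", "chr10", "chr5"),
  ranges   = IRanges(start = c(100, 200, 300), width = 50),
  score    = c(1.5, 2.0, 3.5)
)

# Keep only chr5; pruning.mode = "coarse" drops the ranges that lie on
# the removed sequence levels instead of raising an error.
gr.chr5 <- keepSeqlevels(gr, "chr5", pruning.mode = "coarse")
gr.chr5
```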
MurEpi.ChIP.small
A GRanges object containing ranges with scores.
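A minimal sketch of loading and inspecting this data set, assuming the Cogito package is installed and that the object is available through `data()` (the usual mechanism for package data sets, but not stated explicitly here):

```r
library(Cogito)

# Assumption: the data set can be loaded by name from the Cogito package.
data("MurEpi.ChIP.small", package = "Cogito")

# Inspect the type and size of the loaded object before using it.
class(MurEpi.ChIP.small)
length(MurEpi.ChIP.small)
```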
NCBI GEO database accession number GSE77002 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE77002
King AD, Huang K, Rubbi L, Liu S, Wang CY, Wang Y, Pellegrini M, Fan G. Reversible Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells. Cell Rep. 2016 Sep 27;17(1):289-302. doi: 10.1016/j.celrep.2016.08.083
This murine data from King et al. was downloaded from the NCBI GEO database under accession number GSE77004.
The available RNA-seq RPKM values per gene from the same study are provided under the accession number GSE77003.
To reduce the storage size and complexity of the murine example data, only the data of chr5 was retained and four sample conditions (3x TFX and 1x mut) were removed.
MurEpi.RNA.small
A GRanges object containing ranges with expression values.
NCBI GEO database accession number GSE77003 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE77003
King AD, Huang K, Rubbi L, Liu S, Wang CY, Wang Y, Pellegrini M, Fan G. Reversible Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells. Cell Rep. 2016 Sep 27;17(1):289-302. doi: 10.1016/j.celrep.2016.08.083
This murine data from King et al. was downloaded from the NCBI GEO database under accession number GSE77004.
The methylation status, measured by RRBS, was similarly taken from the published files (accession number GSE84103), which contain the fraction of methylated cytosine for every CpG context supported by a minimum of 5 reads.
To reduce the storage size and complexity of the murine example data, only the data of chr5 was retained and four sample conditions (3x TFX and 1x mut) were removed.
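To make the coverage criterion concrete, the following base R sketch computes the fraction of methylated cytosine per CpG and keeps only positions supported by at least 5 reads. The table layout, column names and counts are hypothetical and serve only as illustration; the published files already contain the filtered fractions.

```r
# Hypothetical per-CpG read counts; column names are illustrative only.
cpg <- data.frame(
  chr          = c("chr5", "chr5", "chr5", "chr5"),
  pos          = c(3001L, 3050L, 3120L, 3200L),
  methylated   = c(4L, 1L, 9L, 0L),
  unmethylated = c(4L, 2L, 1L, 6L)
)

min.reads <- 5L

# Total number of reads covering each CpG.
cpg$coverage <- cpg$methylated + cpg$unmethylated

# Keep CpGs with sufficient support, then compute the methylated fraction.
supported <- cpg[cpg$coverage >= min.reads, ]
supported$fraction <- supported$methylated / supported$coverage
print(supported)
```

In this toy table the second CpG is covered by only 3 reads and is therefore dropped.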
MurEpi.RRBS.small
A GRanges object containing ranges with methylation status.
NCBI GEO database accession number GSE84103 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE84103
King AD, Huang K, Rubbi L, Liu S, Wang CY, Wang Y, Pellegrini M, Fan G. Reversible Regulation of Promoter and Enhancer Histone Landscape by DNA Methylation in Mouse Embryonic Stem Cells. Cell Rep. 2016 Sep 27;17(1):289-302. doi: 10.1016/j.celrep.2016.08.083
Summarizes GRanges and their attached value columns (mcols).
summarizeRanges(aggregated.ranges, outputFormat = "pdf", verbose = FALSE)
aggregated.ranges |
list of GRanges, configuration information, and name, for example the result of function Cogito::aggregateRanges |
outputFormat |
character, can be "pdf" or "html", default value "pdf" |
verbose |
logical, default value FALSE |
No return value, only side effects: creation of an Rmd file, a pdf or html report, and a data file (RData).
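Because the function only produces files, it can be useful to check afterwards what was written. The base R sketch below lists files with the extensions named above in a given directory. The helper `list_report_files` is hypothetical and not part of Cogito, and the file names in the demonstration are placeholders, since neither the output file names nor the output directory are stated here.

```r
# Hypothetical helper (not part of Cogito): list files whose extension
# matches one of the output types named above.
list_report_files <- function(dir = ".") {
  list.files(dir, pattern = "\\.(Rmd|pdf|html|RData)$", ignore.case = TRUE)
}

# Demonstration on a temporary directory with placeholder file names.
demo.dir <- tempfile("cogito-report-")
dir.create(demo.dir)
invisible(file.create(file.path(
  demo.dir, c("report.Rmd", "report.pdf", "report.RData", "notes.txt")
)))

found <- list_report_files(demo.dir)
print(found)
```

After a real call to summarizeRanges, the same helper could be pointed at the directory in which the report was written.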
Annika Bürger
mm9 <- TxDb.Mmusculus.UCSC.mm9.knownGene::TxDb.Mmusculus.UCSC.mm9.knownGene

### small artificial example ###
ranges.RNA.control <-
  GRanges(seq = "chr10",
          IRanges(c(41023369, 41211825, 41528287, 41994926, 42301673,
                    43256520, 43618919, 49503584, 51349066, 52099001),
                  c(41023544, 41212385, 41528663, 41995357, 42302290,
                    43257075, 43619492, 49504033, 51349425, 52099521)),
          seqinfo = GenomeInfoDb::seqinfo(mm9),
          expr = runif(5, 0, 1))
ranges.RNA.condition <-
  GRanges(seq = "chr10",
          IRanges(c(41013942, 41208731, 41535166, 41999999, 42292275,
                    43256194, 43615562, 49497888, 51347046, 52092180),
                  c(41014274, 41209664, 41536039, 42000182, 42292965,
                    43256430, 43615866, 49498362, 51347969, 52092733)),
          seqinfo = GenomeInfoDb::seqinfo(mm9),
          expr = runif(5, 0, 1))
ranges.ChIP.control <-
  GRanges(seq = "chr10",
          IRanges(c(41022835, 41307587, 42197924, 42302387, 42893825,
                    43259749, 43620352, 43721891, 44248812, 45207572,
                    49508713, 51309978, 51348779, 52101900, 52265513),
                  c(41022954, 41307745, 42198201, 42302555, 42893974,
                    43259889, 43620604, 43722051, 44248920, 45207704,
                    49508859, 51310187, 51348921, 52102030, 52265689)),
          seqinfo = GenomeInfoDb::seqinfo(mm9),
          score = round(runif(15, 5, 90)))

example.dataset <- list(RNA = GRangesList(control = ranges.RNA.control,
                                          condition = ranges.RNA.condition),
                        ChIP = ranges.ChIP.control)

aggregated.ranges <- aggregateRanges(ranges = example.dataset,
                                     organism = mm9,
                                     name = "art.example",
                                     verbose = TRUE)

# adding information about conditions
aggregated.ranges$config$conditions <-
  list(condition = c("RNA.condition.expr"),
       control = c("RNA.control.expr", "ChIP.score"))

summarizeRanges(aggregated.ranges = aggregated.ranges,
                outputFormat = "pdf",
                verbose = TRUE)