Title: | Visualise ChIP-seq, MNase-seq and motif occurrence as aggregate plots Summarised Over Grouped Genomic Intervals |
---|---|
Description: | The soGGi package provides a toolset to create genomic interval aggregate/summary plots of signal or motif occurence from BAM and bigWig files as well as PWM, rlelist, GRanges and GAlignments Bioconductor objects. soGGi allows for normalisation, transformation and arithmetic operation on and between summary plot objects as well as grouping and subsetting of plots by GRanges objects and user supplied metadata. Plots are created using the GGplot2 libary to allow user defined manipulation of the returned plot object. Coupled together, soGGi features a broad set of methods to visualise genomics data in the context of groups of genomic intervals such as genes, superenhancers and transcription factor binding events. |
Authors: | Gopuraja Dharmalingam, Doug Barrows, Tom Carroll |
Maintainer: | Tom Carroll <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.39.0 |
Built: | 2024-11-30 04:50:58 UTC |
Source: | https://github.com/bioc/soGGi |
Join, subset and manipulate ChIPprofile objects
## S4 method for signature 'ChIPprofile' c(x, ..., recursive = FALSE) ## S4 method for signature 'ChIPprofile' rbind(x, ..., deparse.level = 1) ## S4 method for signature 'ChIPprofile' cbind(x, ..., deparse.level = 1) ## S4 method for signature 'ChIPprofile,ANY,missing' x[[i, j, ...]] ## S4 method for signature 'ChIPprofile' x$name
## S4 method for signature 'ChIPprofile' c(x, ..., recursive = FALSE) ## S4 method for signature 'ChIPprofile' rbind(x, ..., deparse.level = 1) ## S4 method for signature 'ChIPprofile' cbind(x, ..., deparse.level = 1) ## S4 method for signature 'ChIPprofile,ANY,missing' x[[i, j, ...]] ## S4 method for signature 'ChIPprofile' x$name
j |
Should be missing |
... |
objects to be concatenated. |
recursive |
logical. If |
deparse.level |
See |
x |
object from which to extract element(s) or in which to replace element(s). |
i |
indices specifying elements to extract or replace. Indices are
For When indexing arrays by An index value of |
name |
A literal character string or a name (possibly backtick
quoted). For extraction, this is normally (see under
‘Environments’) partially matched to the |
A ChIPprofile object
data(chipExampleBig) x <- c(chipExampleBig[[1]],chipExampleBig[[2]]) y <- rbind(chipExampleBig[[1]],chipExampleBig[[2]])
data(chipExampleBig) x <- c(chipExampleBig[[1]],chipExampleBig[[2]]) y <- rbind(chipExampleBig[[1]],chipExampleBig[[2]])
This dataset contains peaks from ChIP-signal over genes
data(chipExampleBig)
data(chipExampleBig)
ChIPprofiles
A ChIPprofile object
Manual for soggi and ChIPprofile object
The soggi function is the constructor for ChIPprofile objects.
regionPlot(bamFile, testRanges, samplename = NULL, nOfWindows = 100, FragmentLength = 150, style = "point", distanceAround = NULL, distanceUp = NULL, distanceDown = NULL, distanceInRegionStart = NULL, distanceOutRegionStart = NULL, distanceInRegionEnd = NULL, distanceOutRegionEnd = NULL, paired = FALSE, normalize = "RPM", plotBy = "coverage", removeDup = FALSE, verbose = TRUE, format = "bam", seqlengths = NULL, forceFragment = NULL, method = "bin", genome = NULL, cutoff = 80, downSample = NULL, minFragmentLength = NULL, maxFragmentLength = NULL)
regionPlot(bamFile, testRanges, samplename = NULL, nOfWindows = 100, FragmentLength = 150, style = "point", distanceAround = NULL, distanceUp = NULL, distanceDown = NULL, distanceInRegionStart = NULL, distanceOutRegionStart = NULL, distanceInRegionEnd = NULL, distanceOutRegionEnd = NULL, paired = FALSE, normalize = "RPM", plotBy = "coverage", removeDup = FALSE, verbose = TRUE, format = "bam", seqlengths = NULL, forceFragment = NULL, method = "bin", genome = NULL, cutoff = 80, downSample = NULL, minFragmentLength = NULL, maxFragmentLength = NULL)
bamFile |
Character vector for location of BAM file or bigWig, an rleList or PWM matrix. |
testRanges |
GRanges object or character vector of BED file location of regions to plot. |
samplename |
Character vector of sample name. Default is NULL. |
nOfWindows |
Number of windows to bin regions into for coverage calculations (Default 100) |
FragmentLength |
Integer vector Predicted or expected fragment length. |
style |
"Point" for per base pair plot, "percentOfRegion" for normalised length and "region" for combined plot |
distanceAround |
Distance around centre of region to be used for plotting |
distanceUp |
Distance upstream from centre of region to be used for plotting |
distanceDown |
Distance downstream from centre of region to be used for plotting |
distanceInRegionStart |
Distance into region start (5' for Watson/positive strand or notspecified strand Regions,3' for Crick/negatie strand regions) for plotting. |
distanceOutRegionStart |
Distance out from region start (5' for Watson/positive strand or notspecified strand Regions,3' for Crick/negatie strand regions) for plotting. |
distanceInRegionEnd |
Distance into region end (3' for Watson/positive strand or notspecified strand Regions,5' for Crick/negatie strand regions) for plotting. |
distanceOutRegionEnd |
Distance out from region end (3' for Watson/positive strand or notspecified strand Regions,5' for Crick/negatie strand regions) for plotting. |
paired |
Is data paired end |
normalize |
Calculate coverage as RPM. Presently only RPM available. |
plotBy |
Score to be used for plotting. Presently only coverage. |
removeDup |
Remove duplicates before calculating coverage. |
verbose |
TRUE or FALSE |
format |
character vector of "BAM", "BigWig", "RleList" or "PWM" |
seqlengths |
Chromosomes to be used. If missing will report all. |
forceFragment |
Centre fragment and force consistent fragment width. |
method |
Character vector of value "bp","bin" or "spline". The bin method divides a region of interest into equal sized bins of number specified in nOfWindows. Coverage or counts are then summarised within these windows. The spline method creates a spline with the number of spline points as specified in nOfWindows argument. |
downSample |
Down sample BAM reads to this proportion of orginal. |
genome |
BSGenome object to be used when using PWM input. |
cutoff |
Cut-off for idnetifying motifs when using PWM input. |
minFragmentLength |
Remove fragments smaller than this. |
maxFragmentLength |
Remove fragments larger than this. |
ChIPprofile A ChIPprofile object.
See http://bioinformatics.csc.mrc.ac.uk for more details on soGGi workflows
data(chipExampleBig) chipExampleBig
data(chipExampleBig) chipExampleBig
Plot coverage of points or regions.
Returns summits and summmit scores after optional fragment length prediction and read extension
findconsensusRegions(testRanges, bamFiles = NULL, method = "majority", summit = "mean", resizepeak = "asw", overlap = "any", fragmentLength = NULL, NonPrimaryPeaks = list(withinsample = "drop", betweensample = "mean")) summitPipeline(reads, peakfile, fragmentLength, readlength)
findconsensusRegions(testRanges, bamFiles = NULL, method = "majority", summit = "mean", resizepeak = "asw", overlap = "any", fragmentLength = NULL, NonPrimaryPeaks = list(withinsample = "drop", betweensample = "mean")) summitPipeline(reads, peakfile, fragmentLength, readlength)
testRanges |
Named character vector of region locations |
bamFiles |
Named character vector of bamFile locations |
method |
Method to select reproducible summits to merge. |
summit |
Only mean avaialble |
resizepeak |
Only asw available |
overlap |
Type of overlap to consider for finding consensus sites |
fragmentLength |
Predicted fragment length. Set to NULL to auto-calculate |
NonPrimaryPeaks |
A list of parameters to deal with non primary peaks in consensus regions. |
peakfile |
GRanges of genomic intervals to summit. |
reads |
Character vector of bamFile location or GAlignments object |
readlength |
Read length of alignments. |
Consensus A GRanges object of consensus regions with consensus summits.
Summits A GRanges object of summits and summit scores.
Create GRangeslist from all combinations of GRanges
groupByOverlaps(testRanges)
groupByOverlaps(testRanges)
testRanges |
A named list of GRanges or a named GRangesList |
groupedGRanges A named GRangesList object.
data(ik_Example) gts <- groupByOverlaps(ik_Example)
data(ik_Example) gts <- groupByOverlaps(ik_Example)
This dataset contains peaks from Ikaros ChIP by two antibodies
data(ik_Example)
data(ik_Example)
Ikpeaksets
A list containing two GRanges objects
This dataset contains signal over peaks from Ikaros ChIP by two antibodies
data(ik_Profiles)
data(ik_Profiles)
ik_Profiles
A ChIPprofile object
Various normalisation methods for ChIPprofile objects
## S4 method for signature 'ChIPprofile' normalise(object) ## S4 method for signature 'ChIPprofile,character,numeric' normalise(object = "ChIPprofile", method = "rpm", normFactors = NULL)
## S4 method for signature 'ChIPprofile' normalise(object) ## S4 method for signature 'ChIPprofile,character,numeric' normalise(object = "ChIPprofile", method = "rpm", normFactors = NULL)
object |
A ChIPprofile object |
method |
A character vector specifying normalisation method. Currently "rpm" for normalising signal for BAM to total reads, "quantile" to quantile normalise across samples, "signalInRegion" to normalise to proportion of signal within intervals, "normaliseSample" to normalise across samples and "normaliseRegions" to apply a normalisation across intervals. |
normFactors |
A numeric vector used to scale columns or rows. |
A ChIPprofile object
Thomas Carroll
data(chipExampleBig) normalise(chipExampleBig,method="quantile",normFactors=1)
data(chipExampleBig) normalise(chipExampleBig,method="quantile",normFactors=1)
Quantile normalisation across bins/regions.
## S4 method for signature 'ChIPprofile' normaliseQuantiles(object) ## S4 method for signature 'ChIPprofile' normaliseQuantiles(object = "ChIPprofile")
## S4 method for signature 'ChIPprofile' normaliseQuantiles(object) ## S4 method for signature 'ChIPprofile' normaliseQuantiles(object = "ChIPprofile")
object |
A ChIPprofile object |
A ChIPprofile object containing normalised data
Thomas Carroll
data(chipExampleBig) normaliseQuantiles(chipExampleBig)
data(chipExampleBig) normaliseQuantiles(chipExampleBig)
Arithmetic operations
## S4 method for signature 'ChIPprofile,ChIPprofile' Ops(e1, e2) ## S4 method for signature 'ChIPprofile,numeric' Ops(e1, e2) ## S4 method for signature 'numeric,ChIPprofile' Ops(e1, e2) ## S4 method for signature 'ChIPprofile' mean(x, ...) ## S4 method for signature 'ChIPprofile' log2(x) ## S4 method for signature 'ChIPprofile' log(x, base = exp(1))
## S4 method for signature 'ChIPprofile,ChIPprofile' Ops(e1, e2) ## S4 method for signature 'ChIPprofile,numeric' Ops(e1, e2) ## S4 method for signature 'numeric,ChIPprofile' Ops(e1, e2) ## S4 method for signature 'ChIPprofile' mean(x, ...) ## S4 method for signature 'ChIPprofile' log2(x) ## S4 method for signature 'ChIPprofile' log(x, base = exp(1))
e1 |
ChIPprofile object |
e2 |
ChIPprofile object |
x |
objects. |
... |
further arguments passed to methods. |
base |
a positive or complex number: the base with respect to which
logarithms are computed. Defaults to |
A ChIPprofile object of result of arithmetic operation.
data(chipExampleBig) chipExampleBig[[1]] + chipExampleBig[[2]]
data(chipExampleBig) chipExampleBig[[1]] + chipExampleBig[[2]]
Set strand by overlapping or nearest anchor GRanges
orientBy(testRanges, anchorRanges)
orientBy(testRanges, anchorRanges)
testRanges |
The GRanges object to anchor. |
anchorRanges |
A GRanges object by which to anchor strand orientation. |
newRanges A GRanges object.
data(ik_Example) strand(ik_Example[[1]]) <- "+" anchoredGRanges <- orientBy(ik_Example[[2]],ik_Example[[1]])
data(ik_Example) strand(ik_Example[[1]]) <- "+" anchoredGRanges <- orientBy(ik_Example[[2]],ik_Example[[1]])
A function to plot regions
## S4 method for signature 'ChIPprofile' plotRegion(object, gts,sampleData,groupData,summariseBy, colourBy,lineBy,groupBy, plotregion,outliers,freeScale) ## S4 method for signature 'ChIPprofile' plotRegion(object = "ChIPprofile", gts = NULL, sampleData = NULL, groupData = NULL, summariseBy = NULL, colourBy = NULL, lineBy = NULL, groupBy = NULL, plotregion = "full", outliers = NULL, freeScale = FALSE)
## S4 method for signature 'ChIPprofile' plotRegion(object, gts,sampleData,groupData,summariseBy, colourBy,lineBy,groupBy, plotregion,outliers,freeScale) ## S4 method for signature 'ChIPprofile' plotRegion(object = "ChIPprofile", gts = NULL, sampleData = NULL, groupData = NULL, summariseBy = NULL, colourBy = NULL, lineBy = NULL, groupBy = NULL, plotregion = "full", outliers = NULL, freeScale = FALSE)
object |
A ChIPprofile object |
gts |
A list of character vectors or GRangesList |
plotregion |
region to plot. For combined plots with style "region", may be "start" or "end" to show full resolution of plot of edges. |
groupData |
Dataframe of metadata for groups |
sampleData |
Dataframe of metadata for sample |
summariseBy |
Column names from GRanges elementmetadata. Formula or character vector of column names to use to collapse genomic ranges to summarised profiles. summariseBy can not be used injustion with groups specified by gts argument. |
colourBy |
Character vector or formula of either column names from colData(object) containing sample metadata or character vector "group" to colour by groups in gts |
lineBy |
Character vector or formula of either column names from colData(object) containing sample metadata or character vector "group" to set line type by groups in gts |
groupBy |
Character vector or formula of either column names from colData(object) containing sample metadata or character "group" to colour by groups in gts |
outliers |
A numeric vector of length 1 containing proportion from limits to windsorise.] |
freeScale |
TRUE or FALSE to set whether ggplot 2 facets have their own scales. Useful for comparing multiple samples of differing depths without normalisation. Default is FALSE. |
A gg object from ggplot2
Thomas Carroll
data(chipExampleBig) plotRegion(chipExampleBig[[2]])
data(chipExampleBig) plotRegion(chipExampleBig[[2]])
This dataset contains an rlelist of motif coverage
data(pwmCov)
data(pwmCov)
pwmCov
A rlelist of motif coverage
Creates rlelist of pwm hits.
Motif score as an RLElist
pwmToCoverage(pwm, genome, min = "70%", removeRand = FALSE, chrsOfInterest = NULL) makeMotifScoreRle(pwm, regions, genome, extend, removeRand = FALSE, strandScore = "mean", atCentre = FALSE)
pwmToCoverage(pwm, genome, min = "70%", removeRand = FALSE, chrsOfInterest = NULL) makeMotifScoreRle(pwm, regions, genome, extend, removeRand = FALSE, strandScore = "mean", atCentre = FALSE)
pwm |
A PWM matrix object. |
genome |
A BSgenome object |
min |
pwm score (as percentage of maximum score) cutoff |
removeRand |
Remove contigs with rand string |
chrsOfInterest |
Chromosomes to use |
regions |
GRanges object to include in pwm rlelist |
extend |
bps to extend regions by |
strandScore |
Method for averaging strand. Options are max, mean, sum, bothstrands |
atCentre |
TRUE/FALSE. TRUE assigns score onto 1bp position at centre of motif. FALSE assigns every basepair the sum of scores of all overlapping motifs. |
A RLElist of motif density per base pair to be used as input to main soggi function.
Thomas Carroll
data(pwmCov) data(singleGRange)
data(pwmCov) data(singleGRange)
This dataset contains an rlelist of motif coverage
data(singleGRange)
data(singleGRange)
singleGRange
A single GRanges used in motif coverage example/