Package: GenomicDistributions 1.13.0

Kristyna Kupkova

GenomicDistributions: fast analysis of genomic intervals with Bioconductor

If you have a set of genomic ranges, this package can help you visualize and compare them. It produces several kinds of plots, for example: chromosome distribution plots, which visualize how your regions are distributed over chromosomes; feature distance distribution plots, which visualize how your regions are distributed relative to a feature of interest, such as transcription start sites (TSSs); and genomic partition plots, which visualize how your regions overlap given genomic features such as promoters, introns, exons, or intergenic regions. It also makes it easy to compare one set of ranges to another.

Authors: Kristyna Kupkova [aut, cre], Jose Verdezoto [aut], Tessa Danehy [aut], John Lawson [aut], Michal Stolarczyk [aut], Jason Smith [aut], Bingjie Xue [aut], Sophia Rogers [aut], John Stubbs [aut], Nathan C. Sheffield [aut]

GenomicDistributions_1.13.0.tar.gz
GenomicDistributions_1.13.0.zip (r-4.5), GenomicDistributions_1.13.0.zip (r-4.4), GenomicDistributions_1.13.0.zip (r-4.3)
GenomicDistributions_1.13.0.tgz (r-4.4-any), GenomicDistributions_1.13.0.tgz (r-4.3-any)
GenomicDistributions_1.13.0.tar.gz (r-4.5-noble), GenomicDistributions_1.13.0.tar.gz (r-4.4-noble)
GenomicDistributions_1.13.0.tgz (r-4.4-emscripten), GenomicDistributions_1.13.0.tgz (r-4.3-emscripten)
GenomicDistributions.pdf | GenomicDistributions.html
GenomicDistributions/json (API)
NEWS

# Install 'GenomicDistributions' in R:
install.packages('GenomicDistributions', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))
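
Once the package is installed, the three plot types described above can be sketched roughly as follows. This is a minimal sketch, not an official example: it assumes the bundled `vistaEnhancers` example regions use hg19 coordinates and that the `*Ref` helpers accept the query followed by an assembly name. The whole block is wrapped in a `requireNamespace` guard so it does nothing when the package is missing.

```r
# Minimal usage sketch; assumes GenomicDistributions is installed and that
# the bundled 'vistaEnhancers' example regions are in hg19 coordinates.
if (requireNamespace("GenomicDistributions", quietly = TRUE)) {
  library(GenomicDistributions)

  # Example query regions shipped with the package (a GRanges object).
  data("vistaEnhancers", package = "GenomicDistributions")
  query <- vistaEnhancers

  # Chromosome distribution: count query regions in bins along each chromosome.
  chromBins <- calcChromBinsRef(query, "hg19")
  chromPlot <- plotChromBins(chromBins)

  # Feature distance distribution: distance from each region to the nearest TSS.
  tssDist <- calcFeatureDistRefTSS(query, "hg19")
  tssPlot <- plotFeatureDist(tssDist, featureName = "TSS")

  # Genomic partitions: overlap with promoters, exons, introns, and so on.
  partitions <- calcPartitionsRef(query, "hg19")
  partitionPlot <- plotPartitions(partitions)
}
```

The plot objects are stored rather than printed; calling `print(chromPlot)` (or typing the name at the console) should draw the corresponding figure.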

Bug tracker: https://github.com/databio/genomicdistributions/issues

On Bioconductor: GenomicDistributions-1.13.0 (bioc 3.20), GenomicDistributions-1.12.0 (bioc 3.19)

bioconductor-package

43 exports 1.58 score 60 dependencies 1 mentions

Last updated 2 months ago from: 134482eeac

Exports: binBSGenome, binChroms, binRegion, calcChromBins, calcChromBinsRef, calcCumulativePartitions, calcCumulativePartitionsRef, calcDinuclFreq, calcDinuclFreqRef, calcExpectedPartitions, calcExpectedPartitionsRef, calcFeatureDist, calcFeatureDistRefTSS, calcGCContent, calcGCContentRef, calcNearestNeighbors, calcNeighborDist, calcPartitions, calcPartitionsRef, calcSummarySignal, calcWidth, dtToGr, genomePartitionList, getChromSizes, getChromSizesFromFasta, getGeneModels, getGeneModelsFromGTF, getGenomeBins, getTssFromGTF, loadBSgenome, loadEnsDb, nlist, plotChromBins, plotCumulativePartitions, plotDinuclFreq, plotExpectedPartitions, plotFeatureDist, plotGCContent, plotNeighborDist, plotPartitions, plotQTHist, plotSummarySignal, retrieveFile

Dependencies: askpass, backports, BiocGenerics, Biostrings, broom, cli, colorspace, cpp11, crayon, curl, data.table, dplyr, fansi, farver, generics, GenomeInfoDb, GenomeInfoDbData, GenomicRanges, ggplot2, glue, gtable, httr, IRanges, isoband, jsonlite, labeling, lattice, lifecycle, magrittr, MASS, Matrix, mgcv, mime, munsell, nlme, openssl, pillar, pkgconfig, plyr, purrr, R6, RColorBrewer, Rcpp, reshape2, rlang, S4Vectors, scales, stringi, stringr, sys, tibble, tidyr, tidyselect, UCSC.utils, utf8, vctrs, viridisLite, withr, XVector, zlibbioc

Getting started with GenomicDistributions

Rendered from intro.Rmd using knitr::rmarkdown on Jun 20 2024.

Last update: 2022-01-28
Started: 2018-03-05

Full power GenomicDistributions

Rendered from full-power.Rmd using knitr::rmarkdown on Jun 20 2024.

Last update: 2023-05-23
Started: 2020-05-07

Readme and manuals

Help Manual

Help page | Topics
Checks to make sure a package object is installed, and if so, returns it. If the library is not installed, it issues a warning and returns NULL. | .requireAndReturn
Checks class of the list of variables. To be used in functions. | .validateInputs
Bins a BSgenome object. | binBSGenome
Naively splits a chromosome into bins | binChroms
Divide regions into roughly equal bins | binRegion
Converts a list of data.tables (from BSreadbeds) into GRanges. | BSdtToGRanges
Calculates the distribution of a query set over the genome | calcChromBins
Returns the distribution of query over a reference assembly. Given a query set of elements (a GRanges object) and a reference assembly (e.g. 'hg38'), this will aggregate and count the distribution of the query elements across bins of the reference genome. This is a helper function to create features for common genomes. It is a wrapper of 'calcChromBins', which is more general. | calcChromBinsRef
Returns the distribution of query over a reference assembly. Given a query set of elements (a GRanges object) and a reference assembly (e.g. 'hg38'), this will aggregate and count the distribution of the query elements across bins of the reference genome. This is a helper function to create features for common genomes. It is a wrapper of 'calcChromBins', which is more general. | calcChromBinsRefSlow
Calculates the cumulative distribution of overlaps between query and arbitrary genomic partitions | calcCumulativePartitions
Calculates the cumulative distribution of overlaps for a query set to a reference assembly | calcCumulativePartitionsRef
Calculate dinucleotide content over genomic ranges | calcDinuclFreq
Calculate dinucleotide content over genomic ranges | calcDinuclFreqRef
Calculates expected partition overlap based on contribution of each feature (partition) to genome size. Expected and observed overlaps are then compared. | calcExpectedPartitions
Calculates the distribution of observed versus expected overlaps for a query set to a reference assembly | calcExpectedPartitionsRef
Find the distance to the nearest genomic feature | calcFeatureDist
Calculates the distribution of distances from a query set to closest TSS | calcFeatureDistRefTSS
Calculate GC content over genomic ranges | calcGCContent
Calculate GC content over genomic ranges | calcGCContentRef
Group regions from the same chromosome together and compute the distance of a region to its nearest neighbor. Distances are then lumped into a numeric vector. | calcNearestNeighbors
Group regions from the same chromosome together and calculate the distances of a region to its upstream and downstream neighboring regions. Distances are then lumped into a numeric vector. | calcNeighborDist
Calculates the distribution of overlaps between query and arbitrary genomic partitions | calcPartitions
Calculates the distribution of overlaps for a query set to a reference assembly | calcPartitionsRef
The function calcSummarySignal takes the input BED file(s) in the form of a GRanges or GRangesList object, overlaps them with all defined open chromatin regions across conditions (e.g. cell types), and returns a matrix in which each row is an input genomic region (if an overlap was found), each column is a condition, and each value is the mean signal from regions where an overlap was found. | calcSummarySignal
Calculate the widths of regions | calcWidth
Table that maps cell types to tissues and groups | cellTypeMetadata
hg19 chromosome sizes | chromSizes_hg19
Converts a data.table (DT) object to a GenomicRanges (GR) object. Tries to be intelligent, guessing chr and start, but you have to supply end or other columns if you want them to be carried into the GR. | dtToGr
Two utility functions for converting data.tables into GRanges objects | dtToGrInternal
A dataset containing a subset of open chromatin regions across all cell types defined by ENCODE for Homo sapiens hg19 | exampleOpenSignalMatrix_hg19
hg19 gene models | geneModels_hg19
Create a basic genome partition list of genes, exons, introns, UTRs, and intergenic regions | genomePartitionList
Produces summaries and plots of features distributed across genomes | GenomicDistributions
Returns built-in chrom sizes for a given reference assembly | getChromSizes
Get chromosome sizes from a remote or local FASTA file | getChromSizesFromFasta
Returns built-in gene models for a given reference assembly | getGeneModels
Get gene models from a remote or local GTF file | getGeneModelsFromGTF
Returns bins used in the `calcChromBins` function. Given a named vector of chromosome sizes, the function returns a GRangesList object with bins for each chromosome. | getGenomeBins
Get reference data for a specified assembly | getReferenceData
Get transcription start sites (TSSs) from a remote or local GTF file | getTssFromGTF
Convert a GenomicRanges into a data.table. | grToDt
Creates labels based on a discretization definition. | labelCuts
Loads BSgenome objects from UCSC-style character vectors. | loadBSgenome
Load selected EnsDb library | loadEnsDb
Internal helper function to calculate distance between neighboring regions. | neighbordt
Nathan's magical named list function. This function is a drop-in replacement for the base list() function, which automatically names your list according to the names of the variables used to construct it. It seamlessly handles lists with some names and others absent, not overwriting specified names while naming any unnamed parameters. Took me awhile to figure this out. | nlist
Plot distribution over chromosomes | plotChromBins
Plot the cumulative distribution of regions in features | plotCumulativePartitions
Plot dinucleotide content within region set(s) | plotDinuclFreq
Produces a barplot showing how query regions of interest are distributed relative to the expected distribution across a given partition list | plotExpectedPartitions
Plots a histogram of distances to genomic features | plotFeatureDist
Plots a density distribution of GC vectors. Given results from the 'calcGCContent' function, this will produce a density plot. | plotGCContent
Plot the distances from regions to their upstream/downstream neighbors or nearest neighbors. Distances can be passed as either raw bp or corrected for the number of regions (log10(obs/exp)), but this has to be specified in the function parameters. | plotNeighborDist
Produces a barplot showing how query regions of interest are distributed across a given partition list | plotPartitions
Plot quantile-trimmed histogram | plotQTHist
The function plotSummarySignal visualizes the signalSummaryMatrix obtained from 'calcSummarySignal'. | plotSummarySignal
Read local or remote file | retrieveFile
Example BED file read with rtracklayer::import | setB_100
Efficiently split a data.table by a column in the table | splitDataTable
Clear ggplot facet label. | theme_blank_facet_label
hg19 TSS locations | TSS_hg19
Example BED file read with rtracklayer::import | vistaEnhancers
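
The `nlist` help topic describes a `list()` replacement that names unnamed elements after the variables passed in, while leaving explicit names alone. The idea can be illustrated in base R. The function below, under the hypothetical name `namedList`, is a sketch of that behavior written for this page; it is not the package's implementation and may differ from `nlist` in edge cases.

```r
# Hypothetical illustration of the behavior described for nlist();
# this is NOT the package's source code.
namedList <- function(...) {
  values <- list(...)

  # Capture the unevaluated argument expressions and deparse them to text,
  # so a bare variable `a` yields the fallback name "a".
  exprs <- as.list(substitute(list(...)))[-1]
  fallback <- vapply(
    exprs,
    function(e) paste(deparse(e), collapse = ""),
    character(1),
    USE.NAMES = FALSE
  )

  # Keep names the caller supplied; fill in only the missing ones.
  given <- names(values)
  if (is.null(given)) given <- rep("", length(values))
  names(values) <- ifelse(given == "", fallback, given)
  values
}

a <- 1
b <- "x"
res <- namedList(a, b, third = 3)
names(res)
```

Here `a` and `b` pick up their variable names, and the explicitly named `third` element keeps the name it was given.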