Package 'geneXtendeR' reference manual

Title:	Optimized Functional Annotation Of ChIP-seq Data
Description:	geneXtendeR optimizes the functional annotation of ChIP-seq peaks by exploring relative differences in annotating ChIP-seq peak sets to variable-length gene bodies. In contrast to prior techniques, geneXtendeR considers peak annotations beyond just the closest gene, allowing users to see peak summary statistics for the first-closest gene, second-closest gene, ..., n-closest gene whilst ranking the output according to biologically relevant events and iteratively comparing the fidelity of peak-to-gene overlap across a user-defined range of upstream and downstream extensions on the original boundaries of each gene's coordinates. Since different ChIP-seq peak callers produce different differentially enriched peaks with a large variance in peak length distribution and total peak count, annotating peak lists with their nearest genes can often be a noisy process. As such, the goal of geneXtendeR is to robustly link differentially enriched peaks with their respective genes, thereby aiding experimental follow-up and validation in designing primers for a set of prospective gene candidates during qPCR.
Authors:	Bohdan Khomtchouk [aut, cre], William Koehler [aut]
Maintainer:	Bohdan Khomtchouk <[email protected]>
License:	GPL (>= 3)
Version:	1.33.0
Built:	2025-03-27 05:24:15 UTC
Source:	https://github.com/bioc/geneXtendeR

Produces box-and-whisker plot showing distribution of peak lengths across a peaks input file.

Description

Makes boxplots of all peak lengths (within a peaks input file) to show how lengths of individual peaks are distributed across the entire peak set.
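
As a rough illustration of what such a plot summarizes, the following base-R sketch computes peak lengths from a toy 3-column peaks table and derives the statistics a box-and-whisker plot draws. The data are invented, and taking length as end minus start is an assumption about the convention, not a statement of how allPeakLengths() is implemented.

```r
# Toy peaks table in the 3-column layout (chromosome, start, end);
# all values are invented for illustration.
peaks <- data.frame(
  chr   = c(1, 1, 2, 2, 3),
  start = c(100, 900, 250, 4000, 10),
  end   = c(400, 1500, 300, 4800, 1010)
)

# Peak length taken as end minus start (assumed convention).
peak_lengths <- peaks$end - peaks$start

# Whisker ends, hinges, and median that a box-and-whisker plot displays.
box_stats <- boxplot.stats(peak_lengths)$stats
print(box_stats)
```

Passing peak_lengths to boxplot() would draw the corresponding plot.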

Usage

allPeakLengths(filename)

Arguments

`filename`	Name of peaks input file.

Value

Returns a box-and-whisker plot of peak length distribution across a peaks file.

Examples

myfile <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
allPeakLengths(myfile)

Annotate peaks file.

Description

Annotates a user's peaks file (which has been preprocessed with the peaksInput() command) with gene information based on an optimally chosen geneXtendeR upstream extension. This command requires a preprocessed "peaks.txt" file (generated using peaksInput()) to be present in the user's working directory; otherwise, the user is prompted to rerun the peaksInput() command to regenerate it.

Usage

annotate(organism, extension)

Arguments

`organism`	Object name assigned from readGFF() command.
`extension`	Desired upstream extension.

Value

The gene coordinates are extended by 'extension' at the 5-prime end, and by 500 bp at the 3-prime end. The peaks file is then overlaid on these new gene coordinates, producing a file of peaks annotated with gene ID, gene name, and gene-to-peak genomic distance (in bp). Distance is calculated between the 5-prime end of the gene and the 3-prime end of the peak.

Returns a data.table-formatted version of the annotated file for checking or further calculations.
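
The coordinate extension described above can be sketched in base R as follows. This is a minimal illustration, not the package's code: extend_genes() is a hypothetical helper, the gene table is invented, and the strand handling (5-prime end is the start coordinate on the + strand and the end coordinate on the - strand) is an assumption.

```r
# Toy gene table; names, coordinates, and strands are invented.
genes <- data.frame(
  gene   = c("geneA", "geneB"),
  start  = c(10000, 50000),
  end    = c(12000, 53000),
  strand = c("+", "-"),
  stringsAsFactors = FALSE
)

# extend_genes() is a hypothetical helper, not a geneXtendeR function.
extend_genes <- function(genes, extension, downstream = 500) {
  plus <- genes$strand == "+"
  out <- genes
  # + strand: 5-prime end is 'start', so the upstream extension lowers 'start'.
  # - strand: 5-prime end is 'end', so the upstream extension raises 'end'.
  out$start <- ifelse(plus, genes$start - extension, genes$start - downstream)
  out$end   <- ifelse(plus, genes$end + downstream, genes$end + extension)
  out$start <- pmax(out$start, 1)  # keep coordinates on the chromosome
  out
}

extended <- extend_genes(genes, extension = 2500)
print(extended)
```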

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
annotate(rat, 2500)

Annotate peaks file.

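Description

Annotates a user's peaks file (which has been preprocessed with the peaksInput() command) with gene information for the n closest genes to each peak, at the chosen upstream extension.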

Usage

annotate_n(organism, extension, n = 2)

Arguments

`organism`	Object name assigned from readGFF() command.
`extension`	Desired upstream extension.
`n`	Number of closest genes to report for each peak.

Value

The gene coordinates are extended by 'extension' at the 5-prime end, and by 500 bp at the 3-prime end. The peaks file is then overlaid on these new gene coordinates, producing a file of peaks annotated with gene ID, gene name, and gene-to-peak genomic distance (in bp). Distance is calculated between the 5-prime end of the gene and the 3-prime end of the peak. The output file is named "annotated_'extension'_'n'.txt".

Returns a data.table-formatted version of the annotated file for checking or further calculations.

Examples

## Not run: 
library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
annotate_n(rat, 2500, n=3)

## End(Not run)

Produces bar charts.

Description

Makes bar graphs showing the number of genes under peaks at various upstream extension levels.

Usage

barChart(organism, start, end, by)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`by`	Interval between consecutive extensions.

Value

Creates bar charts.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
barChart(rat, 1000, 3000, 100)

Produces cumulative line plots.

Description

Makes cumulative differential line plots showing the cumulative sums of the number of genes under peaks at consecutive upstream extension levels.
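
The arithmetic behind a cumulative differential plot can be illustrated with base R. The gene counts below are invented, and treating the plotted quantity as the running sum of count differences between consecutive extensions is an assumption about the function's internals.

```r
# Extension levels defined by start, end, and by.
extensions <- seq(1000, 3000, by = 500)

# Number of genes under peaks at each extension level (invented values).
genes_under_peaks <- c(120, 135, 150, 158, 160)

# Genes gained between consecutive extension levels.
gained <- diff(genes_under_peaks)

# Running total of genes gained as the extension grows.
cumulative_gain <- cumsum(gained)

print(data.frame(extension = extensions[-1], gained, cumulative_gain))
```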

Usage

cumlinePlot(organism, start, end, by)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`by`	Interval between consecutive extensions.

Value

Creates cumulative differential line plots.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
cumlinePlot(rat, 1000, 3000, 100)


Finds differential gene ontologies

Description

Determines gene ontology terms for each category (biological process (BP), cellular component (CC), molecular function (MF)) of genes-under-peaks that are unique between two different upstream extension levels.

Usage

diffGO(organism, start, end, GOcategory, GOspecies)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`GOcategory`	Either BP, CC, or MF.
`GOspecies`	Either org.Ag.eg.db (mosquito), org.Bt.eg.db (bovine), org.Ce.eg.db (worm), org.Cf.eg.db (canine), org.Dm.eg.db (fly), org.Dr.eg.db (zebrafish), org.Gg.eg.db (chicken), org.Hs.eg.db (human), org.Mm.eg.db (mouse), org.Mmu.eg.db (rhesus), org.Pt.eg.db (chimpanzee), org.Rn.eg.db (rat), org.Sc.sgd.db (yeast), org.Ss.eg.db (pig), or org.Xl.eg.db (frog).

Value

A data frame of gene symbol, gene ontology ID, and gene ontology term for either a BP, CC, or MF category. This data frame displays the annotations of all unique genes (i.e., genes that are located under peaks between two upstream extension levels) with their respective gene ontology information.

Examples

library(rtracklayer)
library(org.Rn.eg.db)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
diffGO(rat, 0, 500, BP, org.Rn.eg.db)

Finds unique genes under peaks.

Description

Determines which genes directly under peaks are unique between two different upstream extension levels.

Usage

distinct(organism, start, end)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.

Details

In the output, V1-V3 denote the chromosome/start/end positions of the peaks, V4-V6 denote the chromosome/start/end positions of the genes, V7 is the gene ID (e.g., Ensembl ID), V8 is the gene name, and V9 is the distance from the peak to the nearest gene.
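
For readability, the V1-V9 columns can be given descriptive names after the call. The sketch below uses an invented two-row table with the same layout; the gene IDs, coordinates, and the new column names are all made up for illustration.

```r
# Invented table mimicking the V1-V9 layout described above.
annotated <- data.frame(
  V1 = c(1, 1), V2 = c(7000, 20000), V3 = c(7400, 20600),   # peak chr/start/end
  V4 = c(1, 1), V5 = c(7500, 21000), V6 = c(12500, 25000),  # gene chr/start/end
  V7 = c("ENSRNOG0001", "ENSRNOG0002"),                     # made-up gene IDs
  V8 = c("geneA", "geneB"),                                 # made-up gene names
  V9 = c(100, 400),                                         # peak-to-gene distance
  stringsAsFactors = FALSE
)

# Descriptive names (hypothetical; choose whatever suits your analysis).
names(annotated) <- c("peak_chr", "peak_start", "peak_end",
                      "gene_chr", "gene_start", "gene_end",
                      "gene_id", "gene_name", "distance")

print(annotated[, c("gene_name", "distance")])
```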

Value

A data.table of unique genes located under peaks between two upstream extension levels.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
distinct(rat, 2000, 3000)

Annotate peak file based on gene.

Description

Annotates a user's peaks file (which has been preprocessed with the peaksInput() command) with gene information based on an optimally chosen geneXtendeR upstream extension, and compresses the annotations by gene. This command requires a preprocessed "peaks.txt" file (generated using peaksInput()) to be present in the user's working directory; otherwise, the user is prompted to rerun the peaksInput() command to regenerate it.

Usage

gene_annotate(organism, extension)

Arguments

`organism`	Object name assigned from readGFF() command.
`extension`	Desired upstream extension.

Value

The gene coordinates are extended by 'extension' at the 5-prime end, and by 500 bp at the 3-prime end. The peaks file is then overlaid on these new gene coordinates, producing a file of peaks annotated with gene ID, gene name, gene location, mean and standard deviation of peaks-to-genes, number of peaks-to-genes, and peak-to-gene genomic distance (in bp). Distance is calculated between the 5-prime end of the gene and the 3-prime end of the peak.

Returns a data.table-formatted version of the gene-annotated file for checking or further calculations.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
gene_annotate(rat, 3400)

Looks up specific gene and closest peaks

Description

Looks up the closest peaks to a specified gene in the peaks file (which has been preprocessed with the peaksInput() command), based on the latest .bed file accessed or on a specified extension. This command requires a preprocessed "peaks.txt" file (generated using peaksInput()) to be present in the user's working directory; otherwise, the user is prompted to rerun the peaksInput() command to regenerate it.

Usage

gene_lookup(organism, gene_name, n = 2, extension = NA, cutoff = Inf)

Arguments

`organism`	Object name assigned from readGFF() command.
`gene_name`	Gene names or gene IDs, specified by the user as character strings.
`n`	Number of closest peaks to 'gene_name' to find on any given chromosome. Default = 2.
`extension`	Desired upstream extension. If left unspecified, the latest geneXtendeR bed file will be chosen. If no extension is specified and no bed file can be found, a default extension of 500 is selected.
`cutoff`	Optional argument specifying the maximum bp distance to search around 'gene_name'. Default = Inf.

Value

A data.table with all peaks located within 'n' peaks and 'cutoff' bp distance away on every chromosome that contains 'gene_name'.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
closest <- gene_lookup(rat, c("Vom2r3", "Vom2r5"), n = 7, extension = 1000)
closest

Graphs hotspots of statistically significant peak activity.

Description

Makes line plots showing the ratio of statistically significant peaks to the total number of peaks at each genomic interval (e.g., 0-500 bp upstream of every gene in the genome, 500-1000 bp upstream of every gene in the genome, etc.).
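
The ratio being plotted can be illustrated with a small base-R sketch. The upstream distances are invented, and binning peaks by their distance to the nearest gene with cut() is an assumption chosen for illustration, not the package's implementation.

```r
# Interval boundaries defined by start, end, and by.
breaks <- seq(0, 2000, by = 500)

# Upstream distance (bp) of each peak from its nearest gene (invented values).
total_dist <- c(50, 120, 480, 510, 700, 990, 1200, 1300, 1800, 1950)
sig_dist   <- c(50, 480, 700, 1200)  # the subset that is significant

# Count peaks falling in each interval.
total_counts <- table(cut(total_dist, breaks, include.lowest = TRUE))
sig_counts   <- table(cut(sig_dist, breaks, include.lowest = TRUE))

# Ratio of significant to total peaks per interval.
ratio <- as.numeric(sig_counts) / as.numeric(total_counts)
names(ratio) <- names(total_counts)
print(ratio)
```

Intervals where the ratio is high relative to the others would appear as hotspots in the line plot.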

Usage

hotspotPlot(totalpeaksfile, significantpeaksfile, organism, start, end, by)

Arguments

`totalpeaksfile`	Filename in user's working directory (or full path to filename) containing all peaks.
`significantpeaksfile`	Filename in user's working directory (or full path to filename) containing only significant peaks.
`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`by`	Interval between consecutive extensions.

Value

Line plot showing the ratio of significant to total peaks at each interval across the genome.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
allpeaks <- system.file("extdata", "totalpeaksfile.txt", package="geneXtendeR")
sigpeaks <- system.file("extdata", "significantpeaksfile.txt", package="geneXtendeR")
hotspotPlot(allpeaks, sigpeaks, rat, 0, 10000, 500)


Produces line plots.

Description

Makes differential line plots showing the differences in the number of genes under peaks at consecutive upstream extension levels.

Usage

linePlot(organism, start, end, by)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`by`	Interval between consecutive extensions.

Value

Creates differential line plots.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
linePlot(rat, 1000, 3000, 100)


Makes gene-GO networks

Description

Creates dynamic and interactive networks of genes linked to their respective gene ontology terms for each category (biological process (BP), cellular component (CC), molecular function (MF)) of genes-under-peaks that are unique between two different upstream extension levels.

Usage

makeNetwork(organism, start, end, GOcategory, GOspecies)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`GOcategory`	Either BP, CC, or MF.
`GOspecies`	Either org.Ag.eg.db (mosquito), org.Bt.eg.db (bovine), org.Ce.eg.db (worm), org.Cf.eg.db (canine), org.Dm.eg.db (fly), org.Dr.eg.db (zebrafish), org.Gg.eg.db (chicken), org.Hs.eg.db (human), org.Mm.eg.db (mouse), org.Mmu.eg.db (rhesus), org.Pt.eg.db (chimpanzee), org.Rn.eg.db (rat), org.Sc.sgd.db (yeast), org.Ss.eg.db (pig), or org.Xl.eg.db (frog).

Value

An interactive network of gene names linked to their respective gene ontology terms for either a BP, CC, or MF category.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
library(networkD3)
library(dplyr)
library(org.Rn.eg.db)
makeNetwork(rat, 0, 500, BP, org.Rn.eg.db)

Makes word cloud from gene ontology terms

Description

Creates a word cloud from gene ontology terms derived from either biological process (BP), cellular component (CC), or molecular function (MF) of genes-under-peaks that are unique between two different upstream extension levels.

Usage

makeWordCloud(organism, start, end, GOcategory, GOspecies)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`GOcategory`	Either BP, CC, or MF.
`GOspecies`	Either org.Ag.eg.db (mosquito), org.Bt.eg.db (bovine), org.Ce.eg.db (worm), org.Cf.eg.db (canine), org.Dm.eg.db (fly), org.Dr.eg.db (zebrafish), org.Gg.eg.db (chicken), org.Hs.eg.db (human), org.Mm.eg.db (mouse), org.Mmu.eg.db (rhesus), org.Pt.eg.db (chimpanzee), org.Rn.eg.db (rat), org.Sc.sgd.db (yeast), org.Ss.eg.db (pig), or org.Xl.eg.db (frog).

Value

A word cloud comprised of words gathered from gene ontology terms of either a BP, CC, or MF category.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
library(tm)
library(SnowballC)
library(wordcloud)
library(RColorBrewer)
library(org.Rn.eg.db)
makeWordCloud(rat, 0, 500, BP, org.Rn.eg.db)

Calculates mean (average) peak length for any genomic region.

Description

Determines the average peak length of all peaks found within some genomic interval (e.g., 0-500 bp upstream of nearest gene for all genes throughout the genome).

Usage

meanPeakLength(organism, start, end)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.

Value

A vector composed of a single number representing the average peak length found within a genomic interval.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
sigpeaks <- system.file("extdata", "significantpeaksfile.txt", package="geneXtendeR")
peaksInput(sigpeaks)
meanPeakLength(rat, 0, 500)

Produces line plots of mean (average) peak length within any genomic interval.

Description

Makes line plots of mean peak lengths to show the average length of individual peaks within any genomic interval (e.g., 0-500 bp upstream of nearest gene for all genes throughout the genome).

Usage

meanPeakLengthPlot(organism, start, end, by)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`by`	Interval between consecutive extensions.

Value

Creates mean peak length line plots.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
allpeaks <- system.file("extdata", "totalpeaksfile.txt", package="geneXtendeR")
peaksInput(allpeaks)
meanPeakLengthPlot(rat, 0, 10000, 500)


Produces box-and-whisker plot of peak lengths within any genomic interval.

Description

Makes boxplots of peak lengths to show how lengths of individual peaks are distributed within any genomic interval (e.g., 0-500 bp upstream of nearest gene for all genes throughout the genome).

Usage

peakLengthBoxplot(organism, start, end)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.

Value

Creates boxplots showing how lengths of peaks are distributed within any given genomic interval. Also creates a character vector composed of individual peak lengths.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
allpeaks <- system.file("extdata", "totalpeaksfile.txt", package="geneXtendeR")
peaksInput(allpeaks)
peakLengthBoxplot(rat, 0, 500)


Preprocesses a peaks input file.

Description

Takes your tab-delimited 3-column (chromosome number, peak start, and peak end) input file (see ?samplepeaksinput) consisting of peaks called from a peak caller (e.g., MACS2 or SICER) and sorts the file by chromosome and start position, thereby creating a preprocessed file for downstream geneXtendeR analysis. This file (called "peaks.txt") is a preprocessed file of the original input and is deposited in the user's working directory and used for the remainder of the analysis. In this "peaks.txt" file, peaks are sorted by chromosome number and start position, where the X chromosome is designated by the integer 100, the Y chromosome by the integer 200, and the mitochondrial chromosome by the integer 300.
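
The recoding and sorting described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the peaks are invented, and the label "MT" for the mitochondrial chromosome is assumed (following Ensembl naming).

```r
# Toy unsorted peaks with mixed chromosome labels (invented values).
peaks <- data.frame(
  chr   = c("X", "2", "MT", "1", "Y", "1"),
  start = c(500, 300, 10, 900, 40, 100),
  end   = c(800, 700, 90, 1200, 400, 350),
  stringsAsFactors = FALSE
)

# Integer codes for the non-numeric chromosomes, as described above.
recode <- c(X = 100, Y = 200, MT = 300)

is_special <- peaks$chr %in% names(recode)
chr_num <- numeric(nrow(peaks))
chr_num[is_special]  <- unname(recode[peaks$chr[is_special]])
chr_num[!is_special] <- as.numeric(peaks$chr[!is_special])
peaks$chr <- chr_num

# Sort by chromosome number, then by start position.
sorted <- peaks[order(peaks$chr, peaks$start), ]
rownames(sorted) <- NULL
print(sorted)
```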

Usage

peaksInput(filename)

Arguments

`filename`	Name of file containing peaks that have been generated from a peak caller (e.g., MACS2, SICER). See ?samplepeaksinput for an example of such an input file.

Value

Returns a formatted file (called "peaks.txt") that has been preprocessed in preparation for usage with barChart(), linePlot(), distinct(), and other downstream commands and deposited in the user's working directory.

Examples

?samplepeaksinput  #Documentation of some exemplary sample input
data(samplepeaksinput)
head(samplepeaksinput)
tail(samplepeaksinput)
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)

Transform peaks into merged peaks.

Description

Takes your tab-delimited 3-column (chromosome number, peak start, and peak end) input file (see ?samplepeaksinput) consisting of peaks called from a peak caller (e.g., MACS2 or SICER) and transforms this file into a file of merged peaks. This file (called "peaks.txt") is a preprocessed file of the original input transformed into merged peaks, and it is deposited in the user's working directory and used for the remainder of the analysis. In this "peaks.txt" file, peaks are sorted by chromosome number and start position, where the X chromosome is designated by the integer 100, the Y chromosome by the integer 200, and the mitochondrial chromosome by the integer 300.

Usage

peaksMerge(filename, mergeby)

Arguments

`filename`	Name of file containing peaks that have been generated from a peak caller (e.g., MACS2, SICER). See ?samplepeaksinput for an example of such an input file.
`mergeby`	Integer indicating how close two adjacent sorted peaks need to be in order to be merged into one peak.
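
One way to picture the effect of 'mergeby' is the base-R sketch below. merge_peaks() is a hypothetical helper, not part of geneXtendeR, and the merge rule used here (two adjacent sorted peaks on the same chromosome are merged when the gap between them is at most 'mergeby' bp) is an assumption about the exact criterion.

```r
# merge_peaks() is a hypothetical helper written for illustration only.
merge_peaks <- function(peaks, mergeby) {
  peaks <- peaks[order(peaks$chr, peaks$start), ]
  merged <- peaks[0, ]
  for (i in seq_len(nrow(peaks))) {
    n <- nrow(merged)
    same_chr <- n > 0 && merged$chr[n] == peaks$chr[i]
    if (same_chr && peaks$start[i] - merged$end[n] <= mergeby) {
      # Close enough to the previous peak: extend it instead of adding a row.
      merged$end[n] <- max(merged$end[n], peaks$end[i])
    } else {
      merged <- rbind(merged, peaks[i, ])
    }
  }
  rownames(merged) <- NULL
  merged
}

# Invented peaks: the first two on chromosome 1 are 300 bp apart.
peaks <- data.frame(
  chr   = c(1, 1, 1, 2),
  start = c(100, 600, 5000, 100),
  end   = c(300, 900, 5200, 400)
)

merged <- merge_peaks(peaks, mergeby = 500)
print(merged)
```

With mergeby = 500, the first two peaks collapse into a single peak spanning 100-900, while the remaining peaks are left unchanged.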

Value

Returns a formatted file (called "peaks.txt"), deposited in the user's working directory, which has been preprocessed to transform individual peaks into merged peaks in preparation for usage with barChart(), linePlot(), distinct(), and other downstream commands.

Examples

fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksMerge(fpath, 500)

Plots word frequencies found within gene ontology terms

Description

Creates barplots of word frequencies from gene ontology terms derived from either biological process (BP), cellular component (CC), or molecular function (MF) of genes-under-peaks that are unique between two different upstream extension levels.
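
The idea of counting words across gene ontology terms can be sketched with base R alone. The terms and the stop-word list below are invented for illustration, and this simple tokenization is a stand-in that does not reproduce whatever text processing the package performs.

```r
# Invented gene ontology terms.
go_terms <- c(
  "regulation of transcription",
  "positive regulation of cell proliferation",
  "cell adhesion",
  "regulation of cell migration"
)

# Split terms into lowercase words.
words <- unlist(strsplit(tolower(go_terms), "[^a-z]+"))

# Drop empty strings and a small, hypothetical stop-word list.
stop_words <- c("of", "the", "and", "in", "to")
words <- words[nzchar(words) & !(words %in% stop_words)]

# Word frequencies, most frequent first.
freq <- sort(table(words), decreasing = TRUE)

# Keep only the top 'word_count' words.
word_count <- 3
top_words <- head(freq, word_count)
print(top_words)
```

Passing top_words to barplot() would give a frequency barplot of the same kind.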

Usage

plotWordFreq(organism, start, end, GOcategory, GOspecies, word_count)

Arguments

`organism`	Object name assigned from readGFF() command.
`start`	Lower bound of upstream extension.
`end`	Upper bound of upstream extension.
`GOcategory`	Either BP, CC, or MF.
`GOspecies`	Either org.Ag.eg.db (mosquito), org.Bt.eg.db (bovine), org.Ce.eg.db (worm), org.Cf.eg.db (canine), org.Dm.eg.db (fly), org.Dr.eg.db (zebrafish), org.Gg.eg.db (chicken), org.Hs.eg.db (human), org.Mm.eg.db (mouse), org.Mmu.eg.db (rhesus), org.Pt.eg.db (chimpanzee), org.Rn.eg.db (rat), org.Sc.sgd.db (yeast), org.Ss.eg.db (pig), or org.Xl.eg.db (frog).
`word_count`	Number of top words to display.

Value

A barplot comprised of words sorted by frequency of occurrence from gene ontology terms of either a BP, CC, or MF category.

Examples

library(rtracklayer)
rat <- readGFF("ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz")
fpath <- system.file("extdata", "somepeaksfile.txt", package="geneXtendeR")
peaksInput(fpath)
library(tm)
library(SnowballC)
library(wordcloud)
library(RColorBrewer)
library(org.Rn.eg.db)
plotWordFreq(rat, 0, 500, BP, org.Rn.eg.db, 10)

Gene transfer format (GTF) file for rat (Rattus_norvegicus.Rnor_6.0.84)

Description

A dataset downloaded from Ensembl that contains the entries of a GTF file for Rattus norvegicus.

Usage

data(rat)

Format

A data frame with 748514 rows and 28 variables corresponding to the entries of a GTF file. Column names are standardized and can be found here: http://www.ensembl.org/info/website/upload/gff.html.

Value

Demonstrates a rat GTF file downloaded from: ftp://ftp.ensembl.org/pub/release-84/gtf/rattus_norvegicus/Rattus_norvegicus.Rnor_6.0.84.gtf.gz.

Examples

head(rat)
tail(rat)

Sample peaks list to be used as input to geneXtendeR

Description

A dataset containing the chromosome number, start and stop positions of ChIP-seq peaks along the Rattus norvegicus genome (rn6 assembly). A dataset like this may be used as input to the peaksInput() command, which will sort the dataset by chromosome number and start position.

Usage

data(samplepeaksinput)

Format

A data frame with 25089 rows and 3 variables:

chr: Chromosome number
start: Peak start position [in units of base pairs]
end: Peak end position [in units of base pairs]

Value

Demonstrates a sample peaks file used as input.

Examples

head(samplepeaksinput)
tail(samplepeaksinput)
