Package 'geneRxCluster' reference manual

Package 'geneRxCluster'

Title:	gRx Differential Clustering
Description:	Detect Differential Clustering of Genomic Sites such as gene therapy integrations. The package provides some functions for exploring genomic insertion sites originating from two different sources. Possibly, the two sources are two different gene therapy vectors. Vectors are preferred that target sensitive regions less frequently, motivating the search for localized clusters of insertions and comparison of the clusters formed by integration of different vectors. Scan statistics allow the discovery of spatial differences in clustering and calculation of False Discovery Rates (FDRs) providing statistical methods for comparing retroviral vectors. A scan statistic for comparing two vectors using multiple window widths to detect clustering differentials and compute FDRs is implemented here.
Authors:	Charles Berry
Maintainer:	Charles Berry <[email protected]>
License:	GPL (>= 2)
Version:	1.43.0
Built:	2025-03-21 05:50:00 UTC
Source:	https://github.com/bioc/geneRxCluster

Title:

gRx Differential Clustering

Description:

Detect Differential Clustering of Genomic Sites such as gene therapy integrations. The package provides some functions for exploring genomic insertion sites originating from two different sources. Possibly, the two sources are two different gene therapy vectors. Vectors are preferred that target sensitive regions less frequently, motivating the search for localized clusters of insertions and comparison of the clusters formed by integration of different vectors. Scan statistics allow the discovery of spatial differences in clustering and calculation of False Discovery Rates (FDRs) providing statistical methods for comparing retroviral vectors. A scan statistic for comparing two vectors using multiple window widths to detect clustering differentials and compute FDRs is implemented here.

Authors:

Charles Berry

Maintainer:

Charles Berry <[email protected]>

License:

GPL (>= 2)

Version:

1.43.0

Built:

2025-03-21 05:50:00 UTC

Source:

https://github.com/bioc/geneRxCluster

Help Index

critical regions

Description

critical region cutpoints

Usage

  critVal.alpha(k, p0, alpha, posdiff)
critVal.alpha(k, p0, alpha, posdiff)

Arguments

`k`	- window width(s)
`p0`	- length 2 probabilities
`alpha`	- two tailed
`posdiff`	- position difference matrix

Details

This version uses alpha and will find TFD

Value

list of cutoffs and attributes

Author(s)

Charles Berry

Examples

# symmetric odds:
crit <- critVal.alpha(5:25,c(1,1)/2,alpha=0.05,
                      matrix(1,nr=50,nc=21))
crit[[1]]
sapply(crit,c)
# 5:1 odds
asymmetric.crit <- critVal.alpha(5:25,c(1,5)/6,alpha=0.05,
                      matrix(1,nr=50,nc=21))
# show the critical regions
par(mfrow=c(1,2))
gRxPlot(crit,method="critical")
gRxPlot(asymmetric.crit,method="critical")
rm(crit,asymmetric.crit)
# symmetric odds:
crit <- critVal.alpha(5:25,c(1,1)/2,alpha=0.05,
                      matrix(1,nr=50,nc=21))
crit[[1]]
sapply(crit,c)
# 5:1 odds
asymmetric.crit <- critVal.alpha(5:25,c(1,5)/6,alpha=0.05,
                      matrix(1,nr=50,nc=21))
# show the critical regions
par(mfrow=c(1,2))
gRxPlot(crit,method="critical")
gRxPlot(asymmetric.crit,method="critical")
rm(crit,asymmetric.crit)

critical regions

Description

critical region cutpoints

Usage

  critVal.power(k, p0, target, pwr = 0.8, odds = 7)
critVal.power(k, p0, target, pwr = 0.8, odds = 7)

Arguments

`k`	- window width(s)
`p0`	- length 2 probabilities
`target`	- false discoveries wanted
`pwr`	- desired power
`odds`	- alternative odds ratio

Details

This version uses power and TFD and will limit windows screened

Value

list of cutoffs and attributes

Author(s)

Charles Berry

Examples

# symmetric odds:
crit <-
    critVal.power(5:25,c(1,1),5,pwr=0.8,odds=7)
crit[[1]]
sapply(crit,c)
# 5:1 odds
asymmetric.crit <-
    critVal.power(5:25,c(1,5),5,pwr=0.8,odds=7)
# show the critical regions
par(mfrow=c(1,2))
gRxPlot(crit,method="critical")
gRxPlot(asymmetric.crit,method="critical")
rm(crit,asymmetric.crit)
# symmetric odds:
crit <-
    critVal.power(5:25,c(1,1),5,pwr=0.8,odds=7)
crit[[1]]
sapply(crit,c)
# 5:1 odds
asymmetric.crit <-
    critVal.power(5:25,c(1,5),5,pwr=0.8,odds=7)
# show the critical regions
par(mfrow=c(1,2))
gRxPlot(crit,method="critical")
gRxPlot(asymmetric.crit,method="critical")
rm(crit,asymmetric.crit)

critical regions

Description

critical region cutpoints

Usage

  critVal.target(k, p0, target, posdiff = NULL, ns)
critVal.target(k, p0, target, posdiff = NULL, ns)

Arguments

`k`	window width(s)
`p0`	length 2 probabilities
`target`	- two tailed
`posdiff`	- position difference matrix
`ns`	the number of windows passing filter at each k

Details

This version uses TFD and will find alpha implicitly

Value

list of cutoffs and attributes

Author(s)

Charles Berry

Examples

# symmetric odds:
crit <- critVal.target(5:25,c(1,1),1,ns=rep(10,21))
crit[[1]]
sapply(crit,c)
# 5:1 odds
asymmetric.crit <- critVal.target(5:25,c(1,5),1,ns=rep(10,21))
# show the critical regions
par(mfrow=c(1,2))
gRxPlot(crit,method="critical")
gRxPlot(asymmetric.crit,method="critical")
rm(crit,asymmetric.crit)
# symmetric odds:
crit <- critVal.target(5:25,c(1,1),1,ns=rep(10,21))
crit[[1]]
sapply(crit,c)
# 5:1 odds
asymmetric.crit <- critVal.target(5:25,c(1,5),1,ns=rep(10,21))
# show the critical regions
par(mfrow=c(1,2))
gRxPlot(crit,method="critical")
gRxPlot(asymmetric.crit,method="critical")
rm(crit,asymmetric.crit)

Differential Clustering of Integration Sites

Description

geneRxCluster provides the function gRxCluster and friends.

Details

Windows defined by k consecutive integration sites are scanned. A two class indicator is tallied to determine whether one class dominates. If one does, a flag is set and the window is retained. Various values of k are used. Conflicts between overlapping windows with the same value of k can occur — two windows are dominated by the two different classes. In that case, the sites of overlap are marked and neither window is retained. Conflicts can also arise between windows differing in their values of k. In that case, the window having the smaller value of k is retained and the other is discarded.

Permutation tests and permutation based false discovery rates are available.

Filtering of windows is allowed so that regions which are sparsely populated need not be studied.

gRxCluster

Description

cluster integration sites - optionally perform the permutations needed to estimate the discoveries expected under a null hypothesis

Usage

gRxCluster(object, starts, group, kvals, nperm = 0L,
  pruneFun = prune.loglik, ..., cutpt.filter.expr, cutpt.tail.expr, tmp.env,
  sample.id, sample.tab)
gRxCluster(object, starts, group, kvals, nperm = 0L,
  pruneFun = prune.loglik, ..., cutpt.filter.expr, cutpt.tail.expr, tmp.env,
  sample.id, sample.tab)

Arguments

`object`	chromosome names or other grouping of starts
`starts`	ordered chromosome position or ordered integer vector
`group`	logical vector separating two groups
`kvals`	integer vector of window widths
`nperm`	number of permutations for FDR calculation
`pruneFun`	a function like `prune.loglik`.
`...`	other args
`cutpt.filter.expr`	(optional) R object or call (or variable naming a call) with (optional) var x (window widths in base pairs) to filter windows. It must evaluate to mode "double". If not specified, `as.double(apply(x,2,median,na.rm=TRUE))` is used. If an atomic vector of length one is supplied it is expanded to the proper length and coerced to double. If this arg is the name of a variable provided in `tmp.env`, it must be protected with `quote(...)`.
`cutpt.tail.expr`	R object or call (or variable naming a call) with (optional) vars: k,n, and x (as above). Returns list like `critVal.target`. k is a vector of the number of sites in a collection of windows, and n is a vector of counts or proportions for the two classes of insertion. If not supplied, `critVal.target(k,n,target=5,posdiff=x)` is used. If this arg is the name of a variable provided in `tmp.env`, it must be protected with `quote(...)`.
`tmp.env`	(optional) environment to use in evaluation of cutpt.* expressions. This is usually needed for `critVal.power`, which is first calculated and placed in the environment, and the supplied object is used in the expression for `cutpt.filter.expr`.
`sample.id`	(optional) integer vector indexing cells in `sample.tab` to be looked up to determine `group` under permutation. A factor can be used, too, but will be coerced to integer.
`sample.tab`	(optional) integer vector containing 0 or 1 in each cell. Its length is the same as `max(sample.id)`. Both or neither `sample.id` and `sample.tab` should be supplied. When supplied `sample.tab[sample.id]` must equal `group`. If the arguments are supplied, permutations are of the form `sample(sample.tab)[sample.id]`. Otherwise they are of the form `sample(group)`.

Value

a GRanges object with a special metadata slot, see gRxCluster-object

Author(s)

Charles Berry example inst/ex-gRxCluster.R

gRxCluster object

Description

Overview of the result of gRxCluster(...)

Details

The object returned is a GRanges object.

If the object is x, seqnames(x) and ranges(x) slots demarcate the clusters discovered. There will be one element for each cluster (aka ‘clump’) discovered.

Using the default argument pruneFun=prune.loglik or pruneFun=noprune, mcols(x) will have these columns:

value1 and value2: are the counts of the two classes of insertion sites for the clusters of object x
clump.id: numbers each cluster.

If the user supplies a custom pruneFun, it should return a GRanges with those columns and one element for each unique clump.id. The column target.min has the smallest nominal False Discoveries Expected for each cluster and is added to (or replaces) the mcols(x) produced by the argument supplied as pruneFun.

metadata(x) will include these components:

criticalValues

A list object such as supplied by critVal.target whose elements each give the cutpoints to be used for a window with k sites. attributes(metadata(object)$criticalValues[[i]]) will contain elements

fdr: with dimension c(k+1,4) of target false discovery expectations and and the one-sided p-values
target: the target for false discovery which sometimes is specified a priori and sometimes results from calculation
n: an upper bound on the number of windows to screen, if this number is needed.

In some cases, an attribute is attached to metadata(object)$criticalValues, see critVal.power for an example.

kvals

the number of sites, k, to include in a window

perm_cluster_best

a list whose canonical element is a vector of values like x$target.min obtained from a permutation of the class indicators

summary_matrix

a matrix giving the start, end, depth, and counts in each class for every cluster and depth in sequential order

call

the call invoking gRxCluster

which may include some arguments added by default.

Author(s)

Charles Berry [email protected]

gRxPlot

Description

Plot Clumps and/or Critical Regions

Usage

gRxPlot(object, pi.0 = NULL, method = c("odds", "criticalRegions"),
  xlim = NULL, main = NULL, xlab = "log odds ratio", breaks = "Sturges",
  kvals = NULL, ...)
gRxPlot(object, pi.0 = NULL, method = c("odds", "criticalRegions"),
  xlim = NULL, main = NULL, xlab = "log odds ratio", breaks = "Sturges",
  kvals = NULL, ...)

Arguments

`object`	either the results of `gRxCluster` or a list containing cutpoints for critical regions.
`pi.0`	the background proportion for vector 2
`method`	character vector of “odds” and/or “criticalRegions”
`xlim`	limits of the log odds histogram
`main`	a title for the panel(s)
`xlab`	label fgor the x-axis of the log odds plot
`breaks`	see `hist`
`kvals`	values to use in selecting a subset of the critical regions to display
`...`	other args to pass to the plotting routine(s)

Details

The results of a call to gRxCluster are plotted. Optionally, with method="criticalRegions" only the critical regions are plotted or with method="odds" the log odds only are plotted.

Value

see hist

Author(s)

Charles Berry

Examples

x.seqnames <- rep(letters[1:3],each=500)
x.starts <- c(seq(1,length=500),seq(1,by=2,length=500),seq(1,by=3,length=500))
x.lens <- rep(c(5,10,15,20,25),each=20)
x.group <- rep(rep(c(TRUE,FALSE),length=length(x.lens)),x.lens)
## add a bit of fuzz:
x.group <- 1==rbinom(length(x.group),1,pr=ifelse(x.group,.8,.2))
x.kvals <- as.integer(sort(unique(x.lens)))
x.res <- gRxCluster(x.seqnames,x.starts,x.group,x.kvals)
gRxPlot(x.res)
rm( x.seqnames, x.starts, x.lens, x.group, x.kvals, x.res)
x.seqnames <- rep(letters[1:3],each=500)
x.starts <- c(seq(1,length=500),seq(1,by=2,length=500),seq(1,by=3,length=500))
x.lens <- rep(c(5,10,15,20,25),each=20)
x.group <- rep(rep(c(TRUE,FALSE),length=length(x.lens)),x.lens)
## add a bit of fuzz:
x.group <- 1==rbinom(length(x.group),1,pr=ifelse(x.group,.8,.2))
x.kvals <- as.integer(sort(unique(x.lens)))
x.res <- gRxCluster(x.seqnames,x.starts,x.group,x.kvals)
gRxPlot(x.res)
rm( x.seqnames, x.starts, x.lens, x.group, x.kvals, x.res)

gRxPlotClumps

Description

Plot gRxCluster object clumps

Usage

gRxPlotClumps(object, data, seqlens, panelExpr = quote(grid()))
gRxPlotClumps(object, data, seqlens, panelExpr = quote(grid()))

Arguments

`object`	result of gRxCluster
`data`	(optional) GRanges like that from which args to gRxCluster were derived
`seqlens`	(optional) seqlengths(data) or similar. Can be given if data is missing
`panelExpr`	- an expression to evaluate after drawing each panel

Details

Plot Relative Frequencies of the two classes according to region. Regions typically alternate between clusters and non-clusters on each chromosome.

Author(s)

Charles Berry

Examples

x.seqnames <- rep(letters[1:3],each=50)
x.starts <- c(seq(1,length=50),seq(1,by=2,length=50),seq(1,by=3,length=50))
x.lens <- rep(c(5,10,15,20,25),each=2)
x.group <- rep(rep(c(TRUE,FALSE),length=length(x.lens)),x.lens)
## add a bit of fuzz:
x.group <- 1==rbinom(length(x.group),1,pr=ifelse(x.group,.8,.2))
x.kvals <- as.integer(sort(unique(x.lens)))
x.res <- gRxCluster(x.seqnames,x.starts,x.group,x.kvals)
gRxPlotClumps(x.res)
rm( x.seqnames, x.starts, x.lens, x.group, x.kvals, x.res)
x.seqnames <- rep(letters[1:3],each=50)
x.starts <- c(seq(1,length=50),seq(1,by=2,length=50),seq(1,by=3,length=50))
x.lens <- rep(c(5,10,15,20,25),each=2)
x.group <- rep(rep(c(TRUE,FALSE),length=length(x.lens)),x.lens)
## add a bit of fuzz:
x.group <- 1==rbinom(length(x.group),1,pr=ifelse(x.group,.8,.2))
x.kvals <- as.integer(sort(unique(x.lens)))
x.res <- gRxCluster(x.seqnames,x.starts,x.group,x.kvals)
gRxPlotClumps(x.res)
rm( x.seqnames, x.starts, x.lens, x.group, x.kvals, x.res)

gRxSummary

Description

Summarize gRxCluster Results

Usage

gRxSummary(object, targetFD = NULL)
gRxSummary(object, targetFD = NULL)

Arguments

`object`	the result of gRxCluster
`targetFD`	the critical value target in each tail

Details

Get the FDR and related data for a run of gRxCluster. By selecting a value for targetFD that is smaller that what was used in constructing the object, fewer clumps will be included in the computation fo the False Discovery Rate - akin to what would have been obtained from the object if it had been constructed using that value.

Value

a list containing the summarized results

Author(s)

Charles Berry

Examples

x.seqnames <- rep(letters[1:3],each=50)
x.starts <- c(seq(1,length=50),seq(1,by=2,length=50),seq(1,by=3,length=50))
x.lens <- rep(c(5,10,15,20,25),each=2)
x.group <- rep(rep(c(TRUE,FALSE),length=length(x.lens)),x.lens)
x.kvals <- as.integer(sort(unique(x.lens)))
x.res <- gRxCluster(x.seqnames,x.starts,x.group,x.kvals,nperm=100L)
gRxSummary(x.res)
rm( x.seqnames, x.starts, x.lens, x.group, x.kvals, x.res)
x.seqnames <- rep(letters[1:3],each=50)
x.starts <- c(seq(1,length=50),seq(1,by=2,length=50),seq(1,by=3,length=50))
x.lens <- rep(c(5,10,15,20,25),each=2)
x.group <- rep(rep(c(TRUE,FALSE),length=length(x.lens)),x.lens)
x.kvals <- as.integer(sort(unique(x.lens)))
x.res <- gRxCluster(x.seqnames,x.starts,x.group,x.kvals,nperm=100L)
gRxSummary(x.res)
rm( x.seqnames, x.starts, x.lens, x.group, x.kvals, x.res)

noprune

Description

join contiguous windows

Usage

  noprune(x, ...)
noprune(x, ...)

Arguments

`x`	a GRanges object
`...`	currently unused

Details

return all the candidate sites in a clump without pruning

this is to be used as the pruneFun are of gRxCluster

Value

same as gRxCluster less the metadata

Author(s)

Charles Berry

plot.cutpoints

Description

Plot a set of cutpoints - Utility

Usage

## S3 method for class 'cutpoints'
plot(crit, pi.0 = NULL, kvals = NULL, ...)
## S3 method for class 'cutpoints'
plot(crit, pi.0 = NULL, kvals = NULL, ...)

Arguments

`crit`	- a cutpoint object see `gRxCluster`
`pi.0`	- optional null value to plot
`kvals`	- which cutpoints to includein the plot
`...`	passed to barplot

Details

NOT FOR USERS. Not exported.

Value

list with components of “bar.x” (the value of hist()), “kvals” (window widths plotted), and “pi.0” (the input value of pi.0)

Author(s)

Charles Berry

prune.loglik

Description

best contiguous region

Usage

  prune.loglik(x, p.null = 0.5)
prune.loglik(x, p.null = 0.5)

Arguments

`x`	a GRanges object
`p.null`	the probability of category 1 (FALSE)

Details

prune each end of the region using loglik criterion

this is to be used as the pruneFun are of gRxCluster

Author(s)

Charles Berry

Package 'geneRxCluster'

Help Index

critical regions

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

critical regions

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

critical regions

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

Differential Clustering of Integration Sites

Description

Details

gRxCluster

Description

Usage

Arguments

Value

Author(s)

gRxCluster object

Description

Details

Author(s)

gRxPlot

Description

Usage

Arguments

Details

Value

Author(s)

See Also

Examples

gRxPlotClumps

Description

Usage

Arguments

Details

Author(s)

Examples

gRxSummary

Description

Usage

Arguments

Details

Value

Author(s)

Examples

noprune

Description

Usage

Arguments

Details

Value

Author(s)

See Also

plot.cutpoints

Description

Usage

Arguments

Details

Value