Package 'nullranges'

Title: Generation of null ranges via bootstrapping or covariate matching
Description: Modular package for generation of sets of ranges representing the null hypothesis. These can take the form of bootstrap samples of ranges (using the block bootstrap framework of Bickel et al 2010), or sets of control ranges that are matched across one or more covariates. nullranges is designed to be inter-operable with other packages for analysis of genomic overlap enrichment, including the plyranges Bioconductor package.
Authors: Michael Love [aut, cre] , Wancen Mu [aut] , Eric Davis [aut] , Douglas Phanstiel [aut] , Stuart Lee [aut] , Mikhail Dozmorov [ctb], Tim Triche [ctb], CZI [fnd]
Maintainer: Michael Love <[email protected]>
License: GPL-3
Version: 1.13.0
Built: 2024-12-24 06:05:15 UTC
Source: https://github.com/bioc/nullranges

Help Index


Block bootstrap for genomic ranges

Description

Performs a block bootstrap, optionally with respect to a genome segmentation. Returns a bootRanges object, which is a GRanges object with all the ranges from the bootstrap iterations concatenated.

Usage

bootRanges(
  y,
  blockLength,
  R = 1,
  seg = NULL,
  exclude = NULL,
  excludeOption = c("drop", "trim"),
  proportionLength = TRUE,
  type = c("bootstrap", "permute"),
  withinChrom = FALSE,
  storeBlockLength = FALSE
)

Arguments

y

the GRanges to bootstrap sample

blockLength

the length of the blocks (for proportional blocks, this is the maximal length of a block)

R

the number of bootstrap samples to generate

seg

the segmentation GRanges, with a column ("state") indicating segmentation state (optional)

exclude

the GRanges of excluded regions (optional)

excludeOption

whether to "drop" or "trim" bootstrap ranges that overlap a excluded region

proportionLength

for the segmented block bootstrap, whether to use scaled block lengths (scaling by the proportion of the segmentation state out of the total genome length). That is, the resulting blocks will be of size less than blockLength

type

the type of null generation (un-segmented bootstrap only)

withinChrom

whether to re-sample (bootstrap) ranges across chromosomes (default) or only within chromosomes (un-segmented bootstrap only)

storeBlockLength

whether to save blockLength as a metadata column

Details

Note that this function requires input ranges have associated seqlengths, and that these must not be shorter than blockLength. See Seqinfo, seqlevels, and keepStandardChromosomes functions and their use in the Quick Start section of the vignette.

Value

a BootRanges (GRanges object) with the bootstrapped ranges, where iteration and block length are recorded as metadata columns

References

bootRanges manuscript:

Wancen Mu, Eric S. Davis, Stuart Lee, Mikhail G. Dozmorov, Douglas H. Phanstiel, Michael I. Love. 2023. "bootRanges: Flexible generation of null sets of genomic ranges for hypothesis testing." Bioinformatics. doi: 10.1093/bioinformatics/btad190

Original method describing the segmented block bootstrap for genomic features:

Bickel, Peter J., Nathan Boley, James B. Brown, Haiyan Huang, and Nancy R. Zhang. 2010. "Subsampling Methods for Genomic Inference." The Annals of Applied Statistics 4 (4): 1660–97. doi: 10.1214/10-AOAS363

Examples

set.seed(1)
library(GenomicRanges)
gr <- GRanges("chr1", IRanges(0:4 * 10 + 1, width=5),
              seqlengths=c(chr1=50))
br <- bootRanges(gr, blockLength=10)

BootRanges object

Description

This class extends the GRanges class. As produced by bootRanges, it will have an iter column indicating the iterations of the bootstrap, and a optionally a blockLength column indicating the blockLength parameter.


Function for creating combinations of covariates

Description

Function for creating combinations of covariates

Usage

combnCov(x, ...)

## S4 method for signature 'character'
combnCov(x)

Arguments

x

Character vector of covariates to combine.

...

Additional arguments.

Value

Returns a character vector of formulae combinations.

Examples

combnCov(x = c('a', 'b', 'c'))

Get covariates from a Matched object

Description

Get covariates from a Matched object

Usage

covariates(x, ...)

## S4 method for signature 'Matched'
covariates(x, ...)

Arguments

x

Matched object.

...

Additional arguments.

Value

A character vector of covariates

Examples

set.seed(123)
mdf <- makeExampleMatchedDataSet(matched = TRUE)
covariates(mdf)

Get focal set from a Matched object

Description

Get focal set from a Matched object

Usage

focal(x, ...)

## S4 method for signature 'MDF_OR_MGR_OR_MGI'
focal(x, ...)

Arguments

x

A MatchedDataFrame, MatchedGRanges, or MatchedGInteractions object.

...

Additional options.

Value

An object of the same class as x representing the focal set.

Examples

set.seed(123)
x <- makeExampleMatchedDataSet(matched = TRUE)
focal(x)

Get indices of matched set

Description

Extracts the indices of a specified matched set from a Matched object.

Usage

indices(x, set = "matched", ...)

## S4 method for signature 'Matched'
indices(x, set)

Arguments

x

Matched object.

set

A character string describing from which set to extract indices. can be one of 'focal', 'matched', 'pool', or 'unmatched'.

...

Additional arguments.

Details

Indices from 'focal' come from the focal set of matchRanges() used to construct the Matched object while indices from 'matched', 'pool', and 'unmatched' come from the pool set. Default returns the 'matched' indices from the pool set.

Value

An integer vector corresponding to the indices in the focal or pool which comprise the "focal" or c("matched", "pool", "unmatched") sets.

Examples

set.seed(123)
mdf <- makeExampleMatchedDataSet(matched = TRUE)
head(indices(mdf))

head(indices(mdf, set = 'focal'))
head(indices(mdf, set = 'pool'))
head(indices(mdf, set = 'matched'))
head(indices(mdf, set = 'unmatched'))

Function for generating an example matchRanges or Matched dataset

Description

This function will generate an example dataset as either 1) input for matchRanges() (when matched = FALSE) or 2) a Matched Object (when matched = TRUE).

Usage

makeExampleMatchedDataSet(
  type = "DataFrame",
  matched = FALSE,
  method = "rejection",
  replace = FALSE,
  ...
)

## S4 method for signature 
## 'character_OR_missing,
##   logical_OR_missing,
##   character_OR_missing,
##   logical_OR_missing'
makeExampleMatchedDataSet(type, matched, method, replace)

Arguments

type

Character designating which type of dataset to make. options are one of 'data.frame', 'data.table', 'DataFrame', 'GRanges', or 'GInteractions'.

matched

TRUE/FALSE designating whether to generate a Matched dataset (matched = TRUE) or an input dataset for matchRanges() (matched = FALSE).

method

Character describing which matching method to use. Supported options are either 'nearest', 'rejection', or 'stratified'.

replace

TRUE/FALSE describing whether to select matches with or without replacement.

...

Additional arguments

Details

When matched = FALSE, the data returned contains 3 different features that can be subset to perform matching.

Value

Returns an example Matched dataset or an example dataset for input to matchRanges().

Examples

## Make examples for matchRanges() (i.e matched = FALSE)
set.seed(123)
makeExampleMatchedDataSet()
head(makeExampleMatchedDataSet(type = 'data.frame', matched = FALSE))
makeExampleMatchedDataSet(type = 'data.table', matched = FALSE)
makeExampleMatchedDataSet(type = 'DataFrame', matched = FALSE)
makeExampleMatchedDataSet(type = 'GRanges', matched = FALSE)
makeExampleMatchedDataSet(type = 'GInteractions', matched = FALSE)

## Make Matched class examples (i.e. matched = TRUE)
set.seed(123)
makeExampleMatchedDataSet(matched = TRUE)
makeExampleMatchedDataSet(type = 'DataFrame', matched = TRUE,
                          method = 'rejection',
                          replace = FALSE)
makeExampleMatchedDataSet(type = 'GRanges', matched = TRUE,
                          method = 'rejection',
                          replace = FALSE)
# throwing error (April 2023)
#makeExampleMatchedDataSet(type = 'GInteractions', matched = TRUE,
#                          method = 'rejection',
#                          replace = FALSE)

Get matched set from a Matched object

Description

Get matched set from a Matched object

Usage

matched(x, ...)

## S4 method for signature 'MDF_OR_MGR_OR_MGI'
matched(x, ...)

Arguments

x

A MatchedDataFrame, MatchedGRanges, or MatchedGInteractions object.

...

Additional options.

Value

An object of the same class as x representing the matched set.

Examples

set.seed(123)
x <- makeExampleMatchedDataSet(matched = TRUE)
matched(x)

Matched objects

Description

The Matched class is a container for attributes of covariate-matched data resulting from matchRanges().

Slots

matchedData

A data.table with matched data

matchedIndex

An integer vector corresponding to the indices in the pool which comprise the matched set.

covar

A character vector describing the covariates used for matching.

method

Character describing replacement method used for matching.

replace

TRUE/FALSE describing if matching was done with or without replacement.

Accessor methods for Matched Class

Functions that get data from Matched subclasses (x) such as MatchedDataFrame, MatchedGRanges, and MatchedGInteractions are listed below:

  • matchedData(x): Get matched data from a Matched object

  • covariates(x): Get covariates from a Matched object

  • method(x): Get matching method used for Matched object

  • withReplacement(x): Get replace method

  • indices(x, set): Get indices of matched set

For more detail check the help pages for these functions.

See Also

matchedData, covariates, method, withReplacement, indices

Examples

## Make Matched example
set.seed(123)
x <- makeExampleMatchedDataSet(matched = TRUE)
## Accessor functions for Matched class
matchedData(x)
covariates(x)
method(x)
withReplacement(x)
head(indices(x, set = 'matched'))

Get matched data from a Matched object

Description

Get matched data from a Matched object

Usage

matchedData(x, ...)

## S4 method for signature 'Matched'
matchedData(x, ...)

Arguments

x

Matched object.

...

Additional arguments.

Value

A data.table with matched data.

Examples

set.seed(123)
mdf <- makeExampleMatchedDataSet(matched = TRUE)
matchedData(mdf)

MatchedDataFrame objects

Description

The MatchedDataFrame class is a subclass of both Matched and DFrame. Therefore, it contains slots and methods for both of these classes.

Details

The MatchedDataFrame class uses a delegate object during initialization to assign its DFrame slots. MatchedDataFrame behaves as a DataFrame but also includes additional Matched object functionality (see ?Matched). For more information about DataFrame see ?S4Vectors::DataFrame.

Slots

focal

A DataFrame object containing the focal data to match.

pool

A DataFrame object containing the pool from which to select matches.

delegate

A DataFrame object used to initialize DataFrame-specific slots. matchRanges() assigns the matched set to the slot.

matchedData

A data.table with matched data

matchedIndex

An integer vector corresponding to the indices in the pool which comprise the matched set.

covar

A character vector describing the covariates used for matching.

method

Character describing replacement method used for matching.

replace

TRUE/FALSE describing if matching was done with or without replacement.

rownames

rownames(delegate)

nrows

nrows(delegate)

listData

as.list(delegate)

elementType

elementType(delegate)

elementMetadata

elementMetadata(delegate)

metadata

metadata(delegate)

Accessor methods for Matched Class

Functions that get data from Matched subclasses (x) such as MatchedDataFrame, MatchedGRanges, and MatchedGInteractions are listed below:

  • matchedData(x): Get matched data from a Matched object

  • covariates(x): Get covariates from a Matched object

  • method(x): Get matching method used for Matched object

  • withReplacement(x): Get replace method

  • indices(x, set): Get indices of matched set

For more detail check the help pages for these functions.

Accessor methods for Matched subclass objects

Additional functions that get data from Matched subclasses (x) such as MatchedDataFrame, MatchedGRanges, and MatchedGInteractions include:

  • focal(x): Get focal set from a Matched object

  • pool(x): Get pool set from a Matched object

  • matched(x): Get matched set from a Matched object

  • unmatched(x): Get unmatched set from a Matched object

For more detail check the help pages for these functions.

See Also

S4Vectors::DataFrame

matchedData, covariates, method, withReplacement, indices

focal, pool, matched, unmatched

Examples

## Constructing MatchedDataFrame with matchRanges
## data.frame
set.seed(123)
x <- makeExampleMatchedDataSet(type = "data.frame")
mx <- matchRanges(
  focal = x[x$feature1, ],
  pool = x[!x$feature1, ],
  covar = ~ feature2 + feature3,
  method = "rejection",
  replace = FALSE
)
class(mx)

## data.table
set.seed(123)
x <- makeExampleMatchedDataSet(type = "data.table")
mx <- matchRanges(
  focal = x[x$feature1],
  pool = x[!x$feature1],
  covar = ~ feature2 + feature3,
  method = "rejection",
  replace = FALSE
)
class(mx)

## DataFrame
set.seed(123)
x <- makeExampleMatchedDataSet(type = "DataFrame")
mx <- matchRanges(
  focal = x[x$feature1, ],
  pool = x[!x$feature1, ],
  covar = ~ feature2 + feature3,
  method = "rejection",
  replace = FALSE
)
class(mx)

## Make MatchedDataFrame example
set.seed(123)
x <- makeExampleMatchedDataSet(type = "DataFrame", matched = TRUE)
## Accessor functions for Matched class
matchedData(x)
covariates(x)
method(x)
withReplacement(x)
head(indices(x, set = 'matched'))

## Accessor functions for Matched subclasses
focal(x)
pool(x)
matched(x)
unmatched(x)

MatchedGInteractions objects

Description

The MatchedGInteractions class is a subclass of both Matched and GInteractions. Therefore, it contains slots and methods for both of these classes.

Details

The MatchedGInteractions class uses a delegate object during initialization to assign its GInteractions slots. MatchedGInteractions behaves as a GInteractions but also includes additional Matched object functionality (see ?Matched). For more information about GInteractions see ?InteractionSet::GInteractions.

Slots

focal

A GInteractions object containing the focal data to match.

pool

A GInteractions object containing the pool from which to select matches.

delegate

A GInteractions object used to initialize GInteractions-specific slots. matchRanges() assigns the matched set to the slot.

matchedData

A data.table with matched data

matchedIndex

An integer vector corresponding to the indices in the pool which comprise the matched set.

covar

A character vector describing the covariates used for matching.

method

Character describing replacement method used for matching.

replace

TRUE/FALSE describing if matching was done with or without replacement.

anchor1

anchorIds(delegate)$first

anchor2

anchorIds(delegate)$second

regions

regions(delegate)

NAMES

names(delegate)

elementMetadata

elementMetadata(delegate)

metadata

metadata(delegate)

Accessor methods for Matched Class

Functions that get data from Matched subclasses (x) such as MatchedDataFrame, MatchedGRanges, and MatchedGInteractions are listed below:

  • matchedData(x): Get matched data from a Matched object

  • covariates(x): Get covariates from a Matched object

  • method(x): Get matching method used for Matched object

  • withReplacement(x): Get replace method

  • indices(x, set): Get indices of matched set

For more detail check the help pages for these functions.

Accessor methods for Matched subclass objects

Additional functions that get data from Matched subclasses (x) such as MatchedDataFrame, MatchedGRanges, and MatchedGInteractions include:

  • focal(x): Get focal set from a Matched object

  • pool(x): Get pool set from a Matched object

  • matched(x): Get matched set from a Matched object

  • unmatched(x): Get unmatched set from a Matched object

For more detail check the help pages for these functions.

See Also

InteractionSet::GInteractions

matchedData, covariates, method, withReplacement, indices

focal, pool, matched, unmatched

Examples

## Constructing MatchedGInteractions with matchRanges
set.seed(123)
gi <- makeExampleMatchedDataSet(type = "GInteractions")
mgi <- matchRanges(
  focal = gi[gi$feature1, ],
  pool = gi[!gi$feature1, ],
  covar = ~ feature2 + feature3,
  method = "rejection",
  replace = FALSE
)
class(mgi)

## Make MatchedGInteractions example
set.seed(123)
x <- makeExampleMatchedDataSet(type = "GInteractions", matched = TRUE)
## Accessor functions for Matched class
matchedData(x)
covariates(x)
method(x)
withReplacement(x)
head(indices(x, set = 'matched'))

## Accessor functions for Matched subclasses
focal(x)
pool(x)
matched(x)
unmatched(x)

MatchedGRanges objects

Description

The MatchedGRanges class is a subclass of both Matched and GRanges. Therefore, it contains slots and methods for both of these classes.

Details

The MatchedGRanges class uses a delegate object during initialization to assign its GRanges slots. MatchedGRanges behaves as a GRanges but also includes additional Matched object functionality (see ?Matched). For more information about GRanges see ?GenomicRanges::GRanges.

Slots

focal

A GRanges object containing the focal data to match.

pool

A GRanges object containing the pool from which to select matches.

delegate

A GRanges object used to initialize GRanges-specific slots. matchRanges() assigns the matched set to the slot.

matchedData

A data.table with matched data

matchedIndex

An integer vector corresponding to the indices in the pool which comprise the matched set.

covar

A character vector describing the covariates used for matching.

method

Character describing replacement method used for matching.

replace

TRUE/FALSE describing if matching was done with or without replacement.

seqnames

seqnames(delegate)

ranges

ranges(delegate)

strand

strand(delegate)

seqinfo

seqinfo(delegate)

elementMetadata

elementMetadata(delegate)

elementType

elementType(delegate)

metadata

metadata(delegate)

Accessor methods for Matched Class

Functions that get data from Matched subclasses (x) such as MatchedDataFrame, MatchedGRanges, and MatchedGInteractions are listed below:

  • matchedData(x): Get matched data from a Matched object

  • covariates(x): Get covariates from a Matched object

  • method(x): Get matching method used for Matched object

  • withReplacement(x): Get replace method

  • indices(x, set): Get indices of matched set

For more detail check the help pages for these functions.

Accessor methods for Matched subclass objects

Additional functions that get data from Matched subclasses (x) such as MatchedDataFrame, MatchedGRanges, and MatchedGInteractions include:

  • focal(x): Get focal set from a Matched object

  • pool(x): Get pool set from a Matched object

  • matched(x): Get matched set from a Matched object

  • unmatched(x): Get unmatched set from a Matched object

For more detail check the help pages for these functions.

See Also

GenomicRanges::GRanges

matchedData, covariates, method, withReplacement, indices

focal, pool, matched, unmatched

Examples

## Contructing MatchedGRanges with matchRanges
set.seed(123)
gr <- makeExampleMatchedDataSet(type = "GRanges")
mgr <- matchRanges(
  focal = gr[gr$feature1, ],
  pool = gr[!gr$feature1, ],
  covar = ~ feature2 + feature3,
  method = "rejection",
  replace = FALSE
)
class(mgr)

## Make MatchedGRanges example
set.seed(123)
x <- makeExampleMatchedDataSet(type = "GRanges", matched = TRUE)
## Accessor functions for Matched class
matchedData(x)
covariates(x)
method(x)
withReplacement(x)
head(indices(x, set = 'matched'))

## Accessor functions for Matched subclasses
focal(x)
pool(x)
matched(x)
unmatched(x)

Generate a covariate-matched control set of ranges

Description

matchRanges() uses a propensity score-based method to generate a covariate-matched control set of DataFrame, GRanges, or GInteractions objects.

Usage

matchRanges(focal, pool, covar, method = "nearest", replace = TRUE, ...)

## S4 method for signature 
## 'DF_OR_df_OR_dt,
##   DF_OR_df_OR_dt,
##   formula,
##   character_OR_missing,
##   logical_OR_missing'
matchRanges(focal, pool, covar, method, replace)

## S4 method for signature 
## 'GRanges,GRanges,formula,character_OR_missing,logical_OR_missing'
matchRanges(focal, pool, covar, method, replace)

## S4 method for signature 
## 'GInteractions,
##   GInteractions,
##   formula,
##   character_OR_missing,
##   logical_OR_missing'
matchRanges(focal, pool, covar, method, replace)

Arguments

focal

A DataFrame, GRanges, or GInteractions object containing the focal data to match.

pool

A DataFrame, GRanges, or GInteractions object containing the pool from which to select matches.

covar

A rhs formula with covariates on which to match.

method

A character describing which matching method to use. supported options are either 'nearest', 'rejection', or 'stratified'.

replace

TRUE/FALSE describing whether to select matches with or without replacement.

...

Additional arguments.

Details

Available inputs for focal and pool include data.frame, data.table, DataFrame, GRanges, or GInteractions. data.frame and data.table inputs are coerced to DataFrame objects and returned as MatchedDataFrame while GRanges and GInteractions objects are returned as MatchedGRanges or MatchedGInteractions, respectively.

Value

A covariate-matched control set of data.

Methodology

matchRanges uses propensity scores to perform subset selection on the pool set such that the resulting matched set contains similar distributions of covariates to that of the focal set. A propensity score is the conditional probability of assigning an element (in our case, a genomic range) to a particular outcome (Y) given a set of covariates. Propensity scores are estimated using a logistic regression model where the outcome Y=1 for focal and Y=0 for pool, over the provided covariates covar.

Matching methods

  • method = 'nearest': Nearest neighbor matching with replacement. Finds the nearest neighbor by using a rolling join with data.table. Matching without replacement is not currently supported.

  • method = 'rejection': (Default) Rejection sampling with or without replacement. Uses a probability-based approach to select options in the pool that match the focal distribition.

  • method = 'stratified': Iterative stratified sampling with or without replacement. Bins focal and pool propensity scores by value and selects matches within bins until all focal items have a corresponding match in pool.

References

matchRanges manuscript:

Eric S. Davis, Wancen Mu, Stuart Lee, Mikhail G. Dozmorov, Michael I. Love, Douglas H. Phanstiel. 2023. "matchRanges: Generating null hypothesis genomic ranges via covariate-matched sampling." Bioinformatics. doi: 10.1093/bioinformatics/btad197

Examples

## Match with DataFrame
set.seed(123)
x <- makeExampleMatchedDataSet(type = 'DataFrame')
matchRanges(focal = x[x$feature1,],
            pool = x[!x$feature1,],
            covar = ~feature2 + feature3)

## Match with GRanges
set.seed(123)
x <- makeExampleMatchedDataSet(type = "GRanges")
matchRanges(focal = x[x$feature1,],
            pool = x[!x$feature1,],
            covar = ~feature2 + feature3)

## Match with GInteractions
set.seed(123)
x <- makeExampleMatchedDataSet(type = "GInteractions")
matchRanges(focal = x[x$feature1,],
            pool = x[!x$feature1,],
            covar = ~feature2 + feature3)

## Nearest neighbor matching with replacement
set.seed(123)
x <- makeExampleMatchedDataSet(type = 'DataFrame')
matchRanges(focal = x[x$feature1,],
            pool = x[!x$feature1,],
            covar = ~feature2 + feature3,
            method = 'nearest',
            replace = TRUE)

## Rejection sampling without replacement
set.seed(123)
x <- makeExampleMatchedDataSet(type = 'DataFrame')
matchRanges(focal = x[x$feature1,],
            pool = x[!x$feature1,],
            covar = ~feature2 + feature3,
            method = 'rejection',
            replace = FALSE)

## Stratified sampling without replacement
set.seed(123)
x <- makeExampleMatchedDataSet(type = 'DataFrame')
matchRanges(focal = x[x$feature1,],
            pool = x[!x$feature1,],
            covar = ~feature2 + feature3,
            method = 'stratified',
            replace = FALSE)

Get matching method used for Matched object

Description

Get matching method used for Matched object

Usage

method(x, ...)

## S4 method for signature 'Matched'
method(x, ...)

Arguments

x

Matched object.

...

Additional arguments.

Value

A character describing the matched method

Examples

set.seed(123)
mdf <- makeExampleMatchedDataSet(matched = TRUE)
method(mdf)

Segmentation based on one region

Description

This function makes a segmentation (GRanges) based on one region of one chromosome (seqnames).

Usage

oneRegionSegment(x, seqlength)

Arguments

x

a single region as GRanges object

seqlength

optional, the length of the chromosome, if not provided, the function will attempt to pull this using genome(x) and the Seqinfo function

Value

a segmentation (GRanges object) with the region of interest designated as state 2, and the rest of the chromosome as state 1.

Examples

library(GenomicRanges)
library(GenomeInfoDb)
x <- GRanges("chr1", IRanges(10e6+1,width=1e6))
genome(x) <- "hg19"
seg <- oneRegionSegment(x)

Overview of matching quality

Description

The overview function provides a quick assessment of overall matching quality by reporting the N, mean, and s.d. of focal, matched, pool, and unmatched sets for all covariates as well as the propensity scores ('ps'). The mean and s.d. difference in focal - matched is also reported.

Usage

overview(x, digits = 2, ...)

## S4 method for signature 'Matched,numeric_OR_missing'
overview(x, digits)

Arguments

x

Matched object

digits

Integer indicating the number of significant digits to be used. Negative values are allowed (see ?signif).

...

Additional arguments.

Details

Factor, character, or logical covariates are reported by N per set, rather than with mean and s.d.

Value

  • A printed overview of matching quality.

  • (Internally) a MatchedOverview object.

Examples

set.seed(123)
mdf <- makeExampleMatchedDataSet(matched = TRUE)
overview(mdf)

Covariate plotting for Matched objects

Description

This function plots the distributions of a covariate from each matched set of a Matched object.

Usage

plotCovariate(
  x,
  covar = NULL,
  sets = c("focal", "matched", "pool", "unmatched"),
  type = NULL,
  log = NULL,
  ...
)

## S4 method for signature 
## 'Matched,
##   character_OR_missing,
##   character_OR_missing,
##   character_OR_missing,
##   character_OR_missing'
plotCovariate(x, covar, sets, type, log, thresh = 12)

Arguments

x

Matched object

covar

Character naming the covariate to plot. If multiple are provided, only the first one is used.

sets

Character vector describing which matched set(s) to include in the plot. Options are 'focal', 'matched', 'pool', or 'unmatched'. Multiple options are accepted.

type

Character naming the plot type. Available options are one of either 'ridges', 'jitter', 'lines', or 'bars'. Note that for large datasets, use of 'jitter' is discouraged because the large density of points can stall the R-graphics device.

log

Character vector describing which axis or axes to apply log-transformation. Available options are 'x' and/or 'y'.

...

Additional arguments.

thresh

Integer describing the number of unique values required to classify a numeric variable as discrete (and convert it to a factor). If the number of unique values exceeds thresh then the variable is considered continuous.

Details

By default, plotCovariate will sense the class of covariate and make a plot best suited to that data type. For example, if the covariate class is categorical in nature then the type argument defaults to 'bars'. type is set to 'lines' for continuous covariates. These settings can also be overwritten manually.

Value

Returns a plot of a covariate's distribution among matched sets.

See Also

plotPropensity() to plot propensity scores.

Examples

## Matched example dataset
set.seed(123)
mdf <- makeExampleMatchedDataSet(matched = TRUE)

## Visualize covariates
plotCovariate(mdf)
plotCovariate(mdf, covar = 'feature3')
plotCovariate(mdf,
              covar = 'feature2',
              sets = c('focal', 'matched', 'pool'))
plotCovariate(mdf,
              covar = 'feature2',
              sets = c('focal', 'matched', 'pool'),
              type = 'ridges')
plotCovariate(mdf,
              covar = 'feature2',
              sets = c('focal', 'matched', 'pool'),
              type = 'jitter')

Propensity score plotting for Matched objects

Description

This function plots the distribution of propensity scores from each matched set of a Matched object.

Usage

plotPropensity(
  x,
  sets = c("focal", "matched", "pool", "unmatched"),
  type = NULL,
  log = NULL,
  ...
)

## S4 method for signature 
## 'Matched,
##   character_OR_missing,
##   character_OR_missing,
##   character_OR_missing'
plotPropensity(x, sets, type, log, thresh = 12)

Arguments

x

Matched object

sets

Character vector describing which matched set(s) to include in the plot. Options are 'focal', 'matched', 'pool', or 'unmatched'. Multiple options are accepted.

type

Character naming the plot type. Available options are one of either 'ridges', 'jitter', 'lines', or 'bars'. Note that for large datasets, use of 'jitter' is discouraged because the large density of points can stall the R-graphics device.

log

Character vector describing which axis or axes to apply log-transformation. Available options are 'x' and/or 'y'.

...

Additional arguments.

thresh

Integer describing the number of unique values required to classify a numeric variable as discrete (and convert it to a factor). If the number of unique values exceeds thresh then the variable is considered continuous.

Details

plotPropensity uses the thresh argument to determine whether to plot propensity scores as continuous (line plot) or catetgorical (bar plot). These settings can also be overwritten manually.

Value

Returns a plot of propensity score distributions among matched sets.

See Also

plotCovariate() to plot covariate distributions.

Examples

## Matched example dataset
set.seed(123)
mdf <- makeExampleMatchedDataSet(matched = TRUE)

## Visualize propensity scores
plotPropensity(mdf)
plotPropensity(mdf,
              sets = c('focal', 'matched', 'pool'))
plotPropensity(mdf,
              sets = c('focal', 'matched', 'pool'),
              type = 'ridges')
plotPropensity(mdf,
              sets = c('focal', 'matched', 'pool'),
              type = 'jitter')

Plot genome segmentation

Description

Plot genome segmentation

Usage

plotSegment(
  seg,
  exclude = NULL,
  type = c("ranges", "barplot", "boxplot"),
  region = NULL
)

Arguments

seg

the segmentation GRanges returned by segmentDensity function

exclude

GRanges of excluded region

type

the type of plot returned. Choices are segmentation plot included ranges information, barplot showing segmentation states' distribution across chromosome, or a box plot indicating average density within each states. Default is all plots are displayed. The y axis "density" represent square root of overlap counts within segment length. If a user defined segmentation GRanges is given, the y axis is default to be 1 for all states in segmentation plot.

region

GRanges of stricted region that want to be plotted.

Value

A ggplot set by type argument

Examples

example("segmentDensity")
plotSegment(seg, exclude, type = "ranges")
plotSegment(seg, exclude, type = "barplot")
plotSegment(seg, exclude, type = "boxplot")

Get pool set from a Matched object

Description

Get pool set from a Matched object

Usage

pool(x, ...)

## S4 method for signature 'MDF_OR_MGR_OR_MGI'
pool(x, ...)

Arguments

x

A MatchedDataFrame, MatchedGRanges, or MatchedGInteractions object.

...

Additional options.

Value

An object of the same class as x representing the pool set.

Examples

set.seed(123)
x <- makeExampleMatchedDataSet(matched = TRUE)
pool(x)

Combine nearby regions with same state

Description

Combine nearby regions with same state

Usage

reduceSegment(x, col = "state")

Arguments

x

the input GRanges

col

the name of the column for the segment states

Value

a GRanges with metadata column state giving the segmentation state

Examples

n <- 10000
library(GenomicRanges)
gr <- GRanges("chr1", IRanges(round(
  c(runif(n/4,1,991), runif(n/4,1001,3991),
    runif(n/4,4001,4991), runif(n/4,7001,9991))),
  width=10), seqlengths=c(chr1=10000))
gr$name <- rep(1:4,each=10)
gr <- sort(gr)
seg <- reduceSegment(gr, col="name")

Genome segmentation based on feature density

Description

This function allows for various methods (see type) of segmenting based on the density of features x.

Usage

segmentDensity(x, n, L_s = 1e+06, exclude = NULL, type = c("cbs", "hmm"))

Arguments

x

the input GRanges, e.g. genes

n

the number of states

L_s

segment length

exclude

GRanges of excluded region

type

the type of segmentation, either Circular Binary Segmentation "cbs" (which will use 'DNAcopy' to segment) or Hidden Markov Model "hmm" (which will use 'RcppHMM'). These packages are not imported by nullranges, but must be installed by the user

Value

a GRanges with metadata columns containing:

  • state segmentation state

  • counts average number of genes

References

Circular binary segmentation (CBS):

Olshen, A. B., E. S. Venkatraman, R. Lucito, and M. Wigler. 2004. "Circular binary segmentation for the analysis of array-based DNA copy number data." Biostatistics 5 (4): 557–72.

Hidden Markov Model from RcppHMM:

Roberto A. Cardenas-Ovando, Julieta Noguez, and Claudia Rangel-Escareno. "Rcpp Hidden Markov Model." CRAN R package.

Examples

n <- 10000
library(GenomicRanges)
gr <- GRanges("chr1", IRanges(round(
  c(runif(n/4,1,991), runif(n/4,1001,3991),
    runif(n/4,4001,4991), runif(n/4,7001,9991))),
  width=10), seqlengths=c(chr1=10000))
gr <- sort(gr)
exclude <- GRanges("chr1", IRanges(5001,6000), seqlengths=c(chr1=10000))
seg <- segmentDensity(gr, n=3, L_s=100, exclude=exclude, type="cbs")

Get unmatched set from a Matched object

Description

Get unmatched set from a Matched object

Usage

unmatched(x, ...)

## S4 method for signature 'MDF_OR_MGR_OR_MGI'
unmatched(x, ...)

Arguments

x

A MatchedDataFrame, MatchedGRanges, or MatchedGInteractions object.

...

Additional options.

Value

An object of the same class as x representing the unmatched set.

Examples

set.seed(123)
x <- makeExampleMatchedDataSet(matched = TRUE)
unmatched(x)

Get replace method

Description

Determine if a Matched object was created with or without replacement.

Usage

withReplacement(x, ...)

## S4 method for signature 'Matched'
withReplacement(x, ...)

Arguments

x

Matched object.

...

Additional arguments.

Value

TRUE/FALSE indicating whether matching was done with or without replacement.

Examples

set.seed(123)
mdf <- makeExampleMatchedDataSet(matched = TRUE)
withReplacement(mdf)