Package 'AnnotationFilter'

Title: Facilities for Filtering Bioconductor Annotation Resources
Description: This package provides class and other infrastructure to implement filters for manipulating Bioconductor annotation resources. The filters will be used by ensembldb, Organism.dplyr, and other packages.
Authors: Martin Morgan [aut], Johannes Rainer [aut], Joachim Bargsten [ctb], Daniel Van Twisk [ctb], Bioconductor Package Maintainer [cre]
Maintainer: Bioconductor Package Maintainer <[email protected]>
License: Artistic-2.0
Version: 1.31.0
Built: 2024-10-30 03:29:06 UTC
Source: https://github.com/bioc/AnnotationFilter

Help Index


Filters for annotation objects

Description

The filters extending the base AnnotationFilter class represent a simple filtering concept for annotation resources. Each filter object is thought to filter on a single (database) table column using the provided values and the defined condition.

Filter instances created using the constructor functions (e.g. GeneIdFilter).

supportedFilters() lists all defined filters. It returns a two column data.frame with the filter class name and its default field. Packages using AnnotationFilter should implement the supportedFilters for their annotation resource object (e.g. for object = "EnsDb" in the ensembldb package) to list all supported filters for the specific resource.

condition() get the condition value for the filter object.

value() get the value for the filter object.

field() get the field for the filter object.

not() get the not for the filter object.

feature() get the feature for the GRangesFilter object.

Converts an AnnotationFilter object to a character(1) giving an equation that can be used as input to a dplyr filter.

AnnotationFilter translates a filter expression such as ~ gene_id == "BCL2" into a filter object extending the AnnotationFilter class (in the example a GeneIdFilter object) or an AnnotationFilterList if the expression contains multiple conditions (see examples below). Filter expressions have to be written in the form ~ <field> <condition> <value>, with <field> being the default field of the filter class (use the supportedFilter function to list all fields and filter classes), <condition> the logical expression and <value> the value for the filter.

Usage

CdsStartFilter(value, condition = "==", not = FALSE)
CdsEndFilter(value, condition = "==", not = FALSE)
ExonIdFilter(value, condition = "==", not = FALSE)
ExonNameFilter(value, condition = "==", not = FALSE)
ExonRankFilter(value, condition = "==", not = FALSE)
ExonStartFilter(value, condition = "==", not = FALSE)
ExonEndFilter(value, condition = "==", not = FALSE)
GeneIdFilter(value, condition = "==", not = FALSE)
GeneNameFilter(value, condition = "==", not = FALSE)
GeneBiotypeFilter(value, condition = "==", not = FALSE)
GeneStartFilter(value, condition = "==", not = FALSE)
GeneEndFilter(value, condition = "==", not = FALSE)
EntrezFilter(value, condition = "==", not = FALSE)
SymbolFilter(value, condition = "==", not = FALSE)
TxIdFilter(value, condition = "==", not = FALSE)
TxNameFilter(value, condition = "==", not = FALSE)
TxBiotypeFilter(value, condition = "==", not = FALSE)
TxStartFilter(value, condition = "==", not = FALSE)
TxEndFilter(value, condition = "==", not = FALSE)
ProteinIdFilter(value, condition = "==", not = FALSE)
UniprotFilter(value, condition = "==", not = FALSE)
SeqNameFilter(value, condition = "==", not = FALSE)
SeqStrandFilter(value, condition = "==", not = FALSE)

## S4 method for signature 'AnnotationFilter'
condition(object)

## S4 method for signature 'AnnotationFilter'
value(object)

## S4 method for signature 'AnnotationFilter'
field(object)

## S4 method for signature 'AnnotationFilter'
not(object)

GRangesFilter(value, feature = "gene", type = c("any", "start", "end",
  "within", "equal"))

feature(object)

## S4 method for signature 'AnnotationFilter,missing'
convertFilter(object)

## S4 method for signature 'missing'
supportedFilters(object)

AnnotationFilter(expr)

Arguments

object

An AnnotationFilter object.

value

character(), integer(), or GRanges() value for the filter

feature

character(1) defining on what feature the GRangesFilter should be applied. Choices could be "gene", "tx" or "exon".

type

character(1) indicating how overlaps are to be filtered. See findOverlaps in the IRanges package for a description of this argument.

expr

A filter expression, written as a formula, to be converted to an AnnotationFilter or AnnotationFilterList class. See below for examples.

condition

character(1) defining the condition to be used in the filter. For IntegerFilter or DoubleFilter, one of "==", "!=", ">", "<", ">=" or "<=". For CharacterFilter, one of "==", "!=", "startsWith", "endsWith" or "contains". Default condition is "==".

not

logical(1) whether the AnnotationFilter is negated. TRUE indicates is negated (!). FALSE indicates not negated. Default not is FALSE.

Details

By default filters are only available for tables containing the field on which the filter acts (i.e. that contain a column with the name matching the value of the field slot of the object). See the vignette for a description to use filters for databases in which the database table column name differs from the default field of the filter.

Filter expressions for the AnnotationFilter class have to be written as formulas, i.e. starting with a ~.

Value

The constructor function return an object extending AnnotationFilter. For the return value of the other methods see the methods' descriptions.

character(1) that can be used as input to a dplyr filter.

AnnotationFilter returns an AnnotationFilter or an AnnotationFilterList.

Note

Translation of nested filter expressions using the AnnotationFilter function is not yet supported.

See Also

AnnotationFilterList for combining AnnotationFilter objects.

Examples

## filter by GRanges
GRangesFilter(GenomicRanges::GRanges("chr10:87869000-87876000"))
## Create a SymbolFilter to filter on a gene's symbol.
sf <- SymbolFilter("BCL2")
sf

## Create a GeneStartFilter to filter based on the genes' chromosomal start
## coordinates
gsf <- GeneStartFilter(10000, condition = ">")
gsf

filter <- SymbolFilter("ADA", "==")
result <- convertFilter(filter)
result
supportedFilters()

## Convert a filter expression based on a gene ID to a GeneIdFilter
gnf <- AnnotationFilter(~ gene_id == "BCL2")
gnf

## Same conversion but for two gene IDs.
gnf <- AnnotationFilter(~ gene_id %in% c("BCL2", "BCL2L11"))
gnf

## Converting an expression that combines multiple filters. As a result we
## get an AnnotationFilterList containing the corresponding filters.
## Be aware that nesting of expressions/filters does not work.
flt <- AnnotationFilter(~ gene_id %in% c("BCL2", "BCL2L11") &
                        tx_biotype == "nonsense_mediated_decay" |
                        seq_name == "Y")
flt

Combining annotation filters

Description

The AnnotationFilterList allows to combine filter objects extending the AnnotationFilter class to construct more complex queries. Consecutive filter objects in the AnnotationFilterList can be combined by a logical and (&) or or (|). The AnnotationFilterList extends list, individual elements can thus be accessed with [[.

value() get a list with the AnnotationFilter objects. Use [[ to access individual filters.

logicOp() gets the logical operators separating successive AnnotationFilter.

not() gets the logical operators separating successive AnnotationFilter.

Converts an AnnotationFilterList object to a character(1) giving an equation that can be used as input to a dplyr filter.

Usage

AnnotationFilterList(..., logicOp = character(), logOp = character(),
  not = FALSE, .groupingFlag = FALSE)

## S4 method for signature 'AnnotationFilterList'
value(object)

## S4 method for signature 'AnnotationFilterList'
logicOp(object)

## S4 method for signature 'AnnotationFilterList'
not(object)

## S4 method for signature 'AnnotationFilterList'
distributeNegation(object,
  .prior_negation = FALSE)

## S4 method for signature 'AnnotationFilterList,missing'
convertFilter(object)

## S4 method for signature 'AnnotationFilterList'
show(object)

Arguments

...

individual AnnotationFilter objects or a mixture of AnnotationFilter and AnnotationFilterList objects.

logicOp

character of length equal to the number of submitted AnnotationFilter objects - 1. Each value representing the logical operation to combine consecutive filters, i.e. the first element being the logical operation to combine the first and second AnnotationFilter, the second element being the logical operation to combine the second and third AnnotationFilter and so on. Allowed values are "&" and "|". The function assumes a logical and between all elements by default.

logOp

Deprecated; use logicOp=.

not

logical of length one. Indicates whether the grouping of AnnotationFilters are to be negated.

.groupingFlag

Flag desginated for internal use only.

object

An object of class AnnotationFilterList.

.prior_negation

logical(1) unused argument.

Value

AnnotationFilterList returns an AnnotationFilterList.

value() returns a list with AnnotationFilter objects.

logicOp() returns a character() vector of “&” or “|” symbols.

not() returns a character() vector of “&” or “|” symbols.

AnnotationFilterList object with DeMorgan's law applied to it such that it is equal to the original AnnotationFilterList object but all !'s are distributed out of the AnnotationFilterList object and to the nested AnnotationFilter objects.

character(1) that can be used as input to a dplyr filter.

Note

The AnnotationFilterList does not support containing empty elements, hence all elements of length == 0 are removed in the constructor function.

See Also

supportedFilters for available AnnotationFilter objects

Examples

## Create some AnnotationFilters
gf <- GeneNameFilter(c("BCL2", "BCL2L11"))
tbtf <- TxBiotypeFilter("protein_coding", condition = "!=")

## Combine both to an AnnotationFilterList. By default elements are combined
## using a logical "and" operator. The filter list represents thus a query
## like: get all features where the gene name is either ("BCL2" or "BCL2L11")
## and the transcript biotype is not "protein_coding".
afl <- AnnotationFilterList(gf, tbtf)
afl

## Access individual filters.
afl[[1]]

## Create a filter in the form of: get all features where the gene name is
## either ("BCL2" or "BCL2L11") and the transcript biotype is not
## "protein_coding" or the seq_name is "Y". Hence, this will get all feature
## also found by the previous AnnotationFilterList and returns also all
## features on chromosome Y.
afl <- AnnotationFilterList(gf, tbtf, SeqNameFilter("Y"),
                            logicOp = c("&", "|"))
afl

afl <- AnnotationFilter(~!(symbol == 'ADA' | symbol %startsWith% 'SNORD'))
afl <- distributeNegation(afl)
afl
afl <- AnnotationFilter(~symbol=="ADA" & tx_start > "400000")
result <- convertFilter(afl)
result

DEPRECATED Gene name filter

Description

The GenenameFilter class and functions are deprecated. Please use the GeneNameFilter() instead.

Usage

GenenameFilter(value, condition = "==", not = FALSE)

Arguments

value

character() value for the filter

condition

character(1) defining the condition to be used in the filter. One of "==", "!=", "startsWith", "endsWith" or "contains". Default condition is "==".

not

logical(1) whether the AnnotationFilter is negated. TRUE indicates is negated (!). FALSE indicates not negated. Default not is FALSE.

Value

The constructor function return a GenenameFilter.