Title: | Fast Motif Matching in R |
---|---|
Description: | Quickly find motif matches for many motifs and many sequences. Wraps C++ code from the MOODS motif calling library, which was developed by Pasi Rastas, Janne Korhonen, and Petri Martinmäki. |
Authors: | Alicia Schep [aut, cre], Stanford University [cph] |
Maintainer: | Alicia Schep <[email protected]> |
License: | GPL-3 + file LICENSE |
Version: | 1.29.0 |
Built: | 2024-11-19 03:59:04 UTC |
Source: | https://github.com/bioc/motifmatchr |
A few example motifs from JASPAR 2016 for trying out motifmatchr
data(example_motifs)
data(example_motifs)
PFMatrixList
of length 3
data(example_motifs, package = "motifmatchr")
data(example_motifs, package = "motifmatchr")
Find motif matches
matchMotifs(pwms, subject, ...) ## S4 method for signature 'PWMatrixList,DNAStringSet' matchMotifs(pwms, subject, genome = NULL, bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7, ranges = NULL) ## S4 method for signature 'PWMatrixList,character' matchMotifs(pwms, subject, genome = NULL, bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7, ranges = NULL) ## S4 method for signature 'PWMatrixList,DNAString' matchMotifs(pwms, subject, genome = NULL, bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7, ranges = NULL) ## S4 method for signature 'PWMatrixList,GenomicRanges' matchMotifs(pwms, subject, genome = GenomeInfoDb::genome(subject), bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7) ## S4 method for signature 'PWMatrixList,RangedSummarizedExperiment' matchMotifs(pwms, subject, genome = GenomeInfoDb::genome(subject), bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7) ## S4 method for signature 'PWMatrixList,BSgenomeViews' matchMotifs(pwms, subject, bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7) ## S4 method for signature 'PFMatrixList,ANY' matchMotifs(pwms, subject, ...) ## S4 method for signature 'PWMatrix,ANY' matchMotifs(pwms, subject, ...) ## S4 method for signature 'PFMatrix,ANY' matchMotifs(pwms, subject, ...)
matchMotifs(pwms, subject, ...) ## S4 method for signature 'PWMatrixList,DNAStringSet' matchMotifs(pwms, subject, genome = NULL, bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7, ranges = NULL) ## S4 method for signature 'PWMatrixList,character' matchMotifs(pwms, subject, genome = NULL, bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7, ranges = NULL) ## S4 method for signature 'PWMatrixList,DNAString' matchMotifs(pwms, subject, genome = NULL, bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7, ranges = NULL) ## S4 method for signature 'PWMatrixList,GenomicRanges' matchMotifs(pwms, subject, genome = GenomeInfoDb::genome(subject), bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7) ## S4 method for signature 'PWMatrixList,RangedSummarizedExperiment' matchMotifs(pwms, subject, genome = GenomeInfoDb::genome(subject), bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7) ## S4 method for signature 'PWMatrixList,BSgenomeViews' matchMotifs(pwms, subject, bg = c("subject", "genome", "even"), out = c("matches", "scores", "positions"), p.cutoff = 5e-05, w = 7) ## S4 method for signature 'PFMatrixList,ANY' matchMotifs(pwms, subject, ...) ## S4 method for signature 'PWMatrix,ANY' matchMotifs(pwms, subject, ...) ## S4 method for signature 'PFMatrix,ANY' matchMotifs(pwms, subject, ...)
pwms |
either |
subject |
either |
... |
additional arguments depending on inputs |
genome |
BSgenome object, |
bg |
background nucleotide frequencies. Default is to compute based on subject, i.e. the specific set of sequences being evaluated. See Details. |
out |
what to return? see return section |
p.cutoff |
p-value cutoff for returning motifs |
w |
parameter controlling size of window for filtration; default is 7 |
ranges |
if subject is not GenomicRanges or RangedSummarizedExperiment, these ranges can be used to specify what ranges the input sequences correspond to. These ranges will be incorporated into the SummarizedExperiment output if out is "matches" or "scores" or will be used to give absolute positions of motifs if out is "positions" |
Background nucleotide frequencies can be set to "subject" to use the subject sequences or ranges for computing the nucleotide frequencies, "genome" for using the genomice frequencies (in which case a genome must be specified), "even" for using 0.25 for each base, or a numeric vector with A, C, G, and T frequencies.
Either returns a SummarizedExperiment with a sparse matrix with
values set to TRUE for a match (if out == 'matches'), a
SummarizedExperiment with a matches matrix as well as matrices with the
maximum motif score and total motif counts (if out == 'scores'), or a
GenomicRangesList
or a list of
IRangesList
with all the positions of matches
(if out == 'positions')
pwms = PWMatrixList,subject = DNAStringSet
: PWMatrixList/DNAStringSet
pwms = PWMatrixList,subject = character
: PWMatrixList/character
pwms = PWMatrixList,subject = DNAString
: PWMatrixList/DNAString
pwms = PWMatrixList,subject = GenomicRanges
: PWMatrixList/GenomicRanges
pwms = PWMatrixList,subject = RangedSummarizedExperiment
: PWMatrixList/RangedSummarizedExperiment
pwms = PWMatrixList,subject = BSgenomeViews
: PWMatrixList/BSGenomeViews
pwms = PFMatrixList,subject = ANY
: PFMatrixList/ANY
pwms = PWMatrix,subject = ANY
: PWMatrix/ANY
pwms = PFMatrix,subject = ANY
: PFMatrix/ANY
data(example_motifs, package = "motifmatchr") # Make a set of peaks peaks <- GenomicRanges::GRanges(seqnames = c("chr1","chr2","chr2"), ranges = IRanges::IRanges(start = c(76585873,42772928, 100183786), width = 500)) # Get motif matches for example motifs motif_ix <- matchMotifs(example_motifs, peaks, genome = "BSgenome.Hsapiens.UCSC.hg19")
data(example_motifs, package = "motifmatchr") # Make a set of peaks peaks <- GenomicRanges::GRanges(seqnames = c("chr1","chr2","chr2"), ranges = IRanges::IRanges(start = c(76585873,42772928, 100183786), width = 500)) # Get motif matches for example motifs motif_ix <- matchMotifs(example_motifs, peaks, genome = "BSgenome.Hsapiens.UCSC.hg19")
get motif counts from SummarizedExperiment object
motifCounts(object) ## S4 method for signature 'SummarizedExperiment' motifCounts(object)
motifCounts(object) ## S4 method for signature 'SummarizedExperiment' motifCounts(object)
object |
SummarizedExperiment object with counts assay |
matrix with counts
SummarizedExperiment
: method for SummarizedExperiment
data(example_motifs, package = "motifmatchr") # Make a set of peaks peaks <- GenomicRanges::GRanges(seqnames = c("chr1","chr2","chr2"), ranges = IRanges::IRanges(start = c(76585873,42772928, 100183786), width = 500)) # Get motif matches for example motifs motif_ix <- matchMotifs(example_motifs, peaks, genome = "BSgenome.Hsapiens.UCSC.hg19", out = "scores") motifCounts(motif_ix)
data(example_motifs, package = "motifmatchr") # Make a set of peaks peaks <- GenomicRanges::GRanges(seqnames = c("chr1","chr2","chr2"), ranges = IRanges::IRanges(start = c(76585873,42772928, 100183786), width = 500)) # Get motif matches for example motifs motif_ix <- matchMotifs(example_motifs, peaks, genome = "BSgenome.Hsapiens.UCSC.hg19", out = "scores") motifCounts(motif_ix)
get motif matches from SummarizedExperiment object
motifMatches(object) ## S4 method for signature 'SummarizedExperiment' motifMatches(object)
motifMatches(object) ## S4 method for signature 'SummarizedExperiment' motifMatches(object)
object |
SummarizedExperiment object with matches assay |
matrix with scores
SummarizedExperiment
: method for SummarizedExperiment
data(example_motifs, package = "motifmatchr") # Make a set of peaks peaks <- GenomicRanges::GRanges(seqnames = c("chr1","chr2","chr2"), ranges = IRanges::IRanges(start = c(76585873,42772928, 100183786), width = 500)) # Get motif matches for example motifs motif_ix <- matchMotifs(example_motifs, peaks, genome = "BSgenome.Hsapiens.UCSC.hg19") motifMatches(motif_ix)
data(example_motifs, package = "motifmatchr") # Make a set of peaks peaks <- GenomicRanges::GRanges(seqnames = c("chr1","chr2","chr2"), ranges = IRanges::IRanges(start = c(76585873,42772928, 100183786), width = 500)) # Get motif matches for example motifs motif_ix <- matchMotifs(example_motifs, peaks, genome = "BSgenome.Hsapiens.UCSC.hg19") motifMatches(motif_ix)
The motifmatchr package is designed for analyzing many sequences and many motifs to find which sequences contain which motifs.
motifmatchr uses the MOODS C++ library (developedby Pasi Rastas, Janne Korhonen, and Petri Martinmaki) internally for motif matching.
The primary method of motifmatchr is matchMotifs
, which takes
in motif PWMs/PFMs and genomic ranges or sequences and returns either which
ranges/sequences match which motifs or the positions of the matches.
Compared with alternative motif matching functions available in Bioconductor (e.g. matchPWM in Biostrings or searchSeq in TFBSTools), motifmatchr is designed specifically for the use case of determining whether many different sequences/ranges contain many different motifs.
Alicia Schep
motifmatchr has moved functions and methods to camelCase from snake_case. The following functions have been deprecated and replaced with a different name:
motif_matches is now motifMatches
motif_counts is now motifCounts
motif_scores is now motifScores
match_motifs is now matchMotifs
motif_matches(...) motif_counts(...) motif_scores(...) match_motifs(...)
motif_matches(...) motif_counts(...) motif_scores(...) match_motifs(...)
... |
arguments passed to new function |
calls the replacement function
Alicia Schep
get motif scores from SummarizedExperiment object
motifScores(object) ## S4 method for signature 'SummarizedExperiment' motifScores(object)
motifScores(object) ## S4 method for signature 'SummarizedExperiment' motifScores(object)
object |
SummarizedExperiment object with scores assay |
matrix with scores
SummarizedExperiment
: method for SummarizedExperiment
data(example_motifs, package = "motifmatchr") # Make a set of peaks peaks <- GenomicRanges::GRanges(seqnames = c("chr1","chr2","chr2"), ranges = IRanges::IRanges(start = c(76585873,42772928, 100183786), width = 500)) # Get motif matches for example motifs motif_ix <- matchMotifs(example_motifs, peaks, genome = "BSgenome.Hsapiens.UCSC.hg19", out = "scores") motifScores(motif_ix)
data(example_motifs, package = "motifmatchr") # Make a set of peaks peaks <- GenomicRanges::GRanges(seqnames = c("chr1","chr2","chr2"), ranges = IRanges::IRanges(start = c(76585873,42772928, 100183786), width = 500)) # Get motif matches for example motifs motif_ix <- matchMotifs(example_motifs, peaks, genome = "BSgenome.Hsapiens.UCSC.hg19", out = "scores") motifScores(motif_ix)