Changes in version 1.32.0

NEW FEATURES

o Added support for CWM (Contribution Weight Matrix) motifs. Contribution Weight Matrices come from TF-MoDISco / DeepLIFT-style attribution methods and hold signed, unnormalised per-position per-letter contribution scores. CWM is now a first-class fifth motif type alongside PCM / PPM / PWM / ICM: `create_motif(matrix, type = "CWM")` accepts signed matrices with no column-sum constraint, `convert_type(cwm, "PPM")` converts via the TF-MoDISco-lite convention `ppm[i,j] = |cwm[i,j]| / sum_i |cwm[i,j]|` (and further routing through PPM lets CWM motifs reach PWM / ICM / PCM); `convert_type(non_cwm, "CWM")` errors with a clear message because contribution scores cannot be recovered from probabilities. `read_meme()` and `write_meme()` gain a `CWM = FALSE` argument: with `CWM = TRUE`, MEME files are parsed as CWMs (raw signed matrices, type tag = "CWM") or written verbatim from a CWM motif list. `read_matrix()` and `write_matrix()` read/write CWMs via `type = "CWM"` (the file format is unchanged; the CWM type tag is opt-in and never auto-detected, so a signed matrix without `type = "CWM"` still reads as a PWM). `view_motifs()` and `view_motifs2()` accept `use.type = "CWM"` and render the raw signed values with a "contribution" y-axis label. CWM motifs flow through every analysis function that implicitly converts the type (`compare_motifs[2]()`, `scan_sequences[2]()`, `merge_motifs[2]()`, `merge_similar[2]()`, `motif_pvalue()`, `motif_tree[2]()`, `motif_rc()`, `trim_motifs()`).

o Added trim_cwm(), a CWM-aware trimmer that drops edge columns by absolute column sum. Defaults to the TF-MoDISco-lite fraction-of-peak rule (`trim.threshold = 0.3`); an optional `abs.threshold` argument switches to an absolute-value cutoff. Walks inward from the chosen edges only (`trim.from = c("both", "left", "right")`), mirroring `trim_motifs()`'s edge-only semantics. Returns a trimmed CWM with the type tag preserved.
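The CWM-to-PPM rule and the fraction-of-peak trimming rule described in the two entries above can be sketched in base R. This is a minimal illustration on a toy matrix, not the package's implementation: `cwm_to_ppm()` and `trim_by_peak()` are hypothetical helper names, and reading "fraction of peak" as `trim.threshold` times the largest absolute column sum is an assumption.

```r
# Toy 4 x 5 CWM: rows are letters, columns are positions (signed scores).
cwm <- matrix(
  c( 0.01, -0.02,  0.00,  0.01,   # position 1 (weak flank)
     0.90, -0.10,  0.05, -0.05,   # position 2
    -0.20,  1.20, -0.10,  0.10,   # position 3
     0.10,  0.05,  0.80, -0.15,   # position 4
     0.02,  0.00, -0.01,  0.02),  # position 5 (weak flank)
  nrow = 4,
  dimnames = list(c("A", "C", "G", "T"), NULL)
)

# Hypothetical helper: ppm[i, j] = |cwm[i, j]| / sum_i |cwm[i, j]|.
cwm_to_ppm <- function(cwm) {
  a <- abs(cwm)
  sweep(a, 2, colSums(a), "/")
}

# Hypothetical helper (assumed rule): keep the span between the first and
# last columns whose absolute column sum reaches `threshold` times the
# largest such sum, which is the same as walking inward from both edges.
trim_by_peak <- function(cwm, threshold = 0.3) {
  score <- colSums(abs(cwm))
  keep <- which(score >= threshold * max(score))
  cwm[, seq(min(keep), max(keep)), drop = FALSE]
}

ppm <- cwm_to_ppm(cwm)
trimmed <- trim_by_peak(cwm)
colSums(ppm)   # every column sums to 1
ncol(trimmed)  # 3: the two weak flanking columns are dropped
```

The sketch does not handle an all-zero column, where the PPM rule would divide by zero.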
o Added compare_motifs2(), a minimalist DNA/RNA motif-vs-motif comparison function aligned with the command-line tool yamtk cmp (https://github.com/bjmt/yamtk). Uses Pearson correlation per column as the only similarity metric and computes per-pair p-values from a TOMTOM-style null PMF, either empirical from the target database (default) or parametric via Dirichlet-Multinomial over a K=5 simplex grid, with Bonferroni + Benjamini-Hochberg adjustment. Returns either a square matrix (matrix.out = "score" / "pvalue" / "qvalue") or a long-format data.frame of significant pairs. On a 50-motif HOCOMOCOv11 self-comparison the port reproduces yamtk's hits exactly and the p/q-values to within 5e-6 relative error; matrix.out = "score" runs ~40x faster than compare_motifs() on a 200x200 set at 4 threads.

o Added scan_sequences2(), a minimalist DNA/RNA scanner aligned with the defaults of the yamtk scan command-line tool (https://github.com/bjmt/yamtk). Reuses the same C++ scanning core as scan_sequences() but skips steps for features it does not support (multifreq scoring, gapped motifs, q-values, exhaustive p-values, non-pvalue threshold types, respect.strand, allow.nonfinite). On a 50-motif HOCOMOCOv11 benchmark, ~2-3x faster than scan_sequences(calc.pvals = TRUE) at 4 threads on equivalent work, and within ~3x of yamtk scan on long-sequence workloads. See the "A faster alternative: scan_sequences2()" section in the Sequence Searches vignette for the full speed comparison.

o Added view_motifs2() and motif_tree2(): drop-in display-time siblings of view_motifs() and motif_tree() that use the new Pearson-correlation backend (compare_motifs2()). view_motifs2() picks the highest-IC motif as the anchor, aligns every other motif against it via the same routine used by merge_motifs2() and merge_similar2(), and runs the result through the same logo renderer as view_motifs().
motif_tree2() takes compare_motifs2(matrix.out = "score") and converts the mean Pearson matrix to a symmetric distance as (1 - score) / 2, feeding it into the same hclust / ape::as.phylo / ggtree::ggtree pipeline that motif_tree() uses. Both original functions are unchanged. New arguments are minimal: tryRC, min.overlap, nthreads. DNA / RNA only; for amino-acid or custom-alphabet motifs continue to use the original functions.

o New vignette "MotifDatabaseCuration.Rmd": end-to-end recipe for consolidating motif sets from several public sources into a single deduplicated, trimmed, and annotated MEME file. Pulls Homo sapiens motifs from three MotifDb sub-collections (JASPAR2022, HOCOMOCOv11-core, CIS-BP), stamps source provenance onto each motif's family slot, inspects the pairwise q-value distribution from compare_motifs2() to pick a deduplication threshold, runs merge_similar2() with a q-value chosen to avoid graph-transitivity giant components, visualises the cluster structure with motif_tree(), trims uninformative flanks with trim_motifs(), decorates the survivors with provenance and date metadata via to_df()/update_motifs(), and exports the whole library with write_meme() before round-trip-reading it back with read_meme() to verify the export.

o New vignette "ChIPseqWorkflow.Rmd": an end-to-end ChIP-seq motif analysis on real data, starting from a peak BED file bundled in inst/extdata (VAL1-GFP peaks in Arabidopsis from Yuan et al. 2021, NAR 49:98-113). The vignette imports peaks via rtracklayer, extracts sequences via BSgenome.Athaliana.TAIR.TAIR9, discovers motifs de novo with motif_finder(), deduplicates with merge_similar2(), compares to JASPAR2022 plant motifs via MotifDb and compare_motifs2(), validates enrichment with enrich_motifs2(), scans peaks with scan_sequences2(), tests central positional bias with motif_peaks(), finds co-occurring motif pairs with motif_coocc(), and closes with a summary table that links each discovered motif to its evidence.
New Suggests dependencies: rtracklayer, BSgenome.Athaliana.TAIR.TAIR9.

o view_motifs() gains a `sort.by` argument controlling the display order when plotting multiple motifs. Regardless of this setting, alignment is always performed with the highest-IC motif as the anchor (matching merge_motifs2()'s behaviour); `sort.by` controls only the order in which the aligned motifs are shown. Options: `"none"` (default; input order preserved for display while alignment still uses the highest-IC anchor), `"ic"` (descending information content), or `"similarity"` (hierarchical clustering on Pearson correlation between aligned columns, so visually similar motifs are adjacent).

o Added motif_coocc(): pairwise motif co-occurrence testing. Given a list of motifs and a set of host sequences, motif_coocc() scans (via scan_sequences2) and tests every motif pair for over-co-occurrence using a one-sided Fisher's exact test on the per-sequence 2x2 contingency table, BH-correcting across all tested pairs. Two input paths: the default scans internally and is DNA/RNA only; alternatively pass a precomputed hit table via `hits = ...` plus `n.sequences = ...` and any alphabet is accepted (AA, custom). Optional `max.distance` argument adds two descriptive columns (both.clustered, median.distance) useful for flagging heterodimer-like arrangements; the Fisher p-value is always computed on the unfiltered 2x2. See ?motif_coocc.

o Added implant_motifs(): plant sampled motif instances into existing sequences at known positions, producing benchmark-grade ground-truth positive sequences for testing scan_sequences(), motif_finder(), enrich_motifs2(), motif_peaks(), or any other discovery / scanning pipeline against a known answer key. Three insertion modes: fixed count per sequence (n.per.seq), per-bp Poisson rate (rate), or explicit positions (positions).
Optional controls: centre.bias (Irwin-Hall N, 1 = uniform, larger = cluster near sequence centre, pairs naturally with motif_peaks() / plot_motif_peaks() for ChIP-seq-style simulations), min.spacing (minimum gap between adjacent implants), strand ("both" or "+"), and max.retries on placement collision. By default returns the modified XStringSet; with return.indices = TRUE returns a list with the modified sequences plus a data.frame of (sequence.i, motif.i, start, width, strand, planted) ground-truth metadata. See ?implant_motifs.

o Added match_bkg() and plot_match_bkg(): composition-matched background sampling for DNA/RNA motif enrichment and de novo discovery analyses. Given a target XStringSet and a larger universe XStringSet (typically extracted from a BSgenome) or a BSgenome object directly, match_bkg() samples background sequences from the universe that match the target on GC fraction and length, the HOMER-style binned approach. This complements shuffle_sequences(), which preserves each input's own k-let composition by shuffling in place; match_bkg() instead controls for composition bias relative to a genomic universe. The result is an XStringSet usable directly as `bkg.sequences` in enrich_motifs2() / motif_finder(). plot_match_bkg() returns a ggplot of overlaid target vs matched-background GC and length densities for visual match-quality QC. Beyond GC and length, match_bkg() can also match on arbitrary external per-sequence covariates (a ChIP signal score, conservation, mappability and so on), supplied through the `covariates` and `universe.covariates` arguments, with `n.bins.covariates` and `bin.type.covariates` controlling how each extra axis is binned. Covariate matching needs an explicit `universe` paired with `universe.covariates` and cannot be combined with `genome`, because internally-sampled genomic windows carry no covariate values. With return.indices = TRUE the result then carries a target. / universe.
column for each covariate, and plot_match_bkg() will facet a density panel for every matched axis when handed that data.frame through its new `indices` argument. See ?match_bkg.

o Rewrote motif_peaks() as an analytical CentriMo-style positional enrichment function (Bailey & Machanick 2012). It takes a hit table from scan_sequences() or scan_sequences2() (data.frame or GRanges) and tests whether each motif's hits cluster non-uniformly along the input sequences via a one-sided binomial test on candidate windows. Two modes: mode = "central" (default) tests only centre-of-sequence windows (appropriate for sequences centred on a reference like ChIP-seq peak summits); mode = "local" varies the window centre too. Best window per motif is reported with Bonferroni correction across windows tested, then BH q-value across motifs. Also added a companion plot_motif_peaks() that returns a ggplot faceted by motif (hit-centre histogram + best-window shade + per-motif annotation in the strip label). See ?motif_peaks.

o Added merge_motifs2() and merge_similar2(), minimalist DNA/RNA counterparts to merge_motifs() and merge_similar() built on compare_motifs2(). merge_motifs2() uses anchor-based progressive alignment (highest-IC anchor + single-call alignment of every other motif against it) and column-union averaging (optionally @nsites-weighted). merge_similar2() clusters motifs by statistical significance: pairs are linked when their compare_motifs2() q-value meets the user's cutoff, and connected components on the resulting graph (via a small union-find) define the clusters. Each multi-motif cluster is then collapsed via merge_motifs2(). This matches the STAMP / TOMTOM-clustering semantics ("group all motifs that are pairwise significantly similar") and removes v1's linkage-method choice and abstract distance-unit cutoff.
merge_similar2(return.clusters = TRUE) returns a universalmotif_df of the input motifs (motif objects in the `motif` column, names in `name`) with added `motif.i` and `cluster` columns describing the cluster assignment, instead of the merged motifs. See ?merge_motifs2 and ?merge_similar2.

o Added motif_finder(), a DNA/RNA _de novo_ motif discovery function aligned with the command-line tool yamtk me (https://github.com/bjmt/yamtk), whose own design is based on the STREME algorithm (Bailey 2021): per-width word-counting seed enumeration (Fisher's exact on per-sequence presence), iterative PPM refinement, per-motif Fisher's exact significance against a shuffled or user-supplied background, cross-width coverage-overlap dedup, BH q-value filtering, and IC-trimming of low-information flanks. Returns a universalmotif_df with one row per discovered motif; the standard slot columns (name, consensus, alphabet, type, nsites, pval, eval, motif, ...) sit alongside yamtk-style stats (rank, width, seqs_pos, seqs_neg, sites_pos, sites_neg, n_pos, n_neg, pvalue, qvalue). Width-parallel via RcppThread. Replaces a long-standing unexported, untested motif_finder() stub. See ?motif_finder.

o Added enrich_motifs2(), a minimalist DNA/RNA motif-enrichment function aligned with the command-line tool yamtk enr (https://github.com/bjmt/yamtk). Built on scan_sequences2() and exposes a single p-value hit threshold, a single q-value result cutoff, two test modes (seqs = Fisher's exact on per-sequence hit presence; sites = Fisher's exact on per-position rate), and hard-coded Benjamini-Hochberg q-value adjustment. Returns a plain data.frame with yamtk's TSV columns one-for-one (motif, motif.i, consensus, target.seq.n/seq.hits/site.hits, bkg.seq.n/seq.hits/site.hits, enrichment, log2.enrichment, pvalue, qvalue). See ?enrich_motifs2.

o Added dedup_hits(), a fast C++ implementation of yamtk's greedy overlap-dedup algorithm.
Accepts data.frame or GRanges input; auto-detects pvalue-vs-score priority direction. Used internally by scan_sequences2(no.overlaps = TRUE) and available for post-processing of any hit set. Distinct from scan_sequences()'s existing no.overlaps argument, which uses connected-components clustering and is left unchanged.

MINOR CHANGES

o merge_motifs() and merge_similar() now emit one-line hints pointing at merge_motifs2() / merge_similar2() when the user's argument set maps cleanly onto the leaner functions. Silence with options(universalmotif.suggest.merge_motifs2 = FALSE) and options(universalmotif.suggest.merge_similar2 = FALSE).

o enrich_motifs() now emits a one-line hint pointing at enrich_motifs2() when the user's argument set maps cleanly onto the leaner function. Silence with options(universalmotif.suggest.enrich_motifs2 = FALSE).

o compare_motifs() now emits a one-line hint pointing at compare_motifs2() when the user's argument set maps cleanly onto the leaner function. Silence with options(universalmotif.suggest.compare_motifs2 = FALSE).

o motif_pvalue(method = "dynamic") now batches across motifs in a single C++ call parallelised with RcppThread, instead of dispatching one C++ call per motif via mapply. Identical numerical output; 2.3x faster at nthreads = 1 and 7.4x faster at nthreads = 4 on a 50-motif HOCOMOCO benchmark. scan_sequences(calc.pvals = TRUE) and scan_sequences2() both benefit automatically.

o scan_sequences() now emits a one-line hint pointing at scan_sequences2() when the user's argument set maps cleanly onto the leaner function. Silence with options(universalmotif.suggest.scan_sequences2 = FALSE).

o motif_tree() now emits a one-line hint pointing at motif_tree2() when the user's argument set maps cleanly onto the leaner function. Silence with options(universalmotif.suggest.motif_tree2 = FALSE).

o Performance: several C++ hot loops now precompute integer alphabet-power tables instead of calling pow() at every sequence position.
This affects the k-let counters behind shuffle_sequences() and get_bkg() (Euler / Markov shuffling and higher-order background models) and the Markov sequence generator behind create_sequences(). The per-column motif-comparison metrics in compare_motifs() now use plain multiplication in place of pow(x, 2.0). All numerically identical; modest, consistent speed-ups (roughly 10-20% on the affected paths in informal benchmarks).

o motif_pvalue(method = "exhaustive") now stops with a clear error when a motif is too wide for exhaustive score enumeration (the nrow^width path would overflow a 32-bit integer), rather than relying solely on the R-side guard.

o Various textual improvements to the vignettes.

BUG FIXES

o Fixed view_motifs() rendering of negative-height letter stacks of three or more letters (a PWM position with three or more negative log-odds letters). The internal stacking helper offset each letter by only the immediately-preceding letter's height rather than the cumulative height of all letters above it, so letters past the first piled on top of one another below the baseline. Switched to a cumulative offset. One- and two-letter negative stacks are unaffected.

o Fixed create_motif() matrix dispatch silently dropping an explicitly supplied alphabet = "DNA"/"RNA"/"AA" when the input matrix also carried rownames. The else-if chain fell through to a clause that overwrote the user's alphabet with collapse_cpp(rownames(matrix)) ("ACGT" for DNA), which is not a key in the C++ alphabet lookup, leaving @consensus empty and colnames(@motif) NULL. Downstream this broke to_list() round-trips via update_motifs() (slot comparison error). Added the missing `&& missing(alphabet)` guard.

o Fixed nthreads = 0 silently running serially instead of using all available threads as documented (RcppThread::parallelFor treats 0 as "no workers" rather than "all hardware threads").
Affected scan_sequences(), shuffle_sequences(), compare_motifs(), motif_pvalue(), motif_tree(), create_sequences(), get_bkg(), enrich_motifs(), make_DBscores(), sequence_complexity(), and window_string().

o Fixed convert_motifs(motifs, class = "TFBSTools-PFMatrixList") (and the PWMatrixList / ICMatrixList variants) erroring with "unknown 'class'" instead of returning the corresponding TFBSTools list object. These output classes were already listed in the help page but had no implementation; the list method now converts each motif to the corresponding singular class and then dispatches to the right TFBSTools list constructor.

o Fixed compare_motifs() pairwise similarity calls passing the first motif's background frequencies for both members of the pair when nthreads > 1, instead of motif-1's and motif-2's backgrounds. Comparisons whose two motifs declared different bkg vectors (e.g. DNA at uniform 0.25 vs. a GC-skewed empirical background) were scored against the wrong null.

o Fixed compare_motifs() p-value mode calling R's pnorm / plogis / pweibull from inside an RcppThread::parallelFor body, where the R API is not thread-safe and could corrupt internal state under load (most visible on macOS). The row-search half stays parallel, and the actual p-value evaluation now runs serially.

o Fixed compare_motifs() row-not-found path in the db.scores lookup infinite-looping when n1 / n2 ran past the table without finding a matching row.

o Fixed shuffle_sequences() crashing R when k was larger than the shortest input sequence. The C++ backends now refuse the call with a clear "'k' must be <= the shortest sequence length" error.

o Fixed read_homer() storing NA in the nsites and bkgsites slots when those header fields were missing. ifelse(is.na(...), numeric(0), x) cannot return numeric(0); replaced with explicit if/else so missing fields parse to an empty vector as documented.
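The read_homer() entry above rests on a general property of ifelse(): its result always has the length of the test, so a zero-length "yes" value is padded back out to NA. A minimal base-R sketch of that behaviour follows; the variable names are illustrative and this is not the package's parsing code.

```r
# A header field that failed to parse.
x <- NA_real_

# ifelse() returns a result shaped like `test`, so the zero-length `yes`
# value is recycled to length 1 and comes back as NA.
via_ifelse <- ifelse(is.na(x), numeric(0), x)

# Plain if/else returns the selected branch value unchanged.
via_if <- if (is.na(x)) numeric(0) else x

length(via_ifelse)  # 1
length(via_if)      # 0
```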
o Fixed read_jaspar(), read_cisbp(), read_meme(), and read_transfac() silently returning an empty list when the input file had no motif headers / separators. All four now error with "no motifs found in ''".

o Fixed read_jaspar() producing an nsites slot of -Inf when a header was followed by an empty matrix. The reader now errors with a clear "empty motif matrix" message before the downstream max(colSums(motif)) on numeric(0).

o Fixed convert_motifs() for motifStack 'pcm' inputs silently picking unique(colSums(...))[1] as nsites when the column sums disagreed. Now checks the column sums within tolerance and errors with the observed range if they don't agree.

o The S4 validity method for PCM motifs now also checks that the motif matrix's column sums are all equal (within the same 1% tolerance the PPM validity check uses).

o Test coverage extended from 708 to 878 cases: I/O round-trips and parser error paths per format, RNA / AA coverage across scan / compare / enrich (v1 and v2), v1/v2 smoke parity, and nthreads=1 vs nthreads=2 parity on every parallel function.

Changes in version 1.30.1

MINOR CHANGES

o Improved memory usage of scan_sequences() for larger motif collections & genomes.

BUG FIXES

o Fixed filter_motifs() erroring when filtering by nsites (malformed apply() call instead of vapply()).

o Fixed motif["name"] <- NA_character_ (and other atomic slots) being silently accepted instead of raising an error.

o Fixed to_list() writing literal "0" into the name slot when altname was NA and a dplyr::rename() swap was performed on the data frame.

o Fixed new("universalmotif", altname = ...) silently dropping the altname argument due to a missing assignment operator.

o Fixed motif_score() erroring with allow.nonfinite = TRUE and the default threshold.type = "total".

o Fixed motif_pvalue() rejecting non-finite scores even when allow.nonfinite = TRUE was specified (sanitize_input() was not checking the flag).
o Fixed shuffle_motifs() silently resetting the pseudocount to 1 and the type to PPM regardless of the input motif's values.

o Fixed shuffle_sequences() and get_bkg() not guarding against window.size < k, or fractional window.size values that resolve to 0 after expansion.

o Fixed enrich_motifs() crashing inside fisher.test() when input sequences are shorter than the widest motif. Now gives a clear error message.

o Fixed read_jaspar() leaving the nsites slot empty for PCM motifs; it is now derived from the column sums of the count matrix.

o Fixed an unsigned-integer underflow in src/scan_sequences.cpp that could cause out-of-bounds reads when scanning sequences shorter than motif width + k.

o Fixed a signed-integer overflow in scan_sequences_cpp() when the threshold supplied via threshold.type = "logodds.abs" exceeded ~2e6; on x86_64 the overflow wrapped to INT_MIN, causing all positions to be returned regardless of the threshold.

o Fixed the shuffle.method error message in enrich_motifs() which listed "random" instead of "euler" as a valid option.

o Fixed a wrong variable name (seq.names instead of bkg.names) in the fallback for unnamed background sequences in enrich_motifs().

o Fixed a wrong assignment operator (= instead of ==) in the motif_pvalue.method compatibility check in scan_sequences() which was silently clobbering the calc.pvals variable.

o Fixed scan_sequences() BH q-value calculation; previously used an inverted ratio with an erroneous 100x factor instead of the standard BH formula.

o Fixed a wrong inner-loop index in motif_pvalue(method="exhaustive") that produced incorrect p-values for motifs wider than 2*k.

o Fixed get_matches() failing when passed a length-1 list of motifs due to an undefined variable and the list-unwrapping occurring after the @ access.

o Fixed get_bkg() validation incorrectly rejecting multi-element window.overlap when merge.res = FALSE, due to missing parentheses around an || sub-expression.
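For the scan_sequences() q-value fix listed a few entries above, the standard Benjamini-Hochberg adjustment can be sketched in base R as below. This is a generic illustration checked against stats::p.adjust(), not the package's C++ code: `bh_adjust()` is a hypothetical helper, and taking the number of tests to be length(p) is an assumption, since the entry does not say which test count scan_sequences() uses.

```r
# Hypothetical helper: q for the p-value of rank r is the minimum, over all
# ranks j >= r, of p_(j) * n / j, capped at 1.
bh_adjust <- function(p) {
  n <- length(p)
  o <- order(p, decreasing = TRUE)   # largest p-value first
  ranks <- rev(seq_len(n))           # ascending rank of each p-value in `o`
  q <- pmin(1, cummin(p[o] * n / ranks))
  q[order(o)]                        # restore the input order
}

p <- c(0.04, 0.001, 0.5, 0.02, 0.03)
q <- bh_adjust(p)

# Agrees with the reference implementation in the stats package.
all.equal(q, p.adjust(p, method = "BH"))  # TRUE
```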
o Fixed merge_motifs() silently dropping the bkgsites slot when only one of the merged motifs had it populated (threshold was > 1 instead of >= 1).

o Fixed get_matches() error message incorrectly referencing the max possible score when reporting a min-score violation.

o Removed a dead NULL check in enrich_motifs() that would have errored (nrow(NULL) < 1 yields logical(0)) had it ever been reached.

o Removed a leftover debug print() call in enrich_motifs() input validation that caused error messages to be printed twice.

o Fixed scan_sequences(return.granges = TRUE) failing when sequences is empty, due to 1:length(sequences) producing c(1, 0) for length-0 input.

o Fixed shuffle_sequences() producing identical shuffles across parallel threads when rng.seed = 0.

o Fixed score_match(allow.nonfinite = TRUE) returning NA instead of -Inf for match positions that correspond to non-finite motif cells.

o Fixed read_motifs() failing with 'object motif not found' when reading a saved motif file containing a multifreq block, because the motif object was used before it was constructed.

o Fixed convert_motifs() Motif-method (motifRG) passing the PCM matrix as a positional argument due to a stray <- instead of = in the argument list.

o Fixed possible undefined behaviour in scan_sequences() when scanning sequences shorter than k with a multifreq motif (size_t underflow in deal_with_higher_k / deal_with_higher_k_NA).

o Fixed merge_motifs() populating pval, qval, eval, nsites, and pseudocount slots with NA when all input motifs had those slots empty; they are now correctly left empty.

o Fixed trim_motifs() discarding all successfully-trimmed motifs when any single input motif was fully trimmed away (partial-trim case now correctly returns only the surviving motifs).

o Fixed read_motifs() using lexicographic string comparison for version dispatch; 1.1.7 was incorrectly treated as newer than 1.1.67. Now uses numeric_version().
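Two of the fixes in this list come down to base-R behaviours that are easy to check directly: character comparison and numeric_version() order the pair "1.1.7" / "1.1.67" differently, and 1:length(x) counts down for empty input. The lines below are a minimal demonstration, not package code; the string comparison result assumes a usual collation in which "6" sorts before "7".

```r
# Character comparison goes left to right, so "7" versus "6" decides the
# result and "1.1.7" sorts after "1.1.67".
"1.1.7" < "1.1.67"                                    # FALSE

# numeric_version() compares component by component as integers (7 < 67).
numeric_version("1.1.7") < numeric_version("1.1.67")  # TRUE

# 1:length(x) counts down for empty input; seq_along(x) yields nothing.
x <- character(0)
1:length(x)    # 1 0
seq_along(x)   # integer(0)
```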
o Fixed read_cisbp() off-by-one error when CIS-BP files contain consecutive TF metadata lines, caused by mutating meta_starts during seq_along iteration.

o Fixed a temp-file leak in the compare_motifs() HTML report path where the second on.exit() call overwrote the first (now uses add = TRUE).

o Fixed add_multifreq() retry loop erroring with 'missing value where TRUE/FALSE needed' or keeping the failing k value, due to a stale seq_along and an off-by-one in seq_len(counter) vs seq_len(counter - 1).

o Fixed runtime error ('argument FUN.VALUE is missing') in read_motifs() when reading pre-1.2.x files containing multifreq blocks.

o Fixed filter_motifs() erroring when filtering by pval, qval, eval, altname, family, or organism and any motif had an empty slot for that field (sapply returned a list of mixed lengths; now uses vapply).

o Fixed shuffle_motifs() dropping the matrix dimension when shuffling a length-1 motif (missing drop = FALSE in column slice), causing downstream errors.

o Fixed read_motifs() silently accepting non-logical values for the progress and BP arguments (logi_check was computed but not appended to all_checks).

o File connections in all read_*() and write_*() functions are now closed via on.exit() so they are released cleanly even when an error occurs mid-read.

Changes in version 1.30.0

BUG FIXES

o Fixed create_sequences() not properly validating alphabet and freq values due to a floating point error with logb(). Thanks to @CountercurrentLinda for the bug report (#32).

Changes in version 1.28.0

BUG FIXES

o Quiet a couple of warnings triggered during testing the universalmotif initializer function. These warnings should never trigger in real-world scenarios.

Changes in version 1.26.3

BUG FIXES

o Update a few lines in view_motifs() to stay up-to-date with ggplot2. Thanks to Stevie Pederson for the fix (#31).

Changes in version 1.26.2

BUG FIXES

o Fixed a bug introduced by the last bug fix.
Changes in version 1.26.1

BUG FIXES

o Plotting logos with negative letter heights is now fixed.

Changes in version 1.24.2

BUG FIXES

o Incorrect version number in last NEWS update.

Changes in version 1.24.1

BUG FIXES

o Fixed another missed Rcpp bounds check warning.

Changes in version 1.22.3

BUG FIXES

o Fixed make_DBscores() crash.

Changes in version 1.22.2

o Added CITATION file.

Changes in version 1.22.1

BUG FIXES

o Always check object sizes before indexing in universalmotif_cpp(). Fixes build timeouts on Bioconductor.

Changes in version 1.22.0

BUG FIXES

o Address C++ compiler warnings.

o Suppress warnings when using as.data.frame() on DataFrame objects in the SequenceSearches.Rmd vignette.

Changes in version 1.18.1

BUG FIXES

o Fixed compilation flags causing errors on linux.

Changes in version 1.18.0

NEW FEATURES

o read_meme(readsites.meta.tidy): New option to tidy the output of the readsites.meta option into a single data.frame.

MINOR CHANGES

o create_motif(): More robust argument checking.

o The rowMeans, colMeans, rowSums, and colSums generics are now imported from the MatrixGenerics package instead of BiocGenerics.

BUG FIXES

o Clean up output of argument checks internal to exported functions.

o Delete a reference in IntroductionToSequenceMotifs vignette to a non-exported function.

o Delete outdated text in MotifManipulation vignette regarding the convert_motifs function.

Changes in version 1.16.0

NEW FEATURES

o write_transfac(name.tag, altname.tag): New arguments to manually set the name and altname tags in the final TRANSFAC motifs.

MINOR CHANGES

o write_matrix(positions): Partial argument matching allowed.

o motif_tree(): Silence messages from ggtree.

BUG FIXES

o create_motif(): Don't ignore the nsites argument when generating random motifs.

o read_meme(): Correctly parse background if lines are prepended with a space.

o convert_type(pseudocount): Change the pseudocount within the motif when performing a type conversion and the option is set.
Changes in version 1.14.1

BUG FIXES

o read_meme(): Handle motif files with custom alphabets and no 'END' line in alphabet definition. Thanks to @manoloff (#24) for the bug report, and Spencer Nystrom for the fix (#25).

Changes in version 1.14.0

NEW FEATURES

o enrich_motifs(mode, pseudocount): Choose whether to count motif hits once per sequence, and whether to add a pseudocount for P-value calculation.

o New function, meme_alph(): Create MEME custom alphabet definition files.

o merge_similar(return.clusters): Return the clusters without merging.

o convert_motifs(): MotifDb-MotifList now available as an output format.

MINOR CHANGES

o enrich_motifs(): RC argument now defaults to TRUE, increased max significance values, no.overlaps now defaults to TRUE. Additional columns showing the motif consensus sequence and percent of sequences with hits are now included.

o scan_sequences(RC): Only print a warning if RC=TRUE for non-DNA/RNA motifs.

o Reduced the size of the message when a pseudocount is added to motifs.

Changes in version 1.12.4

BUG FIXES

o convert_motifs(): Properly handle TFBSTools class motifs with '*' as their strand. This was achieved by making the universalmotif object creator tolerant to using '*' as a user input. Thanks to David Oliver for the bug report (#22).

Changes in version 1.12.3

BUG FIXES

o scan_sequences(): Previously this function did not account for the fact that duplicate sequence names are allowed within XStringSet objects. To better keep track of which sequence each hit is associated with, an additional sequence.i column has been added that records the sequence number. This change also fixes a knock-on issue with enrich_motifs(), where sequences with duplicate names did not contribute to the count of sequences containing hits. Thanks to Alexandre Blais for mentioning this issue.

Changes in version 1.12.2

BUG FIXES

o shuffle_sequences(..., method="markov"): Previously the returned sequences were longer by 1.
Changes in version 1.12.1

BUG FIXES

o DNA ambiguity letters can be used with create_motif() when alphabet="DNA" is specified. Previously ambiguity letters only worked when alphabet was not specified.

Changes in version 1.12.0

NEW FEATURES

o New function, sequence_complexity(): Using either the Wootton-Federhen, Trifonov, or DUST algorithms, calculate sequence complexity in sliding windows. A version for small arbitrary strings is also provided: calc_complexity().

o New function, mask_ranges(): Similarly to mask_seqs(), mask specific positions in an XStringSet object by replacing the letters with a specific filler character.

o New function, motif_range(): Get the min/max range of possible logodds scores for a motif.

o New function, calc_windows(): Utility function for calculating coordinates for sliding windows.

o New function, window_string(): Utility function for retrieving sliding windows in a string.

o New function, slide_fun(): Utility function which wraps window_string() and vapply() together.

o motif_pvalue(method): P-values and scores can now be calculated dynamically instead of exhaustively, substantially increasing both speed and accuracy for bigger jobs. The previous exhaustive method can still be used however, as the dynamic method does not allow non-finite values and thus must be pseudocount-adjusted.

o scan_sequences(calc.pvals, calc.qvals, motif_pvalue.method, calc.qvals.method): The calc.pvals argument defaults to TRUE. The P-value calculation method now defaults to dynamic P-values (the previous method was an exhaustive calculation), though this can be changed via motif_pvalue.method. Additionally, adjusted P-values can be calculated as either BH, FDR or a Bonferroni-adjusted P-value. More details can be found in the Sequence Searches vignette.

o write_homer(threshold, threshold.type): Finer control over the final motif logodds threshold included with the written motif is now available, using the style of argument parsing from scan_sequences().
    The previous logodds_threshold argument is now deprecated and set to NULL, but if set (e.g. an older script is being re-run) then the old behaviour of write_homer() will be used.

MINOR CHANGES
  o New global option, options(pseudocount.warning): Disable the message printed when a motif is pseudocount-adjusted.
  o Slight performance gains in get_bkg() window code.
  o motif_pvalue(): Clarify that, indeed, background probabilities are taken into account when calculating P-values from score inputs. The background adjustment takes place during the initial conversion to PWM.
  o motif_pvalue(): When bkg.probs are provided, use those when converting to a PWM.
  o scan_sequences(): The default threshold is now 0.0001 (using threshold.type = "pvalue").
  o The axis text in view_motifs() is now black instead of grey.
  o create_motif(): When a named background vector is provided, it is sorted according to the alphabet characters.
  o scan_sequences(): Check that the sequences aren't shorter than the motifs.
  o print.universalmotif_df: Changed warning message when subsetting to an incomplete universalmotif_df object. Also added a way to turn off informative messages/warnings via the boolean universalmotif_df.warning global option.
  o Miscellaneous changes and additions to the vignettes and various function manual pages.

Changes in version 1.10.2

BUG FIXES
  o read_homer() now correctly parses enrichment P-value and logodds score.

Changes in version 1.10.1

BUG FIXES
  o Restore temporarily disabled ggtree(layout="daylight") example in the MotifComparisonAndPvalues.Rmd vignette, as tidytree is now patched.
  o Fixed some awkwardness in view_motifs() panel spacing and title justification.

Changes in version 1.10.0

NEW FEATURES
  o A new data structure, universalmotif_df, has been made available. This allows for motifs to be manipulated as one would a data.frame object. The to_df() function is used to generate this structure from lists of motifs.
    The update_motifs() function is used to apply changes to the actual motifs, and to_list() returns the actual motifs. Note that this is only meant as an option for more conveniently manipulating motif slots of multiple motifs simultaneously before returning them to a list; the universalmotif_df structure cannot be used in the various universalmotif functions. Additionally, requires_update() can be used to ascertain whether motifs are out of date in a universalmotif_df object. Many thanks to @snystrom for discussions and significant contributions.
  o view_motifs(): the universalmotif package now relies entirely on its own code to generate the polygon data used by ggplot2 to plot motifs, meaning the ggseqlogo import has been dropped. A number of new options are now available, including plotting multifreq logos and finer control over letter spacing. An effort has been made to ensure that the default behaviour of the function be unchanged from previous versions. This change should also allow for easier fixing of bugs and flexibility for future additions or changes.
  o New function, merge_similar(): identify and merge similar motifs in a list of motifs. Essentially, a wrapper around compare_motifs(), hclust(), cutree(), and merge_motifs().
  o New function, view_logo(): plot logos with matrix input instead of motif object input. Arbitrary column heights and multi-character letters are allowed.
  o New function, average_ic(): calculate the average information content for a list of motifs.
  o trim_motifs(..., trim.from): trim from both directions or just one.
  o shuffle_sequences(..., window, window.size, window.overlap): shuffle sequences iteratively in windows of specified size.
  o scan_sequences(..., return.granges): optionally return a GRanges object.
  o scan_sequences(..., no.overlaps, no.overlaps.by.strand, no.overlaps.strat): remove overlapping hits after scanning, preventing overlapping hits by the same motifs from being returned. This can optionally be done per strand.
    Either the first hit or the highest scoring hit can be preserved per set of overlapping hits. These new arguments can also be used in enrich_motifs().
  o scan_sequences(..., respect.strand): whether to scan the sequence strands according to the motif strand slot. Only applicable for DNA/RNA motifs. This option is also available in enrich_motifs().

MINOR CHANGES
  o Some additions and clean-up to documentation and vignettes.
  o Support for MotIV-pwm2 formatted motifs has been dropped, as the package is no longer a part of the current Bioconductor version.
  o read_matrix()/write_matrix(): the sep argument can now be NULL (no separators).
  o The Rdpack dependency has been dropped.
  o merge_motifs(): single-motif input now simply returns the motif instead of throwing an error.
  o view_motifs(..., dedup.names): now TRUE by default. Furthermore, the make.unique() function is now used to deduplicate names.
  o compare_motifs(..., method): the default comparison method has been changed back to PCC.

BUG FIXES
  o Using create_motif() with a single character no longer throws an error.
  o Generating random motifs with filled multifreq slots now works.

Changes in version 1.8.5

BUG FIXES
  o Found a typo in the consensus to PPM calculation for the DNA ambiguity letter Y which resulted in incorrect PPM values.

Changes in version 1.8.4

BUG FIXES
  o Fixed incorrect handling of the -alph parameter in run_meme() (reported by @Irenexzwen).

Changes in version 1.8.3

BUG FIXES
  o Fixed improper handling of MEME version string in read_meme().

Changes in version 1.8.2

BUG FIXES
  o Increase compliance with MEME motif format: motif files with missing alphabets/strand/bkg are now allowed and will be assumed to be DNA/+/uniform frequencies. A check for the MEME version is performed, though only a warning is given if not found.
  o Fixed typo in run_meme() missing dependency error message.

Changes in version 1.8.1

BUG FIXES
  o Fixed error when scan_sequences() is used with both RC = TRUE and calc.pvals = TRUE.
  o Fixed motif.i column in scan_sequences() results not reporting correctly when RC = TRUE.
  o Fixed memory access bug in motif_pvalue() when k was set to a value resulting in three or more submotifs.

Changes in version 1.8.0

NEW FEATURES
  o scan_sequences()/enrich_motifs() can now be used to scan/enrich for gapped motifs. A new section has been added to the SequenceSearches.Rmd vignette.
  o scan_sequences(..., use.gaps), enrich_motifs(..., use.gaps): ignore motif gap information.
  o read_meme(), write_meme(): now fully support custom alphabets.
  o prob_match(), prob_match_bkg(): calculate the probability of a motif match based on background frequencies of the motif object or provided values, respectively.
  o enrich_motifs(), get_matches(), get_scores(), motif_pvalue(), motif_score(), scan_sequences(), score_match(): new allow.nonfinite parameter, allowing for the functions to work even if non-finite values are present in the motif PWM.
  o read_matrix(..., comment): allows for comments to be ignored in motif files.
  o write_matrix(..., digits): control the number of digits to use for writing motif positions.
  o New mask_seqs() utility function: inject hard masks into sequences.
  o scan_sequences(..., warn.NA), enrich_motifs(..., warn.NA): new option which can disable warnings from non-standard letters being detected in the input sequences.
  o get_bkg(..., window, window.size, window.overlap): new options for calculating sequence background in windows.
  o get_bkg(..., merge.res): new option to return background information for individual sequences.
  o scan_sequences(..., calc.pvals): new option to calculate P-values for sequence hits. This simply automates the manual step of passing the results from scan_sequences() to motif_pvalue().
  o view_motifs(..., show.positions, show.positions.once, show.names): new options for customizing the look of plotted motifs.

MINOR CHANGES
  o read_matrix(..., positions): added partial argument matching.
  o create_sequences(), shuffle_sequences(), motif_pvalue(): the C++ random engine has been changed from std::default_random_engine to std::mt19937. This should allow for the same rng.seed value to result in the same output regardless of OS.
  o score_match() has been vectorized (alongside new prob_match() function).
  o The ape and ggtree packages are now no longer imported and must be installed separately in order to use motif_tree().
  o The processx package is no longer imported and must be installed separately in order to use run_meme().
  o The pseudocount slot is now shown when universalmotif class objects are printed.
  o get_bkg(): the list.out and as.prob options have been disabled. To simplify the function, the only possible output (exception: if to.meme is not NULL) is a DataFrame showing both counts and probabilities.
  o Changed the default look of motifs plotted by view_motifs().
  o General documentation cleanup.

BUG FIXES
  o Changing motif backgrounds with `[<-` will now make sure to set correct vector names.
  o get_bkg() will now correctly ignore non-standard letters and letters missing from the provided alphabet during counting.

Changes in version 1.6.4

BUG FIXES
  o cbind(): do not ignore the pseudocount slot.
  o Fixed typo in IntroductionToSequenceMotifs.Rmd.
  o Fixed U() function in IntroductionToSequenceMotifs.Rmd, no longer returns NA values if 0s are present.
  o read_cisbp(): no parsing errors for motifs with missing/partial header info.

Changes in version 1.6.3

BUG FIXES
  o scan_sequences(): commented out WIP code for scanning gapped motifs.

Changes in version 1.6.2

BUG FIXES
  o motif_tree(): 'daylight' layout is no longer disabled.

Changes in version 1.6.1

BUG FIXES
  o summarise_motifs(): properly retrieves altname slot. Contribution from Spencer Nystrom (https://github.com/bjmt/universalmotif/pull/9).
  o read_meme(): for LIKE type alphabets, make sure PROTEIN-LIKE is understood as being AA.
Changes in version 1.6.0

NEW FEATURES
  o log_string_pval(): small utility function to obtain the log of string-formatted p-values (such as those often carried in MEME-formatted motifs which are smaller than R's double.xmin limit). Likely only a temporary solution.
  o view_motifs(..., return.raw) option: instead of returning a plot type object, return the aligned motif matrices.
  o view_motifs(..., dedup.names) option: allows plotting of motifs with duplicated names by appending a unique string.
  o merge_motifs(..., new.name) option: assign a name to the new merged motif instead of collapsing the names of the merged motifs together.
  o round_motif() utility: round down very low letter-position scores to zero.

MINOR CHANGES
  o Removed most previously deprecated function arguments.
  o Make sure view_motifs(..., use.type = "ICM") properly sets ylim.
  o create_motif(): single motif positions can now be created.
  o create_motif(), character input: nsites slot is left empty if input is a single string. It is still filled if the input consists of multiple strings.
  o merge_motifs(): ALLR/ALLR_LL/KL/IS methods no longer add pseudocounts to the motifs. Instead, pseudocounts are added to temporary internal copies which are used for comparison and alignment. The original un-modified matrices are then combined.
  o read_meme(): now supports DNA-LIKE, RNA-LIKE, and AA-LIKE alphabets, though these will be treated as regular DNA, RNA, and AA alphabets, respectively. Contribution from Spencer Nystrom (https://github.com/bjmt/universalmotif/pull/7).
  o Some cleanup to documentation and vignettes.
  o General code cleanup.

BUG FIXES
  o write_meme() now includes altname slot if filled. Contribution from Spencer Nystrom (https://github.com/bjmt/universalmotif/pull/5).
  o write_meme() checks for and removes any spaces/equal signs in motif names/altnames.
  o view_motifs(..., use.type = "ICM"): check for zero IC motifs, as these cannot be plotted by ggseqlogo.
Changes in version 1.4.10

BUG FIXES
  o Updated the warning from v1.4.9 to make users aware that the daylight layout will be permanently disabled for Bioconductor 3.10.

Changes in version 1.4.9

BUG FIXES
  o Temporarily disabling the 'daylight' layout for motif_tree() to get around a new bug.

Changes in version 1.4.8

BUG FIXES
  o Fixed buffer overflow error in motif_peaks().

Changes in version 1.4.7

BUG FIXES
  o Fixed dangling references in compare_motifs_helper.cpp and utils-exported.cpp.
  o Trying to subset a motif to a single column no longer throws an error.

Changes in version 1.4.6

BUG FIXES
  o scan_sequences(): will now actually emit a warning when non-standard letters are detected.

Changes in version 1.4.5

BUG FIXES
  o run_meme(): don't try and delete R's tempdir.

Changes in version 1.4.4

BUG FIXES
  o Minor fix to how args are fed to processx::run() in run_meme().

Changes in version 1.4.3

BUG FIXES
  o Suppress message output from library(TFBSTools) call in MotifManipulation.pdf vignette preamble code.

Changes in version 1.4.2

BUG FIXES
  o Stopped using BiocStyle, as a current bug was preventing the package from building successfully.

Changes in version 1.4.1

BUG FIXES
  o When using create_motif() with an AAStringSet object, amino acid letters will now be properly sorted and match the motif rows.

Changes in version 1.4.0

NEW FEATURES
  o scan_sequences(..., threshold.type) option: 'logodds.abs'. Allows the exact threshold scores to be provided.
  o compare_motifs() option: 'min.position.ic'. Prevent low-IC positions in an alignment from contributing to the final alignment score.
  o compare_motifs() option: 'score.strat'. Instruct the function how to deal with individual column scores in an alignment. This also replaces the old way of choosing between sum and mean by prepending an 'M' to the metric name. Strategies for combining column scores include: sum, arithmetic mean, geometric mean, median, Fisher Z-transform, and weighted means.
  o Motif comparison metrics: average log-likelihood ratio, squared Euclidean distance, Hellinger distance, Bhattacharyya coefficient, Manhattan distance, lower limit average log-likelihood ratio, weighted Euclidean distance, weighted Pearson correlation coefficient.
  o compare_columns() utility: Compare two 1d numeric vectors using the comparison metrics from compare_motifs().
  o compare_motifs() option: 'output.report'. Generate an output report when 'compare.to' is provided, showing motif alignments of top matches.
  o get_scores() utility: Extract all possible scores from a motif.
  o filter_motifs(): Filter using the 'extrainfo' slot.
  o MotifComparisonAndPvalues.pdf vignette: the comparisons and P-values sections have been moved from AdvancedUsage.pdf to their own vignette. Higher order motifs, enrichment and run_meme() usage sections have been moved to SequenceSearches.pdf.

MINOR CHANGES
  o Removed 'random' shuffling method.
  o Using RcppThread instead of BiocParallel in several functions: compare_motifs(), create_sequences(), get_bkg(), motif_pvalue(), scan_sequences(), shuffle_sequences(). This means parallelization can occur within C++ code which is much faster than having to jump between R and C++. Currently motif_peaks(), read_motifs() and write_motifs() are the only remaining functions which offer optional BiocParallel usage.
  o Many performance improvements to functions relying on internal C++ code. Several internal R functions have been replaced with C++ versions.
  o Changed behaviour of make_DBscores() and motif comparison P-values. Re-calculated internal P-value databases.
  o For merge_motifs(..., use.type): now only accepts 'PPM'.
  o When comparing all motifs to all motifs with any method in compare_motifs(), the diagonal entries now properly show the max/min possible similarity/distance scores.
  o New internal merge_motifs() implementation. This also fixes a previous bug with incorrect PPM averaging.
  o read_homer(): the logodds score is converted to a P-value.
  o motif_pvalue(): New score calculator. Exact scores are still calculated the same (but with a faster C++ function), but approximate scores are now calculated by randomly generating score distributions from size 'k' motif score blocks.
  o motif_pvalue(): Added a safety check when trying to use this function with large motifs. Will throw a warning when nrow(matrix)^k > 1e8 and reduce k accordingly before continuing.
  o Adjusted P-value calculation in motif_peaks() to not display Pval = 0 so easily by instead estimating a normal distribution from random peaks.
  o convert_type(): make sure not to leave any zeros in bkg vector when a pseudocount greater than zero is used.
  o enrich_motifs(): split up 'hits' and 'positional' results into their own data.frames.
  o Replaced several instances of cat() with message() for printing progress updates.
  o Positional tests have been removed from enrich_motifs(). See motif_peaks() for testing motif-sequence positional preferences.
  o In read_meme(): E-values are now additionally stored in the extrainfo slot. This is to preserve E-values smaller than the R double precision limit.
  o In read_transfac(): Matrix values are rounded, to prevent errors when reading in matrices with non-integers.
  o Update JASPAR2018_CORE_DBSCORES with new compare_motifs() methods and params.
  o universalmotif print() method now returns the object invisibly, instead of NULL.

BUG FIXES
  o read_meme() will now properly parse background letter frequencies which span more than one line.
  o convert_motifs() will not error-out when trying to convert a PFMatrix with a family character vector longer than one.
  o Fixed P-value calculation when importing HOMER motifs. Previously it would simply assume the log threshold value was the P-value. Now motif_pvalue() is used to properly calculate a P-value.

Changes in version 1.2.1

BUG FIXES
  o Misspelled variable in enrich_motifs().
Changes in version 1.2.0

NEW FEATURES
  o New motif_peaks() function: test for significantly overrepresented peaks of motif sites in a set of sequences.
  o New get_bkg() function: calculate background frequencies of sequence alphabets, including higher order backgrounds. Works for any sequence alphabet. Can also create MEME background format files.
  o read_meme() has a new option, readsites.meta, which allows for reading individual motif site positions and P-values, as well as combined sequence P-values.
  o shuffle_sequences(..., method = "markov") works for any set of characters instead of just DNA/RNA.
  o shuffle_sequences(): new shuffling method, 'euler'. This allows for k>1 shuffling that preserves exact letter counts, as opposed to 'markov'. This method is set as the new default shuffling method.
  o create_sequences() has a new option "freqs" which allows for generating sequences from higher order backgrounds and any sequence alphabet (options "monofreqs", "difreqs" and "trifreqs" are now deprecated).
  o universalmotif objects can now hold onto higher order backgrounds.
  o motif_pvalue() with use.freq > 1 can calculate P-values from provided higher-order backgrounds, instead of assuming a uniform background.
  o add_multifreq() adds corresponding higher order background probabilities to motifs.
  o New get_klets() utility function: generate all possible k-lets for any set of characters.
  o New score_match() utility function: score a match for a particular motif.
  o New get_matches() utility function: get all possible motif matches above a certain score.
  o New count_klets() utility function: count all k-lets for any string of characters.
  o New motif_score() utility function: calculate motif score from input thresholds.
  o New shuffle_string() utility function: shuffle a string of characters using one of three methods: euler, linear, and markov.
  o The native write_motifs()/read_motifs() universalmotif format is now YAML based.
    Motifs written before v1.2.0 can still be read by read_motifs().

MINOR CHANGES
  o Increased input security for character type parameters throughout.
  o Expanded motif_pvalue(), scan_sequences(), motif_tree() examples sections.
  o New vignette sections for motif_peaks() and get_bkg() added to SequenceSearches.Rmd.
  o Various vignette tweaks.
  o Fixed various spelling mistakes throughout, added Language field to Description, and added spell check to tests.
  o Documentation for the "random" shuffling method has been removed and a warning is shown when used to tell the user that it will be removed in the next minor update.
  o Generally increased test coverage.
  o The "k=1", "linear" and "markov" shuffling methods are much faster.
  o create_sequences() for higher order backgrounds is much faster.
  o Faster add_multifreq(): slight improvement for DNA motifs, big improvement for non-DNA motifs.
  o sample_sites() has been rewritten for use.freq > 1: the probability of each letter in the site is now dependent on the previous letters (also faster and more memory efficient for any use.freq).
  o Improvement to calculating motif scores from p-value input: no longer guesses different scores, instead estimating a normal distribution of scores. This new approach is much, much faster and more memory efficient. It does however assume a uniform background.
  o The "score.pct" column in scan_sequences() results now represents the percent score based on the total possible score, not just the score between zero and the max possible score.
  o summarise_motifs() is much faster.
  o Objects in data/ are saved using serialization format version 3.
  o convert_motifs(motif, class = "universalmotif-universalmotif"): performs a validObject() check if "motif" is a universalmotif object.
  o The show() method for universalmotif objects performs a validObject() check first.
  o motif_rc() has a new option "ignore.alphabet", used to turn on or off the alphabet check (checks for DNA/RNA motif).
  o Added "overwrite" and "append" options to write_*() functions.
  o enrich_motifs(..., return.scan.results = FALSE): uses a slimmed down version of scan_sequences() which skips construction of the complete results data.frame, saving a tiny bit of time on large jobs.
  o compare_motifs() now includes log P-values. This way comparisons can still be properly ranked even if their P-values are below the machine limit.
  o convert_motifs() from MotifList (MotifDb) carries over dataSource.
  o If a MEME motif has two names, the second will be assigned as "altname" by read_meme().
  o Utilities documentation has been split into two: ?utils-motif and ?utils-sequence.

BUG FIXES
  o Fixed IC score calculation from character input in create_motif().
  o The internal DNA consensus letter calculation previously did not assign ambiguous letters when one PPM position was >0.5 and another was >0.25. This was unintended behaviour and will now output the proper ambiguous DNA letter.

Changes in version 1.0.22

BUG FIXES
  o Fixed incorrect RcppExports code introduced in last patch.

Changes in version 1.0.21

BUG FIXES
  o Fixed an incorrect citation in motif_pvalue().

Changes in version 1.0.20

BUG FIXES
  o Fixed a bug introduced in previous patches where create_sequences() fails with alphabet = "RNA" and missing difreq/trifreq.

Changes in version 1.0.19

BUG FIXES
  o Fixed create_sequences(alphabet = "RNA") when providing difreq/trifreq.

Changes in version 1.0.18

BUG FIXES
  o Fixed alphabet letters being stripped from difreq and trifreq params in create_sequences().
  o Fixed an incorrect call to sample() when using create_sequences() with difreq.

Changes in version 1.0.17

BUG FIXES
  o Custom motif alphabets are properly sorted in the alphabet slot of motifs.
  o scan_sequences() properly matches custom sequence and motif alphabets.

Changes in version 1.0.16

BUG FIXES
  o scan_sequences() will now properly create a scoring matrix from motifs with pseudocounts of 0.
Changes in version 1.0.15

BUG FIXES
  o view_motifs() will now give an informative error message when trying to plot multiple motifs with non-unique names.

Changes in version 1.0.14

BUG FIXES
  o Fixed 'method' parameter documentation for motif_tree().

Changes in version 1.0.13

BUG FIXES
  o Fixed the error message given when a vector of incorrect length is used in a function.

Changes in version 1.0.12

BUG FIXES
  o motif_pvalue() no longer throws an error for motif_pvalue(..., pvalue = 0).
  o motif_tree() now works properly with dist objects as input.

Changes in version 1.0.11

BUG FIXES
  o The compare_motifs() example for min.mean.ic in the Advanced Usage vignette now makes more sense.

Changes in version 1.0.10

BUG FIXES
  o More strangely behaving MotifDb vignette code addressed in Advanced Usage vignette.

Changes in version 1.0.9

BUG FIXES
  o shuffle_motifs() now produces motifs of proper length.

Changes in version 1.0.8

BUG FIXES
  o Trying to prevent R CMD BUILD from changing the behaviour of vignette code involving MotifDb package.

Changes in version 1.0.7

BUG FIXES
  o merge_motifs() will not show repeat families/organisms in new merged motif.
  o show method will no longer show name slot instead of altname.

Changes in version 1.0.6

BUG FIXES
  o If MEME motif file has no strand info, assume `strand = "+"`, not `strand = c("+", "-")`.

Changes in version 1.0.5

BUG FIXES
  o `read_meme()` can now read non-DNA/RNA motifs.
  o Removed duplicate line in run_meme.R.
  o `scan_sequences()` will not scan mismatching motif/sequence alphabets.
  o Verbose output from `scan_sequences()` will now display correctly.
  o Using `enrich_motifs()` and not finding any motif hits in the input sequences no longer throws an error.
  o Fixed threshold calculation in `enrich_motifs()`.
  o `enrich_motifs()` will now show results for motifs which have hits in target sequences but none in bkg sequences.

Changes in version 1.0.4

BUG FIXES
  o Can now use `view_motifs(..., tryRC=F)` without throwing an error.
Changes in version 1.0.3

BUG FIXES
  o Missed a couple [Biostrings::*StringSet-class] from last patch.
  o Updated README to reflect new installation method.
  o Wrapped instances of \link{} with \code{\link{}}.

Changes in version 1.0.2

BUG FIXES
  o `read_meme()` can now read meme result files with missing strand info.
  o Use \link{*StringSet} instead of [Biostrings::*StringSet-class] in documentation.
  o No longer load MotifDb package in examples on Windows.

Changes in version 1.0.1

BUG FIXES
  o TFBSTools motifs with multiple species can convert to universalmotif.
  o `scan_sequences()`: will now ignore non-standard letters instead of crashing.

Changes in version 1.0.0

SIGNIFICANT USER-VISIBLE CHANGES
  o Changed the appearance of some of the vignette code blocks.
  o More documentation added in data.R.

BUG FIXES
  o Replaced for loop with `lapply()` in add_multifreq.R L120-133.
  o Replaced for loop with `lapply()` in enrich_motifs.R L327-330.
  o Replaced for loop with `lapply()` in shuffle_motifs.R L77-80.
  o Using `diag()` instead of for loop in `fix_pcc_diag()` (compare_motifs.R).
  o Fixed `read_motifs()` not parsing alphabet correctly.
  o Vignettes are now built using pdflatex instead of lualatex.

Changes in version 0.99.0
  o Ready for bioconductor.

Changes in version 0.98.0
  o Pre-bioconductor.