NEWS
IRanges 2.40.0
BUG FIXES
- Make sure that internal helper coerceToCompressedList() always
propagates the mcols.
IRanges 2.38.0
NEW FEATURES
- Add terminators(), same as promoters() but for terminator regions.
IRanges 2.36.0
SIGNIFICANT USER-VISIBLE CHANGES
- Add link to revElements() in man page for reverse().
BUG FIXES
- Fix is.unsorted() methods for Compressed[Integer|Numeric]List
objects (they were never working since their introduction years
ago).
IRanges 2.34.0
SIGNIFICANT USER-VISIBLE CHANGES
- Improve error handling in AtomicList constructors when input is too big.
IRanges 2.32.0
NEW FEATURES
- splitAsList() can now perform a "dumb split", that is, when
no split factor is supplied, 'splitAsList(x)' is equivalent
to 'unname(splitAsList(x, seq_along(x)))' but is slightly more
efficient.
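A minimal sketch of the dumb split described above, assuming IRanges >= 2.32.0 (with a matching S4Vectors) is installed. The input vector is made up for illustration.

```r
library(IRanges)

x <- c(11L, 12L, 13L)  # illustrative input, not taken from the package docs

# No split factor supplied: each element of 'x' goes in its own list element.
dumb <- splitAsList(x)

# The longer spelling that the entry above describes as equivalent.
explicit <- unname(splitAsList(x, seq_along(x)))

identical(as.list(dumb), as.list(explicit))
```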
SIGNIFICANT USER-VISIBLE CHANGES
- Add ellipsis argument (...) to the gaps() generic function.
IRanges 2.30.0
SIGNIFICANT USER-VISIBLE CHANGES
- Like the DataFrame class defined in the S4Vectors package, classes
SimpleDataFrameList, CompressedDataFrameList, SimpleSplitDataFrameList,
and CompressedSplitDataFrameList, are now virtual. This completes the
replacement of DataFrame with DFrame announced in September 2019. See:
https://www.bioconductor.org/help/course-materials/2019/BiocDevelForum/02-DataFrame.pdf
IRanges 2.28.0
SIGNIFICANT USER-VISIBLE CHANGES
- Replace dim(), nrow(), and ncol() methods for DataFrameList objects with
dims(), nrows(), and ncols() methods.
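A short sketch of the replacement accessors named above, assuming a recent IRanges/S4Vectors installation. The two DataFrames are hypothetical and only there to give the list elements different shapes.

```r
library(IRanges)

# Two illustrative DataFrames with different numbers of rows and columns.
dfl <- DataFrameList(
    a = DataFrame(x = 1:3),
    b = DataFrame(x = 1:2, y = c("u", "v"))
)

nrows(dfl)  # number of rows of each list element
ncols(dfl)  # number of columns of each list element
dims(dfl)   # both at once, one row per list element
```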
DEPRECATED AND DEFUNCT
- Deprecate dim(), nrow(), and ncol() methods for DataFrameList objects
in favor of the new dims(), nrows(), and ncols() methods.
IRanges 2.26.0
NEW FEATURES
- Add commonColnames() accessor to get or set the character vector of
column names present in the individual DataFrames of a SplitDataFrameList
object.
- Implement unary + and - for AtomicList derivatives.
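For instance, unary minus on an IntegerList negates every list element while keeping the list structure. This is a sketch with made-up values, assuming IRanges >= 2.26.0.

```r
library(IRanges)

x <- IntegerList(a = 1:3, b = 4:5)  # illustrative values

neg <- -x  # negates the values inside each list element
pos <- +x  # leaves the values unchanged

neg
pos
```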
SIGNIFICANT USER-VISIBLE CHANGES
- Much improved error handling and messages in IRanges() constructor
function.
DEPRECATED AND DEFUNCT
- Remove RangesList() constructor (was deprecated in BioC 3.7 and defunct
in BioC 3.8).
BUG FIXES
- Fix unsplit() on named List objects.
- Fix findOverlapPairs() for missing subject (fixes #35).
- quantile() on an AtomicList object always returns a matrix (fixes #33).
- Fix which.min()/which.max() for CompressedNumericList objects (fixes #30).
- Export startsWith() and endsWith() methods for CharacterList/RleList
objects (fixes #26).
IRanges 2.24.0
NEW FEATURES
- coverage() now supports 'method="naive"'. This is in addition to the
already supported methods "sort" and "hash". This new method is a slower
version of the "hash" method that has the advantage of avoiding floating
point artefacts in the no-coverage regions of the numeric-Rle object
returned by coverage() when the weights are supplied as a numeric vector
of type 'double'. See "FLOATING POINT ARITHMETIC CAN BRING A SURPRISE"
example in '?coverage'.
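A small sketch of the call, assuming IRanges >= 2.24.0. The ranges and weights are made up; the authoritative example is the one in '?coverage'.

```r
library(IRanges)

# Two ranges covering positions 1-3 and 5-7; position 4 is uncovered.
x <- IRanges(start = c(1, 5), width = 3)

# Weights of type 'double', the case the "naive" method is meant for.
w <- c(0.1, 0.2)

cvg <- coverage(x, weight = w, method = "naive")

# The uncovered position is expected to hold an exact zero.
as.numeric(cvg)[4] == 0
```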
DEPRECATED AND DEFUNCT
- Removed RangedData class and anything related to RangedData objects.
BUG FIXES
- Fix bug in list element recycling.
IRanges 2.22.0
SIGNIFICANT USER-VISIBLE CHANGES
- Resync with change to smoothEnds() in R 4.0.
In R 4.0, stats::smoothEnds() always returns an integer vector
when the input is an integer vector. smoothEnds() on an IntegerList now
reflects this: it returns an IntegerList object instead of a NumericList
object.
DEPRECATED AND DEFUNCT
- RangedData objects are now defunct.
RangedData objects are defunct in BioC 3.11. They were deprecated in BioC
3.9 and, before that, their use has been discouraged in favor of GRanges
or GRangesList objects since BioC 2.12, that is, since 2014.
BUG FIXES
- Fix restrict() method for RangesList objects for when ranges are dropped.
IRanges 2.20.0
NEW FEATURES
- IPos objects now exist in 2 flavors: UnstitchedIPos and StitchedIPos.
IPos is now a virtual class with 2 concrete subclasses: UnstitchedIPos
and StitchedIPos. In an UnstitchedIPos instance the positions are stored
as an integer vector. In a StitchedIPos instance, like with old IPos
instances, the positions are stored as an IRanges object where each range
represents a run of consecutive positions. See ?IPos for more information.
Old serialized IPos instances need to be converted to StitchedIPos
instances with updateObject().
- IPos objects now can hold names.
- The IRanges() and IPos() constructors now accept user-supplied metadata
columns.
- Add grep(), startsWith() and endsWith() methods for CharacterList
objects.
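As a sketch of the constructor change listed above (assuming IRanges >= 2.20.0), extra named arguments passed to IRanges() end up as metadata columns. The column name 'score' and its values are made up.

```r
library(IRanges)

# 'score' is not a constructor argument, so it becomes a metadata column.
ir <- IRanges(start = c(1, 10), width = 5,
              names = c("A", "B"),
              score = c(0.5, 0.9))

mcols(ir)$score
as.data.frame(ir)  # the metadata column is carried along
```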
SIGNIFICANT USER-VISIBLE CHANGES
- as.data.frame(IRanges) now propagates the metadata columns.
- Move splitAsList() to the S4Vectors package.
- Move S4 class "atomic" from the S4Vectors package.
- No longer export %in% (was a leftover from an older time when the
package was defining an %in% method)
DEPRECATED AND DEFUNCT
- After being deprecated in BioC 3.9, the following RangedData methods
are now defunct: findOverlaps, rownames<-, colnames<-, columnMetadata,
columnMetadata<-, c, rbind, as.env, as.data.frame, and coercion from
RangedData to DataFrame.
- Remove the following RangedData methods:
- score, score<-, lapply, within, countOverlaps;
- coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList,
IntegerRanges, or IntegerRangesList to RangedData.
These methods were deprecated in BioC 3.8 and defunct in BioC 3.9.
BUG FIXES
- Fix integer overflow issue in end() setter for IRanges objects.
IRanges 2.18.0
NEW FEATURES
- Add some methods for CharacterList derivatives (nchar, substring,
substr, chartr, toupper, tolower, sub, gsub, grepl).
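A few of these methods in action, as a sketch with made-up strings (assuming IRanges >= 2.18.0). Each call is applied within the list elements and returns a list-like object of the same shape.

```r
library(IRanges)

x <- CharacterList(a = c("acgt", "tt"), b = "gattaca")  # illustrative strings

up  <- toupper(x)      # CharacterList with upper-cased strings
len <- nchar(x)        # number of characters of each string
hit <- grepl("tt", x)  # TRUE where the string contains "tt"

up
len
hit
```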
DEPRECATED AND DEFUNCT
- Deprecate RangedData objects.
The use of RangedData objects has been discouraged in favor of GRanges
or GRangesList objects since BioC 2.12, that is, since 2014. Developers
are required to migrate their code to use GRanges or GRangesList instead
of RangedData objects (the GRanges and GRangesList classes are defined
in the GenomicRanges package).
- Several RangedData methods are now defunct (after being deprecated in
BioC 3.8):
- score, score<-, lapply, within, countOverlaps;
- coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList,
IntegerRanges, or IntegerRangesList to RangedData.
BUG FIXES
- Fix unlist() on a SimpleRleList object of length 0.
- Fix drop() for FactorList derivatives.
- Fix removed rownames upon replacing in a SplitDataFrameList.
IRanges 2.16.0
SIGNIFICANT USER-VISIBLE CHANGES
- Optimize unlist() on Views objects.
- Optimize range(), any() and all() on CompressedRleList objects.
- Optimize start(), end(), width() setters on CompressedRangesList objects.
DEPRECATED AND DEFUNCT
- Deprecate several RangedData methods:
- score, score<-, lapply, within, countOverlaps;
- coercions from list, data.frame, DataTable, Rle, RleList, RleViewsList,
IntegerRanges, or IntegerRangesList to RangedData.
RangedData objects will be deprecated in BioC 3.9 (their use has been
discouraged since BioC 2.12, that is, since 2014). Package developers
that are still using RangedData objects need to migrate their code to
use GRanges or GRangesList objects instead.
- The RangesList() constructor is now defunct (after being deprecated in
BioC 3.7).
BUG FIXES
- Fix DF[IRanges(...), ] on a DataFrame with data.frame columns.
- Make [[, as.list(), lapply(), and unlist() fail more gracefully on
an IRanges object.
- NCList objects now properly support c().
IRanges 2.14.0
NEW FEATURES
- Add the windows() generic with various methods. This is a "parallel"
version of window() for list-like objects i.e. it does
'mendoapply(window, x, start, end, width)' but uses a fast
implementation.
Also add heads() and tails() as convenience wrappers around windows().
They do 'mendoapply(head, x, n)' and 'mendoapply(tail, x, n)',
respectively, but use a fast implementation. They're replacements for
S4Vectors::phead() and S4Vectors::ptail() which are now deprecated.
- Add equisplit() to split a vector-like object into a specified number
of partitions with equal (total) width. This is useful for instance to
ensure balanced loading of workers in parallel evaluation.
- promoters() arguments 'upstream' and 'downstream' now can be integer
vectors parallel to 'x' (for consistency with the other intra range
transformations).
- The promoters() generic and methods get the 'use.names' argument.
- Add "resize", "flank", and "restrict" methods for Views objects.
- Add "as.integer" method for Pos objects (equivalent to pos()).
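The windows(), heads() and tails() additions listed above can be sketched as follows, on a made-up IntegerList and assuming IRanges >= 2.14.0.

```r
library(IRanges)

x <- IntegerList(a = 1:5, b = 6:7)  # illustrative values

h <- heads(x, n = 2)        # first 2 elements of each list element
t <- tails(x, n = 2)        # last 2 elements of each list element
w <- windows(x, start = 2)  # each list element from its 2nd element on

h
t
w
```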
SIGNIFICANT USER-VISIBLE CHANGES
- The Ranges virtual class is now the common parent of the IRanges,
GRanges, and GAlignments classes (GRanges and GAlignments are defined
in the GenomicRanges and GenomicAlignments packages, respectively).
More precisely, Ranges is a virtual class that now serves as the parent
class for any class that represents a vector of ranges. The ranges can
be integer ranges (i.e. ranges on the space of integers) like in an
IRanges object, or genomic ranges (i.e. ranges on a genome) like in a
GRanges object. Note that because Ranges extends List, all Ranges
derivatives are considered list-like objects. This means that GRanges
objects and their derivatives are considered list-like objects, which
is new (even though [[ doesn't work on them yet, this will be implemented
in Bioconductor 3.8).
- Similarly the RangesList virtual class is now the common parent of the
IRangesList, GRangesList, and GAlignmentsList classes.
- IRanges objects don't support [[, unlist(), as.list(), lapply(), and
as.integer() anymore. This is a temporary situation only. These
operations will be re-introduced in Bioconductor 3.8 but with a
different semantic. The overall goal of all these changes is to bring
more consistency between IRanges and GRanges objects (GRanges objects will
also support [[, unlist(), as.list(), and lapply() in Bioconductor 3.8).
Non-exported IRanges:::unlist_as_integer() helper is a temporary
replacement for what unlist() and as.integer() used to do on an IRanges
object.
- Move the pos() generic to BiocGenerics.
- Switch order of breakInChunks() arguments 'chunksize' and 'nchunk' to be
consistent with tileGenome().
- tile() and slidingWindows() now preserve names.
- Optimize [[<- on a CompressedList object. Was very inefficient. The
optimized method can be up to 100x faster or more on a long object.
- All the S4Vectors-specific material in the IRangesOverview.Rnw vignette
has moved to the new S4VectorsOverview.Rnw vignette located in the
S4Vectors package.
DEPRECATED AND DEFUNCT
- Deprecate the RangesList() constructor. IRangesList() should be used
instead.
- The "ranges" methods for Hits and HitsList objects are now defunct
(were deprecated in BioC 3.6).
- The "overlapsAny", "subsetByOverlaps", "coverage" and "range" methods
for RangedData objects are now defunct (were deprecated in BioC 3.6).
- The universe() getter and setter as well as the 'universe' argument of
the RangesList(), IRangesList(), RleViewsList(), and RangedData()
constructor functions are now defunct (were deprecated in BioC 3.6).
IRanges 2.12.0
NEW FEATURES
- Add IPos objects for storing a set of integer positions where most of
the positions are typically (but not necessarily) adjacent.
- Add coercion of a character vector or factor representing ranges (e.g.
"22-155") to an IRanges object, as well as "as.character" and "as.factor"
methods for Ranges objects.
- Introduce overlapsRanges() as a replacement for "ranges" methods for
Hits and HitsList objects, and deprecate the latter.
- Add "is.unsorted" method for Ranges objects.
- Add "ranges" method for Ranges objects (downgrade the object to an
IRanges instance and drop its metadata columns).
- Add 'use.names' and 'use.mcols' args to ranges() generic.
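The character coercion listed above can be sketched like this, with made-up ranges and assuming IRanges >= 2.12.0.

```r
library(IRanges)

# Each string is read as "start-end".
ir <- as(c("22-155", "300-310"), "IRanges")

start(ir)
end(ir)

# Going back to character gives the same representation.
as.character(ir)
```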
SIGNIFICANT USER-VISIBLE CHANGES
- Change 'maxgap' and 'minoverlap' defaults for findOverlaps() and family
(i.e. countOverlaps(), overlapsAny(), and subsetByOverlaps()). This
change addresses 2 long-standing issues:
(1) by default zero-width ranges are not excluded anymore, and
(2) control of zero-width ranges and adjacent ranges is finally
decoupled (only partially though).
New default for 'minoverlap' is 0 instead of 1. New default for 'maxgap'
is -1 instead of 0. See ?findOverlaps for more information about 'maxgap'
and the meaning of -1. For example, if 'type' is "any", you need to set
'maxgap' to 0 if you want adjacent ranges to be considered as overlapping.
Note that poverlaps() still uses the old 'maxgap' and 'minoverlap'
defaults.
- subsetByOverlaps() first 2 arguments are now named 'x' and 'ranges'
(instead of 'query' and 'subject') for consistency with the
transcriptsByOverlaps(), exonsByOverlaps(), and cdsByOverlaps()
functions from the GenomicFeatures package and with the snpsByOverlaps()
function from the BSgenome package.
- Replace ifelse() generic and methods with ifelse2() (eager semantics).
- Coercion from Ranges to IRanges now propagates the metadata columns.
- Move rglist() generic from GenomicRanges to IRanges package.
- The "union", "intersect", and "setdiff" methods for Ranges objects
don't act like endomorphisms anymore: now they always return an
IRanges *instance* whatever Ranges derivatives are passed to them
(e.g. NCList or NormalIRanges).
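The new 'maxgap' default described above for findOverlaps() and family can be checked on two adjacent ranges. This is a sketch with made-up ranges, assuming IRanges >= 2.12.0.

```r
library(IRanges)

query   <- IRanges(1, 5)
subject <- IRanges(6, 10)  # adjacent to 'query', shares no position with it

# Default maxgap = -1: adjacent ranges are not considered overlapping.
n_default <- countOverlaps(query, subject)

# maxgap = 0: adjacent ranges are considered overlapping (type = "any").
n_adjacent <- countOverlaps(query, subject, maxgap = 0)

c(default = n_default, adjacent = n_adjacent)
```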
DEPRECATED AND DEFUNCT
- Deprecate "ranges" methods for Hits and HitsList objects (replaced with
overlapsRanges()).
- Deprecate the "overlapsAny", "subsetByOverlaps", "coverage" and "range"
methods for RangedData objects.
- Deprecate the universe() getter and setter as well as the 'universe'
argument of the RangesList(), IRangesList(), RleViewsList(), and
RangedData() constructor functions.
- Default "togroup" method is now defunct (was deprecated in BioC 3.3).
- Remove grouplength() (was deprecated in BioC 3.3 and replaced with
grouplengths(), then defunct in BioC 3.4).
BUG FIXES
- nearest() and distanceToNearest() now call findOverlaps() internally
with maxgap=0 and minoverlap=0. This fixes incorrect results obtained
in some situations e.g. in the situation reported here:
https://support.bioconductor.org/p/99369/ (zero-width ranges)
but also in this situation:
nearest(IRanges(5, 10), IRanges(1, 4:5), select="all")
where the 2 ranges in the subject are *both* nearest to the 5-10 range.
- Fix restrict() and reverse() on IRanges objects with metadata columns.
- Fix table() on Ranges objects.
- Various other minor fixes.
IRanges 2.10.0
NEW FEATURES
- "range" methods now have a 'with.revmap' argument (like "reduce" and
"disjoin" methods).
- Add coercion from list-like objects to IRangesList objects.
- Add "table" method for SimpleAtomicList objects.
- The "gaps" method for CompressedIRangesList objects now uses a chunk
processing strategy if the input object has more than 10 million list
elements. The hope is to reduce memory usage on very big input objects.
DEPRECATED AND DEFUNCT
- Remove the RangedDataList and RDApplyParams classes, rdapply(), and the
"split" and "reduce" methods for RangedData objects. All these things
were defunct in BioC 3.4.
- Remove 'ignoreSelf' and 'ignoreRedundant' arguments (replaced by
'drop.self' and 'drop.redundant') from findOverlaps,Vector,missing method
(were defunct in BioC 3.4).
- Remove GappedRanges class (was defunct in BioC 3.4).
BUG FIXES
- Fix "setdiff" method for CompressedIRangesList for when all ranges are
empty.
- Fix long standing bug in coercion from Ranges to PartitioningByEnd when
the object to coerce has names.
IRanges 2.8.0
NEW FEATURES
- "disjoin" methods now support 'with.revmap' argument.
- Add 'invert' argument to subsetByOverlaps(), like grep()'s invert.
- Add "unstrsplit" method for RleList objects.
- findOverlapPairs() allows 'subject' to be missing for self pairing.
- Add "union", "intersect" and "setdiff" methods for Pairs.
- Add distance,Pairs,missing method.
- Add ManyToManyGrouping, with coercion targets from FactorList and
DataFrame.
- Add Hits->List and Hits->(ManyToMany)Grouping coercions.
- Add "as.matrix" method for AtomicList objects.
- Add "selfmatch", "duplicated", "order", "rank", and "median" methods
for CompressedAtomicList objects.
- Add "anyNA" method for CompressedAtomicList objects that ensures
recursive=FALSE.
- Add "mean" method for CompressedRleList objects.
- Support 'global' argument on "which.min" and "which.max" methods for
CompressedAtomicList objects.
SIGNIFICANT USER-VISIBLE CHANGES
- Make mstack,Vector method more consistent with stack,List method.
- Optimize and document coercion from AtomicList to RleViews objects.
DEPRECATED AND DEFUNCT
- The following are now defunct (were deprecated in BioC 3.3):
- RangedDataList objects.
- RDApplyParams objects and rdapply().
- The "split" and "reduce" methods for RangedData objects.
- The 'ignoreSelf' and/or 'ignoreRedundant' arguments of the
findOverlaps,Vector,missing method (a.k.a. "self findOverlaps" method).
- grouplength()
- GappedRanges objects.
BUG FIXES
- Fix special meaning of findOverlaps's maxgap argument when type="within".
- isDisjoint(IRangesList()) now returns logical(0) instead of NULL.
- Fixes to regroup() and Grouping construction.
- Fix rank,CompressedAtomicList method.
- Fix fromLast=TRUE for duplicated,CompressedAtomicList method.
IRanges 2.6.0
NEW FEATURES
SIGNIFICANT USER-VISIBLE CHANGES
- Remove 'algorithm' argument from findOverlaps(), countOverlaps(),
overlapsAny(), subsetByOverlaps(), nearest(), distanceToNearest(),
findCompatibleOverlaps(), countCompatibleOverlaps(), findSpliceOverlaps(),
summarizeOverlaps(), Union(), IntersectionStrict(), and
IntersectionNotEmpty(). The argument was added in BioC 3.1 to facilitate
the transition from an Interval Tree to a Nested Containment Lists
implementation of findOverlaps() and family. The transition is over.
- Restore 'maxgap' special meaning (from BioC < 3.1) when calling
findOverlaps() (or other member of the family) with 'type' set to
"within".
- No more limit on the max depth of *on-the-fly* NCList objects. Note that
the limit remains and is still 100000 when the user explicitly calls the
NCList() or GNCList() constructor.
- Rename 'ignoreSelf' and 'ignoreRedundant' arguments of the
findOverlaps,Vector,missing method -> 'drop.self' and 'drop.redundant'.
The old names are still working but deprecated.
- Rename grouplength() -> grouplengths() (old name still available but
deprecated).
- Modify "replaceROWS" method for IRanges objects so that the replaced
elements in 'x' get their metadata columns from 'value'. See this thread
on bioc-devel:
https://stat.ethz.ch/pipermail/bioc-devel/2015-November/008319.html
- Optimized which.min() and which.max() for atomic lists.
- Remove the ellipsis (...) from all the setops methods, except the methods
for Pairs objects.
- Add "togroup" method for ManyToOneGrouping objects and deprecate default
method.
- Modernize "show" method for Ranges objects: now they're displayed more
like GRanges objects.
- Coercion from IRanges to NormalIRanges now propagates the metadata
columns when the object to coerce is already normal.
- Don't export CompressedHitsList anymore from the IRanges package. This
doesn't seem to be used at all and it's not clear that we need it.
DEPRECATED AND DEFUNCT
- Deprecate RDApplyParams objects and rdapply().
- Deprecate RangedDataList objects.
- Deprecate the "reduce" method for RangedData objects.
- Deprecate GappedRanges objects.
- Deprecate the 'ignoreSelf' and 'ignoreRedundant' arguments of the
findOverlaps,Vector,missing method in favor of the new 'drop.self' and
'drop.redundant' arguments.
- Deprecate grouplength() in favor of grouplengths().
- Default "togroup" method is deprecated.
- Remove IntervalTree and IntervalForest classes and methods (were defunct
in BioC 3.2).
- Remove mapCoords() and pmapCoords() generics (were defunct in BioC 3.2).
- Remove all "updateObject" methods (they were all obsolete).
BUG FIXES
- Fix segfault when calling window() on an Rle object of length 0.
- Fix "which.min" and "which.max" methods for IntegerList, NumericList,
and RleList objects when 'x' is empty or contains empty list elements.
- Fix mishandling of zero-width ranges when calling findOverlaps() (or
other member of the family) with 'type' set to "within".
- Various fixes to "countOverlaps" method for Vector,missing. See svn
commit message for commit 116112 for the details.
- Fix validity method for NormalIRanges objects (was not checking anything).
IRanges 2.4.0
NEW FEATURES
- Add "cbind" methods for binding Rle or RleList objects together.
- Add coercion from Ranges to RangesList.
- Add "paste" method for CompressedAtomicList objects.
- Add "expand" method for Vector objects for expanding a Vector object
'x' based on a column in mcols(x).
- Add overlapsAny,integer,Ranges method.
- "coverage" methods now accept 'shift' and 'weight' supplied as an Rle.
SIGNIFICANT USER-VISIBLE CHANGES
- The following was moved to S4Vectors:
- The FilterRules stuff.
- The "aggregate" methods.
- The "split" methods.
- The "sum", "min", "max", "mean", "any", and "all" methods on
CompressedAtomicList objects are 100X faster on lists with 500k elements,
80X faster for 50k elements.
- Tweak "c" method for CompressedList objects to make sure it always
returns an object of the same class as its 1st argument.
- NCList() constructor now propagates the metadata columns.
DEPRECATED AND DEFUNCT
- RangedData/RangedDataList are not formally deprecated yet but the
documentation now officially declares them as superseded by
GRanges/GRangesList and discourages their use.
- After being deprecated in BioC 3.1, IntervalTree and IntervalForest
objects and the "intervaltree" algorithm in findOverlaps() are now
defunct.
- After being deprecated in BioC 3.1, mapCoords() and pmapCoords() are
now defunct.
- Remove seqapply(), mseqapply(), tseqapply(), seqsplit(), and seqby()
(were defunct in BioC 3.1).
BUG FIXES
- Fix FactorList() constructor when 'compress=TRUE' (note that the levels
are combined during compression).
- Fix c() on CompressedFactorList objects (was returning a
CompressedIntegerList object).
IRanges 2.2.0
NEW FEATURES
- Add NCList() and NCLists() for preprocessing a Ranges or RangesList
object into an NCList or NCLists object that can be used for fast overlap
search with findOverlaps(). NCList() and NCLists() are replacements for
IntervalTree() and IntervalForest() that use Nested Containment Lists
instead of interval trees. For a one time use, it's not advised to
explicitly preprocess the input. This is because findOverlaps() or
countOverlaps() will take care of it and do a better job at it (that is,
they preprocess only what's needed when it's needed and release memory
as they go).
- Add coercion methods from Hits to CompressedIntegerList, to
PartitioningByEnd, and to Partitioning.
SIGNIFICANT USER-VISIBLE CHANGES
- The code behind overlap-based operations like findOverlaps(),
countOverlaps(), subsetByOverlaps(), summarizeOverlaps(), nearest(),
etc... was refactored and improved. Some highlights on what has
changed:
- The underlying code used for finding/counting overlaps is now based
on the Nested Containment List algorithm by Alexander V.
Alekseyenko and Christopher J. Lee.
- The old algorithm based on interval trees is still available (but
deprecated). The 'algorithm' argument was added to most overlap-based
operations to let the user choose between the new (algorithm="nclist",
the default) and the old (algorithm="intervaltree") algorithm.
- With the new algorithm, the hits returned by findOverlaps() are not
fully ordered (i.e. ordered by queryHits and subjectHits) anymore,
but only partially ordered (i.e. ordered by queryHits only). Other
than that, and except for the 3 particular situations mentioned below,
choosing one or the other doesn't affect the output, only performance.
- Either the query or subject can be preprocessed with NCList() for
a Ranges object (replacement for IntervalTree()), NCLists() for a
RangesList object (replacement for IntervalForest()), and GNCList()
for a GenomicRanges object (replacement for GIntervalTree()).
However, for a one time use, it's not advised to explicitly preprocess
the input. This is because findOverlaps() or countOverlaps() will take
care of it and do a better job at it (that is, they preprocess only
what's needed when it's needed and release memory as they go).
- With the new algorithm, countOverlaps() on Ranges or GenomicRanges
objects doesn't call findOverlaps() to collect all the hits in a
growing Hits object and count them only at the end. Instead the
counting happens at the C level and the hits are not kept. This
reduces memory usage considerably when there are a lot of hits.
- When 'minoverlap=0', zero-width ranges are interpreted as insertion
points and are considered to overlap with ranges that contain them.
This is the 1st situation where using 'algorithm="nclist"' or
'algorithm="intervaltree"' produces different output.
- When using 'select="arbitrary"', the new algorithm will generally
not select the same hits as the old algorithm. This is the 2nd
situation where using 'algorithm="nclist"' or
'algorithm="intervaltree"' produces different output.
- When using the old interval tree algorithm, 'maxgap' has a special
meaning if 'type' is "start", "end", or "within". This is not yet
the case with the new algorithm. That feature seems somewhat useful
though so maybe the new algorithm should also support it? Anyway,
this is the 3rd situation where using 'algorithm="nclist"' or
'algorithm="intervaltree"' produces different output.
- Objects preprocessed with NCList(), NCLists(), and GNCList() are
serializable.
- The RleViewsList() constructor function now reorders its 'rleList'
argument so that its names match the names on the 'rangesList' argument.
- Minor changes to breakInChunks():
- Add 'nchunk' arg.
- Now returns a PartitioningByEnd instead of a PartitioningByWidth object.
- Now accepts 'chunksize' of 0 if 'totalsize' is 0.
- 300x speedup or more when doing unique() on a CompressedRleList object.
- 20x speedup or more when doing unlist() on a SimpleRleList object.
- Moved the RleTricks.Rnw vignette to the S4Vectors package.
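The breakInChunks() changes listed above can be sketched as follows. The sizes are made up, and the arguments are passed by name because the order of 'nchunk' and 'chunksize' differs between versions of the package.

```r
library(IRanges)

# Break 10 elements into chunks of (at most) 4 elements each.
chunks <- breakInChunks(totalsize = 10, chunksize = 4)

class(chunks)  # a PartitioningByEnd object in the versions described above
end(chunks)    # position of the last element of each chunk
width(chunks)  # size of each chunk
```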
DEPRECATED AND DEFUNCT
- Deprecated mapCoords() and pmapCoords(). They're replaced by
mapToTranscripts() and pmapToTranscripts() from the GenomicFeatures
package and mapToAlignments() and pmapToAlignments() from the
GenomicAlignments package.
- Deprecated IntervalTree and IntervalForest objects.
- seqapply(), seqby(), seqsplit(), etc are now defunct (were deprecated in
IRanges 2.0.0).
- Removed map(), pmap(), and splitAsListReturnedClass() (were defunct in
IRanges 2.0.0).
- Removed 'with.mapping' argument from reduce() methods (was defunct in
IRanges 2.0.0).
BUG FIXES
- findOverlaps,Vector,missing method now accepts extra arguments via ...
so for example one can specify 'ignore.strand=TRUE' when calling it on a
GRanges object (before that, 'findOverlaps(gr, ignore.strand=TRUE)'
would fail).
- PartitioningByEnd() and PartitioningByWidth() constructors now check
that, when 'x' is an integer vector, it cannot contain NAs or negative
values.
IRanges 2.0.0
NEW FEATURES
- Add mapCoords() and pmapCoords() as replacements for map() and pmap().
- Add coercion from list to RangesList.
- Add slice,ANY method as a convenience for slice(as(x, "Rle"), ...).
- Add mergeByOverlaps(); acts like base::merge as far as it makes sense.
- Add overlapsAny,Vector,missing method.
SIGNIFICANT USER-VISIBLE CHANGES
- Move Annotated, DataTable, Vector, Hits, Rle, List, SimpleList, and
DataFrame classes to new S4Vectors package.
- Move isConstant(), classNameForDisplay(), and low-level argument
checking helpers isSingleNumber(), isSingleString(), etc... to new
S4Vectors package.
- Rename Grouping class -> ManyToOneGrouping. Redefine Grouping class as
the parent of all groupings (it formalizes the most general kind of
grouping).
- Change splitAsList() to a generic.
- In rbind,DataFrame method, no longer coerce the combined column to the
class of the column in the first argument.
- Do not carry over row.names attribute from data.frame to DataFrame.
- No longer make names valid in [[<-,DataFrame method.
- Make the set operations dispatch on Ranges instead of IRanges; they
usually return an IRanges, but the input could be any implementation.
- Add '...' to splitAsList() generic.
- Speed up trim() on a Views object when trimming is actually not needed
(no-op).
- Speed up validation of IRanges objects by 2x.
- Speed up "flank" method for Ranges objects by 4x.
DEPRECATED AND DEFUNCT
- Defunct map() and pmap().
- reduce() argument 'with.mapping' is now defunct.
- splitAsListReturnedClass() is now defunct.
- Deprecate seqapply(), mseqapply(), tseqapply(), seqsplit(), and seqby().
BUG FIXES
- Fix rbind,DataFrame method when first column is a matrix.
- Fix a memory leak in the interval tree code.
- Fix handling of minoverlap > 1 in findOverlaps(), so that it behaves
more consistently and respects 'maxgap', as documented.
- Fix findOverlaps,IRanges method for select="last".
- Fix subset,Vector-method to handle objects with NULL mcols(x) (e.g.
Rle object).
- Fix internal helper rbind.mcols() for DataFrame (and potentially other
tables).
- ranges,SimpleRleList method now returns a SimpleRangesList (instead of
CompressedRangesList).
- Make flank() work on Ranges object of length 0.
IRanges 1.20.0
NEW FEATURES
- Add IntervalForest class from Hector Corrada Bravo.
- Add a FilterMatrix class, for holding the results of multiple filters.
- Add selfmatch() as a faster equivalent of 'match(x, x)'.
- Add "c" method for Views objects (only combine objects with same
subject).
- Add coercion from SimpleRangesList to SimpleIRangesList.
- Add an '%outside%' that is the opposite of '%over%'.
- Add validation of length() and names() of Vector objects.
- Add "duplicated" and "table" methods for Vector objects.
- Add some split methods that dispatch to splitAsList() even when only
'f' is a Vector.
- Add set methods (setdiff, intersect, union) for Rle.
- Add anyNA methods for Rle and Vector.
- Add support for subset(), with(), etc on Vector objects,
where the expressions are evaluated in the scope of the
mcols and fixed columns. For symbols that should resolve
in the calling frame, it is supported and encouraged to escape
them with bquote-style ".(x)".
- Add "tile" generic and methods for partitioning a ranges object
into tiles; useful for iterating over subregions.
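Three of the additions listed above (tile(), '%outside%' and selfmatch()) can be sketched on small made-up ranges. This assumes a current IRanges installation rather than 1.20.0 itself, so details may differ from the original release.

```r
library(IRanges)

# tile(): partition the range 1-10 into 2 tiles (expected: 1-5 and 6-10).
tiles <- tile(IRanges(1, 10), n = 2)

# %outside%: TRUE for ranges in 'x' that overlap nothing in the right operand.
x <- IRanges(c(1, 10), width = 2)  # ranges 1-2 and 10-11
out <- x %outside% IRanges(1, 5)

# selfmatch(): same result as match(r, r).
r <- IRanges(c(1, 5, 1), width = 2)
sm <- selfmatch(r)

tiles
out
sm
```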
SIGNIFICANT USER-VISIBLE CHANGES
- All functionalities related to XVector objects have been moved to the
new XVector package.
- Refine how isDisjoint() handles empty ranges.
- Remove 'keepLength' argument from "window<-" methods.
- unlist( , use.names=FALSE) on a CompressedSplitDataFrameList object
now preserves the rownames of the list elements, which is more
consistent with what unlist() does on other CompressedList objects.
- Splitting a list by a Vector just yields a list, not a List.
- The rbind,DataFrame method now handles the case where Rle and vector
columns need to be combined (assuming an equivalence between Rle and
vector). Also the way the result DataFrame is constructed was changed
(avoids undesirable coercions and should be faster).
- as.data.frame.DataFrame now passes 'stringsAsFactors=FALSE' and
'check.names=!optional' to the underlying data.frame() call.
as(x,"DataFrame") sets 'optional=TRUE' when delegating. Most places
where we called as.data.frame(), we now call 'as(x,"data.frame")'.
- The [<-,DataFrame method now coerces column sub-replacement value to
class of column when the column already exists.
- DataFrame() now automatically derives rownames (from the first argument
that has some). This is a fairly significant change in behavior, but it
probably does better match user behavior.
- Make sure that SimpleList objects are coerced to a DataFrame with a
single column. The automatic coercion methods created by the methods
package were trying to create a DataFrame with one column per element,
because DataFrame extends SimpleList.
- Change default to 'compress=TRUE' for RleList() constructor.
- tapply() now handles the case where only INDEX is a Vector (e.g.
an Rle object).
- Speed up coverage() in the "tiling case" (i.e. when 'x' is a tiling
of the [1, width] interval). This makes it much faster to turn a coverage
loaded from a BigWig, WIG or BED file as a GRanges object into an Rle.
- Allow logical Rle return values from filter rules.
- FilterRules no longer requires its elements to be named.
- The select,Vector method now returns a DataFrame even when a single
column is selected.
- Move is.unsorted() generic to BiocGenerics.
DEPRECATED AND DEFUNCT
- Deprecate seqselect() and subsetByRanges().
- Deprecate 'match.if.overlap' arg of "match" method for Ranges objects.
- "match" and "%in%" methods that operate on Views, ViewsList, RangesList,
or RangedData objects (20 methods in total) are now defunct.
- Remove previously defunct tofactor().
BUG FIXES
- The subsetting code for Vector derivatives was substantially refactored.
As a consequence, it's now cleaner, simpler, and [ and [[ behave more
consistently across Vector derivatives. Some obscure long-standing bugs
have been eliminated and the code can be slightly faster in some
circumstances.
- Fix bug in findOverlaps(); zero-width ranges in the query no longer
produce hits ever (regardless of 'maxgap' and 'minoverlap' values).
- Correctly free memory allocated for linked list of results compiled for
findOverlaps(select="all").
- Various fixes for AsIs and DataFrames.
- Allow zero-row replacement values in [<-,DataFrame.
- Fix long standing segfault in "[" method for Rle objects (when doing
Rle()[0]).
- "show" methods now display the most specific class when a column or
slot is an S3 object for which class() returns more than one class.
- "show" methods now display properly cells that are arrays.
- Fix the [<-,DataFrame method for when a value DataFrame has matrix
columns.
- Fix ifelse() for when one or more of the arguments are Rle objects.
- Fix coercion from SimpleList to CompressedList via AtomicList
constructors.
- Make "show" methods robust to "showHeadLines" and "showTailLines" global
options set to NA, Inf or non-integer values.
- Fix error condition in eval,FilterRules method.
- Fix the formatting of an error message in the eval,FilterRules,ANY method.
IRanges 1.18.0
NEW FEATURES
- Add global options 'showHeadLines' and 'showTailLines' to
control the number of head/tail lines displayed by "show" methods
for Ranges, DataTable, and Hits objects.
- "subset" method for Vector objects now considers metadata columns.
- Add classNameForDisplay() generic and use it in all "show" methods
defined in IRanges and GenomicRanges.
- as(x, "DataFrame") now works on *any* R object.
- Add findMatches(), an enhanced version of match() that returns all the
matches between 'x' and 'table'. The hits are returned in a Hits object.
Also add countMatches() for counting the number of matches in 'table'
for each element in 'x'.
- Add overlapsAny() as a replacement for %in% (now deprecated on
range-based objects), and %over% and %within% as convenience wrappers
for overlapsAny(). %over% is the replacement for %in%.
- Add 'with.mapping' arg to "reduce" methods for IRanges, Ranges, Views,
RangesList, and CompressedIRangesList objects.
- Add "order" method for Rle objects.
- Add subsetByRanges() generic with methods for ANY, NULL, vector, and
IRanges for now. This is work-in-progress and more methods will be added
soon. The long term plan is to make this a replacement for seqselect(),
but with a faster and cleaner implementation.
- Add promoters() generic with methods for Ranges, RangesList, Views, and
CompressedIRangesList objects.
- elementLengths() now works on XVectorList objects (and thus works on
DNAStringSet objects and family defined in the Biostrings package).
Note that this is the first step towards having relist() work on XVector
objects (e.g. DNAString objects) even though this is not ready yet.
- Add "mstack" method for DataFrame objects.
- Add 'name.var' argument to "stack" method for List objects for naming
the optional column formed when the elements themselves have named
elements.
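As a rough illustration of the overlap and matching helpers listed above,
the following sketch uses small made-up ranges (the object names 'query',
'subject', 'x' and 'tbl' are purely illustrative) and assumes a release in
which these functions are available:

```r
library(IRanges)

# Hypothetical toy ranges, for illustration only.
query   <- IRanges(start = c(1, 6, 20), end = c(5, 10, 25))
subject <- IRanges(start = c(4, 30),    end = c(12, 35))

# TRUE for each query range that overlaps at least one subject range.
overlapsAny(query, subject)   # TRUE TRUE FALSE
query %over% subject          # same result, written as an operator

# TRUE only when a query range lies entirely inside a subject range.
query %within% subject        # FALSE TRUE FALSE

# Equality-based matching: all matches as a Hits object, or per-element
# counts of matches in 'tbl'.
x   <- IRanges(start = c(1, 5, 1), end = c(3, 8, 3))
tbl <- IRanges(start = c(1, 1, 9), end = c(3, 3, 12))
findMatches(x, tbl)
countMatches(x, tbl)          # 2 0 2
```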
SIGNIFICANT USER-VISIBLE CHANGES
- "distanceToNearest" methods now return a Hits instead of
a DataFrame object.
- The behavior of distance() has changed. Adjacent and overlapping ranges
now return a distance of 0L. See ?distance man page for details.
A temporary warning will be emitted by distance() until the release of
Bioconductor 2.13.
- Change arg list of expand() generic: function(x, ...) instead of
function(x, colnames, keepEmptyRows).
- Dramatic duplicated() and unique() speedups on CompressedAtomicList
objects.
- Significant endoapply() speedup on XVectorList objects (this benefits
DNAStringSet objects and family defined in the Biostrings package).
- 2x speedup to "c" method for CompressedList objects.
- classNameForDisplay() strips 'Simple' or 'Compressed', which affects
all the "show" methods based on it. So now:
> IntegerList(1:4, 2:-3)
IntegerList of length 2
[[1]] 1 2 3 4
[[2]] 2 1 0 -1 -2 -3
instead of:
> IntegerList(1:4, 2:-3)
CompressedIntegerList of length 2
[[1]] 1 2 3 4
[[2]] 2 1 0 -1 -2 -3
- Optimization of "[<-" method for Rle objects when no indices are
selected (just return self).
- "stack" method for List objects now creates a factor for the optional
name variable.
- Evaluating FilterRules now subsets by each filter individually, rather
than subsetting by all at the end.
- Optimized which() on CompressedLogicalList objects.
- All the binary comparison operations (==, <=, etc...) on Ranges objects
are now using compare() behind the scenes. This makes them slightly faster
and also slightly more memory efficient.
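A small sketch of the distance() convention described above, using
hypothetical toy ranges. The values in the comments follow from the rule
that overlapping and adjacent ranges are at distance 0; see the ?distance
man page for the authoritative definition:

```r
library(IRanges)

# Three pairs: overlapping, adjacent, and separated by positions 6 and 7.
x <- IRanges(start = c(1, 1, 1), end = c(5, 5, 5))
y <- IRanges(start = c(3, 6, 8), end = c(10, 10, 10))

distance(x, y)   # 0 0 2
```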
DEPRECATED AND DEFUNCT
- %in% is now deprecated on range-based objects. Please use %over% instead.
More precisely:
- "match" and "%in%" methods that operate on Views, ViewsList,
RangesList, or RangedData objects (20 methods in total) are now
deprecated.
- Behavior of match() and %in% on Ranges objects was changed (and
will issue a warning) to use equality instead of overlap for
comparing elements between Ranges objects 'x' and 'table'. The old
behavior is still available for match() via new 'match.if.overlap'
arg that is FALSE by default (the arg will be deprecated in BioC 2.13
and removed in BioC 2.14).
- tofactor() is now defunct.
- '.ignoreElementMetadata' argument of "c" method for IRanges objects is
now defunct.
BUG FIXES
- Small fix to "unlist" method for CompressedList objects when 'use.names'
is TRUE and 'x' is a zero-length named List (the zero-length vector
returned in that case was not named, now it is).
- "resize" method for Ranges objects now allows zero-length 'fix' when
'x' is zero-length.
- Subsetting a Views object now subsets its metadata columns.
- Names on the vector-like columns of a DataFrame object are now preserved
when calling DataFrame(), or when coercing to DataFrame, or when
combining DataFrame objects with rbind().
- relist() now propagates the names on 'skeleton' when returning
a SimpleList.
- Better argument checking in breakInChunks().
- Fix broken "showAsCell" method for ANY. Now tries to coerce
uni-dimensional objects to vector instead of data.frame (which never
worked anyway, due to a bug).
- Fix long standing bug in "flank" method for Ranges objects: it no longer
returns an invalid object when NAs are passed thru the 'width' arg.
Now it's an error to try to do so.
- Fix issue with some of the "as.env" methods not being able to find the
environment of the caller.
- Fix bug in "showAsCell" method for AtomicList objects: now returns
character(0) instead of NULL on an object of length 0.
- sort() now drops NA's when 'na.last=NA' on an Rle object (consistent
with base::sort).
- table() now handles NA's appropriately on an Rle object.
- table() now returns all the levels on a factor-Rle object.
- Fix sub-replacement of Rles when using Ranges as the index.
- Fix bug in [<- method for DataFrame objects. The fix corrects the way
a new column created by a subset assignment is filled. Previously, if
the first row was set, say, to '1', all values in the column were set
to '1' when they needed to be set to NA (for consistency with data.frame).
- Fix bug in compare() (was not returning 0 when comparing a 0-width range
to itself).
- Fix naming of column when passing an AsIs matrix to DataFrame() -- no
more .X suffix.
- Fix "rbind" method for DataFrame objects when some columns are matrix
objects.
IRanges 1.16.0
NEW FEATURES
- as( , "SimpleList"), as( , "CompressedList"), and as( , "List") now
work on atomic vectors, and each element of the vector corresponds to
an element of the returned List (this is consistent with as.list).
- Add as.list,Rle method.
- Add as.matrix,Views method. Each view corresponds to a row in the
returned matrix. Rows corresponding to views shorter than the longest
view are right-padded with NAs.
- Add FilterClosure closure class for functions placed into a
FilterRules. Has methods for getting parameters and showing.
- Support 'na.rm' argument in "runsum", "runwtsum", "runq", and "runmean"
methods for Rle and RleList objects.
- Add splitAsList() and splitAsListReturnedClass().
- Improve summary,FilterRules to support serial evaluation, discarded
counts (instead of passed) and percentages.
- Make rename work on ordinary vector (in addition to Vector).
- Add coercion from RangedData to CompressedIRangesList, IRangesList, or
RangesList. It propagates the data columns (aka values) of the RangedData
object to the inner metadata columns of the RangesList object.
- Add 'NG' arg to PartitioningByEnd() and PartitioningByWidth()
constructors.
- Make PartitioningByEnd() work on list-like objects (like
PartitioningByWidth()).
- Fast disjoin() for moderate-sized CompressedIRangesList.
- Add countQueryHits() and countSubjectHits().
- coverage() now supports method="auto" and this is the new default.
- Add the flippedQuery(), levels(), ngap(), Lngap(), Rngap(), Lencoding(),
and Rencoding() getters for OverlapEncodings objects.
- Add "encodeOverlaps" method for GRangesList objects.
- Enhance "[" methods for IRanges, XVector, XVectorList, and MaskCollection
objects, as well as "[<-" method for IRanges objects, by supporting the
following subscript types: NULL, Rle, numeric, logical, character, and
factor. (All the methods listed above already supported some of those
types but no method supported them all).
- Add remapHits() for remapping the query and subject hits of a Hits
object.
- Add match,Hits method.
- Add %in%,Vector method.
- Add "compare", "==", "!=", "<=", ">=", "<", ">", "is.unsorted", "order",
"rank", "match", and "duplicated" methods for XRawList objects. unique()
and sort() also work on these objects via the "unique" and "sort" methods
for Vector objects.
- Add expand() for expanding a DataFrame based on the contents of one or
more designated columns.
- After being deprecated (in BioC 2.9) and defunct (in BioC 2.10), the
"as.vector" method for AtomicList objects is back, but now it mimics
what as.vector() does on an ordinary list i.e. it's equivalent to
'as.vector(as.list(x), mode=mode)'. Also coercions from AtomicList to
logical/integer/numeric/double/complex/character/raw are back and based
on the "as.vector" method for AtomicList objects i.e. they work only on
objects with top-level elements of length <= 1.
- DataFrame constructor now supports 'check.names' argument.
- Add revElements() generic with methods for List and CompressedList
objects.
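The following sketch illustrates a few of the list-oriented additions
above (splitAsList(), revElements(), and coercion of an atomic vector to
List). The inputs are made up for illustration and the code assumes a
release in which these functions are available:

```r
library(IRanges)

# Split a vector into a List according to a grouping factor.
f <- factor(c("a", "b", "a", "b", "a", "b"))
s <- splitAsList(1:6, f)
s                          # a: 1 3 5, b: 2 4 6

# Reverse each list element individually (the list order is unchanged).
r <- revElements(IntegerList(a = 1:3, b = 4:5))
r                          # a: 3 2 1, b: 5 4

# Each element of an atomic vector becomes one element of the List.
l <- as(1:3, "List")
length(l)                  # 3
```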
SIGNIFICANT USER-VISIBLE CHANGES
- Splitting / relisting a Hits object now returns a HitsList instead of
an ordinary list.
- Operations in the Ops group between a List and an atomic vector operand
now coerce the atomic vector to List (SimpleList or CompressedList)
before performing the operation. Also, operands are recycled and a better
job is done returning zero length results of the correct type.
- Change the warning for 'Integer overflow ...' thrown by sum() on
integer-Rle's.
- DataFrame now coerces List/list value to DataFrame in [<-.
- Fix as.matrix,DataFrame for zero column DataFrames. Returns an nrow()x0
logical matrix.
- union,Hits method now sorts the returned hits first by query hit, then
by subject hit.
- Add mcols() accessor as the preferred way (over elementMetadata() and
values()) to access the metadata columns of a Vector object.
- By default, mcols(x) and elementMetadata(x) do NOT propagate the names
of x as the row names of the returned DataTable anymore. However the
user can still get the old behavior by doing mcols(x, use.names=TRUE).
- [<-,XVectorList now preserves the original names instead of propagating
the names of the replacement value, which is consistent with how [<-
operates on an ordinary vector/list.
- coverage() now returns a numeric-Rle when passed numeric weights.
- When called on a List object with use.names=TRUE, unlist() no longer
tries to mimic the kind of nonsense name mangling that base::unlist()
does (e.g. on list(a=1:3)) in a pointless effort to return a vector
with unique names.
- Remove 'hits' argument from signature of encodeOverlaps() generic
function.
- unique,Vector now drops the names for consistency with base::unique().
- Remove make.names() coercion in colnames<-,DataFrame for consistency
with data.frame.
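A minimal sketch of the mcols() accessor and of the unlist() naming
behavior described above, using made-up objects. The 'use.names' argument
is passed explicitly because its default may differ between releases:

```r
library(IRanges)

# Hypothetical named ranges with one metadata column.
x <- IRanges(start = c(1, 10), width = 5, names = c("a", "b"))
mcols(x) <- DataFrame(score = c(0.5, 0.9))

mcols(x)$score                          # 0.5 0.9
rownames(mcols(x, use.names = TRUE))    # "a" "b"

# unlist() on a List repeats the outer names as they are ...
y <- IntegerList(a = 1:3, b = 4:5)
names(unlist(y))                        # "a" "a" "a" "b" "b"

# ... whereas base::unlist() on an ordinary list mangles them.
names(unlist(list(a = 1:3)))            # "a1" "a2" "a3"
```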
DEPRECATED AND DEFUNCT
- Deprecated tofactor().
- Remove RangesMatching, RangesMatchingList, and Binning classes.
- Change from deprecated to defunct: matchMatrix(), "dim" method for Hits
objects, and RangesMatchingList().
BUG FIXES
- Fix bug in pintersect,IRanges,IRanges when input had empty ranges
(broken since 2010-03-04).
- Avoid integer overflow in mean,Rle method by coercing integer-Rle
to numeric-Rle internally.
- Change evaluation frame of with,List to parent.frame(), and get the
enclosure correct in eval,List.
- Many fixes and improvements to coercion from RangesList to RangedData
(see commit 68195 for the details).
- Fix "runValue" and "ranges" methods for CompressedRleList objects
(broken for a very long time).
- shift,Ranges method now fails in case of integer overflow instead of
returning an invalid Ranges object.
- mstack() now works on Vector objects with NULL metadata columns.
- In case of integer overflow, coverage() now puts NAs in the returned
Rle and issues a warning.
- Fix bug in xvcopy,XRawList objects that prevented sequences from being
removed from the cache of a BSgenome object. See commit 67171 for the
details.
- Fix issues related to duplicate column names in DataFrame (see commit
67163 for the details).
- Fix a bunch of subsetting methods that were not subsetting the metadata
columns: "[", "subseq", and "seqselect" methods for XVector objects,
"seqselect" and "window" methods for XVectorList objects, and "[" method
for MaskCollection objects.
- Fix empty replacement with [<-,Vector
- Make %in% robust on an empty 'table' argument when operating on Hits
objects.
IRanges 1.14.0
NEW FEATURES
- The map generic and RangesMapping class for mapping ranges
between sequences according to some alignment. Some useful
methods are implemented in GenomicRanges.
- The Hits class has experimental support for basic set
operations, including setdiff, union and intersect.
- Added a number of data manipulation functions and methods,
including mstack, multisplit, rename, unsplit for Vector.
- Added compare() generic for generalized range-wise comparison of 2
range-based objects.
- Added OverlapEncodings class and encodeOverlaps() generic for dealing
with "overlap encodings".
- subsetByOverlaps() should now work again on an RleViews object.
- DataFrame now supports storing an array (like a matrix) in a column.
- Added as.matrix,DataFrame method.
- Added merge,DataTable,DataTable method.
- Added disjointBins,RangesList method.
- Added ranges,Rle and ranges,RleList methods.
- Added which.max,Rle method.
- Added drop,AtomicList method.
- Added tofactor() wrapper around togroup().
- Added coercions from vector to any AtomicList subtype (compressed and
uncompressed).
- Added AtomicList to Character/Numeric/Logical/Integer/Raw/ComplexList
coercions.
- Added revElements() for reversing individual elements of a List object.
SIGNIFICANT USER-VISIBLE CHANGES
- RangesMatching has been renamed to Hits and extends Vector, so
that it supports metadata columns and other features.
- RangesMatchingList has been renamed to HitsList.
- The 2 columns of the matrix returned by the "as.matrix" method for Hits
objects are now named queryHits/subjectHits instead of query/subject, for
consistency with the queryHits() and subjectHits() getters.
- queryLength()/subjectLength() are recommended alternatives to dim,Hits.
- breakInChunks() returns a PartitioningByWidth object.
- The 'weight' arg in "coverage" methods for IRanges, Views and
MaskCollection objects now can also be a single string naming a column
in elementMetadata(x).
- "countOverlaps" methods now propagate the names of the query.
DEPRECATED AND DEFUNCT
- matchMatrix,Hits is deprecated.
- Moved the following deprecated features to defunct status:
- use of as.data.frame() or as( , "data.frame") on an AtomicList object;
- all coercion methods from AtomicList to atomic vectors;
- subsetting an IRanges by Ranges;
- subsetting a RangesList or RangedData by RangesList.
BUG FIXES
- within,RangedData/List now support replacing columns
- aggregate() override no longer breaks on . ~ x formulas
- "[", "c", "rep.int" and "seqselect" methods for Rle objects are now
safer and will raise an error if the object to be returned has a
length > .Machine$integer.max
- Avoid blowing up memory by not expanding 'logical' Rle's into logical
vectors internally in "slice" method for RleList objects.
IRanges 1.12.0
NEW FEATURES
- Add "relist" method that works on a List skeleton.
- Add XDoubleViews class with support of most of the functionalities
available for XIntegerViews.
- c() now works on XInteger and XDouble objects (in addition to XRaw
objects).
- Add min, max, mean, sum, which.min, which.max methods as synonyms for
the view* functions.
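As an illustration of the synonyms mentioned above, the sketch below
builds a small Views object on a made-up Rle subject (an RleViews object,
chosen here only for convenience) and compares the view* functions with
their shorter equivalents:

```r
library(IRanges)

# Hypothetical subject and two views on it: positions 1-2 and 3-5.
subject <- Rle(c(1L, 3L, 2L, 5L, 4L))
v <- Views(subject, start = c(1, 3), end = c(2, 5))

viewMaxs(v)   # 3 5
max(v)        # same values as viewMaxs(v)

viewSums(v)   # 4 11
sum(v)        # same values as viewSums(v)
```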
SIGNIFICANT USER-VISIBLE CHANGES
- Views and RleViewsList classes don't derive from IRanges and IRangesList
classes anymore.
- When used on a List or a list, togroup() now returns an integer vector
(instead of a factor) for consistency with what it does on other
objects (e.g. on a Partitioning object).
- Move compact() generic from Biostrings to IRanges.
- Drop deprecated 'multiple' argument from "findOverlaps" methods.
- Drop deprecated 'start' and 'symmetric' arguments from "resize" method
for Ranges objects.
DEPRECATED AND DEFUNCT
- Using as.data.frame() or as( , "data.frame") on an AtomicList
object is deprecated.
- Deprecate all coercion methods from AtomicList to atomic vectors.
Those methods were unlisting the object, which can still be done with
unlist().
- Deprecate the Binning class.
- Remove defunct overlap() and countOverlap().
BUG FIXES
- togroup() on a List or a list does not look at the names anymore to infer
the grouping, only at the shape of the list-like object.
- Fix 'relist(IRanges(), IRangesList())'.
- Fix 'rep.int(Rle(), integer(0))'.
- Fix some long-standing issues with the XIntegerViews code (better
handling of "out of limits" or empty views, overflows, NAs).