NEWS
S4Vectors 0.44.0
BUG FIXES
- Make sure that internal helper coerceToSimpleList() always propagates
the mcols.
S4Vectors 0.42.0
NEW FEATURES
- Add 'make.names' argument to as.data.frame() method for DataFrame.
See https://github.com/Bioconductor/S4Vectors/issues/121
SIGNIFICANT USER-VISIBLE CHANGES
- Document global options 'showHeadLines' and 'showTailLines'.
S4Vectors 0.40.0
NEW FEATURES
- Subscript now can be any 1D array-like object when subsetting a
Vector derivative.
SIGNIFICANT USER-VISIBLE CHANGES
- Drop empty "DataFrame and DataFrameList objects" section from
S4VectorsOverview.Rnw vignette.
BUG FIXES
- Fix incorrect C-level if statement in Rle_utils.c
S4Vectors 0.38.0
SIGNIFICANT USER-VISIBLE CHANGES
- The RleTricks vignette was converted from Rnw to Rmd (thanks to
Beryl Kanali and Jen Wokaty for this conversion).
BUG FIXES
- Improve support of DFrame objects with S3-typed columns (commit 3b83d5f).
- Fix bug in internal helper bindROWS2() (commit 05cfd3c).
S4Vectors 0.36.0
NEW FEATURES
- Implement DataFrameFactor objects. See '?DataFrameFactor'.
- Add droplevels() method for Factor objects.
- Factor objects now use a raw vector instead of an integer vector to store
their internal index when they have 255 levels or less. This greatly
reduces their memory footprint.
SIGNIFICANT USER-VISIBLE CHANGES
- Subsetting a DataFrame object no longer tries to keep its colnames
unique.
S4Vectors 0.34.0
NEW FEATURES
- Implement subassignment of TransposedDataFrame objects.
SIGNIFICANT USER-VISIBLE CHANGES
- DataFrame is now a virtual class! This completes the replacement of
DataFrame with DFrame announced in September 2019. See:
https://www.bioconductor.org/help/course-materials/2019/BiocDevelForum/02-DataFrame.pdf
BUG FIXES
- Avoid spurious warnings when DataFrame() is supplied a DataFrameList.
- Fix bug in combineRows(DataFrame(), DataFrame(ref=IRanges(1:2, 10))).
- Fix display of TransposedDataFrame objects with more than 11 rows.
- Fix isEmpty() on ordinary lists and derivatives.
- Fix handling of nested DataFrames in combineUniqueCols().
- Make sure internal helper lowestListElementClass() does not lose
the "package" attribute of the returned class (fixes issue #103).
S4Vectors 0.32.0
SIGNIFICANT USER-VISIBLE CHANGES
- Subsetting a DataFrame object by row names no longer uses partial
matching.
S4Vectors 0.30.0
NEW FEATURES
- Add combineRows(), combineCols(), and combineUniqueCols() for DataFrame
objects. These are more flexible versions of rbind() and cbind() that
don't require the objects to combine to have the same columns or rows.
- Add unname() generic and a method for Vector objects.
DEPRECATED AND DEFUNCT
- Remove parallelSlotNames(). Was deprecated in BioC 3.11 and defunct in
BioC 3.12.
BUG FIXES
- Fix long-standing bug in rbind() method for DataFrame objects. The bug
was causing rbind() to return an incorrect result when the columns of
the DataFrame objects to combine were a mix of ordinary lists and other
list-like objects like IntegerList objects (defined in the IRanges
package).
- Fix issues in DataFrame printing (commits 735c6b7f and 89b045e7).
- Fix bug in expand() when the DataFrame object to expand has one or none
unselected columns (commit a8f839bb).
S4Vectors 0.28.0
SIGNIFICANT USER-VISIBLE CHANGES
- Replaced DataTable class with RectangularData class.
- Replaced DataTable_OR_NULL with DataFrame_OR_NULL class.
- Add parallel_slot_names() generic and methods for Vector derivatives.
This replaces vertical_slot_names(). The concept of "vertical" and
"horizontal" slots is now a RectangularData concept only i.e. only
RectangularData derivatives should define vertical_slot_names() and
horizontal_slot_names() methods. For RectangularData derivatives that
are also Vector derivatives, one of the two methods should typically
be defined as a synonym of parallel_slot_names(). For example
horizontal_slot_names() now returns parallel_slot_names() on a
DataFrame derivative and vertical_slot_names() will return
parallel_slot_names() on a SummarizedExperiment derivative.
- makeClassinfoRowForCompactPrinting() is now exported.
- showAsCell() now trims strings that are > 22 characters.
- Small tweak to show() method for Rle objects. Now it uses showAsCell()
instead of as.character() for more compact display of the run values
of the Rle object, and for consistency with other show() methods (e.g.
with method for DataFrame objects).
DEPRECATED AND DEFUNCT
- parallelSlotNames() is now defunct (after being deprecated in BioC 3.11).
BUG FIXES
- Fix coercion from SimpleList to DataFrame.
- Fix bug in showAsCell() on ordinary data frames.
- Make sure showAsCell() works on a list of non-subsettable objects.
S4Vectors 0.26.0
NEW FEATURES
- Add TransposedDataFrame objects.
- Add make_zero_col_DFrame() for constructing a zero-column DFrame object.
Intended for developers to use in other packages and typically not needed
by the end user.
- Add internal generic makeNakedCharacterMatrixForDisplay() to facilitate
implementation of show() methods. Also add cbind_mcols_for_display()
helper for use within makeNakedCharacterMatrixForDisplay() methods.
SIGNIFICANT USER-VISIBLE CHANGES
- Rename parallelSlotNames() internal generic -> vertical_slot_names().
Also add new horizontal_slot_names() internal generic (no methods yet).
BUG FIXES
- Fix bug causing segfault in C function 'select_hits()' when 'nodup'
is TRUE.
S4Vectors 0.24.0
NEW FEATURES
- Add Factor class. Serves a similar role as factor in base R except that
the levels of a Factor object can be any Vector derivative.
- New methods for DataFrame comparisons (by Aaron Lun)
- Add sameAsPreviousROW() generic and methods for ANY, atomic, integer,
numeric, complex, Rle, DataFrame, and Pairs (by Aaron Lun)
- Support more comparison methods for Pairs objects
- Add methods for coercing back and forth between HitsList and
SortedByQueryHitsList.
- Add anyDuplicated() method for Vector derivatives.
- Support 'by=' argument on sort,List
- Add is.finite() method for Rle objects
- Add add "&" method for FilterRules objects as a convenience for
concatenation
SIGNIFICANT USER-VISIBLE CHANGES
- Add DFrame class (commit 36837bdf). DataFrame() now returns a DFrame
instance (commit 83b09b19).
- Now 'stringsAsFactors' is set to FALSE when coercing something to
a DataFrame.
- Move splitAsList() from the IRanges package
- Move S4 class "atomic" from the IRanges package
- Improve handling of user-supplied metadata columns
DEPRECATED AND DEFUNCT
- Remove phead(), ptail(), and strsplitAsListOfIntegerVectors(). These
functions were deprecated in BioC 3.7 and defunct in BioC 3.8.
BUG FIXES
- Fix split() on a SortedByQueryHits object (issue #39)
- Fix the following coercions:
- Hits -> SelfHits
- SortedByQueryHits -> SortedByQuerySelfHits
- SelfHits -> SortedByQuerySelfHits
- Hits -> SortedByQuerySelfHits
Before this fix all these coercions **seemed** to work but they
were in fact silently producing invalid objects.
- A fix to anyDuplicated() method for Rle objects (commit 63495d6)
- A fix related to replacing DataFrame columns with matrix columns
(commit 00169dd6)
- All show() methods now return an invisible NULL (commit f4b4ee76)
S4Vectors 0.22.0
NEW FEATURES
- Add recursive argument to expand() methods
- Support DataFrame (or any tabular object) in Pairs
- List derivatives now support x[i] <- NULL
- Some Vector derivatives now support appending with [<-
BUG FIXES
- [<-,DataFrame only makes rownames for new rows when rownames present
- DataFrame() lazily deparses arguments
S4Vectors 0.20.0
NEW FEATURES
- rbind() now supports DataFrame objects with the same column names
but in different order, even when some of the column names are
duplicated. How rbind() re-aligns the columns of the various objects
to bind with those of the first object is consistent with what
base:::rbind.data.frame() does.
- Add isSequence() low-level helper.
- Add 'nodup' argument to selectHits().
SIGNIFICANT USER-VISIBLE CHANGES
- The rownames of a DataFrame are no more required to be unique.
- Change 'use.names' default from FALSE to TRUE in mcols() getter.
- Coercion to DataFrame now **always** propagates the names.
- Rename low-level generic concatenateObjects() -> bindROWS().
- replaceROWS() now dispatches on 'x' and 'i' instead of 'x' only.
- Speedup row subsetting of DataFrame with many columns.
DEPRECATED AND DEFUNCT
- phead(), ptail(), and strsplitAsListOfIntegerVectors() are now defunct
(after being deprecated in BioC 3.7).
BUG FIXES
- Fix window() on a DataFrame with data.frame columns.
- 2 fixes to "rbind" method for DataFrame objects:
- It now properly handles DataFrame objects with duplicated colnames.
Note that the new behavior is consistent with base::rbind.data.frame().
- It now properly handles DataFrame objects with columns that are 1D
arrays.
- Fix showAsCell() on nested data-frame-like objects.
- 2 fixes to "as.data.frame" method for DataFrame objects:
- It now works if the DataFrame object contains nested data-frame-like
objects or other complicated S4 objects (as long as these complicated
objects in turn support as.data.frame()).
- It now handles 'stringsAsFactors' argument properly. Originally
reported here: https://github.com/Bioconductor/GenomicRanges/issues/18
S4Vectors 0.18.0
NEW FEATURES
- The package gets a new vignette: S4VectorsOverview.Rnw
The material in this new vignette comes from the IRangesOverview.Rnw
vignette located in the IRanges package. All the S4Vectors-specific
material was moved from the IRangesOverview.Rnw vignette to the new
S4VectorsOverview.Rnw vignette.
- All Vector derivatives now support 'x[i, j]' by default. This allows
the user to conveniently subset the metadata columns thru 'j'.
Note that GenomicRanges objects have been supporting this feature for
years but now all Vector derivatives support it. Developers of Vector
derivatives with a true 2-D semantic (e.g. SummarizedExperiment) need
to overwrite this.
- rank() now suports 'by' on Vector derivatives.
- Add concatenateObjects() generic and methods for LLint, vector, Vector,
Hits, and Rle objects. This is a low-level generic intended to
facilitate implementation of c() on vector-like objects.
The "concatenateObjects" method for Vector objects concatenates the
objects by concatenating all their parallel slots. The method behaves
like an endomorphism with respect to its first argument 'x'. Note that
this method will work out-of-the-box and do the right thing on most
Vector subclasses as long as parallelSlotNames() reports the names of
all the parallel slots on objects of the subclass (some Vector subclasses
might require a "parallelSlotNames" method for this to happen). For those
Vector subclasses on which concatenateObjects() does not work
out-of-the-box or does not do the right thing, it is strongly advised
to override the method for Vector objects rather than trying to override
the (new) "c" method for Vector objects with a specialized method. The
specialized "concatenateObjects" method will typically delegate to the
method below via the use of callNextMethod(). See "concatenateObjects"
methods for Hits and Rle objects for some examples. No Vector subclass
should need to override the "c" method for Vector objects.
- Major refactoring of [[<- for List objects. It's now based on a new
"setListElement" method for List objects that relies on ‘[<-' for
replacement, c() for appending, and '[' for removal, which are the 3
operations that setListElement() can perform (depending on how it’s
called). As a consequence [[<- now works out-of-the box on any List
derivative for which '[<-', c(), and '[' work.
SIGNIFICANT USER-VISIBLE CHANGES
- endoapply() and mendoapply() are now regular functions instead of
generic functions.
- A couple of minor improvements to how default "showAsCell" method
handles list-like and non-list like objects.
- Replace strsplitAsListOfIntegerVectors() with toListOfIntegerVectors().
(The former is still available but deprecated in favor of the latter.)
The input of toListOfIntegerVectors() now can be a list of raw vectors
(in addition to be a character vector), in which case it's treated like
if it was 'sapply(x, rawToChar)'.
- A couple of optimizations to "[<-" method for DataFrame objects
(see commit e63f4cfd637e3471e4b04015c2938348df17e14a).
DEPRECATED AND DEFUNCT
- phead() and ptail() are deprecated in favor of IRanges::heads() and
IRanges::tails().
- strsplitAsListOfIntegerVectors() is deprecated in favor of
toListOfIntegerVectors().
BUG FIXES
- The mcols() setter no more tries to downgrade to DataFrame a supplied
right value that extends DataFrame (e.g. DelayedDataFrame).
- 'DataFrame(I(x))' and 'as(I(x), "DataFrame")' now drop the I() wrapping
before storing 'x' in the returned object. This wrapping was ugly, not
needed, and breaking S4 objects.
- Fix a couple of long-standing bugs in DataFrame subassignment:
- Bug in the "[<-" method for DataFrame objects where replacing the
1st variable with a rectangular object (e.g. x[1] <-
DataFrame(aa=I(matrix(1:6, ncol=2)))) was returning a DataFrame
with the "nrows" slot set incorrectly.
- A couple of bugs in the "replaceROWS" method for DataFrame objects
when used in "rbind mode" i.e. when max(i) > nrow(x).
- Fix bug in "cbind" method for DataFrame where it was appending X to
the column names in some situations (see
https://github.com/Bioconductor/S4Vectors/issues/8).
- Fix order() on SortedByQueryHits objects (see
https://github.com/Bioconductor/S4Vectors/issues/6).
- Fix bug in internal new_Hits() constructor where it was not returning an
object of the class specified via 'Class' in some situations.
- "lapply" for SimpleList objects now calls match.fun(FUN) internally to
find the function to apply.
S4Vectors 0.16.0
NEW FEATURES
- Introduce FilterResults as generic parent of FilterMatrix.
- Optimized subsetting of an Rle object by an integer vector. Speed up
is about 3x or more for big objects with respect to BioC 3.5.
SIGNIFICANT USER-VISIBLE CHANGES
- coerce,list,DataFrame generates "valid" names when list has none.
This ends up introducing an inconsistency between DataFrame and
data.frame but it is arguably a good one. We shouldn't rely on
DataFrame() to generate variable names from scratch anyway.
BUG FIXES
- Fix showAsCell() on data-frame-like and array-like objects with a single
column, and on SplitDataFrameList objects.
- Calling DataFrame() with explict 'row.names=NULL' should block rownames
inference.
- cbind.DataFrame() ensures every argument is a DataFrame, not just first.
- rbind_mcols() now is robust to missing 'x'.
- Fix extractROWS() for arrays when subscript is a RangeNSBS.
- Temporary workaround to make the "union" method for Hits objects work
even in the presence of another "union" generic in the cache (which is
the case e.g. if the user loads the lubridate package).
- A couple of (long-time due) tweaks and fixes to "unlist" method for
List objects so that it behaves consistently with "unlist" method for
CompressedList objects.
- Modify Mini radix C code to accommodate a bug in Apple LLVM version 6.1.0
optimizer.
[commit 241150d2b043e8fcf6721005422891baff018586]
- Fix match,Pairs,Pairs()
[commit a08c12bf4c31b7304d25122c411d882ec52b360c]
- Various other minor fixes.
S4Vectors 0.14.0
NEW FEATURES
- Add LLint vectors: similar to ordinary integer vectors (int values at
the C level) but store "large integers" i.e. long long int values at the
C level. These are 64-bit on Intel platforms vs 32-bit for int values.
See ?LLint for more information. This is in preparation for supporting
long Vector derivatives (planned for BioC 3.6).
- Default "rank" method for Vector objects now supports the same ties
method as base::rank() (was only supporting ties methods "first" and
"min" until now).
- Support x[[i,j]] on DataFrame objects.
- Add "transform" methods for DataTable and Vector objects.
SIGNIFICANT USER-VISIBLE CHANGES
- Rename union classes characterORNULL, vectorORfactor, DataTableORNULL,
and expressionORfunction -> character_OR_NULL, vector_OR_factor,
DataTable_OR_NULL, and expression_OR_function, respectively.
- Remove default "xtfrm" method for Vector objects. Not needed and
introduced infinite recursion when calling order(), sort() or rank() on
Vector objects that don't have specific order/sort/rank methods.
DEPRECATED AND DEFUNCT
- Remove compare() (was defunct in BioC 3.4).
- Remove elementLengths() (was defunct in BioC 3.4).
BUG FIXES
- Make showAsCell() robust to nested lists.
- Fix bug where subsetting a List object 'x' by a list-like subscript was
not always propagating 'mcols(x)'.
S4Vectors 0.12.0
NEW FEATURES
- Add n-ary "merge" method for Vector objects.
- "extractROWS" methods for atomic vectors and DataFrame objects now
support NAs in the subscript. As a consequence a DataFrame can now
be subsetted by row with a subscript that contains NAs. However that
will only succeed if all the columns in the DataFrame can also be
subsetted with a subscript that contains NAs (e.g. it would fail at
the moment if some columns are Rle's but we have plans to make this
work in the future).
- Add "union", "intersect", "setdiff", and "setequal" methods for Vector
objects.
- Add coercion from data.table to DataFrame.
- Add t() S3 methods for Hits and HitsList.
- Add "c" method for Pairs objects.
- Add rbind/cbind methods for List, returning a list matrix.
- aggregate() now supports named aggregator expressions when 'FUN' is
missing.
SIGNIFICANT USER-VISIBLE CHANGES
- "c" method for Rle objects handles factor data more gracefully.
- "eval" method for FilterRules objects now excludes NA results, like
subset(), instead of failing on NAs.
- Drop "as.env" method for List objects so that as.env() behaves more like
as.data.frame() on these objects.
- Speed up "replaceROWS" method for Vector objects when 'x' has names.
- Optimize selfmatch for factors.
DOCUMENTATION IMPROVEMENTS
- Add S4QuickOverview vignette.
DEPRECATED AND DEFUNCT
- elementLengths() and compare() are now defunct (were deprecated in
BioC 3.3).
- Remove "ifelse" methods for Rle objects (were defunct in BioC 3.3),
BUG FIXES
- Fix bug in showAsCell(x) when 'x' is an AsIs object.
- DataFrame() avoids NULL names when there are no columns.
- DataFrame with NULL colnames are now considered invalid.
S4Vectors 0.10.0
NEW FEATURES
- Add SelfHits class, a subclass of Hits for representing objects where the
left and right nodes are identical.
- Add utilities isSelfHit() and isRedundantHit() to operate on SelfHits
objects.
- Add new Pairs class that couples two parallel vectors.
- head() and tail() now work on a DataTable object and behave like on an
ordinary matrix.
- Add as.matrix.Vector().
- Add "append" methods for Rle/vector (they promote to Rle).
SIGNIFICANT USER-VISIBLE CHANGES
- Many changes to the Hits class:
- Replace the old Hits class (where the hits had to be sorted by query)
with the SortedByQueryHits class.
- A new Hits class where the hits can be in any order is re-introduced as
the parent of the SortedByQueryHits class.
- The Hits() constructor gets the new 'sort.by.query' argument that is
FALSE by default. When 'sort.by.query' is set to TRUE, the constructor
returns a SortedByQueryHits instance instead of a Hits instance.
- Bidirectional coercion is supported between Hits and SortedByQueryHits.
When going from Hits to SortedByQueryHits, the hits are sorted by query.
- Add "c" method for Hits objects.
- Rename Hits slots:
queryHits -> from
subjectHits -> to
queryLength -> nLnode (nb of left nodes)
subjectLength -> nRnode (nb of right nodes)
- Add updateObject() method to update serialized Hits objects from old
(queryHits/subjectHits) to new (from/to) internal representation.
- The "show" method for Hits objects now labels columns with from/to by
default and switches to queryHits/subjectHits labels only when the
object is a SortedByQueryHits object.
- New accessors are provided that match the new slot names: from(), to(),
nLnode(), nRnode(). The old accessors (queryHits(), subjectHits(),
queryLength(), and subjectLength()) are just aliases for the new
accessors. Also countQueryHits() and countSubjectHits() are now aliases
for new countLnodeHits() and countRnodeHits().
- Transposition of Hits objects now propagates the metadata columns.
- Rename elementLengths() -> elementNROWS() (the old name was clearly a
misnomer). For backward compatibility the old name still works but is
deprecated (now it's just an "alias" for elementNROWS()).
- Rename compare() -> pcompare(). For backward compatibility the old name
still works but is just an "alias" for pcompare() and is deprecated.
- Some refactoring of the Rle() generic and methods:
- Remove ellipsis from the argument list of the generic.
- Dispatch on 'values' only.
- The 'values' and 'lengths' arguments now have explicit default values
logical(0) and integer(0) respectively.
- Methods have no more 'check' argument but new low-level (non-exported)
constructor new_Rle() does and is what should now be used by code that
needs this feature.
- Optimize subsetting of an Rle object by an Rle subscript: the subscript
is no longer decoded (i.e. expanded into an ordinary vector). This
reduces memory usage and makes the subsetting much faster e.g. it can be
100x times faster or more if the subscript has many (e.g. thousands) of
long runs.
- Modify "replaceROWS" methods so that the replaced elements in 'x' get
their metadata columns from 'value'. See this thread on bioc-devel:
https://stat.ethz.ch/pipermail/bioc-devel/2015-November/008319.html
- Remove ellipsis from the argument list of the "head" and "tail" methods
for Vector objects.
- pc() (parallel combine) now returns a List object only if one of the
supplied objects is a List object, otherwise it returns an ordinary list.
- The "as.data.frame" method for Vector objects now forwards the
'row.names' argument.
- Export the "parallelSlotNames" methods.
DEPRECATED AND DEFUNCT
- Deprecate elementLengths() in favor of elementNROWS(). New name reflects
TRUE semantic.
- Deprecate compare() in favor of pcompare().
- After being deprecated in BioC 3.2, the "ifelse" methods for Rle objects
are now defunct.
- Remove "aggregate" method for vector objects which was an undocumented
bad idea from the start.
BUG FIXES
- Fix 2 long-standing bugs in "as.data.frame" method for List objects:
- must always return an ordinary data.frame (was returning a DataFrame
when 'use.outer.mcols' was TRUE),
- when 'x' has names and 'group_name.as.factor' is TRUE, the levels of
the returned group_name col must be identical to 'unique(names(x))'
(names of empty list elements in 'x' was not showing up in
'levels(group_name)').
- Fix and improve the elementMetadata/mcols setter method for Vector
objects so that the specific methods for GenomicRanges, GAlignments,
and GAlignmentPairs objects are not needed anymore and were removed.
Note that this change also fixes setting the elementMetadata/mcols of a
SummarizedExperiment object with NULL or an ordinary data frame, which
was broken until now.
- Fix bug in match,ANY,Rle method when supplied 'nomatch' is not NA.
- Fix findMatches() for Rle table.
- Fix show,DataTable-method to display all rows if <= nhead + ntail + 1
S4Vectors 0.4.0
NEW FEATURES
- Add isSorted() and isStrictlySorted() generics, plus some methods.
- Add low-level wmsg() helper for formatting error/warning messages.
- Add pc() function for parallel c() of list-like objects.
- Add coerce,Vector,DataFrame; just adds any mcols as columns on top of the
coerce,ANY,DataFrame behavior.
- [[ on a List object now accepts a numeric- or character-Rle of length 1.
- Add "droplevels" methods for Rle, List, and DataFrame objects.
- Add table,DataTable and transform,DataTable methods.
- Add prototype of a better all.equals() for S4 objects.
SIGNIFICANT USER-VISIBLE CHANGES
- Move Annotated, DataTable, Vector, Hits, Rle, List, SimpleList, and
DataFrame classes from the IRanges package.
- Move isConstant(), classNameForDisplay(), and low-level argument
checking helpers isSingleNumber(), isSingleString(), etc... from the
IRanges package.
- Add as.data.frame,List method and remove other inconsistent and not
needed anymore "as.data.frame" methods for List subclasses.
- Remove useless and thus probably never used aggregate,DataTable method
that followed the time-series API.
- coerce,ANY,List method now propagates the names.
BUG FIXES
- Fix bug in coercion from list to SimpleList when the list contains
matrices and arrays.
- Fix subset() on a zero column DataFrame.
- Fix rendering of Date/time classes as DataFrame columns.