NEWS
SQLDataFrame 1.17.1
- There is a total overhaul of the package implementation. It now
defines a 1-dimension DelayedArray called SQLColumnVector to be the
basic element which can sit inside a normal DataFrame for delayed
operation on-disk. It also defines SQLDataFrame to represent a SQL
table (with selected columns), which will be fully file-backed so no
data is loaded into memory until requested. This allows uses to
represent large datasets in limited memory. SQLDataFrame inherites
from DataFrame so it can be used anywhere in Bioconductor's ecosystem
that accepts DataFrame, such as SummarizedExperiment.
- SQLDataFrame is primarily useful for indicating that the in-memory
representation is consistent with the underlying SQL file. It supports
all the usual methods for a 'DataFrame', except that the data is kept
on file and referenced as needed. However, only some operations can
preserve the SQLDataFrame, such as column subsetting and some cbinding
operations. Most operations that add or change data (e.g., filter for
row subsetting, mutate for new/modify columns) would collapse to a
regular DFrame with 'SQLColumnVector's before applying the
operation. They are still file-backed but lack the guarantee of file
consistency. The fallback to 'DFrame' ensures that a 'SQLDataFrame' is
interoperable with other Bioconductor data structures that need to
perform arbitrary 'DataFrame' operations.
SQLDataFrame 1.17.0 (2019-10-01)
NEW FEATURES
- Supports tidy grammar for 'select', 'filter', 'mutate' and '%>%' pipe.
- Supports representation and saving of MySQL database tables.
- Supports lazy cross-MySQL database table aggregations, such as join, union, rbind, etc.
BUG FIXES
- Fixed bugs for single square bracket subsetting with key column(s).
SQLDataFrame 0.99.0 (2019-04-05)
- Submitted to Bioconductor