NEWS
gdsfmt 1.49.1
NEW FEATURES
- new argument 'sort' in 'cleanup.gds()': if TRUE (the default),
rearrange data blocks when defragmenting -- all node header blocks are
placed first in depth-first traversal order with folder headers
prioritized before other node headers, followed by data blocks sorted
by size (small to large) then by ID. This layout clusters the GDS
tree structure metadata at the front of the file, maximizing page
cache hit rate during tree navigation;
use 'sort=FALSE' for backward compatibility
- 'diagnosis.gds()' now appends a trailing '/' to folder node names in
the stream info output, making it easier to distinguish folders from
leaf nodes
gdsfmt 1.48.1
NEW FEATURES
- support building on Windows ARM64 (aarch64): 'src/Makevars.win' now
falls back to building 'liblzma.a' from the bundled xz-5.2.9 sources
when no prebuilt static library matches 'R_ARCH'; x86_64 and i386
builds continue to use the prebuilt archive
- add 'CdCallbackStream' class in CoreArray for stream access via external
callback function pointers
- add 'GDS_File_Open_Callback()' C-level API allowing external packages
to open GDS files through custom stream backends (e.g., cloud storage)
- add 'GDS_R_MakeFileObj()' C-level API to build a complete 'gds.class'
R object from a 'PdGDSFile' pointer, with correct file index, external
pointer (with GC finalizer), names, and class attributes; for use by
external packages (e.g., gdscloud) that open GDS files via
'GDS_File_Open_Callback()'
- 'openfn.gds()' now detects cloud URL schemes (s3://, gs://, az://) and
delegates to a registered handler if available (e.g., from the gdscloud
package)
- add internal functions '.gds_register_cloud_handler()' and
'.gds_get_cloud_handler()' for external packages to register custom
URL scheme handlers
gdsfmt 1.46.0
UTILITIES
- fix a strange bug in 'append.gdsn(, val=character())' on macOS when
compiled with Apple clang version 17
gdsfmt 1.44.1
UTILITIES
- faster appending of bitN array data when N is between 1 and 16
gdsfmt 1.44.0
UTILITIES
- fix the C++ error with Apple clang version 17: no
'std::basic_string<unsigned short>' & 'std::basic_string<unsigned int>'
gdsfmt 1.42.2
UTILITIES
- tweak the C macro to support Linux musl
- fix compilation under the C23 standard
gdsfmt 1.40.2
UTILITIES
- add "#define STRICT_R_HEADERS 1" to the C++ header file according to
R_r86984
- update the C code according to '_R_USE_STRICT_R_HEADERS_=true' &
'_R_CXX_USE_NO_REMAP_=true'
gdsfmt 1.38.1
UTILITIES
- fix the compiler warning: -Wformat-security
gdsfmt 1.36.1
UTILITIES
- 'gdsfmt:::.reopen()' allows forking
gdsfmt 1.36.0
UTILITIES
- fix the compiler warning: sprintf is deprecated
- LZ4 updated to v1.9.4
- XZ updated to v5.2.9
- update zlib to v1.2.13
NEW FEATURES
- 'system.gds()$compiler.flag[1]' is either "64-bit" or "32-bit" indicating
the number of bits of internal data pointer
- new argument 'use.abspath=TRUE' in 'openfn.gds()' and 'createfn.gds()':
the behavior before v1.35.4 is the same as 'use.abspath=TRUE'
gdsfmt 1.34.1
UTILITIES
- avoid using 'crayon::blurred()' in the display (RStudio blurs the screen
output)
gdsfmt 1.34.0
UTILITIES
- update the web links
- update zlib to v1.2.12
gdsfmt 1.32.0
UTILITIES
- optimize using the utilities of the Matrix package for sparse matrices
gdsfmt 1.28.1
UTILITIES
- update according to gcc-11
gdsfmt 1.28.0
NEW FEATURES
- new function 'exist.gdsn()'
- new function 'is.sparse.gdsn()'
UTILITIES
- LZ4 updated to v1.9.3 from v1.9.2
- XZ is updated to v5.2.5 from v5.2.4
- 'apply.gdsn()': work around factor variables when integers narrower
than 32 bits are stored
- a new component 'is.sparse' in 'objdesp.gdsn()'
- 'options(gds.verbose=TRUE)' to show additional information
gdsfmt 1.26.1
UTILITIES
- comply with the R devel (> v4.0.3) to work with factor variables in
'apply.gdsn()'
gdsfmt 1.24.1
UTILITIES
- 'show' option in 'print.gds.class()' for array preview
gdsfmt 1.24.0
NEW FEATURES
- new option 'allow.error' in 'openfn.gds()' for data recovery
- new option 'log.only' in 'diagnosis.gds()'
- new C functions 'GDS_Node_Unload()' and 'GDS_Node_Load()' in R_GDS.h
- new sparse array data types in GDS (SparseReal32, SparseReal64,
SparseInt8, SparseInt16, SparseInt32, SparseInt64, SparseUInt8,
SparseUInt16, SparseUInt32, SparseUInt64)
- the opened gds file will be closed when the object is garbage collected
UTILITIES
- zlib updated to v1.2.11 from v1.2.8
- xz updated to v5.2.4 from v5.2.3
- LZ4 updated to v1.9.2 from v1.7.5
- 'showfile.gds()' does not return the object of 'gds.class'
- new 'nmax' and 'depth' in 'print.gdsn.class()' and 'print.gds.class()'
gdsfmt 1.22.0
NEW FEATURES
- a new function 'unload.gdsn()' to unload a GDS node from memory
UTILITIES
- add '#pragma GCC optimize("O3")' to some of the C++ files when GCC is used
- add the compiler information in 'system.gds()'
- change the file name "vignettes/gdsfmt_vignette.Rmd" to "vignettes/gdsfmt.Rmd",
so 'vignette("gdsfmt")' can work directly
BUG FIXES
- avoid the segfault if the data type is not registered internally
- use O_CLOEXEC (the close-on-exec flag) when opening and creating files to
avoid potentially leaking file descriptors in forked processes
gdsfmt 1.20.0
UTILITIES
- optimize the C implementation of 'packedreal8' using a look-up table
NEW FEATURES
- new data types 'packedreal8u', 'packedreal16u', 'packedreal24u' and
'packedreal32u'
BUG FIXES
- the compression method 'LZ4_RA.max' does not compress data
- 'add.gdsn()' fails if a factor variable has no level
- 'add.gdsn(, storage=index.gdsn())' accepts the additional parameters from
'index.gdsn()', e.g., 'offset' and 'scale' for packedreal8
gdsfmt 1.18.1
BUG FIXES
- the node name should not contain a slash '/'
gdsfmt 1.18.0
NEW FEATURES
- new options 'recursive' and 'include.dirs' in 'ls.gdsn()': the listing
recurses into child nodes
UTILITIES
- replace BiocInstaller biocLite mentions with BiocManager
- 'digest.gdsn()' fails if the digest package is not installed
- SIMD optimization in 2-bit array decoding with a logical vector of
selection (3x speedup when there are lots of zeros)
BUG FIXES
- bug fixed: 'put.attr.gdsn()' fails to update the existing attribute
gdsfmt 1.16.0
UTILITIES
- a new storage name 'single' in 'add.gdsn()' for single-precision
floating numbers
- improve the efficiency of bit2-unpacking when there are lots of zeros
- 'system.gds()' outputs 'POPCNT' flag if available
- enable the compression modes "LZMA.ultra", "LZMA.ultra_max",
"LZMA_RA.ultra" and "LZMA_RA.ultra_max"
- show more compression information in 'system.gds()'
BUG FIXES
- avoid the integer overflow when the compression rate is too small using
LZMA_RA (e.g., < 0.01%)
gdsfmt 1.14.0-1.14.1
UTILITIES
- tweak error messages in 'apply.gdsn()'
- 'cleanup.gds()' allows a file name with a prefix '~' which will be
automatically replaced by the home directory
gdsfmt 1.12.0
UTILITIES
- update liblzma to v5.2.3
- update lz4 to v1.7.5
- a new citation
gdsfmt 1.10.1
NEW FEATURES
- new data types (variable-length encoding of signed and unsigned integers)
gdsfmt 1.10.0
- the version number was bumped for the Bioconductor release version 3.4
gdsfmt 1.8.0-1.8.4
- the version number was bumped for the Bioconductor release version 3.3
UTILITIES
- define C MACRO 'COREARRAY_ATTR_PACKED' and 'COREARRAY_SIMD_ATTR_ALIGN'
in CoreDEF.h
- SIMD optimization in 1-bit and 2-bit array encoding/decoding (e.g.,
decode, RAW output: +20% for 2-bit, +50% for 1-bit)
gdsfmt 1.7.18-1.7.22
NEW FEATURES
- LZMA compression algorithm is available in the GDS system (LZMA, LZMA_RA)
- faster implementation of variable-length string: the default string
becomes string with the length stored in the file instead of
null-terminated string (new GDS data types: dStr8, dStr16 and dStr32)
UTILITIES
- improve the read speed of characters (+18%)
- significantly improve random access of characters
- correctly interpret factor variable in 'digest.gdsn()' when
'digest.gdsn(..., action="Robject")', since factors are not integers
gdsfmt 1.7.0-1.7.17
NEW FEATURES
- 'digest.gdsn()' to create hash function digests (e.g., md5, sha1,
sha256, sha384, sha512), requiring the package digest
- new function 'summarize.gdsn()'
- 'show()' displays the content preview
- define C MACRO 'COREARRAY_REGISTER_BIT32' and 'COREARRAY_REGISTER_BIT64'
in CoreDEF.h
- new C functions 'GDS_R_Append()' and 'GDS_R_AppendEx()' in R_GDS.h
- allows efficiently concatenating compressed blocks (i.e., ZIP_RA
and LZ4_RA)
- v1.7.12: add a new data type: packedreal24
- define C MACRO 'COREARRAY_SIMD_SSSE3' in CoreDEF.h
- v1.7.13: 'GDS_Array_ReadData()', 'GDS_Array_ReadDataEx()',
'GDS_Array_WriteData()' and 'GDS_Array_AppendData()' return 'void*'
instead of 'void' in R_GDS.h
- v1.7.15: 'GDS_Array_ReadData()' and 'GDS_Array_ReadDataEx()' allow
Start=NULL or Length=NULL
- v1.7.16: new C function 'GDS_Array_AppendStrLen()' in R_GDS.h
UTILITIES
- 'paste(..., sep="")' is replaced by 'paste0(...)' (requiring >=R_v2.15.0)
- improve random access for ZIP_RA and LZ4_RA, e.g., in the example of
vignette, +12% for ZIP_RA and 1.7-fold speedup in LZ4_RA
DEPRECATED AND DEFUNCT
- bit17, ..., bit23, bit25, ..., bit31, sbit17, ..., sbit23, sbit25, ...,
sbit31 are deprecated, and instead it is suggested to use data
compression
BUG FIXES
- v1.7.7: fix a potential issue of uninitialized value in the first
parameter passed to 'LZ4_decompress_safe_continue' (detected by valgrind)
- v1.7.14: fix an issue of 'seldim' in 'assign.gdsn()': 'seldim' should
allow NULL in a vector
gdsfmt 1.6.0-1.6.2
- the version number was bumped for the Bioconductor release version 3.2
- 'attribute.trim=FALSE' in 'print.gdsn.class()' by default
NEW FEATURES
- 'diagnosis.gds()' returns detailed data block information
BUG FIXES
- v1.6.2: work around a possibly rare bug (the program stopped when getting
Z_BUF_ERROR): the GDS kernel now ignores Z_BUF_ERROR in deflate()
and inflate(); see http://zlib.net/zlib_how.html for further explanation
gdsfmt 1.5.7-1.5.16
NEW FEATURES
- a new argument 'seldim' in 'assign.gdsn()'
- 'append.gdsn()' allows appending data from a GDS node
- 'add.gdsn()' allows 'storage' to be a 'gdsn.class' object
- 'put.attr.gdsn()' allows 'val' to be a 'gdsn.class' object
- 'readex.gdsn()' allows using numeric vectors to select the subset
- the values returned from 'read.gdsn()' and 'readex.gdsn()' can be
substituted by the values defined by users: '.value' and '.substitute'
are added to 'read.gdsn()' and 'readex.gdsn()'
- new functions 'copyto.gdsn()', 'getfolder.gdsn()' and 'permdim.gdsn()'
- a new option 'replace+rename' in 'moveto.gdsn()'
- the values passed to the user-defined function in 'apply.gdsn()' and
'clusterApply.gdsn()' can be substituted by the values defined by users:
'.value' and '.substitute' are added to these functions
- new C functions 'GDS_R_Obj_SEXP2SEXP()', 'GDS_Iter_RDataEx()',
'GDS_Iter_Position()', 'GDS_Parallel_TryLockMutex()',
'GDS_Parallel_InitCondition()', 'GDS_Parallel_DoneCondition()',
'GDS_Parallel_SignalCondition()', 'GDS_Parallel_BroadcastCondition()' and
'GDS_Parallel_WaitCondition()' in "include/R_GDS.h"
- new arguments 'attribute' and 'attribute.trim' in 'print.gds.class()'
and 'print.gdsn.class()'
- 'delete.attr.gdsn()' allows multiple variable names
- gdsfmt_1.5.16: the hidden flag is an internal flag alongside 'R.invisible';
'objdesp.gdsn()' returns the hidden flag
DEPRECATED AND DEFUNCT
- discontinue the support of SNPRelate (<= v0.9.*)
- 'GDS_R_NodeValid()' and 'GDS_R_NodeValid_SEXP()' are removed from
"include/R_GDS.h"
- 'GDS_R_SEXP2Obj()' in "include/R_GDS.h" has two arguments now
BUG FIXES
- fix an issue in 'print.gdsn.class()' if hidden nodes are included
- fix the issues #3 (https://github.com/zhengxwen/gdsfmt/issues/3)
and #11 (https://github.com/zhengxwen/gdsfmt/issues/11)
- fix a portability issue on Windows when calling a condition object
(dPlatform.h/dPlatform.cpp)
- fix a compiling issue on Solaris with suncc
- fix an issue of additional arguments in 'add.gdsn()',
https://github.com/zhengxwen/gdsfmt/issues/12
- fix an issue of uncompressing variable-length string,
https://github.com/zhengxwen/gdsfmt/issues/13
gdsfmt 1.5.0-1.5.6
NEW FEATURES
- LZ4 library is updated to r131
- a new argument 'include.hidden' is added to the functions 'cnt.gdsn()'
and 'ls.gdsn()'
- define 'GDS_MAX_NUM_DIMENSION' and 'GDS_R_READ_DEFAULT_MODE' in
"include/R_GDS.h"
- a new C function 'GDS_R_SEXP2FileRoot()' in "include/R_GDS.h"
- improve the terminal output via the package crayon
UTILITIES
- adjust for Xcode 6.3.1
- increase test coverage
BUG FIXES
- fix an issue in 'append.gdsn()',
https://github.com/zhengxwen/gdsfmt/issues/9
- fix an issue in 'assign.gdsn()',
https://github.com/zhengxwen/gdsfmt/issues/10
gdsfmt 1.4.0
- the version number was bumped for the Bioconductor release version 3.1
gdsfmt 1.3.0-1.3.11
NEW FEATURES
- a new argument 'visible' is added to the functions 'add.gdsn()',
'addfile.gdsn()' and 'addfolder.gdsn()'
- 'objdesp.gdsn()' returns 'encoder' to indicate the compression algorithm
- add a new function 'system.gds()' showing system configuration
- support efficient random access of zlib compressed data, which are
composed of independent compressed blocks (ZIP_RA)
- support LZ4 compression format (http://code.google.com/p/lz4/),
based on "lz4frame API" of r128
- allow R RAW data (interpreted as 8-bit signed integer) to replace
32-bit integer with 'read.gdsn()', 'readex.gdsn()', 'apply.gdsn()',
'clusterApply.gdsn()', 'write.gdsn()', 'append.gdsn()'
- a new argument 'target.node' is added to 'apply.gdsn()', which allows
appending data to a target GDS variable
- 'apply.gdsn()', 'clusterApply.gdsn()': the argument 'as.is' allows
"logical" and "raw"
- more argument checking in 'write.gdsn()'
- new components 'trait' and 'param' in the return value of 'objdesp.gdsn()'
- add new data types: packedreal8, packedreal16, packedreal32
- a new argument 'permute' in the function 'setdim.gdsn()'
SIGNIFICANT USER-VISIBLE CHANGES
- add an R Markdown vignette
BUG FIXES
- minor fixes
- fix an issue of variable-length string,
https://github.com/zhengxwen/gdsfmt/issues/7
- minor fixes, 'sync.gds()' synchronizes the GDS file
- fix a stream read error bug, https://github.com/zhengxwen/gdsfmt/issues/6
- fix an issue in 'add.gdsn()', https://github.com/zhengxwen/gdsfmt/issues/8
- fix the problem of 'setdim.gdsn()' with variable-length string
gdsfmt 1.1.0 (2014-09-11)
NEW FEATURES
- add a new function 'is.element.gdsn()'
- allow closing GDS files in 'showfile.gds()'
BUG FIXES
- fully support big-endian systems
- fix memory leaks in 'cleanup.gds()'
gdsfmt 1.0.4 (2014-04-16)
NEW FEATURES
- 'apply.gdsn()' and 'clusterApply.gdsn()' support characters
- add a new function 'moveto.gdsn()' to relocate GDS variables
- add a new function 'diagnosis.gds()' to diagnose the GDS file
SIGNIFICANT USER-VISIBLE CHANGES
- 'apply.gdsn()', 'clusterApply.gdsn()': make the returned value invisible
if 'as.is="none"'
- more options in 'read.gdsn()' and 'readex.gdsn()'
- more unit tests
BUG FIXES
- fix a bug in 'delete.gdsn()': allocated resource is not released
in the GDS file
gdsfmt 1.0.3 (2014-03-19)
NEW FEATURES
- add two new arguments 'allow.duplicate' and 'allow.fork' to the
function 'openfn.gds()'
- add a new function 'showfile.gds()'
- allow reading a GDS file simultaneously between multiple forked
processes (applied in 'mclapply()' etc)
- support the LinkingTo mechanism, via 'R_RegisterCCallable' and
'R_GetCCallable'
gdsfmt 1.0.2 (2014-01-24)
NEW FEATURES
- add 'size' and 'good' to the returned list from 'objdesp.gdsn()'
indicating the state of GDS node
- add a new function 'cache.gdsn()'
BUG FIXES
- fix the memory issues reported by valgrind
gdsfmt 1.0.1 (2014-01-07)
NEW FEATURES
- add a new argument 'replace' to the function 'addfile.gdsn()', which
allows replacing the existing variable by a new one
- add a new function 'addfolder.gdsn()' allowing a virtual folder linking
to other GDS files
- add 'message' to the returned list from 'objdesp.gdsn()', which allows
tracking error messages or log information
SIGNIFICANT USER-VISIBLE CHANGES
- remove the argument 'deep' from the function 'cleanup.gds()' to
simplify calling
- reduce the package size
BUG FIXES
- backward compatible with unknown or undefined classes in GDS system
gdsfmt 1.0.0 (2013-09-21)
NEW FEATURES
- support long vectors (>= R v3.0), allowing >4G (memory size)
vectors in R
- ~20x speedup in reading characters from a GDS file, compared to
the previous version
- add a new argument 'replace' to the function 'add.gdsn()', which allows
replacing the existing variable by a new one
- add a new argument 'simplify' to the functions 'read.gdsn()' and
'readex.gdsn()'
SIGNIFICANT USER-VISIBLE CHANGES
- speed improvement for other primitive data types
- a warning is given when a variable with missing characters is written
to a GDS variable
- replace all '.C()' by '.Call()' internally
- reduce the package size
BUG FIXES
- improve the function 'objdesp.gdsn()'
- fix a bug in 'delete.gdsn()'
gdsfmt 0.9.13-0.9.15
BUG FIXES
- fix an issue of memory leak when a compressor or decompressor is loaded
- fix an error in the CITATION file
- compiler issue fix: Solaris 10
- use 'inherits' to check the inheritance of object instead of 'class() =='
gdsfmt 0.9.12 (2013-02-19)
NEW FEATURES
- support variable-length string (e.g., VStr8)
SIGNIFICANT USER-VISIBLE CHANGES
- add an argument 'path' to the function 'index.gdsn()', which uses
'/' as a separator
- support a faster defragmentation algorithm in 'cleanup.gds()'
- 'character' in the function 'add.gdsn()' refers to variable-length
string by default
- fixed-length strings are "fstring", "fstring16" and "fstring32" in
the function 'add.gdsn()'
- variable-length strings are 'string', 'string16' and 'string32' in
the function 'add.gdsn()'
- support the 'R.invisible' attribute to hide a GDS node, until
adding 'all=TRUE' to 'print.gds.class()' or 'print.gdsn.class()'
- improve the display of hierarchical structure
- the argument "storage" in the function 'add.gdsn()' is not
case-sensitive now
BUG FIXES
- minor bug fix in 'readex.gdsn()' when
gdsfmt 0.9.11 (2013-01-26)
NEW FEATURES
- 'put.attr.gdsn()' allows a vector with more than one element
SIGNIFICANT USER-VISIBLE CHANGES
- it is more efficient to store a factor variable
- 'apply.gdsn()' is re-written in C/C++
- the function 'applylt.gdsn()' is merged into 'apply.gdsn()'
- the function 'clusterApplylt.gdsn()' is merged into 'clusterApply.gdsn()'
- improve 'clusterApply.gdsn()'
- S3 class name 'gdsclass' is replaced by 'gds.class'
- S3 class name 'gdsn' is replaced by 'gdsn.class'
DEPRECATED AND DEFUNCT
- deprecate 'applylt.gdsn()' and 'clusterApplylt.gdsn()'
BUG FIXES
- bug fix: add a folder using 'add.gdsn()'
gdsfmt 0.9.1-0.9.10
NEW FEATURES
- add two functions with the support of the parallel package (R 2.14.0):
'clusterApply.gdsn()', 'clusterApplylt.gdsn()'
SIGNIFICANT USER-VISIBLE CHANGES
- change 'wstring' to 'string16' in 'add.gdsn()'
- change 'dwstring' to 'string32' in 'add.gdsn()'
- add RUnit tests
- support GCC4.7 compiler
BUG FIXES
- fix warnings
- fix a bug: correct the dimension size of array data with more
than two dimensions
- fix bugs: 'append.gdsn()' appends data of bit9, bit10, etc., correctly
- fix a minor bug of compression stream
gdsfmt 0.9.0