Changes in version 1.49.1 NEW FEATURES o new argument 'sort' in `cleanup.gds()`: if TRUE (the default), rearrange data blocks when defragmenting -- all node header blocks are placed first in depth-first traversal order with folder headers prioritized before other node headers, followed by data blocks sorted by size (small to large) then by ID. This layout clusters the GDS tree structure metadata at the front of the file, maximizing page cache hit rate during tree navigation; 'sort=FALSE' for backward compatibility o `diagnosis.gds()` now appends a trailing '/' to folder node names in the stream info output, making it easier to distinguish folders from leaf nodes Changes in version 1.48.1 NEW FEATURES o support building on Windows ARM64 (aarch64): `src/Makevars.win` now falls back to building `liblzma.a` from the bundled xz-5.2.9 sources when no prebuilt static library matches `R_ARCH`; x86_64 and i386 builds continue to use the prebuilt archive o add `CdCallbackStream` class in CoreArray for stream access via external callback function pointers o add `GDS_File_Open_Callback()` C-level API allowing external packages to open GDS files through custom stream backends (e.g., cloud storage) o add `GDS_R_MakeFileObj()` C-level API to build a complete `gds.class` R object from a `PdGDSFile` pointer, with correct file index, external pointer (with GC finalizer), names, and class attributes; for use by external packages (e.g., gdscloud) that open GDS files via `GDS_File_Open_Callback()` o `openfn.gds()` now detects cloud URL schemes (s3://, gs://, az://) and delegates to a registered handler if available (e.g., from the gdscloud package) o add internal functions `.gds_register_cloud_handler()` and `.gds_get_cloud_handler()` for external packages to register custom URL scheme handlers Changes in version 1.46.0 UTILITIES o fix a strange bug in `append.gdsn(, val=character())` in MacOS when compiled by Apple clang version 17 Changes in version 1.44.1 UTILITIES o faster appending bit array data when N is one of 1, ..., and 16 Changes in version 1.44.0 UTILITIES o fix the C++ error with Apple clang version 17: no 'std::basic_string' & 'std::basic_string' Changes in version 1.42.2 UTILITIES o tweak the C Macro to support Linux MUSL o fix according to the C compiler C23 Changes in version 1.40.2 UTILITIES o add "#define STRICT_R_HEADERS 1" to the C++ header file according to R_r86984 o update the C codes according to '_R_USE_STRICT_R_HEADERS_=true' & '_R_CXX_USE_NO_REMAP_=true' Changes in version 1.38.1 UTILITIES o fix the compiler warning: -Wformat-security Changes in version 1.36.1 UTILITIES o `gdsfmt:::.reopen()` allows forking Changes in version 1.36.0 UTILITIES o fix the compiler warning: sprintf is deprecated o LZ4 updated to v1.9.4 o XZ updated to v5.2.9 o update zlib to v1.2.13 NEW FEATURES o `system.gds()$compiler.flag[1]` is either "64-bit" or "32-bit" indicating the number of bits of internal data pointer o new argument 'use.abspath=TRUE' in `openfn.gds()` and `createfn.gds()`: the behavior before v1.35.4 is the same as 'use.abspath=TRUE' Changes in version 1.34.1 UTILITIES o avoid using `crayon::blurred()` in the display (RStudio blurs the screen output) Changes in version 1.34.0 UTILITIES o update the web links o update zlib to v1.2.12 Changes in version 1.32.0 UTILITIES o optimize using the utilities of the Matrix package for sparse matrices Changes in version 1.28.1 UTILITIES o update according to gcc-11 Changes in version 1.28.0 NEW FEATURES o new function `exist.gdsn()` o new function `is.sparse.gdsn()` UTILITIES o LZ4 updated to v1.9.3 from v1.9.2 o XZ is updated to v5.2.5 from v5.2.4 o `apply.gdsn()`: work around with factor variables if less-than-32-bit integers are stored o a new component 'is.sparse' in `objdesp.gdsn()` o `options(gds.verbose=TRUE)` to show additional information Changes in version 1.26.1 UTILITIES o comply with the R devel (> v4.0.3) to work with factor variables in `apply.gdsn()` Changes in version 1.24.1 UTILITIES o 'show' option in `print.gds.class()` for array preview Changes in version 1.24.0 NEW FEATURES o new option 'allow.error' in `openfn.gds()` for data recovery o new option 'log.only' in `diagnosis.gds()` o new C functions `GDS_Node_Unload()` and `GDS_Node_Load()` in R_GDS.h o new sparse array data types in GDS (SparseReal32, SparseReal64, SparseInt8, SparseInt16, SparseInt32, SparseInt64, SparseUInt8, SparseUInt16, SparseUInt32, SparseUInt64) o the opened gds file will be closed when the object is garbage collected UTILITIES o zlib updated to v1.2.11 from v1.2.8 o xz updated to v5.2.4 from v5.2.3 o LZ4 updated to v1.9.2 from v1.7.5 o `showfile.gds()` does not return the object of 'gds.class' o new 'nmax' and 'depth' in `print.gdsn.class()` and `print.gds.class()` Changes in version 1.22.0 NEW FEATURES o a new function `unload.gdsn()` to unload a GDS node from memory UTILITIES o add '#pragma GCC optimize("O3")' to some of C++ files when GCC is used o add the compiler information in `system.gds()` o change the file name "vignettes/gdsfmt_vignette.Rmd" to "vignettes/gdsfmt.Rmd", so `vignette("gdsfmt")` can work directly BUG FIXES o avoid the segfault if the data type is not registered internally o use O_CLOEXEC (the close-on-exec flag) when open and create files to avoid potentially leaking file descriptors in forked processes Changes in version 1.20.0 UTILITIES o optimize the C implementation of 'packedreal8' using a look-up table NEW FEATURES o new data types 'packedreal8u', 'packedreal16u', 'packedreal24u' and 'packedreal32u' BUG FIXES o the compression method 'LZ4_RA.max' does not compress data o `add.gdsn()` fails if a factor variable has no level o `add.gdsn(, storage=index.gdsn())` accepts the additional parameters from `index.gdsn()`, e.g., 'offset' and 'scale' for packedreal8 Changes in version 1.18.1 BUG FIXES o the node name should not contain a slash '/' Changes in version 1.18.0 NEW FEATURES o new options 'recursive' and 'include.dirs' in `ls.gdsn()`: the listing recurses into child nodes UTILITIES o replace BiocInstaller biocLite mentions with BiocManager o `digest.gdsn()` fails if the digest package is not installed o SIMD optimization in 2-bit array decoding with a logical vector of selection (3x speedup when there are lots of zeros) BUG FIXES o bug fixed: `put.attr.gdsn()` fails to update the existing attribute Changes in version 1.16.0 UTILITIES o a new storage name 'single' in `add.gdsn()` for single-precision floating numbers o improve the efficiency of bit2-unpacking when there are lots of zero o `system.gds()` outputs 'POPCNT' flag if available o enable the compression modes "LZMA.ultra", "LZMA.ultra_max", "LZMA_RA.ultra" and "LZMA_RA.ultra_max" o show more compression information in `system.gds()` BUG FIXES o avoid the integer overflow when the compression rate is too small using LZMA_RA (e.g., < 0.01%) Changes in version 1.14.0-1.14.1 UTILITIES o tweak error messages in `apply.gdsn()` o `cleanup.gds()` allows a file name with a prefix '~' which will be automatically replaced by the home directory Changes in version 1.12.0 UTILITIES o update liblzma to v5.2.3 o update lz4 to v1.7.5 o a new citation Changes in version 1.10.1 NEW FEATURES o new data types (variable-length encoding of signed and unsigned integers) Changes in version 1.10.0 o the version number was bumped for the Bioconductor release version 3.4 Changes in version 1.8.0-1.8.4 o the version number was bumped for the Bioconductor release version 3.3 UTILITIES o define C MACRO 'COREARRAY_ATTR_PACKED' and 'COREARRAY_SIMD_ATTR_ALIGN' in CoreDEF.h o SIMD optimization in 1-bit and 2-bit array encoding/decoding (e.g., decode, RAW output: +20% for 2-bit, +50% for 1-bit) Changes in version 1.7.18-1.7.22 NEW FEATURES o LZMA compression algorithm is available in the GDS system (LZMA, LZMA_RA) o faster implementation of variable-length string: the default string becomes string with the length stored in the file instead of null-terminated string (new GDS data types: dStr8, dStr16 and dStr32) UTILITIES o improve the read speed of characters (+18%) o significantly improve random access of characters o correctly interpret factor variable in `digest.gdsn()` when `digest.gdsn(..., action="Robject")`, since factors are not integers Changes in version 1.7.0-1.7.17 NEW FEATURES o `digest.gdsn()` to create hash function digests (e.g., md5, sha1, sha256, sha384, sha512), requiring the package digest o new function `summarize.gdsn()` o `show()` displays the content preview o define C MACRO 'COREARRAY_REGISTER_BIT32' and 'COREARRAY_REGISTER_BIT64' in CoreDEF.h o new C functions `GDS_R_Append()` and `GDS_R_AppendEx()` in R_GDS.h o allows efficiently concatenating compressed blocks (i.e., ZIP_RA and LZ4_RA) o v1.7.12: add a new data type: packedreal24 o define C MACRO 'COREARRAY_SIMD_SSSE3' in CoreDEF.h o v1.7.13: `GDS_Array_ReadData()`, `GDS_Array_ReadDataEx()`, `GDS_Array_WriteData()` and `GDS_Array_AppendData()` return `void*` instead of `void` in R_GDS.h o v1.7.15: `GDS_Array_ReadData()` and `GDS_Array_ReadDataEx()` allow Start=NULL or Length=NULL o v1.7.16: new C function `GDS_Array_AppendStrLen()` in R_GDS.h UTILITIES o `paste(..., sep="")` is replaced by `paste0(...)` (requiring >=R_v2.15.0) o improve random access for ZIP_RA and LZ4_RA, e.g., in the example of vignette, +12% for ZIP_RA and 1.7-fold speedup in LZ4_RA DEPRECATED AND DEFUNCT o bit17, ..., bit23, bit25, ..., bit31, sbit17, ..., sbit23, sbit25, ..., sbit31 are deprecated, and instead it is suggested to use data compression BUG FIXES o v1.7.7: fix a potential issue of uninitialized value in the first parameter passed to 'LZ4_decompress_safe_continue' (detected by valgrind) o v1.7.14: fix an issue of 'seldim' in `assign.gdsn()`: 'seldim' should allow NULL in a vector Changes in version 1.6.0-1.6.2 o the version number was bumped for the Bioconductor release version 3.2 o 'attribute.trim=FALSE' in `print.gdsn.class()` by default NEW FEATURES o `diagnosis.gds()` returns detailed data block information BUG FIXES o v1.6.2: it might be a rare bug (i.e., stop the program when getting Z_BUF_ERROR), now the GDS kernel ignores Z_BUF_ERROR in deflate() and inflate(); see http://zlib.net/zlib_how.html for further explanation Changes in version 1.5.7-1.5.16 NEW FEATURES o a new argument 'seldim' in `assign.gdsn()` o `append.gdsn()` allows appending data from a GDS node o `add.gdsn()` allows 'storage' to be a 'gdsn.class' object o `put.attr.gdsn()` allows 'val' to be a 'gdsn.class' object o `readex.gdsn()` allows using numeric vectors to select the subset o the values returned from `read.gdsn()` and `readex.gdsn()` can be substituted by the values defined by users: '.value' and '.substitute' are added to `read.gdsn()` and `readex.gdsn()` o new functions `copyto.gdsn()`, `getfolder.gdsn()` and `permdim.gdsn()` o a new option 'replace+rename' in `moveto.gdsn()` o the values passed to the user-defined function in `apply.gdsn()` and `clusterApply.gdsn()` can be substituted by the values defined by users: '.value' and '.substitute' are added to these functions o new C functions `GDS_R_Obj_SEXP2SEXP()`, `GDS_Iter_RDataEx()`, `GDS_Iter_Position()`, `GDS_Parallel_TryLockMutex()`, `GDS_Parallel_InitCondition()`, `GDS_Parallel_DoneCondition()`, `GDS_Parallel_SignalCondition()`, `GDS_Parallel_BroadcastCondition()` and `GDS_Parallel_WaitCondition()` in "include/R_GDS.h" o new arguments 'attribute' and 'attribute.trim' in `print.gds.class()` and `print.gdsn.class()` o `delete.attr.gdsn()` allows multiple variable names o gdsfmt_1.5.16: hidden flag is an internal flag along with 'R.invisible', `objdesp.gdsn()` returns the hidden flag DEPRECATED AND DEFUNCT o discontinue the support of SNPRelate (<= v0.9.*) o `GDS_R_NodeValid()` and `GDS_R_NodeValid_SEXP()` are removed from "include/R_GDS.h" o `GDS_R_SEXP2Obj()` in "include/R_GDS.h" has two arguments now BUG FIXES o fix an issue in `print.gdsn.class()` if hidden nodes are included o fix the issues #3 (https://github.com/zhengxwen/gdsfmt/issues/3) and #11 (https://github.com/zhengxwen/gdsfmt/issues/11) o fix a portable issue on Windows when calling a condition object (dPlatform.h/dPlatform.cpp) o fix a compiling issue on Solaris with suncc o fix an issue of additional arguments in `add.gdsn()`, https://github.com/zhengxwen/gdsfmt/issues/12 o fix an issue of uncompressing variable-length string, https://github.com/zhengxwen/gdsfmt/issues/13 Changes in version 1.5.0-1.5.6 NEW FEATURES o LZ4 library is updated to r131 o a new argument 'include.hidden' is added to the functions `cnt.gdsn()` and `ls.gdsn()` o define `GDS_MAX_NUM_DIMENSION` and `GDS_R_READ_DEFAULT_MODE` in "include/R_GDS.h" o a new C function `GDS_R_SEXP2FileRoot()` in "include/R_GDS.h" o improve the terminal output via the package crayon UTILITIES o adjust for Xcode 6.3.1 o increase test coverage BUG FIXES o fix an issue in `append.gdsn()`, https://github.com/zhengxwen/gdsfmt/issues/9 o fix an issue in `assign.gdsn()`, https://github.com/zhengxwen/gdsfmt/issues/10 Changes in version 1.4.0 o the version number was bumped for the Bioconductor release version 3.1 Changes in version 1.3.0-1.3.11 NEW FEATURES o a new argument 'visible' is added to the functions `add.gdsn()`, `addfile.gdsn()` and `addfolder.gdsn()` o `objdesp.gdsn()` returns 'encoder' to indicate the compression algorithm o add a new function `system.gds()` showing system configuration o support efficient random access of zlib compressed data, which are composed of independent compressed blocks (ZIP_RA) o support LZ4 compression format (http://code.google.com/p/lz4/), based on "lz4frame API" of r128 o allow R RAW data (interpreted as 8-bit signed integer) to replace 32-bit integer with `read.gdsn()`, `readex.gdsn()`, `apply.gdsn()`, `clusterApply.gdsn()`, `write.gdsn()`, `append.gdsn()` o a new argument 'target.node' is added to `apply.gdsn()`, which allows appending data to a target GDS variable o `apply.gdsn()`, `clusterApply.gdsn()`: the argument 'as.is' allows "logical" and "raw" o more argument checking in `write.gdsn()` o new components 'trait' and 'param' in the return value of `objdesp.gdsn()` o add new data types: packedreal8, packedreal16, packedreal32 o a new argument 'permute' in the function `setdim.gdsn()` SIGNIFICANT USER-VISIBLE CHANGES o add a R Markdown vignette BUG FIXES o minor fixes o fix an issue of variable-length string, https://github.com/zhengxwen/gdsfmt/issues/7 o minor fixes, `sync.gds()` synchronizes the GDS file o fix a stream read error bug, https://github.com/zhengxwen/gdsfmt/issues/6 o fix an issue in `add.gdsn()`, https://github.com/zhengxwen/gdsfmt/issues/8 o fix the problem of `setdim.gdsn()` with variable-length string Changes in version 1.1.0 (2014-09-11) NEW FEATURES o add a new function `is.element.gdsn()` o allow closing GDS files in `showfile.gds()` BUG FIXES o fully support big-endian systems o fix memory leaks in `cleanup.gds()` Changes in version 1.0.4 (2014-04-16) NEW FEATURES o `apply.gdsn()` and `clusterApply.gdsn()` support characters o add a new function `moveto.gds()` to relocate GDS variables o add a new function `diagnosis.gds()` to diagnose the GDS file SIGNIFICANT USER-VISIBLE CHANGES o `apply.gdsn()`, `clusterApply.gdsn()`: make the returned value invisible if 'as.is="none"' o more options in `read.gdsn()` and `readex.gdsn()` o more Unit tests BUG FIXES o fix a bug in `delete.gdsn()`: allocated resource is not released in the GDS file Changes in version 1.0.3 (2014-03-19) NEW FEATURES o add two new arguments 'allow.duplicate' and 'allow.fork' to the function `openfn.gds()` o add a new function 'showfile.gds' o allow reading a GDS file simultaneously between multiple forked processes (applied in `mclapply()` etc) o support the LinkingTo mechanism, via 'R_RegisterCCallable' and 'R_GetCCallable' Changes in version 1.0.2 (2014-01-24) NEW FEATURES o add 'size' and 'good' to the returned list from `objdesp.gdsn()` indicating the state of GDS node o add a new function 'cache.gdsn' BUG FIXES o fix the memory issues reported by valgrind Changes in version 1.0.1 (2014-01-07) NEW FEATURES o add a new argument 'replace' to the function `addfile.gdsn()`, which allows replacing the existing variable by a new one o add a new function `addfolder.gdsn()` allowing a virtual folder linking to other GDS files o add 'message' to the returned list from `objdesp.gdsn()`, which allows tracking error messages or log information SIGNIFICANT USER-VISIBLE CHANGES o remove the argument 'deep' from the function `cleanup.gds()` to simplify calling o reduce the package size BUG FIXES o backward compatible with unknown or undefined classes in GDS system Changes in version 1.0.0 (2013-09-21) NEW FEATURES o support long vectors (>= R v3.0), allowing >4G (memory size) vectors in R o ~20x speedup in reading characters from a GDS file, compared to the previous version o add a new argument 'replace' to the function `add.gdsn()`, which allows replacing the existing variable by a new one o add a new argument 'simplify' to the functions `read.gdsn()` and `readex.gdsn()` SIGNIFICANT USER-VISIBLE CHANGES o speed improvement for other primitive data types o a warning is given, when a variable with missing characters is written to a GDS variable o replace all '.C()' by '.Call()' internally o reduce the package size BUG FIXES o improve the function `objdesp.gdsn()` o fix a bug in `delete.gdsn()` Changes in version 0.9.13-0.9.15 BUG FIXES o fix an issue of memory leak when a compressor or decompressor is loaded o fix an error in the CITATION file o compiler issue fix: Solaris 10 o use 'inherits' to check the inheritance of object install 'class() ==' Changes in version 0.9.12 (2013-02-19) NEW FEATURES o support variable-length string (e.g., VStr8) SIGNIFICANT USER-VISIBLE CHANGES o add an argument 'path' to the function `index.gdsn()`, which uses '/' as a separator o support a faster defragmentation algorithm in `cleanup.gds()` o 'character' in the function `add.gdsn()` refers to variable-length string by default o fixed-length strings are "fstring", "fstring16" and "fstring32" in the function `add.gdsn()` o variable-length string are 'string', 'string16' and 'string32' in the function `add.gdsn()` o support the 'R.invisible' attribute to hide a GDS node, until adding 'all=TRUE' to `print.gds.class()` or `print.gdsn.class()` o improve the display of hierarchical structure o the argument "storage" in the function `add.gdsn()` is not case-sensitive now BUG FIXES o minor bug fix in `readex.gdsn()` when Changes in version 0.9.11 (2013-01-26) NEW FEATURES o `put.attr.gdsn()` allows a vector with more than one elements SIGNIFICANT USER-VISIBLE CHANGES o it is more efficient to store a factor variable o `apply.gdsn()` is re-written in C/C++ o the function `applylt.gdsn()` is merged into `apply.gdsn()` o the function `clusterApplylt.gdsn() is merged into `clusterApply.gdsn()` o improve `clusterApply.gdsn()` o S3 class name 'gdsclass' is replaced by 'gds.class' o S3 class name 'gdsn' is replaced by 'gdsn.class' DEPRECATED AND DEFUNCT o deprecate `applylt.gdsn()` and `clusterApplylt.gdsn()` BUG FIXES o bug fix: add a folder using `add.gdsn()` Changes in version 0.9.1-0.9.10 NEW FEATURES o add two functions with the support of the parallel package (R 2.14.0): `clusterApply.gdsn()`, `clusterApplylt.gdsn()` SIGNIFICANT USER-VISIBLE CHANGES o change 'wstring' to 'string16' in `add.gdsn()` o change 'dwstring' to 'string32' in `add.gdsn()` o add RUnit tests o support GCC4.7 compiler BUG FIXES o fix warnings o fix a bug: correct the dimension size of array data with more than two dimensions o fix bugs: 'append.gdsn' appends data of bit9, bit10, etc, correctly o fix a minor bug of compression stream Changes in version 0.9.0 o first release of gdsfmt