NEWS

biomformat 1.39.17

BUG FIXES

Added explicit S4 dispatch method for biom_data() with signature c("biom", "character", "missing"). Previously, calling biom_data() with only a character rows= argument (e.g. biom_data(x, rows="GG_OTU_3")) matched the catch-all ("biom", "character", "ANY") method, which then forwarded the still-missing columns argument to the next dispatch layer, causing the error: "argument 'columns' is missing, with no default". The new method defaults columns to 1:ncol(x) and dispatches cleanly. Fixes the vignette build ERROR on Bioconductor (chunk named_subsetting, biomformat.Rmd lines 402-413). R CMD check: 0 ERRORs, 0 WARNINGs, 4 pre-existing NOTEs.
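
The dispatch pattern described in this entry can be reproduced in miniature. The sketch below uses a toy class `toy` and a generic `pick()`, both hypothetical and not part of biomformat, to show how a method registered for a "missing" argument can fill in a default and then re-dispatch to the general method.

```r
library(methods)

# Hypothetical toy class and generic, for illustration only;
# these are not biomformat's actual definitions.
setClass("toy", slots = c(m = "matrix"))

setGeneric("pick", function(x, rows, columns) standardGeneric("pick"))

# General method: rows and columns are both supplied as character vectors.
setMethod("pick", c("toy", "character", "character"),
  function(x, rows, columns) x@m[rows, columns, drop = FALSE])

# Explicit method for a missing columns argument: fill in every column
# name, then re-dispatch to the general method above.
setMethod("pick", c("toy", "character", "missing"),
  function(x, rows, columns) pick(x, rows, colnames(x@m)))

obj <- new("toy", m = matrix(1:6, nrow = 2,
  dimnames = list(c("r1", "r2"), c("a", "b", "c"))))

# Only rows is given; the "missing" method supplies the columns.
res <- pick(obj, rows = "r2")
```

Here `res` keeps its matrix shape (one row, three columns) because the default is resolved before any subsetting happens.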

biomformat 1.39.16

BUG FIXES

Updated Joseph N. Paulson's email address in Authors@R to [email protected]. R CMD check: 0 ERRORs, 0 NOTEs, 2 pre-existing vignette WARNINGs.

biomformat 1.39.15

BUG FIXES

Replaced deprecated Author:/Maintainer: DESCRIPTION fields with the modern Authors@R: format using person() with role=c("aut","cre") for the maintainer. This was flagged as an ERROR by BiocCheck >= 1.46 and was the root cause of the "invalid email" complaint from Bioconductor: the field itself was structurally invalid, not just the address string. Updated maintainer email to [email protected] (reachable address). Added .BiocCheck to .Rbuildignore. R CMD check: 0 ERRORs, 0 NOTEs, 2 pre-existing vignette WARNINGs. BiocCheck: 1 ERROR (Watched Tags on support site, which requires a manual web action), 3 WARNINGs (version parity, no BioC deps, missing \value), 14 NOTEs (style/cosmetic).

biomformat 1.39.14

DOCUMENTATION / VIGNETTE

Added two new vignette sections to vignettes/biomformat.Rmd:
* "Constructing a BIOM from R data": end-to-end make_biom() workflow showing how to build a biom object from a count matrix and a data.frame taxonomy table with list-valued columns (the dada2 pattern), write it with write_biom(), and read it back. Includes a callout directing large-dataset users to write_hdf5_biom() (addressing Issue #8), with a guarded HDF5 code chunk. Directly addresses Issues #4, #6, #9 for users who land on the vignette.
* "Subsetting biom_data() by name": demonstrates the character-vector row/column subsetting interface of biom_data() that was previously undocumented in the vignette.
R CMD check: 0 ERRORs, 0 NOTEs, 2 pre-existing vignette WARNINGs.

biomformat 1.39.13

DOCUMENTATION

write_biom(): added @details section documenting the 2^31-1 byte size limitation that arises because jsonlite serialises the entire BIOM object to a single R character string. Users with large datasets (thousands of samples or features) are now explicitly directed to write_hdf5_biom() which has no such size constraint. Closes GitHub Issue #8. R CMD check: 0 ERRORs, 0 NOTEs, 2 pre-existing vignette WARNINGs.

biomformat 1.39.12

NEW FEATURES / TESTS

Issue #7 regression test: added inst/extdata/zero_col_hdf5.biom, a minimal 3x3 HDF5 BIOM fixture where the middle sample ("ZeroSamp") has all-zero counts. The fixture is generated by the companion script inst/extdata/create_zero_col_hdf5.R using rhdf5 directly. Added two tests in test-hdf5-write.R:
* "zero-column HDF5 fixture: biom_data() returns correct 3x3 matrix": verifies that ZeroSamp is all zeros and the other two columns have the expected non-zero values. Validates that generate_matrix() (rewritten in v1.39.8 to use sparseMatrix directly from CCS triplets) correctly handles the case where indptr[j] == indptr[j+1].
* "zero-column HDF5 fixture: make_biom() + write_hdf5_biom() round-trip": verifies that write_hdf5_biom() -> read_biom() preserves the all-zero column.
Both tests are guarded with skip_if_not_installed("rhdf5"). R CMD check: 0 ERRORs, 0 NOTEs, 2 pre-existing vignette WARNINGs.

biomformat 1.39.11

BUG FIXES

make_biom(): fixed Issue #4, where a NULL id was serialised as {} (empty JSON object) by jsonlite. make_biom() now substitutes "No Table ID" when id is NULL, matching the BIOM spec and ensuring write_biom() -> read_biom() is a lossless round-trip. validObject() now succeeds on the round-tripped object.
make_biom(): fixed Issue #6. When observation_metadata (or sample_metadata) is a data.frame with list columns (e.g. a "taxonomy" column holding character vectors of rank assignments, as produced by dada2), the metadata was serialised as a bare JSON array ([[...]]) instead of a named JSON object ({"taxonomy":[...]}). The BIOM spec and downstream tools (phyloseq import_biom(), the Python biom library) require the named-object form. Root cause: as.matrix() on a list-column data.frame produces a list-matrix; as.list(row) then collapses field names. Fix: detect list-column data.frames and build per-row metadata directly as named lists, bypassing as.matrix(). Closes GitHub Issue #6. Also resolves user-support Issue #9 (dada2 -> biom -> write_biom workflow now works correctly). R CMD check: 0 ERRORs, 0 NOTEs, 2 pre-existing vignette WARNINGs.
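
The per-row approach named in the Issue #6 fix can be sketched in base R. This is a minimal illustration under assumed inputs, not the code in make_biom(); the table `tax`, its feature names, and its rank strings are made up.

```r
# Hypothetical taxonomy table with one list column; names and ranks
# are invented for this example.
tax <- data.frame(row.names = c("OTU_1", "OTU_2"))
tax$taxonomy <- list(
  c("k__Bacteria", "p__Firmicutes"),
  c("k__Bacteria", "p__Proteobacteria")
)

# Build one named list per row. Iterating over the columns with lapply()
# keeps each column name as the field name, so nothing is collapsed.
per_row <- lapply(seq_len(nrow(tax)), function(i) {
  lapply(tax, function(col) col[[i]])
})
names(per_row) <- rownames(tax)
```

Each element of `per_row` is a list with a `taxonomy` field, which is the shape that a JSON serialiser can write as a named object rather than a bare array.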

biomformat 1.39.10

NEW FEATURES

Updated vignette with four new sections:
* HDF5 (BIOM v2) read and write via write_hdf5_biom() / read_biom(), including JSON-to-HDF5 conversion.
* Tidy long-format output via as.data.frame() and as_tibble.biom(), with purrr-style summarisation examples (Shannon diversity, per-sample total counts) and base-R fallbacks.
* SummarizedExperiment interoperability via biom_to_SummarizedExperiment() and as(x, "SummarizedExperiment"), showing assay(), colData(), and rowData() access.
* Session info section.
All new vignette sections are guarded with requireNamespace() so the vignette builds cleanly without optional dependencies (rhdf5, tibble, purrr/dplyr, SummarizedExperiment/S4Vectors).

biomformat 1.39.9

NEW FEATURES

write_hdf5_biom(x, biom_file): new exported function that serialises a biom object to the BIOM v2 HDF5 format. Writes both the sample-major and observation-major compressed-sparse representations required by the spec, plus all sample and observation metadata. Requires rhdf5 (Bioconductor); a clear error is raised if it is absent. read_biom() -> write_hdf5_biom() -> read_biom() is a lossless round-trip for both count data and metadata.

BUG FIXES

Moved rhdf5 from Imports to Suggests. The package loads and all JSON BIOM functionality works without rhdf5 installed; HDF5 read/write simply stops with an informative message when rhdf5 is absent.
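
A guard for an optional (Suggests) dependency can look like the sketch below. The helper name `need_pkg()` and its message text are hypothetical; the package's own wording and structure may differ.

```r
# Hypothetical helper: stop with an informative message when an optional
# package is not installed, otherwise return TRUE invisibly.
need_pkg <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is required for this operation; ",
         "please install it and try again.", call. = FALSE)
  }
  invisible(TRUE)
}

# 'stats' ships with R, so this passes silently.
ok <- need_pkg("stats")

# A deliberately non-existent package name triggers the error path.
err <- tryCatch(need_pkg("notARealPackage12345"), error = function(e) e)
```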

biomformat 1.39.8

BUG FIXES / PERFORMANCE

generate_matrix(): rewrote HDF5/BIOM-v2 matrix reconstruction to build a sparse Matrix directly from the CCS (indptr/indices/data) triplets stored in the HDF5 file, instead of first constructing a dense base::matrix via sapply() and then converting. For large datasets this avoids an O(n_obs * n_samples) allocation. The return value (list of named vectors, one per observation) is unchanged so all downstream code is unaffected. Also handles the edge case of an all-zero matrix (length(data) == 0) explicitly.
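
To illustrate the idea, a sparse matrix can be built straight from compressed-column arrays with Matrix::sparseMatrix(). The arrays below are made up, and the indices are assumed to be zero-based; this is a sketch of the technique, not the body of generate_matrix().

```r
library(Matrix)

# Made-up compressed-column arrays for a 3 x 3 table whose middle column
# is empty (indptr[2] == indptr[3] in R's one-based vector positions).
indptr  <- c(0L, 2L, 2L, 3L)   # start offset of each column, plus the end
indices <- c(0L, 2L, 1L)       # zero-based row index of each stored value
data    <- c(5, 7, 9)          # the stored non-zero values

# p gives the column pointers, i the row indices; index1 = FALSE tells
# sparseMatrix() that i is zero-based. No dense matrix is allocated.
m <- sparseMatrix(i = indices, p = indptr, x = data,
                  dims = c(3L, 3L), index1 = FALSE)
```

The empty middle column needs no special handling here, because equal consecutive pointers simply mean that the column stores no values.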

biomformat 1.39.7

NEW FEATURES

as.data.frame.biom(): new S3 method that converts a biom object to a long-format (tidy) data.frame with one row per (feature, sample) pair. Columns: feature_id, sample_id, count, plus any sample and observation metadata columns appended via left-join. Pure base R, no tidyverse dependency.
as_tibble.biom(): thin wrapper around as.data.frame.biom() that returns a tibble. Requires the 'tibble' package (Suggests only). Call via tibble::as_tibble(x) or as_tibble.biom(x) directly.
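
The long-format layout described above can be sketched in base R as follows. The count matrix and the `site` metadata column are invented for the example, and the code is an illustration of the reshaping idea rather than the method's actual implementation.

```r
# Made-up 2 x 2 count matrix with feature and sample names.
counts <- matrix(c(5, 0, 2, 7), nrow = 2,
                 dimnames = list(c("OTU_1", "OTU_2"), c("S1", "S2")))

# One row per (feature, sample) pair. as.vector() walks the matrix
# column by column, so feature ids cycle fastest.
long <- data.frame(
  feature_id = rep(rownames(counts), times = ncol(counts)),
  sample_id  = rep(colnames(counts), each = nrow(counts)),
  count      = as.vector(counts)
)

# Hypothetical sample metadata, appended with a left join.
sample_meta <- data.frame(sample_id = c("S1", "S2"),
                          site = c("gut", "skin"))
long <- merge(long, sample_meta, by = "sample_id", all.x = TRUE, sort = FALSE)
```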

DEPENDENCY CHANGES

Added tibble to Suggests (optional; only needed for as_tibble.biom()).

biomformat 1.39.6

NEW FEATURES

biom_to_SummarizedExperiment(): new exported function that converts a biom object into a SummarizedExperiment, placing the count/value matrix in assay("counts"), sample metadata in colData(), and feature metadata in rowData(). Both colData and rowData are S4Vectors::DataFrame objects. When a biom object carries no metadata (the accessor returns NULL), an empty DataFrame with correct row/col names is used, ensuring the SE is always valid. No hard dependency is introduced: SummarizedExperiment and S4Vectors are listed in Suggests only.
as(x, "SummarizedExperiment"): S4 coercion method registered at load time when SummarizedExperiment is available, delegating to biom_to_SummarizedExperiment().

DEPENDENCY CHANGES

Added SummarizedExperiment and S4Vectors to Suggests.

TESTS

New tests/testthat/test-SE.R with 6 tests covering: return class, assay content, colData content, rowData content, S4 coercion, NULL metadata, and SE dimension/dimname correctness. All tests are skipped gracefully when SummarizedExperiment is not installed.

biomformat 1.39.5

DEPENDENCY CHANGES

Removed plyr (>= 1.8) from Imports entirely. plyr is unmaintained and every use in this codebase now has a direct base-R equivalent. This eliminates an unmaintained dependency and reduces install footprint.
Bumped R dependency from >= 3.2 to >= 4.1, ensuring modern base-R idioms (including the native pipe |>) are available.
Replaced import(Matrix) (whole-namespace) with selective importFrom(Matrix, Matrix, sparseMatrix, drop0) in NAMESPACE. Replaced import(methods) with selective importFrom(methods, ...). Follows CRAN/Bioconductor best practices; prevents namespace pollution.

USER-VISIBLE CHANGES

biom_data(), sample_metadata(), observation_metadata(): the parallel= argument is now a no-op with a deprecation warning when passed as TRUE. The plyr-backed parallel execution it previously enabled no longer exists. Existing code that passes parallel=FALSE (the default) is unaffected.

INTERNAL CHANGES

make_biom(): replaced plyr::alply() with lapply(seq_len(nrow(...))) for building per-row named metadata lists.
biom_data() dense path: replaced plyr::laply() with do.call(rbind, lapply(...)).
biom_data() sparse numeric path: replaced plyr::ldply(x$data) with do.call(rbind, lapply(x$data, as.data.frame)).
biom_data() sparse unicode path: replaced plyr::ldply(x$data, function...) with do.call(rbind, lapply(x$data, function(e) ...)).
extract_metadata(): replaced plyr::llply()/plyr::ldply() with lapply() / do.call(rbind, lapply(...)).
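
The do.call(rbind, lapply(...)) idiom used in these replacements can be shown on a small example. The triplets below are made up to resemble (row, column, value) entries; the column names in the result are illustrative only.

```r
# Made-up sparse entries: each element is (row index, column index, value).
triplets <- list(c(0, 0, 5), c(1, 2, 9), c(2, 1, 4))

# Turn each element into a one-row data.frame, then bind all rows together.
tab <- do.call(rbind, lapply(triplets, function(e) {
  data.frame(row = e[1], col = e[2], value = e[3])
}))
```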

biomformat 1.39.4

BUG FIXES

biom_data(): fixed data-corruption bug on both dense and sparse BIOM paths where subsetting to a single row or single column silently collapsed the result into a dimensionless named vector, discarding dim(), rownames(), and colnames(). This caused downstream tools (notably phyloseq::import_biom()) to fail silently or produce incorrect OTU tables. Fix: on the sparse path, added drop = FALSE to the matrix subsetting call (m[rows, columns, drop = FALSE]). On the dense path, the laply() result is now immediately reshaped with matrix(m, nrow, ncol) before Matrix() coercion, ensuring a 2-D object is always returned regardless of dimension lengths. Closes GitHub PR #12 (https://github.com/joey711/biomformat/pull/12): "Fix biom_data() when dealing with 1-taxon and 1-sample BIOM data". Supersedes GitHub PR #11 (https://github.com/joey711/biomformat/pull/11): "Fix unidentical biom output by make_biom()".
biom_data(): simplified the post-subsetting naming block. Both paths now always produce a 2-D object, so rownames() and colnames() are applied unconditionally on all code paths (the previous is.null(dim(m)) branch is no longer needed).
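
The collapse that drop = FALSE prevents is standard base-R matrix behaviour and is easy to reproduce. The matrix below is a made-up stand-in for a small OTU table.

```r
# Made-up 2 x 3 table with feature and sample names.
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("OTU_1", "OTU_2"), c("S1", "S2", "S3")))

# Default subsetting drops the dimension of length one: the result is a
# named vector with no dim() and no rownames().
collapsed <- m["OTU_1", ]

# With drop = FALSE the result stays a 1 x 3 matrix with both dimnames.
kept <- m["OTU_1", , drop = FALSE]
```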

TESTS

Added regression tests in tests/testthat/test-IO.R covering all new behaviour introduced across v1.39.2-v1.39.4:
* sparse single-row subset retains dim() and dimnames (PR #12)
* sparse single-col subset retains dim() and dimnames (PR #12)
* sparse single-cell (1x1) subset retains dim()
* full sparse matrix unaffected by drop = FALSE fix
* read_biom() routes HDF5 fixture cleanly without jsonlite warning (Issue #14, PR #16)
* read_biom() correctly classifies all JSON and HDF5 extdata fixtures

biomformat 1.39.3

BUG FIXES

read_biom(): replaced the fragile JSON-first / HDF5-fallback try() chain with a deterministic magic-bytes router. The function now reads the first 4 bytes of the file; if the HDF5 signature (\x89 H D F) is detected it routes exclusively to read_hdf5_biom() and never invokes jsonlite. JSON files route exclusively to jsonlite. This eliminates the confusing "lexical error: invalid char in json text ... 89HDF" warning that users encountered when HDF5 files were accidentally passed through the JSON parser first. Closes GitHub Issue #14 (https://github.com/joey711/biomformat/issues/14): "Unable to read HDF5 biom file". Supersedes GitHub PR #16 (https://github.com/joey711/biomformat/pull/16): "Improve handling of HDF5 BIOM files". Partially addresses GitHub Issue #5 (https://github.com/joey711/biomformat/issues/5) and Issue #3 (https://github.com/joey711/biomformat/issues/3): fatal C-level aborts when reading large or malformed HDF5 files are now caught by the new tryCatch() wrapper in read_hdf5_biom() and re-emitted as informative R-level warnings instead of crashing the session.
read_hdf5_biom(): added requireNamespace("rhdf5", quietly = TRUE) guard. If the rhdf5 package is not installed, or if the underlying HDF5 system libraries are absent (common on stripped BBS nodes or end-user machines without libhdf5), the function now emits a clear R-level warning identifying the missing dependency and returns NULL invisibly, instead of producing a fatal C-level abort.
read_hdf5_biom(): wrapped the h5read() call in tryCatch() so that any C-level or system-library error is caught and re-emitted as an informative R warning, keeping the R session alive.
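
A magic-bytes check of the kind described for read_biom() can be sketched with readBin(). The helper name `looks_like_hdf5()` is hypothetical and the two files are throwaway fixtures written for the example; the package's internal router may be structured differently.

```r
# The first four bytes of the HDF5 format signature: 0x89 'H' 'D' 'F'.
hdf5_magic <- as.raw(c(0x89, 0x48, 0x44, 0x46))

# Hypothetical helper, not the package's internal function: read at most
# four bytes in binary mode and compare them with the signature.
looks_like_hdf5 <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  identical(readBin(con, what = "raw", n = 4L), hdf5_magic)
}

# Two throwaway files: one starting with the full 8-byte HDF5 signature,
# one holding a line of JSON text.
h5_file   <- tempfile()
json_file <- tempfile()
writeBin(c(hdf5_magic, as.raw(c(0x0d, 0x0a, 0x1a, 0x0a))), h5_file)
writeLines('{"id": "No Table ID"}', json_file)

is_h5   <- looks_like_hdf5(h5_file)
is_json <- !looks_like_hdf5(json_file)
```

Because the decision is made from the file contents before any parser runs, the JSON parser never sees binary input.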

DEPENDENCY CHANGES

Bumped Matrix dependency from (>= 1.2) to (>= 1.7-0) in DESCRIPTION. This pins the package against the post-SuiteSparse ABI break and prevents runtime crashes caused by binary incompatibilities in upstream sparse-matrix libraries on BBS nodes running R 4.4+.

biomformat 1.39.2

BUG FIXES

Fixed fatal test ERROR under testthat >= 3.0.0: replaced all deprecated expect_that(x, is_true()), expect_that(x, is_identical_to(y)), and expect_that(x, is_a("cls")) calls in tests/testthat/test-IO.R with their modern equivalents (expect_true(), expect_identical(), expect_is(), expect_true(is(x, "cls"))). The removed helpers caused the entire test suite to ERROR on current Bioconductor BBS nodes, which was the primary trigger for Bioconductor's package deprecation warning. Addresses GitHub Issue #17 (https://github.com/joey711/biomformat/issues/17): "Bioconductor failure and risk of deprecation".
Added missing importFrom(stats, setNames) and importFrom(utils, packageVersion) directives to NAMESPACE, resolving "no visible global function definition" NOTEs from R CMD check.

biomformat 0.3.13

USER-VISIBLE CHANGES

Added make_biom function. Creates biom object from standard R data table(s).

biomformat 0.3.12

USER-VISIBLE CHANGES

No user-visible changes. All changes are for future compatibility.

BUG FIXES

Unit test changes to work with upcoming R release and new testthat version.
This solves Issue 4: https://github.com/joey711/biom/issues/4

biomformat 0.3.11

USER-VISIBLE CHANGES

No user-visible changes. All changes are for future CRAN compatibility.

BUG FIXES

Clarified license and project in the README.md
Added TODO.html, README.html, and TODO.md to .Rbuildignore (requested by CRAN)
Moved 'biom-demo.Rmd' to 'vignettes/'
Updated 'inst/NEWS' (this) file to official format
Removed pre-built vignette HTML so that it is re-built during package build. This updates things like the build-date in the vignette, but also ensures that the user sees in the vignette the results of code that just worked with their copy of the package.

biomformat 0.3.10

USER-VISIBLE CHANGES

These changes should not affect any package behavior.
Some of the top-level documentation has been changed to reflect new development location on GitHub.

BUG FIXES

Minor fixes for CRAN compatibility
This addresses Issue 1: https://github.com/joey711/biom/issues/1

biomformat 0.3.9

SIGNIFICANT USER-VISIBLE CHANGES

Speed improvement for sparse matrices.

NEW FEATURES

The 'biom_data' parsing function now uses a vectorized (matrix-indexed) assignment while parsing sparse matrices.
Unofficial benchmarks suggest a speedup of a few hundred-fold.

biomformat 0.3.8

SIGNIFICANT USER-VISIBLE CHANGES

First version released on CRAN.