Fix GitHub organization typo in URLs of the DESCRIPTION file.
GitHub Actions CI fixes:
Add back the "remotes" package that was dropped from the
bioconductor/bioconductor_docker:devel container dependencies.
Tentatively fix the test coverage sometimes hanging on
covr::package_coverage() by disabling parallel unit tests. The
rationale for this change was finding similar reports on covr GitHub
issues and the Posit community forum. The covr package itself uses
codecov for its GitHub Actions and, therefore, does not serve as a useful
reference for CI code. This fault could not be reproduced with a local
rootless container.
Require a minimum version of DFplyr in the DESCRIPTION file; that version contains two bugfixes needed to build the vignette.
This release was accepted to Bioconductor on 2026-02-25: https://github.com/Bioconductor/Contributions/issues/3986#issuecomment-3960925596
Check upstream API version once a week and warn about changes.
Added Bioconductor release installation instructions to the vignette and README.md file.
Fix transposed "DR and" typo in the vignette: https://github.com/Bioconductor/Contributions/issues/3986#issuecomment-3480980762
Fix incorrect R version in CI and simplify GitHub Actions. The previous GitHub Actions were based on conflicting r-lib/actions and the unmaintained seandavi/BiocActions. Following Marcel's suggestion on Zulip, use the better maintained Bioconductor/BuildABiocWorkshop GitHub Actions instead.
Support trunk R 4.6.0. This required explicitly specifying the drop
argument of the [ accessor of S4Vectors::DataFrame objects; otherwise one
runs into the cryptic error "unimplemented type 'list' in 'orderVector1'".
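A minimal sketch of the explicit-drop pattern, assuming the S4Vectors package is installed. The column names and values are made up for illustration, and which drop value the package passes at each call site is an assumption:

```r
library(S4Vectors)

# Illustrative data only; not taken from the package or the ToppGene API.
df <- DataFrame(
    Symbol = c("TP53", "BRCA1", "EGFR"),
    PValue = c(0.20, 0.01, 0.05)
)

# drop = TRUE: a single selected column comes back as a plain vector,
# which order() can sort.
pvalues <- df[, "PValue", drop = TRUE]

# drop = FALSE: the result stays a one-column DataFrame.
pvalue_df <- df[, "PValue", drop = FALSE]

# Order rows by the extracted vector rather than by a DataFrame.
sorted <- df[order(pvalues), , drop = FALSE]
```

Stating drop explicitly makes the return type independent of how many columns happen to be selected.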
Documented missing return value of CategoriesDataFrame as warned by
BiocCheck.
Added lookup_pubchem() function to standardize various drug database
identifiers to PubChem CIDs. Queries from lookup_pubchem() are performed
using PubChem's Power User Gateway (PUG) API. While similar lookup
functionality exists in other Bioconductor packages, the available packages
have limitations that make them unsuitable for the newness and scale of the
ToppGene API:
Even recent versions of metaboliteIDmapping have missing identifiers, even
for older entries, when converting to PubChem CIDs.
ChemmineR does not support vectorized lookups to PubChem CIDs and times
out when performing multiple queries.
Added max_tries argument with a default value of 3 to all functions that
perform web queries in keeping with Appendix C.2 of the Bioconductor
Guidelines for Package Development and Maintenance.
lookup() and enrich() now emit warnings on encountering empty results
instead of throwing errors. This is more reasonable, because enrich() can
return no results if the default PValue or the requested PValue is too
stringent. Similarly, lookup() may yield no results for some queries and
having an empty DataFrame makes it easier to bind result rows of multiple
queries.
enrich() can now be called with any combination of functional enrichment
categories. Category thresholds can also be fully adjusted instead of being
fixed at their default values. These modifications are passed to enrich()
using a CategoriesDataFrame object.
Added CategoryDataFrame() S4 class to customize enrich(..., categories)
queries. Below are the default values and allowed value ranges or choices:
Replaced httr with httr2. The httr package home page recommends
against using it because it has been superseded by httr2.
Updated GitHub workflow with more CRAN and Bioconductor checks.
Sped up unit testing by splitting tests that make web API queries
into separate unit test files, because testthat parallelizes
per ./tests/testthat/test-*.R file.
Implemented ToppGene API using default categories and category options:
components.schemas.EnrichmentRequest.properties.Categories.items.properties.Type.*.default.
Support for non-default category options is a planned feature.
Simplified API into 2 function calls, lookup() and enrich().
Both functions return DataFrame objects.
enrich() converts nested Genes lists into CharacterList (Symbol) and
IntegerList (Entrez) columns.
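As a rough sketch of that conversion, assuming the S4Vectors and IRanges packages are installed. The input layout, the helper gene_field(), and the annotation names are hypothetical; only the "Genes", "Symbol", and "Entrez" names come from the entry above:

```r
library(S4Vectors)
library(IRanges)

# Hypothetical parsed JSON: one element per annotation, each holding a
# nested list of genes. The exact response layout is assumed.
annotations <- list(
    list(Name = "annotation A", Genes = list(
        list(Symbol = "TP53", Entrez = 7157L),
        list(Symbol = "BRCA1", Entrez = 672L)
    )),
    list(Name = "annotation B", Genes = list(
        list(Symbol = "EGFR", Entrez = 1956L)
    ))
)

# Hypothetical helper: pull one field out of every gene in an annotation.
gene_field <- function(annotation, field, template) {
    vapply(annotation$Genes, function(gene) gene[[field]], template)
}

# One row per annotation; the gene columns hold one vector per row.
result <- DataFrame(
    Name = vapply(annotations, function(a) a$Name, character(1))
)
result$Symbol <- CharacterList(
    lapply(annotations, gene_field, field = "Symbol", template = character(1))
)
result$Entrez <- IntegerList(
    lapply(annotations, gene_field, field = "Entrez", template = integer(1))
)
```

List-typed columns keep a variable number of genes per row without flattening the table.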
Dropped dependency on GSEABase because Bioconductor annotation mappings are
lossy compared to webserver annotation lookup API.
Replaced the overly complicated openapi package generated code with simpler
httr:
The openapi R package is only on GitHub
https://github.com/zhanghao-njmu/openapi and not on Bioconductor or
CRAN, so it would not have met Bioconductor's package submission
guidelines.
httr looked promising as a replacement for openapi after reading the
source code of a ThirdPartyClient biocView package, KEGGREST.
There are significant bugs in the R output of the OpenAPI code generator https://openapi-generator.tech/ even though the R interface is listed as stable. Found two code logic problems with the generated R code:
No support for nested JSON responses; specifically, the nested portion
appears as NULL. This was problematic for the "Genes" list returned by
each "Annotation" of enrich https://toppgene.cchmc.org/API/enrich.
This blocker issue prompted switching to the httr package.
There is a trivial bug in the OpenAPI R interface that treats
"array[integer]" as a character vector instead of an integer vector,
requiring one to remove unnecessary runtime assertions. All such JSON
return type conversions are handled by httr::content().
The OpenAPI generated code was ugly to work with:
JSON accessors are protected by back-quotes to support OpenAPI edge cases where the JSON accessors may contain special characters.
Error checking is greatly simplified by httr::http_error(), whereas a
lot of generated code is created to check specific error codes.
The generated code required a lot of code reformatting to meet Bioconductor documentation, linter, and code style guidelines.