Title: | Discover and Access Single Cell Data Sets in the CELLxGENE Data Portal |
---|---|
Description: | The cellxgene data portal (https://cellxgene.cziscience.com/) provides a graphical user interface to collections of single-cell sequence data processed in standard ways to 'count matrix' summaries. The cellxgenedp package provides an alternative, R-based interface, allowing data discovery, viewing, and downloading. |
Authors: | Martin Morgan [aut, cre] , Kayla Interdonato [aut] |
Maintainer: | Martin Morgan <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.11.0 |
Built: | 2024-10-30 04:36:18 UTC |
Source: | https://github.com/bioc/cellxgenedp |
files_download() retrieves one or more cellxgene files to a cache on the local system.
links(), authors(), and publisher_metadata() are helper functions to extract 'nested' information from collections.
collections(cellxgene_db = db())
datasets(cellxgene_db = db())
datasets_visualize(tbl)
files(cellxgene_db = db())
files_download(tbl, dry.run = TRUE, cache.path = .cellxgene_cache_path())
links(cellxgene_db = db())
authors(cellxgene_db = db())
publisher_metadata(cellxgene_db = db())
cellxgene_db |
an optional 'cellxgene_db' object, as returned
by |
tbl |
a |
dry.run |
logical(1) indicating whether the (often large)
file(s) in |
cache.path |
character(1) directory in which to cache
downloaded files. The directory must already exist. The default
is |
Each function returns a tibble describing the corresponding component of the database.
files_download() returns a character() vector of paths to the local files.
links() returns a tibble of external links associated with each collection. Common links include DOI, raw data / data sources, and lab websites.
authors() returns a tibble of authors associated with each collection.
publisher_metadata() returns a tibble of publisher metadata (journal, publication date, DOI) associated with each collection.
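The tibbles returned by datasets() and files() can be combined before calling files_download(). The following is a minimal sketch, not the package's prescribed workflow. It assumes that both tibbles share a 'dataset_id' column, that files() has a 'filetype' column with the value "H5AD", and that a directory named "cellxgene-cache" under tempdir() (a hypothetical name chosen for illustration) is an acceptable cache location. Check the column names with dplyr::glimpse() before relying on them.

```r
library(cellxgenedp)
library(dplyr)

db <- db()

## Assumption: datasets() and files() can be joined on 'dataset_id',
## and 'filetype' distinguishes file formats such as "H5AD".
first_file <-
    datasets(db) |>
    slice(1) |>
    select(dataset_id) |>
    left_join(files(db), by = "dataset_id") |>
    filter(filetype == "H5AD") |>
    slice(1)

## cache.path must already exist, so create the (hypothetical)
## directory before using it.
cache_dir <- file.path(tempdir(), "cellxgene-cache")
dir.create(cache_dir, showWarnings = FALSE)

## dry.run = TRUE (the default) does not download the file.
files_download(first_file, dry.run = TRUE, cache.path = cache_dir)
```

Switching to dry.run = FALSE would perform the actual, often large, download into cache_dir.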
db <- db()
collections(db)

collections(db) |>
    dplyr::glimpse()

datasets(db) |>
    dplyr::glimpse()

if (interactive()) {
    ## visualize the first dataset
    datasets(db) |>
        dplyr::slice(1) |>
        datasets_visualize()
}

files(db) |>
    dplyr::glimpse()

## Not run:
files(db) |>
    dplyr::slice(1) |>
    files_download(dry.run = FALSE)
## End(Not run)

## common links to external data
links(db) |>
    dplyr::count(link_type)

## authors per collection
authors() |>
    dplyr::count(collection_id, sort = TRUE)

publisher_metadata() |>
    dplyr::glimpse()
Shiny application for discovering, viewing, and downloading cellxgene data
cxg(as = c("tibble", "sce"))
as |
character(1) Return value when quitting the shiny
application. |
cxg() returns either a tibble describing datasets selected in the shiny application, or a list of datasets imported into R as SingleCellExperiment objects.
if (interactive()) cxg()
Retrieve updated cellxgene database metadata
db(overwrite = .db_online() && .db_first())
overwrite |
logical(1) indicating whether the database of
collections should be updated from the internet (the default,
when internet is available and, in an interactive session, the
user requests the update), or read from disk (assuming previous
successful access to the internet). |
The database is retrieved from the cellxgene data portal web site. 'collections' metadata are retrieved on each call; metadata on each collection is cached locally for re-use.
db() returns an object of class 'cellxgene_db', summarizing available collections, datasets, and files.
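The overwrite argument can be set explicitly rather than left to its default. The sketch below assumes, following the argument description, that overwrite = FALSE reads metadata from disk after an earlier successful call in which the database was retrieved. It is illustrative only; behaviour when no local copy exists is not covered here.

```r
library(cellxgenedp)

## First call: may retrieve metadata from the data portal.
db <- db()

## Later call: ask for the copy already on disk instead of updating.
db_cached <- db(overwrite = FALSE)

## Both objects are expected to be of class 'cellxgene_db'.
class(db_cached)
```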
db()
FACETS is a character vector of common fields used to subset cellxgene data.
facets() is used to query the cellxgene database for current values of one or all facets.
facets_filter() provides a convenient way to filter facets based on label or ontology term.
FACETS
facets(cellxgene_db = db(), facets = FACETS)
facets_filter(facet, key = c("label", "ontology_term_id"), value, exact = TRUE)
cellxgene_db |
an (optional) cellxgene_db object, as returned
by |
facets |
a character() vector corresponding to one of the
facets in |
facet |
the column containing faceted information, e.g., |
key |
character(1) identifying whether |
value |
character() value of the label or ontology term to
filter on. The value may be a vector with |
exact |
logical(1) whether values match exactly (default,
|
FACETS is an object of class character of length 8.
facets() returns a tibble with columns facet, label, ontology_term_id, and n, the number of times the facet label is used in the database.
facets_filter() returns a logical vector with length equal to the length (number of rows) of facet, with TRUE indicating that the value of key is present in the dataset.
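Filtering can also use key = "ontology_term_id" instead of "label". The sketch below looks up the term from facets() rather than hard-coding an identifier, so it relies only on the columns described above (label and ontology_term_id). It assumes the label "female" is present for the "sex" facet, as in the package's own examples.

```r
library(cellxgenedp)
library(dplyr)

db <- db()

## Look up the ontology term(s) the database uses for the label "female".
female_term <-
    facets(db, "sex") |>
    filter(label == "female") |>
    pull(ontology_term_id)

## Filter datasets on the ontology term instead of the label.
female_by_term <-
    datasets(db) |>
    filter(facets_filter(sex, "ontology_term_id", female_term))

female_by_term |>
    glimpse()
```

Looking the term up at run time keeps the filter aligned with whatever identifiers the current database release uses.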
f <- facets()

## levels of each facet
f |>
    dplyr::count(facet)

## same as facets(, facets = "organism")
f |>
    dplyr::filter(facet == "organism")

db <- db()
ds <- datasets(db)

## datasets with African American females
ds |>
    dplyr::filter(
        facets_filter(self_reported_ethnicity, "label", "African American"),
        facets_filter(sex, "label", "female")
    )

## datasets with non-European, known ethnicity
facets(db, "self_reported_ethnicity")
ds |>
    dplyr::filter(
        !facets_filter(
            self_reported_ethnicity, "label",
            c("European", "na", "unknown")
        )
    )