--- title: "XeniumIO: Import 10X Genomics Xenium Analyzer Data" author: "Marcel Ramos" date: "`r format(Sys.time(), '%B %d, %Y')`" vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{VisiumIO Quick Start Guide} %\VignetteEncoding{UTF-8} output: BiocStyle::html_document: number_sections: no toc: yes toc_depth: 4 package: XeniumIO --- ```{r,include=FALSE} knitr::opts_chunk$set(cache = TRUE) ``` # Introduction The `XeniumIO` package provides functions to import 10X Genomics Xenium Analyzer data into R. The package is designed to work with the output of the Xenium Analyzer, which is a software tool that processes Visium spatial gene expression data. The package provides functions to import the output of the Xenium Analyzer into R, and to create a `TENxXenium` object that can be used with other Bioconductor packages. # Supported Formats ## TENxIO The 10X suite of packages support multiple file formats. The following table lists the supported file formats and the corresponding classes that are imported into R. | **Extension** | **Class** | **Imported as** | |---------------------|---------------|----------------------| | .h5 | TENxH5 | SingleCellExperiment w/ TENxMatrix | | .mtx / .mtx.gz | TENxMTX | SummarizedExperiment w/ dgCMatrix | | .tar.gz | TENxFileList | SingleCellExperiment w/ dgCMatrix | | peak_annotation.tsv | TENxPeaks | GRanges | | fragments.tsv.gz | TENxFragments | RaggedExperiment | | .tsv / .tsv.gz | TENxTSV | tibble | ## VisiumIO | **Extension** | **Class** | **Imported as** | |---------------------|---------------|----------------------| | spatial.tar.gz | TENxSpatialList | DataFrame list * | | .parquet | TENxSpatialParquet | tibble * | ## XeniumIO | **Extension** | **Class** | **Imported as** | |---------------------|---------------|----------------------| | .zarr.zip | TENxZarr | (TBD) | # GitHub Installation ```{r,eval=FALSE} BiocManager::install("Bioconductor/XeniumIO") ``` # Loading package ```{r,include=TRUE,results="hide",message=FALSE,warning=FALSE} library(XeniumIO) ``` # XeniumIO The `TENxXenium` class has a `metadata` slot for the `experiment.xenium` file. The `resources` slot is a `TENxFileList` or `TENxH5` object containing the cell feature matrix. The `coordNames` slot is a vector specifying the names of the columns in the spatial data containing the spatial coordinates. The `sampleId` slot is a scalar specifying the sample identifier. ```{r, eval=FALSE} TENxXenium( resources = "path/to/matrix/folder/or/file", xeniumOut = "path/to/xeniumOut/folder", sample_id = "sample01", format = c("mtx", "h5"), boundaries_format = c("parquet", "csv.gz"), spatialCoordsNames = c("x_centroid", "y_centroid"), ... ) ``` The `format` argument specifies the format of the `resources` object, either "mtx" or "h5". The `boundaries_format` allows the user to choose whether to read in the data using the `parquet` or `csv.gz` format. # Example Folder Structure Note that the `xeniumOut` unzipped folder must contain the following files: ``` *outs ├── cell_feature_matrix.h5 ├── cell_feature_matrix.tar.gz | ├── barcodes.tsv* | ├── features.tsv* | └── matrix.mtx* ├── cell_feature_matrix.zarr.zip ├── experiment.xenium ├── cells.csv.gz ├── cells.parquet ├── cells.zarr.zip [...] ``` Note that currently the `zarr` format is not supported as the infrastructure is currently under development. ## Xenium class The `resources` slot should either be the `TENxFileList` from the `mtx` format or a `TENxH5` instance from an `h5` file. The boundaries can either be a `TENxSpatialParquet` instance or a `TENxSpatialCSV`. These classes are automatically instantiated by the constructor function. ```{r} showClass("TENxXenium") ``` ## `import` method The `import` method for a `TENxXenium` instance returns a `SpatialExperiment` class object. Dispatch is only done on the `con` argument. See `?BiocIO::import` for details on the generic. The `import` function call is meant to be a simple call without much input. For more details in the package, see `?TENxXenium`. ```{r} getMethod("import", c(con = "TENxXenium")) ``` # Importing an Example Xenium Dataset The following code snippet demonstrates how to import a Xenium Analyzer output into R. The `TENxXenium` object is created by specifying the path to the `xeniumOut` folder. The `TENxXenium` object is then imported into R using the `import` method for the `TENxXenium` class. First, we cache the ~12 MB file to avoid downloading it multiple times (via `r Biocpkg("BiocFileCache")`). ```{r} zipfile <- paste0( "https://mghp.osn.xsede.org/bir190004-bucket01/BiocXenDemo/", "Xenium_Prime_MultiCellSeg_Mouse_Ileum_tiny_outs.zip" ) destfile <- XeniumIO:::.cache_url_file(zipfile) ``` We then create an output folder for the contents of the zipped file. We use the same name as the zip file but without the extension (with `tools::file_path_sans_ext`). ```{r} outfold <- file.path( tempdir(), tools::file_path_sans_ext(basename(zipfile)) ) if (!dir.exists(outfold)) dir.create(outfold, recursive = TRUE) ``` We now unzip the file and use the `outfold` as the `exdir` argument to `unzip`. The `outfold` variable and folder will be used as the `xeniumOut` argument in the `TENxXenium` constructor. Note that we use the `ref = "Gene Expression"` argument in the `import` method to pass down to the internal `splitAltExps` function call. This will set the `mainExpName` in the `SpatialExperiment` object. ```{r} unzip( zipfile = destfile, exdir = outfold, overwrite = FALSE ) TENxXenium(xeniumOut = outfold) |> import(ref = "Gene Expression") ``` Note that you may also use the `swapAltExp` function to set a `mainExpName` in the `SpatialExperiment` but this is not recommended. The operation returns a `SingleCellExperiment` which has to be coerced back into a `SpatialExperiment`. The coercion also loses some metadata information particularly the `spatialCoords`. ```{r} TENxXenium(xeniumOut = outfold) |> import() |> swapAltExp(name = "Gene Expression") |> as("SpatialExperiment") ``` The dataset was obtained from the 10X Genomics website under the [X0A v3.0 section](https://www.10xgenomics.com/support/software/xenium-onboard-analysis/latest/resources/xenium-example-data#test-data-v3-0) and is a subset of the Xenium Prime 5K Mouse Pan Tissue & Pathways Panel. The link to the data can be seen as the `url` input above and shown below for completeness. # Session Info ```{r} sessionInfo() ```