The VisiumIO
package provides a set of functions to
import 10X Genomics Visium experiment data into a
SpatialExperiment
object. The package makes use of the
SpatialExperiment
data structure, which provides a set of
classes and methods to handle spatially resolved transcriptomics
data.
Extension | Class | Imported as |
---|---|---|
.h5 | TENxH5 | SingleCellExperiment w/ TENxMatrix |
.mtx / .mtx.gz | TENxMTX | SummarizedExperiment w/ dgCMatrix |
.tar.gz | TENxFileList | SingleCellExperiment w/ dgCMatrix |
peak_annotation.tsv | TENxPeaks | GRanges |
fragments.tsv.gz | TENxFragments | RaggedExperiment |
.tsv / .tsv.gz | TENxTSV | tibble |
Extension | Class | Imported as |
---|---|---|
spatial.tar.gz | TENxSpatialList | DataFrame list * |
.parquet | TENxSpatialParquet | tibble * |
Note. (*) Intermediate format
The TENxVisium
class is used to import a
single sample of 10X Visium data. The
TENxVisium
constructor function takes the following
arguments:
TENxVisium(
resources = "path/to/10x/visium/file.tar.gz",
spatialResource = "path/to/10x/visium/spatial/file.spatial.tar.gz",
spacerangerOut = "path/to/10x/visium/sample/folder",
sample_id = "sample01",
images = c("lowres", "hires", "detected", "aligned"),
jsonFile = "scalefactors_json.json",
tissuePattern = "tissue_positions.*\\.csv",
spatialCoordsNames = c("pxl_col_in_fullres", "pxl_row_in_fullres")
)
The resource
argument is the path to the 10X Visium
file. The spatialResource
argument is the path to the 10X
Visium spatial file. It usually ends in spatial.tar.gz
.
Note that we use the images = "lowres"
and
processing = "raw"
arguments based on the name of the
tissue_*_image.png
file and
*_feature_bc_matrix
folder in the spaceranger
output. The directory structure for a single sample is
shown below:
section1
└── outs
├── spatial
│ ├── tissue_lowres_image.png
│ └── tissue_positions_list.csv
└── raw_feature_bc_matrix
├── barcodes.tsv
├── features.tsv
└── matrix.mtx
Using the example data in SpatialExperiment
, we can load
the section1
sample using TENxVisium
.
sample_dir <- system.file(
file.path("extdata", "10xVisium", "section1"),
package = "SpatialExperiment"
)
vis <- TENxVisium(
spacerangerOut = sample_dir, processing = "raw", images = "lowres"
)
vis
## An object of class "TENxVisium"
## Slot "resources":
## TENxFileList of length 3
## names(3): barcodes.tsv features.tsv matrix.mtx
##
## Slot "spatialList":
## TENxSpatialList of length 3
## names(3): scalefactors_json.json tissue_lowres_image.png tissue_positions_list.csv
##
## Slot "coordNames":
## [1] "pxl_col_in_fullres" "pxl_row_in_fullres"
##
## Slot "sampleId":
## [1] "sample01"
The show method of the TENxVisium
class displays the
object’s metadata.
The TEnxVisium
object can be imported into a
SpatialExperiment
object using the import
function.
## class: SpatialExperiment
## dim: 50 50
## metadata(2): resources spatialList
## assays(1): counts
## rownames(50): ENSMUSG00000051951 ENSMUSG00000089699 ...
## ENSMUSG00000005886 ENSMUSG00000101476
## rowData names(3): ID Symbol Type
## colnames(50): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
## AAAGTCGACCCTCAGT-1 AAAGTGCCATCAATTA-1
## colData names(4): in_tissue array_row array_col sample_id
## reducedDimNames(0):
## mainExpName: Gene Expression
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor
The TENxVisiumList
class is used to import multiple
samples of 10X Visium. The interface is a bit more simple in that you
only need to provide the space ranger
output folder as
input to the function.
TENxVisiumList(
sampleFolders = "path/to/10x/visium/sample/folder",
sample_ids = c("sample01", "sample02"),
...
)
The sampleFolders
argument is a character vector of
paths to the spaceranger
output folder. Note that each
folder must contain an outs
directory. The
sample_ids
argument is a character vector of sample
ids.
The directory structure for multiple samples
(section1
and section2
) is shown below:
section1
└── outs
| ├── spatial
| └── raw_feature_bc_matrix
section2
└── outs
├── spatial
└── raw_feature_bc_matrix
The main inputs to TENxVisiumList
are the
sampleFolders
and sample_ids
. These correspond
to the spaceranger
output sample folders and a vector of
sample identifiers, respectively.
sample_dirs <- list.dirs(
system.file(
file.path("extdata", "10xVisium"), package = "VisiumIO"
),
recursive = FALSE, full.names = TRUE
)
vlist <- TENxVisiumList(
sampleFolders = sample_dirs,
sample_ids = basename(sample_dirs),
processing = "raw",
images = "lowres"
)
vlist
## An object of class "TENxVisiumList"
## Slot "VisiumList":
## List of length 2
The import
method combines both
SingleCellExperiment
objects along with the spatial
information into a single SpatialExperiment
object. The
number of columns in the SpatialExperiment object is equal to the number
of cells across both samples (section1
and
section2
).
## class: SpatialExperiment
## dim: 50 99
## metadata(4): resources spatialList resources spatialList
## assays(1): counts
## rownames(50): ENSMUSG00000051951 ENSMUSG00000089699 ...
## ENSMUSG00000005886 ENSMUSG00000101476
## rowData names(3): ID Symbol Type
## colnames(99): AAACAACGAATAGTTC-1 AAACAAGTATCTCCCA-1 ...
## AAAGTCGACCCTCAGT-1 AAAGTGCCATCAATTA-1
## colData names(4): in_tissue array_row array_col sample_id
## reducedDimNames(0):
## mainExpName: Gene Expression
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor
The directory structure for a single bin size is shown below.
Visium_HD
└── binned_outputs
└─── square_002um
│ └── filtered_feature_bc_matrix
│ │ └── barcodes.tsv.gz
│ │ └── features.tsv.gz
│ │ └── matrix.mtx.gz
│ └── filtered_feature_bc_matrix.h5
│ └── raw_feature_bc_matrix/
│ └── raw_feature_bc_matrix.h5
│ └── spatial
│ └── [ ... ]
│ └── tissue_positions.parquet
└── square_*
TENxVisiumHD(
spacerangerOut = "./Visium_HD/",
sample_id = "sample01",
processing = c("filtered", "raw"),
images = c("lowres", "hires", "detected", "aligned_fiducials"),
bin_size = c("002", "008", "016"),
jsonFile = .SCALE_JSON_FILE,
tissuePattern = "tissue_positions\\.parquet",
spatialCoordsNames = c("pxl_col_in_fullres", "pxl_row_in_fullres"),
...
)
By default, the MatrixMarket format is read in
(format = "mtx"
).
visfold <- system.file(
package = "VisiumIO", "extdata", mustWork = TRUE
)
TENxVisiumHD(
spacerangerOut = visfold, images = "lowres", bin_size = "002"
) |> import()
## class: SpatialExperiment
## dim: 10 10
## metadata(2): resources spatialList
## assays(1): counts
## rownames(10): ENSMUSG00000051951 ENSMUSG00000025900 ...
## ENSMUSG00000033774 ENSMUSG00000025907
## rowData names(3): ID Symbol Type
## colnames(10): s_002um_02448_01644-1 s_002um_00700_02130-1 ...
## s_002um_01016_02194-1 s_002um_00775_02414-1
## colData names(6): barcode in_tissue ... bin_size sample_id
## reducedDimNames(0):
## mainExpName: Gene Expression
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor
H5 files are supported via the format = "h5"
argument
input.
TENxVisiumHD(
spacerangerOut = visfold, images = "lowres", bin_size = "002",
format = "h5"
) |> import()
## class: SpatialExperiment
## dim: 10 10
## metadata(2): resources spatialList
## assays(1): counts
## rownames(10): ENSMUSG00000051951 ENSMUSG00000025900 ...
## ENSMUSG00000033774 ENSMUSG00000025907
## rowData names(3): ID Symbol Type
## colnames(10): s_002um_02448_01644-1 s_002um_00700_02130-1 ...
## s_002um_01016_02194-1 s_002um_00775_02414-1
## colData names(6): barcode in_tissue ... bin_size sample_id
## reducedDimNames(0):
## mainExpName: Gene Expression
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] VisiumIO_1.3.1 TENxIO_1.8.0
## [3] SingleCellExperiment_1.28.0 SummarizedExperiment_1.36.0
## [5] Biobase_2.67.0 GenomicRanges_1.59.0
## [7] GenomeInfoDb_1.43.0 IRanges_2.41.0
## [9] S4Vectors_0.44.0 BiocGenerics_0.53.1
## [11] generics_0.1.3 MatrixGenerics_1.19.0
## [13] matrixStats_1.4.1 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] rjson_0.2.23 xfun_0.48 bslib_0.8.0
## [4] rhdf5_2.50.0 lattice_0.22-6 tzdb_0.4.0
## [7] rhdf5filters_1.18.0 vctrs_0.6.5 tools_4.4.1
## [10] parallel_4.4.1 tibble_3.2.1 fansi_1.0.6
## [13] R.oo_1.26.0 pkgconfig_2.0.3 BiocBaseUtils_1.9.0
## [16] Matrix_1.7-1 assertthat_0.2.1 lifecycle_1.0.4
## [19] GenomeInfoDbData_1.2.13 compiler_4.4.1 codetools_0.2-20
## [22] htmltools_0.5.8.1 sys_3.4.3 buildtools_1.0.0
## [25] sass_0.4.9 yaml_2.3.10 pillar_1.9.0
## [28] crayon_1.5.3 jquerylib_0.1.4 R.utils_2.12.3
## [31] DelayedArray_0.33.1 cachem_1.1.0 magick_2.8.5
## [34] abind_1.4-8 tidyselect_1.2.1 digest_0.6.37
## [37] purrr_1.0.2 arrow_17.0.0.1 maketools_1.3.1
## [40] fastmap_1.2.0 grid_4.4.1 cli_3.6.3
## [43] SparseArray_1.6.0 magrittr_2.0.3 S4Arrays_1.6.0
## [46] utf8_1.2.4 readr_2.1.5 UCSC.utils_1.2.0
## [49] bit64_4.5.2 rmarkdown_2.28 XVector_0.46.0
## [52] httr_1.4.7 bit_4.5.0 R.methodsS3_1.8.2
## [55] hms_1.1.3 SpatialExperiment_1.16.0 HDF5Array_1.35.1
## [58] evaluate_1.0.1 knitr_1.48 BiocIO_1.17.0
## [61] rlang_1.1.4 Rcpp_1.0.13 glue_1.8.0
## [64] BiocManager_1.30.25 vroom_1.6.5 jsonlite_1.8.9
## [67] Rhdf5lib_1.28.0 R6_2.5.1 zlibbioc_1.52.0