Package: tidytof 1.1.0

Timothy Keyes

tidytof: Analyze High-dimensional Cytometry Data Using Tidy Data Principles

This package implements an interactive, scientific analysis pipeline for high-dimensional cytometry data built using tidy data principles. It is specifically designed to play well with both the tidyverse and Bioconductor software ecosystems, with functionality for reading/writing data files, data cleaning, preprocessing, clustering, visualization, modeling, and other quality-of-life functions. tidytof implements a "grammar" of high-dimensional cytometry data analysis.

Authors: Timothy Keyes [cre], Kara Davis [rth, own], Garry Nolan [rth, own]

tidytof_1.1.0.tar.gz
tidytof_1.1.0.zip (r-4.5), tidytof_1.1.0.zip (r-4.4), tidytof_1.1.0.zip (r-4.3)
tidytof_1.1.0.tgz (r-4.4-x86_64), tidytof_1.1.0.tgz (r-4.4-arm64)
tidytof_1.1.0.tar.gz (r-4.5-noble), tidytof_1.1.0.tar.gz (r-4.4-noble)
tidytof_1.1.0.tgz (r-4.4-emscripten), tidytof_1.1.0.tgz (r-4.3-emscripten)

# Install 'tidytof' in R:
install.packages('tidytof', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/keyes-timothy/tidytof/issues

Pkgdown site: https://keyes-timothy.github.io

Uses libs:
  • c++ - GNU Standard C++ Library v3
Datasets:
  • ddpr_data - CyTOF data from two samples: 5,000 B-cell lineage cells from a healthy patient and 5,000 B-cell lineage cells from a B-cell precursor Acute Lymphoblastic Leukemia (BCP-ALL) patient.
  • ddpr_metadata - Clinical metadata for each patient sample in Good & Sarno et al. (2018).
  • metal_masterlist - A character vector of metal name patterns supported by tidytof.
  • phenograph_data - CyTOF data from 6,000 healthy immune cells from a single patient.

On Bioconductor: tidytof-1.1.0 (bioc 3.21), tidytof-1.0.0 (bioc 3.20)

singlecell, flowcytometry, bioinformatics, cytometry, data-science, single-cell, tidyverse, cpp

7.26 score, 19 stars, 35 scripts, 106 downloads, 103 exports, 106 dependencies

Last updated 3 months ago from: 3ca5f4afad. Checks: 7 OK. Indexed: yes.

Target | Result | Latest binary
Doc / Vignettes | OK | Jan 16 2025
R-4.5-win-x86_64 | OK | Jan 16 2025
R-4.5-linux-x86_64 | OK | Jan 16 2025
R-4.4-win-x86_64 | OK | Jan 16 2025
R-4.4-mac-x86_64 | OK | Jan 16 2025
R-4.4-mac-aarch64 | OK | Jan 16 2025
R-4.3-win-x86_64 | OK | Jan 16 2025

Exports: :=, .data, %>%, all_of, any_of, as_flowFrame, as_flowSet, as_seurat, as_SingleCellExperiment, as_tof_tbl, contains, ends_with, everything, last_col, matches, num_range, rev_asinh, starts_with, tidytof_example_data, tof_analyze_abundance, tof_analyze_abundance_diffcyt, tof_analyze_abundance_glmm, tof_analyze_abundance_ttest, tof_analyze_expression, tof_analyze_expression_diffcyt, tof_analyze_expression_lmm, tof_analyze_expression_ttest, tof_annotate_clusters, tof_assess_channels, tof_assess_clusters_distance, tof_assess_clusters_entropy, tof_assess_clusters_knn, tof_assess_flow_rate, tof_assess_model, tof_batch_correct, tof_batch_correct_quantile, tof_batch_correct_rescale, tof_calculate_flow_rate, tof_cluster, tof_cluster_ddpr, tof_cluster_flowsom, tof_cluster_kmeans, tof_cluster_phenograph, tof_create_grid, tof_downsample, tof_downsample_constant, tof_downsample_density, tof_downsample_prop, tof_estimate_density, tof_extract_central_tendency, tof_extract_emd, tof_extract_features, tof_extract_jsd, tof_extract_proportion, tof_extract_threshold, tof_find_knn, tof_generate_palette, tof_get_model_mixture, tof_get_model_outcomes, tof_get_model_penalty, tof_get_model_training_data, tof_get_model_type, tof_get_model_x, tof_get_model_y, tof_get_panel, tof_log_rank_test, tof_make_knn_graph, tof_make_roc_curve, tof_metacluster, tof_metacluster_consensus, tof_metacluster_flowsom, tof_metacluster_hierarchical, tof_metacluster_kmeans, tof_metacluster_phenograph, tof_plot_cells_density, tof_plot_cells_embedding, tof_plot_cells_layout, tof_plot_cells_scatter, tof_plot_clusters_heatmap, tof_plot_clusters_mst, tof_plot_clusters_volcano, tof_plot_model, tof_plot_sample_features, tof_plot_sample_heatmap, tof_postprocess, tof_predict, tof_preprocess, tof_read_data, tof_reduce_dimensions, tof_reduce_pca, tof_reduce_tsne, tof_reduce_umap, tof_set_panel, tof_spade_density, tof_split_data, tof_train_model, tof_transform, tof_upsample, tof_upsample_distance, tof_upsample_neighbor, tof_write_data, tof_write_fcs, where

Dependencies: BH, Biobase, BiocGenerics, bit, bit64, cachem, class, cli, clipr, clock, codetools, colorspace, cpp11, crayon, cytolib, data.table, diagram, digest, doParallel, dplyr, fansi, farver, fastmap, flowCore, foreach, future, future.apply, generics, ggforce, ggplot2, ggraph, ggrepel, glmnet, globals, glue, gower, graphlayouts, gridExtra, gtable, hardhat, hms, igraph, ipred, isoband, iterators, KernSmooth, labeling, lattice, lava, lifecycle, listenv, lubridate, magrittr, MASS, Matrix, matrixStats, memoise, mgcv, munsell, nlme, nnet, numDeriv, parallelly, pillar, pkgconfig, polyclip, prettyunits, prodlim, progress, progressr, purrr, R6, RColorBrewer, Rcpp, RcppArmadillo, RcppEigen, RcppHNSW, readr, recipes, Rhdf5lib, rlang, rpart, RProtoBufLib, S4Vectors, scales, shape, SQUAREM, stringi, stringr, survival, systemfonts, tibble, tidygraph, tidyr, tidyselect, timechange, timeDate, tweenr, tzdb, utf8, vctrs, viridis, viridisLite, vroom, withr, yardstick

GETTING STARTED with tidytof

Rendered from tidytof.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-23

Reading and writing data

Rendered from reading-and-writing-data.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-23

Quality control

Rendered from quality-control.Rmd using knitr::rmarkdown on Jan 16 2025.

Last update: 2024-08-25
Started: 2023-05-01

Preprocessing

Rendered from preprocessing.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-23

Downsampling

Rendered from downsampling.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-24

Dimensionality reduction

Rendered from dimensionality-reduction.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-24

Clustering and metaclustering

Rendered from clustering.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-24

Differential discovery analysis

Rendered from differential-discovery-analysis.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-24

Feature extraction

Rendered from feature-extraction.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-24

Building predictive models

Rendered from modeling.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-01-24

How to contribute code

Rendered from contributing-to-tidytof.Rmd using knitr::knitr on Jan 16 2025.

Last update: 2024-08-25
Started: 2022-04-26

Readme and manuals

Help Manual

Help page | Topics
Coerce an object into a 'flowFrame' | as_flowFrame, as_flowFrame.tof_tbl
Coerce an object into a 'flowSet' | as_flowSet, as_flowSet.tof_tbl
Coerce an object into a 'SeuratObject' | as_seurat, as_seurat.tof_tbl
Coerce an object into a 'SingleCellExperiment' | as_SingleCellExperiment, as_SingleCellExperiment.tof_tbl
Coerce flowFrames or flowSets into tof_tbls. | as_tof_tbl
Convert an object into a tof_tbl | as_tof_tbl.flowSet
Find the cosine similarity between two vectors | cosine_similarity
CyTOF data from two samples: 5,000 B-cell lineage cells from a healthy patient and 5,000 B-cell lineage cells from a B-cell precursor Acute Lymphoblastic Leukemia (BCP-ALL) patient. | ddpr_data
Clinical metadata for each patient sample in Good & Sarno et al. (2018). | ddpr_metadata
Find the dot product between two vectors. | dot
Find the extension for a file | get_extension
L2 normalize an input vector x to a length of 1 | l2_normalize
Find the magnitude of a vector. | magnitude
Make the AnnotatedDataFrame needed for the flowFrame class | make_flowcore_annotated_data_frame
A character vector of metal name patterns supported by tidytof. | metal_masterlist
Constructor for a tof_model. | new_tof_model
Constructor for a tof_tibble. | new_tof_tibble
CyTOF data from 6,000 healthy immune cells from a single patient. | phenograph_data
Reverses arcsinh transformation with cofactor `scale_factor` and a shift of `shift_factor`. | rev_asinh
Get paths to tidytof example data | tidytof_example_data
Perform Differential Abundance Analysis (DAA) on high-dimensional cytometry data | tof_analyze_abundance
Differential Abundance Analysis (DAA) with diffcyt | tof_analyze_abundance_diffcyt
Differential Abundance Analysis (DAA) with generalized linear mixed-models (GLMMs) | tof_analyze_abundance_glmm
Differential Abundance Analysis (DAA) with t-tests | tof_analyze_abundance_ttest
Perform Differential Expression Analysis (DEA) on high-dimensional cytometry data | tof_analyze_expression
Differential Expression Analysis (DEA) with diffcyt | tof_analyze_expression_diffcyt
Differential Expression Analysis (DEA) with linear mixed-models (LMMs) | tof_analyze_expression_lmm
Differential Expression Analysis (DEA) with t-tests | tof_analyze_expression_ttest
Manually annotate tidytof-computed clusters using user-specified labels | tof_annotate_clusters
Perform developmental clustering on CyTOF data using a pre-fit classifier | tof_apply_classifier
Detect low-expression (i.e. potentially failed) channels in high-dimensional cytometry data | tof_assess_channels
Assess a clustering result by calculating the z-score of each cell's Mahalanobis distance to its cluster centroid and flagging outliers. | tof_assess_clusters_distance
Assess a clustering result by calculating the Shannon entropy of each cell's Mahalanobis distance to all cluster centroids and flagging outliers. | tof_assess_clusters_entropy
Assess a clustering result by comparing a cell's cluster assignment to that of its K nearest neighbors. | tof_assess_clusters_knn
Detect flow rate abnormalities in high-dimensional cytometry data | tof_assess_flow_rate
Detect flow rate abnormalities in high-dimensional cytometry data (stored in a single data.frame) | tof_assess_flow_rate_tibble
Assess a trained elastic net model | tof_assess_model
Compute a trained elastic net model's performance metrics using new_data. | tof_assess_model_new_data
Access a trained elastic net model's performance metrics using its tuning data. | tof_assess_model_tuning
Perform groupwise linear rescaling of high-dimensional cytometry measurements | tof_batch_correct
Batch-correct a tibble of high-dimensional cytometry data using quantile normalization. | tof_batch_correct_quantile
Batch-correct a tibble of high-dimensional cytometry data using quantile normalization. | tof_batch_correct_quantile_tibble
Perform groupwise linear rescaling of high-dimensional cytometry measurements | tof_batch_correct_rescale
Calculate centroids and covariance matrices for each cell subpopulation in healthy CyTOF data. | tof_build_classifier
Calculate the relative flow rates of different timepoints throughout a flow or mass cytometry run. | tof_calculate_flow_rate
Check argument specifications for a glmnet model. | tof_check_model_args
Classify each cell (i.e. each row) in a matrix of cancer cells into its most similar healthy developmental subpopulation. | tof_classify_cells
Rename glmnet's default model evaluation metrics to make them more interpretable | tof_clean_metric_names
Cluster high-dimensional cytometry data. | tof_cluster
Perform developmental clustering on high-dimensional cytometry data. | tof_cluster_ddpr
Perform FlowSOM clustering on high-dimensional cytometry data | tof_cluster_flowsom
Cluster (grouped) high-dimensional cytometry data. | tof_cluster_grouped
Perform k-means clustering on high-dimensional cytometry data. | tof_cluster_kmeans
Perform PhenoGraph clustering on high-dimensional cytometry data. | tof_cluster_phenograph
Cluster (ungrouped) high-dimensional cytometry data. | tof_cluster_tibble
Compute a Kaplan-Meier curve from sample-level survival data | tof_compute_km_curve
A function for finding the cosine distance between each of the rows of a numeric matrix and a numeric vector. | tof_cosine_dist
Create an elastic net hyperparameter search grid of a specified size | tof_create_grid
Create a recipe for preprocessing sample-level cytometry data for an elastic net model | tof_create_recipe
Downsample high-dimensional cytometry data. | tof_downsample
Downsample high-dimensional cytometry data by randomly selecting a constant number of cells per group. | tof_downsample_constant
Downsample high-dimensional cytometry data by randomly selecting a proportion of the cells in each group. | tof_downsample_density
Downsample high-dimensional cytometry data by randomly selecting a proportion of the cells in each group. | tof_downsample_prop
Estimate the local densities for all cells in a high-dimensional cytometry dataset. | tof_estimate_density
Extract the central tendencies of CyTOF markers in each cluster in a `tof_tibble`. | tof_extract_central_tendency
Extract aggregated features from CyTOF data using earth-mover's distance (EMD) | tof_extract_emd
Extract aggregated, sample-level features from CyTOF data. | tof_extract_features
Extract aggregated features from CyTOF data using the Jensen-Shannon Distance (JSD) | tof_extract_jsd
Extract the proportion of cells in each cluster in a `tof_tibble`. | tof_extract_proportion
Extract aggregated features from CyTOF data using a binary threshold | tof_extract_threshold
Find the optimal hyperparameters for an elastic net model from candidate performance metrics | tof_find_best
Calculate and store the predicted outcomes for each validation set observation during model tuning | tof_find_cv_predictions
Find the earth-mover's distance between two numeric vectors | tof_find_emd
Find the Jensen-Shannon Divergence (JSD) between two numeric vectors | tof_find_jsd
Find the k-nearest neighbors of each cell in a high-dimensional cytometry dataset. | tof_find_knn
Compute the log-rank test p-value for the difference between the two survival curves obtained by splitting a dataset into a "low" and "high" risk group using all possible relative-risk thresholds. | tof_find_log_rank_threshold
Use tidytof's opinionated heuristic for extracting a high-dimensional cytometry panel's metal-antigen pairs from a flowFrame (read from an .fcs file). | tof_find_panel_info
Fit a glmnet model and calculate performance metrics using a single rsplit object | tof_fit_split
Generate a color palette using tidytof. | tof_generate_palette
Get a `tof_model`'s optimal mixture (alpha) value | tof_get_model_mixture
Get a `tof_model`'s outcome variable name(s) | tof_get_model_outcomes
Get a `tof_model`'s optimal penalty (lambda) value | tof_get_model_penalty
Get a `tof_model`'s training data | tof_get_model_training_data
Get a `tof_model`'s model type | tof_get_model_type
Get a `tof_model`'s processed predictor matrix (for glmnet) | tof_get_model_x
Get a `tof_model`'s processed outcome variable matrix (for glmnet) | tof_get_model_y
Get panel information from a tof_tibble | tof_get_panel
Find if a vector is numeric | tof_is_numeric
Estimate cells' local densities using K-nearest-neighbor density estimation | tof_knn_density
Compute the log-rank test p-value for the difference between the two survival curves obtained by splitting a dataset into a "low" and "high" risk group using a given relative-risk threshold. | tof_log_rank_test
Title | tof_make_knn_graph
Compute a receiver operating characteristic (ROC) curve for a two-class or multiclass dataset | tof_make_roc_curve
Metacluster clustered CyTOF data. | tof_metacluster
Metacluster clustered CyTOF data using consensus clustering | tof_metacluster_consensus
Metacluster clustered CyTOF data using FlowSOM's built-in metaclustering algorithm | tof_metacluster_flowsom
Metacluster clustered CyTOF data using hierarchical agglomerative clustering | tof_metacluster_hierarchical
Metacluster clustered CyTOF data using k-means clustering | tof_metacluster_kmeans
Metacluster clustered CyTOF data using PhenoGraph clustering | tof_metacluster_phenograph
Plot marker expression density plots | tof_plot_cells_density
Plot scatterplots of single-cell data using low-dimensional feature embeddings | tof_plot_cells_embedding
Plot force-directed layouts of single-cell data | tof_plot_cells_layout
Plot scatterplots of single-cell data. | tof_plot_cells_scatter
Make a heatmap summarizing cluster marker expression patterns in CyTOF data | tof_plot_clusters_heatmap
Visualize clusters in CyTOF data using a minimum spanning tree (MST). | tof_plot_clusters_mst
Create a volcano plot from differential expression analysis results | tof_plot_clusters_volcano
Make a heatmap summarizing group marker expression patterns in high-dimensional cytometry data | tof_plot_heatmap
Plot the results of a glmnet model fit on sample-level data. | tof_plot_model
Plot the results of a linear glmnet model fit on sample-level data. | tof_plot_model_linear
Plot the results of a two-class glmnet model fit on sample-level data. | tof_plot_model_logistic
Plot the results of a multiclass glmnet model fit on sample-level data. | tof_plot_model_multinomial
Plot the results of a survival glmnet model fit on sample-level data. | tof_plot_model_survival
Make a heatmap summarizing sample marker expression patterns in CyTOF data | tof_plot_sample_features
Make a heatmap summarizing sample marker expression patterns in CyTOF data | tof_plot_sample_heatmap
Post-process transformed CyTOF data. | tof_postprocess
Use a trained elastic net model to predict fitted values from new data | tof_predict
Train a recipe or list of recipes for preprocessing sample-level cytometry data | tof_prep_recipe
Preprocess raw high-dimensional cytometry data. | tof_preprocess
Read high-dimensional cytometry data from a .csv file into a tidy tibble. | tof_read_csv
Read data from an .fcs/.csv file or a directory of .fcs/.csv files. | tof_read_data
Read high-dimensional cytometry data from an .fcs file into a tidy tibble. | tof_read_fcs
Read high-dimensional cytometry data from a single .fcs or .csv file into a tidy tibble. | tof_read_file
Apply dimensionality reduction to a single-cell dataset. | tof_reduce_dimensions
Perform principal component analysis on single-cell data | tof_reduce_pca
Perform t-distributed stochastic neighbor embedding on single-cell data | tof_reduce_tsne
Apply uniform manifold approximation and projection (UMAP) to single-cell data | tof_reduce_umap
Set panel information for a tof_tibble | tof_set_panel
Estimate cells' local densities as done in Spanning-tree Progression Analysis of Density-normalized Events (SPADE) | tof_spade_density
Split high-dimensional cytometry data into a training and test set | tof_split_data
Split the dimensionality reduction data that tidytof combines during 'SingleCellExperiment' conversion | tof_split_tidytof_reduced_dimensions
Train an elastic net model to predict sample-level phenomena using high-dimensional cytometry data. | tof_train_model
Transform raw high-dimensional cytometry data. | tof_transform
Tune an elastic net model's hyperparameters over multiple resamples | tof_tune_glmnet
Upsample cells into the closest cluster in a reference dataset | tof_upsample
Upsample cells into the closest cluster in a reference dataset | tof_upsample_distance
Upsample cells into the cluster of their nearest neighbor in a reference dataset | tof_upsample_neighbor
Write a series of .csv files from a tof_tbl | tof_write_csv
Write high-dimensional cytometry data to a file or to a directory of files | tof_write_data
Write a series of .fcs files from a tof_tbl | tof_write_fcs
Select variables with a function | where
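
Several help topics in the manual (dot, magnitude, l2_normalize, cosine_similarity) describe elementary vector operations. The sketch below shows the textbook definitions those titles refer to, written in base R. The function names carry a `_sketch` suffix and are hypothetical: they are not tidytof's implementations, and tidytof's own argument names and edge-case handling may differ.

```r
# Hypothetical helpers illustrating the math behind the help titles above.
# These are NOT tidytof's functions; names and behavior are assumptions.

# Dot product: sum of the element-wise products of two equal-length vectors.
dot_sketch <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  sum(x * y)
}

# Magnitude (Euclidean / L2 norm) of a vector.
magnitude_sketch <- function(x) {
  sqrt(dot_sketch(x, x))
}

# L2 normalization: rescale x so that its magnitude is 1.
# Assumption: x is not the zero vector (its magnitude would be 0).
l2_normalize_sketch <- function(x) {
  x / magnitude_sketch(x)
}

# Cosine similarity: dot product of the two L2-normalized vectors.
cosine_similarity_sketch <- function(x, y) {
  dot_sketch(l2_normalize_sketch(x), l2_normalize_sketch(y))
}

x <- c(3, 4)
y <- c(4, 3)
magnitude_sketch(x)             # 5
l2_normalize_sketch(x)          # 0.6 0.8
cosine_similarity_sketch(x, y)  # 0.96
```

Normalizing both vectors first makes the similarity depend only on direction, so two marker-expression profiles that differ only by a constant scale factor score 1.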