Introduction to Seahtrue

Seahtrue overview

The seahtrue package offers a set of functions for reproducible analysis of extracellular flux data. The main function, revive_xfplate(), reads, preprocesses and validates the input data and returns experimental details and outcome variables in an organized (tidy) way. The output of revive_xfplate() is a nested tibble.

Extracellular flux analysis scientific primer

With instruments such as the Seahorse XF analyzer from Agilent, but also the O2K from Oroboros, other oxygraphs (from Hansatech Instruments, for example) and the ReSipher from Lucid Scientific, scientists are able to analyze the oxygen consumption of living biological samples.

Oxygen consumption of cells or small model organisms can provide insights into the function of the mitochondria, since mitochondria are usually the main O2 consumers of cells. Apart from oxygen consumption, the Seahorse XF analyzer can analyze in parallel the extracellular acidification of the culture medium in which the sample is immersed. This can be a proxy for the glycolytic activity of samples.

Seahorse extracellular flux instruments analyze O2 and pH in 96-well, 24-well or 8-well plates. O2 and pH are typically monitored over a period of around 1 hour, in discrete measurements of typically 3 minutes each. Furthermore, perturbations of cellular functional states can be induced by adding compounds while performing the assay. The most common perturbations are injections of oligomycin, FCCP and antimycin/rotenone, known as a mitostress test.

Resources

Divakaruni, Ajit S., and Martin Jastroch. “A Practical Guide for the Analysis, Standardization and Interpretation of Oxygen Consumption Measurements.” Nature Metabolism 4, no. 8 (August 15, 2022): 978–94. https://doi.org/10.1038/s42255-022-00619-4.

Gerencser, A. A., A. Neilson, S. W. Choi, U. Edman, N. Yadava, R. J. Oh, D. A. Ferrick, D. G. Nicholls, and M. D. Brand. “Quantitative Microplate-Based Respirometry with Correction for Oxygen Diffusion.” Anal Chem 81, no. 16 (August 15, 2009): 6868–78. https://doi.org/10.1021/ac900881z.

Janssen, J. J. E., B. Lagerwaard, A. Bunschoten, H. F. J. Savelkoul, R. J. J. van Neerven, J. Keijer, and V. C. J. de Boer. “Novel Standardized Method for Extracellular Flux Analysis of Oxidative and Glycolytic Metabolism in Peripheral Blood Mononuclear Cells.” Sci Rep 11, no. 1 (January 18, 2021): 1662. https://doi.org/10.1038/s41598-021-81217-4.

Zhang, Xiang, Taolin Yuan, Jaap Keijer, and Vincent C. J. de Boer. “OCRbayes: A Bayesian Hierarchical Modeling Framework for Seahorse Extracellular Flux Oxygen Consumption Rate Data Analysis.” PLOS ONE 16, no. 7 (July 15, 2021): e0253926. https://doi.org/10.1371/journal.pone.0253926.

Seahtrue

Inspect the seahtrue data


# data_file_path <- 
#   system.file("data", 
#               "revive_output_donor_A.rda", 
#               package = "seahtrue")
# 
# load(data_file_path)


library(seahtrue)
library(dplyr) # provides %>% and pull(), which are used unqualified below
revive_output_donor_A <- 
  system.file("extdata", 
              "20191219_SciRep_PBMCs_donor_A.xlsx",
              package = "seahtrue") %>% 
  seahtrue::revive_xfplate()
#> → Start function to read seahorse plate data from Excel file:
#> 20191219_SciRep_PBMCs_donor_A.xlsx
#> ℹ Finished collecting assay information.
#> → plateid is identified as:V0174416419V
#> → Rate was exported WITH background correction
#> ℹ Finished preprocessing of the input data
#> ℹ Finished validating the input data

Take a glimpse at the generated data from the revive_xfplate() function:

revive_output_donor_A %>%  
  dplyr::glimpse()
#> Rows: 1
#> Columns: 9
#> $ plate_id          <chr> "V0174416419V"
#> $ filepath_seahorse <list> [<tbl_df[1 x 3]>]
#> $ date_run          <chr> "19-12-2019 17:25"
#> $ date_processed    <dttm> 2024-10-31 05:17:02
#> $ assay_info        <list> [<tbl_df[1 x 24]>]
#> $ injection_info    <list> [<tbl_df[12 x 3]>]
#> $ raw_data          <list> [<tbl_df[13824 x 21]>]
#> $ rate_data         <list> [<tbl_df[1140 x 12]>]
#> $ validation_output <list> [TRUE, TRUE, [<tbl_df[12 x 10]>], [<tbl_df[13824 x 4…

Our data starts with 4 columns of identifiers: plate_id, filepath_seahorse, date_run and date_processed. These keep the data easily traceable, with ids at the top level. After that, 5 columns of nested tibbles follow. The assay_info column contains a tibble/dataframe with information that was stored in the experimental file. This is information that the user entered into the software before running the experiment or afterwards when processing the data, or that was generated by the software. The next column is injection_info, containing measurement, interval and injection. Then the two data columns, raw_data and rate_data, are listed as tibble/dataframe. The final column is validation_output, which holds the output of the validations as well as the validation rules. In the next sections, we will explore each data output separately.
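
The sections below all use purrr::pluck("<column>", 1) to take the first element of a list column. As a minimal sketch of what that access does, here is the base R equivalent on a hypothetical one-row toy data frame (toy_plate and its values are made up for illustration and are not seahtrue output):

```r
# Toy stand-in for the one-row nested output; all values are made up.
toy_plate <- data.frame(plate_id = "toy_plate_1")
toy_plate$rate_data <- list(
  data.frame(well = c("A01", "A02"), OCR_wave_bc = c(0, 6.2))
)

# Base R equivalent of purrr::pluck(toy_plate, "rate_data", 1):
# select the list column, then take its first (and only) element.
toy_rate <- toy_plate[["rate_data"]][[1]]

nrow(toy_plate)   # one row per plate
names(toy_rate)   # columns of the nested data frame
```

The same indexing, revive_output_donor_A[["rate_data"]][[1]], should work on the real output, since a tibble list column is an ordinary list.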

Rate data

The rate_data is what scientists typically use to interpret their XF experiments. Rate data from an XF experiment consists of the calculated OCR (oxygen consumption rate) and ECAR (extracellular acidification rate) values, together with the PER (proton efflux rate). The PER is calculated from ECAR when the buffer capacity is known and set in the experiment. Our rate_data holds only the OCR and ECAR data, since PER can easily be calculated from ECAR.
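
A minimal sketch of that PER calculation, assuming the relation described by Agilent: PER = ECAR × buffer factor × microchamber volume × Kvol. The helper calc_per is hypothetical and not part of seahtrue, and all numeric inputs below are placeholders. The microchamber volume and Kvol depend on the instrument and plate type and should be taken from the instrument documentation.

```r
# Hypothetical helper, not part of seahtrue.
# ecar:          ECAR in mpH/min
# buffer_factor: buffer factor in mmol/L/pH
# chamber_vol:   measurement microchamber volume in microliters
# kvol:          instrument-specific volume scaling factor (unitless)
calc_per <- function(ecar, buffer_factor, chamber_vol, kvol) {
  # mpH/min * mmol/L/pH * uL = pmol H+/min
  ecar * buffer_factor * chamber_vol * kvol
}

# Placeholder inputs for illustration only.
toy_per <- calc_per(
  ecar          = c(2.90, 5.87),
  buffer_factor = 2.4,
  chamber_vol   = 2.28,
  kvol          = 1.6
)
toy_per
```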

revive_output_donor_A %>%  
  purrr::pluck("rate_data", 1)
#> # A tibble: 1,140 × 12
#>    measurement well  group time_wave OCR_wave OCR_wave_bc ECAR_wave ECAR_wave_bc
#>          <dbl> <chr> <chr>     <dbl>    <dbl>       <dbl>     <dbl>        <dbl>
#>  1           1 A01   Back…      1.31        0        0            0         0   
#>  2           1 A02   50.0…      1.31        0        6.22         0         2.90
#>  3           1 A03   100.…      1.31        0       26.6          0         5.87
#>  4           1 A04   100.…      1.31        0       21.4          0         4.40
#>  5           1 A05   150.…      1.31        0        3.08         0        12.4 
#>  6           1 A06   200.…      1.31        0       41.1          0         8.98
#>  7           1 A07   150.…      1.31        0       39.5          0         9.27
#>  8           1 A08   200.…      1.31        0       40.4          0         4.75
#>  9           1 A09   250.…      1.31        0       58.8          0         7.39
#> 10           1 A10   250.…      1.31        0       59.4          0         6.88
#> # ℹ 1,130 more rows
#> # ℹ 4 more variables: cell_n <dbl>, interval <dbl>, injection <chr>,
#> #   flagged_well <lgl>

The rate_data has well, measurement and group identifiers for each row, followed by the time_wave column, which provides the time of the measurement in minutes, and the OCR and ECAR data columns. The cell_n and flagged_well status are also joined into this dataframe. This provides all information for exploring and plotting the data. Since the OCR and ECAR data can be exported with background correction either on or off, the read functions in the seahtrue package determine whether the OCR and ECAR are background corrected, based on whether the Background wells have an OCR of zero. If the data was not corrected for background, the OCR is corrected while reading the .xlsx file. The background-corrected data is given in OCR_wave_bc and ECAR_wave_bc. If the data was exported without background correction, the OCR_wave and ECAR_wave columns contain the uncorrected OCR and ECAR. In this example the data was exported with background correction, and OCR_wave and ECAR_wave are 0 in the rows printed above.
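
The detection rule described above can be sketched in base R on toy data. This is an illustration under stated assumptions, not the seahtrue implementation: the toy values are made up, and subtracting the mean Background OCR per measurement is one plausible way to correct, chosen here for simplicity.

```r
# Toy rate data; values are made up for illustration.
toy_rate <- data.frame(
  measurement = c(1, 1, 1, 2, 2, 2),
  well        = c("A01", "A02", "A03", "A01", "A02", "A03"),
  group       = c("Background", "cells", "cells",
                  "Background", "cells", "cells"),
  OCR         = c(2, 12, 22, 4, 14, 24)
)

is_bkg <- toy_rate$group == "Background"

# Exported WITH correction: every Background well reads exactly zero.
was_background_corrected <- all(toy_rate$OCR[is_bkg] == 0)

if (was_background_corrected) {
  toy_rate$OCR_bc <- toy_rate$OCR
} else {
  # Mean Background OCR per measurement, subtracted from every well
  # of that measurement (assumed correction method).
  bkg_mean <- tapply(toy_rate$OCR[is_bkg], toy_rate$measurement[is_bkg], mean)
  toy_rate$OCR_bc <- toy_rate$OCR -
    as.numeric(bkg_mean[as.character(toy_rate$measurement)])
}

toy_rate
```

In this toy example the Background wells are non-zero, so the data is treated as uncorrected and the Background wells end up at zero after subtraction.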

Raw data

The raw_data tibble contains the raw data collected in an XF experiment. This is essential data that can give detailed insights into the quality of the assay. Apart from the data presented in the raw data sheet of the .xlsx file, some preprocessing output is given, such as the time in seconds (timescale) and minutes (minutes), as well as an interval and injection id. The background-corrected raw values for pH, O2 and their emissions are also given. Just like in the rate_data tibble, the cell_n and flagged_well status are included.

revive_output_donor_A %>%  
  purrr::pluck("raw_data", 1)
#> # A tibble: 13,824 × 21
#>    well  measurement  tick timescale minutes group interval injection O2_em_corr
#>    <chr>       <dbl> <dbl>     <dbl>   <dbl> <chr>    <dbl> <chr>          <dbl>
#>  1 A01             1     0         0       0 Back…        1 Baseline      12422.
#>  2 A02             1     0         0       0 50.0…        1 Baseline      12323.
#>  3 A03             1     0         0       0 100.…        1 Baseline      12483.
#>  4 A04             1     0         0       0 100.…        1 Baseline      12362.
#>  5 A05             1     0         0       0 150.…        1 Baseline      12103.
#>  6 A06             1     0         0       0 200.…        1 Baseline      12274.
#>  7 A07             1     0         0       0 150.…        1 Baseline      12354.
#>  8 A08             1     0         0       0 200.…        1 Baseline      12325.
#>  9 A09             1     0         0       0 250.…        1 Baseline      12347.
#> 10 A10             1     0         0       0 250.…        1 Baseline      12209.
#> # ℹ 13,814 more rows
#> # ℹ 12 more variables: pH_em_corr <dbl>, O2_mmHg <dbl>, pH <dbl>,
#> #   pH_em_corr_corr <dbl>, O2_em_corr_bkg <dbl>, pH_em_corr_bkg <dbl>,
#> #   O2_mmHg_bkg <dbl>, pH_bkgd <dbl>, pH_em_corr_corr_bkg <dbl>,
#> #   bufferfactor <dbl>, cell_n <dbl>, flagged_well <lgl>

Assay info

The assay_info holds metadata, provided by the user or the software, that is associated with the experiment and plate. For example, the barcode of the cartridge that was used:

revive_output_donor_A %>%  
  purrr::pluck("assay_info", 1) %>% 
  pull(cartridge_barcode)
#> [1] "W0013917519B**+405-6+101-2300F+240-2***000A+219-4**+450-1125&"

The XF analyzer reads a barcode for each cartridge, which is then associated with the assay. The barcode contains information that the software uses for the OCR calculation. The emission of the fluorescent O2 sensors at zero oxygen, F0 (see Gerencser et al. (2009) Quantitative microplate-based respirometry with correction for oxygen diffusion. Anal Chem 81:6868, for details), is derived from the Stern-Volmer constant KSV, where the emission at ambient oxygen is typically set to 12500 AU and the ambient oxygen level in wells with culture medium is set to 151.6900241 mmHg.

# KSV in barcode
revive_output_donor_A %>%  
  purrr::pluck("assay_info", 1) %>% 
  pull(cartridge_barcode) %>% 
  stringr::str_sub(-18, -13)
#> [1] "+219-4"
  
# KSV in assay info sheet
revive_output_donor_A %>%  
  purrr::pluck("assay_info", 1) %>% 
  pull(KSV) 
#> [1] 0.0219

# F0 can be calculated using the Stern-Volmer equation
# and the following info:
# emission target at ambient O2 = 12500
# O2 level at ambient in sample medium in well = 151.69
#
# F0/F = 1 + KSV*O2
# F0 = (1 + KSV*O2)*F
# F0 = (1 + KSV*151.69)*12500

# F0 from assay info sheet
revive_output_donor_A %>%  
  purrr::pluck("assay_info", 1) %>% 
  pull(F0) 
#> [1] 54025.14
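
The Stern-Volmer calculation from the comments above can be run as a short worked example. It assumes that the barcode field "+219-4" encodes a mantissa and a power of ten (219 × 10^-4), which agrees with the KSV of 0.0219 in the assay info sheet. The variable names are illustrative only.

```r
# KSV field as extracted from the barcode above.
ksv_field <- "+219-4"

# Assumption: the field reads as <mantissa><exponent>, i.e. 219 x 10^-4.
parts <- regmatches(
  ksv_field,
  regexec("^([+-][0-9]+)([+-][0-9]+)$", ksv_field)
)[[1]]
ksv <- as.numeric(parts[2]) * 10^as.numeric(parts[3])

# Stern-Volmer: F0 / F = 1 + KSV * O2
f_ambient  <- 12500        # emission target at ambient O2 (AU)
o2_ambient <- 151.6900241  # ambient O2 level in medium (mmHg)
f0 <- (1 + ksv * o2_ambient) * f_ambient

ksv
#> [1] 0.0219
round(f0, 2)
#> [1] 54025.14
```

The result matches the F0 value stored in the assay info sheet.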

Apart from user- and software-generated meta info, the functions in the seahtrue package also add some relevant info to this tibble, such as the time to the start of the actual measurements (minutes_to_start_measurement_one). This shows how long the user took to insert the cell plate and start running the measurements. The timer starts at t = 0 minutes when the cartridge is calibrated by the user.

revive_output_donor_A %>%  
  purrr::pluck("assay_info", 1) %>% 
  pull(minutes_to_start_measurement_one)
#> [1] 37.23333

Apart from the assay_info, more meta info can be associated with the data tibbles in the form of attributes. These can be viewed as shown in the following examples:

revive_output_donor_A %>%  
  purrr::pluck("rate_data", 1) %>% str()
#> tibble [1,140 × 12] (S3: tbl_df/tbl/data.frame)
#>  $ measurement : num [1:1140] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ well        : chr [1:1140] "A01" "A02" "A03" "A04" ...
#>  $ group       : chr [1:1140] "Background" "50.000" "100.000" "100.000" ...
#>  $ time_wave   : num [1:1140] 1.31 1.31 1.31 1.31 1.31 ...
#>  $ OCR_wave    : num [1:1140] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ OCR_wave_bc : num [1:1140] 0 6.22 26.64 21.42 3.08 ...
#>  $ ECAR_wave   : num [1:1140] 0 0 0 0 0 0 0 0 0 0 ...
#>  $ ECAR_wave_bc: num [1:1140] 0 2.9 5.87 4.4 12.36 ...
#>  $ cell_n      : num [1:1140] 0 32472 114732 83567 153510 ...
#>  $ interval    : num [1:1140] 1 1 1 1 1 1 1 1 1 1 ...
#>  $ injection   : chr [1:1140] "Baseline" "Baseline" "Baseline" "Baseline" ...
#>  $ flagged_well: logi [1:1140] FALSE FALSE FALSE FALSE FALSE FALSE ...
#>  - attr(*, "was_background_corrected")= logi TRUE
revive_output_donor_A %>%  
  purrr::pluck("rate_data", 1) %>% 
  attributes() %>% 
  purrr::pluck("was_background_corrected")
#> [1] TRUE
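
A single attribute can also be read directly with base R's attr(). A minimal sketch on a toy data frame (toy_rate and its values are made up for illustration):

```r
# Toy data frame carrying the same kind of flag; values are made up.
toy_rate <- data.frame(well = c("A01", "A02"), OCR_wave_bc = c(0, 6.2))
attr(toy_rate, "was_background_corrected") <- TRUE

# Direct lookup of one attribute; exact = TRUE disables partial matching.
attr(toy_rate, "was_background_corrected", exact = TRUE)
#> [1] TRUE
```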

Injection info

Since XF experiments typically use perturbations with chemicals or nutrients, the injection_info is important for the interpretation of the experiment. The injection info can be plucked from the data tibble.

revive_output_donor_A %>%  
  purrr::pluck("injection_info", 1)
#> # A tibble: 12 × 3
#>    measurement interval injection       
#>          <int>    <dbl> <chr>           
#>  1           1        1 Baseline        
#>  2           2        1 Baseline        
#>  3           3        1 Baseline        
#>  4           4        2 FCCP            
#>  5           5        2 FCCP            
#>  6           6        2 FCCP            
#>  7           7        3 AM/ROT          
#>  8           8        3 AM/ROT          
#>  9           9        3 AM/ROT          
#> 10          10        4 Monensin/Hoechst
#> 11          11        4 Monensin/Hoechst
#> 12          12        4 Monensin/Hoechst
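
Because each measurement maps to one injection, the injection labels can be used to summarise rates per phase of the assay. A minimal base R sketch on toy data (all values and the toy_* names are made up for illustration):

```r
# Toy injection info and toy OCR values for one well; values are made up.
toy_injection <- data.frame(
  measurement = 1:6,
  interval    = c(1, 1, 1, 2, 2, 2),
  injection   = c("Baseline", "Baseline", "Baseline", "FCCP", "FCCP", "FCCP")
)
toy_ocr <- data.frame(
  measurement = 1:6,
  OCR         = c(10, 12, 14, 30, 32, 34)
)

# Attach the injection label to each measurement, then average per injection.
toy_joined <- merge(toy_ocr, toy_injection, by = "measurement")
mean_ocr <- aggregate(OCR ~ injection, data = toy_joined, FUN = mean)
mean_ocr
```

In the rate_data and raw_data tibbles the interval and injection columns are already joined, so only the summarising step would be needed there.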

Session info

sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] seahtrue_1.1.0   dplyr_1.1.4      BiocStyle_2.35.0
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyr_1.3.1         sass_0.4.9          utf8_1.2.4         
#>  [4] generics_0.1.3      stringi_1.8.4       rematch_2.0.0      
#>  [7] digest_0.6.37       magrittr_2.0.3      evaluate_1.0.1     
#> [10] grid_4.4.1          timechange_0.3.0    fastmap_1.2.0      
#> [13] cellranger_1.1.0    jsonlite_1.8.9      BiocManager_1.30.25
#> [16] purrr_1.0.2         fansi_1.0.6         scales_1.3.0       
#> [19] jquerylib_0.1.4     cli_3.6.3           rlang_1.1.4        
#> [22] munsell_0.5.1       withr_3.0.2         cachem_1.1.0       
#> [25] yaml_2.3.10         tidyxl_1.0.10       tools_4.4.1        
#> [28] colorspace_2.1-1    ggplot2_3.5.1       buildtools_1.0.0   
#> [31] vctrs_0.6.5         logger_0.4.0        R6_2.5.1           
#> [34] settings_0.2.7      lifecycle_1.0.4     lubridate_1.9.3    
#> [37] snakecase_0.11.1    stringr_1.5.1       janitor_2.2.0      
#> [40] validate_1.1.5      pkgconfig_2.0.3     pillar_1.9.0       
#> [43] bslib_0.8.0         gtable_0.3.6        glue_1.8.0         
#> [46] Rcpp_1.0.13         xfun_0.48           tibble_3.2.1       
#> [49] tidyselect_1.2.1    sys_3.4.3           knitr_1.48         
#> [52] htmltools_0.5.8.1   rmarkdown_2.28      maketools_1.3.1    
#> [55] compiler_4.4.1      readxl_1.4.3