The GSgalgoR package provides a simple but practical callback mechanism
for adapting different sections of the galgo() execution to the needs of
the final user. The callback mechanism lets the user add custom functions
that change the behavior of galgo() through minor modifications to its
workflow. Common applications of the callback mechanism are implementing
personalized reports, saving partial information during the evolution
process, and computing the execution time.
There are five points where the user can hook custom code into the
galgo() execution process:
- Before the galgo() execution process (i.e. when galgo() is about to start).
- At the beginning of each generation.
- In the middle of each generation, right after the new mating pool has been created.
- At the end of each generation.
- After the galgo() execution process (i.e. when galgo() is about to finish).
Each of the five hooks can be accessed through a parameter with the
_callback suffix in the galgo() function.
galgo(...,
      start_galgo_callback = callback_default, # `galgo()` is about to start.
      end_galgo_callback = callback_default,   # `galgo()` is about to finish.
      start_gen_callback = callback_default,   # At the beginning of each generation.
      end_gen_callback = callback_default,     # At the end of each generation.
      report_callback = callback_default,      # In the middle of the generation,
                                               # right after the new mating pool
                                               # has been created.
      ...)
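To make the firing order of the five hooks concrete, the following is a toy driver loop written only for illustration. It is not the GSgalgoR implementation: `toy_run()`, `noop()`, `tracer()` and the placeholder population objects are hypothetical names, and the generation value passed to the start hook (0) is an assumption.

```r
# Toy driver, NOT the GSgalgoR implementation: it only illustrates the order
# in which the five hooks are assumed to fire.
noop <- function(userdir = "", generation, pop_pool, pareto,
                 prob_matrix, current_time) {
  invisible(NULL)
}

toy_run <- function(generations,
                    start_galgo_callback = noop,
                    end_galgo_callback = noop,
                    start_gen_callback = noop,
                    end_gen_callback = noop,
                    report_callback = noop) {
  # Placeholder state standing in for the real population and Pareto front.
  pop_pool <- data.frame()
  pareto <- list()
  prob_matrix <- matrix(numeric(0), nrow = 0, ncol = 0)
  fire <- function(hook, generation) {
    hook("", generation, pop_pool, pareto, prob_matrix, Sys.time())
  }
  fire(start_galgo_callback, 0L)
  for (g in seq_len(generations)) {
    fire(start_gen_callback, g)
    # ... the new mating pool would be created here ...
    fire(report_callback, g)
    # ... the rest of the generation would run here ...
    fire(end_gen_callback, g)
  }
  # The value of the last hook becomes the value of the whole run.
  fire(end_galgo_callback, generations)
}

# A small callback factory that records which hook fired and when.
trace <- new.env()
trace$log <- character(0)
tracer <- function(label) {
  function(userdir = "", generation, pop_pool, pareto,
           prob_matrix, current_time) {
    trace$log <- c(trace$log, paste0(label, ":", generation))
    invisible(NULL)
  }
}

toy_run(2,
        start_galgo_callback = tracer("start_galgo"),
        end_galgo_callback = tracer("end_galgo"),
        start_gen_callback = tracer("start_gen"),
        end_gen_callback = tracer("end_gen"),
        report_callback = tracer("report"))
print(trace$log)
```

In this sketch the report hook sits between the two per-generation hooks, which mirrors the "in the middle of the generation" description above.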
A callback can be any R function that accepts the following six parameters:
- userdir: the directory (“character”) where the user can save information to the local filesystem.
- generation: the number (“integer”) of the current generation/iteration.
- pop_pool: the data.frame containing the resulting solutions for the current iteration.
- pareto: the solutions found by galgo() across all generations in the solution space.
- prob_matrix: the expression set (“matrix”) where features are rows and samples are distributed in columns.
- current_time: the current time (an object of class “POSIXct”).
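As one possible use of the current_time parameter, the sketch below builds a pair of callbacks that report the time elapsed since the run started. `make_timer_callbacks()` is a hypothetical helper, not part of GSgalgoR; the callbacks are exercised here by calling them directly with fixed timestamps instead of running galgo().

```r
# Hypothetical helper (not part of GSgalgoR): returns two callbacks that share
# a start time through the enclosing environment.
make_timer_callbacks <- function() {
  started_at <- NULL
  start <- function(userdir = "", generation, pop_pool, pareto,
                    prob_matrix, current_time) {
    started_at <<- current_time
    invisible(NULL)
  }
  report <- function(userdir = "", generation, pop_pool, pareto,
                     prob_matrix, current_time) {
    # Fall back to the first observed time if `start` was never called.
    if (is.null(started_at)) started_at <<- current_time
    elapsed <- difftime(current_time, started_at, units = "secs")
    message(sprintf("generation %s: %.2f s elapsed",
                    generation, as.numeric(elapsed)))
    invisible(elapsed)
  }
  list(start = start, report = report)
}

# Direct calls with fixed timestamps, standing in for a real run.
timers <- make_timer_callbacks()
t0 <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
timers$start("", 0L, NULL, NULL, NULL, t0)
elapsed <- timers$report("", 1L, NULL, NULL, NULL, t0 + 5)
```

Under these assumptions, `timers$start` would be assigned to start_galgo_callback and `timers$report` to report_callback or end_gen_callback.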
The following example callback prints the generation number and the current time every two iterations:
my_callback <-
    function(userdir = "",
             generation,
             pop_pool,
             pareto,
             prob_matrix,
             current_time) {
        # code starts here
        if (generation %% 2 == 0)
            message(paste0("generation: ", generation,
                           " current_time: ", current_time))
    }
Then, the my_callback() function needs to be assigned to one of the
hooks provided by galgo(). An example of such an assignment and the
resulting output is provided in the two snippets below.
A reduced version of the TRANSBIG dataset is used to set up the
expression and clinical information required by the galgo() function.
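library(breastCancerTRANSBIG) # provides the `transbig` ExpressionSet
data(transbig)
train <- transbig
rm(transbig)
# extract the expression matrix and the clinical annotations
expression <- Biobase::exprs(train)
clinical <- Biobase::pData(train)
# relapse-free survival as the outcome
OS <- survival::Surv(time = clinical$t.rfs, event = clinical$e.rfs)
# use a reduced dataset for the example (100 randomly sampled features)
expression <- expression[sample(seq_len(nrow(expression)), 100), ]
# scale the expression matrix
expression <- t(scale(t(expression)))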
Then, the galgo() function is invoked and the newly defined
my_callback() function is assigned to the report_callback hook-point.
# Running galgo
GSgalgoR::galgo(generations = 6,
                population = 15,
                prob_matrix = expression,
                OS = OS,
                start_galgo_callback = GSgalgoR::callback_default,
                end_galgo_callback = GSgalgoR::callback_default,
                report_callback = my_callback, # call `my_callback()` in the middle
                                               # of each generation/iteration.
                start_gen_callback = GSgalgoR::callback_default,
                end_gen_callback = GSgalgoR::callback_default)
#> Using CPU for computing pearson distance
#> generation: 2 current_time: 2024-10-30 07:22:40.742918
#> generation: 4 current_time: 2024-10-30 07:22:42.520267
#> generation: 6 current_time: 2024-10-30 07:22:44.402916
#> NULL
The following callback function saves the solutions obtained every two
generations/iterations in a temporary directory. A file named after the
generation number, with an .rda extension, is left in the directory
returned by the tempdir() function.
my_save_pop_callback <-
    function(userdir = "",
             generation,
             pop_pool,
             pareto,
             prob_matrix,
             current_time) {
        # make sure the output directory exists
        directory <- paste0(tempdir(), "/")
        if (!dir.exists(directory)) {
            dir.create(directory, recursive = TRUE)
        }
        filename <- paste0(directory, generation, ".rda")
        # only even generations are written to disk
        if (generation %% 2 == 0) {
            save(file = filename, pop_pool)
        }
        # note: this message is emitted at every generation,
        # even when no file was written
        message(paste("solution file saved in", filename))
    }
As before, the galgo() function is invoked and the newly defined
my_save_pop_callback() function is assigned to the end_gen_callback
hook-point. As a result, every two generations/iterations the complete
solution obtained by galgo() is saved in a file. Note that the message is
printed at every generation, although a file is only written at even ones.
# Running galgo
GSgalgoR::galgo(
    generations = 6,
    population = 15,
    prob_matrix = expression,
    OS = OS,
    start_galgo_callback = GSgalgoR::callback_default,
    end_galgo_callback = GSgalgoR::callback_default,
    report_callback = my_callback,          # call `my_callback()`
                                            # in the middle of each generation/iteration.
    start_gen_callback = GSgalgoR::callback_default,
    end_gen_callback = my_save_pop_callback # call `my_save_pop_callback()`
                                            # at the end of each
                                            # generation/iteration
)
#> Using CPU for computing pearson distance
#> solution file saved in /tmp/RtmpAK3MxU/1.rda
#> generation: 2 current_time: 2024-10-30 07:22:47.503315
#> solution file saved in /tmp/RtmpAK3MxU/2.rda
#> solution file saved in /tmp/RtmpAK3MxU/3.rda
#> generation: 4 current_time: 2024-10-30 07:22:49.424645
#> solution file saved in /tmp/RtmpAK3MxU/4.rda
#> solution file saved in /tmp/RtmpAK3MxU/5.rda
#> generation: 6 current_time: 2024-10-30 07:22:51.20821
#> solution file saved in /tmp/RtmpAK3MxU/6.rda
#> NULL
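Files written with save() can be read back with load(), which restores objects under the names they were saved with (here, pop_pool). The sketch below shows one way to do that without overwriting workspace variables. The data.frame and the file name are made-up stand-ins, since the real contents of pop_pool depend on the run.

```r
# Stand-in for a population saved by a callback; a real `pop_pool` differs.
pop_pool <- data.frame(solution = 1:3, fitness = c(0.2, 0.5, 0.9))
filename <- file.path(tempdir(), "example_generation.rda")
save(pop_pool, file = filename)

# Load into a fresh environment so the restored object cannot silently
# overwrite a variable of the same name in the workspace.
restored <- new.env()
loaded_names <- load(filename, envir = restored)
print(loaded_names)

# The saved population is available under its original name.
saved_pop <- restored$pop_pool
print(dim(saved_pop))
```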
By default, GSgalgoR implements four callback functions:
- callback_default(): a simple callback that does nothing at all. It is only used for setting the default behavior of some of the hook-points inside galgo().
- callback_base_report(): a report callback that prints basic information about the solutions provided by galgo(), such as fitness and crowding distance.
- callback_no_report(): a report callback that informs the user that galgo is running. No valuable information is shown.
- callback_base_return_pop(): a callback function that builds and returns the galgo.Obj object.
In the default definition of the galgo() function, the hook-points are
defined as follows:
- start_galgo_callback = callback_default
- end_galgo_callback = callback_base_return_pop
- report_callback = callback_base_report
- start_gen_callback = callback_default
- end_gen_callback = callback_default
Notice that the callback mechanism makes it possible to modify even the
return value of the galgo() function. The default
callback_base_return_pop() returns a galgo.Obj object; however, it would
be simple to replace it with something like my_save_pop_callback(), in
which case the function will not return any value.
# Running galgo
GSgalgoR::galgo(
    generations = 6,
    population = 15,
    prob_matrix = expression,
    OS = OS,
    start_galgo_callback = GSgalgoR::callback_default,
    end_galgo_callback = my_save_pop_callback,
    report_callback = my_callback, # call `my_callback()`
                                   # in the middle of each generation/iteration
    start_gen_callback = GSgalgoR::callback_default,
    end_gen_callback = GSgalgoR::callback_default
)
#> Using CPU for computing pearson distance
#> generation: 2 current_time: 2024-10-30 07:22:54.342906
#> generation: 4 current_time: 2024-10-30 07:22:56.057882
#> generation: 6 current_time: 2024-10-30 07:22:57.792684
#> solution file saved in /tmp/RtmpAK3MxU/6.rda
To preserve the return behavior of the galgo() function,
callback_base_return_pop() should be called inside the custom callback
assigned to end_galgo_callback, as the last expression of that callback.
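One way to combine a custom side effect with the default return value is a small wrapper that runs the custom callback first and then delegates to a return callback. This is a sketch only: `with_side_effect()` is a hypothetical helper, and it assumes that callback_base_return_pop() accepts the same six parameters as any other callback. The stubs stand in for the real callbacks so the example runs without galgo().

```r
# Hypothetical helper (not part of GSgalgoR): run `side_effect`, then return
# whatever `return_callback` returns for the same arguments.
with_side_effect <- function(side_effect, return_callback) {
  function(userdir = "", generation, pop_pool, pareto,
           prob_matrix, current_time) {
    side_effect(userdir, generation, pop_pool, pareto,
                prob_matrix, current_time)
    return_callback(userdir, generation, pop_pool, pareto,
                    prob_matrix, current_time)
  }
}

# Stubs standing in for a saving callback and for the default return callback.
counter <- new.env()
counter$runs <- 0
stub_side_effect <- function(userdir = "", generation, pop_pool, pareto,
                             prob_matrix, current_time) {
  counter$runs <- counter$runs + 1
  invisible(NULL)
}
stub_return <- function(userdir = "", generation, pop_pool, pareto,
                        prob_matrix, current_time) {
  list(generation = generation, n_solutions = nrow(pop_pool))
}

combined <- with_side_effect(stub_side_effect, stub_return)
result <- combined("", 6L, data.frame(id = 1:4), list(),
                   matrix(0, 2, 2), Sys.time())

# With the real functions, the assumed usage would be:
# end_galgo_callback = with_side_effect(my_save_pop_callback,
#                                       GSgalgoR::callback_base_return_pop)
```

The delegated call is the last expression, so its value becomes the value of the combined callback.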
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] survminer_0.4.9 ggpubr_0.6.0
#> [3] ggplot2_3.5.1 genefu_2.37.0
#> [5] AIMS_1.39.0 e1071_1.7-16
#> [7] iC10_2.0.2 biomaRt_2.63.0
#> [9] survcomp_1.55.0 prodlim_2024.06.25
#> [11] survival_3.7-0 Biobase_2.67.0
#> [13] BiocGenerics_0.53.0 GSgalgoR_1.17.0
#> [15] breastCancerUPP_1.43.0 breastCancerTRANSBIG_1.43.0
#> [17] BiocStyle_2.35.0
#>
#> loaded via a namespace (and not attached):
#> [1] sys_3.4.3 jsonlite_1.8.9 magrittr_2.0.3
#> [4] SuppDists_1.1-9.8 farver_2.1.2 rmarkdown_2.28
#> [7] zlibbioc_1.51.2 vctrs_0.6.5 memoise_2.0.1
#> [10] rstatix_0.7.2 htmltools_0.5.8.1 progress_1.2.3
#> [13] curl_5.2.3 broom_1.0.7 Formula_1.2-5
#> [16] sass_0.4.9 parallelly_1.38.0 KernSmooth_2.23-24
#> [19] bslib_0.8.0 httr2_1.0.5 zoo_1.8-12
#> [22] impute_1.79.0 cachem_1.1.0 commonmark_1.9.2
#> [25] buildtools_1.0.0 iterators_1.0.14 lifecycle_1.0.4
#> [28] pkgconfig_2.0.3 Matrix_1.7-1 R6_2.5.1
#> [31] fastmap_1.2.0 GenomeInfoDbData_1.2.13 future_1.34.0
#> [34] digest_0.6.37 colorspace_2.1-1 AnnotationDbi_1.69.0
#> [37] S4Vectors_0.43.2 nsga2R_1.1 RSQLite_2.3.7
#> [40] labeling_0.4.3 filelock_1.0.3 km.ci_0.5-6
#> [43] fansi_1.0.6 httr_1.4.7 abind_1.4-8
#> [46] compiler_4.4.1 proxy_0.4-27 doParallel_1.0.17
#> [49] bit64_4.5.2 withr_3.0.2 backports_1.5.0
#> [52] carData_3.0-5 DBI_1.2.3 highr_0.11
#> [55] ggsignif_0.6.4 lava_1.8.0 rappdirs_0.3.3
#> [58] tools_4.4.1 iC10TrainingData_2.0.1 future.apply_1.11.3
#> [61] bootstrap_2019.6 glue_1.8.0 gridtext_0.1.5
#> [64] grid_4.4.1 cluster_2.1.6 generics_0.1.3
#> [67] gtable_0.3.6 KMsurv_0.1-5 class_7.3-22
#> [70] tidyr_1.3.1 data.table_1.16.2 hms_1.1.3
#> [73] xml2_1.3.6 car_3.1-3 utf8_1.2.4
#> [76] XVector_0.45.0 markdown_1.13 foreach_1.5.2
#> [79] pillar_1.9.0 stringr_1.5.1 limma_3.61.12
#> [82] splines_4.4.1 ggtext_0.1.2 dplyr_1.1.4
#> [85] BiocFileCache_2.15.0 lattice_0.22-6 bit_4.5.0
#> [88] tidyselect_1.2.1 maketools_1.3.1 Biostrings_2.75.0
#> [91] knitr_1.48 gridExtra_2.3 IRanges_2.39.2
#> [94] pamr_1.57 stats4_4.4.1 xfun_0.48
#> [97] statmod_1.5.0 stringi_1.8.4 UCSC.utils_1.1.0
#> [100] yaml_2.3.10 evaluate_1.0.1 codetools_0.2-20
#> [103] tibble_3.2.1 BiocManager_1.30.25 cli_3.6.3
#> [106] survivalROC_1.0.3.1 xtable_1.8-4 munsell_0.5.1
#> [109] jquerylib_0.1.4 survMisc_0.5.6 Rcpp_1.0.13
#> [112] GenomeInfoDb_1.41.2 rmeta_3.0 globals_0.16.3
#> [115] dbplyr_2.5.0 png_0.1-8 parallel_4.4.1
#> [118] mco_1.17 blob_1.2.4 prettyunits_1.2.0
#> [121] mclust_6.1.1 listenv_0.9.1 scales_1.3.0
#> [124] purrr_1.0.2 crayon_1.5.3 rlang_1.1.4
#> [127] KEGGREST_1.45.1