Package: GSgalgoR 1.17.0

Carlos Catania

GSgalgoR: An Evolutionary Framework for the Identification and Study of Prognostic Gene Expression Signatures in Cancer

A multi-objective optimization algorithm for disease sub-type discovery based on a non-dominated sorting genetic algorithm. The 'Galgo' framework combines the advantages of clustering algorithms for grouping heterogeneous 'omics' data and the searching properties of genetic algorithms for feature selection. The algorithm searches for the optimal number of clusters, considering the features that maximize the survival difference between sub-types while keeping cluster consistency high.

Authors: Martin Guerrero [aut], Carlos Catania [cre]


# Install 'GSgalgoR' in R:
install.packages('GSgalgoR', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/harpomaxx/gsgalgor/issues (4 issues)

On Bioconductor: GSgalgoR-1.17.0 (bioc 3.21), GSgalgoR-1.16.0 (bioc 3.20)

geneexpression, transcription, clustering, classification, survival

5.48 score 15 stars 6 scripts 154 downloads 21 exports 14 dependencies

Last updated 3 months ago from:5111e23387. Checks: 5 OK, 2 NOTE. Indexed: yes.

Target | Result | Latest binary
Doc / Vignettes | OK | Dec 29 2024
R-4.5-win | NOTE | Dec 29 2024
R-4.5-linux | NOTE | Dec 29 2024
R-4.4-win | OK | Dec 29 2024
R-4.4-mac | OK | Dec 29 2024
R-4.3-win | OK | Dec 29 2024
R-4.3-mac | OK | Dec 29 2024

Exports: calculate_distance_euclidean_cpu, calculate_distance_pearson_cpu, calculate_distance_spearman_cpu, calculate_distance_uncentered_cpu, callback_base_report, callback_base_return_pop, callback_default, callback_no_report, classify_multiple, cluster_algorithm, cluster_classify, cosine_similarity, create_centroids, galgo, k_centroids, non_dominated_summary, plot_pareto, select_distance, surv_fitness, to_dataframe, to_list

Dependencies: cluster, codetools, doParallel, foreach, iterators, lattice, matchingR, Matrix, mco, nsga2R, proxy, Rcpp, RcppArmadillo, survival

GSgalgoR Callbacks Mechanism

Rendered from GSgalgoR_callbacks.rmd using knitr::rmarkdown on Dec 29 2024.

Last update: 2020-10-20
Started: 2020-08-03

GSgalgoR user Guide

Rendered from GSgalgoR.rmd using knitr::rmarkdown on Dec 29 2024.

Last update: 2021-05-22
Started: 2020-08-03

Citation

Guerrero-Gimenez ME, Fernandez-Munoz JM, Lang BJ, Holton KM, Ciocca DR, Catania CA, Zoppino FCM (2020). “Galgo: A bi-objective evolutionary meta-heuristic identifies robust transcriptomic classifiers associated with patient outcome across multiple cancer types.” Bioinformatics. ISSN 1367-4803, doi:10.1093/bioinformatics/btaa619, btaa619, https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btaa619/33472708/btaa619.pdf, https://doi.org/10.1093/bioinformatics/btaa619.

Corresponding BibTeX entry:

  @Article{10.1093/bioinformatics/btaa619,
    author = {M E Guerrero-Gimenez and J M Fernandez-Munoz and B J Lang
      and K M Holton and D R Ciocca and C A Catania and F C M Zoppino},
    title = {Galgo: A bi-objective evolutionary meta-heuristic
      identifies robust transcriptomic classifiers associated with
      patient outcome across multiple cancer types},
    journal = {Bioinformatics},
    year = {2020},
    month = {07},
    issn = {1367-4803},
    doi = {10.1093/bioinformatics/btaa619},
    url = {https://doi.org/10.1093/bioinformatics/btaa619},
    note = {btaa619},
    eprint =
      {https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btaa619/33472708/btaa619.pdf},
  }

Readme and manuals

GSgalgoR

GSgalgoR is an R package implementing a multi-objective optimization algorithm for disease subtype discovery based on a non-dominated sorting genetic algorithm (galgo). The galgo framework combines the strengths of clustering algorithms for grouping heterogeneous omics data with the search capabilities of genetic algorithms for feature selection and for determining the optimal number of clusters. The goal is to find features that maximize the survival difference between subtypes while keeping cluster consistency high.

Citation

GSgalgoR is covered in Galgo: A bi-objective evolutionary meta-heuristic identifies robust transcriptomic classifiers associated with patient outcome across multiple cancer types.

Please cite:

M E Guerrero-Gimenez, J M Fernandez-Muñoz, B J Lang, K M Holton, D R Ciocca, C A Catania, F C M Zoppino, Galgo: A bi-objective evolutionary meta-heuristic identifies robust transcriptomic classifiers associated with patient outcome across multiple cancer types, Bioinformatics, 2020, btaa619, https://doi.org/10.1093/bioinformatics/btaa619

Documentation

The full documentation of the package is available at https://harpomaxx.github.io/GSgalgoR/

Package Overview

The GSgalgoR package implements the Galgo algorithm as well as several helper functions for analyzing the results.

To standardize the structure of genomic data, the package uses the ExpressionSet structure from the Biobase package. ExpressionSet objects can hold different types of data in a single structure, but in this case we opted for a simplified format to make the example accessible to those not familiar with Biobase. An ExpressionSet consists mainly of a gene expression matrix, usually derived from microarray or RNA-seq experiments, and phenotypic data describing the samples (condition, status, treatment, survival, and other covariates). Annotations and feature metadata can also be included in the objects.
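As a rough illustration of such a simplified format, the sketch below builds a toy expression matrix and a matching phenotype table with base R and the survival package. The gene and sample names, the simulated values, and the layout (genes in rows, samples in columns) are assumptions made for this example; they are not taken from the package's data.

```r
library(survival)

set.seed(1)  # reproducible toy values

# Hypothetical toy data: 5 genes (rows) x 6 samples (columns)
expr <- matrix(rnorm(30), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("sample", 1:6)))

# Phenotype table with one row per sample, in the same order as the columns
pheno <- data.frame(
  time   = c(12, 30, 7, 45, 22, 60),  # follow-up time (arbitrary units)
  status = c(1, 0, 1, 0, 1, 0),       # 1 = event observed, 0 = censored
  row.names = colnames(expr)
)

# Survival object built from the phenotype table
OS <- Surv(time = pheno$time, event = pheno$status)

# Consistency check: samples must line up between both structures
stopifnot(identical(rownames(pheno), colnames(expr)))
```

Keeping the sample order identical in the matrix columns and the phenotype rows avoids silently pairing expression profiles with the wrong survival records.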

A complete and detailed explanation of galgo's workflow is provided in the example Vignette.

The package provides a simple but robust callback mechanism to adapt the algorithm to different needs (check the Vignette). Additionally, GSgalgoR provides Wilkerson's centroids to perform lung adenocarcinoma sample classification.

Installation

You can install GSgalgoR from GitHub using devtools with:

devtools::install_github("https://github.com/harpomaxx/GSgalgoR")
library(GSgalgoR)

Executing Galgo

The main function in the package is galgo(). The function accepts an expression matrix (in the format described in the previous section) and a survival object (created with the survival package) to find robust gene expression signatures related to a given outcome. In addition, galgo() accepts several other parameters, such as the number of solutions in the population, the number of generations the algorithm must evolve, and the distance function used by the clustering algorithm, among others. These parameters make it possible to adapt the setup to the characteristics of the analysis to be performed. The entire Galgo evolutionary process runs on a multicore architecture. Alternatively, to speed up the process, Galgo can be executed on graphics processing units (GPUs).

For rapid testing of GSgalgoR, two reduced lung adenocarcinoma gene expression datasets (TCGA and GSE68465) can be downloaded from https://bit.ly/luad_data_galgo (md5sum 900a74e7c4fdd0dcb7a3f2ddb44bb680).
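One way to use the published checksum is to compare it with the digest of the downloaded file before loading it. The sketch below relies on tools::md5sum() from base R; the helper name verify_md5 and the destination path are hypothetical and not part of GSgalgoR.

```r
# Hypothetical helper (not part of GSgalgoR): compare a file's MD5 digest
# with an expected value. Returns FALSE if the file is missing or differs.
verify_md5 <- function(path, expected) {
  actual <- unname(tools::md5sum(path))
  identical(tolower(actual), tolower(expected))
}

# Usage, assuming the file is first saved locally (requires network access):
# dest <- file.path(tempdir(), "luad.rds")
# download.file("https://bit.ly/luad_data_galgo", destfile = dest, mode = "wb")
# stopifnot(verify_md5(dest, "900a74e7c4fdd0dcb7a3f2ddb44bb680"))
```

A failed check usually points to an interrupted or text-mode download, so it is worth running before readRDS().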

An example of a typical GSgalgoR workflow is shown below:

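# Download the reduced LUAD datasets (binary mode keeps the .rds file intact on Windows)
download.file("https://bit.ly/luad_data_galgo", destfile = "/tmp/luad.rds", mode = "wb")
rna_luad <- readRDS("/tmp/luad.rds")

# Expression matrix and clinical data for the TCGA cohort
prm <- rna_luad$TCGA$expression_matrix
clinical <- rna_luad$TCGA$pheno_data
OS <- survival::Surv(time = clinical$time, event = clinical$status)

# Run Galgo for 5 generations with a population of 30 solutions, without GPU
output <- GSgalgoR::galgo(generations = 5, population = 30,
       prob_matrix = prm, OS = OS, verbose = 2, usegpu = FALSE)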

An example of the results obtained by Galgo on the TCGA dataset. The first plot shows the Pareto front obtained by GSgalgoR in terms of the survival (Surv.Fit) and cohesiveness (SC.Fit) fitness functions. The second plot shows the different survival subtypes found by the algorithm.
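To make the notion of a Pareto front concrete, the following base R sketch keeps only the non-dominated rows of a small fitness table. It is an illustration under stated assumptions, not the package's implementation: the function name non_dominated and the fitness values are hypothetical, and both objectives are assumed to be maximized.

```r
# Return TRUE for rows of `fitness` that no other row dominates.
# Row j dominates row i if it is >= in every objective and > in at least one.
non_dominated <- function(fitness) {
  n <- nrow(fitness)
  keep <- rep(TRUE, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i != j &&
          all(fitness[j, ] >= fitness[i, ]) &&
          any(fitness[j, ] > fitness[i, ])) {
        keep[i] <- FALSE
        break
      }
    }
  }
  keep
}

# Hypothetical fitness values for four candidate solutions
fitness <- rbind(
  s1 = c(SC.Fit = 0.10, Surv.Fit = 900),
  s2 = c(SC.Fit = 0.20, Surv.Fit = 700),
  s3 = c(SC.Fit = 0.15, Surv.Fit = 600),  # dominated by s2
  s4 = c(SC.Fit = 0.30, Surv.Fit = 400)
)

front <- fitness[non_dominated(fitness), , drop = FALSE]
rownames(front)  # "s1" "s2" "s4"
```

Each solution on the front trades cohesiveness against survival separation, so none of them can be improved in one objective without losing in the other.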

References

  • Guerrero-Gimenez ME, Fernandez-Muñoz JM, Lang BJ, Holton KM, Ciocca DR, Catania CA, Zoppino FCM. Galgo: A bi-objective evolutionary meta-heuristic identifies robust transcriptomic classifiers associated with patient outcome across multiple cancer types, Bioinformatics, 2020, btaa619, https://doi.org/10.1093/bioinformatics/btaa619
  • Guerrero-Gimenez ME, Catania CA, Fernandez-Muñoz JM et al. Genetic algorithm for the searching cancer subtypes with clinical significance according to their gene expression patterns, 9(ISCB Comm J 2018):664 (poster) (doi: 10.7490/f1000research.1118020.1)

Help Manual

Help page | Topics
GSgalgoR: A bi-objective evolutionary meta-heuristic to identify robust transcriptomic classifiers associated with patient outcome across multiple cancer types. | GSgalgoR-package, GSgalgoR
Functions to calculate distance matrices using cpu computing | calculate_distance, calculate_distance_euclidean_cpu, calculate_distance_pearson_cpu, calculate_distance_spearman_cpu, calculate_distance_uncentered_cpu, select_distance
Print basic info per generation | callback_base_report
A base callback function that returns a galgo.Obj | callback_base_return_pop
A default call_back function that does nothing. | callback_default
Print minimal information to the user about galgo execution. | callback_no_report
Classify samples from multiple centroids | classify_multiple
Wrapper function to perform partition around medioids (PAM) for GalgoR | cluster_algorithm
Distance to centroid classifier function | cluster_classify
Function for calculating the cosine similarity | cosine_similarity
Create Centroids | create_centroids
GSgalgoR main function | galgo
Galgo Object class | galgo.Obj, galgo.Obj-class
Function to calculate the centroids of different groups (classes) | k_centroids
Summary of the non dominated solutions | non_dominated_summary
Plot pareto front from an galgo.Obj | plot_pareto
Survival fitness function using the Restricted Mean Survival Time (RMST) of each group | surv_fitness
Convert galgo.Obj to data.frame | to_dataframe
Convert galgo.Obj to list | to_list
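
The surv_fitness entry above refers to the Restricted Mean Survival Time (RMST), which is the area under a group's survival curve up to a truncation time. The sketch below computes it for a single group from a Kaplan-Meier fit using the survival package. It is not the package's implementation; the helper name rmst and the truncation time tau are assumptions made for illustration.

```r
library(survival)

# Hypothetical helper: area under the Kaplan-Meier curve from 0 to tau
rmst <- function(time, status, tau) {
  fit <- survfit(Surv(time, status) ~ 1)
  before <- fit$time < tau
  t <- c(0, fit$time[before], tau)  # interval boundaries
  s <- c(1, fit$surv[before])       # survival level on each interval
  sum(diff(t) * s)                  # sum of rectangle areas
}

# Three events at times 1, 2 and 3, no censoring, truncated at tau = 3
rmst(time = c(1, 2, 3), status = c(1, 1, 1), tau = 3)  # 2
```

Comparing such per-group values gives one number that summarizes how far apart the survival curves of the subtypes are.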