Title: | SCFA: Subtyping via Consensus Factor Analysis |
---|---|
Description: | Subtyping via Consensus Factor Analysis (SCFA) can efficiently remove noisy signals from consistent molecular patterns in multi-omics data. SCFA first uses an autoencoder to select only important features and then repeatedly performs factor analysis to represent the data with different numbers of factors. Using these representations, it can reliably identify cancer subtypes and accurately predict risk scores of patients. |
Authors: | Duc Tran [aut, cre], Hung Nguyen [aut], Tin Nguyen [fnd] |
Maintainer: | Duc Tran <[email protected]> |
License: | LGPL |
Version: | 1.17.0 |
Built: | 2024-10-31 04:38:51 UTC |
Source: | https://github.com/bioc/SCFA |
GBM dataset, including microRNA and survival data.
GBM
A list with two items:
data: List containing the microRNA data matrix.
survival: Survival information.
The main function to perform subtyping. It takes a list of data matrices as input and outputs the subtype of each patient.
SCFA(dataList, k = NULL, max.k = 5, ncores = 10L, seed = NULL)
dataList |
List of data matrices. In each matrix, rows represent samples and columns represent genes/features. |
k |
Number of clusters. Leave as the default (NULL) for automatic detection. |
max.k |
Maximum number of clusters. |
ncores |
Number of processor cores to use. |
seed |
Seed for reproducibility. You still need to call set.seed beforehand for full reproducibility. |
A numeric vector containing cluster assignment for each sample.
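As a rough illustration of the expected input layout, the sketch below builds a dataList from two simulated matrices. The object names (expr, methyl, sample_ids) and the simulated values are hypothetical and not part of the package. The only facts taken from the argument description are that each matrix has samples in rows and features in columns. That all matrices must list the same samples in the same order is an assumption, made here because a single cluster label is returned per sample.

```r
set.seed(42)
n_samples <- 20
sample_ids <- paste0("patient_", seq_len(n_samples))

# Hypothetical inputs stored as features x samples (a common omics layout)
expr <- matrix(rnorm(50 * n_samples), nrow = 50,
               dimnames = list(paste0("gene_", 1:50), sample_ids))
methyl <- matrix(runif(30 * n_samples), nrow = 30,
                 dimnames = list(paste0("cpg_", 1:30), rev(sample_ids)))

# Transpose so that rows are samples and columns are features
dataList <- list(expr = t(expr), methyl = t(methyl))

# Put every matrix into one common sample order (assumed requirement)
dataList <- lapply(dataList, function(m) m[sample_ids, , drop = FALSE])

# Check that the row names now agree across all matrices
same_rows <- all(vapply(dataList,
                        function(m) identical(rownames(m), sample_ids),
                        logical(1)))
print(same_rows)
print(sapply(dataList, dim))
```

A list prepared this way has the shape that the dataList argument describes; whether SCFA itself checks sample order is not stated here, so aligning rows up front is the cautious choice.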
    library(SCFA)
    library(survival)

    # Load example data (GBM dataset)
    data("GBM")
    # List of one matrix (microRNA data)
    dataList <- GBM$data
    # Survival information
    survival <- GBM$survival

    # Generate the subtyping result
    set.seed(1)
    subtype <- SCFA(dataList, seed = 1, ncores = 2L)

    # Perform survival analysis on the result
    coxFit <- coxph(Surv(time = Survival, event = Death) ~ as.factor(subtype),
                    data = survival, ties = "exact")
    # Third element of sctest is the p-value of the score test
    coxP <- round(summary(coxFit)$sctest[3], digits = 20)
    print(coxP)
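The survival check used in the example can also be tried without the package. The sketch below is only an illustration on simulated data: the subtype vector is a hypothetical stand-in for the output of SCFA(), and the survival times and censoring indicators are randomly generated, not taken from GBM. It tabulates group sizes, runs a log-rank test with survdiff, and extracts the Cox score-test p-value by name rather than by position.

```r
library(survival)

set.seed(1)
n <- 60
# Hypothetical cluster labels standing in for the output of SCFA()
subtype <- sample(1:3, n, replace = TRUE)
# Simulated survival times whose hazard depends on the label
time <- rexp(n, rate = c(0.5, 1, 2)[subtype])
# 1 = death observed, 0 = censored
event <- rbinom(n, 1, 0.8)

# Number of samples per subtype
group_sizes <- table(subtype)

# Log-rank test across subtypes
lr <- survdiff(Surv(time, event) ~ as.factor(subtype))
lr_p <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

# Score test from a Cox model with subtype as a factor
fit <- coxph(Surv(time, event) ~ as.factor(subtype))
cox_p <- unname(summary(fit)$sctest["pvalue"])

print(group_sizes)
print(c(logrank = lr_p, cox_score = cox_p))
```

The sketch uses the default tie handling in coxph, whereas the package example sets ties = "exact"; with continuous simulated times the choice makes no practical difference.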
Perform risk score prediction on input data. This function requires training data with survival information. The output is the risk scores of the patients in the test set.
SCFA.class(dataListTrain, trainLabel, dataListTest, ncores = 10L, seed = NULL)
dataListTrain |
List of training data matrices. In each matrix, rows represent samples and columns represent genes/features. |
trainLabel |
Survival information of the patients in the training set, in the form of a Surv object. |
dataListTest |
List of testing data matrices. In each matrix, rows represent samples and columns represent genes/features. |
ncores |
Number of processor cores to use. |
seed |
Seed for reproducibility. You still need to call set.seed beforehand for full reproducibility. |
A vector of risk score predictions for the patients in the test set.
    library(SCFA)
    library(survival)

    # Load example data (GBM dataset)
    data("GBM")
    # List of one matrix (microRNA data)
    dataList <- GBM$data
    # Survival information
    survival <- GBM$survival

    # Split data into training and test sets
    set.seed(1)
    idx <- sample.int(nrow(dataList[[1]]), round(nrow(dataList[[1]]) / 2))
    # Survival time must be positive
    survival$Survival <- survival$Survival - min(survival$Survival) + 1

    trainList <- lapply(dataList, function(x) x[idx, ])
    trainSurvival <- Surv(time = survival[idx, ]$Survival, event = survival[idx, ]$Death)
    testList <- lapply(dataList, function(x) x[-idx, ])
    testSurvival <- Surv(time = survival[-idx, ]$Survival, event = survival[-idx, ]$Death)

    # Perform risk prediction
    result <- SCFA.class(trainList, trainSurvival, testList, seed = 1, ncores = 2L)

    # Validate using the concordance index
    c.index <- concordance(coxph(testSurvival ~ result))$concordance
    print(c.index)
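To make the validation step more concrete, the sketch below shows what the concordance index measures, using simulated data only. The risk vector is a hypothetical stand-in for the output of SCFA.class(), and the times and censoring indicators are randomly generated. It computes the index with concordance() on a Cox fit, as in the example, and then recounts it by hand: among pairs in which the shorter time is an observed event, it counts how often the patient with the shorter survival has the higher fitted linear predictor. The hand count assumes no tied survival times, which holds for continuous simulated values, and under that assumption the two numbers are expected to agree.

```r
library(survival)

set.seed(1)
n <- 40
risk <- rnorm(n)                          # hypothetical risk scores
time <- rexp(n, rate = exp(0.8 * risk))   # higher risk -> shorter survival
event <- rbinom(n, 1, 0.7)                # 1 = event observed, 0 = censored

# Concordance index computed from a Cox model fit
fit <- coxph(Surv(time, event) ~ risk)
c_pkg <- concordance(fit)$concordance

# Manual pairwise count on the model's linear predictor
lp <- unname(coef(fit)) * risk
concordant <- 0
comparable <- 0
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    first <- if (time[i] < time[j]) i else j
    other <- if (first == i) j else i
    # A pair is usable only if the shorter time is an observed event
    if (event[first] == 1) {
      comparable <- comparable + 1
      if (lp[first] > lp[other]) {
        concordant <- concordant + 1
      } else if (lp[first] == lp[other]) {
        concordant <- concordant + 0.5
      }
    }
  }
}
c_manual <- concordant / comparable

print(c(package = c_pkg, manual = c_manual))
```

A value near 0.5 means the scores order patients no better than chance, while values closer to 1 mean that higher scores go with shorter survival more consistently.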