Package 'SCFA'

Title: SCFA: Subtyping via Consensus Factor Analysis
Description: Subtyping via Consensus Factor Analysis (SCFA) can efficiently remove noisy signals from consistent molecular patterns in multi-omics data. SCFA first uses an autoencoder to select only important features and then repeatedly performs factor analysis to represent the data with different numbers of factors. Using these representations, it can reliably identify cancer subtypes and accurately predict risk scores of patients.
Authors: Duc Tran [aut, cre], Hung Nguyen [aut], Tin Nguyen [fnd]
Maintainer: Duc Tran <[email protected]>
License: LGPL
Version: 1.15.0
Built: 2024-06-30 03:58:56 UTC
Source: https://github.com/bioc/SCFA

Help Index


GBM

Description

GBM dataset, including microRNA and survival data.

Usage

GBM

Format

A list with two items:

data

A list containing the microRNA data matrix.

survival

Survival information.


SCFA

Description

The main function to perform subtyping. It takes a list of data matrices as input and outputs the subtype for each patient.

Usage

SCFA(dataList, k = NULL, max.k = 5, ncores = 10L, seed = NULL)

Arguments

dataList

List of data matrices. In each matrix, rows represent samples and columns represent genes/features.

k

Number of clusters. Leave as the default (NULL) for automatic detection.

max.k

Maximum number of clusters.

ncores

Number of processor cores to use.

seed

Seed for reproducibility. You still need to call set.seed() for full reproducibility.
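
As a minimal sketch of the layout described for dataList, the following builds a list of two synthetic matrices with samples in rows and features in columns. The matrix names (mRNA, miRNA), the dimensions, and the check that all matrices list the same samples in the same order are illustrative assumptions, not requirements stated by the package.

```r
# Synthetic data for illustration only; names and sizes are hypothetical.
set.seed(1)
n_samples <- 20
sample_ids <- paste0("patient", seq_len(n_samples))

mRNA <- matrix(rnorm(n_samples * 50), nrow = n_samples,
               dimnames = list(sample_ids, paste0("gene", 1:50)))
miRNA <- matrix(rnorm(n_samples * 30), nrow = n_samples,
                dimnames = list(sample_ids, paste0("mir", 1:30)))

# Rows are samples, columns are features, one matrix per data type.
dataList <- list(mRNA = mRNA, miRNA = miRNA)

# Assumed sanity check: every matrix lists the same samples in the same order.
same_samples <- all(vapply(dataList,
                           function(x) identical(rownames(x), sample_ids),
                           logical(1)))
print(same_samples)
print(vapply(dataList, ncol, integer(1)))
```

A list built this way has the same shape as GBM$data in the examples, except that it holds two matrices instead of one.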

Value

A numeric vector containing the cluster assignment for each sample.

Examples

#Load example data (GBM dataset)
data("GBM")
#List of one matrix (microRNA data)
dataList <- GBM$data
#Survival information
survival <- GBM$survival
library(survival)
#Generating subtyping result
set.seed(1)
subtype <- SCFA(dataList, seed = 1, ncores = 2L)
#Perform survival analysis on the result
coxFit <- coxph(Surv(time = Survival, event = Death) ~ as.factor(subtype), data = survival, ties="exact")
coxP <- round(summary(coxFit)$sctest[3],digits = 20)
print(coxP)
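
The cluster vector returned by SCFA can also be inspected directly. The sketch below is only an illustration: it uses a hypothetical subtype vector and a synthetic survival table (with the same column names, Survival and Death, as the GBM example) instead of real SCFA output, tabulates the cluster sizes, and runs a log-rank test with survdiff from the survival package as an alternative to the Cox score test used above.

```r
library(survival)

# Hypothetical stand-ins for the SCFA output and the survival table.
set.seed(1)
subtype <- rep(1:2, each = 15)
survival_df <- data.frame(
  Survival = c(rexp(15, rate = 1 / 300), rexp(15, rate = 1 / 900)),
  Death    = rbinom(30, size = 1, prob = 0.8)
)

# Number of patients assigned to each cluster.
print(table(subtype))

# Log-rank test comparing survival between the clusters.
logRank <- survdiff(Surv(Survival, Death) ~ as.factor(subtype),
                    data = survival_df)
print(logRank$chisq)
```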

SCFA.class

Description

Performs risk score prediction on input data. This function requires training data with survival information. The output is the risk scores of patients in the testing set.

Usage

SCFA.class(dataListTrain, trainLabel, dataListTest, ncores = 10L, seed = NULL)

Arguments

dataListTrain

List of training data matrices. In each matrix, rows represent samples and columns represent genes/features.

trainLabel

Survival information of the patients in the training set, in the form of a Surv object.

dataListTest

List of testing data matrices. In each matrix, rows represent samples and columns represent genes/features.

ncores

Number of processor cores to use.

seed

Seed for reproducibility. You still need to call set.seed() for full reproducibility.
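
For readers unfamiliar with Surv objects, here is a minimal sketch of how a value suitable for trainLabel might be constructed with the survival package. The follow-up table is hypothetical; its column names (Survival, Death) simply mirror those used in the GBM example.

```r
library(survival)

# Hypothetical follow-up data: time in days, status 1 = death, 0 = censored.
follow_up <- data.frame(
  Survival = c(120, 340, 560, 90, 410),
  Death    = c(1, 0, 1, 1, 0)
)

# Surv() pairs each follow-up time with its event indicator.
trainLabel <- Surv(time = follow_up$Survival, event = follow_up$Death)

print(trainLabel)        # censored observations are printed with a "+"
print(nrow(trainLabel))  # one row per patient
```

The rows of the Surv object are expected to correspond to the rows of each matrix in dataListTrain, as in the example where both are subset with the same index.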

Value

A vector of risk score predictions for the patients in the test set.

Examples

#Load example data (GBM dataset)
data("GBM")
#List of one matrix (microRNA data)
dataList <- GBM$data
#Survival information
survival <- GBM$survival
library(survival)
#Split data to train and test
set.seed(1)
idx <- sample.int(nrow(dataList[[1]]), round(nrow(dataList[[1]])/2) )
survival$Survival <- survival$Survival - min(survival$Survival) + 1 # Survival time must be positive
trainList <- lapply(dataList, function(x) x[idx, ] )
trainSurvival <- Surv(time = survival[idx,]$Survival, event =  survival[idx,]$Death)
testList <- lapply(dataList, function(x) x[-idx, ] )
testSurvival <- Surv(time = survival[-idx,]$Survival, event =  survival[-idx,]$Death)
#Perform risk prediction
result <- SCFA.class(trainList, trainSurvival, testList, seed = 1, ncores = 2L)
#Validation using concordance index
c.index <- concordance(coxph(testSurvival ~ result))$concordance
print(c.index)