Title: | Graphical Interface for Alternative Splicing Quantification, Analysis and Visualisation |
---|---|
Description: | Interactive R package with an intuitive Shiny-based graphical interface for alternative splicing quantification and integrative analyses of alternative splicing and gene expression based on The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression project (GTEx), Sequence Read Archive (SRA) and user-provided data. The tool interactively performs survival, dimensionality reduction and median- and variance-based differential splicing and gene expression analyses that benefit from the incorporation of clinical and molecular sample-associated features (such as tumour stage or survival). Interactive visual access to genomic mapping and functional annotation of selected alternative splicing events is also included. |
Authors: | Nuno Saraiva-Agostinho [aut, cre] , Nuno Luís Barbosa-Morais [aut, led, ths] , André Falcão [ths], Lina Gallego Paez [ctb], Marie Bordone [ctb], Teresa Maia [ctb], Mariana Ferreira [ctb], Ana Carolina Leote [ctb], Bernardo de Almeida [ctb] |
Maintainer: | Nuno Saraiva-Agostinho <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.33.0 |
Built: | 2024-10-31 03:50:43 UTC |
Source: | https://github.com/bioc/psichomics |
Print startup message
.onAttach(libname, pkgname)
.onAttach(libname, pkgname)
libname |
Character: library name |
pkgname |
Character: package name |
Startup message
Plot, print and display as table the results of gene expression and alternative splicing
## S3 method for class 'GEandAScorrelation' x[genes = NULL, ASevents = NULL] ## S3 method for class 'GEandAScorrelation' plot( x, autoZoom = FALSE, loessSmooth = TRUE, loessFamily = c("gaussian", "symmetric"), colour = "black", alpha = 0.2, size = 1.5, loessColour = "red", loessAlpha = 1, loessWidth = 0.5, fontSize = 12, ..., colourGroups = NULL, legend = FALSE, showAllData = TRUE, density = FALSE, densityColour = "blue", densityWidth = 0.5 ) ## S3 method for class 'GEandAScorrelation' print(x, ...) ## S3 method for class 'GEandAScorrelation' as.table(x, pvalueAdjust = "BH", ...)
## S3 method for class 'GEandAScorrelation' x[genes = NULL, ASevents = NULL] ## S3 method for class 'GEandAScorrelation' plot( x, autoZoom = FALSE, loessSmooth = TRUE, loessFamily = c("gaussian", "symmetric"), colour = "black", alpha = 0.2, size = 1.5, loessColour = "red", loessAlpha = 1, loessWidth = 0.5, fontSize = 12, ..., colourGroups = NULL, legend = FALSE, showAllData = TRUE, density = FALSE, densityColour = "blue", densityWidth = 0.5 ) ## S3 method for class 'GEandAScorrelation' print(x, ...) ## S3 method for class 'GEandAScorrelation' as.table(x, pvalueAdjust = "BH", ...)
x |
|
genes |
Character: genes |
ASevents |
Character: AS events |
autoZoom |
Boolean: automatically set the range of PSI values based on
available data? If |
loessSmooth |
Boolean: plot a smooth curve computed by
|
loessFamily |
Character: if |
colour |
Character: points' colour |
alpha |
Numeric: points' alpha |
size |
Numeric: points' size |
loessColour |
Character: loess line's colour |
loessAlpha |
Numeric: loess line's opacity |
loessWidth |
Numeric: loess line's width |
fontSize |
Numeric: plot font size |
... |
Arguments passed on to
|
colourGroups |
List of characters: sample colouring by group |
legend |
Boolean: show legend for sample colouring? |
showAllData |
Boolean: show data outside selected groups as a single
group (coloured based on the |
density |
Boolean: contour plot of a density estimate |
densityColour |
Character: line colour of contours |
densityWidth |
Numeric: line width of contours |
pvalueAdjust |
Character: method used to adjust p-values (see Details) |
The following methods for p-value adjustment are supported by using the
respective string in the pvalueAdjust
argument:
none
: do not adjust p-values
BH
: Benjamini-Hochberg's method (false discovery rate)
BY
: Benjamini-Yekutieli's method (false discovery rate)
bonferroni
: Bonferroni correction (family-wise error rate)
holm
: Holm's method (family-wise error rate)
hochberg
: Hochberg's method (family-wise error rate)
hommel
: Hommel's method (family-wise error rate)
Plots, summary tables or results of correlation analyses
Other functions to correlate gene expression and alternative splicing:
correlateGEandAS()
Other functions to correlate gene expression and alternative splicing:
correlateGEandAS()
annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) geneExpr <- readFile("ex_gene_expression.RDS") corr <- correlateGEandAS(geneExpr, psi, "ALDOA") # Quick display of the correlation results per splicing event and gene print(corr) # Table summarising the correlation analysis results as.table(corr) # Correlation analysis plots colourGroups <- list(Normal=paste("Normal", 1:3), Tumour=paste("Cancer", 1:3)) attr(colourGroups, "Colour") <- c(Normal="#00C65A", Tumour="#EEE273") plot(corr, colourGroups=colourGroups, alpha=1)
annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) geneExpr <- readFile("ex_gene_expression.RDS") corr <- correlateGEandAS(geneExpr, psi, "ALDOA") # Quick display of the correlation results per splicing event and gene print(corr) # Table summarising the correlation analysis results as.table(corr) # Correlation analysis plots colourGroups <- list(Normal=paste("Normal", 1:3), Tumour=paste("Cancer", 1:3)) attr(colourGroups, "Colour") <- c(Normal="#00C65A", Tumour="#EEE273") plot(corr, colourGroups=colourGroups, alpha=1)
Assign average sample values to their corresponding subjects
assignValuePerSubject( data, match, clinical = NULL, patients = NULL, samples = NULL )
assignValuePerSubject( data, match, clinical = NULL, patients = NULL, samples = NULL )
data |
One-row data frame/matrix or vector: values per sample for a single gene |
match |
Matrix: match between samples and subjects |
clinical |
Data frame or matrix: clinical dataset (only required if the
|
patients |
Character: subject identifiers (only required if the
|
samples |
Character: samples to use when assigning values per subject
(if |
Values per subject
Other functions to analyse survival:
getAttributesTime()
,
labelBasedOnCutoff()
,
optimalSurvivalCutoff()
,
plotSurvivalCurves()
,
plotSurvivalPvaluesByCutoff()
,
processSurvTerms()
,
survdiffTerms()
,
survfit.survTerms()
,
testSurvival()
# Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) # Match between subjects and samples match <- rep(paste("Subject", 1:3), 2) names(match) <- colnames(psi) # Assign PSI values to each subject based on the PSI of their samples assignValuePerSubject(psi[3, ], match)
# Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) # Match between subjects and samples match <- rep(paste("Subject", 1:3), 2) names(match) <- colnames(psi) # Assign PSI values to each subject based on the PSI of their samples assignValuePerSubject(psi[3, ], match)
Total contribution of a variable is calculated as per
((Cx * Ex) + (Cy * Ey))/(Ex + Ey)
, where:
Cx
and Cy
are the contributions of a variable to
principal components x
and y
Ex
and Ey
are the eigenvalues of principal components
x
and y
calculateLoadingsContribution(pca, pcX = 1, pcY = 2)
calculateLoadingsContribution(pca, pcX = 1, pcY = 2)
pca |
|
pcX |
Character: name of the X axis of interest from the PCA |
pcY |
Character: name of the Y axis of interest from the PCA |
Data frame containing the correlation between variables and selected principal components and the contribution of variables to the selected principal components (both individual and total contribution)
Other functions to analyse principal components:
performPCA()
,
plotPCA()
,
plotPCAvariance()
pca <- performPCA(USArrests) calculateLoadingsContribution(pca)
pca <- performPCA(USArrests) calculateLoadingsContribution(pca)
EList-class
objectSum columns using an EList-class
object
## S4 method for signature 'EList' colSums(x, na.rm = FALSE, dims = 1)
## S4 method for signature 'EList' colSums(x, na.rm = FALSE, dims = 1)
x |
an array of two or more dimensions, containing numeric,
complex, integer or logical values, or a numeric data frame. For
|
na.rm |
logical. Should missing values (including |
dims |
integer: Which dimensions are regarded as ‘rows’ or
‘columns’ to sum over. For |
Numeric vector with the sum of the columns
Convert gene identifiers
convertGeneIdentifiers( annotation, genes, key = "ENSEMBL", target = "SYMBOL", ignoreDuplicatedTargets = TRUE )
convertGeneIdentifiers( annotation, genes, key = "ENSEMBL", target = "SYMBOL", ignoreDuplicatedTargets = TRUE )
annotation |
|
genes |
Character: genes to be converted |
key |
Character: type of identifier used, e.g. |
target |
Character: type of identifier to convert to; read
|
ignoreDuplicatedTargets |
Boolean: if |
Character vector of the respective targets of gene identifiers. The
previous identifiers remain other identifiers have the same target (in case
ignoreDuplicatedTargets = TRUE
) or if no target was found.
Other functions for gene expression pre-processing:
filterGeneExpr()
,
normaliseGeneExpression()
,
plotGeneExprPerSample()
,
plotLibrarySize()
,
plotRowStats()
# Use species name to automatically look for a OrgDb database sp <- "Homo sapiens" genes <- c("ENSG00000012048", "ENSG00000083093", "ENSG00000141510", "ENSG00000051180") convertGeneIdentifiers(sp, genes) convertGeneIdentifiers(sp, genes, key="ENSEMBL", target="UNIPROT") # Alternatively, set the annotation database directly ah <- AnnotationHub::AnnotationHub() sp <- AnnotationHub::query(ah, c("OrgDb", "Homo sapiens"))[[1]] columns(sp) # these attributes can be used to change the attributes convertGeneIdentifiers(sp, genes) convertGeneIdentifiers(sp, genes, key="ENSEMBL", target="UNIPROT")
# Use species name to automatically look for a OrgDb database sp <- "Homo sapiens" genes <- c("ENSG00000012048", "ENSG00000083093", "ENSG00000141510", "ENSG00000051180") convertGeneIdentifiers(sp, genes) convertGeneIdentifiers(sp, genes, key="ENSEMBL", target="UNIPROT") # Alternatively, set the annotation database directly ah <- AnnotationHub::AnnotationHub() sp <- AnnotationHub::query(ah, c("OrgDb", "Homo sapiens"))[[1]] columns(sp) # these attributes can be used to change the attributes convertGeneIdentifiers(sp, genes) convertGeneIdentifiers(sp, genes, key="ENSEMBL", target="UNIPROT")
Test for association between paired samples' gene expression (for any genes of interest) and alternative splicing quantification.
correlateGEandAS(geneExpr, psi, gene, ASevents = NULL, ...)
correlateGEandAS(geneExpr, psi, gene, ASevents = NULL, ...)
geneExpr |
Matrix or data frame: gene expression data |
psi |
Matrix or data frame: alternative splicing quantification data |
gene |
Character: gene symbol for genes of interest |
ASevents |
Character: alternative splicing events to correlate with
gene expression of a gene (if |
... |
Extra parameters passed to cor.test |
List of correlations where each element contains:
eventID |
Alternative splicing event identifier |
cor |
Correlation between gene expression and alternative splicing quantification of one alternative splicing event |
geneExpr |
Gene expression for the selected gene |
psi |
Alternative splicing quantification for the alternative splicing event |
Other functions to correlate gene expression and alternative splicing:
[.GEandAScorrelation()
annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) geneExpr <- readFile("ex_gene_expression.RDS") correlateGEandAS(geneExpr, psi, "ALDOA")
annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) geneExpr <- readFile("ex_gene_expression.RDS") correlateGEandAS(geneExpr, psi, "ALDOA")
Elements are identified by their respective row name.
createGroupByAttribute(col, dataset)
createGroupByAttribute(col, dataset)
col |
Character: column name |
dataset |
Matrix or data frame: dataset |
Named list with each unique value from a given column and respective elements
Other functions for data grouping:
getGeneList()
,
getSampleFromSubject()
,
getSubjectFromSample()
,
groupPerElem()
,
plotGroupIndependence()
,
testGroupIndependence()
df <- data.frame(gender=c("male", "female"), stage=paste("stage", c(1, 3, 1, 4, 2, 3, 2, 2))) rownames(df) <- paste0("subject-", LETTERS[1:8]) createGroupByAttribute(col="stage", dataset=df)
df <- data.frame(gender=c("male", "female"), stage=paste("stage", c(1, 3, 1, 4, 2, 3, 2, 2))) rownames(df) <- paste0("subject-", LETTERS[1:8]) createGroupByAttribute(col="stage", dataset=df)
Perform statistical analyses
diffAnalyses( data, groups = NULL, analyses = c("wilcoxRankSum", "ttest", "kruskal", "levene", "fligner"), pvalueAdjust = "BH", geneExpr = NULL, inputID = "sparklineInput" )
diffAnalyses( data, groups = NULL, analyses = c("wilcoxRankSum", "ttest", "kruskal", "levene", "fligner"), pvalueAdjust = "BH", geneExpr = NULL, inputID = "sparklineInput" )
data |
Data frame or matrix: gene expression or alternative splicing quantification |
groups |
Named list of characters (containing elements belonging to each
group) or character vector (containing the group of each individual sample);
if |
analyses |
Character: statistical tests to perform (see Details) |
pvalueAdjust |
Character: method used to adjust p-values (see Details) |
geneExpr |
Character: name of the gene expression dataset (only required for density sparklines available in the interactive mode) |
inputID |
Character: identifier of input to get attributes of clicked event (Shiny only) |
The following statistical analyses may be performed simultaneously via the
analysis
argument:
ttest
- Unpaired t-test (2 groups)
wilcoxRankSum
- Wilcoxon Rank Sum test (2 groups)
kruskal
- Kruskal test (2 or more groups)
levene
- Levene's test (2 or more groups)
fligner
- Fligner-Killeen test (2 or more groups)
density
- Sample distribution per group (only usable
through the visual interface)
The following p-value adjustment methods are supported via the
pvalueAdjust
argument:
none
: do not adjust p-values
BH
: Benjamini-Hochberg's method (false discovery rate)
BY
: Benjamini-Yekutieli's method (false discovery rate)
bonferroni
: Bonferroni correction (family-wise error rate)
holm
: Holm's method (family-wise error rate)
hochberg
: Hochberg's method (family-wise error rate)
hommel
: Hommel's method (family-wise error rate)
Table of statistical analyses
Other functions to perform and plot differential analyses:
plotDistribution()
# Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events eventType <- c("SE", "MXE") annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) group <- c(rep("Normal", 3), rep("Tumour", 3)) diffAnalyses(psi, group)
# Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events eventType <- c("SE", "MXE") annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) group <- c(rep("Normal", 3), rep("Tumour", 3)) diffAnalyses(psi, group)
Remove alternative splicing quantification values based on coverage
discardLowCoveragePSIvalues( psi, minReads = 10, vasttoolsScoresToDiscard = c("VLOW", "N") )
discardLowCoveragePSIvalues( psi, minReads = 10, vasttoolsScoresToDiscard = c("VLOW", "N") )
psi |
Data frame or matrix: alternative splicing quantification |
minReads |
Currently this argument does nothing |
vasttoolsScoresToDiscard |
Character: if you are using inclusion levels
from VAST-TOOLS, filter the data based on quality scores for read coverage,
e.g. use |
Alternative splicing quantification data with missing values for any values with insufficient coverage
Convert from Ensembl to UniProt identifier
ensemblToUniprot(protein)
ensemblToUniprot(protein)
protein |
Character: Ensembl identifier |
UniProt protein identifier
Other functions to retrieve external information:
plotProtein()
,
plotTranscripts()
,
queryEnsemblByGene()
gene <- "ENSG00000173262" ensemblToUniprot(gene) protein <- "ENSP00000445929" ensemblToUniprot(protein)
gene <- "ENSG00000173262" ensemblToUniprot(gene) protein <- "ENSP00000445929" ensemblToUniprot(protein)
Uses filterByExpr
to determine genes with sufficiently
large counts to retain for statistical analysis.
filterGeneExpr( geneExpr, minMean = 0, maxMean = Inf, minVar = 0, maxVar = Inf, minCounts = 10, minTotalCounts = 15 )
filterGeneExpr( geneExpr, minMean = 0, maxMean = Inf, minVar = 0, maxVar = Inf, minCounts = 10, minTotalCounts = 15 )
geneExpr |
Data frame or matrix: gene expression |
minMean |
Numeric: minimum of read count mean per gene |
maxMean |
Numeric: maximum of read count mean per gene |
minVar |
Numeric: minimum of read count variance per gene |
maxVar |
Numeric: maximum of read count variance per gene |
minCounts |
Numeric: minimum number of read counts per gene for a
worthwhile number of samples (check |
minTotalCounts |
Numeric: minimum total number of read counts per gene |
Boolean vector indicating which genes have sufficiently large counts
Other functions for gene expression pre-processing:
convertGeneIdentifiers()
,
normaliseGeneExpression()
,
plotGeneExprPerSample()
,
plotLibrarySize()
,
plotRowStats()
geneExpr <- readFile("ex_gene_expression.RDS") # Add some genes with low expression geneExpr <- rbind(geneExpr, lowReadGene1=c(rep(4:5, 10)), lowReadGene2=c(rep(5:1, 10)), lowReadGene3=c(rep(10:1, 10)), lowReadGene4=c(rep(7:8, 10))) # Filter out genes with low reads across samples geneExpr[filterGeneExpr(geneExpr), ]
geneExpr <- readFile("ex_gene_expression.RDS") # Add some genes with low expression geneExpr <- rbind(geneExpr, lowReadGene1=c(rep(4:5, 10)), lowReadGene2=c(rep(5:1, 10)), lowReadGene3=c(rep(10:1, 10)), lowReadGene4=c(rep(7:8, 10))) # Filter out genes with low reads across samples geneExpr[filterGeneExpr(geneExpr), ]
Groups containing a number of non-missing values less than the threshold are discarded.
filterGroups(vector, group, threshold = 1)
filterGroups(vector, group, threshold = 1)
vector |
Character: elements |
group |
Character: respective group of each elements |
threshold |
Integer: number of valid non-missing values by group |
Named vector with filtered elements from valid groups. The group of the respective element is given as an attribute.
# Removes groups with less than two elements vec <- 1:6 names(vec) <- paste("sample", letters[1:6]) filterGroups(vec, c("A", "B", "B", "C", "D", "D"), threshold=2)
# Removes groups with less than two elements vec <- 1:6 names(vec) <- paste("sample", letters[1:6]) filterGroups(vec, c("A", "B", "B", "C", "D", "D"), threshold=2)
Filter alternative splicing quantification
filterPSI( psi, eventType = NULL, eventSubtype = NULL, minPSI = -Inf, maxPSI = Inf, minMedian = -Inf, maxMedian = Inf, minLogVar = -Inf, maxLogVar = Inf, minRange = -Inf, maxRange = Inf )
filterPSI( psi, eventType = NULL, eventSubtype = NULL, minPSI = -Inf, maxPSI = Inf, minMedian = -Inf, maxMedian = Inf, minLogVar = -Inf, maxLogVar = Inf, minRange = -Inf, maxRange = Inf )
psi |
Data frame or matrix: alternative splicing quantification |
eventType |
Character: filter data based on event type; check all event
types available by using |
eventSubtype |
Character: filter data based on event subtype; check all
event subtypes available in your data by using
|
minPSI |
Numeric: minimum PSI value |
maxPSI |
Numeric: maximum PSI value |
minMedian |
Numeric: minimum median PSI per splicing event |
maxMedian |
Numeric: maximum median PSI per splicing event |
minLogVar |
Numeric: minimum log10(PSI variance) per splicing event |
maxLogVar |
Numeric: maximum log10(PSI variance) per splicing event |
minRange |
Numeric: minimum PSI range across samples per splicing event |
maxRange |
Numeric: maximum PSI range across samples per splicing event |
Boolean vector indicating which splicing events pass the thresholds
Other functions for PSI quantification:
getSplicingEventTypes()
,
listSplicingAnnotations()
,
loadAnnotation()
,
plotRowStats()
,
quantifySplicing()
# Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) # Filter PSI psi[filterPSI(psi, minMedian=0.05, maxMedian=0.95, minRange=0.15), ]
# Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) # Filter PSI psi[filterPSI(psi, minMedian=0.05, maxMedian=0.95, minRange=0.15), ]
Get time values for given columns in a clinical dataset
getAttributesTime( clinical, event, timeStart, timeStop = NULL, followup = "days_to_last_followup" )
getAttributesTime( clinical, event, timeStart, timeStop = NULL, followup = "days_to_last_followup" )
clinical |
Data frame: clinical data |
event |
Character: name of column containing time of the event of interest |
timeStart |
Character: name of column containing starting time of the interval or follow up time |
timeStop |
Character: name of column containing ending time of the interval (only relevant for interval censoring) |
followup |
Character: name of column containing follow up time |
Data frame containing the time for the given columns
Other functions to analyse survival:
assignValuePerSubject()
,
labelBasedOnCutoff()
,
optimalSurvivalCutoff()
,
plotSurvivalCurves()
,
plotSurvivalPvaluesByCutoff()
,
processSurvTerms()
,
survdiffTerms()
,
survfit.survTerms()
,
testSurvival()
df <- data.frame(followup=c(200, 300, 400), death=c(NA, 300, NA)) rownames(df) <- paste("subject", 1:3) getAttributesTime(df, event="death", timeStart="death", followup="followup")
df <- data.frame(followup=c(200, 300, 400), death=c(NA, 300, NA)) rownames(df) <- paste("subject", 1:3) getAttributesTime(df, event="death", timeStart="death", followup="followup")
Get the path to the Downloads folder
getDownloadsFolder()
getDownloadsFolder()
Path to Downloads folder
Other functions associated with TCGA data retrieval:
getTCGAdataTypes()
,
isFirebrowseUp()
,
loadTCGAdata()
,
parseTCGAsampleTypes()
Other functions associated with GTEx data retrieval:
getGtexDataTypes()
,
getGtexTissues()
,
loadGtexData()
Other functions associated with SRA data retrieval:
loadSRAproject()
getDownloadsFolder()
getDownloadsFolder()
Available gene lists:
Sebestyen et al., 2016: 1350 genes encoding RNA-binding proteins, 167 of which are splicing factors
getGeneList(genes = NULL)
getGeneList(genes = NULL)
genes |
Vector of characters: intersect lists with given genes (lists with no matching genes will not be returned) |
List of genes
Other functions for data grouping:
createGroupByAttribute()
,
getSampleFromSubject()
,
getSubjectFromSample()
,
groupPerElem()
,
plotGroupIndependence()
,
testGroupIndependence()
getGeneList()
getGeneList()
Get GTEx data information
getGtexDataTypes() getGtexReleases()
getGtexDataTypes() getGtexReleases()
GTEx data information
Other functions associated with GTEx data retrieval:
getDownloadsFolder()
,
getGtexTissues()
,
loadGtexData()
getGtexDataTypes() getGtexReleases()
getGtexDataTypes() getGtexReleases()
Get GTEx tissues from given GTEx sample attributes
getGtexTissues(folder = getDownloadsFolder(), release = getGtexReleases()[[1]])
getGtexTissues(folder = getDownloadsFolder(), release = getGtexReleases()[[1]])
folder |
Character: folder containing data |
release |
Numeric: GTEx data release to load |
Character: available tissues
Other functions associated with GTEx data retrieval:
getDownloadsFolder()
,
getGtexDataTypes()
,
loadGtexData()
## Not run: getGtexTissues() ## End(Not run)
## Not run: getGtexTissues() ## End(Not run)
Get samples matching the given subjects
getSampleFromSubject( patients, samples, clinical = NULL, rm.NA = TRUE, match = NULL, showMatch = FALSE )
getSampleFromSubject( patients, samples, clinical = NULL, rm.NA = TRUE, match = NULL, showMatch = FALSE )
patients |
Character or list of characters: subject identifiers |
samples |
Character: sample identifiers |
clinical |
Data frame or matrix: clinical dataset |
rm.NA |
Boolean: remove missing values? |
match |
Integer: vector of subject index with the sample identifiers as name to save time (optional) |
showMatch |
Boolean: show matching subject index? |
Names of the matching samples (if showMatch = TRUE
,
a character with the subjects as values and their respective samples as names
is returned)
Other functions for data grouping:
createGroupByAttribute()
,
getGeneList()
,
getSubjectFromSample()
,
groupPerElem()
,
plotGroupIndependence()
,
testGroupIndependence()
subjects <- c("GTEX-ABC", "GTEX-DEF", "GTEX-GHI", "GTEX-JKL", "GTEX-MNO") samples <- paste0(subjects, "-sample") clinical <- data.frame(samples=samples) rownames(clinical) <- subjects getSampleFromSubject(subjects[c(1, 4)], samples, clinical)
subjects <- c("GTEX-ABC", "GTEX-DEF", "GTEX-GHI", "GTEX-JKL", "GTEX-MNO") samples <- paste0(subjects, "-sample") clinical <- data.frame(samples=samples) rownames(clinical) <- subjects getSampleFromSubject(subjects[c(1, 4)], samples, clinical)
Get splicing event information for given alternative splicing quantification data
getSplicingEventData(psi)
getSplicingEventData(psi)
psi |
Matrix or data frame: alternative splicing quantification data |
Matrix or data frame containing splicing event information for
alternative splicing events in psi
(if available)
Get alternative splicing events from genes or vice-versa
getSplicingEventFromGenes(genes, ASevents, data = NULL) getGenesFromSplicingEvents(ASevents, data = NULL)
getSplicingEventFromGenes(genes, ASevents, data = NULL) getGenesFromSplicingEvents(ASevents, data = NULL)
genes |
Character: gene symbols (or TCGA-styled gene symbols) |
ASevents |
Character: alternative splicing events |
data |
Matrix or data frame: alternative splicing information |
A list of alternative splicing events is required to run
getSplicingEventFromGenes
Named character containing alternative splicing events or genes and their respective genes or alternative splicing events as names (depending on the function in use)
ASevents <- c("SE_1_+_201763003_201763300_201763374_201763594_NAV1", "SE_1_+_183515472_183516238_183516387_183518343_SMG7", "SE_1_+_183441784_183471388_183471526_183481972_SMG7", "SE_1_+_181019422_181022709_181022813_181024361_MR1", "SE_1_+_181695298_181700311_181700367_181701520_CACNA1E") genes <- c("NAV1", "SMG7", "MR1", "HELLO") # Get splicing events from genes matchedASevents <- getSplicingEventFromGenes(genes, ASevents) # Names of matched events are the matching input genes names(matchedASevents) matchedASevents # Get genes from splicing events matchedGenes <- getGenesFromSplicingEvents (ASevents) # Names of matched genes are the matching input alternative splicing events names(matchedGenes) matchedGenes
ASevents <- c("SE_1_+_201763003_201763300_201763374_201763594_NAV1", "SE_1_+_183515472_183516238_183516387_183518343_SMG7", "SE_1_+_183441784_183471388_183471526_183481972_SMG7", "SE_1_+_181019422_181022709_181022813_181024361_MR1", "SE_1_+_181695298_181700311_181700367_181701520_CACNA1E") genes <- c("NAV1", "SMG7", "MR1", "HELLO") # Get splicing events from genes matchedASevents <- getSplicingEventFromGenes(genes, ASevents) # Names of matched events are the matching input genes names(matchedASevents) matchedASevents # Get genes from splicing events matchedGenes <- getGenesFromSplicingEvents (ASevents) # Names of matched genes are the matching input alternative splicing events names(matchedGenes) matchedGenes
Get supported splicing event types
getSplicingEventTypes(psi = NULL, acronymsAsNames = FALSE)
getSplicingEventTypes(psi = NULL, acronymsAsNames = FALSE)
psi |
Data frame or matrix: alternative splicing quantification data |
acronymsAsNames |
Boolean: return acronyms as names? |
Named character vector with splicing event types
Other functions for PSI quantification:
filterPSI()
,
listSplicingAnnotations()
,
loadAnnotation()
,
plotRowStats()
,
quantifySplicing()
getSplicingEventTypes()
getSplicingEventTypes()
Get subjects from given samples
getSubjectFromSample(sampleId, patientId = NULL, na = FALSE, sampleInfo = NULL)
getSubjectFromSample(sampleId, patientId = NULL, na = FALSE, sampleInfo = NULL)
sampleId |
Character: sample identifiers |
patientId |
Character: subject identifiers to filter by (optional; if a matrix or data frame is given, its rownames will be used to infer the subject identifiers) |
na |
Boolean: return |
sampleInfo |
Data frame or matrix: sample information containing the sample identifiers as rownames and a column named "Subject ID" with the respective subject identifiers |
Character: subject identifiers corresponding to the given samples
Other functions for data grouping:
createGroupByAttribute()
,
getGeneList()
,
getSampleFromSubject()
,
groupPerElem()
,
plotGroupIndependence()
,
testGroupIndependence()
samples <- paste0("GTEX-", c("ABC", "DEF", "GHI", "JKL", "MNO"), "-sample") getSubjectFromSample(samples) # Filter returned samples based on available subjects subjects <- paste0("GTEX-", c("DEF", "MNO")) getSubjectFromSample(samples, subjects)
samples <- paste0("GTEX-", c("ABC", "DEF", "GHI", "JKL", "MNO"), "-sample") getSubjectFromSample(samples) # Filter returned samples based on available subjects subjects <- paste0("GTEX-", c("DEF", "MNO")) getSubjectFromSample(samples, subjects)
Parameters obtained via FireBrowse
getTCGAdataTypes() getTCGAdates() getTCGAcohorts(cohort = NULL)
getTCGAdataTypes() getTCGAdates() getTCGAcohorts(cohort = NULL)
cohort |
Character: filter results by cohorts (optional) |
Parsed response
Other functions associated with TCGA data retrieval:
getDownloadsFolder()
,
isFirebrowseUp()
,
loadTCGAdata()
,
parseTCGAsampleTypes()
getTCGAdataTypes() if (isFirebrowseUp()) getTCGAdates() if (isFirebrowseUp()) getTCGAcohorts()
getTCGAdataTypes() if (isFirebrowseUp()) getTCGAdates() if (isFirebrowseUp()) getTCGAcohorts()
Assign one group to each element
groupPerElem(groups, elem = NULL, outerGroupName = NA)
groupPerElem(groups, elem = NULL, outerGroupName = NA)
groups |
List of integers: groups of elements |
elem |
Character: all elements available |
outerGroupName |
Character: name to give to outer group (if |
Character vector where each element corresponds to the group of the respective element
Other functions for data grouping:
createGroupByAttribute()
,
getGeneList()
,
getSampleFromSubject()
,
getSubjectFromSample()
,
plotGroupIndependence()
,
testGroupIndependence()
groups <- list(1:3, 4:7, 8:10) names(groups) <- paste("Stage", 1:3) groupPerElem(groups)
groups <- list(1:3, 4:7, 8:10) names(groups) <- paste("Stage", 1:3) groupPerElem(groups)
Check if FireBrowse API is running
isFirebrowseUp()
isFirebrowseUp()
Invisible TRUE
if the
FireBrowse API is working; otherwise,
raises a warning with the status code and a brief explanation.
Other functions associated with TCGA data retrieval:
getDownloadsFolder()
,
getTCGAdataTypes()
,
loadTCGAdata()
,
parseTCGAsampleTypes()
isFirebrowseUp()
isFirebrowseUp()
Label groups based on a given cutoff
labelBasedOnCutoff(data, cutoff, label = NULL, gte = TRUE)
labelBasedOnCutoff(data, cutoff, label = NULL, gte = TRUE)
data |
Numeric: test data |
cutoff |
Numeric: test cutoff |
label |
Character: label to prefix group names |
gte |
Boolean: test using greater than or equal than cutoff
( |
Labelled groups
Other functions to analyse survival:
assignValuePerSubject()
,
getAttributesTime()
,
optimalSurvivalCutoff()
,
plotSurvivalCurves()
,
plotSurvivalPvaluesByCutoff()
,
processSurvTerms()
,
survdiffTerms()
,
survfit.survTerms()
,
testSurvival()
labelBasedOnCutoff(data=c(1, 0, 0, 1, 0, 1), cutoff=0.5) labelBasedOnCutoff(data=c(1, 0, 0, 1, 0, 1), cutoff=0.5, "Ratio") # Use "greater than" instead of "greater than or equal to" labelBasedOnCutoff(data=c(1, 0, 0, 0.5, 0, 1), cutoff=0.5, gte=FALSE)
labelBasedOnCutoff(data=c(1, 0, 0, 1, 0, 1), cutoff=0.5) labelBasedOnCutoff(data=c(1, 0, 0, 1, 0, 1), cutoff=0.5, "Ratio") # Use "greater than" instead of "greater than or equal to" labelBasedOnCutoff(data=c(1, 0, 0, 0.5, 0, 1), cutoff=0.5, gte=FALSE)
List alternative splicing annotations
listSplicingAnnotations( species = NULL, assembly = NULL, date = NULL, cache = getAnnotationHubOption("CACHE"), group = FALSE )
listSplicingAnnotations( species = NULL, assembly = NULL, date = NULL, cache = getAnnotationHubOption("CACHE"), group = FALSE )
species |
Character: filter results by species (regular expression) |
assembly |
Character: filter results by assembly (regular expression) |
date |
Character: filter results by date (regular expression) |
cache |
Character: path to |
group |
Boolean: group values based on data provider? |
Named character vector with splicing annotation names
Other functions for PSI quantification:
filterPSI()
,
getSplicingEventTypes()
,
loadAnnotation()
,
plotRowStats()
,
quantifySplicing()
listSplicingAnnotations() # Return all alternative splicing annotations listSplicingAnnotations(assembly="hg19") # Search for hg19 annotation listSplicingAnnotations(assembly="hg38") # Search for hg38 annotation listSplicingAnnotations(date="201(7|8)") # Search for 2017 or 2018 annotation
listSplicingAnnotations() # Return all alternative splicing annotations listSplicingAnnotations(assembly="hg19") # Search for hg19 annotation listSplicingAnnotations(assembly="hg38") # Search for hg38 annotation listSplicingAnnotations(date="201(7|8)") # Search for 2017 or 2018 annotation
AnnotationHub
Load alternative splicing annotation from AnnotationHub
loadAnnotation(annotation, cache = getAnnotationHubOption("CACHE"))
loadAnnotation(annotation, cache = getAnnotationHubOption("CACHE"))
annotation |
Character: annotation to load |
cache |
Character: path to |
List of data frames containing the alternative splicing annotation per event type
Other functions for PSI quantification:
filterPSI()
,
getSplicingEventTypes()
,
listSplicingAnnotations()
,
plotRowStats()
,
quantifySplicing()
human <- listSplicingAnnotations(species="Homo sapiens")[[1]] ## Not run: annot <- loadAnnotation(human) ## End(Not run)
human <- listSplicingAnnotations(species="Homo sapiens")[[1]] ## Not run: annot <- loadAnnotation(human) ## End(Not run)
Download and load GTEx data
loadGtexData( folder = getDownloadsFolder(), data = getGtexDataTypes(), tissue = NULL, release = getGtexReleases()[[1]], progress = TRUE )
loadGtexData( folder = getDownloadsFolder(), data = getGtexDataTypes(), tissue = NULL, release = getGtexReleases()[[1]], progress = TRUE )
folder |
Character: folder containing data |
data |
Character: data types to load (see |
tissue |
Character: tissues to load (if |
release |
Numeric: GTEx data release to load |
progress |
Boolean: display progress? |
List with loaded data
Other functions associated with GTEx data retrieval:
getDownloadsFolder()
,
getGtexDataTypes()
,
getGtexTissues()
Other functions to load data:
loadLocalFiles()
,
loadSRAproject()
,
loadTCGAdata()
## Not run: # Download and load all available GTEx data data <- loadGtexData() # Download and load only junction quantification and sample info from GTEx getGtexDataTypes() data <- loadGtexData(data=c("sampleInfo", "junctionQuant")) # Download and load only data for specific tissues getGtexTissues() data <- loadGtexData(tissue=c("Stomach", "Small Intestine")) # Download and load data from a specific GTEx data release data <- loadGtexData(tissue=c("Stomach", "Small Intestine"), release=7) ## End(Not run)
## Not run: # Download and load all available GTEx data data <- loadGtexData() # Download and load only junction quantification and sample info from GTEx getGtexDataTypes() data <- loadGtexData(data=c("sampleInfo", "junctionQuant")) # Download and load only data for specific tissues getGtexTissues() data <- loadGtexData(tissue=c("Stomach", "Small Intestine")) # Download and load data from a specific GTEx data release data <- loadGtexData(tissue=c("Stomach", "Small Intestine"), release=7) ## End(Not run)
Load local files
loadLocalFiles( folder, ignore = c(".aux.", ".mage-tab."), name = "Data", verbose = FALSE )
loadLocalFiles( folder, ignore = c(".aux.", ".mage-tab."), name = "Data", verbose = FALSE )
folder |
Character: path to folder or ZIP archive |
ignore |
Character: skip folders and filenames that match the expression |
name |
Character: name |
verbose |
Boolean: print steps? |
List of data frames from valid files
Other functions to load data:
loadGtexData()
,
loadSRAproject()
,
loadTCGAdata()
## Not run: folder <- "~/Downloads/ACC 2016" data <- loadLocalFiles(folder) ignore <- c(".aux.", ".mage-tab.", "junction quantification") loadLocalFiles(folder, ignore) ## End(Not run)
## Not run: folder <- "~/Downloads/ACC 2016" data <- loadLocalFiles(folder) ignore <- c(".aux.", ".mage-tab.", "junction quantification") loadLocalFiles(folder, ignore) ## End(Not run)
Download and load SRA projects via recount2
loadSRAproject(project, outdir = getDownloadsFolder())
loadSRAproject(project, outdir = getDownloadsFolder())
project |
Character: SRA project identifiers (check
|
outdir |
Character: directory to store the downloaded files |
List with loaded projects
Other functions associated with SRA data retrieval:
getDownloadsFolder()
Other functions to load data:
loadGtexData()
,
loadLocalFiles()
,
loadTCGAdata()
## Not run: View(recount::recount_abstract) sra <- loadSRAproject("SRP053101") names(sra) names(sra[[1]]) ## End(Not run)
## Not run: View(recount::recount_abstract) sra <- loadSRAproject("SRP053101") names(sra) names(sra[[1]]) ## End(Not run)
TCGA data obtained via FireBrowse
loadTCGAdata( folder = getDownloadsFolder(), data = c("clinical", "junction_quantification", "RSEM_genes"), exclude = c(".aux.", ".mage-tab.", "MANIFEST.txt"), ..., download = TRUE )
loadTCGAdata( folder = getDownloadsFolder(), data = c("clinical", "junction_quantification", "RSEM_genes"), exclude = c(".aux.", ".mage-tab.", "MANIFEST.txt"), ..., download = TRUE )
folder |
Character: directory to store the downloaded archives (by
default, saves to |
data |
Character: data to load (see |
exclude |
Character: files and folders to exclude from downloading and
from loading into R (by default, exclude files containing |
... |
Arguments passed on to
|
download |
Boolean: download missing files |
A list with the loaded data, unless required files are unavailable
and download = FALSE
(if so, it returns the URL of files to download)
Other functions associated with TCGA data retrieval:
getDownloadsFolder()
,
getTCGAdataTypes()
,
isFirebrowseUp()
,
parseTCGAsampleTypes()
Other functions to load data:
loadGtexData()
,
loadLocalFiles()
,
loadSRAproject()
getTCGAcohorts() getTCGAdataTypes() ## Not run: loadTCGAdata(cohort = "ACC", data_type = "Clinical") ## End(Not run)
getTCGAcohorts() getTCGAdataTypes() ## Not run: loadTCGAdata(cohort = "ACC", data_type = "Clinical") ## End(Not run)
Gene expression is filtered and normalised in the following steps:
Filter gene expression;
Normalise gene expression with calcNormFactors
;
If performVoom = FALSE
, compute counts per million (CPM) using
cpm
and log2-transform values if
log2transform = TRUE
;
If performVoom = TRUE
, use voom
to compute
log2-CPM, quantile-normalise (if method = "quantile"
) and estimate
mean-variance relationship to calculate observation-level weights.
normaliseGeneExpression( geneExpr, geneFilter = NULL, method = "TMM", p = 0.75, log2transform = TRUE, priorCount = 0.25, performVoom = FALSE ) normalizeGeneExpression( geneExpr, geneFilter = NULL, method = "TMM", p = 0.75, log2transform = TRUE, priorCount = 0.25, performVoom = FALSE )
normaliseGeneExpression( geneExpr, geneFilter = NULL, method = "TMM", p = 0.75, log2transform = TRUE, priorCount = 0.25, performVoom = FALSE ) normalizeGeneExpression( geneExpr, geneFilter = NULL, method = "TMM", p = 0.75, log2transform = TRUE, priorCount = 0.25, performVoom = FALSE )
geneExpr |
Matrix or data frame: gene expression |
geneFilter |
Boolean: filtered genes (if |
method |
Character: normalisation method, including |
p |
numeric value between 0 and 1 specifying which quantile of the counts should be used by |
log2transform |
Boolean: perform log2-transformation? |
priorCount |
Average count to add to each observation to avoid zeroes after log-transformation |
performVoom |
Boolean: perform mean-variance modelling
(using |
edgeR::calcNormFactors
will be used to normalise gene
expression if method
is TMM
, RLE
, upperquartile
or none
. If performVoom = TRUE
, voom
will
only normalise if method = "quantile"
.
Available normalisation methods:
TMM
is recommended for most RNA-seq data where more than half of
the genes are believed not differentially expressed between any pair of
samples;
RLE
calculates the median library from the geometric mean of all
columns and the median ratio of each sample to the median library is taken as
the scale factor;
upperquartile
calculates the scale factors from a given quantile
of the counts for each library, after removing genes with zero counts in all
libraries;
quantile
forces the entire empirical distribution of each
column to be identical (only performed if performVoom = TRUE
).
Filtered and normalised gene expression
Other functions for gene expression pre-processing:
convertGeneIdentifiers()
,
filterGeneExpr()
,
plotGeneExprPerSample()
,
plotLibrarySize()
,
plotRowStats()
geneExpr <- readFile("ex_gene_expression.RDS") normaliseGeneExpression(geneExpr)
geneExpr <- readFile("ex_gene_expression.RDS") normaliseGeneExpression(geneExpr)
Uses stats::optim
with the Brent method to test multiple cutoffs and
to find the minimum log-rank p-value.
optimalSurvivalCutoff( clinical, data, censoring, event, timeStart, timeStop = NULL, followup = "days_to_last_followup", session = NULL, filter = TRUE, survTime = NULL, lower = NULL, upper = NULL )
optimalSurvivalCutoff( clinical, data, censoring, event, timeStart, timeStop = NULL, followup = "days_to_last_followup", session = NULL, filter = TRUE, survTime = NULL, lower = NULL, upper = NULL )
clinical |
Data frame: clinical data |
data |
Numeric: data values |
censoring |
Character: censor using |
event |
Character: name of column containing time of the event of interest |
timeStart |
Character: name of column containing starting time of the interval or follow up time |
timeStop |
Character: name of column containing ending time of the interval (only relevant for interval censoring) |
followup |
Character: name of column containing follow up time |
session |
Shiny session (only used for the visual interface) |
filter |
Boolean or numeric: elements to use (all are used by default) |
survTime |
|
lower , upper
|
Bounds in which to search (if |
List containing the optimal cutoff (par
) and the corresponding
p-value (value
)
Other functions to analyse survival:
assignValuePerSubject()
,
getAttributesTime()
,
labelBasedOnCutoff()
,
plotSurvivalCurves()
,
plotSurvivalPvaluesByCutoff()
,
processSurvTerms()
,
survdiffTerms()
,
survfit.survTerms()
,
testSurvival()
clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male NA 1355 ii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") timeStart <- "days_to_death" event <- "days_to_death" psi <- c(0.1, 0.2, 0.9, 1, 0.2, 0.6) opt <- optimalSurvivalCutoff(clinical, psi, "right", event, timeStart)
clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male NA 1355 ii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") timeStart <- "days_to_death" event <- "days_to_death" psi <- c(0.1, 0.2, 0.9, 1, 0.2, 0.6) opt <- optimalSurvivalCutoff(clinical, psi, "right", event, timeStart)
Retrieve elements grouped by their unique group based on each categorical column
parseCategoricalGroups(df)
parseCategoricalGroups(df)
df |
Data frame |
List of lists containing values based on rownames of df
testGroupIndependence()
and
plotGroupIndependence()
df <- data.frame("race"=c("caucasian", "caucasian", "asian"), "gender"=c("male", "female", "male")) rownames(df) <- paste("subject", 1:3) parseCategoricalGroups(df)
df <- data.frame("race"=c("caucasian", "caucasian", "asian"), "gender"=c("male", "female", "male")) rownames(df) <- paste("subject", 1:3) parseCategoricalGroups(df)
Parse alternative splicing event identifier
parseSplicingEvent( event, char = FALSE, pretty = FALSE, extra = NULL, coords = FALSE, data = NULL )
parseSplicingEvent( event, char = FALSE, pretty = FALSE, extra = NULL, coords = FALSE, data = NULL )
event |
Character: event identifier |
char |
Boolean: return character vector instead of list with parsed values? |
pretty |
Boolean: return a prettier name of the event identifier? |
extra |
Character: extra information to add (such as species and
assembly version); only used if |
coords |
Boolean: display extra coordinates regarding the alternative
and constitutive regions of alternative splicing events? Only used if
|
data |
Matrix or data frame: alternative splicing information |
Data.frame containing type of event, chromosome, strand, gene and position of alternative splicing events or character with that same information (depending on what is available)
events <- c( "A3SS_15_+_63353138_63353912_63353397_TPM1", "A3SS_11_-_61118463_61117115_61117894_CYB561A3", "A5SS_21_+_48055675_48056459_48056808_PRMT2", "A5SS_1_-_1274742_1274667_1274033_DVL1", "AFE_9_+_131902430_131901928_131904724_PPP2R4", "AFE_5_-_134686513_134688636_134681747_H2AFY", "ALE_12_+_56554104_56554410_56555171_MYL6", "ALE_8_-_38314874_38287466_38285953_FGFR1", "SE_9_+_6486925_6492303_6492401_6493826_UHRF2", "SE_19_-_5218431_5216778_5216731_5215606_PTPRS", "MXE_15_+_63335142_63335905_63336030_63336226_63336351_63349184_TPM1", "MXE_17_-_74090495_74087316_74087224_74086478_74086410_74085401_EXOC7") parseSplicingEvent(events)
events <- c( "A3SS_15_+_63353138_63353912_63353397_TPM1", "A3SS_11_-_61118463_61117115_61117894_CYB561A3", "A5SS_21_+_48055675_48056459_48056808_PRMT2", "A5SS_1_-_1274742_1274667_1274033_DVL1", "AFE_9_+_131902430_131901928_131904724_PPP2R4", "AFE_5_-_134686513_134688636_134681747_H2AFY", "ALE_12_+_56554104_56554410_56555171_MYL6", "ALE_8_-_38314874_38287466_38285953_FGFR1", "SE_9_+_6486925_6492303_6492401_6493826_UHRF2", "SE_19_-_5218431_5216778_5216731_5215606_PTPRS", "MXE_15_+_63335142_63335905_63336030_63336226_63336351_63349184_TPM1", "MXE_17_-_74090495_74087316_74087224_74086478_74086410_74085401_EXOC7") parseSplicingEvent(events)
Parse events from alternative splicing annotation
parseSuppaAnnotation( folder, types = c("SE", "AF", "AL", "MX", "A5", "A3", "RI"), genome = "hg19" ) parseVastToolsAnnotation( folder, types = c("ALT3", "ALT5", "COMBI", "IR", "MERGE3m", "MIC", "EXSK", "MULTI"), genome = "Hsa", complexEvents = FALSE ) parseMisoAnnotation( folder, types = c("SE", "AFE", "ALE", "MXE", "A5SS", "A3SS", "RI", "TandemUTR"), genome = "hg19" ) parseMatsAnnotation( folder, types = c("SE", "AFE", "ALE", "MXE", "A5SS", "A3SS", "RI"), genome = "fromGTF", novelEvents = TRUE )
parseSuppaAnnotation( folder, types = c("SE", "AF", "AL", "MX", "A5", "A3", "RI"), genome = "hg19" ) parseVastToolsAnnotation( folder, types = c("ALT3", "ALT5", "COMBI", "IR", "MERGE3m", "MIC", "EXSK", "MULTI"), genome = "Hsa", complexEvents = FALSE ) parseMisoAnnotation( folder, types = c("SE", "AFE", "ALE", "MXE", "A5SS", "A3SS", "RI", "TandemUTR"), genome = "hg19" ) parseMatsAnnotation( folder, types = c("SE", "AFE", "ALE", "MXE", "A5SS", "A3SS", "RI"), genome = "fromGTF", novelEvents = TRUE )
folder |
Character: path to folder |
types |
Character: type of events to retrieve (depends on the program of origin; see details) |
genome |
Character: genome of interest (for instance, |
complexEvents |
Boolean: should complex events in A3SS and A5SS be parsed? |
novelEvents |
Boolean: parse events detected due to novel splice sites |
Type of parsable events:
Alternative 3' splice site
Alternative 5' splice site
Alternative first exon
Alternative last exon
Skipped exon (may include skipped micro-exons)
Mutually exclusive exon
Retained intron
Tandem UTR
Retrieve data frame with events based on a given alternative splicing annotation
Other functions to prepare alternative splicing annotations:
prepareAnnotationFromEvents()
# Load sample files folder <- "extdata/eventsAnnotSample/suppa_output/suppaEvents" suppaOutput <- system.file(folder, package="psichomics") suppa <- parseSuppaAnnotation(suppaOutput) # Load sample files folder <- "extdata/eventsAnnotSample/VASTDB/Hsa/TEMPLATES" vastToolsOutput <- system.file(folder, package="psichomics") vast <- parseVastToolsAnnotation(vastToolsOutput) # Load sample files folder <- "extdata/eventsAnnotSample/miso_annotation" misoOutput <- system.file(folder, package="psichomics") miso <- parseMisoAnnotation(misoOutput) # Load sample files folder <- "extdata/eventsAnnotSample/mats_output/ASEvents" matsOutput <- system.file(folder, package="psichomics") mats <- parseMatsAnnotation(matsOutput) # Do not parse novel events mats <- parseMatsAnnotation(matsOutput, novelEvents=FALSE)
# Load sample files folder <- "extdata/eventsAnnotSample/suppa_output/suppaEvents" suppaOutput <- system.file(folder, package="psichomics") suppa <- parseSuppaAnnotation(suppaOutput) # Load sample files folder <- "extdata/eventsAnnotSample/VASTDB/Hsa/TEMPLATES" vastToolsOutput <- system.file(folder, package="psichomics") vast <- parseVastToolsAnnotation(vastToolsOutput) # Load sample files folder <- "extdata/eventsAnnotSample/miso_annotation" misoOutput <- system.file(folder, package="psichomics") miso <- parseMisoAnnotation(misoOutput) # Load sample files folder <- "extdata/eventsAnnotSample/mats_output/ASEvents" matsOutput <- system.file(folder, package="psichomics") mats <- parseMatsAnnotation(matsOutput) # Do not parse novel events mats <- parseMatsAnnotation(matsOutput, novelEvents=FALSE)
Parse sample information from TCGA sample identifiers
parseTCGAsampleTypes( samples, filename = system.file("extdata", "TCGAsampleType.RDS", package = "psichomics") ) parseTCGAsampleInfo(samples, match = NULL)
parseTCGAsampleTypes( samples, filename = system.file("extdata", "TCGAsampleType.RDS", package = "psichomics") ) parseTCGAsampleInfo(samples, match = NULL)
samples |
Character: sample identifiers |
filename |
Character: path to RDS file containing corresponding types |
match |
Integer: match between samples and subjects ( |
Metadata associated with each TCGA sample
Other functions associated with TCGA data retrieval:
getDownloadsFolder()
,
getTCGAdataTypes()
,
isFirebrowseUp()
,
loadTCGAdata()
parseTCGAsampleTypes(c("TCGA-01A-Tumour", "TCGA-10B-Normal")) samples <- c("TCGA-3C-AAAU-01A-11R-A41B-07", "TCGA-3C-AALI-01A-11R-A41B-07", "TCGA-3C-AALJ-01A-31R-A41B-07", "TCGA-3C-AALK-01A-11R-A41B-07", "TCGA-4H-AAAK-01A-12R-A41B-07", "TCGA-5L-AAT0-01A-12R-A41B-07") parseTCGAsampleInfo(samples)
parseTCGAsampleTypes(c("TCGA-01A-Tumour", "TCGA-10B-Normal")) samples <- c("TCGA-3C-AAAU-01A-11R-A41B-07", "TCGA-3C-AALI-01A-11R-A41B-07", "TCGA-3C-AALJ-01A-31R-A41B-07", "TCGA-3C-AALK-01A-11R-A41B-07", "TCGA-4H-AAAK-01A-12R-A41B-07", "TCGA-5L-AAT0-01A-12R-A41B-07") parseTCGAsampleInfo(samples)
Perform independent component analysis after processing missing values
performICA( data, n.comp = min(5, ncol(data)), center = TRUE, scale. = FALSE, missingValues = round(0.05 * nrow(data)), alg.typ = c("parallel", "defaltion"), fun = c("logcosh", "exp"), alpha = 1, ... )
performICA( data, n.comp = min(5, ncol(data)), center = TRUE, scale. = FALSE, missingValues = round(0.05 * nrow(data)), alg.typ = c("parallel", "defaltion"), fun = c("logcosh", "exp"), alpha = 1, ... )
data |
an optional data frame (or similar: see
|
n.comp |
number of components to be extracted |
center |
a logical value indicating whether the variables
should be shifted to be zero centered. Alternately, a vector of
length equal the number of columns of |
scale. |
a logical value indicating whether the variables should
be scaled to have unit variance before the analysis takes
place. The default is |
missingValues |
Integer: number of tolerated missing values per column to be replaced with the mean of the values of that same column |
alg.typ |
if |
fun |
the functional form of the |
alpha |
constant in range [1, 2] used in approximation to
neg-entropy when |
... |
Arguments passed on to |
ICA result in a prcomp
object
Other functions to analyse independent components:
plotICA()
performICA(USArrests)
performICA(USArrests)
Perform principal component analysis after processing missing values
performPCA( data, center = TRUE, scale. = FALSE, missingValues = round(0.05 * nrow(data)), ... )
performPCA( data, center = TRUE, scale. = FALSE, missingValues = round(0.05 * nrow(data)), ... )
data |
an optional data frame (or similar: see
|
center |
a logical value indicating whether the variables
should be shifted to be zero centered. Alternately, a vector of
length equal the number of columns of |
scale. |
a logical value indicating whether the variables should
be scaled to have unit variance before the analysis takes
place. The default is |
missingValues |
Integer: number of tolerated missing values per column to be replaced with the mean of the values of that same column |
... |
Arguments passed on to |
PCA result in a prcomp
object
Other functions to analyse principal components:
calculateLoadingsContribution()
,
plotPCA()
,
plotPCAvariance()
performPCA(USArrests)
performPCA(USArrests)
The tooltip shows the median, variance, maximum, minimum and number of non-NA samples of each data series, as well as sample names if available.
plotDistribution( data, groups = NULL, rug = length(data) < 500, vLine = TRUE, ..., title = NULL, subtitle = NULL, type = c("density", "boxplot", "violin"), invertAxes = FALSE, psi = NULL, rugLabels = FALSE, rugLabelsRotation = 0, legend = TRUE, valueLabel = NULL )
plotDistribution( data, groups = NULL, rug = length(data) < 500, vLine = TRUE, ..., title = NULL, subtitle = NULL, type = c("density", "boxplot", "violin"), invertAxes = FALSE, psi = NULL, rugLabels = FALSE, rugLabelsRotation = 0, legend = TRUE, valueLabel = NULL )
data |
Numeric, data frame or matrix: gene expression data or
alternative splicing event quantification values (sample names are based on
their |
groups |
List of sample names or vector containing the group name per
|
rug |
Boolean: show rug plot? |
vLine |
Boolean: plot vertical lines (including descriptive statistics for each group)? |
... |
Arguments passed on to
|
title |
Character: plot title |
subtitle |
Character: plot subtitle |
type |
Character: |
invertAxes |
Boolean: plot X axis as Y and vice-versa? |
psi |
Boolean: are |
rugLabels |
Boolean: plot sample names in the rug? |
rugLabelsRotation |
Numeric: rotation (in degrees) of rug labels; this
may present issues at different zoom levels and depending on the proximity
of |
legend |
Boolean: show legend? |
valueLabel |
Character: label for the value (by default, either
|
Argument groups
can be either:
a list of sample names, e.g.
list("Group 1"=c("Sample A", "Sample B"), "Group 2"=c("Sample C")))
a character vector with the same length as data
, e.g.
c("Sample A", "Sample C", "Sample B")
.
highchart
object with density plot
Other functions to perform and plot differential analyses:
diffAnalyses()
data <- sample(20, rep=TRUE)/20 groups <- paste("Group", c(rep("A", 10), rep("B", 10))) names(data) <- paste("Sample", seq(data)) plotDistribution(data, groups) # Using colours attr(groups, "Colour") <- c("Group A"="pink", "Group B"="orange") plotDistribution(data, groups)
data <- sample(20, rep=TRUE)/20 groups <- paste("Group", c(rep("A", 10), rep("B", 10))) names(data) <- paste("Sample", seq(data)) plotDistribution(data, groups) # Using colours attr(groups, "Colour") <- c("Group A"="pink", "Group B"="orange") plotDistribution(data, groups)
Plot distribution of gene expression per sample
plotGeneExprPerSample(geneExpr, ...)
plotGeneExprPerSample(geneExpr, ...)
geneExpr |
Data frame or matrix: gene expression |
... |
Arguments passed on to
|
Gene expression distribution plots
Other functions for gene expression pre-processing:
convertGeneIdentifiers()
,
filterGeneExpr()
,
normaliseGeneExpression()
,
plotLibrarySize()
,
plotRowStats()
df <- data.frame(geneA=c(2, 4, 5), geneB=c(20, 3, 5), geneC=c(5, 10, 21)) colnames(df) <- paste("Sample", 1:3) plotGeneExprPerSample(df)
df <- data.frame(geneA=c(2, 4, 5), geneB=c(20, 3, 5), geneC=c(5, 10, 21)) colnames(df) <- paste("Sample", 1:3) plotGeneExprPerSample(df)
-log10(p-values)
of the results obtained after multiple group
independence testingPlot -log10(p-values)
of the results obtained after multiple group
independence testing
plotGroupIndependence( groups, top = 50, textSize = 10, colourLow = "lightgrey", colourMid = "blue", colourHigh = "orange", colourMidpoint = 150 )
plotGroupIndependence( groups, top = 50, textSize = 10, colourLow = "lightgrey", colourMid = "blue", colourHigh = "orange", colourMidpoint = 150 )
groups |
|
top |
Integer: number of attributes to render |
textSize |
Integer: size of the text |
colourLow |
Character: name or HEX code of colour for lower values |
colourMid |
Character: name or HEX code of colour for middle values |
colourHigh |
Character: name or HEX code of colour for higher values |
colourMidpoint |
Numeric: midpoint to identify middle values |
ggplot
object
parseCategoricalGroups()
and
testGroupIndependence()
Other functions for data grouping:
createGroupByAttribute()
,
getGeneList()
,
getSampleFromSubject()
,
getSubjectFromSample()
,
groupPerElem()
,
testGroupIndependence()
elements <- paste("subjects", 1:50) ref <- elements[10:50] groups <- list(race=list(asian=elements[1:3], white=elements[4:7], black=elements[8:10]), region=list(european=elements[c(4, 5, 9)], african=elements[c(6:8, 10:50)])) groupTesting <- testGroupIndependence(ref, groups, elements) plotGroupIndependence(groupTesting)
elements <- paste("subjects", 1:50) ref <- elements[10:50] groups <- list(race=list(asian=elements[1:3], white=elements[4:7], black=elements[8:10]), region=list(european=elements[c(4, 5, 9)], african=elements[c(6:8, 10:50)])) groupTesting <- testGroupIndependence(ref, groups, elements) plotGroupIndependence(groupTesting)
Create multiple scatterplots from ICA
plotICA(ica, components = seq(10), groups = NULL, ...)
plotICA(ica, components = seq(10), groups = NULL, ...)
ica |
Object resulting from |
components |
Numeric: independent components to plot |
groups |
Matrix: groups to plot indicating the index of interest of the samples (use clinical or sample groups) |
... |
Arguments passed on to
|
Multiple scatterplots as a pairsD3
object
Other functions to analyse independent components:
performICA()
data <- scale(USArrests) ica <- fastICA::fastICA(data, n.comp=4) plotICA(ica) # Colour by groups groups <- NULL groups$sunny <- c("California", "Hawaii", "Florida") groups$ozEntrance <- c("Kansas") groups$novel <- c("New Mexico", "New York", "New Hampshire", "New Jersey") plotICA(ica, groups=groups)
data <- scale(USArrests) ica <- fastICA::fastICA(data, n.comp=4) plotICA(ica) # Colour by groups groups <- NULL groups$sunny <- c("California", "Hawaii", "Florida") groups$ozEntrance <- c("Kansas") groups$novel <- c("New Mexico", "New York", "New Hampshire", "New Jersey") plotICA(ica, groups=groups)
Plot library size
plotLibrarySize( data, log10 = TRUE, title = "Library size distribution across samples", subtitle = "Library size: total number of mapped reads", colour = "orange" )
plotLibrarySize( data, log10 = TRUE, title = "Library size distribution across samples", subtitle = "Library size: total number of mapped reads", colour = "orange" )
data |
Data frame or matrix: gene expression |
log10 |
Boolean: log10-transform |
title |
Character: plot title |
subtitle |
Character: plot subtitle |
colour |
Character: data colour |
Library size distribution
Other functions for gene expression pre-processing:
convertGeneIdentifiers()
,
filterGeneExpr()
,
normaliseGeneExpression()
,
plotGeneExprPerSample()
,
plotRowStats()
df <- data.frame(geneA=c(2, 4, 5), geneB=c(20, 3, 5), geneC=c(5, 10, 21)) colnames(df) <- paste("Sample", 1:3) plotLibrarySize(df)
df <- data.frame(geneA=c(2, 4, 5), geneB=c(20, 3, 5), geneC=c(5, 10, 21)) colnames(df) <- paste("Sample", 1:3) plotLibrarySize(df)
Create a scatterplot from a PCA object
plotPCA( pca, pcX = 1, pcY = 2, groups = NULL, individuals = TRUE, loadings = FALSE, nLoadings = NULL )
plotPCA( pca, pcX = 1, pcY = 2, groups = NULL, individuals = TRUE, loadings = FALSE, nLoadings = NULL )
pca |
|
pcX |
Character: name of the X axis of interest from the PCA |
pcY |
Character: name of the Y axis of interest from the PCA |
groups |
Matrix: groups to plot indicating the index of interest of the samples (use clinical or sample groups) |
individuals |
Boolean: plot PCA individuals |
loadings |
Boolean: plot PCA loadings/rotations |
nLoadings |
Integer: Number of variables to plot, ordered by those that
most contribute to selected principal components (this allows for faster
performance as only the most contributing variables are rendered); if
|
Scatterplot as an highchart
object
Other functions to analyse principal components:
calculateLoadingsContribution()
,
performPCA()
,
plotPCAvariance()
pca <- prcomp(USArrests, scale=TRUE) plotPCA(pca) plotPCA(pca, pcX=2, pcY=3) # Plot both individuals and loadings plotPCA(pca, pcX=2, pcY=3, loadings=TRUE) # Only plot loadings plotPCA(pca, pcX=2, pcY=3, loadings=TRUE, individuals=FALSE)
pca <- prcomp(USArrests, scale=TRUE) plotPCA(pca) plotPCA(pca, pcX=2, pcY=3) # Plot both individuals and loadings plotPCA(pca, pcX=2, pcY=3, loadings=TRUE) # Only plot loadings plotPCA(pca, pcX=2, pcY=3, loadings=TRUE, individuals=FALSE)
Create the explained variance plot from a PCA
plotPCAvariance(pca)
plotPCAvariance(pca)
pca |
|
Plot variance as an highchart
object
Other functions to analyse principal components:
calculateLoadingsContribution()
,
performPCA()
,
plotPCA()
pca <- prcomp(USArrests) plotPCAvariance(pca)
pca <- prcomp(USArrests) plotPCAvariance(pca)
Plot protein features
plotProtein(molecule)
plotProtein(molecule)
molecule |
Character: UniProt protein or Ensembl transcript identifier |
highcharter
object
Other functions to retrieve external information:
ensemblToUniprot()
,
plotTranscripts()
,
queryEnsemblByGene()
protein <- "P38398" plotProtein(protein) transcript <- "ENST00000488540" plotProtein(transcript)
protein <- "P38398" plotProtein(protein) transcript <- "ENST00000488540" plotProtein(transcript)
Scatter plot to compare between the row-wise mean, median, variance or range
from a data frame or matrix. Also supports transformations of those
variables, such as log10(mean)
. If y = NULL
, a density plot is
rendered instead.
plotRowStats( data, x, y = NULL, subset = NULL, xmin = NULL, xmax = NULL, ymin = NULL, ymax = NULL, xlim = NULL, ylim = NULL, cache = NULL, verbose = FALSE, data2 = NULL, legend = FALSE, legendLabels = c("Original", "Highlighted") )
plotRowStats( data, x, y = NULL, subset = NULL, xmin = NULL, xmax = NULL, ymin = NULL, ymax = NULL, xlim = NULL, ylim = NULL, cache = NULL, verbose = FALSE, data2 = NULL, legend = FALSE, legendLabels = c("Original", "Highlighted") )
data |
Data frame or matrix containing samples per column and, for instance, gene or alternative splicing event per row |
x , y
|
Character: statistic to calculate and display in the plot per row;
choose between |
subset |
Boolean or integer: |
xmin , xmax , ymin , ymax
|
Numeric: minimum and maximum X and Y values to draw in the plot |
xlim , ylim
|
Numeric: X and Y axis range |
cache |
List of summary statistics for |
verbose |
Boolean: print messages of the steps performed |
data2 |
Same as |
legend |
Boolean: show legend? |
legendLabels |
Character: legend labels |
Plot of data
Other functions for gene expression pre-processing:
convertGeneIdentifiers()
,
filterGeneExpr()
,
normaliseGeneExpression()
,
plotGeneExprPerSample()
,
plotLibrarySize()
Other functions for PSI quantification:
filterPSI()
,
getSplicingEventTypes()
,
listSplicingAnnotations()
,
loadAnnotation()
,
quantifySplicing()
library(ggplot2) # Plotting gene expression data geneExpr <- readFile("ex_gene_expression.RDS") plotRowStats(geneExpr, "mean", "var^(1/4)") + ggtitle("Mean-variance plot") + labs(y="Square Root of the Standard Deviation") # Plotting alternative splicing quantification annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) medianVar <- plotRowStats(psi, x="median", y="var", xlim=c(0, 1)) + labs(x="Median PSI", y="PSI variance") medianVar rangeVar <- plotRowStats(psi, x="range", y="log10(var)", xlim=c(0, 1)) + labs(x="PSI range", y="log10(PSI variance)") rangeVar
library(ggplot2) # Plotting gene expression data geneExpr <- readFile("ex_gene_expression.RDS") plotRowStats(geneExpr, "mean", "var^(1/4)") + ggtitle("Mean-variance plot") + labs(y="Square Root of the Standard Deviation") # Plotting alternative splicing quantification annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) medianVar <- plotRowStats(psi, x="median", y="var", xlim=c(0, 1)) + labs(x="Median PSI", y="PSI variance") medianVar rangeVar <- plotRowStats(psi, x="range", y="log10(var)", xlim=c(0, 1)) + labs(x="PSI range", y="log10(PSI variance)") rangeVar
Plot diagram of alternative splicing events
plotSplicingEvent( ASevent, data = NULL, showText = TRUE, showPath = TRUE, showAlternative1 = TRUE, showAlternative2 = TRUE, constitutiveWidth = NULL, alternativeWidth = NULL, intronWidth = NULL, constitutiveFill = "lightgray", constitutiveStroke = "darkgray", alternative1Fill = "#ffb153", alternative1Stroke = "#faa000", alternative2Fill = "#caa06c", alternative2Stroke = "#9d7039", class = NULL, style = NULL )
plotSplicingEvent( ASevent, data = NULL, showText = TRUE, showPath = TRUE, showAlternative1 = TRUE, showAlternative2 = TRUE, constitutiveWidth = NULL, alternativeWidth = NULL, intronWidth = NULL, constitutiveFill = "lightgray", constitutiveStroke = "darkgray", alternative1Fill = "#ffb153", alternative1Stroke = "#faa000", alternative2Fill = "#caa06c", alternative2Stroke = "#9d7039", class = NULL, style = NULL )
ASevent |
Character: alternative splicing event identifiers |
data |
Matrix or data frame: alternative splicing information |
showText |
Boolean: display coordinates and length (if available) |
showPath |
Boolean: display alternative splicing junctions |
showAlternative1 |
Boolean: show alternative exon 1 and respective splicing junctions and text? |
showAlternative2 |
Boolean: show alternative exon 2 and respective splicing junctions and text? (only related with mutually exclusive exons) |
constitutiveWidth |
Numeric: width of constitutive exon(s) |
alternativeWidth |
Numeric: width of alternative exon(s) |
intronWidth |
Numeric: width of intron's representation |
constitutiveFill |
Character: fill colour of constitutive exons |
constitutiveStroke |
Character: stroke colour of constitutive exons |
alternative1Fill |
Character: fill colour of alternative exon 1 |
alternative1Stroke |
Character: stroke colour of alternative exon 1 |
alternative2Fill |
Character: fill colour of alternative exon 2 |
alternative2Stroke |
Character: stroke colour of alternative exon 2 |
class |
Character: class of SVG parent tag |
style |
Character: style of SVG parent tag |
List of SVG (one for each alternative splicing event)
events <- c( "A3SS_15_+_63353138_63353912_63353397_TPM1", "A3SS_11_-_61118463_61117115_61117894_CYB561A3", "A5SS_21_+_48055675_48056459_48056808_PRMT2", "A5SS_1_-_1274742_1274667_1274033_DVL1", "AFE_9_+_131902430_131901928_131904724_PPP2R4", "AFE_5_-_134686513_134688636_134681747_H2AFY", "ALE_12_+_56554104_56554410_56555171_MYL6", "ALE_8_-_38314874_38287466_38285953_FGFR1", "SE_9_+_6486925_6492303_6492401_6493826_UHRF2", "SE_19_-_5218431_5216778_5216731_5215606_PTPRS", "MXE_15_+_63335142_63335905_63336030_63336226_63336351_63349184_TPM1", "MXE_17_-_74090495_74087316_74087224_74086478_74086410_74085401_EXOC7") diagram <- plotSplicingEvent(events) ## Not run: diagram[["A3SS_3_-_145796903_145794682_145795711_PLOD2"]] diagram[[6]] diagram ## End(Not run)
events <- c( "A3SS_15_+_63353138_63353912_63353397_TPM1", "A3SS_11_-_61118463_61117115_61117894_CYB561A3", "A5SS_21_+_48055675_48056459_48056808_PRMT2", "A5SS_1_-_1274742_1274667_1274033_DVL1", "AFE_9_+_131902430_131901928_131904724_PPP2R4", "AFE_5_-_134686513_134688636_134681747_H2AFY", "ALE_12_+_56554104_56554410_56555171_MYL6", "ALE_8_-_38314874_38287466_38285953_FGFR1", "SE_9_+_6486925_6492303_6492401_6493826_UHRF2", "SE_19_-_5218431_5216778_5216731_5215606_PTPRS", "MXE_15_+_63335142_63335905_63336030_63336226_63336351_63349184_TPM1", "MXE_17_-_74090495_74087316_74087224_74086478_74086410_74085401_EXOC7") diagram <- plotSplicingEvent(events) ## Not run: diagram[["A3SS_3_-_145796903_145794682_145795711_PLOD2"]] diagram[[6]] diagram ## End(Not run)
Plot survival curves
plotSurvivalCurves( surv, mark = TRUE, interval = FALSE, pvalue = NULL, title = "Survival analysis", scale = NULL, auto = TRUE )
plotSurvivalCurves( surv, mark = TRUE, interval = FALSE, pvalue = NULL, title = "Survival analysis", scale = NULL, auto = TRUE )
surv |
Survival object |
mark |
Boolean: mark times? |
interval |
Boolean: show interval ranges? |
pvalue |
Numeric: p-value of the survival curves |
title |
Character: plot title |
scale |
Character: time scale (default is |
auto |
Boolean: return the plot automatically prepared ( |
Plot of survival curves
Other functions to analyse survival:
assignValuePerSubject()
,
getAttributesTime()
,
labelBasedOnCutoff()
,
optimalSurvivalCutoff()
,
plotSurvivalPvaluesByCutoff()
,
processSurvTerms()
,
survdiffTerms()
,
survfit.survTerms()
,
testSurvival()
require("survival") fit <- survfit(Surv(time, status) ~ x, data = aml) plotSurvivalCurves(fit)
require("survival") fit <- survfit(Surv(time, status) ~ x, data = aml) plotSurvivalCurves(fit)
Plot p-values of survival difference between groups based on multiple cutoffs
plotSurvivalPvaluesByCutoff( clinical, data, censoring, event, timeStart, timeStop = NULL, followup = "days_to_last_followup", significance = 0.05, cutoffs = seq(0, 0.99, 0.01) )
plotSurvivalPvaluesByCutoff( clinical, data, censoring, event, timeStart, timeStop = NULL, followup = "days_to_last_followup", significance = 0.05, cutoffs = seq(0, 0.99, 0.01) )
clinical |
Data frame: clinical data |
data |
Numeric: elements of interest to test against the cutoff |
censoring |
Character: censor using |
event |
Character: name of column containing time of the event of interest |
timeStart |
Character: name of column containing starting time of the interval or follow up time |
timeStop |
Character: name of column containing ending time of the interval (only relevant for interval censoring) |
followup |
Character: name of column containing follow up time |
significance |
Numeric: significance threshold |
cutoffs |
Numeric: cutoffs to test |
p-value plot
Other functions to analyse survival:
assignValuePerSubject()
,
getAttributesTime()
,
labelBasedOnCutoff()
,
optimalSurvivalCutoff()
,
plotSurvivalCurves()
,
processSurvTerms()
,
survdiffTerms()
,
survfit.survTerms()
,
testSurvival()
clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") clinical <- do.call(rbind, rep(list(clinical), 5)) rownames(clinical) <- paste("Subject", seq(nrow(clinical))) # Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) # Match between subjects and samples match <- c("Cancer 1"="Subject 3", "Cancer 2"="Subject 17", "Cancer 3"="Subject 21") eventData <- assignValuePerSubject(psi[3, ], match) event <- "days_to_death" timeStart <- "days_to_death" plotSurvivalPvaluesByCutoff(clinical, eventData, censoring="right", event=event, timeStart=timeStart)
clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") clinical <- do.call(rbind, rep(list(clinical), 5)) rownames(clinical) <- paste("Subject", seq(nrow(clinical))) # Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") psi <- quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE")) # Match between subjects and samples match <- c("Cancer 1"="Subject 3", "Cancer 2"="Subject 17", "Cancer 3"="Subject 21") eventData <- assignValuePerSubject(psi[3, ], match) event <- "days_to_death" timeStart <- "days_to_death" plotSurvivalPvaluesByCutoff(clinical, eventData, censoring="right", event=event, timeStart=timeStart)
Plot transcripts
plotTranscripts( info, eventPosition = NULL, event = NULL, eventData = NULL, shiny = FALSE )
plotTranscripts( info, eventPosition = NULL, event = NULL, eventData = NULL, shiny = FALSE )
info |
Information retrieved from Ensembl |
eventPosition |
Numeric: coordinates of the alternative splicing event
(ignored if |
event |
Character: identifier of the alternative splicing event to plot |
eventData |
Object containing event information to be parsed |
shiny |
Boolean: is the function running in a Shiny session? |
NULL
(function is only used to modify the Shiny session's
state or internal variables)
Other functions to retrieve external information:
ensemblToUniprot()
,
plotProtein()
,
queryEnsemblByGene()
event <- "SE_12_-_7985318_7984360_7984200_7982602_SLC2A14" info <- queryEnsemblByEvent(event, species="human", assembly="hg19") ## Not run: plotTranscripts(info, event=event) ## End(Not run)
event <- "SE_12_-_7985318_7984360_7984200_7982602_SLC2A14" info <- queryEnsemblByEvent(event, species="human", assembly="hg19") ## Not run: plotTranscripts(info, event=event) ## End(Not run)
In case more than one data frame with alternative splicing events is given, the events are cross-referenced according to the chromosome, strand and relevant coordinates per event type (see details).
prepareAnnotationFromEvents(...)
prepareAnnotationFromEvents(...)
... |
Data frame(s) of alternative splicing events to include in the annotation |
Events from two or more data frames are cross-referenced based on each event's chromosome, strand and specific coordinates relevant for each event type:
Skipped exon: constitutive exon 1 end, alternative exon (start and end) and constitutive exon 2 start
Mutually exclusive exon: constitutive exon 1 end, alternative exon 1 and 2 (start and end) and constitutive exon 2 start
Alternative 5' splice site: constitutive exon 1 end, alternative exon 1 end and constitutive exon 2 start
Alternative first exon: same as alternative 5' splice site
Alternative 3' splice site: constitutive exon 1 end, alternative exon 1 start and constitutive exon 2 start
Alternative last exon: same as alternative 3' splice site
List of data frames with the annotation from different data frames joined by event type
When cross-referencing events, gene information is discarded.
Other functions to prepare alternative splicing annotations:
parseSuppaAnnotation()
# Load sample files (SUPPA annotation) folder <- "extdata/eventsAnnotSample/suppa_output/suppaEvents" suppaOutput <- system.file(folder, package="psichomics") # Parse and prepare SUPPA annotation suppa <- parseSuppaAnnotation(suppaOutput) annot <- prepareAnnotationFromEvents(suppa) # Load sample files (rMATS annotation) folder <- "extdata/eventsAnnotSample/mats_output/ASEvents/" matsOutput <- system.file(folder, package="psichomics") # Parse rMATS annotation and prepare combined annotation from rMATS and SUPPA mats <- parseMatsAnnotation(matsOutput) annot <- prepareAnnotationFromEvents(suppa, mats)
# Load sample files (SUPPA annotation) folder <- "extdata/eventsAnnotSample/suppa_output/suppaEvents" suppaOutput <- system.file(folder, package="psichomics") # Parse and prepare SUPPA annotation suppa <- parseSuppaAnnotation(suppaOutput) annot <- prepareAnnotationFromEvents(suppa) # Load sample files (rMATS annotation) folder <- "extdata/eventsAnnotSample/mats_output/ASEvents/" matsOutput <- system.file(folder, package="psichomics") # Parse rMATS annotation and prepare combined annotation from rMATS and SUPPA mats <- parseMatsAnnotation(matsOutput) annot <- prepareAnnotationFromEvents(suppa, mats)
Prepare user-provided files to be loaded into psichomics
prepareSRAmetadata(file, output = "psichomics_metadata.txt") prepareJunctionQuant( ..., output = "psichomics_junctions.txt", startOffset = NULL, endOffset = NULL ) prepareGeneQuant( ..., output = "psichomics_gene_counts.txt", strandedness = c("unstranded", "stranded", "stranded (reverse)") )
prepareSRAmetadata(file, output = "psichomics_metadata.txt") prepareJunctionQuant( ..., output = "psichomics_junctions.txt", startOffset = NULL, endOffset = NULL ) prepareGeneQuant( ..., output = "psichomics_gene_counts.txt", strandedness = c("unstranded", "stranded", "stranded (reverse)") )
file |
Character: path to file |
output |
Character: path of output file (if |
... |
Character: path of (optionally named) input files (see Examples) |
startOffset |
Numeric: value to offset start position |
endOffset |
Numeric: value to offset end position |
strandedness |
Character: strandedness of RNA-seq protocol; may be one
of the following: |
Prepared file (if output != NULL
) and object
## Not run: prepareJunctionQuant("Control rep1"=junctionFile1, "Control rep2"=junctionFile2, "KD rep1"=junctionFile3, "KD rep2"=junctionFile4) ## End(Not run) ## Not run: prepareGeneQuant("Control rep1"=geneCountFile1, "Control rep2"=geneCountFile2, "KD rep1"=geneCountFile3, "KD rep2"=geneCountFile4) ## End(Not run)
## Not run: prepareJunctionQuant("Control rep1"=junctionFile1, "Control rep2"=junctionFile2, "KD rep1"=junctionFile3, "KD rep2"=junctionFile4) ## End(Not run) ## Not run: prepareGeneQuant("Control rep1"=geneCountFile1, "Control rep2"=geneCountFile2, "KD rep1"=geneCountFile3, "KD rep2"=geneCountFile4) ## End(Not run)
Process survival curves terms to calculate survival curves
processSurvTerms( clinical, censoring, event, timeStart, timeStop = NULL, group = NULL, formulaStr = NULL, coxph = FALSE, scale = "days", followup = "days_to_last_followup", survTime = NULL )
processSurvTerms( clinical, censoring, event, timeStart, timeStop = NULL, group = NULL, formulaStr = NULL, coxph = FALSE, scale = "days", followup = "days_to_last_followup", survTime = NULL )
clinical |
Data frame: clinical data |
censoring |
Character: censor using |
event |
Character: name of column containing time of the event of interest |
timeStart |
Character: name of column containing starting time of the interval or follow up time |
timeStop |
Character: name of column containing ending time of the interval (only relevant for interval censoring) |
group |
Character: group relative to each subject |
formulaStr |
Character: formula to use |
coxph |
Boolean: fit a Cox proportional hazards regression model? |
scale |
Character: rescale the survival time to |
followup |
Character: name of column containing follow up time |
survTime |
|
The event
time is only used to determine whether the event
has occurred (1
) or not (0
) in case of missing values.
If survTime = NULL
, survival times are obtained from the clinical
dataset according to the names given in timeStart
, timeStop
,
event
and followup
. This may become quite slow when used in a
loop. If the aforementioned variables are constant, consider running
getAttributesTime()
outside the loop and using its output via
the survTime
argument of this function (see Examples).
A list with a formula
object and a data frame with terms
needed to calculate survival curves
Other functions to analyse survival:
assignValuePerSubject()
,
getAttributesTime()
,
labelBasedOnCutoff()
,
optimalSurvivalCutoff()
,
plotSurvivalCurves()
,
plotSurvivalPvaluesByCutoff()
,
survdiffTerms()
,
survfit.survTerms()
,
testSurvival()
clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male NA 1355 ii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") timeStart <- "days_to_death" event <- "days_to_death" formulaStr <- "patient.stage_event.pathologic_stage + patient.gender" survTerms <- processSurvTerms(clinical, censoring="right", event, timeStart, formulaStr=formulaStr) # If running multiple times, consider calculating survTime only once survTime <- getAttributesTime(clinical, event, timeStart) for (i in seq(5)) { survTerms <- processSurvTerms(clinical, censoring="right", event, timeStart, formulaStr=formulaStr, survTime=survTime) }
clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male NA 1355 ii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") timeStart <- "days_to_death" event <- "days_to_death" formulaStr <- "patient.stage_event.pathologic_stage + patient.gender" survTerms <- processSurvTerms(clinical, censoring="right", event, timeStart, formulaStr=formulaStr) # If running multiple times, consider calculating survTime only once survTime <- getAttributesTime(clinical, event, timeStart) for (i in seq(5)) { survTerms <- processSurvTerms(clinical, censoring="right", event, timeStart, formulaStr=formulaStr, survTime=survTime) }
Start graphical interface of psichomics
psichomics( ..., launch.browser = TRUE, shinyproxy = FALSE, testData = FALSE, cache = getAnnotationHubOption("CACHE") )
psichomics( ..., launch.browser = TRUE, shinyproxy = FALSE, testData = FALSE, cache = getAnnotationHubOption("CACHE") )
... |
Arguments passed on to
|
launch.browser |
If true, the system's default web browser will be launched automatically after the app is started. Defaults to true in interactive sessions only. The value of this parameter can also be a function to call with the application's URL. |
shinyproxy |
Boolean: prepare visual interface to run in Shinyproxy? |
testData |
Boolean: load with test data |
cache |
Character: path to |
NULL
(function is only used to modify the Shiny session's
state or internal variables)
## Not run: psichomics() ## End(Not run)
## Not run: psichomics() ## End(Not run)
Quantify alternative splicing events
quantifySplicing( annotation, junctionQuant, eventType = c("SE", "MXE", "ALE", "AFE", "A3SS", "A5SS"), minReads = 10, genes = NULL )
quantifySplicing( annotation, junctionQuant, eventType = c("SE", "MXE", "ALE", "AFE", "A3SS", "A5SS"), minReads = 10, genes = NULL )
annotation |
List of data frames: annotation for each alternative splicing event type |
junctionQuant |
Data frame: junction quantification |
eventType |
Character: splicing event types to quantify |
minReads |
Integer: values whose number of total supporting read counts
is below |
genes |
Character: gene symbols for which to quantify splicing events
(if |
Data frame with the quantification of the alternative splicing events
Other functions for PSI quantification:
filterPSI()
,
getSplicingEventTypes()
,
listSplicingAnnotations()
,
loadAnnotation()
,
plotRowStats()
# Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE"))
# Calculate PSI for skipped exon (SE) and mutually exclusive (MXE) events annot <- readFile("ex_splicing_annotation.RDS") junctionQuant <- readFile("ex_junctionQuant.RDS") quantifySplicing(annot, junctionQuant, eventType=c("SE", "MXE"))
Query information from Ensembl
queryEnsemblByGene(gene, species = NULL, assembly = NULL) queryEnsemblByEvent(event, species = NULL, assembly = NULL, data = NULL)
queryEnsemblByGene(gene, species = NULL, assembly = NULL) queryEnsemblByEvent(event, species = NULL, assembly = NULL, data = NULL)
gene |
Character: gene |
species |
Character: species (may be |
assembly |
Character: assembly version (may be NULL for an Ensembl identifier) |
event |
Character: alternative splicing event |
data |
Matrix or data frame: alternative splicing information |
Information from Ensembl
Other functions to retrieve external information:
ensemblToUniprot()
,
plotProtein()
,
plotTranscripts()
queryEnsemblByGene("BRCA1", "human", "hg19") queryEnsemblByGene("ENSG00000139618") event <- "SE_17_-_41251792_41249306_41249261_41246877_BRCA1" queryEnsemblByEvent(event, species="human", assembly="hg19")
queryEnsemblByGene("BRCA1", "human", "hg19") queryEnsemblByGene("ENSG00000139618") event <- "SE_17_-_41251792_41249306_41249261_41246877_BRCA1" queryEnsemblByEvent(event, species="human", assembly="hg19")
Load psichomics-specific file
readFile(file)
readFile(file)
file |
Character: path to the file |
Loaded file
junctionQuant <- readFile("ex_junctionQuant.RDS")
junctionQuant <- readFile("ex_junctionQuant.RDS")
Tests if there is a difference between two or more survival curves using
the family of tests, or for a single curve against a known alternative.
survdiffTerms(survTerms, ...)
survdiffTerms(survTerms, ...)
survTerms |
|
... |
Arguments passed on to
|
survfit
object. See survfit.object
for details. Methods
defined for survfit
objects are print
, plot
,
lines
, and points
.
This function implements the G-rho family of
Harrington and Fleming (1982), with weights on each death of ,
where
is the Kaplan-Meier estimate of survival.
With
rho = 0
this is the log-rank or Mantel-Haenszel test,
and with rho = 1
it is equivalent to the Peto & Peto modification
of the Gehan-Wilcoxon test.
Peto and Peto show that the Gehan-Wilcoxon test can be badly biased if the two groups have different censoring patterns, and proposed an alternative. Prentice and Marek later showed an actual example where this issue occurs. For most data sets the Gehan-Wilcoxon and Peto-Peto-Prentice variant will hardly differ, however.
If the right hand side of the formula consists only of an offset term,
then a one sample test is done.
To cause missing values in the predictors to be treated as a separate
group, rather than being omitted, use the factor
function with its
exclude
argument to recode the righ-hand-side covariate.
Harrington, D. P. and Fleming, T. R. (1982). A class of rank test procedures for censored survival data. Biometrika, 553-566.
Peto R. Peto and Peto, J. (1972) Asymptotically efficient rank invariant test procedures (with discussion), JRSSA, 185-206.
Prentice, R. and Marek, P. (1979) A qualitative discrepancy between censored data rank tests, Biometics, 861–867.
Other functions to analyse survival:
assignValuePerSubject()
,
getAttributesTime()
,
labelBasedOnCutoff()
,
optimalSurvivalCutoff()
,
plotSurvivalCurves()
,
plotSurvivalPvaluesByCutoff()
,
processSurvTerms()
,
survfit.survTerms()
,
testSurvival()
clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male NA 1355 ii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") timeStart <- "days_to_death" event <- "days_to_death" formulaStr <- "patient.stage_event.pathologic_stage + patient.gender" survTerms <- processSurvTerms(clinical, censoring="right", event, timeStart, formulaStr=formulaStr) survdiffTerms(survTerms)
clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male NA 1355 ii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") timeStart <- "days_to_death" event <- "days_to_death" formulaStr <- "patient.stage_event.pathologic_stage + patient.gender" survTerms <- processSurvTerms(clinical, censoring="right", event, timeStart, formulaStr=formulaStr) survdiffTerms(survTerms)
Create survival curves
## S3 method for class 'survTerms' survfit(formula, ...)
## S3 method for class 'survTerms' survfit(formula, ...)
formula |
|
... |
Arguments passed on to
|
A survival curve is based on a tabulation of the number at risk and
number of events at each unique death time. When time is a floating
point number the definition of "unique" is subject to interpretation.
The code uses factor() to define the set.
For further details see the documentation for the appropriate method, i.e.,
?survfit.formula
or ?survfit.coxph
.
A survfit object may contain a single curve, a set of curves (vector), a
matrix of curves, or even a 3 way array: dim(fit)
will reveal
the dimensions.
Predicted curves from a coxph
model have one row for each
stratum in the Cox model fit and one column for each specified
covariate set.
Curves from a multi-state model have one row for each stratum and
a column for each state, the strata correspond to predictors on the
right hand side of the equation. The default printing and plotting
order for curves is by column, as with other matrices.
survfit
object. See survfit.object
for details. Methods
defined for survfit
objects are print
, plot
,
lines
, and points
.
Other functions to analyse survival:
assignValuePerSubject()
,
getAttributesTime()
,
labelBasedOnCutoff()
,
optimalSurvivalCutoff()
,
plotSurvivalCurves()
,
plotSurvivalPvaluesByCutoff()
,
processSurvTerms()
,
survdiffTerms()
,
testSurvival()
library("survival") clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male NA 1355 ii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") timeStart <- "days_to_death" event <- "days_to_death" formulaStr <- "patient.stage_event.pathologic_stage + patient.gender" survTerms <- processSurvTerms(clinical, censoring="right", event, timeStart, formulaStr=formulaStr) survfit(survTerms)
library("survival") clinical <- read.table(text = "2549 NA ii female 840 NA i female NA 1204 iv male NA 383 iv female 1293 NA iii male NA 1355 ii male") names(clinical) <- c("patient.days_to_last_followup", "patient.days_to_death", "patient.stage_event.pathologic_stage", "patient.gender") timeStart <- "days_to_death" event <- "days_to_death" formulaStr <- "patient.stage_event.pathologic_stage + patient.gender" survTerms <- processSurvTerms(clinical, censoring="right", event, timeStart, formulaStr=formulaStr) survfit(survTerms)
sticky
objects when extracting or transposing
objectMost attributes - with the exception of names
, dim
,
dimnames
, class
and row.names
- are preserved in simple
transformations of objects from class sticky
## S3 method for class 'sticky' t(x) ## S3 method for class 'sticky' x[i, j, ...]
## S3 method for class 'sticky' t(x) ## S3 method for class 'sticky' x[i, j, ...]
x |
Object |
i , j , ...
|
Numeric or character: indices of elements to extract |
Transformed object with most attributes preserved
Test multiple contingency tables comprised by two groups (one reference group and another containing remaining elements) and provided groups.
testGroupIndependence(ref, groups, elements, pvalueAdjust = "BH")
testGroupIndependence(ref, groups, elements, pvalueAdjust = "BH")
ref |
List of character: list of groups where each element contains the identifiers of respective elements |
groups |
List of characters: list of groups where each element contains the identifiers of respective elements |
elements |
Character: all available elements (if a data frame is given, its rownames will be used) |
pvalueAdjust |
Character: method used to adjust p-values (see Details) |
The following methods for p-value adjustment are supported by using
the respective string in the pvalueAdjust
argument:
none
: Do not adjust p-values
BH
: Benjamini-Hochberg's method (false discovery rate)
BY
: Benjamini-Yekutieli's method (false discovery rate)
bonferroni
: Bonferroni correction (family-wise error rate)
holm
: Holm's method (family-wise error rate)
hochberg
: Hochberg's method (family-wise error rate)
hommel
: Hommel's method (family-wise error rate)
multiGroupIndependenceTest
object, a data frame containing:
attribute |
Name of the original groups compared against the reference groups |
table |
Contingency table used for testing |
pvalue |
Fisher's exact test's p-value |
parseCategoricalGroups()
and
plotGroupIndependence()
Other functions for data grouping:
createGroupByAttribute()
,
getGeneList()
,
getSampleFromSubject()
,
getSubjectFromSample()
,
groupPerElem()
,
plotGroupIndependence()
elements <- paste("subjects", 1:10) ref <- elements[5:10] groups <- list(race=list(asian=elements[1:3], white=elements[4:7], black=elements[8:10]), region=list(european=elements[c(4, 5, 9)], african=elements[c(6:8, 10)])) groupTesting <- testGroupIndependence(ref, groups, elements) # View(groupTesting)
elements <- paste("subjects", 1:10) ref <- elements[5:10] groups <- list(race=list(asian=elements[1:3], white=elements[4:7], black=elements[8:10]), region=list(european=elements[c(4, 5, 9)], african=elements[c(6:8, 10)])) groupTesting <- testGroupIndependence(ref, groups, elements) # View(groupTesting)
Test the survival difference between groups of subjects
testSurvival(survTerms, ...)
testSurvival(survTerms, ...)
survTerms |
|
... |
Arguments passed on to
|
p-value of the survival difference or NA
Instead of raising errors, returns NA
Other functions to analyse survival:
assignValuePerSubject()
,
getAttributesTime()
,
labelBasedOnCutoff()
,
optimalSurvivalCutoff()
,
plotSurvivalCurves()
,
plotSurvivalPvaluesByCutoff()
,
processSurvTerms()
,
survdiffTerms()
,
survfit.survTerms()
require("survival") data <- aml timeStart <- "event" event <- "event" followup <- "time" data$event <- NA data$event[aml$status == 1] <- aml$time[aml$status == 1] censoring <- "right" formulaStr <- "x" survTerms <- processSurvTerms(data, censoring=censoring, event=event, timeStart=timeStart, followup=followup, formulaStr=formulaStr) testSurvival(survTerms)
require("survival") data <- aml timeStart <- "event" event <- "event" followup <- "time" data$event <- NA data$event[aml$status == 1] <- aml$time[aml$status == 1] censoring <- "right" formulaStr <- "x" survTerms <- processSurvTerms(data, censoring=censoring, event=event, timeStart=timeStart, followup=followup, formulaStr=formulaStr) testSurvival(survTerms)