Title: | SigFuge |
---|---|
Description: | Algorithm for testing significance of clustering in RNA-seq data. |
Authors: | Patrick Kimes, Christopher Cabanski |
Maintainer: | Patrick Kimes <[email protected]> |
License: | GPL-3 |
Version: | 1.45.0 |
Built: | 2024-12-01 05:44:29 UTC |
Source: | https://github.com/bioc/SigFuge |
Tests significance of clustering in RNA-seq data.
SFpval computes a p-value for the significance of clustering in RNA-seq data, and SFfigure produces accompanying figures.
Patrick Kimes [email protected]
A dataset containing the annotations for the CDKN2A locus.
data(geneAnnot)
A GRanges object.
The Cancer Genome Atlas Research Network. (2012) Comprehensive genomic characterization of squamous cell lung cancers. Nature 489: 519-525.
A dataset containing read depths for 179 lung squamous cell carcinoma samples across the CDKN2A locus.
data(geneDepth)
A data.frame of read depth (coverage). Each column corresponds to a sample and each row to a base position along the CDKN2A locus. These RNA-Seq read counts are a subset from 179 lung squamous cell tumor samples sequenced as part of the Cancer Genome Atlas.
The Cancer Genome Atlas Research Network. (2012) Comprehensive genomic characterization of squamous cell lung cancers. Nature 489: 519-525.
Function for producing various figures corresponding to the SigFuge functional data approach to studying RNA-seq data as expression curves along base positions. The primary input for the function is a read count matrix and a GRanges object. The default behavior is to identify clusters by applying SFlabels to a normalized version of the data produced by SFnormalize. If specified, the function will compute a p-value for the significance of the labels by calling the SFpval function.
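The default workflow described above can be sketched step by step using only the calls shown in this manual. This is a minimal illustration, not the internals of SFfigure; the variable names are arbitrary, and the block is guarded so that it only runs when the SigFuge package is installed.

```r
# Sketch of the normalize -> label -> test workflow, using calls from this manual.
if (requireNamespace("SigFuge", quietly = TRUE)) {
  data(geneDepth, package = "SigFuge")

  # Step 1: normalize read depth (flag = 1, as in the manual's example)
  norm_data <- SigFuge::SFnormalize(geneDepth, flag = 1)

  # Step 2: derive cluster labels from the normalized data
  sf_labels <- SigFuge::SFlabels(norm_data)
  print(table(sf_labels))

  # Step 3: test the significance of clustering and read off the p-value
  sf_out <- SigFuge::SFpval(geneDepth, normalize = 1, flag = 1)
  print(sf_out@pvalnorm)
}
```

Running the three steps separately is mainly useful for inspecting the labels before producing any figure.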
SFfigure(data, locusname, annot = c(), flip.fig = 1, label.exon = 1, print.n = 1, data.labels = 0, label.colors = c(), flag = 1, lplots = 2, log10 = 1, summary.type = "median", savestr = c(), titlestr = c(), pval = 1)
data |
a |
locusname |
a character string specifying gene or locus name to be used in figure title. |
annot |
a |
flip.fig |
an indicator whether to flip the plotting direction of the locus if |
label.exon |
an indicator whether to print the exon boundaries to the figure. |
print.n |
an indicator whether to print cluster sizes. |
data.labels |
a |
label.colors |
a |
flag |
a |
lplots |
a specification of which figures to output |
log10 |
an indicator whether the y-axis (read depth) should be log10 transformed. Default is to plot on log-scale. |
summary.type |
a character string specifying which summary statistic should be used when plotting clusters in lplots == 2, 3, and 5. Options: "median" (default) or "mean". |
savestr |
a string specifying the file name for resulting figures. Extensions can also be specified in |
titlestr |
a string specifying figure title. If unspecified, default is |
pval |
an indicator whether the |
SFfigure returns a figure that is saved to the current working directory if a savestr is specified. Otherwise, a list containing the plots is returned.
Patrick Kimes <[email protected]>
# load data
data(geneAnnot)
data(geneDepth)

# only use first 50 samples
mdata <- geneDepth[,1:50]

# make plot and save it to CDKN2A.pdf
locusname <- "CDKN2A"
SFfigure(mdata, locusname, geneAnnot, flag=1, lplots=3,
         savestr=paste0(locusname,".pdf"),
         titlestr="CDKN2A locus, LUSC samples",
         pval=1)

# return the plots as a list instead of saving them
mySFs <- SFfigure(mdata, locusname, geneAnnot, flag=1, lplots=1,
                  savestr=c(),
                  titlestr="CDKN2A locus, LUSC samples not saved",
                  pval=0)
mySFs$plot1
Function for producing a vector of SigFuge labels using 2-means clustering on non-low expression normalized data and combining with low expression flags. Typically, SFlabels is used by passing output from SFnormalize.
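To make the idea concrete, the following base R sketch runs 2-means clustering on non-flagged samples and then merges the result with low expression flags. It is an illustration under stated assumptions, not the SFlabels implementation: the toy data, the `low_flag` vector, and the choice of 0 as the label for flagged samples are all hypothetical.

```r
set.seed(1)

# Toy normalized data: rows = base positions, columns = samples (as in geneDepth)
n_pos   <- 40
group_a <- matrix(rnorm(n_pos * 6, mean = 0), nrow = n_pos)
group_b <- matrix(rnorm(n_pos * 6, mean = 5), nrow = n_pos)
low     <- matrix(rnorm(n_pos * 3, mean = 0, sd = 0.1), nrow = n_pos)
toy     <- cbind(group_a, group_b, low)

# Hypothetical low expression flags (TRUE = low expression sample)
low_flag <- c(rep(FALSE, 12), rep(TRUE, 3))

# 2-means on the non-flagged samples; kmeans() clusters rows, so transpose
km <- kmeans(t(toy[, !low_flag]), centers = 2, nstart = 10)

# Combine: flagged samples receive their own label (0 is an arbitrary choice)
toy_labels <- integer(ncol(toy))
toy_labels[!low_flag] <- km$cluster
toy_labels[low_flag]  <- 0L
print(table(toy_labels))
```

Keeping the flagged samples out of the clustering step means their noisy curves cannot pull the two cluster centers.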
SFlabels(normData)
normData |
a list containing
|
SFlabels returns a vector of class labels.
Patrick Kimes <[email protected]>
data(geneDepth)
normalizedData <- SFnormalize(geneDepth)
labels <- SFlabels(normalizedData)
Function for normalizing read count data as specified in the SigFuge method. The normalization procedure is applied prior to SigFuge clustering to remove the effect of sample-locus specific expression from the analysis. This allows the method to identify clusters based on expression patterns across the genomic locus. It is recommended to flag and remove low expression samples from the normalization and analysis since their shapes may be overwhelmed by noise. A threshold based method for identifying low expression samples is included in the function, but users may also specify their own flags for low expression samples.
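As a rough illustration of threshold-based flagging followed by per-sample normalization, consider the base R sketch below. The manual does not give the exact procedure, so everything here is an assumption: the median-depth rule, the `cutoff` value, and the scaling of each sample by its mean depth are hypothetical stand-ins and should not be read as what SFnormalize does.

```r
set.seed(2)

# Toy read depth: 30 base positions (rows) x 5 samples (columns),
# with per-sample expression levels 50, 200, 2, 100 and 1
depth <- matrix(rpois(30 * 5, lambda = rep(c(50, 200, 2, 100, 1), each = 30)),
                nrow = 30)
colnames(depth) <- paste0("sample", 1:5)

# Hypothetical rule: flag samples whose median depth falls below a cutoff
cutoff   <- 10
low_flag <- apply(depth, 2, median) < cutoff

# Drop flagged samples, then scale each remaining sample by its mean depth
# (an assumed normalization that removes overall expression level)
kept       <- depth[, !low_flag, drop = FALSE]
depth_norm <- sweep(kept, 2, colMeans(kept), "/")

print(low_flag)
print(dim(depth_norm))
```

After scaling, every retained sample has mean depth 1, so differences between samples reflect the shape of the curve along the locus and not its overall level.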
SFnormalize(data, flag = 1)
data |
a |
flag |
a |
SFnormalize returns a list containing:
data.norm: a matrix of normalized read counts, with dimensions that depend on the number of low expression samples.
flag: a logical vector of flagged samples.
Patrick Kimes <[email protected]>
data(geneDepth)
depthnorm <- SFnormalize(geneDepth, flag = 1)
Function for computing the significance of clustering p-value. The p-value is obtained from sigclust, a simulation-based procedure for testing significance of clustering in high dimension low sample size (HDLSS) data.
The SigClust hypothesis test is given by:
H0: data generated from single Gaussian
H1: data not generated from single Gaussian
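The logic of a simulation-based test of this kind can be sketched in base R as follows. This is a simplified illustration and not the sigclust implementation: it assumes the test statistic is the 2-means cluster index (within-cluster over total sum of squares) and takes the null variances directly from the sample covariance eigenvalues, whereas the sigclust package may estimate the null distribution differently. The helper `cluster_index` and the toy data are hypothetical.

```r
set.seed(3)

# 2-means cluster index: within-cluster SS / total SS (smaller = stronger clustering)
cluster_index <- function(x) {
  km <- kmeans(x, centers = 2, nstart = 5)
  km$tot.withinss / km$totss
}

# Toy data: rows = samples, columns = features; two well separated groups
x <- rbind(matrix(rnorm(20 * 5, mean = 0), ncol = 5),
           matrix(rnorm(20 * 5, mean = 4), ncol = 5))
observed <- cluster_index(x)

# Null model (H0): a single Gaussian whose variances are the eigenvalues of cov(x)
eig <- pmax(eigen(cov(x), symmetric = TRUE, only.values = TRUE)$values, 0)
null_ci <- replicate(200, {
  sim <- sapply(eig, function(v) rnorm(nrow(x), sd = sqrt(v)))
  cluster_index(sim)
})

# Empirical p-value: share of null cluster indices at least as small as observed
p_emp <- mean(null_ci <= observed)
print(p_emp)
```

A small p-value indicates that the observed split is tighter than what 2-means typically finds in data drawn from a single Gaussian.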
SFpval(data, normalize = 1, flag = 1)
data |
a |
normalize |
a |
flag |
a |
SFpval returns an object of class sigclust-class. Available slots are described in detail in the sigclust package. Primarily, we make use of @pvalnorm.
Patrick Kimes <[email protected]>
data(geneDepth)
SFout <- SFpval(geneDepth, normalize = 1, flag = 1)
SFout@pvalnorm