Title: | NanoString nCounter Tools |
---|---|
Description: | Tools for NanoString Technologies nCounter Technology. Provides support for reading RCC files into an ExpressionSet derived object. Also includes methods for QC and normalizaztion of NanoString data. |
Authors: | Patrick Aboyoun [aut], Nicole Ortogero [aut], Maddy Griswold [cre], Zhi Yang [ctb] |
Maintainer: | Maddy Griswold <[email protected]> |
License: | MIT |
Version: | 1.15.0 |
Built: | 2024-10-30 08:27:41 UTC |
Source: | https://github.com/bioc/NanoStringNCTools |
The interactive version of geom_beeswarm
from
ggbeeswarm.
geom_beeswarm_interactive(mapping = NULL, data = NULL, priority = c("ascending", "descending", "density", "random", "none"), cex = 1, groupOnX = NULL, dodge.width = 0, stat = "identity", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
geom_beeswarm_interactive(mapping = NULL, data = NULL, priority = c("ascending", "descending", "density", "random", "none"), cex = 1, groupOnX = NULL, dodge.width = 0, stat = "identity", na.rm = FALSE, show.legend = NA, inherit.aes = TRUE, ...)
mapping |
The aesthetic mapping.
See |
data |
The data to be displayed at this layer.
See |
priority |
Method used to perform point layout.
See |
cex |
Scaling for adjusting point spacing.
See |
groupOnX |
Indicator for jittering on x-axis.
See |
dodge.width |
Dodge amount for points from different aesthetic groups.
See |
stat |
The statistical transformation to use on the data for this layer.
See |
na.rm |
Indicator for removing missing values with a warning.
See |
show.legend |
Indicator for including this layer in the legend.
See |
inherit.aes |
Indicator for inheriting the aesthetics.
See |
... |
Additional arguments.
See |
The interactive geometry based on geom_beeswarm
.
Patrick Aboyoun
# Create NanoStringRccSet from data files datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) eg_data <- as.data.frame(assayDataElement(solidTumor, "exprs")[1:5, 1]) eg_data[["tooltip"]] <- names(eg_data) geom_beeswarm_interactive(aes_string(tooltip = "tooltip"), data=eg_data)
# Create NanoStringRccSet from data files datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) eg_data <- as.data.frame(assayDataElement(solidTumor, "exprs")[1:5, 1]) eg_data[["tooltip"]] <- names(eg_data) geom_beeswarm_interactive(aes_string(tooltip = "tooltip"), data=eg_data)
Safe log and log2 calculations where values within [0, thresh) are thresholded to thresh prior to the transformation.
logt(x, thresh = 0.5) log2t(x, thresh = 0.5)
logt(x, thresh = 0.5) log2t(x, thresh = 0.5)
x |
a numeric or complex vector. |
thresh |
a positive number specifying the threshold. |
For non-negative elements in x
, calculates log(pmax(x, thresh))
or log2(pmax(x, thresh))
.
A vector of the same length as x
containing the transformed values.
Patrick Aboyoun
logt(0:8) identical(logt(0:8), log(c(0.5, 1:8))) log2t(0:8) identical(log2t(0:8), log2(c(0.5, 1:8)))
logt(0:8) identical(logt(0:8), log(c(0.5, 1:8))) log2t(0:8) identical(log2t(0:8), log2(c(0.5, 1:8)))
Generate common plots to visualize and QC NanoStringRccSet data.
## S3 method for class 'NanoStringRccSet' autoplot(object, type = c("boxplot-feature", "boxplot-signature", "bindingDensity-mean", "bindingDensity-sd", "ercc-linearity", "ercc-lod", "heatmap-genes", "heatmap-signatures", "housekeep-geom", "lane-bindingDensity", "lane-fov", "mean-sd-features", "mean-sd-samples"), log2scale = TRUE, elt = "exprs", index = 1L, geomParams = list(), tooltipDigits = 4L, heatmapGroup = NULL, blacklist = NULL, tooltipID = NULL, qcCutoffs = list( Housekeeper = c("failingCutoff" = 32,"passingCutoff" = 100) , Imaging = c("fovCutoff" = 0.75) , BindingDensity = c("minimumBD" = 0.1, "maximumBD" = 2.25, "maximumBDSprint" = 1.8) , ERCCLinearity = c("correlationValue" = 0.95) , ERCCLoD = c("standardDeviations" = 2) ), scalingFactor=1L, show_rownames_gene_limit=60L, show_colnames_gene_limit=36L, show_rownames_sig_limit=60L, show_colnames_sig_limit=36L, subSet = NULL , ...)
## S3 method for class 'NanoStringRccSet' autoplot(object, type = c("boxplot-feature", "boxplot-signature", "bindingDensity-mean", "bindingDensity-sd", "ercc-linearity", "ercc-lod", "heatmap-genes", "heatmap-signatures", "housekeep-geom", "lane-bindingDensity", "lane-fov", "mean-sd-features", "mean-sd-samples"), log2scale = TRUE, elt = "exprs", index = 1L, geomParams = list(), tooltipDigits = 4L, heatmapGroup = NULL, blacklist = NULL, tooltipID = NULL, qcCutoffs = list( Housekeeper = c("failingCutoff" = 32,"passingCutoff" = 100) , Imaging = c("fovCutoff" = 0.75) , BindingDensity = c("minimumBD" = 0.1, "maximumBD" = 2.25, "maximumBDSprint" = 1.8) , ERCCLinearity = c("correlationValue" = 0.95) , ERCCLoD = c("standardDeviations" = 2) ), scalingFactor=1L, show_rownames_gene_limit=60L, show_colnames_gene_limit=36L, show_rownames_sig_limit=60L, show_colnames_sig_limit=36L, subSet = NULL , ...)
object |
A NanoStringRccSet object |
type |
Character string referencing the type of plot to generate |
log2scale |
An optional boolean indicating expression data is on log2 scale |
elt |
An optional character string of the expression matrix name |
index |
An optional integer giving the feature of interest row location |
geomParams |
An option |
tooltipDigits |
An optional integer for number of tooltip decimal places to display |
heatmapGroup |
An optional character string referencing |
blacklist |
An optional character vector of features not to plot |
tooltipID |
An optional character string referencing |
qcCutoffs |
An optional |
scalingFactor |
An optional numeric value indicating a scaling factor to apply to plot drawing |
show_rownames_gene_limit |
An optional integer limit on number of features to display row-wise |
show_colnames_gene_limit |
An optional integer limit on number of features to display column-wise |
show_rownames_sig_limit |
An optional integer limit on number of signatures to display row-wise |
show_colnames_sig_limit |
An optional integer limit on number of signatures to display column-wise |
subSet |
An optional subset to plot on |
... |
Additional arguments to pass on to autoplot function |
"boxplot-feature"
Generate feature boxplots
"boxplot-signature"
Generate signature boxplots
"bindingDensity-mean"
Plot binding density displayed as average expression
"bindingDensity-sd"
Plot binding density displayed as standard deviation of expression
"ercc-linearity"
Assess linearity of ERCCs
"ercc-lod"
Assess limit of detection based on ERCC expression
"heatmap-genes"
Generate a heatmap from feature expression
"heatmap-signatures"
Generate a heatmap from signature expression
"housekeep-geom"
Plot geometric mean of housekeeper genes
"lane-bindingDensity"
View binding density by lane
"lane-fov"
Assess image quality by lane
"mean-sd-features"
Plot mean versus standard deviation feature-wise
"mean-sd-samples"
Plot mean versus standard deviation sample-wise
A ggplot
or pheatmap
plot depending on the type of plot generated
# Create NanoStringRccSet from data files datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) # Assess experiment linearity #autoplot(solidTumor, "ercc-linearity") # Plot a feature's expression across all samples #autoplot(solidTumor, "boxplot-feature", index=2)
# Create NanoStringRccSet from data files datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) # Assess experiment linearity #autoplot(solidTumor, "ercc-linearity") # Plot a feature's expression across all samples #autoplot(solidTumor, "boxplot-feature", index=2)
The NanoStringRccSet
class extends the
ExpressionSet
class for NanoString Reporter Code Count
(RCC) data.
NanoStringRccSet(assayData, phenoData = annotatedDataFrameFrom(assayData, byrow = FALSE), featureData = annotatedDataFrameFrom(assayData, byrow = TRUE), experimentData = MIAME(), annotation = character(), protocolData = annotatedDataFrameFrom(assayData, byrow = FALSE), dimLabels = c("GeneName", "SampleID"), signatures = SignatureSet(), design = NULL, ...)
NanoStringRccSet(assayData, phenoData = annotatedDataFrameFrom(assayData, byrow = FALSE), featureData = annotatedDataFrameFrom(assayData, byrow = TRUE), experimentData = MIAME(), annotation = character(), protocolData = annotatedDataFrameFrom(assayData, byrow = FALSE), dimLabels = c("GeneName", "SampleID"), signatures = SignatureSet(), design = NULL, ...)
assayData |
A |
phenoData |
An |
featureData |
An |
experimentData |
An optional |
annotation |
A |
protocolData |
An |
dimLabels |
A |
signatures |
An optional |
design |
An optional one-sided formula representing the experimental
design based on columns from |
... |
Additional arguments for |
An S4 class containing NanoString Expression Level Assays
In addition to the standard ExpressionSet
accessor
methods, NanoStringRccSet
objects have the following:
extracts the data.frame
containing the
sample data, cbind(pData(object), pData(protocolData(object)))
.
extracts the sample data column names,
c(varLabels(object), varLabels(protocolData(object)))
.
extracts the column names to use as labels
for the features and samples in the autoplot
method.
replaces the dimLabels
of
the object
.
extracts the SignatureSet
of the object
.
replaces the
SignatureSet
of the object
.
extracts the matrix of computed signature scores.
extracts the one-sided formula representing
the experimental design based on columns from
phenoData
.
replaces the one-sided formula
representing the experimental design based on columns from
phenoData
.
returns the signature functions.
replaces the signature functions.
returns the signature groups.
replaces the signature groups.
When signatureScores = FALSE
, the marginal summaries of the
elt
assayData
matrix along either the
feature (MARGIN = 1
) or sample (MARGIN = 2
) dimension.
When signatureScores = TRUE
, the marginal summaries of the
elt
signatureScores
matrix along either the
signature (MARGIN = 1
) or sample (MARGIN = 2
) dimension.
When log2scale = FALSE
, the summary statistics are Mean, Standard
Deviation, Skewness, Excess Kurtosis, Minimum, First Quartile, Median,
Third Quartile, and Maximum.
When log2scale = TRUE
, the summary statistics are Geometric Mean
with thresholding at 0.5, Size Factor
(2^(`MeanLog2` - mean(`MeanLog2`))
), Mean of Log2 with
thresholding at 0.5, Standard Deviation of Log2 with thresholding at 0.5,
Minimum, First Quartile, Median, Third Quartile, and Maximum.
In addition to the standard ExpressionSet
subsetting
methods, NanoStringRccSet
objects have the following:
Subset the feature and sample
dimensions using the subset
and select
arguments
respectively. The subset
argument will be evaluated with
respect to the featureData
, while the
select
argument will be evaluated with respect to the
phenoData
and protocolData
.
Extracts the endogenous
barcode class feature subset of x
with optional additional
subsetting using subset
and select
.
Extracts the housekeeping
barcode class feature subset of x
with optional additional
subsetting using subset
and select
.
Extracts the negative
control barcode class feature subset of x
with optional additional
subsetting using subset
and select
.
Extracts the positive
control barcode class feature subset of x
with optional additional
subsetting using subset
and select
.
Extracts the feature subset
representing the controls of x
with optional additional
subsetting using subset
and select
.
Extracts the feature subset
representing the non-controls of x
with optional additional
subsetting using subset
and select
.
Extracts the feature subset
representing the genes in the signatures
of x
with
optional additional subsetting using subset
and select
.
Loop over
the feature (MARGIN = 1
) or sample (MARGIN = 2
) dimension
of assayDataElement(X, elt)
.
Loop
over the signature (MARGIN = 1
) or sample (MARGIN = 2
)
dimension of signatureScores(X, elt)
.
Split
X
by GROUP
column within featureData
,
phenoData
, or protocolData
and apply FUN
to each partition.
munge argument data
into a data.frame object for modeling and
visualization using the mapping
argument. Supplemental data can be
specified using the extradata
argument.
Similar to the
transform
generic in the base package, creates
or modifies one or more assayData
matrices based
upon name = value
pairs in ...
. The expressions in
...
are appended to the preprocessing list in
experimentData
, which can be extracted using the
preproc
method.
Evaluate expression expr
with
respect to assayData
,
featureData
, phenoData
,
and protocolData
;
c(as.list(assayData(data)), fData(data), sData(data))
.
the NanoStringRccSet
method for ggplot
.
Patrick Aboyoun
readNanoStringRccSet
,
writeNanoStringRccSet
,
ExpressionSet
# Create NanoStringRccSet from data files datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) # Create a deep copy of a NanoStringRccSet object deepCopy <- NanoStringRccSet(solidTumor) all.equal(solidTumor, deepCopy) identical(solidTumor, deepCopy) # Accessing sample data and column names head(sData(solidTumor)) svarLabels(solidTumor) # Set experimental design design(solidTumor) <- ~ BRAFGenotype + Treatment design(solidTumor) munge(solidTumor) # Marginal summarizing of NanoStringRccSet assayData matrices head(summary(solidTumor, 1)) # Marginal summaries along features head(summary(solidTumor, 2)) # Marginal summaries along samples # Subsetting NanoStringRccSet objects # Extract the positive controls for wildtype BRAF dim(solidTumor) dim(subset(solidTumor, CodeClass == "Positive", BRAFGenotype == "wt/wt")) # Extract by barcode class with(solidTumor, table(CodeClass)) with(endogenousSubset(solidTumor), table(CodeClass)) with(housekeepingSubset(solidTumor), table(CodeClass)) with(negativeControlSubset(solidTumor), table(CodeClass)) with(positiveControlSubset(solidTumor), table(CodeClass)) with(controlSubset(solidTumor), table(CodeClass)) with(nonControlSubset(solidTumor), table(CodeClass)) # Looping over NanoStringRccSet assayData matrices log1pCoefVar <- function(x){ x <- log1p(x) sd(x) / mean(x) } # Log1p Coefficient of Variation along Features head(assayDataApply(solidTumor, 1, log1pCoefVar)) # Log1p Coefficient of Variation along Samples head(assayDataApply(solidTumor, 2, log1pCoefVar)) # Transforming NanoSetRccSet assayData matrices # Subtract max count from each sample # Create log1p transformation of adjusted counts thresh <- assayDataApply(negativeControlSubset(solidTumor), 2, max) solidTumor2 <- transform(solidTumor, negCtrlZeroed = sweep(exprs, 2, thresh), log1p_negCtrlZeroed = log1p(pmax(negCtrlZeroed, 0))) assayDataElementNames(solidTumor2) # Evaluating expression using NanoStringRccSet data meanLog1pExprs <- with(solidTumor, { means <- split(apply(exprs, 1, function(x) mean(log1p(x))), CodeClass) means <- means[order(sapply(means, median))] boxplot(means, horizontal = TRUE) means })
# Create NanoStringRccSet from data files datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) # Create a deep copy of a NanoStringRccSet object deepCopy <- NanoStringRccSet(solidTumor) all.equal(solidTumor, deepCopy) identical(solidTumor, deepCopy) # Accessing sample data and column names head(sData(solidTumor)) svarLabels(solidTumor) # Set experimental design design(solidTumor) <- ~ BRAFGenotype + Treatment design(solidTumor) munge(solidTumor) # Marginal summarizing of NanoStringRccSet assayData matrices head(summary(solidTumor, 1)) # Marginal summaries along features head(summary(solidTumor, 2)) # Marginal summaries along samples # Subsetting NanoStringRccSet objects # Extract the positive controls for wildtype BRAF dim(solidTumor) dim(subset(solidTumor, CodeClass == "Positive", BRAFGenotype == "wt/wt")) # Extract by barcode class with(solidTumor, table(CodeClass)) with(endogenousSubset(solidTumor), table(CodeClass)) with(housekeepingSubset(solidTumor), table(CodeClass)) with(negativeControlSubset(solidTumor), table(CodeClass)) with(positiveControlSubset(solidTumor), table(CodeClass)) with(controlSubset(solidTumor), table(CodeClass)) with(nonControlSubset(solidTumor), table(CodeClass)) # Looping over NanoStringRccSet assayData matrices log1pCoefVar <- function(x){ x <- log1p(x) sd(x) / mean(x) } # Log1p Coefficient of Variation along Features head(assayDataApply(solidTumor, 1, log1pCoefVar)) # Log1p Coefficient of Variation along Samples head(assayDataApply(solidTumor, 2, log1pCoefVar)) # Transforming NanoSetRccSet assayData matrices # Subtract max count from each sample # Create log1p transformation of adjusted counts thresh <- assayDataApply(negativeControlSubset(solidTumor), 2, max) solidTumor2 <- transform(solidTumor, negCtrlZeroed = sweep(exprs, 2, thresh), log1p_negCtrlZeroed = log1p(pmax(negCtrlZeroed, 0))) assayDataElementNames(solidTumor2) # Evaluating expression using NanoStringRccSet data meanLog1pExprs <- with(solidTumor, { means <- split(apply(exprs, 1, function(x) mean(log1p(x))), CodeClass) means <- means[order(sapply(means, median))] boxplot(means, horizontal = TRUE) means })
This package performs normalization on NanoStringRccSet data using one of three methods.
normalize(object, ...)
normalize(object, ...)
object |
|
... |
|
Normalization is performed in one of three ways with data pulled from one slot of assayData and inserted into another. It is possible to overwrite the original slot of assayData if the fromElt and toElt are set to the same slot. nSolver
normalization uses positive controls to scale and housekeepers to standardize the data and mimics the normalization performed by default in the nSolver software. The Housekeeping-Log2
normalization calculates the log2 sizeFactor of the housekeeping genes and then takes 2^ log2 expression data centered by the log transformed sizeFactor. PositiveControl-Log2Log2
regresses the log2 positive control probes greater than 0.5 concentration on their geometric mean and then uses the intercept and slope to predict normalized values from the log2 transformed expression values. The predictions are then rescaled by 2^.
Additional parameters with NanoStringRccSet method include:
type
normalization method to use. Options are nSolver
, Housekeeping-Log2
, and PositiveControl-Log2Log2
fromElt
assayData slot from which to pull raw data
toElt
assayData slot to which normalized data will be inserted
The function returns a new NanoStringRccSet with either an additional assayData slot of normalized data, or overwrites the original assayData depending on whether fromElt and toElt are identical.
Patrick Aboyoun
NanoString nSolver User Manual https://www.nanostring.com/download_file/view/1168
datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) solidTumor <- normalize(solidTumor, "nSolver" , fromElt = "exprs", toElt = "exprs_norm") head( assayDataElement( solidTumor , elt = "exprs_norm" ) )
datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) solidTumor <- normalize(solidTumor, "nSolver" , fromElt = "exprs", toElt = "exprs_norm") head( assayDataElement( solidTumor , elt = "exprs_norm" ) )
Create an instance of class NanoStringRccSet
by reading
data from NanoString Reporter Code Count (RCC) files.
readNanoStringRccSet(rccFiles, rlfFile = NULL, phenoDataFile = NULL, phenoDataRccColName = "^RCC", phenoDataColPrefix = "")
readNanoStringRccSet(rccFiles, rlfFile = NULL, phenoDataFile = NULL, phenoDataRccColName = "^RCC", phenoDataColPrefix = "")
rccFiles |
A character vector containing the paths to the RCC files. |
rlfFile |
An optional character string representing the path to the corresponding RLF file. |
phenoDataFile |
An optional character string representing the path to the corresponding phenotypic csv data file. |
phenoDataRccColName |
The regular expression that specifies the RCC
column in the |
phenoDataColPrefix |
An optional prefix to add to the phenoData column names to distinguish them from the names of assayData matrices, featureData columns, and protocolData columns. |
An instance of the NanoStringRccSet
class.
Patrick Aboyoun
NanoStringRccSet
, writeNanoStringRccSet
# Data file paths datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") # Just RCC data solidTumorNoRlfPheno <- readNanoStringRccSet(rccs) varLabels(solidTumorNoRlfPheno) fvarLabels(solidTumorNoRlfPheno) # RCC and RLF data solidTumorNoPheno <- readNanoStringRccSet(rccs, rlfFile = rlf) setdiff(fvarLabels(solidTumorNoPheno), fvarLabels(solidTumorNoRlfPheno)) # All data solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) varLabels(solidTumor) design(solidTumor) <- ~ BRAFGenotype + Treatment # All data with phenoData prefix solidTumorPhenoPrefix <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno, phenoDataColPrefix = "PHENO_") varLabels(solidTumorPhenoPrefix) design(solidTumorPhenoPrefix) <- ~ PHENO_BRAFGenotype + PHENO_Treatment
# Data file paths datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") # Just RCC data solidTumorNoRlfPheno <- readNanoStringRccSet(rccs) varLabels(solidTumorNoRlfPheno) fvarLabels(solidTumorNoRlfPheno) # RCC and RLF data solidTumorNoPheno <- readNanoStringRccSet(rccs, rlfFile = rlf) setdiff(fvarLabels(solidTumorNoPheno), fvarLabels(solidTumorNoRlfPheno)) # All data solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) varLabels(solidTumor) design(solidTumor) <- ~ BRAFGenotype + Treatment # All data with phenoData prefix solidTumorPhenoPrefix <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno, phenoDataColPrefix = "PHENO_") varLabels(solidTumorPhenoPrefix) design(solidTumorPhenoPrefix) <- ~ PHENO_BRAFGenotype + PHENO_Treatment
Read a NanoString Reporter Code Count (RCC) file.
readRccFile(file)
readRccFile(file)
file |
A character string containing the path to the RCC file. |
An list object with five elements:
"Header" |
a |
"Sample_Attributes" |
a |
"Lane_Attributes" |
a |
"Code_Summary" |
a |
"Messages" |
A character vector containing messages, if any. |
Patrick Aboyoun
datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rccData <- lapply(rccs, readRccFile)
datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rccData <- lapply(rccs, readRccFile)
Read a NanoString Reporter Library File (RLF) file.
readRlfFile(file)
readRlfFile(file)
file |
A character string containing the path to the RLF file. |
An instance of the DataFrame
class containing columns:
"CodeClass" |
code class |
"GeneName" |
gene name |
"Accession" |
accession number |
... |
additional columns |
Patrick Aboyoun
datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") rlfData <- readRlfFile(rlf)
datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") rlfData <- readRlfFile(rlf)
This function takes a list containing the quality control (QC) thresholds for data in a NanoStringRccSet and then returns a matrix of QC retults by sample to protocolData.
setQCFlags(object, ...)
setQCFlags(object, ...)
object |
A valid NanoStringRccSet object with all housekeeping genes, positive control probes, and negative control probes present |
... |
Additional arguments to pass |
This function checks that the housekeeping genes, positive control, and negative control probes or genes are within acceptable boundaries. Additional parameters with NanoStringRccSet method include:
qcCutoffs
An optional list with members named Housekeeper
, Imaging
, BindingDensity
, ERCCLinearity
, and ERCCLoD
hkGenes
An optional vector of housekeeping gene names if alternative genes to those defined in the panel are to be used
ReferenceSampleColumn
An optional character string indicating the pData
column containing reference sample information
Borderline thresholds and fail thresholds are defined and each sample receives a row in a matrix that contains flags indicating either borderline or failing performance.
Housekeeper
is a vector with names members. failingCutoff
sets the lower bound of housekeeper gene expression such that samples with a value below this threshold are labeled as failures. passingCutoff
sets a lower bound of housekeeper gene expression such that samples with a value below this threshold are labeled as borderline. Values greater than or equal to either threshold are labeled as either borderline or passing. The default values are failingCutoff
= 32 and passingCutoff
= 100.
Imaging
is a vector with a single named member fovCutoff
. This threshold determines the mimimum proportion of FOV to be counted. The default value is 0.75.
BindingDensity
is a named vector with members minimumBD
, maximumBD
, and maximumBDSprint
. minimumBD
sets a minimum threshold for binding density across machine platforms. maximumBD
sets a maxmimum binding density for non-Sprint machines while maximumBDSprint
does the same for Sprint machines. The default values are minimumBD
= 0.1, maximumBD
= 2.25, and maximumBDSprint
= 1.8.
ERCCLinearity
is a named vector with a single member correlationValue
. This member sets a minimum threshold for the correlation between the observed counts of positive controls and their theoretical concentration. The default value is 0.95.
ERCCLoD
is a named vector with a single member standardDeviations
. This sets a minimum threshold for the 0.5uMol concentration to be above the geoMean of the negative controls in units of standard deviation of the negative controls. The default value is 2.
This function returns a new NanoStringRccSet
with matrices of QC pass and QC borderline criteria added to the protocolData
slots called QCFlags
and QCBorderlineFlags
, respectively.
# Create NanoStringRccSet from data files datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) #Set QC flags with default cutoffs solidTumorDefaultQC <- setQCFlags(solidTumor) head( protocolData( solidTumorDefaultQC )[["QCFlags"]] ) head( protocolData( solidTumorDefaultQC )[["QCBorderlineFlags"]] ) #Update cutoffs newQCCutoffs <- list( Housekeeper = c("failingCutoff" = 32,"passingCutoff" = 100) , Imaging = c("fovCutoff" = 0.75) , BindingDensity = c("minimumBD" = 0.1, "maximumBD" = 2.25, "maximumBDSprint" = 1.8) , ERCCLinearity = c("correlationValue" = 0.98) , ERCCLoD = c("standardDeviations" = 2) ) #Set QC flags with new cutoffs solidTumorNewQC <- setQCFlags(solidTumor, qcCutoffs=newQCCutoffs) #Compare QC results with default and new cutoffs head( protocolData( solidTumorDefaultQC )[["QCFlags"]] ) head( protocolData( solidTumorNewQC )[["QCFlags"]] )
# Create NanoStringRccSet from data files datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) rlf <- file.path(datadir, "3D_SolidTumor_Sig.rlf") pheno <- file.path(datadir, "3D_SolidTumor_PhenoData.csv") solidTumor <- readNanoStringRccSet(rccs, rlfFile = rlf, phenoDataFile = pheno) #Set QC flags with default cutoffs solidTumorDefaultQC <- setQCFlags(solidTumor) head( protocolData( solidTumorDefaultQC )[["QCFlags"]] ) head( protocolData( solidTumorDefaultQC )[["QCBorderlineFlags"]] ) #Update cutoffs newQCCutoffs <- list( Housekeeper = c("failingCutoff" = 32,"passingCutoff" = 100) , Imaging = c("fovCutoff" = 0.75) , BindingDensity = c("minimumBD" = 0.1, "maximumBD" = 2.25, "maximumBDSprint" = 1.8) , ERCCLinearity = c("correlationValue" = 0.98) , ERCCLoD = c("standardDeviations" = 2) ) #Set QC flags with new cutoffs solidTumorNewQC <- setQCFlags(solidTumor, qcCutoffs=newQCCutoffs) #Compare QC results with default and new cutoffs head( protocolData( solidTumorDefaultQC )[["QCFlags"]] ) head( protocolData( solidTumorNewQC )[["QCFlags"]] )
The SignatureSet
class defines gene-based signatures.
SignatureSet(weights = NumericList(), groups = factor(), func = character(), version = character(), ...)
SignatureSet(weights = NumericList(), groups = factor(), func = character(), version = character(), ...)
weights |
A named |
groups |
A factor vector indicating groups in the |
func |
Character indicating function to use |
version |
Character indicating version to use |
... |
Additional arguments for future use. |
A SignatureSet
object
returns the number of signatures in x
.
returns a named integer vector
containing the number of genes in each of the signatures in x
.
returns a character vector containing the signature
names in x
.
returns a named NumericList
that
defines the linear combination based signatures.
replaces the NumericList
that defines the linear combination based signatures.
returns the signature functions of an object.
returns a factor vector representing the signature groups.
replaces the factor vector representing the signature groups.
: returns the signature version.
Patrick Aboyoun
SignatureSet(weights=list(x = c(a = 1), y = c(b = 1/3, d = 2/3), z = c(a = 2, c = 4)), groups=factor("x", "y", "z"), func = c(x="default", y="default", z="default"))
SignatureSet(weights=list(x = c(a = 1), y = c(b = 1/3, d = 2/3), z = c(a = 2, c = 4)), groups=factor("x", "y", "z"), func = c(x="default", y="default", z="default"))
Convenience functions for matrix thresholding, centering, and scaling based upon margin statistics.
# Loop over features fThresh(x, STATS) fCenter(x, STATS) fScale(x, STATS) ## Round results to integers fIntThresh(x, STATS) fIntCenter(x, STATS) fIntScale(x, STATS) ## Comparisons fAbove(x, STATS) fBelow(x, STATS) fAtLeast(x, STATS) fAtMost(x, STATS) # Loop over samples sThresh(x, STATS) sCenter(x, STATS) sScale(x, STATS) # Round results to integers sIntThresh(x, STATS) sIntCenter(x, STATS) sIntScale(x, STATS) ## Comparisons sAbove(x, STATS) sBelow(x, STATS) sAtLeast(x, STATS) sAtMost(x, STATS)
# Loop over features fThresh(x, STATS) fCenter(x, STATS) fScale(x, STATS) ## Round results to integers fIntThresh(x, STATS) fIntCenter(x, STATS) fIntScale(x, STATS) ## Comparisons fAbove(x, STATS) fBelow(x, STATS) fAtLeast(x, STATS) fAtMost(x, STATS) # Loop over samples sThresh(x, STATS) sCenter(x, STATS) sScale(x, STATS) # Round results to integers sIntThresh(x, STATS) sIntCenter(x, STATS) sIntScale(x, STATS) ## Comparisons sAbove(x, STATS) sBelow(x, STATS) sAtLeast(x, STATS) sAtMost(x, STATS)
x |
a numeric array. |
STATS |
the summary statistic for thresholding, centering, or scaling. |
These functions are convenience wrappers for the following code:
fThresh
:sweep(x, 1L, STATS, FUN = "pmax")
fCenter
:sweep(x, 1L, STATS, FUN = "-")
fScale
:sweep(x, 1L, STATS, FUN = "/")
fIntThresh
:round(sweep(x, 1L, STATS, FUN = "pmax"))
fIntCenter
:round(sweep(x, 1L, STATS, FUN = "-"))
fIntScale
:round(sweep(x, 1L, STATS, FUN = "/"))
fAbove
:sweep(x, 1L, STATS, FUN = ">")
fBelow
:sweep(x, 1L, STATS, FUN = "<")
fAtLeast
:sweep(x, 1L, STATS, FUN = ">=")
fAtMost
:sweep(x, 1L, STATS, FUN = "<=")
sThresh
:sweep(x, 2L, STATS, FUN = "pmax")
sCenter
:sweep(x, 2L, STATS, FUN = "-")
sScale
:sweep(x, 2L, STATS, FUN = "/")
sIntThresh
:round(sweep(x, 2L, STATS, FUN = "pmax"))
sIntCenter
:round(sweep(x, 2L, STATS, FUN = "-"))
sIntScale
:round(sweep(x, 2L, STATS, FUN = "/"))
sAbove
:sweep(x, 2L, STATS, FUN = ">")
sBelow
:sweep(x, 2L, STATS, FUN = "<")
sAtLeast
:sweep(x, 2L, STATS, FUN = ">=")
sAtMost
:sweep(x, 2L, STATS, FUN = "<=")
An array with the same shape as x
that has been modified by
thresholding, centering, or scaling.
Patrick Aboyoun
# Find reasonable column minimums thresh <- apply(stack.x, 2L, quantile, 0.05) # Threshold column values identical(sThresh(stack.x, thresh), sweep(stack.x, 2L, thresh, FUN = "pmax")) # Substract column values identical(sCenter(stack.x, thresh), sweep(stack.x, 2L, thresh)) # Scale to common mean identical(sScale(stack.x, colMeans(stack.x) / mean(colMeans(stack.x))), sweep(stack.x, 2L, colMeans(stack.x) / mean(colMeans(stack.x)), FUN = "/")) # Scale to common mean, rounded to the nearest integer sIntScale(stack.x, colMeans(stack.x) / mean(colMeans(stack.x)))
# Find reasonable column minimums thresh <- apply(stack.x, 2L, quantile, 0.05) # Threshold column values identical(sThresh(stack.x, thresh), sweep(stack.x, 2L, thresh, FUN = "pmax")) # Substract column values identical(sCenter(stack.x, thresh), sweep(stack.x, 2L, thresh)) # Scale to common mean identical(sScale(stack.x, colMeans(stack.x) / mean(colMeans(stack.x))), sweep(stack.x, 2L, colMeans(stack.x) / mean(colMeans(stack.x)), FUN = "/")) # Scale to common mean, rounded to the nearest integer sIntScale(stack.x, colMeans(stack.x) / mean(colMeans(stack.x)))
Write NanoString Reporter Code Count (RCC) files from an instance of class
NanoStringRccSet
.
writeNanoStringRccSet(x, dir = getwd())
writeNanoStringRccSet(x, dir = getwd())
x |
an instance of class |
dir |
An optional character string representing the path to the directory for the RCC files. |
Writes a set of NanoString Reporter Code Count (RCC) files based upon x
in dir
.
A character vector containing the paths for all the newly created RCC files.
Patrick Aboyoun
NanoStringRccSet
, readNanoStringRccSet
datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) solidTumorNoRlfPheno <- readNanoStringRccSet(rccs) writeNanoStringRccSet(solidTumorNoRlfPheno, tempdir()) for (i in seq_along(rccs)) { stopifnot(identical(readLines(rccs[i]), readLines(file.path(tempdir(), basename(rccs[i]))))) }
datadir <- system.file("extdata", "3D_Bio_Example_Data", package = "NanoStringNCTools") rccs <- dir(datadir, pattern = "SKMEL.*\\.RCC$", full.names = TRUE) solidTumorNoRlfPheno <- readNanoStringRccSet(rccs) writeNanoStringRccSet(solidTumorNoRlfPheno, tempdir()) for (i in seq_along(rccs)) { stopifnot(identical(readLines(rccs[i]), readLines(file.path(tempdir(), basename(rccs[i]))))) }