Title: | Differential Expression Analysis of NanoString nCounter Data |
---|---|
Description: | This package uses a generalized linear model (GLM) of the negative binomial family to characterize count data and allows for multi-factor designs. NanoStringDiff incorporates size factors, calculated from positive controls and housekeeping controls, and the background level, obtained from negative controls, into the model framework so that all the normalization information provided by the NanoString nCounter Analyzer is fully utilized. |
Authors: | hong wang <[email protected]>, tingting zhai <[email protected]>, chi wang <[email protected]> |
Maintainer: | tingting zhai <[email protected]>, hong wang <[email protected]> |
License: | GPL |
Version: | 1.37.0 |
Built: | 2024-10-30 08:26:03 UTC |
Source: | https://github.com/bioc/NanoStringDiff |
This function estimates positive size factors, background noise and housekeeping size factors for the input "NanoStringSet" object and returns the same object with the positiveFactor, negativeFactor and housekeepingFactor slots filled or replaced.
estNormalizationFactors(NanoStringData)
NanoStringData |
An object of "NanoStringSet" class. |
The same "NanoStringSet" object with the positiveFactor, negativeFactor and housekeepingFactor slots filled or replaced.
hong wang <[email protected]> chi wang <[email protected]>
data(NanoStringData)
NanoStringData=estNormalizationFactors(NanoStringData)
pf=positiveFactor(NanoStringData)
nf=negativeFactor(NanoStringData)
hf=housekeepingFactor(NanoStringData)
The method uses a generalized linear model of the negative binomial family to characterize count data and allows for multi-factor designs. It applies an empirical Bayes shrinkage approach to estimate the dispersion parameter and uses a likelihood ratio test to obtain p-values.
glm.LRT(NanoStringData,design.full,Beta=ncol(design.full), contrast=NULL)
NanoStringData |
An object of "NanoStringSet" class. |
design.full |
numeric matrix giving the design matrix for the generalized linear model under the full model. Must be of full column rank. |
Beta |
integer or character vector indicating which coefficients of the linear model are to be tested equal to zero. Values must be column indices or column names of design.full. Defaults to the last coefficient. Ignored if contrast is specified. |
contrast |
numeric vector or matrix specifying one or more contrasts of the linear model coefficients to be tested equal to zero. |
A list
table |
A data frame with each row corresponding to a gene. Rows are sorted according to the likelihood ratio test statistic. The columns are: logFC, the log fold change between two groups; lr, the likelihood ratio test statistic; pvalue, the p-value; qvalue, the p-value adjusted using the procedure of Benjamini and Hochberg. |
dispersion |
a vector of dispersions |
log.dispersion |
a vector of log dispersion: log.dispersion=log(dispersion) |
design.full |
numeric matrix giving the design matrix under the full generalized linear model. |
design.reduce |
numeric matrix giving the design matrix under the reduced generalized linear model. |
Beta.full |
coefficients under full model. |
mean.full |
mean value under full model. |
Beta.reduce |
coefficients under reduced model. |
mean.reduce |
mean value under reduced model. |
m0 |
hyper-parameter: mean value of the prior distribution of log dispersion |
sigma |
hyper-parameter: standard deviation of the prior distribution of log dispersion |
hong wang <[email protected]> chi wang <[email protected]>
data(NanoStringData)
NanoStringData=estNormalizationFactors(NanoStringData)
group=pData(NanoStringData)
design.full=model.matrix(~0+factor(group$group))
contrast=c(1,-1)
result=glm.LRT(NanoStringData,design.full,
               Beta=ncol(design.full),contrast=contrast)
head(result$table)
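The likelihood ratio test described for glm.LRT can be illustrated with a minimal base R sketch. It is not the package's implementation: it assumes a known dispersion (phi), omits the size factors, the background term and the empirical Bayes shrinkage, and all names (nb_loglik, max_loglik, the simulated counts) are hypothetical.

```r
set.seed(1)
group <- rep(c(0, 1), each = 4)
design_full    <- model.matrix(~ group)            # intercept + group effect
design_reduced <- design_full[, 1, drop = FALSE]   # intercept only
phi <- 0.1                                         # dispersion, assumed known here

# Simulate 5 genes x 8 samples; only the first gene has a true group effect.
log_fc <- c(1.5, 0, 0, 0, 0)
counts <- t(sapply(log_fc, function(b)
  rnbinom(length(group), mu = exp(5 + b * group), size = 1 / phi)))

# Negative binomial log-likelihood with a log link.
nb_loglik <- function(beta, X, y) {
  mu <- exp(drop(X %*% beta))
  sum(dnbinom(y, mu = mu, size = 1 / phi, log = TRUE))
}

# Maximized log-likelihood for one gene under a given design matrix.
max_loglik <- function(X, y) {
  start <- c(log(mean(y) + 1), rep(0, ncol(X) - 1))
  optim(start, nb_loglik, X = X, y = y, method = "BFGS",
        control = list(fnscale = -1))$value
}

# Likelihood ratio statistic: twice the gain in log-likelihood of full over reduced.
lr <- apply(counts, 1, function(y)
  2 * (max_loglik(design_full, y) - max_loglik(design_reduced, y)))
lr <- pmax(lr, 0)   # guard against tiny negative values from numerical optimization

pvalue <- pchisq(lr, df = ncol(design_full) - ncol(design_reduced),
                 lower.tail = FALSE)
qvalue <- p.adjust(pvalue, method = "BH")   # Benjamini and Hochberg adjustment
data.frame(lr, pvalue, qvalue)
```

The degrees of freedom equal the number of coefficients dropped from the full model, which is one in this two-group design.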
User-defined housekeeping control genes can be used to estimate housekeeping factors, which adjust for variation caused by differences in sample input.
## S4 method for signature 'NanoStringSet'
housekeepingControl(object)
## S4 replacement method for signature 'NanoStringSet,matrix'
housekeepingControl(object) <- value
object |
A NanoStringSet object. |
value |
A matrix with housekeeping control genes. |
The NanoString nCounter Analyzer also contains probes for a set of species-specific mRNA housekeeping (reference) genes that are not spiked into the system. NanoString recommends at least three housekeeping genes, but the more that are included, the more accurate the normalization will be. Housekeeping control genes are expected to be consistent in their expression levels.
A matrix containing housekeeping control genes
Hong Wang <[email protected]> chi wang <[email protected]>
housekeepingFactor
data(NanoStringData)
## obtain housekeeping control genes
housekeepingControl(NanoStringData)
## assign a matrix
n=ncol(exprs(NanoStringData))
r=nrow(housekeepingControl(NanoStringData))
housekeeping=matrix(rpois(r*n,1000),ncol=n)
housekeepingControl(NanoStringData)=housekeeping
Housekeeping size factors can be used to adjust for the variation caused by differences in sample input.
## S4 method for signature 'NanoStringSet'
housekeepingFactor(object)
## S4 replacement method for signature 'NanoStringSet,numeric'
housekeepingFactor(object) <- value
object |
A NanoStringSet object. |
value |
A vector of housekeeping size factors. |
Housekeeping gene normalization corrects for differences in sample input between assays, since reference genes are supposed to have the same expression rate across samples. The read counts from housekeeping genes, after subtracting background noise and adjusting by positive size factors, are therefore not expected to vary between samples. Any remaining difference is attributed to variation in sample input.
A vector containing housekeeping factors
Hong Wang <[email protected]> chi wang <[email protected]>
housekeepingControl
data(NanoStringData)
## obtain housekeeping factors
housekeepingFactor(NanoStringData)
## assign a vector
n=ncol(exprs(NanoStringData))
housekeepingFactor(NanoStringData)=rep(1,n)
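The reasoning behind housekeeping factors (subtract background, adjust by positive size factors, attribute what remains to sample input) can be sketched in base R as follows. This is an illustration under stated assumptions, not the estimator used by estNormalizationFactors: the toy values and the sum-based scaling to a mean of one are assumptions.

```r
# Toy inputs; all values are made up for illustration.
housekeeping <- matrix(c(1000, 1200, 900, 1100,
                          500,  610, 455,  540,
                          800,  950, 730,  870),
                       nrow = 3, byrow = TRUE)   # 3 genes x 4 samples
background <- c(10, 12, 9, 11)         # per-sample background noise
pos_factor <- c(1.0, 1.1, 0.9, 1.0)    # per-sample positive size factors

# Subtract the background per sample, floor at zero, then undo the platform effect.
adjusted <- sweep(housekeeping, 2, background, "-")
adjusted[adjusted < 0] <- 0
adjusted <- sweep(adjusted, 2, pos_factor, "/")

# The remaining sample-to-sample difference is attributed to sample input.
hk_factor <- colSums(adjusted) / mean(colSums(adjusted))
hk_factor
```

Scaling by the mean keeps the factors centered on one, so a sample with a factor above one received more input material than average.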
The object is created from the Mori data, with normal and tumor groups and 2 samples in each group. The object contains 599 endogenous genes, 6 positive controls, 6 negative controls and 4 housekeeping controls.
data(NanoStringData)
An object of class NanoStringSet.
data(NanoStringData) NanoStringData
This function returns normalized NanoString data after adjusting for positive size factors, background noise and housekeeping size factors. Note that the normalized values should only be used for data exploration and visualization, e.g. drawing a heatmap. To perform differential expression analysis, we recommend that users follow the procedure described in the package vignette.
NanoStringDataNormalization(path=path, header=TRUE, designs)
path |
the path of the file which the data are to be read from. |
header |
a logical value indicating whether the file contains the names of the variables as its first line. If missing, the value is determined from the file format: header is set to TRUE if and only if the first row contains one fewer field than the number of columns. |
designs |
a data frame whose number of rows matches the number of columns (samples) of the NanoString data |
hong wang <[email protected]> tingting zhai <[email protected]> chi wang <[email protected]>
##path="/Users/NanoStringdiff-Rcode/Data/horbinski.csv"
##designs=data.frame(control=c(0,0,0,1,1,1))
##NanoStringDataNormalization(path=path, header=TRUE, designs)
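The adjustment described above can be sketched in base R. This is a hedged illustration rather than the code of NanoStringDataNormalization: the helper name normalize_counts is hypothetical, and the order of operations (subtract background, floor at zero, divide by the product of the two size factors) is an assumption.

```r
# Hypothetical helper: adjust a genes x samples count matrix for background
# noise, positive size factors and housekeeping size factors.
normalize_counts <- function(counts, background, pos_factor, hk_factor) {
  stopifnot(ncol(counts) == length(background),
            ncol(counts) == length(pos_factor),
            ncol(counts) == length(hk_factor))
  signal <- sweep(counts, 2, background, "-")   # remove per-sample background
  signal[signal < 0] <- 0                       # counts cannot be negative
  sweep(signal, 2, pos_factor * hk_factor, "/") # rescale each sample
}

counts <- matrix(c(110, 220, 5, 60), nrow = 2)  # 2 genes x 2 samples
normalized <- normalize_counts(counts,
                               background = c(10, 10),
                               pos_factor = c(1, 2),
                               hk_factor  = c(1, 0.5))
normalized
```

Values produced this way are suitable for plots such as heatmaps, in line with the note above; they are not a substitute for the model-based test.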
The NanoStringSet is an S4 class used to store data from the NanoString nCounter Analyzer. This class is a subclass of ExpressionSet, with six more slots: positiveControl, negativeControl, housekeepingControl, positiveFactor, negativeFactor and housekeepingFactor.
The constructor functions createNanoStringSet and createNanoStringSetFromCsv create a NanoStringSet object from two types of input: separate matrices or a csv file. See the vignette for examples of construction from these two input types.
createNanoStringSet(endogenous,positiveControl,negativeControl,
                    housekeepingControl,designs)
createNanoStringSetFromCsv(path, header=TRUE, designs)
endogenous |
for matrix input: a matrix of non-negative integers of endogenes |
positiveControl |
for matrix input: a matrix of non-negative integers of positive control genes. There must be 6 positive control genes, ordered by concentration from high to low |
negativeControl |
for matrix input: a matrix of non-negative integers of negative control genes |
housekeepingControl |
for matrix input: a matrix of non-negative integers of housekeeping control genes |
designs |
for data.frame input: phenotype data for NanoString nCounter data with at least one column. Each row is one sample, so the number of rows must equal the number of samples or replicates in the data. |
path |
path to the csv file. |
header |
a logical value indicating whether the file contains the names of the variables as its first line. The default value is TRUE. |
A NanoStringSet object.
Access and set positive control genes.
Access and set negative control genes.
Access and set housekeeping control genes.
Access and set positive factors.
Access and set negative factors.
Access and set housekeeping factors.
hong wang <[email protected]> chi wang <[email protected]>
positiveControl, negativeControl, housekeepingControl, positiveFactor, negativeFactor, housekeepingFactor
endogenous=matrix(rpois(100,50),25,4)
positive=matrix(rpois(24,c(128,32,8,2,0.5,0.125)*80),6,4)
negative=matrix(rpois(32,10),8,4)
housekeeping=matrix(rpois(12,100),3,4)
designs=data.frame(group=c(0,0,1,1),gender=c("male","female","female","male"),
                   age=c(20,40,39,37))
NanoStringData=createNanoStringSet(endogenous,positive,negative,
                                   housekeeping,designs)
NanoStringData
pData(NanoStringData)
positiveControl(NanoStringData)
head(exprs(NanoStringData))
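The argument requirements listed above (non-negative integer matrices, six positive controls, one row of designs per sample) can be checked before calling the constructor. The helper below, check_nanostring_inputs, is hypothetical and not part of the package; it is a base R sketch of such a pre-check, and the constructor may perform different or additional validation.

```r
# Hypothetical pre-check of matrix inputs; not part of NanoStringDiff.
check_nanostring_inputs <- function(endogenous, positive, negative,
                                    housekeeping, designs) {
  mats <- list(endogenous = endogenous, positive = positive,
               negative = negative, housekeeping = housekeeping)
  n <- ncol(endogenous)
  for (name in names(mats)) {
    m <- mats[[name]]
    if (!is.matrix(m) || any(m < 0) || any(m != round(m)))
      stop(name, " must be a matrix of non-negative integers")
    if (ncol(m) != n)
      stop(name, " must have one column per sample")
  }
  if (nrow(positive) != 6) stop("expected exactly 6 positive controls")
  if (nrow(designs) != n) stop("designs must have one row per sample")
  invisible(TRUE)
}

set.seed(123)
endogenous   <- matrix(rpois(100, 50), 25, 4)
positive     <- matrix(rpois(24, c(128, 32, 8, 2, 0.5, 0.125) * 80), 6, 4)
negative     <- matrix(rpois(32, 10), 8, 4)
housekeeping <- matrix(rpois(12, 100), 3, 4)
designs      <- data.frame(group = c(0, 0, 1, 1))

ok <- check_nanostring_inputs(endogenous, positive, negative,
                              housekeeping, designs)
```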
Negative control genes, provided by the nCounter Analyzer, can be used to estimate the background noise for each sample.
## S4 method for signature 'NanoStringSet'
negativeControl(object)
## S4 replacement method for signature 'NanoStringSet,matrix'
negativeControl(object) <- value
object |
A NanoStringSet object. |
value |
A matrix with negative control genes. |
Each code set in the nCounter Analyzer includes several negative control genes for which no transcript is expected to be present. We use these spike-in negative control genes to estimate background noise for each sample.
A matrix containing negative control genes
Hong Wang <[email protected]> chi wang <[email protected]>
negativeFactor
data(NanoStringData)
## obtain negative control genes
negativeControl(NanoStringData)
## assign a matrix
n=ncol(exprs(NanoStringData))
r=nrow(negativeControl(NanoStringData))
negative=matrix(rpois(r*n,10),ncol=n)
negativeControl(NanoStringData)=negative
Negative size factors can be used to adjust for background noise in each sample.
## S4 method for signature 'NanoStringSet'
negativeFactor(object)
## S4 replacement method for signature 'NanoStringSet,numeric'
negativeFactor(object) <- value
object |
A NanoStringSet object. |
value |
A vector of background noise. |
Accurate estimation of system background is essential for differential expression (DE) analysis. Each code set in the nCounter Analyzer includes several negative control genes for which no transcript is expected to be present. We use these spike-in negative control genes to estimate background noise for each sample.
A vector containing background noise
Hong Wang <[email protected]> chi wang <[email protected]>
negativeControl
data(NanoStringData)
## obtain negative factors
negativeFactor(NanoStringData)
## assign a vector
n=ncol(exprs(NanoStringData))
lamda=rpois(n,10)
negativeFactor(NanoStringData)=lamda
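A simple way to turn negative control counts into a per-sample background estimate is the column mean, which is the maximum likelihood estimate if the background is assumed to be Poisson with one rate per sample. The sketch below uses that assumption with simulated counts; the estimator used by the package may differ.

```r
set.seed(42)
# Simulated negative controls: 6 probes x 4 samples, true background rate 10.
negative <- matrix(rpois(6 * 4, lambda = 10), nrow = 6)

# Per-sample background: mean count over the negative control probes.
background <- colMeans(negative)
background
```

The resulting vector has one entry per sample and has the shape expected by the negativeFactor replacement method.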
This function is used to pre-check the expressions of positive controls and housekeeping genes before data analysis. Linear regression plot of positive controls and variation analysis of housekeeping genes are available. The expressions of positive controls are supposed to be linearly related to the concentration of input sample materials, and the expressions of housekeeping genes are supposed to have relatively low variation. Nanostring recommends at least three housekeeping genes, but the more that are included, the more accurate the normalization will be.
PlotsPositiveHousekeeping(path=path, header=TRUE)
path |
the path of the file which the data are to be read from. |
header |
a logical value indicating whether the file contains the names of the variables as its first line. If missing, the value is determined from the file format: header is set to TRUE if and only if the first row contains one fewer field than the number of columns. |
hong wang <[email protected]> tingting zhai <[email protected]> chi wang <[email protected]>
##path="/Users/NanoStringdiff-Rcode/Data/horbinski.csv"
##PlotsPositiveHousekeeping(path=path, header=TRUE)
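The two checks described above (linearity of positive controls against concentration, and low variation of housekeeping genes) can be computed numerically in base R. The sketch below uses simulated counts and is not the plotting code of PlotsPositiveHousekeeping; the log2 scale, the added pseudocount of 1, and the coefficient of variation as the variation measure are assumptions.

```r
set.seed(7)
conc <- c(128, 32, 8, 2, 0.5, 0.125)   # positive control concentrations (fM)

# Simulated positive controls: 6 probes x 4 samples, counts proportional to conc.
positive <- matrix(rpois(6 * 4, lambda = conc * 80), nrow = 6)

# Per-sample R^2 of a linear fit of log2 count on log2 concentration.
r_squared <- apply(positive, 2, function(y) {
  summary(lm(log2(y + 1) ~ log2(conc)))$r.squared
})

# Simulated housekeeping genes: 3 genes x 4 samples.
housekeeping <- matrix(rpois(3 * 4, lambda = 1000), nrow = 3)

# Per-gene coefficient of variation across samples.
cv <- apply(housekeeping, 1, sd) / rowMeans(housekeeping)

r_squared
cv
```

An R^2 close to one suggests that the positive controls respond linearly; a small coefficient of variation suggests that a housekeeping gene is stable across samples.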
The nCounter Analyzer includes positive spike-in RNA hybridization controls for each sample, which can be used to estimate the overall efficiency of hybridization and recovery for each sample.
## S4 method for signature 'NanoStringSet'
positiveControl(object)
## S4 replacement method for signature 'NanoStringSet,matrix'
positiveControl(object) <- value
object |
A NanoStringSet object. |
value |
A matrix with six positive control genes. |
Positive control genes are provided by NanoString nCounter technology. For each sample, nCounter provides six positive controls corresponding to six different concentrations in the 30 µl hybridization: 128 fM, 32 fM, 8 fM, 2 fM, 0.5 fM and 0.125 fM. The six positive control genes must be ordered by concentration from high to low.
A matrix containing positive control genes
Hong Wang <[email protected]> chi wang <[email protected]>
positiveFactor
data(NanoStringData)
## obtain positive control genes
positiveControl(NanoStringData)
## assign a matrix
n=ncol(exprs(NanoStringData))
x=matrix(c(128,32,8,2,0.5,0.125)*80,ncol=1)
positive=matrix(rpois(6*n,x),ncol=n)
positiveControl(NanoStringData)=positive
Positive size factors can be used to adjust all platform associated sources of variation.
## S4 method for signature 'NanoStringSet'
positiveFactor(object)
## S4 replacement method for signature 'NanoStringSet,numeric'
positiveFactor(object) <- value
object |
A NanoStringSet object. |
value |
A vector of positive size factors. |
The observed counts, including those of negative control genes and housekeeping control genes, might be affected by experimental factors such as hybridization and binding efficiency. To recover the true rate of gene expression, this variation must be normalized out. Positive size factors normalize this kind of variation.
A vector containing positive size factors
Hong Wang <[email protected]> chi wang <[email protected]>
positiveControl
data(NanoStringData)
## obtain positive factors
positiveFactor(NanoStringData)
## assign a vector
n=ncol(exprs(NanoStringData))
positiveFactor(NanoStringData)=rep(1,n)
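The idea of a positive size factor can be illustrated with a base R sketch in which each sample has a different overall efficiency. The sum-based estimator below is a simplification chosen for illustration and is an assumption, not the estimator used by estNormalizationFactors; the efficiencies and the noise-free counts are made up.

```r
conc <- c(128, 32, 8, 2, 0.5, 0.125)    # positive control concentrations (fM)
efficiency <- c(1.0, 1.2, 0.8, 1.0)     # made-up per-sample platform efficiency

# Noise-free toy counts: 6 positive controls x 4 samples.
positive <- outer(conc * 80, efficiency)

# Sum-based positive size factor, scaled to have mean one across samples.
pos_factor <- colSums(positive) / mean(colSums(positive))
pos_factor
```

Because the toy counts contain no noise and the efficiencies average to one, the recovered factors match the efficiencies used to build the data.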