Title: | Functions for the projection of weights from PCA, CoGAPS, NMF, correlation, and clustering |
---|---|
Description: | Functions for the projection of data into the spaces defined by PCA, CoGAPS, NMF, correlation, and clustering. |
Authors: | Gaurav Sharma, Charles Shin, Jared Slosberg, Loyal Goff, Genevieve Stein-O'Brien |
Maintainer: | Genevieve Stein-O'Brien <[email protected]> |
License: | GPL (==2) |
Version: | 1.23.0 |
Built: | 2024-10-31 03:41:24 UTC |
Source: | https://github.com/bioc/projectR |
Function to provide alluvial matrix for generating alluvial plot
alluvialMat( projection, annotations, annotationName = "Cell type", annotationType = "Cell", plot = TRUE, minPropExplained = 0.75, pvalThreshold = 0.05, qvalThreshold = 0.05 )
alluvialMat( projection, annotations, annotationName = "Cell type", annotationType = "Cell", plot = TRUE, minPropExplained = 0.75, pvalThreshold = 0.05, qvalThreshold = 0.05 )
projection |
a projection generated from projectR, ensure that full = TRUE while generating projection |
annotations |
a character vector of annotations for the data |
annotationName |
a charcter fof collective name of the annotations, default is "Cell type" |
annotationType |
a character indicating the type of data annotated, default is "Cell" |
plot |
logical indicating whether to return the alluvial plot, default is TRUE |
minPropExplained |
threshold for minimum proportion of samples that correspond to a pattern to be used for plotting |
pvalThreshold |
theshold level of significance for p-value |
qvalThreshold |
theshold level of significance for Benjamini-Hochberg corrected p-value |
A matrix to generate alluvial plots
projection <- projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]], full = TRUE) alluvialMat(projection,pd.ESepiGen4c1l$Condition)
projection <- projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]], full = TRUE) alluvialMat(projection,pd.ESepiGen4c1l$Condition)
AP.RNAseq6l3c3t contains the output of the gapsRun function in the CoGAPS package for data = p.RNAseq6l3c3t
AP.RNAseq6l3c3t
AP.RNAseq6l3c3t
A list of 12 items
Calculates AUC values for each set of weights for each label and outputs the results as a matrix
aucMat(labels, weights)
aucMat(labels, weights)
labels |
a vector of labels whose length is equal to the number of columns in the weight matrix |
weights |
a matrix of weights from projection analysis |
A matrix of AUC values for each set of weights classifying each label.
projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) -> projection aucMat(pd.ESepiGen4c1l$Condition,projection)
projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) -> projection aucMat(pd.ESepiGen4c1l$Condition,projection)
Calculate weighted/unweighted mean difference for each gene between 2 groups
bonferroniCorrectedDifferences( group1, group2, pvalue, diff_weights = NULL, mode = "CI" )
bonferroniCorrectedDifferences( group1, group2, pvalue, diff_weights = NULL, mode = "CI" )
group1 |
count matrix 1 |
group2 |
count matrix 2 |
pvalue |
significance value to threshold |
diff_weights |
loadings to weight the differential expression |
mode |
statistical approach, confidence intervals(CI) or pvalues(PV) |
Function to make patterns of continuous weights from clusters.
cluster2pattern(clusters, NP, data, ...) ## S4 method for signature 'character' cluster2pattern(clusters, data) ## S4 method for signature 'numeric' cluster2pattern(clusters, data) ## S4 method for signature 'kmeans' cluster2pattern(clusters, data) ## S4 method for signature 'hclust' cluster2pattern(clusters, NP, data = NA)
cluster2pattern(clusters, NP, data, ...) ## S4 method for signature 'character' cluster2pattern(clusters, data) ## S4 method for signature 'numeric' cluster2pattern(clusters, data) ## S4 method for signature 'kmeans' cluster2pattern(clusters, data) ## S4 method for signature 'hclust' cluster2pattern(clusters, NP, data = NA)
clusters |
a cluster object which could be either an hclust or a kmeans object |
NP |
number of desired patterns |
data |
data used to make clusters object |
... |
Additional arguments to cluster2pattern |
An object of class pclust containing pattern weights corresponding for each cluster.
k.RNAseq6l3c3t<-kmeans(t(p.RNAseq6l3c3t),3) cluster2pattern(clusters=k.RNAseq6l3c3t,data=p.RNAseq6l3c3t) distp <- dist(t(p.RNAseq6l3c3t)) hc.RNAseq6l3c3t <- hclust(distp) cluster2pattern(clusters=hc.RNAseq6l3c3t,NP=3,data=p.RNAseq6l3c3t)
k.RNAseq6l3c3t<-kmeans(t(p.RNAseq6l3c3t),3) cluster2pattern(clusters=k.RNAseq6l3c3t,data=p.RNAseq6l3c3t) distp <- dist(t(p.RNAseq6l3c3t)) hc.RNAseq6l3c3t <- hclust(distp) cluster2pattern(clusters=hc.RNAseq6l3c3t,NP=3,data=p.RNAseq6l3c3t)
class of cluster2pattern output.
clusterMatrix
matrix of continous values for projection that is output of cluster2pattern function
plotting function for clustering objects
clusterPlotR(cData, cls, x, NC, ...) ## S4 method for signature 'ANY,kmeans' clusterPlotR( cData = NA, cls = NA, x = NA, NC = NA, annoIndx = NA, label = NULL, ... ) ## S4 method for signature 'ANY,hclust' clusterPlotR( cData = NA, cls = NA, x = NA, NC = NA, annoIndx = NA, label = NULL, ... )
clusterPlotR(cData, cls, x, NC, ...) ## S4 method for signature 'ANY,kmeans' clusterPlotR( cData = NA, cls = NA, x = NA, NC = NA, annoIndx = NA, label = NULL, ... ) ## S4 method for signature 'ANY,hclust' clusterPlotR( cData = NA, cls = NA, x = NA, NC = NA, annoIndx = NA, label = NULL, ... )
cData |
data used to get clusters |
cls |
a cluster (kmeans or hclust) object |
x |
a vector of length equal to number of samples to use for plotting |
NC |
vector of integers indicating which clusters to use |
... |
additional parameters for plotting. ex. pch,cex,col,labels, xlab, etc. |
annoIndx |
vector indexing into subsets for plotting |
label |
character vector to use for plotting text, defaults is NULL |
A plot of the mean behavior for each cluster
## Not run: k.RNAseq6l3c3t<-kmeans(p.RNAseq6l3c3t,22) clusterPlotR(p.RNAseq6l3c3t, cls=k.RNAseq6l3c3t,NC=1,x=pd.RNAseq6l3c3t$days, col=pd.RNAseq6l3c3t$color) ## End(Not run)
## Not run: k.RNAseq6l3c3t<-kmeans(p.RNAseq6l3c3t,22) clusterPlotR(p.RNAseq6l3c3t, cls=k.RNAseq6l3c3t,NC=1,x=pd.RNAseq6l3c3t$days, col=pd.RNAseq6l3c3t$color) ## End(Not run)
Function to extract genes highly correlated with a gene or reference expression pattern.
correlateR(genes, dat, threshtype = "R", threshold = 0.7, absR = FALSE, ...)
correlateR(genes, dat, threshtype = "R", threshold = 0.7, absR = FALSE, ...)
genes |
gene or character vector of genes for reference expression pattern |
dat |
matrix or data frame with genes to be used for to calculate correlation |
threshtype |
Default "R" indicates thresholding by R value or equivalent. Alternatively, "N" indicates a numerical cut off. |
threshold |
numeric indicating value at which to make threshold. |
absR |
logical indicating where to include both positive and negatively correlated genes |
... |
addtion inputs to cor, such as method |
If threshtype is "R" than threshold must be between -1 and 1. Otherwise if top N correlated genes are required, set threshtype
as "N" and set threshold
= N, i.e, the number of correlated genes required.
A correlation matrix
cor2T<-correlateR(genes="T", dat=p.RNAseq6l3c3t, threshtype="N", threshold=10, absR=TRUE)
cor2T<-correlateR(genes="T", dat=p.RNAseq6l3c3t, threshtype="N", threshold=10, absR=TRUE)
class of correlateR output.
corM
correlation matrix obtained from correlateR
cr_microglia contains the output of the CoGAPS function in the CoGAPS package for data = microglial_counts
cr_microglial
cr_microglial
A CogapsResult object
CR.RNAseq6l3c3t contains the output of the CoGAPS function in the CoGAPS package for data = p.RNAseq6l3c3t
CR.RNAseq6l3c3t
CR.RNAseq6l3c3t
A CogapsResult object
Matches genes accross datasets
geneMatchR( data1, data2, data1Names = NULL, data2Names = NULL, merge = FALSE, ... )
geneMatchR( data1, data2, data1Names = NULL, data2Names = NULL, merge = FALSE, ... )
data1 |
a data matrix, typically genes by samples |
data2 |
an amplitude matrix, typically genes by factors |
data1Names |
rownames of data matrix, for eg genenames |
data2Names |
rownames of amplitude matrix to be matched to rownames of datamatrix |
merge |
logical indicating wether or not to merged data sets |
... |
Additional arguments to geneMatchR |
A list of genes (intersection) in both datasets. (if merge = TRUE, Also returns merged data.)
geneMatchR(data1=p.ESepiGen4c1l$mRNA.Seq,data2=p.RNAseq6l3c3t, data1Names=map.ESepiGen4c1l[["GeneSymbols"]])
geneMatchR(data1=p.ESepiGen4c1l$mRNA.Seq,data2=p.RNAseq6l3c3t, data1Names=map.ESepiGen4c1l[["GeneSymbols"]])
Function to provide tSNE of projection
getTSNE(projection, axis = 2, ...)
getTSNE(projection, axis = 2, ...)
projection |
martrix, a projection generated from projectR |
axis |
integer, either 1 umap of projection or 2 for umap of transpose of projection |
... |
addtional arguments passed to tsne |
projection <- projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]], full = TRUE) projectionTSNE <- getTSNE(projection)
projection <- projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]], full = TRUE) projectionTSNE <- getTSNE(projection)
Function to provide umap of projection
getUMAP(projection, axis = 2, umapMethod = "naive", umapConfig = umap.defaults)
getUMAP(projection, axis = 2, umapMethod = "naive", umapConfig = umap.defaults)
projection |
martrix, a projection generated from projectR |
axis |
integer, either 1 umap of projection or 2 for umap of transpose of projection |
umapMethod |
character, implementation. Available methods are 'naive' (an implementation written in pure R) and 'umap-learn' (requires python package 'umap-learn') |
umapConfig |
umap.config, a list of parameters to customize umap embedding |
A umap of projection
library(umap) projection <- projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]], full = TRUE) umapConfig = umap.defaults umapConfig$n_neighbors = 3 projectionUMAP <- getUMAP(projection,umapConfig = umapConfig)
library(umap) projection <- projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]], full = TRUE) umapConfig = umap.defaults umapConfig$n_neighbors = 3 projectionUMAP <- getUMAP(projection,umapConfig = umapConfig)
log-normalized count data from astrocytes and oligodendrocytes in the p6 mouse cortex.
glial_counts
glial_counts
A gene (rows) by cell (column) matrix
Constructor for cluster2pattern
## S4 method for signature 'cluster2pattern' initialize(.Object, clusterMatrix, ...)
## S4 method for signature 'cluster2pattern' initialize(.Object, clusterMatrix, ...)
.Object |
clusterMatrix object |
clusterMatrix |
matrix of continous values for projection that is output of cluster2pattern function |
... |
additional arguments to intialize cluster2pattern |
initialized cluster2pattern object
Constructor for correlateR
## S4 method for signature 'correlateR' initialize(.Object, corM, ...)
## S4 method for signature 'correlateR' initialize(.Object, corM, ...)
.Object |
correlateR object |
corM |
correlation matrix obtained from correlateR |
... |
additional arguments to intialize correlateR |
initialized correlateR object
Constructor for rotatoR
## S4 method for signature 'rotatoR' initialize(.Object, rotatedM, ...)
## S4 method for signature 'rotatoR' initialize(.Object, rotatedM, ...)
.Object |
rotatoR object |
rotatedM |
rotated matrix from rotatoR function |
... |
additional arguments to intialize rotatoR |
initialized rotatoR object
A function to find and test the intersecting values of two sets of objects, presumably the genes associated with patterns in two different datasets. Both the input objects need to be of the same type either kmeans or hclust.
intersectoR(pSet1, pSet2, pval, ...) ## S4 method for signature 'kmeans,kmeans' intersectoR(pSet1 = NA, pSet2 = NA, pval = 0.05, full = FALSE) ## S4 method for signature 'hclust,hclust' intersectoR(pSet1 = NA, pSet2 = NA, pval = 0.05, full = FALSE, k = NULL)
intersectoR(pSet1, pSet2, pval, ...) ## S4 method for signature 'kmeans,kmeans' intersectoR(pSet1 = NA, pSet2 = NA, pval = 0.05, full = FALSE) ## S4 method for signature 'hclust,hclust' intersectoR(pSet1 = NA, pSet2 = NA, pval = 0.05, full = FALSE, k = NULL)
pSet1 |
an object for a set of patterns where each entry is a set of genes associated with a single pattern |
pSet2 |
an object for a second set of patterns where each entry is a set of genes associated with a single pattern |
pval |
the maximum p-value considered significant |
... |
additional parameters depending on input object |
full |
logical indicating whether to return full data frame of signigicantly overlapping sets. Default is false will return summary matrix. |
k |
Numeric giving cut height for hclust objects, if a vector is given arguments will be applied to pSet1 and pSet2 in that order |
A list containing: Overlap matrix, overlap index, and overlapping sets.
ESepiGen4c1lmRNASeq <- p.ESepiGen4c1l$mRNA.Seq rownames(ESepiGen4c1lmRNASeq) <- map.ESepiGen4c1l$GeneSymbols k.RNAseq6l3c3t<-kmeans(p.RNAseq6l3c3t,22) k.ESepiGen4c1l<-kmeans(ESepiGen4c1lmRNASeq,10) intersectoR(k.RNAseq6l3c3t, k.ESepiGen4c1l, pval=.05) h.RNAseq6l3c3t<-hclust(as.dist(1-(cor(t(p.RNAseq6l3c3t))))) h.ESepiGen4c1l<-hclust(as.dist(1-(cor(t(ESepiGen4c1lmRNASeq))))) intersectoR(pSet1=h.ESepiGen4c1l, pSet2=h.RNAseq6l3c3t, pval=.05, k=c(3,4))
ESepiGen4c1lmRNASeq <- p.ESepiGen4c1l$mRNA.Seq rownames(ESepiGen4c1lmRNASeq) <- map.ESepiGen4c1l$GeneSymbols k.RNAseq6l3c3t<-kmeans(p.RNAseq6l3c3t,22) k.ESepiGen4c1l<-kmeans(ESepiGen4c1lmRNASeq,10) intersectoR(k.RNAseq6l3c3t, k.ESepiGen4c1l, pval=.05) h.RNAseq6l3c3t<-hclust(as.dist(1-(cor(t(p.RNAseq6l3c3t))))) h.ESepiGen4c1l<-hclust(as.dist(1-(cor(t(ESepiGen4c1lmRNASeq))))) intersectoR(pSet1=h.ESepiGen4c1l, pSet2=h.RNAseq6l3c3t, pval=.05, k=c(3,4))
map.ESepiGen4c1l contains gene annotations
map.ESepiGen4c1l
map.ESepiGen4c1l
A data frames with 93 rows and 9 variables:
1. Gifford, C. A. et al. Transcriptional and epigenetic dynamics during specification of human embryonic stem cells. Cell 153, 1149-1163 (2013).
map.RNAseq6l3c3 contains gene annotations for polyA bulk sequencing of 6 cell lines in 3 experimental condition at 3 time points.
map.RNAseq6l3c3t
map.RNAseq6l3c3t
A data frames with 108 rows and 54 variables:
log-normalized count data from microglial cells in the p6 mouse cortex.
microglial_counts
microglial_counts
A gene (rows) by cell (column) matrix
Performs multivariate analysis across specified clusters in datasets
multivariateAnalysisR( significanceLevel = 0.05, patternKeys, seuratobj, dictionaries, customNames = NULL, exclusive = TRUE, exportFolder = "", ANOVAwidth = 1000, ANOVAheight = 1000, CIwidth = 1000, CIheight = 1000, CIspacing = 1 )
multivariateAnalysisR( significanceLevel = 0.05, patternKeys, seuratobj, dictionaries, customNames = NULL, exclusive = TRUE, exportFolder = "", ANOVAwidth = 1000, ANOVAheight = 1000, CIwidth = 1000, CIheight = 1000, CIspacing = 1 )
significanceLevel |
double value for testing significance in ANOVA test |
patternKeys |
list of strings indicating pattern subsets from seuratobj to be analyzed |
seuratobj |
Seurat Object Data containing patternKeys in meta.data |
dictionaries |
list of dictionaries indicating clusters to be compared |
customNames |
list of custom names for clusters in corresponding order |
exclusive |
boolean value for determining interpolation between params in clusters |
exportFolder |
name of folder to store exported graphs and CSV files |
ANOVAwidth |
width of ANOVA png |
ANOVAheight |
height of ANOVA png |
CIwidth |
width of CI png |
CIheight |
height of CI png |
CIspacing |
spacing between each CI in CI graph |
a sorted list of ANOVA and CI results; ANOVA and Confidence Intervals are visualized and exported in both PNG and CSV
Truncated Seurat Object with latent space projection done to unspecified cells in different stages for multivariateAnalysisR analysis
multivariateAnalysisR_seurat_test
multivariateAnalysisR_seurat_test
A Seurat Object with 31034 observations of 4 variables in meta.data:
p.ESepiGen4c1l contains log2(RPKM + 1) values for polyA bulk sequencing and log2 counts of normalized ChIPSeq reads of 1 cell lines with 2 replicates in 4 experimental conditions at a single time point.
p.ESepiGen4c1l
p.ESepiGen4c1l
p.ESepiGen4c1l is a list of 6 data frames each with with 93 rows and between 4 and 9 variables:
1. Gifford, C. A. et al. Transcriptional and epigenetic dynamics during specification of human embryonic stem cells. Cell 153, 1149-1163 (2013).
p.RNAseq6l3c3 contains log2(RPKM + 1) values for polyA bulk sequencing of 6 cell lines in 3 experimental condition at 3 time points.
p.RNAseq6l3c3t
p.RNAseq6l3c3t
A data frames with 108 rows and 54 variables:
pd.ESepiGen4c1l.4cond contains sample phenotype and experimental information
pd.ESepiGen4c1l
pd.ESepiGen4c1l
A data frames with 9 rows and 2 variables:
1. Gifford, C. A. et al. Transcriptional and epigenetic dynamics during specification of human embryonic stem cells. Cell 153, 1149-1163 (2013).
pd.RNAseq6l3c3t contains sample phenotype and experimental information for polyA bulk sequencing of 6 cell lines in 3 experimental condition at 3 time points.
pd.RNAseq6l3c3t
pd.RNAseq6l3c3t
A data frames with 54 rows and 38 variables:
Generate volcano plot and gate genes based on fold change and pvalue, includes vectors that can be used with fast gene set enrichment (fgsea)
pdVolcano( result, FC = 0.2, pvalue = NULL, subset = NULL, filter.inf = FALSE, label.num = 5L, display = TRUE )
pdVolcano( result, FC = 0.2, pvalue = NULL, subset = NULL, filter.inf = FALSE, label.num = 5L, display = TRUE )
result |
result output from projectionDriveR function in PV mode |
FC |
fold change threshold, default at 0.2 |
pvalue |
significance threshold, default set stored pvalue |
subset |
vector of gene names to subset the plot by |
filter.inf |
remove genes that have pvalues below machine double minimum value |
label.num |
Number of genes to label on either side of the volcano plot, default 5 |
display |
boolean. Whether or not to plot and display volcano plots |
A list with weighted and unweighted differential expression metrics
Generate point and line confidence intervals from provided estimates.
plotConfidenceIntervals( confidence_intervals, interval_name = c("low", "high"), pattern_name = NULL, sort = TRUE, genes = NULL, weights = NULL, weights_clip = 0.99, weights_vis_norm = "none", weighted = FALSE )
plotConfidenceIntervals( confidence_intervals, interval_name = c("low", "high"), pattern_name = NULL, sort = TRUE, genes = NULL, weights = NULL, weights_clip = 0.99, weights_vis_norm = "none", weighted = FALSE )
confidence_intervals |
A dataframe of features x estimates. |
interval_name |
Estimate column names. Default: c("low","high") |
pattern_name |
string to use as the title for plots. |
sort |
Boolean. Sort genes by their estimates (default = TRUE) |
genes |
a vector with names of genes to include in plot. If sort=F, estimates will be plotted in this order. |
weights |
optional. weights of features to include as annotation. |
weights_clip |
optional. quantile of data to clip color scale for improved visualization. Default: 0.99 |
weights_vis_norm |
Which version of weights to visualize as a heatmap. Options are "none" (uses provided weights) or "quantiles". Default: none |
weighted |
specifies whether the confidence intervals in use are weighted by the pattern and labels plots accordingly |
A list with pointrange estimates and a heatmap of pattern weights.
Volcano plotting function
plotVolcano(stats, metadata, FC, pvalue, title)
plotVolcano(stats, metadata, FC, pvalue, title)
stats |
data frame with differential expression statistics |
metadata |
#metadata from pdVolcano |
FC |
Fold change threshold |
pvalue |
p value threshold |
title |
plot title |
Calculate weighted expression difference between two groups (group1 - group2)
projectionDriveR( cellgroup1, cellgroup2, loadings, pattern_name, loadingsNames = NULL, pvalue = 1e-05, display = TRUE, normalize_pattern = TRUE, mode = "CI" )
projectionDriveR( cellgroup1, cellgroup2, loadings, pattern_name, loadingsNames = NULL, pvalue = 1e-05, display = TRUE, normalize_pattern = TRUE, mode = "CI" )
cellgroup1 |
gene x cell count matrix for cell group 1 |
cellgroup2 |
gene x cell count matrix for cell group 2 |
loadings |
A matrix of continuous values defining the features |
pattern_name |
column of loadings for which drivers will be calculated |
loadingsNames |
a vector with names of loading rows defaults to rownames |
pvalue |
confidence level. Default 1e-5 |
display |
boolean. Whether or not to display confidence intervals |
normalize_pattern |
Boolean. Whether or not to normalize pattern weights |
mode |
statistical approach, confidence intervals or pvalues. default CI |
A list with unweighted/weighted mean differences and differential genes that meet the provided signficance threshold.
A function for the projection of new data into a previously defined feature space.
projectR(data, loadings, dataNames = NULL, loadingsNames = NULL, ...) ## S4 method for signature 'matrix,matrix' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE, family = "gaussianff", bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'dgCMatrix,matrix' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE, family = "gaussianff" ) ## S4 method for signature 'matrix,LinearEmbeddingMatrix' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE, model = NA, family = "gaussianff", bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'matrix,prcomp' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE ) ## S4 method for signature 'matrix,rotatoR' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE ) ## S4 method for signature 'matrix,correlateR' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE, bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'matrix,hclust' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, full = FALSE, targetNumPatterns, sourceData, bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'matrix,kmeans' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, full = FALSE, sourceData, bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'matrix,cluster2pattern' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, full = FALSE, sourceData, bootstrapPval = FALSE, bootIter = 1000 )
projectR(data, loadings, dataNames = NULL, loadingsNames = NULL, ...) ## S4 method for signature 'matrix,matrix' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE, family = "gaussianff", bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'dgCMatrix,matrix' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE, family = "gaussianff" ) ## S4 method for signature 'matrix,LinearEmbeddingMatrix' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE, model = NA, family = "gaussianff", bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'matrix,prcomp' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE ) ## S4 method for signature 'matrix,rotatoR' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE ) ## S4 method for signature 'matrix,correlateR' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, NP = NA, full = FALSE, bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'matrix,hclust' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, full = FALSE, targetNumPatterns, sourceData, bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'matrix,kmeans' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, full = FALSE, sourceData, bootstrapPval = FALSE, bootIter = 1000 ) ## S4 method for signature 'matrix,cluster2pattern' projectR( data, loadings, dataNames = NULL, loadingsNames = NULL, full = FALSE, sourceData, bootstrapPval = FALSE, bootIter = 1000 )
data |
Target dataset into which you will project. It must of type matrix. |
loadings |
loadings learned from source dataset. |
dataNames |
a vector containing unique name, i.e. gene names, for the rows of the target dataset to be used to match features with the loadings, if not provided by |
loadingsNames |
a vector containing unique names, i.e. gene names, for the rows ofloadings to be used to match features with the data, if not provided by |
... |
Additional arguments to projectR |
NP |
vector of integers indicating which columns of loadings object to use. The default of NP=NA will use entire matrix. |
full |
logical indicating whether to return the full model solution. By default only the new pattern object is returned. |
family |
VGAM family function for model fitting (default: "gaussianff") |
bootstrapPval |
logical to indicate whether to generate p-values using bootstrap, not available for prcomp and rotatoR objects |
bootIter |
number of bootstrap iterations, default = 1000 |
model |
Optional arguements to choose method for projection |
targetNumPatterns |
desired number of patterns with hclust |
sourceData |
data used to create cluster object |
loadings
can belong to one of several classes depending on upstream
analysis. Currently permitted classes are matrix
, CogapsResult
,
CoGAPS
, pclust
, prcomp
, rotatoR
,
and correlateR
. Please note that loadings
should not contain NA.
A matrix of sample weights for each input basis in the loadings matrix (if full=TRUE, full model solution is returned).
projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) library("CoGAPS") # CR.RNAseq6l3c3t <- CoGAPS(p.RNAseq6l3c3t, params = new("CogapsParams", nPatterns=5)) projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=CR.RNAseq6l3c3t, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) pca.RNAseq6l3c3t<-prcomp(t(p.RNAseq6l3c3t)) pca.ESepiGen4c1l<-projectR(data=p.ESepiGen4c1l$mRNA.Seq, loadings=pca.RNAseq6l3c3t, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) pca.RNAseq6l3c3t<-prcomp(t(p.RNAseq6l3c3t)) r.RNAseq6l3c3t<-rotatoR(1,1,-1,-1,pca.RNAseq6l3c3t$rotation[,1:2]) pca.ESepiGen4c1l<-projectR(data=p.ESepiGen4c1l$mRNA.Seq, loadings=r.RNAseq6l3c3t, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) c.RNAseq6l3c3t<-correlateR(genes="T", dat=p.RNAseq6l3c3t, threshtype="N", threshold=10, absR=TRUE) cor.ESepiGen4c1l<-projectR(data=p.ESepiGen4c1l$mRNA.Seq, loadings=c.RNAseq6l3c3t, NP="PositiveCOR", dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) library("projectR") data(p.RNAseq6l3c3t) nP<-3 kClust<-kmeans(t(p.RNAseq6l3c3t),centers=nP) kpattern<-cluster2pattern(clusters = kClust, NP = nP, data = p.RNAseq6l3c3t) p<-as.matrix(p.RNAseq6l3c3t) projectR(p,kpattern)
projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=AP.RNAseq6l3c3t$Amean, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) library("CoGAPS") # CR.RNAseq6l3c3t <- CoGAPS(p.RNAseq6l3c3t, params = new("CogapsParams", nPatterns=5)) projectR(data=p.ESepiGen4c1l$mRNA.Seq,loadings=CR.RNAseq6l3c3t, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) pca.RNAseq6l3c3t<-prcomp(t(p.RNAseq6l3c3t)) pca.ESepiGen4c1l<-projectR(data=p.ESepiGen4c1l$mRNA.Seq, loadings=pca.RNAseq6l3c3t, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) pca.RNAseq6l3c3t<-prcomp(t(p.RNAseq6l3c3t)) r.RNAseq6l3c3t<-rotatoR(1,1,-1,-1,pca.RNAseq6l3c3t$rotation[,1:2]) pca.ESepiGen4c1l<-projectR(data=p.ESepiGen4c1l$mRNA.Seq, loadings=r.RNAseq6l3c3t, dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) c.RNAseq6l3c3t<-correlateR(genes="T", dat=p.RNAseq6l3c3t, threshtype="N", threshold=10, absR=TRUE) cor.ESepiGen4c1l<-projectR(data=p.ESepiGen4c1l$mRNA.Seq, loadings=c.RNAseq6l3c3t, NP="PositiveCOR", dataNames = map.ESepiGen4c1l[["GeneSymbols"]]) library("projectR") data(p.RNAseq6l3c3t) nP<-3 kClust<-kmeans(t(p.RNAseq6l3c3t),centers=nP) kpattern<-cluster2pattern(clusters = kClust, NP = nP, data = p.RNAseq6l3c3t) p<-as.matrix(p.RNAseq6l3c3t) projectR(p,kpattern)
CoGAPS patterns learned from the developing mouse retina.
retinal_patterns
retinal_patterns
A gene (rows) by pattern (column) matrix
1. Clark, B.S., & Stein-O'Brien G.L., et al. Single-Cell RNA-Seq Analysis of Development Identifies NFI Factors as Regulating Mitotic Exit and Late-Born Cell Specification. Cell 102, 1111-1126 (2019).
a function for rotating two basis about a point or line in that plane
rotatoR(x1, y1, x2, y2, basisSET)
rotatoR(x1, y1, x2, y2, basisSET)
x1 |
a value describing a the coordinate of a point in the first basis. If no values are provided for x2 |
y1 |
a value describing a the coordinate of a point in the second basis |
x2 |
a value describing a the coordinate of the second point in the second basis |
y2 |
a value describing a the coordinate of the second point in the second basis |
basisSET |
the basis to be rotated |
An object of class rotatoR.
pca.RNAseq6l3c3t<-prcomp(t(p.RNAseq6l3c3t)) r.RNAseq6l3c3t<-rotatoR(1,1,-1,-1,pca.RNAseq6l3c3t$rotation[,1:2])
pca.RNAseq6l3c3t<-prcomp(t(p.RNAseq6l3c3t)) r.RNAseq6l3c3t<-rotatoR(1,1,-1,-1,pca.RNAseq6l3c3t$rotation[,1:2])
class of rotatoR output.
rotatedM
rotated basis set (matrix) that is output of rotatoR function