Title: | rcellminer: Molecular Profiles, Drug Response, and Chemical Structures for the NCI-60 Cell Lines |
---|---|
Description: | The NCI-60 cancer cell line panel has been used over the course of several decades as an anti-cancer drug screen. This panel was developed as part of the Developmental Therapeutics Program (DTP, http://dtp.nci.nih.gov/) of the U.S. National Cancer Institute (NCI). Thousands of compounds have been tested on the NCI-60, which has been extensively characterized by many platforms for gene and protein expression, copy number, mutation, and others (Reinhold, et al., 2012). The purpose of the CellMiner project (http://discover.nci.nih.gov/cellminer) has been to integrate data from multiple platforms used to analyze the NCI-60 and to provide a powerful suite of tools for exploration of NCI-60 data. |
Authors: | Augustin Luna, Vinodh Rajapakse, Fabricio Sousa |
Maintainer: | Augustin Luna, Vinodh Rajapakse, Fathi Elloumi |
License: | LGPL-3 + file LICENSE |
Version: | 2.29.0 |
Built: | 2024-10-31 04:23:05 UTC |
Source: | https://github.com/bioc/rcellminer |
Display citation message
.onAttach(libname, pkgname)
libname |
a character string giving the library directory where the package defining the namespace was found. |
pkgname |
a character string giving the name of the package. |
Make sure that rcellminerData is loaded
.onLoad(libname, pkgname)
libname |
a character string giving the library directory where the package defining the namespace was found. |
pkgname |
a character string giving the name of the package. |
Returns an indexed eSet object from a MolData object eSet list.
## S4 method for signature 'MolData'
x[[i]]
x |
A MolData object. |
i |
Index or named item in MolData object eSet list. |
An indexed eSet object from a MolData object eSet list.
Assigns an eSet object to a specified position in a MolData object eSet list.
## S4 replacement method for signature 'MolData'
x[[i]] <- value
x |
A MolData object. |
i |
Index or named item in MolData object eSet list. |
value |
An eSet object to be assigned. |
A MolData object with the eSet object assigned to the specified position in its eSet list.
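The pairing of an extraction method and a replacement method can be illustrated with a toy S4 class. This is only a sketch: `ToyMolData` is a hypothetical stand-in that stores plain vectors rather than eSet objects, and it is not the rcellminer implementation.

```r
library(methods)

# Hypothetical toy class; the real MolData class stores eSet objects.
setClass("ToyMolData", representation(eSetList = "list"))

# Extraction: return the indexed (or named) element of the internal list.
setMethod("[[", "ToyMolData", function(x, i, j, ...) {
  x@eSetList[[i]]
})

# Replacement: assign into the internal list and return the modified object.
setReplaceMethod("[[", "ToyMolData", function(x, i, j, ..., value) {
  x@eSetList[[i]] <- value
  x
})

toy <- new("ToyMolData", eSetList = list(exp = 1:3, mut = c(0, 1, 0)))
expOnly <- toy[["exp"]]      # extract by name
toy[["cop"]] <- c(2, 2, 3)   # add a new named element
```

The replacement method returns the whole object, which is what lets `toy[["cop"]] <- value` update `toy` in place from the caller's point of view.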
CellMiner Version
The version of CellMiner used
Vinodh Rajapakse vinodh.rajapakse AT nih.gov
http://discover.nci.nih.gov/cellminer
Calculate cross-correlations between rows of input matrices
crossCors(X, Y = NULL, method = "pearson")
X |
a matrix or data.frame |
Y |
a matrix or data.frame |
method |
a string specifying the type of correlation, chosen from pearson (default) or spearman. |
a list containing matrices of pairwise correlations and their p-values between rows of the input matrices or dataframes.
Sudhir Varma, NCI-LMP, with input checks and support for Spearman's correlation added by VNR.
drugActData <- exprs(getAct(rcellminerData::drugData))
crossCors(drugActData[c("94600"), ], drugActData[c("727625", "670655"), ])
crossCors(drugActData[c("94600"), ], drugActData[c("727625", "670655"), ], method="spearman")
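For readers without the data package installed, the idea of row-by-row cross-correlation with p-values can be sketched in base R on simulated data. The helper `rowCrossCors` and the list element names `cor` and `pval` are assumptions made for this illustration; they are not taken from the package source.

```r
# Hypothetical helper, not the package implementation: correlate every row
# of X with every row of Y and collect the matching p-values.
rowCrossCors <- function(X, Y = X, method = "pearson") {
  X <- as.matrix(X)
  Y <- as.matrix(Y)
  # cor() works on columns, so transpose to compare rows.
  corMat <- cor(t(X), t(Y), method = method, use = "pairwise.complete.obs")
  pvalMat <- matrix(NA_real_, nrow(X), nrow(Y), dimnames = dimnames(corMat))
  for (i in seq_len(nrow(X))) {
    for (j in seq_len(nrow(Y))) {
      pvalMat[i, j] <- cor.test(X[i, ], Y[j, ], method = method)$p.value
    }
  }
  list(cor = corMat, pval = pvalMat)
}

set.seed(1)
X <- matrix(rnorm(2 * 20), nrow = 2, dimnames = list(c("a", "b"), NULL))
Y <- matrix(rnorm(3 * 20), nrow = 3, dimnames = list(c("u", "v", "w"), NULL))
res <- rowCrossCors(X, Y)   # res$cor and res$pval are both 2 x 3
```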
Calculate Spearman's correlations between rows of input matrices
crossCorsSpearman(X, Y = NULL)
X |
a matrix or data.frame |
Y |
a matrix or data.frame |
a list containing matrices of pairwise Spearman's correlations and their p-values between rows of the input matrices or dataframes.
## Not run:
crossCorsSpearman(drugActData[c("94600"), ], drugActData[c("727625", "670655"), ])
## End(Not run)
A data frame with descriptive information for all compound mechanism of action (MOA) abbreviations used in CellMiner.
Returns a DrugData object.
DrugData(act, repeatAct, sampleData, ...)
act |
An eSet object containing drug activity data across a set of biological samples. |
repeatAct |
An eSet object containing repeat drug activity experiment data with respect to the same samples associated with act. |
sampleData |
A MIAxE object capturing sample and other data set information. |
... |
Other possible parameters. |
A DrugData object.
An S4 class to represent drug activity and related data recorded for a set of biological samples.
... |
Other possible parameters. |
act
An eSet object containing drug activity data across a set of biological samples.
repeatAct
An eSet object containing repeat drug activity experiment data with respect to the same samples associated with act.
sampleData
A MIAxE object capturing sample and other data set information.
Returns a DrugData object.
## S4 method for signature 'eSet,eSet,MIAxE'
DrugData(act, repeatAct, sampleData, ...)
act |
An eSet object containing drug activity data across a set of biological samples. |
repeatAct |
An eSet object containing repeat drug activity experiment data with respect to the same samples associated with act. |
sampleData |
A MIAxE object capturing sample and other data set information. |
... |
Other possible parameters. |
A DrugData object.
CellMiner Drug Response Values
A list containing response values and annotations:
act Z-scores of the averaged negative log GI50 (50% growth inhibition) values across repeats for the NCI-60; assay described here: http://dtp.nci.nih.gov/branches/btb/ivclsp.html
annot
id Dataset identifier; NOTE: DO NOT use this column; the NSC is the primary drug identifier
nsc National Service Center identifier; the primary drug identifier
name Compound name
brand_name Brand name for the compound, if sold commercially
formula Compound chemical formula
testing_status Information on whether it is known if the compound is FDA approved or undergoing testing in clinical trials
source TODO
smiles Compound chemical structure as a SMILES string
weight Compound chemical weight in g/mol
mechanism Pharmacological mechanism of action
confidential_flag A flag to indicate if compound information is public
total_probes TODO
total_good_probes TODO
low_correlations TODO
failure_reason TODO
cas CAS Registry Number; NOTE: Due to data restrictions PubChem IDs are the preferred mapping ID to other datasets
pubchem_id PubChem ID
Vinodh Rajapakse vinodh.rajapakse AT nih.gov
http://discover.nci.nih.gov/cellminer/loadDownload.do
Z-scores of values for a variety of assays conducted on the NCI-60 to facilitate comparison. Z-scores calculated over the 60 cell lines for the given feature.
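As a rough illustration of what "Z-scores calculated over the 60 cell lines for the given feature" means, the sketch below standardizes each row of a simulated feature-by-cell-line matrix. The helper name `zScoreRows` and the simulated values are assumptions for the example; the actual CellMiner processing may differ in detail.

```r
# Hypothetical helper: standardize each feature (row) across cell lines (columns).
zScoreRows <- function(X) {
  t(apply(X, 1, function(v) (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)))
}

set.seed(42)
raw <- matrix(rnorm(3 * 60, mean = 5, sd = 2), nrow = 3,
              dimnames = list(c("f1", "f2", "f3"), paste0("line", 1:60)))
z <- zScoreRows(raw)   # each row now has mean 0 and standard deviation 1
```

Standardizing per feature puts assays with different units on a common scale, which is what makes cross-platform comparison practical.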
A list containing various assay values:
cop Copy number values; Described in Pubmed ID: 24670534
exp Expression values; Obtained from "RNA: 5 Platform Gene Transcript" http://discover.nci.nih.gov/cellminer/loadDownload.do; Missing values imputed using the R package "impute"
mut Mutation values; Deleterious mutations obtained from TODO
mir MicroRNA values; Obtained from "RNA: Agilent Human microRNA (V2)" http://discover.nci.nih.gov/cellminer/loadDownload.do
pro Reverse protein lysate array values; Obtained from "Protein: Lysate Array" http://discover.nci.nih.gov/cellminer/loadDownload.do
mda NCI-60 metadata.
CNV_GAIN Proportion of genome copy number gains; Described in Pubmed ID: 24670534
CNV_LOSS Proportion of genome copy number losses; Described in Pubmed ID: 24670534
CNV_TOTAL Sum of CNV_GAIN and CNV_LOSS
P53_BIN Binary TP53 profile curated by William Reinhold
MSI_OGAN_BIN Binary microsatellite instability (MSI) profile curated by Ogan Abaan using COSMIC data; Obtained from Supplementary Table 1 - Ogan Whole Exome Sequencing (WES) paper in Cancer Res.
EPITHELIAL Epithelial by tissue of origin - pattern extracted from the CellMiner cell line metadata http://discover.nci.nih.gov/cellminer/celllineMetadata.do
EPITHELIAL_KURT Kurt Kohn curation for epithelial-like cell lines based on molecular parameters described in Pubmed ID: 24940735
DELETERIOUS Total deleterious variants from WES dataset; Fabricio Sousa curation
MISSENSE Total missense variants from WES dataset; Fabricio Sousa curation
SILENT Total silent variants from WES dataset; Fabricio Sousa curation
TOTAL_AA Total amino acid changing variants from WES dataset; Fabricio Sousa curation
CELL-CELL Cell-to-cell adhesion curated by William Reinhold
DOUBLINGTIME The doubling time pattern was extracted from the CellMiner cell line metadata http://discover.nci.nih.gov/cellminer/celllineMetadata.do
Vinodh Rajapakse vinodh.rajapakse AT nih.gov
http://discover.nci.nih.gov/cellminer
Molecular Fingerprint List
Augustin Luna augustin AT mail.nih.gov
http://discover.nci.nih.gov/cellminer
Returns an eSet object with drug activity data.
getAct(object, ...)
object |
Object for which drug activity data is to be returned. |
... |
Other possible parameters. |
An eSet object with drug activity data.
Returns an eSet object with drug activity data.
## S4 method for signature 'DrugData'
getAct(object)
object |
DrugData object for which drug activity data is to be returned. |
An eSet object with drug activity data.
Returns a table of activity range statistics for a set of compounds.
getActivityRangeStats(nscSet, concFormat = "NegLogGI50M", onlyCellMinerExps = TRUE)
nscSet |
a character vector specifying NSC identifier(s) for compound(s) of interest. |
concFormat |
a string selected from "NegLogGI50M" or "IC50MicroM". "NegLogGI50M" specifies activities as the negative log of the 50% growth inhibitory concentration (molar). "IC50MicroM" specifies activities as the 50% growth inhibitory concentration (micromolar). |
onlyCellMinerExps |
a logical value indicating whether to only return experimental data included in CellMiner (default=TRUE). |
a table of activity range statistics for a set of compounds.
nscSet <- c("609699", "740")
getActivityRangeStats(nscSet)
getActivityRangeStats(nscSet, concFormat="IC50MicroM")
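The two concFormat options describe the same quantity on different scales. The sketch below shows the conversion, assuming the log is base 10 and that "IC50MicroM" is a plain unit conversion of the same value. The helper names are hypothetical and not part of rcellminer.

```r
# Hypothetical helpers: convert between -log10(molar) and micromolar.
negLogMolarToMicroM <- function(negLogM) 10^(6 - negLogM)
microMToNegLogMolar <- function(microM) -log10(microM * 1e-6)

negLogVals <- c(5, 6, 8)
microMVals <- negLogMolarToMicroM(negLogVals)   # 10, 1, and 0.01 micromolar
roundTrip <- microMToNegLogMolar(microMVals)    # back to 5, 6, 8
```

On the negative log scale larger values mean greater potency, whereas on the micromolar scale smaller values do, so range statistics read in opposite directions in the two formats.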
Returns a list of feature data matrices.
getAllFeatureData(object, ...)
object |
Object for which a list of feature data matrices is to be returned. |
... |
Other possible parameters. |
A list of feature data matrices.
Returns a list of feature data matrices.
## S4 method for signature 'MolData'
getAllFeatureData(object)
object |
MolData object for which a list of feature data matrices is to be returned. |
A list of feature data matrices.
Compute a binary gene mutation data matrix from SNP and other mutation event-level data.
getBinaryMutationData(mutInfo, mutData, maxVariantFreq = 0.2, maxNormalPopulationFreq = 0.005, maxSiftScore = 0.05, minPolyPhenScore = 0.85)
mutInfo |
A data frame with the following named columns: Gene, the name of the gene associated with the mutation event; probe.ids, a unique identifier specifying the mutation event; SNP_1000_genome, the frequency of the mutation event in SNP 1000; ESP5400, the frequency of the mutation event in ESP5400; SNP_type, the type of mutation event, chosen from "MISSENSE", "FRAMESHIFT", "NONFRAMESHIFT", "NONSENSE", "SPLICING"; SIFT_score, the SIFT score; Polyphen_score, the POLYPHEN score. Rownames of mutInfo should be set to probe.ids, i.e., the unique mutation event specifier. |
mutData |
A matrix with event level mutation information, with SNPs, etc. along rows and samples along columns. Rownames of mutData should exactly match those of mutInfo. The i-th row of mutInfo should thus give detailed information for the mutation event with data specified in the i-th row of mutData. |
maxVariantFreq |
The maximum proportion of mutant samples (used to exclude frequently occurring events); default value = 0.2. |
maxNormalPopulationFreq |
The maximum frequency of a mutation in the normal population (used to exclude likely germline variants); default value = 0.005. |
maxSiftScore |
The maximum accepted SIFT score (used to exclude presumed non-deleterious mutations); default value = 0.05. |
minPolyPhenScore |
The minimum accepted POLYPHEN score (used to exclude presumed non-deleterious mutations); default value = 0.85. |
A binary gene mutation matrix, with genes along rows, samples along columns, and 1s indicating deleterious mutations.
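The thresholds above can be read as a sequence of event-level filters followed by a collapse to gene level. The sketch below is one plausible reading, not the package's code. It makes two assumptions that the text does not confirm: a missense event passes if either the SIFT or the PolyPhen criterion is met, and non-missense event types are always treated as deleterious. All data are toy values.

```r
# Toy inputs; column names follow the mutInfo description above.
mutInfo <- data.frame(
  Gene = c("G1", "G1", "G2", "G3"),
  probe.ids = c("e1", "e2", "e3", "e4"),
  SNP_1000_genome = c(0, 0.2, 0, 0),
  ESP5400 = c(0, 0.1, 0, 0),
  SNP_type = c("MISSENSE", "MISSENSE", "NONSENSE", "MISSENSE"),
  SIFT_score = c(0.01, 0.01, NA, 0.60),
  Polyphen_score = c(0.95, 0.95, NA, 0.10),
  stringsAsFactors = FALSE
)
rownames(mutInfo) <- mutInfo$probe.ids
mutData <- matrix(0, nrow = 4, ncol = 10,
                  dimnames = list(mutInfo$probe.ids, paste0("s", 1:10)))
mutData["e1", "s1"] <- 1
mutData["e2", c("s1", "s2")] <- 1
mutData["e3", "s2"] <- 1
mutData["e4", "s3"] <- 1

# Event-level filters (the combination rule is an assumption).
variantFreq <- rowMeans(mutData > 0)
rareInPopulation <- mutInfo$SNP_1000_genome <= 0.005 & mutInfo$ESP5400 <= 0.005
isMissense <- mutInfo$SNP_type == "MISSENSE"
predictedDamaging <- !isMissense |
  (mutInfo$SIFT_score <= 0.05 | mutInfo$Polyphen_score >= 0.85)
keep <- variantFreq <= 0.2 & rareInPopulation & predictedDamaging

# Collapse kept events to genes: a gene is mutated in a sample if any kept event is.
keptData <- (mutData[keep, , drop = FALSE] > 0) * 1
geneMut <- (rowsum(keptData, group = mutInfo$Gene[keep]) > 0) * 1
```

In this toy set, e2 is dropped as a likely germline variant and e4 as presumed non-deleterious, so only G1 (sample s1) and G2 (sample s2) carry a 1 in `geneMut`.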
Calculate a quantile for each column of a matrix
getColumnQuantiles(X, prob, naRm = FALSE, onlyNonzeroVals = FALSE)
X |
the matrix |
prob |
a numeric probability |
naRm |
a boolean, whether to remove NAs |
onlyNonzeroVals |
a boolean, whether to only include non-zero values |
a vector of quantiles
getColumnQuantiles(matrix(1:25, nrow=5), prob = 0.5)
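A base-R sketch of per-column quantiles with the same argument names is shown below. The helper `columnQuantiles` is hypothetical, and the handling of `onlyNonzeroVals` (dropping exact zeros before computing the quantile) is an assumption about what that flag means.

```r
# Hypothetical helper: one quantile per column, optionally ignoring zeros.
columnQuantiles <- function(X, prob, naRm = FALSE, onlyNonzeroVals = FALSE) {
  apply(X, 2, function(v) {
    if (onlyNonzeroVals) v <- v[is.na(v) | v != 0]   # keep NAs, drop exact zeros
    unname(quantile(v, probs = prob, na.rm = naRm))
  })
}

X <- matrix(1:25, nrow = 5)
medians <- columnQuantiles(X, prob = 0.5)   # 3, 8, 13, 18, 23

X2 <- X
X2[1, ] <- 0                                # zero out the first row
nonzeroMedians <- columnQuantiles(X2, prob = 0.5, onlyNonzeroVals = TRUE)
```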
Returns a matrix containing activity (-logGI50) data for a set of compounds.
getDrugActivityData(nscSet, onlyCellMinerExps = TRUE)
nscSet |
A character vector specifying the NSC identifiers for the compounds. |
onlyCellMinerExps |
A logical value indicating whether to compute results using only experimental data included in CellMiner (default=TRUE). |
a matrix with NCI-60 average (over experiments) -logGI50 activity data; compound activity profiles are along rows.
nscSet <- c("141540", "123127") # Etoposide, Doxorubicin.
actData <- getDrugActivityData(nscSet)
Returns a vector of log activity range values for a set of compounds.
getDrugActivityRange(nscSet, computeIQR = FALSE)
nscSet |
a character vector specifying NSC identifier(s) for compound(s) of interest. |
computeIQR |
logical value indicating whether the inter-quartile range is to be computed; otherwise the absolute range is computed (default=FALSE). |
a numeric vector of NCI-60 log activity (-logGI50) range values indexed by the identifiers in nscSet.
nscSet <- c("609699", "740")
getDrugActivityRange(nscSet)
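The difference between the absolute range and the inter-quartile range can be seen on a small made-up activity matrix. The helper `activityRange` and the values are hypothetical, and ignoring missing cell lines (`na.rm = TRUE`) is an assumption.

```r
# Hypothetical helper: per-compound (row) range of -logGI50 values.
activityRange <- function(actMat, computeIQR = FALSE) {
  apply(actMat, 1, function(v) {
    if (computeIQR) IQR(v, na.rm = TRUE) else diff(range(v, na.rm = TRUE))
  })
}

act <- rbind(drugA = c(4, 5, 6, 7, 8),
             drugB = c(5, 5, 5.5, NA, 6))
rangeVals <- activityRange(act)                   # max minus min per compound
iqrVals <- activityRange(act, computeIQR = TRUE)  # 75th minus 25th percentile
```

The inter-quartile range is less sensitive to a single unusually responsive or resistant cell line, which is a reason one might prefer it when screening for compounds with broadly variable activity.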
Returns a matrix containing repeat activity experiment data for a compound.
getDrugActivityRepeatData(nscStr, concFormat = "NegLogGI50M", onlyCellMinerExps = TRUE)
nscStr |
a string specifying the NSC identifier for the compound. |
concFormat |
a string selected from "NegLogGI50M" or "IC50MicroM". "NegLogGI50M" specifies activities as the negative log of the 50% growth inhibitory concentration (molar). "IC50MicroM" specifies activities as the 50% growth inhibitory concentration (micromolar). |
onlyCellMinerExps |
a logical value indicating whether to only return experimental data included in CellMiner (default=TRUE). |
a matrix with activity data from each experiment associated with a compound organized along the rows.
nscStr <- "609699"
actData <- getDrugActivityRepeatData(nscStr, concFormat='NegLogGI50M')
actData <- getDrugActivityRepeatData(nscStr, concFormat='IC50MicroM')
Get a list of applicable MOA strings for a drug.
getDrugMoaList(nsc, moaToCompoundListMap = NULL)
nsc |
An NSC string. |
moaToCompoundListMap |
A named list of character vectors, with each name indicating an MOA class, and its corresponding character vector specifying MOA-associated drugs. If unspecified, this is constructed based on MOA information provided by CellMiner. |
LINK TO MOAs?
A character vector giving all MOA classes for the drug.
getDrugMoaList("754365")
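The moaToCompoundListMap argument is a named list keyed by MOA class, so finding a drug's classes amounts to checking membership in each element. A minimal sketch follows, with made-up class names and memberships (`moaA`, `moaB`, `moaC` are not real CellMiner abbreviations) and a hypothetical helper `drugMoas`:

```r
# Made-up MOA map: names are MOA classes, values are NSC identifiers.
moaMap <- list(
  moaA = c("740", "3053"),
  moaB = c("609699", "94600"),
  moaC = c("740")
)

# Hypothetical helper: names of all classes whose compound set contains the NSC.
drugMoas <- function(nsc, moaMap) {
  names(moaMap)[vapply(moaMap, function(ids) nsc %in% ids, logical(1))]
}

moasFor740 <- drugMoas("740", moaMap)      # "moaA" "moaC"
moasForNone <- drugMoas("99999", moaMap)   # character(0): no annotation
```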
Get the drug names for a set of NSC identifiers.
getDrugName(nscSet)
nscSet |
A character vector of NSC strings |
A named character vector indicating the compound names for each NSC in nscSet (with an empty string returned if no such information is available, and an NA returned if the NSC is not included in the CellMiner database).
nscSet <- c("609699", "94600")
getDrugName(nscSet)
Returns a list of eSet objects.
getESetList(object, ...)
object |
Object for which a list of eSets is to be returned. |
... |
Other possible parameters. |
A list of eSet objects.
Returns a list of eSet objects.
## S4 method for signature 'MolData'
getESetList(object)
object |
MolData object for which a list of eSet objects is to be returned. |
A list of eSet objects.
Returns a list of data frames with feature information.
getFeatureAnnot(object, ...)
object |
Object for which feature data is to be returned. |
... |
Other possible parameters. |
A list of data frames with feature information.
Returns a list of data frames with feature information.
## S4 method for signature 'DrugData'
getFeatureAnnot(object)
object |
DrugData object for which feature data is to be returned. |
A named list of data frames with feature information for drugs and drug repeat experiments.
Returns a list of data frames with feature information.
## S4 method for signature 'MolData'
getFeatureAnnot(object)
object |
MolData object for which feature data is to be returned. |
A named list of data frames with feature information for available molecular data types.
Extract from a list of matrices the data associated with a set of features.
getFeatureDataFromMatList(featureSet, dataMatList, excludeMissingFeatures = TRUE)
featureSet |
a character vector of feature names. |
dataMatList |
a list of matrices with feature data organized along the rows, and feature names accessible via rownames(dataMatList). |
excludeMissingFeatures |
a logical value indicating whether features whose data cannot be found in any matrices in dataMatList should be excluded from the output (default=TRUE). |
a single matrix containing data for all features in featureSet.
featureSet <- c("expSLFN11", "mutSLX4")
molDataMats <- getMolDataMatrices()
featureData <- getFeatureDataFromMatList(featureSet, molDataMats)
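The lookup can be pictured as a search for each prefixed feature name across the row names of every matrix in the list. The sketch below uses toy matrices and a hypothetical helper `collectFeatures`. Returning an all-NA row for a feature that is not found when `excludeMissingFeatures = FALSE` is an assumption.

```r
# Hypothetical helper: gather rows for the requested features from a list of matrices.
collectFeatures <- function(featureSet, dataMatList, excludeMissingFeatures = TRUE) {
  sampleNames <- colnames(dataMatList[[1]])
  out <- matrix(NA_real_, nrow = length(featureSet), ncol = length(sampleNames),
                dimnames = list(featureSet, sampleNames))
  found <- logical(length(featureSet))
  for (k in seq_along(featureSet)) {
    for (m in dataMatList) {
      if (featureSet[k] %in% rownames(m)) {
        out[k, ] <- m[featureSet[k], sampleNames]
        found[k] <- TRUE
        break   # stop at the first matrix that contains the feature
      }
    }
  }
  if (excludeMissingFeatures) out[found, , drop = FALSE] else out
}

samples <- c("s1", "s2", "s3")
mats <- list(
  exp = matrix(1:6, nrow = 2, dimnames = list(c("expSLFN11", "expTP53"), samples)),
  mut = matrix(c(0, 1, 0), nrow = 1, dimnames = list("mutSLX4", samples))
)
wanted <- c("expSLFN11", "mutSLX4", "copXYZ")   # "copXYZ" is deliberately absent
featureData <- collectFeatures(wanted, mats)
featureDataAll <- collectFeatures(wanted, mats, excludeMissingFeatures = FALSE)
```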
Returns a vector of median sensitive cell line activity (-logGI50) values for a set of compounds.
getMedSenLineActivity(idSet, senLineActZThreshold = 0.5, onlyCellMinerExps = TRUE, dataSource = "NCI60")
idSet |
a character vector specifying identifier(s) for compound(s) of interest. |
senLineActZThreshold |
the minimum activity z-score for a sensitive cell line (default=0.5). |
onlyCellMinerExps |
a logical value indicating whether to base results strictly on experimental data included in CellMiner (default=TRUE). |
dataSource |
character string indicating data source (default="NCI60"). Currently only "NCI60" is supported. |
a numeric vector of median sensitive cell line activity (-logGI50) values indexed by the identifiers in idSet.
idSet <- c("609699", "740")
getMedSenLineActivity(idSet)
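One way to read "median sensitive cell line activity" is: z-score a compound's activity profile, keep the cell lines at or above the threshold, and take the median of their raw activities. The sketch below follows that reading on made-up numbers. The helper name and the use of `>=` for the threshold are assumptions, not confirmed by the text.

```r
# Hypothetical helper: median activity over cell lines whose z-score meets the threshold.
medSensitiveActivity <- function(act, zThreshold = 0.5) {
  z <- (act - mean(act, na.rm = TRUE)) / sd(act, na.rm = TRUE)
  median(act[which(z >= zThreshold)])
}

act <- c(4, 4, 4, 4, 6, 8)               # made-up -logGI50 values for six cell lines
medSen <- medSensitiveActivity(act)      # only 6 and 8 pass the threshold, median 7
```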
Returns a table indicating, for each compound in a specified set, the least significant correlation and associated p-value between its replicate experiments.
getMinDrugActivityRepeatCor(nscSet)
nscSet |
a character vector specifying NSC identifier(s) for compound(s) of interest. |
a dataframe containing the following columns: NSC, cor, pval
nscSet <- c("123528", "339316")
repExpCorTab <- getMinDrugActivityRepeatCor(nscSet)
Get MOA string
getMoaStr(nscStr)
nscStr |
an NSC string |
LINK TO MOAs?
a comma-delimited string with MOA
getMoaStr("94600")
getMoaStr(c("94600", "609699"))
Get a named list mapping MOA classes to associated compound sets.
getMoaToCompounds()
a named list mapping MOA classes to associated compound sets (each represented as a character vector).
moaToCompounds <- getMoaToCompounds()
Returns a list of molecular data type matrices, with rownames in each matrix prefixed with a data type abbreviation.
getMolDataMatrices(molDataMats = NULL)
molDataMats |
A named list of molecular data type matrices with feature data specified along the rows, and feature names indicated in the row names. |
a list containing molecular data type matrices, with rownames in each matrix prefixed with a data type abbreviation, e.g., 'exp' for mRNA expression, etc. The matrix-specific data type abbreviations are derived from the names of molDataMats.
molDataMats <- getMolDataMatrices()
Get the molecular data type prefixes for a set of features.
getMolDataType(features, prefixLen = 3)
features |
A vector of features. |
prefixLen |
The length of the molecular data type prefix. |
A character vector of molecular data type prefixes.
getMolDataType(c("expTP53", "copMDM2", "mutCHEK2", "mutBRAF"))
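Since feature names carry a fixed-length data type prefix, extracting it is a substring operation. A base-R sketch with a hypothetical helper name (not the package function itself):

```r
# Hypothetical helper: take the first prefixLen characters of each feature name.
molDataTypePrefix <- function(features, prefixLen = 3) {
  substr(features, 1, prefixLen)
}

prefixes <- molDataTypePrefix(c("expTP53", "copMDM2", "mutCHEK2", "mutBRAF"))
# "exp" "cop" "mut" "mut"
```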
Returns a vector indicating the number of drug activity repeat experiments with available data for each member of a set of compounds.
getNumDrugActivityRepeats(nscSet, onlyCellMinerExps = TRUE)
nscSet |
a character vector specifying NSC identifier(s) for compound(s) of interest. |
onlyCellMinerExps |
a logical value indicating whether to return only the number of experiments with data included in CellMiner (default=TRUE). |
a numeric vector, indexed by nscSet, indicating the number of drug activity repeat experiments for each one of its compounds.
nscSet <- c("1", "17", "89", "609699")
getNumDrugActivityRepeats(nscSet)
Returns a vector indicating the number of NCI-60 cell lines with missing activity data for a set of compounds.
getNumMissingLines(nscSet)
nscSet |
a character vector specifying NSC identifier(s) for compound(s) of interest. |
a numeric vector indicating the number of NCI-60 cell lines with missing activity data, indexed by the identifiers in nscSet.
nscSet <- c("1", "17", "89", "609699")
getNumMissingLines(nscSet)
Returns an eSet object with drug repeat activity experiment data.
getRepeatAct(object, ...)
object |
Object for which drug repeat activity experiment data is to be returned. |
... |
Other possible parameters. |
An eSet object with drug repeat activity experiment data.
Returns an eSet object with drug repeat activity experiment data.
## S4 method for signature 'DrugData'
getRepeatAct(object)
object |
DrugData object for which drug repeat activity experiment data is to be returned. |
An eSet object with drug repeat activity experiment data.
Computes the relative standard deviation values with respect to the columns of a matrix or data.frame.
getRsd(dat, onlyReturnMedian = TRUE)
dat |
a matrix or data.frame with numeric values. |
onlyReturnMedian |
a logical value indicating whether only the median column RSD value should be returned (vs. all RSD values). |
median RSD value over the data set columns or all RSD values, depending on value of onlyReturnMedian (default=TRUE).
A <- matrix(rnorm(10*60), nrow=10)
getRsd(A)
getRsd(A, onlyReturnMedian=FALSE)
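The relative standard deviation is conventionally the standard deviation divided by the absolute mean. The sketch below computes it per column and reports it as a percentage. Both the percentage scaling and the helper name `columnRsd` are assumptions, since the text does not state the exact formula used by getRsd.

```r
# Hypothetical helper: per-column RSD as 100 * sd / |mean|.
columnRsd <- function(dat, onlyReturnMedian = TRUE) {
  dat <- as.matrix(dat)
  rsd <- 100 * apply(dat, 2, sd, na.rm = TRUE) / abs(colMeans(dat, na.rm = TRUE))
  if (onlyReturnMedian) median(rsd, na.rm = TRUE) else rsd
}

dat <- cbind(a = c(9, 10, 11), b = c(90, 100, 110), c = c(1, 2, 3))
allRsd <- columnRsd(dat, onlyReturnMedian = FALSE)   # a = 10, b = 10, c = 50
medRsd <- columnRsd(dat)                             # 10
```

Columns a and b have the same RSD despite a tenfold difference in scale, which is the property that makes RSD useful for comparing variability across measurements with different magnitudes.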
Returns a data frame with sample information.
getSampleData(object, ...)
object |
Object for which sample data is to be returned. |
... |
Other possible parameters. |
A data frame with sample information.
Returns a data frame with sample information.
## S4 method for signature 'DrugData'
getSampleData(object)
object |
DrugData object for which sample data is to be returned. |
A data frame with sample information.
Returns a data frame with sample information.
## S4 method for signature 'MolData'
getSampleData(object)
object |
MolData object for which sample data is to be returned. |
A data frame with sample information.
Get the SMILES strings for a set of NSC identifiers.
getSmiles(nscSet)
nscSet |
A character vector of NSC strings |
A named character vector indicating the SMILES string for each NSC in nscSet (or NA if no structural information is available).
nscSet <- c("609699", "94600")
getSmiles(nscSet)
Check if NSC has Mechanism of Action (MOA) Annotation
hasMoa(nsc)
nsc |
a string, an NSC identifier |
a boolean indicating whether the NSC has an MOA
hasMoa("754365")
Returns a DrugData object.
## S4 method for signature 'DrugData'
initialize(.Object, act, repeatAct, sampleData)
.Object |
An object: see "new()" documentation in "methods" package. |
act |
An eSet object containing drug activity data across a set of biological samples. |
repeatAct |
An eSet object containing repeat drug activity experiment data with respect to the same samples associated with act. |
sampleData |
A MIAxE object capturing sample and other data set information. |
A DrugData object.
Seems to be required for definition of a constructor.
Returns a MolData object.
## S4 method for signature 'MolData'
initialize(.Object, eSetList, sampleData)
.Object |
An object: see "new()" documentation in "methods" package. |
eSetList |
A list of eSet objects for a common set of samples. |
sampleData |
A MIAxE object capturing sample and other data set information. |
A MolData object.
Check if an NSC ID is public
isPublic(nscs)
nscs |
a vector of NSC string IDs |
a vector of boolean values indicating whether each NSC is public
isPublic("-1")
isPublic(c("-1", "609699"))
Returns data to plot CellMiner plots
loadCellminerPlotInfo(returnDf = FALSE)
returnDf |
a boolean indicating whether a data.frame with all information should be returned (default: FALSE) |
a vector of colors as strings or a data.frame with dataType, label, xMin, xMax
loadCellminerPlotInfo()
Returns a 60-element color set that matches the color set used on http://discover.nci.nih.gov/
loadNciColorSet(returnDf = FALSE)
returnDf |
a boolean indicating whether a data.frame with tissue names and abbreviations should be returned (default: FALSE) |
a vector of colors as strings or a data.frame with tissues, tissue abbreviations, cell line abbreviations and colors
loadNciColorSet()
Returns a MolData object.
MolData(eSetList, sampleData, ...)
eSetList |
A list of eSet objects for a common set of samples. |
sampleData |
A MIAxE object capturing sample and other data set information. |
... |
Other possible parameters. |
A MolData object.
An S4 class to represent molecular data recorded for a set of biological samples.
... |
Other possible parameters. |
eSetList
A list of eSet objects for a common set of samples.
sampleData
A MIAxE object capturing sample and other data set information.
Returns a MolData object.
## S4 method for signature 'list,MIAxE'
MolData(eSetList, sampleData, ...)
eSetList |
A list of eSet objects for a common set of samples. |
sampleData |
A MIAxE object capturing sample and other data set information. |
... |
Other possible parameters. |
A MolData object.
Compare an input pattern against a set of patterns, excluding the predictive effect of a fixed pattern or set of patterns.
parCorPatternComparison(x, Y, Z, updateProgress = NULL)
x |
An N element input pattern specified as either a vector or a 1 x N matrix or data frame. |
Y |
An N element pattern specified as a vector for comparison with the input pattern x or a k x N matrix with k patterns for comparison with the input pattern x specified along the rows, with rownames set appropriately. |
Z |
An N element pattern specified as a vector or a k x N matrix of patterns specified along the rows. These are the patterns whose effect (with respect to a linear model) is to be excluded when comparing x with Y or each row entry of Y. Note that for the partial correlation to be valid, the pattern(s) in Z should not overlap with those in x or Y. |
updateProgress |
An optional function to be invoked with each computed partial correlation to indicate progress. |
A data frame with pattern comparison results (ordered by PARCOR): NAME: Name of entry in Y being compared. PARCOR: Partial correlation between x and the entry in Y with respect to Z. PVAL: p-value.
x <- exprs(getAct(rcellminerData::drugData))["609699", ]
Y <- rcellminer::getAllFeatureData(rcellminerData::molData)[["exp"]][1:100, ]
Z <- rcellminer::getAllFeatureData(rcellminerData::molData)[["exp"]][c("SLFN11", "JAG1"), ]
results <- parCorPatternComparison(x, Y, Z)
Y <- rcellminer::getAllFeatureData(rcellminerData::molData)[["exp"]][1, , drop=TRUE]
Z <- rcellminer::getAllFeatureData(rcellminerData::molData)[["exp"]]["SLFN11", , drop=TRUE]
results <- parCorPatternComparison(x, Y, Z)
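Excluding the effect of Z "with respect to a linear model" corresponds to the standard residual-based definition of partial correlation: regress both patterns on Z and correlate the residuals. The sketch below illustrates that definition on simulated data. The helper `partialCor` is hypothetical, takes the adjusting patterns as columns rather than rows, and omits the p-value that the package reports.

```r
# Hypothetical helper: partial correlation of x and y given Z via lm() residuals.
# Z is an N-element vector or an N x k matrix (adjusting patterns in columns).
partialCor <- function(x, y, Z) {
  Z <- as.matrix(Z)
  rx <- resid(lm(x ~ Z))   # part of x not explained by Z
  ry <- resid(lm(y ~ Z))   # part of y not explained by Z
  cor(rx, ry)
}

set.seed(7)
z <- rnorm(60)                    # shared driver across 60 simulated cell lines
x <- z + rnorm(60, sd = 0.5)
y <- z + rnorm(60, sd = 0.5)
plainCor <- cor(x, y)             # inflated by the shared dependence on z
parCor <- partialCor(x, y, z)     # association left after removing z
```

Here x and y are related only through z, so the plain correlation is high while the partial correlation shrinks toward zero, which is the situation the function is designed to expose.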
Compare an input pattern against a set of patterns.
patternComparison(pattern, profileMatrixList, method = "pearson")
pattern |
An N element input pattern specified as either a named vector or a 1 x N matrix or data frame. Names (or column names) must match the column names of each element of profileMatrixList. |
profileMatrixList |
A single matrix (or data frame) or a list of matrices (or data frames). Each matrix (data frame) must be k x N - that is the k patterns for comparison with the input pattern must be specified along the rows, with rownames set appropriately. |
method |
a string specifying the type of correlation, chosen from pearson (default) or spearman. |
A data frame with pattern comparison results. Specifically, if M is the total number of patterns in the profileMatrixList elements, an M x 2 result is returned with the sorted correlations (Pearson's by default) in the first column and the corresponding p-values in the second column. Comparison pattern names are given in the row names.
drugAct <- exprs(getAct(rcellminerData::drugData)) molDataMats <- getMolDataMatrices()[c("exp", "mut")] molDataMats <- lapply(molDataMats, function(X) X[1:10, ]) pcResults <- patternComparison(drugAct["609699", ], molDataMats) pcResults <- patternComparison(drugAct["609699", ], molDataMats, method="spearman") pcResults <- patternComparison(drugAct["609699", ], molDataMats$exp, method="spearman")
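The comparison described above can be approximated in base R by running `cor.test` between the input pattern and each row of a profile matrix, then sorting by correlation. The sketch below uses simulated data; `patternCompareSketch`, the column names COR and PVAL, and the toy cell line labels are assumptions made for illustration, not the package's internals.

```r
# Hypothetical helper (not part of rcellminer): correlate one pattern against
# every row of a matrix and return correlations with p-values, sorted by
# decreasing correlation.
patternCompareSketch <- function(pattern, profileMat, method = "pearson") {
  stopifnot(identical(names(pattern), colnames(profileMat)))
  res <- t(apply(profileMat, 1, function(rowVals) {
    # exact = FALSE avoids tie-related warnings when method = "spearman"
    ct <- cor.test(pattern, rowVals, method = method, exact = FALSE)
    c(COR = unname(ct$estimate), PVAL = ct$p.value)
  }))
  res <- as.data.frame(res)
  res[order(res$COR, decreasing = TRUE), ]
}

set.seed(42)
cellLines <- paste0("line", 1:20)
pattern <- setNames(rnorm(20), cellLines)
profiles <- rbind(
  same     = pattern,    # identical to the input pattern
  opposite = -pattern,   # perfectly anti-correlated
  noise    = rnorm(20)   # unrelated values
)
colnames(profiles) <- cellLines

sketchResults <- patternCompareSketch(pattern, profiles)
```

Passing `method = "spearman"` to the helper switches to rank correlation, mirroring the method argument documented above.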
Produces CellMiner-like plots in R.
plotCellMiner( drugAct, molData, plots, nsc = NULL, gene = NULL, features = NULL, sub = NULL, xLimits = NULL, xLabel = NULL, extraPlot = NULL, verbose = FALSE )
drugAct |
a matrix of drug activity values (cell lines as columns, drug entries as rows) |
molData |
a list of matrices of molecular data (e.g., as returned by getMolDataMatrices()) |
plots |
a character vector denoting the plots to include and their order (e.g. c("mut", "drug", "cop")). Currently supported entries: mutations (mut), drug activities (drug), copy number variations (cop) |
nsc |
a string NSC ID that will be plotted when a "drug" entry appears in the plots vector |
gene |
a string HUGO gene symbol for which the "mut", "cop", or "exp" plots will be produced if in plots vector |
features |
a vector of strings that provide the full IDs for elements to be plotted (e.g. mutCDK4 for CDK4 mutations). This overrides the nsc and gene parameters, and is needed for advanced plots of data with one-to-many relationships (e.g. many entries for a given gene in the exome data), where a gene symbol alone is ambiguous. |
sub |
a vector of strings with sub-titles for each plot |
xLimits |
a two-number vector with the minimum and maximum X-axis values (default: -3,3 for Z-scores, 0,1 for binary entries) |
xLabel |
a string for the default X-axis label |
extraPlot |
a list containing title, label, and values (numeric vector of length 60); only one extra plot can be included |
verbose |
a boolean to show debugging information |
None
Augustin Luna <augustin AT mail.nih.gov>
drugAct <- exprs(getAct(rcellminerData::drugData)) molData <- getMolDataMatrices() plots <- c("mut", "drug", "cop", "xai", "pro") plotCellMiner(drugAct, molData, plots=plots, nsc="94600", gene="CDK4", verbose=FALSE) plots <- c("mut", "xai", "cop", "cop", "cop", "cop") plotCellMiner(drugAct, molData, plots=plots, nsc="94600", gene=c("CDK4", "TP53", "BRAF", "GAPDH"), verbose=FALSE) plotCellMiner(drugAct, molData, plots=NULL, nsc=NULL, features=c("mutCDK4", "xaiCDK4", "exochr1:101704532_G_T", "mdaIS_P53_MUT", "mirhsa-miR-22", "proTP53_26_GBL00064"), verbose=FALSE)
Make a simple 2d plot using two variables with ggplot2
plotCellMiner2D( df, xCol = "x", yCol = "y", xLabel = xCol, yLabel = yCol, title = NULL, colorPalette = NULL, classCol = NULL, tooltipCol = NULL, showLegend = FALSE, showTrendLine = TRUE, showTitle = TRUE, singleColor = "#0000FF", alpha = 1, numberColPrefix = "X", xLimVal = NULL, yLimVal = NULL, pointSize = 3 )
df |
a data.frame with at least two columns |
xCol |
the name of the column in df with the "x" data. See Note |
yCol |
the name of the column in df with the "y" data. See Note |
xLabel |
the x plot label |
yLabel |
the y plot label |
title |
the plot title; if NULL, the correlation will appear (DEFAULT: NULL) |
colorPalette |
a named vector whose names are the classes and whose values are the colors (DEFAULT: NULL) |
classCol |
the name of the column with the classes. The values in this column of df must be a factor (DEFAULT: NULL) |
tooltipCol |
the name of the column used for tooltips when plotted with plotly |
showLegend |
boolean, whether to show the legend (DEFAULT: FALSE) |
showTrendLine |
boolean, whether to show the trendline |
showTitle |
boolean, whether to show the title |
singleColor |
a color to be used for all points when a color palette is not provided (DEFAULT: blue) |
alpha |
value from 0-1, where 0 indicates transparent points (DEFAULT: 1, not transparent) |
numberColPrefix |
a prefix added to column names that start with a number, since such names cause issues with ggplot (DEFAULT: X) |
xLimVal |
a two entry vector (min, max) to set the x-axis |
yLimVal |
a two entry vector (min, max) to set the y-axis |
pointSize |
size of points on plot (DEFAULT: 3) |
a ggplot object
TROUBLESHOOTING NOTES: 1) Avoid ":" in colnames
Uses ggplot aes_string() which uses parse() to turn your text expression into a proper R symbol that can be resolved within the data.frame. Avoid numbers and spaces in
Augustin Luna <augustin AT mail.nih.gov>
## Not run: # Load data nci60DrugActZ <- exprs(getAct(rcellminerData::drugData)) nci60GeneExpZ <- getAllFeatureData(rcellminerData::molData)[["exp"]] # Load colors colorTab <- loadNciColorSet(returnDf=TRUE) tissueColorTab <- unique(colorTab[, c("tissues", "colors")]) # Merge data df <- as.data.frame(t(rbind(nci60DrugActZ["94600",], nci60GeneExpZ["SLFN11",]))) colnames(df) <- c("y", "x") df <- cbind(df, colorTab) # Plot data plotCellMiner2D(df, xCol="x", yCol="y", xLabel="SLFN11", yLabel="94600") plotCellMiner2D(df, xCol="x", yCol="y", showTrendLine = FALSE, showTitle = FALSE) plotCellMiner2D(df, xCol="x", yCol="y", showTrendLine = FALSE, showLegend = FALSE) ## End(Not run)
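The troubleshooting note and the numberColPrefix argument both concern column names that do not parse as R symbols. One way to prepare a data frame before plotting is sketched below in base R; the toy data frame, the choice of "_" as a replacement character, and the variable names are assumptions for illustration and may differ from what plotCellMiner2D does internally.

```r
# Toy data frame whose column names start with a digit or contain ":"
toyDf <- data.frame(a = 1:3, b = 4:6, c = 7:9)
colnames(toyDf) <- c("94600", "chr1:101", "SLFN11")

safeNames <- colnames(toyDf)

# Prefix names that begin with a digit (mirrors the documented default "X")
startsWithDigit <- grepl("^[0-9]", safeNames)
safeNames[startsWithDigit] <- paste0("X", safeNames[startsWithDigit])

# Replace ":" and spaces, which do not parse as part of a plain symbol
safeNames <- gsub("[: ]", "_", safeNames)

colnames(toyDf) <- safeNames
```

After this step the column names can be passed as xCol and yCol without triggering parse errors from special characters.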
Plot NCI-60 drug activity profiles for repeat experiments.
plotDrugActivityRepeats( nscStr, useZScore = FALSE, maxRepNum = 5, pdfFilename = NULL, pdfWidth = 12, pdfHeight = 6 )
nscStr |
a string specifying the NSC identifier for a compound. |
useZScore |
a boolean specifying whether to plot z-transformed data (as opposed to -logGI50 values). |
maxRepNum |
an integer specifying the maximum number of repeat experiments to plot. |
pdfFilename |
name of a PDF output file |
pdfWidth |
width of the PDF (default: 12) |
pdfHeight |
height of the PDF (default: 6) |
NONE
plotDrugActivityRepeats("609699") plotDrugActivityRepeats("609699", useZScore=TRUE, maxRepNum=3)
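The useZScore option switches between raw activity values and z-transformed values. A z-transformation across cell lines can be sketched in base R as follows; the toy activity values are invented for illustration, and the use of the sample standard deviation with missing values ignored is an assumption rather than a statement about the package's internals.

```r
# Toy activity values for one hypothetical compound across six cell lines
toyActivity <- c(line1 = 5.2, line2 = 6.1, line3 = 4.8,
                 line4 = 7.0, line5 = 5.5, line6 = NA)

# Center and scale across cell lines, ignoring missing measurements
zScores <- (toyActivity - mean(toyActivity, na.rm = TRUE)) /
  sd(toyActivity, na.rm = TRUE)
```

The transformed profile has mean 0 and standard deviation 1 over the measured cell lines, which makes repeat experiments on different scales easier to compare visually.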
Produces a barplot of the average values for a set of NSCs with an error bar (one standard deviation)
plotDrugSets( drugAct, drugs, mainLabel = "", pdfFilename = NULL, statistic = "mean" )
drugAct |
a matrix of drug activity values (cell lines as columns, drug entries as rows) |
drugs |
a vector of NSC IDs whose values will be averaged by cell line |
mainLabel |
a main label for the plot |
pdfFilename |
a string file name for a PDF plot, no file output will be produced if this is not provided |
statistic |
a string, either 'mean' or 'median' (Default: mean) |
no values are returned
drugAct <- exprs(getAct(rcellminerData::drugData)) drugs <- rownames(drugAct)[1:8] plotDrugSets(drugAct, drugs, "Test")
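The per-cell-line summary behind such a plot can be sketched in base R: restrict the activity matrix to the chosen drugs, take the mean (or median) down each column, and draw one standard deviation as an error bar. The toy matrix and all names below are hypothetical, and the plotting calls are only one plausible way to draw the bars.

```r
# Toy activity matrix: 4 hypothetical drugs (rows) by 5 hypothetical cell lines (columns)
set.seed(7)
toyAct <- matrix(rnorm(20), nrow = 4,
                 dimnames = list(paste0("drug", 1:4), paste0("line", 1:5)))
toyDrugs <- c("drug1", "drug2", "drug3")

subAct <- toyAct[toyDrugs, , drop = FALSE]
centers <- apply(subAct, 2, mean, na.rm = TRUE)  # use median here for statistic = "median"
spreads <- apply(subAct, 2, sd, na.rm = TRUE)    # one standard deviation per cell line

# Bar plot with error bars drawn as double-headed flat arrows
yRange <- range(c(centers - spreads, centers + spreads, 0))
mids <- barplot(centers, ylim = yRange, main = "Toy drug set", las = 2)
arrows(mids, centers - spreads, mids, centers + spreads,
       angle = 90, code = 3, length = 0.05)
```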
Remove molecular data type prefixes from features.
removeMolDataType(features, prefixLen = 3)
features |
A vector of features. |
prefixLen |
The length of the molecular data type prefix. |
This function is primarily used to remove prefixes from elastic net features.
A named vector of features without molecular data type prefixes.
removeMolDataType(c("expTP53", "copMDM2", "mutCHEK2", "mutBRAF"))
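Feature IDs in these examples combine a fixed-length data type prefix with an identifier (for example, "exp" plus "TP53"). The base R sketch below shows the kind of string operation involved; `stripPrefixSketch` is a hypothetical helper, and naming the result by the original IDs is an assumption based on the documented "named vector" return value.

```r
# Hypothetical helper (not part of rcellminer): drop the first prefixLen characters
stripPrefixSketch <- function(features, prefixLen = 3) {
  stripped <- substring(features, prefixLen + 1)
  names(stripped) <- features  # assumed naming: original IDs become the names
  stripped
}

toyFeatures <- c("expTP53", "copMDM2", "mutCHEK2")
stripped <- stripPrefixSketch(toyFeatures)

# The prefixes themselves can be recovered the same way
prefixes <- substr(toyFeatures, 1, 3)
```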
Restricts a feature matrix to only include features associated with a specified gene set.
restrictFeatureMat(geneSet, featureMat, prefixSet = c("cop", "exp", "mut"))
geneSet |
a character vector of gene names. |
featureMat |
a matrix or data frame with feature vectors along rows and feature names specified in rownames(featureMat). |
prefixSet |
a set of feature name prefixes to be prepended to each element of geneSet to obtain a collection of geneSet-associated features. |
a matrix containing the features in the intersection of rownames(featureMat) and the set of geneSet-derived features (obtained by prepending each element of prefixSet to each gene in geneSet).
X <- matrix(1:25, nrow=5) rownames(X) <- c("expA", "expB", "copC", "mutC", "expD") restrictFeatureMat(geneSet = c("B", "C"), X)
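The restriction described in the return value can be sketched in base R: build every prefix and gene combination, intersect with the row names, and subset. `restrictSketch` is a hypothetical stand-in written for illustration; the row order of the real function's output is not specified above and may differ.

```r
# Hypothetical helper (not part of rcellminer)
restrictSketch <- function(geneSet, featureMat,
                           prefixSet = c("cop", "exp", "mut")) {
  # All prefix + gene combinations, e.g. "copB", "expB", "mutB", ...
  candidates <- as.vector(outer(prefixSet, geneSet, paste0))
  # Keep only candidates that are actually present, in row-name order
  keep <- intersect(rownames(featureMat), candidates)
  featureMat[keep, , drop = FALSE]
}

toyMat <- matrix(1:25, nrow = 5)
rownames(toyMat) <- c("expA", "expB", "copC", "mutC", "expD")
restricted <- restrictSketch(c("B", "C"), toyMat)
```

Using `drop = FALSE` keeps the result a matrix even when only one feature matches.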
Correlation between ith row of x and ith row of y for all i
rowCors(X, Y)
X |
a matrix |
Y |
a matrix |
a list of two vectors: cor (correlation values) and pval (correlation p-values)
Sudhir Varma, NCI-LMP
a <- matrix(runif(100), nrow=10, ncol=10) b <- matrix(runif(100), nrow=10, ncol=10) c <- rowCors(a, b)
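Row-wise Pearson correlation with p-values can be computed without a loop by centering each row and applying the usual t-statistic. The sketch below is a base R illustration under the assumption of complete data (no missing values); `rowCorsSketch` is hypothetical and is not claimed to match the package's implementation.

```r
# Hypothetical helper (not part of rcellminer): correlation between the ith row
# of X and the ith row of Y, with two-sided p-values from the t distribution.
rowCorsSketch <- function(X, Y) {
  stopifnot(all(dim(X) == dim(Y)))
  n <- ncol(X)
  Xc <- X - rowMeans(X)  # center each row
  Yc <- Y - rowMeans(Y)
  r <- rowSums(Xc * Yc) / sqrt(rowSums(Xc^2) * rowSums(Yc^2))
  tStat <- r * sqrt((n - 2) / (1 - r^2))
  pval <- 2 * pt(-abs(tStat), df = n - 2)
  list(cor = r, pval = pval)
}

set.seed(3)
a <- matrix(runif(100), nrow = 10)
b <- matrix(runif(100), nrow = 10)
rc <- rowCorsSketch(a, b)
```

Each entry of `rc$cor` and `rc$pval` should agree with a per-row call to `cor.test`, which is a convenient way to check the vectorized version.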
Search for NSCs
searchForNscs(pattern)
pattern |
a search pattern. This string will be treated as a regular expression with the case ignored. |
Use this function with caution. Not all compounds have names and compounds can have many synonyms not included in CellMiner.
A vector of matching NSCs
searchForNscs("nib$")
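The pattern "nib$" is a regular expression that matches names ending in "nib", with case ignored. The behavior of such a search can be previewed in base R on a small lookup vector; the identifiers `id1` to `id4` below are placeholders rather than real NSC numbers, and the lookup vector stands in for whatever name annotation the package searches.

```r
# Toy lookup: placeholder identifiers mapped to compound names
toyNames <- c(id1 = "Imatinib",
              id2 = "ERLOTINIB",
              id3 = "Topotecan",
              id4 = "nib-prefixed toy name")

# "$" anchors the match to the end of the name; ignore.case makes it case-insensitive
hits <- names(toyNames)[grepl("nib$", toyNames, ignore.case = TRUE)]
```

Here `hits` contains the identifiers for the two names that end in "nib"; the fourth entry is skipped because "nib" appears only at its start.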