Title: | Dry lab for exploring miRNA-mRNA relationships |
---|---|
Description: | Provide tools exploring miRNA-mRNA relationships, including popular miRNA target prediction methods, ensemble methods that integrate individual methods, functions to get data from online resources, functions to validate the results, and functions to conduct enrichment analyses. |
Authors: | Thuc Duy Le, Junpeng Zhang, Mo Chen, Vu Viet Hoang Pham |
Maintainer: | Thuc Duy Le <[email protected]> |
License: | GPL (>=2) |
Version: | 1.37.0 |
Built: | 2024-11-29 08:44:25 UTC |
Source: | https://github.com/bioc/miRLAB |
Provide tools exploring miRNA-mRNA relationships, including popular miRNA target prediction methods using expression data, ensemble methods that integrate individual methods, functions to get data from online resources, functions to validate the results, and functions to conduct enrichment analyses.
Package: | miRLAB |
Type: | Package |
Version: | 0.99 |
Date: | 2015-04-23 |
License: | GPL(>=2) |
Thuc Duy Le, Junpeng Zhang
Maintainer: Thuc Duy Le <[email protected]>
miRLAB: An R based dry lab for exploring miRNA-mRNA relationships
Use the Borda count election method to integrate the rankings from different miRNA target prediction methods
Borda(listCEmatrices)
Borda(listCEmatrices)
listCEmatrices |
a list of matrices that include the correlation coefficients/causal effects/scores resulting from different target prediction methods |
a matrix of ranking scores (averaging all the rankings from different methods). Columns are miRNAs and rows are mRNAs
1. Le, T.D., Zhang, J., Liu, L., and Li, J. (2015) Ensemble Methods for miRNA Target Prediction from Expression Data, Plos ONE.
2. Marbach, D., Costello, J.C., Kuffner, R., Vega, N.M., Prill, R.J., Camacho, D.M., Allison, K.R. and DREAM5 Consortium (2012). Wisdom of crowds for robust gene network inference. Nat. Methods, 9, 796-804.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") ps=Pearson(dataset, cause=1:3, effect=4:18) ida=IDA(dataset, cause=1:3, effect=4:18) borda=Borda(list(ps, ida))
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") ps=Pearson(dataset, cause=1:3, effect=4:18) ida=IDA(dataset, cause=1:3, effect=4:18) borda=Borda(list(ps, ida))
Use the Borda count election method to integrate the rankings from different miRNA target prediction methods, but only topk targets of each miRNA are included in the calculation. The targets outside the topk will be assigned a large and fixed rank, e.g. number of genes in the dataset.
BordaTopk(listCEmatrices, topk)
BordaTopk(listCEmatrices, topk)
listCEmatrices |
a list of matrices that include the correlation/causal effects/scores resulting from a target prediction method |
topk |
number of targets of a miRNA to be included in the calculation (Borda count election) |
a matrix of ranking scores (averaging all the rankings from different methods). Columns are miRNAs and rows are mRNAs
Le, T.D., Zhang, J., Liu, L., and Li, J. (2015) Ensemble Methods for miRNA Target Prediction from Expression Data, Plos ONE.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") ps=Pearson(dataset, cause=1:3, effect=4:18) ida=IDA(dataset, cause=1:3, effect=4:18) borda=BordaTopk(list(ps, ida), topk=10)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") ps=Pearson(dataset, cause=1:3, effect=4:18) ida=IDA(dataset, cause=1:3, effect=4:18) borda=BordaTopk(list(ps, ida), topk=10)
Extract topk predicted targets of a miRNA Rank all the targets of a miRNA and extract the topk targets
bRank(CEmatrix, causeIndex, topk, downreg = TRUE)
bRank(CEmatrix, causeIndex, topk, downreg = TRUE)
CEmatrix |
the matrix of correlation/causal effect/score results with columns are miRNAs and rows are mRNAs |
causeIndex |
the column index of the miRNA that we would like to extract |
topk |
the number of targets being extracted |
downreg |
if TRUE the negative correlation/causal effect/score will be on the top of the ranking. This is to favour the negative regulations. |
a matrix with 3 columns, where the first column contains the miRNA, the second column contains the mRNAs and the last column contains the correlations/causal effects/scores
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") ps=Pearson(dataset, cause=1:3, effect=4:18) miR200aTop10 = bRank(ps, 3, 10, TRUE)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") ps=Pearson(dataset, cause=1:3, effect=4:18) miR200aTop10 = bRank(ps, 3, 10, TRUE)
This function convert the miRNAs in the input file from the "source" miRBase version to the "Target" version. If users do not know the miRBase version of the input file, please set the source version to 0. The function will match the miRNAs in the input file to all miRBase versions to find the most likely miRBase version. Currently, we have versions 16-21.
convert(miRNAListFile, sourceV, targetV)
convert(miRNAListFile, sourceV, targetV)
miRNAListFile |
the input file containing a list of miRNA symbols in csv format |
sourceV |
the miRBase version of the input miRNAs, e.g. 16. If users do not know the version, use 0. |
targetV |
the miRBase version that we want to convert into, e.g. 21. |
A csv file in the working directory containing the converted miRNA symbols.
miRs=system.file("extdata", "ToymiRs.csv", package="miRLAB") convert(miRs, 17, 21)
miRs=system.file("extdata", "ToymiRs.csv", package="miRLAB") convert(miRs, 17, 21)
Calculate the Distance correlation of each pair of miRNA-mRNA,and return a matrix of correlation coefficients with columns are miRNAs and rows are mRNAs.
Dcov(datacsv, cause, effect, targetbinding = NA)
Dcov(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the Distance correlation values. Columns are miRNAs, rows are mRNAs.
Szekely, G., Rizzo, M. and Bakirov, N. (2007) Measuring and testing independence by correlation of distances. Ann. Stat., 35, 2769 - 94.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Dcov(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Dcov(dataset, 1:3, 4:18)
Find the top miRNAs and mRNAs that are differently expression between different conditions, e.g. cancer vs normal
DiffExpAnalysis(miR1, miR2, mR1, mR2, topkmiR, topkmR, p.miR, p.mR)
DiffExpAnalysis(miR1, miR2, mR1, mR2, topkmiR, topkmR, p.miR, p.mR)
miR1 |
the miRNA dataset for condition 1, e.g. cancer |
miR2 |
the miRNA dataset for condition 1, e.g. normal |
mR1 |
the mRNA dataset for condition 1, e.g. cancer |
mR2 |
the mRNA dataset for condition 2, e.g. normal |
topkmiR |
the maximum number of miRNAs that we would like to extract, e.g. top 50 miRNAs. |
topkmR |
the maximum number of mRNAs that we would like to extract, e.g. top 2000 mRNAs. |
p.miR |
cutoff value for adjusted p-values when conducting differentially expressed analysis for miRNAs. |
p.mR |
cutoff value for adjusted p-values when conducting differentially expressed analysis for mRNAs. |
the dataset that includes differentially expressed miRNAs and mRNAs. columns are miRNAs and mRNAs and rows are samples
Smyth, G.K. (2005). Limma: linear models for microarray data. In Bioinformatics and computational biology solutions using R and Bioconductor (pp. 397-420). Springer New York.
Calculate the Elastic-net regression coefficient of each pair of miRNA-mRNA,and return a matrix of correlation coefficients with columns are miRNAs and rows are mRNAs.
Elastic(datacsv, cause, effect, targetbinding = NA)
Elastic(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the Elastic-net regression coefficients. Columns are miRNAs, rows are mRNAs.
1. Le, T.D., Zhang, J., Liu, L., and Li, J. (2015) Ensemble Methods for miRNA Target Prediction from Expression Data, under review.
2. Zou, H., & Hastie, T. (2005). Regularization and variable selection via the elastic net. J. R. Stat. Soc. Series B Stat. Methodol., 67, 301-320.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Elastic(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Elastic(dataset, 1:3, 4:18)
Function for validate the results from all 12 methods.
experiment(allmethods, topk, Expgroundtruth, LFC, downreg)
experiment(allmethods, topk, Expgroundtruth, LFC, downreg)
allmethods |
A list of results (matrix with columns are miRNA and rows are mRNAs). |
topk |
Top k targets of each miRNA that will be extracted for validation |
Expgroundtruth |
The ground truth in .csv file for validation |
LFC |
log fold-change for validating the results using transfection experiments |
downreg |
If set to TRUE the negative effects will have higher ranks than the positives. |
The validation results for all 12 methods
Rank the miRNA-mRNA interactions based on absolute values of the correlations/scores/causal effects, and return the topk interactions.
Extopk(cormat, topk)
Extopk(cormat, topk)
cormat |
the correlation matrix that need to be extracted with columns are miRNAs and rows are mRNAs |
topk |
the number of interactions that need to be extracted. |
topk interactions
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") EMTresults=Pearson(dataset, 1:3, 4:18) top10=Extopk(EMTresults, 10)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") EMTresults=Pearson(dataset, 1:3, 4:18) top10=Extopk(EMTresults, 10)
Filter and compare the validation results from 12 methods Keep the miRNAs that have at least noVal confirmed targets and compare the validation results from all methods.
filterAndCompare(allresults, noVal)
filterAndCompare(allresults, noVal)
allresults |
the results from all methods generated from experiment function. This is a list. |
noVal |
Number of confirmed targets in each method (threshold) to filter. Records (miRNA) with less than this will be removed |
the validation results of all methods
print("result=filterAndCompare(allresults, 2)")
print("result=filterAndCompare(allresults, 2)")
getData from GDC
getData(cancerName)
getData(cancerName)
cancerName |
The name of cancer in string format |
dataset in matrix format
GO BP enrichment analysis for a gene list
GOBPenrichment(Genes, Cutoff)
GOBPenrichment(Genes, Cutoff)
Genes |
a list of gene symbols |
Cutoff |
the significant level, e.g. 0.05 |
a list of GO terms for the genes
Ashburner, M., Ball, C.A., Blake, J.A., Botstein, D., Butler, H., Cherry, J.M., Davis, A.P., Dolinski, K., Dwight, S.S., Eppig, J.T., Harris, M.A., Hill, D.P., Issel-Tarver, L., Kasarskis, A., Lewis, S., Matese, J.C., Richardson, J.E., Ringwald, M., Rubin, G.M. and Sherlock, G. (2000) Gene Ontology: tool for the unification of biology. Nat. Genet., 25, 25-29.
print("result = GOBPenrichment(genelist, 0.05)")
print("result = GOBPenrichment(genelist, 0.05)")
Calculate the Hoeffding correlation coefficient of each pair of miRNA-mRNA,and return a matrix of correlation coefficients with columns are miRNAs and rows are mRNAs.
Hoeffding(datacsv, cause, effect, targetbinding = NA)
Hoeffding(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the Hoeffding correlation coefficients. Columns are miRNAs, rows are mRNAs.
Hoeffding, W. (1948) A non-parametric test of independence. Ann. Math. Stat., 19, 546 - 57.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Hoeffding(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Hoeffding(dataset, 1:3, 4:18)
This function identifies miRNA targets by ICP and PAM50.
ICPPam50(d, nmiR, nmR, fiftymRNAsData)
ICPPam50(d, nmiR, nmR, fiftymRNAsData)
d |
A matrix of expression of miRNAs and mRNAs with columns being miRNA or mRNA names and rows being samples |
nmiR |
Number of miRNAs |
nmR |
Number of mRNAs |
fiftymRNAsData |
A matrix of expression of 50 mRNAs in PAM50 with columns being mRNA names and rows being samples |
The matrix of causal effects of miRNAs and mRNAs with columns being miRNAs and rows being mRNAs
1. Parker, J. S., et al. (2009). "Supervised Risk Predictor of Breast Cancer Based on Intrinsic Subtypes." Journal of Clinical Oncology 27(8): 1160-1167.
Calculate the causal effect of each pair of miRNA-mRNA,and return a matrix of causal effects with columns are miRNAs and rows are mRNAs.
IDA( datacsv, cause, effect, pcmethod = "original", alpha = 0.05, targetbinding = NA )
IDA( datacsv, cause, effect, pcmethod = "original", alpha = 0.05, targetbinding = NA )
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
pcmethod |
choose different versons of the PC algorithm, including "original" (default) "stable", and "stable.fast" |
alpha |
significance level for the conditional independence test, e.g. 0.05. |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the causal effects. Columns are miRNAs, rows are mRNAs.
1. Le, T.D., Liu, L., Tsykin, A., Goodall, G.J., Liu, B., Sun, B.Y. and Li, J. (2013) Inferring microRNA-mRNA causal regulatory relationships from expression data. Bioinformatics, 29, 765-71.
2. Zhang, J., Le, T.D., Liu, L., Liu, B., He, J., Goodall, G.J. and Li, J. (2014) Identifying direct miRNA-mRNA causal regulatory relationships in heterogeneous data. J. Biomed. Inform., 52, 438-47.
3. Maathuis, H.M., Colombo, D., Kalisch, M. and Buhlmann, P. (2010) Predicting causal effects in large-scale systems from observational data. Nat. Methods, 7, 247-249.
4. Maathuis, H.M., Kalisch, M. and Buhlmann, P. (2009) Estimating high-dimensional intervention effects from observational data. Ann. Stat., 37, 3133-3164.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=IDA(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=IDA(dataset, 1:3, 4:18)
This function identifies the top miRNA targets by an ensemble method with ICP-PAM50, Pearson and Lasso.
identifymiRTargetsByEnsemble(d, nmiR, nmR, fiftymRNAsData, top = 1, topk = 500)
identifymiRTargetsByEnsemble(d, nmiR, nmR, fiftymRNAsData, top = 1, topk = 500)
d |
A matrix of expression of miRNAs and mRNAs with columns being miRNA or mRNA names and rows being samples |
nmiR |
Number of miRNAs |
nmR |
Number of mRNAs |
fiftymRNAsData |
A matrix of expression of 50 mRNAs in PAM50 with columns being mRNA names and rows being samples |
top |
1 if getting the top of all miRNAs and 2 if getting the top of each miRNA |
topk |
Number of the top to get |
The top k miRNA targets
1. Parker, J. S., et al. (2009). "Supervised Risk Predictor of Breast Cancer Based on Intrinsic Subtypes." Journal of Clinical Oncology 27(8): 1160-1167.
This function identifies the top miRNA targets by ICP and PAM50.
identifymiRTargetsByICPPam50(d, nmiR, nmR, fiftymRNAsData, top = 1, topk = 500)
identifymiRTargetsByICPPam50(d, nmiR, nmR, fiftymRNAsData, top = 1, topk = 500)
d |
A matrix of expression of miRNAs and mRNAs with columns being miRNA or mRNA names and rows being samples |
nmiR |
Number of miRNAs |
nmR |
Number of mRNAs |
fiftymRNAsData |
A matrix of expression of 50 mRNAs in PAM50 with columns being mRNA names and rows being samples |
top |
1 if getting the top of all miRNAs and 2 if getting the top of each miRNA |
topk |
Number of the top to get |
The top k miRNA targets
1. Parker, J. S., et al. (2009). "Supervised Risk Predictor of Breast Cancer Based on Intrinsic Subtypes." Journal of Clinical Oncology 27(8): 1160-1167.
Remove the genes (rows) that have more than r% of missing data; use the impute package to fill in missing data, and finally normalise the data.
ImputeNormData(dataset, r)
ImputeNormData(dataset, r)
dataset |
The input dataset in csv format. e.g. "EMT.csv" |
r |
The rate threshold to filter the records (genes). Genes with more than r% missing data will be removed. |
The processed dataset.
1. Hastie T, Tibshirani R, Narasimhan B and Chu G. impute: Imputation for microarray data. R package version 1.42.0.
2. Smyth, G.K. (2005). Limma: linear models for microarray data. In Bioinformatics and computational biology solutions using R and Bioconductor (pp. 397-420). Springer New York.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") impdata=ImputeNormData(dataset, 0.1)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") impdata=ImputeNormData(dataset, 0.1)
Functional enrichment analysis KEGG enrichment analysis for a gene list
KEGGenrichment(Genes, Cutoff)
KEGGenrichment(Genes, Cutoff)
Genes |
a list of gene symbols |
Cutoff |
the significant level, e.g. 0.05 |
a list of pathways for the genes
Kanehisa, M. and Goto, S. (2000) KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res., 28, 27-30.
print("result = KEGGenrichment(genelist, 0.05)")
print("result = KEGGenrichment(genelist, 0.05)")
Calculate the Kendall correlation coefficient of each pair of miRNA-mRNA,and return a matrix of correlation coefficients with columns are miRNAs and rows are mRNAs.
Kendall(datacsv, cause, effect, targetbinding = NA)
Kendall(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the Kendall correlation coefficients. Columns are miRNAs, rows are mRNAs.
Kendall, M. (1938) A new measure of rank correlation. Biometrika, 30, 81 - 9.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Kendall(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Kendall(dataset, 1:3, 4:18)
Calculate the Lasso regression coefficient of each pair of miRNA-mRNA, and return a matrix of coefficients with columns are miRNAs and rows are mRNAs.
Lasso(datacsv, cause, effect, targetbinding = NA)
Lasso(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the Lasso regression coefficients. Columns are miRNAs, rows are mRNAs.
1. Le, T.D., Zhang, J., Liu, L., and Li, J. (2015) Ensemble Methods for miRNA Target Prediction from Expression Data, submitted.
2. Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Series B Stat. Methodol., 267-288.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Lasso(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Lasso(dataset, 1:3, 4:18)
Calculate the mutual information of each pair of miRNA-mRNA,and return a matrix of mutual information values with columns are miRNAs and rows are mRNAs.
MI(datacsv, cause, effect, targetbinding = NA)
MI(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the mutual information values. Columns are miRNAs, rows are mRNAs.
Moon, Y.I., Balaji, R., and Lall, U. (1995) Estimation of mutual information using kernel density estimators. Phys. Rev. E, 52, 2318 - 21.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=MI(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=MI(dataset, 1:3, 4:18)
Calculate the Pearson correlation coefficient of each pair of miRNA-mRNA,and return a matrix of correlation coefficients with columns are miRNAs and rows are mRNAs.
Pearson(datacsv, cause, effect, targetbinding = NA)
Pearson(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the Pearson correlation coefficients. Columns are miRNAs, rows are mRNAs.
Pearson, K. (1920) Notes on the history of correlation. Biometrika, 13, 25 - 45.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Pearson(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Pearson(dataset, 1:3, 4:18)
Calculate the Randomized Dependence coefficient of each pair of miRNA-mRNA,and return a matrix of coefficients with columns are miRNAs and rows are mRNAs.
RDC(datacsv, cause, effect, targetbinding = NA)
RDC(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the correlation coefficients. Columns are miRNAs, rows are mRNAs.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=RDC(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=RDC(dataset, 1:3, 4:18)
Read dataset from csv file
Read(dataset)
Read(dataset)
dataset |
The input dataset in csv format |
dataset in matrix format
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") data=Read(dataset)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") data=Read(dataset)
Read the results predicted by external methods (methods that are not in this package and may not be implemented in R). Consequently, we can compare the results predicted by the external methods and results predicted by the methods in the miRLAB package.
ReadExtResult(datacsv, cause, effect, ExtCEcsv)
ReadExtResult(datacsv, cause, effect, ExtCEcsv)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
ExtCEcsv |
score matrix predicted by an external matrix with columns are miRNAs and rows are mRNAs. |
a matrix of scores predicted by an external matrix and ready for further validation and comparison tasks.
print("GenemiR=ReadExtResult(dataset, cause=1:3, effect=4:18, 'genemirresults.csv')")
print("GenemiR=ReadExtResult(dataset, cause=1:3, effect=4:18, 'genemirresults.csv')")
Read the header of the dataset
readHeader(dataset)
readHeader(dataset)
dataset |
the character string of the names of the dataset in csv format, e.g. "ToyEMT.csv" |
the header of the dataset
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") header=readHeader(dataset)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") header=readHeader(dataset)
Calculate the Spearman correlation coefficient of each pair of miRNA-mRNA,and return a matrix of correlation coefficients with columns are miRNAs and rows are mRNAs.
Spearman(datacsv, cause, effect, targetbinding = NA)
Spearman(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the Spearman correlation coefficients. Columns are miRNAs, rows are mRNAs.
Spearman, C. (1904) General intelligence, objectively determined and measured. Am. J. Psychol., 15, 201 - 92.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Spearman(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Spearman(dataset, 1:3, 4:18)
Stardarsise the dataset Stadardise the dataset to have mean=0 and std=1 in each column.
Standardise(dataset)
Standardise(dataset)
dataset |
The input dataset in csv format. e.g. "ToyEMT.csv". The first column is the sample name. |
The standardised dataset.
## Not run: dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") stdata=Standardise(dataset) ## End(Not run)
## Not run: dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") stdata=Standardise(dataset) ## End(Not run)
Given the predicted target of all miRNA, the function returns a list of targets of each miRNA that are confirmed based on the experimentally validated interactions or curated transfection data. Users need to download the file logFC.imputed.rda from nugget.unisa.edu.au/Thuc/miRLAB/ and place it in the working directory (this file is obtained from the TargetScoreData package)
ValidateAll(CEmatrix, topk, groundtruth, LFC, downreg = TRUE)
ValidateAll(CEmatrix, topk, groundtruth, LFC, downreg = TRUE)
CEmatrix |
the matrix of correlation/causal effects/scores with columns are miRNAs and rows are mRNAs |
topk |
the number of targets of each miRNA that are being validated. |
groundtruth |
the csv file containing the ground truth. |
LFC |
the log fold change threshold for the transfection data. The targets that have the absolute value of log fold change greater than the LFC will be regarded as the confirmed targets. |
downreg |
if TRUE the negative correlation/causal effect/score values will be ranked on the top of the ranking. This is to favour the down regulations. |
a list of matrices that contains the confirmed interactions by both provided ground truth and built-in transfection data.
print("ps=Pearson(dataset, cause=1:3, effect=4:18)") print("results=ValidateAll(ps, 10, groundtruth, LFC=0.5, downreg=TRUE)")
print("ps=Pearson(dataset, cause=1:3, effect=4:18)") print("results=ValidateAll(ps, 10, groundtruth, LFC=0.5, downreg=TRUE)")
Given the predicted target of a miRNA, the function returns a list of targets that are experimentally confirmed based on the provided ground truth. Users can provide their own ground truth or use the built-in ground truth which is the union of Tarbase, miRTarbase, miRecords, and miRWalk.
Validation(topkList, datacsv)
Validation(topkList, datacsv)
topkList |
a matrix with 3 columns. The first column is the miRNA name, the second contains the target mRNAs, and the third contains the correlation values/ causal effects/ scores |
datacsv |
the ground truth for the validation. The ground truth is a matrix with 2 columns, where the first column is the miRNA and the second is the mRNA. |
a matrix in the same format of the input matrix put only contains the confirmed interactions.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") ps=Pearson(dataset, cause=1:3, effect=4:18) miR200aTop10=bRank(ps, 3, 10, TRUE) groundtruth=system.file("extdata", "Toygroundtruth.csv", package="miRLAB") miR200aTop10Confirmed = Validation(miR200aTop10, groundtruth)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") ps=Pearson(dataset, cause=1:3, effect=4:18) miR200aTop10=bRank(ps, 3, 10, TRUE) groundtruth=system.file("extdata", "Toygroundtruth.csv", package="miRLAB") miR200aTop10Confirmed = Validation(miR200aTop10, groundtruth)
Given the predicted target of a miRNA, the function returns a list of targets that are confirmed based on the curated transfection data. Users need to download the file logFC.imputed.rda from nugget.unisa.edu.au/Thuc/miRLAB/ and place it in the working directory (this file is obtained from the TargetScoreData package)
ValidationT(topkList, LFC)
ValidationT(topkList, LFC)
topkList |
a matrix with 3 columns. The first column is the miRNA name, the second contains the target mRNAs, and the third contains the correlation values/ causal effects/ scores |
LFC |
the log fold change threshold. The targets that have the absolute value of log fold change greater than the LFC will be regarded as the confirmed targets. |
a matrix in the same format of the input matrix put only contains the confirmed interactions.
1. Le, T.D., Zhang, J., Liu, L., and Li, J. (2015) Ensemble Methods for miRNA Target Prediction from Expression Data, under review.
2. Li Y, Goldenberg A, Wong K and Zhang Z (2014). A probabilistic approach to explore human microRNA targetome using microRNA-overexpression data and sequence information. Bioinformatics, 30(5), pp. 621-628. http://dx.doi.org/10.1093/bioinformatics/btt599.
print("ps=Pearson(dataset, cause=1:35, effect=36:1189)") print("miR200aTop100=bRank(ps, 11, 100, TRUE)") print("miR200aTop100Confirmed = ValidationT(miR200aTop100, 1.0)")
print("ps=Pearson(dataset, cause=1:35, effect=36:1189)") print("miR200aTop100=bRank(ps, 11, 100, TRUE)") print("miR200aTop100Confirmed = ValidationT(miR200aTop100, 1.0)")
Calculate the Z-score value of each pair of miRNA-mRNA, and return a matrix of values with columns are miRNAs and rows are mRNAs.
Zscore(datacsv, cause, effect, targetbinding = NA)
Zscore(datacsv, cause, effect, targetbinding = NA)
datacsv |
the input dataset in csv format |
cause |
the column range that specifies the causes (miRNAs), e.g. 1:35 |
effect |
the column range that specifies the effects (mRNAs), e.g. 36:2000 |
targetbinding |
the putative target, e.g. "TargetScan.csv". If targetbinding is not specified, only expression data is used. If targetbinding is specified, the prediction results using expression data with be intersected with the interactions in the target binding file. |
A matrix that includes the Z-score values. Columns are miRNAs, rows are mRNAs.
Prill, R.J., Marbach, D., Saez-Rodriguez, J., Sorger, P.K., Alexopoulos, L.G., Xue, X., Clarke, N.D., Altan-Bonnet, G. and Stolovitzky, G. (2010) Towards a rigorous assessment of systems biology models: the DREAM3 challenges. PLoS One, 5, e9202.
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Zscore(dataset, 1:3, 4:18)
dataset=system.file("extdata", "ToyEMT.csv", package="miRLAB") results=Zscore(dataset, 1:3, 4:18)