Title: | GDCRNATools: an R/Bioconductor package for integrative analysis of lncRNA, mRNA, and miRNA data in GDC |
---|---|
Description: | This is an easy-to-use package for downloading, organizing, and integrative analyzing RNA expression data in GDC with an emphasis on deciphering the lncRNA-mRNA related ceRNA regulatory network in cancer. Three databases of lncRNA-miRNA interactions including spongeScan, starBase, and miRcode, as well as three databases of mRNA-miRNA interactions including miRTarBase, starBase, and miRcode are incorporated into the package for ceRNAs network construction. limma, edgeR, and DESeq2 can be used to identify differentially expressed genes/miRNAs. Functional enrichment analyses including GO, KEGG, and DO can be performed based on the clusterProfiler and DO packages. Both univariate CoxPH and KM survival analyses of multiple genes can be implemented in the package. Besides some routine visualization functions such as volcano plot, bar plot, and KM plot, a few simply shiny apps are developed to facilitate visualization of results on a local webpage. |
Authors: | Ruidong Li, Han Qu, Shibo Wang, Julong Wei, Le Zhang, Renyuan Ma, Jianming Lu, Jianguo Zhu, Wei-De Zhong, Zhenyu Jia |
Maintainer: | Ruidong Li <[email protected]>, Han Qu <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.27.0 |
Built: | 2024-11-07 06:52:52 UTC |
Source: | https://github.com/bioc/GDCRNATools |
This is an easy-to-use package for downloading, organizing, and integrative analyzing RNA expression data in GDC with an emphasis on deciphering the lncRNA-mRNA related ceRNA regulatory network in cancer.
gdcDEAnalysis
for downstream analysisOutput of gdcDEAnalysis
for downstream analysis
gdcEnrichAnalysis
for visualizationOutput of gdcEnrichAnalysis
for visualization
A bar plot showing the number of down-regulated and up-regulated DE genes/miRNAs of different biotypes
gdcBarPlot(deg, angle = 0, data.type)
gdcBarPlot(deg, angle = 0, data.type)
deg |
a dataframe generated from |
angle |
a numeric value specifying the angle of text on x-axis.
Default is |
data.type |
one of |
A bar plot
Ruidong Li and Han Qu
genes <- c('ENSG00000231806','ENSG00000261211','ENSG00000260920', 'ENSG00000228594','ENSG00000125170','ENSG00000179909', 'ENSG00000280012','ENSG00000134612','ENSG00000213071') symbol <- c('PCAT7','AL031123.2','AL031985.3', 'FNDC10','DOK4','ZNF154', 'RPL23AP61','FOLH1B','LPAL2') group <- rep(c('long_non_coding','protein_coding','pseudogene'), each=3) logFC <- c(2.8,2.3,-1.1,1.9,-1.2,-1.6,1.5,2.1,-1.1) FDR <- rep(c(0.1,0.00001,0.0002), each=3) deg <- data.frame(symbol, group, logFC, FDR) rownames(deg) <- genes gdcBarPlot(deg, angle=45, data.type='RNAseq')
genes <- c('ENSG00000231806','ENSG00000261211','ENSG00000260920', 'ENSG00000228594','ENSG00000125170','ENSG00000179909', 'ENSG00000280012','ENSG00000134612','ENSG00000213071') symbol <- c('PCAT7','AL031123.2','AL031985.3', 'FNDC10','DOK4','ZNF154', 'RPL23AP61','FOLH1B','LPAL2') group <- rep(c('long_non_coding','protein_coding','pseudogene'), each=3) logFC <- c(2.8,2.3,-1.1,1.9,-1.2,-1.6,1.5,2.1,-1.1) FDR <- rep(c(0.1,0.00001,0.0002), each=3) deg <- data.frame(symbol, group, logFC, FDR) rownames(deg) <- genes gdcBarPlot(deg, angle=45, data.type='RNAseq')
Identify ceRNAs by (1) number of shared miRNAs between lncRNA and mRNA; (2) expression correlation of lncRNA and mRNA; (3) regulation similarity of shared miRNAs on lncRNA and mRNA; (4) sensitivity correlation
gdcCEAnalysis(lnc, pc, deMIR = NULL, lnc.targets = "starBase", pc.targets = "starBase", rna.expr, mir.expr)
gdcCEAnalysis(lnc, pc, deMIR = NULL, lnc.targets = "starBase", pc.targets = "starBase", rna.expr, mir.expr)
lnc |
a vector of Ensembl long non-coding gene ids |
pc |
a vector of Ensembl protein coding gene ids |
deMIR |
a vector of differentially expressed miRNAs.
Default is |
lnc.targets |
a character string specifying the database
of miRNA-lncRNA interactions.
Should be one of |
pc.targets |
a character string specifying the database of
miRNA-lncRNA interactions.
Should be one of |
rna.expr |
|
mir.expr |
|
A dataframe containing ceRNA pairs, expression correlation between lncRNA and mRNA, the number and hypergeometric significance of shared miRNAs, regulation similarity score, and the mean sensitity correlation (the difference between Pearson correlation and partial correlation) of multiple lncRNA-miRNA-mRNA triplets, etc.
Ruidong Li and Han Qu
Paci P, Colombo T, Farina L. Computational analysis identifies a sponge interaction network between long non-coding RNAs and messenger RNAs in human breast cancer. BMC systems biology. 2014 Jul 17;8(1):83.
####### ceRNA network analysis ####### deLNC <- c('ENSG00000260920','ENSG00000242125','ENSG00000261211') dePC <- c('ENSG00000043355','ENSG00000109586','ENSG00000144355') genes <- c(deLNC, dePC) samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') rnaExpr <- data.frame(matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6), stringsAsFactors=FALSE) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples mirExpr <- data.frame(matrix(c(7.7,7.4,7.9,8.9,8.6,9.5, 5.1,4.4,5.5,8.5,4.4,3.5, 4.9,5.5,6.9,6.1,5.5,4.1, 12.4,13.5,15.1,15.4,13.0,12.8, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,2.7,6.2,1.5,4.4,4.2),6,6), stringsAsFactors=FALSE) colnames(mirExpr) <- samples rownames(mirExpr) <- c('hsa-miR-340-5p','hsa-miR-181b-5p', 'hsa-miR-181a-5p', 'hsa-miR-181c-5p', 'hsa-miR-199b-5p','hsa-miR-182-5p') ceOutput <- gdcCEAnalysis(lnc = deLNC, pc = dePC, lnc.targets = 'starBase', pc.targets = 'starBase', rna.expr = rnaExpr, mir.expr = mirExpr)
####### ceRNA network analysis ####### deLNC <- c('ENSG00000260920','ENSG00000242125','ENSG00000261211') dePC <- c('ENSG00000043355','ENSG00000109586','ENSG00000144355') genes <- c(deLNC, dePC) samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') rnaExpr <- data.frame(matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6), stringsAsFactors=FALSE) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples mirExpr <- data.frame(matrix(c(7.7,7.4,7.9,8.9,8.6,9.5, 5.1,4.4,5.5,8.5,4.4,3.5, 4.9,5.5,6.9,6.1,5.5,4.1, 12.4,13.5,15.1,15.4,13.0,12.8, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,2.7,6.2,1.5,4.4,4.2),6,6), stringsAsFactors=FALSE) colnames(mirExpr) <- samples rownames(mirExpr) <- c('hsa-miR-340-5p','hsa-miR-181b-5p', 'hsa-miR-181a-5p', 'hsa-miR-181c-5p', 'hsa-miR-199b-5p','hsa-miR-182-5p') ceOutput <- gdcCEAnalysis(lnc = deLNC, pc = dePC, lnc.targets = 'starBase', pc.targets = 'starBase', rna.expr = rnaExpr, mir.expr = mirExpr)
Download clinical data in GDC either by providing the manifest file or specifying the project id and data type
gdcClinicalDownload(manifest = NULL, project.id, directory = "Clinical", write.manifest = FALSE, method = "gdc-client")
gdcClinicalDownload(manifest = NULL, project.id, directory = "Clinical", write.manifest = FALSE, method = "gdc-client")
manifest |
menifest file that is downloaded from the GDC cart.
If provided, files whose UUIDs are in the manifest file will be
downloaded via gdc-client, otherwise, |
project.id |
project id in GDC |
directory |
the folder to save downloaded files.
Default is |
write.manifest |
logical, whether to write out the manifest file |
method |
method that is used to download data. Either
|
downloaded files in the specified directory
Ruidong Li and Han Qu
####### Download Clinical data by manifest file ####### manifest <- 'Clinical.manifest.txt' ## Not run: gdcClinicalDownload(manifest = manifest, directory = 'Clinical') ## End(Not run) ####### Download Clinical data by project id ####### project <- 'TCGA-PRAD' ## Not run: gdcClinicalDownload(project.id = project, write.manifest = TRUE, directory = 'Clinical') ## End(Not run)
####### Download Clinical data by manifest file ####### manifest <- 'Clinical.manifest.txt' ## Not run: gdcClinicalDownload(manifest = manifest, directory = 'Clinical') ## End(Not run) ####### Download Clinical data by project id ####### project <- 'TCGA-PRAD' ## Not run: gdcClinicalDownload(project.id = project, write.manifest = TRUE, directory = 'Clinical') ## End(Not run)
Merge clinical data in .xml
files
that are downloaded from GDC to a dataframe
gdcClinicalMerge(path, key.info = TRUE, organized = FALSE)
gdcClinicalMerge(path, key.info = TRUE, organized = FALSE)
path |
path to downloaded files for merging |
key.info |
logical, whether to return the key clinical
information only. If |
organized |
logical, whether the clinical data have already
been organized into a single folder (eg., data downloaded by the
'GenomicDataCommons' method are already organized).
Default is |
A dataframe of clinical data with rows are patients and columns are clinical traits
Ruidong Li and Han Qu
####### Merge clinical data ####### path <- 'Clinical/' ## Not run: clinicalDa <- gdcClinicalMerge(path=path, key.info=TRUE)
####### Merge clinical data ####### path <- 'Clinical/' ## Not run: clinicalDa <- gdcClinicalMerge(path=path, key.info=TRUE)
Scatter plot showing the expression correlation between two genes/miRNAs
gdcCorPlot(gene1, gene2, rna.expr, metadata)
gdcCorPlot(gene1, gene2, rna.expr, metadata)
gene1 |
an Ensembl gene id or miRBase v21 mature miRNA id |
gene2 |
an Ensembl gene id or miRBase v21 mature miRNA id |
rna.expr |
|
metadata |
metadata parsed from |
A scatter plot with line of best fit
Ruidong Li and Han Qu
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-11', 'TCGA-2F-A9KT-11', 'TCGA-2F-A9KW-11') metaMatrix <- data.frame(sample_type=rep(c('PrimaryTumor', 'SolidTissueNormal'),each=3), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples gdcCorPlot(gene1 = 'ENSG00000000938', gene2 = 'ENSG00000001084', rna.expr = rnaExpr, metadata = metaMatrix)
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-11', 'TCGA-2F-A9KT-11', 'TCGA-2F-A9KW-11') metaMatrix <- data.frame(sample_type=rep(c('PrimaryTumor', 'SolidTissueNormal'),each=3), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples gdcCorPlot(gene1 = 'ENSG00000000938', gene2 = 'ENSG00000001084', rna.expr = rnaExpr, metadata = metaMatrix)
Performs differential gene expression analysis by limma, edgeR, and DESeq2
gdcDEAnalysis(counts, group, comparison, method = "limma", n.cores = NULL, filter = TRUE)
gdcDEAnalysis(counts, group, comparison, method = "limma", n.cores = NULL, filter = TRUE)
counts |
a dataframe or numeric matrix of raw counts data generated
from |
group |
a vector giving the group that each sample belongs to |
comparison |
a character string specifying the two groups
being compared. |
method |
one of |
n.cores |
a numeric value of cores to be used for
|
filter |
logical, whether to filter out low expression genes.
If |
A dataframe containing Ensembl gene ids/miRBase v21 mature miRNA ids, gene symbols, biotypes, fold change on the log2 scale, p value, and FDR etc. of all genes/miRNAs of analysis.
It may takes long time for method='DESeq2'
with a
single core. Please use multiple cores if possible
Ruidong Li and Han Qu
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package
for differential expression analysis of digital gene expression data.
Bioinformatics. 2010 Jan 1;26(1):139-40.
Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK.
limma powers differential expression analyses for RNA-sequencing and
microarray studies. Nucleic acids research. 2015 Jan 20;
43(7):e47-e47.
Love MI, Huber W, Anders S. Moderated estimation of fold change and
dispersion for RNA-seq data with DESeq2. Genome biology. 2014 Dec 5;
15(12):550.
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-11', 'TCGA-2F-A9KT-11', 'TCGA-2F-A9KW-11') metaMatrix <- data.frame(sample_type=rep(c('PrimaryTumor', 'SolidTissueNormal'),each=3), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaMatrix <- matrix(c(6092,11652,5426,4383,3334,2656, 8436,2547,7943,3741,6302,13976, 1506,6467,5324,3651,1566,2780, 834,4623,10275,5639,6183,4548, 24702,43,1987,269,3322,2410, 2815,2089,3804,230,883,5415), 6,6) rownames(rnaMatrix) <- genes colnames(rnaMatrix) <- samples DEGAll <- gdcDEAnalysis(counts = rnaMatrix, group = metaMatrix$sample_type, comparison = 'PrimaryTumor-SolidTissueNormal', method = 'limma')
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-11', 'TCGA-2F-A9KT-11', 'TCGA-2F-A9KW-11') metaMatrix <- data.frame(sample_type=rep(c('PrimaryTumor', 'SolidTissueNormal'),each=3), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaMatrix <- matrix(c(6092,11652,5426,4383,3334,2656, 8436,2547,7943,3741,6302,13976, 1506,6467,5324,3651,1566,2780, 834,4623,10275,5639,6183,4548, 24702,43,1987,269,3322,2410, 2815,2089,3804,230,883,5415), 6,6) rownames(rnaMatrix) <- genes colnames(rnaMatrix) <- samples DEGAll <- gdcDEAnalysis(counts = rnaMatrix, group = metaMatrix$sample_type, comparison = 'PrimaryTumor-SolidTissueNormal', method = 'limma')
Report genes/miRNAs that are differentially expressed satisfying a given threshold
gdcDEReport(deg, gene.type = "all", fc = 2, pval = 0.01)
gdcDEReport(deg, gene.type = "all", fc = 2, pval = 0.01)
deg |
A dataframe of DE analysis result from
|
gene.type |
one of |
fc |
a numeric value specifying the threshold of fold change |
pval |
a nuemric value specifying the threshold of p value |
A dataframe or numeric matrix of differentially expressed genes/miRNAs
Ruidong Li and Han Qu
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-11', 'TCGA-2F-A9KT-11', 'TCGA-2F-A9KW-11') metaMatrix <- data.frame(sample_type=rep(c('PrimaryTumor', 'SolidTissueNormal'),each=3), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaMatrix <- matrix(c(6092,11652,5426,4383,3334,2656, 8436,2547,7943,3741,6302,13976, 1506,6467,5324,3651,1566,2780, 834,4623,10275,5639,6183,4548, 24702,43,1987,269,3322,2410, 2815,2089,3804,230,883,5415), 6,6) rownames(rnaMatrix) <- genes colnames(rnaMatrix) <- samples DEGAll <- gdcDEAnalysis(counts = rnaMatrix, group = metaMatrix$sample_type, comparison = 'PrimaryTumor-SolidTissueNormal', method = 'limma') dePC <- gdcDEReport(deg=DEGAll)
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-11', 'TCGA-2F-A9KT-11', 'TCGA-2F-A9KW-11') metaMatrix <- data.frame(sample_type=rep(c('PrimaryTumor', 'SolidTissueNormal'),each=3), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaMatrix <- matrix(c(6092,11652,5426,4383,3334,2656, 8436,2547,7943,3741,6302,13976, 1506,6467,5324,3651,1566,2780, 834,4623,10275,5639,6183,4548, 24702,43,1987,269,3322,2410, 2815,2089,3804,230,883,5415), 6,6) rownames(rnaMatrix) <- genes colnames(rnaMatrix) <- samples DEGAll <- gdcDEAnalysis(counts = rnaMatrix, group = metaMatrix$sample_type, comparison = 'PrimaryTumor-SolidTissueNormal', method = 'limma') dePC <- gdcDEReport(deg=DEGAll)
Performs Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway and Disease Ontology (DO) enrichment analyses by clusterProfiler and DOSE packages
gdcEnrichAnalysis(gene, simplify = TRUE, level = 0)
gdcEnrichAnalysis(gene, simplify = TRUE, level = 0)
gene |
a vector of Ensembl gene id |
simplify |
logical, specifying whether to remove redundant GO terms.
Default |
level |
a numeric value, restrict the GO enrichment result at a
specific GO level. Default is |
A dataframe of enrichment analysis result containing enriched terms, number of overlpped genes, p value of hypergeometric test, fdr, fold of enrichment, Ensembl gene ids, gene symbols, and functional categories, etc.
Ruidong Li and Han Qu
Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for
comparing biological themes among gene clusters.
Omics: a journal of integrative biology. 2012 May 1;16(5):284-7.
Yu G, Wang LG, Yan GR, He QY. DOSE: an R/Bioconductor package for
disease ontology semantic and enrichment analysis. Bioinformatics.
2014 Oct 17;31(4):608-9.
####### GO, KEGG, DO enrichment analysis ####### deg <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') ## Not run: enrichOutput <- gdcEnrichAnalysis(gene=deg, simplify=TRUE)
####### GO, KEGG, DO enrichment analysis ####### deg <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') ## Not run: enrichOutput <- gdcEnrichAnalysis(gene=deg, simplify=TRUE)
Bar plot and bubble plot for GO, KEGG, and DO functional enrichment analysis
gdcEnrichPlot(enrichment, type = "bar", category = "KEGG", num.terms = 10, bar.color = "black")
gdcEnrichPlot(enrichment, type = "bar", category = "KEGG", num.terms = 10, bar.color = "black")
enrichment |
a dataframe generated from
|
type |
type of the plot, should be one of |
category |
which category should be plotted. Possible values are
|
num.terms |
number of terms to be plotted. Default is |
bar.color |
color of the bar plot. Default is |
A bar plot or bubble plot of functional enrichment analysis
Ruidong Li and Han Qu
####### Enrichment plots ####### enrichOutput<-data.frame(Terms=c('hsa05414~Dilated cardiomyopathy (DCM)', 'hsa04510~Focal adhesion', 'hsa05205~Proteoglycans in cancer'), Category=rep('KEGG',3), FDR=c(0.001,0.002,0.003)) gdcEnrichPlot(enrichment=enrichOutput, type='bar', category='KEGG')
####### Enrichment plots ####### enrichOutput<-data.frame(Terms=c('hsa05414~Dilated cardiomyopathy (DCM)', 'hsa04510~Focal adhesion', 'hsa05205~Proteoglycans in cancer'), Category=rep('KEGG',3), FDR=c(0.001,0.002,0.003)) gdcEnrichPlot(enrichment=enrichOutput, type='bar', category='KEGG')
Export nodes and edges of ce network for Cytoscape visualization
gdcExportNetwork(ceNetwork, net)
gdcExportNetwork(ceNetwork, net)
ceNetwork |
a dataframe generated from |
net |
one of |
A dataframe of nodes or edges
Ruidong Li and Han Qu
####### ceRNA network analysis ####### ceOutput <- data.frame(lncRNAs=c('ENSG00000242125','ENSG00000242125', 'ENSG00000245532'), Genes=c('ENSG00000043355','ENSG00000109586', 'ENSG00000144355'), miRNAs=c('hsa-miR-340-5p','hsa-miR-340-5p', 'hsa-miR-320b,hsa-miR-320d, hsa-miR-320c,hsa-miR-320a'), Counts=c(1,1,4), stringsAsFactors=FALSE) ####### Export edges ####### edges <- gdcExportNetwork(ceNetwork=ceOutput, net='edges') ####### Export nodes ####### ## Not run: nodes <- gdcExportNetwork(ceNetwork=ceOutput, net='nodes')
####### ceRNA network analysis ####### ceOutput <- data.frame(lncRNAs=c('ENSG00000242125','ENSG00000242125', 'ENSG00000245532'), Genes=c('ENSG00000043355','ENSG00000109586', 'ENSG00000144355'), miRNAs=c('hsa-miR-340-5p','hsa-miR-340-5p', 'hsa-miR-320b,hsa-miR-320d, hsa-miR-320c,hsa-miR-320a'), Counts=c(1,1,4), stringsAsFactors=FALSE) ####### Export edges ####### edges <- gdcExportNetwork(ceNetwork=ceOutput, net='edges') ####### Export nodes ####### ## Not run: nodes <- gdcExportNetwork(ceNetwork=ceOutput, net='nodes')
Filter out samples that are sequenced for two or more times
gdcFilterDuplicate(metadata)
gdcFilterDuplicate(metadata)
metadata |
metadata parsed from |
A filtered dataframe of metadata without duplicated samples
Ruidong Li and Han Qu
####### Parse metadata by project id and data type ####### metaMatrix <- gdcParseMetadata(project.id='TARGET-RT', data.type='RNAseq') metaMatrix <- gdcFilterDuplicate(metadata=metaMatrix)
####### Parse metadata by project id and data type ####### metaMatrix <- gdcParseMetadata(project.id='TARGET-RT', data.type='RNAseq') metaMatrix <- gdcFilterDuplicate(metadata=metaMatrix)
Filter out samples that are neither Solid Tissue Normal nor Primary Tumor
gdcFilterSampleType(metadata)
gdcFilterSampleType(metadata)
metadata |
metadata parsed from |
A filtered dataframe of metadata with Solid Tissue Normal and Primary Tumor samples only
Ruidong Li and Han Qu
####### Parse metadata by project id and data type ####### metaMatrix <- gdcParseMetadata(project.id='TARGET-RT', data.type='RNAseq') metaMatrix <- gdcFilterSampleType(metadata=metaMatrix)
####### Parse metadata by project id and data type ####### metaMatrix <- gdcParseMetadata(project.id='TARGET-RT', data.type='RNAseq') metaMatrix <- gdcFilterSampleType(metadata=metaMatrix)
A heatmap showing unsupervised hierarchical clustering of
DE genes/miRNAs by heatmap.2
in the
gplots package
gdcHeatmap(deg.id, metadata, rna.expr)
gdcHeatmap(deg.id, metadata, rna.expr)
deg.id |
a vector of Ensembl gene ids or miRBase v21 mature miRNA ids |
metadata |
metadata parsed from |
rna.expr |
|
A heatmap with rows are DE genes/miRNAs and columns are samples. Solid Tissue Normal samples are labeled with blue and Primary Tumor samples are labeled with red
Ruidong Li and Han Qu
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples gdcHeatmap(deg.id=genes, metadata=metaMatrix, rna.expr=rnaExpr)
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples gdcHeatmap(deg.id=genes, metadata=metaMatrix, rna.expr=rnaExpr)
Plot Kaplan Meier survival curve
gdcKMPlot(gene, rna.expr, metadata, sep = "median")
gdcKMPlot(gene, rna.expr, metadata, sep = "median")
gene |
an Ensembl gene id |
rna.expr |
|
metadata |
metadata parsed from |
sep |
a character string specifying which point should be used to
separate low-expression and high-expression groups. Possible values
are |
A plot of Kaplan Meier survival curve
Ruidong Li and Han Qu
####### KM plots ####### genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples gdcKMPlot(gene='ENSG00000000938', rna.expr=rnaExpr, metadata=metaMatrix, sep='median')
####### KM plots ####### genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples gdcKMPlot(gene='ENSG00000000938', rna.expr=rnaExpr, metadata=metaMatrix, sep='median')
Check if samples in the metadata and expression data match
gdcMatchSamples(metadata, rna.expr)
gdcMatchSamples(metadata, rna.expr)
metadata |
metadata parsed from |
rna.expr |
|
A logical value. If TRUE
, all the samples matched
Ruidong Li and Han Qu
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples gdcMatchSamples(metadata=metaMatrix, rna.expr=rnaExpr)
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples gdcMatchSamples(metadata=metaMatrix, rna.expr=rnaExpr)
Parse metadata either by providing the .json file that is downloaded from GDC cart or by parse metadata automatically by providing the projct id and data type
gdcParseMetadata(metafile = NULL, project.id, data.type, write.meta = FALSE)
gdcParseMetadata(metafile = NULL, project.id, data.type, write.meta = FALSE)
metafile |
metadata file in |
project.id |
project id in GDC |
data.type |
one of |
write.meta |
logical, whether to write the metadata to a
|
A dataframe of metadata containing file_name, sample_id, etc. as well as some basic clinical data
Ruidong Li and Han Qu
####### Merge RNA expression data ####### metaMatrix <- gdcParseMetadata(project.id='TARGET-RT', data.type='RNAseq')
####### Merge RNA expression data ####### metaMatrix <- gdcParseMetadata(project.id='TARGET-RT', data.type='RNAseq')
Download gene expression quantification and isoform expression quantification data from GDC either by providing the manifest file or by sepcifying the project id and data type
gdcRNADownload(manifest = NULL, project.id, data.type, directory = "Data", write.manifest = FALSE, method = "gdc-client")
gdcRNADownload(manifest = NULL, project.id, data.type, directory = "Data", write.manifest = FALSE, method = "gdc-client")
manifest |
menifest file that is downloaded from the GDC cart.
If provided, files whose UUIDs are in the manifest file
will be downloaded via gdc-client, otherwise, |
project.id |
project id in GDC |
data.type |
one of |
directory |
the folder to save downloaded files.
Default is |
write.manifest |
logical, whether to write out the manifest file |
method |
method that is used to download data. Either
|
Downloaded files in the specified directory
Ruidong Li and Han Qu
####### Download RNA data by menifest file ####### manifest <- 'RNAseq.manifest.txt' ## Not run: gdcRNADownload(manifest=manifest) ####### Download RNA data by project id and data type ####### project <- 'TCGA-PRAD' ## Not run: gdcRNADownload(project.id=project, data.type='RNAseq')
####### Download RNA data by menifest file ####### manifest <- 'RNAseq.manifest.txt' ## Not run: gdcRNADownload(manifest=manifest) ####### Download RNA data by project id and data type ####### project <- 'TCGA-PRAD' ## Not run: gdcRNADownload(project.id=project, data.type='RNAseq')
Merge raw counts data that is downloaded from GDC to a single expression matrix
gdcRNAMerge(metadata, path, data.type, organized = FALSE)
gdcRNAMerge(metadata, path, data.type, organized = FALSE)
metadata |
metadata parsed from |
path |
path to downloaded files for merging |
data.type |
one of |
organized |
logical, whether the raw counts data have already
been organized into a single folder (eg., data downloaded by the
'GenomicDataCommons' method are already organized).
Default is |
A dataframe or numeric matrix of raw counts data with rows are genes or miRNAs and columns are samples
Ruidong Li and Han Qu
####### Merge RNA expression data ####### metaMatrix <- gdcParseMetadata(project.id='TARGET-RT', data.type='RNAseq') ## Not run: rnaExpr <- gdcRNAMerge(metadata=metaMatrix, path='RNAseq/', data.type='RNAseq') ## End(Not run)
####### Merge RNA expression data ####### metaMatrix <- gdcParseMetadata(project.id='TARGET-RT', data.type='RNAseq') ## Not run: rnaExpr <- gdcRNAMerge(metadata=metaMatrix, path='RNAseq/', data.type='RNAseq') ## End(Not run)
Univariate Cox Proportional-Hazards and Kaplan Meier survival analysis of a vector of genes
gdcSurvivalAnalysis(gene, rna.expr, metadata, method = "coxph", sep = "median")
gdcSurvivalAnalysis(gene, rna.expr, metadata, method = "coxph", sep = "median")
gene |
a vector of Ensembl gene ids |
rna.expr |
|
metadata |
metadata parsed from |
method |
method for survival analysis. Possible values are
|
sep |
which point should be used to separate low-expression
and high-expression groups for |
A dataframe or numeric matrix of hazard ratio, 95% confidence interval, p value, and FDR
Ruidong Li and Han Qu
Therneau TM, Lumley T. Package ‘survival’.
Andersen PK, Gill RD. Cox's regression model for
counting processes: a large sample study.
The annals of statistics. 1982 Dec 1:1100-20.
Therneau TM, Grambsch PM. Extending the Cox model.
Edited by P. Bickel, P. Diggle, S. Fienberg, K. Krickeberg.
2000:51.
Harrington DP, Fleming TR. A class of rank test procedures
for censored survival data.Biometrika. 1982 Dec 1;69(3):553-66.
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples survOutput <- gdcSurvivalAnalysis(gene=genes, rna.expr=rnaExpr, metadata=metaMatrix)
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples survOutput <- gdcSurvivalAnalysis(gene=genes, rna.expr=rnaExpr, metadata=metaMatrix)
A volcano plot showing differentially expressed genes/miRNAs
gdcVolcanoPlot(deg.all, fc = 2, pval = 0.01)
gdcVolcanoPlot(deg.all, fc = 2, pval = 0.01)
deg.all |
a dataframe generated from |
fc |
a numeric value specifying the threshold of fold change |
pval |
a nuemric value specifying the threshold of p value |
A volcano plot
Ruidong Li and Han Qu
genes <- c('ENSG00000231806','ENSG00000261211','ENSG00000260920', 'ENSG00000228594','ENSG00000125170','ENSG00000179909', 'ENSG00000280012','ENSG00000134612','ENSG00000213071') symbol <- c('PCAT7','AL031123.2','AL031985.3', 'FNDC10','DOK4','ZNF154', 'RPL23AP61','FOLH1B','LPAL2') group <- rep(c('long_non_coding','protein_coding','pseudogene'), each=3) logFC <- c(2.8,2.3,-1.1,1.9,-1.2,-1.6,1.5,2.1,-1.1) FDR <- rep(c(0.1,0.00001,0.0002), each=3) deg <- data.frame(symbol, group, logFC, FDR) rownames(deg) <- genes gdcVolcanoPlot(deg.all=deg)
genes <- c('ENSG00000231806','ENSG00000261211','ENSG00000260920', 'ENSG00000228594','ENSG00000125170','ENSG00000179909', 'ENSG00000280012','ENSG00000134612','ENSG00000213071') symbol <- c('PCAT7','AL031123.2','AL031985.3', 'FNDC10','DOK4','ZNF154', 'RPL23AP61','FOLH1B','LPAL2') group <- rep(c('long_non_coding','protein_coding','pseudogene'), each=3) logFC <- c(2.8,2.3,-1.1,1.9,-1.2,-1.6,1.5,2.1,-1.1) FDR <- rep(c(0.1,0.00001,0.0002), each=3) deg <- data.frame(symbol, group, logFC, FDR) rownames(deg) <- genes gdcVolcanoPlot(deg.all=deg)
Normalize raw counts data by TMM implemented in edgeR
and then transform it by voom
in limma
gdcVoomNormalization(counts, filter = TRUE)
gdcVoomNormalization(counts, filter = TRUE)
counts |
raw counts of RNA/miRNA expression data |
filter |
logical, whether to filter out low-expression genes.
If |
A dataframe or numeric matrix of TMM normalized and
voom
transformed expression values on
the log2 scale
Ruidong Li and Han Qu
Robinson MD, McCarthy DJ, Smyth GK. edgeR: a Bioconductor package
for differential expression analysis of digital gene expression data.
Bioinformatics. 2010 Jan 1;26(1):139-40.
Law CW, Chen Y, Shi W, Smyth GK. Voom: precision weights unlock
linear model analysis tools for RNA-seq read counts. Genome biology.
2014 Feb 3;15(2):R29.
####### Normalization ####### rnaMatrix <- matrix(sample(1:100,100), 4, 25) rnaExpr <- gdcVoomNormalization(counts=rnaMatrix, filter=FALSE)
####### Normalization ####### rnaMatrix <- matrix(sample(1:100,100), 4, 25) rnaExpr <- gdcVoomNormalization(counts=rnaMatrix, filter=FALSE)
A simple shiny app to show scatter plot of correlations between two genes/miRNAs on local web browser
shinyCorPlot(gene1, gene2, rna.expr, metadata)
shinyCorPlot(gene1, gene2, rna.expr, metadata)
gene1 |
a vector of Ensembl gene ids or miRBase v21 mature miRNA ids |
gene2 |
a vector of Ensembl gene ids or miRBase v21 mature miRNA ids |
rna.expr |
|
metadata |
metadata parsed from |
a local webpage for visualization of correlation plots
Ruidong Li and Han Qu
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples ## Not run: shinyCorPlot(gene1=genes[1:3], gene2=genes[4:5], rna.expr=rnaExpr, metadata=metaMatrix) ## End(Not run)
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples ## Not run: shinyCorPlot(gene1=genes[1:3], gene2=genes[4:5], rna.expr=rnaExpr, metadata=metaMatrix) ## End(Not run)
A simple shiny app to show KM survival curves on local web browser
shinyKMPlot(gene, rna.expr, metadata)
shinyKMPlot(gene, rna.expr, metadata)
gene |
a vector of Ensembl gene ids |
rna.expr |
|
metadata |
metadata parsed from |
a local webpage for visualization of KM plots
Ruidong Li and Han Qu
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples ## Not run: shinyKMPlot(gene=genes, rna.expr=rnaExpr, metadata=metaMatrix) ## End(Not run)
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') samples <- c('TCGA-2F-A9KO-01', 'TCGA-2F-A9KP-01', 'TCGA-2F-A9KQ-01', 'TCGA-2F-A9KR-01', 'TCGA-2F-A9KT-01', 'TCGA-2F-A9KW-01') metaMatrix <- data.frame(sample_type=rep('PrimaryTumor',6), sample=samples, days_to_death=seq(100,600,100), days_to_last_follow_up=rep(NA,6)) rnaExpr <- matrix(c(2.7,7.0,4.9,6.9,4.6,2.5, 0.5,2.5,5.7,6.5,4.9,3.8, 2.1,2.9,5.9,5.7,4.5,3.5, 2.7,5.9,4.5,5.8,5.2,3.0, 2.5,2.2,5.3,4.4,4.4,2.9, 2.4,3.8,6.2,3.8,3.8,4.2),6,6) rownames(rnaExpr) <- genes colnames(rnaExpr) <- samples ## Not run: shinyKMPlot(gene=genes, rna.expr=rnaExpr, metadata=metaMatrix) ## End(Not run)
A simple shiny app to show pathways genetrated by pathview package on local web browser
shinyPathview(gene, pathways, directory = ".")
shinyPathview(gene, pathways, directory = ".")
gene |
a vector of numeric values (eg. fold change on log2 scale) with names are Ensembl gene ids |
pathways |
a vector of KEGG pathway ids |
directory |
the folder to save pathway figures. Default is the working directory |
a local webpage for visualization of KEGG maps
Ruidong Li and Han Qu
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') pathways <- c("hsa05414~Dilated cardiomyopathy (DCM)", "hsa05410~Hypertrophic cardiomyopathy (HCM)", "hsa05412~Arrhythmogenic right ventricular cardiomyopathy", "hsa04512~ECM-receptor interaction", "hsa04510~Focal adhesion", "hsa04360~Axon guidance", "hsa04270~Vascular smooth muscle contraction", "hsa05205~Proteoglycans in cancer", "hsa04022~cGMP-PKG signaling pathway", "hsa00480~Glutathione metabolism") ## Not run: shinyPathview(gene=genes, pathways=pathways)
genes <- c('ENSG00000000938','ENSG00000000971','ENSG00000001036', 'ENSG00000001084','ENSG00000001167','ENSG00000001460') pathways <- c("hsa05414~Dilated cardiomyopathy (DCM)", "hsa05410~Hypertrophic cardiomyopathy (HCM)", "hsa05412~Arrhythmogenic right ventricular cardiomyopathy", "hsa04512~ECM-receptor interaction", "hsa04510~Focal adhesion", "hsa04360~Axon guidance", "hsa04270~Vascular smooth muscle contraction", "hsa05205~Proteoglycans in cancer", "hsa04022~cGMP-PKG signaling pathway", "hsa00480~Glutathione metabolism") ## Not run: shinyPathview(gene=genes, pathways=pathways)