Compilation of TCGA molecular subtypes

TCGAbiolinks provides molecular subtype information for TCGA samples. The functions PanCancerAtlas_subtypes and TCGAquery_subtype return this information as tables.

The PanCancerAtlas_subtypes function gives access to a curated table retrieved from Synapse, which likely holds the most up-to-date molecular subtypes. The TCGAquery_subtype function returns the complete tables, including sample information, retrieved from the TCGA marker papers.

PanCancerAtlas_subtypes: Curated molecular subtypes.

Data and description were retrieved from Synapse (https://www.synapse.org/#!Synapse:syn8402849).

Synapse has published a single file with all available molecular subtypes that have been described by TCGA (all tumor types and all molecular platforms), which can be accessed using the PanCancerAtlas_subtypes function as below:

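library(TCGAbiolinks)

# Curated PanCancerAtlas subtype table
subtypes <- PanCancerAtlas_subtypes()
# Interactive, filterable view of the table (5 rows per page)
DT::datatable(
    data = subtypes,
    filter = 'top',
    options = list(scrollX = TRUE, keys = TRUE, pageLength = 5),
    rownames = FALSE
)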

The column “Subtype_Selected” holds the subtype classification selected, from among the other columns, as the most prominent one.

| Cancer type | All available molecular data based-subtype | Selected subtype | Number of samples | Link to file | Reference | Link to paper |
|---|---|---|---|---|---|---|
| ACC | mRNA, DNAmeth, protein, miRNA, CNA, COC, C1A.C1B | DNAmeth | 91 | Link | Cancer Cell 2016 | Link |
| AML | mRNA and miRNA | mRNA | 187 | Link | NEJM 2013 | Link |
| BLCA | mRNA subtypes | mRNA | 129 | Link | Nature 2014 | Link |
| BRCA | PAM50 (mRNA) | PAM50 | 1218 | Link | Nature 2012 | Link |
| GBM/LGG* | mRNA, DNAmeth, protein, Supervised_DNAmeth | Supervised_DNAmeth | 1122 | Link | Cell 2016 | Link |
| Pan-GI (preliminary) ESCA/STAD/COAD/READ | Molecular_Subtype | Molecular_Subtype | 1011 | Link | Cancer Cell 2018 | Link |
| HNSC | mRNA, DNAmeth, RPPA, miRNA, CNA, Paradigm | mRNA | 279 | Link (TabS7.2) | Nature 2015 | Link |
| KICH | Eosinophilic | Eosinophilic | 66 | Link | Cancer Cell 2014 | Link |
| KIRC | mRNA, miRNA | mRNA | 442 | Link | Nature 2013 | Link |
| KIRP | mRNA, DNAmeth, protein, miRNA, CNA, COC | COC | 161 | Link | NEJM 2015 | Link |
| LIHC (preliminary) | mRNA, DNAmeth, protein, miRNA, CNA, Paradigma, iCluster | iCluster | 196 | Link (Table S1A) | not published | |
| LUAD | DNAmeth, iCluster | iCluster | 230 | Link (Table S7) | Nature 2014 | Link |
| LUSC | mRNA | mRNA | 178 | Link (Data file S7.5) | Nature 2012 | Link |
| OVCA | mRNA | mRNA | 489 | Link | Nature 2011 | Link |
| PCPG | mRNA, DNAmeth, protein, miRNA, CNA | mRNA | 178 | tableS2 | Cancer Cell 2017 | Link |
| PRAD | mRNA, DNAmeth, protein, miRNA, CNA, icluster, mutation/fusion | mutation/fusion | 333 | Link | Cell 2015 | Link |
| SKCM | mRNA, DNAmeth, protein, miRNA, mutation | mutation | 331 | Link (Table S1D) | Cell 2015 | Link |
| THCA | mRNA, DNAmeth, protein, miRNA, CNA, histology | mRNA | 496 | Link (Table S2 - Tab1) | Cell 2014 | Link |
| UCEC | iCluster, MSI, CNA, mRNA | iCluster - updated according to Pan-Gyne/Pathways groups | 538 | Link (datafile S1.1) | Nature 2013 | Link, Link |
| UCS (preliminary) | mRNA | mRNA | 57 | Link | not published | |
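
To illustrate how the “Subtype_Selected” column might be used, the sketch below counts samples per selected subtype for one cancer type. The column names `cancer.type` and `Subtype_Selected` are assumed to match the table returned by PanCancerAtlas_subtypes(). The small data frame is invented so the example runs without the package, and `count_selected_subtypes` is a hypothetical helper, not part of TCGAbiolinks.

```r
# Toy stand-in for the table returned by PanCancerAtlas_subtypes().
# Column names are assumed; sample IDs and subtype labels are invented.
toy_subtypes <- data.frame(
    pan.samplesID    = c("S1", "S2", "S3", "S4", "S5"),
    cancer.type      = c("BRCA", "BRCA", "BRCA", "ACC", "ACC"),
    Subtype_Selected = c("BRCA.LumA", "BRCA.LumA", "BRCA.Basal",
                         "ACC.CIMP-high", NA),
    stringsAsFactors = FALSE
)

# Hypothetical helper: number of samples per selected subtype
# within a single cancer type.
count_selected_subtypes <- function(tbl, cancer) {
    rows <- !is.na(tbl$cancer.type) & tbl$cancer.type == cancer
    # useNA = "ifany" keeps samples without a selected subtype visible
    table(tbl$Subtype_Selected[rows], useNA = "ifany")
}

brca_counts <- count_selected_subtypes(toy_subtypes, "BRCA")
print(brca_counts)
```

With the package loaded, the same helper could be applied to the real table, for example `count_selected_subtypes(PanCancerAtlas_subtypes(), "BRCA")`, provided the column names match the assumption above.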

TCGAquery_subtype: Working with molecular subtypes data.

The Cancer Genome Atlas (TCGA) Research Network has reported integrated genome-wide studies of various diseases. We have added some of the subtypes defined by these reports to our package:

| TCGA dataset | Link | Paper | Journal |
|---|---|---|---|
| ACC | doi:10.1016/j.ccell.2016.04.002 | Comprehensive Pan-Genomic Characterization of Adrenocortical Carcinoma | Cancer Cell 2016 |
| BRCA | https://www.cell.com/cancer-cell/fulltext/S1535-6108(18)30119-3 | A Comprehensive Pan-Cancer Molecular Study of Gynecologic and Breast Cancers | Cancer Cell 2018 |
| BLCA | http://www.cell.com/cell/fulltext/S0092-8674(17)31056-5 | Comprehensive Molecular Characterization of Muscle-Invasive Bladder Cancer | Cell 2017 |
| CHOL | http://www.sciencedirect.com/science/article/pii/S2211124717302140?via%3Dihub | Integrative Genomic Analysis of Cholangiocarcinoma Identifies Distinct IDH-Mutant Molecular Profiles | Cell Reports 2017 |
| COAD | http://www.nature.com/nature/journal/v487/n7407/abs/nature11252.html | Comprehensive molecular characterization of human colon and rectal cancer | Nature 2012 |
| ESCA | https://www.nature.com/articles/nature20805 | Integrated genomic characterization of oesophageal carcinoma | Nature 2017 |
| GBM | http://dx.doi.org/10.1016/j.cell.2015.12.028 | Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma | Cell 2016 |
| HNSC | http://www.nature.com/nature/journal/v517/n7536/abs/nature14129.html | Comprehensive genomic characterization of head and neck squamous cell carcinomas | Nature 2015 |
| KICH | http://www.sciencedirect.com/science/article/pii/S1535610814003043 | The Somatic Genomic Landscape of Chromophobe Renal Cell Carcinoma | Cancer Cell 2014 |
| KIRC | http://www.nature.com/nature/journal/v499/n7456/abs/nature12222.html | Comprehensive molecular characterization of clear cell renal cell carcinoma | Nature 2013 |
| KIRP | http://www.nejm.org/doi/full/10.1056/NEJMoa1505917 | Comprehensive Molecular Characterization of Papillary Renal-Cell Carcinoma | NEJM 2016 |
| LIHC | http://linkinghub.elsevier.com/retrieve/pii/S0092-8674(17)30639-6 | Comprehensive and Integrative Genomic Characterization of Hepatocellular Carcinoma | Cell 2017 |
| LGG | http://dx.doi.org/10.1016/j.cell.2015.12.028 | Molecular Profiling Reveals Biologically Discrete Subsets and Pathways of Progression in Diffuse Glioma | Cell 2016 |
| LUAD | http://www.nature.com/nature/journal/v511/n7511/abs/nature13385.html | Comprehensive molecular profiling of lung adenocarcinoma | Nature 2014 |
| LUSC | http://www.nature.com/nature/journal/v489/n7417/abs/nature11404.html | Comprehensive genomic characterization of squamous cell lung cancers | Nature 2012 |
| PAAD | http://www.cell.com/cancer-cell/fulltext/S1535-6108(17)30299-4 | Integrated Genomic Characterization of Pancreatic Ductal Adenocarcinoma | Cancer Cell 2017 |
| PCPG | http://dx.doi.org/10.1016/j.ccell.2017.01.001 | Comprehensive Molecular Characterization of Pheochromocytoma and Paraganglioma | Cancer Cell 2017 |
| PRAD | http://www.sciencedirect.com/science/article/pii/S0092867415013392 | The Molecular Taxonomy of Primary Prostate Cancer | Cell 2015 |
| READ | http://www.nature.com/nature/journal/v487/n7407/abs/nature11252.html | Comprehensive molecular characterization of human colon and rectal cancer | Nature 2012 |
| SARC | http://www.cell.com/cell/fulltext/S0092-8674(17)31203-5 | Comprehensive and Integrated Genomic Characterization of Adult Soft Tissue Sarcomas | Cell 2017 |
| SKCM | http://www.sciencedirect.com/science/article/pii/S0092867415006340 | Genomic Classification of Cutaneous Melanoma | Cell 2015 |
| STAD | http://www.nature.com/nature/journal/v511/n7511/abs/nature13385.html | Comprehensive molecular characterization of gastric adenocarcinoma | Nature 2013 |
| THCA | http://www.sciencedirect.com/science/article/pii/S0092867414012380 | Integrated Genomic Characterization of Papillary Thyroid Carcinoma | Cell 2014 |
| UCEC | http://www.nature.com/nature/journal/v497/n7447/abs/nature12113.html | Integrated genomic characterization of endometrial carcinoma | Nature 2013 |
| UCS | http://www.cell.com/cancer-cell/fulltext/S1535-6108(17)30053-3 | Integrated Molecular Characterization of Uterine Carcinosarcoma | Cancer Cell 2017 |
| UVM | http://www.cell.com/cancer-cell/fulltext/S1535-6108(17)30295-7 | Integrative Analysis Identifies Four Molecular and Clinical Subsets in Uveal Melanoma | Cancer Cell 2017 |

These subtypes are added automatically to the SummarizedExperiment object by GDCprepare. You can also use the TCGAquery_subtype function to retrieve this information directly.

lgg.gbm.subtype <- TCGAquery_subtype(tumor = "lgg")
## lgg subtype information from:doi:10.1016/j.cell.2015.12.028

A subset of the LGG subtype table is shown below:
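
A static preview can stand in for an interactive table. The sketch below prints the first rows of a subtype table and then matches its rows to sample barcodes using the 12-character patient identifier, which is one plausible way to attach subtypes to samples by hand. The `patient` and `IDH.status` column names are assumptions about the returned table, all barcodes and values are invented, and the toy data frame only stands in for `lgg.gbm.subtype`.

```r
# Toy stand-in for lgg.gbm.subtype; column names are assumed, values invented.
toy_lgg <- data.frame(
    patient    = c("TCGA-AA-0001", "TCGA-AA-0002", "TCGA-AA-0003"),
    IDH.status = c("Mutant", "WT", "Mutant"),
    stringsAsFactors = FALSE
)

# Static preview of the first rows
print(head(toy_lgg, 2))

# Invented sample barcodes; the first 12 characters identify the patient
barcodes <- c("TCGA-AA-0003-01A", "TCGA-AA-0001-01A", "TCGA-AA-0009-01A")
patient_id <- substr(barcodes, 1, 12)

# match() keeps the sample order and yields NA for patients
# that have no row in the subtype table
annotated <- data.frame(
    barcode = barcodes,
    toy_lgg[match(patient_id, toy_lgg$patient), , drop = FALSE],
    row.names = NULL,
    stringsAsFactors = FALSE
)
print(annotated)
```

Using `match()` rather than `merge()` preserves the original sample order, which matters when the annotation has to line up with the columns of an expression matrix.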

Session Information


sessionInfo()
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: Etc/UTC
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] grid      stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] maftools_2.23.0             jpeg_0.1-10                
##  [3] png_0.1-8                   DT_0.33                    
##  [5] dplyr_1.1.4                 SummarizedExperiment_1.37.0
##  [7] Biobase_2.67.0              GenomicRanges_1.59.1       
##  [9] GenomeInfoDb_1.43.1         IRanges_2.41.1             
## [11] S4Vectors_0.45.2            BiocGenerics_0.53.3        
## [13] generics_0.1.3              MatrixGenerics_1.19.0      
## [15] matrixStats_1.4.1           TCGAbiolinks_2.35.0        
## [17] testthat_3.2.1.1           
## 
## loaded via a namespace (and not attached):
##   [1] RColorBrewer_1.1-3          sys_3.4.3                  
##   [3] rstudioapi_0.17.1           jsonlite_1.8.9             
##   [5] magrittr_2.0.3              GenomicFeatures_1.59.1     
##   [7] rmarkdown_2.29              BiocIO_1.17.0              
##   [9] fs_1.6.5                    zlibbioc_1.52.0            
##  [11] vctrs_0.6.5                 Rsamtools_2.23.0           
##  [13] memoise_2.0.1               RCurl_1.98-1.16            
##  [15] htmltools_0.5.8.1           S4Arrays_1.7.1             
##  [17] usethis_3.0.0               progress_1.2.3             
##  [19] curl_6.0.1                  SparseArray_1.7.2          
##  [21] sass_0.4.9                  bslib_0.8.0                
##  [23] htmlwidgets_1.6.4           desc_1.4.3                 
##  [25] plyr_1.8.9                  httr2_1.0.6                
##  [27] cachem_1.1.0                GenomicAlignments_1.43.0   
##  [29] buildtools_1.0.0            mime_0.12                  
##  [31] lifecycle_1.0.4             pkgconfig_2.0.3            
##  [33] Matrix_1.7-1                R6_2.5.1                   
##  [35] fastmap_1.2.0               GenomeInfoDbData_1.2.13    
##  [37] shiny_1.9.1                 digest_0.6.37              
##  [39] colorspace_2.1-1            ShortRead_1.65.0           
##  [41] AnnotationDbi_1.69.0        rprojroot_2.0.4            
##  [43] pkgload_1.4.0               crosstalk_1.2.1            
##  [45] RSQLite_2.3.8               hwriter_1.3.2.1            
##  [47] filelock_1.0.3              fansi_1.0.6                
##  [49] httr_1.4.7                  abind_1.4-8                
##  [51] compiler_4.4.2              remotes_2.5.0              
##  [53] bit64_4.5.2                 withr_3.0.2                
##  [55] downloader_0.4              BiocParallel_1.41.0        
##  [57] DBI_1.2.3                   pkgbuild_1.4.5             
##  [59] R.utils_2.12.3              biomaRt_2.63.0             
##  [61] rappdirs_0.3.3              DelayedArray_0.33.2        
##  [63] sessioninfo_1.2.2           rjson_0.2.23               
##  [65] DNAcopy_1.81.0              tools_4.4.2                
##  [67] httpuv_1.6.15               R.oo_1.27.0                
##  [69] glue_1.8.0                  restfulr_0.0.15            
##  [71] promises_1.3.0              gtable_0.3.6               
##  [73] tzdb_0.4.0                  R.methodsS3_1.8.2          
##  [75] tidyr_1.3.1                 data.table_1.16.2          
##  [77] hms_1.1.3                   xml2_1.3.6                 
##  [79] utf8_1.2.4                  XVector_0.47.0             
##  [81] pillar_1.9.0                stringr_1.5.1              
##  [83] vroom_1.6.5                 later_1.3.2                
##  [85] splines_4.4.2               BiocFileCache_2.15.0       
##  [87] lattice_0.22-6              deldir_2.0-4               
##  [89] rtracklayer_1.67.0          aroma.light_3.37.0         
##  [91] survival_3.7-0              bit_4.5.0                  
##  [93] tidyselect_1.2.1            maketools_1.3.1            
##  [95] Biostrings_2.75.1           miniUI_0.1.1.1             
##  [97] knitr_1.49                  xfun_0.49                  
##  [99] devtools_2.4.5              brio_1.1.5                 
## [101] stringi_1.8.4               UCSC.utils_1.3.0           
## [103] yaml_2.3.10                 codetools_0.2-20           
## [105] TCGAbiolinksGUI.data_1.26.0 evaluate_1.0.1             
## [107] interp_1.1-6                EDASeq_2.41.0              
## [109] tibble_3.2.1                BiocManager_1.30.25        
## [111] cli_3.6.3                   xtable_1.8-4               
## [113] munsell_0.5.1               jquerylib_0.1.4            
## [115] Rcpp_1.0.13-1               dbplyr_2.5.0               
## [117] XML_3.99-0.17               parallel_4.4.2             
## [119] ellipsis_0.3.2              ggplot2_3.5.1              
## [121] readr_2.1.5                 blob_1.2.4                 
## [123] prettyunits_1.2.0           latticeExtra_0.6-30        
## [125] profvis_0.4.0               urlchecker_1.0.1           
## [127] bitops_1.0-9                pwalign_1.3.0              
## [129] scales_1.3.0                purrr_1.0.2                
## [131] crayon_1.5.3                BiocStyle_2.35.0           
## [133] rlang_1.1.4                 KEGGREST_1.47.0            
## [135] rvest_1.0.4