Title: | Comprehensive analysis of transcriptome data |
---|---|
Description: | This package translates microarray expression data into metadata of reduced dimension. It provides various sample-centered and group-centered visualizations, sample similarity analyses and functional enrichment analyses. The underlying SOM algorithm combines feature clustering, multidimensional scaling and dimension reduction, along with strong visualization capabilities. It enables extraction and description of functional expression modules inherent in the data. |
Authors: | Henry Loeffler-Wirth, Hoang Thanh Le and Martin Kalcher |
Maintainer: | Henry Loeffler-Wirth |
License: | GPL (>=2) |
Version: | 2.25.0 |
Built: | 2024-10-30 09:21:10 UTC |
Source: | https://github.com/bioc/oposSOM |
This package translates microarray expression data into metadata of reduced dimension. It provides various sample-centered and group-centered visualizations, sample similarity analyses and functional enrichment analyses. The underlying SOM algorithm combines feature clustering, multidimensional scaling and dimension reduction, along with strong visualization capabilities. It enables extraction and description of functional expression modules inherent in the data. The results are given within a separate folder and can be browsed using the summary HTML file.
Package: | oposSOM |
Type: | Package |
Version: | 2.4.2 |
Date: | 2024-08-13 |
License: | GPL (>= 2) |
Author: | Henry Loeffler-Wirth and Martin Kalcher |
Maintainer: | Henry Loeffler-Wirth |
Wirth, Loeffler, v.Bergen, Binder: Expression cartography of human tissues using self organizing maps. (BMC Bioinformatics 2011)
Wirth, v.Bergen, Binder: Mining SOM expression portraits: feature selection and integrating concepts of molecular function. (BioData Mining 2012)
Loeffler-Wirth, Kalcher, Binder: oposSOM: R-package for high-dimensional portraying of genome-wide expression landscapes on Bioconductor. (Bioinformatics 2015)
    library(oposSOM)

    # Example with artificial data
    env <- opossom.new(list(dataset.name="Example", dim.1stLvlSom=20))

    env$indata <- matrix(rnorm(10000), 1000, 10)
    env$group.labels <- "auto"

    opossom.run(env)

    # Real Example - This will take several minutes
    #env <- opossom.new(list(dataset.name="Tissues",
    #                        dim.1stLvlSom=30,
    #                        geneset.analysis=TRUE,
    #                        pairwise.comparison.list=list(
    #                          list("Homeostasis"=c(1, 2), "Immune System"=c(9, 10)),
    #                          list("Homeostasis"=c(1, 2), "Muscle"=c(8))
    #                        )))
    #
    #data(opossom.tissues)
    #env$indata <- opossom.tissues
    #
    #env$group.labels <- c(rep("Homeostasis", 2),
    #                      "Endocrine",
    #                      "Digestion",
    #                      "Exocrine",
    #                      "Epithelium",
    #                      "Reproduction",
    #                      "Muscle",
    #                      rep("Immune System", 2),
    #                      rep("Nervous System", 2))
    #
    #opossom.run(env)
Genesets collected from publications and independent analyses.
data(opossom.genesets)
The data set is stored in RData (binary) format. Each element of the list represents one distinct gene set and contains the Ensembl-IDs of the member genes.
The oposSOM package allows for analysing the biological background of the samples using predefined sets of genes of known biological context. A large and diverse collection of such gene sets is automatically derived from the Gene Ontology (GO) annotation database using the biomaRt interface. opossom.genesets contains more than 4,500 additional gene sets collected from the Biocarta, KEGG and Reactome databases, from literature on chemical and genetic perturbations, from literature on cancer types and subtypes, and from previous analyses using the oposSOM pipeline.
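As a rough illustration of working with a list of gene sets, the base R sketch below counts how many members of each set are present among the measured genes. The object `toy.genesets`, its set names and all Ensembl-style IDs are made up for this example, and the layout (one character vector of IDs per element) is an assumption; the elements of the real opossom.genesets object may be structured differently, so it is worth inspecting it with `str()` first.

```r
# Toy stand-in for a gene-set collection (hypothetical names and IDs).
toy.genesets <- list(
  "set A" = c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003"),
  "set B" = c("ENSG00000000003", "ENSG00000000004")
)

# Hypothetical IDs of the genes measured in an experiment.
measured.genes <- c("ENSG00000000002", "ENSG00000000003", "ENSG00000000005")

# Size of each set and number of its members found among the measured genes.
set.sizes <- lengths(toy.genesets)
covered <- sapply(toy.genesets, function(ids) sum(ids %in% measured.genes))

coverage <- data.frame(size = set.sizes, covered = covered)
print(coverage)
```

Sets with very few covered members carry little information for an enrichment analysis, so a table like this can help to judge which sets are worth considering.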
This function initializes the oposSOM environment and sets the preferences.
opossom.new(preferences)
preferences | list of optional preference values (see the vignette for the available settings) |
The package accepts the indata parameter in two formats. First, a simple two-dimensional numerical matrix, where the columns and rows represent the samples and genes, respectively. The expression values are usually obtained by calibration and summarization algorithms (e.g. MAS5, VSN or RMA) and transformed to logarithmic scale before they enter the pipeline. Second, the input data can be given as a Biobase::ExpressionSet object.
Please check the vignette for more details on the parameters.
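The matrix format described above can be prepared in base R along the following lines. This is only a sketch with simulated values: the gene and sample names are hypothetical, and the plain log2 transform stands in for whatever calibration and summarization method was actually used.

```r
set.seed(1)

n.genes <- 100
n.samples <- 10

# Hypothetical summarized intensities on linear scale (strictly positive).
raw <- matrix(rlnorm(n.genes * n.samples, meanlog = 6, sdlog = 1),
              nrow = n.genes, ncol = n.samples)

# Genes in rows, samples in columns; the names are made up for illustration.
rownames(raw) <- sprintf("gene%03d", seq_len(n.genes))
colnames(raw) <- sprintf("sample%02d", seq_len(n.samples))

# Transform to logarithmic scale before handing the data to the pipeline.
indata <- log2(raw)

# Basic sanity checks on the resulting matrix.
stopifnot(is.matrix(indata), is.numeric(indata), all(is.finite(indata)))
dim(indata)
```

A matrix built this way could then be assigned to `env$indata` as in the examples for this function.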
A new oposSOM environment, which is passed to opossom.run.
    env <- opossom.new(list(dataset.name="Example",
                            note="a test with 10 random samples",
                            dim.1stLvlSom="auto",
                            dim.2ndLvlSom=10,
                            training.extension=1,
                            rotate.SOM.portraits=0,
                            flip.SOM.portraits=FALSE,
                            database.dataset="auto",
                            activated.modules = list(
                              "reporting" = TRUE,
                              "primary.analysis" = TRUE,
                              "sample.similarity.analysis" = TRUE,
                              "geneset.analysis" = TRUE,
                              "psf.analysis" = TRUE,
                              "group.analysis" = TRUE,
                              "difference.analysis" = TRUE
                            ),
                            standard.spot.modules="dmap",
                            spot.coresize.modules=4,
                            spot.threshold.modules=0.9,
                            spot.coresize.groupmap=4,
                            spot.threshold.groupmap=0.7,
                            feature.centralization=TRUE,
                            sample.quantile.normalization=TRUE,
                            pairwise.comparison.list=list(
                              list("groupA"=c("sample1", "sample2"),
                                   "groupB"=c("sample3", "sample4")))))

    # definition of indata, group.labels and group.colors
    env$indata = matrix( runif(1000), 100, 10 )
    env$group.labels = c( rep("class 1", 5), rep("class 2", 4), "class 3" )
    env$group.colors = c( rep("red", 5), rep("blue", 4), "green" )

    # alternative definition of indata, group.labels and group.colors
    # using Biobase::ExpressionSet
    library(Biobase)
    env$indata = ExpressionSet(
      assayData=matrix(runif(1000), 100, 10),
      phenoData=AnnotatedDataFrame(data.frame(
        group.labels = c( rep("class 1", 5), rep("class 2", 4), "class 3" ),
        group.colors = c( rep("red", 5), rep("blue", 4), "green" )
      ))
    )
This function implements the complete pipeline: single gene expression values are clustered into metagenes using a self-organizing map. Based on these metagenes, visualizations (e.g. expression portraits), downstream sample similarity analyses (e.g. hierarchical clustering, ICA) and functional enrichment analyses are performed. The results are written to a separate folder and can be browsed using the summary HTML file.
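To give an intuition for how a self-organizing map condenses gene profiles into metagenes, here is a toy online SOM written in base R. It is not the algorithm as implemented in oposSOM: the 3 x 3 grid, the learning-rate and radius schedules, the number of iterations and all variable names are arbitrary assumptions chosen to keep the sketch short.

```r
set.seed(42)

# Toy input: 200 "genes" (rows) measured in 6 "samples" (columns).
expr <- matrix(rnorm(200 * 6), nrow = 200, ncol = 6)

# A 3 x 3 map; each unit holds one prototype profile (a "metagene").
grid.dim <- 3
units <- expand.grid(x = seq_len(grid.dim), y = seq_len(grid.dim))
n.units <- nrow(units)

# Initialise the prototypes with randomly chosen gene profiles.
codebook <- expr[sample(nrow(expr), n.units), , drop = FALSE]

n.iter <- 2000
for (step in seq_len(n.iter)) {
  frac <- step / n.iter
  alpha <- 0.5 * (1 - frac) + 0.01             # learning rate, decays over time
  radius <- (grid.dim / 2) * (1 - frac) + 0.5  # neighbourhood radius, shrinks

  g <- expr[sample(nrow(expr), 1), ]           # pick one gene profile at random
  g.mat <- matrix(g, n.units, length(g), byrow = TRUE)

  # Best-matching unit: the prototype closest to the gene profile.
  bmu <- which.min(rowSums((codebook - g.mat)^2))

  # Gaussian neighbourhood on the map grid around the winning unit.
  grid.dist2 <- (units$x - units$x[bmu])^2 + (units$y - units$y[bmu])^2
  h <- exp(-grid.dist2 / (2 * radius^2))

  # Pull every prototype towards the gene, weighted by its neighbourhood value.
  codebook <- codebook + alpha * h * (g.mat - codebook)
}

# Assign each gene to its best-matching unit and count genes per metagene.
gene.unit <- apply(expr, 1, function(g) {
  which.min(colSums((t(codebook) - g)^2))
})
genes.per.unit <- tabulate(gene.unit, nbins = n.units)
print(genes.per.unit)
```

In this sketch each row of `codebook` plays the role of a metagene, and `gene.unit` records which metagene represents each gene. Arranging the metagene values of one sample on the grid gives a (very coarse) analogue of an expression portrait.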
opossom.run(env)
env | the opossom environment created with opossom.new |
    # Example with artificial data
    env <- opossom.new(list(dataset.name="Example", dim.1stLvlSom=20))

    env$indata <- matrix(rnorm(1000), 100, 10)

    opossom.run(env)
A data set comprising 12 selected human tissues.
data(opossom.tissues)
The data set is stored in RData (binary) format.
The data set was downloaded from the Gene Expression Omnibus repository (http://www.ncbi.nlm.nih.gov/geo, GEO accession no. GSE7307). About 20,000 genes in more than 650 samples were measured using the Affymetrix HGU133-Plus2 microarray. A subset of 12 selected tissues from different categories is used as the example data set for the oposSOM package.
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE7307