Title: | A wrapper for Gemma's Restful API to access curated gene expression data and differential expression analyses |
---|---|
Description: | Low- and high-level wrappers for Gemma's RESTful API. They enable access to curated expression and differential expression data from over 10,000 published studies. Gemma is a web site, database and a set of tools for the meta-analysis, re-use and sharing of genomics data, currently primarily targeted at the analysis of gene expression profiles. |
Authors: | Javier Castillo-Arnemann [aut] |
Maintainer: | Ogan Mancarci <[email protected]> |
License: | Apache License (>= 2) |
Version: | 3.1.9 |
Built: | 2024-07-17 19:38:41 UTC |
Source: | https://github.com/bioc/gemma.R |
Some functions such as get_datasets and get_platforms_by_ids include a filter argument that allows creation of more complex queries. This function returns a list of supported properties to be used in those filters.
filter_properties()
A list of data.tables that contain supported properties and their data types
filter_properties()
Forget past results from memoised calls to the Gemma API (i.e. calls made using functions with memoised = TRUE).
forget_gemma_memoised()
TRUE to indicate cache was cleared.
forget_gemma_memoised()
A minimal function to create custom calls. Can be used to acquire unimplemented endpoints and/or raw output without any processing. Refer to the API documentation.
gemma_call(call, ..., json = TRUE)
call |
Gemma API endpoint. |
... |
parameters included in the call |
json |
If |
A list if json = TRUE and an httr response if FALSE.
# get singular value decomposition for the dataset
gemma_call('datasets/{dataset}/svd', dataset = 1)
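The {dataset} placeholder in the example above is filled from the named argument of the same name. How gemma_call performs the substitution internally is not documented here, so the following base R sketch only illustrates the idea. The helper fill_endpoint is hypothetical and not part of gemma.R.

```r
# Hypothetical helper (not part of gemma.R): substitute named values
# into "{name}" placeholders of an endpoint template.
fill_endpoint <- function(template, ...) {
  params <- list(...)
  for (name in names(params)) {
    # fixed = TRUE treats "{name}" literally rather than as a regex
    template <- gsub(paste0("{", name, "}"), as.character(params[[name]]),
                     template, fixed = TRUE)
  }
  template
}

fill_endpoint("datasets/{dataset}/svd", dataset = 1)
```

A template without placeholders is returned unchanged, since the loop has nothing to substitute.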
Creates a kable where certain columns are automatically shortened to better fit a document.
gemma_kable(table)
table |
A data.table or data.frame outputted by a gemma.R function |
Enable and disable memoisation of gemma.R functions
gemma_memoise( memoised = FALSE, cache = rappdirs::user_cache_dir(appname = "gemmaR") )
memoised |
boolean. If TRUE memoisation will be enabled |
cache |
File path or "cache_in_memory". A file path sets the location where the cache files for memoisation are saved. "cache_in_memory" stores the cache in the current R session |
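As a rough picture of what memoisation means here, the base R sketch below keeps results in an in-memory environment keyed by the call arguments, so a repeated call skips the expensive work. It is an illustration only and not gemma.R's implementation. The names memoise_simple, slow_lookup and cached_lookup are hypothetical.

```r
# Hypothetical illustration of memoisation (not gemma.R's implementation):
# results are stored in an environment keyed by the function arguments.
memoise_simple <- function(f) {
  cache <- new.env()
  function(...) {
    key <- paste(deparse(list(...)), collapse = "")
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(...), envir = cache)
    }
    get(key, envir = cache, inherits = FALSE)
  }
}

n_calls <- 0
slow_lookup <- function(id) {
  n_calls <<- n_calls + 1  # count how often the real work happens
  paste0("result-for-", id)
}

cached_lookup <- memoise_simple(slow_lookup)
cached_lookup("GSE2018")
cached_lookup("GSE2018")  # second call is served from the cache
n_calls
```

An in-memory cache like this one disappears with the R session, which is the trade-off against a cache saved to a file path.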
This package contains wrappers and convenience functions for Gemma's RESTful API that enable access to curated expression and differential expression data from over 15,000 published studies (as of mid-2022). Gemma (https://gemma.msl.ubc.ca) is a web site, database and a set of tools for the meta-analysis, re-use and sharing of transcriptomics data, currently primarily targeted at the analysis of gene expression profiles.
Most users will want to start with the high-level functions like get_dataset_object, get_differential_expression_values and get_platform_annotations.
Additional lower-level methods are available that directly map to the Gemma RESTful API methods.
For more information and detailed usage instructions check the README, the function reference and the vignette.
All software-related questions should be posted to the Bioconductor Support Site: https://support.bioconductor.org
Javier Castillo-Arnemann, Jordan Sicherman, Ogan Mancarci, Guillaume Poirier-Morency
Lim, N. et al., Curation of over 10 000 transcriptomic studies to enable data reuse, Database, 2021. https://doi.org/10.1093/database/baab006
Useful links:
Report bugs at https://github.com/PavlidisLab/gemma.R/issues
Given a Gemma.R output from a function with offset and limit arguments, returns the output from all pages. All arguments other than offset, limit
get_all_pages( query, step_size = 100, binder = rbind, directory = NULL, file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
query |
Output from a gemma.R function with offset and limit argument |
step_size |
Size of individual calls to the server. 100 is the maximum value |
binder |
Binding function for the calls. If |
directory |
Directory to save the output from the individual calls to. If provided, each page is saved to separate files. |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data.table or a list containing data from all pages.
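The paging behaviour described above can be sketched in base R with a mock endpoint. The functions fetch_page and collect_pages are hypothetical stand-ins, not gemma.R functions. The loop is an assumption about how pages might be gathered rather than a description of get_all_pages internals.

```r
# Mock of a paged endpoint (hypothetical; stands in for a server call).
# It returns rows offset+1 .. offset+limit of a 250-row table.
all_rows <- data.frame(id = 1:250)
fetch_page <- function(offset = 0L, limit = 20L) {
  idx <- seq_len(nrow(all_rows))
  all_rows[idx > offset & idx <= offset + limit, , drop = FALSE]
}

# Request pages of `step_size` rows until a short or empty page appears,
# then combine them with `binder`.
collect_pages <- function(fetch, step_size = 100L, binder = rbind) {
  pages <- list()
  offset <- 0L
  repeat {
    page <- fetch(offset = offset, limit = step_size)
    if (nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < step_size) break
    offset <- offset + step_size
  }
  do.call(binder, pages)
}

combined <- collect_pages(fetch_page)
nrow(combined)
```

Passing a different binder, for example c, would return the pages as a list-like structure instead of one table.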
When querying for ontology terms, Gemma propagates these terms to include any datasets annotated with their child terms in the results. This function returns these children for any number of terms. The output vector includes all children and the terms themselves.
get_child_terms(terms)
terms |
An array of terms |
An array containing descendants of the annotation terms, including the terms themselves
get_child_terms("http://purl.obolibrary.org/obo/MONDO_0000408")
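To make the idea of term propagation concrete, the sketch below walks a small made-up ontology and returns the given terms together with all of their descendants. The term names, the children list and the descendants function are hypothetical. The real function queries Gemma rather than a local list.

```r
# Toy ontology (hypothetical terms): each name maps to its direct children.
children <- list(
  disease = c("lung disease", "heart disease"),
  "lung disease" = c("asthma"),
  "heart disease" = character(0),
  asthma = character(0)
)

# Return the given terms plus all of their descendants.
descendants <- function(terms, children) {
  seen <- character(0)
  queue <- terms
  while (length(queue) > 0) {
    term <- queue[1]
    queue <- queue[-1]
    if (term %in% seen) next
    seen <- c(seen, term)
    queue <- c(queue, children[[term]])
  }
  seen
}

descendants("lung disease", children)
```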
Retrieve the annotations of a dataset
get_dataset_annotations( dataset, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
dataset |
A numerical dataset identifier or a dataset short name |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the annotations of the queried dataset. A list if raw = TRUE. A 404 error if the given identifier does not map to any object.
The fields of the output data.table are:
class.name
: Name of the annotation class (e.g. organism part)
class.URI
: URI for the annotation class
term.name
: Name of the annotation term (e.g. lung)
term.URI
: URI for the annotation term
object.class
: Class of object that the term originated from.
get_dataset_annotations("GSE2018")
Retrieve the design of a dataset
get_dataset_design( dataset, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
dataset |
A numerical dataset identifier or a dataset short name |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table of the design matrix for the queried dataset. A 404 error if the given identifier does not map to any object.
head(get_dataset_design("GSE2018"))
Retrieve annotations and surface level stats for a dataset's differential analyses
get_dataset_differential_expression_analyses( dataset, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
dataset |
A numerical dataset identifier or a dataset short name |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the differential expression analysis of the queried dataset. Note that this function does not return differential expression values themselves. Use get_differential_expression_values to get differential expression values (see examples).
The fields of the output data.table are:
result.ID
: Result set ID of the differential expression analysis.
May represent multiple factors in a single model.
contrast.ID
: ID of the specific contrast factor. Together with the result.ID, it uniquely identifies a given contrast.
experiment.ID
: ID of the source experiment
factor.category
: Category for the contrast
factor.category.URI
: URI for the contrast category
factor.ID
: ID of the factor
baseline.factors
: Characteristics of the baseline. This field is a data.table
experimental.factors
: Characteristics of the experimental group. This field is a data.table
isSubset
: TRUE if the result set belongs to a subset, FALSE if not. Subsets are created when performing differential expression to avoid unhelpful comparisons.
subsetFactor
: Characteristics of the subset. This field is a data.table
probes.analyzed
: Number of probesets represented in the contrast
genes.analyzed
: Number of genes represented in the contrast
result <- get_dataset_differential_expression_analyses("GSE2872")
get_differential_expression_values(resultSet = result$result.ID[1])
Retrieve the expression data matrix of a set of datasets and genes
get_dataset_expression_for_genes( datasets, genes, keepNonSpecific = FALSE, consolidate = NA_character_, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
datasets |
A vector of dataset IDs or short names |
genes |
A vector of NCBI IDs, Ensembl IDs or gene symbols. |
keepNonSpecific |
logical. |
consolidate |
An option for gene expression level consolidation. If empty, will return every probe for the genes. "pickmax" to pick the probe with the highest expression, "pickvar" to pick the probe with the highest variance and "average" for returning the average expression |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A list of data frames
get_dataset_expression_for_genes("GSE2018", genes = c(10225, 2841))
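The three consolidate options described for this function can be pictured on a toy matrix with several probes for one gene. The sketch below uses made-up numbers and a hypothetical consolidate_probes function. It assumes that "highest expression" means the highest mean across samples, which is not stated above, and it is not the server-side implementation.

```r
# Toy expression values: three probes for one gene across four samples
# (made-up numbers).
expr <- rbind(
  probe_a = c(5, 6, 5, 6),
  probe_b = c(8, 8, 8, 8),
  probe_c = c(2, 9, 1, 8)
)

consolidate_probes <- function(expr, method = c("pickmax", "pickvar", "average")) {
  method <- match.arg(method)
  switch(method,
    # assumption: "highest expression" means highest mean across samples
    pickmax = expr[which.max(rowMeans(expr)), ],
    # probe whose values vary most across samples
    pickvar = expr[which.max(apply(expr, 1, var)), ],
    # per-sample mean over all probes
    average = colMeans(expr)
  )
}

consolidate_probes(expr, "pickmax")
consolidate_probes(expr, "pickvar")
consolidate_probes(expr, "average")
```

Here "pickmax" selects probe_b and "pickvar" selects probe_c, which shows that the options can return different probes for the same gene.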
Return an annotated Bioconductor-compatible data structure or a long form tibble of the queried dataset, including expression data and the experimental design.
get_dataset_object( datasets, genes = NULL, keepNonSpecific = FALSE, consolidate = NA_character_, resultSets = NULL, contrasts = NULL, metaType = "text", type = "se", memoised = getOption("gemma.memoised", FALSE) )
datasets |
A vector of dataset IDs or short names |
genes |
A vector of NCBI IDs, Ensembl IDs or gene symbols. |
keepNonSpecific |
logical. |
consolidate |
An option for gene expression level consolidation. If empty, will return every probe for the genes. "pickmax" to pick the probe with the highest expression, "pickvar" to pick the probe with the highest variance and "average" for returning the average expression |
resultSets |
Result set IDs of a differential expression analysis. Optional. If provided, the output will only include
the samples from the subset used in the result set ID.
Must be the same length as |
contrasts |
Contrast IDs of a differential expression contrast. Optional. Requires resultSets to be defined. If provided, the output will only include samples relevant to the specific contrast. |
metaType |
How the metadata information should be included. Can be "text", "uri" or "both". "text" and "uri" options |
type |
"se" for a SummarizedExperiment or "eset" for an ExpressionSet. We recommend using SummarizedExperiments, which are more recent. See the SummarizedExperiment vignette or the ExpressionSet vignette for more details. "tidy" for a long form data frame compatible with tidyverse functions. "list" to return a list of individual data frames containing expression values, design and the experiment. |
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
A list of SummarizedExperiments, ExpressionSets or a tibble containing metadata and expression data for the queried datasets and genes. Metadata will be expanded to include a variable number of factors that annotate samples from a dataset but will always include a single "factorValues" column that houses data.tables that include all annotations for a given sample.
get_dataset_object("GSE2018")
Retrieve the platforms of a dataset
get_dataset_platforms( dataset, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
dataset |
A numerical dataset identifier or a dataset short name |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the platform(s). A list if raw = TRUE. A 404 error if the given identifier does not map to any object.
The fields of the output data.table are:
platform.ID
: Internal identifier of the platform
platform.shortName
: Shortname of the platform.
platform.name
: Full name of the platform.
platform.description
: Free text description of the platform
platform.troubled
: Whether or not the platform was marked "troubled" by a Gemma process or a curator
platform.experimentCount
: Number of experiments using the platform within Gemma
platform.type
: Technology type for the platform.
taxon.name
: Name of the species platform was made for
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal identifier given to the species by Gemma
taxon.NCBI
: NCBI ID of the taxon
taxon.database.name
: Underlying database used in Gemma for the taxon
taxon.database.ID
: ID of the underlying database used in Gemma for the taxon
get_dataset_platforms("GSE2018")
Retrieve processed expression data of a dataset
get_dataset_processed_expression( dataset, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
dataset |
A numerical dataset identifier or a dataset short name |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
If raw is FALSE (default), a data table of the expression matrix for the queried dataset. If raw is TRUE, returns the binary file in raw form.
get_dataset_processed_expression("GSE2018")
Retrieve quantitation types of a dataset
get_dataset_quantitation_types( dataset, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
dataset |
A numerical dataset identifier or a dataset short name |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data.table containing the quantitation types
The fields of the output data.table are:
id
: ID of the quantitation type. Any raw quantitation type can be accessed by the get_dataset_raw_expression function using this ID.
name
: Name of the quantitation type
description
: Description of the quantitation type
type
: Type of the quantitation type. Either raw or processed.
Each dataset will have one processed quantitation type which is the data
returned using get_dataset_processed_expression
ratio
: Whether or not the quantitation type is a ratio of multiple quantitation types. Typically TRUE for processed TWOCOLOR quantitation type.
preferred
: The preferred raw quantitation type. This version
is used in generation of the processed data within gemma.
recomputed
: If TRUE this quantitation type is generated by
recomputing raw data files Gemma had access to.
get_dataset_quantitation_types("GSE59918")
Retrieve raw expression data of a dataset
get_dataset_raw_expression( dataset, quantitationType, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
dataset |
A numerical dataset identifier or a dataset short name |
quantitationType |
Quantitation type id. These can be acquired
using |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
If raw is FALSE (default), a data table of the expression matrix for the queried dataset. If raw is TRUE, returns the binary file in raw form.
q_types <- get_dataset_quantitation_types("GSE59918")
get_dataset_raw_expression("GSE59918", q_types$id[q_types$name == "Counts"])
Retrieve the samples of a dataset
get_dataset_samples( dataset, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
dataset |
A numerical dataset identifier or a dataset short name |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the samples of the queried dataset. A list if raw = TRUE. A 404 error if the given identifier does not map to any object.
The fields of the output data.table are:
sample.name
: Internal name given to the sample.
sample.ID
: Internal ID of the sample
sample.description
: Free text description of the sample
sample.outlier
: Whether or not the sample is marked as an outlier
sample.accession
: Accession ID of the sample in its original database
sample.database
: Database of origin for the sample
sample.characteristics
: Characteristics of the sample. This field is a data table
sample.factorValues
: Experimental factor values of the sample. This field is a data table
head(get_dataset_samples("GSE2018"))
Retrieve all datasets
get_datasets( query = NA_character_, filter = NA_character_, taxa = NA_character_, uris = NA_character_, offset = 0L, limit = 20L, sort = "+id", raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
query |
The search query. Queries can include plain text or ontology terms. They also support conjunctions ("alpha AND beta"), disjunctions ("alpha OR beta"), grouping ("(alpha OR beta) AND gamma"), prefixing ("alpha*"), wildcard characters ("BRCA?") and fuzzy matches ("alpha~"). |
filter |
Filter results by matching expression. Use |
taxa |
A vector of taxon common names (e.g. human, mouse, rat). Providing multiple
species will return results for all species. These are appended
to the filter and equivalent to filtering for |
uris |
A vector of ontology term URIs. Providing multiple terms will
return results containing any of the terms and their children. These are
appended to the filter and equivalent to filtering for |
offset |
The offset of the first retrieved result. |
limit |
Defaults to 20. Limits the result to specified amount
of objects. Has a maximum value of 100. Use together with |
sort |
Order results by the given property and direction. The '+' sign indicates ascending order whereas the '-' sign indicates descending order. |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the queried dataset(s). A list if raw = TRUE. Returns an empty list if no datasets matched.
The fields of the output data.table are:
experiment.shortName
: Shortname given to the dataset within Gemma. Often corresponds to accession ID
experiment.name
: Full title of the dataset
experiment.ID
: Internal ID of the dataset.
experiment.description
: Description of the dataset
experiment.troubled
: Whether an automatic process within Gemma or a curator marked the dataset as "troubled"
experiment.accession
: Accession ID of the dataset in the external database it was taken from
experiment.database
: The name of the database where the dataset was taken from
experiment.URI
: URI of the original database
experiment.sampleCount
: Number of samples in the dataset
experiment.batchEffectText
: A text field describing whether the dataset has batch effects
experiment.batchCorrected
: Whether batch correction has been performed on the dataset.
experiment.batchConfound
: 0 if batch info isn't available, -1 if a batch confound is detected, 1 if batch information is available and no batch confound is found
experiment.batchEffect
: -1 if batch p value < 0.0001, 1 if batch p value > 0.1, 0 otherwise, including when no batch information is available or when the data is confounded with batches.
experiment.rawData
: -1 if no raw data available, 1 if raw data was available. When available, Gemma reprocesses raw data to get expression values and batches
geeq.qScore
: Data quality score given to the dataset by Gemma.
geeq.sScore
: Suitability score given to the dataset by Gemma. Refers to factors like batches, platforms and other aspects of experimental design
taxon.name
: Name of the species
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal identifier given to the species by Gemma
taxon.NCBI
: NCBI ID of the taxon
taxon.database.name
: Underlying database used in Gemma for the taxon
taxon.database.ID
: ID of the underlying database used in Gemma for the taxon
get_datasets()
get_datasets(taxa = c("mouse", "human"), uris = "http://purl.obolibrary.org/obo/UBERON_0002048")
# filter below is equivalent to the call above
get_datasets(filter = "taxon.commonName in (mouse,human) and allCharacteristics.valueUri = http://purl.obolibrary.org/obo/UBERON_0002048")
get_datasets(query = "lung")
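The equivalence between the taxa and uris arguments and the filter string in the examples can be reproduced with plain string handling. The helper build_filter below is hypothetical and only covers the single-URI case shown in the examples. How gemma.R composes filters for several URIs is not shown here.

```r
# Hypothetical helper (not part of gemma.R): build a filter string
# from taxon common names and a single ontology term URI.
build_filter <- function(taxa, uri) {
  paste0(
    "taxon.commonName in (", paste(taxa, collapse = ","), ")",
    " and allCharacteristics.valueUri = ", uri
  )
}

build_filter(
  taxa = c("mouse", "human"),
  uri = "http://purl.obolibrary.org/obo/UBERON_0002048"
)
```

The resulting string matches the filter used in the third example call.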
Retrieve datasets by their identifiers
get_datasets_by_ids( datasets = NA_character_, filter = NA_character_, taxa = NA_character_, uris = NA_character_, offset = 0L, limit = 20L, sort = "+id", raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
datasets |
Numerical dataset identifiers or dataset short names. If not specified, all datasets will be returned instead |
filter |
Filter results by matching expression. Use |
taxa |
A vector of taxon common names (e.g. human, mouse, rat). Providing multiple
species will return results for all species. These are appended
to the filter and equivalent to filtering for |
uris |
A vector of ontology term URIs. Providing multiple terms will
return results containing any of the terms and their children. These are
appended to the filter and equivalent to filtering for |
offset |
The offset of the first retrieved result. |
limit |
Defaults to 20. Limits the result to specified amount
of objects. Has a maximum value of 100. Use together with |
sort |
Order results by the given property and direction. The '+' sign indicates ascending order whereas the '-' sign indicates descending order. |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the queried dataset(s). A list if raw = TRUE. Returns an empty list if no datasets matched.
The fields of the output data.table are:
experiment.shortName
: Shortname given to the dataset within Gemma. Often corresponds to accession ID
experiment.name
: Full title of the dataset
experiment.ID
: Internal ID of the dataset.
experiment.description
: Description of the dataset
experiment.troubled
: Whether an automatic process within Gemma or a curator marked the dataset as "troubled"
experiment.accession
: Accession ID of the dataset in the external database it was taken from
experiment.database
: The name of the database where the dataset was taken from
experiment.URI
: URI of the original database
experiment.sampleCount
: Number of samples in the dataset
experiment.batchEffectText
: A text field describing whether the dataset has batch effects
experiment.batchCorrected
: Whether batch correction has been performed on the dataset.
experiment.batchConfound
: 0 if batch info isn't available, -1 if a batch confound is detected, 1 if batch information is available and no batch confound is found
experiment.batchEffect
: -1 if batch p value < 0.0001, 1 if batch p value > 0.1, 0 otherwise, including when no batch information is available or when the data is confounded with batches.
experiment.rawData
: -1 if no raw data available, 1 if raw data was available. When available, Gemma reprocesses raw data to get expression values and batches
geeq.qScore
: Data quality score given to the dataset by Gemma.
geeq.sScore
: Suitability score given to the dataset by Gemma. Refers to factors like batches, platforms and other aspects of experimental design
taxon.name
: Name of the species
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal identifier given to the species by Gemma
taxon.NCBI
: NCBI ID of the taxon
taxon.database.name
: Underlying database used in Gemma for the taxon
taxon.database.ID
: ID of the underlying database used in Gemma for the taxon
get_datasets_by_ids("GSE2018")
get_datasets_by_ids(c("GSE2018", "GSE2872"))
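The numeric codes in the experiment.batchConfound field are easy to misread, so a small lookup can help when inspecting results. The helper describe_batch_confound below is hypothetical. Its labels paraphrase the field description in the list above.

```r
# Hypothetical helper: translate experiment.batchConfound codes
# into the descriptions given in the field list above.
describe_batch_confound <- function(code) {
  labels <- c(
    "-1" = "batch confound detected",
    "0"  = "batch information not available",
    "1"  = "batch information available, no confound found"
  )
  unname(labels[as.character(code)])
}

describe_batch_confound(c(-1, 0, 1))
```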
Retrieves the differential expression result set(s) associated with the dataset. To get more information about the contrasts in individual resultSets and the annotation terms associated with them, use get_dataset_differential_expression_analyses().
get_differential_expression_values( dataset = NA_character_, resultSets = NA_integer_, keepNonSpecific = FALSE, readableContrasts = FALSE, memoised = getOption("gemma.memoised", FALSE) )
dataset |
A dataset identifier. |
resultSets |
resultSet identifiers. If a dataset is not provided, all result sets will be downloaded. If it is provided it will only be used to ensure all result sets belong to the dataset. |
keepNonSpecific |
logical. FALSE by default. If TRUE, results from probesets that are not specific to the gene will also be returned. |
readableContrasts |
If |
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
In Gemma each result set corresponds to the estimated effects associated with a single factor in the design, and each can have multiple contrasts (for each level compared to baseline). Thus a dataset with a 2x3 factorial design will have two result sets, one of which will have one contrast, and one having two contrasts.
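The counting rule in the paragraph above can be checked with a few lines of base R. The factor names and levels below are made up for a 2x3 design. The rule applied is the one stated above: one result set per factor and one contrast per non-baseline level.

```r
# Levels per factor in a hypothetical 2x3 design.
design <- list(
  treatment = c("control", "drug"),
  timepoint = c("0h", "6h", "24h")
)

# One result set per factor; one contrast per non-baseline level.
contrasts_per_result_set <- vapply(
  design,
  function(lv) length(lv) - 1L,
  integer(1)
)

length(contrasts_per_result_set)  # number of result sets
contrasts_per_result_set          # contrasts in each result set
```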
The methodology for differential expression is explained in Curation of over 10000 transcriptomic studies to enable data reuse. Briefly, differential expression analysis is performed on the dataset based on the annotated experimental design with up to three potentially nested factors. Gemma attempts to automatically assign baseline conditions for each factor. In the absence of a clear control condition, a baseline is arbitrarily selected. A generalized linear model with empirical Bayes shrinkage of t-statistics is fit to the data for each platform element (probe/gene) using an implementation of the limma algorithm. For RNA-seq data, we use weighted regression, applying the voom algorithm to compute weights from the mean–variance relationship of the data. Contrasts of each condition are then computed compared to the selected baseline. In some situations, Gemma will split the data into subsets for analysis. A typical such situation is when a ‘batch’ factor is present and confounded with another factor, the subsets being determined by the levels of the confounding factor.
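The relationship between factors, result sets and contrasts can be made concrete with a small base R sketch. The factor names and levels below are invented for illustration; the code only counts the non-baseline levels of each factor, which is the number of contrasts in that factor's result set as described above.

```r
# Hypothetical 2x3 design: factor names and levels are invented for illustration.
design_factors <- list(
  treatment = c("control", "drug"),   # 2 levels, first taken as baseline
  timepoint = c("0h", "6h", "24h")    # 3 levels, first taken as baseline
)

# One result set per factor; one contrast per non-baseline level.
contrasts_per_result_set <- vapply(
  design_factors,
  function(levels) length(levels) - 1L,
  integer(1)
)

n_result_sets <- length(design_factors)
print(n_result_sets)
print(contrasts_per_result_set)
```

Here the two factors give two result sets, with one and two contrasts respectively.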
A list of data tables with differential expression values per result set.
get_differential_expression_values("GSE2018")
Retrieve the differential expression results for a given gene among datasets matching the provided query and filter
get_gene_differential_expression_values( gene, query = NA_character_, taxa = NA_character_, uris = NA_character_, filter = NA_character_, threshold = 1, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
gene |
An Ensembl gene identifier (which typically starts with ENSG), an NCBI gene identifier, or an official gene symbol approved by HGNC |
query |
The search query. Queries can include plain text or ontology terms. They also support conjunctions ("alpha AND beta"), disjunctions ("alpha OR beta"), grouping ("(alpha OR beta) AND gamma"), prefixing ("alpha*"), wildcard characters ("BRCA?") and fuzzy matches ("alpha~"). |
taxa |
A vector of taxon common names (e.g. human, mouse, rat). Providing multiple
species will return results for all species. These are appended
to the filter and equivalent to filtering for |
uris |
A vector of ontology term URIs. Providing multiple terms will
return results containing any of the terms and their children. These are
appended to the filter and equivalent to filtering for |
filter |
Filter results by matching expression. Use |
threshold |
number |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data.table containing differential expression results. This table
is stripped of some relevant information for speed of execution. Details about
the contrasts can be accessed via the get_result_sets
function.
The fields of the output data.table are:
result.ID
: Result set ID of the differential expression analysis.
May represent multiple factors in a single model.
contrast.ID
: Id of the specific contrast factor. Together with the result.ID
they uniquely represent a given contrast.
experiment.ID
: Id of the source experiment
factor.coefficient
: Model coefficient calculated for the specific contrast factor
factor.logfc
: Log 2 fold change calculated for the specific contrast factor
factor.pvalue
: p values calculated for the specific contrast factor
# get all differential expression results for ENO2
# from datasets marked with the ontology term for brain
head(get_gene_differential_expression_values(2026, uris = "http://purl.obolibrary.org/obo/UBERON_0000955"))
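As a rough sketch of how the documented output columns might be used downstream, the following base R snippet filters a table by p value and fold change. The rows are made up and the cut-offs are arbitrary choices for illustration; a real table would come from get_gene_differential_expression_values().

```r
# Made-up rows using the documented column names; real values would come from
# get_gene_differential_expression_values().
de <- data.frame(
  result.ID = c(101L, 101L, 202L, 303L),
  contrast.ID = c(1L, 2L, 5L, 9L),
  experiment.ID = c(11L, 11L, 22L, 33L),
  factor.coefficient = c(1.4, -0.2, -2.1, 0.6),
  factor.logfc = c(1.4, -0.2, -2.1, 0.6),
  factor.pvalue = c(0.001, 0.60, 0.0004, 0.04)
)

# Keep contrasts with p < 0.05 and an absolute log2 fold change of at least 1.
# Both thresholds are arbitrary and only serve the example.
hits <- de[de$factor.pvalue < 0.05 & abs(de$factor.logfc) >= 1, ]

# Order the surviving contrasts by p value, smallest first.
hits <- hits[order(hits$factor.pvalue), ]
print(hits)
```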
Retrieve the GO terms associated with a gene
get_gene_go_terms( gene, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
gene |
An Ensembl gene identifier (which typically starts with ENSG), an NCBI gene identifier, or an official gene symbol approved by HGNC |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the GO terms assigned to the
queried gene. A list if raw = TRUE
. A 404 error
if the given identifier does not map to any
object.
The fields of the output data.table are:
term.name
: Name of the term
term.ID
: ID of the term
term.URI
: URI of the term
get_gene_go_terms(3091)
Retrieve the physical locations of a given gene
get_gene_locations( gene, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
gene |
An Ensembl gene identifier (which typically starts with ENSG), an NCBI gene identifier, or an official gene symbol approved by HGNC |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the physical location of the
queried gene. A list if raw = TRUE
. A 404 error
if the given identifier does not map to any object.
The fields of the output data.table are:
chromosome
: Name of the chromosome the gene is located on
strand
: Which strand the gene is located on
nucleotide
: Nucleotide number for the gene
length
: Gene length
taxon.name
: Name of the taxon
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal ID for the taxon given by Gemma
taxon.NCBI
: NCBI ID for the taxon
taxon.database.name
: Name of the database used in Gemma for the taxon
get_gene_locations("DYRK1A")
get_gene_locations(1859)
Retrieve the probes associated with a gene across all platforms
get_gene_probes( gene, offset = 0L, limit = 20L, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
gene |
An Ensembl gene identifier (which typically starts with ENSG), an NCBI gene identifier, or an official gene symbol approved by HGNC |
offset |
The offset of the first retrieved result. |
limit |
Defaults to 20. Limits the result to specified amount
of objects. Has a maximum value of 100. Use together with |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the probes representing a gene across
all platforms. A list if raw = TRUE
.
A 404 error
if the given identifier does not map to any genes.
The fields of the output data.table are:
element.name
: Name of the element. Typically the probeset name
element.description
: A free text field providing optional information about the element
platform.shortName
: Shortname of the platform given by Gemma. Typically the GPL identifier.
platform.name
: Full name of the platform
platform.ID
: Id number of the platform given by Gemma
platform.type
: Type of the platform.
platform.description
: Free text field describing the platform.
platform.troubled
: Whether the platform is marked as troubled by a Gemma curator.
taxon.name
: Name of the species platform was made for
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal identifier given to the species by Gemma
taxon.NCBI
: NCBI ID of the taxon
taxon.database.name
: Underlying database used in Gemma for the taxon
taxon.database.ID
: ID of the underlying database used in Gemma for the taxon
get_gene_probes(1859)
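Because limit is capped at 100, retrieving more rows means stepping offset across several requests. The sketch below only computes the offsets; page_offsets is a hypothetical helper, not part of gemma.R, and the total of 250 is an assumed number for illustration. The package's own get_all_pages, used in the update_result example, appears to cover the same need.

```r
# Hypothetical helper (not part of gemma.R): offsets needed to cover
# `total` results when each request returns at most `limit` rows.
page_offsets <- function(total, limit = 100L) {
  stopifnot(limit >= 1L, limit <= 100L)  # documented maximum for limit is 100
  if (total <= 0L) return(integer(0))
  seq.int(from = 0L, to = total - 1L, by = limit)
}

# Assumed total of 250 results, purely for illustration.
offsets <- page_offsets(total = 250L, limit = 100L)
print(offsets)

# With gemma.R attached and network access, each page could then be fetched as:
# pages <- lapply(offsets, function(o) get_gene_probes(1859, offset = o, limit = 100L))
```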
Retrieve genes matching gene identifiers
get_genes( genes, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
genes |
A vector of NCBI IDs, Ensembl IDs or gene symbols. |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the queried gene(s).
A list if raw = TRUE
.
The fields of the output data.table are:
gene.symbol
: Symbol for the gene
gene.ensembl
: Ensembl ID for the gene
gene.NCBI
: NCBI id for the gene
gene.name
: Name of the gene
gene.aliases
: Gene aliases. Each row includes a vector
gene.MFX.rank
: Multifunctionality rank for the gene
taxon.name
: Name of the species
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal identifier given to the species by Gemma
taxon.NCBI
: NCBI ID of the taxon
taxon.database.name
: Underlying database used in Gemma for the taxon
taxon.database.ID
: ID of the underlying database used in Gemma for the taxon
get_genes("DYRK1A")
get_genes(c("DYRK1A", "PTEN"))
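Since gene.aliases holds a vector in each row, it behaves as a list column and needs slightly different handling from ordinary columns. The following base R sketch uses made-up symbols and aliases (not real Gemma output) to show one way of searching and flattening such a column.

```r
# Made-up rows mimicking the documented columns; symbols and aliases are
# illustrative only.
genes <- data.frame(
  gene.symbol = c("GENE1", "GENE2"),
  gene.NCBI = c(1001L, 1002L)
)
# A list column: each row holds a character vector of aliases.
genes$gene.aliases <- list(c("ALIAS_A", "ALIAS_B"), c("ALIAS_C"))

# Find rows where a given alias appears in the per-row vector.
has_alias <- vapply(genes$gene.aliases, function(a) "ALIAS_C" %in% a, logical(1))
print(genes$gene.symbol[has_alias])

# Collapse the vectors into one string per row, e.g. before writing a CSV.
genes$aliases_flat <- vapply(genes$gene.aliases, paste, character(1), collapse = "; ")
print(genes$aliases_flat)
```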
Gets Gemma's platform annotations including mappings of microarray probes to genes.
get_platform_annotations( platform, annotType = c("noParents", "allParents", "bioProcess"), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE), memoised = getOption("gemma.memoise", FALSE), unzip = FALSE )
platform |
A platform numerical identifier or platform short name. |
annotType |
Which GO terms should the output include |
file |
Where to save the annotation file to, or empty to just load into memory |
overwrite |
Whether or not to overwrite an existing file |
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
unzip |
Whether or not to unzip the file (if file is not empty) |
A table of annotations
ProbeName
: Probeset names provided by the platform.
Gene symbols for generic annotations
GeneSymbols
: Genes that were found to be aligned to
the probe sequence. Note that it is possible for probes to be
non-specific. Alignment to multiple genes is indicated by gene
symbols separated by "|".
GeneNames
: Name of the gene
GOTerms
: GO Terms associated with the genes. annotType
argument can be used to choose which terms should be included.
GemmaIDs
and NCBIids
: respective IDs for the genes.
head(get_platform_annotations("GPL96"))
head(get_platform_annotations('Generic_human_ncbiIds'))
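Non-specific probes can be spotted by splitting the GeneSymbols column on the "|" separator. The base R sketch below uses made-up rows; representing an unmapped probe as an empty string is an assumption made for the example.

```r
# Made-up annotation rows using the documented column names.
# The empty string for an unmapped probe is an assumption for this example.
annots <- data.frame(
  ProbeName = c("probe_1", "probe_2", "probe_3"),
  GeneSymbols = c("GENE1", "GENE2|GENE3", ""),
  stringsAsFactors = FALSE
)

# Split on a literal "|" (fixed = TRUE avoids treating it as regex alternation).
symbols <- strsplit(annots$GeneSymbols, "|", fixed = TRUE)
n_genes <- lengths(symbols)

# Probes aligned to more than one gene are non-specific.
annots$nonSpecific <- n_genes > 1L
print(annots)
```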
Retrieve all experiments using a given platform
get_platform_datasets( platform, offset = 0L, limit = 20L, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
platform |
A platform numerical identifier or a platform short name |
offset |
The offset of the first retrieved result. |
limit |
Defaults to 20. Limits the result to specified amount
of objects. Has a maximum value of 100. Use together with |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the queried dataset(s). A list if
raw = TRUE
. Returns an empty list if no datasets matched.
The fields of the output data.table are:
experiment.shortName
: Shortname given to the dataset within Gemma. Often corresponds to accession ID
experiment.name
: Full title of the dataset
experiment.ID
: Internal ID of the dataset.
experiment.description
: Description of the dataset
experiment.troubled
: Whether an automatic process within Gemma or a curator marked the dataset as "troubled"
experiment.accession
: Accession ID of the dataset in the external database it was taken from
experiment.database
: The name of the database where the dataset was taken from
experiment.URI
: URI of the original database
experiment.sampleCount
: Number of samples in the dataset
experiment.batchEffectText
: A text field describing whether the dataset has batch effects
experiment.batchCorrected
: Whether batch correction has been performed on the dataset.
experiment.batchConfound
: 0 if batch info isn't available, -1 if a batch confound is detected, 1 if batch information is available and no batch confound is found
experiment.batchEffect
: -1 if batch p value < 0.0001, 1 if batch p value > 0.1, 0 otherwise, including when no batch information is available or when the data is confounded with batches.
experiment.rawData
: -1 if no raw data available, 1 if raw data was available. When available, Gemma reprocesses raw data to get expression values and batches
geeq.qScore
: Data quality score given to the dataset by Gemma.
geeq.sScore
: Suitability score given to the dataset by Gemma. Refers to factors like batches, platforms and other aspects of experimental design
taxon.name
: Name of the species
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal identifier given to the species by Gemma
taxon.NCBI
: NCBI ID of the taxon
taxon.database.name
: Underlying database used in Gemma for the taxon
taxon.database.ID
: ID of the underlying database used in Gemma for the taxon
head(get_platform_datasets("GPL1355"))
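The numeric codes in experiment.batchConfound and experiment.batchEffect are easier to read once mapped to labels. The sketch below does this in base R on made-up rows; the label wording is my own paraphrase of the field descriptions above, not text returned by Gemma.

```r
# Made-up rows with the documented columns.
datasets <- data.frame(
  experiment.shortName = c("GSE_A", "GSE_B", "GSE_C"),
  experiment.batchConfound = c(1L, -1L, 0L),
  experiment.batchEffect = c(1L, 0L, -1L)
)

# Lookup tables paraphrasing the field descriptions.
confound_labels <- c("-1" = "batch confound detected",
                     "0" = "no batch info",
                     "1" = "no batch confound")
effect_labels <- c("-1" = "batch p < 0.0001",
                   "0" = "intermediate, unknown or confounded",
                   "1" = "batch p > 0.1")

# Index the lookup tables by the code converted to a character name.
datasets$confound <- unname(confound_labels[as.character(datasets$experiment.batchConfound)])
datasets$effect <- unname(effect_labels[as.character(datasets$experiment.batchEffect)])
print(datasets)
```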
Retrieve the genes associated with a probe in a given platform
get_platform_element_genes( platform, probe, offset = 0L, limit = 20L, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
platform |
A platform numerical identifier or a platform short name |
probe |
A probe name or its numerical identifier |
offset |
The offset of the first retrieved result. |
limit |
Defaults to 20. Limits the result to specified amount
of objects. Has a maximum value of 100. Use together with |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the queried gene(s).
A list if raw = TRUE
.
The fields of the output data.table are:
gene.symbol
: Symbol for the gene
gene.ensembl
: Ensembl ID for the gene
gene.NCBI
: NCBI id for the gene
gene.name
: Name of the gene
gene.aliases
: Gene aliases. Each row includes a vector
gene.MFX.rank
: Multifunctionality rank for the gene
taxon.name
: Name of the species
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal identifier given to the species by Gemma
taxon.NCBI
: NCBI ID of the taxon
taxon.database.name
: Underlying database used in Gemma for the taxon
taxon.database.ID
: ID of the underlying database used in Gemma for the taxon
get_platform_element_genes("GPL1355", "AFFX_Rat_beta-actin_M_at")
Retrieve all platforms matching a set of platform identifiers
get_platforms_by_ids( platforms = NA_character_, filter = NA_character_, taxa = NA_character_, offset = 0L, limit = 20L, sort = "+id", raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
platforms |
Platform numerical identifiers or platform short names. If not specified, all platforms will be returned instead |
filter |
Filter results by matching expression. Use |
taxa |
A vector of taxon common names (e.g. human, mouse, rat). Providing multiple
species will return results for all species. These are appended
to the filter and equivalent to filtering for |
offset |
The offset of the first retrieved result. |
limit |
Defaults to 20. Limits the result to specified amount
of objects. Has a maximum value of 100. Use together with |
sort |
Order results by the given property and direction. The '+' sign indicates ascending order whereas the '-' indicates descending. |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with information about the platform(s). A list if raw = TRUE
. A 404 error
if the given identifier
does not map to any object
The fields of the output data.table are:
platform.ID
: Internal identifier of the platform
platform.shortName
: Shortname of the platform.
platform.name
: Full name of the platform.
platform.description
: Free text description of the platform
platform.troubled
: Whether or not the platform was marked "troubled" by a Gemma process or a curator
platform.experimentCount
: Number of experiments using the platform within Gemma
platform.type
: Technology type for the platform.
taxon.name
: Name of the species platform was made for
taxon.scientific
: Scientific name for the taxon
taxon.ID
: Internal identifier given to the species by Gemma
taxon.NCBI
: NCBI ID of the taxon
taxon.database.name
: Underlying database used in Gemma for the taxon
taxon.database.ID
: ID of the underlying database used in Gemma for the taxon
get_platforms_by_ids("GPL1355")
get_platforms_by_ids(c("GPL1355", "GPL96"))
Returns queried result set
get_result_sets( datasets = NA_character_, resultSets = NA_character_, filter = NA_character_, offset = 0, limit = 20, sort = "+id", raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
datasets |
A vector of dataset IDs or short names |
resultSets |
A resultSet identifier. Note that result set identifiers
are not static and can change when Gemma re-runs analyses internally. When
using these as inputs, try to make sure you access a currently existing
result set ID by basing them on result sets returned for a particular dataset or
filter used in |
filter |
Filter results by matching expression. Use |
offset |
The offset of the first retrieved result. |
limit |
Defaults to 20. Limits the result to specified amount
of objects. Has a maximum value of 100. Use together with |
sort |
Order results by the given property and direction. The '+' sign indicates ascending order whereas the '-' indicates descending. |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
The output and usage of this function are mostly identical to those of get_dataset_differential_expression_analyses
.
The principal differences are the ability to restrict your result sets, to
query across multiple datasets, and to use the filter argument
to search based on result set properties.
A data table with information about the queried result sets. Note that this function does not return
differential expression values themselves. Use get_differential_expression_values
to get differential expression values
result.ID
: Result set ID of the differential expression analysis.
May represent multiple factors in a single model.
contrast.ID
: Id of the specific contrast factor. Together with the result.ID
they uniquely represent a given contrast.
experiment.ID
: Id of the source experiment
factor.category
: Category for the contrast
factor.category.URI
: URI for the contrast category
factor.ID
: ID of the factor
baseline.factors
: Characteristics of the baseline. This field is a data.table
experimental.factors
: Characteristics of the experimental group. This field is a data.table
isSubset
: TRUE if the result set belongs to a subset, FALSE if not. Subsets are created when performing differential expression to avoid unhelpful comparisons.
subsetFactor
: Characteristics of the subset. This field is a data.table
get_result_sets(dataset = 1)
# get all contrasts comparing disease states. use filter_properties to see available options
get_result_sets(filter = "baselineGroup.characteristics.value = disease")
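Since a contrast is identified by result.ID and contrast.ID together, it can help to build a combined key before joining or deduplicating tables. The base R sketch below uses made-up IDs; the contrast_key column and its "_" separator are choices made for this example, not something gemma.R produces.

```r
# Made-up rows with the documented ID columns.
result_sets <- data.frame(
  result.ID = c(500L, 500L, 501L),
  contrast.ID = c(10L, 11L, 10L),
  experiment.ID = c(1L, 1L, 1L)
)

# Neither ID alone is unique in this table, but the pair is.
result_sets$contrast_key <- paste(result_sets$result.ID,
                                  result_sets$contrast.ID,
                                  sep = "_")

# anyDuplicated() returns 0 when every key is unique.
print(anyDuplicated(result_sets$contrast_key))
print(result_sets)
```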
Returns taxa and their versions used in Gemma
get_taxa(memoised = getOption("gemma.memoised", FALSE))
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
A data frame including the names, IDs and database information about the taxa
get_taxa()
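The idea behind the memoised argument, reusing a stored result for repeated calls with the same inputs, can be illustrated with a toy base R example. This is only a conceptual sketch: make_memoised, slow_square and fast_square are invented names, and nothing here reflects how gemma.R actually implements its cache.

```r
# Toy illustration of memoisation; this is NOT how gemma.R implements its cache.
make_memoised <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(x) {
    key <- as.character(x)
    if (!exists(key, envir = cache, inherits = FALSE)) {
      assign(key, f(x), envir = cache)          # first call: compute and store
    }
    get(key, envir = cache, inherits = FALSE)   # later calls: reuse stored value
  }
}

n_calls <- 0L
slow_square <- function(x) {
  n_calls <<- n_calls + 1L   # count how often the underlying function runs
  x^2
}

fast_square <- make_memoised(slow_square)
first <- fast_square(4)
second <- fast_square(4)   # served from the cache, slow_square is not re-run
print(n_calls)
```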
Using the output of get_dataset_samples
, this function creates
a simplified design table, granting one column to each experimental variable
make_design(samples, metaType = "text")
samples |
An output from get_dataset_samples. The output should not be raw |
metaType |
Type of metadata to include in the output. "text", "uri" or "both" |
A data.frame including the design table for the dataset
samples <- get_dataset_samples('GSE46416')
make_design(samples)
Search for annotation tags
search_annotations( query, raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
query |
The search query. Queries can include plain text or ontology terms. They also support conjunctions ("alpha AND beta"), disjunctions ("alpha OR beta"), grouping ("(alpha OR beta) AND gamma"), prefixing ("alpha*"), wildcard characters ("BRCA?") and fuzzy matches ("alpha~"). |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
A data table with annotations (annotation search result value objects)
matching the given identifiers. A list if raw = TRUE
. A 400 error
if required parameters are missing.
The fields of the output data.table are:
category.name
: Category that the annotation belongs to
category.URI
: URI for the category.name
value.name
: Annotation term
value.URI
: URI for the value.name
search_annotations("traumatic")
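Queries that combine the operators described for the query argument can be assembled from character vectors. In the sketch below, any_of is a hypothetical helper (not part of gemma.R) and the terms are the placeholders used in the argument description.

```r
# Hypothetical helper (not part of gemma.R): join terms with OR inside parentheses.
any_of <- function(terms) {
  paste0("(", paste(terms, collapse = " OR "), ")")
}

# Grouped disjunction combined with a prefix match via AND.
query <- paste(any_of(c("alpha", "beta")), "AND", "gamma*")
print(query)

# With gemma.R attached and network access, the string could be passed on as:
# search_annotations(query)
```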
Search everything in Gemma
search_gemma( query, taxon = NA_character_, platform = NA_character_, limit = 100, resultType = "experiment", raw = getOption("gemma.raw", FALSE), memoised = getOption("gemma.memoised", FALSE), file = getOption("gemma.file", NA_character_), overwrite = getOption("gemma.overwrite", FALSE) )
query |
The search query. Queries can include plain text or ontology terms. They also support conjunctions ("alpha AND beta"), disjunctions ("alpha OR beta"), grouping ("(alpha OR beta) AND gamma"), prefixing ("alpha*"), wildcard characters ("BRCA?") and fuzzy matches ("alpha~"). |
taxon |
A numerical taxon identifier, an NCBI taxon identifier, or a taxon identifier that matches either its scientific or common name |
platform |
A platform numerical identifier or a platform short name |
limit |
Defaults to 100 with a maximum value of 2000. Limits the number of returned results. Note that this function does not support pagination. |
resultType |
The kind of results that should be included in the output. Can be experiment, gene, platform or a long object type name, documented in the API documentation. |
raw |
|
memoised |
Whether or not to save to cache for future calls with the
same inputs and use the result saved in cache if a result is already saved.
Doing |
file |
The name of a file to save the results to, or |
overwrite |
Whether or not to overwrite if a file exists at the specified filename. |
If raw = FALSE
and resultType is experiment, gene or platform,
a data.table containing the search results. If it is any other type, a list
of results. A list with additional details about the search if raw = TRUE
search_gemma("bipolar")
Allows the user to access information that requires logging in to Gemma. To log out, run set_gemma_user
without specifying the username or password.
set_gemma_user(username = NULL, password = NULL)
username |
Your username (or empty, if logging out) |
password |
Your password (or empty, if logging out) |
TRUE if authentication is successful, FALSE if not
Re-runs the function used to create a gemma.R output to update the data at hand. Useful if you have reason to believe parts of the data have changed since you last accessed them and you wish to update while decoupling the update process from the original code used to generate the data.
update_result(query)
query |
Output from a gemma.R function |
Note that if you used the file and overwrite arguments in the original call, the file will also be regenerated according to your initial preferences.
annots <- get_dataset_annotations(1)
# wait for a couple of years..
# wonder if the results are the same
updated_annots <- update_result(annots)
# also works with outputs of get_all_pages
platforms <- get_all_pages(get_platforms_by_ids())
updated_platforms <- update_result(platforms)