Package 'GEOquery'

Title: Get data from NCBI Gene Expression Omnibus (GEO)
Description: The NCBI Gene Expression Omnibus (GEO) is a public repository of high-throughput functional genomics data, including microarray, RNA-seq, and single-cell experiments. GEOquery is the bridge between GEO and Bioconductor: it downloads and parses GEO Series (GSE), Sample (GSM), Platform (GPL), and DataSet (GDS) records. By default it parses GEO Series Matrix files into Bioconductor 'SummarizedExperiment' objects ('ExpressionSet' objects on request); it can also parse the full SOFT format into GEOquery's own S4 classes, retrieve NCBI-computed RNA-seq quantifications, download supplementary files, and search GEO.
Authors: Sean Davis [aut, cre] (ORCID: <https://orcid.org/0000-0002-8991-6458>)
Maintainer: Sean Davis
License: MIT + file LICENSE
Version: 2.81.21
Built: 2026-06-14 15:52:50 UTC
Source: https://github.com/bioc/GEOquery

Help Index


Coerce a GEOquery ExpressionSet to a SummarizedExperiment

Description

A thin wrapper around SummarizedExperiment::makeSummarizedExperimentFromExpressionSet() used by getGEO(..., returnType = "SummarizedExperiment"), and available directly so existing ExpressionSet results can be modernized without re-downloading.

Usage

as_SummarizedExperiment(eset)

Arguments

eset

An ExpressionSet, e.g. an element of the list returned by getGEO(..., returnType = "ExpressionSet") for a GSE Series Matrix file.

Value

A SummarizedExperiment.

Examples

## Not run: 
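  # Request an ExpressionSet explicitly; the getGEO() default is SummarizedExperiment
  eset <- getGEO("GSE2553", returnType = "ExpressionSet")[[1]]
  se <- as_SummarizedExperiment(eset)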

## End(Not run)

Open the GEO page for a given accession

Description

Sometimes, you just need to see the GEO website page for a GEO accession. This function opens the GEO page for a given accession number in the default browser.

Usage

browseGEOAccession(geo)

Arguments

geo

A GEO accession number

See Also

urlForAccession

Examples

## Not run: 
browseGEOAccession("GSE262484")

## End(Not run)

Browse GEO search website for RNA-seq datasets

Description

This function opens a browser window to the NCBI GEO website with a search for RNA-seq datasets. It is included as a convenience function to remind users of how to search for RNA-seq datasets using the NCBI GEO website and an "rnaseq counts" filter.

Usage

browseWebsiteRNASeqSearch()

Examples

## Not run: 
browseWebsiteRNASeqSearch()

## End(Not run)

Clear the GEOquery download cache

Description

Remove all entries from the persistent download cache (see geoCache).

Usage

clearGEOCache()

Value

NULL, invisibly.

See Also

geoCache


Convert a GDS data structure to a Bioconductor data structure

Description

Functions to take a GDS data structure from getGEO and coerce it to limma MALists or ExpressionSets.

Arguments

GDS

The GDS data structure returned by getGEO

do.log2

A boolean indicating whether the data in the GDS should be log2 transformed before being inserted into the new data structure.

GPL

Either a GPL data structure (from a call to getGEO) or NULL. If NULL, this will cause a call to getGEO to produce a GPL. The gene information from the GPL is then used to construct the genes slot of the resulting limma MAList object or the featureData slot of the ExpressionSet instance.

AnnotGPL

A boolean indicating whether to use the annotation GPL files rather than the user-submitted GPL files. Annotation GPL files are generally available for GDS records, so the default is to use them.

getGPL

A boolean defaulting to TRUE as to whether or not to download and include GPL information when converting to ExpressionSet or MAList. You may want to set this to FALSE if you know that you are going to annotate your featureData using Bioconductor tools rather than relying on information provided through NCBI GEO. Download times can also be greatly reduced by specifying FALSE.

Details

This function just rearranges one data structure into another. For GDS, it also deals appropriately with making the 'targets' list item for the limma data structure and the phenoData slot of ExpressionSets.

Value

GDS2MA

A limma MAList

GDS2eSet

An ExpressionSet object

Author(s)

Sean Davis

References

See the limma and ExpressionSet help in the appropriate packages

Examples

## Not run: gds505 <- getGEO('GDS505')
## Not run: MA <- GDS2MA(gds505)
## Not run: eset <- GDS2eSet(gds505)

Class 'GDS'

Description

A class describing a GEO GDS entity

Objects from the Class

Objects of this class are returned by getGEO; they are not normally constructed directly.

Author(s)

Sean Davis

See Also

GEOData-class


GEOquery download cache

Description

Return the BiocFileCache object that backs GEOquery's persistent download cache (see clearGEOCache). The cache is used by the download functions only when options(GEOquery.cache = TRUE) is set; its location defaults to tools::R_user_dir("GEOquery", "cache") and can be overridden with options(GEOquery.cache.path = ...).

Usage

geoCache()

Value

A BiocFileCache object.

See Also

clearGEOCache
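
The two options named in the Description can be set once per session. The sketch below only sets those documented options and computes the documented default location with base R; it does not call GEOquery, and the override directory is a hypothetical example.

```r
# Opt in to the persistent download cache for this session.
options(GEOquery.cache = TRUE)

# Documented default location (needs R >= 4.0.0); computed here for inspection only.
default_cache_dir <- tools::R_user_dir("GEOquery", "cache")

# Optional override; this directory name is a hypothetical example.
custom_cache_dir <- file.path(tempdir(), "geoquery-cache")
options(GEOquery.cache.path = custom_cache_dir)

getOption("GEOquery.cache")
getOption("GEOquery.cache.path")
```

With the options in place, geoCache() should return the cache object and clearGEOCache() should empty it.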


Accessors for GEOquery objects

Description

Accessor generics for the S4 objects returned by getGEO when parsing SOFT-format records (GSE, GSM, GPL, GDS). Use these rather than reaching into slots directly.

Arguments

object

A GEOquery S4 object (GSE, GSM, GPL, GDS, or GEODataTable).

Details

Meta(object)

The record metadata as a named list (title, submission dates, sample/platform attributes, and so on).

Accession(object)

The GEO accession (the geo_accession metadata field).

Table(object)

The data table as a data.frame – for example the measurement table of a GSM or the probe annotation of a GPL.

Columns(object)

A data.frame describing the columns of Table(object).

dataTable(object)

The underlying GEODataTable object, which holds both Table() and Columns().

GSMList(object)

For a GSE, the list of its GSM sample objects.

GPLList(object)

For a GSE, the list of its GPL platform objects.

Value

Meta() a list; Accession() a character string; Table() and Columns() data.frames; dataTable() a GEODataTable; GSMList() and GPLList() named lists.

Author(s)

Sean Davis

See Also

GEOData-class, getGEO

Examples

## Not run: 
  gsm <- getGEO("GSM11805")
  Meta(gsm)$title
  head(Table(gsm))
  Columns(gsm)

  gse <- getGEO("GSE781", GSEMatrix = FALSE)
  names(GSMList(gse))

## End(Not run)

Class 'GEOData'

Description

A virtual class for holding GEO samples, platforms, and datasets

Objects from the Class

Objects of this class are returned by getGEO; they are not normally constructed directly.

Author(s)

Sean Davis

See Also

GDS-class, GPL-class, GSM-class, GEODataTable-class


Class 'GEODataTable'

Description

Contains the column descriptions and data for the datatable part of a GEO object

Objects from the Class

Objects of this class are returned by getGEO; they are not normally constructed directly.

Author(s)

Sean Davis


Inventory the single-cell supplementary files of a GEO Series

Description

Lists the supplementary files attached to a GSE and classifies each by single-cell format (10x Matrix Market triplet, 10x HDF5, AnnData h5ad, loom, Seurat rds, tar archive, or other), extracting the GSM sample id where present. This lets you see what a single-cell study contains – and how 10x triplets group by sample – before downloading potentially many gigabytes.

Usage

geoSingleCellManifest(GEO)

Arguments

GEO

A GEO Series accession, e.g. "GSE161228".

Details

No files are downloaded. The result feeds the planned single-cell readers (see ADR-0004); reading itself uses Bioconductor importers (TENxIO, anndataR) that are optional dependencies.

Value

A data.frame with columns fname, sample (GSM id or NA), format, role, and url. Zero rows if the GSE has no supplementary files.

See Also

getGEOSuppFiles

Examples

## Not run: 
  m <- geoSingleCellManifest("GSE161228")
  m

## End(Not run)
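
The classification step can be pictured with a small base-R sketch. The helper names, the file names, and the format labels other than "10x_mtx", "10x_h5", and "h5ad" are assumptions made for illustration; the rules GEOquery actually applies may differ.

```r
# Hypothetical helper: classify one supplementary file name by single-cell format.
classify_sc_file <- function(fname) {
  f <- tolower(fname)
  if (grepl("\\.h5ad(\\.gz)?$", f)) return("h5ad")
  if (grepl("\\.loom(\\.gz)?$", f)) return("loom")
  if (grepl("\\.rds(\\.gz)?$", f)) return("seurat_rds")
  if (grepl("\\.h5(\\.gz)?$", f)) return("10x_h5")
  if (grepl("\\.tar(\\.gz)?$", f)) return("tar")
  if (grepl("(matrix\\.mtx|barcodes\\.tsv|(features|genes)\\.tsv)(\\.gz)?$", f)) {
    return("10x_mtx")
  }
  "other"
}

# Hypothetical helper: pull a GSM id from a file name, or NA if there is none.
extract_gsm <- function(fname) {
  m <- regmatches(fname, regexpr("GSM[0-9]+", fname))
  if (length(m) == 0) NA_character_ else m
}

# Made-up file names, not the contents of any real GSE.
files <- c(
  "GSM100001_sampleA_matrix.mtx.gz",
  "GSM100001_sampleA_barcodes.tsv.gz",
  "GSM100001_sampleA_features.tsv.gz",
  "GSM100002_sampleB.h5ad",
  "GSE000000_RAW.tar",
  "GSE000000_readme.txt"
)

manifest <- data.frame(
  fname  = files,
  sample = vapply(files, extract_gsm, character(1), USE.NAMES = FALSE),
  format = vapply(files, classify_sc_file, character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
manifest
```

Series-level files carry no GSM id, so their sample value is NA, matching the Value description above.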

Group a single-cell manifest into loadable units

Description

Collapses a geoSingleCellManifest into one row per loadable unit (a sample + format combination) and reports completeness. A 10x Matrix Market unit is "complete" only when its matrix, barcodes, and features files are all present; single-file formats (h5ad, 10x h5, loom, rds) are always complete. The loadable column flags units a reader can consume.

Usage

geoSingleCellUnits(manifest)

Arguments

manifest

A data.frame returned by geoSingleCellManifest().

Value

A data.frame with columns sample, format, n_files, status, and loadable.

See Also

geoSingleCellManifest

Examples

## Not run: 
  m <- geoSingleCellManifest("GSE161228")
  geoSingleCellUnits(m)

## End(Not run)
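
The completeness rule can be sketched in base R. The manifest below is hand-built and hypothetical, and the role labels ("matrix", "barcodes", "features", "data") are assumptions; only the rule itself, that a 10x triplet needs all three files while single-file formats are always complete, comes from the Description.

```r
# Hypothetical manifest: one complete triplet, one incomplete triplet, one h5ad file.
manifest <- data.frame(
  sample = c("GSM100001", "GSM100001", "GSM100001",
             "GSM100002", "GSM100002", "GSM100003"),
  format = c("10x_mtx", "10x_mtx", "10x_mtx", "10x_mtx", "10x_mtx", "h5ad"),
  role   = c("matrix", "barcodes", "features", "matrix", "barcodes", "data"),
  stringsAsFactors = FALSE
)

required_roles <- c("matrix", "barcodes", "features")

# One unit per sample + format combination.
unit_key <- paste(manifest$sample, manifest$format, sep = "|")

units <- do.call(rbind, lapply(split(manifest, unit_key), function(u) {
  is_triplet <- u$format[1] == "10x_mtx"
  # Triplets need every required role; single-file formats are always complete.
  complete <- !is_triplet || all(required_roles %in% u$role)
  data.frame(
    sample  = u$sample[1],
    format  = u$format[1],
    n_files = nrow(u),
    status  = if (complete) "complete" else "incomplete",
    stringsAsFactors = FALSE
  )
}))
rownames(units) <- NULL
units
```

The loadable column is left out here because it depends on which readers are available.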

Get a directory listing from NCBI GEO

Description

Returns the directory listing found at a GEO URL. The function assumes that the HTML response has the structure of an NCBI-formatted index page.

Usage

getDirListing(url)

Arguments

url

A URL, assumed to return an NCBI-formatted index page
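
The assumption about the HTML structure can be illustrated with base R alone. The sketch below is not GEOquery's implementation: the page content is a hypothetical, simplified index page, and the regular expressions are only one plausible way to pull entries out of anchor tags.

```r
# Hypothetical, simplified index page; real NCBI pages contain more markup.
html <- c(
  "<html><body><pre>",
  "<a href=\"/geo/series/GSE1nnn/\">Parent Directory</a>",
  "<a href=\"matrix/\">matrix/</a>",
  "<a href=\"soft/\">soft/</a>",
  "<a href=\"suppl/\">suppl/</a>",
  "</pre></body></html>"
)

# Pull every href attribute out of the anchor tags.
hrefs <- unlist(regmatches(html, gregexpr("href=\"[^\"]+\"", html)))

# Strip the trailing quote, then the leading href=" prefix.
hrefs <- sub("^href=\"", "", sub("\"$", "", hrefs))

# Keep relative entries only; this drops the absolute parent-directory link.
entries <- hrefs[!grepl("^/", hrefs)]
entries
```

If the page layout changes, a pattern-based approach like this one silently returns fewer entries, which is why the function is documented as depending on the page structure.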


Get a GEO object from NCBI or file

Description

This function is the main user-level function in the GEOquery package. It directs the download (if no filename is specified) and parsing of a GEO SOFT format file into an R data structure designed to make each of the important parts of the GEO SOFT format easily accessible.

Usage

getGEO(
  GEO = NULL,
  filename = NULL,
  destdir = tempdir(),
  GSElimits = NULL,
  GSEMatrix = TRUE,
  AnnotGPL = FALSE,
  getGPL = TRUE,
  parseCharacteristics = TRUE,
  returnType = c("SummarizedExperiment", "ExpressionSet")
)

Arguments

GEO

A character string representing a GEO object for download and parsing (e.g., 'GDS505', 'GSE2', 'GSM2', 'GPL96').

filename

The filename of a previously downloaded GEO SOFT format file or its gzipped representation (in which case the filename must end in .gz). Specify either GEO or filename, not both. GEO Series Matrix files are also handled. Because a single file is parsed, a GSE matrix file yields a single object rather than a list.

destdir

The destination directory for any downloads. Defaults to the architecture-dependent tempdir. You may want to specify a different directory if you want to save the file for later use. Doing so is a good idea if you have a slow connection, as some of the GEO files are HUGE!

GSElimits

This argument can be used to load only a contiguous subset of the GSMs from a GSE. It should be specified as a vector of length 2 specifying the start and end (inclusive) GSMs to load. This could be useful for splitting up large GSEs into more manageable parts, for example.

GSEMatrix

A boolean telling GEOquery whether or not to use GSE Series Matrix files from GEO. Parsing these files can be many orders of magnitude faster than parsing the GSE SOFT format files. Defaults to TRUE, meaning that SOFT format parsing will not occur; set to FALSE if you need columns from the GSE records that the Series Matrix files do not include.

AnnotGPL

A boolean defaulting to FALSE as to whether or not to use the Annotation GPL information. These files are nice to use because they contain up-to-date information remapped from Entrez Gene on a regular basis. However, they do not exist for all GPLs; in general, they are only available for GPLs referenced by a GDS.

getGPL

A boolean defaulting to TRUE as to whether or not to download and include GPL information when getting a GSEMatrix file. You may want to set this to FALSE if you know that you are going to annotate your featureData using Bioconductor tools rather than relying on information provided through NCBI GEO. Download times can also be greatly reduced by specifying FALSE.

parseCharacteristics

A boolean defaulting to TRUE as to whether or not to parse the characteristics information (if available) for a GSE Matrix file. Set this to FALSE if you experience trouble while parsing the characteristics.

returnType

One of "SummarizedExperiment" (default) or "ExpressionSet". For GSE Series Matrix results, controls whether each entity is returned as a SummarizedExperiment or an ExpressionSet. SOFT-format results (GDS/GPL/GSM/GSE S4 objects) are unaffected. As of this release the default is "SummarizedExperiment"; pass returnType = "ExpressionSet" for the previous behavior.

Details

getGEO downloads and parses information available from NCBI GEO (http://www.ncbi.nlm.nih.gov/geo). The following describes what is available from GEO. All entity types are handled by getGEO, and essentially any information in the GEO SOFT format is reflected in the resulting data structure.

From the GEO website:

The Gene Expression Omnibus (GEO) from NCBI serves as a public repository for a wide range of high-throughput experimental data. These data include single and dual channel microarray-based experiments measuring mRNA, genomic DNA, and protein abundance, as well as non-array techniques such as serial analysis of gene expression (SAGE), and mass spectrometry proteomic data. At the most basic level of organization of GEO, there are three entity types that may be supplied by users: Platforms, Samples, and Series. Additionally, there is a curated entity called a GEO dataset.

A Platform record describes the list of elements on the array (e.g., cDNAs, oligonucleotide probesets, ORFs, antibodies) or the list of elements that may be detected and quantified in that experiment (e.g., SAGE tags, peptides). Each Platform record is assigned a unique and stable GEO accession number (GPLxxx). A Platform may reference many Samples that have been submitted by multiple submitters.

A Sample record describes the conditions under which an individual Sample was handled, the manipulations it underwent, and the abundance measurement of each element derived from it. Each Sample record is assigned a unique and stable GEO accession number (GSMxxx). A Sample entity must reference only one Platform and may be included in multiple Series.

A Series record defines a set of related Samples considered to be part of a group, how the Samples are related, and if and how they are ordered. A Series provides a focal point and description of the experiment as a whole. Series records may also contain tables describing extracted data, summary conclusions, or analyses. Each Series record is assigned a unique and stable GEO accession number (GSExxx).

GEO DataSets (GDSxxx) are curated sets of GEO Sample data. A GDS record represents a collection of biologically and statistically comparable GEO Samples and forms the basis of GEO's suite of data display and analysis tools. Samples within a GDS refer to the same Platform, that is, they share a common set of probe elements. Value measurements for each Sample within a GDS are assumed to be calculated in an equivalent manner, that is, considerations such as background processing and normalization are consistent across the dataset. Information reflecting experimental design is provided through GDS subsets.

Value

An object of the appropriate class (GDS, GPL, GSM, or GSE) is returned. If the GSEMatrix option is used, then a list of SummarizedExperiment objects is returned by default (or ExpressionSet objects if returnType = "ExpressionSet"), one for each SeriesMatrix file associated with the GSE accession.

Warning

Some of the files that are downloaded, particularly those associated with GSE entries from GEO are absolutely ENORMOUS and parsing them can take quite some time and memory. So, particularly when working with large GSE entries, expect that you may need a good chunk of memory and that coffee may be involved when parsing....

Author(s)

Sean Davis

See Also

getGEOfile

Examples

## Not run: 

gds <- getGEO('GDS10')
gds

gse <- getGEO('GSE10')
# Returns a list, so look at first item

gse[[1]]


## End(Not run)
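
The GSElimits argument lends itself to splitting a large Series into windows. The helper below is a hypothetical base-R sketch (its name and arguments are not part of GEOquery); it only computes inclusive start/end pairs and performs no download. It assumes the number of samples is already known.

```r
# Hypothetical helper: split n_samples into contiguous, inclusive windows.
make_gse_limits <- function(n_samples, chunk_size) {
  starts <- seq(1, n_samples, by = chunk_size)
  ends <- pmin(starts + chunk_size - 1, n_samples)
  Map(function(s, e) c(s, e), starts, ends)
}

limits <- make_gse_limits(n_samples = 25, chunk_size = 10)
limits

# Each element is a length-2 vector that could be passed as GSElimits, e.g.
# (not run here because it downloads data):
# part1 <- getGEO("GSE781", GSEMatrix = FALSE, GSElimits = limits[[1]])
```

The commented call uses GSEMatrix = FALSE because the limits restrict which GSMs are parsed into a GSE object.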

Download a GEO SOFT file to the local machine

Description

This function simply downloads a SOFT format file associated with the GEO accession number given.

Usage

getGEOfile(
  GEO,
  destdir = tempdir(),
  AnnotGPL = FALSE,
  amount = c("full", "brief", "quick", "data")
)

Arguments

GEO

Character string, the GEO accession for download (e.g., GDS84, GPL96, GSE2553, or GSM10)

destdir

Directory in which to store the resulting downloaded file. Defaults to tempdir()

AnnotGPL

A boolean defaulting to FALSE as to whether or not to use the Annotation GPL information. These files are nice to use because they contain up-to-date information remapped from Entrez Gene on a regular basis. However, they do not exist for all GPLs; in general, they are only available for GPLs referenced by a GDS.

amount

Amount of information to pull from GEO. Only applies to GSE, GPL, or GSM. See Details.

Details

This function downloads GEO SOFT files based on accession number. It does not do any parsing. The first two arguments should be fairly self-explanatory; the last, amount, is based on the input to the acc.cgi URL at the GEO website. In the default 'full' mode, the entire SOFT format file is downloaded. Both 'brief' and 'quick' offer shortened versions of the files, good for 'peeking' at the file before a big download on a slow connection. Finally, 'data' downloads only the data table part of the SOFT file, which is a convenient way to get a simple Excel-like file for use with other programs.

Value

Invisibly returns the full path of the downloaded file.

Author(s)

Sean Davis

References

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi

See Also

getGEO

Examples

# myfile <- getGEOfile('GDS10')
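
To show how the amount argument might map onto the acc.cgi URL mentioned in Details, here is a minimal sketch. The helper is hypothetical, and the query parameter names (acc, targ, form, view) are an assumption about the public acc.cgi interface rather than something taken from GEOquery's source.

```r
# Hypothetical helper: build an acc.cgi URL for an accession and an amount.
# The parameter names acc, targ, form, and view are assumptions.
geo_acc_url <- function(GEO, amount = c("full", "brief", "quick", "data")) {
  # match.arg() rejects anything outside the four documented choices
  # and picks "full" when amount is not supplied.
  amount <- match.arg(amount)
  sprintf(
    "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=%s&targ=self&form=text&view=%s",
    toupper(GEO), amount
  )
}

geo_acc_url("GSM10", "brief")
```

Using match.arg() mirrors the way the Usage block lists the allowed values as a character vector whose first element is the default.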

GSE Supplemental file listing

Description

GEO Series records often have one or more supplemental files. In most cases, those files are archived as '.tar' files, whose contents are described only in a separate file-listing file that is not offered for download on the website.

Usage

getGEOSeriesFileListing(GSE)

Arguments

GSE

character(1) the GSE accession

Details

This function reads that file-listing file and returns the results as a data.frame.

Value

A data.frame with 5 columns. See example.

Examples

## Not run: 
getGEOSeriesFileListing('GSE288770')


## End(Not run)

Download and read the single-cell data of a GEO Series

Description

High-level, best-effort convenience wrapper: inventories the GSE (geoSingleCellManifest), groups files into loadable units (geoSingleCellUnits), downloads each loadable unit, reads it with readGEOSingleCell, and returns the results. It reports which units it loads and which it skips.

Usage

getGEOSingleCell(
  GEO,
  samples = NULL,
  format = NULL,
  combine = FALSE,
  destdir = tempdir()
)

Arguments

GEO

A GEO Series accession, e.g. "GSE161228".

samples

Optional character vector of GSM ids to restrict to.

format

Optional format(s) to restrict to ("10x_mtx", "10x_h5", "h5ad").

combine

Logical; if TRUE attempt to cbind the per-sample objects into one (requires matching features). Default FALSE returns a list.

destdir

Download destination directory.

Details

This handles common, well-structured layouts (clean per-sample 10x or h5ad). It does NOT handle every GSE: loom and Seurat .rds formats, files packaged inside a _RAW.tar archive, and idiosyncratic layouts (e.g. a single combined matrix for many samples) are out of scope – use the manifest plus readGEOSingleCell() directly for those.

Value

A named list of SingleCellExperiment (one per sample), or a single combined object if combine = TRUE.

See Also

geoSingleCellManifest, readGEOSingleCell
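
The "matching features" requirement for combine = TRUE can be illustrated with plain matrices. This is a sketch only: the helper is hypothetical, the data are made up, and falling back to the list when features differ is an assumption about reasonable behavior, not a statement of what getGEOSingleCell() does.

```r
# Two hypothetical per-sample count matrices (features in rows, cells in columns).
genes <- c("geneA", "geneB", "geneC")
s1 <- matrix(1:6,  nrow = 3, dimnames = list(genes, c("s1_c1", "s1_c2")))
s2 <- matrix(7:12, nrow = 3, dimnames = list(genes, c("s2_c1", "s2_c2")))

# Hypothetical helper: cbind only when every object has identical row names.
combine_if_compatible <- function(objs) {
  feats <- lapply(objs, rownames)
  same <- all(vapply(feats, identical, logical(1), feats[[1]]))
  if (!same) return(objs)          # assumption: keep the list when features differ
  do.call(cbind, unname(objs))     # cells from all samples side by side
}

combined <- combine_if_compatible(list(GSM1 = s1, GSM2 = s2))
dim(combined)
```

Checking row names before binding avoids silently misaligning genes between samples.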


Get Supplemental Files from GEO

Description

NCBI GEO allows supplemental files to be attached to GEO Series (GSE), GEO platforms (GPL), and GEO samples (GSM). This function 'knows' how to get these files based on the GEO accession. No parsing of the downloaded files is attempted, since the file format is not generally knowable by the computer.

Usage

getGEOSuppFiles(
  GEO,
  makeDirectory = TRUE,
  baseDir = getwd(),
  fetch_files = TRUE,
  filter_regex = NULL,
  quiet = getOption("GEOquery.quiet", FALSE)
)

Arguments

GEO

A GEO accession number such as GPL1073 or GSM1137

makeDirectory

Should a 'subdirectory' for the downloaded files be created? Default is TRUE. If FALSE, the files will be downloaded directly into the baseDir.

baseDir

The base directory for the downloads. Default is the current working directory.

fetch_files

logical(1). If TRUE, then actually download the files. If FALSE, just return the filenames that would have been downloaded. Useful for testing and getting a list of files without actual download.

filter_regex

A character(1) regular expression that will be used to filter the filenames from GEO to limit those files that will be downloaded. This is useful to limit to, for example, bed files only.

quiet

logical(1). If TRUE, suppress informational messages such as "No supplemental files found" and "Using locally cached version". Defaults to the 'GEOquery.quiet' option, or FALSE.

Details

The files are downloaded as-is; no parsing is attempted.

Value

If fetch_files=TRUE, a data frame is returned invisibly with rownames representing the full path of the resulting downloaded files and the records in the data.frame the output of file.info for each downloaded file. If fetch_files=FALSE, a data.frame of URLs and filenames is returned.

Author(s)

Sean Davis

Examples

## Not run: 

a <- getGEOSuppFiles('GSM1137', fetch_files = FALSE)
a

# with a set of single-cell RNA-seq data
a <- getGEOSuppFiles('GSE161228', fetch_files = FALSE)
a


## End(Not run)
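
The effect of filter_regex can be previewed offline. The file names below are hypothetical, and the sketch assumes the pattern is applied to file names in the way grepl() would apply it, which is not confirmed by this page.

```r
# Hypothetical supplementary file names, for illustration only.
fnames <- c("GSM1_peaks.bed.gz", "GSM1_signal.bw", "GSM2_peaks.bed.gz", "README.txt")

# Keep bed files, whether or not they are gzipped.
filter_regex <- "\\.bed(\\.gz)?$"
selected <- fnames[grepl(filter_regex, fnames)]
selected

# The same pattern could then be passed on (not run here; needs network access):
# getGEOSuppFiles("GSE161228", fetch_files = FALSE, filter_regex = filter_regex)
```

Combining filter_regex with fetch_files = FALSE is a cheap way to check a pattern before committing to a large download.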

Get GEO supplemental file URL for a given GEO accession

Description

Get GEO supplemental file URL for a given GEO accession

Usage

getGEOSuppFileURL(GEO)

Arguments

GEO

A GEO accession number, e.g. "GSE161228".

Examples

# an example of a GEO supplemental file URL
# with a set of single-cell RNA-seq data
url = getGEOSuppFileURL("GSE161228")
url

## Not run: 
  browseURL(url)

## End(Not run)

Get GSE data tables from GEO into R data structures.

Description

In some cases, instead of individual sample records (GSM) containing information regarding sample phenotypes, the GEO Series contains that information in an attached data table. An example is given by GSE3494, where there are two data tables with important information contained within them. Using getGEO with the standard parameters downloads the GSEMatrix file which, unfortunately, does not contain the information in the data tables. This function simply downloads the “header” information from the GSE record and parses out the data tables into R data.frames.

Usage

getGSEDataTables(GSE)

Arguments

GSE

The GSE identifier, such as “GSE3494”.

Value

A list of data.frames.

Author(s)

Sean Davis

See Also

getGEO

Examples

## Not run: 

dfl = getGSEDataTables('GSE3494')
lapply(dfl,head)



## End(Not run)

Get GEO RNA-seq quantifications as a SummarizedExperiment object

Description

For human and mouse GEO datasets, NCBI GEO attempts to process the raw data and provide quantifications in the form of raw counts and an annotation file. This function downloads the raw counts and annotation files from GEO and merges that with the metadata from the GEO object to create a SummarizedExperiment.

Usage

getRNASeqData(accession)

Arguments

accession

GEO accession number

Details

A major barrier to fully exploiting and reanalyzing the massive volumes of public RNA-seq data archived by SRA is the cost and effort required to consistently process raw RNA-seq reads into concise formats that summarize the expression results. To help address this need, the NCBI SRA and GEO teams have built a pipeline that precomputes RNA-seq gene expression counts and delivers them as count matrices that may be incorporated into commonly used differential expression analysis and visualization software.

The pipeline processes RNA-seq data from SRA using the HISAT2 aligner and then generates gene expression counts using the featureCounts program.

See the GEO documentation for more details.

Value

A SummarizedExperiment object with the raw counts as the counts assay, the annotation as the rowData, and the metadata from GEO as the colData.

Examples

## Not run: 
se <- getRNASeqData("GSE164073")
se


## End(Not run)
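
The merge described above amounts to aligning three tables: counts, gene annotation, and sample metadata. The base-R sketch below shows that alignment on made-up data; the column names GeneID, Symbol, geo_accession, and title are assumptions, and the final SummarizedExperiment call is left as a comment because it requires the Bioconductor package.

```r
# Made-up raw counts: genes in rows (Entrez-style ids), samples in columns.
counts <- matrix(
  c(10L, 0L, 5L, 3L, 8L, 1L), nrow = 3,
  dimnames = list(c("1", "2", "9"), c("GSM1", "GSM2"))
)

# Annotation and sample metadata arrive in a different order than the counts.
annotation <- data.frame(
  GeneID = c("9", "1", "2"),
  Symbol = c("NAT1", "A1BG", "A2M"),
  stringsAsFactors = FALSE
)
sample_meta <- data.frame(
  geo_accession = c("GSM2", "GSM1"),
  title = c("treated", "control"),
  stringsAsFactors = FALSE
)

# Reorder both tables so rows line up with the count matrix.
row_data <- annotation[match(rownames(counts), annotation$GeneID), , drop = FALSE]
col_data <- sample_meta[match(colnames(counts), sample_meta$geo_accession), , drop = FALSE]

# Fail early if any gene or sample has no matching record.
stopifnot(!anyNA(row_data$GeneID), !anyNA(col_data$geo_accession))

# With SummarizedExperiment installed, the pieces could be assembled as:
# SummarizedExperiment::SummarizedExperiment(
#   assays = list(counts = counts), rowData = row_data, colData = col_data
# )
```

Aligning by identifier rather than by position guards against files that list genes or samples in different orders.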

Extract genome build and species for GEO RNA-seq quantification

Description

This function extracts the genome build and species information for a GEO RNA-seq quantification.

Usage

getRNASeqQuantGenomeInfo(gse)

Arguments

gse

GEO accession number

Value

A character vector with the genome build and species information

Examples

## Not run: 
getRNASeqQuantGenomeInfo("GSE164073")


## End(Not run)

Class 'GPL'

Description

Contains a full GEO Platform entity

Objects from the Class

Objects of this class are returned by getGEO; they are not normally constructed directly.

Author(s)

Sean Davis

See Also

GEOData-class


Class 'GSE'

Description

Contains a GEO Series entity

Objects from the Class

Objects of this class are returned by getGEO; they are not normally constructed directly.

Author(s)

Sean Davis

See Also

GPL-class, GSM-class


Class 'GSM'

Description

A class containing a GEO Sample entity

Objects from the Class

Objects of this class are returned by getGEO; they are not normally constructed directly.

Author(s)

Sean Davis

See Also

GEOData-class


Does a GEO accession have RNA-seq quantifications?

Description

This function checks whether a GEO accession number has RNA-seq quantifications available. It does this by checking whether the GEO accession number has an "RNA-Seq raw counts" link available on the GEO download page.

Usage

hasRNASeqQuantifications(accession)

Arguments

accession

GEO accession number

Value

TRUE if the GEO accession number has RNA-seq quantifications available, FALSE otherwise.

Examples

hasRNASeqQuantifications("GSE164073")

Parse GEO text

Description

Workhorse GEO parsers.

Usage

parseGEO(
  fname,
  GSElimits,
  destdir = tempdir(),
  AnnotGPL = FALSE,
  getGPL = TRUE,
  parseCharacteristics = TRUE
)

Arguments

fname

The filename of a SOFT format file. If the filename ends in .gz, a gzfile() connection is used to read the file directly.

GSElimits

Used to limit the number of GSMs parsed into the GSE object; useful for memory management for large GSEs.

destdir

The destination directory into which files will be saved (to be used for caching)

AnnotGPL

Fetch the annotation GPL if available

getGPL

Fetch the GPL associated with a GSEMatrix entity (should remain TRUE for all normal use cases)

parseCharacteristics

Whether or not to parse the characteristics information (if available) for a GSE Matrix file. Set to FALSE if you experience trouble parsing the characteristics.

Details

These are probably not useful to the end-user. Use getGEO to access these functions. parseGEO simply delegates to the appropriate specific parser. There should be no reason to use the parseGPL, parseGDS, parseGSE, or parseGSM functions directly.

Value

parseGEO returns an object of the associated type. For example, if it is passed the text from a GDS entry, a GDS object is returned.

Author(s)

Sean Davis

See Also

getGEO
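
The delegation idea can be sketched as a dispatch on the entity line of the SOFT text. This is a hypothetical illustration, not parseGEO's code: the helper name is invented, the sample text is made up, and it assumes SOFT entity lines start with ^SAMPLE, ^SERIES, ^PLATFORM, or ^DATASET.

```r
# Hypothetical helper: report which entity type a SOFT text describes.
soft_entity_type <- function(lines) {
  # Entity lines in SOFT text start with a caret followed by the entity name.
  hit <- grep("^\\^(SAMPLE|SERIES|PLATFORM|DATASET)", lines, value = TRUE)
  if (length(hit) == 0) stop("no SOFT entity line found")
  entity <- sub("^\\^([A-Z]+).*$", "\\1", hit[1])
  switch(entity,
    SAMPLE   = "GSM",
    SERIES   = "GSE",
    PLATFORM = "GPL",
    DATASET  = "GDS"
  )
}

# Made-up SOFT fragment; the title is a placeholder.
soft <- c("^SAMPLE = GSM11805", "!Sample_title = example title")
soft_entity_type(soft)
```

A real dispatcher would go on to call the matching parser; here the function only returns the class name a caller could expect.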


Read a single-cell file (or 10x triplet) into a SingleCellExperiment

Description

Low-level reader: given already-downloaded local file(s), dispatch on format to the appropriate Bioconductor importer and return a SingleCellExperiment. Use this for full control; see getGEOSingleCell for the high-level convenience wrapper.

Usage

readGEOSingleCell(x, format = NULL)

Arguments

x

A path to a single file (.h5/.h5ad), a directory containing a 10x triplet, or a character vector of the triplet files.

format

One of "10x_mtx", "10x_h5", "h5ad". If NULL (default), guessed from x.

Details

Supported formats: "10x_mtx" (a directory, or the matrix/barcodes/ features files, read via TENxIO), "10x_h5" (CellRanger HDF5, TENxIO), and "h5ad" (AnnData, anndataR). loom and Seurat .rds are not supported here – read them with their native packages.

Value

A SingleCellExperiment.

See Also

getGEOSingleCell, geoSingleCellManifest
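
When format is NULL it is guessed from x. One plausible set of rules is sketched below; the helper is hypothetical, and the real guessing logic may consider more cases (for example, gzipped files).

```r
# Hypothetical helper: guess the format from a path or a vector of paths.
guess_sc_format <- function(x) {
  # Several files, or a directory, are treated as a 10x Matrix Market triplet.
  if (length(x) > 1 || dir.exists(x)) return("10x_mtx")
  f <- tolower(x)
  if (grepl("\\.h5ad$", f)) return("h5ad")
  if (grepl("\\.h5$", f)) return("10x_h5")
  stop("cannot guess format; pass `format` explicitly")
}

guess_sc_format("sample.h5ad")
guess_sc_format("filtered_feature_bc_matrix.h5")
guess_sc_format(c("matrix.mtx.gz", "barcodes.tsv.gz", "features.tsv.gz"))
```

Passing format explicitly remains the safer choice for files with unusual names.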


Provide a list of possible search fields for GEO search

Description

Provide a list of possible search fields for GEO search

Usage

searchFieldsGEO()

Value

A data.frame with names of possible search fields for GEO search as well as descriptions, data types, etc. for each field. Fields are in rows and their properties are in columns.

See Also

searchGEO

Examples

searchFieldsGEO()

Search GEO database

Description

This function searches the GDS database and returns a data.frame of all the search results.

Usage

searchGEO(query, step = 500L)

Arguments

query

character, the search term. NCBI uses a search term syntax in which a term can be tied to a specific search field with square brackets. For instance, "Homo sapiens[ORGN]" denotes a search for Homo sapiens in the “Organism” field. For details, see https://www.ncbi.nlm.nih.gov/geo/info/qqtutorial.html. The names and definitions of these fields can be identified using searchFieldsGEO.

step

The number of records to fetch from the database in each request. Choose a smaller value if a request fails.

Details

NCBI allows users to make more requests (10 per second) if they register for and use an API key. The rentrez function set_entrez_key() sets this key for all calls to rentrez functions during a particular R session. You can also set the environment variable ENTREZ_KEY with Sys.setenv(). Once this value is set to your key, rentrez will use it for all requests to the NCBI. For details, see https://docs.ropensci.org/rentrez/articles/rentrez_tutorial.html#rate-limiting-and-api-keys

Value

A data.frame containing the search results.

See Also

searchFieldsGEO

Examples

## Not run: 
searchGEO("diabetes[ALL] AND Homo sapiens[ORGN] AND GSE[ETYP]")

## End(Not run)
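
Queries in the bracketed field syntax can be assembled programmatically. The helper below is a hypothetical convenience, not part of GEOquery. The second half sketches how a step size splits a result set into batches; the zero-based offsets are an assumption about how records are paged, and the total is a made-up number.

```r
# Hypothetical helper: turn a named vector (field = term) into a query string.
build_geo_query <- function(terms) {
  paste(sprintf("%s[%s]", unname(terms), names(terms)), collapse = " AND ")
}

query <- build_geo_query(c(ALL = "diabetes", ORGN = "Homo sapiens", ETYP = "GSE"))
query

# Batching sketch: start offsets for a made-up total of 1234 records.
total <- 1234L
step <- 500L
starts <- seq(0L, total - 1L, by = step)
starts
```

The resulting query string matches the one used in the example above and could be passed to searchGEO() unchanged.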

The URL for a GEO accession

Description

Sometimes, you just need the URL for a GEO accession. This function returns the URL for a given GEO accession number that can be used to access the GEO page for that accession.

Usage

urlForAccession(geo)

Arguments

geo

A GEO accession number

Value

A character vector with the URL for the GEO accession

See Also

browseGEOAccession

Examples

urlForAccession("GSE262484")