Title: | regutools: an R package for data extraction from RegulonDB |
---|---|
Description: | RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks. |
Authors: | Joselyn Chavez [aut, cre] , Carmina Barberena-Jonas [aut] , Jesus E. Sotelo-Fonseca [aut] , Jose Alquicira-Hernandez [ctb] , Heladia Salgado [ctb] , Leonardo Collado-Torres [aut] , Alejandro Reyes [aut] |
Maintainer: | Joselyn Chavez <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.19.0 |
Built: | 2024-11-18 04:13:36 UTC |
Source: | https://github.com/bioc/regutools |
Given a list of filters, this function builds a logical condition to query the database. The output is used in get_dataset().
build_condition(regulondb, dataset, filters, operator, interval, partialmatch)
regulondb |
A regulondb object. |
dataset |
dataset of interest |
filters |
List of filters to be used. The names should correspond to the attribute and the values correspond to the condition for selection. |
operator |
A string indicating if all the filters (AND) or some of them (OR) should be met |
interval |
the filters with values considered as interval |
partialmatch |
name of the condition(s) with a string pattern for full or partial match in the query |
A character(1) with the SQL logical condition to query the dataset.
Carmina Barberena Jonás, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández, Joselyn Chávez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Build the condition for ara
build_condition(
    e_coli_regulondb,
    dataset = "GENE",
    filters = list(
        name = c("ara"),
        strand = c("forward"),
        posright = c("2000", "40000")
    ),
    operator = "AND",
    interval = "posright",
    partialmatch = "name"
)
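To make the roles of filters, operator, interval and partialmatch more concrete, the following base R sketch shows one way a named filter list could be turned into a SQL condition string. The helper sketch_condition() is hypothetical and is not the regutools implementation; the exact SQL that build_condition() emits may differ, and the sketch does no input sanitizing.

```r
# Hypothetical helper (NOT part of regutools): illustrates how a named list
# of filters could be translated into a SQL logical condition.
sketch_condition <- function(filters, operator = "AND",
    interval = NULL, partialmatch = NULL) {
    clauses <- vapply(names(filters), function(field) {
        values <- filters[[field]]
        if (field %in% interval) {
            # Interval filters need exactly two values: lower and upper bound
            stopifnot(length(values) == 2)
            sprintf(
                "(%s >= %s AND %s <= %s)",
                field, values[1], field, values[2]
            )
        } else if (field %in% partialmatch) {
            # Partial matches become LIKE patterns, one per value
            patterns <- sprintf("%s LIKE '%%%s%%'", field, values)
            paste0("(", paste(patterns, collapse = " OR "), ")")
        } else {
            # All remaining filters are exact matches
            exact <- sprintf("%s = '%s'", field, values)
            paste0("(", paste(exact, collapse = " OR "), ")")
        }
    }, character(1))
    # Join the per-filter clauses with the requested operator
    paste(clauses, collapse = paste0(" ", operator, " "))
}

cond <- sketch_condition(
    filters = list(
        name = "ara",
        strand = "forward",
        posright = c("2000", "40000")
    ),
    operator = "AND",
    interval = "posright",
    partialmatch = "name"
)
cond
```

With operator = "OR" the same clauses would be joined so that any one filter is enough to select a row.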
This function downloads the RegulonDB SQLite database file prior to making a connection to it. It will cache the database file such that subsequent calls will run faster. This function requires an active internet connection.
connect_database( ah = AnnotationHub::AnnotationHub(), bfc = BiocFileCache::BiocFileCache() )
ah |
An AnnotationHub object. Set to NULL to connect without using AnnotationHub. |
bfc |
A BiocFileCache object. |
An SQLiteConnection-class connection to the RegulonDB database.
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Connect to the database without using AnnotationHub
regulondb_conn_noAH <- connect_database(ah = NULL)
This function converts, when possible, a regulondb_result object into a Biostrings object.
convert_to_biostrings(regulondb_result, seq_type = "DNA")
regulondb_result |
A regulondb_result object. |
seq_type |
A character string with either DNA or protein, specifying what type of sequence is returned. |
A XStringSet object.
Alejandro Reyes
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Obtain all the information from the "GENE" dataset
convert_to_biostrings(get_dataset(e_coli_regulondb, dataset = "GENE"))
This function converts, when possible, a regulondb_result object into a GRanges object.
convert_to_granges(regulondb_result)
regulondb_result |
A regulondb_result object. |
A GRanges object.
Alejandro Reyes
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Obtain all the information from the "GENE" dataset
convert_to_granges(get_dataset(e_coli_regulondb, dataset = "GENE"))
Given a list of filters, this function builds a logical condition to query the database using intervals. The output is used in build_condition().
existing_intervals(filters, interval, operator, partialmatch)
filters |
List of filters to be used. The names should correspond to the attribute and the values correspond to the condition for selection. |
interval |
the filters with values considered as interval. |
operator |
A string indicating if all the filters (AND) or some of them (OR) should be met. |
partialmatch |
name of the condition(s) with a string pattern for full or partial match in the query. |
A character(1) with the SQL logical condition to query the dataset.
Carmina Barberena Jonás, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández, Joselyn Chávez
## Build the SQL query for existing interval partial matches for ara
existing_intervals(
    filters = list(
        name = "ara",
        strand = "for",
        posright = c("2000", "40000")
    ),
    interval = c("posright"),
    operator = "AND",
    partialmatch = c("name", "strand")
)
Given a list of filters, this function builds a logical condition to query the database using partial matches. The output is used in existing_intervals() and non_existing_intervals().
existing_partial_match(filters, partialmatch, operator)
filters |
List of filters to be used. The names should correspond to the attribute and the values correspond to the condition for selection. |
partialmatch |
name of the condition(s) with a string pattern for full or partial match in the query. |
operator |
A string indicating if all the filters (AND) or some of them (OR) should be met. |
A character(1) with the SQL logical condition to query the dataset.
Carmina Barberena Jonás, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández
## Build the SQL query for existing partial matches for ara
existing_partial_match(
    filters = list(
        name = c("ara"),
        strand = c("forward"),
        posright = c("2000", "40000")
    ),
    partialmatch = "name",
    operator = "AND"
)
Retrieve the binding sites and genome location for a given transcription factor.
get_binding_sites(regulondb, transcription_factor, output_format = "GRanges")
regulondb |
A regulondb object. |
transcription_factor |
name of the transcription factor. |
output_format |
The output object. Can be either a GRanges object (the default) or a Biostrings object. |
Either a GRanges object or a Biostrings object summarizing information about the binding sites of the transcription factors.
José Alquicira Hernández, Jacques van Helden, Joselyn Chávez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Get the binding sites for AraC
get_binding_sites(e_coli_regulondb, transcription_factor = "AraC")
This function retrieves data from RegulonDB. Attributes from datasets can be selected and filtered.
get_dataset( regulondb, dataset = NULL, attributes = NULL, filters = NULL, and = TRUE, interval = NULL, partialmatch = NULL, output_format = "regulondb_result" )
regulondb |
A regulondb object. |
dataset |
Dataset of interest. Use the function list_datasets for an overview of valid datasets. |
attributes |
Vector of attributes to be retrieved. |
filters |
List of filters to be used. The names should correspond to the attribute and the values correspond to the condition for selection. |
and |
Logical argument. If FALSE, filters will be considered under the "OR" operator |
interval |
the filters whose values will be considered as interval |
partialmatch |
name of the condition(s) with a string pattern for full or partial match in the query |
output_format |
A string specifying the output format. Possible options are "regulondb_result", "GRanges", "DNAStringSet" or "BStringSet". |
By default, a regulondb_result object. If specified in the parameter output_format, it can also return either a GRanges object or a Biostrings object.
Carmina Barberena Jonas, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández, Joselyn Chávez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Obtain all the information from the "GENE" dataset
get_dataset(e_coli_regulondb, dataset = "GENE")
## Get the attributes posright and name from the "GENE" dataset
get_dataset(e_coli_regulondb,
    dataset = "GENE",
    attributes = c("posright", "name")
)
## From "GENE" dataset, get the gene name, strand, posright, product name
## and id of all genes regulated with name like "ara", strand as "forward"
## with a position right between 2000 and 40000
get_dataset(
    e_coli_regulondb,
    dataset = "GENE",
    attributes = c("name", "strand", "posright", "product_name", "id"),
    filters = list(
        name = c("ara"),
        strand = c("forward"),
        posright = c("2000", "40000")
    ),
    and = TRUE,
    partialmatch = "name",
    interval = "posright"
)
Retrieve genomic elements from RegulonDB.
get_dna_objects( regulondb, genome = "eschColi_K12", grange = GRanges("chr", IRanges(1, 5000)), elements = "gene" )
regulondb |
A regulondb object. |
genome |
A valid UCSC genome name. |
grange |
A GRanges object. |
elements |
A character vector specifying which annotation elements to plot. It can be any from: |
A GenomicRanges::GRanges object with the elements found.
Joselyn Chavez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) {
    regulondb_conn <- connect_database()
}
## Build the regulondb object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "chr",
    database_version = "1",
    genome_version = "1"
)
## Get all genes from E. coli
get_dna_objects(e_coli_regulondb)
## Get genes providing Genomic Ranges
grange <- GenomicRanges::GRanges(
    "chr",
    IRanges::IRanges(5000, 10000)
)
get_dna_objects(e_coli_regulondb, grange)
## Get additional elements within genomic positions
get_dna_objects(e_coli_regulondb,
    grange,
    elements = c("gene", "promoter")
)
Given a list of genes (name, bnumber or GI), get all transcription factors or genes that regulate them. The effect of regulators over the gene of interest can be positive (+), negative (-) or dual (+/-)
get_gene_regulators(regulondb, genes, format = "multirow", output.type = "TF")
regulondb |
A regulondb class. |
genes |
Vector of genes (name, bnumber or GI). |
format |
Output format: multirow, onerow, table |
output.type |
How regulators will be represented: "TF"/"GENE" |
A regulondb_result object.
Carmina Barberena Jonas, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández, Joselyn Chávez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Get Transcription factors that regulate araC in one row
get_gene_regulators(
    e_coli_regulondb,
    genes = c("araC"),
    output.type = "TF",
    format = "onerow"
)
## Get genes that regulate araC in table format
get_gene_regulators(
    e_coli_regulondb,
    genes = c("araC"),
    output.type = "GENE",
    format = "table"
)
Given a list of genes (id, name, bnumber or gi), get the gene synonyms (name, bnumber or gi).
get_gene_synonyms( regulondb, genes, from = "name", to = c("id", "name", "bnumber", "gi") )
regulondb |
A regulondb object. |
genes |
Character vector of gene identifiers (id, name, bnumber or gi). |
from |
A character string with the type of the input identifiers: id, name, bnumber or gi. |
to |
A character vector with the identifier types to return: any of id, name, bnumber or gi. |
A regulondb_result object.
Jesús Emiliano Sotelo Fonseca
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Lists all available identifiers for "araC"
get_gene_synonyms(e_coli_regulondb, "araC", from = "name")
## Retrieve only the ID
get_gene_synonyms(e_coli_regulondb, "araC", from = "name", to = "id")
## Use an ID to retrieve the synonyms
get_gene_synonyms(e_coli_regulondb, "ECK120000998", from = "id")
This function retrieves all the regulation networks in RegulonDB between TF-TF, GENE-GENE or TF-GENE, depending on the parameter 'type'.
get_regulatory_network( regulondb, regulator = NULL, type = "TF-GENE", cytograph = FALSE )
regulondb |
A regulondb object. |
regulator |
Name of TF or gene that acts as regulator. If NULL (the default), all networks of the requested type are retrieved. |
type |
"TF-GENE", "TF-TF", "GENE-GENE" |
cytograph |
If TRUE, displays network in Cytoscape. This option requires previous installation and launch of Cytoscape. |
A regulondb_result object.
Carmina Barberena Jonas, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández, Joselyn Chávez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Retrieve regulation of 'araC'
get_regulatory_network(e_coli_regulondb,
    regulator = "AraC",
    type = "TF-GENE"
)
## Retrieve all GENE-GENE networks
get_regulatory_network(e_coli_regulondb, type = "GENE-GENE")
## Retrieve TF-GENE network of AraC and display in Cytoscape
## Note that Cytoscape needs to be open for this to work
cytoscape_present <- try(RCy3::cytoscapePing(), silent = TRUE)
if (!is(cytoscape_present, "try-error")) {
    get_regulatory_network(
        e_coli_regulondb,
        regulator = "AraC",
        type = "TF-GENE",
        cytograph = TRUE
    )
}
This function takes the output of get_gene_regulators() with format multirow, onerow or table, or a vector with genes, and retrieves information about the TFs and their regulated genes.
get_regulatory_summary(regulondb, gene_regulators)
regulondb |
A regulondb object. |
gene_regulators |
Result from get_gene_regulators(), or a character vector of genes. |
A data frame with the following columns:
The name or gene of TF
Regulated Genes per TF
Percent of regulated genes per TF
positive, negative or dual regulation
Name(s) of regulated genes
Carmina Barberena Jonas, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández, Joselyn Chávez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Get the araC regulators
araC_regulation <- get_gene_regulators(
    e_coli_regulondb,
    genes = c("araC"),
    format = "multirow",
    output.type = "TF"
)
## Summarize the araC regulation
get_regulatory_summary(e_coli_regulondb, araC_regulation)
## Retrieve summary of genes 'araC' and 'modB'
get_regulatory_summary(e_coli_regulondb,
    gene_regulators = c("araC", "modB")
)
## Obtain the summary for 'ECK120000050' and 'modB'
get_regulatory_summary(e_coli_regulondb,
    gene_regulators = c("ECK120000050", "modB")
)
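The kind of per-TF summary described above (regulated genes per TF and their percentage) can be illustrated without a database connection. The sketch below uses a made-up toy table and base R only; the column names, the gene and TF labels, and the definition of the percentage (share of the distinct genes in the table) are assumptions for illustration and may not match what get_regulatory_summary() computes.

```r
# Made-up toy table (NOT real RegulonDB content): one row per
# regulator-gene interaction, similar in spirit to a "multirow" result.
toy_regulation <- data.frame(
    genes = c("geneA", "geneA", "geneB", "geneC"),
    regulators = c("TF1", "TF2", "TF1", "TF1"),
    effect = c("+", "-", "+", "-"),
    stringsAsFactors = FALSE
)

# Number of distinct regulated genes per regulator
genes_per_tf <- tapply(
    toy_regulation$genes,
    toy_regulation$regulators,
    function(x) length(unique(x))
)

# Assumed definition: percent of the distinct genes in the table
# that each regulator controls
total_genes <- length(unique(toy_regulation$genes))
toy_summary <- data.frame(
    TF = names(genes_per_tf),
    regulated_genes = as.integer(genes_per_tf),
    percent = 100 * as.numeric(genes_per_tf) / total_genes,
    stringsAsFactors = FALSE
)
toy_summary
```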
Given a gene identifier, return the most likely gene_id type.
guess_id(gene, regulondb)
gene |
Character vector of gene identifiers (id, name, bnumber or gi). |
regulondb |
A regulondb object. |
A character(1) vector with the name column guessed value.
Jesús Emiliano Sotelo Fonseca
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## Lists all available identifiers for "araC"
## Guess name
guess_id("araC", e_coli_regulondb)
## Guess id
guess_id("ECK120000050", e_coli_regulondb)
## Guess bnumber
guess_id("b0064", e_coli_regulondb)
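As a rough illustration of what guessing an identifier type can involve, the sketch below classifies an identifier by its shape alone. The helper sketch_guess_id() and its regular expressions are hypothetical, inferred only from the identifiers used in the examples above; the actual guess_id() takes a regulondb object and may instead look the identifier up in the database.

```r
# Hypothetical pattern-based guess (NOT the regutools implementation).
# The patterns are assumptions based on the example identifiers only.
sketch_guess_id <- function(gene) {
    if (grepl("^ECK[0-9]+$", gene)) {
        # Looks like a RegulonDB-style id such as "ECK120000050"
        "id"
    } else if (grepl("^b[0-9]{4}$", gene)) {
        # Looks like a bnumber such as "b0064"
        "bnumber"
    } else if (grepl("^[0-9]+$", gene)) {
        # Purely numeric: assume a GI number
        "gi"
    } else {
        # Fall back to treating the value as a gene name
        "name"
    }
}

guesses <- vapply(
    c("araC", "ECK120000050", "b0064"),
    sketch_guess_id,
    character(1)
)
guesses
```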
List all attributes and their description of a dataset from RegulonDB. The result of this function may be used as the 'attributes' parameter of the get_dataset() function.
list_attributes(regulondb, dataset)
regulondb |
A regulondb object. |
dataset |
Dataset of interest. The name should correspond to a table of the database. |
A character vector with the field names.
Carmina Barberena Jonás, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández, Joselyn Chavez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## List the transcription factor attributes
list_attributes(e_coli_regulondb, "TF")
## List the operon attributes
list_attributes(e_coli_regulondb, "OPERON")
This function returns a vector of all available tables from a regulondb class.
list_datasets(regulondb)
regulondb |
A regulondb class. |
A character() with the names of the available datasets.
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build the regulon db object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
## List the available datasets
list_datasets(e_coli_regulondb)
Given a list of filters, this function builds a logical condition to query the database using intervals. The output is used in build_condition().
non_existing_intervals(filters, interval, operator, partialmatch)
filters |
List of filters to be used. The names should correspond to the attribute and the values correspond to the condition for selection. |
interval |
the filters whose values will be considered as interval |
operator |
A string indicating if all the filters (AND) or some of them (OR) should be met. |
partialmatch |
name of the condition(s) with a string pattern for full or partial match in the query. |
A character(1) with the SQL logical condition to query the dataset.
Carmina Barberena Jonás, Jesús Emiliano Sotelo Fonseca, José Alquicira Hernández
## Build the SQL query for finding non-existing intervals for the gene ara
non_existing_intervals(
    filters = list(name = "ara", strand = "for"),
    interval = NULL,
    operator = "AND",
    partialmatch = c("name", "strand")
)
Plot annotation elements within genomic region
plot_dna_objects( regulondb, genome = "eschColi_K12", grange = GRanges("chr", IRanges(1, 5000)), elements = "gene" )
regulondb |
A regulondb object. |
genome |
A valid UCSC genome name. |
grange |
A GRanges object. |
elements |
A character vector specifying which annotation elements to plot. It can be any from: |
A plot with genomic elements found within a genome region, including genes and regulators.
Joselyn Chavez
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) {
    regulondb_conn <- connect_database()
}
## Build the regulondb object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "chr",
    database_version = "1",
    genome_version = "1"
)
## Plot some genes from E. coli using default parameters
plot_dna_objects(e_coli_regulondb)
## Plot genes providing Genomic Ranges
grange <- GenomicRanges::GRanges(
    "chr",
    IRanges::IRanges(5000, 10000)
)
plot_dna_objects(e_coli_regulondb, grange)
## Plot additional elements within genomic positions
plot_dna_objects(e_coli_regulondb,
    grange,
    elements = c("gene", "promoter")
)
The regulondb() function is a constructor function of a regulondb class.
regulondb(database_conn, organism, genome_version, database_version)
database_conn |
An SQLiteConnection-class connection to the RegulonDB database made with connect_database(). |
organism |
A character vector with the name of the organism of the database. |
genome_version |
A character vector with the version of the genome build. |
database_version |
A character vector with the version of regulondb build. |
A regulondb object.
## Connect to the RegulonDB database if necessary
if (!exists("regulondb_conn")) regulondb_conn <- connect_database()
## Build a regulondb object
e_coli_regulondb <- regulondb(
    database_conn = regulondb_conn,
    organism = "E.coli",
    database_version = "1",
    genome_version = "1"
)
The regulondb_result class is an extension of the DataFrame class, with additional slots that host information about the database used to obtain these results.
organism
A character string with the name of the organism of the database.
genome_version
A character string with the version of the genome build.
database_version
A character string with the version of regulondb build.
dataset
A character string with the name of the table used for the query in get_dataset().
The regulondb class is an extension of SQLiteConnection which, as the name suggests, consists of an SQLite connection to a database with the table design of the RegulonDB database. In addition to the slots defined in the SQLiteConnection object, the regulondb class contains slots to store information about database versions, organism information and genome build versions.
organism
A character vector with the name of the organism of the database.
genome_version
A character vector with the version of the genome build.
database_version
A character vector with the version of regulondb build.
Methods for regulondb objects
## S4 method for signature 'regulondb' show(object)
object |
A regulondb object |
A regulondb object.