Title: | Metabolite Mashing with R |
---|---|
Description: | A package to merge, filter, sort, organise and otherwise mash together metabolite annotation tables. Metabolite annotations can be imported from multiple sources (software) and combined using workflow steps based on S4 class templates derived from the `struct` package. Other modular workflow steps such as filtering, merging, splitting, normalisation and REST API queries are included. |
Authors: | Gavin Rhys Lloyd [aut, cre] , Ralf Johannes Maria Weber [aut] |
Maintainer: | Gavin Rhys Lloyd <[email protected]> |
License: | GPL-3 |
Version: | 1.1.0 |
Built: | 2024-10-30 08:26:15 UTC |
Source: | https://github.com/bioc/MetMashR |
A wrapper around dplyr::left_join. Adds columns to an annotation table by performing a left-join with an input data.frame (annotations on the left of the join).
add_columns(new_columns, by, ...)
new_columns |
(data.frame, annotation_database) A data.frame to be left-joined to the annotation table. Can also be an annotation_database. |
by |
(character) A (named) character vector of column names to
join by e.g. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
An add_columns object with the following output slots:
updated |
(annotation_source) The updated annotations as an annotation_source object. |
An add_columns object inherits the following struct classes:
[add_columns] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
M <- add_columns( new_columns = data.frame(), by = "id")
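As a rough illustration of what a left join does to an annotation table, the base R sketch below uses merge() with all.x = TRUE rather than dplyr::left_join or add_columns itself. The data frames and column names (annotations, extra, id, formula) are invented for this example.

```r
# Hypothetical annotation table (left side of the join)
annotations <- data.frame(
    id = c("a", "b", "c"),
    mz = c(100.1, 200.2, 300.3)
)

# Hypothetical table of new columns, keyed by the same "id" column
extra <- data.frame(
    id = c("a", "c"),
    formula = c("C6H12O6", "C3H7NO2")
)

# all.x = TRUE keeps every row of the left table; rows without a match
# in `extra` get NA in the new columns
joined <- merge(annotations, extra, by = "id", all.x = TRUE)
print(joined)
```

Every annotation record is retained, which is the property that makes a left join suitable for adding metadata without dropping rows.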
Adds new columns with the specified labels for each record.
add_labels(labels, replace = FALSE, ...)
labels |
(list) A named list of columns and the label to use for all records in that column. |
replace |
(logical) Replace columns. Allowed values are limited to the following:
The default is
|
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
An add_labels object with the following output slots:
updated |
(annotation_source) The updated annotations as an annotation_source object. |
An add_labels object inherits the following struct classes:
[add_labels] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
M <- add_labels( labels = list(), replace = FALSE)
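The idea of adding one constant label per new column can be sketched in base R. This is not the add_labels implementation: the helper add_constant_columns is hypothetical, and the interpretation of replace (an existing column is overwritten only when it is TRUE) is an assumption made for illustration.

```r
# Hypothetical helper: add each named label as a constant column
add_constant_columns <- function(df, labels, replace = FALSE) {
    for (name in names(labels)) {
        # Assumption: an existing column is only overwritten if replace = TRUE
        if (name %in% colnames(df) && !replace) {
            next
        }
        df[[name]] <- rep(labels[[name]], nrow(df))
    }
    df
}

tbl <- data.frame(id = 1:3, source = c("x", "y", "z"))

# "assay" is new and gets added; "source" already exists and is kept
out <- add_constant_columns(tbl, list(assay = "HILIC_POS", source = "lab"))
print(out)

# With replace = TRUE the existing "source" column is overwritten
out_replaced <- add_constant_columns(tbl, list(source = "lab"), replace = TRUE)
print(out_replaced)
```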
Display a bar chart of labels in the specified column of an annotation_source.
annotation_bar_chart( factor_name, label_rotation = FALSE, label_location = "inside", label_type = "percent", legend = FALSE, ... )
factor_name |
(character) The name of the column in the
|
label_rotation |
(logical) Rotate labels. Allowed values are limited to the following:
The default is |
label_location |
(character) Label location. Allowed values are limited to the following:
The default is |
label_type |
(character) Label type. Allowed values are limited to the following:
The default is
|
legend |
(logical) Display legend. Allowed values are limited to the following:
The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
ggplot2
An annotation_bar_chart object. This object has no output slots. See chart_plot in the struct package to plot this chart object.
An annotation_bar_chart object inherits the following struct classes:
[annotation_bar_chart] -> [chart] -> [struct_class]
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
M <- annotation_bar_chart( factor_name = "V1", label_location = "inside", label_rotation = FALSE, legend = FALSE, label_type = "percent")
An annotation_database is an annotation_source() where the imported data.frame contains metadata for annotations. For example, it might be a table of molecular identifiers, associated pathways etc.
annotation_database(data = data.frame(), tag = "", ...)
data |
(data.frame, NULL) A data.frame of annotation data. The
default is |
tag |
(character) A (short) character string that is used to
represent this source e.g. in column names or source columns when
used in a workflow. The default is |
... |
Additional slots and values passed to |
An annotation_database object. This object has no output slots.
An annotation_database object inherits the following struct classes:
[annotation_database] -> [annotation_source] -> [struct_class]
Other annotation databases: AnnotationDb_database, GO_database, annotation_source, excel_database, rdata_database, rds_cache, rds_database
Other annotation sources: annotation_table, cd_source, ls_source, mspurity_source
M <- annotation_database( tag = character(0), data = data.frame(), source = "ANY")
Display a histogram of values in the specified column of an annotation_source.
annotation_histogram( factor_name, bins = 30, bin_edge = "grey", bin_fill = "lightgrey", vline = NULL, vline_colour = "red", ... )
factor_name |
(character) The name of the column in the
|
bins |
(numeric, integer) The number of bins to use when
computing the histogram. The default is |
bin_edge |
(character) The colour to use when plotting the edges
of bins. The default is |
bin_fill |
(character) The colour to use when plotting the bins.
The default is |
vline |
(numeric, NULL, list) The x-axis location of vertical
lines used to indicate e.g. upper and lower limits. Use NULL if not
required. The default is |
vline_colour |
(character) The colour to use when plotting
vertical lines. The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
ggplot2
An annotation_histogram object. This object has no output slots. See chart_plot in the struct package to plot this chart object.
An annotation_histogram object inherits the following struct classes:
[annotation_histogram] -> [chart] -> [struct_class]
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
M <- annotation_histogram( factor_name = "V1", bins = 30, bin_edge = "grey", bin_fill = "lightgrey", vline = NULL, vline_colour = "red")
Display a histogram of values in the specified columns of an annotation_source.
annotation_histogram2d( factor_name, bins = 30, bin_edge = "grey", bin_fill = "lightgrey", vline = NULL, vline_colour = "red", ... )
factor_name |
(character) The names of the two columns in the
|
bins |
(numeric, integer) The number of bins to use when
computing the histograms. The default is |
bin_edge |
(character) The colour to use when plotting the edges
of bins. The default is |
bin_fill |
(character) The colour to use when plotting the bins.
The default is |
vline |
(numeric, NULL, list) The x-axis location of lines used
to indicate e.g. upper and lower limits. Use NULL if not required. A
2 element list can be provided to set vlines for each |
vline_colour |
(character) The colour to use when plotting
vertical lines. The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
ggplot2
patchwork
An annotation_histogram2d object. This object has no output slots. See chart_plot in the struct package to plot this chart object.
An annotation_histogram2d object inherits the following struct classes:
[annotation_histogram2d] -> [annotation_histogram] -> [chart] -> [struct_class]
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
Pedersen T (2024). patchwork: The Composer of Plots. R package version 1.2.0, https://CRAN.R-project.org/package=patchwork.
M <- annotation_histogram2d( factor_name = "V1", bins = 30, bin_edge = "grey", bin_fill = "lightgrey", vline = NULL, vline_colour = "red")
Display a pie chart of labels in the specified column of an annotation_source.
annotation_pie_chart( factor_name, label_rotation = FALSE, label_location = "inside", label_type = "percent", legend = FALSE, pie_rotation = 0, centre_radius = 0, centre_label = NULL, count_na = FALSE, ... )
factor_name |
(character) The name of the column in the
|
label_rotation |
(logical) Rotate labels. Allowed values are limited to the following:
The default is |
label_location |
(character) Label location. Allowed values are limited to the following:
The default is |
label_type |
(character) Label type. Allowed values are limited to the following:
The default is
|
legend |
(logical) Display legend. Allowed values are limited to the following:
The default is |
pie_rotation |
(numeric) The number of degrees to rotate the pie
chart by, clockwise. The default is |
centre_radius |
(numeric, integer) The radius of the centre circle. Used to make a "donut" plot. Should be a value between 0 and
|
centre_label |
(NULL, character) The text to display in the
centre of the pie chart. Mostly used with donut plots where
|
count_na |
(logical) Include the number of missing values in the
pie chart. The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
ggplot2
An annotation_pie_chart object. This object has no output slots. See chart_plot in the struct package to plot this chart object.
An annotation_pie_chart object inherits the following struct classes:
[annotation_pie_chart] -> [chart] -> [struct_class]
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-3-319-24277-4, https://ggplot2.tidyverse.org.
M <- annotation_pie_chart( factor_name = "V1", label_location = "inside", label_rotation = FALSE, legend = FALSE, pie_rotation = 0, label_type = "percent", centre_radius = 0, centre_label = NULL, count_na = FALSE)
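Percentage labels for a pie chart come down to converting label counts into proportions, with missing values either counted as their own slice or ignored. The base R sketch below shows that arithmetic only; it is not taken from annotation_pie_chart, and how the chart computes its labels internally (including how count_na is applied) is an assumption. The label values are invented.

```r
# Hypothetical column of labels, including a missing value
label_values <- c("HILIC", "LIPIDS", "HILIC", NA, "HILIC")

# Counts per label; useNA = "ifany" treats NA as its own category,
# useNA = "no" drops missing values before counting
counts_with_na <- table(label_values, useNA = "ifany")
counts_no_na <- table(label_values, useNA = "no")

# Convert counts to percentages of the total
percent_with_na <- 100 * prop.table(counts_with_na)
percent_no_na <- 100 * prop.table(counts_no_na)

print(round(percent_with_na, 1))
print(round(percent_no_na, 1))
```

Including missing values changes the denominator, so every slice shrinks when NA is counted.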
A base class defining an annotation source. This object is extended by MetMashR to define other objects.
annotation_source(source = character(0), data = data.frame(), tag = "", ...)
source |
(ANY) The source of annotation data. The default is
|
data |
(data.frame, NULL) A data.frame of annotation data. The
default is |
tag |
(character) A (short) character string that is used to
represent this source e.g. in column names or source columns when
used in a workflow. The default is |
... |
Additional slots and values passed to |
An annotation_source object. This object has no output slots.
An annotation_source object inherits the following struct classes:
[annotation_source] -> [struct_class]
Other annotation databases: AnnotationDb_database, GO_database, annotation_database, excel_database, rdata_database, rds_cache, rds_database
M <- annotation_source( tag = character(0), data = data.frame(), source = "ANY")
An annotation_table is an annotation_source() where the imported data.frame contains measured experimental data. An id_column of values is required to uniquely identify each record (row) in the table (NB these are NOT molecule identifiers, which may be present in multiple records).
annotation_table(data = data.frame(), tag = "", id_column = NULL, ...)
data |
(data.frame, NULL) A data.frame of annotation data. The
default is |
tag |
(character) A (short) character string that is used to
represent this source e.g. in column names or source columns when
used in a workflow. The default is |
id_column |
(character) The column name of the annotation
data.frame containing row identifiers. If NULL, this will be generated
automatically. The default is |
... |
Additional slots and values passed to |
An annotation_table object. This object has no output slots.
An annotation_table object inherits the following struct classes:
[annotation_table] -> [annotation_source] -> [struct_class]
Other annotation tables: cd_source, ls_source
Other annotation sources: annotation_database, cd_source, ls_source, mspurity_source
M <- annotation_table( id_column = "id", tag = character(0), data = data.frame(), source = "ANY")
Display an UpSet chart of labels in the specified column of an annotation_source.
annotation_upset_chart( factor_name, group_column = NULL, width_ratio = 0.2, xlabel = "group", sort_intersections = "descending", intersections = "observed", n_intersections = NULL, min_size = 0, queries = list(), keep_empty_groups = FALSE, ... )
factor_name |
(character) The name of the column(s) in the
|
group_column |
(character, NULL) The name of the column in the
|
width_ratio |
(numeric) Proportion of plot given to set size bar
chart. The default is |
xlabel |
(character) The label used for the x-axis. The default
is |
sort_intersections |
(character) Sort intersections. Allowed values are limited to the following:
The default is
|
intersections |
(character, list) The intersections to include
in the plot. The default is |
n_intersections |
(numeric, integer, NULL) The number of
intersections to include in the plot. The default is |
min_size |
(numeric, integer) The minimum size of an
intersection for it to be included in the plot. The default is
|
queries |
(list) A list of upset queries. The default is
|
keep_empty_groups |
(logical) Whether empty sets should be kept
(including sets which are only empty after filtering by size). The
default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
ComplexUpset
An annotation_upset_chart object. This object has no output slots. See chart_plot in the struct package to plot this chart object.
An annotation_upset_chart object inherits the following struct classes:
[annotation_upset_chart] -> [chart] -> [struct_class]
Krassowski M (2020). "ComplexUpset." doi:10.5281/zenodo.3700590, https://doi.org/10.5281/zenodo.3700590.
Lex A, Gehlenborg N, Strobelt H, Vuillemot R, Pfister H (2014). "UpSet: Visualization of Intersecting Sets." IEEE Transactions on Visualization and Computer Graphics, 20(12), 1983–1992. doi:10.1109/TVCG.2014.2346248, https://doi.org/10.1109/TVCG.2014.2346248.
M <- annotation_upset_chart( factor_name = "V1", group_column = NULL, width_ratio = 0.2, xlabel = "group", sort_intersections = "descending", intersections = "observed", n_intersections = NULL, min_size = 0, queries = list(), keep_empty_groups = FALSE)
Display a Venn diagram of labels present in two annotation_sources.
annotation_venn_chart( factor_name, group_column = NULL, fill_colour = "white", line_colour = "black", labels = TRUE, legend = FALSE, ... )
factor_name |
(character) The name of the column(s) in the
|
group_column |
(character, NULL) The name of the column in the
|
fill_colour |
(character) The fill colour of the groups in a
format compatible with ggplot e.g. "black" or "#000000". Special case
".group" sets the colour based on the group label and "none" will not
fill the groups. The default is |
line_colour |
(character) The line colour of the groups in a
format compatible with ggplot e.g. "black" or "#000000". Special case
".group" sets the colour based on the group label, and ".none" will
not display lines. The default is |
labels |
(logical) Group labels. Allowed values are limited to the following:
The default is |
legend |
(logical) Legend. Allowed values are limited to the following:
The
default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
RVenn
ggVennDiagram
An annotation_venn_chart object. This object has no output slots. See chart_plot in the struct package to plot this chart object.
An annotation_venn_chart object inherits the following struct classes:
[annotation_venn_chart] -> [chart] -> [struct_class]
Akyol T (2019). RVenn: Set Operations for Many Sets. R package version 1.1.0, https://CRAN.R-project.org/package=RVenn.
Gao C, Dusa A (2024). ggVennDiagram: A 'ggplot2' Implement of Venn Diagram. R package version 1.5.2, https://CRAN.R-project.org/package=ggVennDiagram.
M <- annotation_venn_chart( factor_name = "V1", line_colour = ".group", fill_colour = ".group", labels = FALSE, legend = FALSE, group_column = NULL)
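The regions of a two-set Venn diagram correspond to simple set operations on the labels from each source. The base R sketch below computes the region sizes directly; it does not use annotation_venn_chart, RVenn or ggVennDiagram, and the identifier values are invented.

```r
# Hypothetical identifier columns from two annotation sources
source_a <- c("M1", "M2", "M3", "M4")
source_b <- c("M3", "M4", "M5")

# The three regions of a two-set Venn diagram
only_a <- setdiff(source_a, source_b)
only_b <- setdiff(source_b, source_a)
both <- intersect(source_a, source_b)

region_sizes <- c(
    only_a = length(only_a),
    both = length(both),
    only_b = length(only_b)
)
print(region_sizes)
```

The region sizes sum to the number of distinct labels across both sources, which is a quick sanity check when comparing against a plotted diagram.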
Retrieve a table from an AnnotationDb package.
AnnotationDb_database(source, table, ...)
source |
(character) The name of an AnnotationDb package to import the specified table from. Note the package should already be installed. |
table |
(character) The name of a table to import from the specified source AnnotationDb package. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
AnnotationDbi
An AnnotationDb_database object. This object has no output slots.
An AnnotationDb_database object inherits the following struct classes:
[AnnotationDb_database] -> [annotation_database] -> [annotation_source] -> [struct_class]
Pagès H, Carlson M, Falcon S, Li N (2023). AnnotationDbi: Manipulation of SQLite-based annotations in Bioconductor. doi:10.18129/B9.bioc.AnnotationDbi, R package version 1.64.1, https://bioconductor.org/packages/AnnotationDbi.
Other annotation databases: GO_database, annotation_database, annotation_source, excel_database, rdata_database, rds_cache, rds_database
M <- AnnotationDb_database( table = character(0), tag = character(0), data = data.frame(), source = character(0))
A wrapper around AnnotationDbi::select() that can be used to import columns from the database where the keys are provided by a column in the annotation table.
AnnotationDb_select( database, key_column, key_type, database_columns, drop_na = TRUE, ... )
database |
(character) The name of the AnnotationDbi package/object to import. |
key_column |
(character) The name of a column in the annotation table containing key values used to extract records from the AnnotationDbi database. |
key_type |
(character) The name of a column in the AnnotationDb database searched for matches to the key values. |
database_columns |
(character) The name of columns to import
from the AnnotationDb database. Special case |
drop_na |
(logical) Drop NA. Allowed values are limited to the following:
The default
is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
AnnotationDbi
An AnnotationDb_select object with the following output slots:
updated |
(annotation_source) The updated annotations as an annotation_source object. |
An AnnotationDb_select object inherits the following struct classes:
[AnnotationDb_select] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
Pagès H, Carlson M, Falcon S, Li N (2023). AnnotationDbi: Manipulation of SQLite-based annotations in Bioconductor. doi:10.18129/B9.bioc.AnnotationDbi, R package version 1.64.1, https://bioconductor.org/packages/AnnotationDbi.
M <- AnnotationDb_select( database = "", key_column = "", key_type = "", database_columns = ".all", drop_na = FALSE)
A cached resource using BiocFileCache.
BiocFileCache_database( source, bfc_path = NULL, resource_name, bfc_fun = cache_as_is, import_fun = read.csv, offline = FALSE, ... )
source |
(ANY) The source of annotation data. |
bfc_path |
(character, NULL) |
resource_name |
(character) The name given to this resource in
the cache. (see |
bfc_fun |
(function) A function to process the object before
storing it in the cache, e.g. to store an unzipped file in the cache
instead of the zipped version. This would prevent needing to unzip
the resource each time it is retrieved from the cache, but would mean
using more space on disk. The default function does nothing to the
resource. See |
import_fun |
(function) A function to process the object after retrieving it from the cache e.g. it might need to be unzipped before importing as a data.frame. This function should take the path to the cached object as the first input and return a data.frame. |
offline |
(logical) If |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
BiocFileCache
A BiocFileCache_database object. This object has no output slots.
A BiocFileCache_database object inherits the following struct classes:
[BiocFileCache_database] -> [annotation_database] -> [annotation_source] -> [struct_class]
Shepherd L, Morgan M (2024). BiocFileCache: Manage Files Across Sessions. R package version 2.10.2.
Other database: sqlite_database
M <- BiocFileCache_database( bfc_path = NULL, resource_name = "bfc", bfc_fun = function(){}, import_fun = function(){}, offline = FALSE, tag = character(0), data = data.frame(), source = "ANY")
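The import_fun argument is described as a function that takes the path to the cached object as its first input and returns a data.frame. A minimal sketch of such a function follows, assuming the cached resource is a gzip-compressed CSV file. The name read_gz_csv is hypothetical and not part of MetMashR, and a temporary file stands in for a real cache entry.

```r
# Hypothetical import function: path in, data.frame out
read_gz_csv <- function(path, ...) {
    # gzfile() decompresses on the fly, so nothing is unzipped to disk;
    # read.csv() opens and closes the connection itself
    read.csv(gzfile(path), ...)
}

# Stand-in for a cached file: write a small compressed CSV to a temp path
cached_path <- tempfile(fileext = ".csv.gz")
write.csv(
    data.frame(id = 1:2, name = c("glucose", "alanine")),
    gzfile(cached_path),
    row.names = FALSE
)

imported <- read_gz_csv(cached_path)
print(imported)
```

A function of this shape could then be supplied as import_fun = read_gz_csv, in the same way the usage above supplies read.csv.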
This helper function is for use with BiocFileCache objects. Using it will copy the file directly to the cache without making any changes.
cache_as_is(from, to)
from |
incoming path |
to |
the outgoing path |
TRUE if successful
M <- BiocFileCache_database( source = tempfile(), resource_name = "example", bfc_fun = cache_as_is )
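A function with the behaviour described for cache_as_is (copy from one path to another, return TRUE on success) could look like the base R sketch below. It is not the cache_as_is source: copy_unchanged is a hypothetical name, and using file.copy() as the copy mechanism is an assumption.

```r
# Hypothetical equivalent: copy `from` to `to` without modifying the file
copy_unchanged <- function(from, to) {
    # file.copy() returns TRUE when the copy succeeded
    file.copy(from = from, to = to, overwrite = TRUE)
}

# Temporary files stand in for the incoming resource and the cache location
src <- tempfile(fileext = ".txt")
dest <- tempfile(fileext = ".txt")
writeLines(c("id,mz", "a,100.1"), src)

ok <- copy_unchanged(src, dest)
print(ok)

# The copied file has the same contents as the original
print(identical(readLines(src), readLines(dest)))
```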
Calculate ppm difference between two columns in an annotation_table, e.g. for comparing observed m/z to theoretical ones.
calc_ppm_diff( obs_mz_column, ref_mz_column, out_column, check_names = "unique", ... )
obs_mz_column |
(character) Column name in annotation_table containing the observed m/z values. |
ref_mz_column |
(character) Column name in annotation table containing the reference (theoretical) m/z values. |
out_column |
(character) Column name in annotation table to store the computed ppm differences. |
check_names |
(character) Check names. Allowed values are limited to the following:
The default is
|
... |
Additional slots and values passed to |
A calc_ppm_diff object with the following output slots:
updated |
(annotation_table) The input annotation source with the computed ppm differences in a new column. |
A calc_ppm_diff object inherits the following struct classes:
[calc_ppm_diff] -> [model] -> [struct_class]
M <- calc_ppm_diff( obs_mz_column = character(0), ref_mz_column = "reference (theoretical) m/z values.", out_column = character(0), check_names = "unique")
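A ppm difference is conventionally computed as 1e6 * (observed - reference) / reference. The sketch below assumes that convention, including the sign; the exact formula used inside calc_ppm_diff is not stated in this manual. The function name ppm_diff, the column names and the m/z values are invented for illustration.

```r
# Hypothetical annotation data with observed and reference m/z columns
df <- data.frame(
    obs_mz = c(180.0634, 89.0477),
    ref_mz = c(180.0630, 89.0480)
)

# Assumed convention: positive when the observed m/z is above the reference
ppm_diff <- function(observed, reference) {
    1e6 * (observed - reference) / reference
}

# Store the result in a new column, as the out_column argument suggests
df$ppm_diff <- ppm_diff(df$obs_mz, df$ref_mz)
print(df)
```

Dividing by the reference value makes the error relative, so the same ppm threshold can be applied across the whole m/z range.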
Calculate RT difference between two RT values
calc_rt_diff( obs_rt_column, ref_rt_column, out_column, check_names = "unique", ... )
obs_rt_column |
(character) Column name in annotation table containing the observed (measured) RT values. |
ref_rt_column |
(character) Column name in annotation table containing the reference (theoretical) RT values. |
out_column |
(character) Column name in annotation table to store the computed RT differences. |
check_names |
(character) Check names. Allowed values are limited to the following:
The default is
|
... |
Additional slots and values passed to |
A calc_rt_diff object with the following output slots:
updated |
(annotation_table) The input annotation source with the newly generated column. |
A calc_rt_diff object inherits the following struct classes:
[calc_rt_diff] -> [model] -> [struct_class]
M <- calc_rt_diff( obs_rt_column = character(0), ref_rt_column = character(0), out_column = character(0), check_names = "unique")
An LCMS table extends annotation_table() to represent annotation data for an LCMS experiment. Columns representing m/z and retention time are required for an lcms_table.
cd_source( source, sheets = c(1, 1), tag = "CD", mz_column = "mz", rt_column = "rt", id_column = "id", data = NULL, ... )
source |
(character) The path to the Compound Discoverer Excel files to import. Both the compounds and isomers file should be included, in that order. |
sheets |
(character, numeric, integer) The name or index of the
sheets to read from the source file(s). A sheet should be provided
for each input file. The default is |
tag |
(character) A (short) character string that is used to
represent this source e.g. in column names or source columns when
used in a workflow. The default is |
mz_column |
(character) The column name of the annotation
data.frame containing m/z values. The default is |
rt_column |
(character) The column name of the annotation
data.frame containing retention time values. The default is
|
id_column |
(character) The column name of the annotation
data.frame containing row identifiers. If NULL, this will be generated
automatically. The default is |
data |
(data.frame, NULL) A data.frame of annotation data. The
default is |
... |
Additional slots and values passed to |
A cd_source object. This object has no output slots.
A cd_source object inherits the following struct classes:
[cd_source] -> [lcms_table] -> [annotation_table] -> [annotation_source] -> [struct_class]
Other annotation sources: annotation_database, annotation_table, ls_source, mspurity_source
Other annotation tables: annotation_table, ls_source
M <- cd_source( sheets = c(2, 2), mz_column = "mz", rt_column = "rt", id_column = "id", tag = character(0), data = data.frame(), source = character(0))
Plots a chart object
## S4 method for signature 'annotation_bar_chart,annotation_source'
chart_plot(obj, dobj)
## S4 method for signature 'annotation_histogram,annotation_source'
chart_plot(obj, dobj)
## S4 method for signature 'annotation_histogram2d,annotation_source'
chart_plot(obj, dobj)
## S4 method for signature 'annotation_pie_chart,annotation_source'
chart_plot(obj, dobj)
## S4 method for signature 'annotation_upset_chart,annotation_source'
chart_plot(obj, dobj, ...)
## S4 method for signature 'annotation_upset_chart,list'
chart_plot(obj, dobj)
## S4 method for signature 'annotation_venn_chart,annotation_source'
chart_plot(obj, dobj, ...)
## S4 method for signature 'annotation_venn_chart,list'
chart_plot(obj, dobj)
## S4 method for signature 'mwb_structure,annotation_source'
chart_plot(obj, dobj)
## S4 method for signature 'openbabel_structure,character'
chart_plot(obj, dobj)
## S4 method for signature 'openbabel_structure,annotation_source'
chart_plot(obj, dobj)
## S4 method for signature 'pubchem_structure,annotation_source'
chart_plot(obj, dobj)
## S4 method for signature 'pubchem_widget,annotation_source'
chart_plot(obj, dobj)
obj |
a chart object |
dobj |
a struct object |
... |
additional inputs to chart_plot |
a plot object
C <- example_chart()
chart_plot(C, example_model())
annotation_source
This method checks for the presence of columns by name in an annotation_source(). It returns TRUE if all are present, or a vector of messages indicating which columns are missing from the data.frame. It is used by MetMashR to ensure validity of certain objects.
check_for_columns(obj, ..., msg = FALSE)

## S4 method for signature 'annotation_source'
check_for_columns(obj, ..., msg = FALSE)
obj |
an annotation_source object |
... |
the column names to check for |
msg |
TRUE/FALSE indicates whether to return a message if some columns
are missing. If |
logical if all columns are present, or a vector of messages if requested.
# test if column present
AT <- annotation_source(data = data.frame(id = character(0)))
check_for_columns(AT, "id") # TRUE
check_for_columns(AT, "cake") # FALSE

# return a message if missing
check_for_columns(AT, "cake", msg = TRUE)
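The behaviour described above (TRUE when every column is present, otherwise FALSE or a vector of messages) can be sketched in base R on a plain data.frame. This is only an illustration: `check_columns_sketch` is a hypothetical helper, not the MetMashR method, and the message wording is invented.

```r
# Hypothetical base-R sketch of the column check; not the MetMashR implementation.
check_columns_sketch <- function(df, ..., msg = FALSE) {
  wanted <- c(...)
  missing_cols <- setdiff(wanted, colnames(df))
  if (length(missing_cols) == 0) {
    return(TRUE)
  }
  if (msg) {
    # one (made-up) message per missing column
    return(paste0('Column "', missing_cols, '" is missing.'))
  }
  FALSE
}

df <- data.frame(id = character(0))
present <- check_columns_sketch(df, "id")
absent <- check_columns_sketch(df, "cake")
absent_msg <- check_columns_sketch(df, "cake", msg = TRUE)
```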
Queries the ClassyFire database by inchikey to obtain chemical ontology information.
classyfire_lookup( query_column, output_items = "kingdom", output_fields = "name", suffix = "_cf", ... )
query_column |
(character) The name of a column in the annotation table containing values to search in the api call. |
output_items |
(character) The names of the items to return from
the results of the search. Can include any number of "kingdom",
"superclass", "class", "subclass", "direct_parent",
"intermediate_nodes", "substituents", "smiles",
"molecular_framework", "description", "ancestors",
"predicted_chebi_terms". Keyword ".all" may be used to return all
items. The default is "kingdom". |
output_fields |
(character) The names of fields to return for
each output_item. Can include any of "name", "description",
"chemont_id" and "url". Keyword ".all" may be used to return all
fields. Some items do not have fields, in which case output_fields is ignored.
The default is "name". |
suffix |
(character) A suffix appended to all column names in
the returned result. The default is "_cf". |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
httr
A classyfire_lookup object with the following output slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
A classyfire_lookup object inherits the following struct classes:
[classyfire_lookup] -> [rest_api] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
Wickham H (2023). httr: Tools for Working with URLs and HTTP. R package version 1.4.7, https://CRAN.R-project.org/package=httr.
Other REST APIs: kegg_lookup, lipidmaps_lookup, mwb_compound_lookup, rest_api
M <- classyfire_lookup( output_items = "kingdom", output_fields = "name", base_url = "http://classyfire.wishartlab.com/entities", url_template = "<base_url>/<query_column>.json", query_column = character(0), cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
A wrapper for paste() and interaction().
Combines the values in multiple columns row-wise.
combine_columns( column_names, separator = "_", prefix = NULL, suffix = NULL, output_column = "combined", clean = TRUE, ... )
column_names |
(character) The column name(s) in the annotation_source to combine. |
separator |
(character) A string placed between the values being
joined. The default is "_". |
prefix |
(character, NULL) A string placed at the start of the
combined strings. The default is NULL. |
suffix |
(character, NULL) A string placed at the end of the
combined strings. The default is NULL. |
output_column |
(character) The name of a column to store the
combined values in. The default is "combined". |
clean |
(logical) Clean old columns. Allowed values are TRUE and FALSE.
The default is TRUE. |
... |
Additional slots and values passed to |
A combine_columns object with the following output slots:
updated |
(annotation_source) The annotation_source after combining the columns. |
A combine_columns object inherits the following struct classes:
[combine_columns] -> [model] -> [struct_class]
M <- combine_columns( column_names = "V1", separator = "_", output_column = "combined", clean = FALSE, prefix = NULL, suffix = NULL)
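As a rough illustration of the row-wise combination this step wraps, the sketch below uses paste() directly on a plain data.frame. `combine_sketch` is a hypothetical helper, and the assumption that clean = TRUE drops the original columns is mine, not something the text above states.

```r
# Hypothetical base-R sketch of combining columns row-wise with paste().
combine_sketch <- function(df, column_names, separator = "_",
                           prefix = NULL, suffix = NULL,
                           output_column = "combined", clean = TRUE) {
  # paste the selected columns together, one string per row
  parts <- unname(as.list(df[column_names]))
  combined <- do.call(paste, c(parts, sep = separator))
  # NULL prefix/suffix are zero-length and so add nothing
  combined <- paste0(prefix, combined, suffix)
  if (clean) {
    # assumption: "clean" means the original columns are dropped
    df <- df[setdiff(colnames(df), column_names)]
  }
  df[[output_column]] <- combined
  df
}

ann <- data.frame(adduct = c("x", "y"), charge = c(1, 2),
                  stringsAsFactors = FALSE)
out <- combine_sketch(ann, c("adduct", "charge"), prefix = "p:")
```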
Combine annotation records (rows) based on a key. All records with the same key will be combined. A number of helper functions are provided for common approaches to merging records.
combine_records( group_by, default_fcn = fuse(separator = " || "), fcns = list(), ... )
group_by |
(character) The column used as the key for grouping records. |
default_fcn |
(function) The default function to use for
summarising columns when combining records and a specific function
has not been provided in fcns. The default is fuse(separator = " || "). |
fcns |
(list) A named list of functions to use for summarising
named columns when combining records. Names should correspond to the
columns in the annotation table. The default is list(). |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
A combine_records object with the following output slots:
updated |
(annotation_source) The input annotation source with the newly generated column. |
A combine_records object inherits the following struct classes:
[combine_records] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
Lloyd GR, Jankevics A, Weber RJM (2020). "struct: an R/Bioconductor-based framework for standardized metabolomics data analysis and beyond." Bioinformatics, 36(22-23), 5551-5552.
M <- combine_records( fcns = list(), group_by = character(0), default_fcn = function(){})
This page documents helper functions for use with combine_records().
compute_mode(ties = FALSE, na.rm = TRUE)

compute_mean(na.rm = TRUE)

compute_median(na.rm = TRUE)

fuse(separator, na_string = "NA")

select_max(max_col, use_abs = FALSE, keep_NA = FALSE)

select_min(min_col, use_abs = FALSE, keep_NA = FALSE)

select_match(match_col, search_col, separator, na_string = "NA")

select_exact(match_col, match, separator, na_string = "NA")

fuse_unique(
    separator,
    na_string = "NA",
    digits = 6,
    drop_na = FALSE,
    sort = FALSE
)

prioritise(match_col, priority, separator, no_match = NA, na_string = "NA")

nothing()

count_records()

select_grade(grade_col, keep_NA = FALSE, upper_case = TRUE)
ties |
(logical) If TRUE then all records matching the tied groups are returned. Otherwise the first record is returned. |
na.rm |
(logical) If TRUE then NA is ignored |
separator |
(character, NULL) if !NULL this string is used to collapse matches with the same priority |
na_string |
(character) NA values are replaced with this string |
max_col |
(character) the column name to search for the maximum value. |
use_abs |
(logical) If TRUE then the sign of the values is ignored. |
keep_NA |
(logical) If TRUE keeps records with NA values |
min_col |
(character) the column name to search for the minimum value. |
match_col |
(character) the column with labels to prioritise |
search_col |
(character) the name of a column to use as a reference for locating values in the matching column. |
match |
(character) a value to search for in the matching column. |
digits |
(numeric) the number of digits to use when converting numerical values to characters when determining if values are unique. |
drop_na |
(logical) exclude NA from the list of unique entries |
sort |
(logical) sort the values before collapsing. |
priority |
(character) a list of labels in priority order |
no_match |
(character, NULL) if !NULL then annotations not matching any of the priority labels are replaced with this value |
grade_col |
(character) the name of a column containing grades |
upper_case |
(logical) If TRUE then grades are compared to upper case letters to determine their ordering, otherwise lower case. |
A function for use with combine_records().
compute_mode(): returns the most common value, excluding NA. If ties = TRUE then all tied values are returned, otherwise the first value in a sorted unique list is returned (equal to min if numeric). If na.rm = FALSE then NA are included when searching for the modal value and placed last if ties = FALSE (values are returned preferentially over NA).
compute_mean(): calculates the mean value, excluding NA if na.rm = TRUE.
compute_median(): calculates the median value, excluding NA if na.rm = TRUE.
fuse(): collapses multiple matching records into a single string using the provided separator.
select_max(): selects a record based on the index of the maximum value in another column.
select_min(): selects a record based on the index of the minimum value in another column.
select_match(): returns all records based on the indices of identical matches in a second column and collapses them using the provided separator.
select_exact(): returns records based on the index of identical values matching the match parameter within the current column, and collapses them using the provided separator if necessary.
fuse_unique(): collapses a set of records to a set of unique values using the provided separator. digits can be provided for numeric columns to control the precision used when determining unique values.
prioritise(): reduces a set of annotations by prioritising values according to the input. If there are multiple matches with the same priority then they are collapsed using a separator.
nothing(): a pass-through function to allow some annotation table columns to remain unchanged.
count_records(): adds a new column indicating the number of annotations that match the given grouping variable.
select_grade(): returns records based on the index of the best grade in a second list. The best grade is defined as "A" for upper_case = TRUE or "a" for upper_case = FALSE, and the worst grade is "Z" or "z". Any non-exact matches to a character in LETTERS or letters are replaced with NA.
# Select matching records
M <- combine_records(
    group_by = "example",
    default_fcn = select_exact(
        match_col = "match_column",
        match = "find_me",
        separator = ", ",
        na_string = "NA"
    )
)

# Collapse unique values
M <- combine_records(
    group_by = "example",
    default_fcn = fuse_unique(
        digits = 6,
        separator = ", ",
        na_string = "NA",
        sort = FALSE
    )
)

# Prioritise by source
M <- combine_records(
    group_by = "InChiKey",
    default_fcn = prioritise(
        match_col = "source",
        priority = c("CD", "LS"),
        separator = " || "
    )
)

# Do nothing to all columns
M <- combine_records(
    group_by = "InChiKey",
    default_fcn = nothing()
)

# Add a column with the number of records with a matching inchikey
M <- combine_records(
    group_by = "InChiKey",
    fcns = list(
        count = count_records()
    )
)

# Select annotation with highest (best) grade
M <- combine_records(
    group_by = "InChiKey",
    default_fcn = select_grade(
        grade_col = "grade",
        keep_NA = FALSE,
        upper_case = TRUE
    )
)
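To make the helper semantics more concrete, the following base-R sketch mimics a few of them (fuse_unique, compute_mode with ties = FALSE, select_max and count_records) on a toy table. All `*_sketch` functions and the data are hypothetical stand-ins written for this illustration; they do not call MetMashR and may differ from it in edge cases.

```r
# Toy annotation table with made-up values.
ann <- data.frame(
  key    = c("K1", "K1", "K1", "K2", "K2"),
  name   = c("glucose", "Glucose", "glucose", "alanine", NA),
  score  = c(0.7, 0.9, 0.4, 0.5, 0.8),
  source = c("LS", "CD", "LS", "LS", "LS"),
  stringsAsFactors = FALSE
)

# fuse_unique-like: collapse the unique values of a group into one string
fuse_unique_sketch <- function(x, separator = " || ", na_string = "NA") {
  x <- as.character(x)
  x[is.na(x)] <- na_string
  paste(unique(x), collapse = separator)
}

# compute_mode-like (ties = FALSE): most common non-NA value
mode_sketch <- function(x) {
  counts <- table(x) # table() drops NA and sorts the unique values
  names(counts)[which.max(counts)] # which.max() picks the first maximum
}

# select_max-like: value of x at the row where `by` is largest
select_max_sketch <- function(x, by) x[which.max(by)]

# One output row per key, one summary function per column
groups <- split(ann, ann$key)
combined <- data.frame(
  key = names(groups),
  name_fused = vapply(groups, function(g) fuse_unique_sketch(g$name), character(1)),
  name_mode = vapply(groups, function(g) mode_sketch(g$name), character(1)),
  best_source = vapply(groups, function(g) select_max_sketch(g$source, g$score), character(1)),
  n_records = vapply(groups, nrow, integer(1)),
  row.names = NULL,
  stringsAsFactors = FALSE
)
```

Each group collapses to a single row, which is the general idea behind supplying one function per column through fcns and a fallback through default_fcn.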
Annotation tables are joined and matching columns merged.
combine_sources( source_list, matching_columns = NULL, keep_cols = NULL, source_col = "annotation_source", exclude_cols = NULL, tag = "combined", as = annotation_source(name = "combined", description = paste0("A source created by combining two or ", "more sources")), ... )
source_list |
(list) A list of annotation sources to be combined. |
matching_columns |
(character, NULL) A named vector of column
names to be created by merging columns from individual sources. e.g.
|
keep_cols |
(character, NULL) A list of column names to keep in
the combined table (padded with NA) if detected in one of the input
tables. Special case ".all" will keep all columns from all tables.
The default is NULL. |
source_col |
(character) The column name that will be created to
contain a tag to indicate which source the annotation originated
from. The default is "annotation_source". |
exclude_cols |
(NULL, character) Column names to be excluded
from the merged annotation table. Note this is applied after
|
tag |
(character) The tag given to the newly combined table. The
default is "combined". |
as |
(annotation_source) An annotation_source object to use as
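the base class for the combined sources. The default is
annotation_source(name = "combined", description = paste0("A source created by combining two or ", "more sources")). |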
... |
Additional slots and values passed to |
A combine_sources object with the following output slots:
combined_table |
(annotation_source) The annotation table after combining the input tables. |
A combine_sources object inherits the following struct classes:
[combine_sources] -> [model] -> [struct_class]
M <- combine_sources( source_list = list(), matching_columns = NULL, keep_cols = NULL, source_col = "annotation_source", exclude_cols = NULL, tag = "combined", as = annotation_source())
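The idea of merging matching columns and tagging each row with its source can be sketched in base R as below. `stack_sources` is a hypothetical helper, and the format used here for `matching_columns` (a named list mapping each merged column to candidate column names) is an assumption made for the sketch rather than the format MetMashR expects.

```r
# Two toy sources that name the compound column differently (made-up data).
src_a <- data.frame(compound = c("glucose", "alanine"), rt_min = c(1.2, 3.4),
                    stringsAsFactors = FALSE)
src_b <- data.frame(name = "citrate", mz = 191.02, stringsAsFactors = FALSE)

# Hypothetical sketch: stack sources, merging matching columns.
stack_sources <- function(source_list, matching_columns,
                          source_col = "annotation_source") {
  tables <- lapply(names(source_list), function(tag) {
    tbl <- source_list[[tag]]
    cols <- lapply(matching_columns, function(candidates) {
      # use the first candidate found in this table, otherwise pad with NA
      hit <- intersect(candidates, colnames(tbl))
      if (length(hit) > 0) tbl[[hit[1]]] else rep(NA, nrow(tbl))
    })
    # record which source each row came from
    cols[[source_col]] <- rep(tag, nrow(tbl))
    as.data.frame(cols, stringsAsFactors = FALSE)
  })
  do.call(rbind, tables)
}

combined <- stack_sources(
  list(A = src_a, B = src_b),
  matching_columns = list(name = c("compound", "name"))
)
```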
Imports the compounds table of a CompDB source as an annotation_source.
CompoundDb_source(source, tag = "cdb", ...)
source |
(ANY) The source of annotation data. |
tag |
(character) A (short) character string that is used to
represent this source e.g. in column names or source columns when
used in a workflow. The default is "cdb". |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
CompoundDb
A CompoundDb_source object. This object has no output slots.
A CompoundDb_source object inherits the following struct classes:
[CompoundDb_source] -> [annotation_source] -> [struct_class]
Rainer J, Vicini A, Salzer L, Stanstrup J, Badia J, Neumann S, Stravs M, Verri Hernandes V, Gatto L, Gibb S, Witting M (2022). "A Modular and Expandable Ecosystem for Metabolomics Data Annotation in R." Metabolites, 12, 173. doi:10.3390/metabo12020173 https://doi.org/10.3390/metabo12020173, https://www.mdpi.com/2218-1989/12/2/173.
M <- CompoundDb_source( tag = character(0), data = data.frame(), source = "ANY")
Compute values for a new column based on an input column.
compute_column(input_columns, output_column, fcn, ...)
input_columns |
(character) The name of a column in the input table used to compute a new column. |
output_column |
(character) The name of the newly computed column. |
fcn |
(function) The function used to compute the values for the new column. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
A compute_column object with the following output slots:
updated |
(annotation_source) The updated annotations as an
annotation_source object. |
A compute_column object inherits the following struct classes:
[compute_column] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
M <- compute_column( input_columns = character(0), output_column = character(0), fcn = function(){})
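A minimal base-R sketch of deriving a new column from an existing one is shown below. `compute_sketch` is hypothetical, and it assumes the supplied function is vectorised over the whole input column; how MetMashR actually calls fcn is not stated above.

```r
# Hypothetical sketch: apply a function to one column and store the result.
compute_sketch <- function(df, input_column, output_column, fcn) {
  # assumption: fcn takes the whole column and returns one value per row
  df[[output_column]] <- fcn(df[[input_column]])
  df
}

ann <- data.frame(mz = c(180.0634, 89.0477))
ann <- compute_sketch(ann, "mz", "mz_rounded", function(x) round(x, 2))
```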
Compute values for a record based on other values in a record
compute_record(fcn, ...)
fcn |
(function) The function used to compute the values for the record. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
A compute_record object with the following output slots:
updated |
(annotation_source) The updated annotations as an
annotation_source object. |
A compute_record object inherits the following struct classes:
[compute_record] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
M <- compute_record( fcn = function(){})
Search a database (data.frame) for annotation matches based on values in a specified column.
database_lookup( query_column, database_column, database, include = NULL, suffix = NULL, not_found = NA, ... )
query_column |
(character) The annotation table column name to use as the reference for searching the database e.g. "HMDB_ID". |
database_column |
(character) The database column to search for matches to the values in query_column. |
database |
(data.frame, annotation_database) A database to be
searched. Can be a |
include |
(character, NULL) The name of the database columns to
be added to the annotations. If NULL, all columns are retained. The
default is NULL. |
suffix |
(character, NULL) A string appended to the column names
from the database. Used to distinguish columns from different
databases with identical column names. If suffix = NULL then the
column names are not changed. The default is NULL. |
not_found |
(character, numeric, logical, NULL) The returned
value when there are no matches. The default is NA. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
A database_lookup object with the following output slots:
updated |
(annotation_source) The input annotation_source is updated with matching columns from the database. |
A database_lookup object inherits the following struct classes:
[database_lookup] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
M <- database_lookup( query_column = "V1", database_column = "", database = data.frame(), include = NULL, suffix = NULL, not_found = NULL)
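The lookup described above can be approximated in base R with match(). In this sketch `lookup_sketch` is a hypothetical helper that keeps only the first matching database row per annotation, which is a simplification I have assumed; the identifiers in the toy tables are made up for illustration.

```r
# Hypothetical sketch of a column-based database lookup.
lookup_sketch <- function(ann, query_column, db, database_column,
                          include = NULL, suffix = NULL, not_found = NA) {
  if (is.null(include)) include <- colnames(db)
  # position of the first matching database row for each annotation
  idx <- match(ann[[query_column]], db[[database_column]])
  found <- db[idx, include, drop = FALSE]
  # fill rows without a match using the not_found value
  found[is.na(idx), ] <- not_found
  if (!is.null(suffix)) colnames(found) <- paste0(colnames(found), suffix)
  rownames(found) <- NULL
  cbind(ann, found)
}

ann <- data.frame(id = c("a", "b", "z"), stringsAsFactors = FALSE)
db <- data.frame(dbid = c("a", "b"), kegg = c("C00031", "C00041"),
                 stringsAsFactors = FALSE)
out <- lookup_sketch(ann, "id", db, "dbid",
                     include = "kegg", suffix = "_db", not_found = "none")
```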
Submit a query to one of the NCBI E-utils databases. See https://www.ncbi.nlm.nih.gov/books/NBK25501/ for details.
eutils_lookup(query_column, database, term, result_fields = "idlist", ...)
query_column |
(character) The column name to use as the reference for searching the database e.g. "HMDB_ID". |
database |
(character) The name of the E-utils database to search. See https://www.ncbi.nlm.nih.gov/books/NBK25501/ for details. |
term |
(character) A correctly formatted search term to use with
E-utils. See https://www.ncbi.nlm.nih.gov/books/NBK25501/ for
details. When used with the provided url template will automatically
include the value from the |
result_fields |
(character) The name of the search result field
to return. For E-utils this is often "idlist". The default is
"idlist". |
... |
Additional slots and values passed to |
An eutils_lookup object with the following output slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
An eutils_lookup object inherits the following struct classes:
[eutils_lookup] -> [rest_api] -> [model] -> [struct_class]
M <- eutils_lookup( database = "gene", term = "[pdat]", result_fields = "idlist", base_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils", url_template = "<base_url>/esearch.fcgi?db=<database>&term=<query_column><term>&retmode=json", query_column = character(0), cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
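The url_template in the example above contains placeholders in angle brackets. The sketch below shows one way such a template could be filled for a single query value. `fill_template` is a hypothetical helper and the query value "2023" is made up; the package may substitute and URL-encode values differently, and no request is sent here.

```r
# Hypothetical sketch: replace <placeholder> tokens in a URL template.
fill_template <- function(template, values) {
  for (nm in names(values)) {
    # fixed = TRUE treats the placeholder as literal text, not a regex
    template <- gsub(paste0("<", nm, ">"), values[[nm]], template, fixed = TRUE)
  }
  template
}

url <- fill_template(
  "<base_url>/esearch.fcgi?db=<database>&term=<query_column><term>&retmode=json",
  list(
    base_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils",
    database = "gene",
    query_column = "2023", # value taken from one row of the query column
    term = "[pdat]"
  )
)
```

In real use the filled values would probably need encoding (for example with utils::URLencode()) before the request is made.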
A data.frame imported from a sheet of an Excel file.
excel_database( source = character(0), sheet = 1, rowNames = FALSE, colNames = TRUE, startRow = 1, ... )
source |
(ANY) The source of annotation data. The default is
character(0). |
sheet |
(character) The name of the sheet to import. The default
is 1. |
rowNames |
(logical) If TRUE, first column of data will be used
as row names. The default is FALSE. |
colNames |
(logical) If TRUE, first row of data will be used as
column names. The default is TRUE. |
startRow |
(numeric, integer) First row to begin looking for
data. Empty rows at the top of a file are always skipped, regardless
of the value of startRow. The default is 1. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
openxlsx
An excel_database object. This object has no output slots.
An excel_database object inherits the following struct classes:
[excel_database] -> [annotation_database] -> [annotation_source] -> [struct_class]
Schauberger P, Walker A (2023). openxlsx: Read, Write and Edit xlsx Files. R package version 4.2.5.2, https://CRAN.R-project.org/package=openxlsx.
Other annotation databases: AnnotationDb_database, GO_database, annotation_database, annotation_source, rdata_database, rds_cache, rds_database
M <- excel_database( sheet = character(0), rowNames = FALSE, colNames = FALSE, startRow = 1, tag = character(0), data = data.frame(), source = "ANY")
Removes (or includes) annotations such that the named column excludes (or includes) the specified labels.
filter_labels( column_name, labels, mode = "exclude", perl = FALSE, fixed = FALSE, match_na = FALSE, ... )
column_name |
(character) The column name to filter. |
labels |
(character) The labels to filter by. Uses |
mode |
(character) Filter mode. Allowed values are limited to the following:
The default is "exclude". |
perl |
(logical) Use a Perl-compatible regex. The default is FALSE. |
fixed |
(logical) Use exact matching. The default is FALSE. |
match_na |
(logical) Match NA. Allowed values are TRUE and FALSE.
The default is FALSE. |
... |
Additional slots and values passed to |
A filter_labels object with the following output slots:
filtered |
(annotation_source) The annotation_source after filtering. |
flags |
(data.frame) A list of flags indicating which annotations had a matching label. |
A filter_labels object inherits the following struct classes:
[filter_labels] -> [model] -> [struct_class]
M <- filter_labels( column_name = "V1", labels = "", mode = "exclude", perl = FALSE, fixed = FALSE, match_na = FALSE)
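The perl and fixed arguments suggest pattern matching in the style of grepl(); the sketch below works under that assumption. `filter_labels_sketch` is a hypothetical helper, and treating "include" as the counterpart of the default "exclude" mode is inferred from the description above rather than confirmed.

```r
# Hypothetical sketch of label filtering, assuming grepl()-style matching.
filter_labels_sketch <- function(df, column_name, labels, mode = "exclude",
                                 perl = FALSE, fixed = FALSE) {
  values <- as.character(df[[column_name]])
  # TRUE where the value matches at least one label
  hits <- rep(FALSE, length(values))
  for (label in labels) {
    hits <- hits | grepl(label, values, perl = perl, fixed = fixed)
  }
  # assumed modes: "exclude" drops matches, "include" keeps only matches
  keep <- if (mode == "include") hits else !hits
  list(
    filtered = df[keep, , drop = FALSE],
    flags = data.frame(matched = hits)
  )
}

ann <- data.frame(
  name = c("glucose", "unknown_1", "alanine", "Unknown peak"),
  stringsAsFactors = FALSE
)
excluded <- filter_labels_sketch(ann, "name", labels = "^[Uu]nknown")
included <- filter_labels_sketch(ann, "name", labels = "unknown",
                                 mode = "include", fixed = TRUE)
```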
Filters annotations where the named column is NA
filter_na(column_name, mode = "exclude", ...)
column_name |
(character) The column name to use for filtering. |
mode |
(character) Filter mode. Allowed values are limited to the following:
The default is "exclude". |
... |
Additional slots and values passed to |
A filter_na object with the following output slots:
filtered |
(annotation_source) Annotation_source after filtering. |
flags |
(data.frame) A list of flags indicating which annotations were removed. |
A filter_na object inherits the following struct classes:
[filter_na] -> [model] -> [struct_class]
M <- filter_na( column_name = "V1", mode = "exclude")
Removes annotations where the named column is greater than an upper limit or less than a lower limit.
filter_range( column_name, upper_limit = Inf, lower_limit = -Inf, equal_to = TRUE, ... )
column_name |
(character) The column name to filter. |
upper_limit |
(numeric, integer, function) The upper limit used
for filtering. Can be a value, or a function that computes a value
(e.g. mean). The default is Inf. |
lower_limit |
(numeric, integer, function) The lower limit used
for filtering. Can be a value, or a function that computes a value
(e.g. mean). The default is -Inf. |
equal_to |
(logical) Equal to limits. Allowed values are limited to the following:
The default is TRUE. |
... |
Additional slots and values passed to |
A filter_range object with the following output slots:
filtered |
(annotation_source) Annotation_source after filtering. |
flags |
(data.frame) A list of flags indicating which annotations were removed. |
A filter_range object inherits the following struct classes:
[filter_range] -> [model] -> [struct_class]
M <- filter_range( column_name = "V1", upper_limit = Inf, lower_limit = -Inf, equal_to = FALSE)
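The limits can be fixed values or functions that compute a value from the column, which the base-R sketch below tries to illustrate. `filter_range_sketch` is hypothetical, and the reading of equal_to used here (TRUE also removes values exactly equal to a limit) is an assumption, since the description of that argument is incomplete above.

```r
# Hypothetical sketch of range filtering with fixed or computed limits.
filter_range_sketch <- function(df, column_name, upper_limit = Inf,
                                lower_limit = -Inf, equal_to = TRUE) {
  x <- df[[column_name]]
  # a function limit is evaluated on the column itself (e.g. mean)
  if (is.function(upper_limit)) upper_limit <- upper_limit(x)
  if (is.function(lower_limit)) lower_limit <- lower_limit(x)
  # assumption: equal_to = TRUE also removes values equal to a limit
  if (equal_to) {
    out_of_range <- x >= upper_limit | x <= lower_limit
  } else {
    out_of_range <- x > upper_limit | x < lower_limit
  }
  list(
    filtered = df[!out_of_range, , drop = FALSE],
    flags = data.frame(removed = out_of_range)
  )
}

ann <- data.frame(ppm_error = c(-6, -2, 0.5, 3, 8))
fixed_limits <- filter_range_sketch(ann, "ppm_error",
                                    upper_limit = 5, lower_limit = -5)
below_mean <- filter_range_sketch(ann, "ppm_error", upper_limit = mean)
```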
A wrapper around dplyr::filter. Select rows from an
annotation table using tidy grammar.
filter_records(where = wherever(A > 0), ...)
where |
(quosures) A list of |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
rlang
A filter_records object with the following output slots:
updated |
(annotation_source) The updated annotations as an
annotation_source object. |
A filter_records object inherits the following struct classes:
[filter_records] -> [model] -> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
Henry L, Wickham H (2024). rlang: Functions for Base Types and Core R and 'Tidyverse' Features. R package version 1.1.3, https://CRAN.R-project.org/package=rlang.
M <- filter_records( where = wherever(A>10))
Removes (or includes) annotations such that the named column excludes (or includes) the specified levels.
filter_venn( factor_name, group_column = NULL, tables = NULL, levels, mode = "exclude", perl = FALSE, fixed = FALSE, ... )
factor_name |
(character) The name of the column(s) in the
|
group_column |
(character, NULL) The name of the column in the
|
tables |
(list, NULL) A list of |
levels |
(character) The venn diagram levels to filter by. |
mode |
(character) Filter mode. Allowed values are limited to the following:
The default is "exclude". |
perl |
(logical) Use a Perl-compatible regex. The default is FALSE. |
fixed |
(logical) Use exact matching. The default is FALSE. |
... |
Additional slots and values passed to |
A filter_venn object with the following output slots:
filtered |
(annotation_source) Annotation_source after filtering. |
flags |
(data.frame) A list of flags indicating which annotations were removed. |
A filter_venn object inherits the following struct classes:
[filter_venn] -> [model] -> [struct_class]
M <- filter_venn( factor_name = "V1", group_column = NULL, tables = NULL, levels = "", mode = "exclude", perl = FALSE, fixed = FALSE)
Uses the GitHub REST API to retrieve a file from a specified GitHub repository.
github_file( username, repository_name, file_path, bfc_path = NULL, resource_name = paste(username, repository_name, file_path, sep = "_"), ... )
username |
(character) The GitHub username to retrieve the file from. |
repository_name |
(character) The name of a repository for the specified GitHub username that contains the file to download. |
file_path |
(character) The path to the file to download within the specified GitHub repository. |
bfc_path |
(character, NULL) |
resource_name |
(character) The name given to this resource in
the cache. (see |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
BiocFileCache
httr
A github_file object. This object has no output slots.
A github_file object inherits the following struct classes:
[github_file] -> [BiocFileCache_database] -> [annotation_database] -> [annotation_source] -> [struct_class]
Shepherd L, Morgan M (2024). BiocFileCache: Manage Files Across Sessions. R package version 2.10.2.
Wickham H (2023). httr: Tools for Working with URLs and HTTP. R package version 1.4.7, https://CRAN.R-project.org/package=httr.
M <- github_file( username = character(0), repository_name = character(0), file_path = character(0), bfc_path = NULL, resource_name = "bfc", bfc_fun = function(){}, import_fun = function(){}, offline = FALSE, tag = character(0), data = data.frame(), source = "ANY")
Retrieve a table from the Gene Ontology using the GO.db package.
GO_database(source = "GO.db", table = "GOBPOFFSPRING", ...)
source |
(character) The name of an AnnotationDb package to
import the specified table from. Note the package should already be
installed. The default is |
table |
(character) The name of a table to import from the GO.db
package. Allowed tables include:
GOBPANCESTOR,GOBPPARENTS,GOBPCHILDREN,GOBPOFFSPRING (and their CC or
MF equivalents), GOTERM, GOSYNONYM, GOOBSOLETE. The default is
|
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
GO.db
A GO_database object. This object has no output slots.
A GO_database object inherits the following struct classes:
[GO_database] -> [AnnotationDb_database] -> [annotation_database] -> [annotation_source] -> [struct_class]
Carlson M (2023). GO.db: A set of annotation maps describing the entire Gene Ontology. R package version 3.18.0.
Pagès H, Carlson M, Falcon S, Li N (2023). AnnotationDbi: Manipulation of SQLite-based annotations in Bioconductor. doi:10.18129/B9.bioc.AnnotationDbi https://doi.org/10.18129/B9.bioc.AnnotationDbi, R package version 1.64.1, https://bioconductor.org/packages/AnnotationDbi.
Other annotation databases: AnnotationDb_database, annotation_database, annotation_source, excel_database, rdata_database, rds_cache, rds_database
M <- GO_database( table = "GOBPCHILDREN", tag = character(0), data = data.frame(), source = character(0))
A dictionary for converting Greek characters to Romanised names. It is intended for use with the normalise_strings() object.
greek_dictionary
An object of class list of length 48.
A dictionary for use with normalise_strings()
M <- normalise_strings( search_column = "example", output_column = "result", dictionary = greek_dictionary )
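The idea of replacing Greek characters with Romanised names can be sketched in base R. The two-entry lookup below is hypothetical and much smaller than the packaged `greek_dictionary`, whose exact contents are not reproduced here; `romanise()` is an illustrative helper, not a MetMashR function.

```r
# Hypothetical lookup: Greek character -> Romanised name
greek_to_roman <- setNames(c("alpha", "beta"), c("\u03b1", "\u03b2"))

romanise <- function(x, lookup) {
  for (symbol in names(lookup)) {
    # fixed = TRUE treats the symbol as a literal string, not a regex
    x <- gsub(symbol, lookup[[symbol]], x, fixed = TRUE)
  }
  x
}

romanise(c("\u03b1-tocopherol", "\u03b2-alanine", "glucose"), greek_to_roman)
```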
Requests HMDB records based on HMDB identifiers.
hmdb_lookup(query_column, suffix = "_hmdb", output = "inchikey", ...)
query_column |
(character) The name of a column in the annotation table containing values to search in the api call. |
suffix |
(character) A suffix appended to all column names in
the returned result. The default is |
output |
(character) The value returned from the HMDB xml. The
default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
XML
A hmdb_lookup object with the following output slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
A hmdb_lookup object inherits the following struct classes:
[hmdb_lookup] -> [rest_api] -> [model] -> [struct_class]
Temple Lang D (2024). XML: Tools for Parsing and Generating XML Within R and S-Plus. R package version 3.99-0.16.1, https://CRAN.R-project.org/package=XML.
M <- hmdb_lookup( output = "inchikey", base_url = "http://www.hmdb.ca/metabolites", url_template = "<base_url>/<query_column>.xml", query_column = character(0), cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
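The `suffix` argument is described as being appended to all column names in the returned result. A minimal base-R sketch of that renaming step, assuming a hypothetical one-column result table (the value shown is a placeholder, not a real InChIKey):

```r
# Hypothetical table standing in for data returned by a lookup
result <- data.frame(inchikey = "PLACEHOLDER-KEY", stringsAsFactors = FALSE)

suffix <- "_hmdb"

# Append the suffix to every column name of the returned result
colnames(result) <- paste0(colnames(result), suffix)
print(colnames(result))
```

Suffixing in this way keeps columns from different lookups distinguishable after they are added to the same annotation table.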
Adds the number of times an identical identifier is present to each record.
id_counts(id_column, count_column = "id_counts", count_na = TRUE, ...)
id_column |
(character) Column name of the variable ids in variable_meta. |
count_column |
(character) The name of the new column to store
the counts in. The default is |
count_na |
(logical) Count NA. Allowed values are limited to the following:
The default
is |
... |
Additional slots and values passed to |
An id_counts object with the following output slots:
updated |
(annotation_source) The input annotation source with the newly generated column. |
An id_counts object inherits the following struct classes:
[id_counts] -> [model] -> [struct_class]
M <- id_counts( id_column = character(0), count_column = character(0), count_na = FALSE)
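Counting how often each identifier occurs, and attaching that count to every record, can be sketched in base R as below. This is an illustration of the idea rather than the package's implementation; `count_ids()` and the example table are hypothetical, and the treatment of missing ids when `count_na = FALSE` is an assumption.

```r
# Hypothetical annotation table with a repeated id and a missing id
annotations <- data.frame(
  id = c("A", "B", "A", NA, "C", "A"),
  stringsAsFactors = FALSE
)

count_ids <- function(ids, count_na = TRUE) {
  # useNA = "ifany" keeps NA as its own category in the tally
  tally <- table(ids, useNA = "ifany")
  # match() pairs NA with NA, so missing ids pick up their own tally
  counts <- as.integer(tally[match(ids, names(tally))])
  if (!count_na) {
    # Assumption: when NA is not counted, missing ids receive NA
    counts[is.na(ids)] <- NA_integer_
  }
  counts
}

annotations$id_counts <- count_ids(annotations$id, count_na = TRUE)
print(annotations)
```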
A wrapper for read_source() that can be used in an annotation workflow to import an annotation source.
import_source(...)
... |
Additional slots and values passed to |
An import_source object with the following output slots:
imported |
(annotation_source) The annotation_source after importing the data. |
An import_source object inherits the following struct classes:
[import_source] -> [model] -> [struct_class]
M <- import_source()
A function that returns TRUE if the database has been designed for use in read and write mode.
is_writable(obj, ...)
## S4 method for signature 'annotation_database'
is_writable(obj)
## S4 method for signature 'rdata_database'
is_writable(obj)
obj |
A |
... |
additional database specific inputs |
TRUE if the database is writable; FALSE otherwise. This method does not check file properties, only the intended usage of the object.
M <- annotation_database()
is_writable(M)
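The behaviour described above, where the answer depends on the class of the object and not on the file system, is a natural fit for S4 dispatch. The sketch below uses hypothetical `demo_*` classes and a hypothetical generic; it is not the MetMashR definition and only illustrates the pattern.

```r
library(methods)

# Hypothetical classes used only to illustrate S4 dispatch
setClass("demo_database", representation(path = "character"))
setClass("demo_rw_database", contains = "demo_database")

setGeneric("demo_is_writable", function(obj, ...) standardGeneric("demo_is_writable"))

# Base class: designed as read-only
setMethod("demo_is_writable", "demo_database", function(obj, ...) FALSE)

# Subclass: designed for read and write use
setMethod("demo_is_writable", "demo_rw_database", function(obj, ...) TRUE)

# The path does not need to exist; no file properties are checked
ro <- new("demo_database", path = "does/not/exist.rds")
rw <- new("demo_rw_database", path = "does/not/exist.rds")

demo_is_writable(ro)
demo_is_writable(rw)
```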
Searches the KEGG database to obtain external identifiers. KEGG compound, drug and glycan databases can be queried for PubChem and ChEBI identifiers, and vice versa.
kegg_lookup( get = "pubchem", from = "compound", query_column, suffix = "_kegg", ... )
get |
(character) Get identifier. Allowed values are limited to the following:
The default is |
from |
(character) From identifier. Allowed values are limited to the following:
The default is |
query_column |
(character) The name of the column containing identifiers to search the database for. They should be identifiers of the type selected for the "from" slot. |
suffix |
(character) A suffix appended to all column names in
the returned result. The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
KEGGREST
dplyr
A kegg_lookup object with the following output slots:
updated |
(annotation_source) An annotation_source object with a new column of compound identifiers. |
A kegg_lookup object inherits the following struct classes:
[kegg_lookup] -> [model] -> [struct_class]
Tenenbaum D, Maintainer B (2023). KEGGREST: Client-side REST access to the Kyoto Encyclopedia of Genes and Genomes (KEGG). doi:10.18129/B9.bioc.KEGGREST https://doi.org/10.18129/B9.bioc.KEGGREST, R package version 1.42.0, https://bioconductor.org/packages/KEGGREST.
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
Other REST APIs: classyfire_lookup, lipidmaps_lookup, mwb_compound_lookup, rest_api
M <- kegg_lookup( get = "pubchem", from = "compound", query_column = "V1", suffix = "_kegg")
An LCMS table extends annotation_table() to represent annotation data for an LCMS experiment. Columns representing m/z and retention time are required for an lcms_table.
lcms_table( data = NULL, tag = "", id_column = "id", mz_column = "mz", rt_column = "rt", ... )
data |
(data.frame, NULL) A data.frame of annotation data. The
default is |
tag |
(character) A (short) character string that is used to
represent this source e.g. in column names or source columns when
used in a workflow. The default is |
id_column |
(character) The column name of the annotation
data.frame containing row identifiers. If NULL, this will be generated
automatically. The default is |
mz_column |
(character) The column name of the annotation
data.frame containing m/z values. The default is |
rt_column |
(character) The column name of the annotation
data.frame containing retention time values. The default is
|
... |
Additional slots and values passed to |
An lcms_table object. This object has no output slots.
An lcms_table object inherits the following struct classes:
[lcms_table] -> [annotation_table] -> [annotation_source] -> [struct_class]
M <- lcms_table( mz_column = "mz", rt_column = "rt", id_column = "id", tag = character(0), data = data.frame(), source = "ANY")
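Because an lcms_table requires m/z and retention time columns, it can be useful to check a data.frame before passing it in. The helper below is a hypothetical base-R sketch of such a check, not a function provided by the package, and the example values are made up.

```r
# Hypothetical helper: confirm the named m/z and retention time columns exist
check_lcms_columns <- function(data, mz_column = "mz", rt_column = "rt") {
  required <- c(mz_column, rt_column)
  missing_cols <- setdiff(required, colnames(data))
  if (length(missing_cols) > 0) {
    stop("Missing required column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# A table with both required columns passes silently
ok <- data.frame(id = 1:2, mz = c(180.0634, 132.0535), rt = c(65.2, 120.8))
check_lcms_columns(ok)

# A table without an rt column raises an error, captured here as a message
bad <- data.frame(id = 1:2, mz = c(180.0634, 132.0535))
result <- tryCatch(check_lcms_columns(bad), error = function(e) conditionMessage(e))
print(result)
```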
Search the LipidMaps database using the API
lipidmaps_lookup( query_column, context, context_item, output_item = "all", suffix = "_lipidmaps", ... )
query_column |
(character) The name of a column in the annotation table containing values to search in the api call. |
context |
(character) The search API context. Must be one of "compound", "gene", or "protein". |
context_item |
(character) The context item being searched. See https://lipidmaps.org/resources/rest for details. |
output_item |
(character) The names of the columns to return
from the results of the search. See
https://lipidmaps.org/resources/rest for details. The default is
|
suffix |
(character) A suffix appended to all column names in
the returned result. The default is |
... |
Additional slots and values passed to |
A lipidmaps_lookup object with the following output slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
A lipidmaps_lookup object inherits the following struct classes:
[lipidmaps_lookup] -> [rest_api] -> [model] -> [struct_class]
Other REST APIs: classyfire_lookup, kegg_lookup, mwb_compound_lookup, rest_api
M <- lipidmaps_lookup( query_column = character(0), output_item = "input", context = "compound", context_item = character(0), base_url = "https://www.lipidmaps.org/rest", url_template = "<base_url>/<context>/<context_item>/<query_column>/<output_item>/json", cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
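The `url_template` in the example contains placeholders in angle brackets. How such a template could be filled in is sketched below in base R; `fill_template()` is a hypothetical helper and the substitution shown is an assumption about the general approach, not the package's code. The context item and query value are illustrative only.

```r
# Hypothetical helper: replace each <name> placeholder with its value
fill_template <- function(template, values) {
  for (name in names(values)) {
    placeholder <- paste0("<", name, ">")
    # fixed = TRUE so the angle brackets are matched literally
    template <- gsub(placeholder, values[[name]], template, fixed = TRUE)
  }
  template
}

url <- fill_template(
  "<base_url>/<context>/<context_item>/<query_column>/<output_item>/json",
  list(
    base_url = "https://www.lipidmaps.org/rest",
    context = "compound",
    context_item = "lm_id",         # illustrative context item
    query_column = "LMFA01010001",  # illustrative value taken from the query column
    output_item = "all"
  )
)
print(url)
```

In a workflow the `query_column` placeholder would be filled once per record, giving one request URL for each value in that column.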
An LCMS table extends annotation_table() to represent annotation data for an LCMS experiment. Columns representing m/z and retention time are required for an lcms_table.
ls_source( source, tag = "LS", mz_column = "mz", rt_column = "rt", id_column = "id", data = NULL, ... )
source |
(ANY) The source of annotation data. |
tag |
(character) A (short) character string that is used to
represent this source e.g. in column names or source columns when
used in a workflow. The default is |
mz_column |
(character) The column name of the annotation
data.frame containing m/z values. The default is |
rt_column |
(character) The column name of the annotation
data.frame containing retention time values. The default is
|
id_column |
(character) The column name of the annotation
data.frame containing row identifiers. If NULL, this will be generated
automatically. The default is |
data |
(data.frame, NULL) A data.frame of annotation data. The
default is |
... |
Additional slots and values passed to |
An ls_source object. This object has no output slots.
An ls_source object inherits the following struct classes:
[ls_source] -> [lcms_table] -> [annotation_table] -> [annotation_source] -> [struct_class]
Other annotation sources: annotation_database, annotation_table, cd_source, mspurity_source
Other annotation tables: annotation_table, cd_source
M <- ls_source( mz_column = "mz", rt_column = "rt", id_column = "id", tag = character(0), data = data.frame(), source = "ANY")
Applies method to the input DatasetExperiment
## S4 method for signature 'model,annotation_source'
model_apply(M, D)
## S4 method for signature 'model,list'
model_apply(M, D)
## S4 method for signature 'model_seq,list'
model_apply(M, D)
## S4 method for signature 'model_seq,annotation_source'
model_apply(M, D)
## S4 method for signature 'AnnotationDb_select,annotation_source'
model_apply(M, D)
## S4 method for signature 'CompoundDb_source,annotation_source'
model_apply(M, D)
## S4 method for signature 'add_columns,annotation_source'
model_apply(M, D)
## S4 method for signature 'add_labels,annotation_source'
model_apply(M, D)
## S4 method for signature 'calc_ppm_diff,annotation_table'
model_apply(M, D)
## S4 method for signature 'calc_rt_diff,annotation_table'
model_apply(M, D)
## S4 method for signature 'rest_api,annotation_source'
model_apply(M, D)
## S4 method for signature 'combine_columns,annotation_source'
model_apply(M, D)
## S4 method for signature 'combine_records,annotation_source'
model_apply(M, D)
## S4 method for signature 'combine_sources,annotation_source'
model_apply(M, D)
## S4 method for signature 'combine_sources,list'
model_apply(M, D)
## S4 method for signature 'compute_column,annotation_source'
model_apply(M, D)
## S4 method for signature 'compute_record,annotation_source'
model_apply(M, D)
## S4 method for signature 'database_lookup,annotation_source'
model_apply(M, D)
## S4 method for signature 'split_records,annotation_source'
model_apply(M, D)
## S4 method for signature 'filter_labels,annotation_source'
model_apply(M, D)
## S4 method for signature 'filter_na,annotation_source'
model_apply(M, D)
## S4 method for signature 'filter_range,annotation_source'
model_apply(M, D)
## S4 method for signature 'filter_records,annotation_source'
model_apply(M, D)
## S4 method for signature 'filter_venn,annotation_source'
model_apply(M, D)
## S4 method for signature 'id_counts,annotation_source'
model_apply(M, D)
## S4 method for signature 'import_source,annotation_source'
model_apply(M, D)
## S4 method for signature 'kegg_lookup,annotation_source'
model_apply(M, D)
## S4 method for signature 'mspurity_source,lcms_table'
model_apply(M, D)
## S4 method for signature 'mz_match,annotation_source'
model_apply(M, D)
## S4 method for signature 'mzrt_match,lcms_table'
model_apply(M, D)
## S4 method for signature 'normalise_lipids,annotation_source'
model_apply(M, D)
## S4 method for signature 'normalise_strings,annotation_source'
model_apply(M, D)
## S4 method for signature 'pivot_columns,annotation_source'
model_apply(M, D)
## S4 method for signature 'prioritise_columns,annotation_source'
model_apply(M, D)
## S4 method for signature 'remove_columns,annotation_source'
model_apply(M, D)
## S4 method for signature 'rename_columns,annotation_source'
model_apply(M, D)
## S4 method for signature 'rt_match,annotation_table'
model_apply(M, D)
## S4 method for signature 'select_columns,annotation_source'
model_apply(M, D)
## S4 method for signature 'split_column,annotation_source'
model_apply(M, D)
## S4 method for signature 'trim_whitespace,annotation_source'
model_apply(M, D)
## S4 method for signature 'unique_records,annotation_source'
model_apply(M, D)
M |
a method object |
D |
another object used by the first |
Returns a modified method object
M <- example_model()
M <- model_apply(M, iris_DatasetExperiment())
An annotation source for importing an annotation table from the format created by the msPurity package.
mspurity_source(source, tag = "msPurity", ...)
source |
(ANY) The source of annotation data. |
tag |
(character) A (short) character string that is used to
represent this source e.g. in column names or source columns when
used in a workflow. The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
msPurity
A mspurity_source object. This object has no output slots.
A mspurity_source object inherits the following struct classes:
[mspurity_source] -> [annotation_source] -> [struct_class]
Lawson, Nigel T, Weber, M. RJ, Jones, R. M, Chetwynd, J. A, Blanco R, Alejandro G, Guida D, Riccardo, Viant, R. M, Dunn, B W (2017). "msPurity: Automated Evaluation of Precursor Ion Purity for Mass Spectrometry-Based Fragmentation in Metabolomics." Analytical Chemistry, 89, 2432-2439. doi:10.1021/acs.analchem.6b04358 https://doi.org/10.1021/acs.analchem.6b04358.
Other annotation sources: annotation_database, annotation_table, cd_source, ls_source
M <- mspurity_source( tag = character(0), data = data.frame(), source = "ANY")
Imports the MTox700+ database, which is made available under the ODC Attribution License. MTox700+ is a list of toxicologically relevant metabolites derived from publications, public databases and relevant toxicological assays.
MTox700plus_database( version = "latest", bfc_path = NULL, resource_name = "MetMashR_MTox700plus", ... )
version |
(character) The version number of the MTox700+
database to import. Available versions are listed
here.
|
bfc_path |
(character, NULL) |
resource_name |
(character) The name given to this resource in
the cache. (see |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
BiocFileCache
httr
A MTox700plus_database object. This object has no output slots.
A MTox700plus_database object inherits the following struct classes:
[MTox700plus_database] -> [BiocFileCache_database] -> [annotation_database] -> [annotation_source] -> [struct_class]
Shepherd L, Morgan M (2024). BiocFileCache: Manage Files Across Sessions. R package version 2.10.2.
Wickham H (2023). httr: Tools for Working with URLs and HTTP. R package version 1.4.7, https://CRAN.R-project.org/package=httr.
Sostare E, Lawson TN, Saunders LR, Colbourne JK, Weber RJM, Sobanski T, Viant MR (2022). "Knowledge-Driven Approaches to Create the MTox700+ Metabolite Panel for Predicting Toxicity." Toxicological Sciences, 186, 208-220. doi:10.1093/toxsci/kfac007 https://doi.org/10.1093/toxsci/kfac007.
M <- MTox700plus_database( version = "v1.0", bfc_path = NULL, resource_name = "bfc", bfc_fun = function(){}, import_fun = function(){}, offline = FALSE, tag = character(0), data = data.frame(), source = "ANY")
Searches MetabolomicsWorkbench for compound identifiers.
mwb_compound_lookup( input_item = "inchi_key", query_column, output_item = "pubchem_id", suffix = "_mwb", ... )
input_item |
(character) A valid input item for the compound
context (see
https://www.metabolomicsworkbench.org/tools/mw_rest.php). The values
in the query_column should be of this type. The default is
|
query_column |
(character) The name of a column in the annotation table containing values to search in the api call. |
output_item |
(character) A comma-separated list of valid output
items for the compound context (see
https://www.metabolomicsworkbench.org/tools/mw_rest.php). The default
is |
suffix |
(character) A suffix appended to all column names in
the returned result. The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
metabolomicsWorkbenchR
dplyr
A mwb_compound_lookup object with the following output slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
A mwb_compound_lookup object inherits the following struct classes:
[mwb_compound_lookup] -> [rest_api] -> [model] -> [struct_class]
Lloyd GR, Weber RJM (????). metabolomicsWorkbenchR: Metabolomics Workbench in R. R package version 1.14.1.
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
Other REST APIs: classyfire_lookup, kegg_lookup, lipidmaps_lookup, rest_api
M <- mwb_compound_lookup( input_item = "inchi_key", output_item = "inchi_key", base_url = "https://www.metabolomicsworkbench.org/rest", url_template = "<base_url>/compound/<input_item>/<query_column>/<output_item>", query_column = character(0), cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
Imports the Metabolomics Workbench refmet database.
mwb_refmet_database(bfc = NULL, ...)
bfc |
(character) |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
BiocFileCache
httr
plyr
A mwb_refmet_database object. This object has no output slots.
A mwb_refmet_database object inherits the following struct classes:
[mwb_refmet_database] -> [annotation_database] -> [annotation_source] -> [struct_class]
Shepherd L, Morgan M (2024). BiocFileCache: Manage Files Across Sessions. R package version 2.10.2.
Wickham H (2023). httr: Tools for Working with URLs and HTTP. R package version 1.4.7, https://CRAN.R-project.org/package=httr.
Wickham H (2011). "The Split-Apply-Combine Strategy for Data Analysis." Journal of Statistical Software, 40(1), 1-29. https://www.jstatsoft.org/v40/i01/.
M <- mwb_refmet_database( bfc = character(0), tag = character(0), data = data.frame(), source = "ANY")
Queries the Metabolomics Workbench API to retrieve and display an image of the matching molecular structure.
mwb_structure(query_column, row_index, ...)
query_column |
(character) The name of the |
row_index |
(integer, numeric) The row index of the
|
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
cowplot
metabolomicsWorkbenchR
This object queries the Metabolomics Workbench API for matches to your query without caching the results. It is therefore intended for limited use. If you wish to obtain images for a large number of molecules you should seek an alternative solution.
A mwb_structure object. This object has no output slots. See chart_plot in the struct package to plot this chart object.
A mwb_structure object inherits the following struct classes:
[mwb_structure] -> [chart] -> [struct_class]
Wilke C (2024). cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'. R package version 1.1.3, https://CRAN.R-project.org/package=cowplot.
Lloyd GR, Weber RJM (????). metabolomicsWorkbenchR: Metabolomics Workbench in R. R package version 1.14.1.
M <- mwb_structure( row_index = 1, query_column = "V1")
Annotations will be matched to the measured data variable meta data.frame by determining which annotations' ppm windows overlap with the ppm window of the measured mz.
mz_match(variable_meta, mz_column, ppm_window, id_column, ...)
variable_meta |
(data.frame) A data.frame of variable IDs and their corresponding mz values. |
mz_column |
(character) Column name of the mz values in variable_meta. |
ppm_window |
(numeric, integer) The ppm window to use for matching. If a single value is provided then the same ppm window is used for both variable_meta and the annotations. A named vector can also be provided, e.g. c("variable_meta"=5,"annotations"=2), to use different windows for each data table. |
id_column |
(character) Column name of the variable ids in variable_meta. id_column="rownames" will use the rownames as ids. |
... |
Additional slots and values passed to |
A mz_match object with the following output slots:
updated |
(annotation_source) The input annotation source with the newly generated column. |
A mz_match object inherits the following struct classes:
[mz_match] -> [model] -> [struct_class]
M <- mz_match( variable_meta = data.frame(), mz_column = character(0), ppm_window = 5, id_column = character(0))
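The window-overlap test behind this kind of matching can be sketched in base R. The sketch assumes each ppm value is a plus-or-minus tolerance around the m/z value, which is an assumption about the convention and not a statement of how the package computes it; `ppm_bounds()` and `windows_overlap()` are hypothetical helpers, and the m/z values are made up.

```r
# Lower and upper bounds of a +/- ppm window around an m/z value
ppm_bounds <- function(mz, ppm) {
  delta <- mz * ppm * 1e-6
  list(lower = mz - delta, upper = mz + delta)
}

# TRUE where window a and window b share at least one point
windows_overlap <- function(a, b) {
  a$lower <= b$upper & b$lower <= a$upper
}

# Measured feature with a 5 ppm window: 199.999 to 200.001
measured <- ppm_bounds(200.0000, ppm = 5)

# Two candidate annotations, each with a 2 ppm window
annotated <- ppm_bounds(c(200.0012, 200.0100), ppm = 2)

windows_overlap(measured, annotated)
```

Using separate ppm values for the two tables mirrors the named-vector form of `ppm_window` described above.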
Annotations will be matched to the measured data variable meta data.frame by determining which annotations' ppm AND rt windows overlap with the ppm AND rt windows of the measured mz.
mzrt_match( variable_meta, mz_column, rt_column, ppm_window, rt_window, id_column, ... )
variable_meta |
(data.frame) A data.frame of variable IDs and their corresponding mz values. |
mz_column |
(character) Column name of the mz values in variable_meta. |
rt_column |
(character) Column name of the rt values in variable_meta. |
ppm_window |
(numeric, integer) The ppm window to use for matching. If a single value is provided then the same ppm window is used for both variable_meta and the annotations. A named vector can also be provided, e.g. c("variable_meta"=5,"annotations"=2), to use different windows for each data table. |
rt_window |
(numeric, integer) The rt window to use for matching. If a single value is provided then the same rt window is used for both variable_meta and the annotations. A named vector can also be provided, e.g. c("variable_meta"=5,"annotations"=2), to use different windows for each data table. |
id_column |
(character) Column name of the variable ids in variable_meta. id_column="rownames" will use the rownames as ids. |
... |
Additional slots and values passed to |
A mzrt_match object with the following output slots:
updated |
(annotation_source) The input annotation source with the newly generated column. |
A mzrt_match object inherits the following struct classes:
[mzrt_match] -> [model] -> [struct_class]
M <- mzrt_match( variable_meta = data.frame(), mz_column = character(0), ppm_window = 5, id_column = character(0), rt_column = character(0), rt_window = 20)
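Requiring both the ppm and the rt windows to overlap can be sketched with `outer()` in base R. The tables and values below are hypothetical, both windows are assumed to be plus-or-minus tolerances applied equally to each table, and the code illustrates the idea rather than the package's implementation.

```r
# Hypothetical measured features and annotations
variable_meta <- data.frame(id = c("F1", "F2"), mz = c(200.0000, 300.0000), rt = c(60, 120))
annotations <- data.frame(name = c("X", "Y"), mz = c(200.0005, 300.0005), rt = c(62, 200))

ppm_window <- 5  # assumed +/- tolerance in ppm, applied to both tables
rt_window <- 20  # assumed +/- tolerance in the same units as rt

# Two +/- windows overlap when their centres differ by no more than
# the sum of the two half-widths
mz_tol <- outer(variable_meta$mz, annotations$mz, function(a, b) (a + b) * ppm_window * 1e-6)
mz_ok <- abs(outer(variable_meta$mz, annotations$mz, "-")) <= mz_tol
rt_ok <- abs(outer(variable_meta$rt, annotations$rt, "-")) <= 2 * rt_window

# A pair is a match only when BOTH conditions hold
hits <- which(mz_ok & rt_ok, arr.ind = TRUE)
matches <- data.frame(
  id = variable_meta$id[hits[, "row"]],
  name = annotations$name[hits[, "col"]]
)
print(matches)
```

Here annotation Y agrees with feature F2 on m/z but is rejected because its retention time is too far away.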
Normalises differently formatted lipid names to a consistent format.
normalise_lipids( column_name, grammar = ".all", columns = ".all", suffix = "_goslin", batch_size = 10000, ... )
column_name |
(character) The name of the column containing lipid names to normalise. |
grammar |
(character) The grammar to use for normalising lipid
names. Allowed values are: Shorthand2020, Goslin, FattyAcids,
LipidMaps, SwissLipids, HMDB or .all. The default is |
columns |
(character) Column names to include from the goslin
output. Can be any of "Normalized.Name", "Original.Name", "Grammar",
"Adduct", "Adduct.Charge", "Lipid.Maps.Category",
"Lipid.Maps.Main.Class", "Species.Name", "Extended.Species.Name",
"Molecular.Species.Name", "Sn.Position.Name",
"Structure.Defined.Name", "Full.Structure.Name",
"Functional.Class.Abbr", "Functional.Class.Synonyms", "Level",
"Total.C", "Total.OH", "Total.O", "Total.DB", "Mass",
"Sum.Formula". ".all" will return all columns. The default is
|
suffix |
(character) A suffix added to the column names of the
goslin output. The default is |
batch_size |
(numeric, integer) The maximum number of
annotations to be parsed by rgoslin at a time. If the batch size is
less than the total number of records then the records will be split
into multiple batches to help prevent crashes. The default is
|
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
rgoslin
A normalise_lipids object with the following output slots:
updated |
(annotation_source) Annotation_source after normalising lipid names. |
A normalise_lipids object inherits the following struct classes:
[normalise_lipids] -> [model] -> [struct_class]
Kopczynski D, Hoffmann N, Peng B, Ahrends R (2020). "Goslin: A Grammar of Succinct Lipid Nomenclature." Analytical Chemistry, 92(16), 10957-10960. https://pubs.acs.org/doi/10.1021/acs.analchem.0c01690.
M <- normalise_lipids( column_name = "V1", grammar = ".all", columns = ".all", suffix = "_goslin", batch_size = 10000)
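The batching described for `batch_size` can be sketched in base R. This is only an illustration under stated assumptions: the lipid names are made up, and `parse_batch` is a hypothetical placeholder standing in for the real rgoslin call, which is not shown here.

```r
# Hypothetical lipid names; in practice these come from the column_name column.
lipid_names <- c("PC 34:1", "PE 36:2", "TG 52:3", "SM 34:1;2", "Cer 42:1;2")
batch_size <- 2

# Assign consecutive records to batches of at most batch_size items.
batch_id <- ceiling(seq_along(lipid_names) / batch_size)
batches <- split(lipid_names, batch_id)

# Placeholder parser (assumption): returns one row per input name.
parse_batch <- function(x) {
  data.frame(Original.Name = x, stringsAsFactors = FALSE)
}

# Parse each batch separately, then stack the results in the original order.
parsed <- do.call(rbind, lapply(batches, parse_batch))
rownames(parsed) <- NULL
```

Because the batches are built from consecutive indices, stacking the per-batch results restores the original record order.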
Replace matching (sub)strings based on a provided dictionary of search terms and their replacements.
normalise_strings( search_column, output_column = NULL, dictionary = list(), ... )
search_column |
(character) The column name of the input
|
output_column |
(character, NULL) The name of a new column that
the modified strings will be stored in. If NULL the |
dictionary |
(list, annotation_database) A list of patterns and
functions that take the input pattern and return a replacement
string. A |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
Each item of the dictionary
list should have at least two
fields: "pattern" and "replace". "pattern" is used as
input to the [grepl()]
function to detect matches to the input pattern.
Parameters such as perl = TRUE
can also be included in the list and these
will be passed to [grepl()]
, otherwise the defaults are used.
When a match is detected the function in "replace" is called with the same
inputs as [grepl()]
. The "replace" function should return a new string.
Alternatively replace = NA
can be used to return NA for a matching pattern.
If a character string is provided then [gsub()]
will be used by default.
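The dictionary rules above can be illustrated with a small base R sketch. `apply_dictionary` is a hypothetical helper written for this example and is not the package's implementation; it assumes that every field other than "replace" is forwarded to `grepl()` (and to `gsub()` for character replacements), and that items are applied in order.

```r
# Example dictionary with the three kinds of "replace" described above.
dictionary <- list(
  list(pattern = "^rac-", replace = "", perl = TRUE),   # character: gsub()
  list(pattern = "unknown", replace = NA),               # NA for a match
  list(pattern = "acid$",                                 # function
       replace = function(pattern, x, ...) toupper(x))
)

# Hypothetical helper: apply each dictionary item in turn to a character vector.
apply_dictionary <- function(x, dictionary) {
  for (item in dictionary) {
    # All fields except "replace" are treated as grepl() arguments.
    args <- item[names(item) != "replace"]
    hit <- do.call(grepl, c(args, list(x = x)))
    if (!any(hit)) next
    if (is.function(item$replace)) {
      # The function receives the same inputs as grepl().
      x[hit] <- do.call(item$replace, c(args, list(x = x[hit])))
    } else if (is.na(item$replace)) {
      x[hit] <- NA
    } else {
      x[hit] <- do.call(
        gsub, c(args, list(replacement = item$replace, x = x[hit]))
      )
    }
  }
  x
}

result <- apply_dictionary(
  c("rac-alanine", "unknown compound", "citric acid", "glucose"),
  dictionary
)
```

Strings that match no pattern, such as "glucose" here, pass through unchanged.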
A normalise_strings
object with the following
output
slots:
updated |
(annotation_source) The updated annotations as an
annotation_source object. |
A normalise_strings
object inherits the following struct
classes:
[normalise_strings]
-> [model]
-> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
M <- normalise_strings( search_column = character(0), output_column = NULL, dictionary = list())
Display an image of the molecular structure computed using OpenBabel.
openbabel_structure( smiles_column = "smiles", row_index = 1, image_size = 300, hydrogens = "implicit", carbons = "terminal", double_bonds = "asymmetric", colour_atoms = TRUE, scale_to_fit = TRUE, view_port = 300, title_column = NULL, subtitle_column = NULL, ... )
smiles_column |
(character) The name of the |
row_index |
(integer, numeric) The row index of the
|
image_size |
(numeric, integer) The size of the image to return
in pixels. Images will be square. The default is |
hydrogens |
(character) Hydrogen atoms. Allowed values are limited to the following:
The default is |
carbons |
(character) Carbon atoms. Allowed values are limited to the following:
The default is |
double_bonds |
(character) The display style of double carbon
bonds. The default is |
colour_atoms |
(logical) Display some atoms in colour. The
default is |
scale_to_fit |
(logical) Normalise coordinates. Allowed values are limited to the following:
The default is
|
view_port |
(numeric, integer) Scales the image inside the
viewport. Can be used to ensure a set of images have the same bond
lengths and font sizes. Has no effect if |
title_column |
(NULL, character) The column containing text to
use as the title for the image. If NULL then no title is included.
The default is |
subtitle_column |
(NULL, character) The column containing text
to use as the subtitle for the image. If NULL then no subtitle is
included. The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
ChemmineOB
cowplot
rsvg
An
openbabel_structure
object. This object has no output
slots.
See chart_plot
in the struct
package to
plot this chart object.
An openbabel_structure
object inherits the following struct
classes: [openbabel_structure]
-> [chart]
-> [struct_class]
Horan K, Girke T (2023). ChemmineOB: R interface to a subset of OpenBabel functionalities. doi:10.18129/B9.bioc.ChemmineOB https://doi.org/10.18129/B9.bioc.ChemmineOB, R package version 1.40.0, https://bioconductor.org/packages/ChemmineOB.
Wilke C (2024). cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'. R package version 1.1.3, https://CRAN.R-project.org/package=cowplot.
Ooms J (2023). rsvg: Render SVG Images into PDF, PNG, (Encapsulated) PostScript, or Bitmap Arrays. R package version 2.6.0, https://CRAN.R-project.org/package=rsvg.
M <- openbabel_structure( smiles_column = "V1", image_size = 300, hydrogens = "implicit", carbons = "terminal", double_bonds = "symmetric", colour_atoms = FALSE, scale_to_fit = FALSE, row_index = 1, view_port = 300, title_column = NULL, subtitle_column = NULL)
Uses the OPSIN API to search for identifiers based on the input annotation column.
opsin_lookup(query_column, suffix = "_opsin", output = "cids", ...)
query_column |
(character) The column name to use as the reference for searching the database e.g. "compound_name". OPSIN expects molecule names as input. |
suffix |
(character) A suffix appended to all column names in
the returned result. The default is |
output |
(character) The value returned from the pubchem
database. The default is |
... |
Additional slots and values passed to |
An opsin_lookup
object with the following output
slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
An opsin_lookup
object inherits the following struct
classes:
[opsin_lookup]
-> [rest_api]
-> [model]
-> [struct_class]
Lowe DM, Corbett PT, Murray-Rust P, Glen RC (2011). "Chemical Name to Structure: OPSIN, an Open Source Solution." Journal of Chemical Information and Modeling, 51(3), 739-753. doi:10.1021/ci100384d https://doi.org/10.1021/ci100384d.
M <- opsin_lookup( output = "stdinchikey", base_url = "https://opsin.ch.cam.ac.uk/opsin", url_template = "<base_url>/<query_column>.<output>", query_column = character(0), cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
Imports the PathBank database (https://pathbank.org/) of metabolites linked to pathways.
PathBank_metabolite_database( version = "primary", bfc_path = NULL, resource_name = "MetMashR_PathBank", ... )
version |
(character) PathBank version. Allowed values are limited to the following:
The default is |
bfc_path |
(character, NULL) |
resource_name |
(character) The name given to this resource in
the cache. (see |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
BiocFileCache
httr
A
PathBank_metabolite_database
object. This object has no output
slots.
A PathBank_metabolite_database
object inherits the following
struct
classes: [PathBank_metabolite_database]
-> [BiocFileCache_database]
->
[annotation_database]
-> [annotation_source]
-> [struct_class]
Shepherd L, Morgan M (2024). BiocFileCache: Manage Files Across Sessions. R package version 2.10.2.
Wickham H (2023). httr: Tools for Working with URLs and HTTP. R package version 1.4.7, https://CRAN.R-project.org/package=httr.
Wishart DS, Li C, Marcu A, Badran H, Pon A, Budinski Z, Patron J, Lipton D, Cao X, Oler E, Li K, Paccoud M, Hong C, Guo AC, Chan C, Wei W, Ramirez-Gaona M (2019). "PathBank: a comprehensive pathway database for model organisms." Nucleic Acids Research, 48, D470-D478. doi:10.1093/nar/gkz861 https://doi.org/10.1093/nar/gkz861.
M <- PathBank_metabolite_database( version = "primary", bfc_path = NULL, resource_name = "bfc", bfc_fun = function(){}, import_fun = function(){}, offline = FALSE, tag = character(0), data = data.frame(), source = "ANY")
Combine multiple groups of columns into a single group of columns with group labels.
pivot_columns(column_groups, group_labels, ...)
column_groups |
(list) A named list of columns to group together into a single group of columns. There should be the same number of columns in each group. |
group_labels |
(list) A named list of columns and the label to use for all records in that column. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
dplyr
A pivot_columns
object with the following
output
slots:
updated |
(annotation_source) The updated annotations as an
annotation_source object. |
A pivot_columns
object inherits the following struct
classes:
[pivot_columns]
-> [model]
-> [struct_class]
Wickham H, François R, Henry L, Müller K, Vaughan D (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.4, https://CRAN.R-project.org/package=dplyr.
M <- pivot_columns( group_labels = list(), column_groups = list())
Several columns are merged into a single column. If multiple columns contain overlapping values then priority is given to columns earlier in the list.
prioritise_columns( column_names, output_name, source_name, source_tags = column_names, clean = TRUE, ... )
column_names |
(character) The name(s) of column(s) to be combined. |
output_name |
(character) The name of the new column. |
source_name |
(character) The column name used to indicate where the merged values originated. |
source_tags |
(character) The tags used to identify the source of each item in the new column. A tag should be provided for each column_name. By default the column name is used. |
clean |
(logical) Clean old columns. Allowed values are limited to the following:
The default is
|
... |
Additional slots and values passed to |
A prioritise_columns
object with the following
output
slots:
updated |
(annotation_source) The input annotation source with the newly generated column. |
A prioritise_columns
object inherits the following struct
classes: [prioritise_columns]
-> [model]
-> [struct_class]
M <- prioritise_columns( column_names = "V1", output_name = "", clean = FALSE, source_name = "source_name", source_tags = "x")
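The merging behaviour described above can be sketched in base R as follows. The table, column names and values are invented for illustration, and the loop is an assumption about the general idea (first non-missing value wins), not the package's code.

```r
# Hypothetical annotation table with two identifier columns.
ann <- data.frame(
  name = c("citrate", "alanine", "unknown"),
  hmdb = c("H1", NA, NA),
  kegg = c("K1", "K2", NA),
  stringsAsFactors = FALSE
)

column_names <- c("hmdb", "kegg")   # earlier columns take priority
source_tags  <- column_names        # by default the column name is the tag

merged <- rep(NA_character_, nrow(ann))
origin <- rep(NA_character_, nrow(ann))

for (i in seq_along(column_names)) {
  values <- ann[[column_names[i]]]
  # Only fill records that are still empty, so earlier columns win.
  fill <- is.na(merged) & !is.na(values)
  merged[fill] <- values[fill]
  origin[fill] <- source_tags[i]
}

# Equivalent of clean = TRUE: drop the old columns, keep the merged result.
ann[column_names] <- NULL
ann$database_id <- merged   # plays the role of output_name
ann$id_source   <- origin   # plays the role of source_name
```

Records with no value in any of the listed columns keep `NA` in both the merged column and the source column.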
Uses the PubChem API to search for CID based on the input annotation column.
pubchem_compound_lookup( query_column, search_by, suffix = "_pubchem", output = "cids", records = "best", ... )
query_column |
(character) The column name to use as the reference for searching the database e.g. "HMDB_ID". |
search_by |
(character) The PubChem domain to search for matches to the annotation_column. |
suffix |
(character) A suffix appended to all column names in
the returned result. The default is |
output |
(character) The value returned from the pubchem
database. The default is |
records |
(character) Returned record(s). Allowed values are limited to the following:
The
default is |
... |
Additional slots and values passed to |
A pubchem_compound_lookup
object with the following
output
slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
A pubchem_compound_lookup
object inherits the following struct
classes: [pubchem_compound_lookup]
-> [rest_api]
-> [model]
->
[struct_class]
M <- pubchem_compound_lookup( search_by = "cid", output = "cids", records = "best", base_url = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound", url_template = "<base_url>/<search_by>/<query_column>/<output>/JSON", query_column = character(0), cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
Uses the PubChem API to search for CID based on the input annotation column and returns property information.
pubchem_property_lookup( query_column, search_by, suffix = "_pubchem", property = "InChIKey", ... )
query_column |
(character) The column name to use as the reference for searching the database e.g. "HMDB_ID". |
search_by |
(character) The PubChem domain to search for matches to the annotation_column. |
suffix |
(character) A suffix appended to all column names in
the returned result. The default is |
property |
(character) A comma separated list of properties to
return from the pubchem database. (see
https://pubchem.ncbi.nlm.nih.gov/docs/pug-rest#section=Compound-Property-Tables
for details). Keyword ".all" will return all properties. The default
is |
... |
Additional slots and values passed to |
A pubchem_property_lookup
object with the following
output
slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
A pubchem_property_lookup
object inherits the following struct
classes: [pubchem_property_lookup]
-> [pubchem_compound_lookup]
->
[rest_api]
-> [model]
-> [struct_class]
M <- pubchem_property_lookup( search_by = "cid", property = "InChIKey", output = "cids", records = "best", base_url = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound", url_template = "<base_url>/<search_by>/<query_column>/property/<property>/JSON", query_column = character(0), cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
Query the PubChem API, then retrieve and display an image of the matching molecular structure.
pubchem_structure( query_column, search_by, row_index, record_type = "2d", image_size = "large", ... )
query_column |
(character) The name of the |
search_by |
(character) The PubChem domain to search for matches to the annotation_column. |
row_index |
(integer, numeric) The row index of the
|
record_type |
(character) The record type to return from the
PubChem query. Can be one of "2d" or "3d". The default is
|
image_size |
(character) The size of the image to return from
the PubChem query. Can be one of "large" or "small". For |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
cowplot
This object queries the PubChem API for matches to your query without caching the results. It is therefore intended for limited use. If you wish to obtain images for a large number of molecules you should seek an alternative solution.
A
pubchem_structure
object. This object has no output
slots.
See chart_plot
in the struct
package to
plot this chart object.
A pubchem_structure
object inherits the following struct
classes:
[pubchem_structure]
-> [chart]
-> [struct_class]
Wilke C (2024). cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'. R package version 1.1.3, https://CRAN.R-project.org/package=cowplot.
M <- pubchem_structure( query_column = "V1", search_by = "cid", row_index = 1, record_type = "2d", image_size = "large")
Display a PubChem HTML widget for a compound.
pubchem_widget( query_column, row_index, record_type = "2D-Structure", hide_title = FALSE, width = "600px", height = "650px", display = TRUE, ... )
query_column |
(character) The name of the |
row_index |
(integer, numeric) The row index of the
|
record_type |
(character) The record type for the widget. The
default is |
hide_title |
(logical) Hide widget title. Allowed values are limited to the following:
The
default is |
width |
(integer, numeric, character) The width of the widget in
a CSS style compatible format. Numerical values will be converted to
character. The default is |
height |
(integer, numeric, character) The height of the widget
in a CSS style compatible format. Numerical values will be converted
to character. The default is |
display |
(logical) Display widget. Allowed values are limited to the following:
The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
htmltools
A
pubchem_widget
object. This object has no output
slots.
See chart_plot
in the struct
package to
plot this chart object.
A pubchem_widget
object inherits the following struct
classes:
[pubchem_widget]
-> [chart]
-> [struct_class]
Cheng J, Sievert C, Schloerke B, Chang W, Xie Y, Allen J (2024). htmltools: Tools for HTML. R package version 0.5.8.1, https://CRAN.R-project.org/package=htmltools.
M <- pubchem_widget( query_column = "V1", row_index = 1, record_type = "2D-Structure", hide_title = FALSE, width = 600, height = 400, display = FALSE)
This dictionary removes racemic properties from molecule names. It is
intended for use with the normalise_strings()
object.
racemic_dictionary
An object of class list
of length 5.
A dictionary for use with normalise_strings()
M <- normalise_strings( search_column = "example", output_column = "result", dictionary = racemic_dictionary )
A data.frame stored as an RData file.
rdata_database(source = character(0), variable_name, ...)
source |
(ANY) The source of annotation data. The default is
|
variable_name |
(character, function) The name of the data.frame in the imported workspace to use as the data.frame for this source. A function can be provided to e.g. extract a data.frame from a list in the imported environment. |
... |
Additional slots and values passed to |
A
rdata_database
object. This object has no output
slots.
A rdata_database
object inherits the following struct
classes:
[rdata_database]
-> [annotation_database]
->
[annotation_source]
-> [struct_class]
Other annotation databases:
AnnotationDb_database
,
GO_database
,
annotation_database
,
annotation_source
,
excel_database
,
rds_cache
,
rds_database
M <- rdata_database( variable_name = "a data frame", tag = character(0), data = data.frame(), source = "ANY")
A data.frame stored as an RDS file. Intended to be used
with rest_api
objects as a mechanism for caching search results. The
data.frame for an rds_cache
object must have a column named
".search".
rds_cache( source = character(0), data = data.frame(.search = character(0)), ... )
source |
(ANY) The source of annotation data. The default is
|
data |
(data.frame, NULL) A data.frame of annotation data. The
default is |
... |
Additional slots and values passed to |
A
rds_cache
object. This object has no output
slots.
A rds_cache
object inherits the following struct
classes: [rds_cache]
-> [rds_database]
-> [annotation_database]
->
[annotation_source]
-> [struct_class]
Other annotation databases:
AnnotationDb_database
,
GO_database
,
annotation_database
,
annotation_source
,
excel_database
,
rdata_database
,
rds_database
M <- rds_cache( tag = character(0), data = data.frame(), source = "ANY")
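The caching idea behind `rds_cache` (a data.frame with a ".search" column stored as an RDS file) can be sketched in base R. The `lookup` helper, the `cid` column and the hard-coded return value of `fake_query` are hypothetical and only stand in for a real API call; this is not the package's caching code.

```r
# Start with an empty cache that has the required ".search" column.
cache_file <- tempfile(fileext = ".rds")
saveRDS(
  data.frame(.search = character(0), cid = character(0),
             stringsAsFactors = FALSE),
  cache_file
)

# Stand-in for an API call; counts how often it is actually used.
calls <- 0
fake_query <- function(term) {
  calls <<- calls + 1
  "311"   # hard-coded value for illustration only
}

# Hypothetical helper: return the cached row if present, otherwise query
# and append the new result to the cache file.
lookup <- function(term, cache_file, query_fun) {
  cache <- readRDS(cache_file)
  hit <- cache[cache$.search == term, , drop = FALSE]
  if (nrow(hit) > 0) {
    return(hit)
  }
  result <- data.frame(.search = term, cid = query_fun(term),
                       stringsAsFactors = FALSE)
  saveRDS(rbind(cache, result), cache_file)
  result
}

first  <- lookup("citric acid", cache_file, fake_query)
second <- lookup("citric acid", cache_file, fake_query)  # served from cache
```

The second call finds the search term in the cache, so the query function runs only once.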
A data.frame stored as an RDS file.
rds_database(source = character(0), ...)
source |
(ANY) The source of annotation data. The default is
|
... |
Additional slots and values passed to |
A
rds_database
object. This object has no output
slots.
A rds_database
object inherits the following struct
classes:
[rds_database]
-> [annotation_database]
-> [annotation_source]
-> [struct_class]
Other annotation databases:
AnnotationDb_database
,
GO_database
,
annotation_database
,
annotation_source
,
excel_database
,
rdata_database
,
rds_cache
M <- rds_database( tag = character(0), data = data.frame(), source = "ANY")
Reads an annotation_database and returns the data.frame.
read_database(obj, ...)

## S4 method for signature 'annotation_database'
read_database(obj)

## S4 method for signature 'AnnotationDb_database'
read_database(obj)

## S4 method for signature 'BiocFileCache_database'
read_database(obj)

## S4 method for signature 'MTox700plus_database'
read_database(obj)

## S4 method for signature 'PathBank_metabolite_database'
read_database(obj)

## S4 method for signature 'excel_database'
read_database(obj)

## S4 method for signature 'github_file'
read_database(obj)

## S4 method for signature 'mwb_refmet_database'
read_database(obj)

## S4 method for signature 'rdata_database'
read_database(obj)

## S4 method for signature 'rds_database'
read_database(obj)

## S4 method for signature 'sqlite_database'
read_database(obj)
obj |
An |
... |
additional database specific inputs |
A data.frame
M <- rds_database(tempfile())
df <- read_database(M)
Import data from e.g. a raw file and parse it into an
annotation_source()
object.
read_source(obj, ...)

## S4 method for signature 'annotation_source'
read_source(obj)

## S4 method for signature 'annotation_database'
read_source(obj)

## S4 method for signature 'cd_source'
read_source(obj)

## S4 method for signature 'ls_source'
read_source(obj)
obj |
an |
... |
not currently used |
an annotation_table()
or
annotation_database()
object
# prepare source
CD <- cd_source(
    source = system.file(
        paste0("extdata/MTox/CD/HILIC_POS.xlsx"),
        package = "MetMashR"
    )
)
A wrapper around tidyselect::eval_select
. Remove
columns from an annotation table using tidy grammar.
remove_columns(expression = everything(), ...)
expression |
(call) A valid rlang::expr for tidy evaluation via
eval_select. e.g. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
tidyselect
rlang
A remove_columns
object with the following
output
slots:
updated |
(annotation_source) The updated annotations as an
annotation_source object. |
A remove_columns
object inherits the following struct
classes:
[remove_columns]
-> [model]
-> [struct_class]
Henry L, Wickham H (2024). tidyselect: Select from a Set of Strings. R package version 1.2.1, https://CRAN.R-project.org/package=tidyselect.
Henry L, Wickham H (2024). rlang: Functions for Base Types and Core R and 'Tidyverse' Features. R package version 1.1.3, https://CRAN.R-project.org/package=rlang.
M <- remove_columns( expression = call("example"))
A wrapper around dplyr::rename
. Rename columns from
an annotation table using tidy grammar.
rename_columns(expression, ...)
expression |
(call) A valid rlang::expr for tidy evaluation e.g.
|
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
tidyselect
rlang
A rename_columns
object with the following
output
slots:
updated |
(annotation_source) The updated annotations as an
annotation_source object. |
A rename_columns
object inherits the following struct
classes:
[rename_columns]
-> [model]
-> [struct_class]
Henry L, Wickham H (2024). tidyselect: Select from a Set of Strings. R package version 1.2.1, https://CRAN.R-project.org/package=tidyselect.
Henry L, Wickham H (2024). rlang: Functions for Base Types and Core R and 'Tidyverse' Features. R package version 1.1.3, https://CRAN.R-project.org/package=rlang.
M <- rename_columns( expression = call("example"))
Some annotation_sources
, such as LCMS tables (lcms_table
), require that
certain columns are present in the data.frame. These are defined by slots in
the source definition. The name of slots containing the required column names
for a source can be retrieved using the required_cols
function, which will
collect and return the names of slots containing required column names for
the object and all of its parent objects.
required_cols(obj, ...)

## S4 method for signature 'annotation_source'
required_cols(obj)
obj |
an |
... |
additional source specific inputs |
a character vector of slot names
# prepare object
M <- lcms_table(id_column = "id", mz_column = "mz", rt_column = "rt")

# get values for required slots
r <- required_cols(M)

# get slot names for required columns
names(r)
A base class providing common methods for making REST API calls.
rest_api( base_url, url_template, suffix, status_codes, delay, cache = NULL, query_column, ... )
base_url |
(character) The base URL of the API. |
url_template |
(character) A template describing how the URL
should be constructed from the base URL and input parameters. e.g.
<base_url>/ |
suffix |
(character) A suffix appended to all column names in the returned result. |
status_codes |
(list) Named list of status codes and function indicating how to respond. Should minimally contain a function to parse a successful response for status code 200. Any codes not provided will be passed to httr::stop_for_status(). |
delay |
(numeric, integer) Delay in seconds between API calls. |
cache |
(annotation_database, NULL) A struct cache object that
contains parsed responses to previous api queries. If not using a
cache then set to NULL. The default is |
query_column |
(character) The name of a column in the annotation table containing values to search in the api call. |
... |
Additional slots and values passed to |
A rest_api
object with the following output
slots:
updated |
(annotation_source) The annotation_source after adding data returned by the API. |
A rest_api
object inherits the following struct
classes: [rest_api]
-> [model]
-> [struct_class]
Other REST APIs:
classyfire_lookup
,
kegg_lookup
,
lipidmaps_lookup
,
mwb_compound_lookup
M <- rest_api( base_url = "V1", url_template = character(0), query_column = character(0), cache = NULL, status_codes = list(), delay = 0.5, suffix = "_rest_api")
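To illustrate how a `url_template` with `<placeholder>` fields might be turned into a request URL, here is a base R sketch. `fill_template` is a hypothetical helper written for this example; the template and base URL are the ones shown for `pubchem_compound_lookup` in this manual, while the query value and the `"name"` search domain are assumptions. No request is sent.

```r
# Hypothetical helper: replace each <name> placeholder with its value.
fill_template <- function(template, values) {
  for (name in names(values)) {
    placeholder <- paste0("<", name, ">")
    template <- gsub(placeholder, values[[name]], template, fixed = TRUE)
  }
  template
}

url <- fill_template(
  "<base_url>/<search_by>/<query_column>/<output>/JSON",
  list(
    base_url = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound",
    search_by = "name",
    # Assumption: the value from the query column is URL-encoded first.
    query_column = utils::URLencode("citric acid", reserved = TRUE),
    output = "cids"
  )
)
```

In a workflow, one such URL would be built per record, with a pause of `delay` seconds between calls.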
Annotations are matched to the measured variable meta data.frame by determining which annotations have an rt window that overlaps the rt window around each measured rt.
rt_match(variable_meta, rt_column, rt_window, id_column, ...)
variable_meta |
(data.frame) A data.frame of variable IDs and their corresponding rt values. |
rt_column |
(character) Column name of the rt values in variable_meta. |
rt_window |
(numeric, integer) Rt window to use for matching. If a single value is provided then the same rt window is used for both variable meta and the annotations. A named vector can also be provided e.g. c("variable_meta"=5,"annotations"=2) to use different windows for each data table. |
id_column |
(character) Column name of the variable ids in variable_meta. id_column="rownames" will use the rownames as ids. |
... |
Additional slots and values passed to |
A rt_match
object with the following output
slots:
updated |
(annotation_table) The input annotation source with the newly generated column. |
A rt_match
object inherits the following struct
classes: [rt_match]
-> [model]
-> [struct_class]
M <- rt_match( variable_meta = data.frame(), rt_column = character(0), rt_window = 20, id_column = character(0))
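The window-overlap matching can be sketched in base R. This sketch assumes each window is a half-width, i.e. the interval is rt ± window; that interpretation, the column names and all values are assumptions for illustration, not taken from the package source.

```r
# Hypothetical measured features and annotations.
variable_meta <- data.frame(id = c("v1", "v2"), rt = c(100, 200),
                            stringsAsFactors = FALSE)
annotations <- data.frame(name = c("a", "b", "c"), rt = c(104, 150, 197),
                          stringsAsFactors = FALSE)

# Different windows for each table, as in the named-vector form.
rt_window <- c(variable_meta = 5, annotations = 2)

# Two intervals rt_a +/- w_a and rt_v +/- w_v overlap exactly when the
# distance between their centres is at most w_a + w_v.
overlaps <- outer(
  annotations$rt, variable_meta$rt,
  function(a, v) abs(a - v) <= sum(rt_window)
)

# One row per (annotation, variable) pair whose windows overlap.
hits <- which(overlaps, arr.ind = TRUE)
matches <- data.frame(
  name = annotations$name[hits[, "row"]],
  id = variable_meta$id[hits[, "col"]],
  stringsAsFactors = FALSE
)
```

Here annotation "b" at rt 150 overlaps neither feature, so it does not appear in `matches`.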
A wrapper around tidyselect::eval_select
. Select
columns from an annotation table using tidy grammar. This imitates
dplyr::select()
.
select_columns(expression = everything(), ...)
expression |
(call) A valid rlang::expr for tidy evaluation via
eval_select. e.g. |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
tidyselect
rlang
A select_columns object with the following output slots:
updated |
(annotation_source) The updated annotations as an annotation_source object. |
A select_columns object inherits the following struct classes:
[select_columns] -> [model] -> [struct_class]
Henry L, Wickham H (2024). tidyselect: Select from a Set of Strings. R package version 1.2.1, https://CRAN.R-project.org/package=tidyselect.
Henry L, Wickham H (2024). rlang: Functions for Base Types and Core R and 'Tidyverse' Features. R package version 1.1.3, https://CRAN.R-project.org/package=rlang.
M <- select_columns( expression = call("example"))
A wrapper for strsplit. Divides a column into multiple columns by splitting its contents at a separator.
split_column( column_name, separator = "_", padding = NA, keep_indices = NULL, clean = TRUE, ... )
column_name |
(character) The name of the column in the annotation_source to split. |
separator |
(character) A substring to split the column by. The
default is |
padding |
(character, logical) A character string used to
represent missing and zero length strings after splitting. The
default is |
keep_indices |
(numeric, integer) The indices of columns to keep
after splitting. If NULL then all columns are retained. The default
is |
clean |
(logical) Clean old columns. Allowed values are limited to the following:
The default is
|
... |
Additional slots and values passed to |
A split_column object with the following output slots:
updated |
(annotation_source) The annotation_source after splitting the column. |
A split_column object inherits the following struct classes:
[split_column] -> [model] -> [struct_class]
M <- split_column( column_name = "V1", separator = "_", clean = FALSE, padding = FALSE, keep_indices = numeric(0))
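The splitting, padding and index-selection steps described for split_column can be sketched in base R with strsplit. This is only an illustration of the idea under assumed inputs; the variable names are hypothetical and the package may name and order the resulting columns differently.

```r
# Hypothetical column contents, including a short and an empty string.
values <- c("a_b_c", "d_e", "")

# Split every string at the separator (treated as a fixed substring).
parts <- strsplit(values, "_", fixed = TRUE)

# Pad each result to the same length with NA so that the pieces
# form a rectangular table (one row per input string).
n <- max(lengths(parts))
padded <- t(vapply(
  parts,
  function(p) c(p, rep(NA_character_, n - length(p))),
  character(n)
))

# Keep only some of the new columns, analogous to keep_indices.
keep_indices <- c(1, 3)
kept <- padded[, keep_indices, drop = FALSE]
print(kept)
```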
Expand single records into multiple records by splitting strings in a named column at the chosen separator. For example, if for a record the column synonyms = c("glucose,dextrose"), then splitting at the comma results in two records, one for glucose and one for dextrose, with identical values (apart from the column being split). The original record is removed.
split_records(column_name, separator, clean = TRUE, ...)
column_name |
(character) The column name of the
|
separator |
(character) The substring used to split the values in column_name into multiple records. |
clean |
(logical) Remove the original column. If FALSE the
original column will be retained in the final output with .original
appended to the column name. The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
tidytext
A split_records object with the following output slots:
updated |
(annotation_source) The updated annotations as an annotation_source object. |
A split_records object inherits the following struct classes:
[split_records] -> [model] -> [struct_class]
Silge J, Robinson D (2016). "tidytext: Text Mining and Analysis Using Tidy Data Principles in R." JOSS, 1(3). doi:10.21105/joss.00037, https://doi.org/10.21105/joss.00037.
M <- split_records( column_name = character(0), separator = ",", clean = FALSE)
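The glucose/dextrose example from the description can be reproduced with a base R sketch. The documentation states that split_records relies on tidytext; the code below deliberately uses strsplit instead, so it shows the expected shape of the result rather than the package's method. The data are hypothetical.

```r
# Hypothetical annotation table with a multi-valued synonyms column.
annotations <- data.frame(
  id = c(1, 2),
  synonyms = c("glucose,dextrose", "alanine"),
  stringsAsFactors = FALSE
)

# Split the chosen column at the separator.
parts <- strsplit(annotations$synonyms, ",", fixed = TRUE)

# Repeat each record once per piece, then overwrite the split column
# so that every new record carries exactly one value.
expanded <- annotations[rep(seq_len(nrow(annotations)), lengths(parts)), , drop = FALSE]
expanded$synonyms <- unlist(parts)
rownames(expanded) <- NULL
print(expanded)
```

All other columns (here only id) are copied unchanged into each new record, matching the "identical values apart from the column being split" behaviour described above.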
A data.frame stored in an SQLite database.
sqlite_database(source, table = "annotation_database", ...)
source |
(ANY) The source of annotation data. |
table |
(character) The name of a table in the SQLite database.
The default is |
... |
Additional slots and values passed to |
This object makes use of functionality from the following packages:
RSQLite
A sqlite_database object. This object has no output slots.
A sqlite_database object inherits the following struct classes:
[sqlite_database] -> [annotation_database] -> [annotation_source] -> [struct_class]
Müller K, Wickham H, James DA, Falcon S (2024). RSQLite: SQLite Interface for R. R package version 2.3.7, https://CRAN.R-project.org/package=RSQLite.
Other database:
BiocFileCache_database
M <- sqlite_database( table = character(0), tag = character(0), data = data.frame(), source = "ANY")
A wrapper for trimws(). Removes leading and/or trailing whitespace from character strings.
trim_whitespace(column_names, which = "both", whitespace = "[ \t\r\n]", ...)
column_names |
(character) The column name(s) in the annotation_source to trim white space from. Special case ".all" will apply to all columns. |
which |
(character) Trailing and/or leading whitespace. Allowed values are limited to the following:
The default is |
whitespace |
(character) A string specifying a regular
expression to match (one character of) "white space". See
|
... |
Additional slots and values passed to |
A trim_whitespace object with the following output slots:
updated |
(annotation_source) The annotation_source after trimming whitespace. |
A trim_whitespace object inherits the following struct classes:
[trim_whitespace] -> [model] -> [struct_class]
M <- trim_whitespace( column_names = "V1", which = "both", whitespace = "[ ]")
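Because trim_whitespace is described as a wrapper for trimws(), the effect of the which and whitespace arguments can be previewed directly in base R. The strings below are hypothetical, and the sketch only demonstrates trimws() itself, not the wrapper.

```r
# Hypothetical strings with different kinds of surrounding whitespace.
x <- c("  glucose ", "\talanine", "citrate\n")

# Default whitespace class "[ \t\r\n]": spaces, tabs and newlines are removed.
both <- trimws(x, which = "both")

# Only leading whitespace is removed.
left <- trimws(x, which = "left")

# Restricting the whitespace class to spaces leaves tabs and newlines alone.
spaces_only <- trimws(x, which = "both", whitespace = "[ ]")

print(both)
print(left)
print(spaces_only)
```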
A dictionary for converting tripeptides encoded using single-letter IUPAC codes into three-letter amino acid codes separated by hyphens, e.g. INK becomes Ile-Asn-Lys.
tripeptide_dictionary
An object of class list of length 1.
A dictionary for use with normalise_strings()
M <- normalise_strings( search_column = "example", output_column = "result", dictionary = tripeptide_dictionary )
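The conversion that the dictionary describes (INK becomes Ile-Asn-Lys) can be sketched in base R with a lookup vector. The helper to_three_letter and the aa_codes vector are hypothetical names introduced for illustration; the internal structure of tripeptide_dictionary is not shown in this documentation and may differ.

```r
# Standard single-letter to three-letter amino acid codes.
aa_codes <- c(
  A = "Ala", R = "Arg", N = "Asn", D = "Asp", C = "Cys",
  Q = "Gln", E = "Glu", G = "Gly", H = "His", I = "Ile",
  L = "Leu", K = "Lys", M = "Met", F = "Phe", P = "Pro",
  S = "Ser", T = "Thr", W = "Trp", Y = "Tyr", V = "Val"
)

# Hypothetical helper: split a peptide into single letters, look each
# one up, and join the three-letter codes with hyphens.
to_three_letter <- function(peptide) {
  single <- strsplit(peptide, "", fixed = TRUE)[[1]]
  paste(unname(aa_codes[single]), collapse = "-")
}

converted <- to_three_letter("INK")
print(converted)
```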
Reduces an annotation source to unique records only; all duplicates are removed.
unique_records(...)
... |
Additional slots and values passed to |
A unique_records object with the following output slots:
updated |
(annotation_source) The updated annotations as an annotation_source object. |
A unique_records object inherits the following struct classes:
[unique_records] -> [model] -> [struct_class]
M <- unique_records()
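As a rough picture of what reducing a table to unique records means, the base R sketch below removes rows that are identical across all columns. It assumes that a duplicate is a record matching another in every column; whether unique_records uses exactly this definition is not stated here. The data are hypothetical.

```r
# Hypothetical annotation table containing one exact duplicate row.
annotations <- data.frame(
  id = c(1, 1, 2),
  name = c("glucose", "glucose", "alanine"),
  stringsAsFactors = FALSE
)

# unique() on a data.frame keeps the first copy of each distinct row.
deduplicated <- unique(annotations)
print(deduplicated)
```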
This helper function is for use with BiocFileCache_database() objects. Using it as the bfc_fun input for this object will unzip a downloaded resource into a temporary folder before storing it in the cache.
unzip_before_cache(from, to)
from |
incoming path |
to |
the outgoing path |
TRUE if successful
M <- BiocFileCache_database( source = tempfile(), resource_name = "example", bfc_fun = unzip_before_cache )
A function to join sources vertically. A vertical join involves matching common columns across source data.frames and padding missing columns to create a single new data.frame with data and records from multiple sources.
vertical_join(x, y, ...)
## S4 method for signature 'annotation_source,annotation_source'
vertical_join( x, y, matching_columns = NULL, keep_cols = NULL, source_col = "annotation_source", exclude_cols = NULL, as = annotation_source() )
## S4 method for signature 'list,missing'
vertical_join( x, y, matching_columns = NULL, keep_cols = NULL, source_col = "annotation_source", exclude_cols = NULL, as = annotation_source() )
x |
an |
y |
a second |
... |
additional inputs (not currently used) |
matching_columns |
(list) a named list of column names that all contain the same information. All columns named in the same list element will be merged into a single column with the same name as the list element. |
keep_cols |
(character) a list of column names to keep in the final joined table. All other columns will be dropped. |
source_col |
(character) the name of a new column that will contain the tags of the original source object for each row in the joined table. |
exclude_cols |
(character) the names of columns to exclude from the joined table. |
as |
(character) the type of object the joined table should be returned as e.g. "lcms_table". |
an annotation_source object
M <- annotation_source(data = data.frame(id = 1, value = "A"))
N <- annotation_source(data = data.frame(id = 2, value = "B"))
O <- vertical_join(M, N, keep_cols = ".all")
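The padding behaviour of a vertical join can be pictured with plain data.frames. The base R sketch below pads each table with NA for the columns it lacks, stacks the rows and records where each row came from. It is an illustration of the concept only: pad_missing is a hypothetical helper, the tags "A" and "B" are made up, and the matching_columns, keep_cols and exclude_cols options are not modelled.

```r
# Two hypothetical sources sharing only the id column.
a <- data.frame(id = 1:2, mz = c(100.1, 200.2))
b <- data.frame(id = 3, rt = 45)

# Hypothetical helper: add any missing columns as NA and
# return the columns in a common order.
pad_missing <- function(df, all_cols) {
  for (col in setdiff(all_cols, names(df))) {
    df[[col]] <- NA
  }
  df[, all_cols, drop = FALSE]
}

all_cols <- union(names(a), names(b))

# Stack the padded tables and label each row with its source tag.
joined <- rbind(
  cbind(pad_missing(a, all_cols), annotation_source = "A"),
  cbind(pad_missing(b, all_cols), annotation_source = "B")
)
print(joined)
```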
Returns a list of quosures for use with filter_records to allow the use of dplyr-style expressions. See examples.
wherever(...)
... |
Expressions that return a logical value and are defined in terms
of the columns in the annotation_source. If multiple conditions are
included then they are combined with the |
a list of quosures for use with filter_records
# some annotation data
AN <- annotation_source(data = iris)
# filter to setosa where Sepal length is less than 5
M <- filter_records(
  wherever(
    Species == "setosa",
    Sepal.Length < 5
  )
)
M <- model_apply(M, AN)
predicted(M) # 20 rows
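The "20 rows" noted in the example can be cross-checked in base R without the package. The sketch below applies the same two conditions directly to iris and requires both to hold for a row to be kept, which is an assumption about how the conditions are combined that happens to reproduce the stated row count.

```r
# Apply both conditions from the example directly to the iris data.
keep <- iris$Species == "setosa" & iris$Sepal.Length < 5
filtered <- iris[keep, , drop = FALSE]

# Number of records that satisfy both conditions.
n_filtered <- nrow(filtered)
print(n_filtered)
```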
Writes a data.frame to an annotation_database.
write_database(obj, ...)
## S4 method for signature 'annotation_database'
write_database(obj, df)
## S4 method for signature 'rds_database'
write_database(obj, df)
## S4 method for signature 'sqlite_database'
write_database(obj, df)
obj |
A |
... |
additional database specific inputs |
df |
(data.frame) the data.frame to store in the database. |
Silently returns TRUE if successful, FALSE otherwise
M <- rds_database(tempfile())
write_database(M, data.frame())
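The return convention described above (silently TRUE on success, FALSE otherwise) can be sketched for an RDS file using base R. The function write_rds_table is a hypothetical stand-in written for this illustration; it assumes the table is stored with saveRDS and does not reproduce the package's rds_database method.

```r
# Hypothetical helper, not the MetMashR implementation: write a
# data.frame to an RDS file and invisibly report success or failure.
write_rds_table <- function(path, df) {
  ok <- tryCatch(
    {
      saveRDS(df, file = path)
      TRUE
    },
    error = function(e) FALSE
  )
  invisible(ok)
}

# Round trip through a temporary file.
path <- tempfile(fileext = ".rds")
ok <- write_rds_table(path, data.frame(id = 1:3))
restored <- readRDS(path)
print(restored)
```

Returning the status invisibly keeps the console quiet in interactive use while still letting calling code test the result.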