Title: | Mismatch Tolerant Maximum Common Substructure Searching |
---|---|
Description: | The fmcsR package introduces an efficient maximum common substructure (MCS) algorithm combined with a novel matching strategy that allows for atom and/or bond mismatches in the substructures shared between two small molecules. The resulting flexible MCSs (FMCSs) are often larger than strict MCSs, resulting in the identification of more common features in their source structures, as well as a higher sensitivity in finding compounds with weak structural similarities. The fmcsR package provides several utilities to use the FMCS algorithm for pairwise compound comparisons, structure similarity searching and clustering. |
Authors: | Yan Wang, Tyler Backman, Kevin Horan, Thomas Girke |
Maintainer: | Thomas Girke <[email protected]> |
License: | Artistic-2.0 |
Version: | 1.49.0 |
Built: | 2024-12-29 05:37:43 UTC |
Source: | https://github.com/bioc/fmcsR |
The package consists of two main functions: fmcs, which computes the flexible MCS between two SDF objects, and fmcsBatch, which runs the FMCS algorithm on an SDFset.
Package: | fmcsR |
Type: | Package |
Version: | 1.0 |
Date: | 2012-02-01 |
Yan Wang
Maintainer: Yan Wang <[email protected]>
library(fmcsR)
data(sdfsample)
sdfset <- sdfsample

## Pairwise comparisons with fmcs
result1 <- fmcs(sdfset[[1]], sdfset[[2]])
result2 <- fmcs(sdfset[[1]], sdfset[[2]], au=3)
result3 <- fmcs(sdfset[[1]], sdfset[[2]], bu=3)
result4 <- fmcs(sdfset[[1]], sdfset[[2]], au=1, bu=1)
result5 <- fmcs(sdfset[[1]], sdfset[[2]], matching.mode="aromatic")
result6 <- fmcs(sdfset[[1]], sdfset[[2]], au=2, bu=1, matching.mode="aromatic")

## Batch searches of one query against an SDFset with fmcsBatch
fmcsBatch(sdfset[[1]], sdfset[1:3])
fmcsBatch(sdfset[[1]], sdfset[1:3], au=2)
fmcsBatch(sdfset[[1]], sdfset[1:3], bu=1)
fmcsBatch(sdfset[[1]], sdfset[1:3], matching.mode="aromatic", au=1, bu=1)
R function to call the C++ implementation of the flexible maximum common substructure (FMCS) algorithm. The FMCS algorithm provides an improved maximum common substructure (MCS) search method that allows atom and/or bond mismatches in the substructures shared between two small molecules. The resulting flexible MCSs (FMCSs) are often larger than strict MCSs, resulting in the identification of more common features in their source structures, as well as a higher sensitivity in detecting weak similarities among compounds.
fmcs(sdf1, sdf2, al = 0, au = 0, bl = 0, bu = 0, matching.mode = "static", fast = FALSE, timeout=60000)
sdf1 | Input query |
sdf2 | Input target |
al | Lower bound for the number of atom mismatches. |
au | Upper bound for the number of atom mismatches. |
bl | Lower bound for the number of bond mismatches. |
bu | Upper bound for the number of bond mismatches. |
matching.mode | Three modes for bond matching are supported: "static", "aromatic", and "ring". |
fast | If |
timeout | The maximum amount of time to spend searching, in milliseconds. A value of 0 indicates no timeout. |
Returns an object of class MCS.
Yan Wang, Thomas Girke
Publication in preparation.
plotMCS, fmcsBatch, ?"MCS-class"
library(fmcsR)
data(sdfsample)
sdfset <- sdfsample
mcs1 <- fmcs(sdfset[[1]], sdfset[[2]])
mcsfast <- fmcs(sdfset[[1]], sdfset[[2]], fast=TRUE)
mcs2 <- fmcs(sdfset[[1]], sdfset[[2]], au=3)
mcs3 <- fmcs(sdfset[[1]], sdfset[[2]], bu=3)
mcs4 <- fmcs(sdfset[[1]], sdfset[[2]], au=1, bu=1)
mcs5 <- fmcs(sdfset[[1]], sdfset[[2]], matching.mode="aromatic")
mcs6 <- fmcs(sdfset[[1]], sdfset[[2]], au=2, bu=1, matching.mode="aromatic")

## Plot MCS objects
plotMCS(mcs6)

## Methods to return components of MCS objects
stats(mcs6)
mcs6[["stats"]]
mcs1(mcs6)
mcs6[["mcs1"]]
mcs2(mcs6)
mcs6[["mcs2"]]

## Constructor method from list
mylist <- list(stats=stats(mcs6), mcs1=mcs1(mcs6), mcs2=mcs2(mcs6))
mymcs <- as(mylist, "MCS")
Compound search function that runs the FMCS algorithm for a query compound against a set of molecules stored in an SDFset container.
fmcsBatch(querySdf, sdfset, al = 0, au = 0, bl = 0, bu = 0, matching.mode = "static", timeout = 60000, numParallel = 1)
querySdf | Input query |
sdfset | Input target |
al | Lower bound for the number of atom mismatches. |
au | Upper bound for the number of atom mismatches. |
bl | Lower bound for the number of bond mismatches. |
bu | Upper bound for the number of bond mismatches. |
matching.mode | Three matching modes are supported: "static", "aromatic", and "ring". |
timeout | The maximum amount of time to spend on each pairwise comparison, in milliseconds. A value of 0 indicates no timeout. |
numParallel | The number of comparisons to run in parallel, using local cores. |
This function runs the FMCS algorithm in fast computing mode. Thus, it will only return the similarity scores and size information about the source structures and their MCSs, while omitting all structural information.
Returns a matrix with compound IDs as row names and the following columns: Query_Size, Target_Size, MCS_Size, Tanimoto_Coefficient and Overlap_Coefficient. For details, see the vignette of this package.
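The two coefficient columns can be recomputed from the three size columns. The following is a minimal sketch in base R that does not call fmcsBatch itself. It assumes the definitions Tanimoto = c / (a + b - c) and Overlap = c / min(a, b), where a and b are the query and target sizes and c is the MCS size; the package vignette should be checked for the authoritative definitions. The helper name mcs_coefficients, the compound IDs, and the sizes are hypothetical and are not taken from sdfsample.

```r
# Hypothetical helper (not part of fmcsR): rebuild a score matrix with the
# same column names from size information alone.
# Assumed definitions: Tanimoto = c / (a + b - c), Overlap = c / min(a, b).
mcs_coefficients <- function(query_size, target_size, mcs_size) {
  cbind(
    Query_Size = query_size,
    Target_Size = target_size,
    MCS_Size = mcs_size,
    Tanimoto_Coefficient = mcs_size / (query_size + target_size - mcs_size),
    Overlap_Coefficient = mcs_size / pmin(query_size, target_size)
  )
}

# Made-up sizes for one query compared against three targets.
scores <- mcs_coefficients(
  query_size  = c(33, 33, 33),
  target_size = c(33, 26, 20),
  mcs_size    = c(33, 11, 10)
)
rownames(scores) <- c("cmp1", "cmp2", "cmp3")

# Rank the targets by decreasing Tanimoto coefficient.
ranked <- scores[order(scores[, "Tanimoto_Coefficient"], decreasing = TRUE), ]
print(round(ranked, 3))
```

Under these definitions the overlap coefficient reaches 1 whenever the smaller structure is fully contained in the larger one, so it favors substructure relationships more strongly than the Tanimoto coefficient does.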
Yan Wang, Thomas Girke
plotMCS, fmcs, ?"MCS-class"
library(fmcsR)
data(sdfsample)
sdfset <- sdfsample
fmcsBatch(sdfset[[1]], sdfset[1:3])
fmcsBatch(sdfset[[1]], sdfset[1:3], au=2)
fmcsBatch(sdfset[[1]], sdfset[1:3], bu=1)
fmcsBatch(sdfset[[1]], sdfset[1:3], matching.mode="aromatic", au=1, bu=1)
SDFset object
Sample compound structures stored in SDF format.
data(fmcstest)
Object of class SDFset
Object stores X molecules from a sample SD file.
ftp://ftp.ncbi.nih.gov/pubchem/
SDF format definition: http://www.symyx.com/downloads/public/ctfile/ctfile.jsp
library(fmcsR)  # needed so that data() can find the fmcstest data set
data(fmcstest)
sdfset <- fmcstest
view(sdfset)
"MCS"
List-like container for storing results from the fmcs function.
Objects can be created by calls of the form new("MCS", ...).
stats: Object of class "numeric"
mcs1: Object of class "SDFset"
mcs2: Object of class "SDFset"
signature(x = "MCS"): ...
signature(from = "list", to = "MCS"): ...
signature(x = "MCS"): ...
signature(x = "MCS"): ...
signature(x = "MCS"): ...
...
Yan Wang
...
Related classes: SDF, SDFstr
library(fmcsR)  # needed for the MCS class, fmcs and the sample data

## Create MCS instance
showClass("MCS")
data(sdfsample)
sdfset <- sdfsample
mcs <- fmcs(sdfset[[1]], sdfset[[2]], au=2, bu=2)

## Methods to return components of MCS
stats(mcs)
mcs[["stats"]]
mcs1(mcs)
mcs[["mcs1"]]
mcs2(mcs)
mcs[["mcs2"]]

## Constructor method from list
mylist <- list(stats=stats(mcs), mcs1=mcs1(mcs), mcs2=mcs2(mcs))
mymcs <- as(mylist, "MCS")
Helper function to run atomsubset from the ChemmineR library on MCS objects in order to obtain their results in SDFset format.
mcs2sdfset(x, ...)
x | Object of class |
... | Arguments to be passed to/from other methods. |
Returns MCS data in the form of a list containing two SDFset objects, one for the query and one for the target structure.
List with two SDFset objects.
...
Thomas Girke
...
fmcs
library(fmcsR)
data(sdfsample)
sdfset <- sdfsample
mcs <- fmcs(sdfset[[1]], sdfset[[2]], au=2, bu=1, matching.mode="aromatic")
mcs2sdfset(x=mcs, type="new")
mcs2sdfset(x=mcs, type="old")[[1]][[1]]
plot(mcs2sdfset(x=mcs, type="new")[[1]][1])
Convenience plotting function to visualize and compare MCSs generated by the fmcs function.
plotMCS(x, mcs = 1, print = FALSE, ...)
x | |
mcs | Selection of MCS solution by position number, default is 1. |
print | |
... | Arguments to be passed to/from other methods. |
The two structures, target and query, used to generate x with a call to fmcs are plotted next to each other, and the corresponding MCS substructures are highlighted in color.
Prints a summary of the MCS to the screen and plots the structures to the graphics device.
...
Yan Wang
...
sdf.visualize
library(fmcsR)
data(sdfsample)
sdfset <- sdfsample
mcs <- fmcs(sdfset[[1]], sdfset[[2]], au=2, bu=1, matching.mode="aromatic")
plotMCS(mcs, mcs=1)