---
title: "owlents: using OWL directly in ontoProc"
author: "Vincent J. Carey, stvjc at channing.harvard.edu"
date: "`r format(Sys.time(), '%B %d, %Y')`"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{owlents: using OWL directly in ontoProc}
  %\VignetteEncoding{UTF-8}
bibliography: ontobib.bib
output:
  BiocStyle::html_document:
    highlight: pygments
    number_sections: yes
    theme: united
    toc: yes
---

# Introduction

In Bioconductor 3.19, ontoProc can work with OWL RDF/XML
serializations of ontologies, via the
[owlready2](https://owlready2.readthedocs.io/en/v0.42/) Python modules.

The `owl2cache` function retrieves OWL from a URL or file and
places it in a cache to avoid repeated retrievals. The default
cache is the one defined by `BiocFileCache::BiocFileCache()`.

Here we work with the Cell Ontology (CL). `setup_entities2` uses
basilisk to acquire the owlready2 Python modules, parses the OWL
with them, and produces an `ontology_index` instance (a class
defined in the CRAN package ontologyIndex).

```{r getcl,message=FALSE}
library(ontoProc)
# download cl.owl once; later calls reuse the cached copy
clont_path = owl2cache(url="http://purl.obolibrary.org/obo/cl.owl")
# parse the OWL via owlready2 and build an ontology_index instance
cle = setup_entities2(clont_path)
cle
```

The usual plotting approach works.

```{r lkcl}
# an ad hoc selection of Cell Ontology class identifiers
sel = c("CL_0000492", "CL_0001054", "CL_0000236",
        "CL_0000625", "CL_0000576", "CL_0000623",
        "CL_0000451", "CL_0000556")
onto_plot2(cle, sel)
```

# Illustration with Human Phenotype ontology

We'll obtain an ad hoc selection of 10 UBERON terms
and visualize the hierarchy.

```{r gethp}
hpont_path = owl2cache(url="http://purl.obolibrary.org/obo/hp.owl")
hpents = setup_entities2(hpont_path)
# keep the 21st through 30th identifiers that match "UBER"
kp = grep("UBER", names(hpents$name), value=TRUE)[21:30]
onto_plot2(hpents, kp)
```

The prefixes of class names in the ontology give a sense
of its scope.

```{r lkta}
# split each class name at "_", keep the prefix, and tabulate
t(t(table(sapply(strsplit(names(hpents$name), "_"), "[", 1))))
```

Terms from CL, GO, CHEBI, and UBERON play significant roles
in the ontological characterization of human phenotypes.
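
The prefix tabulation above can be tried without downloading an ontology. The following is a minimal sketch on a small hand-written vector; the identifiers in `ids` are illustrative values in the same `PREFIX_number` format, not a selection drawn from `hp.owl`, and the variable names are assumptions introduced here.

```{r prefixdemo}
# Illustrative identifiers in PREFIX_number form (hand-written, not taken from hp.owl)
ids = c("HP_0000118", "HP_0001626", "UBERON_0000948",
        "CL_0000236", "GO_0008150", "UBERON_0002107")

# Split each identifier at "_" and keep the first piece, i.e. the prefix
prefixes = sapply(strsplit(ids, "_"), "[", 1)

# Count how many identifiers carry each prefix
prefix_counts = table(prefixes)
prefix_counts
```

Wrapping the table in `t(t(...))`, as in the chunk that operates on `hpents`, only changes the display: it turns the one-dimensional table into a single-column matrix, which is easier to read when there are many prefixes.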