---
title: "owlents: using OWL directly in ontoProc"
author: "Vincent J. Carey, stvjc at channing.harvard.edu"
date: "`r format(Sys.time(), '%B %d, %Y')`"
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{owlents: using OWL directly in ontoProc}
  %\VignetteEncoding{UTF-8}
bibliography: ontobib.bib
output:
  BiocStyle::html_document:
    highlight: pygments
    number_sections: yes
    theme: united
    toc: yes
---

# Introduction

In Bioconductor 3.19, ontoProc can work with OWL RDF/XML
serializations of ontologies via the
[owlready2](https://owlready2.readthedocs.io/en/v0.42/) Python modules.

The `owl2cache` function retrieves OWL from a URL or file
and places it in a cache to avoid repeated retrievals.  The
default cache is the one defined by `BiocFileCache::BiocFileCache()`.
Here we work with the Cell Ontology.  `setup_entities2` uses basilisk
to acquire the owlready2 Python modules, parses the OWL with them,
and produces an `ontology_index` instance
(defined in the CRAN package ontologyIndex).


```{r getcl,message=FALSE}
library(ontoProc)
clont_path = owl2cache(url="http://purl.obolibrary.org/obo/cl.owl")
cle = setup_entities2(clont_path)
cle
```
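
The caching idea can be sketched with BiocFileCache directly.
This is an illustration of the pattern under stated assumptions, not
the implementation of `owl2cache`: the cache lives in a temporary
directory rather than the default location, and the resource name
`toy.owl` and its contents are hypothetical stand-ins for a retrieved
OWL file.

```{r cachesketch,message=FALSE}
library(BiocFileCache)
# throwaway cache in a temporary directory, so the default cache is untouched
bfc = BiocFileCache(tempfile("owlcache"), ask=FALSE)
# hypothetical stand-in for an OWL file that has been retrieved once
src = tempfile(fileext=".owl")
writeLines("<rdf:RDF/>", src)
# before the file is added, a lookup by resource name finds nothing
before = bfcquery(bfc, "toy.owl", field="rname")
nrow(before)
# add the file; the return value is the path of the cached copy
cached = bfcadd(bfc, rname="toy.owl", fpath=src)
# a later lookup finds the cached copy, so no new retrieval is needed
hit = bfcquery(bfc, "toy.owl", field="rname")
hit$rpath
```

Looking up the resource name before retrieving is what lets a
second call skip the download.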

The usual plotting approach, `onto_plot2`, works with the resulting object.
```{r lkcl}
sel = c("CL_0000492", "CL_0001054", "CL_0000236", 
"CL_0000625", "CL_0000576", 
"CL_0000623", "CL_0000451", "CL_0000556")
onto_plot2(cle, sel)
```
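
To make the structure of an `ontology_index` instance concrete,
the following sketch builds a toy instance by hand with
`ontologyIndex::ontology_index()`.  The identifiers `T_1` to `T_4`
and their names are invented for illustration and do not occur
in any real ontology.

```{r toyindex}
library(ontologyIndex)
# hypothetical four-term hierarchy: T_4 has two parents, T_2 and T_3
toy = ontology_index(
  parents = list(T_1 = character(0), T_2 = "T_1",
                 T_3 = "T_1", T_4 = c("T_2", "T_3")),
  name = c(T_1 = "root", T_2 = "left", T_3 = "right", T_4 = "leaf"))
# term names are a character vector named by identifier
toy$name["T_4"]
# parent links are a list named by identifier
toy$parents[["T_4"]]
# get_ancestors returns the term together with all terms above it
anc = get_ancestors(toy, "T_4")
anc
```

The same accessors can be applied to an instance produced from OWL,
using identifiers such as those passed to `onto_plot2`.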

# Illustration with the Human Phenotype Ontology

We'll obtain an ad hoc selection of
10 UBERON term names from the Human Phenotype Ontology
and visualize their hierarchy.

```{r gethp}
hpont_path = owl2cache(url="http://purl.obolibrary.org/obo/hp.owl")
hpents = setup_entities2(hpont_path)
kp = grep("UBER", names(hpents$name), value=TRUE)[21:30]
onto_plot2(hpents, kp)
```

The prefixes of class names in the ontology
give a sense of its scope.
```{r lkta}
t(t(table(sapply(strsplit(names(hpents$name), "_"), "[", 1))))
```
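
The prefix tabulation can be unpacked on a small example.
In this sketch the identifier vector `ids` is written by hand for
illustration; it is an assumption, not output from `hpents`.

```{r prefixsketch}
# hand-written identifiers standing in for names(hpents$name)
ids = c("HP_0000001", "HP_0000118", "UBERON_0000062",
        "CL_0000000", "GO_0008150", "UBERON_0001062")
# split each identifier at the underscore and keep the leading piece
prefix = vapply(strsplit(ids, "_"), function(x) x[1], character(1))
# count how many identifiers carry each prefix
tab = table(prefix)
tab
```

Using `vapply` with `character(1)` instead of `sapply` ensures
that a character vector is returned even when the input is empty.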

CL, GO, CHEBI, and UBERON play significant roles
in the ontological characterization
of human phenotypes.