5. Tutorial: Export Human Protein Atlas (HPA) data as JSON

library(BiocStyle)
library(HPAanalyze)
library(dplyr)
library(jsonlite)

The case

In certain situations, users may want to export downloaded HPA data into JavaScript Object Notation (JSON) format for purposes such as asynchronous, real-time server-to-browser communication. To reduce package dependencies, HPAanalyze does not support exporting to JSON via the hpaExport function. However, this can be done with a short script, as described below.

The solution

Exporting data to JSON can be achieved by converting the dataframes returned by hpaDownload/hpaSubset to JSON format using the jsonlite package, then writing each result to a .json file.

Download and subset data

The datasets need no special processing. You can download and subset data as usual. The resulting object is a list of dataframes.

data <- hpaDownload(downloadList = "histology", version = "example")
data_subset <-
  hpaSubset(data,
            targetGene = c('TP53', 'EGFR', 'CD44', 'PTEN', 'IDH1'))

Convert dataframes to JSON

The list of dataframes is then converted to a list of JSON strings using jsonlite::toJSON.

data_json <- lapply(data_subset, jsonlite::toJSON)

str(data_json)

# List of 3
#  $ normal_tissue       : 'json' chr "[{\"ensembl\":\"ENSG00000026508\",\"gene\":\"CD44\",\"tissue\":\"adrenal gland\",\"cell_type\":\"glandular cell"| __truncated__
#  $ pathology           : 'json' chr "[{\"ensembl\":\"ENSG00000026508\",\"gene\":\"CD44\",\"cancer\":\"breast cancer\",\"high\":1,\"medium\":6,\"low\"| __truncated__
#  $ subcellular_location: 'json' chr "[{\"ensembl\":\"ENSG00000026508\",\"gene\":\"CD44\",\"reliability\":\"Enhanced\",\"enhanced\":\"Golgi apparatus"| __truncated__
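
To make the conversion step easier to inspect, here is a minimal sketch on a small stand-in dataframe. The `toy` table and its values are made up for illustration and are not HPA data; only `jsonlite::toJSON`, `jsonlite::validate` and `jsonlite::fromJSON` are real package functions. It shows that each row becomes one JSON object and that the string can be parsed back into a dataframe.

```r
library(jsonlite)

# Hypothetical stand-in for one HPA table (values are illustrative only)
toy <- data.frame(
  gene = c("TP53", "EGFR"),
  tissue = c("lung", "liver"),
  stringsAsFactors = FALSE
)

# Convert the dataframe to a JSON string; by default each row becomes one object
toy_json <- jsonlite::toJSON(toy)
print(toy_json)

# Check that the result is syntactically valid JSON
is_valid <- jsonlite::validate(toy_json)

# Parse the JSON back into a dataframe and compare it with the original
toy_back <- jsonlite::fromJSON(toy_json)
round_trip_ok <- isTRUE(all.equal(toy, toy_back))
```

Passing `pretty = TRUE` to `jsonlite::toJSON` produces indented output, which may be easier to read when checking a file by eye.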

Write JSON file

Finally, the .json files can be saved to your working directory using the following code. Note that there will be one .json file for each dataset.

# Write each JSON string to its own file, named after the dataset
for (i in seq_along(data_json)) {
  write(data_json[[i]],
        file = paste0("hpa_data_", names(data_json)[i], ".json"))
}
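
As an alternative sketch, jsonlite also provides `write_json`, which converts and writes in one call. The example below assumes a made-up list (`toy_list`) standing in for the output of hpaSubset, and writes to a temporary directory rather than the working directory so that it can be run without side effects.

```r
library(jsonlite)

# Hypothetical stand-in for the list of dataframes returned by hpaSubset()
toy_list <- list(
  normal_tissue = data.frame(gene = "TP53", tissue = "lung",
                             stringsAsFactors = FALSE),
  pathology = data.frame(gene = "TP53", cancer = "breast cancer",
                         stringsAsFactors = FALSE)
)

# Write to a temporary directory (an assumption made for this sketch)
out_dir <- tempdir()

# One .json file per dataset, named after the list element
out_files <- character(0)
for (nm in names(toy_list)) {
  out_path <- file.path(out_dir, paste0("hpa_data_", nm, ".json"))
  jsonlite::write_json(toy_list[[nm]], path = out_path)
  out_files <- c(out_files, out_path)
}

# Read one file back to confirm that it parses into a dataframe
check <- jsonlite::fromJSON(out_files[1])
```

Whether to convert first and write afterwards, or to do both in one call, is mostly a matter of taste; keeping the intermediate JSON strings is useful if they are also sent elsewhere, for example to a browser.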

In one function

If you routinely export HPA data into JSON format, the following function allows you to do so with the same syntax as hpaExport.

## The function (note that you don't need to put .json into the file name)
hpaExportJSON <- function(data, fileName) {
  data_json <- lapply(data, jsonlite::toJSON)
  for (i in seq_along(data_json)) {
    write(data_json[[i]],
          file = paste0(fileName, "_", names(data_json[i]), ".json"))
  }
}

## Export data subset
hpaExportJSON(data_subset, fileName = "hpa_data")