How to run CaDrA within a Docker Environment

Software requirements

  • Git version >= 2.21
  • Docker version >= 20.10

Don’t have Git installed? See Git Guides.

Don’t have Docker installed? See Docker Engine.
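The minimum versions listed above can be checked from a shell before continuing. This is a minimal sketch, assuming a sort command that supports the -V (version sort) option, as GNU coreutils does. The helpers version_ge, extract_version, and check_min_version are hypothetical names used for illustration and are not part of Git or Docker.

```shell
# version_ge A B: succeeds when version A >= version B.
# Hypothetical helper; assumes `sort -V` (version sort) is available.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Pull the first dotted version number out of a string such as
# "git version 2.39.2" or "Docker version 24.0.5, build ced0996".
extract_version() {
  printf '%s\n' "$1" | grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Report whether an installed tool meets its minimum version.
check_min_version() {
  tool="$1"
  minimum="$2"
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: not installed"
    return 0
  fi
  found="$(extract_version "$("$tool" --version)")"
  if version_ge "$found" "$minimum"; then
    echo "$tool $found: OK (>= $minimum)"
  else
    echo "$tool $found: older than $minimum"
  fi
}

check_min_version git 2.21
check_min_version docker 20.10
```

Each call prints one line per tool, so a missing or outdated installation is visible before the clone and run steps.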

Build Docker image of CaDrA

(1) Clone this repository

git clone https://github.com/montilab/CaDrA
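After cloning, the image can be built locally from the cloned directory. This is a minimal sketch, assuming the Dockerfile sits at the root of the cloned CaDrA directory (an assumption, not confirmed here). The tag montilab/cadra:latest is chosen to match the run command in the next section, and build_cadra_image is a hypothetical helper name.

```shell
# Build the CaDrA image from a cloned repository directory.
# ASSUMPTION: the Dockerfile is at the root of that directory.
build_cadra_image() {
  repo_dir="${1:-CaDrA}"   # directory created by `git clone`
  # Run in a subshell so the caller's working directory is unchanged.
  ( cd "$repo_dir" && docker build -t montilab/cadra:latest . )
}

# Example use (run manually from the directory that contains CaDrA/):
#   build_cadra_image CaDrA
```

If the image is not built locally, docker run will try to pull an image with that name from the configured registry instead.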

Run the CaDrA container from the built image

docker run --name cadra -d -p 8787:8787 -e PASSWORD=CaDrA montilab/cadra:latest

--name: assign a name to the container
-d: run the container in detached mode (in the background)
-p: map a host port to a container port, as [host_port]:[container_port]
-e: set an environment variable in the container; here, PASSWORD sets the password for RStudio Server

For more information about the docker run syntax, see the Docker run reference.
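The flags above can be parameterized, for example when host port 8787 is already taken. This is a minimal sketch that reuses exactly the flags described above; run_cadra is a hypothetical wrapper name, and the defaults mirror the command shown earlier.

```shell
# Start the CaDrA container on a chosen host port with a chosen password.
# run_cadra is a hypothetical wrapper; the container port stays 8787.
run_cadra() {
  host_port="${1:-8787}"   # host side of the -p mapping
  password="${2:-CaDrA}"   # value passed to the container as PASSWORD
  docker run --name cadra -d \
    -p "${host_port}:8787" \
    -e PASSWORD="${password}" \
    montilab/cadra:latest
}

# Example use (run manually):
#   run_cadra 8888 'my-secret'
```

With the example call, RStudio Server is reached at http://localhost:8888 instead of http://localhost:8787, and the login password is the one supplied.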

Check that the container is running

docker ps

CONTAINER ID   IMAGE                    COMMAND       CREATED        STATUS        PORTS                    NAMES
b37b6b19c4e8   montilab/cadra:latest    "/init"       5 hours ago    Up 5 hours    0.0.0.0:8787->8787/tcp   cadra
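The same check can be scripted, along with cleanup when the container needs to be recreated. This is a minimal sketch, assuming the container was started with --name cadra as above. cadra_is_running and cadra_remove are hypothetical helper names; docker ps --format, docker logs, docker stop, and docker rm are standard Docker CLI commands.

```shell
# Succeeds if a running container named exactly "cadra" is listed by `docker ps`.
cadra_is_running() {
  docker ps --format '{{.Names}}' | grep -Fxq 'cadra'
}

# Stop and remove the container so the name "cadra" can be reused later.
cadra_remove() {
  docker stop cadra && docker rm cadra
}

# Example use (run manually):
#   cadra_is_running && echo "cadra is up" || docker logs cadra
#   cadra_remove
```

If the container is not listed, docker logs cadra shows its startup output, which is usually the quickest way to see why it exited.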

Run CaDrA on RStudio Server hosted within a Docker environment

Using your preferred web browser, go to http://localhost:8787. You will be prompted to log in to RStudio Server. Enter the following credentials:

username: rstudio
password: CaDrA

Once RStudio Server opens, copy the following commands and run them in the R console. The script searches for candidate drivers associated with YAP/TAZ activity in the BRCA dataset provided with the package.

# Load R packages
library(CaDrA)
library(SummarizedExperiment)

## Read in BRCA GISTIC+Mutation object
utils::data(BRCA_GISTIC_MUT_SIG)
eset_mut_scna <- BRCA_GISTIC_MUT_SIG

## Read in input score
utils::data(TAZYAP_BRCA_ACTIVITY)
input_score <- TAZYAP_BRCA_ACTIVITY

## Samples to keep based on the overlap between the two inputs
overlap <- base::intersect(base::names(input_score), base::colnames(eset_mut_scna))
eset_mut_scna <- eset_mut_scna[,overlap]
input_score <- input_score[overlap]

## Binarize FS to only have 0's and 1's
SummarizedExperiment::assay(eset_mut_scna)[SummarizedExperiment::assay(eset_mut_scna) > 1] <- 1.0

## Pre-filter FS based on occurrence frequency
eset_mut_scna_flt <- CaDrA::prefilter_data(
  FS = eset_mut_scna,
  max_cutoff = 0.6,  # max event frequency (60%)
  min_cutoff = 0.03  # min event frequency (3%)
) 

# Run candidate search
topn_res <- CaDrA::candidate_search(
  FS = eset_mut_scna_flt,
  input_score = input_score,
  method = "ks_pval",          # Use Kolmogorov-Smirnov scoring function
  method_alternative = "less", # Use one-sided hypothesis testing
  weights = NULL,              # If weights is provided, perform a weighted-KS test
  search_method = "both",      # Apply both forward and backward search
  top_N = 7,                   # Evaluate top 7 starting points for each search
  max_size = 7,                # Maximum size a meta-feature matrix can extend to
  do_plot = FALSE,             # Plot after finding the best features
  best_score_only = FALSE      # Return all results from the search
)

## Fetch the meta-feature set with the best score across the top N searches
topn_best_meta <- CaDrA::topn_best(topn_res)

# Visualize the best results with the meta-feature plot
CaDrA::meta_plot(topn_best_list = topn_best_meta, input_score_label = "YAP/TAZ Activity")

# Evaluate results across top N features you started from
CaDrA::topn_plot(topn_res) 


Any questions or issues? Please report them on our GitHub issues page.