Loading a ChAS export file and doing a bit of cleaning
ChAS export files contain only basic information about the copy
number (gain, loss or LOH), and a segment may overlap the centromere.
When the file is loaded by OncoscanR (load_chas function), all segments
are assigned to a chromosomal arm and split if necessary.
The LOH segments are reported by ChAS independently of the copy number
variation segments. Therefore an LOH segment may overlap with a copy
loss. As this information is redundant (a copy loss always implies
LOH), we need to trim and split these LOH segments with the adjust_loh
function.
library(magrittr)
# Load the ChAS file
chas.fn <- system.file("extdata", "chas_example.txt", package = "oncoscanR")
segments <- load_chas(chas.fn, oncoscan_na33.cov)
# Clean the segments: restricted to Oncoscan coverage, LOH not overlapping
# with copy loss segments, smooth&merge segments within 300kb and prune
# segments smaller than 300kb.
segs.clean <- trim_to_coverage(segments, oncoscan_na33.cov) %>%
  adjust_loh() %>%
  merge_segments() %>%
  prune_by_size()
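To illustrate what trimming and splitting an LOH segment around a copy loss means, here is a minimal base-R sketch on plain coordinates. The helper trim_loh is hypothetical and is not the adjust_loh implementation; it assumes inclusive coordinates and a single loss segment.

```r
# Hypothetical helper (not part of oncoscanR): remove from one LOH interval
# the part that overlaps one copy-loss interval. Coordinates are inclusive.
trim_loh <- function(loh_start, loh_end, loss_start, loss_end) {
  # No overlap: the LOH segment is kept unchanged
  if (loss_end < loh_start || loss_start > loh_end) {
    return(data.frame(start = loh_start, end = loh_end))
  }
  pieces <- data.frame(start = numeric(0), end = numeric(0))
  # Part of the LOH segment located before the loss
  if (loss_start > loh_start) {
    pieces <- rbind(pieces, data.frame(start = loh_start, end = loss_start - 1))
  }
  # Part of the LOH segment located after the loss
  if (loss_end < loh_end) {
    pieces <- rbind(pieces, data.frame(start = loss_end + 1, end = loh_end))
  }
  pieces
}

# A loss in the middle of an LOH segment splits it into two pieces
trim_loh(1000, 9000, 4000, 5000)
# A loss covering the whole LOH segment leaves nothing
trim_loh(1000, 9000, 500, 9500)
```

In the package itself the same idea is applied to all LOH and loss segments at once by adjust_loh.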
Of note, the oncoscan_na33.cov object contains the genomic coverage of
the Oncoscan assay (start/end for each chromosomal arm, hg19). One
could re-compute the latter by downloading the annotation file from the
ThermoFisher website and processing it with the
get_oncoscan_coverage_from_probes function.
A similar function is available for loading a result file from the
ASCAT program: load_ascat. The ASCAT file is expected to have the
following column names:
- ‘chr’ (chromosome number, with or without “chr”)
- ‘startpos’ (first position of CNV segment)
- ‘endpos’ (last position of CNV segment)
- ‘nMajor’ (number of copies of the major allele)
- ‘nMinor’ (number of copies of the minor allele)
ascat.fn <- system.file("extdata", "ascat_example.txt", package = "oncoscanR")
ascat.segments <- load_ascat(ascat.fn, oncoscan_na33.cov)
head(ascat.segments)
#> GRanges object with 2 ranges and 2 metadata columns:
#> seqnames ranges strand | cn cn.type
#> <Rle> <IRanges> <Rle> | <numeric> <character>
#> [1] 1q 144009053-249212878 * | 3 Gain
#> [2] 5p 38139-46193462 * | 3 Gain
#> -------
#> seqinfo: 43 sequences from an unspecified genome; no seqlengths
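Before calling load_ascat it can help to check that a table has the expected columns. The sketch below uses a toy data frame with made-up values. The derived columns rest on an assumption that is not stated above: that the total copy number is nMajor + nMinor and that LOH corresponds to nMinor == 0.

```r
# Toy table in the expected ASCAT layout (values are made up for illustration)
ascat <- data.frame(
  chr      = c("1", "chr5", "17"),
  startpos = c(144009053, 38139, 400959),
  endpos   = c(249212878, 46193462, 22217883),
  nMajor   = c(2, 2, 1),
  nMinor   = c(1, 1, 0)
)

# Check that all required columns are present
required <- c("chr", "startpos", "endpos", "nMajor", "nMinor")
missing.cols <- setdiff(required, colnames(ascat))
if (length(missing.cols) > 0) {
  stop("Missing ASCAT columns: ", paste(missing.cols, collapse = ", "))
}

# Assumption: total copy number is nMajor + nMinor and LOH means nMinor == 0
ascat$cn  <- ascat$nMajor + ascat$nMinor
ascat$loh <- ascat$nMinor == 0
# Chromosome names with the optional "chr" prefix removed
ascat$chr <- sub("^chr", "", ascat$chr)
ascat
```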
Computation of arm-level alteration
Function armlevel_alt
An arm is declared globally altered if more than 90% of its bases are
altered with a similar CNV type (amplifications [3 extra copies or
more], gains [1-2 extra copies], losses or copy-neutral losses of
heterozygosity [LOH]). For instance, “gain of 3p” indicates that there
is more than 90% of the arm with 3 copies but less than 90% with 5
(otherwise it would be an amplification). Prior to computation, segments
of same copy number and at a distance <300Kbp (Oncoscan resolution
genome-wide) are merged. The remaining segments are filtered to a
minimum size of 300Kbp.
For instance if we want to get all arms that have a global LOH
alteration, we run:
chas.fn <- system.file("extdata", "triploide_gene_list_full_location.txt",
                       package = "oncoscanR")
segments <- load_chas(chas.fn, oncoscan_na33.cov)
armlevel.loh <- get_loh_segments(segments) %>%
  armlevel_alt(kit.coverage = oncoscan_na33.cov)
The variable armlevel.loh is a named vector containing the arms whose
percentage of bases with LOH is above the threshold (90%). To obtain
the percentage of LOH bases in all arms, one could set the threshold to
zero:
armlevel.loh <- get_loh_segments(segments) %>%
  armlevel_alt(kit.coverage = oncoscan_na33.cov, threshold = 0)
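The arm-level rule can be illustrated on plain coordinates. The sketch below is not the armlevel_alt implementation; arm_fraction is a hypothetical helper that assumes non-overlapping segments, inclusive coordinates and a toy arm of 100 Mb.

```r
# Hypothetical helper (not the oncoscanR implementation): fraction of an arm
# covered by non-overlapping altered segments, coordinates inclusive.
arm_fraction <- function(seg_start, seg_end, arm_start, arm_end) {
  altered <- sum(seg_end - seg_start + 1)
  altered / (arm_end - arm_start + 1)
}

# Toy arm of 100 Mb with two LOH segments covering 55 Mb and 40 Mb
frac <- arm_fraction(seg_start = c(1, 60e6 + 1),
                     seg_end   = c(55e6, 100e6),
                     arm_start = 1, arm_end = 100e6)
frac         # 0.95
frac > 0.9   # TRUE: above the 90% threshold, so the arm counts as altered
```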
HRD scores
The package contains several HRD scores described below.
Score LST
Function score_lst
Procedure based on the paper from Popova et al., Cancer Res. 2012
(PMID: 22933060). First, segments smaller than 3Mb are removed, then
segments are smoothed with respect to copy number at a distance of 3Mb.
The number of LSTs is the number of breakpoints (breakpoints closer than
3Mb are merged) that have a segment larger than or equal to 10Mb on each
side. This score was linked to BRCA1/2-deficient tumors.
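The counting steps described above can be sketched for a single chromosomal arm. This is a simplified illustration, not the score_lst implementation: count_lst_arm is a hypothetical helper that assumes a data frame with numeric start, end and cn columns sorted by start, and it applies the filtering and smoothing only once.

```r
# Hypothetical, simplified LST count for ONE chromosomal arm (not score_lst).
# `segs` is a data.frame with columns start, end, cn, sorted by start.
count_lst_arm <- function(segs, min_small = 3e6, min_large = 10e6) {
  # 1. Remove segments smaller than 3 Mb
  segs <- segs[segs$end - segs$start + 1 >= min_small, ]
  if (nrow(segs) < 2) return(0)
  # 2. Smooth: merge consecutive segments with the same copy number
  #    when they are less than 3 Mb apart
  merged <- segs[1, ]
  for (i in seq_len(nrow(segs))[-1]) {
    last <- nrow(merged)
    gap <- segs$start[i] - merged$end[last] - 1
    if (segs$cn[i] == merged$cn[last] && gap < min_small) {
      merged$end[last] <- segs$end[i]
    } else {
      merged <- rbind(merged, segs[i, ])
    }
  }
  if (nrow(merged) < 2) return(0)
  # 3. Count breakpoints flanked by two segments of at least 10 Mb
  size <- merged$end - merged$start + 1
  n <- nrow(merged)
  gap <- merged$start[-1] - merged$end[-n] - 1
  sum(size[-n] >= min_large & size[-1] >= min_large & gap < min_small)
}

# Toy arm: the 2 Mb segment is dropped, its two neighbours (same copy number)
# are merged, and one breakpoint remains between two large segments.
segs <- data.frame(start = c(1, 20e6 + 1, 22e6 + 1, 40e6 + 1),
                   end   = c(20e6, 22e6, 40e6, 60e6),
                   cn    = c(2, 3, 2, 3))
count_lst_arm(segs)
```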
Score HR-LOH
Function score_loh
Procedure based on the paper from Abkevich et al., Br J Cancer 2012
(PMID: 23047548). Number of LOH segments larger than 15Mb but excluding
segments on chromosomes with a global LOH alteration. This score was
linked to BRCA1/2-deficient tumors.
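As a rough illustration of this count, the sketch below filters a toy table of LOH segments (made-up values). It is not the score_loh implementation, and it simplifies the exclusion step by matching on the arm label of each segment.

```r
# Toy LOH segments (made-up values); `arm` is the chromosomal arm label
loh <- data.frame(
  arm   = c("1p", "5q", "17q", "2q"),
  start = c(1e6, 50e6, 30e6, 100e6),
  end   = c(20e6, 180e6, 40e6, 118e6)
)
# Arms with a global LOH alteration (simplification: matched by arm label)
arms.loh <- c("5q")

size <- loh$end - loh$start + 1
# Keep segments larger than 15 Mb that are not on a globally altered arm
keep <- size > 15e6 & !(loh$arm %in% arms.loh)
hrloh <- sum(keep)
hrloh
```

Here the 5q segment is discarded because of the global alteration and the 17q segment because of its size, which leaves two segments.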
Score nLST
Function score_nlst
HRD score developed at the Geneva University Hospitals (HUG) and based
on the LST score by Popova et al., but normalized by an estimate of the
number of whole-genome doubling events. Of note, copy-neutral LOH
segments are removed before computation.
nLST = LST - 7*W/2
where W is the number of whole-genome doubling events.
The score is positive if there are at least 15 nLST.
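The formula and the cutoff can be checked with a few lines of arithmetic. The helper nlst_score below is hypothetical (it is not score_nlst) and simply applies nLST = LST - 7*W/2 followed by the cutoff of 15.

```r
# Hypothetical helper reproducing the formula above (not the package function)
nlst_score <- function(lst, w, cutoff = 15) {
  nlst <- lst - 7 * w / 2
  list(nLST = nlst, HRD = if (nlst >= cutoff) "Positive" else "Negative")
}

# 26 LSTs with one whole-genome doubling event: 26 - 3.5 = 22.5 (Positive)
nlst_score(lst = 26, w = 1)
# 20 LSTs with two whole-genome doubling events: 20 - 7 = 13 (Negative)
nlst_score(lst = 20, w = 2)
```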
The nLST score has been validated on 469 high-grade ovarian cancer
samples from the PAOLA-1 clinical trial and is routinely used at the
Geneva University Hospitals to predict response to PARP inhibitors.
How to cite
Christinat Y, Ho L, Clément S, et al. 2022-RA-567-ESGO The Geneva HRD
test: clinical validation on 469 samples from the PAOLA-1 trial.
International Journal of Gynecologic Cancer 2022;32:A238-A239.
Score gLOH
Function score_gloh
The percentage genomic LOH score is computed as described in the
FoundationFocus CDx BRCA LOH assay, i.e. the percentage of bases covered
by the Oncoscan that display a loss of heterozygosity independently of
the number of copies, excluding chromosomal arms that have a global LOH
(>=90% of arm length; to compute with the armlevel_alt function on LOH
segments only). This score was linked to BRCA1/2-deficient tumors.
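A toy computation may help make the definition concrete. The sketch below is not the score_gloh implementation and uses made-up arm lengths and segments. It also makes one assumption the text does not settle: that arms with a global LOH are dropped from both the numerator and the denominator.

```r
# Toy coverage per arm and LOH segments (made-up values)
arm.length <- c("1p" = 120e6, "5q" = 130e6, "17q" = 55e6)
loh <- data.frame(arm   = c("1p", "5q", "17q"),
                  start = c(1, 1, 1),
                  end   = c(30e6, 125e6, 11e6))
# 5q has 125 of 130 Mb in LOH (>= 90%), so it is a global LOH arm
arms.loh <- "5q"

keep.segs <- !(loh$arm %in% arms.loh)
keep.arms <- !(names(arm.length) %in% arms.loh)
# Assumption: excluded arms are dropped from numerator and denominator
loh.bases <- sum(loh$end[keep.segs] - loh$start[keep.segs] + 1)
gloh <- loh.bases / sum(arm.length[keep.arms])
gloh   # (30 Mb + 11 Mb) / (120 Mb + 55 Mb)
```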
Example
First we need to load and clean the ChAS export file (from a female
patient). We adjust the Oncoscan coverage to exclude the 21p arm as it
is only partially covered.
# Load data
chas.fn <- system.file("extdata", "LST_gene_list_full_location.txt",
package = "oncoscanR")
segments <- load_chas(chas.fn, oncoscan_na33.cov)
# Clean the segments: restricted to Oncoscan coverage, LOH not overlapping
# with copy loss segments, smooth&merge segments within 300kb and prune
# segments smaller than 300kb.
segs.clean <- trim_to_coverage(segments, oncoscan_na33.cov) %>%
  adjust_loh() %>%
  merge_segments() %>%
  prune_by_size()
# Then we need to compute the arm-level alteration for loss and LOH since many
# scores discard arms that are globally altered.
arms.loss <- names(get_loss_segments(segs.clean) %>%
                     armlevel_alt(kit.coverage = oncoscan_na33.cov))
arms.loh <- names(get_loh_segments(segs.clean) %>%
                    armlevel_alt(kit.coverage = oncoscan_na33.cov))
# Get the number of LST
lst <- score_lst(segs.clean, oncoscan_na33.cov)
# Get the number of HR-LOH
hrloh <- score_loh(segs.clean, arms.loh, arms.loss, oncoscan_na33.cov)
# Get the genomic LOH score
gloh <- score_gloh(segs.clean, arms.loh, arms.loss, oncoscan_na33.cov)
# Get the number of nLST
wgd <- score_estwgd(segs.clean, oncoscan_na33.cov) # Get the avg CN, including 21p
nlst <- score_nlst(segs.clean, wgd["WGD"], oncoscan_na33.cov)
print(c(LST=lst, `HR-LOH`=hrloh, gLOH=gloh, nLST=nlst))
#> LST HR-LOH gLOH nLST.nLST
#> "26" "25" "0.411605161891022" "22.5"
#> nLST.HRD
#> "Positive"
Main workflow (as used at the Geneva University Hospitals)
The main workflow used for routine analysis can be launched either in
R via the workflow_oncoscan.chas(chas.fn) function or via the script
bin/run_oncoscan_workflow.R:
Usage:
Rscript path_to_oncoscanR_package/bin/oncoscan-workflow.R CHAS_FILE
- CHAS_FILE: Path to the text export file from ChAS or a compatible
text file.
The script will output a JSON string into the terminal with all the
computed information:
{
  "armlevel": {
    "AMP": [],
    "LOSS": ["17p", "2q", "4p"],
    "LOH": ["14q", "5q", "8p", "8q"],
    "GAIN": ["19p", "19q", "1q", "20p", "20q", "3q", "5p", "6p", "9p", "9q",
      "Xp", "Xq"]
  },
  "scores": {
    "HRD": "Negative, nLST=12",
    "TDplus": 22,
    "avgCN": "2.43"
  },
  "file": "H19001012_gene_list_full_location.txt"
}
Or to launch the workflow within R:
segs.filename <- system.file('extdata', 'LST_gene_list_full_location.txt',
package = 'oncoscanR')
dat <- workflow_oncoscan.chas(segs.filename)
message(paste('Arms with copy loss:',
paste(dat$armlevel$LOSS, collapse = ', ')))
#> Arms with copy loss: 15q
message(paste('Arms with copy gains:',
paste(dat$armlevel$GAIN, collapse = ', ')))
#> Arms with copy gains: 11q, 12p, 12q, 16p, 1q, 20p, 20q, 21q, 2p, 4p, 5p, 6p, 6q, 7p, 7q, 8q, 9q
message(paste('HRD score:', dat$scores$HRD))
#> HRD score: Positive, nLST=22.5
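The HRD entry is returned as a single string that combines the status and the nLST value. If the two parts are needed separately, they can be split with base R. This is a small sketch that assumes the string always follows the "Status, nLST=value" pattern seen in the outputs above.

```r
# Strings in the format shown in the outputs above
hrd <- c("Positive, nLST=22.5", "Negative, nLST=12")

# Status is the text before the comma, nLST the number after "nLST="
status <- sub(",.*$", "", hrd)
nlst   <- as.numeric(sub("^.*nLST=", "", hrd))

data.frame(status = status, nLST = nlst)
```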
A similar function is available for running the workflow from an
ASCAT result file: workflow_oncoscan.ascat.
library(jsonlite)
segs.filename <- system.file('extdata', 'ascat_example.txt',
package = 'oncoscanR')
dat <- workflow_oncoscan.ascat(segs.filename)
toJSON(dat, auto_unbox=TRUE, pretty=TRUE)
#> {
#> "armlevel": {
#> "AMP": [],
#> "LOSS": [],
#> "LOH": [],
#> "GAIN": ["1q", "5p"]
#> },
#> "scores": {
#> "HRD": "Negative, nLST=0",
#> "TDplus": 0,
#> "avgCN": "2.05"
#> },
#> "file": "ascat_example.txt"
#> }
Please read the manual for a description of all available R
functions.