Title: | Secondary analyses of CNV data (HRD and more) |
---|---|
Description: | The software uses the copy number segments from a text file and identifies all chromosome arms that are globally altered and computes various genome-wide scores. The following HRD scores (characteristic of BRCA-mutated cancers) are included: LST, HR-LOH, nLST and gLOH. the package is tailored for the ThermoFisher Oncoscan assay analyzed with their Chromosome Alteration Suite (ChAS) but can be adapted to any input. |
Authors: | Yann Christinat [aut, cre], Geneva University Hospitals [aut, cph] |
Maintainer: | Yann Christinat <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.9.0 |
Built: | 2024-10-30 09:10:03 UTC |
Source: | https://github.com/bioc/oncoscanR |
Trim LOH segments with respect to loss segments.
adjust_loh(segments)
adjust_loh(segments)
segments |
A |
LOH segments completely contained within (or equal to) a copy loss segment are deleted. LOH segments partially overlapping (on one end only) with a copy loss segment are trimmed to remove the overlap or split into several segments.
A GRanges
object containing the cleaned segments, their copy
number and copy number types.
segs.adj <- adjust_loh(segs.chas_example)
segs.adj <- adjust_loh(segs.chas_example)
Get all globally-altered chromosome arms.
armlevel_alt(segments, kit.coverage, threshold = 0.9)
armlevel_alt(segments, kit.coverage, threshold = 0.9)
segments |
A |
kit.coverage |
A |
threshold |
The minimum percentage of the arm to be considered as globally altered. Defaults to 80%. |
By default uses the sum of all alterations and set the arm as
globally altered if \>80% of the arm is altered. Does not account for
alteration type and copy number.
Will run the function trim_to_coverage
on the segments.
A list of globally-altered chromosome arms with the percentage of arm altered.
arms <- armlevel_alt(segs.chas_example, oncoscan_na33.cov, 0.9)
arms <- armlevel_alt(segs.chas_example, oncoscan_na33.cov, 0.9)
Accepted types of CN for the segments - 'Gain': 1-2 extra copies - 'Weak amplification': 3-7 extra copies - 'Strong amplification': 8 or more extra copies - 'Heterozygote loss': Loss of one copy out of two - 'Homozygote loss': Loss of all copies - 'LOH': copy-neutral loss of one parental allele
cntypes
cntypes
A list of strings containing all CN types
cntypes <- list(LOH='LOH', Gain='Gain', Loss='Loss')
Return all segments with an amplification (5 or more copies).
get_amp_segments(segments)
get_amp_segments(segments)
segments |
A |
A GRanges
object containing the selected segments, their copy
number and copy number types.
segs.amp <- get_amp_segments(segs.chas_example)
segs.amp <- get_amp_segments(segs.chas_example)
Return all segments with gain of copies.
get_gain_segments(segments)
get_gain_segments(segments)
segments |
A |
A GRanges
object containing the selected segments, their copy
number and copy number types.
segs.gain <- get_gain_segments(segs.chas_example)
segs.gain <- get_gain_segments(segs.chas_example)
Return all segments with heterozygous loss.
get_hetloss_segments(segments)
get_hetloss_segments(segments)
segments |
A |
A GRanges
object containing the selected segments, their copy
number and copy number types.
segs.hetloss <- get_hetloss_segments(segs.chas_example)
segs.hetloss <- get_hetloss_segments(segs.chas_example)
Return all segments with homozygous loss.
get_homloss_segments(segments)
get_homloss_segments(segments)
segments |
A |
A GRanges
object containing the selected segments, their copy
number and copy number types.
get_homloss_segments <- get_homloss_segments(segs.chas_example)
get_homloss_segments <- get_homloss_segments(segs.chas_example)
Return all segments of type LOH, independently of the copy number.
get_loh_segments(segments)
get_loh_segments(segments)
segments |
A |
A GRanges
object containing the selected segments, their copy
number and copy number types.
segs.loh <- get_loh_segments(segs.chas_example)
segs.loh <- get_loh_segments(segs.chas_example)
Return all segments with loss of 1 or 2 copies.
get_loss_segments(segments)
get_loss_segments(segments)
segments |
A |
A GRanges
object containing the selected segments, their copy
number and copy number types.
segs.loh <- get_loh_segments(segs.chas_example)
segs.loh <- get_loh_segments(segs.chas_example)
Load the oncoscan coverage BED file into a GenomicRanges object.
get_oncoscan_coverage_from_bed(filename)
get_oncoscan_coverage_from_bed(filename)
filename |
Path to the coverage BED file. |
Expects the following columns from the BED file (no header): 1. Name of the chromosomal arm (e.g. "1p") 2. Start position of the arm 3. End position of the arm
A GRanges
object containing the regions covered on each
chromosome arm.
oncoscan_na33.cov <- get_oncoscan_coverage_from_bed( system.file('extdata', 'Oncoscan.na33.r2.cov.processed.bed', package = 'oncoscanR'))
oncoscan_na33.cov <- get_oncoscan_coverage_from_bed( system.file('extdata', 'Oncoscan.na33.r2.cov.processed.bed', package = 'oncoscanR'))
Load am ASCAT text export file.
load_ascat(filename, kit.coverage)
load_ascat(filename, kit.coverage)
filename |
Path to the ASCAT file. |
kit.coverage |
A |
The ASCAT file is expected to have the following column names: 'chr' (chromosome number), 'startpos' (first position of CNV segment), 'endpos' (last position of CNV segment), 'nMajor' (Number of copies of the major allele) and 'nMinor' (Number of copies of the minor allele).
The segments are attributed to each chromosome arm and split if necessary.
A GRanges
object containing the segments, their copy number
(field cn
), their copy
number types (field cntype
). cntype
contains either 'Gain',
'Loss' or 'LOH'.
If the file contains twice the same segment or does not respect the format
specifications, then an error is raised.
NB. If the chromosome name is in the format '1' and not 'chr1' and will
be transformed if needed.
segs.filename <- system.file('extdata', 'ascat_example.txt', package = 'oncoscanR') segs.ascat_example <- load_ascat(segs.filename, oncoscan_na33.cov)
segs.filename <- system.file('extdata', 'ascat_example.txt', package = 'oncoscanR') segs.ascat_example <- load_ascat(segs.filename, oncoscan_na33.cov)
Load a ChAS text export file.
load_chas(filename, kit.coverage)
load_chas(filename, kit.coverage)
filename |
Path to the ChAS file. |
kit.coverage |
A |
The ChAS file is expected to have the following column names: 'CN State' (number or empty), 'Type' (expected value: 'Gain', 'Loss' or 'LOH') and 'Full Location' (in the format 'chr:start-end').
The segments are attributed to each chromosome arm and split if necessary.
A GRanges
object containing the segments, their copy number
(field cn
), their copy
number types (field cntype
). cntype
contains either 'Gain',
'Loss' or 'LOH'.
If the file contains twice the same segment or does not respect the format
specifications, then an error is
raised. NB. The chromosome name is in the format '1' and not 'chr1' and will
be transformed if needed.
segs.filename <- system.file('extdata', 'chas_example.txt', package = 'oncoscanR') segs.chas_example <- load_chas(segs.filename, oncoscan_na33.cov)
segs.filename <- system.file('extdata', 'chas_example.txt', package = 'oncoscanR') segs.chas_example <- load_chas(segs.filename, oncoscan_na33.cov)
Merge segments with respect to the kit resolution and the copy number.
merge_segments(segments, kit.resolution = 300)
merge_segments(segments, kit.resolution = 300)
segments |
A |
kit.resolution |
Number >0 indicating the minimum segment size detectable by the technique (in kilobases). Defaults to the Oncoscan assay resolution outside of cancer genes: 300Kb. |
If two segments are at a distance smaller than the resolution, then
the segments are merged if the
share the same cn
value. Note that the function does not look at the
copy number type or subtype but
only at the actual copy number to decide whether segments can be merged.
A GRanges
object containing the cleaned segments, their copy
number and copy number types.
segs.merged <- merge_segments(segs.chas_example) segs.merged_50k <- merge_segments(segs.chas_example, 50)
segs.merged <- merge_segments(segs.chas_example) segs.merged_50k <- merge_segments(segs.chas_example, 50)
GenomicRanges object of the chromosomal arms coverage for the oncoscan assay (based on file extdata/Oncoscan.na33.r2.cov.processed.bed).
oncoscan_na33.cov
oncoscan_na33.cov
A GRanges
object containing the region covered on each
chromosome arm.
oncoscan_na33.cov <- get_oncoscan_coverage_from_bed(
system.file('extdata', 'Oncoscan.na33.r2.cov.processed.bed',
package = 'oncoscanR'))
Remove segments smaller than the kit resolution.
prune_by_size(segments, threshold = 300)
prune_by_size(segments, threshold = 300)
segments |
A |
threshold |
Number indicating the minimum segment size to be kept (in kilobases). Defaults to the Oncoscan assay resolution outside of cancer genes: 300Kb. |
A GRanges
object containing the cleaned segments, their copy
number and copy number types.
segs.300k <- prune_by_size(segs.chas_example) segs.50k <- prune_by_size(segs.chas_example, 50)
segs.300k <- prune_by_size(segs.chas_example) segs.50k <- prune_by_size(segs.chas_example, 50)
Compute the average copy number variation across the genome.
score_avgcn(segments, kit.coverage)
score_avgcn(segments, kit.coverage)
segments |
A |
kit.coverage |
A |
Compute the weighted average (by segment length) of the copy number variation. LOH segments and sexual chromosomes are excluded. Copy number variation is rounded to the next level (1.67 -> 1 but 2.33 -> 3).
A decimal value
score_avgcn(segs.chas_example, oncoscan_na33.cov)
score_avgcn(segs.chas_example, oncoscan_na33.cov)
Estimates the number of whole-genome doubling events (WGD).
score_estwgd(segments, kit.coverage)
score_estwgd(segments, kit.coverage)
segments |
A |
kit.coverage |
A |
Based on the publication from Carter et al. (Nature Biotechnology
2012; PubMed ID: 22544022).
On a pan-cancer cohort, they observed that tumors that underwent one
whole-genome doubling event had a ploidy (average copy number) between 2.2
and 3.4. This function relies on the function score_avgcn
to compute
the ploidy.
A named list with two values: WGD (whole-genome doubling events) and avgCN (the average copy number). WGD values are 0 for no WGD event, 1 for one WGD event, 2 for several WGD events.
score_estwgd(segs.chas_example, oncoscan_na33.cov)
score_estwgd(segs.chas_example, oncoscan_na33.cov)
Compute the genomic LOH score.
score_gloh(segments, arms.loh, arms.hetloss, kit.coverage)
score_gloh(segments, arms.loh, arms.hetloss, kit.coverage)
segments |
A |
arms.loh |
A list of arms with global/arm-level LOH alteration. |
arms.hetloss |
A list of arms with global/arm-level heterozygous loss. |
kit.coverage |
A |
The percentage genomic LOH score is computed as described in the
FoundationFocus CDx BRCA LOH assay; i.e. the percentage of bases covered by
the Oncoscan that display a loss of heterozygosity independently of the
number of copies, excluding chromosomal arms that have a global LOH (>=90
arm length).
To compute with the armlevel_alt
function on LOH segments only).
This score was linked to BRCA1/2-deficient tumors.
An integer representing the percentage of LOH bases.
armlevel.loh <- armlevel_alt(get_loh_segments(segs.chas_example), kit.coverage = oncoscan_na33.cov) armlevel.hetloss <- armlevel_alt(get_hetloss_segments(segs.chas_example), kit.coverage = oncoscan_na33.cov) score_gloh(segs.chas_example, names(armlevel.loh), names(armlevel.hetloss), oncoscan_na33.cov)
armlevel.loh <- armlevel_alt(get_loh_segments(segs.chas_example), kit.coverage = oncoscan_na33.cov) armlevel.hetloss <- armlevel_alt(get_hetloss_segments(segs.chas_example), kit.coverage = oncoscan_na33.cov) score_gloh(segs.chas_example, names(armlevel.loh), names(armlevel.hetloss), oncoscan_na33.cov)
Compute the number HR deficiency-associated LOH regions.
score_loh(segments, arms.loh, arms.hetloss, kit.coverage)
score_loh(segments, arms.loh, arms.hetloss, kit.coverage)
segments |
A |
arms.loh |
A list of arms with global/arm-level LOH alteration. |
arms.hetloss |
A list of arms with global/arm-level heterozygous losses. |
kit.coverage |
A |
Procedure based on the paper from Abkevich et al., Br J Cancer 2012
(PMID: 23047548). All LOH segments larger than 15Mb but excluding chromosome
with a global LOH alteration (to compute with the armlevel_alt
function on LOH segments only). This score was linked to BRCA1/2-deficient
tumors.
Note that the function will merge overlapping or neighbor LOH segments (at a
distance of 1bp).
An integer representing the number of HRD-LOH regions.
armlevel.loh <- armlevel_alt(get_loh_segments(segs.chas_example), kit.coverage = oncoscan_na33.cov) armlevel.hetloss <- armlevel_alt(get_hetloss_segments(segs.chas_example), kit.coverage = oncoscan_na33.cov) score_loh(segs.chas_example, names(armlevel.loh), names(armlevel.hetloss), oncoscan_na33.cov)
armlevel.loh <- armlevel_alt(get_loh_segments(segs.chas_example), kit.coverage = oncoscan_na33.cov) armlevel.hetloss <- armlevel_alt(get_hetloss_segments(segs.chas_example), kit.coverage = oncoscan_na33.cov) score_loh(segs.chas_example, names(armlevel.loh), names(armlevel.hetloss), oncoscan_na33.cov)
Compute the number of Large-scale State Transitions (LSTs).
score_lst(segments, kit.coverage)
score_lst(segments, kit.coverage)
segments |
A |
kit.coverage |
A |
Procedure based on the paper from Popova et al, Can. Res. 2012 (PMID: 22933060). First segments smaller than 3Mb are removed, then segments are smoothed with respect to copy number at a distance of 3Mb. The number of LSTs is the number of breakpoints (breakpoints closer than 3Mb are merged) that have a segment larger or equal to 10Mb on each side. This score was linked to BRCA1/2-deficient tumors.
An integer representing the number of LSTs.
score_lst(segs.chas_example, oncoscan_na33.cov)
score_lst(segs.chas_example, oncoscan_na33.cov)
Computes the total number of Mbp altered.
score_mbalt(segments, kit.coverage, loh.rm = TRUE)
score_mbalt(segments, kit.coverage, loh.rm = TRUE)
segments |
A |
kit.coverage |
A |
loh.rm |
A boolean (TRUE by default) to indicate whether LOH segments should be excluded. |
A named list representing the Mbp altered in the sample and the total Mbp of the kit.
score_mbalt(segs.chas_example, oncoscan_na33.cov) score_mbalt(segs.chas_example, oncoscan_na33.cov, FALSE)
score_mbalt(segs.chas_example, oncoscan_na33.cov) score_mbalt(segs.chas_example, oncoscan_na33.cov, FALSE)
Compute the number of LSTs, normalized by the number of WGD events.
score_nlst(segments, n.wgd, kit.coverage, threshold = 15)
score_nlst(segments, n.wgd, kit.coverage, threshold = 15)
segments |
A |
n.wgd |
Number of whole-genome doubling events (0 if diploid). |
kit.coverage |
A |
threshold |
A number above which the test is returned positive (>=). |
Compute the number of LSTs in non-LOH segments via the
score_lst
function and subtract the extra noise induced by WGD events:
nLST = LST - 7*W/2 where W is the number of WGD events.
A sample is HRD positive (deficient in HR pathway) if nLST is greater or
equal to the threshold (15 by default).
This score was linked to BRCA1/2-deficient tumors.
A named list with the number of nLSTs and the corresponding label ('Positive', 'Negative').
w <- score_estwgd(segs.chas_example, oncoscan_na33.cov) score_nlst(segs.chas_example, w['WGD'], oncoscan_na33.cov)
w <- score_estwgd(segs.chas_example, oncoscan_na33.cov) score_nlst(segs.chas_example, w['WGD'], oncoscan_na33.cov)
Compute the number of large tandem duplication (TDplus).
score_td(segments)
score_td(segments)
segments |
A |
Procedure based on the paper from Popova et al., Cancer Res 2016 (PMID: 26787835). The TDplus score is defined as the number of regions larger than 1Mb but smaller or equal to 10Mb with a gain of one or two copies. This score was linked to CDK12-deficient tumors. They also identified as second category of tandem duplication whose size is smaller or equal than 1Mb and around 300Kb but could not link it to a phenotype. Note that due to its resolution the Oncoscan assay will most likely miss this second category. Nonetheless it is reported by the function.
A list of integer containing the TDplus score ('TDplus'
) and
the small TD score ('TD'
).
score_td(segs.chas_example)
score_td(segs.chas_example)
Expected segments from loading the ChAS file 'chas_example.txt'.
segs.chas_example
segs.chas_example
A GRanges
object containing the segments, their copy number
(field cn
) and their copy number types (field cn.type
).
segs.filename <- system.file('extdata', 'chas_example.txt', package = 'oncoscanR') mykit.cov <- get_oncoscan_coverage_from_probes() segs.chas_example <- load_chas(segs.filename, kit.coverage = mykit.cov)
Trim segments with respect to the kit's coverage.
trim_to_coverage(segments, kit.coverage)
trim_to_coverage(segments, kit.coverage)
segments |
A |
kit.coverage |
A |
All segments that are not entirely contained within the kit coverage will be trimmed to the coverage's limits.
A GRanges
object containing the cleaned segments, their copy
number and copy number types.
segs.trimmed <- trim_to_coverage(segs.chas_example, oncoscan_na33.cov)
segs.trimmed <- trim_to_coverage(segs.chas_example, oncoscan_na33.cov)
Run the standard workflow for ASCAT files (from oncoscan data).
workflow_oncoscan.ascat(ascat.fn)
workflow_oncoscan.ascat(ascat.fn)
ascat.fn |
Path to the text-export ASCAT file |
Identifies the globally altered arms (\>=90% of arm altered),
computes the HRD and TD+ scores. The amplification is defined as a CN>=5.
An arm is gained if of CN type cntype.gain
unless the arm is
amplified.
A list of lists with the following elements:
armlevel = list(AMP= list of arms, GAIN= list of arms, LOSS= list of
arms, LOH= list of arms),
scores = list(LST= number, LOH= number, TDplus= number, TD= number),
file = path of the ChAS file as given by the parameter)
segs.filename <- system.file('extdata', 'ascat_example.txt', package = 'oncoscanR') workflow_oncoscan.ascat(segs.filename)
segs.filename <- system.file('extdata', 'ascat_example.txt', package = 'oncoscanR') workflow_oncoscan.ascat(segs.filename)
Run the standard workflow for Oncoscan ChAS files.
workflow_oncoscan.chas(chas.fn)
workflow_oncoscan.chas(chas.fn)
chas.fn |
Path to the text-export ChAS file |
Identifies the globally altered arms (\>=90% of arm altered),
computes the HRD and TD+ scores. The amplification is defined as a CN subtype
cntype.weakamp
or cntype.strongamp
. An arm is gained if of CN
type cntype.gain
unless the arm is amplified.
A list of lists with the following elements:
armlevel = list(AMP= list of arms, GAIN= list of arms, LOSS= list of
arms, LOH= list of arms),
scores = list(LST= number, LOH= number, TDplus= number, TD= number),
file = path of the ChAS file as given by the parameter)
segs.filename <- system.file('extdata', 'chas_example.txt', package = 'oncoscanR') workflow_oncoscan.chas(segs.filename)
segs.filename <- system.file('extdata', 'chas_example.txt', package = 'oncoscanR') workflow_oncoscan.chas(segs.filename)