Title: | Find breakpoints in Strand-seq data |
---|---|
Description: | This package implements functions for finding breakpoints, plotting and export of Strand-seq data. |
Authors: | David Porubsky, Ashley Sanders, Aaron Taudt |
Maintainer: | David Porubsky <[email protected]> |
License: | file LICENSE |
Version: | 1.25.0 |
Built: | 2024-12-27 03:59:58 UTC |
Source: | https://github.com/bioc/breakpointR |
This package implements functions for finding breakpoints, plotting and export of Strand-seq data.
The main function of this package is breakpointr
and produces several plots and browser files. If you want to have more fine-grained control over the different steps check the vignette How to use breakpointR.
David Porubsky, Ashley Sanders, Aaron Taudt
The BreakPoint
object is output of the function runBreakpointr
and is basically a list with various entries. The class() attribute of this list was set to "BreakPoint". Entries can be accessed with the list operators '[[]]' and '$'.
fragments |
A |
deltas |
A |
breaks |
A |
counts |
A |
params |
A vector with parameters that were used to obtain the results. |
runBreakpointr
This function is an easy-to-use wrapper to find breakpoints with runBreakpointr
in parallel, write the results to file, plot results and find hotspots.
breakpointr( inputfolder, outputfolder, configfile = NULL, numCPU = 1, reuse.existing.files = FALSE, windowsize = 1e+06, binMethod = "size", multi.sizes = NULL, pairedEndReads = FALSE, pair2frgm = FALSE, chromosomes = NULL, min.mapq = 10, filtAlt = FALSE, genoT = "fisher", trim = 10, peakTh = 0.33, zlim = 3.291, background = 0.05, minReads = 10, maskRegions = NULL, callHotSpots = FALSE, conf = 0.99 )
breakpointr( inputfolder, outputfolder, configfile = NULL, numCPU = 1, reuse.existing.files = FALSE, windowsize = 1e+06, binMethod = "size", multi.sizes = NULL, pairedEndReads = FALSE, pair2frgm = FALSE, chromosomes = NULL, min.mapq = 10, filtAlt = FALSE, genoT = "fisher", trim = 10, peakTh = 0.33, zlim = 3.291, background = 0.05, minReads = 10, maskRegions = NULL, callHotSpots = FALSE, conf = 0.99 )
inputfolder |
Folder with BAM files. |
outputfolder |
Folder to output the results. If it does not exist it will be created. |
configfile |
A file specifying the parameters of this function (without |
numCPU |
The numbers of CPUs that are used. Should not be more than available on your machine. |
reuse.existing.files |
A logical indicating whether or not existing files in |
windowsize |
The window size used to calculate deltaWs, either number of reads or genomic size depending on |
binMethod |
Method used to calculate optimal number of reads in the window ("size", "reads"). By default |
multi.sizes |
User defined multiplications of the original window size. |
pairedEndReads |
Set to |
pair2frgm |
Set to |
chromosomes |
If only a subset of the chromosomes should be binned, specify them here. |
min.mapq |
Minimum mapping quality when importing from BAM files. |
filtAlt |
Set to |
genoT |
A method ('fisher' or 'binom') to genotype regions defined by a set of breakpoints. |
trim |
The amount of outliers in deltaWs removed to calculate the stdev (10 will remove top 10% and bottom 10% of deltaWs). |
peakTh |
The treshold that the peak deltaWs must pass to be considered a breakpoint (e.g. 0.33 is 1/3 of max(deltaW)). |
zlim |
The number of stdev that the deltaW must pass the peakTh (ensures only significantly higher peaks are considered). |
background |
The percent (e.g. 0.05 = 5%) of background reads allowed for WW or CC genotype calls. |
minReads |
The minimal number of reads between two breaks required for genotyping. |
maskRegions |
List of regions to be excluded from the analysis (tab-separated file: chromosomes start end). |
callHotSpots |
Search for regions of high abundance of breakpoints in single cells. |
conf |
Desired confidence interval of localized breakpoints. |
NULL
David Porubsky, Aaron Taudt, Ashley Sanders
## Not run: ## The following call produces plots and genome browser files for all BAM files in "my-data-folder" breakpointr(inputfolder="my-data-folder", outputfolder="my-output-folder") ## End(Not run)
## Not run: ## The following call produces plots and genome browser files for all BAM files in "my-data-folder" breakpointr(inputfolder="my-data-folder", outputfolder="my-output-folder") ## End(Not run)
Write a bedfile or bedgraph from a breakpointR object for upload on to the UCSC Genome browser.
breakpointr2UCSC( index, outputDirectory, fragments = NULL, deltaWs = NULL, breakTrack = NULL, confidenceIntervals = NULL, breaksGraph = NULL )
breakpointr2UCSC( index, outputDirectory, fragments = NULL, deltaWs = NULL, breakTrack = NULL, confidenceIntervals = NULL, breaksGraph = NULL )
index |
A character used to name the bedfile(s). |
outputDirectory |
Location to write bedfile(s). |
fragments |
A |
deltaWs |
A |
breakTrack |
A |
confidenceIntervals |
A |
breaksGraph |
A |
NULL
Ashley Sanders, David Porubsky, Aaron Taudt
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file brkpts <- get(load(exampleFile)) ## Write results to BED files breakpointr2UCSC(index='testfile', outputDirectory=tempdir(), breakTrack=brkpts$breaks)
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file brkpts <- get(load(exampleFile)) ## Write results to BED files breakpointr2UCSC(index='testfile', outputDirectory=tempdir(), breakTrack=brkpts$breaks)
Find breakpoints from deltaWs by localizing significant peaks based on z-score calculation.
breakSeekr(deltaWs, trim = 10, peakTh = 0.33, zlim = 3.291)
breakSeekr(deltaWs, trim = 10, peakTh = 0.33, zlim = 3.291)
deltaWs |
A |
trim |
The amount of outliers in deltaWs removed to calculate the stdev (10 will remove top 10% and bottom 10% of deltaWs). |
peakTh |
The treshold that the peak deltaWs must pass to be considered a breakpoint (e.g. 0.33 is 1/3 of max(deltaW)). |
zlim |
The number of stdev that the deltaW must pass the peakTh (ensures only significantly higher peaks are considered). |
A GRanges-class
object containing breakpoint coordinates with various metadata columns.
David Porubsky, Aaron Taudt, Ashley Sanders
## Get an example file exampleFolder <- system.file("extdata", "example_bams", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file fragments <- readBamFileAsGRanges(exampleFile, pairedEndReads=FALSE, chromosomes='chr22') ## Calculate deltaW values dw <- deltaWCalculator(fragments) ## Get significant peaks in deltaW values breaks <- breakSeekr(dw)
## Get an example file exampleFolder <- system.file("extdata", "example_bams", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file fragments <- readBamFileAsGRanges(exampleFile, pairedEndReads=FALSE, chromosomes='chr22') ## Calculate deltaW values dw <- deltaWCalculator(fragments) ## Get significant peaks in deltaW values breaks <- breakSeekr(dw)
Collapse consecutive bins with the same value defined in 'id.field'.
collapseBins(gr, id.field = 3)
collapseBins(gr, id.field = 3)
gr |
A |
id.field |
A number of metadata column to use for region merging. |
A GRanges-class
object.
Estimate confidence intervals for breakpoints by going outwards from the breakpoint read by read, and multiplying the probability that the read doesn't belong to the assigned segment.
confidenceInterval(breaks, fragments, background = 0.05, conf = 0.99)
confidenceInterval(breaks, fragments, background = 0.05, conf = 0.99)
breaks |
Genotyped breakpoints as outputted from function |
fragments |
Read fragments from function |
background |
The percent (e.g. 0.05 = 5%) of background reads allowed for WW or CC genotype calls. |
conf |
Desired confidence interval of localized breakpoints. |
A GRanges-class
object of breakpoint ranges for a given confidence interval in conf
.
Aaron Taudt, David Porubsky
## Not run: ## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file breakpoint.objects <- get(load(exampleFile)) ## Calculate confidence intervals of genotyped breakpoints confint <- confidenceInterval(breaks=breakpoint.objects$breaks, fragments=breakpoint.objects$fragments, background=0.02) ## End(Not run)
## Not run: ## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file breakpoint.objects <- get(load(exampleFile)) ## Calculate confidence intervals of genotyped breakpoints confint <- confidenceInterval(breaks=breakpoint.objects$breaks, fragments=breakpoint.objects$fragments, background=0.02) ## End(Not run)
Estimate confidence intervals for breakpoints by going outwards from the breakpoint read by read, and performing a binomial test of getting the observed or a more extreme outcome, given that the reads within the confidence interval belong to the other side of the breakpoint.
confidenceInterval.binomial(breaks, fragments, background = 0.02, conf = 0.99)
confidenceInterval.binomial(breaks, fragments, background = 0.02, conf = 0.99)
breaks |
Genotyped breakpoints as outputted from function |
fragments |
Read fragments from function |
background |
The percent (e.g. 0.05 = 5%) of background reads allowed for WW or CC genotype calls. |
conf |
Desired confidence interval of localized breakpoints. |
A GRanges-class
object of breakpoint ranges for a given confidence interval in conf
.
Aaron Taudt, David Porubsky
## Not run: ## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file breakpoint.objects <- get(load(exampleFile)) ## Calculate confidence intervals of genotyped breakpoints confint <- confidenceInterval.binomial(breakpoint.objects$breaks, breakpoint.objects$fragments, background=0.02) ## End(Not run)
## Not run: ## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file breakpoint.objects <- get(load(exampleFile)) ## Calculate confidence intervals of genotyped breakpoints confint <- confidenceInterval.binomial(breakpoint.objects$breaks, breakpoint.objects$fragments, background=0.02) ## End(Not run)
This function will move through BAM files in a folder, read in each individual file and go through each chromosome, determine if the chromosome is WW or CC based on WCcutoff, reverse complement all reads in the WW file, append to a new composite file for that chromosome, order the composite file of each chromosome based on position.
createCompositeFile( file.list, chromosomes = NULL, pairedEndReads = TRUE, pair2frgm = FALSE, min.mapq = 10, filtAlt = FALSE, WC.cutoff = 0.9, genoT = "fisher", background = 0.05 )
createCompositeFile( file.list, chromosomes = NULL, pairedEndReads = TRUE, pair2frgm = FALSE, min.mapq = 10, filtAlt = FALSE, WC.cutoff = 0.9, genoT = "fisher", background = 0.05 )
file.list |
A list of BAM files to process. |
chromosomes |
If only a subset of the chromosomes should be binned, specify them here. |
pairedEndReads |
Set to |
pair2frgm |
Set to |
min.mapq |
Minimum mapping quality when importing from BAM files. |
filtAlt |
Set to |
WC.cutoff |
Percentage of WW or CC reads to consider chromosome being WW or CC |
genoT |
A method ('fisher' or 'binom') to genotype regions defined by a set of breakpoints. |
background |
The percent (e.g. 0.05 = 5%) of background reads allowed for WW or CC genotype calls. |
A GRanges-class
object.
Ashley Sanders, David Porubsky
This function will calculate deltaWs from a GRanges-class
object with read fragments.
deltaWCalculator(frags, reads.per.window = 100)
deltaWCalculator(frags, reads.per.window = 100)
frags |
A |
reads.per.window |
Number of reads in each dynamic window. |
The input frags
with additional meta-data columns.
Aaron Taudt
readBamFileAsGRanges
## Get an example file exampleFolder <- system.file("extdata", "example_bams", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file fragments <- readBamFileAsGRanges(exampleFile, pairedEndReads=FALSE, chromosomes='chr22') ## Calculate deltaW values dw <- deltaWCalculator(fragments)
## Get an example file exampleFolder <- system.file("extdata", "example_bams", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file fragments <- readBamFileAsGRanges(exampleFile, pairedEndReads=FALSE, chromosomes='chr22') ## Calculate deltaW values dw <- deltaWCalculator(fragments)
This function will calculate deltaWs from a GRanges-class
object with read fragments.
deltaWCalculatorVariousWindows( frags, reads.per.window = 100, multi.sizes = c(2, 4, 6) )
deltaWCalculatorVariousWindows( frags, reads.per.window = 100, multi.sizes = c(2, 4, 6) )
frags |
A |
reads.per.window |
Number of reads in each dynamic window. |
multi.sizes |
User defined multiplications of the original window size. |
The input frags
with additional meta-data columns.
David Porubsky
deltaWCalculator
Function to print WC regions after breakpointR analysis
exportRegions( datapath, file = NULL, collapseInversions = FALSE, collapseRegionSize = 5e+06, minRegionSize = 5e+06, state = "wc" )
exportRegions( datapath, file = NULL, collapseInversions = FALSE, collapseRegionSize = 5e+06, minRegionSize = 5e+06, state = "wc" )
datapath |
A path to that |
file |
A filename to print exported regions to. |
collapseInversions |
Set to |
collapseRegionSize |
Upper range of what sized regions should be collapsed. |
minRegionSize |
Minimal size of the region to be reported. |
state |
A genotype of the regions to be exported ('ww', 'cc' or 'wc'). |
A data.frame
object containing all regions with user defined 'state'.
David Porubsky
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") ## To export regions genotyped as 'wc' wc.regions <- exportRegions(datapath=exampleFolder, collapseInversions=FALSE, minRegionSize=5000000, state='wc')
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") ## To export regions genotyped as 'wc' wc.regions <- exportRegions(datapath=exampleFolder, collapseInversions=FALSE, minRegionSize=5000000, state='wc')
Assign states to any given region using binomial test.
genotype.binom(wReads, cReads, background = 0.05, minReads = 10, log = FALSE)
genotype.binom(wReads, cReads, background = 0.05, minReads = 10, log = FALSE)
wReads |
Number of Watson reads. |
cReads |
Number of Crick reads. |
background |
The percent (e.g. 0.05 = 5%) of background reads allowed for WW or CC genotype calls. |
minReads |
The minimal number of reads between two breaks required for genotyping. |
log |
Set to |
A list
with the $bestFit and $pval.
David Porubsky
## Get Crick and Watson read counts ## Crick read count cReads <- 30 ## Watson read count wReads <- 5 genotype.binom(cReads = cReads, wReads = wReads, background = 0.05, minReads = 10, log = TRUE)
## Get Crick and Watson read counts ## Crick read count cReads <- 30 ## Watson read count wReads <- 5 genotype.binom(cReads = cReads, wReads = wReads, background = 0.05, minReads = 10, log = TRUE)
Assign states to any given region using Fisher Exact Test.
genotype.fisher(cReads, wReads, roiReads, background = 0.05, minReads = 10)
genotype.fisher(cReads, wReads, roiReads, background = 0.05, minReads = 10)
cReads |
Number of Crick reads. |
wReads |
Number of Watson reads. |
roiReads |
Total number of Crick and Watson reads. |
background |
The percent (e.g. 0.05 = 5%) of background reads allowed for WW or CC genotype calls. |
minReads |
The minimal number of reads between two breaks required for genotyping. |
A list
with the $bestFit and $pval.
David Porubsky, Aaron Taudt
## Get Crick and Watson read counts ## Crick read count cReads <- 30 ## Watson read count wReads <- 5 genotype.fisher(cReads = cReads, wReads = wReads, roiReads = cReads + wReads, background = 0.05, minReads = 10)
## Get Crick and Watson read counts ## Crick read count cReads <- 30 ## Watson read count wReads <- 5 genotype.fisher(cReads = cReads, wReads = wReads, roiReads = cReads + wReads, background = 0.05, minReads = 10)
Each defined region is given one of the three states ('ww', 'cc' or 'wc') Consecutive regions with the same state are collapsed
GenotypeBreaks( breaks, fragments, background = 0.05, minReads = 10, genoT = "fisher" )
GenotypeBreaks( breaks, fragments, background = 0.05, minReads = 10, genoT = "fisher" )
breaks |
A |
fragments |
A |
background |
The percent (e.g. 0.05 = 5%) of background reads allowed for WW or CC genotype calls. |
minReads |
The minimal number of reads between two breaks required for genotyping. |
genoT |
A method ('fisher' or 'binom') to genotype regions defined by a set of breakpoints. |
Function GenotypeBreaks
exports states of each region defined by breakpoints.
Function genotype.fisher
assigns states to each region based on expected counts of Watson and Crick reads.
Function genotype.binom
assigns states to each region based on expected counts of Watson and Crick reads.
A GRanges-class
object with genotyped breakpoint coordinates.
GenotypeBreaks()
: Genotypes breakpoint defined regions.
David Porubsky, Ashley Sanders, Aaron Taudt
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file breakpoint.objects <- get(load(exampleFile)) ## Genotype regions between breakpoints gbreaks <- GenotypeBreaks(breaks=breakpoint.objects$breaks, fragments=breakpoint.objects$fragments)
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file breakpoint.objects <- get(load(exampleFile)) ## Genotype regions between breakpoints gbreaks <- GenotypeBreaks(breaks=breakpoint.objects$breaks, fragments=breakpoint.objects$fragments)
Find hotspots of genomic events by using kernel density estimation.
hotspotter(gr.list, bw, pval = 1e-08)
hotspotter(gr.list, bw, pval = 1e-08)
gr.list |
A list or |
bw |
Bandwidth used for kernel density estimation (see |
pval |
P-value cutoff for hotspots. |
The hotspotter uses density
to perform a KDE. A p-value is calculated by comparing the density profile of the genomic events with the density profile of a randomly subsampled set of genomic events. Due to this random sampling, the result can vary for each function call, most likely for hotspots whose p-value is close to the specified pval
.
A GRanges-class
object containing coordinates of hotspots with p-values.
Aaron Taudt
## Get example BreakPoint objects exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFiles <- list.files(exampleFolder, full.names=TRUE) breakpoint.objects <- loadFromFiles(exampleFiles) ## Extract breakpoint coordinates breaks <- lapply(breakpoint.objects, '[[', 'breaks') ## Get hotspot coordinates hotspots <- hotspotter(gr.list=breaks, bw=1e6)
## Get example BreakPoint objects exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFiles <- list.files(exampleFolder, full.names=TRUE) breakpoint.objects <- loadFromFiles(exampleFiles) ## Extract breakpoint coordinates breaks <- lapply(breakpoint.objects, '[[', 'breaks') ## Get hotspot coordinates hotspots <- hotspotter(gr.list=breaks, bw=1e6)
Add two columns with transformed genomic coordinates to the GRanges-class
object. This is useful for making genomewide plots.
insertchr(gr)
insertchr(gr)
gr |
A |
The input GRanges-class
object with an additional metadata column containing chromosome name with 'chr'.
Wrapper to load breakpointR objects from file and check the class of the loaded objects.
loadFromFiles(files, check.class = c("GRanges", "BreakPoint"))
loadFromFiles(files, check.class = c("GRanges", "BreakPoint"))
files |
A list of |
check.class |
Any combination of |
A list of GRanges-class
or BreakPoint
objects.
## Get some files that you want to load exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFiles <- list.files(exampleFolder, full.names=TRUE) ## Load the processed data breakpoint.objects <- loadFromFiles(exampleFiles)
## Get some files that you want to load exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFiles <- list.files(exampleFolder, full.names=TRUE) ## Load the processed data breakpoint.objects <- loadFromFiles(exampleFiles)
This function will create genome-wide ideograms from a BreakPoint
object.
plotBreakpoints(files2plot, file = NULL)
plotBreakpoints(files2plot, file = NULL)
files2plot |
A list of files that contains |
file |
Name of the file to plot to. |
A list with ggplot
objects.
David Porubsky, Aaron Taudt, Ashley Sanders
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Plot the file plotBreakpoints(files2plot=exampleFile)
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Plot the file plotBreakpoints(files2plot=exampleFile)
This function will create chromsome specific enome-wide ideograms from a BreakPoint
object.
plotBreakpointsPerChr(files2plot, plotspath = NULL, chromosomes = NULL)
plotBreakpointsPerChr(files2plot, plotspath = NULL, chromosomes = NULL)
files2plot |
A list of files that contains |
plotspath |
Directory to store plots. |
chromosomes |
Set specific chromosome(s) to be plotted. |
A list with ggplot
objects.
David Porubsky
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFiles <- list.files(exampleFolder, full.names=TRUE) ## Plot results plotBreakpointsPerChr(exampleFiles, chromosomes='chr7')
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFiles <- list.files(exampleFolder, full.names=TRUE) ## Plot results plotBreakpointsPerChr(exampleFiles, chromosomes='chr7')
Plot a genome-wide heatmap of template inheritance states from a BreakPoint
object.
plotHeatmap(files2plot, file = NULL, hotspots = NULL)
plotHeatmap(files2plot, file = NULL, hotspots = NULL)
files2plot |
A list of files that contains |
file |
Name of the file to plot to. |
hotspots |
A |
A ggplot
object.
David Porubsky, Aaron Taudt, Ashley Sanders
## Get example BreakPoint objects to plot exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFiles <- list.files(exampleFolder, full.names=TRUE) breakpoint.objects <- loadFromFiles(exampleFiles) ## Plot the heatmap plotHeatmap(breakpoint.objects)
## Get example BreakPoint objects to plot exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFiles <- list.files(exampleFolder, full.names=TRUE) breakpoint.objects <- loadFromFiles(exampleFiles) ## Plot the heatmap plotHeatmap(breakpoint.objects)
Write a bedfile from Breakpoint.R files for upload on to UCSC Genome browser
ranges2UCSC(gr, outputDirectory = ".", index = "bedFile", colorRGB = "0,0,0")
ranges2UCSC(gr, outputDirectory = ".", index = "bedFile", colorRGB = "0,0,0")
gr |
A |
outputDirectory |
Location to write bedfile(s). |
index |
A character used to name the bedfile(s). |
colorRGB |
An RGB color to be used for submitted ranges. |
NULL
David Porubsky
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file counts <- get(load(exampleFile))[['counts']] ## Export 'wc' states into a UCSC formated file ranges2UCSC(gr=counts[counts$states == 'wc'], index='testfile', outputDirectory=tempdir())
## Get an example file exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file counts <- get(load(exampleFile))[['counts']] ## Export 'wc' states into a UCSC formated file ranges2UCSC(gr=counts[counts$states == 'wc'], index='testfile', outputDirectory=tempdir())
Import aligned reads from a BAM file into a GRanges-class
object.
readBamFileAsGRanges( file, bamindex = file, chromosomes = NULL, pairedEndReads = FALSE, min.mapq = 10, remove.duplicate.reads = TRUE, pair2frgm = FALSE, filtAlt = FALSE )
readBamFileAsGRanges( file, bamindex = file, chromosomes = NULL, pairedEndReads = FALSE, min.mapq = 10, remove.duplicate.reads = TRUE, pair2frgm = FALSE, filtAlt = FALSE )
file |
Bamfile with aligned reads. |
bamindex |
Bam-index file with or without the .bai ending. If this file does not exist it will be created and a warning is issued. |
chromosomes |
If only a subset of the chromosomes should be binned, specify them here. |
pairedEndReads |
Set to |
min.mapq |
Minimum mapping quality when importing from BAM files. |
remove.duplicate.reads |
A logical indicating whether or not duplicate reads should be kept. |
pair2frgm |
Set to |
filtAlt |
Set to |
A GRanges-class
object.
David Porubsky, Aaron Taudt, Ashley Sanders
## Get an example file exampleFolder <- system.file("extdata", "example_bams", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file fragments <- readBamFileAsGRanges(exampleFile, pairedEndReads=FALSE, chromosomes='chr22')
## Get an example file exampleFolder <- system.file("extdata", "example_bams", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Load the file fragments <- readBamFileAsGRanges(exampleFile, pairedEndReads=FALSE, chromosomes='chr22')
Read an breakpointR configuration file into a list structure. The configuration file has to be specified in INI format. R expressions can be used and will be evaluated.
readConfig(configfile)
readConfig(configfile)
configfile |
Path to the configuration file |
A list
with one entry for each element in configfile
.
Aaron Taudt
This function will take from a double SCE chromosome only WW or CC region (Longer region is taken).
removeDoubleSCEs(gr, collapseWidth = 5e+06)
removeDoubleSCEs(gr, collapseWidth = 5e+06)
gr |
A |
collapseWidth |
A segment size to be collapsed with neighbouring segments. |
The input GRanges-class
object with only WW or CC region retained.
This function takes a GRanges-class
object of aligned short reads
and removes pockets of reads that are stacked on top of each other based on the
maximum number of reads allowed to pileup in 'max.pileup' parameter.
removeReadPileupSpikes(gr = NULL, max.pileup = 30)
removeReadPileupSpikes(gr = NULL, max.pileup = 30)
gr |
A |
max.pileup |
A maximum number of reads overlapping each other to be kept. |
A GRanges-class
object.
David Porubsky
## Get some files that you want to load exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") infile <- list.files(exampleFolder, full.names=TRUE)[1] ## Read in the reads breakP.obj <- get(load(infile)) frags <- breakP.obj$fragments ## Remove read spikes frags <- removeReadPileupSpikes(gr=frags)
## Get some files that you want to load exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") infile <- list.files(exampleFolder, full.names=TRUE)[1] ## Read in the reads breakP.obj <- get(load(infile)) frags <- breakP.obj$fragments ## Remove read spikes frags <- removeReadPileupSpikes(gr=frags)
Find breakpoints in Strand-seq data. See section Details on how breakpoints are located.
runBreakpointr( bamfile, ID = basename(bamfile), pairedEndReads = TRUE, chromosomes = NULL, windowsize = 1e+06, binMethod = "size", multi.sizes = NULL, trim = 10, peakTh = 0.33, zlim = 3.291, background = 0.05, min.mapq = 10, pair2frgm = FALSE, filtAlt = FALSE, genoT = "fisher", minReads = 20, maskRegions = NULL, conf = 0.99 )
runBreakpointr( bamfile, ID = basename(bamfile), pairedEndReads = TRUE, chromosomes = NULL, windowsize = 1e+06, binMethod = "size", multi.sizes = NULL, trim = 10, peakTh = 0.33, zlim = 3.291, background = 0.05, min.mapq = 10, pair2frgm = FALSE, filtAlt = FALSE, genoT = "fisher", minReads = 20, maskRegions = NULL, conf = 0.99 )
bamfile |
A file with aligned reads in BAM format. |
ID |
A character string that will serve as identifier in downstream functions. |
pairedEndReads |
Set to |
chromosomes |
If only a subset of the chromosomes should be binned, specify them here. |
windowsize |
The window size used to calculate deltaWs, either number of reads or genomic size depending on |
binMethod |
Method used to calculate optimal number of reads in the window ("size", "reads"). By default |
multi.sizes |
User defined multiplications of the original window size. |
trim |
The amount of outliers in deltaWs removed to calculate the stdev (10 will remove top 10% and bottom 10% of deltaWs). |
peakTh |
The treshold that the peak deltaWs must pass to be considered a breakpoint (e.g. 0.33 is 1/3 of max(deltaW)). |
zlim |
The number of stdev that the deltaW must pass the peakTh (ensures only significantly higher peaks are considered). |
background |
The percent (e.g. 0.05 = 5%) of background reads allowed for WW or CC genotype calls. |
min.mapq |
Minimum mapping quality when importing from BAM files. |
pair2frgm |
Set to |
filtAlt |
Set to |
genoT |
A method ('fisher' or 'binom') to genotype regions defined by a set of breakpoints. |
minReads |
The minimal number of reads between two breaks required for genotyping. |
maskRegions |
List of regions to be excluded from the analysis in |
conf |
Desired confidence interval of localized breakpoints. |
Breakpoints are located in the following way:
calculate deltaWs chromosome-by-chromsome
localize breaks that pass zlim above the threshold
genotype both sides of breaks to confirm whether strand state changes
write a file of _reads, _deltaWs and _breaks in a chr fold -> can upload on to UCSC Genome browser
write a file for each index with all chromosomes included -> can upload on to UCSC Genome browser
A BreakPoint
object.
David Porubsky, Ashley Sanders, Aaron Taudt
## Get an example file exampleFolder <- system.file("extdata", "example_bams", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Run breakpointR brkpts <- runBreakpointr(exampleFile, chromosomes='chr22', pairedEndReads=FALSE)
## Get an example file exampleFolder <- system.file("extdata", "example_bams", package="breakpointRdata") exampleFile <- list.files(exampleFolder, full.names=TRUE)[1] ## Run breakpointR brkpts <- runBreakpointr(exampleFile, chromosomes='chr22', pairedEndReads=FALSE)
This function will calculate deltaWs from a GRanges-class
object with read fragments.
summarizeBreaks(breakpoints)
summarizeBreaks(breakpoints)
breakpoints |
A list containing breakpoints stored in |
A data.frame
of compiled breakpoints together with confidence intervals.
David Porubsky
## Get some files that you want to load exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") file <- list.files(exampleFolder, full.names=TRUE)[1] breakpoints <- get(load(file))[c('breaks', 'confint')] summarizeBreaks(breakpoints)
## Get some files that you want to load exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") file <- list.files(exampleFolder, full.names=TRUE)[1] breakpoints <- get(load(file))[c('breaks', 'confint')] summarizeBreaks(breakpoints)
This function aims to synchronize strand directionality of reads that fall into WW and CC regions.
synchronizeReadDir(files2sync, collapseWidth = 5e+06)
synchronizeReadDir(files2sync, collapseWidth = 5e+06)
files2sync |
A list of files that contains |
collapseWidth |
A segment size to be collapsed with neighbouring segments. |
A GRanges-class
object that reads synchronized by directionality.
David Porubsky
## Get some files that you want to load exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") files2sync <- list.files(exampleFolder, full.names=TRUE)[1] synchronizeReadDir(files2sync=files2sync)
## Get some files that you want to load exampleFolder <- system.file("extdata", "example_results", package="breakpointRdata") files2sync <- list.files(exampleFolder, full.names=TRUE)[1] synchronizeReadDir(files2sync=files2sync)
Add two columns with transformed genomic coordinates to the GRanges-class
object. This is useful for making genomewide plots.
transCoord(gr)
transCoord(gr)
gr |
A |
The input GRanges-class
with two additional metadata columns 'start.genome' and 'end.genome'.
Write an breakpointR configuration file from a list structure.
writeConfig(config, configfile)
writeConfig(config, configfile)
config |
A list structure with parameter values. Each entry will be written in one line. |
configfile |
Filename of the outputfile. |
NULL
Aaron Taudt