Title: | Prediction of intronic splicing branchpoints |
---|---|
Description: | Predicts branchpoint probability for sites in intronic branchpoint windows. Queries can be supplied as intronic regions; or to evaluate the effects of mutations, SNPs. |
Authors: | Beth Signal |
Maintainer: | Beth Signal <[email protected]> |
License: | BSD_3_clause + file LICENSE |
Version: | 1.33.0 |
Built: | 2024-11-29 04:27:48 UTC |
Source: | https://github.com/bioc/branchpointer |
Converts a GTF annotation to exon locations
gtfToExons(gtf)
gtfToExons(gtf)
gtf |
file containing the gtf annotation. |
exon annotation GRanges
Beth Signal
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons)
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons)
Generate branchpoint window regions corresponding to annotated exon(s) within a queried gene, transcript or exon id
makeBranchpointWindowForExons(id, idType, exons, forceClosestExon = FALSE)
makeBranchpointWindowForExons(id, idType, exons, forceClosestExon = FALSE)
id |
identifier(s) for the query gene/transcript/exon id |
idType |
type of id to match in the exon annotation file ( |
exons |
GRanges containing exon co-ordinates. |
forceClosestExon |
Force branchpointer to find the closest exon and not the exon annotated as 5' to the query |
Granges with formatted query
Beth Signal
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf",package = "branchpointer") exons <- gtfToExons(smallExons) windowquery <- makeBranchpointWindowForExons("ENSG00000139618.14", "gene_id", exons) windowquery <- makeBranchpointWindowForExons("ENST00000357654.7", "transcript_id", exons) windowquery <- makeBranchpointWindowForExons("ENSE00003518965.1", "exon_id", exons)
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf",package = "branchpointer") exons <- gtfToExons(smallExons) windowquery <- makeBranchpointWindowForExons("ENSG00000139618.14", "gene_id", exons) windowquery <- makeBranchpointWindowForExons("ENST00000357654.7", "transcript_id", exons) windowquery <- makeBranchpointWindowForExons("ENSE00003518965.1", "exon_id", exons)
Searches Biomart for refsnp ids, and pulls genomic location and sequence identity information Reformats alleles so each query has only one alternative allele
makeBranchpointWindowForSNP(refSNP, mart.snp, exons, maxDist = 50, filter = TRUE)
makeBranchpointWindowForSNP(refSNP, mart.snp, exons, maxDist = 50, filter = TRUE)
refSNP |
Vector of refsnp ids |
mart.snp |
biomaRt mart object specifying the BioMart database and dataset to be used |
exons |
GRanges containing exon co-ordinates. Should be produced by gtfToExons() |
maxDist |
maximum distance a SNP can be from an annotated 3' exon. |
filter |
remove SNP queries prior to finding finding nearest exons? |
formatted SNP query GRanges
Beth Signal
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf",package = "branchpointer") exons <- gtfToExons(smallExons) mart.snp <- biomaRt::useMart("ENSEMBL_MART_SNP", dataset="hsapiens_snp", host="www.ensembl.org") query <- makeBranchpointWindowForSNP("rs587776767", mart.snp, exons)
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf",package = "branchpointer") exons <- gtfToExons(smallExons) mart.snp <- biomaRt::useMart("ENSEMBL_MART_SNP", dataset="hsapiens_snp", host="www.ensembl.org") query <- makeBranchpointWindowForSNP("rs587776767", mart.snp, exons)
Plots branchpointer predictions
plotBranchpointWindow(queryName, predictions, probabilityCutoff = 0.52, plotMutated = FALSE, plotStructure = TRUE, exons)
plotBranchpointWindow(queryName, predictions, probabilityCutoff = 0.52, plotMutated = FALSE, plotStructure = TRUE, exons)
queryName |
query id used to identify the SNP or region |
predictions |
Granges object generated by predictBranchpoints() |
probabilityCutoff |
probability score cutoff value for displaying U2 binding energy |
plotMutated |
plot alternative sequence predicitons alongside reference sequence predictions |
plotStructure |
plot structures for gene and 3' exon containing and skipping isoforms |
exons |
Granges containing exon co-ordinates. Should be produced by gtfToExons() |
ggplot2 plot with branchpoint features in the specified intronic region
Beth Signal
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons) g <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38 querySNPFile <- system.file("extdata","SNP_example.txt", package = "branchpointer") querySNP <- readQueryFile(querySNPFile,queryType = "SNP",exons = exons, filter = FALSE) predictionsSNP <- predictBranchpoints(querySNP,queryType = "SNP",BSgenome = g) plotBranchpointWindow(querySNP$id[1], predictionsSNP, plotMutated = TRUE, exons = exons)
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons) g <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38 querySNPFile <- system.file("extdata","SNP_example.txt", package = "branchpointer") querySNP <- readQueryFile(querySNPFile,queryType = "SNP",exons = exons, filter = FALSE) predictionsSNP <- predictBranchpoints(querySNP,queryType = "SNP",BSgenome = g) plotBranchpointWindow(querySNP$id[1], predictionsSNP, plotMutated = TRUE, exons = exons)
predicts branchpoint probability scores for each query site.
predictBranchpoints(query, uniqueId = "test", queryType, workingDirectory = ".", genome = NA, bedtoolsLocation = NA, BSgenome = NULL, useParallel = FALSE, cores = 1, rmChr = FALSE)
predictBranchpoints(query, uniqueId = "test", queryType, workingDirectory = ".", genome = NA, bedtoolsLocation = NA, BSgenome = NULL, useParallel = FALSE, cores = 1, rmChr = FALSE)
query |
branchpointer query GenomicRanges |
uniqueId |
unique string identifier for intermediate .bed and .fa files. |
queryType |
type of branchpointer query. "SNP" or "region". |
workingDirectory |
directory where intermediate .bed and .fa are located |
genome |
.fa genome file location |
bedtoolsLocation |
bedtools binary location (which bedtools) |
BSgenome |
BSgenome object |
useParallel |
use parallelisation to speed up code? |
cores |
number of cores to use in parallelisation (default = |
rmChr |
remove "chr" before chromosome names before writing bed file. Required if genome sequence names do not contain "chr" |
GenomicRanges object with branchpoint probaility scores for each site in query
Beth Signal
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons) g <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38 querySNPFile <- system.file("extdata","SNP_example.txt", package = "branchpointer") querySNP <- readQueryFile(querySNPFile,queryType = "SNP",exons = exons, filter = FALSE) predictionsSNP <- predictBranchpoints(querySNP,queryType = "SNP",BSgenome = g)
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons) g <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38 querySNPFile <- system.file("extdata","SNP_example.txt", package = "branchpointer") querySNP <- readQueryFile(querySNPFile,queryType = "SNP",exons = exons, filter = FALSE) predictionsSNP <- predictBranchpoints(querySNP,queryType = "SNP",BSgenome = g)
Takes predictions of branchpoint probabilities from a reference and alternative SNP and summarises the effect(s) of the SNP.
predictionsToSummary(query, predictions, probabilityCutoff = 0.52, probabilityChange = 0.15)
predictionsToSummary(query, predictions, probabilityCutoff = 0.52, probabilityChange = 0.15)
query |
query GRanges containing all SNP ids to be summarised |
predictions |
site-wide branchpoint proability predictions produced from predictBranchpoints() |
probabilityCutoff |
Value to be used as the cutoff for
discriminating branchpoint sites from non-branchpoint sites
(default = |
probabilityChange |
Minimum probability score change
required to call a branchpoint site as deleted or created by
a SNP (default = |
GRanges with summarised branchpoint changes occuring within the intron due to a SNP.
Beth Signal
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons) g <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38 querySNPFile <- system.file("extdata","SNP_example.txt", package = "branchpointer") querySNP <- readQueryFile(querySNPFile,queryType = "SNP",exons = exons, filter = FALSE) predictionsSNP <- predictBranchpoints(querySNP,queryType = "SNP",BSgenome = g) summarySNP <- predictionsToSummary(querySNP,predictionsSNP)
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons) g <- BSgenome.Hsapiens.UCSC.hg38::BSgenome.Hsapiens.UCSC.hg38 querySNPFile <- system.file("extdata","SNP_example.txt", package = "branchpointer") querySNP <- readQueryFile(querySNPFile,queryType = "SNP",exons = exons, filter = FALSE) predictionsSNP <- predictBranchpoints(querySNP,queryType = "SNP",BSgenome = g) summarySNP <- predictionsToSummary(querySNP,predictionsSNP)
Reads and formats a manually generated query file, and finds realtive locations of the closest annotated exons Converts unstranded SNPs to two entries for each strand. Checks for duplicate names and replaces if found.
readQueryFile(queryFile, queryType, exons, maxDist = 50, filter = TRUE)
readQueryFile(queryFile, queryType, exons, maxDist = 50, filter = TRUE)
queryFile |
tab delimited file containing query information. For intronic regions should be in the format: region id, chromosome name, region start, region end, strand. For SNP variants should be in the format: SNP id, chromosome name, SNP position, strand, reference allele (A/T/C/G), alternative allele (A/T/C/G) |
queryType |
type of query file ( |
exons |
GRanges containing exon co-ordinates. Should be produced by gtfToExons() |
maxDist |
maximum distance a SNP can be from an annotated 3' exon. |
filter |
remove SNP queries prior to finding finding nearest exons. |
Formatted query GRanges
Beth Signal
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons) querySNPFile <- system.file("extdata","SNP_example.txt", package = "branchpointer") querySNP <- readQueryFile(querySNPFile, queryType = "SNP", exons) queryIntronFile <- system.file("extdata","intron_example.txt", package = "branchpointer") queryIntron <- readQueryFile(queryIntronFile,queryType = "region", exons)
smallExons <- system.file("extdata","gencode.v26.annotation.small.gtf", package = "branchpointer") exons <- gtfToExons(smallExons) querySNPFile <- system.file("extdata","SNP_example.txt", package = "branchpointer") querySNP <- readQueryFile(querySNPFile, queryType = "SNP", exons) queryIntronFile <- system.file("extdata","intron_example.txt", package = "branchpointer") queryIntron <- readQueryFile(queryIntronFile,queryType = "region", exons)