Package 'VaSP' reference manual

Title:	Quantification and Visualization of Variations of Splicing in Population
Description:	Discovery of genome-wide variable alternative splicing events from short-read RNA-seq data and visualizations of gene splicing information for publication-quality multi-panel figures in a population. (Warning: The visualizing function is removed due to the dependent package Sushi deprecated. If you want to use it, please change back to an older version.)
Authors:	Huihui Yu [aut, cre] , Qian Du [aut] , Chi Zhang [aut]
Maintainer:	Huihui Yu <[email protected]>
License:	GPL (>= 2.0)
Version:	1.19.0
Built:	2025-04-01 06:04:55 UTC
Source:	https://github.com/bioc/VaSP

Discover bimodal distrubition features

Description

Find bimodal distrubition features and divide the samples into 2 groups by k-means clustering.

Usage

BMfinder(x, p.value = 0.01, maf = 0.05, miss = 0.05, fold = 2, log = FALSE, 
        cores = detectCores() - 1)
BMfinder(x, p.value = 0.01, maf = 0.05, miss = 0.05, fold = 2, log = FALSE, 
        cores = detectCores() - 1)

Arguments

`x`	a numeric matrix with feature rows and sample columns, e.g., splicing score matrix from `spliceGenome` or `spliceGene` function.
`p.value`	p.value threshold for bimodal distrubition test
`maf`	minor allele frequency threshold in k-means clustering
`miss`	missing grouping rate threshold in k-means clustering
`fold`	fold change threshold between the two groups
`log`	whether the scores are to be logarithmic. If TRUE, all the scores are log2 tranformed before k-means clustering: x = log2(x+1).
`cores`	threads to be used. This value is passed to ?mclapply in parallel package

Details

The matrix contains 1, 2 and NA, and values of 'x' in group 2 are larger than group 1.

Value

a matrix with feature rows and sample columns.

Examples

data(rice.bg)
score<-spliceGene(rice.bg,'MSTRG.183',junc.type='score')
score<-round(score,2)
as<-BMfinder(score,cores=1) # 4 bimodal distrubition features found

##compare
as
score[rownames(score)%in%rownames(as),]
data(rice.bg)
score<-spliceGene(rice.bg,'MSTRG.183',junc.type='score')
score<-round(score,2)
as<-BMfinder(score,cores=1) # 4 bimodal distrubition features found

##compare
as
score[rownames(score)%in%rownames(as),]

Get Read Depth

Description

Get read depth from a BAM file (in bedgraph format)

Usage

getDepth(x, chrom, start, end)
getDepth(x, chrom, start, end)

Arguments

`x`	path to a BAM file
`chrom`	chromosome of a region to be searched
`start`	start position
`end`	end position

Value

a data.frame in bedgraph file format.

Examples

path <- system.file('extdata',package='VaSP')
bam_files<-list.files(path,'bam$')
bam_files

depth<-getDepth(file.path(path, bam_files[1]), 'Chr1',
                start=1171800, end=1179400)
head(depth)

# library(Sushi)
# plotBedgraph(depth,'Chr1',chromstart=1171800, chromend=1179400,yaxt='s')
# mtext('Depth',side=2,line=2.5,cex=1.2,font=2)
# labelgenome('Chr1',1171800,1179400,side=1,scipen=20,n=5,scale='Kb')
path <- system.file('extdata',package='VaSP')
bam_files<-list.files(path,'bam$')
bam_files

depth<-getDepth(file.path(path, bam_files[1]), 'Chr1',
                start=1171800, end=1179400)
head(depth)

# library(Sushi)
# plotBedgraph(depth,'Chr1',chromstart=1171800, chromend=1179400,yaxt='s')
# mtext('Depth',side=2,line=2.5,cex=1.2,font=2)
# labelgenome('Chr1',1171800,1179400,side=1,scipen=20,n=5,scale='Kb')

Get Gene Informaton from a ballgown object

Description

Get gene informaton from a ballgown object by genes or by genomic regions

Usage

getGeneinfo(genes = NA, bg, chrom, start, end, samples = sampleNames(bg),
            trans.select = NA)
getGeneinfo(genes = NA, bg, chrom, start, end, samples = sampleNames(bg),
            trans.select = NA)

Arguments

`genes`	a character vector specifying gene IDs in 'bg'. Any values other than NA override genomic region (chrom, start, stop)
`bg`	ballgown object
`chrom`	chromosome of a region
`start`	start postion
`end`	stop postion
`samples`	names of samples. The transcrpts in these samples are subjected to 'trans.select'
`trans.select`	logical expression-like string, indicating transcript rows to select from a matrix of transcript coverages: NA value keeps all transcripts.

Value

a data.frame in bed-like file format

Examples

data(rice.bg)
unique(geneIDs(rice.bg))

gene_id <- c('MSTRG.181', 'MSTRG.182', 'MSTRG.183')
geneinfo <- getGeneinfo(genes=gene_id,rice.bg)
trans <- table(geneinfo$name) # show how many exons each transcript has
trans

# library(Sushi)
# chrom = geneinfo$chrom[1]
# chromstart = min(geneinfo$start) - 1e3
# chromend = max(geneinfo$stop) + 1e3
# color = rep(SushiColors(2)(length(trans)), trans)

# par(mar=c(3,1,1,1))
# plotGenes(geneinfo, chrom, chromstart, chromend, col = color, bheight = 0.2,
#           bentline = FALSE, plotgenetype = 'arrow', labeloffset = 0.5)
# labelgenome(chrom, chromstart , chromend, side = 1, n = 5, scale = 'Kb')
data(rice.bg)
unique(geneIDs(rice.bg))

gene_id <- c('MSTRG.181', 'MSTRG.182', 'MSTRG.183')
geneinfo <- getGeneinfo(genes=gene_id,rice.bg)
trans <- table(geneinfo$name) # show how many exons each transcript has
trans

# library(Sushi)
# chrom = geneinfo$chrom[1]
# chromstart = min(geneinfo$start) - 1e3
# chromend = max(geneinfo$stop) + 1e3
# color = rep(SushiColors(2)(length(trans)), trans)

# par(mar=c(3,1,1,1))
# plotGenes(geneinfo, chrom, chromstart, chromend, col = color, bheight = 0.2,
#           bentline = FALSE, plotgenetype = 'arrow', labeloffset = 0.5)
# labelgenome(chrom, chromstart , chromend, side = 1, n = 5, scale = 'Kb')

Rice ballgown object

Description

Small ballgown object created with a subset of rice RNAseq data, for demonstration purposes

Format

a ballgown object with 33 transcripts and 6 samples

Details

The raw RNA-seq data were screened and trimmed using Trimmomatic (Bolger et al., 2014) and RNA-seq mapping, transcript assembly, and quantification were conducted with HISAT, StringTie, and Ballgown by following the method described by Pertea et al. (Pertea et al., 2016). The rice.bg is a subset ballgown object with 33 transcripts and 6 samples (Yu et al., 2021).

Source

The raw RNA-seq data were from the project of variation in transcriptional responses to salt stress in rice (SRA Accession: SRP106054)

References

Yu, H., Du, Q., Campbell, M., Yu, B., Walia, H. and Zhang, C. (2021), Genome‐wide discovery of natural variation in pre‐mRNA splicing and prioritising causal alternative splicing to salt stress response in rice. New Phytol. https://doi.org/10.1111/nph.17189

Bolger, A.M., Lohse, M., and Usadel, B. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114-2120.

Pertea, M., Kim, D., Pertea, G.M., Leek, J.T., and Salzberg, S.L. (2016). Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc 11, 1650-1667.

Examples

data(rice.bg)
rice.bg
# ballgown instance with 33 transcripts and 6 samples
data(rice.bg)
rice.bg
# ballgown instance with 33 transcripts and 6 samples

Calculate Splicing Scores for One Gene

Description

Calculate splicing Scores from ballgown object for a given gene. This function can only calculate one gene. Please use function spliceGenome to obtain genome-wide splicing scores.

Usage

spliceGene(bg, gene, samples = sampleNames(bg), junc.type = c("score", "count"),
            trans.select = "rowMaxs(x)>=1", junc.select = "rowMaxs(x)>=5")
spliceGene(bg, gene, samples = sampleNames(bg), junc.type = c("score", "count"),
            trans.select = "rowMaxs(x)>=1", junc.select = "rowMaxs(x)>=5")

Arguments

`bg`	ballgown object
`gene`	a character string specifying gene id
`samples`	names of samples
`junc.type`	type of junction estimate ('score' for junction score; 'count' for junction read count)
`trans.select`	logical expression-like string, indicating transcript rows to select from a matrix of transcript coverages: NA value keeps all transcripts. e.g. use trans.select='rowMaxs(x)>=1' to filter the transcrpts with the maximium coverage among all the samples less than 1.
`junc.select`	logical expression-like string, indicating junction rows to select from a matrix of junction counts: NA value keeps all junctions. e.g. use junc.select='rowMaxs(x)>=5' to filter the junctions with the maximium read count among all the samples less than 5.

Details

score = junction count/gene-level per base read coverage. Row functions for matrices are useful to select transcripts and junctions. See matrixStats package.

Value

a matrix of junction scores with intron rows and sample columns.

References

Examples

data(rice.bg)
rice.bg
head(geneIDs(rice.bg))

score<-spliceGene(rice.bg,'MSTRG.183',junc.type='score')
count<-spliceGene(rice.bg,'MSTRG.183',junc.type='count')

## compare
tail(score)
tail(count)

## get intron structrue
intron<-structure(rice.bg)$intron
intron[intron$id%in%rownames(score)]

data(rice.bg)
rice.bg
head(geneIDs(rice.bg))

score<-spliceGene(rice.bg,'MSTRG.183',junc.type='score')
count<-spliceGene(rice.bg,'MSTRG.183',junc.type='count')

## compare
tail(score)
tail(count)

## get intron structrue
intron<-structure(rice.bg)$intron
intron[intron$id%in%rownames(score)]

Calculate Genome-wide Splicing Scores

Description

Calculate splicing scores from ballgown objects for all genes.

Usage

spliceGenome(bg, gene.select = "rowQuantiles(x,probs = 0.05)>=1",
            intron.select = "rowQuantiles(x,probs = 0.95)>=5")
spliceGenome(bg, gene.select = "rowQuantiles(x,probs = 0.05)>=1",
            intron.select = "rowQuantiles(x,probs = 0.95)>=5")

Arguments

`bg`	ballgown object
`gene.select`	logical expression-like string, indicating genes to select from a matrix of gene-level coverages: NA value keeps all genes. e.g. gene.select = 'rowQuantiles(x,probs = 0.05)>=1' keeps the genes with the read coverage greater than or equal to 1 in at least 95 (0.05 quantile). Used to filter low expressed genes.
`intron.select`	logical expression-like string, indicating introns to select from a matrix of junction counts: NA value keeps all introns. e.g. intron.select = 'rowQuantiles(x,probs = 0.95)>=5' keeps the introns with the read count greater than or euqal to 5 in at least 5 (0.95 quantile). Used to filter introns with very few junction reads supporting.

Details

score = junction count/gene-level per base read coverage. Row functions for matrices in matrixStats package are useful to select genes and introns.

Value

a list of two elelments: 'score' is matrix of intron splicing scores with intron rows and sample columns and 'intron' is a GRanges object of intron structure. See structure in ballgown package

References

Examples

data(rice.bg)
rice.bg

splice<-spliceGenome(rice.bg,gene.select=NA,intron.select=NA)
names(splice)

head(splice$score)
splice$intron

data(rice.bg)
rice.bg

splice<-spliceGenome(rice.bg,gene.select=NA,intron.select=NA)
names(splice)

head(splice$score)
splice$intron

Package 'VaSP'

Help Index

Discover bimodal distrubition features

Description

Usage

Arguments

Details

Value

Examples

Get Read Depth

Description

Usage

Arguments

Value

Examples

Get Gene Informaton from a ballgown object

Description

Usage

Arguments

Value

Examples

Rice ballgown object

Description

Format

Details

Source

References

Examples

Calculate Splicing Scores for One Gene

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Calculate Genome-wide Splicing Scores

Description

Usage

Arguments

Details

Value

References

See Also

Examples