Title: | Analysis of Alternative Splicing Using RNA-Seq |
---|---|
Description: | Integrative pipeline for the analysis of alternative splicing using RNAseq. |
Authors: | Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky and Ariel Chernomoretz |
Maintainer: | Ariel Chernomoretz <[email protected]> |
License: | GPL |
Version: | 2.17.0 |
Built: | 2024-11-29 05:13:38 UTC |
Source: | https://github.com/bioc/ASpli |
AspliDU object may contain results of differential expression of genes,
differential usage of bins and junctions, however not everything is
calculated at the same or even present. Calculations for genes and bins can
be done independently from junctions. Functions containsJunctions
and
containsGenesAndBins
allow to interrogate an ASpliDU object about the
kind of results it contain.
containsJunctions( du ) containsGenesAndBins( du )
containsJunctions( du ) containsGenesAndBins( du )
du |
An ASpliDU object. |
A logical value that indicates that results for genes and bins, or results for junctions are available in the object.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# see ASpli package
# see ASpli package
ASpli is an integrative and flexible package that facilitates the characterization of genome-wide changes in AS under different experimental conditions. ASpli analyzes the differential usage of introns, exons, and splice junctions using read counts, and estimates the magnitude of changes in AS by calculating differences in the percentage of exon inclusion or intron retention using splice junctions. This integrative approach allows the identification of changes in both annotated and novel AS events.
ASpli allows users to produce self-explanatory intermediate outputs, based on the aim of their analysis. A typical workflow involves parsing the genome annotation into new features called bins, overlapping read alignments against those bins, and inferring differential bin usage based on the number of reads aligning to the bins and junctions.
Package: | ASpli |
Type: | Package |
Version: | 1.5.1 |
Date: | 2018-02-22 |
License: | GPL |
Depends: | methods, GenomicRanges, GenomicFeatures, edgeR, methods, BiocGenerics, IRanges, GenomicAlignments, Gviz |
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Acute effects of light on alternative splicing in light-grown plants. Photochemistry and Photobiology. Mancini, E, Sanchez, S, Romanowsky, A, Yanovsky, MJ. DOI: 10.1111/php.12550
GEMIN2 attenuates the effects of temperature on alternative splicing and circadian rhythms in Arabidopsis thaliana. Proceedings of the National Academy of Sciences. Schlaen, RG, Mancini, E, Sanchez, SE, Perez-Santangelo, S, Rugnone, ML, Simpson, CG, Brown, JWS, Zhang, X, Chernomoretz, A, Yanovsky, MJ. DOI:10.1073/pnas.1504541112
Genome wide comparative analysis of the effects of PRMT5 and PRMT4/CARM1 arginine methyltransferases on the Arabidopsis thaliana transcriptome. BMC Genomics. Hernando, E, Sanchez, S, Mancini, E, Yanovsky MJ. DOI:10.1186/s12864-015-1399-2
A role for LSM genes in the regulation of circadian rhythms. Proceedings of the National Academy of Sciences. Perez Santangelo, S, Mancini, E, Francey, LJ, Schlaen, RG, Chernomoretz, A, Hogenesch, JB, Yanovsky MJ. DOI: 10.1073/pnas.1409791111
The dengue virus NS5 protein intrudes in the cellular spliceosome and modulates splicing. PLOS Pathogens. De Maio, F., Risso, G., Iglesias, G., Shah, P, Pozzi, B., Gebhard, L., Mammi, L., Mancini, E., Yanovsky, M., Andino, R., Krogan, N., Srebrow, A. and Gamarnik, A. DOI:10.1371/journal.ppat.1005841
library(GenomicFeatures) gtfFileName <- aspliExampleGTF() genomeTxDb <- txdbmaker::makeTxDbFromGFF( gtfFileName ) features <- binGenome( genomeTxDb ) BAMFiles <- aspliExampleBamList() targets <- data.frame( row.names = paste0('Sample',c(1:12)), bam = BAMFiles, f1 = c( 'A','A','A','A','A','A', 'B','B','B','B','B','B'), f2 = c( 'C','C','C','D','D','D', 'C','C','C','D','D','D'), stringsAsFactors = FALSE) getConditions(targets) mBAMs <- data.frame(bam = sub("_[02]","",targets$bam[c(1,4,7,10)]), condition= c("A_C","A_D","B_C","B_D")) gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) asd <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) gb <- gbDUreport(counts=gbcounts, contrast = c( 1, -1, -1, 1 ) ) jdur <- jDUreport(asd, contrast = c( 1, -1, -1, 1 ) , mergedBams = mBAMs) sr <- splicingReport(gb, jdur, counts =gbcounts ) is <- integrateSignals(sr,asd)
library(GenomicFeatures) gtfFileName <- aspliExampleGTF() genomeTxDb <- txdbmaker::makeTxDbFromGFF( gtfFileName ) features <- binGenome( genomeTxDb ) BAMFiles <- aspliExampleBamList() targets <- data.frame( row.names = paste0('Sample',c(1:12)), bam = BAMFiles, f1 = c( 'A','A','A','A','A','A', 'B','B','B','B','B','B'), f2 = c( 'C','C','C','D','D','D', 'C','C','C','D','D','D'), stringsAsFactors = FALSE) getConditions(targets) mBAMs <- data.frame(bam = sub("_[02]","",targets$bam[c(1,4,7,10)]), condition= c("A_C","A_D","B_C","B_D")) gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) asd <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) gb <- gbDUreport(counts=gbcounts, contrast = c( 1, -1, -1, 1 ) ) jdur <- jDUreport(asd, contrast = c( 1, -1, -1, 1 ) , mergedBams = mBAMs) sr <- splicingReport(gb, jdur, counts =gbcounts ) is <- integrateSignals(sr,asd)
Methods to retrieve and set data in ASpliAS object. Setting data into an ASpliAS object is not a typical task and must be done with care, because it can affect the integrity of the object.
altPSI( x ) esPSI( x ) irPIR( x ) joint( x ) junctionsPIR( x ) junctionsPJU( x )
altPSI( x ) esPSI( x ) irPIR( x ) joint( x ) junctionsPIR( x ) junctionsPJU( x )
x |
An ASpliAS object |
Returns dataframes with genomic metadata and PSI and PIR metrics
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Accessing data tables from an ASpliAS object #as <- aspliASexample() #ap <- altPSI(as) #ep <- esPSI(as) #ip <- irPIR(as) #j <- joint(as) #jpi <- junctionsPIR(as) #jps <- junctionsPJU(as) # Setting data tables to an ASpliAS object #as2 <- new( 'ASpliAS' ) #altPSI( as2 ) <- ap #esPSI( as2 ) <- ep #irPIR( as2 ) <- ip #joint( as2 ) <- j #junctionsPIR( as2 ) <- jpi #junctionsPJU( as2 ) <- jps
# Accessing data tables from an ASpliAS object #as <- aspliASexample() #ap <- altPSI(as) #ep <- esPSI(as) #ip <- irPIR(as) #j <- joint(as) #jpi <- junctionsPIR(as) #jps <- junctionsPJU(as) # Setting data tables to an ASpliAS object #as2 <- new( 'ASpliAS' ) #altPSI( as2 ) <- ap #esPSI( as2 ) <- ep #irPIR( as2 ) <- ip #joint( as2 ) <- j #junctionsPIR( as2 ) <- jpi #junctionsPJU( as2 ) <- jps
These functions are provided for compatibility with older versions of ‘ASpli’ only, and will be defunct at the next release.
The following functions are deprecated and will be made defunct; use the replacement indicated below:
loadBAM: gbCounts
readCounts: gbCounts
AsDiscover: jCounts,
splicingReport,
integrateSignals
DUreport: gbDUreport,
jDUreport
DUreportBinSplice: gbDUreport
junctionDUreport: jDUreport
mergeBinDUAS: splicingReport,
integrateSignals
junctionsPSI: junctionsPJU
plotGenomicRegions: exportSplicingReports, exportIntegratedSignals
"ASpliAS"
Results of PSI and PIR using experimental junctions
irPIR
:Reports: event, e1i counts (J1), ie1 counts (J2), j_within (J3), PIR by condition. J1, J2, J3 sum of junctions (J1, J2, J3) by condition.
altPSI
:Reports: event, J1 (start), J2 (end), J3 (exclusion), PSI. J1, J2, J3 sum of junctions (J1, J2, J3) by condition.
esPSI
:Reports: event, J1 (start), J2 (end), J3 (exclusion), PSI. J1, J2, J3 sum of junctions (J1, J2, J3) by condition.
join
:It is a combination of irPIR, altPSI and esPSI tables
junctionsPIR
:PIR metric for each experimental junction using e1i and ie2 counts. Exclusion junction is the junction itself. This output helps to discover new introns as well as new retention events
junctionsPJU
:Given a junction, it is possible to analyze if it shares start, end or both with another junction. If so, is because there is more than one way for/of splicing. Ratio between them along samples is reported.
targets
:DataFrame with targets.
.ASpliVersion
:ASpli version when this object was created. It should not be modified by the user.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Methods: AsDiscover
Accesors: altPSI
,
irPIR
,
esPSI
,
joint
,
junctionsPIR
,
junctionsPJU
"ASpliCounts"
Contains results of read overlaps against all feature levels summarization
gene.counts
exon.intron.counts
junction.counts
e1i.counts
ie2.counts
gene.rd
bin.rd
condition.order
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
"ASpliCounts"
Contains results of read overlaps against all feature levels summarization
gene.counts
:Object of class "data.frame"
exon.intron.counts
:Object of class "data.frame"
junction.counts
:Object of class "data.frame"
e1i.counts
:Object of class "data.frame"
ie2.counts
:Object of class "data.frame"
gene.rd
:Object of class "data.frame"
bin.rd
:Object of class "data.frame"
condition.order
:Object of class "character"
targets
:Object of class "data.frame"
.ASpliVersion
:ASpli version when this object was created. It should not be modified by the user.
psi and pir metrics
bin counts accesor
e1i counts accesor
gene counts accesor
ie2 counts accesor
junction counts accesor
differential expression and usage estimation using DEXSeq
differential expression and usage estimation using DEXSeq
bin read densities accesor
gen read densities acceesor
compute read densities on genes and bins
Export count tables
Export read density tables
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
"ASpliDU"
Contains results of differential expression at gene level and differential usage at bin and junction level estimation using DEreport method.
genes
bins
junctions
contrast
.ASpliVersion
:ASpli version when this object was created. It should not be modified by the user.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
"ASpliFeatures"
Contains Genomic Ranges of different features extracted from a TxDb
genes
:bins
:junctions
:transcriptExons
:Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
"ASpliIntegratedSignals"
Contains results of differential expression at junction level.
signals
filters
.ASpliVersion
:ASpli version when this object was created. It should not be modified by the user.
Andres Rabinovich, Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
"ASpliJDU"
Contains results of differential expression at junction level.
localec
localej
anchorc
anchorj
jir
jes
jalt
contrast
.ASpliVersion
:ASpli version when this object was created. It should not be modified by the user.
Andres Rabinovich, Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
"ASpliSplicingReport"
Contains results of differential expression at junction level.
binbased
localebased
anchorbased
contrast
.ASpliVersion
:ASpli version when this object was created. It should not be modified by the user.
Andres Rabinovich, Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Exons and introns are subdivided into new features called exon and intron bins and are then classified into exclusively exonic bins, exclusively intronic bins or alternative splicing (AS) bins .
binGenome(genome, geneSymbols = NULL, logTo = "ASpli_binFeatures.log", cores = 1)
binGenome(genome, geneSymbols = NULL, logTo = "ASpli_binFeatures.log", cores = 1)
genome |
An object of class transcriptDb (TxDb) |
geneSymbols |
A dataframe with symbol (common names) of TxDb genes. If geneSymbols is NULL, gene name will be repeated |
logTo |
Filename where to print features extraction log |
cores |
Number of cores to use in parallel when binning the genome |
Exon and intron coordinates are extracted from gene annotation, only those from multi-exonic genes are saved for further evaluation. In case more than one isoform exist, some exons and introns will overlap. Exons and introns are then disjoint into new features called exon and intron bins, and then they are classified into exclusively exonic bins, exclusively intronic bind or alternative splicing bins (AS-bins), which are labeled according to which alternative splicing event are assumed to came from:
ES: exon skipping
IR: intron retention
Alt5|3'ss: alternative five/three prime splicing site
"*" (ES*, IR*, AltSS*) means this AS bin/region is involved simultaneously in more than one AS event type
external: from the beginning or the end of a transcript
Subgenic features are labeled as follow (hypothetical GeneAAA):
GeneAAA:E001: defines first exonic bin
GeneAAA:I001: defines first intronic bin
GeneAAA:Io001: defines first intron before disjoint into bins
GeneAAA:J001: defines first junction
Junctions are defined as the last position of five prime exon (donor position) and first position of three prime exon (aceptor position). Using TxDb object, it is possible to extract annotated/known junctions. This information will be useful for the analysis of "experimental" junctions (reads aligned with gaps). Bins and junctions are labelled always in 5' to 3' sense. This notation is strand independent. It implies that bin / junction with lower numbering is always at 5'.
An ASpliFeatures object. It is a list of features using GRanges format.
Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
featuresg
,
featuresb
,
featuresj
# Create a transcript DB from gff/gtf annotation file. library(GenomicFeatures) gtfFileName <- aspliExampleGTF() genomeTxDb <- txdbmaker::makeTxDbFromGFF( gtfFileName ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Extract gene, bin and junctions features GeneCoord <- featuresg(features) BinCoord <- featuresb(features) JunctionCoord <- featuresj(features)
# Create a transcript DB from gff/gtf annotation file. library(GenomicFeatures) gtfFileName <- aspliExampleGTF() genomeTxDb <- txdbmaker::makeTxDbFromGFF( gtfFileName ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Extract gene, bin and junctions features GeneCoord <- featuresg(features) BinCoord <- featuresb(features) JunctionCoord <- featuresj(features)
Feature coordinates extraction from a Transcript Database
signature(genome = "TxDb")
An object of class transcriptDb (TxDb)
Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
featuresg, featuresb , featuresj
Accessors for ASpliCounts object
countsb(x) countse1i(x) countsg(x) countsie2(x) countsj(x) rdsg(x) rdsb(x) condition.order(x) targets(x)
countsb(x) countse1i(x) countsg(x) countsie2(x) countsj(x) rdsg(x) rdsb(x) condition.order(x) targets(x)
x |
An ASpliCounts object |
Returns dataframes with counts by sample and genomic metadata
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Get data tables from an ASpliCounts object #counts <- aspliCountsExample() #cb1 <- countsb(counts) #ce1i <- countse1i(counts) #cg <- countsg(counts) #cie2 <- countsie2(counts) #cj <- countsj(counts) #rg <- rdsg(counts) #rb <- rdsb(counts) #co <- condition.order(counts) #tg <- targets(counts) # Set data tables to an ASpliCounts object #countsb(counts) <- cb1 #countse1i(counts) <- ce1i #countsg(counts) <- cg #countsie2(counts) <- cie2 #countsj(counts) <- cj #rdsg(counts) <- rg #rdsb(counts) <- rb
# Get data tables from an ASpliCounts object #counts <- aspliCountsExample() #cb1 <- countsb(counts) #ce1i <- countse1i(counts) #cg <- countsg(counts) #cie2 <- countsie2(counts) #cj <- countsj(counts) #rg <- rdsg(counts) #rb <- rdsb(counts) #co <- condition.order(counts) #tg <- targets(counts) # Set data tables to an ASpliCounts object #countsb(counts) <- cb1 #countse1i(counts) <- ce1i #countsg(counts) <- cg #countsie2(counts) <- cie2 #countsj(counts) <- cj #rdsg(counts) <- rg #rdsb(counts) <- rb
Accessors for ASpliDU object
genesDE( x ) binsDU( x ) junctionsDU( x )
genesDE( x ) binsDU( x ) junctionsDU( x )
x |
An ASpliDU object |
Returns dataframes with genomic metadata and logFC and pvalue
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Get data tables from an ASpliDU object #du <- aspliDUexample1() #gde <- genesDE( du ) #bdu <- binsDU( du ) #jdu <- junctionsDU( du ) # Set data tables to an ASpliDU object #genesDE( du ) <- gde #binsDU( du ) <- bdu #junctionsDU( du ) <- jdu
# Get data tables from an ASpliDU object #du <- aspliDUexample1() #gde <- genesDE( du ) #bdu <- binsDU( du ) #jdu <- junctionsDU( du ) # Set data tables to an ASpliDU object #genesDE( du ) <- gde #binsDU( du ) <- bdu #junctionsDU( du ) <- jdu
Estimate differential expression at gene level and differential usage at bin level. When targets has only two conditions, and contrast is not set, the estimation of differential expression and usage is done with an exact test, otherwise is estimated using a generalized linear model.
DUreport( counts, targets, minGenReads = 10, minBinReads = 5, minRds = 0.05, offset = FALSE, offsetAggregateMode = c( "geneMode", "binMode" )[1], offsetUseFitGeneX = TRUE, contrast = NULL, forceGLM = FALSE, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = FALSE, verbose = FALSE)
DUreport( counts, targets, minGenReads = 10, minBinReads = 5, minRds = 0.05, offset = FALSE, offsetAggregateMode = c( "geneMode", "binMode" )[1], offsetUseFitGeneX = TRUE, contrast = NULL, forceGLM = FALSE, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = FALSE, verbose = FALSE)
counts |
An object of class ASpliCounts |
targets |
A data.frame containing sample, bam and experimental factor columns. |
minGenReads |
Genes with at least an average of |
minBinReads |
Bins with at least an average of |
minRds |
Genes with at least an average of |
ignoreExternal |
Ignore Exon Bins at the beginning or end of the transcript. Default value is TRUE. |
ignoreIo |
Ignore original introns. Default TRUE |
ignoreI |
Ignore intron bins, test is performed only for exons. Default FALSE |
offset |
Corrects bin expression using an offset matrix derived from gene expression data. Default = FALSE |
offsetAggregateMode |
Choose the method to aggregate gene counts to
create the offset matrix. When |
,
offsetUseFitGeneX |
Default= TRUE |
contrast |
Define the comparison between conditions to be tested.
|
forceGLM |
Force the use of a generalized linear model to estimate differential expression and usage. Default = FALSE |
filterWithContrasted |
A logical value specifying if bins, genes and junction will be filtered by read quantity and read density using data from those conditions that will be used in the comparison, i.e. those which coefficients in contrast argument are different from zero. The default value is FALSE, it is strongly recommended to do not change this value. |
verbose |
A logical value that indicates that detailed information about each step in the analysis will be presented to the user. |
An ASpliDU object with results at genes
, bins
level.
Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
edgeR
, junctionDUreport
Accessors: genesDE
, binsDU
Export: writeDU
#This function has been deprecated and is no longer neded. Please see vignette for new pipeline.
#This function has been deprecated and is no longer neded. Please see vignette for new pipeline.
Estimate differential expression at gene level and differential usage at bin level. When targets has only two conditions, and contrast is not set, the estimation of differential expression and usage is done with an exact test, otherwise is estimated using a generalized linear model.
DUreport.norm( counts, minGenReads = 10, minBinReads = 5, minRds = 0.05, contrast = NULL, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = TRUE, verbose = FALSE, threshold = 5)
DUreport.norm( counts, minGenReads = 10, minBinReads = 5, minRds = 0.05, contrast = NULL, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = TRUE, verbose = FALSE, threshold = 5)
counts |
An object of class ASpliCounts |
minGenReads |
Genes with at least an average of |
minBinReads |
Bins with at least an average of |
minRds |
Genes with at least an average of |
ignoreExternal |
Ignore Exon Bins at the beginning or end of the transcript. Default value is TRUE. |
ignoreIo |
Ignore original introns. Default TRUE |
ignoreI |
Ignore intron bins, test is performed only for exons. Default FALSE |
contrast |
Define the comparison between conditions to be tested.
|
filterWithContrasted |
A logical value specifying if bins, genes and junction will be filtered by read quantity and read density using data from those conditions that will be used in the comparison, i.e. those which coefficients in contrast argument are different from zero. The default value is TRUE, it is strongly recommended to do not change this value. |
verbose |
A logical value that indicates that detailed information about each step in the analysis will be presented to the user. |
threshold |
Default = 5 |
An ASpliDU object with results at genes
, bins
level.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
edgeR
, jDUreport
Accessors: genesDE
, binsDU
Export: writeDU
#check ASpli package examples
#check ASpli package examples
Estimate differential expression at gene level and differential usage at bin level. When targets has only two conditions, and contrast is not set, the estimation of differential expression and usage is done with an exact test, otherwise is estimated using a generalized linear model.
DUreport.offset( counts, minGenReads = 10, minBinReads = 5, minRds = 0.05, offsetAggregateMode = c( "geneMode", "binMode" )[1], offsetUseFitGeneX = TRUE, contrast = NULL, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = TRUE, verbose = FALSE)
DUreport.offset( counts, minGenReads = 10, minBinReads = 5, minRds = 0.05, offsetAggregateMode = c( "geneMode", "binMode" )[1], offsetUseFitGeneX = TRUE, contrast = NULL, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = TRUE, verbose = FALSE)
counts |
An object of class ASpliCounts |
minGenReads |
Genes with at least an average of |
minBinReads |
Bins with at least an average of |
minRds |
Genes with at least an average of |
ignoreExternal |
Ignore Exon Bins at the beginning or end of the transcript. Default value is TRUE. |
ignoreIo |
Ignore original introns. Default TRUE |
ignoreI |
Ignore intron bins, test is performed only for exons. Default FALSE |
offset |
Corrects bin expression using an offset matrix derived from gene expression data. Default = FALSE |
offsetAggregateMode |
Choose the method to aggregate gene counts to
create the offset matrix. When |
,
offsetUseFitGeneX |
Default= TRUE |
contrast |
Define the comparison between conditions to be tested.
|
filterWithContrasted |
A logical value specifying if bins, genes and junction will be filtered by read quantity and read density using data from those conditions that will be used in the comparison, i.e. those which coefficients in contrast argument are different from zero. The default value is TRUE, it is strongly recommended to do not change this value. |
verbose |
A logical value that indicates that detailed information about each step in the analysis will be presented to the user. |
An ASpliDU object with results at genes
, bins
level.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
edgeR
, jDUreport
Accessors: genesDE
, binsDU
Export: writeDU
#check ASpli pacakge example
#check ASpli pacakge example
Estimate differential expression at gene level and differential usage at bin level using diffSpliceDGE function from edgeR package. This is an alternative approach to DUreport. The results at gene level are the same as the results from DUreport. The results at bin level are slightly different.
DUreportBinSplice( counts, targets, minGenReads = 10, minBinReads = 5, minRds = 0.05, contrast = NULL, forceGLM = FALSE, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = FALSE, verbose = TRUE )
DUreportBinSplice( counts, targets, minGenReads = 10, minBinReads = 5, minRds = 0.05, contrast = NULL, forceGLM = FALSE, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = FALSE, verbose = TRUE )
counts |
An object of class ASpliCounts |
targets |
A dataframe containing sample, bam and experimental factor columns. |
minGenReads |
Genes with at least an average of |
minBinReads |
Bins with at least an average of |
minRds |
Genes with at least an average of |
ignoreExternal |
Ignore Exon Bins at the beginning or end of the transcript. Default value is TRUE. |
ignoreIo |
Ignore original introns. Default TRUE |
ignoreI |
Ignore intron bins, test is performed only for exons. Default FALSE |
contrast |
Define the comparison between conditions to be tested.
|
forceGLM |
Force the use of a generalized linear model to estimate differential expression. It is not used to differential usage of bins. Default = FALSE |
filterWithContrasted |
A logical value specifying if bins, genes and junction will be filtered by read quantity and read density using data from those conditions that will be used in the comparison, i.e. those which coefficients in contrast argument are different from zero. The default value is FALSE, it is strongly recommended to do not change this value. |
verbose |
A logical value that indicates that detailed information about each step in the analysis will be presented to the user. |
An ASpliDU object with results at genes
, bins
level.
Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
edgeR
, junctionDUreport
Accessors: genesDE
, binsDU
Export: writeDU
#This function has been deprecated. Please see vignette for new pipeline.
#This function has been deprecated. Please see vignette for new pipeline.
ASpli includes functions to easily build ASpli objects, used in examples in the vignette and man pages.
aspliASexample() aspliBamsExample() aspliCountsExample() aspliDUexample1() aspliDUexample2() aspliExampleBamList() aspliExampleGTF() aspliFeaturesExample() aspliJunctionDUexample() aspliTargetsExample()
aspliASexample() aspliBamsExample() aspliCountsExample() aspliDUexample1() aspliDUexample2() aspliExampleBamList() aspliExampleGTF() aspliFeaturesExample() aspliJunctionDUexample() aspliTargetsExample()
An ASpli object with example data.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
#as <- aspliASexample() #bams <- aspliBamsExample() #counts <- aspliCountsExample() #du1 <- aspliDUexample1() #du2 <- aspliDUexample2() #bamfiles <- aspliExampleBamList() #gtffile <- aspliExampleGTF() #features <- aspliFeaturesExample() #jdu <- aspliJunctionDUexample() #targets <-aspliTargetsExample()
#as <- aspliASexample() #bams <- aspliBamsExample() #counts <- aspliCountsExample() #du1 <- aspliDUexample1() #du2 <- aspliDUexample2() #bamfiles <- aspliExampleBamList() #gtffile <- aspliExampleGTF() #features <- aspliFeaturesExample() #jdu <- aspliJunctionDUexample() #targets <-aspliTargetsExample()
Export integrated signals in an easy to analyze HTML table.
exportIntegratedSignals( is, output.dir="is", sr, counts, features, asd, mergedBams, jCompletelyIncluded = FALSE, zoomRegion = 1.5, useLog = FALSE, tcex = 1, ntop = NULL, openInBrowser = FALSE, makeGraphs = TRUE, bforce=FALSE )
exportIntegratedSignals( is, output.dir="is", sr, counts, features, asd, mergedBams, jCompletelyIncluded = FALSE, zoomRegion = 1.5, useLog = FALSE, tcex = 1, ntop = NULL, openInBrowser = FALSE, makeGraphs = TRUE, bforce=FALSE )
is |
An object of class |
sr |
An object of class |
counts |
An object of class |
features |
An object of class |
asd |
An object of class |
output.dir |
HTML reports output directory |
mergedBams |
Dataframe with two columns, bams and conditions. Bams are paths to merged bams for each condition to be ploted. |
jCompletelyIncluded |
If TRUE only plot junctions completely included in plot region. Else plot any overlapping junction in the region |
zoomRegion |
Magnify plot region by this factor |
useLog |
Plot counts log |
tcex |
Text size |
ntop |
Only show n top signals |
openInBrowser |
Open reports in browser when done |
makeGraphs |
Generate graphs in reports |
bforce |
Force plot generation even if plot already exists |
Produces html reports
Andres Rabinovich, Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
gbDUreport
, jDUreport
, ASpliSplicingReport
, splicingReport
, ASpliIntegratedSignals
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D')) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 gbPaired <- gbDUreport(gbcounts, contrast = c(1, -1)) jPaired <- jDUreport(jcounts, , contrast = c(1, -1)) # Generate a splicing report merging bins and junctions DU sr <- splicingReport(gbPaired, jPaired, gbcounts) is <- integrateSignals(sr, jcounts) #Make merged bams dataframe mergedBamsFileNames <- c( "A_C.bam", "A_D.bam" ) mergedBams <- data.frame(bams = system.file( 'extdata', mergedBamsFileNames, package="ASpli" ), condition = c("C", "D"), stringsAsFactors = FALSE) # Export integrated signals exportIntegratedSignals(is, output.dir = paste0(tempdir(), "/is"), sr, gbcounts, features, jcounts, mergedBams, makeGraphs = TRUE, bforce = TRUE )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D')) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 gbPaired <- gbDUreport(gbcounts, contrast = c(1, -1)) jPaired <- jDUreport(jcounts, , contrast = c(1, -1)) # Generate a splicing report merging bins and junctions DU sr <- splicingReport(gbPaired, jPaired, gbcounts) is <- integrateSignals(sr, jcounts) #Make merged bams dataframe mergedBamsFileNames <- c( "A_C.bam", "A_D.bam" ) mergedBams <- data.frame(bams = system.file( 'extdata', mergedBamsFileNames, package="ASpli" ), condition = c("C", "D"), stringsAsFactors = FALSE) # Export integrated signals exportIntegratedSignals(is, output.dir = paste0(tempdir(), "/is"), sr, gbcounts, features, jcounts, mergedBams, makeGraphs = TRUE, bforce = TRUE )
Export splicing reports in easy to analyze HTML tables.
exportSplicingReports( sr, output.dir="sr" , openInBrowser = FALSE, maxBinFDR = 0.2, maxJunctionFDR = 0.2 )
exportSplicingReports( sr, output.dir="sr" , openInBrowser = FALSE, maxBinFDR = 0.2, maxJunctionFDR = 0.2 )
sr |
An object of class |
output.dir |
HTML reports output directory |
openInBrowser |
Open reports in browser when done |
maxBinFDR |
Only show bins with FDR < maxBinFDR |
maxJunctionFDR |
Only show junctions with FDR < maxJunctionFDR |
Produces html reports
Andres Rabinovich, Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
gbDUreport
, jDUreport
, splicingReport
, ASpliSplicingReport
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 controlling for paired subject gbPaired <- gbDUreport(gbcounts, formula = formula(~subject+factor1)) jPaired <- jDUreport(jcounts, formula = formula(~subject+factor1)) # Generate a splicing report merging bins and junctions DU sr <- splicingReport(gbPaired, jPaired, gbcounts) # Export splicing report exportSplicingReports(output.dir = paste0(tempdir(), "/sr"), sr)
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 controlling for paired subject gbPaired <- gbDUreport(gbcounts, formula = formula(~subject+factor1)) jPaired <- jDUreport(jcounts, formula = formula(~subject+factor1)) # Generate a splicing report merging bins and junctions DU sr <- splicingReport(gbPaired, jPaired, gbcounts) # Export splicing report exportSplicingReports(output.dir = paste0(tempdir(), "/sr"), sr)
Accessors for ASpliFeatures object
featuresg( x ) featuresb( x ) featuresj( x ) transcriptExons( x )
featuresg( x ) featuresb( x ) featuresj( x ) transcriptExons( x )
x |
An ASpliFeatures object |
Returns a GenomicRanges object. Function featuresg returns a GRangesList object containing exon ranges for each gene. Functions featuresb and featuresj, returns GRanges object for all bins and junctions.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Get data from an ASpliFeatures object features <- aspliFeaturesExample() fg <- featuresg( features ) fb <- featuresb( features ) fj <- featuresj( features ) # Set data to an ASpliFeatures object featuresg( features ) <- fg featuresb( features ) <- fb featuresj( features ) <- fj
# Get data from an ASpliFeatures object features <- aspliFeaturesExample() fg <- featuresg( features ) fb <- featuresb( features ) fj <- featuresj( features ) # Set data to an ASpliFeatures object featuresg( features ) <- fg featuresb( features ) <- fb featuresj( features ) <- fj
ASpliDU object can be filtered to retain genes, bins or junction according to their fdr corrected p-value estimated and log-fold-change.
filterDU( du , what = c( 'genes','bins','junctions'), fdr = 1, logFC = 0, absLogFC = TRUE, logFCgreater = TRUE )
filterDU( du , what = c( 'genes','bins','junctions'), fdr = 1, logFC = 0, absLogFC = TRUE, logFCgreater = TRUE )
du |
An ASpliDU object |
what |
A character vector that specifies the kind of features that will be filtered. Accepted values are 'genes', 'bins', 'junctions'. Multiple values can be passed at the same time. The default value is c( 'genes','bins', 'junctions') |
fdr |
A double value representing the maximum accepted value of fdr corrected p-value to pass the filter. The default value is 1, the neutral value for fdr filtering operation. |
logFC |
A double value representing the cut-off for accepted values of log-fold-change to pass the filter. The default value is 0, the neutral value for logFC filtering operation if logFCgreater and absLocFC arguments are both TRUE. |
absLogFC |
A logical value that specifies that the absolute value of log-fold-change will be used in the filter operation. The default value is TRUE. |
logFCgreater |
A logical value that specifies that the log-fold-change value ( or abs(log-fold-change) if absLogFC argument is TRUE) of features must be greater than the cut-off value to pass the filter. The default value is TRUE. |
A new ASpliDU object with the results of the filtering operations. The elements of features that were not specified to be filtered are kept from the input ASpliDU object.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
DUreport.norm
,
DUreport.offset
,
jDUreport
,
gbDUreport
,
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # Load reads from bam files #bams <- loadBAM( targets ) # Read counts from bam files #counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # Calculate differential usage of junctions only #du <- DUreport.norm( counts, targets ) # Filter by FDR #duFiltered1 <- filterDU( du, what=c('genes','bins'), # fdr = 0.01 ) # Filter by logFC, only those that were up-regulated #duFiltered2 <- filterDU( du, what=c('genes','bins'), # logFC = log( 1.5, 2 ), absLogFC = FALSE )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # Load reads from bam files #bams <- loadBAM( targets ) # Read counts from bam files #counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # Calculate differential usage of junctions only #du <- DUreport.norm( counts, targets ) # Filter by FDR #duFiltered1 <- filterDU( du, what=c('genes','bins'), # fdr = 0.01 ) # Filter by logFC, only those that were up-regulated #duFiltered2 <- filterDU( du, what=c('genes','bins'), # logFC = log( 1.5, 2 ), absLogFC = FALSE )
Filter signals
filterSignals( sr, bin.FC = 3, bin.fdr = 0.05, nonunif = 1, bin.inclussion = 0.1, bjs.inclussion = 0.2, bjs.fdr = 0.1, a.inclussion = 0.3, a.fdr = 0.05, l.inclussion = 0.3, l.fdr = 0.05, bDetectionSummary = FALSE)
filterSignals( sr, bin.FC = 3, bin.fdr = 0.05, nonunif = 1, bin.inclussion = 0.1, bjs.inclussion = 0.2, bjs.fdr = 0.1, a.inclussion = 0.3, a.fdr = 0.05, l.inclussion = 0.3, l.fdr = 0.05, bDetectionSummary = FALSE)
sr |
An object of class |
bin.FC |
Description TODO |
bin.fdr |
Description TODO |
nonunif |
Description TODO |
bin.inclussion |
Description TODO |
bjs.inclussion |
Description TODO |
bjs.fdr |
Description TODO |
a.inclussion |
Description TODO |
a.fdr |
Description TODO |
l.inclussion |
Description TODO |
l.fdr |
Description TODO |
bDetectionSummary |
Description TODO |
TODO
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #as <- new("ASpliAS") #targets <- 1 #a <- junctionDUreportExt(as, targets)
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #as <- new("ASpliAS") #targets <- 1 #a <- junctionDUreportExt(as, targets)
Summarize read overlaps against all feature levels
gbCounts( features, targets, minReadLength, maxISize, minAnchor = 10, libType="SE", strandMode=0, alignFastq = FALSE, dropBAM = FALSE)
gbCounts( features, targets, minReadLength, maxISize, minAnchor = 10, libType="SE", strandMode=0, alignFastq = FALSE, dropBAM = FALSE)
features |
An object of class ASpliFeatures. It is a list of GRanges at gene, bin and junction level |
targets |
A dataframe containing sample, bam and experimental factors columns |
minReadLength |
Minimum read length of sequenced library. It is used for computing E1I and IE2 read summarization. Make sure this number is smaller than the maximum read length in every bam file, otherwise no E1I or IE2 will be found. |
maxISize |
Maximum intron expected size. Junctions longer than this size will be dicarded |
minAnchor |
Minimum percentage of read that should be aligned to an exon-intron boundary. |
libType |
Defines how reads will be treated according their sequencing lybrary type (paired (PE, default) or single end (SE)) |
strandMode |
controls the behavior of the strand getter. It indicates how the strand of a pair should be inferred from the strand of the first and last alignments in the pair. 0: strand of the pair is always *. 1: strand of the pair is strand of its first alignment. This mode should be used when the paired-end data was generated using one of the following stranded protocols: Directional Illumina (Ligation), Standard SOLiD. 2: strand of the pair is strand of its last alignment. This mode should be used when the paired-end data was generated using one of the following stranded protocols: dUTP, NSR, NNSR, Illumina stranded TruSeq PE protocol. For more information see ?strandMode |
alignFastq |
Experimental (that means it's highly recommended to, leave the default, FALSE): executes an alignment step previous to Bam summarization. Useful if not enough space on local disks for beans so fasts can be aligned on the fly, even from a remote machine, and then the BAMs can be deleted after each summarization. If set to TRUE, targets data frame must have a column named alignerCall with complete call to aligner for each sample. ie: STAR –runMode alignReads –outSAMtype BAM SortedByCoordinate –readFilesCommand zcat –genomeDir /path/to/STAR/genome/folder -runThreadN 4 –outFileNamePrefix sample_name –readFilesIn /path/to/R1 /path/to/R2. Output must match bam files provided in targets. |
dropBAM |
Experimental (that means it's highly recommended to leave the default, FALSE): If alignFastq is TRUE, deletes BAMs after sumarization. Used in conjunction with alignFastq to delete BAMs when there's not enough free space on disk. Use with caution as it will delete all files in "bam" column in targets dataframe. |
An object of class ASpliCounts. Each slot is a dataframe containing features metadata and read counts. Summarization is reported at gene, bin, junction and intron flanking regions (E1I, IE2).
countsg |
symbol: gene symbol locus_overlap: other genes overlaping this locus gene_coordinates: gene coordinates start: gene start end: gene end length: gene length effective_length: gene effective lengh From effective_length to the end, gene counts for all samples |
countsb |
feature: bin type event: type of event asigned by ASpli when bining. locus: gene locus locus_overlap: genes overlaping the same locus symbol: gene symbol gene_coordinates: gene coordinates start: bin start end: bin end length: bin length From length to the end, bin counts for all samples |
countsj |
junction: annotated junction matching the current junction. gene: gene matching the current junction. strand: gene strand for the current junction in case a gene matches with the junction. multipleHit: semicolon separated list of junctions matching the current junction. symbol: gene symbol. gene_coordinates: gene coordinates. bin_spanned: semicolon separated list of all the bins spaned by this junction. j_within_bin: other junctions in the bins. From j_within_bin to the end, junction counts for all samples. |
countse1i |
event: type of event asigned by ASpli when bining. locus: gene locus locus_overlap: genes overlaping the same locus symbol: gene symbol gene_coordinates: gene coordinates start: bin start end: bin end length: bin length From length to the end, bin counts for all samples |
countsie2 |
event: type of event asigned by ASpli when bining. locus: gene locus locus_overlap: genes overlaping the same locus symbol: gene symbol gene_coordinates: gene coordinates start: bin start end: bin end length: bin length From length to the end, bin counts for all samples |
rdsg |
symbol: gene symbol locus_overlap: other genes overlaping this locus gene_coordinates: gene coordinates start: gene start end: gene end length: gene length effective_length: gene effective lengh From effective_length to the end, gene counts/effective_length for all samples |
countsb |
feature: bin type event: type of event asigned by ASpli when bining. locus: gene locus locus_overlap: genes overlaping the same locus symbol: gene symbol gene_coordinates: gene coordinates start: bin start end: bin end length: bin length From length to the end, bin counts/length for all samples |
condition.order |
The order in which ASpli is reading the conditions. This is useful for contrast tests, in order to make sure which conditions are being contrasted. |
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Accesors: countsg
,
countsb
,
countsj
,
countse1i
,
countsie2
,
rdsg
,
rdsb
,
condition.order
,
Export: writeCounts
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) # Access summary and gene and bin counts and display them gbcounts countsg(gbcounts) countsb(gbcounts) # Export data writeCounts( gbcounts, output.dir = paste0(tempdir(), "/only_counts") )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) # Access summary and gene and bin counts and display them gbcounts countsg(gbcounts) countsb(gbcounts) # Export data writeCounts( gbcounts, output.dir = paste0(tempdir(), "/only_counts") )
Estimate differential expression at gene level and differential usage at bin level using diffSpliceDGE function from edgeR package.
gbDUreport( counts, minGenReads = 10, minBinReads = 5, minRds = 0.05, contrast = NULL, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = TRUE, verbose = TRUE, formula = NULL, coef = NULL)
gbDUreport( counts, minGenReads = 10, minBinReads = 5, minRds = 0.05, contrast = NULL, ignoreExternal = TRUE, ignoreIo = TRUE, ignoreI = FALSE, filterWithContrasted = TRUE, verbose = TRUE, formula = NULL, coef = NULL)
counts |
An object of class ASpliCounts |
minGenReads |
Genes with at least an average of |
minBinReads |
Bins with at least an average of |
minRds |
Genes with at least an average of |
ignoreExternal |
Ignore Exon Bins at the beginning or end of the transcript. Default value is TRUE. |
ignoreIo |
Ignore original introns. Default TRUE |
ignoreI |
Ignore intron bins, test is performed only for exons. Default FALSE |
contrast |
Either a formula or a contrast can be tested.
If contrast is used, it defines the comparison between conditions to be tested.
|
filterWithContrasted |
A logical value specifying if bins, genes and junction will be filtered by read quantity and read density using data from those conditions that will be used in the comparison, i.e. those which coefficients in contrast argument are different from zero. The default value is TRUE, it is strongly recommended to do not change this value. |
verbose |
A logical value that indicates that detailed information about each step in the analysis will be presented to the user. |
formula |
Either a formula or a contrast can be tested.
If formula is used, complex tests can be run.
|
coef |
For formula only. The coefficient to be tested. If null the test defaults to the last term in the formula |
An ASpliDU object with results at genes
, bins
level.
genesDE |
symbol: gene symbol locus_overlap: genes overlaping the same locus gene_coordinates: gene coordinates start: gene start end: gene end length: gene length effective_length: gene effective length logFC: gene log2 fold change between conditions pvalue: p-value gen.fdr: fdr corrected p-value for multiple testing |
binsDU |
feature: bin type event: type of event asigned by ASpli when bining. locus: gene locus locus_overlap: genes overlaping the same locus symbol: gene symbol gene_coordinates: gene coordinates start: bin start end: bin end length: bin length logFC: bin log2 fold change between conditions pvalue: p-value bin.fdr: fdr corrected p-value for multiple testing |
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
edgeR
, jDUreport
Accessors: genesDE
, binsDU
Export: writeDU
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) # Test for factor1 # Test for factor1 controlling for paired subject gbPaired <- gbDUreport(gbcounts, formula = formula(~subject+factor1)) # Show all genes and bins ordered by FDR genesDE(gbPaired) binsDU(gbPaired) # Test for factor1 without controlling for paired subject. # Must change conditions inside gbcounts object to accommodate the contrast. gbcounts@targets$condition <- targets$factor1 [email protected] <- c("C", "D") gbContrast <- gbDUreport(gbcounts, contrast = c(1, -1)) # Show all genes and bins ordered by FDR genesDE(gbContrast) binsDU(gbContrast) # Export results writeDU( du = gbPaired, output.dir = paste0(tempdir(), "/gbPaired") ) writeDU( du = gbContrast, output.dir = paste0(tempdir(), "/gbContrast") )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) # Test for factor1 # Test for factor1 controlling for paired subject gbPaired <- gbDUreport(gbcounts, formula = formula(~subject+factor1)) # Show all genes and bins ordered by FDR genesDE(gbPaired) binsDU(gbPaired) # Test for factor1 without controlling for paired subject. # Must change conditions inside gbcounts object to accommodate the contrast. gbcounts@targets$condition <- targets$factor1 gbcounts@condition.order <- c("C", "D") gbContrast <- gbDUreport(gbcounts, contrast = c(1, -1)) # Show all genes and bins ordered by FDR genesDE(gbContrast) binsDU(gbContrast) # Export results writeDU( du = gbPaired, output.dir = paste0(tempdir(), "/gbPaired") ) writeDU( du = gbContrast, output.dir = paste0(tempdir(), "/gbContrast") )
Targets data frame contains experimental factors values for each
sample. This function generates a simple name for each unique condition
resulting from the combination of all experimental factors. The order of the
conditions given by getConditions
is the same that in contrast
argument of DUreport.norm
function.
getConditions( targets )
getConditions( targets )
targets |
A dataframe containing sample, bam and experimental factors columns |
A character vector with the names of the conditions derived from experimental factors.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
DUreport.norm
,
DUreport.offset
# Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # Load reads from bam files. # Return value is c('C', 'D') in this example. #conditions <- getConditions(targets) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_C_3.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam", "A_D_3.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:8)), # bam = file.path( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','C','D','D','D','D'), # factor2 = c( 'E','E','F','F','E','E','F','F') ) # Load reads from bam files. # Return value is c("C_E", "C_F", "D_E", "D_F") in this example. #conditions <- getConditions(targets)
# Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # Load reads from bam files. # Return value is c('C', 'D') in this example. #conditions <- getConditions(targets) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_C_3.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam", "A_D_3.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:8)), # bam = file.path( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','C','D','D','D','D'), # factor2 = c( 'E','E','F','F','E','E','F','F') ) # Load reads from bam files. # Return value is c("C_E", "C_F", "D_E", "D_F") in this example. #conditions <- getConditions(targets)
Accessors for ASpliIntegratedSignals object
signals( x ) filters( x )
signals( x ) filters( x )
x |
An ASpliIntegratedSignals object |
Returns dataframes
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Integrates differential usage signals from different sources using overlaping regions. See vignette for more details
integrateSignals(sr = NULL, asd = NULL, bin.FC = 3, bin.fdr = 0.05, nonunif = 1, usenonunif = FALSE, bin.inclussion = 0.2, bjs.inclussion = 10.3, bjs.fdr = 0.01, a.inclussion = 0.3, a.fdr = 0.01, l.inclussion = 0.3, l.fdr = 0.01, otherSources = NULL, overlapType = "any")
integrateSignals(sr = NULL, asd = NULL, bin.FC = 3, bin.fdr = 0.05, nonunif = 1, usenonunif = FALSE, bin.inclussion = 0.2, bjs.inclussion = 10.3, bjs.fdr = 0.01, a.inclussion = 0.3, a.fdr = 0.01, l.inclussion = 0.3, l.fdr = 0.01, otherSources = NULL, overlapType = "any")
sr |
An object of class |
asd |
An object of class |
bin.FC |
Filter bin signals by fold change. Actually, log2 fold change is return, so default would return only bin signlas with bin.fc > log2(3). |
bin.fdr |
Filter bin signals by fdr. |
nonunif |
Filter intronic bins with non uniform support (nonunif << 1 is uniform) |
usenonunif |
Use non uniformity as filter. |
bin.inclussion |
Filter bin signals by junction support with dPIR or dPSI accordingly. |
bjs.inclussion |
Filter annotated junction signals by junction inclussion with dPIR or dPSI accordingly. |
bjs.fdr |
Filter annotated junction signals by fdr. |
a.inclussion |
Filter anchor junction signals by junction inclussion with dPIR. |
a.fdr |
Filter anchor junction signals by fdr. |
l.inclussion |
Filter locale junction signals by junction inclussion with dPSI. |
l.fdr |
Filter locale junction signals by fdr. |
otherSources |
If user wants to compare ASpli results with results from other methods, otherSources must be a GenomicRange object with all the regions found with the other methods. It will be integrated with a new column next to signals information. |
overlapType |
Type of regions overlap matching between the different signals. Defaults to "any" and can be any of the following: "any", "start", "end", "within", "equal". |
It returns A ASpliIntegratedSignals
with all overlaping signals present in the region filtered by different parameters.
Andres Rabinovich, Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Accesors: signals,
filters,
Export: exportIntegratedSignals
gbDUreport
, jDUreport
, ASpliSplicingReport
, splicingReport
, ASpliIntegratedSignals
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 controlling for paired subject gbPaired <- gbDUreport(gbcounts, formula = formula(~subject+factor1)) jPaired <- jDUreport(jcounts, formula = formula(~subject+factor1)) # Generate a splicing report merging bins and junctions DU sr <- splicingReport(gbPaired, jPaired, gbcounts) is <- integrateSignals(sr, jcounts) # Show integrate signals results and filters used signals(is) filters(is)
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 controlling for paired subject gbPaired <- gbDUreport(gbcounts, formula = formula(~subject+factor1)) jPaired <- jDUreport(jcounts, formula = formula(~subject+factor1)) # Generate a splicing report merging bins and junctions DU sr <- splicingReport(gbPaired, jPaired, gbcounts) is <- integrateSignals(sr, jcounts) # Show integrate signals results and filters used signals(is) filters(is)
Summarize read overlaps against junctions. Report PSI, PJU, PIR and counts for experimental junctions. PSI or PIR metrics are calculated for each bin and experimental condition. The selection of which metric is used is based on the kind of splicing event associated with each bin.
jCounts( counts, features, minReadLength, threshold = 5, minAnchor = 10, libType="SE", strandMode=0, alignFastq = FALSE, dropBAM = FALSE )
jCounts( counts, features, minReadLength, threshold = 5, minAnchor = 10, libType="SE", strandMode=0, alignFastq = FALSE, dropBAM = FALSE )
counts |
An object of class ASpliCounts. |
features |
An object of class ASpliFeatures. |
minReadLength |
Minimum read length of sequenced library. It is used for computing E1I and IE2 read summarization. Make sure this number is smaller than the maximum read length in every bam file, otherwise no junctions will be found. |
threshold |
Minimum number of reads supporting junctions. Default=5 |
minAnchor |
An intronic junction must overlap completely and at least an minAnchor% into the exon region and the intron region. The regions can be exon1-intron or intron-exon2. |
libType |
Defines how reads will be treated according their sequencing lybrary type (paired (PE, default) or single end (SE)) |
strandMode |
controls the behavior of the strand getter. It indicates how the strand of a pair should be inferred from the strand of the first and last alignments in the pair. 0: strand of the pair is always *. 1: strand of the pair is strand of its first alignment. This mode should be used when the paired-end data was generated using one of the following stranded protocols: Directional Illumina (Ligation), Standard SOLiD. 2: strand of the pair is strand of its last alignment. This mode should be used when the paired-end data was generated using one of the following stranded protocols: dUTP, NSR, NNSR, Illumina stranded TruSeq PE protocol. For more information see ?strandMode |
alignFastq |
Experimental (that means it's highly recommended to, leave the default, FALSE): executes an alignment step previous to Bam summarization. Useful if not enough space on local disks for beans so fasts can be aligned on the fly, even from a remote machine, and then the BAMs can be deleted after each summarization. If set to TRUE, targets data frame must have a column named alignerCall with complete call to aligner for each sample. ie: STAR –runMode alignReads –outSAMtype BAM SortedByCoordinate –readFilesCommand zcat –genomeDir /path/to/STAR/genome/folder -runThreadN 4 –outFileNamePrefix sample-name –readFilesIn /path/to/R1 /path/to/R2. Output must match bam files provided in targets. |
dropBAM |
Experimental (that means it's highly recommended to leave the default, FALSE): If alignFastq is TRUE, deletes BAMs after sumarization. Used in conjunction with alignFastq to delete BAMs when there's not enough free space on disk. Use with caution as it will delete all files in "bam" column in targets dataframe. |
An object of class ASpliAS. Accesors: irPIR, esPSI, altPSI, junctionsPIR, junctionsPJU
irPIR |
event: Type of event asigned by ASpli when bining. J1: Semicolon separated list of all the junctions with an end matching the start of the intron. J2: Semicolon separated list of all the junctions with an end matching the end of the intron. J3: Semicolon separated list of all the junctions overlaping the intron. All the columns from J1 to J2 represent the J1 counts in the different samples for each bin. The counts are the sum of all the J1 junctions. All the columns from J2 to J3 represent the J2 counts in the different samples for each bin. The counts are the sum of all the J2 junctions. All the columns from J3 to the first condition represent the J3 counts in the different samples for each bin. The counts are the sum of all the J3 junctions. The last columns are the PIR metrics calculated for each condition. The PIR metric is calculated as:
Where the junctions are the sum by condition. |
altPSI |
event: Type of event asigned by ASpli when bining. J1(J2): Semicolon separated list of all the junctions with an end matching the end of alt5'SS(alt3'SS). J3: Semicolon separated list of all the junctions with an end matching the start of alt5'SS or the start of alt3'SS. All the columns from J1 to J2 represent the J1 counts in the different samples for each bin. The counts are the sum of all the J1 junctions. All the columns from J2 to J3 represent the J2 counts in the different samples for each bin. The counts are the sum of all the J2 junctions. All the columns from J3 to the first condition represent the J3 counts in the different samples for each bin. The counts are the sum of all the J3 junctions. The last columns are the PSI metrics calculated for each condition. The PSI metric is calculated as:
Where J12 is J1 if it's an alt 5' event or J2 if it's an alt 3' event and the junctions are the sum by condition. |
esPSI |
event: Type of event asigned by ASpli when bining J1: Semicolon separated list of all the junctions with an end on the alternative exon. J2: Semicolon separated list of all the junctions with an end on the alternative exon. J3: Semicolon separated list of all the junctions overlaping the alternative exon. All the columns from J1 to J2 represent the J1 counts in the different samples for each bin. The counts are the sum of all the J1 junctions. All the columns from J2 to J3 represent the J2 counts in the different samples for each bin. The counts are the sum of all the J2 junctions. All the columns from J3 to the first condition represent the J3 counts in the different samples for each bin. The counts are the sum of all the J3 junctions. The PSI metric is calculated as:
Where the junctions are the sum by condition. |
junctionsPIR |
PIR metric for each experimental junction using e1i and ie2 counts. Exclusion junction is the junction itself. This output helps to discover new introns as well as new retention events. hitIntron: If the junction matches a bin, the bin is shown here. hitIntronEvent: If the junction matches a bin, the type of event asigned by ASpli to this bin. All the columns from hitIntronEvent up to the first repetition of the samples names in the columns, represent the J1 counts in the different samples for each region. From there to the next time the names of the columns repeat themselves, the J2 counts and from there to the first condition, the J3 counts. The last columns are the PIR metrics calculated for each condition. The PIR metric is calculated as:
Where the junctions are the sum by condition. |
junctionsPJU |
Given a junction, it is possible to analyze if it shares start, end or both with another junction. If so, it is because there is alternative splicing.
Junction: name of the junction.
gene: gene it belongs to.
strand: gene strand.
multipleHit: if other gene overlaps the gene the junction belongs to.
symbol: gene symbol.
gene_coordinates: gene coordinates.
bin_spanned: semicolon separated list of all the bins spaned by this junction.
j_within_bin: other junctions in the bins.
StartHit: all the junctions sharing the start with this junction and |
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky and Ariel Chernomoretz
Accesors: irPIR
, altPSI
, esPSI
,
junctionsPIR
, junctionsPJU
Export: writeAS
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Access summary and gene and bin counts and display them gbcounts countsg(gbcounts) countsb(gbcounts) # Access summary and junction counts and display them jcounts irPIR(jcounts) esPSI(jcounts) altPSI(jcounts) junctionsPIR(jcounts) junctionsPJU(jcounts) # Export data writeAS( as = jcounts, output.dir = paste0(tempdir(), "/only_as") )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Access summary and gene and bin counts and display them gbcounts countsg(gbcounts) countsb(gbcounts) # Access summary and junction counts and display them jcounts irPIR(jcounts) esPSI(jcounts) altPSI(jcounts) junctionsPIR(jcounts) junctionsPJU(jcounts) # Export data writeAS( as = jcounts, output.dir = paste0(tempdir(), "/only_as") )
Accessors for ASpliJDU object
anchorc( x ) anchorj( x ) localec( x ) localej( x ) jir( x ) jes( x ) jalt( x )
anchorc( x ) anchorj( x ) localec( x ) localej( x ) jir( x ) jes( x ) jalt( x )
x |
An ASpliJDU object |
Returns dataframes
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
This function estimates the differential usage of junctions combining different types of evidence
Differential junction usage is estimated using a combination of evidences
jDUreport(asd, minAvgCounts = 5, contrast = NULL, filterWithContrasted = TRUE, runUniformityTest = FALSE, mergedBams = NULL, maxPValForUniformityCheck = 0.2, strongFilter = TRUE, maxConditionsForDispersionEstimate = 24, formula = NULL, coef = NULL, maxFDRForParticipation = 0.05, useSubset = FALSE)
jDUreport(asd, minAvgCounts = 5, contrast = NULL, filterWithContrasted = TRUE, runUniformityTest = FALSE, mergedBams = NULL, maxPValForUniformityCheck = 0.2, strongFilter = TRUE, maxConditionsForDispersionEstimate = 24, formula = NULL, coef = NULL, maxFDRForParticipation = 0.05, useSubset = FALSE)
asd |
An object of class |
minAvgCounts |
Minimum average counts for filtering |
contrast |
Define the comparison between conditions to be tested.
|
filterWithContrasted |
A logical value specifying if bins, genes and junction will be filtered by read quantity and read density using data from those conditions that will be used in the comparison, i.e. those which coefficients in contrast argument are different from zero. The default value is TRUE, it is strongly recommended to do not change this value. |
runUniformityTest |
Run uniformity test on Intron Retention. Sometimes Mutually Exclusive Exons (MEX) events can be confused with Intron Retention events. This test compares the standard deviation of the inner intron region (11 bases from both ends) to the mean of both intron ends. Numbers closer to 0 mean the event is more probably an Intron Retention event than an MEX event. The test takes some time to run so it defaults to FALSE. |
mergedBams |
Path to merged bams for each testing condition. If no merged bams exist (for example, paired samples without replicates), use the same bams as targets. |
maxPValForUniformityCheck |
To speed up uniformity test only check junctions with pval < maxPValForUniformityCheck |
strongFilter |
If strongFilter is TRUE, then we remove all events with at least one junction that doesn't pass the filter. |
maxConditionsForDispersionEstimate |
In order to reduce resource usage, estimate dispersion for statistics tests with a reduced number of conditions. |
formula |
Either a formula or a contrast can be tested. If formula is used, complex tests can be run.
|
coef |
For formula only. The coefficient to be tested. If null the test defaults to the last term in the formula |
maxFDRForParticipation |
In order to calculate junctionPSI participation, only use significant junctions (ie junctions with FDR < maxFDRForParticipation). |
useSubset |
Experimental. It is strongly recomended to leave the default, FALSE. |
Estimation is made at junction level using diffSpliceDGE function from edgeR package. Junctions belonging to the same AS event comprises the event "set". Each junction is tested against this "set" in a similar fashion that bins are tested against their gene in diffSpliceDGE. Localec are clusters made of junctions that share an end with at least another junction in the cluster.
An ASpliJDU
object with results of differential usage at junctions
level.
localec |
size: number of junctions belonging to the cluster. cluster.LR: likelihood ratio of cluster differential usage. pvalue: pvalue of cluster differential usage. FDR: fdr of cluster differential usage. range: cluster location. participation: participation of the significant junction (FDR < maxFDRForParticipation) presenting maximal participation value inside the cluster dParticipation: delta participation of the significant junction (FDR < maxFDRForParticipation) presenting maximal participation value inside the cluster |
localej |
cluster: name of the cluster the junction belongs to log.mean: log of mean counts accross all conditions for this junction logFC: log fold change of junction accross conditions pvalue: pvalue of junction FDR: FDR of junction annotated: is junction annotated or new participation: the maximal participation value observed across contrasted condictions dParticipation: delta participation of the maximal participation value observed across contrasted condictions From dParticipation to the end, junction counts for all samples |
anchorc |
cluster.LR: likelihood ratio of cluster differential usage. pvalue: pvalue of cluster differential usage. FDR: fdr of cluster differential usage. |
anchorj |
log.mean: log of mean counts accross all conditions for this junction logFC: log fold change of junction accross conditions LR: likelihood ratio of junction differential usage. pvalue: pvalue of junction FDR: FDR of junction J1.pvalue: pvalue of J1 junction J2.pvalue: pvalue of J2 junction NonUniformity: if non uniformity test was performed, numbers closer to zero mean uniformity and closer to one mean non uniformity dPIR: junction delta PIR annotated: is junction annotated or new From annotated to the end, junction counts for all samples |
jir |
J3: J3 junction/s logFC: log fold change of junction accross conditions log.mean: log of mean counts accross all conditions for this junction pvalue: pvalue of junction FDR: FDR of junction LR: likelihood ratio of junction differential usage. NonUniformity: if non uniformity test was performed, numbers closer to zero mean uniformity and closer to one mean non uniformity dPIR: junction delta PIR multiplicity: do multiple junctions cross the region From multiplicity to the end, junction counts for all samples |
jes |
event: type of event J3: J3 junction/s logFC: log fold change of junction accross conditions log.mean: log of mean counts accross all conditions for this junction pvalue: pvalue of junction FDR: FDR of junction LR: likelihood ratio of junction differential usage. dPSI: junction delta PSI multiplicity: do multiple junctions cross the region From multiplicity to the end, junction counts for all samples |
jalt |
event: type of event J3: J3 junction/s logFC: log fold change of junction accross conditions log.mean: log of mean counts accross all conditions for this junction pvalue: pvalue of junction FDR: FDR of junction LR: likelihood ratio of junction differential usage. dPSI: junction delta PSI multiplicity: do multiple junctions cross the region From multiplicity to the end, junction counts for all samples |
contrast |
Conditions contrasted by ASpli |
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Accesors: localec
,
localej
,
anchorc
,
anchorj
,
jir
,
jes
,
jalt
,
junctionsDU
,
Export: writeJDU
,
writeDU
, edgeR
, ASpliAS
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 controlling for paired subject jPaired <- jDUreport(jcounts, formula = formula(~subject+factor1)) # Show junctions information jPaired localej(jPaired) localec(jPaired) anchorj(jPaired) anchorc(jPaired) jir(jPaired) jes(jPaired) jalt(jPaired) # Export results writeJDU( jPaired, output.dir = paste0(tempdir(), "/jPaired") )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 controlling for paired subject jPaired <- jDUreport(jcounts, formula = formula(~subject+factor1)) # Show junctions information jPaired localej(jPaired) localec(jPaired) anchorj(jPaired) anchorc(jPaired) jir(jPaired) jes(jPaired) jalt(jPaired) # Export results writeJDU( jPaired, output.dir = paste0(tempdir(), "/jPaired") )
Estimate differential usage at junction level. When targets has only two conditions, and contrast is not set, the estimation of differential expression and usage is done with an exact test, otherwise is estimated using a generalized linear model.
junctionDUreport( counts, targets, appendTo = NULL, minGenReads = 10, minRds = 0.05, threshold = 5, offset = FALSE, offsetUseFitGeneX = TRUE, contrast = NULL, forceGLM = FALSE)
junctionDUreport( counts, targets, appendTo = NULL, minGenReads = 10, minRds = 0.05, threshold = 5, offset = FALSE, offsetUseFitGeneX = TRUE, contrast = NULL, forceGLM = FALSE)
counts |
An object of class |
targets |
A dataframe containing sample, bam and experimental factor columns. |
appendTo |
An object of class |
minGenReads |
Junctions within genes with at least an average of |
minRds |
Junctions within genes with at least an average of |
threshold |
Junction with at least |
offset |
Corrects junction counts using an offset matrix derived from gene expression data. Default = FALSE |
offsetUseFitGeneX |
Fit a GLM using gene counts to build the offset matrix. This argument is used only when 'offset' argument is set to TRUE. The default value is TRUE |
contrast |
Define the comparison between conditions to be tested.
|
forceGLM |
Force the use of a generalized linear model to estimate differential expression and usage. Default = FALSE |
An ASpliDU
object with results of differential usage of junctions
Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
edgeR
, DUreport
Accessors: junctionsDU
Export: writeDU
#This function has been deprecated. Please see vignette for new pipeline.
#This function has been deprecated. Please see vignette for new pipeline.
Load BAM files into R session using a targets specification.
loadBAM(targets, cores, libType, strandMode)
loadBAM(targets, cores, libType, strandMode)
targets |
A data frame containing sample, bam and experimental factors columns |
cores |
Number of processors to use |
libType |
Options are: "SE" or "PE" |
strandMode |
Options are: 0,1,2. See ?strandMode for more information |
A list of GAlignments or GAlignmentPairs . Each element of the list correspond to a samples BAM file.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # Load reads from bam files #bams <- loadBAM( targets )
# Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # Load reads from bam files #bams <- loadBAM( targets )
This function merges the results of differential usage of bins, from an ASpliDU object, with PSI/PIR and junction information, from an ASpliAS object. Also, a delta PSI/PIR value is calculated from a contrast.
mergeBinDUAS( du, as, targets, contrast = NULL )
mergeBinDUAS( du, as, targets, contrast = NULL )
du |
An object of class |
as |
An object of class |
targets |
A data frame containing sample, bam files and experimental factor columns. |
contrast |
Define the comparison between conditions to be tested.
|
A data frame containing feature, event, locus, locus_overlap, symbol, gene coordinates, start of bin, end of bin, bin length, log-Fold-Change value, p-value, fdr corrected p-value, J1 inclusion junction, J1 junction counts for each sample, J2 inclusion junction, J2 junction counts for each sample, J3 exclusion junction, J3 junction counts for each sample, PSI or PIR value for each bin, and delta PSI/PIR.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam", # "B_C_0.bam", "B_C_1.bam", "B_C_2.bam", # "B_D_0.bam", "B_D_1.bam", "B_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:12)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'A','A','A','A','A','A','B','B','B','B','B','B'), # factor2 = c( 'C','C','C','D','D','D','C','C','C','D','D','D') ) # Load reads from bam files #bams <- loadBAM( targets ) # Read counts from bam files #counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # Calculate differential usage of genes and bins #du <- DUreport.norm( counts, targets , contrast = c(1,-1,-1,1)) # Calculate PSI / PIR for bins and junction. #as <- AsDiscover( counts, targets, features, bams, readLength = 100, # threshold = 5, cores = 1 ) #mas <- mergeBinDUAS( du, as, targets, contrast = c(1,-1,-1,1) )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam", # "B_C_0.bam", "B_C_1.bam", "B_C_2.bam", # "B_D_0.bam", "B_D_1.bam", "B_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:12)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'A','A','A','A','A','A','B','B','B','B','B','B'), # factor2 = c( 'C','C','C','D','D','D','C','C','C','D','D','D') ) # Load reads from bam files #bams <- loadBAM( targets ) # Read counts from bam files #counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # Calculate differential usage of genes and bins #du <- DUreport.norm( counts, targets , contrast = c(1,-1,-1,1)) # Calculate PSI / PIR for bins and junction. #as <- AsDiscover( counts, targets, features, bams, readLength = 100, # threshold = 5, cores = 1 ) #mas <- mergeBinDUAS( du, as, targets, contrast = c(1,-1,-1,1) )
Creates a plot with gene counts, bin counts, PSI/PIR value, inclusion and exclusion junctions for selected bins and conditions.
plotBins( counts, as, bin, factorsAndValues, targets, main = NULL, colors = c( '#2F7955', '#79552F', '#465579', '#A04935', '#752020', '#A07C35') , panelTitleColors = '#000000', panelTitleCex = 1, innerMargins = c( 2.1, 3.1, 1.1, 1.1 ), outerMargins = c( 0, 0, 2.4, 0 ), useBarplots = NULL, barWidth = 0.9, barSpacer = 0.4, las.x = 2, useHCColors = FALSE, legendAtSide = TRUE, outfolder = NULL, outfileType = c( 'png', 'bmp', 'jpeg', 'tiff', 'pdf')[1], deviceOpt = NULL )
plotBins( counts, as, bin, factorsAndValues, targets, main = NULL, colors = c( '#2F7955', '#79552F', '#465579', '#A04935', '#752020', '#A07C35') , panelTitleColors = '#000000', panelTitleCex = 1, innerMargins = c( 2.1, 3.1, 1.1, 1.1 ), outerMargins = c( 0, 0, 2.4, 0 ), useBarplots = NULL, barWidth = 0.9, barSpacer = 0.4, las.x = 2, useHCColors = FALSE, legendAtSide = TRUE, outfolder = NULL, outfileType = c( 'png', 'bmp', 'jpeg', 'tiff', 'pdf')[1], deviceOpt = NULL )
counts |
An object of class |
as |
An object of class |
bin |
A character vector with the names of the bins to be plotted. |
factorsAndValues |
A list containing the factor and the values for each
factor to be plotted. The order of the factors will modify how the
conditions are grouped in the plot. |
targets |
A data frame containing sample, bam files and experimental factor columns |
main |
Main title of the plot. If |
colors |
A vector of character colors for lines and bar plots. |
panelTitleColors |
A vector of character colors for the titles of each plot panel. |
panelTitleCex |
Character size expansion for panel titles. |
innerMargins |
A numerical vector of the form c(bottom, left, top, right) which gives the size of each plot panel margins. Defaults to c( 2.1, 3.1, 1.1, 1.1 ) |
outerMargins |
A numerical vector of the form c(bottom, left, top, right) which gives the size of margins. Defaults to c( 0, 0, 2.4, 0 ) |
useBarplots |
A logical value that indicates the type of plot to be
used. If |
barWidth |
The width of the bars in bar plots. |
barSpacer |
Fraction of |
las.x |
Text orientation of x-axis labels. |
useHCColors |
A logical value. If |
are not used, instead panel title are automatically chosen to have high
contrast against colors
.
legendAtSide |
A logical value that forces panel title to be shown on the y-axis, instead of over the plot. |
outfolder |
Path to output folder to write plot images. Is |
outfileType |
File format of the output files used if |
deviceOpt |
A list of named options to be passed to the graphic device
selected in |
Returns a png for each selected bin
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam", # "B_C_0.bam", "B_C_1.bam", "B_C_2.bam", # "B_D_0.bam", "B_D_1.bam", "B_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:12)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'A','A','A','A','A','A','B','B','B','B','B','B'), # factor2 = c( 'C','C','C','D','D','D','C','C','C','D','D','D') ) # Load reads from bam files #bams <- loadBAM( targets ) # Read counts from bam files #counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # Calculate differential usage of genes, bins and junctions #du <- DUreport.norm( counts, targets , contrast = c(1,-1,-1,1)) # Calculate PSI / PIR for bins and junction. #as <- AsDiscover( counts, targets, features, bams, readLength = 100, # threshold = 5, cores = 1 ) # Plot bin data. Factor2 is the main factor for graphic representation in # this example as it is the first in factorsAndValues argument. # This makes a bar plot comparing four conditions, grouped by factor1. #plotBins( counts, as, 'GENE03:E002', # factorsAndValues = list( # factor2 = c('C','D'), # factor1 = c('A','B') ), # las.x = 1, # legendAtSide = TRUE, # useHCColors = TRUE, # targets = targets, # barWidth = 0.95, # innerMargins = c( 2.1, 4.1, 1.1, 1.1 ) ) # Redefine targets #targets <- data.frame( # row.names = paste0('Sample_',c(1:12)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'A','A','B','B','C','C','D','D','E','E','F','F') ) #as <- AsDiscover( counts, targets, features, bams, readLength = 100, # threshold = 5, cores = 1 ) # This makes a line plot for six conditions, grouped by factor1. #plotBins( counts, as, 'GENE03:E002', # factorsAndValues = list( # factor1 = c('A','B','C','D','E','F') ), # las.x = 1, # legendAtSide = FALSE, # targets = targets, # innerMargins = c( 2.1, 4.1, 1.1, 1.1 ) )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam", # "B_C_0.bam", "B_C_1.bam", "B_C_2.bam", # "B_D_0.bam", "B_D_1.bam", "B_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:12)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'A','A','A','A','A','A','B','B','B','B','B','B'), # factor2 = c( 'C','C','C','D','D','D','C','C','C','D','D','D') ) # Load reads from bam files #bams <- loadBAM( targets ) # Read counts from bam files #counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # Calculate differential usage of genes, bins and junctions #du <- DUreport.norm( counts, targets , contrast = c(1,-1,-1,1)) # Calculate PSI / PIR for bins and junction. #as <- AsDiscover( counts, targets, features, bams, readLength = 100, # threshold = 5, cores = 1 ) # Plot bin data. Factor2 is the main factor for graphic representation in # this example as it is the first in factorsAndValues argument. # This makes a bar plot comparing four conditions, grouped by factor1. #plotBins( counts, as, 'GENE03:E002', # factorsAndValues = list( # factor2 = c('C','D'), # factor1 = c('A','B') ), # las.x = 1, # legendAtSide = TRUE, # useHCColors = TRUE, # targets = targets, # barWidth = 0.95, # innerMargins = c( 2.1, 4.1, 1.1, 1.1 ) ) # Redefine targets #targets <- data.frame( # row.names = paste0('Sample_',c(1:12)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'A','A','B','B','C','C','D','D','E','E','F','F') ) #as <- AsDiscover( counts, targets, features, bams, readLength = 100, # threshold = 5, cores = 1 ) # This makes a line plot for six conditions, grouped by factor1. #plotBins( counts, as, 'GENE03:E002', # factorsAndValues = list( # factor1 = c('A','B','C','D','E','F') ), # las.x = 1, # legendAtSide = FALSE, # targets = targets, # innerMargins = c( 2.1, 4.1, 1.1, 1.1 ) )
Graphic representation of coverage and junctions is useful to complement the results of differential usage of bins and junction and differential expression analysis.
Function plotGenomicRegions allow to create plots for multiple conditions for bins and genes. Each individual plot can only correspond to a single gene or bin, but can contain many panels for different experimental conditions.
plotGenomicRegions( features, x, genomeTxDb, targets, xIsBin = TRUE, layout = 'auto', colors = 'auto', plotTitles = 'auto', sashimi = FALSE, zoomOnBins = FALSE, deviceOpt = NULL, highLightBin = TRUE, outfolder = NULL, outfileType = 'png', mainFontSize = 24, annotationHeight = 0.2, annotationCol = 'black', annotationFill = 'gray', annotationColTitle = 'black', preMergedBAMs = NULL, useTransparency = FALSE, tempFolder = 'tmp', avoidReMergeBams= FALSE, verbose = TRUE )
plotGenomicRegions( features, x, genomeTxDb, targets, xIsBin = TRUE, layout = 'auto', colors = 'auto', plotTitles = 'auto', sashimi = FALSE, zoomOnBins = FALSE, deviceOpt = NULL, highLightBin = TRUE, outfolder = NULL, outfileType = 'png', mainFontSize = 24, annotationHeight = 0.2, annotationCol = 'black', annotationFill = 'gray', annotationColTitle = 'black', preMergedBAMs = NULL, useTransparency = FALSE, tempFolder = 'tmp', avoidReMergeBams= FALSE, verbose = TRUE )
features |
An ASpliFeatures object, generated with |
x |
A character vector with the names of bins or genes to plot. To plot into a window is recommended that the length of x be one. |
genomeTxDb |
A TxDb object with the annotation of reference genome. |
targets |
A data frame containing sample, bam files and experimental factor columns |
xIsBin |
A logical value that indicates if values in x corresponds to gene names or bin names. |
layout |
A character with value 'auto' or a character matrix with condition names arranged with the desired layout of the panels in plots. The dimensions of layout matrix, colors matrix and plotTitles matrices must be the same. Matrix can have NA values, however, the height of the panels corresponding to that column are modified to occupy the total height. The default value is 'auto'. |
colors |
A character containing value 'auto' or containing colors strings, or a character matrix with color names arranged with the layout specified in layout argument. The dimensions of layout matrix, colors matrix and plotTitles matrices must be the same. The default value is 'auto'. |
plotTitles |
A character containing value 'auto', or a character matrix with titles for each panel arranged with the layout specified in layout argument. The dimensions of layout matrix, colors matrix and plotTitles matrices must be the same. The default value is 'auto'. |
sashimi |
A logical value that specifies that a sashimi plot for junctions must be included into each panel. The default value is FALSE. |
zoomOnBins |
A FALSE logical value or a double value between 0 and 1. If value is FALSE then the genomic range to be plotted correspond to the complete gene, otherwise the genomic range is that the size of the bin corresponds to a fraction equals to zoomOnBins value of the total and is centered in the bin. Is used only when xIsBin is TRUE. The default value is FALSE. |
deviceOpt |
A named list of arguments to be passed to the graphic device used to plot. This allow to further customization of the plot. The default value is an empty list. |
highLightBin |
A logical value that indicates if the bin should be highlighted. The default value is TRUE. |
outfolder |
NULL or a character vector representing a folder path that will be used to save the plot images. If the folder doesn't exists it is created. If NULL, the plot is made in a window. The default value is NULL. For each bin, a single image file is generated. The name of the file is the name of the bin, added with a '.gr.' string, and the file extension at the end. If the name of the bin contains invalid character for a file name, those will be replaced by an underscore character. |
outfileType |
A character value the specifies the file format of the plot to be created. Is used only when outfolder is not NULL. Accepted values are 'png', 'jpeg', 'bmp', 'tiff', 'pdf'. Each value is the graphic device used to create the image. The default value is 'png'. |
mainFontSize |
A numeric value specifying the size of the main title. The default value is 24. |
annotationHeight |
A double value specifying the proportion of the total height used to represent the gene model. The default value is 0.2. |
annotationCol |
A character value that specifies the color of the borders of bars in gene model representation. The default value is 'black'. |
annotationFill |
A character value that specifies the color of the filling of bars in gene model representation. The default value is 'gray'. |
annotationColTitle |
A character value that specifies the color of text in gene model representation. The default value is 'black'. |
preMergedBAMs |
A one column data frame that associates a condition, specified in the row name, with a character value representing the path of bam file with the reads for that condition. This bam file is typically generated by merging the bam files of all replicates for that condition. The default value is NULL, this specifies that not merged bam files are used, instead on-the-fly read extraction and merging is done from the bam files specified in the targets argument. |
useTransparency |
A logical value that specifies if transparency will be used to generate the plots, this leads to better looking plot, however not all graphic devices support transparency. The default value is FALSE. |
tempFolder |
A character value specifying the path to store intermediate files produced while extracting and merging reads from bam files. It is only used when preMergedBAMs arguments is NULL. The files created are not automatically removed after plotting because can be reused to create a new plot of the same genes or bins with different graphic options. The default value is 'tmp', that means a new 'tmp' folder in the current working folder. |
avoidReMergeBams |
A logical value specifying that extraction and merging of bam files will be avoided. This is only meaningful when the extraction and merging of the same set of genes and bins was done in the previous execution of plotGenomicRegions function. The default value is FALSE. |
verbose |
A logical value specifying that detailed information about the execution will be informed to the user. The default value is TRUE. |
Returns a png for each selected bin
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Devices
,
pdf
,
png
,
bmp
,
jpeg
,
tiff
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D'), # stringsAsFactors = FALSE ) # Plot a single bin to a window #plotGenomicRegions( # features, # 'GENE01:E002', # genomeTxDb, # targets, # sashimi = FALSE, # colors = '#AA4444', # annotationHeight = 0.1, # tempFolder = 'tmp', # verbose = TRUE , # avoidReMergeBams = FALSE, # useTransparency = FALSE ) # # plot two bins to pdf files. #plotGenomicRegions( # features, c( 'GENE01:E002', 'GENE02:E002' ), # genomeTxDb, # targets, # layout = matrix( c( 'C', 'D'), ncol = 1), # colors = matrix( c( '#663243', '#363273'), ncol = 1), # plotTitles = matrix( c( 'C condition', 'D condition'), ncol = 1), # sashimi = FALSE, # mainFontSize = 12, # annotationHeight = 0.1, # tempFolder = 'tmp', # verbose = TRUE , # avoidReMergeBams = FALSE, # useTransparency = TRUE, # outfolder = '.', # outfileType = 'pdf', # deviceOpt = list( height = 6, width = 5, paper = 'a4r' ) )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D'), # stringsAsFactors = FALSE ) # Plot a single bin to a window #plotGenomicRegions( # features, # 'GENE01:E002', # genomeTxDb, # targets, # sashimi = FALSE, # colors = '#AA4444', # annotationHeight = 0.1, # tempFolder = 'tmp', # verbose = TRUE , # avoidReMergeBams = FALSE, # useTransparency = FALSE ) # # plot two bins to pdf files. #plotGenomicRegions( # features, c( 'GENE01:E002', 'GENE02:E002' ), # genomeTxDb, # targets, # layout = matrix( c( 'C', 'D'), ncol = 1), # colors = matrix( c( '#663243', '#363273'), ncol = 1), # plotTitles = matrix( c( 'C condition', 'D condition'), ncol = 1), # sashimi = FALSE, # mainFontSize = 12, # annotationHeight = 0.1, # tempFolder = 'tmp', # verbose = TRUE , # avoidReMergeBams = FALSE, # useTransparency = TRUE, # outfolder = '.', # outfileType = 'pdf', # deviceOpt = list( height = 6, width = 5, paper = 'a4r' ) )
Read density of gene and bins is the quotient between the number of read
counts and the length of the feature. The results are appended into an
ASpliCounts object that must be given as argument. The explicit calculation
of read densities is usually not required because is automatically performed
by readCounts
function.
rds( counts, targets )
rds( counts, targets )
counts |
An ASpliCounts object |
targets |
A data frame containing sample, bam and experimental factors columns |
An ASpliCounts object containing read densities of genes and bins.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # ## Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # # # Define bam files, sample names and experimental factors for targets. # bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) # targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # # # Load reads from bam files # bams <- loadBAM( targets ) # # # Read counts from bam files # counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # # # Calculates read densities # counts <- rds( counts, targets )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # ## Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # # # Define bam files, sample names and experimental factors for targets. # bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) # targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # # # Load reads from bam files # bams <- loadBAM( targets ) # # # Read counts from bam files # counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # # # Calculates read densities # counts <- rds( counts, targets )
Display a summary of data contained in ASpliObjects
Display a summary of data contained in ASpliObjects
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
This function integrates bin and junction usage in a comprehensive report
splicingReport(bdu, jdu, counts)
splicingReport(bdu, jdu, counts)
bdu |
An object of class |
jdu |
An object of class |
counts |
An object of class |
An ASpliSplicingReport
object with junction differential usage report. See vignette for more details
Andres Rabinovich, Estefania Mancini, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
Accesors: binbased
,
localebased
,
anchorbased
,
Export: writeSplicingReport
gbDUreport
, jDUreport
, ASpliSplicingReport
, splicingReport
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 controlling for paired subject gbPaired <- gbDUreport(gbcounts, formula = formula(~subject+factor1)) jPaired <- jDUreport(jcounts, formula = formula(~subject+factor1)) # Generate a splicing report merging bins and junctions DU sr <- splicingReport(gbPaired, jPaired, gbcounts) # Access splicing report elements sr localebased(sr) anchorbased(sr) binbased(sr)
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. library(GenomicFeatures) genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', package="ASpli") ) # Create an ASpliFeatures object from TxDb features <- binGenome( genomeTxDb ) # Define bam files, sample names and experimental factors for targets. bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) targets <- data.frame( row.names = paste0('Sample_',c(1:6)), bam = system.file( 'extdata', bamFileNames, package="ASpli" ), factor1 = c( 'C','C','C','D','D','D'), subject = c(0, 1, 2, 0, 1, 2)) # Read counts from bam files gbcounts <- gbCounts( features = features, targets = targets, minReadLength = 100, maxISize = 50000, libType="SE", strandMode=0) jcounts <- jCounts(counts = gbcounts, features = features, minReadLength = 100, libType="SE", strandMode=0) # Test for factor1 controlling for paired subject gbPaired <- gbDUreport(gbcounts, formula = formula(~subject+factor1)) jPaired <- jDUreport(jcounts, formula = formula(~subject+factor1)) # Generate a splicing report merging bins and junctions DU sr <- splicingReport(gbPaired, jPaired, gbcounts) # Access splicing report elements sr localebased(sr) anchorbased(sr) binbased(sr)
Accessors for ASpliSplicingReport object
binbased( x ) localebased( x ) anchorbased( x )
binbased( x ) localebased( x ) anchorbased( x )
x |
An ASpliSplicingReport object |
Returns dataframes
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
ASpli provides utility functions to easy subset ASpliCounts objects, ASpliAS objects, targets data frame and lists GAlignments generated with loadBAM function. The subset can be done selecting some of the experimental conditions or samples names ( but not both ).
subset( x, ... ) subsetBams( x, targets, select ) subsetTargets( targets, select, removeRedundantExpFactors )
subset( x, ... ) subsetBams( x, targets, select ) subsetTargets( targets, select, removeRedundantExpFactors )
x |
An ASpliCount or ASpliAS object for subset function, or list of GAlignments for subsetrBams function. |
targets |
A dataframe containing sample, bam and experimental factor columns. |
select |
A character vector specifying the conditions or samples to be kept after subset operation. It's assumed that condition names are different from sample names. |
removeRedundantExpFactors |
When sub-setting the targets data frame, one or more experimental factors can have only one value. If this argument is TRUE those experimental factors are absent in the resulting target data frame. |
... |
Subsetting ASpliCounts and ASpliAS objects sub subset method requires a targets argument and a select argument with the same specifications that the arguments with the same name in subsetBams and subsetTargets functions |
A data frame similar to x ( or targets for subsetTargets) with only the containing only the selected elements.
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # # Load reads from bam files #bams <- loadBAM( targets ) # # Read counts from bam files #counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # # # Create ASpliAS object #as <- AsDiscover( counts, targets, features, bams, readLength = 100, # threshold = 5, cores = 1 ) # # Define selection #select <- c('Sample_1', 'Sample_2', 'Sample_4', 'Sample_5') # # Subset target #targets2 <- subsetTargets( targets, select ) # Subset bams #bams2 <- subsetBams( bams, targets, select ) # Subset ASpliCounts object #counts2 <- subset( counts, targets, select ) # Subset ASpliAS object #as2 <- subset( as, targets, select )
# Create a transcript DB from gff/gtf annotation file. # Warnings in this examples can be ignored. #library(GenomicFeatures) #genomeTxDb <- txdbmaker::makeTxDbFromGFF( system.file('extdata','genes.mini.gtf', # package="ASpli") ) # # Create an ASpliFeatures object from TxDb #features <- binGenome( genomeTxDb ) # # Define bam files, sample names and experimental factors for targets. #bamFileNames <- c( "A_C_0.bam", "A_C_1.bam", "A_C_2.bam", # "A_D_0.bam", "A_D_1.bam", "A_D_2.bam" ) #targets <- data.frame( # row.names = paste0('Sample_',c(1:6)), # bam = system.file( 'extdata', bamFileNames, package="ASpli" ), # factor1 = c( 'C','C','C','D','D','D') ) # # Load reads from bam files #bams <- loadBAM( targets ) # # Read counts from bam files #counts <- readCounts( features, bams, targets, cores = 1, readLength = 100, # maxISize = 50000 ) # # # Create ASpliAS object #as <- AsDiscover( counts, targets, features, bams, readLength = 100, # threshold = 5, cores = 1 ) # # Define selection #select <- c('Sample_1', 'Sample_2', 'Sample_4', 'Sample_5') # # Subset target #targets2 <- subsetTargets( targets, select ) # Subset bams #bams2 <- subsetBams( bams, targets, select ) # Subset ASpliCounts object #counts2 <- subset( counts, targets, select ) # Subset ASpliAS object #as2 <- subset( as, targets, select )
Export tab delimited files in structured output
writeCounts(counts, output.dir="counts") writeRds(counts, output.dir="rds") writeDU(du, output.dir="du") writeAS(as, output.dir="as") writeJDU(jdu, output.dir="jdu") writeSplicingReport(sr, output.dir="sr") writeAll(counts, du, as, output.dir="output")
writeCounts(counts, output.dir="counts") writeRds(counts, output.dir="rds") writeDU(du, output.dir="du") writeAS(as, output.dir="as") writeJDU(jdu, output.dir="jdu") writeSplicingReport(sr, output.dir="sr") writeAll(counts, du, as, output.dir="output")
counts |
An ASpliCounts object |
as |
An ASpliAS object |
du |
An ASpliDU object |
jdu |
An ASpliJDU object |
sr |
An ASpliSplicingReport object |
output.dir |
Name of output folder (new or existing) |
Tab delimited files are exported in a tidy manner into output folder
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
jCounts
,
binGenome
,
DUreport.norm
,
DUreport.offset
Export tab delimited files in structured output
Tab delimited files are exported in a tidy manner into output folder
Estefania Mancini, Andres Rabinovich, Javier Iserte, Marcelo Yanovsky, Ariel Chernomoretz
jCounts, binGenome,DUreport.norm, DUreport.offset