Title: | An effective identification of alternative splicing events using junction arrays and RNA-Seq data |
---|---|
Description: | EventPointer is an R package to identify alternative splicing events in either simple (case-control) or complex experimental designs, such as time course experiments and studies including paired samples. The algorithm can be used to analyze data from either junction arrays (Affymetrix Arrays) or sequencing data (RNA-Seq). The software returns a data.frame with the detected alternative splicing events: gene name, type of event (cassette, alternative 3', etc.), genomic position, statistical significance and increment of the percent spliced in (Delta PSI) for all the events. The algorithm can generate a series of files to visualize the detected alternative splicing events in IGV. This eases the interpretation of results and the design of primers for standard PCR validation. |
Authors: | Juan Pablo Romero [aut], Juan A. Ferrer-Bonsoms [aut, cre], Pablo Sacristan [aut], Ander Muniategui [aut], Fernando Carazo [aut], Ander Aramburu [aut], Angel Rubio [aut] |
Maintainer: | Juan A. Ferrer-Bonsoms <[email protected]> |
License: | Artistic-2.0 |
Version: | 3.15.0 |
Built: | 2024-12-29 06:40:36 UTC |
Source: | https://github.com/bioc/EventPointer |
Alternative splicing events detected by EventPointer
data(AllEvents_RNASeq)
A list object. AllEvents_RNASeq[[i]][[j]] displays the jth splicing event for the ith gene.
The AllEvents_RNASeq object contains all the alternative splicing events detected using the EventPointer methodology. The splicing events were detected using the BAM files from the dataset published in Seshagiri et al. 2012 and used in the SGSeq R package vignette.
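The nested `[[i]][[j]]` indexing described above can be sketched with a small stand-in list. The gene names and the `Type` field below are hypothetical and only mimic the gene-then-event nesting; they are not the real contents of AllEvents_RNASeq.

```r
# Mock stand-in for AllEvents_RNASeq: a list of genes, each holding a list of events.
# Gene names and the 'Type' field are hypothetical placeholders.
mock_events <- list(
  geneA = list(list(Type = "Cassette Exon"),
               list(Type = "Alternative 3' Splice Site")),
  geneB = list(list(Type = "Retained Intron"))
)

# jth event of the ith gene
i <- 1
j <- 2
second_event_of_first_gene <- mock_events[[i]][[j]]

# Number of events per gene, and in total
events_per_gene <- lengths(mock_events)
total_events <- sum(events_per_gene)
```

The same `lengths()` call should work on the real object after `data(AllEvents_RNASeq)`, assuming it keeps this list-of-lists layout.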
Alternative splicing multi-path events detected by EventPointer
data(AllEvents_RNASeq_MP)
A list object. AllEvents_RNASeq_MP[[i]][[j]] displays the jth splicing event for the ith gene.
The AllEvents_RNASeq_MP object contains all the alternative splicing events detected using the EventPointer methodology for multi-path events. The splicing events were detected using the BAM files from the dataset published in Seshagiri et al. 2012 and used in the SGSeq R package vignette.
Preprocessed arrays data with multi-path events
data(ArrayDatamultipath)
A data.frame with preprocessed arrays data. The preprocessing was done using aroma.affymetrix. See the package vignette for the preprocessing pipeline.
The ArrayDatamultipath object contains preprocessed junction arrays data. The preprocessing was done using the aroma.affymetrix R package; refer to the EventPointer vignette for the pipeline used. The data correspond to 4 samples from the SUM149 cell line hybridized to the HTA 2.0 Affymetrix array. The first two samples are controls and the last two are treated.
Preprocessed arrays data
data(ArraysData)
A data.frame with preprocessed arrays data. The preprocessing was done using aroma.affymetrix. See the package vignette for the preprocessing pipeline.
The ArraysData object contains preprocessed junction arrays data. The preprocessing was done using the aroma.affymetrix R package; refer to the EventPointer vignette for the pipeline used. The data correspond to 4 samples from the SUM149 cell line hybridized to the HTA 2.0 Affymetrix array. The first two samples are controls and the last two are treated.
Generates the CDF file to be used under the aroma.affymetrix framework
CDFfromGTF( input = "Ensembl", inputFile = NULL, PSR, Junc, PathCDF, microarray = NULL )
input |
Reference transcriptome used to build the CDF file. Must be one of: 'Ensembl', 'UCSC' , 'AffyGTF' or 'CustomGTF'. |
inputFile |
If input is 'AffyGTF' or 'CustomGTF', inputFile should point to the GTF file to be used. |
PSR |
Path to the Exon probes txt file |
Junc |
Path to the Junction probes txt file |
PathCDF |
Directory where the output will be saved |
microarray |
Microarray used to create the CDF file. Must be one of: HTA-2_0, ClariomD, RTA or MTA |
The function displays a progress bar to show its progress. No value is returned in R; instead, the function creates three files that are used later by other EventPointer functions:
1) EventsFound.txt: Tab-separated file with the information of all the alternative splicing events found.
2) .flat file: Used to build the corresponding CDF file.
3) .CDF file: Output required for the aroma.affymetrix preprocessing pipeline.
Both the .flat and .CDF files take up a large amount of disk space; it is recommended to have at least 1.5 GB free.
## Not run:
PathFiles <- system.file('extdata', package='EventPointer')
DONSON_GTF <- paste(PathFiles, '/DONSON.gtf', sep='')
PSRProbes <- paste(PathFiles, '/PSR_Probes.txt', sep='')
JunctionProbes <- paste(PathFiles, '/Junction_Probes.txt', sep='')
Directory <- tempdir()
microarray <- 'HTA-2_0'
# Run the function
CDFfromGTF(input='AffyGTF', inputFile=DONSON_GTF, PSR=PSRProbes, Junc=JunctionProbes,
           PathCDF=Directory, microarray=microarray)
## End(Not run)
Generates the CDF file to be used under the aroma.affymetrix framework.
CDFfromGTF_Multipath( input = "Ensembl", inputFile = NULL, PSR, Junc, PathCDF, microarray = NULL, paths = 2 )
input |
Reference transcriptome used to build the CDF file. Must be one of Ensembl, UCSC or GTF. |
inputFile |
If input is GTF, inputFile should point to the GTF file to be used. |
PSR |
Path to the Exon probes txt file |
Junc |
Path to the Junction probes txt file |
PathCDF |
Directory where the output will be saved |
microarray |
Microarray used to create the CDF file. Must be one of: HTA-2_0, ClariomD, RTA or MTA |
paths |
Maximum number of paths of the events to find. |
The function displays a progress bar to show its progress. No value is returned in R; instead, the function creates three files that are used later by other EventPointer functions:
1) EventsFound.txt: Tab-separated file with the information of all the alternative splicing events found.
2) .flat file: Used to build the corresponding CDF file.
3) .CDF file: Output required for the aroma.affymetrix preprocessing pipeline.
Both the .flat and .CDF files take up a large amount of disk space; it is recommended to have at least 1.5 GB free.
## Not run:
PathFiles <- system.file('extdata', package='EventPointer')
DONSON_GTF <- paste(PathFiles, '/DONSON.gtf', sep='')
PSRProbes <- paste(PathFiles, '/PSR_Probes.txt', sep='')
JunctionProbes <- paste(PathFiles, '/Junction_Probes.txt', sep='')
Directory <- tempdir()
microarray <- 'HTA-2_0'
# Run the function
CDFfromGTF_Multipath(input='AffyGTF', inputFile=DONSON_GTF, PSR=PSRProbes, Junc=JunctionProbes,
                     PathCDF=Directory, microarray=microarray, paths=3)
## End(Not run)
Generates the Events x RBP matrix for the splicing factor enrichment analysis.
CreateExSmatrix( pathtoeventstable, SG_List, nt = 400, Peaks, POSTAR, EventsRegions = NULL, cores = 1 )
pathtoeventstable |
Path to eventsFound.txt with the information of all the events |
SG_List |
List with the information of the splicing graph of the genes. Returned by the function EventDetection_transcriptome |
nt |
Number of nucleotides (nt) upstream and downstream that define the splicing regions of each event |
Peaks |
Table with the peaks |
POSTAR |
Table with peaks of POSTAR |
EventsRegions |
Events regions, if calculated previously, so that they do not need to be calculated again. |
cores |
Number of cores to use if the user wants to run in parallel. |
The function returns a list with the ExS matrix and the splicing regions of the events. If the splicing regions are given as an input to the function, only the ExS matrix is returned. The ExS matrix is the input for the splicing factor enrichment analysis.
Identification of all the alternative splicing events in the splicing graphs
EventDetection(Input, cores, Path)
Input |
Output of the PrepareBam_EP function |
cores |
Number of cores used for parallel processing |
Path |
Directory where to write the EventsFound_RNASeq.txt file |
A list with all the events found for all the genes present in the experiment. It also generates a file called EventsFound_RNASeq.txt with the information of each event.
## Not run:
# Run EventDetection function
data(SG_RNASeq)
TxtPath <- tempdir()
AllEvents_RNASeq <- EventDetection(SG_RNASeq, cores=1, Path=TxtPath)
## End(Not run)
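Several functions in this manual take a `cores` argument for parallel processing. One way to pick a value is sketched here with the base `parallel` package; leaving one core free is a common convention and an assumption of this sketch, not an EventPointer recommendation.

```r
library(parallel)

# detectCores() may return NA when the number of cores cannot be determined
available <- detectCores()

# Leave one core free for the rest of the system; never go below 1
cores <- if (is.na(available)) 1L else max(1L, available - 1L)
```

The resulting `cores` value can then be passed to functions such as EventDetection(Input, cores, Path).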
Finds all the possible alternative splicing (AS) events given a reference transcriptome. This function uses foreach for parallel processing; the user must set the value of cores (by default equal to one). It creates a .txt file with the relevant information of all the AS events found and returns a list with the main information of the splicing graph of each event. This list is used as input to downstream functions (GetPSI_FromTranRef, FindPrimers, and EventPointer_RNASeq_TranRef_IGV).
EventDetection_transcriptome( inputFile = NULL, Transcriptome = NULL, Pathtxt = NULL, cores = 1 )
inputFile |
Path to the GTF file of the reference transcriptome. |
Transcriptome |
Name of the transcriptome |
Pathtxt |
Directory to save the .txt of the events found |
cores |
Number of cores used in the parallel processing (by default = 1) |
A list is returned with the following information:
ExTP1 a sparse matrix of Events x Transcripts that indicates which isoforms build up path1 of each event.
ExTP2 a sparse matrix of Events x Transcripts that indicates which isoforms build up path2 of each event.
ExTPRef a sparse matrix of Events x Transcripts that indicates which isoforms build up pathRef of each event.
transcritnames a vector with the annotation names of the isoforms.
SG_List a list containing the information of the splicing graph of each gene.
## Not run:
PathFiles <- system.file("extdata", package="EventPointer")
inputFile <- paste(PathFiles, "/gencode.v24.ann_2genes.gtf", sep="")
Transcriptome <- "Gencode24_2genes"
Pathtxt <- tempdir()
# Run the function
EventXtrans <- EventDetection_transcriptome(inputFile = inputFile, Transcriptome = Transcriptome,
                                            Pathtxt=Pathtxt, cores=1)
## End(Not run)
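The returned Events x Transcripts sparse matrices can be queried like ordinary matrices. The following is a minimal sketch on a mock matrix built with the Matrix package; the event and transcript names are hypothetical, and the real ExTP1 may use different row and column labels.

```r
library(Matrix)

# Mock Events x Transcripts matrix: a 1 means the isoform builds up path1 of the event.
# Row and column names are hypothetical placeholders.
ExTP1_mock <- sparseMatrix(
  i = c(1, 1, 2),
  j = c(1, 3, 2),
  x = 1,
  dims = c(2, 3),
  dimnames = list(c("event_1", "event_2"), c("tx_A", "tx_B", "tx_C"))
)

# Isoforms that build up path1 of the first event
path1_isoforms <- colnames(ExTP1_mock)[ExTP1_mock["event_1", ] != 0]

# Number of events in which each isoform takes part in path1
events_per_isoform <- colSums(ExTP1_mock)
```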
Identification of all the multipath alternative splicing events in the splicing graphs
EventDetectionMultipath(Input, cores, Path, paths = 2)
Input |
Output of the PrepareBam_EP function |
cores |
Number of cores used for parallel processing |
Path |
Directory where to write the EventsFound_RNASeq.txt file |
paths |
Maximum number of paths of the events to find. |
A list with all the events found for all the genes present in the experiment. It also generates a file called EventsFound_RNASeq.txt with the information of each event.
## Not run:
# Run EventDetectionMultipath function
data(SG_RNASeq)
TxtPath <- tempdir()
AllEvents_RNASeq_MP <- EventDetectionMultipath(SG_RNASeq, cores=1, Path=TxtPath, paths=3)
## End(Not run)
Statistical analysis of alternative splicing events
EventPointer( Design, Contrast, ExFit, Eventstxt, Filter = TRUE, Qn = 0.25, Statistic = "LogFC", PSI = FALSE )
Design |
The design matrix for the experiment. |
Contrast |
The contrast matrix for the experiment. |
ExFit |
aroma.affymetrix pre-processed variable after using
|
Eventstxt |
Path to the EventsFound.txt file generated by CDFfromGTF function. |
Filter |
Boolean variable to indicate if an expression filter is applied |
Qn |
Quantile used to filter the events (Bounded between 0-1, Q1 would be 0.25). |
Statistic |
Statistical test to identify differential splicing events, must be one of : LogFC, Dif_LogFC or DRS. |
PSI |
Boolean variable to indicate if Delta PSI should be calculated for every splicing event. |
Data.frame ordered by the splicing p.value. The object contains information for each splicing event, such as gene name, event type, genomic position, p.value, z.value and delta PSI.
data(ArraysData)
Dmatrix <- matrix(c(1,1,1,1,0,0,1,1), nrow=4, ncol=2, byrow=FALSE)
Cmatrix <- t(t(c(0,1)))
EventsFound <- paste(system.file('extdata', package='EventPointer'), '/EventsFound.txt', sep='')
Events <- EventPointer(Design=Dmatrix, Contrast=Cmatrix, ExFit=ArraysData, Eventstxt=EventsFound,
                       Filter=TRUE, Qn=0.25, Statistic='LogFC', PSI=TRUE)
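The hand-written design matrix in the example (an intercept column plus a 0/1 column marking the treated samples) can also be derived from a group factor. This sketch uses base R's `model.matrix`; the group labels are assumptions that follow the description of ArraysData (two controls followed by two treated samples).

```r
# Sample groups in the order of the ArraysData columns (assumed: 2 control, 2 treated)
group <- factor(c("control", "control", "treated", "treated"),
                levels = c("control", "treated"))

# Intercept + treatment indicator; strip attributes to get a plain numeric matrix
Dmatrix_formula <- model.matrix(~ group)
Dmatrix <- matrix(as.vector(Dmatrix_formula), nrow = nrow(Dmatrix_formula))

# Same values as the matrix typed by hand in the example
Dmatrix_manual <- matrix(c(1,1,1,1,0,0,1,1), nrow = 4, ncol = 2, byrow = FALSE)
same_values <- all(Dmatrix == Dmatrix_manual)

# Contrast selecting the second coefficient (treated vs control)
Cmatrix <- matrix(c(0, 1), nrow = 2)
```

Building the design from a factor scales more easily to larger experiments than typing the entries by hand.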
Statistical analysis of alternative splicing events with bootstrap technique.
EventPointer_Bootstraps( PSI, Design, Contrast, cores = 1, ram = 0.1, nBootstraps = 10000, UsePseudoAligBootstrap = TRUE, Threshold = 0 )
PSI |
Array or matrix that contains the values of PSI calculated in the function GetPSI_FromTranRef. If the bootstrap option was selected in GetPSI_FromTranRef, input must be an array. If not, input must be a matrix |
Design |
Design matrix |
Contrast |
Contrast matrix |
cores |
Number of cores to use. |
ram |
Amount of RAM to use, in GB. |
nBootstraps |
Number of layers, bootstraps or samplings to use. Caution: high numbers increase computational time. |
UsePseudoAligBootstrap |
TRUE (default) to use the bootstrap data from the pseudoalignment, FALSE otherwise. |
Threshold |
Threshold used to compute the p-values. Default = 0. |
A list containing the summary of the Bootstrap analysis: DeltaPSI, Pvalues, FDR. This info can be obtained in a simple table with the function ResulTable.
data(PSIss)
PSI <- PSIss$PSI
Dmatrix <- cbind(1, rep(c(0,1), each=2))
Cmatrix <- matrix(c(0,1), nrow=2)
Fit <- EventPointer_Bootstraps(PSI = PSI, Design = Dmatrix, Contrast = Cmatrix, cores = 1,
                               ram = 1, nBootstraps = 10, UsePseudoAligBootstrap = TRUE)
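The summary returned by the bootstrap analysis reports both p-values and FDR. As a rough illustration of how the two relate, the sketch below adjusts a vector of hypothetical p-values with the Benjamini-Hochberg procedure from base R. Whether EventPointer uses this exact procedure is not stated here, so treat the method choice and the 0.1 cutoff as assumptions.

```r
# Hypothetical p-values for five events (not real EventPointer output)
pvalues <- c(0.001, 0.02, 0.04, 0.30, 0.75)

# Benjamini-Hochberg adjusted p-values (assumed method)
fdr <- p.adjust(pvalues, method = "BH")

# Events passing an illustrative FDR cutoff of 0.1
significant <- fdr < 0.1
```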
Generates files to be loaded in IGV for visualization and interpretation of events
EventPointer_IGV( Events, input, inputFile = NULL, PSR, Junc, PathGTF, EventsFile, microarray = NULL )
Events |
Data.frame generated by EventPointer with the events to be included in the GTF file. |
input |
Reference transcriptome. Must be one of: 'Ensembl', 'UCSC' , 'AffyGTF' or 'CustomGTF'. |
inputFile |
If input is 'AffyGTF' or 'CustomGTF', inputFile should point to the GTF file to be used. |
PSR |
Path to the Exon probes txt file. |
Junc |
Path to the Junction probes txt file. |
PathGTF |
Directory where to write the GTF files. |
EventsFile |
Path to EventsFound.txt file generated with CDFfromGTF function. |
microarray |
Microarray used to create the CDF file. Must be one of: HTA-2_0, ClariomD, RTA or MTA |
The function displays a progress bar to show its progress. Once the progress bar reaches 100%, the files are written to the directory specified in PathGTF. The created files are: 1) paths.gtf: GTF file representing the alternative splicing events and 2) probes.gtf: GTF file representing the probes that measure each event and each path.
## Not run:
PathFiles <- system.file('extdata', package='EventPointer')
DONSON_GTF <- paste(PathFiles, '/DONSON.gtf', sep='')
PSRProbes <- paste(PathFiles, '/PSR_Probes.txt', sep='')
JunctionProbes <- paste(PathFiles, '/Junction_Probes.txt', sep='')
Directory <- tempdir()
data(ArraysData)
Dmatrix <- matrix(c(1,1,1,1,0,0,1,1), nrow=4, ncol=2, byrow=FALSE)
Cmatrix <- t(t(c(0,1)))
EventsFound <- paste(system.file('extdata', package='EventPointer'), '/EventsFound.txt', sep='')
Events <- EventPointer(Design=Dmatrix, Contrast=Cmatrix, ExFit=ArraysData, Eventstxt=EventsFound,
                       Filter=TRUE, Qn=0.25, Statistic='LogFC', PSI=TRUE)
EventPointer_IGV(Events=Events[1,,drop=FALSE], input='AffyGTF', inputFile=DONSON_GTF,
                 PSR=PSRProbes, Junc=JunctionProbes, PathGTF=Directory,
                 EventsFile=EventsFound, microarray='HTA-2_0')
## End(Not run)
Statistical analysis of all the alternative splicing events found in the given bam files.
EventPointer_RNASeq(Events, Design, Contrast, Statistic = "LogFC", PSI = FALSE)
Events |
Output from EventDetection function |
Design |
The design matrix for the experiment. |
Contrast |
The contrast matrix for the experiment. |
Statistic |
Statistical test to identify differential splicing events, must be one of: LogFC, Dif_LogFC or DRS. |
PSI |
Boolean variable to indicate if PSI should be calculated for every splicing event. |
Data.frame ordered by the splicing p.value. The object contains information for each splicing event, such as gene name, event type, genomic position, p.value, z.value and delta PSI.
data(AllEvents_RNASeq)
Dmatrix <- matrix(c(1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1), ncol=2, byrow=FALSE)
Cmatrix <- t(t(c(0,1)))
Events <- EventPointer_RNASeq(AllEvents_RNASeq, Dmatrix, Cmatrix, Statistic='LogFC', PSI=TRUE)
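To make the Delta PSI column easier to interpret, here is a minimal sketch using the common definition of PSI as the expression of one path divided by the summed expression of both alternative paths. The numbers are invented, and EventPointer's own estimation procedure is not described in this manual and may differ from this simple ratio.

```r
# Hypothetical expression of the two alternative paths of one event in 4 samples
path1 <- c(80, 90, 30, 20)
path2 <- c(20, 10, 70, 80)
group <- c("control", "control", "treated", "treated")

# PSI per sample: fraction of the signal that goes through path1
psi <- path1 / (path1 + path2)

# Delta PSI: mean PSI in treated samples minus mean PSI in controls
delta_psi <- mean(psi[group == "treated"]) - mean(psi[group == "control"])
```

A negative `delta_psi` in this sketch means that path1 is used less in the treated samples than in the controls.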
Generates files to be loaded in IGV for visualization and interpretation of events
EventPointer_RNASeq_IGV(Events, SG_RNASeq, EventsTxt, PathGTF)
Events |
Data.frame generated by EventPointer_RNASeq with the events to be included in the GTF file. |
SG_RNASeq |
Output from PrepareBam_EP function. Contains splicing graphs components. |
EventsTxt |
Path to EventsFound.txt file generated with EventDetection function |
PathGTF |
Directory where to write the GTF files. |
The function displays a progress bar to show its progress. Once the progress bar reaches 100%, a file is written to the directory specified in PathGTF. The created file is paths_RNASeq.gtf: a GTF file representing the alternative splicing events.
## Not run:
data(AllEvents_RNASeq)
data(SG_RNASeq)
# Run EventPointer
Dmatrix <- matrix(c(1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1), ncol=2, byrow=FALSE)
Cmatrix <- t(t(c(0,1)))
Events <- EventPointer_RNASeq(AllEvents_RNASeq, Dmatrix, Cmatrix, Statistic='LogFC', PSI=TRUE)
# IGV Visualization
EventsTxt <- paste(system.file('extdata', package='EventPointer'), '/EventsFound_RNASeq.txt', sep='')
PathGTF <- tempdir()
EventPointer_RNASeq_IGV(Events, SG_RNASeq, EventsTxt, PathGTF)
## End(Not run)
Statistical analysis of alternative splicing events with the output of GetPSI_FromTranRef
EventPointer_RNASeq_TranRef( Count_Matrix, Statistic = "LogFC", Design, Contrast )
Count_Matrix |
The list containing the expression data taken from the output of GetPSI_FromTranRef |
Statistic |
The type of statistic to apply. Default = 'LogFC' (can be 'LogFC', 'Dif_LogFC' or 'DRS') |
Design |
The design matrix of the experiment. |
Contrast |
The Contrast matrix of the experiment. |
A data.frame with the names of the events, their p.values and the corresponding z.values. If there is more than one contrast, the function returns as many data.frames as there are contrasts, and all of them are stored in a single list.
## Not run:
data(EventXtrans)
data(PSIss)
# Design and contrast matrix:
Design <- matrix(c(1,1,1,1,0,0,1,1), nrow=4)
Contrast <- matrix(c(0,1), nrow=2)
# Statistical analysis:
Fit <- EventPointer_RNASeq_TranRef(Count_Matrix = PSIss$ExpEvs, Statistic = 'LogFC',
                                   Design = Design, Contrast = Contrast)
## End(Not run)
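When an experiment has more than two groups, several contrasts can be placed as columns of one contrast matrix, which is the case in which a list of data.frames is returned. The sketch below builds such matrices in base R for a hypothetical three-group design (groups A, B and C, two samples each); the group names and sample layout are assumptions for illustration only.

```r
# Hypothetical three-group experiment with two samples per group
group <- factor(rep(c("A", "B", "C"), each = 2))

# Intercept (group A) plus indicator columns for groups B and C
Design <- model.matrix(~ group)

# One column per contrast; rows follow the columns of Design
Contrast <- cbind(
  B_vs_A = c(0,  1, 0),
  C_vs_A = c(0,  0, 1),
  C_vs_B = c(0, -1, 1)
)

# Each contrast needs one weight per design coefficient
stopifnot(nrow(Contrast) == ncol(Design))
```

With this coding, the coefficient for B is the B minus A difference, so the C versus B comparison is obtained as the C coefficient minus the B coefficient.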
Generates files to be loaded in IGV for visualization and interpretation of events detected from a reference transcriptome (see EventDetection_transcriptome).
EventPointer_RNASeq_TranRef_IGV(SG_List, pathtoeventstable, PathGTF)
SG_List |
List with the Splicing Graph information of the events. This list is created by EventDetection_transcriptome function. |
pathtoeventstable |
Complete path to the table returned by EventDetection_transcriptome that contains the information of each event, or to a table with the specific events that the user wants to load into IGV. |
PathGTF |
Directory where to write the GTF files. |
The function displays a progress bar to show its progress. Once the progress bar reaches 100%, a file is written to the directory specified in PathGTF. The created file is named 'paths_RNASeq.gtf'.
###### example using all the events found in a reference transcriptome
data("EventXtrans")
SG_List <- EventXtrans$SG_List
PathEventsTxt <- system.file('extdata', package='EventPointer')
PathEventsTxt <- paste0(PathEventsTxt, "/EventsFound_Gencode24_2genes.txt")
PathGTF <- tempdir()
EventPointer_RNASeq_TranRef_IGV(SG_List = SG_List, pathtoeventstable = PathEventsTxt, PathGTF = PathGTF)
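Before loading the output into IGV it can be useful to inspect it in R. The sketch below reads a GTF file into a data.frame, assuming the standard nine-column, tab-separated GTF layout with no extra header lines. A one-line stand-in file with invented content is written first so the snippet runs on its own; with real output, point `gtf_path` at the 'paths_RNASeq.gtf' file inside PathGTF.

```r
# Stand-in output directory and file (contents are invented for illustration)
out_dir <- file.path(tempdir(), "igv_demo")
dir.create(out_dir, showWarnings = FALSE)
gtf_path <- file.path(out_dir, "paths_RNASeq.gtf")
writeLines(
  paste("chr1", "EventPointer", "exon", "100", "200", ".", "+", ".",
        'gene_id "demo"; transcript_id "demo_path1";', sep = "\t"),
  gtf_path
)

# Standard GTF column names
gtf_columns <- c("seqname", "source", "feature", "start", "end",
                 "score", "strand", "frame", "attribute")

# quote = "" keeps the double quotes inside the attribute column intact
gtf <- read.delim(gtf_path, header = FALSE, comment.char = "#", quote = "",
                  col.names = gtf_columns, stringsAsFactors = FALSE)
```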
EventPointer can detect splicing events that cannot be cataloged in any of the canonical types (Cassette Exon, Alternative 3' or 5' splice site, retained intron and mutually exclusive exon). These events are classified as "Complex Events". With this function, EventPointer reclassifies these complex events according to how similar each event is to the canonical events. The same complex event can have several types. Further, EventPointer adds a new type of event: "multiple skipping exon". These events are characterized by having several consecutive exons as alternative exons. If there is only one alternative exon, the event is a "Cassette Exon".
Events_ReClassification(EventTable, SplicingGraph)
EventTable |
Table returned by EventDetection_transcriptome. Can be easily loaded using the function read.delim as data.frame. |
SplicingGraph |
A list with the splicing graph of all the genes of a reference transcriptome. This data is returned by the function EventDetection_transcriptome. |
A data.frame containing a new column with the new classification ('EventType_new'):
# load splicing graph
data("SG_reclassify")
# load table with info of the events
PathFiles <- system.file("extdata", package="EventPointer")
inputFile <- paste(PathFiles, "/Events_found_class.txt", sep="")
EventTable <- read.delim(file=inputFile)
# this table has the information of 5 complex events.
EventTable_new <- Events_ReClassification(EventTable = EventTable, SplicingGraph = SG_reclassify)
relationship between isoforms and events
data(EventXtrans)
A list object. EventXtrans[[1]] displays the isoforms that build up path1 of each event.
The EventXtrans object contains the relationship between the isoforms and the events. It is a list of 4 elements. The first three store sparse matrices relating the isoforms to the events. The fourth element stores the names of the reference annotation used (isoform names).
FindPrimers is the main function of the primers design option. The aim of this function is the design of PCR primers and TaqMan probes for detection and quantification of alternative splicing.
Depending on the assay to be carried out, the algorithm will design the primers for a conventional PCR, or the primers and TaqMan probes for a TaqMan assay.
In the case of a conventional PCR we will be able to detect the alternative splicing event. Besides, the algorithm gives as an output the length of the PCR bands that are going to appear. In the case of a TaqMan assay, we will not only detect but also quantify alternative splicing.
FindPrimers( SG, EventNum, Primer3Path, Dir, mygenomesequence, taqman = NA, nProbes = 1, nPrimerstwo = 3, ncommonForward = 3, ncommonReverse = 3, nExons = 5, nPrimers = 15, shortdistpenalty = 2000, maxLength = 1000, minsep = 100, wminsep = 200, valuethreePenalty = 1000, minexonlength = 25, wnpaths = 200, qualityfilter = 5000 )
SG |
Information of the graph of the gene to which the selected event belongs. This information is available in the output of the EventDetection_transcriptome function. |
EventNum |
The "EventNum" variable can be found in the returned .txt file from the EventDetection_transcriptome function in the column "EventNumber" or in the output of EventPointer_RNASeq_TranRef, the number after the "_" character of the 'Event_ID'. |
Primer3Path |
Complete path where primer3_core.exe is placed. |
Dir |
Complete path where primer3web_v4_0_0_default_settings.txt file and primer3_config directory are stored. |
mygenomesequence |
genome sequence of reference |
taqman |
TRUE if you want to get probes and primers for taqman. FALSE if you want to get primers for conventional PCR. |
nProbes |
Number of probes for Taqman experiments. By default 1. |
nPrimerstwo |
Number of potential exon locations for primers using two primers (one forward and one reverse). By default 3. |
ncommonForward |
Number of potential exon locations for primers using one primer in forward and two in reverse. By default 3. |
ncommonReverse |
Number of potential exon locations for primers using two primer in forward and one in reverse. By default 3. |
nExons |
Number of combinations of ways to place primers in exons to interrogate an event after sorting. By default 5. |
nPrimers |
Once the exons are selected, number of primer sequence combinations to search within the whole set of potential sequences. By default 15. |
shortdistpenalty |
Penalty for short exons following an exponential function (A * exp(-dist * shortdistpenalty)). By default 2000. |
maxLength |
Max length of exons that are between primers and for paths once we have calculated the sequence. By default 1000. |
minsep |
Distance below which primers are penalized for being too close. By default 100. |
wminsep |
Weight of the penalization applied to primers for being too close. By default 200. |
valuethreePenalty |
Penalization for cases that need three primers instead of two. By default 1000. |
minexonlength |
Minimum length that an exon must have to be able to contain a primer. By default 25. |
wnpaths |
Penalty for each existing path. By default 200. |
qualityfilter |
Results will show at most 3 combinations with a score higher than qualityfilter. By default 5000. |
The output of the function is a 'data.frame' whose columns are:
For1Seq: Sequence of the first forward primer.
For2Seq: Sequence of the second forward primer in case it is needed.
Rev1Seq: Sequence of the first reverse primer.
Rev2Seq: Sequence of the second reverse primer in case it is needed.
For1Exon: Name of the exon of the first forward primer.
For2Exon: Name of the exon of the second forward primer in case it is needed.
Rev1Exon: Name of the exon of the first reverse primer.
Rev2Exon: Name of the exon of the second reverse primer in case it is needed.
FINALvalue: Final score for that combination of exons and sequences. The lower the score, the better the combination.
DistPath1: Distances of the bands, in base pairs, that interrogate Path1 when we perform the conventional PCR experiment.
DistPath2: Distances of the bands, in base pairs, that interrogate Path2 when we perform the conventional PCR experiment.
DistNoPath: Distances of the bands, in base pairs, that do not interrogate either of the two paths when we perform the conventional PCR experiment.
SeqProbeRef: Sequence of the TaqMan probe placed in the Reference.
SeqProbeP1: Sequence of the TaqMan probe placed in the Path1.
SeqProbeP2: Sequence of the TaqMan probe placed in the Path2.
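Since a lower FINALvalue indicates a better combination, the returned table can be ranked with base R. The sketch below uses a mock data.frame with a subset of the columns listed above; the sequences, exon names and scores are made up for illustration and are not real FindPrimers output.

```r
# Mock table with a subset of the documented columns (all values are invented).
primers <- data.frame(
  For1Seq    = c("ACGTACGTACGTACGTACGT", "TTGACCGTAGGCTAACGTCA", "GGCATCGATCGGATTACGCA"),
  Rev1Seq    = c("TGCATGCATGCATGCATGCA", "CCGATTAGGCTAACCGTTAG", "AATTCCGGAATTCCGGAATT"),
  For1Exon   = c("E2", "E2", "E3"),
  Rev1Exon   = c("E5", "E6", "E5"),
  FINALvalue = c(1520.4, 870.1, 2310.9),
  stringsAsFactors = FALSE
)

# Lower FINALvalue means a better combination, so sort in ascending order.
ranked <- primers[order(primers$FINALvalue), ]
best <- ranked[1, ]
print(best[, c("For1Seq", "Rev1Seq", "FINALvalue")])
```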
## Not run:
data("EventXtrans")
# From the output of EventsGTFfromTranscriptomeGTF we take the splicing graph information
SG_list <- EventXtrans$SG_List
# SG_list contains the information of the splicing graphs for each gene
# Let's suppose we want to design primers for event 1 of the gene ENSG00000254709.7
# We take the splicing graph information of the required gene
SG <- SG_list$ENSG00000254709.7
# We point the event number
EventNum <- 1
# Define rest of variables:
Primer3Path <- Sys.which("primer3_core")
Dir <- "C:\\PROGRA~2\\primer3\\"
MyPrimers <- FindPrimers(SG = SG,
                         EventNum = EventNum,
                         Primer3Path = Primer3Path,
                         Dir = Dir,
                         mygenomesequence = BSgenome.Hsapiens.UCSC.hg38::Hsapiens,
                         taqman = 1,
                         nProbes = 1,
                         nPrimerstwo = 4,
                         ncommonForward = 4,
                         ncommonReverse = 4,
                         nExons = 10,
                         nPrimers = 5,
                         maxLength = 1200)
## End(Not run)
Result of EventPointer_Bootstraps
data(Fit)
A list object
A list containing the summary of the Bootstrap analysis: DeltaPSI, Pvalues, FDR. This info can be obtained in a simple table with the function ResulTable.
Function to load the values of the bootstrap returned by kallisto or salmon pseudoaligners.
getbootstrapdata(PathSamples, type)
PathSamples |
A vector with the full paths to the kallisto/salmon output folders. |
type |
'kallisto' or 'salmon'. |
A list containing the quantification data with the bootstrap information.
PathSamples <- system.file("extdata", package = "EventPointer")
PathSamples <- paste0(PathSamples, "/output")
PathSamples <- dir(PathSamples, full.names = TRUE)
data_exp <- getbootstrapdata(PathSamples = PathSamples, type = "kallisto")
Get the values of PSI. An expression filter is applied if the user selects the filter option.
GetPSI_FromTranRef( Samples, PathsxTranscript, Bootstrap = FALSE, Filter = TRUE, Qn = 0.25 )
Samples |
matrix or list containing the expression of the samples. |
PathsxTranscript |
the output of EventDetection_transcriptome. |
Bootstrap |
Boolean variable to indicate if bootstrap data from pseudo-alignment is used. |
Filter |
Boolean variable to indicate if an expression filter is applied. Default TRUE. |
Qn |
Quantile used to filter the events (bounded between 0 and 1; 0.25 by default). |
The output is a list containing two elements: a matrix with the values of PSI and a list containing as many matrices as there are events. Each matrix stores the expression of the different paths of an event across the samples.
data(EventXtrans)
PathSamples <- system.file("extdata", package = "EventPointer")
PathSamples <- paste0(PathSamples, "/output")
PathSamples <- dir(PathSamples, full.names = TRUE)
data_exp <- getbootstrapdata(PathSamples = PathSamples, type = "kallisto")
# same annotation
rownames(data_exp[[1]]) <- gsub("\\|.*", "", rownames(data_exp[[1]]))
# Obtain values of PSI
PSI_List <- GetPSI_FromTranRef(PathsxTranscript = EventXtrans,
                               Samples = data_exp,
                               Bootstrap = TRUE,
                               Filter = FALSE)
PSI <- PSI_List$PSI
Expression_List <- PSI_List$ExpEvs
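As a rough illustration of how a PSI value relates to the per-path expression matrices described above, the sketch below assumes the common definition PSI = Path1 / (Path1 + Path2). The matrix layout, the sample names and the helper psi_from_paths are hypothetical and are not taken from the package internals.

```r
# Hypothetical expression of the two alternative paths of one event in four samples.
path_expr <- rbind(
  Path1 = c(S1 = 30, S2 = 10, S3 = 0, S4 = 25),
  Path2 = c(S1 = 10, S2 = 30, S3 = 0, S4 = 75)
)

# Assumed definition: PSI = Path1 / (Path1 + Path2).
psi_from_paths <- function(expr) {
  total <- expr["Path1", ] + expr["Path2", ]
  # Samples with no expression in either path have an undefined PSI.
  ifelse(total > 0, expr["Path1", ] / total, NA_real_)
}

psi <- psi_from_paths(path_expr)
print(psi)
```

Events with little or no expression give unstable or undefined PSI values, which is one reason an expression filter can be useful before the statistical analysis.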
Data frame with primers design for conventional PCR
data(MyPrimers)
A data.frame object that displays the relevant information for primer design for conventional PCR.
The MyPrimers object contains a data.frame with the information of the primers designed for conventional PCR.
Data frame with primers design for taqman PCR
data(MyPrimers_taqman)
A data.frame object that displays the relevant information for primer design for TaqMan PCR.
The MyPrimers_taqman object contains a data.frame with the information of the primers designed for TaqMan PCR.
Prepares the information contained in .bam files to be analyzed by EventPointer
PrepareBam_EP( Samples, SamplePath, Ref_Transc = "Ensembl", fileTransc = NULL, cores = 1, Alpha = 2 )
Samples |
Name of the .bam files to be analyzed (Sample1.bam,Sample2.bam,...,etc). |
SamplePath |
Path where the bam files are stored. |
Ref_Transc |
Reference transcriptome used to name the genes found in bam files. Options are: Ensembl, UCSC or GTF. |
fileTransc |
Path to the GTF reference transcriptome if Ref_Transc is GTF. |
cores |
Number of cores used for parallel processing. |
Alpha |
Internal SGSeq parameter to include or exclude regions |
SGFeaturesCounts object. It contains a GRanges object with the corresponding elements to build the different splicing graphs found and the counts related to each of the elements.
## Not run:
# Obtain the samples and directory for .bam files
BamInfo <- si
Samples <- BamInfo[, 2]
PathToSamples <- system.file('extdata/bams', package = 'SGSeq')
PathToGTF <- paste(system.file('extdata', package = 'EventPointer'), '/FBXO31.gtf', sep = '')
# Run PrepareBam function
SG_RNASeq <- PrepareBam_EP(Samples = Samples,
                           SamplePath = PathToSamples,
                           Ref_Transc = 'GTF',
                           fileTransc = PathToGTF,
                           cores = 1)
## End(Not run)
Analyze whether the presence of a protein domain increases or decreases in the condition under study.
Protein_Domain_Enrichment(PathsxTranscript, TxD, Diff_PSI, method = "spearman")
PathsxTranscript |
the output of EventDetection_transcriptome. |
TxD |
matrix that relates transcripts with Protein domain. Users can get it from BioMart |
Diff_PSI |
matrix with the difference of psi of the condition under study. Can get it from the output of EventPointer_Bootstraps |
method |
a character string indicating which correlation coefficient is to be calculated. "spearman" (default) or "pearson" can be selected. |
A list containing the results of the protein domain enrichment analysis. This list contains 3 matrices in which the rows correspond to the protein domains and the columns to the contrasts. The 3 matrices are the following:
-mycor: correlation value between the deltaPSI and the DifProtDomain matrix (see more details in vignette)
-STATISTIC: the values of the test statistic
-PVAL: the pvalues of the test statistic
## Not run:
data("EventXtrans")
data("TxD")
data("Fit")
# same annotation in TxD and EventXtrans
transcriptnames <- EventXtrans$transcritnames
transcriptnames <- gsub("\\..*", "", transcriptnames)
EventXtrans$transcritnames <- transcriptnames
Result_PDEA <- Protein_Domain_Enrichment(PathsxTranscript = EventXtrans,
                                         TxD = TxD,
                                         Diff_PSI = Fit$deltaPSI)
## End(Not run)
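The three result matrices (mycor, STATISTIC, PVAL) correspond to the quantities reported by a correlation test. The sketch below shows how such values can be obtained for a single protein domain and a single contrast with base R's cor.test. Both input vectors are invented, and the way the package builds the DifProtDomain matrix is not reproduced here.

```r
# Hypothetical deltaPSI values for six events and the matching change in the
# presence of one protein domain (+1 gained, -1 lost, 0 unchanged).
delta_psi  <- c(0.30, -0.12, 0.05, 0.41, -0.27, 0.18)
domain_dif <- c(1, -1, 0, 1, -1, 0)

# exact = FALSE because the domain vector contains ties.
res <- cor.test(delta_psi, domain_dif, method = "spearman", exact = FALSE)

# Collect the three quantities under the names used in the output list.
out <- c(mycor     = unname(res$estimate),
         STATISTIC = unname(res$statistic),
         PVAL      = res$p.value)
print(out)
```

Switching method to "pearson" in the call above gives the alternative coefficient offered by the method argument.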
Statistical analysis of the alternative splicing events. This function takes the values of PSI as input and performs a statistical analysis based on a permutation test.
PSI_Statistic(PSI, Design, Contrast, nboot)
PSI |
A matrix with the values of the PSI. |
Design |
The design matrix for the experiment. |
Contrast |
The contrast matrix for the experiment. |
nboot |
The number of random analyses. |
The output of this function is a list containing two data.frames (deltaPSI and Pvalues) with the values of the deltaPSI and the p-values for each contrast, and a third element (LocalFDR) with the information of the local false discovery rate.
## Not run:
data(ArraysData)
PSI_Arrays_list <- EventPointer:::getPSI(ArraysData)
PSI_Arrays <- PSI_Arrays_list$PSI
Design <- matrix(c(1,1,1,1,0,0,1,1), nrow = 4)
Contrast <- matrix(c(0,1), nrow = 1)
# Statistical analysis:
table <- PSI_Statistic(PSI_Arrays, Design = Design, Contrast = Contrast, nboot = 50)
## End(Not run)
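To make the idea of a permutation test on PSI concrete, the following sketch tests a single event in a two-group design using base R only. It is a generic permutation test written under simplifying assumptions (one event, difference of group means as deltaPSI, random label shuffling); it is not the implementation used by PSI_Statistic, and the PSI values are invented.

```r
set.seed(42)

# Hypothetical PSI values for one event: four control and four treated samples.
psi   <- c(0.20, 0.25, 0.22, 0.18, 0.60, 0.55, 0.65, 0.58)
group <- rep(c(0, 1), each = 4)

# deltaPSI taken here as the difference of group means.
delta_psi <- function(psi, group) {
  mean(psi[group == 1]) - mean(psi[group == 0])
}

observed <- delta_psi(psi, group)

# Null distribution obtained by shuffling the group labels.
nperm <- 500
null_delta <- replicate(nperm, delta_psi(psi, sample(group)))

# Two-sided p-value; the +1 terms keep it strictly above zero.
p_value <- (sum(abs(null_delta) >= abs(observed)) + 1) / (nperm + 1)
print(c(deltaPSI = observed, pvalue = p_value))
```

In this sketch the number of shuffles plays the role that nboot plays in the function signature: more random analyses give a finer resolution for small p-values.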
relationship between isoforms and events
data(PSIss)
An object in which PSIss[[1]] contains the values of PSI and PSIss[[2]] the values of expression.
The PSIss object contains the values of PSI calculated by the function GetPSI_FromTranRef, together with the values of expression.
Extract a table of the top-ranked events from the output of EventPointer_Bootstraps.
ResulTable(EP_Result,coef = 1,number = Inf)
EP_Result |
The output of the function EventPointer_Bootstraps |
coef |
Number specifying which coefficient or contrast of the model is of interest. |
number |
Maximum number of events to list |
A data.frame with one row per top-ranked event and the following columns:
deltaPSI: the difference of PSI between conditions
pvalue: raw p-value
lfdr: local false discovery rate
qvalue: adjusted p-value or q-value
data(PSIss)
PSI <- PSIss$PSI
Dmatrix <- cbind(1, rep(c(0,1), each = 2))
Cmatrix <- matrix(c(0,1), nrow = 2)
Fit <- EventPointer_Bootstraps(PSI = PSI,
                               Design = Dmatrix,
                               Contrast = Cmatrix,
                               cores = 1,
                               ram = 1,
                               nBootstraps = 10,
                               UsePseudoAligBootstrap = TRUE)
ResulTable(EP_Result = Fit, coef = 1, number = 5)
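The effect of the number argument can be pictured with a small base R sketch that keeps the top rows of a results table. It assumes that events are ranked by raw p-value, which is an assumption made for illustration; the mock table and the helper top_events are hypothetical and do not reproduce ResulTable itself.

```r
# Mock results table with two of the documented columns (values are invented).
res <- data.frame(
  deltaPSI  = c(0.35, -0.02, 0.18, -0.40, 0.07),
  pvalue    = c(0.001, 0.800, 0.030, 0.0005, 0.200),
  row.names = paste0("Event_", 1:5)
)

# Keep the `number` events with the smallest p-values (assumed ranking criterion).
top_events <- function(tab, number = Inf) {
  tab <- tab[order(tab$pvalue), , drop = FALSE]
  head(tab, n = min(number, nrow(tab)))
}

top <- top_events(res, number = 3)
print(top)
```

Leaving number at Inf returns every event, sorted, which mirrors the default in the usage line.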
Methodology to predict context-specific splicing factors
SF_Prediction( P_value_PSI, ExS, nSel = 1000, significance = NULL, method = "Fisher" )
P_value_PSI |
A data.frame with the p.values of the experiment. |
ExS |
The ExS matrix built by the CreateExSmatrix function. |
nSel |
Top ranked events to be considered as spliced events. |
significance |
P-value threshold used to decide which events are differentially spliced. A vector of length equal to the number of contrasts. If NULL, the nSel top ranked events are considered. |
method |
methodology to apply: "Fisher" for Fisher's exact test (default), "PoiBin" for the Poisson Binomial test, "Wilcoxon" for a Wilcoxon test or "Gsea" for a Kolmogorov-Smirnov test |
The function returns a list containing, for each contrast, a data.frame with the results of the prediction.
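The default "Fisher" option can be pictured as an enrichment test on a 2 x 2 table: are the events targeted by a splicing factor over-represented among the top-ranked events? The sketch below runs such a test for one splicing factor and one contrast with base R's fisher.test. The p-values, the binary target vector (standing in for one column of an ExS-like matrix) and the one-sided alternative are assumptions for illustration, not the internals of SF_Prediction.

```r
# Hypothetical p-values for 20 events in one contrast.
p_values <- c(0.001, 0.002, 0.004, 0.010, 0.020, 0.200, 0.300, 0.450, 0.500, 0.600,
              0.650, 0.700, 0.750, 0.800, 0.850, 0.900, 0.920, 0.950, 0.970, 0.990)

# Hypothetical binary vector: 1 if the event is a target of the splicing factor.
targets <- c(1, 1, 1, 1, 0, 0, 0, 0, 1, 0,
             0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

# Treat the n_sel top-ranked events as differentially spliced.
n_sel <- 5
spliced <- rank(p_values, ties.method = "first") <= n_sel

# 2 x 2 contingency table: spliced status versus target status.
tab <- table(
  spliced = factor(spliced, levels = c(FALSE, TRUE)),
  target  = factor(targets == 1, levels = c(FALSE, TRUE))
)

# One-sided test for over-representation of targets among spliced events.
res <- fisher.test(tab, alternative = "greater")
print(tab)
print(res$p.value)
```

Replacing the rank-based selection with a comparison such as p_values <= threshold would correspond to supplying a significance threshold instead of nSel.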
Splicing graph example for Events_ReClassification function
data(SG_reclassify)
A list object in which SG_reclassify[[i]] displays the splicing graph of the ith gene.
A list with the splicing graph of the 5 genes corresponding to the alternative splicing events depicted in the example of the function Events_ReClassification.
Splicing graph elements predicted from BAM files
data(SG_RNASeq)
An SGFeatureCounts object with predicted splicing graph features and counts.
The SG_RNASeq object displays the predicted features found in the BAM files from the dataset published in Seshagiri et al. 2012 and used in the SGSeq R package vignette.
Transcript x Protein Domain matrix: small matrix for examples
data(TxD)
A matrix object
A matrix that relates transcripts to protein domains.