SpliceWiz is a graphical interface for differential alternative splicing and visualization in R. It differs from other alternative splicing tools as it is designed for users with basic bioinformatic skills to analyze datasets containing up to hundreds of samples! SpliceWiz contains a number of innovations including:
This vignette is a runnable working example of the SpliceWiz workflow. The purpose is to quickly demonstrate the basic functionalities of SpliceWiz.
We provide here a brief outline of the workflow for users to get started as quickly as possible. However, we also provide more details for those wishing to know more. Many sections will contain extra information that can be displayed when clicked on, such as these:
We recommend the following memory requirements (RAM) for running various
steps of SpliceWiz:
buildRef()
processBAM()
collateData()
lowMemoryMode=TRUE
): 32 gigabyteslowMemoryMode=FALSE
): 8 gigabytes per threadDifferential analysis
SpliceWiz defines alternative splicing events (ASEs) as binary events between two possibilities, the included and excluded isoform. It detects and measures: skipped (casette) exons (SE), mutually-exclusive exons (MXE), alternative 5’/3’ splice site usage (A5SS / A3SS), alternate first / last exon usage (AFE / ALE), and retained introns (IR or RI).
SpliceWiz uses splice-specific read counts to measure ASEs. Namely, these are junction reads (reads that align across splice sites). The exception is intron retention (IR) whereby the (trimmed) mean read depth across the intron is measured (identical to the method used in IRFinder).
SpliceWiz provides two metrics:
SpliceOver
or SpliceMax
method (the
latter is identical to that used in IRFinder)
Novel splicing events are those in which at least one isoform is not an
annotated transcript in the given gene annotation. SpliceWiz DOES detect
novel splicing events.
It detects novel events by using novel junctions, using pairs of junctions that originate from or terminate at a common coordinate (novel alternate splice site usage).
Additionally, SpliceWiz detects “tandem junction reads”. These are reads that span across two or more splice junctions. The region between splice junctions can then be annotated as novel exons (if they are not identical to annotated exons). These novel exons can then be used to measure novel casette exon usage.
The basic steps of SpliceWiz are as follows:
To install SpliceWiz, start R (devel version) and enter:
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("SpliceWiz")
For those wishing to set up a self-contained environment with SpliceWiz
installed (e.g. on a high performance cluster), we recommend using
miniconda. For installation instructions, see the documentation on
how to install miniconda
After installing miniconda, create a conda environment as follows:
After following the prompts, activate the environment:
Next, install R 4.2.1 as follows:
NB: We have not been able to successfuly use r-base=4.3, so we recommend using r-base=4.2.1 (until further notice).
Many of SpliceWiz’s dependencies are up-to-date from the conda-forge channel, so they are best installed via conda:
conda install -c conda-forge r-devtools r-essentials r-xml r-biocmanager \
r-fst r-plotly r-rsqlite r-rcurl
After this is done, the remainder of the packages need to be installed from the R terminal. This is because most Bioconductor packages are from the bioconda channel and appear not to be routinely updated.
So, lets enter the R terminal from the command line:
Set up Bioconductor 3.16 (which is the latest version compatible with R 4.2):
Again, follow the prompts to update any necessary packages.
Once this is done, install SpliceWiz (devel) from github:
The last step will install any remaining dependencies, taking approximately 20-30 minutes depending on your system.
For MacOS users, make sure OpenMP libraries are
installed correctly. We recommend users follow this guide, but the quickest way
to get started is to install libomp
via brew:
SpliceWiz uses established statistical tools to perform alternative
splicing differential analysis:
To install all of these packages:
NxtIRFdata
data package.
This data package contains the example “chrZ” genome / annotations and 6
example BAM files that are used in this working example. Also,
NxtIRFdata provides pre-generated mappability exclusion annotations for
building human and mouse SpliceWiz references
SpliceWiz offers a graphical user interface (GUI) for interactive users, e.g. in the RStudio environment. To start using SpliceWiz GUI:
SpliceWiz first needs to generate a set of reference files. The
SpliceWiz reference is used to quantitate alternative splicing in BAM
files, as well as in downstream collation, differential analysis and
visualisation.
Using the example FASTA and GTF files, use the
buildRef()
function to build the SpliceWiz reference:
ref_path <- file.path(tempdir(), "Reference")
buildRef(
reference_path = ref_path,
fasta = chrZ_genome(),
gtf = chrZ_gtf(),
ontologySpecies = "Homo sapiens"
)
The SpliceWiz reference can be viewed as data frames using various getter functions. For example, to view the annotated alternative splicing events (ASE):
See ?View-Reference-methods
for a comprehensive list of
getter functions
After starting the SpliceWiz GUI in demo mode, click the
Reference
tab from the menu side bar. The following
interface will be shown:
Load Demo FASTA/GTF
(5), and then click
Build Reference
(6)
The helper functions chrZ_genome()
and
chrZ_gtf()
returns the paths to the example genome (FASTA)
and transcriptome (GTF) file included with the NxtIRFdata
package that contains the working example used by SpliceWiz:
SpliceWiz supports gene ontology analysis. To enable this capability, we
first need to generate the gene ontology annotations for the appropriate
species.
To see a list of supported species:
getAvailableGO()
#> [1] "Anopheles gambiae" "Arabidopsis thaliana"
#> [3] "Bos taurus" "Canis familiaris"
#> [5] "Gallus gallus" "Pan troglodytes"
#> [7] "Escherichia coli" "Drosophila melanogaster"
#> [9] "Homo sapiens" "Mus musculus"
#> [11] "Sus scrofa" "Rattus norvegicus"
#> [13] "Macaca mulatta" "Caenorhabditis elegans"
#> [15] "Xenopus laevis" "Saccharomyces cerevisiae"
#> [17] "Danio rerio"
Note that, if genome_type
is specified to a supported
genome, the human / mouse gene ontology annotation will be automatically
generated.
For the most part, the SpliceWiz reference can be built with just the
FASTA and GTF files. This is sufficient for assessment for most forms of
alternative splicing events.
For intron retention, accurate assessment of intron depth is important. However, introns contain many repetitive regions that are difficult to map. We refer to these regions as “mappability exclusions”.
We adopt IRFinder’s algorithm to identify these mappability exclusions. This is determined empirically by generating synthetic reads systematically from the genome, then aligning these reads back to the same genome. Regions that contain less than the expected coverage depth of reads define “mappability exclusions”.
See the vignette: SpliceWiz cookbook for details on how to generate “mappability exclusions” for any genome.
For human and mouse genomes, SpliceWiz provides pre-built mappability
exclusion references that can be used to build the SpliceWiz reference.
SpliceWiz provides these annotations via the NxtIRFdata
package.
Simply specify the genome in the parameter genome_type
in the buildRef()
function (which accepts
hg38
, hg19
, mm10
and
mm9
).
Additionally, a reference for non-polyadenylated transcripts is used. This has a minor role in QC of samples (to assess the adequacy of polyA capture).
For example, assuming your genome file "genome.fa"
and a
transcript annotation "transcripts.gtf"
are in the working
directory, a SpliceWiz reference can be built using the built-in
hg38
low mappability regions and non-polyadenylated
transcripts as follows:
The function SpliceWiz_example_bams()
retrieves 6
example BAM files from ExperimentHub and places a copy of these in the
temporary directory.
bams <- SpliceWiz_example_bams()
#> Downloading record from hub, as required: NxtIRF/example_bam/02H003
#> Downloading record from hub, as required: NxtIRF/example_bam/02H025
#> Downloading record from hub, as required: NxtIRF/example_bam/02H026
#> Downloading record from hub, as required: NxtIRF/example_bam/02H033
#> Downloading record from hub, as required: NxtIRF/example_bam/02H043
#> Downloading record from hub, as required: NxtIRF/example_bam/02H046
Often, alignment pipelines process multiple samples. SpliceWiz provides
convenience functions to recursively locate all the BAM files in a given
folder, and tries to ascertain sample names. Often sample names can be
gleaned when: * The BAM files are named by their sample names,
e.g. “sample1.bam”, “sample2.bam”. In this case, level = 0
* The BAM files have generic names but are contained inside parent
directories labeled by their sample names, e.g. “sample1/Unsorted.bam”,
“sample2/Unsorted.bam”. In this case, level = 1
Process these BAM files using SpliceWiz:
pb_path <- file.path(tempdir(), "pb_output")
processBAM(
bamfiles = bams$path,
sample_names = bams$sample,
reference_path = ref_path,
output_path = pb_path
)
After building the demo reference as shown in the previous section,
start SpliceWiz GUI in demo mode. Then, click the
Experiment
tab from the menu side bar. The following
interface will be shown:
The buttons on the left hand side are as follows:
processBAM
(process BAM files)collateData
- collating the
experimentcollateData
(collating the experiment)processBAM
and
`collateData functionsAlso, (8) is a row of tabs that toggle between different tables, showing details of the Reference, BAM files, processBAM output Files, and sample Annotations
To continue with our example, click on (1)
Define Project Folders
to bring up the following drop-down
box:
We need to define the folders that contain our reference, BAM files, as well as the experiment (NxtSE) output folder for the final compiled experiment
Choose Reference Folder
and select the
Reference
directory (where the SpliceWiz reference was
generated by the previous step. Then,Choose BAM Folder
and select the
bams
directory (where the demo BAM files have been
generated).Choose / Create Experiment (NxtSE) Folder
and
select the NxtSE
directory (which should currently be empty
except for the pbOutput
subdirectory). Note that when an
Experiment folder is chosen via this step, a pbOutput
subdirectory will be created if it does not already existAfter our folders have been defined, on the right hand side, an interactive table should be displayed that looks like the following:
To process the example BAM files, first make sure the BAM files you
wish to process have been selected (BAM files can be unselected by
removing the ticks in the selected
column). Also, users
have the option of renaming the samples (by setting the names in the
sampleName
column).
To continue with our example, lets leave the names as-is. Click the
Process Selected BAMs
button. A prompt should pop up asking
for confirmation. Click OK
to start running processBAM.
processBAM()
function
SpliceWiz’s processBAM()
function can process one or more
BAM files. This function is ultra-fast, relying on an internal native
C++ function that uses OpenMP multi-threading (via the
ompBAM
C++ API).
Input BAM files can be either read-name sorted or coordinate sorted (although SpliceWiz prefers the former). Indexing of coordinate-sorted BAMs are not necessary.
processBAM()
loads the SpliceWiz reference. Then, it
reads each BAM file in their entirety, and quantifies the following:
processBAM()
generates two output files.
The first is a gzipped text file containing all the quantitation data.
The second is a COV
file which contains the per-nucleotide
RNA-seq coverage of the sample.processBAM()
function
At minimum, processBAM()
requires four parameters:
bamfiles
: The paths of the BAM filessample_names
: The sample names corresponding to the
given BAM filesreference_path
: The directory containing the SpliceWiz
referenceoutput_path
: The directory where the output of
processBAM()
should gopb_path <- file.path(tempdir(), "pb_output")
processBAM(
bamfiles = bams$path,
sample_names = bams$sample,
reference_path = ref_path,
output_path = pb_path
)
processBAM()
also takes several optional, but useful,
parameters:
n_threads
: The number of threads for
multi-threadingoverwrite
: Whether existing files in the output
directory should be overwrittenrun_featureCounts
: (Requires the Rsubread package)
runs featureCounts to obtain gene counts (which outputs results as an
RDS file)For example, to run processBAM()
using 2 threads,
disallow overwrite of existing processBAM()
outputs, and
run featureCounts afterwards, one would run the following:
# NOT RUN
# Re-run IRFinder without overwrite, and run featureCounts
require(Rsubread)
processBAM(
bamfiles = bams$path,
sample_names = bams$sample,
reference_path = ref_path,
output_path = pb_path,
n_threads = 2,
overwrite = FALSE,
run_featureCounts = TRUE
)
# Load gene counts
gene_counts <- readRDS(file.path(pb_path, "main.FC.Rds"))
# Access gene counts:
gene_counts$counts
The helper function findSpliceWizOutput()
organises the
output files of SpliceWiz’s processBAM()
function. It
identifies matching "txt.gz"
and "cov"
files
for each sample, and organises these file paths conveniently into a
3-column data frame:
Using this data frame, collate the experiment using
collateData()
. We name the output directory as
NxtSE_output
as this folder will contain the data needed to
import the NxtSE object:
nxtse_path <- file.path(tempdir(), "NxtSE_output")
collateData(
Experiment = expr,
reference_path = ref_path,
output_path = nxtse_path
)
collateData()
function
collateData()
combines the processBAM()
output
files of multiple samples and builds a single database.
collateData()
creates a number of files in the chosen
output directory. These outputs can then be imported into the R session
as a NxtSE
data object for downstream analysis.
At minimum, collateData()
takes the following
parameters:
Experiment
: The 2- or 3- column data frame. The first
column should contain (unique) sample names. The second and (optional)
third columns contain the "txt.gz"
and "cov"
file pathsreference_path
: The directory containing the SpliceWiz
referenceoutput_path
: The directory where the output of
processBAM()
should gocollateData()
can take some optional parameters:
IRMode
: Whether to use SpliceWiz’s
SpliceOver
method, or IRFinder’s SpliceMax
method, to determine total spliced transcript abundance. Briefly,
SpliceMax
considers junction reads that have either
flanking splice site coordinate. SpliceOver
considers
additional junction reads that splices across exon clusters in common.
Exon clusters are groups of mutually-overlapping exons.
SpliceOver
is the default option.overwrite
: Whether files in the output directory
should be overwrittenn_threads
: Use multi-threaded operations where
possiblelowMemoryMode
: Minimise memory usage where possible.
Note that most of the collateData pipeline will be single-threaded if
this is set to TRUE
.collateData()
is a memory-intensive operation when run
using multiple threads. We estimate it can use up to 6-7 Gb per thread.
lowMemoryMode
will minimise RAM usage to ~ 8 Gb, but will
be slower and run on a single thread.
Novel splicing detection can be switched on by setting
novelSplicing = TRUE
from within the
collateData()
function:
# Modified pipeline - collateData with novel ASE discovery:
nxtse_path <- file.path(tempdir(), "NxtSE_output_novel")
collateData(
Experiment = expr,
reference_path = ref_path,
output_path = nxtse_path,
novelSplicing = TRUE ## NEW ##
)
collateData()
uses split reads that are not annotated
introns to help construct hypothetical minimal transcripts. These are
then injected into the original transcriptome annotation (GTF) file,
whereby the SpliceWiz reference is rebuilt. The new SpliceWiz reference
(which contains these novel transcripts) is then used to collate the
samples.
To reduce false positives in novel splicing detection, SpliceWiz provides several filters to reduce the number of novel junctions fed into the analysis:
novelSplicing_minSamples
parameternovelSplicing_countThreshold
)
in a smaller number of samples (set using
novelSplicing_minSamplesAboveThreshold
)novelSplicing_requireOneAnnotatedSJ = TRUE
).For example, if one wished to retain novel reads seen in 3 or more samples, or novel spliced reads with 10 or more counts in at least 1 sample, and requiring at least one end of a novel junction being an annotated splice site:
By default, tandem junction reads (reads that align across two or
more splice junctions) are used to detect novel exons. This can be
turned off by setting novelSplicing_useTJ = FALSE
.
nxtse_path <- file.path(tempdir(), "NxtSE_output_novel")
collateData(
Experiment = expr,
reference_path = ref_path,
output_path = nxtse_path,
## NEW ##
novelSplicing = TRUE,
# switches on novel splice detection
novelSplicing_requireOneAnnotatedSJ = TRUE,
# novel junctions must share one annotated splice site
novelSplicing_minSamples = 3,
# retain junctions observed in 3+ samples (of any non-zero expression)
novelSplicing_minSamplesAboveThreshold = 1,
# only 1 sample required if its junction count exceeds a set threshold
novelSplicing_countThreshold = 10 ,
# threshold for previous parameter
novelSplicing_useTJ = TRUE
# whether tandem junction reads should be used to identify novel exons
)
After running the Reference and processBAM steps as indicated in the
previous sections (of the GUI instructions), there is an option to
assign annotations to the experiment prior to collation. Annotations can
be assigned from existing tabular files (such as csv files). For this
example, we will demonstrate how to use a csv file containing
annotations to annotate our experiment. Note that the annotation table
should contain matching sample names in its leftmost column.
To select a file containing annotations, click on
Import Annotations from file
, then select the
demo_annotations.csv
file that should be in the working
directory. Press ok. Your interface should now look like this:
Note that an extra button
Add / Remove Annotation Columns
(*) has appeared. Clicking
on this button allows us to add/remove annotation columns
After annotating the experiment in the step above, click on the
Experiment Settings
button. Then enable
Look for Novel Splicing
to bring up the following
display:
This drop-down dialog box contains several parameters related to novel splicing detection:
Also, there is an option (7) to overwrite a previously-compiled NxtSE in the same folder. There is also an option to clear all the Reference/BAM/NxtSE folders (8).
For this example, we will leave the settings as above. Proceed to run
collateData() by clicking on Collate Experiment
. After
several moments, a pop-up message should be shown when the experiment
has been successfully collated.
Before differential analysis can be performed, the collated
experiment must be imported into the R session as an NxtSE
data object.
After running collateData()
, import the experiment using
the makeSE()
function:
After running the steps in the previous GUI sections, navigate to
Analysis
and then click the Load Experiment
on
the menu bar. The display should look like this:
The buttons on the left hand side are as follows:
pbOutput
subdirectory)To continue with our example, click the
Select Experiment (NxtSE) Folder
, then select the
NxtSE
directory. The interface should now look like
this:
To view any existing annotations, click the Annotations
tab in the Experiment Display
above the sample table. If
you followed the prior steps in the Collate the experiment
section, there should already be annotations here. If not, feel free to
add annotations using the Import Annotations from File
(and
click on “demo_annotations.csv” file), or manually add and annotate
columns by clicking on the Add / Remove Annotation Columns
button to open the drop-down box.
To load the NxtSE object, click the
Load NxtSE from Folder
to which will load the NxtSE object
into the current session. A pop-up will appear once the NxtSE object has
been successfully completed.
makeSE()
function
makeSE()
function imports the compiled data generated
by the collateData()
function. Data is imported as an
NxtSE
object. Downstream analysis, including differential
analysis and visualization, is performed using the NxtSE
object.makeSE()
function
By default, makeSE()
uses delayed operations to avoid
consuming memory until the data is actually needed. This is advantageous
in analysis of hundreds of samples on a computer with limited resources.
However, it will be slower. To load all the data into memory, we need to
“realize” the NxtSE object, as follows:
Alternatively, makeSE()
can realize the NxtSE object at
construction:
By default, makeSE()
constructs the NxtSE object using
all the samples in the collated data. It is possible (and particularly
useful in large data sets) to read only a subset of samples. In this
case, construct a data frame object with the first column containing the
desired sample names and parse this into the colData
parameter as shown:
subset_samples <- colnames(se)[1:4]
df <- data.frame(sample = subset_samples)
se_small <- makeSE(nxtse_path, colData = df, RemoveOverlapping = TRUE)
RemoveOverlapping = FALSE
.
NB: to add annotations via the GUI workflow, see the
Collate the experiment
section.
Once an NxtSE object has been loaded into memory, you can save it as an
RDS object so it can be reloaded in a later session. To do this, click
the Save NxtSE as RDS
button. Choose a file name and
location and press OK
. This RDS file can be loaded as an
NxtSE object in a later GUI session by clicking the
Load NxtSE from RDS
button.
NxtSE
object
NxtSE
is a data object which contains all the required data
for downstream analysis after all the BAM alignment files have been
process and the experiment is collated.
se
#> class: NxtSE
#> dim: 167 6
#> metadata(14): Up_Inc Down_Inc ... ref row_gr
#> assays(5): Included Excluded Depth Coverage minDepth
#> rownames(167): TRA2B/ENST00000453386_Intron8/clean
#> TRA2B/ENST00000453386_Intron7/clean ...
#> RI:SRSF2-203-exon2;SRSF2-202-intron2
#> RI:SRSF2-203-exon2;SRSF2-206-intron1
#> rowData names(20): EventName EventType ... is_annotated_IR
#> NMD_direction
#> colnames(6): 02H003 02H025 ... 02H043 02H046
#> colData names(2): condition batch
The NxtSE
object inherits the
SummarizedExperiment
object. This means that the functions
for SummarizedExperiment can be used on the NxtSE object. These include
row and column annotations using the rowData()
and
colData()
accessors.
Rows in the NxtSE
object contain information about each
alternate splicing event. For example:
head(rowData(se))
#> DataFrame with 6 rows and 20 columns
#> EventName EventType
#> <character> <character>
#> TRA2B/ENST00000453386_Intron8/clean TRA2B/ENST0000045338.. IR
#> TRA2B/ENST00000453386_Intron7/clean TRA2B/ENST0000045338.. IR
#> TRA2B/ENST00000453386_Intron6/clean TRA2B/ENST0000045338.. IR
#> TRA2B/ENST00000453386_Intron5/clean TRA2B/ENST0000045338.. IR
#> TRA2B/ENST00000453386_Intron4/clean TRA2B/ENST0000045338.. IR
#> TRA2B/ENST00000453386_Intron3/clean TRA2B/ENST0000045338.. IR
#> EventRegion gene_id
#> <character> <character>
#> TRA2B/ENST00000453386_Intron8/clean chrZ:1921-2559/- ENSG00000136527
#> TRA2B/ENST00000453386_Intron7/clean chrZ:2634-3631/- ENSG00000136527
#> TRA2B/ENST00000453386_Intron6/clean chrZ:3692-5298/- ENSG00000136527
#> TRA2B/ENST00000453386_Intron5/clean chrZ:5383-6205/- ENSG00000136527
#> TRA2B/ENST00000453386_Intron4/clean chrZ:6322-7990/- ENSG00000136527
#> TRA2B/ENST00000453386_Intron3/clean chrZ:8180-9658/- ENSG00000136527
#> gene_id_b intron_id
#> <character> <character>
#> TRA2B/ENST00000453386_Intron8/clean ENSG00000136527 ENST00000453386_Intr..
#> TRA2B/ENST00000453386_Intron7/clean ENSG00000136527 ENST00000453386_Intr..
#> TRA2B/ENST00000453386_Intron6/clean ENSG00000136527 ENST00000453386_Intr..
#> TRA2B/ENST00000453386_Intron5/clean ENSG00000136527 ENST00000453386_Intr..
#> TRA2B/ENST00000453386_Intron4/clean ENSG00000136527 ENST00000453386_Intr..
#> TRA2B/ENST00000453386_Intron3/clean ENSG00000136527 ENST00000453386_Intr..
#> Inc_Is_Protein_Coding Exc_Is_Protein_Coding
#> <logical> <logical>
#> TRA2B/ENST00000453386_Intron8/clean TRUE TRUE
#> TRA2B/ENST00000453386_Intron7/clean TRUE TRUE
#> TRA2B/ENST00000453386_Intron6/clean TRUE TRUE
#> TRA2B/ENST00000453386_Intron5/clean TRUE TRUE
#> TRA2B/ENST00000453386_Intron4/clean TRUE TRUE
#> TRA2B/ENST00000453386_Intron3/clean TRUE TRUE
#> Exc_Is_NMD Inc_Is_NMD Inc_TSL
#> <logical> <logical> <character>
#> TRA2B/ENST00000453386_Intron8/clean FALSE TRUE 1
#> TRA2B/ENST00000453386_Intron7/clean FALSE TRUE 1
#> TRA2B/ENST00000453386_Intron6/clean FALSE TRUE 1
#> TRA2B/ENST00000453386_Intron5/clean FALSE TRUE 1
#> TRA2B/ENST00000453386_Intron4/clean FALSE TRUE 1
#> TRA2B/ENST00000453386_Intron3/clean FALSE TRUE 1
#> Exc_TSL Event1a Event2a
#> <character> <character> <character>
#> TRA2B/ENST00000453386_Intron8/clean 1 chrZ:1921-2559/- NA
#> TRA2B/ENST00000453386_Intron7/clean 1 chrZ:2634-3631/- NA
#> TRA2B/ENST00000453386_Intron6/clean 1 chrZ:3692-5298/- NA
#> TRA2B/ENST00000453386_Intron5/clean 1 chrZ:5383-6205/- NA
#> TRA2B/ENST00000453386_Intron4/clean 1 chrZ:6322-7990/- NA
#> TRA2B/ENST00000453386_Intron3/clean 1 chrZ:8180-9658/- NA
#> Event1b Event2b
#> <character> <character>
#> TRA2B/ENST00000453386_Intron8/clean NA NA
#> TRA2B/ENST00000453386_Intron7/clean NA NA
#> TRA2B/ENST00000453386_Intron6/clean NA NA
#> TRA2B/ENST00000453386_Intron5/clean NA NA
#> TRA2B/ENST00000453386_Intron4/clean NA NA
#> TRA2B/ENST00000453386_Intron3/clean NA NA
#> is_always_first_intron
#> <logical>
#> TRA2B/ENST00000453386_Intron8/clean FALSE
#> TRA2B/ENST00000453386_Intron7/clean FALSE
#> TRA2B/ENST00000453386_Intron6/clean FALSE
#> TRA2B/ENST00000453386_Intron5/clean FALSE
#> TRA2B/ENST00000453386_Intron4/clean FALSE
#> TRA2B/ENST00000453386_Intron3/clean FALSE
#> is_always_last_intron is_annotated_IR
#> <logical> <logical>
#> TRA2B/ENST00000453386_Intron8/clean FALSE FALSE
#> TRA2B/ENST00000453386_Intron7/clean FALSE FALSE
#> TRA2B/ENST00000453386_Intron6/clean FALSE FALSE
#> TRA2B/ENST00000453386_Intron5/clean FALSE FALSE
#> TRA2B/ENST00000453386_Intron4/clean FALSE TRUE
#> TRA2B/ENST00000453386_Intron3/clean FALSE FALSE
#> NMD_direction
#> <numeric>
#> TRA2B/ENST00000453386_Intron8/clean 1
#> TRA2B/ENST00000453386_Intron7/clean 1
#> TRA2B/ENST00000453386_Intron6/clean 1
#> TRA2B/ENST00000453386_Intron5/clean 1
#> TRA2B/ENST00000453386_Intron4/clean 1
#> TRA2B/ENST00000453386_Intron3/clean 1
Columns contain information about each sample. By default, no annotations are assigned to each sample. These can be assigned as shown above.
Also, NxtSE
objects can be subsetted by rows (ASEs) or
columns (samples). This is useful if one wishes to perform analysis on a
subset of the dataset, or only on a subset of ASEs (say for example,
only skipped exon events). Subsetting is performed just like for
SummarizedExperiment
objects:
# Subset by columns: select the first 2 samples
se_sample_subset <- se[,1:2]
# Subset by rows: select the first 10 ASE events
se_ASE_subset <- se[1:10,]
SpliceWiz offers default filters to identify and remove low confidence alternative splice events (ASEs). Run the default filter using the following:
se.filtered <- se[applyFilters(se),]
#> Running Depth filter
#> Running Participation filter
#> Running Participation filter
#> Running Consistency filter
#> Running Terminus filter
#> Running ExclusiveMXE filter
#> Running StrictAltSS filter
After following the GUI tutorials in the prior sections, click on
Analysis
and then Filters
from the menu bar.
It should look like this:
To load SpliceWiz’s default filters, click the top right button
Load Default Filters
. Then to apply these filters to the
NxtSE, click Apply Filters
. After the filters have been
run, your session should now look like this:
Often, the gene annotations contain isoforms for all discovered splicing
events. Most annotated transcripts are not expressed, and their
inclusion in differential analysis complicates results including
adjusting for multiple testing. It is prudent to filter these out using
various approaches, akin to removing genes with low gene counts in
differential gene analysis. We suggest using the default filters which
work well for small experiments with sequencing depths at 100-million
paired-end reads.
?ASEFilters
Using the edgeR wrapper ASE_edgeR()
, perform
differential ASE analysis between conditions “A” and “B”:
# Requires edgeR to be installed:
require("edgeR")
res_edgeR <- ASE_edgeR(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A"
)
After running the previous sections (of the GUI instructions), click
Analysis
and then
Differential Expression Analysis
on the menu side bar. It
should look something like this:
To perform edgeR-based differential analysis, first ensure
Method
is set to edgeR
. Using the
Variable
drop-down box, select condition
.
Then, select the Nominator
and Denominator
fields to B
and A
, respectively. Leave the
batch factor fields as (none)
. Then, click
Perform DE
.
Once differential expression analysis has finished, your session
should look like below. The output is a DT-based data table equivalent
to the ASE_edgeR()
function.
NB: The interface allows users to choose to sort the results either by nominal or (multiple-testing) adjusted P values
NB2: There are 3 different ways Intron Retention events can be quantified and analysed - see “What are the different ways intron retention is measured?” below for further details.
NB3: Analyses can be saved or loaded to/from RDS files using the corresponding buttons.
SpliceWiz provides wrappers to three established algorithms:
ASE_limma
uses limma
to model isoform
counts as log-normal distributions. Limma is probably the fastest method
and is ideal for large datasets. Time series analysis is available for
this mode.ASE_DESeq
uses DESeq2
to model isoform
counts as negative binomial distribution. This method is the most
computationally expensive, but gives robust results. Time series
analysis is also available for this modeASE_edgeR
uses edgeR
to model isoform
counts as negative binomial distributions. SpliceWiz uses the
quasi-likelihood method that deals better with variance at near-zero
junction counts, resulting in reduced false positives.ASE_DoubleExpSeq
uses the lesser-known CRAN package
DoubleExpSeq
. This package uses the beta-binomial
distribution to model isoform counts. The method is at least as fast as
limma
, but for now it is restricted to analysis between two
groups (i.e. batch correction is not implemented)We recommend the following for differential analysis:
# Requires limma to be installed:
require("limma")
res_limma <- ASE_limma(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A"
)
# Requires DoubleExpSeq to be installed:
require("DoubleExpSeq")
res_DES <- ASE_DoubleExpSeq(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A"
)
# Requires DESeq2 to be installed:
require("DESeq2")
res_deseq <- ASE_DESeq(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A",
n_threads = 1
)
# Requires edgeR to be installed:
require("edgeR")
res_edgeR <- ASE_edgeR(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A"
)
Intron retention can be measured via two approaches.
The first (and preferred) approach is using IR-ratio. We presume that every intron is potentially retained (thus ignoring annotation). Given this results in many overlapping introns, SpliceWiz adjusts for this via the following:
makeSE()
step.SpliceOver
method in SpliceWiz). Alternately,
users can choose to use IRFinder’s SpliceMax
method,
summing junction reads that share either splice junction with the intron
of interest. This choice is also made by the user at the
makeSE()
step.At the differential analysis step, users can choose the following:
IRmode = "all"
- all introns are potentially retained,
use IR-ratio to quantify IR (EventType = "IR"
)IRmode = "annotated"
- only annotated retained intron
events are considered, but use IR-ratio to quantify IR
(EventType = "IR"
)IRmode = "annotated_binary"
- only annotated retained
intron events are considered, use PSI to quantify IR - which considers
the IR-transcript and the transcript isoform with the exactly-spliced
intron as binary alternatives. Splicing of overlapping introns are not
considered in PSI quantitation.res_edgeR_allIntrons <- ASE_edgeR(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A",
IRmode = "all"
)
res_edgeR_annotatedIR <- ASE_edgeR(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A",
IRmode = "annotated"
)
res_edgeR_annotated_binaryIR <- ASE_edgeR(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A",
IRmode = "annotated_binary"
)
ASE_limma
, ASE_edgeR
, and
ASE_DESeq
can accept up to 2 categories of batches from
which to normalize. For example, to normalize the analysis by the
batch
category, one would run:
require("edgeR")
res_edgeR_batchnorm <- ASE_edgeR(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A",
batch1 = "batch"
)
Time series analysis can be performed using limma, edgeR, and
DESeq2.
For limma and edgeR, time series analysis is done using the
ASE_limma_timeseries()
and
ASE_edgeR_timeseries()
function. test_factor
,
despite its name, should be a column in colData(se)
containing numerical values that represent time series data.
Note that these time series wrappers function requires the
splines
package.
colData(se.filtered)$timevar <- rep(c(0,1,2), 2)
require("splines")
require("limma")
res_limma_cont <- ASE_limma_timeseries(
se = se.filtered,
test_factor = "timevar"
)
require("splines")
require("edgeR")
res_edgeR_cont <- ASE_edgeR_timeseries(
se = se.filtered,
test_factor = "timevar"
)
For DESeq2, time series analysis is performed using the
ASE_DESeq()
funcction. The key difference is that, for time
series analysis, simply do not specify the test_nom
and
test_denom
parameters. As long as the
test_factor
contains numeric values, ASE_DESeq
will treat it as a continuous variable. See the following example:
colData(se.filtered)$timevar <- rep(c(0,1,2), 2)
require("DESeq2")
res_deseq_cont <- ASE_DESeq(
se = se.filtered,
test_factor = "timevar"
)
We have implemented wrapper functions enabling advanced users to perform differential ASE analysis by constructing their own design matrices. This allows users to evaluate effects of covariates in complex experimental models.
We will be building a separate vignette to illustrate the full functionality of these edgeR-based functions, but for now a quick example can be found in the relevant documentation, which can be viewed via:
Volcano plots show changes in PSI levels (log fold change, x axis) against statistical significance (-log10 p values, y axis):
library(ggplot2)
ggplot(res_edgeR,
aes(x = logFC, y = -log10(FDR))) +
geom_point() +
labs(title = "Differential analysis - B vs A",
x = "Log2-fold change", y = "FDR (-log10)")
Yes. We can use ggplot2
’s facet_wrap
function
to separately plot volcanos for each modality of ASE. The type of ASE is
contained in the EventType
column of the differential
results data frame.
ggplot(res_edgeR,
aes(x = logFC, y = -log10(FDR))) +
geom_point() + facet_wrap(vars(EventType)) +
labs(title = "Differential analysis - B vs A",
x = "Log2-fold change", y = "FDR (-log10)")
After following the previous sections including differential analysis,
navigate to Display
and then Volcano Plot
.
Notice that there will be a message that says “No events found. Consider
relaxing some filters”.
This message occurs because our example dataset has no differential
events that surpass an adjusted P value of less than 0.05 (which is the
default filter setting). The SpliceWiz GUI avoids plotting all ASEs as
this will crowd the visualization. In this example, change the
Filter Events by
to Nominal P value
, and move
the P-value/FDR threshold
all the way to the right. There
should now be a volcano plot but most events have near-zero significance
because the default y-axis setting is to
Plot adjusted P values
. Switch this off to show the
following:
You can customize this volcano plot using the following controls:
NMD Mode
is ON
, the horizontal axis
represents whether splicing is shifted towards (positive values) or away
from (negative values) a NMD substrate transcriptAlso, on the (right hand side) main panel:
Scatter plots are useful for showing splicing levels (percent-spliced-in, PSI) between two conditions. The results from differential analysis contains these values and can be plotted:
library(ggplot2)
ggplot(res_edgeR, aes(x = 100 * AvgPSI_B, y = 100 * AvgPSI_A)) +
geom_point() + xlim(0, 100) + ylim(0, 100) +
labs(title = "PSI values across conditions",
x = "PSI of condition B", y = "PSI of condition A")
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_point()`).
After following the previous sections including differential analysis,
navigate to Display
and then Scatter Plot
.
After relaxing the event filters similar to the previous section, change
the Variable
to condition
,
X-axis condition
to A
and
Y-axis condition
to B
. A scatter plot should
be automatically generated as follows:
You can customize the scatter plot using the following controls:
Variable
refers to the annotation table column
nameX-axis
and Y-axis
dropdowns
refer to the condition categories (that are used to contrast between two
experimental conditions)NMD Mode
is ON
, the PSI values are
altered such that they represent the inclusion values of the NMD
substrate (instead of that of the “included” isoform)As in the volcano plot, the scatter plot is interactive and points highlighted on this plot will stay highlighted in other plots.
SpliceWiz GUI generates plotly interactive figures. For volcano and
scatter plots, points of interest can be selected using the lasso or box
select tools. For example, we can select the top hits from the faceted
volcano plot as shown:
These ASEs of interest will then be highlighted in other plots, for example scatter plot:
SpliceWiz provides the makeMeanPSI()
function that can
generate mean PSI values for each condition of a condition category. For
example, the below code will calculate the mean PSIs of each “batch” of
this example experiment:
Our working example does not have enough genes to demonstrate a workable
gene ontology analysis. Instead, the following explains the controls
found in the Gene Ontology
panel:
The controls are as follows:
SpliceWiz has built-in gene ontology analysis. For now, only a limited number of species are supported for Ensembl gene ID’s. To see a list of supported organisms:
getAvailableGO()
#> [1] "Anopheles gambiae" "Arabidopsis thaliana"
#> [3] "Bos taurus" "Canis familiaris"
#> [5] "Gallus gallus" "Pan troglodytes"
#> [7] "Escherichia coli" "Drosophila melanogaster"
#> [9] "Homo sapiens" "Mus musculus"
#> [11] "Sus scrofa" "Rattus norvegicus"
#> [13] "Macaca mulatta" "Caenorhabditis elegans"
#> [15] "Xenopus laevis" "Saccharomyces cerevisiae"
#> [17] "Danio rerio"
To enable gene ontology analysis, one must use a SpliceWiz reference with prepared GO annotations for the specified organism. To view the gene ontology annotation for a given SpliceWiz reference:
ref_path <- file.path(tempdir(), "Reference")
ontology <- viewGO(ref_path)
head(ontology)
#> gene_id go_id evidence go_term ontology gene_name
#> 1 ENSG00000121410 GO:0003674 ND molecular_function MF <NA>
#> 2 ENSG00000121410 GO:0005576 HDA extracellular region CC <NA>
#> 3 ENSG00000121410 GO:0005576 IDA extracellular region CC <NA>
#> 4 ENSG00000121410 GO:0005576 TAS extracellular region CC <NA>
#> 5 ENSG00000121410 GO:0005615 HDA extracellular space CC <NA>
#> 6 ENSG00000121410 GO:0005886 IBA plasma membrane CC <NA>
Note that gene_name
s are not available for our example
reference because there are only 7 genes in the example reference.
Nevertheless, the GO annotation is complete and relies on Ensembl
gene_id
s for matching.
For simple gene over-representation analysis, we use the
goGenes
function. As an example, to analyse for enriched
biological functions of the first 1000 genes in the reference:
allGenes <- sort(unique(ontology$gene_id))
exampleGeneID <- allGenes[1:1000]
exampleBkgdID <- allGenes
go_byGenes <- goGenes(
enrichedGenes = exampleGeneID,
universeGenes = exampleBkgdID,
ontologyRef = ontology
)
To visualize the top gene ontology categories of the above analysis:
Of course, we wish to perform GO analysis on the top differential events in our analysis:
res_edgeR <- ASE_edgeR(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A"
)
#> Sep 08 08:25:03 AM Performing edgeR contrast for included / excluded counts separately
#> Sep 08 08:25:04 AM Performing edgeR contrast for included / excluded counts together
go_byASE <- goASE(
enrichedEventNames = res_edgeR$EventName[1:10],
universeEventNames = NULL,
se = se
)
head(go_byASE)
#> go_id go_term
#> <char> <char>
#> 1: GO:0000381 regulation of alternative mRNA splicing, via spliceosome
#> 2: GO:0045892 negative regulation of DNA-templated transcription
#> 3: GO:0048026 positive regulation of mRNA splicing, via spliceosome
#> 4: GO:0000122 negative regulation of transcription by RNA polymerase II
#> 5: GO:0000375 RNA splicing, via transesterification reactions
#> 6: GO:0000423 mitophagy
#> pval padj overlap size overlapGenes expected foldEnrichment
#> <num> <num> <int> <int> <list> <num> <num>
#> 1: 0.4761905 0.7862282 2 2 ENSG0000.... 2 1
#> 2: 0.4761905 0.7862282 2 2 ENSG0000.... 2 1
#> 3: 0.4761905 0.7862282 2 2 ENSG0000.... 2 1
#> 4: 0.7142857 0.7862282 1 1 ENSG0000.... 1 1
#> 5: 0.7142857 0.7862282 1 1 ENSG0000.... 1 1
#> 6: 0.7142857 0.7862282 1 1 ENSG0000.... 1 1
In most cases, users will wish to use the set of genes represented in the background ASEs as the background genes. This is because some genes do not have alternative splicing events, most often because they are intronless genes!
To perform GO analysis using background genes from analysed ASEs:
Heatmaps are useful for visualizing differential expression of individual samples, as well as potential patterns of expression.
First, obtain a matrix of PSI values:
# Create a matrix of values of the top 10 differentially expressed events:
mat <- makeMatrix(
se.filtered,
event_list = res_edgeR$EventName[1:10],
method = "PSI"
)
makeMatrix()
work?
makeMatrix()
provides a matrix of PSI values from the given
NxtSE
object. The parameters event_list
and
sample_list
allows subsetting for ASEs and/or samples,
respectively.
The parameter method
accepts 3 options:
"PSI"
: outputs raw PSI values"logit"
: outputs logit PSI values"Z-score"
: outputs Z-score transformed PSI valuesAlso, makeMatrix()
facilitates exclusion of low
confidence PSI values. These can occur when counts of both isoforms are
too low. Setting the depth_threshold
(default
10
) will set samples with total isoform count below this
value to be converted to NA
.
NA
values are filtered
out. Setting the parameter na.percent.max
(default
0.1
) means any ASE with the proportion of NA
above this threshold will be removed from the final matrix.
Plot this matrix of values in a heatmap:
library(pheatmap)
anno_col_df <- as.data.frame(colData(se.filtered))
anno_col_df <- anno_col_df[, 1, drop=FALSE]
pheatmap(mat, annotation_col = anno_col_df)
Navigate to Display
, then Heatmap
in the menu
side bar. After relaxing the event filters as per the prior sections, a
heatmap will be automatically generated:
The heatmap can be customized as follows:
Coverage plots visualize RNA-seq coverage in individual samples.
SpliceWiz uses its coverage normalization algorithm to visualize group
differences in PSIs.
SpliceWiz produces RNA-seq coverage plots of analysed samples. Coverage
data is compiled simultaneous to the IR and junction quantitation
performed by processBAM()
. This data is saved in “COV”
files, which is a BGZF compressed and indexed file. COV files show
compression and performance gains over BigWig files.
First, lets obtain a list of differential events with delta PSI > 5%:
res_edgeR <- ASE_edgeR(
se = se.filtered,
test_factor = "condition",
test_nom = "B",
test_denom = "A"
)
#> Sep 08 08:25:05 AM Performing edgeR contrast for included / excluded counts separately
#> Sep 08 08:25:06 AM Performing edgeR contrast for included / excluded counts together
res_edgeR.filtered <- res_edgeR[res_edgeR$abs_deltaPSI > 0.05,]
res_edgeR.filtered$EventName[1]
#> [1] "NSUN5/ENST00000252594_Intron2/clean"
We can see here the top differential event belongs to NSUN5. The first step to plotting this event is to create a data object that contains the requisite data for the gene NSUN5:
This step retrieves all coverage and junction data for NSUN5 (and
surrounding genes).
Next, we need to create event-specific data for the IR event in NSUN5 intron 2.
This step normalizes the coverage and junction data from the
viewpoint of NSUN5 intron 2.
The final step is to generate the plot:
plotView(
plotObj,
centerByEvent = TRUE, # whether the plot should be centered at the `Event`
trackList = list(1,2,3,4,5,6),
plotJunctions = TRUE
)
This plots all 6 individual coverage plots, each in its own
track.
In the covPlotObject which is returned by the above function, each
“track” is a sample. These can be viewed using the tracks()
function:
For convenience, users have the option of referring to tracks by their actual names, or by numbers. So, in the above example, the two parameters below are equivalent:
trackList = list(1,2)
To plot “normalized” coverage plots, where coverage is normalized to
transcript depth (i.e. sum of all transcripts = 1), set
normalizeCoverage = TRUE
. We can also include multiple
samples in each trace, so lets stack all samples in the same condition
in the same trace:
plotView(
plotObj,
centerByEvent = TRUE,
trackList = list(c(1,2,3), c(4,5,6)),
# Each list element contains a vector of track id's
normalizeCoverage = TRUE
)
NB: junction (sashimi) arcs are not plotted in tracks with more than 1 trace (to avoid cluttering)
To plot group coverage plots, first we need to generate a new
covPlotObject
. This is because the previous
covPlotObject
was generated on the basis of each track
being an individual sample. To generate a plot object where each track
represents a condition:
plotObj_group <- getPlotObject(
dataObj,
Event = res_edgeR.filtered$EventName[1],
condition = "condition",
tracks = c("A", "B")
)
# NB:
# when `condition` is not specified, tracks are assumed to be the same samples
# as that of the covDataObject
# when `condition` is specified, tracks must refer to the condition categories
# that are desired for the final plot
Note that there are several scenarios where a new covPlotObject is required:
*
-
unstranded)To generate the group coverage plot, with the two conditions on the same track:
Sometimes, we are interested in the differential coverage of exons but not of the intervening introns. Given introns are often much longer than exons, it is useful to plot by the exons of interest.
For example, to plot the skipped casette exon in TRA2B:
dataObj <- getCoverageData(se, Gene = "TRA2B", tracks = colnames(se))
plotObj <- getPlotObject(
dataObj,
Event = "SE:TRA2B-206-exon2;TRA2B-205-int1",
condition = "condition", tracks = c("A", "B")
)
plotView(
plotObj,
centerByEvent = TRUE,
trackList = list(c(1,2)),
filterByEventTranscripts = TRUE
)
NB: setting filterByEventTranscripts = TRUE
means only
transcripts involved in the specified splicing event are plotted in the
annotation
Note that the involved exons are in only a small area of the coverage plot. To zoom in on the exons, we first have to plot an “exon view” so the exons are labelled, and at the same time return their coordinates:
gr <- plotView(
plotObj,
centerByEvent = TRUE,
trackList = list(c(1,2)),
filterByEventTranscripts = TRUE,
showExonRanges = TRUE
)
By setting the parameter showExonRanges = TRUE
, the
plotView
function shows a plot with exons and their names
in the annotation track, and returns a GRanges object, with ranges named
by their corresponding exon names:
names(gr)
#> [1] "TRA2B-205-E3" "TRA2B-206-E4" "TRA2B-206-E3" "TRA2B-206-E2" "TRA2B-206-E1"
#> [6] "TRA2B-205-E2" "TRA2B-213-E2" "TRA2B-213-E1" "TRA2B-205-E1"
We can see in the above figure that he exons of interest are
c("TRA2B-206-E1", "TRA2B-206-E2", "TRA2B-206-E3")
. To plot
these in an “exons view” coverage plot:
The main coverage plot interface is as follows
The top bar contains controls to locate the genomic loci of interest:
In the plot control panel on the left hand side of the main plot area:
The main plot (19) is a plotly-based interactive plot. Users can zoom in to a genomic loci of interest using the zoom tool.
There are several options that appear if a specific condition is set:
As mentioned, when the “Exon Plot mode” is selected, an exons table is displayed where users can select one or more exon ranges can be selected, as shown below:
Selecting 2 or more exon ranges will trigger (after a 3 second delay) a static exon-window coverage plot to be generated.
Note that at the bottom of the left hand panel, users can save the interactive (top) plot, or the static exon-window coverage (bottom) plot to PDF as ggplot- based figures.
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.1 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: Etc/UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] splines stats4 stats graphics grDevices utils datasets
#> [8] methods base
#>
#> other attached packages:
#> [1] pheatmap_1.0.12 ggplot2_3.5.1
#> [3] DESeq2_1.45.3 SummarizedExperiment_1.35.1
#> [5] Biobase_2.65.1 MatrixGenerics_1.17.0
#> [7] matrixStats_1.4.0 GenomicRanges_1.57.1
#> [9] GenomeInfoDb_1.41.1 IRanges_2.39.2
#> [11] S4Vectors_0.43.2 DoubleExpSeq_1.1
#> [13] edgeR_4.3.14 limma_3.61.9
#> [15] fstcore_0.9.18 AnnotationHub_3.13.3
#> [17] BiocFileCache_2.13.0 dbplyr_2.5.0
#> [19] BiocGenerics_0.51.1 SpliceWiz_1.7.3
#> [21] NxtIRFdata_1.11.0 rmarkdown_2.28
#>
#> loaded via a namespace (and not attached):
#> [1] later_1.3.2 BiocIO_1.15.2
#> [3] bitops_1.0-8 filelock_1.0.3
#> [5] tibble_3.2.1 R.oo_1.26.0
#> [7] XML_3.99-0.17 lifecycle_1.0.4
#> [9] lattice_0.22-6 dendextend_1.17.1
#> [11] magrittr_2.0.3 plotly_4.10.4
#> [13] sass_0.4.9 jquerylib_0.1.4
#> [15] yaml_2.3.10 httpuv_1.6.15
#> [17] cowplot_1.1.3 DBI_1.2.3
#> [19] buildtools_1.0.0 RColorBrewer_1.1-3
#> [21] abind_1.4-5 zlibbioc_1.51.1
#> [23] rvest_1.0.4 purrr_1.0.2
#> [25] R.utils_2.12.3 RCurl_1.98-1.16
#> [27] rappdirs_0.3.3 seriation_1.5.6
#> [29] GenomeInfoDbData_1.2.12 maketools_1.3.0
#> [31] genefilter_1.87.0 annotate_1.83.0
#> [33] DelayedMatrixStats_1.27.3 codetools_0.2-20
#> [35] DelayedArray_0.31.11 DT_0.33
#> [37] xml2_1.3.6 tidyselect_1.2.1
#> [39] farver_2.1.2 UCSC.utils_1.1.0
#> [41] rhandsontable_0.3.8 viridis_0.6.5
#> [43] TSP_1.2-4 shinyWidgets_0.8.6
#> [45] webshot_0.5.5 GenomicAlignments_1.41.0
#> [47] jsonlite_1.8.8 fst_0.9.8
#> [49] survival_3.7-0 iterators_1.0.14
#> [51] foreach_1.5.2 tools_4.4.1
#> [53] progress_1.2.3 Rcpp_1.0.13
#> [55] glue_1.7.0 gridExtra_2.3
#> [57] SparseArray_1.5.31 xfun_0.47
#> [59] dplyr_1.1.4 ca_0.71.1
#> [61] HDF5Array_1.33.6 numDeriv_2016.8-1.1
#> [63] shinydashboard_0.7.2 withr_3.0.1
#> [65] BiocManager_1.30.25 fastmap_1.2.0
#> [67] rhdf5filters_1.17.0 fansi_1.0.6
#> [69] digest_0.6.37 R6_2.5.1
#> [71] mime_0.12 colorspace_2.1-1
#> [73] GO.db_3.19.1 RSQLite_2.3.7
#> [75] R.methodsS3_1.8.2 utf8_1.2.4
#> [77] tidyr_1.3.1 generics_0.1.3
#> [79] data.table_1.16.0 rtracklayer_1.65.0
#> [81] prettyunits_1.2.0 httr_1.4.7
#> [83] htmlwidgets_1.6.4 S4Arrays_1.5.7
#> [85] pkgconfig_2.0.3 gtable_0.3.5
#> [87] blob_1.2.4 registry_0.5-1
#> [89] XVector_0.45.0 sys_3.4.2
#> [91] htmltools_0.5.8.1 fgsea_1.31.0
#> [93] scales_1.3.0 ompBAM_1.9.1
#> [95] png_0.1-8 knitr_1.48
#> [97] rjson_0.2.22 curl_5.2.2
#> [99] cachem_1.1.0 rhdf5_2.49.0
#> [101] BiocVersion_3.20.0 parallel_4.4.1
#> [103] AnnotationDbi_1.67.0 restfulr_0.0.15
#> [105] pillar_1.9.0 grid_4.4.1
#> [107] vctrs_0.6.5 promises_1.3.0
#> [109] shinyFiles_0.9.3 xtable_1.8-4
#> [111] evaluate_0.24.0 locfit_1.5-9.10
#> [113] cli_3.6.3 compiler_4.4.1
#> [115] Rsamtools_2.21.1 rlang_1.1.4
#> [117] crayon_1.5.3 heatmaply_1.5.0
#> [119] labeling_0.4.3 fs_1.6.4
#> [121] stringi_1.8.4 viridisLite_0.4.2
#> [123] BiocParallel_1.39.0 assertthat_0.2.1
#> [125] munsell_0.5.1 Biostrings_2.73.1
#> [127] lazyeval_0.2.2 Matrix_1.7-0
#> [129] ExperimentHub_2.13.1 BSgenome_1.73.0
#> [131] hms_1.1.3 patchwork_1.2.0
#> [133] sparseMatrixStats_1.17.2 bit64_4.0.5
#> [135] Rhdf5lib_1.27.0 KEGGREST_1.45.1
#> [137] statmod_1.5.0 shiny_1.9.1
#> [139] highr_0.11 memoise_2.0.1
#> [141] bslib_0.8.0 fastmatch_1.1-4
#> [143] bit_4.0.5