Package 'PAIRADISE'

Title: PAIRADISE: Paired analysis of differential isoform expression
Description: This package implements the PAIRADISE procedure for detecting differential isoform expression between matched replicates in paired RNA-Seq data.
Authors: Levon Demirdjian, Ying Nian Wu, Yi Xing
Maintainer: Qiang Hu <[email protected]>, Levon Demirdjian <[email protected]>
License: MIT + file LICENSE
Version: 1.23.0
Built: 2024-10-31 06:09:06 UTC
Source: https://github.com/bioc/PAIRADISE

Help Index


PDseDataSet counts

Description

Extracts the counts from a PDseDataSet object.

Usage

counts(object)

Arguments

object

A PDseDataSet object

Value

A counts matrix


pairadise

Description

Primary function of the PAIRADISE package. Analyzes matched pairs for differences in isoform expression. Uses parallel processing to speed up computation.

Usage

pairadise(
  pdat,
  nIter = 100,
  tol = 10^(-2),
  pseudocount = 0,
  seed = 12321,
  equal.variance = FALSE,
  numCluster = 2,
  BPPARAM = MulticoreParam(numCluster)
)

Arguments

pdat

A PDseDataSet object

nIter

Positive integer. Specifies the maximum number of iterations allowed for the optimization algorithm. Default is nIter = 100.

tol

Positive number. Specifies the tolerance level for terminating the optimization algorithm, defined as the difference in log-likelihood ratios between iterations. Default is tol = 10^(-2).

pseudocount

Non-negative number. Specifies a pseudocount added to each count at the beginning of the analysis. Default is pseudocount = 0.

seed

Integer used to set the random seed. Default is seed = 12321.

equal.variance

Are the group variances assumed equal? Default value is FALSE.

numCluster

Number of clusters to use for parallel computing. Default is numCluster = 2.

BPPARAM

Parallel back-end parameters from the BiocParallel package. Default is MulticoreParam(numCluster).

Details

This is the primary function of the PAIRADISE package that implements the PAIRADISE algorithm.
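How nIter and tol interact can be illustrated with a toy stopping rule. The sketch below is not the PAIRADISE optimizer: it fits a single binomial proportion by Newton steps on the logit scale and, as a simplifying assumption, monitors the change in the log-likelihood (rather than a log-likelihood ratio) between iterations. The data values and all variable names other than nIter and tol are hypothetical.

```r
# Toy data (hypothetical): 30 successes in 100 trials
x <- 30
n <- 100

# Binomial log-likelihood on the logit scale (constant term dropped)
loglik <- function(theta) {
  p <- plogis(theta)
  x * log(p) + (n - x) * log(1 - p)
}

nIter <- 100       # maximum number of iterations
tol   <- 10^(-2)   # stop once the objective changes by less than this

theta  <- 0
ll_old <- loglik(theta)
for (iter in seq_len(nIter)) {
  p <- plogis(theta)
  # Newton step: gradient is x - n * p, curvature is n * p * (1 - p)
  theta  <- theta + (x - n * p) / (n * p * (1 - p))
  ll_new <- loglik(theta)
  if (abs(ll_new - ll_old) < tol) break   # converged before reaching nIter
  ll_old <- ll_new
}

p_hat <- plogis(theta)   # close to x / n
```

A smaller tol demands more iterations before stopping, while nIter caps the work done when convergence is slow.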

Value

A PDseDataSet object containing the outputs of the PAIRADISE algorithm.

Examples

#############################
## Example: Simulated data ##
#############################

set.seed(12345)
data("sample_dataset")
pdat <- PDseDataSetFromMat(sample_dataset)
pdat <- pairadise(pdat, numCluster = 4)
results(pdat)

PAIRADISE: Detecting allele-specific alternative splicing from population-scale RNA-seq data

Description

We introduce PAIRADISE (PAIred Replicate analysis of Allelic DIfferential Splicing Events), a method for detecting allele-specific alternative splicing (ASAS) from RNA-seq data. PAIRADISE uses a statistical model that aggregates ASAS signals across multiple individuals in a population. It formulates ASAS detection as a statistical problem for identifying differential alternative splicing from RNA-seq data with paired replicates. The PAIRADISE statistical model is applicable to many forms of allele-specific isoform variation (e.g. RNA editing), and can be used as a generic statistical model for RNA-seq studies involving paired replicates.

See Also

pairadise


PDseDataSet object and constructor

Description

'PDseDataSet' is a subclass of 'SummarizedExperiment'. It can be used to store inclusion and skipping splicing counts for samples in a paired design.

Usage

PDseDataSet(counts, design, lengths)

Arguments

counts

The counts of splicing events as a 3-dimensional array, with the inclusion and skipping counts for each sample stacked along the third dimension.

design

The paired design data.frame, with a sample column for sample IDs and a group column for the design factor.

lengths

A data.frame with two columns, iLen and sLen, giving the effective lengths of the inclusion and skipping isoforms.

Value

A PDseDataSet object

Examples

icount <- matrix(1:4, 1)
scount <- matrix(5:8, 1)
acount <- abind::abind(icount, scount, along = 3)
design <- data.frame(sample = rep(c("s1", "s2"), 2),
                     group = rep(c("T", "N"), each = 2))
lens <- data.frame(sLen = 1L, iLen = 2L)
PDseDataSet(acount, design, lens)
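The layout of the counts argument can also be sketched in base R without abind. The snippet below builds an events x samples x 2 array in the same order as the example above (inclusion first, skipping second). The count values and the dimension labels are hypothetical and only illustrate the shape; whether PDseDataSet uses dimnames is not assumed.

```r
# Hypothetical counts: 2 splicing events (rows) x 4 samples (columns)
icount <- matrix(c(10, 20, 12, 18, 30, 5, 28, 7), nrow = 2)   # inclusion
scount <- matrix(c(3, 40, 4, 35, 9, 50, 8, 45), nrow = 2)     # skipping

# Stack the two matrices along a third dimension: slice 1 holds the
# inclusion counts, slice 2 the skipping counts
acount <- array(
  c(icount, scount),
  dim = c(nrow(icount), ncol(icount), 2),
  dimnames = list(
    c("event1", "event2"),               # illustrative event labels
    c("s1.T", "s2.T", "s1.N", "s2.N"),   # illustrative sample labels
    c("inclusion", "skipping")           # illustrative slice labels
  )
)

dim(acount)                           # events, samples, count type
acount["event2", "s1.T", "skipping"]  # one skipping count
```

The column order mirrors the design in the example above: both samples of the first group, then the same samples in the second group.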

PDseDataSet from rMATS/PAIRADISE Mat format

Description

Creates a PDseDataSet object from a data frame in rMATS/PAIRADISE Mat format. The Mat format should have 7 columns, arranged as follows: Column 1 contains the ID of the alternative splicing events. Column 2 contains counts of isoform 1 corresponding to the first group. Column 3 contains counts of isoform 2 corresponding to the first group. Column 4 contains counts of isoform 1 corresponding to the second group. Column 5 contains counts of isoform 2 corresponding to the second group. Column 6 contains the effective length of isoform 1. Column 7 contains the effective length of isoform 2. Replicates in columns 2-5 should be separated by commas, e.g. "1623,432,6" for three replicates. The replicate order must be the same in every column so that pairs are matched correctly.

Usage

PDseDataSetFromMat(dat)

Arguments

dat

A data frame in Mat format.

Value

A PDseDataSet object

Examples

data("sample_dataset")
pdat <- PDseDataSetFromMat(sample_dataset)
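Before building a PDseDataSet, it can help to confirm that the comma-separated replicate strings in columns 2-5 line up. The base R sketch below does this on a small invented table in the 7-column layout described above. The helper split_counts and all data values are hypothetical, and this is not how PDseDataSetFromMat parses its input internally.

```r
# Hypothetical two-event table in the 7-column Mat layout (values invented)
mat <- data.frame(
  ExonID = c("ev1", "ev2"),
  I1 = c("1623,432,6", "10,20,30"),
  S1 = c("5,8,2", "1,2,3"),
  I2 = c("100,200,3", "11,19,33"),
  S2 = c("7,9,1", "2,2,4"),
  I_len = c(180, 150),
  S_len = c(90, 75),
  stringsAsFactors = FALSE
)

# Split one column of comma-separated replicate counts into numeric vectors
split_counts <- function(x) lapply(strsplit(x, ",", fixed = TRUE), as.numeric)

count_cols <- c("I1", "S1", "I2", "S2")
parsed <- lapply(mat[count_cols], split_counts)

# Number of replicates per event (rows) and per count column (columns)
n_reps <- sapply(parsed, lengths)

# Pairs can only be matched if every column has the same number of replicates
consistent <- apply(n_reps, 1, function(r) length(unique(r)) == 1)
```

Events where consistent is FALSE would need to be fixed or dropped before the analysis, since position within each string is what pairs the replicates.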

Extract results from a pairadise analysis

Description

Extracts the results from a PDseDataSet object returned by pairadise.

Usage

results(pdat, p.adj = "BH", sig.level = 0.01, details = FALSE)

Arguments

pdat

A PDseDataSet object from pairadise analysis

p.adj

The p-value adjustment method. Default is "BH".

sig.level

The significance cutoff for reported results. Default is sig.level = 0.01.

details

Whether to list detailed results.

Value

The function returns a results DataFrame containing:

testStats

Vector of test statistics for paired analysis.

p.value

Vector of p-values for each exon/event.

p.adj

Vector of adjusted p-values.

If details is TRUE, detailed parameter estimates for the constrained and unconstrained models are also returned.

Examples

data("sample_dataset")
pdat <- PDseDataSetFromMat(sample_dataset)
pdat <- pairadise(pdat)
results(pdat)
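The effect of an adjustment method and a significance cutoff can be illustrated in base R. The sketch below assumes that p.adj names a method of the kind accepted by stats::p.adjust, such as "BH", and that events are kept when the adjusted p-value does not exceed the cutoff. Both are assumptions about the behaviour, not a description of the internals of results. The p-values are invented.

```r
# Hypothetical raw p-values for five splicing events
p <- c(0.0002, 0.003, 0.03, 0.2, 0.6)

# Benjamini-Hochberg adjustment, as in p.adj = "BH"
p_adj <- p.adjust(p, method = "BH")

# Flag events at the default cutoff sig.level = 0.01
sig_level <- 0.01
sig <- p_adj <= sig_level

data.frame(p.value = p, p.adj = p_adj, significant = sig)
```

Here the third event has a raw p-value of 0.03 but an adjusted value of 0.05, which shows how the adjustment raises the bar when several events are tested at once.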

sample_dataset

Description

The CEU dataset was generated by analyzing allele-specific alternative splicing events in the GEUVADIS CEU data. Allele-specific reads were mapped onto alternative splicing events using rPGA (version 2.0.0). The allele-specific BAM files mapped onto the two haplotypes were then merged to detect alternative splicing events using rMATS (version 3.2.5).

The LUSC dataset was generated by analyzing the tumor versus adjacent control samples from TCGA LUSC RNA-seq data.

Usage

data(sample_dataset)

data(sample_dataset_CEU)

data(sample_dataset_LUSC)

Format

Each dataset has 7 columns, arranged as follows:

ExonID

Column 1 contains the ID of the alternative splicing events.

I1

Column 2 contains counts of isoform 1 corresponding to the first group.

S1

Column 3 contains counts of isoform 2 corresponding to the first group.

I2

Column 4 contains counts of isoform 1 corresponding to the second group.

S2

Column 5 contains counts of isoform 2 corresponding to the second group.

I_len

Column 6 contains the effective length of isoform 1.

S_len

Column 7 contains the effective length of isoform 2.
