Title: | Exon Based Strategy for Expression Analysis of genes |
---|---|
Description: | Calculates differential expression of genes based on exon counts of genes obtained from RNA-seq sequencing data. |
Authors: | Arfa Mehmood, Asta Laiho, Laura L. Elo |
Maintainer: | Arfa Mehmood <[email protected]> |
License: | GPL-2 |
Version: | 1.35.0 |
Built: | 2024-10-30 05:28:30 UTC |
Source: | https://github.com/bioc/EBSEA |
EBSEA takes the filtered raw exon-level read counts as input, normalizes and performs a two-group statistical comparison with DESeq2. The exon-level results are aggregated to the gene-level using empirical Brown’s method. The samples in the two groups can be paired.
EBSEA(data, columnData, design, test = "Wald", contrasts = NULL, plot = FALSE)
EBSEA(data, columnData, design, test = "Wald", contrasts = NULL, plot = FALSE)
data |
A dataframe of raw exon-counts |
columnData |
A dataframe indicated the groups of the samples. |
design |
Design matrix (see more information od design matrixes in DESeq2 reference manual) |
test |
The statistical test to be carried out. It can be either Wald or Likelihood Ratio Test. For further details about the methods you can look into DESeq2 refernce manual. Default: Wald |
contrasts |
a character vector with exactly three elements: the name of a factor in the design formula, the name of the numerator level for the fold change, and the name of the denominator level for the fold change Default: NULL |
plot |
A logical value indicating a volcano plot is produced. Default: FALSE |
The function returns a list containing containing exon and gene-level results. ExonTable is a data frame containing an average expression, log2 fold-change, p-value and adjusted p-value. GeneTable is a data frame containing the gene-level p-value, and adjusted-value. Other returned elements include the raw and normalised exon-level read counts, group information and design matrix used.
Laiho, A., & Elo, L. L. (2014). A note on an exon-based strategy to identify differentially expressed genes in RNA-seq experiments. PloS One, 9(12), e115964.
# The exon-based analysis for unpaired samples can be performed as follows: data(exonCounts) group <- data.frame('group' = as.factor(c('G1', 'G1', 'G1', 'G2', 'G2', 'G2', 'G2'))) row.names(group) <- colnames(exonCounts) design <- ~group ebsea.out <- EBSEA(exonCounts, group, design) # The exon-based analysis for paired samples with contrast provided can be performed as follows: data(exonCounts) group <- data.frame('group' = as.factor(c('G1', 'G1', 'G1', 'G2', 'G2', 'G2', 'G2')), 'paired' = as.factor(c(1,2,3,1,2,3,3))) row.names(group) <- colnames(exonCounts) design <- ~group contrastInfo <- c('group', 'G2', 'G1') ebsea.out <- EBSEA(exonCounts, group, design, contrasts = contrastInfo)
# The exon-based analysis for unpaired samples can be performed as follows: data(exonCounts) group <- data.frame('group' = as.factor(c('G1', 'G1', 'G1', 'G2', 'G2', 'G2', 'G2'))) row.names(group) <- colnames(exonCounts) design <- ~group ebsea.out <- EBSEA(exonCounts, group, design) # The exon-based analysis for paired samples with contrast provided can be performed as follows: data(exonCounts) group <- data.frame('group' = as.factor(c('G1', 'G1', 'G1', 'G2', 'G2', 'G2', 'G2')), 'paired' = as.factor(c(1,2,3,1,2,3,3))) row.names(group) <- colnames(exonCounts) design <- ~group contrastInfo <- c('group', 'G2', 'G1') ebsea.out <- EBSEA(exonCounts, group, design, contrasts = contrastInfo)
exonCounts consists of a subset of the exon counts from the pasilla dataset.
data("exonCounts")
data("exonCounts")
A data frame with 1000 rows and 7 variables
Exoncounts from Pasilla package https://bioconductor.org/packages/release/data/experiment/html/pasilla.html
Huber W, Reyes A (2020). pasilla: Data package with per-exon and per-gene read counts of RNA-seq samples of Pasilla knock-down by Brooks et al., Genome Research 2011
Filtering of exons based on their expression levels
filterCounts(x, mean = 1, exonCount = 1)
filterCounts(x, mean = 1, exonCount = 1)
x |
A numeric dataframe of exon counts across the samples. Exon number in format GeneName:Exonnumber should be indicated in the row name and sample names as column names. |
mean |
Exons with average count value across the dataset less than mean are filtered out. Default: 1 |
exonCount |
After filtering the individual exons, only genes with at least the given number of exons remaining will be retained. Default: 1 |
A dataframe of filtered counts of exons
data(exonCounts) res <- filterCounts(exonCounts)
data(exonCounts) res <- filterCounts(exonCounts)
Produces a visualization summarizing the normalized read count in each sample group and expression difference across the expressed exons.Top panel contains the log2 fold-change for each expressed exon. Asterisk denotes the significance level (*: < 0.05, **: < 0.01). Bottom panel shows the averaged normalized read count for each sample group. The title of the figure shows the gene name and the adjusted gene-level p-value (fdr)
visualizeGenes(gene, ebsea.out)
visualizeGenes(gene, ebsea.out)
gene |
Gene name matching the input data. |
ebsea.out |
Plots the mean count and fold-change the exons of the specified gene. |
Plots the mean count and fold-change across the exons of the specified gene.
data(exonCounts) group <- data.frame('group' = as.factor(c('G1', 'G1', 'G1', 'G2', 'G2', 'G2', 'G2'))) row.names(group) <- colnames(exonCounts) design <- ~group ebsea.out <- EBSEA(exonCounts, group, design) visualizeGenes('FBgn0000017', ebsea.out)
data(exonCounts) group <- data.frame('group' = as.factor(c('G1', 'G1', 'G1', 'G2', 'G2', 'G2', 'G2'))) row.names(group) <- colnames(exonCounts) design <- ~group ebsea.out <- EBSEA(exonCounts, group, design) visualizeGenes('FBgn0000017', ebsea.out)