Title: | Hits Selection for Synthetic Lethal RNAi Screen Data |
---|---|
Description: | Select hits from synthetic lethal RNAi screen data. For example, consider two identical cell lines, except that one gene is knocked down in one of them. The aim is to find genes that produce a stronger lethal effect when they are knocked down further by siRNA. Quality control and various visualisation tools are implemented. Four different algorithms can be used to pick the interesting hits. This package is designed for 384-well plates, but may apply to other platforms with proper configuration. |
Authors: | Chunxuan Shao [aut, cre] |
Maintainer: | Chunxuan Shao <[email protected]> |
License: | GPL-3 |
Version: | 2.7.0 |
Built: | 2024-10-31 05:35:20 UTC |
Source: | https://github.com/bioc/synlet |
Calculate the B-score for plates belonging to the same master plate. Positive / negative controls are removed from the calculation.
bScore(masterPlate, dta, treatment, control, outFile = FALSE)
masterPlate |
a master plate to be normalized. |
dta |
synthetic lethal RNAi screen data. |
treatment |
the treatment experiment condition in EXPERIMENT_MODIFICATION |
control |
the control experiment condition in EXPERIMENT_MODIFICATION. |
outFile |
should the calculated B-score files be written to the current folder? File names are (masterPlate).bscore.csv. |
A list containing the B-scores for each master plate; treatment plates are the first columns, followed by control plates.
Brideau, C., Gunter, B., Pikounis, B. & Liaw, A. Improved statistical methods for hit selection in high-throughput screening. J. Biomol. Screen. 8, 634-647 (2003).
data(example_dt)
res <- sapply(unique(example_dt$MASTER_PLATE), bScore, example_dt, treatment = "treatment", control = "control", simplify = FALSE)
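The B-score of Brideau et al. can be illustrated on a single plate with base R alone. The following is a minimal sketch, not the package's implementation: the plate matrix is simulated, the names `plate`, `mp` and `b_score` are hypothetical, and the removal of control wells described above is not reproduced.

```r
# Minimal B-score sketch for one plate, using base R only.
set.seed(1)

# Hypothetical 16 x 24 readout matrix (one 384-well plate).
plate <- matrix(rnorm(16 * 24, mean = 100, sd = 10), nrow = 16, ncol = 24)
plate <- plate + (1:16) * 2  # artificial row gradient to be removed

# Tukey's two-way median polish estimates row and column effects.
mp <- medpolish(plate, trace.iter = FALSE)

# B-score: median-polish residuals scaled by their median absolute deviation.
b_score <- mp$residuals / mad(mp$residuals)

dim(b_score)
```

Because the residuals are divided by their own MAD, the resulting scores have a MAD of one, which makes plates with different noise levels comparable.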
A dataset containing synthetic lethal RNAi screen data to show how functions work. The variables are as follows (all are character except READOUT):
data(example_dt)
A data.table with 4320 rows and 8 variables
PLATE. plate names.
MASTER_PLATE. master plate names.
WELL_CONTENT_NAME. siRNA targets of wells.
EXPERIMENT_TYPE. sample, negative/positive controls.
EXPERIMENT_MODIFICATION. experiment conditions, "treatment" or "control".
ROW_NAME. row names of plates.
COL_NAME. column names of plates.
READOUT. screen results.
A data.table containing RNAi screen data; the READOUT values have no real biological meaning.
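To make the column layout concrete, the sketch below builds a tiny table with the same eight columns. All values are invented for illustration, including the well names and the EXPERIMENT_TYPE labels, and a plain data.frame is used instead of a data.table to keep the example free of dependencies.

```r
# Hypothetical toy table mirroring the eight columns of example_dt.
wells <- expand.grid(ROW_NAME = c("A", "B"), COL_NAME = c("1", "2", "3"),
                     stringsAsFactors = FALSE)

toy <- data.frame(
  PLATE = "toy_plate_1",                 # one plate only
  MASTER_PLATE = "P001",
  WELL_CONTENT_NAME = c("negative siRNA", "positive siRNA",
                        "GENE1 si1", "GENE1 si2", "GENE2 si1", "GENE2 si2"),
  EXPERIMENT_TYPE = c("negative control", "positive control",
                      rep("sample", 4)), # labels are assumptions
  EXPERIMENT_MODIFICATION = "treatment",
  ROW_NAME = wells$ROW_NAME,
  COL_NAME = wells$COL_NAME,
  READOUT = c(100, 12, 55, 60, 98, 102), # the only numeric column
  stringsAsFactors = FALSE
)

str(toy)
```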
Select hits based on median ± k*MAD; by default k is 3.
madSelect(masterPlate, dat, k = 3, treatment, control, outFile = FALSE, normMethod = "PLATE")
masterPlate |
the master plate to analyze |
dat |
synthetic lethal RNAi screen data |
k |
cutoff for selecting hits, default is three |
treatment |
the treatment condition in EXPERIMENT_MODIFICATION |
control |
the control condition in EXPERIMENT_MODIFICATION |
outFile |
whether to write the median-normalized results to a file |
normMethod |
normalization method to be used. If "PLATE", the raw readouts are normalized by the plate median; otherwise they are normalized by the median of the provided control siRNA. |
A data.frame containing the hit selection results.
MASTER_PLATE: location of siRNA
treat_cont_ratio: ratio of treatment / control
treat_median: median value of treatment plates
control_median: median value of control plates
Hits: Is this siRNA a hit?
Chung, N. et al. Median absolute deviation to improve hit selection for genome-scale RNAi screens. J. Biomol. Screen. 13, 149-158 (2008).
data(example_dt)
res <- sapply(unique(example_dt$MASTER_PLATE), madSelect, example_dt, control = "control", treatment = "treatment", simplify = FALSE)
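The median ± k*MAD rule can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the readouts are simulated, the helper `norm_by_plate_median` is hypothetical, and only the lower tail is flagged as a hit on the assumption that a stronger lethal effect means a lower treatment/control ratio.

```r
# Sketch of plate-median normalization followed by median - k*MAD selection.
set.seed(42)
n <- 200  # hypothetical number of siRNAs on one master plate

# Simulated raw readouts: 3 treatment plates and 3 control plates (columns).
raw_treat <- matrix(rnorm(n * 3, mean = 100, sd = 5), ncol = 3)
raw_cont  <- matrix(rnorm(n * 3, mean = 100, sd = 5), ncol = 3)
raw_treat[1:3, ] <- raw_treat[1:3, ] * 0.4  # three planted lethal siRNAs

# Analogue of normMethod = "PLATE": divide each plate by its own median.
norm_by_plate_median <- function(m) sweep(m, 2, apply(m, 2, median), "/")
treat <- norm_by_plate_median(raw_treat)
cont  <- norm_by_plate_median(raw_cont)

# Per-siRNA medians across replicate plates and their ratio.
treat_median   <- apply(treat, 1, median)
control_median <- apply(cont, 1, median)
ratio <- treat_median / control_median

# Cutoffs at median +/- k * MAD of the ratios.
k <- 3
lower <- median(ratio) - k * mad(ratio)
upper <- median(ratio) + k * mad(ratio)

hits <- ratio < lower  # assumption: only reduced ratios count as hits
which(hits)
```

The median and MAD are used instead of the mean and standard deviation because they are far less sensitive to the true hits themselves, which would otherwise inflate the cutoff.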
Put all individual plates in one graph, values are the readout in experiments.
plateHeatmap(dta, base_size = 12, heatmap_col = NULL)
dta |
synthetic lethal RNAi screen data |
base_size |
basic font size used for x/y axis and title for heatmaps |
heatmap_col |
color function generated by colorRampPalette. |
a ggplot object
data(example_dt)
plateHeatmap(example_dt)
Select hits by the rank product method, comparing treatment and control.
rankProdHits(masterPlate, dta, treatment, control, normMethod = "PLATE")
masterPlate |
the master plate to be analyzed |
dta |
synthetic lethal RNAi screen data |
treatment |
the treatment condition in EXPERIMENT_MODIFICATION |
control |
the control condition in EXPERIMENT_MODIFICATION |
normMethod |
normalization method to be used. If "PLATE", the raw readouts are normalized by the plate median; otherwise the provided control siRNA is used |
A list containing the results of the rank product method for each master plate.
MASTER_PLATE: location of siRNA
pvalue_treat_lowerthan_cont: p-value for the hypothesis that treatment has lower normalized readout compared to control
FDR_treat_lowerthan_cont: FDR value
treat_cont_log2FC: log2 fold change of treatment / control
Breitling, R., Armengaud, P., Amtmann, A. & Herzyk, P. Rank products: a simple, yet powerful, new method to detect differentially regulated genes in replicated microarray experiments. FEBS Lett 573, 83-92 (2004).
Hong, F. et al. RankProd: a bioconductor package for detecting differentially expressed genes in meta-analysis. Bioinformatics 22, 2825-2827 (2006).
data(example_dt)
res <- sapply(unique(example_dt$MASTER_PLATE), rankProdHits, example_dt, control = "control", treatment = "treatment", simplify = FALSE)
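The core idea of the rank product statistic from Breitling et al. is the geometric mean of an siRNA's ranks across replicates. The sketch below shows only that statistic on simulated log2 fold changes; the p-values and FDR values listed above are not reproduced here, and the variable names are hypothetical.

```r
# Sketch of the rank product statistic on simulated data (base R only).
set.seed(7)
n <- 100  # hypothetical number of siRNAs

# Simulated log2(treatment / control) for 3 replicate pairs (columns).
log2fc <- matrix(rnorm(n * 3, mean = 0, sd = 0.3), ncol = 3)
log2fc[1:2, ] <- log2fc[1:2, ] - 2  # two planted siRNAs with lower treatment signal

# Rank within each replicate; rank 1 is the most negative fold change.
ranks <- apply(log2fc, 2, rank)

# Rank product: geometric mean of the ranks across replicates.
rank_prod <- exp(rowMeans(log(ranks)))

# siRNAs with the smallest rank products are the strongest candidates.
head(order(rank_prod), 2)
```

An siRNA needs to rank consistently near the top in every replicate to obtain a small rank product, which makes the statistic robust to a single outlying plate.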
Select hits by the redundant siRNA activity (RSA) method. This is a wrapper function around RSA 1.8 by Yingyao Zhou.
rsaHits(dta, treatment, control, normMethod = "PLATE", LB, UB, revHits = FALSE, Bonferroni = FALSE, outputFile = "RSAhits.csv", scoreFile = "RSA_score.csv")
dta |
synthetic lethal RNAi screen data |
treatment |
the treatment condition in EXPERIMENT_MODIFICATION |
control |
the control condition in EXPERIMENT_MODIFICATION |
normMethod |
normalization methods. If "PLATE", then values are normalized by plate median, otherwise use the provided control siRNA |
LB |
lower bound |
UB |
upper bound |
revHits |
reverse hit picking; by default, the lower the score the better |
Bonferroni |
conceptually useful when there are different numbers of siRNAs per gene, default FALSE |
outputFile |
output file name |
scoreFile |
name of the score file to be written under the current folder |
A result file written to the current folder.
Gene_ID,Well_ID,Score: columns from input spreadsheet
LogP: OPI p-value in log10, i.e., -2 means 0.01
OPI_Hit: whether the well is a hit, 1 means yes, 0 means no
#hitWell: number of hit wells for the gene
#totalWell: total number of wells for the gene. If gene A has three wells w1, w2 and w3, and w1 and w2 are hits, #totalWell should be 3, #hitWell should be 2, w1 and w2 should have OPI_Hit set as 1 and w3 should have OPI_Hit set as 0.
OPI_Rank: ranking column to sort all wells for hit picking
Cutoff_Rank: ranking column to sort all wells based on Score in the simple activity-based method
Note: a rank value of 999999 means the well is not a hit
Koenig, R. et al. A probability-based approach for the analysis of large-scale RNAi screens. Nat Methods 4, 847-849 (2007).
data(example_dt)
rsaHits(example_dt, treatment = "treatment", control = "control", normMethod = "PLATE", LB = 0.2, UB = 0.8, revHits = FALSE, Bonferroni = FALSE, outputFile = "RSAhits.csv")
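The LogP and #hitWell columns can be understood through the hypergeometric test at the heart of the method described by Koenig et al. The following is a simplified sketch in that spirit, not the RSA 1.8 code: the scores are synthetic, the helper `opi_gene` is hypothetical, and the LB/UB bounds and the Bonferroni option are left out.

```r
# Simplified sketch of the per-gene hypergeometric scan (base R only).
N <- 150  # hypothetical total number of wells: 50 genes x 3 siRNAs

scores <- data.frame(
  Gene_ID = rep(paste0("g", 1:50), each = 3),
  Score = 0.5 + ((seq_len(N) * 37) %% N) / N,  # deterministic filler scores
  stringsAsFactors = FALSE
)
# Gene g1: two active wells (low score) and one inactive well.
scores$Score[scores$Gene_ID == "g1"] <- c(0.10, 0.15, 0.95)

# Rank all wells together; rank 1 is the lowest (most active) score.
scores$rank <- rank(scores$Score, ties.method = "first")

opi_gene <- function(r, N) {
  r <- sort(r)
  n <- length(r)
  # P(at least i of the gene's n wells fall within the top r[i] ranks).
  p <- phyper(seq_len(n) - 1, m = n, n = N - n, k = r, lower.tail = FALSE)
  c(LogP = log10(min(p)), hitWell = which.min(p), totalWell = n)
}

res <- sapply(split(scores$rank, scores$Gene_ID), opi_gene, N = N)
res[, "g1"]
```

For g1 the smallest p-value occurs at the second well, so two of its three wells count as hits, matching the w1/w2/w3 situation described for #hitWell and #totalWell.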
Produce a single plot of the readouts of each plate, with the option of highlighting specific signals, such as positive/negative controls.
scatterPlot(dta, scatter_colour = rainbow(10), controlOnly = FALSE, control_name = NULL)
dta |
synthetic lethal RNAi screen data |
scatter_colour |
colour for different signals |
controlOnly |
whether or not to plot control wells only |
control_name |
names of control siRNAs. |
a ggplot object
data(example_dt)
scatterPlot(example_dt, control_name = c("PLK1 si1", "scrambled control si1", "lipid only"))
Plot the normalized RNAi screen data, raw data, control signals and Z' factor.
siRNAPlot(gene, dta, controlsiRNA, FILEPATH = ".", colour = rainbow(10), zPrimeMed, zPrimeMean, treatment, control, normMethod = c("PLATE"), save_plot = FALSE, width = 15, height = 14)
gene |
gene symbol, case sensitive |
dta |
synthetic lethal RNAi screen data |
controlsiRNA |
a vector of one or more control siRNAs, including positive/negative controls |
FILEPATH |
path to store the figure |
colour |
colour used in graphs |
zPrimeMed |
Z' factor based on the median |
zPrimeMean |
Z' factor based on the mean |
treatment |
the treatment condition in EXPERIMENT_MODIFICATION |
control |
the control condition in EXPERIMENT_MODIFICATION |
normMethod |
normalization methods; can be "PLATE" and/or the names of negative controls |
save_plot |
whether to save a PNG file in the working directory. |
width |
width of the plot |
height |
height of the plot |
A list of ggplot2 objects, which can be plotted individually.
data(example_dt)
zF_mean <- zFactor(example_dt, negativeCon = "scrambled control si1", positiveCon = "PLK1 si1")
zF_med <- zFactor(example_dt, negativeCon = "scrambled control si1", positiveCon = "PLK1 si1", useMean = FALSE)
p01 <- siRNAPlot("AAK1", example_dt,
                 controlsiRNA = c("lipid only", "scrambled control si1"),
                 FILEPATH = ".", zPrimeMed = zF_med, zPrimeMean = zF_mean,
                 treatment = "treatment", control = "control",
                 normMethod = c("PLATE", "lipid only", "scrambled control si1"))
Select hits by Student's t-test, using B-scores from treatment and control plates.
tTest(mtx, n_treat, n_cont)
mtx |
B-score matrix. |
n_treat |
number of treatment plates |
n_cont |
number of control plates |
A list containing the Student's t-test results for each master plate
pvalue: p-value of the t-test
Treat_Cont: difference in bscore: treatment - control
p_adj: BH adjusted p-value
Birmingham, A. et al. Statistical methods for analysis of high-throughput RNA interference screens. Nat Methods 6, 569-575 (2009).
data(example_dt)
bscore_res <- sapply(unique(example_dt$MASTER_PLATE), bScore, example_dt, control = "control", treatment = "treatment", simplify = FALSE)
tTest(bscore_res$P001, 3, 3)
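A row-wise t-test with BH adjustment, producing the three columns listed above, might look like the sketch below. It assumes the column order stated for the B-score matrix (treatment plates first, then control plates), uses simulated scores, and relies on the Welch default of `t.test`; whether the package uses equal or unequal variances is not stated here, so treat that choice as an assumption.

```r
# Sketch of a per-siRNA t-test on a B-score matrix (base R only).
set.seed(11)
n_treat <- 3
n_cont  <- 3

# Simulated B-scores: rows are siRNAs, treatment columns first, then control.
mtx <- matrix(rnorm(50 * (n_treat + n_cont)), nrow = 50,
              dimnames = list(paste0("si", 1:50), NULL))
mtx[1, seq_len(n_treat)] <- mtx[1, seq_len(n_treat)] - 8  # planted hit

t_row <- function(x) {
  tt <- t.test(x[seq_len(n_treat)], x[n_treat + seq_len(n_cont)])
  c(pvalue = tt$p.value,
    Treat_Cont = unname(tt$estimate[1] - tt$estimate[2]))
}

res <- as.data.frame(t(apply(mtx, 1, t_row)))
res$p_adj <- p.adjust(res$pvalue, method = "BH")  # Benjamini-Hochberg

head(res[order(res$pvalue), ], 3)
```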
Calculate the Z and Z' factor for each plate.
zFactor(dta, negativeCon, positiveCon, useMean = TRUE)
dta |
synthetic lethal RNAi screen data. |
negativeCon |
the negative control used in the WELL_CONTENT_NAME. |
positiveCon |
the positive control used in the WELL_CONTENT_NAME. |
useMean |
use the mean to calculate the Z factor and Z' factor by default; otherwise use the median. |
A data.frame containing the Z factor and Z' factor
Zhang, J.H., Chung, T.D. & Oldenburg, K.R. A simple statistical parameter for use in evaluation and validation of high throughput screening assays. J. Biomol. Screen. 4, 67-73 (1999).
Birmingham, A. et al. Statistical methods for analysis of high-throughput RNA interference screens. Nat Methods 6, 569-575 (2009).
data(example_dt)
res <- zFactor(example_dt, negativeCon = "scrambled control si1", positiveCon = "PLK1 si1")
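The Z' factor of Zhang et al. compares the spread of the positive and negative controls with the distance between them. The sketch below computes it for one plate from made-up control readouts; `z_prime` is a hypothetical helper, and pairing the median with the MAD in the `useMean = FALSE` branch is an assumption about how a median-based variant would be defined.

```r
# Sketch of the Z' factor for a single plate (base R only).
z_prime <- function(pos, neg, useMean = TRUE) {
  if (useMean) {
    centre <- mean
    spread <- sd
  } else {
    centre <- median  # robust variant (assumed pairing: median with MAD)
    spread <- mad
  }
  1 - 3 * (spread(pos) + spread(neg)) / abs(centre(pos) - centre(neg))
}

# Made-up control readouts from one plate.
positive <- c(10, 12, 11, 9, 13)      # e.g. a lethal positive control
negative <- c(100, 104, 96, 98, 102)  # e.g. a scrambled negative control

zp_mean <- z_prime(positive, negative)
zp_med  <- z_prime(positive, negative, useMean = FALSE)
c(mean = zp_mean, median = zp_med)
```

Values close to 1 indicate well separated controls; Zhang et al. regard a Z' factor above 0.5 as an excellent assay.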