Package 'pmm'

Title: Parallel Mixed Model
Description: The Parallel Mixed Model (PMM) approach is suitable for hit selection and cross-comparison of RNAi screens generated in experiments that are performed in parallel under several conditions. For example, we could think of the measurements or readouts from cells under RNAi knock-down, which are infected with several pathogens or which are grown from different cell lines.
Authors: Anna Drewek
Maintainer: Anna Drewek <[email protected]>
License: GPL-3
Version: 1.39.0
Built: 2024-10-31 03:31:00 UTC
Source: https://github.com/bioc/pmm

The PMM-Package

Description

This package contains R functions for fitting the Parallel Mixed Model and analyzing its results.

Details

The Parallel Mixed Model (PMM) approach is suitable for hit selection and cross-comparison of RNAi screens generated in experiments that are performed in parallel under several conditions. As an example, we could think of the measurements or readouts from cells under RNAi knock-down, which are infected with several pathogens or which are grown from different cell lines. PMM simultaneously takes into account all the knock-down effects in order to gain statistical power for the hit detection. As a special feature, PMM allows incorporating RNAi weights, which can be assigned according to additional information on the RNAis used or on the screening quality.
The following functions are contained in this R package:

pmm fits the PMM
hitheatmap visualizes the results of PMM
sharedness computes the sharedness

Author(s)

Anna Drewek <[email protected]>

References

Rämö, P., Drewek, A., Arrieumerlou, C., Beerenwinkel, N., Ben-Tekaya, H., Cardel, B., Casanova, A., Conde-Alvarez, R., Cossart, P., Csucs, G., Eicher, S., Emmenlauer, M., Greber, U., Hardt, W.-D., Helenius, A., Kasper, C., Kaufmann, A., Kreibich, S., Kuebacher, A., Kunszt, P., Low, S.H., Mercer, J., Mudrak, D., Muntwiler, S., Pelkmans, L., Pizarro-Cerda, J., Podvinec, M., Pujadas, E., Rinn, B., Rouilly, V., Schmich, F., Siebourg, J., Snijder, B., Stebler, M., Studer, G., Szczurek, E., Truttmann, M., von Mering, C., Vonderheit, A., Yakimovich, A., Buehlmann, P. and Dehio, C., Simultaneous analysis of large-scale RNAi screens for pathogen entry, BMC Genomics 15(1162): p.1471-2164. (2014)


Visualization of the PMM results

Description

This function visualizes the results of PMM.

Usage

hitheatmap(fit, threshold = 0.2, sharedness.score = FALSE,
                  main = "", na.action = "use", ...)

Arguments

fit

data frame returned by the pmm function.

threshold

threshold for the false discovery rate. Genes are counted as hits if their false discovery rate is below this threshold. Default is 0.2.

sharedness.score

logical value that indicates whether the sharedness score among the conditions should be additionally plotted. Default is FALSE.

main

the title at the top of the plot.

na.action

a character string that indicates how NAs in fit are handled. There are two options: "na.omit" or "use" (default). With "na.omit" the heat map is plotted for na.omit(fit); with "use" the heat map shows all data in fit.

...

further arguments passed to the plot and par functions.

Details

The heat map represents the effects c_cg estimated by PMM. Red indicates a positive c_cg coefficient, blue a negative one. The darker the color, the stronger the c_cg effect. The heat map contains only the genes whose false discovery rate is below the given threshold for at least one condition. A yellow star marks the hit genes in each condition. If sharedness.score = TRUE, an additional row is plotted that represents the strength of sharedness for a gene among the conditions. The darker the color, the stronger the sharedness. If na.action = "use", NAs are plotted in white and marked by "NA".
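
The selection rule described above can be sketched in base R on made-up numbers. This is only an illustration, not the code used by hitheatmap; the objects effects and fdr, and the gene and condition labels, are hypothetical.

```r
# Hypothetical c_cg effects and local FDR values: 4 genes x 3 conditions.
genes <- c("geneA", "geneB", "geneC", "geneD")
conds <- c("cond1", "cond2", "cond3")
effects <- matrix(c(0.8, -0.1,  0.3,  0.0,
                    0.6,  0.0, -0.7,  0.1,
                    0.9,  0.2, -0.2, -0.1),
                  nrow = 4, dimnames = list(genes, conds))
fdr <- matrix(c(0.05, 0.90, 0.50, 1.00,
                0.10, 1.00, 0.15, 0.80,
                0.02, 0.70, 0.60, 0.95),
              nrow = 4, dimnames = list(genes, conds))

threshold <- 0.2

# A gene/condition pair counts as a hit if its FDR is below the threshold.
is_hit <- fdr < threshold

# Only genes that are a hit in at least one condition are shown.
shown <- rowSums(is_hit) >= 1
effects_shown <- effects[shown, , drop = FALSE]

# Sign of the effect: red for positive, blue otherwise
# (zero effects are not treated specially in this sketch).
color <- ifelse(effects_shown > 0, "red", "blue")
```

In this toy example only geneA and geneC remain, because geneB and geneD have no FDR value below 0.2 in any condition.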

Value

A heat map

Author(s)

Anna Drewek <[email protected]>

Examples

data(kinome)
fit1 <- pmm(kinome,"InfectionIndex","weight_library")

hitheatmap(fit1, threshold = 0.4)
hitheatmap(fit1, threshold = 0.2, main = "Results PMM")
hitheatmap(fit1, sharedness.score = TRUE)

## NA-Handling
kinome$InfectionIndex[kinome$GeneID == 3611 & kinome$condition ==
"ADENO"] <- rep(NA,12)
fit2 <- pmm(kinome,"InfectionIndex","weight_library")
hitheatmap(fit2, main = "Results PMM with NA")

## Using par options
hitheatmap(fit1, sharedness.score = TRUE, cex.main = 2,
                main = "My modified plot", col.main = "white",
                col.axis = "white", cex.axis = 0.8, bg = "black",
                mar = c(7,6,4,6))

Example Data from InfectX

Description

Data from gene knock-down experiments performed with 11 individual siRNAs and one siRNA pool per gene for 8 different pathogens. The data was generated by the InfectX consortium.

Usage

data(kinome)

Format

The data frame contains the microscope image readouts of knock-down experiments for 826 kinases. For each gene, cells were targeted by a total of 12 independent siRNAs coming from three manufacturers: Ambion (3 siRNAs), Qiagen (4 siRNAs) and Dharmacon (4 siRNAs + 1 pool siRNA). All experiments were conducted for 8 different pathogens. Each row of the data frame corresponds to the result of one experiment.

GeneID ID of the gene that is knocked down.
GeneName Name of the gene that is knocked down.
company Company that provided the siRNA for knock-down.
siRNA Label to identify the different siRNA replicates that are used.
CellCount Normalized image readout describing the number of cells in the well.
InfectionIndex Normalized image readout describing the number of infected cells in the well.
weight_library Weight denoting the quality of the libraries. We assigned a higher weight to the Dharmacon Pooled and Ambion libraries (weight 2) than to the unpooled libraries Dharmacon and Qiagen (weight 1).
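
The weighting scheme described for weight_library can be sketched as a simple lookup. The library labels used below are hypothetical and may not match the labels stored in kinome; only the weights 1 and 2 are taken from the description above.

```r
# Weight 2 for the Ambion and pooled Dharmacon libraries, weight 1 otherwise.
# The label names are assumptions made for this sketch.
weight_by_library <- c(Ambion = 2, Qiagen = 1, Dharmacon = 1, Dharmacon_pooled = 2)

# Toy data frame with one row per siRNA experiment.
toy <- data.frame(
  library = c("Ambion", "Qiagen", "Dharmacon_pooled", "Dharmacon"),
  stringsAsFactors = FALSE
)

# Look up the weight of each row by its library label.
toy$weight_library <- unname(weight_by_library[toy$library])
toy
```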

Value

data.frame

Note

All of our screening data, including raw images, are available at the openBIS portal (http://infectx.ch/dataaccess).

References

Rämö, P., Drewek, A., Arrieumerlou, C., Beerenwinkel, N., Ben-Tekaya, H., Cardel, B., Casanova, A., Conde-Alvarez, R., Cossart, P., Csucs, G., Eicher, S., Emmenlauer, M., Greber, U., Hardt, W.-D., Helenius, A., Kasper, C., Kaufmann, A., Kreibich, S., Kuebacher, A., Kunszt, P., Low, S.H., Mercer, J., Mudrak, D., Muntwiler, S., Pelkmans, L., Pizarro-Cerda, J., Podvinec, M., Pujadas, E., Rinn, B., Rouilly, V., Schmich, F., Siebourg, J., Snijder, B., Stebler, M., Studer, G., Szczurek, E., Truttmann, M., von Mering, C., Vonderheit, A., Yakimovich, A., Buehlmann, P. and Dehio, C., Simultaneous analysis of large-scale RNAi screens for pathogen entry, BMC Genomics 15(1162): p.1471-2164. (2014)

Examples

data(kinome)
str(kinome)
head(kinome)

Fitting the PMM

Description

Fits the parallel mixed model.

Usage

pmm(df.data, response, weight = "None", ignore = 3,
    simplify = TRUE, gene.col = "GeneID",
    condition.col = "condition")

Arguments

df.data

a data frame containing the variables for the model. Each row should correspond to one independent siRNA experiment. The data frame needs to have at least the following variables: GeneID, condition and a column with the measurements/readouts of the screens.

response

name of the column that contains the measurements/readouts of the screens.

weight

name of an optional column in df.data that contains the weights to be used in the fitting process of the linear mixed model. The weights should be numeric. The default "None" fits the model without weights.

ignore

minimum number of siRNA replicates required for each gene. If a gene has fewer siRNA replicates, it is ignored during the fitting process. Default is 3.

simplify

logical value that indicates whether the output of pmm should be simplified. Default is TRUE.

gene.col

name of the column that gives the gene identifier. Default is "GeneID".

condition.col

name of the column that indicates the condition that was used for each measurement. Default is "condition".

Details

The Parallel Mixed Model (PMM) is composed of a linear mixed model and an assessment of the local False Discovery Rate (FDR). The linear mixed model consists of a fixed effect for condition and two random effects, one for gene g and one for gene g within condition c. The model is fitted with the lmer function from the lme4 R package. To distinguish hit genes, PMM also provides an estimate of the local FDR. pmm only uses the data of genes that have at least a certain number of siRNA replicates per condition. This minimum number is set by the argument ignore; genes with fewer replicates are ignored. We recommend using at least 3 siRNA replicates per gene and condition in order to obtain a reliable fit.
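
The replicate requirement can be illustrated with a small base R sketch on toy data. It is not the code used inside pmm: the data frame toy, its values, and the rule that a gene must reach the minimum in every condition are assumptions made for illustration.

```r
# Toy data: hypothetical genes 1 and 2 under two hypothetical conditions.
# Gene 1 has 3 siRNA replicates per condition, gene 2 only 2.
toy <- data.frame(
  GeneID    = c(rep(1, 6), rep(2, 4)),
  condition = c(rep(c("A", "B"), each = 3), rep(c("A", "B"), each = 2)),
  readout   = c(0.1, 0.3, 0.2, -0.4, -0.5, -0.3, 1.0, 1.2, 0.9, 1.1),
  stringsAsFactors = FALSE
)
ignore <- 3

# Count the non-missing readouts per gene (rows) and condition (columns).
n_rep <- tapply(!is.na(toy$readout), list(toy$GeneID, toy$condition), sum)

# Keep genes that reach the minimum in every condition (assumed rule).
keep_genes <- rownames(n_rep)[apply(n_rep >= ignore, 1, all)]
toy_kept <- toy[toy$GeneID %in% keep_genes, ]
```

With ignore = 3, gene 2 is dropped because it has only two replicates per condition, so toy_kept contains the six rows of gene 1.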

Value

The simplified output of pmm is a matrix that contains the c_cg effects for each condition c and gene g, as well as an estimate for the local false discovery rate. A positive estimated c_cg effect means that the response was enhanced when the corresponding gene is knocked down. A negative effect means that the response was reduced.
The non-simplified output of pmm is a list of three components. The first component contains the simplified output, i.e. the matrix with the c_cg effects and fdr values, the second component contains the fit of the linear mixed model and the third component contains the a_g and b_cg values.
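
The quantities named above can be related as follows. This is a sketch of the model structure as described in the Details section; the replicate index s, the symbols \mu_c, \varepsilon_{gcs}, \sigma_a, \sigma_b and \sigma, and the relation c_{cg} = a_g + b_{cg} are assumptions that are not stated on this help page.

```latex
y_{gcs} = \mu_c + a_g + b_{cg} + \varepsilon_{gcs}, \qquad
a_g \sim N(0, \sigma_a^2), \quad
b_{cg} \sim N(0, \sigma_b^2), \quad
\varepsilon_{gcs} \sim N(0, \sigma^2)

c_{cg} = a_g + b_{cg}
```

Here \mu_c plays the role of the fixed effect for condition c, a_g is the random effect for gene g, and b_{cg} is the random effect for gene g within condition c.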

Author(s)

Anna Drewek <[email protected]>

Examples

data(kinome)

 ## Fitting the parallel mixed model with weights
 fit1 <- pmm(kinome,"InfectionIndex","weight_library")
 head(fit1)

 ## Fitting the parallel mixed model without weights
 fit2 <- pmm(kinome,"InfectionIndex","None")
 head(fit2)

 ## Accessing the fit of the linear mixed model
 fit3 <- pmm(kinome,"InfectionIndex","weight_library",simplify=FALSE)
 identical(fit1,fit3[[1]])
 summary(fit3[[2]])

 ## NA-Handling
 kinome$InfectionIndex[kinome$GeneID == 10000 & kinome$condition ==
 "ADENO"] <- rep(NA,12)
 fit4 <- pmm(kinome,"InfectionIndex","weight_library",3)
 head(fit4)

Sharedness Score

Description

Computes the sharedness score, which measures the strength of sharedness of hit genes among the conditions.

Usage

sharedness(fit, threshold = 0.2, na.action = "na.omit")

Arguments

fit

data frame returned by the pmm function.

threshold

threshold for the false discovery rate. Genes are counted as hits if their false discovery rate is below this threshold. Default is 0.2.

na.action

a character string that indicates how NAs in fit are handled. There are two options: "na.omit" (default) or "use". With "na.omit" the sharedness score is applied to na.omit(fit); with "use" the sharedness score is adapted for each gene to the number of conditions without NA.

Details

The sharedness score is a combination of two quantities:

sh_g = \frac{1}{2} \left( (1 - mean(fdr_{cg})) + \sum_{c} (fdr_{cg} < 1) \right)


The first part defines the shift away from 1 and the second part describes how many pathogens support the shift (proportion of FDRs < 1).
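
The two parts of the score can be computed by hand on a toy FDR matrix. This sketch is not the code of sharedness: the FDR values are made up, the threshold argument is not applied, and the second part is taken as a proportion (a mean over conditions), following the wording above.

```r
# Hypothetical local FDR values: 3 genes x 4 conditions (filled column-wise).
fdr <- matrix(c(0.1, 1.0, 0.5,
                0.2, 1.0, 1.0,
                0.1, 1.0, 0.5,
                0.0, 1.0, 1.0),
              nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"),
                              c("cond1", "cond2", "cond3", "cond4")))

# First part: shift of the mean FDR away from 1.
shift <- 1 - rowMeans(fdr)

# Second part: proportion of conditions with FDR < 1 (assumed normalization).
support <- rowMeans(fdr < 1)

# Sharedness score: average of the two parts, between 0 and 1.
sh <- 0.5 * (shift + support)
sh
```

A gene whose FDR equals 1 in every condition (geneB) gets score 0, while a gene with small FDR values in all conditions (geneA) gets a score close to 1.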

Value

The score is a value between 0 and 1 for each gene. A score of 0 indicates that a gene is not shared among the conditions, and a score of 1 that the gene is significant in all conditions.

Author(s)

Anna Drewek <[email protected]>

Examples

data(kinome)
fit <- pmm(kinome,"InfectionIndex","weight_library")
sh <- sharedness(fit, threshold = 0.2)
head(sh)

## NA-Handling
kinome$InfectionIndex[kinome$GeneID == 132158 & kinome$condition ==
"ADENO"] <- rep(NA,12)
fit <- pmm(kinome,"InfectionIndex","weight_library")
## Sharedness score for genes present in all conditions
sh <- sharedness(fit, threshold = 0.2, na.action = "na.omit")
head(sh)
## Sharedness score for all significant genes
sh <- sharedness(fit, threshold = 0.2, na.action = "use")
head(sh)