The metapod package provides utilities for meta-analyses on precomputed p-values, typicall generated from differential analyses of some sort. This enables users to consolidate multiple related tests into a single metric for easier interpretation and prioritization of interesting events. The functionality here was originally developed as part of the csaw package, to combine inferences from adjacent windows inside a genomic region; as well as from the scran package, to merge statistics from multiple batches or from multiple pairwise comparisons during marker gene detection. Most functions in this package involve combining multiple (independent or not) p-values into a single combined p-value. Additional functions are also provided to summarize the direction of any effect.
metapod provides two families of functions to combine a group of p-values into a single statistic. The first family considers parallel vectors of p-values where each group of p-values is defined from corresponding elements across vectors. This is useful for combining different tests on the same set of features, e.g., DE analyses with the same gene annotation. We illustrate this below by using Simes’ method to combine two parallel vectors:
library(metapod)
p1 <- rbeta(10000, 0.5, 1)
p2 <- rbeta(10000, 0.5, 1)
par.combined <- parallelSimes(list(p1, p2))
str(par.combined)
## List of 3
## $ p.value : num [1:10000] 0.0209 0.1647 0.5674 0.9113 0.0205 ...
## $ representative: int [1:10000] 2 1 1 1 1 2 1 1 1 1 ...
## $ influential :List of 2
## ..$ : logi [1:10000] FALSE TRUE TRUE TRUE TRUE FALSE ...
## ..$ : logi [1:10000] TRUE FALSE TRUE TRUE FALSE TRUE ...
The other approach is to consider groups of p-values within a single numeric vector. This is useful when the p-values to be combined are generated as part of a single analysis. The motivating example is that of adjacent windows in the same genomic region for ChIP-seq, but one can also imagine the same situation for other features, e.g., exons in RNA-seq. In this situation, we need a grouping factor to define the groups within our vector:
p <- rbeta(10000, 0.5, 1)
g <- sample(100, length(p), replace=TRUE)
gr.combined <- groupedSimes(p, g)
str(gr.combined)
## List of 3
## $ p.value : Named num [1:100] 2.56e-02 4.52e-06 4.86e-02 1.37e-02 2.88e-03 ...
## ..- attr(*, "names")= chr [1:100] "1" "2" "3" "4" ...
## $ representative: Named int [1:100] 5856 8385 7808 3703 3362 6331 8475 9279 5442 2362 ...
## ..- attr(*, "names")= chr [1:100] "1" "2" "3" "4" ...
## $ influential : logi [1:10000] FALSE FALSE FALSE FALSE FALSE FALSE ...
Both families of functions return the combined p-value along with some other
helpful statistics. The representative
vector specifies the
index of the representative test for each group; this is a test that is
particularly important to the calculation of the combined p-value and can be used to, e.g.,
obtain a representative effect size. Similarly, the
influential
vector indicates which of the individual tests
in each group are “influential” in computing the combined p-value. This is defined as the
subset of tests that, if their p-values were set to unity, would
alter the combined p-value.
The following methods yield a significant result for groups where any of the individual tests are significant:
parallelSimes()
and
groupedSimes()
) is robust to dependencies between tests and
supports weighting of individual tests.parallelFisher()
and
groupedFisher()
) assumes that tests are independent and
does not support weighting.parallelPearson()
and
groupedPearson()
) assumes that tests are independent and
does not support weighting.parallelStouffer()
and
groupedStouffer()
) assumes that tests are independent and
does support weighting.Fisher’s method is most sensitive to the smallest p-value while Pearson’s method is most sensitive to the largest p-value. Stouffer’s method serves as a compromise between these two extremes. Simes’ method is more conservative than all three but is still functional in the presence of dependencies.
The following methods yield a significant result for groups where some of the individual tests are significant (i.e., a minimum number or proportion of individual tests reject the null hypothesis):
parallelWilkinson()
and
groupedWilkinson()
) assumes that tests are independent and
does not support weighting.parallelHolmMin()
and
groupedHolmMin()
) does not require independence and
supports weighting.Technically, Wilkinson’s method can reject if any of the individual nulls are significant. However, it has a dramatic increase in detection power when the specified minimum is attained, hence its inclusion in this category.
Finally, Berger’s intersection union test will yield a significant result for groups where all of the individual tests are significant. This does not require independence but does not support weighting.
The summarizeParallelDirection()
and
summarizeGroupedDirection()
functions will summarize the
direction of effect of all influential tests for each group. For
example, if all influential tests have positive log-fold changes, the
group would be reported as having a direction of "up"
.
Alternatively, if influential tests have both positive and negative
log-fold changes, the reported direction would be "mixed"
.
By only considering the influential tests, we avoid noise from “less
significant” tests that do not contribute to the final p-value.
logfc1 <- rnorm(10000)
logfc2 <- rnorm(10000)
par.dir <- summarizeParallelDirection(list(logfc1, logfc2),
influential = par.combined$influential)
table(par.dir)
## par.dir
## down mixed up
## 4289 1490 4221
countParallelDirection()
and
countGroupedDirection
will just count the number of
significant tests changing in each direction. This is done after
correcting for the number of tests in each group with the
Benjamini-Hochberg or Holm methods. These counts are somewhat simpler to
interpret than the summarized direction but have their own caveats - see
?countParallelDirection
for details.
## List of 2
## $ down: int [1:10000] 2 0 0 0 1 0 1 0 0 0 ...
## $ up : int [1:10000] 0 0 0 0 0 0 1 0 0 0 ...
Another approach is to just use the direction of effect of the representative test. This is usually the best approach if a single direction and effect size is required, e.g., for visualization purposes.
All functions support log.p=TRUE
to accept
log-transformed p-values as
input and to return log-transformed output. This is helpful in
situations where underflow would cause generation of zero p-values.
Several functions support weighting of the individual tests within each group. This is helpful if some of the tests are a priori known to be more important than others.
The combineParallelPValues()
and
combineGroupedPValues()
wrapper functions will dispatch to
the requested method
, making it easier to switch between
methods inside other packages:
## List of 3
## $ p.value : Named num [1:100] 1 1 1 1 1 1 1 1 1 1 ...
## ..- attr(*, "names")= chr [1:100] "1" "2" "3" "4" ...
## $ representative: Named int [1:100] 4603 6887 3992 1065 1312 1062 3402 5251 6632 3128 ...
## ..- attr(*, "names")= chr [1:100] "1" "2" "3" "4" ...
## $ influential : logi [1:10000] TRUE FALSE FALSE TRUE TRUE TRUE ...
## R version 4.4.2 (2024-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] metapod_1.15.0 knitr_1.49 BiocStyle_2.35.0
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.37 R6_2.5.1 fastmap_1.2.0
## [4] xfun_0.49 maketools_1.3.1 cachem_1.1.0
## [7] htmltools_0.5.8.1 rmarkdown_2.29 buildtools_1.0.0
## [10] lifecycle_1.0.4 cli_3.6.3 sass_0.4.9
## [13] jquerylib_0.1.4 compiler_4.4.2 sys_3.4.3
## [16] tools_4.4.2 evaluate_1.0.1 bslib_0.8.0
## [19] Rcpp_1.0.13-1 yaml_2.3.10 BiocManager_1.30.25
## [22] jsonlite_1.8.9 rlang_1.1.4