Package 'swfdr' reference manual

Title:	Estimation of the science-wise false discovery rate and the false discovery rate conditional on covariates
Description:	This package allows users to estimate the science-wise false discovery rate from Jager and Leek, "Empirical estimates suggest most published medical research is true," 2013, Biostatistics, using an EM approach due to the presence of rounding and censoring. It also allows users to estimate the false discovery rate conditional on covariates, using a regression framework, as per Boca and Leek, "A direct approach to estimating false discovery rates conditional on covariates," 2018, PeerJ.
Authors:	Jeffrey T. Leek, Leah Jager, Simina M. Boca, Tomasz Konopka
Maintainer:	Simina M. Boca <[email protected]>, Jeffrey T. Leek <[email protected]>
License:	GPL (>= 3)
Version:	1.33.0
Built:	2025-03-07 06:06:07 UTC
Source:	https://github.com/bioc/swfdr

Subset of SNPs from meta-analysis of BMI GWAS study.

Description

A dataset containing 50,000 SNPs and results for their associations with BMI.

Usage

data(BMI_GIANT_GWAS_sample)
data(BMI_GIANT_GWAS_sample)

Format

A data frame with 50,000 rows and 9 variables:

SNP: ID for SNP (single nucleotide polymorphism)
A1: Allele 1 for SNP
A2: Allele 2 for SNP
Freq_MAF_Hapmap: Frequency of minor allele (MAF) in Hapmap project
b: Estimated beta for association between SNP and BMI
se: Estimated standard error (se) for association between SNP and BMI
p: P-value for association between SNP and BMI
N: Total sample size considered for association of SNP and BMI
Freq_MAF_Int_Hapmap: Three approximately equal intervals for the Hapmap MAFs

Value

Object of class tbl_df, tbl, data.frame.

Source

https://www.broadinstitute.org/collaboration/giant/index.php/GIANT_consortium_data_files#GWAS_Anthropometric_2015_BMI

Calculate the science-wise FDR (swfdr)

Description

Calculate the science-wise FDR (swfdr)

Usage

calculateSwfdr(
  pValues,
  truncated,
  rounded,
  pi0 = 0.5,
  alpha = 1,
  beta = 50,
  numEmIterations = 100
)
calculateSwfdr(
  pValues,
  truncated,
  rounded,
  pi0 = 0.5,
  alpha = 1,
  beta = 50,
  numEmIterations = 100
)

Arguments

`pValues`	Numerical vector of p-values
`truncated`	Vector of 0s and 1s with indices corresponding to those in pValues; 1 indicates that the p-values is truncated, 0 that it is not truncated
`rounded`	Vector of 0s and 1s with indices corresponding to those in pValues; 1 indicates that the p-values is rounded, 0 that it is not rounded
`pi0`	Initial prior probability that a hypothesis is null (default is 0.5)
`alpha`	Initial value of parameter alpha from Beta(alpha, beta) true positive distribution (default is 1)
`beta`	Initial value of parameter beta from Beta(alpha, beta) true positive distribution (default is 50)
`numEmIterations`	The number of EM iterations (default is 100)

Value

`pi0`	Final value of prior probability - estimated from EM - that a hypothesis is null, i.e. estimated swfdr
`alpha`	Final value of parameter alpha - estimated from EM - from Beta(alpha, beta) true positive distribution
`beta`	Final value of parameter beta - estimated from EM - from Beta(alpha, beta) true positive distribution
`z`	Vector of expected values of the indicator of whether the p-value is null or not - estimated from EM - for the non-rounded p-values (values of NA represent the rounded p-values)
`n0`	Expected number of rounded null p-values - estimated from EM - between certain cutpoints (0.005, 0.015, 0.025, 0.035, 0.045, 0.05)
`n`	Number of rounded p-values between certain cutpoints (0.005, 0.015, 0.025, 0.035, 0.045, 0.05)

Examples

pVals <- runif(100)
tt <- rr <- rep(0, 100)
resSwfdr <- calculateSwfdr(pValues = pVals, truncated = tt, rounded = rr, numEmIterations=100)

pVals <- runif(100)
tt <- rr <- rep(0, 100)
resSwfdr <- calculateSwfdr(pValues = pVals, truncated = tt, rounded = rr, numEmIterations=100)

Get number of decimals (i.e. return total number of digits after decimal point) for any vector of numbers in [0,1) if number of decimals <= 6

Description

Get number of decimals (i.e. return total number of digits after decimal point) for any vector of numbers in [0,1) if number of decimals <= 6

Usage

get_number_decimals(x)
get_number_decimals(x)

Arguments

`x`	Numerical vector where all elements are in [0,1)

Value

Vector giving the number of decimals for each element in x if the number is <= 6; otherwise return 7 with a warning

Examples

#get_number_decimals(c(0.0006, 0.0750, 0.0420, 0.0031, 0.0001, 0.0100))
#get_number_decimals(c(6*10^-4, 7.5*10^-2, 4.2*10^-2, 3.1*10^-3, 10^-4, 10^-2))
#get_number_decimals(c(6.5*10^-4, 0.0100)) 
#get_number_decimals(c(6.5e-4, 0.0100))
#get_number_decimals(c(0.00065, 0.0100))
#get_number_decimals(c(10^-7, 10e-7, 10e-3))
#get_number_decimals(c(0.0006, 0.0750, 0.0420, 0.0031, 0.0001, 0.0100))
#get_number_decimals(c(6*10^-4, 7.5*10^-2, 4.2*10^-2, 3.1*10^-3, 10^-4, 10^-2))
#get_number_decimals(c(6.5*10^-4, 0.0100)) 
#get_number_decimals(c(6.5e-4, 0.0100))
#get_number_decimals(c(0.00065, 0.0100))
#get_number_decimals(c(10^-7, 10e-7, 10e-3))

P-values from abstracts from articles in 5 biomedical journals (American Journal of Epidemiology, BMJ, JAMA, Lancet, New England Journal of Medicine), over 11 years (2000-2010).

Description

A dataset containing 15,653 p-values.

Usage

journals_pVals
journals_pVals

Format

A tbl data frame with 15,653 rows and 5 variables:

pvalue: P-value
pvalueTruncated: Equals to 1 if the p-value is truncated, 0 otherwise
pubmedID: Pubmed ID of the article
year: Year of publication
journal: Journal

Value

Object of class tbl_df, tbl, data.frame.

Source

Code for extracting p-values at: inst/script/getPvalues.R

Estimation of pi0, proportion of p-values consistent with a null hypothesis

Description

Estimation of pi0, proportion of p-values consistent with a null hypothesis

Usage

lm_pi0(
  p,
  lambda = seq(0.05, 0.95, 0.05),
  X,
  type = c("logistic", "linear"),
  smooth.df = 3,
  threshold = TRUE,
  smoothing = c("unit.spline", "smooth.spline")
)
lm_pi0(
  p,
  lambda = seq(0.05, 0.95, 0.05),
  X,
  type = c("logistic", "linear"),
  smooth.df = 3,
  threshold = TRUE,
  smoothing = c("unit.spline", "smooth.spline")
)

Arguments

`p`	numeric vector, p-values
`lambda`	numeric vector, thresholds used to bin pvalues, must be in [0,1).
`X`	numeric matrix, covariates that might be related to p values (one test per row, one variable per column).
`type`	character, type of regression used to fit features to pvalues
`smooth.df`	integer, degrees of freedom when estimating pi0(x) with a smoother.
`threshold`	logical, if TRUE, all estimates are thresholded into unit interval; if FALSE, all estimates are left as they are computed
`smoothing`	character, type of smoothing used to fit pi0

Value

object of class ‘lm_pi0’, which is a list with several components

`call`	matched function call
`lambda`	numeric vector of thresholds used in calculating pi0.lambda
`X.names`	character vector of covariates used in modeling
`pi0.lambda`	numeric matrix of estimated pi0(x) for each value of lambda. The number of columns is the number of tests, the number of rows is the length of lambda.
`pi0`	numerical vector of smoothed estimate of pi0(x). The length is the number of rows in X.
`pi0.smooth`	(only output with smoothing="smooth.spline") Matrix of fitted values from the smoother fit to the pi0(x) estimates at each value of lambda (same number of rows and columns as pi0.lambda)

Examples

# define a covariate
X <- seq(-1,2,length=1000)
# set probability of being null
pi0 <- 1/4*X + 1/2
# generate null/alternative p-values
nullI <- rbinom(1000,prob=pi0,size=1)> 0
# vector of p-values
pValues <- rep(NA,1000) 
pValues[nullI] <- runif(sum(nullI)) # from U(0,1)
pValues[!nullI] <- rbeta(sum(!nullI),1,2) # from Beta
pi0x <- lm_pi0(pValues, X=X)

# define a covariate
X <- seq(-1,2,length=1000)
# set probability of being null
pi0 <- 1/4*X + 1/2
# generate null/alternative p-values
nullI <- rbinom(1000,prob=pi0,size=1)> 0
# vector of p-values
pValues <- rep(NA,1000) 
pValues[nullI] <- runif(sum(nullI)) # from U(0,1)
pValues[!nullI] <- rbeta(sum(!nullI),1,2) # from Beta
pi0x <- lm_pi0(pValues, X=X)

Estimation of qvalues conditioned on covariates

Description

The recipe for turning pvalues into qvalues is adapted from package 'qvalue' and articles by Storey, Tibshirani, Taylor, Siegmund.

Usage

lm_qvalue(
  p,
  X,
  pfdr = FALSE,
  pi0 = NULL,
  smoothing = c("unit.spline", "smooth.spline"),
  ...
)
lm_qvalue(
  p,
  X,
  pfdr = FALSE,
  pi0 = NULL,
  smoothing = c("unit.spline", "smooth.spline"),
  ...
)

Arguments

`p`	numeric vector of p-values
`X`	matrix of covariates (can be missing if pi0 is specified instead)
`pfdr`	logical, making estimates robust for small p-values and a small sample size
`pi0`	list with pi0 estimates from lm_pi0. If this is not provided, pi0 is estimated using function lm_pi0.
`smoothing`	character, type of smoothing used to fit pi0. Note the default in this function is different than in lm_pi0.
`...`	other parameters (passed on to lm_pi0 if pi0 is not provided)

Value

object of class ‘lm_qvalue’, which is a list with several components

`call`	matched function call
`pvalues`	numeric vector of original p-values
`qvalues`	numeric vector of q-values
	other list elements transferred from pi0

Examples

# define a covariate
X <- rep(c(0, 1), each=1000)
# generate p-values, randomly for group 0 and with low values for group 1
pVal <- c(runif(1000), rbeta(1000, 0.2, 1))
# compute qvalues, using the covariate
qVal <- lm_qvalue(pVal, X=X)

# define a covariate
X <- rep(c(0, 1), each=1000)
# generate p-values, randomly for group 0 and with low values for group 1
pVal <- c(runif(1000), rbeta(1000, 0.2, 1))
# compute qvalues, using the covariate
qVal <- lm_qvalue(pVal, X=X)

Package 'swfdr'

Help Index

Subset of SNPs from meta-analysis of BMI GWAS study.

Description

Usage

Format

Value

Source

Calculate the science-wise FDR (swfdr)

Description

Usage

Arguments

Value

Examples

Get number of decimals (i.e. return total number of digits after decimal point) for any vector of numbers in [0,1) if number of decimals <= 6

Description

Usage

Arguments

Value

Examples

P-values from abstracts from articles in 5 biomedical journals (American Journal of Epidemiology, BMJ, JAMA, Lancet, New England Journal of Medicine), over 11 years (2000-2010).

Description

Usage

Format

Value

Source

Estimation of pi0, proportion of p-values consistent with a null hypothesis

Description

Usage

Arguments

Value

Examples

Estimation of qvalues conditioned on covariates

Description

Usage

Arguments

Value

Examples