Title: | An R package for single-cell eQTL analysis and visualization |
---|---|
Description: | This package specializes in analyzing and visualizing eQTL at the single-cell level. It can read gene expression matrices or Seurat data, or SingleCellExperiment object along with genotype data. It offers a function for cis-eQTL analysis to detect eQTL within a given range, and another function to fit models with three methods. Using this package, users can also generate single-cell level visualization result. |
Authors: | Xiaofeng Wu [aut, cre, cph] |
Maintainer: | Xiaofeng Wu <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.99.12 |
Built: | 2025-03-27 03:39:34 UTC |
Source: | https://github.com/bioc/scQTLtools |
Adjust p-values and perform threshold filtering based on the adjusted p-values.
adjust_pvalues(result, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05)
adjust_pvalues(result, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05)
result |
Dataframe that contains gene-SNP pairs' information. |
pAdjustMethod |
Methods for p-value adjusting, one of "bonferroni", "holm", "hochberg", "hommel" or "BH". The default option is "bonferroni". |
pAdjustThreshold |
Only SNP-Gene pairs with adjusted p-values meeting the threshold will be displayed. Default by 0.05. |
A dataframe that has been adjusted and filtered, containing information on gene-SNP pairs.
example_data <- data.frame( gene = c("Gene1", "Gene2", "Gene3", "Gene4"), SNP = c("SNP1", "SNP2", "SNP3", "SNP4"), pvalue = c(0.001, 0.04, 0.03, 0.0005)) pAdjustMethod <- "BH" pAdjustThreshold <- 0.05 adjusted_result <- adjust_pvalues(example_data, pAdjustMethod, pAdjustThreshold)
example_data <- data.frame( gene = c("Gene1", "Gene2", "Gene3", "Gene4"), SNP = c("SNP1", "SNP2", "SNP3", "SNP4"), pvalue = c(0.001, 0.04, 0.03, 0.0005)) pAdjustMethod <- "BH" pAdjustThreshold <- 0.05 adjusted_result <- adjust_pvalues(example_data, pAdjustMethod, pAdjustThreshold)
Build zinb model.
buildZINB(counts)
buildZINB(counts)
counts |
a vector for gene expression. |
Four parameters
data(testGene) gene <- unlist(testGene[1, ]) result <-buildZINB(gene)
data(testGene) gene <- unlist(testGene[1, ]) result <-buildZINB(gene)
callQTL: Uncover single-cell eQTLs exclusively using scRNA-seq data. A function designed to identify eQTLs from scRNA-seq data.
callQTL( eQTLObject, gene_ids = NULL, downstream = NULL, upstream = NULL, gene_mart = NULL, snp_mart = NULL, pAdjustMethod = "bonferroni", useModel = "zinb", pAdjustThreshold = 0.05, logfcThreshold = 0.1 )
callQTL( eQTLObject, gene_ids = NULL, downstream = NULL, upstream = NULL, gene_mart = NULL, snp_mart = NULL, pAdjustMethod = "bonferroni", useModel = "zinb", pAdjustThreshold = 0.05, logfcThreshold = 0.1 )
eQTLObject |
An S4 object of class eQTLObject. |
gene_ids |
A gene ID or a list of gene IDS. |
downstream |
Being used to match SNPs within a base range defined by the start position of genes. |
upstream |
Being used to match SNPs within a base range defined by the end position of genes. |
gene_mart |
An object of class Mart representing the BioMart database to connect to. If NULL, the function will use the Ensembl Gene BioMart. |
snp_mart |
An object of class Mart representing the BioMart database to connect to. If NULL, the function will use the Ensembl SNP BioMart. |
pAdjustMethod |
Methods for p-value adjusting, one of 'bonferroni', 'holm', 'hochberg', 'hommel' or 'BH'. Default by 'bonferroni'. |
useModel |
Model for fitting dataframe, one of 'possion', 'zinb', or 'linear'. |
pAdjustThreshold |
Only SNP-Gene pairs with adjusted p-values meeting the threshold will be displayed. Default by 0.05. |
logfcThreshold |
Represents the minimum beta threshold for fitting SNP-Gene pairs. |
A dataframe, each row describes eQTL discovering result of a SNP-Gene pair.
data(testEQTL) library(biomaRt) gene_mart <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl", mirror = 'asia') snp_mart <- useEnsembl(biomart = "snps", dataset = "hsapiens_snp", mirror = 'asia') eqtl <- callQTL( eQTLObject = testEQTL, gene_ids = NULL, downstream = NULL, upstream = NULL, gene_mart = gene_mart, snp_mart = snp_mart, pAdjustMethod = 'bonferroni', useModel = 'linear', pAdjustThreshold = 0.05, logfcThreshold = 0.025 )
data(testEQTL) library(biomaRt) gene_mart <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl", mirror = 'asia') snp_mart <- useEnsembl(biomart = "snps", dataset = "hsapiens_snp", mirror = 'asia') eqtl <- callQTL( eQTLObject = testEQTL, gene_ids = NULL, downstream = NULL, upstream = NULL, gene_mart = gene_mart, snp_mart = snp_mart, pAdjustMethod = 'bonferroni', useModel = 'linear', pAdjustThreshold = 0.05, logfcThreshold = 0.025 )
Check if the SNP ids in the input genotype matrix are valid.
checkSNPList(snpList, snp_mart = NULL, snpDataset = "hsapiens_snp")
checkSNPList(snpList, snp_mart = NULL, snpDataset = "hsapiens_snp")
snpList |
a list of SNPs id. |
snp_mart |
An object of class 'Mart' representing the BioMart database connect to for SNPs. If provided, this should be a 'Mart' object obtained by calling 'useEnsembl()', which allows specifying a mirror in case of connection issues. If 'NULL', the function will create and use a 'Mart' object pointing to the Ensembl SNP BioMart using the specified 'snpDataset' and a default mirror. |
snpDataset |
A character string specifying the SNP dataset to use from ENSEMBL. The default is 'hsapiens_snp' for human SNPs. |
SNP location dataframe
data(testSNP2) snpList <- rownames(testSNP2) snpDataset <- 'hsapiens_snp' snps_loc <- checkSNPList(snpList = snpList, snpDataset = snpDataset)
data(testSNP2) snpList <- rownames(testSNP2) snpDataset <- 'hsapiens_snp' snps_loc <- checkSNPList(snpList = snpList, snpDataset = snpDataset)
'CPM_normalize()' scales an expression matrix using Counts Per Million (CPM) normalization, applying logarithm and scaling operations to adjust data.
CPM_normalize(expressionMatrix)
CPM_normalize(expressionMatrix)
expressionMatrix |
Input raw gene expression matrix. |
A gene expression matrix after normalized.
data(testGene) CPM_normalize(testGene)
data(testGene) CPM_normalize(testGene)
Create gene location dataframe.
createGeneLoc( geneList, gene_mart = NULL, geneDataset = "hsapiens_gene_ensembl", OrgDb )
createGeneLoc( geneList, gene_mart = NULL, geneDataset = "hsapiens_gene_ensembl", OrgDb )
geneList |
A gene id or a list of genes id. |
gene_mart |
An object of class 'Mart' representing the BioMart database connect to for gene. If provided, this should be a 'Mart' object obtained by calling 'useEnsembl()', which allows specifying a mirror in case of connection issues. If 'NULL', the function will create and use a 'Mart' object pointing to the Ensembl Gene BioMart using the specified 'geneDataset' and a default mirror. |
geneDataset |
A character string specifying the gene dataset to use from ENSEMBL. The default is "hsapiens_gene_ensembl" for human genes. |
OrgDb |
OrgDb name:"org.Hs.eg.db", "org.Mm.eg.db". |
data.frame
data(testGene) geneList <- rownames(testGene) library(GOSemSim) library(biomaRt) OrgDb <- load_OrgDb("org.Hs.eg.db") gene_mart <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl", mirror = 'asia') gene_loc <- createGeneLoc(geneList = geneList, gene_mart = gene_mart, OrgDb = OrgDb)
data(testGene) geneList <- rownames(testGene) library(GOSemSim) library(biomaRt) OrgDb <- load_OrgDb("org.Hs.eg.db") gene_mart <- useEnsembl(biomart = "genes", dataset = "hsapiens_gene_ensembl", mirror = 'asia') gene_loc <- createGeneLoc(geneList = geneList, gene_mart = gene_mart, OrgDb = OrgDb)
createObject:Create the eQTLObject. We next create a S4 object. The object serves as a container that contains both data (like the count matrix) and meta.data.
createQTLObject( snpMatrix, genedata, biClassify = FALSE, species = NULL, group = NULL, ... )
createQTLObject( snpMatrix, genedata, biClassify = FALSE, species = NULL, group = NULL, ... )
snpMatrix |
A genotype matrix where each row is one variant and each column is one sample, and the scoring method is 0/1/2/3. |
genedata |
A gene expression matrix or a Seurat object, or a SingleCellExperiment object. |
biClassify |
The user chooses whether to convert the counting method of the snpMatrix to 0/1/2, TRUE indicates conversion, and FALSE indicates no conversion, default is no conversion. |
species |
The species that the user wants to select, human or mouse. |
group |
Provided by Seurat's meta.data, such as celltypes, cellstatus and so on. By default, it is NULL. |
... |
other parameters |
eQTLObject
data(testSNP) data(testGene) eqtl <- createQTLObject(snpMatrix = testSNP, genedata = testGene, biClassify = FALSE, species = 'human', group = NULL)
data(testSNP) data(testGene) eqtl <- createQTLObject(snpMatrix = testSNP, genedata = testGene, biClassify = FALSE, species = 'human', group = NULL)
Create SNP location dataframe.
createSNPsLoc(snpList, snp_mart = NULL, snpDataset = "hsapiens_snp")
createSNPsLoc(snpList, snp_mart = NULL, snpDataset = "hsapiens_snp")
snpList |
a list of SNPs id. |
snp_mart |
An object of class 'Mart' representing the BioMart database connect to for SNPs. If provided, this should be a 'Mart' object obtained by calling 'useEnsembl()', which allows specifying a mirror in case of connection issues. If 'NULL', the function will create and use a 'Mart' object pointing to the Ensembl SNP BioMart using the specified 'snpDataset' and a default mirror. |
snpDataset |
A character string specifying the SNP dataset to use from ENSEMBL. The default is 'hsapiens_snp' for human SNPs. |
data.frame
snpList <- c('rs546', 'rs549', 'rs568', 'rs665', 'rs672') library(biomaRt) snp_mart <- useEnsembl(biomart = "snps", dataset = "hsapiens_snp", mirror = 'asia') snp_loc <- createSNPsLoc(snpList = snpList, snp_mart = snp_mart)
snpList <- c('rs546', 'rs549', 'rs568', 'rs665', 'rs672') library(biomaRt) snp_mart <- useEnsembl(biomart = "snps", dataset = "hsapiens_snp", mirror = 'asia') snp_loc <- createSNPsLoc(snpList = snpList, snp_mart = snp_mart)
'DESeq_normalize()'normalizes an expression matrix using the DESeq2 package.
DESeq_normalize(expressionMatrix)
DESeq_normalize(expressionMatrix)
expressionMatrix |
Input raw gene expression matrix. |
A gene expression matrix after normalized.
data(testGene) DESeq_normalize(testGene)
data(testGene) DESeq_normalize(testGene)
'draw_boxplot()' creates a boxplot visualizing expression levels across different SNP factors in the dataframe. It uses ggplot2 to produce a plot with customizable aesthetics for clarity and presentation.
draw_boxplot(df, unique_group)
draw_boxplot(df, unique_group)
df |
Data frames listed as gene expression data, genotype data, and groups |
unique_group |
name of unique group |
ggplot
set.seed(123) counts_Ref <- rnorm(50, mean = 10, sd = 2) counts_Alt <- rnorm(50, mean = 12, sd = 2) i <- rep("GroupA", 100) unique_group <- unique(i) dataframe <- data.frame( expression = c(counts_Ref, counts_Alt), snp = c(rep("REF", length(counts_Ref)), rep("ALT", length(counts_Alt))), group = i) dataframe$snp <- factor(dataframe$snp, levels = c("REF", "ALT")) draw_boxplot(df = dataframe, unique_group = unique_group)
set.seed(123) counts_Ref <- rnorm(50, mean = 10, sd = 2) counts_Alt <- rnorm(50, mean = 12, sd = 2) i <- rep("GroupA", 100) unique_group <- unique(i) dataframe <- data.frame( expression = c(counts_Ref, counts_Alt), snp = c(rep("REF", length(counts_Ref)), rep("ALT", length(counts_Alt))), group = i) dataframe$snp <- factor(dataframe$snp, levels = c("REF", "ALT")) draw_boxplot(df = dataframe, unique_group = unique_group)
'draw_histplot()' generates histograms using ggplot2, displaying the distribution of expression values categorized by SNP type.
draw_histplot(df, unique_group)
draw_histplot(df, unique_group)
df |
Data frames listed as gene expression data, genotype data, and groups |
unique_group |
name of unique group |
ggplot
set.seed(123) counts_Ref <- rnorm(50, mean = 10, sd = 2) counts_Alt <- rnorm(50, mean = 12, sd = 2) i <- rep("GroupA", 100);unique_group <- unique(i) dataframe <- data.frame(expression = c(counts_Ref, counts_Alt), snp = c(rep("REF", length(counts_Ref)), rep("ALT", length(counts_Alt))), group = i) dataframe$snp <- factor(dataframe$snp, levels = c("REF", "ALT")) draw_histplot(df = dataframe, unique_group = unique_group)
set.seed(123) counts_Ref <- rnorm(50, mean = 10, sd = 2) counts_Alt <- rnorm(50, mean = 12, sd = 2) i <- rep("GroupA", 100);unique_group <- unique(i) dataframe <- data.frame(expression = c(counts_Ref, counts_Alt), snp = c(rep("REF", length(counts_Ref)), rep("ALT", length(counts_Alt))), group = i) dataframe$snp <- factor(dataframe$snp, levels = c("REF", "ALT")) draw_histplot(df = dataframe, unique_group = unique_group)
'draw_QTLplot()' generates a combined plot using ggplot2, showing the distribution of expression values across different SNPs. It combines a violin plot, boxplot, and scatter points for each SNP category.
draw_QTLplot(df, unique_group)
draw_QTLplot(df, unique_group)
df |
Data frames listed as gene expression data, genotype data, and groups |
unique_group |
name of unique group |
ggplot
set.seed(123) counts_Ref <- rnorm(50, mean = 10, sd = 2) counts_Alt <- rnorm(50, mean = 12, sd = 2) i <- rep("GroupA", 100);unique_group <- unique(i) dataframe <- data.frame(expression = c(counts_Ref, counts_Alt), snp = c(rep("REF", length(counts_Ref)), rep("ALT", length(counts_Alt))), group = i) dataframe$snp <- factor(dataframe$snp, levels = c("REF", "ALT")) draw_QTLplot(df = dataframe, unique_group = unique_group)
set.seed(123) counts_Ref <- rnorm(50, mean = 10, sd = 2) counts_Alt <- rnorm(50, mean = 12, sd = 2) i <- rep("GroupA", 100);unique_group <- unique(i) dataframe <- data.frame(expression = c(counts_Ref, counts_Alt), snp = c(rep("REF", length(counts_Ref)), rep("ALT", length(counts_Alt))), group = i) dataframe$snp <- factor(dataframe$snp, levels = c("REF", "ALT")) draw_QTLplot(df = dataframe, unique_group = unique_group)
'draw_violinplot()' creates a violin plot visualizing expression levels across different SNP factors in the dataframe. It uses ggplot2 to produce a plot with customizable aesthetics for clarity and presentation.
draw_violinplot(df, unique_group)
draw_violinplot(df, unique_group)
df |
Data frames listed as gene expression data, genotype data, and groups |
unique_group |
name of unique group |
ggplot
set.seed(123) counts_Ref <- rnorm(50, mean = 10, sd = 2) counts_Alt <- rnorm(50, mean = 12, sd = 2) i <- rep("GroupA", 100);unique_group <- unique(i) dataframe <- data.frame(expression = c(counts_Ref, counts_Alt), snp = c(rep("REF", length(counts_Ref)), rep("ALT", length(counts_Alt))), group = i) dataframe$snp <- factor(dataframe$snp, levels = c("REF", "ALT")) draw_violinplot(df = dataframe, unique_group = unique_group)
set.seed(123) counts_Ref <- rnorm(50, mean = 10, sd = 2) counts_Alt <- rnorm(50, mean = 12, sd = 2) i <- rep("GroupA", 100);unique_group <- unique(i) dataframe <- data.frame(expression = c(counts_Ref, counts_Alt), snp = c(rep("REF", length(counts_Ref)), rep("ALT", length(counts_Alt))), group = i) dataframe$snp <- factor(dataframe$snp, levels = c("REF", "ALT")) draw_violinplot(df = dataframe, unique_group = unique_group)
Class 'eQTLObject' The eQTLObject class is an R object designed to store data related to eQTL analysis, encompassing data lists, result data frames, and layers for biClassify, species, and grouping information.
eQTLObject
rawData
A gene expression dataframe, the row names represent gene IDs and the column names represent cell IDs.
filterData
Gene expression matrix after normalizing.
eQTLResult
The result dataframe obtained the sc-eQTL results.
biClassify
The user chooses whether to convert the counting method of the snpMatrix to 0, 1, 2, TRUE indicates conversion, and FALSE indicates no conversion, default is no conversion.
species
The species that the user wants to select, human or mouse.
groupBy
Options for cell grouping, users can choose celltype, cellstatus, etc., depending on metadata.
useModel
model for fitting dataframe.
Filters data frame by absolute b-values, returning rows meeting or exceeding a threshold.
filter_by_abs_b(result, logfcThreshold)
filter_by_abs_b(result, logfcThreshold)
result |
Dataframe that contains gene-SNP pairs' information. |
logfcThreshold |
Represents the minimum beta threshold for fitting SNP-Gene pairs. Default by 0.1. |
A dataframe filtered by absolute b-values.
example_result <- data.frame( gene = c("Gene1", "Gene2", "Gene3", "Gene4"), SNP = c("SNP1", "SNP2", "SNP3", "SNP4"), b = c(-2.5, 1.0, -0.5, 3.0)) logfcThreshold <- 0.1 filtered_result <- filter_by_abs_b(example_result, logfcThreshold)
example_result <- data.frame( gene = c("Gene1", "Gene2", "Gene3", "Gene4"), SNP = c("SNP1", "SNP2", "SNP3", "SNP4"), b = c(-2.5, 1.0, -0.5, 3.0)) logfcThreshold <- 0.1 filtered_result <- filter_by_abs_b(example_result, logfcThreshold)
filterGeneSNP: Filter gene expression matrix and genotype matrix.
filterGeneSNP( eQTLObject, snpNumOfCellsPercent = 10, expressionMin = 0, expressionNumOfCellsPercent = 10 )
filterGeneSNP( eQTLObject, snpNumOfCellsPercent = 10, expressionMin = 0, expressionNumOfCellsPercent = 10 )
eQTLObject |
An S4 object of class eQTLObject. |
snpNumOfCellsPercent |
Only SNPs where cells with each of the different genotypes (REF and ALT, or AA, Aa, and aa) individually account for at least 'snpNumOfCellsPercent' Default by 10. |
expressionMin |
threshold for valid gene expression levels, utilized alongside another parameter,expression.number.of.cells.Default by 0. |
expressionNumOfCellsPercent |
Only genes with expression levels exceeding 'expressionMin' in at least 'expressionNumOfCellsPercent' of cells are considered. The default value is 10. |
filtered matrices.
data(testSNP) data(testGene) eqtl <- createQTLObject(snpMatrix = testSNP, genedata = testGene) eqtl <- normalizeGene(eqtl) eqtl <- filterGeneSNP(eqtl, snpNumOfCellsPercent = 2, expressionMin = 0, expressionNumOfCellsPercent = 2)
data(testSNP) data(testGene) eqtl <- createQTLObject(snpMatrix = testSNP, genedata = testGene) eqtl <- normalizeGene(eqtl) eqtl <- filterGeneSNP(eqtl, snpNumOfCellsPercent = 2, expressionMin = 0, expressionNumOfCellsPercent = 2)
This function extracts the names of cells from a SNP matrix that correspond to a specified value for a given SNP.
get_cell_groups(snpMatrix, SNPid, biClassify)
get_cell_groups(snpMatrix, SNPid, biClassify)
snpMatrix |
A matrix containing SNP data where rows represent SNPs and columns represent cells. |
SNPid |
A character string or numeric index representing the specific SNP of interest in the SNP matrix. |
biClassify |
The user chooses whether to convert the counting method of the snpMatrix to 0/1/2, TRUE indicates conversion, and FALSE indicates no conversion, default is no conversion. |
A list of cell names (column names of the SNP matrix) that correspond to the specified genotype value for the given SNP.
data(testSNP) biClassify <- FALSE get_cell_groups(testSNP, "1:632445", biClassify)
data(testSNP) biClassify <- FALSE get_cell_groups(testSNP, "1:632445", biClassify)
This function retrieves expression counts for a specified gene from an expression matrix, based on the provided list of cells.
get_counts(expressionMatrix, Geneid, cells)
get_counts(expressionMatrix, Geneid, cells)
expressionMatrix |
A matrix containing gene expression data where rows represent genes and columns represent cells. |
Geneid |
A character string or numeric index representing the specific gene of interest in the expression matrix. |
cells |
A character vector of cell names (column names of the expression matrix) from which to extract counts for the specified gene. |
A numeric vector of expression counts for the specified gene in the selected cells.
data(testGene) get_counts(testGene, "CNN2", c("CGGCAGTGTAGCCCTG", "GGAGGATTCCCGTTCA"))
data(testGene) get_counts(testGene, "CNN2", c("CGGCAGTGTAGCCCTG", "GGAGGATTCCCGTTCA"))
Generic to access eQTLObject filter data
get_filter_data(x)
get_filter_data(x)
x |
A eQTLObject object. |
filtered matrices.
data(testEQTL) get_filter_data(testEQTL)
data(testEQTL) get_filter_data(testEQTL)
Method to access eQTLObject filter data
## S4 method for signature 'eQTLObject' get_filter_data(x)
## S4 method for signature 'eQTLObject' get_filter_data(x)
x |
A eQTLObject object. |
filtered matrices.
Generic to access eQTLObject used model information
get_model_info(x)
get_model_info(x)
x |
A eQTLObject object. |
used model information of eQTLObject.
data(testEQTL) get_model_info(testEQTL)
data(testEQTL) get_model_info(testEQTL)
Method to access eQTLObject used model information
## S4 method for signature 'eQTLObject' get_model_info(x)
## S4 method for signature 'eQTLObject' get_model_info(x)
x |
A eQTLObject object. |
used model information of eQTLObject.
Generic to access eQTLObject raw data
get_raw_data(x)
get_raw_data(x)
x |
A eQTLObject object. |
raw data matrix.
data(testEQTL) get_raw_data(testEQTL)
data(testEQTL) get_raw_data(testEQTL)
Method to access eQTLObject raw data
## S4 method for signature 'eQTLObject' get_raw_data(x)
## S4 method for signature 'eQTLObject' get_raw_data(x)
x |
A eQTLObject object. |
raw data matrix.
Generic to access the result of identifying eQTLs from scRNA-seq data
get_result_info(x)
get_result_info(x)
x |
A eQTLObject object. |
A dataframe.
data(testEQTL) get_result_info(testEQTL)
data(testEQTL) get_result_info(testEQTL)
Method to access the result of identifying eQTLs from scRNA-seq data
## S4 method for signature 'eQTLObject' get_result_info(x)
## S4 method for signature 'eQTLObject' get_result_info(x)
x |
A eQTLObject object. |
A dataframe.
This function initializes a progress bar for use in the 'linearModel' , 'poissonModel' and 'zinbModel' function. It is designed to provide feedback on the progress of the analysis by displaying the current step and a percentage completion.
initialize_progress_bar(total, k)
initialize_progress_bar(total, k)
total |
The total number of steps or iterations for which the progress bar will be updated. |
k |
A label or identifier for the specific group or iteration for which the progress bar is being initialized. |
A 'progress_bar' object from the 'progress' package, which is used to track and display the progress.
unique_group <- c("CMP", "GMP") total_snp_count <- 10 # assume each group have 100 SNP. pb_model <- lapply(unique_group, function(k) { pb <- initialize_progress_bar(total = total_snp_count, k) for (i in seq_len(total_snp_count)) { Sys.sleep(0.1) # assume progress time pb$tick() # update pb } })
unique_group <- c("CMP", "GMP") total_snp_count <- 10 # assume each group have 100 SNP. pb_model <- lapply(unique_group, function(k) { pb <- initialize_progress_bar(total = total_snp_count, k) for (i in seq_len(total_snp_count)) { Sys.sleep(0.1) # assume progress time pb$tick() # update pb } })
'limma_normalize()'normalizes an expression matrix using the quantile normalization method provided by the limma package.
limma_normalize(expressionMatrix)
limma_normalize(expressionMatrix)
expressionMatrix |
Input raw gene expression matrix. |
A gene expression matrix after normalized.
data(testGene) limma_normalize(testGene)
data(testGene) limma_normalize(testGene)
Linear model fitting the gene expression matrix and genotype matrix.
linearModel( eQTLObject, geneIDs, snpIDs, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05, logfcThreshold = 0.1 )
linearModel( eQTLObject, geneIDs, snpIDs, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05, logfcThreshold = 0.1 )
eQTLObject |
An S4 object of class eQTLObject. |
geneIDs |
Matching genes can be used to fit data. |
snpIDs |
Matching SNPs can be used to fit data. |
biClassify |
The user chooses whether to convert the counting method of the snpMatrix to 0/1/2, TRUE indicates conversion, and FALSE indicates no conversion, default is no conversion. |
pAdjustMethod |
Methods for p-value adjusting, one of "bonferroni", "holm", "hochberg", "hommel" or "BH". The default option is "bonferroni". |
pAdjustThreshold |
Only SNP-Gene pairs with adjusted p-values meeting the threshold will be displayed. Default by 0.05. |
logfcThreshold |
Represents the minimum beta threshold for fitting SNP-Gene pairs. Default by 0.1. |
Dataframe that contains gene-SNP pairs' information.
data(testEQTL) Gene <- rownames(slot(testEQTL, "filterData")$expMat) SNP <- rownames(slot(testEQTL, "filterData")$snpMat) linearResult <- linearModel( eQTLObject = testEQTL, geneIDs = Gene, snpIDs = SNP, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05, logfcThreshold = 0.025)
data(testEQTL) Gene <- rownames(slot(testEQTL, "filterData")$expMat) SNP <- rownames(slot(testEQTL, "filterData")$snpMat) linearResult <- linearModel( eQTLObject = testEQTL, geneIDs = Gene, snpIDs = SNP, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05, logfcThreshold = 0.025)
Generic to access eQTLObject biclassify information
load_biclassify_info(x)
load_biclassify_info(x)
x |
A eQTLObject object. |
biclassify information of eQTLObject.
data(testEQTL) load_biclassify_info(testEQTL)
data(testEQTL) load_biclassify_info(testEQTL)
Method to access eQTLObject biclassify information
## S4 method for signature 'eQTLObject' load_biclassify_info(x)
## S4 method for signature 'eQTLObject' load_biclassify_info(x)
x |
A eQTLObject object. |
biclassify information of eQTLObject.
Generic to access eQTLObject cell grouping information
load_group_info(x)
load_group_info(x)
x |
A eQTLObject object. |
A dataframe.
data(testEQTL) load_group_info(testEQTL)
data(testEQTL) load_group_info(testEQTL)
Method to access eQTLObject cell grouping information
## S4 method for signature 'eQTLObject' load_group_info(x)
## S4 method for signature 'eQTLObject' load_group_info(x)
x |
A eQTLObject object. |
A dataframe.
Generic to access eQTLObject species information
load_species_info(x)
load_species_info(x)
x |
A eQTLObject object. |
species information of eQTLObject.
data(testEQTL) load_species_info(testEQTL)
data(testEQTL) load_species_info(testEQTL)
Method to access eQTLObject species information
## S4 method for signature 'eQTLObject' load_species_info(x)
## S4 method for signature 'eQTLObject' load_species_info(x)
x |
A eQTLObject object. |
species information of eQTLObject.
'log_normalize()' transforms an expression matrix by applying logarithm and scaling operations to normalize data.
log_normalize(expressionMatrix)
log_normalize(expressionMatrix)
expressionMatrix |
Input raw gene expression matrix. |
A gene expression matrix after normalized.
data(testGene) log_normalize(testGene)
data(testGene) log_normalize(testGene)
Gene expression matrix normalization is necessary to eliminate technical biases and variabilities, ensuring accurate and comparable analysis of gene expression data. Here we provide 'normalizeGene()'to normalize the data.
normalizeGene(eQTLObject, method = "logNormalize")
normalizeGene(eQTLObject, method = "logNormalize")
eQTLObject |
An S4 object of class eQTLObject. |
method |
Method for normalizing for gene expression dataframe, one of "logNormalize", "CPM", "TPM", "DESeq" or "limma" |
A normalized gene expression matrix.
data(testEQTL) eqtl <- normalizeGene(testEQTL, method = "logNormalize")
data(testEQTL) eqtl <- normalizeGene(testEQTL, method = "logNormalize")
Theme options for customized plots
plots_theme_opts()
plots_theme_opts()
A ggplot2 theme object with customized settings.
library(ggplot2) data <- data.frame( x = c("A", "B", "C", "D", "E"), y = c(10, 20, 30, 40, 50)) ggplot(data, aes(x, y)) + geom_point() + plots_theme_opts()
library(ggplot2) data <- data.frame( x = c("A", "B", "C", "D", "E"), y = c(10, 20, 30, 40, 50)) ggplot(data, aes(x, y)) + geom_point() + plots_theme_opts()
Poisson model fitting the gene expression matrix and genotype matrix.
poissonModel( eQTLObject, geneIDs, snpIDs, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05, logfcThreshold = 0.1 )
poissonModel( eQTLObject, geneIDs, snpIDs, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05, logfcThreshold = 0.1 )
eQTLObject |
An S4 object of class eQTLObject. |
geneIDs |
Matching genes can be used to fit data. |
snpIDs |
Matching SNPs can be used to fit data. |
biClassify |
The user chooses whether to convert the counting method of the snpMatrix to 0/1/2, TRUE indicates conversion, and FALSE indicates no conversion, default is FALSE. |
pAdjustMethod |
Methods for p-value adjusting, one of "bonferroni", "holm", "hochberg", "hommel" or "BH". The default option is "bonferroni". |
pAdjustThreshold |
Only SNP-Gene pairs with adjusted p-values meeting the threshold will be displayed. The default value is 0.05. |
logfcThreshold |
Represents the minimum beta threshold for fitting SNP-Gene pairs. |
Dataframe that contains gene-SNP pairs' information.
data(testEQTL) Gene <- rownames(slot(testEQTL, "filterData")$expMat) SNP <- rownames(slot(testEQTL, "filterData")$snpMat) poissonResult <- poissonModel( eQTLObject = testEQTL, geneIDs = Gene, snpIDs = SNP, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05, logfcThreshold = 0.025 )
data(testEQTL) Gene <- rownames(slot(testEQTL, "filterData")$expMat) SNP <- rownames(slot(testEQTL, "filterData")$snpMat) poissonResult <- poissonModel( eQTLObject = testEQTL, geneIDs = Gene, snpIDs = SNP, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05, logfcThreshold = 0.025 )
Process a matrix to extract a row and convert it to a data frame
process_matrix(id, matrix, name)
process_matrix(id, matrix, name)
id |
The identifier for the row to be extracted from the matrix. |
matrix |
The input matrix from which the row will be extracted. |
name |
The column names for the resulting data frame. |
A data frame containing the extracted row and a column with the row names.
rownames <- c("CNN2", "TIGD2", "DTD2") colnames <- c("Col1", "Col2", "Col3", "Col4") matrix_data <- matrix(1:12, nrow = 3, ncol = 4, dimnames = list(rownames, colnames)) geneid <- "CNN2" gene_mat <- process_matrix(geneid, matrix_data, "gene_mat")
rownames <- c("CNN2", "TIGD2", "DTD2") colnames <- c("Col1", "Col2", "Col3", "Col4") matrix_data <- matrix(1:12, nrow = 3, ncol = 4, dimnames = list(rownames, colnames)) geneid <- "CNN2" gene_mat <- process_matrix(geneid, matrix_data, "gene_mat")
remove_outliers() is a function designed to process gene expression data stored in an expression matrix. It identifies outliers within the data based on the MAD method and filters them out. The function updates specified cell lists by retaining only those cells that have non-outlier expression values for a specified gene.
remove_outliers(exprsMat, Geneid, A_cells, B_cells, C_cells = NULL)
remove_outliers(exprsMat, Geneid, A_cells, B_cells, C_cells = NULL)
exprsMat |
Input gene expression matrix |
Geneid |
Chosen gene id. |
A_cells |
A genotype cells |
B_cells |
B genotype cells |
C_cells |
C genotype cells |
a list of cells ids
## Mock expression matrix set.seed(123) exprsMat <- matrix(rnorm(200), nrow = 5) rownames(exprsMat) <- paste0("Gene", 1:nrow(exprsMat)) colnames(exprsMat) <- paste0("cell", 1:ncol(exprsMat)) A_cells <- colnames(exprsMat)[1:13] # Example A cell list B_cells <- colnames(exprsMat)[14:26] # Example B cell list C_cells <- colnames(exprsMat)[27:40] # Example C cell list remove_outliers(exprsMat, "Gene1", A_cells, B_cells, C_cells)
## Mock expression matrix set.seed(123) exprsMat <- matrix(rnorm(200), nrow = 5) rownames(exprsMat) <- paste0("Gene", 1:nrow(exprsMat)) colnames(exprsMat) <- paste0("cell", 1:ncol(exprsMat)) A_cells <- colnames(exprsMat)[1:13] # Example A cell list B_cells <- colnames(exprsMat)[14:26] # Example B cell list C_cells <- colnames(exprsMat)[27:40] # Example C cell list remove_outliers(exprsMat, "Gene1", A_cells, B_cells, C_cells)
Generic to set eQTLObject filter data
set_filter_data(x, value, name)
set_filter_data(x, value, name)
x |
A eQTLObject object. |
value |
The filtered data. |
name |
The matrix named 'name' is stored under the 'filterData' slot as an element within its list. |
eQTLObject.
data(testEQTL) data123 <- matrix(0, nrow = 3, ncol = 3) set_filter_data(testEQTL, data123, "expMat")
data(testEQTL) data123 <- matrix(0, nrow = 3, ncol = 3) set_filter_data(testEQTL, data123, "expMat")
Method to set eQTLObject filter data
## S4 method for signature 'eQTLObject' set_filter_data(x, value, name)
## S4 method for signature 'eQTLObject' set_filter_data(x, value, name)
x |
A eQTLObject object. |
value |
The filtered data. |
name |
The matrix named 'name' is stored under the 'filterData' slot as an element within its list. |
eQTLObject.
data(testEQTL) data123 <- matrix(0, nrow = 3, ncol = 3) set_filter_data(testEQTL, data123, "expMat")
data(testEQTL) data123 <- matrix(0, nrow = 3, ncol = 3) set_filter_data(testEQTL, data123, "expMat")
Generic to set eQTLObject used model information
set_model_info(x, value)
set_model_info(x, value)
x |
A eQTLObject object. |
value |
The used model information to set to eQTLObject. |
eQTLObject.
data(testEQTL) useModel <- "zinb" set_model_info(testEQTL, useModel)
data(testEQTL) useModel <- "zinb" set_model_info(testEQTL, useModel)
Method to set eQTLObject used model information
## S4 method for signature 'eQTLObject' set_model_info(x, value)
## S4 method for signature 'eQTLObject' set_model_info(x, value)
x |
A eQTLObject object. |
value |
The used model information to set to eQTLObject. |
eQTLObject.
data(testEQTL) useModel <- "zinb" set_model_info(testEQTL, useModel)
data(testEQTL) useModel <- "zinb" set_model_info(testEQTL, useModel)
Generic to set eQTLObject raw data
set_raw_data(x, value, name)
set_raw_data(x, value, name)
x |
A eQTLObject object. |
value |
The raw data. |
name |
The matrix named 'name' is stored under the 'rawData' slot as an element within its list. |
eQTLObject.
data(testEQTL) data123 <- matrix(0, nrow = 3, ncol = 3) set_raw_data(testEQTL, data123, "rawExpMat")
data(testEQTL) data123 <- matrix(0, nrow = 3, ncol = 3) set_raw_data(testEQTL, data123, "rawExpMat")
Method to set eQTLObject raw data
## S4 method for signature 'eQTLObject' set_raw_data(x, value, name)
## S4 method for signature 'eQTLObject' set_raw_data(x, value, name)
x |
A eQTLObject object. |
value |
The raw data. |
name |
The matrix named 'name' is stored under the 'rawData' slot as an element within its list. |
eQTLObject.
data(testEQTL) data123 <- matrix(0, nrow = 3, ncol = 3) set_raw_data(testEQTL, data123, "rawExpMat")
data(testEQTL) data123 <- matrix(0, nrow = 3, ncol = 3) set_raw_data(testEQTL, data123, "rawExpMat")
Generic to set the result of identifying eQTLs from scRNA-seq data
set_result_info(x, value)
set_result_info(x, value)
x |
A eQTLObject object. |
value |
A dataframe, each row describes eQTL discovering result of a SNP-Gene pair. |
eQTLObject.
data(testEQTL) result <- matrix(0, nrow = 3, ncol = 3) set_result_info(testEQTL, result)
data(testEQTL) result <- matrix(0, nrow = 3, ncol = 3) set_result_info(testEQTL, result)
Method to set the result of identifying eQTLs from scRNA-seq data
## S4 method for signature 'eQTLObject' set_result_info(x, value)
## S4 method for signature 'eQTLObject' set_result_info(x, value)
x |
A eQTLObject object. |
value |
A dataframe, each row describes eQTL discovering result of a SNP-Gene pair. |
eQTLObject.
data(testEQTL) result <- matrix(0, nrow = 3, ncol = 3) set_result_info(testEQTL, result)
data(testEQTL) result <- matrix(0, nrow = 3, ncol = 3) set_result_info(testEQTL, result)
This method is to display information about an object of class eQTLObject. When called on an eQTLObject, it prints a descriptive message to the console
## S4 method for signature 'eQTLObject' show(object)
## S4 method for signature 'eQTLObject' show(object)
object |
An S4 object of class eQTLObject. |
information of eQTLObject
data(testEQTL) testEQTL
data(testEQTL) testEQTL
An 'eqtlObject' created by the 'createQTLObject' function, where the raw expression matrix is normalized using 'normalizeGene()', and both the genotype matrix and the normalized gene expression matrix are filtered by 'filterGeneSNP()'.
testEQTL.rds is the RDS format versions of the original testEQTL.rda, providing the same normalized eQTL object for easier loading and use in R.
data(testEQTL) data(testEQTL)
data(testEQTL) data(testEQTL)
A simple object
.
A eqtlObject
read by the 'readRDS' function.
data(testEQTL) data(testEQTL)
data(testEQTL) data(testEQTL)
A dataset containing example gene expression data for testing purposes. 100 rows and 2705 colnums. The row names represent gene IDs or SYMBOL and the column names represent cell IDs.
data(testGene)
data(testGene)
A simple matrix
.
data(testGene)
data(testGene)
A Seurat object for single-cell RNA-seq data.
testSeurat.rds datasets are the RDS format versions of the original testSeurat.rda files, providing the preprocessed Seurat object for easier loading and use in R.
data(testSeurat) data(testSeurat)
data(testSeurat) data(testSeurat)
A object
A Seurat
read by the 'readRDS' function.
data(testSeurat) data(testSeurat)
data(testSeurat) data(testSeurat)
A dataset containing single nucleotide variant data. 1000 rows and 2705 colnums. Each row is one variant and each column is one cell.
data(testSNP)
data(testSNP)
A simple matrix
.
data(testSNP)
data(testSNP)
A dataset containing single nucleotide variant data.500 rows and 500 colnums Each row is one variant and each column is one cell.
data(testSNP2)
data(testSNP2)
A simple matrix
.
data(testSNP2)
data(testSNP2)
'TPM_normalize()' scales an expression matrix using Transcripts Per Million (TPM) normalization, applying logarithm and scaling operations to adjust data based on library size.
TPM_normalize(expressionMatrix)
TPM_normalize(expressionMatrix)
expressionMatrix |
Input raw gene expression matrix. |
A gene expression matrix after normalized.
data(testGene) TPM_normalize(testGene)
data(testGene) TPM_normalize(testGene)
visualizeQTL: Visualize the gene-snp pairs by group.
visualizeQTL( eQTLObject, SNPid, Geneid, groupName = NULL, plottype = "QTLplot", removeoutlier = FALSE )
visualizeQTL( eQTLObject, SNPid, Geneid, groupName = NULL, plottype = "QTLplot", removeoutlier = FALSE )
eQTLObject |
An S4 object of class eQTLObject. |
SNPid |
ID of SNP. |
Geneid |
ID of Gene. |
groupName |
Users can choose one or more than one single cell groups. |
plottype |
Types of plot,one of "QTLplot","violin","boxplot" or "histplot". |
removeoutlier |
Whether identify and remove the outliers. Default by FALSE. |
list
data(testEQTL) ## We have to call the eQTLs firstly using `callQTL()`. eqtl <- callQTL(eQTLObject = testEQTL, useModel = "linear") visualizeQTL(eQTLObject = eqtl, SNPid = "1:632647", Geneid = "RPS27", groupName = NULL, plottype = "QTLplot", removeoutlier = FALSE)
data(testEQTL) ## We have to call the eQTLs firstly using `callQTL()`. eqtl <- callQTL(eQTLObject = testEQTL, useModel = "linear") visualizeQTL(eQTLObject = eqtl, SNPid = "1:632647", Geneid = "RPS27", groupName = NULL, plottype = "QTLplot", removeoutlier = FALSE)
Zinb model fitting the gene expression matrix.
zinbModel( eQTLObject, geneIDs, snpIDs, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05 )
zinbModel( eQTLObject, geneIDs, snpIDs, biClassify = FALSE, pAdjustMethod = "bonferroni", pAdjustThreshold = 0.05 )
eQTLObject |
An S4 object of class eQTLObject. |
geneIDs |
Matching genes can be used to fit data. |
snpIDs |
Matching SNPs can be used to fit data. |
biClassify |
The user chooses whether to convert the counting method of the snpMatrix to 0/1/2, TRUE indicates conversion, and FALSE indicates no conversion, default is no conversion. |
pAdjustMethod |
Methods for p-value adjusting, one of 'bonferroni', 'holm', 'hochberg', 'hommel' or 'BH'. The default option is 'bonferroni'. |
pAdjustThreshold |
Only SNP-Gene pairs with adjusted p-values meeting the threshold will be displayed. Default by 0.05. |
Dataframe that contains gene-SNP pairs' information.
data(testEQTL) Gene <- rownames(slot(testEQTL, 'filterData')$expMat) SNP <- rownames(slot(testEQTL, 'filterData')$snpMat) zinbResult <- zinbModel( eQTLObject = testEQTL, geneIDs = Gene, snpIDs = SNP, biClassify = FALSE, pAdjustMethod = 'bonferroni', pAdjustThreshold = 0.05)
data(testEQTL) Gene <- rownames(slot(testEQTL, 'filterData')$expMat) SNP <- rownames(slot(testEQTL, 'filterData')$snpMat) zinbResult <- zinbModel( eQTLObject = testEQTL, geneIDs = Gene, snpIDs = SNP, biClassify = FALSE, pAdjustMethod = 'bonferroni', pAdjustThreshold = 0.05)