Title: | Quality Control Tool for High-Throughput Sequencing Data |
---|---|
Description: | Rqc is an optimised tool designed for quality control and assessment of high-throughput sequencing data. It performs parallel processing of entire files and produces a report which contains a set of high-resolution graphics. |
Authors: | Welliton Souza, Benilton Carvalho <[email protected]> |
Maintainer: | Welliton Souza <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.41.0 |
Built: | 2024-12-19 04:09:37 UTC |
Source: | https://github.com/bioc/Rqc |
Rqc is an optimized tool designed for quality assessment of high-throughput sequencing data. It performs parallel processing of entire files and produces a report, which contains a set of high-resolution images that can be directly used on publications.
Welliton Souza, Benilton Carvalho
Maintainer: Welliton Souza <[email protected]>
options(device.ask.default = FALSE) folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") rqc(folder, ".fastq.gz", pair=c(1,1), workers=1)
options(device.ask.default = FALSE) folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") rqc(folder, ".fastq.gz", pair=c(1,1), workers=1)
This utility function can be used to save time on task that takes long time to complete. A Rda file are written on disk containing only objects setted to keep. If checkpoint function find related Rda file then this Rda will be loaded.
checkpoint( label, CODE, path = ".", overwrite = FALSE, verbose = FALSE, keep = NULL )
checkpoint( label, CODE, path = ".", overwrite = FALSE, verbose = FALSE, keep = NULL )
label |
name of this code, will create a Rda file with the same name. |
CODE |
R code. |
path |
directory to write/load Rda file. |
overwrite |
Rerun CODE and replace Rda file. |
verbose |
argument passed to load function |
keep |
vector of object/variable name to keep. NULL means error. |
Nothing.
Experimental function.
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet")
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet")
Detect file format
detectFileFormat(file)
detectFileFormat(file)
file |
file name |
FastqFile or BamFiles objects
folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) input <- lapply(files, detectFileFormat) sapply(input, class)
folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) input <- lapply(files, detectFileFormat) sapply(input, class)
This function receives a vector of strings containing codified DNA and returns a vector of string containing original DNA sequences.
fromRRDNA(rrdnas)
fromRRDNA(rrdnas)
rrdnas |
Vector of codified DNA (character vector). |
Vector of original DNA sequences (character vector).
This function is used internally to restore original DNA sequences stored in RqcResultSet objects (per file top reads).
Welliton Souza
dna <- "ATCG" dna.converted <- toRRDNA(dna) dna.reverted <- fromRRDNA(dna.converted) all.equal(dna, dna.reverted)
dna <- "ATCG" dna.converted <- toRRDNA(dna) dna.reverted <- fromRRDNA(dna.converted) all.equal(dna, dna.reverted)
This function receives a vector of strings representing codified DNA sequences and returns a integer matrix representing the similarities between all sequences from input vectors.
matdist(rrdnas)
matdist(rrdnas)
rrdnas |
Vector of codified DNA sequences (character vector). |
Matrix , where
is the length of the largest
original DNA sequence.
This function is used internally to compute data for rqcFileHeatmap function.
Welliton Souza
dna1 <- toRRDNA("atcgn") dna2 <- toRRDNA("atcga") matdist(c(dna1, dna2))
dna1 <- toRRDNA("atcgn") dna2 <- toRRDNA("atcga") matdist(c(dna1, dna2))
Rqc is an optimized tool designed for quality assessment of high-throughput sequencing data. It performs parallel processing of entire files and produces an HTML report, which contains a set of high-resolution images that can be directly used on publications.
rqc( path = ".", pattern, sample = TRUE, n = 1e+06, group = NULL, top = 10, pair = NULL, outdir = tempdir(), file = "rqc_report", openBrowser = TRUE, workers = multicoreWorkers() )
rqc( path = ".", pattern, sample = TRUE, n = 1e+06, group = NULL, top = 10, pair = NULL, outdir = tempdir(), file = "rqc_report", openBrowser = TRUE, workers = multicoreWorkers() )
path |
directory path that contains input files. |
pattern |
a regex expression that macthes to input file names |
sample |
it reads a random sample from files if this parameter is TRUE. |
n |
number of sequences to read from each input file. This represents sample size if 'sample' parameter is TRUE, if not represents the chunk size to read on each iteration. By default, it reads a sample of one million sequences from each input file. |
group |
group name for each input file. |
top |
number of top over-represented reads. Default is 10 reads. |
pair |
combination of files for paired-end reads. By default, all input
files are treated as single-end. For paired-end, please define a vector of
numbers where two index with the same value represent a pair. Examples,
single-end |
outdir |
output directory path. Is created a temporary directory by default. |
file |
output file name. |
openBrowser |
if TRUE opens report file on default Internet Browser. |
workers |
Number of parallel workers. Set 1 to serial. Default value
from |
A invisible named list of RqcResultSet
objects, each one
represents a file.
Welliton Souza
options(device.ask.default = FALSE) folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") rqc(folder, ".fastq.gz", pair=c(1,1), workers=1, openBrowser=FALSE)
options(device.ask.default = FALSE) folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") rqc(folder, ".fastq.gz", pair=c(1,1), workers=1, openBrowser=FALSE)
This function plots line graph of per cycle average quality.
rqcCycleAverageQualityCalc(rqcResultSet) rqcCycleAverageQualityPlot(rqcResultSet)
rqcCycleAverageQualityCalc(rqcResultSet) rqcCycleAverageQualityPlot(rqcResultSet)
rqcResultSet |
list of |
ggplot2 object
rqcCycleAverageQualityCalc
: calculates necessary statistics
Welliton Souza
rqcGroupCycleAverageQualityPlot
plots cycle-specific quality by groups
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleAverageQualityPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleAverageQualityPlot(rqcResultSet)
This function creates a Biplot of PCA of per cycle read average quality
rqcCycleAverageQualityPcaCalc(rqcResultSet) rqcCycleAverageQualityPcaPlot(rqcResultSet)
rqcCycleAverageQualityPcaCalc(rqcResultSet) rqcCycleAverageQualityPcaPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcCycleAverageQualityPcaCalc
: calculates necessary statistics
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleAverageQualityPcaPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleAverageQualityPcaPlot(rqcResultSet)
Creates a bar graph of per cycle base calls.
rqcCycleBaseCallsCalc(rqcResultSet) rqcCycleBaseCallsLinePlot(rqcResultSet) rqcCycleBaseCallsPlot(rqcResultSet)
rqcCycleBaseCallsCalc(rqcResultSet) rqcCycleBaseCallsLinePlot(rqcResultSet) rqcCycleBaseCallsPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcCycleBaseCallsCalc
: calculates necessary statistics
rqcCycleBaseCallsLinePlot
: creates a line graph
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleBaseCallsPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleBaseCallsPlot(rqcResultSet)
Creates a line graph of per cycle percentual GC.
rqcCycleGCCalc(rqcResultSet) rqcCycleGCPlot(rqcResultSet)
rqcCycleGCCalc(rqcResultSet) rqcCycleGCPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcCycleGCCalc
: calculates necessary statistics
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleGCPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleGCPlot(rqcResultSet)
Plots per cycle quality box plot.
rqcCycleQualityBoxCalc(rqcResultSet) rqcCycleQualityBoxPlot(rqcResultSet)
rqcCycleQualityBoxCalc(rqcResultSet) rqcCycleQualityBoxPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcCycleQualityBoxCalc
: calculates necessary statistics
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleQualityBoxPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcCycleQualityBoxPlot(rqcResultSet)
Creates a graph of per cycle quality.
rqcCycleQualityCalc(rqcResultSet) rqcCycleQualityPlot(rqcResultSet)
rqcCycleQualityCalc(rqcResultSet) rqcCycleQualityPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcCycleQualityCalc
: calculates necessary statistics
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, workers=1) }, keep="rqcResultSet") rqcCycleQualityPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, workers=1) }, keep="rqcResultSet") rqcCycleQualityPlot(rqcResultSet)
This function plots a heatmap of distance matrix of top over-represented reads. This function does not work with list of RqcResultSet objects, only with one RqcResultSet object.
rqcFileHeatmap( rqcResultSet, dist.method = "euclidean", hclust.method = "ward.D" )
rqcFileHeatmap( rqcResultSet, dist.method = "euclidean", hclust.method = "ward.D" )
rqcResultSet |
|
dist.method |
the distance measure to be used by |
hclust.method |
the agglomeration method to be used by |
Plot object from ggplot
function.
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcFileHeatmap(rqcResultSet[[1]])
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcFileHeatmap(rqcResultSet[[1]])
This function plots cycle-specific quality by groups
rqcGroupCycleAverageQualityCalc(rqcResultSet) rqcGroupCycleAverageQualityPlot(rqcResultSet)
rqcGroupCycleAverageQualityCalc(rqcResultSet) rqcGroupCycleAverageQualityPlot(rqcResultSet)
rqcResultSet |
list of |
ggplot2 object
rqcGroupCycleAverageQualityCalc
: calculates necessary statistics
Welliton Souza
rqcCycleAverageQualityPlot
plots cycle-specific quality by files
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcGroupCycleAverageQualityPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcGroupCycleAverageQualityPlot(rqcResultSet)
Process a set of files and returns a list of quality control data. Files must be FASTQ format, compressed or not.
rqcQA( x, sample = TRUE, n = 1e+06, group = rep("None", length(x)), top = 10, pair = seq_along(x), ... ) ## S4 method for signature 'list' rqcQA(x, sample, n, group, top, pair, workers = multicoreWorkers()) ## S4 method for signature 'character' rqcQA( x, sample = TRUE, n = 1e+06, group = rep("None", length(x)), top = 10, pair = seq_along(x), workers = multicoreWorkers() ) ## S4 method for signature 'BamFile' rqcQA(x, sample, n, group, top, pair) ## S4 method for signature 'FastqFile' rqcQA(x, sample, n, group, top, pair)
rqcQA( x, sample = TRUE, n = 1e+06, group = rep("None", length(x)), top = 10, pair = seq_along(x), ... ) ## S4 method for signature 'list' rqcQA(x, sample, n, group, top, pair, workers = multicoreWorkers()) ## S4 method for signature 'character' rqcQA( x, sample = TRUE, n = 1e+06, group = rep("None", length(x)), top = 10, pair = seq_along(x), workers = multicoreWorkers() ) ## S4 method for signature 'BamFile' rqcQA(x, sample, n, group, top, pair) ## S4 method for signature 'FastqFile' rqcQA(x, sample, n, group, top, pair)
x |
input file(s) |
sample |
It reads a random sample from files if this parameter is TRUE. |
n |
Number of sequences to read from each input file. This represents sample size if 'sample' parameter is TRUE, if not represents the chunk size to read on each iteration. Default is read a sample of one million sequences from each input file. |
group |
group name for each input file. |
top |
number of top over-represented reads. Default is 10 reads. |
pair |
combination of files for paired-end reads. By default, all input
files are treated as single-end. For paired-end, please define a vector of
numbers where two index with the same value represent a pair. Examples,
single-end |
... |
other parameters |
workers |
number of parallel workers |
Input files are read using FastStreamer
and FastSampler
classes of ShortRead
package. Process multiple files in
parallel using bplapply
function of BiocParallel
package.
A named list of RqcResultSet
objects, each one represents a
file.
character
: automatically detects file format
(using detectFileFormat
function) of input files then process.
BamFile
: process only one BAM file.
FastqFile
: process only one FASTQ file.
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadQualityPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadQualityPlot(rqcResultSet)
This function creates a bar graph of read frequency (in percentage).
rqcReadFrequencyCalc(rqcResultSet) rqcReadFrequencyPlot(rqcResultSet)
rqcReadFrequencyCalc(rqcResultSet) rqcReadFrequencyPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcReadFrequencyCalc
: calculates necessary statistics
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadFrequencyPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadFrequencyPlot(rqcResultSet)
This function creates crate a graphic charts with box plots describing per read mean quality distribution for each input file
rqcReadQualityBoxCalc(rqcResultSet) rqcReadQualityBoxPlot(rqcResultSet)
rqcReadQualityBoxCalc(rqcResultSet) rqcReadQualityBoxPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcReadQualityBoxCalc
: calculates necessary statistics
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadQualityBoxPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadQualityBoxPlot(rqcResultSet)
Plots the quality of all the files by read.
rqcReadQualityCalc(rqcResultSet) rqcReadQualityPlot(rqcResultSet)
rqcReadQualityCalc(rqcResultSet) rqcReadQualityPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcReadQualityCalc
: calculates necessary statistics
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadQualityPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadQualityPlot(rqcResultSet)
Creates bar graph of per read width from all elements of input list.
rqcReadWidthCalc(rqcResultSet) rqcReadWidthPlot(rqcResultSet)
rqcReadWidthCalc(rqcResultSet) rqcReadWidthPlot(rqcResultSet)
rqcResultSet |
list of |
Plot object from ggplot
function.
rqcReadWidthCalc
: calculates necessary statistics
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadWidthPlot(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") rqcReadWidthPlot(rqcResultSet)
Generates an HTML report file.
rqcReport( rqcResultSet, outdir = tempdir(), file = "rqc_report", keepMD = FALSE, templateFile = system.file("templates", package = "Rqc", "rqc_report.Rmd") )
rqcReport( rqcResultSet, outdir = tempdir(), file = "rqc_report", keepMD = FALSE, templateFile = system.file("templates", package = "Rqc", "rqc_report.Rmd") )
rqcResultSet |
list of |
outdir |
output directory path. It is created a temporary directory by default. |
file |
output file name. |
keepMD |
If true Rqc does not delete markdown file. |
templateFile |
Path of Rmarkdown file as Rqc web report template. |
Also creates a directory called "figure" in outdir
path.
Report file path.
Welliton Souza
options(device.ask.default = FALSE) checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") reportFile <- rqcReport(rqcResultSet) browseURL(reportFile)
options(device.ask.default = FALSE) checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") reportFile <- rqcReport(rqcResultSet) browseURL(reportFile)
Class RqcResultSet
Frequency distribution of cycle-specific base call
Frequency distribution of cycle-specific quality
File information
Top over-represented sequencing reads
Read frequency table
Frequency distribution of per read mean quality
Frequency distribution of read width
perCycleBasecall(x) ## S4 method for signature 'RqcResultSet' perCycleBasecall(x) ## S4 method for signature 'list' perCycleBasecall(x) perCycleQuality(x) ## S4 method for signature 'RqcResultSet' perCycleQuality(x) ## S4 method for signature 'list' perCycleQuality(x) perFileInformation(x) ## S4 method for signature 'RqcResultSet' perFileInformation(x) ## S4 method for signature 'list' perFileInformation(x) perFileTopReads(x) ## S4 method for signature 'RqcResultSet' perFileTopReads(x) ## S4 method for signature 'list' perFileTopReads(x) perReadFrequency(x) ## S4 method for signature 'RqcResultSet' perReadFrequency(x) ## S4 method for signature 'list' perReadFrequency(x) perReadQuality(x) ## S4 method for signature 'RqcResultSet' perReadQuality(x) ## S4 method for signature 'list' perReadQuality(x) perReadWidth(x) ## S4 method for signature 'RqcResultSet' perReadWidth(x) ## S4 method for signature 'list' perReadWidth(x)
perCycleBasecall(x) ## S4 method for signature 'RqcResultSet' perCycleBasecall(x) ## S4 method for signature 'list' perCycleBasecall(x) perCycleQuality(x) ## S4 method for signature 'RqcResultSet' perCycleQuality(x) ## S4 method for signature 'list' perCycleQuality(x) perFileInformation(x) ## S4 method for signature 'RqcResultSet' perFileInformation(x) ## S4 method for signature 'list' perFileInformation(x) perFileTopReads(x) ## S4 method for signature 'RqcResultSet' perFileTopReads(x) ## S4 method for signature 'list' perFileTopReads(x) perReadFrequency(x) ## S4 method for signature 'RqcResultSet' perReadFrequency(x) ## S4 method for signature 'list' perReadFrequency(x) perReadQuality(x) ## S4 method for signature 'RqcResultSet' perReadQuality(x) ## S4 method for signature 'list' perReadQuality(x) perReadWidth(x) ## S4 method for signature 'RqcResultSet' perReadWidth(x) ## S4 method for signature 'list' perReadWidth(x)
x |
|
data frame
data frame
data frame
data frame
data frame
data frame
data frame
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") head(perCycleBasecall(rqcResultSet)) head(perCycleQuality(rqcResultSet)) head(perReadFrequency(rqcResultSet)) head(perReadQuality(rqcResultSet)) head(perReadWidth(rqcResultSet)) perFileInformation(rqcResultSet) perFileTopReads(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") head(perCycleBasecall(rqcResultSet)) head(perCycleQuality(rqcResultSet)) head(perReadFrequency(rqcResultSet)) head(perReadQuality(rqcResultSet)) head(perReadWidth(rqcResultSet)) perFileInformation(rqcResultSet) perFileTopReads(rqcResultSet)
This function runs a Shiny web application of interactive Rqc report. This is useful for large amount of files and sample groups.
rqcShinyReport(rqcResultSet)
rqcShinyReport(rqcResultSet)
rqcResultSet |
list of |
function
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") # rqcShinyReport(rqcResultSet)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") # rqcShinyReport(rqcResultSet)
This function estimates how many reads would be lost if the sequences are filtered by a minimum read mean quality value. Also this function estimates what is the minimum read mean quality value for filtering and lose max percentage defined.
stats4trim(rqcResultSet, qmin, pmax)
stats4trim(rqcResultSet, qmin, pmax)
rqcResultSet |
list of |
qmin |
Minimum read mean quality value (bewteen 0 and 41). |
pmax |
Maximum percentage of reads permitted been lost during trimming step. |
A data frame containg estimated minimum quality and maximum percentage for each input file.
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") stats4trim(rqcResultSet, qmin=20) stats4trim(rqcResultSet, pmax=10)
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") stats4trim(rqcResultSet, qmin=20) stats4trim(rqcResultSet, pmax=10)
This function subsets RqcResultSet object function by group name.
subsetByGroup(rqcResultSet, group)
subsetByGroup(rqcResultSet, group)
rqcResultSet |
list of |
group |
Name of the group to subset |
list of RqcResultSet
objects from only one group.
Welliton Souza
folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, workers=1, group=c("a", "b")) perFileInformation(subsetByGroup(rqcResultSet, "a"))
folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, workers=1, group=c("a", "b")) perFileInformation(subsetByGroup(rqcResultSet, "a"))
This function subsets RqcResultSet object function by pair files.
subsetByPair(rqcResultSet, pair)
subsetByPair(rqcResultSet, pair)
rqcResultSet |
list of |
pair |
index of the pair |
list of RqcResultSet
objects from only one pair.
Welliton Souza
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") perFileInformation(subsetByPair(rqcResultSet, 1))
checkpoint("Rqc", path=system.file(package="Rqc", "extdata"), { folder <- system.file(package="ShortRead", "extdata/E-MTAB-1147") files <- list.files(full.names=TRUE, path=folder) rqcResultSet <- rqcQA(files, pair=c(1,1), workers=1) }, keep="rqcResultSet") perFileInformation(subsetByPair(rqcResultSet, 1))
This function receives a vector of strings (character vector) containing DNA sequences and returns a vector of strings containing codified DNA.
toRRDNA(dnas)
toRRDNA(dnas)
dnas |
Vector of DNA sequences (character vector). |
Vector of DNA converted to reduced representation format (character vector).
This function is used internally to compute top over-represented reads and to store in RqcResultSet objects (per file top reads).
Welliton Souza
dna <- "ATCGNATCGTA" dna.converted <- toRRDNA(dna) nchar(dna) nchar(dna.converted)
dna <- "ATCGNATCGTA" dna.converted <- toRRDNA(dna) nchar(dna) nchar(dna.converted)