Title: | cn.FARMS - factor analysis for copy number estimation |
---|---|
Description: | This package implements the cn.FARMS algorithm for copy number variation (CNV) analysis. cn.FARMS can analyze the most common Affymetrix array types (250K to SNP 6.0) and supports high-performance computing using snow and ff. |
Authors: | Andreas Mitterecker, Djork-Arne Clevert |
Maintainer: | Andreas Mitterecker <[email protected]> |
License: | LGPL (>= 2.0) |
Version: | 1.53.0 |
Built: | 2024-07-07 05:48:03 UTC |
Source: | https://github.com/bioc/cn.farms |
Defines which variables should be written back when calling a cn.farms run
callSummarize(object, psInfo, summaryMethod, summaryParam, batchList = NULL, cores = 1, runtype = "ff", returnValues, saveFile = "summData")
object |
A matrix with normalized intensity values. |
psInfo |
a data frame stating the physical position. |
summaryMethod |
the summarization method. |
summaryParam |
a list with the parameters of the summarization method. |
batchList |
batchList |
cores |
Number of cores used for parallelization. |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
returnValues |
list with return values. For possible values see summaryMethod. |
saveFile |
name of the file to save. |
Results of FARMS run with specified parameters - exact FARMS version
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
Wrapper for the cn.farms algorithm
cn.farms(filenames, cores = 1, runtype = "bm")
filenames |
the absolute filepaths of the CEL files. |
cores |
number of parallel instances. |
runtype |
either ff or bm. |
An instance of ExpressionSet containing the results of the analysis.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
## Not run:
require('hapmapsnp6')
celDir <- system.file('celFiles', package = 'hapmapsnp6')
filenames <- dir(path = celDir, full.names = TRUE)
cn.farms(filenames = filenames)
## End(Not run)
This function was taken from snowfall and edited due to some deprecated function calls.
cnLibrary(package, pos = 2, lib.loc = NULL, character.only = FALSE, warn.conflicts = TRUE, keep.source = getOption("keep.source.pkgs"), verbose = getOption("verbose"), version, stopOnError = TRUE)
package |
name of the package. Check 'library' for details. |
pos |
position in search path to load library. |
lib.loc |
a character vector describing the location of the R library trees to search through, or 'NULL'. Check 'library' for details. |
character.only |
a logical indicating package can be assumed to be a character string. Check 'library' for details. |
warn.conflicts |
warn on conflicts (see "library"). |
keep.source |
DEPRECATED (see "library"). |
verbose |
enable verbose messages. |
version |
version of library to load (see "library"). |
stopOnError |
logical. |
for more information see "library".
Suitable for SNP or non-polymorphic data which were already processed with single locus FARMS
combineData(object01, object02, obj01Var = "intensity", obj02Var = "intensity", runtype = "ff", saveFile = "combData")
object01 |
An instance of |
object02 |
An instance of |
obj01Var |
States the variable which should be combined from the assayData slot. Default is intensity. |
obj02Var |
States the variable which should be combined from the assayData slot. Default is intensity. |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
saveFile |
Name of the file to save. |
An instance of ExpressionSet.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/normData.RData", package = "cn.farms"))
notes(experimentData(normData))$annotDir <- system.file("exampleData/annotation/pd.genomewidesnp.6/1.1.0", package = "cn.farms")
summaryMethod <- "Variational"
summaryParam <- list()
summaryParam$cyc <- c(10)
slData <- slSummarization(normData, summaryMethod = summaryMethod, summaryParam = summaryParam)
assayData(slData)$L_z[1:10, ]
combData <- combineData(slData, slData)
combData
Annotation files for cn.farms are created
createAnnotation(filenames = NULL, annotation = NULL, annotDir = NULL, checks = TRUE)
filenames |
An absolute path of the CEL files to process. |
annotation |
Optional parameter stating the annotation from a pd-mapping. |
annotDir |
Optional parameter stating where the annotation should go. |
checks |
States if sanity checks should be done. |
NULL
The annotation files used for cn.farms will be placed in the current working directory under annotations.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
## Not run:
library("hapmapsnp6")
celDir <- system.file("celFiles", package = "hapmapsnp6")
filenames <- dir(path = celDir, full.names = TRUE)
createAnnotation(filenames = filenames)
## End(Not run)
Creates the needed matrix
createMatrix(runtype, nrow, ncol, type = "double", bmName = "NA")
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
nrow |
nrow |
ncol |
ncol |
type |
type |
bmName |
Identifier for ff name |
a matrix
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
Be aware that this implementation is quite slow.
distributionDistance(intensityData, method = c("JSDiv", "KLDiv", "KLInf"), useSubset = T, subsetFraction = 0.25, useQuantileReference = FALSE)
intensityData |
A matrix or an AffyBatch object. |
method |
The method you want to use. |
useSubset |
Logical. States if only a subset should be used. |
subsetFraction |
The fraction of the subset. |
useQuantileReference |
Logical for a quantile reference. |
Computes the distribution distance
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/normData.RData", package = "cn.farms"))
x <- assayData(normData)$intensity[, 1:3]
y <- distributionDistance(x)
attr(y, "Labels") <- substr(sampleNames(normData), 1, 7)
plotDendrogram(y)
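The base R sketch below illustrates how a pairwise distribution distance between arrays could be computed. It assumes that the method name JSDiv refers to the Jensen-Shannon divergence; the helper jsDivergence, its nBins argument and the simulated data are hypothetical, and this is not the package's implementation.

```r
# Hypothetical helper (not part of cn.farms): Jensen-Shannon divergence
# between two samples, estimated from histograms on a shared set of bins.
jsDivergence <- function(a, b, nBins = 50) {
  breaks <- seq(min(a, b), max(a, b), length.out = nBins + 1)
  p <- hist(a, breaks = breaks, plot = FALSE)$counts
  q <- hist(b, breaks = breaks, plot = FALSE)$counts
  p <- p / sum(p)
  q <- q / sum(q)
  m <- (p + q) / 2
  # Kullback-Leibler divergence in bits; empty bins contribute zero.
  kl <- function(u, v) sum(ifelse(u > 0, u * log2(u / v), 0))
  (kl(p, m) + kl(q, m)) / 2
}

# Simulated intensities for three arrays; the third one is shifted.
set.seed(1)
x <- cbind(s1 = rnorm(2000, 11), s2 = rnorm(2000, 11), s3 = rnorm(2000, 12))

# Symmetric matrix of pairwise divergences between the columns.
n <- ncol(x)
d <- matrix(0, n, n, dimnames = list(colnames(x), colnames(x)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i < j) d[i, j] <- d[j, i] <- jsDivergence(x[, i], x[, j])
  }
}

# Hierarchical clustering on the distances, as a dendrogram would show.
hc <- hclust(as.dist(d))
```

Arrays whose intensity distributions differ (here s3) end up further away from the others, which is the kind of structure a dendrogram of the distances makes visible.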
This function also works well with ff matrices.
dnaCopySf(x, chrom, maploc, cores = 1, smoothing, ...)
x |
A matrix with data of the copy number experiments |
chrom |
The chromosomes (or other group identifier) from which the markers came |
maploc |
The locations of marker on the genome |
cores |
Number of cores to use |
smoothing |
States if smoothing of the data should be done |
... |
Further parameters passed to the segment function of DNAcopy |
An instance of ExpressionSet containing the segments.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/mlData.RData", package = "cn.farms"))
mlData <- mlData[, 1:3]
colnames(assayData(mlData)$L_z) <- sampleNames(mlData)
segments <- dnaCopySf(
    x = assayData(mlData)$L_z,
    chrom = fData(mlData)$chrom,
    maploc = fData(mlData)$start,
    cores = 1,
    smoothing = FALSE)
fData(segments)
Works for all kinds of Affymetrix SNP arrays
doCnFarmsSingle(celfiles, samplenames, normalization)
celfiles |
The CEL files to process, given with their full paths. Either a vector, or a matrix with two columns for a combined analysis (e.g. the 500K array). |
samplenames |
An optional vector with one entry per CEL file |
normalization |
The normalization method you want to use. |
The finished cn.FARMS results.
Andreas Mitterecker
Does a fragment length correction on intensities
flcSnp6Std(y, fragmentLengths, targetFcn = NULL, subsetToFit = NULL, runtype = "ff", cores = 1, saveFile = "flc", ...)
y |
y |
fragmentLengths |
fragmentLengths |
targetFcn |
targetFcn |
subsetToFit |
subsetToFit |
runtype |
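Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
cores |
Number of cores used for parallelization. |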
saveFile |
Name of the file to save. |
... |
... |
data frame
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
Does a fragment length correction on intensities
flcStd(y, fragmentLengths, targetFcn = NULL, subsetToFit = NULL, runtype = "ff", cores = 1, saveFile = "flc", ...)
y |
y |
fragmentLengths |
fragmentLengths |
targetFcn |
targetFcn |
subsetToFit |
subsetToFit |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
cores |
Number of cores used for parallelization. |
saveFile |
Name of the file to save. |
... |
... |
data frame
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
Does a fragment length correction
fragLengCorr(object, runtype = "ff", saveFile = "slDataFlc", ...)
object |
An instance of
|
runtype |
Mode in which the results are saved. Possible values are ff or bm. |
... |
Further parameters passed to the correction method. |
saveFile |
Name of the file to save. |
An instance of ExpressionSet.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/slData.RData", package = "cn.farms"))
slDataFlc <- fragLengCorr(slData)
Finds SNPs which belong to one fragment
getFragmentSet(fragLength)
fragLength |
fragLength |
windows for fragments
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
Combines data for probeset summarization
getSingleProbeSetSize(fsetid)
fsetid |
fsetid |
Indices which are used for probeset summarization.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
Method for computation of the multi-loci summarization
mlSummarization(object, windowMethod, windowParam, summaryMethod, summaryParam, callParam = list(runtype = "ff"), returnValues, saveFile = "mlData")
object |
an instance of |
windowMethod |
Method for combination of neighbouring SNPs. Possible values are Std and Bps. |
windowParam |
Further parameters, such as the window size. |
summaryMethod |
allowed versions for the summarization step are: Gaussian, Variational, Exact. Default is Variational. |
summaryParam |
The parameters for the summaryMethod. Further information can be obtained from the documentation of the corresponding summarization functions. |
callParam |
Additional parameters for runtype (ff or bm) as well as cores for parallelization. |
returnValues |
List with return values. For possible values see summaryMethod. |
saveFile |
Name of the file to save. |
Multi-loci summarized data as an instance of ExpressionSet.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/slData.RData", package = "cn.farms"))
windowMethod <- "std"
windowParam <- list()
windowParam$windowSize <- 5
windowParam$overlap <- TRUE
summaryMethod <- "Variational"
summaryParam <- list()
summaryParam$cyc <- c(20)
mlData <- mlSummarization(slData, windowMethod, windowParam, summaryMethod, summaryParam)
assayData(mlData)
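To make the windowing idea concrete, the base R sketch below groups neighbouring markers into windows and summarizes each window. It is an illustration under assumptions: makeWindows is a hypothetical helper, overlap = TRUE is assumed to mean that the window slides by one marker, and the median stands in for the FARMS summarization. The package's own window construction may differ.

```r
# Hypothetical helper (not part of cn.farms): index sets of neighbouring
# markers. With overlap = TRUE the window slides by one marker, otherwise
# windows are disjoint and trailing markers that do not fill a window are
# dropped.
makeWindows <- function(nMarkers, windowSize = 5, overlap = TRUE) {
  stopifnot(nMarkers >= windowSize, windowSize >= 1)
  step <- if (overlap) 1 else windowSize
  starts <- seq(1, nMarkers - windowSize + 1, by = step)
  lapply(starts, function(s) s:(s + windowSize - 1))
}

# Toy single-locus values for 12 consecutive markers of one sample.
set.seed(1)
z <- rnorm(12)

# Overlapping windows of five markers each.
w <- makeWindows(length(z), windowSize = 5, overlap = TRUE)

# Stand-in summarization: one value per window (here the median).
summ <- vapply(w, function(idx) median(z[idx]), numeric(1))
```

Overlapping windows keep roughly one summarized value per marker, while disjoint windows reduce the resolution by the window size.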
Extracts info from the package name
normAdd(pkgname)
pkgname |
The package name according to the bioconductor annotation names. |
Additional info for save files.
Andreas Mitterecker
Scales the range of the non-polymorphic data to the range of a given array.
normalizeAverage(x, baselineArray, avg = median, targetAvg = 2200, ...)
x |
Data matrix |
baselineArray |
Choose the baseline channel array. |
avg |
The function for averaging. |
targetAvg |
Value to which the array should be averaged. |
... |
Further optional parameters. |
Normalized non-polymorphic data.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
x <- matrix(rnorm(100, 11), 20, 5)
normalizeAverage(x, x[, 1])
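The following base R sketch shows the general idea of average-based scaling: every array (column) is multiplied by a factor so that its average matches a target value. The function scaleToAverage is hypothetical and ignores the baselineArray argument; how normalizeAverage actually uses the baseline array is not stated here, so deriving the target from a baseline column is only an assumption.

```r
# Hypothetical illustration (not the cn.farms implementation): scale each
# column so that its average, by default the median, equals targetAvg.
scaleToAverage <- function(x, avg = median, targetAvg = 2200, ...) {
  # One scale factor per column: target average / observed average.
  scaleFactor <- targetAvg / apply(x, 2, avg, ...)
  sweep(x, 2, scaleFactor, `*`)
}

# Simulated positive intensities for five arrays.
set.seed(1)
x <- matrix(rexp(100, rate = 1 / 1500), 20, 5)

# Scale to the fixed default target.
y <- scaleToAverage(x)

# Alternatively (an assumption), take the target from a baseline array.
yBase <- scaleToAverage(x, targetAvg = median(x[, 1]))
```

After scaling, all columns share the same median, so the arrays become comparable on a common intensity scale.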
This function provides different normalization methods for microarray data. At the moment only SOR and quantile normalization are implemented.
normalizeCels(filenames, method = c("SOR", "quantiles", "none"), cores = 1, alleles = FALSE, runtype = "bm", annotDir = NULL, saveFile = "normData", ...)
filenames |
The absolute path of the CEL files as a list. |
method |
The normalization method. Possible methods so far: SOR, quantiles |
cores |
Number of cores used for parallelization. |
alleles |
States if information for allele A and B should be given back. |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
annotDir |
An optional annotation directory. |
saveFile |
Name of the file to save. |
... |
Further parameters for the normalization method. |
An ExpressionSet object with the normalized data.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
## Not run:
library("hapmapsnp6")
celDir <- system.file("celFiles", package = "hapmapsnp6")
filenames <- dir(path = celDir, full.names = TRUE)
createAnnotation(filenames = filenames)
normData <- normalizeCels(filenames, method = "SOR")
## End(Not run)
Runs the SOR normalization on microarray data
normalizeNone(filenames, cores = 1, annotDir = NULL, alleles = FALSE, runtype = "ff", cyc = 5, pkgname = NULL, saveFile = "Sor")
filenames |
an absolute path of the CEL files |
cores |
Number of cores used for parallelization. |
annotDir |
annotDir |
alleles |
alleles |
cyc |
states the number of cycles for the EM algorithm. |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
pkgname |
Optional parameter for the CEL mapping. |
saveFile |
Name of the file to save. |
An instance of ExpressionSet
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
Normalization for non-polymorphic data for Affymetrix SNP5 and SNP6
normalizeNpData(filenames, cores = 1, annotDir = NULL, runtype = "ff", saveFile = "npData", method = c("baseline", "quantiles", "none"))
filenames |
the absolute path of the CEL files as a list |
cores |
Number of cores used for parallelization. |
annotDir |
Optional annotation directory. |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
saveFile |
Name of the file to save. |
method |
The method for the normalization. |
An instance of ExpressionSet containing the non-polymorphic data of the microarray.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
## Not run:
library("hapmapsnp6")
celDir <- system.file("celFiles", package = "hapmapsnp6")
filenames <- dir(path = celDir, full.names = TRUE)
createAnnotation(filenames = filenames)
npData <- normalizeNpData(filenames)
## End(Not run)
Quantile normalization
normalizeQuantiles(filenames, cores = 1, batch = NULL, annotDir = NULL, runtype = "ff", pkgname = NULL, saveFile = "normDataQuant")
filenames |
filenames |
cores |
Number of cores used for parallelization. |
batch |
batch |
annotDir |
annotDir |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
pkgname |
Optional parameter for the CEL mapping. |
saveFile |
Name of the file to save. |
The normalized data.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
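As background, the base R sketch below shows the textbook form of quantile normalization on a small in-memory matrix. The function quantileNormalize is hypothetical; it does not read CEL files and is not the implementation behind normalizeQuantiles, which may differ in its details (for example in the handling of ties or batches).

```r
# Hypothetical illustration of textbook quantile normalization: after the
# transformation every column (array) has the same empirical distribution.
quantileNormalize <- function(x) {
  # Rank of every value within its column; ties broken by order of appearance.
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted columns, row by row.
  reference <- rowMeans(apply(x, 2, sort))
  # Replace each value by the reference value of the same rank.
  normalized <- apply(ranks, 2, function(r) reference[r])
  dimnames(normalized) <- dimnames(x)
  normalized
}

# Simulated intensities for five arrays with 20 probes each.
set.seed(1)
x <- matrix(rnorm(100, 11), 20, 5)
xn <- quantileNormalize(x)
```

Within each column the ordering of the probes is preserved; only the values are mapped onto the shared reference distribution.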
Correction for probe sequence effects
normalizeSequenceEffect(object, annotDir = NULL, runtype = "ff", saveFile = "seqNorm")
object |
an instance of
|
annotDir |
The directory where the annotation can be found. |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically. |
saveFile |
name of the file to save. |
Some data
Andreas Mitterecker
Runs the SOR normalization on microarray data
normalizeSor(filenames, cores = 1, annotDir = NULL, alleles = FALSE, runtype = "ff", cyc = 5, pkgname = NULL, saveFile = "Sor")
filenames |
an absolute path of the CEL files |
cores |
Number of cores used for parallelization. |
annotDir |
annotDir |
alleles |
alleles |
cyc |
states the number of cycles for the EM algorithm. |
runtype |
Mode in which the results are saved. Possible values are ff or bm. With ff the data are not saved automatically; with bm the results are saved permanently. |
pkgname |
Optional parameter for the CEL mapping. |
saveFile |
Name of the file to save. |
An instance of ExpressionSet
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
Plots a dendrogram
plotDendrogram(DivMetric, colorLabels)
DivMetric |
The input data (see example). |
colorLabels |
A color label with the dimension of the columns. |
A dendrogram.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/normData.RData", package = "cn.farms"))
x <- assayData(normData)$intensity[, 1:3]
y <- distributionDistance(x)
attr(y, "Labels") <- substr(sampleNames(normData), 1, 7)
plotDendrogram(y)
Simple density plot. Adapted from the aroma.affymetrix package (www.aroma-project.org)
plotDensity(x, xlim = c(0, 16), ylim, col, lty, lwd, add = FALSE, xlab, ylab, log = TRUE, ...)
x |
Matrix with numeric values. |
xlim |
The limits for the x axis. |
ylim |
The limits for the y axis. |
col |
Vector with colors corresponding to the columns of the matrix. |
lty |
The line type (see |
lwd |
The line width, a positive number, defaulting to 1
(see |
add |
If FALSE (the default) then a new plot is produced. If TRUE, density lines are added to the open graphics device. |
xlab |
The labeling of the x axis. |
ylab |
The labeling of the y axis. |
log |
Logical value which states whether the log2 of the data should be taken. |
... |
Further arguments of the plot function ' |
A plot written to the graphics device.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/slData.RData", package = "cn.farms"))
plotDensity(assayData(slData)$intensity)
Creates a plot with known regions and a numeric vector
plotEvalIc(object, segments, chrom, variable, ylim, ylab = "CN indicator", stripCol = "lightgray", regionCol = rgb(130, 0, 139, maxColorValue = 255), pointSize = 0.75, pointType = 4, bandwidth = c(0.01, 1000), nbin = 100)
object |
an instance of |
segments |
A data.frame with known regions. |
chrom |
the chromosome. |
variable |
The numeric vector which should be plotted. |
ylim |
the limits of the y axis. |
ylab |
the ylab from function par. |
stripCol |
color of points. |
regionCol |
color of regions. |
pointSize |
size of the points. |
pointType |
type of the points. |
bandwidth |
for the color of the plot. |
nbin |
number of bins for the coloring. |
Some data
Andreas Mitterecker
load(system.file("exampleData/slData.RData", package = "cn.farms"))
load(system.file("exampleData/testSegments.RData", package = "cn.farms"))
plotEvalIc(slData, fData(testSegments), variable = assayData(slData)$L_z[, 1], 23)
A pdf in the working directory is produced.
plotRegions(object, segments, addInd = NULL, ylim, variable, colorVersion = 0, plotLegend = TRUE, pdfname)
object |
An instance of |
segments |
An instance of |
addInd |
States how many indices should be plotted besides the region |
ylim |
The limits for the y axis. |
variable |
States which variable of the assayData should be plotted. |
colorVersion |
States different color versions. |
plotLegend |
If a legend should be plotted or not. |
pdfname |
The name of the pdf file. |
A graph. Normally a pdf in the current working directory.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/slData.RData", package = "cn.farms"))
load(system.file("exampleData/testSegments.RData", package = "cn.farms"))
plotRegions(slData, testSegments, addInd = 10, ylim = c(-2, 2), variable = "L_z", colorVersion = 1, plotLegend = TRUE, pdfname = "slData.pdf")
Creates a smooth scatter plot
plotSmoothScatter(object, variable, chrom, start, end, ylim, pdfname, ...)
object |
An instance of |
variable |
States which variable of the assayData should be plotted. |
chrom |
The chromosome you want to plot. |
start |
The physical start position. |
end |
The physical end position. |
ylim |
The limits for the y axis. |
pdfname |
The name of the pdf file. |
... |
Further arguments passed to smoothScatter function. |
A graph.
Andreas Mitterecker
load(system.file("exampleData/slData.RData", package = "cn.farms"))
plotSmoothScatter(slData[, 1:3], chrom = "23")
This function creates a violin plot of intensity values
plotViolines(object, variable = "intensity", groups, ...)
object |
An instance of
|
variable |
states which variable of assayData should be plotted. |
groups |
Vector with the dimension of the samples for coloring. |
... |
Further arguments passed to the lattice graph. |
Creates a violin plot.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/normData.RData", package = "cn.farms"))
normData <- normData[, 1:10]
groups <- seq(sampleNames(normData))
plotViolines(normData, variable = "intensity", groups, xlab = "Intensity values")
The different probes of the SNPs of the array are summarized to a probeset.
slSummarization(object, summaryMethod = "Exact", summaryParam = list(), callParam = list(runtype = "ff", cores = 1), summaryWindow = c("std", "fragment"), returnValues, saveFile = "slData")
object |
An instance of
|
summaryMethod |
Allowed versions for the summarization step are: Gaussian, Variational, Exact. Default is Exact (see the usage above). |
summaryParam |
The parameters for the summaryMethod. Further information can be obtained from the documentation of the corresponding summarization functions. |
callParam |
Additional parameters for runtype (ff or bm) as well as cores for parallelization. |
summaryWindow |
Method for combination of the SNPs. Possible values are std and fragment. |
returnValues |
List with return values. |
saveFile |
Name of the file to save. |
Single-locus summarized data as an instance of ExpressionSet.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
load(system.file("exampleData/normData.RData", package = "cn.farms"))
notes(experimentData(normData))$annotDir <- system.file("exampleData/annotation/pd.genomewidesnp.6/1.1.0", package = "cn.farms")

summaryMethod <- "Variational"
summaryParam <- list()
summaryParam$cyc <- c(10)
slData <- slSummarization(normData, summaryMethod = summaryMethod, summaryParam = summaryParam)
assayData(slData)$L_z[1:10, 1:10]

summaryMethod <- "Gaussian"
summaryParam <- list()
summaryParam$cyc <- c(10)
slData <- slSummarization(normData, summaryMethod = summaryMethod, summaryParam = summaryParam)
assayData(slData)$L_z[1:10, 1:10]

summaryMethod <- "Exact"
summaryParam <- list()
summaryParam$cyc <- c(10, 20)
slData <- slSummarization(normData, summaryMethod = summaryMethod, summaryParam = summaryParam)
assayData(slData)$L_z[1:10, 1:10]
Normalizes the data with SOR
sparseFarmsC(probes, cyc = 5)
probes |
The intensity matrix. |
cyc |
Number of cycles. |
Normalized Data.
Djork-Arne Clevert [email protected] and Andreas Mitterecker [email protected]
x <- matrix(rnorm(100, 11), 20, 5)
sparseFarmsC(x, 50)
This function implements an exact Laplace FARMS algorithm.
summarizeFarmsExact(probes, mu = 1, weight = 0.001, weightSignal = 1, weightZ = 1, weightProbes = TRUE, cyc = c(10, 10), tol = 1e-05, weightType = "mean", centering = "median", rescale = FALSE, backscaleComputation = FALSE, maxIntensity = TRUE, refIdx, ...)
probes |
A matrix with numeric values. |
mu |
Hyperparameter value which quantifies different aspects of potential prior knowledge. Values near zero assume that most positions do not contain a signal and introduce a bias towards loading matrix elements near zero. The usage above sets the default to 1; changing it is not recommended. |
weight |
Hyperparameter value which determines the influence of the Gaussian prior of the loadings |
weightSignal |
Hyperparameter value on the signal. |
weightZ |
Hyperparameter value which determines how strong the Laplace prior of the factor should be at 0. Users should be aware that changing weightZ from its default might also require changing other parameters. Inexperienced users should not change weightZ. |
weightProbes |
Parameter (TRUE/FALSE) that determines whether the number of probes should additionally be considered in weight. If TRUE, weight will be modified. |
cyc |
Number of cycles. If the length is two, the values are taken as the minimum and the maximum number of cycles. If the length is one, the value is interpreted as the exact number of cycles to be executed (minimum == maximum). |
tol |
States the termination tolerance if cyc[1]!=cyc[2]. Default is 0.00001. |
weightType |
Flag, that is used to summarize the probes of a sample. |
centering |
States how the data should be centered ("mean", "median"). Default is median. |
rescale |
Parameter (TRUE/FALSE), that determines, if moments in exact Laplace FARMS are rescaled in each iteration. Default is FALSE. |
backscaleComputation |
Parameter (TRUE/FALSE), that determines if the moments of hidden variables should be reestimated after rescaling the parameters. |
maxIntensity |
Parameter (TRUE/FALSE), that determines if the expectation value (=FALSE) or the maximum value (=TRUE) of p(z|x_i) should be used for an estimation of the hidden varaible. |
refIdx |
index or indices which are used for computation of the centering |
... |
Further parameters for expert users. |
A list including: the found parameters: lambda0, lambda1, Psi
the estimated factors: z (expectation), maxZ (maximum)
p: log-likelihood of the data given the found lambda0, lambda1, Psi (not the posterior likelihood that is optimized)
varzx: variances of the hidden variables given the data
KL: Kullback-Leibler divergences between the posterior and prior distributions of the hidden variables
IC: Information Content considering the hidden variables and data
ICtransform: transformed Information Content
Case: Case for computation of a sample point (non-exception, special exception)
L1median: Median of the lambda vector components
intensity: back-computed summarized probeset values with mean correction
L_z: back-computed summarized probeset values without mean correction
rawCN: transformed values of L_z
SNR: some additional signal to noise ratio value
Andreas Mayr, Djork-Arne Clevert and Andreas Mitterecker
x <- matrix(rnorm(100, 11), 20, 5)
summarizeFarmsExact(x)
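The interplay of cyc and tol described in the arguments can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: resolveCycles and runIterations are hypothetical helpers, and the Babylonian square-root update merely stands in for one update step of an iterative fit.

```r
# Hypothetical helper (not part of cn.farms): expand cyc into a min/max pair.
resolveCycles <- function(cyc) {
  stopifnot(length(cyc) %in% c(1L, 2L))
  if (length(cyc) == 1L) cyc <- c(cyc, cyc)  # exact number of cycles
  c(min = cyc[1], max = cyc[2])
}

# Hypothetical helper: iterate `update` until the change drops below `tol`,
# but never before the minimum and never beyond the maximum number of cycles.
runIterations <- function(update, start, cyc = c(10, 10), tol = 1e-05) {
  lim <- resolveCycles(cyc)
  value <- start
  for (i in seq_len(lim[["max"]])) {
    newValue <- update(value)
    change <- abs(newValue - value)
    value <- newValue
    if (i >= lim[["min"]] && change < tol) break
  }
  list(value = value, cycles = i)
}

# Toy update (assumption): Babylonian iteration converging to sqrt(2).
sqrtUpdate <- function(v) (v + 2 / v) / 2
early <- runIterations(sqrtUpdate, start = 1, cyc = c(2, 50))  # may stop early
fixed <- runIterations(sqrtUpdate, start = 1, cyc = 10)        # always 10 cycles
```

With a length-two cyc the loop can end as soon as the change falls below tol. With a single value the tolerance has no effect, which mirrors the note that tol only applies if cyc[1] != cyc[2].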
This function implements an exact Laplace FARMS algorithm.
summarizeFarmsExact2(probes, mu = 1, weight = 0.5, weightSignal = 1, weightZ = 1, weightProbes = TRUE, cyc = c(10, 10), tol = 1e-05, weightType = "mean", centering = "median", rescale = FALSE, backscaleComputation = FALSE, maxIntensity = TRUE, refIdx, ...)
probes |
A matrix with numeric values. |
mu |
Hyperparameter that quantifies different aspects of potential prior knowledge. Values near zero assume that most positions do not contain a signal and introduce a bias towards zero for the loading matrix elements. The default is 1 and changing it is not recommended. |
weight |
Hyperparameter that determines the influence of the Gaussian prior of the loadings. |
weightSignal |
Hyperparameter value on the signal. |
weightZ |
Hyperparameter that determines how strong the Laplace prior of the factor should be at 0. Changing weightZ from its default may also require changing other parameters, so inexperienced users should leave it unchanged. |
weightProbes |
Logical (TRUE/FALSE) that determines whether the number of probes should additionally be considered in weight. If TRUE, weight is modified. |
cyc |
Number of cycles. A vector of length two is interpreted as the minimum and maximum number of cycles. A single value is interpreted as the exact number of cycles to execute (minimum == maximum). |
tol |
Termination tolerance, used if cyc[1] != cyc[2]. Default is 0.00001. |
weightType |
Flag that determines how the probes of a sample are summarized. |
centering |
States how the data should be centered ("mean", "median"). Default is median. |
rescale |
Logical (TRUE/FALSE) that determines whether the moments in exact Laplace FARMS are rescaled in each iteration. Default is FALSE. |
backscaleComputation |
Logical (TRUE/FALSE) that determines whether the moments of the hidden variables should be re-estimated after rescaling the parameters. |
maxIntensity |
Logical (TRUE/FALSE) that determines whether the expectation value (FALSE) or the maximum value (TRUE) of p(z|x_i) is used to estimate the hidden variable. |
refIdx |
Index or indices used to compute the centering. |
... |
Further parameters for expert users. |
A list including: the found parameters: lambda0, lambda1, Psi
the estimated factors: z (expectation), maxZ (maximum)
p: log-likelihood of the data given the found lambda0, lambda1, Psi (not the posterior likelihood that is optimized)
varzx: variances of the hidden variables given the data
KL: Kullback-Leibler divergences between the posterior and prior distributions of the hidden variables
IC: Information Content considering the hidden variables and data
ICtransform: transformed Information Content
Case: Case for computation of a sample point (non-exception, special exception)
L1median: Median of the lambda vector components
intensity: back-computed summarized probeset values with mean correction
L_z: back-computed summarized probeset values without mean correction
rawCN: transformed values of L_z
SNR: some additional signal to noise ratio value
Andreas Mayr, Djork-Arne Clevert and Andreas Mitterecker
x <- matrix(rnorm(100, 11), 20, 5)
summarizeFarmsExact(x)
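How centering and refIdx could work together is sketched below in base R. The helper centerProbes is hypothetical and not part of cn.farms. The sketch assumes, purely for illustration, that rows are probes, columns are samples, and refIdx selects the reference samples from which the per-probe centre is computed.

```r
# Hypothetical helper (not part of cn.farms): centre each probe (row) on the
# mean or median of the reference samples (columns) selected by refIdx.
centerProbes <- function(probes, centering = c("median", "mean"),
                         refIdx = seq_len(ncol(probes))) {
  centering <- match.arg(centering)
  ref <- probes[, refIdx, drop = FALSE]
  centre <- if (centering == "median") apply(ref, 1, median) else rowMeans(ref)
  sweep(probes, 1, centre)  # subtract the per-probe centre from every sample
}

set.seed(1)
x <- matrix(rnorm(100, 11), 20, 5)
xc <- centerProbes(x, centering = "median", refIdx = 1:3)
```

After this step the reference samples have a per-probe median of zero, while the remaining samples are expressed relative to that reference.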
This function implements an exact Laplace FARMS algorithm.
summarizeFarmsExact3(probes, mu = 1, weight = 100, weightSignal = 1, weightZ = 30, weightProbes = TRUE, updateSignal = FALSE, cyc = c(10, 10), tol = 1e-05, weightType = "mean", centering = "median", rescale = FALSE, backscaleComputation = FALSE, maxIntensity = TRUE, refIdx, ...)
probes |
A matrix with numeric values. |
mu |
Hyperparameter that quantifies different aspects of potential prior knowledge. Values near zero assume that most positions do not contain a signal and introduce a bias towards zero for the loading matrix elements. The default is 1 and changing it is not recommended. |
weight |
Hyperparameter that determines the influence of the Gaussian prior of the loadings. |
weightSignal |
Hyperparameter value on the signal. |
weightZ |
Hyperparameter that determines how strong the Laplace prior of the factor should be at 0. Changing weightZ from its default may also require changing other parameters, so inexperienced users should leave it unchanged. |
weightProbes |
Logical (TRUE/FALSE) that determines whether the number of probes should additionally be considered in weight. If TRUE, weight is modified. |
updateSignal |
updateSignal. |
cyc |
Number of cycles. A vector of length two is interpreted as the minimum and maximum number of cycles. A single value is interpreted as the exact number of cycles to execute (minimum == maximum). |
tol |
Termination tolerance, used if cyc[1] != cyc[2]. Default is 0.00001. |
weightType |
Flag that determines how the probes of a sample are summarized. |
centering |
States how the data should be centered ("mean", "median"). Default is median. |
rescale |
Logical (TRUE/FALSE) that determines whether the moments in exact Laplace FARMS are rescaled in each iteration. Default is FALSE. |
backscaleComputation |
Logical (TRUE/FALSE) that determines whether the moments of the hidden variables should be re-estimated after rescaling the parameters. |
maxIntensity |
Logical (TRUE/FALSE) that determines whether the expectation value (FALSE) or the maximum value (TRUE) of p(z|x_i) is used to estimate the hidden variable. |
refIdx |
Index or indices used to compute the centering. |
... |
Further parameters for expert users. |
A list including: the found parameters: lambda0, lambda1, Psi
the estimated factors: z (expectation), maxZ (maximum)
p: log-likelihood of the data given the found lambda0, lambda1, Psi (not the posterior likelihood that is optimized)
varzx: variances of the hidden variables given the data
KL: Kullback-Leibler divergences between the posterior and prior distributions of the hidden variables
IC: Information Content considering the hidden variables and data
ICtransform: transformed Information Content
Case: Case for computation of a sample point (non-exception, special exception)
L1median: Median of the lambda vector components
intensity: back-computed summarized probeset values with mean correction
L_z: back-computed summarized probeset values without mean correction
rawCN: transformed values of L_z
SNR: some additional signal to noise ratio value
Andreas Mayr, Djork-Arne Clevert and Andreas Mitterecker
x <- matrix(rnorm(100, 11), 20, 5)
summarizeFarmsExact(x)
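The difference between the two estimates selected by maxIntensity can be illustrated on a toy posterior. The model below is an assumption made only for this sketch (a unit-scale Laplace prior on z and a unit-variance Gaussian likelihood for a single observation) and is not the package's parameterization. It evaluates p(z|x_i) on a grid and compares its expectation with its maximum.

```r
# Toy posterior (assumption): unit-scale Laplace prior on z, unit-variance
# Gaussian likelihood for a single observation x_i = 2.
xi <- 2
z <- seq(-10, 10, by = 0.001)
logPost <- -abs(z) - (xi - z)^2 / 2        # unnormalized log p(z | x_i)
post <- exp(logPost - max(logPost))
post <- post / sum(post)                   # normalize on the grid

zExpectation <- sum(z * post)              # analogue of maxIntensity = FALSE
zMaximum <- z[which.max(post)]             # analogue of maxIntensity = TRUE
```

In this toy setting the maximum lies at about 1, pulled from the observation towards zero by the Laplace prior, while the expectation is somewhat larger. The two estimates therefore need not coincide.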
This function runs the FARMS algorithm.
summarizeFarmsGaussian(probes, weight = 0.15, mu = 0, cyc = 10, tol = 1e-04, weightType = "mean", init = 0.6, correction = 0, minNoise = 0.35, centering = "median", refIdx)
probes |
A matrix with numeric values. |
weight |
Hyperparameter value in the range of [0,1] which determines the influence of the prior. |
mu |
Hyperparameter that quantifies different aspects of potential prior knowledge. Values near zero assume that most genes do not contain a signal and introduce a bias towards zero for the loading matrix elements. Default value is 0. |
cyc |
Number of cycles for the EM algorithm. |
tol |
Termination tolerance. Default is 0.0001. |
weightType |
Flag that determines how the loading matrix is summarized. Default is mean. |
init |
Parameter for estimation. |
correction |
Indicates whether the covariance matrix should be corrected for negative eigenvalues, which might emerge from the non-negative correlation constraints. 0 (default): no correction is done. 1: minimal noise (0.0001) is added to the diagonal elements of the covariance matrix to force positive definiteness. 2: maximum likelihood solution to compute the nearest positive definite matrix under the given non-negative correlation constraints of the covariance matrix. |
minNoise |
States the minimal noise. Default is 0.35. |
centering |
States how the data is centered. Default is median. |
refIdx |
Index or indices used to compute the centering. |
A list containing the results of the run.
Djork-Arne Clevert and Andreas Mitterecker
x <- matrix(rnorm(100, 11), 20, 5)
summarizeFarmsGaussian(x)
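The idea behind correction = 1 (adding minimal noise to the diagonal to force positive definiteness) can be sketched in base R. The 2 x 2 matrix below is a made-up example, and the code illustrates the general technique rather than the package's internal routine.

```r
# Toy symmetric matrix (assumption) with one slightly negative eigenvalue.
covMat <- matrix(c(1, 1 + 1e-05,
                   1 + 1e-05, 1), nrow = 2)
minBefore <- min(eigen(covMat, symmetric = TRUE)$values)

# Analogue of correction = 1: add minimal noise (0.0001) to the diagonal.
covFixed <- covMat + diag(1e-04, nrow(covMat))
minAfter <- min(eigen(covFixed, symmetric = TRUE)$values)
```

Adding a constant to the diagonal shifts every eigenvalue by that constant, so this only repairs matrices whose most negative eigenvalue is smaller in magnitude than the added noise.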
Possible FARMS summarization
summarizeFarmsMethods()
Returns a data frame with all possible FARMS calls.
Djork-Arne Clevert and Andreas Mitterecker
summarizeFarmsMethods()
Mean or median instead of the FARMS model
summarizeFarmsStatistics(probes, type = "median", ...)
probes |
A matrix with numeric values. |
type |
The statistic which you want to apply. |
... |
Further parameters |
Some data
Andreas Mitterecker
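As a rough illustration of summarizing with a plain statistic instead of the FARMS model, the base R sketch below reduces a probes matrix to one value per sample. The helper summarizeByStatistic is hypothetical, and the orientation (rows are probes, columns are samples) is an assumption. The value actually returned by summarizeFarmsStatistics may be structured differently.

```r
# Hypothetical helper (not part of cn.farms): summarize each sample (column)
# by the mean or median over its probes (rows).
summarizeByStatistic <- function(probes, type = c("median", "mean")) {
  type <- match.arg(type)
  if (type == "median") apply(probes, 2, median) else colMeans(probes)
}

set.seed(1)
x <- matrix(rnorm(100, 11), 20, 5)
medians <- summarizeByStatistic(x, type = "median")
means <- summarizeByStatistic(x, type = "mean")
```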
This function runs the FARMS algorithm.
summarizeFarmsVariational(probes, weight = 0.15, mu = 0, cyc = 10, weightType = "median", init = 0.6, correction = 0, minNoise = 0.35, spuriousCorrelation = 0.3, centering = "median")
probes |
A matrix with numeric values. |
weight |
Hyperparameter value in the range of [0,1] which determines the influence of the prior. |
mu |
Hyperparameter that quantifies different aspects of potential prior knowledge. Values near zero assume that most genes do not contain a signal and introduce a bias towards zero for the loading matrix elements. Default value is 0. |
cyc |
Number of cycles for the EM algorithm. |
weightType |
Flag that determines how the loading matrix is summarized. Default is median. |
init |
Parameter for estimation. |
correction |
Indicates whether the covariance matrix should be corrected for negative eigenvalues, which might emerge from the non-negative correlation constraints. 0 (default): no correction is done. 1: minimal noise (0.0001) is added to the diagonal elements of the covariance matrix to force positive definiteness. 2: maximum likelihood solution to compute the nearest positive definite matrix under the given non-negative correlation constraints of the covariance matrix. |
spuriousCorrelation |
Numeric value for suppression of spurious correlation. |
minNoise |
States the minimal noise. Default is 0.35. |
centering |
States how the data is centered. Default is median. |
A list containing the results of the run.
Djork-Arne Clevert and Andreas Mitterecker
x <- matrix(rnorm(100, 11), 20, 5)
summarizeFarmsVariational(x)
Combines neighbouring locations into windows
summarizeWindowBps(phInf, fixedBps = 10000, upperLimit = 6)
phInf |
The locations on the chromosomes. |
fixedBps |
Size of the window in basepairs. |
upperLimit |
Maximal number of neighbouring locations to combine. |
Indices for summarization
Djork-Arne Clevert and Andreas Mitterecker
## create toy physical data
sizeTmp <- 30
phInf <- data.frame(
  chrom = rep("15", sizeTmp),
  start = seq(from = 1, by = 300, length.out = sizeTmp),
  end = seq(from = 3600, by = 300, length.out = sizeTmp),
  man_fsetid = paste("SNP_A-", seq(sizeTmp) + 1000, sep = ""))
summarizeWindowBps(phInf)
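One way to read fixedBps and upperLimit is sketched below in base R. The helper windowByBps is hypothetical and only illustrates a possible grouping rule: for each location, take the locations starting within fixedBps base pairs and keep at most upperLimit of them. It assumes ascending start positions on a single chromosome and is not the algorithm used by summarizeWindowBps.

```r
# Hypothetical helper (not part of cn.farms): for every location, collect the
# locations that start within fixedBps base pairs, capped at upperLimit.
windowByBps <- function(start, fixedBps = 10000, upperLimit = 6) {
  lapply(seq_along(start), function(i) {
    idx <- which(start >= start[i] & start < start[i] + fixedBps)
    head(idx, upperLimit)
  })
}

start <- seq(from = 1, by = 300, length.out = 30)
windows <- windowByBps(start)
windows[[1]]   # indices combined with the first location
```

With locations 300 bp apart, far more than six fall inside a 10000 bp window, so upperLimit is what bounds the window size here.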
Function to list how neighbouring positions can be combined.
summarizeWindowMethods()
Returns a data frame with all possible methods.
Djork-Arne Clevert and Andreas Mitterecker
summarizeWindowMethods()
Combines neighbouring locations into windows
summarizeWindowStd(phInf, windowSize = 3, overlap = TRUE)
phInf |
The locations on the chromosomes. |
windowSize |
Number of locations to combine into one window. |
overlap |
States if the windows should overlap. |
Indices for summarization
Djork-Arne Clevert and Andreas Mitterecker
## create toy physical data
sizeTmp <- 30
phInf <- data.frame(
  chrom = rep("15", sizeTmp),
  start = seq(from = 1, by = 300, length.out = sizeTmp),
  end = seq(from = 3600, by = 300, length.out = sizeTmp),
  man_fsetid = paste("SNP_A-", seq(sizeTmp) + 1000, sep = ""))
summarizeWindowStd(phInf)
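The effect of windowSize and overlap can be pictured with a small base R sketch. The helper windowByCount is hypothetical. It shows one plausible reading (sliding windows with step 1 when overlapping, back-to-back windows otherwise) and is not the implementation of summarizeWindowStd.

```r
# Hypothetical helper (not part of cn.farms): index windows of windowSize
# consecutive locations, either sliding by one (overlap) or back to back.
windowByCount <- function(n, windowSize = 3, overlap = TRUE) {
  step <- if (overlap) 1L else as.integer(windowSize)
  starts <- seq.int(1L, n - windowSize + 1L, by = step)
  lapply(starts, function(s) s:(s + windowSize - 1L))
}

overlapping <- windowByCount(30, windowSize = 3, overlap = TRUE)
disjoint <- windowByCount(30, windowSize = 3, overlap = FALSE)
```

Under this reading, 30 locations give 28 overlapping windows of three locations or 10 disjoint ones.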