Title: | Quantile smoothing and genomic visualization of array data |
---|---|
Description: | Implements quantile smoothing as introduced in: Quantile smoothing of array CGH data; Eilers PH, de Menezes RX; Bioinformatics. 2005 Apr 1;21(7):1146-53. |
Authors: | Jan Oosting, Paul Eilers, Renee Menezes |
Maintainer: | Jan Oosting <[email protected]> |
License: | GPL-2 |
Version: | 1.73.0 |
Built: | 2024-10-31 04:34:43 UTC |
Source: | https://github.com/bioc/quantsmooth |
Dataset used to produce human chromosomal ideograms for plotting purposes.
data(chrom.bands)
A data frame with 4068 observations on the following 12 variables.
chr
a character vector
arm
a character vector
band
a character vector
ISCN.top
a numeric vector
ISCN.bot
a numeric vector
bases.top
a numeric vector
bases.bot
a numeric vector
stain
a character vector
cM.top
a numeric vector
cM.bot
a numeric vector
n.markers
a numeric vector
p.markers
a numeric vector
The original file gives only the physical map positions. The genetic map positions are interpolated from the Rutgers linkage map (Kong et al 2004).
ftp://ftp.ncbi.nlm.nih.gov/genomes/H_sapiens/maps/mapview/BUILD.35.1/ideogram.gz
Kong X, Murphy K, Raj T, He C, White PS, Matise TC. 2004. A Combined Linkage-Physical Map of the Human Genome. American Journal of Human Genetics, 75(6):1143-8.
A collection of arrays that contains data of chromosome 14 of 3 colorectal tumors. The first tumor shows 1 region of loss, the second tumor shows no aberration, while the third tumor shows loss of 1 copy of the chromosome.
Copy number values of 358 probes from Affymetrix 10K genechip. Data was obtained from DChip
corresponding probe positions
Copy number values of 112 probes from a 1 mb spaced BAC array-CGH
corresponding probe positions
Copy number values of 207 probes from Illumina GoldenGate Linkage IV data
corresponding probe positions
data(chr14)
Matrices of copy number values and vectors of chromosomal probe positions
Jan Oosting
This function paints chromosomal icons on an existing plot
drawSimpleChrom(x, y, len = 3, width = 1, fill, col, orientation = c("h", "v"), centromere.size = 0.6)
x |
start x-position |
y |
start y-position |
len |
total length of the chromosome |
width |
width of the chromosome |
fill |
character, {"a","p","q","q[1-3]","p[1-3]" }. Events to a chromosome can be depicted by coloring "a"ll of the chromosome, the complete p or q-arm, or a subsegment of the arms |
col |
color(s) of |
orientation |
either "h"orizontal or "v"ertical |
centromere.size |
The size of the centromere as fraction of the width |
This function is executed for its side effects
Jan Oosting
plot(c(0,4),c(0,3),type="n",xaxt="n",yaxt="n",xlab="",ylab="")
drawSimpleChrom(2,3,fill=c("p","q3"),col=c("red","blue"),orientation="v")
retrieve regions of interest in a vector of intensities using quantile smoothing
getChangedRegions(intensities, positions, normalized.to=1, interval, threshold, minlength=2, ...)
intensities |
numeric vector |
positions |
numeric vector of the same length as intensities. If this argument is not given the results contain the indexes of the |
normalized.to |
numeric, reference value. Changes are compared to this value |
interval |
numeric [0,1], bandwidth around reference. If the smoothed line at the higher quantile drops below the |
threshold |
numeric, if the median smoothed value drops below |
minlength |
integer, not used currently |
... |
extra arguments for |
This function uses quantsmooth to detect regions in the genome that are abnormal.
If interval is set then a smoothed line is calculated for tau = 0.5 - interval/2, and a region is determined as upregulated if this line is above the reference. Downregulation is determined when the smoothed line for tau = 0.5 + interval/2 is below the reference value.
If threshold is set then a smoothed line is calculated for tau = 0.5, and up- or downregulation is determined when this line is outside the range [normalized.to - threshold, normalized.to + threshold].
A data.frame with 3 columns is returned. Each row contains a region with columns up, start and end. start and end indicate the positions in the vector of the first and last position that were up- or downregulated.
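The threshold rule and the shape of the returned table can be illustrated in base R. This is a minimal sketch, not the package's implementation: `find_runs_outside` is a hypothetical helper, it takes an already-smoothed line as input instead of calling quantsmooth, and the numbers are toy values chosen for illustration.

```r
# Hypothetical helper (not part of quantsmooth): report runs where a
# smoothed median line leaves [normalized.to - threshold, normalized.to + threshold].
find_runs_outside <- function(smoothed, normalized.to = 1, threshold = 0.2) {
  # +1 above the band, -1 below it, 0 inside it
  state <- ifelse(smoothed > normalized.to + threshold, 1,
           ifelse(smoothed < normalized.to - threshold, -1, 0))
  runs   <- rle(state)                  # consecutive stretches with the same state
  ends   <- cumsum(runs$lengths)        # last index of each stretch
  starts <- ends - runs$lengths + 1     # first index of each stretch
  keep   <- runs$values != 0            # drop stretches inside the band
  data.frame(up = runs$values[keep] > 0,
             start = starts[keep],
             end = ends[keep])
}

# Toy smoothed copy-number line around a reference of 2 (illustrative values)
smoothed <- c(2.0, 2.1, 2.6, 2.7, 2.5, 2.0, 1.4, 1.3, 2.0, 2.0)
regions <- find_runs_outside(smoothed, normalized.to = 2, threshold = 0.3)
print(regions)

# For the interval variant, the two quantile lines sit symmetrically
# around the median: tau = 0.5 - interval/2 and tau = 0.5 + interval/2.
interval <- 0.5
taus <- c(lower = 0.5 - interval / 2, upper = 0.5 + interval / 2)
print(taus)
```

Here start and end are indexes into the vector, which matches the documented behaviour when no positions are supplied.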
Jan Oosting
data(chr14)
getChangedRegions(ill.cn[,1],ill.pos,normalized.to=2,interval=0.5)
Test a set of smoothing parameters to find best fit to data
getLambdaMin(intensities,lambdas,...)
intensities |
numeric vector |
lambdas |
numeric vector; see |
... |
extra parameters for |
Cross validation is performed using a set of lambda values in order to find the lambda value that shows the best fit to the data.
This function returns the lambda value that has the lowest cross validation value on this dataset
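The selection step described above amounts to a grid search over candidate lambda values. The sketch below shows that idea only; `pick_lambda` and `toy_score` are hypothetical names, and the toy score is an invented function with a known minimum that stands in for a real cross-validation scorer.

```r
# Hypothetical grid search: evaluate a cross-validation score for every
# candidate lambda and return the candidate with the lowest score.
pick_lambda <- function(lambdas, cv_fun) {
  scores <- vapply(lambdas, cv_fun, numeric(1))
  # which.min() skips NA scores, e.g. from candidates whose fit failed
  lambdas[which.min(scores)]
}

# Stand-in score with its minimum at lambda = 2 (illustration only,
# not a real cross-validation value)
toy_score <- function(lambda) (log2(lambda) - 1)^2

lambdas <- 2^seq(from = -2, to = 5, by = 0.25)
best <- pick_lambda(lambdas, toy_score)
print(best)
```

Spacing the candidates on a log2 scale, as in the package example, covers several orders of magnitude with a modest number of fits.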
Jan Oosting
data(chr14)
lambdas<-2^seq(from=-2,to=5,by=0.25)
getLambdaMin(bac.cn[,1],lambdas)
A chromosome is drawn including the cytobands
grid.chromosome(chrom, side = 1, units = "hg19", chrom.width = 0.5, length.out, bands = "major", legend = c("chrom", "band", "none"), cex.leg = 0.7, bleach = 0, ...)
chrom |
numeric or character, id of chromosome to plot |
side |
numeric [1:4], side of rectangle to draw, 4 sides, side 2 and 4 are vertical |
units |
character or data.frame, type of units for genomic data, or a dataframe with UCSC cytoband data, see |
chrom.width |
numeric [0,1], The width relative to the width (sides 2 and 4) or height(sides 1 and 3) of the viewport |
length.out |
numeric, size of native units of viewport |
bands |
character, draw either major or minor bands |
legend |
character, type of legend |
cex.leg |
numeric, relative size of legend text |
bleach |
numeric [0,1], proportion by which to bleach the chromosome |
... |
arguments for viewport(), especially x,y, width, and height |
The chromosome is drawn within a rectangle defined by x, y, width, and height, which is pushed as a viewport. The legend is drawn within the same rectangle in the space left over by chrom.width.
This function is executed for its side effects
David L Duffy, Jan Oosting
lodplot package
grid.newpage()
grid.chromosome(1,units="bases",height=0.15)
Retrieve human chromosomal length from NCBI data
lengthChromosome(chrom, units = "hg19")
chrom |
vector of chromosomal id, 1:22,X,Y |
units |
character, or data.frame, see details |
The cytoband data was originally obtained from the lodplot package by David Duffy, which contained basepair data from genome version hg17, but also the linkage related positions in cM. These datasets have units "bases" and "cM" respectively.
Cytoband data for genome versions "hg18", "hg19", "hg38" and "mm10" has been included, and can be referenced by these strings.
It is also possible to use cytoband data as obtained from the UCSC site, by downloading the cytoBand.txt.gz or cytoBandIdeo.txt.gz annotation file for a species (see example below). Note however that this information is not available for most species.
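To make the UCSC option more concrete, the sketch below shows how a chromosome length can be read off a table in the cytoBand.txt layout (chromosome, band start, band end, band name, stain). It is an illustration under assumptions: the coordinates are made up, the column names are added for readability (read.table without a header would give V1 to V5), and it is not necessarily how lengthChromosome works internally.

```r
# Toy table in the UCSC cytoBand.txt column layout; coordinates are invented
cytobands <- data.frame(
  chrom      = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  chromStart = c(0, 100, 250, 0, 80),
  chromEnd   = c(100, 250, 400, 80, 300),
  name       = c("p1", "q1", "q2", "p1", "q1"),
  gieStain   = c("gneg", "acen", "gpos50", "gneg", "gpos100"),
  stringsAsFactors = FALSE
)

# Bands tile the chromosome, so the largest band end is the chromosome length
chrom_lengths <- tapply(cytobands$chromEnd, cytobands$chrom, max)
print(chrom_lengths)
```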
A numeric vector in the requested units
Jan Oosting
# Show length of chromosome 1 in several types of units
lengthChromosome(1,"cM")
lengthChromosome(1,"bases")
lengthChromosome(1,"hg38")
# mm9 cytoband data
temp <- tempfile(fileext = ".txt.gz")
download.file("http://hgdownload.soe.ucsc.edu/goldenPath/mm9/database/cytoBand.txt.gz",temp)
mm9cytobands <- read.table(temp,sep="\t")
lengthChromosome(1,mm9cytobands)
# remove temp file
unlink(temp)
The function converts chromosomal ids to their numeric form, and the sex chromosomes to values between 98 and 100. This simplifies sorting on chromosome ID
numericCHR(CHR, prefix="chr")
characterCHR(CHR, prefix="")
CHR |
character/numeric vector; for both functions the mode of the input is not forced. For numericCHR the strings "X", "Y" and "XY" are converted to 98, 99 and 100 respectively. |
prefix |
character, string is excluded from ( |
numericCHR returns a numeric vector of same length as CHR
characterCHR returns a character vector of same length as CHR
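The conversion described for numericCHR can be sketched in base R as follows. `chr_to_numeric` is a hypothetical stand-in written for this illustration, not the package function; it only assumes the documented mapping of "X", "Y" and "XY" to 98, 99 and 100.

```r
# Hypothetical re-implementation of the documented mapping (illustration only)
chr_to_numeric <- function(chr, prefix = "chr") {
  chr <- as.character(chr)
  # Strip the prefix where present, without treating it as a regular expression
  has_prefix <- nzchar(prefix) & startsWith(chr, prefix)
  chr[has_prefix] <- substring(chr[has_prefix], nchar(prefix) + 1)
  sex <- c(X = 98, Y = 99, XY = 100)
  out <- suppressWarnings(as.numeric(chr))   # autosomes convert directly
  is_sex <- chr %in% names(sex)
  out[is_sex] <- sex[chr[is_sex]]            # sex chromosomes use the lookup
  out
}

chroms <- c("chr3", "chrX", "10", "chrY", "2", "XY")
ids <- chr_to_numeric(chroms)
print(ids)
print(sort(ids))   # numeric order: autosomes first, then X, Y, XY
```

Sorting the numeric ids avoids the character-sort problem where "10" comes before "2".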
Jan Oosting
chroms<-c("3","2","8","X","7","Y","5","1","9","10","11","12","4","6")
sort(chroms)
sort(numericCHR(chroms))
characterCHR(sort(numericCHR(chroms)),prefix="chr")
Paints a human chromosomal ideogram in an existing plot. Adapted from the paint.chromosome function in the lodplot package by David L Duffy.
paintCytobands(chrom, pos = c(0, 0), units = "hg19", width = 0.4, length.out, bands = "major", orientation = c("h","v"), legend = TRUE, cex.leg = 0.7, bleach = 0, ...)
chrom |
chromosomal id, chromosome to plot 1:22,X,Y |
pos |
numeric vector of length 2, position in the plot to start the plot |
units |
character or data.frame, type of units for genomic data, or a dataframe with UCSC cytoband data, see |
width |
numeric, width of the chromosome, the chromosome is plotted between |
length.out |
numeric, if given, the chromosome will have this length in the plot |
bands |
if not equal to "major", then also the minor bands will be plotted |
orientation |
chromosome is plotted either Horizontally to the right of the starting point or Vertically down from the starting point |
legend |
logical, if |
cex.leg |
numeric, relative size of legend text |
bleach |
numeric [0,1], proportion by which to bleach the chromosome |
... |
extra parameters for |
This function is executed for its side effects
David L Duffy, Jan Oosting
lodplot package
plot(c(0,lengthChromosome(14,"bases")),c(-2,2),type="n",xaxt="n",yaxt="n",xlab="",ylab="")
paintCytobands(14,units="bases")
This function is a wrapper for plotSmoothed, to make data subsetting easier
plotChromosome(gendata, chrompos, chromosome, dataselection = NULL, ylim = NULL, normalized.to = NULL, grid = NULL, smooth.lambda = 2, interval = 0.5, ...)
gendata |
numeric matrix or data.frame |
chrompos |
chrompos object with same number of rows as gendata |
chromosome |
numeric, chromosome to show |
dataselection |
optional, subset of samples/columns in gendata |
ylim |
limits for plot |
normalized.to |
y-value(s) for line |
grid |
x-value(s) for line |
smooth.lambda |
smoothing parameter, see |
interval |
position of extra lines besides median, see |
... |
extra arguments for |
The function is used for its side effects
Jan Oosting
Plot a smoothed line together with the original data values
plotSmoothed(intensities, position, ylim=NULL, ylab="intensity", xlab="position", normalized.to=NULL, grid=NULL, smooth.lambda=2, interval=0.5, plotnew=TRUE, cols, cex.pts = 0.6, ...)
intensities |
numeric vector or matrix, data are plotted by column |
position |
numeric vector; the length should be the number of rows in intensities |
ylim |
numeric vector of length 2, limits for plot. If |
ylab |
character, label for y-position |
xlab |
character, label for x-position |
normalized.to |
numeric, a line(s) is drawn at this horizontal position |
grid |
numeric, a line(s) is drawn at this vertical position |
smooth.lambda |
numeric, smoothing parameter see |
interval |
numeric (0..1), plotting of extra smoothed lines around median. With |
plotnew |
logical, if TRUE a new plot is created, else the data are plotted into an existing plot |
cols |
color vector, colors for columns in |
cex.pts |
size of the dots in the plot. Set to |
... |
extra parameters for |
This function plots the raw data values as dots and the median smoothed values as a continuous line. If interval is supplied these are plotted as lines in different line types. More than 1 interval can be given.
This function is used for its side effects
Jan Oosting
data(chr14)
plotSmoothed(bac.cn,bac.pos,ylim=c(1,2.5),normalized.to=2,smooth.lambda=2.5)
Determine cytoband position based on location of probe
position2Cytoband(chrom, position, units = "hg19", bands = c("major", "minor"))
chrom |
chromosomal id, chromosome to plot 1:22,X,Y |
position |
numeric vector |
units |
character, type of positional unit |
bands |
character, type of cytoband |
Character vector with cytobands, if an illegal position was used, the value "-" is returned. All positions within a single function call should be for a single chromosome
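The lookup behaviour described above, including the "-" value for illegal positions, can be mimicked with findInterval on a table of band boundaries. This is a minimal sketch: `lookup_band` is a hypothetical helper, the band table holds invented coordinates for a single chromosome, and the bands are assumed to be contiguous and sorted by start.

```r
# Toy band table for one chromosome (coordinates invented for illustration)
bands <- data.frame(
  name  = c("p2", "p1", "q1", "q2"),
  start = c(0, 40, 100, 160),
  end   = c(40, 100, 160, 220),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map positions to band names, "-" when out of range
lookup_band <- function(position, bands) {
  breaks <- c(bands$start, max(bands$end))
  idx <- findInterval(position, breaks, rightmost.closed = TRUE)
  valid <- position >= min(bands$start) & position <= max(bands$end)
  out <- rep("-", length(position))
  out[valid] <- bands$name[idx[valid]]
  out
}

hits <- lookup_band(c(10, 120, 500, -5), bands)
print(hits)
```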
Jan Oosting
position2Cytoband(1,c(50e6,125e6,200e6),units="bases")
position2Cytoband(1,c(50,125,200),units="cM",bands="minor")
This function starts up a plot consisting of all chromosomes of a genome, including axes with chromosome names.
prepareGenomePlot(chrompos, cols = "grey50", paintCytobands = FALSE, bleach = 0, topspace = 1, organism, sexChromosomes = FALSE, units = "hg19",...)
chrompos |
chrompos object, data.frame with |
cols |
color(s) for the chromosome lines |
paintCytobands |
logical, use |
bleach |
numeric [0,1], proportion by which to bleach the ideograms |
topspace |
numerical, extra space on top of plot, i.e. for legends |
organism |
character, if given a 2 column plot is created with the chromosomes for the given species. Currently "hsa", "mmu", and "rno" are supported |
sexChromosomes |
logical, if |
units |
character or data.frame, type of units for genomic data, or a dataframe with UCSC cytoband data, see |
... |
extra arguments for |
If organism is not supplied then a single column is plotted of the available chromosomes in chrompos$CHR. The arguments paintCytobands, bleach, and sexChromosomes are not used in that case.
If organism is supplied and chrompos is NULL then a result is generated with the starting Y and X position of each chromosome.
A matrix with 2 columns that contain the Y and X positions for the probes on the plot
Jan Oosting
Quantile smoothing of array data
quantsmooth(intensities,smooth.lambda=2, tau=0.5, ridge.kappa=0,smooth.na=TRUE,segment)
intensities |
numeric vector |
smooth.lambda |
numeric |
tau |
numeric [0..1], the quantile desired; see |
ridge.kappa |
fudge parameter; see details |
smooth.na |
logical; handling of NA |
segment |
integer, length of overlapping segments |
This function returns a vector of the same length as intensities, or a matrix if the length of tau is greater than 1.
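As background for the tau argument: in quantile regression generally, the tau-quantile is the value that minimises the sum of asymmetric "check" losses. The base R sketch below demonstrates that property for a constant fit on simulated data; it says nothing about how quantsmooth is implemented, and `check_loss` and `total_loss` are names introduced here for illustration.

```r
# Check (pinball) loss: residuals above the fit are weighted by tau,
# residuals below the fit by (1 - tau)
check_loss <- function(u, tau) u * (tau - (u < 0))

set.seed(1)
y <- rnorm(201)   # simulated data, illustration only

# Total loss of fitting the single constant z to all of y
total_loss <- function(z, tau) sum(check_loss(y - z, tau))

tau <- 0.25
best <- optimize(total_loss, interval = range(y), tau = tau)$minimum

# The minimiser should agree closely with the sample quantile
print(c(minimiser = best, sample_quantile = unname(quantile(y, tau))))
```

With tau = 0.5 the two weights are equal and the minimiser is the median, which is why 0.5 is the natural default.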
Jan Oosting
data(chr14)
plot(quantsmooth(bac.cn[,1],smooth.lambda=2.8),type="l")
Cross validation of smoothing parameters
quantsmooth.cv(intensities,smooth.lambda=2, ridge.kappa=0)
intensities |
numeric vector |
smooth.lambda |
numeric; see |
ridge.kappa |
fudge parameter; see |
Cross validation is performed by calculating the fit from the even indices on the odd indices and vice versa.
This function returns the sum of squared differences, or NA if the fitting function gave an error.
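The even/odd scheme described above can be sketched in base R with any smoother plugged in. `even_odd_cv` is a hypothetical function written for this illustration: it uses linear interpolation to carry a fit from the training half to the held-out positions, which is an assumption of the sketch and may differ from what quantsmooth.cv does.

```r
# Hypothetical even/odd cross-validation; `smoother` is any function that
# maps a numeric vector to a smoothed vector of the same length.
even_odd_cv <- function(y, smoother) {
  idx  <- seq_along(y)
  odd  <- idx[idx %% 2 == 1]
  even <- idx[idx %% 2 == 0]
  half_error <- function(train, test) {
    fit  <- smoother(y[train])                           # fit on one half
    pred <- approx(train, fit, xout = test, rule = 2)$y  # predict the other half
    sum((y[test] - pred)^2)
  }
  # Return NA instead of failing when the smoother raises an error
  tryCatch(half_error(even, odd) + half_error(odd, even),
           error = function(e) NA_real_)
}

# A running median stands in for the quantile smoother (illustration only)
set.seed(42)
y <- c(rep(2, 20), rep(1, 20)) + rnorm(40, sd = 0.1)
score <- even_odd_cv(y, function(v) stats::runmed(v, k = 3))
print(score)
```

A lower score indicates that a fit from one half of the probes predicts the other half well.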
Jan Oosting
data(chr14)
# A low value is indicative of a better fit to the data
quantsmooth.cv(bac.cn[,1],1)
quantsmooth.cv(bac.cn[,1],2.8)
Segmented quantile smoothing of array data
quantsmooth.seg(y, x = 1:length(y), lambda = 2, tau = 0.5,kappa = 0, nb = length(x))
y |
numeric vector |
x |
numeric vector of same length as |
lambda |
numeric |
tau |
numeric [0..1], the quantile desired; see |
kappa |
fudge parameter; see details |
nb |
integer, basis |
This function returns a vector of the same length as y
Jan Oosting
data(chr14)
plot(quantsmooth.seg(bac.cn[,1],lambda=2.8,nb=50),type="l")
This function scales data to a new range while enforcing the boundaries. This can be helpful in preventing overlap between chromosomal plots that display multiple chromosomes in the same plot
scaleto(x, fromlimits = c(0, 50), tolimits = c(0.5, -0.5), adjust = TRUE)
x |
numeric |
fromlimits |
numeric vector with length 2, original range of data |
tolimits |
numeric vector with length 2, target range of data |
adjust |
logical, if |
numeric of same size as x
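The rescaling idea can be sketched in base R as a linear map followed by clamping. `rescale_clamped` is a hypothetical helper: it assumes that "enforcing the boundaries" means values outside fromlimits are clipped to the nearest limit, and it leaves out the adjust argument, whose description is incomplete above.

```r
# Hypothetical sketch: map fromlimits linearly onto tolimits and clip
# anything outside the source range (assumed meaning of boundary enforcement)
rescale_clamped <- function(x, fromlimits = c(0, 50), tolimits = c(0.5, -0.5)) {
  # Clamp first so the result cannot leave the target range
  x <- pmin(pmax(x, min(fromlimits)), max(fromlimits))
  tolimits[1] + (x - fromlimits[1]) / diff(fromlimits) * diff(tolimits)
}

scaled <- rescale_clamped(c(-10, 0, 25, 50, 80))
print(scaled)
```

With the default tolimits running from 0.5 down to -0.5, larger input values are drawn lower, and no value can stray more than half a unit from its chromosome's baseline.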
Jan Oosting