Package 'segmenter' reference manual

Title:	Perform Chromatin Segmentation Analysis in R by Calling ChromHMM
Description:	Chromatin segmentation analysis transforms ChIP-seq data into signals over the genome. These signals represent the observed states in a multivariate Markov model used to predict the chromatin's underlying states. ChromHMM, written in Java, integrates histone modification datasets to learn the chromatin states de novo. The goal of this package is to call ChromHMM from within R, capture the output files in an S4 object and interface to other relevant Bioconductor analysis tools. In addition, segmenter provides functions to test, select and visualize the output of the segmentation.
Authors:	Mahmoud Ahmed [aut, cre]
Maintainer:	Mahmoud Ahmed
License:	GPL-3
Version:	1.13.0
Built:	2025-03-08 05:55:24 UTC
Source:	https://github.com/bioc/segmenter

Call Java `BinarizeBed`

Description

Call the Java module BinarizeBed, which binarizes a bed file of the aligned reads.

Usage

.Binarize(inputdir, cellmarkfiletable, chromsizefile, binsize, outputdir, type)

Arguments

`inputdir`	A string. The path to bed files.
`cellmarkfiletable`	A tab-delimited file of three columns. The columns contain the cell, the mark and the name of the bed file.
`chromsizefile`	A string. The path to the chromosomes sizes file.
`binsize`	An integer. The bin size to use. Default is 200.
`outputdir`	A string. The path to a directory where output will be written.
`type`	A string. The file type 'bam' or 'bed'.

Value

NULL. Output files are written to the output directory.
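
The three-column table described under `cellmarkfiletable` can be sketched in base R as follows. This is only an illustration: the cells, marks and file names are hypothetical, and writing the file without a header row is an assumption about the expected layout.

```r
# Hypothetical cells, marks and bed file names, for illustration only
cellmark <- data.frame(
    cell = c("K562", "K562", "GM12878"),
    mark = c("H3K4me3", "H3K27me3", "H3K4me3"),
    file = c("K562_H3K4me3.bed", "K562_H3K27me3.bed", "GM12878_H3K4me3.bed")
)

# Write a tab-delimited file (assumed here: no header, no row names)
cellmarkfiletable <- tempfile(fileext = ".tsv")
write.table(cellmark, cellmarkfiletable, sep = "\t",
            quote = FALSE, row.names = FALSE, col.names = FALSE)

# Read the file back to check that it has three columns
tab <- read.delim(cellmarkfiletable, header = FALSE,
                  col.names = c("cell", "mark", "file"))
tab
```

The resulting path could then be supplied as `cellmarkfiletable`, with `inputdir` pointing to the directory that holds the listed files.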

Call Java `LearnModel`

Description

Call the Java module LearnModel which learns a multi-state model from ChIP-seq data.

Usage

.LearnModel(
  inputdir,
  outputdir,
  numstates,
  coordsdir,
  anchorsdir,
  chromsizefile,
  assembly,
  optional
)

Arguments

`inputdir`	A string. The path to binarized files.
`outputdir`	A string. The path to a directory where output will be written.
`numstates`	An integer. The number of desired states in the model.
`coordsdir`	A string. The path to genomic coordinates files.
`anchorsdir`	A string. The path to the genomic anchors files.
`chromsizefile`	A string. The path to the chromosomes sizes file.
`assembly`	A string. The name of the genomic assembly.
`optional`	A string. Other optional arguments passed to the Java command.

Value

NULL. Output files are written to the output directory.

Accessors for the `segmentation` objects

Description

These functions can be used to access the contents of segmentation objects as well as to modify them.

Usage

model(object)

## S4 method for signature 'segmentation'
model(object)

emission(object)

## S4 method for signature 'segmentation'
emission(object)

transition(object)

## S4 method for signature 'segmentation'
transition(object)

overlap(object, ...)

## S4 method for signature 'segmentation'
overlap(object, cell)

TSS(object, ...)

## S4 method for signature 'segmentation'
TSS(object, cell)

TES(object, ...)

## S4 method for signature 'segmentation'
TES(object, cell)

segment(object, ...)

## S4 method for signature 'segmentation'
segment(object, cell)

bins(object, ...)

## S4 method for signature 'segmentation'
bins(object, cell)

counts(object, ...)

## S4 method for signature 'segmentation'
counts(object, cell)

likelihood(object)

## S4 method for signature 'segmentation'
likelihood(object)

cells(object)

## S4 method for signature 'segmentation'
cells(object)

states(object)

## S4 method for signature 'segmentation'
states(object)

markers(object)

## S4 method for signature 'segmentation'
markers(object)

Arguments

`object`	An object of class `segmentation`
`...`	Other arguments passed to the accessors
`cell`	A string. The name of the cell to subset by, e.g. 'K562'.

Value

The data in the corresponding slot or a subset of it.

Examples

model(test_obj)

emission(test_obj)

transition(test_obj)

overlap(test_obj)
overlap(test_obj, cell = 'K562')

TSS(test_obj)
TSS(test_obj, cell = 'K562')

TES(test_obj)
TES(test_obj, cell = 'K562')

segment(test_obj)
segment(test_obj, cell = 'K562')

bins(test_obj)

counts(test_obj)

likelihood(test_obj)

cells(test_obj)

states(test_obj)

markers(test_obj)
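
The accessor pattern used above, a generic with `...` and a method that takes an optional `cell`, can be sketched with a toy S4 class. The class `toy_segmentation` and the functions `toy_emission` and `toy_segment` are hypothetical stand-ins written for this illustration; they are not part of segmenter, and the real slots hold richer objects.

```r
library(methods)

# A toy class with two of the slots described in this manual
setClass("toy_segmentation",
         slots = c(emission = "matrix", segment = "list"))

# Accessor without subsetting: returns the whole slot
setGeneric("toy_emission", function(object) standardGeneric("toy_emission"))
setMethod("toy_emission", "toy_segmentation",
          function(object) object@emission)

# Accessor with an optional 'cell' argument: returns the slot or a subset
setGeneric("toy_segment", function(object, ...) standardGeneric("toy_segment"))
setMethod("toy_segment", "toy_segmentation", function(object, cell) {
    if (missing(cell)) return(object@segment)
    object@segment[cell]
})

# Made-up values, only to exercise the accessors
toy <- new("toy_segmentation",
           emission = matrix(c(0.9, 0.1, 0.2, 0.8), nrow = 2,
                             dimnames = list(c("1", "2"),
                                             c("H3K4me3", "H3K27me3"))),
           segment = list(K562 = data.frame(state = c("1", "2")),
                          GM12878 = data.frame(state = c("2", "1"))))

names(toy_segment(toy))
names(toy_segment(toy, cell = "K562"))
```

Subsetting with single brackets keeps the result a list, which is consistent with examples in this manual that index the output of `segment(test_obj, cell = 'K562')` with `[[1]]`.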

Annotate segments

Description

Annotate the GRanges objects of the segments using annotatePeak (see its documentation for details).

Usage

annotate_segments(segments, ...)

Arguments

`segments`	A `GRanges` object. Usually the output of calling `segment` on the output object of `learn_model`.
`...`	Other arguments passed to `annotatePeak`

Value

A GRanges object which is identical to the input in addition to the annotations as metadata columns.

Examples

library(TxDb.Hsapiens.UCSC.hg18.knownGene)
txdb <- TxDb.Hsapiens.UCSC.hg18.knownGene
segs <- segment(test_obj)
segs_annotated <- annotate_segments(segs, TxDb = txdb, verbose = FALSE)

Binarize the bam files

Description

Transform the aligned reads into a binary format.

Usage

binarize_bam(
  inputdir,
  cellmarkfiletable,
  chromsizefile,
  binsize = 200,
  outputdir
)

Arguments

`inputdir`	A string. The directory of the bam files.
`cellmarkfiletable`	A string. The path to the input files table. Only
`chromsizefile`	A string. The path to the chromosomes sizes file.
`binsize`	An integer. The number in bp used to generate binarized files.
`outputdir`	A string. The path to a directory where output will be written.

Value

NULL. Output files are written to `outputdir`.

Examples

# locate input and output files
inputdir <- system.file("extdata", package = "bamsignals")
cellmarkfiletable <- system.file('extdata',
                                 'cell_mark_table.tsv',
                                 package = 'segmenter')
chromsizefile <- system.file('extdata/CHROMSIZES',
                             'hg18.txt',
                              package = 'chromhmmData')
outputdir <- tempdir()

# run command
binarize_bam(inputdir,
             chromsizefile = chromsizefile,
             cellmarkfiletable = cellmarkfiletable,
             outputdir = outputdir)

# show output files
list.files(outputdir, pattern = '_binary\\.txt')  # 'pattern' is a regular expression, not a glob

Compare two or more models

Description

Compare two or more models

Usage

compare_models(objs, type = "emission", plot = FALSE, ...)

Arguments

`objs`	A list of segmentation items
`type`	A string. What to compare. Defaults to 'emission'.
`plot`	A logical. Whether to return a plot instead of the numeric values.
`...`	Other arguments passed to plot

Value

A numeric vector or a plot with the same values.

Examples

compare_models(test_objs)
compare_models(test_objs, type = 'likelihood')

Count reads in `GRanges` objects from bam files

Description

Count reads in GRanges objects from bam files

Usage

count_reads_ranges(ranges, cellmarkfiletable, inputbamdir)

Arguments

`ranges`	A `GRanges` object. The ranges in which to count reads.
`cellmarkfiletable`	A string. The path to the input files table.
`inputbamdir`	A `string`. The path to the input bam files directory.

Value

A SummarizedExperiment object with ranges as its rowRanges and the counts as the assay.

Make emissions file name

Description

Make emissions file name

Usage

emissions_file(numstates)

Arguments

`numstates`	An integer

Value

A string

Examples

emissions_file(3)

Make enrichment file names

Description

Make enrichment file names

Usage

enrichment_files(numstates, cells, table = "RefSeq", annotation = "TSS")

Arguments

`numstates`	An integer
`cells`	A character vector
`table`	A string
`annotation`	A string

Value

A character vector

Examples

enrichment_files(3, 'K562')
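
As a rough illustration of how such names can be assembled, the sketch below builds one name per cell with `sprintf`. The helper `toy_enrichment_files` is hypothetical, and the name pattern is only inferred from the file `GM12878_3_RefSeqTSS_neighborhood.txt` used in the `read_enrichment_file` example; the actual function may differ.

```r
# Hypothetical helper; the pattern <cell>_<numstates>_<table><annotation>_neighborhood.txt
# is inferred from a file name used elsewhere in this manual
toy_enrichment_files <- function(numstates, cells,
                                 table = "RefSeq", annotation = "TSS") {
    # sprintf() is vectorized, so one name is returned per cell
    sprintf("%s_%d_%s%s_neighborhood.txt",
            cells, numstates, table, annotation)
}

tss_files <- toy_enrichment_files(3, c("K562", "GM12878"))
tes_files <- toy_enrichment_files(3, "K562", annotation = "TES")
tss_files
tes_files
```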

Get the frequency of the segments in each cell type

Description

Get the frequency of the segments in each cell type

Usage

get_frequency(segments, normalize = FALSE, tidy = FALSE, plot = FALSE, ...)

Arguments

`segments`	A `GRanges` object. Usually the output of calling `segment` on the output object of `learn_model`.
`normalize`	A logical. Whether the frequency should be normalized by the total number of segments.
`tidy`	A logical. Whether to return the result as a tidy data.frame rather than a matrix.
`plot`	A logical. Whether to return a plot instead of the values.
`...`	Other arguments passed to barplot

Value

A data.frame when tidy is TRUE; otherwise a matrix, or a plot when plot is TRUE.

Examples

get_frequency(segment(test_obj))
get_frequency(segment(test_obj), normalize = TRUE)
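
The idea of counting segments per state and cell, and of normalizing by the total number of segments, can be sketched in base R on a toy table. The data below are made up, and normalizing within each cell (rather than over all cells) is an assumption of this sketch, not a statement about the implementation.

```r
# Made-up segments: one row per segment with its cell and assigned state
segs <- data.frame(
    cell  = c("K562", "K562", "K562", "GM12878", "GM12878"),
    state = c("E1", "E1", "E2", "E1", "E2")
)

# Frequency of each state (rows) in each cell (columns)
freq <- table(state = segs$state, cell = segs$cell)

# Normalize by the total number of segments in each cell (assumed per cell)
freq_norm <- prop.table(freq, margin = 2)

# A long ("tidy") version with one row per state and cell
freq_tidy <- as.data.frame(freq, responseName = "frequency")

freq
freq_norm
freq_tidy
```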

Get the width of the segments in each cell type

Description

Get the width of the segments in each cell type

Usage

get_width(segments, average = FALSE)

Arguments

`segments`	A `GRanges` object. Usually the output of calling `segment` on the output object of `learn_model`.
`average`	A logical. Whether the width should be averaged across cells.

Value

A data.frame

Examples

get_width(segment(test_obj))
get_width(segment(test_obj), average = TRUE)
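
A minimal base R sketch of the underlying arithmetic, using made-up coordinates: the width of a segment is taken as `end - start + 1` (closed, 1-based intervals, an assumption of this sketch), and the average is computed per state over all cells.

```r
# Made-up segments with 1-based, closed coordinates
segs <- data.frame(
    cell  = c("K562", "K562", "GM12878"),
    state = c("E1", "E2", "E1"),
    start = c(1, 601, 1),
    end   = c(600, 1000, 400)
)

# Width of each segment
segs$width <- segs$end - segs$start + 1

# Average width of each state across cells
avg <- aggregate(width ~ state, data = segs, FUN = mean)
avg
```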

Learn a multi-state model from chromatin data

Description

Integrate multiple ChIP-seq chromatin datasets of histone modifications, transcription factors or other DNA binding proteins to build a multi-state model of the frequently occurring combinatorial and spatial patterns. The function takes as input the binarized ChIP-seq data and the genome annotations on which the states will be discovered.

Usage

learn_model(
  inputdir,
  outputdir,
  numstates,
  coordsdir,
  anchorsdir,
  chromsizefile,
  assembly,
  cells,
  annotation,
  binsize,
  inputbamdir,
  cellmarkfiletable,
  read_only = FALSE,
  read_bins = FALSE,
  counts = FALSE
)

Arguments

`inputdir`	A string. The path to binarized files.
`outputdir`	A string. The path to a directory where output will be written.
`numstates`	An integer. The number of desired states in the model.
`coordsdir`	A string. The path to genomic coordinates files.
`anchorsdir`	A string. The path to the genomic anchors files.
`chromsizefile`	A string. The path to the chromosomes sizes file.
`assembly`	A string. The name of the genomic assembly.
`cells`	A `character` vector. The names of the cells as they occur in the binarized files (first line).
`annotation`	A string. The name of the type of annotation as it occurs in the genomic annotation files.
`binsize`	An integer. The number in bp used to generate binarized files.
`inputbamdir`	A string. The path to the input bam files. Only used when `counts = TRUE`.
`cellmarkfiletable`	A string. The path to the input files table. Only used when `bins = TRUE`.
`read_only`	A logical. Default is `FALSE`. Whether to look for and load output files or generate the model from scratch.
`read_bins`	A logical. Default is `FALSE`. Whether to load the binarized data into the output object.
`counts`	A logical. Default is `FALSE`. Whether to load the reads counts in bins data into the output object.

Details

By default, this function runs the analysis commands, writes the output to files and loads it into an object of class segmentation. In addition, the binarized data and the read counts in the bins can be loaded. When read_only is TRUE, the function looks for previously generated files in the output directory and loads them without rerunning the commands.
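
Before setting `read_only = TRUE`, it can be useful to check that a previous run left its files in the output directory. The sketch below is a hypothetical helper, not part of segmenter: `toy_outputs_exist` only looks for the three model-level files named in this manual's examples (`model_3.txt`, `emissions_3.txt`, `transitions_3.txt`), and the full set of files the function expects may be larger.

```r
# Hypothetical helper: checks for the model-level output files of a previous run
toy_outputs_exist <- function(outputdir, numstates) {
    expected <- sprintf(c("model_%d.txt",
                          "emissions_%d.txt",
                          "transitions_%d.txt"),
                        numstates)
    all(file.exists(file.path(outputdir, expected)))
}

# Demonstration on an empty temporary directory
outputdir <- tempfile("segmenter_demo_")
dir.create(outputdir)
before <- toy_outputs_exist(outputdir, 3)

# Create empty placeholder files to mimic a finished run
invisible(file.create(file.path(outputdir,
                                c("model_3.txt",
                                  "emissions_3.txt",
                                  "transitions_3.txt"))))
after <- toy_outputs_exist(outputdir, 3)

c(before = before, after = after)
```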

Value

An object of class segmentation (see the segmentation class documentation for details) and the files written to the output directory.

Examples

# locate input and output files
inputdir <- system.file('extdata/SAMPLEDATA_HG18',
                        package = 'segmenter')
outputdir <- tempdir()
coordsdir <- system.file('extdata/COORDS',
                         package = 'chromhmmData')
anchorsdir <- system.file('extdata/ANCHORFILES',
                          package = 'chromhmmData')
chromsizefile <- system.file('extdata/CHROMSIZES',
                             'hg18.txt',
                             package = 'chromhmmData')

# run command
obj <- learn_model(inputdir = inputdir,
                   outputdir = outputdir,
                   coordsdir = coordsdir,
                   anchorsdir = anchorsdir,
                   chromsizefile = chromsizefile,
                   numstates = 3,
                   assembly = 'hg18',
                   cells = c('K562', 'GM12878'),
                   annotation = 'RefSeq',
                   binsize = 200)

# show the output
obj

Merge segments and bins objects

Description

Merge segments and bins objects

Usage

merge_segments_bins(segments, bins)

Arguments

`segments`	A `GRanges` object. Usually the output of calling `segment` on the output object of `learn_model`.
`bins`	A `SummarizedExperiment` object. Usually the output of calling `bins` on the output object of `learn_model`.

Value

A SummarizedExperiment object with the segment assignment added to the metadata of the rowRanges.
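
The idea of assigning each bin the state of the segment that contains it can be sketched in base R with plain data frames. This is a simplified illustration under stated assumptions (a single chromosome, sorted and contiguous segments, made-up coordinates and state names); it stands in for the range-based matching on `GRanges` objects and is not the package's code.

```r
# Made-up segments on one chromosome, sorted by start and contiguous
segments <- data.frame(start = c(1, 601),
                       end   = c(600, 1000),
                       state = c("E1", "E2"))

# Made-up bins of 200 bp covering the same region
bins <- data.frame(start = seq(1, 801, by = 200))
bins$end <- bins$start + 199

# For each bin, find the last segment that starts at or before the bin start
idx <- findInterval(bins$start, segments$start)

# Add the segment assignment to the bins
bins$state <- segments$state[idx]
bins
```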

Methods to interact with `segmentation` objects

Description

These functions can be used to interact with segmentation objects for purposes other than accessing or modifying their contents.

Usage

## S4 method for signature 'segmentation'
show(object)

Arguments

`object`	An object of class `segmentation`

Value

Prints a summary of the segmentation object contents.

Examples

show(test_obj)

Make model file name

Description

Make model file name

Usage

model_file(numstates)

Arguments

`numstates`	An integer

Value

A string

Examples

model_file(3)

Make overlap file names

Description

Make overlap file names

Usage

overlap_files(numstates, cells)

Arguments

`numstates`	An integer
`cells`	A character vector

Value

A character vector

Examples

overlap_files(3, 'K562')

Visualize the model output

Description

Visualize the model output

Usage

plot_heatmap(obj, type = "emission", ...)

Arguments

`obj`	A segmentation object
`type`	A string. Which kind of parameter to plot. Default is 'emission' and possible values are 'emission', 'transition', 'overlap', 'TSS' or 'TES'.
`...`	Other arguments to pass to Heatmap

Value

A heatmap

Examples

plot_heatmap(test_obj)

Format the loaded binarized data

Description

The function takes the data.frames of the loaded binarized data files and formats them into GRanges or SummarizedExperiment objects.

Usage

range_bins(bins, chromsizefile, binsize, return = "GRanges", tidy = TRUE)

Arguments

`bins`	A `list` of the `read_bins_file` output.
`chromsizefile`	A string. The path to the chromosomes sizes file.
`binsize`	An integer. The number in bp used to generate binarized files.
`return`	A string. Possible values are `GRanges` (default) or `SummarizedExperiment`.
`tidy`	A `logical`. Default is `TRUE`. Whether to tidy the metadata columns of the `GRanges` object.

Value

GRanges (default) or SummarizedExperiment.
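
To make the roles of `chromsizefile` and `binsize` concrete, the sketch below derives bin coordinates from a two-column chromosome sizes file in base R. The file contents are made up, `make_bins` is a hypothetical helper, and the conventions used (1-based starts, dropping a trailing partial bin) are assumptions of this sketch rather than documented behavior.

```r
# A made-up chromosome sizes file: name and length, tab-delimited
chromsizefile <- tempfile(fileext = ".txt")
writeLines(c("chr1\t1000", "chr2\t450"), chromsizefile)

sizes <- read.delim(chromsizefile, header = FALSE,
                    col.names = c("seqname", "length"))
binsize <- 200

# Hypothetical helper: consecutive bins of 'binsize' bp on one chromosome
make_bins <- function(seqname, len, binsize) {
    nbins <- len %/% binsize                       # assumed: partial bin dropped
    start <- seq(1, by = binsize, length.out = nbins)
    data.frame(seqname = seqname,
               start = start,
               end = start + binsize - 1)
}

# Build the bins for every chromosome and stack them
bins <- do.call(rbind,
                Map(make_bins, sizes$seqname, sizes$length, binsize))
bins
```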

Format the loaded counts data

Description

The function takes the data.frames of the loaded counts data and formats them into GRanges or SummarizedExperiment objects.

Usage

range_counts(
  counts,
  features,
  return = "GRanges",
  tidy = FALSE,
  average = FALSE,
  marks
)

Arguments

`counts`	A `matrix` of the `read_bam_file` output.
`features`	A `GRanges` object. The ranges that were used to count reads in the bam files.
`return`	A string. Possible values are `GRanges` (default) or `SummarizedExperiment`.
`tidy`	A `logical`. Default is `FALSE`. Whether to tidy the metadata columns of the `GRanges` object.
`average`	A `logical`. Default is `FALSE`. Whether to average the counts by `marks` before building the object.
`marks`	A `character` vector. The length should equal the number of columns in `counts`; it is used for averaging and renaming the matrix columns.

Value

GRanges (default) or SummarizedExperiment.

Read `bam` files

Description

Count the reads in each range of the GRanges object

Usage

read_bam_file(file, features, ...)

Arguments

`file`	A string. The path to the file.
`features`	A `GRanges` object.
`...`	Other arguments passed to `bamCount`.

Value

A matrix

Examples

# locate the bam file
bam_file <- system.file("extdata", "randomBam.bam", package = "bamsignals")

# load a granges object
rand_anno <- system.file("extdata",
                         "randomAnnot.Rdata",
                         package = "bamsignals")
features <- GenomicRanges::promoters(get(load(rand_anno)))

# count reads in ranges
read_bam_file(bam_file, features)

Read `bins` files

Description

The files contain the cell and the chromosome info in the first line and the binarized data from all marks in the rest.

Usage

read_bins_file(file)

Arguments

`file`	A string. The path to the file.

Value

A list of 3 items: cell, seqname and binaries.

Examples

# locate the file
fl <- system.file('extdata/SAMPLEDATA_HG18/',
                  'GM12878_chr11_binary.txt.gz',
                  package = 'segmenter')

# read the file
read_bins_file(fl)
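
The file layout described above can be illustrated with a small base R parser run on a made-up file. This is a sketch, not the package's reader: the toy file assumes that the second line holds the mark names and that the remaining lines hold one row of 0/1 values per bin.

```r
# A made-up binarized file: cell and chromosome in the first line,
# mark names in the second (assumed), then one row of 0/1 values per bin
fl <- tempfile(fileext = ".txt")
writeLines(c("GM12878\tchr11",
             "H3K4me3\tH3K27me3",
             "0\t1",
             "1\t1",
             "0\t0"),
           fl)

# First line: cell and chromosome
first <- strsplit(readLines(fl, n = 1), "\t")[[1]]

# Remaining lines: the binarized data, one column per mark
binaries <- read.delim(fl, skip = 1, header = TRUE)

# A list of 3 items, mirroring the documented return value
res <- list(cell = first[1], seqname = first[2], binaries = binaries)
str(res)
```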

Read `cellmarktable` file

Description

The file should contain at least three columns: cell, mark and file for the names of the cells/conditions, the available marks and binarized data files.

Usage

read_cellmark_file(file)

Arguments

`file`	A string. The path to the file.

Value

A data.frame

Examples

# locate the file
fl <- system.file('extdata',
                  'cell_mark_table.tsv',
                  package = 'segmenter')

# read the file
read_cellmark_file(fl)

Read `chromsizefile`

Description

The file should contain exactly two columns. One for the name of the chromosome and the other for its length.

Usage

read_chromsize_file(file)

Arguments

`file`	A string. The path to the file.

Value

A data.frame

Examples

# locate the file
chromsizefile <- system.file('extdata/CHROMSIZES',
                             'hg18.txt',
                             package = 'chromhmmData')

# read the file
read_chromsize_file(chromsizefile)

Read `emissions` file

Description

The emissions file is the output of running learn_model and is named emissions_#.txt (e.g. emissions_3.txt).

Usage

read_emissions_file(file, states, marks)

Arguments

`file`	A string. The path to the file.
`states`	A `character` vector. The names of the states.
`marks`	A `character` vector. The names of the marks

Value

A matrix

Examples

# locate the file
fl <- file.path(tempdir(), 'emissions_3.txt')

# read the file
read_emissions_file(fl)

Read `enrichment` files

Description

The enrichment files are the output of running learn_model and are named, for example, <cell>_3_RefSeqTSS_neighborhood.txt or <cell>_3_RefSeqTES_neighborhood.txt.

Usage

read_enrichment_file(file, states, regions)

Arguments

`file`	A string. The path to the file.
`states`	A `character` vector. The names of the states.
`regions`	A `character` vector. The names of the regions.

Value

A matrix

Examples

# locate the file
fl <- file.path(tempdir(), 'GM12878_3_RefSeqTSS_neighborhood.txt')

# read the file
read_enrichment_file(fl)

Read `modelfile`

Description

The model file is the output of running learn_model and is named model_#.txt (e.g. model_3.txt).

Usage

read_model_file(file)

Arguments

`file`	A string. The path to the file.

Value

A data.frame

Examples

# locate the file
modelfile <- file.path(tempdir(), 'model_3.txt')

# read the file
read_model_file(modelfile)

Read `overlap` files

Description

The overlap files are the output of running learn_model and are named <cell>_#_overlap.txt (e.g. GM12878_3_overlap.txt).

Usage

read_overlap_file(file, states, regions)

Arguments

`file`	A string. The path to the file.
`states`	A `character` vector. The names of the states.
`regions`	A `character` vector. The names of the regions.

Value

A matrix

Examples

# locate the file
fl <- file.path(tempdir(), 'GM12878_3_overlap.txt')

# read the file
read_overlap_file(fl)

Read `segments` files

Description

The segments files are the output of running learn_model and are named <cell>_#_segments.bed (e.g. GM12878_3_segments.bed).

Usage

read_segements_file(file, states)

Arguments

`file`	A string. The path to the file.
`states`	A `character` vector. The names of the states.

Value

A data.frame

Examples

# locate the file
segmentfile <- file.path(tempdir(), 'GM12878_3_segments.bed')

# read the file
segs <- read_segements_file(segmentfile)
head(segs)

Read `transitions` file

Description

The transitions file is the output of running learn_model and is named transitions_#.txt (e.g. transitions_3.txt).

Usage

read_transitions_file(file, states)

Arguments

`file`	A string. The path to the file.
`states`	A `character` vector. The names of the states.

Value

A matrix

Examples

# locate the file
fl <- file.path(tempdir(), 'transitions_3.txt')

# read the file
read_transitions_file(fl)

segmentation objects

Description

The segmentation class consists of matrices and lists. The components contain the output of the chromatin segmentation analysis. Loading the input data is optional. The object is returned as a result of calling learn_model or reading its already existing output.

Slots

model: list. The list consists of 6 items corresponding to the contents of the model_#.txt file. These are number_states and number_marks for the numbers of states and marks in the model; likelihood and probinit for the likelihood and the initial probabilities of the multi-state model; transitionprobs and emissionprobs for the probabilities of the transitions and emissions parameters of the model. Can be accessed using model.
emission: matrix. The matrix contains the emission parameters of n states (rows) for n marks (columns) corresponding to the contents of the emissions_#.txt file. Can be accessed using emission.
transition: matrix. The matrix contains the transition parameters of n by n states corresponding to the contents of the transitions_#.txt file. Can be accessed using transition.
overlap: list. A list of n number of cells/conditions items. Each item is a matrix of the overlap enrichment of n states (rows) at n genomic annotations (columns) corresponding to the contents of the <cell>_#_overlap.txt files. Can be accessed using overlap.
TSS: list. A list of n number of cells/conditions items. Each item is a matrix of the overlap enrichment of n states (rows) at n locations around the transcription start site (TSS) (columns) corresponding to the contents of the <cell>_#_TSS_neighborhood.txt files. Can be accessed using TSS.
TES: list. A list of n number of cells/conditions items. Each item is a matrix of the overlap enrichment of n states (rows) at n locations around the transcription end site (TES) (columns) corresponding to the contents of the <cell>_#_TES_neighborhood.txt files. Can be accessed using TES.
segment: list. A list of n number of cells/conditions items. Each item is a GRanges object containing the segmentation and assigned states as a metadata column 'state'. These contents correspond to the <cell>_#_segments.bed files. Annotations of the ranges are optional. Can be accessed using segment.
bins: list. A list of n number of cells/conditions items. Each item is a SummarizedExperiment object containing the binarized input data. The coordinates of the bins are saved as the rowRanges each assigned to a state and the binary data itself is saved as assay. Can be accessed using bins.
counts: list. A list of n number of cells/conditions items. Each item is a SummarizedExperiment object containing the read counts in bins. The coordinates of the bins are saved as the rowRanges each assigned to a state and the counts data itself is saved as assay. Can be accessed using counts.

Make segments file names

Description

Make segments file names

Usage

segments_files(numstates, cells)

Arguments

`numstates`	An integer
`cells`	A character vector

Value

A character vector

Examples

segments_files(3, 'K562')

A segmentation object generated from the test data

Description

A segmentation object generated by running learn_model on the test dataset in 'inst/extdata/ChromHMM/SAMPLEDATA_HG18'. The source code for this run is in 'inst/script/test_obj.R'.

Usage

test_obj

Format

An object of class segmentation of length 1.

A list of segmentation objects generated from the test data

Description

A list of segmentation objects generated by running learn_model on the test dataset in 'inst/extdata/ChromHMM/SAMPLEDATA_HG18' for 3 to 8 states. The source code for this run is in 'inst/script/test_objs.R'.

Usage

test_objs

Format

An object of class list of length 6.

Tidy the metadata of a `GRanges` object

Description

Tidy the metadata of a GRanges object

Usage

tidy_ranges(gr, columns, low = 0)

Arguments

`gr`	A `GRanges` object
`columns`	A `character` vector. The names of columns to be tidied.
`low`	An `integer`. All values <= this `integer` will be removed.

Value

A GRanges object

Examples

tidy_ranges(segment(test_obj, cell = 'K562')[[1]])

Make transitions file name

Description

Make transitions file name

Usage

transitions_file(numstates)

Arguments

`numstates`	An integer

Value

A string

Examples

transitions_file(3)
