Package 'HiCExperiment'

Title: Bioconductor class for interacting with Hi-C files in R
Description: R generic interface to Hi-C contact matrices in `.(m)cool`, `.hic` or HiC-Pro derived formats, as well as other Hi-C processed file formats. Contact matrices can be partially parsed using a random access method, allowing a memory-efficient representation of Hi-C data in R. The `HiCExperiment` class stores the Hi-C contacts parsed from local contact matrix files. `HiCExperiment` instances can be further investigated in R using the `HiContacts` analysis package.
Authors: Jacques Serizay [aut, cre]
Maintainer: Jacques Serizay
License: MIT + file LICENSE
Version: 1.7.0
Built: 2024-12-19 03:40:46 UTC
Source: https://github.com/bioc/HiCExperiment


AggrHiCExperiment S4 class

Description

The AggrHiCExperiment class extends the HiCExperiment class.

Usage

AggrHiCExperiment(
  file,
  resolution = NULL,
  targets,
  flankingBins = 50,
  metadata = list(),
  topologicalFeatures = S4Vectors::SimpleList(),
  pairsFile = NULL,
  bed = NULL,
  maxDistance = NULL,
  BPPARAM = BiocParallel::bpparam()
)

## S4 method for signature 'AggrHiCExperiment,missing'
slices(x)

## S4 method for signature 'AggrHiCExperiment,character'
slices(x, name)

## S4 method for signature 'AggrHiCExperiment,numeric'
slices(x, name)

## S4 method for signature 'AggrHiCExperiment'
show(object)

Arguments

file

CoolFile or plain path to a Hi-C contact file

resolution

Resolution to use with the Hi-C contact file

targets

Set of chromosome coordinates for which interaction counts are extracted from the Hi-C contact file, provided as a GRanges object (for diagonal-centered loci) or as a GInteractions object (for off-diagonal coordinates).

flankingBins

Number of bins on each flank of the bins containing input targets.

metadata

list of metadata

topologicalFeatures

topologicalFeatures provided as a named SimpleList

pairsFile

Path to an associated .pairs file

bed

Path to regions file generated by HiC-Pro

maxDistance

Maximum distance to use when compiling distance decay

BPPARAM

BiocParallel parameters

x, object

An AggrHiCExperiment object.

name

The name/index of slices to extract.

Value

An AggrHiCExperiment object.

Slots

fileName

Path of Hi-C contact file

resolutions

Resolutions available in the Hi-C contact file.

resolution

Current resolution

interactions

Genomic Interactions extracted from the Hi-C contact file

scores

Available interaction scores.

slices

Available interaction slices.

topologicalFeatures

Topological features associated with the dataset (e.g. loops (<Pairs>), borders (<GRanges>), viewpoints (<GRanges>), etc.)

pairsFile

Path to the .pairs file associated with the Hi-C contact file

metadata

metadata associated with the Hi-C contact file.

See Also

HiCExperiment()

Examples

fpath <- HiContactsData::HiContactsData('yeast_wt', 'mcool')
data(centros_yeast)
x <- AggrHiCExperiment(
  file = fpath, 
  resolution = 8000,
  targets = centros_yeast[c(4, 7)]
)
x
slices(x, 'count')[1:10, 1:10, 1]
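The indexing `slices(x, 'count')[1:10, 1:10, 1]` in the example above suggests that each set of slices behaves like a three-dimensional array: two bin axes plus one layer per target. As a minimal base-R sketch (not the package's implementation), such a stack can be averaged into a single aggregate map. The toy array `toy_slices` and its dimensions are made up for illustration.

```r
# Toy stack of 3 "slices", each a 5 x 5 matrix (hypothetical data,
# standing in for what slices(x, 'count') might contain).
set.seed(1)
toy_slices <- array(rpois(5 * 5 * 3, lambda = 10), dim = c(5, 5, 3))

# Average across the third dimension (one layer per target),
# ignoring missing values, to obtain one aggregate 5 x 5 map.
aggregate_map <- apply(toy_slices, c(1, 2), mean, na.rm = TRUE)

dim(toy_slices)     # 5 5 3
dim(aggregate_map)  # 5 5
```

Averaging with `na.rm = TRUE` keeps a pixel defined as long as at least one target contributes a value to it.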

Coercing functions

Description

Coercing functions available for HiCExperiment objects.

Usage

## S4 method for signature 'HiCExperiment'
as.matrix(x, use.scores = "balanced", sparse = FALSE)

## S4 method for signature 'HiCExperiment'
as.data.frame(x)

gi2cm(gi, use.scores = "score")

cm2matrix(cm, replace_NA = NA, sparse = FALSE)

df2gi(
  df,
  seqnames1 = "seqnames1",
  start1 = "start1",
  end1 = "end1",
  seqnames2 = "seqnames2",
  start2 = "start2",
  end2 = "end2"
)

Arguments

x

HiCExperiment object

use.scores

Which scores to use to inflate GInteractions

sparse

Whether to return the contact matrix as a sparse matrix

gi

GInteractions object

cm

A ContactMatrix object

replace_NA

Value used to replace NA entries in the returned matrix (default: NA, i.e. NA entries are kept)

df

A data.frame object

seqnames1, start1, end1, seqnames2, start2, end2

Names (as strings) of the columns containing the corresponding information in the data.frame parsed into GInteractions (defaults: "seqnames1", "start1", "end1", "seqnames2", "start2", "end2")

Examples

mcoolPath <- HiContactsData::HiContactsData('yeast_wt', 'mcool')
contacts <- import(mcoolPath, focus = 'XVI', resolution = 16000, format = 'cool')
gis <- interactions(contacts)
cm <- gi2cm(gis, 'balanced')
cm
cm2matrix(cm)[1:10, 1:10]
df2gi(data.frame(
    chr1 = 'I', start1 = 10, end1 = 100, 
    chr2 = 'I', start2 = 40, end2 = 1000, 
    score = 12, 
    weight = 0.234, 
    filtered = TRUE
), seqnames1 = 'chr1', seqnames2 = 'chr2')
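To make the idea behind "inflating" interactions into a matrix concrete, here is a small base-R sketch that turns a table of bin pairs and scores into a dense symmetric matrix. It only illustrates the concept and is not how gi2cm() or cm2matrix() are implemented; the table `toy` and its column names are hypothetical.

```r
# Toy upper-triangle interactions between 4 bins (hypothetical values).
toy <- data.frame(
    bin1  = c(1, 1, 2, 3),
    bin2  = c(1, 3, 2, 4),
    score = c(5.0, 1.5, 4.0, 2.0)
)
n_bins <- 4

# Start from an all-NA matrix: bin pairs without a recorded score stay NA.
m <- matrix(NA_real_, nrow = n_bins, ncol = n_bins)

# Fill the recorded pairs, then mirror them to make the matrix symmetric.
m[cbind(toy$bin1, toy$bin2)] <- toy$score
m[cbind(toy$bin2, toy$bin1)] <- toy$score

# Optionally replace NA by a chosen value, similar in spirit to `replace_NA`.
m0 <- m
m0[is.na(m0)] <- 0
m0
```

Keeping NA for unrecorded pairs distinguishes "no data" from a true zero score, which is presumably why NA is the default replacement value.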

HiCExperiment binning methods

Description

HiCExperiment binning methods

Usage

## S4 method for signature 'GInteractions,numeric'
bin(x, resolution, seqinfo = NULL)

## S4 method for signature 'PairsFile,numeric'
bin(x, resolution, seqinfo = NULL)

Arguments

x

A PairsFile or GInteractions object

resolution

Which resolution to use to bin the interactions

seqinfo

Seqinfo object

Examples

pairsf <- HiContactsData::HiContactsData('yeast_wt', 'pairs.gz')
pf <- PairsFile(pairsf)
bin(pf, resolution = 10000)
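Conceptually, binning assigns each end of a pair to a fixed-width genomic bin and counts the pairs per bin combination. The base-R sketch below illustrates this on toy positions; the helper `bin_start()` is hypothetical, and the 1-based bin starts (1, 1001, 2001, ...) are an assumption, not a description of the package internals.

```r
# Hypothetical helper: map 1-based positions to the start of their bin.
bin_start <- function(pos, resolution) {
    floor((pos - 1) / resolution) * resolution + 1
}

# Toy pairs: positions of both ends of 5 read pairs on one chromosome.
pos1 <- c(120, 950, 300, 1001, 2500)
pos2 <- c(1800, 990, 700, 1999, 2999)
resolution <- 1000

binned <- data.frame(
    start1 = bin_start(pos1, resolution),
    start2 = bin_start(pos2, resolution)
)

# Count how many pairs fall in each (bin1, bin2) combination.
counts <- aggregate(n ~ start1 + start2, data = cbind(binned, n = 1), FUN = sum)
counts
```

A coarser resolution merges more pairs into each bin, trading spatial detail for higher counts per bin.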

ContactsFile S4 class

Description

The ContactsFile class describes a BiocFile object, pointing to the location of a Hi-C matrix file (cool, mcool, hic, hicpro, ...) and containing additional slots:

  1. resolution: at which resolution the associated matrix file should be parsed;

  2. pairsFile: the path (in plain character) to an optional pairs file (stored as a PairsFile object);

  3. metadata: a list. If the CoolFile is created by HiCool, it will contain two elements: log (path to HiCool processing log file) and stats (aggregating some stats from HiCool mapping).

ContactsFile methods.

Arguments

path

String; path to a Hi-C matrix file (cool, mcool, hic, hicpro)

resolution

numeric; resolution to use with Hi-C matrix file

pairsFile

String; path to a pairs file

metadata

list.

object

A ContactsFile object.

x

A ContactsFile object.

Slots

resolution

numeric value or NULL

pairsFile

PairsFile object

metadata

list

See Also

CoolFile(), HicFile(), HicproFile()


CoolFile S4 class

Description

The CoolFile class describes a BiocFile object, pointing to the location of a (m)cool file and containing additional slots:

  1. resolution: at which resolution the associated mcool file should be parsed;

  2. pairsFile: the path (in plain character) to an optional pairs file (stored as a PairsFile object);

  3. metadata: a list. If the CoolFile is created by HiCool, it will contain two elements: log (path to HiCool processing log file) and stats (aggregating some stats from HiCool mapping).

CoolFile methods.

Arguments

path

String; path to a (m)cool file

resolution

numeric; resolution to use with mcool file

pairsFile

String; path to a pairs file

metadata

list; if the CoolFile object was generated by HiCool::HiCool, this list contains the path to log file, some statistics regarding the number of pairs obtained by hicstuff as well as the arguments and the hash ID used by HiCool.

object

A CoolFile object.

See Also

HicFile(), HicproFile()

Examples

mcoolPath <- HiContactsData::HiContactsData('yeast_wt', 'mcool')
pairsPath <- HiContactsData::HiContactsData('yeast_wt', 'pairs.gz')
cf <- CoolFile(
  mcoolPath, 
  resolution = 2000, 
  pairsFile = pairsPath, 
  metadata = list(info = 'Yeast WT Hi-C exp.')
)
cf
resolution(cf)
pairsFile(cf)
metadata(cf)

Example datasets provided in HiCExperiment & HiContactsData

Description

Example datasets provided in HiCExperiment & HiContactsData

Usage

data(centros_yeast)

contacts_yeast(full = FALSE)

contacts_yeast_eco1(full = FALSE)

Arguments

full

Whether to import all interactions

Format

An object of class "GRanges".

Source

HiContacts

Examples

data(centros_yeast)
centros_yeast
contacts_yeast()

HiCExperiment export methods

Description

Export methods to save a HiCExperiment object into a cool file or into a set of HiC-Pro-style files (matrix & regions files)

Usage

## S4 method for signature 'HiCExperiment,missing,character'
export(object, prefix, format, ...)

Arguments

object

A HiCExperiment object

prefix

Prefix used when generating output file(s).

format

File format. Available: cool and HiC-Pro.

...

Extra arguments to use when exporting to cool. Can be metadata <string> or chunksize <integer>.

Value

Path to saved files

Examples

################################################################
## ----------- Importing .(m)cool contact matrices ---------- ##
################################################################

mcoolPath <- HiContactsData::HiContactsData('yeast_wt', 'mcool')
hic <- import(mcoolPath, format = 'mcool', resolution = 16000)
export(hic["II"], prefix = 'subset_chrII', format = 'cool')
export(hic["II"], prefix = 'subset_chrII', format = 'HiC-Pro')

HiCExperiment S4 class

Description

The HiCExperiment class describes Hi-C contact files imported in R, either through the HiCExperiment constructor function or using the import method implemented by the HiCExperiment package.

Usage

HiCExperiment(
  file,
  resolution = NULL,
  focus = NULL,
  metadata = list(),
  topologicalFeatures = S4Vectors::SimpleList(compartments = GenomicRanges::GRanges(),
    borders = GenomicRanges::GRanges(), loops =
    InteractionSet::GInteractions(GenomicRanges::GRanges(), GenomicRanges::GRanges()),
    viewpoints = GenomicRanges::GRanges()),
  pairsFile = NULL,
  bed = NULL
)

makeHiCExperimentFromGInteractions(gi)

## S4 method for signature 'HiCExperiment'
resolutions(x)

## S4 method for signature 'HiCExperiment'
resolution(x)

## S4 method for signature 'HiCExperiment'
focus(x)

## S4 replacement method for signature 'HiCExperiment,character'
focus(x) <- value

## S4 method for signature 'HiCExperiment,numeric'
zoom(x, resolution)

## S4 method for signature 'HiCExperiment,character'
refocus(x, focus)

## S4 method for signature 'HiCExperiment,missing'
scores(x)

## S4 method for signature 'HiCExperiment,character'
scores(x, name)

## S4 method for signature 'HiCExperiment,numeric'
scores(x, name)

## S4 replacement method for signature 'HiCExperiment,character,numeric'
scores(x, name) <- value

## S4 method for signature 'HiCExperiment,missing'
topologicalFeatures(x)

## S4 method for signature 'HiCExperiment,character'
topologicalFeatures(x, name)

## S4 method for signature 'HiCExperiment,numeric'
topologicalFeatures(x, name)

## S4 replacement method for signature 'HiCExperiment,character,GRangesOrGInteractions'
topologicalFeatures(x, name) <- value

## S4 method for signature 'HiCExperiment'
pairsFile(x)

## S4 replacement method for signature 'HiCExperiment,character'
pairsFile(x) <- value

## S4 replacement method for signature 'HiCExperiment,list'
metadata(x) <- value

## S4 method for signature 'HiCExperiment,numeric'
subsetByOverlaps(x, ranges)

## S4 method for signature 'HiCExperiment,logical'
subsetByOverlaps(x, ranges)

## S4 method for signature 'HiCExperiment,GRanges'
subsetByOverlaps(x, ranges, type = c("within", "any"))

## S4 method for signature 'HiCExperiment,GInteractions'
subsetByOverlaps(x, ranges)

## S4 method for signature 'HiCExperiment,Pairs'
subsetByOverlaps(x, ranges)

## S4 method for signature 'HiCExperiment,numeric,ANY,ANY'
x[i]

## S4 method for signature 'HiCExperiment,GRanges,ANY,ANY'
x[i]

## S4 method for signature 'HiCExperiment,logical,ANY,ANY'
x[i]

## S4 method for signature 'HiCExperiment,GInteractions,ANY,ANY'
x[i]

## S4 method for signature 'HiCExperiment,Pairs,ANY,ANY'
x[i]

## S4 method for signature 'HiCExperiment,character,ANY,ANY'
x[i]

## S4 method for signature 'HiCExperiment'
fileName(object)

## S4 method for signature 'HiCExperiment'
interactions(x, fillout.regions = FALSE)

## S4 replacement method for signature 'HiCExperiment,GInteractions'
interactions(x) <- value

## S4 method for signature 'HiCExperiment'
length(x)

## S4 replacement method for signature 'HiCExperiment'
x$name <- value

## S4 method for signature 'HiCExperiment'
x$name

## S4 method for signature 'HiCExperiment'
seqinfo(x)

## S4 method for signature 'HiCExperiment'
bins(x)

## S4 method for signature 'HiCExperiment'
anchors(x)

## S4 method for signature 'HiCExperiment'
regions(x)

## S4 method for signature 'HiCExperiment'
cis(x)

## S4 method for signature 'HiCExperiment'
trans(x)

Arguments

file

CoolFile or plain path to a Hi-C contact file

resolution

Resolution to use with the Hi-C contact file

focus

Chromosome coordinates for which interaction counts are extracted from the Hi-C contact file, provided as a character string (e.g. "II:4001-5000"). If not provided, the entire Hi-C contact file will be imported.

metadata

list of metadata

topologicalFeatures

topologicalFeatures provided as a named SimpleList

pairsFile

Path to an associated .pairs file (optional)

bed

Path to regions file generated by HiC-Pro (optional)

gi

GInteractions object

x

A HiCExperiment object.

value

Value to add to topologicalFeatures, scores, pairsFile or metadata slots.

name

Name of the element to access in topologicalFeatures or scores SimpleLists.

type

One of "within" or "any"; how interactions are subset by overlap with a provided GRanges.

i, ranges

A GRanges, GInteractions or Pairs object, coordinates as a character string, or a numeric or logical vector used to subset a HiCExperiment

object

A HiCExperiment object.

fillout.regions

Whether to add missing regions to the regions of the GInteractions

Value

A HiCExperiment object.

Slots

fileName

Path of Hi-C contact file

focus

Chromosome coordinates for which interaction counts are extracted from the Hi-C contact file.

resolutions

Resolutions available in the Hi-C contact file.

resolution

Current resolution

interactions

Genomic Interactions extracted from the Hi-C contact file

scores

Available interaction scores.

topologicalFeatures

Topological features associated with the dataset (e.g. loops (<GInteractions>), borders (<GRanges>), viewpoints (<GRanges>), etc.)

pairsFile

Path to the .pairs file associated with the Hi-C contact file

metadata

metadata associated with the Hi-C contact file.

See Also

AggrHiCExperiment(), CoolFile(), HicFile(), HicproFile(), PairsFile()

Examples

#####################################################################
## Create a HiCExperiment object from a disk-stored contact matrix ##
#####################################################################

mcool_file <- HiContactsData::HiContactsData("yeast_wt", "mcool")
pairs_file <- HiContactsData::HiContactsData("yeast_wt", "pairs.gz")
contacts <- HiCExperiment(
    file = mcool_file, 
    resolution = 8000L, 
    pairsFile = pairs_file
)
contacts

#####################################################################
## ----- Manually create a HiCExperiment from GInteractions ------ ##
#####################################################################

gis <- interactions(contacts)[1:1000]
contacts2 <- makeHiCExperimentFromGInteractions(gis)
contacts2

#####################################################################
## -------- Slots present in a HiCExperiment object -------------- ##
#####################################################################

fileName(contacts)
focus(contacts)
resolutions(contacts)
resolution(contacts)
interactions(contacts)
scores(contacts)
topologicalFeatures(contacts)
pairsFile(contacts)

#####################################################################
## ---------------------- Slot getters --------------------------- ##
#####################################################################

scores(contacts, 1) |> head()
scores(contacts, 'balanced') |> head()
topologicalFeatures(contacts, 1)

#####################################################################
## ---------------------- Slot setters --------------------------- ##
#####################################################################

scores(contacts, 'random') <- runif(length(contacts))
topologicalFeatures(contacts, 'loops') <- InteractionSet::GInteractions(
  GenomicRanges::GRanges('II:15324'), 
  GenomicRanges::GRanges('II:24310')
)
pairsFile(contacts) <- HiContactsData::HiContactsData('yeast_wt', 'pairs.gz')

#####################################################################
## ------------------ Subsetting functions ----------------------- ##
#####################################################################

contacts[1:100]
contacts['II']
contacts[c('II', 'III')]
contacts['II|III']
contacts['II:10001-30000|III:50001-90000']

#####################################################################
## --------------------- Utils functions ------------------------- ##
#####################################################################
## Adapted from other packages

seqinfo(contacts)
bins(contacts)
anchors(contacts)
regions(contacts)

#####################################################################
## ------------- Coercing HiCExperiment objects ------------------ ##
#####################################################################

as(contacts, 'GInteractions')
as(contacts, 'ContactMatrix')
as(contacts, 'matrix')[seq_len(10), seq_len(10)]
as(contacts, 'data.frame')[seq_len(10), seq_len(10)]

HicFile S4 class

Description

The HicFile class describes a BiocFile object, pointing to the location of a .hic file (usually created with juicer) and containing 3 additional slots:

  1. resolution: at which resolution the associated .hic file should be parsed;

  2. pairsFile: the path (in plain character) to an optional pairs file (stored as a PairsFile object);

  3. metadata: a list of metadata.

HicFile methods.

Arguments

path

String; path to a .hic file

resolution

numeric; resolution to use with the .hic file

pairsFile

String; path to a pairs file

metadata

list.

object

A HicFile object.

See Also

CoolFile(), HicproFile()

Examples

hicPath <- HiContactsData::HiContactsData('yeast_wt', 'hic')
pairsPath <- HiContactsData::HiContactsData('yeast_wt', 'pairs.gz')
hic <- HicFile(
  hicPath, 
  resolution = 16000, 
  pairsFile = pairsPath, 
  metadata = list(type = 'example')
)
hic
resolution(hic)
pairsFile(hic)
metadata(hic)

HicproFile S4 class

Description

The HicproFile class describes a BiocFile object, pointing to the location of a HiC-Pro-generated matrix file and containing 4 additional slots:

  1. bed: path to the matching .bed file generated by HiC-Pro;

  2. resolution: at which resolution the associated HiC-Pro matrix file should be parsed;

  3. pairsFile: the path (in plain character) to an optional pairs file (stored as a PairsFile object);

  4. metadata: a list of metadata.

HicproFile methods.

Arguments

path

String; path to the HiC-Pro output .matrix file (matrix file)

bed

String; path to the HiC-Pro output .bed file (regions file)

pairsFile

String; path to a pairs file

metadata

list.

object

A HicproFile object.

Slots

bed

Path to the matching .bed file generated by HiC-Pro

See Also

CoolFile(), HicFile()

Examples

hicproMatrixPath <- HiContactsData::HiContactsData('yeast_wt', 'hicpro_matrix')
hicproBedPath <- HiContactsData::HiContactsData('yeast_wt', 'hicpro_bed')
pairsPath <- HiContactsData::HiContactsData('yeast_wt', 'pairs.gz')
hicpro <- HicproFile(
  hicproMatrixPath, bed = hicproBedPath, pairsFile = pairsPath,
  metadata = list(type = 'example')
)
hicpro
resolution(hicpro)
pairsFile(hicpro)
metadata(hicpro)
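The pairing of a matrix file with a regions file can be illustrated with a toy base-R sketch. It assumes the usual HiC-Pro layout: a tab-separated regions file with one row per bin (chromosome, start, end, bin index) and a tab-separated matrix file with one row per non-empty bin pair (bin index 1, bin index 2, count). The values, file names and column names below are made up, and this is not how HicproFile or import() parse these files.

```r
# Toy regions table: one row per bin (chromosome, start, end, bin index).
regions <- data.frame(
    chr = c("I", "I", "II"),
    start = c(0, 1000, 0),
    end = c(1000, 2000, 1000),
    bin = 1:3
)
# Toy matrix table: one row per non-empty bin pair (bin1, bin2, count).
mat <- data.frame(bin1 = c(1, 1, 2), bin2 = c(1, 3, 2), count = c(10, 2, 7))

# Round-trip through tab-separated text files in a temporary location.
bed_path <- tempfile(fileext = ".bed")
matrix_path <- tempfile(fileext = ".matrix")
write.table(regions, bed_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
write.table(mat, matrix_path, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

regions2 <- read.table(bed_path, col.names = c("chr", "start", "end", "bin"))
mat2 <- read.table(matrix_path, col.names = c("bin1", "bin2", "count"))

# Attach genomic coordinates to both anchors of each bin pair.
idx1 <- match(mat2$bin1, regions2$bin)
idx2 <- match(mat2$bin2, regions2$bin)
annotated <- data.frame(
    chr1 = regions2$chr[idx1], start1 = regions2$start[idx1],
    chr2 = regions2$chr[idx2], start2 = regions2$start[idx2],
    count = mat2$count
)
annotated
```

The sketch shows why both files are needed together: the matrix file alone only carries bin indices, and the regions file supplies their genomic coordinates.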

HiCExperiment import methods

Description

Import methods to parse Hi-C files (.(m)cool, .hic, HiC-Pro derived matrices, pairs files) into data structures implemented in the HiCExperiment package.

Usage

import(con, format, text, ...)

## S4 method for signature 'ANY'
availableResolutions(x, ...)

## S4 method for signature 'CoolFile'
availableResolutions(x)

## S4 method for signature 'HicFile'
availableResolutions(x)

## S4 method for signature 'HicproFile'
availableResolutions(x)

## S4 method for signature 'ANY'
availableChromosomes(x, ...)

## S4 method for signature 'CoolFile'
availableChromosomes(x)

## S4 method for signature 'HicFile'
availableChromosomes(x)

## S4 method for signature 'HicproFile'
availableChromosomes(x)

Arguments

...

Extra parameters to pass to format-specific methods. A list of possible arguments is provided in the next section.

con, x

Path or connection to a cool, mcool, .hic or HiC-Pro derived file. Can also be a path to a pairs file.

format

The format of the input file. If missing and 'con' is a filename, the format is derived from the file extension. This argument is unnecessary when files are directly provided as CoolFile, HicFile, HicproFile or PairsFile.

text

If 'con' is missing, this can be a character vector directly providing the string data to import.

Value

A HiCExperiment or GInteractions object

import arguments for ContactsFile class

The ContactsFile class gathers the CoolFile, HicFile and HicproFile classes. When importing a ContactsFile object in R, two main arguments can be provided besides the ContactsFile itself:

  • resolution: Resolutions available in the disk-stored contact matrix can be listed using availableResolutions(file)

  • focus: A genomic locus (or pair of loci) provided as a string. It can be any of the following string structures:

    • "II" or "II:20001-30000": this will extract a symmetrical square HiCExperiment object, covering an entire chromosome or a portion of it.

    • "II|III" or "II:20001-30000|III:40001-90000": this will extract a non-symmetrical HiCExperiment object, with a different chromosome (entire or a portion of it) on each axis.
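The focus string structures listed above can be made concrete with a small base-R sketch that splits a focus string into its one or two loci. The helper `parse_focus()` is hypothetical and only illustrates the string layout; it is not the parser used by the package.

```r
# Hypothetical helper (not part of HiCExperiment): split a focus string
# into its one or two loci and report whether it names a single locus.
parse_focus <- function(focus) {
    loci <- strsplit(focus, "|", fixed = TRUE)[[1]]
    stopifnot(length(loci) %in% c(1, 2))
    parse_locus <- function(locus) {
        parts <- strsplit(locus, ":", fixed = TRUE)[[1]]
        if (length(parts) == 1) {
            # Whole chromosome, e.g. "II": no coordinates given.
            return(list(chr = parts, start = NA_real_, end = NA_real_))
        }
        coords <- as.numeric(strsplit(parts[2], "-", fixed = TRUE)[[1]])
        list(chr = parts[1], start = coords[1], end = coords[2])
    }
    list(
        loci = lapply(loci, parse_locus),
        symmetrical = length(loci) == 1
    )
}

str(parse_focus("II:20001-30000"))
str(parse_focus("II|III"))
```

A single locus is used for both axes of the matrix, while two loci separated by `|` describe one axis each.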

Examples

################################################################
## ----------- Importing .(m)cool contact matrices ---------- ##
################################################################

mcoolPath <- HiContactsData::HiContactsData('yeast_wt', 'mcool')
availableResolutions(mcoolPath)
availableChromosomes(mcoolPath)
import(mcoolPath, resolution = 16000, focus = 'XVI', format = 'cool')

################################################################
## ------------ Importing .hic contact matrices ------------- ##
################################################################

hicPath <- HiContactsData::HiContactsData('yeast_wt', 'hic')
availableResolutions(hicPath)
availableChromosomes(hicPath)
import(hicPath, resolution = 16000, focus = 'XVI', format = 'hic')

################################################################
## ------- Importing HiC-Pro derived contact matrices ------- ##
################################################################

hicproMatrixPath <- HiContactsData::HiContactsData('yeast_wt', 'hicpro_matrix')
hicproBedPath <- HiContactsData::HiContactsData('yeast_wt', 'hicpro_bed')
availableResolutions(hicproMatrixPath, hicproBedPath)
availableChromosomes(hicproMatrixPath, hicproBedPath)
import(hicproMatrixPath, bed = hicproBedPath, format = 'hicpro')

PairsFile S4 class

Description

The PairsFile class describes a BiocFile object, pointing to the location of a pairs file, typically generated by HiCool::HiCool().

PairsFile methods

Arguments

x

Path to a pairs file

See Also

CoolFile(), HicFile(), HicproFile()

Examples

pairsPath <- HiContactsData::HiContactsData('yeast_wt', 'pairs.gz')
pf <- PairsFile(pairsPath)
pf
pairsFile(pf)