Title: | HiCool |
---|---|
Description: | HiCool provides an R interface to process and normalize Hi-C paired-end fastq reads into .(m)cool files. .(m)cool is a compact, indexed HDF5 file format specifically tailored for efficiently storing HiC-based data. On top of processing fastq reads, HiCool provides a convenient reporting function to generate shareable reports summarizing Hi-C experiments and including quality controls. |
Authors: | Jacques Serizay [aut, cre] |
Maintainer: | Jacques Serizay <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.7.0 |
Built: | 2024-12-19 04:26:32 UTC |
Source: | https://github.com/bioc/HiCool |
Find loops using chromosight
getLoops( x, resolution = NULL, output_prefix = file.path("chromosight", "chromo"), norm = "auto", max.dist = "auto", min.dist = "auto", min.separation = "auto", n.mads = 5L, pearson = "auto", nreads = "no", ncores = 1L )
x |
A |
resolution |
Which resolution to use to search loops |
output_prefix |
Prefix to chromosight output (default: "chromosight/chromo") |
norm |
Normalization parameter for chromosight |
min.dist, max.dist |
Min and max distance to use to filter for significant loops |
min.separation |
Minimum separation between anchors of potential loops |
n.mads |
Number of MADs to use to filter relevant bins to search for loops |
pearson |
Minimum Pearson correlation score to use to filter for significant loops |
nreads |
Number of reads to subsample to before searching for loops |
ncores |
Number of cores for chromosight |
A HiCExperiment object with a new "loops" topologicalFeatures storing significant interactions identified by chromosight, and an additional chromosight_args metadata entry.
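The n.mads argument is easier to picture with a toy example. The base R sketch below assumes that a bin is kept when its log-transformed contact sum is not more than n.mads median absolute deviations below the median; the exact rule is defined by chromosight itself, and the bin sums here are made up for illustration.

```r
# Hypothetical per-bin contact sums; the last bin is nearly empty
bin_sums <- c(480, 510, 495, 505, 520, 490, 500, 515, 485, 3)
n.mads <- 5L  # default value of the n.mads argument

log_sums <- log10(bin_sums)

# Assumption: unscaled MAD (constant = 1); chromosight may scale it differently
cutoff <- median(log_sums) - n.mads * mad(log_sums, constant = 1)

# Bins at or above the cutoff are treated as detectable in this sketch
detectable <- log_sums >= cutoff
which(!detectable)
```

Under these assumptions only the near-empty bin falls below the cutoff; lowering n.mads makes the filter stricter.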
contacts_yeast <- contacts_yeast()
contacts_yeast <- getLoops(contacts_yeast)
S4Vectors::metadata(contacts_yeast)$chromosight_args
topologicalFeatures(contacts_yeast, 'loops')
HiCool::HiCool() automatically processes paired-end HiC sequencing files by performing the following steps:

1. Automatically setting up an appropriate conda environment using basilisk;
2. Mapping the reads to the provided genome reference using hicstuff and filtering out irrelevant pairs;
3. Filtering the resulting pairs file to remove unwanted chromosomes (e.g. chrM);
4. Binning the filtered pairs into a cool file at a chosen resolution;
5. Generating a multi-resolution mcool file;
6. Normalizing matrices at each resolution by iterative correction using cooler.
The filtering strategy used by hicstuff is described in Cournac et al., BMC Genomics 2012.
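The chromosome-filtering step described above can be pictured with a few lines of base R. This is a minimal sketch that assumes the exclude_chr value is applied as a regular expression to chromosome names; the chromosome names below are made up for illustration, and the actual filtering is performed inside HiCool.

```r
# Hypothetical chromosome names, not taken from any HiCool output
chroms <- c("I", "II", "III", "Mito", "chrM", "MT", "chrXII")

# Default value of the exclude_chr argument
exclude_chr <- "Mito|chrM|MT"

# Assumption: the pattern is matched as a regular expression against each name
kept <- chroms[!grepl(exclude_chr, chroms)]
kept
```

Under this assumption the pattern matches anywhere in the name, so a chromosome called chrMT would also be excluded.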
HiCool( r1, r2, genome, restriction = "DpnII,HinfI", resolutions = NULL, iterative = TRUE, balancing_args = " --min-nnz 10 --mad-max 5 ", threads = 1L, exclude_chr = "Mito|chrM|MT", output = "HiCool", keep_bam = FALSE, build_report = TRUE, scratch = tempdir() )
importHiCoolFolder(output, hash, resolution = NULL)
getHiCoolArgs(log)
getHicStats(log)
r1 |
Path to fastq file (R1 read) |
r2 |
Path to fastq file (R2 read) |
genome |
Genome to map the reads to, provided either as a fasta file (in which case the bowtie2 index will be automatically generated), or as a prefix to a bowtie2 index (e.g. |
restriction |
Restriction enzyme(s) used in HiC (Default: "DpnII,HinfI") |
resolutions |
Resolutions used to bin the final mcool file (Default: 5 levels of resolution automatically inferred according to genome size) |
iterative |
Should the read mapping be performed iteratively? (Default: TRUE) |
balancing_args |
Balancing arguments for cooler. See |
threads |
Number of CPUs used for parallelization. (Default: 1) |
exclude_chr |
Chromosomes excluded from the final .mcool file. This will not affect the pairs file. (Default: "Mito|chrM|MT") |
output |
Output folder used by HiCool. |
keep_bam |
Should the bam files be kept? (Default: FALSE) |
build_report |
Should an automated report be computed? (Default: TRUE) |
scratch |
Path to temporary directory where processing will take place. (Default: |
hash |
Unique 6-letter ID used to identify files from a specific HiCool processing run. |
resolution |
Resolution used to import the mcool file |
log |
Path to log file generated by hicstuff/hicool |
A CoolFile object with prefilled pairsFile and metadata slots.
importHiCoolFolder(output, hash) automatically finds the different processed files associated with a specific HiCool::HiCool() processing hash ID.
getHiCoolArgs() parses the log file generated by HiCool::HiCool() during processing to recover which arguments were used.
getHicStats() parses the log file generated by HiCool::HiCool() during processing to recover pre-computed stats about pair numbers, filtering thresholds, etc.
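The balancing_args argument is a single string of command-line flags. A small base R sketch of how such a string could be assembled is shown below; make_balancing_args is a hypothetical helper for illustration, not part of HiCool, and it only covers the two flags that appear in the documented default.

```r
# Hypothetical helper (not part of HiCool): build a balancing flag string
make_balancing_args <- function(min_nnz = 10L, mad_max = 5L) {
  sprintf(" --min-nnz %d --mad-max %d ", as.integer(min_nnz), as.integer(mad_max))
}

# With its defaults, the helper reproduces the documented default string
default_args <- make_balancing_args()

# A stricter setting, chosen arbitrarily for illustration
stricter <- make_balancing_args(min_nnz = 20L, mad_max = 3L)
stricter
```

The resulting string could then be supplied as balancing_args = stricter in a call to HiCool().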
r1 <- HiContactsData::HiContactsData(sample = 'yeast_wt', format = 'fastq_R1')
r2 <- HiContactsData::HiContactsData(sample = 'yeast_wt', format = 'fastq_R2')
hcf <- HiCool(r1, r2, genome = 'R64-1-1', output = './HiCool/')
hcf
getHiCoolArgs(S4Vectors::metadata(hcf)$log)
getHicStats(S4Vectors::metadata(hcf)$log)
readLines(S4Vectors::metadata(hcf)$log)
HiC processing report
HiCReport(x, output = NULL)
x |
A CoolFile object, generated from |
output |
Path to save output HTML file. |
Path to the generated HTML report file, as a character string
mcool_path <- HiContactsData::HiContactsData('yeast_wt', 'mcool')
pairs_path <- HiContactsData::HiContactsData('yeast_wt', 'pairs.gz')
log_path <- HiContactsData::HiContactsData(sample = 'yeast_wt', format = 'HiCool_log')
cf <- CoolFile(mcool_path, pairs = pairs_path, metadata = list(log = log_path))
HiCReport(cf)