Package 'consensusSeekeR'

Title: Detection of consensus regions inside a group of experiments using genomic positions and genomic ranges
Description: This package compares genomic positions and genomic ranges from multiple experiments to extract common regions. The size of the analyzed region is adjustable, as is the number of experiments in which a feature must be present in a potential region to tag this region as a consensus region. In genomic analyses where feature identification generates a position value surrounded by a genomic range, such as ChIP-Seq peaks and nucleosome positions, replicating an experiment may result in slight differences between predicted values. This package enables the reconciliation of the results into consensus regions.
Authors: Astrid Deschênes [cre, aut], Fabien Claude Lamaze [ctb], Pascal Belleau [aut], Arnaud Droit [aut]
Maintainer: Astrid Deschênes <[email protected]>
License: Artistic-2.0
Version: 1.35.0
Built: 2024-11-07 06:14:45 UTC
Source: https://github.com/bioc/consensusSeekeR

consensusSeekeR: Detection of consensus peak regions inside a group of experiments using narrowPeak files

Description

This package compares position and range data from multiple experiments to extract common consensus regions. The size of the analyzed region is adjustable, as is the number of experiments in which a peak must be detected to mark a potential region as a consensus peak region.
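
The role of these two settings can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper toy_window() and its inputs are hypothetical, and the sketch only assumes that a candidate region is centered on the median peak position and spans 2 * extendingSize, as the examples in this manual describe.

```r
# Hypothetical helper (not part of consensusSeekeR): tests whether the peaks
# found around one candidate position come from enough distinct experiments.
toy_window <- function(positions, experiments, extendingSize, minNbrExp) {
    # Center the candidate region on the median peak position
    center <- round(median(positions))
    regionStart <- center - extendingSize
    regionEnd <- center + extendingSize
    # Keep only the peaks located inside the candidate region
    inside <- positions >= regionStart & positions <= regionEnd
    # Count the distinct experiments that contribute at least one peak
    nbrExp <- length(unique(experiments[inside]))
    list(start = regionStart, end = regionEnd, nbrExp = nbrExp,
         isConsensus = nbrExp >= minNbrExp)
}

res <- toy_window(positions = c(1000, 1010, 1030),
                  experiments = c("CTCF_MYJ", "CTCF_MYN", "CTCF_MYJ"),
                  extendingSize = 50, minNbrExp = 2)
res$end - res$start   # 100, that is 2 * extendingSize
res$isConsensus       # TRUE: both experiments have a peak in the region
```

Raising minNbrExp to 3 in this toy call would reject the region, since only two distinct experiments contribute peaks.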

Author(s)

Astrid Deschênes, Fabien Claude Lamaze, Pascal Belleau and Arnaud Droit

Maintainer: Astrid Deschênes <[email protected]>


Sites with the greatest evidence of transcription factor binding for the CTCF transcription factor (for demonstration purpose)

Description

Sites representing the greatest evidence of enrichment for the CTCF transcription factor (DCC accession: ENCFF000MYJ) for regions chr1:246000000-249250621 and chr10:10000000-12500000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_CTCF_MYJ_NarrowPeaks_partial)

Format

A GRanges containing one entry per site.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF000MYJ)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

See Also

A549_CTCF_MYJ_NarrowPeaks_partial

Examples

## Loading datasets
data(A549_CTCF_MYJ_NarrowPeaks_partial)
data(A549_CTCF_MYJ_Peaks_partial)
data(A549_CTCF_MYN_NarrowPeaks_partial)
data(A549_CTCF_MYN_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_CTCF_MYJ_Peaks_partial) <- rep("CTCF_MYJ",
                            length(A549_CTCF_MYJ_Peaks_partial))
names(A549_CTCF_MYJ_NarrowPeaks_partial) <- rep("CTCF_MYJ",
                            length(A549_CTCF_MYJ_NarrowPeaks_partial))
names(A549_CTCF_MYN_Peaks_partial) <- rep("CTCF_MYN",
                            length(A549_CTCF_MYN_Peaks_partial))
names(A549_CTCF_MYN_NarrowPeaks_partial) <- rep("CTCF_MYN",
                            length(A549_CTCF_MYN_NarrowPeaks_partial))

## Calculating consensus regions for chromosome 10
## with a default region size of 100 bp (2 * extendingSize)
## which is extended to include all genomic regions for the closest
## peak to the median position of all peaks included in the region (for each
## experiment).
## A peak from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr10"), c(135534747), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_CTCF_MYJ_NarrowPeaks_partial,
                            A549_CTCF_MYN_NarrowPeaks_partial),
    peaks = c(A549_CTCF_MYJ_Peaks_partial,
                            A549_CTCF_MYN_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 50,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = TRUE,
    minNbrExp = 2,
    nbrThreads = 1)
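
The naming step in this example matters because the row names carry the experiment name. A minimal sanity check could look like the sketch below. It uses plain named vectors as stand-ins for the GRanges objects (an assumption made so the snippet runs without Bioconductor) and relies only on names() and length(), which are also defined for GRanges. The coordinates are toy values, not ENCODE data.

```r
# Stand-ins for the Peaks and NarrowPeaks GRanges of one experiment
# (toy values, not taken from the ENCODE datasets)
peaks <- c(10000150L, 10000420L, 10000800L)
narrowPeaks <- c(10000100L, 10000380L, 10000760L)

# Same naming idiom as in the example: one experiment name per row
names(peaks) <- rep("CTCF_MYJ", length(peaks))
names(narrowPeaks) <- rep("CTCF_MYJ", length(narrowPeaks))

# Both datasets should carry exactly the same names, row by row
haveSameNames <- identical(names(peaks), names(narrowPeaks))

# Number of distinct experiments seen in the names
nbrExperiments <- length(unique(names(peaks)))

# Stop early if the two datasets disagree
stopifnot(haveSameNames)
```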

Sites with the greatest evidence of transcription factor binding for the CTCF transcription factor (for demonstration purpose)

Description

Sites representing the greatest evidence of enrichment for the CTCF transcription factor (DCC accession: ENCFF000MYJ) for regions chr1:246000000-249250621 and chr10:10000000-12500000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_CTCF_MYJ_Peaks_partial)

Format

A GRanges containing one entry per site.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF000MYJ)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_CTCF_MYJ_NarrowPeaks_partial)
data(A549_CTCF_MYJ_Peaks_partial)
data(A549_CTCF_MYN_NarrowPeaks_partial)
data(A549_CTCF_MYN_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_CTCF_MYJ_Peaks_partial) <- rep("CTCF_MYJ",
                            length(A549_CTCF_MYJ_Peaks_partial))
names(A549_CTCF_MYJ_NarrowPeaks_partial) <- rep("CTCF_MYJ",
                            length(A549_CTCF_MYJ_NarrowPeaks_partial))
names(A549_CTCF_MYN_Peaks_partial) <- rep("CTCF_MYN",
                            length(A549_CTCF_MYN_Peaks_partial))
names(A549_CTCF_MYN_NarrowPeaks_partial) <- rep("CTCF_MYN",
                            length(A549_CTCF_MYN_NarrowPeaks_partial))

## Calculating consensus regions for chromosome 10
## with a default region size of 40 bp (2 * extendingSize)
## which is not resized to fit the narrowPeak regions.
## A peak from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr10"), c(135534747), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_CTCF_MYJ_NarrowPeaks_partial,
                            A549_CTCF_MYN_NarrowPeaks_partial),
    peaks = c(A549_CTCF_MYJ_Peaks_partial,
                            A549_CTCF_MYN_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 20,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

Sites with the greatest evidence of transcription factor binding for the CTCF transcription factor (for demonstration purpose)

Description

Sites representing the greatest evidence of enrichment for the CTCF transcription factor (DCC accession: ENCFF000MYN) for regions chr1:246000000-249250621 and chr10:10000000-12500000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_CTCF_MYN_NarrowPeaks_partial)

Format

A GRanges containing one entry per site.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF000MYN)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_CTCF_MYJ_NarrowPeaks_partial)
data(A549_CTCF_MYJ_Peaks_partial)
data(A549_CTCF_MYN_NarrowPeaks_partial)
data(A549_CTCF_MYN_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_CTCF_MYJ_Peaks_partial) <- rep("CTCF_MYJ",
                            length(A549_CTCF_MYJ_Peaks_partial))
names(A549_CTCF_MYJ_NarrowPeaks_partial) <- rep("CTCF_MYJ",
                            length(A549_CTCF_MYJ_NarrowPeaks_partial))
names(A549_CTCF_MYN_Peaks_partial) <- rep("CTCF_MYN",
                            length(A549_CTCF_MYN_Peaks_partial))
names(A549_CTCF_MYN_NarrowPeaks_partial) <- rep("CTCF_MYN",
                            length(A549_CTCF_MYN_NarrowPeaks_partial))

## Calculating consensus regions for chromosome 10
## with a default region size of 40 bp (2 * extendingSize)
## which is not resized to fit the narrowPeak regions.
## A peak from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo("chr10", 135534747, NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_CTCF_MYJ_NarrowPeaks_partial,
                            A549_CTCF_MYN_NarrowPeaks_partial),
    peaks = c(A549_CTCF_MYJ_Peaks_partial,
                            A549_CTCF_MYN_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 20,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

Sites with the greatest evidence of transcription factor binding for the CTCF transcription factor (for demonstration purpose)

Description

Sites representing the greatest evidence of enrichment for the CTCF transcription factor (DCC accession: ENCFF000MYN) for regions chr1:246000000-249250621 and chr10:10000000-12500000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_CTCF_MYN_Peaks_partial)

Format

A GRanges containing one entry per site.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF000MYN)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_CTCF_MYJ_NarrowPeaks_partial)
data(A549_CTCF_MYJ_Peaks_partial)
data(A549_CTCF_MYN_NarrowPeaks_partial)
data(A549_CTCF_MYN_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_CTCF_MYJ_Peaks_partial) <- rep("CTCF_MYJ",
                            length(A549_CTCF_MYJ_Peaks_partial))
names(A549_CTCF_MYJ_NarrowPeaks_partial) <- rep("CTCF_MYJ",
                            length(A549_CTCF_MYJ_NarrowPeaks_partial))
names(A549_CTCF_MYN_Peaks_partial) <- rep("CTCF_MYN",
                            length(A549_CTCF_MYN_Peaks_partial))
names(A549_CTCF_MYN_NarrowPeaks_partial) <- rep("CTCF_MYN",
                            length(A549_CTCF_MYN_NarrowPeaks_partial))

## Calculating consensus regions for chromosome 1
## with a default region size of 40 bp (2 * extendingSize)
## which is not resized to fit the narrowPeak regions.
## A peak from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr1"), c(249250621), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_CTCF_MYJ_NarrowPeaks_partial,
                            A549_CTCF_MYN_NarrowPeaks_partial),
    peaks = c(A549_CTCF_MYJ_Peaks_partial,
                            A549_CTCF_MYN_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 20,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

Genomic regions with the greatest evidence of transcription factor binding for the FOSL2 transcription factor (for demonstration purpose)

Description

Genomic regions representing the greatest evidence of enrichment for the FOSL2 transcription factor (DCC accession: ENCFF000MZT) for regions chr1:249120200-249250621 and chr10:1-370100 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_FOSL2_01_NarrowPeaks_partial)

Format

A GRanges containing one entry per genomic region. Each row of the GRanges has a name which represents the name of the experiment.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF000MZT)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_FOSL2_01_NarrowPeaks_partial)
data(A549_FOSL2_01_Peaks_partial)
data(A549_FOXA1_01_NarrowPeaks_partial)
data(A549_FOXA1_01_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_FOXA1_01_Peaks_partial) <- rep("FOXA1_01",
                            length(A549_FOXA1_01_Peaks_partial))
names(A549_FOXA1_01_NarrowPeaks_partial) <- rep("FOXA1_01",
                            length(A549_FOXA1_01_NarrowPeaks_partial))
names(A549_FOSL2_01_Peaks_partial) <- rep("FOSL2_01",
                            length(A549_FOSL2_01_Peaks_partial))
names(A549_FOSL2_01_NarrowPeaks_partial) <- rep("FOSL2_01",
                            length(A549_FOSL2_01_NarrowPeaks_partial))

## Calculating consensus regions for chromosome 10 only
## with a default region size of 200 bp (2 * extendingSize)
## which is not extended to include all genomic regions.
## A peak from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo("chr10", 135534747, NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_FOXA1_01_NarrowPeaks_partial,
                            A549_FOSL2_01_NarrowPeaks_partial),
    peaks = c(A549_FOXA1_01_Peaks_partial,
                            A549_FOSL2_01_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 100,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = TRUE,
    minNbrExp = 2,
    nbrThreads = 1)
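
The two logical flags control how the initial region of 2 * extendingSize is resized. The base R sketch below shows one plausible reading of them, stated here as an assumption rather than as the package's implementation: expanding stretches the region to cover the narrowPeak ranges of the peaks it contains, and shrinking trims a side back to those ranges when they end inside the region. The helper toy_resize() and all coordinates are hypothetical.

```r
# Hypothetical helper (not part of consensusSeekeR): resizes one region
# according to the narrowPeak ranges of the peaks it contains.
toy_resize <- function(regionStart, regionEnd, npStart, npEnd,
                       expand, shrink) {
    # Hull of the narrowPeak ranges attached to the peaks in the region
    minStart <- min(npStart)
    maxEnd <- max(npEnd)
    newStart <- regionStart
    newEnd <- regionEnd
    if (expand) {
        # Stretch the region so that it covers every narrowPeak range
        newStart <- min(newStart, minStart)
        newEnd <- max(newEnd, maxEnd)
    }
    if (shrink) {
        # Trim a side when the narrowPeak ranges stop short of it
        if (minStart > regionStart) newStart <- minStart
        if (maxEnd < regionEnd) newEnd <- maxEnd
    }
    c(start = newStart, end = newEnd)
}

# Initial region of 40 bp centered on position 1000 (extendingSize = 20)
fixed <- toy_resize(980, 1020, c(950, 990), c(1005, 1015), FALSE, FALSE)
expanded <- toy_resize(980, 1020, c(950, 990), c(1005, 1015), TRUE, FALSE)
shrunk <- toy_resize(980, 1020, c(950, 990), c(1005, 1015), FALSE, TRUE)
both <- toy_resize(980, 1020, c(950, 990), c(1005, 1015), TRUE, TRUE)
```

Under this reading, setting both flags to TRUE makes the region match the hull of the narrowPeak ranges, while setting both to FALSE keeps the fixed 2 * extendingSize window.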

Sites with the greatest evidence of transcription factor binding for the FOSL2 transcription factor (for demonstration purpose)

Description

Sites representing the greatest evidence of enrichment for the FOSL2 transcription factor (DCC accession: ENCFF000MZT) for regions chr1:249120200-249250621 and chr10:1-370100 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_FOSL2_01_Peaks_partial)

Format

A GRanges containing one entry per site. Each row of the GRanges has the same name, which represents the name of the experiment.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF000MZT)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_FOSL2_01_NarrowPeaks_partial)
data(A549_FOXA1_01_NarrowPeaks_partial)
data(A549_FOSL2_01_Peaks_partial)
data(A549_FOXA1_01_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_FOXA1_01_Peaks_partial) <- rep("FOXA1_01",
                            length(A549_FOXA1_01_Peaks_partial))
names(A549_FOXA1_01_NarrowPeaks_partial) <- rep("FOXA1_01",
                            length(A549_FOXA1_01_NarrowPeaks_partial))
names(A549_FOSL2_01_Peaks_partial) <- rep("FOSL2_01",
                            length(A549_FOSL2_01_Peaks_partial))
names(A549_FOSL2_01_NarrowPeaks_partial) <- rep("FOSL2_01",
                            length(A549_FOSL2_01_NarrowPeaks_partial))

## Calculating consensus regions for chromosome 1 only
## with a default region size of 400 bp (2 * extendingSize)
## which is extended to include all genomic regions of the
## closest peak (for each experiment).
## A peak from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo("chr1", 249250621, NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_FOXA1_01_NarrowPeaks_partial,
                            A549_FOSL2_01_NarrowPeaks_partial),
    peaks = c(A549_FOXA1_01_Peaks_partial,
                            A549_FOSL2_01_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 200,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

Genomic regions with the greatest evidence of transcription factor binding for the FOXA1 transcription factor (for demonstration purpose)

Description

Genomic regions representing the greatest evidence of enrichment for the FOXA1 transcription factor (DCC accession: ENCFF000NAH) for regions chr1:249120200-249250621 and chr10:1-370100 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_FOXA1_01_NarrowPeaks_partial)

Format

A GRanges containing one entry per genomic region. Each row of the GRanges has a name which represents the name of the experiment.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF000NAH)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_FOSL2_01_NarrowPeaks_partial)
data(A549_FOSL2_01_Peaks_partial)
data(A549_FOXA1_01_NarrowPeaks_partial)
data(A549_FOXA1_01_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_FOXA1_01_Peaks_partial) <- rep("FOXA1_01",
        length(A549_FOXA1_01_Peaks_partial))
names(A549_FOXA1_01_NarrowPeaks_partial) <- rep("FOXA1_01",
        length(A549_FOXA1_01_NarrowPeaks_partial))
names(A549_FOSL2_01_Peaks_partial) <- rep("FOSL2_01",
        length(A549_FOSL2_01_Peaks_partial))
names(A549_FOSL2_01_NarrowPeaks_partial) <- rep("FOSL2_01",
        length(A549_FOSL2_01_NarrowPeaks_partial))

## Calculating consensus regions for both chromosomes 1 and 10
## with a default region size of 300 bp (2 * extendingSize)
## which is not extended to include all genomic regions.
## A peak from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr1", "chr10"), c(249250621, 135534747), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_FOXA1_01_NarrowPeaks_partial,
                            A549_FOSL2_01_NarrowPeaks_partial),
    peaks = c(A549_FOXA1_01_Peaks_partial,
                            A549_FOSL2_01_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 150,
    expandToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)
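
When several chromosomes are analyzed, the names and lengths passed to Seqinfo() must stay in the same order. One way to keep them paired, sketched here in base R, is to store the lengths in a named vector and derive both arguments from it. The variable names chrLengths, seqNames and seqLengths are hypothetical. The Seqinfo() call is guarded so that the snippet still runs when the Bioconductor packages are not attached.

```r
# Chromosome lengths used in the example, stored as a named vector
chrLengths <- c(chr1 = 249250621, chr10 = 135534747)

# Derive both arguments from the same vector so they cannot get out of order
seqNames <- names(chrLengths)
seqLengths <- unname(chrLengths)

# Build the Seqinfo object only when the function is available,
# for instance after library(consensusSeekeR)
if (exists("Seqinfo", mode = "function")) {
    chrList <- Seqinfo(seqNames, seqLengths, NA)
}
```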

Sites with the greatest evidence of transcription factor binding for the FOXA1 transcription factor (for demonstration purpose)

Description

Sites representing the greatest evidence of enrichment for the FOXA1 transcription factor (DCC accession: ENCFF000NAH) for regions chr1:249120200-249250621 and chr10:1-370100 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_FOXA1_01_Peaks_partial)

Format

A GRanges containing one entry per site. Each row of the GRanges has a name which represents the name of the experiment.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF000NAH)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_FOSL2_01_NarrowPeaks_partial)
data(A549_FOSL2_01_Peaks_partial)
data(A549_FOXA1_01_NarrowPeaks_partial)
data(A549_FOXA1_01_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_FOXA1_01_Peaks_partial) <- rep("FOXA1_01",
        length(A549_FOXA1_01_Peaks_partial))
names(A549_FOXA1_01_NarrowPeaks_partial) <- rep("FOXA1_01",
        length(A549_FOXA1_01_NarrowPeaks_partial))
names(A549_FOSL2_01_Peaks_partial) <- rep("FOSL2_01",
        length(A549_FOSL2_01_Peaks_partial))
names(A549_FOSL2_01_NarrowPeaks_partial) <- rep("FOSL2_01",
        length(A549_FOSL2_01_NarrowPeaks_partial))

## Calculating consensus regions for both chromosomes 1 and 10
## with a default region size of 100 bp (2 * extendingSize)
## which is extended to include all genomic regions for the closest
## peak to the median position of all peaks included in the region (for each
## experiment).
## A peak from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr1", "chr10"), c(249250621, 135534747), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_FOXA1_01_NarrowPeaks_partial,
                            A549_FOSL2_01_NarrowPeaks_partial),
    peaks = c(A549_FOXA1_01_Peaks_partial,
                            A549_FOSL2_01_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 50,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = TRUE,
    minNbrExp = 2,
    nbrThreads = 1)

Ranges with the greatest evidence of transcription factor binding for the NR3C1 transcription factor from ENCODE (DCC accession: ENCFF002CFQ). For demonstration purpose.

Description

Ranges representing the greatest evidence of enrichment for the NR3C1 transcription factor (DCC accession: ENCFF002CFQ) for regions chr2:40000000-50000000 and chr3:10000000-13000000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_NR3C1_CFQ_NarrowPeaks_partial)

Format

A GRanges containing one entry per site. The ranges surround the peaks present in the dataset A549_NR3C1_CFQ_Peaks_partial.

Details

The peaks and ranges have been obtained using an optimal IDR analysis done on all replicates.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF002CFQ)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_NR3C1_CFQ_NarrowPeaks_partial)
data(A549_NR3C1_CFQ_Peaks_partial)
data(A549_NR3C1_CFS_NarrowPeaks_partial)
data(A549_NR3C1_CFS_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_NR3C1_CFQ_NarrowPeaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_NarrowPeaks_partial))
names(A549_NR3C1_CFQ_Peaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_Peaks_partial))
names(A549_NR3C1_CFS_NarrowPeaks_partial) <- rep("NR3C1_CFS",
                            length(A549_NR3C1_CFS_NarrowPeaks_partial))
names(A549_NR3C1_CFS_Peaks_partial) <- rep("NR3C1_CFS",
                            length(A549_NR3C1_CFS_Peaks_partial))

## Calculating consensus regions for chromosome 3
## with a default region size of 300 bp (2 * extendingSize)
## which is not extended to include all genomic regions.
## Peaks from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr3"), c(198022430), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_NR3C1_CFQ_NarrowPeaks_partial,
                            A549_NR3C1_CFS_NarrowPeaks_partial),
    peaks = c(A549_NR3C1_CFQ_Peaks_partial,
                            A549_NR3C1_CFS_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 150,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = TRUE,
    minNbrExp = 2,
    nbrThreads = 1)

Sites with the greatest evidence of transcription factor binding for the NR3C1 transcription factor from ENCODE (DCC accession: ENCFF002CFQ). For demonstration purpose.

Description

Sites representing the greatest evidence of enrichment for the NR3C1 transcription factor (DCC accession: ENCFF002CFQ) for regions chr2:40000000-50000000 and chr3:10000000-13000000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_NR3C1_CFQ_Peaks_partial)

Format

A GRanges containing one entry per site. The peaks are surrounded by the ranges present in the dataset A549_NR3C1_CFQ_NarrowPeaks_partial.

Details

The peaks and ranges have been obtained using an optimal IDR analysis done on all replicates.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF002CFQ)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_NR3C1_CFQ_NarrowPeaks_partial)
data(A549_NR3C1_CFQ_Peaks_partial)
data(A549_NR3C1_CFS_NarrowPeaks_partial)
data(A549_NR3C1_CFS_Peaks_partial)
data(A549_NR3C1_CFR_NarrowPeaks_partial)
data(A549_NR3C1_CFR_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_NR3C1_CFQ_NarrowPeaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_NarrowPeaks_partial))
names(A549_NR3C1_CFQ_Peaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_Peaks_partial))
names(A549_NR3C1_CFS_NarrowPeaks_partial) <- rep("NR3C1_CFS",
                            length(A549_NR3C1_CFS_NarrowPeaks_partial))
names(A549_NR3C1_CFS_Peaks_partial) <- rep("NR3C1_CFS",
                            length(A549_NR3C1_CFS_Peaks_partial))
names(A549_NR3C1_CFR_NarrowPeaks_partial) <- rep("NR3C1_CFR",
                            length(A549_NR3C1_CFR_NarrowPeaks_partial))
names(A549_NR3C1_CFR_Peaks_partial) <- rep("NR3C1_CFR",
                            length(A549_NR3C1_CFR_Peaks_partial))

## Calculating consensus regions for chromosome 3
## with a default region size of 140 bp (2 * extendingSize)
## which is not resized to fit the narrowPeak regions.
## Peaks from at least 2 experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr3"), c(198022430), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_NR3C1_CFQ_NarrowPeaks_partial,
                            A549_NR3C1_CFS_NarrowPeaks_partial,
                            A549_NR3C1_CFR_NarrowPeaks_partial),
    peaks = c(A549_NR3C1_CFQ_Peaks_partial,
                            A549_NR3C1_CFS_Peaks_partial,
                            A549_NR3C1_CFR_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 70,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)
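
With three experiments and minNbrExp = 2, a region is kept as soon as two of the three experiments contribute a peak. The base R sketch below illustrates only this counting rule. The hits data frame and the region labels r1 to r3 are invented for the illustration and do not come from the package or from the ENCODE datasets.

```r
# Hypothetical assignment of peaks to candidate regions
hits <- data.frame(
    region = c("r1", "r1", "r1", "r2", "r2", "r3", "r3"),
    experiment = c("NR3C1_CFQ", "NR3C1_CFS", "NR3C1_CFR",
                   "NR3C1_CFQ", "NR3C1_CFS",
                   "NR3C1_CFS", "NR3C1_CFS"),
    stringsAsFactors = FALSE)

# Number of distinct experiments with at least one peak in each region
nbrExp <- tapply(hits$experiment, hits$region,
                 function(x) length(unique(x)))

# Regions supported by at least minNbrExp experiments
minNbrExp <- 2
consensus <- names(nbrExp)[nbrExp >= minNbrExp]
```

Region r3 is dropped even though it holds two peaks, because both come from the same experiment.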

Ranges with the greatest evidence of transcription factor binding for the NR3C1 transcription factor from ENCODE (DCC accession: ENCFF002CFR). For demonstration purpose.

Description

Ranges representing the greatest evidence of enrichment for the NR3C1 transcription factor (DCC accession: ENCFF002CFR) for regions chr2:40000000-50000000 and chr3:10000000-13000000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_NR3C1_CFR_NarrowPeaks_partial)

Format

A GRanges containing one entry per site. The ranges surround the peaks present in the dataset A549_NR3C1_CFR_Peaks_partial.

Details

The peaks and ranges have been obtained using an optimal IDR analysis done on all replicates.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF002CFR)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_NR3C1_CFQ_NarrowPeaks_partial)
data(A549_NR3C1_CFQ_Peaks_partial)
data(A549_NR3C1_CFR_NarrowPeaks_partial)
data(A549_NR3C1_CFR_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_NR3C1_CFQ_NarrowPeaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_NarrowPeaks_partial))
names(A549_NR3C1_CFQ_Peaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_Peaks_partial))
names(A549_NR3C1_CFR_NarrowPeaks_partial) <- rep("NR3C1_CFR",
                            length(A549_NR3C1_CFR_NarrowPeaks_partial))
names(A549_NR3C1_CFR_Peaks_partial) <- rep("NR3C1_CFR",
                            length(A549_NR3C1_CFR_Peaks_partial))

## Calculating consensus regions for chromosome 2
## with a default region size of 250 bp (2 * extendingSize)
## which is extended to include all genomic regions for the closest
## peak to the median position of all peaks included in the region (for
## each experiment).
## Peaks from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr2"), c(243199373), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_NR3C1_CFQ_NarrowPeaks_partial,
                            A549_NR3C1_CFR_NarrowPeaks_partial),
    peaks = c(A549_NR3C1_CFQ_Peaks_partial,
                            A549_NR3C1_CFR_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 125,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = TRUE,
    minNbrExp = 2,
    nbrThreads = 1)

Sites with the greatest evidence of transcription factor binding for the NR3C1 transcription factor from ENCODE (DCC accession: ENCFF002CFR). For demonstration purpose.

Description

Sites representing the greatest evidence of enrichment for the NR3C1 transcription factor (DCC accession: ENCFF002CFR) for regions chr2:40000000-50000000 and chr3:10000000-13000000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_NR3C1_CFR_Peaks_partial)

Format

A GRanges containing one entry per site. The peaks are surrounded by the ranges present in the dataset A549_NR3C1_CFR_NarrowPeaks_partial.

Details

The peaks and ranges have been obtained using an optimal IDR analysis done on all replicates.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF002CFR)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

Examples

## Loading datasets
data(A549_NR3C1_CFQ_NarrowPeaks_partial)
data(A549_NR3C1_CFQ_Peaks_partial)
data(A549_NR3C1_CFR_NarrowPeaks_partial)
data(A549_NR3C1_CFR_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_NR3C1_CFQ_NarrowPeaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_NarrowPeaks_partial))
names(A549_NR3C1_CFQ_Peaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_Peaks_partial))
names(A549_NR3C1_CFR_NarrowPeaks_partial) <- rep("NR3C1_CFR",
                            length(A549_NR3C1_CFR_NarrowPeaks_partial))
names(A549_NR3C1_CFR_Peaks_partial) <- rep("NR3C1_CFR",
                            length(A549_NR3C1_CFR_Peaks_partial))

## Calculating consensus regions for chromosome 2
## with a default region size of 40 bp (2 * extendingSize)
## which is extended to include all genomic regions for the closest
## peak to the median position of all peaks included in the region (for
## each experiment).
## Peaks from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr2"), c(243199373), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_NR3C1_CFQ_NarrowPeaks_partial,
                            A549_NR3C1_CFR_NarrowPeaks_partial),
    peaks = c(A549_NR3C1_CFQ_Peaks_partial,
                            A549_NR3C1_CFR_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 20,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

Ranges with the greatest evidence of transcription factor binding for the NR3C1 transcription factor from ENCODE (DCC accession: ENCFF002CFS). For demonstration purpose.

Description

Ranges representing the greatest evidence of enrichment for the NR3C1 transcription factor (DCC accession: ENCFF002CFS) for regions chr2:40000000-50000000 and chr3:10000000-13000000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_NR3C1_CFS_NarrowPeaks_partial)

Format

A GRanges containing one entry per site. The ranges surround the peaks present in the dataset A549_NR3C1_CFS_Peaks_partial.

Details

The peaks and ranges have been obtained using an optimal IDR analysis done on all replicates.

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

See Also

Examples

## Loading datasets
data(A549_NR3C1_CFQ_NarrowPeaks_partial)
data(A549_NR3C1_CFQ_Peaks_partial)
data(A549_NR3C1_CFS_NarrowPeaks_partial)
data(A549_NR3C1_CFS_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_NR3C1_CFQ_NarrowPeaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_NarrowPeaks_partial))
names(A549_NR3C1_CFQ_Peaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_Peaks_partial))
names(A549_NR3C1_CFS_NarrowPeaks_partial) <- rep("NR3C1_CFS",
                            length(A549_NR3C1_CFS_NarrowPeaks_partial))
names(A549_NR3C1_CFS_Peaks_partial) <- rep("NR3C1_CFS",
                            length(A549_NR3C1_CFS_Peaks_partial))

## Calculating consensus regions for chromosome 2
## with a default region size of 400 bp (2 * extendingSize).
## The consensus regions are not resized to fit the narrowPeak regions.
## Peaks from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr2"), c(243199373), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_NR3C1_CFQ_NarrowPeaks_partial,
                            A549_NR3C1_CFS_NarrowPeaks_partial),
    peaks = c(A549_NR3C1_CFQ_Peaks_partial,
                            A549_NR3C1_CFS_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 200,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

Sites with the greatest evidence of transcription factor binding for the NR3C1 transcription factor from ENCODE (DCC accession: ENCFF002CFS). For demonstration purposes.

Description

Sites representing the greatest evidence of enrichment for the NR3C1 transcription factor (DCC accession: ENCFF002CFS) for regions chr2:40000000-50000000 and chr3:10000000-13000000 from the Encyclopedia of DNA Elements (ENCODE) data (Dunham I et al. 2012).

Usage

data(A549_NR3C1_CFS_Peaks_partial)

Format

A GRanges containing one entry per site. The peaks are surrounded by the ranges present in the dataset A549_NR3C1_CFS_NarrowPeaks_partial.

Details

The peaks and ranges have been obtained using an optimal IDR analysis done on all replicates.

Source

The Encyclopedia of DNA Elements (ENCODE) (DCC accession: ENCFF002CFS)

References

  • Dunham I, Kundaje A, Aldred SF, et al. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012 Sep 6;489(7414):57-74.

See Also

Examples

## Loading datasets
data(A549_NR3C1_CFQ_NarrowPeaks_partial)
data(A549_NR3C1_CFQ_Peaks_partial)
data(A549_NR3C1_CFS_NarrowPeaks_partial)
data(A549_NR3C1_CFS_Peaks_partial)

## Assigning experiment name to each row of the dataset.
## NarrowPeak and Peak datasets from the same experiment must
## have identical names.
names(A549_NR3C1_CFQ_NarrowPeaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_NarrowPeaks_partial))
names(A549_NR3C1_CFQ_Peaks_partial) <- rep("NR3C1_CFQ",
                            length(A549_NR3C1_CFQ_Peaks_partial))
names(A549_NR3C1_CFS_NarrowPeaks_partial) <- rep("NR3C1_CFS",
                            length(A549_NR3C1_CFS_NarrowPeaks_partial))
names(A549_NR3C1_CFS_Peaks_partial) <- rep("NR3C1_CFS",
                            length(A549_NR3C1_CFS_Peaks_partial))

## Calculating consensus regions for chromosome 2
## with a default region size of 80 bp (2 * extendingSize).
## The consensus regions are not resized to fit the narrowPeak regions.
## Peaks from both experiments must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr2"), c(243199373), NA)
findConsensusPeakRegions(
    narrowPeaks = c(A549_NR3C1_CFQ_NarrowPeaks_partial,
                            A549_NR3C1_CFS_NarrowPeaks_partial),
    peaks = c(A549_NR3C1_CFQ_Peaks_partial,
                            A549_NR3C1_CFS_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 40,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

Extract regions sharing features in more than one experiment

Description

Find regions sharing the same features for a minimum number of experiments using called peaks of signal enrichment based on pooled, normalized data (mainly coming from narrowPeak files). The peaks and narrow peaks are used to identify the consensus regions. The minimum number of experiments that must have at least one peak in a region for it to be retained as a consensus region is specified by the user, as is the size of the mining regions. Only the chromosomes specified by the user are processed. The function can be parallelized by specifying a number of threads greater than 1.

When the padding is small, the detected regions are smaller than those that would be obtained by overlapping the narrow regions. In addition, the parameter specifying the minimum number of experiments needed to retain a region adds versatility to the function.

Beware that the size of the padding can have a large effect on the detected consensus regions. It is recommended to test more than one size and to manually validate the resulting consensus regions before selecting the final padding size.
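The effect of the padding can be illustrated with a simplified base-R sketch. It is not the package's implementation: the helper countExperimentsInWindow and the peak positions are hypothetical, and the window is simply centred on the median of the supplied positions.

```r
## Hypothetical peak positions (bp) from three experiments on one chromosome
peaks <- data.frame(
    experiment = c("exp1", "exp2", "exp3"),
    position   = c(1000L, 1030L, 1120L)
)

## Count the experiments with at least one peak inside a window of
## 2 * extendingSize centred on the median of the peak positions
## (illustrative helper, not part of consensusSeekeR)
countExperimentsInWindow <- function(peaks, extendingSize) {
    center <- median(peaks$position)
    inside <- abs(peaks$position - center) <= extendingSize
    length(unique(peaks$experiment[inside]))
}

smallPadding <- countExperimentsInWindow(peaks, extendingSize = 20)
largePadding <- countExperimentsInWindow(peaks, extendingSize = 100)
```

In this toy case, a requirement of two experiments per region would reject the window built with the small padding and accept the one built with the large padding, which is why testing several padding values is worthwhile.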

Usage

findConsensusPeakRegions(
  narrowPeaks,
  peaks,
  chrInfo,
  extendingSize = 250,
  expandToFitPeakRegion = FALSE,
  shrinkToFitPeakRegion = FALSE,
  minNbrExp = 1L,
  nbrThreads = 1L
)

Arguments

narrowPeaks

a GRanges containing called peak regions of signal enrichment based on pooled, normalized data for all analyzed experiments. All GRanges entries must have a metadata field called "name" which identifies the region to the called peak. All GRanges entries must also have a row name which identifies the experiment of origin. Each peaks entry must have an associated narrowPeaks entry. A narrowPeaks entry is associated with a peaks entry by having an identical metadata "name" field and an identical row name.

peaks

a GRanges containing called peaks of signal enrichment based on pooled, normalized data for all analyzed experiments. All GRanges entries must have a metadata field called "name" which identifies the called peak. All GRanges entries must have a row name which identifies the experiment of origin. Each peaks entry must have an associated narrowPeaks entry. A peaks entry is associated with a narrowPeaks entry by having an identical metadata "name" field and an identical row name.

chrInfo

a Seqinfo containing the name and the length of the chromosomes to analyze. Only the chromosomes contained in this Seqinfo will be analyzed.

extendingSize

a numeric value indicating the size of the padding added on both sides of the median position of the peaks to create the consensus region. The minimum size of the consensus region is equal to twice the value of the extendingSize parameter. The value of extendingSize must be a positive integer. Default = 250.

expandToFitPeakRegion

a logical indicating if the region size, which is set by the extendingSize parameter, is extended to include the entire narrow peak regions of all peaks included in the unextended consensus region. The narrow peak regions of the peaks added because of the extension are not considered for the extension. Default: FALSE.

shrinkToFitPeakRegion

a logical indicating if the region size, which is set by the extendingSize parameter, is shrunk to fit the narrow peak regions of the peaks when all those regions are smaller than the consensus region. Default: FALSE.

minNbrExp

a positive numeric or a positive integer indicating the minimum number of experiments in which at least one peak must be present for a potential consensus region. The value must be a positive integer less than or equal to the number of experiments present in the narrowPeaks and peaks parameters. Default = 1.

nbrThreads

a numeric or an integer indicating the number of threads to use in parallel. The value of nbrThreads must be a positive integer. Default = 1.
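As a rough illustration of how expandToFitPeakRegion and shrinkToFitPeakRegion interact with the initial window, the base-R sketch below applies one possible reading of the two options to plain integer intervals. The helper resizeRegion and all coordinates are hypothetical, and the package's internal logic may differ in its details.

```r
## Hypothetical helper: resize an initial consensus window [start, end]
## given the narrow peak regions of the peaks found inside that window
resizeRegion <- function(start, end, regionStarts, regionEnds,
                            expand = FALSE, shrink = FALSE) {
    minStart <- min(regionStarts)
    maxEnd   <- max(regionEnds)
    if (expand) {
        ## Grow each side that does not yet cover the narrow regions
        start <- min(start, minStart)
        end   <- max(end, maxEnd)
    }
    if (shrink) {
        ## Pull in each side that extends beyond every narrow region
        start <- max(start, minStart)
        end   <- min(end, maxEnd)
    }
    c(start = start, end = end)
}

## Initial window of 2 * 20 bp around a median position of 1000,
## smaller than the narrow regions: expansion applies
expanded <- resizeRegion(980, 1020, c(950, 990), c(1010, 1060),
                            expand = TRUE)

## Initial window of 2 * 250 bp around the same position,
## larger than the narrow regions: shrinking applies
shrunk <- resizeRegion(750, 1250, c(950, 990), c(1010, 1060),
                            shrink = TRUE)
```

Under these assumptions both calls end on the interval spanned by the narrow regions, reached from opposite directions.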

Value

a list of class "consensusRanges" containing:

  • call the matched call.

  • consensusRanges a GRanges containing the consensus regions.

Author(s)

Astrid Deschênes

Examples

## Loading datasets
data(A549_CTCF_MYN_NarrowPeaks_partial)
data(A549_CTCF_MYN_Peaks_partial)
data(A549_CTCF_MYJ_NarrowPeaks_partial)
data(A549_CTCF_MYJ_Peaks_partial)

## Assigning experiment name "CTCF_MYJ" to first experiment
names(A549_CTCF_MYJ_NarrowPeaks_partial) <- rep("CTCF_MYJ",
    length(A549_CTCF_MYJ_NarrowPeaks_partial))
names(A549_CTCF_MYJ_Peaks_partial) <- rep("CTCF_MYJ",
    length(A549_CTCF_MYJ_Peaks_partial))

## Assigning experiment name "CTCF_MYN" to second experiment
names(A549_CTCF_MYN_NarrowPeaks_partial) <- rep("CTCF_MYN",
    length(A549_CTCF_MYN_NarrowPeaks_partial))
names(A549_CTCF_MYN_Peaks_partial) <- rep("CTCF_MYN",
    length(A549_CTCF_MYN_Peaks_partial))

## Only chromosome 1 is going to be analysed
chrList <- Seqinfo("chr1", 249250621, NA)

## Find consensus regions with both experiments
results <- findConsensusPeakRegions(
    narrowPeaks = c(A549_CTCF_MYJ_NarrowPeaks_partial,
        A549_CTCF_MYN_NarrowPeaks_partial),
    peaks = c(A549_CTCF_MYJ_Peaks_partial,
        A549_CTCF_MYN_Peaks_partial),
    chrInfo = chrList,
    extendingSize = 300,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

## Print the first 2 consensus regions
head(results$consensusRanges, 2)

Nucleosome positions detected by the NOrMAL software using synthetic reads generated using a normal distribution. For demonstration purposes.

Description

Nucleosome positions detected by the NOrMAL software using synthetic reads generated using a normal distribution with a variance of 20 for the region chr1:10000-15000.

Usage

data(NOrMAL_nucleosome_positions)

Format

A GRanges containing one entry per detected nucleosome. The surrounding ranges associated with those nucleosomes are in the dataset NOrMAL_nucleosome_ranges.

References

  • Polishko A, Ponts N, Le Roch KG and Lonardi S. 2012. NOrMAL: Accurate nucleosome positioning using a modified Gaussian mixture model. Bioinformatics 28 (12): 242-49.

See Also

Examples

## Loading datasets
data(PING_nucleosome_positions)
data(PING_nucleosome_ranges)
data(NOrMAL_nucleosome_positions)
data(NOrMAL_nucleosome_ranges)
data(NucPosSimulator_nucleosome_positions)
data(NucPosSimulator_nucleosome_ranges)

## Assigning experiment name to each row of the dataset.
## Position and range datasets from the same software must
## have identical names.
names(PING_nucleosome_positions) <- rep("PING",
                            length(PING_nucleosome_positions))
names(PING_nucleosome_ranges) <- rep("PING",
                            length(PING_nucleosome_ranges))
names(NOrMAL_nucleosome_positions) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_positions))
names(NOrMAL_nucleosome_ranges) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_ranges))
names(NucPosSimulator_nucleosome_positions) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_positions))
names(NucPosSimulator_nucleosome_ranges) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_ranges))

## Calculating consensus regions for chromosome 1
## with a default region size of 40 bp (2 * extendingSize).
## The consensus regions are extended to include all genomic regions for
## all nucleosomes. However, if the consensus regions are larger than the
## genomic regions of the nucleosomes, the consensus regions are not
## shrunk.
## Nucleosomes from all software must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr1"), c(249250621), NA)
findConsensusPeakRegions(
    narrowPeaks = c(PING_nucleosome_ranges,
                        NOrMAL_nucleosome_ranges,
                        NucPosSimulator_nucleosome_ranges),
    peaks = c(PING_nucleosome_positions,
                        NOrMAL_nucleosome_positions,
                        NucPosSimulator_nucleosome_positions),
    chrInfo = chrList,
    extendingSize = 20,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 3,
    nbrThreads = 1)

Ranges associated with nucleosomes detected by the NOrMAL software using synthetic reads generated using a normal distribution. For demonstration purposes.

Description

Ranges associated with nucleosomes detected by the NOrMAL software using synthetic reads generated using a normal distribution with a variance of 20 for the region chr1:10000-15000.

Usage

data(NOrMAL_nucleosome_ranges)

Format

A GRanges containing one entry per detected nucleosome. The ranges surround the nucleosomes present in the dataset NOrMAL_nucleosome_positions. The genomic ranges have been obtained by adding 73 bp on each side of the detected positions.
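The construction of such ranges from positions can be sketched in base R as follows. The positions used here are hypothetical and are not taken from the dataset; only the 73 bp flank comes from the description above.

```r
## Hypothetical nucleosome centre positions (bp)
positions <- c(10100L, 10350L, 10620L)

## Padding added on each side of every position
flank <- 73L

## Range boundaries obtained by extending each position on both sides
ranges <- data.frame(
    start = positions - flank,
    end   = positions + flank
)

## Width of each range, counting both boundaries
ranges$width <- ranges$end - ranges$start + 1L
```

Each resulting range is 147 bp wide (2 * 73 + 1), which matches the length of DNA usually attributed to a nucleosome core.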

References

  • Polishko A, Ponts N, Le Roch KG and Lonardi S. 2012. NOrMAL: Accurate nucleosome positioning using a modified Gaussian mixture model. Bioinformatics 28 (12): 242-49.

See Also

Examples

## Loading datasets
data(PING_nucleosome_positions)
data(PING_nucleosome_ranges)
data(NOrMAL_nucleosome_positions)
data(NOrMAL_nucleosome_ranges)
data(NucPosSimulator_nucleosome_positions)
data(NucPosSimulator_nucleosome_ranges)

## Assigning experiment name to each row of the dataset.
## Position and range datasets from the same software must
## have identical names.
names(PING_nucleosome_positions) <- rep("PING",
                            length(PING_nucleosome_positions))
names(PING_nucleosome_ranges) <- rep("PING",
                            length(PING_nucleosome_ranges))
names(NOrMAL_nucleosome_positions) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_positions))
names(NOrMAL_nucleosome_ranges) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_ranges))
names(NucPosSimulator_nucleosome_positions) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_positions))
names(NucPosSimulator_nucleosome_ranges) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_ranges))

## Calculating consensus regions for chromosome 1
## with a default region size of 30 bp (2 * extendingSize).
## Consensus regions are resized to include all genomic regions of
## included nucleosomes.
## Nucleosomes from at least 2 software must be present
## in a region to be retained as a consensus region.
chrList <- Seqinfo(c("chr1"), c(249250621), NA)
findConsensusPeakRegions(
    narrowPeaks = c(PING_nucleosome_ranges,
                            NOrMAL_nucleosome_ranges,
                            NucPosSimulator_nucleosome_ranges),
    peaks = c(PING_nucleosome_positions,
                            NOrMAL_nucleosome_positions,
                            NucPosSimulator_nucleosome_positions),
    chrInfo = chrList,
    extendingSize = 15,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = TRUE,
    minNbrExp = 2,
    nbrThreads = 1)

Nucleosome positions detected by the NucPosSimulator software using synthetic reads generated using a normal distribution. For demonstration purposes.

Description

Nucleosome positions detected by the NucPosSimulator software using synthetic reads generated using a normal distribution with a variance of 20 for the region chr1:10000-15000.

Usage

data(NucPosSimulator_nucleosome_positions)

Format

A GRanges containing one entry per detected nucleosome. The surrounding ranges associated with those nucleosomes are in the dataset NucPosSimulator_nucleosome_ranges.

References

  • Schöpflin R, Teif VB, Müller O, Weinberg C, Rippe K, and Wedemann G. 2013. Modeling nucleosome position distributions from experimental nucleosome positioning maps. Bioinformatics 29 (19): 2380-86.

See Also

Examples

## Loading datasets
data(PING_nucleosome_positions)
data(PING_nucleosome_ranges)
data(NOrMAL_nucleosome_positions)
data(NOrMAL_nucleosome_ranges)
data(NucPosSimulator_nucleosome_positions)
data(NucPosSimulator_nucleosome_ranges)

## Assigning experiment name to each row of the dataset.
## Position and range datasets from the same software must
## have identical names.
names(PING_nucleosome_positions) <- rep("PING",
                            length(PING_nucleosome_positions))
names(PING_nucleosome_ranges) <- rep("PING",
                            length(PING_nucleosome_ranges))
names(NOrMAL_nucleosome_positions) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_positions))
names(NOrMAL_nucleosome_ranges) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_ranges))
names(NucPosSimulator_nucleosome_positions) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_positions))
names(NucPosSimulator_nucleosome_ranges) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_ranges))

## Calculating consensus regions for chromosome 1
## with a default region size of 50 bp (2 * extendingSize).
## The consensus regions are extended to include all genomic regions for
## all nucleosomes. However, if the consensus regions are larger than the
## genomic regions of the nucleosomes, the consensus regions are not
## shrunk.
## Nucleosomes from all software must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr1"), c(249250621), NA)
findConsensusPeakRegions(
    narrowPeaks = c(PING_nucleosome_ranges,
                        NOrMAL_nucleosome_ranges,
                        NucPosSimulator_nucleosome_ranges),
    peaks = c(PING_nucleosome_positions,
                        NOrMAL_nucleosome_positions,
                        NucPosSimulator_nucleosome_positions),
    chrInfo = chrList,
    extendingSize = 25,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 3,
    nbrThreads = 1)

Ranges associated with nucleosomes detected by the NucPosSimulator software using synthetic reads generated using a normal distribution. For demonstration purposes.

Description

Ranges associated with nucleosomes detected by the NucPosSimulator software using synthetic reads generated using a normal distribution with a variance of 20 for the region chr1:10000-15000.

Usage

data(NucPosSimulator_nucleosome_ranges)

Format

A GRanges containing one entry per detected nucleosome. The ranges surround the nucleosomes present in the dataset NucPosSimulator_nucleosome_positions. The genomic ranges have been obtained by adding 73 bp on each side of the detected positions.

References

  • Schöpflin R, Teif VB, Müller O, Weinberg C, Rippe K, and Wedemann G. 2013. Modeling nucleosome position distributions from experimental nucleosome positioning maps. Bioinformatics 29 (19): 2380-86.

See Also

Examples

## Loading datasets
data(PING_nucleosome_positions)
data(PING_nucleosome_ranges)
data(NOrMAL_nucleosome_positions)
data(NOrMAL_nucleosome_ranges)
data(NucPosSimulator_nucleosome_positions)
data(NucPosSimulator_nucleosome_ranges)

## Assigning experiment name to each row of the dataset.
## Position and range datasets from the same software must
## have identical names.
names(PING_nucleosome_positions) <- rep("PING",
                            length(PING_nucleosome_positions))
names(PING_nucleosome_ranges) <- rep("PING",
                            length(PING_nucleosome_ranges))
names(NOrMAL_nucleosome_positions) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_positions))
names(NOrMAL_nucleosome_ranges) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_ranges))
names(NucPosSimulator_nucleosome_positions) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_positions))
names(NucPosSimulator_nucleosome_ranges) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_ranges))

## Calculating consensus regions for chromosome 1
## with a default region size of 60 bp (2 * extendingSize).
## Consensus regions are resized to include all genomic regions of
## included nucleosomes.
## Nucleosomes from at least 2 software must be present
## in a region to be retained as a consensus region.
chrList <- Seqinfo(c("chr1"), c(249250621), NA)
findConsensusPeakRegions(
    narrowPeaks = c(PING_nucleosome_ranges,
                        NOrMAL_nucleosome_ranges,
                        NucPosSimulator_nucleosome_ranges),
    peaks = c(PING_nucleosome_positions,
                        NOrMAL_nucleosome_positions,
                        NucPosSimulator_nucleosome_positions),
    chrInfo = chrList,
    extendingSize = 30,
    expandToFitPeakRegion = TRUE,
    shrinkToFitPeakRegion = TRUE,
    minNbrExp = 2,
    nbrThreads = 1)

Nucleosome positions detected by the PING software using synthetic reads generated using a normal distribution. For demonstration purposes.

Description

Nucleosome positions detected by the PING software using synthetic reads generated using a normal distribution with a variance of 20 for the region chr1:10000-15000.

Usage

data(PING_nucleosome_positions)

Format

A GRanges containing one entry per detected nucleosome. The surrounding ranges associated with those nucleosomes are in the dataset PING_nucleosome_ranges.

References

  • Sangsoon W, Zhang X, Sauteraud R, Robert F and Gottardo R. 2013. PING 2.0: An R/Bioconductor package for nucleosome positioning using next-generation sequencing data. Bioinformatics 29 (16): 2049-50.

See Also

Examples

## Loading datasets
data(PING_nucleosome_positions)
data(PING_nucleosome_ranges)
data(NOrMAL_nucleosome_positions)
data(NOrMAL_nucleosome_ranges)
data(NucPosSimulator_nucleosome_positions)
data(NucPosSimulator_nucleosome_ranges)

## Assigning experiment name to each row of the dataset.
## Position and range datasets from the same software must
## have identical names.
names(PING_nucleosome_positions) <- rep("PING",
                            length(PING_nucleosome_positions))
names(PING_nucleosome_ranges) <- rep("PING",
                            length(PING_nucleosome_ranges))
names(NOrMAL_nucleosome_positions) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_positions))
names(NOrMAL_nucleosome_ranges) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_ranges))
names(NucPosSimulator_nucleosome_positions) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_positions))
names(NucPosSimulator_nucleosome_ranges) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_ranges))

## Calculating consensus regions for chromosome 1
## with a default region size of 20 bp (2 * extendingSize).
## The consensus regions are not resized to fit genomic ranges of the
## included nucleosomes.
## Nucleosomes from all 3 software must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr1"), c(249250621), NA)
findConsensusPeakRegions(
    narrowPeaks = c(PING_nucleosome_ranges,
                            NOrMAL_nucleosome_ranges,
                            NucPosSimulator_nucleosome_ranges),
    peaks = c(PING_nucleosome_positions,
                            NOrMAL_nucleosome_positions,
                            NucPosSimulator_nucleosome_positions),
    chrInfo = chrList,
    extendingSize = 10,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 3,
    nbrThreads = 1)

Ranges associated with nucleosomes detected by the PING software using synthetic reads generated using a normal distribution. For demonstration purposes.

Description

Ranges associated with nucleosomes detected by the PING software using synthetic reads generated using a normal distribution with a variance of 20 for the region chr1:10000-15000.

Usage

data(PING_nucleosome_ranges)

Format

A GRanges containing one entry per detected nucleosome. The ranges surround the nucleosomes present in the dataset PING_nucleosome_positions. The genomic ranges have been obtained by adding 73 bp on both sides of the detected positions.

References

  • Sangsoon W, Zhang X, Sauteraud R, Robert F and Gottardo R. 2013. PING 2.0: An R/Bioconductor package for nucleosome positioning using next-generation sequencing data. Bioinformatics 29 (16): 2049-50.

See Also

Examples

## Loading datasets
data(PING_nucleosome_positions)
data(PING_nucleosome_ranges)
data(NOrMAL_nucleosome_positions)
data(NOrMAL_nucleosome_ranges)
data(NucPosSimulator_nucleosome_positions)
data(NucPosSimulator_nucleosome_ranges)

## Assigning experiment name to each row of the dataset.
## Position and range datasets from the same software must
## have identical names.
names(PING_nucleosome_positions) <- rep("PING",
                            length(PING_nucleosome_positions))
names(PING_nucleosome_ranges) <- rep("PING",
                            length(PING_nucleosome_ranges))
names(NOrMAL_nucleosome_positions) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_positions))
names(NOrMAL_nucleosome_ranges) <- rep("NOrMAL",
                            length(NOrMAL_nucleosome_ranges))
names(NucPosSimulator_nucleosome_positions) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_positions))
names(NucPosSimulator_nucleosome_ranges) <- rep("NucPosSimulator",
                            length(NucPosSimulator_nucleosome_ranges))

## Calculating consensus regions for chromosome 1
## with a default region size of 20 bp (2 * extendingSize),
## which is not extended to include all genomic regions for the closest
## peak to the median position of all peaks included in the region (for
## each experiment).
## Nucleosomes from at least 2 software must be present in a region to
## be retained as a consensus region.
chrList <- Seqinfo(c("chr1"), c(249250621), NA)
findConsensusPeakRegions(
    narrowPeaks = c(PING_nucleosome_ranges,
                            NOrMAL_nucleosome_ranges,
                            NucPosSimulator_nucleosome_ranges),
    peaks = c(PING_nucleosome_positions,
                            NOrMAL_nucleosome_positions,
                            NucPosSimulator_nucleosome_positions),
    chrInfo = chrList,
    extendingSize = 10,
    expandToFitPeakRegion = FALSE,
    shrinkToFitPeakRegion = FALSE,
    minNbrExp = 2,
    nbrThreads = 1)

Extract narrow regions and peaks from a narrowPeak file

Description

Read a narrowPeak file and extract the narrow regions and/or the peaks, as specified by the user. The narrowPeak file must follow the UCSC specifications. See https://genome.ucsc.edu/FAQ/FAQformat.html#format12 for more details. The file can have one or many header lines. However, the total number of header lines must be fewer than 250.
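For readers unfamiliar with the format, the base-R sketch below parses two made-up narrowPeak records using the ten columns defined by the UCSC specification. It is only an illustration of the file layout, not the code used by readNarrowPeakFile; the records and the rule used to detect header lines are assumptions.

```r
## Two hypothetical narrowPeak records preceded by one header line
lines <- c(
    "track type=narrowPeak name=demo",
    "chr1\t1000\t1400\tpeak1\t500\t.\t12.5\t-1\t3.2\t150",
    "chr1\t5000\t5300\tpeak2\t300\t.\t8.1\t-1\t2.4\t90"
)

## Drop header lines (assumed here to start with "track" or "browser")
records <- lines[!grepl("^(track|browser)", lines)]

## Parse the ten tab-separated columns of the narrowPeak format
narrow <- read.table(text = records, sep = "\t", stringsAsFactors = FALSE,
    col.names = c("chrom", "chromStart", "chromEnd", "name", "score",
                    "strand", "signalValue", "pValue", "qValue", "peak"))

## The peak column is a 0-based offset from chromStart, so the peak
## position (still in the file's 0-based coordinates) is their sum
narrow$peakPosition <- narrow$chromStart + narrow$peak
```

The narrow region of each record is given by chromStart and chromEnd, while peakPosition gives the point-source summit inside that region.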

Usage

readNarrowPeakFile(file_path, extractRegions = TRUE, extractPeaks = TRUE)

Arguments

file_path

the name of the file.

extractRegions

a logical indicating if the narrow regions must be extracted. If TRUE, a GRanges containing the narrow regions will be returned. Otherwise, NULL is returned. Default = TRUE.

extractPeaks

a logical indicating if the peaks must be extracted. If TRUE, a GRanges containing the peaks will be returned. Otherwise, NULL is returned. Default = TRUE.

Value

a list containing 2 entries:

  • narrowPeak a GRanges containing the narrow regions extracted from the file. NULL when not needed by user.

  • peak a GRanges containing the peaks extracted from the file. NULL when not needed by user.

Author(s)

Astrid Deschênes

Examples

## Set file information
test_narrowPeak <- system.file("extdata",
            "A549_FOSL2_ENCSR000BQO_MZW_part_chr_1_and_12.narrowPeak",
            package = "consensusSeekeR")

## Read file to extract peaks and regions
data <- readNarrowPeakFile(test_narrowPeak, extractRegions = TRUE,
            extractPeaks = TRUE)

## To access peak data (GRanges format)
head(data$peak)

## To access region data (GRanges format)
head(data$narrowPeak)