NEWS
Rsamtools 2.16
NEW FEATURES
- (v 2.15.1) sortBam() gains support for sorting by tag (byTag) and using
multiple threads (nThreads). (See
https://github.com/Bioconductor/Rsamtools/issues/46 ; kriemo)
Rsamtools 2.10
DEPRECATED AND DEFUNCT
- (v 2.9.1) Deprecate applyPileups() in favor of pileup().
Rsamtools 2.8
NEW FEATURES
- (v 2.7.2) idxstatsBam works on remote (e.g., http://) files and reports
unmapped ('seqnames' equal to *) reads. See
https://support.bioconductor.org/p/9136222.
Rsamtools 2.4
BUG FIXES
- (v 2.3.2; from v 2.2.2) Correctly handle '*' ('unknown') RNAME
during paired-end processing. See
https://github.com/Bioconductor/Rsamtools/issues/16.
- (v 2.3.5) Fix regression introduced by v 2.3.2
NEW FEATURES
- (v 2.3.1) Don't require BAM files to have @SQ lines; allows
parsing PacBio 'unaligned' BAM files.
(https://github.com/Bioconductor/Rsamtools/issues/15 ; jayoung)
Rsamtools 2.0
SIGNIFICANT USER-VISIBLE CHANGES
- Migrate Rsamtools to Rhtslib. See Rsamtools/migration_notes.md for
more information about this migration.
- Remove unused fields from BamRangeIterator
- Remove BAM header hash init for pileup (already memoized in Rhtslib)
Rsamtools 1.34
BUG FIXES
- (v 1.34.1) indexFa,FaFile-method correctly updates the index path.
Rsamtools 1.33
NEW FEATURES
- (v 1.33.4, 1.33.7) scanBamFlag() gains isSupplementaryAlignment
support.
BUG FIXES
- (v 1.33.1) Do not try to grow NULL (not-yet-encountered) tags
(https://support.bioconductor.org/p/110609/ ; Robert Bradley)
- (v 1.33.5) Check for corrupt index
(https://github.com/Bioconductor/Rsamtools/issues/3 ; kjohnsen)
Rsamtools 1.31
BUG FIXES
- (v 1.31.3) pileup() examples require min_base_quality = 10.
See https://support.bioconductor.org/p/105515/#105553
Rsamtools 1.27
BUG FIXES
- qnameSuffixStart<-(), qnamePrefixEnd<-() accept 'NA' (bug report
from Peter Hickey).
- scanBam() accepts a single tag mixing 'Z' and 'A' format. See
https://support.bioconductor.org/p/94553/
Rsamtools 1.25
NEW FEATURES
- *File and *FileList (e.g., TabixFile, TabixFileList)
constructors support NA as 'index'.
- *File and *FileList have an accessor method for index.
- asBam(), asSam() provide default destinations.
- idxstatsBam() quickly summarizes the number of mapped and
unmapped reads on each sequence in a BAM file.
SIGNIFICANT USER-VISIBLE CHANGES
- index() by default returns NA rather than character(), but can be
controlled with the asNA argument.
BUG FIXES
- TabixFileList(TabixFile()) works.
- *File constructors now check that the file argument is length 1,
and that the index argument is length 0 or 1.
Rsamtools 1.23
NEW FEATURES
- filterBam can filter one source file into multiple destinations
by providing a vector of destination files and a list of
FilterRules.
- phred2ASCIIOffset() helps translate PHRED encodings (integer or
character) to ASCII offsets for use in pileup()
BUG FIXES
- scanBam() fails early when param seqlevels not present in file.
- Rsamtools.mk for Windows avoids spaces in file paths
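The relationship between PHRED scores and ASCII characters that
phred2ASCIIOffset() deals with can be illustrated outside R. The Python
sketch below assumes the common Phred+33 convention and uses hypothetical
helper names; it is not the Rsamtools implementation and does not
reproduce that function's return values.

```python
PHRED33_OFFSET = 33  # assumption: ASCII code = PHRED score + 33


def phred_to_char(score: int, offset: int = PHRED33_OFFSET) -> str:
    """Return the ASCII character that encodes an integer PHRED score."""
    if score < 0:
        raise ValueError("PHRED scores are non-negative")
    return chr(score + offset)


def char_to_phred(char: str, offset: int = PHRED33_OFFSET) -> int:
    """Return the integer PHRED score encoded by a single ASCII character."""
    if len(char) != 1:
        raise ValueError("expected a single character")
    return ord(char) - offset
```

Under this assumption a score of 40 is written as 'I' and '!' encodes a
score of 0.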
Rsamtools 1.21
SIGNIFICANT USER-VISIBLE CHANGES
- pileup adds query_bins arg to give strand-sensitive cycle bin
behavior; cycle_bins renamed left_bins; negative values allowed
(including -Inf) to specify bins based on distance from
end-of-read.
- mapqFilter allows specification of a mapping quality filter
threshold
- PileupParam() now correctly follows samtools with
min_base_quality=13, min_map_quality=0 (previously, values were
assigned as 0 and 13, respectively)
- Support parsing 'B' tags in bam file headers.
BUG FIXES
- segfault on range iteration, introduced in 1.19.35, fixed in 1.21.1
- BamViews parallel evaluation with BatchJobs back-end requires
named arguments
Rsamtools 1.19
SIGNIFICANT USER-VISIBLE CHANGES
- FaFile accepts a distinct index file
- Support for cigars > 32767 characters
- Mate pairs use pos and mpos values calculated modulo target
length for pairing, facilitating some representations of mates
on circular chromosomes.
- scanBam no longer translates mapq '255' to 'NA'
BUG FIXES
- segfault on file iteration, introduced in 1.19.35, fixed in
1.19.44
- scanBam correctly parses '=' and 'X'
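The idea of comparing mate positions modulo the target length can be
sketched in Python. The function name is hypothetical, and the handling
of 1-based coordinates and of the other pairing criteria in Rsamtools is
not shown; this only illustrates why a position past the end of a
circular sequence can still match.

```python
def positions_agree(pos: int, mpos: int, target_length: int) -> bool:
    """Compare a read's position with its mate's recorded mate position,
    both reduced modulo the target (reference sequence) length."""
    if target_length <= 0:
        raise ValueError("target_length must be positive")
    return pos % target_length == mpos % target_length
```

With a target of length 1000, a recorded mate position of 1005 agrees
with a read at position 5, because both reduce to 5.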
Rsamtools 1.17.0
NEW FEATURES
- pileup visits entire file if no 'which' argument specified for
'ScanBamParam' parameter of pileup. Buffered functionality with
'yieldSize' available to manage memory consumption when working
with large BAM files
- pileup 'read_pos_breaks' parameter renamed to 'cycle_bins':
cycle_bins allows users to differentiate pileup counts based on
user-defined regions within a read.
- pileup uses PileupParam and ScanBamParam instances to calculate
pileup statistics for a BAM file; returns a data.frame with
columns summarizing information extracted from alignments
overlapping each genomic position
- scanBam,BamSampler-method returns requested and actual
yieldSize, and total reads
- seqinfo,BamFileList-method returns the merged seqinfo of each
BamFile; seqlevels and seqlengths behave similarly.
- scanBamHeader accepts a 'what' argument to control input of the
targets and / or text portion of the header, and is much faster
for BAM files with many rnames.
SIGNIFICANT USER-VISIBLE CHANGES
- rename PileupParam class and constructor -> ApplyPileupsParam
- seqinfo,BamFile-method orders levels as they occur in the file,
reverting a change introduced in Rsamtools version 1.15.28
(version 1.17.16).
BUG FIXES
- scanBam(BamSampler(), param=param) with a 'which' argument no
longer mangles element names, and respects yield size
- applyPileups checks that seqlevels are identical across files
- scanFa documentation incorrectly indicated that end coordinates
beyond the range of the sequence would be truncated; they are an
error.
- applyPileups would fail on cigars with insertion followed by
reference skip, e.g., 2I1024N98M (bug report of Dan Gatti).
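To make the cigar in the last item concrete, the Python sketch below
parses a CIGAR string and computes how many reference and query bases it
spans, following the operation table in the SAM specification. The
function names are hypothetical and this is not the applyPileups code.

```python
import re

CIGAR_STRING = re.compile(r"(\d+[MIDNSHP=X])+")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

REFERENCE_OPS = set("MDN=X")  # operations that consume reference bases
QUERY_OPS = set("MIS=X")      # operations that consume query bases


def parse_cigar(cigar: str):
    """Split a CIGAR string into (length, operation) pairs."""
    if not CIGAR_STRING.fullmatch(cigar):
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def reference_width(cigar: str) -> int:
    """Number of reference positions spanned by the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op in REFERENCE_OPS)


def query_width(cigar: str) -> int:
    """Number of read bases described by the alignment."""
    return sum(n for n, op in parse_cigar(cigar) if op in QUERY_OPS)
```

For 2I1024N98M the two inserted bases consume only the query, and the
1024-base skip consumes only the reference.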
Rsamtools 1.15.0
NEW FEATURES
- asSam converts BAM files to SAM files
- razip, bgzip re-compress directly from .gz files
- yieldReduce iterates through a BAM or other file, applying a MAP function
to each chunk and reducing the result to its final representation
SIGNIFICANT USER-VISIBLE CHANGES
- bgzip default extension changed to '.bgz'
- seqinfo,BamFile-method attempts to return seqnames in 'natural'
order, e.g., chr1, chr2, ...
- yieldSize now works on BAM files queried with ranges. Successive
ranges are input until the total number of records first equals
or exceeds yieldSize.
- scanFa supports DNA, RNA, and AAStringSet return objects
BUG FIXES
- scanFa returns correct sequence at the very end of files
- razip compresses small files
- applyPileups no longer crashes in the absence of an index file
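The yieldSize rule for range queries described above (keep adding ranges
until the running record count first reaches yieldSize) can be sketched
in Python. The function name and the per-range counts are hypothetical;
this models only the grouping rule, not how Rsamtools reads records.

```python
from typing import Iterable, Iterator, List


def chunk_ranges(records_per_range: Iterable[int],
                 yield_size: int) -> Iterator[List[int]]:
    """Group range indices so each chunk is closed as soon as its total
    record count first equals or exceeds yield_size."""
    chunk, total = [], 0
    for index, n_records in enumerate(records_per_range):
        chunk.append(index)
        total += n_records
        if total >= yield_size:
            yield chunk
            chunk, total = [], 0
    if chunk:  # remaining ranges that never reached yield_size
        yield chunk
```

A chunk can therefore hold more than yieldSize records, since whole
ranges are added at a time.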
Rsamtools 1.14.0
NEW FEATURES
- seqinfo(FaFile) returns available information on sequences and
lengths on Fa (indexed fasta) files.
- filterBam accepts FilterRules call-backs for arbitrary
filtering.
- add isIncomplete,BamFile-method to test for end-of-file
- add count.mapped.reads to summarizeOverlaps,*,BamFileList-method;
set to TRUE to collect read and nucleotide counts via countBam.
- add summarizeOverlaps,*,character-method to count simple file
paths
- add sequenceLayer() and stackStringsFromBam()
- add 'with.which_label' arg to readGAlignmentsFromBam(),
readGappedReadsFromBam(), readGAlignmentPairsFromBam(), and
readGAlignmentsListFromBam()
SIGNIFICANT USER-VISIBLE CHANGES
- rename:
readBamGappedAlignments() -> readGAlignmentsFromBam()
readBamGappedReads() -> readGappedReadsFromBam()
readBamGappedAlignmentPairs() -> readGAlignmentPairsFromBam()
readBamGAlignmentsList() -> readGAlignmentsListFromBam()
makeGappedAlignmentPairs() -> makeGAlignmentPairs()
- speedup findMateAlignment()
DEPRECATED AND DEFUNCT
- deprecate readBamGappedAlignments(), readBamGappedReads(),
readBamGappedAlignmentPairs(), readBamGAlignmentsList(), and
makeGappedAlignmentPairs()
BUG FIXES
- scanVcfHeader tolerates records without ID fields, and with
fields named similarly to ID.
- close razip files only once.
- report tabix input errors
Rsamtools 1.12.0
NEW FEATURES
- BamSampler draws a random sample from BAM file records, obeying
any restriction by ScanBamParam().
- Add argument 'obeyQname' to BamFile. Used with qname-sorted
Bam files only.
- Add readBamGAlignmentsList function for reading qname-sorted
Bam files into a GAlignmentsList object.
USER-VISIBLE CHANGES
- bamPath and bamIndicies applied to BamViews return named
vectors.
- 'yieldSize' argument in BamFile represents the number of
unique qnames when 'obeyQname=TRUE'.
BUG FIXES
- completely free razip, bgzip files when done.
- sortBam, indexBam fail gracefully on non-BAM input.
- headerTabix on an open TabixFile no longer reads the first
record
- scanBcfHeader provides informative error message when header
line ('#CHROM POS ...') is missing
Rsamtools 1.10.0
NEW FEATURES
- BamFile and TabixFile accept argument yieldSize; repeated calls
to scanBam and scanTabix return successive yieldSize chunks of
the file. readBamGappedAlignments, VariantAnnotation::readVcf
automatically gain support for yield'ing through files.
- Add getDumpedAlignments(), countDumpedAlignments(), and
flushDumpedAlignments() low-level utilities for manipulating
alignments dumped by findMateAlignment().
- Add quickBamCounts() utility for classifying the records in a BAM file
according to a set of predefined groups (based on the flag bits) and
for counting the number of records in each group.
SIGNIFICANT USER-VISIBLE CHANGES
- scanBamFlag isValidVendorRead deprecated in favor of
isNotPassingQualityControls
- Rename makeGappedAlignmentPairs() arg 'keep.colnames' -> 'use.mcols'.
BUG FIXES
- close razip, bgzip files when done
- bamReverseComplement<- failed to return the updated object
- scanBcfHeader works on remote files
- allow asBam to work without warnings on header-only SAM files
- some bug fixes and small performance improvements to
findMateAlignment()
- fix bug in readBamGappedAlignmentPairs() where fields and tags
specified by the user were not propagated to the returned
GappedAlignmentPairs object
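The yieldSize behavior introduced in this release, where repeated reads
return successive chunks of a file, follows a familiar pattern. A minimal
Python sketch of that pattern, with a hypothetical function name and an
in-memory sequence standing in for a file:

```python
from itertools import islice
from typing import Iterable, Iterator, List


def yield_chunks(records: Iterable, yield_size: int) -> Iterator[List]:
    """Return successive chunks of at most yield_size records until the
    input is exhausted."""
    if yield_size <= 0:
        raise ValueError("yield_size must be positive")
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, yield_size))
        if not chunk:  # nothing left to read
            return
        yield chunk
```

The final chunk may be shorter than yield_size, and an empty input
produces no chunks.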
Rsamtools 1.8.0
NEW FEATURES
- Add readBamGappedAlignmentPairs() (plus related utilities
findMateAlignment() and makeGappedAlignmentPairs()) to read a BAM
file into a GappedAlignmentPairs object.
SIGNIFICANT USER-VISIBLE CHANGES
- update samtools to github commit
dc27682f70713a70d4f31bca652cf78e00757da2
- Add 'bitnames' arg to bamFlagAsBitMatrix() utility.
- By default readBamGappedAlignments() and readBamGappedReads() don't
drop PCR or optical duplicates anymore.
BUG FIXES
- readBamGappedAlignments handles empty 'tag' fields
- scanTabix would omit variants overlapping range ends
- scanFa would segfault on empty files or empty ids
Rsamtools 1.6.0
NEW FEATURES
- TabixFile, indexTabix, scanTabix, yieldTabix index (sorted,
compressed) and parse tabix-indexed files
- readBamGappedReads(), bamFlagAsBitMatrix(), bamFlagAND()
- Add use.names and param args to readBamGappedAlignments(); dropped
which and ... args.
- PileupFiles, PileupParam, applyPileups for visiting several BAM
files and calculating pile-ups on each.
- Provide a zlib for Windows, as R does not currently do this
- BamFileList, BcfFileList, TabixFileList, FaFileList classes
extend IRanges::SimpleList, for managing lists of file references
- razfFa creates random access compressed fasta files.
- count and scanBam support input of larger numbers of records;
countBam nucleotide count is now numeric() and subject to rounding
error when large.
- Update to samtools 0.1.17
- asBcf and indexBcf coerce VCF files to BCF and index BCF files
- Update to samtools 0.1.18
- scanVcf parses VCF files; use scanVcf,connection,missing-method
to stream, scanVcf,TabixFile,*-method to select subsets. Use
unpackVcf to expand INFO and GENO fields.
SIGNIFICANT USER-VISIBLE CHANGES
- ScanBamParam argument 'what' defaults to character(0) (nothing)
rather than scanBamWhat() (everything)
- bamFlag returns a user-friendly description of flags by default
BUG FIXES
- scanBam (and readBamGappedAlignments) called with an invalid or
character(0) index no longer segfaults.
- scanBcfHeader parses values with embedded commas or =
- scanFa fails, rather than returns incorrect sequences, when file
is compressed and file positions are not accessed sequentially
- scanBcf parses VCF files with no genotype information.
- scanBam called with the first range having no reads returned
invalid results for subsequent ranges; introduced in svn r57138
- scanBamFlag isPrimaryRead changed to isNotPrimaryRead,
correctly reflecting the meaning of the flag.
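The flag-related items above all refer to the bit field defined by the
SAM specification. The Python sketch below decodes a flag into the bits
that are set; the labels are descriptive names chosen for illustration,
not the argument names used by scanBamFlag() or the output of bamFlag().

```python
# Bit values follow the SAM specification; labels are illustrative only.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "not_primary",
    0x200: "fails_quality_control",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> list:
    """Return the labels of all bits set in a SAM flag, lowest bit first."""
    return [label for bit, label in SAM_FLAG_BITS.items() if flag & bit]
```

Bit 0x100 being set means the record is not the primary alignment, which
is why a name such as isNotPrimaryRead matches the meaning of the flag.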
Rsamtools 1.4.0
NEW FEATURES
- BamFile class allows bam files to be open across calls to
scanBam and friends. This can be helpful when wanting to avoid
repeated loading of the index, for instance.
- BcfFile, scanBcf, scanBcfHeader to parse bcftools' .vcf and .bcf
files. Note that this implements bcftools' notions of vcf and bcf,
and is not fully compliant with vcf-4.0.
- asBam converts SAM files to (indexed) BAM files
- FaFile, indexFa, scanIndexFa, scanFa index and parse (indexed)
fasta files.
BUG FIXES
- scanBamFlag isValidVendorRead had reversed TRUE/FALSE logic
- Attempts to read too many records caught more gracefully.
- samtools output to fprintf() or calls to exit() are handled more
gracefully
Rsamtools 1.2.0
NEW FEATURES
- Update to samtools 0.1.8
- Update to samtools svn rev 750 (Mon, 27 Sep 2010)
- sortBam sorts a BAM file
BUG FIXES
- Attempts to parse non-existent local files now generate an error
- Reads whose last nucleotide overlaps the first of a range are
now scanned / counted.
- scanning / counting reads late in large Windows files is fast
- scanBam tag fields of type 'A' parsed correctly
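The boundary case fixed above, a read whose last nucleotide falls on the
first position of a range, is the edge case of a closed-interval overlap
test. A minimal Python sketch, assuming 1-based inclusive coordinates and
a hypothetical function name:

```python
def overlaps(read_start: int, read_end: int,
             range_start: int, range_end: int) -> bool:
    """True if two closed intervals share at least one position."""
    return read_start <= range_end and read_end >= range_start
```

A read spanning 91 to 100 overlaps a range starting at 100, whereas a
read ending at 99 does not.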
Rsamtools 1.0.0
SIGNIFICANT USER-VISIBLE CHANGES
- scanBam returns minus-strand reads in the manner presented in
the BAM file, i.e., as though on the positive strand. This occurs
in revision 0.1.34
- readBamGappedAlignments replaces readBAMasGappedAlignments
NEW FEATURES
- ScanBamParam() accepts 'tag' argument for parsing optional fields
- BamViews can be used with scanBam, countBam,
readBamGappedAlignments
BUG FIXES
- No changes classified as 'bug fixes' (package under active
development)