Package: BUSpaRse 1.21.0

Lambda Moses

BUSpaRse: kallisto | bustools R utilities

The kallisto | bustools pipeline is a fast and modular set of tools that converts single-cell RNA-seq reads in FASTQ files into gene count or transcript compatibility count (TCC) matrices for downstream analysis. Central to this pipeline is the barcode, UMI, and set (BUS) file format. This package serves three purposes. First, it lets users manipulate BUS format files as data frames in R and then convert them into gene count or TCC matrices. Because R and Rcpp code is easier to handle than pure C++ code, users are encouraged to tweak the source code of this package to experiment with new uses of the BUS format and with different ways to convert a BUS file into a gene count matrix. Second, it conveniently generates the files required to produce gene count matrices for spliced and unspliced transcripts for RNA velocity. Here biotypes can be filtered, scaffolds and haplotypes can be removed, and the filtered transcriptome can be extracted and written to disk. Third, it implements utility functions to get transcripts and their associated genes, which are required to convert BUS files to gene count matrices; to write the transcript-to-gene information in the format required by bustools; and to read the output of bustools into R as sparse matrices.

Authors: Lambda Moses [aut, cre], Lior Pachter [aut, ths]

BUSpaRse_1.21.0.tar.gz
BUSpaRse_1.21.0.zip (r-4.5), BUSpaRse_1.21.0.zip (r-4.4), BUSpaRse_1.21.0.zip (r-4.3)
BUSpaRse_1.21.0.tgz (r-4.4-x86_64), BUSpaRse_1.21.0.tgz (r-4.4-arm64), BUSpaRse_1.21.0.tgz (r-4.3-x86_64), BUSpaRse_1.21.0.tgz (r-4.3-arm64)
BUSpaRse_1.21.0.tar.gz (r-4.5-noble), BUSpaRse_1.21.0.tar.gz (r-4.4-noble)
BUSpaRse_1.21.0.tgz (r-4.4-emscripten), BUSpaRse_1.21.0.tgz (r-4.3-emscripten)
BUSpaRse.pdf | BUSpaRse.html
BUSpaRse/json (API)
NEWS

# Install 'BUSpaRse' in R:
install.packages('BUSpaRse', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/bustools/busparse/issues

Uses libs:
  • c++: GNU Standard C++ Library v3

On Bioconductor: BUSpaRse-1.21.0 (bioc 3.21), BUSpaRse-1.20.0 (bioc 3.20)

Topics: singlecell, rnaseq, workflowstep, cpp

7.47 score 9 stars 164 scripts 368 downloads 1 mentions 22 exports 115 dependencies

Last updated 2 months ago from: 0e32057b9f. Checks: OK: 1, NOTE: 8. Indexed: yes.

Target | Result | Date
Doc / Vignettes | OK | Nov 29 2024
R-4.5-win-x86_64 | NOTE | Nov 29 2024
R-4.5-linux-x86_64 | NOTE | Nov 29 2024
R-4.4-win-x86_64 | NOTE | Nov 29 2024
R-4.4-mac-x86_64 | NOTE | Nov 29 2024
R-4.4-mac-aarch64 | NOTE | Nov 29 2024
R-4.3-win-x86_64 | NOTE | Nov 29 2024
R-4.3-mac-x86_64 | NOTE | Nov 29 2024
R-4.3-mac-aarch64 | NOTE | Nov 29 2024

Exports: annots_from_fa_df, annots_from_fa_GRanges, dl_transcriptome, EC2gene, get_inflection, get_knee_df, get_velocity_files, knee_plot, make_sparse_matrix, read_count_output, read_velocity_output, save_tr2g_bustools, sort_tr2g, species2dataset, subset_annot, tr2g_EnsDb, tr2g_ensembl, tr2g_fasta, tr2g_gff3, tr2g_gtf, tr2g_TxDb, transcript2gene

Dependencies: abind, AnnotationDbi, AnnotationFilter, askpass, BH, Biobase, BiocFileCache, BiocGenerics, BiocIO, BiocParallel, biomaRt, Biostrings, bit, bit64, bitops, blob, BSgenome, cachem, cli, codetools, colorspace, cpp11, crayon, curl, DBI, dbplyr, DelayedArray, digest, dplyr, ensembldb, fansi, farver, fastmap, filelock, formatR, futile.logger, futile.options, generics, GenomeInfoDb, GenomeInfoDbData, GenomicAlignments, GenomicFeatures, GenomicRanges, ggplot2, glue, gtable, hms, httr, httr2, IRanges, isoband, jsonlite, KEGGREST, labeling, lambda.r, lattice, lazyeval, lifecycle, magrittr, MASS, Matrix, MatrixGenerics, matrixStats, memoise, mgcv, mime, munsell, nlme, openssl, pillar, pkgconfig, plogr, plyranges, png, prettyunits, progress, ProtGenerics, purrr, R6, rappdirs, RColorBrewer, Rcpp, RcppArmadillo, RcppProgress, RCurl, restfulr, Rhtslib, rjson, rlang, Rsamtools, RSQLite, rtracklayer, S4Arrays, S4Vectors, scales, snow, SparseArray, stringi, stringr, SummarizedExperiment, sys, tibble, tidyr, tidyselect, UCSC.utils, utf8, vctrs, viridisLite, withr, XML, xml2, XVector, yaml, zeallot, zlibbioc

Converting BUS format into sparse matrix

Rendered from sparse-matrix.Rmd using knitr::rmarkdown on Nov 29 2024.

Last update: 2021-03-01
Started: 2019-06-18

Generate transcript to gene file for bustools

Rendered from tr2g.Rmd using knitr::rmarkdown on Nov 29 2024.

Last update: 2024-07-31
Started: 2019-06-18

Readme and manuals

Help Manual

Help page | Topics
Generate RNA velocity files for GRanges | .get_velocity_files
Transfer information about circular chromosomes between genome and annotation | annot_circular
Get genome annotation from Ensembl FASTA file | annots_from_fa_df annots_from_fa_GRanges
Cell Ranger gene biotypes | cellranger_biotypes
Check that an object is a character vector of length 1 | check_char1
Check for chromosomes in genome but not annotation | check_genome
Check inputs to tr2g_gtf and tr2g_gff3 | check_gff
Check that a tag is present in attribute field of GTF/GFF | check_tag_present
Check if transcript ID in transcriptome and annotation match | check_tx
Download transcriptome from Ensembl | dl_transcriptome
Map EC Index to Genes Compatible with the EC | EC2gene
Gene biotypes from Ensembl | ensembl_gene_biotypes
These are the column names of the 'mcols' when the Ensembl GTF file is read into R as a 'GRanges', including 'gene_id', 'transcript_id', 'biotype', 'description', and so on, and the mandatory tags like 'ID', 'Name', and 'Parent'. | ensembl_gff_mcols
Tags in the attributes field of Ensembl GTF files | ensembl_gtf_mcols
Transcript biotypes from Ensembl | ensembl_tx_biotypes
Get flanked intronic ranges | get_intron_flanks
Plot the transposed knee plot and inflection point | get_inflection get_knee_df knee_plot
Get files required for RNA velocity with bustools | get_velocity_files get_velocity_files,character-method get_velocity_files,EnsDb-method get_velocity_files,GRanges-method get_velocity_files,TxDb-method
Convert the Output of 'kallisto bus' into Gene by Cell Matrix | make_sparse_matrix
Match chromosome naming styles of annotation and genome | match_style
Read matrix along with barcode and gene names | read_count_output
Read intronic and exonic matrices into R | read_velocity_output
Tags in the attributes field of RefSeq GFF files | refseq_gff_mcols
Save transcript to gene file for use in 'bustools' | save_tr2g_bustools
Sort transcripts to the same order as in kallisto index | sort_tr2g
Convert Latin species name to dataset name | species2dataset
Standardize GRanges field names | standardize_tags
Remove chromosomes in annotation absent from genome | sub_annot
Subset genome annotation | subset_annot subset_annot,BSgenome-method subset_annot,DNAStringSet-method
Get transcript and gene info from EnsDb objects | tr2g_EnsDb
Get transcript and gene info from Ensembl | tr2g_ensembl
Get transcript and gene info from names in FASTA files | tr2g_fasta
Get transcript and gene info from GFF3 file | tr2g_gff3
Get transcript and gene info from GRanges | tr2g_GRanges
Get transcript and gene info from GTF file | tr2g_gtf
tr2g for exon-exon junctions | tr2g_junction
Get transcript and gene info from TxDb objects | tr2g_TxDb
Map Ensembl transcript ID to gene ID | transcript2gene
Validate input to get_velocity_files | validate_velocity_input
Write the files for RNA velocity to disk | write_velocity_output