Biostrings - Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Last updated
sequencematchingalignmentsequencinggeneticsdataimportdatarepresentationinfrastructurebioconductor-packagecore-package
17.94 score 67 stars 1.2k dependents 14k scripts 102k downloads
SingleCellExperiment - S4 Classes for Single Cell Data
Defines an S4 class for storing data from single-cell experiments. This includes specialized methods to store and retrieve spike-in information, dimensionality reduction coordinates and size factors for each cell, along with the usual metadata for genes and libraries.
Last updated
immunooncologydatarepresentationdataimportinfrastructuresinglecellbioconductorbioconductor-packagehuman-cell-atlassingle-cell-rna-seq
16.72 score 75 stars 337 dependents 19k scripts 34k downloads
limma - Linear Models for Microarray and Omics Data
Data analysis, linear models and differential expression for omics data.
Last updated
exonarraygeneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicinggenesetenrichmentdataimportbayesianclusteringregressiontimecoursemicroarraymicrornaarraymrnamicroarrayonechannelproprietaryplatformstwochannelsequencingrnaseqbatcheffectmultiplecomparisonnormalizationpreprocessingqualitycontrolbiomedicalinformaticscellbiologycheminformaticsepigeneticsfunctionalgenomicsgeneticsimmunooncologymetabolomicsproteomicssystemsbiologytranscriptomics
13.78 score 633 dependents 22k scripts 65k downloads
BiocGenerics - S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Last updated
infrastructurebioconductor-packagecore-package
13.76 score 13 stars 2.4k dependents 1.2k scripts 125k downloads
Spectra - Spectra Infrastructure for Mass Spectrometry Data
The Spectra package defines an efficient infrastructure for storing and handling mass spectrometry spectra and functionality to subset, process, visualize and compare spectra data. It provides different implementations (backends) to store mass spectrometry data. These comprise backends tuned for fast data access and processing and backends for very large data sets ensuring a small memory footprint.
Last updated
infrastructureproteomicsmassspectrometrymetabolomicsbioconductorhacktoberfestmass-spectrometry
13.60 score 46 stars 65 dependents 406 scripts 3.9k downloads
ensembldb - Utilities to create and use Ensembl-based annotation databases
The package provides functions to create and use transcript-centric annotation databases/packages. The annotations for the databases are fetched directly from Ensembl using their Perl API. The functionality and data are similar to those of the TxDb packages from the GenomicFeatures package, but, in addition to retrieving all gene/transcript models and annotations from the database, ensembldb provides a filter framework for retrieving annotations for specific entries, such as genes encoded on a chromosome region or transcript models of lincRNA genes. EnsDb databases built with ensembldb also contain protein annotations and mappings between proteins and their encoding transcripts. Finally, ensembldb provides functions to map between genomic, transcript and protein coordinates.
Last updated
geneticsannotationdatasequencingcoverageannotationbioconductorbioconductor-packagesensembl
13.40 score 36 stars 112 dependents 1.7k scripts 18k downloads
scDblFinder - scDblFinder
The scDblFinder package gathers various methods for the detection and handling of doublets/multiplets in single-cell sequencing data (i.e. multiple cells captured within the same droplet or reaction volume). It includes methods formerly found in the scran package, the new fast and comprehensive scDblFinder method, and a reimplementation of the Amulet detection method for single-cell ATAC-seq.
Last updated
preprocessingsinglecellrnaseqatacseqdoubletssingle-cell
13.21 score 251 stars 2 dependents 1.9k scripts 4.3k downloads
SpatialExperiment - S4 Class for Spatially Resolved -omics Data
Defines an S4 class for storing data from spatial -omics experiments. The class extends SingleCellExperiment to support storage and retrieval of additional information from spot-based and molecule-based platforms, including spatial coordinates, images, and image metadata. A specialized constructor function is included for data from the 10x Genomics Visium platform.
Last updated
datarepresentationdataimportinfrastructureimmunooncologygeneexpressiontranscriptomicssinglecellspatialu24ca289073
13.05 score 73 stars 93 dependents 3.4k scripts 12k downloads
msa - Multiple Sequence Alignment
The 'msa' package provides a unified R/Bioconductor interface to the multiple sequence alignment algorithms ClustalW, ClustalOmega, and Muscle. All three algorithms are integrated in the package, therefore, they do not depend on any external software tools and are available for all major platforms. The multiple sequence alignment algorithms are complemented by a function for pretty-printing multiple sequence alignments using the LaTeX package TeXshade.
Last updated
multiplesequencealignmentalignmentmultiplecomparisonsequencingcpp
12.12 score 25 stars 8 dependents 1.3k scripts 3.5k downloads
mia - Microbiome analysis
mia implements tools for microbiome analysis based on the SummarizedExperiment, SingleCellExperiment and TreeSummarizedExperiment infrastructure. Data wrangling and analysis in the context of taxonomic data is the main scope. Additional functions for common tasks, such as community index calculation and summarization, are also implemented.
Last updated
microbiomesoftwaredataimportanalysisbioconductorcpp
12.10 score 57 stars 8 dependents 802 scripts 3.5k downloads
universalmotif - Import, Modify, and Export Motifs with R
A comprehensive toolkit for working with sequence motifs in R. Imports and exports most common motif formats (JASPAR, MEME, HOMER, TRANSFAC, CIS-BP, UNIPROBE) and interoperates with the other Bioconductor motif packages. Analysis functions cover de novo motif discovery, motif-vs-motif comparison and clustering, P-value calculation, sequence scanning, enrichment against shuffled or composition-matched backgrounds, positional bias testing, and pairwise motif co-occurrence. Also includes utilities for sequence shuffling, motif trimming, higher-order representations, ground-truth simulation by motif implantation, and logo-plotting functionality.
Last updated
motifannotationmotifdiscoverydataimportgeneregulationmotif-analysismotif-enrichment-analysissequence-logocpp
11.81 score 33 stars 14 dependents 574 scripts 2.1k downloads
decoupleR - decoupleR: Ensemble of computational methods to infer biological activities from omics data
Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.
Last updated
differentialexpressionfunctionalgenomicsgeneexpressiongeneregulationnetworksoftwarestatisticalmethodtranscription
11.79 score 294 stars 7 dependents 587 scripts 2.6k downloads
muscat - Multi-sample multi-group scRNA-seq data analysis tools
`muscat` provides various methods and visualization tools for DS analysis in multi-sample, multi-group, multi-(cell-)subpopulation scRNA-seq data, including cell-level mixed models and methods based on aggregated “pseudobulk” data, as well as a flexible simulation platform that mimics both single and multi-sample scRNA-seq data.
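The "pseudobulk" aggregation mentioned above can be illustrated with a minimal Python sketch. It is not muscat's code or API: the `pseudobulk` function, the tuple layout of `cells`, and the choice of summing counts (rather than averaging) are all assumptions made for illustration.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-cell gene counts within each (sample, cluster) group.

    `cells` is an iterable of (sample_id, cluster_id, {gene: count}) tuples
    (a hypothetical layout chosen for this sketch).
    Returns {(sample_id, cluster_id): {gene: summed_count}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, cluster_id, counts in cells:
        group = totals[(sample_id, cluster_id)]
        for gene, count in counts.items():
            group[gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {key: dict(genes) for key, genes in totals.items()}


cells = [
    ("s1", "T", {"CD3E": 4, "MS4A1": 0}),
    ("s1", "T", {"CD3E": 6, "MS4A1": 1}),
    ("s1", "B", {"CD3E": 0, "MS4A1": 9}),
    ("s2", "T", {"CD3E": 2, "MS4A1": 0}),
]
bulk = pseudobulk(cells)
```

Each resulting profile behaves like one bulk RNA-seq library per sample and subpopulation, which is what allows sample-level differential testing downstream.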
Last updated
immunooncologydifferentialexpressionsequencingsinglecellsoftwarestatisticalmethodvisualization
11.62 score 231 stars 1 dependents 932 scripts 1.1k downloads
gdsfmt - R Interface to CoreArray Genomic Data Structure (GDS) Files
Provides a high-level R interface to CoreArray Genomic Data Structure (GDS) data files. GDS is portable across platforms and uses a hierarchical structure to store multiple scalable array-oriented data sets with metadata information. It is suited for large-scale datasets, especially for data that are much larger than the available random-access memory. The gdsfmt package offers efficient operations specifically designed for integers of less than 8 bits, since a diploid genotype, such as a single-nucleotide polymorphism (SNP), usually occupies fewer bits than a byte. Data compression and decompression are available with relatively efficient random access. A GDS file can also be read in parallel by multiple R processes, with support from the parallel package.
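To make the sub-byte storage idea concrete, here is a small Python sketch that packs genotype codes four per byte. It does not reproduce the GDS on-disk layout or any gdsfmt function: the 2-bit encoding (0, 1, 2 for allele counts and 3 for missing) and the names `pack_genotypes` and `unpack_genotypes` are assumptions for illustration only.

```python
def pack_genotypes(genotypes):
    """Pack 2-bit genotype codes (0..3) four per byte, lowest bits first."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if not 0 <= g <= 3:
            raise ValueError(f"genotype code out of range: {g}")
        # Each code occupies bits 2*(i % 4) and 2*(i % 4) + 1 of its byte.
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n):
    """Recover the first n 2-bit codes from a packed byte string."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]


genotypes = [0, 1, 2, 3, 2, 1]
packed = pack_genotypes(genotypes)
restored = unpack_genotypes(packed, len(genotypes))
```

Six genotypes fit in two bytes here, a fourfold saving over one byte per value, and any single genotype can still be read by index without unpacking the rest.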
Last updated
infrastructuredataimportbioinformaticsgds-formatgenomicscpp
11.46 score 20 stars 31 dependents 1.1k scripts 4.1k downloads
UCell - Rank-based signature enrichment analysis for single-cell data
UCell is a package for evaluating gene signatures in single-cell datasets. UCell signature scores, based on the Mann-Whitney U statistic, are robust to dataset size and heterogeneity, and their calculation demands less computing time and memory than other available methods, enabling the processing of large datasets in a few minutes even on machines with limited computing power. UCell can be applied to any single-cell data matrix, and includes functions to directly interact with SingleCellExperiment and Seurat objects.
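A rank-based score built on the Mann-Whitney U statistic can be sketched in a few lines of Python. This is an illustration of the general idea for a single cell, not UCell's implementation: the function name `signature_score`, the rank cap `max_rank`, the normalization to the range 0..1, and the ordinal handling of ties are assumptions of this sketch.

```python
def signature_score(expression, signature, max_rank=1500):
    """Score one cell: 1.0 when the signature genes are the top-ranked genes.

    `expression` maps gene -> expression value for a single cell.
    Genes are ranked by decreasing expression (rank 1 = highest); ranks are
    capped at `max_rank`, and signature genes absent from `expression`
    receive `max_rank`. Ties get ordinal ranks in this sketch.
    """
    n = len(signature)
    u_max = n * max_rank - n * (n + 1) / 2
    if n == 0 or u_max <= 0:
        raise ValueError("signature must be non-empty and smaller than max_rank allows")
    ordered = sorted(expression, key=expression.get, reverse=True)
    ranks = {gene: min(pos, max_rank) for pos, gene in enumerate(ordered, start=1)}
    rank_sum = sum(ranks.get(gene, max_rank) for gene in signature)
    # Mann-Whitney U: rank sum minus its smallest possible value.
    u = rank_sum - n * (n + 1) / 2
    return 1 - u / u_max


cell = {"A": 10.0, "B": 8.0, "C": 5.0, "D": 1.0, "E": 0.0}
high = signature_score(cell, ["A", "B"], max_rank=5)
low = signature_score(cell, ["D", "E"], max_rank=5)
```

Because only the within-cell ranking is used, the score for a cell does not depend on which other cells are in the dataset, which is consistent with the robustness to dataset size described above.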
Last updated
singlecellgenesetenrichmenttranscriptomicsgeneexpressioncellbasedassays
11.07 score 196 stars 2 dependents 740 scripts 2.8k downloads
S4Arrays - Foundation of array-like containers in Bioconductor
The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides: (1) low-level functionality meant to help developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).
Last updated
infrastructuredatarepresentationbioconductor-packagecore-packageu24ca289073
10.97 score 7 stars 1.4k dependents 13 scripts 82k downloads
anndataR - AnnData interoperability in R
Bring the power and flexibility of AnnData to the R ecosystem, allowing you to effortlessly manipulate and analyse your single-cell data. This package lets you work with backed h5ad and zarr files, directly access various slots (e.g. X, obs, var), or convert the data into SingleCellExperiment and Seurat objects.
Last updated
singlecelldataimportdatarepresentationanndatah5adinteroperability
10.89 score 181 stars 1 dependents 172 scripts 1.3k downloads
ggtreeExtra - An R Package To Add Geometric Layers On Circular Or Other Layout Tree Of "ggtree"
'ggtreeExtra' extends the method for mapping and visualizing associated data on a phylogenetic tree using 'ggtree'. These associated data can be presented on external panels of circular layout, fan layout, or other rectangular layout trees built by 'ggtree' with the grammar of 'ggplot2'.
Last updated
softwarevisualizationphylogeneticsannotation
10.60 score 98 stars 2 dependents 684 scripts 2.2k downloads
txdbmaker - Tools for making TxDb objects from genomic annotations
A set of tools for making TxDb objects from genomic annotations from various sources (e.g. UCSC, Ensembl, and GFF files). These tools allow the user to download the genomic locations of transcripts, exons, and CDS, for a given assembly, and to import them in a TxDb object. TxDb objects are implemented in the GenomicFeatures package, together with flexible methods for extracting the desired features in convenient formats.
Last updated
infrastructuredataimportannotationgenomeannotationgenomeassemblygeneticssequencingbioconductor-packagecore-package
10.49 score 5 stars 63 dependents 305 scripts 7.7k downloads
Rarr - Read Zarr Files in R
The Zarr specification defines a format for chunked, compressed, N-dimensional arrays. Its design allows efficient access to subsets of the stored array, and supports both local and cloud storage systems. Rarr aims to implement this specification in R with minimal reliance on external tools or libraries.
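Why chunking makes subset access efficient can be shown with simple index arithmetic. The Python sketch below is not Rarr's API and does not read Zarr files; the helper names `locate` and `chunks_for_range` are hypothetical, and it assumes a regular chunk grid with zero-based indices.

```python
from itertools import product


def locate(index, chunk_shape):
    """Map an element index to (chunk coordinates, offset within that chunk)."""
    if len(index) != len(chunk_shape):
        raise ValueError("index and chunk_shape must have the same number of dimensions")
    chunk = tuple(i // c for i, c in zip(index, chunk_shape))
    offset = tuple(i % c for i, c in zip(index, chunk_shape))
    return chunk, offset


def chunks_for_range(start, stop, chunk_shape):
    """List the chunk coordinates touched by the half-open box [start, stop)."""
    per_axis = [
        range(a // c, (b - 1) // c + 1)
        for a, b, c in zip(start, stop, chunk_shape)
    ]
    return list(product(*per_axis))


chunk, offset = locate((5, 12), (4, 10))
needed = chunks_for_range((2, 5), (6, 12), (4, 10))
```

Only the chunks listed in `needed` have to be fetched and decompressed to serve the requested subset, regardless of how large the full array is.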
Last updated
dataimportbioconductorome-ngffome-zarron-diskout-of-memoryzarrc-blosclibzstd
10.49 score 53 stars 7 dependents 91 scripts 630 downloads
Seqinfo - A simple S4 class for storing basic information about a collection of genomic sequences
The Seqinfo class stores the names, lengths, circularity flags, and genomes for a particular collection of sequences. These sequences are typically the chromosomes and/or scaffolds of a specific genome assembly of a given organism. Seqinfo objects are rarely used as standalone objects. Instead, they are used as part of higher-level objects to represent their seqinfo() component. Examples of such higher-level objects are GRanges, RangedSummarizedExperiment, VCF, GAlignments, etc... defined in other Bioconductor infrastructure packages.
Last updated
infrastructuredatarepresentationgenomeassemblyannotationgenomeannotationbioconductor-packagecore-package
10.41 score 1 stars 1.8k dependents 26 scripts 61k downloads
BiocCheck - Bioconductor-specific package checks
BiocCheck guides maintainers through Bioconductor best practices. It runs Bioconductor-specific package checks by searching through package code, examples, and vignettes. Maintainers are required to address all errors, warnings, and most notes produced.
Last updated
infrastructurebioconductor-packagecore-services
10.22 score 10 stars 6 dependents 133 scripts 3.3k downloads
SpatialFeatureExperiment - Integrating SpatialExperiment with Simple Features in sf
A new S4 class integrating Simple Features with the R package sf to bring geospatial data analysis methods based on vector data to spatial transcriptomics. Also implements management of spatial neighborhood graphs and geometric operations. This package builds upon SpatialExperiment and SingleCellExperiment, hence methods for these parent classes can still be used.
Last updated
datarepresentationtranscriptomicsspatial
10.18 score 57 stars 2 dependents 804 scripts 756 downloads
singleCellTK - Comprehensive and Interactive Analysis of Single Cell RNA-Seq Data
The Single Cell Toolkit (SCTK) in the singleCellTK package provides an interface to popular tools for importing, quality control, analysis, and visualization of single cell RNA-seq data. SCTK allows users to seamlessly integrate tools from various packages at different stages of the analysis workflow. A general "a la carte" workflow gives users access to multiple methods for data importing, calculation of general QC metrics, doublet detection, ambient RNA estimation and removal, filtering, normalization, batch correction or integration, dimensionality reduction, 2-D embedding, clustering, marker detection, differential expression, cell type labeling, pathway analysis, and data exporting. Curated workflows can be used to run Seurat and Celda. Streamlined quality control can be performed on the command line using the SCTK-QC pipeline. Users can analyze their data using commands in the R console or by using an interactive Shiny Graphical User Interface (GUI). Specific analyses or entire workflows can be summarized and shared with comprehensive HTML reports generated by Rmarkdown. Additional documentation and vignettes can be found at camplab.net/sctk.
Last updated
singlecellgeneexpressiondifferentialexpressionalignmentclusteringimmunooncologybatcheffectnormalizationqualitycontroldataimportgui
9.89 score 187 stars 260 scripts 646 downloads
destiny - Creates diffusion maps
Create and plot diffusion maps.
Last updated
cellbiologycellbasedassaysclusteringsoftwarevisualizationdiffusion-mapsdimensionality-reductioncpp
9.85 score 104 stars 1 dependents 940 scripts 1.5k downloads
Rsubread - Mapping, quantification and variant analysis of sequencing data
Alignment, quantification and analysis of RNA sequencing data (including both bulk RNA-seq and scRNA-seq) and DNA sequencing data (including ATAC-seq, ChIP-seq, WGS, WES, etc.). Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery. Can be applied to all major sequencing technologies and to both short and long sequence reads.
Last updated
sequencingalignmentsequencematchingrnaseqchipseqsinglecellgeneexpressiongeneregulationgeneticsimmunooncologysnpgeneticvariabilitypreprocessingqualitycontrolgenomeannotationgenefusiondetectionindeldetectionvariantannotationvariantdetectionmultiplesequencealignmentzlib
9.50 score 10 dependents 1.5k scripts 3.6k downloads
UCSC.utils - Low-level utilities to retrieve data from the UCSC Genome Browser
A set of low-level utilities to retrieve data from the UCSC Genome Browser. Most functions in the package access the data via the UCSC REST API but some of them query the UCSC MySQL server directly. Note that the primary purpose of the package is to support higher-level functionalities implemented in downstream packages like GenomeInfoDb or txdbmaker.
Last updated
infrastructuregenomeassemblyannotationgenomeannotationdataimportbioconductor-packagecore-package
9.41 score 1 stars 332 dependents 12 scripts 54k downloads
h5mread - A fast HDF5 reader
The main function in the h5mread package is h5mread(), which allows reading arbitrary data from an HDF5 dataset into R, similarly to what the h5read() function from the rhdf5 package does. In the case of h5mread(), the implementation has been optimized to make it as fast and memory-efficient as possible.
Last updated
infrastructuredatarepresentationdataimportu24ca289073curlopenssl
9.09 score 3 stars 157 dependents 4 scripts 12k downloads
cigarillo - Efficient manipulation of CIGAR strings
CIGAR stands for Concise Idiosyncratic Gapped Alignment Report. CIGAR strings are found in the BAM files produced by most aligners and in the AIRR-formatted output produced by IgBLAST. The cigarillo package provides functions to parse and inspect CIGAR strings, trim them, turn them into ranges of positions relative to the "query space" or "reference space", and project positions or sequences from one space to the other. Note that these are low-level operations that the user rarely needs to perform directly. More typically, they are performed behind the scenes by higher-level functionality implemented in other packages, such as the Bioconductor packages GenomicAlignments and igblastr.
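The parsing and "reference space" projection described above can be sketched in Python. This is not cigarillo's API: the function names `parse_cigar`, `widths`, and `reference_blocks` are hypothetical. The sketch follows the usual SAM convention that M, D, N, = and X consume reference positions while M, I, S, = and X consume query positions.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
CONSUMES_REF = set("MDN=X")
CONSUMES_QUERY = set("MIS=X")
ALIGNED = set("M=X")


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    if not re.fullmatch(r"(?:\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def widths(cigar):
    """Return (reference width, query width) spanned by the alignment."""
    ops = parse_cigar(cigar)
    ref = sum(n for n, op in ops if op in CONSUMES_REF)
    query = sum(n for n, op in ops if op in CONSUMES_QUERY)
    return ref, query


def reference_blocks(cigar, start):
    """1-based inclusive reference ranges covered by M, = and X operations."""
    blocks, pos = [], start
    for n, op in parse_cigar(cigar):
        if op in ALIGNED:
            blocks.append((pos, pos + n - 1))
        if op in CONSUMES_REF:
            pos += n
    return blocks


ref_width, query_width = widths("3S5M2D4M1I3M")
blocks = reference_blocks("3S5M2D4M1I3M", start=100)
```

In this example the deletion (`2D`) opens a gap between the first two blocks on the reference, while the insertion (`1I`) advances only the query, so the last two blocks are adjacent on the reference.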
Last updated
infrastructurealignmentsequencematchingsequencingbioconductor-packagecore-package
8.96 score 557 dependents 5 scripts 22k downloads
crisprDesign - Comprehensive design of CRISPR gRNAs for nucleases and base editors
Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbation): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-targeting nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.
Last updated
crisprfunctionalgenomicsgenetargetbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgenomics-analysisgrnagrna-sequencegrna-sequencessgrnasgrna-design
8.93 score 30 stars 3 dependents 118 scripts 513 downloads
sccomp - Differential Composition and Variability Analysis for Single-Cell Data
Comprehensive R package for differential composition and variability analysis in single-cell RNA sequencing, CyTOF, and microbiome data. Provides robust Bayesian modeling with outlier detection, random effects, and advanced statistical methods for cell type proportion analysis. Features include probabilistic outlier identification, mixed-effect modeling, differential variability testing, and comprehensive visualization tools. Perfect for cancer research, immunology, developmental biology, and single-cell genomics applications.
Last updated
bayesianregressiondifferentialexpressionsinglecellmetagenomicsflowcytometryspatialbatch-correctioncompositioncytofdifferential-proportionmicrobiomemultilevelproportionsrandom-effectssingle-cellunwanted-variation
8.88 score 125 stars 168 scripts 460 downloads
CompoundDb - Creating and Using (Chemical) Compound Annotation Databases
CompoundDb provides functionality to create and use (chemical) compound annotation databases from a variety of different sources such as LipidMaps, HMDB, ChEBI or MassBank. The database format also allows storing MS/MS spectra along with compound information. The package also provides a backend for Bioconductor's Spectra package and thus allows matching experimental MS/MS spectra against MS/MS spectra in the database. Databases can be stored in SQLite format and are thus portable.
Last updated
massspectrometrymetabolomicsannotationdatabasesmass-spectrometry
8.80 score 19 stars 3 dependents 92 scripts 775 downloads
hypeR - An R Package For Geneset Enrichment Workflows
An R Package for Geneset Enrichment Workflows.
Last updated
genesetenrichmentannotationpathwaysbioinformaticscomputational-biologygeneset-enrichment-analysis
8.80 score 79 stars 176 scripts 411 downloads
dreamlet - Scalable differential expression analysis of single cell transcriptomics datasets with complex study designs
Recent advances in single cell/nucleus transcriptomic technology have enabled collection of cohort-scale datasets to study cell type specific gene expression differences associated with disease state, stimulus, and genetic regulation. The scale of these data, complex study designs, and low read count per cell mean that characterizing cell type specific molecular mechanisms requires a user-friendly, purpose-built analytical framework. We have developed the dreamlet package, which applies a pseudobulk approach and fits a regression model for each gene and cell cluster to test differential expression across individuals associated with a trait of interest. Use of precision-weighted linear mixed models enables accounting for repeated measures study designs, high dimensional batch effects, and varying sequencing depth or observed cells per biosample.
Last updated
rnaseqgeneexpressiondifferentialexpressionbatcheffectqualitycontrolregressiongenesetenrichmentgeneregulationepigeneticsfunctionalgenomicstranscriptomicsnormalizationsinglecellpreprocessingsequencingimmunooncologysoftwarecpp
8.63 score 22 stars 362 scripts 448 downloads
ReactomeGSA - Client for the Reactome Analysis Service for comparative multi-omics gene set analysis
The ReactomeGSA package uses Reactome's online analysis service to perform a multi-omics gene set analysis. The main advantage of this package is that the retrieved results can be visualized using Reactome's powerful web application. Since Reactome's analysis service also uses R to perform the actual gene set analysis, you will get similar results when using the same packages (such as limma and edgeR) locally. Therefore, if you only require a gene set analysis, other packages are better suited.
Last updated
genesetenrichmentproteomicstranscriptomicssystemsbiologygeneexpressionreactome
8.62 score 33 stars 141 scripts 544 downloads
scDesign3 - A unified framework of realistic in silico data generation and statistical model inference for single-cell and spatial omics
We present a statistical simulator, scDesign3, to generate realistic single-cell and spatial omics data, including various cell states, experimental designs, and feature modalities, by learning interpretable parameters from real data. Using a unified probabilistic model for single-cell and spatial omics data, scDesign3 infers biologically meaningful parameters; assesses the goodness-of-fit of inferred cell clusters, trajectories, and spatial locations; and generates in silico negative and positive controls for benchmarking computational tools.
Last updated
softwaresinglecellsequencinggeneexpressionspatial
8.59 score 121 stars 1 dependents 71 scripts 316 downloads
ClusterGVis - One-Step to Cluster and Visualize Gene Expression Data
Provides a streamlined workflow for clustering and visualizing gene expression patterns, particularly from time-series RNA-Seq and single-cell experiments. The package is designed to integrate seamlessly within the Bioconductor ecosystem by operating directly on standard data classes such as `SummarizedExperiment` and `SingleCellExperiment`. It implements common clustering algorithms (e.g., k-means, fuzzy c-means) and generates a suite of publication-ready visualizations to explore co-expressed gene modules. Functions are also included to facilitate the visualization of clustering results derived from other popular tools.
Last updated
rnaseqtranscriptomicsvisualizationsinglecellgeneexpressionclusteringcomplexheatmapgene-clusteringgene-expressionmfuzz
8.49 score 376 stars 68 scripts 157 downloads
rBLAST - R Interface for the Basic Local Alignment Search Tool
Seamlessly interfaces the Basic Local Alignment Search Tool (BLAST) running locally to search genetic sequence data bases. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Last updated
geneticssequencingsequencematchingalignmentdataimportbioconductorbioinformaticsblast-search
8.19 score 114 stars 1 dependents 151 scripts 469 downloads
ggkegg - Analyzing and visualizing KEGG information using the grammar of graphics
This package aims to import, parse, and analyze KEGG data such as KEGG PATHWAY and KEGG MODULE. The package supports visualizing KEGG information using ggplot2 and ggraph through using the grammar of graphics. The package enables the direct visualization of the results from various omics analysis packages.
Last updated
pathwaysdataimportkeggggplot2ggraphpathwaytidygraphvisualization
8.18 score 243 stars 2 dependents 52 scripts 820 downloads
notame - Workflow for non-targeted LC-MS metabolic profiling
Provides functionality for untargeted LC-MS metabolomics research as specified in the associated protocol article in the 'Metabolomics Data Processing and Data Analysis—Current Best Practices' special issue of the Metabolites journal (2020). This includes tabular data preprocessing and quality control, uni- and multivariate analysis as well as quality control visualizations, feature-wise visualizations and results visualizations. Raw data preprocessing and functionality related to biological context, such as pathway analysis, is not included.
Last updated
biomedicalinformaticsmetabolomicsdataimportmassspectrometrybatcheffectmultiplecomparisonnormalizationqualitycontrolvisualizationpreprocessing
8.17 score 5 stars 2 dependents 62 scripts 278 downloads
scDiagnostics - Cell type annotation diagnostics
The scDiagnostics package provides diagnostic plots to assess the quality of cell type assignments from single cell gene expression profiles. The implemented functionality lets users assess the reliability of cell type annotations, investigate gene expression patterns, and explore relationships between different cell types in query and reference datasets, so that potential misalignments between reference and query datasets can be detected. The package also provides visualization capabilities for diagnostic purposes.
Last updated
annotationclassificationclusteringgeneexpressionrnaseqsinglecellsoftwaretranscriptomics
8.05 score 13 stars 67 scripts 255 downloads
crisprScore - On-Target and Off-Target Scoring Algorithms for CRISPR gRNAs
Provides R wrappers of several on-target and off-target scoring methods for CRISPR guide RNAs (gRNAs). The following nucleases are supported: SpCas9, AsCas12a, enAsCas12a, and RfxCas13d (CasRx). The available on-target cutting efficiency scoring methods are RuleSet1, RuleSet3, DeepHF, enPAM+GB, and CRISPRscan. Both the CFD and MIT scoring methods are available for off-target specificity prediction. The package also provides a Lindel-derived score to predict the probability of a gRNA to produce indels inducing a frameshift for the Cas9 nuclease. Note that DeepHF and enPAM+GB are not available on Windows machines.
Last updated
crisprfunctionalgenomicsfunctionalpredictionbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgenomicsgrnagrna-sequencegrna-sequencesscoring-algorithmsgrnasgrna-design
7.93 score 27 stars 4 dependents 22 scripts 490 downloads
MetaboAnnotation - Utilities for Annotation of Metabolomics Data
High-level functions to assist in annotation of (metabolomics) data sets. These include functions to perform simple tentative annotations based on mass matching but also functions to consider m/z and retention times for annotation of LC-MS features, given that respective reference values are available. In addition, the package provides high-level functions to simplify matching of LC-MS/MS spectra against spectral libraries, as well as objects and functionality to represent and manage such matched data.
Last updated
infrastructuremetabolomicsmassspectrometryannotationmass-spectromtry
7.89 score 20 stars 1 dependents 72 scripts 506 downloads
SpaceTrooper - SpaceTrooper performs Quality Control analysis of Image-Based spatial
SpaceTrooper performs Quality Control analysis using data driven GLM models of Image-Based spatial data, providing exploration plots, QC metrics computation, outlier detection. It implements a GLM strategy for the detection of low quality cells in imaging-based spatial data (Transcriptomics and Proteomics). It additionally implements several plots for the visualization of imaging based polygons through the ggplot2 package.
Last updated
softwaretranscriptomicsgeneexpressionqualitycontrolspatialsinglecelldataimportimmunooncology
7.87 score 11 stars 32 scripts 306 downloads
SpectriPy - Enhancing Cross-Language Mass Spectrometry Data Analysis with R and Python
The SpectriPy package allows integration of Python-based MS analysis code with the Spectra package. Spectra objects can be converted into Python MS data structures. In addition, SpectriPy integrates and wraps the similarity scoring and processing/filtering functions from the Python matchms package into R.
Last updated
infrastructuremetabolomicsmassspectrometryproteomicsmass-spectrometrypythonquarto
7.85 score 13 stars 28 scripts 172 downloads
spacexr - SpatialeXpressionR: Cell Type Identification in Spatial Transcriptomics
Spatial-eXpression-R (spacexr) is a package for analyzing cell types in spatial transcriptomics data. This implementation is a fork of the spacexr GitHub repo (https://github.com/dmcable/spacexr), adapted to work with Bioconductor objects. The original package implements two statistical methods: RCTD for learning cell types and CSIDE for inferring cell type-specific differential expression. Currently, this fork only implements RCTD, which learns cell type profiles from annotated RNA sequencing (RNA-seq) reference data and uses these profiles to identify cell types in spatial transcriptomic pixels while accounting for platform-specific effects. Future releases will include an implementation of CSIDE.
Last updated
geneexpressiondifferentialexpressionsinglecellrnaseqsoftwarespatialtranscriptomics
7.84 score 3 stars 816 scripts 635 downloads
hermes - Preprocessing, analyzing, and reporting of RNA-seq data
Provides classes and functions for quality control, filtering, normalization and differential expression analysis of pre-processed `RNA-seq` data. Data can be imported from `SummarizedExperiment` as well as `matrix` objects and can be annotated from `BioMart`. Filtering of genes by minimum expression or by required annotations is supported, as is filtering of samples by sufficient correlation to other samples or by total number of reads. The standard normalization methods including cpm, rpkm and tpm can be used, and `DESeq2` as well as voom differential expression analyses are available.
Last updated
rnaseqdifferentialexpressionnormalizationpreprocessingqualitycontrolrna-seqstatistical-engineering
7.80 score 12 stars 1 dependents 45 scripts 456 downloads
PhyloProfile - PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Last updated
softwarevisualizationdatarepresentationmultiplecomparisonfunctionalpredictiondimensionreductionbioinformaticsheatmapinteractive-visualizationsorthologsphylogenetic-profileshiny
7.71 score 38 stars 12 scripts 521 downloads
scMultiSim - Simulation of Multi-Modality Single Cell Data Guided By Gene Regulatory Networks and Cell-Cell Interactions
scMultiSim simulates paired single cell RNA-seq, single cell ATAC-seq and RNA velocity data, while incorporating mechanisms of gene regulatory networks, chromatin accessibility and cell-cell interactions. It allows users to tune various parameters controlling the amount of each biological factor, variation of gene-expression levels, the influence of chromatin accessibility on RNA sequence data, and so on. It can be used to benchmark various computational methods for single cell multi-omics data, and to assist in experimental design of wet-lab experiments.
Last updated
singlecelltranscriptomicsgeneexpressionsequencingexperimentaldesign
7.58 score 68 stars 35 scripts 332 downloads
biocmake - CMake for Bioconductor
Manages the installation of CMake for building Bioconductor packages. This avoids the need for end-users to manually install CMake on their system. No action is performed if a suitable version of CMake is already available.
Last updated
infrastructure
7.55 score 1 stars 350 dependents 4 scripts 1.1k downloads
MSstatsShiny - MSstats GUI for Statistical Analysis of Proteomics Experiments
MSstatsShiny is an R-Shiny graphical user interface (GUI) integrated with the R packages MSstats, MSstatsTMT, and MSstatsPTM. It provides a point-and-click end-to-end analysis pipeline applicable to a wide variety of experimental designs. These include data-dependent acquisitions (DDA) which are label-free or tandem mass tag (TMT)-based, as well as DIA, SRM, and PRM acquisitions and those targeting post-translational modifications (PTMs). The application automatically saves users' selections and builds an R script that recreates their analysis, supporting reproducible data analysis.
Last updated
immunooncologymassspectrometryproteomicssoftwareshinyappsdifferentialexpressiononechanneltwochannelnormalizationqualitycontrolgui
7.52 score 20 stars 8 scripts 362 downloads
MGnifyR - R interface to EBI MGnify metagenomics resource
Utility package to facilitate integration and analysis of EBI MGnify data in R. The package can be used to import microbial data into, for instance, TreeSummarizedExperiment (TreeSE). In TreeSE format, the data is directly compatible with the miaverse framework.
Last updated
infrastructuredataimportmetagenomicsmicrobiomemicrobiomedata
7.49 score 23 stars 37 scripts 312 downloads
gemma.R - A wrapper for Gemma's Restful API to access curated gene expression data and differential expression analyses
Low- and high-level wrappers for Gemma's RESTful API. They enable access to curated expression and differential expression data from over 10,000 published studies. Gemma is a web site, database and a set of tools for the meta-analysis, re-use and sharing of genomics data, currently primarily targeted at the analysis of gene expression profiles.
Last updated
softwaredataimportmicroarraysinglecellthirdpartyclientdifferentialexpressiongeneexpressionbayesianannotationexperimentaldesignnormalizationbatcheffectpreprocessingbioinformaticsgemmagenomicstranscriptomics
7.45 score 10 stars 42 scripts 389 downloads
crisprBase - Base functions and classes for CRISPR gRNA design
Provides S4 classes for general nucleases, CRISPR nucleases, CRISPR nickases, and base editors. Several CRISPR-specific genome arithmetic functions are implemented to help extract genomic coordinates of spacer and protospacer sequences. Commonly used CRISPR nuclease objects are provided that can be readily used in other packages. Both DNA- and RNA-targeting nucleases are supported.
Last updated
crisprfunctionalgenomicsbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequences
7.39 score 5 stars 6 dependents 91 scripts 475 downloads
baySeq - Empirical Bayesian analysis of patterns of differential expression in count data
This package identifies differential expression in high-throughput 'count' data, such as that derived from next-generation sequencing machines, calculating estimated posterior likelihoods of differential expression (or more complex hypotheses) via empirical Bayesian methods.
Last updated
sequencingdifferentialexpressionmultiplecomparisonsagebayesiancoverage
7.39 score 3 dependents 82 scripts 1.0k downloads
DeconvoBuddies - Helper Functions for LIBD Deconvolution
Helper functions for the LIBD deconvolution project. Includes tools for marker finding with mean ratio, expression plotting, and plotting deconvolution results. Working to include DLPFC datasets.
Last updated
softwaresinglecellrnaseqgeneexpressiontranscriptomicsexperimenthubsoftwarebioconductordeconvolution
7.36 score 8 stars 48 scripts 273 downloads
simona - Semantic Similarity on Bio-Ontologies
This package implements infrastructures for ontology analysis by offering efficient data structures, fast ontology traversal methods, and elegant visualizations. It provides a robust toolbox supporting over 70 methods for semantic similarity analysis.
Last updated
softwareannotationgobiomedicalinformaticscpp
7.35 score 18 stars 2 dependents 44 scripts 1.2k downloads
psichomics - Graphical Interface for Alternative Splicing Quantification, Analysis and Visualisation
Interactive R package with an intuitive Shiny-based graphical interface for alternative splicing quantification and integrative analyses of alternative splicing and gene expression based on The Cancer Genome Atlas (TCGA), the Genotype-Tissue Expression project (GTEx), Sequence Read Archive (SRA) and user-provided data. The tool interactively performs survival, dimensionality reduction and median- and variance-based differential splicing and gene expression analyses that benefit from the incorporation of clinical and molecular sample-associated features (such as tumour stage or survival). Interactive visual access to genomic mapping and functional annotation of selected alternative splicing events is also included.
Last updated
sequencingrnaseqalternativesplicingdifferentialsplicingtranscriptionguiprincipalcomponentsurvivalbiomedicalinformaticstranscriptomicsimmunooncologyvisualizationmultiplecomparisongeneexpressiondifferentialexpressionalternative-splicingbioconductordata-analysesdifferential-gene-expressiondifferential-splicing-analysisgene-expressiongtexrecount2rna-seq-datasplicing-quantificationsratcgavast-toolscpp
7.30 score 37 stars 34 scripts 418 downloads
ggsc - Visualizing Single Cell and Spatial Transcriptomics
Useful functions to visualize single cell and spatial data. It supports visualizing 'Seurat', 'SingleCellExperiment' and 'SpatialExperiment' objects through grammar of graphics syntax implemented in 'ggplot2'.
Last updated
dimensionreductiongeneexpressionsinglecellsoftwarespatialtranscriptomicsvisualizationopenblascppopenmp
7.30 score 51 stars 37 scripts 308 downloads
gDRutils - A package with helper functions for processing drug response data
This package contains utility functions used throughout the gDR platform to fit data, manipulate data, and convert and validate data structures. This package also has the necessary default constants for the gDR platform. Many of the functions are utilized by the gDRcore package.
Last updated
softwareinfrastructure
7.30 score 2 stars 3 dependents 10 scripts 286 downloads
gDRcore - Processing functions and interface to process and analyze drug dose-response data
This package contains core functions to process and analyze drug response data. The package provides tools for normalizing, averaging, and calculating gDR metrics data. All core functions are wrapped into the pipeline function, allowing the data to be analyzed in a straightforward way.
Last updated
softwareshinyappscpp
7.25 score 2 stars 1 dependents 12 scripts 296 downloads
mariner - Mariner: Explore the Hi-Cs
Tools for manipulating paired ranges and working with Hi-C data in R. Functionality includes manipulating/merging paired regions, generating paired ranges, extracting/aggregating interactions from `.hic` files, and visualizing the results. Designed for compatibility with plotgardener for visualization.
Last updated
functionalgenomicsvisualizationhic
7.24 score 12 stars 193 scripts 284 downloads
TnT - Interactive Visualization for Genomic Features
An R interface to the TnT javascript library (https://github.com/tntvis) to provide interactive and flexible visualization of track-based genomic data.
Last updated
infrastructurevisualizationbioconductorgenome-browserhtmlwidgetsshiny
7.23 score 15 stars 19 scripts 392 downloads
lfa - Logistic Factor Analysis for Categorical Data
Logistic Factor Analysis is a method for a PCA analogue on Binomial data via estimation of latent structure in the natural parameter. The main method estimates genetic population structure from genotype data. There are also methods for estimating individual-specific allele frequencies using the population structure. Lastly, a structured Hardy-Weinberg equilibrium (HWE) test is developed, which quantifies the goodness of fit of the genotype data to the estimated population structure, via the estimated individual-specific allele frequencies (all of which generalizes traditional HWE tests).
Last updated
snpdimensionreductionprincipalcomponentregressionopenblas
7.23 score 16 stars 1 dependents 59 scripts 697 downloads
AlphaMissenseR - Accessing AlphaMissense Data Resources in R
The AlphaMissense publication <https://www.science.org/doi/epdf/10.1126/science.adg7492> outlines how a variant of AlphaFold / DeepMind was used to predict missense variant pathogenicity. Supporting data on Zenodo <https://zenodo.org/record/10813168> include, for instance, 71M variants across hg19 and hg38 genome builds. The 'AlphaMissenseR' package allows ready access to the data, downloading individual files to DuckDB databases for exploration and integration into *R* and *Bioconductor* workflows.
Last updated
snpannotationfunctionalgenomicsstructuralpredictiontranscriptomicsvariantannotationgenepredictionimmunooncology
7.19 score 13 stars 15 scripts 374 downloads
BulkSignalR - Infer Ligand-Receptor Interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics
Inference of ligand-receptor (LR) interactions from bulk expression (transcriptomics/proteomics) data, or spatial transcriptomics. BulkSignalR bases its inferences on the LRdb database included in our other package, SingleCellSignalR, available from Bioconductor. It relies on a statistical model that is specific to bulk data sets. Different visualization and data summary functions are provided to help navigate prediction results.
Last updated
networkrnaseqsoftwareproteomicstranscriptomicsnetworkinferencespatial
7.19 score 27 stars 1 dependents 16 scripts 452 downloads
immApex - Tools for Adaptive Immune Receptor Sequence-Based Machine and Deep Learning
A set of tools for machine and deep learning in R from amino acid and nucleotide sequences, focusing on adaptive immune receptors. The package includes pre-processing of sequences, unifying gene nomenclature usage, encoding sequences, and combining models. This package will serve as the basis of future immune receptor sequence functions/packages/models compatible with the scRepertoire ecosystem.
Last updated
softwareimmunooncologysinglecellclassificationannotationsequencingmotifannotationcppopenmp
7.18 score 14 stars 3 dependents 12 scripts 505 downloads
koinar - KoinaR - Remote machine learning inference using Koina
A client to simplify fetching predictions from the Koina web service. Koina is a model repository enabling the remote execution of models. Predictions are generated as a response to HTTP/S requests, the standard protocol used for nearly all web traffic.
Last updated
massspectrometryproteomicsinfrastructuresoftwarebioinformaticsdeep-learningmachine-learningmass-spectrometrypython
7.13 score 53 stars 4 scripts 284 downloads
ImageArray - A framework for on-disk and in-memory image arrays
ImageArray provides a framework for on-disk and in-memory image arrays, specifically for pyramidal images stored in HDF5, Zarr and life sciences image file formats (OME Bio-Formats).
Last updated
softwarevisualization
7.09 score 6 stars 2 dependents 17 scripts 131 downloads
miaSim - Microbiome Data Simulation
Microbiome time series simulation with generalized Lotka-Volterra model, Self-Organized Instability (SOI), and other models. Hubbell's Neutral model is used to determine the abundance matrix. The resulting abundance matrix is applied to (Tree)SummarizedExperiment objects.
Last updated
microbiomesoftwaresequencingdnaseqatacseqcoveragenetwork
7.07 score 22 stars 30 scripts 354 downloads
RCX - R package implementing the Cytoscape Exchange (CX) format
Create, handle, validate, visualize and convert networks in the Cytoscape exchange (CX) format to standard data types and objects. The package also provides conversion to and from objects of iGraph and graphNEL. The CX format is also used by the NDEx platform, an online commons for biological networks, and the network visualization software Cytoscape.
Last updated
pathwaysdataimportnetwork
7.00 score 8 stars 1 dependents 21 scripts 338 downloads
tidyomics - Easily install and load the tidyomics ecosystem
The tidyomics ecosystem is a set of packages for ’omic data analysis that work together in harmony; they share common data representations and API design, consistent with the tidyverse ecosystem. The tidyomics package is designed to make it easy to install and load core packages from the tidyomics ecosystem with a single command.
Last updated
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicscytometrygenomicstidyverse
7.00 score 75 stars 22 scripts 260 downloads
immReferent - An Interface for Immune Receptor and HLA Gene Reference Data
Provides a consistent interface for downloading, storing, and accessing immune receptor (TCR/BCR) and HLA sequences from IMGT, IPD-IMGT/HLA, and OGRDB (AIRR-C). Supports export to popular analysis tools including MiXCR, TRUST4, Cell Ranger, and IgBLAST. This package serves as a core dependency for immunogenomics packages, ensuring reliable and high-quality sequence access with local caching for reproducibility.
Last updated
softwareannotationsequencing
6.99 score 9 stars 4 dependents 6 scripts 164 downloads
GloScope - Population-level Representation on scRNA-Seq data
This package aims at representing and summarizing the entire single-cell profile of a sample. It allows researchers to perform important bioinformatic analyses at the sample level, such as visualization and quality control. The main functions estimate sample distributions, calculate statistical divergence among samples, and visualize the distance matrix through MDS plots.
Last updated
datarepresentationqualitycontrolrnaseqsequencingsoftwaresinglecell
6.96 score 7 stars 87 scripts 308 downloads
Ibex - Methods for BCR single-cell embedding
Implementation of the Ibex algorithm for single-cell embedding based on BCR sequences. The package includes a standalone function to encode BCR sequence information by amino acid properties or sequence order using a tensorflow-based autoencoder. In addition, the package interacts with SingleCellExperiment or Seurat data objects.
Last updated
softwareimmunooncologysinglecellclassificationannotationsequencing
6.95 score 27 stars 17 scripts 182 downloads
dominoSignal - Cell Communication Analysis for Single Cell RNA Sequencing
dominoSignal is a package developed to analyze cell signaling through ligand - receptor - transcription factor networks in scRNAseq data. It takes as input transcriptomic data, requiring counts, z-scored counts, and cluster labels, as well as information on transcription factor activation (such as from SCENIC) and a database of ligand and receptor pairings (such as from CellPhoneDB). This package creates an object storing ligand - receptor - transcription factor linkages by cluster and provides several methods for exploring, summarizing, and visualizing the analysis.
Last updated
systemsbiologysinglecelltranscriptomicsnetwork
6.93 score 6 stars 34 scripts 272 downloads
NetPathMiner - NetPathMiner for Biological Network Construction, Path Mining and Visualization
NetPathMiner is a general framework for network path mining using genome-scale networks. It constructs networks from KGML, SBML and BioPAX files, providing three network representations: metabolic, reaction and gene. NetPathMiner finds active paths and applies machine learning methods to summarize found paths for easy interpretation. It also provides static and interactive visualizations of networks and paths to aid manual investigation.
Last updated
graphandnetworkpathwaysnetworkclusteringclassificationlibsbmllibxml2openblascpp
6.85 score 9 stars 1 dependents 11 scripts 414 downloads
scLANE - Model Gene Expression Dynamics with Spline-Based NB GLMs, GEEs, & GLMMs
Our scLANE model uses truncated power basis spline models to build flexible, interpretable models of single cell gene expression over pseudotime or latent time. The modeling architectures currently supported are Negative-binomial GLMs, GEEs, & GLMMs. Downstream analysis functionalities include model comparison, dynamic gene clustering, smoothed counts generation, gene set enrichment testing, & visualization.
Last updated
rnaseqsoftwareclusteringtimecoursesequencingregressionsinglecellvisualizationgeneexpressiontranscriptomicsgenesetenrichmentdifferentialexpressiondifferential-expressionestimating-equationsgenomicsmixed-modelspseudotimerna-velocityscrna-seqsingle-celltrajectory-inferencecpp
6.83 score 16 stars 38 scripts 224 downloads
Chromatograms - Infrastructure for Chromatographic Mass Spectrometry Data
The Chromatograms package defines an efficient infrastructure for storing and handling chromatographic mass spectrometry data. It provides different implementations of *backends* to store and represent the data. Such backends can be optimized for small memory footprint or fast data access/processing. A lazy evaluation queue and chunk-wise processing capabilities ensure efficient analysis of even very large data sets.
Last updated
infrastructuremetabolomicsmassspectrometryproteomics
6.82 score 2 stars 1 dependents 26 scripts 197 downloads
CaDrA - Candidate Driver Analysis
Performs both stepwise and backward heuristic search for candidate (epi)genetic drivers based on a binary multi-omics dataset. CaDrA's main objective is to identify features which, together, are significantly skewed or enriched pertaining to a given vector of continuous scores (e.g. sample-specific scores representing a phenotypic readout of interest, such as protein expression, pathway activity, etc.), based on the union occurrence (i.e. logical OR) of the events.
Last updated
microarrayrnaseqgeneexpressionsoftwarefeatureextraction
6.81 score 24 stars 10 scripts 283 downloads
SpotSweeper - Spatially-aware quality control for spatial transcriptomics
Spatially-aware quality control (QC) software for both spot-level and artifact-level QC in spot-based spatial transcriptomics, such as 10x Visium. These methods calculate local (nearest-neighbors) mean and variance of standard QC metrics (library size, unique genes, and mitochondrial percentage) to identify outlier spots and large technical artifacts.
Last updated
softwarespatialtranscriptomicsqualitycontrolgeneexpressionbioconductorquality-controlspatial-transcriptomics
6.81 score 16 stars 134 scripts 384 downloads
spatialFDA - A Tool for Spatial Multi-sample Comparisons
spatialFDA is a package to calculate spatial statistics metrics. The package takes a SpatialExperiment object and calculates spatial statistics metrics using the package spatstat. Then it compares the resulting functions across samples/conditions using functional additive models as implemented in the package refund. Furthermore, it provides exploratory visualisations using functional principal component analysis, also as implemented in refund.
Last updated
softwarespatialtranscriptomics
6.81 score 8 stars 25 scripts 274 downloads
RiboCrypt - Interactive visualization in genomics
R package for interactive visualization and browsing of NGS data. It contains a browser for both transcript and genomic coordinate views. In addition, QC and general metaplots are included, among them differential translation plots and gene expression plots. The package is still under development.
Last updated
softwaresequencingriboseqrnaseq
6.77 score 6 stars 33 scripts 272 downloads
miaTime - Microbiome Time Series Analysis
The `miaTime` package provides tools for microbiome time series analysis based on (Tree)SummarizedExperiment infrastructure.
Last updated
microbiomesoftwaresequencing
6.72 score 7 stars 36 scripts 213 downloads
MSstatsConvert - Import Data from Various Mass Spectrometry Signal Processing Tools to MSstats Format
MSstatsConvert provides tools for importing reports from Mass Spectrometry data processing tools into an R format suitable for statistical analysis using the MSstats and MSstatsTMT packages.
Last updated
massspectrometryproteomicssoftwaredataimportqualitycontrolcpp
6.71 score 8 dependents 33 scripts 1.0k downloads
MsBackendMsp - Mass Spectrometry Data Backend for NIST msp Files
Mass spectrometry (MS) data backend supporting import and handling of MS/MS spectra from NIST MSP Format (msp) files. Import of data from files with different MSP *flavours* is supported. Objects from this package add support for MSP files to Bioconductor's Spectra package. This package is thus not supposed to be used without the Spectra package that provides a complete infrastructure for MS data handling.
Last updated
infrastructureproteomicsmassspectrometrymetabolomicsdataimportmass-spectrometry
6.69 score 5 stars 2 dependents 41 scripts 622 downloads
igblastr - User-friendly R Wrapper to IgBLAST
The igblastr package provides functions to conveniently install and use a local IgBLAST installation from within R. The package also includes a set of built-in IgBLAST-compatible germline databases from OGRDB, the AIRR Community’s Open Germline Receptor Database, for various organisms. It provides functions to create additional IgBLAST-compatible germline databases using reference sequences retrieved from IMGT/V-QUEST or local FASTA files supplied by the user. When possible, annotations for the V and J alleles in a new germline database are automatically computed and added to the database, so they can be used as replacements for the internal and auxiliary data shipped with IgBLAST. IgBLAST is described at <https://pubmed.ncbi.nlm.nih.gov/23671333/>. IgBLAST web interface: <https://www.ncbi.nlm.nih.gov/igblast/>. OGRDB: <https://ogrdb.airr-community.org/>. IMGT/V-QUEST download site: <https://www.imgt.org/download/V-QUEST/>.
Last updated
immunologyimmunogeneticsimmunooncologycellbiologybioconductor-package
6.66 score 4 stars 21 scripts 214 downloads
SPONGE - Sparse Partial Correlations On Gene Expression
This package provides methods to efficiently detect competitive endogenous RNA interactions between two genes. Such interactions are mediated by one or several miRNAs, so both gene and miRNA expression data for a large number of samples are needed as input. The SPONGE package now also includes spongEffects: ceRNA modules offer patient-specific insights into the miRNA regulatory landscape.
Last updated
geneexpressiontranscriptiongeneregulationnetworkinferencetranscriptomicssystemsbiologyregressionrandomforestmachinelearning
6.64 score 1 dependents 49 scripts 423 downloads
SpaNorm - Spatially-aware normalisation for spatial transcriptomics data
This package implements the spatially aware library size normalisation algorithm, SpaNorm. SpaNorm normalises out library size effects while retaining biology through the modelling of smooth functions for each effect. Normalisation is performed in a gene- and cell-/spot- specific manner, yielding library size adjusted data.
Last updated
softwaregeneexpressiontranscriptomicsspatialcellbiology
6.64 score 18 stars 23 scripts 332 downloads
BEclear - Correction of batch effects in DNA methylation data
Provides functions to detect and correct for batch effects in DNA methylation data. The core function is based on latent factor models and can also be used to predict missing values in any other matrix containing real numbers.
Last updated
batcheffectdnamethylationsoftwarepreprocessingstatisticalmethodbatch-effectsbioconductor-packagedna-methylationlatent-factor-modelmethylationmissing-datamissing-valuesstochastic-gradient-descentcpp
6.64 score 5 stars 16 scripts 408 downloads
SuperCellCyto - SuperCell For Cytometry Data
SuperCellCyto provides the ability to summarise cytometry data into supercells by merging together cells that are similar in their marker expressions using the SuperCell package.
Last updated
cellbiologyflowcytometrysoftwaresinglecellbioinformaticscomputational-biologycytometry
6.63 score 13 stars 22 scripts 214 downloads
dar - Differential Abundance Analysis by Consensus
Differential abundance testing in microbiome data challenges both parametric and non-parametric statistical methods, due to its sparsity, high variability and compositional nature. Microbiome-specific statistical methods often assume classical distribution models or take into account compositional specifics. These produce results that range within the specificity vs sensitivity space in such a way that type I and type II errors are difficult to ascertain in real microbiome data when a single method is used. Recently, a consensus approach based on multiple differential abundance (DA) methods was suggested in order to increase robustness. With dar, you can use dplyr-like pipeable sequences of DA methods and then apply different consensus strategies. In this way we can obtain more reliable results in a fast, consistent and reproducible way.
Last updated
softwaresequencingmicrobiomemetagenomicsmultiplecomparisonnormalizationbioconductorbiomarker-discoverydifferential-abundance-analysisfeature-selectionmicrobiologyphyloseq
6.62 score 6 stars 11 scripts 265 downloads
SpatialExperimentIO - Read in Xenium, CosMx, MERSCOPE or STARmapPLUS data as SpatialExperiment object
Read in imaging-based spatial transcriptomics technology data. Currently available modules are for Xenium by 10X Genomics, CosMx by Nanostring, MERSCOPE by Vizgen, and STARmapPLUS from Broad Institute. You can choose to read the data in as a SpatialExperiment or a SingleCellExperiment object.
Last updated
datarepresentationdataimportinfrastructuretranscriptomicssinglecellspatialgeneexpression
6.59 score 19 stars 1 dependents 69 scripts 296 downloads
markeR - An R Toolkit for Evaluating Gene Signatures as Phenotypic Markers
markeR is an R package that provides a modular and extensible framework for the systematic evaluation of gene sets as phenotypic markers using transcriptomic data. The package is designed to support both quantitative analyses and visual exploration of gene set behaviour across experimental and clinical phenotypes. It implements multiple methods, including score-based and enrichment approaches, and also allows the exploration of expression behaviour of individual genes. In addition, users can assess the similarity of their own gene sets against established collections (e.g., those from MSigDB), facilitating biological interpretation.
Last updated
geneexpressiontranscriptomicsvisualizationsoftwaregenesetenrichmentclassificationgene-expressiongene-setsgene-signaturesphenotypesrna-seq-data
6.56 score 9 stars 21 scripts 190 downloads
cliqueMS - Annotation of Isotopes, Adducts and Fragmentation Adducts for in-Source LC/MS Metabolomics Data
Annotates data from liquid chromatography coupled to mass spectrometry (LC/MS) metabolomics experiments. Based on a network algorithm (O. Senan, A. Aguilar-Mogas, M. Navarro, O. Yanes, R. Guimerà and M. Sales-Pardo, Bioinformatics, 35(20), 2019), 'CliqueMS' builds a weighted similarity network where nodes are features and edges are weighted according to the similarity of these features. Then it searches for the most plausible division of the similarity network into cliques (fully connected components). Finally it annotates metabolites within each clique, obtaining for each annotated metabolite the neutral mass and their features, corresponding to isotopes, ionization adducts and fragmentation adducts of that metabolite.
Last updated
metabolomicsmassspectrometrynetworknetworkinferencecpp
6.54 score 12 stars 32 scripts 412 downloads
treeclimbR - An algorithm to find optimal signal levels in a tree
The arrangement of hypotheses in a hierarchical structure appears in many research fields and often indicates different resolutions at which data can be viewed. This raises the question of which resolution level the signal should best be interpreted on. treeclimbR provides a flexible method to select optimal resolution levels (potentially different levels in different parts of the tree), rather than cutting the tree at an arbitrary level. treeclimbR uses a tuning parameter to generate candidate resolutions and from these selects the optimal one.
Last updated
statisticalmethodcellbasedassays
6.53 score 20 stars 56 scripts 240 downloads
poem - POpulation-based Evaluation Metrics
This package provides a comprehensive set of external and internal evaluation metrics. It includes metrics for assessing partitions or fuzzy partitions derived from clustering results, as well as for evaluating subpopulation identification results within embeddings or graph representations. Additionally, it provides metrics for comparing spatial domain detection results against ground truth labels, and tools for visualizing spatial errors.
Last updated
dimensionreductionclusteringgraphandnetworkspatialatacseqsinglecellrnaseqsoftwarevisualization
6.52 score 11 stars 25 scripts 242 downloads
coMethDMR - Accurate identification of co-methylated and differentially methylated regions in epigenome-wide association studies
coMethDMR identifies genomic regions associated with continuous phenotypes by optimally leveraging covariations among CpGs within predefined genomic regions. Instead of testing all CpGs within a genomic region, coMethDMR carries out an additional step that selects co-methylated sub-regions first without using any outcome information. Next, coMethDMR tests association between methylation within the sub-region and continuous phenotype using a random coefficient mixed effects model, which models both variations between CpG sites within the region and differential methylation simultaneously.
Last updated
dnamethylationepigeneticsmethylationarraydifferentialmethylationgenomewideassociation
6.51 score 7 stars 46 scripts 338 downloads
LimROTS - LimROTS: A Hybrid Method Integrating Empirical Bayes and Reproducibility-Optimized Statistics for Robust Differential Expression Analysis
Differential expression analysis is commonly used to study diverse biological datasets. The reproducibility-optimized test statistic (ROTS) (Elo et al., 2008, <doi:10.1109/tcbb.2007.1078>) uses a modified t-statistic to prioritise features that differ between two or more groups. However, the ROTS Bioconductor implementation (Suomi et al., 2017, <doi:10.1371/journal.pcbi.1005562>) did not accommodate technical or biological covariates. LimROTS (Anwar et al., 2025, <doi:10.1093/bioinformatics/btaf570>) addressed this limitation by combining a reproducibility-optimized test statistic with the limma empirical Bayes approach (Ritchie et al., 2015, <doi:10.1093/nar/gkv007>). This enables the analysis of more complex experimental designs and the incorporation of covariates.
Last updated
softwaregeneexpressiondifferentialexpressionmicroarrayrnaseqproteomicsimmunooncologymetabolomicsmrnamicroarray
6.48 score 4 stars 28 scripts 300 downloads
TaxSEA - Taxon Set Enrichment Analysis
TaxSEA is an R package for Taxon Set Enrichment Analysis, which uses a Kolmogorov-Smirnov test to investigate differential abundance analysis output for alterations in a priori defined sets of taxa drawn from public databases (BugSigDB, MiMeDB, GutMGene, mBodyMap, BacDive and GMRepoV2) and collated from the literature. TaxSEA takes as input a list of taxonomic identifiers (e.g. species names, NCBI IDs etc.) and a rank (e.g. fold change, correlation coefficient). TaxSEA can be applied to any microbiota taxonomic profiling technology (array-based, 16S rRNA gene sequencing, shotgun metagenomics & metatranscriptomics etc.) and enables researchers to rapidly contextualize their findings within the broader literature to accelerate interpretation of results.
Last updated
microbiomemetagenomicssequencinggenesetenrichmentrnaseq
6.48 score 10 stars 7 scripts 246 downloads
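The TaxSEA entry above describes testing whether a predefined set of taxa is shifted within a ranked list using a Kolmogorov-Smirnov test. The following is a minimal Python sketch of that general idea only, not TaxSEA's implementation or API: the function names `ks_statistic` and `taxon_set_score`, and the example taxa, are hypothetical, and only the KS statistic (not a p-value) is computed.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The maximum gap can only occur at an observed value.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def taxon_set_score(ranks, taxon_set):
    """Compare rank values of taxa inside a set against all other taxa.

    `ranks` maps taxon name -> rank value (e.g. a log fold change).
    Hypothetical helper for illustration; assumes both groups are non-empty.
    """
    in_set = [v for t, v in ranks.items() if t in taxon_set]
    out_set = [v for t, v in ranks.items() if t not in taxon_set]
    if not in_set or not out_set:
        raise ValueError("need at least one taxon inside and outside the set")
    return ks_statistic(in_set, out_set)


# Made-up example values, for illustration only.
example_ranks = {"taxon_A": 2.0, "taxon_B": 1.5, "taxon_C": -0.5, "taxon_D": -1.0}
example_score = taxon_set_score(example_ranks, {"taxon_A", "taxon_B"})
```

A statistic near 1 means the set's rank values are almost completely separated from the rest; a value near 0 means the two distributions are similar.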
tidySpatialExperiment - SpatialExperiment with tidy principles
tidySpatialExperiment provides a bridge between the SpatialExperiment package and the tidyverse ecosystem. It creates an invisible layer that allows you to interact with a SpatialExperiment object as if it were a tibble, enabling the use of functions from dplyr, tidyr, ggplot2 and plotly. Underneath, your data remains a SpatialExperiment object.
Last updated
infrastructurernaseqgeneexpressionsequencingspatialtranscriptomicssinglecell
6.46 score 8 stars 1 dependents 16 scripts 358 downloads
pathlinkR - Analyze and interpret RNA-Seq results
pathlinkR is an R package designed to facilitate analysis of RNA-Seq results. Specifically, our aim with pathlinkR was to provide a number of tools which take a list of DE genes and perform different analyses on them, aiding with the interpretation of results. Functions are included to perform pathway enrichment, with multiple databases supported, and tools for visualizing these results. Genes can also be used to create and plot protein-protein interaction networks, all from inside of R.
Last updated
genesetenrichmentnetworkpathwaysreactomernaseqnetworkenrichmentbioinformaticsnetworkspathway-enrichment-analysisvisualization
6.45 score 31 stars 5 scripts 296 downloads
DspikeIn - Estimating Absolute Abundance from Microbial Spike-in Controls
Provides a reproducible and modular workflow for absolute microbial quantification using spike-in controls. Supports both single spike-in taxa and synthetic microbial communities with user-defined spike-in volumes and genome copy numbers. Compatible with 'phyloseq' and 'TreeSummarizedExperiment' (TSE) data structures. The package implements methods for spike-in validation, preprocessing, scaling factor estimation, absolute abundance conversion, bias correction, and normalization. Facilitates downstream statistical analyses with 'DESeq2', 'edgeR', and other Bioconductor-compatible methods. Visualization tools are provided via 'ggplot2', 'ggtree', and related packages. Includes detailed vignettes, case studies, and function-level documentation to guide users through experimental design, quantification, and interpretation.
Last updated
microbiomepreprocessingqualitycontroldifferentialexpressionnormalizationsequencingvisualizationphylogeneticsexperimentaldesigndataimportsoftwareabsolute-abundancesamplicon-sequencinggene-copiesphylogenetic-treesqiime2quantifyingspike-instransformationwhole-cell
6.44 score 17 stars 36 scripts 219 downloads
RAIDS - Robust Ancestry Inference using Data Synthesis
This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequence sources (RNA, Exome and Whole-Genome sequences). This package also implements a simulation algorithm that generates synthetic cancer-derived data. This code and analysis pipeline was designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.
Last updated
geneticssoftwaresequencingwholegenomeprincipalcomponentgeneticvariabilitydimensionreductionbiocviewsancestrycancer-genomicsexome-sequencinggenomicsinferencer-languagerna-seqrna-sequencingwhole-genome-sequencing
6.44 score 6 stars 19 scripts 252 downloads
CalibraCurve - Calibration curves for targeted proteomics, lipidomics and metabolomics data
CalibraCurve is a computational tool designed to generate calibration curves for targeted mass spectrometry-based quantitative data. It is applicable to various omics disciplines, including proteomics, lipidomics, and metabolomics. The package also offers functionalities for data and calibration curve visualization and concentration prediction from new datasets based on the established curves.
Last updated
proteomicslipidomicsmetabolomicsregressionmassspectrometryvisualizationassayscalibrationdynamic-linear-rangesquantification-limitstargeted-proteomics
6.38 score 5 stars 7 scripts 217 downloads
sosta - A package for the analysis of anatomical tissue structures in spatial omics data
sosta (Spatial Omics STructure Analysis) is a package for analyzing spatial omics data to explore tissue organization at the anatomical structure level. It reconstructs anatomically relevant structures based on molecular features or cell types. It further calculates a range of metrics at the structure level to quantitatively describe tissue architecture. The package is designed to integrate with other packages for the analysis of spatial omics data.
Last updated
softwarespatialtranscriptomicsvisualization
6.37 score 7 stars 14 scripts 258 downloads
ClustIRR - Clustering of Immune Receptor Repertoires
ClustIRR analyzes repertoires of B- and T-cell receptors. It starts by identifying communities of immune receptors with similar specificities, based on the sequences of their complementarity-determining regions (CDRs). Next, it employs a Bayesian probabilistic model to quantify differential community occupancy (DCO) between repertoires, allowing the identification of expanding or contracting communities in response to e.g. infection or cancer treatment.
Last updated
clusteringimmunooncologysinglecellsoftwareclassificationbayesianbiomedicalinformaticsmathematicalbiologyb-cell-receptorbioinformaticsimmunoinformaticsimmunologyquantitative-methodsrep-seqrepertoire-analysist-cell-receptorcpp
6.36 score 5 stars 11 scripts 278 downloads
nipalsMCIA - Multiple Co-Inertia Analysis via the NIPALS Method
Computes Multiple Co-Inertia Analysis (MCIA), a joint dimensionality reduction (jDR) algorithm, for a multi-block dataset using a modification to the Nonlinear Iterative Partial Least Squares method (NIPALS) proposed in (Hanafi et al., 2010). Allows multiple options for row- and table-level preprocessing, and speeds up computation of variance explained. Vignettes detail application to bulk and single-cell multi-omics studies.
Last updated
softwareclusteringclassificationmultiplecomparisonnormalizationpreprocessingsinglecell
6.30 score 7 stars 16 scripts 308 downloads
scider - Spatial cell-type inter-correlation by density in R
scider is a user-friendly R package providing functions to model the global density of cells in a slide of spatial transcriptomics data. All functions in the package are built on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. After modelling density, the package allows for several downstream analyses, including colocalization analysis, boundary detection analysis and differential density analysis.
Last updated
spatialtranscriptomicscppopenjdk
6.30 score 11 stars 13 scripts 309 downloads
iscream - Make fast and memory efficient BED file queries, summaries and matrices
BED files store ranged genomic data that can be queried even when the files are compressed. iscream can query data from BED files and return them in multiple formats: parsed records or their summary statistics as data frames or GenomicRanges objects, and matrices as matrix, GenomicRanges, or SummarizedExperiment objects. iscream also provides specialized support for importing methylation data.
Last updated
dataimportsoftwaresequencingsinglecelldnamethylationbedrcpptabixwgbscurlbzip2xz-utilszlibcppopenmp
6.29 score 12 scripts 212 downloads
BgeeCall - Automatic RNA-Seq present/absent gene expression calls generation
BgeeCall generates present/absent gene expression calls without using an arbitrary cutoff like TPM<1. Calls are generated based on reference intergenic sequences. These sequences are generated based on expression of all RNA-Seq libraries of each species integrated in Bgee (https://bgee.org).
Last updated
softwaregeneexpressionrnaseqbiologygene-expressiongene-levelintergenic-regionspresent-absent-callsrna-seqrna-seq-librariesscrna-seq
6.28 score 4 stars 7 scripts 448 downloads
gDRstyle - A package with style requirements for the gDR suite
This package serves as a helper package for the whole gDR suite. It supports good development practices by keeping style requirements and style tests for other packages. It also contains build helpers to ensure that all package requirements are met.
Last updated
softwareinfrastructure
6.28 score 3 stars 7 scripts 272 downloads
PRONE - The PROteomics Normalization Evaluator
High-throughput omics data are often affected by systematic biases introduced throughout all the steps of a clinical study, from sample collection to quantification. Normalization methods aim to adjust for these biases to make the actual biological signal more prominent. However, selecting an appropriate normalization method is challenging due to the wide range of available approaches. Therefore, a comparative evaluation of unnormalized and normalized data is essential in identifying an appropriate normalization strategy for a specific data set. This R package provides different functions for preprocessing, normalizing, and evaluating different normalization approaches. Furthermore, normalization methods can be evaluated on downstream steps, such as differential expression analysis and statistical enrichment analysis. Spike-in data sets with known ground truth and real-world data sets of biological experiments acquired by either tandem mass tag (TMT) or label-free quantification (LFQ) can be analyzed.
Last updated
proteomicspreprocessingnormalizationdifferentialexpressionvisualizationdata-analysisevaluation
6.28 score 8 stars 9 scripts 326 downloads
iNETgrate - Integrates DNA methylation data with gene expression in a single gene network
The iNETgrate package provides functions to build a correlation network in which nodes are genes. DNA methylation and gene expression data are integrated to define the connections between genes. This network is used to identify modules (clusters) of genes. The biological information in each of the resulting modules is represented by an eigengene. These biological signatures can be used as features e.g., for classification of patients into risk categories. The resulting biological signatures are very robust and give a holistic view of the underlying molecular changes.
Last updated
geneexpressionrnaseqdnamethylationnetworkinferencenetworkgraphandnetworkbiomedicalinformaticssystemsbiologytranscriptomicsclassificationclusteringdimensionreductionprincipalcomponentmrnamicroarraynormalizationgenepredictionkeggsurvivalcore-services
6.26 score 76 stars 1 scripts 297 downloads
BiocHail - basilisk and hail
Use hail via basilisk when appropriate, or via reticulate. This package can be used in terra.bio to interact with UK Biobank resources processed by hail.is.
Last updated
infrastructurebioconductorgeneticshail
6.26 score 6 stars 25 scripts 168 downloads
crisprBowtie - Bowtie-based alignment of CRISPR gRNA spacer sequences
Provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bowtie. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Both DNA- and RNA-targeting nucleases are supported.
Last updated
crisprfunctionalgenomicsalignmentalignerbioconductorbioconductor-packagebowtiecrispr-analysiscrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequencessgrnasgrna-design
6.24 score 3 stars 4 dependents 16 scripts 420 downloads
rawDiag - Brings Orbitrap Mass Spectrometry Data to Life; Fast and Colorful
Optimizing methods for liquid chromatography coupled to mass spectrometry (LC-MS) poses a nontrivial challenge. The rawDiag package facilitates rational method optimization by generating MS operator-tailored diagnostic plots of scan-level metadata. The package is designed for use in the R shell or as a Shiny application on the Orbitrap instrument PC.
Last updated
massspectrometryproteomicsmetabolomicsinfrastructuresoftwareshinyappsfastmass-spectrometrymultiplatformorbitrapvisualization
6.23 score 37 stars 23 scripts 252 downloads
Pirat - Precursor or Peptide Imputation under Random Truncation
Pirat enables the imputation of missing values (either MNARs or MCARs) in bottom-up LC-MS/MS proteomics data using a penalized maximum likelihood strategy. It does not require any parameter tuning, and it models the instrument censorship from the data available. It accounts for correlations between sibling peptides and can leverage complementary transcriptomics measurements.
Last updated
proteomicsmassspectrometrypreprocessingsoftwareprostar2
6.20 score 3 stars 9 scripts 278 downloads
AnVILBase - Generic functions for interacting with the AnVIL ecosystem
Provides generic functions for interacting with the AnVIL ecosystem. Packages that use either GCP or Azure in AnVIL are built on top of AnVILBase. Extension packages will provide methods for interacting with other cloud providers.
Last updated
softwareinfrastructureu24hg010263
6.20 score 17 dependents 77 scripts 768 downloads
faers - R interface for FDA Adverse Event Reporting System
The FDA Adverse Event Reporting System (FAERS) is a database used for the spontaneous reporting of adverse events and medication errors related to human drugs and therapeutic biological products. The faers package serves as the interface between the FAERS database and R. Furthermore, the faers package offers a standardized approach for performing pharmacovigilance analysis.
Last updated
softwaredataimportbiomedicalinformaticspharmacogenomicsadverse-eventsdrug-safetyfaersfaers-procedurepharmacovigilancesignal-detection
6.19 score 47 stars 11 scripts 346 downloads
PLSDAbatch - PLSDA-batch
A novel framework to correct for batch effects prior to any downstream analysis in microbiome data based on Projection to Latent Structures Discriminant Analysis. The main method is named “PLSDA-batch”. It first estimates treatment and batch variation with latent components, then subtracts batch-associated components from the data whilst preserving biological variation of interest. PLSDA-batch is highly suitable for microbiome data as it is non-parametric, multivariate and allows for ordination and data visualisation. Combined with centered log-ratio transformation for addressing uneven library sizes and compositional structure, PLSDA-batch addresses all characteristics of microbiome data that existing correction methods have ignored so far. Two other variants are proposed for 1/ unbalanced batch x treatment designs that are commonly encountered in studies with small sample sizes, and for 2/ selection of discriminative variables amongst treatment groups to avoid overfitting in classification problems. These two variants have widened the scope of applicability of PLSDA-batch to different data settings.
Last updated
statisticalmethoddimensionreductionprincipalcomponentclassificationmicrobiomebatcheffectnormalizationvisualization
6.18 score 14 stars 54 scripts 287 downloads
CatsCradle - This package provides methods for analysing spatial transcriptomics data and for discovering gene clusters
This package addresses two broad areas. It allows for in-depth analysis of spatial transcriptomic data by identifying tissue neighbourhoods. These are contiguous regions of tissue surrounding individual cells. 'CatsCradle' allows for the categorisation of neighbourhoods by the cell types contained in them and the genes expressed in them. In particular, it produces Seurat objects whose individual elements are neighbourhoods rather than cells. In addition, it enables the categorisation and annotation of genes by producing Seurat objects whose elements are genes.
Last updated
biologicalquestionstatisticalmethodgeneexpressionsinglecelltranscriptomicsspatial
6.15 score 7 stars 4 scripts 308 downloads
DOtools - Convenient functions to streamline your single cell data analysis workflow
This package provides functions for creating various visualizations, convenient wrappers, and quality-of-life utilities for single cell experiment objects. It offers a streamlined approach to visualize results and integrates different tools for easy use.
Last updated
singlecellrnaseqvisualizationclusteringannotationworkflowstepqualitycontrolgeneexpression
6.14 score 6 stars 22 scripts 186 downloads
anansi - Annotation-Based Analysis of Specific Interactions
Studies including both microbiome and metabolomics data are becoming more common. Often, it would be helpful to integrate both datasets in order to see if they corroborate each other's patterns. All vs all association is imprecise and likely to yield spurious associations. This package takes a knowledge-based approach to constrain association search space, only considering metabolite-function pairs that have been recorded in a pathway database. This package also provides a framework to assess differential association.
Last updated
microbiomemetabolomicsregressionpathwayskegg
6.13 score 10 stars 8 scripts 188 downloads
RFLOMICS - Interactive web application for Omics-data analysis
An R package with a Shiny interface that provides a framework for the analysis of transcriptomics, proteomics and/or metabolomics data. The interface offers a guided experience for the user, from the definition of the experimental design to the integration of several omics tables. A report can be generated with all settings and analysis results.
Last updated
shinyappsdifferentialexpressionmetabolomicsproteomicstranscriptomics
6.13 score 60 scripts 246 downloads
DFplyr - A `DataFrame` (`S4Vectors`) backend for `dplyr`
Provides `dplyr` verbs (`mutate`, `select`, `filter`, etc...) supporting `S4Vectors::DataFrame` objects. Importantly, this is achieved without conversion to an intermediate `tibble`. Adds grouping infrastructure to `DataFrame` which is respected by the transformation verbs.
Last updated
datarepresentationinfrastructuresoftware
6.12 score 21 stars 1 dependents 14 scripts 246 downloads
dandelionR - Single-cell Immune Repertoire Trajectory Analysis in R
dandelionR is an R package for performing single-cell immune repertoire trajectory analysis, based on the original Python implementation. It provides the necessary functions to interface with scRepertoire and a custom implementation of an absorbing Markov chain for pseudotime inference, inspired by the Palantir Python package.
Last updated
softwareimmunooncologysinglecell
6.10 score 12 stars 10 scripts 302 downloads
deconvR - Simulation and Deconvolution of Omic Profiles
This package provides a collection of functions designed for analyzing deconvolution of the bulk sample(s) using an atlas of reference omic signature profiles and a user-selected model. Users are given the option to create or extend a reference atlas and also simulate the desired size of the bulk signature profile of the reference cell types. The package includes the cell-type-specific methylation atlas and Illumina Epic B5 probe IDs that can be used in deconvolution. Additionally, we included BSmeth2Probe to make mapping WGBS data to their probe IDs easier.
Last updated
dnamethylationregressiongeneexpressionrnaseqsinglecellstatisticalmethodtranscriptomicsbioconductor-packagedeconvolutiondna-methylationomics
6.10 score 10 stars 21 scripts 460 downloads
leapR - Layered enrichment analysis of pathways R
leapR is a package that identifies pathways that are enriched across diverse 'omics experiments. It leverages any tabular expression data (proteomics, transcriptomics) using the `SummarizedExperiment` object. It works with any pathway in the .gct file format.
Last updated
genesetenrichmentproteomicspathwaysgeneexpressiontranscriptomics
6.10 score 83 scripts 228 downloads
xCell2 - A Tool for Generic Cell Type Enrichment Analysis
xCell2 provides methods for cell type enrichment analysis using cell type signatures. It includes three main functions - 1. xCell2Train for training custom reference objects from bulk or single-cell RNA-seq datasets. 2. xCell2Analysis for conducting the cell type enrichment analysis using the custom reference. 3. xCell2GetLineage for identifying dependencies between different cell types using ontology.
Last updated
geneexpressiontranscriptomicsmicroarrayrnaseqsinglecelldifferentialexpressionimmunooncologygenesetenrichment
6.09 score 21 stars 29 scripts 366 downloads
HMMcopy - Copy number prediction with correction for GC and mappability bias for HTS data
Corrects GC and mappability biases for readcounts (i.e. coverage) in non-overlapping windows of fixed length for single whole genome samples, yielding a rough estimate of copy number for further analysis. Designed for rapid correction of high coverage whole genome tumour and normal samples.
Last updated
sequencingpreprocessingvisualizationcopynumbervariationmicroarray
6.08 score 1 dependents 135 scripts 836 downloads
mutscan - Preprocessing and Analysis of Deep Mutational Scanning Data
Provides functionality for processing and statistical analysis of multiplexed assays of variant effect (MAVE) and similar data. The package contains functions covering the full workflow from raw FASTQ files to publication-ready visualizations. A broad range of library designs can be processed with a single, unified interface.
Last updated
geneticvariabilitygenomicvariationpreprocessingzlibcppopenmp
6.07 score 14 stars 14 scripts 191 downloads
OSTA.data - OSTA book data
'OSTA.data' is a companion package for the "Orchestrating Spatial Transcriptomics Analysis" (OSTA) with Bioconductor online book. Throughout OSTA, we rely on a set of publicly available datasets that cover different sequencing- and imaging-based platforms, such as Visium, Visium HD, Xenium (10x Genomics) and CosMx (NanoString). In addition, we rely on scRNA-seq (Chromium) data for tasks such as spot deconvolution and label transfer (i.e., supervised clustering). These data have been deposited in an Open Science Framework (OSF) repository, and can be queried and downloaded using functions from the 'osfr' package. For convenience, we have implemented 'OSTA.data' to query and retrieve data from our OSF node, and cache retrieved Zip archives using 'BiocFileCache'.
Last updated
dataimportdatarepresentationexperimenthubsoftwareinfrastructureimmunooncologygeneexpressiontranscriptomicssinglecellspatial
6.06 score 2 stars 96 scripts 359 downloads
CCPlotR - Plots For Visualising Cell-Cell Interactions
CCPlotR is an R package for visualising results from tools that predict cell-cell interactions from single-cell RNA-seq data. These plots are generic and can be used to visualise results from multiple tools such as Liana, CellPhoneDB, NATMI etc.
Last updated
singlecellnetworkvisualizationcellbiologysystemsbiology
6.05 score 47 stars 16 scripts 309 downloads
MSstatsResponse - Statistical Methods for Chemoproteomics Dose-Response Analysis
Tools for detecting drug-protein interactions and estimating IC50 values from chemoproteomics data. Implements semi-parametric isotonic regression, bootstrapping, and curve fitting to evaluate compound effects on protein abundance.
Last updated
proteomicsmassspectrometrystatisticalmethodsoftwareregression
6.03 score 1 stars 1 dependents 15 scripts 200 downloads
dominatR - Feature Dominance-based R Package for Genomic Data
dominatR is an R package for quantifying and visualizing feature dominance in datasets. dominatR applies concepts drawn from physics, such as center of mass and Shannon's entropy, to effectively visualize features (e.g. genes) that are present within a specific context or condition. The package integrates data frames, matrices and SummarizedExperiment objects and is able to perform common genomic normalization methods. The key aspect is the generation of plots that serve to highlight context-relevant feature dominance.
Last updated
visualizationnormalizationclassificationgeneexpression
5.98 score 3 stars 6 scripts 141 downloads
Coralysis - Coralysis sensitive identification of imbalanced cell types and states in single-cell data via multi-level integration
Coralysis is an R package featuring a multi-level integration algorithm for sensitive integration, reference-mapping, and cell-state identification in single-cell data. The multi-level integration algorithm is inspired by the process of assembling a puzzle, where one begins by grouping pieces based on low- to high-level features, such as color and shading, before looking into shape and patterns. This approach progressively blends the batch effects and separates cell types across multiple rounds of divisive clustering.
Last updated
singlecellrnaseqproteomicstranscriptomicsgeneexpressionbatcheffectclusteringannotationclassificationdifferentialexpressiondimensionreductionsoftwaredata-integrationscrna-seq
5.98 score 4 stars 1 scripts 200 downloads
visiumStitched - Enable downstream analysis of Visium capture areas stitched together with Fiji
This package provides helper functions for working with multiple Visium capture areas that overlap each other. This package was developed along with the companion example use case data available from https://github.com/LieberInstitute/visiumStitched_brain. visiumStitched prepares SpaceRanger (10x Genomics) output files so you can stitch the images from groups of capture areas together with Fiji. Then visiumStitched builds a SpatialExperiment object with the stitched data and makes an artificial hexagonal grid, enabling the seamless use of spatial clustering methods that rely on such a grid to identify neighboring spots, such as PRECAST and BayesSpace. The SpatialExperiment objects created by visiumStitched are compatible with spatialLIBD, which can be used to build interactive websites for stitched SpatialExperiment objects. visiumStitched also enables casting SpatialExperiment objects as Seurat objects.
Last updated
softwarespatialtranscriptomicstranscriptiongeneexpressionvisualizationdataimport10xgenomicsbioconductorspatial-transcriptomicsspatialexperimentspatiallibdvisium
5.98 score 4 stars 7 scripts 239 downloads
GeDi - Defining and visualizing the distances between different genesets
The package provides different distance measurements to calculate the difference between genesets. Based on these scores, the genesets are clustered and visualized as a graph. This is all presented in an interactive Shiny application for easy usage.
Last updated
guigenesetenrichmentsoftwaretranscriptionrnaseqvisualizationclusteringpathwaysreportwritinggokeggreactomeshinyapps
5.96 score 2 stars 76 scripts 249 downloads
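As a rough illustration of what a distance between genesets can look like, the Python sketch below computes the Jaccard distance, one common set-based measure. The GeDi entry above does not say which measures the package implements, so treating Jaccard as representative is an assumption, and the names `jaccard_distance` and `pairwise_distances` and the example genesets are hypothetical.

```python
def jaccard_distance(genes_a, genes_b):
    """Jaccard distance: 1 - |intersection| / |union| of two gene collections."""
    a, b = set(genes_a), set(genes_b)
    union = a | b
    if not union:
        # Two empty genesets are treated as identical (assumption).
        return 0.0
    return 1.0 - len(a & b) / len(union)


def pairwise_distances(genesets):
    """Return a dict mapping (name_i, name_j) -> Jaccard distance.

    `genesets` maps a geneset name to an iterable of gene identifiers.
    """
    names = sorted(genesets)
    return {
        (m, n): jaccard_distance(genesets[m], genesets[n])
        for m in names
        for n in names
    }


# Made-up genesets, for illustration only.
example_sets = {
    "set_1": {"TP53", "BRCA1"},
    "set_2": {"TP53", "EGFR"},
    "set_3": {"MYC"},
}
example_distances = pairwise_distances(example_sets)
```

A matrix of such distances could then be passed to any clustering or graph layout step, which is the kind of downstream use the entry describes.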
CellBarcode - Cellular DNA Barcode Analysis toolkit
The package CellBarcode performs Cellular DNA Barcode analysis. It can handle all kinds of DNA barcodes, as long as the barcode is within a single sequencing read and has a pattern that can be matched by a regular expression. CellBarcode can handle barcodes with flexible lengths, with or without UMI (unique molecular identifier). This tool can also be used for pre-processing some amplicon data such as CRISPR gRNA screening, immune repertoire sequencing, and metagenome data.
Last updated
preprocessingqualitycontrolsequencingcrisprampliconamplicon-sequencingcellular-barcodecpp
5.95 score 3 stars 50 scripts 354 downloads
benchdamic - Benchmark of differential abundance methods on microbiome data
Starting from a microbiome dataset (16S or WMS with absolute count values), it is possible to perform several analyses to assess the performance of many differential abundance detection methods. A basic and standardized version of the main differential abundance analysis methods is supplied, but users can also add their own method to the benchmark. The analyses focus on 4 main aspects: i) the goodness of fit of each method's distributional assumptions on the observed count data, ii) the ability to control the false discovery rate, iii) the within and between method concordances, iv) the truthfulness of the findings if any a priori knowledge is given. Several graphical functions are available for result visualization.
Last updated
metagenomicsmicrobiomedifferentialexpressionmultiplecomparisonnormalizationpreprocessingsoftwarebenchmarkdifferential-abundance-methods
5.94 score 10 stars 11 scripts 272 downloads
OmicsMLRepoR - Search harmonized metadata created under the OmicsMLRepo project
This package provides functions to browse the harmonized metadata for large omics databases. This package also supports data navigation if the metadata incorporates ontology.
Last updated
softwareinfrastructuredatarepresentationu24ca289073
5.93 score 2 stars 26 scripts 220 downloads
GeoDiff - Count model based differential expression and normalization on GeoMx RNA data
A series of statistical models using count generating distributions for background modelling, feature and sample QC, normalization and differential expression analysis on GeoMx RNA data. The application of these methods is demonstrated in an example data analysis vignette.
Last updated
geneexpressiondifferentialexpressionnormalizationopenblascppopenmp
5.93 score 9 stars 19 scripts 306 downloads
MetMashR - Metabolite Mashing with R
A package to merge, filter, sort, organise and otherwise mash together metabolite annotation tables. Metabolite annotations can be imported from multiple sources (software) and combined using workflow steps based on S4 class templates derived from the `struct` package. Other modular workflow steps such as filtering, merging, splitting, normalisation and REST API queries are included.
Last updated
workflowstepmetabolomicskegg
5.91 score 3 stars 17 scripts 213 downloads
scRNAseqApp - A single-cell RNAseq Shiny app-package
The scRNAseqApp is a Shiny app package designed for interactive visualization of single-cell data. It is an enhanced version derived from ShinyCell, repackaged to accommodate multiple datasets. The app enables users to visualize data containing various types of information simultaneously, facilitating comprehensive analysis. Additionally, it includes a user management system to regulate database accessibility for different users.
Last updated
visualizationsinglecellrnaseqinteractive-visualizationsmultiple-usersshiny-appssingle-cell-rna-seq
5.91 score 6 stars 5 scripts 330 downloads
QTLExperiment - S4 classes for QTL summary statistics and metadata
QTLExperiment defines an S4 class for storing and manipulating summary statistics from QTL mapping experiments in one or more states. It is based on the 'SummarizedExperiment' class and contains functions for creating, merging, and subsetting objects. 'QTLExperiment' also stores experiment metadata and has checks in place to ensure that transformations apply correctly.
Last updated
functionalgenomicsdataimportdatarepresentationinfrastructuresequencingsnpsoftware
5.90 score 3 stars 1 dependents 25 scripts 264 downloads
conumee - Enhanced copy-number variation analysis using Illumina DNA methylation arrays
This package contains a set of processing and plotting methods for performing copy-number variation (CNV) analysis using Illumina 450k or EPIC methylation arrays.
Last updated
copynumbervariationdnamethylationmethylationarraymicroarraynormalizationpreprocessingqualitycontrolsoftware
5.89 score 39 scripts 577 downloads
plyinteractions - Extending tidy verbs to genomic interactions
Operate on `GInteractions` objects as tabular data using `dplyr`-like verbs. The functions and methods in `plyinteractions` provide a grammatical approach to manipulate `GInteractions`, to facilitate their integration in genomic analysis workflows.
Last updated
softwareinfrastructure
5.88 score 1 dependents 25 scripts 274 downloads
shiny.gosling - A Grammar-based Toolkit for Scalable and Interactive Genomics Data Visualization for R and Shiny
A Grammar-based Toolkit for Scalable and Interactive Genomics Data Visualization. http://gosling-lang.org/. This R package is based on gosling.js. It uses R functions to create gosling plots that can be embedded in R Shiny apps.
Last updated
shinyappsgeneticsvisualization
5.87 score 1 dependents 49 scripts 238 downloads
limpa - Quantification and Differential Analysis of Proteomics Data
Quantification and differential analysis of mass-spectrometry proteomics data, with probabilistic recovery of information from missing values. Avoids the need for imputation. Estimates the detection probability curve (DPC), which relates the probability of successful detection to the underlying log-intensity of each precursor ion, and uses it to incorporate missing values into protein quantification and into subsequent differential expression analyses. The package produces objects suitable for downstream analysis in limma. The package accepts precursor (or peptide) intensities including missing values and produces complete protein quantifications without the need for imputation. The uncertainty of the protein quantifications is propagated through to the limma analyses using variance modeling and precision weights, ensuring accurate error rate control. The analysis pipeline can alternatively work with PTM or protein level data. The package name "limpa" is an acronym for "Linear Models for Proteomics Data".
Last updated
bayesianbiologicalquestiondataimportdifferentialexpressiongeneexpressionmassspectrometrypreprocessingproteomicsregressionsoftwaredifferential-expressionmass-spectrometry
5.86 score 20 stars 20 scripts 394 downloads
CaMutQC - An R Package for Comprehensive Filtration and Selection of Cancer Somatic Mutations
CaMutQC filters false positive mutations generated due to technical issues and selects candidate cancer mutations through a series of well-structured functions that label mutations with various flags. A detailed filter report is provided after a whole filtration or selection section is completed. CaMutQC also integrates several methods and gene panels for Tumor Mutational Burden (TMB) estimation.
Last updated
softwarequalitycontrolgenetargetcancer-genomicssomatic-mutations
5.86 score 8 stars 5 scripts 260 downloads
MetaboDynamics - Bayesian analysis of longitudinal metabolomics data
MetaboDynamics is an R-package that provides a framework of probabilistic models to analyze longitudinal metabolomics data. It enables robust estimation of mean concentrations despite varying spread between timepoints and reports differences between timepoints as well as metabolite specific dynamics profiles that can be used for identifying "dynamics clusters" of metabolites of similar dynamics. Provides probabilistic over-representation analysis of KEGG functional modules and pathways as well as comparison between clusters of different experimental conditions.
Last updated
softwaremetabolomicsbayesianfunctionalpredictionmultiplecomparisonkeggpathwaystimecourseclusteringdynamicsfunctional-analysislongitudinal-analysismetabolomics-datametabolomics-pipelinecpp
5.85 score 5 stars 5 scripts 310 downloads
SEraster - Rasterization Preprocessing Framework for Scalable Spatial Omics Data Analysis
SEraster is a rasterization preprocessing framework that aggregates cellular information into spatial pixels to reduce resource requirements for spatial omics data analysis. SEraster reduces the number of spatial points in spatial omics datasets for downstream analysis through a process of rasterization where single cells’ gene expression or cell-type labels are aggregated into equally sized pixels based on a user-defined resolution. SEraster is built on an R/Bioconductor S4 class called SpatialExperiment. SEraster can be incorporated with other packages to conduct downstream analyses for spatial omics datasets, such as detecting spatially variable genes.
Last updated
softwarespatialgeneexpressiontranscriptomicssinglecellpreprocessingspatial-analysisspatial-data-analysisspatial-omicsspatial-transcriptomics
5.84 score 19 stars 18 scripts 246 downloads
blase - Bulk Linking Analysis for Single-cell Experiments
BLASE is a method for finding where bulk RNA-seq data lies on a single-cell pseudotime trajectory. It uses a fast and understandable approach based on Spearman correlation, with bootstrapping to provide confidence. BLASE can be used to "date" bulk RNA-seq data, annotate cell types in scRNA-seq, and help correct for developmental phenotype differences in bulk RNA-seq experiments.
Last updated
transcriptomicssinglecellsequencinggeneexpressiontranscriptionrnaseqtimecoursecellbiologysoftwarecellbasedassays
5.83 score 1 stars 7 scripts 197 downloads
phantasusLite - Loading and annotation RNA-seq counts matrices
PhantasusLite is a lightweight package with helper functions of general interest extracted from the phantasus package. In particular, it simplifies working with public RNA-seq datasets from GEO by providing access to the remote HSDS repository with the precomputed gene counts from the ARCHS4 and DEE2 projects.
Last updated
geneexpressiontranscriptomicsrnaseq
5.82 score 11 stars 1 dependents 9 scripts 262 downloads
HiCaptuRe - HiCaptuRe: Manipulating and integrating Capture Hi-C data
Capture Hi-C is a set of techniques that enable the detection of genomic interactions involving regions of interest, known as baits. By focusing on selected loci, these approaches reduce sequencing costs while maintaining high resolution at the level of restriction fragments. HiCaptuRe provides tools to import, annotate, manipulate, and export Capture Hi-C data. The package accounts for the specific structure of bait–otherEnd interactions, facilitates integration with other omics datasets, and enables comparison across samples and conditions.
Last updated
epigeneticshicsequencingdataimportsoftware
5.80 score 3 stars 21 scripts 188 downloads
CRISPRball - Shiny Application for Interactive CRISPR Screen Visualization, Exploration, Comparison, and Filtering
A Shiny application for visualization, exploration, comparison, and filtering of CRISPR screens analyzed with MAGeCK RRA or MLE. Features include interactive plots with on-click labeling, full customization of plot aesthetics, data upload and/or download, and much more. Quickly and easily explore your CRISPR screen results and generate publication-quality figures in seconds.
Last updated
softwareshinyappscrisprqualitycontrolvisualizationguicrispr-screendata-visualizationinteractive-visualizationsmageckplotlyscreeningshiny
5.80 score 13 stars 24 scripts 246 downloads
gatom - Finding an Active Metabolic Module in Atom Transition Network
This package implements a metabolic network analysis pipeline to identify an active metabolic module based on high throughput data. The pipeline takes as input transcriptional and/or metabolic data and finds a metabolic subnetwork (module) most regulated between the two conditions of interest. The package further provides functions for module post-processing, annotation and visualization.
Last updated
geneexpressiondifferentialexpressionpathwaysnetwork
5.80 score 8 stars 13 scripts 312 downloads
rhinotypeR - Rhinovirus genotyping
"rhinotypeR" is designed to automate the comparison of sequence data against prototype strains, streamlining the genotype assignment process. By implementing predefined pairwise distance thresholds, this package makes genotype assignment accessible to researchers and public health professionals. This tool enhances our epidemiological toolkit by enabling more efficient surveillance and analysis of rhinoviruses (RVs) and other viral pathogens with complex genomic landscapes. Additionally, "rhinotypeR" supports comprehensive visualization and analysis of single nucleotide polymorphisms (SNPs) and amino acid substitutions, facilitating in-depth genetic and evolutionary studies.
Last updated
sequencinggeneticsphylogeneticsvisualizationmultiplesequencealignmentmultiplecomparison
5.78 score 4 stars 4 scripts 231 downloads
tidyCoverage - Extract and aggregate genomic coverage over features of interest
The `tidyCoverage` framework enables tidy manipulation of collections of genomic tracks and features using `tidySummarizedExperiment` methods. It facilitates the extraction, aggregation and visualization of genomic coverage over individual or thousands of genomic loci, relying on `CoverageExperiment` and `AggregatedCoverage` classes. This accelerates the integration of genomic track data in genomic analysis workflows.
Last updated
softwaresequencingcoverage
5.78 score 24 stars 10 scripts 258 downloads
raer - RNA editing tools in R
Toolkit for identification and statistical testing of RNA editing signals from within R. Provides support for identifying sites from bulk-RNA and single cell RNA-seq datasets, and general methods for extraction of allelic read counts from alignment files. Facilitates annotation and exploratory analysis of editing signals using Bioconductor packages and resources.
Last updated
multiplecomparisonrnaseqsinglecellsequencingcoverageepitranscriptomicsfeatureextractionannotationalignmentbioconductor-packagerna-seq-analysissingle-cell-analysissingle-cell-rna-seqcurlbzip2xz-utilszlib
5.78 score 10 stars 6 scripts 334 downloads
VDJdive - Analysis Tools for 10X V(D)J Data
This package provides functions for handling and analyzing immune receptor repertoire data, such as produced by the CellRanger V(D)J pipeline. This includes reading the data into R, merging it with paired single-cell data, quantifying clonotype abundances, calculating diversity metrics, and producing common plots. It implements the E-M Algorithm for clonotype assignment, along with other methods, which makes use of ambiguous cells for improved quantification.
Last updated
softwareimmunooncologysinglecellannotationrnaseqtargetedresequencingcpp
5.77 score 13 stars 5 scripts 288 downloads
SingleCellAlleleExperiment - S4 Class for Single Cell Data with Allele and Functional Levels for Immune Genes
Defines an S4 class that is based on SingleCellExperiment. In addition to the usual gene layer, the object can also store data for immune genes such as HLAs, Igs and KIRs at allele and functional level. The package is part of a workflow named single-cell ImmunoGenomic Diversity (scIGD), which first incorporates allele-aware quantification data for immune genes. This new data can then be used with the data structure and functionality implemented here for further data handling and analysis.
Last updated
datarepresentationinfrastructuresinglecelltranscriptomicsgeneexpressiongeneticsimmunooncologydataimport
5.76 score 8 stars 12 scripts 260 downloads
hoodscanR - Spatial cellular neighbourhood scanning in R
hoodscanR is a user-friendly R package providing functions to assist cellular neighborhood analysis of any spatial transcriptomics data with single-cell resolution. All functions in the package are built on the SpatialExperiment object, allowing integration with various spatial transcriptomics-related packages from Bioconductor. The package produces cell-level neighborhood annotation output, along with functions to perform neighborhood colocalization analysis and neighborhood-based cell clustering.
Last updated
spatialtranscriptomicssinglecellclusteringcpp
5.68 score 13 stars 37 scripts 342 downloads
beer - Bayesian Enrichment Estimation in R
BEER implements a Bayesian model for analyzing phage-immunoprecipitation sequencing (PhIP-seq) data. Given a PhIPData object, BEER returns posterior probabilities of enriched antibody responses, point estimates for the relative fold-change in comparison to negative control samples, and more. Additionally, BEER provides a convenient implementation for using edgeR to identify enriched antibody responses.
Last updated
softwarestatisticalmethodbayesiansequencingcoveragejagscpp
5.66 score 11 stars 14 scripts 364 downloads
AnVILPublish - Publish Packages and Other Resources to AnVIL Workspaces
Use this package to create or update AnVIL workspaces from resources such as R / Bioconductor packages. The metadata about the package (e.g., select information from the package DESCRIPTION file and from vignette YAML headings) are used to populate the 'DASHBOARD'. Vignettes are translated to python notebooks ready for evaluation in AnVIL.
Last updated
infrastructuresoftwareu24hg010263
5.65 score 1 dependents 2 scripts 316 downloads
LACHESIS - Functions used to analyze early tumor evolution from whole genome sequencing data
This package provides modalities to analyze tumor evolution from whole genome sequencing data. In particular, it provides estimates of mutation densities at genomic segments and uses these to time the origin of the tumor.
Last updated
softwarestatisticalmethodtimecoursesequencingwholegenomesurvivalsomaticmutationtumor-evolutionwhole-genome-sequencing
5.65 score 3 stars 7 scripts 143 downloads
R3CPET - 3CPET: Finding Co-factor Complexes in Chia-PET experiment using a Hierarchical Dirichlet Process
The package provides a method to infer the set of proteins that are more likely to work together to maintain chromatin interactions, given the results of a ChIA-PET experiment.
Last updated
networkinferencegenepredictionbayesiangraphandnetworknetworkgeneexpressionhicchia-petchromatin-interactiondirichlet-process-mixturestranscription-factocpp
5.62 score 4 stars 8 scripts 372 downloads
iModMix - Integrative Modules for Multi-Omics Data
The iModMix network-based method offers an integrated framework for analyzing multi-omics data, including metabolomics, proteomics, and transcriptomics data, enabling the exploration of intricate molecular associations within heterogeneous biological systems.
Last updated
softwarenetworkclusteringvisualizationtranscriptomicsproteomicsmetabolomicsgeneexpressionprincipalcomponentbioinformaticsmultiomics
5.60 score 4 stars 3 scripts 218 downloads
txcutr - Transcriptome CUTteR
Various mRNA sequencing library preparation methods generate sequencing reads specifically from the transcript ends. Analyses that focus on quantification of isoform usage from such data can be aided by using truncated versions of transcriptome annotations, both at the alignment or pseudo-alignment stage, as well as in downstream analysis. This package implements some convenience methods for readily generating such truncated annotations and their corresponding sequences.
Last updated
alignmentannotationrnaseqsequencingtranscriptomics
5.60 score 5 stars 10 scripts 303 downloads
lcmsPlot - Comprehensive Liquid Chromatography-Mass Spectrometry (LC-MS) data visualisation package
lcmsPlot is an R package designed for visualising Liquid Chromatography-Mass Spectrometry (LC-MS) data with publication-ready high-quality plots. The package enables users to generate and customise chromatograms, mass traces, spectra, and more with fine-tuned aesthetics and annotation options.
Last updated
metabolomicsmassspectrometry
5.58 score 1 stars 4 scripts 173 downloads
clustSIGNAL - ClustSIGNAL: a spatial clustering method
clustSIGNAL: clustering of Spatially Informed Gene expression with Neighbourhood Adapted Learning. A tool for adaptively smoothing and clustering gene expression data. clustSIGNAL uses entropy to measure heterogeneity of cell neighbourhoods and performs a weighted, adaptive smoothing, where homogeneous neighbourhoods are smoothed more and heterogeneous neighbourhoods are smoothed less. This not only overcomes data sparsity but also incorporates spatial context into the gene expression data. The resulting smoothed gene expression data is used for clustering and could be used for other downstream analyses.
Last updated
clusteringsoftwaregeneexpressionspatialtranscriptomicssinglecell
5.56 score 6 stars 3 scripts 253 downloads
ReUseData - Reusable and reproducible Data Management
ReUseData is an _R/Bioconductor_ software tool that provides a systematic and versatile approach for standardized and reproducible data management. ReUseData facilitates the transformation of shell or other ad hoc scripts for data preprocessing into workflow-based data recipes. Evaluation of data recipes generates curated data files in their generic formats (e.g., VCF, bed). Both recipes and data are cached using database infrastructure for easy data management and reuse. Prebuilt data recipes are available through the ReUseData portal ("https://rcwl.org/dataRecipes/") with full annotation and user instructions. Pregenerated data are available through the ReUseData cloud bucket and are directly downloadable through "getCloudData()".
Last updated
softwareinfrastructuredataimportpreprocessingimmunooncology
5.56 score 4 stars 7 scripts 249 downloads
SVP - Predicting cell states and their variability in single-cell or spatial omics data
SVP uses the distances between cells, between features, and between cells and features in MCA space to build a nearest-neighbor graph, then uses the random walk with restart algorithm to calculate activity scores for gene sets (such as cell marker genes, KEGG pathways, Gene Ontology terms, gene modules, transcription factor or miRNA target sets, Reactome pathways, ...), which are further weighted using hypergeometric test results from the original expression matrix. To accurately detect spatially variable or single-cell variable gene sets (or other features) and the spatial colocalization between features, SVP provides global and local spatial autocorrelation methods. SVP is built on the SingleCellExperiment class, so it is interoperable with the existing computing ecosystem.
Last updated
singlecellsoftwarespatialtranscriptomicsgenetargetgeneexpressiongenesetenrichmenttranscriptiongokeggopenblascppopenmp
5.56 score 12 stars 6 scripts 331 downloads
mosdef - MOSt frequently used and useful Differential Expression Functions
This package provides functionality to run a number of tasks in the differential expression analysis workflow. This encompasses the most widely used steps, from running various enrichment analysis tools with a unified interface to creating plots and beautifying table components linking to external websites and databases. This streamlines the generation of comprehensive analysis reports.
Last updated
geneexpressionsoftwaretranscriptiontranscriptomicsdifferentialexpressionvisualizationreportwritinggenesetenrichmentgo
5.56 score 4 dependents 1 scripts 518 downloads
TEKRABber - An R package estimates the correlations of orthologs and transposable elements between two species
TEKRABber is made to provide a user-friendly pipeline for comparing orthologs and transposable elements (TEs) between two species. It considers the orthology confidence between two species from BioMart to normalize expression counts and detect differentially expressed orthologs/TEs. Then it provides one to one correlation analysis for desired orthologs and TEs. There is also an app function to have a first insight on the result. Users can prepare orthologs/TEs RNA-seq expression data by their own preference to run TEKRABber following the data structure mentioned in the vignettes.
Last updated
differentialexpressionnormalizationtranscriptiongeneexpressionbioconductorcpp
5.56 score 3 stars 20 scripts 361 downloads
GBScleanR - Error correction tool for noisy genotyping by sequencing (GBS) data
GBScleanR is a package for quality check, filtering, and error correction of genotype data derived from next generation sequencer (NGS) based genotyping platforms. GBScleanR takes a Variant Call Format (VCF) file as input. The main function of this package is `estGeno()`, which estimates the true genotypes of samples from given read counts for genotype markers using a hidden Markov model that incorporates the uneven observation ratio of allelic reads. This implementation gives robust genotype estimation even in the noisy genotype data usually observed in Genotyping-By-Sequencing (GBS) and similar methods, e.g. RADseq. The current implementation accepts genotype data of a diploid population at any generation of a multi-parental cross, e.g. biparental F2 from inbred parents, biparental F2 from outbred parents, and 8-way recombinant inbred lines (8-way RILs), which can be referred to as a MAGIC population.
Last updated
geneticvariabilitysnpgeneticshiddenmarkovmodelsequencingqualitycontrolcpp
5.56 score 4 stars 5 scripts 380 downloads
StabMap - Stabilised mosaic single cell data integration using unshared features
StabMap performs single cell mosaic data integration by first building a mosaic data topology, and for each reference dataset, traverses the topology to project and predict data onto a common embedding. Mosaic data should be provided in a list format, with all relevant features included in the data matrices within each list object. The output of stabMap is a joint low-dimensional embedding taking into account all available relevant features. Expression imputation can also be performed using the StabMap embedding and any of the original data matrices for given reference and query cell lists.
Last updated
singlecelldimensionreductionsoftware
5.55 score 79 scripts 256 downloads
Pedixplorer - Pedigree Functions
Routines to handle family data with a Pedigree object. The initial purpose was to create correlation structures that describe family relationships such as kinship and identity-by-descent, which can be used to model family data in mixed effects models, such as in the coxme function. Also includes a tool for Pedigree drawing which is focused on producing compact layouts without intervention. Recent additions include utilities to trim the Pedigree object with various criteria, and kinship for the X chromosome.
Last updated
softwaredatarepresentationgeneticsgraphandnetworkvisualizationkinshippedigree
5.55 score 7 stars 1 dependents 16 scripts 262 downloads
gypsum - Interface to the gypsum REST API
Client for the gypsum REST API (https://gypsum.artifactdb.com), a cloud-based file store in the ArtifactDB ecosystem. This package provides functions for uploads, downloads, and various administrative and management tasks. Check out the documentation at https://github.com/ArtifactDB/gypsum-worker for more details.
Last updated
dataimport
5.53 score 1 stars 1 dependents 22 scripts 5.2k downloads
knowYourCG - Functional analysis of DNA methylome datasets
KnowYourCG (KYCG) is a supervised learning framework designed for the functional analysis of DNA methylation data. Unlike existing tools that focus on genes or genomic intervals, KnowYourCG directly targets CpG dinucleotides, featuring automated supervised screenings of diverse biological and technical influences, including sequence motifs, transcription factor binding, histone modifications, replication timing, cell-type-specific methylation, and trait-epigenome associations. KnowYourCG addresses the challenges of data sparsity in various methylation datasets, including low-pass Nanopore sequencing, single-cell DNA methylomes, 5-hydroxymethylation profiles, spatial DNA methylation maps, and array-based datasets for epigenome-wide association studies and epigenetic clocks (<doi:10.1126/sciadv.adw3027>).
Last updated
epigeneticsdnamethylationsequencingsinglecellspatialtranscriptionmethylationarrayzlib
5.53 score 7 stars 18 scripts 329 downloads
DuplexDiscovereR - Analysis of the data from RNA duplex probing experiments
DuplexDiscovereR is a package designed for analyzing data from RNA cross-linking and proximity ligation protocols such as SPLASH, PARIS, LIGR-seq, and others. DuplexDiscovereR accepts input in the form of chimerically or split-aligned reads. It includes procedures for alignment classification, filtering, and efficient clustering of individual chimeric reads into duplex groups (DGs). Once DGs are identified, the package predicts RNA duplex formation and their hybridization energies. Additional metrics, such as p-values for the random ligation hypothesis or mean DG alignment scores, can be calculated to rank the final set of RNA duplexes. Data from multiple experiments or replicates can be processed separately and further compared to check the reproducibility of the experimental method.
Last updated
sequencingtranscriptomicsstructuralpredictionclusteringsplicedalignment
5.52 score 3 stars 8 scripts 219 downloads
anglemania - Feature Extraction for scRNA-seq Dataset Integration
anglemania extracts genes from multi-batch scRNA-seq experiments for downstream dataset integration. It shows improvement over the conventional usage of highly-variable genes for many integration tasks. We leverage gene-gene correlations that are stable across batches to identify biologically informative genes which are less affected by batch effects. Currently, its main use is for single-cell RNA-seq dataset integration, but it can be applied for other multi-batch downstream analyses such as NMF.
Last updated
singlecellbatcheffectmultiplecomparisonfeatureextractioncpp
5.51 score 4 stars 2 scripts 222 downloads
MotifPeeker - Benchmarking Epigenomic Profiling Methods Using Motif Enrichment
MotifPeeker is used to compare and analyse datasets from epigenomic profiling methods with motif enrichment as the key benchmark. The package outputs an HTML report consisting of three sections: (1. General Metrics) Overview of peaks-related general metrics for the datasets (FRiP scores, peak widths and motif-summit distances). (2. Known Motif Enrichment Analysis) Statistics for the frequency of user-provided motifs enriched in the datasets. (3. Motif Discovery Enrichment Analysis) Statistics for the frequency of ab-initio discovered motifs enriched in the datasets and compared with known motifs.
Last updated
epigeneticsgeneticsqualitycontrolchipseqmultiplecomparisonfunctionalgenomicsmotifdiscoverysequencematchingsoftwarealignmentbioconductorbioconductor-packagechip-seqepigenomicsinteractive-reportmotif-enrichment-analysis
5.51 score 2 stars 7 scripts 228 downloads
demuxSNP - scRNAseq demultiplexing using cell hashing and SNPs
This package assists in demultiplexing scRNAseq data using both cell hashing and SNP data. The SNP profile of each group is learned using high confidence assignments from the cell hashing data. Cells which cannot be assigned with high confidence from the cell hashing data are assigned to their most similar group based on their SNPs. We also provide helper functions to optimise SNP selection, create training data and merge SNP data into the SingleCellExperiment framework.
Last updated
classificationsinglecell
5.49 score 9 stars 23 scripts 277 downloads
jazzPanda - Finding spatially relevant marker genes in image based spatial transcriptomics data
This package contains functions to find marker genes for image-based spatial transcriptomics data. There are functions to create spatial vectors from the cell and transcript coordinates, which are passed as inputs to find marker genes. Marker genes are detected for every cluster by two approaches. The first approach is permutation testing, which is implemented in parallel for finding marker genes in a one-sample study. The other approach is to build a linear model for every gene. This approach can account for multiple samples and background noise.
Last updated
spatialgeneexpressiondifferentialexpressionstatisticalmethodtranscriptomicscorrelationlinear-modelsmarker-genesspatial-transcriptomics
5.48 score 4 stars 3 scripts 331 downloads
enrichViewNet - From functional enrichment results to biological networks
This package enables the visualization of functional enrichment results as network graphs. First, the package enables the visualization of enrichment results, in a format corresponding to the one generated by gprofiler2, as a customizable Cytoscape network. In those networks, both gene datasets (GO terms/pathways/protein complexes) and genes associated with the datasets are represented as nodes, while the edges connect each gene to its dataset(s). The package also provides the option to create enrichment maps from functional enrichment results. Enrichment maps enable the visualization of enriched terms as a network with edges connecting overlapping genes.
Last updated
biologicalquestionsoftwarenetworknetworkenrichmentgocystocapefunctional-enrichment
5.48 score 6 stars 8 scripts 319 downloads
GenomicPlot - Plot profiles of next generation sequencing data in genomic features
Visualization of next generation sequencing (NGS) data is essential for interpreting high-throughput genomics experiment results. 'GenomicPlot' facilitates plotting of NGS data in various formats (bam, bed, wig and bigwig); both coverage and enrichment over input can be computed and displayed with respect to genomic features (such as UTR, CDS, enhancer), and user defined genomic loci or regions. Statistical tests on signal intensity within user defined regions of interest can be performed and represented as boxplots or bar graphs. Parallel processing is used to speed up computation on multicore platforms. In addition to genomic plots, which are suitable for displaying coverage of genomic DNA (such as ChIPseq data), metagenomic (without introns) plots can also be made for RNAseq or CLIPseq data.
Last updated
alternativesplicingchipseqcoveragegeneexpressionrnaseqsequencingsoftwaretranscriptionvisualizationannotation
5.48 score 6 stars 7 scripts 390 downloads
MultiRNAflow - An R package for integrated analysis of temporal RNA-seq data with multiple biological conditions
The MultiRNAflow R package provides an easy-to-use, unified framework for automatically performing both unsupervised and supervised (DE) analysis of datasets with an arbitrary number of biological conditions and time points. In particular, the code performs a deep downstream analysis of DE information, e.g. identifying temporal patterns across biological conditions and DE genes that are specific to a biological condition at each time point.
Last updated
sequencingrnaseqgeneexpressiontranscriptiontimecoursepreprocessingvisualizationnormalizationprincipalcomponentclusteringdifferentialexpressiongenesetenrichmentpathways
5.45 score 7 stars 6 scripts 303 downloads
limpca - An R package for the linear modeling of high-dimensional designed data based on ASCA/APCA family of methods
This package aims to provide linear models for high-dimensional designed data. limpca applies a GLM (General Linear Model) version of ASCA and APCA to analyse multivariate sample profiles generated by an experimental design. ASCA/APCA provide powerful visualization tools for multivariate structures in the space of each effect of the statistical model linked to the experimental design and, contrary to MANOVA, can deal with multivariate datasets having more variables than observations. The method can handle unbalanced designs.
Last updated
statisticalmethodprincipalcomponentregressionvisualizationexperimentaldesignmultiplecomparisongeneexpressionmetabolomics
5.43 score 2 stars 5 scripts 246 downloads
gcatest - Genotype Conditional Association TEST
GCAT is an association test for genome wide association studies that controls for population structure under a general class of trait models. This test conditions on the trait, which makes it immune to confounding by unmodeled environmental factors. Population structure is modeled via logistic factors, which are estimated using the `lfa` package.
Last updated
snpdimensionreductionprincipalcomponentgenomewideassociation
5.43 score 6 stars 5 scripts 345 downloads
broadSeq - broadSeq : for streamlined exploration of RNA-seq data
This package helps users to easily perform RNA-seq data analysis with multiple methods (which usually need many different input formats). The user provides the expression data as a SummarizedExperiment object and gets results from different methods. This helps the user to quickly evaluate different methods.
Last updated
geneexpressiondifferentialexpressionrnaseqtranscriptomicssequencingcoveragegenesetenrichmentgo
5.43 score 9 stars 8 scripts 290 downloads
methyLImp2 - Missing value estimation of DNA methylation data
This package estimates missing values in DNA methylation data. The methyLImp method is based on linear regression, since methylation levels show a high degree of inter-sample correlation. The implementation is parallelised over chromosomes, since probes on different chromosomes are usually independent. A mini-batch approach is available to reduce the runtime when the number of samples is large.
Last updated
dnamethylationmicroarraysoftwaremethylationarrayregressionimputationmethylationmissing-value-imputation
5.42 score 8 stars 11 scripts 278 downloads
easylift - An R package to perform genomic liftover
The easylift package provides a convenient tool for genomic liftover operations between different genome assemblies. It seamlessly works with Bioconductor's GRanges objects and chain files from the UCSC Genome Browser, allowing for straightforward handling of genomic ranges across various genome versions. One noteworthy feature of easylift is its integration with the BiocFileCache package. This integration automates the management and caching of chain files necessary for liftover operations. Users no longer need to manually specify chain file paths in their function calls, reducing the complexity of the liftover process.
Last updated
softwareworkflowstepsequencingcoveragegenomeassemblydataimport
5.41 score 8 stars 16 scripts 282 downloads
CDI - Clustering Deviation Index (CDI)
Single-cell RNA-sequencing (scRNA-seq) is widely used to explore cellular variation. The analysis of scRNA-seq data often starts from clustering cells into subpopulations. This initial step has a high impact on downstream analyses, and hence it is important that it be accurate. However, no unsupervised metric has been designed for scRNA-seq to evaluate clustering performance. Hence, we propose the clustering deviation index (CDI), an unsupervised metric based on the modeling of scRNA-seq UMI counts to evaluate clustering of cells.
Last updated
singlecellsoftwareclusteringvisualizationsequencingrnaseqcellbasedassays
5.40 score 5 stars 5 scripts 288 downloads
SanityR - R/Bioconductor interface to the Sanity model gene expression analysis
A Bayesian normalization procedure derived from first principles. Sanity estimates expression values and associated error bars directly from raw unique molecular identifier (UMI) counts without any tunable parameters.
Last updated
softwaregeneexpressionsinglecellnormalizationbayesiancpp
5.38 score 4 stars 6 scripts 177 downloads
BreastSubtypeR - Cohort-aware methods for intrinsic molecular subtyping of breast cancer
BreastSubtypeR provides an assumption-aware, multi-method framework for intrinsic molecular subtyping of breast cancer. The package harmonizes several published nearest-centroid (NC) and single-sample predictor (SSP) classifiers, supplies method-specific preprocessing and robust probe-to-gene mapping, and implements a cohort-aware AUTO mode that selectively enables classifiers compatible with the cohort composition. A local Shiny app (iBreastSubtypeR) is included for interactive analyses and to support users without programming experience.
Last updated
rnaseqsoftwaregeneexpressionclassificationpreprocessingvisualization
5.38 score 4 stars 12 scripts 254 downloads
CleanUpRNAseq - Detect and Correct Genomic DNA Contamination in RNA-seq Data
RNA-seq data generated by some library preparation methods, such as rRNA-depletion-based methods and the SMART-seq method, might be contaminated by genomic DNA (gDNA) if DNase I digestion is not performed properly during RNA preparation. CleanUpRNAseq was developed to check whether RNA-seq data suffer from gDNA contamination. If so, it can correct for gDNA contamination and reduce the false discovery rate of differentially expressed genes.
Last updated
qualitycontrolsequencinggeneexpression
5.38 score 8 stars 4 scripts 273 downloads
BERT - High Performance Data Integration for Large-Scale Analyses of Incomplete Omic Profiles Using Batch-Effect Reduction Trees (BERT)
Provides efficient batch-effect adjustment of data with missing values. BERT arranges all batch-effect corrections into a tree of pairwise computations. BERT allows parallelization over sub-trees.
Last updated
batcheffectpreprocessingexperimentaldesignqualitycontrolbatch-effectbioconductor-packagebioinformaticsdata-integrationdata-sciencenature-communications
5.38 score 4 stars 20 scripts 241 downloads
MouseFM - In-silico methods for genetic finemapping in inbred mice
This package provides methods for genetic finemapping in inbred mice by taking advantage of their very high homozygosity rate (>95%).
Last updated
geneticssnpgenetargetvariantannotationgenomicvariationmultiplecomparisonsystemsbiologymathematicalbiologypatternlogicgenepredictionbiomedicalinformaticsfunctionalgenomicsfinemapgene-candidatesinbred-miceinbred-strainsmouseqtlqtl-mapping
5.38 score 5 scripts 312 downloads
iSEEindex - iSEE extension for a landing page to a custom collection of data sets
This package provides an interface to any collection of data sets within a single iSEE web-application. The main functionality of this package is to define a custom landing page allowing app maintainers to list a custom collection of data sets that users can select from and directly load objects into an iSEE web-application.
Last updated
softwareinfrastructurebioconductorhacktoberfest
5.37 score 2 stars 13 scripts 290 downloads
omXplore - Visualization tools for 'omics' datasets with R
This package contains a collection of functions (written as shiny modules) for the visualisation and the statistical analysis of omics data. These plots can be displayed individually or embedded in a global Shiny module. Additionally, it is possible to integrate third party modules into the main interface of the package omXplore.
Last updated
softwareshinyappsmassspectrometrydatarepresentationguiqualitycontrolprostar2
5.36 score 38 scripts 302 downloads
dnaEPICO - dnaEPICO: Analysis Pipeline for Illumina DNA Methylation Array Data
A modular and reproducible workflow for preprocessing and analysing Illumina DNA methylation array data from the EPICv2, EPIC, and 450K platforms. The package integrates quality control, probe filtering, cell-type deconvolution, phenotype preparation, generalised linear models, linear mixed-effects models, and automated report generation. It builds on established Bioconductor infrastructure and wraps commonly used tools including 'minfi', 'ENmix', and 'wateRmelon', with support for both local execution and high-performance computing workflows.
Last updated
softwarepreprocessingmethylationarrayqualitycontrolepigeneticsmicroarraystatisticalmethodchiponchip450k-arraydna-array-methylationepic-arrayepicv2-arrayglm2illuminalmermethylationminfistatistics
5.35 score 1 stars 5 scripts 60 downloads
stPipe - Upstream pre-processing for Sequencing-Based Spatial Transcriptomics
This package serves as an upstream pipeline for pre-processing sequencing-based spatial transcriptomics data. Functions include FASTQ trimming, BAM file reformatting, index building, spatial barcode detection, demultiplexing, gene count matrix generation with UMI deduplication, QC, and relevant visualization. A config is an essential input for most of the functions, with the aim of improving reproducibility.
Last updated
immunooncologysoftwaresequencingrnaseqgeneexpressionsinglecellvisualizationsequencematchingpreprocessingqualitycontrolgenomeannotationdataimportspatialtranscriptomicsclusteringcurlopensslzlibcpp
5.35 score 5 stars 3 scripts 297 downloads
goatea - Interactive Exploration of GSEA by the GOAT Method
Geneset Ordinal Association Test Enrichment Analysis (GOATEA) provides a 'Shiny' interface with interactive visualizations and utility functions for performing and exploring automated gene set enrichment analysis using the 'GOAT' package. 'GOATEA' is designed to support large-scale and user-friendly enrichment workflows across multiple gene lists and comparisons, with flexible plotting and output options. Visualizations pre-enrichment include interactive 'Volcano' and 'UpSet' (overlap) plots. Visualizations post-enrichment include interactive geneset dotplot, geneset treeplot, gene-effectsize heatmap, gene-geneset heatmap and 'STRING' database of protein-protein-interactions network graph. 'GOAT' reference: Frank Koopmans (2024) <doi:10.1038/s42003-024-06454-5>.
Last updated
genesetenrichmentnetworkenrichmentvisualizationshinyappsguitranscriptomicsgeneticsfunctionalgenomicsdifferentialexpressionnetwork
5.34 score 2 stars 5 scripts 56 downloads
ggtreeSpace - Visualizing Phylomorphospaces using 'ggtree'
This package is a comprehensive visualization tool specifically designed for exploring phylomorphospace. It not only simplifies the process of generating a phylomorphospace, but also enhances it with the capability to add graphic layers to the plot using the grammar of graphics to create fully annotated phylomorphospaces. It also provides some utilities to help interpret evolutionary patterns.
Last updated
annotationvisualizationphylogeneticssoftware
5.32 score 5 stars 14 scripts 234 downloads
TMSig - Tools for Molecular Signatures
The TMSig package contains tools to prepare, analyze, and visualize named lists of sets, with an emphasis on molecular signatures (such as gene or kinase sets). It includes fast, memory efficient functions to construct sparse incidence and similarity matrices and filter, cluster, invert, and decompose sets. Additionally, bubble heatmaps can be created to visualize the results of any differential or molecular signatures analysis.
Last updated
clusteringgenesetenrichmentgraphandnetworkpathwaysvisualizationgene-setsmolecular-signatures
5.31 score 4 stars 51 scripts 251 downloads
MetaDICT - Microbiome data integration method via shared dictionary learning
MetaDICT is a method for the integration of microbiome data. This method is designed to remove batch effects and preserve biological variation while integrating heterogeneous datasets. MetaDICT can better avoid overcorrection when unobserved confounding variables are present.
Last updated
microbiomebatcheffectsequencingclusteringsoftware
5.30 score 5 stars 8 scripts 182 downloads
STADyUM - Statistical Transcriptome Analysis under a Dynamic Unified Model
STADyUM is a package with functionality for analyzing nascent RNA read counts to infer transcription rates. This includes utilities for processing experimental nascent RNA read counts as well as for simulating PRO-seq data. Rates such as initiation, pause release and landing pad occupancy are estimated from either synthetic or experimental data. There are also options for varying pause sites and including steric hindrance of initiation in the model.
Last updated
statisticalmethodtranscriptomicstranscriptionsequencingcpp
5.28 score 1 stars 2 scripts 292 downloads
scDotPlot - Cluster a Single-cell RNA-seq Dot Plot
Dot plots of single-cell RNA-seq data allow for an examination of the relationships between cell groupings (e.g. clusters) and marker gene expression. The scDotPlot package offers a unified approach to perform a hierarchical clustering analysis and add annotations to the columns and/or rows of a scRNA-seq dot plot. It works with SingleCellExperiment and Seurat objects as well as data frames.
Last updated
softwarevisualizationdifferentialexpressiongeneexpressiontranscriptionrnaseqsinglecellsequencingclustering
5.28 score 7 stars 18 scripts 302 downloads
BioCartaImage - BioCarta Pathway Images
The core functionality of the package is to provide coordinates of genes on the BioCarta pathway images and to provide methods to add self-defined graphics to the genes of interest.
Last updated
softwarepathwaysbiocartavisualization
5.26 score 11 stars 11 scripts 258 downloads
peakCombiner - The R package to curate and merge enriched genomic regions into consensus peak sets
peakCombiner is a fully R-based, user-friendly, transparent, and customizable tool that allows even novice R users to create a high-quality consensus peak list. The modularity of its functions allows an easy way to optimize input and output data. A broad range of accepted input data formats can be used to create a consensus peak set that can be exported to a file or used as the starting point for most downstream peak analyses.
Last updated
workflowsteppreprocessingchiponchip
5.26 score 6 stars 1 scripts 186 downloads
SETA - Single Cell Ecological Taxonomic Analysis
Tools for compositional and other sample-level ecological analyses and visualizations tailored for single-cell RNA-seq data. SETA includes functions for taxonomizing celltypes, normalizing data, performing statistical tests, and visualizing results. Several tutorials are included to guide users and introduce them to key concepts. SETA is meant to teach users about statistical concepts underlying ecological analysis methods so they can apply them to their own single-cell data.
Last updated
singlecelltranscriptomicsrnaseqgeneexpressionstatisticalmethoddimensionreductionvisualizationnormalizationdatarepresentationsystemsbiology
5.26 score 4 scripts 228 downloads
SurfR - Surface Protein Prediction and Identification
Identify Surface Protein coding genes from a list of candidates. Systematically download data from GEO and TCGA or use your own data. Perform DGE on bulk RNAseq data. Perform Meta-analysis. Descriptive enrichment analysis and plots.
Last updated
softwaresequencingrnaseqgeneexpressiontranscriptiondifferentialexpressionprincipalcomponentgenesetenrichmentpathwaysbatcheffectfunctionalgenomicsvisualizationdataimportfunctionalpredictiongenepredictiongodgeenrichment-analysismetaanalysisplotsproteinspublic-datasurfacesurfaceome
5.26 score 6 stars 4 scripts 306 downloads
CNVMetrics - Copy Number Variant Metrics
The CNVMetrics package calculates similarity metrics to facilitate copy number variant comparison among samples and/or methods. Similarity metrics can be employed to compare CNV profiles of genetically unrelated samples as well as those with a common genetic background. Some metrics are based on the shared amplified/deleted regions while other metrics rely on the level of amplification/deletion. The data type used as input is a plain text file containing the genomic position of the copy number variations, as well as the status and/or the log2 ratio values. Finally, a visualization tool is provided to explore resulting metrics.
Last updated
biologicalquestionsoftwarecopynumbervariationcnvcopy-number-variationmetricsr-language
5.26 score 4 stars 8 scripts 352 downloads
posDemux - Positional combinatorial sequence demultiplexer
Demultiplexing and filtering utilities intended for reads with combinatorial barcodes (e.g. PETRI-seq and SPLiT-seq). The demultiplexer algorithm uses the position of the segments to extract and compare the barcodes with the reference (whitelist). A Shiny application is provided to interactively select cutoffs for which barcode combinations to keep.
Last updated
sequencematchingsequencingsoftwarernaseqcpp
5.23 score 7 scripts 55 downloads
IsoBayes - IsoBayes: Single Isoform protein inference Method via Bayesian Analyses
IsoBayes is a Bayesian method to perform inference on single protein isoforms. Our approach infers the presence/absence of protein isoforms, and also estimates their abundance; additionally, it provides a measure of the uncertainty of these estimates, via: i) the posterior probability that a protein isoform is present in the sample; ii) a posterior credible interval of its abundance. IsoBayes takes liquid chromatography mass spectrometry (MS) data as input, and can work with both PSM counts and intensities. When available, transcript isoform abundances (i.e., TPMs) are also incorporated: TPMs are used to formulate an informative prior for the respective protein isoform relative abundance. We further identify isoforms where the relative abundances of proteins and transcripts differ significantly. We use a two-layer latent variable approach to model two sources of uncertainty typical of MS data: i) peptides may be erroneously detected (even when absent); ii) many peptides are compatible with multiple protein isoforms. In the first layer, we sample the presence/absence of each peptide based on its estimated probability of being mistakenly detected, also known as PEP (i.e., posterior error probability). In the second layer, for peptides that were estimated as being present, we allocate their abundance across the protein isoforms they map to. These two steps allow us to recover the presence and abundance of each protein isoform.
Last updated
statisticalmethodbayesianproteomicsmassspectrometryalternativesplicingsequencingrnaseqgeneexpressiongeneticsvisualizationsoftwarecpp
5.23 score 8 stars 14 scripts 299 downloads
TSAR - Thermal Shift Analysis in R
This package automates the analysis workflow for Thermal Shift Analysis (TSA) data. It processes, analyzes, and visualizes data through both Shiny applications and the command line. The package aims to simplify data analysis and offer an end-to-end workflow, from raw data to multiple trial analysis.
Last updated
softwareshinyappsvisualizationqpcr
5.23 score 14 scripts 258 downloads
decontX - Decontamination of single cell genomics data
This package contains an implementation of DecontX (Yang et al. 2020), a decontamination algorithm for single-cell RNA-seq, and DecontPro (Yin et al. 2023), a decontamination algorithm for single cell protein expression data. DecontX is a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. DecontPro is a Bayesian method that estimates the level of contamination from ambient and background sources in a CITE-seq ADT dataset and decontaminates the dataset.
Last updated
singlecellbayesiancpp
5.21 score 81 scripts 798 downloads
TrIdent - TrIdent - Transduction Identification
The `TrIdent` R package automates the analysis of transductomics data by detecting, classifying, and characterizing read coverage patterns associated with potential transduction events. Transductomics is a DNA sequencing-based method for the detection and characterization of transduction events in pure cultures and complex communities. Transductomics relies on mapping sequencing reads from a viral-like particle (VLP)-fraction of a sample to contigs assembled from the metagenome (whole-community) of the same sample. Reads from bacterial DNA carried by VLPs will map back to the bacterial contigs of origin creating read coverage patterns indicative of ongoing transduction.
Last updated
coveragemetagenomicspatternlogicclassificationsequencingbacteriophagehorizontal-gene-transferpattern-matchingphagesequencing-coveragetransductiontransductomicsvirus-like-particle
5.20 score 2 stars 7 scripts 220 downloads
scoup - Simulate Codons with Darwinian Selection Modelled as an OU Process
An elaborate molecular evolutionary framework that facilitates straightforward simulation of codon genetic sequences subjected to different degrees and/or patterns of Darwinian selection. The model is built upon the fitness landscape paradigm of Sewall Wright, as popularised by the mutation-selection model of Halpern and Bruno. This enables realistic evolutionary processes of living organisms to be reproduced seamlessly. For example, an Ornstein-Uhlenbeck fitness update algorithm is incorporated herein. Consequently, otherwise complex biological processes, such as the effect of the interplay between genetic drift and fitness landscape fluctuations on the inference of diversifying selection, may now be investigated with minimal effort. Frequency-dependent and stochastic fitness landscape update techniques are available.
Last updated
alignmentclassificationcomparativegenomicsdataimportgeneticsmathematicalbiologyresearchfieldsequencingsequencematchingsoftwarestatisticalmethodworkflowstepcomputational-biologyevolutionary-biologymolecular-biologysimulation
5.20 score 16 scripts 238 downloads
HybridExpress - Comparative analysis of RNA-seq data for hybrids and their progenitors
HybridExpress can be used to perform comparative transcriptomics analysis of hybrids (or allopolyploids) relative to their progenitor species. The package features functions to perform exploratory analyses of sample grouping, identify differentially expressed genes in hybrids relative to their progenitors, classify genes in expression categories (N = 12) and classes (N = 5), and perform functional analyses. We also provide users with graphical functions for the seamless creation of publication-ready figures that are commonly used in the literature.
Last updated
softwarefunctionalgenomicsgeneexpressiontranscriptomicsrnaseqclassificationdifferentialexpressiongene-expressionhybridpolyploidyrna-seq
5.20 score 16 stars 3 scripts 266 downloads
gVenn - Proportional Venn and UpSet Diagrams for Gene Sets and Genomic Regions
Tools to compute and visualize overlaps between gene sets or genomic regions. Venn diagrams with proportional areas are provided, while UpSet plots are recommended for larger numbers of sets. The package supports GRanges and GRangesList inputs, and integrates with analysis workflows for ChIP-seq, ATAC-seq, and other genomic interval data. It generates clean, interpretable, and publication-ready figures.
Last updated
softwarevisualizationchipseqatacseqepigeneticsdatarepresentationsequencing
5.19 score 2 stars 14 scripts 239 downloads
iSEEde - iSEE extension for panels related to differential expression analysis
This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of differential expression results. This package does not perform differential expression. Instead, it provides methods to embed precomputed differential expression results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.
Last updated
softwareinfrastructuredifferentialexpressionbioconductorhacktoberfestiseeu
5.18 score 1 stars 19 scripts 313 downloads
HiCDOC - A/B compartment detection and differential analysis
HiCDOC normalizes intrachromosomal Hi-C matrices, uses unsupervised learning to predict A/B compartments from multiple replicates, and detects significant compartment changes between experiment conditions. It provides a collection of functions assembled into a pipeline to filter and normalize the data, predict the compartments and visualize the results. It accepts several types of data: tabular `.tsv` files, Cooler `.cool` or `.mcool` files, Juicer `.hic` files or HiC-Pro `.matrix` and `.bed` files.
Last updated
hicdna3dstructurenormalizationsequencingsoftwareclusteringcpp
5.18 score 5 stars 1 dependents 5 scripts 346 downloads
vmrseq - Probabilistic Modeling of Single-cell Methylation Heterogeneity
High-throughput single-cell measurements of DNA methylation allow studying inter-cellular epigenetic heterogeneity, but this task faces the challenges of sparsity and noise. We present vmrseq, a statistical method that overcomes these challenges and identifies variably methylated regions accurately and robustly.
Last updated
softwareimmunooncologydnamethylationepigeneticssinglecellsequencingwholegenomecomputational-biologydimensionality-reductionepigenomics-workflowhidden-markov-modelprobabilistic-models
5.18 score 10 stars 5 scripts 248 downloads
PIUMA - Phenotypes Identification Using Mapper from topological data Analysis
The PIUMA package offers a tidy pipeline of Topological Data Analysis frameworks to identify and characterize communities in high-dimensional and heterogeneous data.
Last updated
clusteringgraphandnetworkdimensionreductionnetworkclassification
5.18 score 5 stars 3 scripts 253 downloads
pathMED - Scoring Personalized Molecular Portraits
PathMED is a collection of tools to facilitate precision medicine studies with omics data (e.g. transcriptomics). Among its functionalities, geneset scores for individual samples may be calculated with several methods. These scores may be used to train machine learning models and to predict clinical features on new data. For this, several machine learning methods are evaluated in order to select the best method based on internal validation and to tune the hyperparameters. Performance metrics and a ready-to-use model to predict the outcomes for new patients are returned.
Last updated
pathwaysclassificationfeatureextractiontranscriptomics
5.15 score 5 stars 1 scripts 179 downloads
CBN2Path - CBN2Path: an R/Bioconductor package for the analysis of cancer progression pathways using Conjunctive Bayesian Networks
The CBN2Path package provides a unifying interface to facilitate CBN-based quantification, analysis and visualization of cancer progression pathways.
Last updated
softwarestatisticalmethodgraphandnetworkbayesianpathwaysgsl
5.15 score 2 scripts 264 downloads
lineagespot - Detection of SARS-CoV-2 lineages in wastewater samples using next-generation sequencing
Lineagespot is a framework written in R that aims to identify SARS-CoV-2 related mutations based on a single variant file or a list of variant files (i.e., in variant call format). The method can facilitate the detection of SARS-CoV-2 lineages in wastewater samples using next generation sequencing, and attempts to infer the potential distribution of the SARS-CoV-2 lineages.
Last updated
variantdetectionvariantannotationsequencing
5.15 score 2 stars 4 scripts 313 downloads
rigvf - R interface to the IGVF Catalog
The IGVF Catalog provides data on the impact of genomic variants on function. The `rigvf` package provides an interface to the IGVF Catalog, allowing easy integration with Bioconductor resources.
Last updated
thirdpartyclientannotationvariantannotationfunctionalgenomicsgeneregulationgenomicvariationgenetarget
5.13 score 2 scripts 252 downloads
scGraphVerse - scGraphVerse: A Gene Network Analysis Package
A package for inferring, comparing, and visualizing gene networks from single-cell RNA sequencing data. It integrates multiple methods (GENIE3, GRNBoost2, ZILGM, PCzinb, and JRF) for robust network inference, supports consensus building across methods or datasets, and provides tools for evaluating regulatory structure and community similarity. GRNBoost2 requires Python package 'arboreto' which can be installed using init_py(install_missing = TRUE). This package includes adapted functions from ZILGM (Park et al., 2021), JRF (Petralia et al., 2015), and learn2count (Nguyen et al. 2023) packages with proper attribution under GPL-2 license.
Last updated
generegulationnetworkinferencesinglecellrnaseqvisualizationsoftwaregraphandnetworkgenesetenrichmentnetworkenrichmentpathwayssequencingreactomenetworkkegg
5.11 score 1 stars 4 scripts 192 downloads
rhdf5client - Access HDF5 content from HDF Scalable Data Service
This package provides functionality for reading data from HDF Scalable Data Service from within R. The HSDSArray function bridges from HSDS to the user via the DelayedArray interface. Bioconductor manages an open HSDS instance graciously provided by John Readey of the HDF Group.
Last updated
dataimportsoftwareinfrastructure
5.11 score 2 dependents 36 scripts 396 downloads
damidBind - Differential Binding and Expression Analysis for DamID-seq Data
The damidBind package provides a straightforward formal analysis pipeline to analyse and explore differential DamID binding, gene transcription or chromatin accessibility between two conditions. The package imports processed data from DamID-seq experiments, either as external raw files in the form of binding bedGraphs and GFF/BED peak calls, or as internal lists of GRanges objects. After optionally normalising data, combining peaks across replicates and determining per-replicate peak occupancy, the package links bound loci to nearby genes. For RNA Polymerase DamID data, the package calculates occupancy over genes, and optionally calculates the FDR of significantly-enriched gene occupancy. damidBind then uses either limma (for conventional log2 ratio DamID binding data) or NOIseq (for counts-based CATaDa chromatin accessibility data) to identify differentially-enriched regions, or differentially expressed genes, between two conditions. The package provides a number of visualisation tools (volcano plots, Gene Ontology enrichment plots via ClusterProfiler and proportional Venn diagrams via BioVenn) for downstream data exploration and analysis. A powerful, interactive IGV genome browser interface (powered by Shiny and igvShiny) allows users to rapidly and intuitively assess significant differentially-bound regions in their genomic context.
Last updated
differentialexpressiongeneexpressiontranscriptionepigeneticsvisualizationsequencingsoftwaregeneregulationcatadadamiddifferential-bindingdifferential-expression-analysisgene-expressiontargeted-damidtranscription-factors
5.11 score 1 stars 16 scripts 163 downloads
HTqPCR - Automated analysis of high-throughput qPCR data
Analysis of Ct values from high throughput quantitative real-time PCR (qPCR) assays across multiple conditions or replicates. The input data can be from spatially-defined formats such as ABI TaqMan Low Density Arrays or OpenArray; LightCycler from Roche Applied Science; the CFX plates from Bio-Rad Laboratories; conventional 96- or 384-well plates; or microfluidic devices such as the Dynamic Arrays from Fluidigm Corporation. HTqPCR handles data loading, quality assessment, normalization, visualization and parametric or non-parametric testing for statistical significance in Ct values between features (e.g. genes, microRNAs).
Last updated
microtitreplateassaydifferentialexpressiongeneexpressiondataimportqualitycontrolpreprocessingvisualizationmultiplecomparisonqpcr
5.10 score 1 dependents 21 scripts 578 downloads
shinybiocloader - Use a Shiny Bioconductor CSS loader
Add a Bioconductor themed CSS loader to your shiny app. It is based on the shinycustomloader R package. Use a spinning Bioconductor note loader to enhance your shiny app loading screen. This package is intended for developer use.
Last updated
softwareinfrastructureguibioconductor-packagequarto
5.08 score 2 dependents 5 scripts 207 downloads
PICB - piRNA Cluster Builder
piRNAs (short for PIWI-interacting RNAs) and their PIWI protein partners play a key role in fertility and maintaining genome integrity by restricting mobile genetic elements (transposons) in germ cells. piRNAs originate from genomic regions known as piRNA clusters. The piRNA Cluster Builder (PICB) is a versatile toolkit designed to identify genomic regions with a high density of piRNAs. It constructs piRNA clusters through a stepwise integration of unique and multimapping piRNAs and offers wide-ranging parameter settings, supported by an optimization function that allows users to test different parameter combinations to tailor the analysis to their specific piRNA system. The output includes extensive metadata columns, enabling researchers to rank clusters and extract cluster characteristics.
Last updated
geneticsgenomeannotationsequencingfunctionalpredictioncoveragetranscriptomics
5.08 score 8 stars 5 scripts 257 downloads
gINTomics - Multi-Omics data integration
gINTomics is an R package for Multi-Omics data integration and visualization. gINTomics is designed to detect the association between the expression of a target and of its regulators, taking into account also their genomics modifications such as Copy Number Variations (CNV) and methylation. What is more, gINTomics allows integration results visualization via a Shiny-based interactive app.
Last updated
geneexpressionrnaseqmicroarrayvisualizationcopynumbervariationgenetargetquarto
5.08 score 3 stars 4 scripts 202 downloads
CytoMDS - Low Dimensions projection of cytometry samples
This package implements a low dimensional visualization of a set of cytometry samples, in order to visually assess the 'distances' between them. This, in turn, can greatly help the user to identify quality issues like batch effects or outlier samples, and/or check the presence of potential sample clusters that might align with the experimental design. The CytoMDS algorithm combines, on the one hand, the concept of Earth Mover's Distance (EMD), a.k.a. Wasserstein metric and, on the other hand, the Multi Dimensional Scaling (MDS) algorithm for the low dimensional projection. Also, the package provides some diagnostic tools for both checking the quality of the MDS projection, as well as tools to help with the interpretation of the axes of the projection.
Last updated
flowcytometryqualitycontroldimensionreductionmultidimensionalscalingsoftwarevisualization
5.08 score 1 stars 1 dependents 8 scripts 290 downloads
mobileRNA - mobileRNA: Investigate the RNA mobilome & population-scale changes
Genomic analysis can be utilised to identify differences between RNA populations in two conditions, both in production and abundance. This includes the identification of RNAs produced by multiple genomes within a biological system. For example, RNA produced by pathogens within a host or mobile RNAs in plant graft systems. The mobileRNA package provides methods to pre-process, analyse and visualise the sRNA and mRNA populations based on the premise of mapping reads to all genotypes at the same time.
Last updated
visualizationrnaseqsequencingsmallrnagenomeassemblyclusteringexperimentaldesignqualitycontrolworkflowstepalignmentpreprocessingbioinformaticsplant-science
5.08 score 4 stars 2 scripts 279 downloads
gDNAx - Diagnostics for assessing genomic DNA contamination in RNA-seq data
Provides diagnostics for assessing genomic DNA contamination in RNA-seq data, as well as plots representing these diagnostics. Moreover, the package can be used to get an insight into the strand library protocol used and, in case of strand-specific libraries, the strandedness of the data. Furthermore, it provides functionality to filter out reads of potential gDNA origin.
Last updated
transcriptiontranscriptomicsrnaseqsequencingpreprocessingsoftwaregeneexpressioncoveragedifferentialexpressionfunctionalgenomicssplicedalignmentalignment
5.08 score 2 stars 6 scripts 316 downloads
scatterHatch - Creates hatched patterns for scatterplots
The objective of this package is to efficiently create scatterplots where groups can be distinguished by color and texture. Visualizations in computational biology tend to have many groups making it difficult to distinguish between groups solely on color. Thus, this package is useful for increasing the accessibility of scatterplot visualizations to those with visual impairments such as color blindness.
Last updated
visualizationsinglecellcellbiologysoftwarespatial
5.08 score 7 stars 17 scripts 296 downloads
notameViz - Workflow for non-targeted LC-MS metabolic profiling
Provides visualization functionality for untargeted LC-MS metabolomics research. Includes quality control visualizations, feature-wise visualizations and results visualizations.
Last updated
biomedicalinformaticsmetabolomicsdataimportmassspectrometrybatcheffectmultiplecomparisonnormalizationqualitycontrolvisualizationpreprocessing
5.04 score 6 scripts 236 downloads
CPSM - CPSM: Cancer patient survival model
CPSM provides a comprehensive computational pipeline for predicting survival probability and risk groups in cancer patients. The package includes steps for data preprocessing, training/test split, and normalization. It enables feature selection using univariate survival analysis and computes a LASSO-based prognostic index (PI) score. CPSM supports the development of predictive models using various feature sets and offers a suite of visualization tools, including survival curves based on predicted probabilities, barplots for predicted mean and median survival times, KM plots overlaid with individual survival predictions, and nomograms for estimating 1-, 3-, 5-, and 10-year survival probabilities. This makes CPSM a versatile tool for survival analysis in cancer research.
Last updated
normalizationsurvivalgeneexpressionpreprocessingfeatureextractionsoftwarevisualization
5.04 score 2 stars 6 scripts 256 downloads
BiocBook - Write, containerize, publish and version Quarto books with Bioconductor
A BiocBook can be created by authors (e.g. R developers, but also scientists, teachers, communicators, ...) who wish to 1) write (compile a body of biological and/or bioinformatics knowledge), 2) containerize (provide Docker images to reproduce the examples illustrated in the compendium), 3) publish (deploy an online book to disseminate the compendium), and 4) version (automatically generate specific online book versions and Docker images for specific Bioconductor releases).
Last updated
infrastructurereportwritingsoftwarequarto
5.02 score 7 stars 6 scripts 320 downloads
MeLSI - Metric Learning for Statistical Inference in Microbiome Analysis
MeLSI (Metric Learning for Statistical Inference) is a novel machine learning method for microbiome data analysis that learns optimal distance metrics to improve statistical power in detecting group differences. Unlike traditional distance metrics (Bray-Curtis, Euclidean, Jaccard), MeLSI adapts to the specific characteristics of your dataset to maximize separation between groups. The method uses an ensemble of weak learners to identify which microbial features drive group differences, providing both improved statistical power and biological interpretability through feature importance weights.
Last updated
softwarestatisticalmethodmicrobiome
5.01 score 1 stars 17 scripts 121 downloads
omicsGMF - Dimensionality reduction of (single-cell) omics data in R using omicsGMF
omicsGMF is a Bioconductor package that uses the sgdGMF framework of the sgdGMF package for highly performant and fast matrix factorization that can be used for dimensionality reduction, visualization and imputation of omics data. It considers data from the general exponential family as input, and therefore suits both RNA-seq (Poisson or Negative Binomial data) and proteomics data (Gaussian data). It does not require prior transformation of counts to the log-scale, because it instead optimizes the deviances from the data family specified. It also allows correcting for known sample-level and feature-level covariates, therefore enabling visualization and dimensionality reduction upon batch correction. Last but not least, it deals with missing values and allows imputing these after matrix factorization, which is useful for proteomics data. This Bioconductor package allows input of SummarizedExperiment, SingleCellExperiment, and QFeatures classes.
Last updated
singlecellrnaseqproteomicsqualitycontrolpreprocessingnormalizationvisualizationdimensionreductiontranscriptomicsgeneexpressionsequencingsoftwaredatarepresentationmassspectrometry
5.00 score 2 stars 5 scripts 204 downloads
CytoPipelineGUI - GUI's for visualization of flow cytometry data analysis pipelines
This package is the companion of the `CytoPipeline` package. It provides GUI's (shiny apps) for the visualization of flow cytometry data analysis pipelines that are run with `CytoPipeline`. Two shiny applications are provided, i.e. an interactive flow frame assessment and comparison tool and an interactive scale transformations visualization and adjustment tool.
Last updated
flowcytometrypreprocessingqualitycontrolworkflowstepimmunooncologysoftwarevisualizationguishinyapps
5.00 score 2 stars 4 scripts 258 downloads
MSstatsBig - MSstats Preprocessing for Larger than Memory Data
The MSstats package provides tools for preprocessing, summarization and differential analysis of mass spectrometry (MS) proteomics data. Recently, some MS protocols enable acquisition of data sets that result in larger than memory quantitative data. MSstats functions are not able to process such data. The MSstatsBig package provides additional converter functions that enable processing larger than memory data sets.
Last updated
massspectrometryproteomicssoftware
5.00 score 1 dependents 11 scripts 302 downloads
cfdnakit - Fragment-length analysis package from high-throughput sequencing of cell-free DNA (cfDNA)
This package provides basic functions for analyzing shallow whole-genome sequencing (~0.3X or more) of cell-free DNA (cfDNA). The package extracts the length of cfDNA fragments and aids the visualization of fragment-length information. The package also extracts fragment-length information per non-overlapping fixed-sized bin and uses it for calculating the ctDNA estimation score (CES).
Last updated
copynumbervariationsequencingwholegenome
5.00 score 9 stars 11 scripts 319 downloads
notameStats - Workflow for non-targeted LC-MS metabolic profiling
Provides univariate and multivariate statistics for feature prioritization in untargeted LC-MS metabolomics research.
Last updated
biomedicalinformaticsmetabolomicsdataimportmassspectrometrybatcheffectmultiplecomparisonnormalizationqualitycontrolvisualizationpreprocessing
4.95 score 5 scripts 206 downloads
imageTCGA - TCGA Diagnostic Image Database Explorer
A Shiny application to explore the TCGA Diagnostic Image Database.
Last updated
shinyappsu24ca289073
4.95 score 3 stars 4 scripts 234 downloads
chevreulProcess - Tools for managing SingleCellExperiment objects as projects
Tools for analyzing SingleCellExperiment objects as projects, for input into the chevreulShiny app downstream. Includes functions for analysis of single cell RNA sequencing data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Last updated
coveragernaseqsequencingvisualizationgeneexpressiontranscriptionsinglecelltranscriptomicsnormalizationpreprocessingqualitycontroldimensionreductiondataimport
4.95 score 2 dependents 4 scripts 258 downloads
chevreulPlot - Plots used in the chevreulPlot package
Tools for plotting SingleCellExperiment objects in the chevreulPlot package. Includes functions for analysis and visualization of single-cell data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Last updated
coveragernaseqsequencingvisualizationgeneexpressiontranscriptionsinglecelltranscriptomicsnormalizationpreprocessingqualitycontroldimensionreductiondataimport
4.95 score 1 dependents 3 scripts 236 downloads
fenr - Fast functional enrichment for interactive applications
Perform fast functional enrichment on feature lists (like genes or proteins) using the hypergeometric distribution. Tailored for speed, this package is ideal for interactive platforms such as Shiny. It supports the retrieval of functional data from sources like GO, KEGG, Reactome, Bioplanet and WikiPathways. By downloading and preparing data first, it allows for rapid successive tests on various feature selections without the need for repetitive, time-consuming preparatory steps typical of other packages.
Last updated
functionalpredictiondifferentialexpressiongenesetenrichmentgokeggreactomeproteomics
4.95 score 1 dependents 10 scripts 339 downloads
CardinalIO - Read and write mass spectrometry imaging files
Fast and efficient reading and writing of mass spectrometry imaging data files. Supports imzML and Analyze 7.5 formats. Provides ontologies for mass spectrometry imaging.
Last updated
softwareinfrastructuredataimportmassspectrometryimagingmassspectrometrycpp
4.95 score 3 stars 1 dependents 3 scripts 436 downloads
PanomiR - Detection of miRNAs that regulate interacting groups of pathways
PanomiR is a package to detect miRNAs that target groups of pathways from gene expression data. This package provides functionality for generating pathway activity profiles, determining differentially activated pathways between user-specified conditions, determining clusters of pathways via the PCxN package, and generating miRNAs targeting clusters of pathways. These functions can be used separately or sequentially to analyze RNA-Seq data.
Last updated
geneexpressiongenesetenrichmentgenetargetmirnapathways
4.95 score 3 stars 15 scripts 324 downloads
crumblr - Count ratio uncertainty modeling base linear regression
Crumblr enables analysis of count ratio data using precision weighted linear (mixed) models. It uses an asymptotic normal approximation of the variance following the centered log ratio transform (CLR) that is widely used in compositional data analysis. Crumblr provides a fast, flexible alternative to GLMs and GLMMs while retaining high power and controlling the false positive rate.
Last updated
rnaseqgeneexpressiondifferentialexpressionbatcheffectqualitycontrolsinglecellregressionepigeneticsfunctionalgenomicstranscriptomicsnormalizationclusteringdimensionreductionpreprocessingsoftware
4.93 score 7 stars 61 scripts 248 downloads
linkSet - Base Classes for Storing Genomic Link Data
Provides a comprehensive framework for representing, analyzing, and visualizing genomic interactions, particularly focusing on gene-enhancer relationships. The package extends the GenomicRanges infrastructure to handle paired genomic regions with specialized methods for chromatin interaction data from Hi-C, Promoter Capture Hi-C (PCHi-C), and single-cell ATAC-seq experiments. Key features include conversion from common interaction formats, annotation of promoters and enhancers, distance-based analyses, interaction strength metrics, statistical modeling using CHiCANE methodology, and tailored visualization tools. The package aims to standardize the representation of genomic interaction data while providing domain-specific functions not available in general genomic interaction packages.
Last updated
softwarehicdatarepresentationsequencingsinglecellcoverage
4.90 score 8 scripts 118 downloads
HoloFoodR - R interface to EBI HoloFood resource
Utility package to facilitate integration and analysis of EBI HoloFood data in R. This package streamlines access to the resource, allowing for direct loading of data into formats optimized for downstream analytics.
Last updated
softwareinfrastructuredataimportmicrobiomemicrobiomedata
4.90 score 2 stars 7 scripts 246 downloads
motifTestR - Perform key tests for binding motifs in sequence data
Taking a set of sequence motifs as PWMs, test a set of sequences for over-representation of these motifs, as well as any positional features within the set of motifs. Enrichment analysis can be undertaken using multiple statistical approaches. The package also contains core functions to prepare data for analysis, and to visualise results.
Last updated
motifannotationchipseqchiponchipsequencematchingsoftware
4.90 score 1 stars 5 scripts 293 downloads
multiWGCNA - multiWGCNA
An R package for deeply mining gene co-expression networks in multi-trait expression data. Provides functions for analyzing, comparing, and visualizing WGCNA networks across conditions. multiWGCNA was designed to handle the common case where there are multiple biologically meaningful sample traits, such as disease vs wildtype across development or anatomical region.
Last updated
sequencingrnaseqgeneexpressiondifferentialexpressionregressionclustering
4.90 score 9 scripts 346 downloads
SPICEY - Calculates cell type specificity from single cell data
SPICEY (SPecificity Index for Coding and Epigenetic activitY) is an R package designed to quantify cell-type specificity in single-cell transcriptomic and epigenomic data, particularly scRNA-seq and scATAC-seq. It introduces two complementary indices: the Gene Expression Tissue Specificity Index (GETSI) and the Regulatory Element Tissue Specificity Index (RETSI), both based on entropy to provide continuous, interpretable measures of specificity. By integrating gene expression and chromatin accessibility, SPICEY enables standardized analysis of cell-type-specific regulatory programs across diverse tissues and conditions.
Last updated
transcriptomicsepigeneticssinglecelldifferentialexpressiondifferentialpeakcallinggeneregulationgenetargetgeneexpressiontranscription
4.88 score 1 stars 3 scripts 209 downloads
DegCre - Probabilistic association of DEGs to CREs from differential data
DegCre generates associations between differentially expressed genes (DEGs) and cis-regulatory elements (CREs) based on non-parametric concordance between differential data. The user provides GRanges of DEG TSS and CRE regions with differential p-value and optionally log-fold changes and DegCre returns an annotated Hits object with associations and their calculated probabilities. Additionally, the package provides functionality for visualization and conversion to other formats.
Last updated
geneexpressiongeneregulationatacseqchipseqdnaseseqrnaseq
4.88 score 5 stars 3 scripts 267 downloads
beachmat.hdf5 - beachmat bindings for HDF5-backed matrices
Extends beachmat to support initialization of tatami matrices from HDF5-backed arrays. This allows C++ code in downstream packages to directly call the HDF5 C/C++ library to access array data, without the need for block processing via DelayedArray. Some utilities are also provided for direct creation of an in-memory tatami matrix from a HDF5 file.
Last updated
datarepresentationdataimportinfrastructurecurlopensslcpp
4.88 score 8 scripts 370 downloads
survClust - Identification Of Clinically Relevant Genomic Subtypes Using Outcome Weighted Learning
survClust is an outcome weighted integrative clustering algorithm used to classify multi-omic samples on their available time to event information. The resulting clusters are cross-validated to avoid overfitting and output classification of samples that are molecularly distinct and clinically meaningful. It takes in binary (mutation) as well as continuous data (other omic types).
Last updated
softwareclusteringsurvivalclassificationcpp
4.87 score 16 stars 23 scripts 242 downloads
tadar - Transcriptome Analysis of Differential Allelic Representation
This package provides functions to standardise the analysis of Differential Allelic Representation (DAR). DAR compromises the integrity of Differential Expression analysis results as it can bias expression, influencing the classification of genes (or transcripts) as being differentially expressed. DAR analysis results in an easy-to-interpret value between 0 and 1 for each genetic feature of interest, where 0 represents identical allelic representation and 1 represents complete diversity. This metric can be used to identify features prone to false-positive calls in Differential Expression analysis, and can be leveraged with statistical methods to alleviate the impact of such artefacts on RNA-seq data.
Last updated
sequencingrnaseqsnpgenomicvariationvariantannotationdifferentialexpression
4.86 score 1 stars 16 scripts 276 downloads
iSEEpathways - iSEE extension for panels related to pathway analysis
This package contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels or modes facilitating the analysis of pathway analysis results. This package does not perform pathway analysis. Instead, it provides methods to embed precomputed pathway analysis results in a SummarizedExperiment object, in a manner that is compatible with interactive visualisation in iSEE applications.
Last updated
softwareinfrastructuredifferentialexpressiongeneexpressionguivisualizationpathwaysgenesetenrichmentgoshinyappsbioconductorhacktoberfestiseeiseeu
4.86 score 1 stars 12 scripts 260 downloads
hicVennDiagram - Venn Diagram for genomic interaction data
A package to generate high-resolution Venn and Upset plots for genomic interaction data from HiC, ChIA-PET, HiChIP, PLAC-Seq, Hi-TrAC, HiCAR, etc. The package generates plots specifically crafted to eliminate the deceptive visual representation caused by the counts method.
Last updated
dna3dstructurehicvisualization
4.86 score 18 scripts 292 downloads
gmapR - An R interface to the GMAP/GSNAP/GSTRUCT suite
GSNAP and GMAP are a pair of tools to align short-read data written by Tom Wu. This package provides convenience methods to work with GMAP and GSNAP from within R. In addition, it provides methods to tally alignment results on a per-nucleotide basis using the bam_tally tool.
Last updated
alignmentzlib
4.85 score 51 scripts 472 downloads
StatescopeR - StatescopeR framework for discovery of cell states from cell type-specific gene expression profiles inferred from bulk mRNA profiles
StatescopeR is an R wrapper around Statescope, a computational framework designed to discover cell states from cell type-specific gene expression profiles inferred from bulk RNA profiles.
Last updated
geneexpressionrnaseqsinglecellbayesiantranscriptomicssoftware
4.85 score 4 scripts 36 downloads
GNOSIS - Genomics explorer using statistical and survival analysis in R
GNOSIS incorporates a range of R packages enabling users to efficiently explore and visualise clinical and genomic data obtained from cBioPortal. GNOSIS uses an intuitive GUI and multiple tab panels supporting a range of functionalities. These include data upload and initial exploration, data recoding and subsetting, multiple visualisations, survival analysis, statistical analysis and mutation analysis, in addition to facilitating reproducible research.
Last updated
softwareshinyappssurvivalgui
4.85 score 7 stars 2 scripts 268 downloads
TOP - TOP Constructs Transferable Model Across Gene Expression Platforms
TOP constructs a transferable model across gene expression platforms for prospective experiments. Such a transferable model can be trained to make predictions on independent validation data with an accuracy that is similar to a re-substituted model. The TOP procedure also has the flexibility to be adapted to suit the most common clinical response variables, including linear response, binomial and Cox PH models.
Last updated
softwaresurvivalgeneexpression
4.83 score 67 scripts 266 downloads
crisprShiny - Exploring curated CRISPR gRNAs via Shiny
Provides means to interactively visualize guide RNAs (gRNAs) in GuideSet objects via Shiny application. This GUI can be self-contained or as a module within a larger Shiny app. The content of the app reflects the annotations present in the passed GuideSet object, and includes intuitive tools to examine, filter, and export gRNAs, thereby making gRNA design more user-friendly.
Last updated
crisprfunctionalgenomicsgenetargetguicrispr-analysiscrispr-designshiny
4.82 score 2 stars 11 scripts 243 downloads
ELViS - An R Package for Estimating Copy Number Levels of Viral Genome Segments Using Base-Resolution Read Depth Profile
Base-resolution copy number analysis of viral genome. Utilizes base-resolution read depth data over viral genome to find copy number segments with two-dimensional segmentation approach. Provides publish-ready figures, including histograms of read depths, coverage line plots over viral genome annotated with copy number change events and viral genes, and heatmaps showing multiple types of data with integrative clustering of samples.
Last updated
copynumbervariationcoveragegenomicvariationbiomedicalinformaticssequencingnormalizationvisualizationclustering
4.81 score 8 scripts 236 downloads
rprimer - Design Degenerate Oligos from a Multiple DNA Sequence Alignment
Functions, workflow, and a Shiny application for visualizing sequence conservation and designing degenerate primers, probes, and (RT)-(q/d)PCR assays from a multiple DNA sequence alignment. The results can be presented in data frame format and visualized as dashboard-like plots. For more information, please see the package vignette.
Last updated
alignmentddpcrcoveragemultiplesequencealignmentsequencematchingqpcr
4.81 score 4 stars 16 scripts 347 downloads
enhancerHomologSearch - Identification of putative mammalian orthologs to given enhancer
Get ENCODE data of enhancer region via H3K4me1 peaks and search homolog regions for given sequences. The candidates of enhancer homolog regions can be filtered by distance to target TSS. The top candidates from human and mouse will be aligned to each other and then exported as multiple alignments with given enhancer.
Last updated
sequencinggeneregulationalignmentcpp
4.78 score 2 scripts 303 downloads
beachmat.tiledb - beachmat bindings for TileDB-backed matrices
Extends beachmat to initialize tatami matrices from TileDB-backed arrays. This allows C++ code in downstream packages to directly call the TileDB C/C++ library to access array data, without the need for block processing via DelayedArray. Developers only need to import this package to automatically extend the capabilities of beachmat::initializeCpp to TileDBArray instances.
Last updated
datarepresentationdataimportinfrastructurecpp
4.78 score 4 scripts 206 downloads
HiCool - HiCool
HiCool provides an R interface to process and normalize Hi-C paired-end fastq reads into .(m)cool files. .(m)cool is a compact, indexed HDF5 file format specifically tailored for efficiently storing HiC-based data. On top of processing fastq reads, HiCool provides a convenient reporting function to generate shareable reports summarizing Hi-C experiments and including quality controls.
Last updated
hicdna3dstructuredataimport
4.78 score 2 stars 10 scripts 272 downloads
FinfoMDS - Multidimensional Scaling with F-ratio for microbiome visualization
F-informed MDS is a new multidimensional scaling-based ordination method that configures data distribution based on the F-statistic (i.e., the ratio of dispersion between groups with shared or differing labels).
Last updated
dimensionreductionmultidimensionalscalingvisualizationmicrobiome
4.78 score 2 stars 2 scripts 190 downloads
DNEA - Differential Network Enrichment Analysis for Biological Data
The DNEA R package is the latest implementation of the Differential Network Enrichment Analysis algorithm and is the successor to the Filigree Java application described in Iyer et al. (2020). The package is designed to take as input an m x n expression matrix for some -omics modality (i.e., metabolomics, lipidomics, proteomics, etc.) and jointly estimate the biological network associations of each condition using the DNEA algorithm described in Ma et al. (2019). This approach provides a framework for data-driven enrichment analysis across two experimental conditions that utilizes the underlying correlation structure of the data to determine feature-feature interactions.
Last updated
metabolomicsproteomicslipidomicsdifferentialexpressionnetworkenrichmentnetworkclusteringdataimportbioinformaticsnetwork-analysisrstudio
4.78 score 3 stars 2 scripts 185 downloads
TENET - R package for TENET (Tracing regulatory Element Networks using Epigenetic Traits) to identify key transcription factors
TENET identifies key transcription factors (TFs) and regulatory elements (REs) linked to a specific cell type by finding significantly correlated differences in gene expression and RE DNA methylation between case and control input datasets, and identifying the top genes by number of significant RE DNA methylation site links. It also includes many tools for visualization and analysis of the results, including plots displaying and comparing methylation and expression data and methylation site link counts, survival analysis, TF motif searching in the vicinity of linked RE DNA methylation sites, custom TAD and peak overlap analysis, and UCSC Genome Browser track file generation. A utility function is also provided to download methylation, expression, and patient survival data from The Cancer Genome Atlas (TCGA) for use in TENET or other analyses.
Last updated
softwarebiomedicalinformaticscellbiologygeneticsepigeneticsmultiplecomparisongeneexpressiondifferentialexpressiondnamethylationdifferentialmethylationmethylationarraysequencingmethylseqrnaseqfunctionalgenomicsgeneregulationgenetargethistonemodificationtranscriptiontranscriptomicssurvivalvisualization
4.78 score 1 stars 20 scripts 249 downloads
mist - Differential Methylation Analysis for scDNAm Data
mist (Methylation Inference for Single-cell along Trajectory) is a hierarchical Bayesian framework for modeling DNA methylation trajectories and performing differential methylation (DM) analysis in single-cell DNA methylation (scDNAm) data. It estimates developmental-stage-specific variations, identifies genomic features with drastic changes along pseudotime, and, for two phenotypic groups, detects features with distinct temporal methylation patterns. mist uses Gibbs sampling to estimate parameters for temporal changes and stage-specific variations.
Last updated
epigeneticsdifferentialmethylationdnamethylationsinglecellsoftware
4.78 score 2 stars 12 scripts 222 downloads
MOSClip - Multi Omics Survival Clip
Topological pathway analysis tool able to integrate multi-omics data. It finds survival-associated modules or significant modules for two-class analysis. This tool has two main methods: pathway tests and module tests. The latter method allows the user to dig inside the pathways themselves.
Last updated
softwarestatisticalmethodgraphandnetworksurvivalregressiondimensionreductionpathwaysreactome
4.78 score 5 scripts 234 downloads
seahtrue - Seahtrue revives XF data for structured data analysis
Seahtrue organizes oxygen consumption and extracellular acidification analysis data from experiments performed on an XF analyzer into structured nested tibbles. This allows for detailed processing of raw data and advanced data visualization and statistics. Seahtrue introduces an open and reproducible way to analyze these XF experiments. It uses file paths to .xlsx files. These .xlsx files are supplied by the user and are generated by the user in the Wave software from Agilent from the assay result files (.asyr). The .xlsx file contains different sheets of important data for the experiment: 1. Assay Information - Details about how the experiment was set up. 2. Rate Data - Information about the OCR and ECAR rates. 3. Raw Data - The original raw data collected during the experiment. 4. Calibration Data - Data related to calibrating the instrument. Seahtrue focuses on getting the specific data needed for analysis. Once this data is extracted, it is prepared for calculations through preprocessing. To make sure everything is accurate, both the initial data and the preprocessed data go through thorough checks.
Last updated
cellbasedassaysfunctionalpredictiondatarepresentationdataimportcellbiologycheminformaticsmetabolomicsmicrotitreplateassayvisualizationqualitycontrolbatcheffectexperimentaldesignpreprocessinggo
4.78 score 2 stars 4 scripts 238 downloads
funOmics - Aggregating Omics Data into Higher-Level Functional Representations
The 'funOmics' package aggregates or summarizes omics data into higher level functional representations such as GO terms gene sets or KEGG metabolic pathways. The aggregated data matrix represents functional activity scores that facilitate the analysis of functional molecular sets while reducing dimensionality and providing easier and faster biological interpretations. Coordinated functional activity scores can be as informative as single molecules!
Last updated
softwaretranscriptomicsmetabolomicsproteomicspathwaysgokegg
4.78 score 6 stars 3 scripts 239 downloads
transmogR - Modify a set of reference sequences using a set of variants
transmogR provides the tools needed to create a new reference genome or reference transcriptome, using a set of variants. Variants can be any combination of SNPs, Insertions and Deletions. The intended use-case is to enable creation of variant-modified reference transcriptomes for incorporation into transcriptomic pseudo-alignment workflows, such as salmon.
Last updated
alignmentgenomicvariationsequencingtranscriptomevariantvariantannotationzlib
4.78 score 6 scripts 262 downloads
MIRit - Integrate microRNA and gene expression to decipher pathway complexity
MIRit is an R package that provides several methods for investigating the relationships between miRNAs and genes in different biological conditions. In particular, MIRit allows users to explore the functions of dysregulated miRNAs, and makes it possible to identify miRNA-gene regulatory axes that control biological pathways, thus helping users unveil the complexity of miRNA biology. MIRit is an all-in-one framework that aims to help researchers in all the central aspects of an integrative miRNA-mRNA analysis, from differential expression analysis to network characterization.
Last updated
softwaregeneregulationnetworkenrichmentnetworkinferenceepigeneticsfunctionalgenomicssystemsbiologynetworkpathwaysgeneexpressiondifferentialexpressionmirnamirna-mrna-interactionmirna-seqmirnaseq-analysiscpp
4.78 score 2 stars 6 scripts 342 downloads
saseR - Scalable Aberrant Splicing and Expression Retrieval
saseR is a highly performant and fast framework for aberrant expression and splicing analyses. The main functions are: BamtoAspliCounts - Process BAM files to ASpli counts; convertASpli - Get gene, bin or junction counts from an ASpli SummarizedExperiment; calculateOffsets - Create an offsets assay for aberrant expression or splicing analysis; saseRfindEncodingDim - Estimate the optimal number of latent factors to include when estimating the mean expression; saseRfit - Estimate the parameters of the negative binomial distribution and compute p-values for aberrant expression and splicing. For information on how to use these functions, check out our vignette at https://github.com/statOmics/saseR/blob/main/vignettes/Vignette.Rmd and the saseR paper: Segers, A. et al. (2023). Juggling offsets unlocks RNA-seq tools for fast scalable differential usage, aberrant splicing and expression analyses. bioRxiv. https://doi.org/10.1101/2023.06.29.547014.
Last updated
differentialexpressiondifferentialsplicingregressiongeneexpressionalternativesplicingrnaseqsequencingsoftware
4.78 score 4 stars 2 scripts 275 downloads
OGRE - Calculate, visualize and analyse overlap between genomic regions
OGRE calculates overlap between user-defined genomic region datasets. Any regions can be supplied, e.g. genes, SNPs, or reads from sequencing experiments. Key numbers help analyse the extent of overlaps, which can also be visualized at a genomic level.
Last updated
softwareworkflowstepbiologicalquestionannotationmetagenomicsvisualizationsequencing
4.78 score 2 stars 4 scripts 314 downloads
SmartPhos - A phosphoproteomics data analysis package with an interactive ShinyApp
To facilitate and streamline phosphoproteomics data analysis, we developed SmartPhos, an R package for the pre-processing, quality control, and exploratory analysis of phosphoproteomics data generated by MaxQuant and Spectronaut. The package can be used either through the R command line or through an interactive ShinyApp called SmartPhos Explorer. The package contains methods such as normalization and normalization correction, transformation, imputation, batch effect correction, PCA, heatmap, differential expression, time-series clustering, gene set enrichment analysis, and kinase activity inference.
Last updated
visualizationshinyappsguiqualitycontrolproteomicsdifferentialexpressionnormalizationpreprocessinggenesetenrichmentclusteringgeneexpressionmassspectrometrybatcheffect
4.70 score 4 scripts 82 downloads
Bioc.gff - Read and write GFF and GTF files
Parse GFF and GTF files using C++ classes. The package also provides utilities to read and write GFF3 files. The GFF (General Feature Format) format is a tab-delimited file format for describing genes and other features of DNA, RNA, and protein sequences. GFF files are often used to describe the features of genomes.
Last updated
softwareinfrastructuredataimportbioconductor-package
4.70 score 3 scripts 202 downloads
NCIgraph - Pathways from the NCI Pathways Database
Provides various methods to load the pathways from the NCI Pathways Database in R graph objects and to re-format them.
Last updated
pathwaysgraphandnetwork
4.69 score 1 dependents 18 scripts 296 downloads
MACSr - MACS: Model-based Analysis for ChIP-Seq
The Model-based Analysis of ChIP-Seq (MACS) is a widely used toolkit for identifying transcription factor binding sites. This package is an R wrapper of the latest MACS3.
Last updated
softwarechipseqatacseqimmunooncology
4.68 score 32 scripts 352 downloads
MultipleAlignment - Representation of multiple sequence alignments in Bioconductor
The package implements a set of S4 classes (DNAMultipleAlignment, RNAMultipleAlignment, AAMultipleAlignment) for representing Multiple Sequence Alignments (MSA). The classes allow users to represent groups of aligned DNA, RNA or amino acid sequences as a single object. The package also provides functions to read/write such objects from/to traditional MSA file formats including Stockholm and Clustal.
Last updated
alignmentmultiplesequencealignmentgeneticsdataimportdatarepresentationinfrastructurebioconductor-packagecore-package
4.65 score 1 dependents 3 scripts
looking4clusters - Interactive Visualization of scRNA-Seq
Enables the interactive visualization of dimensional reduction, clustering, and cell properties for scRNA-Seq results. It generates an interactive HTML page using either a numeric matrix, SummarizedExperiment, SingleCellExperiment or Seurat objects as input. The input data can be projected into two-dimensional representations by applying dimensionality reduction methods such as PCA, MDS, t-SNE, UMAP, and NMF. Displaying multiple dimensionality reduction results within the same interface, with interconnected graphs, provides different perspectives that facilitate accurate cell classification. The package also integrates unsupervised clustering techniques, whose results can be viewed interactively in the graphical interface. In addition to visualization, this interface allows manual selection of groups, labeling of cell entities based on processed meta-information, generation of new graphs displaying gene expression values for each cell, sample identification, and visual comparison of samples and clusters.
Last updated
softwarevisualizationdatarepresentationgeneexpressionmultiplecomparisonclassificationclustering
4.65 score 3 scripts 192 downloads
spARI - Spatially Aware Adjusted Rand Index for Evaluating Spatial Transcriptomics Clustering
The R package used in the manuscript "Spatially Aware Adjusted Rand Index for Evaluating Spatial Transcriptomics Clustering".
Last updated
clusteringdataimportgeneexpressiontranscriptomicsspatialsoftwarecpp
4.65 score 8 scripts 180 downloads
HiCParser - Parser for HiC data in R
This package is a parser to import HiC data into R. It accepts several types of data: tabular files, Cooler `.cool` or `.mcool` files, Juicer `.hic` files or HiC-Pro `.matrix` and `.bed` files. The HiC data can span several files, for several replicates and conditions. The data is formatted in an InteractionSet object.
Last updated
softwarehicdataimportzlibcpp
4.65 score 1 stars 1 dependents 1 scripts 232 downloads
CTexploreR - Explores Cancer Testis Genes
The CTexploreR package re-defines the list of Cancer Testis/Germline (CT) genes. It is based on publicly available RNAseq databases (GTEx, CCLE and TCGA) and summarises CT genes' main characteristics. Several visualisation functions allow users to explore their expression in different types of tissues and cancer cells, or to inspect the methylation status of their promoters in normal tissues.
Last updated
transcriptomicsepigeneticsdifferentialexpressiongeneexpressiondnamethylationexperimenthubsoftwaredataimportbioconductor
4.65 score 4 scripts 266 downloads
DCATS - Differential Composition Analysis Transformed by a Similarity matrix
Methods to detect differential composition abundances between conditions in single-cell RNA-seq experiments, with or without replicates. It aims to correct bias introduced by misclassification and to enable control of confounding covariates. To avoid the influence of proportion changes from large cell types, DCATS can use either the total cell number or a specific reference group as the normalization term.
Last updated
singlecellnormalization
4.64 score 44 scripts 299 downloads
LipidTrend - LipidTrend: Analysis and Visualization of Lipid Feature Tendencies
"LipidTrend" is an R package that implements a permutation-based statistical test to identify significant differences in lipidomic features between groups. The test incorporates Gaussian kernel smoothing of region statistics to improve stability and accuracy, particularly when dealing with small sample sizes. This package also includes two plotting functions for visualizing significant tendencies in 1D and 2D feature data, respectively.
Last updated
softwarelipidomicsstatisticalmethoddifferentialexpressionvisualization
4.64 score 29 scripts 172 downloads
tidyFlowCore - tidyFlowCore: Bringing flowCore to the tidyverse
tidyFlowCore bridges the gap between flow cytometry analysis using the flowCore Bioconductor package and the tidy data principles advocated by the tidyverse. It provides a suite of dplyr-, ggplot2-, and tidyr-like verbs specifically designed for working with flowFrame and flowSet objects as if they were tibbles; however, your data remain flowCore data structures under this layer of abstraction. tidyFlowCore enables intuitive and streamlined analysis workflows that can leverage both the Bioconductor and tidyverse ecosystems for cytometry data.
Last updated
singlecellflowcytometryinfrastructure
4.62 score 2 stars 21 scripts 222 downloads
epiregulon.extra - Companion package to epiregulon with additional plotting, differential and graph functions
Gene regulatory networks model the underlying gene regulation hierarchies that drive gene expression and observed phenotypes. Epiregulon infers TF activity in single cells by constructing a gene regulatory network (regulons). This is achieved through integration of scATAC-seq and scRNA-seq data and incorporation of public bulk TF ChIP-seq data. Links between regulatory elements and their target genes are established by computing correlations between chromatin accessibility and gene expressions.
Last updated
generegulationnetworkgeneexpressiontranscriptionchiponchipdifferentialexpressiongenetargetnormalizationgraphandnetwork
4.62 score 14 scripts 321 downloads
Damsel - Damsel: an end to end analysis of DamID
Damsel provides an end to end analysis of DamID data. Damsel takes bam files from Dam-only control and fusion samples and counts the reads matching to each GATC region. edgeR is utilised to identify regions of enrichment in the fusion relative to the control. Enriched regions are combined into peaks, and are associated with nearby genes. Damsel allows for IGV style plots to be built as the results build, inspired by ggcoverage, and using the functionality and layering ability of ggplot2. Damsel also conducts gene ontology testing with bias correction through goseq, and future versions of Damsel will also incorporate motif enrichment analysis. Overall, Damsel is the first package allowing for an end to end analysis with visual capabilities. The goal of Damsel was to bring all the analysis into one place, and allow for exploratory analysis within R.
Last updated
differentialmethylationpeakdetectiongenepredictiongenesetenrichment
4.62 score 1 stars 21 scripts 292 downloads
ADAPT - Analysis of Microbiome Differential Abundance by Pooling Tobit Models
ADAPT carries out differential abundance analysis for microbiome metagenomics data in phyloseq format. It has two innovations. One is to treat zero counts as left censored and use Tobit models for log count ratios. The other is an innovative way to find non-differentially abundant taxa as reference, then use the reference taxa to find the differentially abundant ones.
Last updated
differentialexpressionmicrobiomenormalizationsequencingmetagenomicssoftwaremultiplecomparisonopenblascpp
4.61 score 41 scripts 202 downloads
mitology - Study of mitochondrial activity from RNA-seq data
mitology allows the study of mitochondrial activity through high-throughput RNA-seq data. It is based on a collection of genes whose proteins localize to the mitochondria. From these, mitology provides a reorganization of the pathways related to mitochondrial activity from Reactome and Gene Ontology. Further, a ready-to-use implementation of MitoCarta3.0 pathways is included.
Last updated
geneexpressionrnaseqvisualizationsinglecellspatialpathwaysreactomego
4.60 score 2 stars 5 scripts 282 downloads
SplineDV - Differential Variability (DV) analysis for single-cell RNA sequencing data. (e.g. Identify Differentially Variable Genes across two experimental conditions)
A spline-based scRNA-seq method for identifying differentially variable (DV) genes across two experimental conditions. Spline-DV constructs a 3D spline from 3 key gene statistics: mean expression, coefficient of variation (CV), and dropout rate. This is done for both conditions. The 3D spline provides the “expected” behavior of genes in each condition. The distance of the observed mean, CV and dropout rate of each gene from the expected 3D spline is used to measure variability. As the final step, the spline-DV method compares the variabilities of each condition to identify differentially variable (DV) genes.
Last updated
softwaresinglecellsequencingdifferentialexpressionrnaseqgeneexpressiontranscriptomicsfeatureextraction
4.60 score 4 stars 4 scripts 258 downloads
EpipwR - Efficient Power Analysis for EWAS with Continuous or Binary Outcomes
A quasi-simulation based approach to performing power analysis for EWAS (Epigenome-wide association studies) with continuous or binary outcomes. 'EpipwR' relies on empirical EWAS datasets to determine power at specific sample sizes while keeping computational cost low. EpipwR can be run with a variety of standard statistical tests, controlling for either a false discovery rate or a family-wise type I error rate.
Last updated
epigeneticsexperimentaldesign
4.60 score 2 stars 2 scripts 237 downloads
regionalpcs - Summarizing Regional Methylation with Regional Principal Components Analysis
Functions to summarize DNA methylation data using regional principal components. Regional principal components are computed using principal components analysis within genomic regions to summarize the variability in methylation levels across CpGs. The number of principal components is chosen using either the Marchenko-Pastur or Gavish-Donoho method to identify relevant signal in the data.
Last updated
dnamethylationdifferentialmethylationstatisticalmethodsoftwaremethylationarray
4.60 score 4 stars 7 scripts 280 downloads
batchCorr - Within And Between Batch Correction Of LC-MS Metabolomics Data
From the perspective of metabolites as the continuation of the central dogma of biology, metabolomics provides the closest link to many phenotypes of interest. This makes metabolomics research promising in teasing apart the complexities of living systems. However, due to experimental reasons, the data includes non-biological variation which limits quality and reproducibility, especially if the data is obtained from several batches. The batchCorr package reduces unwanted variation by way of between-batch alignment, within-batch drift correction and between-batch normalization using batch-specific quality control samples and long-term reference QC samples. Please see the associated article for more thorough descriptions of algorithms.
Last updated
biomedicalinformaticsmetabolomicsmassspectrometrybatcheffectnormalizationqualitycontrol
4.59 score 13 scripts 204 downloads
PolySTest - PolySTest: Detection of differentially regulated features. Combined statistical testing for data with few replicates and missing values
The complexity of high-throughput quantitative omics experiments often leads to low replicate numbers and many missing values. We implemented a new test to simultaneously consider missing values and quantitative changes, which we combined with well-performing statistical tests for high-confidence detection of differentially regulated features. The package contains functions to run the test and to visualize the results.
Last updated
massspectrometryproteomicssoftwaredifferentialexpressioncsvdifferential-protein-expression-profilingdsvexpression-datagene-expression-profileheat-maphistogramp-valueproteomics-experimentq-valuestatistical-modellingstatistics-and-probabilitysvg
4.54 score 10 scripts 228 downloads
orthos - `orthos` is an R package for variance decomposition using conditional variational auto-encoders
`orthos` decomposes RNA-seq contrasts, for example obtained from a gene knock-out or compound treatment experiment, into unspecific and experiment-specific components. Original and decomposed contrasts can be efficiently queried against a large database of contrasts (derived from ARCHS4, https://maayanlab.cloud/archs4/) to identify similar experiments. `orthos` furthermore provides plotting functions to visualize the results of such a search for similar contrasts.
Last updated
rnaseqdifferentialexpressiongeneexpression
4.54 score 7 scripts 324 downloads
CCAFE - Case Control Allele Frequency Estimation
Functions to reconstruct case and control AFs from summary statistics. One function uses OR, NCase, NControl, and SE(log(OR)). The second function uses OR, NCase, NControl, and AF for the whole sample.
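The second reconstruction (from OR, NCase, NControl, and the whole-sample AF) can be illustrated with a short Python sketch. It is not taken from the CCAFE source: it assumes an allelic odds ratio and a whole-sample AF equal to the sample-size-weighted mean of the case and control AFs, and the function name `case_control_af` is hypothetical. Under those assumptions the control AF solves a quadratic equation.

```python
import math


def case_control_af(odds_ratio, n_case, n_control, af_total):
    """Recover (af_case, af_control) from summary statistics.

    Assumed model (an illustration, not necessarily CCAFE's method):
      af_total   = (n_case * af_case + n_control * af_control) / (n_case + n_control)
      odds_ratio = [af_case / (1 - af_case)] / [af_control / (1 - af_control)]
    """
    t = (n_case + n_control) * af_total  # sample-size-weighted AF sum
    # Substituting af_case = (t - n_control * q) / n_case into the OR
    # definition gives a * q^2 + b * q + c = 0 for q = af_control.
    a = (odds_ratio - 1.0) * n_control
    b = odds_ratio * (n_case - t) + t + n_control
    c = -t
    if abs(a) < 1e-12:
        # OR == 1: the equation is linear and both groups share af_total.
        candidates = [-c / b]
    else:
        disc = b * b - 4.0 * a * c
        if disc < 0:
            raise ValueError("no real solution for these inputs")
        root = math.sqrt(disc)
        candidates = [(-b + root) / (2.0 * a), (-b - root) / (2.0 * a)]
    eps = 1e-9
    for af_control in candidates:
        af_case = (t - n_control * af_control) / n_case
        # Keep the root for which both frequencies are valid probabilities.
        if -eps <= af_control <= 1 + eps and -eps <= af_case <= 1 + eps:
            return af_case, af_control
    raise ValueError("no solution with both allele frequencies in [0, 1]")
```

For example, `case_control_af(1.0, 500, 500, 0.25)` returns 0.25 for both groups, since an odds ratio of 1 implies equal frequencies in cases and controls.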
Last updated
genomewideassociationcomparativegenomicsgeneticspreprocessingsnpsoftwarewholegenome
4.53 score 1 stars 17 scripts 248 downloads
DEGraph - Two-sample tests on a graph
DEGraph implements recent hypothesis testing methods which directly assess whether a particular gene network is differentially expressed between two conditions. This is to be contrasted with the more classical two-step approaches which first test individual genes, then test gene sets for enrichment in differentially expressed genes. These recent methods take into account the topology of the network to yield more powerful detection procedures. DEGraph provides methods to easily test all KEGG pathways for differential expression on any gene expression data set and tools to visualize the results.
Last updated
microarraydifferentialexpressiongraphandnetworknetworknetworkenrichmentdecisiontree
4.52 score 11 scripts 371 downloads
UPDhmm - Detecting Uniparental Disomy through NGS trio data
Uniparental disomy (UPD) is a genetic condition where an individual inherits both copies of a chromosome or part of it from one parent, rather than one copy from each parent. This package contains a HMM for detecting UPDs through HTS (High Throughput Sequencing) data from trio assays. By analyzing the genotypes in the trio, the model infers a hidden state (normal, father isodisomy, mother isodisomy, father heterodisomy and mother heterodisomy).
Last updated
softwarehiddenmarkovmodelgenetics
4.51 score 4 stars 16 scripts 220 downloads
scHiCcompare - Differential Analysis of Single-cell Hi-C Data
This package provides functions for differential chromatin interaction analysis between two single-cell Hi-C data groups. It includes tools for imputation, normalization, and differential analysis of chromatin interactions. The package implements pooling techniques for imputation and offers methods to normalize and test for differential interactions across single-cell Hi-C datasets.
Last updated
softwaresinglecellhicsequencingnormalizationchromatinschi-csingle-cell
4.48 score 6 scripts 243 downloads
shinyDSP - A Shiny App For Visualizing Nanostring GeoMx DSP Data
This package is a Shiny app for interactively analyzing and visualizing Nanostring GeoMx Whole Transcriptome Atlas data. Users have the option of using a sample dataset to explore the app's functionality. Regions of interest (ROIs) can be filtered based on any user-provided metadata. Upon selecting two or more groups of interest, all pairwise and ANOVA-like tests are automatically performed. Available outputs include PCA, volcano plots, tables and heatmaps. Aesthetics of each output are highly customizable.
Last updated
differentialexpressiongeneexpressionshinyappsspatialtranscriptomics
4.48 score 1 stars 5 scripts 254 downloads
chevreulShiny - Tools for managing SingleCellExperiment objects as projects
Tools for managing SingleCellExperiment objects as projects. Includes functions for analysis and visualization of single-cell data. Also included is a shiny app for visualization of pre-processed scRNA data. Supported by NIH grants R01CA137124 and R01EY026661 to David Cobrinik.
Last updated
coveragernaseqsequencingvisualizationgeneexpressiontranscriptionsinglecelltranscriptomicsnormalizationpreprocessingqualitycontroldimensionreductiondataimport
4.48 score 5 scripts 275 downloads
ReducedExperiment - Containers and tools for dimensionally-reduced -omics representations
Provides SummarizedExperiment-like containers for storing and manipulating dimensionally-reduced assay data. The ReducedExperiment classes allow users to simultaneously manipulate their original dataset and their decomposed data, in addition to other method-specific outputs like feature loadings. Implements utilities and specialised classes for the application of stabilised independent component analysis (sICA) and weighted gene correlation network analysis (WGCNA).
Last updated
geneexpressioninfrastructuredatarepresentationsoftwaredimensionreductionnetworkbioconductor-packagebioinformaticsdimensionality-reduction
4.48 score 3 stars 8 scripts 252 downloads
squallms - Speedy quality assurance via lasso labeling for LC-MS data
squallms is a Bioconductor R package that implements a "semi-labeled" approach to untargeted mass spectrometry data. It pulls in raw data from mass-spec files to calculate several metrics that are then used to label MS features in bulk as high or low quality. These metrics of peak quality are then passed to a simple logistic model that produces a fully-labeled dataset suitable for downstream analysis.
Last updated
massspectrometrymetabolomicsproteomicslipidomicsshinyappsclassificationclusteringfeatureextractionprincipalcomponentregressionpreprocessingqualitycontrolvisualization
4.48 score 3 stars 7 scripts 235 downloads
HicAggR - Set of 3D genomic interaction analysis tools
This package provides a set of functions useful in the analysis of 3D genomic interactions. It includes the import of standard HiC data formats into R and HiC normalisation procedures. The main objective of this package is to improve the visualization and quantification of the analysis of HiC contacts through aggregation. The package allows users to import 1D genomics data, such as peaks from ATACSeq or ChIPSeq, to create potential couples between features of interest under user-defined parameters such as the distance between pairs of features of interest. It then allows the extraction of contact values from the HiC data for these couples, Aggregated Peak Analysis (APA) for visualization, and the comparison of normalized contact values between conditions. Overall, the package integrates 1D genomics data with 3D genomics data, providing easy access to HiC contact values.
Last updated
softwarehicdataimportdatarepresentationnormalizationvisualizationdna3dstructureatacseqchipseqdnaseseqrnaseq
4.48 score 1 stars 4 scripts 282 downloads
iSEEfier - Streamlining the creation of initial states for starting an iSEE instance
iSEEfier provides a set of functionality to quickly and intuitively create, inspect, and combine initial configuration objects. These can be conveniently passed in a straightforward manner to the function call to launch iSEE() with the specified configuration. This package currently works seamlessly with the sets of panels provided by the iSEE and iSEEu packages, but can be extended to accommodate the usage of any custom panel (e.g. from iSEEde, iSEEpathways, or any panel developed independently by the user).
Last updated
cellbasedassaysclusteringdimensionreductionfeatureextractionguigeneexpressionimmunooncologyshinyappssinglecellsoftwaretranscriptiontranscriptomicsvisualization
4.48 score 2 scripts 234 downloads
comapr - Crossover analysis and genetic map construction
comapr detects crossover intervals for single gametes from their haplotype state sequences and stores the crossovers in a GRanges object. The genetic distances can then be calculated via the mapping functions using estimated crossover rates for marker intervals. Visualisation functions for plotting interval-based genetic maps or cumulative genetic distances are implemented, which help reveal the variation of crossover landscapes across the genome and across individuals.
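As an illustration of what a mapping function does, the following Python sketch converts a crossover rate (recombination fraction) for a marker interval into a genetic distance using the standard Haldane and Kosambi formulas. It is a sketch of the general technique only; the function names are hypothetical and are not comapr's API.

```python
import math


def haldane_cm(r):
    """Haldane map distance in centimorgans for recombination fraction r.

    Assumes no crossover interference: d = -50 * ln(1 - 2r).
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def kosambi_cm(r):
    """Kosambi map distance in centimorgans for recombination fraction r.

    Allows for interference: d = 25 * ln((1 + 2r) / (1 - 2r)).
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def cumulative_distances(rates, mapping=kosambi_cm):
    """Running sum of per-interval distances along a chromosome."""
    total = 0.0
    out = []
    for r in rates:
        total += mapping(r)
        out.append(total)
    return out
```

For small crossover rates both functions return approximately 100 * r centimorgans; they diverge as r approaches 0.5, with Haldane giving the larger distance.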
Last updated
softwaresinglecellvisualizationgenetics
4.48 score 4 scripts 308 downloads
easier - Estimate Systems Immune Response from RNA-seq data
This package provides a workflow for the use of the EaSIeR tool, developed to assess patients' likelihood to respond to ICB therapies using only the patients' RNA-seq data as input. We integrate RNA-seq data with different types of prior knowledge to extract quantitative descriptors of the tumor microenvironment from several points of view, including composition of the immune repertoire, and activity of intra- and extra-cellular communications. Then, we use multi-task machine learning trained on TCGA data to identify how these descriptors can simultaneously predict several state-of-the-art hallmarks of anti-cancer immune response. In this way we derive cancer-specific models and identify cancer-specific systems biomarkers of immune response. These biomarkers have been experimentally validated in the literature and the performance of EaSIeR predictions has been validated using independent datasets from four different cancer types with patients treated with anti-PD1 or anti-PDL1 therapy.
Last updated
geneexpressionsoftwaretranscriptionsystemsbiologypathwaysgenesetenrichmentimmunooncologyepigeneticsclassificationbiomedicalinformaticsregressionexperimenthubsoftware
4.46 score 29 scripts 378 downloads
alabaster.sfe - Language agnostic on disk serialization of SpatialFeatureExperiment
Builds upon the existing ArtifactDB project, extending alabaster.spatial for language-agnostic on-disk serialization of SpatialFeatureExperiment.
Last updated
datarepresentationspatialopenjdk
4.45 score 28 scripts 182 downloads
geyser - Gene Expression displaYer of SummarizedExperiment in R
Lightweight Expression displaYer (plotter / viewer) of SummarizedExperiment object in R. This package provides a quick and easy Shiny-based GUI to empower a user to use a SummarizedExperiment object to view
Last updated
softwareshinyappsguigeneexpression
4.43 score 18 scripts 236 downloads
DNABarcodes - A tool for creating and analysing DNA barcodes used in Next Generation Sequencing multiplexing experiments
The package offers a function to create DNA barcode sets capable of correcting insertion, deletion, and substitution errors. Existing barcodes can be analysed regarding their minimal, maximal and average distances between barcodes. Finally, reads that start with a (possibly mutated) barcode can be demultiplexed, i.e., assigned to their original reference barcode.
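The distance analysis and demultiplexing steps can be sketched in Python as follows. This is a simplified illustration and not the DNABarcodes API: it uses the Hamming distance, which only accounts for substitutions, whereas handling insertions and deletions requires an edit-distance style metric. All function names are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def distance_summary(barcodes):
    """Minimal, maximal and average pairwise Hamming distance of a set."""
    dists = [hamming(a, b) for a, b in combinations(barcodes, 2)]
    return min(dists), max(dists), sum(dists) / len(dists)


def demultiplex(read, barcodes):
    """Assign a read to the barcode closest to the read's prefix.

    Assumes all barcodes share one length and that the read starts with a
    (possibly mutated) barcode. Ties go to the first barcode listed.
    """
    prefix = read[:len(barcodes[0])]
    return min(barcodes, key=lambda bc: hamming(prefix, bc))
```

With a minimal pairwise Hamming distance of d, up to (d - 1) // 2 substitution errors per barcode can be corrected unambiguously, which is why the minimal distance of a set is the figure that matters most for error correction.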
Last updated
preprocessingsequencingcppopenmp
4.43 score 45 scripts 556 downloads
ClusterFoldSimilarity - Calculate similarity of clusters from different single cell samples using foldchanges
This package calculates a similarity coefficient using the fold changes of shared features (e.g. genes) among clusters of different samples/batches/datasets. The similarity coefficient is calculated using the dot-product (Hadamard product) of every pairwise combination of Fold Changes between a source cluster i of sample/dataset n and all the target clusters j in sample/dataset m
Last updated
singlecellclusteringfeatureextractiongraphandnetworkgenetargetrnaseq
4.43 score 18 scripts 266 downloads
HarmonizR - Handles missing values and makes more data available
An implementation that takes input data and makes it available for proper batch effect removal by ComBat or limma. The implementation appropriately handles missing values by dissecting the input matrix into smaller matrices with sufficient data to feed the ComBat or limma algorithm. The adjusted data is returned to the user as a rebuilt matrix. The implementation is meant to make as much data available as possible with minimal data loss.
Last updated
batcheffect
4.40 score 25 scripts 292 downloads
CARDspa - Spatially Informed Cell Type Deconvolution for Spatial Transcriptomics
CARD is a reference-based deconvolution method that estimates cell type composition in spatial transcriptomics based on cell-type-specific expression information obtained from reference scRNA-seq data. A key feature of CARD is its ability to accommodate spatial correlation in the cell type composition across tissue locations, enabling accurate and spatially informed cell type deconvolution as well as refined spatial map construction. CARD relies on an efficient optimization algorithm for constrained maximum likelihood estimation and is scalable to spatial transcriptomics with tens of thousands of spatial locations and tens of thousands of genes.
Last updated
spatialsinglecelltranscriptomicsvisualizationopenblascppopenmp
4.38 score 16 scripts 293 downloads
scQTLtools - scQTLtools: an R/Bioconductor package for comprehensive identification and visualization of single-cell eQTLs
scQTLtools is a comprehensive R/Bioconductor package that facilitates end-to-end single-cell eQTL analysis, from preprocessing to visualization
Last updated
softwaregeneexpressiongeneticvariabilitysnpdifferentialexpressiongenomicvariationvariantdetectiongeneticsfunctionalgenomicssystemsbiologyregressionsinglecellnormalizationvisualizationpreprocessingrna-seqsc-eqtl
4.38 score 6 stars 6 scripts 242 downloads
spoon - Address the Mean-variance Relationship in Spatial Transcriptomics Data
This package addresses the mean-variance relationship in spatially resolved transcriptomics data. Precision weights are generated for individual observations using Empirical Bayes techniques. These weights are used to rescale the data and covariates, which are then used as input in spatially variable gene detection tools.
Last updated
spatialsinglecelltranscriptomicsgeneexpressionpreprocessing
4.34 score 22 scripts 236 downloads
ginmappeR - Gene Identifier Mapper
Provides functionality to translate gene or protein identifiers between state-of-the-art biological databases: CARD (<https://card.mcmaster.ca/>), NCBI Protein, Nucleotide and Gene (<https://www.ncbi.nlm.nih.gov/>), UniProt (<https://www.uniprot.org/>) and KEGG (<https://www.kegg.jp>). Also offers complementary functionality such as retrieval of NCBI identical proteins or UniProt similar gene clusters.
Last updated
annotationkegggeneticsthirdpartyclientsoftware
4.32 score 21 scripts 234 downloads
alabaster.files - Wrappers to Save Common File Formats
Save common bioinformatics file formats within the alabaster framework. This includes BAM, BED, VCF, bigWig, bigBed, FASTQ, FASTA and so on. We save and load additional metadata for each file, and we support linkage between each file and its corresponding index.
Last updated
datarepresentationdataimport
4.32 score 21 scripts 252 downloads
octad - Open Cancer TherApeutic Discovery (OCTAD)
OCTAD provides a platform for virtually screening compounds targeting precise cancer patient groups. The essential idea is to identify drugs that reverse the gene expression signature of disease by tamping down over-expressed genes and stimulating weakly expressed ones. The package offers deep-learning based reference tissue selection, disease gene expression signature creation, pathway enrichment analysis, drug reversal potency scoring, cancer cell line selection, drug enrichment analysis and in silico hit validation. It currently covers ~20,000 patient tissue samples covering 50 cancer types, and expression profiles for ~12,000 distinct compounds.
Last updated
classificationgeneexpressionpharmacogeneticspharmacogenomicssoftwaregenesetenrichment
4.30 score 5 scripts 282 downloads
AWAggregator - Attribute-Weighted Aggregation
This package implements an attribute-weighted aggregation algorithm which leverages peptide-spectrum match (PSM) attributes to provide a more accurate estimate of protein abundance compared to conventional aggregation methods. This algorithm employs pre-trained random forest models to predict the quantitative inaccuracy of PSMs based on their attributes. PSMs are then aggregated to the protein level using a weighted average, taking the predicted inaccuracy into account. Additionally, the package allows users to construct their own training sets that are more relevant to their specific experimental conditions if desired.
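The weighted-average step described above can be sketched in a few lines. This is only an illustration: the function name `aggregate_psms`, the tuple layout, and the choice of inverse predicted inaccuracy as the weight are assumptions made here, not the weighting AWAggregator actually uses.

```python
from collections import defaultdict


def aggregate_psms(psms, eps=1e-9):
    """Aggregate PSM intensities to the protein level with a weighted average.

    psms: iterable of (protein_id, intensity, predicted_inaccuracy) tuples.
    The weight 1 / (predicted_inaccuracy + eps) is an illustrative assumption:
    PSMs predicted to be less accurate contribute less to the protein estimate.
    """
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for protein, intensity, inaccuracy in psms:
        weight = 1.0 / (inaccuracy + eps)
        weighted_sum[protein] += weight * intensity
        weight_total[protein] += weight
    return {p: weighted_sum[p] / weight_total[p] for p in weighted_sum}


# Hypothetical PSMs: two for protein P1, one for protein P2.
psms = [("P1", 10.0, 0.1), ("P1", 14.0, 0.4), ("P2", 7.0, 0.2)]
abundance = aggregate_psms(psms)
```

In this toy input the P1 estimate is pulled toward the PSM with the lower predicted inaccuracy, which is the intended effect of weighting over a plain mean.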
Last updated
softwaremassspectrometrypreprocessingproteomicsregression
4.30 score 4 scripts 182 downloads
PepSetTest - Peptide Set Test
Peptide Set Test (PepSetTest) is a peptide-centric strategy to infer differentially expressed proteins in LC-MS/MS proteomics data. This test detects coordinated changes in the expression of peptides originating from the same protein and compares these changes against the rest of the peptidome. Compared to traditional aggregation-based approaches, the peptide set test demonstrates improved statistical power while still controlling the Type I error rate correctly in most cases. This test can be valuable for discovering novel biomarkers and prioritizing drug targets, especially when the direct application of statistical analysis to protein data fails to provide substantial insights.
Last updated
differentialexpressionregressionproteomicsmassspectrometry
4.30 score 2 stars 9 scripts 227 downloads
smartid - Scoring and Marker Selection Method Based on Modified TF-IDF
This package enables automated selection of group-specific signatures, especially for rare populations. The package is developed for generating specific lists of signature genes based on modified Term Frequency-Inverse Document Frequency (TF-IDF) methods. It can also be used as a new gene-set scoring method or data transformation method. Multiple visualization functions are implemented in this package.
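For orientation, the standard (unmodified) TF-IDF score that such methods start from can be sketched as follows, treating each cell as a "document" and each gene as a "term". This is plain textbook TF-IDF, not the modified scoring that smartid implements; the function name `tf_idf` and the nested-dictionary input layout are assumptions chosen for the example.

```python
import math


def tf_idf(counts):
    """Compute standard TF-IDF scores.

    counts: dict mapping cell -> dict mapping gene -> raw count.
    Returns a dict of the same shape holding tf * idf for detected genes, where
    tf = count / total counts in the cell and
    idf = log(number of cells / number of cells in which the gene is detected).
    """
    n_cells = len(counts)
    # Document frequency: number of cells with a non-zero count for each gene.
    doc_freq = {}
    for gene_counts in counts.values():
        for gene, count in gene_counts.items():
            if count > 0:
                doc_freq[gene] = doc_freq.get(gene, 0) + 1
    scores = {}
    for cell, gene_counts in counts.items():
        total = sum(gene_counts.values())
        scores[cell] = {
            gene: (count / total) * math.log(n_cells / doc_freq[gene])
            for gene, count in gene_counts.items()
            if count > 0
        }
    return scores


# Hypothetical toy counts: gene A is detected everywhere, B and C are rare.
counts = {
    "cell1": {"A": 3, "B": 1},
    "cell2": {"A": 2, "B": 0},
    "cell3": {"A": 5, "C": 5},
}
scores = tf_idf(counts)
```

A gene detected in every cell receives an IDF of zero, so only genes restricted to a subset of cells can score highly, which is why TF-IDF style scores are a natural starting point for marker selection.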
Last updated
softwaregeneexpressiontranscriptomics
4.30 score 1 stars 5 scripts 244 downloads
tpSVG - Thin plate models to detect spatially variable genes
The goal of `tpSVG` is to detect and visualize spatial variation in gene expression for spatially resolved transcriptomics (SRT) data analysis. Specifically, `tpSVG` introduces a family of count-based models with generalizable parametric assumptions such as the Poisson or negative binomial distribution. In addition, compared to currently available count-based models for spatially resolved data analysis, the `tpSVG` models improve computational time, and hence greatly improve the applicability of count-based models in SRT data analysis.
Last updated
spatialtranscriptomicsgeneexpressionsoftwarestatisticalmethoddimensionreductionregressionpreprocessingspatially-resolvespatially-variable-genes
4.30 score 2 stars 7 scripts 232 downloads
biocroxytest - Handle Long Tests in Bioconductor Packages
This package provides a roclet for roxygen2 that identifies and processes code blocks in your documentation marked with `@longtests`. These blocks should contain tests that take a long time to run and thus cannot be included in the regular test suite of the package. When you run `roxygen2::roxygenise` with the `longtests_roclet`, it will extract these long tests from your documentation and save them in a separate directory. This allows you to run these long tests separately from the rest of your tests, for example, on a continuous integration server that is set up to run long tests.
Last updated
softwareinfrastructure
4.30 score 2 stars 3 scripts 235 downloads
Rvisdiff - Interactive Graphs for Differential Expression
Creates a multi-graph web page which allows the interactive exploration of differential analysis tests. The graphical web interface presents results as a table which is integrated with five interactive graphs: MA-plot, volcano plot, box plot, lines plot and cluster heatmap. Graphical aspect and information represented in the graphs can be customized by means of user controls. Final graphics can be exported in PNG format.
Last updated
softwarevisualizationrnaseqdatarepresentationdifferentialexpression
4.30 score 9 scripts 289 downloads
Motif2Site - Detect binding sites from motifs and ChIP-seq experiments, and compare binding sites across conditions
Detect binding sites using motif IUPAC sequences or BED coordinates and ChIP-seq experiments in BED or BAM format. Combine/compare binding sites across experiments, tissues, or conditions. All normalization and differential steps are done using the TMM-GLM method. Signal decomposition is done by setting motifs as the centers of a mixture of normal distribution curves.
Last updated
softwaresequencingchipseqdifferentialpeakcallingepigeneticssequencematching
4.30 score 3 scripts 334 downloads
pengls - Fit Penalised Generalised Least Squares models
Combine generalised least squares methodology from the nlme package for dealing with autocorrelation with penalised least squares methods from the glmnet package to deal with high dimensionality. The pengls package glues them together through an iterative loop. The resulting method is applicable to high dimensional datasets that exhibit autocorrelation, such as spatial or temporal data.
Last updated
transcriptomicsregressiontimecoursespatial
4.30 score 4 scripts 284 downloads
rsbml - R support for SBML, using libsbml
Links R to libsbml for SBML parsing, validating output, provides an S4 SBML DOM, converts SBML to R graph objects. Optionally links to the SBML ODE Solver Library (SOSLib) for simulating models.
Last updated
graphandnetworkpathwaysnetworklibsbmlcpp
4.26 score 20 scripts 577 downloads
EnrichDO - a Global Weighted Model for Disease Ontology Enrichment Analysis
This package implements disease ontology (DO) enrichment analysis using a double weighted model based on the latest annotations of the human genome with DO terms, integrating the DO graph topology on a global scale. It achieves high accuracy in that it can identify more specific DO terms, which alleviates the over-enrichment problem. The package includes various statistical models and visualization schemes for discovering the associations between genes and diseases from biological big data.
Last updated
annotationvisualizationgenesetenrichmentsoftware
4.22 score 11 scripts 258 downloads
bedbaser - A BEDbase client
A client for BEDbase. bedbaser provides access to the API at api.bedbase.org. It also includes convenience functions to import BED files into GRanges objects and BEDsets into GRangesLists.
Last updated
softwaredataimportthirdpartyclientu24ca289073
4.22 score 3 stars 4 scripts 279 downloads
DeeDeeExperiment - DeeDeeExperiment: An S4 Class for managing and exploring omics analysis results
DeeDeeExperiment is an S4 class extending the SingleCellExperiment class, designed to integrate and manage omics analysis results. It introduces two dedicated slots to store Differential Expression Analysis (DEA) results and Functional Enrichment Analysis (FEA) results, providing a structured approach for downstream analysis.
Last updated
softwareinfrastructuredatarepresentationgeneexpressiontranscriptiontranscriptomicsdifferentialexpressionpathwaysgo
4.20 score 1 scripts 184 downloads
scafari - Analysis of scDNA-seq data
Scafari is a Shiny application designed for the analysis of single-cell DNA sequencing (scDNA-seq) data provided in .h5 file format. The analysis process is structured into the four key steps "Sequencing", "Panel", "Variants", and "Explore Variants". It supports various analyses and visualizations.
Last updated
softwareshinyappssinglecellsequencing
4.20 score 2 stars 2 scripts 192 downloads
HVP - Hierarchical Variance Partitioning
HVP is a quantitative batch effect metric that estimates the proportion of variance associated with batch effects in a data set.
Last updated
singlecelltranscriptomicsgeneexpressionbatcheffect
4.18 score 6 scripts 184 downloads
crupR - An R package to predict condition-specific enhancers from ChIP-seq data
An R package that offers a workflow to predict condition-specific enhancers from ChIP-seq data. The prediction of regulatory units is done in four main steps: Step 1 - the normalization of the ChIP-seq counts. Step 2 - the prediction of active enhancers binwise on the whole genome. Step 3 - the condition-specific clustering of the putative active enhancers. Step 4 - the detection of possible target genes of the condition-specific clusters using RNA-seq counts.
Last updated
differentialpeakcallinggenetargetfunctionalpredictionhistonemodificationpeakdetection
4.18 score 3 scripts 228 downloads
immunogenViewer - Visualization and evaluation of protein immunogens
Plots protein properties and visualizes position of peptide immunogens within protein sequence. Allows evaluation of immunogens based on structural and functional annotations to infer suitability for antibody-based methods aiming to detect native proteins.
Last updated
featureextractionproteomicssoftwarevisualization
4.18 score 10 scripts 186 downloads
GrafGen - Classification of Helicobacter Pylori Genomes
Classifies Helicobacter pylori genomes according to genetic distance from nine reference populations. The nine reference populations are hpgpAfrica, hpgpAfrica-distant, hpgpAfroamerica, hpgpEuroamerica, hpgpMediterranea, hpgpEurope, hpgpEurasia, hpgpAsia, and hpgpAklavik86-like. The vertex populations are Africa, Europe and Asia.
Last updated
geneticssoftwaregenomeannotationclassificationcpp
4.18 score 2 scripts 255 downloads
RNAdecay - Maximum Likelihood Decay Modeling of RNA Degradation Data
RNA degradation is monitored through measurement of RNA abundance after inhibiting RNA synthesis. This package has functions and example scripts to facilitate (1) data normalization, (2) data modeling using constant decay rate or time-dependent decay rate models, (3) the evaluation of treatment or genotype effects, and (4) plotting of the data and models. Data Normalization: functions and scripts simplify normalization to the initial (T0) RNA abundance and provide a method to correct for artificial inflation of Reads per Million (RPM) abundance in global assessments as the total size of the RNA pool decreases. Modeling: Normalized data is then modeled using maximum likelihood to fit parameters. For making treatment or genotype comparisons (up to four), the modeling step models all possible treatment effects on each gene by repeating the modeling with constraints on the model parameters (i.e., the decay rate of treatments A and B are modeled once with them being equal and again allowing them to both vary independently). Model Selection: The AICc value is calculated for each model, and the model with the lowest AICc is chosen. Modeling results of selected models are then compiled into a single data frame. Graphical Plotting: functions are provided to easily visualize decay data, models, or half-life distributions using ggplot2 package functions.
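The model selection step can be illustrated with the usual small-sample corrected AIC, AICc = 2k - 2 ln(L) + 2k(k + 1) / (n - k - 1), where k is the number of fitted parameters, n the number of observations and L the maximized likelihood. The sketch below is a generic illustration, not RNAdecay's own code; the function names, model names and log-likelihood values are hypothetical.

```python
def aicc(log_likelihood, k, n):
    """Corrected Akaike information criterion for k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def select_model(fits, n):
    """Return (best model name, all AICc scores); the lowest AICc wins.

    fits: dict mapping model name -> (maximized log-likelihood, parameter count).
    """
    scores = {name: aicc(ll, k, n) for name, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fits for one gene measured at n = 16 data points:
# treatments A and B share a decay rate, or each gets its own.
fits = {
    "equal_decay_rates": (-52.0, 2),
    "independent_decay_rates": (-50.5, 3),
}
best, scores = select_model(fits, n=16)
```

With these made-up numbers the constrained model is selected: the extra parameter of the independent-rates model improves the likelihood slightly, but not enough to offset the AICc penalty.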
Last updated
immunooncologysoftwaregeneexpressiongeneregulationdifferentialexpressiontranscriptiontranscriptomicstimecourseregressionrnaseqnormalizationworkflowstep
4.18 score 4 scripts 352 downloads
betaHMM - A Hidden Markov Model Approach for Identifying Differentially Methylated Sites and Regions for Beta-Valued DNA Methylation Data
A novel approach utilizing a homogeneous hidden Markov model to effectively model untransformed beta values and identify DMCs while considering the spatial correlation of adjacent CpG sites.
Last updated
dnamethylationdifferentialmethylationimmunooncologybiomedicalinformaticsmethylationarraysoftwaremultiplecomparisonsequencingspatialcoveragegenetargethiddenmarkovmodelmicroarray
4.18 score 9 scripts 254 downloads
gg4way - 4way Plots of Differential Expression
4way plots enable a comparison of the logFC values from two contrasts of differential gene expression. The gg4way package creates 4way plots using the ggplot2 framework and supports popular Bioconductor objects. The package also provides information about the correlation between contrasts and significant genes of interest.
Last updated
softwarevisualizationdifferentialexpressiongeneexpressiontranscriptionrnaseqsinglecellsequencing
4.18 score 5 scripts 272 downloads
RegionalST - Investigating regions of interest and performing regional cell type-specific analysis with spatial transcriptomics data
This package analyzes spatial transcriptomics data through cross-regional cell type-specific analysis. It selects regions of interest (ROIs) and identifies cross-regional cell type-specific differential signals. The ROIs can be selected using an automatic algorithm or through manual selection. It facilitates manual selection of ROIs using a shiny application.
Last updated
spatialtranscriptomicsreactomekegg
4.18 score 8 scripts 217 downloads
zitools - Analysis of zero-inflated count data
zitools allows for zero-inflated count data analysis by either down-weighting excess zeros or replacing an appropriate proportion of excess zeros with NA. By overloading frequently used statistical functions (such as mean, median, standard deviation), plotting functions (such as boxplots or heatmaps) and differential abundance tests, it allows a wide range of downstream analyses for zero-inflated data in a less biased manner. This is applicable in the context of microbiome analyses, where the data is often overdispersed and zero-inflated, making data analysis extremely challenging.
Last updated
softwarestatisticalmethodmicrobiome
4.08 score 12 scripts 243 downloads
Cepo - Cepo for the identification of differentially stable genes
Defining the identity of a cell is fundamental to understanding the heterogeneity of cellular responses to various environmental signals and perturbations. We present Cepo, a new method to explore cell identities from single-cell RNA-sequencing data using differential stability as a new metric to define cell identity genes. Cepo computes cell-type specific gene statistics pertaining to differentially stable gene expression.
Last updated
classificationgeneexpressionsinglecellsoftwaresequencingdifferentialexpression
4.07 score 1 dependents 26 scripts 368 downloads
MEIGOR - MEIGOR - MEtaheuristics for bIoinformatics Global Optimization
MEIGOR provides a comprehensive environment for performing global optimization tasks in bioinformatics and systems biology. It leverages advanced metaheuristic algorithms to efficiently search the solution space and is specifically tailored to handle the complexity and high-dimensionality of biological datasets. This package supports various optimization routines and is integrated with Bioconductor's infrastructure for a seamless analysis workflow.
Last updated
systemsbiologyoptimizationsoftware
4.06 score 58 scripts 394 downloads
BioGA - Bioinformatics Genetic Algorithm (BioGA)
Genetic algorithms are a class of optimization algorithms inspired by the process of natural selection and genetics. This package allows users to analyze and optimize high-throughput genomic data using genetic algorithms. The functions provided are implemented in C++ for improved speed and efficiency, with an easy-to-use interface for use within R.
Last updated
experimentaldesigntechnologydata-analysisgene-expressiongenetic-algorithmsgenomicsoptimization-algorithmscpp
4.04 score 11 scripts 228 downloads
ggseqalign - Minimal Visualization of Sequence Alignments
Simple visualizations of alignments of DNA or AA sequences as well as arbitrary strings. Compatible with Biostrings and ggplot2. The plots are fully customizable using ggplot2 modifiers such as theme().
Last updated
alignmentmultiplesequencealignmentsoftwarevisualizationbioinformaticsggplot2-enhancementsminimalistic
4.04 score 11 scripts 210 downloads
barbieQ - Analyze Barcode Data from Clonal Tracking Experiments
The barbieQ package provides a series of robust statistical tools for analysing barcode count data generated from cell clonal tracking (i.e., lineage tracing) experiments. In these experiments, an initial cell and its offspring collectively form a clone (i.e., lineage). A unique barcode sequence, incorporated into the DNA of the initial cell, is inherited within the clone. This one-to-one mapping of barcodes to clones enables clonal tracking of their behaviors. By counting barcodes, researchers can quantify the population abundance of individual clones under specific experimental perturbations. barbieQ supports barcode count data preprocessing, statistical testing, and visualization.
Last updated
sequencingsoftwareregressionpreprocessingvisualization
4.00 score 1 stars 4 scripts 255 downloads
Lheuristic - Detection of scatterplots with L-shaped pattern
The Lheuristic package identifies scatterplots that follow an L-shaped, negative distribution. It can be used to identify genes regulated by methylation by integration of an expression and a methylation array. The package uses two different methods to detect expression and methylation L-shaped scatterplots. The parameters can be changed to detect other scatterplot patterns.
Last updated
dnamethylationstatisticalmethodmethylationarrayl-shapedmethylation-expressionscatterplot-matrix
4.00 score 8 scripts 225 downloads
Site2Target - An R package to associate peaks and target genes
Statistics are implemented for both peak-wise and gene-wise associations. In peak-wise associations, the p-values of the target genes of a given set of peaks are calculated. Negative binomial or Poisson distributions can be used for modeling the unweighted peak targets, and the log-normal distribution can be used to model the weighted peaks. In gene-wise associations, a table consisting of a set of genes, mapped to specific peaks, is generated using the given rules.
Last updated
annotationchipseqsoftwareepigeneticsgeneexpressiongenetarget
4.00 score 6 scripts 220 downloads
Polytect - An R package for digital data clustering
Polytect is an advanced computational tool designed for the analysis of multi-color digital PCR data. It provides automatic clustering and labeling of partitions into distinct groups based on clusters first identified by the flowPeaks algorithm. Polytect is particularly useful for researchers in molecular biology and bioinformatics, enabling them to gain deeper insights into their experimental results through precise partition classification and data visualization.
Last updated
ddpcrclusteringmultichannelclassification
4.00 score 1 stars 4 scripts 222 downloads
mspms - Tools for the analysis of MSP-MS data
This package provides functions for the analysis of data generated by the multiplex substrate profiling by mass spectrometry for proteases (MSP-MS) method. Data exported from upstream proteomics software is accepted as input and subsequently processed for analysis. Tools for statistical analysis, visualization, and interpretation of the data are provided.
Last updated
proteomicsmassspectrometrypreprocessingproteaseproteomics-data-analysis
4.00 score 1 stars 7 scripts 256 downloads
spatialSimGP - Simulate Spatial Transcriptomics Data with the Mean-variance Relationship
This package simulates spatial transcriptomics data with the mean-variance relationship using a Gaussian Process model per gene.
Last updated
spatialtranscriptomicsgeneexpression
4.00 score 4 scripts 222 downloads
DeepTarget - Deep characterization of cancer drugs
This package predicts a drug’s primary target(s) or secondary target(s) by integrating large-scale genetic and drug screens from the Cancer Dependency Map project run by the Broad Institute. It further investigates whether the drug specifically targets the wild-type or mutated target forms. To show how to use this package in practice, we provide sample data along with a step-by-step example.
Last updated
genetargetgenepredictionpathwaysgeneexpressionrnaseqimmunooncologydifferentialexpressiongenesetenrichmentreportwritingcrispr
4.00 score 3 scripts 256 downloads
findIPs - Influential Points Detection for Feature Rankings
Feature rankings can be distorted by a single case in the context of high-dimensional data. Cases that exert abnormal influence on feature rankings are called influential points (IPs). The package aims at detecting IPs based on case deletion and quantifies their effects by measuring the rank changes (DOI:10.48550/arXiv.2303.10516). The package applies a novel rank-comparing measure using adaptive weights that stress the top-ranked important features and adjust the weights to ranking properties.
Last updated
geneexpressiondifferentialexpressionregressionsurvival
4.00 score 4 scripts 264 downloads
MAPFX - MAssively Parallel Flow cytometry Xplorer (MAPFX): A Toolbox for Analysing Data from the Massively-Parallel Cytometry Experiments
MAPFX is an end-to-end toolbox that pre-processes the raw data from MPC experiments (e.g., BioLegend's LEGENDScreen and BD Lyoplates assays), and further imputes the ‘missing’ infinity markers in the wells without those measurements. The pipeline starts by performing background correction on raw intensities to remove the noise from electronic baseline restoration and fluorescence compensation by adapting a normal-exponential convolution model. Unwanted technical variation, from sources such as well effects, is then removed using a log-normal model with plate, column, and row factors, after which infinity markers are imputed using the informative backbone markers as predictors. The completed dataset can then be used for clustering and other statistical analyses. Additionally, MAPFX can be used to normalise data from FFC assays as well.
Last updated
softwareflowcytometrycellbasedassayssinglecellproteomicsclustering
4.00 score 1 stars 251 downloads
hdxmsqc - An R package for quality Control for hydrogen deuterium exchange mass spectrometry experiments
The hdxmsqc package enables us to analyse and visualise the quality of HDX-MS experiments, either as a final quality check before downstream analysis and publication or as part of an iterative procedure to determine the quality of the data. The package builds on the QFeatures and Spectra packages to integrate with other mass-spectrometry data.
Last updated
qualitycontroldataimportproteomicsmassspectrometrymetabolomics
4.00 score 1 stars 3 scripts 253 downloads
BREW3R.r - R package associated to BREW3R
This R package provides functions that are used in the BREW3R workflow. It mainly contains a function that extends a gtf as GRanges using information from another gtf (also as GRanges). The process allows gene annotations to be extended without increasing the overlap between gene ids.
Last updated
genomeannotation
4.00 score 5 scripts 287 downloads
cypress - Cell-Type-Specific Power Assessment
CYPRESS is a cell-type-specific power tool. This package aims to perform power analysis for cell-type-specific data. It calculates FDR, FDC, and power under various study design parameters, including but not limited to sample size and effect size. It takes as input a SummarizedExperiment (SE) object with observed mixture data (feature by sample matrix) and the cell-type mixture proportions (sample by cell-type matrix). It can solve the cell-type mixture proportions from the reference-free panel from TOAST and conduct tests to identify cell-type-specific differential expression (csDE) genes.
Last updated
softwaregeneexpressiondataimportrnaseqsequencing
4.00 score 1 stars 2 scripts 306 downloads
dinoR - Differential NOMe-seq analysis
dinoR tests for significant differences in NOMe-seq footprints between two conditions, using genomic regions of interest (ROI) centered around a landmark, for example a transcription factor (TF) motif. This package takes NOMe-seq data (GCH methylation/protection) in the form of a Ranged Summarized Experiment as input. dinoR can be used to group sequencing fragments into 3 or 5 categories representing characteristic footprints (TF bound, nucleosome bound, open chromatin), plot the percentage of fragments in each category in a heatmap, or averaged across different ROI groups, for example, containing a common TF motif. It is designed to compare footprints between two sample groups, using edgeR's quasi-likelihood methods on the total fragment counts per ROI, sample, and footprint category.
Last updated
nucleosomepositioningepigeneticsmethylseqdifferentialmethylationcoveragetranscriptionsequencingsoftware
4.00 score 10 scripts 274 downloads
spillR - Spillover Compensation in Mass Cytometry Data
Channel interference in mass cytometry can cause spillover and may result in miscounting of protein markers. We develop a nonparametric finite mixture model and use the mixture components to estimate the probability of spillover. We implement our method using expectation-maximization to fit the mixture model.
Last updated
flowcytometryimmunooncologymassspectrometrypreprocessingsinglecellsoftwarestatisticalmethodvisualizationregression
4.00 score 3 scripts 242 downloads
plasmut - Stratifying mutations observed in cell-free DNA and white blood cells as germline, hematopoietic, or somatic
A Bayesian method for quantifying the likelihood that a given plasma mutation arises from clonal hematopoiesis or the underlying tumor. It requires sequencing data of the mutation in plasma and white blood cells with the number of distinct and mutant reads in both tissues. We implement a Monte Carlo importance sampling method to assess the likelihood that a mutation arises from the tumor relative to non-tumor origin.
Last updated
bayesiansomaticmutationgermlinemutationsequencing
4.00 score 8 scripts 248 downloads
compSPOT - compSPOT: Tool for identifying and comparing significantly mutated genomic hotspots
Clonal cell groups share common mutations within cancer, precancer, and even clinically normal appearing tissues. The frequency and location of these mutations may predict prognosis and cancer risk. It has also been well established that certain genomic regions have increased sensitivity to acquiring mutations. Mutation-sensitive genomic regions may therefore serve as markers for predicting cancer risk. This package contains multiple functions to establish significantly mutated hotspots, compare hotspot mutation burden between samples, and perform exploratory data analysis of the correlation between hotspot mutation burden and personal risk factors for cancer, such as age, gender, and history of carcinogen exposure. This package allows users to identify robust genomic markers to help establish cancer risk.
Last updated
softwaretechnologysequencingdnaseqwholegenomeclassificationsinglecellsurvivalmultiplecomparison
4.00 score 4 scripts 236 downloads
HERON - Hierarchical Epitope pROtein biNding
HERON is a software package for analyzing peptide binding array data. In addition to identifying significant binding probes, HERON also provides functions for finding epitopes (string of consecutive peptides within a protein). HERON also calculates significance on the probe, epitope, and protein level by employing meta p-value methods. HERON is designed for obtaining calls on the sample level and calculates fractions of hits for different conditions.
Last updated
microarraysoftware
4.00 score 1 stars 8 scripts 275 downloads
MICSQTL - MICSQTL (Multi-omic deconvolution, Integration and Cell-type-specific Quantitative Trait Loci)
Our pipeline, MICSQTL, utilizes scRNA-seq reference and bulk transcriptomes to estimate cellular composition in the matched bulk proteomes. The expression of genes and proteins at either bulk level or cell type level can be integrated by the Angle-based Joint and Individual Variation Explained (AJIVE) framework. Meanwhile, MICSQTL can perform cell-type-specific quantitative trait loci (QTL) mapping to proteins or transcripts based on the input of bulk expression data and the estimated cellular composition per molecule type, without the need for single cell sequencing. We use matched transcriptome-proteome from human brain frontal cortex tissue samples to demonstrate the input and output of our tool.
Last updated
geneexpressiongeneticsproteomicsrnaseqsequencingsinglecellsoftwarevisualizationcellbasedassayscoverage
4.00 score 3 scripts 292 downloads
MultimodalExperiment - Integrative Bulk and Single-Cell Experiment Container
MultimodalExperiment is an S4 class that integrates bulk and single-cell experiment data; it is optimally storage-efficient, and its methods are exceptionally fast. It effortlessly represents multimodal data of any nature and features normalized experiment, subject, sample, and cell annotations, which are related to underlying biological experiments through maps. Its coordination methods are opt-in and employ database-like join operations internally to deliver fast and flexible management of multimodal data.
Last updated
datarepresentationinfrastructuresinglecell
4.00 score 8 scripts 312 downloads
ChIPXpress - ChIPXpress: enhanced transcription factor target gene identification from ChIP-seq and ChIP-chip data using publicly available gene expression profiles
ChIPXpress takes as input predicted TF bound genes from ChIPx data and uses a corresponding database of gene expression profiles downloaded from NCBI GEO to rank the TF bound targets in order of which gene is most likely to be a functional TF target.
Last updated
chipchipchipseq
3.95 score 1 scripts 398 downloads
roastgsa - Rotation based gene set analysis
This package implements a variety of functions useful for gene set analysis using rotations to approximate the null distribution. It contributes with the implementation of seven test statistic scores that can be used with different goals and interpretations. Several functions are available to complement the statistical results with graphical representations.
Last updated
microarraypreprocessingnormalizationgeneexpressionsurvivaltranscriptionsequencingtranscriptomicsbayesianclusteringregressionrnaseqmicrornaarraymrnamicroarrayfunctionalgenomicssystemsbiologyimmunooncologydifferentialexpressiongenesetenrichmentbatcheffectmultiplecomparisonqualitycontroltimecoursemetabolomicsproteomicsepigeneticscheminformaticsexonarrayonechanneltwochannelproprietaryplatformscellbiologybiomedicalinformaticsalternativesplicingdifferentialsplicingdataimportpathways
3.95 score 15 scripts 272 downloads
Rcollectl - Help use collectl with R in Linux, to measure resource consumption in R processes
Provide functions to obtain instrumentation data on processes in a unix environment. Parse output of a collectl run. Visualize aspects of system usage over time, with annotation.
Last updated
softwareinfrastructure
3.95 score 3 stars 7 scripts 176 downloads
ggmanh - Visualization Tool for GWAS Result
Manhattan plot and QQ Plot are commonly used to visualize the end result of Genome Wide Association Study. The "ggmanh" package aims to keep the generation of these plots simple while maintaining customizability. Main functions include manhattan_plot, qqunif, and thinPoints.
Last updated
visualizationgenomewideassociationgenetics
3.93 score 43 scripts 406 downloads
BOBaFIT - Refitting diploid region profiles using a clustering procedure
This package provides a method to refit and correct the diploid region in copy number profiles. It uses a clustering algorithm to identify pathology-specific normal (diploid) chromosomes and then uses their copy number signal to refit the whole profile. The package is composed of three functions: DRrefit (the main function), ComputeNormalChromosome and PlotCluster.
Last updated
copynumbervariationclusteringvisualizationnormalizationsoftware
3.90 score 4 scripts 334 downloads
pfamAnalyzeR - Identification of domain isotypes in pfam data
Protein domains are one of the most important annotations of proteins we have, with the Pfam database/tool being (by far) the most used tool. This R package enables the user to read Pfam predictions from both webserver and stand-alone runs into R. We have recently shown that most human protein domains exist as multiple distinct variants termed domain isotypes. Different domain isotypes are used in a cell, tissue, and disease-specific manner. Accordingly, we find that domain isotypes, compared to each other, modulate, or abolish the functionality of a protein domain. This R package enables the identification and classification of such domain isotypes from Pfam data.
Last updated
alternativesplicingtranscriptomevariantbiomedicalinformaticsfunctionalgenomicssystemsbiologyannotationfunctionalpredictiongenepredictiondataimport
3.78 score 1 stars 1 dependents 2 scripts 621 downloads
profileplyr - Visualization and annotation of read signal over genomic ranges with profileplyr
Quick and straightforward visualization of read signal over genomic intervals is key for generating hypotheses from sequencing data sets (e.g. ChIP-seq, ATAC-seq, bisulfite/methyl-seq). Many tools both inside and outside of R and Bioconductor are available to explore these types of data, and they typically start with a bigWig or BAM file and end with some representation of the signal (e.g. heatmap). profileplyr leverages many Bioconductor tools to allow for both flexibility and additional functionality in workflows that end with visualization of the read signal.
Last updated
chipseqdataimportsequencingchiponchipcoverage
3.70 score 488 downloads
XeniumIO - Import and represent Xenium data from the 10X Xenium Analyzer
The package allows users to readily import spatial data obtained from the 10X Xenium Analyzer pipeline. Supported formats include 'parquet', 'h5', and 'mtx' files. The package mainly represents data as SpatialExperiment objects.
Last updated
softwareinfrastructuredataimportsinglecellspatialu24ca289073
3.70 score 3 scripts 228 downloads
pgxRpi - R wrapper for Progenetix
The package is an R wrapper for the Progenetix REST API, built upon the Beacon v2 protocol. Its purpose is to provide a seamless way of retrieving genomic data from the Progenetix database, an open resource dedicated to curated oncogenomic profiles. With this package, users can easily access and visualize data from Progenetix.
Last updated
copynumbervariationgenomicvariationdataimportsoftware
3.69 score 3 stars 11 scripts 283 downloads
OpenStats - A Robust and Scalable Software Package for Reproducible Analysis of High-Throughput genotype-phenotype association
Package contains several methods for statistical analysis of genotype to phenotype association in high-throughput screening pipelines.
Last updated
statisticalmethodbatcheffectbayesian
3.68 score 16 scripts 287 downloads
biobtreeR - Using biobtree tool from R
The biobtreeR package provides an interface to the [biobtree](https://github.com/tamerh/biobtree) tool, which covers a large set of bioinformatics datasets and provides search and chain mapping functionality.
Last updated
annotationbioinformatics
3.65 score 3 stars 4 scripts 368 downloads
iBMQ - integrated Bayesian Modeling of eQTL data
integrated Bayesian Modeling of eQTL data
Last updated
microarraypreprocessinggeneexpressionsnpgslopenmp
3.48 score 3 scripts 337 downloads
HilbertVisGUI - HilbertVisGUI
An interactive tool to visualize long vectors of integer data by means of Hilbert curves
Last updated
visualizationgtkmmatkmmpangommglibmmlibsigcppgtk+cpp
3.48 score 252 downloads
SICtools - Find SNV/Indel differences between two bam files with near relationship
This package finds SNV/Indel differences between two bam files with near relationship by pairwise comparison through each base position across the genome region of interest. The difference is inferred by Fisher test and Euclidean distance, the input of which is the base count (A,T,G,C) in a given position and read counts for indels that span no less than 2bp on both sides of the indel region.
Last updated
alignmentsequencingcoveragesequencematchingqualitycontroldataimportsoftwaresnpvariantdetection
3.48 score 2 scripts 472 downloads
QRscore - Quantile Rank Score
In genomics, differential analysis enables the discovery of groups of genes implicating important biological processes such as cell differentiation and aging. Non-parametric tests of differential gene expression usually detect shifts in centrality (such as mean or median), and therefore suffer from diminished power against alternative hypotheses characterized by shifts in spread (such as variance). This package provides a flexible family of non-parametric two-sample tests and K-sample tests, which is based on theoretical work around non-parametric tests, spacing statistics and local asymptotic normality (Erdmann-Pham et al., 2022+ [arXiv:2008.06664v2]; Erdmann-Pham, 2023+ [arXiv:2209.14235v2]).
Last updated
statisticalmethoddifferentialexpressiongeneexpressionstructuralgenomicsgenetarget
3.48 score 3 scripts 244 downloads
islify - Automatic scoring and classification of cell-based assay images
This software is meant to be used for classification of images of cell-based assays for neuronal surface autoantibody detection or similar techniques. It takes imaging files as input and creates a composite score from these, which can be used, for example, to classify samples as negative or positive for a certain antibody specificity. The name reflects how I thought of each individual picture during development: as an archipelago where different filters control the water level as well as the ground characteristics, thereby revealing islands of interest.
Last updated
softwarecellbasedassaysbiomedicalinformaticsfeatureextractionvisualizationpathwaysclassificationopenjdk
3.48 score 2 scripts 245 downloads
MPAC - Multi-omic Pathway Analysis of Cells
Multi-omic Pathway Analysis of Cells (MPAC) integrates multi-omic data for understanding cellular mechanisms. It predicts novel patient groups with distinct pathway profiles and identifies key pathway proteins with potential clinical associations. From CNA and RNA-seq data, it determines genes’ DNA and RNA states (i.e., repressed, normal, or activated), which serve as the input for PARADIGM to calculate Inferred Pathway Levels (IPLs). It also permutes DNA and RNA states to create a background distribution to filter IPLs as a way to remove events observed by chance. It provides multiple methods for downstream analysis and visualization.
Last updated
softwaretechnologysequencingrnaseqsurvivalclusteringimmunooncology
3.48 score 4 scripts 244 downloads
terapadog - Translational Efficiency Regulation Analysis using the PADOG Method
This package performs a Gene Set Analysis with the approach adopted by PADOG on the genes that are reported as translationally regulated (i.e., exhibit a significant change in TE) by the DeltaTE package. It can be used on its own to see the impact of translation regulation on gene sets, but it is also integrated as an additional analysis method within ReactomeGSA, where results are further contextualised in terms of pathways and directionality of the change.
Last updated
riboseqtranscriptomicsgenesetenrichmentgeneregulationreactomesoftware
3.30 score 9 scripts 234 downloads
tidysbml - Extract SBML's data into dataframes
Starting from one SBML file, it extracts information from each listOfCompartments, listOfSpecies and listOfReactions element by saving them into data frames. Each table provides one row for each entity (i.e. either compartment, species, reaction or speciesReference) and one set of columns for the attributes, one column for the content of the 'notes' subelement and one set of columns for the content of the 'annotation' subelement.
Last updated
graphandnetworknetworkpathwayssoftware
3.30 score 2 stars 8 scripts 249 downloads
kmcut - Optimized Kaplan Meier analysis and identification and validation of prognostic biomarkers
The purpose of the package is to identify prognostic biomarkers and an optimal numeric cutoff for each biomarker that can be used to stratify a group of test subjects (samples) into two sub-groups with significantly different survival (better vs. worse). The package was developed for the analysis of gene expression data, such as RNA-seq. However, it can be used with any quantitative variable that has a sufficiently large proportion of unique values.
Last updated
softwarestatisticalmethodgeneexpressionsurvival
3.30 score 2 scripts 231 downloads
ClustAll - ClustAll: Data driven strategy to robustly identify stratification of patients within complex diseases
Data driven strategy to find hidden groups of patients with complex diseases using clinical data. ClustAll facilitates the unsupervised identification of multiple robust stratifications. ClustAll is able to overcome the most common limitations found when dealing with clinical data (missing values, correlated data, mixed data types).
Last updated
softwarestatisticalmethodclusteringdimensionreductionprincipalcomponent
3.30 score 6 scripts 258 downloads
biomvRCNS - Copy Number study and Segmentation for multivariate biological data
In this package, a Hidden Semi Markov Model (HSMM) and one homogeneous segmentation model are designed and implemented for segmenting genomic data, with the aim of assisting in transcript detection using high-throughput technology like RNA-seq or tiling array, and copy number analysis using aCGH or sequencing.
Last updated
acghcopynumbervariationmicroarraysequencingvisualizationgenetics
3.30 score 2 scripts 480 downloads
annmap - Genome annotation and visualisation package pertaining to Affymetrix arrays and NGS analysis.
annmap provides annotation mappings for Affymetrix exon arrays and coordinate based queries to support deep sequencing data analysis. Database access is hidden behind the API which provides a set of functions such as genesInRange(), geneToExon(), exonDetails(), etc. Functions to plot gene architecture and BAM file data are also provided. Underlying data are from Ensembl. The annmap database can be downloaded from: https://figshare.manchester.ac.uk/account/articles/16685071
Last updated
annotationmicroarrayonechannelreportwritingtranscriptionvisualization
3.18 score 5 scripts 506 downloads
xenLite - Simple classes and methods for managing Xenium datasets
Define a relatively light class for managing Xenium data using Bioconductor. Address use of parquet for coordinates, SpatialExperiment for assay and sample data. Address serialization and use of cloud storage.
Last updated
infrastructureu24ca289073
3.00 score 1 stars 4 scripts 226 downloads
CoverageView - Coverage visualization package for R
This package provides a framework for the visualization of genome coverage profiles. It can be used for ChIP-seq experiments, but it can also be used for genome-wide nucleosome positioning experiments or other experiment types where it is important to have a framework for inspecting how the coverage is distributed across the genome.
Last updated
immunooncologyvisualizationrnaseqchipseqsequencingtechnologysoftware
2.95 score 7 scripts 502 downloads
GeneGA - Design gene based on both mRNA secondary structure and codon usage bias using Genetic algorithm
R-based genetic algorithm for gene expression optimization that considers both mRNA secondary structure and codon usage bias. GeneGA includes information on highly expressed genes of almost 200 genomes. The Vienna RNA Package is required for GeneGA to function properly.
Last updated
geneexpression
2.48 score 8 scripts 296 downloads
rhdf5 - R Interface to HDF5
This package provides an interface between HDF5 and R. HDF5's main features are the ability to store and access very large and/or complex datasets and a wide variety of metadata on mass storage (disk) through a completely portable file format. The rhdf5 package is thus suited for the exchange of large and/or complex datasets between R and other software packages, and for letting R applications work on datasets that are larger than the available RAM.
Last updated
infrastructuredataimporthdf5rhdf5curlopensslcpp
16.77 score 72 stars 230 dependents 6.3k scripts 43k downloads
Biobase - Biobase: Base functions for Bioconductor
Functions that are needed by many other packages or which replace R functions.
Last updated
infrastructurebioconductor-packagecore-package
16.48 score 9 stars 1.9k dependents 8.0k scripts 103k downloads
enrichplot - Visualization of Functional Enrichment Result
The 'enrichplot' package provides visualization methods for interpreting functional enrichment results from ORA or GSEA analyses. It is designed to work with the 'clusterProfiler' ecosystem and builds on 'ggplot2' for flexible and extensible graphics.
Last updated
annotationgenesetenrichmentgokeggpathwayssoftwarevisualizationenrichment-analysispathway-analysisquarto
16.46 score 258 stars 56 dependents 7.1k scripts 44k downloads
S4Vectors - Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Last updated
infrastructuredatarepresentationbioconductor-packagecore-package
16.45 score 18 stars 2.0k dependents 1.8k scripts 118k downloads
IRanges - Foundation of integer range manipulation in Bioconductor
Provides efficient low-level and highly reusable S4 classes for storing, manipulating and aggregating over annotated ranges of integers. Implements an algebra of range operations, including efficient algorithms for finding overlaps and nearest neighbors. Defines efficient list-like classes for storing, transforming and aggregating large grouped data, i.e., collections of atomic vectors and DataFrames.
Last updated
infrastructuredatarepresentationbioconductor-packagecore-packageopenmp
16.41 score 23 stars 1.9k dependents 3.1k scripts 117k downloads
GenomeInfoDb - Utilities for manipulating chromosome names, including modifying them to follow a particular naming style
Contains data and functions that define and allow translation between different chromosome sequence naming conventions (e.g., "chr1" versus "1"), including a function that attempts to place sequence names in their natural, rather than lexicographic, order.
Last updated
geneticsdatarepresentationannotationgenomeannotationbioconductor-packagecore-package
16.08 score 33 stars 329 dependents 2.4k scripts 102k downloads
DelayedArray - A unified framework for working transparently with on-disk and in-memory array-like datasets
Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.
Last updated
infrastructuredatarepresentationannotationgenomeannotationbioconductor-packagecore-packageu24ca289073
15.63 score 29 stars 1.4k dependents 762 scripts 94k downloads
GSVA - Gene Set Variation Analysis for Microarray and RNA-Seq Data
Gene Set Variation Analysis (GSVA) is a non-parametric, unsupervised method for estimating variation of gene set enrichment through the samples of an expression data set. GSVA performs a change in coordinate systems, transforming the data from a gene by sample matrix to a gene-set by sample matrix, thereby allowing the evaluation of pathway enrichment for each sample. This new matrix of GSVA enrichment scores facilitates applying standard analytical methods like functional enrichment, survival analysis, clustering, CNV-pathway analysis or cross-tissue pathway analysis, in a pathway-centric manner.
Last updated
functionalgenomicsmicroarrayrnaseqpathwaysgenesetenrichmentgene-set-enrichmentgenomicspathway-enrichment-analysis
15.53 score 242 stars 21 dependents 3.2k scripts 15k downloads
MSnbase - Base Functions and Classes for Mass Spectrometry and Proteomics
MSnbase provides infrastructure for manipulation, processing and visualisation of mass spectrometry and proteomics data, ranging from raw to quantitative and annotated data.
Last updated
immunooncologyinfrastructureproteomicsmassspectrometryqualitycontroldataimportbioconductorbioinformaticsmass-spectrometryproteomics-datavisualisationcpp
14.63 score 137 stars 37 dependents 892 scripts 5.9k downloads
scran - Methods for Single-Cell RNA-Seq Data Analysis
Implements miscellaneous functions for interpretation of single-cell RNA-seq data. Methods are provided for assignment of cell cycle phase, detection of highly variable and significantly correlated genes, identification of marker genes, and other common tasks in routine single-cell analysis workflows.
Last updated
immunooncologynormalizationsequencingrnaseqsoftwaregeneexpressiontranscriptomicssinglecellclusteringbioconductor-packagehuman-cell-atlassingle-cell-rna-seqopenblascpp
13.44 score 48 stars 41 dependents 9.1k scripts 10k downloads
edgeR - Empirical Analysis of Digital Gene Expression Data in R
Differential expression analysis of sequence count data. Implements a range of statistical methodology based on the negative binomial distributions, including empirical Bayes estimation, exact tests, generalized linear models, quasi-likelihood, and gene set enrichment. Can perform differential analyses of any type of omics data that produces read counts, including RNA-seq, ChIP-seq, ATAC-seq, Bisulfite-seq, SAGE, CAGE, metabolomics, or proteomics spectral counts. RNA-seq analyses can be conducted at the gene or isoform level, and tests can be conducted for differential exon or transcript usage.
Last updated
alternativesplicingbatcheffectbayesianbiomedicalinformaticscellbiologychipseqclusteringcoveragedifferentialexpressiondifferentialmethylationdifferentialsplicingdnamethylationepigeneticsfunctionalgenomicsgeneexpressiongenesetenrichmentgeneticsimmunooncologymultiplecomparisonnormalizationpathwaysproteomicsqualitycontrolregressionrnaseqsagesequencingsinglecellsystemsbiologytimecoursetranscriptiontranscriptomicsopenblas
13.39 score 282 dependents 24k scripts 43k downloads
minfi - Analyze Illumina Infinium DNA methylation arrays
Tools to analyze & visualize Illumina Infinium methylation arrays.
Last updated
immunooncologydnamethylationdifferentialmethylationepigeneticsmicroarraymethylationarraymultichanneltwochanneldataimportnormalizationpreprocessingqualitycontrol
13.33 score 64 stars 34 dependents 1.6k scripts 4.7k downloads
SingleR - Reference-Based Single-Cell RNA-Seq Annotation
Performs unbiased cell type recognition from single-cell RNA sequencing data, by leveraging reference transcriptomic datasets of pure cell types to infer the cell of origin of each single cell independently.
Last updated
softwaresinglecellgeneexpressiontranscriptomicsclassificationclusteringannotationbioconductorsinglercpp
13.24 score 202 stars 3 dependents 3.0k scripts 7.2k downloads
SparseArray - High-performance sparse data representation and manipulation in R
The SparseArray package provides array-like containers for efficient in-memory representation of multidimensional sparse data in R (arrays and matrices). The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data: the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic as much as possible the behavior of ordinary matrix and array objects in base R. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.
Last updated
infrastructuredatarepresentationbioconductor-packagecore-packageu24ca289073openmp
12.80 score 11 stars 1.4k dependents 103 scripts 91k downloads
OmnipathR - OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).
Last updated
graphandnetworknetworkpathwayssoftwarethirdpartyclientdataimportdatarepresentationgenesignalinggeneregulationsystemsbiologytranscriptomicssinglecellannotationkeggcomplexesenzyme-ptmnetworksnetworks-biologyomnipathproteinsquarto
12.70 score 166 stars 4 dependents 424 scripts 1.8k downloads
variancePartition - Quantify and interpret drivers of variation in multilevel gene expression experiments
Quantify and interpret multiple sources of biological and technical variation in gene expression experiments. Uses a linear mixed model to quantify variation in gene expression attributable to individual, tissue, time point, or technical variables. Includes dream differential expression analysis for repeated measures.
Last updated
rnaseqgeneexpressiongenesetenrichmentdifferentialexpressionbatcheffectqualitycontrolregressionepigeneticsfunctionalgenomicstranscriptomicsnormalizationpreprocessingmicroarrayimmunooncologysoftware
12.67 score 16 stars 6 dependents 1.5k scripts 2.2k downloads
ggcyto - Visualize Cytometry data with ggplot
With the dedicated fortify method implemented for flowSet, ncdfFlowSet and GatingSet classes, both raw and gated flow cytometry data can be plotted directly with ggplot. The ggcyto wrapper and some custom layers also make it easy to add gates and population statistics to the plot.
Last updated
immunooncologyflowcytometrycellbasedassaysinfrastructurevisualization
12.37 score 65 stars 7 dependents 559 scripts 2.0k downloads
SeqArray - Data management of large-scale whole-genome sequence variant calls using GDS files
Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.
Last updated
infrastructuredatarepresentationsequencinggeneticsbioinformaticsgds-formatsnpsnvweswgscpp
12.34 score 46 stars 9 dependents 1.4k scripts 1.8k downloads
glmGamPoi - Fit a Gamma-Poisson Generalized Linear Model
Fit linear models to overdispersed count data. The package can estimate the overdispersion and fit repeated models for matrix input. It is designed to handle large input datasets as they typically occur in single cell RNA-seq experiments.
Last updated
regressionrnaseqsoftwaresinglecellgamma-poissonglmnegative-binomial-regressionon-diskopenblascpp
12.12 score 122 stars 4 dependents 2.1k scripts 10k downloads
QFeatures - Quantitative features for mass spectrometry data
The QFeatures infrastructure enables the management and processing of quantitative features for high-throughput mass spectrometry assays. It provides a familiar Bioconductor user experience to manage quantitative data across different assay levels (such as peptide spectrum matches, peptides and proteins) in a coherent and tractable format.
Last updated
infrastructuremassspectrometryproteomicsmetabolomicsbioconductormass-spectrometry
12.08 score 29 stars 56 dependents 398 scripts 4.2k downloads
zellkonverter - Conversion Between scRNA-seq Objects
Provides methods to convert between Python AnnData objects and SingleCellExperiment objects. These are primarily intended for use by downstream Bioconductor packages that wrap Python methods for single-cell data analysis. It also includes functions to read and write H5AD files used for saving AnnData objects to disk.
Last updated
singlecelldataimportdatarepresentationbioconductorconversionscrna-seq
12.02 score 210 stars 6 dependents 1.4k scripts 4.0k downloads
BiocNeighbors - Nearest Neighbor Detection for Bioconductor Packages
Implements exact and approximate methods for nearest neighbor detection, in a framework that allows them to be easily switched within Bioconductor packages or workflows. Exact searches can be performed using the k-means for k-nearest neighbors algorithm, vantage point trees, or an exhaustive search. Approximate searches can be performed using the Annoy or HNSW libraries. Each search can be performed with a variety of different distance metrics, parallelization, and variable numbers of neighbors. Range-based searches (to find all neighbors within a certain distance) are also supported.
Last updated
clusteringclassificationbioconductor-packagehuman-cell-atlasnearest-neighbor-searchcpp
11.90 score 6 stars 111 dependents 844 scripts 20k downloads
bumphunter - Bump Hunter
Tools for finding bumps in genomic data
Last updated
dnamethylationepigeneticsinfrastructuremultiplecomparisonimmunooncology
11.89 score 18 stars 51 dependents 206 scripts 4.9k downloads
scRepertoire - A toolkit for single-cell immune receptor profiling
scRepertoire is a toolkit for processing and analyzing single-cell T-cell receptor (TCR) and immunoglobulin (Ig) data. The scRepertoire framework supports use of 10x, AIRR, BD, MiXCR, TRUST4, and WAT3R single-cell formats. The functionality includes basic clonal analyses, repertoire summaries, distance-based clustering and interaction with the popular Seurat and SingleCellExperiment/Bioconductor R single-cell workflows.
Last updated
softwareimmunooncologysinglecellclassificationannotationsequencingcpp
11.63 score 368 stars 1 dependents 634 scripts 1.5k downloads
splatter - Simple Simulation of Single-cell RNA Sequencing Data
Splatter is a package for the simulation of single-cell RNA sequencing count data. It provides a simple interface for creating complex simulations that are reproducible and well-documented. Parameters can be estimated from real data and functions are provided for comparing real and simulated datasets.
Last updated
singlecellrnaseqtranscriptomicsgeneexpressionsequencingsoftwareimmunooncologybioconductorbioinformaticsscrna-seqsimulation
11.63 score 235 stars 1 dependents 532 scripts 848 downloads
miloR - Differential neighbourhood abundance testing on a graph
Milo performs single-cell differential abundance testing. Cell states are modelled as representative neighbourhoods on a nearest neighbour graph. Hypothesis testing is performed using either a negative binomial generalized linear model or a negative binomial generalized linear mixed model.
Last updated
singlecellmultiplecomparisonfunctionalgenomicssoftwareopenblascppopenmp
11.37 score 433 stars 2 dependents 578 scripts 1.3k downloads
tximeta - Transcript Quantification Import with Automatic Metadata
Transcript quantification import from Salmon and other quantifiers with automatic attachment of transcript ranges and release information, and other associated metadata. De novo transcriptomes can be linked to the appropriate sources with linkedTxomes and shared for computational reproducibility.
Last updated
annotationgenomeannotationdataimportpreprocessingrnaseqlongreadsinglecelltranscriptomicstranscriptiongeneexpressionfunctionalgenomicsreproducibleresearchreportwritingimmunooncology
11.30 score 72 stars 1 dependents 584 scripts 2.8k downloads
scater - Single-Cell Analysis Toolkit for Gene Expression Data in R
A collection of tools for doing various analyses of single-cell RNA-seq gene expression data, with a focus on quality control and visualization.
Last updated
immunooncologysinglecellrnaseqqualitycontrolpreprocessingnormalizationvisualizationdimensionreductiontranscriptomicsgeneexpressionsequencingsoftwaredataimportdatarepresentationinfrastructurecoverage
11.15 score 57 dependents 13k scripts 16k downloads
MsCoreUtils - Core Utils for Mass Spectrometry Data
MsCoreUtils defines low-level functions for mass spectrometry data and is independent of any high-level data structures. These functions include mass spectra processing functions (noise estimation, smoothing, binning, baseline estimation), quantitative aggregation functions (median polish, robust summarisation, ...), missing data imputation, data normalisation (quantiles, vsn, ...), and miscellaneous helper functions that are used across the high-level data structures within the R for Mass Spectrometry packages.
Last updated
infrastructureproteomicsmassspectrometrymetabolomicsbioconductormass-spectrometryutils
11.11 score 17 stars 80 dependents 88 scripts 5.5k downloads
Glimma - Interactive visualizations for gene expression analysis
This package produces interactive visualizations for RNA-seq data analysis, utilizing output from limma, edgeR, or DESeq2. It produces interactive htmlwidgets versions of popular RNA-seq analysis plots to enhance the exploration of analysis results by overlaying interactive features. The plots can be viewed in a web browser or embedded in notebook documents.
Last updated
differentialexpressiongeneexpressionmicroarrayreportwritingrnaseqsequencingvisualizationdifferential-expressioninteractive-visualizations
10.81 score 35 stars 2 dependents 748 scripts 1.8k downloads
AnVIL - Bioconductor on the AnVIL compute environment
The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVIL package provides programmatic access to the Dockstore, Leonardo, Rawls, TDR, and Terra RESTful programming interfaces. For platform-specific user-level functionality, see either the AnVILGCP or AnVILAz package.
Last updated
infrastructureu24hg010263
10.57 score 7 stars 12 dependents 317 scripts 1.2k downloads
MetaboCoreUtils - Core Utils for Metabolomics Data
MetaboCoreUtils defines metabolomics-related core functionality provided as low-level functions to allow a data structure-independent usage across various R packages. This includes functions to convert between ion (adduct) and compound mass-to-charge ratios and masses, and functions to work with chemical formulas. The package also provides a set of adduct definitions and information on some commercially available internal standard mixes commonly used in MS experiments.
Last updated
infrastructuremetabolomicsmassspectrometrymass-spectrometry
10.29 score 9 stars 65 dependents 101 scripts 3.3k downloads
TCGAutils - TCGA utility functions for data management
A suite of helper functions for checking and manipulating TCGA data including data obtained from the curatedTCGAData experiment package. These functions aim to simplify working with TCGA data and make it more manageable. Exported functions include those that import data from flat files into Bioconductor objects, convert row annotations, and translate identifiers via the GDC API.
Last updated
softwareworkflowsteppreprocessingdataimportbioconductor-packagetcgau24ca289073utilities
10.26 score 29 stars 11 dependents 267 scripts 2.2k downloads
cBioPortalData - Exposes and Makes Available Data from the cBioPortal Web Resources
The cBioPortalData R package accesses study datasets from the cBio Cancer Genomics Portal. It accesses the data either from the pre-packaged zip / tar files or from the API that was recently implemented by the cBioPortal Data Team. The package can provide data either in tabular format or as a MultiAssayExperiment object that uses familiar Bioconductor data representations.
Last updated
softwareinfrastructurethirdpartyclientbioconductor-packagenci-itcru24ca289073
10.23 score 34 stars 4 dependents 254 scripts 836 downloads
BiocIO - Standard Input and Output for Bioconductor Packages
The `BiocIO` package contains high-level abstract classes and generics used by developers to build IO functionality within the Bioconductor suite of packages. Implements `import()` and `export()` standard generics for importing and exporting biological data formats. `import()` supports whole-file as well as chunk-wise iterative import. The `import()` interface optionally provides a standard mechanism for 'lazy' access via `filter()` (on row or element-like components of the file resource), `select()` (on column-like components of the file resource) and `collect()`. The `import()` interface optionally provides transparent access to remote (e.g. via https) as well as local resources. Developers can register a file extension, e.g., `.loom`, for dispatch from character-based URIs to specific `import()` / `export()` methods based on classes representing file types, e.g., `LoomFile()`.
Last updated
annotationdataimportbioconductor-packagecore-package
10.06 score 1 stars 515 dependents 36 scripts 34k downloads
flowCore - flowCore: Basic structures for flow cytometry data
Provides S4 data structures and basic functions to deal with flow cytometry data.
Last updated
immunooncologyinfrastructureflowcytometrycellbasedassayscurlopensslopenblascpp
9.77 score 61 dependents 2.2k scripts 5.8k downloads
maaslin3 - "Refining and extending generalized multivariate linear models for meta-omic association discovery"
MaAsLin 3 refines and extends generalized multivariate linear models for meta-omic association discovery. It finds abundance and prevalence associations between microbiome meta-omics features and complex metadata in population-scale epidemiological studies. The software includes multiple analysis methods (including support for multiple covariates, repeated measures, and ordered predictors), filtering, normalization, and transform options to customize analysis for your specific study.
Last updated
metagenomicssoftwaremicrobiomenormalizationmultiplecomparisonr-tools
9.75 score 83 stars 2 dependents 147 scripts 502 downloads
SpiecEasi - Sparse Inverse Covariance for Ecological Statistical Inference
Estimate networks from the precision matrix of compositional microbial abundance data.
Last updated
softwaremicrobiomemetagenomicsgraphandnetworknetworkinferenceopenblascpp
9.50 score 231 stars 656 scripts 336 downloads
flowWorkspace - Infrastructure for representing and interacting with gated and ungated cytometry data sets.
This package is designed to facilitate comparison of automated gating methods against manual gating done in flowJo. This package allows you to import basic flowJo workspaces into Bioconductor and replicate the gating from flowJo using the flowCore functionality. Gating hierarchies, groups of samples, compensation, and transformation are performed so that the output matches the flowJo analysis.
Last updated
immunooncologyflowcytometrydataimportpreprocessingdatarepresentationcurlopensslopenblascpp
9.50 score 13 dependents 682 scripts 3.0k downloads
Banksy - Spatial transcriptomic clustering
Banksy is an R package that incorporates spatial information to cluster cells in a feature space (e.g. gene expression). To incorporate spatial information, BANKSY computes the mean neighborhood expression and azimuthal Gabor filters that capture gene expression gradients. These features are combined with the cell's own expression to embed cells in a neighbor-augmented product space which can then be clustered, allowing for accurate and spatially-aware cell typing and tissue domain segmentation.
Last updated
clusteringspatialsinglecellgeneexpressiondimensionreductionclustering-algorithmsingle-cell-omicsspatial-omics
9.46 score 148 stars 404 scripts 828 downloads
BiocBaseUtils - Utility and internal functions for Bioconductor packages
The package coalesces typical helper functions that are scattered throughout the Bioconductor ecosystem. It aims to reduce code redundancy by formalizing functions often used by Bioconductor developers. These functions include operations such as replacing slots in an object, selecting observations for show methods, labeling function life cycles, and more.
Last updated
softwareinfrastructurebioconductor-packagecore-package
9.38 score 4 stars 713 dependents 7 scripts 11k downloads
assorthead - Assorted Header-Only C++ Libraries
Vendors an assortment of useful header-only C++ libraries. Bioconductor packages can use these libraries in their own C++ code by LinkingTo this package without introducing any additional dependencies. The use of a central repository avoids duplicate vendoring of libraries across multiple R packages, and enables better coordination of version updates across cohorts of interdependent C++ libraries.
Last updated
singlecellqualitycontrolnormalizationdatarepresentationdataimportdifferentialexpressionalignment
9.29 score 1 stars 207 dependents 20k downloads
scPipe - Pipeline for single cell multi-omic data pre-processing
A preprocessing pipeline for single cell RNA-seq/ATAC-seq data that starts from the fastq files and produces a feature count matrix with associated quality control information. It can process fastq data generated by CEL-seq, MARS-seq, Drop-seq, Chromium 10x and SMART-seq protocols.
Last updated
immunooncologysoftwaresequencingrnaseqgeneexpressionsinglecellvisualizationsequencematchingpreprocessingqualitycontrolgenomeannotationdataimportcurlbzip2xz-utilszlibcpp
9.22 score 67 stars 1 dependents 90 scripts 485 downloads
orthogene - Gene mapping made easy
`orthogene` is an R package for easy mapping of orthologous genes across hundreds of species. It pulls up-to-date gene ortholog mappings across **700+ organisms**. It also provides various utility functions to aggregate/expand common objects (e.g. data.frames, gene expression matrices, lists) using **1:1**, **many:1**, **1:many** or **many:many** gene mappings, both within- and between-species.
Last updated
geneticscomparativegenomicspreprocessingphylogeneticstranscriptomicsgeneexpressionanimal-modelsbioconductorbioconductor-packagebioinformaticsbiomedicinecomparative-genomicsevolutionary-biologygenesgenomicsontologiestranslational-research
9.13 score 56 stars 3 dependents 82 scripts 1.0k downloads
iCOBRA - Comparison and Visualization of Ranking and Assignment Methods
This package provides functions for calculation and visualization of performance metrics for evaluation of ranking and binary classification (assignment) methods. Various types of performance plots can be generated programmatically. The package also contains a shiny application for interactive exploration of results.
Last updated
classificationvisualization
9.01 score 16 stars 1 dependents 237 scripts 692 downloads
pwalign - Perform pairwise sequence alignments
The two main functions in the package are pairwiseAlignment() and stringDist(). The former solves (Needleman-Wunsch) global alignment, (Smith-Waterman) local alignment, and (ends-free) overlap alignment problems. The latter computes the Levenshtein edit distance or pairwise alignment score matrix for a set of strings.
Last updated
alignmentsequencematchingsequencinggeneticsbioconductor-package
8.93 score 1 stars 111 dependents 137 scripts 12k downloads
GenomicScores - Infrastructure to work with genomewide position-specific scores
Provide infrastructure to store and access genomewide position-specific scores within R and Bioconductor.
Last updated
infrastructuregeneticsannotationsequencingcoverageannotationhubsoftware
8.93 score 9 stars 6 dependents 100 scripts 1.2k downloads
imcRtools - Methods for imaging mass cytometry data analysis
This R package supports the handling and analysis of imaging mass cytometry and other highly multiplexed imaging data. The main functionality includes reading in single-cell data after image segmentation and measurement, data formatting to perform channel spillover correction and a number of spatial analysis approaches. First, cell-cell interactions are detected via spatial graph construction; these graphs can be visualized with cells representing nodes and interactions representing edges. Furthermore, per cell, its direct neighbours are summarized to allow spatial clustering. Per image/grouping level, interactions between types of cells are counted, averaged and compared against random permutations. In that way, types of cells that interact more (attraction) or less (avoidance) frequently than expected by chance are detected.
Last updated
immunooncologysinglecellspatialdataimportclusteringimcsingle-cell
8.84 score 31 stars 426 scripts 592 downloads
BiocPkgTools - Collection of simple tools for learning about Bioconductor Packages
Bioconductor has a rich ecosystem of metadata around packages, usage, and build status. This package is a simple collection of functions to access that metadata from R. The goal is to expose metadata for data mining and value-added functionality such as package searching, text mining, and analytics on packages.
Last updated
softwareinfrastructurebioconductormetadatau24ca289073
8.80 score 22 stars 1 dependents 98 scripts 664 downloads
FLAMES - FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data
Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. Flames provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.
Last updated
rnaseqsinglecelltranscriptomicsdataimportdifferentialsplicingalternativesplicinggeneexpressionlongreadrustcargozlibcurlbzip2xz-utilscpp
8.73 score 63 stars 34 scripts 470 downloads
sangeranalyseR - sangeranalyseR: a suite of functions for the analysis of Sanger sequence data in R
This package builds on sangerseqR to allow users to create contigs from collections of Sanger sequencing reads. It provides a wide range of options for a number of commonly performed actions, including read trimming, detecting secondary peaks, and detecting indels using a reference sequence. All parameters can be adjusted interactively either in R or in the associated Shiny applications. There is extensive online documentation, and the package can output detailed HTML reports, including chromatograms.
Last updated
geneticsalignmentsequencingsangerseqpreprocessingqualitycontrolvisualizationguicpp
8.67 score 104 stars 64 scripts 628 downloads
bamsignals - Extract read count signals from bam files
This package efficiently obtains count vectors from indexed bam files. It counts the number of reads in given genomic ranges and computes read profiles and coverage profiles. It also handles paired-end data.
Last updated
dataimportsequencingcoveragealignmentcurlbzip2xz-utilszlibcpp
8.63 score 15 stars 9 dependents 2.1k downloads
lefser - R implementation of the LEfSE method for microbiome biomarker discovery
lefser is the R implementation of the popular microbiome biomarker discovery tool, LEfSe. It uses the Kruskal-Wallis test, Wilcoxon rank-sum test, and Linear Discriminant Analysis to find biomarkers from two-level classes (and optional sub-classes).
Last updated
softwaresequencingdifferentialexpressionmicrobiomestatisticalmethodclassificationbioconductor-packager01ca230551
8.56 score 66 stars 110 scripts 1.2k downloads
scrapper - Bindings to C++ Libraries for Single-Cell Analysis
Implements R bindings to C++ code for analyzing single-cell (expression) data, mostly from various libscran libraries. Each function performs an individual step in the single-cell analysis workflow, ranging from quality control to clustering and marker detection. Additional wrappers are provided for easy construction of end-to-end workflows involving Bioconductor objects like SingleCellExperiments.
Last updated
normalizationrnaseqsoftwaregeneexpressiontranscriptomicssinglecellbatcheffectqualitycontroldifferentialexpressionfeatureextractionprincipalcomponentclusteringopenblascpp
8.50 score 8 stars 6 dependents 109 scripts 1.2k downloads
COTAN - COexpression Tables ANalysis
Statistical and computational method to analyze the co-expression of gene pairs at single cell level. It provides the foundation for single-cell gene interactome analysis. The basic idea is studying the zero UMI counts' distribution instead of focusing on positive counts; this is done with a generalized contingency tables framework. COTAN can effectively assess the correlated or anti-correlated expression of gene pairs. It provides a numerical index related to the correlation and an approximate p-value for the associated independence test. COTAN can also evaluate whether single genes are differentially expressed, scoring them with a newly defined global differentiation index. Moreover, this approach provides ways to plot and cluster genes according to their co-expression pattern with other genes, effectively helping the study of gene interactions and becoming a new tool to identify cell-identity marker genes.
Last updated
systemsbiologytranscriptomicsgeneexpressionsinglecelldifferentialexpressionclusteringgpu
8.42 score 17 stars 130 scripts 460 downloads
igvR - igvR: integrative genomics viewer
Access to igv.js, the Integrative Genomics Viewer running in a web browser.
Last updated
visualizationthirdpartyclientgenomebrowsers
8.37 score 45 stars 117 scripts 479 downloads
PTMods - Managing Post-Translational Modifications in R
An interface to the community supported database for amino acid/protein modifications using mass spectrometry.
Last updated
proteomicsmassspectrometryamino-acid-modificationsmass-spectrometryprotein
8.13 score 11 stars 41 dependents 2 scripts 599 downloads
ontoProc - processing of ontologies of anatomy, cell lines, and so on
Support harvesting of diverse bioinformatic ontologies, making particular use of the ontologyIndex package on CRAN. We provide snapshots of key ontologies for terms about cells, cell lines, chemical compounds, and anatomy, to help analyze genome-scale experiments, particularly cell x compound screens. Another purpose is to strengthen development of compelling use cases for richer interfaces to emerging ontologies.
Last updated
infrastructuregobioinformaticsgenomicsontology
8.10 score 5 stars 2 dependents 77 scripts 584 downloads
arrayQualityMetrics - Quality metrics report for microarray data sets
This package generates microarray quality metrics reports for data in Bioconductor microarray data containers (ExpressionSet, NChannelSet, AffyBatch). One and two color array platforms are supported.
Last updated
microarrayqualitycontrolonechanneltwochannelreportwritingbioconductor
8.09 score 1 stars 337 scripts 1.0k downloads
chromVAR - Chromatin Variation Across Regions
Determine variation in chromatin accessibility across sets of annotations or peaks. Designed primarily for single-cell or sparse chromatin accessibility data, e.g. from scATAC-seq or sparse bulk ATAC or DNAse-seq experiments.
Last updated
singlecellsequencinggeneregulationimmunooncologycpp
7.88 score 1.3k scripts 2.9k downloads
BgeeDB - Annotation and gene expression data retrieval from Bgee database. TopAnat, an anatomical entities Enrichment Analysis tool for UBERON ontology
A package for downloading annotation and gene expression data from the Bgee database, and for TopAnat analysis: GO-like enrichment of anatomical terms, mapped to genes by expression patterns.
Last updated
softwaredataimportsequencinggeneexpressionmicroarraygogenesetenrichmentbioinformaticsenrichment-analysisrna-seqscrna-seqsingle-cell
7.84 score 16 stars 1 dependents 28 scripts 491 downloads
OrganismDbi - Software to enable the smooth interfacing of different database packages
The package enables a simple unified interface to several annotation packages, each of which has its own schema, by taking advantage of the fact that each of these packages implements a select method.
Last updated
annotationinfrastructure
7.63 score 37 dependents 38 scripts 4.1k downloads
gDRimport - Package for handling the import of dose-response data
The package is a part of the gDR suite. It helps to prepare raw drug response data for downstream processing. It mainly contains helper functions for importing/loading/validating dose-response data provided in different file formats.
Last updated
softwareinfrastructuredataimport
7.52 score 3 stars 1 dependents 14 scripts 323 downloads
DiffBind - Differential Binding Analysis of ChIP-Seq Peak Data
Compute differentially bound sites from multiple ChIP-seq experiments using affinity (quantitative) data. Also enables occupancy (overlap) analysis and plotting functions.
Last updated
sequencingchipseqatacseqdnaseseqmethylseqripseqdifferentialpeakcallingdifferentialmethylationgeneregulationhistonemodificationpeakdetectionbiomedicalinformaticscellbiologymultiplecomparisonnormalizationreportwritingepigeneticsfunctionalgenomicscurlbzip2xz-utilszlibcpp
7.50 score 2 dependents 672 scripts 2.6k downloads
MetaProViz - METabolomics pre-PRocessing, functiOnal analysis and VIZualisation
MetaProViz can analyse standard metabolomics and exometabolomics data (CoRe). It performs pre-processing including feature filtering, missing value imputation, normalisation and outlier detection. It performs functional analysis including differential metabolite analysis (DMA) and clustering based on regulatory rules (MCA), and contains different visualisation methods to extract biologically interpretable graphs and save them in a publication-ready format.
Last updated
clusteringmetabolomicspathwaysqualitycontrolsoftwaresystemsbiologyvisualizationquarto
7.44 score 20 stars 30 scripts 219 downloads
igvShiny - igvShiny: a wrapper of Integrative Genomics Viewer (IGV - an interactive tool for visualization and exploration integrated genomic data)
This package is a wrapper of Integrative Genomics Viewer (IGV). It comprises an htmlwidget version of IGV. It can be used as a module in Shiny apps.
Last updated
softwareshinyappssequencingcoverage
7.42 score 38 stars 1 dependents 166 scripts 350 downloads
pipeComp - pipeComp pipeline benchmarking framework
A simple framework to facilitate the comparison of pipelines involving various steps and parameters. The `pipelineDefinition` class represents pipelines as, minimally, a set of functions consecutively executed on the output of the previous one, and optionally accompanied by step-wise evaluation and aggregation functions. Given such an object, a set of alternative parameters/methods, and benchmark datasets, the `runPipeline` function then proceeds through all combinations of arguments, avoiding recomputing the same step twice and compiling evaluations on the fly to avoid storing potentially large intermediate data.
Last updated
geneexpressiontranscriptomicsclusteringdatarepresentationbenchmarkbioconductorpipeline-benchmarkingpipelinessingle-cell-rna-seq
7.36 score 43 stars 44 scripts 346 downloads
ncdfFlow - ncdfFlow: A package that provides HDF5 based storage for flow cytometry data.
Provides HDF5 storage based methods and functions for manipulation of flow cytometry data.
Last updated
immunooncologyflowcytometrycurlopensslcpp
7.28 score 14 dependents 96 scripts 2.4k downloads
plyxp - Data masks for SummarizedExperiment enabling dplyr-like manipulation
The package provides `rlang` data masks for the SummarizedExperiment class. This enables the evaluation of unquoted expressions in different contexts of the SummarizedExperiment object, with optional access to other contexts. The goal for `plyxp` is for evaluation to feel like working with a data.frame object without ever needing to unwind to a rectangular data.frame.
Last updated
annotationgenomeannotationtranscriptomics
7.27 score 9 stars 2 dependents 23 scripts 298 downloads
MicrobiomeProfiler - An R/shiny package for microbiome functional enrichment analysis
This is an R/shiny package to perform functional enrichment analysis for microbiome data. The package is based on clusterProfiler. MicrobiomeProfiler supports KEGG enrichment analysis, COG enrichment analysis, Microbe-Disease association enrichment analysis, and Metabo-Pathway analysis.
Last updated
microbiomesoftwarevisualizationkegg
7.21 score 42 stars 31 scripts 350 downloads
drugfindR - Investigate iLINCS for candidate repurposable drugs
This package provides a convenient way to access the LINCS Signatures available in the iLINCS database. These signatures include Consensus Gene Knockdown Signatures, Gene Overexpression signatures and Chemical Perturbagen Signatures. It also provides a way to enter your own transcriptomic signatures and identify concordant and discordant signatures in the LINCS database.
Last updated
lincsilincsdrug repurposingdrug discoverytranscriptomicsgene expressiongene knockdowngene overexpressionchemical perturbagendrugfindrbioinformaticsbioinformatics-pipeline
7.18 score 10 stars 151 scripts 179 downloads
doppelgangR - Identify likely duplicate samples from genomic or meta-data
The main function is doppelgangR(), which takes as minimal input a list of ExpressionSet objects, and searches all list pairs for duplicated samples. The search is based on the genomic data (exprs(eset)), phenotype/clinical data (pData(eset)), and "smoking guns" - supposedly unique identifiers found in pData(eset).
Last updated
immunooncologyrnaseqmicroarraygeneexpressionqualitycontrolbioconductor-package
7.17 score 5 stars 33 scripts 400 downloads
SpaceMarkers - Spatial Interaction Markers
Spatial transcriptomic technologies have helped to resolve the connection between gene expression and the 2D orientation of tissues relative to each other. However, the limited single-cell resolution makes it difficult to highlight the most important molecular interactions in these tissues. SpaceMarkers, R/Bioconductor software, can help to find molecular interactions, by identifying genes associated with latent space interactions in spatial transcriptomics.
Last updated
singlecellgeneexpressionsoftwarespatialtranscriptomics
7.14 score 8 stars 38 scripts 245 downloads
bugsigdbr - R-side access to published microbial signatures from BugSigDB
The bugsigdbr package implements convenient access to bugsigdb.org from within R/Bioconductor. The goal of the package is to facilitate import of BugSigDB data into R/Bioconductor, provide utilities for extracting microbe signatures, and enable export of the extracted signatures to plain text files in standard file formats such as GMT.
Last updated
dataimportgenesetenrichmentmetagenomicsmicrobiomebioconductor-packager01ca230551
6.98 score 6 stars 67 scripts 854 downloads
GSEABenchmarkeR - Reproducible GSEA Benchmarking
The GSEABenchmarkeR package implements an extendable framework for reproducible evaluation of set- and network-based methods for enrichment analysis of gene expression data. This includes support for the efficient execution of these methods on comprehensive real data compendia (microarray and RNA-seq) using parallel computation on standard workstations and institutional computer grids. Methods can then be assessed with respect to runtime, statistical significance, and relevance of the results for the phenotypes investigated.
Last updated
immunooncologymicroarrayrnaseqgeneexpressiondifferentialexpressionpathwaysgraphandnetworknetworkgenesetenrichmentnetworkenrichmentvisualizationreportwritingbioconductor-packageu24ca289073
6.95 score 14 stars 30 scripts 373 downloads
fastreeR - Phylogenetic, Distance and Other Calculations on VCF and Fasta Files
Calculate distances, build phylogenetic trees or perform hierarchical clustering between the samples of a VCF or FASTA file. Functions are implemented in Java-11 and called via rJava. Parallel implementation that operates directly on the VCF or FASTA file for fast execution.
Last updated
phylogeneticsmetagenomicsclusteringopenjdk
6.94 score 31 stars 28 scripts 353 downloads
plaid - PLAID ultrafast gene set enrichment scoring
PLAID (Pathway Level Average Intensity Detection) is an ultra-fast method to compute single-sample enrichment scores for gene expression or proteomics data. For each sample, plaid computes the gene set score as the average intensity of the genes/proteins in the gene set. The output is a gene set score matrix suitable for further analyses.
Last updated
genesetenrichmentgeneexpressionproteomicsbioinformaticsenrichment-analysisomics-datarna-seq-analysis
6.94 score 24 stars 3 scripts 162 downloads
Aerith - visualization and annotation of isotopic enrichment patterns of peptides and metabolites with stable isotope labeling from proteomics and metabolomics
Visualisation of peptide isotopic peaks and SIP peptide spectra match (PSM). Filtration of high quality PSM. Accurate isotopic abundance calculation of peptide and metabolites. Visualisation of SIP proteomics results.
Last updated
proteomicsmetabolomicsmassspectrometrysoftwarevisualizationqualitycontrolannotationlc-msmass-spectrometrystable-isotope-mass-spectrometrystable-isotope-tracingstable-isotopescppopenmp
6.92 score 5 stars 32 scripts 169 downloads
extraChIPs - Additional functions for working with ChIP-Seq data
This package builds on existing tools and adds some simple but extremely useful capabilities for working with ChIP-Seq data. The focus is on detecting differential binding windows/regions. One set of functions focusses on set-operations retaining mcols for GRanges objects, whilst another group of functions aids visualisation of results. Coercion to tibble objects is also implemented.
Last updated
chipseqhicsequencingcoverage
6.88 score 7 stars 36 scripts 402 downloads
lemur - Latent Embedding Multivariate Regression
Fit a latent embedding multivariate regression (LEMUR) model to multi-condition single-cell data. The model provides a parametric description of single-cell data measured with treatment vs. control or more complex experimental designs. The parametric model is used to (1) align conditions, (2) predict log fold changes between conditions for all cells, and (3) identify cell neighborhoods with consistent log fold changes. For those neighborhoods, a pseudobulked differential expression test is conducted to assess which genes are significantly changed.
Last updated
transcriptomicsdifferentialexpressionsinglecelldimensionreductionregressionquartoopenblascpp
6.87 score 101 stars 92 scripts 355 downloads
MutSeqR - Analysis of Error-Corrected Sequencing Data for Mutation Detection
Standard methods for analysis of mutation data following error-corrected sequencing (ECS) for the purpose of mutagenicity assessment. Functions include importing the mutation lists provided by a variant caller, and a set of analytical tools for statistical testing and visualization of mutation data; comparison to COSMIC and/or germline signatures; etc.
Last updated
sequencingsomaticmutationvisualizationgenomicvariationdrivermutationstatisticalmethodgenetarget
6.84 score 10 stars 8 scripts 190 downloads
VisiumIO - Import Visium data from the 10X Space Ranger pipeline
The package allows users to readily import spatial data obtained from either the 10X website or from the Space Ranger pipeline. Supported formats include tar.gz, h5, and mtx files. Multiple files can be imported at once with *List type of functions. The package represents data mainly as SpatialExperiment objects.
Last updated
softwareinfrastructuredataimportsinglecellspatialbioconductor-packagegenomicsu24ca289073
6.84 score 3 stars 1 dependents 95 scripts 400 downloads
MSstatsBioNet - Network Analysis for MS-based Proteomics Experiments
A set of tools for network analysis using mass spectrometry-based proteomics data and network databases. The package takes as input the output of MSstats differential abundance analysis and provides functions to perform enrichment analysis and visualization in the context of prior knowledge from past literature. Notably, this package integrates with INDRA, which is a database of biological networks extracted from the literature using text mining techniques.
Last updated
immunooncologymassspectrometryproteomicssoftwarequalitycontrolnetworkenrichmentnetwork
6.77 score 2 stars 1 dependents 13 scripts 252 downloads
epiregulon - Gene regulatory network inference from single cell epigenomic data
Gene regulatory networks model the underlying gene regulation hierarchies that drive gene expression and observed phenotypes. Epiregulon infers TF activity in single cells by constructing a gene regulatory network (regulons). This is achieved through integration of scATAC-seq and scRNA-seq data and incorporation of public bulk TF ChIP-seq data. Links between regulatory elements and their target genes are established by computing correlations between chromatin accessibility and gene expressions.
Last updated
singlecellgeneregulationnetworkinferencenetworkgeneexpressiontranscriptiongenetargetcpp
6.70 score 28 stars 30 scripts 306 downloads
ZarrArray - Bring Zarr datasets in R as DelayedArray objects
The ZarrArray package leverages the Rarr package to bring Zarr datasets in R as DelayedArray objects. The main class in the package is the ZarrArray class. A ZarrArray object is an array-like object that represents a Zarr dataset in R. ZarrArray objects are DelayedArray derivatives and therefore support all operations (delayed or block-processed) supported by DelayedArray objects.
Last updated
infrastructuredatarepresentationdataimportbioconductor-packagecore-packageu24ca289073
6.62 score 5 stars 4 dependents 3 scripts 628 downloads
RnBeads - RnBeads
RnBeads facilitates comprehensive analysis of various types of DNA methylation data at the genome scale.
Last updated
dnamethylationmethylationarraymethylseqepigeneticsqualitycontrolpreprocessingbatcheffectdifferentialmethylationsequencingcpgislandimmunooncologytwochanneldataimport
6.61 score 1 dependents 225 scripts 872 downloads
phylobar - Interactive construction of stacked barplots using hierarchies
The phylobar package supports interactive visualization of microbiome data by allowing a stacked barplot to be constructed using a guiding taxonomic or phylogenetic hierarchy. The package provides a strategy for collapsing and expanding the hierarchy to different levels of resolution and then for interactively "painting" the stacked barplot by placing the mouse over different subtrees. This makes it possible to interactively test different color palettes at different resolution and identify taxonomic groups with interesting variation before settling on a final stacked barplot. One advantage of the approach is that multiple levels of taxonomic resolution can be compared at once within the same view.
Last updated
softwaremicrobiomephylogeneticsvisualization
6.56 score 3 stars 28 scripts 63 downloads
MetaboAnnotatoR - Automated Annotation of All-Ion Fragmentation LC-MS Metabolomic Features
Performs feature annotations on LC-MS All-ion fragmentation datasets using fragment ion libraries.
Last updated
massspectrometrymetabolomics
6.55 score 14 stars 4 scripts 162 downloads
proBatch - Tools for Diagnostics and Corrections of Batch Effects in Proteomics
These tools facilitate batch effects analysis and correction in high-throughput experiments. The package was developed primarily for mass-spectrometry proteomics (DIA/SWATH), but could also be applicable to most omic data with minor adaptations. It contains functions for diagnostics (proteome/genome-wide and feature-level), correction (normalization and batch effects correction) and quality control. Non-linear fitting based approaches are also included to deal with complex, mass spectrometry-specific signal drifts.
Last updated
batcheffectnormalizationpreprocessingsoftwaremassspectrometryproteomicsqualitycontrolvisualization
6.55 score 66 scripts 88 downloads
CellMentor - Supervised Non-negative Matrix Factorization for Dimensional Reduction in Single-Cell Analysis
Implements supervised cell type-aware non-negative matrix factorization (NMF) for dimensional reduction in single-cell RNA sequencing analysis. The package provides methods for incorporating cell type information into the dimensionality reduction process, enabling improved visualization and downstream analysis of single-cell data while preserving biological structure. CellMentor employs a unique loss function that simultaneously minimizes variation within known cell populations while maximizing distinctions between different cell types, enabling effective transfer of learned patterns from labeled reference datasets to new unlabeled data.
Last updated
softwaresinglecelltranscriptomicsdimensionreduction
6.45 score 19 stars 37 scripts 176 downloads
bettr - A Better Way To Explore What Is Best
bettr provides a set of interactive visualization methods to explore the results of a benchmarking study, where typically more than one performance measure is computed. The user can weight the performance measures according to their preferences. Performance measures can also be grouped and aggregated according to additional annotations.
Last updated
visualizationshinyappsgui
6.26 score 5 stars 20 scripts 238 downloads
tkWidgets - R based tk widgets
Widgets to provide user interfaces. tcltk should have been installed for the widgets to run.
Last updated
infrastructure
6.24 score 5 dependents 82 scripts 2.4k downloads
CoSIA - An Investigation Across Different Species and Tissues
Cross-Species Investigation and Analysis (CoSIA) is a package that provides researchers with an alternative methodology for comparing across species and tissues using normal wild-type RNA-Seq Gene Expression data from Bgee. Using RNA-Seq Gene Expression data, CoSIA provides multiple visualization tools to explore the transcriptome diversity and variation across genes, tissues, and species. CoSIA uses the Coefficient of Variation and Shannon Entropy and Specificity to calculate transcriptome diversity and variation. CoSIA also provides additional conversion tools and utilities to provide a streamlined methodology for cross-species comparison.
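The two statistics named in the CoSIA description can be illustrated with a short Python sketch. This is not CoSIA's code: the function names, the use of the sample standard deviation, and the base-2 logarithm are assumptions made for the example, and the package may define its metrics differently.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return statistics.stdev(values) / statistics.mean(values)


def shannon_entropy(values):
    """Entropy in bits of one gene's expression distribution across tissues."""
    total = sum(values)
    probs = [v / total for v in values if v > 0]  # zero entries contribute nothing
    return sum(-p * math.log2(p) for p in probs)


# Hypothetical expression values for one gene in four tissues.
uniform = [10.0, 10.0, 10.0, 10.0]   # expressed equally everywhere
specific = [40.0, 0.0, 0.0, 0.0]     # expressed in a single tissue

print(coefficient_of_variation(uniform), shannon_entropy(uniform))
print(coefficient_of_variation(specific), shannon_entropy(specific))
```

In this toy input, the evenly expressed gene has no variation and maximal entropy, while the tissue-specific gene shows the opposite pattern.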
Last updated
softwarebiologicalquestiongeneexpressionmultiplecomparisonthirdpartyclientdataimportgui
6.18 score 12 stars 6 scripts 313 downloads
iSEEtree - Interactive visualisation for microbiome data
iSEEtree is an extension of iSEE for the TreeSummarizedExperiment data container. It provides interactive panel designs to explore hierarchical datasets, such as the microbiome and cell lines.
Last updated
softwarevisualizationmicrobiomeguishinyappsdataimportshiny-appsvisualisation
6.16 score 3 stars 1 dependents 5 scripts 300 downloads
scToppR - API Wrapper for ToppGene
scToppR provides an easy-to-use API wrapper for the ToppGene web platform, used for gene ontology and functional enrichment research. The package also integrates visualization tools, making it a convenient tool directly connecting ToppGene to code-based workflows in R. The tool can also easily save results into different formats.
Last updated
pathwayssinglecell
6.15 score 7 stars 17 scripts 260 downloads
rexposome - Exposome exploration and outcome data analysis
Package for exploring the exposome and performing association analyses between exposures and health outcomes.
Last updated
softwarebiologicalquestioninfrastructuredataimportdatarepresentationbiomedicalinformaticsexperimentaldesignmultiplecomparisonclassificationclustering
6.14 score 1 dependents 31 scripts 452 downloads
scTypeEval - Evaluation of cell type classifications in single-cell transcriptomics
scTypeEval provides tools to evaluate and validate cell type classifications in single-cell transcriptomics when ground truth labels are limited or unavailable. Results are organized in an S4 object that integrates processed data, dimensional reductions, dissimilarity assays, and consistency metrics computed across samples. The workflow includes preprocessing and feature selection, principal component analysis, computation of dissimilarity matrices, internal validation metrics (for example, silhouette-based summaries), and visualization utilities to inspect heatmaps and PCA plots. Functions support common single-cell containers and enable comparison of clustering and labeling strategies across datasets.
Last updated
singlecelltranscriptomicsgeneexpressioncellbasedassaysdimensionreductionpreprocessingprincipalcomponent
6.13 score 2 stars 4 scripts 156 downloads
normr - Normalization and difference calling in ChIP-seq data
Robust normalization and difference calling procedures for ChIP-seq and similar data. Read counts are modeled jointly as a binomial mixture model with a user-specified number of components. A fitted background estimate accounts for the effect of enrichment in certain regions and, therefore, represents an appropriate null hypothesis. This robust background is used to identify significantly enriched or depleted regions.
Last updated
bayesiandifferentialpeakcallingclassificationdataimportchipseqripseqfunctionalgenomicsgeneticsmultiplecomparisonnormalizationpeakdetectionpreprocessingalignmentcppopenmp
6.12 score 11 stars 398 downloads
imageFeatureTCGA - Import features from hovernet, provgigapath into a MultiAssayExperiment
The package imports data from the HoverNet and ProvGigaPath pipelines. Pipeline output data are hosted in a self-owned online repository. Package functionality conveniently incorporates pipeline data into existing MultiAssayExperiment instances from curatedTCGAData.
Last updated
softwareinfrastructuredataimportdatarepresentation
6.11 score 1 stars 1 dependents 13 scripts 223 downloads
scECODA - Single-Cell Exploratory Compositional Data Analysis
The scECODA R package provides a complete workflow for the analysis and visualization of compositional data, primarily focusing on cell type proportions derived from single-cell data. It implements specialized methods, such as the Centered Log-Ratio (CLR) transformation, to properly analyze proportional data while avoiding the biases introduced by the compositional constraint. The package encapsulates data management, transformation, and analysis into a single SummarizedExperiment object, offering downstream tools for dimensionality reduction via PCA, calculating critical metrics like the Adjusted Rand Index (ARI) and Modularity to quantify sample grouping quality, and generating high-quality visualizations like heatmaps and scatter plots.
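As a rough illustration of the Centered Log-Ratio (CLR) transformation mentioned above, the Python sketch below transforms the cell type counts of a single sample. It is not scECODA's implementation: the function name `clr` and the pseudocount of 0.5 used to avoid taking the logarithm of zero are assumptions made for the example.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's cell type counts.

    The pseudocount is an assumption of this sketch; it keeps log() defined
    when a cell type is absent from the sample.
    """
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    logs = [math.log(v / total) for v in shifted]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean proportion
    return [lv - mean_log for lv in logs]


# Hypothetical counts of four cell types in one sample.
sample = [120, 30, 0, 50]
values = clr(sample)
print([round(v, 3) for v in values])
```

CLR values sum to zero within a sample, which removes the sum-to-one constraint on proportions before methods such as PCA are applied.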
Last updated
softwaresinglecelltranscriptomicscellbasedassaysnormalizationpreprocessingvisualizationclusteringdimensionreductionfeatureextractionprincipalcomponent
6.08 score 8 stars 6 scripts 139 downloads
VISTA - Visualization and Integrated System for Transcriptomic Analysis
The VISTA (Visualization and Integrated System for Transcriptomic Analysis) platform streamlines differential expression workflows by wrapping DESeq2 and edgeR into a SummarizedExperiment-based container with consistent metadata. The package includes visualization utilities, MSigDB enrichment helpers, and optional deconvolution support to simplify interactive exploration of RNA-seq experiments.
Last updated
rnaseqdifferentialexpressiongeneexpressiontranscriptomicsvisualization
6.05 score 7 stars 32 scripts 226 downloads
SynExtend - Tools for Comparative Genomics
A multitude of tools for comparative genomics, focused on large-scale analyses of biological data. SynExtend includes tools for working with syntenic data, clustering massive network structures, and estimating functional relationships among genes.
Last updated
geneticsclusteringcomparativegenomicsdataimportfortranopenmp
6.02 score 1 stars 87 scripts 410 downloads
ChIPQC - Quality metrics for ChIPseq data
Quality metrics for ChIPseq data.
Last updated
sequencingchipseqqualitycontrolreportwriting
6.01 score 205 scripts 828 downloads
metabom8 - A High-Performance R Package for Metabolomics Modeling and Analysis
Tools for 1D NMR metabolomics workflows, including import and preprocessing of Bruker experiments, multivariate modeling (PCA, PLS, OPLS) and model analytics and validation (y-permutations, cv-anova). Performance-critical routines are implemented in C++ and use the Armadillo and Eigen linear algebra libraries to improve runtime.
Last updated
metabolomicscheminformaticspreprocessingdataimportalignmentworkflowsteparmadilloeigenrcppcpp
6.00 score 3 stars 17 scripts 133 downloads
BrowserViz - BrowserViz: interactive R/browser graphics using websockets and JSON
Interactive graphics in a web browser from R, using websockets and JSON.
Last updated
visualizationthirdpartyclient
5.98 score 2 stars 2 dependents 20 scripts 448 downloads
lncRna - A Comprehensive Workflow for Long Non-coding RNA Identification and Functional Analysis
Provides a complete workflow for the identification, analysis, and functional annotation of long non-coding RNAs (lncRNAs) from RNA-Seq data. The package includes functions for filtering transcripts from GTF files, evaluating the performance of multiple coding potential prediction tools (e.g., CPC2, PLEK, CPAT), and summarizing their agreement. It enables systematic performance analysis of individual tools, "at least N" tool consensus, and all possible tool combinations. Functional analysis is supported through the identification of potential cis- and trans-acting interactions with protein-coding genes, followed by enrichment analysis. Results can be visualized using a variety of plots, including radar plots, clock plots, and interactive Sankey diagrams.
Last updated
softwaregeneexpressionrnaseqtranscriptionvisualizationqualitycontrolfunctionalgenomicsclassificationfunctionalprediction
5.98 score 8 stars 10 scripts 72 downloads
ROTS - Reproducibility-Optimized Test Statistic
Calculates the Reproducibility-Optimized Test Statistic (ROTS) for differential testing in omics data.
Last updated
softwaregeneexpressiondifferentialexpressionmicroarrayrnaseqproteomicsimmunooncologycpp
5.91 score 3 dependents 684 downloads
safe - Significance Analysis of Function and Expression
SAFE is a resampling-based method for testing functional categories in gene expression experiments. SAFE can be applied to 2-sample and multi-class comparisons, or simple linear regressions. Other experimental designs can also be accommodated through user-defined functions.
Last updated
differentialexpressionpathwaysgenesetenrichmentstatisticalmethodsoftware
5.87 score 5 dependents 62 scripts 890 downloads
CNVRanger - Summarization and expression/phenotype association of CNV ranges
The CNVRanger package implements a comprehensive tool suite for CNV analysis. This includes functionality for summarizing individual CNV calls across a population, assessing overlap with functional genomic regions, and association analysis with gene expression and quantitative phenotypes.
Last updated
copynumbervariationdifferentialexpressiongeneexpressiongenomewideassociationgenomicvariationmicroarrayrnaseqsnpbioconductor-packageu24ca289073
5.86 score 8 stars 13 scripts 401 downloads
epiRomics - Epigenomic Analysis Package Built for R (epiRomics)
Integrates various levels of epigenomic information, including ChIP-seq, histone modification, ATAC-seq, and RNA-seq data. Regulatory network analysis uses combinatory approaches to infer regions of significance, such as enhancers. Downstream analysis identifies co-occurrence of epigenomic data at regions of interest. Visualization functions display multi-track genomic views with signal overlays. Please contact <[email protected]> for suggestions, feedback, or bug reporting.
Last updated
epigeneticschipseqatacseqrnaseqvisualizationsequencingsoftwarehistonemodificationgeneregulationtranscriptionfunctionalgenomicsatac-seqchip-seqchromatin-accessibiitydata-visualizationenhancerenhancer-predictionepigenomicshistonemulti-omicsregulatory-networkregulome-analysisrna-seqtranscription-factor-bindingtranscription-factorsucsc-browser
5.86 score 5 stars 41 scripts 154 downloads
DaparToolshed - Tools for the Differential Analysis of Proteins Abundance with R
The package DaparToolshed is a Bioconductor-distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. It is an update of our previous package DAPAR; it contains more functions to analyze the data and uses MultiAssayExperiment and SummarizedExperiment data structures. Unlike most similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).
Last updated
proteomicsnormalizationpreprocessingmassspectrometryqualitycontroldataimportprostar2
5.84 score 1 stars 33 scripts
CSOA - Calculate per-cell gene signature scores in scRNA-seq data using cell set overlaps
Cell Set Overlap Analysis (CSOA) is a tool for calculating per-cell gene signature scores in an scRNA-seq dataset. CSOA constructs a set for each gene in the signature, consisting of the cells that highly express the gene. Next, all overlaps of pairs of cell sets are computed, ranked, filtered and scored. The CSOA per-cell score is calculated by summing up all products of the overlap scores and the min-max-normalized expression of the two involved genes. CSOA can run on a Seurat object, a SingleCellExperiment object, a matrix and a dgCMatrix.
Last updated
softwaresinglecellgenesetenrichmentgeneexpression
5.83 score 1 stars 1 dependents 5 scripts 192 downloads
Rigraphlib - igraph library as an R package
Vendors the igraph C source code and builds it into a static library. Other Bioconductor packages can link to libigraph.a in their own C/C++ code. This is intended for packages wrapping C/C++ libraries that depend on the igraph C library and cannot be easily adapted to use the igraph R package.
Last updated
clusteringgraphandnetwork
5.82 score 1 stars 7 dependents 1.3k downloads
GCPtools - Tools for working with gcloud and gsutil
Lower-level functionality to interface with Google Cloud Platform tools. 'gcloud' and 'gsutil' are both supported. The functionality provided centers around utilities for the AnVIL platform.
Last updated
softwareinfrastructurethirdpartyclientdataimportbioconductor-packageu24hg010263
5.82 score 15 dependents 21 scripts 399 downloads
AnVILGCP - The GCP R Client for the AnVIL
The package provides a set of functions to interact with the Google Cloud Platform (GCP) services on the AnVIL platform. The package is designed to use the API calls from the AnVIL package. It coordinates AnVIL workspace functionality with native GCP tools.
Last updated
softwareinfrastructurethirdpartyclientdataimportu24hg010263
5.80 score 5 dependents 38 scripts 286 downloads
mastR - Markers Automated Screening Tool in R
mastR is an R package designed for automated screening of signatures of interest for specific research questions. The package is developed for generating refined lists of signature genes from multiple group comparisons based on the results from edgeR and limma differential expression (DE) analysis workflow. It also takes into account the background noise of tissue-specificity, which is often ignored by other marker generation tools. This package is particularly useful for the identification of group markers in various biological and medical applications, including cancer research and developmental biology.
Last updated
softwaregeneexpressiontranscriptomicsdifferentialexpressionvisualization
5.78 score 5 stars 5 scripts 346 downloads
augere.core - Core Utilities for Automatically Generated Reports
Provides utility functions for automatic generation of reports in the augere framework. This mostly involves parsing and dynamically editing Rmarkdown files, controlling inputs into the augere pipelines. Each pipeline generates and executes a self-contained Rmarkdown file for a common bioinformatics analysis; see downstream packages like augere.de and augere.screen for examples of some analysis pipelines.
Last updated
workflowmanagementreportwriting
5.78 score 4 dependents 32 downloads
tidyexposomics - Integrated Exposure-Omics Analysis Powered by Tidy Principles
The tidyexposomics package is designed to facilitate the integration of exposure and omics data to identify exposure-omics associations. We structure our commands to fit into the tidyverse framework, where commands are designed to be simplified and intuitive. Here we provide functionality to perform quality control, sample and exposure association analysis, differential abundance analysis, multi-omics integration, and functional enrichment analysis.
Last updated
softwaretranscriptomicsgeneexpressionepigeneticsproteomicsdifferentialexpressiondifferentialmethylationqualitycontrolgraphandnetworkmultiplecomparisonregressionstatisticalmethodvisualizationworkflowstep
5.72 score 1 stars 12 scripts 181 downloads
scConform - Conformal Inference for Cell Type Annotation
Builds prediction intervals for cell type annotation using conformal inference and conformal risk control. It provides two main methods. The first gives prediction intervals with coverage guarantees based on standard conformal inference. The second gives hierarchical prediction intervals that are consistent with the cell ontology.
Last updated
softwareclassificationsinglecellannotationu24ca289073
5.69 score 7 stars 6 scripts 242 downloads
scLang - A unified language for interacting with Seurat and SingleCellExperiment
scLang is a suite for developing scRNA-seq analysis packages. It offers functions that can operate on both Seurat and SingleCellExperiment objects. These functions are primarily intended to help developers build tools compatible with both types of input.
Last updated
softwaresinglecellgeneexpressionvisualization
5.68 score 2 stars 2 dependents 4 scripts 159 downloads
SpNeigh - Spatial Neighborhood Modeling and Differential Expression Analysis for Transcriptomics
SpNeigh provides methods for neighborhood-aware analysis of spatial transcriptomics data. It supports boundary detection, spatial weighting (centroid- and boundary-based), spatially informed differential expression using spline-based models, and spatial enrichment analysis via the Spatial Enrichment Index (SEI). Designed for compatibility with Seurat objects, SpatialExperiment objects and spatial data frames, SpNeigh enables interpretable, publication-ready analysis of spatial gene expression patterns.
Last updated
spatialsinglecellgeneexpressiondifferentialexpressiontranscriptomicssoftware
5.68 score 3 stars 274 downloads
sketchR - An R interface for python subsampling/sketching algorithms
Provides an R interface for various subsampling algorithms implemented in Python packages. Currently, interfaces to the geosketch and scSampler Python packages are implemented. It also provides diagnostic plots to evaluate the subsampling.
Last updated
singlecell
5.66 score 3 stars 11 scripts 240 downloads
fRagmentomics - Extract Fragmentomics Features and Mutational Status
A user-friendly R package that enables the characterization of each cfDNA fragment overlapping one or multiple mutations of interest, starting from a sequencing file containing aligned reads (BAM file). fRagmentomics supports multiple mutation input formats (e.g., VCF, TSV, or string "chr:pos:ref:alt" representation), accommodates one-based and zero-based genomic conventions, handles mutation representation ambiguities, and accepts any reference file and species in FASTA format. For each cfDNA fragment, fRagmentomics outputs its size, its 3' and 5' sequences, and its mutational status. Optionally, when users set apply_bcftools_norm = TRUE, fRagmentomics invokes the external command-line tool bcftools norm to left-align and normalize variants. If bcftools is not found on the system PATH while this option is enabled, the function errors. The package does not install external software; see the INSTALL file for per-OS instructions.
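To make the "chr:pos:ref:alt" string representation and the one-based versus zero-based conventions concrete, here is a minimal Python sketch of a parser. It is not fRagmentomics code: the `Mutation` type, the `parse_mutation` function and its `one_based` parameter are hypothetical names introduced for this example, and the example coordinates are arbitrary.

```python
from typing import NamedTuple


class Mutation(NamedTuple):
    chrom: str
    pos: int  # always stored one-based in this sketch
    ref: str
    alt: str


def parse_mutation(text, one_based=True):
    """Parse a 'chr:pos:ref:alt' string into a Mutation (hypothetical helper)."""
    parts = text.strip().split(":")
    if len(parts) != 4:
        raise ValueError(f"expected 'chr:pos:ref:alt', got {text!r}")
    chrom, pos_text, ref, alt = parts
    pos = int(pos_text)
    if not one_based:
        pos += 1  # convert a zero-based coordinate to one-based
    if pos < 1:
        raise ValueError(f"position out of range in {text!r}")
    return Mutation(chrom, pos, ref.upper(), alt.upper())


# The same substitution written in both conventions.
a = parse_mutation("chr7:140753336:A:T")
b = parse_mutation("chr7:140753335:a:t", one_based=False)
print(a == b)
```

Normalizing every input to one convention at parse time keeps later comparisons against aligned reads free of off-by-one errors.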
Last updated
softwaregeneticsvariantdetectionindeldetectionsequencingdnaseqalignmentmultiplesequencealignment
5.66 score 7 stars 8 scripts 228 downloads
DenoIST - DenoIST: Denoising Image-based Spatial Transcriptomics data
DenoIST identifies and removes contamination in Image-based Spatial Transcriptomics data, using a transposed Poisson mixture model with local neighbourhood offsets to infer genes that are likely to be due to neighbourhood contamination rather than endogenous expression.
Last updated
softwarepreprocessingspatialgeneexpressionsinglecelltranscriptomics
5.65 score 9 stars 7 scripts 188 downloads
gDR - Umbrella package for R packages in the gDR suite
This package is part of the gDR suite. It reexports functions from other packages in the gDR suite that contain critical processing functions and utilities. The vignette walks through the full processing pipeline for drug response analyses that the gDR suite offers.
Last updated
softwaredataimportshinyapps
5.60 score 2 stars 11 scripts 251 downloads
MsBackendMetaboLights - Retrieve Mass Spectrometry Data from MetaboLights
MetaboLights is one of the main public repositories for storage of metabolomics experiments, which includes analysis results as well as raw data. The MsBackendMetaboLights package provides functionality to retrieve and represent mass spectrometry (MS) data from MetaboLights. Data files are downloaded and cached locally, avoiding repetitive downloads. MS data from metabolomics experiments can thus be directly and seamlessly integrated into R-based analysis workflows with the Spectra and MsBackendMetaboLights packages.
Last updated
infrastructuremassspectrometrymetabolomicsdataimportproteomicsmass-spectrometrymetabolomics-data
5.57 score 2 stars 31 scripts 298 downloads
splicelogic - splicelogic: differential transcripts to splice events
Translate differential transcript usage results into discrete splice events.
Last updated
alternativesplicingdifferentialsplicingtranscriptomicsrnaseqlongreadannotationfunctionalgenomicsdtusplicing
5.52 score 1 stars 2 scripts 116 downloads
SpatialArtifacts - Identification and Classification of Spatial Artifacts in Visium and Visium HD Data
SpatialArtifacts provides a data-driven two-step workflow to identify, classify, and handle spatial artifacts in spatial transcriptomics data. The package combines median absolute deviation (MAD)-based outlier detection with morphological image processing (fill, outline, and star patterns) to detect edge and interior artifacts. It supports multiple platforms including 10x Genomics Visium (standard and HD), allowing for consistent quality control across different spatial resolutions.
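The first step named above, median absolute deviation (MAD)-based outlier detection, can be sketched in a few lines of Python. This is a generic illustration rather than the SpatialArtifacts implementation: the function name, the cutoff of 3 MADs, the two-sided test and the 1.4826 scaling constant (which makes the MAD comparable to a standard deviation for normal data) are assumptions of the example.

```python
import statistics


def mad_outliers(values, n_mads=3.0):
    """Flag values lying more than n_mads scaled MADs from the median."""
    med = statistics.median(values)
    mad = 1.4826 * statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)  # no spread, so nothing can be flagged
    return [abs(v - med) > n_mads * mad for v in values]


# Hypothetical per-spot counts; the sixth spot has unusually low signal.
counts = [510, 495, 502, 488, 520, 35, 505]
flags = mad_outliers(counts)
print(flags)
```

Because both the center and the spread are medians, a single extreme spot does not inflate the threshold the way it would with a mean and standard deviation.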
Last updated
softwarespatialtranscriptomicsqualitycontroldataimportworkflowstepclassification
5.51 score 4 stars 9 scripts 132 downloads
immLynx - Linking Advanced TCR Python Pipelines and Hugging Face Models in R
A comprehensive toolkit that bridges popular Python-based immune repertoire analysis tools and Hugging Face protein language models into the R environment. Provides unified interfaces for TCR distance calculations (tcrdist3), sequence generation probability (OLGA), selection inference (soNNia), clustering (clusTCR), protein embeddings (ESM-2), and metaclone discovery (metaclonotypist). Fully compatible with the scRepertoire and immApex ecosystem for single-cell immune repertoire analysis.
Last updated
softwareimmunooncologysinglecellclassificationannotationsequencingmotifannotationclusteringdimensionreduction
5.51 score 2 stars 6 scripts 195 downloads
cellmig - Uncertainty-aware quantitative analysis of high-throughput live cell migration data
High-throughput cell imaging facilitates the analysis of cell migration across many wells treated under different biological conditions. These workflows generate considerable technical noise and biological variability, and therefore technical and biological replicates are necessary, leading to large, hierarchically structured datasets, i.e., cells are nested within technical replicates that are nested within biological replicates. Current statistical analyses of such data usually ignore the hierarchical structure of the data and fail to explicitly quantify uncertainty arising from technical or biological variability. To address this gap, we present cellmig, an R package implementing Bayesian hierarchical models for migration analysis. cellmig quantifies condition-specific velocity changes (e.g., drug effects) while modeling nested data structures and technical artifacts. It further enables synthetic data generation for experimental design optimization.
Last updated
singlecellcellbiologybayesianexperimentaldesignsoftwarebatcheffectregressionclusteringcpp
5.49 score 1 stars 18 scripts 206 downloads
tidyprint - Custom Print Methods for SummarizedExperiment
Provides customized print methods for 'SummarizedExperiment' objects to enhance readability and usability within a tidy workflow. It offers consistent, tidyverse-aligned console displays, including alternative tibble abstractions for large genomic data to improve discoverability and interpretation. The package also includes unified, contextual messaging utilities intended for the 'tidyomics' ecosystem.
Last updated
softwarevisualizationinfrastructure
5.43 score 2 stars 17 scripts 174 downloads
staRgate - Automated gating pipeline for flow cytometry analysis to characterize the lineage, differentiation, and functional states of T-cells
An R-based automated gating pipeline for flow cytometry data designed to mimic the manual gating strategy of defining flow biomarker positive populations relative to a unimodal background population to include cells with varying intensities of marker expression. The pipeline’s main feature is a flexible density-based gating strategy capable of capturing varying scenarios based on marker expression patterns to analyze a 29-marker flow panel that characterizes T-cell lineage, differentiation, and functional states.
Last updated
flowcytometrypreprocessingimmunooncology
5.43 score 3 stars 2 scripts 194 downloads
decemedip - hierarchical Bayesian modeling for cell type deconvolution of immunoprecipitation-based DNA methylome
The R package decemedip is a novel computational paradigm developed for inferring the relative abundances of cell types and tissues measured by methylated DNA immunoprecipitation sequencing (MeDIP-Seq). This paradigm allows the use of reference data from other technologies such as microarray or WGBS.
Last updated
softwareimmunooncologydnamethylationepigeneticssequencingwholegenomebayesianbayesian-data-analysisbayesian-inferencebioinformaticsdna-methylationdna-methylation-dataepigenomicsgenerative-modelomicsrstanstancpp
5.41 score 4 stars 26 scripts 193 downloads
carnation - Interactive Exploration & Management of RNA-Seq Analyses
Highly interactive & modular shiny app to explore three facets of RNA-Seq analysis: differential expression (DE), functional enrichment and pattern analysis. Several visualizations are implemented to provide a wide-ranging view of data sets. For DE analysis, we provide PCA plot, MA plot, Upset plot & heatmaps, in addition to a highly customizable gene plot. Seven different visualizations are available for functional enrichment analysis, and we also support gene pattern analysis. Genes of interest can be tracked across all modules using the gene scratchpad. In addition, carnation provides an integrated platform to manage multiple projects and user access that can be run on a central server to share with collaborators.
Last updated
guigeneexpressionsoftwareshinyappsgotranscriptiontranscriptomicsvisualizationdifferentialexpressionpathwaysgenesetenrichment
5.41 score 2 stars 17 scripts 72 downloads
hammers - Utilities for scRNA-seq data analysis
hammers is a utilities suite for scRNA-seq data analysis compatible with both Seurat and SingleCellExperiment. It provides simple tools to address tasks such as retrieving aggregate gene statistics, finding and removing rare genes, performing representation analysis, computing the center of mass for the expression of a gene of interest in low-dimensional space, and calculating silhouette and cluster-normalized silhouette.
Last updated
softwaresinglecellgeneexpressionmultiplecomparisonvisualization
5.41 score 1 stars 1 dependents 4 scripts 269 downloads
MDSvis - Plots of Multi Dimensional Scaling (MDS) results
This package implements visualization of Multi Dimensional Scaling (MDS) results.
Last updated
flowcytometryqualitycontroldimensionreductionmultidimensionalscalingsoftwarevisualizationbioconductormdsshinyvisualisation
5.38 score 5 scripts 52 downloads
HuBMAPR - Interface to 'HuBMAP'
'HuBMAP' provides an open, global bio-molecular atlas of the human body at the cellular level. The `datasets()`, `samples()`, `donors()`, `publications()`, and `collections()` functions retrieve the information for each of these entity types. `*_details()` functions are available for individual entries of each entity type. `*_derived()` functions are available for retrieving derived datasets or samples for individual entries of each entity type. Data files can be accessed using `bulk_data_transfer()`.
Last updated
softwaresinglecelldataimportthirdpartyclientspatialinfrastructurebioconductor-packageclienthubmaprstudio
5.35 score 3 stars 3 scripts 273 downloads
SpectraQL - MassQL support for Spectra
The Mass Spec Query Language (MassQL) is a domain-specific language that lets users express queries and retrieve mass spectrometry (MS) data in a way that is more natural and understandable for MS users. It is inspired by SQL and is by design programming language agnostic. The SpectraQL package adds support for the MassQL query language to R, in particular to MS data represented by Spectra objects. Users can thus apply MassQL expressions to analyze and retrieve specific data from Spectra objects.
Last updated
infrastructureproteomicsmassspectrometrymetabolomics
5.30 score 10 stars 7 scripts 240 downloads
augere.de - Automatic Generation of Differential Expression Analyses
Implements pipelines for generating differential expression analysis reports in the augere framework. This includes analyses with edgeR or voom-limma, with a variety of options for contrasts, blocking and covariates. Each pipeline function generates a self-contained Rmarkdown report with all of the steps required to reproduce the DE analysis.
Last updated
workflowmanagementreportwritingdifferentialexpressiontranscription
5.30 score 2 dependents 11 scripts 30 downloads
podkat - Position-Dependent Kernel Association Test
This package provides an association test that is capable of dealing with very rare and even private variants. This is accomplished by a kernel-based approach that takes the positions of the variants into account. The test can be used for pre-processed matrix data, but also directly for variant data stored in VCF files. Association testing can be performed whole-genome, whole-exome, or restricted to pre-defined regions of interest. The test is complemented by tools for analyzing and visualizing the results.
Last updated
geneticswholegenomeannotationvariantannotationsequencingdataimportcurlbzip2xz-utilszlibcpp
5.24 score 8 scripts 446 downloads
CompensAID - Automated detection tool for spillover errors
CompensAID is an automated quality control tool that determines, for each marker combination in the FCS file, whether reference errors are potentially present. Such reference errors, which appear as skewed populations, are detected using the Secondary Stain Index (SSI) score. Marker combinations with an SSI < 1 are flagged by CompensAID.
Last updated
flowcytometryqualitycontrolpreprocessing
5.24 score 5 stars 2 scripts 180 downloads
BioNAR - Biological Network Analysis in R
The R package BioNAR was developed for step-by-step analysis of PPI networks. The aim is to quantify and rank each protein's simultaneous impact on multiple complexes based on network topology and clustering. The package also enables estimation of the co-occurrence of diseases across the network and within specific clusters, pointing towards shared mechanisms.
Last updated
softwaregraphandnetworknetwork
5.22 score 3 stars 37 scripts 369 downloads
imageTCGAutils - Utility functions for working with histopathology images
Utility functions for working with CONCH data, listing remote files. One function assigns HoverNet nuclei to ProvGigaPath tiles with a scale factor to align coordinates. Provides internal utility functions for 'imageFeatureTCGA' and most functions are not meant for end users.
Last updated
softwareworkflowsteppreprocessing
5.22 score 2 scripts 211 downloads
spatialHeatmap - spatialHeatmap: Visualizing Spatial Assays in Anatomical Images and Large-Scale Data Extensions
The spatialHeatmap package offers the primary functionality for visualizing cell-, tissue- and organ-specific assay data in spatial anatomical images. Additionally, it provides extended functionalities for large-scale data mining routines and co-visualizing bulk and single-cell data. A description of the project is available here: https://spatialheatmap.org.
Last updated
spatialvisualizationmicroarraysequencinggeneexpressiondatarepresentationnetworkclusteringgraphandnetworkcellbasedassaysatacseqdnaseqtissuemicroarraysinglecellcellbiologygenetarget
5.16 score 6 stars 20 scripts 468 downloads
BatchSVG - Identify Batch-Biased Spatially Variable Genes
BatchSVG is a method to identify batch-biased spatially variable genes (SVGs) in spatial transcriptomics data. The batch variable can be defined as sample, donor sex, or other batch effects of interest. The BatchSVG method is based on the binomial deviance model (Townes et al, 2019).
Last updated
spatialtranscriptomicsbatcheffectqualitycontrolbatch-effectbioconductor-packagespatial-transcriptomics
5.16 score 3 stars 12 scripts 269 downloads
immGLIPH - Grouping of Lymphocyte Interactions by Paratope Hotspots
An R implementation of the GLIPH and GLIPH2 algorithms for clustering T cell receptors (TCRs) predicted to bind the same HLA-restricted peptide antigen. Identifies specificity groups based on local (motif-based) and global (sequence-based) CDR3 similarities. Integrates with the scRepertoire ecosystem via immApex for single-cell immune repertoire analysis. Users should cite the original GLIPH algorithm papers: Glanville et al. (2017) <doi:10.1038/nature22976> and Huang et al. (2020) <doi:10.1038/s41587-020-0505-4>.
Last updated
softwareimmunooncologyclusteringsinglecellsequencingvisualization
5.15 score 4 stars 7 scripts 50 downloads
RankMap - Rank-based reference mapping for fast and robust cell type annotation in spatial and single-cell transcriptomics
RankMap is a fast and scalable tool for reference-based cell type annotation of single-cell and spatial transcriptomics data. It uses ranked gene expression and multinomial regression to achieve robust predictions, even with partial gene coverage. Compatible with Seurat, SingleCellExperiment, and SpatialExperiment objects, RankMap offers flexible preprocessing and significantly faster runtime than tools like SingleR, Azimuth, and RCTD.
Last updated
spatialsinglecelltranscriptomicsgeneexpressionannotationregressionpreprocessingsoftware
5.15 score 2 stars 120 downloads
epiSeeker - epiSeeker: an R package for Annotation, Comparison and Visualization of multi-omics epigenetic data
This package implements functions to analyze multi-omics epigenetic data. Data of fragment type and base type are supported by epiSeeker. It provides functions to retrieve the nearest genes around the peak, to annotate the genomic region of the peak, to estimate the statistical significance of overlap among peak data sets, and to perform motif analysis. It incorporates the GEO database for users to compare their own dataset with those deposited in the database. The comparison can be used to infer cooperative regulation and thus can be used to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, overlap of peaks or genes, and the single-base resolution epigenetic data by considering the strand, motif, and additional information.
Last updated
annotationchipseqsoftwarevisualizationmultiplecomparisoncoveragemotifannotationgeneregulation
5.15 score 54 downloads
TSENAT - Tsallis Entropy Analysis Toolbox
Quantifies and models isoform-usage complexity in RNA-seq data using Tsallis entropy, a scale-dependent diversity measure. By tuning the entropic index parameter (q), TSENAT examines transcriptome heterogeneity at different scales: rare variants (low q) or dominant isoforms (high q). It enables computing Tsallis entropy and Tsallis divergence from transcript-level estimates, comparing measures between conditions, testing for differences, and visualizing scale-dependent complexity via q-curves.
Last updated
transcriptomicsrnaseqdifferentialsplicingalternativesplicingtranscriptomevariantgeneexpressiondifferentialexpressioncomplex-systemsisoform-diversityrna-seqtsallis-entropycpp
5.13 score
CrcBiomeScreen - An R package for colorectal cancer screening and microbiome analysis
A reproducible machine learning framework for microbiome-based colorectal cancer (CRC) screening, developed and benchmarked by systematically evaluating normalization strategies, taxonomic resolutions, and class imbalance handling. This R package allows users to apply the full pipeline or selectively run specific components depending on their analytical needs. It establishes a scalable foundation for developing interpretable microbiome-based screening tools to support early CRC detection. This approach could be easily implemented in a national screening programme to improve early detection rates for this disease.
Last updated
softwaremicrobiomemetagenomicsclassificationnormalizationvisualization
5.13 score 7 scripts 226 downloads
postNet - Post-transcriptional network modeling
A tool that enables in silico identification, integration, and modeling of mRNA features that influence post-transcriptional regulation of gene expression at a transcriptome-wide scale.
Last updated
geneexpressiongeneregulationtranscriptomicsriboseqrnaseqsequencingannotationnetworkfeatureextractioncpp
5.13 score 2 scripts 276 downloads
RNAshapeQC - RNA Coverage-Shape-Based Quality Control Metrics
RNAshapeQC provides coverage-shape-based quality control (QC) metrics for mRNA-seq and total RNA-seq data. It supports per-gene pileup construction from BAM files as well as toy datasets for quick-start examples. The package implements protocol-specific metrics, including decay rate (DR), degradation score (DS), mean coverage depth (MCD), window coefficient of variation (wCV), area under the curve (AUC), and shape-based sample-level indices. RNAshapeQC also includes HPC-friendly functions for per-gene batch processing and cross-study pileup generation. This package enables interpretable, protocol-specific QC assessments for diverse RNA-seq workflows.
Last updated
rnaseqqualitycontrolcoveragetranscriptomicssequencing
5.11 score 2 stars 1 scripts 266 downloads
AnVILAz - R / Bioconductor Support for the AnVIL Azure Platform
The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The AnVILAz package supports end-users and developers using the AnVIL platform in the Azure cloud. The package provides a programmatic interface to AnVIL resources, including workspaces, notebooks, tables, and workflows. The package also provides utilities for managing resources, including copying files to and from Azure Blob Storage, and creating shared access signatures (SAS) for secure access to Azure resources.
Last updated
softwareinfrastructurethirdpartyclientu24hg010263
5.08 score 5 scripts 254 downloads
goSorensen - Statistical inference based on the Sorensen-Dice dissimilarity and the Gene Ontology (GO)
This package implements inferential methods to compare gene lists in terms of their biological meaning as expressed in the GO. The compared gene lists are characterized by cross-tabulation frequency tables of enriched GO items. Dissimilarity between gene lists is evaluated using the Sorensen-Dice index. The fundamental guiding principle is that two gene lists are taken as similar if they share a great proportion of common enriched GO items.
Last updated
annotationgogenesetenrichmentsoftwaremicroarraypathwaysgeneexpressionmultiplecomparisongraphandnetworkreactomeclusteringkegg
5.08 score 2 stars 7 scripts 300 downloads
GSABenchmark - Tools for benchmarking single-cell gene set analysis methods
GSABenchmark is a package designed for benchmarking scRNA-seq gene set analysis (scGSA) methods. It provides both traditional and novel benchmark metrics, as well as visualization tools. Currently, GSABenchmark supports 17 scGSA methods.
Last updated
softwaresinglecellgenesetenrichmentgeneexpressionvisualization
5.08 score 1 stars 2 scripts 213 downloads
smoppix - Analyze Single Molecule Spatial Omics Data Using the Probabilistic Index
Test for univariate and bivariate spatial patterns in spatial omics data with single-molecule resolution. The tests implemented allow for analysis of nested designs and are automatically calibrated to different biological specimens. Tests for aggregation, colocalization, gradients and vicinity to cell edge or centroid are provided.
Last updated
transcriptomicsspatialsinglecellcpp
5.04 score 1 stars 5 scripts 277 downloads
OAtools - Analysis of OpenArray PCR Data
Provides a suite of R functions to analyze gene expression experiments on the OpenArray real-time PCR platform. OAtools fits logistic regressions to fluorescence curves to distinguish between real amplification and false positives. OAtools supports data import, analysis, and visualization through plots and a dynamic HTML report.
Last updated
qpcrgeneexpressiondataimportregression
5.04 score 3 scripts 80 downloads
HubPub - Utilities to create and use Bioconductor Hubs
HubPub provides users with functionality to help with the Bioconductor Hub structures. The package provides the ability to create a skeleton of a Hub style package that the user can then populate with the necessary information. There are also functions to help add resources to the Hub package metadata files as well as publish data to the Bioconductor S3 bucket.
Last updated
dataimportinfrastructuresoftwarethirdpartyclientbioconductor-package
5.03 score 3 stars 6 scripts 498 downloads
Rega - R Interface to European Genome-Phenome Archive
The European Genome-phenome Archive (EGA) provides long-term storage and controlled sharing of personally identifiable genetic data. The Rega package offers a streamlined and extensible R interface to the EGA API, facilitating the programmatic upload of metadata. A GEO-like Excel submission template is provided as the default method of organizing submission metadata.
Last updated
softwareinfrastructurethirdpartyclient
5.02 score 3 scripts 246 downloads
jvecfor - Fast K-Nearest Neighbor Search for Single-Cell Analysis
Drop-in replacement for BiocNeighbors::findKNN using the jvecfor Java library, which builds on the jvector library to leverage the Java Vector API for portable SIMD acceleration across AVX2, AVX-512, and ARM NEON hardware. jvecfor/jvector implements HNSW-DiskANN approximate search and VP-tree exact search. The package achieves approximately 2x speedup over Annoy-based search at n >= 50K cells while returning output structurally identical to BiocNeighbors, making it suitable for seamless integration into existing Bioconductor single-cell workflows. Convenience wrappers delegate shared nearest-neighbor (SNN) and k-nearest-neighbor (KNN) graph construction to the bluster package.
Last updated
singlecellgraphandnetworkclusteringclassification
4.95 score 3 stars 6 scripts 211 downloads
SEMPLR - SNP Effect Matrix Pipeline in R
SEMPLR computes transcription factor binding affinity scores for genomic positions and genetic variants. Scores are computed from SNP Effect Matrices (SEMs) produced by SEMpl. 223 pre-computed SEMs are included with the package or custom sets can be provided. Enrichment can be tested among sets of genomic positions to determine if transcription factor binding events occur more often than expected. Comparing binding affinity scores between alleles can reveal differences in transcription factor binding with genetic variation. This package also includes several visualization functions to view scores both on the motif and variant/position level.
Last updated
motifannotationtranscriptionsnpgenomicvariationcpp
4.93 score 1 stars 2 scripts 185 downloads
multistateQTL - Toolkit for the analysis of multi-state QTL data
A collection of tools for doing various analyses of multi-state QTL data, with a focus on visualization and interpretation. The package 'multistateQTL' contains functions which can remove or impute missing data, identify significant associations, and categorise features as global, multi-state or unique. The analysis results are stored in a 'QTLExperiment' object, which is based on the 'SummarizedExperiment' framework.
Last updated
functionalgenomicsgeneexpressionsequencingvisualizationsnpsoftware
4.93 score 2 stars 17 scripts 264 downloads
BamScale - Bioconductor-Friendly Multithreaded BAM Processing
Multithreaded sequential BAM processing built on top of the ompBAM C++ engine. BamScale provides user-friendly BAM read and scan interfaces designed for compatibility with existing Bioconductor workflows.
Last updated
softwaredataimportsequencingcoveragealignmentqualitycontrolcpp
4.90 score
pairedGSEA - Paired DGE and DGS analysis for gene set enrichment analysis
pairedGSEA makes it simple to run a paired Differential Gene Expression (DGE) and Differential Gene Splicing (DGS) analysis. The package allows you to store intermediate results for further investigation, if desired. pairedGSEA comes with a wrapper function for running an Over-Representation Analysis (ORA) and functionalities for plotting the results.
Last updated
differentialexpressionalternativesplicingdifferentialsplicinggeneexpressionimmunooncologygenesetenrichmentpathwaysrnaseqsoftwaretranscription
4.90 score 4 stars 4 scripts 244 downloads
SMTrackR - SMTrackR: an R/Bioconductor package for mapping protein binding at individual DNA molecules
The package uses information imprinted by exogenous enzymes (for example GpC methyltransferase, CpG methyltransferase, and adenine methyltransferases) to map protein-DNA binding on individual sequenced DNA molecules. Public datasets from such assays are compiled into tracks and hosted on public servers like Galaxy for seamless access by this package.
Last updated
nucleosomepositioningvisualizationgenetargetgenomeassembly
4.90 score 2 scripts 145 downloads
DOTSeq - Genome-wide Detection of Differential ORF Usage
Differential open reading frame (ORF) translation analysis framework for ribosome profiling (Ribo-seq) with matched RNA-seq. Implements (i) Differential ORF Usage (DOU), a beta-binomial generalized linear model that models the expected proportion of Ribo-seq versus RNA-seq reads mapping to each ORF within a gene, and (ii) ORF-level Differential Translation Efficiency (DTE), a negative binomial GLM that captures changes in translation efficiency of individual ORFs across experimental conditions. Supports ORF-level read summarization for bulk and single-cell Ribo-seq.
Last updated
riboseqsinglecellgeneregulationgeneexpressiondifferentialexpressiongeneticssequencingsoftwarernaseqbayesianregressionmultiplecomparisoncpp
4.90 score 1 stars 6 scripts 190 downloads
HistoImagePlot - Plotting functionality for Histopathology pipeline datasets
Create side-by-side visualizations of a tissue thumbnail image and the HoverNet cell segmentation with colored cell type labels. The package automatically retrieves the thumbnail image associated with a HoverNet JSON file and overlays the segmentation data. It is intended for researchers working with histopathological images, facilitates exploratory analysis, and integrates with the imageFeatureTCGA Bioconductor package.
Last updated
softwarevisualizationspatial
4.88 score 2 scripts 251 downloads
BiocPkgDash - An interactive Shiny dashboard for Bioconductor package maintainers
This package provides an interactive Shiny dashboard for Bioconductor package maintainers. It visualizes various package statuses, metadata, and development metrics, offering insights into package health and activity. This tool aims to support maintainers of multiple packages by filtering packages via maintainer email.
Last updated
softwareinfrastructurevisualizationguiu24ca289073
4.88 score 4 scripts 108 downloads
BiocBuildReporter - Functions to process a bioconductor build report database
This package reads remote parquet files that have processed Bioconductor build report logs. Users may query the tables directly for specific information or use pre-defined helper functions for common queries. The logs processed are from https://bioconductor.org/checkResults/. In the future we will extend this package out to include processing of r-universe logs.
Last updated
softwareinfrastructure
4.85 score 2 scripts 79 downloads
atacInferCnv - Call CNV from single cell ATAC-seq data based on InferCNV adaptation
The package prepares input scATAC-seq data and adapts it for copy number variation profiling with the InferCNV package. It also has various parameters to control the analysis (e.g. external normal reference usage, meta-cells, bin size, etc.) and custom plot visualizations.
Last updated
epigeneticssequencingcopynumbervariationsinglecellimmunooncologycppjags
4.85 score 3 scripts 210 downloads
miaDash - Dashboard for the interactive analysis and exploration of microbiome data
miaDash provides a Graphical User Interface for the exploration of microbiome data. This way, no knowledge of programming is required to perform analyses. Datasets can be imported, manipulated, analysed and visualised with a user-friendly interface.
Last updated
microbiomesoftwarevisualizationguishinyappsdataimportbioinformaticsdashboardiseemiashinyvisualisationwebapp
4.78 score 1 stars 8 scripts 240 downloads
scPassport - Passport System for Single-Cell Objects
Stamps Seurat, SingleCellExperiment, and SummarizedExperiment objects with a persistent metadata passport. For Seurat objects the passport is stored in the misc slot; for SingleCellExperiment and SummarizedExperiment objects it is stored in the metadata slot. Tracks animal info, experiment details, lineage (parent/child relationships), RDS registry numbers, processing logs, and custom fields. Includes an interactive Shiny gadget to fill and update the passport, and a read mode to print the full passport to console. The passport persists inside the RDS file with no external files needed.
Last updated
singlecelldataimportvisualizationinfrastructurecpp
4.78 score 3 stars 262 downloads
sfi - Data analysis for Single File Injections (SFIs) mode LC-MS analysis
Data analysis for Single File Injections (SFIs) mode LC-MS analysis. In SFIs mode, pooled samples are initially injected to serve as reference peaks for subsequent analyses. Repeated injections of individual samples are then performed at fixed time intervals using isocratic elution. This package provides functions to analyze data from SFIs mode, including peak picking and peak reassignment.
Last updated
massspectrometrymetabolomicsfeatureextractioncpp
4.78 score 1 stars 5 scripts 201 downloads
dmGsea - Efficient Gene Set Enrichment Analysis for DNA Methylation Data
The R package dmGsea provides efficient gene set enrichment analysis specifically for DNA methylation data. It addresses key biases, including probe dependency and varying probe numbers per gene. The package supports Illumina 450K, EPIC, and mouse methylation arrays. Users can also apply it to other omics data by supplying custom probe-to-gene mapping annotations. dmGsea is flexible, fast, and well-suited for large-scale epigenomic studies.
Last updated
genesetenrichmentpathwaysdnamethylationproteomicssequencingcopynumbervariationgeneexpressiongenomicvariationcoverage
4.74 score 3 scripts 282 downloads
asuri - Analysis of SUrvival and RIsk prediction in patients based on gene signatures
The asuri (Analysis of SUrvival and patients RIsk prediction based on gene signatures) package discovers marker genes that are related to risk prediction capabilities and to a clinical variable of interest. It uses two main steps: subsampling glmnet and unicox. The package implements robust functions to discover survival markers related to a clinical phenotype and to predict a risk score, making it possible to study a patient's risk based on the gene signatures. Several plots are provided to visualise the relevance of the genes, the risk score, and patient stratification, as well as a robust version of the Kaplan-Meier curves.
Last updated
softwarestatisticalmethodworkflowstepgeneexpressionmicroarraydifferentialexpressiongenepredictionregressionsurvivalexonarraymultiplecomparison
4.74 score 1 scripts 144 downloads
Ularcirc - Shiny app for canonical and back splicing analysis (i.e. circular and mRNA analysis)
Ularcirc reads in STAR aligned splice junction files and provides visualisation and analysis tools for splicing analysis. Users can assess backsplice junctions and forward canonical junctions.
Last updated
datarepresentationvisualizationgeneticssequencingannotationcoveragealternativesplicingdifferentialsplicing
4.70 score 6 scripts 348 downloads
BiocAzul - Programmatic Access to the Azul API
Represents the OpenAPI v2 Azul API as an R object for performing requests. The infrastructure uses the AnVIL and rapiclient packages. Users can connect to either the AnVIL or Human Cell Atlas Data Explorers.
Last updated
softwareinfrastructuredataimportthirdpartyclientu24hg010263
4.70 score 2 scripts 150 downloads
GOaGO - Gene Ontology enrichment analysis of gene pairs
GO-a-GO annotates Gene Ontology terms that are enriched in a given set of gene pairs. The enrichment is calculated from a permutation test for overrepresentation of gene pairs that are associated with a shared term. Such gene pairs are counted for the original set of gene pairs and compared against randomized sets in which the structure of the pairs is preserved, but the gene identities (including the associated terms) are permuted.
Last updated
gogenesetenrichment
4.70 score 1 stars 88 downloads
PlinkMatrix - DelayedArray interface for plink bed files
This package provides a DelayedArray interface for plink bed files. There is support for interfacing to plink genotype data via RangedSummarizedExperiment. Example data from the GEUVADIS project (internationalgenome.org) are used for demonstration.
Last updated
infrastructuregeneticscpp
4.70 score 1 stars 10 scripts 197 downloads
fourSynergy - Ensemble algorithm for 4C-seq data
fourSynergy is an ensemble algorithm leveraging synergies among the existing 4C-seq algorithms r3C-seq, peakC, r.4cker and fourSig. It uses a weighted voting approach to perform improved interaction calling. fourSynergy also supports differential interaction calling.
Last updated
sequencingsoftwaredifferentialpeakcalling
4.70 score 130 downloads
Battlefield - Swiss-army toolkit for selecting niche fronts and invasive margins in spatial transcriptomics data
Battlefield is a Swiss-army toolkit originally developed to define and extract spatial spots from specific tissue regions (such as front regions, niche borders, invasive margins, and cluster interfaces) using spatial transcriptomics data or clustered tissue maps. It has since been extended to support trajectory selection and layer inspection, and now provides a collection of low-level utilities for spatial transcriptomics analysis. These utilities are primarily intended to be reused within higher-level analytical packages. It is designed to work with sequencing-based platforms such as Visium at several resolutions and Visium HD (binned).
Last updated
sequencingsoftwaretranscriptomicsspatial
4.68 score 12 scripts 88 downloads
scMitoMut - Single-cell Mitochondrial Mutation Analysis Tool
This package is designed for calling lineage-informative mitochondrial mutations using single-cell sequencing data, such as scRNASeq and scATACSeq (preferably the latter due to RNA editing issues). It includes functions for mutation calling and visualization. Mutation calling is done using beta-binomial distribution.
Last updated
preprocessingsequencingsinglecellopenblascpp
4.65 score 3 stars 8 scripts 244 downloads
ClonalSim - Simulation of Tumor Clonal Evolution with Realistic Sequencing Noise
ClonalSim generates realistic mutational profiles of tumor samples with hierarchical clonal structure. It simulates founder, shared, and private mutations with biologically realistic noise models including intra-tumor heterogeneity (Beta distribution) and technical sequencing noise (negative binomial depth variation, binomial read sampling, base errors). The package is designed for benchmarking variant callers, testing clonal deconvolution algorithms, and teaching tumor heterogeneity concepts.
Last updated
softwaresequencingsomaticmutationvariantdetectioncoveragevisualizationdataimport
4.65 score 1 stars 170 downloads
queeems - Quantify the Extent of Evolutionary Evidence in Molecular Sequences
Biological inferences obtained from molecular data are only as good as the extent of evolutionary signatures retained in the genetic data. Techniques available to quantify these signatures are largely targeted towards phylogeny reconstruction, and they often rely on ad hoc hypothesis tests of significance. I present a Bayesian function that assesses whether a set of genetic sequences is saturated. That is, it is useful for determining whether the evolutionary information in the sequences has eroded with time. Site-specific Bayes factors are generated with respect to codon bases to allow for straightforward applications in extensive computational biology inquiries, including natural selection analyses.
Last updated
alignmentbayesianclassificationdataimportgeneticsmathematicalbiologyresearchfieldsequencingsequencematchingsoftwarestatisticalmethodworkflowstepbayesian-statisticsevolutionary-biologymolecular-biologynatural-selectionphylogeneticsstatistical-analysis
4.65 score 1 scripts 186 downloads
betterChromVAR - Improved ChromVAR (Chromatin Variation Across Regions)
A much faster analytical implementation of chromVAR, with additional features, used to infer TF activity from (bulk or single-cell) ATAC-seq data and motif annotations (or binding probabilities). The package also includes the CVnorm normalization method based on the chromVAR logic.
Last updated
softwareatacseqnormalizationepigeneticssequencing
4.60 score 1 stars 8 scripts 135 downloads
toppgene - Gene List Enrichment Analysis using the ToppGene Suite
The ToppGene Suite is a one-stop portal for gene list enrichment analysis and candidate gene prioritization based on functional annotations and protein interactions network. Although the ToppCluster web application provides convenient graphical access to the ToppGene Suite, the OpenAPI 3.0 compliant interface of ToppGene is better suited for automation and reproducibility. This package includes Bioconductor class interfaces and biological examples.
Last updated
clusteringgeneexpressiongenesetenrichmentgeneticsmotifdiscoverynetworknetworkenrichmentpathwayspharmacogeneticsproteomicssoftwarethirdpartyclient
4.60 score 1 stars 2 scripts 42 downloads
terraTCGAdata - OpenAccess TCGA Data on Terra as MultiAssayExperiment
Leverage the existing open access TCGA data on Terra with well-established Bioconductor infrastructure. Make use of the Terra data model without learning its complexities. With a few functions, you can copy / download and generate a MultiAssayExperiment from the TCGA example workspaces provided by Terra.
Last updated
softwareinfrastructuredataimportbioconductor-packageu24hg010263
4.60 score 4 scripts 290 downloads
fastRanges - Deterministic Multithreaded Genomic Interval Operations
High-performance interval overlap and join operations for 'IRanges' and 'GenomicRanges'. The package provides deterministic multithreaded overlap computation, reusable subject indexes for repeated queries, and join helpers that keep range metadata in a consistent output grammar.
Last updated
softwareinfrastructuresequencingcpp
4.60 score 2 stars 115 downloads
GenomicCoordinates - Enhanced string parsing for genomic coordinates
Extends string parsing capabilities for genomic coordinates, supporting various formats including comma-separated numbers, space-delimited coordinates, and automatic detection of GRanges, GPos, and GInteractions objects.
Last updated
infrastructuredatarepresentationgenomeannotation
4.60 score 2 stars 4 scripts 252 downloads
epigraHMM - Epigenomic R-based analysis with hidden Markov models
epigraHMM provides a set of tools for the analysis of epigenomic data based on hidden Markov Models. It contains two separate peak callers, one for consensus peaks from biological or technical replicates, and one for differential peaks from multi-replicate multi-condition experiments. In differential peak calling, epigraHMM provides window-specific posterior probabilities associated with every possible combinatorial pattern of read enrichment across conditions.
Last updated
chipseqatacseqdnaseseqhiddenmarkovmodelepigeneticscurlopensslopenblascppopenmp
4.56 score 91 scripts 306 downloads
RFGeneRank - RFGeneRank: Cross-validated Stable Predictive Gene Ranking for Transcriptomics
Tools to harmonize bulk RNA-seq matrices, optionally apply batch correction, and train cross-validated classification models using ranger, glmnet, or xgboost. Supports leakage-safe feature selection, permutation importance, SHAP-based interpretability, and calibration methods (Platt or isotonic). Provides stability metrics across folds, embeddings (PCA/UMAP), ROC visualization, SHAP dependence plots, and tidy ranked-gene tables for downstream analysis.
Last updated
transcriptomicsrnaseqgeneexpressionfeatureextractionclassificationvisualizationsoftwarestatisticalmethodalignment
4.54 score 2 scripts 193 downloads
Seqtometry - Signature scoring for single cell analysis
This package provides functions used in Seqtometry (Kousnetsov et al. 2024), a method for analyzing single cell (scRNA-seq or scATAC-seq) data via signature (gene set) enrichment scores. The Seqtometry scores may be useful for annotating or characterizing cells, either in a flow cytometry like workflow (where scores are standalone features used for progressive partitioning as described in the Seqtometry publication) or in a cluster-based workflow (as features of clusters). The exported impute function (a port of Python's MAGIC-impute, van Dijk et al. 2018) may also be useful for single cell analysis on its own.
Last updated
singlecellgenesetenrichmentgeneexpressioncpp
4.54 score 1 stars 1 scripts 204 downloads
glycoTraitR - Compute and analyze the glycan structural traits from GPSM data
GlycoTraitR is an R package for analyzing glycoproteomics data, particularly glycopeptide-spectrum matches (GPSMs). It supports results generated by the pGlyco3 and Glyco-Decipher search engines. The package parses glycan structures, computes monosaccharide compositions and structural traits, and performs differential analysis of glycan heterogeneity. It constructs trait-by-PSM matrices stored in a SummarizedExperiment object, supports user-defined structural motifs, and provides visualization utilities for interpreting glycan trait changes.
Last updated
proteomicsmassspectrometryvisualizationsoftware
4.54 score 2 scripts 183 downloads
Rfastp - An Ultra-Fast and All-in-One Fastq Preprocessor (Quality Control, Adapter, low quality and polyX trimming) and UMI Sequence Parsing.
Rfastp is an R wrapper of fastp, which is developed in C++. fastp performs quality control for fastq files, including low-quality base trimming, polyX trimming, adapter auto-detection and trimming, paired-end read merging, and UMI sequence/id handling. Rfastp can concatenate multiple files into one file (like the shell command cat) and accept multiple files as input.
Last updated
qualitycontrolsequencingpreprocessingsoftwarezlibcpp
4.53 score 56 scripts 411 downloads
smoothclust - smoothclust
Method for identification of spatial domains and spatially-aware clustering in spatial transcriptomics data. The method generates spatial domains with smooth boundaries by smoothing gene expression profiles across neighboring spatial locations, followed by unsupervised clustering. Spatial domains consisting of consistent mixtures of cell types may then be further investigated by applying cell type compositional analyses or differential analyses.
Last updated
spatialsinglecelltranscriptomicsgeneexpressionclustering
4.51 score 1 stars 13 scripts 279 downloads
BatChef - Single-cell RNA-seq batch effects correction methods interface
This package implements a variety of methods for batch correction in single-cell RNA sequencing (scRNA-seq) data. It incorporates quantitative metrics (e.g. Wasserstein distance, Adjusted Rand Index) to evaluate their performance. Furthermore, the package assists users in identifying and applying the optimal method for specific datasets.
Last updated
batcheffectsinglecellsequencingcpp
4.51 score 4 stars 2 scripts 157 downloads
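The BatChef entry above names the Adjusted Rand Index and the Wasserstein distance as evaluation metrics. As a rough illustration of what these two quantities measure, here is a minimal Python sketch. It is not BatChef's implementation (the package is written for R and its internals are not shown in this listing); the function names `adjusted_rand_index` and `wasserstein_1d` are hypothetical, and the Wasserstein version assumes two one-dimensional samples of equal size.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items.

    Hypothetical helper for illustration; 1.0 means identical partitions,
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions trivially agree

    # Pairs of items that share a cluster in both labelings, in A, and in B.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. a single cluster in both labelings
    return (sum_cells - expected) / (max_index - expected)


def wasserstein_1d(x, y):
    """Wasserstein-1 distance between two equal-sized 1-D samples.

    Assumption: equal sample sizes, so the optimal transport plan simply
    matches the sorted values one to one.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(x), sorted(y))) / len(x)


if __name__ == "__main__":
    # Toy data: cluster labels compared against known cell types, and one
    # gene's expression values in two batches.
    cell_types = ["T", "T", "B", "B", "NK", "NK"]
    clusters = [0, 0, 1, 1, 2, 1]
    print(adjusted_rand_index(cell_types, clusters))
    print(wasserstein_1d([0.1, 0.4, 0.9], [0.3, 0.5, 1.2]))
```

In a batch-correction setting, a higher Adjusted Rand Index between clusters and known cell types, together with a smaller Wasserstein distance between batches after correction, would usually be read as better integration.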
GraphExperiment - S4 Class for Quantitative Data and Associated Networks
GraphExperiment provides users and developers with an S4 class that extends `SingleCellExperiment` by offering infrastructure to store and retrieve networks (`igraph` objects) representing how assay features and/or observations are associated with each other. The class was designed to store networks inferred from high-dimensional quantitative data, with feature-feature networks including gene coexpression networks (GCNs), gene regulatory networks (GRNs), and co-abundance networks (from proteomics and metabolomics), and observation-observation networks including cell-cell distances, species-species relationships, and sample-sample similarities.
Last updated
datarepresentationdataimportinfrastructuregeneexpressiontranscriptomicsnetworksinglecellbioconductorbioinformaticsoop
4.48 score 1 stars 2 scripts 158 downloads
augere.gsea - Automatic Generation of Gene Set Enrichment Analyses
Implements pipelines for generating gene set enrichment analysis reports in the augere framework. This includes various competitive and self-contained gene sets from a variety of Bioconductor packages. Each pipeline function generates a self-contained Rmarkdown report with all of the steps required to reproduce the gene set enrichment analysis.
Last updated
workflowmanagementreportwritinggenesetenrichment
4.48 score 31 downloads
SpliceImpactR - An R package to identify functional impacts due to alternative RNA processing events
Works by taking in processed data from the HIT Index and/or rMATS and identifying how differentially used alternative RNA processing events lead to changes in protein function through various means. Primarily this is done through protein similarity, functional protein domain analysis, and domain-domain interaction changes. Notably, we both identify alternative RNA processing event 'swaps' across conditions and are able to perform holistic analyses regarding the impact of different RNA processing events.
Last updated
alternativesplicingdifferentialsplicingstatisticalmethodalignment
4.48 score 2 stars 3 scripts 235 downloads
ExpoRiskR - Exposure-Aware Multi-Omics Risk Modeling
ExpoRiskR provides tools for exposure-aware multi-omics risk modeling in translational and environmental health studies. The package aligns sample identifiers across exposure and multi-omics blocks, performs lightweight preprocessing, and fits exposure-adjusted association models to build interpretable microbe–metabolite networks. It also computes simple exposure perturbation summaries and generates publication-ready visualizations. Workflows support both matrix-based inputs and SummarizedExperiment objects.
Last updated
softwarenetworksystemsbiologymetabolomicsmicrobiomeregression
4.48 score 250 downloads
HiSpaR - Hierarchical Inference of Spatial Positions from Hi-C Data
Provides R bindings for HiSpa, a hierarchical Bayesian model for inferring three-dimensional chromatin structures from Hi-C contact matrices using Markov Chain Monte Carlo (MCMC) sampling. The package implements a cluster-based hierarchical approach that efficiently handles large-scale Hi-C datasets. It uses Rcpp and RcppArmadillo for efficient C++ integration with the original HiSpa C++ implementation, enabling fast computation of chromatin structure inference through parallel MCMC sampling.
Last updated
softwareepigeneticshicstructuralpredictionbayesianspatialopenblascpp
4.48 score 3 scripts 193 downloads
annoLinker - Annotating genomic regions through chromatin interaction links
Fast annotation of genomic peaks using DNA interaction data by constructing interaction networks with igraph, where peaks overlapping any node in a connected subgraph are annotated with all genes in that subgraph. The annotation evidence could be visualized as either a network graph or a genomic track integrated with gene annotation information.
Last updated
networkannotationvisualization
4.48 score 179 downloads
BiocHubsShiny - View AnnotationHub and ExperimentHub Resources Interactively
A package that allows interactive exploration of AnnotationHub and ExperimentHub resources. It uses DT / DataTable to display resources for multiple organisms. It provides template code for reproducibility and for downloading resources via the indicated Hub package.
Last updated
softwareshinyapps
4.40 score 3 scripts 342 downloads
augere.screen - Automatic Generation of Functional Screen Analyses
Implements pipelines for generating functional screen analysis reports in the augere framework. This uses voom to test for differential abundance of barcodes with consolidation into gene-level statistics. Each pipeline function generates a self-contained Rmarkdown report with all of the steps required to reproduce the analysis.
Last updated
workflowmanagementreportwritinggenetargetfunctionalgenomics
4.40 score 31 downloads
parati - Parental Allele Transmission Inference for Trio Genotype Data
Infers maternal and paternal transmitted and non-transmitted alleles from phased trio genotype data. The package supports SNP-level analyses of genetic nurture and transgenerational effects. It interoperates with Bioconductor VCF infrastructure through support for VariantAnnotation::VCF objects and returns R objects for downstream analysis.
Last updated
geneticssnpsequencingvariantannotationsoftware
4.40 score 238 downloads
augere.solo - Automatic Generation of Single-Cell Analyses
Implements pipelines for generating single-cell analysis reports in the augere framework. This uses scrapper to execute routine steps such as quality control, normalization, feature selection, clustering and marker detection. We also implement a pipeline for automatic cell type annotation against a labelled reference with SingleR. Each pipeline function generates a self-contained Rmarkdown report with all of the steps required to reproduce its analysis.
Last updated
workflowmanagementreportwritingsinglecell
4.30 score 35 downloads
fraq - A High-Throughput and Extensible Toolkit for Processing FASTQ Data
High-throughput extensible toolkit for processing FASTQ data. The goal of this package is to empower users to quickly build out small programmatic 'kernels' to define any FASTQ processing task they may need. Builds on Intel TBB’s flow graph to orchestrate concurrent I/O and data processing; throughput can be as fast as compression and disk speed allow. The package also ships with a suite of predefined kernels for common FASTQ tasks.
Last updated
softwareinfrastructuresequencingdnaseqqualitycontrolalignmentcpp
4.30 score 1 stars 9 scripts 120 downloads
DNAcycP2 - DNA Cyclizability Prediction
This package predicts the intrinsic cyclizability of every 50-bp subsequence in a DNA sequence. The input can be a file in either FASTA or text format. The output is the C-score, the estimated intrinsic cyclizability score, for each 50-bp subsequence in each entry of the sequence set.
Last updated
neuralnetworkstructuralprediction
4.30 score 3 scripts 254 downloads
RBedMethyl - Disk-backed Representation of ONT bedMethyl Files
Bioconductor-native infrastructure for handling large nanoporetech modkit bedMethyl pileup files from ONT data using HDF5Array and DelayedArray.
Last updated
dnamethylationdifferentialmethylationepigeneticsinfrastructuredataimportsoftware
4.30 score 3 scripts 171 downloads
BiocMaintainerApp - View Bioconductor Package Maintainer Information Interactively
This package allows interactive viewing of package maintainer information. The Bioconductor Package Maintainer Application sends yearly verification emails asking maintainers to accept Bioconductor policies; this application also shows each maintainer's opt-in status and whether their email is deemed valid.
Last updated
infrastructureshinyapps
4.30 score 68 downloads
CyTOFpower - Power analysis for CyTOF experiments
This package is a tool to predict the power of CyTOF experiments in the context of differential state analyses. The package provides a shiny app with two options to predict the power of an experiment: (i) generation of in silico CyTOF data from user input; (ii) browsing a grid of parameters for which the power was already precomputed.
Last updated
flowcytometrysinglecellcellbiologystatisticalmethodsoftware
4.18 score 4 scripts 144 downloads
wavFeatExt - Wavelet-based Feature Extraction for Copy-number Alteration Data
Provides tools for simulating copy-number alteration (CNA) profiles, applying a non-decimated Haar wavelet transform to genomic signals, and extracting wavelet-derived features for use in supervised learning. Multiple machine learning methods including lasso and elastic-net regularisation, random forest, partial least squares, neural networks and k-nearest neighbours are implemented to train predictive models from genomic feature vectors. The workflow enables end-to-end analysis from CNA simulation to feature extraction and classification.
Last updated
copynumbervariationgenomicvariationfeatureextractionclassification
4.04 score 73 downloads
simPIC - Flexible simulation of paired-insertion counts for single-cell ATAC-sequencing data
simPIC is a package for simulating single-cell ATAC-seq count data. It provides a user-friendly, well documented interface for data simulation. Functions are provided for parameter estimation, realistic scATAC-seq data simulation, and comparing real and simulated datasets.
Last updated
singlecellatacseqsoftwaresequencingimmunooncologydataimportbioconductorbioinformaticsscatac-seqsimulation
4.00 score 10 scripts 255 downloads
ExperimentHubData - Add resources to ExperimentHub
Functions to add metadata to ExperimentHub db and resource files to AWS S3 buckets.
Last updated
infrastructuredataimportguithirdpartyclient
3.95 score 1 dependents 7 scripts 748 downloads
RegEnrich - Gene regulator enrichment analysis
This package is a pipeline to identify the key gene regulators in a biological process, for example in cell differentiation and in cell development after stimulation. There are four major steps in this pipeline: (1) differential expression analysis; (2) regulator-target network inference; (3) enrichment analysis; and (4) regulator scoring and ranking.
Last updated
geneexpressiontranscriptomicsrnaseqtwochanneltranscriptiongenetargetnetworkenrichmentdifferentialexpressionnetworknetworkinferencegenesetenrichmentfunctionalprediction
3.92 score 28 scripts 384 downloads
ncRNAtools - An R toolkit for non-coding RNA
ncRNAtools provides a set of basic tools for handling and analyzing non-coding RNAs. These include tools to access the RNAcentral database and to predict and visualize the secondary structure of non-coding RNAs. The package also provides tools to read, write and interconvert the file formats most commonly used for representing such secondary structures.
Last updated
functionalgenomicsdataimportthirdpartyclientvisualizationstructuralprediction
3.83 score 1 stars 34 scripts 350 downloads
methodical - Discovering genomic regions where methylation is strongly associated with transcriptional activity
DNA methylation is generally considered to be associated with transcriptional silencing. However, comprehensive, genome-wide investigation of this relationship requires the evaluation of potentially millions of correlation values between the methylation of individual genomic loci and expression of associated transcripts in a relatively large number of samples. Methodical makes this process quick and easy while keeping a low memory footprint. It also provides a novel method for identifying regions where a number of methylation sites are consistently strongly associated with transcriptional expression. In addition, Methodical enables housing DNA methylation data from diverse sources (e.g. WGBS, RRBS and methylation arrays) within a common framework, lifting over DNA methylation data between different genome builds and creating base-resolution plots of the association between DNA methylation and transcriptional activity at transcriptional start sites.
Last updated
dnamethylationmethylationarraytranscriptiongenomewideassociationsoftware
3.82 score 39 scripts 264 downloads
CNViz - Copy Number Visualization
CNViz takes probe, gene, and segment-level log2 copy number ratios and launches a Shiny app to visualize your sample's copy number profile. You can also integrate loss of heterozygosity (LOH) and single nucleotide variant (SNV) data.
Last updated
visualizationcopynumbervariationsequencingdnaseq
3.48 score 1 scripts 284 downloads
MSTree - MSTree plotting minimum spanning tree directly from the output of ChewBBACA pipeline
This package is used to generate a graph object from the output of chewBBACA pipeline (https://chewbbaca.readthedocs.io/en/latest/). Then, the generated graph object can be used to make a minimum spanning tree (MST). The minimum spanning tree can be customized using all the available arguments. This package consists of two functions: one to build the graph and another one for plotting.
Last updated
comparativegenomicsgenomicvariationclustering
3.48 score
TDbasedUFEadv - Advanced package of tensor decomposition based unsupervised feature extraction
This is an advanced version of TDbasedUFE, which is a comprehensive package to perform tensor decomposition based unsupervised feature extraction. In contrast to TDbasedUFE, which can perform simple feature selection and multiomics analyses, this package offers more complicated and advanced features, although these are less commonly required. Only users who require more specific features need to make use of its functionality.
Last updated
geneexpressionfeatureextractionmethylationarraysinglecellsoftwarebioconductor-packagebioinformaticstensor-decomposition
3.30 score 10 scripts 329 downloads
singIST - comparative single-cell transcriptomics between disease models and a human condition
Provides toolkits to implement a full singIST analysis with pseudobulked Seurat objects of disease models and human data.
Last updated
singlecellclassificationtranscriptomics
3.04 score 9 scripts 63 downloads
a4Base - Automated Affymetrix Array Analysis Base Package
Base utility functions are available for the Automated Affymetrix Array Analysis set of packages.
Last updated
microarray
2.95 score 1 dependents 6 scripts 545 downloads
XAItest - XAItest: Enhancing Feature Discovery with eXplainable AI
XAItest is an R Package that identifies features using eXplainable AI (XAI) methods such as SHAP or LIME. This package allows users to compare these methods with traditional statistical tests like t-tests, empirical Bayes, and Fisher's test. Additionally, it includes simThresh, a system that enables the comparison of feature importance with p-values by incorporating calibrated simulated data.
Last updated
softwarestatisticalmethodfeatureextractionclassificationregression
2.85 score 1 stars 5 scripts 235 downloads
plotgardener - Coordinate-Based Genomic Visualization Package for R
Coordinate-based genomic visualization package for R. It grants users the ability to programmatically produce complex, multi-paneled figures. Tailored for genomics, plotgardener allows users to visualize large complex genomic datasets and provides exquisite control over how plots are placed and arranged on a page.
Last updated
visualizationgenomeannotationfunctionalgenomicsgenomeassemblyhiccpp
10.06 score 358 stars 4 dependents 300 scripts 592 downloads
ggmsa - Plot Multiple Sequence Alignment using 'ggplot2'
A visual exploration tool for multiple sequence alignment and associated data. Supports MSA of DNA, RNA, and protein sequences using 'ggplot2'. Multiple sequence alignments can easily be combined with other 'ggplot2' plots, such as phylogenetic trees visualized by 'ggtree', boxplots, genome maps and so on. More features: visualization of sequence logos, sequence bundles, RNA secondary structures and detection of sequence recombinations.
Last updated
softwarevisualizationalignmentannotationmultiplesequencealignment
9.94 score 217 stars 1 dependents 344 scripts 1.1k downloads
cosmosR - COSMOS (Causal Oriented Search of Multi-Omic Space)
COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets based on prior knowledge of signaling, metabolic, and gene regulatory networks. It estimates the activities of transcription factors and kinases and performs network-level causal reasoning. Thereby, COSMOS provides mechanistic hypotheses for experimental observations across multi-omics datasets.
Last updated
cellbiologypathwaysnetworkproteomicsmetabolomicstranscriptomicsgenesignalingdata-integrationmetabolomic-datanetwork-modellingphosphoproteomics
8.97 score 69 stars 1 dependents 42 scripts 438 downloads
rawrr - Direct Access to Orbitrap Data and Beyond
This package wraps the functionality of the Thermo Fisher Scientific RawFileReader .NET 8.0 assembly. Within the R environment, spectra and chromatograms are represented by S3 objects. The package provides basic functions to download and install the required third-party libraries. The package is developed, tested, and used at the Functional Genomics Center Zurich, Switzerland.
Last updated
massspectrometryproteomicsmetabolomicsinfrastructuresoftwarefastmass-spectrometrymultiplatformorbitrap-ms
8.70 score 61 stars 3 dependents 41 scripts 535 downloads
mistyR - Multiview Intercellular SpaTial modeling framework
mistyR is an implementation of the Multiview Intercellular SpaTial modeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation, but the views can also capture cell-type specific relationships, capture relations between functional footprints or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution.
Last updated
softwarebiomedicalinformaticscellbiologysystemsbiologyregressiondecisiontreesinglecellspatialbioconductorbiologyintercellularmachine-learningmodularmolecular-biologymultiviewspatial-transcriptomics
8.25 score 72 stars 182 scripts 406 downloads
msqrob2 - Robust statistical inference for quantitative LC-MS proteomics
msqrob2 provides a robust linear mixed model framework for assessing differential abundance in MS-based quantitative proteomics experiments. Our workflows can start from raw peptide intensities or summarised protein expression values. The model parameter estimates can be stabilized by ridge regression, empirical Bayes variance estimation and robust M-estimation. msqrob2's hurdle workflow can handle missing data without having to rely on hard-to-verify imputation assumptions, and outcompetes state-of-the-art methods with and without imputation for both high and low missingness. It builds on the QFeatures infrastructure for quantitative mass spectrometry data to store the model results together with the raw data and preprocessed data.
Last updated
proteomicsmetabolomicsmassspectrometrydifferentialexpressionmultiplecomparisonregressionexperimentaldesignsoftwareimmunooncologynormalizationtimecoursepreprocessing
8.13 score 13 stars 123 scripts 458 downloads
GenomicSuperSignature - Interpretation of RNA-seq experiments through robust, efficient comparison to public databases
This package provides a novel method for interpreting new transcriptomic datasets through near-instantaneous comparison to public archives without high-performance computing requirements. Through the pre-computed index, users can identify public resources associated with their dataset such as gene sets, MeSH terms, and publications. Functions to identify interpretable annotations and intuitive visualization options are implemented in this package.
Last updated
transcriptomicssystemsbiologyprincipalcomponentrnaseqsequencingpathwaysclusteringbioconductor-packageexploratory-data-analysisgseameshprincipal-component-analysisrna-sequencing-profilestransferlearningu24ca289073
7.56 score 16 stars 75 scripts 369 downloads
methylclock - Methylclock - DNA methylation-based clocks
This package estimates chronological and gestational DNA methylation (DNAm) age as well as biological age using different methylation clocks. Chronological DNAm age (in years): Horvath's clock, Hannum's clock, BNN, Horvath's skin+blood clock, PedBE clock and Wu's clock. Gestational DNAm age: Knight's clock, Bohlin's clock, Mayne's clock and Lee's clocks. Biological DNAm clocks: Levine's clock and Telomere Length's clock.
Last updated
dnamethylationbiologicalquestionpreprocessingstatisticalmethodnormalizationcpp
6.73 score 53 stars 51 scripts 588 downloads
Dino - Normalization of Single-Cell mRNA Sequencing Data
Dino normalizes single-cell mRNA sequencing data to correct for technical variation, particularly sequencing depth, prior to downstream analysis. The approach produces a matrix of corrected expression for which the dependency between sequencing depth and the full distribution of normalized expression is removed; many existing methods aim to remove only the dependency between sequencing depth and the mean of the normalized expression. This is particularly useful in the context of highly sparse datasets such as those produced by 10x Genomics and other unique molecular identifier (UMI) based microfluidics protocols, for which the depth-dependent proportion of zeros in the raw expression data can otherwise present a challenge.
Last updated
softwarenormalizationrnaseqsinglecellsequencinggeneexpressiontranscriptomicsregressioncellbasedassays
6.27 score 11 stars 19 scripts 346 downloads
vissE - Visualising Set Enrichment Analysis Results
This package enables the interpretation and analysis of results from a gene set enrichment analysis using network-based and text-mining approaches. Most enrichment analyses result in large lists of significant gene sets that are difficult to interpret. Tools in this package help build a similarity-based network of significant gene sets from a gene set enrichment analysis that can then be investigated for their biological function using text-mining approaches.
Last updated
softwaregeneexpressiongenesetenrichmentnetworkenrichmentnetworkbioinformatics
6.10 score 19 stars 22 scripts 400 downloads
epialleleR - Fast, Accurate, Epiallele-Aware Methylation Caller and Reporter
Epialleles are specific DNA methylation patterns that are mitotically and/or meiotically inherited. This package calls and reports cytosine methylation as well as frequencies of hypermethylated epialleles at the level of genomic regions or individual cytosines in next-generation sequencing data using binary alignment map (BAM) files as an input. Among other things, this package can also extract and visualise methylation patterns and assess allele specificity of methylation.
Last updated
dnamethylationepigeneticsmethylseqlongreadbioconductorcytosine-methylation-reportdna-methylationepialleleepimutationlong-read-sequencingnext-generation-sequencingsamtoolsshort-read-sequencingcurlbzip2xz-utilszlibcpp
5.86 score 6 stars 6 scripts 389 downloads
biodb - Biodb, a Library and a Development Framework for Connecting to Chemical and Biological Databases
The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.
Last updated
softwareinfrastructuredataimportkeggcpp
5.84 score 1 dependents 31 scripts 544 downloads
scanMiR - scanMiR
A set of tools for working with miRNA affinity models (KdModels), efficiently scanning for miRNA binding sites, and predicting target repression. It supports scanning using miRNA seeds, full miRNA sequences (enabling 3' alignment) and KdModels, and includes the prediction of slicing and TDMD sites. Finally, it includes utility and plotting functions (e.g. for the visual representation of miRNA-target alignment).
Last updated
mirnasequencematchingalignment
5.77 score 1 dependents 65 scripts 330 downloads
diffUTR - diffUTR: Streamlining differential exon and 3' UTR usage
The diffUTR package provides a uniform interface and plotting functions for limma/edgeR/DEXSeq-powered differential bin/exon usage. It also includes an improved version of the limma::diffSplice method. Most importantly, diffUTR further extends the application of these frameworks to differential UTR usage analysis using poly-A site databases.
Last updated
geneexpression
5.53 score 7 stars 12 scripts 339 downloads
cbpManager - Generate, manage, and edit data and metadata files suitable for the import in cBioPortal for Cancer Genomics
This R package provides an R Shiny application that enables the user to generate, manage, and edit data and metadata files suitable for the import in cBioPortal for Cancer Genomics. Create cancer studies and edit its metadata. Upload mutation data of a patient that will be concatenated to the data_mutation_extended.txt file of the study. Create and edit clinical patient data, sample data, and timeline data. Create custom timeline tracks for patients.
Last updated
immunooncologydataimportdatarepresentationguithirdpartyclientpreprocessingvisualizationcancer-genomicscbioportalclinical-datafilegeneratormutation-datapatient-data
5.51 score 8 stars 3 scripts 382 downloads
svaNUMT - NUMT detection from structural variant calls
svaNUMT contains functions for detecting NUMT events from structural variant calls. It takes structural variant calls in GRanges of breakend notation and identifies NUMTs by nuclear-mitochondrial breakend junctions. The main function reports candidate NUMTs if there is a pair of valid insertion sites found on the nuclear genome within a certain distance threshold. The candidate NUMTs are reported by events.
Last updated
dataimportsequencingannotationgeneticsvariantannotation
5.50 score 3 stars 14 scripts 294 downloads
BindingSiteFinder - Binding site definition based on iCLIP data
Precise knowledge of the binding sites of an RNA-binding protein (RBP) is key to understanding (post-)transcriptional regulatory processes. Here we present a workflow that describes how exact binding sites can be defined from iCLIP data. The package provides functions for binding site definition and result visualization. For details please see the vignette.
Last updated
sequencinggeneexpressiongeneregulationfunctionalgenomicscoveragedataimportbinding-site-classificationbinding-sitesbioconductor-packageicliprna-binding-proteins
5.43 score 6 stars 9 scripts 340 downloads
csdR - Differential gene co-expression
This package contains functionality to run differential gene co-expression across two different conditions. The algorithm is inspired by Voigt et al. 2017 and finds Conserved, Specific and Differentiated genes (hence the name CSD). This package includes efficient variance calculation by bootstrapping and Welford's algorithm.
Last updated
differentialexpressiongraphandnetworkgeneexpressionnetworkcppopenmp
5.32 score 7 stars 6 scripts 292 downloads
FindIT2 - find influential TF and Target based on multi-omics data
This package implements functions to find influential TFs and targets based on different input types. It has five modules: multi-peak multi-gene annotation (mmPeakAnno module), calculating regulation potential (calcRP module), finding influential targets based on ChIP-Seq and RNA-Seq data (Find influential Target module), finding influential TFs based on different inputs (Find influential TF module), and calculating peak-gene or peak-peak correlation (peakGeneCor module). There are also other useful functions, such as integrating information from different sources and calculating Jaccard similarity for your TF.
Last updated
softwareannotationchipseqatacseqgeneregulationmultiplecomparisongenetarget
5.26 score 6 stars 7 scripts 384 downloads
dStruct - Identifying differentially reactive regions from RNA structurome profiling data
dStruct identifies differentially reactive regions from RNA structurome profiling data. dStruct is compatible with a broad range of structurome profiling technologies, e.g., SHAPE-MaP, DMS-MaPseq, Structure-Seq, SHAPE-Seq, etc. See Choudhary et al., Genome Biology, 2019 for the underlying method.
Last updated
statisticalmethodstructuralpredictionsequencingsoftware
5.18 score 3 stars 17 scripts 296 downloads
spiky - Spike-in calibration for cell-free MeDIP
spiky implements methods and model generation for cfMeDIP (cell-free methylated DNA immunoprecipitation) with spike-in controls. CfMeDIP is an enrichment protocol which avoids destructive conversion of scarce template, making it ideal as a "liquid biopsy," but creating certain challenges in comparing results across specimens, subjects, and experiments. The use of synthetic spike-in standard oligos allows diagnostics performed with cfMeDIP to quantitatively compare samples across subjects, experiments, and time points in both relative and absolute terms.
Last updated
differentialmethylationdnamethylationnormalizationpreprocessingqualitycontrolsequencing
5.08 score 3 stars 9 scripts 328 downloads
segmenter - Perform Chromatin Segmentation Analysis in R by Calling ChromHMM
Chromatin segmentation analysis transforms ChIP-seq data into signals over the genome. The latter represents the observed states in a multivariate Markov model to predict the chromatin's underlying states. ChromHMM, written in Java, integrates histone modification datasets to learn the chromatin states de-novo. The goal of this package is to call chromHMM from within R, capture the output files in an S4 object and interface to other relevant Bioconductor analysis tools. In addition, segmenter provides functions to test, select and visualize the output of the segmentation.
Last updated
softwarehistonemodificationbioconductorchromhmmsegmentation-an
4.85 score 5 stars 14 scripts 290 downloads
HGC - A fast hierarchical graph-based clustering method
HGC (short for Hierarchical Graph-based Clustering) is an R package for conducting hierarchical clustering on large-scale single-cell RNA-seq (scRNA-seq) data. The key idea is to construct a dendrogram of cells on their shared nearest neighbor (SNN) graph. HGC provides functions for building graphs and for conducting hierarchical clustering on the graph. Users with older R versions can visit https://github.com/XuegongLab/HGC/tree/HGC4oldRVersion to get the HGC package built for R 3.6.
Last updated
singlecellsoftwareclusteringrnaseqgraphandnetworkdnaseqcpp
4.79 score 31 scripts 282 downloads
KBoost - Inference of gene regulatory networks from gene expression data
Reconstructing gene regulatory networks and transcription factor activity is crucial to understanding biological processes and holds potential for developing personalized treatment. Yet, it is still an open problem, as state-of-the-art algorithms are often not able to handle large amounts of data. Furthermore, many of the present methods predict numerous false positives and are unable to integrate other sources of information such as previously known interactions. Here we introduce KBoost, an algorithm that uses kernel PCA regression, boosting and Bayesian model averaging for fast and accurate reconstruction of gene regulatory networks. KBoost can also use a prior network built on previously known transcription factor targets. We have benchmarked KBoost using three different datasets against other high performing algorithms. The results show that our method compares favourably to other methods across datasets.
Last updated
networkgraphandnetworkbayesiannetworkinferencegeneregulationtranscriptomicssystemsbiologytranscriptiongeneexpressionregressionprincipalcomponent
4.75 score 4 stars 14 scripts 246 downloads
BUSseq - Batch Effect Correction with Unknown Subtypes for scRNA-seq data
The BUSseq R package fits an interpretable Bayesian hierarchical model, the Batch Effects Correction with Unknown Subtypes for scRNA-seq Data (BUSseq), to correct batch effects in the presence of unknown cell types. BUSseq is able to simultaneously correct batch effects, cluster cell types, and take care of the count data nature, the overdispersion, the dropout events, and the cell-specific sequencing depth of scRNA-seq data. After correcting the batch effects with BUSseq, the corrected value can be used for downstream analysis as if all cells were sequenced in a single batch. BUSseq can integrate read count matrices obtained from different scRNA-seq platforms and allows cell types to be measured in some but not all of the batches as long as the experimental design fulfills the conditions listed in our manuscript.
Last updated
experimentaldesigngeneexpressionstatisticalmethodbayesianclusteringfeatureextractionbatcheffectsinglecellsequencingcppopenmp
4.65 score 1 stars 30 scripts 335 downloads
m6Aboost - m6Aboost
This package helps users run the m6Aboost model on their own miCLIP2 data. The package includes functions to assign the read counts and get the features needed to run the m6Aboost model. The miCLIP2 data should be stored in a GRanges object. More details can be found in the vignette.
Last updated
sequencingepigeneticsgeneticsexperimenthubsoftware
4.65 score 3 stars 5 scripts 286 downloads
fgga - Hierarchical ensemble method based on factor graph
Package that implements the FGGA algorithm. This package provides a hierarchical ensemble method based on factor graphs for the consistent cross-ontology annotation of protein coding genes. FGGA embodies elements of predicate logic, communication theory, supervised learning and inference in graphical models.
Last updated
softwarestatisticalmethodclassificationnetworknetworkinferencesupportvectormachinegraphandnetworkgo
4.56 score 3 stars 12 scripts 325 downloads
DelayedTensor - R package for sparse and out-of-core arithmetic and decomposition of Tensor
DelayedTensor performs Tensor arithmetic directly on DelayedArray objects. DelayedTensor provides some generic functions related to Tensor arithmetic/decomposition and dispatches them on the DelayedArray class. DelayedTensor also supports Tensor contraction via the einsum function, which is inspired by numpy einsum.
Last updated
softwareinfrastructuredatarepresentationdimensionreduction
4.51 score 4 stars 3 scripts 346 downloads
RCSL - Rank Constrained Similarity Learning for single cell RNA sequencing data
A novel clustering algorithm and toolkit RCSL (Rank Constrained Similarity Learning) to accurately identify various cell types using scRNA-seq data from a complex tissue. RCSL considers both local similarity and global similarity among the cells to discern the subtle differences among cells of the same type as well as larger differences among cells of different types. RCSL uses Spearman’s rank correlations of a cell’s expression vector with those of other cells to measure its global similarity, and adaptively learns neighbour representation of a cell as its local similarity. The overall similarity of a cell to other cells is a linear combination of its global similarity and local similarity.
Last updated
singlecellsoftwareclusteringdimensionreductionrnaseqvisualizationsequencing
4.51 score 2 stars 16 scripts 331 downloads
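The RCSL description above combines a Spearman-based global similarity with a local similarity through a linear combination. The following is only a minimal pure-Python sketch of that idea, not the package's implementation: the function names, the `weight` parameter and the way the local similarity is supplied are all assumptions made for illustration (RCSL learns the local similarity adaptively, which is not shown here).

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def combined_similarity(global_sim, local_sim, weight=0.5):
    """Linear combination of the two similarities (weight is hypothetical)."""
    return weight * global_sim + (1 - weight) * local_sim


# Toy expression vectors for two cells (made-up numbers).
cell_a = [5.0, 0.0, 2.0, 9.0]
cell_b = [4.0, 1.0, 3.0, 8.0]
global_sim = spearman(cell_a, cell_b)
overall = combined_similarity(global_sim, local_sim=0.2, weight=0.5)
```

Here `local_sim=0.2` is a placeholder value; in practice it would come from whatever neighbour-based similarity the analysis uses.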
cageminer - Candidate Gene Miner
This package aims to integrate GWAS-derived SNPs and coexpression networks to mine candidate genes associated with a particular phenotype. For that, users must define a set of guide genes, which are known genes involved in the studied phenotype. Additionally, the mined candidates can be given a score that favors candidates that are hubs and/or transcription factors. The scores can then be used to rank and select the top n most promising genes for downstream experiments.
Last updated
softwaresnpfunctionalpredictiongenomewideassociationgeneexpressionnetworkenrichmentvariantannotationfunctionalgenomicsnetwork
4.48 score 1 stars 5 scripts 364 downloads
MSstatsLOBD - Assay characterization: estimation of limit of blank (LOB) and limit of detection (LOD)
The MSstatsLOBD package allows calculation and visualization of limit of blank (LOB) and limit of detection (LOD). We define the LOB as the highest apparent concentration of a peptide expected when replicates of a blank sample containing no peptides are measured. The LOD is defined as the measured concentration value for which the probability of falsely claiming the absence of a peptide in the sample is 0.05, given a probability 0.05 of falsely claiming its presence. These functionalities were previously a part of the MSstats package. The methodology is described in Galitzine (2018) <doi:10.1074/mcp.RA117.000322>.
Last updated
immunooncologymassspectrometryproteomicssoftwaredifferentialexpressiononechanneltwochannelnormalizationqualitycontrolmass-spectrometry
4.30 score 7 scripts 250 downloads
NanoTube - An Easy Pipeline for NanoString nCounter Data Analysis
NanoTube includes functions for the processing, quality control, analysis, and visualization of NanoString nCounter data. Analysis functions include differential analysis and gene set analysis methods, as well as postprocessing steps to help understand the results. Additional functions are included to enable interoperability with other Bioconductor NanoString data analysis packages.
Last updated
softwaregeneexpressiondifferentialexpressionqualitycontrol
4.00 score 8 scripts 366 downloads
systemPipeTools - Tools for data visualization
The systemPipeTools package extends the widely used systemPipeR (SPR) workflow environment with an enhanced toolkit for data visualization, including utilities to automate the data visualization for analysis of differentially expressed genes (DEGs). systemPipeTools provides data transformation and data exploration functions via scatterplots, hierarchical clustering heatmaps, principal component analysis, multidimensional scaling, generalized principal components, t-Distributed Stochastic Neighbor embedding (t-SNE), and MA and volcano plots. All these utilities can be integrated with the modular design of the systemPipeR environment, which allows users to easily substitute any of these features with alternatives and/or customize them.
Last updated
infrastructuredataimportsequencingqualitycontrolreportwritingexperimentaldesignclusteringdifferentialexpressionmultidimensionalscalingprincipalcomponent
4.00 score 9 scripts 296 downloads
GenomicRanges - Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments is playing a central role when it comes to analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Last updated
geneticsinfrastructuredatarepresentationsequencingannotationgenomeannotationcoveragebioconductor-packagecore-packagegenomicranges
18.30 score 46 stars 1.4k dependents 26k scripts 97k downloads
BiocParallel - Bioconductor facilities for parallel evaluation
This package provides modified versions and novel implementations of functions for parallel evaluation, tailored to use with Bioconductor objects.
Last updated
infrastructurebioconductor-packagecore-packageu24ca289073cpp
17.67 score 69 stars 1.2k dependents 10k scripts 117k downloads
plyranges - A fluent interface for manipulating GenomicRanges
A dplyr-like interface for interacting with the common Bioconductor classes Ranges and GenomicRanges. By providing a grammatical and consistent way of manipulating these classes, their accessibility for new Bioconductor users is hopefully increased.
Last updated
infrastructuredatarepresentationworkflowstepcoveragebioconductordata-analysisdplyrgenomic-rangesgenomicstidy-data
13.33 score 154 stars 20 dependents 2.5k scripts 2.1k downloads
EnhancedVolcano - Publication-ready volcano plots with enhanced colouring and labeling
Volcano plots represent a useful way to visualise the results of differential expression analyses. Here, we present a highly-configurable function that produces publication-ready volcano plots. EnhancedVolcano will attempt to fit as many point labels in the plot window as possible, thus avoiding 'clogging' up the plot with labels that could not otherwise have been read. Other functionality allows the user to identify up to 4 different types of attributes in the same plot space via colour, shape, size, and shade parameter configurations.
Last updated
rnaseqgeneexpressiontranscriptiondifferentialexpressionimmunooncology
13.26 score 464 stars 4 dependents 4.0k scripts 10k downloads
microbiome - Microbiome Analytics
Utilities for microbiome analysis.
Last updated
metagenomicsmicrobiomesequencingsystemsbiologyhitchiphitchip-atlashuman-microbiomemicrobiologymicrobiome-analysisphyloseqpopulation-study
13.07 score 316 stars 5 dependents 2.3k scripts 3.3k downloads
iSEE - Interactive SummarizedExperiment Explorer
Create an interactive Shiny-based graphical user interface for exploring data stored in SummarizedExperiment objects, including row- and column-level metadata. The interface supports transmission of selections between plots and tables, code tracking, interactive tours, interactive or programmatic initialization, preservation of app state, and extensibility to new panel types via S4 classes. Special attention is given to single-cell data in a SingleCellExperiment object with visualization of dimensionality reduction results.
Last updated
cellbasedassaysclusteringdimensionreductionfeatureextractiongeneexpressionguiimmunooncologyshinyappssinglecelltranscriptiontranscriptomicsvisualizationdimension-reductionfeature-extractiongene-expressionhacktoberfesthuman-cell-atlasshinysingle-cell
12.69 score 230 stars 12 dependents 440 scripts 1.1k downloads
slingshot - Tools for ordering single-cell sequencing
Provides functions for inferring continuous, branching lineage structures in low-dimensional data. Slingshot was designed to model developmental trajectories in single-cell RNA sequencing data and serve as a component in an analysis pipeline after dimensionality reduction and clustering. It is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.
Last updated
clusteringdifferentialexpressiongeneexpressionrnaseqsequencingsoftwaresinglecelltranscriptomicsvisualization
12.34 score 337 stars 3 dependents 1.7k scripts 3.9k downloads
BiocSingular - Singular Value Decomposition for Bioconductor Packages
Implements exact and approximate methods for singular value decomposition and principal components analysis, in a framework that allows them to be easily switched within Bioconductor packages or workflows. Where possible, parallelization is achieved using the BiocParallel framework.
Last updated
softwaredimensionreductionprincipalcomponentbioconductor-packagehuman-cell-atlassingular-value-decompositioncpp
12.27 score 8 stars 123 dependents 1.1k scripts 24k downloads
decontam - Identify Contaminants in Marker-gene and Metagenomics Sequencing Data
Simple statistical identification of contaminating sequence features in marker-gene or metagenomics data. Works on any kind of feature derived from environmental sequencing data (e.g. ASVs, OTUs, taxonomic groups, MAGs,...). Requires DNA quantitation data or sequenced negative control samples.
Last updated
immunooncologymicrobiomesequencingclassificationmetagenomicsampliconbioinformaticscontaminationmetabarcoding
11.89 score 170 stars 9 dependents 700 scripts 2.7k downloads
PCAtools - PCAtools: Everything Principal Components Analysis
Principal Component Analysis (PCA) is a very powerful technique that has wide applicability in data science, bioinformatics, and further afield. It was initially developed to analyse large volumes of data in order to tease out the differences/relationships between the logical entities being analysed. It extracts the fundamental structure of the data without the need to build any model to represent it. This 'summary' of the data is arrived at through a process of reduction that can transform the large number of variables into a lesser number that are uncorrelated (i.e. the 'principal components'), while at the same time being capable of easy interpretation on the original data. PCAtools provides functions for data exploration via PCA, and allows the user to generate publication-ready figures. PCA is performed via BiocSingular - users can also identify optimal number of principal components via different metrics, such as elbow method and Horn's parallel analysis, which has relevance for data reduction in single-cell RNA-seq (scRNA-seq) and high dimensional mass cytometry data.
Last updated
rnaseqatacseqgeneexpressiontranscriptionsinglecellprincipalcomponentcpp
11.79 score 380 stars 2 dependents 952 scripts 2.0k downloads
DelayedMatrixStats - Functions that Apply to Rows and Columns of 'DelayedMatrix' Objects
A port of the 'matrixStats' API for use with DelayedMatrix objects from the 'DelayedArray' package. High-performing functions operating on rows and columns of DelayedMatrix objects, e.g. col / rowMedians(), col / rowRanks(), and col / rowSds(). Functions are optimized per data type and for subsetted calculations such that both memory usage and processing time are minimized.
Last updated
infrastructuredatarepresentationsoftware
11.63 score 15 stars 130 dependents 479 scripts 30k downloads
GenomicDataCommons - NIH / NCI Genomic Data Commons Access
Programmatically access the NIH / NCI Genomic Data Commons RESTful service.
Last updated
dataimportsequencingapi-clientbioconductorbioinformaticscancercore-servicesdata-sciencegenomicsncitcgavignette
11.62 score 90 stars 13 dependents 259 scripts 2.1k downloads
BiocFileCache - Manage Files Across Sessions
This package creates a persistent on-disk cache of files that the user can add, update, and retrieve. It is useful for managing resources (such as custom TxDb objects) that are costly or difficult to create, web resources, and data files used across sessions.
Last updated
dataimportcore-packageu24ca289073
11.61 score 14 stars 465 dependents 628 scripts 46k downloads
Rhdf5lib - hdf5 library as an R package
Provides C and C++ hdf5 libraries.
Last updated
infrastructurebioconductorhdf5hdf5-library
11.48 score 7 stars 343 dependents 29 scripts 44k downloads
beachmat - Compiling Bioconductor to Handle Each Matrix Type
Provides a consistent C++ class interface for reading from a variety of commonly used matrix types. Ordinary matrices and several sparse/dense Matrix classes are directly supported, along with a subset of the delayed operations implemented in the DelayedArray package. All other matrix-like objects are supported by calling back into R.
Last updated
datarepresentationdataimportinfrastructurebioconductor-packagehuman-cell-atlasmatrix-librarycpp
11.24 score 5 stars 191 dependents 26 scripts 31k downloads
CATALYST - Cytometry dATa anALYSis Tools
CATALYST provides tools for preprocessing of and differential discovery in cytometry data such as FACS, CyTOF, and IMC. Preprocessing includes i) normalization using bead standards, ii) single-cell deconvolution, and iii) bead-based compensation. For differential discovery, the package provides a number of convenient functions for data processing (e.g., clustering, dimension reduction), as well as a suite of visualizations for exploratory data analysis and exploration of results from differential abundance (DA) and state (DS) analysis in order to identify differences in composition and expression profiles at the subpopulation-level, respectively.
Last updated
clusteringdataimportdifferentialexpressionexperimentaldesignflowcytometryimmunooncologymassspectrometrynormalizationpreprocessingsinglecellsoftwarestatisticalmethodvisualization
11.10 score 75 stars 2 dependents 504 scripts 1.1k downloads
singscore - Rank-based single-sample gene set scoring method
A simple single-sample gene signature scoring method that uses rank-based statistics to analyze the sample's gene expression profile. It scores the expression activities of gene sets at a single-sample level.
Last updated
softwaregeneexpressiongenesetenrichmentbioinformatics
10.74 score 48 stars 6 dependents 161 scripts 2.5k downloads
sesame - SEnsible Step-wise Analysis of DNA MEthylation BeadChips
Tools for analyzing Illumina Infinium DNA methylation arrays. SeSAMe provides utilities to support analyses of multiple generations of Infinium DNA methylation BeadChips, including preprocessing, quality control, visualization and inference. SeSAMe features accurate detection calling, inference of ethnicity and sex, and advanced quality control routines.
Last updated
dnamethylationmethylationarraypreprocessingqualitycontrolbioinformaticsdna-methylationmicroarray
10.62 score 82 stars 1 dependents 350 scripts 1.5k downloads
DropletUtils - Utilities for Handling Single-Cell Droplet Data
Provides a number of utility functions for handling single-cell (RNA-seq) data from droplet technologies such as 10X Genomics. This includes data loading from count matrices or molecule information files, identification of cells from empty droplets, removal of barcode-swapped pseudo-cells, and downsampling of the count matrix.
Last updated
immunooncologysinglecellsequencingrnaseqgeneexpressiontranscriptomicsdataimportcoveragecurlopensslcpp
10.45 score 13 dependents 3.5k scripts 4.6k downloads
BASiCS - Bayesian Analysis of Single-Cell Sequencing data
Single-cell mRNA sequencing can uncover novel cell-to-cell heterogeneity in gene expression levels in seemingly homogeneous populations of cells. However, these experiments are prone to high levels of technical noise, creating new challenges for identifying genes that show genuine heterogeneous expression within the population of cells under study. BASiCS (Bayesian Analysis of Single-Cell Sequencing data) is an integrated Bayesian hierarchical model to perform statistical analyses of single-cell RNA sequencing datasets in the context of supervised experiments (where the groups of cells of interest are known a priori, e.g. experimental conditions or cell types). BASiCS performs built-in data normalisation (global scaling) and technical noise quantification (based on spike-in genes). BASiCS provides an intuitive detection criterion for highly (or lowly) variable genes within a single group of cells. Additionally, BASiCS can compare gene expression patterns between two or more pre-specified groups of cells. Unlike traditional differential expression tools, BASiCS quantifies changes in expression that lie beyond comparisons of means, also allowing the study of changes in cell-to-cell heterogeneity. The latter can be quantified via a biological over-dispersion parameter that measures the excess of variability that is observed with respect to Poisson sampling noise, after normalisation and technical noise removal. Due to the strong mean/over-dispersion confounding that is typically observed for scRNA-seq datasets, BASiCS also tests for changes in residual over-dispersion, defined by residual values with respect to a global mean/over-dispersion trend.
Last updated
immunooncologynormalizationsequencingrnaseqsoftwaregeneexpressiontranscriptomicssinglecelldifferentialexpressionbayesiancellbiologybioconductor-packagegene-expressionrcpprcpparmadilloscrna-seqsingle-cellopenblascppopenmp
10.41 score 88 stars 1 dependents 389 scripts 592 downloads
AnnotationFilter - Facilities for Filtering Bioconductor Annotation Resources
This package provides class and other infrastructure to implement filters for manipulating Bioconductor annotation resources. The filters will be used by ensembldb, Organism.dplyr, and other packages.
Last updated
annotationinfrastructuresoftwarebioconductor-packagecore-package
10.38 score 5 stars 172 dependents 62 scripts 19k downloads
EpiDISH - Epigenetic Dissection of Intra-Sample-Heterogeneity
EpiDISH is an R package to infer the proportions of a priori known cell-types present in a sample representing a mixture of such cell-types. Right now, the package can be used on DNAm data from blood tissue of any age, from birth to old age, generic epithelial tissue and breast tissue. In addition, the package provides a function that allows the identification of differentially methylated cell-types and their directionality of change in Epigenome-Wide Association Studies.
Last updated
dnamethylationmethylationarrayepigeneticsdifferentialmethylationimmunooncology
10.36 score 57 stars 5 dependents 207 scripts 1.3k downloads
ORFik - Open Reading Frames in Genomics
R package for analysis of transcript and translation features through manipulation of sequence data and NGS data like Ribo-Seq, RNA-Seq, TCP-Seq and CAGE. It is generalized in the sense that any transcript region can be analysed; as the name hints, it was made with investigation of ribosomal patterns over Open Reading Frames (ORFs) as its primary use case. ORFik is extremely fast through use of C++, data.table and GenomicRanges. The package allows reassigning the starts of transcripts using CAGE-Seq data, automatic shifting of Ribo-Seq reads, finding Open Reading Frames for whole genomes and much more.
Last updated
immunooncologysoftwaresequencingriboseqrnaseqfunctionalgenomicscoveragealignmentdataimportcpp
10.34 score 38 stars 2 dependents 179 scripts 688 downloads
zinbwave - Zero-Inflated Negative Binomial Model for RNA-Seq Data
Implements a general and flexible zero-inflated negative binomial model that can be used to provide a low-dimensional representation of single-cell RNA-seq data. The model accounts for zero inflation (dropouts), over-dispersion, and the count nature of the data. The model also accounts for the difference in library sizes and optionally for batch effects and/or other covariates, avoiding the need to pre-normalize the data.
Last updated
immunooncologydimensionreductiongeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecell
10.23 score 44 stars 5 dependents 244 scripts 1.5k downloads
diffcyt - Differential discovery in high-dimensional cytometry via high-resolution clustering
Statistical methods for differential discovery analyses in high-dimensional cytometry data (including flow cytometry, mass cytometry or CyTOF, and oligonucleotide-tagged cytometry), based on a combination of high-resolution clustering and empirical Bayes moderated tests adapted from transcriptomics.
Last updated
immunooncologyflowcytometryproteomicssinglecellcellbasedassayscellbiologyclusteringfeatureextractionsoftware
9.86 score 25 stars 5 dependents 273 scripts 930 downloads
rWikiPathways - rWikiPathways - R client library for the WikiPathways API
Use this package to interface with the WikiPathways API. It provides programmatic access to WikiPathways content in multiple data and image formats, including official monthly release files and convenient GMT read/write functions.
Last updated
visualizationgraphandnetworkthirdpartyclientnetworkmetabolomicsbioinformaticsdata-accesspathways
9.48 score 20 stars 2 dependents 198 scripts 826 downloads
batchelor - Single-Cell Batch Correction Methods
Implements a variety of methods for batch correction of single-cell (RNA sequencing) data. This includes methods based on detecting mutually nearest neighbors, as well as several efficient variants of linear regression of the log-expression values. Functions are also provided to perform global rescaling to remove differences in depth between batches, and to perform a principal components analysis that is robust to differences in the numbers of cells across batches.
Last updated
sequencingrnaseqsoftwaregeneexpressiontranscriptomicssinglecellbatcheffectnormalizationcpp
9.23 score 10 dependents 1.4k scripts 7.5k downloads
progeny - Pathway RespOnsive GENes for activity inference from gene expression
PROGENy is a resource that leverages a large compendium of publicly available signaling perturbation experiments to yield a common core of pathway responsive genes for human and mouse. These, coupled with any statistical method, can be used to infer pathway activities from bulk or single-cell transcriptomics.
Last updated
systemsbiologygeneexpressionfunctionalpredictiongeneregulation
9.19 score 124 stars 1 dependents 287 scripts 1.2k downloads
scmap - A tool for unsupervised projection of single cell RNA-seq data
Single-cell RNA-seq (scRNA-seq) is widely used to investigate the composition of complex tissues since the technology allows researchers to define cell-types using unsupervised clustering of the transcriptome. However, due to differences in experimental methods and computational analyses, it is often challenging to directly compare the cells identified in two different experiments. scmap is a method for projecting cells from a scRNA-seq experiment on to the cell-types or individual cells identified in a different experiment.
Last updated
immunooncologysinglecellsoftwareclassificationsupportvectormachinernaseqvisualizationtranscriptomicsdatarepresentationtranscriptionsequencingpreprocessinggeneexpressiondataimportbioconductor-packagehuman-cell-atlasprojection-mappingsingle-cell-rna-seqopenblascpp
9.06 score 100 stars 285 scripts 770 downloads
CellBench - Construct Benchmarks for Single Cell Analysis Methods
This package contains infrastructure for benchmarking analysis methods and access to single cell mixture benchmarking data. It provides a framework for organising analysis methods and testing combinations of methods in a pipeline without explicitly laying out each combination. It also provides utilities for sampling and filtering SingleCellExperiment objects, constructing lists of functions with varying parameters, and multithreaded evaluation of analysis methods.
Last updated
softwareinfrastructuresinglecellbenchmarkbioinformatics
8.98 score 32 stars 113 scripts 451 downloads
RaggedExperiment - Representation of Sparse Experiments and Assays Across Samples
This package provides a flexible representation of copy number, mutation, and other data that fit into the ragged array schema for genomic location data. The basic representation of such data provides a rectangular flat table interface to the user with range information in the rows and samples/specimen in the columns. The RaggedExperiment class derives from a GRangesList representation and provides a semblance of a rectangular dataset.
Last updated
infrastructuredatarepresentationcopynumbercore-packagedata-structuremutationsu24ca289073
8.95 score 5 stars 14 dependents 117 scripts 1.5k downloads
HiCcompare - HiCcompare: Joint normalization and comparative analysis of multiple Hi-C datasets
HiCcompare provides functions for joint normalization and difference detection in multiple Hi-C datasets. HiCcompare operates on processed Hi-C data in the form of chromosome-specific chromatin interaction matrices. It accepts three-column tab-separated text files storing chromatin interaction matrices in a sparse matrix format, which are available from several sources. HiCcompare is designed to give the user the ability to perform a comparative analysis on the 3-dimensional structure of the genomes of cells in different biological states. HiCcompare differs from other packages that attempt to compare Hi-C data in that it works on processed data in chromatin interaction matrix format instead of pre-processed sequencing data. In addition, HiCcompare provides a non-parametric method for the joint normalization and removal of biases between two Hi-C datasets for the purpose of comparative analysis. HiCcompare also provides a simple yet robust method for detecting differences between Hi-C datasets.
Last updated
softwarehicsequencingnormalizationdifference-detectionhi-cvisualization
8.92 score 23 stars 6 dependents 71 scripts 722 downloads
NormalyzerDE - Evaluation of normalization methods and calculation of differential expression analysis statistics
NormalyzerDE provides screening of normalization methods for LC-MS based expression data. It calculates a range of normalized matrices using both existing approaches and a novel time-segmented approach, calculates performance measures and generates an evaluation report. Furthermore, it provides an easy utility for Limma- or ANOVA- based differential expression analysis.
Last updated
normalizationmultiplecomparisonvisualizationbayesianproteomicsmetabolomicsdifferentialexpressionbioconductorbioinformaticslimma
8.89 score 26 stars 1 dependents 71 scripts 518 downloads
IsoformSwitchAnalyzeR - Identify, Annotate and Visualize Isoform Switches with Functional Consequences from both short- and long-read RNA-seq data
Analysis of alternative splicing and isoform switches with predicted functional consequences (e.g. gain/loss of protein domains) from quantification of all types of RNA-seq (short/long) by tools such as Kallisto, Salmon, StringTie, TALON and IsoQuant.
Last updated
geneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicingvisualizationstatisticalmethodtranscriptomevariantbiomedicalinformaticsfunctionalgenomicssystemsbiologytranscriptomicsrnaseqannotationfunctionalpredictiongenepredictiondataimportmultiplecomparisonbatcheffectimmunooncology
8.84 score 132 stars 239 scripts 936 downloads
drawProteins - Package to Draw Protein Schematics from Uniprot API output
This package draws protein schematics from Uniprot API output. From the JSON returned by the GET command, it creates a dataframe from the Uniprot Features API. This dataframe can then be used by geoms based on ggplot2 and base R to draw protein schematics.
Last updated
visualizationfunctionalpredictionproteomics
8.83 score 35 stars 1 dependents 71 scripts 554 downloads
apeglm - Approximate posterior estimation for GLM coefficients
apeglm provides Bayesian shrinkage estimators for effect sizes for a variety of GLM models, using approximation of the posterior for individual coefficients.
Last updated
immunooncologysequencingrnaseqdifferentialexpressiongeneexpressionbayesiancpp
8.72 score 9 dependents 1.4k scripts 7.1k downloads
SCnorm - Normalization of single cell RNA-seq data
This package implements SCnorm — a method to normalize single-cell RNA-seq data.
Last updated
normalizationrnaseqsinglecellimmunooncology
8.70 score 51 stars 81 scripts 488 downloads
motifmatchr - Fast Motif Matching in R
Quickly find motif matches for many motifs and many sequences. Wraps C++ code from the MOODS motif calling library, which was developed by Pasi Rastas, Janne Korhonen, and Petri Martinmäki.
Last updated
motifannotationcpp
8.53 score 5 dependents 1.0k scripts 2.9k downloads
OUTRIDER - OUTRIDER - OUTlier in RNA-Seq fInDER
Identification of aberrant gene expression in RNA-seq data. Read count expectations are modeled by an autoencoder to control for confounders in the data. Given these expectations, the RNA-seq read counts are assumed to follow a negative binomial distribution with a gene-specific dispersion. Outliers are then identified as read counts that significantly deviate from this distribution. Furthermore, OUTRIDER provides useful plotting functions to analyze and visualize the results.
Last updated
immunooncologyrnaseqtranscriptomicsalignmentsequencinggeneexpressiongeneticscount-datadiagnosticsexpression-analysismendelian-geneticsoutlier-detectionrna-seqopenblascpp
8.52 score 56 stars 1 dependents 167 scripts 674 downloads
projectR - Functions for the projection of weights from PCA, CoGAPS, NMF, correlation, and clustering
Functions for the projection of data into the spaces defined by PCA, CoGAPS, NMF, correlation, and clustering.
Last updated
functionalpredictiongeneregulationbiologicalquestionsoftware
8.38 score 65 stars 123 scripts 332 downloads
CAGEfightR - Analysis of Cap Analysis of Gene Expression (CAGE) data using Bioconductor
CAGE is a widely used high throughput assay for measuring transcription start site (TSS) activity. CAGEfightR is an R/Bioconductor package for performing a wide range of common data analysis tasks for CAGE and 5'-end data in general. Core functionality includes: import of CAGE TSSs (CTSSs), tag (or unidirectional) clustering for TSS identification, bidirectional clustering for enhancer identification, annotation with transcript and gene models, correlation of TSS and enhancer expression, calculation of TSS shapes, quantification of CAGE expression as expression matrices and genome browser visualization.
Last updated
softwaretranscriptioncoveragegeneexpressiongeneregulationpeakdetectiondataimportdatarepresentationtranscriptomicssequencingannotationgenomebrowsersnormalizationpreprocessingvisualization
8.36 score 10 stars 1 dependents 126 scripts 593 downloads
multiMiR - Integration of multiple microRNA-target databases with their disease and drug associations
A collection of microRNAs/targets from external resources, including validated microRNA-target databases (miRecords, miRTarBase and TarBase), predicted microRNA-target databases (DIANA-microT, ElMMo, MicroCosm, miRanda, miRDB, PicTar, PITA and TargetScan) and microRNA-disease/drug databases (miR2Disease, Pharmaco-miR VerSe and PhenomiR).
Last updated
mirnadatahomo_sapiens_datamus_musculus_datarattus_norvegicus_dataorganismdatamicrorna-sequencesql
8.27 score 24 stars 156 scripts 808 downloads
AUCell - AUCell: Analysis of 'gene set' activity in single-cell RNA-seq data (e.g. identify cells with specific gene signatures)
AUCell identifies cells with active gene sets (e.g. signatures, gene modules...) in single-cell RNA-seq data. AUCell uses the "Area Under the Curve" (AUC) to calculate whether a critical subset of the input gene set is enriched within the expressed genes for each cell. The distribution of AUC scores across all the cells allows exploring the relative expression of the signature. Since the scoring method is ranking-based, AUCell is independent of the gene expression units and the normalization procedure. In addition, since the cells are evaluated individually, it can easily be applied to bigger datasets, subsetting the expression matrix if needed.
Last updated
singlecellgenesetenrichmenttranscriptomicstranscriptiongeneexpressionworkflowstepnormalization
8.23 score 1 dependents 1.2k scripts 4.9k downloads
amplican - Automated analysis of CRISPR experiments
`amplican` performs alignment of the amplicon reads, normalizes gathered data, calculates multiple statistics (e.g. cut rates, frameshifts) and presents results in the form of aggregated reports. Data and statistics can be broken down by experiments, barcodes, user-defined groups, guides and amplicons, allowing for quick identification of potential problems.
Last updated
immunooncologytechnologyalignmentqpcrcrisprcpp
8.21 score 12 stars 60 scripts 572 downloads
mbkmeans - Mini-batch K-means Clustering for Single-Cell RNA-seq
Implements the mini-batch k-means algorithm for large datasets, including support for on-disk data representation.
Last updated
clusteringgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecellhuman-cell-atlasopenblascpp
8.16 score 13 stars 2 dependents 92 scripts 1.1k downloads
ngsReports - Load FastQC reports and other NGS related files
This package provides methods and object classes for parsing FastQC reports and output summaries from other NGS tools into R. As well as parsing files, multiple plotting methods have been implemented for visualising the parsed data. Plots can be generated as static ggplot objects or interactive plotly objects.
Last updated
qualitycontrolreportwriting
8.09 score 23 stars 120 scripts 448 downloads
DEqMS - a tool to perform statistical analysis of differential protein expression for quantitative proteomics data.
DEqMS is developed on top of Limma. However, Limma assumes the same prior variance for all genes. In proteomics, the accuracy of protein abundance estimates varies by the number of peptides/PSMs quantified in both label-free and labelled data. Proteins quantified by multiple peptides or PSMs are more accurate. The DEqMS package is able to estimate different prior variances for proteins quantified by different numbers of PSMs/peptides, therefore achieving better accuracy. The package can be applied to analyze both label-free and labelled proteomics data.
Last updated
immunooncologyproteomicsmassspectrometrypreprocessingdifferentialexpressionmultiplecomparisonnormalizationbayesianexperimenthubsoftwarelimmaquantitative-proteomic-analysis
8.08 score 26 stars 1 dependents 102 scripts 601 downloads
fishpond - Fishpond: downstream methods and tools for expression data
Fishpond contains methods for differential transcript and gene expression analysis of RNA-seq data using inferential replicates for uncertainty of abundance quantification, as generated by Gibbs sampling or bootstrap sampling. Also the package contains a number of utilities for working with Salmon and Alevin quantification files.
Last updated
sequencingrnaseqgeneexpressiontranscriptionnormalizationregressionmultiplecomparisonbatcheffectvisualizationdifferentialexpressiondifferentialsplicingalternativesplicingsinglecellbioconductorgene-expressiongenomicssalmonscrnaseqstatisticstranscriptomics
8.06 score 32 stars 225 scripts 832 downloads
coRdon - Codon Usage Analysis and Prediction of Gene Expressivity
Tool for analysis of codon usage in various unannotated or KEGG/COG annotated DNA sequences. Calculates different measures of CU bias and CU-based predictors of gene expressivity, and performs gene set enrichment analysis for annotated sequences. Implements several methods for visualization of CU and enrichment analysis results.
Last updated
softwaremetagenomicsgeneexpressiongenesetenrichmentgenepredictionvisualizationkeggpathwaysgenetics cellbiologybiomedicalinformaticsimmunooncology
7.93 score 23 stars 1 dependents 69 scripts 590 downloads
GENIE3 - GEne Network Inference with Ensemble of trees
This package implements the GENIE3 algorithm for inferring gene regulatory networks from expression data.
Last updated
networkinferencesystemsbiologydecisiontreeregressionnetworkgraphandnetworkgeneexpression
7.84 score 4 dependents 235 scripts 2.5k downloads
HPAanalyze - Retrieve and analyze data from the Human Protein Atlas
Provides functions for retrieving, exploring and visualizing the Human Protein Atlas data. HPAanalyze is designed to fulfill 3 main tasks: (1) Import, subsetting and export of downloadable datasets; (2) Visualization of downloadable datasets for exploratory analysis; and (3) Working with the individual XML files. This package aims to serve researchers with little programming experience, but also allows power users to use the imported data as desired.
Last updated
proteomicscellbiologyvisualizationsoftware
7.82 score 39 stars 40 scripts 546 downloads
ACE - Absolute Copy Number Estimation from Low-coverage Whole Genome Sequencing
Uses segmented copy number data to estimate tumor cell percentage and produce copy number plots displaying absolute copy numbers.
Last updated
copynumbervariationdnaseqcoveragewholegenomevisualizationsequencing
7.80 score 20 stars 42 scripts 482 downloads
pathwayPCA - Integrative Pathway Analysis with Modern PCA Methodology and Gene Selection
pathwayPCA is an integrative analysis tool that implements the principal component analysis (PCA) based pathway analysis approaches described in Chen et al. (2008), Chen et al. (2010), and Chen (2011). pathwayPCA allows users to: (1) Test pathway association with binary, continuous, or survival phenotypes. (2) Extract relevant genes in the pathways using the SuperPCA and AES-PCA approaches. (3) Compute principal components (PCs) based on the selected genes. These estimated latent variables represent pathway activities for individual subjects, which can then be used to perform integrative pathway analysis, such as multi-omics analysis. (4) Extract relevant genes that drive pathway significance as well as data corresponding to these relevant genes for additional in-depth analysis. (5) Perform analyses with enhanced computational efficiency with parallel computing and enhanced data safety with S4-class data objects. (6) Analyze studies with complex experimental designs, with multiple covariates, and with interaction effects, e.g., testing whether pathway association with clinical phenotype is different between male and female subjects. Citations: Chen et al. (2008) <https://doi.org/10.1093/bioinformatics/btn458>; Chen et al. (2010) <https://doi.org/10.1002/gepi.20532>; and Chen (2011) <https://doi.org/10.2202/1544-6115.1697>.
Last updated
copynumbervariationdnamethylationgeneexpressionsnptranscriptiongenepredictiongenesetenrichmentgenesignalinggenetargetgenomewideassociationgenomicvariationcellbiologyepigeneticsfunctionalgenomicsgeneticslipidomicsmetabolomicsproteomicssystemsbiologytranscriptomicsclassificationdimensionreductionfeatureextractionprincipalcomponentregressionsurvivalmultiplecomparisonpathways
7.75 score 11 stars 43 scripts 414 downloads
countsimQC - Compare Characteristic Features of Count Data Sets
countsimQC provides functionality to create a comprehensive report comparing a broad range of characteristics across a collection of count matrices. One important use case is the comparison of one or more synthetic count matrices to a real count matrix, possibly the one underlying the simulations. However, any collection of count matrices can be compared.
Last updated
microbiomernaseqsinglecellexperimentaldesignqualitycontrolreportwritingvisualizationimmunooncology
7.75 score 30 stars 25 scripts 437 downloads
Wrench - Wrench normalization for sparse count data
Wrench is a package for normalization of sparse genomic count data, like that arising from 16S metagenomic surveys.
Last updated
normalizationsequencingsoftware
7.74 score 6 stars 7 dependents 27 scripts 2.7k downloads
MIRA - Methylation-Based Inference of Regulatory Activity
DNA methylation contains information about the regulatory state of the cell. MIRA aggregates genome-scale DNA methylation data into a DNA methylation profile for a given region set with shared biological annotation. Using this profile, MIRA infers and scores the collective regulatory activity for the region set. MIRA facilitates regulatory analysis in situations where classical regulatory assays would be difficult and allows public sources of region sets to be leveraged for novel insight into the regulatory state of DNA methylation datasets.
Last updated
immunooncologydnamethylationgeneregulationgenomeannotationsystemsbiologyfunctionalgenomicschipseqmethylseqsequencingepigeneticscoverage
7.74 score 13 stars 1 dependents 14 scripts 407 downloads
glmSparseNet - Network Centrality Metrics for Elastic-Net Regularized Models
glmSparseNet is an R-package that generalizes sparse regression models when the features (e.g. genes) have a graph structure (e.g. protein-protein interactions), by including network-based regularizers. glmSparseNet uses the glmnet R-package, by including centrality measures of the network as penalty weights in the regularization. The current version implements regularization based on node degree, i.e. the strength and/or number of its associated edges, either by promoting hubs in the solution or orphan genes in the solution. All the glmnet distribution families are supported, namely "gaussian", "poisson", "binomial", "multinomial", "cox", and "mgaussian".
Last updated
softwarestatisticalmethoddimensionreductionregressionclassificationsurvivalnetworkgraphandnetwork
7.72 score 6 stars 1 dependents 54 scripts 820 downloads
cola - A Framework for Consensus Partitioning
Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement to known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes so that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionalities to compare results straightforwardly. 4. It provides a new method to extract features which are more efficient to separate subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.
Last updated
clusteringgeneexpressionclassificationsoftwareconsensus-clusteringcpp
7.65 score 64 stars 156 scripts 556 downloads
DEsingle - DEsingle for detecting three types of differential expression in single-cell RNA-seq data
DEsingle is an R package for differential expression (DE) analysis of single-cell RNA-seq (scRNA-seq) data. It defines and detects 3 types of differentially expressed genes between two groups of single cells, with regard to different expression status (DEs), differential expression abundance (DEa), and general differential expression (DEg). DEsingle employs Zero-Inflated Negative Binomial model to estimate the proportion of real and dropout zeros and to define and detect the 3 types of DE genes. Results showed that DEsingle outperforms existing methods for scRNA-seq DE analysis, and can reveal different types of DE genes that are enriched in different biological functions.
Last updated
differentialexpressiongeneexpressionsinglecellimmunooncologyrnaseqtranscriptomicssequencingpreprocessingsoftware
7.58 score 33 stars 38 scripts 611 downloads
multiHiCcompare - Normalize and detect differences between Hi-C datasets when replicates of each experimental condition are available
multiHiCcompare provides functions for joint normalization and difference detection in multiple Hi-C datasets. This extension of the original HiCcompare package now allows for Hi-C experiments with more than 2 groups and multiple samples per group. multiHiCcompare operates on processed Hi-C data in the form of sparse upper triangular matrices. It accepts four column (chromosome, region1, region2, IF) tab-separated text files storing chromatin interaction matrices. multiHiCcompare provides cyclic loess and fast loess (fastlo) methods adapted to jointly normalizing Hi-C data. Additionally, it provides a general linear model (GLM) framework adapting the edgeR package to detect differences in Hi-C data in a distance dependent manner.
Last updated
softwarehicsequencingnormalization
7.50 score 10 stars 2 dependents 53 scripts 474 downloads
netSmooth - Network smoothing for scRNAseq
netSmooth is an R package for network smoothing of single cell RNA sequencing data. Using biological networks such as protein-protein interactions as priors for gene co-expression, netSmooth improves cell type identification from noisy, sparse scRNAseq data.
Last updated
networkgraphandnetworksinglecellrnaseqgeneexpressionsequencingtranscriptomicsnormalizationpreprocessingclusteringdimensionreductionbioinformaticsgenomicssingle-cell
7.44 score 29 stars 4 scripts 398 downloads
phantasus - Visual and interactive gene expression analysis
Phantasus is a web-application for visual and interactive gene expression analysis. Phantasus is based on Morpheus – a web-based software for heatmap visualisation and analysis, which was integrated with an R environment via OpenCPU API. Aside from basic visualization and filtering methods, R-based methods such as k-means clustering, principal component analysis or differential expression analysis with limma package are supported.
Last updated
geneexpressionguivisualizationdatarepresentationtranscriptomicsrnaseqmicroarraynormalizationclusteringdifferentialexpressionprincipalcomponentimmunooncology
7.35 score 46 stars 18 scripts 457 downloads
BiocVersion - Set the appropriate version of Bioconductor packages
This package provides repository information for the appropriate version of Bioconductor.
Last updated
infrastructure
7.35 score 116 dependents 26 scripts 123k downloads
COCOA - Coordinate Covariation Analysis
COCOA is a method for understanding epigenetic variation among samples. COCOA can be used with epigenetic data that includes genomic coordinates and an epigenetic signal, such as DNA methylation and chromatin accessibility data. To describe the method on a high level, COCOA quantifies inter-sample variation with either a supervised or unsupervised technique then uses a database of "region sets" to annotate the variation among samples. A region set is a set of genomic regions that share a biological annotation, for instance transcription factor (TF) binding regions, histone modification regions, or open chromatin regions. COCOA can identify region sets that are associated with epigenetic variation between samples and increase understanding of variation in your data.
Last updated
epigeneticsdnamethylationatacseqdnaseseqmethylseqmethylationarrayprincipalcomponentgenomicvariationgeneregulationgenomeannotationsystemsbiologyfunctionalgenomicschipseqsequencingimmunooncologydna-methylationpca
7.24 score 11 stars 21 scripts 400 downloads
onlineFDR - Online error rate control
This package allows users to control the false discovery rate (FDR) or familywise error rate (FWER) for online multiple hypothesis testing, where hypotheses arrive in a stream. In this framework, a null hypothesis is rejected based on the evidence against it and on the previous rejection decisions.
Last updated
multiplecomparisonsoftwarestatisticalmethoderror-rate-controlfdrfwerhypothesis-testingcpp
7.24 score 15 stars 48 scripts 272 downloads
topconfects - Top Confident Effect Sizes
Rank results by confident effect sizes, while maintaining False Discovery Rate and False Coverage-statement Rate control. Topconfects is an alternative presentation of TREAT results with improved usability, eliminating p-values and instead providing confidence bounds. The main application is differential gene expression analysis, providing genes ranked in order of confident log2 fold change, but it can be applied to any collection of effect sizes with associated standard errors.
Last updated
geneexpressiondifferentialexpressiontranscriptomicsrnaseqmrnamicroarrayregressionmultiplecomparison
7.20 score 15 stars 2 dependents 22 scripts 454 downloads
maser - Mapping Alternative Splicing Events to pRoteins
This package provides functionalities for downstream analysis, annotation and visualization of alternative splicing events generated by rMATS.
Last updated
alternativesplicingtranscriptomicsvisualization
7.18 score 21 stars 27 scripts 538 downloads
methylGSA - Gene Set Analysis Using the Outcome of Differential Methylation
The main functions for methylGSA are methylglm and methylRRA. methylGSA implements logistic regression adjusting number of probes as a covariate. methylRRA adjusts multiple p-values of each gene by Robust Rank Aggregation. For more detailed help information, please see the vignette.
Last updated
dnamethylationdifferentialmethylationgenesetenrichmentregressiongeneregulationpathwaysenrichmentgeneralized-linear-modelslogistic-regressionmethylationontologyshiny
7.18 score 13 stars 58 scripts 425 downloads
cytolib - C++ infrastructure for representing and interacting with the gated cytometry data
This package provides the core data structure and API to represent and interact with the gated cytometry data.
Last updated
immunooncologyflowcytometrydataimportpreprocessingdatarepresentation
7.09 score 62 dependents 13 scripts 5.1k downloads
RCM - Fit row-column association models with the negative binomial distribution for the microbiome
Combines ideas of log-linear analysis of contingency tables, flexible response function estimation and empirical Bayes dispersion estimation for explorative visualization of microbiome datasets. The package includes unconstrained as well as constrained analysis. In addition, diagnostic plots to detect lack of fit are available.
Last updated
metagenomicsdimensionreductionmicrobiomevisualizationordinationphyloseqrcm
7.09 score 16 stars 17 scripts 348 downloads
alevinQC - Generate QC Reports For Alevin Output
Generate QC reports summarizing the output from an alevin, alevin-fry, or simpleaf run. Reports can be generated as html or pdf files, or as shiny applications.
Last updated
qualitycontrolsinglecellcpp
7.07 score 31 stars 51 scripts 512 downloads
MPRAnalyze - Statistical Analysis of MPRA data
MPRAnalyze provides a statistical framework for the analysis of data generated by Massively Parallel Reporter Assays (MPRAs), used to directly measure enhancer activity. MPRAnalyze can be used for quantification of enhancer activity, classification of active enhancers and comparative analyses of enhancer activity between conditions. MPRAnalyze constructs a nested pair of generalized linear models (GLMs) to relate the DNA and RNA observations, easily adjustable to various experimental designs and conditions, and provides a set of rigorous statistical testing schemes.
Last updated
immunooncologysoftwarestatisticalmethodsequencinggeneexpressioncellbiologycellbasedassaysdifferentialexpressionexperimentaldesignclassification
7.04 score 12 stars 46 scripts 328 downloads
Modstrings - Working with modified nucleotide sequences
Representing nucleotide modifications in a nucleotide sequence is usually done via special characters from a number of sources. This represents a challenge to work with in R and the Biostrings package. The Modstrings package implements this functionality for RNA and DNA sequences containing modified nucleotides by translating the character internally in order to work with the infrastructure of the Biostrings package. For this, the ModRNAString and ModDNAString classes and derivatives, and functions to construct and modify these objects despite the encoding issues, are implemented. In addition, the conversion from sequences to list-like location information (and the reverse operation) is implemented as well.
Last updated
dataimportdatarepresentationinfrastructuresequencingsoftwarebioconductorbiostringsdnadna-modificationsmodified-nucleotidesnucleotidesrnarna-modification-alphabetrna-modificationssequences
7.03 score 2 stars 8 dependents 5 scripts 478 downloads
diffcoexp - Differential Co-expression Analysis
A tool for the identification of differentially coexpressed links (DCLs) and differentially coexpressed genes (DCGs). DCLs are gene pairs with significantly different correlation coefficients under two conditions. DCGs are genes with significantly more DCLs than by chance.
Last updated
geneexpressiondifferentialexpressiontranscriptionmicroarrayonechanneltwochannelrnaseqsequencingcoverageimmunooncology
6.94 score 17 stars 37 scripts 437 downloads
SpectralTAD - SpectralTAD: Hierarchical TAD detection using spectral clustering
SpectralTAD is an R package designed to identify Topologically Associated Domains (TADs) from Hi-C contact matrices. It uses a modified version of spectral clustering that uses a sliding window to quickly detect TADs. The function works on a range of different formats of contact matrices and returns a bed file of TAD coordinates. The method does not require users to adjust any parameters to work and gives them control over the number of hierarchical levels to be returned.
Last updated
softwarehicsequencingfeatureextractionclustering
6.92 score 12 stars 28 scripts 392 downloads
vidger - Create rapid visualizations of RNAseq data in R
The aim of vidger is to rapidly generate information-rich visualizations for the interpretation of differential gene expression results from three widely-used tools: Cuffdiff, DESeq2, and edgeR.
Last updated
immunooncologyvisualizationrnaseqdifferentialexpressiongeneexpressiondata-mungingdifferential-expressiongene-expressionrna-seq-analysis
6.91 score 20 stars 34 scripts 470 downloads
RProtoBufLib - C++ headers and static libraries of Protocol buffers
This package provides the headers and static library of Protocol buffers for other R packages to compile and link against.
Last updated
infrastructure
6.91 score 63 dependents 4.3k downloads
sRACIPE - Systems biology tool to simulate gene regulatory circuits
sRACIPE implements a randomization-based method for gene circuit modeling. It allows us to study the effect of both the gene expression noise and the parametric variation on any gene regulatory circuit (GRC) using only its topology, and simulates an ensemble of models with random kinetic parameters at multiple noise levels. Statistical analysis of the generated gene expressions reveals the basin of attraction and stability of various phenotypic states and their changes associated with intrinsic and extrinsic noises. sRACIPE provides a holistic picture to evaluate the effects of both the stochastic nature of cellular processes and the parametric variation.
Last updated
researchfieldsystemsbiologymathematicalbiologygeneexpressiongeneregulationgenetargetcpp
6.80 score 6 stars 234 scripts 356 downloads
EventPointer - An effective identification of alternative splicing events using junction arrays and RNA-Seq data
EventPointer is an R package to identify alternative splicing events that involve either simple (case-control experiment) or complex experimental designs such as time course experiments and studies including paired-samples. The algorithm can be used to analyze data from either junction arrays (Affymetrix Arrays) or sequencing data (RNA-Seq). In the latter, EventPointer can work with annotated splicing events or can build a splicing graph from the RNA-Seq reads and then identify new and specific alternative splicing events. The software returns a data.frame with the detected alternative splicing events: gene name, type of event (cassette, alternative 3',...,etc), genomic position, statistical significance and increment of the percent spliced in (Delta PSI) for all the events. The algorithm can generate a series of files to visualize the detected alternative splicing events in IGV. This eases the interpretation of results and the design of primers for standard PCR validation.
Last updated
alternativesplicingdifferentialsplicingmrnamicroarrayrnaseqtranscriptionsequencingtimecourseimmunooncology
6.80 score 5 stars 7 scripts 453 downloads
ATACseqQC - ATAC-seq Quality Control
ATAC-seq, an assay for Transposase-Accessible Chromatin using sequencing, is a rapid and sensitive method for chromatin accessibility analysis. It was developed as an alternative method to MNase-seq, FAIRE-seq and DNAse-seq. Compared to the other methods, ATAC-seq requires less biological sample and less time to process. In the process of analyzing several ATAC-seq datasets produced in our labs, we learned some of the unique aspects of the quality assessment for ATAC-seq data. To help users quickly assess whether their ATAC-seq experiment is successful, we developed the ATACseqQC package, partially following the guideline published in Nature Methods 2013 (Greenleaf et al.), including diagnostic plots of fragment size distribution, proportion of mitochondria reads, nucleosome positioning pattern, and CTCF or other transcription factor footprints.
Last updated
sequencingdnaseqatacseqgeneregulationqualitycontrolcoveragenucleosomepositioningimmunooncology
6.80 score 1 dependents 192 scripts 1.1k downloads
PrInCE - Predicting Interactomes from Co-Elution
PrInCE (Predicting Interactomes from Co-Elution) uses a naive Bayes classifier trained on dataset-derived features to recover protein-protein interactions from co-elution chromatogram profiles. This package contains the R implementation of PrInCE.
Last updated
proteomicssystemsbiologynetworkinference
6.80 score 8 stars 65 scripts 367 downloads
SIAMCAT - Statistical Inference of Associations between Microbial Communities And host phenoTypes
Pipeline for Statistical Inference of Associations between Microbial Communities And host phenoTypes (SIAMCAT). A primary goal of analyzing microbiome data is to determine changes in community composition that are associated with environmental factors. In particular, linking human microbiome composition to host phenotypes such as diseases has become an area of intense research. For this, robust statistical modeling and biomarker extraction toolkits are crucially needed. SIAMCAT provides a full pipeline supporting data preprocessing, statistical association testing, statistical modeling (LASSO logistic regression) including tools for evaluation and interpretation of these models (such as cross validation, parameter selection, ROC analysis and diagnostic model plots).
Last updated
immunooncologymetagenomicsclassificationmicrobiomesequencingpreprocessingclusteringfeatureextractiongeneticvariabilitymultiplecomparisonregression
6.79 score 173 scripts 450 downloads
BioCor - Functional Similarities
Calculates functional similarities based on the pathways described on KEGG and REACTOME or in gene sets. These similarities can be calculated for pathways or gene sets, genes, or clusters and combined with other similarities. They can be used to improve networks, gene selection, testing relationships...
Last updated
statisticalmethodclusteringgeneexpressionnetworkpathwaysnetworkenrichmentsystemsbiologybioconductor-packagesbioinformaticsfunctional-similaritygenegene-setspathway-analysissimilaritysimilarity-measurement
6.77 score 14 stars 5 scripts 511 downloads
Trendy - Breakpoint analysis of time-course expression data
Trendy implements segmented (or breakpoint) regression models to estimate breakpoints which represent changes in expression for each feature/gene in high throughput data with ordered conditions.
Last updated
timecoursernaseqregressionimmunooncology
6.70 score 7 stars 18 scripts 383 downloads
miRBaseConverter - A comprehensive and high-efficiency tool for converting and retrieving the information of miRNAs in different miRBase versions
A comprehensive tool for converting and retrieving the miRNA Name, Accession, Sequence, Version, History and Family information in different miRBase versions. It can process a huge number of miRNAs in a short time without other dependencies.
Last updated
softwaremirna
6.69 score 2 stars 82 scripts 544 downloads
tRNA - Analyzing tRNA sequences and structures
The tRNA package allows tRNA sequences and structures to be accessed and used for subsetting. In addition, it provides visualization tools to compare feature parameters of multiple tRNA sets and correlate them to additional data. The tRNA package uses GRanges objects as inputs, requiring only a few additional column data sets.
Last updated
softwarevisualizationbioconductorsequencesstructurestrna
6.69 score 1 stars 3 dependents 9 scripts 457 downloads
M3C - Monte Carlo Reference-based Consensus Clustering
M3C is a consensus clustering algorithm that uses a Monte Carlo simulation to eliminate overestimation of K and can reject the null hypothesis K=1.
Last updated
clusteringgeneexpressiontranscriptionrnaseqsequencingimmunooncology
6.68 score 1 dependents 214 scripts 1.1k downloads
scds - In-Silico Annotation of Doublets for Single Cell RNA Sequencing Data
In single cell RNA sequencing (scRNA-seq) data combinations of cells are sometimes considered a single cell (doublets). The scds package provides methods to annotate doublets in scRNA-seq data computationally.
Last updated
singlecellrnaseqqualitycontrolpreprocessingtranscriptomicsgeneexpressionsequencingsoftwareclassification
6.67 score 1 dependents 224 scripts 875 downloads
PepsNMR - Pre-process 1H-NMR FID signals
This package provides R functions for common pre-processing steps that are applied to 1H-NMR data. It also provides a function to read the FID signals directly in the Bruker format.
Last updated
softwarepreprocessingvisualizationmetabolomicsdataimport
6.61 score 8 stars 1 dependents 28 scripts 465 downloads
Xeva - Analysis of patient-derived xenograft (PDX) data
The Xeva package provides efficient and powerful functions for patient-derived xenograft (PDX) based pharmacogenomic data analysis. This package contains a set of functions to perform analysis of patient-derived xenograft data. This package was developed by the BHKLab, for further information please see our documentation.
Last updated
geneexpressionpharmacogeneticspharmacogenomicssoftwareclassification
6.59 score 11 stars 35 scripts 344 downloads
scruff - Single Cell RNA-Seq UMI Filtering Facilitator (scruff)
A pipeline which processes single cell RNA-seq (scRNA-seq) reads from CEL-seq and CEL-seq2 protocols. Demultiplex scRNA-seq FASTQ files, align reads to reference genome using Rsubread, and generate UMI filtered count matrix. Also provide visualizations of read alignments and pre- and post-alignment QC metrics.
Last updated
softwaretechnologysequencingalignmentrnaseqsinglecellworkflowsteppreprocessingqualitycontrolvisualizationimmunooncologybioinformaticsscrna-seqsingle-cellumi
6.57 score 11 stars 28 scripts 420 downloads
CellMixS - Evaluate Cellspecific Mixing
CellMixS provides metrics and functions to evaluate batch effects, data integration and batch effect correction in single cell transcriptome data with single cell resolution. Results can be visualized and summarised on different levels, e.g. on cell, celltype or dataset level.
Last updated
singlecelltranscriptomicsgeneexpressionbatcheffect
6.55 score 7 stars 67 scripts 494 downloads
bayNorm - Single-cell RNA sequencing data normalization
bayNorm is used for normalizing single-cell RNA-seq data.
Last updated
immunooncologynormalizationrnaseqsinglecellsequencingscrnaseqcppopenmp
6.55 score 10 stars 39 scripts 486 downloads
wiggleplotr - Make read coverage plots from BigWig files
Tools to visualise read coverage from sequencing experiments together with genomic annotations (genes, transcripts, peaks). Introns of long transcripts can be rescaled to a fixed length for better visualisation of exonic read coverage.
Last updated
immunooncologycoveragernaseqchipseqsequencingvisualizationgeneexpressiontranscriptionalternativesplicing
6.51 score 3 dependents 36 scripts 532 downloads
dmrseq - Detection and inference of differentially methylated regions from Whole Genome Bisulfite Sequencing
This package implements an approach for scanning the genome to detect and perform accurate inference on differentially methylated regions from Whole Genome Bisulfite Sequencing data. The method is based on comparing detected regions to a pooled null distribution, that can be implemented even when as few as two samples per population are available. Region-level statistics are obtained by fitting a generalized least squares (GLS) regression model with a nested autoregressive correlated error structure for the effect of interest on transformed methylation proportions.
Last updated
immunooncologydnamethylationepigeneticsmultiplecomparisonsoftwaresequencingdifferentialmethylationwholegenomeregressionfunctionalgenomics
6.51 score 1 dependents 102 scripts 728 downloads
ndexr - NDEx R client library
This package offers an interface to NDEx servers, e.g. the public server at http://ndexbio.org/. It can retrieve and save networks via the API. Networks are offered as RCX objects and as igraph representations.
Last updated
pathwaysdataimportnetwork
6.50 score 9 stars 44 scripts 384 downloads
gwasurvivr - gwasurvivr: an R package for genome wide survival analysis
gwasurvivr is a package to perform survival analysis using Cox proportional hazard models on imputed genetic data.
Last updated
genomewideassociationsurvivalregressiongeneticssnpgeneticvariabilitypharmacogenomicsbiomedicalinformatics
6.48 score 13 stars 77 scripts 384 downloads
Structstrings - Implementation of the dot bracket annotations with Biostrings
The Structstrings package implements the widely used dot bracket annotation for storing base pairing information in structured RNA. Structstrings uses the infrastructure provided by the Biostrings package and derives the DotBracketString and related classes from the BString class. From these, base pair tables can be produced for in depth analysis. In addition, the loop indices of the base pairs can be retrieved as well. For better efficiency, information conversion is implemented in C, inspired to a large extent by the ViennaRNA package.
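As a rough illustration of what a dot bracket annotation encodes, the following Python sketch matches each opening bracket with its closing partner using a stack and returns the paired positions. It is not Structstrings code: the function name `dot_bracket_pairs` is hypothetical, and only the plain `(`, `)` and `.` symbols are handled.

```python
def dot_bracket_pairs(structure):
    """Return sorted, 1-based (open, close) position pairs for a dot bracket string."""
    open_positions = []  # stack of positions of '(' not yet closed
    pairs = []
    for pos, symbol in enumerate(structure, start=1):
        if symbol == "(":
            open_positions.append(pos)
        elif symbol == ")":
            if not open_positions:
                raise ValueError(f"unmatched ')' at position {pos}")
            # The most recently opened bracket pairs with this closing one.
            pairs.append((open_positions.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unsupported symbol {symbol!r} at position {pos}")
    if open_positions:
        raise ValueError(f"unmatched '(' at position {open_positions[-1]}")
    return sorted(pairs)


print(dot_bracket_pairs("((..))."))  # [(1, 6), (2, 5)]
```

Unpaired positions (the dots) simply do not appear in the result; a full base pair table would list them with no partner.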
Last updated
dataimportdatarepresentationinfrastructuresequencingsoftwarealignmentsequencematchingbioconductorrnarna-structural-analysisrna-structuresequencesstructures
6.47 score 5 stars 4 dependents 11 scripts 455 downloads
mpra - Analyze massively parallel reporter assays
Tools for data management, count preprocessing, and differential analysis in massively parallel reporter assays (MPRA).
Last updated
softwaregeneregulationsequencingfunctionalgenomics
6.46 score 6 stars 34 scripts 450 downloads
MSstatsTMT - Protein Significance Analysis in shotgun mass spectrometry-based proteomic experiments with tandem mass tag (TMT) labeling
The package provides statistical tools for detecting differentially abundant proteins in shotgun mass spectrometry-based proteomic experiments with tandem mass tag (TMT) labeling. It provides multiple functionalities, including data visualization, protein quantification and normalization, and statistical modeling and inference. Furthermore, it is inter-operable with other data processing tools, such as Proteome Discoverer, MaxQuant, OpenMS and SpectroMine.
Last updated
immunooncologymassspectrometryproteomicssoftware
6.42 score 3 dependents 59 scripts 708 downloads
esATAC - An Easy-to-use Systematic pipeline for ATACseq data analysis
This package provides a framework and complete preset pipeline for quantification and analysis of ATAC-seq reads. It covers raw sequencing reads preprocessing (FASTQ files), reads alignment (Rbowtie2), aligned reads file operations (SAM, BAM, and BED files), peak calling (F-seq), genome annotations (Motif, GO, SNP analysis) and quality control report. The package is managed by a dataflow graph. It is easy for users to pass variables seamlessly between processes and understand the workflow. Users can process FASTQ files through the end-to-end preset pipeline, which produces an HTML report for quality control and preliminary statistical results, or customize the workflow starting from any intermediate stage with esATAC functions.
Last updated
immunooncologysequencingdnaseqqualitycontrolalignmentpreprocessingcoverageatacseqdnaseseqatac-seqbioconductorpipelinecppopenjdk
6.41 score 23 stars 7 scripts 396 downloads
kissDE - Retrieves Condition-Specific Variants in RNA-Seq Data
Retrieves condition-specific variants in RNA-seq data (SNVs, alternative-splicings, indels). It has been developed as a post-treatment of 'KisSplice' but can also be used with user's own data.
Last updated
alternativesplicingdifferentialsplicingexperimentaldesigngenomicvariationrnaseqtranscriptomicsquarto
6.38 score 3 stars 7 scripts 360 downloads
CluMSID - Clustering of MS2 Spectra for Metabolite Identification
CluMSID is a tool that aids the identification of features in untargeted LC-MS/MS analysis by the use of MS2 spectra similarity and unsupervised statistical methods. It offers functions for a complete and customisable workflow from raw data to visualisations and is interfaceable with the xcms family of preprocessing packages.
Last updated
metabolomicspreprocessingclustering
6.32 score 10 stars 42 scripts 334 downloads
ideal - Interactive Differential Expression AnaLysis
This package provides functions for an Interactive Differential Expression AnaLysis of RNA-sequencing datasets, to extract information quickly and effectively downstream of the differential expression step. A Shiny application encapsulates the whole package. Support for reproducibility of the whole analysis is provided by means of a template report which gets automatically compiled and can be stored/shared.
Last updated
immunooncologygeneexpressiondifferentialexpressionrnaseqsequencingvisualizationqualitycontrolguigenesetenrichmentreportwritingshinyappsbioconductordifferential-expressionreproducible-researchrna-seqrna-seq-analysisshinyuser-friendly
6.30 score 33 stars 7 scripts 448 downloads
GDSArray - Representing GDS files as array-like objects
GDS files are widely used to represent genotyping or sequence data. The GDSArray package implements the `GDSArray` class to represent nodes in GDS files in a matrix-like representation that allows easy manipulation (e.g., subsetting, mathematical transformation) in _R_. The data remains on disk until needed, so that very large files can be processed.
Last updated
infrastructuredatarepresentationsequencinggenotypingarray
6.26 score 5 stars 2 dependents 8 scripts 434 downloads
OmaDB - R wrapper for the OMA REST API
A package for downloading orthology prediction data from the OMA database.
Last updated
softwarecomparativegenomicsfunctionalgenomicsgeneticsannotationgofunctionalprediction
6.23 score 2 stars 15 scripts 410 downloads
DiscoRhythm - Interactive Workflow for Discovering Rhythmicity in Biological Data
Set of functions for estimation of cyclical characteristics, such as period, phase, amplitude, and statistical significance in large temporal datasets. Supporting functions are available for quality control, dimensionality reduction, spectral analysis, and analysis of experimental replicates. Contains an R Shiny web interface to execute all workflow steps.
Last updated
softwaretimecoursequalitycontrolvisualizationguiprincipalcomponentbioconductordata-visualizationoscillationsrhythm-detectionwebapp
6.21 score 13 stars 21 scripts 377 downloads
BANDITS - BANDITS: Bayesian ANalysis of DIfferenTial Splicing
BANDITS is a Bayesian hierarchical model for detecting differential splicing of genes and transcripts, via differential transcript usage (DTU), between two or more conditions. The method uses a Bayesian hierarchical framework, which allows for sample specific proportions in a Dirichlet-Multinomial model, and samples the allocation of fragments to the transcripts. Parameters are inferred via Markov chain Monte Carlo (MCMC) techniques and a DTU test is performed via a multivariate Wald test on the posterior densities for the average relative abundance of transcripts.
Last updated
differentialsplicingalternativesplicingbayesiangeneticsrnaseqsequencingdifferentialexpressiongeneexpressionmultiplecomparisonsoftwaretranscriptionstatisticalmethodvisualizationopenblascpp
6.19 score 19 stars 1 dependents 18 scripts 528 downloads
consensusOV - Gene expression-based subtype classification for high-grade serous ovarian cancer
This package implements four major subtype classifiers for high-grade serous (HGS) ovarian cancer as described by Helland et al. (PLoS One, 2011), Bentink et al. (PLoS One, 2012), Verhaak et al. (J Clin Invest, 2013), and Konecny et al. (J Natl Cancer Inst, 2014). In addition, the package implements a consensus classifier, which consolidates and improves on the robustness of the proposed subtype classifiers, thereby providing reliable stratification of patients with HGS ovarian tumors of clearly defined subtype.
Last updated
classificationclusteringdifferentialexpressiongeneexpressionmicroarraytranscriptomicscancer-datacancer-genomicscancer-researchexpression-databaseovarian-cancer
6.18 score 3 stars 1 dependents 24 scripts 416 downloads
sevenC - Computational Chromosome Conformation Capture by Correlation of ChIP-seq at CTCF motifs
Chromatin looping is an essential feature of eukaryotic genomes and can bring regulatory sequences, such as enhancers or transcription factor binding sites, into close physical proximity to regulated target genes. Here, we provide sevenC, an R package that uses protein binding signals from ChIP-seq and sequence motif information to predict chromatin looping events. Cross-linking of proteins that bind close to loop anchors results in ChIP-seq signals at both anchor loci. These signals are used at CTCF motif pairs together with their distance and orientation to each other to predict whether they interact or not. The resulting chromatin loops might be used to associate enhancers or transcription factor binding sites (e.g., ChIP-seq peaks) to regulated target genes.
Last updated
dna3dstructurechipchipcoveragedataimportepigeneticsfunctionalgenomicsclassificationregressionchipseqhicannotation3d-genomechip-seqchromatin-interactionhi-cpredictionsequence-motiftranscription-factors
6.11 score 13 stars 4 scripts 439 downloads
PROPS - PRObabilistic Pathway Score (PROPS)
This package calculates probabilistic pathway scores using gene expression data. Gene expression values are aggregated into pathway-based scores using Bayesian network representations of biological pathways.
Last updated
classificationbayesiangeneexpression
6.08 score 604 scripts 433 downloads
FELLA - Interpretation and enrichment for metabolomics data
Enrichment of metabolomics data using KEGG entries. Given a set of affected compounds, FELLA suggests affected reactions, enzymes, modules and pathways using label propagation in a knowledge model network. The resulting subnetwork can be visualised and exported.
Last updated
softwaremetabolomicsgraphandnetworkkegggopathwaysnetworknetworkenrichment
6.07 score 37 scripts 530 downloads
martini - GWAS Incorporating Networks
martini deals with the low power inherent to GWAS studies by using prior knowledge represented as a network. SNPs are the vertices of the network, and the edges represent biological relationships between them (genomic adjacency, belonging to the same gene, physical interaction between protein products). The network is scanned using SConES, which looks for groups of SNPs maximally associated with the phenotype, that form a close subnetwork.
Last updated
softwaregenomewideassociationsnpgeneticvariabilitygeneticsfeatureextractiongraphandnetworknetworkbioinformaticsgenomicsgwasnetwork-analysissnpssystems-biologycpp
6.06 score 4 stars 36 scripts 420 downloads
atSNP - Affinity test for identifying regulatory SNPs
atSNP performs affinity tests of motif matches with the SNP or the reference genomes and SNP-led changes in motif matches.
Last updated
softwarechipseqgenomeannotationmotifannotationvisualizationcpp
6.06 score 2 stars 38 scripts 485 downloads
CEMiTool - Co-expression Modules identification Tool
The CEMiTool package unifies the discovery and the analysis of coexpression gene modules in a fully automatic manner, while providing a user-friendly html report with high quality graphs. Our tool evaluates if modules contain genes that are over-represented by specific pathways or that are altered in a specific sample group. Additionally, CEMiTool is able to integrate transcriptomic data with interactome information, identifying the potential hubs on each network.
Last updated
geneexpressiontranscriptomicsgraphandnetworkmrnamicroarrayrnaseqnetworknetworkenrichmentpathwaysimmunooncology
6.06 score 57 scripts 514 downloads
CopyNumberPlots - Create Copy-Number Plots using karyoploteR functionality
CopyNumberPlots provides a set of functions extending karyoploteR's functionality to create customizable and flexible plots of copy-number related data.
Last updated
visualizationcopynumbervariationcoverageonechanneldataimportsequencingdnaseqbioconductorbioconductor-packagebioinformaticscopy-number-variationgenomicsgenomics-visualization
6.05 score 6 stars 1 dependents 21 scripts 321 downloads
zFPKM - A suite of functions to facilitate zFPKM transformations
Perform the zFPKM transform on RNA-seq FPKM data. This algorithm is based on the publication by Hart et al., 2013 (Pubmed ID 24215113). The reference recommends using zFPKM > -3 to select expressed genes. Validated with ENCODE open/closed chromosome data. Works well for gene level data using FPKM or TPM. Does not appear to calibrate well for transcript level data.
Last updated
immunooncologyrnaseqfeatureextractionsoftwaregeneexpression
6.03 score 9 stars 20 scripts 469 downloads
INDEED - Interactive Visualization of Integrated Differential Expression and Differential Network Analysis for Biomarker Candidate Selection Package
An R package for integrated differential expression and differential network analysis based on omic data for cancer biomarker discovery. Both correlation and partial correlation can be used to generate a differential network that complements traditional differential expression analysis, identifying changes between biomolecules at both the expression and pairwise association levels. A detailed description of the methodology has been published in the journal Methods (PMID: 27592383). An interactive visualization feature allows for the exploration and selection of candidate biomarkers.
Last updated
immunooncologysoftwareresearchfieldbiologicalquestionstatisticalmethoddifferentialexpressionmassspectrometrymetabolomics
6.02 score 5 stars 10 scripts 316 downloads
RITAN - Rapid Integration of Term Annotation and Network resources
Tools for comprehensive gene set enrichment and extraction of multi-resource high confidence subnetworks. RITAN facilitates bioinformatic tasks for enabling network biology research.
Last updated
qualitycontrolnetworknetworkenrichmentnetworkinferencegenesetenrichmentfunctionalgenomicsgraphandnetwork
5.99 score 26 scripts 306 downloads
cicero - Predict cis-co-accessibility from single-cell chromatin accessibility data
Cicero computes putative cis-regulatory maps from single-cell chromatin accessibility data. It also extends monocle 2 for use in chromatin accessibility data.
Last updated
sequencingclusteringcellbasedassaysimmunooncologygeneregulationgenetargetepigeneticsatacseqsinglecell
5.94 score 436 scripts 759 downloads
breakpointR - Find breakpoints in Strand-seq data
This package implements functions for finding breakpoints, plotting and export of Strand-seq data.
Last updated
softwaresequencingdnaseqsinglecellcoverage
5.93 score 9 stars 19 scripts 446 downloads
seqcombo - Visualization Tool for Genetic Reassortment
Provides useful functions for visualizing virus reassortment events.
Last updated
alignmentsoftwarevisualization
5.92 score 21 stars 8 scripts 369 downloads
qsmooth - Smooth quantile normalization
Smooth quantile normalization is a generalization of quantile normalization, which is an average of two types of assumptions about the data generation process: quantile normalization and quantile normalization between groups.
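For orientation, here is a minimal Python sketch of ordinary quantile normalization, the technique that smooth quantile normalization generalizes. It is not the qsmooth algorithm and does not use its API; the function name `quantile_normalize` is hypothetical, and ties are broken by original order rather than averaged.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of values per sample)."""
    n = len(samples[0])
    if any(len(sample) != n for sample in samples):
        raise ValueError("all samples must have the same number of values")

    # Reference distribution: the mean across samples at each rank.
    sorted_samples = [sorted(sample) for sample in samples]
    reference = [
        sum(sample[rank] for sample in sorted_samples) / len(samples)
        for rank in range(n)
    ]

    normalized = []
    for sample in samples:
        # Indices of this sample's values, ordered from smallest to largest.
        order = sorted(range(n), key=lambda i: sample[i])
        result = [0.0] * n
        for rank, index in enumerate(order):
            result[index] = reference[rank]  # replace value by the reference quantile
        normalized.append(result)
    return normalized


print(quantile_normalize([[5, 2, 3], [4, 1, 6]]))
# [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization every sample has the same set of values, so this approach assumes the samples share one underlying distribution; relaxing that assumption across groups is what the smooth variant described above is about.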
Last updated
normalizationpreprocessingmultiplecomparisonmicroarraysequencingrnaseqbatcheffect
5.91 score 1 dependents 91 scripts 386 downloads
miRspongeR - Identification and analysis of miRNA sponge regulation
This package provides several functions to explore miRNA sponge (also called ceRNA or miRNA decoy) regulation from putative miRNA-target interactions and/or transcriptomics data (including bulk, single-cell and spatial gene expression data). It provides eight popular methods for identifying miRNA sponge interactions and an integrative method to combine miRNA sponge interactions from different methods, as well as functions to validate miRNA sponge interactions, infer miRNA sponge modules, and conduct enrichment and survival analysis of miRNA sponge modules. By using a sample control variable strategy, it provides a function to infer sample-specific miRNA sponge interactions. In terms of sample-specific miRNA sponge interactions, it implements three similarity methods to construct a sample-sample correlation network.
Last updated
geneexpressionbiomedicalinformaticsnetworkenrichmentsurvivalmicroarraysoftwaresinglecellspatialrnaseqcernamirnaspongecpp
5.88 score 5 stars 10 scripts 362 downloads
MetID - Network-based prioritization of putative metabolite IDs
This package uses an innovative network-based approach that will enhance our ability to determine the identities of significant ions detected by LC-MS.
Last updated
assaydomainbiologicalquestioninfrastructureresearchfieldstatisticalmethodtechnologyworkflowstepnetworkkegg
5.85 score 1 stars 142 scripts 375 downloads
DeMixT - Cell type-specific deconvolution of heterogeneous tumor samples with two or three components using expression data from RNAseq or microarray platforms
DeMixT is a software package that performs deconvolution on transcriptome data from a mixture of two or three components.
Last updated
softwarestatisticalmethodclassificationgeneexpressionsequencingmicroarraytissuemicroarraycoveragecppopenmp
5.78 score 30 scripts 395 downloads
sitePath - Phylogeny-based sequence clustering with site polymorphism
Using site polymorphism is one of the ways to cluster DNA/protein sequences but it is possible for the sequences with the same polymorphism on a single site to be genetically distant. This package is aimed at clustering sequences using site polymorphism and their corresponding phylogenetic trees. By considering their location on the tree, only the structurally adjacent sequences will be clustered. However, the adjacent sequences may not necessarily have the same polymorphism. So a branch-and-bound like algorithm is used to minimize the entropy representing the purity of site polymorphism of each cluster.
Last updated
alignmentmultiplesequencealignmentphylogeneticssnpsoftwaremutationcpp
5.69 score 11 stars 7 scripts 390 downloads
GDCRNATools - GDCRNATools: an R/Bioconductor package for integrative analysis of lncRNA, mRNA, and miRNA data in GDC
This is an easy-to-use package for downloading, organizing, and integratively analyzing RNA expression data in GDC with an emphasis on deciphering the lncRNA-mRNA related ceRNA regulatory network in cancer. Three databases of lncRNA-miRNA interactions including spongeScan, starBase, and miRcode, as well as three databases of mRNA-miRNA interactions including miRTarBase, starBase, and miRcode are incorporated into the package for ceRNA network construction. limma, edgeR, and DESeq2 can be used to identify differentially expressed genes/miRNAs. Functional enrichment analyses including GO, KEGG, and DO can be performed based on the clusterProfiler and DO packages. Both univariate CoxPH and KM survival analyses of multiple genes can be implemented in the package. Besides some routine visualization functions such as volcano plot, bar plot, and KM plot, a few simple Shiny apps are provided to facilitate visualization of results on a local webpage.
Last updated
immunooncologygeneexpressiondifferentialexpressiongeneregulationgenetargetnetworkinferencesurvivalvisualizationgenesetenrichmentnetworkenrichmentnetworkrnaseqgokegg
5.67 score 47 scripts 546 downloads
Rhisat2 - R Wrapper for HISAT2 Aligner
An R interface to the HISAT2 spliced short-read aligner by Kim et al. (2015). The package contains wrapper functions to create a genome index and to perform the read alignment to the generated index.
Last updated
alignmentsequencingsplicedalignmentcpp
5.56 score 3 stars 1 dependents 7 scripts 565 downloads
RIVER - R package for RIVER (RNA-Informed Variant Effect on Regulation)
An implementation of a probabilistic modeling framework that jointly analyzes personal genome and transcriptome data to estimate the probability that a variant has regulatory impact in that individual. It is based on a generative model that assumes that genomic annotations, such as the location of a variant with respect to regulatory elements, determine the prior probability that the variant is a functional regulatory variant, which is an unobserved variable. The functional regulatory variant status then influences whether nearby genes are likely to display outlier levels of gene expression in that person. See the RIVER website for more information, documentation and examples.
Last updated
geneexpressiongeneticvariabilitysnptranscriptionfunctionalpredictiongeneregulationgenomicvariationbiomedicalinformaticsfunctionalgenomicsgeneticssystemsbiologytranscriptomicsbayesianclusteringtranscriptomevariantregressionfunctional-variantsvariant
5.56 score 12 stars 5 scripts 426 downloads
Rcwl - An R interface to the Common Workflow Language
The Common Workflow Language (CWL) is an open standard for development of data analysis workflows that is portable and scalable across different tools and working environments. Rcwl provides a simple way to wrap command line tools and build CWL data analysis pipelines programmatically within R. It increases the ease of usage, development, and maintenance of CWL pipelines.
Last updated
softwareworkflowstepimmunooncology
5.55 score 2 dependents 39 scripts 318 downloads
seqsetvis - Set Based Visualizations for Next-Gen Sequencing Data
seqsetvis enables the visualization and analysis of sets of genomic sites in next gen sequencing data. Although seqsetvis was designed for the comparison of multiple ChIP-seq samples, this package is domain-agnostic and allows the processing of multiple genomic coordinate files (bed-like files) and signal files (bigwig files or pileups from bam files). seqsetvis has multiple functions for fetching data from regions into a tidy format for analysis in data.table or tidyverse and visualization via ggplot2.
Last updated
softwarechipseqmultiplecomparisonsequencingvisualization
5.52 score 83 scripts 508 downloads
miRSM - Inferring miRNA sponge modules in heterogeneous data
The package aims to identify miRNA sponge or ceRNA modules in heterogeneous data. It provides several functions to study miRNA sponge modules at single-sample and multi-sample levels, including popular methods for inferring gene modules (candidate miRNA sponge or ceRNA modules), and two functions to identify miRNA sponge modules at single-sample and multi-sample levels, as well as several functions to conduct modular analysis of miRNA sponge modules.
Last updated
geneexpressionbiomedicalinformaticsclusteringgenesetenrichmentmicroarraysoftwaregeneregulationgenetargetcernamirnamirna-spongemirna-targetsmodulescppopenjdk
5.51 score 4 stars 5 scripts 348 downloads
bnbc - Bandwise normalization and batch correction of Hi-C data
Tools to normalize (several) Hi-C data from replicates.
Last updated
hicpreprocessingnormalizationsoftwarecpp
5.48 score 2 stars 15 scripts 402 downloads
cellbaseR - Querying annotation data from the high performance Cellbase web
This R package makes use of the exhaustive RESTful Web service API that has been implemented for the Cellbase database. It enables researchers to query and obtain a wealth of biological information from a single database, saving a lot of time. Another benefit is that researchers can easily make queries about different biological topics and link all this information together, as all information is integrated.
Last updated
annotationvariantannotation
5.48 score 2 stars 7 scripts 374 downloads
epiNEM - epiNEM
epiNEM is an extension of the original Nested Effects Models (NEM). EpiNEM is able to take into account double knockouts and infer more complex network signalling pathways. It is tailored towards large scale double knock-out screens.
Last updated
pathwayssystemsbiologynetworkinferencenetwork
5.48 score 1 stars 2 dependents 3 scripts 444 downloads
scRecover - scRecover for imputation of single-cell RNA-seq data
scRecover is an R package for imputation of single-cell RNA-seq (scRNA-seq) data. It will detect and impute dropout values in a scRNA-seq raw read counts matrix while keeping the real zeros unchanged, since there are both dropout zeros and real zeros in scRNA-seq data. By combination with scImpute, SAVER and MAGIC, scRecover not only detects dropout and real zeros at higher accuracy, but also improves the downstream clustering and visualization results.
Last updated
geneexpressionsinglecellrnaseqtranscriptomicssequencingpreprocessingsoftware
5.46 score 9 stars 16 scripts 391 downloads
Rbowtie2 - An R Wrapper for Bowtie2 and AdapterRemoval
This package provides an R wrapper of the popular bowtie2 sequencing reads aligner and AdapterRemoval, a convenient tool for rapid adapter trimming, identification, and read merging. The package contains wrapper functions that allow for genome indexing and alignment to those indexes. The package also allows for the creation of .bam files via Rsamtools.
Last updated
sequencingalignmentpreprocessingcpp
5.45 score 3 dependents 31 scripts 764 downloads
diffuStats - Diffusion scores on biological networks
Label propagation approaches are a widely used procedure in computational biology for giving context to molecular entities using network data. Node labels, which can derive from gene expression, genome-wide association studies, protein domains or metabolomics profiling, are propagated to their neighbours in the network, effectively smoothing the scores through prior annotated knowledge and prioritising novel candidates. The R package diffuStats contains a collection of diffusion kernels and scoring approaches that facilitates their computation, characterisation and benchmarking.
Last updated
networkgeneexpressiongraphandnetworkmetabolomicstranscriptomicsproteomicsgeneticsgenomewideassociationnormalizationcpp
5.43 score 45 scripts 414 downloads
QSutils - Quasispecies Diversity
Set of utility functions for viral quasispecies analysis with NGS data. Most functions are equally useful for metagenomic studies. There are three main types: (1) data manipulation and exploration—functions useful for converting reads to haplotypes and frequencies, repairing reads, intersecting strand haplotypes, and visualizing haplotype alignments. (2) diversity indices—functions to compute diversity and entropy, in which incidence, abundance, and functional indices are considered. (3) data simulation—functions useful for generating random viral quasispecies data.
Last updated
softwaregeneticsdnaseqgeneticvariabilitysequencingalignmentsequencematchingdataimport
5.43 score 1 dependents 15 scripts 445 downloads
MetNet - Inferring metabolic networks from untargeted high-resolution mass spectrometry data
MetNet contains functionality to infer metabolic network topologies from quantitative data and high-resolution mass/charge information. Using statistical models (including correlation, mutual information, regression and Bayes statistics) and quantitative data (intensity values of features) adjacency matrices are inferred that can be combined to a consensus matrix. Mass differences calculated between mass/charge values of features will be matched against a data frame of supplied mass/charge differences referring to transformations of enzymatic activities. In a third step, the two levels of information are combined to form an adjacency matrix inferred from both quantitative and structure information.
Last updated
immunooncologymetabolomicsmassspectrometrynetworkregression
5.40 score 6 scripts 440 downloads
SMAD - Statistical Modelling of AP-MS Data (SMAD)
Assigning probability scores to protein interactions captured in affinity purification mass spectrometry (AP-MS) experiments to infer protein-protein interactions. The output would facilitate non-specific background removal as contaminants are commonly found in AP-MS data.
Last updated
massspectrometryproteomicssoftwarecpp
5.38 score 5 scripts 320 downloads
rScudo - Signature-based Clustering for Diagnostic Purposes
SCUDO (Signature-based Clustering for Diagnostic Purposes) is a rank-based method for the analysis of gene expression profiles for diagnostic and classification purposes. It is based on the identification of sample-specific gene signatures composed of the most up- and down-regulated genes for that sample. Starting from gene expression data, functions in this package identify sample-specific gene signatures and use them to build a graph of samples. In this graph samples are joined by edges if they have a similar expression profile, according to a pre-computed similarity matrix. The similarity between the expression profiles of two samples is computed using a method similar to GSEA. The graph of samples can then be used to perform community clustering or to perform supervised classification of samples in a testing set.
Last updated
geneexpressiondifferentialexpressionbiomedicalinformaticsclassificationclusteringgraphandnetworknetworkproteomicstranscriptomicssystemsbiologyfeatureextraction
5.38 score 4 stars 20 scripts 334 downloads
icetea - Integrating Cap Enrichment with Transcript Expression Analysis
icetea (Integrating Cap Enrichment with Transcript Expression Analysis) provides functions for end-to-end analysis of multiple 5'-profiling methods such as CAGE, RAMPAGE and MAPCap, beginning from raw reads to detection of transcription start sites using replicates. It also allows performing differential TSS detection between groups of samples, thereby integrating the mRNA cap enrichment information with transcript expression analysis.
Last updated
immunooncologytranscriptiongeneexpressionsequencingrnaseqtranscriptomicsdifferentialexpressioncageexpressionrna-seq
5.37 score 2 stars 13 scripts 364 downloads
scFeatureFilter - A correlation-based method for quality filtering of single-cell RNAseq data
An R implementation of the correlation-based method developed in the Joshi laboratory to analyse and filter processed single-cell RNAseq data. It returns a filtered version of the data containing only gene expression values unaffected by systematic noise.
Last updated
immunooncologysinglecellrnaseqpreprocessinggeneexpression
5.31 score 27 scripts 393 downloads
biotmle - Targeted Learning with Moderated Statistics for Biomarker Discovery
Tools for differential expression biomarker discovery based on microarray and next-generation sequencing data that leverage efficient semiparametric estimators of the average treatment effect for variable importance analysis. Estimation and inference of the (marginal) average treatment effects of potential biomarkers are computed by targeted minimum loss-based estimation, with joint, stable inference constructed across all biomarkers using a generalization of moderated statistics for use with the estimated efficient influence function. The procedure accommodates the use of ensemble machine learning for the estimation of nuisance functions.
Last updated
regressiongeneexpressiondifferentialexpressionsequencingmicroarrayrnaseqimmunooncologybioconductorbioconductor-packagebioconductor-packagesbioinformaticsbiomarker-discoverybiostatisticscausal-inferencecomputational-biologymachine-learningstatisticstargeted-learning
5.30 score 5 stars 10 scripts 460 downloads
omicsPrint - Cross omic genetic fingerprinting
omicsPrint provides functionality for cross omic genetic fingerprinting, for example, to verify sample relationships between multiple omics data types, i.e. genomic, transcriptomic and epigenetic (DNA methylation).
Last updated
qualitycontrolgeneticsepigeneticstranscriptomicsdnamethylationtranscriptiongeneticvariabilityimmunooncology
5.27 score 37 scripts 420 downloads
strandCheckR - Calculate strandness information of a bam file
This package aims to quantify and remove putative double strand DNA from a strand-specific RNA sample. There are also options and methods to plot the positive/negative proportions of all sliding windows, which allow users to have an idea of how much the sample was contaminated and the appropriate threshold to be used for filtering.
Last updated
rnaseqalignmentqualitycontrolcoverageimmunooncology
5.26 score 1 stars 13 scripts 383 downloads
DelayedDataFrame - Delayed operation on DataFrame using standard DataFrame metaphor
Based on the standard DataFrame metaphor, we are trying to implement the feature of delayed operation on the DelayedDataFrame, with a slot of lazyIndex, which saves the mapping indexes for each column of DelayedDataFrame. Methods like show, validity check, [/[[ subsetting, rbind/cbind are implemented for DelayedDataFrame to be operated around lazyIndex. The listData slot stays untouched until a realization call e.g., DataFrame constructor OR as.list() is invoked.
Last updated
infrastructuredatarepresentation
5.26 score 2 stars 1 dependents 3 scripts 340 downloads
RcwlPipelines - Bioinformatics pipelines based on Rcwl
A collection of Bioinformatics tools and pipelines based on R and the Common Workflow Language.
Last updated
softwareworkflowstepalignmentpreprocessingqualitycontroldnaseqrnaseqdataimportimmunooncology
5.12 score 1 dependents 29 scripts 328 downloads
hipathia - HiPathia: High-throughput Pathway Analysis
Hipathia is a method for the computation of signal transduction along signaling pathways from transcriptomic data. The method is based on an iterative algorithm which is able to compute the signal intensity passing through the nodes of a network by taking into account the level of expression of each gene and the intensity of the signal arriving at it. It also provides a new approach to functional analysis that allows computing the signal arriving at the functions annotated to each pathway.
Last updated
pathwaysgraphandnetworkgeneexpressiongenesignalinggo
5.11 score 65 scripts 438 downloads
TFEA.ChIP - TFEA.ChIP, a Tool Kit for Transcription Factor Enrichment
Package to analyze transcription factor enrichment in a gene set using data from ChIP-Seq experiments.
Last updated
transcriptiongeneregulationgenesetenrichmenttranscriptomicssequencingchipseqrnaseqimmunooncologygeneexpressionchiponchip
5.11 score 20 scripts 459 downloads
qPLEXanalyzer - Tools for quantitative proteomics data analysis
Tools for TMT based quantitative proteomics data analysis.
Last updated
immunooncologyproteomicsmassspectrometrynormalizationpreprocessingqualitycontroldataimport
5.08 score 1 stars 10 scripts 400 downloads
tRNAscanImport - Importing a tRNAscan-SE result file as GRanges object
The package imports the result of tRNAscan-SE as a GRanges object.
Last updated
softwaredataimportworkflowsteppreprocessingvisualizationbioconductorsequencesstructurestrnatrnascantrnascan-se
5.08 score 2 stars 3 scripts 338 downloads
SeqSQC - A Bioconductor package for sample quality check with next generation sequencing data
SeqSQC is designed to identify problematic samples in NGS data, including samples with gender mismatch, contamination, cryptic relatedness, and population outliers.
Last updated
experiment datahomo_sapiens_datasequencing dataproject1000genomesgenome
5.08 score 7 scripts 380 downloads
methInheritSim - Simulating Whole-Genome Inherited Bisulphite Sequencing Data
Simulate a multigeneration methylation case versus control experiment with inheritance relation using a real control dataset.
Last updated
biologicalquestionepigeneticsdnamethylationdifferentialmethylationmethylseqsoftwareimmunooncologystatisticalmethodwholegenomesequencingbisulphite-sequencinginheritancemethylationsimulation
5.08 score 2 stars 1 scripts 354 downloads
semisup - Semi-Supervised Mixture Model
Implements a parametric semi-supervised mixture model. The permutation test detects markers with main or interactive effects, without distinguishing them. Possible applications include genome-wide association analysis and differential expression analysis.
Last updated
snpgenomicvariationsomaticmutationgeneticsclassificationclusteringdnaseqmicroarraymultiplecomparison
5.08 score 1 stars 4 scripts 358 downloads
anota2seq - Generally applicable transcriptome-wide analysis of translational efficiency using anota2seq
anota2seq provides analysis of translational efficiency and differential expression analysis for polysome-profiling and ribosome-profiling studies (two or more sample classes) quantified by RNA sequencing or DNA-microarray. Polysome-profiling and ribosome-profiling typically generate data for two RNA sources: translated mRNA and total mRNA. Analysis of differential expression is used to estimate changes within each RNA source (i.e. translated mRNA or total mRNA). Analysis of translational efficiency aims to identify changes in translation efficiency leading to altered protein levels that are independent of total mRNA levels (i.e. changes in translated mRNA that are independent of levels of total mRNA) or buffering, a mechanism regulating translational efficiency so that protein levels remain constant despite fluctuating total mRNA levels (i.e. changes in total mRNA that are independent of levels of translated mRNA). anota2seq applies analysis of partial variance and the random variance model to fulfill these tasks.
Last updated
immunooncologygeneexpressiondifferentialexpressionmicroarraygenomewideassociationbatcheffectnormalizationrnaseqsequencinggeneregulationregression
5.04 score 1 dependents 23 scripts 594 downloads
plotGrouper - Shiny app GUI wrapper for ggplot with built-in statistical analysis
A shiny app-based GUI wrapper for ggplot with built-in statistical analysis. Import data from file and use dropdown menus and checkboxes to specify the plotting variables, graph type, and look of your plots. Once created, plots can be saved independently or stored in a report that can be saved as a pdf. If new data are added to the file, the report can be refreshed to include new data. Statistical tests can be selected and added to the graphs. Analysis of flow cytometry data is especially integrated with plotGrouper. Count data can be transformed to return the absolute number of cells in a sample (this feature requires inclusion of the number of beads per sample and information about any dilution performed).
Last updated
immunooncologyflowcytometrygraphandnetworkstatisticalmethoddataimportguimultiplecomparisonbioconductorggplot2plottingshiny
5.02 score 7 stars 10 scripts 320 downloads
Sconify - A toolkit for performing KNN-based statistics for flow and mass cytometry data
This package does k-nearest neighbor based statistics and visualizations with flow and mass cytometry data. This gives tSNE maps "fold change" functionality and provides a data quality metric by assessing manifold overlap between fcs files expected to be the same. Other applications using this package include imputation, marker redundancy, and testing the relative information loss of lower dimension embeddings compared to the original manifold.
Last updated
immunooncologysinglecellflowcytometrysoftwaremultiplecomparisonvisualization
5.02 score 14 scripts 410 downloads
SIMD - Statistical Inferences with MeDIP-seq Data (SIMD) to infer the methylation level for each CpG site
This package provides an inferential analysis method for detecting differentially expressed CpG sites in MeDIP-seq data. It uses a statistical framework and the EM algorithm to identify differentially expressed CpG sites. The methods in this package are described in the article 'Methylation-level Inferences and Detection of Differential Methylation with Medip-seq Data' by Yan Zhou, Jiadi Zhu, Mingtao Zhao, Baoxue Zhang, Chunfu Jiang and Xiyan Yang (2018, pending publication).
Last updated
immunooncologydifferentialmethylationsinglecelldifferentialexpression
5.00 score 2 scripts 342 downloads
sigFeature - sigFeature: Significant feature selection using SVM-RFE & t-statistic
This package provides a novel feature selection algorithm for binary classification using support vector machine recursive feature elimination (SVM-RFE) and the t-statistic. In this feature selection process, the selected features are differentially significant between the two classes and are also good classifiers with a higher degree of classification accuracy.
Last updated
featureextractiongeneexpressionmicroarraytranscriptionmrnamicroarraygenepredictionnormalizationclassificationsupportvectormachine
5.00 score 25 scripts 600 downloads
GARS - GARS: Genetic Algorithm for the identification of Robust Subsets of variables in high-dimensional and challenging datasets
Feature selection aims to identify and remove redundant, irrelevant and noisy variables from high-dimensional datasets. Selecting informative features affects the subsequent classification and regression analyses by improving their overall performance. Several methods have been proposed to perform feature selection: most of them rely on univariate statistics, correlation, entropy measurements or the usage of backward/forward regressions. Herein, we propose an efficient, robust and fast method that adopts stochastic optimization approaches for high-dimensional data. GARS is an innovative implementation of a genetic algorithm that selects robust features in high-dimensional and challenging datasets.
Last updated
classificationfeatureextractionclusteringopenjdk
5.00 score 3 scripts 385 downloads
RVS - Computes estimates of the probability of related individuals sharing a rare variant
Rare Variant Sharing (RVS) implements tests of association and linkage between rare genetic variant genotypes and a dichotomous phenotype, e.g. a disease status, in family samples. The tests are based on probabilities of rare variant sharing by relatives under the null hypothesis of absence of linkage and association between the rare variants and the phenotype and apply to single variants or multiple variants in a region (e.g. gene-based test).
Last updated
immunooncologygeneticsgenomewideassociationvariantdetectionexomeseqwholegenome
4.98 score 16 scripts 350 downloads
ChIPanalyser - ChIPanalyser: Predicting Transcription Factor Binding Sites
ChIPanalyser is a package to predict and understand TF binding by utilizing a statistical thermodynamic model. The model incorporates 4 main factors thought to drive TF binding: Chromatin State, Binding energy, Number of bound molecules and a scaling factor modulating TF binding affinity. Taken together, ChIPanalyser produces ChIP-like profiles that closely mimic the patterns seen in real ChIP-seq data.
Last updated
softwarebiologicalquestionworkflowsteptranscriptionsequencingchiponchipcoveragealignmentchipseqsequencematchingdataimportpeakdetection
4.95 score 30 scripts 440 downloads
transcriptogramer - Transcriptional analysis based on transcriptograms
R package for transcriptional analysis based on transcriptograms, a method to analyze transcriptomes that projects expression values on a set of ordered proteins, arranged such that the probability that gene products participate in the same metabolic pathway exponentially decreases with the increase of the distance between two proteins of the ordering. Transcriptograms are, hence, genome wide gene expression profiles that provide a global view for the cellular metabolism, while indicating gene sets whose expressions are altered.
Last updated
softwarenetworkvisualizationsystemsbiologygeneexpressiongenesetenrichmentgraphandnetworkclusteringdifferentialexpressionmicroarrayrnaseqtranscriptionimmunooncology
4.95 score 4 stars 14 scripts 427 downloads
evaluomeR - Evaluation of Bioinformatics Metrics
Evaluating the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.
Last updated
clusteringclassificationfeatureextractionassessmentclustering-evaluationevaluomeevaluomermetrics
4.92 score 42 scripts 374 downloads
DepecheR - Determination of essential phenotypic elements of clusters in high-dimensional entities
The purpose of this package is to identify traits in a dataset that can separate groups. This is done on two levels. First, clustering is performed, using an implementation of sparse K-means. Secondly, the generated clusters are used to predict outcomes of groups of individuals based on their distribution of observations in the different clusters. As certain clusters with separating information will be identified, and these clusters are defined by a sparse number of variables, this method can reduce the complexity of data, to only emphasize the data that actually matters.
Last updated
softwarecellbasedassaystranscriptiondifferentialexpressiondatarepresentationimmunooncologytranscriptomicsclassificationclusteringdimensionreductionfeatureextractionflowcytometryrnaseqsinglecellvisualizationcpp
4.92 score 21 scripts 489 downloads
IMMAN - Interlog protein network reconstruction by Mapping and Mining ANalysis
Reconstructs an Interlog Protein Network (IPN) integrated from several Protein-Protein Interaction Networks (PPINs). With this package, different PPINs can be overlaid to mine conserved common networks between diverse species.
Last updated
sequencematchingalignmentsystemsbiologygraphandnetworknetworkproteomics
4.90 score 3 scripts 352 downloads
branchpointer - Prediction of intronic splicing branchpoints
Predicts branchpoint probability for sites in intronic branchpoint windows. Queries can be supplied as intronic regions or, to evaluate the effects of mutations, as SNPs.
Last updated
softwaregenomeannotationgenomicvariationmotifannotation
4.89 score 26 scripts 404 downloads
AMARETTO - Regulatory Network Inference and Driver Gene Evaluation using Integrative Multi-Omics Analysis and Penalized Regression
Integrating an increasing number of available multi-omics cancer data remains one of the main challenges to improve our understanding of cancer. One of the main challenges is using multi-omics data for identifying novel cancer driver genes. We have developed an algorithm, called AMARETTO, that integrates copy number, DNA methylation and gene expression data to identify a set of driver genes by analyzing cancer samples and connects them to clusters of co-expressed genes, which we define as modules. We applied AMARETTO in a pancancer setting to identify cancer driver genes and their modules on multiple cancer sites. AMARETTO captures modules enriched in angiogenesis, cell cycle and EMT, and modules that accurately predict survival and molecular subtypes. This allows AMARETTO to identify novel cancer driver genes directing canonical cancer pathways.
Last updated
statisticalmethoddifferentialmethylationgeneregulationgeneexpressionmethylationarraytranscriptionpreprocessingbatcheffectdataimportmrnamicroarraymicrornaarrayregressionclusteringrnaseqcopynumbervariationsequencingmicroarraynormalizationnetworkbayesianexonarrayonechanneltwochannelproprietaryplatformsalternativesplicingdifferentialexpressiondifferentialsplicinggenesetenrichmentmultiplecomparisonqualitycontroltimecourse
4.88 score 15 scripts 510 downloads
scmeth - Functions to conduct quality control analysis in methylation data
Functions to analyze methylation data can be found here. Some functions are relevant for single cell methylation data but most other functions can be used for any methylation data. The highlight of this workflow is the comprehensive quality control report.
Last updated
dnamethylationqualitycontrolpreprocessingsinglecellimmunooncologybioconductor-packagemethylationsingle-cell-methylation
4.88 score 8 scripts 332 downloads
methylInheritance - Permutation-Based Analysis associating Conserved Differentially Methylated Elements Across Multiple Generations to a Treatment Effect
Permutation analysis, based on Monte Carlo sampling, for testing the hypothesis that the number of conserved differentially methylated elements between several generations is associated with an effect inherited from a treatment and that a stochastic effect can be dismissed.
Last updated
biologicalquestionepigeneticsdnamethylationdifferentialmethylationmethylseqsoftwareimmunooncologystatisticalmethodwholegenomesequencinganalysisbioconductorbioinformaticscpgdifferentially-methylated-elementsinheritancemonte-carlo-samplingpermutation
4.88 score 6 scripts 452 downloads
TFutils - TFutils
This package helps users to work with TF metadata from various sources. Significant catalogs of TFs and classifications thereof are made available. Tools for working with motif scans are also provided.
Last updated
transcriptomics
4.84 score 23 scripts 418 downloads
ASICS - Automatic Statistical Identification in Complex Spectra
With a set of pure metabolite reference spectra, ASICS quantifies concentration of metabolites in a complex spectrum. The identification of metabolites is performed by fitting a mixture model to the spectra of the library with a sparse penalty. The method and its statistical properties are described in Tardivel et al. (2017) <doi:10.1007/s11306-017-1244-5>.
Last updated
softwaredataimportcheminformaticsmetabolomics
4.79 score 31 scripts 654 downloads
NBAMSeq - Negative Binomial Additive Model for RNA-Seq Data
High-throughput sequencing experiments followed by differential expression analysis is a widely used approach to detect genomic biomarkers. A fundamental step in differential expression analysis is to model the association between gene counts and covariates of interest. NBAMSeq is a flexible statistical model based on the generalized additive model that allows for information sharing across genes in variance estimation.
Last updated
rnaseqdifferentialexpressiongeneexpressionsequencingcoveragedifferential-expressiongene-expressiongeneralized-additive-modelsgeneralized-linear-modelsnegative-binomial-regressionsplines
4.78 score 2 stars 4 scripts 346 downloads
SCBN - A statistical normalization method and differential expression analysis for RNA-seq data between different species
This package provides a scale based normalization (SCBN) method to identify genes with differential expression between different species. It takes into account the available knowledge of conserved orthologous genes and the hypothesis testing framework to detect differentially expressed orthologous genes. The methods in this package are described in the article 'A statistical normalization method and differential expression analysis for RNA-seq data between different species' by Yan Zhou, Jiadi Zhu, Tiejun Tong, Junhui Wang, Bingqing Lin, Jun Zhang (2018, pending publication).
Last updated
differentialexpressiongeneexpressionnormalization
4.78 score 1 dependents 7 scripts 307 downloads
MetaCyto - MetaCyto: A package for meta-analysis of cytometry data
This package provides functions for preprocessing, automated gating and meta-analysis of cytometry data. It also provides functions that facilitate the collection of cytometry data from the ImmPort database.
Last updated
immunooncologycellbiologyflowcytometryclusteringstatisticalmethodsoftwarecellbasedassayspreprocessing
4.78 score 20 scripts 386 downloads
MSstatsQC - Longitudinal system suitability monitoring and quality control for proteomic experiments
MSstatsQC is an R package which provides longitudinal system suitability monitoring and quality control tools for proteomic experiments.
Last updated
softwarequalitycontrolproteomicsmassspectrometrynormalization
4.78 score 1 dependents 8 scripts 454 downloads
GA4GHshiny - Shiny application for interacting with GA4GH-based data servers
The GA4GHshiny package provides an easy way to interact with data servers based on the Global Alliance for Genomics and Health (GA4GH) genomics API through a Shiny application. It also integrates with Beacon Network.
Last updated
gui
4.78 score 2 stars 3 scripts 360 downloads
phenopath - Genomic trajectories with heterogeneous genetic and environmental backgrounds
PhenoPath infers genomic trajectories (pseudotimes) in the presence of heterogeneous genetic and environmental backgrounds and tests for interactions between them.
Last updated
immunooncologyrnaseqgeneexpressionbayesiansinglecellprincipalcomponentcpp
4.76 score 58 scripts 464 downloads
iasva - Iteratively Adjusted Surrogate Variable Analysis
Iteratively Adjusted Surrogate Variable Analysis (IA-SVA) is a statistical framework to uncover hidden sources of variation even when these sources are correlated. IA-SVA provides a flexible methodology to i) identify a hidden factor for unwanted heterogeneity while adjusting for all known factors; ii) test the significance of the putative hidden factor for explaining the unmodeled variation in the data; and iii), if significant, use the estimated factor as an additional known factor in the next iteration to uncover further hidden factors.
Last updated
preprocessingqualitycontrolbatcheffectrnaseqsoftwarestatisticalmethodfeatureextractionimmunooncology
4.72 score 52 scripts 380 downloads
consensus - Cross-platform consensus analysis of genomic measurements via interlaboratory testing method
An implementation of the American Society for Testing and Materials (ASTM) Standard E691 for interlaboratory testing procedures, designed for cross-platform genomic measurements. Given three (3) or more genomic platforms or laboratory protocols, this package provides interlaboratory testing procedures giving per-locus comparisons for sensitivity and precision between platforms.
Last updated
qualitycontrolregressiondatarepresentationgeneexpressionmicroarrayrnaseq
4.70 score 9 scripts 372 downloads
topdownr - Investigation of Fragmentation Conditions in Top-Down Proteomics
The topdownr package allows automatic and systemic investigation of fragment conditions. It creates Thermo Orbitrap Fusion Lumos method files to test hundreds of fragmentation conditions. Additionally it provides functions to analyse and process the generated MS data and determine the best conditions to maximise overall fragment coverage.
Last updated
immunooncologyinfrastructureproteomicsmassspectrometrycoverage
4.70 score 412 downloads
ANF - Affinity Network Fusion for Complex Patient Clustering
This package is used for complex patient clustering by integrating multi-omic data through affinity network fusion.
Last updated
clusteringgraphandnetworknetwork
4.70 score 25 scripts 488 downloads
rmelting - R Interface to MELTING 5
R interface to the MELTING 5 program (https://www.ebi.ac.uk/biomodels/tools/melting/) to compute melting temperatures of nucleic acid duplexes along with other thermodynamic parameters.
Last updated
biomedicalinformaticscheminformaticsbioconductorbioinformaticsmelting-temperatureopenjdk
4.68 score 2 stars 12 scripts 348 downloads
mdp - Molecular Degree of Perturbation calculates scores for transcriptome data samples based on their perturbation from controls
The Molecular Degree of Perturbation webtool quantifies the heterogeneity of samples. It takes a data.frame of omic data that contains at least two classes (control and test) and assigns a score to all samples based on how perturbed they are compared to the controls. It is based on the Molecular Distance to Health (Pankla et al. 2009), and expands on this algorithm by adding the options to calculate the z-score using the modified z-score (using median absolute deviation), change the z-score zeroing threshold, and look at genes that are most perturbed in the test versus control classes.
Last updated
biomedicalinformaticsqualitycontroltranscriptomicssystemsbiologymicroarray
4.68 score 16 scripts 465 downloads
epivizrChart - R interface to epiviz web components
This package provides an API for interactive visualization of genomic data using epiviz web components. Objects in R/Bioconductor can be used to generate interactive R markdown/notebook documents or can be visualized in RStudio's default viewer.
Last updated
visualizationgui
4.68 score 12 scripts 410 downloads
msgbsR - msgbsR: methylation sensitive genotyping by sequencing (MS-GBS) R functions
Pipeline for the analysis of an MS-GBS experiment.
Last updated
immunooncologydifferentialmethylationdataimportepigeneticsmethylseq
4.65 score 1 scripts 373 downloads
GA4GHclient - A Bioconductor package for accessing GA4GH API data servers
GA4GHclient provides an easy way to access public data servers through Global Alliance for Genomics and Health (GA4GH) genomics API. It provides low-level access to GA4GH API and translates response data into Bioconductor-based class objects.
Last updated
datarepresentationthirdpartyclient
4.65 score 1 stars 1 dependents 3 scripts 364 downloads
panelcn.mops - CNV detection tool for targeted NGS panel data
CNV detection tool for targeted NGS panel data. Extension of the cn.mops package.
Last updated
sequencingcopynumbervariationcellbiologygenomicvariationvariantdetectiongenetics
4.64 score 22 scripts 372 downloads
mCSEA - Methylated CpGs Set Enrichment Analysis
Identification of differentially methylated regions (DMRs) in predefined regions (promoters, CpG islands...) from the human genome using Illumina's 450K or EPIC microarray data. Provides methods to rank CpG probes based on linear models and includes plotting functions.
Last updated
immunooncologydifferentialmethylationdnamethylationepigeneticsgeneticsgenomeannotationmethylationarraymicroarraymultiplecomparisontwochannel
4.60 score 20 scripts 452 downloads
IsoCorrectoR - Correction for natural isotope abundance and tracer purity in MS and MS/MS data from stable isotope labeling experiments
IsoCorrectoR performs the correction of mass spectrometry data from stable isotope labeling/tracing metabolomics experiments with regard to natural isotope abundance and tracer impurity. Data from both MS and MS/MS measurements can be corrected (with any tracer isotope: 13C, 15N, 18O...), as well as ultra-high resolution MS data from multiple-tracer experiments (e.g. 13C and 15N used simultaneously). See the Bioconductor package IsoCorrectoRGUI for a graphical user interface to IsoCorrectoR. NOTE: With R version 4.0.0, writing correction results to Excel files may currently not work on Windows. However, writing results to csv works as before.
Last updated
softwaremetabolomicsmassspectrometrypreprocessingimmunooncology
4.56 score 1 dependents 7 scripts 419 downloads
InTAD - Search for correlation between epigenetic signals and gene expression in TADs
The package is focused on the detection of correlation between expressed genes and selected epigenomic signals (i.e. enhancers obtained from ChIP-seq data) either within topologically associated domains (TADs) or between chromatin contact loop anchors. Various parameters can be controlled to investigate the influence of external factors and visualization plots are available for each analysis step.
Last updated
epigeneticssequencingchipseqrnaseqhicgeneexpressionimmunooncology
4.56 score 12 scripts 386 downloads
openPrimeR - Multiplex PCR Primer Design and Analysis
An implementation of methods for designing, evaluating, and comparing primer sets for multiplex PCR. Primers are designed by solving a set cover problem such that the number of covered template sequences is maximized with the smallest possible set of primers. To guarantee that high-quality primers are generated, only primers fulfilling constraints on their physicochemical properties are selected. A Shiny app providing a user interface for the functionalities of this package is provided by the 'openPrimeRui' package.
Last updated
softwaretechnologycoveragemultiplecomparison
4.54 score 23 scripts 402 downloads
cydar - Using Mass Cytometry for Differential Abundance Analyses
Identifies differentially abundant populations between samples and groups in mass cytometry data. Provides methods for counting cells into hyperspheres, controlling the spatial false discovery rate, and visualizing changes in abundance in the high-dimensional marker space.
Last updated
immunooncologyflowcytometrymultiplecomparisonproteomicssinglecellcpp
4.53 score 56 scripts 438 downloads
ERSSA - Empirical RNA-seq Sample Size Analysis
The ERSSA package takes a user-supplied RNA-seq differential expression dataset and calculates the number of differentially expressed genes at varying biological replicate levels. This allows the user to determine, without relying on any a priori assumptions, whether sufficient differential detection has been achieved with their RNA-seq dataset.
Last updated
immunooncologygeneexpressiontranscriptiondifferentialexpressionrnaseqmultiplecomparisonqualitycontrol
4.48 score 1 stars 1 scripts 377 downloads
primirTSS - Prediction of pri-miRNA Transcription Start Site
A fast, convenient tool to identify the TSSs of miRNAs by integrating the data of H3K4me3 and Pol II as well as combining the conservation level and sequence feature, provided within both command-line and graphical interfaces, which achieves a better performance than the previous non-cell-specific methods on miRNA TSSs.
Last updated
immunooncologysequencingrnaseqgeneticspreprocessingtranscriptiongeneregulation
4.48 score 2 scripts 332 downloads
RSeqAn - R SeqAn
Headers and some wrapper functions from the SeqAn C++ library for ease of usage in R.
Last updated
infrastructuresoftwarecpp
4.48 score 3 stars 2 scripts 404 downloads
ddPCRclust - Clustering algorithm for ddPCR data
The ddPCRclust algorithm can automatically quantify the CPDs of non-orthogonal ddPCR reactions with up to four targets. In order to determine the correct droplet count for each target, it is crucial to both identify all clusters and label them correctly based on their position. For more information on what data can be analyzed and how a template needs to be formatted, please check the vignette.
Last updated
ddpcrclusteringbiological-data-analysis
4.48 score 3 stars 4 scripts 417 downloads
loci2path - Loci2path: regulatory annotation of genomic intervals based on tissue-specific expression QTLs
loci2path performs statistically rigorous enrichment analysis of eQTLs in genomic regions of interest, using eQTL collections provided by the Genotype-Tissue Expression (GTEx) project and pathway collections from MSigDB.
Last updated
functionalgenomicsgeneticsgenesetenrichmentsoftwaregeneexpressionsequencingcoveragebiocarta
4.48 score 1 stars 7 scripts 333 downloads
gep2pep - Creation and Analysis of Pathway Expression Profiles (PEPs)
Pathway Expression Profiles (PEPs) are based on the expression of pathways (defined as sets of genes) as opposed to individual genes. This package converts gene expression profiles to PEPs and performs enrichment analysis of both pathways and experimental conditions, such as "drug set enrichment analysis" and "gene2drug" drug discovery analysis respectively.
Last updated
geneexpressiondifferentialexpressiongenesetenrichmentdimensionreductionpathwaysgo
4.48 score 6 scripts 332 downloads
PPInfer - Inferring functionally related proteins using protein interaction networks
Interactions between proteins occur in many, if not most, biological processes. Most proteins perform their functions in networks associated with other proteins and other biomolecules. This fact has motivated the development of a variety of experimental methods for the identification of protein interactions. This variety has in turn ushered in the development of numerous different computational approaches for modeling and predicting protein interactions. Sometimes an experiment is aimed at identifying proteins closely related to some interesting proteins. A network based statistical learning method is used to infer the putative functions of proteins from the known functions of its neighboring proteins on a PPI network. This package identifies such proteins often involved in the same or similar biological functions.
Last updated
softwarestatisticalmethodnetworkgraphandnetworkgenesetenrichmentnetworkenrichmentpathways
4.48 score 1 dependents 6 scripts 417 downloads
slalom - Factorial Latent Variable Modeling of Single-Cell RNA-Seq Data
slalom is a scalable modelling framework for single-cell RNA-seq data that uses gene set annotations to dissect single-cell transcriptome heterogeneity, thereby allowing users to identify biological drivers of cell-to-cell variability and model confounding factors. The method uses Bayesian factor analysis with a latent variable model to identify active pathways (selected by the user, e.g. KEGG pathways) that explain variation in a single-cell RNA-seq dataset. This is an R/C++ implementation of the f-scLVM Python package. See the publication describing the method at https://doi.org/10.1186/s13059-017-1334-8.
Last updated
immunooncologysinglecellrnaseqnormalizationvisualizationdimensionreductiontranscriptomicsgeneexpressionsequencingsoftwarereactomekeggopenblascpp
4.38 score 16 scripts 342 downloads
scoreInvHap - Get inversion status in predefined regions
scoreInvHap determines the inversion status of samples for known inversions. scoreInvHap uses SNP data as input and requires the following information about the inversion: genotype frequencies in the different haplotypes, R2 between the region SNPs and inversion status, and heterozygote genotypes in the reference. The package includes this data for 21 inversions.
Last updated
snpgeneticsgenomicvariation
4.38 score 12 scripts 417 downloads
GeneStructureTools - Tools for spliced gene structure manipulation and analysis
GeneStructureTools can be used to create in silico alternative splicing events, and analyse potential effects this has on functional gene products.
Last updated
immunooncologysoftwaredifferentialsplicingfunctionalpredictiontranscriptomicsalternativesplicingrnaseq
4.34 score 22 scripts 444 downloads
XINA - Multiplexes Isobaric Mass Tagged-based Kinetics Data for Network Analysis
The aim of XINA is to determine which proteins exhibit similar patterns within and across experimental conditions, since proteins with co-abundance patterns may have common molecular functions. XINA imports multiple datasets, tags the datasets in silico, and combines the data for subsequent subgrouping into multiple clusters. The result is a single output depicting the variation across all conditions. XINA not only extracts co-abundance profiles within and across experiments, but also incorporates protein-protein interaction databases and integrative resources such as KEGG to infer interactors and molecular functions, respectively, and produces intuitive graphical outputs.
Last updated
systemsbiologyproteomicsrnaseqnetwork
4.30 score 6 scripts 454 downloads
GIGSEA - Genotype Imputed Gene Set Enrichment Analysis
We presented the Genotype-imputed Gene Set Enrichment Analysis (GIGSEA), a novel method that uses GWAS-and-eQTL-imputed trait-associated differential gene expression to interrogate gene set enrichment for the trait-associated SNPs. By incorporating eQTL from large gene expression studies, e.g. GTEx, GIGSEA appropriately addresses such challenges for SNP enrichment as gene size, gene boundary, SNP distal regulation, and multiple-marker regulation. The weighted linear regression model, taking as weights both imputation accuracy and model completeness, was used to perform the enrichment test, properly adjusting for the bias due to redundancy in different gene sets. The permutation test, furthermore, is used to evaluate the significance of enrichment, and its efficiency can be greatly improved by expressing the computationally intensive part in terms of large matrix operations. We have shown appropriate type I error rates for GIGSEA (<5%), and the preliminary results also demonstrate its good performance in uncovering the real signal.
Last updated
genesetenrichmentsnpvariantannotationgeneexpressiongeneregulationregressiondifferentialexpression
4.30 score 8 scripts 384 downloads
iCNV - Integrated Copy Number Variation detection
Integrative copy number variation (CNV) detection from multiple platforms and experimental designs.
Last updated
immunooncologyexomeseqwholegenomesnpcopynumbervariationhiddenmarkovmodel
4.30 score 5 scripts 403 downloads
omicRexposome - Exposome and omic data association and integration analysis
omicRexposome systematizes the association evaluation between exposures and omic data, taking advantage of MultiDataSet for coordinated data management, rexposome for exposome data definition and limma for association testing. It also performs data integration, mixing exposome and omic data using multiple co-inertia analysis (omicade4) and multi-canonical correlation analysis (PMA).
Last updated
immunooncologyworkflowstepmultiplecomparisonvisualizationgeneexpressiondifferentialexpressiondifferentialmethylationgeneregulationepigeneticsproteomicstranscriptomicsstatisticalmethodregression
4.30 score 5 scripts 434 downloads
OPWeight - Optimal p-value weighting with independent information
This package performs weighted p-value based multiple hypothesis testing and provides corresponding information such as ranking probability, weight, significant tests, etc. To conduct this testing procedure, the method applies a probabilistic relationship between the test rank and the corresponding test effect size.
Last updated
immunooncologybiomedicalinformaticsmultiplecomparisonregressionrnaseqsnp
4.30 score 2 stars 4 scripts 360 downloads
rqt - rqt: utilities for gene-level meta-analysis
Despite recent advances in modern GWAS methods, calculating an effect size and corresponding p-value for a whole gene rather than for a single variant remains an important problem. The R package rqt offers gene-level GWAS meta-analysis. For more information, see: "Gene-set association tests for next-generation sequencing data" by Lee et al (2016), Bioinformatics, 32(17), i611-i619, <doi:10.1093/bioinformatics/btw429>.
Last updated
genomewideassociationregressionsurvivalprincipalcomponentstatisticalmethodsequencing
4.30 score 2 stars 4 scripts 346 downloads
CellTrails - Reconstruction, visualization and analysis of branching trajectories
CellTrails is an unsupervised algorithm for the de novo chronological ordering, visualization and analysis of single-cell expression data. CellTrails makes use of a geometrically motivated concept of lower-dimensional manifold learning, which exhibits a multitude of virtues that counteract intrinsic noise of single cell data caused by drop-outs, technical variance, and redundancy of predictive variables. CellTrails enables the reconstruction of branching trajectories and provides an intuitive graphical representation of expression patterns along all branches simultaneously. It allows the user to define and infer the expression dynamics of individual and multiple pathways towards distinct phenotypes.
Last updated
immunooncologyclusteringdatarepresentationdifferentialexpressiondimensionreductiongeneexpressionsequencingsinglecellsoftwaretimecourse
4.29 score 13 scripts 426 downloads
IntEREst - Intron-Exon Retention Estimator
This package performs Intron-Exon Retention analysis on RNA-seq data (.bam files).
Last updated
softwarealternativesplicingcoveragedifferentialsplicingsequencingrnaseqalignmentnormalizationdifferentialexpressionimmunooncology
4.23 score 21 scripts 432 downloads
divergence - Divergence: Functionality for assessing omics data by divergence with respect to a baseline
This package provides functionality for performing divergence analysis as presented in Dinalankara et al, "Digitizing omics profiles by divergence from a baseline", PNAS 2018. This allows the user to simplify high-dimensional omics data into a binary or ternary format which encapsulates how the data diverges from a specified baseline group with the same univariate or multivariate features.
Last updated
softwarestatisticalmethod
4.20 score 16 scripts 368 downloads
ivygapSE - A SummarizedExperiment for Ivy-GAP data
Define a SummarizedExperiment and exploratory app for Ivy-GAP glioblastoma image, expression, and clinical data.
Last updated
transcriptionsoftwarevisualizationsurvivalgeneexpressionsequencing
4.20 score 16 scripts 417 downloads
pram - Pooling RNA-seq datasets for assembling transcript models
Publicly available RNA-seq data is routinely used for retrospective analysis to elucidate new biology. Novel transcript discovery enabled by large collections of RNA-seq datasets has emerged as one such analysis. To increase the power of transcript discovery from large collections of RNA-seq datasets, we developed a new R package named Pooling RNA-seq and Assembling Models (PRAM), which builds transcript models in intergenic regions from pooled RNA-seq datasets. This package includes functions for defining intergenic regions, extracting and pooling related RNA-seq alignments, and predicting, selecting, and evaluating transcript models.
Last updated
softwaretechnologysequencingrnaseqbiologicalquestiongenepredictiongenomeannotationresearchfieldtranscriptomicsbioconductor-packagegenome-annotationrna-seqtranscript-model
4.18 score 1 stars 3 scripts 365 downloads
BiFET - Bias-free Footprint Enrichment Test
BiFET identifies TFs whose footprints are over-represented in target regions compared to background regions after correcting for the bias arising from the imbalance in read counts and GC contents between the target and background regions. For a given TF k, BiFET tests the null hypothesis that the target regions have the same probability of having footprints for the TF k as the background regions while correcting for the read count and GC content bias. For this, we use the number of target regions with footprints for TF k, t_k as a test statistic and calculate the p-value as the probability of observing t_k or more target regions with footprints under the null hypothesis.
Last updated
immunooncologygeneticsepigeneticstranscriptiongeneregulationatacseqdnaseseqripseqsoftware
4.18 score 4 scripts 431 downloads
methimpute - Imputation-guided re-construction of complete methylomes from WGBS data
This package implements functions for calling methylation for all cytosines in the genome.
Last updated
immunooncologysoftwarednamethylationepigeneticshiddenmarkovmodelsequencingcoveragecppopenmp
4.18 score 15 scripts 436 downloads
banocc - Bayesian ANalysis Of Compositional Covariance
BAnOCC is a package designed for compositional data, where each sample sums to one. It infers the approximate covariance of the unconstrained data using a Bayesian model coded with `rstan`. It provides as output the `stanfit` object as well as posterior median and credible interval estimates for each correlation element.
Last updated
immunooncologymetagenomicssoftwarebayesian
4.15 score 14 scripts 452 downloads
BiocSklearn - interface to python sklearn via Rstudio reticulate
This package provides interfaces to selected sklearn elements, and demonstrates fault tolerant use of python modules requiring extensive iteration.
Last updated
statisticalmethoddimensionreductioninfrastructure
4.04 score 11 scripts 424 downloads
BLMA - BLMA: A package for bi-level meta-analysis
Suite of tools for bi-level meta-analysis. The package can be used in a wide range of applications, including general hypothesis testing, differential expression analysis, functional analysis, and pathway analysis.
Last updated
genesetenrichmentpathwaysdifferentialexpressionmicroarray
4.01 score 51 scripts 408 downloads
survtype - Subtype Identification with Survival Data
Subtypes are defined as groups of samples that have distinct molecular and clinical features. Genomic data can be analyzed to discover patient subtypes associated with clinical data, especially survival information. This package aims to identify subtypes that are both clinically relevant and biologically meaningful.
Last updated
softwarestatisticalmethodgeneexpressionsurvivalclusteringsequencingcoverage
4.00 score 9 scripts 346 downloads
doseR - doseR
doseR is a next-generation sequencing package for sex chromosome dosage compensation which can be applied broadly to detect shifts in gene expression among an arbitrary number of pre-defined groups of loci. doseR is a differential gene expression package for count data that detects directional shifts in expression for multiple, specific subsets of genes, with broad utility in systems biology research. doseR has been prepared to manage the nature of the data and the desired set of inferences. doseR uses S4 classes to store count data from sequencing experiments. It contains functions to normalize and filter count data, as well as to plot and calculate statistics of count data. It contains a framework for linear modeling of count data. The package has been tested using real and simulated data.
Last updated
infrastructuresoftwaredatarepresentationsequencinggeneexpressionsystemsbiologydifferentialexpression
4.00 score 9 scripts 375 downloads
HIREewas - Detection of cell-type-specific risk-CpG sites in epigenome-wide association studies
In epigenome-wide association studies, the measured signals for each sample are a mixture of methylation profiles from different cell types. Current approaches to association detection only claim whether a cytosine-phosphate-guanine (CpG) site is associated with the phenotype or not, but they cannot determine the cell type in which the risk-CpG site is affected by the phenotype. We propose a statistical method, HIgh REsolution (HIRE), which not only substantially improves the power of association detection at the aggregated level compared to existing methods but also enables the detection of risk-CpG sites for individual cell types. The "HIREewas" R package implements the HIRE model in R.
Last updated
dnamethylationdifferentialmethylationfeatureextraction
4.00 score 1 scripts 338 downloads
KinSwingR - KinSwingR: network-based kinase activity prediction
KinSwingR integrates phosphosite data derived from mass-spectrometry data and kinase-substrate predictions to predict kinase activity. Several functions allow the user to build PWM models of kinase-substrates, statistically infer PWM:substrate matches, and integrate these data to infer kinase activity.
Last updated
proteomicssequencematchingnetwork
4.00 score 9 scripts 312 downloads
AffiXcan - A Functional Approach To Impute Genetically Regulated Expression
Impute a GReX (Genetically Regulated Expression) for a set of genes in a sample of individuals, using a method based on the Total Binding Affinity (TBA). Statistical models to impute GReX can be trained with a training dataset where the real total expression values are known.
Last updated
geneexpressiontranscriptiongeneregulationdimensionreductionregressionprincipalcomponent
4.00 score 512 downloads
REBET - The subREgion-based BurdEn Test (REBET)
There is an increasing focus on investigating the association between rare variants and diseases. The REBET package implements the subREgion-based BurdEn Test, which is a powerful burden test that simultaneously identifies susceptibility loci and sub-regions.
Last updated
softwarevariantannotationsnp
4.00 score 2 scripts 353 downloads
BUScorrect - Batch Effects Correction with Unknown Subtypes
High-throughput experimental data are accumulating exponentially in public databases. However, mining valid scientific discoveries from these abundant resources is hampered by technical artifacts and inherent biological heterogeneity. The former are usually termed "batch effects," and the latter is often modelled by "subtypes." The R package BUScorrect fits a Bayesian hierarchical model, the Batch-effects-correction-with-Unknown-Subtypes model (BUS), to correct batch effects in the presence of unknown subtypes. BUS is capable of (a) correcting batch effects explicitly, (b) grouping samples that share similar characteristics into subtypes, (c) identifying features that distinguish subtypes, and (d) enjoying a linear-order computation complexity.
Last updated
geneexpressionstatisticalmethodbayesianclusteringfeatureextractionbatcheffect
4.00 score 2 scripts 372 downloads
nuCpos - An R package for prediction of nucleosome positions
nuCpos, a derivative of NuPoP, is an R package for prediction of nucleosome positions. nuCpos calculates local and whole nucleosomal histone binding affinity (HBA) scores for a given 147-bp sequence. Note: This package was designed to demonstrate the use of chemical maps in prediction. As the parental package NuPoP now provides chemical-map-based prediction, the function for dHMM-based prediction was removed from this package. nuCpos continues to provide functions for HBA calculation.
Last updated
geneticsepigeneticsnucleosomepositioning
4.00 score 1 scripts 376 downloads
hierinf - Hierarchical Inference
Tools to perform hierarchical inference for one or multiple studies / data sets based on high-dimensional multivariate (generalised) linear models. A possible application is to perform hierarchical inference for GWA studies to find significant groups or single SNPs (if the signal is strong) in a data-driven and automated procedure. The method is based on an efficient hierarchical multiple testing correction and controls the FWER. The functions can easily be run in parallel.
Last updated
clusteringgenomewideassociationlinkagedisequilibriumregressionsnp
4.00 score 2 scripts 299 downloads
GMRP - GWAS-based Mendelian Randomization and Path Analyses
Perform Mendelian randomization analysis of multiple SNPs to determine risk factors causing the disease under study and to exclude confounding variables, and perform path analysis to construct the path from risk factors to the disease.
Last updated
sequencingregressionsnp
4.00 score 8 scripts 360 downloads
FastqCleaner - A Shiny Application for Quality Control, Filtering and Trimming of FASTQ Files
An interactive web application for quality control, filtering and trimming of FASTQ files. This user-friendly tool combines a pipeline for data processing based on Biostrings and ShortRead infrastructure, with a cutting-edge visual environment. Single-Read and Paired-End files can be locally processed. Diagnostic interactive plots (CG content, per-base sequence quality, etc.) are provided for both the input and output files.
Last updated
qualitycontrolsequencingsoftwaresangerseqsequencematchingcpp
4.00 score 4 scripts 407 downloads
MBttest - Multiple Beta t-Tests
The MBttest method was developed from the beta t-test method of Baggerly et al. (2003). Compared to baySeq (Hardcastle and Kelly 2010), DESeq (Anders and Huber 2010), the exact test (Robinson and Smyth 2007, 2008) and the GLM of McCarthy et al. (2012), MBttest has high work efficiency, that is, it has high power, high conservativeness of FDR estimation and high stability. MBttest is suitable for transcriptomic data, tag data, and SAGE data (count data) from small samples or a few replicate libraries. It can be used to identify genes, mRNA isoforms or tags differentially expressed between two conditions.
Last updated
sequencingdifferentialexpressionmultiplecomparisonsagegeneexpressiontranscriptionalternativesplicingcoveragedifferentialsplicing
4.00 score 5 scripts 430 downloads
MSstatsQCgui - A graphical user interface for MSstatsQC package
MSstatsQCgui is a Shiny app which provides longitudinal system suitability monitoring and quality control tools for proteomic experiments.
Last updated
softwarequalitycontrolproteomicsmassspectrometrygui
4.00 score 7 scripts 314 downloads
ccfindR - Cancer Clone Finder
A collection of tools for cancer genomic data clustering analyses, including those for single cell RNA-seq. Cell clustering and feature gene selection analysis employ Bayesian (and maximum likelihood) non-negative matrix factorization (NMF) algorithm. Input data set consists of RNA count matrix, gene, and cell bar code annotations. Analysis outputs are factor matrices for multiple ranks and marginal likelihood values for each rank. The package includes utilities for downstream analyses, including meta-gene identification, visualization, and construction of rank-based trees for clusters.
Last updated
transcriptomicssinglecellimmunooncologybayesianclusteringgslcpp
4.00 score 9 scripts 413 downloads
CytoDx - Robust prediction of clinical outcomes using cytometry data without cell gating
This package provides functions that predict clinical outcomes using single cell data (such as flow cytometry data, RNA single cell sequencing data) without the requirement of cell gating or clustering.
Last updated
immunooncologycellbiologyflowcytometrystatisticalmethodsoftwarecellbasedassaysregressionclassificationsurvival
4.00 score 8 scripts 382 downloads
powerTCR - Model-Based Comparative Analysis of the TCR Repertoire
This package provides a model for the clone size distribution of the TCR repertoire. Further, it permits comparative analysis of TCR repertoire libraries based on theoretical model fits.
Last updated
softwareclusteringbiomedicalinformatics
4.00 score 4 scripts 692 downloads
TTMap - Two-Tier Mapper: a clustering tool based on topological data analysis
TTMap is a clustering method that groups together samples with the same deviation in comparison to a control group. It is especially useful when the data set is small. It is parameter-free.
Last updated
softwaremicroarraydifferentialexpressionmultiplecomparisonclusteringclassification
4.00 score 3 scripts 380 downloads
omicplotR - Visual Exploration of Omic Datasets Using a Shiny App
A Shiny app for visual exploration of omic datasets as compositions, and differential abundance analysis using ALDEx2. Useful for exploring RNA-seq, meta-RNA-seq, 16s rRNA gene sequencing with visualizations such as principal component analysis biplots (coloured using metadata for visualizing each variable), dendrograms and stacked bar plots, and effect plots (ALDEx2). Input is a table of counts and metadata file (if metadata exists), with options to filter data by count or by metadata to remove low counts, or to visualize select samples according to selected metadata.
Last updated
softwaredifferentialexpressiongeneexpressionguirnaseqdnaseqmetagenomicstranscriptomicsbayesianmicrobiomevisualizationsequencingimmunooncology
4.00 score 8 scripts 402 downloads
gsean - Gene Set Enrichment Analysis with Networks
Biological molecules in a living organism seldom work individually. They usually interact with each other in a cooperative way. Biological processes are too complicated to understand without considering such interactions. Thus, network-based procedures can be seen as powerful methods for studying complex processes. However, many methods are devised for analyzing individual genes. It is said that techniques based on biological networks such as gene co-expression are more precise ways to represent information than those using lists of genes only. This package aims to integrate gene expression and biological networks. A biological network is constructed from gene expression data and is used for Gene Set Enrichment Analysis.
Last updated
softwarestatisticalmethodnetworkgraphandnetworkgenesetenrichmentgeneexpressionnetworkenrichmentpathwaysdifferentialexpression
4.00 score 1 scripts 416 downloads
pogos - PharmacOGenomics Ontology Support
Provide simple utilities for querying bhklab PharmacoDB, modeling API outputs, and integrating to cell and compound ontologies.
Last updated
pharmacogenomicspooledscreensimmunooncology
4.00 score 10 scripts 394 downloads
tenXplore - ontological exploration of scRNA-seq of 1.3 million mouse neurons from 10x genomics
Perform ontological exploration of scRNA-seq of 1.3 million mouse neurons from 10x genomics.
Last updated
immunooncologydimensionreductionprincipalcomponenttranscriptomicssinglecell
4.00 score 10 scripts 426 downloads
seqCAT - High Throughput Sequencing Cell Authentication Toolkit
The seqCAT package uses variant calling data (in the form of VCF files) from high throughput sequencing technologies to authenticate and validate the source, function and characteristics of biological samples used in scientific endeavours.
Last updated
coveragegenomicvariationsequencingvariantannotation
4.00 score 4 scripts 423 downloads
TFARM - Transcription Factors Association Rules Miner
It searches for relevant associations of transcription factors with a transcription factor target in specific genomic regions. It also allows evaluation of the Importance Index distribution of transcription factors (and combinations of transcription factors) in association rules.
Last updated
biologicalquestioninfrastructurestatisticalmethodtranscription
4.00 score 4 scripts 358 downloads
Doscheda - A DownStream Chemo-Proteomics Analysis Pipeline
Doscheda focuses on quantitative chemoproteomics used to determine protein interaction profiles of small molecules from whole cell or tissue lysates using Mass Spectrometry data. The package provides a shiny application to run the pipeline, several visualisations and a downloadable report of an experiment.
Last updated
proteomicsnormalizationpreprocessingmassspectrometryqualitycontroldataimportregression
4.00 score 7 scripts 424 downloads
pgca - PGCA: An Algorithm to Link Protein Groups Created from MS/MS Data
Protein Group Code Algorithm (PGCA) is a computationally inexpensive algorithm to merge protein summaries from multiple experimental quantitative proteomics data. The algorithm connects two or more groups with overlapping accession numbers. In some cases, pairwise groups are mutually exclusive but they may still be connected by another group (or set of groups) with overlapping accession numbers. Thus, groups created by PGCA from multiple experimental runs (i.e., global groups) are called "connected" groups. These identified global protein groups enable the analysis of quantitative data available for protein groups instead of unique protein identifiers.
Last updated
workflowstepassaydomainproteomicsmassspectrometryimmunooncology
4.00 score 3 scripts 300 downloads
heatmaps - Flexible Heatmaps for Functional Genomics and Sequence Features
This package provides functions for plotting heatmaps of genome-wide data across genomic intervals, such as ChIP-seq signals at peaks or across promoters. Many functions are also provided for investigating sequence features.
Last updated
visualizationsequencematchingfunctionalgenomics
3.93 score 19 scripts 700 downloads
oncomix - Identifying Genes Overexpressed in Subsets of Tumors from Tumor-Normal mRNA Expression Data
This package helps identify mRNAs that are overexpressed in subsets of tumors relative to normal tissue. Ideal inputs would be paired tumor-normal data from the same tissue from many patients (>15 pairs). This unsupervised approach relies on the observation that oncogenes are characteristically overexpressed in only a subset of tumors in the population, and may help identify oncogene candidates purely based on differences in mRNA expression between previously unknown subtypes.
Last updated
geneexpressionsequencing
3.78 score 6 scripts 421 downloads
RTNsurvival - Survival analysis using transcriptional networks inferred by the RTN package
RTNsurvival integrates regulons inferred by the RTN package with survival data. For each regulon, a two-tailed GSEA framework computes a differential Enrichment Score (dES) at the individual-sample level. The resulting dES distribution across samples is then used to evaluate survival associations within the cohort. Two primary workflows are supported: (i) Cox proportional hazards models, in which regulon activities are treated as predictors of survival time, and (ii) Kaplan–Meier analyses assessing cohort stratification based on regulon activity. All graphical outputs are customizable according to user specifications.
Last updated
networkenrichmentsurvivalgeneregulationgenesetenrichmentnetworkinferencegraphandnetwork
3.78 score 15 scripts 384 downloads
ssrch - a simple search engine
Demonstrate tokenization and a search gadget for collections of CSV files.
Last updated
infrastructure
3.60 score 20 scripts 340 downloads
SDAMS - Differential Abundant/Expression Analysis for Metabolomics, Proteomics and single-cell RNA sequencing Data
This package uses a Semi-parametric Differential Abundance/expression analysis (SDA) method for metabolomics and proteomics data from mass spectrometry as well as single-cell RNA sequencing data. SDA robustly handles non-normally distributed data and provides a clear quantification of the effect size.
Last updated
immunooncologydifferentialexpressionmetabolomicsproteomicsmassspectrometrysinglecell
3.60 score 1 scripts 385 downloads
DEScan2 - Differential Enrichment Scan 2
Integrated peak and differential caller, specifically designed for broad epigenomic signals.
Last updated
immunooncologypeakdetectionepigeneticssoftwaresequencingcoveragecpp
3.60 score 4 scripts 401 downloads
TMixClust - Time Series Clustering of Gene Expression with Gaussian Mixed-Effects Models and Smoothing Splines
Implementation of a clustering method for time series gene expression data based on mixed-effects models with Gaussian variables and non-parametric cubic splines estimation. The method can robustly account for the high levels of noise present in typical gene expression time series datasets.
Last updated
softwarestatisticalmethodclusteringtimecoursegeneexpression
3.60 score 9 scripts 377 downloads
srnadiff - Finding differentially expressed unannotated genomic regions from RNA-seq data
srnadiff is a package that finds differently expressed regions from RNA-seq data at base-resolution level without relying on existing annotation. To do so, the package implements the identify-then-annotate methodology that builds on the idea of combining two pipelines approachs differential expressed regions detection and differential expression quantification. It reads BAM files as input, and outputs a list differentially regions, together with the adjusted p-values.
Last updated
immunooncologygeneexpressioncoveragesmallrnaepigeneticsstatisticalmethodpreprocessingdifferentialexpressioncpp
3.56 score 12 scripts 388 downloads
vulcan - VirtUaL ChIP-Seq data Analysis using Networks
Vulcan (VirtUaL ChIP-Seq Analysis through Networks) is a package that interrogates gene regulatory networks to infer cofactors significantly enriched in a differential binding signature coming from ChIP-Seq data. To do so, the package combines strategies from different Bioconductor packages: DESeq for data normalization, ChIPpeakAnno and DiffBind for annotation and definition of ChIP-Seq genomic peaks, csaw to define optimal peak width and viper for applying a regulatory network over a differential binding signature.
Last updated
systemsbiologynetworkenrichmentgeneexpressionchipseq
3.48 score 15 scripts 346 downloads
cbaf - Automated functions for comparing various omic data from cbioportal.org
This package contains functions that allow analysing and comparing omic data across various cancers/cancer subgroups easily. So far, it is compatible with RNA-seq, microRNA-seq, microarray and methylation datasets that are stored on cbioportal.org.
Last updated
softwareassaydomaindnamethylationgeneexpressiontranscriptionmicroarrayresearchfieldbiomedicalinformaticscomparativegenomicsepigeneticsgeneticstranscriptomics
3.48 score 2 scripts 345 downloads
Scale4C - Scale4C: an R/Bioconductor package for scale-space transformation of 4C-seq data
Scale4C is an R/Bioconductor package for scale-space transformation and visualization of 4C-seq data. The scale-space transformation is a multi-scale visualization technique to transform a 2D signal (e.g. 4C-seq reads on a genomic interval of choice) into a tessellation in the scale space (2D, genomic position x scale factor) by applying different smoothing kernels (Gauss, with increasing sigma). This transformation allows for explorative analysis and comparisons of the data's structure with other samples.
Last updated
visualizationqualitycontroldataimportsequencingcoverage
3.48 score 6 scripts 438 downloads
ADAMgui - Activity and Diversity Analysis Module Graphical User Interface
ADAMgui is a Graphical User Interface for the ADAM package. The ADAMgui package provides 2 shiny-based applications that allow the user to study the output of the ADAM package files through different plots. It's possible, for example, to choose a specific GFAG and observe the gene expression behavior with the plots created with the GFAGtargetUi function. Features such as differential expression and fold change can be easily seen with the aid of the plots made with the GFAGpathUi function.
Last updated
genesetenrichmentpathwayskegg
3.30 score 7 scripts 444 downloads
mixOmics - Omics Data Integration Project
Multivariate methods are well suited to large omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing properties of reducing the dimension of the data by using instrumental variables (components), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable better understanding of the relationships and correlation structures between the different data sets that are integrated. mixOmics offers a wide range of multivariate methods for the exploration and integration of biological datasets with a particular focus on variable selection. The package proposes several sparse multivariate models we have developed to identify the key variables that are highly correlated, and/or explain the biological outcome of interest. The data that can be analysed with mixOmics may come from high-throughput sequencing technologies, such as omics data (transcriptomics, metabolomics, proteomics, metagenomics etc.) but also beyond the realm of omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data. A non-exhaustive list of methods includes variants of generalised Canonical Correlation Analysis, sparse Partial Least Squares and sparse Discriminant Analysis. Recently we implemented integrative methods to combine multiple data sets: N-integration with variants of Generalised Canonical Correlation Analysis and P-integration with variants of multi-group Partial Least Squares.
Last updated
immunooncologymicroarraysequencingmetabolomicsmetagenomicsproteomicsgenepredictionmultiplecomparisonclassificationregressionbioconductorgenomicsgenomics-datagenomics-visualizationmultivariate-analysismultivariate-statisticsomicsr-pkgr-project
13.58 score 245 stars 25 dependents 1.9k scripts 5.7k downloads
artMS - Analytical R tools for Mass Spectrometry
artMS provides a set of tools for the analysis of proteomics label-free datasets. It takes as input the MaxQuant search result output (evidence.txt file) and performs quality control, relative quantification using MSstats, downstream analysis and integration. artMS also provides a set of functions to re-format the data and make it compatible with other analytical tools, including SAINTq, SAINTexpress, Phosfate, and PHOTON. Check [http://artms.org](http://artms.org) for details.
Last updated
proteomicsdifferentialexpressionbiomedicalinformaticssystemsbiologymassspectrometryannotationqualitycontrolgenesetenrichmentclusteringnormalizationimmunooncologymultiplecomparisonanalysisanalyticalap-msbioconductorbioinformaticsmass-spectrometryphosphoproteomicspost-translational-modificationquantitative-analysis
6.70 score 15 stars 24 scripts 459 downloads
cTRAP - Identification of candidate causal perturbations from differential gene expression data
Compare differential gene expression results with those from known cellular perturbations (such as gene knock-down, overexpression or small molecules) derived from the Connectivity Map. Such analyses allow not only to infer the molecular causes of the observed difference in gene expression but also to identify small molecules that could drive or revert specific transcriptomic alterations.
Last updated
differentialexpressiongeneexpressionrnaseqtranscriptomicspathwaysimmunooncologygenesetenrichmentbioconductorbioinformaticscmapgene-expressionl1000
5.13 score 8 stars 17 scripts 434 downloads
ADAM - ADAM: Activity and Diversity Analysis Module
ADAM is a GSEA R package created to group a set of genes from comparative samples (control versus experiment) belonging to different species according to their respective functions (Gene Ontology and KEGG pathways as default) and show their significance by calculating p-values referring to gene diversity and activity. Each group of genes is called GFAG (Group of Functionally Associated Genes).
Last updated
genesetenrichmentpathwayskegggeneexpressionmicroarraycpp
5.10 score 1 dependents 21 scripts 416 downloads
decompTumor2Sig - Decomposition of individual tumors into mutational signatures by signature refitting
Uses quadratic programming for signature refitting, i.e., to decompose the mutation catalog from an individual tumor sample into a set of given mutational signatures (either Alexandrov-model signatures or Shiraishi-model signatures), computing weights that reflect the contributions of the signatures to the mutation load of the tumor.
Last updated
softwaresnpsequencingdnaseqgenomicvariationsomaticmutationbiomedicalinformaticsgeneticsbiologicalquestionstatisticalmethod
5.00 score 1 stars 1 dependents 11 scripts 466 downloads
tRNAdbImport - Importing from tRNAdb and mitotRNAdb as GRanges objects
tRNAdbImport imports the entries of the tRNAdb and mtRNAdb (http://trna.bioinf.uni-leipzig.de) as a GRanges object.
Last updated
softwarevisualizationdataimportbioconductorsequencesstructurestrnatrna-genestrna-sequencestrnadb
4.95 score 1 stars 1 dependents 3 scripts 432 downloads
graper - Adaptive penalization in high-dimensional regression and classification with external covariates using variational Bayes
This package enables regression and classification on high-dimensional data with different relative strengths of penalization for different feature groups, such as different assays or omic types. The optimal relative strengths are chosen adaptively. Optimisation is performed using a variational Bayes approach.
Last updated
regressionbayesianclassificationopenblascpp
4.80 score 21 scripts 370 downloads
transite - RNA-binding protein motif analysis
transite is a computational method that allows comprehensive analysis of the regulatory role of RNA-binding proteins in various cellular processes by leveraging preexisting gene expression data and current knowledge of binding preferences of RNA-binding proteins.
Last updated
geneexpressiontranscriptiondifferentialexpressionmicroarraymrnamicroarraygeneticsgenesetenrichmentcpp
4.32 score 14 scripts 361 downloads
OVESEG - OVESEG-test to detect tissue/cell-specific markers
An R package for multiple-group comparison to detect tissue/cell-specific marker genes among subtypes. It provides functions to compute OVESEG-test statistics, derive component weights in the mixture null distribution model and estimate p-values from weighted aggregated permutations. The posterior probabilities obtained for the component null hypotheses can also portray all kinds of upregulation patterns among subtypes.
Last updated
softwaremultiplecomparisoncellbiologygeneexpressioncpp
4.30 score 2 stars 2 scripts 352 downloads
SpatialCPie - Cluster analysis of Spatial Transcriptomics data
SpatialCPie is an R package designed to facilitate cluster evaluation for spatial transcriptomics data by providing intuitive visualizations that display the relationships between clusters in order to guide the user during cluster identification and other downstream applications. The package is built around a shiny "gadget" to allow the exploration of the data with multiple plots in parallel and an interactive UI. The user can easily toggle between different cluster resolutions in order to choose the most appropriate visual cues.
Last updated
transcriptomicsclusteringrnaseqsoftware
4.30 score 5 scripts 416 downloads
LoomExperiment - LoomExperiment container
The LoomExperiment package provides a means to easily convert the Bioconductor "Experiment" classes to loom files and vice versa.
Last updated
immunooncologydatarepresentationdataimportinfrastructuresinglecell
4.20 score 80 scripts 1.0k downloads
scTensor - Detection of cell-cell interaction from single-cell RNA-seq dataset by tensor decomposition
The algorithm is based on the non-negative Tucker decomposition (NTD2) of nnTensor.
Last updated
dimensionreductionsinglecellsoftwaregeneexpression
4.18 score 2 scripts 334 downloads
VCFArray - Representing on-disk / remote VCF files as array-like objects
VCFArray extends the DelayedArray to represent VCF data entries as array-like objects with an on-disk or remote VCF file as the backend. Data entries from VCF files, including info fields, FORMAT fields, and the fixed columns (REF, ALT, QUAL, FILTER), can be converted into VCFArray instances with different dimensions.
Last updated
infrastructuredatarepresentationsequencingvariantannotation
4.00 score 1 stars 9 scripts 342 downloads
abseqR - Reporting and data analysis functionalities for Rep-Seq datasets of antibody libraries
AbSeq is a comprehensive bioinformatic pipeline for the analysis of sequencing datasets generated from antibody libraries and abseqR is one of its packages. abseqR empowers the users of abseqPy (https://github.com/malhamdoosh/abseqPy) with plotting and reporting capabilities and allows them to generate interactive HTML reports for the convenience of viewing and sharing with other researchers. Additionally, abseqR extends abseqPy to compare multiple repertoire analyses and perform further downstream analysis on its output.
Last updated
sequencingvisualizationreportwritingqualitycontrolmultiplecomparison
4.00 score 7 scripts 508 downloads
AssessORF - Assess Gene Predictions Using Proteomics and Evolutionary Conservation
In order to assess the quality of a set of predicted genes for a genome, evidence must first be mapped to that genome. Next, each gene must be categorized based on how strong the evidence is for or against that gene. The AssessORF package provides the functions and class structures necessary for accomplishing those tasks, using proteomic hits and evolutionarily conserved start codons as the forms of evidence.
Last updated
comparativegenomicsgenepredictiongenomeannotationgeneticsproteomicsqualitycontrolvisualization
4.00 score 8 scripts 509 downloads
PAIRADISE - PAIRADISE: Paired analysis of differential isoform expression
This package implements the PAIRADISE procedure for detecting differential isoform expression between matched replicates in paired RNA-Seq data.
Last updated
rnaseqdifferentialexpressionalternativesplicingstatisticalmethodimmunooncology
3.98 score 48 scripts 417 downloads
ProteoMM - Multi-Dataset Model-based Differential Expression Proteomics Analysis Platform
ProteoMM is a statistical method to perform model-based peptide-level differential expression analysis of single or multiple datasets. For multiple datasets ProteoMM produces a single fold change and p-value for each protein across multiple datasets. ProteoMM provides functionality for normalization, missing value imputation and differential expression. The model-based peptide-level imputation and differential expression analysis component of the package follows the analysis described in "A statistical framework for protein quantitation in bottom-up MS based proteomics" (Karpievitch et al. Bioinformatics 2009). EigenMS normalisation is implemented as described in "Normalization of peak intensities in bottom-up MS-based proteomics using singular value decomposition" (Karpievitch et al. Bioinformatics 2009).
Last updated
immunooncologymassspectrometryproteomicsnormalizationdifferentialexpression
3.68 score 24 scripts 374 downloads
IsoCorrectoRGUI - Graphical User Interface for IsoCorrectoR
IsoCorrectoRGUI is a Graphical User Interface for the IsoCorrectoR package. IsoCorrectoR performs the correction of mass spectrometry data from stable isotope labeling/tracing metabolomics experiments with regard to natural isotope abundance and tracer impurity. Data from both MS and MS/MS measurements can be corrected (with any tracer isotope: 13C, 15N, 18O...), as well as high resolution MS data from multiple-tracer experiments (e.g. 13C and 15N used simultaneously).
Last updated
softwaremetabolomicsmassspectrometrypreprocessingguiimmunooncology
3.30 score 4 scripts 305 downloads
fgsea - Fast Gene Set Enrichment Analysis
The package implements an algorithm for fast gene set enrichment analysis. The fast algorithm makes it possible to run more permutations and obtain finer-grained p-values, which in turn allows accurate standard approaches to multiple hypothesis correction to be used.
Last updated
geneexpressiondifferentialexpressiongenesetenrichmentpathwayscpp
16.21 score 444 stars 53 dependents 6.3k scripts 52k downloads
MultiAssayExperiment - Software for the integration of multi-omics experiments in Bioconductor
Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.
Last updated
infrastructuredatarepresentationbioconductorbioconductor-packagegenomicsnci-itcrtcgau24ca289073
15.00 score 75 stars 146 dependents 920 scripts 9.7k downloads
treeio - Base Classes and Functions for Phylogenetic Tree Input and Output
'treeio' is an R package that makes it easier to import and store phylogenetic trees with associated data, and to link external data from different sources to a phylogeny. It also supports exporting a phylogenetic tree with heterogeneous associated data to a single tree file and can serve as a platform for merging trees with associated data and converting file formats.
Last updated
softwareannotationclusteringdataimportdatarepresentationalignmentmultiplesequencealignmentphylogeneticsexporterparserphylogenetic-trees
14.63 score 104 stars 126 dependents 2.0k scripts 42k downloads
maftools - Summarize, Analyze and Visualize MAF Files
Analyze and visualize Mutation Annotation Format (MAF) files from large scale sequencing studies. This package provides various functions to perform the most commonly used analyses in cancer genomics and to create feature-rich, customizable visualizations with minimal effort.
Last updated
datarepresentationdnaseqvisualizationdrivermutationvariantannotationfeatureextractionclassificationsomaticmutationsequencingfunctionalgenomicssurvivalbioinformaticscancer-genome-atlascancer-genomicsgenomicsmaf-filestcgacurlbzip2xz-utilszlib
13.94 score 493 stars 12 dependents 1.3k scripts 4.3k downloads
dada2 - Accurate, high-resolution sample inference from amplicon sequencing data
The dada2 package infers exact amplicon sequence variants (ASVs) from high-throughput amplicon sequencing data, replacing the coarser and less accurate OTU clustering approach. The dada2 pipeline takes as input demultiplexed fastq files, and outputs the sequence variants and their sample-wise abundances after removing substitution and chimera errors. Taxonomic classification is available via a native implementation of the RDP naive Bayesian classifier, and species-level assignment to 16S rRNA gene fragments by exact matching.
Last updated
immunooncologymicrobiomesequencingclassificationmetagenomicsampliconbioconductorbioinformaticsmetabarcodingtaxonomycpp
13.74 score 552 stars 10 dependents 3.3k scripts 5.3k downloads
tximport - Import and summarize transcript-level estimates for transcript- and gene-level analysis
Imports transcript-level abundance, estimated counts and transcript lengths, and summarizes into matrices for use with downstream gene-level analysis packages. Average transcript length, weighted by sample-specific transcript abundance estimates, is provided as a matrix which can be used as an offset for differential expression of gene-level counts.
Last updated
dataimportpreprocessingrnaseqtranscriptomicstranscriptiongeneexpressionimmunooncologybioconductordeseq2
13.69 score 145 stars 13 dependents 4.3k scripts 6.8k downloads
MAST - Model-based Analysis of Single Cell Transcriptomics
Methods and models for handling zero-inflated single cell assay data.
Last updated
geneexpressiondifferentialexpressiongenesetenrichmentrnaseqtranscriptomicssinglecell
13.19 score 264 stars 6 dependents 2.2k scripts 3.6k downloads
HDF5Array - HDF5 datasets as array-like objects in R
The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.
Last updated
infrastructuredatarepresentationdataimportsequencingrnaseqcoverageannotationgenomeannotationsinglecellimmunooncologybioconductor-packagecore-packageu24ca289073
12.90 score 12 stars 157 dependents 1.4k scripts 28k downloads
karyoploteR - Plot customizable linear genomes displaying arbitrary data
karyoploteR creates karyotype plots of arbitrary genomes and offers a complete set of functions to plot arbitrary data on them. It mimics many R base graphics functions, coupling them with a coordinate change function that automatically maps the chromosome and data coordinates into the plot coordinates. In addition to the provided data plotting functions, it is easy to add new ones.
Last updated
visualizationcopynumbervariationsequencingcoveragednaseqchipseqmethylseqdataimportonechannelbioconductorbioinformaticsdata-visualizationgenomegenomics-visualizationplotting-in-r
12.16 score 367 stars 7 dependents 830 scripts 1.8k downloads
methylKit - DNA methylation analysis from high-throughput bisulfite sequencing results
methylKit is an R package for DNA methylation analysis and annotation from high-throughput bisulfite sequencing. The package is designed to deal with sequencing data from RRBS and its variants, but also target-capture methods and whole genome bisulfite sequencing. It also has functions to analyze base-pair resolution 5hmC data from experimental protocols such as oxBS-Seq and TAB-Seq. Methylation calling can be performed directly from Bismark aligned BAM files.
Last updated
dnamethylationsequencingmethylseqgenome-biologymethylationstatistical-analysisvisualizationcurlbzip2xz-utilszlibcpp
11.68 score 253 stars 3 dependents 692 scripts 1.8k downloads
ExperimentHub - Client to access ExperimentHub resources
This package provides a client for the Bioconductor ExperimentHub web resource. ExperimentHub provides a central location where curated data from experiments, publications or training courses can be accessed. Each resource has associated metadata, tags and date of modification. The client creates and manages a local cache of files retrieved enabling quick and reproducible access.
Last updated
infrastructuredataimportguithirdpartyclientcore-packageu24ca289073
11.46 score 11 stars 66 dependents 1.5k scripts 13k downloads
PharmacoGx - Analysis of Large-Scale Pharmacogenomic Data
Contains a set of functions to perform large-scale analysis of pharmaco-genomic data. These include the PharmacoSet object for storing the results of pharmacogenomic experiments, as well as a number of functions for computing common summaries of drug-dose response and correlating them with the molecular features in a cancer cell-line.
Last updated
geneexpressionpharmacogeneticspharmacogenomicssoftwareclassificationdatasetspharmacogenomicpharmacogxcpp
11.15 score 70 stars 3 dependents 468 scripts 670 downloads
annotatr - Annotation of Genomic Regions to Genomic Annotations
Given a set of genomic sites/regions (e.g. ChIP-seq peaks, CpGs, differentially methylated CpGs or regions, SNPs, etc.) it is often of interest to investigate the intersecting genomic annotations. Such annotations include those relating to gene models (promoters, 5'UTRs, exons, introns, and 3'UTRs), CpGs (CpG islands, CpG shores, CpG shelves), or regulatory sequences such as enhancers. The annotatr package provides an easy way to summarize and visualize the intersection of genomic sites/regions with genomic annotations.
Last updated
softwareannotationgenomeannotationfunctionalgenomicsvisualizationgenome-annotation
10.29 score 27 stars 5 dependents 375 scripts 1.1k downloads
clusterExperiment - Compare Clusterings for Single-Cell Sequencing
Provides functionality for running and comparing many different clusterings of single-cell sequencing data or other large mRNA Expression data sets.
Last updated
clusteringrnaseqsequencingsoftwaresinglecellcpp
9.92 score 41 stars 1 dependents 213 scripts 919 downloads
BatchQC - Batch Effects Quality Control Software
Sequencing and microarray samples often are collected or processed in multiple batches or at different times. This often produces technical biases that can lead to incorrect results in the downstream analysis. BatchQC is a software tool that streamlines batch preprocessing and evaluation by providing interactive diagnostics, visualizations, and statistical analyses to explore the extent to which batch variation impacts the data. BatchQC diagnostics help determine whether batch adjustment needs to be done, and how correction should be applied before proceeding with a downstream analysis. Moreover, BatchQC interactively applies multiple common batch effect approaches to the data and the user can quickly see the benefits of each method. BatchQC is developed as a Shiny App. The output is organized into multiple tabs and each tab features an important part of the batch effect analysis and visualization of the data. The BatchQC interface has the following analysis groups: Summary, Differential Expression, Median Correlations, Heatmaps, Circular Dendrogram, PCA Analysis, Shape, ComBat and SVA.
Last updated
batcheffectgeneexpressiongraphandnetworkmicroarraynormalizationprincipalcomponentsequencingsoftwarevisualizationqualitycontrolrnaseqpreprocessingdifferentialexpressionimmunooncology
9.39 score 11 stars 60 scripts 494 downloads
recount - Explore and download data from the recount project
Explore and download data from the recount project available at https://jhubiostatistics.shinyapps.io/recount/. Using the recount package you can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level, the raw counts, the phenotype metadata used, the urls to the sample coverage bigWig files or the mean coverage bigWig file for a particular study. The RangedSummarizedExperiment objects can be used by different packages for performing differential expression analysis. Using http://bioconductor.org/packages/derfinder you can perform annotation-agnostic differential expression analyses with the data from the recount project as described at http://www.nature.com/nbt/journal/v35/n4/full/nbt.3838.html.
Last updated
coveragedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaredataimportimmunooncologyannotation-agnosticbioconductorcountderfinderdeseq2exongenehumanilluminajunctionrecount
9.35 score 41 stars 3 dependents 506 scripts 807 downloads
pcaExplorer - Interactive Visualization of RNA-seq Data Using a Principal Components Approach
This package provides functionality for interactive visualization of RNA-seq datasets based on Principal Components Analysis. The methods provided allow for quick information extraction and effective data exploration. A Shiny application encapsulates the whole analysis.
Last updated
immunooncologyvisualizationrnaseqdimensionreductionprincipalcomponentqualitycontrolguireportwritingshinyappsbioconductorprincipal-componentsreproducible-researchrna-seq-analysisrna-seq-datashinytranscriptomeuser-friendly
9.19 score 56 stars 197 scripts 689 downloads
matter - Out-of-core statistical computing and signal processing
Toolbox for larger-than-memory scientific computing and visualization, providing efficient out-of-core data structures using files or shared memory, for dense and sparse vectors, matrices, and arrays, with applications to nonuniformly sampled signals and images.
Last updated
infrastructuredatarepresentationdataimportdimensionreductionpreprocessingcpp
9.14 score 61 stars 2 dependents 67 scripts 676 downloads
InteractionSet - Base Classes for Storing Genomic Interaction Data
Provides the GInteractions, InteractionSet and ContactMatrix objects and associated methods for storing and manipulating genomic interaction data from Hi-C and ChIA-PET experiments.
Last updated
infrastructuredatarepresentationsoftwarehiccpp
9.09 score 43 dependents 366 scripts 2.9k downloads
scone - Single Cell Overview of Normalized Expression data
SCONE is an R package for comparing and ranking the performance of different normalization schemes for single-cell RNA-seq and other high-throughput analyses.
Last updated
immunooncologynormalizationpreprocessingqualitycontrolgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecellcoverage
8.86 score 55 stars 110 scripts 536 downloads
PureCN - Copy number calling and SNV classification using targeted short read sequencing
This package estimates tumor purity, copy number, and loss of heterozygosity (LOH), and classifies single nucleotide variants (SNVs) by somatic status and clonality. PureCN is designed for targeted short read sequencing data, integrates well with standard somatic variant detection and copy number pipelines, and has support for tumor samples without matching normal samples.
Last updated
copynumbervariationsoftwaresequencingvariantannotationvariantdetectioncoverageimmunooncologybioconductor-packagecell-free-dnacopy-numberlohtumor-heterogeneitytumor-mutational-burdentumor-purity
8.68 score 147 stars 62 scripts 681 downloads
SIMLR - Single-cell Interpretation via Multi-kernel LeaRning (SIMLR)
Single-cell RNA-seq technologies enable high throughput gene expression measurement of individual cells, and allow the discovery of heterogeneity within cell populations. Measurement of cell-to-cell gene expression similarity is critical for the identification, visualization and analysis of cell populations. However, single-cell data introduce challenges to conventional measures of gene expression similarity because of the high level of noise, outliers and dropouts. We develop a novel similarity-learning framework, SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), which learns an appropriate distance metric from the data for dimension reduction, clustering and visualization.
Last updated
immunooncologyclusteringgeneexpressionsequencingsinglecellopenblascpp
8.52 score 115 stars 72 scripts 516 downloads
M3Drop - Michaelis-Menten Modelling of Dropouts in single-cell RNASeq
This package fits a model to the pattern of dropouts in single-cell RNASeq data. This model is used as a null to identify significantly variable (i.e. differentially expressed) genes for use in downstream analysis, such as clustering cells. Also includes a method for calculating exact Pearson residuals in UMI-tagged data using a library-size aware negative binomial model.
Last updated
rnaseqsequencingtranscriptomicsgeneexpressionsoftwaredifferentialexpressiondimensionreductionfeatureextractionhuman-cell-atlasrna-seqsingle-cellsingle-cell-rna-seq
8.49 score 33 stars 3 dependents 130 scripts 815 downloads
IPO - Automated Optimization of XCMS Data Processing parameters
The outcome of XCMS data processing strongly depends on the parameter settings. IPO (`Isotopologue Parameter Optimization`) is a parameter optimization tool that is applicable for different kinds of samples and liquid chromatography coupled to high resolution mass spectrometry devices, fast and free of labeling steps. IPO uses natural, stable 13C isotopes to calculate a peak picking score. Retention time correction is optimized by minimizing the relative retention time differences within features and grouping parameters are optimized by maximizing the number of features showing exactly one peak from each injection of a pooled sample. The different parameter settings are achieved by design of experiment. The resulting scores are evaluated using response surface models.
Last updated
immunooncologymetabolomicsmassspectrometry
8.17 score 34 stars 44 scripts 395 downloads
scDD - Mixture modeling of single-cell RNA-seq data to identify genes with differential distributions
This package implements a method to analyze single-cell RNA-seq data utilizing flexible Dirichlet Process mixture models. Genes with differential distributions of expression are classified into several interesting patterns of differences between two conditions. The package also includes functions for simulating data with these patterns from negative binomial distributions.
Last updated
immunooncologybayesianclusteringrnaseqsinglecellmultiplecomparisonvisualizationdifferentialexpression
8.15 score 35 stars 54 scripts 584 downloads
philr - Phylogenetic partitioning based ILR transform for metagenomics data
PhILR is short for Phylogenetic Isometric Log-Ratio Transform. This package provides functions for the analysis of compositional data (e.g., data representing proportions of different variables/parts). Specifically this package allows analysis of compositional data where the parts can be related through a phylogenetic tree (as is common in microbiota survey data) and makes available the Isometric Log Ratio transform built from the phylogenetic tree and utilizing a weighted reference measure.
Last updated
immunooncologysequencingmicrobiomemetagenomicssoftware
7.96 score 19 stars 132 scripts 486 downloads
CytoML - A GatingML Interface for Cross Platform Cytometry Data Sharing
Uses platform-specific implementations of the GatingML2.0 standard to exchange gated cytometry data with other software platforms.
Last updated
immunooncologyflowcytometrydataimportdatarepresentationcurlopensslopenblaslibxml2cpp
7.92 score 35 stars 249 scripts 1.2k downloads
signeR - Empirical Bayesian approach to mutational signature discovery
The signeR package provides an empirical Bayesian approach to mutational signature discovery. It is designed to analyze single nucleotide variation (SNV) counts in cancer genomes, but can also be applied to other features. Functionalities to characterize signatures or genome samples according to exposure patterns are also provided.
Last updated
genomicvariationsomaticmutationstatisticalmethodvisualizationbioconductorbioinformaticsopenblascpp
7.86 score 14 stars 24 scripts 446 downloads
debrowser - Interactive Differential Expression Analysis Browser
Bioinformatics platform containing interactive plots and tables for differential gene and region expression studies. Allows visualizing expression data in more depth, interactively and faster. By changing the parameters, users can easily explore different parts of the data in ways that were not possible before. Manually creating and looking at these plots takes time. With DEBrowser users can prepare plots without writing any code. Differential expression, PCA and clustering analyses are performed on site and the results are shown in various plots such as scatter, bar, box, volcano and MA plots and heatmaps.
Last updated
sequencingchipseqrnaseqdifferentialexpressiongeneexpressionclusteringimmunooncology
7.82 score 61 stars 68 scripts 516 downloads
meshes - MeSH Enrichment and Semantic analyses
MeSH (Medical Subject Headings) is the NLM controlled vocabulary used to manually index articles for MEDLINE/PubMed. MeSH terms are associated with Entrez Gene IDs by three methods: gendoo, gene2pubmed and RBBH. This association is fundamental for enrichment and semantic analyses. meshes supports enrichment analysis (over-representation and gene set enrichment analysis) of a gene list or a whole expression profile. The semantic comparisons of MeSH terms provide quantitative ways to compute similarities between genes and gene groups. meshes implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively and supports more than 70 species.
Last updated
annotationclusteringmultiplecomparisonsoftwareenrichment-analysismedical-subject-headingssemantic-similarity
7.27 score 12 stars 52 scripts 569 downloads
msPurity - Automated Evaluation of Precursor Ion Purity for Mass Spectrometry Based Fragmentation in Metabolomics
The msPurity R package was developed to: 1) Assess the spectral quality of fragmentation spectra by evaluating the "precursor ion purity". 2) Process fragmentation spectra. 3) Perform spectral matching. What is precursor ion purity? What we call "precursor ion purity" is a measure of the contribution of a selected precursor peak in an isolation window used for fragmentation. The simple calculation involves dividing the intensity of the selected precursor peak by the total intensity of the isolation window. When assessing MS/MS spectra, this calculation is done before and after the MS/MS scan of interest and the purity is interpolated at the recorded time of the MS/MS acquisition. Additionally, isotopic peaks can be removed, low-abundance peaks that are thought to contribute little to the resulting MS/MS spectra can be removed, and the isolation efficiency of the mass spectrometer can be used to normalise the intensities used for the calculation.
Last updated
massspectrometrymetabolomicssoftwarebioconductor-packagedimsfragmentationlc-mslc-msmsmass-spectrometryprecursor-ion-purity
7.27 score 16 stars 44 scripts 386 downloads
DRIMSeq - Differential transcript usage and tuQTL analyses with Dirichlet-multinomial model in RNA-seq
The package provides two frameworks: one for the differential transcript usage analysis between different conditions and one for the tuQTL analysis. Both are based on modeling the counts of genomic features (i.e., transcripts) with the Dirichlet-multinomial distribution. The package also makes available functions for visualization and exploration of the data and results.
Last updated
immunooncologysnpalternativesplicingdifferentialsplicinggeneticsrnaseqsequencingworkflowstepmultiplecomparisongeneexpressiondifferentialexpression
7.05 score 2 dependents 189 scripts 737 downloads
isomiRs - Analyze isomiRs and miRNAs from small RNA-seq
Characterization of miRNAs and isomiRs, clustering and differential expression.
Last updated
mirnarnaseqdifferentialexpressionclusteringimmunooncologyanalyze-isomirsbioconductorisomirs
7.02 score 8 stars 36 scripts 472 downloads
LOBSTAHS - Lipid and Oxylipin Biomarker Screening through Adduct Hierarchy Sequences
LOBSTAHS is a multifunction package for screening, annotation, and putative identification of mass spectral features in large, HPLC-MS lipid datasets. In silico data for a wide range of lipids, oxidized lipids, and oxylipins can be generated from user-supplied structural criteria with a database generation function. LOBSTAHS then applies these databases to assign putative compound identities to features in any high-mass accuracy dataset that has been processed using xcms and CAMERA. Users can then apply a series of orthogonal screening criteria based on adduct ion formation patterns, chromatographic retention time, and other properties, to evaluate and assign confidence scores to this list of preliminary assignments. During the screening routine, LOBSTAHS rejects assignments that do not meet the specified criteria, identifies potential isomers and isobars, and assigns a variety of annotation codes to assist the user in evaluating the accuracy of each assignment.
Last updated
immunooncologymassspectrometrymetabolomicslipidomicsdataimportadductalgaebioconductorhplc-esi-mslipidmass-spectrometryoxidative-stress-biomarkersoxidized-lipidsoxylipinsplankton
6.96 score 8 stars 17 scripts 288 downloads
DEFormats - Differential gene expression data formats converter
Convert between different data formats used by differential gene expression analysis tools.
Last updated
immunooncologydifferentialexpressiongeneexpressionrnaseqsequencingtranscription
6.88 score 4 stars 1 dependents 91 scripts 588 downloads
chimeraviz - Visualization tools for gene fusions
chimeraviz manages data from fusion gene finders and provides useful visualization tools.
Last updated
infrastructurealignment
6.82 score 39 stars 17 scripts 498 downloads
flowPloidy - Analyze flow cytometer data to determine sample ploidy
Determine sample ploidy via flow cytometry histogram analysis. Reads Flow Cytometry Standard (FCS) files via the flowCore bioconductor package, and provides functions for determining the DNA ploidy of samples based on internal standards.
Last updated
flowcytometryguiregressionvisualizationbioconductorevolutionflow-cytometrypolyploidy
6.61 score 5 stars 15 scripts 378 downloads
MultiDataSet - Implementation of MultiDataSet and ResultSet
Implementation of the BRGE's (Bioinformatic Research Group in Epidemiology from Center for Research in Environmental Epidemiology) MultiDataSet and ResultSet. MultiDataSet is designed for integrating multi omics data sets and ResultSet is a container for omics results. This package contains base classes for MEAL and rexposome packages.
Last updated
softwaredatarepresentation
6.59 score 11 dependents 28 scripts 2.1k downloads
MoonlightR - Identify oncogenes and tumor suppressor genes from omics data
Motivation: The understanding of cancer mechanism requires the identification of genes playing a role in the development of the pathology and the characterization of their role (notably oncogenes and tumor suppressors). Results: We present an R/bioconductor package called MoonlightR which returns a list of candidate driver genes for specific cancer types on the basis of TCGA expression data. The method first infers gene regulatory networks and then carries out a functional enrichment analysis (FEA) (implementing an upstream regulator analysis, URA) to score the importance of well-known biological processes with respect to the studied cancer type. Eventually, by means of random forests, MoonlightR predicts two specific roles for the candidate driver genes: i) tumor suppressor genes (TSGs) and ii) oncogenes (OCGs). As a consequence, this methodology does not only identify genes playing a dual role (e.g. TSG in one cancer type and OCG in another) but also helps in elucidating the biological processes underlying their specific roles. In particular, MoonlightR can be used to discover OCGs and TSGs in the same cancer type. This may help in answering the question whether some genes change role between early stages (I, II) and late stages (III, IV) in breast cancer. In the future, this analysis could be useful to determine the causes of different resistances to chemotherapeutic treatments.
Last updated
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksurvivalgenesetenrichmentnetworkenrichment
6.57 score 17 stars 422 downloads
Harman - The removal of batch effects from datasets using a PCA and constrained optimisation based technique
Harman is a PCA and constrained optimisation based technique that maximises the removal of batch effects from datasets, with the constraint that the probability of overcorrection (i.e. removing genuine biological signal along with batch noise) is kept to a fraction which is set by the end-user.
Last updated
batcheffectmicroarraymultiplecomparisonprincipalcomponentnormalizationpreprocessingdnamethylationtranscriptionsoftwarestatisticalmethodcpp
6.53 score 2 dependents 38 scripts 512 downloads
REMP - Repetitive Element Methylation Prediction
Machine learning-based tools to predict DNA methylation of locus-specific repetitive elements (RE) by learning surrounding genetic and epigenetic information. These tools provide genomewide and single-base resolution of DNA methylation prediction on RE that are difficult to measure using array-based or sequencing-based platforms, which enables epigenome-wide association study (EWAS) and differentially methylated region (DMR) analysis on RE.
Last updated
dnamethylationmicroarraymethylationarraysequencinggenomewideassociationepigeneticspreprocessingmultichanneltwochanneldifferentialmethylationqualitycontroldataimport
6.45 score 3 stars 26 scripts 398 downloads
swfdr - Estimation of the science-wise false discovery rate and the false discovery rate conditional on covariates
This package allows users to estimate the science-wise false discovery rate from Jager and Leek, "Empirical estimates suggest most published medical research is true," 2013, Biostatistics, using an EM approach due to the presence of rounding and censoring. It also allows users to estimate the false discovery rate conditional on covariates, using a regression framework, as per Boca and Leek, "A direct approach to estimating false discovery rates conditional on covariates," 2018, PeerJ.
Last updated
multiplecomparisonstatisticalmethodsoftware
6.42 score 3 stars 55 scripts 426 downloads
QUBIC - An R Package for Qualitative Biclustering in Support of Gene Co-Expression Analyses
The core function of this R package is to provide the implementation of the well-cited and well-reviewed QUBIC algorithm, aiming to deliver an effective and efficient biclustering capability. This package also includes the following related functions: (i) a qualitative representation of the input gene expression data, through a discretization method that takes the underlying data properties into account, which can be directly used in other biclustering programs; (ii) visualization of identified biclusters using heatmaps in support of overall expression pattern analysis; (iii) bicluster-based co-expression network elucidation and visualization, where different correlation coefficient scores between a pair of genes are provided; and (iv) a generalized output format of biclusters and the corresponding network that can be freely downloaded, so that a user can easily perform subsequent comprehensive functional enrichment analysis (e.g. DAVID) and advanced network visualization (e.g. Cytoscape).
Last updated
statisticalmethodmicroarraydifferentialexpressionmultiplecomparisonclusteringvisualizationgeneexpressionnetworkbioconductor-packagebioconductor-packagescppopenmp
6.21 score 4 stars 17 scripts 463 downloads
Linnorm - Linear model and normality based normalization and transformation method (Linnorm)
Linnorm is an algorithm for normalizing and transforming RNA-seq, single cell RNA-seq, ChIP-seq count data or any large scale count data. It has been independently reviewed by Tian et al. in Nature Methods (https://doi.org/10.1038/s41592-019-0425-8). Linnorm can work with raw counts, CPM, RPKM, FPKM and TPM.
Last updated
immunooncologysequencingchipseqrnaseqdifferentialexpressiongeneexpressiongeneticsnormalizationsoftwaretranscriptionbatcheffectpeakdetectionclusteringnetworksinglecellcpp
6.21 score 4 dependents 67 scripts 616 downloads
ramwas - Fast Methylome-Wide Association Study Pipeline for Enrichment Platforms
A complete toolset for methylome-wide association studies (MWAS). It is specifically designed for data from enrichment based methylation assays, but can be applied to other data as well. The analysis pipeline includes seven steps: (1) scanning aligned reads from BAM files, (2) calculation of quality control measures, (3) creation of methylation score (coverage) matrix, (4) principal component analysis for capturing batch effects and detection of outliers, (5) association analysis with respect to phenotypes of interest while correcting for top PCs and known covariates, (6) annotation of significant findings, and (7) multi-marker analysis (methylation risk score) using elastic net. Additionally, RaMWAS includes tools for joint analysis of methylation and genotype data. This work is published in Bioinformatics, Shabalin et al. (2018) <doi:10.1093/bioinformatics/bty069>.
Last updated
dnamethylationsequencingqualitycontrolcoveragepreprocessingnormalizationbatcheffectprincipalcomponentdifferentialmethylationvisualization
6.20 score 10 stars 113 scripts 410 downloads
OncoScore - A tool to identify potentially oncogenic genes
OncoScore is a tool to measure the association of genes to cancer based on citation frequencies in biomedical literature. The score is evaluated from PubMed literature by dynamically updatable web queries.
Last updated
biomedicalinformatics
6.15 score 5 stars 2 scripts 369 downloads
RCAS - RNA Centric Annotation System
RCAS is an R/Bioconductor package designed as a generic reporting tool for the functional analysis of transcriptome-wide regions of interest detected by high-throughput experiments. Such transcriptomic regions could be, for instance, signal peaks detected by CLIP-Seq analysis for protein-RNA interaction sites, RNA modification sites (alias the epitranscriptome), CAGE-tag locations, or any other collection of query regions at the level of the transcriptome. RCAS produces in-depth annotation summaries and coverage profiles based on the distribution of the query regions with respect to transcript features (exons, introns, 5'/3' UTR regions, exon-intron boundaries, promoter regions). Moreover, RCAS can carry out functional enrichment analyses and discriminative motif discovery.
Last updated
softwaregenetargetmotifannotationmotifdiscoverygotranscriptomicsgenomeannotationgenesetenrichmentcoverage
6.14 score 1 dependents 29 scripts 591 downloads
switchde - Switch-like differential expression across single-cell trajectories
Inference and detection of switch-like differential expression across single-cell RNA-seq trajectories.
Last updated
immunooncologysoftwaretranscriptomicsgeneexpressionrnaseqregressiondifferentialexpressionsinglecellgene-expressiongenomicssingle-cell
6.04 score 22 stars 10 scripts 375 downloads
PathoStat - PathoStat Statistical Microbiome Analysis Package
The purpose of this package is to perform Statistical Microbiome Analysis on metagenomics results from sequencing data samples. In particular, it supports analyses on the PathoScope generated report files. PathoStat provides various functionalities including Relative Abundance charts, Diversity estimates and plots, tests of Differential Abundance, Time Series visualization, and Core OTU analysis.
Last updated
microbiomemetagenomicsgraphandnetworkmicroarraypatternlogicprincipalcomponentsequencingsoftwarevisualizationrnaseqimmunooncology
6.02 score 7 stars 15 scripts 412 downloads
TVTB - TVTB: The VCF Tool Box
The package provides S4 classes and methods to filter, summarise and visualise genetic variation data stored in VCF files. In particular, the package extends the FilterRules class (S4Vectors package) to define new classes of filter rules applicable to the various slots of VCF objects. Functionalities are integrated and demonstrated in a Shiny web-application, the Shiny Variant Explorer (tSVE).
Last updated
softwaregeneticsgeneticvariabilitygenomicvariationdatarepresentationguidnaseqwholegenomevisualizationmultiplecomparisondataimportvariantannotationsequencingcoveragealignmentsequencematching
5.98 score 2 stars 16 scripts 368 downloads
epivizrServer - WebSocket server infrastructure for epivizr apps and packages
This package provides objects to manage WebSocket connections to epiviz apps. Other epivizr packages use this infrastructure.
Last updated
infrastructurevisualization
5.95 score 5 dependents 6 scripts 376 downloads
discordant - The Discordant Method: A Novel Approach for Differential Correlation
Discordant is an R package that identifies pairs of features that correlate differently between phenotypic groups, with application to -omics data sets. Discordant uses a mixture model that “bins” molecular feature pairs based on their type of coexpression or coabundance. The algorithm is explained further in "Differential Correlation for Sequencing Data" (Siska et al. 2016).
Last updated
immunooncologybiologicalquestionstatisticalmethodmrnamicroarraymicroarraygeneticsrnaseqcpp
5.90 score 10 stars 8 scripts 424 downloads
EGSEA - Ensemble of Gene Set Enrichment Analyses
This package implements the Ensemble of Gene Set Enrichment Analyses (EGSEA) method for gene set testing. The EGSEA algorithm utilizes the analysis results of twelve prominent GSE algorithms in the literature to calculate collective significance scores for each gene set.
Last updated
immunooncologydifferentialexpressiongogeneexpressiongenesetenrichmentgeneticsmicroarraymultiplecomparisononechannelpathwaysrnaseqsequencingsoftwaresystemsbiologytwochannelmetabolomicsproteomicskegggraphandnetworkgenesignalinggenetargetnetworkenrichmentnetworkclassification
5.85 score 71 scripts 471 downloads
bioCancer - Interactive Multi-Omics Cancers Data Visualization and Analysis
This package is a Shiny app for interactively visualizing and analysing multi-assay cancer genomic data.
Last updated
guidatarepresentationnetworkmultiplecomparisonpathwaysreactomevisualizationgeneexpressiongenetargetanalysisbiocancer-interfacecancercancer-studiesrmarkdown
5.80 score 21 stars 7 scripts 390 downloads
regsplice - L1-regularization based methods for detection of differential splicing
Statistical methods for detection of differential splicing (differential exon usage) in RNA-seq and exon microarray data, using L1-regularization (lasso) to improve power.
Last updated
immunooncologyalternativesplicingdifferentialexpressiondifferentialsplicingsequencingrnaseqmicroarrayexonarrayexperimentaldesignsoftware
5.77 score 3 stars 33 scripts 377 downloads
TCseq - Time course sequencing data analysis
Quantitative and differential analysis of epigenomic and transcriptomic time course sequencing data, clustering analysis and visualization of the temporal patterns of time course data.
Last updated
epigeneticstimecoursesequencingchipseqrnaseqdifferentialexpressionclusteringvisualization
5.73 score 42 scripts 1.3k downloads
DNAshapeR - High-throughput prediction of DNA shape features
DNAshapeR is an R/Bioconductor package for ultra-fast, high-throughput predictions of DNA shape features. The package allows users to predict, visualize and encode DNA shape features for statistical learning.
Last updated
structuralpredictiondna3dstructuresoftwarecpp
5.67 score 47 scripts 474 downloads
statTarget - Statistical Analysis of Molecular Profiles
A streamlined tool that provides a graphical user interface for quality control based signal drift correction (QC-RFSC), integration of data from multi-batch MS-based experiments, and comprehensive statistical analysis in metabolomics and proteomics.
Last updated
immunooncologymetabolomicsproteomicsmachine learninglipidomicsmassspectrometryqualitycontrolnormalizationqc-rfsccombatdifferentialexpressionbatcheffectvisualizationmultiplecomparisonpreprocessingsoftware
5.66 score 38 scripts 452 downloads
gcapc - GC Aware Peak Caller
Peak calling for ChIP-seq data with consideration of potential GC bias in sequencing reads. GC bias is first estimated with generalized linear mixture models using an effective GC strategy, then applied to peak significance estimation.
Last updated
sequencingchipseqbatcheffectpeakdetection
5.54 score 10 stars 23 scripts 450 downloads
GEM - GEM: fast association study for the interplay of Gene, Environment and Methylation
Tools for genome-wide analysis of EWAS, methQTL and GxE.
Last updated
methylseqmethylationarraygenomewideassociationregressiondnamethylationsnpgeneexpressiongui
5.53 score 34 scripts 359 downloads
coseq - Co-Expression Analysis of Sequencing Data
Co-expression analysis for expression profiles arising from high-throughput sequencing data. Feature (e.g., gene) profiles are clustered using adapted transformations and mixture models or a K-means algorithm, and model selection criteria (to choose an appropriate number of clusters) are provided.
Last updated
geneexpressionrnaseqsequencingsoftwareimmunooncology
5.51 score 1 dependents 18 scripts 496 downloads
epivizrData - Data Management API for epiviz interactive visualization app
Serve data from Bioconductor Objects through a WebSocket connection.
Last updated
infrastructurevisualization
5.38 score 1 stars 4 dependents 9 scripts 370 downloads
SNPediaR - Query data from SNPedia
SNPediaR provides some tools for downloading and parsing data from the SNPedia web site <http://www.snpedia.com>. The implemented functions allow users to import the wiki text available in SNPedia pages and to extract the most relevant information out of them. If some information in the downloaded pages is not automatically processed by the library functions, users can easily implement their own parsers to access it in an efficient way.
Last updated
snpvariantannotation
5.34 score 11 stars 9 scripts 337 downloads
DMRScan - Detection of Differentially Methylated Regions
This package detects significant differentially methylated regions (for both qualitative and quantitative traits), using a scan statistic with underlying Poisson heuristics. The scan statistic will depend on a sequence of window sizes (# of CpGs within each window) and on a threshold for each window size. This threshold can be calculated by three different means: i) analytically, using the Siegmund et al. (2012) solution (preferred), ii) importance sampling as suggested by Zhang (2008), and iii) full MCMC modeling of the data, choosing between a number of different options for modeling the dependency between each CpG.
Last updated
softwaretechnologysequencingwholegenome
5.32 score 2 stars 5 scripts 408 downloads
bigmelon - Illumina methylation array analysis for large experiments
Methods for working with Illumina arrays using gdsfmt.
Last updated
dnamethylationmicroarraytwochannelpreprocessingqualitycontrolmethylationarraydataimportcpgisland
5.31 score 29 scripts 547 downloads
gCrisprTools - Suite of Functions for Pooled Crispr Screen QC and Analysis
Set of tools for evaluating pooled high-throughput screening experiments, typically employing CRISPR/Cas9 or shRNA expression cassettes. Contains methods for interrogating library and cassette behavior within an experiment, identifying differentially abundant cassettes, aggregating signals to identify candidate targets for empirical validation, hypothesis testing, and comprehensive reporting. Version 2.0 extends these applications to include a variety of tools for contextualizing and integrating signals across many experiments, incorporates extended signal enrichment methodologies via the "sparrow" package, and streamlines many formal requirements to aid in interpretability.
Last updated
immunooncologycrisprpooledscreensexperimentaldesignbiomedicalinformaticscellbiologyfunctionalgenomicspharmacogenomicspharmacogeneticssystemsbiologydifferentialexpressiongenesetenrichmentgeneticsmultiplecomparisonnormalizationpreprocessingqualitycontrolrnaseqregressionsoftwarevisualization
5.26 score 12 scripts 484 downloads
recoup - An R package for the creation of complex genomic profile plots
recoup calculates and plots signal profiles created from short sequence reads derived from Next Generation Sequencing technologies. The profiles provided are either summarized curve profiles or heatmap profiles. Currently, recoup supports genomic profile plots for reads derived from ChIP-Seq and RNA-Seq experiments. The package uses ggplot2 and ComplexHeatmap graphics facilities for curve and heatmap coverage profiles respectively.
Last updated
immunooncologysoftwaregeneexpressionpreprocessingqualitycontrolrnaseqchipseqsequencingcoverageatacseqchiponchipalignmentdataimport
5.24 score 1 stars 7 scripts 426 downloads
nucleoSim - Generate synthetic nucleosome maps
This package can generate a synthetic map with reads covering the nucleosome regions as well as a synthetic map with forward and reverse reads emulating next-generation sequencing. The synthetic hybridization data of “Tiling Arrays” can also be generated. The user can choose between three different distributions for the read positioning: Normal, Student and Uniform. In addition, a visualization tool is provided to explore the synthetic nucleosome maps.
Last updated
geneticssequencingsoftwarestatisticalmethodalignmentbioconductornucleosome-mapsnucleosomessimulationsimulatorsynthetic-nucleosomes
5.20 score 2 stars 16 scripts 373 downloads
SPLINTER - Splice Interpreter of Transcripts
Provides tools to analyze alternative splicing sites, interpret outcomes based on sequence information, select and design primers for site validation and give a visual representation of the event to guide downstream experiments.
Last updated
immunooncologygeneexpressionrnaseqvisualizationalternativesplicing
5.18 score 7 scripts 352 downloads
yamss - Tools for high-throughput metabolomics
Tools to analyze and visualize high-throughput metabolomics data acquired using chromatography-mass spectrometry. These tools preprocess data in a way that enables reliable and powerful differential analysis. At the core of these methods is a peak detection phase that pools information across all samples simultaneously. This is in contrast to other methods that detect peaks on a sample-by-sample basis.
Last updated
massspectrometrymetabolomicspeakdetectionsoftware
5.16 score 3 stars 12 scripts 450 downloads
bacon - Controlling bias and inflation in association studies using the empirical null distribution
Bacon can be used to remove inflation and bias often observed in epigenome- and transcriptome-wide association studies. To this end bacon constructs an empirical null distribution using a Gibbs Sampling algorithm by fitting a three-component normal mixture on z-scores.
Last updated
immunooncologystatisticalmethodbayesianregressiongenomewideassociationtranscriptomicsrnaseqmethylationarraybatcheffectmultiplecomparison
5.08 score 150 scripts 572 downloads
ASpli - Analysis of Alternative Splicing Using RNA-Seq
Integrative pipeline for the analysis of alternative splicing using RNAseq.
Last updated
immunooncologygeneexpressiontranscriptionalternativesplicingcoveragedifferentialexpressiondifferentialsplicingtimecoursernaseqgenomeannotationsequencingalignment
5.05 score 1 dependents 47 scripts 546 downloads
EGAD - Extending guilt by association by degree
The package implements a series of highly efficient tools to calculate functional properties of networks based on guilt by association methods.
Last updated
softwarefunctionalgenomicssystemsbiologygenepredictionfunctionalpredictionnetworkenrichmentgraphandnetworknetwork
5.00 score 101 scripts 408 downloads
GRmetrics - Calculate growth-rate inhibition (GR) metrics
Functions for calculating and visualizing growth-rate inhibition (GR) metrics.
Last updated
immunooncologycellbasedassayscellbiologysoftwaretimecoursevisualization
4.98 score 1 stars 24 scripts 430 downloads
MetaboSignal - MetaboSignal: a network-based approach to overlay and explore metabolic and signaling KEGG pathways
MetaboSignal is an R package that allows merging, analyzing and customizing metabolic and signaling KEGG pathways. It is a network-based approach designed to explore the topological relationship between genes (signaling- or enzymatic-genes) and metabolites, representing a powerful tool to investigate the genetic landscape and regulatory networks of metabolic phenotypes.
Last updated
graphandnetworkgenesignalinggenetargetnetworkpathwayskeggreactomesoftware
4.94 score 11 scripts 494 downloads
ExpressionAtlas - Download datasets from EMBL-EBI Expression Atlas
This package is for searching for datasets in EMBL-EBI Expression Atlas, and downloading them into R for further analysis. Each Expression Atlas dataset is represented as a SimpleList object with one element per platform. Sequencing data is contained in a SummarizedExperiment object, while microarray data is contained in an ExpressionSet or MAList object.
Last updated
expressiondataexperimentdatasequencingdatamicroarraydataarrayexpress
4.94 score 31 scripts 414 downloads
YAPSA - Yet Another Package for Signature Analysis
This package provides functions and routines for supervised analyses of mutational signatures (i.e., the signatures have to be known, cf. L. Alexandrov et al., Nature 2013 and L. Alexandrov et al., bioRxiv 2018). In particular, the family of functions LCD (LCD = linear combination decomposition) can use optimal signature-specific cutoffs, which take care of the different detectability of the different signatures. Moreover, the package provides different sets of mutational signatures, including the COSMIC and PCAWG SNV signatures and the PCAWG Indel signatures; the latter implies that with YAPSA, the concept of supervised analysis of mutational signatures is extended to Indel signatures. YAPSA also provides confidence intervals as computed by profile likelihoods and can perform signature analysis on a stratified mutational catalogue (SMC = stratify mutational catalogue) in order to analyze enrichment and depletion patterns for the signatures in different strata.
Last updated
sequencingdnaseqsomaticmutationvisualizationclusteringgenomicvariationstatisticalmethodbiologicalquestion
4.91 score 81 scripts 634 downloads
chromPlot - Global visualization tool of genomic data
Package designed to visualize genomic data along the chromosomes, where the vertical chromosomes are sorted by number, with sex chromosomes at the end.
Last updated
datarepresentationfunctionalgenomicsgeneticssequencingannotationvisualization
4.84 score 49 scripts 430 downloads
FitHiC - Confidence estimation for intra-chromosomal contact maps
Fit-Hi-C is a tool for assigning statistical confidence estimates to intra-chromosomal contact maps produced by genome-wide genome architecture assays such as Hi-C.
Last updated
dna3dstructuresoftwarecpp
4.78 score 5 scripts 471 downloads
mimager - mimager: The Microarray Imager
Easily visualize and inspect microarrays for spatial artifacts.
Last updated
infrastructurevisualizationmicroarraybioconductorbioinformatics
4.70 score 3 scripts 360 downloads
Anaquin - Statistical analysis of sequins
The project is intended to support the use of sequins (synthetic sequencing spike-in controls) owned and made available by the Garvan Institute of Medical Research. The goal is to provide a standard open source library for quantitative analysis, modelling and visualization of spike-in controls.
Last updated
immunooncologydifferentialexpressionpreprocessingrnaseqgeneexpressionsoftware
4.69 score 49 scripts 420 downloads
ChIPexoQual - ChIPexoQual
Package with a quality control pipeline for ChIP-exo/nexus data.
Last updated
chipseqsequencingtranscriptionvisualizationqualitycontrolcoveragealignment
4.62 score 1 stars 14 scripts 510 downloads
RJMCMCNucleosomes - Bayesian hierarchical model for genome-wide nucleosome positioning with high-throughput short-read data (MNase-Seq)
This package performs nucleosome positioning using an informative Multinomial-Dirichlet prior in a t-mixture with reversible jump estimation of nucleosome positions for genome-wide profiling.
Last updated
biologicalquestionchipseqnucleosomepositioningsoftwarestatisticalmethodbayesiansequencingcoveragebayesian-t-mixturebioconductorc-plus-plusgenome-wide-profilingmultinomial-dirichlet-priornucleosome-positioningnucleosomesreversible-jump-mcmcgslcpp
4.60 score 3 scripts 368 downloads
fCCAC - functional Canonical Correlation Analysis to evaluate Covariance between nucleic acid sequencing datasets
Computational evaluation of variability across DNA or RNA sequencing datasets is a crucial step in genomics, as it makes it possible both to evaluate the reproducibility of replicates and to compare different datasets to identify potential correlations. fCCAC applies functional Canonical Correlation Analysis to allow the assessment of: (i) reproducibility of biological or technical replicates, analyzing their shared covariance in higher order components; and (ii) the associations between different datasets. fCCAC represents a more sophisticated approach that complements Pearson correlation of genomic coverage.
Last updated
epigeneticstranscriptionsequencingcoveragechipseqfunctionalgenomicsrnaseqatacseqmnaseseqbioc-devel
4.60 score 3 scripts 444 downloads
ClusterSignificance - The ClusterSignificance package provides tools to assess if class clusters in dimensionality reduced data representations have a separation different from permuted data
The ClusterSignificance package provides tools to assess if class clusters in dimensionality reduced data representations have a separation different from permuted data. The term class clusters here refers to clusters of points representing known classes in the data. This is particularly useful to determine if a subset of the variables, e.g. genes in a specific pathway, alone can separate samples into these established classes. ClusterSignificance accomplishes this by projecting all points onto a one-dimensional line. Cluster separations are then scored, and the probability of the observed separation being due to chance is evaluated using a permutation method.
Last updated
clusteringclassificationprincipalcomponentstatisticalmethod
4.60 score 5 scripts 440 downloads
Uniquorn - Identification of cancer cell lines based on their weighted mutational/variational fingerprint
'Uniquorn' enables users to identify cancer cell lines. Cancer cell line misidentification and cross-contamination represent a significant challenge for cancer researchers. Identification is therefore vital and, in this package, is based on the locations/loci of somatic and germline mutations/variations. The input format is vcf/vcf.gz and the files have to contain a single cancer cell line sample (i.e. a single member/genotype/gt column in the vcf file).
Last updated
immunooncologystatisticalmethodwholegenomeexomeseq
4.60 score 444 downloads
yarn - YARN: Robust Multi-Condition RNA-Seq Preprocessing and Normalization
Expedite large RNA-Seq analyses using a combination of previously developed tools. YARN is meant to make it easier for the user to perform basic mis-annotation quality control, filtering, and condition-aware normalization. YARN leverages many Bioconductor tools and statistical techniques to account for the large heterogeneity and sparsity found in very large RNA-seq experiments.
Last updated
softwarequalitycontrolgeneexpressionsequencingpreprocessingnormalizationannotationvisualizationclustering
4.56 score 36 scripts 568 downloads
pqsfinder - Identification of potential quadruplex forming sequences
Pqsfinder detects DNA and RNA sequence patterns that are likely to fold into an intramolecular G-quadruplex (G4). Unlike many other approaches, pqsfinder is able to detect G4s folded from imperfect G-runs containing bulges or mismatches or G4s having long loops. Pqsfinder also assigns an integer score to each hit that was fitted on G4 sequencing data and corresponds to expected stability of the folded G4.
Last updated
motifdiscoverysequencematchinggeneregulationcpp
4.51 score 20 scripts 466 downloads
Mergeomics - Integrative network analysis of omics data
The Mergeomics pipeline serves as a flexible framework for integrating multidimensional omics-disease associations, functional genomics, canonical pathways and gene-gene interaction networks to generate mechanistic hypotheses. It includes two main parts, 1) Marker set enrichment analysis (MSEA); 2) Weighted Key Driver Analysis (wKDA).
Last updated
software
4.48 score 15 scripts 428 downloads
RImmPort - RImmPort: Enabling Ready-for-analysis Immunology Research Data
The RImmPort package simplifies access to ImmPort data for analysis in the R environment. It provides a standards-based interface to the ImmPort study data that is in a proprietary format.
Last updated
biomedicalinformaticsdataimportdatarepresentation
4.45 score 35 scripts 323 downloads
SMITE - Significance-based Modules Integrating the Transcriptome and Epigenome
This package builds on the Epimods framework, which facilitates finding weighted subnetworks ("modules") on Illumina Infinium 27k arrays using the SpinGlass algorithm, as implemented in the igraph package. We have created a class of gene-centric annotations associated with p-values, effect sizes and scores from any researcher's prior statistical results to find functional modules.
Last updated
immunooncologydifferentialmethylationdifferentialexpressionsystemsbiologynetworkenrichmentgenomeannotationnetworksequencingrnaseqcoverage
4.45 score 1 stars 20 scripts 394 downloads
BiocWorkflowTools - Tools to aid the development of Bioconductor Workflow packages
Provides functions to ease the transition between Rmarkdown and LaTeX documents when authoring a Bioconductor Workflow.
Last updated
softwarereportwriting
4.34 score 11 scripts 524 downloads
biosigner - Signature discovery from omics data
Feature selection is critical in omics data analysis to extract restricted and meaningful molecular signatures from complex and high-dimension data, and to build robust classifiers. This package implements a new method to assess the relevance of the variables for the prediction performances of the classifier. The approach can be run in parallel with the PLS-DA, Random Forest, and SVM binary classifiers. The signatures and the corresponding 'restricted' models are returned, enabling future predictions on new datasets. A Galaxy implementation of the package is available within the Workflow4metabolomics.org online infrastructure for computational metabolomics.
Last updated
classificationfeatureextractiontranscriptomicsproteomicsmetabolomicslipidomicsmassspectrometry
4.30 score 10 scripts 499 downloads
metaCCA - Summary Statistics-Based Multivariate Meta-Analysis of Genome-Wide Association Studies Using Canonical Correlation Analysis
metaCCA performs multivariate analysis of a single or multiple GWAS based on univariate regression coefficients. It allows multivariate representation of both phenotype and genotype. metaCCA extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness.
Last updated
genomewideassociationsnpgeneticsregressionstatisticalmethodsoftware
4.26 score 9 scripts 326 downloads
odseq - Outlier detection in multiple sequence alignments
Performs outlier detection of sequences in a multiple sequence alignment using bootstrap of predefined distance metrics. Outlier sequences can make downstream analyses unreliable or make the alignments less accurate while they are being constructed. This package implements the OD-seq algorithm proposed by Jehl et al (doi 10.1186/s12859-015-0702-1) for aligned sequences and a variant using string kernels for unaligned sequences.
Last updated
alignmentmultiplesequencealignment
4.23 score 1 dependents 28 scripts 382 downloads
BUMHMM - Computational pipeline for computing probability of modification from structure probing experiment data
This is a probabilistic modelling pipeline for computing per-nucleotide posterior probabilities of modification from the data collected in structure probing experiments. The model supports multiple experimental replicates and empirically corrects coverage- and sequence-dependent biases. The model utilises the measure of a "drop-off rate" for each nucleotide, which is compared between replicates through a log-ratio (LDR). The LDRs between control replicates define a null distribution of variability in drop-off rate observed by chance, and LDRs between treatment and control replicates are compared to this distribution. The resulting empirical p-values (probability of being "drawn" from the null distribution) are used as observations in a Hidden Markov Model with a Beta-Uniform Mixture model used as an emission model. The resulting posterior probabilities indicate the probability of a nucleotide having been modified in a structure probing experiment.
Last updated
immunooncologygeneticvariabilitytranscriptiongeneexpressiongeneregulationcoveragegeneticsstructuralpredictiontranscriptomicsbayesianclassificationfeatureextractionhiddenmarkovmodelregressionrnaseqsequencing
4.20 score 16 scripts 386 downloads
geneAttribution - Identification of candidate genes associated with genetic variation
Identification of the most likely gene or genes through which variation at a given genomic locus in the human genome acts. The most basic functionality assumes that the closer a gene is to the input locus, the more likely the gene is to be causative. Additionally, any empirical data that links genomic regions to genes (e.g. eQTL or genome conformation data) can be used if it is supplied in the UCSC .BED file format.
Last updated
snpgenepredictiongenomewideassociationvariantannotationgenomicvariation
4.18 score 3 scripts 406 downloads
sights - Statistics and dIagnostic Graphs for HTS
SIGHTS is a suite of normalization methods, statistical tests, and diagnostic graphical tools for high throughput screening (HTS) assays. HTS assays use microtitre plates to screen large libraries of compounds for their biological, chemical, or biochemical activity.
Last updated
immunooncologycellbasedassaysmicrotitreplateassaynormalizationmultiplecomparisonpreprocessingqualitycontrolbatcheffectvisualization
4.18 score 15 scripts 425 downloads
cellity - Quality Control for Single-Cell RNA-seq Data
A support vector machine approach to identifying and filtering low quality cells from single-cell RNA-seq datasets.
Last updated
immunooncologyrnaseqqualitycontrolpreprocessingnormalizationvisualizationdimensionreductiontranscriptomicsgeneexpressionsequencingsoftwaresupportvectormachine
4.18 score 15 scripts 382 downloads
PCAN - Phenotype Consensus ANalysis (PCAN)
Phenotype comparison based on a pathway consensus approach. Assesses the relationship between candidate genes and a set of phenotypes based on additional genes related to the candidate (e.g. pathways or network neighbors).
Last updated
annotationsequencinggeneticsfunctionalpredictionvariantannotationpathwaysnetwork
4.15 score 8 scripts 420 downloads
geneClassifiers - Application of gene classifiers
This package aims to make classifiers that have been published in the literature easy to apply, using an ExpressionSet as input.
Last updated
geneexpressionbiomedicalinformaticsclassificationsurvivalmicroarray
4.08 score 1 stars 7 scripts 414 downloads
clusterSeq - Clustering of high-throughput sequencing data by identifying co-expression patterns
Identification of clusters of co-expressed genes based on their expression across multiple (replicated) biological samples.
Last updated
sequencingdifferentialexpressionmultiplecomparisonclusteringgeneexpression
4.08 score 2 scripts 366 downloads
BaalChIP - BaalChIP: Bayesian analysis of allele-specific transcription factor binding in cancer genomes
The package offers functions to process multiple ChIP-seq BAM files and detect allele-specific events. It computes allele counts at individual variants (SNPs/SNVs), implements extensive QC steps to remove problematic variants, and utilizes a Bayesian framework to identify statistically significant allele-specific events. BaalChIP is able to account for copy number differences between the two alleles, a known phenotypical feature of cancer samples.
Last updated
softwarechipseqbayesiansequencing
4.08 score 12 scripts 536 downloads
CellMapper - Predict genes expressed selectively in specific cell types
Infers cell type-specific expression based on co-expression similarity with known cell type marker genes. Can make accurate predictions using publicly available expression data, even when a cell type has not been isolated before.
Last updated
microarraysoftwaregeneexpression
4.06 score 19 scripts 361 downloads
MCbiclust - Massive correlating biclusters for gene expression data and associated methods
Custom-made algorithm and associated methods for finding, visualising and analysing biclusters in large gene expression data sets. Given a supplied gene set of size n, the algorithm finds the maximum-strength correlation matrix containing m samples from the data set.
Last updated
immunooncologyclusteringmicroarraystatisticalmethodsoftwarernaseqgeneexpression
4.00 score 7 scripts 452 downloads
KEGGlincs - Visualize all edges within a KEGG pathway and overlay LINCS data
See what is going on 'under the hood' of KEGG pathways by explicitly re-creating the pathway maps from information obtained from KGML files.
Last updated
networkinferencegeneexpressiondatarepresentationthirdpartyclientcellbiologygraphandnetworkpathwayskeggnetwork
4.00 score 3 scripts 426 downloads
GAprediction - Prediction of gestational age with Illumina HumanMethylation450 data
GAprediction predicts gestational age using Illumina HumanMethylation450 CpG data.
Last updated
immunooncologydnamethylationepigeneticsregressionbiomedicalinformatics
4.00 score 317 downloads
ctsGE - Clustering of Time Series Gene Expression data
Methodology for supervised clustering of potentially many predictor variables, such as genes, in time series datasets. Provides functions that help the user assign genes to a predefined set of model profiles.
Last updated
immunooncologygeneexpressiontranscriptiondifferentialexpressiongenesetenrichmentgeneticsbayesianclusteringtimecoursesequencingrnaseq
4.00 score 1 stars 3 scripts 377 downloads
covRNA - Multivariate Analysis of Transcriptomic Data
This package provides the analysis methods fourthcorner and RLQ analysis for large-scale transcriptomic data.
Last updated
geneexpressiontranscription
4.00 score 7 scripts 346 downloads
LymphoSeq - Analyze high-throughput sequencing of T and B cell receptors
This R package analyzes high-throughput sequencing of T and B cell receptor complementarity determining region 3 (CDR3) sequences generated by Adaptive Biotechnologies' ImmunoSEQ assay. Its input comes from tab-separated value (.tsv) files exported from the ImmunoSEQ analyzer.
Last updated
softwaretechnologysequencingtargetedresequencingalignmentmultiplesequencealignment
4.00 score 10 scripts 398 downloads
MethPed - A DNA methylation classifier tool for the identification of pediatric brain tumor subtypes
Classification of pediatric tumors into biologically defined subtypes is challenging, and multifaceted approaches are needed. To this end, we developed a diagnostic classifier based on DNA methylation profiles. We offer MethPed as an easy-to-use toolbox that allows researchers and clinical diagnosticians to test single samples as well as large cohorts for subclass prediction of pediatric brain tumors. The current version of MethPed can classify the following tumor diagnoses/subgroups: Diffuse Intrinsic Pontine Glioma (DIPG), Ependymoma, Embryonal tumors with multilayered rosettes (ETMR), Glioblastoma (GBM), Medulloblastoma (MB) - Group 3 (MB_Gr3), Group 4 (MB_Gr4), Group WNT (MB_WNT), Group SHH (MB_SHH) and Pilocytic Astrocytoma (PiloAstro).
Last updated
immunooncologydnamethylationclassificationepigenetics
4.00 score 7 scripts 422 downloads
RTNduals - Analysis of co-regulation and inference of 'dual regulons'
RTNduals identifies co-regulatory loops between pairs of regulons inferred by the RTN package by evaluating their shared target genes. It infers dual regulons and tests whether regulator pairs exhibit cooperative or competitive influences on common targets.
Last updated
generegulationgeneexpressionnetworkenrichmentnetworkinferencegraphandnetwork
3.95 score 1 dependents 7 scripts 362 downloads
CONFESS - Cell OrderiNg by FluorEScence Signal
Single Cell Fluidigm Spot Detector.
Last updated
immunooncologygeneexpressiondataimportcellbiologyclusteringrnaseqqualitycontrolvisualizationtimecourseregressionclassification
3.90 score 2 scripts 370 downloads
CHRONOS - CHRONOS: A time-varying method for microRNA-mediated sub-pathway enrichment analysis
A package used for efficient unraveling of the inherent dynamic properties of pathways. MicroRNA-mediated subpathway topologies are extracted and evaluated by exploiting the temporal transition and the fold change activity of the linked genes/microRNAs.
Last updated
systemsbiologygraphandnetworkpathwayskeggopenjdk
3.89 score 13 scripts 390 downloads
MWASTools - MWASTools: an integrated pipeline to perform metabolome-wide association studies
MWASTools provides a complete pipeline to perform metabolome-wide association studies. Key functionalities of the package include: quality control analysis of metabonomic data; MWAS using different association models (partial correlations; generalized linear models); model validation using non-parametric bootstrapping; visualization of MWAS results; NMR metabolite identification using STOCSY; and biological interpretation of MWAS results.
Last updated
metabolomicslipidomicscheminformaticssystemsbiologyqualitycontrol
3.78 score 1 dependents 7 scripts 454 downloads
DEsubs - DEsubs: an R package for flexible identification of differentially expressed subpathways using RNA-seq expression experiments
DEsubs is a network-based systems biology package that extracts disease-perturbed subpathways within a pathway network as recorded by RNA-seq experiments. It contains an extensive and customizable framework covering a broad range of operation modes at all stages of the subpathway analysis, enabling a case-specific approach. The operation modes refer to the pathway network construction and processing, the subpathway extraction, visualization and enrichment analysis with regard to various biological and pharmacological features. Its capabilities render it a tool-guide for both the modeler and experimentalist for the identification of more robust systems-level biomarkers for complex diseases.
Last updated
systemsbiologygraphandnetworkpathwayskegggeneexpressionnetworkenrichmentnetworkrnaseqdifferentialexpressionnormalizationimmunooncology
3.78 score 2 scripts 428 downloads
AMOUNTAIN - Active modules for multilayer weighted gene co-expression networks: a continuous optimization approach
A purely data-driven gene network, the weighted gene co-expression network (WGCN), can be constructed from expression profiles alone. Different layers in such networks may represent different time points, multiple conditions or various species. AMOUNTAIN aims to search for active modules in multi-layer WGCNs using a continuous optimization approach.
Last updated
geneexpressionmicroarraydifferentialexpressionnetworkgsl
3.78 score 1 dependents 6 scripts 530 downloads
miRNAmeConverter - Convert miRNA Names to Different miRBase Versions
Translating mature miRNA names to different miRBase versions, sequence retrieval, checking names for validity and detecting miRBase version of a given set of names (data from http://www.mirbase.org/).
Last updated
preprocessingmirna
3.78 score 10 scripts 340 downloads
funtooNorm - Normalization Procedure for Infinium HumanMethylation450 BeadChip Kit
Provides a function to normalize Illumina Infinium Human Methylation 450 BeadChip (Illumina 450K), correcting for tissue and/or cell type.
Last updated
dnamethylationpreprocessingnormalization
3.70 score 402 downloads
Pigengene - Infers biological signatures from gene expression data
Pigengene package provides an efficient way to infer biological signatures from gene expression profiles. The signatures are independent from the underlying platform, e.g., the input can be microarray or RNA Seq data. It can even infer the signatures using data from one platform, and evaluate them on the other. Pigengene identifies the modules (clusters) of highly coexpressed genes using coexpression network analysis, summarizes the biological information of each module in an eigengene, learns a Bayesian network that models the probabilistic dependencies between modules, and builds a decision tree based on the expression of eigengenes.
Last updated
geneexpressionrnaseqnetworkinferencenetworkgraphandnetworkbiomedicalinformaticssystemsbiologytranscriptomicsclassificationclusteringdecisiontreedimensionreductionprincipalcomponentmicroarraynormalizationimmunooncology
3.67 score 1 dependents 13 scripts 458 downloads
phosphonormalizer - Compensates for the bias introduced by median normalization in
It uses the overlap between enriched and non-enriched datasets to compensate for the bias introduced in global phosphorylation after applying median normalization.
Last updated
softwarestatisticalmethodworkflowstepnormalizationproteomics
3.60 score 329 downloads
GOpro - Find the most characteristic gene ontology terms for groups of human genes
Find the most characteristic gene ontology terms for groups of human genes. This package was created as a part of the thesis which was developed under the auspices of MI^2 Group (http://mi2.mini.pw.edu.pl/, https://github.com/geneticsMiNIng).
Last updated
annotationclusteringgogeneexpressiongenesetenrichmentmultiplecomparisoncpp
3.60 score 2 stars 5 scripts 436 downloads
ASAFE - Ancestry Specific Allele Frequency Estimation
Given admixed individuals' bi-allelic SNP genotypes and ancestry pairs (where each ancestry can take one of three values) for multiple SNPs, perform an EM algorithm to deal with the fact that SNP genotypes are unphased with respect to ancestry pairs, in order to estimate ancestry-specific allele frequencies for all SNPs.
Last updated
snpgenomewideassociationlinkagedisequilibriumbiomedicalinformaticsgeneticsexperimentaldesign
3.60 score 4 scripts 371 downloads
HelloRanges - Introduce *Ranges to bedtools users
Translates bedtools command-line invocations to R code calling functions from the Bioconductor *Ranges infrastructure. This is intended to educate novice Bioconductor users and to compare the syntax and semantics of the two frameworks.
Last updated
sequencingannotationcoveragegenomeannotationdataimportsequencematchingvariantannotation
3.59 score 1 dependents 26 scripts 487 downloads
IWTomics - Interval-Wise Testing for Omics Data
Implementation of the Interval-Wise Testing (IWT) for omics data. This inferential procedure tests for differences in "Omics" data between two groups of genomic regions (or between a group of genomic regions and a reference center of symmetry), and does not require fixing location and scale at the outset.
Last updated
statisticalmethodmultiplecomparisondifferentialexpressiondifferentialmethylationdifferentialpeakcallinggenomeannotationdataimport
3.48 score 5 scripts 470 downloads
qsea - IP-seq data analysis and visualization
qsea (quantitative sequencing enrichment analysis) was developed as the successor of the MEDIPS package for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, qsea provides several functionalities for the analysis of other kinds of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential enrichment between groups of samples.
Last updated
sequencingdnamethylationcpgislandchipseqpreprocessingnormalizationqualitycontrolvisualizationcopynumbervariationchiponchipdifferentialmethylation
3.48 score 9 scripts 376 downloads
MetCirc - Navigating mass spectral similarity in high-resolution MS/MS metabolomics data
MetCirc comprises a workflow to interactively explore high-resolution MS/MS metabolomics data. MetCirc uses the Spectra object infrastructure defined in the package Spectra that stores MS/MS spectra. MetCirc offers functionality to calculate similarity between precursors based on the normalised dot product, neutral losses or user-defined functions and visualise similarities in a circular layout. Within the interactive framework the user can annotate MS/MS features based on their similarity to (known) related MS/MS features.
Last updated
shinyappsmetabolomicsmassspectrometryvisualization
3.48 score 5 scripts 426 downloads
epivizrStandalone - Run Epiviz Interactive Genomic Data Visualization App within R
This package imports the epiviz visualization JavaScript app for interactive visualization of genomic data. The 'epivizrServer' package is used to provide a web server running completely within R. This standalone version allows browsing arbitrary genomes through genome annotations provided by Bioconductor packages.
Last updated
visualizationinfrastructuregui
3.48 score 7 scripts 372 downloads
MMDiff2 - Statistical Testing for ChIP-Seq data sets
This package detects statistically significant differences between read enrichment profiles in different ChIP-Seq samples. To take advantage of shape differences it uses Kernel methods (Maximum Mean Discrepancy, MMD).
Last updated
chipseqdifferentialpeakcallingsequencingsoftware
3.48 score 3 scripts 392 downloads
BasicSTARRseq - Basic peak calling on STARR-seq data
Basic peak calling on STARR-seq data based on a method introduced in "Genome-Wide Quantitative Enhancer Activity Maps Identified by STARR-seq", Arnold et al., Science. 2013 Mar 1;339(6123):1074-7. doi: 10.1126/science.1232542. Epub 2013 Jan 17.
Last updated
peakdetectiongeneregulationfunctionalpredictionfunctionalgenomicscoverage
3.48 score 1 scripts 408 downloads
esetVis - Visualizations of expressionSet Bioconductor object
Utility functions for visualization of an expressionSet (or SummarizedExperiment) Bioconductor object, including spectral map, tsne and linear discriminant analysis. Static plots via the ggplot2 package and interactive plots via the ggvis or rbokeh packages are available.
Last updated
visualizationdatarepresentationdimensionreductionprincipalcomponentpathways
3.34 score 11 scripts 421 downloads
CCPROMISE - PROMISE analysis with Canonical Correlation for Two Forms of High Dimensional Genetic Data
Performs canonical correlation between two forms of high-dimensional genetic data, and associates the first component of each form of data with a specific biologically interesting pattern of associations with multiple endpoints. A probe level analysis is also implemented.
Last updated
microarraygeneexpression
3.30 score 3 scripts 400 downloads
uSORT - uSORT: A self-refining ordering pipeline for gene selection
This package is designed to uncover the intrinsic cell progression path from single-cell RNA-seq data. It incorporates data pre-processing, preliminary PCA gene selection, preliminary cell ordering, feature selection, refined cell ordering, and post-analysis interpretation and visualization.
Last updated
immunooncologyrnaseqguicellbiologydnaseq
3.30 score 3 scripts 341 downloads
BayesKnockdown - BayesKnockdown: Posterior Probabilities for Edges from Knockdown Data
A simple, fast Bayesian method for computing posterior probabilities for relationships between a single predictor variable and multiple potential outcome variables, incorporating prior probabilities of relationships. In the context of knockdown experiments, the predictor variable is the knocked-down gene, while the other genes are potential targets. Can also be used for differential expression/2-class data.
Last updated
networkinferencegeneexpressiongenetargetnetworkbayesian
3.30 score 1 scripts 443 downloads
MODA - MODA: MOdule Differential Analysis for weighted gene co-expression network
MODA can be used to estimate and construct condition-specific gene co-expression networks, and identify differentially expressed subnetworks as conserved or condition specific modules which are potentially associated with relevant biological processes.
Last updated
geneexpressionmicroarraydifferentialexpressionnetwork
3.30 score 9 scripts 371 downloads
covEB - Empirical Bayes estimate of block diagonal covariance matrices
Uses Bayesian methods to estimate correlation matrices, assuming that they can be written and estimated as block diagonal matrices. These block diagonal matrices are determined using shrinkage parameters that set values below the parameter to zero.
Last updated
immunooncologybayesianmicroarrayrnaseqpreprocessingsoftwaregeneexpressionstatisticalmethod
3.30 score 3 scripts 354 downloads
oppar - Outlier profile and pathway analysis in R
An R implementation of the mCOPA package published by Wang et al. (2012). Oppar provides methods for Cancer Outlier Profile Analysis. Although initially developed to detect outlier genes in cancer studies, the methods presented in oppar can be used for outlier profile analysis in general. In addition, tools are provided for gene set enrichment and pathway analysis.
Last updated
pathwaysgenesetenrichmentsystemsbiologygeneexpressionsoftware
3.30 score 3 scripts 396 downloads
EBSEA - Exon Based Strategy for Expression Analysis of genes
Calculates differential expression of genes based on exon counts of genes obtained from RNA-seq sequencing data.
Last updated
softwaredifferentialexpressiongeneexpressionsequencing
3.30 score 5 scripts 350 downloads
garfield - GWAS Analysis of Regulatory or Functional Information Enrichment with LD correction
GARFIELD is a non-parametric functional enrichment analysis approach described in the paper GARFIELD: GWAS analysis of regulatory or functional information enrichment with LD correction. Briefly, it is a method that leverages GWAS findings with regulatory or functional annotations (primarily from ENCODE and Roadmap epigenomics data) to find features relevant to a phenotype of interest. It performs greedy pruning of GWAS SNPs (LD r2 > 0.1) and then annotates them based on functional information overlap. Next, it quantifies Fold Enrichment (FE) at various GWAS significance cutoffs and assesses them by permutation testing, while matching for minor allele frequency, distance to nearest transcription start site and number of LD proxies (r2 > 0.8).
Last updated
softwarestatisticalmethodannotationfunctionalpredictiongenomeannotationcpp
3.30 score 6 scripts 332 downloads
RGraph2js - Convert a Graph into a D3js Script
Generator of web pages which display interactive network/graph visualizations with D3js, jQuery and Raphael.
Last updated
visualizationnetworkgraphandnetworkthirdpartyclient
3.30 score 1 scripts 411 downloads
geneplast - Evolutionary and plasticity analysis of orthologous groups
Geneplast is designed for evolutionary and plasticity analysis based on orthologous groups distribution in a given species tree. It uses Shannon information theory and orthologs abundance to estimate the Evolutionary Plasticity Index. Additionally, it implements the Bridge algorithm to determine the evolutionary root of a given gene based on its orthologs distribution.
Last updated
geneticsgeneregulationsystemsbiology
3.11 score 32 scripts 505 downloads
MGFR - Marker Gene Finder in RNA-seq data
The package is designed to detect marker genes from RNA-seq data.
Last updated
immunooncologygeneticsgeneexpressionrnaseq
2.78 score 1 dependents 2 scripts 398 downloads
BadRegionFinder - BadRegionFinder: an R/Bioconductor package for identifying regions with bad coverage
BadRegionFinder is a package for identifying regions with bad, acceptable, and good coverage in sequence alignment data available as BAM files. The whole genome may be considered, as well as a set of target regions. Various visual and textual types of output are available.
Last updated
coveragesequencingalignmentwholegenomeclassification
2.30 score 3 scripts 437 downloads
ISoLDE - Integrative Statistics of alleLe Dependent Expression
This package provides ISoLDE, a new method for identifying imprinted genes. This method is dedicated to data arising from RNA sequencing technologies. The ISoLDE package implements original statistical methodology described in the publication below.
Last updated
immunooncologygeneexpressiontranscriptiongenesetenrichmentgeneticssequencingrnaseqmultiplecomparisonsnpgeneticvariabilityepigeneticsmathematicalbiologygeneregulationopenmp
2.30 score 8 scripts 338 downloads
ComplexHeatmap - Make Complex Heatmaps
Complex heatmaps are efficient to visualize associations between different sources of data sets and reveal potential patterns. Here the ComplexHeatmap package provides a highly flexible way to arrange multiple heatmaps and supports various annotation graphics.
Last updated
softwarevisualizationsequencingclusteringcomplex-heatmapsheatmap
17.61 score 1.5k stars 187 dependents 20k scripts 32k downloads
ggtree - an R package for visualization of tree and annotation data
'ggtree' extends the 'ggplot2' plotting system, which implements the grammar of graphics. 'ggtree' is designed for visualization and annotation of phylogenetic trees and other tree-like structures with their annotation data.
Last updated
alignmentannotationclusteringdataimportmultiplesequencealignmentphylogeneticsreproducibleresearchsoftwarevisualizationannotationsggplot2phylogenetic-trees
17.32 score 927 stars 110 dependents 7.4k scripts 46k downloads
SummarizedExperiment - A container (S4 class) for matrix-like assays
The SummarizedExperiment container contains one or more assays, each represented by a matrix-like object of numeric or other mode. The rows typically represent genomic ranges of interest and the columns represent samples.
Last updated
geneticsinfrastructuresequencingannotationcoveragegenomeannotationbioconductor-packagecore-package
17.00 score 37 stars 1.3k dependents 13k scripts 93k downloads
DESeq2 - Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Last updated
sequencingrnaseqchipseqgeneexpressiontranscriptionnormalizationdifferentialexpressionbayesianregressionprincipalcomponentclusteringimmunooncologyopenblascpp
16.67 score 461 stars 123 dependents 26k scripts 45k downloads
GenomicAlignments - Representation and manipulation of short genomic alignments
Provides efficient containers for storing and manipulating short genomic alignments (typically obtained by aligning short reads to a reference genome). This includes read counting, computing the coverage, junction detection, and working with the nucleotide content of the alignments.
Last updated
infrastructuredataimportgeneticssequencingrnaseqsnpcoveragealignmentimmunooncologybioconductor-packagecore-package
15.78 score 11 stars 555 dependents 4.4k scripts 48k downloads
KEGGREST - Client-side REST access to the Kyoto Encyclopedia of Genes and Genomes (KEGG)
A package that provides a client interface to the Kyoto Encyclopedia of Genes and Genomes (KEGG) REST API. Only for academic use by academic users belonging to academic institutions (see <https://www.kegg.jp/kegg/rest/>). Note that KEGGREST is based on KEGGSOAP by J. Zhang, R. Gentleman, and Marc Carlson, and KEGG (python package) by Aurelien Mazurie.
Last updated
annotationpathwaysthirdpartyclientkeggbioconductor-packagecore-package
15.07 score 15 stars 768 dependents 1.2k scripts 74k downloads
Gviz - Plotting data and annotation information along genomic coordinates
Genomic data analysis requires integrated visualization of known genomic information and new experimental data. Gviz uses the biomaRt and rtracklayer packages to perform live annotation queries to Ensembl and UCSC and translates the results into, e.g., gene/transcript structures in viewports of the grid graphics package. This results in genomic information plotted together with your data.
Last updated
visualizationmicroarraysequencing
13.51 score 92 stars 49 dependents 2.0k scripts 6.1k downloads
ChIPseeker - ChIPseeker for ChIP peak Annotation, Comparison, and Visualization
This package implements functions to retrieve the nearest genes around a peak, annotate the genomic region of the peak, and estimate the statistical significance of overlap among ChIP peak data sets. It also incorporates the GEO database so that users can compare their own dataset with those deposited in the database. The comparison can be used to infer cooperative regulation and thus to generate hypotheses. Several visualization functions are implemented to summarize the coverage of the peak experiment, the average profile and heatmap of peaks binding to TSS regions, genomic annotation, distance to TSS, and overlap of peaks or genes.
Last updated
annotationchipseqsoftwarevisualizationmultiplecomparisonatac-seqchip-seqcomparisonepigeneticsepigenomics
13.27 score 256 stars 4 dependents 2.7k scripts 5.1k downloads
SNPRelate - Parallel Computing Toolset for Relatedness and Principal Component Analysis of SNP Data
Genome-wide association studies (GWAS) are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed an R package, SNPRelate, to provide a binary format for single-nucleotide polymorphism (SNP) data in GWAS utilizing CoreArray Genomic Data Structure (GDS) data files. The GDS format offers efficient operations specifically designed for two-bit integers, since a SNP can occupy only two bits. SNPRelate is also designed to accelerate two key computations on SNP data using parallel computing for multi-core symmetric multiprocessing computer architectures: Principal Component Analysis (PCA) and relatedness analysis using Identity-By-Descent measures. The SNP GDS format is also used by the GWASTools package with the support of S4 classes and generic functions. The extended GDS format is implemented in the SeqArray package to support the storage of single nucleotide variations (SNVs), insertion/deletion polymorphisms (indels) and structural variation calls in whole-genome and whole-exome variant data.
Last updated
infrastructuregeneticsstatisticalmethodprincipalcomponentbioinformaticsgds-formatpcasimdsnpopenblascpp
12.70 score 113 stars 18 dependents 1.8k scripts 3.0k downloads
bsseq - Analyze, manage and store whole-genome methylation data
A collection of tools for analyzing and visualizing whole-genome methylation data from sequencing. This includes whole-genome bisulfite sequencing and Oxford nanopore data.
Last updated
dnamethylationcpp
12.58 score 38 stars 19 dependents 776 scripts 2.7k downloads
AnnotationHub - Client to access AnnotationHub resources
This package provides a client for the Bioconductor AnnotationHub web resource. The AnnotationHub web resource provides a central location where genomic files (e.g., VCF, bed, wig) and other resources from standard locations (e.g., UCSC, Ensembl) can be discovered. The resource includes metadata about each resource, e.g., a textual description, tags, and date of modification. The client creates and manages a local cache of files retrieved by the user, helping with quick and reproducible access.
Last updated
infrastructuredataimportguithirdpartyclientcore-packageu24ca289073
12.49 score 19 stars 115 dependents 3.6k scripts 20k downloads
ReactomePA - Reactome Pathway Analysis
This package provides functions for pathway analysis based on REACTOME pathway database. It implements enrichment analysis, gene set enrichment analysis and several functions for visualization. This package is not affiliated with the Reactome team.
Last updated
pathwaysvisualizationannotationmultiplecomparisongenesetenrichmentreactomeenrichment-analysisreactome-pathway-analysisreactomepa
12.48 score 45 stars 9 dependents 2.1k scripts 5.8k downloads
TFBSTools - Software Package for Transcription Factor Binding Site (TFBS) Analysis
TFBSTools is a package for the analysis and manipulation of transcription factor binding sites. It includes matrix conversion between Position Frequency Matrix (PFM), Position Weight Matrix (PWM) and Information Content Matrix (ICM). It can also scan for putative TFBS in sequences or alignments, query the JASPAR database, and provide a wrapper for de novo motif discovery software.
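To make the PFM, PWM and information-content conversions concrete, here is a minimal Python sketch of the textbook formulas. It is not the TFBSTools API: the function names, the pseudocount of 0.8, the uniform background of 0.25 and the example counts are all assumptions chosen for illustration.

```python
import math


def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    n_pos = len(next(iter(pfm.values())))
    pwm = {base: [] for base in pfm}
    for pos in range(n_pos):
        total = sum(pfm[base][pos] for base in pfm)
        for base in pfm:
            # Share the pseudocount among bases according to the background.
            prob = (pfm[base][pos] + pseudocount * background) / (total + pseudocount)
            pwm[base].append(math.log2(prob / background))
    return pwm


def information_content(pfm):
    """Per-position information content in bits (maximum 2 for DNA)."""
    n_pos = len(next(iter(pfm.values())))
    bits = []
    for pos in range(n_pos):
        total = sum(pfm[base][pos] for base in pfm)
        entropy = 0.0
        for base in pfm:
            p = pfm[base][pos] / total
            if p > 0:  # treat 0 * log2(0) as 0
                entropy -= p * math.log2(p)
        bits.append(2.0 - entropy)
    return bits


# Hypothetical two-position motif: position 0 is always A, position 1 is C or G.
pfm = {"A": [10, 0], "C": [0, 5], "G": [0, 5], "T": [0, 0]}
pwm = pfm_to_pwm(pfm)
ic = information_content(pfm)
print(ic)
```

A fully conserved position carries 2 bits, while a position split evenly between two bases carries 1 bit. The pseudocount keeps the log-odds finite for bases that were never observed.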
Last updated
motifannotationgeneregulationmotifdiscoverytranscriptionalignment
12.47 score 35 stars 17 dependents 1.8k scripts 7.7k downloads
metagenomeSeq - Statistical analysis for sparse high-throughput sequencing
metagenomeSeq is designed to determine features (be it Operational Taxonomic Unit (OTU), species, etc.) that are differentially abundant between two or more groups of multiple samples. metagenomeSeq is designed to address the effects of both normalization and under-sampling of microbial communities on disease association detection and the testing of feature correlations.
Last updated
immunooncologyclassificationclusteringgeneticvariabilitydifferentialexpressionmicrobiomemetagenomicsnormalizationvisualizationmultiplecomparisonsequencingsoftware
12.02 score 74 stars 6 dependents 656 scripts 2.9k downloads
XVector - Foundation of external vector representation and manipulation in Bioconductor
Provides memory efficient S4 classes for storing sequences "externally" (e.g. behind an R external pointer, or on disk).
Last updated
infrastructuredatarepresentationbioconductor-packagecore-packagezlib
11.62 score 3 stars 1.8k dependents 85 scripts 112k downloads
systemPipeR - systemPipeR: A Multipurpose Workflow Management System for Reproducible Data Analysis
systemPipeR is a workflow management environment for reproducible data analysis that integrates R with command-line software. It enables researchers to design, execute, and report complex workflows on local machines and HPC systems. The framework combines R-based analysis with external tools through a Common Workflow Language (CWL) interface, manages workflow dependencies and restart capabilities, and automatically generates reproducible scientific analysis reports. The companion package systemPipeRdata provides ready-to-use workflow templates that simplify workflow setup and customization. Alternatively, workflow templates can be loaded from dedicated GitHub repositories.
Last updated
geneticsinfrastructuredataimportsequencingrnaseqriboseqchipseqmethylseqsnpgeneexpressioncoveragegenesetenrichmentalignmentqualitycontrolimmunooncologyreportwritingworkflowstepworkflowmanagement
11.57 score 52 stars 3 dependents 488 scripts 2.7k downloads
BiocStyle - Standard styles for vignettes and other Bioconductor documents
Provides standard formatting styles for Bioconductor PDF and HTML documents. Package vignettes illustrate use and functionality.
Last updated
softwarebioconductor-packagecore-package
11.52 score 15 stars 41 dependents 1.7k scripts 13k downloads
pathview - a tool set for pathway based data integration and visualization
Pathview is a tool set for pathway-based data integration and visualization. It maps and renders a wide variety of biological data on relevant pathway graphs. All users need to do is supply their data and specify the target pathway. Pathview automatically downloads the pathway graph data, parses the data file, maps user data to the pathway, and renders the pathway graph with the mapped data. In addition, Pathview integrates seamlessly with pathway and gene set (enrichment) analysis tools for large-scale and fully automated analysis.
Last updated
pathwaysgraphandnetworkvisualizationgenesetenrichmentdifferentialexpressiongeneexpressionmicroarrayrnaseqgeneticsmetabolomicsproteomicssystemsbiologysequencing
11.36 score 48 stars 10 dependents 2.2k scripts 7.3k downloads
ALDEx2 - Analysis Of Differential Abundance Taking Sample and Scale Variation Into Account
A differential abundance analysis for the comparison of two or more conditions. Useful for analyzing data from standard RNA-seq or meta-RNA-seq assays as well as selected and unselected values from in-vitro sequence selections. Uses a Dirichlet-multinomial model to infer abundance from counts, optimized for three or more experimental replicates. The method infers biological and sampling variation to calculate the expected false discovery rate, given the variation, based on a Wilcoxon Rank Sum test and Welch's t-test (via aldex.ttest), a Kruskal-Wallis test (via aldex.kw), a generalized linear model (via aldex.glm), or a correlation test (via aldex.corr). All tests report predicted p-values and posterior Benjamini-Hochberg corrected p-values. ALDEx2 also calculates expected standardized effect sizes for paired or unpaired study designs. ALDEx2 can now be used to estimate the effect of scale on the results and report on the scale-dependent robustness of results.
Last updated
differentialexpressionrnaseqtranscriptomicsgeneexpressiondnaseqchipseqbayesiansequencingsoftwaremicrobiomemetagenomicsimmunooncologyscale simulationposterior p-value
10.95 score 31 stars 3 dependents 592 scripts 2.2k downloads
DirichletMultinomial - Dirichlet-Multinomial Mixture Model Machine Learning for Microbiome Data
Dirichlet-multinomial mixture models can be used to describe variability in microbial metagenomic data. This package is an interface to code originally made available by Holmes, Harris, and Quince, 2012, PLoS ONE 7(2): 1-15, as discussed further in the man page for this package, ?DirichletMultinomial.
Last updated
immunooncologymicrobiomesequencingclusteringclassificationmetagenomicsgsl
10.94 score 12 stars 27 dependents 154 scripts 9.3k downloads
illuminaio - Parsing Illumina Microarray Output Files
Tools for parsing Illumina's microarray output files, including IDAT.
Last updated
infrastructuredataimportmicroarrayproprietaryplatformsbioconductor
10.57 score 5 stars 39 dependents 93 scripts 4.8k downloads
pRoloc - A unifying bioinformatics framework for spatial proteomics
The pRoloc package implements machine learning and visualisation methods for the analysis and interrogation of quantitative mass spectrometry data to reliably infer protein sub-cellular localisation.
Last updated
immunooncologyproteomicsmassspectrometryclassificationclusteringqualitycontrolbioconductorproteomics-dataspatial-proteomicsvisualisationopenblascpp
10.42 score 16 stars 2 dependents 106 scripts 777 downloads
QDNAseq - Quantitative DNA Sequencing for Chromosomal Aberrations
Quantitative DNA sequencing for chromosomal aberrations. The genome is divided into non-overlapping fixed-sized bins; the number of sequence reads in each bin is counted, adjusted with a simultaneous two-dimensional loess correction for sequence mappability and GC content, and filtered to remove spurious regions in the genome. Downstream steps of segmentation and calling are also implemented via the packages DNAcopy and CGHcall, respectively.
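The first step in this description, counting reads in fixed-size non-overlapping bins, can be sketched in a few lines of Python. This is only an illustration of the binning idea; the function name, bin size and read positions are hypothetical, and the loess correction and filtering steps are not shown.

```python
from collections import Counter


def count_reads_in_bins(read_starts, bin_size):
    """Count reads per fixed-size, non-overlapping bin along one chromosome."""
    # Integer division maps a 0-based start position to its bin index.
    counts = Counter(pos // bin_size for pos in read_starts)
    n_bins = max(counts) + 1 if counts else 0
    # Return a dense list so that empty bins are reported as zero.
    return [counts.get(i, 0) for i in range(n_bins)]


# Hypothetical 0-based read start positions on a single chromosome.
read_starts = [5, 120, 130, 999, 1000, 2500]
bin_counts = count_reads_in_bins(read_starts, bin_size=1000)
print(bin_counts)
```

In a real analysis the bins would cover the full chromosome length rather than stopping at the last observed read, and the raw counts would then be corrected and filtered as the description outlines.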
Last updated
copynumbervariationdnaseqgeneticsgenomeannotationpreprocessingqualitycontrolsequencing
10.25 score 54 stars 4 dependents 228 scripts 888 downloads
derfinder - Annotation-agnostic differential expression analysis of RNA-seq data at base-pair resolution via the DER Finder approach
This package provides functions for annotation-agnostic differential expression analysis of RNA-seq data. Two implementations of the DER Finder approach are included in this package: (1) single base-level F-statistics and (2) DER identification at the expressed regions-level. The DER Finder approach can also be used to identify differentially bound ChIP-seq peaks.
Last updated
differentialexpressionsequencingrnaseqchipseqdifferentialpeakcallingsoftwareimmunooncologycoverageannotation-agnosticbioconductorderfinder
10.19 score 44 stars 6 dependents 77 scripts 1.1k downloads
EnrichmentBrowser - Seamless navigation through combined results of set-based and network-based enrichment analysis
The EnrichmentBrowser package implements essential functionality for the enrichment analysis of gene expression data. The analysis combines the advantages of set-based and network-based enrichment analysis in order to derive high-confidence gene sets and biological pathways that are differentially regulated in the expression data under investigation. Besides, the package facilitates the visualization and exploration of such sets and pathways.
Last updated
immunooncologymicroarrayrnaseqgeneexpressiondifferentialexpressionpathwaysgraphandnetworknetworkgenesetenrichmentnetworkenrichmentvisualizationreportwriting
10.13 score 22 stars 3 dependents 282 scripts 920 downloads
RUVSeq - Remove Unwanted Variation from RNA-Seq Data
This package implements the remove unwanted variation (RUV) methods of Risso et al. (2014) for the normalization of RNA-Seq read counts between samples.
Last updated
immunooncologydifferentialexpressionpreprocessingrnaseqsoftware
10.10 score 15 stars 5 dependents 606 scripts 1.5k downloads
piano - Platform for integrative analysis of omics data
Piano performs gene set analysis using various statistical methods, from different gene level statistics and a wide range of gene-set collections. Furthermore, the Piano package contains functions for combining the results of multiple runs of gene set analyses.
Last updated
microarraypreprocessingqualitycontroldifferentialexpressionvisualizationgeneexpressiongenesetenrichmentpathwaysbioconductorbioconductor-packagebioinformaticsgene-set-enrichmenttranscriptomics
9.85 score 15 stars 9 dependents 219 scripts 934 downloads
GenomicInteractions - Utilities for handling genomic interaction data
Utilities for handling genomic interaction data such as ChIA-PET or Hi-C, annotating genomic features with interaction information, and producing plots and summary statistics.
Last updated
softwareinfrastructuredataimportdatarepresentationhic
9.71 score 7 stars 5 dependents 271 scripts 836 downloads
DEGreport - Report of DEG analysis
Creation of ready-to-share figures of differential expression analyses of count data. It integrates some of the code mentioned in the DESeq2 and edgeR vignettes, and reports a ranked list of genes according to the fold changes mean and variability for each selected gene.
Last updated
differentialexpressionvisualizationrnaseqreportwritinggeneexpressionimmunooncologybioconductordifferential-expressionqcreportrna-seqsmallrna
9.69 score 28 stars 1 dependents 492 scripts 1.2k downloads
AnnotationForge - Tools for building SQLite-based annotation data packages
Provides code for generating Annotation packages and their databases. Packages produced are intended to be used with AnnotationDbi.
Last updated
annotationinfrastructurebioconductor-packagecore-package
9.51 score 5 stars 16 dependents 214 scripts 3.2k downloads
MSstats - Protein Significance Analysis in DDA, SRM and DIA for Label-free or Label-based Proteomics Experiments
A set of tools for statistical relative protein significance analysis in DDA, SRM and DIA experiments.
Last updated
immunooncologymassspectrometryproteomicssoftwarenormalizationqualitycontroltimecourseopenblascpp
9.41 score 7 dependents 383 scripts 1.3k downloads
UniProt.ws - R Interface to UniProt Web Services
The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. This package provides a collection of functions for retrieving, processing, and re-packaging UniProt web services. The package makes use of UniProt's modernized REST API and allows mapping of identifiers across different databases.
Last updated
annotationinfrastructuregokeggbiocartabioconductor-packagecore-package
9.11 score 10 stars 6 dependents 255 scripts 1.1k downloads
monocle - Clustering, differential expression, and trajectory analysis for single-cell RNA-Seq
Monocle performs differential expression and time-series analysis for single-cell expression experiments. It orders individual cells according to progress through a biological process, without knowing ahead of time which genes define progress through that process. Monocle also performs differential expression analysis, clustering, visualization, and other useful tasks on single cell expression data. It is designed to work with RNA-Seq and qPCR data, but could be used with other types as well.
Last updated
immunooncologysequencingrnaseqgeneexpressiondifferentialexpressioninfrastructuredataimportdatarepresentationvisualizationclusteringmultiplecomparisonqualitycontrolcpp
8.81 score 2 dependents 1.8k scripts 5.9k downloads
GenomicFiles - Distributed computing by file or by range
This package provides infrastructure for parallel computations distributed 'by file' or 'by range'. User defined MAPPER and REDUCER functions provide added flexibility for data combination and manipulation.
Last updated
geneticsinfrastructuredataimportsequencingcoveragebioconductor-packagecore-packageu24ca289073
8.69 score 3 stars 16 dependents 98 scripts 1.6k downloads
SeqVarTools - Tools for variant data
An interface to the fast-access storage format for VCF data provided in SeqArray, with tools for common operations and analysis.
Last updated
snpgeneticvariabilitysequencinggenetics
8.69 score 3 stars 2 dependents 482 scripts 933 downloads
QuasR - Quantify and Annotate Short Reads in R
This package provides a framework for the quantification and analysis of Short Reads. It covers a complete workflow starting from raw sequence reads, over creation of alignments and quality control plots, to the quantification of genomic regions of interest. Read alignments are either generated through Rbowtie (data from DNA/ChIP/ATAC/Bis-seq experiments) or Rhisat2 (data from RNA-seq experiments that require spliced alignments), or can be provided in the form of bam files.
Last updated
geneticspreprocessingsequencingchipseqrnaseqmethylseqcoveragealignmentqualitycontrolimmunooncologycurlbzip2xz-utilszlibcpp
8.64 score 7 stars 1 dependents 105 scripts 866 downloads
trackViewer - A R/Bioconductor package with web interface for drawing elegant interactive tracks or lollipop plot to facilitate integrated analysis of multi-omics data
Visualize mapped reads along with annotation as track layers for NGS dataset such as ChIP-seq, RNA-seq, miRNA-seq, DNA-seq, SNPs and methylation data.
Last updated
visualization
8.60 score 3 dependents 227 scripts 1.3k downloads
ChemmineOB - R interface to a subset of OpenBabel functionalities
ChemmineOB provides an R interface to a subset of cheminformatics functionalities implemented by the OpenBabel C++ project. OpenBabel is an open source cheminformatics toolbox that includes utilities for structure format interconversions, descriptor calculations, compound similarity searching and more. ChemmineOB aims to make a subset of these utilities available from within R. For non-developers, ChemmineOB is primarily intended to be used from ChemmineR as an add-on package rather than used directly.
Last updated
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsmicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportclusteringproteomicsmetabolomicsopenbabelcpp
8.51 score 10 stars 1 dependents 84 scripts 954 downloads
ClassifyR - A framework for cross-validated classification problems, with applications to differential variability and differential distribution testing
The software formalises a framework for classification and survival model evaluation in R. There are four stages: data transformation, feature selection, model training, and prediction. The requirements of variable types and variable order are fixed, but specialised variables for functions can also be provided. The framework is wrapped in a driver loop that reproducibly carries out a number of cross-validation schemes. Functions for differential mean, differential variability, and differential distribution are included. Additional functions may be developed by the user, by creating an interface to the framework.
Last updated
classificationsurvivalcpp
8.44 score 6 stars 3 dependents 58 scripts 692 downloads
STRINGdb - STRINGdb - Protein-Protein Interaction Networks and Functional Enrichment Analysis
The STRINGdb package provides an R interface to STRING, a protein-protein interaction database and functional enrichment analysis tool (https://string-db.org).
Last updated
network
8.23 score 7 dependents 556 scripts 3.6k downloads
EBSeq - An R package for gene and isoform differential expression analysis of RNA-seq data
Differential Expression analysis at both gene and isoform level using RNA-seq data
Last updated
immunooncologystatisticalmethoddifferentialexpressionmultiplecomparisonrnaseqsequencingcpp
8.17 score 6 dependents 237 scripts 990 downloads
mzID - An mzIdentML parser for R
A parser for mzIdentML files implemented using the XML package. The parser tries to be general and able to handle all types of mzIdentML files with the drawback of having less 'pretty' output than a vendor specific parser. Please contact the maintainer with any problems and supply an mzIdentML file so the problems can be fixed quickly.
Last updated
immunooncologydataimportmassspectrometryproteomics
8.16 score 39 dependents 38 scripts 5.5k downloads
openCyto - Hierarchical Gating Pipeline for flow cytometry data
This package is designed to apply automated gating methods sequentially, mimicking a manual gating strategy.
Last updated
immunooncologyflowcytometrydataimportpreprocessingdatarepresentationcpp
8.12 score 1 dependents 472 scripts 1.6k downloads
motifStack - Plot stacked logos for single or multiple DNA, RNA and amino acid sequence
The motifStack package is designed for graphic representation of multiple motifs with different similarity scores. It works with both DNA/RNA sequence motif and amino acid sequence motif. In addition, it provides the flexibility for users to customize the graphic parameters such as the font type and symbol colors.
Last updated
sequencematchingvisualizationsequencingmicroarrayalignmentchipchipchipseqmotifannotationdataimport
8.08 score 6 dependents 281 scripts 1.6k downloads
COMPASS - Combinatorial Polyfunctionality Analysis of Single Cells
COMPASS is a statistical framework that enables unbiased analysis of antigen-specific T-cell subsets. COMPASS uses a Bayesian hierarchical framework to model all observed cell-subsets and select the most likely to be antigen-specific while regularizing the small cell counts that often arise in multi-parameter space. The model provides a posterior probability of specificity for each cell subset and each sample, which can be used to profile a subject's immune response to external stimuli such as infection or vaccination.
Last updated
immunooncologyflowcytometrycpp
7.95 score 8 stars 51 scripts 486 downloads
wateRmelon - Illumina DNA methylation array normalization and metrics
15 flavours of betas and three performance metrics, with methods for objects produced by methylumi and minfi packages.
Last updated
dnamethylationmicroarraytwochannelpreprocessingqualitycontrol
7.95 score 6 dependents 338 scripts 1.5k downloads
compcodeR - RNAseq data simulation, differential expression analysis and performance comparison of differential expression methods
This package provides extensive functionality for comparing results obtained by different methods for differential expression analysis of RNAseq data. It also contains functions for simulating count data. Finally, it provides convenient interfaces to several packages for performing the differential expression analysis. These can also be used as templates for setting up and running a user-defined differential analysis workflow within the framework of the package.
Last updated
immunooncologyrnaseqdifferentialexpression
7.79 score 12 stars 32 scripts 484 downloads
CNEr - CNE Detection and Visualization
Large-scale identification and advanced visualization of sets of conserved noncoding elements.
Last updated
generegulationvisualizationdataimport
7.76 score 4 stars 40 scripts 5.0k downloads
metaMS - MS-based metabolomics annotation pipeline
MS-based metabolomics data processing and compound annotation pipeline.
Last updated
immunooncologymassspectrometrymetabolomics
7.50 score 15 stars 15 scripts 424 downloads
ChAMP - Chip Analysis Methylation Pipeline for Illumina HumanMethylation450 and EPIC
The package includes quality control metrics, a selection of normalization methods and novel methods to identify differentially methylated regions and to highlight copy number alterations.
Last updated
microarraymethylationarraynormalizationtwochannelcopynumberdnamethylation
7.40 score 1 dependents 344 scripts 1.6k downloads
ASSIGN - Adaptive Signature Selection and InteGratioN (ASSIGN)
ASSIGN is a computational tool to evaluate the pathway deregulation/activation status in individual patient samples. ASSIGN employs a flexible Bayesian factor analysis approach that adapts predetermined pathway signatures derived either from knowledge-based literature or from perturbation experiments to the cell-/tissue-specific pathway signatures. The deregulation/activation level of each context-specific pathway is quantified to a score, which represents the extent to which a patient sample encompasses the pathway deregulation/activation signature.
Last updated
softwaregeneexpressionpathwaysbayesian
7.38 score 2 stars 1 dependents 66 scripts 480 downloads
Rcpi - Molecular Informatics Toolkit for Compound-Protein Interaction in Drug Discovery
A molecular informatics toolkit with an integration of bioinformatics and chemoinformatics tools for drug discovery.
Last updated
softwaredataimportdatarepresentationfeatureextractioncheminformaticsbiomedicalinformaticsproteomicsgosystemsbiologybioconductorbioinformaticsdrug-discoveryfeature-extractionfingerprintmolecular-descriptorsprotein-sequences
7.37 score 39 stars 30 scripts 476 downloads
shinyMethyl - Interactive visualization for Illumina methylation arrays
Interactive tool for visualizing Illumina methylation array data. Both the 450k and EPIC array are supported.
Last updated
dnamethylationmicroarraytwochannelpreprocessingqualitycontrolmethylationarray
7.34 score 6 stars 52 scripts 596 downloads
TSCAN - Tools for Single-Cell Analysis
Provides methods to perform trajectory analysis based on a minimum spanning tree constructed from cluster centroids. Computes pseudotemporal cell orderings by mapping cells in each cluster (or new cells) to the closest edge in the tree. Uses linear modelling to identify differentially expressed genes along each path through the tree. Several plotting and interactive visualization functions are also implemented.
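The minimum spanning tree over cluster centroids mentioned above can be illustrated with a short Python sketch using Prim's algorithm. This is a generic MST construction, not TSCAN's own code; the function name, cluster labels and coordinates are hypothetical.

```python
import math


def centroid_mst(centroids):
    """Build a minimum spanning tree over cluster centroids (Prim's algorithm)."""
    names = list(centroids)
    in_tree = {names[0]}  # grow the tree from an arbitrary first cluster
    edges = []
    while len(in_tree) < len(names):
        # Pick the shortest edge joining a tree node to a node outside the tree.
        dist, a, b = min(
            (math.dist(centroids[a], centroids[b]), a, b)
            for a in in_tree
            for b in names
            if b not in in_tree
        )
        edges.append((a, b))
        in_tree.add(b)
    return edges


# Hypothetical centroids in a 2-D reduced-dimension space.
centroids = {"c1": (0.0, 0.0), "c2": (1.0, 0.0), "c3": (5.0, 0.0)}
mst_edges = centroid_mst(centroids)
print(mst_edges)
```

Each path through the resulting tree is a candidate trajectory. Cells would then be ordered by projecting them onto the nearest edge, a step this sketch leaves out.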
Last updated
geneexpressionvisualizationgui
7.33 score 3 dependents 236 scripts 932 downloads
GeneOverlap - Test and visualize gene overlaps
Test two sets of gene lists and visualize the results.
Last updated
multiplecomparisonvisualization
7.26 score 1 dependents 428 scripts 1.4k downloads
GOexpress - Visualise microarray and RNAseq data using gene ontology annotations
The package contains methods to visualise the expression profile of genes from a microarray or RNA-seq experiment, and offers a supervised clustering approach to identify GO terms containing genes with expression levels that best classify two or more predefined groups of samples. Annotations for the genes present in the expression dataset may be obtained from Ensembl through the biomaRt package, if not provided by the user. The default random forest framework is used to evaluate the capacity of each gene to cluster samples according to the factor of interest. Finally, GO terms are scored by averaging the rank (alternatively, score) of their respective gene sets to cluster the samples. P-values may be computed to assess the significance of GO term ranking. Visualisation functions include gene expression profiles, gene ontology-based heatmaps, and hierarchical clustering of experimental samples using gene expression data.
Last updated
softwaregeneexpressiontranscriptiondifferentialexpressiongenesetenrichmentdatarepresentationclusteringtimecoursemicroarraysequencingrnaseqannotationmultiplecomparisonpathwaysgovisualizationimmunooncologybioconductorbioconductor-packagebioconductor-statsgeneontologygeneset-enrichment
7.22 score 9 stars 44 scripts 510 downloads
DSS - Dispersion shrinkage for sequencing data
DSS is an R library performing differential analysis for count-based sequencing data. It detects differentially expressed genes (DEGs) from RNA-seq, and differentially methylated loci or regions (DML/DMRs) from bisulfite sequencing (BS-seq). The core of DSS is a new dispersion shrinkage method for estimating the dispersion parameter from Gamma-Poisson or Beta-Binomial distributions.
Last updated
sequencingrnaseqdnamethylationgeneexpressiondifferentialexpressiondifferentialmethylation
7.21 score 4 dependents 328 scripts 1.4k downloads
SomaticSignatures - Somatic Signatures
The SomaticSignatures package identifies mutational signatures of single nucleotide variants (SNVs). It provides an infrastructure related to the methodology described in Nik-Zainal (2012, Cell), with flexibility in the matrix decomposition algorithms.
Last updated
sequencingsomaticmutationvisualizationclusteringgenomicvariationstatisticalmethod
7.19 score 23 stars 1 dependents 75 scripts 580 downloads
viper - Virtual Inference of Protein-activity by Enriched Regulon analysis
Inference of protein activity from gene expression data, including the VIPER and msVIPER algorithms
Last updated
systemsbiologynetworkenrichmentgeneexpressionfunctionalpredictiongeneregulation
7.10 score 5 dependents 326 scripts 1.4k downloads
MotifDb - An Annotated Collection of Protein-DNA Binding Sequence Motifs
More than 9900 annotated position frequency matrices from 14 public sources, for multiple organisms.
Last updated
motifannotation
7.06 score 3 dependents 488 scripts 1.3k downloads
fmcsR - Mismatch Tolerant Maximum Common Substructure Searching
The fmcsR package introduces an efficient maximum common substructure (MCS) algorithm combined with a novel matching strategy that allows for atom and/or bond mismatches in the substructures shared among two small molecules. The resulting flexible MCSs (FMCSs) are often larger than strict MCSs, resulting in the identification of more common features in their source structures, as well as a higher sensitivity in finding compounds with weak structural similarities. The fmcsR package provides several utilities to use the FMCS algorithm for pairwise compound comparisons, structure similarity searching and clustering.
Last updated
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsmicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportclusteringproteomicsmetabolomicscpp
7.04 score 6 stars 1 dependents 77 scripts 689 downloads
NOISeq - Exploratory analysis and differential expression for RNA-seq data
Analysis of RNA-seq expression data or similar kinds of data. Exploratory plots to evaluate saturation, count distribution, expression per chromosome, type of detected features, feature length, etc. Differential expression between two experimental conditions with no parametric assumptions.
Last updated
immunooncologyrnaseqdifferentialexpressionvisualizationsequencing
6.93 score 5 dependents 269 scripts 1.1k downloads
pRolocGUI - Interactive visualisation of spatial proteomics data
The package pRolocGUI comprises functions to interactively visualise spatial proteomics data on the basis of pRoloc, pRolocdata and shiny.
Last updated
proteomicsvisualizationgui
6.90 score 8 stars 3 scripts 478 downloads
ReportingTools - Tools for making reports in various formats
The ReportingTools software package enables users to easily display reports of analysis results generated from sources such as microarray and sequencing data. The package allows users to create HTML pages that may be viewed on a web browser such as Safari, or in other formats readable by programs such as Excel. Users can generate tables with sortable and filterable columns, make and display plots, and link table entries to other data sources such as NCBI or larger plots within the HTML page. Using the package, users can also produce a table of contents page to link various reports together for a particular project that can be viewed in a web browser. For more examples, please visit our site: http://research-pub.gene.com/ReportingTools.
Last updated
immunooncologysoftwarevisualizationmicroarrayrnaseqgodatarepresentationgenesetenrichment
6.88 score 2 dependents 132 scripts 1.3k downloads
CRISPRseek - Design of guide RNAs in CRISPR genome-editing systems
The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.
Last updated
immunooncologygeneregulationsequencematchingcrispr
6.79 score 2 dependents 51 scripts 471 downloads
quantro - A test for when to use quantile normalization
A data-driven test for the assumptions of quantile normalization using raw data such as objects that inherit eSets (e.g. ExpressionSet, MethylSet). Group level information about each sample (such as Tumor / Normal status) must also be provided because the test assesses if there are global differences in the distributions between the user-defined groups.
Last updated
normalizationpreprocessingmultiplecomparisonmicroarraysequencing
6.74 score 1 dependents 304 scripts 728 downloads
deepSNV - Detection of subclonal SNVs in deep sequencing data.
This package provides quantitative variant callers for detecting subclonal mutations in ultra-deep (>=100x coverage) sequencing experiments. The deepSNV algorithm is used for a comparative setup with a control experiment of the same loci and uses a beta-binomial model and a likelihood ratio test to discriminate sequencing errors and subclonal SNVs. The shearwater algorithm computes a Bayes classifier based on a beta-binomial model for variant calling with multiple samples for precisely estimating model parameters - such as local error rates and dispersion - and prior knowledge, e.g. from variation databases such as COSMIC.
Last updated
geneticvariabilitysnpsequencinggeneticsdataimportcurlbzip2xz-utilszlibcpp
6.70 score 1 dependents 56 scripts 715 downloads
categoryCompare - Meta-analysis of high-throughput experiments using feature annotations
Calculates significant annotations (categories) in each of two (or more) feature (i.e. gene) lists, determines the overlap between the annotations, and returns graphical and tabular data about the significant annotations and which combinations of feature lists the annotations were found to be significant. Interactive exploration is facilitated through the use of RCytoscape (heavily suggested).
Last updated
annotationgomultiplecomparisonpathwaysgeneexpressionbioconductor
6.68 score 6 stars 460 downloads
Rbowtie - R bowtie wrapper
This package provides an R wrapper around the popular bowtie short read aligner and around SpliceMap, a de novo splice junction discovery and alignment tool. The package is used by the QuasR Bioconductor package. We recommend using the QuasR package instead of using Rbowtie directly.
Last updated
sequencingalignment
6.68 score 1 stars 8 dependents 25 scripts 834 downloads
missMethyl - Analysing Illumina HumanMethylation BeadChip Data
Normalisation, testing for differential variability and differential methylation and gene set testing for data from Illumina's Infinium HumanMethylation arrays. The normalisation procedure is subset-quantile within-array normalisation (SWAN), which allows Infinium I and II type probes on a single array to be normalised together. The test for differential variability is based on an empirical Bayes version of Levene's test. Differential methylation testing is performed using RUV, which can adjust for systematic errors of unknown origin in high-dimensional data by using negative control probes. Gene ontology analysis is performed by taking into account the number of probes per gene on the array, as well as taking into account multi-gene associated probes.
Last updated
normalizationdnamethylationmethylationarraygenomicvariationgeneticvariabilitydifferentialmethylationgenesetenrichment
6.62 score 5 dependents 440 scripts 2.1k downloads
rpx - R Interface to the ProteomeXchange Repository
The rpx package implements an interface to proteomics data submitted to the ProteomeXchange consortium.
Last updated
immunooncologyproteomicsmassspectrometrydataimportthirdpartyclientbioconductordatamass-spectrometryproteomexchange
6.54 score 7 stars 50 scripts 720 downloads
gwascat - representing and modeling data in the EMBL-EBI GWAS catalog
Represent and model data in the EMBL-EBI GWAS catalog.
Last updated
genetics
6.54 score 2 dependents 128 scripts 1.1k downloads
bioassayR - Cross-target analysis of small molecule bioactivity
bioassayR is a computational tool that enables simultaneous analysis of thousands of bioassay experiments performed over a diverse set of compounds and biological targets. Unique features include support for large-scale cross-target analyses of both public and custom bioassays, generation of high throughput screening fingerprints (HTSFPs), and an optional preloaded database that provides access to a substantial portion of publicly available bioactivity data.
Last updated
immunooncologymicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportbioinformaticsproteomicsmetabolomics
6.48 score 6 stars 46 scripts 480 downloads
HTSFilter - Filter replicated high-throughput transcriptome sequencing data
This package implements a filtering procedure for replicated transcriptome sequencing data based on a global Jaccard similarity index in order to identify genes with low, constant levels of expression across one or more experimental conditions.
Last updated
sequencingrnaseqpreprocessingdifferentialexpressiongeneexpressionnormalizationimmunooncology
6.41 score 2 dependents 43 scripts 552 downloads
VariantFiltering - Filtering of coding and non-coding genetic variants
Filter genetic variants using different criteria such as inheritance model, amino acid change consequence, minor allele frequencies across human populations, splice site strength, conservation, etc.
Last updated
geneticshomo_sapiensannotationsnpsequencinghighthroughputsequencing
6.30 score 4 stars 25 scripts 520 downloads
DMRcate - Methylation array and sequencing spatial analysis methods
De novo identification and extraction of differentially methylated regions (DMRs) from the human genome using Whole Genome Bisulfite Sequencing (WGBS) and Illumina Infinium Array (450K and EPIC) data. Provides functionality for filtering probes possibly confounded by SNPs and cross-hybridisation. Includes GRanges generation and plotting functions.
Last updated
differentialmethylationgeneexpressionmicroarraymethylationarraygeneticsdifferentialexpressiongenomeannotationdnamethylationonechanneltwochannelmultiplecomparisonqualitycontroltimecoursesequencingwholegenomeepigeneticscoveragepreprocessingdataimport
6.25 score 2 dependents 392 scripts 1.9k downloads
RTN - RTN: Reconstruction of Transcriptional regulatory Networks and analysis of regulons
A transcriptional regulatory network (TRN) consists of a collection of transcription factors (TFs) and the regulated target genes. TFs are regulators that recognize specific DNA sequences and guide the expression of the genome, either activating or repressing the expression of the target genes. The set of genes controlled by the same TF forms a regulon. This package provides classes and methods for the reconstruction of TRNs and analysis of regulons.
Last updated
transcriptionnetworknetworkinferencenetworkenrichmentgeneregulationgeneexpressiongraphandnetworkgenesetenrichmentgeneticvariability
6.21 score 2 dependents 90 scripts 630 downloads
ASSET - An R package for subset-based association analysis of heterogeneous traits and subtypes
An R package for subset-based analysis of heterogeneous traits and disease subtypes. The package allows the user to search through all possible subsets of z-scores to identify the subset of traits giving the best meta-analyzed z-score. Further, it returns a p-value adjusting for the multiple-testing involved in the search. It also allows for searching for the best combination of disease subtypes associated with each variant.
Last updated
statisticalmethodsnpgenomewideassociationmultiplecomparison
6.10 score 1 dependents 212 scripts 532 downloads
qcmetrics - A Framework for Quality Control
The package provides a framework for generic quality control of data. It allows users to create, manage and visualise individual or sets of quality control metrics and generate quality control reports in various formats.
Last updated
immunooncologysoftwarequalitycontrolproteomicsmicroarraymassspectrometryvisualizationreportwriting
6.03 score 2 stars 2 dependents 4 scripts 474 downloads
rBiopaxParser - Parses BioPax files and represents them in R
Parses BioPAX files and represents them in R. At the moment, BioPAX level 2 and level 3 are supported.
Last updated
datarepresentation
5.99 score 10 stars 14 scripts 502 downloads
sangerseqR - Tools for Sanger Sequencing Data in R
This package contains several tools for analyzing Sanger Sequencing data files in R, including reading .scf and .ab1 files, making basecalls and plotting chromatograms.
Last updated
sequencingsnpvisualization
5.98 score 2 dependents 81 scripts 1.2k downloads
flowPeaks - An R package for flow data clustering
A fast and automatic clustering method that classifies cells into subpopulations by finding the peaks of the overall density function generated by K-means.
Last updated
immunooncologyflowcytometryclusteringgatinggslcpp
5.88 score 3 dependents 42 scripts 574 downloads
CNORode - ODE add-on to CellNOptR
Logic based ordinary differential equation (ODE) add-on to CellNOptR.
Last updated
immunooncologycellbasedassayscellbiologyproteomicsbioinformaticstimecourse
5.86 score 1 dependents 48 scripts 358 downloads
iClusterPlus - Integrative clustering of multi-type genomic data
Integrative clustering of multiple genomic data using a joint latent variable model.
Last updated
multi-omicsclusteringfortranopenblas
5.84 score 342 scripts 876 downloads
hpar - Human Protein Atlas in R
The hpar package provides a simple R interface to and data from the Human Protein Atlas project.
Last updated
proteomicscellbiologydataimportfunctionalgenomicssystemsbiologyexperimenthubsoftware
5.77 score 1 dependents 39 scripts 818 downloads
pvca - Principal Variance Component Analysis (PVCA)
This package contains the function to assess the batch sources by fitting all "sources" as random effects, including two-way interaction terms, in the mixed model (depends on the lme4 package) to selected principal components, which were obtained from the original data correlation matrix. This package accompanies the book "Batch Effects and Noise in Microarray Experiments", chapter 12.
Last updated
microarraybatcheffect
5.70 score 1 dependents 118 scripts 683 downloads
pepStat - Statistical analysis of peptide microarrays
Statistical analysis of peptide microarrays
Last updated
microarraypreprocessing
5.68 score 8 stars 4 scripts 360 downloads
eiR - Accelerated similarity searching of small molecules
The eiR package provides utilities for accelerated structure similarity searching of very large small molecule data sets using an embedding and indexing approach.
Last updated
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsmicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportclusteringproteomicsmetabolomics
5.61 score 4 stars 17 scripts 367 downloads
SplicingGraphs - Create, manipulate, visualize splicing graphs, and assign RNA-seq reads to them
This package allows the user to create, manipulate, and visualize splicing graphs and their bubbles based on a gene model for a given organism. Additionally it allows the user to assign RNA-seq reads to the edges of a set of splicing graphs, and to summarize them in different ways.
Last updated
geneticsannotationdatarepresentationvisualizationsequencingrnaseqgeneexpressionalternativesplicingtranscriptionimmunooncologybioconductor-package
5.58 score 2 stars 21 scripts 578 downloads
MethylSeekR - Segmentation of Bis-seq data
This is a package for the discovery of regulatory regions from Bis-seq data
Last updated
sequencingmethylseqdnamethylation
5.55 score 44 scripts 567 downloads
omicade4 - Multiple co-inertia analysis of omics datasets
This package performs multiple co-inertia analysis of omics datasets.
Last updated
softwareclusteringclassificationmultiplecomparison
5.51 score 1 dependents 54 scripts 530 downloads
HiTC - High Throughput Chromosome Conformation Capture analysis
The HiTC package was developed to explore high-throughput 'C' data such as 5C or Hi-C. Dedicated R classes as well as standard methods for quality controls, normalization, visualization, and further analysis are also provided.
Last updated
sequencinghighthroughputsequencinghic
5.49 score 51 scripts 628 downloads
cleaver - Cleavage of Polypeptide Sequences
In-silico cleavage of polypeptide sequences. The cleavage rules are taken from: http://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html
Last updated
proteomics
5.46 score 1 dependents 32 scripts 621 downloads
flowDensity - Sequential Flow Cytometry Data Gating
This package provides tools for automated sequential gating analogous to the manual gating strategy based on the density of the data.
Last updated
bioinformaticsflowcytometrycellbiologyclusteringcancerflowcytdatadatarepresentationstemcelldensitygating
5.44 score 4 dependents 114 scripts 675 downloads
ROntoTools - R Onto-Tools suite
Suite of tools for functional analysis.
Last updated
networkanalysismicroarraygraphsandnetworks
5.37 score 2 dependents 28 scripts 571 downloads
PADOG - Pathway Analysis with Down-weighting of Overlapping Genes (PADOG)
This package implements a general purpose gene set analysis method called PADOG that downplays the importance of genes that appear often across the sets of genes to be analyzed. The package also provides a benchmark for gene set analysis methods in terms of sensitivity and ranking using 24 public datasets from the KEGGdzPathwaysGEO package.
Last updated
microarrayonechanneltwochannel
5.36 score 2 dependents 19 scripts 529 downloads
MSnID - Utilities for Exploration and Assessment of Confidence of LC-MSn Proteomics Identifications
Extracts MS/MS ID data from mzIdentML (leveraging mzID package) or text files. After collating the search results from multiple datasets it assesses their identification quality and optimizes filtering criteria to achieve the maximum number of identifications while not exceeding a specified false discovery rate. Also contains a number of utilities to explore the MS/MS results and assess missed and irregular enzymatic cleavages, mass measurement accuracy, etc.
Last updated
proteomicsmassspectrometryimmunooncology
5.35 score 74 scripts 667 downloads
MethylMix - MethylMix: Identifying methylation driven cancer genes
MethylMix is an algorithm implemented to identify hyper- and hypomethylated genes for a disease. MethylMix is based on a beta mixture model to identify methylation states and compares them with the normal DNA methylation state. MethylMix uses a novel statistic, the Differential Methylation value or DM-value, defined as the difference of a methylation state with the normal methylation state. Finally, matched gene expression data is used to identify, besides differential, functional methylation states by focusing on methylation changes that affect gene expression. References: Gevaert O. MethylMix: an R package for identifying DNA methylation-driven genes. Bioinformatics (Oxford, England). 2015;31(11):1839-41. doi:10.1093/bioinformatics/btv020. Gevaert O, Tibshirani R, Plevritis SK. Pancancer analysis of DNA methylation-driven genes using MethylMix. Genome Biology. 2015;16(1):17. doi:10.1186/s13059-014-0579-8.
Last updated
dnamethylationstatisticalmethoddifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetwork
5.33 score 1 dependents 36 scripts 477 downloads
epivizr - R Interface to epiviz web app
This package provides connections to the epiviz web app (http://epiviz.cbcb.umd.edu) for interactive visualization of genomic data. Objects in R/bioc interactive sessions can be displayed in genome browser tracks or plots to be explored by navigation through genomic regions. Fundamental Bioconductor data structures are supported (e.g., GenomicRanges and RangedSummarizedExperiment objects), while providing an easy mechanism to support other data structures (through package epivizrData). Visualizations (using d3.js) can be easily added to the web app as well.
Last updated
visualizationinfrastructuregui
5.32 score 2 dependents 35 scripts 498 downloads
derfinderPlot - Plotting functions for derfinder
This package provides plotting functions for results from the derfinder package. This helps separate the graphical dependencies required for making these plots from the core functionality of derfinder.
Last updated
differentialexpressionsequencingrnaseqsoftwarevisualizationimmunooncologybioconductorderfinder
5.30 score 2 stars 5 scripts 432 downloads
PWMEnrich - PWM enrichment analysis
A toolkit of high-level functions for DNA motif scanning and enrichment analysis built upon Biostrings. The main functionality is PWM enrichment analysis of already known PWMs (e.g. from databases such as MotifDb), but the package also implements high-level functions for PWM scanning and visualisation. The package does not perform "de novo" motif discovery, but is instead focused on using motifs that are either experimentally derived or computationally constructed by other tools.
Last updated
motifannotationsequencematchingsoftware
5.28 score 95 scripts 565 downloads
OmicCircos - High-quality circular visualization of omics data
OmicCircos is an R application and package for generating high-quality circular plots for omics data.
Last updated
visualizationstatisticsannotation
5.27 score 93 scripts 554 downloads
riboSeqR - Analysis of sequencing data from ribosome profiling experiments
Plotting functions, frameshift detection and parsing of sequencing data from ribosome profiling experiments.
Last updated
sequencinggeneticsvisualizationriboseq
5.25 score 1 stars 14 scripts 500 downloads
pathifier - Quantify deregulation of pathways in cancer
Pathifier is an algorithm that infers pathway deregulation scores for each tumor sample on the basis of expression data. This score is determined, in a context-specific manner, for every particular dataset and type of cancer that is being investigated. The algorithm transforms gene-level information into pathway-level information, generating a compact and biologically relevant representation of each sample.
Last updated
network
5.21 score 1 dependents 39 scripts 375 downloads
switchBox - Utilities to train and validate classifiers based on pair switching using the K-Top-Scoring-Pair (KTSP) algorithm
The package offers different classifiers based on comparisons of pairs of features (TSP), using various decision rules (e.g., majority wins principle).
Last updated
softwarestatisticalmethodclassification
5.20 score 1 dependents 88 scripts 406 downloads
casper - Characterization of Alternative Splicing Based on Paired-End Reads
Infer alternative splicing from paired-end RNA-seq data. The model is based on counting paths across exons, rather than pairwise exon connections, and estimates the fragment size and start distributions non-parametrically, which improves estimation precision.
Last updated
immunooncologygeneexpressiondifferentialexpressiontranscriptionrnaseqsequencingcpp
5.18 score 76 scripts 386 downloads
BRAIN - Baffling Recursive Algorithm for Isotope distributioN calculations
Package for calculating aggregated isotopic distribution and exact center-masses for chemical substances (in this version composed of C, H, N, O and S). This is an implementation of the BRAIN algorithm described in the paper by J. Claesen, P. Dittwald, T. Burzykowski and D. Valkenborg.
Last updated
immunooncologymassspectrometryproteomics
5.18 score 15 scripts 548 downloads
BiSeq - Processing and analyzing bisulfite sequencing data
The BiSeq package provides useful classes and functions to handle and analyze targeted bisulfite sequencing (BS) data such as reduced-representation bisulfite sequencing (RRBS) data. In particular, it implements an algorithm to detect differentially methylated regions (DMRs). The package takes already aligned BS data from one or multiple samples.
Last updated
geneticssequencingmethylseqdnamethylation
5.14 score 46 scripts 522 downloads
AllelicImbalance - Investigates Allele Specific Expression
Provides a framework for allele-specific expression investigation using RNA-seq data.
Last updated
geneticsinfrastructuresequencing
5.08 score 1 stars 7 scripts 521 downloads
chipenrich - Gene Set Enrichment For ChIP-seq Peak Data
ChIP-Enrich and Poly-Enrich perform gene set enrichment testing using peaks called from a ChIP-seq experiment. The method empirically corrects for confounding factors such as the length of genes, and the mappability of the sequence surrounding genes.
Last updated
immunooncologychipseqepigeneticsfunctionalgenomicsgenesetenrichmenthistonemodificationregression
5.08 score 40 scripts 482 downloads
RRHO - Inference on agreement between ordered lists
The package is aimed at inference on the amount of agreement in two sorted lists using the Rank-Rank Hypergeometric Overlap test.
Last updated
geneticssequencematchingmicroarraytranscription
5.07 score 59 scripts 332 downloads
RNASeqPower - Sample size for RNAseq studies
RNA-seq, sample size
Last updated
immunooncologyrnaseq
5.06 score 57 scripts 508 downloads
roar - Identify differential APA usage from RNA-seq alignments
Identify preferential usage of APA sites, comparing two biological conditions, starting from known alternative sites and alignments obtained from standard RNA-seq experiments.
Last updated
sequencinghighthroughputsequencingrnaseqtranscription
5.05 score 4 stars 14 scripts 490 downloads
msmsTests - LC-MS/MS Differential Expression Tests
Statistical tests for label-free LC-MS/MS data by spectral counts, to discover differentially expressed proteins between two biological conditions. Three tests are available: Poisson GLM regression, quasi-likelihood GLM regression, and the negative binomial of the edgeR package. The three models admit blocking factors to control for nuisance variables. To assure a good level of reproducibility a post-test filter is available, where we may set the minimum effect size considered biologically relevant, and the minimum expression of the most abundant condition.
Last updated
immunooncologysoftwaremassspectrometryproteomics
5.02 score 1 dependents 22 scripts 688 downloads
miRNAtap - miRNAtap: microRNA Targets - Aggregated Predictions
The package facilitates implementation of workflows requiring miRNA predictions. It allows users to integrate ranked miRNA target predictions from multiple sources available online and aggregate them with various methods, which improves the quality of predictions above any of the single sources. Currently predictions are available for Homo sapiens, Mus musculus and Rattus norvegicus (the last one through homology translation).
Last updated
softwareclassificationmicroarraysequencingmirna
5.02 score 52 scripts 440 downloads
GSCA - GSCA: Gene Set Context Analysis
GSCA takes as input several lists of activated and repressed genes. GSCA then searches through a compendium of publicly available gene expression profiles for biological contexts that are enriched with a specified pattern of gene expression. GSCA provides both traditional R functions and an interactive, user-friendly user interface.
Last updated
geneexpressionvisualizationgui
5.00 score 5 scripts 418 downloads
CAGEr - Analysis of CAGE (Cap Analysis of Gene Expression) sequencing data for precise mapping of transcription start sites and promoterome mining
The CAGEr package identifies transcription start sites (TSS) and their usage frequency from CAGE (Cap Analysis Gene Expression) sequencing data. It normalises raw CAGE tag count, clusters TSSs into tag clusters (TC) and aggregates them across multiple CAGE experiments to construct consensus clusters (CC) representing the promoterome. CAGEr provides functions to profile expression levels of these clusters by cumulative expression and rarefaction analysis, and outputs the plots in ggplot2 format for further facetting and customisation. After clustering, CAGEr performs analyses of promoter width and detects differential usage of TSSs (promoter shifting) between samples. CAGEr also exports its data as genome browser tracks, and as R objects for downstream expression analysis by other Bioconductor packages such as DESeq2, CAGEfightR, or seqArchR.
Last updated
preprocessingsequencingnormalizationfunctionalgenomicstranscriptiongeneexpressionclusteringvisualization
4.96 score 102 scripts 657 downloads
MLSeq - Machine Learning Interface for RNA-Seq Data
This package applies several machine learning methods, including SVM, bagSVM, Random Forest and CART to RNA-Seq data.
Last updated
immunooncologysequencingrnaseqclassificationclustering
4.96 score 1 dependents 38 scripts 470 downloads
TCC - TCC: Differential expression analysis for tag count data with robust normalization strategies
This package provides a series of functions for performing differential expression analysis from RNA-seq count data using a robust normalization strategy (called DEGES). The basic idea of DEGES is that potential differentially expressed genes or transcripts (DEGs) among compared samples should be removed before data normalization to obtain a well-ranked gene list where true DEGs are top-ranked and non-DEGs are bottom-ranked. This can be done by performing a multi-step normalization strategy (called DEGES for DEG elimination strategy). A major characteristic of TCC is that it provides robust normalization methods for several kinds of count data (two-group with or without replicates, multi-group/multi-factor, and so on) by combining functions from the packages it depends on.
Last updated
immunooncologysequencingdifferentialexpressionrnaseq
4.92 score 42 scripts 678 downloadsMeSHDbi - DBI to construct MeSH-related package from sqlite file
The package is a unified implementation of MeSH.db, MeSH.AOR.db, and MeSH.PCR.db, and is also an interface for constructing Gene-MeSH packages (MeSH.XXX.eg.db). loadMeSHDbiPkg imports an sqlite file and generates MeSH.XXX.eg.db.
Last updated
annotationannotationdatainfrastructure
4.87 score 3 dependents 41 scripts 541 downloadsrTRM - Identification of Transcriptional Regulatory Modules from Protein-Protein Interaction Networks
rTRM identifies transcriptional regulatory modules (TRMs) from protein-protein interaction networks.
Last updated
transcriptionnetworkgeneregulationgraphandnetworkbioconductorbioinformatics
4.86 score 3 stars 1 dependents 3 scripts 504 downloadsHDTD - Statistical Inference about the Mean Matrix and the Covariance Matrices in High-Dimensional Transposable Data (HDTD)
Characterization of intra-individual variability using physiologically relevant measurements provides important insights into fundamental biological questions ranging from cell type identity to tumor development. For each individual, the data measurements can be written as a matrix with the different subsamples of the individual recorded in the columns and the different phenotypic units recorded in the rows. Datasets of this type are called high-dimensional transposable data. The HDTD package provides functions for conducting statistical inference for the mean relationship between the row and column variables and for the covariance structure within and between the row and column variables.
Last updated
differentialexpressiongeneticsgeneexpressionmicroarraysequencingstatisticalmethodsoftwarebioconductor-packagehigh-dimensionalstatisticsopenblascppopenmp
4.78 score 1 stars 3 scripts 348 downloadsmassiR - massiR: MicroArray Sample Sex Identifier
Predicts the sex of samples in gene expression microarray datasets
Last updated
softwaremicroarraygeneexpressionclusteringclassificationqualitycontrol
4.76 score 19 scripts 418 downloadsmeshr - Tools for conducting enrichment analysis of MeSH
A set of annotation maps describing the entire MeSH assembled using data from MeSH.
Last updated
annotationdatafunctionalannotationbioinformaticsstatisticsannotationmultiplecomparisonsmeshdb
4.73 score 1 stars 1 dependents 15 scripts 476 downloadsGeneNetworkBuilder - GeneNetworkBuilder: a bioconductor package for building regulatory network using ChIP-chip/ChIP-seq data and Gene Expression Data
Application for discovering direct or indirect targets of transcription factors using ChIP-chip or ChIP-seq, and microarray or RNA-seq gene expression data. Given a list of potential target genes of one TF from ChIP-chip or ChIP-seq, together with the gene expression results, GeneNetworkBuilder generates a regulatory network of the TF.
Last updated
sequencingmicroarraygraphandnetworkcpp
4.68 score 24 scripts 500 downloadsABSSeq - ABSSeq: a new RNA-Seq analysis method based on modelling absolute expression differences
Infers differentially expressed genes from the absolute count difference between two groups, using the negative binomial distribution and moderating fold-change according to the heterogeneity of dispersion across expression levels.
Last updated
differentialexpression
4.62 score 1 dependents 2 scripts 632 downloadsFGNet - Functional Gene Networks derived from biological enrichment analyses
Build and visualize functional gene and term networks from clustering of enrichment analyses in multiple annotation spaces. The package includes a graphical user interface (GUI) and functions to perform the functional enrichment analysis through DAVID, GeneTerm Linker, gage (GSEA) and topGO.
Last updated
annotationgopathwaysgenesetenrichmentnetworkvisualizationfunctionalgenomicsnetworkenrichmentclustering
4.62 score 1 dependents 5 scripts 443 downloadsMethylAid - Visual and interactive quality control of large Illumina DNA Methylation array data sets
A visual and interactive web application using RStudio's shiny package. Bad quality samples are detected using sample-dependent and sample-independent controls present on the array and user-adjustable thresholds. In-depth exploration of bad quality samples can be performed using several interactive diagnostic plots of the quality control probes present on the array. Furthermore, the impact of any batch effect provided by the user can be explored.
Last updated
dnamethylationmethylationarraymicroarraytwochannelqualitycontrolbatcheffectvisualizationgui
4.60 score 20 scripts 559 downloadsflowClean - flowClean
A quality control tool for flow cytometry data based on compositional data analysis.
Last updated
flowcytometryqualitycontrolimmunooncology
4.58 score 19 scripts 558 downloadsRSVSim - RSVSim: an R/Bioconductor package for the simulation of structural variations
RSVSim is a package for the simulation of deletions, insertions, inversions, tandem duplications and translocations of various sizes in any genome available as a FASTA file or BSgenome data package. SV breakpoints can be placed uniformly across the whole genome, with a bias towards repeat regions and regions of high homology (for hg19), or at user-supplied coordinates.
Last updated
sequencing
4.58 score 19 scripts 379 downloadscompEpiTools - Tools for computational epigenomics
Tools for computational epigenomics developed for the analysis, integration and simultaneous visualization of various (epi)genomics data types across multiple genomic regions in multiple samples.
Last updated
geneexpressionsequencingvisualizationgenomeannotationcoverage
4.56 score 12 scripts 330 downloadsMGFM - Marker Gene Finder in Microarray gene expression data
The package is designed to detect marker genes from Microarray gene expression data sets
Last updated
geneticsgeneexpressionmicroarray
4.48 score 1 dependents 1 scripts 391 downloadsMBASED - Package containing functions for ASE analysis using Meta-analysis Based Allele-Specific Expression Detection
The package implements the MBASED algorithm for detecting allele-specific gene expression from RNA count data, where allele counts at individual loci (SNVs) are integrated into a gene-specific measure of ASE, and uses simulations to appropriately assess the statistical significance of observed ASE.
Last updated
sequencinggeneexpressiontranscription
4.48 score 15 scripts 424 downloadsPviz - Peptide Annotation and Data Visualization using Gviz
Pviz adapts the Gviz package for protein sequences and data.
Last updated
visualizationproteomicsmicroarray
4.48 score 7 scripts 374 downloadscosmiq - cosmiq - COmbining Single Masses Into Quantities
cosmiq is a tool for the preprocessing of liquid- or gas-chromatography mass spectrometry (LCMS/GCMS) data with a focus on metabolomics or lipidomics applications. To improve the detection of low-abundance signals, cosmiq generates master maps of the mZ/RT space from all acquired runs before a peak detection algorithm is applied. The result is a more robust identification and quantification of low-intensity MS signals compared to conventional approaches where peak picking is performed in each LCMS/GCMS file separately. The cosmiq package builds on the xcmsSet object structure and can therefore be integrated well with the package xcms as an alternative preprocessing step.
Last updated
immunooncologymassspectrometrymetabolomics
4.48 score 5 scripts 368 downloadsintansv - Integrative analysis of structural variations
This package provides efficient tools to read and integrate structural variations predicted by popular software tools. Annotation and visualization of structural variations are also implemented in the package.
Last updated
geneticsannotationsequencingsoftware
4.48 score 3 scripts 476 downloadsRMassBank - Workflow to process tandem MS files and build MassBank records
Workflow to process tandem MS files and build MassBank records. Functions include automated extraction of tandem MS spectra, formula assignment to tandem MS fragments, recalibration of tandem MS spectra with assigned fragments, spectrum cleanup, automated retrieval of compound information from Internet databases, and export to MassBank records.
Last updated
immunooncologybioinformaticsmassspectrometrymetabolomicssoftwareopenjdk
4.48 score 30 scripts 546 downloadstrio - Testing of SNPs and SNP Interactions in Case-Parent Trio Studies
Testing SNPs and SNP interactions with a genotypic TDT. This package furthermore contains functions for computing pairwise values of LD measures and for identifying LD blocks, as well as functions for setting up matched case pseudo-control genotype data for case-parent trios in order to run trio logic regression, for imputing missing genotypes in trios, for simulating case-parent trios with disease risk dependent on SNP interaction, and for power and sample size calculation in trio data.
Last updated
snpgeneticvariabilitymicroarraygenetics
4.44 score 23 scripts 498 downloadsmsmsEDA - Exploratory Data Analysis of LC-MS/MS data by spectral counts
Exploratory data analysis to assess the quality of a set of LC-MS/MS experiments, and visualize the influence of the involved factors.
Last updated
immunooncologysoftwaremassspectrometryproteomics
4.42 score 2 dependents 11 scripts 646 downloadsmethylMnM - detect different methylation level (DMR)
Computes exact p-values and q-values for comparing MeDIP-seq and MRE-seq data between samples.
Last updated
softwarednamethylationsequencing
4.38 score 1 dependents 2 scripts 482 downloadsdagLogo - dagLogo: a Bioconductor package for visualizing conserved amino acid sequence pattern in groups based on probability theory
Visualize significantly conserved amino acid sequence patterns in groups based on probability theory.
Last updated
sequencematchingvisualization
4.38 score 12 scripts 474 downloadsgeNetClassifier - Classify diseases and build associated gene networks using gene expression profiles
Comprehensive package to automatically train and validate a multi-class SVM classifier based on gene expression data. Provides transparent selection of gene markers, their coexpression networks, and an interface to query the classifier.
Last updated
classificationdifferentialexpressionmicroarray
4.38 score 2 dependents 1 scripts 583 downloadsVariantTools - Tools for Exploratory Analysis of Variant Calls
Explore, diagnose, and compare variant calls using filters.
Last updated
geneticsgeneticvariabilitysequencing
4.36 score 46 scripts 553 downloadscustomProDB - Generate customized protein database from NGS data, with a focus on RNA-Seq data, for proteomics search
Database search is the most widely used approach for peptide and protein identification in mass spectrometry-based proteomics studies. Our previous study showed that sample-specific protein databases derived from RNA-Seq data can better approximate the real protein pools in the samples and thus improve protein identification. More importantly, single nucleotide variations, short insertions and deletions, and novel junctions identified from RNA-Seq data make the protein database more complete and sample-specific. Here, we report an R package customProDB that enables the easy generation of customized databases from RNA-Seq data for proteomics search. This work bridges genomics and proteomics studies and facilitates cross-omics data integration.
Last updated
immunooncologysequencingmassspectrometryproteomicssnprnaseqsoftwaretranscriptionalternativesplicingfunctionalgenomics
4.35 score 16 scripts 382 downloadsMultiMed - Testing multiple biological mediators simultaneously
Implements methods for testing multiple mediators
Last updated
multiplecomparisonstatisticalmethodsoftware
4.34 score 11 scripts 363 downloadsoposSOM - Comprehensive analysis of transcriptome data
This package translates microarray expression data into metadata of reduced dimension. It provides various sample-centered and group-centered visualizations, sample similarity analyses and functional enrichment analyses. The underlying SOM algorithm combines feature clustering, multidimensional scaling and dimension reduction, along with strong visualization capabilities. It enables extraction and description of functional expression modules inherent in the data.
Last updated
geneexpressiondifferentialexpressiongenesetenrichmentdatarepresentationvisualizationcpp
4.34 score 11 scripts 476 downloadsSeqGSEA - Gene Set Enrichment Analysis (GSEA) of RNA-Seq Data: integrating differential expression and splicing
The package provides methods for gene set enrichment analysis of high-throughput RNA-Seq data by integrating differential expression and splicing. It uses the negative binomial distribution to model read count data, which accounts for sequencing biases and biological variation. Based on permutation tests, statistical significance can also be assessed for each gene's differential expression and splicing, respectively.
Last updated
sequencingrnaseqgenesetenrichmentgeneexpressiondifferentialexpressiondifferentialsplicingimmunooncology
4.34 score 11 scripts 501 downloadsSemDist - Information Accretion-based Function Predictor Evaluation
This package implements methods to calculate information accretion for a given version of the gene ontology and uses this data to calculate remaining uncertainty, misinformation, and semantic similarity for given sets of predicted annotations and true annotations from a protein function predictor.
Last updated
classificationannotationgosoftware
4.30 score 1 stars 8 scripts 428 downloadsCAFE - Chromosomal Aberrations Finder in Expression data
Detection and visualizations of gross chromosomal aberrations using Affymetrix expression microarrays as input
Last updated
geneexpressionmicroarrayonechannelgenesetenrichment
4.30 score 3 scripts 452 downloadsClomial - Infers clonal composition of a tumor
Clomial fits binomial distributions to counts obtained from Next Gen Sequencing data of multiple samples of the same tumor. The trained parameters can be interpreted to infer the clonal structure of the tumor.
Last updated
geneticsgeneticvariabilitysequencingclusteringmultiplecomparisonbayesiandnaseqexomeseqtargetedresequencingimmunooncology
4.30 score 8 scripts 396 downloadsBEAT - BEAT - BS-Seq Epimutation Analysis Toolkit
Model-based analysis of single-cell methylation data
Last updated
immunooncologygeneticsmethylseqsoftwarednamethylationepigenetics
4.30 score 7 scripts 448 downloadsPECA - Probe-level Expression Change Averaging
Calculates Probe-level Expression Change Averages (PECA) to identify differential expression in Affymetrix gene expression microarray studies or in proteomic studies using peptide-level measurements.
Last updated
softwareproteomicsmicroarraydifferentialexpressiongeneexpressionexonarraydifferentialsplicing
4.30 score 10 scripts 437 downloadscleanUpdTSeq - cleanUpdTSeq cleans up artifacts from polyadenylation sites from oligo(dT)-mediated 3' end RNA sequencing data
This package implements a Naive Bayes classifier for accurately distinguishing true polyadenylation sites (pA sites) from false ones in oligo(dT)-mediated 3' end sequencing data such as PAS-Seq, PolyA-Seq and RNA-Seq. False polyadenylation sites arise mainly from oligo(dT)-mediated internal priming during reverse transcription. The classifier is highly accurate and outperforms other heuristic methods.
Last updated
sequencing3 end sequencingpolyadenylation siteinternal priming
4.26 score 1 dependents 9 scripts 520 downloadsrain - Rhythmicity Analysis Incorporating Non-parametric Methods
This package uses non-parametric methods to detect rhythms in time series. It deals with outliers and missing values, and is optimized for time series comprising 10-100 measurements. As it does not assume any distinct waveform, it is well suited for detecting oscillating behavior (e.g. circadian or cell cycle) in genome- or proteome-wide biological measurements such as microarrays, proteome mass spectrometry, or metabolome measurements.
Last updated
timecoursegeneticssystemsbiologyproteomicsmicroarraymultiplecomparison
4.25 score 44 scripts 443 downloadsRUVnormalize - RUV for normalization of expression array data
RUVnormalize is meant to remove unwanted variation from gene expression data when the factor of interest is not defined, e.g., to clean up a dataset for general use or to do any kind of unsupervised analysis.
Last updated
statisticalmethodnormalization
4.19 score 39 scripts 358 downloadsflowCyBar - Analyze flow cytometric data using gate information
A package to analyze flow cytometric data using gate information to follow population/community dynamics
Last updated
immunooncologycellbasedassaysclusteringflowcytometrysoftwarevisualization
4.15 score 7 scripts 343 downloadsGSReg - Gene Set Regulation (GS-Reg)
A package for gene set analysis based on the variability of expressions, as well as a method to detect alternative splicing events. It implements the DIfferential RAnk Conservation (DIRAC) and gene set Expression Variation Analysis (EVA) methods. For detecting differentially spliced genes, it provides an implementation of Spliced-EVA (SEVA).
Last updated
generegulationpathwaysgeneexpressiongeneticvariabilitygenesetenrichmentalternativesplicing
4.14 score 23 scripts 346 downloadsffpe - Quality assessment and control for FFPE microarray expression data
Identify low-quality data using metrics developed for expression data derived from Formalin-Fixed, Paraffin-Embedded (FFPE) samples. Also provides a function for making Concordance at the Top plots (CAT-plots).
Last updated
microarraygeneexpressionqualitycontrol
4.12 score 33 scripts 424 downloadsGSAR - Gene Set Analysis in R
Gene set analysis using specific alternative hypotheses. Tests for differential expression, scale and net correlation structure.
Last updated
softwarestatisticalmethoddifferentialexpression
4.08 score 9 scripts 439 downloadsflowBeads - flowBeads: Analysis of flow bead data
This package extends flowCore to provide functionality specific to bead data. One of the goals of this package is to automate analysis of bead data for the purpose of normalisation.
Last updated
immunooncologyinfrastructureflowcytometrycellbasedassays
4.08 score 12 scripts 394 downloadsRnits - R Normalization and Inference of Time Series data
R/Bioconductor package for normalization, curve registration and inference in time course gene expression data.
Last updated
geneexpressionmicroarraytimecoursedifferentialexpressionnormalization
4.00 score 3 scripts 334 downloadsUNDO - Unsupervised Deconvolution of Tumor-Stromal Mixed Expressions
UNDO is an R package for unsupervised deconvolution of tumor and stromal mixed expression data. It detects marker genes and deconvolves the mixed expression data without any prior knowledge.
Last updated
software
4.00 score 8 scripts 418 downloadsunifiedWMWqPCR - Unified Wilcoxon-Mann Whitney Test for testing differential expression in qPCR data
This package implements the unified Wilcoxon-Mann-Whitney Test for qPCR data. This modified test allows for testing differential expression in qPCR data.
Last updated
differentialexpressiongeneexpressionmicrotitreplateassaymultiplecomparisonqualitycontrolsoftwarevisualizationqpcr
4.00 score 3 scripts 388 downloadsEBcoexpress - EBcoexpress for Differential Co-Expression Analysis
An Empirical Bayesian Approach to Differential Co-Expression Analysis at the Gene-Pair Level
Last updated
bayesian
4.00 score 8 scripts 406 downloadsHybridMTest - Hybrid Multiple Testing
Performs hybrid multiple testing that incorporates method selection and assumption evaluations into the analysis using empirical Bayes probability (EBP) estimates obtained by Grenander density estimation. For instance, for a 3-group comparison, hybrid multiple testing computes the EBP as a weighted combination of the EBPs from the F-test and the H-test, with the EBPs from the Shapiro-Wilk test of normality as weights. Instead of using EBPs from the F-test only or from the H-test only, this methodology combines both types of EBPs through the EBPs from the Shapiro-Wilk test of normality. The methodology thus applies the law of total EBPs.
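The weighting described above can be written as a one-line formula. The Python sketch below is an illustration, not the package's code: the function name is hypothetical, and it assumes the Shapiro-Wilk EBP is read as the probability that the normality assumption holds, so it weights the F-test EBP, while its complement weights the H-test EBP.

```python
def hybrid_ebp(ebp_f, ebp_h, ebp_normal):
    """Combine two EBPs by a law-of-total-probability style weighting.

    ebp_f      : EBP from the F-test (assumes normality)
    ebp_h      : EBP from the H-test (rank-based, no normality assumption)
    ebp_normal : EBP from the Shapiro-Wilk test, taken here (assumption)
                 as the probability that normality holds
    """
    for p in (ebp_f, ebp_h, ebp_normal):
        if not 0.0 <= p <= 1.0:
            raise ValueError("EBPs must lie in [0, 1]")
    return ebp_normal * ebp_f + (1.0 - ebp_normal) * ebp_h
```

When normality is well supported the result approaches the F-test EBP. When it is not, the result approaches the H-test EBP.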
Last updated
geneexpressiongeneticsmicroarray
3.98 score 12 scripts 553 downloadsmethylPipe - Base resolution DNA methylation data analysis
Memory efficient analysis of base resolution DNA methylation data in both the CpG and non-CpG sequence context. Integration of DNA methylation data derived from any methodology providing base- or low-resolution data.
Last updated
methylseqdnamethylationcoveragesequencing
3.95 score 1 dependents 1 scripts 442 downloadsSCAN.UPC - Single-channel array normalization (SCAN) and Universal exPression Codes (UPC)
SCAN is a microarray normalization method to facilitate personalized-medicine workflows. Rather than processing microarray samples as groups, which can introduce biases and present logistical challenges, SCAN normalizes each sample individually by modeling and removing probe- and array-specific background noise using only data from within each array. SCAN can be applied to one-channel (e.g., Affymetrix) or two-channel (e.g., Agilent) microarrays. The Universal exPression Codes (UPC) method is an extension of SCAN that estimates whether a given gene/transcript is active above background levels in a given sample. The UPC method can be applied to one-channel or two-channel microarrays as well as to RNA-Seq read counts. Because UPC values are represented on the same scale and have an identical interpretation for each platform, they can be used for cross-platform data integration.
Last updated
immunooncologysoftwaremicroarraypreprocessingrnaseqtwochannelonechannel
3.94 score 29 scripts 456 downloadsCNVrd2 - CNVrd2: a read depth-based method to detect and genotype complex common copy number variants from next generation sequencing data.
CNVrd2 uses next-generation sequencing data to measure human gene copy number for multiple samples, identify SNPs tagging copy number variants and detect copy number polymorphic genomic regions.
Last updated
copynumbervariationsnpsequencingsoftwarecoveragelinkagedisequilibriumclustering.jagscpp
3.92 score 3 stars 406 downloadsflowMatch - Matching and meta-clustering in flow cytometry
Matching cell populations and building meta-clusters and templates from a collection of FC samples.
Last updated
immunooncologyclusteringflowcytometrycpp
3.90 score 8 scripts 463 downloadsBADER - Bayesian Analysis of Differential Expression in RNA Sequencing Data
For RNA sequencing count data, BADER fits a Bayesian hierarchical model. The algorithm returns the posterior probability of differential expression for each gene between two groups A and B. The joint posterior distribution of the variables in the model can be returned in the form of posterior samples, which can be used for further down-stream analyses such as gene set enrichment.
Last updated
immunooncologysequencingrnaseqdifferentialexpressionsoftwaresagecpp
3.90 score 7 scripts 465 downloadsantiProfiles - Implementation of gene expression anti-profiles
Implements gene expression anti-profiles as described in Corrada Bravo et al., BMC Bioinformatics 2012, 13:272 doi:10.1186/1471-2105-13-272.
Last updated
geneexpressionclassification
3.90 score 4 scripts 488 downloadsMiRaGE - MiRNA Ranking by Gene Expression
The package contains functions for inference of target gene regulation by miRNA, based only on target gene expression profiles.
Last updated
immunooncologymicroarraygeneexpressionrnaseqsequencingsage
3.90 score 40 scripts 512 downloadsiASeq - iASeq: integrating multiple sequencing datasets for detecting allele-specific events
It fits a correlation motif model to multiple RNAseq or ChIPseq studies to improve detection of allele-specific events and describe correlation patterns across studies.
Last updated
immunooncologysnprnaseqchipseq
3.90 score 6 scripts 346 downloadsnondetects - Non-detects in qPCR data
Methods to model and impute non-detects in the results of qPCR experiments.
Last updated
softwareassaydomaingeneexpressiontechnologyqpcrworkflowsteppreprocessing
3.88 score 19 scripts 345 downloadsCNORfeeder - Integration of CellNOptR to add missing links
This package integrates literature-constrained and data-driven methods to infer signalling networks from perturbation experiments. It allows a given network to be extended with links derived from the data via various inference methods, and uses information on physical interactions of proteins to guide and validate the integration of links.
Last updated
cellbasedassayscellbiologyproteomicsnetworkinference
3.86 score 18 scripts 396 downloadserccdashboard - Assess Differential Gene Expression Experiments with ERCC Controls
Technical performance metrics for differential gene expression experiments using External RNA Controls Consortium (ERCC) spike-in ratio mixtures.
Last updated
immunooncologygeneexpressiontranscriptionalternativesplicingdifferentialexpressiondifferentialsplicinggeneticsmicroarraymrnamicroarrayrnaseqbatcheffectmultiplecomparisonqualitycontrol
3.78 score 7 scripts 420 downloadsMPFE - Estimation of the amplicon methylation pattern distribution from bisulphite sequencing data
Estimate distribution of methylation patterns from a table of counts from a bisulphite sequencing experiment given a non-conversion rate and read error rate.
Last updated
highthroughputsequencingdatadnamethylationmethylseq
3.78 score 7 scripts 289 downloadsrfPred - Assign rfPred functional prediction scores to a missense variants list
Using numerous external data files in which rfPred scores are pre-calculated for all genomic positions of the human exome, the package assigns rfPred scores to missense variants identified by chromosome, position (hg19 version), reference and alternative nucleotides, and the UniProt identifier of the protein. Note that to use the package, the user has to download the TabixFile and its index (approximately 3.3 GB).
Last updated
softwareannotationclassification
3.78 score 7 scripts 428 downloadsSPEM - S-system parameter estimation method
This package can optimize the parameters in S-system models given time series data
Last updated
networknetworkinferencesoftware
3.78 score 1 dependents 4 scripts 390 downloadsCNORdt - Add-on to CellNOptR: Discretized time treatments
This add-on to the package CellNOptR handles time-course data, as opposed to steady state data in CellNOptR. It scales the simulation step to allow comparison and model fitting for time-course data. Future versions will optimize delays and strengths for each edge.
Last updated
immunooncologycellbasedassayscellbiologyproteomicstimecourse
3.78 score 15 scripts 374 downloadsagilp - Agilent expression array processing package
More about what it does (maybe more than one line)
Last updated
3.78 score 9 scripts 518 downloadslmdme - Linear Model decomposition for Designed Multivariate Experiments
Linear ANOVA decomposition of multivariate designed experiments, implemented on top of limma's lmFit. Features: i) flexible formula-type interface, ii) fast limma-based implementation, iii) p-values for each estimated coefficient level in each factor, iv) F values for factor effects and v) plotting functions for PCA and PLS.
Last updated
microarrayonechanneltwochannelvisualizationdifferentialexpressionexperimentdatacancer
3.78 score 4 scripts 362 downloadsMinimumDistance - A Package for De Novo CNV Detection in Case-Parent Trios
Analysis of de novo copy number variants in trios from high-dimensional genotyping platforms.
Last updated
microarraysnpcopynumbervariation
3.78 score 10 scripts 434 downloadsmetabomxtr - A package to run mixture models for truncated metabolomics data with normal or lognormal distributions
The functions in this package return optimized parameter estimates and log likelihoods for mixture models of truncated data with normal or lognormal distributions.
Last updated
immunooncologymetabolomicsmassspectrometry
3.60 score 7 scripts 362 downloadsssviz - A small RNA-seq visualizer and analysis toolkit
Small RNA sequencing viewer
Last updated
immunooncologysequencingrnaseqvisualizationmultiplecomparisongenetics
3.60 score 4 scripts 412 downloadsINPower - An R package for computing the number of susceptibility SNPs
An R package for computing the number of susceptibility SNPs and power of future studies
Last updated
snp
3.60 score 6 scripts 362 downloadsh5vc - Managing alignment tallies using a hdf5 backend
This package contains functions to interact with tally data from NGS experiments that is stored in HDF5 files.
Last updated
curlbzip2xz-utilszlibcpp
3.60 score 2 scripts 494 downloadsCNORfuzzy - Addon to CellNOptR: Fuzzy Logic
This package is an extension to CellNOptR. It contains additional functionality needed to simulate and train a prior knowledge network to experimental data using constrained fuzzy logic (cFL, rather than Boolean logic as is the case in CellNOptR). Additionally, this package will contain functions to use for the compilation of multiple optimization results (either Boolean or cFL).
Last updated
network
3.60 score 10 scripts 396 downloadscancerclass - Development and validation of diagnostic tests from high-dimensional molecular data
The classification protocol starts with a feature selection step and continues with nearest-centroid classification. The accuracy of the predictor can be evaluated using training and test set validation, leave-one-out cross-validation or a multiple random validation protocol. Methods for calculation and visualization of continuous prediction scores make it possible to balance sensitivity and specificity and to define a cutoff value according to clinical requirements.
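Nearest-centroid classification with leave-one-out cross-validation, as described above, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the package's implementation: the function names are hypothetical, Euclidean distance is assumed, and the feature selection step is omitted.

```python
import math

def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest_centroid_predict(train_x, train_y, sample):
    """Return the label whose class centroid is closest to `sample`."""
    by_class = {}
    for x, y in zip(train_x, train_y):
        by_class.setdefault(y, []).append(x)
    centroids = {label: centroid(rows) for label, rows in by_class.items()}
    # Euclidean distance is an assumption made for this sketch.
    return min(centroids, key=lambda label: math.dist(centroids[label], sample))

def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

xs = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1], [5.0, 5.0], [5.1, 4.8], [4.9, 5.2]]
ys = ["a", "a", "a", "b", "b", "b"]
accuracy = leave_one_out_accuracy(xs, ys)
```

In a real protocol, feature selection would have to be repeated inside each cross-validation fold so that the held-out sample never influences which features are chosen.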
Last updated
cancermicroarrayclassificationvisualization
3.60 score 20 scripts 759 downloadsBasic4Cseq - Basic4Cseq: an R/Bioconductor package for analyzing 4C-seq data
Basic4Cseq is an R/Bioconductor package for basic filtering, analysis and subsequent visualization of 4C-seq data. Virtual fragment libraries can be created for any BSgenome package, and filter functions for both reads and fragments and basic quality controls are included. Fragment data in the vicinity of the experiment's viewpoint can be visualized as a coverage plot based on a running median approach and a multi-scale contact profile.
Last updated
immunooncologyvisualizationqualitycontrolsequencingcoveragealignmentrnaseqsequencematchingdataimport
3.52 score 11 scripts 462 downloadsprebs - Probe region expression estimation for RNA-seq data for improved microarray comparability
The prebs package aims at making RNA-sequencing (RNA-seq) data more comparable to microarray data. The comparability is achieved by summarizing sequencing-based expressions of probe regions using a modified version of the RMA algorithm. The pipeline takes mapped reads in BAM format as input and produces either gene expressions or original microarray probe set expressions as output.
Last updated
immunooncologymicroarrayrnaseqsequencinggeneexpressionpreprocessing
3.48 score 9 scripts 417 downloadshapFabia - hapFabia: Identification of very short segments of identity by descent (IBD) characterized by rare variants in large sequencing data
A package to identify very short IBD segments in large sequencing data by FABIA biclustering. Two haplotypes are identical by descent (IBD) if they share a segment that both inherited from a common ancestor. Current IBD methods reliably detect long IBD segments because many minor alleles in the segment are concordant between the two haplotypes. However, many cohort studies contain unrelated individuals who share only short IBD segments. This package provides software to identify short IBD segments in sequencing data. Knowledge of short IBD segments is relevant for phasing of genotyping data, association studies, and population genetics, where they shed light on the evolutionary history of humans. The package supports VCF formats, is based on sparse matrix operations, and provides visualization of haplotype clusters in different formats.
Last updated
geneticsgeneticvariabilitysnpsequencingvisualizationclusteringsequencematchingsoftware
3.45 score 14 scripts 468 downloadsblima - Tools for the preprocessing and analysis of the Illumina microarrays on the detector (bead) level
Package blima includes several algorithms for the preprocessing of Illumina microarray data. It focuses on bead-level analysis and provides a novel approach to the quantile normalization of vectors of unequal lengths. It provides a variety of methods for background correction, including background subtraction, RMA-like convolution and background outlier removal. It also implements a variance stabilizing transformation on the bead level, along with methods for data summarization. It also provides methods for performing t-tests on the detector (bead) level and on the probe level for differential expression testing.
Last updated
microarraypreprocessingnormalizationdifferentialexpressiongeneregulationgeneexpressioncpp
3.30 score 3 scripts 382 downloadsfastLiquidAssociation - functions for genome-wide application of Liquid Association
This package extends the function of the LiquidAssociation package for genome-wide application. It integrates a screening method into the LA analysis to reduce the number of triplets to be examined for a high LA value and provides code for use in subsequent significance analyses.
Last updated
softwaregeneexpressiongeneticspathwayscellbiology
3.30 score 8 scripts 372 downloadsnpGSEA - Permutation approximation methods for gene set enrichment analysis (non-permutation GSEA)
Current gene set enrichment methods rely upon permutations for inference. These approaches are computationally expensive and have minimum achievable p-values based on the number of permutations, not on the actual observed statistics. We have derived three parametric approximations to the permutation distributions of two gene set enrichment test statistics. We are able to reduce the computational burden and granularity issues of permutation testing with our method, which is implemented in this package. npGSEA calculates gene set enrichment statistics and p-values without the computational cost of permutations. It is applicable in settings where one or many gene sets are of interest. There are also built-in plotting functions to help users visualize results.
Last updated
genesetenrichmentmicroarraystatisticalmethodpathways
3.30 score 5 scripts 380 downloadsFRGEpistasis - Epistasis Analysis for Quantitative Traits by Functional Regression Model
A Tool for Epistasis Analysis Based on Functional Regression Model
Last updated
geneticsnetworkinferencegeneticvariabilitysoftware
3.30 score 10 scripts 333 downloadsmessina - Single-gene classifiers and outlier-resistant detection of differential expression for two-group and survival problems
Messina is a collection of algorithms for constructing optimally robust single-gene classifiers, and for identifying differential expression in the presence of outliers or unknown sample subgroups. The methods have application in identifying lead features to develop into clinical tests (both diagnostic and prognostic), and in identifying differential expression when a fraction of samples show unusual patterns of expression.
Last updated
geneexpressiondifferentialexpressionbiomedicalinformaticsclassificationsurvivalcpp
3.30 score 4 scripts 406 downloadsflowBin - Combining multitube flow cytometry data by binning
Software to combine flow cytometry data that has been multiplexed into multiple tubes with common markers between them, by establishing common bins across tubes in terms of the common markers, then determining expression within each tube for each bin in terms of the tube-specific markers.
Last updated
immunooncologycellbasedassaysflowcytometry
3.30 score 7 scripts 354 downloadsmetaSeq - Meta-analysis of RNA-Seq count data in multiple studies
The probabilities from one-sided NOISeq are combined using Fisher's method or Stouffer's method.
Last updated
rnaseqdifferentialexpressionsequencingimmunooncology
3.30 score 2 scripts 502 downloadsrTRMui - A shiny user interface for rTRM
This package provides a web interface to compute transcriptional regulatory modules with rTRM.
Last updated
transcriptionnetworkgeneregulationgraphandnetworkgui
3.30 score 1 stars 1 scripts 403 downloadsmaPredictDSC - Phenotype prediction using microarray data: approach of the best overall team in the IMPROVER Diagnostic Signature Challenge
This package implements the classification pipeline of the best overall team (Team221) in the IMPROVER Diagnostic Signature Challenge. Additional functionality is added to compare 27 combinations of data preprocessing, feature selection and classifier types.
Last updated
microarrayclassification
3.30 score 2 scripts 343 downloadspaircompviz - Multiple comparison test visualization
This package provides visualization of the results from multiple (i.e. pairwise) comparison tests such as pairwise.t.test, pairwise.prop.test or pairwise.wilcox.test. The groups being compared are visualized as nodes in a Hasse diagram. This approach gives a clear depiction of which group is significantly greater than which others, especially when comparing a large number of groups.
Last updated
graphandnetwork
3.30 score 3 scripts 337 downloadsvtpnet - variant-transcription factor-phenotype networks
variant-transcription factor-phenotype networks, inspired by Maurano et al., Science (2012), PMID 22955828
Last updated
network
3.30 score 2 scripts 436 downloadstRanslatome - Comparison between multiple levels of gene expression
Detection of differentially expressed genes (DEGs) from the comparison of two biological conditions (treated vs. untreated, diseased vs. normal, mutant vs. wild-type) among different levels of gene expression (transcriptome, translatome, proteome), using several statistical methods: Rank Product, Translational Efficiency, t-test, Limma, ANOTA, DESeq, edgeR. Possibility to plot the results with scatterplots, histograms, MA plots, standard deviation (SD) plots, coefficient of variation (CV) plots. Detection of significantly enriched post-transcriptional regulatory factors (RBPs, miRNAs, etc.) and Gene Ontology terms in the lists of DEGs previously identified for the two expression levels. Comparison of GO terms enriched only in one of the levels or in both. Calculation of the semantic similarity score between the lists of enriched GO terms coming from the two expression levels. Visual examination and comparison of the enriched terms with heatmaps, radar plots and barplots.
Last updated
cellbiologygeneregulationregulationgeneexpressiondifferentialexpressionmicroarrayhighthroughputsequencingqualitycontrolgomultiplecomparisonsbioinformatics
3.30 score 5 scripts 350 downloadstriplex - Search and visualize intramolecular triplex-forming sequences in DNA
This package provides functions for identification and visualization of potential intramolecular triplex patterns in DNA sequence. The main functionality is to detect the positions of subsequences capable of folding into an intramolecular triplex (H-DNA) in a much larger sequence. The potential H-DNA (triplexes) should be made of as many canonical nucleotide triplets as possible. The package includes visualization showing the exact base-pairing in 1D, 2D or 3D.
Last updated
sequencematchinggeneregulation
3.30 score 4 scripts 454 downloadsBaseSpaceR - R SDK for BaseSpace RESTful API
A rich R interface to Illumina's BaseSpace cloud computing environment, enabling the fast development of data analysis and visualisation tools.
Last updated
infrastructuredatarepresentationconnecttoolssoftwaredataimporthighthroughputsequencingsequencinggenetics
3.30 score 9 scripts 471 downloadsdeltaGseg - deltaGseg
Identifying distinct subpopulations through multiscale time series analysis
Last updated
proteomicstimecoursevisualizationclustering
3.30 score 2 scripts 399 downloadsepigenomix - Epigenetic and gene transcription data normalization and integration with mixture models
A package for the integrative analysis of RNA-seq or microarray based gene transcription and histone modification data obtained by ChIP-seq. The package provides methods for data preprocessing and matching as well as methods for fitting Bayesian mixture models in order to detect genes with differences in both data types.
Last updated
chipseqgeneexpressiondifferentialexpressionclassification
3.30 score 1 scripts 387 downloadsARRmNormalization - Adaptive Robust Regression normalization for Illumina methylation data
Perform the Adaptive Robust Regression method (ARRm) for the normalization of methylation data from the Illumina Infinium HumanMethylation 450k assay.
Last updated
dnamethylationtwochannelpreprocessingmicroarray
3.30 score 6 scripts 464 downloadsproteinProfiles - Protein Profiling
Significance assessment for distance measures of time-course protein profiles
Last updated
3.30 score 8 scripts 338 downloadslpNet - Linear Programming Model for Network Inference
lpNet aims at inferring biological networks, in particular signaling and gene networks. To do so, it takes perturbation data, either steady-state or time-series, as input and generates an LP model which allows the inference of signaling networks. For parameter identification, either leave-one-out cross-validation or stratified n-fold cross-validation can be used.
Last updated
networkinference
3.30 score 341 downloadsDrugVsDisease - Comparison of disease and drug profiles using Gene set Enrichment Analysis
This package generates ranked lists of differential gene expression for either disease or drug profiles. Input data can be downloaded from Array Express or GEO, or from local CEL files. Ranked lists of differential expression and associated p-values are calculated using Limma. Enrichment scores (Subramanian et al. PNAS 2005) are calculated against a reference set of default drug or disease profiles, or a set of custom data supplied by the user. Network visualisation of significant scores is output in Cytoscape format.
Last updated
microarraygeneexpressionclustering
3.30 score 8 scripts 368 downloadsSNAGEE - Signal-to-Noise applied to Gene Expression Experiments
Signal-to-Noise applied to Gene Expression Experiments. Signal-to-noise ratios can be used as a proxy for quality of gene expression studies and samples. The SNRs can be calculated on any gene expression data set as long as gene IDs are available; no access to the raw data files is necessary. This makes it possible to flag problematic studies and samples in any public data set.
Last updated
microarrayonechanneltwochannelqualitycontrol
3.30 score 3 scripts 411 downloadsmatchBox - Utilities to compute, compare, and plot the agreement between ordered vectors of features (ie. distinct genomic experiments). The package includes Correspondence-At-the-TOP (CAT) analysis.
The matchBox package enables comparing ranked vectors of features, merging multiple datasets, removing redundant features, using CAT-plots and Venn diagrams, and computing statistical significance.
Last updated
softwareannotationmicroarraymultiplecomparisonvisualization
3.30 score 3 scripts 352 downloadscnvGSA - Gene Set Analysis of (Rare) Copy Number Variants
This package is intended to facilitate gene-set association with rare CNVs in case-control studies.
Last updated
multiplecomparison
3.30 score 3 scripts 384 downloadsphyloseq - Handling and analysis of high-throughput microbiome census data
phyloseq provides a set of classes and tools to facilitate the import, storage, analysis, and graphical display of microbiome census data.
Last updated
immunooncologysequencingmicrobiomemetagenomicsclusteringclassificationmultiplecomparisongeneticvariability
16.15 score 649 stars 50 dependents 14k scripts 11k downloadsggbio - Visualization tools for genomic data
The ggbio package extends and specializes the grammar of graphics for biological data. The graphics are designed to answer common scientific questions, in particular those often asked of high throughput genomics data. All core Bioconductor data structures are supported, where appropriate. The package supports detailed views of particular genomic regions, as well as genome-wide overviews. Supported overviews include ideograms and grand linear views. High-level plots include sequence fragment length, edge-linked interval to data view, mismatch pileup, and several splicing summaries.
Last updated
infrastructurevisualization
12.92 score 117 stars 20 dependents 880 scripts 3.2k downloadsmzR - parser for netCDF, mzXML and mzML and mzIdentML files (mass spectrometry data)
mzR provides a unified API to the common file formats and parsers available for mass spectrometry data. It comes with a subset of the proteowizard library for mzXML, mzML and mzIdentML. The netCDF reading code has previously been used in XCMS.
Last updated
immunooncologyinfrastructuredataimportproteomicsmetabolomicsmassspectrometrycurlopensslcpp
12.79 score 46 stars 46 dependents 247 scripts 6.8k downloadsVariantAnnotation - Annotation of Genetic Variants
Annotate variants, compute amino acid coding changes, predict coding outcomes.
Last updated
dataimportsequencingsnpannotationgeneticsvariantannotationcurlbzip2xz-utilszlib
11.37 score 156 dependents 2.2k scripts 15k downloadsGWASTools - Tools for Genome Wide Association Studies
Classes for storing very large GWAS data sets and annotation, and functions for GWAS data cleaning and analysis.
Last updated
snpgeneticvariabilityqualitycontrolmicroarray
10.61 score 18 stars 5 dependents 456 scripts 1.2k downloadsDECIPHER - Tools for curating, analyzing, and manipulating biological sequences
A toolset for deciphering and managing biological sequences.
Last updated
clusteringgeneticssequencingdataimportvisualizationmicroarrayqualitycontrolqpcralignmentwholegenomemicrobiomeimmunooncologygenepredictionphylogeneticscomparativegenomicsopenmp
10.57 score 21 dependents 1.5k scripts 5.4k downloadsgraphite - GRAPH Interaction from pathway Topological Environment
Graph objects from pathway topology derived from KEGG, Panther, PathBank, PharmGKB, Reactome, SMPDB and WikiPathways databases.
Last updated
pathwaysthirdpartyclientgraphandnetworknetworkreactomekeggmetabolomicsbioinformaticsmirrorpathway-analysis
10.53 score 8 stars 24 dependents 163 scripts 5.7k downloadsEDASeq - Exploratory Data Analysis and Normalization for RNA-Seq
Numerical and graphical summaries of RNA-Seq read data. Within-lane normalization procedures to adjust for GC-content effect (or other gene-level effects) on read counts: loess robust local regression, global-scaling, and full-quantile normalization (Risso et al., 2011). Between-lane normalization procedures to adjust for distributional differences between lanes (e.g., sequencing depth): global-scaling and full-quantile normalization (Bullard et al., 2010).
Last updated
immunooncologysequencingrnaseqpreprocessingqualitycontroldifferentialexpression
10.28 score 5 stars 9 dependents 676 scripts 2.6k downloadsbiovizBase - Basic graphic utilities for visualization of genomic data.
The biovizBase package is designed to provide a set of utilities, color schemes and conventions for genomic data. It serves as the base for various high-level packages for biological data visualization. This saves development effort and encourages consistency.
Last updated
infrastructurevisualizationpreprocessing
8.57 score 76 dependents 442 scripts 9.2k downloadsisobar - Analysis and quantitation of isobarically tagged MSMS proteomics data
isobar provides methods for preprocessing, normalization, and report generation for the analysis of quantitative mass spectrometry proteomics data labeled with isobaric tags, such as iTRAQ and TMT. Features modules for integrating and validating PTM-centric datasets (isobar-PTM). More information on http://www.ms-isobar.org.
Last updated
immunooncologyproteomicsmassspectrometrybioinformaticsmultiplecomparisonsqualitycontrol
7.13 score 10 stars 28 scripts 503 downloadsRedeR - Interactive visualization and manipulation of nested networks
RedeR combines an R package with a stand-alone Java application for interactive visualization and manipulation of nested networks. Graph, node, and edge attributes can be configured using either graphical or command-line methods, following igraph syntax rules.
Last updated
guigraphandnetworknetworknetworkenrichmentnetworkinferencesoftwaresystemsbiology
6.88 score 7 dependents 121 scripts 719 downloadsCellNOptR - Training of boolean logic models of signalling networks using prior knowledge networks and perturbation data
This package optimises boolean logic networks of signalling pathways based on a prior knowledge network and a set of data obtained upon perturbation of the nodes in the network.
Last updated
cellbasedassayscellbiologyproteomicspathwaysnetworktimecourseimmunooncology
6.66 score 6 dependents 127 scripts 542 downloadsnucleR - Nucleosome positioning package for R
Nucleosome positioning for Tiling Arrays and NGS experiments.
Last updated
nucleosomepositioningcoveragechipseqmicroarraysequencinggeneticsqualitycontroldataimport
5.73 score 36 scripts 510 downloadseasyRNASeq - Count summarization and normalization for RNA-Seq data
Calculates the coverage of high-throughput short reads against a reference genome and summarizes it per feature of interest (e.g. exon, gene, transcript). The data can be normalized as 'RPKM' or by the 'DESeq' or 'edgeR' package.
Last updated
geneexpressionrnaseqgeneticspreprocessingimmunooncology
5.56 score 1 dependents 20 scripts 514 downloadscn.mops - cn.mops - Mixture of Poissons for CNV detection in NGS data
cn.mops (Copy Number estimation by a Mixture Of PoissonS) is a data processing pipeline for copy number variations and aberrations (CNVs and CNAs) from next generation sequencing (NGS) data. The package supplies functions to convert BAM files into read count matrices or genomic ranges objects, which are the input objects for cn.mops. cn.mops models the depths of coverage across samples at each genomic position. Therefore, it does not suffer from read count biases along chromosomes. Using a Bayesian approach, cn.mops decomposes read variations across samples into integer copy numbers and noise by its mixture components and Poisson distributions, respectively. cn.mops guarantees a low FDR because wrong detections are indicated by high noise and filtered out. cn.mops is very fast and written in C++.
Last updated
sequencingcopynumbervariationhomo_sapienscellbiologyhapmapgeneticscpp
5.51 score 3 dependents 121 scripts 629 downloadsReadqPCR - Read qPCR data
The package provides functions to read raw RT-qPCR data of different platforms.
Last updated
dataimportmicrotitreplateassaygeneexpressionqpcr
5.36 score 2 dependents 19 scripts 490 downloadsr3Cseq - Analysis of Chromosome Conformation Capture and Next-generation Sequencing (3C-seq)
This package is used for the analysis of long-range chromatin interactions from 3C-seq assay.
Last updated
preprocessingsequencing
5.10 score 3 stars 20 scripts 565 downloadsGeneExpressionSignature - Gene Expression Signature based Similarity Metric
This package implements the gene expression signature and the distance between signatures. A gene expression signature is represented as a list of genes whose expression is correlated with a biological state of interest. Its distance is defined using a nonparametric, rank-based pattern-matching strategy based on the Kolmogorov-Smirnov statistic. Gene expression signatures and their distances can be used to detect similarities among the signatures of drugs, diseases, and biological states of interest.
Last updated
geneexpression
5.04 score 1 stars 11 scripts 530 downloadstweeDEseq - RNA-seq data analysis using the Poisson-Tweedie family of distributions
Differential expression analysis of RNA-seq using the Poisson-Tweedie (PT) family of distributions. PT distributions are described by a mean, a dispersion and a shape parameter and include Poisson and NB distributions, among others, as particular cases. An important feature of this family is that, while the Negative Binomial (NB) distribution only allows a quadratic mean-variance relationship, the PT distributions generalize this relationship to any order.
Last updated
immunooncologystatisticalmethoddifferentialexpressionsequencingrnaseqdnaseq
4.99 score 1 stars 1 dependents 54 scripts 430 downloadsDTA - Dynamic Transcriptome Analysis
Dynamic Transcriptome Analysis (DTA) can monitor the cellular response to perturbations with higher sensitivity and temporal resolution than standard transcriptomics. The package implements the underlying kinetic modeling approach capable of the precise determination of synthesis and decay rates from individual microarray or RNA-seq measurements.
Last updated
microarraydifferentialexpressiongeneexpressiontranscription
4.78 score 1 dependents 6 scripts 399 downloadsCormotif - Correlation Motif Fit
It fits a correlation motif model to multiple studies to detect study-specific differential expression patterns.
Last updated
microarraydifferentialexpression
4.78 score 60 scripts 372 downloadsNormqPCR - Functions for normalisation of RT-qPCR data
Functions for the selection of optimal reference genes and the normalisation of real-time quantitative PCR data.
Last updated
microtitreplateassaygeneexpressionqpcr
4.75 score 28 scripts 533 downloadsDART - Denoising Algorithm based on Relevance network Topology
Denoising Algorithm based on Relevance network Topology (DART) is an algorithm designed to evaluate the consistency of prior information molecular signatures (e.g. in-vitro perturbation expression signatures) in independent molecular data (e.g. gene expression data sets). If consistent, a pruning network strategy is then used to infer the activation status of the molecular signature in individual samples.
Last updated
geneexpressiondifferentialexpressiongraphandnetworkpathways
4.30 score 1 scripts 496 downloadsCNAnorm - A normalization method for Copy Number Aberration in cancer samples
Performs ratio, GC content correction and normalization of data obtained using low coverage (one read every 100-10,000 bp) high-throughput sequencing. It performs a "discrete" normalization looking for the ploidy of the genome. It will also provide tumour content if at least two ploidy states can be found.
Last updated
copynumbervariationsequencingcoveragenormalizationwholegenomednaseqgenomicvariation
4.30 score 9 scripts 421 downloadsPREDA - Position Related Data Analysis
Package for the position related analysis of quantitative functional genomics data.
Last updated
softwarecopynumbervariationgeneexpressiongenetics
4.30 score 10 scripts 452 downloadsASEB - Predict Acetylated Lysine Sites
ASEB is an R package to predict lysine sites that can be acetylated by a specific KAT-family.
Last updated
proteomicscpp
4.20 score 6 scripts 516 downloadsGRENITS - Gene Regulatory Network Inference Using Time Series
The package offers four network inference statistical models using Dynamic Bayesian Networks and Gibbs Variable Selection: a linear interaction model, two linear interaction models with added experimental noise (Gaussian and Student distributed) for the case where replicates are available and a non-linear interaction model.
Last updated
networkinferencegeneregulationtimecoursegraphandnetworkgeneexpressionnetworkbayesianopenblascpp
4.20 score 2 scripts 405 downloadsBAGS - A Bayesian Approach for Geneset Selection
R package providing functions to perform geneset significance analysis over simple cross-sectional data between 2 and 5 phenotypes of interest.
Last updated
bayesian
4.15 score 47 scripts 444 downloadsAffyRNADegradation - Analyze and correct probe positional bias in microarray data due to RNA degradation
The package helps with the assessment and correction of RNA degradation effects in Affymetrix 3' expression arrays. The parameter d gives a robust and accurate measure of RNA integrity. The correction removes the probe positional bias, and thus improves comparability of samples that are affected by RNA degradation.
Last updated
geneexpressionmicroarrayonechannelpreprocessingqualitycontrol
4.08 score 7 scripts 532 downloadsweaver - Tools and extensions for processing Sweave documents
This package provides enhancements on the Sweave() function in the base package. In particular a facility for caching code chunk results is included.
Last updated
infrastructure
3.88 score 38 scripts 432 downloadsmaskBAD - Masking probes with binding affinity differences
Package includes functions to analyze and mask microarray expression data.
Last updated
microarray
3.60 score 3 scripts 397 downloadsGEWIST - Gene Environment Wide Interaction Search Threshold
This 'GEWIST' package provides statistical tools to efficiently optimize SNP prioritization for gene-gene and gene-environment interactions.
Last updated
multiplecomparisongenetics
3.30 score 3 scripts 324 downloadsAGDEX - Agreement of Differential Expression Analysis
A tool to evaluate agreement of differential expression for cross-species genomics
Last updated
microarraygeneticsgeneexpression
3.30 score 6 scripts 506 downloadsdks - The double Kolmogorov-Smirnov package for evaluating multiple testing procedures.
The dks package consists of a set of diagnostic functions for multiple testing methods. The functions can be used to determine if the p-values produced by a multiple testing procedure are correct. These functions are designed to be applied to simulated data. The functions require the entire set of p-values from multiple simulated studies, so that the joint distribution can be evaluated.
Last updated
multiplecomparisonqualitycontrol
3.30 score 1 scripts 328 downloadsPANR - Posterior association networks and functional modules inferred from rich phenotypes of gene perturbations
This package provides S4 classes and methods for inferring functional gene networks with edges encoding posterior beliefs of gene association types and nodes encoding perturbation effects.
Last updated
immunooncologynetworkinferencevisualizationgraphandnetworkclusteringcellbasedassays
3.30 score 6 scripts 430 downloadsRTopper - This package is designed to perform Gene Set Analysis across multiple genomic platforms
The RTopper package is designed to perform and integrate gene set enrichment results across multiple genomic platforms.
Last updated
microarray
2.38 score 12 scripts 500 downloadsclusterProfiler - A Universal Enrichment Tool for Interpreting Omics Data
A universal tool for interpreting functional characteristics of omics data. It supports Over-Representation Analysis (ORA) and Gene Set Enrichment Analysis (GSEA) for both coding and non-coding genomics data of thousands of species. It provides a unified and tidy interface to access, manipulate, and visualize enrichment results. A key capability is the simultaneous analysis and comparison of datasets from multiple treatments or time points. Furthermore, it integrates Large Language Model (LLM) capabilities to provide automated and insightful interpretation of enrichment results.
Last updated
annotationclusteringgenesetenrichmentgokeggmultiplecomparisonpathwaysreactomevisualizationenrichment-analysisgseaquarto
17.35 score 1.2k stars 47 dependents 16k scripts 42k downloads
biomaRt - Interface to BioMart databases (i.e. Ensembl)
In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<https://www.ensembl.org/info/data/biomart/index.html>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.
Last updated
annotationbioconductorbiomartensembl
16.50 score 49 stars 214 dependents 17k scripts 41k downloadsRsamtools - Binary alignment (BAM), FASTA, variant call (BCF), and tabix file import
This package provides an interface to the 'samtools', 'bcftools', and 'tabix' utilities for manipulating SAM (Sequence Alignment / Map), FASTA, binary variant call (BCF) and compressed indexed tab-delimited (tabix) files.
Last updated
dataimportsequencingcoveragealignmentqualitycontrolbioconductor-packagecore-packagecurlbzip2xz-utilszlibcpp
15.71 score 29 stars 597 dependents 4.3k scripts 50k downloadsAnnotationDbi - Manipulation of SQLite-based annotations in Bioconductor
Implements a user-friendly interface for querying SQLite-based annotation data packages.
Last updated
annotationmicroarraysequencinggenomeannotationbioconductor-packagecore-package
15.48 score 10 stars 760 dependents 6.2k scripts 78k downloadsgoseq - Gene Ontology analyser for RNA-seq and other length biased data
Detects Gene Ontology and/or other user defined categories which are over/under represented in RNA-seq data.
Last updated
immunooncologysequencinggogeneexpressiontranscriptionrnaseqdifferentialexpressionannotationgenesetenrichmentkeggpathwayssoftware
10.13 score 2 stars 10 dependents 752 scripts 2.5k downloadsConsensusClusterPlus - ConsensusClusterPlus
Algorithm for determining cluster count and membership by stability evidence in unsupervised analysis.
Last updated
softwareclustering
8.45 score 20 dependents 544 scripts 4.3k downloadsCoGAPS - Coordinated Gene Activity in Pattern Sets
Coordinated Gene Activity in Pattern Sets (CoGAPS) implements a Bayesian MCMC matrix factorization algorithm, GAPS, and links it to gene set statistic methods to infer biological process activity. It can be used to perform sparse matrix factorization on any data, and when this data represents biomolecules, to do gene set analysis.
Last updated
geneexpressiontranscriptiongenesetenrichmentdifferentialexpressionbayesianclusteringtimecoursernaseqmicroarraymultiplecomparisondimensionreductionimmunooncologycpp
6.41 score 130 scripts 588 downloadssegmentSeq - Methods for identifying small RNA loci from high-throughput sequencing data
High-throughput sequencing technologies allow the production of large volumes of short sequences, which can be aligned to the genome to create a set of matches to the genome. By looking for regions of the genome to which there are high densities of matches, we can infer a segmentation of the genome into regions of biological significance. The methods in this package allow the simultaneous segmentation of data from multiple samples, taking into account replicate data, in order to create a consensus segmentation. This has obvious applications in a number of classes of sequencing experiments, particularly in the discovery of small RNA loci and novel mRNA transcriptome discovery.
Last updated
multiplecomparisonsequencingalignmentdifferentialexpressionqualitycontroldataimport
6.26 score 86 scripts 484 downloadsBioNet - Routines for the functional analysis of biological networks
This package provides functions for the integrated analysis of protein-protein interaction networks and the detection of functional modules. Different datasets can be integrated into the network by assigning p-values of statistical tests to the nodes of the network. E.g. p-values obtained from the differential expression of the genes from an Affymetrix array are assigned to the nodes of the network. By fitting a beta-uniform mixture model and calculating scores from the p-values, overall scores of network regions can be calculated and an integer linear programming algorithm identifies the maximum scoring subnetwork.
Last updated
microarraydataimportgraphandnetworknetworknetworkenrichmentgeneexpressiondifferentialexpression
6.15 score 2 dependents 117 scripts 788 downloadsflowMeans - Non-parametric Flow Cytometry Data Gating
Identifies cell populations in Flow Cytometry data using non-parametric clustering and segmented-regression-based change point detection. Note: R 2.11.0 or newer is required.
Last updated
immunooncologyflowcytometrycellbiologyclustering
5.76 score 2 dependents 48 scripts 596 downloadsfrma - Frozen RMA and Barcode
Preprocessing and analysis for single microarrays and microarray batches.
Last updated
softwaremicroarraypreprocessing
5.75 score 1 dependents 93 scripts 485 downloadsPROMISE - PRojection Onto the Most Interesting Statistical Evidence
A general tool to identify genomic features with a specific biologically interesting pattern of associations with multiple endpoint variables, as described in Pounds et al. (2009) Bioinformatics 25: 2013-2019.
Last updated
microarrayonechannelmultiplecomparisongeneexpression
5.53 score 1 dependents 57 scripts 429 downloadsfabia - FABIA: Factor Analysis for Bicluster Acquisition
Biclustering by "Factor Analysis for Bicluster Acquisition" (FABIA). FABIA is a model-based technique for biclustering, that is clustering rows and columns simultaneously. Biclusters are found by factor analysis where both the factors and the loading matrix are sparse. FABIA is a multiplicative model that extracts linear dependencies between samples and feature patterns. It captures realistic non-Gaussian data distributions with heavy tails as observed in gene expression measurements. FABIA utilizes well understood model selection techniques like the EM algorithm and variational approaches and is embedded into a Bayesian framework. FABIA ranks biclusters according to their information content and separates spurious biclusters from true biclusters. The code is written in C.
Last updated
statisticalmethodmicroarraydifferentialexpressionmultiplecomparisonclusteringvisualization
5.50 score 2 dependents 44 scripts 587 downloadsNuPoP - An R package for nucleosome positioning prediction
NuPoP is an R package for Nucleosome Positioning Prediction. This package is built upon a duration hidden Markov model proposed in Xi et al, 2010; Wang et al, 2008. The core of the package was written in Fortran. In addition to the R package, a stand-alone Fortran software tool is also available at https://github.com/jipingw. The Fortran code has the same functionality as the R package. Note: NuPoP has two separate functions for prediction of nucleosome positioning, one for MNase-map trained models and the other for chemical map-trained models. The latter was implemented for four species including yeast, S. pombe, mouse and human, trained based on our recent publications. We noticed there is another package, nuCpos, by another group for prediction of nucleosome positioning trained with chemical maps. A report comparing recent versions of NuPoP with nuCpos can be found at https://github.com/jiping/NuPoP_doc. Some more information can be found and will be posted at https://github.com/jipingw/NuPoP.
Last updated
geneticsvisualizationclassificationnucleosomepositioninghiddenmarkovmodelfortran
5.23 score 17 scripts 450 downloadsDEGseq - Identify Differentially Expressed Genes from RNA-seq data
DEGseq is an R package to identify differentially expressed genes from RNA-Seq data.
Last updated
rnaseqpreprocessinggeneexpressiondifferentialexpressionimmunooncologycpp
5.01 score 51 scripts 628 downloadsRDRToolbox - A package for nonlinear dimension reduction with Isomap and LLE.
A package for nonlinear dimension reduction using the Isomap and LLE algorithms. It also includes a routine for computing the Davies-Bouldin index for cluster validation, a plotting tool and a data generator for microarray gene expression data and for the Swiss Roll dataset.
Last updated
dimensionreductionfeatureextractionvisualizationclusteringmicroarray
4.96 score 65 scripts 460 downloadsChIPseqR - Identifying Protein Binding Sites in High-Throughput Sequencing Data
ChIPseqR identifies protein binding sites from ChIP-seq and nucleosome positioning experiments. The model used to describe binding events was developed to locate nucleosomes but should be flexible enough to handle other types of experiments as well.
Last updated
chipseqinfrastructure
4.70 score 7 scripts 504 downloadsBeadDataPackR - Compression of Illumina BeadArray data
Provides functionality for the compression and decompression of raw bead-level data from the Illumina BeadArray platform.
Last updated
microarray
4.68 score 4 dependents 4 scripts 1.0k downloadsflowTrans - Parameter Optimization for Flow Cytometry Data Transformation
Profile maximum likelihood estimation of parameters for flow cytometry data transformations.
Last updated
immunooncologyflowcytometry
4.48 score 25 scripts 460 downloadsCSAR - Statistical tools for the analysis of ChIP-seq data
Statistical tools for ChIP-seq data analysis. The package includes the statistical method described in Kaufmann et al. (2009) PLoS Biology: 7(4):e1000090. Briefly, taking the average DNA fragment size subjected to sequencing into account, the software calculates genomic single-nucleotide read-enrichment values. After normalization, sample and control are compared using a test based on the Poisson distribution. Test statistic thresholds to control the false discovery rate are obtained through random permutation.
Last updated
chipseqtranscriptiongenetics
4.48 score 6 scripts 470 downloadstigre - Transcription factor Inference through Gaussian process Reconstruction of Expression
The tigre package implements our methodology of Gaussian process differential equation models for analysis of gene expression time series from single input motif networks. The package can be used for inferring unobserved transcription factor (TF) protein concentrations from expression measurements of known target genes, or for ranking candidate targets of a TF.
Last updated
microarraytimecoursegeneexpressiontranscriptiongeneregulationnetworkinferencebayesian
4.38 score 6 scripts 409 downloadsiChip - Bayesian Modeling of ChIP-chip Data Through Hidden Ising Models
Hidden Ising models are implemented to identify enriched genomic regions in ChIP-chip data. They can be used to analyze the data from multiple platforms (e.g., Affymetrix, Agilent, and NimbleGen), and the data with single to multiple replicates.
Last updated
chipchiponechannelagilentchipmicroarray
4.32 score 5 scripts 479 downloadsMassArray - Analytical Tools for MassArray Data
This package is designed for the import, quality control, analysis, and visualization of methylation data generated using Sequenom's MassArray platform. The tools herein contain a highly detailed amplicon prediction for optimal assay design. Also included are quality control measures of data, such as primer dimer and bisulfite conversion efficiency estimation. Methylation data are calculated using the same algorithms contained in the EpiTyper software package. Additionally, automatic SNP-detection can be used to flag potentially confounded data from specific CG sites. Visualization includes barplots of methylation data as well as UCSC Genome Browser-compatible BED tracks. Multiple assays can be positionally combined for integrated analysis.
Last updated
immunooncologydnamethylationsnpmassspectrometrygeneticsdataimportvisualization
4.30 score 4 scripts 351 downloadsSpeCond - Condition specific detection from expression data
This package performs a gene expression data analysis to detect condition-specific genes. Such genes are significantly up- or down-regulated in a small number of conditions. It does so by fitting a mixture of normal distributions to the expression values. Conditions can be environmental conditions, different tissues, organs or any other sources that you wish to compare in terms of gene expression.
Last updated
microarraydifferentialexpressionmultiplecomparisonclusteringreportwriting
3.92 score 14 scripts 460 downloadsfrmaTools - Frozen RMA Tools
Tools for advanced use of the frma package.
Last updated
softwaremicroarraypreprocessing
3.90 score 6 scripts 421 downloadsMBCB - MBCB (Model-based Background Correction for Beadarray)
This package provides a model-based background correction method, which incorporates the negative control beads to pre-process Illumina BeadArray data.
Last updated
microarraypreprocessing
3.86 score 18 scripts 384 downloadsles - Identifying Differential Effects in Tiling Microarray Data
The 'les' package estimates Loci of Enhanced Significance (LES) in tiling microarray data. These are regions of regulation such as found in differential transcription, ChIP-chip, or DNA modification analysis. The package provides a universal framework suitable for identifying differential effects in tiling microarray data sets, and is independent of the underlying statistics at the level of single probes.
Last updated
microarraydifferentialexpressionchipchipdnamethylationtranscription
3.78 score 1 dependents 3 scripts 404 downloadsLiquidAssociation - LiquidAssociation
The package contains functions to calculate direct and model-based estimators for liquid association. It also provides functions for testing the existence of liquid association given gene triplet data.
Last updated
pathwaysgeneexpressioncellbiologygeneticsnetworktimecourse
3.78 score 1 dependents 10 scripts 358 downloadsaffyILM - Linear Model of background subtraction and the Langmuir isotherm
affyILM is a preprocessing tool which estimates gene expression levels for Affymetrix Gene Chips. Input from physical chemistry is employed to first background subtract intensities before calculating concentrations on the basis of the Langmuir model.
Last updated
microarrayonechannelpreprocessing
3.60 score 1 scripts 493 downloadsgenomes - Genome sequencing project metadata
Download genome and assembly reports from NCBI
Last updated
annotationgenetics
3.56 score 18 scripts 370 downloadsOTUbase - Provides structure and functions for the analysis of OTU data
Provides a platform for Operational Taxonomic Unit based analysis
Last updated
sequencingdataimport
3.30 score 3 scripts 442 downloadsNTW - Predict gene network using an Ordinary Differential Equation (ODE) based method
This package predicts the gene-gene interaction network and identifies the direct transcriptional targets of the perturbation using an ODE (Ordinary Differential Equation) based method.
Last updated
preprocessing
3.30 score 2 scripts 344 downloadsattract - Methods to Find the Gene Expression Modules that Represent the Drivers of Kauffman's Attractor Landscape
This package contains the functions to find the gene expression modules that represent the drivers of Kauffman's attractor landscape. The modules are the core attractor pathways that discriminate between different cell types or groups of interest. Each pathway has a set of synexpression groups, which show transcriptionally-coordinated changes in gene expression.
Last updated
immunooncologykeggreactomegeneexpressionpathwaysgenesetenrichmentmicroarrayrnaseq
3.30 score 4 scripts 600 downloadsGSRI - Gene Set Regulation Index
The GSRI package estimates the number of differentially expressed genes in gene sets, utilizing the concept of the Gene Set Regulation Index (GSRI).
Last updated
microarraytranscriptiondifferentialexpressiongenesetenrichmentgeneregulation
3.30 score 2 scripts 378 downloadsGEOsubmission - Prepares microarray data for submission to GEO
Helps to easily submit a microarray dataset and the associated sample information to GEO by preparing a single file for upload (direct deposit).
Last updated
microarray
3.30 score 7 scripts 423 downloadsMulcom - Calculates Mulcom test
Identification of differentially expressed genes and false discovery rate (FDR) calculation by Multiple Comparison test.
Last updated
statisticalmethodmultiplecomparisonmicroarraydifferentialexpressiongeneexpressioncpp
3.00 score 4 scripts 460 downloadsGenomicFeatures - Query the gene models of a given organism/assembly
Extract the genomic locations of genes, transcripts, exons, introns, and CDS, for the gene models stored in a TxDb object. A TxDb object is a small database that contains the gene models of a given organism/assembly. Bioconductor provides a small collection of TxDb objects in the form of ready-to-install TxDb packages for the most commonly studied organisms. Additionally, the user can easily make a TxDb object (or package) for the organism/assembly of their choice by using the tools from the txdbmaker package.
Last updated
geneticsinfrastructureannotationsequencinggenomeannotationbioconductor-packagecore-package
15.59 score 27 stars 353 dependents 8.2k scripts 35k downloadsGEOquery - Get data from NCBI Gene Expression Omnibus (GEO)
The NCBI Gene Expression Omnibus (GEO) is a public repository of microarray data. Given the rich and varied nature of this resource, it is only natural to want to apply BioConductor tools to these data. GEOquery is the bridge between GEO and BioConductor.
Last updated
microarraydataimportonechanneltwochannelsagebioconductorbioinformaticsdata-sciencegenomicsncbi-geou24ca289073quarto
15.33 score 112 stars 53 dependents 5.8k scripts 18k downloadsDOSE - Disease Ontology Semantic and Enrichment analysis
This package implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively for measuring semantic similarities among DO terms and gene products. Enrichment analyses including hypergeometric model and gene set enrichment analysis are also implemented for discovering disease associations of high-throughput biological data.
Last updated
annotationvisualizationmultiplecomparisongenesetenrichmentpathwayssoftwaredisease-ontologyenrichment-analysissemantic-similarity
15.24 score 126 stars 61 dependents 3.3k scripts 43k downloadsTCGAbiolinks - TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aim of TCGAbiolinks is to: i) facilitate the GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Last updated
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
14.95 score 351 stars 7 dependents 2.4k scripts 9.1k downloadsqvalue - Q-value estimation for false discovery rate control
This package takes a list of p-values resulting from the simultaneous testing of many hypotheses and estimates their q-values and local FDR values. The q-value of a test measures the proportion of false positives incurred (called the false discovery rate) when that particular test is called significant. The local FDR measures the posterior probability the null hypothesis is true given the test's p-value. Various plots are automatically generated, allowing one to make sensible significance cut-offs. Several mathematical results have recently been shown on the conservative accuracy of the estimated q-values from this software. The software can be applied to problems in genomics, brain imaging, astrophysics, and data mining.
Last updated
multiplecomparisons
14.51 score 122 stars 132 dependents 4.4k scripts 34k downloadsGOSemSim - GO-terms Semantic Similarity Measures
The semantic comparisons of Gene Ontology (GO) annotations provide quantitative ways to compute similarities between genes and gene groups, and have become an important basis for many bioinformatics analysis approaches. GOSemSim is an R package for semantic similarity computation among GO terms, sets of GO terms, gene products and gene clusters. GOSemSim implements five methods proposed by Resnik, Schlicker, Jiang, Lin and Wang respectively.
Last updated
annotationgoclusteringpathwaysnetworksoftwarebioinformaticsgene-ontologysemantic-similaritycpp
14.47 score 72 stars 68 dependents 840 scripts 34k downloads
xcms - LC-MS and GC-MS Data Analysis
Framework for processing and visualization of chromatographically separated and single-spectra mass spectral data. Imports from AIA/ANDI NetCDF, mzXML, mzData and mzML files. Preprocesses data for high-throughput, untargeted analyte profiling.
Last updated
immunooncologymassspectrometrymetabolomicsbioconductorfeature-detectionmass-spectrometrypeak-detectioncpp
14.45 score 224 stars 14 dependents 1.2k scripts 3.3k downloadsBSgenome - Software infrastructure for efficient representation of full genomes and their SNPs
Infrastructure shared by all the Biostrings-based genome data packages.
Last updated
geneticsinfrastructuredatarepresentationsequencematchingannotationsnpbioconductor-packagecore-package
14.28 score 9 stars 275 dependents 1.7k scripts 25k downloadsRCy3 - Functions to Access and Control Cytoscape
Visualize, analyze and explore networks using Cytoscape via R. Anything you can do using the graphical user interface of Cytoscape, you can now do with a single RCy3 function.
Last updated
visualizationgraphandnetworkthirdpartyclientnetwork
13.54 score 58 stars 16 dependents 828 scripts 1.6k downloadspcaMethods - A collection of PCA methods
Provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA may be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods make use of the same data structure (pcaRes) to provide a common interface to the PCA results. Initiated at the Max-Planck Institute for Molecular Plant Physiology, Golm, Germany.
Last updated
bayesianlooking-for-maintainercpp
13.32 score 55 stars 79 dependents 624 scripts 9.4k downloadsrtracklayer - R interface to genome annotation files and the UCSC genome browser
Extensible framework for interacting with multiple genome browsers (currently UCSC built-in) and manipulating annotation tracks in various formats (currently GFF, BED, bedGraph, BED15, WIG, BigWig and 2bit built-in). The user may export/import tracks to/from the supported browsers, as well as query and modify the browser state, such as the current viewport.
Last updated
annotationvisualizationdataimportzlibopensslcurl
12.99 score 501 dependents 13k scripts 45k downloadsEBImage - Image processing and analysis toolbox for R
EBImage provides general purpose functionality for image processing and analysis. In the context of (high-throughput) microscopy-based cellular assays, EBImage offers tools to segment cells and extract quantitative cellular descriptors. This allows the automation of such tasks using the R programming language and facilitates the use of other tools in the R environment for signal processing, statistical modeling, machine learning and visualization with image data.
Last updated
visualizationbioinformaticsimage-analysisimage-processingcpp
12.88 score 77 stars 44 dependents 1.7k scripts 5.5k downloadsaffy - Methods for Affymetrix Oligonucleotide Arrays
The package contains functions for exploratory oligonucleotide array analysis. The dependence on tkWidgets only concerns a few convenience functions. 'affy' is fully functional without it.
Last updated
microarrayonechannelpreprocessing
12.44 score 106 dependents 2.1k scripts 13k downloadsShortRead - FASTQ input and manipulation
This package implements sampling, iteration, and input of FASTQ files. The package includes functions for filtering and trimming reads, and for generating a quality assessment report. Data are represented as DNAStringSet-derived objects, and easily manipulated for a diversity of purposes. The package also contains legacy support for early single-end, ungapped alignment formats.
Last updated
dataimportsequencingqualitycontrolbioconductor-packagecore-packagezlibcpp
12.40 score 8 stars 56 dependents 2.2k scripts 12k downloadspreprocessCore - A collection of pre-processing functions
A library of core preprocessing routines.
Last updated
infrastructureopenblas
12.30 score 20 stars 214 dependents 2.8k scripts 23k downloadssparseMatrixStats - Summary Statistics for Rows and Columns of Sparse Matrices
High performance functions for row and column operations on sparse matrices. For example: col / rowMeans2, col / rowMedians, col / rowVars etc. Currently, the optimizations are limited to data in the column sparse format. This package is inspired by the matrixStats package by Henrik Bengtsson.
Last updated
infrastructuresoftwaredatarepresentationcpp
12.06 score 55 stars 156 dependents 326 scripts 27k downloadsbiomformat - An interface package for the BIOM file format
This is an R package for interfacing with the BIOM file format. This package includes basic tools for reading biom-format files, accessing and subsetting data tables from a biom object (which is more complex than a single table), as well as limited support for writing a biom-object back to a biom-format file. The design of this API is intended to match the python API and other tools included with the biom-format project, but with a decidedly "R flavor" that should be familiar to R users. This includes S4 classes and methods, as well as extensions of common core functions/methods.
Last updated
immunooncologydataimportmetagenomicsmicrobiome
11.94 score 8 stars 52 dependents 592 scripts 9.9k downloadsgraph - graph: A package to handle graph data structures
A package that implements some simple graph handling capabilities.
Last updated
graphandnetwork
11.81 score 343 dependents 1.2k scripts 34k downloads
vsn - Variance stabilization and calibration for microarray data
The package implements a method for normalising microarray intensities from single- and multiple-color arrays. It can also be used for data from other technologies, as long as they have a similar format. The method uses a robust variant of the maximum-likelihood estimator for an additive-multiplicative error model and affine calibration. The model incorporates a data calibration step (a.k.a. normalization), a model for the dependence of the variance on the mean intensity and a variance stabilizing data transformation. Differences between transformed intensities are analogous to "normalized log-ratios". However, in contrast to the latter, their variance is independent of the mean, and they are usually more sensitive and specific in detecting differential transcription.
Last updated
microarrayonechanneltwochannelpreprocessing
11.61 score 55 dependents 1.3k scripts 9.1k downloadsinfercnv - Infer Copy Number Variation from Single-Cell RNA-Seq Data
Using single-cell RNA-Seq expression to visualize CNV in cells.
Last updated
softwarecopynumbervariationvariantdetectionstructuralvariationgenomicvariationgeneticstranscriptomicsstatisticalmethodbayesianhiddenmarkovmodelsinglecelljagscpp
11.57 score 670 stars 1 dependents 916 scripts 3.6k downloadsMOFA2 - Multi-Omics Factor Analysis v2
The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, visualisation, imputation etc are available.
Last updated
dimensionreductionbayesianvisualizationfactor-analysismofamulti-omics
11.51 score 401 stars 1 dependents 780 scripts 1.5k downloadsgenomation - Summary, annotation and visualization of genomic data
A package for summary and annotation of genomic intervals. Users can visualize and quantify genomic intervals over pre-defined functional regions, such as promoters, exons, introns, etc. The genomic intervals represent regions with a defined chromosome position, which may be associated with a score, such as aligned reads from HT-seq experiments, TF binding sites, methylation scores, etc. The package can use any tabular genomic feature data as long as it has minimal information on the locations of genomic intervals. In addition, it can use BAM or BigWig files as input.
Last updated
annotationsequencingvisualizationcpgislandcpp
11.47 score 79 stars 5 dependents 784 scripts 1.4k downloadsMatrixGenerics - S4 Generic Summary Statistic Functions that Operate on Matrix-Like Objects
S4 generic functions modeled after the 'matrixStats' API for alternative matrix implementations. Packages with alternative matrix implementations can depend on this package and implement the generic functions that are defined here for a useful set of row and column summary statistics. Other package developers can import this package and handle different matrix implementations without worrying about incompatibilities.
Last updated
infrastructuresoftwarebioconductor-packagecore-package
11.43 score 12 stars 1.4k dependents 155 scripts 84k downloadstopGO - Enrichment Analysis for Gene Ontology
topGO package provides tools for testing GO terms while accounting for the topology of the GO graph. Different test statistics and different methods for eliminating local similarities and dependencies between GO terms can be implemented and applied.
Last updated
geneexpressiontranscriptomicsgenesetenrichmentgoannotationpathwayssystemsbiologymicroarraysequencingvisualizationsoftware
11.27 score 2 stars 21 dependents 2.5k scripts 3.7k downloadsMaaslin2 - "Multivariable Association Discovery in Population-scale Meta-omics Studies"
MaAsLin2 is a comprehensive R package for efficiently determining multivariable association between clinical metadata and microbial meta'omic features. MaAsLin2 relies on general linear models to accommodate most modern epidemiological study designs, including cross-sectional and longitudinal, and offers a variety of data exploration, normalization, and transformation methods. MaAsLin2 is the next generation of MaAsLin.
Last updated
metagenomicssoftwaremicrobiomenormalizationbiobakerybioconductordifferential-abundance-analysisfalse-discovery-ratemultiple-covariatespublicr-toolsrepeated-measurestools
11.27 score 155 stars 2 dependents 764 scripts 1.7k downloadsgenefilter - genefilter: methods for filtering genes from high-throughput experiments
Some basic functions for filtering genes.
Last updated
microarraycpp
11.24 score 152 dependents 3.0k scripts 21k downloadsRgraphviz - Provides plotting capabilities for R graph objects
Interfaces R with the AT&T Graphviz library for plotting R graph objects from the graph package.
Last updated
graphandnetworkvisualizationzlib
11.13 score 109 dependents 1.4k scripts 15k downloadsANCOMBC - Microbiome differential abundance and correlation analyses with bias correction
ANCOMBC is a package containing differential abundance (DA) and correlation analyses for microbiome data. Specifically, the package includes Analysis of Compositions of Microbiomes with Bias Correction 2 (ANCOM-BC2), Analysis of Compositions of Microbiomes with Bias Correction (ANCOM-BC), and Analysis of Composition of Microbiomes (ANCOM) for DA analysis, and Sparse Estimation of Correlations among Microbiomes (SECOM) for correlation analysis. Microbiome data are typically subject to two sources of biases: unequal sampling fractions (sample-specific biases) and differential sequencing efficiencies (taxon-specific biases). Methodologies included in the ANCOMBC package are designed to correct these biases and construct statistically consistent estimators.
Last updated
differentialexpressionmicrobiomenormalizationsequencingsoftwareancomancombcancombc2correlationdifferential-abundance-analysissecom
11.12 score 136 stars 1 dependents 734 scripts 2.9k downloadsRhtslib - HTSlib high-throughput sequencing library as an R package
This package provides version 1.18 of the 'HTSlib' C library for high-throughput sequence analysis. The package is primarily useful to developers of other R packages who wish to make use of HTSlib. Motivation and instructions for use of this package are in the vignette, vignette(package="Rhtslib", "Rhtslib").
Last updated
dataimportsequencingbioconductor-packagecore-packagecurlbzip2xz-utilszlib
11.06 score 11 stars 612 dependents 3 scripts 48k downloadsseqLogo - Sequence logos for DNA sequence alignments
seqLogo takes the position weight matrix of a DNA sequence motif and plots the corresponding sequence logo as introduced by Schneider and Stephens (1990).
Last updated
sequencematching
10.89 score 5 stars 28 dependents 506 scripts 6.1k downloadsGENESIS - GENetic EStimation and Inference in Structured samples (GENESIS): Statistical methods for analyzing genetic data from samples with population structure and/or relatedness
The GENESIS package provides methodology for estimating, inferring, and accounting for population and pedigree structure in genetic analyses. The current implementation provides functions to perform PC-AiR (Conomos et al., 2015, Gen Epi) and PC-Relate (Conomos et al., 2016, AJHG). PC-AiR performs a Principal Components Analysis on genome-wide SNP data for the detection of population structure in a sample that may contain known or cryptic relatedness. Unlike standard PCA, PC-AiR accounts for relatedness in the sample to provide accurate ancestry inference that is not confounded by family structure. PC-Relate uses ancestry representative principal components to adjust for population structure/ancestry and accurately estimate measures of recent genetic relatedness such as kinship coefficients, IBD sharing probabilities, and inbreeding coefficients. Additionally, functions are provided to perform efficient variance component estimation and mixed model association testing for both quantitative and binary phenotypes.
Last updated
snpgeneticvariabilitygeneticsstatisticalmethoddimensionreductionprincipalcomponentgenomewideassociationqualitycontrolbiocviews
10.76 score 44 stars 1 dependents 486 scripts 884 downloadsChemmineR - Cheminformatics Toolkit for R
ChemmineR is a cheminformatics package for analyzing drug-like small molecule data in R. Its latest version contains functions for efficient processing of large numbers of molecules, physicochemical/structural property predictions, structural similarity searching, classification and clustering of compound libraries with a wide spectrum of algorithms. In addition, it offers visualization functions for compound clustering results and chemical structures.
Last updated
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsmicrotitreplateassaycellbasedassaysvisualizationinfrastructuredataimportclusteringproteomicsmetabolomicscpp
10.75 score 17 stars 14 dependents 286 scripts 1.5k downloadsEnrichedHeatmap - Making Enriched Heatmaps
Enriched heatmap is a special type of heatmap which visualizes the enrichment of genomic signals on specific target regions. Here we implement enriched heatmaps with the ComplexHeatmap package. Since this type of heatmap is just a normal heatmap with some special settings, the functionality of ComplexHeatmap makes it much easier to customize the heatmap as well as to concatenate it to a list of heatmaps to show correspondence between different data sources.
Last updated
softwarevisualizationsequencinggenomeannotationcoveragecpp
10.72 score 201 stars 1 dependents 605 scripts 1.6k downloadstradeSeq - trajectory-based differential expression analysis for sequencing data
tradeSeq provides a flexible method for fitting regression models that can be used to find genes that are differentially expressed along one or multiple lineages in a trajectory. Based on the fitted models, it uses a variety of tests suited to answer different questions of interest, e.g. the discovery of genes for which expression is associated with pseudotime, or which are differentially expressed (in a specific region) along the trajectory. It fits a negative binomial generalized additive model (GAM) for each gene, and performs inference on the parameters of the GAM.
Last updated
clusteringregressiontimecoursedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaresinglecelltranscriptomicsmultiplecomparisonvisualization
10.70 score 296 stars 776 scripts 1.5k downloads
annotate - Annotation for microarrays
Using R environments for annotation.
Last updated
annotationpathwaysgo
10.64 score 234 dependents 888 scripts 33k downloads
oligo - Preprocessing tools for oligonucleotide arrays
A package to analyze oligonucleotide arrays (expression/SNP/tiling/exon) at probe-level. It currently supports Affymetrix (CEL files) and NimbleGen arrays (XYS files).
Last updated
microarrayonechanneltwochannelpreprocessingsnpdifferentialexpressionexonarraygeneexpressiondataimportzlib
10.60 score 3 stars 10 dependents 622 scripts 6.4k downloads
basilisk - Freezing Python Dependencies Inside Bioconductor Packages
Installs a self-contained conda instance that is managed by the R/Bioconductor installation machinery. This aims to provide a consistent Python environment that can be used reliably by Bioconductor packages. Functions are also provided to enable smooth interoperability of multiple Python environments in a single R session.
Last updated
infrastructurebioconductor-package
10.57 score 29 stars 42 dependents 134 scripts 6.3k downloads
Nebulosa - Single-Cell Data Visualisation Using Kernel Gene-Weighted Density Estimation
This package provides an enhanced visualization of single-cell data based on gene-weighted density estimation. Nebulosa recovers the signal from dropped-out features and allows the inspection of the joint expression from multiple features (e.g. genes). Seurat and SingleCellExperiment objects can be used within Nebulosa.
Last updated
softwaregeneexpressionsinglecellvisualizationdimensionreductionsingle-cellsingle-cell-analysissingle-cell-multiomicssingle-cell-rna-seq
10.46 score 115 stars 1 dependents 734 scripts 2.1k downloads
GSEABase - Gene set enrichment data structures and methods
This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA).
Last updated
geneexpressiongenesetenrichmentgraphandnetworkgokegg
10.42 score 75 dependents 2.5k scripts 15k downloads
biocViews - Categorized views of R package repositories
Infrastructure to support 'views' used to classify Bioconductor packages. 'biocViews' are directed acyclic graphs of terms from a controlled vocabulary. There are three major classifications, corresponding to 'software', 'annotation', and 'experiment data' packages.
Last updated
infrastructurebioconductor-packagecore-package
10.37 score 4 stars 15 dependents 43 scripts 7.0k downloads
celda - CEllular Latent Dirichlet Allocation
Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.
Last updated
singlecellgeneexpressionclusteringsequencingbayesianimmunooncologydataimportcppopenmp
10.33 score 152 stars 2 dependents 416 scripts 1.6k downloads
tidybulk - Brings transcriptomics to the tidyverse
This is a collection of utility functions for exploring and performing calculations on RNA sequencing data in a modular, pipe-friendly and tidy fashion.
Last updated
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsbioconductorbulk-transcriptional-analysesdeseq2differential-expressionedgerensembl-idsentrezgene-symbolsgseamds-dimensionspcapiperedundancytibbletidytidy-datatidyversetranscriptstsne
10.31 score 180 stars 1 dependents 208 scripts 621 downloads
CAMERA - Collection of annotation related methods for mass spectrometry data
Annotation of peaklists generated by xcms, rule based annotation of isotopes and adducts, isotope validation, EIC correlation based tagging of unknown adducts and fragments
Last updated
immunooncologymassspectrometrymetabolomics
10.26 score 14 stars 6 dependents 195 scripts 1.0k downloads
methylumi - Handle Illumina methylation data
This package provides classes for holding and manipulating Illumina methylation data. Based on eSet, it can contain MIAME information, sample information, feature information, and multiple matrices of data. An "intelligent" import function, methylumiR can read the Illumina text files and create a MethyLumiSet. methylumIDAT can directly read raw IDAT files from HumanMethylation27 and HumanMethylation450 microarrays. Normalization, background correction, and quality control features for GoldenGate, Infinium, and Infinium HD arrays are also included.
Last updated
dnamethylationtwochannelpreprocessingqualitycontrolcpgisland
10.23 score 9 stars 17 dependents 100 scripts 2.6k downloads
GenVisR - Genomic Visualizations in R
Produce highly customizable publication quality graphics for genomic data primarily at the cohort level.
Last updated
infrastructuredatarepresentationclassificationdnaseq
10.16 score 224 stars 87 scripts 621 downloads
SC3 - Single-Cell Consensus Clustering
A tool for unsupervised clustering and analysis of single cell RNA-Seq data.
Last updated
immunooncologysinglecellsoftwareclassificationclusteringdimensionreductionsupportvectormachinernaseqvisualizationtranscriptomicsdatarepresentationguidifferentialexpressiontranscriptionbioconductor-packagehuman-cell-atlassingle-cell-rna-seqopenblascpp
10.14 score 129 stars 1 dependents 394 scripts 846 downloads
rhdf5filters - HDF5 Compression Filters
Provides a collection of additional compression filters for HDF5 datasets. The package is intended to provide seamless integration with rhdf5, however the compiled filters can also be used with external applications.
Last updated
infrastructuredataimportcompressionfilter-pluginhdf5
10.12 score 5 stars 230 dependents 7 scripts 42k downloads
Cardinal - A mass spectrometry imaging toolbox for statistical analysis
Implements statistical & computational tools for analyzing mass spectrometry imaging datasets, including methods for efficient pre-processing, spatial segmentation, and classification.
Last updated
softwareinfrastructureproteomicslipidomicsmassspectrometryimagingmassspectrometryimmunooncologynormalizationclusteringclassificationregression
10.02 score 72 stars 231 scripts 732 downloads
bluster - Clustering Algorithms for Bioconductor
Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.
Last updated
immunooncologysoftwaregeneexpressiontranscriptomicssinglecellclusteringcpp
9.97 score 61 dependents 952 scripts 13k downloads
scMerge - scMerge: Merging multiple batches of scRNA-seq data
Like all gene expression data, single-cell data suffers from batch effects and other unwanted variation that make accurate biological interpretation difficult. The scMerge method leverages factor analysis, stably expressed genes (SEGs) and (pseudo-)replicates to remove unwanted variation and merge multiple single-cell datasets. This package contains all the necessary functions in the scMerge pipeline, including the identification of SEGs, replication-identification methods, and merging of single-cell data.
Last updated
batcheffectgeneexpressionnormalizationrnaseqsequencingsinglecellsoftwaretranscriptomicsbioinformaticssingle-cell
9.93 score 73 stars 2 dependents 161 scripts 674 downloads
scuttle - Legacy Utilities for Single-Cell RNA-Seq Analysis
Provides some legacy utility functions for performing single-cell analyses. Most of these functions are deprecated in favor of newer, more performant alternatives. We just keep this package around for back-compatibility and to point to the replacement functions.
Last updated
immunooncologysinglecellrnaseqqualitycontrolpreprocessingnormalizationtranscriptomicsgeneexpressionsequencingsoftwaredataimportopenblascpp
9.86 score 99 dependents 2.8k scripts 17k downloads
cytomapper - Visualization of highly multiplexed imaging data in R
Highly multiplexed imaging acquires the single-cell expression of selected proteins in a spatially-resolved fashion. These measurements can be visualised across multiple length-scales. First, pixel-level intensities represent the spatial distributions of feature expression with highest resolution. Second, after segmentation, expression values or cell-level metadata (e.g. cell-type information) can be visualised on segmented cell areas. This package contains functions for the visualisation of multiplexed read-outs and cell-level information obtained by multiplexed imaging technologies. The main functions of this package allow 1. the visualisation of pixel-level information across multiple channels, 2. the display of cell-level information (expression and/or metadata) on segmentation masks and 3. gating and visualisation of single cells.
Last updated
immunooncologysoftwaresinglecellonechanneltwochannelmultiplecomparisonnormalizationdataimportbioimagingimaging-mass-cytometrysingle-cellspatial-analysis
9.79 score 36 stars 5 dependents 478 scripts 859 downloads
LOLA - Locus overlap analysis for enrichment of genomic ranges
Provides functions for testing overlap of sets of genomic regions with public and custom region set (genomic ranges) databases. This makes it possible to do automated enrichment analysis for genomic region sets, thus facilitating interpretation of functional genomics and epigenomics data.
Last updated
genesetenrichmentgeneregulationgenomeannotationsystemsbiologyfunctionalgenomicschipseqmethylseqsequencing
9.72 score 80 stars 182 scripts 547 downloads
RTCGAToolbox - A new tool for exporting TCGA Firehose data
Managing data from large-scale projects such as The Cancer Genome Atlas (TCGA) for further analysis is an important and time-consuming step in research projects. Several efforts, such as the Firehose project, make pre-processed TCGA data publicly available via web services and data portals, but users still have to manage, download and prepare the data for subsequent steps. We developed an open-source and extensible R-based data client for Firehose pre-processed data and demonstrated its use with sample case studies. Results showed that RTCGAToolbox could improve data management for researchers who are interested in TCGA data. In addition, it can be integrated with other analysis pipelines for downstream data analysis.
Last updated
differentialexpressiongeneexpressionsequencing
9.66 score 19 stars 5 dependents 97 scripts 945 downloads
escape - Easy single cell analysis platform for enrichment
A bridging R package to facilitate gene set enrichment analysis (GSEA) in the context of single-cell RNA sequencing. Using raw count information, Seurat objects, or SingleCellExperiment format, users can perform and visualize ssGSEA, GSVA, AUCell, and UCell-based enrichment calculations across individual cells. Alternatively, escape supports use of rank-based GSEA, such as the use of differential gene expression via fgsea.
Last updated
softwaresinglecellclassificationannotationgenesetenrichmentsequencinggenesignalingpathways
9.65 score 225 stars 1 dependents 208 scripts 926 downloads
memes - motif matching, comparison, and de novo discovery using the MEME Suite
A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.
Last updated
dataimportfunctionalgenomicsgeneregulationmotifannotationmotifdiscoverysequencematchingsoftware
9.64 score 54 stars 2 dependents 159 scripts 716 downloads
RBGL - An interface to the BOOST graph library
A fairly extensive and comprehensive interface to the graph algorithms contained in the BOOST library.
Last updated
graphandnetworknetworkcpp
9.61 score 135 dependents 334 scripts 15k downloads
snpStats - SnpMatrix and XSnpMatrix classes and methods
Classes and statistical methods for large SNP association studies. This extends the earlier snpMatrix package, allowing for uncertainty in genotypes.
Last updated
microarraysnpgeneticvariabilityzlib
9.61 score 19 dependents 824 scripts 3.6k downloads
BayesSpace - Clustering and Resolution Enhancement of Spatial Transcriptomes
Tools for clustering and enhancing the resolution of spatial gene expression experiments. BayesSpace clusters a low-dimensional representation of the gene expression matrix, incorporating a spatial prior to encourage neighboring spots to cluster together. The method can enhance the resolution of the low-dimensional representation into "sub-spots", for which features such as gene expression or cell type composition can be imputed.
Last updated
softwareclusteringtranscriptomicsgeneexpressionsinglecellimmunooncologydataimportopenblascppopenmp
9.59 score 179 stars 1 dependents 398 scripts 903 downloads
rGREAT - GREAT Analysis - Functional Enrichment on Genomic Regions
GREAT (Genomic Regions Enrichment of Annotations Tool) is a type of functional enrichment analysis performed directly on genomic regions. This package implements the GREAT algorithm (the local GREAT analysis) and also supports interacting directly with the GREAT web service (the online GREAT analysis). Both analyses can be viewed in a Shiny application. rGREAT by default supports more than 600 organisms and a large number of gene set collections, as well as user-provided gene sets and organisms. Additionally, it implements a general method for dealing with background regions.
Last updated
genesetenrichmentgopathwayssoftwaresequencingwholegenomegenomeannotationcoveragecpp
9.58 score 98 stars 376 scripts 1.4k downloads
MassSpecWavelet - Peak Detection for Mass Spectrometry data using wavelet-based algorithms
Peak detection in mass spectrometry data is one of the important preprocessing steps. The performance of peak detection affects subsequent processes, including protein identification, profile alignment and biomarker identification. Using the Continuous Wavelet Transform (CWT), this package provides a reliable algorithm for peak detection that does not require any type of smoothing or previous baseline correction method, providing more consistent results for different spectra. See <doi:10.1093/bioinformatics/btl355> for further details.
Last updated
immunooncologymassspectrometryproteomicspeakdetection
9.57 score 11 stars 21 dependents 47 scripts 4.0k downloads
PSMatch - Handling and Managing Peptide Spectrum Matches
The PSMatch package helps proteomics practitioners to load, handle and manage Peptide Spectrum Matches. It provides functions to model peptide-protein relations as adjacency matrices and connected components, visualise these as graphs and make informed decisions about shared peptide filtering. The package also provides functions to calculate and visualise MS2 fragment ions.
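The connected-components idea mentioned in the PSMatch description can be illustrated with a minimal Python sketch. It is not the PSMatch API: the function name `protein_components` and the dictionary input format are hypothetical, chosen only to show how proteins linked by shared peptides fall into groups.

```python
def protein_components(peptide_to_proteins):
    """Group proteins linked, directly or transitively, by shared peptides.

    `peptide_to_proteins` maps each peptide to the proteins it matches
    (a hypothetical input format used only for this illustration).
    """
    parent = {}

    def find(protein):
        # Union-find lookup with path halving.
        parent.setdefault(protein, protein)
        while parent[protein] != protein:
            parent[protein] = parent[parent[protein]]
            protein = parent[protein]
        return protein

    for proteins in peptide_to_proteins.values():
        proteins = list(proteins)
        if not proteins:
            continue
        root = find(proteins[0])  # also registers single-protein peptides
        for other in proteins[1:]:
            parent[find(other)] = root

    groups = {}
    for protein in parent:
        groups.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(group) for group in groups.values())


matches = {"pepA": ["P1", "P2"], "pepB": ["P2", "P3"], "pepC": ["P4"]}
components = protein_components(matches)
```

In this toy input, P1, P2 and P3 end up in one component because pepA and pepB chain them together, while P4 forms its own component; shared peptides are exactly those whose proteins sit in a component with more than one member.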
Last updated
infrastructureproteomicsmassspectrometrymass-spectrometrypeptide-spectrum-matches
9.56 score 6 stars 40 dependents 32 scripts 3.7k downloads
regioneR - Association analysis of genomic regions based on permutation tests
regioneR offers a statistical framework based on customizable permutation tests to assess the association between genomic region sets and other genomic features.
Last updated
geneticschipseqdnaseqmethylseqcopynumbervariation
9.50 score 22 dependents 3.4k scripts 3.5k downloads
bambu - Context-Aware Transcript Quantification from Long Read RNA-Seq data
bambu is an R package for multi-sample transcript discovery and quantification using long-read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can be used directly for visualisation and downstream analysis such as differential gene expression or transcript usage.
Last updated
alignmentcoveragedifferentialexpressionfeatureextractiongeneexpressiongenomeannotationgenomeassemblyimmunooncologylongreadmultiplecomparisonnormalizationrnaseqregressionsequencingsoftwaretranscriptiontranscriptomicsbambubioconductorlong-readsnanoporenanopore-sequencingrna-seqrna-seq-analysistranscript-quantificationtranscript-reconstructioncpp
9.48 score 244 stars 1 dependents 183 scripts 918 downloads
DNAcopy - DNA Copy Number Data Analysis
Implements the circular binary segmentation (CBS) algorithm to segment DNA copy number data and identify genomic regions with abnormal copy number.
Last updated
microarraycopynumbervariationfortran
9.43 score 52 dependents 264 scripts 6.5k downloads
CARNIVAL - A CAusal Reasoning tool for Network Identification (from gene expression data) using Integer VALue programming
An upgraded causal reasoning tool from Melas et al in R with updated assignments of TFs' weights from PROGENy scores. Optimization parameters can be freely adjusted and multiple solutions can be obtained and aggregated.
Last updated
transcriptomicsgeneexpressionnetworkcausal-modelsfootprintsinteger-linear-programmingpathway-enrichment-analysis
9.41 score 62 stars 2 dependents 98 scripts 549 downloads
gage - Generally Applicable Gene-set Enrichment for Pathway Analysis
GAGE is a published method for gene set (enrichment or GSEA) or pathway analysis. GAGE is generally applicable independent of microarray or RNA-Seq data attributes including sample sizes, experimental designs, assay platforms, and other types of heterogeneity, and consistently achieves superior performance over other frequently used methods. In the gage package, we provide functions for basic GAGE analysis, result processing and presentation. We have also built pipeline routines for running multiple GAGE analyses in a batch, comparison between parallel analyses, and combined analysis of heterogeneous data from different sources/studies. In addition, we provide demo microarray data and commonly used gene set data based on KEGG pathways and GO terms. These functions and data are also useful for gene set analysis using other methods.
Last updated
pathwaysgodifferentialexpressionmicroarrayonechanneltwochannelrnaseqgeneticsmultiplecomparisongenesetenrichmentgeneexpressionsystemsbiologysequencing
9.41 score 6 stars 3 dependents 1.1k scripts 1.5k downloads
EWCE - Expression Weighted Celltype Enrichment
Used to determine which cell types are enriched within gene lists. The package provides tools for testing enrichments within simple gene lists (such as human disease associated genes) and those resulting from differential expression studies. The package does not depend upon any particular Single Cell Transcriptome dataset and user defined datasets can be loaded in and used in the analyses.
Last updated
geneexpressiontranscriptiondifferentialexpressiongenesetenrichmentgeneticsmicroarraymrnamicroarrayonechannelrnaseqbiomedicalinformaticsproteomicsvisualizationfunctionalgenomicssinglecelldeconvolutionsingle-cellsingle-cell-rna-seqtranscriptomics
9.39 score 59 stars 131 scripts 473 downloads
scp - Mass Spectrometry-Based Single-Cell Proteomics Data Analysis
Utility functions for manipulating, processing, and analyzing mass spectrometry-based single-cell proteomics data. The package is an extension to the 'QFeatures' package and relies on 'SingleCellExperiment' to enable single-cell proteomics analyses. The package lets the user process quantitative tables (as generated by MaxQuant, Proteome Discoverer, and more) into data tables ready for downstream analysis and data visualization.
Last updated
geneexpressionproteomicssinglecellmassspectrometrypreprocessingcellbasedassaysbioconductormass-spectrometrysingle-cellsoftware
9.34 score 32 stars 272 scripts 424 downloads
impute - impute: Imputation for microarray data
Imputation for microarray data (currently KNN only)
Last updated
microarray
9.33 score 145 dependents 1.4k scripts 17k downloads
Rdisop - Decomposition of Isotopic Patterns
In high resolution mass spectrometry (HR-MS), the measured masses can be decomposed into potential element combinations (chemical sum formulas). Where additional mass/intensity information of respective isotopic peaks is available, decomposition can take this information into account to better rank the potential candidate sum formulas. To compare measured mass/intensity information with the theoretical distribution of candidate sum formulas, the latter needs to be calculated. This package implements fast algorithms to address both tasks, the calculation of isotopic distributions for arbitrary sum formulas (assuming a HR-MS resolution of roughly 30,000), and the ranked list of sum formulas fitting an observed peak or isotopic peak set.
Last updated
immunooncologymassspectrometrymetabolomicsmass-spectrometrycpp
9.31 score 5 stars 3 dependents 118 scripts 1.3k downloads
FRASER - Find RAre Splicing Events in RNA-Seq Data
Detection of rare aberrant splicing events in transcriptome profiles. Read count ratio expectations are modeled by an autoencoder to control for confounding factors in the data. Given these expectations, the ratios are assumed to follow a beta-binomial distribution with a junction specific dispersion. Outlier events are then identified as read-count ratios that deviate significantly from this distribution. FRASER is able to detect alternative splicing, but also intron retention. The package aims to support diagnostics in the field of rare diseases where RNA-seq is performed to identify aberrant splicing defects.
Last updated
rnaseqalternativesplicingsequencingsoftwaregeneticscoverageaberrant-splicingdiagnosticsoutlier-detectionrare-diseaserna-seqsplicingopenblascpp
9.27 score 55 stars 175 scripts 538 downloads
KEGGgraph - KEGGgraph: A graph approach to KEGG PATHWAY in R and Bioconductor
KEGGgraph is an interface between KEGG pathways and graph objects as well as a collection of tools to analyze, dissect and visualize these graphs. It parses the regularly updated KGML (KEGG XML) files into graph models maintaining all essential pathway attributes. The package offers functionality including parsing, graph operations and visualization.
Last updated
pathwaysgraphandnetworkvisualizationkegg
9.26 score 23 dependents 147 scripts 8.9k downloads
clustifyr - Classifier for Single-cell RNA-seq Using Cell Clusters
Package designed to aid in classifying cells from single-cell RNA sequencing data using external reference data (e.g., bulk RNA-seq, scRNA-seq, microarray, gene lists). A variety of correlation based methods and gene list enrichment methods are provided to assist cell type assignment.
Last updated
singlecellannotationsequencingmicroarraygeneexpressionassign-identitiesclustersmarker-genesrna-seqsingle-cell-rna-seq
9.24 score 125 stars 390 scripts 472 downloads
schex - Hexbin plots for single cell omics data
Builds hexbin plots for variables and dimension reduction stored in single cell omics data such as SingleCellExperiment. The ideas used in this package are based on the excellent work of Dan Carr, Nicholas Lewin-Koh, Martin Maechler and Thomas Lumley.
Last updated
softwaresequencingsinglecelldimensionreductionvisualizationimmunooncologydataimport
9.12 score 76 stars 2 dependents 146 scripts 464 downloads
marray - Exploratory analysis for two-color spotted microarray data
Class definitions for two-color spotted microarray data. Functions for data input, diagnostic plots, normalization and quality checking.
Last updated
microarraytwochannelpreprocessing
9.12 score 41 dependents 273 scripts 3.3k downloads
FlowSOM - Using self-organizing maps for visualization and interpretation of cytometry data
FlowSOM offers visualization options for cytometry data, by using Self-Organizing Map clustering and Minimal Spanning Trees.
Last updated
cellbiologyflowcytometryclusteringvisualizationsoftwarecellbasedassays
9.11 score 10 dependents 510 scripts 2.1k downloads
survcomp - Performance Assessment and Comparison for Survival Analysis
Assessment and Comparison for Performance of Risk Prediction (Survival) Models.
Last updated
geneexpressiondifferentialexpressionvisualizationcpp
9.01 score 14 dependents 508 scripts 2.4k downloads
sva - Surrogate Variable Analysis
The sva package contains functions for removing batch effects and other unwanted variation in high-throughput experiments. Specifically, the sva package contains functions for identifying and building surrogate variables for high-dimensional data sets. Surrogate variables are covariates constructed directly from high-dimensional data (like gene expression/RNA sequencing/methylation/brain imaging data) that can be used in subsequent analyses to adjust for unknown, unmodeled, or latent sources of noise. The sva package can be used to remove artifacts in three ways: (1) identifying and estimating surrogate variables for unknown sources of variation in high-throughput experiments (Leek and Storey 2007 PLoS Genetics, 2008 PNAS), (2) directly removing known batch effects using ComBat (Johnson et al. 2007 Biostatistics) and (3) removing batch effects with known control probes (Leek 2014 bioRxiv). Removing batch effects and using surrogate variables in differential expression analysis have been shown to reduce dependence, stabilize error rate estimates, and improve reproducibility (see Leek and Storey 2007 PLoS Genetics, 2008 PNAS or Leek et al. 2011 Nat. Reviews Genetics).
Last updated
immunooncologymicroarraystatisticalmethodpreprocessingmultiplecomparisonsequencingrnaseqbatcheffectnormalization
8.96 score 54 dependents 4.5k scripts 13k downloads
tidySingleCellExperiment - Brings SingleCellExperiment to the Tidyverse
'tidySingleCellExperiment' is an adapter that abstracts the 'SingleCellExperiment' container in the form of a 'tibble'. This allows *tidy* data manipulation, nesting, and plotting. For example, a 'tidySingleCellExperiment' is directly compatible with functions from 'tidyverse' packages `dplyr` and `tidyr`, as well as plotting with `ggplot2` and `plotly`. In addition, the package provides various utility functions specific to single-cell omics data analysis (e.g., aggregation of cell-level data to pseudobulks).
Last updated
assaydomaininfrastructurernaseqdifferentialexpressionsinglecellgeneexpressionnormalizationclusteringqualitycontrolsequencingbioconductordplyrggplot2plotlysingle-cell-rna-seqsingle-cell-sequencingsinglecellexperimenttibbletidyrtidyverse
8.96 score 37 stars 2 dependents 185 scripts 678 downloads
MsExperiment - Infrastructure for Mass Spectrometry Experiments
Infrastructure to store and manage all aspects related to a complete proteomics or metabolomics mass spectrometry (MS) experiment. The MsExperiment package provides light-weight and flexible containers for MS experiments building on the new MS infrastructure provided by the Spectra, QFeatures and related packages. Along with raw data representations, links to original data files and sample annotations, additional metadata or annotations can also be stored within the MsExperiment container. To guarantee maximum flexibility only minimal constraints are put on the type and content of the data within the containers.
Last updated
infrastructureproteomicsmassspectrometrymetabolomicsexperimentaldesigndataimport
8.94 score 5 stars 17 dependents 311 scripts 1.8k downloads
ProtGenerics - Generic infrastructure for Bioconductor mass spectrometry packages
S4 generic functions and classes needed by Bioconductor proteomics packages.
Last updated
infrastructureproteomicsmassspectrometrybioconductormass-spectrometrymetabolomics
8.92 score 8 stars 204 dependents 9 scripts 19k downloads
cmapR - CMap Tools in R
The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.
Last updated
dataimportdatarepresentationgeneexpressionbioconductorbioinformaticscmap
8.91 score 93 stars 324 scripts 976 downloads
Voyager - From geospatial to spatial omics
SpatialFeatureExperiment (SFE) is a new S4 class for working with spatial single-cell genomics data. The Voyager package implements basic exploratory spatial data analysis (ESDA) methods for SFE. Univariate methods include global spatial ESDA methods such as Moran's I, permutation testing for Moran's I, and correlograms. Bivariate methods include Lee's L and the cross variogram. Multivariate methods include MULTISPATI PCA and the multivariate local Geary's C recently developed by Anselin. The Voyager package also implements plotting functions to plot SFE data and ESDA results.
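As an illustration of the global Moran's I statistic named in the description, here is a minimal Python sketch of the textbook formula, I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2, where m is the mean and W is the sum of all weights. It is not Voyager's implementation; the function name `morans_i` and the dense list-of-lists weight matrix are assumptions made for this example.

```python
def morans_i(values, weights):
    """Global Moran's I for `values` under a dense spatial weight matrix.

    `weights[i][j]` is the spatial weight between locations i and j
    (for example 1 for neighbours and 0 otherwise).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted sum of cross-products of deviations between location pairs.
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    return (n / total_weight) * cross / sum(d * d for d in dev)


# Four locations on a line; each is a neighbour of the adjacent ones.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = morans_i([1, 2, 3, 4], chain)
alternating = morans_i([1, -1, 1, -1], chain)
```

A smoothly increasing signal gives a positive value (neighbours are similar), while a signal that flips sign between neighbours gives a negative one.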
Last updated
geneexpressionspatialtranscriptomicsvisualizationbioconductoredaesdaexploratory-data-analysisomicsspatial-statisticsspatial-transcriptomics
8.89 score 103 stars 468 scripts 536 downloads
affyio - Tools for parsing Affymetrix data files
Routines for parsing Affymetrix data files based upon file format information. Primary focus is on accessing the CEL and CDF file formats.
Last updated
microarraydataimportinfrastructurezlib
8.88 score 4 stars 118 dependents 51 scripts 13k downloads
MicrobiotaProcess - A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).
Last updated
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
8.85 score 196 stars 215 scripts 812 downloads
InteractiveComplexHeatmap - Make Interactive Complex Heatmaps
This package can easily make heatmaps which are produced by the ComplexHeatmap package into interactive applications. It provides two types of interactivities: 1. on the interactive graphics device, and 2. on a Shiny app. It also provides functions for integrating the interactive heatmap widgets for more complex Shiny app development.
Last updated
softwarevisualizationsequencinginteractive-heatmaps
8.82 score 141 stars 4 dependents 185 scripts 1.0k downloads
affxparser - Affymetrix File Parsing SDK
Package for parsing Affymetrix files (CDF, CEL, CHP, BPMAP, BAR). It provides methods for fast and memory-efficient parsing of Affymetrix files using the Affymetrix Fusion SDK. Both ASCII- and binary-based files are supported. Currently, there are methods for reading chip definition files (CDF) and cell intensity files (CEL). These files can be read either in full or in part. For example, probe signals from a few probesets can be extracted very quickly from a set of CEL files into a convenient list structure.
Last updated
infrastructuredataimportmicroarrayproprietaryplatformsonechannelbioconductorcpp
8.81 score 8 stars 14 dependents 88 scripts 3.2k downloads
tidySummarizedExperiment - Brings SummarizedExperiment to the Tidyverse
The tidySummarizedExperiment package provides a set of tools for creating and manipulating tidy data representations of SummarizedExperiment objects. SummarizedExperiment is a widely used data structure in bioinformatics for storing high-throughput genomic data, such as gene expression or DNA sequencing data. The tidySummarizedExperiment package introduces a tidy framework for working with SummarizedExperiment objects. It allows users to convert their data into a tidy format, where each observation is a row and each variable is a column. This tidy representation simplifies data manipulation, integration with other tidyverse packages, and enables seamless integration with the broader ecosystem of tidy tools for data analysis.
Last updated
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsbioconductorgenomicssummarizedexperimenttidyverse
8.81 score 30 stars 1 dependents 240 scripts 601 downloads
rrvgo - Reduce + Visualize GO
Reduce and visualize lists of Gene Ontology terms by identifying redundancy based on semantic similarity.
Last updated
annotationclusteringgonetworkpathwayssoftware
8.78 score 36 stars 1 dependents 368 scripts 944 downloads
BioQC - Detect tissue heterogeneity in expression profiles with gene sets
BioQC performs quality control of high-throughput expression data based on tissue gene signatures. It can detect tissue heterogeneity in gene expression data. The core algorithm is a Wilcoxon-Mann-Whitney test that is optimised for high performance.
Last updated
geneexpressionqualitycontrolstatisticalmethodgenesetenrichmentcpp
8.68 score 5 stars 1 dependents 94 scripts 557 downloads
alabaster.base - Save Bioconductor Objects to File
Save Bioconductor data structures into file artifacts, and load them back into memory. This is a more robust and portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
datarepresentationdataimportcurlopensslzlibcpp
8.61 score 4 stars 17 dependents 67 scripts 6.0k downloads
ScaledMatrix - Creating a DelayedMatrix of Scaled and Centered Values
Provides delayed computation of a matrix of scaled and centered values. The result is equivalent to using the scale() function but avoids explicit realization of a dense matrix during block processing. This permits greater efficiency in common operations, most notably matrix multiplication.
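The idea of multiplying by a scaled and centered matrix without ever materializing it can be illustrated with a small sketch. This is not the ScaledMatrix implementation (which is written for R and DelayedArray); the function names below are hypothetical, and plain Python lists stand in for matrices. It relies only on the identity ((X - 1c^T) D^-1) Y = X (D^-1 Y) - 1 (c^T D^-1 Y), where c holds the column centers and D is the diagonal matrix of column scales.

```python
def matmul(a, b):
    """Plain dense matrix product of two lists-of-lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def explicit_scaled_matmul(x, center, scale, y):
    """Reference version: build the centered and scaled copy of x, then multiply."""
    scaled = [[(row[k] - center[k]) / scale[k] for k in range(len(row))]
              for row in x]
    return matmul(scaled, y)


def delayed_scaled_matmul(x, center, scale, y):
    """Same product, but x itself is never centered or scaled.

    x is n-by-p, center and scale have length p, y is p-by-m.
    """
    p, m = len(y), len(y[0])
    # Fold the column scaling into the right-hand matrix: D^-1 Y.
    y_scaled = [[y[k][j] / scale[k] for j in range(m)] for k in range(p)]
    # Multiply the untouched x (which could stay sparse or file-backed).
    xy = matmul(x, y_scaled)
    # The centering contributes one offset per output column: c^T D^-1 Y.
    offset = [sum(center[k] * y_scaled[k][j] for k in range(p))
              for j in range(m)]
    return [[value - offset[j] for j, value in enumerate(row)] for row in xy]
```

Because only `y` is rescaled and the centering collapses into a single row of offsets, the original matrix can remain in whatever compact representation it already has.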
Last updated
softwaredatarepresentation
8.58 score 125 dependents 15 scripts 22k downloads
simplifyEnrichment - Simplify Functional Enrichment Results
A new clustering algorithm, "binary cut", for clustering similarity matrices of functional terms is implemented in this package. It also provides functions for visualizing, summarizing and comparing the clusterings.
Last updated
softwarevisualizationgoclusteringgenesetenrichment
8.55 score 125 stars 298 scripts 2.1k downloads
geneplotter - Graphics related functions for Bioconductor
Functions for plotting genomic data
Last updated
visualization
8.51 score 11 dependents 336 scripts 7.2k downloads
gtrellis - Genome Level Trellis Layout
Genome level Trellis graph visualizes genomic data conditioned by genomic categories (e.g. chromosomes). For each genomic category, multiple dimensional data which are represented as tracks describe different features from different aspects. This package provides high flexibility to arrange genomic categories and to add self-defined graphics in the plot.
Last updated
softwarevisualizationsequencing
8.50 score 43 stars 1 dependents 61 scripts 642 downloads
miaViz - Microbiome Analysis Plotting and Visualization
The miaViz package implements functions to visualize TreeSummarizedExperiment objects especially in the context of microbiome analysis. Part of the mia family of R/Bioconductor packages.
Last updated
microbiomesoftwarevisualizationbioconductormicrobiome-analysisplotting
8.50 score 12 stars 2 dependents 233 scripts 773 downloads
SPIAT - Spatial Image Analysis of Tissues
SPIAT (Spatial Image Analysis of Tissues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.
Last updated
biomedicalinformaticscellbiologyspatialclusteringdataimportimmunooncologyqualitycontrolsinglecellsoftwarevisualization
8.47 score 28 stars 74 scripts 442 downloads
qpgraph - Estimation of Genetic and Molecular Regulatory Networks from High-Throughput Genomics Data
Estimate gene and eQTL networks from high-throughput expression and genotyping assays.
Last updated
microarraygeneexpressiontranscriptionpathwaysnetworkinferencegraphandnetworkgeneregulationgeneticsgeneticvariabilitysnpsoftwareopenblas
8.46 score 3 stars 3 dependents 20 scripts 651 downloads
MSstatsPTM - Statistical Characterization of Post-translational Modifications
MSstatsPTM provides general statistical methods for quantitative characterization of post-translational modifications (PTMs). Supports DDA, DIA, SRM, and tandem mass tag (TMT) labeling. Typically, the analysis involves the quantification of PTM sites (i.e., modified residues) and their corresponding proteins, as well as the integration of the quantification results. MSstatsPTM provides functions for summarization, estimation of PTM site abundance, and detection of changes in PTMs across experimental conditions.
Last updated
immunooncologymassspectrometryproteomicssoftwaredifferentialexpressiononechanneltwochannelnormalizationqualitycontrolpost-translational-modificationcpp
8.46 score 14 stars 2 dependents 53 scripts 557 downloads
recount3 - Explore and download data from the recount3 project
The recount3 package enables access to a large amount of uniformly processed RNA-seq data from human and mouse. You can download RangedSummarizedExperiment objects at the gene, exon or exon-exon junctions level with sample metadata and QC statistics. In addition we provide access to sample coverage BigWig files.
Last updated
coveragedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaredataimportannotation-agnosticbioconductorcountderfinderexongenehumanilluminajunctionmouserecountrecount3
8.41 score 38 stars 284 scripts 900 downloads
TRONCO - TRONCO, an R package for TRanslational ONCOlogy
The TRONCO (TRanslational ONCOlogy) R package collects algorithms to infer progression models via the Suppes-Bayes Causal Network approach, both from an ensemble of tumors (cross-sectional samples) and within an individual patient (multi-region or single-cell samples). The package provides parallel implementations of algorithms that process binary matrices where each row represents a tumor sample and each column a single-nucleotide or structural variant driving the progression; a 0/1 value models the absence/presence of that alteration in the sample. The tool can import data from plain, MAF or GISTIC format files, and can fetch it from the cBioPortal for cancer genomics. Functions for data manipulation and visualization are provided, as well as functions to import/export such data to other bioinformatics tools for, e.g., clustering or detection of mutually exclusive alterations. Inferred models can be visualized and tested for their confidence via bootstrap and cross-validation. TRONCO is used for the implementation of the Pipeline for Cancer Inference (PICNIC).
Last updated
biomedicalinformaticsbayesiangraphandnetworksomaticmutationnetworkinferencenetworkclusteringdataimportsinglecellimmunooncologyalgorithmscancer-inferencetumors
8.39 score 30 stars 42 scripts 451 downloads
flowStats - Statistical methods for the analysis of flow cytometry data
Methods and functionality to analyse flow data that is beyond the basic infrastructure provided by the flowCore package.
Last updated
immunooncologyflowcytometrycellbasedassays
8.39 score 14 stars 1 dependents 209 scripts 1.3k downloads
metapod - Meta-Analyses on P-Values of Differential Analyses
Implements a variety of methods for combining p-values in differential analyses of genome-scale datasets. Functions can combine p-values across different tests in the same analysis (e.g., genomic windows in ChIP-seq, exons in RNA-seq) or for corresponding tests across separate analyses (e.g., replicated comparisons, effect of different treatment conditions). Support is provided for handling log-transformed input p-values, missing values and weighting where appropriate.
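As an illustration of what combining p-values means, the sketch below implements two standard textbook methods, Simes' method and Fisher's method, in plain Python. It is not metapod's code or API: the function names are hypothetical, missing values are represented by `None` purely for this example, and Fisher's method here assumes independent tests with strictly positive p-values.

```python
import math


def combine_simes(pvalues):
    """Simes' combined p-value: min over ranks i of n * p_(i) / i."""
    ps = sorted(p for p in pvalues if p is not None)  # drop missing values
    n = len(ps)
    if n == 0:
        return None
    return min(1.0, min(n * p / rank for rank, p in enumerate(ps, start=1)))


def combine_fisher(pvalues):
    """Fisher's combined p-value for independent tests (all p > 0).

    The statistic -2 * sum(log p) is chi-squared with 2n degrees of freedom.
    For an even number of degrees of freedom the upper tail has the closed
    form exp(-h) * sum_{k < n} h**k / k!, with h equal to half the statistic.
    """
    ps = [p for p in pvalues if p is not None]  # drop missing values
    n = len(ps)
    if n == 0:
        return None
    half_stat = -sum(math.log(p) for p in ps)
    term, total = 1.0, 0.0
    for k in range(n):
        total += term
        term *= half_stat / (k + 1)
    return math.exp(-half_stat) * total
```

For instance, `combine_simes([0.01, 0.04, 0.03])` compares 3 x 0.01 / 1, 3 x 0.03 / 2 and 3 x 0.04 / 3 and returns the smallest of them.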
Last updated
multiplecomparisondifferentialpeakcallingcpp
8.38 score 2 stars 55 dependents 20 scripts 9.2k downloads
MsFeatures - Functionality for Mass Spectrometry Features
The MsFeatures package defines functionality for mass spectrometry features. This includes functions to group (LC-MS) features based on some of their properties, such as retention time (coeluting features) or correlation of signals across samples. The package thus allows features to be grouped, and its results can be used as input for the `QFeatures` package, which can aggregate abundance levels of features within each group. This package defines concepts and functions for base and common data types; implementations for more specific data types are expected to be provided in the respective packages (e.g. `xcms`). All functionality of this package is implemented in a modular way, which allows different grouping approaches to be combined and enables its re-use in other R packages.
Last updated
infrastructuremassspectrometrymetabolomics
8.37 score 7 stars 15 dependents 51 scripts 2.4k downloads
affyPLM - Methods for fitting probe-level models
A package that extends and improves the functionality of the base affy package. Routines that make heavy use of compiled code for speed. Central focus is on implementation of methods for fitting probe-level models and tools using these models. PLM based quality assessment tools.
Last updated
microarrayonechannelpreprocessingqualitycontrolopenblaszlib
8.36 score 4 dependents 239 scripts 2.0k downloads
csaw - ChIP-Seq Analysis with Windows
Detection of differentially bound regions in ChIP-seq data with sliding windows, with methods for normalization and proper FDR control.
Last updated
multiplecomparisonchipseqnormalizationsequencingcoveragegeneticsannotationdifferentialpeakcallingcurlbzip2xz-utilszlibcpp
8.35 score 9 dependents 550 scripts 999 downloads
TreeSummarizedExperiment - TreeSummarizedExperiment: a S4 Class for Data with Tree Structures
TreeSummarizedExperiment has extended SingleCellExperiment to include hierarchical information on the rows or columns of the rectangular data.
Last updated
datarepresentationinfrastructure
8.34 score 19 dependents 322 scripts 4.0k downloads
TOAST - Tools for the analysis of heterogeneous tissues
This package is devoted to analyzing high-throughput data (e.g. gene expression microarray, DNA methylation microarray, RNA-seq) from complex tissues. Current functionalities include 1. detect cell-type specific or cross-cell type differential signals 2. tree-based differential analysis 3. improve variable selection in reference-free deconvolution 4. partial reference-free deconvolution with prior knowledge.
Last updated
dnamethylationgeneexpressiondifferentialexpressiondifferentialmethylationmicroarraygenetargetepigeneticsmethylationarray
8.34 score 14 stars 4 dependents 120 scripts 1.1k downloads
PeacoQC - Peak-based selection of high quality cytometry data
This is a package that includes pre-processing and quality control functions that can remove margin events, compensate and transform the data and that will use PeacoQCSignalStability for quality control. This last function will first detect peaks in each channel of the flowframe. It will remove anomalies based on the IsolationTree function and the MAD outlier detection method. This package can be used for both flow- and mass cytometry data.
Last updated
flowcytometryqualitycontrolpreprocessingpeakdetection
8.28 score 21 stars 4 dependents 47 scripts 574 downloads
proDA - Differential Abundance Analysis of Label-Free Mass Spectrometry Data
Account for missing values in label-free mass spectrometry data without imputation. The package implements a probabilistic dropout model that ensures that the information from observed and missing values are properly combined. It adds empirical Bayesian priors to increase power to detect differentially abundant proteins.
Last updated
proteomicsmassspectrometrydifferentialexpressionbayesianregressionsoftwarenormalizationqualitycontrol
8.28 score 23 stars 2 dependents 76 scripts 543 downloads
monaLisa - Binned Motif Enrichment Analysis and Visualization
Useful functions to work with sequence motifs in the analysis of genomics data. These include methods to annotate genomic regions or sequences with predicted motif hits and to identify motifs that drive observed changes in accessibility or expression. Functions to produce informative visualizations of the obtained results are also provided.
Last updated
motifannotationvisualizationfeatureextractionepigenetics
8.23 score 47 stars 87 scripts 612 downloads
signatureSearch - Environment for Gene Expression Searching Combined with Functional Enrichment Analysis
This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.
Last updated
softwaregeneexpressiongonetworkenrichmentsequencingcoveragedifferentialexpressioncpp
8.21 score 23 stars 1 dependents 95 scripts 540 downloads
GeneTonic - Enjoy Analyzing And Integrating The Results From Differential Expression Analysis And Functional Enrichment Analysis
This package provides functionality to combine the existing pieces of the transcriptome data and results, making it easier to generate insightful observations and hypotheses. Its usage is made easy with a Shiny application, combining the benefits of interactivity and reproducibility, e.g. by capturing the features and gene sets of interest highlighted during the live session, and creating an HTML report as an artifact where text, code, and output coexist. Using the GeneTonicList as a standardized container for all the required components, it is possible to simplify the generation of multiple visualizations and summaries.
Last updated
guigeneexpressionsoftwaretranscriptiontranscriptomicsvisualizationdifferentialexpressionpathwaysreportwritinggenesetenrichmentannotationgoshinyappsbioconductorbioconductor-packagedata-explorationdata-visualizationfunctional-enrichment-analysisgene-expressionpathway-analysisreproducible-researchrna-seq-analysisrna-seq-datashinytranscriptomeuser-friendly
8.20 score 84 stars 1 dependents 52 scripts 438 downloads
VennDetail - Comprehensive Visualization and Analysis of Multi-Set Intersections
A comprehensive package for visualizing multi-set intersections and extracting detailed subset information. VennDetail generates high-resolution visualizations including traditional Venn diagrams, Venn-pie plots, and UpSet-style plots. It provides functions to extract and combine subset details with user datasets in various formats. The package is particularly useful for bioinformatics applications but can be used for any multi-set analysis.
Last updated
datarepresentationgraphandnetworkvisualizationsoftwareextractvenndiagram
8.13 score 30 stars 111 scripts 434 downloads
target - Predict Combined Function of Transcription Factors
Implements the BETA algorithm for inferring direct target genes from DNA-binding and perturbation expression data, Wang et al. (2013) <doi: 10.1038/nprot.2013.150>. Extends the algorithm to predict the combined function of two DNA-binding elements from comparable binding and expression data.
Last updated
softwarestatisticalmethodtranscriptionalgorithmchip-seqdna-bindinggene-regulationtranscription-factors
8.09 score 5 stars 1.4k scripts 368 downloads
scry - Small-Count Analysis Methods for High-Dimensional Data
Many modern biological datasets consist of small counts that are not well fit by standard linear-Gaussian methods such as principal component analysis. This package provides implementations of count-based feature selection and dimension reduction algorithms. These methods can be used to facilitate unsupervised analysis of any high-dimensional data such as single-cell RNA-seq.
Last updated
dimensionreductiongeneexpressionnormalizationprincipalcomponentrnaseqsoftwaresequencingsinglecelltranscriptomics
8.07 score 23 stars 1 dependents 171 scripts 742 downloads
Heatplus - Heatmaps with row and/or column covariates and colored clusters
Display a rectangular heatmap (intensity plot) of a data matrix. By default, both samples (columns) and features (row) of the matrix are sorted according to a hierarchical clustering, and the corresponding dendrogram is plotted. Optionally, panels with additional information about samples and features can be added to the plot.
Last updated
microarrayvisualization
8.05 score 2 stars 6 dependents 105 scripts 799 downloads
HIBAG - HLA Genotype Imputation with Attribute Bagging
Imputes HLA classical alleles using GWAS SNP data, and it relies on a training set of HLA and SNP genotypes. HIBAG can be used by researchers with published parameter estimates instead of requiring access to large training sample datasets. It combines the concepts of attribute bagging, an ensemble classifier method, with haplotype inference for SNPs and HLA types. Attribute bagging is a technique which improves the accuracy and stability of classifier ensembles using bootstrap aggregating and random variable selection.
Last updated
geneticsstatisticalmethodbioinformaticsgpuhlaimputationmhcsnpcpp
8.02 score 31 stars 56 scripts 548 downloads
siggenes - Multiple Testing using SAM and Efron's Empirical Bayes Approaches
Identification of differentially expressed genes and estimation of the False Discovery Rate (FDR) using both the Significance Analysis of Microarrays (SAM) and the Empirical Bayes Analyses of Microarrays (EBAM).
Last updated
multiplecomparisonmicroarraygeneexpressionsnpexonarraydifferentialexpression
7.99 score 40 dependents 80 scripts 5.1k downloads
RBioFormats - R interface to Bio-Formats
An R package which interfaces the OME Bio-Formats Java library to allow reading of proprietary microscopy image data and metadata.
Last updated
dataimportbio-formatsbioconductorimage-processingopenjdk
7.99 score 27 stars 4 dependents 100 scripts 676 downloads
dir.expiry - Managing Expiration for Cache Directories
Implements an expiration system for access to versioned directories. Directories that have not been accessed by a registered function within a certain time frame are deleted. This aims to reduce disk usage by eliminating obsolete caches generated by old versions of packages.
Last updated
softwareinfrastructure
7.98 score 402 dependents 12 scripts 6.6k downloads
spicyR - Spatial analysis of in situ cytometry data
The spicyR package provides a framework for performing inference on changes in spatial relationships between pairs of cell types for cell-resolution spatial omics technologies. spicyR consists of three primary steps: (i) summarizing the degree of spatial localization between pairs of cell types for each image; (ii) modelling the variability in localization summary statistics as a function of cell counts and (iii) testing for changes in spatial localizations associated with a response variable.
Last updated
singlecellcellbasedassaysspatial
7.98 score 12 stars 1 dependents 73 scripts 584 downloads
DEXSeq - Inference of differential exon usage in RNA-Seq
The package is focused on finding differential exon usage using RNA-seq exon counts between samples with different experimental designs. It provides functions that allow the user to perform the necessary statistical tests based on a model that uses the negative binomial distribution to estimate the variance between biological replicates and generalized linear models for testing. The package also provides functions for the visualization and exploration of the results.
Last updated
immunooncologysequencingrnaseqdifferentialexpressionalternativesplicingdifferentialsplicinggeneexpressionvisualization
7.95 score 5 dependents 498 scripts 3.0k downloads
ROC - utilities for ROC, with microarray focus
Provide utilities for ROC, with microarray focus.
Last updated
differentialexpression
7.95 score 12 dependents 72 scripts 1.9k downloads
methylSig - MethylSig: Differential Methylation Testing for WGBS and RRBS Data
MethylSig is a package for testing for differentially methylated cytosines (DMCs) or regions (DMRs) in whole-genome bisulfite sequencing (WGBS) or reduced representation bisulfite sequencing (RRBS) experiments. MethylSig uses a beta binomial model to test for significant differences between groups of samples. Several options exist for either site-specific or sliding window tests, and variance estimation.
Last updated
dnamethylationdifferentialmethylationepigeneticsregressionmethylseqdifferential-methylationdna-methylation
7.94 score 20 stars 36 scripts 344 downloads
nullranges - Generation of null ranges via bootstrapping or covariate matching
Modular package for generation of sets of ranges representing the null hypothesis. These can take the form of bootstrap samples of ranges (using the block bootstrap framework of Bickel et al 2010), or sets of control ranges that are matched across one or more covariates. nullranges is designed to be inter-operable with other packages for analysis of genomic overlap enrichment, including the plyranges Bioconductor package.
Last updated
visualizationgenesetenrichmentfunctionalgenomicsepigeneticsgeneregulationgenetargetgenomeannotationannotationgenomewideassociationhistonemodificationchipseqatacseqdnaseseqrnaseqhiddenmarkovmodelbioconductorbootstrapgenomicsmatchingstatistics
7.91 score 27 stars 72 scripts 562 downloads
BioNERO - Biological Network Reconstruction Omnibus
BioNERO aims to integrate all aspects of biological network inference in a single package, including data preprocessing, exploratory analyses, network inference, and analyses for biological interpretations. BioNERO can be used to infer gene coexpression networks (GCNs) and gene regulatory networks (GRNs) from gene expression data. Additionally, it can be used to explore topological properties of protein-protein interaction (PPI) networks. GCN inference relies on the popular WGCNA algorithm. GRN inference is based on the "wisdom of the crowds" principle, which consists of inferring GRNs with multiple algorithms (here, CLR, GENIE3 and ARACNE) and calculating the average rank for each interaction pair. As all steps of network analysis are included in this package, users do not have to learn the syntax of several packages or how to pass data between them. Finally, users can also identify consensus modules across independent expression sets and calculate intra- and interspecies module preservation statistics between different networks.
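The average-rank step of the "wisdom of the crowds" principle can be sketched in a few lines. This is an illustration in plain Python, not BioNERO's implementation: the function name, the input layout (one dictionary of edge scores per algorithm) and the tie-breaking by edge name are all assumptions made for the example.

```python
def average_rank(scores_by_method):
    """Rank each regulator-target pair within every method, then average.

    scores_by_method maps a method name to a dict of
    {(regulator, target): score}, where a higher score means a more
    confident edge. Only edges scored by every method are kept.
    Returns (edge, mean_rank) tuples, best (lowest mean rank) first.
    """
    edges = set.intersection(*(set(s) for s in scores_by_method.values()))
    totals = {edge: 0.0 for edge in edges}
    for scores in scores_by_method.values():
        # Rank 1 is the highest score; ties are broken by edge name.
        ordered = sorted(edges, key=lambda e: (-scores[e], e))
        for rank, edge in enumerate(ordered, start=1):
            totals[edge] += rank
    n_methods = len(scores_by_method)
    return sorted(((edge, totals[edge] / n_methods) for edge in edges),
                  key=lambda item: (item[1], item[0]))
```

Averaging ranks rather than raw scores keeps the methods comparable even though their scores live on different scales.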
Last updated
softwaregeneexpressiongeneregulationsystemsbiologygraphandnetworkpreprocessingnetworknetworkinference
7.91 score 36 stars 1 dependents 63 scripts 694 downloads
POMA - Tools for Omics Data Analysis
The POMA package offers a comprehensive toolkit designed for omics data analysis, streamlining the process from initial visualization to final statistical analysis. Its primary goal is to simplify and unify the various steps involved in omics data processing, making it more accessible and manageable within a single, intuitive R package. Emphasizing reproducibility and user-friendliness, POMA leverages the standardized SummarizedExperiment class from Bioconductor, ensuring seamless integration and compatibility with a wide array of Bioconductor tools. This approach guarantees maximum flexibility and replicability, making POMA an essential asset for researchers handling omics datasets. See https://github.com/pcastellanoescuder/POMAShiny and Castellano-Escuder et al. (2021) <doi:10.1371/journal.pcbi.1009148> for more details.
Last updated
batcheffectclassificationclusteringdecisiontreedimensionreductionmultidimensionalscalingnormalizationpreprocessingprincipalcomponentregressionrnaseqsoftwarestatisticalmethodvisualizationbioconductorbioinformaticsdata-visualizationdimension-reductionexploratory-data-analysismachine-learningomics-data-integrationpipelinepre-processingstatistical-analysisuser-friendlyworkflow
7.91 score 14 stars 1 dependents 48 scripts 428 downloads
beadarray - Quality assessment and low-level analysis for Illumina BeadArray data
The package is able to read bead-level data (raw TIFFs and text files) output by BeadScan as well as bead-summary data from BeadStudio. Methods for quality assessment and low-level analysis are provided.
Last updated
microarrayonechannelqualitycontrolpreprocessing
7.90 score 3 dependents 75 scripts 1.2k downloads
nnSVG - Scalable identification of spatially variable genes in spatially-resolved transcriptomics data
Method for scalable identification of spatially variable genes (SVGs) in spatially-resolved transcriptomics data. The method is based on nearest-neighbor Gaussian processes and uses the BRISC algorithm for model fitting and parameter estimation. Allows identification and ranking of SVGs with flexible length scales across a tissue slide or within spatial domains defined by covariates. Scales linearly with the number of spatial locations and can be applied to datasets containing thousands or more spatial locations.
Last updated
spatialsinglecelltranscriptomicsgeneexpressionpreprocessing
7.84 score 24 stars 1 dependents 319 scripts 472 downloads
velociraptor - Toolkit for Single-Cell Velocity
This package provides Bioconductor-friendly wrappers for RNA velocity calculations in single-cell RNA-seq data. We use the basilisk package to manage Conda environments, and the zellkonverter package to convert data structures between SingleCellExperiment (R) and AnnData (Python). The information produced by the velocity methods is stored in the various components of the SingleCellExperiment class.
Last updated
singlecellgeneexpressionsequencingcoveragerna-velocity
7.83 score 61 stars 62 scripts 416 downloads
standR - Spatial transcriptome analyses of Nanostring's DSP data in R
standR is a user-friendly R package providing functions to assist in conducting good-practice analysis of Nanostring's GeoMX DSP data. All functions in the package are built on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. standR allows data inspection, quality control, normalization, batch correction and evaluation with informative visualizations.
Last updated
spatialtranscriptomicsgeneexpressiondifferentialexpressionqualitycontrolnormalizationexperimenthubsoftware
7.77 score 25 stars 1 dependents 75 scripts 421 downloads
MLInterfaces - Uniform interfaces to R machine learning procedures for data in Bioconductor containers
This package provides uniform interfaces to machine learning code for data in R and Bioconductor containers.
Last updated
classificationclustering
7.75 score 5 dependents 81 scripts 1.0k downloads
edge - Extraction of Differential Gene Expression
The edge package implements methods for carrying out differential expression analyses of genome-wide gene expression studies. Significance testing using the optimal discovery procedure and generalized likelihood ratio tests (equivalent to F-tests and t-tests) are implemented for general study designs. Special functions are available to facilitate the analysis of common study designs, including time course experiments. Other packages such as sva and qvalue are integrated in edge to provide a wide range of tools for gene expression analysis.
Last updated
multiplecomparisondifferentialexpressiontimecourseregressiongeneexpressiondataimport
7.74 score 22 stars 83 scripts 439 downloads
dittoSeq - User Friendly Single-Cell and Bulk RNA Sequencing Visualization
A universal, user-friendly, single-cell and bulk RNA sequencing visualization toolkit that allows highly customizable creation of color blindness friendly, publication-quality figures. dittoSeq accepts both SingleCellExperiment (SCE) and Seurat objects, as well as the import and usage, via conversion to an SCE, of SummarizedExperiment or DGEList bulk data. Visualizations include dimensionality reduction plots, heatmaps, scatterplots, percent composition or expression across groups, and more. Customizations range from size and title adjustments to automatic generation of annotations for heatmaps, overlay of trajectory analysis onto any dimensionality reduction plot, hidden data overlay upon cursor hovering via ggplotly conversion, and many more, all with simple, discrete inputs. Color blindness friendliness is powered by legend adjustments (enlarged keys), and by allowing the use of shapes or letter-overlay in addition to the carefully selected dittoColors().
Last updated
softwarevisualizationrnaseqsinglecellgeneexpressiontranscriptomicsdataimport
7.72 score 2 dependents 2.1k scripts 2.0k downloads
SpatialDecon - Deconvolution of mixed cells from spatial and/or bulk gene expression data
Using spatial or bulk gene expression data, estimates abundance of mixed cell types within each observation. Based on "Advances in mixed cell deconvolution enable quantification of cell types in spatial transcriptomic data", Danaher (2022). Designed for use with the NanoString GeoMx platform, but applicable to any gene expression data.
Last updated
immunooncologyfeatureextractiongeneexpressiontranscriptomicsspatial
7.68 score 42 stars 96 scripts 417 downloads
HiCExperiment - Bioconductor class for interacting with Hi-C files in R
R generic interface to Hi-C contact matrices in `.(m)cool`, `.hic` or HiC-Pro derived formats, as well as other Hi-C processed file formats. Contact matrices can be partially parsed using a random access method, allowing a memory-efficient representation of Hi-C data in R. The `HiCExperiment` class stores the Hi-C contacts parsed from local contact matrix files. `HiCExperiment` instances can be further investigated in R using the `HiContacts` analysis package.
Last updated
hicdna3dstructuredataimport
7.68 score 13 stars 3 dependents 54 scripts 454 downloads
LACE - Longitudinal Analysis of Cancer Evolution (LACE)
LACE is an algorithmic framework that processes single-cell somatic mutation profiles from cancer samples collected at different time points and in distinct experimental settings, to produce longitudinal models of cancer evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a weighted likelihood function computed on multiple time points.
Last updated
biomedicalinformaticssinglecellsomaticmutation
7.65 score 15 stars 3 scripts 389 downloads
scde - Single Cell Differential Expression
The scde package implements a set of statistical methods for analyzing single-cell RNA-seq data. scde fits individual error models for single-cell RNA-seq measurements. These models can then be used for assessment of differential expression between groups of cells, as well as other types of analysis. The scde package also contains the pagoda framework which applies pathway and gene set overdispersion analysis to identify and characterize putative cell subpopulations based on transcriptional signatures. The overall approach to the differential expression analysis is detailed in the following publication: "Bayesian approach to single-cell differential expression analysis" (Kharchenko PV, Silberstein L, Scadden DT, Nature Methods, doi: 10.1038/nmeth.2967). The overall approach to subpopulation identification and characterization is detailed in the following pre-print: "Characterizing transcriptional heterogeneity through pathway and gene set overdispersion analysis" (Fan J, Salathia N, Liu R, Kaeser G, Yung Y, Herman J, Kaper F, Fan JB, Zhang K, Chun J, and Kharchenko PV, Nature Methods, doi:10.1038/nmeth.3734).
Last updated
immunooncologyrnaseqstatisticalmethoddifferentialexpressionbayesiantranscriptionsoftwareanalysisbioinformaticsheterogenityngssingle-celltranscriptomicsopenblascppopenmp
7.64 score 180 stars 174 scripts 858 downloads
biocthis - Automate package and project setup for Bioconductor packages
This package expands the usethis package with the goal of helping automate the process of creating R packages for Bioconductor or making them Bioconductor-friendly.
Last updated
softwarereportwritingactionsbioconductorbiocthisgithubstylerusethis
7.64 score 56 stars 1 dependents 5 scripts 698 downloads
lipidr - Data Mining and Analysis of Lipidomics Datasets
lipidr is an easy-to-use R package implementing a complete workflow for downstream analysis of targeted and untargeted lipidomics data. Lipidomics results can be imported into lipidr as a numerical matrix or a Skyline export, allowing integration into current analysis frameworks. Data mining of lipidomics datasets is enabled through integration with the Metabolomics Workbench API. lipidr allows data inspection, normalization, and univariate and multivariate analysis, and displays informative visualizations. lipidr also implements a novel Lipid Set Enrichment Analysis (LSEA), harnessing molecular information such as lipid class, total chain length and unsaturation.
Last updated
lipidomicsmassspectrometrynormalizationqualitycontrolvisualizationbioconductor
7.62 score 33 stars 53 scripts 466 downloads
EpiCompare - Comparison, Benchmarking & QC of Epigenomic Datasets
EpiCompare is used to compare and analyse epigenetic datasets for quality control and benchmarking purposes. The package outputs an HTML report consisting of three sections: (1) General metrics: metrics on peaks (percentage of blacklisted and non-standard peaks, and peak widths) and fragments (duplication rate) of samples; (2) Peak overlap: percentage and statistical significance of overlapping and non-overlapping peaks, including an upset plot; and (3) Functional annotation: functional annotation (ChromHMM, ChIPseeker and enrichment analysis) of peaks, including peak enrichment around TSS.
Last updated
epigeneticsgeneticsqualitycontrolchipseqmultiplecomparisonfunctionalgenomicsatacseqdnaseseqbenchmarkbenchmarkingbioconductorbioconductor-packagecomparisonhtmlinteractive-reporting
7.62 score 17 stars 47 scripts 352 downloads
flowViz - Visualization for flow cytometry
Provides visualization tools for flow cytometry data.
Last updated
immunooncologyinfrastructureflowcytometrycellbasedassaysvisualization
7.61 score 13 dependents 286 scripts 1.8k downloads
HilbertCurve - Making 2D Hilbert Curve
A Hilbert curve is a type of space-filling curve that folds a one-dimensional axis into a two-dimensional space while still preserving locality. This package aims to provide an easy and flexible way to visualize data through Hilbert curves.
Last updated
softwarevisualizationsequencingcoveragegenomeannotationcpp
7.60 score 44 stars 65 scripts 490 downloads
proActiv - Estimate Promoter Activity from RNA-Seq data
Most human genes have multiple promoters that control the expression of different isoforms. The use of these alternative promoters enables the regulation of isoform expression pre-transcriptionally. Alternative promoters have been found to be important in a wide number of cell types and diseases. proActiv is an R package that enables the analysis of promoters from RNA-seq data. proActiv uses aligned reads as input, and generates counts and normalized promoter activity estimates for each annotated promoter. In particular, proActiv accepts junction files from TopHat2 or STAR or BAM files as inputs. These estimates can then be used to identify which promoter is active, which promoter is inactive, and which promoters change their activity across conditions. proActiv also allows visualization of promoter activity across conditions.
Last updated
rnaseqgeneexpressiontranscriptionalternativesplicinggeneregulationdifferentialsplicingfunctionalgenomicsepigeneticstranscriptomicspreprocessingalternative-promotersgenomicspromoter-activitypromoter-annotationrna-seq-data
7.54 score 59 stars 59 scripts 428 downloads
tanggle - Visualization of Phylogenetic Networks
Offers functions for plotting split (or implicit) networks (unrooted, undirected) and explicit networks (rooted, directed) with reticulations, extending 'ggtree' and using functions from 'ape' and 'phangorn'. It extends the 'ggtree' package [@Yu2017] to allow the visualization of phylogenetic networks using the 'ggplot2' syntax. It offers an alternative to the plot functions already available in 'ape' Paradis and Schliep (2019) <doi:10.1093/bioinformatics/bty633> and 'phangorn' Schliep (2011) <doi:10.1093/bioinformatics/btq706>.
Last updated
softwarevisualizationphylogeneticsalignmentclusteringmultiplesequencealignmentdataimportphylogenetic-networks
7.54 score 11 stars 90 scripts 329 downloads
AlpsNMR - Automated spectraL Processing System for NMR
Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.
Last updated
softwarepreprocessingvisualizationclassificationcheminformaticsmetabolomicsdataimport
7.54 score 16 stars 1 dependents 15 scripts 536 downloads
SimBu - Simulate Bulk RNA-seq Datasets from Single-Cell Datasets
SimBu can be used to simulate bulk RNA-seq datasets with known cell type fractions. You can either use your own single-cell study for the simulation or the sfaira database. Different pre-defined simulation scenarios exist, as do options to run custom simulations. Additionally, expression values can be adapted by adding an mRNA bias, which produces more biologically relevant simulations.
Last updated
softwarernaseqsinglecell
7.52 score 19 stars 1 dependents 36 scripts 316 downloads
ExploreModelMatrix - Graphical Exploration of Design Matrices
Given a sample data table and a design formula, ExploreModelMatrix generates an interactive application for exploration of the resulting design matrix. This can be helpful for interpreting model coefficients and constructing appropriate contrasts in (generalized) linear models. Static visualizations can also be generated.
Last updated
experimentaldesignregressiondifferentialexpressionshinyapps
7.51 score 38 stars 106 scripts 528 downloads
ropls - PCA, PLS(-DA) and OPLS(-DA) for multivariate analysis and feature selection of omics data
Latent variable modeling with Principal Component Analysis (PCA) and Partial Least Squares (PLS) are powerful methods for visualization, regression, classification, and feature selection of omics data where the number of variables exceeds the number of samples and with multicollinearity among variables. Orthogonal Partial Least Squares (OPLS) makes it possible to model separately the variation correlated with the factor of interest (predictive) and the uncorrelated (orthogonal) variation. While performing similarly to PLS, OPLS facilitates interpretation. Successful applications of these chemometrics techniques include spectroscopic data such as Raman spectroscopy, nuclear magnetic resonance (NMR), mass spectrometry (MS) in metabolomics and proteomics, but also transcriptomics data. In addition to scores, loadings and weights plots, the package provides metrics and graphics to determine the optimal number of components (e.g. with the R2 and Q2 coefficients), check the validity of the model by permutation testing, detect outliers, and perform feature selection (e.g. with Variable Importance in Projection or regression coefficients). The package can be accessed via a user interface on the Workflow4Metabolomics.org online resource for computational metabolomics (built upon the Galaxy environment).
Last updated
regressionclassificationprincipalcomponenttranscriptomicsproteomicsmetabolomicslipidomicsmassspectrometryimmunooncology
7.50 score 8 dependents 294 scripts 2.3k downloads
TBSignatureProfiler - Profile RNA-Seq Data Using TB Pathway Signatures
Gene signatures of TB progression, TB disease, and other TB disease states have been validated and published previously. This package aggregates known signatures and provides computational tools to apply them to other datasets. The TBSignatureProfiler makes it easy to profile RNA-Seq data using these signatures and includes common signature profiling tools including ASSIGN, GSVA, and ssGSEA. Original models for some gene signatures are also available. A Shiny app provides some functionality alongside detailed command-line access.
Last updated
geneexpressiondifferentialexpressionbioconductor-packagebiomarkersgene-signaturestuberculosis
7.49 score 13 stars 53 scripts 338 downloads
sevenbridges - Seven Bridges Platform API Client and Common Workflow Language Tool Builder in R
R client and utilities for Seven Bridges platform API, from Cancer Genomics Cloud to other Seven Bridges supported platforms.
Last updated
softwaredataimportthirdpartyclientapi-clientbioconductorbioinformaticscloudcommon-workflow-languagesevenbridges
7.48 score 37 stars 27 scripts 446 downloads
CHETAH - Fast and accurate scRNA-seq cell type identification
CHETAH (CHaracterization of cEll Types Aided by Hierarchical classification) is an accurate, selective and fast scRNA-seq classifier. Classification is guided by a reference dataset, preferably also a scRNA-seq dataset. By hierarchical clustering of the reference data, CHETAH creates a classification tree that enables a step-wise, top-to-bottom classification. Using a novel stopping rule, CHETAH classifies the input cells to the cell types of the references and to "intermediate types": more general classifications that ended in an intermediate node of the tree.
Last updated
classificationrnaseqsinglecellclusteringgeneexpressionimmunooncology
7.47 score 44 stars 75 scripts 422 downloads
GenomicDistributions - GenomicDistributions: fast analysis of genomic intervals with Bioconductor
If you have a set of genomic ranges, this package can help you with visualization and comparison. It produces several kinds of plots, for example: chromosome distribution plots, which visualize how your regions are distributed over chromosomes; feature distance distribution plots, which visualize how your regions are distributed relative to a feature of interest, like Transcription Start Sites (TSSs); genomic partition plots, which visualize how your regions overlap given genomic features such as promoters, introns, exons, or intergenic regions. It also makes it easy to compare one set of ranges to another.
Last updated
softwaregenomeannotationgenomeassemblydatarepresentationsequencingcoveragefunctionalgenomicsvisualization
7.45 score 27 stars 37 scripts 436 downloads
IHW - Independent Hypothesis Weighting
Independent hypothesis weighting (IHW) is a multiple testing procedure that increases power compared to the method of Benjamini and Hochberg by assigning data-driven weights to each hypothesis. The input to IHW is a two-column table of p-values and covariates. The covariate can be any continuous-valued or categorical variable that is thought to be informative on the statistical properties of each hypothesis test, while it is independent of the p-value under the null hypothesis.
Last updated
immunooncologymultiplecomparisonrnaseq
7.44 score 2 dependents 324 scripts 1.8k downloads
flowClust - Clustering for Flow Cytometry
Robust model-based clustering using a t-mixture model with Box-Cox transformation. Note: users should have GSL installed. Windows users: 'consult the README file available in the inst directory of the source distribution for necessary configuration instructions'.
Last updated
immunooncologyclusteringvisualizationflowcytometry
7.43 score 6 dependents 87 scripts 1.7k downloads
dupRadar - Assessment of duplication rates in RNA-Seq datasets
Duplication rate quality control for RNA-Seq datasets.
Last updated
technologysequencingrnaseqqualitycontrolimmunooncology
7.42 score 3 stars 73 scripts 481 downloads
methrix - Fast and efficient summarization of generic bedGraph files from Bisulfite sequencing
Bedgraph files generated by Bisulfite pipelines often come in various flavors. A critical downstream step requires summarization of these files into methylation/coverage matrices. Methrix performs this data aggregation step and provides many other useful downstream functions.
Last updated
dnamethylationsequencingcoveragebedgraphbioinformaticsdna-methylation
7.41 score 36 stars 59 scripts 446 downloads
genefu - Computation of Gene Expression-Based Signatures in Breast Cancer
This package contains functions implementing various tasks usually required by gene expression analysis, especially in breast cancer studies: gene mapping between different microarray platforms, identification of molecular subtypes, implementation of published gene signatures, gene selection, and survival analysis.
Last updated
differentialexpressiongeneexpressionvisualizationclusteringclassification
7.40 score 3 dependents 281 scripts 848 downloads
gcrma - Background Adjustment Using Sequence Information
Background adjustment using sequence information
Last updated
microarrayonechannelpreprocessing
7.39 score 11 dependents 182 scripts 2.1k downloads
cogena - co-expressed gene-set enrichment analysis
cogena is a workflow for co-expressed gene-set enrichment analysis. It aims to discover smaller-scale but highly correlated cellular events that may be of great biological relevance. A novel pipeline for drug discovery and drug repositioning based on the cogena workflow is proposed. In particular, candidate drugs can be predicted based on the gene expression of disease-related data, or other similar drugs can be identified based on the gene expression of drug-related data. Moreover, the drug mode of action can be disclosed by the associated pathway analysis. In summary, cogena is a flexible workflow for various gene set enrichment analyses of co-expressed genes, with a focus on pathway/GO analysis and drug repositioning.
Last updated
clusteringgenesetenrichmentgeneexpressionvisualizationpathwayskegggomicroarraysequencingsystemsbiologydatarepresentationdataimportbioconductorbioinformatics
7.35 score 13 stars 29 scripts 470 downloads
syntenet - Inference And Analysis Of Synteny Networks
syntenet can be used to infer synteny networks from whole-genome protein sequences and analyze them. Anchor pairs are detected with the MCScanX algorithm, which was ported to this package with the Rcpp framework for R and C++ integration. Anchor pairs from synteny analyses are treated as an undirected unweighted graph (i.e., a synteny network), and users can perform: i. network clustering; ii. phylogenomic profiling (by identifying which species contain which clusters); and iii. microsynteny-based phylogeny reconstruction with maximum likelihood.
Last updated
softwarenetworkinferencefunctionalgenomicscomparativegenomicsphylogeneticssystemsbiologygraphandnetworkwholegenomenetworkcomparative-genomicsevolutionary-genomicsnetwork-sciencephylogenomicssyntenysynteny-networkcpp
7.33 score 40 stars 1 dependents 30 scripts 426 downloads
dcanr - Differential co-expression/association network analysis
This package implements methods and an evaluation framework to infer differential co-expression/association networks. Various methods are implemented and can be evaluated using simulated datasets. Inference of differential co-expression networks can allow identification of networks that are altered between two conditions (e.g., health and disease).
Last updated
networkinferencegraphandnetworkdifferentialexpressionnetwork
7.33 score 7 stars 5 dependents 34 scripts 562 downloads
scAnnotatR - Pretrained learning models for cell type prediction on single cell RNA-sequencing data
The package comprises a set of pretrained machine learning models to predict basic immune cell types. This enables all users to quickly get a first annotation of the cell types present in their dataset without requiring prior knowledge. scAnnotatR also allows users to train their own models to predict new cell types based on specific research needs.
Last updated
singlecelltranscriptomicsgeneexpressionsupportvectormachineclassificationsoftware
7.31 score 22 stars 68 scripts 453 downloads
eisaR - Exon-Intron Split Analysis (EISA) in R
Exon-intron split analysis (EISA) uses ordinary RNA-seq data to measure changes in mature RNA and pre-mRNA reads across different experimental conditions to quantify transcriptional and post-transcriptional regulation of gene expression. For details see Gaidatzis et al., Nat Biotechnol 2015. doi: 10.1038/nbt.3269. eisaR implements the major steps of EISA in R.
Last updated
transcriptiongeneexpressiongeneregulationfunctionalgenomicstranscriptomicsregressionrnaseq
7.30 score 17 stars 73 scripts 498 downloads
Mfuzz - Soft clustering of omics time series data
The Mfuzz package implements noise-robust soft clustering of omics time-series data, including transcriptomic, proteomic or metabolomic data. It is based on the use of c-means clustering. For convenience, it includes a graphical user interface.
Last updated
microarrayclusteringtimecoursepreprocessingvisualization
7.28 score 3 dependents 478 scripts 2.2k downloads
MsBackendMgf - Mass Spectrometry Data Backend for Mascot Generic Format (mgf) Files
Mass spectrometry (MS) data backend supporting import and export of MS/MS spectra data from Mascot Generic Format (mgf) files. Objects defined in this package are supposed to be used with the Spectra Bioconductor package. This package thus adds mgf file support to the Spectra package.
Last updated
infrastructureproteomicsmassspectrometrymetabolomicsdataimport
7.27 score 5 stars 2 dependents 83 scripts 922 downloads
BioTIP - BioTIP: An R package for characterization of Biological Tipping-Point
Adopting tipping-point theory to transcriptome profiles to unravel disease regulatory trajectory.
Last updated
sequencingrnaseqgeneexpressiontranscriptionsoftware
7.25 score 24 stars 37 scripts 436 downloads
SPsimSeq - Semi-parametric simulation tool for bulk and single-cell RNA sequencing data
SPsimSeq uses a specially designed exponential family for density estimation to construct the distribution of gene expression levels from a given real RNA sequencing dataset (single-cell or bulk), and subsequently simulates a new dataset from the estimated marginal distributions using Gaussian copulas to retain the dependence between genes. It allows simulation of multiple groups and batches with any required sample size and library size.
Last updated
geneexpressionrnaseqsinglecellsequencingdnaseq
7.24 score 10 stars 1 dependents 36 scripts 326 downloads
TargetSearch - A package for the analysis of GC-MS metabolite profiling data
This package provides a flexible, fast and accurate method for targeted pre-processing of GC-MS data. The user provides an (often very large) set of GC chromatograms and a metabolite library of targets. The package will automatically search for those targets in the chromatograms, resulting in a data matrix that can be used for further data analysis.
Last updated
massspectrometrypreprocessingdecisiontreeimmunooncologybiocbioconductorgc-msmass-spectrometry
7.24 score 4 stars 12 scripts 444 downloads
regionReport - Generate HTML or PDF reports for a set of genomic regions or DESeq2/edgeR results
Generate HTML or PDF reports to explore a set of regions such as the results from annotation-agnostic expression analysis of RNA-seq data at base-pair resolution performed by derfinder. You can also create reports for DESeq2 or edgeR results.
Last updated
differentialexpressionsequencingrnaseqsoftwarevisualizationtranscriptioncoveragereportwritingdifferentialmethylationdifferentialpeakcallingimmunooncologyqualitycontrolbioconductorderfinderdeseq2edgerregionreportrmarkdown
7.23 score 9 stars 47 scripts 508 downloads
sechm - sechm: Complex Heatmaps from a SummarizedExperiment
sechm provides a simple interface between SummarizedExperiment objects and the ComplexHeatmap package. It enables plotting annotated heatmaps from SE objects, with easy access to rowData and colData columns, and implements a number of features to make the generation of heatmaps easier and more flexible. These functionalities used to be part of the SEtools package.
Last updated
geneexpressionvisualization
7.23 score 8 stars 2 dependents 175 scripts 477 downloads
CytoPipeline - Automation and visualization of flow cytometry data analysis pipelines
This package provides support for automation and visualization of flow cytometry data analysis pipelines. In the current state, the package focuses on the preprocessing and quality control part. The framework is based on two main S4 classes, i.e. CytoPipeline and CytoProcessingStep. The pipeline steps are linked to corresponding R functions that are either provided in the CytoPipeline package itself, exported from a third party package, or coded by the user her/himself. The processing steps need to be specified centrally and explicitly using either a json input file or through step by step creation of a CytoPipeline object with dedicated methods. After the pipeline has run, the results obtained at every step can be retrieved and visualized thanks to file caching (the running facility uses a BiocFileCache implementation). The package also provides specific visualization tools such as a pipeline workflow summary display and 1D/2D comparison plots of the flowFrames obtained at various steps of the pipeline.
Last updated
flowcytometrypreprocessingqualitycontrolworkflowstepimmunooncologysoftwarevisualization
7.22 score 7 stars 3 dependents 19 scripts 308 downloads
LEA - LEA: an R package for Landscape and Ecological Association Studies
LEA is an R package dedicated to population genomics, landscape genomics and genotype-environment association tests. LEA can run analyses of population structure and genome-wide tests for local adaptation, and also performs imputation of missing genotypes. The package includes statistical methods for estimating ancestry coefficients from large genotypic matrices and for evaluating the number of ancestral populations (snmf). It performs statistical tests using latent factor mixed models for identifying genetic polymorphisms that exhibit association with environmental gradients or phenotypic traits (lfmm2). In addition, LEA computes values of genetic offset statistics based on new or predicted environments (genetic.gap, genetic.offset). LEA is mainly based on optimized programs that can scale with the dimensions of large data sets.
Last updated
softwarestatistical methodclusteringregressionopenblas
7.22 score 3 dependents 732 scripts 1.3k downloads
animalcules - Interactive microbiome analysis toolkit
animalcules is an R package for utilizing up-to-date data analytics, visualization methods, and machine learning models to provide users an easy-to-use interactive microbiome analysis framework. It can be used as a standalone software package or users can explore their data with the accompanying interactive R Shiny application. Traditional microbiome analysis such as alpha/beta diversity and differential abundance analysis are enhanced, while new methods like biomarker identification are introduced by animalcules. Powerful interactive and dynamic figures generated by animalcules enable users to understand their data better and discover new insights.
Last updated
microbiomemetagenomicscoveragevisualization
7.22 score 56 stars 28 scripts 588 downloads
TADCompare - TADCompare: Identification and characterization of differential TADs
TADCompare is an R package designed to identify and characterize differential Topologically Associated Domains (TADs) between multiple Hi-C contact matrices. It contains functions for finding differential TADs between two datasets, finding differential TADs over time and identifying consensus TADs across multiple matrices. It takes all of the main types of HiC input and returns simple, comprehensive, easy to analyze results.
Last updated
softwarehicsequencingfeatureextractionclustering
7.21 score 27 stars 25 scripts 448 downloads
waddR - Statistical tests for detecting differential distributions based on the 2-Wasserstein distance
The package offers statistical tests based on the 2-Wasserstein distance for detecting and characterizing differences between two distributions given in the form of samples. Functions for calculating the 2-Wasserstein distance and testing for differential distributions are provided, as well as a specifically tailored test for differential expression in single-cell RNA sequencing data.
Last updated
softwarestatisticalmethodsinglecelldifferentialexpressioncpp
7.20 score 28 stars 28 scripts 247 downloads
ggspavis - Visualization functions for spatial transcriptomics data
Visualization functions for spatial transcriptomics data. Includes functions to generate several types of plots, including spot plots, feature (molecule) plots, reduced dimension plots, spot-level quality control (QC) plots, and feature-level QC plots, for datasets from the 10x Genomics Visium and other technological platforms. Datasets are assumed to be in either SpatialExperiment or SingleCellExperiment format.
Last updated
spatialsinglecelltranscriptomicsgeneexpressionqualitycontroldimensionreduction
7.19 score 5 stars 514 scripts 642 downloads
peakPantheR - Peak Picking and Annotation of High Resolution Experiments
An automated pipeline for the detection, integration and reporting of predefined features across a large number of mass spectrometry data files. It enables the real time annotation of multiple compounds in a single file, or the parallel annotation of multiple compounds in multiple files. A graphical user interface as well as command line functions will assist in assessing the quality of annotation and update fitting parameters until a satisfactory result is obtained.
Last updated
massspectrometrymetabolomicspeakdetectionfeature-detectionmass-spectrometry
7.16 score 13 stars 46 scripts 318 downloads
escheR - Unified multi-dimensional visualizations with Gestalt principles
The creation of effective visualizations is a fundamental component of data analysis. In biomedical research, new challenges are emerging to visualize multi-dimensional data in a 2D space, but current data visualization tools have limited capabilities. To address this problem, we leverage Gestalt principles to improve the design and interpretability of multi-dimensional data in 2D data visualizations, layering aesthetics to display multiple variables. The proposed visualization can be applied to spatially-resolved transcriptomics data, but also broadly to data visualized in 2D space, such as embedding visualizations. We provide this open source R package escheR, which is built off of the state-of-the-art ggplot2 visualization framework and can be seamlessly integrated into genomics toolboxes and workflows.
Last updated
spatialsinglecelltranscriptomicsvisualizationsoftwaremultidimensionalsingle-cellspatial-omics
7.14 score 8 stars 1 dependents 290 scripts 440 downloads
SharedObject - Sharing R objects across multiple R processes without memory duplication
This package is developed to facilitate parallel computing in R. It can create an R object in the shared memory space and share the data across multiple R processes. It avoids the overhead of memory duplication and data transfer, which makes sharing big data objects across many clusters possible.
Last updated
infrastructuresharedobjectcpp
7.13 score 51 stars 1 dependents 11 scripts 407 downloads
satuRn - Scalable Analysis of Differential Transcript Usage for Bulk and Single-Cell RNA-sequencing Applications
satuRn provides a highly performant and scalable framework for performing differential transcript usage analyses. The package consists of three main functions. The first function, fitDTU, fits quasi-binomial generalized linear models that model transcript usage in different groups of interest. The second function, testDTU, tests for differential usage of transcripts between groups of interest. Finally, plotDTU visualizes the usage profiles of transcripts in groups of interest.
Last updated
regressionexperimentaldesigndifferentialexpressiongeneexpressionrnaseqsequencingsoftwaresinglecelltranscriptomicsmultiplecomparisonvisualization
7.13 score 23 stars 1 dependents 97 scripts 802 downloads
systemPipeShiny - systemPipeShiny: An Interactive Framework for Workflow Management and Visualization
systemPipeShiny (SPS) extends the widely used systemPipeR (SPR) workflow environment with a versatile graphical user interface provided by a Shiny App. This allows non-R users, such as experimentalists, to run many of systemPipeR’s workflow design, control, and visualization functionalities interactively without requiring knowledge of R. Most importantly, SPS has been designed as a general purpose framework for interacting with other R packages in an intuitive manner. Like most Shiny Apps, SPS can be used on both local computers as well as centralized server-based deployments that can be accessed remotely as a public web service for using SPR’s functionalities with community and/or private data. The framework can integrate many core packages from the R/Bioconductor ecosystem. Examples of SPS’ current functionalities include: (a) interactive creation of experimental designs and metadata using an easy to use tabular editor or file uploader; (b) visualization of workflow topologies combined with auto-generation of R Markdown preview for interactively designed workflows; (d) access to a wide range of data processing routines; and (e) an extendable set of visualization functionalities. Complex visual results can be managed on a 'Canvas Workbench' allowing users to organize and to compare plots in an efficient manner combined with a session snapshot feature to continue work at a later time. The present suite of pre-configured visualization examples. The modular design of SPR makes it easy to design custom functions without any knowledge of Shiny, as well as extending the environment in the future with contributions from the community.
Last updated
shinyappsinfrastructuredataimportsequencingqualitycontrolreportwritingexperimentaldesignclusteringbioconductorbioconductor-packagedata-visualizationshinysystempiper
7.12 score 35 stars 42 scripts 360 downloads
cardelino - Clone Identification from Single Cell Data
Methods to infer clonal tree configuration for a population of cells using single-cell RNA-seq data (scRNA-seq), and possibly other data modalities. Methods are also provided to assign cells to inferred clones and explore differences in gene expression between clones. These methods can flexibly integrate information from imperfect clonal trees inferred based on bulk exome-seq data, and sparse variant alleles expressed in scRNA-seq data. A flexible beta-binomial error model that accounts for stochastic dropout events as well as systematic allelic imbalance is used.
Last updated
singlecellrnaseqvisualizationtranscriptomicsgeneexpressionsequencingsoftwareexomeseqclonal-clusteringgibbs-samplingscrna-seqsingle-cellsomatic-mutations
7.08 score 65 stars 62 scripts 406 downloads
mygene - Access MyGene.Info_ services
MyGene.Info_ provides simple-to-use REST web services to query/retrieve gene annotation data. It's designed with simplicity and performance emphasized. *mygene* is an easy-to-use R wrapper to access MyGene.Info_ services.
Last updated
annotation
7.06 score 1 dependents 386 scripts 733 downloads
Category - Category Analysis
A collection of tools for performing category (gene set enrichment) analysis.
Last updated
annotationgopathwaysgenesetenrichment
7.05 score 14 dependents 550 scripts 2.5k downloads
tricycle - tricycle: Transferable Representation and Inference of cell cycle
The package contains functions to infer and visualize the cell cycle process using single-cell RNA-seq data. It exploits the idea of transfer learning, projecting new data onto a previously learned, biologically interpretable space. We provide a pre-learned cell cycle space, which can be used to infer the cell cycle time of human and mouse single-cell samples. In addition, we offer functions to visualize cell cycle time on different embeddings and functions to build a new reference.
Last updated
singlecellsoftwaretranscriptomicsrnaseqtranscriptionbiologicalquestiondimensionreductionimmunooncology
6.96 score 30 stars 67 scripts 766 downloadsGOstats - Tools for manipulating GO and microarrays
A set of tools for interacting with GO and microarray data. A variety of basic manipulation tools for graphs, hypothesis testing and other simple calculations.
Last updated
annotationgomultiplecomparisongeneexpressionmicroarraypathwaysgenesetenrichmentgraphandnetwork
6.94 score 10 dependents 600 scripts 2.4k downloadsTileDBArray - Using TileDB as a DelayedArray Backend
Implements a DelayedArray backend for reading and writing dense or sparse arrays in the TileDB format. The resulting TileDBArrays are compatible with all Bioconductor pipelines that can accept DelayedArray instances.
Last updated
datarepresentationinfrastructuresoftware
6.93 score 11 stars 1 dependents 26 scripts 359 downloadsNanoMethViz - Visualise methylation data from Oxford Nanopore sequencing
NanoMethViz is a toolkit for visualising methylation data from Oxford Nanopore sequencing. It can be used to explore methylation patterns from reads derived from Oxford Nanopore direct DNA sequencing with methylation called by callers including nanopolish, f5c and megalodon. The plots in this package allow the visualisation of methylation profiles aggregated over experimental groups and across classes of genomic features.
Last updated
softwarelongreadvisualizationdifferentialmethylationdnamethylationepigeneticsdataimportzlibcpp
6.92 score 36 stars 22 scripts 446 downloadsDiffLogo - DiffLogo: A comparative visualisation of biooligomer motifs
DiffLogo is an easy-to-use tool to visualize motif differences.
Last updated
softwaresequencematchingmultiplecomparisonmotifannotationvisualizationalignment
6.91 score 8 stars 48 scripts 404 downloadsmegadepth - megadepth: BigWig and BAM related utilities
This package provides an R interface to Megadepth by Christopher Wilks available at https://github.com/ChristopherWilks/megadepth. It is particularly useful for computing the coverage of a set of genomic regions across bigWig or BAM files. With this package, you can build base-pair coverage matrices for regions or annotations of your choice from BigWig files. Megadepth was used to create the raw files provided by https://bioconductor.org/packages/recount3.
Last updated
softwarecoveragedataimporttranscriptomicsrnaseqpreprocessingbambigwigdasptermegadepthrecount2recount3
6.90 score 14 stars 3 dependents 14 scripts 504 downloadsspatialDE - R wrapper for SpatialDE
SpatialDE is a method to find spatially variable genes (SVG) from spatial transcriptomics data. This package provides wrappers to use the Python SpatialDE library in R, using reticulate and basilisk.
Last updated
softwaretranscriptomicspythonspatial-datawrapper
6.84 score 3 stars 193 scripts 380 downloadsCiteFuse - CiteFuse: multi-modal analysis of CITE-seq data
The CiteFuse package implements a suite of methods and tools for CITE-seq data from pre-processing to integrative analytics, including doublet detection, network-based modality integration, cell type clustering, differential RNA and protein expression analysis, ADT evaluation, ligand-receptor interaction analysis, and interactive web-based visualisation of the analyses.
Last updated
singlecellgeneexpressionbioinformaticssingle-cellcpp
6.83 score 28 stars 20 scripts 377 downloadscqn - Conditional quantile normalization
A normalization tool for RNA-Seq data, implementing the conditional quantile normalization method.
Last updated
immunooncologyrnaseqpreprocessingdifferentialexpression
6.79 score 4 dependents 258 scripts 601 downloadsiSEEu - iSEE Universe
iSEEu (the iSEE universe) contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels, or modes allowing easy configuration of iSEE applications.
Last updated
immunooncologyvisualizationguidimensionreductionfeatureextractionclusteringtranscriptiongeneexpressiontranscriptomicssinglecellcellbasedassayshacktoberfest
6.79 score 9 stars 1 dependents 38 scripts 445 downloadsNanoStringNCTools - NanoString nCounter Tools
Tools for NanoString Technologies nCounter Technology. Provides support for reading RCC files into an ExpressionSet derived object. Also includes methods for QC and normalization of NanoString data.
Last updated
geneexpressiontranscriptioncellbasedassaysdataimporttranscriptomicsproteomicsmrnamicroarrayproprietaryplatformsrnaseq
6.79 score 4 dependents 204 scripts 732 downloadsStatial - A package to identify changes in cell state relative to spatial associations
Statial is a suite of functions for identifying changes in cell state. Statial provides robust quantification of cell type localisation that is invariant to changes in tissue structure. In addition, Statial uncovers changes in marker expression associated with varying levels of localisation. These features can be used to explore how the structure and function of different cell types may be altered by the agents they are surrounded with.
Last updated
singlecellspatialclassificationsingle-cell
6.77 score 6 stars 31 scripts 402 downloadsMOSim - Multi-Omics Simulation (MOSim)
MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.
Last updated
softwaretimecourseexperimentaldesignrnaseqcpp
6.76 score 12 stars 10 scripts 377 downloadsaffycoretools - Functions useful for those doing repetitive analyses with Affymetrix GeneChips
Various wrapper functions that have been written to streamline the more common analyses that a core Biostatistician might see.
Last updated
reportwritingmicroarrayonechannelgeneexpression
6.75 score 1 dependents 125 scripts 930 downloadscellxgenedp - Discover and Access Single Cell Data Sets in the CELLxGENE Data Portal
The cellxgene data portal (https://cellxgene.cziscience.com/) provides a graphical user interface to collections of single-cell sequence data processed in standard ways to 'count matrix' summaries. The cellxgenedp package provides an alternative, R-based interface, allowing data discovery, viewing, and downloading.
Last updated
singlecelldataimportthirdpartyclient
6.70 score 9 stars 46 scripts 401 downloadsmethylCC - Estimate the cell composition of whole blood in DNA methylation samples
A tool to estimate the cell composition of DNA methylation whole blood samples measured on any platform technology (microarray and sequencing).
Last updated
microarraysequencingdnamethylationmethylationarraymethylseqwholegenome
6.70 score 18 stars 23 scripts 436 downloadsResidualMatrix - Creating a DelayedMatrix of Regression Residuals
Provides delayed computation of a matrix of residuals after fitting a linear model to each column of an input matrix. Also supports partial computation of residuals where selected factors are to be preserved in the output matrix. Implements a number of efficient methods for operating on the delayed matrix of residuals, most notably matrix multiplication and calculation of row/column sums or means.
Last updated
softwaredatarepresentationregressionbatcheffectexperimentaldesign
6.69 score 1 stars 11 dependents 11 scripts 6.8k downloadsscDataviz - scDataviz: single cell dataviz and downstream analyses
In the single cell world, which includes flow cytometry, mass cytometry, single-cell RNA-seq (scRNA-seq), and others, there is a need to improve data visualisation and to bring analysis capabilities to researchers even from non-technical backgrounds. scDataviz attempts to fit into this space, while also catering for advanced users. Additionally, because scDataviz is built on SingleCellExperiment, it has a 'plug and play' feel and is flexible and compatible with studies that go beyond scDataviz. Finally, the graphics in scDataviz are generated via the ggplot engine, which means that users can 'add on' features to these with ease.
Last updated
singlecellimmunooncologyrnaseqgeneexpressiontranscriptionflowcytometrymassspectrometrydataimport
6.68 score 67 stars 24 scripts 349 downloadsSWATH2stats - Transform and Filter SWATH Data for Statistical Packages
This package is intended to transform SWATH data from the OpenSWATH software into a format readable by other statistics packages while performing filtering, annotation and FDR estimation.
Last updated
proteomicsannotationexperimentaldesignpreprocessingmassspectrometryimmunooncology
6.67 score 1 stars 26 scripts 422 downloadsmnem - Mixture Nested Effects Models
Mixture Nested Effects Models (mnem) is an extension of Nested Effects Models and allows for the analysis of single cell perturbation data provided by methods like Perturb-Seq (Dixit et al., 2016) or Crop-Seq (Datlinger et al., 2017). In those experiments each of many cells is perturbed by a knock-down of a specific gene, i.e. several cells are perturbed by a knock-down of gene A, several by a knock-down of gene B, and so forth. The observed read-out has to be multi-trait and, in the case of Perturb-/Crop-Seq, consists of gene expression profiles for each cell. mnem uses a mixture model to simultaneously cluster the cell population into k clusters and infer k networks causally linking the perturbed genes for each cluster. The mixture components are inferred via an expectation maximization algorithm.
Last updated
pathwayssystemsbiologynetworkinferencenetworkrnaseqpooledscreenssinglecellcrispratacseqdnaseqgeneexpressioncpp
6.67 score 4 stars 3 dependents 43 scripts 439 downloadsdistinct - distinct: a method for differential analyses via hierarchical permutation tests
distinct is a statistical method to perform differential testing between two or more groups of distributions; differential testing is performed via hierarchical non-parametric permutation tests on the cumulative distribution functions (cdfs) of each sample. While most methods for differential expression target differences in the mean abundance between conditions, distinct, by comparing full cdfs, identifies both differential patterns involving changes in the mean and more subtle variations that do not involve the mean (e.g., unimodal vs. bi-modal distributions with the same mean). distinct is a general and flexible tool: due to its fully non-parametric nature, which makes no assumptions about how the data was generated, it can be applied to a variety of datasets. It is particularly suitable to perform differential state analyses on single cell data (i.e., differential analyses within sub-populations of cells), such as single cell RNA sequencing (scRNA-seq) and high-dimensional flow or mass cytometry (HDCyto) data. To use distinct, one needs data from two or more groups of samples (i.e., experimental conditions), with at least 2 samples (i.e., biological replicates) per group.
Last updated
geneticsrnaseqsequencingdifferentialexpressiongeneexpressionmultiplecomparisonsoftwaretranscriptionstatisticalmethodvisualizationsinglecellflowcytometrygenetargetopenblascpp
6.64 score 13 stars 1 dependents 37 scripts 490 downloadsmetaseqR2 - An R package for the analysis and result reporting of RNA-Seq data by combining multiple statistical algorithms
Provides an interface to several normalization and statistical testing packages for RNA-Seq gene expression data. Additionally, it creates several diagnostic plots, performs meta-analysis by combining the results of several statistical tests and reports the results in an interactive way.
Last updated
softwaregeneexpressiondifferentialexpressionworkflowsteppreprocessingqualitycontrolnormalizationreportwritingrnaseqtranscriptionsequencingtranscriptomicsbayesianclusteringcellbiologybiomedicalinformaticsfunctionalgenomicssystemsbiologyimmunooncologyalternativesplicingdifferentialsplicingmultiplecomparisontimecoursedataimportatacseqepigeneticsregressionproprietaryplatformsgenesetenrichmentbatcheffectchipseq
6.64 score 8 stars 18 scripts 437 downloadslumi - BeadArray Specific Methods for Illumina Methylation and Expression Microarrays
The lumi package provides an integrated solution for the Illumina microarray data analysis. It includes functions of Illumina BeadStudio (GenomeStudio) data input, quality control, BeadArray-specific variance stabilization, normalization and gene annotation at the probe level. It also includes the functions of processing Illumina methylation microarrays, especially Illumina Infinium methylation microarrays.
Last updated
microarrayonechannelpreprocessingdnamethylationqualitycontroltwochannel
6.63 score 10 dependents 350 scripts 2.1k downloadsmultiGSEA - Combining GSEA-based pathway enrichment with multi omics data integration
Extracted features from pathways derived from 8 different databases (KEGG, Reactome, Biocarta, etc.) can be used on transcriptomic, proteomic, and/or metabolomic level to calculate a combined GSEA-based enrichment score.
Last updated
genesetenrichmentpathwaysreactomebiocarta
6.63 score 21 stars 41 scripts 416 downloads
condiments - Differential Topology, Progression and Differentiation
This package encapsulates many functions to conduct a differential topology analysis. It focuses on analyzing an 'omic dataset with multiple conditions. While the package is mostly geared toward scRNASeq, it does not place any restriction on the actual input format.
Last updated
rnaseqsequencingsoftwaresinglecelltranscriptomicsmultiplecomparisonvisualization
6.63 score 32 stars 44 scripts 390 downloadsSpotClean - SpotClean adjusts for spot swapping in spatial transcriptomics data
SpotClean is a computational method to adjust for spot swapping in spatial transcriptomics data. Recent spatial transcriptomics experiments utilize slides containing thousands of spots with spot-specific barcodes that bind mRNA. Ideally, unique molecular identifiers at a spot measure spot-specific expression, but this is often not the case due to bleed from nearby spots, an artifact we refer to as spot swapping. SpotClean is able to estimate the contamination rate in observed data and decontaminate the spot swapping effect, thus increasing the sensitivity and precision of downstream analyses.
Last updated
dataimportrnaseqsequencinggeneexpressionspatialsinglecelltranscriptomicspreprocessingrna-seqspatial-transcriptomics
6.63 score 37 stars 38 scripts 413 downloadsENmix - Quality control and analysis tools for Illumina DNA methylation BeadChip
Tools for quality control, analysis and visualization of Illumina DNA methylation array data.
Last updated
dnamethylationpreprocessingqualitycontroltwochannelmicroarrayonechannelmethylationarraybatcheffectnormalizationdataimportregressionprincipalcomponentepigeneticsmultichanneldifferentialmethylationimmunooncology
6.62 score 1 dependents 175 scripts 946 downloadsBridgeDbR - Code for using BridgeDb identifier mapping framework from within R
Use BridgeDb functions and load identifier mapping databases in R. When used to download identifier mapping files, this package retrieves them from GitHub, Zenodo, and Figshare.
Last updated
softwareannotationmetabolomicscheminformaticsbioconductor-packagebridgedbgenesidentifierslife-sciencesmetabolitesproteinsopenjdk
6.61 score 4 stars 47 scripts 432 downloadsBumpyMatrix - Bumpy Matrix of Non-Scalar Objects
Implements the BumpyMatrix class and several subclasses for holding non-scalar objects in each entry of the matrix. This is akin to a ragged array but the raggedness is in the third dimension, much like a bumpy surface - hence the name. Of particular interest is the BumpyDataFrameMatrix, where each entry is a Bioconductor data frame. This allows us to naturally represent multivariate data in a format that is compatible with two-dimensional containers like the SummarizedExperiment and MultiAssayExperiment objects.
Last updated
softwareinfrastructuredatarepresentation
6.61 score 1 stars 15 dependents 44 scripts 1.0k downloadsMuData - Serialization for MultiAssayExperiment Objects
Save MultiAssayExperiments to h5mu files supported by muon and mudata. Muon is a Python framework for multimodal omics data analysis. It uses an HDF5-based format for data storage.
Last updated
dataimportanndatabioconductormudatamulti-omicsmultimodal-omicsscrna-seq
6.60 score 10 stars 33 scripts 376 downloadsscClassify - scClassify: single-cell Hierarchical Classification
scClassify is a multiscale classification framework for single-cell RNA-seq data based on ensemble learning and cell type hierarchies, enabling sample size estimation required for accurate cell type classification and joint classification of cells using multiple references.
Last updated
singlecellgeneexpressionclassification
6.59 score 28 stars 35 scripts 402 downloadscfDNAPro - cfDNAPro extracts and Visualises biological features from whole genome sequencing data of cell-free DNA
cfDNA fragments carry important features for building cancer sample classification ML models, such as fragment size and fragment end motif. Analyzing and visualizing fragment size metrics, as well as other biological features, in a curated, standardized, scalable, well-documented, and reproducible way can be time-intensive. This package intends to resolve these problems and simplify the process. It offers two sets of functions for cfDNA feature characterization and visualization.
Last updated
visualizationsequencingwholegenomebioinformaticscancer-genomicscancer-researchcell-free-dnaearly-detectiongenomics-visualizationliquid-biopsyswgswhole-genome-sequencing
6.59 score 43 stars 15 scripts 418 downloads
doubletrouble - Identification and classification of duplicated genes
doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD) and; v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.
Last updated
softwarewholegenomecomparativegenomicsfunctionalgenomicsphylogeneticsnetworkclassificationbioinformaticscomparative-genomicsgene-duplicationmolecular-evolutionwhole-genome-duplication
6.58 score 34 stars 25 scripts 414 downloadsSPIA - Signaling Pathway Impact Analysis (SPIA) using combined evidence of pathway over-representation and unusual signaling perturbations
This package implements the Signaling Pathway Impact Analysis (SPIA), which uses the information from a list of differentially expressed genes and their log fold changes, together with signaling pathway topology, to identify the pathways most relevant to the condition under study.
Last updated
microarraygraphandnetwork
6.58 score 4 dependents 139 scripts 1.1k downloadsstructToolbox - Data processing & analysis tools for Metabolomics and other omics
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. t-test, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). Ontology terms have been integrated to provide standardised definitions for the different methods, inputs and outputs.
Last updated
workflowstepmetabolomicsbioconductor-packagedimslc-msmachine-learningmultivariate-analysisstatisticsunivariate
6.57 score 11 stars 28 scripts 474 downloadsmitch - Multi-Contrast Gene Set Enrichment Analysis
mitch is an R package for multi-contrast enrichment analysis. At its heart, it uses a rank-MANOVA based statistical approach to detect sets of genes that exhibit enrichment in the multidimensional space as compared to the background. The rank-MANOVA concept dates to work by Cox and Mann (https://doi.org/10.1186/1471-2105-13-S16-S12). mitch is useful for pathway analysis of profiling studies with one, two or more contrasts, or in studies with multiple omics profiling, for example proteomic, transcriptomic, epigenomic analysis of the same samples. mitch is perfectly suited for pathway level differential analysis of scRNA-seq data. We have an established routine for pathway enrichment of Infinium Methylation Array data (see vignette). The main strengths of mitch are that it can import datasets easily from many upstream tools and has advanced plotting features to visualise these enrichments.
Last updated
geneexpressiongenesetenrichmentsinglecelltranscriptomicsepigeneticsproteomicsdifferentialexpressionreactomednamethylationmethylationarraydataimportgene-regulationgene-seq-analysispathway-analysis
6.56 score 17 stars 18 scripts 328 downloads
SingleCellSignalR - Cell Signalling Using Single-Cell RNA-seq or Proteomics Data
Inference of ligand-receptor (L-R) interactions from single-cell expression (transcriptomics/proteomics) data. SingleCellSignalR v2 inferences rely on the statistical model we introduced in the BulkSignalR package as well as the original SingleCellSignalR LR-score (both are available). SingleCellSignalR v2 can be regarded as a wrapper to BulkSignalR fundamental classes. This also enables v2 users to work with any species, whereas only Mus musculus & Homo sapiens were available before in SingleCellSignalR v1.
Last updated
networkrnaseqsoftwareproteomicstranscriptomicssinglecellnetworkinference
6.56 score 1 stars 49 scripts 571 downloadsscFeatures - scFeatures: Multi-view representations of single-cell and spatial data for disease outcome prediction
scFeatures is a tool that generates multi-view representations of single-cell and spatial data through the construction of a total of 17 feature types. These features can then be used for a variety of analyses using other software in Bioconductor.
Last updated
cellbasedassayssinglecellspatialsoftwaretranscriptomics
6.56 score 15 stars 24 scripts 310 downloadssubSeq - Subsampling of high-throughput sequencing count data
Subsampling of high throughput sequencing count data for use in experiment design and analysis.
Last updated
immunooncologysequencingtranscriptionrnaseqgeneexpressiondifferentialexpression
6.56 score 20 stars 30 scripts 406 downloadsmade4 - Multivariate analysis of microarray data using ADE4
Multivariate data analysis and graphical display of microarray data. Functions include supervised dimension reduction (between group analysis) and joint dimension reduction of 2 datasets (coinertia analysis). It contains functions that require the R package ade4.
Last updated
clusteringclassificationdimensionreductionprincipalcomponenttranscriptomicsmultiplecomparisongeneexpressionsequencingmicroarray
6.56 score 4 dependents 150 scripts 712 downloadsBSgenomeForge - Forge your own BSgenome data package
A set of tools to forge BSgenome data packages. Supersedes the old seed-based tools from the BSgenome software package. This package allows the user to create a BSgenome data package in one function call, simplifying the old seed-based process.
Last updated
infrastructuredatarepresentationgenomeassemblyannotationgenomeannotationsequencingalignmentdataimportsequencematchingbioconductor-packagecore-package
6.54 score 5 stars 20 scripts 486 downloadslpsymphony - Symphony integer linear programming solver in R
This package was derived from Rsymphony_0.1-17 from CRAN. These packages provide an R interface to SYMPHONY, an open-source linear programming solver written in C++. The main difference between this package and Rsymphony is that it includes the solver source code (SYMPHONY version 5.6), while Rsymphony expects to find header and library files on the user's system. Thus the intention of lpsymphony is to provide an easy-to-install interface to SYMPHONY. For Windows, precompiled DLLs are included in this package.
Last updated
infrastructurethirdpartyclientcoinor-symphony
6.54 score 3 dependents 28 scripts 2.0k downloads
cogeqc - Systematic quality checks on comparative genomics analyses
cogeqc aims to facilitate systematic quality checks on standard comparative genomics analyses to help researchers detect issues and select the most suitable parameters for each data set. cogeqc can be used to assess: i. genome assembly and annotation quality with BUSCOs and comparisons of statistics with publicly available genomes on the NCBI; ii. orthogroup inference using a protein domain-based approach and; iii. synteny detection using synteny network properties. There are also data visualization functions to explore QC summary statistics.
Last updated
softwaregenomeassemblycomparativegenomicsfunctionalgenomicsphylogeneticsqualitycontrolnetworkcomparative-genomicsevolutionary-genomics
6.54 score 12 stars 32 scripts 358 downloadsMungeSumstats - Standardise summary statistics from GWAS
The *MungeSumstats* package is designed to facilitate the standardisation of GWAS summary statistics. It reformats inputted summary statistics to include SNP, CHR, BP and can look up these values if any are missing. It also performs dozens of QC and filtering steps to ensure high data quality and minimise inter-study differences.
Last updated
snpwholegenomegeneticscomparativegenomicsgenomewideassociationgenomicvariationpreprocessing
6.53 score 3 stars 176 scripts 1.2k downloadszenith - Gene set analysis following differential expression using linear (mixed) modeling with dream
Zenith performs gene set analysis on the result of differential expression using linear (mixed) modeling with dream by considering the correlation between gene expression traits. This package implements the camera method from the limma package proposed by Wu and Smyth (2012). Zenith is a simple extension of camera to be compatible with linear mixed models implemented in variancePartition::dream().
Last updated
rnaseqgeneexpressiongenesetenrichmentdifferentialexpressionbatcheffectqualitycontrolregressionepigeneticsfunctionalgenomicstranscriptomicsnormalizationpreprocessingmicroarrayimmunooncologysoftware
6.52 score 1 dependents 185 scripts 461 downloadsChromSCape - Analysis of single-cell epigenomics datasets with a Shiny App
ChromSCape - Chromatin landscape profiling for Single Cells - is a ready-to-launch user-friendly Shiny Application for the analysis of single-cell epigenomics datasets (scChIP-seq, scATAC-seq, scCUT&Tag, ...) from aligned data to differential analysis & gene set enrichment analysis. It is highly interactive, enables users to save their analysis and covers a wide range of analytical steps: QC, preprocessing, filtering, batch correction, dimensionality reduction, visualisation, clustering, differential analysis and gene set analysis.
Last updated
shinyappssoftwaresinglecellchipseqatacseqmethylseqclassificationclusteringepigeneticsprincipalcomponentannotationbatcheffectmultiplecomparisonnormalizationpathwayspreprocessingqualitycontrolreportwritingvisualizationgenesetenrichmentdifferentialpeakcallingepigenomicsshinysingle-cellcpp
6.51 score 14 stars 22 scripts 324 downloadsrecountmethylation - Access and analyze public DNA methylation array data compilations
Resources for cross-study analyses of public DNAm array data from NCBI GEO repo, produced using Illumina's Infinium HumanMethylation450K (HM450K) and MethylationEPIC (EPIC) platforms. Provided functions enable download, summary, and filtering of large compilation files. Vignettes detail background about file formats, example analyses, and more. Note the disclaimer on package load and consult the main manuscripts for further info.
Last updated
dnamethylationepigeneticsmicroarraymethylationarrayexperimenthub
6.51 score 9 stars 17 scripts 379 downloadsMsDataHub - Mass Spectrometry Data on ExperimentHub
The MsDataHub package uses the ExperimentHub infrastructure to distribute raw mass spectrometry data files, peptide spectrum matches or quantitative data from proteomics and metabolomics experiments.
Last updated
experimenthubsoftwaremassspectrometryproteomicsmetabolomicsbioconductordatamass-spectrometry
6.50 score 1 stars 1 dependents 88 scripts 414 downloadsHiContacts - Analysing cool files in R with HiContacts
HiContacts provides a collection of tools to analyse and visualize Hi-C datasets imported in R by HiCExperiment.
Last updated
hicdna3dstructure
6.49 score 16 stars 64 scripts 436 downloadsGPA - GPA (Genetic analysis incorporating Pleiotropy and Annotation)
This package provides functions for fitting GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy information and annotation data. In addition, it also includes ShinyGPA, an interactive visualization toolkit to investigate pleiotropic architecture.
Last updated
softwarestatisticalmethodclassificationgenomewideassociationsnpgeneticsclusteringmultiplecomparisonpreprocessinggeneexpressiondifferentialexpressioncpp
6.48 score 16 stars 19 scripts 296 downloadsRNAmodR - Detection of post-transcriptional modifications in high throughput sequencing data
RNAmodR provides classes and workflows for loading and aggregating data from high-throughput sequencing aimed at detecting post-transcriptional modifications through analysis of specific patterns. In addition, utilities are provided to validate and visualize the results. The RNAmodR package provides core functionality from which specific analysis strategies can be easily implemented as a separate package.
Last updated
softwareinfrastructureworkflowstepvisualizationsequencingalkanilineseqbioconductormodificationsribomethseqrnarnamodr
6.46 score 3 stars 3 dependents 12 scripts 402 downloadsalabaster.schemas - Schemas for the Alabaster Framework
Stores all schemas required by various alabaster.* packages. No computation should be performed by this package, as that is handled by alabaster.base. We use a separate package instead of storing the schemas in alabaster.base itself, to avoid conflating management of the schemas with code maintenance.
Last updated
datarepresentationdataimport
6.46 score 18 dependents 2 scripts 5.4k downloadsalabaster.ranges - Load and Save Ranges-related Artifacts from File
Save GenomicRanges, IRanges and related data structures into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentation
6.46 score 11 dependents 8 scripts 5.8k downloadsRankProd - Rank Product method for identifying differentially expressed genes with application in meta-analysis
Non-parametric method for identifying differentially expressed (up- or down- regulated) genes based on the estimated percentage of false predictions (pfp). The method can combine data sets from different origins (meta-analysis) to increase the power of the identification.
Last updated
differentialexpressionstatisticalmethodsoftwareresearchfieldmetabolomicslipidomicsproteomicssystemsbiologygeneexpressionmicroarraygenesignaling
6.45 score 5 dependents 94 scripts 714 downloads
Prostar - Provides a GUI for DAPAR
This package provides a GUI for the DAPAR package. The package Prostar (Proteomics statistical analysis with R) is a Bioconductor distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Contrary to most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required.
Last updated
proteomicsmassspectrometrynormalizationpreprocessingsoftwareguiprostar1
6.45 score 1 stars 17 scripts 486 downloads
alabaster.matrix - Load and Save Artifacts from File
Save matrices, arrays and similar objects into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentationcpp
6.40 score 11 dependents 13 scripts 5.8k downloads
crlmm - Genotype Calling (CRLMM) and Copy Number Analysis tool for Affymetrix SNP 5.0 and 6.0 and Illumina arrays
Faster implementation of CRLMM specific to SNP 5.0 and 6.0 arrays, as well as a copy number tool specific to 5.0, 6.0, and Illumina platforms.
Last updated
microarraypreprocessingsnpcopynumbervariation
6.39 score 3 dependents 38 scripts 687 downloads
idpr - Profiling and Analyzing Intrinsically Disordered Proteins in R
‘idpr’ aims to integrate tools for the computational analysis of intrinsically disordered proteins (IDPs) within R. This package is used to identify known characteristics of IDPs for a sequence of interest with easily reported and dynamic results. Additionally, this package includes tools for IDP-based sequence analysis to be used in conjunction with other R packages. Described in McFadden WM & Yanowitz JL (2022). "idpr: A package for profiling and analyzing Intrinsically Disordered Proteins in R." PloS one, 17(4), e0266929. <https://doi.org/10.1371/journal.pone.0266929>.
Last updated
structuralpredictionproteomicscellbiology
6.39 score 5 stars 41 scripts 363 downloads
SBGNview - "SBGNview: Data Analysis, Integration and Visualization on SBGN Pathways"
SBGNview is a tool set for pathway based data visualization, integration and analysis. SBGNview is similar and complementary to the widely used Pathview, with the following key features: 1. Pathway definition by the widely adopted Systems Biology Graphical Notation (SBGN); 2. Supports multiple major pathway databases beyond KEGG (Reactome, MetaCyc, SMPDB, PANTHER, METACROP) and user defined pathways; 3. Covers 5,200 reference pathways and over 3,000 species by default; 4. Extensive graphics controls, including glyph and edge attributes, graph layout and sub-pathway highlight; 5. SBGN pathway data manipulation, processing, extraction and analysis.
Last updated
genetargetpathwaysgraphandnetworkvisualizationgenesetenrichmentdifferentialexpressiongeneexpressionmicroarrayrnaseqgeneticsmetabolomicsproteomicssystemsbiologysequencing
6.39 score 29 stars 28 scripts 526 downloads
alabaster.se - Load and Save SummarizedExperiments from File
Save SummarizedExperiments into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentation
6.35 score 10 dependents 13 scripts 5.7k downloads
DESpace - DESpace: a framework to discover spatially variable genes and differential spatial patterns across conditions
Intuitive framework for identifying spatially variable genes (SVGs) and differential spatial patterns (DSP) between conditions via edgeR, a popular method for performing differential expression analyses. Based on pre-annotated spatial clusters as summarized spatial information, DESpace models gene expression using a negative binomial (NB) model, via edgeR, with spatial clusters as covariates. SVGs are then identified by testing the significance of spatial clusters. For multi-sample, multi-condition datasets, we again fit a NB model via edgeR, incorporating spatial clusters, conditions and their interactions as covariates. DSP genes, which represent differences in spatial gene expression patterns across experimental conditions, are identified by testing the interaction between spatial clusters and conditions.
Last updated
spatialsinglecellrnaseqtranscriptomicsgeneexpressionsequencingdifferentialexpressionstatisticalmethodvisualization
6.34 score 7 stars 52 scripts 439 downloads
scPCA - Sparse Contrastive Principal Component Analysis
A toolbox for sparse contrastive principal component analysis (scPCA) of high-dimensional biological data. scPCA combines the stability and interpretability of sparse PCA with contrastive PCA's ability to disentangle biological signal from unwanted variation through the use of control data. Also implements and extends cPCA.
Last updated
principalcomponentgeneexpressiondifferentialexpressionsequencingmicroarrayrnaseqbioconductorcontrastive-learningdimensionality-reduction
6.33 score 12 stars 36 scripts 406 downloads
dearseq - Differential Expression Analysis for RNA-seq data through a robust variance component test
Differential expression analysis of RNA-seq data with a variance component score test that accounts for data heteroscedasticity through precision weights. Performs both gene-wise and gene set analyses, and can deal with repeated or longitudinal data. Methods are detailed in: i) Agniel D & Hejblum BP (2017) Variance component score test for time-course gene set analysis of longitudinal RNA-seq data, Biostatistics, 18(4):589-604; and ii) Gauthier M, Agniel D, Thiébaut R & Hejblum BP (2020) dearseq: a variance component score test for RNA-Seq differential analysis that effectively controls the false discovery rate, NAR Genomics and Bioinformatics, 2(4):lqaa093.
Last updated
biomedicalinformaticscellbiologydifferentialexpressiondnaseqgeneexpressiongeneticsgenesetenrichmentimmunooncologykeggregressionrnaseqsequencingsystemsbiologytimecoursetranscriptiontranscriptomics
6.33 score 8 stars 1 dependents 15 scripts 546 downloads
peco - A Supervised Approach for **P**r**e**dicting **c**ell Cycle Pr**o**gression using scRNA-seq data
Our approach provides a way to assign continuous cell cycle phase using scRNA-seq data and, consequently, allows identification of cyclic trends in gene expression levels along the cell cycle. This package provides the method and training data, which include scRNA-seq data collected from 6 individual cell lines of induced pluripotent stem cells (iPSCs), and also continuous cell cycle phase derived from FUCCI fluorescence imaging data.
Last updated
sequencingrnaseqgeneexpressiontranscriptomicssinglecellsoftwarestatisticalmethodclassificationvisualizationcell-cyclesingle-cell-rna-seq
6.33 score 14 stars 34 scripts 372 downloads
MoleculeExperiment - Prioritising a molecule-level storage of Spatial Transcriptomics Data
MoleculeExperiment contains functions to create and work with objects from the new MoleculeExperiment class. We introduce this class for analysing molecule-based spatial transcriptomics data (e.g., Xenium by 10X, Cosmx SMI by Nanostring, and Merscope by Vizgen). This allows researchers to analyse spatial transcriptomics data at the molecule level, and to have standardised data formats across vendors.
Last updated
dataimportdatarepresentationinfrastructuresoftwarespatialtranscriptomics
6.32 score 12 stars 44 scripts 305 downloads
scMET - Bayesian modelling of cell-to-cell DNA methylation heterogeneity
High-throughput single-cell measurements of DNA methylomes can quantify methylation heterogeneity and uncover its role in gene regulation. However, technical limitations and sparse coverage can preclude this task. scMET is a hierarchical Bayesian model which overcomes sparsity, sharing information across cells and genomic features to robustly quantify genuine biological heterogeneity. scMET can identify highly variable features that drive epigenetic heterogeneity, and perform differential methylation and variability analyses. We illustrate how scMET facilitates the characterization of epigenetically distinct cell populations and how it enables the formulation of novel hypotheses on the epigenetic regulation of gene expression.
Last updated
immunooncologydnamethylationdifferentialmethylationdifferentialexpressiongeneexpressiongeneregulationepigeneticsgeneticsclusteringfeatureextractionregressionbayesiansequencingcoveragesinglecellbayesian-inferencegeneralised-linear-modelsheterogeneityhierarchical-modelsmethylation-analysissingle-cellcpp
6.32 score 25 stars 42 scripts 318 downloads
DAPAR - Tools for the Differential Analysis of Proteins Abundance with R
The package DAPAR is a Bioconductor distributed R package which provides all the necessary functions to analyze quantitative data from label-free proteomics experiments. Contrary to most other similar R packages, it is endowed with rich and user-friendly graphical interfaces, so that no programming skill is required (see `Prostar` package).
Last updated
proteomicsnormalizationpreprocessingmassspectrometryqualitycontrolgodataimportprostar1
6.32 score 2 stars 1 dependents 21 scripts 512 downloads
scifer - Scifer: Single-Cell Immunoglobulin Filtering of Sanger Sequences
Have you ever index sorted cells in a 96 or 384-well plate and then sequenced using Sanger sequencing? If so, you probably had some struggles to either check the electropherogram of each cell sequenced manually, or when you tried to identify which cell was sorted where after sequencing the plate. Scifer was developed to solve this issue by performing basic quality control of Sanger sequences and merging flow cytometry data from probed single-cell sorted B cells with sequencing data. scifer can export summary tables, 'fasta' files, electropherograms for visual inspection, and generate reports.
Last updated
preprocessingqualitycontrolsangerseqsequencingsoftwareflowcytometrysinglecell
6.31 score 7 stars 28 scripts 324 downloads
PROcess - Ciphergen SELDI-TOF Processing
A package for processing protein mass spectrometry data.
Last updated
immunooncologymassspectrometryproteomics
6.31 score 1.0k scripts 480 downloads
crisprViz - Visualization Functions for CRISPR gRNAs
Provides functionalities to visualize and contextualize CRISPR guide RNAs (gRNAs) on genomic tracks across nucleases and applications. Works in conjunction with the crisprBase and crisprDesign Bioconductor packages. Plots are produced using the Gviz framework.
Last updated
crisprfunctionalgenomicsgenetargetbioconductorbioconductor-packagecrispr-analysiscrispr-designgrnagrna-sequencegrna-sequencessgrnasgrna-designvisualization
6.30 score 8 stars 2 dependents 14 scripts 401 downloads
mgsa - Model-based gene set analysis
Model-based Gene Set Analysis (MGSA) is a Bayesian modeling approach for gene set enrichment. The package mgsa implements MGSA and tools to use MGSA together with the Gene Ontology.
Last updated
pathwaysgogenesetenrichmentopenmp
6.30 score 5 stars 20 scripts 445 downloads
aroma.light - Light-Weight Methods for Normalization and Visualization of Microarray Data using Only Basic R Data Types
Methods for microarray analysis that take basic data types such as matrices and lists of vectors. These methods can be used standalone, be utilized in other packages, or be wrapped up in higher-level classes.
Last updated
infrastructuremicroarrayonechanneltwochannelmultichannelvisualizationpreprocessingbioconductor
6.27 score 1 stars 14 dependents 27 scripts 2.8k downloads
CGHcall - Calling aberrations for array CGH tumor profiles.
Calls aberrations for array CGH data using a six state mixture model as well as several biological concepts that are ignored by existing algorithms. Visualization of profiles is also provided.
Last updated
microarraypreprocessingvisualization
6.27 score 6 dependents 52 scripts 824 downloads
APL - Association Plots
APL is a package developed for computation of Association Plots (AP), a method for visualization and analysis of single cell transcriptomics data. The main focus of APL is the identification of genes characteristic for individual clusters of cells from input data. The package performs correspondence analysis (CA) and allows identification of cluster-specific genes using Association Plots. Additionally, APL computes cluster-specificity scores for all genes, which allows the genes to be ranked by their specificity for a selected cell cluster of interest.
Last updated
statisticalmethoddimensionreductionsinglecellsequencingrnaseqgeneexpression
6.26 score 17 stars 18 scripts 274 downloads
sparrow - Take command of set enrichment analyses through a unified interface
Provides a unified interface to a variety of GSEA techniques from different Bioconductor packages. Results are harmonized into a single object and can be interrogated uniformly for quick exploration and interpretation of results. Interactive exploration of GSEA results is enabled through a shiny app provided by a sparrow.shiny sibling package.
Last updated
genesetenrichmentpathwaysbioinformaticsgsea
6.24 score 23 stars 19 scripts 670 downloads
SEtools - SEtools: tools for working with SummarizedExperiment
This includes a set of convenience functions for working with the SummarizedExperiment class. Note that plotting functions historically in this package have been moved to the sechm package (see vignette for details).
Last updated
geneexpression
6.24 score 3 stars 97 scripts 366 downloads
SpliceWiz - interactive analysis and visualization of alternative splicing in R
The analysis and visualization of alternative splicing (AS) events from RNA sequencing data remains challenging. SpliceWiz is a user-friendly and performance-optimized R package for AS analysis. It processes alignment BAM files to quantify read counts across splice junctions, performs IRFinder-based intron retention quantitation, and supports novel splicing event identification. We introduce a novel visualization for AS using normalized coverage, thereby allowing visualization of differential AS across conditions. SpliceWiz features a shiny-based GUI facilitating interactive data exploration of results including gene ontology enrichment. It is performance optimized with multi-threaded processing of BAM files and a new COV file format for fast recall of sequencing coverage. Overall, SpliceWiz streamlines AS analysis, enabling reliable identification of functionally relevant AS events for further characterization.
Last updated
softwaretranscriptomicsrnaseqalternativesplicingcoveragedifferentialsplicingdifferentialexpressionguisequencingcppopenmp
6.24 score 24 stars 18 scripts 436 downloads
transformGamPoi - Variance Stabilizing Transformation for Gamma-Poisson Models
Variance-stabilizing transformations help with the analysis of heteroskedastic data (i.e., data where the variance is not constant, like count data). This package provides two types of variance-stabilizing transformations: (1) methods based on the delta method (e.g., 'acosh', 'log(x+1)'), (2) model residual based (Pearson and randomized quantile residuals).
Last updated
singlecellnormalizationpreprocessingregressioncpp
6.23 score 22 stars 39 scripts 310 downloads
cytoviewer - An interactive multi-channel image viewer for R
This R package supports interactive visualization of multi-channel images and segmentation masks generated by imaging mass cytometry and other highly multiplexed imaging techniques using shiny. The cytoviewer interface is divided into image-level (Composite and Channels) and cell-level visualization (Masks). It allows users to overlay individual images with segmentation masks, integrates well with SingleCellExperiment and SpatialExperiment objects for metadata visualization and supports image downloads.
Last updated
immunooncologysoftwaresinglecellonechanneltwochannelmultichannelspatialdataimportbioconductorimagingshinyvisualization
6.23 score 7 stars 54 scripts 360 downloads
simpleSeg - A package to perform simple cell segmentation
Image segmentation is the process of identifying the borders of individual objects (in this case cells) within an image. This allows for the features of cells such as marker expression and morphology to be extracted, stored and analysed. simpleSeg provides functionality for user friendly, watershed based segmentation on multiplexed cellular images in R based on the intensity of user specified protein marker channels. simpleSeg can also be used for the normalization of single cell data obtained from multiple images.
Last updated
classificationsurvivalsinglecellnormalizationspatialspatial-statistics
6.23 score 1 stars 2 dependents 28 scripts 502 downloads
MSstatsLiP - LiP Significance Analysis in shotgun mass spectrometry-based proteomic experiments
Tools for LiP peptide and protein significance analysis. Provides functions for summarization, estimation of LiP peptide abundance, and detection of changes across conditions. Utilizes functionality across the MSstats family of packages.
Last updated
immunooncologymassspectrometryproteomicssoftwaredifferentialexpressiononechanneltwochannelnormalizationqualitycontrolcpp
6.23 score 7 stars 5 scripts 369 downloads
ISAnalytics - Analyze gene therapy vector insertion sites data identified from genomics next generation sequencing reads for clonal tracking studies
In gene therapy, stem cells are modified using viral vectors to deliver the therapeutic transgene and replace functional properties since the genetic modification is stable and inherited in all cell progeny. The retrieval and mapping of the sequences flanking the virus-host DNA junctions allows the identification of insertion sites (IS), essential for monitoring the evolution of genetically modified cells in vivo. A comprehensive toolkit for the analysis of IS is required to foster clonal tracking studies and support the assessment of safety and long term efficacy in vivo. This package is aimed at (1) supporting automation of IS workflow, (2) performing basic and advanced analyses for IS tracking (clonal abundance, clonal expansions and statistics for insertional mutagenesis, etc.), (3) providing basic biology insights of transduced stem cells in vivo.
Last updated
biomedicalinformaticssequencingsinglecellcellbiologyfunctionalgenomicsdataimport
6.22 score 3 stars 22 scripts 331 downloads
SAIGEgds - Scalable Implementation of Generalized mixed models using GDS files in Phenome-Wide Association Studies
Scalable implementation of generalized mixed models with highly optimized C++ implementation and integration with Genomic Data Structure (GDS) files. It is designed for single variant tests and set-based aggregate tests in large-scale Phenome-wide Association Studies (PheWAS) with millions of variants and samples, controlling for sample structure and case-control imbalance. The implementation is based on the SAIGE R package (v0.45, Zhou et al. 2018 and Zhou et al. 2020), and it is extended to include the state-of-the-art ACAT-O set-based tests. Benchmarks show that SAIGEgds is significantly faster than the SAIGE R package. Optional OpenCL-based GPU acceleration is supported for the GRM cross-product computation in null model fitting and for GRM construction.
Last updated
softwaregeneticsstatisticalmethodgenomewideassociationgdsgwasmixed-modelphewasopenblascpp
6.21 score 7 stars 17 scripts 368 downloads
MsQuality - MsQuality - Quality metric calculation from Spectra, MsExperiment and Chromatograms objects
MsQuality provides functionality to calculate quality metrics for mass spectrometry-derived spectral data at the per-sample level. MsQuality relies on the mzQC framework of quality metrics defined by the Human Proteome Organization-Proteomics Standards Initiative (HUPO-PSI). These metrics quantify the quality of spectral raw files using a controlled vocabulary. The package is especially addressed towards users that acquire mass spectrometry data on a large scale (e.g. data sets from clinical settings consisting of several thousands of samples). The MsQuality package allows calculation of low-level quality metrics that require minimum information on mass spectrometry data: retention time, m/z values, and associated intensities. MsQuality relies on the Spectra package, or alternatively the MsExperiment package, and its infrastructure to store spectral data. Additionally, MsQuality supports Chromatograms objects from the Chromatograms package for chromatographic quality metrics.
Last updated
metabolomicsproteomicsmassspectrometryqualitycontrolmass-spectrometryqc
6.20 score 8 stars 6 scripts 344 downloads
derfinderHelper - derfinder helper package
Helper package for speeding up the derfinder package when using multiple cores. This package is particularly useful when using BiocParallel and it helps reduce the time spent loading the full derfinder package when running the F-statistics calculation in parallel.
Last updated
differentialexpressionsequencingrnaseqsoftwareimmunooncologybioconductorderfinder
6.20 score 7 dependents 910 downloads
gemini - GEMINI: Variational inference approach to infer genetic interactions from pairwise CRISPR screens
GEMINI uses log-fold changes to model sample-dependent and independent effects, and uses a variational Bayes approach to infer these effects. The inferred effects are used to score and identify genetic interactions, such as lethality and recovery. More details can be found in Zamanighomi et al. 2019 (in press).
Last updated
softwarecrisprbayesiandataimportcomputational-biologygenetic-interactions
6.20 score 16 stars 14 scripts 325 downloads
gscreend - Analysis of pooled genetic screens
Package for the analysis of pooled genetic screens (e.g. CRISPR-KO). The analysis of such screens is based on the comparison of gRNA abundances before and after a cell proliferation phase. The gscreend package takes gRNA counts as input and allows detection of genes whose knockout decreases or increases cell proliferation.
Last updated
softwarestatisticalmethodpooledscreenscrispr
6.19 score 12 stars 13 scripts 342 downloads
biscuiteer - Convenience Functions for Biscuit
A test harness for bsseq loading of Biscuit output, summarization of WGBS data over defined regions and in mappable samples, with or without imputation, dropping of mostly-NA rows, age estimates, etc.
Last updated
dataimportmethylseqdnamethylation
6.18 score 6 stars 17 scripts 465 downloads
HIPPO - Heterogeneity-Induced Pre-Processing tOol
For single-cell UMI data from scRNA-seq, it selects features and clusters the cells simultaneously. It has a novel feature selection method that uses zero inflation instead of gene variance, and it is computationally faster than other existing methods since it relies only on PCA+Kmeans rather than graph-clustering or consensus clustering.
Last updated
sequencingsinglecellgeneexpressiondifferentialexpressionclustering
6.18 score 19 stars 8 scripts 335 downloads
StructuralVariantAnnotation - Variant annotations for structural variants
StructuralVariantAnnotation provides a framework for analysis of structural variants within the Bioconductor ecosystem. This package contains useful helper functions for dealing with structural variants in VCF format. The package contains functions for parsing VCFs from a number of popular callers as well as functions for dealing with breakpoints involving two separate genomic loci encoded as GRanges objects.
Last updated
dataimportsequencingannotationgeneticsvariantannotation
6.18 score 2 dependents 126 scripts 636 downloads
TENxIO - Import methods for 10X Genomics files
Provides a structured S4 approach to importing data files from the 10X pipelines. It mainly supports Single Cell Multiome ATAC + Gene Expression data among other data types. The main Bioconductor data representations used are SingleCellExperiment and RaggedExperiment.
Last updated
softwareinfrastructuredataimportsinglecellbioconductor-packageu24ca289073
6.16 score 1 stars 5 dependents 16 scripts 408 downloads
diffHic - Differential Analysis of Hi-C Data
Detects differential interactions across biological conditions in a Hi-C experiment. Methods are provided for read alignment and data pre-processing into interaction counts. Statistical analysis is based on edgeR and supports normalization and filtering. Several visualization options are also available.
Last updated
multiplecomparisonpreprocessingsequencingcoveragealignmentnormalizationclusteringhiccurlbzip2xz-utilszlibcpp
6.16 score 1 dependents 48 scripts 539 downloads
SGSeq - Splice event prediction and quantification from RNA-seq data
SGSeq is a software package for analyzing splice events from RNA-seq data. Input data are RNA-seq reads mapped to a reference genome in BAM format. Genes are represented as a splice graph, which can be obtained from existing annotation or predicted from the mapped sequence reads. Splice events are identified from the graph and are quantified locally using structurally compatible reads at the start or end of each splice variant. The software includes functions for splice event prediction, quantification, visualization and interpretation.
Last updated
alternativesplicingimmunooncologyrnaseqtranscription
6.15 score 3 dependents 52 scripts 754 downloads
affycomp - Graphics Toolbox for Assessment of Affymetrix Expression Measures
The package contains functions that can be used to compare expression measures for Affymetrix Oligonucleotide Arrays.
Last updated
onechannelmicroarraypreprocessing
6.14 score 23 scripts 712 downloads
CBNplot - plot bayesian network inferred from gene expression data based on enrichment analysis results
This package provides visualization of Bayesian networks inferred from gene expression data. The networks are based on enrichment analysis results inferred from packages including clusterProfiler and ReactomePA. The networks between pathways and genes inside the pathways can be inferred and visualized.
Last updated
visualizationbayesiangeneexpressionnetworkinferencepathwaysreactomenetworknetworkenrichmentgenesetenrichment
6.14 score 69 stars 10 scripts 351 downloads
planet - Placental DNA methylation analysis tools
This package contains R functions to predict biological variables from placental DNA methylation data generated from Infinium arrays. This includes inferring ethnicity/ancestry, gestational age, and cell composition from placental DNA methylation array (450k/850k) data.
Last updated
softwaredifferentialmethylationepigeneticsmicroarraymethylationarraydnamethylationcpgislandancestrydna-methylation-datageneticsinferencemachine-learningplacenta
6.14 score 4 stars 1 dependents 38 scripts 534 downloads
minet - Mutual Information NETworks
This package implements various algorithms for inferring mutual information networks from data.
Last updated
microarraygraphandnetworknetworknetworkinferencecpp
6.13 score 15 dependents 128 scripts 1.2k downloads
GLAD - Gain and Loss Analysis of DNA
Analysis of array CGH data: detection of breakpoints in genomic profiles and assignment of a status (gain, normal or loss) to each chromosomal region identified.
Last updated
microarraycopynumbervariationgslcpp
6.13 score 2 dependents 112 scripts 734 downloads
lisaClust - lisaClust: Clustering of Local Indicators of Spatial Association
lisaClust provides a series of functions to identify and visualise regions of tissue where spatial associations between cell types are similar. This package can be used to provide a high-level summary of cell-type colocalization in multiplexed imaging data that has been segmented at a single-cell resolution.
Last updated
singlecellcellbasedassaysspatial
6.11 score 4 stars 46 scripts 448 downloads
FEAST - FEAture SelcTion (FEAST) for Single-cell clustering
Cell clustering is one of the most important and commonly performed tasks in single-cell RNA sequencing (scRNA-seq) data analysis. An important step in cell clustering is to select a subset of genes (referred to as “features”), whose expression patterns will then be used for downstream clustering. A good set of features should include the ones that distinguish different cell types, and the quality of such a set could have a significant impact on the clustering accuracy. FEAST is an R library for selecting the most representative features before performing the core of scRNA-seq clustering. It can be used as a plug-in for established clustering algorithms such as SC3, TSCAN, SHARP, SIMLR, and Seurat. The core of the FEAST algorithm includes three steps: 1. consensus clustering; 2. gene-level significance inference; 3. validation of an optimized feature set.
Last updated
sequencingsinglecellclusteringfeatureextraction
6.11 score 10 stars 64 scripts 476 downloads
hopach - Hierarchical Ordered Partitioning and Collapsing Hybrid (HOPACH)
The HOPACH clustering algorithm builds a hierarchical tree of clusters by recursively partitioning a data set, while ordering and possibly collapsing clusters at each level. The algorithm uses the Mean/Median Split Silhouette (MSS) criteria to identify the level of the tree with maximally homogeneous clusters. It also runs the tree down to produce a final ordered list of the elements. The non-parametric bootstrap allows one to estimate the probability that each element belongs to each cluster (fuzzy clustering).
Last updated
clustering
6.10 score 5 dependents 60 scripts 782 downloads
bioDist - Different distance measures
A collection of software tools for calculating distance measures.
Last updated
clusteringclassification
6.10 score 3 dependents 70 scripts 668 downloads
chipseq - chipseq: A package for analyzing chipseq data
Tools for helping process short read data for chipseq experiments.
Last updated
chipseqsequencingcoveragequalitycontroldataimport
6.08 score 2 dependents 92 scripts 1.2k downloads
MOGAMUN - MOGAMUN: A Multi-Objective Genetic Algorithm to Find Active Modules in Multiplex Biological Networks
MOGAMUN is a multi-objective genetic algorithm that identifies active modules in a multiplex biological network. This allows analyzing different biological networks at the same time. MOGAMUN is based on NSGA-II (Non-Dominated Sorting Genetic Algorithm, version II), which we adapted to work on networks.
Last updated
systemsbiologygraphandnetworkdifferentialexpressionbiomedicalinformaticstranscriptomicsclusteringnetwork
6.07 score 13 stars 5 scripts 298 downloads
ceRNAnetsim - Regulation Simulator of Interaction between miRNA and Competing RNAs (ceRNA)
This package simulates regulation of ceRNA (Competing Endogenous RNA) expression levels after an expression level change in one or more miRNAs/mRNAs. The methodology adopted by the package has the potential to incorporate any ceRNA (circRNA, lincRNA, etc.) into a miRNA:target interaction network. The package basically distributes miRNA expression over available ceRNAs, where each ceRNA attracts miRNAs in proportion to its amount. In addition, the package can utilize multiple parameters that modify the miRNA effect on its target (seed type, binding energy, binding location, etc.). The functions handle the given dataset as a graph object, and the processes progress via edge and node variables.
Last updated
networkinferencesystemsbiologynetworkgraphandnetworktranscriptomicscernamirnanetwork-biologynetwork-simulatortcgatidygraphtidyverse
6.07 score 5 stars 13 scripts 322 downloads
Rqc - Quality Control Tool for High-Throughput Sequencing Data
Rqc is an optimised tool designed for quality control and assessment of high-throughput sequencing data. It performs parallel processing of entire files and produces a report which contains a set of high-resolution graphics.
Last updated
sequencingqualitycontroldataimportcpp
6.06 score 77 scripts 709 downloads
SCOPE - A normalization and copy number estimation method for single-cell DNA sequencing
Whole genome single-cell DNA sequencing (scDNA-seq) enables characterization of copy number profiles at the cellular level. This circumvents the averaging effects associated with bulk-tissue sequencing and has increased resolution yet decreased ambiguity in deconvolving cancer subclones and elucidating cancer evolutionary history. ScDNA-seq data is, however, sparse, noisy, and highly variable even within a homogeneous cell population, due to the biases and artifacts that are introduced during the library preparation and sequencing procedure. Here, we propose SCOPE, a normalization and copy number estimation method for scDNA-seq data. The distinguishing features of SCOPE include: (i) utilization of cell-specific Gini coefficients for quality controls and for identification of normal/diploid cells, which are further used as negative control samples in a Poisson latent factor model for normalization; (ii) modeling of GC content bias using an expectation-maximization algorithm embedded in the Poisson generalized linear models, which accounts for the different copy number states along the genome; (iii) a cross-sample iterative segmentation procedure to identify breakpoints that are shared across cells from the same genetic background.
Last updated
singlecellnormalizationcopynumbervariationsequencingwholegenomecoveragealignmentqualitycontroldataimportdnaseq
6.06 score 114 scripts 369 downloads
TrajectoryUtils - Single-Cell Trajectory Analysis Utilities
Implements low-level utilities for single-cell trajectory analysis, primarily intended for re-use inside higher-level packages. Includes a function to create a cluster-level minimum spanning tree and data structures to hold pseudotime inference results.
Last updated
geneexpressionsinglecell
6.05 score 8 dependents 20 scripts 3.9k downloads
VplotR - Set of tools to make V-plots and compute footprint profiles
The pattern of digestion and protection from DNA nucleases such as DNase I, micrococcal nuclease, and Tn5 transposase can be used to infer the location of associated proteins. This package contains useful functions to analyze patterns of paired-end sequencing fragment density. VplotR facilitates the generation of V-plots and footprint profiles over single or aggregated genomic loci of interest.
Last updated
nucleosomepositioningcoveragesequencingbiologicalquestionatacseqalignment
6.05 score 11 stars 17 scripts 370 downloads
ArrayExpress - Access the ArrayExpress Collection at EMBL-EBI Biostudies and build Bioconductor data structures: ExpressionSet, AffyBatch, NChannelSet
Access the ArrayExpress Collection at EMBL-EBI Biostudies and build Bioconductor data structures: ExpressionSet, AffyBatch, NChannelSet.
Last updated
microarraydataimportonechanneltwochannel
6.04 score 1 dependents 183 scripts 884 downloads
concordexR - Identify Spatial Homogeneous Regions with concordex
Spatial homogeneous regions (SHRs) in tissues are domains that are homogenous with respect to cell type composition. We present a method for identifying SHRs using spatial transcriptomics data, and demonstrate that it is efficient and effective at finding SHRs for a wide variety of tissue types. concordex relies on analysis of k-nearest-neighbor (kNN) graphs. The tool is also useful for analysis of non-spatial transcriptomics data, and can elucidate the extent of concordance between partitions of cells derived from clustering algorithms, and transcriptomic similarity as represented in kNN graphs.
Last updated
singlecellclusteringspatialtranscriptomics
6.04 score 14 stars 13 scripts 294 downloads
CytoGLMM - Conditional Differential Analysis for Flow and Mass Cytometry Experiments
The CytoGLMM R package implements two multiple regression strategies: A bootstrapped generalized linear model (GLM) and a generalized linear mixed model (GLMM). Most current data analysis tools compare expressions across many computationally discovered cell types. CytoGLMM focuses on just one cell type. Our narrower field of application allows us to define a more specific statistical model with easier to control statistical guarantees. As a result, CytoGLMM finds differential proteins in flow and mass cytometry data while reducing biases arising from marker correlations and safeguarding against false discoveries induced by patient heterogeneity.
Last updated
flowcytometryproteomicssinglecellcellbasedassayscellbiologyimmunooncologyregressionstatisticalmethodsoftware
6.03 score 3 stars 1 dependents 2 scripts 322 downloads
oligoClasses - Classes for high-throughput arrays supported by oligo and crlmm
This package contains class definitions, validity checks, and initialization methods for classes used by the oligo and crlmm packages.
Last updated
infrastructure
6.03 score 18 dependents 100 scripts 3.3k downloads
groHMM - GRO-seq Analysis Pipeline
A pipeline for the analysis of GRO-seq data.
Last updated
sequencingsoftware
6.00 score 2 stars 28 scripts 492 downloads
DAMEfinder - Finds DAMEs - Differential Allelicly MEthylated regions
'DAMEfinder' offers functionality for taking methtuple or bismark outputs to calculate ASM scores and compute DAMEs. It also offers nice visualization of methyl-circle plots.
Last updated
dnamethylationdifferentialmethylationcoverage
6.00 score 10 stars 10 scripts 472 downloads
rnaseqcomp - Benchmarks for RNA-seq Quantification Pipelines
Several quantitative and visualized benchmarks for RNA-seq quantification pipelines. Two-condition quantifications for genes, transcripts, junctions or exons by each pipeline, with necessary meta information, should be organized into numeric matrices in order to proceed with the evaluation.
Last updated
rnaseqvisualizationqualitycontrol
5.98 score 8 stars 12 scripts 404 downloads
basilisk.utils - Centralized Conda Installation for Bioconductor Packages
Provides a centralized conda installation for use by other Bioconductor packages. If conda is not already available on the system, it is downloaded and installed from the Miniforge project; otherwise, no action is performed. Historically, this package was used to provide a Python installation for basilisk, hence the name.
Last updated
infrastructure
5.97 score 2 dependents 15 scripts 5.2k downloads
UMI4Cats - UMI4Cats: Processing, analysis and visualization of UMI-4C chromatin contact data
UMI-4C is a technique that allows characterization of 3D chromatin interactions with a bait of interest, taking advantage of a sonication step to produce unique molecular identifiers (UMIs) that help remove duplication bias, thus allowing a better differential comparison of chromatin interactions between conditions. This package allows processing of UMI-4C data, starting from FastQ files provided by the sequencing facility. It provides two statistical methods for detecting differential contacts and includes a visualization function to plot integrated information from a UMI-4C assay.
Last updated
qualitycontrolpreprocessingalignmentnormalizationvisualizationsequencingcoveragechromatinchromatin-interactiongenomicsumi4c
5.93 score 5 stars 17 scripts 368 downloads
cfTools - Informatics Tools for Cell-Free DNA Study
The cfTools R package provides methods for cell-free DNA (cfDNA) methylation data analysis to facilitate cfDNA-based studies. Given the methylation sequencing data of a cfDNA sample, for each cancer marker or tissue marker, we deconvolve the tumor-derived or tissue-specific reads from all reads falling in the marker region. Our read-based deconvolution algorithm exploits the pervasiveness of DNA methylation for signal enhancement and can therefore sensitively identify a trace amount of tumor-specific or tissue-specific cfDNA in plasma. cfTools provides functions for (1) cancer detection: sensitively detect tumor-derived cfDNA and estimate the tumor-derived cfDNA fraction (tumor burden); (2) tissue deconvolution: infer the tissue type composition and the cfDNA fraction of multiple tissue types for a plasma cfDNA sample. These functions can serve as foundations for more advanced cfDNA-based studies, including cancer diagnosis and disease monitoring.
Last updated
softwarebiomedicalinformaticsepigeneticssequencingmethylseqdnamethylationdifferentialmethylationcpp
5.92 score 11 stars 2 scripts 280 downloads
MsBackendMassbank - Mass Spectrometry Data Backend for MassBank record Files
Mass spectrometry (MS) data backend supporting import and export of MS/MS library spectra from MassBank record files. Different backends are available that allow handling of data in plain MassBank text file format or that interact directly with MassBank SQL databases. Objects from this package are supposed to be used with the Spectra Bioconductor package. This package thus adds MassBank support to the Spectra package.
Last updated
infrastructuremassspectrometrymetabolomicsdataimportmassbankspectra
5.91 score 3 stars 34 scripts 328 downloads
demuxmix - Demultiplexing oligo-barcoded scRNA-seq data using regression mixture models
A package for demultiplexing single-cell sequencing experiments of pooled cells labeled with barcode oligonucleotides. The package implements methods to fit regression mixture models for a probabilistic classification of cells, including multiplet detection. Demultiplexing error rates can be estimated, and methods for quality control are provided.
Last updated
singlecellsequencingpreprocessingclassificationregression
5.91 score 5 stars 1 dependents 27 scripts 438 downloads
GWENA - Pipeline for augmented co-expression analysis
The development of high-throughput sequencing led to increased use of co-expression analysis to go beyond single feature (i.e. gene) focus. We propose GWENA (Gene Whole co-Expression Network Analysis), a tool designed to perform gene co-expression network analysis and explore the results in a single pipeline. It includes functional enrichment of modules of co-expressed genes, phenotypic association, topological analysis and comparison of network configurations between conditions.
Last updated
softwaregeneexpressionnetworkclusteringgraphandnetworkgenesetenrichmentpathwaysvisualizationrnaseqtranscriptomicsmrnamicroarraymicroarraynetworkenrichmentsequencinggoco-expressionenrichment-analysisgenenetwork-analysispipeline
5.88 score 25 stars 15 scripts 435 downloads
globaltest - Testing Groups of Covariates/Features for Association with a Response Variable, with Applications to Gene Set Testing
The global test tests groups of covariates (or features) for association with a response variable. This package implements the test with diagnostic plots and multiple testing utilities, along with several functions to facilitate the use of this test for gene set testing of GO and KEGG terms.
Last updated
microarrayonechannelbioinformaticsdifferentialexpressiongopathways
5.87 score 7 dependents 78 scripts 2.3k downloads
speckle - Statistical methods for analysing single cell RNA-seq data
The speckle package contains functions for the analysis of single cell RNA-seq data. The speckle package currently contains functions to analyse differences in cell type proportions. There are also functions to estimate the parameters of the Beta distribution based on a given counts matrix, and a function to normalise a counts matrix to the median library size. There are plotting functions to visualise cell type proportions and the mean-variance relationship in cell type proportions and counts. As our research into specialised analyses of single cell data continues we anticipate that the package will be updated with new functions.
Last updated
singlecellrnaseqregressiongeneexpression
5.87 score 496 scripts 626 downloads
quantsmooth - Quantile smoothing and genomic visualization of array data
Implements quantile smoothing as introduced in: Quantile smoothing of array CGH data; Eilers PH, de Menezes RX; Bioinformatics. 2005 Apr 1;21(7):1146-53.
Last updated
visualizationcopynumbervariation
5.87 score 7 dependents 47 scripts 1.2k downloads
circRNAprofiler - circRNAprofiler: An R-Based Computational Framework for the Downstream Analysis of Circular RNAs
R-based computational framework for a comprehensive in silico analysis of circRNAs. This computational framework allows combining and analyzing circRNAs previously detected by multiple publicly available annotation-based circRNA detection tools. It covers different aspects of circRNA analysis, from differential expression analysis, evolutionary conservation and biogenesis to functional analysis.
Last updated
annotationstructuralpredictionfunctionalpredictiongenepredictiongenomeassemblydifferentialexpression
5.86 score 12 stars 6 scripts 508 downloads
BiocFHIR - Illustration of FHIR ingestion and transformation using R
FHIR R4 bundles in JSON format are derived from https://synthea.mitre.org/downloads. Transformation inspired by a kaggle notebook published by Dr Alexander Scarlat, https://www.kaggle.com/code/drscarlat/fhir-starter-parse-healthcare-bundles-into-tables. This is a very limited illustration of some basic parsing and reorganization processes. Additional tooling will be required to move beyond the Synthea data illustrations.
Last updated
infrastructuredataimportdatarepresentationfhir
5.86 score 4 stars 18 scripts 308 downloads
similaRpeak - Metrics to estimate a level of similarity between two ChIP-Seq profiles
This package calculates metrics which quantify the level of similarity between ChIP-Seq profiles. More specifically, the package implements six pseudometrics specialized in pattern similarity detection in ChIP-Seq profiles.
Last updated
biologicalquestionchipseqgeneticsmultiplecomparisondifferentialexpressionbioconductorbioconductor-packagechip-profileschip-seqmetrics
5.85 score 7 stars 17 scripts 424 downloads
Qtlizer - Comprehensive QTL annotation of GWAS results
This R package provides access to the Qtlizer web server. Qtlizer annotates lists of common small variants (mainly SNPs) and genes in humans with associated changes in gene expression using the most comprehensive database of published quantitative trait loci (QTLs).
Last updated
genomewideassociationsnpgeneticslinkagedisequilibriumeqtlgwasvariant-annotation
5.85 score 3 stars 13 scripts 359 downloads
GeomxTools - NanoString GeoMx Tools
Tools for NanoString Technologies GeoMx Technology. Package provides functions for reading in DCC and PKC files based on an ExpressionSet derived object. Normalization and QC functions are also included.
Last updated
geneexpressiontranscriptioncellbasedassaysdataimporttranscriptomicsproteomicsmrnamicroarrayproprietaryplatformsrnaseqsequencingexperimentaldesignnormalizationspatial
5.83 score 3 dependents 378 scripts 793 downloads
EmpiricalBrownsMethod - Uses Brown's method to combine p-values from dependent tests
Combining P-values from multiple statistical tests is common in bioinformatics. However, this procedure is non-trivial for dependent P-values. This package implements an empirical adaptation of Brown’s Method (an extension of Fisher’s Method) for combining dependent P-values which is appropriate for highly correlated data sets found in high-throughput biological experiments.
Last updated
statisticalmethodgeneexpressionpathways
5.83 score 25 stars 3 dependents 15 scripts 402 downloads
GeoTcgaData - Processing Various Types of Data on GEO and TCGA
Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) provide us with a wealth of data, such as RNA-seq, DNA methylation, SNP and copy number variation data. It's easy to download data from TCGA using the gdc tool, but processing these data into a format suitable for bioinformatics analysis requires more work. This R package was developed to handle these data.
Last updated
geneexpressiondifferentialexpressionrnaseqcopynumbervariationmicroarraysoftwarednamethylationdifferentialmethylationsnpatacseqmethylationarray
5.83 score 27 stars 25 scripts 343 downloads
netresponse - Functional Network Analysis
Algorithms for functional network analysis. Includes an implementation of a variational Dirichlet process Gaussian mixture model for nonparametric mixture modeling.
Last updated
cellbiologyclusteringgeneexpressiongeneticsnetworkgraphandnetworkdifferentialexpressionmicroarraynetworkinferencetranscription
5.81 score 3 stars 31 scripts 413 downloads
rcellminer - rcellminer: Molecular Profiles, Drug Response, and Chemical Structures for the NCI-60 Cell Lines
The NCI-60 cancer cell line panel has been used over the course of several decades as an anti-cancer drug screen. This panel was developed as part of the Developmental Therapeutics Program (DTP, http://dtp.nci.nih.gov/) of the U.S. National Cancer Institute (NCI). Thousands of compounds have been tested on the NCI-60, which have been extensively characterized by many platforms for gene and protein expression, copy number, mutation, and others (Reinhold, et al., 2012). The purpose of the CellMiner project (http://discover.nci.nih.gov/cellminer) has been to integrate data from multiple platforms used to analyze the NCI-60 and to provide a powerful suite of tools for exploration of NCI-60 data.
Last updated
acghcellbasedassayscopynumbervariationgeneexpressionpharmacogenomicspharmacogeneticsmirnacheminformaticsvisualizationsoftwaresystemsbiology
5.78 score 135 scripts 502 downloads
regutools - regutools: an R package for data extraction from RegulonDB
RegulonDB has collected, harmonized and centralized data from hundreds of experiments for nearly two decades and is considered a point of reference for transcriptional regulation in Escherichia coli K12. Here, we present the regutools R package to facilitate programmatic access to RegulonDB data in computational biology. regutools provides researchers with the possibility of writing reproducible workflows with automated queries to RegulonDB. The regutools package serves as a bridge between RegulonDB data and the Bioconductor ecosystem by reusing the data structures and statistical methods powered by other Bioconductor packages. We demonstrate the integration of regutools with Bioconductor by analyzing transcription factor DNA binding sites and transcriptional regulatory networks from RegulonDB. We anticipate that regutools will serve as a useful building block in our progress to further our understanding of gene regulatory networks.
Last updated
generegulationgeneexpressionsystemsbiologynetworknetworkinferencevisualizationtranscriptionbioconductorcdsbregulondb
5.78 score 5 stars 6 scripts 337 downloads
RPA - RPA: Robust Probabilistic Averaging for probe-level analysis
Probabilistic analysis of probe reliability and differential gene expression on short oligonucleotide arrays.
Last updated
geneexpressionmicroarraypreprocessingqualitycontrol
5.78 score 1 dependents 20 scripts 373 downloads
MEDIPS - DNA IP-seq data analysis
MEDIPS was developed for analyzing data derived from methylated DNA immunoprecipitation (MeDIP) experiments followed by sequencing (MeDIP-seq). However, MEDIPS provides functionalities for the analysis of any kind of quantitative sequencing data (e.g. ChIP-seq, MBD-seq, CMS-seq and others) including calculation of differential coverage between groups of samples and saturation and correlation analysis.
Last updated
dnamethylationcpgislanddifferentialexpressionsequencingchipseqpreprocessingqualitycontrolvisualizationmicroarraygeneticscoveragegenomeannotationcopynumbervariationsequencematching
5.77 score 1 dependents 99 scripts 526 downloads
lionessR - Modeling networks for individual samples using LIONESS
LIONESS, or Linear Interpolation to Obtain Network Estimates for Single Samples, can be used to reconstruct single-sample networks (https://arxiv.org/abs/1505.06440). This code implements the LIONESS equation in the lioness function in R to reconstruct single-sample networks. The default network reconstruction method we use is based on Pearson correlation. However, lionessR can run on any network reconstruction algorithm that returns a complete, weighted adjacency matrix. lionessR works for both unipartite and bipartite networks.
Last updated
networknetworkinferencegeneexpression
5.76 score 24 stars 24 scripts 410 downloads
timeOmics - Time-Course Multi-Omics data integration
timeOmics is a generic data-driven framework to integrate multi-Omics longitudinal data measured on the same biological samples and select key temporal features with strong associations within the same sample group. The main steps of timeOmics are: 1. Platform and time-specific normalization and filtering steps; 2. Modelling each biological feature into one time expression profile; 3. Clustering features with the same expression profile over time; 4. Post-hoc validation step.
Last updated
clusteringfeatureextractiontimecoursedimensionreductionsoftwaresequencingmicroarraymetabolomicsmetagenomicsproteomicsclassificationregressionimmunooncologygenepredictionmultiplecomparisonclusterintegrationmulti-omicstime-series
5.76 score 26 stars 11 scripts 384 downloads
scReClassify - scReClassify: post hoc cell type classification of single-cell RNA-seq data
A post hoc cell type classification tool to fine-tune cell type annotations generated by any cell type classification procedure, using the semi-supervised AdaSampling learning technique. The current version of scReClassify supports Support Vector Machine and Random Forest as base classifiers.
Last updated
softwaretranscriptomicssinglecellclassificationsupportvectormachine
5.75 score 11 stars 17 scripts 306 downloads
miQC - Flexible, probabilistic metrics for quality control of scRNA-seq data
Single-cell RNA-sequencing (scRNA-seq) has made it possible to profile gene expression in tissues at high resolution. An important preprocessing step prior to performing downstream analyses is to identify and remove cells with poor or degraded sample quality using quality control (QC) metrics. Two widely used QC metrics to identify a ‘low-quality’ cell are (i) if the cell includes a high proportion of reads that map to mitochondrial DNA encoded genes (mtDNA) and (ii) if a small number of genes are detected. miQC is a data-driven QC metric that jointly models both the proportion of reads mapping to mtDNA and the number of detected genes with mixture models in a probabilistic framework to predict the low-quality cells in a given dataset.
Last updated
singlecellqualitycontrolgeneexpressionpreprocessingsequencing
5.73 score 21 stars 129 scripts 455 downloads
MsBackendRawFileReader - Mass Spectrometry Backend for Reading Thermo Fisher Scientific raw Files
Implements an MsBackend for the Spectra package using Thermo Fisher Scientific's NewRawFileReader .Net libraries. The package generalizes the functionality introduced by the rawrr package. Methods defined in this package are supposed to extend the Spectra Bioconductor package.
Last updated
massspectrometryproteomicsmetabolomics
5.73 score 6 stars 12 scripts 340 downloads
ctc - Cluster and Tree Conversion.
Tools for exporting and importing classification trees and clusters to other programs.
Last updated
microarrayclusteringclassificationdataimportvisualization
5.72 score 2 dependents 88 scripts 696 downloads
ppcseq - Probabilistic Outlier Identification for RNA Sequencing Generalized Linear Models
Relative transcript abundance has proven to be a valuable tool for understanding the function of genes in biological systems. For the differential analysis of transcript abundance using RNA sequencing data, the negative binomial model is by far the most frequently adopted. However, common methods that are based on a negative binomial model are not robust to extreme outliers, which we found to be abundant in public datasets. So far, no rigorous and probabilistic methods for detection of outliers have been developed for RNA sequencing data, leaving the identification mostly to visual inspection. Recent advances in Bayesian computation allow large-scale comparison of observed data against its theoretical distribution given in a statistical model. Here we propose ppcseq, a key quality-control tool for identifying transcripts that include outlier data points in differential expression analysis, which do not follow a negative binomial distribution. Applying ppcseq to analyse several publicly available datasets using popular tools, we show that from 3 to 10 percent of differentially abundant transcripts across algorithms and datasets had statistics inflated by the presence of outliers.
Last updated
rnaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsbayesian-inferencedeseq2edgernegative-binomialoutlierstancpp
5.71 score 8 stars 16 scripts 339 downloads
autonomics - Unified Statistical Modeling of Omics Data
This package unifies access to Statistical Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). Across survival methods (coxph, survdiff, coin). It provides a fast enrichment analysis implementation.
Last updated
softwaredataimportpreprocessingdimensionreductionprincipalcomponentregressiondifferentialexpressiongenesetenrichmenttranscriptomicstranscriptiongeneexpressionrnaseqmicroarrayproteomicsmetabolomicsmassspectrometry
5.71 score 5 scripts 399 downloads
metabCombiner - Method for Combining LC-MS Metabolomics Feature Measurements
This package aligns LC-HRMS metabolomics datasets acquired from biologically similar specimens analyzed under similar, but not necessarily identical, conditions. Peak-picked and simply aligned metabolomics feature tables (consisting of m/z, rt, and per-sample abundance measurements, plus optional identifiers & adduct annotations) are accepted as input. The package outputs a combined table of feature pair alignments, organized into groups of similar m/z, and ranked by a similarity score. Input tables are assumed to be acquired using similar (but not necessarily identical) analytical methods.
Last updated
softwaremassspectrometrymetabolomicsmass-spectrometry
5.71 score 13 stars 13 scripts 422 downloads
PAST - Pathway Association Study Tool (PAST)
PAST takes GWAS output and assigns SNPs to genes, uses those genes to find pathways associated with the genes, and plots pathways based on significance. Implements methods for reading GWAS input data, finding genes associated with SNPs, calculating enrichment score and significance of pathways, and plotting pathways.
Last updated
pathwaysgenesetenrichment
5.70 score 5 stars 6 scripts 376 downloads
MsBackendSql - SQL-based Mass Spectrometry Data Backend
SQL-based mass spectrometry (MS) data backend also supporting storage and handling of very large data sets. Objects from this package are supposed to be used with the Spectra Bioconductor package. Through the MsBackendSql with its minimal memory footprint, this package thus provides an alternative MS data representation for very large or remote MS data sets.
Last updated
infrastructuremassspectrometrymetabolomicsdataimportproteomics
5.70 score 4 stars 31 scripts 318 downloads
TPP - Analyze thermal proteome profiling (TPP) experiments
Analyze thermal proteome profiling (TPP) experiments with varying temperatures (TR) or compound concentrations (CCR).
Last updated
immunooncologyproteomicsmassspectrometry
5.69 score 27 scripts 443 downloads
TDbasedUFE - Tensor Decomposition Based Unsupervised Feature Extraction
This is a comprehensive package to perform Tensor decomposition based unsupervised feature extraction. It can perform unsupervised feature extraction. It uses tensor decomposition. It is applicable to gene expression, DNA methylation, and histone modification etc. It can perform multiomics analysis. It is also potentially applicable to single cell omics data sets.
Last updated
geneexpressionfeatureextractionmethylationarraysinglecellbioinformaticsdna-methylationgene-expression-profileshistone-modificationsmultiomicstensor-decomposition
5.68 score 5 stars 1 dependents 16 scripts 307 downloads
ompBAM - C++ Library for OpenMP-based multi-threaded sequential profiling of Binary Alignment Map (BAM) files
This package provides C++ header files for developers wishing to create R packages that process BAM files. ompBAM automates file access, memory management, and handling of multiple threads 'behind the scenes', so developers can focus on creating domain-specific functionality. The included vignette contains detailed documentation of this API, including quick-start instructions to create a new ompBAM-based package, and a step-by-step explanation of the functionality behind the example package included within ompBAM.
Last updated
alignmentdataimportrnaseqsoftwaresequencingtranscriptomicssinglecell
5.68 score 4 stars 2 dependents 8 scripts 344 downloads
EBarrays - Unified Approach for Simultaneous Gene Clustering and Differential Expression Identification
EBarrays provides tools for the analysis of replicated/unreplicated microarray data.
Last updated
clusteringdifferentialexpression
5.67 score 6 dependents 13 scripts 658 downloads
CrispRVariants - Tools for counting and visualising mutations in a target location
CrispRVariants provides tools for analysing the results of a CRISPR-Cas9 mutagenesis sequencing experiment, or other sequencing experiments where variants within a given region are of interest. These tools allow users to localize variant allele combinations with respect to any genomic location (e.g. the Cas9 cut site), plot allele combinations and calculate mutation rates with flexible filtering of unrelated variants.
Last updated
immunooncologycrisprgenomicvariationvariantdetectiongeneticvariabilitydatarepresentationvisualizationsequencing
5.66 score 46 scripts 509 downloads
aCGH - Classes and functions for Array Comparative Genomic Hybridization data
Functions for reading aCGH data from image analysis output files and clone information files, creation of aCGH S3 objects for storing these data. Basic methods for accessing/replacing, subsetting, printing and plotting aCGH objects.
Last updated
copynumbervariationdataimportgeneticscpp
5.64 score 4 dependents 18 scripts 773 downloads
rgoslin - Lipid Shorthand Name Parsing and Normalization
The R implementation for the Grammar of Succinct Lipid Nomenclature parses different shorthand notation dialects for lipid names. It normalizes them to a standard name. It further provides calculated monoisotopic masses and sum formulas for each successfully parsed lipid name and supplements it with LIPID MAPS Category and Class information. Also, the structural level and further structural details about the head group, fatty acyls and functional groups are returned, where applicable.
Last updated
softwarelipidomicsmetabolomicspreprocessingnormalizationmassspectrometrycpp
5.64 score 6 stars 36 scripts 434 downloads
genomeIntervals - Operations on genomic intervals
This package defines classes for representing genomic intervals and provides functions and methods for working with these. Note: The package provides the basic infrastructure for and is enhanced by the package 'girafe'.
Last updated
dataimportinfrastructuregenetics
5.64 score 2 dependents 48 scripts 456 downloads
annaffy - Annotation tools for Affymetrix biological metadata
Functions for handling data from Bioconductor Affymetrix annotation data packages. Produces compact HTML and text reports including experimental data and URL links to many online databases. Allows searching biological metadata using various criteria.
Last updated
onechannelmicroarrayannotationgopathwaysreportwriting
5.64 score 3 dependents 60 scripts 752 downloadsAnVILWorkflow - Run workflows implemented in Terra/AnVIL workspace
The AnVIL is a cloud computing resource developed in part by the National Human Genome Research Institute. The main cloud-based genomics platform deployed by the AnVIL project is Terra. The AnVILWorkflow package allows remote access to Terra-implemented workflows, enabling end users to utilize Terra/AnVIL-provided resources, such as data, workflows, and flexible/scalable computing resources, through conventional R functions.
Last updated
infrastructuresoftwareanvilgcpterrau24hg010263workflows
5.62 score 7 stars 3 scripts 265 downloadsfedup - Fisher's Test for Enrichment and Depletion of User-Defined Pathways
An R package that tests for enrichment and depletion of user-defined pathways using a Fisher's exact test. The method is designed for versatile pathway annotation formats (e.g. gmt, txt, xlsx) to allow the user to run pathway analysis on custom annotations. This package is also integrated with Cytoscape to provide network-based pathway visualization that enhances the interpretability of the results.
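As a rough illustration of the statistic behind such a test, the following Python sketch computes one-sided Fisher's exact (hypergeometric) p-values for enrichment and depletion of a pathway in a gene list. It is a generic textbook formulation, not fedup's own code or API; the function names and argument layout are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, n_drawn, n_pathway, n_background):
    """P(X = k): k pathway genes among n_drawn genes sampled from the background."""
    return (
        comb(n_pathway, k)
        * comb(n_background - n_pathway, n_drawn - k)
        / comb(n_background, n_drawn)
    )


def fisher_enrichment_depletion(k, n_drawn, n_pathway, n_background):
    """Return (p_enrichment, p_depletion) for observing k pathway genes.

    Hypothetical helper: k pathway genes were found in a test set of n_drawn
    genes, the pathway has n_pathway genes, the background has n_background.
    """
    lowest = max(0, n_drawn - (n_background - n_pathway))  # smallest possible overlap
    highest = min(n_drawn, n_pathway)                      # largest possible overlap
    p_enrich = sum(
        hypergeom_pmf(i, n_drawn, n_pathway, n_background)
        for i in range(k, highest + 1)
    )
    p_deplete = sum(
        hypergeom_pmf(i, n_drawn, n_pathway, n_background)
        for i in range(lowest, k + 1)
    )
    return p_enrich, p_deplete


# Example: all 5 genes of a 5-gene test set fall in a 5-gene pathway (background of 20).
p_enrich, p_deplete = fisher_enrichment_depletion(5, 5, 5, 20)
```

A small enrichment p-value suggests the pathway is over-represented in the test set, and a small depletion p-value suggests it is under-represented; in practice these would be adjusted for multiple testing across pathways.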
Last updated
genesetenrichmentpathwaysnetworkenrichmentnetworkbioconductorenrichment
5.62 score 7 stars 20 scripts 338 downloadsDMRcaller - Differentially Methylated Regions Caller
Uses Bisulfite sequencing data in two conditions and identifies differentially methylated regions between the conditions in CG and non-CG context. The input is the CX report files produced by Bismark and the output is a list of DMRs stored as GRanges objects.
Last updated
differentialmethylationdnamethylationsoftwaresequencingcoverage
5.62 score 63 scripts 463 downloadsGenomicTuples - Representation and Manipulation of Genomic Tuples
GenomicTuples defines general purpose containers for storing genomic tuples. It aims to provide functionality for tuples of genomic co-ordinates that are analogous to those available for genomic ranges in the GenomicRanges Bioconductor package.
Last updated
infrastructuredatarepresentationsequencingcpp
5.57 score 5 stars 7 scripts 380 downloadsconvert - Convert Microarray Data Objects
Define coerce methods for microarray data objects.
Last updated
infrastructuremicroarraytwochannel
5.57 score 1 dependents 88 scripts 496 downloadstripr - T-cell Receptor/Immunoglobulin Profiler (TRIP)
TRIP is a software framework that provides analytics services on antigen receptor (B cell receptor immunoglobulin, BcR IG | T cell receptor, TR) gene sequence data. It is a web application written in R Shiny. It takes as input the output files of the IMGT/HighV-Quest tool. Users can select to analyze the data from each of the input samples separately, or the combined data files from all samples and visualize the results accordingly.
Last updated
batcheffectmultiplecomparisongeneexpressionimmunooncologytargetedresequencingbioconductorclonotype
5.56 score 3 stars 10 scripts 362 downloadsTAPseq - Targeted scRNA-seq primer design for TAP-seq
Design primers for targeted single-cell RNA-seq used by TAP-seq. Create sequence templates for target gene panels and design gene-specific primers using Primer3. Potential off-targets can be estimated with BLAST. Requires working installations of Primer3 and BLASTn.
Last updated
singlecellsequencingtechnologycrisprpooledscreens
5.56 score 4 stars 10 scripts 387 downloadsHiLDA - Conducting statistical inference on comparing the mutational exposures of mutational signatures by using hierarchical latent Dirichlet allocation
A package built under the Bayesian framework of applying hierarchical latent Dirichlet allocation. It statistically tests whether the mutational exposures of mutational signatures (Shiraishi-model signatures) are different between two groups. The package also provides inference and visualization.
Last updated
softwaresomaticmutationsequencingstatisticalmethodbayesianmutational-signaturesrjagssomatic-mutationscppjags
5.56 score 3 stars 1 dependents 7 scripts 370 downloadsELMER - Inferring Regulatory Element Landscapes and Transcription Factor Networks Using Cancer Methylomes
ELMER is designed to use DNA methylation and gene expression data from a large number of samples to infer regulatory element landscapes and transcription factor networks in primary tissue.
Last updated
dnamethylationgeneexpressionmotifannotationsoftwaregeneregulationtranscriptionnetwork
5.55 score 179 scripts 545 downloadsINTACT - Integrate TWAS and Colocalization Analysis for Gene Set Enrichment Analysis
This package integrates colocalization probabilities from colocalization analysis with transcriptome-wide association study (TWAS) scan summary statistics to implicate genes that may be biologically relevant to a complex trait. The probabilistic framework implemented in this package constrains the TWAS scan z-score-based likelihood using a gene-level colocalization probability. Given gene set annotations, this package can estimate gene set enrichment using posterior probabilities from the TWAS-colocalization integration step.
Last updated
bayesiangenesetenrichment
5.53 score 17 stars 20 scripts 277 downloadspreciseTAD - preciseTAD: A machine learning framework for precise TAD boundary prediction
preciseTAD provides functions to predict the location of boundaries of topologically associated domains (TADs) and chromatin loops at base-level resolution. As an input, it takes BED-formatted genomic coordinates of domain boundaries detected from low-resolution Hi-C data, and coordinates of high-resolution genomic annotations from ENCODE or other consortia. preciseTAD employs several feature engineering strategies and resampling techniques to address class imbalance, and trains an optimized random forest model for predicting low-resolution domain boundaries. Translated on a base-level, preciseTAD predicts the probability for each base to be a boundary. Density-based clustering and scalable partitioning techniques are used to detect precise boundary regions and summit points. Compared with low-resolution boundaries, preciseTAD boundaries are highly enriched for CTCF, RAD21, SMC3, and ZNF143 signal and more conserved across cell lines. The pre-trained model can accurately predict boundaries in another cell line using CTCF, RAD21, SMC3, and ZNF143 annotation data for this cell line.
Last updated
softwarehicsequencingclusteringclassificationfunctionalgenomicsfeatureextraction
5.48 score 8 stars 19 scripts 373 downloads
TREG - Tools for finding Total RNA Expression Genes in single nucleus RNA-seq data
RNA abundance and cell size parameters could improve RNA-seq deconvolution algorithms to more accurately estimate cell type proportions given the different cell type transcription activity levels. A Total RNA Expression Gene (TREG) can facilitate estimating total RNA content using single molecule fluorescent in situ hybridization (smFISH). We developed a data-driven approach using a measure of expression invariance to find candidate TREGs in postmortem human brain single nucleus RNA-seq. This R package implements the method for identifying candidate TREGs from snRNA-seq data.
Last updated
softwaresinglecellrnaseqgeneexpressiontranscriptomicstranscriptionsequencingbioconductordeconvolutionrnascopescrna-seqsmfishsnrna-seqtreg
5.48 score 5 stars 5 scripts 324 downloadsProteoDisco - Generation of customized protein variant databases from genomic variants, splice-junctions and manual sequences
ProteoDisco is an R package to facilitate proteogenomics studies. It houses functions to create customized (variant) protein databases based on user-submitted genomic variants, splice-junctions, fusion genes and manual transcript sequences. The flexible workflow can be adopted to suit a myriad of research and experimental settings.
Last updated
softwareproteomicsrnaseqsnpsequencingvariantannotationdataimport
5.48 score 5 stars 8 scripts 314 downloadsGSgalgoR - An Evolutionary Framework for the Identification and Study of Prognostic Gene Expression Signatures in Cancer
A multi-objective optimization algorithm for disease sub-type discovery based on a non-dominated sorting genetic algorithm. The 'Galgo' framework combines the advantages of clustering algorithms for grouping heterogeneous 'omics' data and the searching properties of genetic algorithms for feature selection. The algorithm searches for the optimal number of clusters, considering the features that maximize the survival difference between sub-types while keeping cluster consistency high.
Last updated
geneexpressiontranscriptionclusteringclassificationsurvival
5.48 score 15 stars 6 scripts 342 downloadsDEWSeq - Differential Expressed Windows Based on Negative Binomial Distribution
DEWSeq is a sliding window approach for the analysis of differentially enriched binding regions in eCLIP or iCLIP next-generation sequencing data.
Last updated
sequencinggeneregulationfunctionalgenomicsdifferentialexpressionbioinformaticseclipngs-analysis
5.48 score 5 stars 9 scripts 360 downloadsgoProfiles - goProfiles: an R package for the statistical analysis of functional profiles
The package implements methods to compare lists of genes based on comparing the corresponding 'functional profiles'.
Last updated
annotationgogeneexpressiongenesetenrichmentgraphandnetworkmicroarraymultiplecomparisonpathwayssoftware
5.48 score 1 dependents 6 scripts 456 downloadstilingArray - Transcript mapping with high-density oligonucleotide tiling arrays
The package provides functionality that can be useful for the analysis of high-density tiling microarray data (such as from Affymetrix genechips) for measuring transcript abundance and architecture. The main functionalities of the package are: 1. the class 'segmentation' for representing partitionings of a linear series of data; 2. the function 'segment' for fitting piecewise constant models using a dynamic programming algorithm that is both fast and exact; 3. the function 'confint' for calculating confidence intervals using the strucchange package; 4. the function 'plotAlongChrom' for generating pretty plots; 5. the function 'normalizeByReference' for probe-sequence dependent response adjustment from a (set of) reference hybridizations.
Last updated
microarrayonechannelpreprocessingvisualization
5.48 score 1 dependents 8 scripts 666 downloadsR4RNA - An R package for RNA visualization and analysis
A package for RNA basepair analysis, including the visualization of basepairs as arc diagrams for easy comparison and annotation of sequence and structure. Arc diagrams can additionally be projected onto multiple sequence alignments to assess basepair conservation and covariation, with numerical methods for computing statistics for each.
Last updated
alignmentmultiplesequencealignmentpreprocessingvisualizationdataimportdatarepresentationmultiplecomparison
5.46 score 3 dependents 27 scripts 1.2k downloadsRNAAgeCalc - A multi-tissue transcriptional age calculator
It has been shown that both DNA methylation and RNA transcription are linked to chronological age and age-related diseases. Several estimators have been developed to predict human aging at the DNA level and the RNA level. Most human transcriptional age predictors are based on microarray data and limited to only a few tissues. To date, transcriptional studies on aging using RNASeq data from different human tissues are limited. The aim of this package is to provide a tool for across-tissue and tissue-specific transcriptional age calculation based on GTEx RNASeq data.
Last updated
rnaseqgeneexpressionbiological-ageelastic-netgene-expressiongenotype-tissue-expressionpredictionregularized-regressionrna-seq
5.46 score 9 stars 16 scripts 418 downloadsCODEX - A Normalization and Copy Number Variation Detection Method for Whole Exome Sequencing
A normalization and copy number variation calling procedure for whole exome DNA sequencing data. CODEX relies on the availability of multiple samples processed using the same sequencing pipeline for normalization, and does not require matched controls. The normalization model in CODEX includes terms that specifically remove biases due to GC content, exon length and targeting and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based recursive segmentation procedure that explicitly models the count-based exome sequencing data.
Last updated
immunooncologyexomeseqnormalizationqualitycontrolcopynumbervariation
5.45 score 1 dependents 47 scripts 491 downloadsmetagene2 - A package to produce metagene plots
This package produces metagene plots to compare coverages of sequencing experiments at selected groups of genomic regions. It can be used for such analyses as assessing the binding of DNA-interacting proteins at promoter regions or surveying antisense transcription over the length of a gene. The metagene2 package can manage all aspects of the analysis, from normalization of coverages to plot faceting according to experimental metadata. Bootstrapping analysis is used to provide confidence intervals of per-sample mean coverages.
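To make the bootstrap idea concrete, here is a minimal Python sketch of a percentile bootstrap confidence interval for a mean coverage value at one position. It is a generic illustration under assumed inputs (a plain list of per-region coverages), not metagene2's implementation; the function name and its parameters are hypothetical.

```python
import random
import statistics


def bootstrap_mean_ci(values, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`.

    Hypothetical helper: resamples the coverages with replacement n_boot
    times and returns the (alpha/2, 1 - alpha/2) percentiles of the means.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Example: coverage at one position across the regions of a group.
coverages = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0, 17.0, 13.0]
ci_low, ci_high = bootstrap_mean_ci(coverages)
```

Repeating this at every position along the region gives the confidence band that is typically drawn around the mean coverage curve in a metagene plot.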
Last updated
chipseqgeneticsmultiplecomparisoncoveragealignmentsequencing
5.45 score 4 stars 9 scripts 416 downloadsflowcatchR - Tools to analyze in vivo microscopy imaging data focused on tracking flowing blood cells
flowcatchR is a set of tools to analyze in vivo microscopy imaging data, focused on tracking flowing blood cells. It guides the steps from segmentation to calculation of features, filtering out particles not of interest, and also provides a set of utilities to help check the quality of the performed operations (e.g. how good the segmentation was). It allows investigating the issue of tracking flowing cells, such as in blood vessels, to categorize the particles as flowing, rolling or adherent. This classification is applied in the study of phenomena such as hemostasis and thrombosis development. Moreover, flowcatchR presents an integrated workflow solution, based on the integration with a Shiny App and Jupyter notebooks, which is delivered alongside the package, and can enable fully reproducible bioimage analysis in the R environment.
Last updated
softwarevisualizationcellbiologyclassificationinfrastructureguishinyappsbioconductorfluorescencemicroscopyparticlestracking
5.45 score 4 stars 452 downloadsflowAI - Automatic and interactive quality control for flow cytometry data
The package is able to perform an automatic or interactive quality control on FCS data acquired using flow cytometry instruments. By evaluating three different properties: 1) flow rate, 2) signal acquisition, 3) dynamic range, the quality control enables the detection and removal of anomalies.
Last updated
flowcytometryqualitycontrolbiomedicalinformaticsimmunooncology
5.44 score 4 dependents 106 scripts 1.1k downloadsepistack - Heatmaps of Stack Profiles from Epigenetic Signals
The main objective of the epistack package is the visualization of stacks of genomic tracks (such as, but not restricted to, ChIP-seq, ATAC-seq, DNA methylation or genomic conservation data) centered at genomic regions of interest. epistack needs three different inputs: 1) a genomic score object, such as ChIP-seq coverage or DNA methylation values, provided as a `GRanges` (easily obtained from `bigwig` or `bam` files). 2) a list of features of interest, such as peaks or transcription start sites, provided as a `GRanges` (easily obtained from `gtf` or `bed` files). 3) a score to sort the features, such as peak height or gene expression value.
Last updated
rnaseqpreprocessingchipseqgeneexpressioncoveragebioinformatics
5.43 score 6 stars 10 scripts 355 downloads
periodicDNA - Set of tools to identify periodic occurrences of k-mers in DNA sequences
This R package helps the user identify k-mers (e.g. di- or tri-nucleotides) present periodically in a set of genomic loci (typically regulatory elements). The functions of this package provide a straightforward approach to find periodic occurrences of k-mers in DNA sequences, such as regulatory elements. It is not aimed at identifying motifs separated by a conserved distance; for this type of analysis, please visit the MEME website.
Last updated
sequencematchingmotifdiscoverymotifannotationsequencingcoveragealignmentdataimport
5.43 score 6 stars 5 scripts 376 downloadsconsensusSeekeR - Detection of consensus regions inside a group of experiences using genomic positions and genomic ranges
This package compares genomic positions and genomic ranges from multiple experiments to extract common regions. The size of the analyzed region is adjustable, as is the number of experiments in which a feature must be present in a potential region to tag this region as a consensus region. In genomic analysis where feature identification generates a position value surrounded by a genomic range, such as ChIP-Seq peaks and nucleosome positions, the replication of an experiment may result in slight differences between predicted values. This package enables the reconciliation of the results into consensus regions.
Last updated
biologicalquestionchipseqgeneticsmultiplecomparisontranscriptionpeakdetectionsequencingcoveragechip-seq-analysisgenomic-data-analysisnucleosome-positioning
5.43 score 1 stars 1 dependents 7 scripts 366 downloadsGuitar - Guitar
The package is designed for visualization of RNA-related genomic features with respect to the landmarks of RNA transcripts, i.e., transcription starting site, start codon, stop codon and transcription ending site.
Last updated
sequencingsplicedalignmentalignmentdataimportrnaseqmethylseqqualitycontroltranscription
5.43 score 54 scripts 627 downloadsrCGH - Comprehensive Pipeline for Analyzing and Visualizing Array-Based CGH Data
A comprehensive pipeline for analyzing and interactively visualizing genomic profiles generated through commercial or custom aCGH arrays. As inputs, rCGH supports Agilent dual-color Feature Extraction files (.txt), from 44 to 400K, Affymetrix SNP6.0 and cytoScanHD probeset.txt, cychp.txt, and cnchp.txt files exported from ChAS or Affymetrix Power Tools. rCGH also supports custom arrays, provided data complies with the expected format. This package takes over all the steps required for individual genomic profiles analysis, from reading files to profiles segmentation and gene annotations. This package also provides several visualization functions (static or interactive) which facilitate individual profiles interpretation. Input files can be in compressed format, e.g. .bz2 or .gz.
Last updated
acghcopynumbervariationpreprocessingfeatureextraction
5.43 score 5 stars 1 dependents 30 scripts 477 downloadsretrofit - RETROFIT: Reference-free deconvolution of cell mixtures in spatial transcriptomics
RETROFIT is a Bayesian non-negative matrix factorization framework to decompose cell type mixtures in ST data without using external single-cell expression references. RETROFIT outperforms existing reference-based methods in estimating cell type proportions and reconstructing gene expressions in simulations with varying spot size and sample heterogeneity, irrespective of the quality or availability of the single-cell reference. RETROFIT recapitulates known cell-type localization patterns in a Slide-seq dataset of mouse cerebellum without using any single-cell data.
Last updated
transcriptomicsvisualizationrnaseqbayesianspatialsoftwaregeneexpressiondimensionreductionfeatureextractionsinglecellcpp
5.42 score 3 stars 22 scripts 296 downloadscytoMEM - Marker Enrichment Modeling (MEM)
MEM, Marker Enrichment Modeling, automatically generates and displays quantitative labels for cell populations that have been identified from single-cell data. The input for MEM is a dataset that has pre-clustered or pre-gated populations with cells in rows and features in columns. Labels convey a list of measured features and the features' levels of relative enrichment on each population. MEM can be applied to a wide variety of data types and can compare between MEM labels from flow cytometry, mass cytometry, single cell RNA-seq, and spectral flow cytometry using RMSD.
Last updated
proteomicssystemsbiologyclassificationflowcytometrydatarepresentationdataimportcellbiologysinglecellclustering
5.42 score 4 stars 1 dependents 22 scripts 376 downloadsspecL - specL - Prepare Peptide Spectrum Matches for Use in Targeted Proteomics
Provides functions for generating spectra libraries that can be used for MRM/SRM MS workflows in proteomics. The package provides a BiblioSpec reader, a function which can add the protein information using a FASTA formatted amino acid file, and an export method for using the created library in the Spectronaut software. The package is developed, tested and used at the Functional Genomics Center Zurich <https://fgcz.ch>.
Last updated
massspectrometryproteomicsddadiamass-spectrometry
5.42 score 1 stars 11 scripts 406 downloadsscCB2 - CB2 improves power of cell detection in droplet-based single-cell RNA sequencing data
scCB2 is an R package implementing CB2 for distinguishing real cells from empty droplets in droplet-based single cell RNA-seq experiments (especially for 10x Chromium). It is based on clustering similar barcodes and calculating a Monte Carlo p-value for each cluster to test against the background distribution. This cluster-level test outperforms single-barcode-level tests in dealing with low count barcodes and homogeneous sequencing libraries, while keeping FDR well controlled.
Last updated
dataimportrnaseqsinglecellsequencinggeneexpressiontranscriptomicspreprocessingclustering
5.42 score 11 stars 12 scripts 296 downloadsTreeAndLeaf - Displaying binary trees with focus on dendrogram leaves
TreeAndLeaf implements a hybrid layout strategy that enhances leaf-level visualization in dendrograms. By integrating force-directed graph and tree layout algorithms, it enables projection of multiple layers of information onto graph–tree diagrams.
Last updated
graphandnetworknetworkvisualizationdatarepresentationsoftwaresystemsbiologybinary-treebinary-tree-visualizationdendrogramreorganizing-dendrograms
5.42 score 15 stars 29 scripts 407 downloadsRnaSeqSampleSize - RnaSeqSampleSize
The RnaSeqSampleSize package provides a sample size calculation method based on the negative binomial model and the exact test for assessing differential expression analysis of RNA-seq data. It controls FDR for multiple testing and utilizes the average read count and dispersion distributions from real data to estimate a more reliable sample size. It is also equipped with several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.
Last updated
immunooncologyexperimentaldesignsequencingrnaseqgeneexpressiondifferentialexpressioncpp
5.40 score 25 scripts 400 downloadsSiPSiC - Calculate Pathway Scores for Each Cell in scRNA-Seq Data
Infer biological pathway activity of cells from single-cell RNA-sequencing data by calculating a pathway score for each cell (pathway genes are specified by the user). It is recommended to have the data in Transcripts-Per-Million (TPM) or Counts-Per-Million (CPM) units for best results. Scores may change when adding cells to or removing cells from the data. SiPSiC stands for Single Pathway analysis in Single Cells.
Last updated
softwaredifferentialexpressiongenesetenrichmentbiomedicalinformaticscellbiologytranscriptomicsrnaseqsinglecelltranscriptionsequencingimmunooncologydataimport
5.38 score 8 stars 1 dependents 8 scripts 248 downloadsbandle - An R package for the Bayesian analysis of differential subcellular localisation experiments
The Bandle package enables the analysis and visualisation of differential localisation experiments using mass-spectrometry data. Experimental methods supported include dynamic LOPIT-DC, hyperLOPIT, Dynamic Organellar Maps, Dynamic PCP. It provides Bioconductor infrastructure to analyse these data.
Last updated
bayesianclassificationclusteringimmunooncologyqualitycontroldataimportproteomicsmassspectrometryopenblascppopenmp
5.38 score 4 stars 4 scripts 364 downloadsNewWave - Negative binomial model for scRNA-seq
A model designed for dimensionality reduction and batch effect removal for scRNA-seq data. It is designed to be massively parallelizable using shared objects that prevent memory duplication, and it can be used with different mini-batch approaches in order to reduce time consumption. It assumes a negative binomial distribution for the data with a dispersion parameter that can be either common across genes or gene-wise.
Last updated
softwaregeneexpressiontranscriptomicssinglecellbatcheffectsequencingcoverageregressionbatch-effectsdimensionality-reductionnegative-binomialscrna-seq
5.38 score 4 stars 30 scripts 344 downloadsDegNorm - DegNorm: degradation normalization for RNA-seq data
This package performs degradation normalization in bulk RNA-seq data to improve differential expression analysis accuracy. It provides estimates for each gene within each sample.
Last updated
rnaseqnormalizationgeneexpressionalignmentcoveragedifferentialexpressionbatcheffectsoftwaresequencingimmunooncologyqualitycontroldataimportopenblascppopenmp
5.38 score 2 stars 3 scripts 368 downloadsNPARC - Non-parametric analysis of response curves for thermal proteome profiling experiments
Perform non-parametric analysis of response curves as described by Childs, Bach, Franken et al. (2019): Non-parametric analysis of thermal proteome profiles reveals novel drug-binding proteins.
Last updated
softwareproteomics
5.38 score 40 scripts 366 downloadsIgGeneUsage - Differential gene usage in immune repertoires
Detection of biases in the usage of immunoglobulin (Ig) genes is an important task in immune repertoire profiling. IgGeneUsage detects aberrant Ig gene usage between biological conditions using a probabilistic model which is analyzed computationally by Bayesian inference. In this way, IgGeneUsage also avoids some common problems related to the current practice of null-hypothesis significance testing.
Last updated
differentialexpressionregressiongeneticsbayesianbiomedicalinformaticsimmunooncologymathematicalbiologyb-cell-receptorbcr-repertoiredifferential-analysisdifferential-gene-expressionhigh-throughput-sequencingimmune-repertoireimmune-repertoire-analysisimmune-repertoiresimmunogenomicsimmunoglobulinimmunoinformaticsimmunological-bioinformaticsimmunologytcr-repertoirevdj-recombinationcpp
5.38 score 6 stars 1 scripts 332 downloadsAIMS - AIMS : Absolute Assignment of Breast Cancer Intrinsic Molecular Subtype
This package contains the AIMS implementation. It contains the necessary functions to assign the five intrinsic molecular subtypes (Luminal A, Luminal B, Her2-enriched, Basal-like, Normal-like). Assignments can be made on individual samples as well as on datasets of gene expression data.
Last updated
immunooncologyclassificationrnaseqmicroarraysoftwaregeneexpression
5.38 score 4 dependents 9 scripts 800 downloadsalabaster.sce - Load and Save SingleCellExperiment from File
Save SingleCellExperiment into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentation
5.38 score 4 dependents 8 scripts 2.0k downloadsscBubbletree - Quantitative visual exploration of scRNA-seq data
scBubbletree is a quantitative method for the visual exploration of scRNA-seq data, preserving key biological properties such as local and global cell distances and cell density distributions across samples. It effectively resolves overplotting and enables the visualization of diverse cell attributes from multiomic single-cell experiments. Additionally, scBubbletree is user-friendly and integrates seamlessly with popular scRNA-seq analysis tools, facilitating comprehensive and intuitive data interpretation.
Last updated
visualizationclusteringsinglecelltranscriptomicsrnaseqbig-databigdatascrna-seqscrna-seq-analysisvisualvisual-exploration
5.38 score 7 stars 17 scripts 340 downloadsmakecdfenv - CDF Environment Maker
This package has two functions. One reads an Affymetrix chip description file (CDF) and creates a hash table environment containing the location/probe set membership mapping. The other creates a package that automatically loads that environment.
Last updated
onechanneldataimportpreprocessingzlib
5.36 score 1 dependents 38 scripts 672 downloadsVanillaICE - A Hidden Markov Model for high throughput genotyping arrays
Hidden Markov Models for characterizing chromosomal alteration in high throughput SNP arrays.
Last updated
copynumbervariation
5.36 score 1 dependents 63 scripts 562 downloadsepistasisGA - An R package to identify multi-snp effects in nuclear family studies using the GADGETS method
This package runs the GADGETS method to identify epistatic effects in nuclear family studies. It also provides functions for permutation-based inference and graphical visualization of the results.
Last updated
geneticssnpgeneticvariabilityopenblascpp
5.35 score 1 stars 9 scripts 293 downloadsCNTools - Convert segment data into a region by sample matrix to allow for other high level computational analyses.
This package provides tools to convert the output of segmentation analysis using DNAcopy to a matrix structure with overlapping segments as rows and samples as columns, so that other computational analyses can be applied to segmented data.
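The conversion described above can be sketched in Python as follows. This is only one plausible reading of "overlapping segments as rows" (splitting each chromosome at every breakpoint seen in any sample), not CNTools' actual algorithm or API; the function name, the tuple layout, and the half-open coordinate convention are all assumptions made for illustration.

```python
def segments_to_matrix(segments):
    """Turn per-sample segments into a region-by-sample matrix.

    Hypothetical input: tuples (sample, chrom, start, end, seg_mean) with
    half-open [start, end) coordinates. Regions are the intervals between
    consecutive breakpoints pooled over all samples on a chromosome.
    """
    samples = sorted({seg[0] for seg in segments})

    # Pool the breakpoints of all samples, per chromosome.
    breakpoints = {}
    for _, chrom, start, end, _ in segments:
        breakpoints.setdefault(chrom, set()).update((start, end))

    regions = []
    for chrom in sorted(breakpoints):
        points = sorted(breakpoints[chrom])
        regions.extend((chrom, a, b) for a, b in zip(points, points[1:]))

    # Fill each cell with the mean of the sample's segment covering the region.
    matrix = []
    for chrom, a, b in regions:
        row = []
        for sample in samples:
            value = None  # stays None if the sample has no segment here
            for s, c, start, end, seg_mean in segments:
                if s == sample and c == chrom and start <= a and b <= end:
                    value = seg_mean
                    break
            row.append(value)
        matrix.append(row)
    return regions, samples, matrix


example = [
    ("A", "chr1", 0, 100, 0.5),
    ("A", "chr1", 100, 200, -0.3),
    ("B", "chr1", 0, 150, 0.1),
]
regions, samples, matrix = segments_to_matrix(example)
```

The resulting rows are comparable across samples, which is what makes downstream steps such as clustering or filtering on the matrix possible.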
Last updated
microarraycopynumbervariation
5.35 score 1 dependents 37 scripts 502 downloadsGreyListChIP - Grey Lists -- Mask Artefact Regions Based on ChIP Inputs
Identify regions of ChIP experiments with high signal in the input, that lead to spurious peaks during peak calling. Remove reads aligning to these regions prior to peak calling, for cleaner ChIP analysis.
Last updated
chipseqalignmentpreprocessingdifferentialpeakcallingsequencinggenomeannotationcoverage
5.34 score 4 dependents 15 scripts 2.0k downloadssnapcount - R/Bioconductor Package for interfacing with Snaptron for rapid querying of expression counts
snapcount is a client interface to the Snaptron webservices which support querying by gene name or genomic region. Results include raw expression counts derived from alignment of RNA-seq samples and/or various summarized measures of expression across one or more regions/genes per-sample (e.g. percent spliced in).
Last updated
coveragegeneexpressionrnaseqsequencingsoftwaredataimport
5.33 score 3 stars 18 scripts 346 downloadsHELP - Tools for HELP data analysis
The package contains a modular pipeline for analysis of HELP microarray data, and includes graphical and mathematical tools with more general applications.
Last updated
cpgislanddnamethylationmicroarraytwochanneldataimportqualitycontrolpreprocessingvisualization
5.33 score 107 scripts 478 downloadsqsvaR - Generate Quality Surrogate Variable Analysis for Degradation Correction
The qsvaR package contains functions for removing the effect of degradation in RNA-seq data from postmortem brain tissue. The package is equipped to help users generate principal components associated with degradation. The components can be used in differential expression analysis to remove the effects of degradation.
Last updated
softwareworkflowstepnormalizationbiologicalquestiondifferentialexpressionsequencingcoveragebioconductorbraindegradationhumanqsva
5.32 score 5 scripts 327 downloadsPhIPData - Container for PhIP-Seq Experiments
PhIPData defines an S4 class for phage-immunoprecipitation sequencing (PhIP-seq) experiments. Building upon the RangedSummarizedExperiment class, PhIPData enables users to coordinate metadata with experimental data in analyses. Additionally, PhIPData provides specialized methods to subset and identify beads-only samples, subset objects using virus aliases, and use existing peptide libraries to populate object parameters.
Last updated
infrastructuredatarepresentationsequencingcoverage
5.32 score 7 stars 1 dependents 7 scripts 332 downloadsmsImpute - Imputation of label-free mass spectrometry peptides
MsImpute is a package for imputation of peptide intensity in proteomics experiments. It additionally contains tools for MAR/MNAR diagnosis and assessment of distortions to the probability distribution of the data post imputation. The missing values are imputed by low-rank approximation of the underlying data matrix if they are MAR (method = "v2"), by Barycenter approach if missingness is MNAR ("v2-mnar"), or by Peptide Identity Propagation (PIP).
Last updated
massspectrometryproteomicssoftwareimputation-algorithmlabel-free-proteomicslow-rank-approximation
5.32 score 15 stars 14 scripts 400 downloads
globalSeq - Global Test for Counts
The method may be conceptualised as a test of overall significance in regression analysis, where the response variable is overdispersed and the number of explanatory variables exceeds the sample size. Useful for testing for association between RNA-Seq and high-dimensional data.
Last updated
geneexpressionexonarraydifferentialexpressiongenomewideassociationtranscriptomicsdimensionreductionregressionsequencingwholegenomernaseqexomeseqmirnamultiplecomparison
5.32 score 1 stars 4 scripts 355 downloads
svaRetro - Retrotransposed transcript detection from structural variants
svaRetro contains functions for detecting retrotransposed transcripts (RTs) from structural variant calls. It takes structural variant calls in GRanges of breakend notation and identifies RTs by exon-exon junctions and insertion sites. The candidate RTs are reported by events and annotated with information of the inserted transcripts.
Last updated
dataimportsequencingannotationgeneticsvariantannotationcoveragevariantdetection
5.32 score 13 scripts 323 downloads
maSigPro - Significant Gene Expression Profile Differences in Time Course Gene Expression Data
maSigPro is a regression based approach to find genes for which there are significant gene expression profile differences between experimental groups in time course microarray and RNA-Seq experiments.
Last updated
microarrayrna-seqdifferential expressiontimecourse
5.32 score 104 scripts 783 downloads
muscle - Multiple Sequence Alignment with MUSCLE
MUSCLE performs multiple sequence alignments of nucleotide or amino acid sequences.
Last updated
multiplesequencealignmentalignmentsequencinggeneticssequencematchingdataimportcpp
5.31 score 102 scripts 756 downloads
SplicingFactory - Splicing Diversity Analysis for Transcriptome Data
The SplicingFactory R package uses transcript-level expression values to analyze splicing diversity based on various statistical measures, like Shannon entropy or the Gini index. These measures can quantify transcript isoform diversity within samples or between conditions. Additionally, the package analyzes the isoform diversity data, looking for significant changes between conditions.
Last updated
transcriptomicsrnaseqdifferentialsplicingalternativesplicingtranscriptomevariantgini-indexrna-seqshannon-entropysimpson-indexsplicing
5.30 score 4 stars 4 scripts 286 downloads
MEAT - Muscle Epigenetic Age Test
This package estimates epigenetic age in skeletal muscle, using DNA methylation data generated with the Illumina Infinium technology (HM27, HM450 and HMEPIC).
Last updated
epigeneticsdnamethylationmicroarraynormalizationbiomedicalinformaticsmethylationarraypreprocessing
5.30 score 1 stars 4 scripts 358 downloads
scGPS - A complete analysis of single cell subpopulations, from identifying subpopulations to analysing their relationship (scGPS = single cell Global Predictions of Subpopulation)
The package implements two main algorithms to answer two key questions: a SCORE (Stable Clustering at Optimal REsolution) to find subpopulations, followed by scGPS to investigate the relationships between subpopulations.
Last updated
singlecellclusteringdataimportsequencingcoverageopenblascpp
5.30 score 5 stars 9 scripts 386 downloads
PhosR - A set of methods and tools for comprehensive analysis of phosphoproteomics data
PhosR is a package for the comprehensive analysis of phosphoproteomic data. There are two major components to PhosR: processing and downstream analysis. PhosR consists of various processing tools for phosphoproteomics data including filtering, imputation, normalisation, and functional analysis for inferring active kinases and signalling pathways.
Last updated
softwareresearchfieldproteomics
5.30 score 99 scripts 460 downloads
spaSim - Spatial point data simulator for tissue images
A suite of functions for simulating spatial patterns of cells in tissue images. Output images are multitype point data in SingleCellExperiment format. Each point represents a cell, with its 2D locations and cell type. Potential cell patterns include background cells, tumour/immune cell clusters, immune rings, and blood/lymphatic vessels.
Last updated
statisticalmethodspatialbiomedicalinformatics
5.27 score 2 stars 31 scripts 319 downloads
iSEEhub - iSEE for the Bioconductor ExperimentHub
This package defines a custom landing page for an iSEE app interfacing with the Bioconductor ExperimentHub. The landing page allows users to browse the ExperimentHub, select a data set, download and cache it, and import it directly into a Bioconductor iSEE app.
Last updated
dataimportimmunooncology infrastructureshinyappssinglecellsoftwarebioconductorbioconductor-packagehacktoberfestisee
5.26 score 3 stars 4 scripts 338 downloads
HPiP - Host-Pathogen Interaction Prediction
HPiP (Host-Pathogen Interaction Prediction) uses an ensemble learning algorithm for prediction of host-pathogen protein-protein interactions (HP-PPIs) using structural and physicochemical descriptors computed from the amino acid composition of host and pathogen proteins. The proposed package can effectively address data shortages and data unavailability for HP-PPI network reconstructions. Moreover, establishing computational frameworks in that regard will reveal mechanistic insights into infectious diseases and suggest potential HP-PPI targets, thus narrowing down the range of possible candidates for subsequent wet-lab experimental validations.
Last updated
proteomicssystemsbiologynetworkinferencestructuralpredictiongenepredictionnetwork
5.26 score 3 stars 6 scripts 334 downloads
snifter - R wrapper for the python openTSNE library
Provides an R wrapper for the implementation of FI-tSNE from the python package openTSNE. See Poličar et al. (2019) <doi:10.1101/731877> and the algorithm described by Linderman et al. (2018) <doi:10.1038/s41592-018-0308-4>.
Last updated
dimensionreductionvisualizationsoftwaresinglecellsequencing
5.26 score 3 stars 5 scripts 631 downloads
EnMCB - Predicting Disease Progression Based on Methylation Correlated Blocks using Ensemble Models
Creation of the correlated blocks using DNA methylation profiles. Machine learning models can be constructed to predict differentially methylated blocks and disease progression.
Last updated
normalizationdnamethylationmethylationarraysupportvectormachine
5.26 score 9 stars 4 scripts 358 downloads
CNVfilteR - Identifies false positives of CNV calling tools by using SNV calls
CNVfilteR identifies those CNVs that can be discarded by using the single nucleotide variant (SNV) calls that are usually obtained in common NGS pipelines.
Last updated
copynumbervariationsequencingdnaseqvisualizationdataimport
5.26 score 6 stars 2 scripts 288 downloads
FamAgg - Pedigree Analysis and Familial Aggregation
Framework providing basic pedigree analysis and plotting utilities as well as a variety of methods to evaluate familial aggregation of traits in large pedigrees.
Last updated
geneticspedigree
5.26 score 1 stars 5 scripts 397 downloads
ACME - Algorithms for Calculating Microarray Enrichment (ACME)
ACME (Algorithms for Calculating Microarray Enrichment) is a set of tools for analysing tiling array ChIP/chip, DNAse hypersensitivity, or other experiments that result in regions of the genome showing "enrichment". It does not rely on a specific array technology (although the array should be a "tiling" array), is very general (can be applied in experiments resulting in regions of enrichment), and is very insensitive to array noise or normalization methods. It is also very fast and can be applied on whole-genome tiling array experiments quite easily with enough memory.
Last updated
technologymicroarraynormalization
5.26 score 5 scripts 736 downloads
MOMA - Multi Omic Master Regulator Analysis
This package implements the inference of candidate master regulator proteins from multi-omics' data (MOMA) algorithm, as well as ancillary analysis and visualization functions.
Last updated
softwarenetworkenrichmentnetworkinferencenetworkfeatureextractionclusteringfunctionalgenomicstranscriptomicssystemsbiology
5.23 score 6 stars 14 scripts 350 downloads
atena - Analysis of Transposable Elements
Quantify expression of transposable elements (TEs) from RNA-seq data through different methods, including ERVmap, TEtranscripts and Telescope. A common interface is provided to use each of these methods, which consists of building a parameter object, calling the quantification function with this object and getting a SummarizedExperiment object as output container of the quantified expression profiles. The implementation allows one to quantify TEs and gene transcripts in an integrated manner.
Last updated
transcriptiontranscriptomicsrnaseqsequencingpreprocessingsoftwaregeneexpressioncoveragedifferentialexpressionfunctionalgenomics
5.21 score 13 stars 2 scripts 480 downloads
crisprVerse - Easily install and load the crisprVerse ecosystem for CRISPR gRNA design
The crisprVerse is a modular ecosystem of R packages developed for the design and manipulation of CRISPR guide RNAs (gRNAs). All packages share a common language and design principles. This package is designed to make it easy to install and load the crisprVerse packages in a single step. To learn more about the crisprVerse, visit <https://www.github.com/crisprVerse>.
Last updated
crisprfunctionalgenomicsgenetargetcrispr-analysiscrispr-designcrispr-targetgrnagrna-sequencegrna-sequences
5.18 score 15 stars 8 scripts 344 downloads
mitoClone2 - Clonal Population Identification in Single-Cell RNA-Seq Data using Mitochondrial and Somatic Mutations
This package primarily identifies variants in mitochondrial genomes from BAM alignment files. It filters these variants to remove RNA editing events, then estimates their evolutionary relationship (i.e. their phylogenetic tree) and groups single cells into clones. It also visualizes the mutations and provides additional genomic context.
Last updated
annotationdataimportgeneticssnpsoftwaresinglecellalignmentcurlbzip2xz-utilszlibcpp
5.18 score 1 stars 10 scripts 370 downloads
aggregateBioVar - Differential Gene Expression Analysis for Multi-subject scRNA-seq
For single cell RNA-seq data collected from more than one subject (e.g. biological sample or technical replicates), this package contains tools to summarize single cell gene expression profiles at the level of subject. A SingleCellExperiment object is taken as input and converted to a list of SummarizedExperiment objects, where each list element corresponds to an assigned cell type. The SummarizedExperiment objects contain aggregate gene-by-subject count matrices and inter-subject column metadata for individual subjects that can be processed using downstream bulk RNA-seq tools.
Last updated
softwaresinglecellrnaseqtranscriptomicstranscriptiongeneexpressiondifferentialexpression
5.18 score 5 stars 20 scripts 532 downloads
spqn - Spatial quantile normalization
The spqn package implements spatial quantile normalization (SpQN). This method was developed to remove a mean-correlation relationship in correlation matrices built from gene expression data. It can serve as a pre-processing step prior to a co-expression analysis.
Last updated
networkinferencegraphandnetworknormalization
5.18 score 5 stars 30 scripts 324 downloads
CMA - Synthesis of microarray-based classification
This package provides a comprehensive collection of various microarray-based classification algorithms both from Machine Learning and Statistics. Variable Selection, Hyperparameter tuning, Evaluation and Comparison can be performed combined or stepwise in a user-friendly environment.
Last updated
classificationdecisiontree
5.16 score 73 scripts 549 downloads
EpiMix - EpiMix: an integrative tool for the population-level analysis of DNA methylation
EpiMix is a comprehensive tool for the integrative analysis of high-throughput DNA methylation data and gene expression data. EpiMix enables automated data downloading (from TCGA or GEO), preprocessing, methylation modeling, interactive visualization and functional annotation. To identify hypo- or hypermethylated CpG sites across physiological or pathological conditions, EpiMix uses beta mixture modeling to identify the methylation states of each CpG probe and compares the methylation of the experimental group to the control group. The output from EpiMix is the functional DNA methylation that is predictive of gene expression. EpiMix incorporates specialized algorithms to identify functional DNA methylation at various genetic elements, including proximal cis-regulatory elements of protein-coding genes, distal enhancers, and genes encoding microRNAs and lncRNAs.
Last updated
softwareepigeneticspreprocessingdnamethylationgeneexpressiondifferentialmethylation
5.16 score 2 stars 1 dependents 16 scripts 344 downloads
fobitools - Tools for Manipulating the FOBI Ontology
A set of tools for interacting with the Food-Biomarker Ontology (FOBI). A collection of basic manipulation tools for biological significance analysis, graphs, and text mining strategies for annotating nutritional data.
Last updated
massspectrometrymetabolomicssoftwarevisualizationbiomedicalinformaticsgraphandnetworkannotationcheminformaticspathwaysgenesetenrichmentbiological-intrerpretationbiological-knowledgebiological-significance-analysisenrichment-analysisfood-biomarker-ontologyknowledge-graphnutritionobofoundryontologytext-mining
5.16 score 1 stars 18 scripts 346 downloads
ptairMS - Pre-processing PTR-TOF-MS Data
This package implements a suite of methods to preprocess data from PTR-TOF-MS instruments (HDF5 format) and generates the 'sample by features' table of peak intensities in addition to the sample and feature metadata (as a single ExpressionSet object for subsequent statistical analysis). This package also provides useful tools for cohort management, such as progressive data analysis, visualization tools and quality control. The steps include calibration, expiration detection, peak detection and quantification, feature alignment, missing value imputation and feature annotation. Applications to exhaled air and cell culture in headspace are described in the vignettes and examples. This package was used for data analysis in the Gassin Delyle study on adults undergoing invasive mechanical ventilation in the intensive care unit due to severe COVID-19 or non-COVID-19 acute respiratory distress syndrome (ARDS), and made it possible to identify four potential biomarkers of the infection.
Last updated
softwaremassspectrometrypreprocessingmetabolomicspeakdetectionalignmentcpp
5.15 score 7 stars 3 scripts 336 downloads
PROPER - PROspective Power Evaluation for RNAseq
This package provides simulation-based methods for evaluating the statistical power in differential expression analysis from RNA-seq data.
Last updated
immunooncologysequencingrnaseqdifferentialexpression
5.14 score 1 dependents 23 scripts 451 downloads
myvariant - Accesses MyVariant.info variant query and annotation services
MyVariant.info is a comprehensive aggregation of variant annotation resources. myvariant is a wrapper for querying MyVariant.info services.
Last updated
variantannotationannotationgenomicvariation
5.12 score 29 scripts 406 downloads
iSEEhex - iSEE extension for summarising data points in hexagonal bins
This package provides panels summarising data points in hexagonal bins for `iSEE`. It is part of `iSEEu`, the iSEE universe of panels that extend the `iSEE` package.
Last updated
softwareinfrastructurebioconductoriseeushiny-r
5.08 score 2 dependents 7 scripts 364 downloads
shinyepico - ShinyÉPICo
ShinyÉPICo is a graphical pipeline to analyze Illumina DNA methylation arrays (450k or EPIC). It allows users to calculate differentially methylated positions and differentially methylated regions in a user-friendly interface. Moreover, it includes several options to export the results and obtain files to perform downstream analysis.
Last updated
differentialmethylationdnamethylationmicroarraypreprocessingqualitycontrol
5.08 score 6 stars 1 scripts 358 downloads
GladiaTOX - R Package for Processing High Content Screening data
GladiaTOX R package is an open-source, flexible solution to high-content screening data processing and reporting in biomedical research. GladiaTOX takes advantage of the tcpl core functionalities and provides a number of extensions: it provides a web-service solution to fetch raw data; it computes severity scores and exports ToxPi formatted files; furthermore it contains a suite of functionalities to generate pdf reports for quality control and data processing.
Last updated
softwareworkflowstepnormalizationpreprocessingqualitycontrol
5.08 score 6 scripts 402 downloads
canceR - A Graphical User Interface for accessing and modeling the Cancer Genomics Data of MSKCC
The package is a user-friendly interface based on the cgdsr and other modeling packages to explore, compare, and analyse all available Cancer Data (Clinical data, Gene Mutation, Gene Methylation, Gene Expression, Protein Phosphorylation, Copy Number Alteration) hosted by the Computational Biology Center at Memorial-Sloan-Kettering Cancer Center (MSKCC).
Last updated
guigeneexpressionclusteringgogenesetenrichmentkeggmultiplecomparisoncancercancer-datagenegene-expressiongene-methylationgene-mutationgene-setsmethylationmskccmutationstcltk
5.08 score 7 stars 17 scripts 450 downloads
DifferentialRegulation - Differentially regulated genes from scRNA-seq data
DifferentialRegulation is a method for detecting differentially regulated genes between two groups of samples (e.g., healthy vs. disease, or treated vs. untreated samples), by targeting differences in the balance of spliced and unspliced mRNA abundances, obtained from single-cell RNA-sequencing (scRNA-seq) data. From a mathematical point of view, DifferentialRegulation accounts for the sample-to-sample variability, and embeds multiple samples in a Bayesian hierarchical model. Furthermore, our method also deals with two major sources of mapping uncertainty: i) 'ambiguous' reads, compatible with both spliced and unspliced versions of a gene, and ii) reads mapping to multiple genes. In particular, ambiguous reads are treated separately from spliced and unspliced reads, while reads that are compatible with multiple genes are allocated to the gene of origin. Parameters are inferred via Markov chain Monte Carlo (MCMC) techniques (Metropolis-within-Gibbs).
Last updated
differentialsplicingbayesiangeneticsrnaseqsequencingdifferentialexpressiongeneexpressionmultiplecomparisonsoftwaretranscriptionstatisticalmethodvisualizationsinglecellgenetargetopenblascpp
5.04 score 11 stars 10 scripts 340 downloads
Chicago - CHiCAGO: Capture Hi-C Analysis of Genomic Organization
A pipeline for analysing Capture Hi-C data.
Last updated
epigeneticshicsequencingsoftware
5.03 score 36 scripts 528 downloads
MEAL - Perform methylation analysis
Package to integrate methylation and expression data. It can also perform methylation or expression analysis alone. Several plotting functionalities are included as well as a new region analysis based on redundancy analysis. Effect of SNPs on a region can also be estimated.
Last updated
dnamethylationmicroarraysoftwarewholegenome
5.03 score 27 scripts 486 downloads
widgetTools - Creates an interactive tcltk widget
This package contains tools to support the construction of tcltk widgets.
Last updated
infrastructure
5.01 score 7 dependents 10 scripts 2.4k downloads
katdetectr - Detection, Characterization and Visualization of Kataegis in Sequencing Data
Kataegis refers to the occurrence of regional hypermutation and is a phenomenon observed in a wide range of malignancies. Using changepoint detection, katdetectr aims to identify putative kataegis foci from common data formats housing genomic variants. Katdetectr has been shown to be a robust package for the detection, characterization and visualization of kataegis.
Last updated
wholegenomesoftwaresnpsequencingclassificationvariantannotation
5.00 score 5 stars 3 scripts 284 downloads
RolDE - RolDE: Robust longitudinal Differential Expression
RolDE detects longitudinal differential expression between two conditions in noisy high-throughput data. Suitable even for data with a moderate amount of missing values. RolDE is a composite method, consisting of three independent modules with different approaches to detecting longitudinal differential expression. The combination of these diverse modules allows RolDE to robustly detect varying differences in longitudinal trends and expression levels in diverse data types and experimental settings.
Last updated
statisticalmethodsoftwaretimecourseregressionproteomicsdifferentialexpression
5.00 score 5 stars 1 scripts 359 downloads
MAI - Mechanism-Aware Imputation
A two-step approach to imputing missing data in metabolomics. Step 1 uses a random forest classifier to classify missing values as either Missing Completely at Random/Missing At Random (MCAR/MAR) or Missing Not At Random (MNAR). MCAR/MAR are combined because it is often difficult to distinguish these two missing types in metabolomics data. Step 2 imputes the missing values based on the classified missing mechanisms, using the appropriate imputation algorithms. Imputation algorithms tested and available for MCAR/MAR include Bayesian Principal Component Analysis (BPCA), Multiple Imputation No-Skip K-Nearest Neighbors (Multi_nsKNN), and Random Forest. Imputation algorithms tested and available for MNAR include nsKNN and a single imputation approach for imputation of metabolites where left-censoring is present.
Last updated
softwaremetabolomicsstatisticalmethodclassificationimputation-methodsmachine-learningmissing-data
5.00 score 2 stars 6 scripts 320 downloads
SCArray - Large-scale single-cell omics data manipulation with GDS files
Provides large-scale single-cell omics data manipulation using Genomic Data Structure (GDS) files. It combines dense and sparse matrices stored in GDS files and the Bioconductor infrastructure framework (SingleCellExperiment and DelayedArray) to provide out-of-memory data storage and large-scale manipulation using the R programming language.
Last updated
infrastructuredatarepresentationdataimportsinglecellrnaseqcpp
5.00 score 1 stars 1 dependents 11 scripts 372 downloads
fmrs - Variable Selection in Finite Mixture of AFT Regression and FMR Models
The package obtains parameter estimation, i.e., maximum likelihood estimators (MLE), via the Expectation-Maximization (EM) algorithm for the Finite Mixture of Regression (FMR) models with Normal distribution, and MLE for the Finite Mixture of Accelerated Failure Time Regression (FMAFTR) subject to right censoring with Log-Normal and Weibull distributions via the EM algorithm and the Newton-Raphson algorithm (for Weibull distribution). More importantly, the package obtains the maximum penalized likelihood (MPLE) for both FMR and FMAFTR models (collectively called FMRs). A component-wise tuning parameter selection based on a component-wise BIC is implemented in the package. Furthermore, this package provides Ridge Regression and Elastic Net.
Last updated
survivalregressiondimensionreduction
5.00 score 3 stars 1 dependents 55 scripts 358 downloads
CNVPanelizer - Reliable CNV detection in targeted sequencing applications
A method that allows for the use of a collection of non-matched normal tissue samples. Our approach uses a non-parametric bootstrap subsampling of the available reference samples to estimate the distribution of read counts from targeted sequencing. As inspired by random forest, this is combined with a procedure that subsamples the amplicons associated with each of the targeted genes. The obtained information allows us to reliably classify the copy number aberrations on the gene level.
Last updated
classificationsequencingnormalizationcopynumbervariationcoverage
4.99 score 14 scripts 362 downloads
Oscope - Oscope - A statistical pipeline for identifying oscillatory genes in unsynchronized single cell RNA-seq
Oscope is a statistical pipeline developed to identify and recover the base cycle profiles of oscillating genes in an unsynchronized single cell RNA-seq experiment. The Oscope pipeline includes three modules: a sine model module to search for candidate oscillator pairs; a K-medoids clustering module to cluster candidate oscillators into groups; and an extended nearest insertion module to recover the base cycle order for each oscillator group.
Last updated
immunooncologystatisticalmethodrnaseqsequencinggeneexpression
4.98 score 1 dependents 16 scripts 514 downloads
phenoTest - Tools to test association between gene expression and phenotype in a way that is efficient, structured, fast and scalable. We also provide tools to do GSEA (Gene set enrichment analysis) and copy number variation.
Tools to test correlation between gene expression and phenotype in a way that is efficient, structured, fast and scalable. GSEA is also provided.
Last updated
microarraydifferentialexpressionmultiplecomparisonclusteringclassification
4.97 score 1 dependents 26 scripts 534 downloads
CTdata - Data companion to CTexploreR
Data from publicly available databases (GTEx, CCLE, TCGA and ENCODE) that go with CTexploreR in order to re-define a comprehensive and thoroughly curated list of CT genes and their main characteristics.
Last updated
transcriptomicsepigeneticsgeneexpressiondataimportexperimenthubsoftware
4.95 score 1 stars 1 dependents 2 scripts 287 downloads
alabaster.spatial - Save and Load Spatial 'Omics Data to/from File
Save SpatialExperiment objects and their images into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentation
4.95 score 2 dependents 5 scripts 299 downloads
CexoR - An R package to uncover high-resolution protein-DNA interactions in ChIP-exo replicates
Strand specific peak-pair calling in ChIP-exo replicates. The cumulative Skellam distribution function is used to detect significant normalised count differences of opposed sign at each DNA strand (peak-pairs). Then, irreproducible discovery rate for overlapping peak-pairs across biological replicates is computed.
Last updated
functionalgenomicssequencingcoveragechipseqpeakdetectionbioc-devel
4.95 score 6 scripts 457 downloads
awst - Asymmetric Within-Sample Transformation
We propose an Asymmetric Within-Sample Transformation (AWST) to regularize RNA-seq read counts and reduce the effect of noise on the classification of samples. AWST comprises two main steps: standardization and smoothing. These steps transform gene expression data to reduce the noise of the lowly expressed features, which suffer from background effects and low signal-to-noise ratio, and the influence of the highly expressed features, which may be the result of amplification bias and other experimental artifacts.
Last updated
normalizationgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecell
4.95 score 3 stars 15 scripts 403 downloads
DelayedRandomArray - Delayed Arrays of Random Values
Implements a DelayedArray of random values where the realization of the sampled values is delayed until they are needed. Reproducible sampling within any subarray is achieved by chunking where each chunk is initialized with a different random seed and stream. The usual distributions in the stats package are supported, along with scalar, vector and arrays for the parameters.
Last updated
datarepresentationcpp
4.95 score 1 dependents 7 scripts 346 downloads
EpiTxDb - Storing and accessing epitranscriptomic information using the AnnotationDbi interface
EpiTxDb facilitates the storage of epitranscriptomic information. More specifically, it can keep track of modification identity, position, the enzyme for introducing it on the RNA, a specifier which determines the position on the RNA to be modified and the literature references each modification is associated with.
Last updated
softwareepitranscriptomics
4.95 score 7 scripts 404 downloads
MEB - A normalization-invariant minimum enclosing ball method to detect differentially expressed genes for RNA-seq and scRNA-seq data
This package provides a method to identify differentially expressed (DE) genes in the same or different species. Given that non-DE genes have some similarities in features, a scaling-free minimum enclosing ball (SFMEB) model is built to cover those non-DE genes in feature space; DE genes, which are enormously different from non-DE genes, are then regarded as outliers and rejected outside the ball. The method in this package is described in the article 'A minimum enclosing ball method to detect differential expression genes for RNA-seq data'. The SFMEB method is extended to the scMEB method, which identifies DEGs in scRNA-seq datasets with two or more potential cell types or unknown labels.
Last updated
differentialexpressiongeneexpressionnormalizationclassificationsequencing
4.95 score 1 scripts 362 downloads
procoil - Prediction of Oligomerization of Coiled Coil Proteins
The package allows for predicting whether a coiled coil sequence (amino acid sequence plus heptad register) is more likely to form a dimer or more likely to form a trimer. Additionally to the prediction itself, a prediction profile is computed which allows for determining the strengths to which the individual residues are indicative for either class. Prediction profiles can also be visualized as curves or heatmaps.
Last updated
proteomicsclassificationsupportvectormachine
4.95 score 7 scripts 512 downloads
MBQN - Mean/Median-balanced quantile normalization
Modified quantile normalization for omics or other matrix-like data distorted in location and scale.
Last updated
normalizationpreprocessingproteomicssoftware
4.92 score 2 stars 14 scripts 350 downloads
AnnotationHubData - Transform public data resources into Bioconductor Data Structures
These recipes convert a wide variety and a growing number of public bioinformatic data sets into easily-used standard Bioconductor data structures.
Last updated
dataimport
4.92 score 3 dependents 23 scripts 867 downloads
scanMiRApp - scanMiR shiny application
A shiny interface to the scanMiR package. The application enables the scanning of transcripts and custom sequences for miRNA binding sites, the visualization of KdModels and binding results, as well as browsing predicted repression data. In addition, it contains the IndexedFst class for fast indexed reading of large GenomicRanges or data.frames, and some utilities for facilitating scans and identifying enriched miRNA-target pairs.
Last updated
mirnasequencematchingguishinyapps
4.91 score 27 scripts 320 downloads
ncGTW - Alignment of LC-MS Profiles by Neighbor-wise Compound-specific Graphical Time Warping with Misalignment Detection
The purpose of ncGTW is to help XCMS with LC-MS data alignment. Currently, ncGTW can detect the feature groups misaligned by XCMS, and the user can choose whether to realign these feature groups with ncGTW.
Last updated
softwaremassspectrometrymetabolomicsalignmentcpp
4.90 score 8 stars 9 scripts 319 downloads
FuseSOM - A Correlation Based Multiview Self Organizing Maps Clustering For IMC Datasets
A correlation-based multiview self-organizing map for the characterization of cell types in highly multiplexed in situ imaging cytometry assays (`FuseSOM`) is a tool for unsupervised clustering. `FuseSOM` is robust and achieves high accuracy by combining a `Self Organizing Map` architecture and a `Multiview` integration of correlation based metrics. This allows FuseSOM to cluster highly multiplexed in situ imaging cytometry assays.
Last updated
singlecellcellbasedassaysclusteringspatial
4.89 score 1 stars 26 scripts 338 downloads
signifinder - Collection and implementation of public transcriptional cancer signatures
signifinder is an R package for computing and exploring a compendium of tumor signatures. It computes a variety of signatures from the public literature, based on gene expression values, and returns single-sample (-cell/-spot) scores. Currently, signifinder collects more than 70 distinct signatures, relating to multiple tumors and multiple cancer processes.
Last updated
geneexpressiongenetargetimmunooncologybiomedicalinformaticsrnaseqmicroarrayreportwritingvisualizationsinglecellspatialgenesignaling
4.88 score 9 stars 21 scripts 339 downloads
eds - eds: Low-level reader for Alevin EDS format
This package provides a single function, readEDS. This is a low-level utility for reading the Alevin EDS format into R. The function is not designed for end users; the package exists mainly to simplify the dependency graph of other Bioconductor packages.
Last updated
sequencingrnaseqgeneexpressionsinglecellcpp
4.88 score 1 dependents 25 scripts 540 downloads
RAREsim - Simulation of Rare Variant Genetic Data
Haplotype simulations of rare variant genetic data that emulate real data can be performed with RAREsim. RAREsim uses the expected number of variants in MAC bins - either as provided by default parameters or estimated from target data - and an abundance of rare variants as simulated by HAPGEN2 to probabilistically prune variants. RAREsim produces haplotypes that emulate real sequencing data with respect to the total number of variants, allele frequency spectrum, haplotype structure, and variant annotation.
Last updated
geneticssoftwarevariantannotationsequencing
4.88 score 5 stars 15 scripts 274 downloads
HilbertVis - Hilbert curve visualization
Functions to visualize long vectors of integer data by means of Hilbert curves
Last updated
visualization
4.86 score 2 dependents 20 scripts 552 downloads
vsclust - Feature-based variance-sensitive quantitative clustering
Feature-based variance-sensitive clustering of omics data. Optimizes cluster assignment by taking into account individual feature variance. Includes several modules for statistical testing, clustering and enrichment analysis.
Last updated
clusteringannotationprincipalcomponentdifferentialexpressionvisualizationproteomicsmetabolomicscpp
4.85 score 9 scripts 278 downloads
SANTA - Spatial Analysis of Network Associations
This package provides methods for measuring the strength of association between a network and a phenotype. It does this by measuring clustering of the phenotype across the network (Knet). Vertices can also be individually ranked by their strength of association with high-weight vertices (Knode).
Last updated
networknetworkenrichmentclustering
4.85 score 6 scripts 417 downloads
packFinder - de novo Annotation of Pack-TYPE Transposable Elements
Algorithm and tools for in silico pack-TYPE transposon discovery. Filters a given genome for properties unique to DNA transposons and provides tools for the investigation of returned matches. Sequences are input in DNAString format, and ranges are returned as a dataframe (in the format returned by as.dataframe(GRanges)).
Last updated
geneticssequencematchingannotationbioinformaticstext-mining
4.85 score 7 stars 6 scripts 386 downloads
sSNAPPY - Single Sample directioNAl Pathway Perturbation analYsis
A single sample pathway perturbation testing method for RNA-seq data. The method propagates changes in gene expression down gene-set topologies to compute single-sample directional pathway perturbation scores that reflect potential direction of change. Perturbation scores can be used to test significance of pathway perturbation at both individual-sample and treatment levels.
Last updated
softwaregeneexpressiongenesetenrichmentgenesignaling
4.80 score 1 stars 21 scripts 342 downloads
MANOR - CGH Micro-Array NORmalization
Importation, normalization, visualization, and quality control functions to correct identified sources of variability in array-CGH experiments.
Last updated
microarraytwochanneldataimportqualitycontrolpreprocessingcopynumbervariationnormalization
4.80 score 3 scripts 540 downloads
miRLAB - Dry lab for exploring miRNA-mRNA relationships
Provides tools for exploring miRNA-mRNA relationships, including popular miRNA target prediction methods, ensemble methods that integrate individual methods, functions to get data from online resources, functions to validate the results, and functions to conduct enrichment analyses.
Last updated
mirnageneexpressionnetworkinferencenetwork
4.80 score 13 scripts 344 downloads
SBMLR - SBML-R Interface and Analysis Tools
This package contains a systems biology markup language (SBML) interface to R.
Last updated
graphandnetworkpathwaysnetwork
4.80 score 52 scripts 475 downloads
seqPattern - Visualising oligonucleotide patterns and motif occurrences across a set of sorted sequences
Visualising oligonucleotide patterns and sequence motif occurrences across a large set of sequences centred at a common reference point and sorted by a user-defined feature.
Last updated
visualizationsequencematching
4.78 score 6 dependents 12 scripts 1.4k downloads
DELocal - Identifies differentially expressed genes with respect to other local genes
The goal of DELocal is to identify DE genes compared to their neighboring genes from the same chromosomal location. It has been shown that genes of related functions are generally very far from each other in the chromosome. DELocal utilizes this information to identify DE genes by comparing them with their neighbouring genes.
Last updated
geneexpressiondifferentialexpressionrnaseqtranscriptomics
4.78 score 1 dependents 4 scripts 306 downloads
alabaster.string - Save and Load Biostrings to/from File
Save Biostrings objects to file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentation
4.78 score 2 dependents 5 scripts 276 downloads
clevRvis - Visualization Techniques for Clonal Evolution
clevRvis provides a set of visualization techniques for clonal evolution. These include shark plots, dolphin plots and plaice plots. Algorithms for time point interpolation as well as therapy effect estimation are provided. Phylogeny-aware color coding is implemented. A shiny-app for generating plots interactively is additionally provided.
Last updated
softwareshinyappsvisualization
4.78 score 6 stars 9 scripts 283 downloads
crisprBwa - BWA-based alignment of CRISPR gRNA spacer sequences
Provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bwa. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Currently not supported on Windows machines.
Last updated
crisprfunctionalgenomicsalignmentalignerbioconductorbioconductor-packagebwacrispr-analysiscrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequencessgrnasgrna-design
4.78 score 1 stars 8 scripts 307 downloads
airpart - Differential cell-type-specific allelic imbalance
Airpart identifies sets of genes displaying differential cell-type-specific allelic imbalance across cell types or states, utilizing single-cell allelic counts. It makes use of a generalized fused lasso with binomial observations of allelic counts to partition cell types by their allelic imbalance. Alternatively, a nonparametric method for partitioning cell types is offered. The package includes a number of visualizations and quality control functions for examining single cell allelic imbalance datasets.
Last updated
singlecellrnaseqatacseqchipseqsequencinggeneregulationgeneexpressiontranscriptiontranscriptomevariantcellbiologyfunctionalgenomicsdifferentialexpressiongraphandnetworkregressionclusteringqualitycontrol
4.78 score 2 stars 2 scripts 431 downloads
marr - Maximum rank reproducibility
marr (Maximum Rank Reproducibility) is a nonparametric approach that detects reproducible signals using a maximal rank statistic for high-dimensional biological data. In this R package, we implement functions that measure the reproducibility of features per sample pair and sample pairs per feature in high-dimensional biological replicate experiments. The user-friendly plot functions in this package also plot histograms of the reproducibility of features per sample pair and sample pairs per feature. Furthermore, our approach allows users to select optimal filtering threshold values for the identification of reproducible features and sample pairs based on output visualization checks (histograms). This package also provides the subset of data filtered by reproducible features and/or sample pairs.
Last updated
qualitycontrolmetabolomicsmassspectrometryrnaseqchipseqcpp
4.78 score 3 stars 7 scripts 322 downloads
rnaEditr - Statistical analysis of RNA editing sites and hyper-editing regions
RNAeditr analyzes site-specific RNA editing events, as well as hyper-editing regions. The editing frequencies can be tested against binary, continuous or survival outcomes. Multiple covariate variables as well as interaction effects can also be incorporated in the statistical models.
Last updated
genetargetepigeneticsdimensionreductionfeatureextractionregressionsurvivalrnaseq
4.78 score 3 stars 9 scripts 342 downloads
wpm - Well Plate Maker
The Well-Plate Maker (WPM) is a shiny application deployed as an R package. Functions for command-line/script use are also available. The WPM allows users to generate well plate maps to carry out their experiments while improving the handling of batch effects. In particular, it helps control the "plate effect" thanks to its ability to randomize samples over multiple well plates. The algorithm for placing the samples is inspired by the backtracking algorithm: the samples are placed at random while respecting specific spatial constraints.
Last updated
guiproteomicsmassspectrometrybatcheffectexperimentaldesign
4.78 score 6 stars 10 scripts 342 downloads
padma - Individualized Multi-Omic Pathway Deviation Scores Using Multiple Factor Analysis
Use multiple factor analysis to calculate individualized pathway-centric scores of deviation with respect to the sampled population based on multi-omic assays (e.g., RNA-seq, copy number alterations, methylation, etc). Graphical and numerical outputs are provided to identify highly aberrant individuals for a particular pathway of interest, as well as the gene and omics drivers of aberrant multi-omic profiles.
Last updated
softwarestatisticalmethodprincipalcomponentgeneexpressionpathwaysrnaseqbiocartamethylseq
4.78 score 3 stars 6 scripts 382 downloads
NoRCE - NoRCE: Noncoding RNA Sets Cis Annotation and Enrichment
While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close-by on the genome are often regulated together. This genomic proximity on the sequence can hint to a functional association. We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, the users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast.
Last updated
biologicalquestiondifferentialexpressiongenomeannotationgenesetenrichmentgenetargetgenomeassemblygo
4.78 score 1 stars 6 scripts 300 downloads
sarks - Suffix Array Kernel Smoothing for discovery of correlative sequence motifs and multi-motif domains
Suffix Array Kernel Smoothing (see https://academic.oup.com/bioinformatics/article-abstract/35/20/3944/5418797), or SArKS, identifies sequence motifs whose presence correlates with numeric scores (such as differential expression statistics) assigned to the sequences (such as gene promoters). SArKS smooths over sequence similarity, quantified by location within a suffix array based on the full set of input sequences. A second round of smoothing over spatial proximity within sequences reveals multi-motif domains. Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.
Last updated
motifdiscoverygeneregulationgeneexpressiontranscriptomicsrnaseqdifferentialexpressionfeatureextractionopenjdk
4.78 score 3 stars 8 scripts 338 downloads
VaSP - Quantification and Visualization of Variations of Splicing in Population
Discovery of genome-wide variable alternative splicing events from short-read RNA-seq data and visualizations of gene splicing information for publication-quality multi-panel figures in a population. (Warning: The visualizing function has been removed because the Sushi package it depends on is deprecated. To use it, revert to an older version.)
Last updated
rnaseqalternativesplicingdifferentialsplicingstatisticalmethodvisualizationpreprocessingclusteringdifferentialexpressionkeggimmunooncology3s-scoresalternative-splicingballgownrna-seqsplicingsqtlstatistics
4.78 score 3 stars 3 scripts 353 downloads
RNAmodR.AlkAnilineSeq - Detection of m7G, m3C and D modification by AlkAnilineSeq
RNAmodR.AlkAnilineSeq implements the detection of m7G, m3C and D modifications on RNA from experimental data generated with the AlkAnilineSeq protocol. The package builds on the core functionality of the RNAmodR package to detect specific patterns of the modifications in high throughput sequencing data.
Last updated
softwareworkflowstepvisualizationsequencingalkanilineseqbioconductormodificationsrnarnamodr
4.78 score 2 stars 3 scripts 346 downloads
MLP - Mean Log P Analysis
Pathway analysis based on p-values associated with genes from a gene expression analysis of interest. Utility functions extract pathways from the Gene Ontology Biological Process (GOBP), Molecular Function (GOMF) and Cellular Component (GOCC), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Reactome databases. Methodology, and helper functions to display the results as a table, barplot of pathway significance, Gene Ontology graph and pathway significance are available.
Last updated
geneticsgeneexpressionpathwaysreactomekegggo
4.78 score 1 dependents 8 scripts 510 downloads
GeneMeta - MetaAnalysis for High Throughput Experiments
A collection of meta-analysis tools for analysing high throughput experimental data
Last updated
sequencinggeneexpressionmicroarray
4.78 score 1 dependents 4 scripts 594 downloads
OLIN - Optimized local intensity-dependent normalisation of two-color microarrays
Functions for normalisation of two-color microarrays by optimised local regression and for detection of artefacts in microarray data
Last updated
microarraytwochannelqualitycontrolpreprocessingvisualization
4.78 score 1 dependents 7 scripts 538 downloads
Rtpca - Thermal proximity co-aggregation with R
R package for performing thermal proximity co-aggregation analysis with thermal proteome profiling datasets to analyse protein complex assembly and (differential) protein-protein interactions across conditions.
Last updated
softwareproteomicsdataimport
4.74 score 37 scripts 350 downloads
MMUPHin - Meta-analysis Methods with Uniform Pipeline for Heterogeneity in Microbiome Studies
MMUPHin is an R package for meta-analysis tasks of microbiome cohorts. It has function interfaces for: a) covariate-controlled batch- and cohort effect adjustment, b) meta-analysis differential abundance testing, c) meta-analysis unsupervised discrete structure (clustering) discovery, and d) meta-analysis unsupervised continuous structure discovery.
Last updated
metagenomicsmicrobiomebatcheffect
4.74 score 69 scripts 680 downloads
stageR - stageR: stage-wise analysis of high throughput gene expression data in R
The stageR package allows automated stage-wise analysis of high-throughput gene expression data. The method is published in Genome Biology at https://genomebiology.biomedcentral.com/articles/10.1186/s13059-017-1277-0
Last updated
softwarestatisticalmethod
4.74 score 184 scripts 676 downloads
corral - Correspondence Analysis for Single Cell Data
Correspondence analysis (CA) is a matrix factorization method, and is similar to principal components analysis (PCA). Whereas PCA is designed for application to continuous, approximately normally distributed data, CA is appropriate for non-negative, count-based data that are in the same additive scale. The corral package implements CA for dimensionality reduction of a single matrix of single-cell data, as well as a multi-table adaptation of CA that leverages data-optimized scaling to align data generated from different sequencing platforms by projecting into a shared latent space. corral utilizes sparse matrices and a fast implementation of SVD, and can be called directly on Bioconductor objects (e.g., SingleCellExperiment) for easy pipeline integration. The package also includes additional options, including variations of CA to address overdispersion in count data (e.g., Freeman-Tukey chi-squared residual), as well as the option to apply CA-style processing to continuous data (e.g., proteomic TOF intensities) with the Hellinger distance adaptation of CA.
Last updated
batcheffectdimensionreductiongeneexpressionpreprocessingprincipalcomponentsequencingsinglecellsoftwarevisualization
4.73 score 27 scripts 372 downloads
MAIT - Statistical Analysis of Metabolomic Data
The MAIT package contains functions to perform end-to-end statistical analysis of LC/MS Metabolomic Data. Special emphasis is put on peak annotation and on the modular design of the functions.
Last updated
immunooncologymassspectrometrymetabolomicssoftware
4.73 score 27 scripts 356 downloads
uncoverappLib - Interactive graphical application for clinical assessment of sequence coverage at the base-pair level
A Shiny application containing a suite of graphical and statistical tools to support clinical assessment of low coverage regions. It displays three web pages, each providing a different analysis module: Coverage analysis, calculate AF by allele frequency app and binomial distribution. uncoverAPP provides a statistical summary of coverage given a target file or gene names.
Last updated
softwarevisualizationannotationcoverage
4.73 score 3 stars 18 scripts 342 downloads
flowFP - Fingerprinting for Flow Cytometry
Fingerprint generation of flow cytometry data, used to facilitate the application of machine learning and datamining tools for flow cytometry.
Last updated
flowcytometrycellbasedassaysclusteringvisualization
4.72 score 2 dependents 11 scripts 602 downloads
SpatialOmicsOverlay - Spatial Overlay for Omic Data from Nanostring GeoMx Data
Tools for NanoString Technologies GeoMx Technology. Package to easily graph on top of an OME-TIFF image. Plotting annotations can range from tissue segment to gene expression.
Last updated
geneexpressiontranscriptioncellbasedassaysdataimporttranscriptomicsproteomicsproprietaryplatformsrnaseqspatialdatarepresentationvisualizationopenjdk
4.72 score 21 scripts 256 downloads
qmtools - Quantitative Metabolomics Data Processing Tools
The qmtools (quantitative metabolomics tools) package provides basic tools for processing quantitative metabolomics data with the standard SummarizedExperiment class. This includes functions for imputation, normalization, feature filtering, feature clustering, dimension-reduction, and visualization to help users prepare data for statistical analysis. This package also offers a convenient way to compute empirical Bayes statistics for which metabolic features are different between two sets of study samples. Several functions in this package could also be used in other types of omics data.
Last updated
metabolomicspreprocessingnormalizationdimensionreductionmassspectrometry
4.72 score 2 stars 13 scripts 334 downloads
consICA - consensus Independent Component Analysis
consICA implements a data-driven deconvolution method – consensus independent component analysis (ICA) to decompose heterogeneous omics data and extract features suitable for patient diagnostics and prognostics. The method separates biologically relevant transcriptional signals from technical effects and provides information about the cellular composition and biological processes. The implementation of parallel computing in the package ensures efficient analysis of modern multicore systems.
Last updated
technologystatisticalmethodsequencingrnaseqtranscriptomicsclassificationfeatureextraction
4.70 score 2 scripts 274 downloads
epidecodeR - epidecodeR: a functional exploration tool for epigenetic and epitranscriptomic regulation
epidecodeR is a package for analysing the impact of the degree of DNA/RNA epigenetic chemical modification on the dysregulation of genes or proteins. This package integrates chemical modification data generated from a host of epigenomic or epitranscriptomic techniques such as ChIP-seq, ATAC-seq, m6A-seq, etc. and dysregulated gene lists in the form of differential gene expression, ribosome occupancy or differential protein translation, and identifies the impact on gene dysregulation of varying degrees of chemical modification associated with the genes. epidecodeR generates cumulative distribution function (CDF) plots showing shifts in the trend of overall log2FC between genes divided into groups based on the degree of modification associated with the genes. The tool also tests for significance of the difference in log2FC between groups of genes.
Last updated
differentialexpressiongeneregulationhistonemodificationfunctionalpredictiontranscriptiongeneexpressionepitranscriptomicsepigeneticsfunctionalgenomicssystemsbiologytranscriptomicschiponchipdifferential-expressiongenomicsgenomics-visualization
4.70 score 5 stars 7 scripts 340 downloads
methylscaper - Visualization of Methylation Data
methylscaper is an R package for processing and visualizing data jointly profiling methylation and chromatin accessibility (MAPit, NOMe-seq, scNMT-seq, nanoNOMe, etc.). The package supports both single-cell and single-molecule data, and a common interface for jointly visualizing both data types through the generation of ordered representational methylation-state matrices. The Shiny app allows for an interactive seriation process of refinement and re-weighting that optimally orders the cells or DNA molecules to discover methylation patterns and nucleosome positioning.
Last updated
dnamethylationepigeneticssequencingvisualizationsinglecellnucleosomepositioning
4.70 score 1 stars 6 scripts 332 downloads
ILoReg - ILoReg: a tool for high-resolution cell population identification from scRNA-Seq data
ILoReg is a tool for identification of cell populations from scRNA-seq data. In particular, ILoReg is useful for finding cell populations with subtle transcriptomic differences. The method utilizes a self-supervised learning method, called Iterative Clustering Projection (ICP), to find cluster probabilities, which are used in noise reduction prior to PCA and the subsequent hierarchical clustering and t-SNE steps. Additionally, functions for differential expression analysis to find gene markers for the populations and gene expression visualization are provided.
Last updated
singlecellsoftwareclusteringdimensionreductionrnaseqvisualizationtranscriptomicsdatarepresentationdifferentialexpressiontranscriptiongeneexpression
4.70 score 5 stars 2 scripts 370 downloads
AWFisher - An R package for fast computing for adaptively weighted fisher's method
Implementation of the adaptively weighted fisher's method, including fast p-value computing, variability index, and meta-pattern.
Last updated
statisticalmethodsoftware
4.70 score 5 stars 4 scripts 482 downloads
puma - Propagating Uncertainty in Microarray Analysis (including Affymetrix traditional 3' arrays and exon arrays and Human Transcriptome Array 2.0)
Most analyses of Affymetrix GeneChip data (including traditional 3' arrays and exon arrays and Human Transcriptome Array 2.0) are based on point estimates of expression levels and ignore the uncertainty of such estimates. By propagating uncertainty to downstream analyses we can improve results from microarray analyses. For the first time, the puma package makes a suite of uncertainty propagation methods available to a general audience. In addition to calculating gene expression from Affymetrix 3' arrays, puma also provides methods to process exon arrays and produce gene and isoform expression for alternative splicing studies. puma also offers improvements in terms of scope and speed of execution over previously available uncertainty propagation methods. Included are summarisation, differential expression detection, clustering and PCA methods, together with useful plotting functions.
Last updated
microarrayonechannelpreprocessingdifferentialexpressionclusteringexonarraygeneexpressionmrnamicroarraychiponchipalternativesplicingdifferentialsplicingbayesiantwochanneldataimporthta2.0
4.70 score 25 scripts 496 downloads
AnVILVRS - Expose the vrs_anvil_toolkit Python package via R
Process Variant Call Format (VCF) files and perform lookup operations on Genomic Variation Representation Service (GA4GH VRS) identifiers. The GA4GH VRS identifiers provide a standardized way to represent genomic variations, making it easier to exchange and share genomic information.
Last updated
softwarebiologicalquestionvariantannotationinfrastructurethirdpartyclientu24hg010263
4.70 score 2 scripts 80 downloads
LPE - Methods for analyzing microarray data using Local Pooled Error (LPE) method
This LPE library is used to do significance analysis of microarray data with small number of replicates. It uses resampling based FDR adjustment, and gives less conservative results than traditional 'BH' or 'BY' procedures. Data accepted is raw data in txt format from MAS4, MAS5 or dChip. Data can also be supplied after normalization. LPE library is primarily used for analyzing data between two conditions. To use it for paired data, see LPEP library. For using LPE in multiple conditions, use HEM library.
Last updated
microarraydifferentialexpression
4.69 score 1 dependents 27 scripts 436 downloads
DNAfusion - Identification of gene fusions using paired-end sequencing
DNAfusion can identify gene fusions such as EML4-ALK based on paired-end sequencing results. This package was developed using position deduplicated BAM files generated with the AVENIO Oncology Analysis Software. These files are made using the AVENIO ctDNA surveillance kit and Illumina Nextseq 500 sequencing. This is a targeted hybridization NGS approach and includes ALK-specific but not EML4-specific probes.
Last updated
targetedresequencinggeneticsgenefusiondetectionsequencingbioconductor-packagecirculating-tumor-dnagene-fusionliquid-biopsynext-generation-sequencingtargeted-sequencingvariant-calling
4.68 score 4 stars 12 scripts 348 downloads
CTSV - Identification of cell-type-specific spatially variable genes accounting for excess zeros
The R package CTSV implements the CTSV approach developed by Jinge Yu and Xiangyu Luo that detects cell-type-specific spatially variable genes accounting for excess zeros. CTSV directly models sparse raw count data through a zero-inflated negative binomial regression model, incorporates cell-type proportions, and performs hypothesis testing based on R package pscl. The package outputs p-values and q-values for genes in each cell type, and CTSV is scalable to datasets with tens of thousands of genes measured on hundreds of spots. CTSV can be installed in Windows, Linux, and Mac OS.
Last updated
geneexpressionstatisticalmethodregressionspatialgenetics
4.68 score 4 stars 12 scripts 285 downloads
GRaNIE - GRaNIE: Reconstruction cell type specific gene regulatory networks including enhancers using single-cell or bulk chromatin accessibility and RNA-seq data
Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that are regulating a given enhancer have cell-type specific activity. This TF activity can be quantified with existing tools such as diffTF and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (GRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such a GRN using single-cell or bulk RNAseq and open chromatin (e.g., using ATACseq or ChIPseq for open chromatin marks) and optionally (Capture) Hi-C data. Our network contains different types of links, connecting TFs to regulatory elements, the latter of which is connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach.
Last updated
softwaregeneexpressiongeneregulationnetworkinferencegenesetenrichmentbiomedicalinformaticsgeneticstranscriptomicsatacseqrnaseqgraphandnetworkregressiontranscriptionchipseq
4.68 score 16 scripts 308 downloads
RCyjs - Display and manipulate graphs in cytoscape.js
Interactive viewing and exploration of graphs, connecting R to Cytoscape.js, using websockets.
Last updated
visualizationgraphandnetworkthirdpartyclient
4.68 score 48 scripts 388 downloads
RTCA - Open-source toolkit to analyse data from xCELLigence System (RTCA)
Import, analyze and visualize data from Roche(R) xCELLigence RTCA systems. The package imports real-time cell electrical impedance data into R. As an alternative to the commercial software shipped with the system, the Bioconductor package RTCA provides several unique transformation (normalization) strategies and various visualization tools.
Last updated
immunooncologycellbasedassaysinfrastructurevisualizationtimecourse
4.68 score 12 scripts 382 downloads
ChIPComp - Quantitative comparison of multiple ChIP-seq datasets
ChIPComp detects differentially bound sharp binding sites across multiple conditions considering matching control.
Last updated
chipseqsequencingtranscriptiongeneticscoveragemultiplecomparisondataimport
4.67 score 52 scripts 444 downloads
Dune - Improving replicability in single-cell RNA-Seq cell type discovery
Given a set of clustering labels, Dune merges pairs of clusters to increase mean ARI between labels, improving replicability.
Last updated
clusteringgeneexpressionrnaseqsoftwaresinglecelltranscriptomicsvisualization
4.66 score 46 scripts 299 downloads
oncoscanR - Secondary analyses of CNV data (HRD and more)
The software uses the copy number segments from a text file and identifies all chromosome arms that are globally altered and computes various genome-wide scores. The following HRD scores (characteristic of BRCA-mutated cancers) are included: LST, HR-LOH, nLST and gLOH. The package is tailored for the ThermoFisher Oncoscan assay analyzed with their Chromosome Alteration Suite (ChAS) but can be adapted to any input.
Last updated
copynumbervariationmicroarraysoftware
4.65 score 3 stars 7 scripts 319 downloads
GenomicInteractionNodes - A R/Bioconductor package to detect the interaction nodes from HiC/HiChIP/HiCAR data
The GenomicInteractionNodes package can import interactions from a bedpe file and define the interaction nodes, i.e. genomic interaction sites with multiple interaction loops. An interaction node is a binding platform that regulates one or multiple genes. The detected interaction nodes will be annotated for downstream validation.
Last updated
hicsequencingsoftware
4.65 score 1 scripts 294 downloads
sitadela - An R package for the easy provision of simple but complete tab-delimited genomic annotation from a variety of sources and organisms
Provides an interface to build a unified database of genomic annotations and their coordinates (gene, transcript and exon levels). It is aimed to be used when simple tab-delimited annotations (or simple GRanges objects) are required instead of the more complex annotation Bioconductor packages. Also useful when combinatorial annotation elements are required, such as RefSeq coordinates with Ensembl biotypes. Finally, it can download, construct and handle annotations with versioned genes and transcripts (where available, e.g. RefSeq and latest Ensembl). This is particularly useful in precision medicine applications where the latter must be reported.
Last updated
softwareworkflowsteprnaseqtranscriptionsequencingtranscriptomicsbiomedicalinformaticsfunctionalgenomicssystemsbiologyalternativesplicingdataimportchipseq
4.65 score 4 scripts 303 downloads
ramr - Detection of Rare Aberrantly Methylated Regions in Array and NGS Data
ramr is an R package for detection of epimutations (i.e., infrequent aberrant DNA methylation events) in large data sets obtained by methylation profiling using array or high-throughput methylation sequencing. In addition, the package provides functions to visualize found aberrantly methylated regions (AMRs), to generate sets of all possible regions to be used as reference sets for enrichment analysis, and to generate biologically relevant test data sets for performance evaluation of AMR/DMR search algorithms.
Last updated
dnamethylationdifferentialmethylationepigeneticsmethylationarraymethylseqaberrant-methylationbioconductordna-methylationepimutationmethylation-microarraysnext-generation-sequencingcppopenmp
4.65 score 5 scripts 392 downloads
idr2d - Irreproducible Discovery Rate for Genomic Interactions Data
A tool to measure reproducibility between genomic experiments that produce two-dimensional peaks (interactions between peaks), such as ChIA-PET, HiChIP, and HiC. idr2d is an extension of the original idr package, which is intended for (one-dimensional) ChIP-seq peaks.
Last updated
dna3dstructuregeneregulationpeakdetectionepigeneticsfunctionalgenomicsclassificationhic
4.65 score 15 scripts 388 downloads
altcdfenvs - alternative CDF environments (aka probeset mappings)
Convenience data structures and functions to handle cdfenvs
Last updated
microarrayonechannelqualitycontrolpreprocessingannotationproprietaryplatformstranscription
4.65 score 15 scripts 568 downloads
PinPath - Visualization of Omics Data on Pathway Diagrams
With PinPath, you can visualize omics data on pathway diagrams from WikiPathways and KEGG.
Last updated
pathwaysvisualizationgraphandnetworkkeggomicspathwaywikipathways
4.65 score 5 stars
iCARE - Individualized Coherent Absolute Risk Estimation (iCARE)
An R package to build, validate and apply absolute risk models
Last updated
softwarestatisticalmethodgenomewideassociation
4.64 score 11 scripts 360 downloads
flowSpecs - Tools for processing of high-dimensional cytometry data
This package is intended to fill the role of conventional cytometry pre-processing software, for spectral decomposition, transformation, visualization and cleanup, and to aid further downstream analyses, such as with DepecheR, by enabling transformation of flowFrames and flowSets to dataframes. Functions for flowCore-compliant automatic 1D-gating/filtering are in the pipeline. The package name has been chosen both as it will deal with spectral cytometry and as it will hopefully give the user a nice pair of spectacles through which to view their data.
Last updated
softwarecellbasedassaysdatarepresentationimmunooncologyflowcytometrysinglecellvisualizationnormalizationdataimport
4.64 score 6 stars 12 scripts 400 downloads
gmoviz - Seamless visualization of complex genomic variations in GMOs and edited cell lines
Genetically modified organisms (GMOs) and cell lines are widely used models in all kinds of biological research. As part of characterising these models, DNA sequencing technology and bioinformatics analyses are used systematically to study their genomes. Therefore, large volumes of data are generated and various algorithms are applied to analyse this data, which makes it challenging to represent all findings in an informative and concise manner. `gmoviz` provides users with an easy way to visualise and facilitate the explanation of complex genomic editing events on a larger, biologically-relevant scale.
Last updated
visualizationsequencinggeneticvariabilitygenomicvariationcoverage
4.62 score 14 scripts 351 downloads
parody - Parametric And Resistant Outlier DYtection
Provides routines for univariate and multivariate outlier detection with a focus on parametric methods, but with support for some methods based on resistant statistics.
Last updated
multiplecomparison
4.62 score 1 dependents 14 scripts 524 downloads
BCRANK - Predicting binding site consensus from ranked DNA sequences
Functions and classes for de novo prediction of transcription factor binding consensus by heuristic search
Last updated
motifdiscoverygeneregulation
4.62 score 35 scripts 608 downloads
magpie - MeRIP-Seq data Analysis for Genomic Power Investigation and Evaluation
This package aims to perform power analysis for the MeRIP-seq study. It calculates FDR, FDC, power, and precision under various study design parameters, including but not limited to sample size, sequencing depth, and testing method. It can also output results into .xlsx files or produce corresponding figures of choice.
Last updated
epitranscriptomicsdifferentialmethylationsequencingrnaseqsoftware
4.61 score 41 scripts 316 downloads
screenCounter - Counting Reads in High-Throughput Sequencing Screens
Provides functions for counting reads from high-throughput sequencing screen data (e.g., CRISPR, shRNA) to quantify barcode abundance. Currently supports single barcodes in single- or paired-end data, and combinatorial barcodes in paired-end data.
Last updated
crispralignmentfunctionalgenomicsfunctionalpredictionzlibcpp
4.60 score 4 stars 10 scripts 293 downloads
cytofQC - Labels normalized cells for CyTOF data and assigns probabilities for each label
cytofQC is a package for initial cleaning of CyTOF data. It uses a semi-supervised approach for labeling cells with their most likely data type (bead, doublet, debris, dead) and the probability that they belong to each label type. This package does not remove data from the dataset, but provides labels and information to aid the data user in cleaning their data. Our algorithm is able to distinguish between doublets and large cells.
Last updated
softwaresinglecellannotation
4.60 score 2 stars 4 scripts 293 downloads
MBECS - Evaluation and correction of batch effects in microbiome data-sets
The Microbiome Batch Effect Correction Suite (MBECS) provides a set of functions to evaluate and mitigate unwanted noise due to processing in batches. To that end it incorporates a host of batch correcting algorithms (BECA) from various packages. In addition it offers a correction and reporting pipeline that provides a preliminary look at the characteristics of a data-set before and after correcting for batch effects.
Last updated
batcheffectmicrobiomereportwritingvisualizationnormalizationqualitycontrol
4.60 score 4 stars 6 scripts 414 downloads
IntramiRExploreR - Predicting Targets for Drosophila Intragenic miRNAs
Intra-miR-ExploreR, an integrative miRNA target prediction bioinformatics tool, identifies targets combining expression and biophysical interactions of a given microRNA (miR). Using the tool, we have identified targets for 92 intragenic miRs in D. melanogaster, using available microarray expression data from the Affymetrix 1 and Affymetrix 2 microarray platforms, providing a global perspective of intragenic miR targets in Drosophila. Predicted targets are grouped according to biological functions using the DAVID Gene Ontology tool and are ranked based on a biologically relevant scoring system, enabling the user to identify functionally relevant targets for a given miR.
Last updated
softwaremicroarraygenetargetstatisticalmethodgeneexpressiongeneprediction
4.60 score 6 scripts 421 downloads
cellmigRation - Track Cells, Analyze Cell Trajectories and Compute Migration Statistics
Import TIFF images of fluorescently labeled cells, and track cell movements over time. Parallelization is supported for image processing and for fast computation of cell trajectories. In-depth analysis of cell trajectories is enabled by 15 trajectory analysis functions.
Last updated
cellbiologydatarepresentationdataimportbioconductor-packagecell-trackingshinytrajectory-analysis
4.60 score 1 stars 6 scripts 328 downloads
LRcell - Differential cell type change analysis using Logistic/linear Regression
The goal of LRcell is to identify specific sub-cell types that drive the changes observed in a bulk RNA-seq differential gene expression experiment. To achieve this, LRcell utilizes sets of cell marker genes acquired from single-cell RNA-sequencing (scRNA-seq) as indicators for various cell types in the tissue of interest. Next, for each cell type, using its marker genes as indicators, we apply Logistic Regression on the complete set of genes with differential expression p-values to calculate a cell-type significance p-value. Finally, these p-values are compared to predict which one(s) are likely to be responsible for the differential gene expression pattern observed in the bulk RNA-seq experiments. LRcell is inspired by the LRpath[@sartor2009lrpath] algorithm developed by Sartor et al., originally designed for pathway/gene set enrichment analysis. LRcell contains three major components: LRcell analysis, plot generation and marker gene selection. All modules in this package are written in R. This package also provides marker genes in the Prefrontal Cortex (pFC) human brain region, human PBMC and nine mouse brain regions (Frontal Cortex, Cerebellum, Globus Pallidus, Hippocampus, Entopeduncular, Posterior Cortex, Striatum, Substantia Nigra and Thalamus).
Last updated
singlecellgenesetenrichmentsequencingregressiongeneexpressiondifferentialexpressionenrichmentmarker-genes
4.60 score 4 stars 6 scripts 332 downloads
CAEN - Category encoding method for selecting feature genes for the classification of single-cell RNA-seq
With the development of high-throughput techniques, more and more gene expression analyses tend to replace hybridization-based microarrays with the revolutionary technology. The novel method encodes the category again by employing the rank of samples for each gene in each class. We then consider the correlation coefficient of gene and class with rank of sample and new rank of category. The genes with the highest correlation coefficients are considered the feature genes, which are most effective for classifying the samples.
Last updated
differentialexpressionsequencingclassificationrnaseqatacseqsinglecellgeneexpressionripseq
4.60 score 8 scripts 326 downloads
GEOfastq - Downloads ENA Fastqs With GEO Accessions
GEOfastq is used to download fastq files from the European Nucleotide Archive (ENA) starting with an accession from the Gene Expression Omnibus (GEO). To do this, sample metadata is retrieved from GEO and the Sequence Read Archive (SRA). SRA run accessions are then used to construct FTP and aspera download links for fastq files generated by the ENA.
Last updated
rnaseqdataimportbioinformaticsfastqgene-expressiongeorna-seq
4.60 score 4 stars 9 scripts 346 downloads
nempi - Inferring unobserved perturbations from gene expression data
Takes as input an incomplete perturbation profile and differential gene expression in log odds and infers unobserved perturbations and augments observed ones. The inference is done by iteratively inferring a network from the perturbations and inferring perturbations from the network. The network inference is done by Nested Effects Models.
Last updated
softwaregeneexpressiondifferentialexpressiondifferentialmethylationgenesignalingpathwaysnetworkclassificationneuralnetworknetworkinferenceatacseqdnaseqrnaseqpooledscreenscrisprsinglecellsystemsbiology
4.60 score 2 stars 4 scripts 330 downloads
bnem - Training of logical models from indirect measurements of perturbation experiments
bnem combines the use of indirect measurements of Nested Effects Models (package mnem) with the Boolean networks of CellNOptR. Perturbation experiments of signalling nodes in cells are analysed for their effect on the global gene expression profile. Those profiles give evidence for the Boolean regulation of down-stream nodes in the network, e.g., whether two parents activate their child independently (OR-gate) or jointly (AND-gate).
Last updated
pathwayssystemsbiologynetworkinferencenetworkgeneexpressiongeneregulationpreprocessing
4.60 score 2 stars 7 scripts 326 downloads
reconsi - Resampling Collapsed Null Distributions for Simultaneous Inference
Improves simultaneous inference under dependence of tests by estimating a collapsed null distribution through resampling. Accounting for the dependence between tests increases the power while reducing the variability of the false discovery proportion. This dependence is common in genomics applications, e.g. when combining flow cytometry measurements with microbiome sequence counts.
Last updated
metagenomicsmicrobiomemultiplecomparisonflowcytometry
4.60 score 2 stars 2 scripts 302 downloads
easyreporting - Helps creating report for improving Reproducible Computational Research
An S4 class for facilitating the automated creation of rmarkdown files inside other packages/software even without knowing rmarkdown language. Best if implemented in functions as "recursive" style programming.
Last updated
reportwriting
4.60 score 2 stars 6 scripts 390 downloads
brendaDb - The BRENDA Enzyme Database
R interface for importing and analyzing enzyme information from the BRENDA database.
Last updated
thirdpartyclientannotationdataimportbrendadatabaseenzymehacktoberfestcpp
4.60 score 2 stars 8 scripts 366 downloads
VariantExperiment - A RangedSummarizedExperiment Container for VCF/GDS Data with GDS Backend
VariantExperiment is a Bioconductor package for saving data in VCF/GDS format into a RangedSummarizedExperiment object. The high-throughput genetic/genomic data are saved in GDSArray objects. The annotation data for features/samples are saved in DelayedDataFrame format with mono-dimensional GDSArray in each column. The on-disk representation of both assay data and annotation data achieves on-disk reading and processing and saves memory space significantly. The interface of the RangedSummarizedExperiment data format enables easy and common manipulations for high-throughput genetic/genomic data with the common SummarizedExperiment metaphor in R and Bioconductor.
Last updated
infrastructuredatarepresentationsequencingannotationgenomeannotationgenotypingarray
4.60 score 1 stars 2 scripts 266 downloads
GeneBreak - Gene Break Detection
Recurrent breakpoint gene detection on copy number aberration profiles.
Last updated
acghcopynumbervariationdnaseqgeneticssequencingwholegenomevisualization
4.60 score 2 stars 6 scripts 397 downloads
limmaGUI - GUI for limma Package With Two Color Microarrays
A Graphical User Interface for differential expression analysis of two-color microarray data using the limma package.
Last updated
guigeneexpressiondifferentialexpressiondataimportbayesianregressiontimecoursemicroarraymrnamicroarraytwochannelbatcheffectmultiplecomparisonnormalizationpreprocessingqualitycontrol
4.60 score 1 scripts 506 downloads
affylmGUI - GUI for limma Package with Affymetrix Microarrays
A Graphical User Interface (GUI) for analysis of Affymetrix microarray gene expression data using the affy and limma packages.
Last updated
guigeneexpressiontranscriptiondifferentialexpressiondataimportbayesianregressiontimecoursemicroarraymrnamicroarrayonechannelproprietaryplatformsbatcheffectmultiplecomparisonnormalizationpreprocessingqualitycontrol
4.60 score 3 scripts 642 downloads
Rbwa - R wrapper for BWA-backtrack and BWA-MEM aligners
Provides an R wrapper for BWA alignment algorithms. Both BWA-backtrack and BWA-MEM are available. A convenience function to build a BWA index from a reference genome is also provided. Currently not supported for Windows machines.
Last updated
sequencingalignmentbwabwa-memdna-sequencesdna-sequencing
4.60 score 1 stars 1 dependents 11 scripts 294 downloads
BufferedMatrix - A matrix data storage object held in temporary files
A tabular style data object where most data is stored outside main memory. A buffer is used to speed up access to data.
Last updated
infrastructure
4.60 score 1 dependents 11 scripts 396 downloads
treekoR - Cytometry Cluster Hierarchy and Cellular-to-phenotype Associations
treekoR is a novel framework that aims to utilise the hierarchical nature of single cell cytometry data to find robust and interpretable associations between cell subsets and patient clinical end points. These associations are aimed to recapitulate the nested proportions prevalent in workflows involving manual gating, which are often overlooked in workflows using automatic clustering to identify cell populations. We developed treekoR to: derive a hierarchical tree structure of cell clusters; quantify each cell type as a proportion relative to all cells in a sample (%total) and as a proportion relative to a parent population (%parent); perform significance testing using the calculated proportions; and provide an interactive html visualisation to help highlight key results.
Last updated
clusteringdifferentialexpressionflowcytometryimmunooncologymassspectrometrysinglecellsoftwarestatisticalmethodvisualization
4.59 score 1 dependents 13 scripts 405 downloads
ROSeq - Modeling expression ranks for noise-tolerant differential expression analysis of scRNA-Seq data
ROSeq - A rank based approach to modeling gene expression with filtered and normalized read count matrix. ROSeq takes filtered and normalized read matrix and cell-annotation/condition as input and determines the differentially expressed genes between the contrasting groups of single cells. One of the input parameters is the number of cores to be used.
Last updated
geneexpressiondifferentialexpressionsinglecellcount-datagene-expressiongene-expression-profilesnormalizationpopulationsranktmmtungtung-datasettutorialvignette
4.58 score 2 stars 19 scripts 369 downloads
annotationTools - Annotate microarrays and perform cross-species gene expression analyses using flat file databases
Functions to annotate microarrays, find orthologs, and integrate heterogeneous gene expression profiles using annotation and other molecular biology information available as flat file database (plain text files).
Last updated
microarrayannotation
4.58 score 1 dependents 21 scripts 559 downloads
Macarron - Prioritization of potentially bioactive metabolic features from epidemiological and environmental metabolomics datasets
Macarron is a workflow for the prioritization of potentially bioactive metabolites from metabolomics experiments. Prioritization integrates strengths of evidences of bioactivity such as covariation with a known metabolite, abundance relative to a known metabolite and association with an environmental or phenotypic indicator of bioactivity. Broadly, the workflow consists of stratified clustering of metabolic spectral features which co-vary in abundance in a condition, transfer of functional annotations, estimation of relative abundance and differential abundance analysis to identify associations between features and phenotype/condition.
Last updated
sequencingmetabolomicscoveragefunctionalpredictionclustering
4.57 score 15 scripts 302 downloads
regioneReloaded - RegioneReloaded: Multiple Association for Genomic Region Sets
RegioneReloaded is a package that allows simultaneous analysis of associations between genomic region sets, enabling clustering of data and the creation of ready-to-publish graphs. It takes over and expands on all the features of its predecessor regioneR. It also incorporates a strategy to improve p-value calculations and normalize z-scores coming from multiple analysis to allow for their direct comparison. RegioneReloaded builds upon regioneR by adding new plotting functions for obtaining publication-ready graphs.
Last updated
geneticschipseqdnaseqmethylseqcopynumbervariationclusteringmultiplecomparison
4.56 score 5 stars 18 scripts 310 downloads
combi - Compositional omics model based visual integration
This explorative ordination method combines quasi-likelihood estimation, compositional regression models and latent variable models for integrative visualization of several omics datasets. Both unconstrained and constrained integration are available. The results are shown as interpretable, compositional multiplots.
Last updated
metagenomicsdimensionreductionmicrobiomevisualizationmetabolomics
4.56 score 1 stars 12 scripts 357 downloads
INSPEcT - Modeling RNA synthesis, processing and degradation with RNA-seq data
INSPEcT (INference of Synthesis, Processing and dEgradation rates from Transcriptomic data) analyzes RNA-seq data in time-course experiments or steady-state conditions, with or without the support of nascent RNA data.
Last updated
sequencingrnaseqgeneregulationtimecoursesystemsbiology
4.56 score 9 scripts 545 downloads
RBM - RBM: a R package for microarray and RNA-Seq data analysis
Uses a resampling-based empirical Bayes approach to assess differential expression in two-color microarray and RNA-Seq data sets.
Last updated
microarraydifferentialexpression
4.56 score 18 scripts 345 downloads
SELEX - Functions for analyzing SELEX-seq data
Tools for quantifying DNA binding specificities based on SELEX-seq data.
Last updated
softwaremotifdiscoverymotifannotationgeneregulationtranscriptionopenjdk
4.56 score 18 scripts 402 downloads
flowMerge - Cluster Merging for Flow Cytometry Data
Merging of mixture components for model-based automated gating of flow cytometry data using the flowClust framework. Note: users should have a working copy of flowClust 2.0 installed.
Last updated
immunooncologyclusteringflowcytometry
4.56 score 1 dependents 6 scripts 436 downloads
quantiseqr - Quantification of the Tumor Immune contexture from RNA-seq data
This package provides a streamlined workflow for the quanTIseq method, developed to perform the quantification of the Tumor Immune contexture from RNA-seq data. The quantification is performed against the TIL10 signature (dissecting the contributions of ten immune cell types), carefully crafted from a collection of human RNA-seq samples. The TIL10 signature has been extensively validated using simulated, flow cytometry, and immunohistochemistry data.
Last updated
geneexpressionsoftwaretranscriptiontranscriptomicssequencingmicroarrayvisualizationannotationimmunooncologyfeatureextractionclassificationstatisticalmethodexperimenthubsoftwareflowcytometry
4.55 score 1 dependents 4 scripts 1.2k downloads
ribor - An R Interface for Ribo Files
The ribor package provides an R Interface for .ribo files. It provides functionality to read the .ribo file, which is of HDF5 format, and performs common analyses on its contents.
Last updated
softwareinfrastructure
4.54 score 35 scripts 350 downloads
Spaniel - Spatial Transcriptomics Analysis
Spaniel includes a series of tools to aid the quality control and analysis of Spatial Transcriptomics data. Spaniel can import data from either the original Spatial Transcriptomics system or 10X Visium technology. The package contains functions to create a SingleCellExperiment Seurat object and provides a method of loading a histological image into R. The spanielPlot function allows visualisation of metrics contained within the S4 object overlaid onto the image of the tissue.
Last updated
singlecellrnaseqqualitycontrolpreprocessingnormalizationvisualizationtranscriptomicsgeneexpressionsequencingsoftwaredataimportdatarepresentationinfrastructurecoverageclustering
4.54 score 35 scripts 388 downloads
panoramic - Meta-Analysis of Spatial Colocalization in Spatial Omics
Provides a pipeline for quantifying and meta-analyzing spatial colocalization between cell types in spatial omics experiments. The package prepares SpatialExperiment inputs, computes Loh-bootstrap spatial summary functions (e.g. L- and K-functions) for cell-type pairs across samples, and performs random-effects meta-analysis to assess group-level differences in spatial colocalization.
Last updated
softwarespatialsinglecell
4.54 score 3 scripts
mbQTL - mbQTL: A package for SNP-Taxa mGWAS analysis
mbQTL is a statistical R package for simultaneous 16S rRNA, 16S rDNA (microbial) and variant, SNP, SNV (host) relationship, correlation and regression studies. We apply linear, logistic and correlation-based statistics to identify the relationships of taxa, genus, species and variant, SNP, SNV in the infected host. We produce various statistical significance measures such as P values, FDR, BC and probability estimation to show the significance of these relationships. Further, we provide various visualization functions for ease and clarification of the results of these analyses. The package is compatible with dataframe, MRexperiment and text formats.
Last updated
snpmicrobiomewholegenomemetagenomicsstatisticalmethodregression
4.53 score 1 stars 34 scripts 286 downloads
ASURAT - Functional annotation-driven unsupervised clustering for single-cell data
ASURAT is a software for single-cell data analysis. Using ASURAT, one can simultaneously perform unsupervised clustering and biological interpretation in terms of cell type, disease, biological process, and signaling pathway activity. Given single-cell RNA-seq data and knowledge-based databases, such as Cell Ontology, Gene Ontology, KEGG, etc., ASURAT transforms gene expression tables into original multivariate tables, termed sign-by-sample matrices (SSMs).
Last updated
geneexpressionsinglecellsequencingclusteringgenesignalingcpp
4.52 score 22 scripts 391 downloads
plgem - Detect differential expression in microarray and proteomics datasets with the Power Law Global Error Model (PLGEM)
The Power Law Global Error Model (PLGEM) has been shown to faithfully model the variance-versus-mean dependence that exists in a variety of genome-wide datasets, including microarray and proteomics data. The use of PLGEM has been shown to improve the detection of differentially expressed genes or proteins in these datasets.
Last updated
immunooncologymicroarraydifferentialexpressionproteomicsgeneexpressionmassspectrometry
4.49 score 1 dependents 13 scripts 574 downloads
alabaster.vcf - Save and Load Variant Data to/from File
Save variant calling SummarizedExperiment to file and load them back as VCF objects. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentation
4.48 score 1 dependents 6 scripts 274 downloads
alabaster.bumpy - Save and Load BumpyMatrices to/from file
Save BumpyMatrix objects into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentation
4.48 score 1 dependents 7 scripts 304 downloads
alabaster.mae - Load and Save MultiAssayExperiments
Save MultiAssayExperiments into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated
dataimportdatarepresentation
4.48 score 1 dependents 6 scripts 348 downloads
ATACseqTFEA - Transcription Factor Enrichment Analysis for ATAC-seq
Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) is a technique to assess genome-wide chromatin accessibility by probing open chromatin with hyperactive mutant Tn5 Transposase that inserts sequencing adapters into open regions of the genome. ATACseqTFEA is an improvement of the current computational method that detects differential activity of transcription factors (TFs). ATACseqTFEA not only uses the difference of open region information, but also (or emphasizes) the difference of TFs footprints (cutting sites or insertion sites). ATACseqTFEA provides an easy, rigorous way to broadly assess TF activity changes between two conditions.
Last updated
sequencingdnaseqatacseqmnaseseqgeneregulation
4.48 score 1 stars 3 scripts 357 downloads
drugTargetInteractions - Drug-Target Interactions
Provides utilities for identifying drug-target interactions for sets of small molecule or gene/protein identifiers. The required drug-target interaction information is obtained from a local SQLite instance of the ChEMBL database. ChEMBL has been chosen for this purpose because it provides one of the most comprehensive and best annotated knowledge resources for drug-target information available in the public domain.
Last updated
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsproteomicsmetabolomics
4.48 score 1 stars 15 scripts 338 downloads
ComPrAn - Complexome Profiling Analysis package
This package is for analysis of SILAC labeled complexome profiling data. It uses a peptide table in tab-delimited format as input and produces ready-to-use tables and plots.
Last updated
massspectrometryproteomicsvisualization
4.48 score 6 scripts 278 downloads
TrajectoryGeometry - This Package Discovers Directionality in Time and Pseudo-times Series of Gene Expression Patterns
Given a time series or pseudo-times series of gene expression data, we might wish to know: Do the changes in gene expression in these data exhibit directionality? Are there turning points in this directionality? Do different subsets of the data move in different directions? This package uses spherical geometry to probe these sorts of questions. In particular, if we are looking at (say) the first n dimensions of the PCA of gene expression, directionality can be detected as the clustering of points on the (n-1)-dimensional sphere.
Last updated
biologicalquestionstatisticalmethodgeneexpressionsinglecell
4.48 score 15 scripts 282 downloads
ORFhunteR - Predict open reading frames in nucleotide sequences
The ORFhunteR package is an R and C++ library for automatic determination and annotation of open reading frames (ORF) in a large set of RNA molecules. It efficiently implements the machine learning model based on vectorization of nucleotide sequences and the random forest classification algorithm. The ORFhunteR package consists of a set of functions written in the R language in conjunction with C++. The efficiency of the package was confirmed by the examples of the analysis of RNA molecules from the NCBI RefSeq and Ensembl databases. The package can be used in basic and applied biomedical research related to the study of the transcriptome of normal as well as altered (for example, cancer) human cells.
Last updated
technologystatisticalmethodsequencingrnaseqclassificationfeatureextractioncpp
4.48 score 1 stars 334 downloads
RLassoCox - A reweighted Lasso-Cox by integrating gene interaction information
RLassoCox is a package that implements the RLasso-Cox model proposed by Wei Liu. The RLasso-Cox model integrates gene interaction information into the Lasso-Cox model for accurate survival prediction and survival biomarker discovery. It is based on the hypothesis that topologically important genes in the gene interaction network tend to have stable expression changes. The RLasso-Cox model uses random walk to evaluate the topological weight of genes, and then highlights topologically important genes to improve the generalization ability of the Lasso-Cox model. The RLasso-Cox model has the advantage of identifying small gene sets with high prognostic performance on independent datasets, which may play an important role in identifying robust survival biomarkers for various cancer types.
Last updated
survivalregressiongeneexpressiongenepredictionnetwork
4.48 score 3 stars 5 scripts 304 downloads
SCFA - SCFA: Subtyping via Consensus Factor Analysis
Subtyping via Consensus Factor Analysis (SCFA) can efficiently remove noisy signals from consistent molecular patterns in multi-omics data. SCFA first uses an autoencoder to select only important features and then repeatedly performs factor analysis to represent the data with different numbers of factors. Using these representations, it can reliably identify cancer subtypes and accurately predict risk scores of patients.
Last updated
survivalclusteringclassification
4.48 score 3 stars 3 scripts 304 downloads
customCMPdb - Customize and Query Compound Annotation Database
This package serves as a query interface for important community collections of small molecules, while also allowing users to include custom compound collections.
Last updated
softwarecheminformaticsannotationhubsoftware
4.48 score 1 stars 5 scripts 332 downloads
Informeasure - R implementation of information measures
This package consolidates a comprehensive set of information measurements, encompassing mutual information, conditional mutual information, interaction information, partial information decomposition, and part mutual information.
Last updated
geneexpressionnetworkinferencenetworksoftware
4.48 score 3 stars 10 scripts 296 downloads
MACSQuantifyR - Fast treatment of MACSQuantify FACS data
Automatically processes the metadata of the MACSQuantify FACS sorter. It runs multiple modules: i) import of the raw file and graphical selection of duplicates in the well plate, ii) computation of statistics on the data, and iii) optional computation of the combination index.
Last updated
dataimportpreprocessingnormalizationflowcytometrydatarepresentationgui
4.48 score 10 scripts 325 downloads
microbiomeDASim - Microbiome Differential Abundance Simulation
A toolkit for simulating differential microbiome data designed for longitudinal analyses. Several functional forms may be specified for the mean trend. Observations are drawn from a multivariate normal model. The objective of this package is to be able to simulate data in order to accurately compare different longitudinal methods for differential abundance.
Last updated
microbiomevisualizationsoftware
4.48 score 3 stars 4 scripts 332 downloads
GOTHiC - Binomial test for Hi-C data analysis
This is a Hi-C analysis package using a cumulative binomial test to detect interactions between distal genomic loci that have significantly more reads than expected by chance in Hi-C experiments. It takes mapped paired NGS reads as input and gives back the list of significant interactions for a given bin size in the genome.
Last updated
immunooncologysequencingpreprocessingepigeneticshic
4.48 score 7 scripts 534 downloads
RNAmodR.RiboMethSeq - Detection of 2'-O methylations by RiboMethSeq
RNAmodR.RiboMethSeq implements the detection of 2'-O methylations on RNA from experimental data generated with the RiboMethSeq protocol. The package builds on the core functionality of the RNAmodR package to detect specific patterns of the modifications in high throughput sequencing data.
Last updated
softwareworkflowstepvisualizationsequencingbioconductormodificationsribomethseqrnarnamodr
4.48 score 1 stars 4 scripts 315 downloads
IVAS - Identification of genetic Variants affecting Alternative Splicing
Identification of genetic variants affecting alternative splicing.
Last updated
immunooncologyalternativesplicingdifferentialexpressiondifferentialsplicinggeneexpressiongeneregulationregressionrnaseqsequencingsnpsoftwaretranscription
4.48 score 1 scripts 376 downloads
TEQC - Quality control for target capture experiments
Target capture experiments combine hybridization-based (in solution or on microarrays) capture and enrichment of genomic regions of interest (e.g. the exome) with high throughput sequencing of the captured DNA fragments. This package provides functionalities for assessing and visualizing the quality of the target enrichment process, like specificity and sensitivity of the capture, per-target read coverage and so on.
Last updated
qualitycontrolmicroarraysequencinggenetics
4.48 score 8 scripts 421 downloads
snm - Supervised Normalization of Microarrays
SNM is a modeling strategy especially designed for normalizing high-throughput genomic data. The underlying premise of our approach is that your data is a function of what we refer to as study-specific variables. These variables are either biological variables that represent the target of the statistical analysis, or adjustment variables that represent factors arising from the experimental or biological setting the data is drawn from. The SNM approach aims to simultaneously model all study-specific variables in order to more accurately characterize the biological or clinical variables of interest.
Last updated
microarrayonechanneltwochannelmultichanneldifferentialexpressionexonarraygeneexpressiontranscriptionmultiplecomparisonpreprocessingqualitycontrol
4.47 score 73 scripts 408 downloads
NetActivity - Compute gene set scores from a deep learning framework
NetActivity computes gene set scores from previously trained sparsely-connected autoencoders. The package contains a function to prepare the data (`prepareSummarizedExperiment`) and a function to compute the gene set scores (`computeGeneSetScores`). The package `NetActivityData` contains different pre-trained models to be directly applied to the data. Alternatively, users can compute gene set scores using custom models.
Last updated
rnaseqmicroarraytranscriptionfunctionalgenomicsgogeneexpressionpathwayssoftware
4.46 score 29 scripts 276 downloads
ToxicoGx - Analysis of Large-Scale Toxico-Genomic Data
Contains a set of functions to perform large-scale analysis of toxicogenomic data, providing a standardized data structure to hold information relevant to annotation, visualization and statistical analysis of toxicogenomic data.
Last updated
geneexpressionpharmacogeneticspharmacogenomicssoftware
4.46 score 29 scripts 366 downloads
COSNet - Cost Sensitive Network for node label prediction on graphs with highly unbalanced labelings
Package that implements the COSNet classification algorithm. The algorithm predicts node labels in partially labeled graphs where few positives are available for the class being predicted.
Last updated
graphandnetworkclassificationnetworkneuralnetwork
4.46 score 1 stars 18 scripts 368 downloads
a4Preproc - Automated Affymetrix Array Analysis Preprocessing Package
Utility functions to pre-process data for the Automated Affymetrix Array Analysis set of packages.
Last updated
microarraypreprocessing
4.46 score 3 dependents 16 scripts 572 downloads
pdInfoBuilder - Platform Design Information Package Builder
Builds platform design information packages. These consist of a SQLite database containing feature-level data such as x, y position on chip and featureSet ID. The database also incorporates featureSet-level annotation data. The products of this package are used by the oligo package.
Last updated
annotationinfrastructure
4.46 score 18 scripts 462 downloads
mumosa - Multi-Modal Single-Cell Analysis Methods
Assorted utilities for multi-modal analyses of single-cell datasets. Includes functions to combine multiple modalities for downstream analysis, perform MNN-based batch correction across multiple modalities, and to compute correlations between assay values for different modalities.
Last updated
immunooncologysinglecellrnaseq
4.43 score 18 scripts 358 downloads
metabolomicsWorkbenchR - Metabolomics Workbench in R
This package provides functions for interfacing with the Metabolomics Workbench RESTful API. Study, compound, protein and gene information can be searched for using the API. Methods to obtain study data in common Bioconductor formats such as SummarizedExperiment and MultiAssayExperiment are also included.
Last updated
softwaremetabolomics
4.41 score 13 scripts 392 downloads
SynMut - SynMut: Designing Synonymously Mutated Sequences with Different Genomic Signatures
There are increasing demands on designing virus mutants with specific dinucleotide or codon composition. This tool can take both dinucleotide preference and/or codon usage bias into account while designing mutants. It is a powerful tool for in silico designs of DNA sequence mutants.
Last updated
sequencematchingexperimentaldesignpreprocessing
4.41 score 2 stars 13 scripts 348 downloads
XDE - XDE: a Bayesian hierarchical model for cross-study analysis of differential gene expression
Multi-level model for cross-study detection of differential gene expression.
Last updated
microarraydifferentialexpressioncpp
4.41 score 16 scripts 538 downloads
netboost - Network Analysis Supported by Boosting
Boosting supported network analysis for high-dimensional omics applications. This package comes bundled with the MC-UPGMA clustering package by Yaniv Loewenstein.
Last updated
softwarestatisticalmethodgraphandnetworknetworkclusteringdimensionreductionbiomedicalinformaticsepigeneticsmetabolomicstranscriptomicscpp
4.40 score 9 scripts 351 downloads
CGHregions - Dimension Reduction for Array CGH Data with Minimal Information Loss.
Dimension Reduction for Array CGH Data with Minimal Information Loss
Last updated
microarraycopynumbervariationvisualization
4.39 score 31 scripts 420 downloads
ccImpute - ccImpute: an accurate and scalable consensus clustering based approach to impute dropout events in the single-cell RNA-seq data (https://doi.org/10.1186/s12859-022-04814-8)
Dropout events make the lowly expressed genes indistinguishable from true zero expression and different than the low expression present in cells of the same type. This issue makes any subsequent downstream analysis difficult. ccImpute is an imputation algorithm that uses cell similarity established by consensus clustering to impute the most probable dropout events in the scRNA-seq datasets. ccImpute demonstrated performance which exceeds the performance of existing imputation approaches while introducing the least amount of new noise as measured by clustering performance characteristics on datasets with known cell identities.
Last updated
singlecellsequencingprincipalcomponentdimensionreductionclusteringrnaseqtranscriptomicsopenblascppopenmp
4.38 score 2 stars 12 scripts 294 downloads
GenomAutomorphism - Compute the automorphisms between DNA's Abelian group representations
This is an R package to compute the automorphisms between pairwise aligned DNA sequences represented as elements from a Genomic Abelian group. In a general scenario, anything from genomic regions up to whole genomes from a given population (from any species or closely related species) can be algebraically represented as a direct sum of cyclic groups or, more specifically, Abelian p-groups. Basically, we propose the representation of multiple sequence alignments of length N bp as elements of a finite Abelian group created by the direct sum of homocyclic Abelian groups of prime-power order.
Last updated
mathematicalbiologycomparativegenomicsfunctionalgenomicsmultiplesequencealignmentwholegenomegenetic-codegenetic-code-algebragenomegenome-algebra
4.38 score 12 scripts 346 downloads
scDDboost - A compositional model to assess expression changes from single-cell rna-seq data
scDDboost is an R package to analyze changes in the distribution of single-cell expression data between two experimental conditions. Compared to other methods that assess differential expression, scDDboost benefits uniquely from information conveyed by the clustering of cells into cellular subtypes. Through a novel empirical Bayesian formulation it calculates gene-specific posterior probabilities that the marginal expression distribution is the same (or different) between the two conditions. The implementation in scDDboost treats gene-level expression data within each condition as a mixture of negative binomial distributions.
Last updated
singlecellsoftwareclusteringsequencinggeneexpressiondifferentialexpressionbayesiancpp
4.38 score 24 scripts 342 downloads
genomicInstability - Genomic Instability estimation for scRNA-Seq
This package contains functions to run genomic instability analysis (GIA) from scRNA-Seq data. GIA estimates the association between gene expression and genomic location of the coding genes. It uses the aREA algorithm to quantify the enrichment of sets of contiguous genes (loci-blocks) on the gene expression profiles and estimates the Genomic Instability Score (GIS) for each analyzed cell.
Last updated
systemsbiologygeneexpressionsinglecell
4.38 score 5 stars 12 scripts 330 downloads
getDEE2 - Programmatic access to the DEE2 RNA expression dataset
Digital Expression Explorer 2 (or DEE2 for short) is a repository of processed RNA-seq data in the form of counts. It was designed so that researchers could undertake re-analysis and meta-analysis of published RNA-seq studies quickly and easily. As of April 2020, over 1 million SRA datasets have been processed. This package provides an R interface to access these expression data. More information about the DEE2 project can be found at the project homepage (http://dee2.io) and main publication (https://doi.org/10.1093/gigascience/giz022).
Last updated
geneexpressiontranscriptomicssequencingbioinformaticsdata-mininggenomicsrna-expressionrna-seq
4.38 score 4 stars 7 scripts 348 downloads
CeTF - Coexpression for Transcription Factors using Regulatory Impact Factors and Partial Correlation and Information Theory analysis
This package provides the necessary functions for performing the Partial Correlation coefficient with Information Theory (PCIT) (Reverter and Chan 2008) and Regulatory Impact Factors (RIF) (Reverter et al. 2010) algorithms. The PCIT algorithm identifies meaningful correlations to define edges in a weighted network and can be applied to any correlation-based network including but not limited to gene co-expression networks, while the RIF algorithm identifies critical Transcription Factors (TF) from gene expression data. These two algorithms, when combined, provide a very relevant layer of information for gene expression studies (Microarray, RNA-seq and single-cell RNA-seq data).
Last updated
sequencingrnaseqmicroarraygeneexpressiontranscriptionnormalizationdifferentialexpressionsinglecellnetworkregressionchipseqimmunooncologycoveragecpp
4.38 score 12 scripts 411 downloads
TPP2D - Detection of ligand-protein interactions from 2D thermal profiles (DLPTP)
Detection of ligand-protein interactions from 2D thermal profiles (DLPTP). Performs an FDR-controlled analysis of 2D-TPP experiments by functional analysis of dose-response curves across temperatures.
Last updated
softwareproteomicsdataimport
4.38 score 16 scripts 350 downloads
anota - ANalysis Of Translational Activity (ANOTA).
Genome-wide studies of translational control are emerging as a tool to study various biological conditions. The output from such analysis is both the mRNA level (e.g. cytosolic mRNA level) and the level of mRNA actively involved in translation (the actively translating mRNA level) for each mRNA. The standard analysis of such data strives towards identifying differential translation between two or more sample classes - i.e. differences in actively translated mRNA levels that are independent of underlying differences in cytosolic mRNA levels. This package allows for such analysis using partial variances and the random variance model. As tens of thousands of mRNAs are analyzed in parallel, the library performs a number of tests to ensure that the data set is suitable for such analysis.
Last updated
geneexpressiondifferentialexpressionmicroarraysequencing
4.38 score 1 dependents 2 scripts 526 downloads
a4Core - Automated Affymetrix Array Analysis Core Package
Utility functions for the Automated Affymetrix Array Analysis set of packages.
Last updated
microarrayclassification
4.38 score 4 dependents 7 scripts 595 downloads
ddCt - The ddCt Algorithm for the Analysis of Quantitative Real-Time PCR (qRT-PCR)
The Delta-Delta-Ct (ddCt) Algorithm is an approximation method to determine relative gene expression with quantitative real-time PCR (qRT-PCR) experiments. Compared to other approaches, it requires no standard curve for each primer-target pair, which reduces the workload while still returning sufficiently accurate results as long as the assumptions about amplification efficiency hold. The ddCt package implements a pipeline to collect, analyse and visualize qRT-PCR results, for example those from TaqMan SDM software, mainly using the ddCt method. The pipeline can be invoked either by a script on the command line or through the API consisting of S4 classes, methods and functions.
Last updated
geneexpressiondifferentialexpressionmicrotitreplateassayqpcr
4.38 score 7 scripts 504 downloads
mogsa - Multiple omics data integrative clustering and gene set analysis
This package provides a method for doing gene set analysis based on multiple omics data.
Last updated
geneexpressionprincipalcomponentstatisticalmethodclusteringsoftware
4.37 score 58 scripts 591 downloads
DepInfeR - Inferring tumor-specific cancer dependencies through integrating ex-vivo drug response assays and drug-protein profiling
DepInfeR integrates two experimentally accessible input data matrices: the drug sensitivity profiles of cancer cell lines or primary tumors ex-vivo (X), and the drug affinities of a set of proteins (Y), to infer a matrix of molecular protein dependencies of the cancers (ß). DepInfeR deconvolutes the protein inhibition effect on the viability phenotype by using regularized multivariate linear regression. It assigns a “dependence coefficient” to each protein and each sample, and therefore could be used to gain a causal and accurate understanding of functional consequences of genomic aberrations in a heterogeneous disease, as well as to guide the choice of pharmacological intervention for a specific cancer type, sub-type, or an individual patient. For more information, please read our preprint on bioRxiv: https://doi.org/10.1101/2022.01.11.475864.
Last updated
softwareregressionpharmacogeneticspharmacogenomicsfunctionalgenomics
4.36 score 1 stars 23 scripts 319 downloads
protGear - Protein Micro Array Data Management and Interactive Visualization
A generic three-step pre-processing package for protein microarray data. This package contains different data pre-processing procedures to allow comparison of their performance. These steps are background correction, the coefficient of variation (CV) based filtering, batch correction and normalization.
Last updated
microarrayonechannelpreprocessingbiomedicalinformaticsproteomicsbatcheffectnormalizationbayesianclusteringregressionsystemsbiologyimmunooncologybackground-correctionmicroarray-datanormalisationproteomics-datashinyshinydashboard
4.34 score 1 stars 11 scripts 436 downloads
HiCDCPlus - Hi-C Direct Caller Plus
Systematic 3D interaction calls and differential analysis for Hi-C and HiChIP. The HiC-DC+ (Hi-C/HiChIP direct caller plus) package enables principled statistical analysis of Hi-C and HiChIP data sets – including calling significant interactions within a single experiment and performing differential analysis between conditions given replicate experiments – to facilitate global integrative studies. HiC-DC+ estimates significant interactions in a Hi-C or HiChIP experiment directly from the raw contact matrix for each chromosome up to a specified genomic distance, binned by uniform genomic intervals or restriction enzyme fragments, by training a background model to account for random polymer ligation and systematic sources of read count variation.
Last updated
hicdna3dstructuresoftwarenormalizationzlibcpp
4.34 score 22 scripts 372 downloads
blacksheepr - Outlier Analysis for pairwise differential comparison
Blacksheep is a tool designed for outlier analysis in the context of pairwise comparisons in an effort to find distinguishing characteristics from two groups. This tool was designed to be applied for biological applications such as phosphoproteomics or transcriptomics, but it can be used for any data that can be represented by a 2D table, and has two sub populations within the table to compare.
Last updated
sequencingrnaseqgeneexpressiontranscriptiondifferentialexpressiontranscriptomics
4.34 score 11 scripts 386 downloads
PAA - PAA (Protein Array Analyzer)
PAA imports single color (protein) microarray data that has been saved in gpr file format - esp. ProtoArray data. After preprocessing (background correction, batch filtering, normalization) univariate feature preselection is performed (e.g., using the "minimum M statistic" approach - hereinafter referred to as "mMs"). Subsequently, a multivariate feature selection is conducted to discover biomarker candidates. For this purpose, either a frequency-based backwards elimination approach or ensemble feature selection can be used. PAA provides a complete toolbox of analysis tools including several different plots for results examination and evaluation.
Last updated
classificationmicroarrayonechannelproteomicscpp
4.34 score 11 scripts 432 downloads
chihaya - Save Delayed Operations to an HDF5 File
Saves the delayed operations of a DelayedArray to an HDF5 file. This enables efficient recovery of the DelayedArray's contents in other languages and analysis frameworks.
Last updated
dataimportdatarepresentationcurlopensslcpp
4.32 score 21 scripts 389 downloads
tomoseqr - R Package for Analyzing Tomo-seq Data
`tomoseqr` is an R package for analyzing Tomo-seq data. Tomo-seq is a genome-wide RNA tomography method that combines high-throughput RNA sequencing with cryosectioning for spatially resolved transcriptomics. `tomoseqr` reconstructs 3D expression patterns from Tomo-seq data and visualizes the reconstructed 3D expression patterns.
Last updated
geneexpressionsequencingrnaseqtranscriptomicsspatialvisualizationsoftware
4.32 score 21 scripts 322 downloads
gpls - Classification using generalized partial least squares
Classification using generalized partial least squares for two-group and multi-group (more than 2 group) classification.
Last updated
classificationmicroarrayregression
4.32 score 21 scripts 452 downloads
codelink - Manipulation of Codelink microarray data
This package facilitates reading, preprocessing and manipulating Codelink microarray data. The raw data must be exported as a text file using the Codelink software.
Last updated
microarrayonechanneldataimportpreprocessing
4.32 score 13 scripts 616 downloads
Icens - NPMLE for Censored and Truncated Data
Many functions for computing the NPMLE for censored and truncated data.
Last updated
infrastructure
4.31 score 8 dependents 43 scripts 838 downloads
SVMDO - Identification of Tumor-Discriminating mRNA Signatures via Support Vector Machines Supported by Disease Ontology
It is an easy-to-use GUI using disease information for detecting tumor/normal sample discriminating gene sets from differentially expressed genes. Our approach is based on an iterative algorithm filtering genes with disease ontology enrichment analysis and the Wilks' lambda criterion, connected to SVM classification model construction. Along with gene set extraction, SVMDO also provides individual prognostic marker detection. The algorithm is designed for FPKM and RPKM normalized RNA-Seq transcriptome datasets.
Last updated
genesetenrichmentdifferentialexpressionguiclassificationrnaseqtranscriptomicssurvivalmachine-learningrna-seqshiny
4.30 score 7 scripts 302 downloads
FeatSeekR - FeatSeekR an R package for unsupervised feature selection
FeatSeekR performs unsupervised feature selection using replicated measurements. It iteratively selects features with the highest reproducibility across replicates, after projecting out those dimensions from the data that are spanned by the previously selected features. The selected set of features has a high replicate reproducibility and a high degree of uniqueness.
Last updated
softwarestatisticalmethodfeatureextractionmassspectrometry
4.30 score 2 stars 3 scripts 266 downloads
metabinR - Abundance and Compositional Based Binning of Metagenomes
Provides functions for performing abundance and compositional based binning on metagenomic samples, directly from FASTA or FASTQ files. Functions are implemented in Java and called via rJava. The parallel implementation operates directly on input FASTA/FASTQ files for fast execution. Inputs may be file paths or Biostrings/ShortRead sequence objects; results are returned as a MetabinResult S4 object wrapping cluster assignments, algorithm parameters, and input metadata.
Last updated
classificationclusteringmicrobiomesequencingsoftwareopenjdk
4.30 score 2 stars 2 scripts 306 downloads
magrene - Motif Analysis In Gene Regulatory Networks
magrene allows the identification and analysis of graph motifs in (duplicated) gene regulatory networks (GRNs), including lambda, V, PPI V, delta, and bifan motifs. GRNs can be tested for motif enrichment by comparing motif frequencies to a null distribution generated from degree-preserving simulated GRNs. Motif frequencies can be analyzed in the context of gene duplications to explore the impact of small-scale and whole-genome duplications on gene regulatory networks. Finally, users can calculate interaction similarity for gene pairs based on the Sorensen-Dice similarity index.
Last updated
softwaremotifdiscoverynetworkenrichmentsystemsbiologygraphandnetworkgene-regulatory-networkmotif-analysisnetwork-motifsnetwork-science
4.30 score 2 stars 5 scripts 299 downloads
updateObject - Find/fix old serialized S4 instances
A set of tools built around updateObject() to work with old serialized S4 instances. The package is primarily useful to package maintainers who want to update the serialized S4 instances included in their package. This is still work-in-progress.
Last updated
infrastructuredatarepresentationbioconductor-packagecore-package
4.30 score 1 stars 4 scripts 308 downloads
Cogito - Compare genomic intervals tool - Automated, complete, reproducible and clear report about genomic and epigenomic data sets
Biological studies often consist of multiple conditions which are examined with different laboratory setups such as RNA-sequencing or ChIP-sequencing. To get an overview of the whole resulting data set, Cogito provides an automated, complete, reproducible and clear report about all samples and basic comparisons between all different samples. This report can be used as documentation of the data set or as a starting point for further custom analysis.
Last updated
functionalgenomicsgeneregulationsoftwaresequencing
4.30 score 3 scripts 314 downloads
cytoKernel - Differential expression using kernel-based score test
cytoKernel implements a kernel-based score test to identify differentially expressed features in high-dimensional biological experiments. This approach can be applied across many different high-dimensional biological data including gene expression data and dimensionally reduced cytometry-based marker expression data. In this R package, we implement functions that compute the feature-wise p values and their corresponding adjusted p values. It also computes the feature-wise shrunk effect sizes and their corresponding shrunken effect size. Further, it calculates the percent of differentially expressed features and plots a user-friendly heatmap with the top differentially expressed features on the rows and samples on the columns.
Last updated
immunooncologyproteomicssinglecellsoftwareonechannelflowcytometrydifferentialexpressiongeneexpressionclusteringcpp
4.30 score 2 stars 4 scripts 320 downloads
wppi - Weighting protein-protein interactions
Protein-protein interaction data is essential for omics data analysis and modeling. Database knowledge is general, not specific for cell type, physiological condition or any other context determining which connections are functional and contribute to the signaling. Functional annotations such as Gene Ontology and Human Phenotype Ontology might help to evaluate the relevance of interactions. This package predicts functional relevance of protein-protein interactions based on functional annotations such as Human Phenotype Ontology and Gene Ontology, and prioritizes genes based on network topology, functional scores and a path search algorithm.
Last updated
graphandnetworknetworkpathwayssoftwaregenesignalinggenetargetsystemsbiologytranscriptomicsannotationgene-ontologygene-prioritizationhuman-phenotype-ontologyomnipathppi-networksrandom-walk-with-restartquarto
4.30 score 1 stars 4 scripts 385 downloads
cyanoFilter - Phytoplankton Population Identification using Cell Pigmentation and/or Complexity
An approach to filter out and/or identify phytoplankton cells from all particles measured via flow cytometry, using pigment and cell complexity information. It does this using a sequence of one-dimensional gates on pre-defined channels measuring certain pigmentation and complexity. The package is especially tuned for cyanobacteria, but will work fine for phytoplankton communities where there is at least one cell characteristic that differentiates every phytoplankton in the community.
Last updated
flowcytometryclusteringonechannel
4.30 score 4 scripts 350 downloads
CIMICE - CIMICE-R: (Markov) Chain Method to Infer Cancer Evolution
CIMICE is a tool in the field of tumor phylogenetics and its goal is to build a Markov Chain (called Cancer Progression Markov Chain, CPMC) in order to model tumor subtypes evolution. The input of CIMICE is a Mutational Matrix, i.e. a Boolean matrix representing altered genes in a collection of samples. These samples are assumed to be obtained with single-cell DNA analysis techniques and the tool is specifically written to use the peculiarities of this data for the CPMC construction.
Last updated
softwarebiologicalquestionnetworkinferenceresearchfieldphylogeneticsstatisticalmethodgraphandnetworktechnologysinglecell
4.30 score 5 scripts 340 downloads
geva - Gene Expression Variation Analysis (GEVA)
Statistical methods to evaluate variations of differential expression (DE) between multiple biological conditions. It takes into account the fold-changes and p-values from previous differential expression (DE) results that use large-scale data (e.g., microarray and RNA-seq) and evaluates which genes would react in response to the distinct experiments. This evaluation involves a unique pipeline of statistical methods, including weighted summarization, quantile detection, cluster analysis, and ANOVA tests, in order to classify a subset of relevant genes whose DE is similar or dependent on certain biological factors.
Last updated
classificationdifferentialexpressiongeneexpressionmicroarraymultiplecomparisonrnaseqsystemsbiologytranscriptomics
4.30 score 2 stars 4 scripts 355 downloads
midasHLA - R package for immunogenomics data handling and association analysis
MiDAS is an R package for immunogenetics data transformation and statistical analysis. MiDAS accepts input data in the form of HLA alleles and KIR types, and can transform it into biologically meaningful variables, enabling HLA amino acid fine mapping, analyses of HLA evolutionary divergence, KIR gene presence, as well as validated HLA-KIR interactions. Further, it allows comprehensive statistical association analysis workflows with phenotypes of diverse measurement scales. MiDAS closes a gap between the inference of immunogenetic variation and its efficient utilization to make relevant discoveries related to T cell, Natural Killer cell, and disease biology.
Last updated
cellbiologygeneticsstatisticalmethod
4.30 score 9 scripts 346 downloadsMAGAR - MAGAR: R-package to compute methylation Quantitative Trait Loci (methQTL) from DNA methylation and genotyping data
"Methylation-Aware Genotype Association in R" (MAGAR) computes methQTL from DNA methylation and genotyping data from matched samples. MAGAR uses a linear modeling strategy to call CpGs/SNPs that are methQTLs. MAGAR accounts for the local correlation structure of CpGs.
Last updated
regressionepigeneticsdnamethylationsnpgeneticvariabilitymethylationarraymicroarraycpgislandmethylseqsequencingmrnamicroarraypreprocessingcopynumbervariationtwochannelimmunooncologydifferentialmethylationbatcheffectqualitycontroldataimportnetworkclusteringgraphandnetwork
4.30 score 3 scripts 336 downloadsSummix - Summix2: A suite of methods to estimate, adjust, and leverage substructure in genetic summary data
This package contains the Summix2 method for estimating and adjusting for substructure in genetic summary allele frequency data. The function summix() estimates reference group proportions using a mixture model. The adjAF() function produces adjusted allele frequencies for an observed group with reference group proportions matching a target individual or sample. The summix_local() function estimates local ancestry mixture proportions and performs selection scans in genetic summary data.
Last updated
statisticalmethodwholegenomegenetics
4.30 score 20 scripts 276 downloadscenscyt - Differential abundance analysis with a right censored covariate in high-dimensional cytometry
Methods for differential abundance analysis in high-dimensional cytometry data when a covariate is subject to right censoring (e.g. survival time) based on multiple imputation and generalized linear mixed models.
Last updated
immunooncologyflowcytometryproteomicssinglecellcellbasedassayscellbiologyclusteringfeatureextractionsoftwaresurvival
4.30 score 7 scripts 312 downloadsADImpute - Adaptive Dropout Imputer (ADImpute)
Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values (‘dropout imputation’). Most existing dropout imputation methods are limited in the sense that they exclusively use the scRNA-seq dataset at hand and do not exploit external gene-gene relationship information. Here we propose two novel methods: a gene regulatory network-based approach using gene-gene relationships learnt from external data and a baseline approach corresponding to a sample-wide average. ADImpute can implement these novel methods and also combine them with existing imputation methods (currently supported: DrImpute, SAVER). ADImpute can learn the best performing method per gene and combine the results from different methods into an ensemble.
Last updated
geneexpressionnetworkpreprocessingsequencingsinglecelltranscriptomics
4.30 score 10 scripts 512 downloadsAnVILBilling - Provide functions to retrieve and report on usage expenses in NHGRI AnVIL (anvilproject.org).
AnVILBilling helps monitor AnVIL-related costs in R, using queries to a BigQuery table to which costs are exported daily. Functions are defined to help categorize tasks and associated expenditures, and to visualize and explore expense profiles over time. This package will be expanded to help users estimate costs for specific task sets.
Last updated
infrastructuresoftware
4.30 score 5 scripts 313 downloadsPathNet - An R package for pathway analysis using topological information
PathNet uses topological information present in pathways and differential expression levels of genes (obtained from microarray experiments) to identify pathways that are 1) significantly enriched and 2) associated with each other in the context of differential expression. The algorithm is described in: PathNet: A tool for pathway analysis using topological information. Dutta B, Wallqvist A, and Reifman J. Source Code for Biology and Medicine 2012 Sep 24;7(1):10.
Last updated
pathwaysdifferentialexpressionmultiplecomparisonkeggnetworkenrichmentnetwork
4.30 score 8 scripts 398 downloadsSigsPack - Mutational Signature Estimation for Single Samples
Single sample estimation of exposure to mutational signatures. Exposures to known mutational signatures are estimated for single samples, based on quadratic programming algorithms. Bootstrapping the input mutational catalogues provides estimations on the stability of these exposures. The effect of the sequence composition of mutational context can be taken into account by normalising the catalogues.
Last updated
somaticmutationsnpvariantannotationbiomedicalinformaticsdnaseq
4.30 score 2 stars 5 scripts 358 downloadsscBFA - A dimensionality reduction tool using gene detection pattern to mitigate noisy expression profile of scRNA-seq
This package is designed to model the gene detection pattern of scRNA-seq data through a binary factor analysis model. The model allows the user to pass in a cell-level covariate matrix X and a gene-level covariate matrix Q to account for nuisance variance (e.g., batch effects), and it outputs a low-dimensional embedding matrix for downstream analysis.
Last updated
singlecelltranscriptomicsdimensionreductiongeneexpressionatacseqbatcheffectkeggqualitycontrol
4.30 score 6 scripts 368 downloadsNanoStringDiff - Differential Expression Analysis of NanoString nCounter Data
This package utilizes a generalized linear model (GLM) of the negative binomial family to characterize count data and allows for multi-factor designs. NanoStringDiff incorporates size factors, calculated from positive controls and housekeeping controls, and background level, obtained from negative controls, in the model framework so that all the normalization information provided by the NanoString nCounter Analyzer is fully utilized.
Last updated
differentialexpressionnormalizationcpp
4.30 score 9 scripts 367 downloadsTIN - Transcriptome instability analysis
The TIN package implements a set of tools for transcriptome instability analysis based on exon expression profiles. Deviating exon usage is studied in the context of splicing factors to analyse to what degree transcriptome instability is correlated to splicing factor expression. In the transcriptome instability correlation analysis, the data is compared to both random permutations of alternative splicing scores and expression of random gene sets.
Last updated
exonarraymicroarraygeneexpressionalternativesplicinggeneticsdifferentialsplicing
4.30 score 10 scripts 473 downloadsInPAS - Identify Novel Alternative PolyAdenylation Sites (PAS) from RNA-seq data
Alternative polyadenylation (APA) is one of the important post-transcriptional regulation mechanisms and occurs in most human genes. InPAS facilitates the discovery of novel APA sites and the differential usage of APA sites from RNA-Seq data. It leverages cleanUpdTSeq to fine-tune identified APA sites by removing false sites.
Last updated
alternative polyadenylationdifferential polyadenylation site usagerna-seqgene regulationtranscription
4.30 score 7 scripts 470 downloadsiSeq - Bayesian Hierarchical Modeling of ChIP-seq Data Through Hidden Ising Models
Bayesian hidden Ising models are implemented to identify IP-enriched genomic regions from ChIP-seq data. They can be used to analyze ChIP-seq data with and without controls and replicates.
Last updated
chipseqsequencing
4.30 score 7 scripts 387 downloadsAgiMicroRna - Processing and Differential Expression Analysis of Agilent microRNA chips
Processing and Analysis of Agilent microRNA data
Last updated
microarrayagilentchiponechannelpreprocessingdifferentialexpression
4.30 score 8 scripts 695 downloadsflagme - Analysis of Metabolomics GC/MS Data
Fragment-level analysis of gas chromatography-mass spectrometry metabolomics data.
Last updated
differentialexpressionmassspectrometry
4.30 score 2 scripts 394 downloadsMEDME - Modelling Experimental Data from MeDIP Enrichment
MEDME allows the prediction of absolute and relative methylation levels based on measures obtained by MeDIP-microarray experiments
Last updated
microarraycpgislanddnamethylation
4.30 score 2 scripts 430 downloadsmiRNApath - miRNApath: Pathway Enrichment for miRNA Expression Data
This package provides pathway enrichment techniques for miRNA expression data. Specifically, the set of methods handles the many-to-many relationship between miRNAs and the multiple genes they are predicted to target (and thus affect). It also handles the gene-to-pathway relationships separately. Both steps are designed to preserve the additive effects of miRNAs on genes: many miRNAs affecting one gene, one miRNA affecting multiple genes, or many miRNAs affecting many genes.
Last updated
annotationpathwaysdifferentialexpressionnetworkenrichmentmirna
4.30 score 6 scripts 324 downloadsrbsurv - Robust likelihood-based survival modeling with microarray data
This package selects genes associated with survival.
Last updated
microarray
4.30 score 9 scripts 484 downloadsSLqPCR - Functions for analysis of real-time quantitative PCR data at SIRS-Lab GmbH
Functions for analysis of real-time quantitative PCR data at SIRS-Lab GmbH
Last updated
microtitreplateassayqpcr
4.30 score 7 scripts 379 downloadscghMCR - Find chromosome regions showing common gains/losses
This package provides functions to identify genomic regions of interest based on segmented copy number data from multiple samples.
Last updated
microarraycopynumbervariation
4.30 score 5 scripts 569 downloadsgenArise - Microarray Analysis tool
genArise is an easy-to-use tool for dual color microarray data. Its Tk-based GUI environment lets any inexperienced user perform a basic, but not simple, data analysis just by following a wizard. In addition, it provides some tools for the developer.
Last updated
microarraytwochannelpreprocessing
4.30 score 7 scripts 404 downloads
epimutacions - Robust outlier identification for DNA methylation data
The package includes several statistical outlier detection methods for detecting epimutations in DNA methylation data. The methods included in the package are MANOVA, multivariate linear models, isolation forest, robust Mahalanobis distance, quantile and beta. The methods compare a case sample with a suspected disease against a reference panel (composed of healthy individuals) to identify epimutations in the given case sample. It also contains functions to annotate and visualize the identified epimutations.
Last updated
dnamethylationbiologicalquestionpreprocessingstatisticalmethodnormalizationcpp
4.30 score 33 scripts 336 downloadsPoDCall - Positive Droplet Calling for DNA Methylation Droplet Digital PCR
Reads files exported from 'QX Manager or QuantaSoft' containing amplitude values from a run of ddPCR (96 well plate) and robustly sets thresholds to determine positive droplets for each channel of each individual well. Concentration and normalized concentration, in addition to other metrics, are then calculated for each well. Results are returned as a table, optionally written to file, along with optional plots (scatterplot and histogram) for both channels per well written to file. The package includes a shiny application which provides an interactive and user-friendly interface to the full functionality of PoDCall.
Last updated
classificationepigeneticsddpcrdifferentialmethylationcpgislanddnamethylation
4.29 score 13 scripts 286 downloadsribosomeProfilingQC - Ribosome Profiling Quality Control
Ribo-Seq (also named ribosome profiling or footprinting) measures the translatome (unlike RNA-Seq, which sequences the transcriptome) by direct quantification of the ribosome-protected fragments (RPFs). This package provides the tools for quality assessment of ribosome profiling. In addition, it can preprocess Ribo-Seq data for subsequent differential analysis.
Last updated
riboseqsequencinggeneregulationqualitycontrolvisualizationcoverage
4.29 score 13 scripts 404 downloadsISLET - Individual-Specific ceLl typE referencing Tool
ISLET is a method to conduct signal deconvolution for general -omics data. It can estimate the individual-specific and cell-type-specific reference panels when there are multiple samples observed from each subject. It takes as input the observed mixture data (feature by sample matrix), the cell type mixture proportions (sample by cell type matrix), and the sample-to-subject information. It can solve for the reference panel on an individual basis and conduct tests to identify cell-type-specific differential expression (csDE) genes. It also improves estimated cell type mixture proportions by integrating personalized reference panels.
Last updated
softwarernaseqtranscriptomicstranscriptionsequencinggeneexpressiondifferentialexpressiondifferentialmethylation
4.28 score 19 scripts 326 downloadsscHOT - single-cell higher order testing
Single cell Higher Order Testing (scHOT) is an R package that facilitates testing changes in higher order structure of gene expression along either a developmental trajectory or across space. scHOT is general and modular in nature, can be run in multiple data contexts such as along a continuous trajectory, between discrete groups, and over spatial orientations; as well as accommodate any higher order measurement such as variability or correlation. scHOT meaningfully adds to first order effect testing, such as differential expression, and provides a framework for interrogating higher order interactions from single cell data.
Last updated
geneexpressionrnaseqsequencingsinglecellsoftwaretranscriptomics
4.28 score 19 scripts 382 downloadscliProfiler - A package for the CLIP data visualization
An easy and fast way to visualize and profile the high-throughput IP data. This package generates the meta gene profile and other profiles. These profiles could provide valuable information for understanding the IP experiment results.
Last updated
sequencingchipseqvisualizationepigeneticsgenetics
4.26 score 18 scripts 278 downloadsfamat - Functional analysis of metabolic and transcriptomic data
Famat is made to collect data about lists of genes and metabolites provided by the user, and to visualize it through a Shiny app. Information collected is: - Pathways containing some of the user's genes and metabolites (obtained using a pathway enrichment analysis). - Direct interactions between the user's elements inside pathways. - Information about elements (their identifiers and descriptions). - GO terms enrichment analysis performed on the user's genes. The Shiny app is composed of: - information about genes, metabolites, and direct interactions between them inside pathways. - a heatmap showing which elements from the list are in pathways (pathways are structured in hierarchies). - hierarchies of enriched GO terms using Molecular Function and Biological Process.
Last updated
functionalpredictiongenesetenrichmentpathwaysgoreactomekeggcompoundgene-ontologygenesshiny
4.26 score 1 stars 5 scripts 329 downloadsLedPred - Learning from DNA to Predict Enhancers
This package aims at creating a predictive model of regulatory sequences used to score unknown sequences based on the content of DNA motifs, next-generation sequencing (NGS) peaks and signals and other numerical scores of the sequences using supervised classification. The package contains a workflow based on the support vector machine (SVM) algorithm that maps features to sequences, optimizes SVM parameters and feature number, and creates a model that can be stored and used to score the regulatory potential of unknown sequences.
Last updated
supportvectormachinesoftwaremotifannotationchipseqsequencingclassification
4.26 score 3 stars 3 scripts 356 downloadsSIMAT - GC-SIM-MS data processing and analysis tool
This package provides a pipeline for analysis of GC-MS data acquired in selected ion monitoring (SIM) mode. The tool also provides guidance in choosing appropriate fragments for the targets of interest by using an optimization algorithm. This is done by considering overlapping peaks from a library provided by the user.
Last updated
immunooncologysoftwaremetabolomicsmassspectrometry
4.26 score 2 scripts 395 downloadsgaga - GaGa hierarchical model for high-throughput data analysis
Implements the GaGa model for high-throughput data analysis, including differential expression analysis, supervised gene clustering and classification. Additionally, it performs sequential sample size calculations using the GaGa and LNNGV models (the latter from the EBarrays package).
Last updated
immunooncologyonechannelmassspectrometrymultiplecomparisondifferentialexpressionclassification
4.26 score 1 dependents 15 scripts 460 downloadsggtreeDendro - Drawing 'dendrogram' using 'ggtree'
Offers a set of 'autoplot' methods to visualize tree-like structures (e.g., hierarchical clustering and classification/regression trees) using 'ggtree'. You can adjust graphical parameters using grammar of graphics syntax and integrate external data into the tree.
Last updated
clusteringclassificationdecisiontreephylogeneticsvisualization
4.20 score 16 scripts 300 downloads
mirTarRnaSeq - mirTarRnaSeq
The mirTarRnaSeq R package can be used for interactive mRNA miRNA sequencing statistical analysis. This package utilizes expression or differential expression mRNA and miRNA sequencing results and performs interactive correlation and various GLM (Regular GLM, Multivariate GLM, and Interaction GLM) analyses between mRNA and miRNA experiments. These experiments can be time point experiments and/or condition experiments.
Last updated
mirnaregressionsoftwaresequencingsmallrnatimecoursedifferentialexpression
4.20 score 16 scripts 362 downloadsflowGraph - Identifying differential cell populations in flow cytometry data accounting for marker frequency
Identifies maximal differential cell populations in flow cytometry data taking into account dependencies between cell populations; flowGraph calculates and plots SpecEnr abundance scores given cell population cell counts.
Last updated
flowcytometrystatisticalmethodimmunooncologysoftwarecellbasedassaysvisualization
4.20 score 1 stars 16 scripts 342 downloadsmicrobiomeExplorer - Microbiome Exploration App
The MicrobiomeExplorer R package is designed to facilitate the analysis and visualization of marker-gene survey feature data. It allows a user to perform and visualize typical microbiome analytical workflows either through the command line or an interactive Shiny application included with the package. In addition to applying common analytical workflows the application enables automated analysis report generation.
Last updated
classificationclusteringgeneticvariabilitydifferentialexpressionmicrobiomemetagenomicsnormalizationvisualizationmultiplecomparisonsequencingsoftwareimmunooncology
4.20 score 16 scripts 460 downloadsABarray - Microarray QA and statistical data analysis for Applied Biosystems Genome Survey Microarray (AB1700) gene expression data.
Automated pipeline to perform gene expression analysis for the Applied Biosystems Genome Survey Microarray (AB1700) data format. Functions include data preprocessing, filtering, control probe analysis, and statistical analysis in one single function. A GUI is also provided. The raw data, processed data, graphics output and statistical results are organized into folders according to the analysis settings used.
Last updated
microarrayonechannelpreprocessing
4.20 score 8 scripts 570 downloadsplier - Implements the Affymetrix PLIER algorithm
The PLIER (Probe Logarithmic Error Intensity Estimate) method produces an improved signal by accounting for experimentally observed patterns in probe behavior and handling error appropriately at low and high signal values.
Last updated
softwarecpp
4.18 score 127 scripts 434 downloadsmoanin - An R Package for Time Course RNASeq Data Analysis
Simple and efficient workflow for time-course gene expression data, built on publicly available open-source projects hosted on CRAN and Bioconductor. moanin provides helper functions for all the steps required for analysing time-course data using functional data analysis: (1) functional modeling of the time-course data; (2) differential expression analysis; (3) clustering; (4) downstream analysis.
Last updated
timecoursegeneexpressionrnaseqmicroarraydifferentialexpressionclustering
4.18 score 15 scripts 310 downloadsRadioGx - Analysis of Large-Scale Radio-Genomic Data
Computational toolbox for radio-genomic analysis which integrates radio-response data, radio-biological modelling and comprehensive cell line annotations for hundreds of cancer cell lines. The 'RadioSet' class enables creation and manipulation of standardized datasets including information about cancer cell lines, radio-response assays and dose-response indicators. Included methods allow fitting and plotting dose-response data using established radio-biological models along with quality control to validate results. Additional functions related to fitting and plotting dose-response curves, quantifying statistical correlation and calculating area under the curve (AUC) or survival fraction (SF) are included. For more details please see the included documentation, references, as well as: Manem, V. et al. (2018) <doi:10.1101/449793>.
Last updated
softwarepharmacogeneticsqualitycontrolsurvivalpharmacogenomicsclassification
4.18 score 15 scripts 450 downloadsGenomicOZone - Delineate outstanding genomic zones of differential gene activity
The package clusters gene activity along chromosomes into zones, detects differential zones as outstanding, and visualizes maps of outstanding zones across the genome. It enables characterization of effects on multiple genes within adaptive genomic neighborhoods, which could arise from genome reorganization, structural variation, or epigenome alteration. It guarantees cluster optimality, runtime linear in sample size, and reproducibility. One can apply it to genome-wide activity measurements such as copy number, transcriptomic, proteomic, and methylation data.
Last updated
softwaregeneexpressiontranscriptiondifferentialexpressionfunctionalpredictiongeneregulationbiomedicalinformaticscellbiologyfunctionalgenomicsgeneticssystemsbiologytranscriptomicsclusteringregressionrnaseqannotationvisualizationsequencingcoveragedifferentialmethylationgenomicvariationstructuralvariationcopynumbervariation
4.18 score 2 scripts 348 downloadsLRDE - Differential Expression Analysis with Long Read RNA-Seq Data
Provides hurdle negative binomial models for differential expression analysis with long-read RNA-Seq data.
Last updated
softwaredifferentialexpressionsequencingrnaseqlongreadgeneexpressionregression
4.18 score 146 downloadsxcore - xcore expression regulators inference
xcore is an R package for transcription factor activity modeling based on known molecular signatures and the user's gene expression data. The accompanying xcoredata package provides a collection of molecular signatures, constructed from publicly available ChIP-seq experiments. xcore uses ridge regression to model changes in expression as a linear combination of molecular signatures and find their unknown activities. The obtained estimates can be further tested for significance to select molecular signatures with the highest predicted effect on the observed expression changes.
Last updated
geneexpressiongeneregulationepigeneticsregressionsequencing
4.15 score 14 scripts 326 downloadsmetapone - Conducts pathway test of metabolomics data using a weighted permutation test
The package conducts pathway testing from untargeted metabolomics data. It requires the user to supply feature-level test results, from case-control testing, regression, or other suitable feature-level tests for the study design. Weights are given to metabolic features based on how many metabolites they could potentially match. The package can combine positive and negative mode results in pathway tests.
Last updated
technologymassspectrometrymetabolomicspathways
4.15 score 14 scripts 322 downloadsAlphaBeta - Computational inference of epimutation rates and spectra from high-throughput DNA methylation data in plants
AlphaBeta is a computational method for estimating epimutation rates and spectra from high-throughput DNA methylation data in plants. The method has been specifically designed to: 1. analyze 'germline' epimutations in the context of multi-generational mutation accumulation lines (MA-lines). 2. analyze 'somatic' epimutations in the context of plant development and aging.
Last updated
epigeneticsfunctionalgenomicsgeneticsmathematicalbiology
4.15 score 8 scripts 441 downloadsDeMAND - DeMAND
DeMAND predicts Drug MoA by interrogating a cell context specific regulatory network with a small number (N >= 6) of compound-induced gene expression signatures, to elucidate specific proteins whose interactions in the network are dysregulated by the compound.
Last updated
systemsbiologynetworkenrichmentgeneexpressionstatisticalmethodnetwork
4.15 score 9 scripts 318 downloadspmm - Parallel Mixed Model
The Parallel Mixed Model (PMM) approach is suitable for hit selection and cross-comparison of RNAi screens generated in experiments that are performed in parallel under several conditions. For example, we could think of the measurements or readouts from cells under RNAi knock-down, which are infected with several pathogens or which are grown from different cell lines.
Last updated
systemsbiologyregression
4.15 score 8 scripts 274 downloadsSTATegRa - Classes and methods for multi-omics data integration
Classes and tools for multi-omics data integration.
Last updated
softwarestatisticalmethodclusteringdimensionreductionprincipalcomponent
4.15 score 4 scripts 424 downloadsCGHnormaliter - Normalization of array CGH data with imbalanced aberrations.
Normalization and centralization of array comparative genomic hybridization (aCGH) data. The algorithm uses an iterative procedure that effectively eliminates the influence of imbalanced copy numbers. This leads to a more reliable assessment of copy number alterations (CNAs).
Last updated
microarraypreprocessing
4.15 score 6 scripts 440 downloadssplineTimeR - Time-course differential gene expression data analysis using spline regression models followed by gene association network reconstruction
This package provides functions for differential gene expression analysis of gene expression time-course data. Natural cubic spline regression models are used. Identified genes may further be used for pathway enrichment analysis and/or the reconstruction of time dependent gene regulatory association networks.
Last updated
geneexpressiondifferentialexpressiontimecourseregressiongenesetenrichmentnetworkenrichmentnetworkinferencegraphandnetwork
4.12 score 22 scripts 419 downloadsphenomis - Postprocessing and univariate analysis of omics data
The 'phenomis' package provides methods to perform post-processing (i.e. quality control and normalization) as well as univariate statistical analysis of single and multi-omics data sets. These methods include quality control metrics, signal drift and batch effect correction, intensity transformation, univariate hypothesis testing, but also clustering (as well as annotation of metabolomics data). The data are handled in the standard Bioconductor formats (i.e. SummarizedExperiment and MultiAssayExperiment for single and multi-omics datasets, respectively; the alternative ExpressionSet and MultiDataSet formats are also supported for convenience). As a result, all methods can be readily chained as workflows. The pipeline can be further enriched by multivariate analysis and feature selection, by using the 'ropls' and 'biosigner' packages, which support the same formats. Data can be conveniently imported from and exported to text files. Although the methods were initially targeted to metabolomics data, most of the methods can be applied to other types of omics data (e.g., transcriptomics, proteomics).
Last updated
batcheffectclusteringcoveragekeggmassspectrometrymetabolomicsnormalizationproteomicsqualitycontrolsequencingstatisticalmethodtranscriptomics
4.11 score 13 scripts 307 downloadsrsemmed - An interface to the Semantic MEDLINE database
A programmatic interface to the Semantic MEDLINE database. It provides functions for searching the database for concepts and finding paths between concepts. Path searching can also be tailored to user specifications, such as placing restrictions on concept types and the type of link between concepts. It also provides functions for summarizing and visualizing those paths.
Last updated
softwareannotationpathwayssystemsbiology
4.11 score 13 scripts 332 downloadsGraphAlignment - GraphAlignment
Graph alignment is an extension package for the R programming environment which provides functions for finding an alignment between two networks based on link and node similarity scores. (J. Berg and M. Laessig, "Cross-species analysis of biological networks by Bayesian alignment", PNAS 103 (29), 10967-10972 (2006))
Last updated
graphandnetworknetwork
4.11 score 16 scripts 368 downloadsrmspc - Multiple Sample Peak Calling
The rmspc package runs MSPC (Multiple Sample Peak Calling) software using R. The analysis of ChIP-seq samples outputs a number of enriched regions (commonly known as "peaks"), each indicating a protein-DNA interaction or a specific chromatin modification. When replicate samples are analyzed, overlapping peaks are expected. This repeated evidence can therefore be used to locally lower the minimum significance required to accept a peak. MSPC uses combined evidence from replicated experiments to evaluate peak calling output, rescue peaks, and reduce false positives. It takes any number of replicates as input, improves sensitivity and specificity of peak calling on each, and identifies consensus regions between the input samples.
Last updated
chipseqsequencingchiponchipdataimportrnaseqanalysischip-seqenriched-regionsgenome-analysismspcnext-generation-sequencingngs-analysisoverlapping-peakspeakpeaks
4.10 score 21 stars 9 scripts 325 downloadsEasyCellType - Annotate cell types for scRNA-seq data
We developed EasyCellType, which can automatically examine the input marker lists obtained from existing software such as Seurat over the cell marker databases. Two quantification approaches to annotate cell types are provided: Gene set enrichment analysis (GSEA) and a modified version of Fisher's exact test. The function presents annotation recommendations in graphical outcomes: bar plots for each cluster showing candidate cell types, as well as a dot plot summarizing the top 5 significant annotations for each cluster.
Last updated
singlecellsoftwaregeneexpressiongenesetenrichment
4.08 score 12 scripts 321 downloadsVarCon - VarCon: an R package for retrieving neighboring nucleotides of an SNV
VarCon is an R package which converts the positional information from the annotation of a single nucleotide variation (SNV) (either referring to the coding sequence or the reference genomic sequence). It retrieves the genomic reference sequence around the position of the single nucleotide variation. To assess whether the SNV could potentially influence binding of splicing regulatory proteins, VarCon calculates the HEXplorer score as an estimation. In addition, VarCon reports splice site strengths of splice sites within the retrieved genomic sequence and any changes due to the SNV.
Last updated
functionalgenomicsalternativesplicing
4.08 score 12 scripts 361 downloadsscTHI - Identification of significantly activated ligand-receptor interactions across clusters of cells from single-cell RNA sequencing data
scTHI is an R package to identify active pairs of ligand-receptors from single cells in order to study, among others, tumor-host interactions. scTHI contains a set of signatures to classify cells from the tumor microenvironment.
Last updated
softwaresinglecell
4.08 score 6 stars 7 scripts 310 downloadsselectKSigs - Selecting the number of mutational signatures using a perplexity-based measure and cross-validation
A package to suggest the number of mutational signatures in a collection of somatic mutations by calculating the cross-validated perplexity score.
Last updated
softwaresomaticmutationsequencingstatisticalmethodclusteringmutational-signaturesrjagssomatic-mutationscppjags
4.08 score 3 stars 2 scripts 308 downloadsCFAssay - Statistical analysis for the Colony Formation Assay
The package provides functions for calculation of linear-quadratic cell survival curves and for ANOVA of experimental 2-way designs along with the colony formation assay.
Last updated
cellbasedassayscellbiologyimmunooncologyregressionsurvival
4.08 score 1 scripts 323 downloadsITALICS - ITALICS
A method to normalize Affymetrix GeneChip Human Mapping 100K and 500K sets
Last updated
microarraycopynumbervariation
4.08 score 466 downloadsmultiscan - R package for combining multiple scans
Estimates gene expressions from several laser scans of the same microarray
Last updated
microarraypreprocessing
4.08 score 7 scripts 413 downloadsGSEAlm - Linear Model Toolset for Gene Set Enrichment Analysis
Models and methods for fitting linear models to gene expression data, together with tools for computing and using various regression diagnostics.
Last updated
microarray
4.08 score 10 scripts 431 downloadsOCplus - Operating characteristics plus sample size and local fdr for microarray experiments
This package characterizes the operating characteristics of a microarray experiment, i.e. the trade-off between false discovery rate and the power to detect truly regulated genes. The package includes tools both for planned experiments (for sample size assessment) and for already collected data (identification of differentially expressed genes).
Last updated
microarraydifferentialexpressionmultiplecomparison
4.08 score 2 scripts 432 downloadsssize - Estimate Microarray Sample Size
Functions for computing and displaying sample size information for gene expression arrays.
Last updated
microarraydifferentialexpression
4.08 score 12 scripts 414 downloadsmulticrispr - Multi-locus multi-purpose Crispr/Cas design
This package is for designing Crispr/Cas9 and Prime Editing experiments. It contains functions to (1) define and transform genomic targets, (2) find spacers, (4) count offtarget (mis)matches, and (5) compute Doench2016/2014 targeting efficiency. Care has been taken for multicrispr to scale well towards large target sets, enabling the design of large Crispr/Cas9 libraries.
Last updated
crisprsoftware
4.08 score 3 scripts 304 downloadsGUIDEseq - GUIDE-seq and PEtag-seq analysis pipeline
The package implements GUIDE-seq and PEtag-seq analysis workflow including functions for filtering UMI and reads with low coverage, obtaining unique insertion sites (proxy of cleavage sites), estimating the locations of the insertion sites, aka, peaks, merging estimated insertion sites from plus and minus strand, and performing off target search of the extended regions around insertion sites with mismatches and indels.
Last updated
immunooncologygeneregulationsequencingworkflowstepcrispr
4.05 score 14 scripts 434 downloadsflowCut - Automated Removal of Outlier Events and Flagging of Files Based on Time Versus Fluorescence Analysis
Common technical complications such as clogging can result in spurious events and fluorescence intensity shifting. flowCut is designed to detect and remove technical artifacts from your data by removing segments that show statistical differences from other segments.
Last updated
flowcytometrypreprocessingqualitycontrolcellbasedassays
4.02 score 26 scripts 440 downloadsDynDoc - Dynamic document tools
A set of functions to create and interact with dynamic documents and vignettes.
Last updated
reportwritinginfrastructure
4.01 score 6 dependents 12 scripts 2.4k downloadsOutSplice - Comparison of Splicing Events between Tumor and Normal Samples
An easy-to-use tool that can compare splicing events in tumor and normal tissue samples using either a user-generated matrix or data from The Cancer Genome Atlas (TCGA). This package generates a matrix of splicing outliers that are significantly over- or underexpressed in tumor samples compared to normal, denoted by chromosome location. The package will also calculate the splicing burden in each tumor and characterize the types of splicing events that occur.
Last updated
alternativesplicingdifferentialexpressiondifferentialsplicinggeneexpressionrnaseqsoftwarevariantannotation
4.00 score 1 stars 8 scripts 326 downloadsEDIRquery - Query the EDIR Database For Specific Gene
EDIRquery provides a tool to search for genes of interest within the Exome Database of Interspersed Repeats (EDIR). A gene name is a required input, and users can additionally specify repeat sequence lengths, minimum and maximum distance between sequences, and whether to allow a 1-bp mismatch. Outputs include a summary of results by repeat length, as well as a dataframe of query results. Example data provided includes a subset of the data for the gene GAA (ENSG00000171298). Querying the full database requires providing a path to the downloaded database files as a parameter.
Last updated
geneticssequencematching
4.00 score 1 scripts 280 downloadsseq.hotSPOT - Targeted sequencing panel design based on mutation hotspots
seq.hotSPOT provides a resource for designing effective sequencing panels to help improve mutation capture efficacy for ultradeep sequencing projects. Using SNV datasets, this package designs custom panels for any tissue of interest and identifies the genomic regions likely to contain the most mutations. Establishing efficient targeted sequencing panels can allow researchers to study mutation burden in tissues at high depth without the economic burden of whole-exome or whole-genome sequencing. This tool was developed to make high-depth sequencing panels to study low-frequency clonal mutations in clinically normal and cancerous tissues.
Last updated
softwaretechnologysequencingdnaseqwholegenome
4.00 score 9 scripts 264 downloadsflowGate - Interactive Cytometry Gating in R
flowGate adds an interactive Shiny app to allow manual GUI-based gating of flow cytometry data in R. Using flowGate, you can draw 1D and 2D span/rectangle gates, quadrant gates, and polygon gates on flow cytometry data by interactively drawing the gates on a plot of your data, rather than by specifying gate coordinates. This package is especially geared toward wet-lab cytometrists looking to take advantage of R for cytometry analysis, without necessarily having a lot of R experience.
Last updated
softwareworkflowstepflowcytometrypreprocessingimmunooncologydataimport
4.00 score 9 scripts 333 downloadsalabaster - Umbrella for the Alabaster Framework
Umbrella for the alabaster suite, providing a single-line import for all alabaster.* packages. Installing this package ensures that all known alabaster.* packages are also installed, avoiding problems with missing packages when a staging method or loading function is dynamically requested. Obviously, this comes at the cost of needing to install more packages, so advanced users and application developers may prefer to install the required alabaster.* packages individually.
Last updated
datarepresentationdataimport
4.00 score 5 scripts 286 downloadsrifiComparative - 'rifiComparative' compares the output of rifi from two different conditions.
'rifiComparative' is a continuation of the rifi package. It compares the rifi output of two conditions using half-life and mRNA at time 0 segments. As an input for the segmentation, the difference between the half-lives of both conditions and the log2FC of the mRNA at time 0 are used. The package provides segmentation, statistics, summary table, fragments visualization and some additional useful plots for further analysis.
Last updated
rnaseqdifferentialexpressiongeneregulationtranscriptomicsmicroarraysoftware
4.00 score 4 scripts 260 downloadsBG2 - Performs Bayesian GWAS analysis for non-Gaussian data using BG2
This package is built to perform GWAS analysis for non-Gaussian data using BG2. The BG2 method uses penalized quasi-likelihood along with nonlocal priors in a two-step manner to identify SNPs in GWAS analysis. The research related to this package was supported in part by National Science Foundation awards DMS 1853549 and DMS 2054173.
Last updated
bayesianassaydomainsnpgenomewideassociation
4.00 score 8 scripts 265 downloads
planttfhunter - Identification and classification of plant transcription factors
planttfhunter is used to identify plant transcription factors (TFs) from protein sequence data and classify them into families and subfamilies using the classification scheme implemented in PlantTFDB. TFs are identified using pre-built hidden Markov model profiles for DNA-binding domains. Then, auxiliary and forbidden domains are used with DNA-binding domains to classify TFs into families and subfamilies (when applicable). Currently, TFs can be classified in 58 different TF families/subfamilies.
Last updated
softwaretranscriptionfunctionalpredictiongenomeannotationfunctionalgenomicshiddenmarkovmodelsequencingclassificationfunctional-genomicsgene-familieshidden-markov-modelsplant-genomicsplantsprotein-domainstranscription-factors
4.00 score 1 stars 5 scripts 275 downloadsMetaPhOR - Metabolic Pathway Analysis of RNA
MetaPhOR was developed to enable users to assess metabolic dysregulation using transcriptomic-level data (RNA-sequencing and Microarray data) and produce publication-quality figures. A list of differentially expressed genes (DEGs), which includes fold change and p value, from DESeq2 or limma, can be used as input, with sample size for MetaPhOR, and will produce a data frame of scores for each KEGG pathway. These scores represent the magnitude and direction of transcriptional change within the pathway, along with estimated p-values. MetaPhOR then uses these scores to visualize metabolic profiles within and between samples through a variety of mechanisms, including: bubble plots, heatmaps, and pathway models.
Last updated
metabolomicsrnaseqpathwaysgeneexpressiondifferentialexpressionkeggsequencingmicroarray
4.00 score 1 scripts 324 downloadsCircSeqAlignTk - End-to-End Analysis of Small RNA-Seq Data from Viroids
CircSeqAlignTk is a toolkit for the analysis of RNA-Seq data derived from circular genome sequences, with a primary focus on viroids, circular RNAs typically consisting of a few hundred nucleotides. The toolkit supports an end-to-end analysis pipeline, from alignment to visualization.
Last updated
sequencingsmallrnaalignmentsoftware
4.00 score 3 scripts 286 downloadsSUITOR - Selecting the number of mutational signatures through cross-validation
An unsupervised cross-validation method to select the optimal number of mutational signatures. A data set of mutational counts is split into training and validation data. Signatures are estimated in the training data and then used to predict the mutations in the validation data.
Last updated
geneticssoftwaresomaticmutation
4.00 score 2 scripts 293 downloadsscTreeViz - R/Bioconductor package to interactively explore and visualize single cell RNA-seq datasets with hierarchical annotations
scTreeViz provides classes to support interactive data aggregation and visualization of single cell RNA-seq datasets with hierarchies for e.g. cell clusters at different resolutions. The `TreeIndex` class provides methods to manage hierarchy and split the tree at a given resolution or across resolutions. The `TreeViz` class extends `SummarizedExperiment` and can perform quick aggregations on the count matrix defined by clusters.
Last updated
visualizationinfrastructureguisinglecell
4.00 score 3 scripts 342 downloadssurfaltr - Rapid Comparison of Surface Protein Isoform Membrane Topologies Through surfaltr
Cell surface proteins form a major fraction of the druggable proteome and can be used for tissue-specific delivery of oligonucleotide/cell-based therapeutics. Alternatively spliced surface protein isoforms have been shown to differ in their subcellular localization and/or their transmembrane (TM) topology. Surface proteins are hydrophobic and remain difficult to study thereby necessitating the use of TM topology prediction methods such as TMHMM and Phobius. However, there exists a need for bioinformatic approaches to streamline batch processing of isoforms for comparing and visualizing topologies. To address this gap, we have developed an R package, surfaltr. It pairs inputted isoforms, either known alternatively spliced or novel, with their APPRIS annotated principal counterparts, predicts their TM topologies using TMHMM or Phobius, and generates a customizable graphical output. Further, surfaltr facilitates the prioritization of biologically diverse isoform pairs through the incorporation of three different ranking metrics and through protein alignment functions. Citations for programs mentioned here can be found in the vignette.
Last updated
softwarevisualizationdatarepresentationsplicedalignmentalignmentmultiplesequencealignmentmultiplecomparison
4.00 score 5 scripts 370 downloadsveloviz - VeloViz: RNA-velocity informed 2D embeddings for visualizing cell state trajectories
VeloViz uses each cell’s current observed and predicted future transcriptional states inferred from RNA velocity analysis to build a nearest neighbor graph between cells in the population. Edges are then pruned based on a cosine correlation threshold and/or a distance threshold and the resulting graph is visualized using a force-directed graph layout algorithm. VeloViz can help ensure that relationships between cell states are reflected in the 2D embedding, allowing for more reliable representation of underlying cellular trajectories.
Last updated
transcriptomicsvisualizationgeneexpressionsequencingrnaseqdimensionreductioncpp
4.00 score 6 scripts 258 downloadsRbec - Rbec: a tool for analysis of amplicon sequencing data from synthetic microbial communities
Rbec is an adapted version of DADA2 for analyzing amplicon sequencing data from synthetic communities (SynComs), where the reference sequences for each strain exist. Rbec can not only accurately profile the microbial compositions in SynComs, but also predict the contaminants in SynCom samples.
Last updated
sequencingmicrobialstrainmicrobiomecpp
4.00 score 6 scripts 283 downloadsinteracCircos - The Generation of Interactive Circos Plot
Implements an efficient approach to display genomic data, relationships, and information in an interactive circular genome (Circos) plot. 'interacCircos' is inspired by 'circosJS', 'BioCircos.js' and 'NG-Circos' and integrates the modules of 'circosJS', 'BioCircos.js' and 'NG-Circos' into this R package, based on the 'htmlwidgets' framework.
Last updated
visualization
4.00 score 4 scripts 293 downloadsModCon - Modifying splice site usage by changing the mRNP code, while maintaining the genetic code
Collection of functions to calculate nucleotide sequence surroundings for splice donor sites to either activate or repress donor usage. The proposed alternative nucleotide sequence encodes the same amino acid and could be applied e.g. in reporter systems to silence or activate cryptic splice donor sites.
Last updated
functionalgenomicsalternativesplicing
4.00 score 1 stars 3 scripts 298 downloadsPhenoGeneRanker - PhenoGeneRanker: A gene and phenotype prioritization tool
This package is a gene/phenotype prioritization tool that utilizes a multiplex heterogeneous gene-phenotype network. PhenoGeneRanker allows multi-layer gene and phenotype networks. It also calculates empirical p-values of gene/phenotype ranking using random stratified sampling of genes/phenotypes based on their connectivity degree in the network. https://dl.acm.org/doi/10.1145/3307339.3342155.
Last updated
biomedicalinformaticsgenepredictiongraphandnetworknetworknetworkinferencepathwayssoftwaresystemsbiology
4.00 score 1 scripts 270 downloadsSOMNiBUS - Smooth modeling of bisulfite sequencing
This package aims to analyse count-based methylation data on predefined genomic regions, such as those obtained by targeted sequencing, and thus to identify differentially methylated regions (DMRs) that are associated with phenotypes or traits. The method is built on a rich, flexible model that allows for the effects, on the methylation levels, of multiple covariates to vary smoothly along genomic regions. At the same time, this method also allows for sequencing errors and can adjust for variability in cell type mixture.
Last updated
dnamethylationregressionepigeneticsdifferentialmethylationsequencingfunctionalprediction
4.00 score 1 stars 5 scripts 278 downloadsRiboDiPA - Differential pattern analysis for Ribo-seq data
This package performs differential pattern analysis for Ribo-seq data. It identifies genes with significantly different patterns in the ribosome footprint between two conditions. RiboDiPA contains five major components including bam file processing, P-site mapping, data binning, differential pattern analysis and footprint visualization.
Last updated
riboseqgeneexpressiongeneregulationdifferentialexpressionsequencingcoveragealignmentrnaseqimmunooncologyqualitycontroldataimportsoftwarenormalizationcpp
4.00 score 2 scripts 351 downloadsExperimentSubset - Manages subsets of data with Bioconductor Experiment objects
Experiment objects such as the SummarizedExperiment or SingleCellExperiment are data containers for one or more matrix-like assays along with the associated row and column data. Often only a subset of the original data is needed for down-stream analysis. For example, filtering out poor quality samples will require excluding some columns before analysis. The ExperimentSubset object is a container to efficiently manage different subsets of the same data without having to make separate objects for each new subset.
Last updated
infrastructuresoftwaredataimportdatarepresentation
4.00 score 8 scripts 346 downloadstomoda - Tomo-seq data analysis
This package provides many easy-to-use methods to analyze and visualize tomo-seq data. The tomo-seq technique is based on cryosectioning of tissue and performing RNA-seq on consecutive sections. (Reference: Kruse F, Junker JP, van Oudenaarden A, Bakkers J. Tomo-seq: A method to obtain genome-wide expression data with spatial resolution. Methods Cell Biol. 2016;135:299-307. doi:10.1016/bs.mcb.2016.01.006) The main purpose of the package is to find zones with similar transcriptional profiles and spatially expressed genes in a tomo-seq sample. Several visualization functions are available to create easy-to-modify plots.
Last updated
geneexpressionsequencingrnaseqtranscriptomicsspatialclusteringvisualization
4.00 score 3 scripts 326 downloadshummingbird - Bayesian Hidden Markov Model for the detection of differentially methylated regions
A package for detecting differential methylation. It exploits a Bayesian hidden Markov model that incorporates location dependence among genomic loci, unlike most existing methods that assume independence among observations. Bayesian priors are applied to permit information sharing across an entire chromosome for improved power of detection. The direct output of our software package is the best sequence of methylation states, eliminating the use of a subjective, and most of the time arbitrary, p-value threshold for determining significance. Finally, our methodology does not require replication in either or both of the two comparison groups.
Last updated
hiddenmarkovmodelbayesiandnamethylationbiomedicalinformaticssequencinggeneexpressiondifferentialexpressiondifferentialmethylationcpp
4.00 score 6 scripts 302 downloadsOmixer - Omixer: multivariate and reproducible sample randomization to proactively counter batch effects in omics studies
Omixer is a Bioconductor package for multivariate and reproducible sample randomization, which ensures optimal sample distribution across batches with well-documented methods. It outputs lab-friendly sample layouts, reducing the risk of sample mixups when manually pipetting randomized samples.
Last updated
datarepresentationexperimentaldesignqualitycontrolsoftwarevisualization
4.00 score 2 scripts 355 downloadstransomics2cytoscape - A tool set for 3D Trans-Omic network visualization with Cytoscape
transomics2cytoscape generates a file for 3D transomics visualization by providing input that specifies the IDs of multiple KEGG pathway layers, their corresponding Z-axis heights, and an input that represents the edges between the pathway layers. The edges are used, for example, to describe the relationships between kinase on a pathway and enzyme on another pathway. This package automates creation of a transomics network as shown in the figure in Yugi.2014 (https://doi.org/10.1016/j.celrep.2014.07.021) using Cytoscape automation (https://doi.org/10.1186/s13059-019-1758-4).
Last updated
networksoftwarepathwaysdataimportkegg
4.00 score 2 scripts 385 downloadsweitrix - Tools for matrices with precision weights, test and explore weighted or sparse data
Data type and tools for working with matrices having precision weights and missing data. This package provides a common representation and tools that can be used with many types of high-throughput data. The meaning of the weights is compatible with usage in the base R function "lm" and the package "limma". Calibrate weights to account for known predictors of precision. Find rows with excess variability. Perform differential testing and find rows with the largest confident differences. Find PCA-like components of variation even with many missing values, rotated so that individual components may be meaningfully interpreted. DelayedArray matrices and BiocParallel are supported.
Last updated
softwaredatarepresentationdimensionreductiongeneexpressiontranscriptomicsrnaseqsinglecellregression
4.00 score 8 scripts 388 downloadsGGPA - graph-GPA: A graphical model for prioritizing GWAS results and investigating pleiotropic architecture
Genome-wide association studies (GWAS) are a widely used tool for identification of genetic variants associated with phenotypes and diseases, though complex diseases featuring many genetic variants with small effects present difficulties for these traditional studies. By leveraging pleiotropy, the statistical power of a single GWAS can be increased. This package provides functions for fitting graph-GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy. The 'GGPA' package provides a user-friendly interface to fit graph-GPA models, implement association mapping, and generate a phenotype graph.
Last updated
softwarestatisticalmethodclassificationgenomewideassociationsnpgeneticsclusteringmultiplecomparisonpreprocessinggeneexpressiondifferentialexpressionopenblascpp
4.00 score 1 stars 6 scripts 321 downloadsfrenchFISH - Poisson Models for Quantifying DNA Copy-number from FISH Images of Tissue Sections
FrenchFISH comprises a nuclear volume correction method coupled with two types of Poisson models: either a Poisson model for improved manual spot counting without the need for control probes; or a homogeneous Poisson Point Process model for automated spot counting.
Last updated
softwarebiomedicalinformaticscellbiologygeneticshiddenmarkovmodelpreprocessing
4.00 score 3 scripts 275 downloadsrSWeeP - Spaced Words Projection (SWeeP)
"Spaced Words Projection (SWeeP)" is a method for representing biological sequences using vectors preserving inter-sequence comparability.
Last updated
4.00 score 3 scripts 332 downloadsCSSQ - Chip-seq Signal Quantifier Pipeline
This package is designed to perform statistical analysis to identify statistically significant differentially bound regions between multiple groups of ChIP-seq datasets.
Last updated
chipseqdifferentialpeakcallingsequencingnormalization
4.00 score 3 scripts 356 downloadsscTGIF - Cell type annotation for unannotated single-cell RNA-Seq data
scTGIF connects the cells and the related gene functions without cell type labels.
Last updated
dimensionreductionqualitycontrolsinglecellsoftwaregeneexpression
4.00 score 10 scripts 380 downloadsLinkHD - LinkHD: a versatile framework to explore and integrate heterogeneous data
Here we present Link-HD, an approach to integrate heterogeneous datasets, as a generalization of STATIS-ACT (“Structuration des Tableaux A Trois Indices de la Statistique–Analyse Conjointe de Tableaux”), a family of methods to join and compare information from multiple subspaces. However, STATIS-ACT has some drawbacks since it only allows continuous data and it is unable to establish relationships between samples and features. In order to tackle these constraints, we incorporate multiple distance options and a linear regression based Biplot model in order to establish relationships between observations and variables and perform variable selection.
Last updated
classificationmultiplecomparisonregressionsoftware
4.00 score 2 scripts 358 downloadsssPATHS - ssPATHS: Single Sample PATHway Score
This package generates pathway scores from expression data for single samples after training on a reference cohort. The score is generated by taking the expression of a gene set (pathway) from a reference cohort and performing linear discriminant analysis to distinguish samples in the cohort that have the pathway augmented from those that do not. The separating hyperplane is then used to score new samples.
Last updated
softwaregeneexpressionbiomedicalinformaticsrnaseqpathwaystranscriptomicsdimensionreductionclassification
4.00 score 4 scripts 360 downloadsGmicR - Combines WGCNA and xCell readouts with Bayesian network learning to generate a Gene-Module Immune-Cell network (GMIC)
This package uses Bayesian network learning to detect relationships between Gene Modules detected by WGCNA and immune cell signatures defined by xCell. It is a hypothesis-generating tool.
Last updated
softwaresystemsbiologygraphandnetworknetworknetworkinferenceguiimmunooncologygeneexpressionqualitycontrolbayesianclustering
4.00 score 2 scripts 354 downloadsRNAmodR.ML - Detecting patterns of post-transcriptional modifications using machine learning
RNAmodR.ML extends the functionality of the RNAmodR package and classical detection strategies towards detection through machine learning models. RNAmodR.ML provides classes, functions and an example workflow to establish a detection strategy, which can be packaged.
Last updated
softwareinfrastructureworkflowstepvisualizationsequencing
4.00 score 1 stars 5 scripts 389 downloadsGSALightning - Fast Permutation-based Gene Set Analysis
GSALightning provides a fast implementation of permutation-based gene set analysis for the two-sample problem. This package is particularly useful when testing simultaneously a large number of gene sets, or when a large number of permutations is necessary for more accurate p-value estimation.
Last updated
softwarebiologicalquestiongenesetenrichmentdifferentialexpressiongeneexpressiontranscription
4.00 score 5 stars 5 scripts 341 downloadsiCheck - QC Pipeline and Data Analysis Tools for High-Dimensional Illumina mRNA Expression Data
QC pipeline and data analysis tools for high-dimensional Illumina mRNA expression data.
Last updated
geneexpressiondifferentialexpressionmicroarraypreprocessingdnamethylationonechanneltwochannelqualitycontrol
4.00 score 2 scripts 344 downloadsCRImage - CRImage a package to classify cells and calculate tumour cellularity
CRImage provides functionality to process and analyze images, in particular to classify cells in biological images. Furthermore, in the context of tumor images, it provides functionality to calculate tumour cellularity.
Last updated
cellbiologyclassification
4.00 score 9 scripts 468 downloadsChIPsim - Simulation of ChIP-seq experiments
A general framework for the simulation of ChIP-seq data. Although currently focused on nucleosome positioning, the package is designed to support different types of experiments.
Last updated
infrastructurechipseq
4.00 score 5 scripts 434 downloadsLBE - Estimation of the false discovery rate
LBE is an efficient procedure for estimating the proportion of true null hypotheses, the false discovery rate (and so the q-values) in the framework of estimating procedures based on the marginal distribution of the p-values without assumption for the alternative hypothesis.
Last updated
multiplecomparison
4.00 score 2 scripts 554 downloadsSubCellBarCode - SubCellBarCode: Integrated workflow for robust mapping and visualizing whole human spatial proteome
Mass spectrometry-based spatial proteomics has enabled the proteome-wide mapping of protein subcellular localization (Orre et al. 2019, Molecular Cell). The SubCellBarCode R package robustly classifies proteins into corresponding subcellular localization.
Last updated
proteomicsmassspectrometryclassification
3.98 score 16 scripts 368 downloadstracktables - Build IGV tracks and HTML reports
Methods to create complex IGV genome browser sessions and dynamic IGV reports in HTML pages.
Last updated
sequencingreportwriting
3.98 score 48 scripts 634 downloadsCGHbase - CGHbase: Base functions and classes for arrayCGH data analysis.
Contains functions and classes that are needed by arrayCGH packages.
Last updated
infrastructuremicroarraycopynumbervariation
3.98 score 8 dependents 7 scripts 934 downloadsBUS - Gene network reconstruction
This package can be used to compute associations among genes (gene-networks) or between genes and some external traits (e.g. clinical).
Last updated
preprocessingcpp
3.94 score 11 scripts 490 downloadsIdeoViz - Plots data (continuous/discrete) along chromosomal ideogram
Plots data associated with arbitrary genomic intervals along chromosomal ideogram.
Last updated
visualizationmicroarray
3.94 score 29 scripts 470 downloadsCGEN - An R package for analysis of case-control studies in genetic epidemiology
This is a package for analysis of case-control data in genetic epidemiology. It provides a set of statistical methods for evaluating gene-environment (or gene-gene) interactions under multiplicative and additive risk models, with or without assuming gene-environment (or gene-gene) independence in the underlying population.
Last updated
snpmultiplecomparisonclustering
3.90 score 10 scripts 375 downloadsmBPCR - Bayesian Piecewise Constant Regression for DNA copy number estimation
It contains functions for estimating the DNA copy number profile using mBPCR with the aim of detecting regions with copy number changes.
Last updated
acghsnpmicroarraycopynumbervariation
3.90 score 7 scripts 412 downloadsRLMM - A Genotype Calling Algorithm for Affymetrix SNP Arrays
A classification algorithm, based on a multi-chip, multi-SNP approach for Affymetrix SNP arrays. Using a large training sample where the genotype labels are known, this algorithm will obtain more accurate classification results on new data. RLMM is based on a robust, linear model and uses the Mahalanobis distance for classification. The chip-to-chip non-biological variation is removed through normalization. This model-based algorithm captures the similarities across genotype groups and probes, as well as thousands of other SNPs for accurate classification. NOTE: 100K-Xba only for now.
Last updated
microarrayonechannelsnpgeneticvariability
3.90 score 1 scripts 379 downloadstimecourse - Statistical Analysis for Developmental Microarray Time Course Data
Functions for data analysis and graphical displays for developmental microarray time course data.
Last updated
microarraytimecoursedifferentialexpression
3.90 score 8 scripts 461 downloadsclusterStab - Compute cluster stability scores for microarray data
This package can be used to estimate the number of clusters in a set of microarray data, as well as test the stability of these clusters.
Last updated
clustering
3.90 score 8 scripts 436 downloadswebbioc - Bioconductor Web Interface
An integrated web interface for doing microarray analysis using several of the Bioconductor packages. It is intended to be deployed as a centralized bioinformatics resource for use by many users. (Currently only Affymetrix oligonucleotide analysis is supported.)
Last updated
infrastructuremicroarrayonechanneldifferentialexpression
3.90 score 4 scripts 542 downloads
flowVS - Variance stabilization in flow cytometry (and microarrays)
Per-channel variance stabilization from a collection of flow cytometry samples by Bartlett's test for homogeneity of variances. The approach is applicable to microarray data as well.
Last updated
immunooncologyflowcytometrycellbasedassaysmicroarray
3.86 score 12 scripts 440 downloads
a4 - Automated Affymetrix Array Analysis Umbrella Package
Umbrella package for the entire Automated Affymetrix Array Analysis suite of packages.
Last updated
microarray
3.86 score 36 scripts 564 downloads
clst - Classification by local similarity threshold
Package for modified nearest-neighbor classification based on calculation of a similarity threshold distinguishing within-group from between-group comparisons.
Last updated
classification
3.86 score 1 dependents 12 scripts 467 downloads
GeneticsPed - Pedigree and genetic relationship functions
Classes and methods for handling pedigree data. It also includes functions to calculate genetic relationship measures, such as relationship and inbreeding coefficients, and other utilities. Note that the package is not yet stable. Use it with care!
Last updated
geneticscpp
3.86 score 12 scripts 422 downloads
rebook - Re-using Content in Bioconductor Books
Provides utilities to re-use content across chapters of a Bioconductor book. This is mostly based on functionality developed while writing the OSCA book, but generalized for potential use in other large books with heavy compute. Also contains some functions to assist book deployment.
Last updated
softwareinfrastructurereportwriting
3.82 score 334 scripts 455 downloads
arrayMvout - multivariate outlier detection for expression array QA
This package supports the application of diverse quality metrics to AffyBatch instances, summarizing these metrics via PCA, and then performing parametric outlier detection on the PCs to identify aberrant arrays with a fixed Type I error rate
Last updated
infrastructuremicroarrayqualitycontrol
3.82 score 11 scripts 536 downloads
BASiCStan - Stan implementation of BASiCS
Provides an interface to infer the parameters of BASiCS using the variational inference (ADVI), Markov chain Monte Carlo (NUTS), and maximum a posteriori (BFGS) inference engines in the Stan programming language. BASiCS is a Bayesian hierarchical model that uses an adaptive Metropolis within Gibbs sampling scheme. Alternative inference methods provided by Stan may be preferable in some situations, for example for particularly large data or posterior distributions with difficult geometries.
Last updated
immunooncologynormalizationsequencingrnaseqsoftwaregeneexpressiontranscriptomicssinglecelldifferentialexpressionbayesiancellbiologysingle-cell-rna-seqcpp
3.78 score 1 scripts 309 downloads
CONSTANd - Data normalization by matrix raking
Normalizes a data matrix `data` by raking (using the RAS method by Bacharach, see references) the Nrows by Ncols matrix such that the row means and column means equal 1. The result is a normalized data matrix `K=RAS`, a product of row multipliers `R` and column multipliers `S` with the original matrix `A`. Missing information needs to be presented as `NA` values and not as zero values, because CONSTANd is able to ignore missing values when calculating the mean. Using CONSTANd normalization allows for the direct comparison of values between samples within the same and even across different CONSTANd-normalized data matrices.
Last updated
massspectrometrycheminformaticsnormalizationpreprocessingdifferentialexpressiongeneticstranscriptomicsproteomics
3.78 score 5 scripts 324 downloads
GWAS.BAYES - Bayesian analysis of Gaussian GWAS data
This package is built to perform GWAS analysis using Bayesian techniques. Currently, GWAS.BAYES has functionality for the implementation of BICOSS (Williams, J., Ferreira, M. A., and Ji, T. (2022). BICOSS: Bayesian iterative conditional stochastic search for GWAS. BMC Bioinformatics), BGWAS (Williams, J., Xu, S., Ferreira, M. A. (2023) "BGWAS: Bayesian variable selection in linear mixed models with nonlocal priors for genome-wide association studies." BMC Bioinformatics), and GINA. All methods currently are for the analysis of Gaussian phenotypes. The research related to this package was supported in part by National Science Foundation awards DMS 1853549, DMS 1853556, and DMS 2054173.
Last updated
bayesianassaydomainsnpgenomewideassociation
3.78 score 15 scripts 364 downloads
iGC - An integrated analysis package of Gene expression and Copy number alteration
This package is intended to identify differentially expressed genes driven by Copy Number Alterations from samples with both gene expression and CNA data.
Last updated
softwarebiological questiondifferentialexpressiongenomicvariationassaydomaincopynumbervariationgeneexpressionresearchfieldgeneticstechnologymicroarraysequencingworkflowstepmultiplecomparison
3.78 score 1 stars 8 scripts 380 downloads
immunoClust - immunoClust - Automated Pipeline for Population Detection in Flow Cytometry
immunoClust is a model-based clustering approach for Flow Cytometry samples. The cell-events of single Flow Cytometry samples are modelled by a mixture of multivariate normal- or t-distributions. The cell-event clusters of several samples are modelled by a mixture of multivariate normal-distributions, aiming at stable co-clusters across these samples.
Last updated
clusteringflowcytometrysinglecellcellbasedassaysimmunooncologygslcpp
3.78 score 5 scripts 442 downloads
flowCHIC - Analyze flow cytometric data using histogram information
A package to analyze flow cytometric data of complex microbial communities based on histogram images
Last updated
immunooncologycellbasedassaysclusteringflowcytometrysoftwarevisualization
3.78 score 6 scripts 425 downloads
a4Reporting - Automated Affymetrix Array Analysis Reporting Package
Utility functions to facilitate the reporting of the Automated Affymetrix Array Analysis Reporting set of packages.
Last updated
microarray
3.78 score 1 dependents 3 scripts 500 downloads
a4Classif - Automated Affymetrix Array Analysis Classification Package
Functionalities for classification of Affymetrix microarray data, integrating within the Automated Affymetrix Array Analysis set of packages.
Last updated
microarraygeneexpressionclassification
3.78 score 1 dependents 4 scripts 504 downloads
ExiMiR - R functions for the normalization of Exiqon miRNA array data
This package contains functions for reading raw data in ImaGene TXT format obtained from Exiqon miRCURY LNA arrays, annotating them with appropriate GAL files, and normalizing them using a spike-in probe-based method. Other platforms and data formats are also supported.
Last updated
microarrayonechanneltwochannelpreprocessinggeneexpressiontranscription
3.78 score 8 scripts 451 downloads
GeneRegionScan - GeneRegionScan
A package with focus on analysis of discrete regions of the genome. This package is useful for investigation of one or a few genes using Affymetrix data, since it will extract probe level data using the Affymetrix Power Tools application and wrap these data into a ProbeLevelSet. A ProbeLevelSet directly extends the expressionSet, but includes additional information about the sequence of each probe and the probe set it is derived from. The package includes a number of functions used for plotting these probe level data as a function of location along sequences of mRNA-strands. This can be used for analysis of variable splicing, and is especially well suited for use with exon-array data.
Last updated
microarraydataimportsnponechannelvisualization
3.78 score 1 scripts 511 downloads
GeneSelectMMD - Gene selection based on the marginal distributions of gene profiles that characterized by a mixture of three-component multivariate distributions
Gene selection based on a mixture of marginal distributions.
Last updated
differentialexpressionfortran
3.78 score 1 dependents 7 scripts 400 downloads
BicARE - Biclustering Analysis and Results Exploration
Biclustering Analysis and Results Exploration.
Last updated
microarraytranscriptionclustering
3.78 score 1 dependents 5 scripts 516 downloads
affyContam - structured corruption of affymetrix cel file data
structured corruption of cel file data to demonstrate QA effectiveness
Last updated
infrastructure
3.78 score 1 dependents 4 scripts 533 downloads
DFP - Gene Selection
This package provides a supervised technique able to identify differentially expressed genes, based on the construction of Fuzzy Patterns (FPs). The Fuzzy Patterns are built by means of applying 3 Membership Functions to discretized gene expression values.
Last updated
microarraydifferentialexpression
3.78 score 5 scripts 442 downloads
iterativeBMA - The Iterative Bayesian Model Averaging (BMA) algorithm
The iterative Bayesian Model Averaging (BMA) algorithm is a variable selection and classification algorithm with an application of classifying 2-class microarray samples, as described in Yeung, Bumgarner and Raftery (Bioinformatics 2005, 21: 2394-2402).
Last updated
microarrayclassification
3.78 score 1 scripts 406 downloads
mdqc - Mahalanobis Distance Quality Control for microarrays
MDQC is a multivariate quality assessment method for microarrays based on quality control (QC) reports. The Mahalanobis distance of an array's quality attributes is used to measure the similarity of the quality of that array against the quality of the other arrays. Then, arrays with unusually high distances can be flagged as potentially low-quality.
Last updated
microarrayqualitycontrol
3.78 score 1 dependents 5 scripts 548 downloads
panp - Presence-Absence Calls from Negative Strand Matching Probesets
A function to make gene presence/absence calls based on distance from negative strand matching probesets (NSMP) which are derived from Affymetrix annotation. PANP is applied after gene expression values are created, and therefore can be used after any preprocessing method such as MAS5 or GCRMA, or PM-only methods like RMA. NSMP sets have been established for the HGU133A and HGU133-Plus-2.0 chipsets to date.
Last updated
infrastructure
3.78 score 8 scripts 434 downloads
diffGeneAnalysis - Performs differential gene expression Analysis
Analyze microarray data
Last updated
microarraydifferentialexpression
3.78 score 6 scripts 394 downloads
MVCClass - Model-View-Controller (MVC) Classes
Creates classes used in model-view-controller (MVC) design
Last updated
visualizationinfrastructuregraphandnetwork
3.78 score 1 dependents 562 downloads
BiocSet - Representing Different Biological Sets
BiocSet displays different biological sets in a triple tibble format. These three tibbles are `element`, `set`, and `elementset`. The user has the ability to activate one of these three tibbles to perform common functions from the dplyr package. Mapping functionality and accessing web references for elements/sets are also available in BiocSet.
Last updated
geneexpressiongokeggsoftware
3.76 score 3 dependents 32 scripts 762 downloads
cycle - Significance of periodic expression pattern in time-series data
Package for assessing the statistical significance of periodic expression based on Fourier analysis and comparison with data generated by different background models
Last updated
microarraytimecourse
3.75 score 14 scripts 462 downloads
infinityFlow - Augmenting Massively Parallel Cytometry Experiments Using Multivariate Non-Linear Regressions
Pipeline to analyze and merge data files produced by BioLegend's LEGENDScreen or BD Human Cell Surface Marker Screening Panel (BD Lyoplates).
Last updated
softwareflowcytometrycellbasedassayssinglecellproteomics
3.72 score 13 scripts 372 downloads
logicFS - Identification of SNP Interactions
Identification of interactions between binary variables using Logic Regression. Can, e.g., be used to find interesting SNP interactions. Also contains a bagging version of logic regression for classification.
Last updated
snpclassificationgenetics
3.72 score 13 scripts 428 downloads
apComplex - Estimate protein complex membership using AP-MS protein data
Functions to estimate a bipartite graph of protein complex membership using AP-MS data.
Last updated
immunooncologynetworkinferencemassspectrometrygraphandnetwork
3.72 score 13 scripts 622 downloads
clstutils - Tools for performing taxonomic assignment
Tools for performing taxonomic assignment based on phylogeny using pplacer and clst.
Last updated
sequencingclassificationvisualizationqualitycontrol
3.68 score 12 scripts 448 downloads
mosaics - MOSAiCS (MOdel-based one and two Sample Analysis and Inference for ChIP-Seq)
This package provides functions for fitting MOSAiCS and MOSAiCS-HMM, a statistical framework to analyze one-sample or two-sample ChIP-seq data of transcription factor binding and histone modification.
Last updated
chipseqsequencingtranscriptiongeneticsbioinformaticscpp
3.65 score 15 scripts 522 downloads
randRotation - Random Rotation Methods for High Dimensional Data with Batch Structure
A collection of methods for performing random rotations on high-dimensional, normally distributed data (e.g. microarray or RNA-seq data) with batch structure. The random rotation approach allows exact testing of dependent test statistics with linear models following arbitrary batch effect correction methods.
Last updated
softwaresequencingbatcheffectbiomedicalinformaticsrnaseqpreprocessingmicroarraydifferentialexpressiongeneexpressiongeneticsmicrornaarraynormalizationstatisticalmethod
3.60 score 5 scripts 294 downloads
DMCFB - Differentially Methylated Cytosines via a Bayesian Functional Approach
DMCFB is a pipeline for identifying differentially methylated cytosines using a Bayesian functional regression model in bisulfite sequencing data. By using a functional regression data model, it tries to capture position-specific, group-specific and other covariates-specific methylation patterns as well as spatial correlation patterns and unknown underlying models of methylation data. It is robust and flexible with respect to the true underlying models and inclusion of any covariates, and the missing values are imputed using spatial correlation between positions and samples. A Bayesian approach is adopted for estimation and inference in the proposed method.
Last updated
differentialmethylationsequencingcoveragebayesianregression
3.60 score 3 scripts 354 downloads
CausalR - Causal network analysis methods
Causal network analysis methods for regulator prediction and network reconstruction from genome scale data.
Last updated
immunooncologysystemsbiologynetworkgraphandnetworknetwork inferencetranscriptomicsproteomicsdifferentialexpressionrnaseqmicroarray
3.60 score 8 scripts 398 downloads
RUVcorr - Removal of unwanted variation for gene-gene correlations and related analysis
RUVcorr applies global removal of unwanted variation (ridged version of RUV) to real and simulated gene expression data.
Last updated
geneexpressionnormalization
3.60 score 10 scripts 354 downloads
pepXMLTab - Parsing pepXML files and filter based on peptide FDR.
Parses pepXML files based on the XML package. The package tries to handle pepXML files generated by different software. The output is a tabular file of peptide-spectrum matches (PSMs). The package also provides a function to filter the PSMs based on FDR.
Last updated
immunooncologyproteomicsmassspectrometry
3.60 score 9 scripts 361 downloads
cn.farms - cn.FARMS - factor analysis for copy number estimation
This package implements the cn.FARMS algorithm for copy number variation (CNV) analysis. cn.FARMS can analyze the most common Affymetrix (250K-SNP6.0) array types and supports high-performance computing using snow and ff.
Last updated
microarraycopynumbervariationcpp
3.60 score 10 scripts 435 downloads
ADaCGH2 - Analysis of Big Data from aCGH Experiments using Parallel Computing and ff Objects
Analysis and plotting of array CGH data. Allows usage of Circular Binary Segmentation, wavelet-based smoothing (both as in Liu et al., and HaarSeg as in Ben-Yaacov and Eldar), HMM, GLAD, CGHseg. Most computations are parallelized (either via forking or with clusters, including MPI and socket clusters) and use ff for storing data.
Last updated
microarraycopynumbervariationacgh
3.60 score 10 scripts 562 downloads
qpcrNorm - Data-driven normalization strategies for high-throughput qPCR data.
The package contains functions to perform normalization of high-throughput qPCR data. Basic functions for processing raw Ct data plus functions to generate diagnostic plots are also available.
Last updated
preprocessinggeneexpression
3.60 score 9 scripts 363 downloads
KCsmart - Multi sample aCGH analysis package using kernel convolution
Multi sample aCGH analysis package using kernel convolution
Last updated
copynumbervariationvisualizationacghmicroarray
3.60 score 6 scripts 452 downloads
biocGraph - Graph examples and use cases in Bioinformatics
This package provides examples and code that make use of the different graph related packages produced by Bioconductor.
Last updated
visualizationgraphandnetwork
3.60 score 4 scripts 602 downloads
copa - Functions to perform cancer outlier profile analysis.
COPA is a method to find genes that undergo recurrent fusion in a given cancer type by finding pairs of genes that have mutually exclusive outlier profiles.
Last updated
onechanneltwochanneldifferentialexpressionvisualization
3.60 score 9 scripts 470 downloads
MiPP - Misclassification Penalized Posterior Classification
This package finds optimal sets of genes that separate samples into two or more classes.
Last updated
microarrayclassification
3.60 score 3 scripts 432 downloads
OLINgui - Graphical user interface for OLIN
Graphical user interface for the OLIN package
Last updated
microarraytwochannelqualitycontrolpreprocessingvisualization
3.60 score 6 scripts 451 downloads
ldblock - data structures for linkage disequilibrium measures in populations
Define data structures for linkage disequilibrium measures in populations.
Last updated
3.56 score 12 scripts 552 downloads
dyebias - The GASSCO method for correcting for slide-dependent gene-specific dye bias
Many two-colour hybridizations suffer from a dye bias that is both gene-specific and slide-specific. The former depends on the content of the nucleotide used for labeling; the latter depends on the labeling percentage. The slide-dependency was hitherto not recognized, and made addressing the artefact impossible. Given a reasonable number of dye-swapped pairs of hybridizations, or of same vs. same hybridizations, both the gene- and slide-biases can be estimated and corrected using the GASSCO method (Margaritis et al., Mol. Sys. Biol. 5:266 (2009), doi:10.1038/msb.2009.21)
Last updated
microarraytwochannelqualitycontrolpreprocessing
3.56 score 18 scripts 530 downloads
borealis - Bisulfite-seq OutlieR mEthylation At singLe-sIte reSolution
Borealis is an R library performing outlier analysis for count-based bisulfite sequencing data. It detects outlier methylated CpG sites from bisulfite sequencing (BS-seq). The core of Borealis is modeling Beta-Binomial distributions. This can be useful for rare disease diagnoses.
Last updated
sequencingcoveragednamethylationdifferentialmethylation
3.53 score 17 scripts 322 downloads
TRESS - Toolbox for mRNA epigenetics sequencing analysis
This package is devoted to analyzing MeRIP-seq data. Current functionalities include (1) detecting transcriptome-wide m6A methylation regions and (2) detecting transcriptome-wide differential m6A methylation regions.
Last updated
epigeneticsrnaseqpeakdetectiondifferentialmethylation
3.48 score 1 dependents 9 scripts 364 downloads
deltaCaptureC - This Package Discovers Meso-scale Chromatin Remodeling from 3C Data
This package discovers meso-scale chromatin remodelling from 3C data. 3C data is local in nature. It gives interaction counts between restriction enzyme digestion fragments and a preferred 'viewpoint' region. By binning this data and using permutation testing, this package can test whether there are statistically significant changes in the interaction counts between the data from two cell types or two treatments.
Last updated
biologicalquestionstatisticalmethod
3.48 score 1 scripts 312 downloads
miRcomp - Tools to assess and compare miRNA expression estimation methods
Based on a large miRNA dilution study, this package provides tools to read in the raw amplification data and use these data to assess the performance of methods that estimate expression from the amplification curves.
Last updated
softwareqpcrpreprocessingqualitycontrol
3.48 score 5 scripts 465 downloads
REDseq - Analysis of high-throughput sequencing data processed by restriction enzyme digestion
The package includes functions to build restriction enzyme cut site (RECS) map, distribute mapped sequences on the map with five different approaches, find enriched/depleted RECSs for a sample, and identify differentially enriched/depleted RECSs between samples.
Last updated
sequencingsequencematchingpreprocessing
3.48 score 15 scripts 454 downloads
chopsticks - The 'snp.matrix' and 'X.snp.matrix' Classes
Implements classes and methods for large-scale SNP association studies
Last updated
microarraysnpsandgeneticvariabilitysnpgeneticvariability
3.48 score 6 scripts 410 downloads
R453Plus1Toolbox - A package for importing and analyzing data from Roche's Genome Sequencer System
The R453Plus1 Toolbox comprises useful functions for the analysis of data generated by Roche's 454 sequencing platform. It adds functions for quality assurance as well as for annotation and visualization of detected variants, complementing the software tools shipped by Roche with their product. Further, a pipeline for the detection of structural variants is provided.
Last updated
sequencinginfrastructuredataimportdatarepresentationvisualizationqualitycontrolreportwriting
3.48 score 10 scripts 410 downloads
microRNA - Data and functions for dealing with microRNAs
Different data resources for microRNAs and some functions for manipulating them.
Last updated
infrastructuregenomeannotationsequencematchingcpp
3.48 score 6 scripts 682 downloads
calm - Covariate Assisted Large-scale Multiple testing
Statistical methods for multiple testing with covariate information. Traditional multiple testing methods only consider a list of test statistics, such as p-values. Our methods incorporate the auxiliary information, such as the lengths of gene coding regions or the minor allele frequencies of SNPs, to improve power.
Last updated
bayesiandifferentialexpressiongeneexpressionregressionmicroarraysequencingrnaseqmultiplecomparisongeneticsimmunooncologymetabolomicsproteomicstranscriptomics
3.45 score 14 scripts 284 downloads
keggorthology - graph support for KO, KEGG Orthology
Graphical representation of the Feb 2010 KEGG Orthology. The KEGG orthology is a set of pathway IDs that are not to be confused with the KEGG ortholog IDs.
Last updated
pathwaysgraphandnetworkvisualizationkegg
3.45 score 14 scripts 451 downloads
randPack - Randomization routines for Clinical Trials
A suite of classes and functions for randomizing patients in clinical trials.
Last updated
statisticalmethod
3.41 score 13 scripts 400 downloads
ScreenR - Package to Perform High Throughput Biological Screening
ScreenR is a package suitable to perform hit identification in loss of function High Throughput Biological Screenings performed using barcoded shRNA-based libraries. ScreenR combines the computing power of software such as edgeR with the simplicity of use of the Tidyverse metapackage. ScreenR executes a pipeline able to find candidate hits from barcode counts, and integrates a wide range of visualization modes for each step of the analysis.
Last updated
softwareassaydomaingeneexpressionhigh-throughput-screening
3.38 score 1 stars 16 scripts 298 downloads
OrderedList - Similarities of Ordered Gene Lists
Detection of similarities between ordered lists of genes. Thereby, either simple lists can be compared or gene expression data can be used to deduce the lists. Significance of similarities is evaluated by shuffling lists or by resampling in microarray data, respectively.
Last updated
microarraydifferentialexpressionmultiplecomparison
3.34 score 11 scripts 425 downloads
idiogram - idiogram
A package for plotting genomic data by chromosomal location
Last updated
visualization
3.34 score 11 scripts 512 downloads
AHMassBank - MassBank Annotation Resources for AnnotationHub
Supplies AnnotationHub with MassBank metabolite/compound annotations bundled in CompDb SQLite databases. CompDb SQLite databases contain general compound annotation as well as fragment spectra representing fragmentation patterns of compounds' ions. MassBank data is retrieved from https://massbank.eu/MassBank and processed using helper functions from the CompoundDb Bioconductor package into redistributable SQLite databases.
Last updated
massspectrometryannotationhubsoftware
3.30 score 1 stars 1 scripts 298 downloads
mslp - Predict synthetic lethal partners of tumour mutations
An integrated pipeline to predict the potential synthetic lethality partners (SLPs) of tumour mutations, based on gene expression, mutation profiling and cell line genetic screens data. It has built-in support for data from cBioPortal. The primary SLPs correlating with mutations in WT and compensating for the loss of function of mutations are predicted by random forest based methods (GENIE3) and Rank Products, respectively. Genetic screens are employed to identify consensus SLPs that lead to reduced cell viability when perturbed.
Last updated
pharmacogeneticspharmacogenomics
3.30 score 6 scripts 246 downloads
factR - Functional Annotation of Custom Transcriptomes
factR contains tools to process and interact with custom-assembled transcriptomes (GTF). At its core, factR constructs CDS information on custom transcripts and subsequently predicts their functional output. In addition, factR has tools capable of plotting transcripts, correcting chromosome and gene information and shortlisting new transcripts.
Last updated
alternativesplicingfunctionalpredictiongenepredictioncustom-transcriptomesfunctional-annotationgtfrna-seq-analysis
3.30 score 2 stars 10 scripts 293 downloads
rGenomeTracks - Integrated visualization of epigenomic data
The rGenomeTracks package combines the power of the pyGenomeTracks software with the interactivity of R. pyGenomeTracks is Python software that offers a robust method for visualizing epigenetic data files such as narrowPeak, Hi-C matrices, TADs and arcs; however, there is currently no way to use it within an interactive R session. rGenomeTracks wraps the whole functionality of pyGenomeTracks and adds utilities to make it more pleasant for R users.
Last updated
softwarehicvisualization
3.30 score 5 scripts 304 downloads
DExMA - Differential Expression Meta-Analysis
Performs all the steps of gene expression meta-analysis, taking into account the possible existence of missing genes. It provides the functions needed to apply the different methods of gene expression meta-analysis. In addition, it contains functions to apply quality controls, download GEO datasets and show graphical representations of the results.
Last updated
differentialexpressiongeneexpressionstatisticalmethodqualitycontrol
3.30 score 8 scripts 334 downloads
sampleClassifier - Sample Classifier
The package is designed to classify microarray and RNA-seq gene expression profiles.
Last updated
immunooncologyclassificationmicroarrayrnaseqgeneexpression
3.30 score 3 scripts 346 downloads
SeqGate - Filtering of Lowly Expressed Features
Filtering of lowly expressed features (e.g. genes) is a common step before performing statistical analysis, but an arbitrary threshold is generally chosen. SeqGate implements a method that rationalizes this step by analyzing the distribution of counts in replicate samples. The gate is the threshold above which sequenced features can be considered as confidently quantified.
Last updated
differentialexpressiongeneexpressiontranscriptomicssequencingrnaseq
3.30 score 7 scripts 318 downloads
pageRank - Temporal and Multiplex PageRank for Gene Regulatory Network Analysis
Implemented temporal PageRank analysis as defined by Rozenshtein and Gionis. Implemented multiplex PageRank as defined by Halu et al. Applied temporal and multiplex PageRank in gene regulatory network analysis.
Last updated
statisticalmethodgenetargetnetwork
3.30 score 7 scripts 325 downloads
FilterFFPE - FFPE Artificial Chimeric Read Filter for NGS data
This package finds and filters artificial chimeric reads specifically generated in the next-generation sequencing (NGS) process of formalin-fixed paraffin-embedded (FFPE) tissues. These artificial chimeric reads can lead to a large number of false positive structural variation (SV) calls. The required input is an indexed BAM file of an FFPE sample.
Last updated
structuralvariationsequencingalignmentqualitycontrolpreprocessing
3.30 score 1 scripts 340 downloads
MultiBaC - Multiomic Batch effect Correction
MultiBaC is a strategy to correct batch effects from multiomic datasets distributed across different labs or data acquisition events. MultiBaC is the first batch effect correction algorithm that deals with multiomic datasets. MultiBaC is able to remove batch effects across different omics generated within separate batches provided that at least one common omic data type is included in all the batches considered.
Last updated
softwarestatisticalmethodprincipalcomponentdatarepresentationgeneexpressiontranscriptionbatcheffect
3.30 score 7 scripts 440 downloads
optimalFlow - optimalFlow
Optimal-transport techniques applied to supervised flow cytometry gating.
Last updated
softwareflowcytometrytechnology
3.30 score 10 scripts 325 downloads
ROCpAI - Receiver Operating Characteristic Partial Area Indexes for evaluating classifiers
The package analyzes the ROC curve, identifies its type among the different kinds of ROC curves, and calculates the area under the curve using the most accurate method. The package can also standardize proper and improper pAUC.
Last updated
softwarestatisticalmethodclassification
3.30 score 5 scripts 308 downloads
SimFFPE - NGS Read Simulator for FFPE Tissue
The NGS (Next-Generation Sequencing) reads from FFPE (Formalin-Fixed Paraffin-Embedded) samples contain numerous artifact chimeric reads (ACRs), which can lead to false positive structural variant calls. These ACRs are derived from the combination of two single-stranded DNA (ss-DNA) fragments with short reverse complementary regions (SRCRs). This package simulates these artifact chimeric reads as well as normal reads for FFPE samples on the whole genome / several chromosomes / large regions.
Last updated
sequencingalignmentmultiplecomparisonsequencematchingdataimport
3.30 score 1 scripts 314 downloads
fcScan - fcScan for detecting clusters of coordinates with user defined options
This package is used to detect combinations of genomic coordinates falling within a user-defined window size, along with a user-defined overlap between identified neighboring clusters. It can be used for genomic data where the clusters are built on a specific chromosome or specific strand. Clustering can be performed with a "greedy" option, thus allowing the presence of additional sites within the allowed window size.
Last updated
genomeannotationclustering
3.30 score 1 scripts 456 downloads
KnowSeq - KnowSeq R/Bioc package: The Smart Transcriptomic Pipeline
KnowSeq proposes a novel methodology that comprises the most relevant steps in transcriptomic gene expression analysis. KnowSeq is intended as an integrative tool for processing and extracting relevant biomarkers, and for assessing them through machine learning approaches. The final objective of KnowSeq is biological knowledge extraction from the biomarkers (Gene Ontology enrichment, pathway listing and visualization, and evidence related to the addressed disease). Although the package allows analyzing all the data manually, the main strength of KnowSeq is the possibility of generating an automatic and intelligent HTML report that collects all the involved steps in one document. It is important to highlight that the pipeline is totally modular and flexible, so it can be started from any of the different steps. KnowSeq is intended as a novel tool to help experts in the field acquire robust knowledge and conclusions for the data and diseases under study.
Last updated
geneexpressiondifferentialexpressiongenesetenrichmentdataimportclassificationfeatureextractionsequencingrnaseqbatcheffectnormalizationpreprocessingqualitycontrolgeneticstranscriptomicsmicroarrayalignmentpathwayssystemsbiologygoimmunooncology
3.30 score 10 scripts 402 downloadslevi - Landscape Expression Visualization Interface
The tool integrates data from biological networks with transcriptomes, displaying a heatmap with surface curves to evidence the altered regions.
Last updated
geneexpressionsequencingnetworksoftwarecpp
3.30 score 4 scripts 348 downloadssparsenetgls - Using Gaussian graphical structure learning estimation in generalized least squares regression for multivariate normal regression
The package provides methods of combining the graph structure learning and generalized least squares regression to improve the regression estimation. The main function sparsenetgls() provides solutions for multivariate regression with Gaussian distributed dependent variables and explanatory variables, utilizing multiple well-known graph structure learning approaches to estimate the precision matrix, and uses a penalized variance covariance matrix with a distance tuning parameter of the graph structure in deriving the sandwich estimators in generalized least squares (gls) regression. This package also provides functions for assessing a Gaussian graphical model which uses the penalized approach. It uses the Receiver Operating Characteristic (ROC) curve as a visualization tool in the assessment.
Last updated
immunooncologygraphandnetworkregressionmetabolomicscopynumbervariationmassspectrometryproteomicssoftwarevisualization
3.30 score 3 scripts 328 downloadsLRBaseDbi - DBI to construct LRBase-related package
Interface to construct LRBase package (LRBase.XXX.eg.db).
Last updated
infrastructure
3.30 score 7 scripts 337 downloadsappreci8R - appreci8R: an R/Bioconductor package for filtering SNVs and short indels with high sensitivity and high PPV
The appreci8R is an R version of our appreci8-algorithm - A Pipeline for PREcise variant Calling Integrating 8 tools. Variant calling results of our standard appreci8-tools (GATK, Platypus, VarScan, FreeBayes, LoFreq, SNVer, samtools and VarDict), as well as up to 5 additional tools, are combined, evaluated and filtered.
Last updated
variantdetectiongeneticvariabilitysnpvariantannotationsequencing
3.30 score 1 scripts 394 downloadsipdDb - IPD IMGT/HLA and IPD KIR database for Homo sapiens
All alleles from the IPD IMGT/HLA <https://www.ebi.ac.uk/ipd/imgt/hla/> and IPD KIR <https://www.ebi.ac.uk/ipd/kir/> database for Homo sapiens. Reference: Robinson J, Maccari G, Marsh SGE, Walter L, Blokhuis J, Bimber B, Parham P, De Groot NG, Bontrop RE, Guethlein LA, and Hammond JA KIR Nomenclature in non-human species Immunogenetics (2018), in preparation.
Last updated
genomicvariationsequencematchingvariantannotationdatarepresentationannotationhubsoftware
3.30 score 8 scripts 338 downloadsRmmquant - RNA-Seq multi-mapping Reads Quantification Tool
RNA-Seq is currently used routinely, and it provides accurate information on gene transcription. However, the method cannot accurately estimate the expression of duplicated genes. Several strategies have been previously used, but all of them provide biased results. With Rmmquant, if a read maps at different positions, the tool detects that the corresponding genes are duplicated; it merges the genes and creates a merged gene. The counts of ambiguous reads are then based on the input genes and the merged genes. Rmmquant is a drop-in replacement of the widely used tools findOverlaps and featureCounts that handles multi-mapping reads in an unbiased way.
Last updated
geneexpressiontranscriptionzlibcpp
3.30 score 5 scripts 340 downloadsmissRows - Handling Missing Individuals in Multi-Omics Data Integration
The missRows package implements the MI-MFA method to deal with missing individuals ('biological units') in multi-omics data integration. The MI-MFA method generates multiple imputed datasets from a Multiple Factor Analysis model, then the results are combined into a single consensus solution. The package provides functions for estimating coordinates of individuals and variables, imputing missing individuals, and various diagnostic plots to inspect the pattern of missingness and visualize the uncertainty due to missing values.
Last updated
softwarestatisticalmethoddimensionreductionprincipalcomponentmathematicalbiologyvisualization
3.30 score 3 scripts 369 downloadsGateFinder - Projection-based Gating Strategy Optimization for Flow and Mass Cytometry
Given a vector of cluster memberships for a cell population, identifies a sequence of gates (polygon filters on 2D scatter plots) for isolation of that cell type.
Last updated
immunooncologyflowcytometrycellbiologyclustering
3.30 score 7 scripts 368 downloadsClusterJudge - Judging Quality of Clustering Methods using Mutual Information
ClusterJudge implements the functions, examples and other software published as an algorithm by Gibbons, FD and Roth FP. The article is called "Judging the Quality of Gene Expression-Based Clustering Methods Using Gene Annotation" and it appeared in Genome Research, vol. 12, pp1574-1581 (2002). See package?ClusterJudge for an overview.
Last updated
softwarestatisticalmethodclusteringgeneexpressiongo
3.30 score 3 scripts 422 downloadsprofileScoreDist - Profile score distributions
Regularization and score distributions for position count matrices.
Last updated
softwaregeneregulationstatisticalmethodcpp
3.30 score 4 scripts 303 downloadsnormalize450K - Preprocessing of Illumina Infinium 450K data
Precise measurements are important for epigenome-wide studies investigating DNA methylation in whole blood samples, where effect sizes are expected to be small in magnitude. The 450K platform is often affected by batch effects and proper preprocessing is recommended. This package provides functions to read and normalize 450K '.idat' files. The normalization corrects for dye bias and biases related to signal intensity and methylation of probes using local regression. No adjustment for probe type bias is performed to avoid the trade-off of precision for accuracy of beta-values.
Last updated
normalizationdnamethylationmicroarraytwochannelpreprocessingmethylationarray
3.30 score 1 scripts 398 downloadsPath2PPI - Prediction of pathway-related protein-protein interaction networks
Package to predict protein-protein interaction (PPI) networks in target organisms for which only little information about PPIs is available. Path2PPI predicts PPI networks based on sets of proteins which can belong to a certain pathway from well-established model organisms. It helps to combine and transfer information of a certain pathway or biological process from several reference organisms to one target organism. Path2PPI only depends on the sequence similarity of the involved proteins.
Last updated
networkinferencesystemsbiologynetworkproteomicspathways
3.30 score 5 scripts 432 downloadsBBCAnalyzer - BBCAnalyzer: an R/Bioconductor package for visualizing base counts
BBCAnalyzer is a package for visualizing the relative or absolute number of bases, deletions and insertions at defined positions in sequence alignment data available as bam files in comparison to the reference bases. Markers for the relative base frequencies, the mean quality of the detected bases, known mutations or polymorphisms and variants called in the data may additionally be included in the plots.
Last updated
sequencingalignmentcoveragegeneticvariabilitysnp
3.30 score 3 scripts 512 downloadssynlet - Hits Selection for Synthetic Lethal RNAi Screen Data
Select hits from synthetic lethal RNAi screen data. For example, there are two identical cell lines except that one gene is knocked down in one cell line. The interest is to find genes that lead to a stronger lethal effect when they are knocked down further by siRNA. Quality control and various visualisation tools are implemented. Four different algorithms can be used to pick the interesting hits. This package is designed based on 384-well plates, but may apply to other platforms with proper configuration.
Last updated
immunooncologycellbasedassaysqualitycontrolpreprocessingvisualizationfeatureextraction
3.30 score 7 scripts 381 downloadstraseR - GWAS trait-associated SNP enrichment analyses in genomic intervals
traseR performs GWAS trait-associated SNP enrichment analyses in genomic intervals using different hypothesis testing approaches, also provides various functionalities to explore and visualize the results.
Last updated
geneticssequencingcoveragealignmentqualitycontroldataimport
3.30 score 5 scripts 416 downloadsfCI - f-divergence Cutoff Index for Differential Expression Analysis in Transcriptomics and Proteomics
fCI (f-divergence Cutoff Index) finds DEGs in transcriptomic and proteomic data, identifying them by computing the difference between the distribution of fold-changes for the control-control and remaining (non-differential) case-control gene expression ratio data. fCI provides several advantages compared to existing methods.
Last updated
proteomics
3.30 score 5 scripts 402 downloadsmirIntegrator - Integrating microRNA expression into signaling pathways for pathway analysis
Tools for augmenting signaling pathways to perform pathway analysis of microRNA and mRNA expression levels.
Last updated
networkmicroarraygraphandnetworkpathwayskegg
3.30 score 1 stars 4 scripts 404 downloadshierGWAS - Assessing statistical significance in predictive GWA studies
Testing individual SNPs, as well as arbitrarily large groups of SNPs in GWA studies, using a joint model of all SNPs. The method controls the FWER, and provides an automatic, data-driven refinement of the SNP clusters to smaller groups or single markers.
Last updated
snplinkagedisequilibriumclustering
3.30 score 1 scripts 380 downloadsacde - Artificial Components Detection of Differentially Expressed Genes
This package provides a multivariate inferential analysis method for detecting differentially expressed genes in gene expression data. It uses artificial components, close to the data's principal components but with an exact interpretation in terms of differential genetic expression, to identify differentially expressed genes while controlling the false discovery rate (FDR). The methods in this package are described in the vignette or in the article 'Multivariate Method for Inferential Identification of Differentially Expressed Genes in Gene Expression Experiments' by J. P. Acosta, L. Lopez-Kleine and S. Restrepo (2015, pending publication).
Last updated
differentialexpressiontimecourseprincipalcomponentgeneexpressionmicroarraymrnamicroarray
3.30 score 6 scripts 542 downloadspandaR - PANDA Algorithm
Runs PANDA, an algorithm for discovering novel network structure by combining information from multiple complementary data sources.
Last updated
statisticalmethodgraphandnetworkmicroarraygeneregulationnetworkinferencegeneexpressiontranscriptionnetwork
3.30 score 10 scripts 452 downloadsMethTargetedNGS - Perform Methylation Analysis on Next Generation Sequencing Data
Perform step by step methylation analysis of Next Generation Sequencing data.
Last updated
researchfieldgeneticssequencingalignmentsequencematchingdataimport
3.30 score 6 scripts 340 downloadsFISHalyseR - FISHalyseR a package for automated FISH quantification
FISHalyseR provides functionality to process and analyse digital cell culture images, in particular to quantify FISH probes within nuclei. Furthermore, it extracts the spatial location of each nucleus as well as each probe, enabling spatial co-localisation analysis.
Last updated
cellbiology
3.30 score 8 scripts 324 downloadsMatrixRider - Obtain total affinity and occupancies for binding site matrices on a given sequence
Calculates a single number for a whole sequence that reflects the propensity of a DNA binding protein to interact with it. The DNA binding protein has to be described with a PFM matrix, for example gotten from Jaspar.
Last updated
generegulationgeneticsmotifannotation
3.30 score 6 scripts 384 downloadsdiggit - Inference of Genetic Variants Driving Cellular Phenotypes
Inference of Genetic Variants Driving Cellular Phenotypes by the DIGGIT algorithm
Last updated
systemsbiologynetworkenrichmentgeneexpressionfunctionalpredictiongeneregulation
3.30 score 5 scripts 370 downloadsskewr - Visualize Intensities Produced by Illumina's Human Methylation 450k BeadChip
The skewr package is a tool for visualizing the output of the Illumina Human Methylation 450k BeadChip to aid in quality control. It creates a panel of nine plots. Six of the plots represent the density of either the methylated intensity or the unmethylated intensity given by one of three subsets of the 485,577 total probes. These subsets include Type I-red, Type I-green, and Type II. The remaining three distributions give the density of the Beta-values for these same three subsets. Each of the nine plots optionally displays the distributions of the "rs" SNP probes and the probes associated with imprinted genes as series of 'tick' marks located above the x-axis.
Last updated
dnamethylationtwochannelpreprocessingqualitycontrol
3.30 score 9 scripts 361 downloadssigsquared - Gene signature generation for functionally validated signaling pathways
By leveraging statistical properties (log-rank test for survival) of patient cohorts defined by binary thresholds, poor-prognosis patients are identified by the sigsquared package via optimization over a cost function reducing type I and II error.
Last updated
3.30 score 1 scripts 380 downloadsparglms - support for parallelized estimation of GLMs/GEEs
This package provides support for parallelized estimation of GLMs/GEEs, catering for dispersed data.
Last updated
3.30 score 6 scripts 386 downloadsrgsepd - Gene Set Enrichment / Projection Displays
R/GSEPD is a bioinformatics package for R to help disambiguate transcriptome samples (a matrix of RNA-Seq counts at transcript IDs) by automating differential expression (with DESeq2), then gene set enrichment (with GOSeq), and finally an N-dimensional projection to quantify in which ways each sample is like either treatment group.
Last updated
immunooncologysoftwaredifferentialexpressiongenesetenrichmentrnaseq
3.30 score 10 scripts 400 downloadscpvSNP - Gene set analysis methods for SNP association p-values that lie in genes in given gene sets
Gene set analysis methods exist to combine SNP-level association p-values into gene sets, calculating a single association p-value for each gene set. This package implements two such methods that require only the calculated SNP p-values, the gene set(s) of interest, and a correlation matrix (if desired). One method (GLOSSI) requires independent SNPs and the other (VEGAS) can take into account correlation (LD) among the SNPs. Built-in plotting functions are available to help users visualize results.
Last updated
geneticsstatisticalmethodpathwaysgenesetenrichmentgenomicvariation
3.30 score 6 scripts 448 downloadsMBAmethyl - Model-based analysis of DNA methylation data
This package provides a function for reconstructing DNA methylation values from raw measurements. It iteratively implements the group fused lars to smooth related-by-location methylation values and the constrained least squares to remove probe affinity effect across multiple sequences.
Last updated
dnamethylationmethylationarray
3.30 score 6 scripts 306 downloadsIMPCdata - Retrieves data from IMPC database
Package contains methods for data retrieval from IMPC Database.
Last updated
experimentdata
3.30 score 9 scripts 322 downloadsSigCheck - Check a gene signature's prognostic performance against random signatures, known signatures, and permuted data/metadata
While gene signatures are frequently used to predict phenotypes (e.g. predict prognosis of cancer patients), it is not always clear how optimal or meaningful they are (cf David Venet, Jacques E. Dumont, and Vincent Detours' paper "Most Random Gene Expression Signatures Are Significantly Associated with Breast Cancer Outcome"). Based on suggestions in that paper, SigCheck accepts a data set (as an ExpressionSet) and a gene signature, and compares its performance on survival and/or classification tasks against a) random gene signatures of the same length; b) known, related and unrelated gene signatures; and c) permuted data and/or metadata.
Last updated
geneexpressionclassificationgenesetenrichment
3.30 score 4 scripts 364 downloadsASGSCA - Association Studies for multiple SNPs and multiple traits using Generalized Structured Equation Models
The package provides tools to model and test the association between multiple genotypes and multiple traits, taking into account the prior biological knowledge. Genes, and clinical pathways are incorporated in the model as latent variables. The method is based on Generalized Structured Component Analysis (GSCA).
Last updated
structuralequationmodels
3.30 score 3 scripts 482 downloadsproBAMr - Generating SAM file for PSMs in shotgun proteomics data
Mapping PSMs back to the genome. The package builds a SAM file from shotgun proteomics data. The package also provides a function to prepare annotation from a GTF file.
Last updated
immunooncologyproteomicsmassspectrometrysoftwarevisualization
3.30 score 1 scripts 360 downloadscoGPS - cancer outlier Gene Profile Sets
Gene Set Enrichment Analysis of P-value based statistics for outlier gene detection in dataset merged from multiple studies
Last updated
microarraydifferentialexpression
3.30 score 6 scripts 441 downloadsternarynet - Ternary Network Estimation
Gene-regulatory network (GRN) modeling seeks to infer dependencies between genes and thereby provide insight into the regulatory relationships that exist within a cell. This package provides a computational Bayesian approach to GRN estimation from perturbation experiments using a ternary network model, in which gene expression is discretized into one of 3 states: up, unchanged, or down. The ternarynet package includes a parallel implementation of the replica exchange Monte Carlo algorithm for fitting network models, using MPI.
Last updated
softwarecellbiologygraphandnetworknetworkbayesiancpp
3.30 score 3 scripts 334 downloadsRCASPAR - A package for survival time prediction based on a piecewise baseline hazard Cox regression model.
The package is the R version of the C-based software CASPAR (Kaderali, 2006: http://bioinformatics.oxfordjournals.org/content/22/12/1495). It is meant to help predict survival times in the presence of high-dimensional explanatory covariates. The model is a piecewise baseline hazard Cox regression model with an Lq-norm based prior that selects for the most important regression coefficients, and in turn the most relevant covariates for survival analysis. It was primarily tried on gene expression and aCGH data, but can be used on any other type of high-dimensional data and in disciplines other than biology and medicine.
Last updated
acghgeneexpressiongeneticsproteomicsvisualization
3.30 score 1 scripts 317 downloadsflowPlots - flowPlots: analysis plots and data class for gated flow cytometry data
Graphical displays with embedded statistical tests for gated ICS flow cytometry data, and a data class which stores "stacked" data and has methods for computing summary measures on stacked data, such as marginal and polyfunctional degree data.
Last updated
immunooncologyflowcytometrycellbasedassaysvisualizationdatarepresentation
3.30 score 6 scripts 350 downloadsibh - Interaction Based Homogeneity for Evaluating Gene Lists
This package contains methods for calculating Interaction Based Homogeneity to evaluate fitness of gene lists to an interaction network which is useful for evaluation of clustering results and gene list analysis. BioGRID interactions are used in the calculation. The user can also provide their own interactions.
Last updated
qualitycontroldataimportgraphandnetworknetworkenrichment
3.30 score 6 scripts 390 downloadspvac - PCA-based gene filtering for Affymetrix arrays
The package contains the function for filtering genes by the proportion of variation accounted for by the first principal component (PVAC).
Last updated
microarrayonechannelqualitycontrol
3.30 score 4 scripts 402 downloadsTurboNorm - A fast scatterplot smoother suitable for microarray normalization
A fast scatterplot smoother based on B-splines with second-order difference penalty. Functions for microarray normalization of single-colour data i.e. Affymetrix/Illumina and two-colour data supplied as marray MarrayRaw-objects or limma RGList-objects are available.
Last updated
microarrayonechanneltwochannelpreprocessingdnamethylationcpgislandmethylationarraynormalization
3.30 score 3 scripts 413 downloadshyperdraw - Visualizing Hypergraphs
Functions for visualizing hypergraphs.
Last updated
visualizationgraphandnetwork
3.30 score 8 scripts 492 downloadsChromHeatMap - Heat map plotting by genome coordinate
The ChromHeatMap package can be used to plot genome-wide data (e.g. expression, CGH, SNP) along each strand of a given chromosome as a heat map. The generated heat map can be used to interactively identify probes and genes of interest.
Last updated
visualization
3.30 score 3 scripts 438 downloadsclippda - A package for the clinical proteomic profiling data analysis
Methods for the analysis of data from clinical proteomic profiling studies. The focus is on the studies of human subjects, which are often observational case-control by design and have technical replicates. A method for sample size determination for planning these studies is proposed. It incorporates routines for adjusting for the expected heterogeneities and imbalances in the data and the within-sample replicate correlations.
Last updated
proteomicsonechannelpreprocessingdifferentialexpressionmultiplecomparison
3.30 score 7 scripts 476 downloadsMiChip - MiChip Parsing and Summarizing Functions
This package takes the MiChip miRNA microarray .grp scanner output files and parses these out, providing summary and plotting functions to analyse MiChip hybridizations. A set of hybridizations is packaged into an ExpressionSet allowing it to be used by other BioConductor packages.
Last updated
microarraypreprocessing
3.30 score 6 scripts 406 downloadsRmagpie - MicroArray Gene-expression-based Program In Error rate estimation
Microarray Classification is designed for both biologists and statisticians. It offers the ability to train a classifier on a labelled microarray dataset and to then use that classifier to predict the class of new observations. A range of modern classifiers are available, including support vector machines (SVMs), nearest shrunken centroids (NSCs)... Advanced methods are provided to estimate the predictive error rate and to report the subset of genes which appear essential in discriminating between classes.
Last updated
microarrayclassification
3.30 score 1 scripts 443 downloadsmetahdep - Hierarchical Dependence in Meta-Analysis
Tools for meta-analysis in the presence of hierarchical (and/or sampling) dependence, including with gene expression studies
Last updated
microarraydifferentialexpression
3.30 score 2 scripts 405 downloadsspkTools - Methods for Spike-in Arrays
The package contains functions that can be used to compare expression measures on different array platforms.
Last updated
softwaretechnologymicroarray
3.30 score 1 scripts 420 downloadsxmapbridge - Export plotting files to the xmapBridge for visualisation in X:Map
xmapBridge can plot graphs in the X:Map genome browser. This package exports plotting files in a suitable format.
Last updated
annotationreportwritingvisualization
3.30 score 3 scripts 498 downloadsPLPE - Local Pooled Error Test for Differential Expression with Paired High-throughput Data
This package performs tests for paired high-throughput data.
Last updated
proteomicsmicroarraydifferentialexpression
3.30 score 9 scripts 386 downloadsiterativeBMAsurv - The Iterative Bayesian Model Averaging (BMA) Algorithm For Survival Analysis
The iterative Bayesian Model Averaging (BMA) algorithm for survival analysis is a variable selection method for applying survival analysis to microarray data.
Last updated
microarray
3.30 score 8 scripts 372 downloadsSIM - Integrated Analysis on two human genomic datasets
Finds associations between two human genomic datasets.
Last updated
microarrayvisualization
3.30 score 7 scripts 444 downloadsvbmp - Variational Bayesian Multinomial Probit Regression
Variational Bayesian Multinomial Probit Regression with Gaussian Process Priors. It estimates class membership posterior probability employing variational and sparse approximation to the full posterior. This software also incorporates feature weighting by means of Automatic Relevance Determination.
Last updated
classification
3.30 score 3 scripts 528 downloadsoccugene - Functions for Multinomial Occupancy Distribution
Statistical tools for building random mutagenesis libraries for prokaryotes. The package has functions for handling the occupancy distribution for a multinomial and for estimating the number of essential genes in random transposon mutagenesis libraries.
Last updated
annotationpathways
3.30 score 362 downloadsRbcBook1 - Support for Springer monograph on Bioconductor
tools for building book
Last updated
software
3.30 score 7 scripts 796 downloadssplots - Visualization of high-throughput assays in microtitre plate or slide format
This package is here to support legacy usages of it, but it should not be used for new code development. It provides a single function, plotScreen, for visualising data in microtitre plate or slide format. As a better alternative for such functionality, please consider the platetools package on CRAN (https://cran.r-project.org/package=platetools and https://github.com/Swarchal/platetools), or ggplot2 (geom_raster, facet_wrap) as exemplified in the vignette of this package.
Last updated
visualizationsequencingmicrotitreplateassay
3.30 score 3 scripts 409 downloadsBioMVCClass - Model-View-Controller (MVC) Classes That Use Biobase
Creates classes used in model-view-controller (MVC) design
Last updated
visualizationinfrastructuregraphandnetwork
3.30 score 2 scripts 475 downloadsspikeLI - Affymetrix Spike-in Langmuir Isotherm Data Analysis Tool
SpikeLI is a package that performs the analysis of the Affymetrix spike-in data using the Langmuir Isotherm. The aim of this package is to show the advantages of a physical-chemistry based analysis of the Affymetrix microarray data compared to the traditional methods. The spike-in (or Latin square) data for the HGU95 and HGU133 chipsets have been downloaded from the Affymetrix web site. The model used in the spikeLI package is described in detail in E. Carlon and T. Heim, Physica A 362, 433 (2006).
Last updated
microarrayqualitycontrol
3.30 score 4 scripts 358 downloadsmaCorrPlot - Visualize artificial correlation in microarray data
Graphically displays correlation in microarray data that is due to insufficient normalization
Last updated
microarraypreprocessingvisualization
3.30 score 7 scripts 475 downloadsadSplit - Annotation-Driven Clustering
This package implements clustering of microarray gene expression profiles according to functional annotations. For each term genes are annotated to, splits into two subclasses are computed and a significance of the supporting gene set is determined.
Last updated
microarrayclusteringcpp
3.30 score 6 scripts 594 downloadsMantelCorr - Compute Mantel Cluster Correlations
Computes Mantel cluster correlations from a (p x n) numeric data matrix (e.g. microarray gene-expression data).
Last updated
clustering
3.30 score 7 scripts 358 downloadsfdrame - FDR adjustments of Microarray Experiments (FDR-AME)
This package contains two main functions. The first is fdr.ma, which takes a normalized expression data array and an experimental design and computes adjusted p-values. It returns the FDR-adjusted p-values and plots, according to the methods described in (Reiner, Yekutieli and Benjamini 2002). The second is fdr.gui(), which creates a simple graphical user interface to access fdr.ma.
Last updated
microarraydifferentialexpressionmultiplecomparison
3.30 score 400 downloadsgeneRecommender - A gene recommender algorithm to identify genes coexpressed with a query set of genes
This package contains a targeted clustering algorithm for the analysis of microarray data. The algorithm can aid in the discovery of new genes with similar functions to a given list of genes already known to have closely related functions.
Last updated
microarrayclustering
3.30 score 6 scripts 446 downloadsnnNorm - Spatial and intensity based normalization of cDNA microarray data based on robust neural nets
This package detects and corrects for spatial and intensity biases in two-channel microarray data. The normalization method implemented in this package is based on robust neural network fitting.
Last updated
microarraytwochannelpreprocessing
3.30 score 7 scripts 473 downloadsHEM - Heterogeneous error model for identification of differentially expressed genes under multiple conditions
This package fits heterogeneous error models for analysis of microarray data
Last updated
microarraydifferentialexpression
3.30 score 8 scripts 444 downloadsgoTools - Functions for Gene Ontology database
Wrapper functions for description/comparison of oligo ID lists using the Gene Ontology database
Last updated
microarraygovisualization
3.30 score 8 scripts 450 downloadsarrayQuality - Assessing array quality on spotted arrays
Functions for performing print-run and array level quality assessment.
Last updated
microarraytwochannelqualitycontrolvisualization
3.30 score 10 scripts 749 downloadspickgene - Adaptive Gene Picking for Microarray Expression Data Analysis
Functions to Analyze Microarray (Gene Expression) Data.
Last updated
microarraydifferentialexpression
3.30 score 5 scripts 370 downloadsecolitk - Meta-data and tools for E. coli
Meta-data and tools to work with E. coli. The tools are mostly plotting functions to work with circular genomes. They can be used with other genomes/plasmids.
Last updated
annotationvisualization
3.30 score 7 scripts 561 downloadsfactDesign - Factorial designed microarray experiment analysis
This package provides a set of tools for analyzing data from a factorial designed microarray experiment, or any microarray experiment for which a linear model is appropriate. The functions can be used to evaluate tests of contrast of biological interest and perform single outlier detection.
Last updated
microarraydifferentialexpression
3.30 score 6 scripts 411 downloadsMeasurementError.cor - Measurement Error model estimate for correlation coefficient
Two-stage measurement error model for correlation estimation with smaller bias than the usual sample correlation
Last updated
statisticalmethod
3.30 score 1 scripts 404 downloadsHiCBricks - Framework for Storing and Accessing Hi-C Data Through HDF Files
HiCBricks is a library designed for handling large high-resolution Hi-C datasets. Over the years, the Hi-C field has experienced a rapid increase in the size and complexity of datasets. HiCBricks is meant to overcome the challenges related to the analysis of such large datasets within the R environment. HiCBricks offers user-friendly and efficient solutions for handling large high-resolution Hi-C datasets. The package provides an R/Bioconductor framework with the bricks to build more complex data analysis pipelines and algorithms. HiCBricks already incorporates example algorithms for calling domain boundaries and functions for high quality data visualization.
Last updated
dataimportinfrastructuresoftwaretechnologysequencinghic
3.22 score 11 scripts 516 downloadsOMICsPCA - An R package for quantitative integration and analysis of multiple omics assays from heterogeneous samples
OMICsPCA is an analysis pipeline designed to integrate multi-omics experiments done on various subjects (e.g. cell lines, individuals), treatments (e.g. disease/control) or time points and to analyse such integrated data from various angles and perspectives. At its core, OMICsPCA uses Principal Component Analysis (PCA) to integrate multi-omics experiments from various sources and is thus able to overcome data insufficiency issues by using the integrated data as representatives. OMICsPCA can be used in various applications, including analysis of the overall distribution of OMICs assays across various samples/individuals/time points; grouping assays by user-defined conditions; and identification of sources of variation and of similarity/dissimilarity between assays, variables or individuals.
Last updated
immunooncologymultiplecomparisonprincipalcomponentdatarepresentationworkflowvisualizationdimensionreductionclusteringbiologicalquestionepigeneticsworkflowtranscriptiongeneticvariabilityguibiomedicalinformaticsepigeneticsfunctionalgenomicssinglecell
3.18 score 5 scripts 400 downloadshypergraph - A package providing hypergraph data structures
A package that implements some simple capabilities for representing and manipulating hypergraphs.
Last updated
graphandnetwork
3.16 score 2 dependents 12 scripts 576 downloadscelaref - Single-cell RNAseq cell cluster labelling by reference
After the clustering step of a single-cell RNAseq experiment, this package aims to suggest labels/cell types for the clusters, on the basis of similarity to a reference dataset. It requires a table of read counts per cell per gene, and a list of the cells belonging to each of the clusters (for both test and reference data).
Last updated
singlecell
3.00 score 5 scripts 348 downloadssincell - R package for the statistical assessment of cell state hierarchies from single-cell RNA-seq data
Cell differentiation processes are achieved through a continuum of hierarchical intermediate cell-states that might be captured by single-cell RNA seq. Existing computational approaches for the assessment of cell-state hierarchies from single-cell data might be formalized under a general workflow composed of i) a metric to assess cell-to-cell similarities (combined or not with a dimensionality reduction step), and ii) a graph-building algorithm (optionally making use of a cells-clustering step). Sincell R package implements a methodological toolbox allowing flexible workflows under such framework. Furthermore, Sincell contributes new algorithms to provide cell-state hierarchies with statistical support while accounting for stochastic factors in single-cell RNA seq. Graphical representations and functional association tests are provided to interpret hierarchies.
Last updated
immunooncologysequencingrnaseqclusteringgraphandnetworkvisualizationgeneexpressiongenesetenrichmentbiomedicalinformaticscellbiologyfunctionalgenomicssystemsbiologycpp
3.00 score 9 scripts 480 downloadsPIPETS - Poisson Identification of PEaks from Term-Seq data
PIPETS provides statistically robust analysis for 3'-seq/term-seq data. It utilizes a sliding window approach to apply a Poisson Distribution test to identify genomic positions with termination read coverage that is significantly higher than the surrounding signal. PIPETS then condenses proximal signal and produces strand specific results that contain all significant termination peaks.
Last updated
sequencingtranscriptiongeneregulationpeakdetectiongeneticstranscriptomicscoverage
3.00 score 2 scripts 268 downloadsMDTS - Detection of de novo deletion in targeted sequencing trios
A package for the detection of de novo copy number deletions in targeted sequencing of trios with high sensitivity and positive predictive value.
Last updated
statisticalmethodtechnologysequencingtargetedresequencingcoveragedataimport
2.78 score 4 scripts 339 downloadsbcSeq - Fast Sequence Mapping in High-Throughput shRNA and CRISPR Screens
This Rcpp-based package implements a highly efficient data structure and algorithm for aligning short reads from CRISPR or shRNA screens to a reference barcode library. Sequencing errors are considered and matching qualities are evaluated based on Phred scores. A Bayes' classifier is employed to predict the originating barcode of a read. The package supports user-defined probability models for evaluating matching qualities. The package also supports multi-threading.
Last updated
immunooncologyalignmentcrisprsequencingsequencematchingmultiplesequencealignmentsoftwareatacseqcpp
2.78 score 6 scripts 374 downloadsBufferedMatrixMethods - Microarray Data related methods that utilize BufferedMatrix objects
Microarray analysis methods that use BufferedMatrix objects
Last updated
infrastructure
2.78 score 2 scripts 402 downloadsDominoEffect - Identification and Annotation of Protein Hotspot Residues
The functions support identification and annotation of hotspot residues in proteins. These are individual amino acids that accumulate mutations at a much higher rate than their surrounding regions.
Last updated
softwaresomaticmutationproteomicssequencematchingalignment
2.70 score 1 scripts 337 downloadsCoCiteStats - Different test statistics based on co-citation.
A collection of software tools for dealing with co-citation data.
Last updated
software
2.60 score 4 scripts 419 downloadsGraphAT - Graph Theoretic Association Tests
Functions and data used in Balasubramanian, et al. (2004)
Last updated
networkgraphandnetwork
2.38 score 12 scripts 573 downloadsCTDquerier - Package for CTDbase data query, visualization and downstream analysis
Package to retrieve and visualize data from the Comparative Toxicogenomics Database (http://ctdbase.org/). The downloaded data is formatted as DataFrames for further downstream analyses.
Last updated
softwarebiomedicalinformaticsinfrastructuredataimportdatarepresentationgenesetenrichmentnetworkenrichmentpathwaysnetworkgokegg
2.30 score 6 scripts 379 downloadsrfaRm - An R interface to the Rfam database
rfaRm provides a client interface to the Rfam database of RNA families. Data that can be retrieved include RNA families, secondary structure images, covariance models, sequences within each family, alignments leading to the identification of a family and secondary structures in the dot-bracket format.
Last updated
functionalgenomicsdataimportthirdpartyclientvisualizationmultiplesequencealignment
2.30 score 1 scripts 354 downloadsRepViz - Replicate oriented Visualization of a genomic region
RepViz enables the view of a genomic region in a simple and efficient way. RepViz allows simultaneous viewing of both intra- and intergroup variation in sequencing counts of the studied conditions, as well as their comparison to the output features (e.g. identified peaks) from user-selected data analysis methods. The RepViz tool is primarily designed for chromatin data such as ChIP-seq and ATAC-seq, but can also be used with other sequencing data such as RNA-seq, or combinations of different types of genomic data.
Last updated
workflowstepvisualizationsequencingchipseqatacseqsoftwarecoveragegenomicvariation
2.30 score 1 scripts 342 downloadsdcGSA - Distance-correlation based Gene Set Analysis for longitudinal gene expression profiles
Distance-correlation based Gene Set Analysis for longitudinal gene expression profiles. In longitudinal studies, the gene expression profiles were collected at each visit from each subject and hence there are multiple measurements of the gene expression profiles for each subject. The dcGSA package could be used to assess the associations between gene sets and clinical outcomes of interest by fully taking advantage of the longitudinal nature of both the gene expression profiles and clinical outcomes.
Last updated
immunooncologygenesetenrichmentmicroarraystatisticalmethodsequencingrnaseqgeneexpression
2.30 score 1 scripts 377 downloadsRGSEA - Random Gene Set Enrichment Analysis
Combining bootstrap aggregating and gene set enrichment analysis (GSEA), RGSEA is a classification algorithm with high robustness and no over-fitting problem. It performs well, especially for data generated from different experiments.
Last updated
genesetenrichmentstatisticalmethodclassification
2.30 score 4 scripts 410 downloadsstepNorm - Stepwise normalization functions for cDNA microarrays
Stepwise normalization functions for cDNA microarray data.
Last updated
microarraytwochannelpreprocessing
2.30 score 3 scripts 465 downloadsdaMA - Efficient design and analysis of factorial two-colour microarray data
This package contains functions for the efficient design of factorial two-colour microarray experiments and for the statistical analysis of factorial microarray data. Statistical details are described in Bretz et al. (2003, submitted)
Last updated
microarraytwochanneldifferentialexpression
2.30 score 559 downloads
