GenomicRanges - Representation and manipulation of genomic intervals
The ability to efficiently represent and manipulate genomic annotations and alignments plays a central role in analyzing high-throughput sequencing data (a.k.a. NGS data). The GenomicRanges package defines general-purpose containers for storing and manipulating genomic intervals and variables defined along a genome. More specialized containers for representing and manipulating short alignments against a reference genome, or a matrix-like summarization of an experiment, are defined in the GenomicAlignments and SummarizedExperiment packages, respectively. Both packages build on top of the GenomicRanges infrastructure.
Last updated 3 months ago
geneticsinfrastructuredatarepresentationsequencingannotationgenomeannotationcoveragebioconductor-packagecore-package
17.88 score 45 stars 1.3k dependents 14k scripts 95k downloads
Biostrings - Efficient manipulation of biological strings
Memory efficient string containers, string matching algorithms, and other utilities, for fast manipulation of large biological sequences or sets of sequences.
Last updated 2 months ago
sequencematchingalignmentsequencinggeneticsdataimportdatarepresentationinfrastructurebioconductor-packagecore-package
17.85 score 58 stars 1.2k dependents 8.6k scripts 99k downloads
BiocParallel - Bioconductor facilities for parallel evaluation
This package provides modified versions and novel implementations of functions for parallel evaluation, tailored for use with Bioconductor objects.
Last updated 4 months ago
infrastructurebioconductor-packagecore-packageu24ca289073cpp
17.13 score 67 stars 1.1k dependents 6.7k scripts 82k downloads
S4Vectors - Foundation of vector-like and list-like containers in Bioconductor
The S4Vectors package defines the Vector and List virtual classes and a set of generic functions that extend the semantics of ordinary vectors and lists in R. Package developers can easily implement vector-like or list-like objects as concrete subclasses of Vector or List. In addition, a few low-level concrete subclasses of general interest (e.g. DataFrame, Rle, Factor, and Hits) are implemented in the S4Vectors package itself (many more are implemented in the IRanges package and in other Bioconductor infrastructure packages).
Last updated 8 hours ago
infrastructuredatarepresentationbioconductor-packagecore-package
16.10 score 18 stars 1.8k dependents 1.0k scripts 110k downloads
DESeq2 - Differential gene expression analysis based on the negative binomial distribution
Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution.
Last updated 7 days ago
sequencingrnaseqchipseqgeneexpressiontranscriptionnormalizationdifferentialexpressionbayesianregressionprincipalcomponentclusteringimmunooncologyopenblascpp
16.05 score 368 stars 115 dependents 17k scripts 36k downloads
biomaRt - Interface to BioMart databases (i.e. Ensembl)
In recent years a wealth of biological data has become available in public data repositories. Easy access to these valuable data resources and firm integration with data analysis is needed for comprehensive bioinformatics data analysis. biomaRt provides an interface to a growing collection of databases implementing the BioMart software suite (<http://www.biomart.org>). The package enables retrieval of large amounts of data in a uniform way without the need to know the underlying database schemas or write complex SQL queries. The most prominent examples of BioMart databases are maintained by Ensembl, which provides biomaRt users direct access to a diverse set of data and enables a wide range of powerful online queries from gene annotation to database mining.
Last updated 13 days ago
annotationbioconductorbiomartensembl
15.92 score 38 stars 231 dependents 13k scripts 38k downloads
DelayedArray - A unified framework for working transparently with on-disk and in-memory array-like datasets
Wrapping an array-like object (typically an on-disk object) in a DelayedArray object allows one to perform common array operations on it without loading the object in memory. In order to reduce memory usage and optimize performance, operations on the object are either delayed or executed using a block processing mechanism. Note that this also works on in-memory array-like objects like DataFrame objects (typically with Rle columns), Matrix objects, ordinary arrays, and data frames.
Last updated 8 days ago
infrastructuredatarepresentationannotationgenomeannotationbioconductor-packagecore-packageu24ca289073
15.64 score 26 stars 1.2k dependents 542 scripts 88k downloads
MultiAssayExperiment - Software for the integration of multi-omics experiments in Bioconductor
Harmonize data management of multiple experimental assays performed on an overlapping set of specimens. It provides a familiar Bioconductor user experience by extending concepts from SummarizedExperiment, supporting an open-ended mix of standard data classes for individual assays, and allowing subsetting by genomic ranges or rownames. Facilities are provided for reshaping data into wide and long formats for adaptability to graphing and downstream analysis.
Last updated 13 days ago
infrastructuredatarepresentationbioconductorbioconductor-packagegenomicsnci-itcrtcgau24ca289073
14.98 score 70 stars 128 dependents 670 scripts 9.2k downloads
TCGAbiolinks - TCGAbiolinks: An R/Bioconductor package for integrative analysis with GDC data
The aims of TCGAbiolinks are to: i) facilitate GDC open-access data retrieval, ii) prepare the data using the appropriate pre-processing strategies, iii) provide the means to carry out different standard analyses, and iv) easily reproduce earlier research results. In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.
Last updated 8 days ago
dnamethylationdifferentialmethylationgeneregulationgeneexpressionmethylationarraydifferentialexpressionpathwaysnetworksequencingsurvivalsoftwarebiocbioconductorgdcintegrative-analysistcgatcga-datatcgabiolinks
14.35 score 303 stars 6 dependents 1.6k scripts 6.5k downloads
BiocGenerics - S4 generic functions used in Bioconductor
The package defines many S4 generic functions used in Bioconductor.
Last updated 4 days ago
infrastructurebioconductor-packagecore-package
14.23 score 12 stars 2.2k dependents 612 scripts 108k downloads
HDF5Array - HDF5 datasets as array-like objects in R
The HDF5Array package is an HDF5 backend for DelayedArray objects. It implements the HDF5Array, H5SparseMatrix, H5ADMatrix, and TENxMatrix classes, 4 convenient and memory-efficient array-like containers for representing and manipulating either: (1) a conventional (a.k.a. dense) HDF5 dataset, (2) an HDF5 sparse matrix (stored in CSR/CSC/Yale format), (3) the central matrix of an h5ad file (or any matrix in the /layers group), or (4) a 10x Genomics sparse matrix. All these containers are DelayedArray extensions and thus support all operations (delayed or block-processed) supported by DelayedArray objects.
Last updated 15 hours ago
infrastructuredatarepresentationdataimportsequencingrnaseqcoverageannotationgenomeannotationsinglecellimmunooncologybioconductor-packagecore-packageu24ca289073
13.17 score 12 stars 123 dependents 860 scripts 27k downloads
SpatialExperiment - S4 Class for Spatially Resolved -omics Data
Defines an S4 class for storing data from spatial -omics experiments. The class extends SingleCellExperiment to support storage and retrieval of additional information from spot-based and molecule-based platforms, including spatial coordinates, images, and image metadata. A specialized constructor function is included for data from the 10x Genomics Visium platform.
Last updated 4 months ago
datarepresentationdataimportinfrastructureimmunooncologygeneexpressiontranscriptomicssinglecellspatial
12.56 score 59 stars 62 dependents 1.7k scripts 6.9k downloads
SparseArray - High-performance sparse data representation and manipulation in R
The SparseArray package provides array-like containers for efficient in-memory representation of multidimensional sparse data in R (arrays and matrices). The package defines the SparseArray virtual class and two concrete subclasses: COO_SparseArray and SVT_SparseArray. Each subclass uses its own internal representation of the nonzero multidimensional data: the "COO layout" and the "SVT layout", respectively. SVT_SparseArray objects mimic as much as possible the behavior of ordinary matrix and array objects in base R. In particular, they support most of the "standard matrix and array API" defined in base R and in the matrixStats package from CRAN.
Last updated 8 days ago
infrastructuredatarepresentationbioconductor-packagecore-packageopenmp
12.50 score 8 stars 1.2k dependents 51 scripts 75k downloads
glmGamPoi - Fit a Gamma-Poisson Generalized Linear Model
Fit linear models to overdispersed count data. The package can estimate the overdispersion and fit repeated models for matrix input. It is designed to handle large input datasets as they typically occur in single cell RNA-seq experiments.
Last updated 14 hours ago
regressionrnaseqsoftwaresinglecellgamma-poissonglmnegative-binomial-regressionon-diskopenblascpp
12.11 score 110 stars 4 dependents 1.0k scripts 8.6k downloads
SeqArray - Data Management of Large-Scale Whole-Genome Sequence Variant Calls
Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.
Last updated 9 days ago
infrastructuredatarepresentationsequencinggeneticsbioinformaticsgds-formatsnpsnvweswgscpp
12.07 score 45 stars 9 dependents 1.1k scripts 1.7k downloads
sparseMatrixStats - Summary Statistics for Rows and Columns of Sparse Matrices
High performance functions for row and column operations on sparse matrices. For example: col / rowMeans2, col / rowMedians, col / rowVars etc. Currently, the optimizations are limited to data in the column sparse format. This package is inspired by the matrixStats package by Henrik Bengtsson.
Last updated 4 months ago
infrastructuresoftwaredatarepresentationcpp
12.01 score 54 stars 131 dependents 178 scripts 27k downloads
GenomicDataCommons - NIH / NCI Genomic Data Commons Access
Programmatically access the NIH / NCI Genomic Data Commons RESTful service.
Last updated 7 days ago
dataimportsequencingapi-clientbioconductorbioinformaticscancercore-servicesdata-sciencegenomicsncitcgavignette
11.93 score 85 stars 12 dependents 238 scripts 1.8k downloads
QFeatures - Quantitative features for mass spectrometry data
The QFeatures infrastructure enables the management and processing of quantitative features for high-throughput mass spectrometry assays. It provides a familiar Bioconductor user experience to manage quantitative data across different assay levels (such as peptide spectrum matches, peptides and proteins) in a coherent and tractable format.
Last updated 4 days ago
infrastructuremassspectrometryproteomicsmetabolomicsbioconductormass-spectrometry
11.64 score 26 stars 49 dependents 252 scripts 3.5k downloads
mia - Microbiome analysis
mia implements tools for microbiome analysis based on the SummarizedExperiment, SingleCellExperiment and TreeSummarizedExperiment infrastructure. Data wrangling and analysis in the context of taxonomic data is the main scope. Additional functions for common tasks are implemented, such as community indices calculation and summarization.
Last updated 14 days ago
microbiomesoftwaredataimportanalysisbioconductor
11.41 score 52 stars 4 dependents 316 scripts 2.6k downloads
S4Arrays - Foundation of array-like containers in Bioconductor
The S4Arrays package defines the Array virtual class to be extended by other S4 classes that wish to implement a container with array-like semantics. It also provides: (1) low-level functionality meant to help developers of such containers implement basic operations like display, subsetting, or coercion of their array-like objects to an ordinary matrix or array, and (2) a framework that facilitates block processing of array-like objects (typically on-disk objects).
Last updated 3 days ago
infrastructuredatarepresentationbioconductor-packagecore-package
11.32 score 5 stars 1.2k dependents 8 scripts 73k downloads
decoupleR - decoupleR: Ensemble of computational methods to infer biological activities from omics data
Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor package containing different statistical methods to extract these signatures within a unified framework. decoupleR allows the user to flexibly test any method with any resource. It incorporates methods that take into account the sign and weight of network interactions. decoupleR can be used with any omic, as long as its features can be linked to a biological process based on prior knowledge. For example, in transcriptomics gene sets regulated by a transcription factor, or in phospho-proteomics phosphosites that are targeted by a kinase.
Last updated 4 months ago
differentialexpressionfunctionalgenomicsgeneexpressiongeneregulationnetworksoftwarestatisticalmethodtranscription
11.20 score 216 stars 3 dependents 316 scripts 1.7k downloads
Maaslin2 - "Multivariable Association Discovery in Population-scale Meta-omics Studies"
MaAsLin2 is a comprehensive R package for efficiently determining multivariable association between clinical metadata and microbial meta'omic features. MaAsLin2 relies on general linear models to accommodate most modern epidemiological study designs, including cross-sectional and longitudinal, and offers a variety of data exploration, normalization, and transformation methods. MaAsLin2 is the next generation of MaAsLin.
Last updated 4 months ago
metagenomicssoftwaremicrobiomenormalizationbiobakerybioconductordifferential-abundance-analysisfalse-discovery-ratemultiple-covariatespublicrepeated-measurestools
11.01 score 130 stars 3 dependents 532 scripts 1.6k downloads
scater - Single-Cell Analysis Toolkit for Gene Expression Data in R
A collection of tools for doing various analyses of single-cell RNA-seq gene expression data, with a focus on quality control and visualization.
Last updated 10 days ago
immunooncologysinglecellrnaseqqualitycontrolpreprocessingnormalizationvisualizationdimensionreductiontranscriptomicsgeneexpressionsequencingsoftwaredataimportdatarepresentationinfrastructurecoverage
10.96 score 40 dependents 12k scripts 12k downloads
infercnv - Infer Copy Number Variation from Single-Cell RNA-Seq Data
Using single-cell RNA-Seq expression to visualize CNV in cells.
Last updated 4 months ago
softwarecopynumbervariationvariantdetectionstructuralvariationgenomicvariationgeneticstranscriptomicsstatisticalmethodbayesianhiddenmarkovmodelsinglecelljagscpp
10.89 score 588 stars 654 scripts 2.4k downloads
tximeta - Transcript Quantification Import with Automatic Metadata
Transcript quantification import from Salmon and other quantifiers with automatic attachment of transcript ranges and release information, and other associated metadata. De novo transcriptomes can be linked to the appropriate sources with linkedTxomes and shared for computational reproducibility.
Last updated 12 days ago
annotationgenomeannotationdataimportpreprocessingrnaseqsinglecelltranscriptomicstranscriptiongeneexpressionfunctionalgenomicsreproducibleresearchreportwritingimmunooncology
10.68 score 67 stars 1 dependents 466 scripts 2.0k downloads
MsCoreUtils - Core Utils for Mass Spectrometry Data
MsCoreUtils defines low-level functions for mass spectrometry data and is independent of any high-level data structures. These functions include mass spectra processing functions (noise estimation, smoothing, binning, baseline estimation), quantitative aggregation functions (median polish, robust summarisation, ...), missing data imputation, data normalisation (quantiles, vsn, ...), and miscellaneous helper functions that are used across high-level data structures within the R for Mass Spectrometry packages.
Last updated 4 months ago
infrastructureproteomicsmassspectrometrymetabolomicsbioconductormass-spectrometryutils
10.60 score 16 stars 69 dependents 41 scripts 5.9k downloads
basilisk - Freezing Python Dependencies Inside Bioconductor Packages
Installs a self-contained conda instance that is managed by the R/Bioconductor installation machinery. This aims to provide a consistent Python environment that can be used reliably by Bioconductor packages. Functions are also provided to enable smooth interoperability of multiple Python environments in a single R session.
Last updated 4 months ago
infrastructurebioconductor-package
10.58 score 27 stars 36 dependents 75 scripts 5.5k downloads
scRepertoire - A toolkit for single-cell immune receptor profiling
scRepertoire is a toolkit for processing and analyzing single-cell T-cell receptor (TCR) and immunoglobulin (Ig) data. The scRepertoire framework supports use of 10x, AIRR, BD, MiXCR, Omniscope, TRUST4, and WAT3R single-cell formats. The functionality includes basic clonal analyses, repertoire summaries, distance-based clustering and interaction with the popular Seurat and SingleCellExperiment/Bioconductor R workflows.
Last updated 13 days ago
softwareimmunooncologysinglecellclassificationannotationsequencingcpp
10.52 score 323 stars 240 scripts 1.1k downloads
Glimma - Interactive visualizations for gene expression analysis
This package produces interactive visualizations for RNA-seq data analysis, utilizing output from limma, edgeR, or DESeq2. It produces interactive htmlwidgets versions of popular RNA-seq analysis plots to enhance the exploration of analysis results by overlaying interactive features. The plots can be viewed in a web browser or embedded in notebook documents.
Last updated 3 days ago
differentialexpressiongeneexpressionmicroarrayreportwritingrnaseqsequencingvisualizationdifferential-expressioninteractive-visualizations
10.49 score 28 stars 1 dependents 600 scripts 1.7k downloads
GENESIS - GENetic EStimation and Inference in Structured samples (GENESIS): Statistical methods for analyzing genetic data from samples with population structure and/or relatedness
The GENESIS package provides methodology for estimating, inferring, and accounting for population and pedigree structure in genetic analyses. The current implementation provides functions to perform PC-AiR (Conomos et al., 2015, Gen Epi) and PC-Relate (Conomos et al., 2016, AJHG). PC-AiR performs a Principal Components Analysis on genome-wide SNP data for the detection of population structure in a sample that may contain known or cryptic relatedness. Unlike standard PCA, PC-AiR accounts for relatedness in the sample to provide accurate ancestry inference that is not confounded by family structure. PC-Relate uses ancestry representative principal components to adjust for population structure/ancestry and accurately estimate measures of recent genetic relatedness such as kinship coefficients, IBD sharing probabilities, and inbreeding coefficients. Additionally, functions are provided to perform efficient variance component estimation and mixed model association testing for both quantitative and binary phenotypes.
Last updated 10 days ago
snpgeneticvariabilitygeneticsstatisticalmethoddimensionreductionprincipalcomponentgenomewideassociationqualitycontrolbiocviews
10.44 score 36 stars 1 dependents 342 scripts 656 downloads
celda - CEllular Latent Dirichlet Allocation
Celda is a suite of Bayesian hierarchical models for clustering single-cell RNA-sequencing (scRNA-seq) data. It is able to perform "bi-clustering" and simultaneously cluster genes into gene modules and cells into cell subpopulations. It also contains DecontX, a novel Bayesian method to computationally estimate and remove RNA contamination in individual cells without empty droplet information. A variety of scRNA-seq data visualization functions is also included.
Last updated 4 months ago
singlecellgeneexpressionclusteringsequencingbayesianimmunooncologydataimportcppopenmp
10.39 score 148 stars 2 dependents 265 scripts 1.5k downloads
GSEABase - Gene set enrichment data structures and methods
This package provides classes and methods to support Gene Set Enrichment Analysis (GSEA).
Last updated 14 hours ago
geneexpressiongenesetenrichmentgraphandnetworkgokegg
10.27 score 75 dependents 1.5k scripts 13k downloads
muscat - Multi-sample multi-group scRNA-seq data analysis tools
`muscat` provides various methods and visualization tools for DS analysis in multi-sample, multi-group, multi-(cell-)subpopulation scRNA-seq data, including cell-level mixed models and methods based on aggregated “pseudobulk” data, as well as a flexible simulation platform that mimics both single and multi-sample scRNA-seq data.
Last updated 4 months ago
immunooncologydifferentialexpressionsequencingsinglecellsoftwarestatisticalmethodvisualization
10.25 score 176 stars 686 scripts 1.0k downloads
scuttle - Single-Cell RNA-Seq Analysis Utilities
Provides basic utility functions for performing single-cell analyses, focusing on simple normalization, quality control and data transformations. Also provides some helper functions to assist development of other packages.
Last updated 4 months ago
immunooncologysinglecellrnaseqqualitycontrolpreprocessingnormalizationtranscriptomicsgeneexpressionsequencingsoftwaredataimportopenblascpp
10.21 score 77 dependents 1.7k scripts 17k downloads
plotgardener - Coordinate-Based Genomic Visualization Package for R
Coordinate-based genomic visualization package for R. It grants users the ability to programmatically produce complex, multi-paneled figures. Tailored for genomics, plotgardener allows users to visualize large complex genomic datasets and provides exquisite control over how plots are placed and arranged on a page.
Last updated 4 months ago
visualizationgenomeannotationfunctionalgenomicsgenomeassemblyhiccpp
10.15 score 303 stars 3 dependents 163 scripts 434 downloads
BiocCheck - Bioconductor-specific package checks
BiocCheck guides maintainers through Bioconductor best practices. It runs Bioconductor-specific package checks by searching through package code, examples, and vignettes. Maintainers are required to address all errors, warnings, and most notes produced.
Last updated 8 days ago
infrastructurebioconductor-packagecore-services
10.07 score 8 stars 6 dependents 114 scripts 3.7k downloads
miloR - Differential neighbourhood abundance testing on a graph
Milo performs single-cell differential abundance testing. Cell states are modelled as representative neighbourhoods on a nearest neighbour graph. Hypothesis testing is performed using either a negative binomial generalized linear model or negative binomial generalized linear mixed model.
Last updated 4 months ago
singlecellmultiplecomparisonfunctionalgenomicssoftwareopenblascppopenmp
10.07 score 351 stars 340 scripts 788 downloads
tradeSeq - trajectory-based differential expression analysis for sequencing data
tradeSeq provides a flexible method for fitting regression models that can be used to find genes that are differentially expressed along one or multiple lineages in a trajectory. Based on the fitted models, it uses a variety of tests suited to answer different questions of interest, e.g. the discovery of genes for which expression is associated with pseudotime, or which are differentially expressed (in a specific region) along the trajectory. It fits a negative binomial generalized additive model (GAM) for each gene, and performs inference on the parameters of the GAM.
Last updated 4 months ago
clusteringregressiontimecoursedifferentialexpressiongeneexpressionrnaseqsequencingsoftwaresinglecelltranscriptomicsmultiplecomparisonvisualization
10.02 score 244 stars 420 scripts 1.1k downloads
MOFA2 - Multi-Omics Factor Analysis v2
The MOFA2 package contains a collection of tools for training and analysing multi-omic factor analysis (MOFA). MOFA is a probabilistic factor model that aims to identify principal axes of variation from data sets that can comprise multiple omic layers and/or groups of samples. Additional time or space information on the samples can be incorporated using the MEFISTO framework, which is part of MOFA2. Downstream analysis functions to inspect molecular features underlying each factor, visualisation, imputation, etc. are available.
Last updated 4 months ago
dimensionreductionbayesianvisualizationfactor-analysismofamulti-omics
10.02 score 316 stars 502 scripts 904 downloads
diffcyt - Differential discovery in high-dimensional cytometry via high-resolution clustering
Statistical methods for differential discovery analyses in high-dimensional cytometry data (including flow cytometry, mass cytometry or CyTOF, and oligonucleotide-tagged cytometry), based on a combination of high-resolution clustering and empirical Bayes moderated tests adapted from transcriptomics.
Last updated 3 days ago
immunooncologyflowcytometryproteomicssinglecellcellbasedassayscellbiologyclusteringfeatureextractionsoftware
9.95 score 19 stars 5 dependents 225 scripts 903 downloads
OmnipathR - OmniPath web service client and more
A client for the OmniPath web service (https://www.omnipathdb.org) and many other resources. It also includes functions to transform and pretty print some of the downloaded data, functions to access a number of other resources such as BioPlex, ConsensusPathDB, EVEX, Gene Ontology, Guide to Pharmacology (IUPHAR/BPS), Harmonizome, HTRIdb, Human Phenotype Ontology, InWeb InBioMap, KEGG Pathway, Pathway Commons, Ramilowski et al. 2015, RegNetwork, ReMap, TF census, TRRUST and Vinayagam et al. 2011. Furthermore, OmnipathR features a close integration with the NicheNet method for ligand activity prediction from transcriptomics data, and its R implementation `nichenetr` (available only on github).
Last updated 7 days ago
graphandnetworknetworkpathwayssoftwarethirdpartyclientdataimportdatarepresentationgenesignalinggeneregulationsystemsbiologytranscriptomicssinglecellannotationkeggcomplexesenzyme-ptmnetworksnetworks-biologyomnipathproteins
9.88 score 125 stars 2 dependents 228 scripts 1.1k downloads
rhdf5filters - HDF5 Compression Filters
Provides a collection of additional compression filters for HDF5 datasets. The package is intended to provide seamless integration with rhdf5; however, the compiled filters can also be used with external applications.
Last updated 4 months ago
infrastructuredataimportcompressionfilter-pluginhdf5
9.84 score 5 stars 232 dependents 4 scripts 33k downloads
MicrobiotaProcess - A comprehensive R package for managing and analyzing microbiome and other ecological data within the tidy framework
MicrobiotaProcess is an R package for analysis, visualization and biomarker discovery of microbial datasets. It introduces the MPSE class, which makes it more interoperable with the existing computing ecosystem. Moreover, it introduces a tidy microbiome data structure paradigm and analysis grammar. It provides a wide variety of microbiome data analysis procedures under a unified and common framework (tidy-like framework).
Last updated 4 months ago
visualizationmicrobiomesoftwaremultiplecomparisonfeatureextractionmicrobiome-analysismicrobiome-data
9.70 score 183 stars 1 dependents 126 scripts 697 downloads
ggtreeExtra - An R Package To Add Geometric Layers On Circular Or Other Layout Tree Of "ggtree"
'ggtreeExtra' extends the method for mapping and visualizing associated data on a phylogenetic tree using 'ggtree'. These associated data can be presented on the external panels of circular layout, fan layout, or other rectangular layout trees built by 'ggtree' with the grammar of 'ggplot2'.
Last updated 4 months ago
softwarevisualizationphylogeneticsannotation
9.70 score 89 stars 3 dependents 406 scripts 1.5k downloads
InteractiveComplexHeatmap - Make Interactive Complex Heatmaps
This package can easily make heatmaps which are produced by the ComplexHeatmap package into interactive applications. It provides two types of interactivities: 1. on the interactive graphics device, and 2. on a Shiny app. It also provides functions for integrating the interactive heatmap widgets for more complex Shiny app development.
Last updated 4 months ago
softwarevisualizationsequencinginteractive-heatmaps
9.68 score 131 stars 4 dependents 128 scripts 715 downloads
clustifyr - Classifier for Single-cell RNA-seq Using Cell Clusters
Package designed to aid in classifying cells from single-cell RNA sequencing data using external reference data (e.g., bulk RNA-seq, scRNA-seq, microarray, gene lists). A variety of correlation based methods and gene list enrichment methods are provided to assist cell type assignment.
Last updated 4 months ago
singlecellannotationsequencingmicroarraygeneexpressionassign-identitiesclustersmarker-genesrna-seqsingle-cell-rna-seq
9.61 score 116 stars 296 scripts 332 downloads
cytomapper - Visualization of highly multiplexed imaging data in R
Highly multiplexed imaging acquires the single-cell expression of selected proteins in a spatially-resolved fashion. These measurements can be visualised across multiple length-scales. First, pixel-level intensities represent the spatial distributions of feature expression with highest resolution. Second, after segmentation, expression values or cell-level metadata (e.g. cell-type information) can be visualised on segmented cell areas. This package contains functions for the visualisation of multiplexed read-outs and cell-level information obtained by multiplexed imaging technologies. The main functions of this package allow 1. the visualisation of pixel-level information across multiple channels, 2. the display of cell-level information (expression and/or metadata) on segmentation masks and 3. gating and visualisation of single cells.
Last updated 4 months ago
immunooncologysoftwaresinglecellonechanneltwochannelmultiplecomparisonnormalizationdataimportbioimagingimaging-mass-cytometrysingle-cellspatial-analysis
9.60 score 31 stars 5 dependents 354 scripts 624 downloads
scMerge - scMerge: Merging multiple batches of scRNA-seq data
Like all gene expression data, single-cell data suffers from batch effects and other unwanted variations that make accurate biological interpretations difficult. The scMerge method leverages factor analysis, stably expressed genes (SEGs) and (pseudo-) replicates to remove unwanted variations and merge multiple single-cell datasets. This package contains all the necessary functions in the scMerge pipeline, including the identification of SEGs, replication-identification methods, and merging of single-cell data.
Last updated 4 months ago
batcheffectgeneexpressionnormalizationrnaseqsequencingsinglecellsoftwaretranscriptomicsbioinformaticssingle-cell
9.52 score 67 stars 1 dependents 137 scripts 908 downloads
MetaboCoreUtils - Core Utils for Metabolomics Data
MetaboCoreUtils defines metabolomics-related core functionality provided as low-level functions to allow a data structure-independent usage across various R packages. This includes functions to convert between ion (adduct) and compound mass-to-charge ratios and masses, and functions to work with chemical formulas. The package also provides a set of adduct definitions and information on some commercially available internal standard mixes commonly used in MS experiments.
Last updated 4 months ago
infrastructuremetabolomicsmassspectrometrymass-spectrometry
9.51 score 9 stars 35 dependents 57 scripts 3.0k downloads
tidybulk - Brings transcriptomics to the tidyverse
This is a collection of utility functions for exploring and performing calculations on RNA sequencing data in a modular, pipe-friendly and tidy fashion.
Last updated 4 months ago
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsbioconductorbulk-transcriptional-analysesdeseq2differential-expressionedgerensembl-idsentrezgene-symbolsgseamds-dimensionspcapiperedundancytibbletidytidy-datatidyversetranscriptstsne
9.47 score 166 stars 1 dependents 169 scripts 546 downloads
SpatialFeatureExperiment - Integrating SpatialExperiment with Simple Features in sf
A new S4 class integrating Simple Features with the R package sf to bring geospatial data analysis methods based on vector data to spatial transcriptomics. Also implements management of spatial neighborhood graphs and geometric operations. This package builds upon SpatialExperiment and SingleCellExperiment, hence methods for these parent classes can still be used.
Last updated 5 days ago
datarepresentationtranscriptomicsspatial
9.43 score 47 stars 1 dependents 308 scripts 381 downloads
bluster - Clustering Algorithms for Bioconductor
Wraps common clustering algorithms in an easily extended S4 framework. Backends are implemented for hierarchical, k-means and graph-based clustering. Several utilities are also provided to compare and evaluate clustering results.
Last updated 4 months ago
immunooncologysoftwaregeneexpressiontranscriptomicssinglecellclusteringcpp
9.40 score 47 dependents 636 scripts 9.3k downloads
Nebulosa - Single-Cell Data Visualisation Using Kernel Gene-Weighted Density Estimation
This package provides an enhanced visualization of single-cell data based on gene-weighted density estimation. Nebulosa recovers the signal from dropped-out features and allows the inspection of the joint expression from multiple features (e.g. genes). Seurat and SingleCellExperiment objects can be used within Nebulosa.
Last updated 4 months ago
softwaregeneexpressionsinglecellvisualizationdimensionreductionsingle-cellsingle-cell-analysissingle-cell-multiomicssingle-cell-rna-seq
9.18 score 98 stars 494 scripts 1.9k downloads
Rsubread - Mapping, quantification and variant analysis of sequencing data
Alignment, quantification and analysis of RNA sequencing data (including both bulk RNA-seq and scRNA-seq) and DNA sequencing data (including ATAC-seq, ChIP-seq, WGS, WES etc). Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery. Can be applied to all major sequencing technologies and to both short and long sequence reads.
Last updated 8 days ago
sequencingalignmentsequencematchingrnaseqchipseqsinglecellgeneexpressiongeneregulationgeneticsimmunooncologysnpgeneticvariabilitypreprocessingqualitycontrolgenomeannotationgenefusiondetectionindeldetectionvariantannotationvariantdetectionmultiplesequencealignmentzlib
9.17 score 10 dependents 860 scripts 3.6k downloads
EWCE - Expression Weighted Celltype Enrichment
Used to determine which cell types are enriched within gene lists. The package provides tools for testing enrichments within simple gene lists (such as human disease associated genes) and those resulting from differential expression studies. The package does not depend upon any particular Single Cell Transcriptome dataset and user defined datasets can be loaded in and used in the analyses.
Last updated 4 months ago
geneexpressiontranscriptiondifferentialexpressiongenesetenrichmentgeneticsmicroarraymrnamicroarrayonechannelrnaseqbiomedicalinformaticsproteomicsvisualizationfunctionalgenomicssinglecelldeconvolutionsingle-cellsingle-cell-rna-seqtranscriptomics
9.17 score 55 stars 96 scripts 501 downloads
schex - Hexbin plots for single cell omics data
Builds hexbin plots for variables and dimension reduction stored in single cell omics data such as SingleCellExperiment. The ideas used in this package are based on the excellent work of Dan Carr, Nicholas Lewin-Koh, Martin Maechler and Thomas Lumley.
Last updated 4 months ago
softwaresequencingsinglecelldimensionreductionvisualizationimmunooncologydataimport
9.13 score 74 stars 2 dependents 102 scripts 380 downloads
CARNIVAL - A CAusal Reasoning tool for Network Identification (from gene expression data) using Integer VALue programming
An upgraded causal reasoning tool from Melas et al in R with updated assignments of TFs' weights from PROGENy scores. Optimization parameters can be freely adjusted and multiple solutions can be obtained and aggregated.
Last updated 4 months ago
transcriptomicsgeneexpressionnetworkcausal-modelsfootprintsinteger-linear-programmingpathway-enrichment-analysis
9.03 score 57 stars 1 dependents 90 scripts 328 downloads
bambu - Context-Aware Transcript Quantification from Long Read RNA-Seq data
bambu is an R package for multi-sample transcript discovery and quantification using long read RNA-Seq data. You can use bambu after read alignment to obtain expression estimates for known and novel transcripts and genes. The output from bambu can directly be used for visualisation and downstream analysis such as differential gene expression or transcript usage.
Last updated 4 days ago
alignmentcoveragedifferentialexpressionfeatureextractiongeneexpressiongenomeannotationgenomeassemblyimmunooncologylongreadmultiplecomparisonnormalizationrnaseqregressionsequencingsoftwaretranscriptiontranscriptomicsbambubioconductorlong-readsnanoporenanopore-sequencingrna-seqrna-seq-analysistranscript-quantificationtranscript-reconstructioncpp
8.91 score 197 stars 1 dependents 86 scripts 728 downloads
BayesSpace - Clustering and Resolution Enhancement of Spatial Transcriptomes
Tools for clustering and enhancing the resolution of spatial gene expression experiments. BayesSpace clusters a low-dimensional representation of the gene expression matrix, incorporating a spatial prior to encourage neighboring spots to cluster together. The method can enhance the resolution of the low-dimensional representation into "sub-spots", for which features such as gene expression or cell type composition can be imputed.
Last updated 4 months ago
softwareclusteringtranscriptomicsgeneexpressionsinglecellimmunooncologydataimportopenblascppopenmp
8.87 score 119 stars 1 dependents 278 scripts 583 downloads
scp - Mass Spectrometry-Based Single-Cell Proteomics Data Analysis
Utility functions for manipulating, processing, and analyzing mass spectrometry-based single-cell proteomics data. The package is an extension to the 'QFeatures' package and relies on 'SingleCellExperiment' to enable single-cell proteomics analyses. The package offers the user the functionality to process quantitative tables (as generated by MaxQuant, Proteome Discoverer, and more) into data tables ready for downstream analysis and data visualization.
Last updated 4 months ago
geneexpressionproteomicssinglecellmassspectrometrypreprocessingcellbasedassaysbioconductormass-spectrometrysingle-cellsoftware
8.79 score 25 stars 115 scripts 404 downloads
cmapR - CMap Tools in R
The Connectivity Map (CMap) is a massive resource of perturbational gene expression profiles built by researchers at the Broad Institute and funded by the NIH Library of Integrated Network-Based Cellular Signatures (LINCS) program. Please visit https://clue.io for more information. The cmapR package implements methods to parse, manipulate, and write common CMap data objects, such as annotated matrices and collections of gene sets.
Last updated 4 months ago
dataimportdatarepresentationgeneexpressionbioconductorbioinformaticscmap
8.79 score 87 stars 262 scripts 794 downloads
tidySingleCellExperiment - Brings SingleCellExperiment to the Tidyverse
'tidySingleCellExperiment' is an adapter that abstracts the 'SingleCellExperiment' container in the form of a 'tibble'. This allows *tidy* data manipulation, nesting, and plotting. For example, a 'tidySingleCellExperiment' is directly compatible with functions from 'tidyverse' packages `dplyr` and `tidyr`, as well as plotting with `ggplot2` and `plotly`. In addition, the package provides various utility functions specific to single-cell omics data analysis (e.g., aggregation of cell-level data to pseudobulks).
Last updated 4 months ago
assaydomaininfrastructurernaseqdifferentialexpressionsinglecellgeneexpressionnormalizationclusteringqualitycontrolsequencingbioconductordplyrggplot2plotlysingle-cell-rna-seqsingle-cell-sequencingsinglecellexperimenttibbletidyrtidyverse
8.79 score 36 stars 2 dependents 105 scripts 342 downloads
BiocBaseUtils - General utility functions for developing Bioconductor packages
The package provides utility functions related to package development. These include functions that replace slots, and selectors for show methods. It aims to coalesce the various helper functions often re-used throughout the Bioconductor ecosystem.
Last updated 4 months ago
softwareinfrastructurebioconductor-packagecore-package
8.78 score 4 stars 157 dependents 3 scripts 11k downloads
batchelor - Single-Cell Batch Correction Methods
Implements a variety of methods for batch correction of single-cell (RNA sequencing) data. This includes methods based on detecting mutually nearest neighbors, as well as several efficient variants of linear regression of the log-expression values. Functions are also provided to perform global rescaling to remove differences in depth between batches, and to perform a principal components analysis that is robust to differences in the numbers of cells across batches.
Last updated 4 months ago
sequencingrnaseqsoftwaregeneexpressiontranscriptomicssinglecellbatcheffectnormalizationcpp
8.78 score 7 dependents 1.2k scripts 6.5k downloads
GenomicScores - Infrastructure to work with genomewide position-specific scores
Provide infrastructure to store and access genomewide position-specific scores within R and Bioconductor.
Last updated 10 days ago
infrastructuregeneticsannotationsequencingcoverageannotationhubsoftware
8.71 score 8 stars 6 dependents 83 scripts 1.2k downloads
memes - motif matching, comparison, and de novo discovery using the MEME Suite
A seamless interface to the MEME Suite family of tools for motif analysis. 'memes' provides data aware utilities for using GRanges objects as entrypoints to motif analysis, data structures for examining & editing motif lists, and novel data visualizations. 'memes' functions and data structures are amenable to both base R and tidyverse workflows.
Last updated 4 months ago
dataimportfunctionalgenomicsgeneregulationmotifannotationmotifdiscoverysequencematchingsoftware
8.66 score 47 stars 1 dependents 117 scripts 470 downloads
FRASER - Find RAre Splicing Events in RNA-Seq Data
Detection of rare aberrant splicing events in transcriptome profiles. Read count ratio expectations are modeled by an autoencoder to control for confounding factors in the data. Given these expectations, the ratios are assumed to follow a beta-binomial distribution with a junction specific dispersion. Outlier events are then identified as read-count ratios that deviate significantly from this distribution. FRASER is able to detect alternative splicing, but also intron retention. The package aims to support diagnostics in the field of rare diseases where RNA-seq is performed to identify aberrant splicing defects.
Last updated 4 months ago
rnaseqalternativesplicingsequencingsoftwaregeneticscoverageaberrant-splicingdiagnosticsoutlier-detectionrare-diseaserna-seqsplicingopenblascpp
8.61 score 40 stars 155 scripts 292 downloads
miaViz - Microbiome Analysis Plotting and Visualization
The miaViz package implements functions to visualize TreeSummarizedExperiment objects especially in the context of microbiome analysis. Part of the mia family of R/Bioconductor packages.
Last updated 12 days ago
microbiomesoftwarevisualizationbioconductormicrobiome-analysisplotting
8.59 score 10 stars 1 dependents 81 scripts 554 downloads
cBioPortalData - Exposes and Makes Available Data from the cBioPortal Web Resources
The cBioPortalData R package accesses study datasets from the cBio Cancer Genomics Portal. It accesses the data either from the pre-packaged zip / tar files or from the API interface that was recently implemented by the cBioPortal Data Team. The package can provide data either in tabular format or as a MultiAssayExperiment object that uses familiar Bioconductor data representations.
Last updated 3 days ago
softwareinfrastructurethirdpartyclientbioconductor-packagenci-itcru24ca289073
8.48 score 33 stars 4 dependents 147 scripts 644 downloads
tidySummarizedExperiment - Brings SummarizedExperiment to the Tidyverse
The tidySummarizedExperiment package provides a set of tools for creating and manipulating tidy data representations of SummarizedExperiment objects. SummarizedExperiment is a widely used data structure in bioinformatics for storing high-throughput genomic data, such as gene expression or DNA sequencing data. The tidySummarizedExperiment package introduces a tidy framework for working with SummarizedExperiment objects. It allows users to convert their data into a tidy format, where each observation is a row and each variable is a column. This tidy representation simplifies data manipulation, integration with other tidyverse packages, and enables seamless integration with the broader ecosystem of tidy tools for data analysis.
Last updated 4 months ago
assaydomaininfrastructurernaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomics
8.43 score 25 stars 1 dependents 199 scripts 615 downloads
SPIAT - Spatial Image Analysis of Tissues
SPIAT (**Sp**atial **I**mage **A**nalysis of **T**issues) is an R package with a suite of data processing, quality control, visualization and data analysis tools. SPIAT is compatible with data generated from single-cell spatial proteomics platforms (e.g. OPAL, CODEX, MIBI, cellprofiler). SPIAT reads spatial data in the form of X and Y coordinates of cells, marker intensities and cell phenotypes. SPIAT includes six analysis modules that allow visualization, calculation of cell colocalization, categorization of the immune microenvironment relative to tumor areas, analysis of cellular neighborhoods, and the quantification of spatial heterogeneity, providing a comprehensive toolkit for spatial data analysis.
Last updated 4 months ago
biomedicalinformaticscellbiologyspatialclusteringdataimportimmunooncologyqualitycontrolsinglecellsoftwarevisualization
8.43 score 21 stars 66 scripts 302 downloads
alabaster.base - Save Bioconductor Objects to File
Save Bioconductor data structures into file artifacts, and load them back into memory. This is a more robust and portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated 8 days ago
datarepresentationdataimportzlibcpp
8.40 score 3 stars 14 dependents 58 scripts 4.0k downloads
PSMatch - Handling and Managing Peptide Spectrum Matches
The PSMatch package helps proteomics practitioners to load, handle and manage Peptide Spectrum Matches. It provides functions to model peptide-protein relations as adjacency matrices and connected components, visualise these as graphs and make informed decisions about shared peptide filtering. The package also provides functions to calculate and visualise MS2 fragment ions.
Last updated 4 months ago
infrastructureproteomicsmassspectrometrymass-spectrometrypeptide-spectrum-matches
8.37 score 3 stars 39 dependents 15 scripts 2.5k downloads
ClassifyR - A framework for cross-validated classification problems, with applications to differential variability and differential distribution testing
The software formalises a framework for classification and survival model evaluation in R. There are four stages: data transformation, feature selection, model training, and prediction. The requirements of variable types and variable order are fixed, but specialised variables for functions can also be provided. The framework is wrapped in a driver loop that reproducibly carries out a number of cross-validation schemes. Functions for differential mean, differential variability, and differential distribution are included. Additional functions may be developed by the user, by creating an interface to the framework.
Last updated 11 days ago
classificationsurvivalcpp
8.36 score 5 stars 3 dependents 45 scripts 574 downloads
sccomp - Tests differences in cell-type proportion for single-cell data, robust to outliers
A robust and outlier-aware method for testing differences in cell-type proportion in single-cell data. This model can infer changes in tissue composition and heterogeneity, and can produce realistic data simulations based on any existing dataset. This model can also transfer knowledge from a large set of integrated datasets to increase accuracy further.
Last updated 14 days ago
bayesianregressiondifferentialexpressionsinglecellbatch-correctioncompositioncytofdifferential-proportionmicrobiomemultilevelproportionsrandom-effectssingle-cellunwanted-variation
8.32 score 97 stars 69 scripts 246 downloads
csaw - ChIP-Seq Analysis with Windows
Detection of differentially bound regions in ChIP-seq data with sliding windows, with methods for normalization and proper FDR control.
Last updated 13 days ago
multiplecomparisonchipseqnormalizationsequencingcoveragegeneticsannotationdifferentialpeakcallingcurlbzip2xz-utilszlibcpp
8.32 score 7 dependents 498 scripts 893 downloads
MsExperiment - Infrastructure for Mass Spectrometry Experiments
Infrastructure to store and manage all aspects related to a complete proteomics or metabolomics mass spectrometry (MS) experiment. The MsExperiment package provides light-weight and flexible containers for MS experiments building on the new MS infrastructure provided by the Spectra, QFeatures and related packages. Along with raw data representations, links to original data files and sample annotations, additional metadata or annotations can also be stored within the MsExperiment container. To guarantee maximum flexibility only minimal constraints are put on the type and content of the data within the containers.
Last updated 4 months ago
infrastructureproteomicsmassspectrometrymetabolomicsexperimentaldesigndataimport
8.30 score 5 stars 14 dependents 104 scripts 2.0k downloads
crisprDesign - Comprehensive design of CRISPR gRNAs for nucleases and base editors
Provides a comprehensive suite of functions to design and annotate CRISPR guide RNA (gRNA) sequences. This includes on- and off-target search, on-target efficiency scoring, off-target scoring, full gene and TSS contextual annotations, and SNP annotation (human only). It currently supports five types of CRISPR modalities (modes of perturbations): CRISPR knockout, CRISPR activation, CRISPR inhibition, CRISPR base editing, and CRISPR knockdown. All types of CRISPR nucleases are supported, including DNA- and RNA-target nucleases such as Cas9, Cas12a, and Cas13d. All types of base editors are also supported. gRNA design can be performed on reference genomes, transcriptomes, and custom DNA and RNA sequences. Both unpaired and paired gRNA designs are enabled.
Last updated 13 hours ago
crisprfunctionalgenomicsgenetargetbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgenomics-analysisgrnagrna-sequencegrna-sequencessgrnasgrna-design
8.26 score 21 stars 3 dependents 80 scripts 318 downloads
ngsReports - Load FastQC reports and other NGS related files
This package provides methods and object classes for parsing FastQC reports and output summaries from other NGS tools into R. As well as parsing files, multiple plotting methods have been implemented for visualising the parsed data. Plots can be generated as static ggplot objects or interactive plotly objects.
Last updated 4 months ago
qualitycontrolreportwriting
8.20 score 22 stars 99 scripts 233 downloads
ggkegg - Analyzing and visualizing KEGG information using the grammar of graphics
This package aims to import, parse, and analyze KEGG data such as KEGG PATHWAY and KEGG MODULE. The package supports visualizing KEGG information with ggplot2 and ggraph, following the grammar of graphics. The package enables the direct visualization of the results from various omics analysis packages.
Last updated 13 days ago
pathwaysdataimportkeggggplot2ggraphpathwaytidygraphvisualization
8.18 score 222 stars 1 dependents 30 scripts 709 downloads
nullranges - Generation of null ranges via bootstrapping or covariate matching
Modular package for generation of sets of ranges representing the null hypothesis. These can take the form of bootstrap samples of ranges (using the block bootstrap framework of Bickel et al 2010), or sets of control ranges that are matched across one or more covariates. nullranges is designed to be inter-operable with other packages for analysis of genomic overlap enrichment, including the plyranges Bioconductor package.
Last updated 4 months ago
visualizationgenesetenrichmentfunctionalgenomicsepigeneticsgeneregulationgenetargetgenomeannotationannotationgenomewideassociationhistonemodificationchipseqatacseqdnaseseqrnaseqhiddenmarkovmodelbioconductorbootstrapgenomicsmatchingstatistics
8.12 score 26 stars 1 dependents 47 scripts 368 downloads
monaLisa - Binned Motif Enrichment Analysis and Visualization
Useful functions to work with sequence motifs in the analysis of genomics data. These include methods to annotate genomic regions or sequences with predicted motif hits and to identify motifs that drive observed changes in accessibility or expression. Functions to produce informative visualizations of the obtained results are also provided.
Last updated 15 days ago
motifannotationvisualizationfeatureextractionepigenetics
8.06 score 40 stars 53 scripts 322 downloads
velociraptor - Toolkit for Single-Cell Velocity
This package provides Bioconductor-friendly wrappers for RNA velocity calculations in single-cell RNA-seq data. We use the basilisk package to manage Conda environments, and the zellkonverter package to convert data structures between SingleCellExperiment (R) and AnnData (Python). The information produced by the velocity methods is stored in the various components of the SingleCellExperiment class.
Last updated 4 months ago
singlecellgeneexpressionsequencingcoveragerna-velocity
8.04 score 54 stars 45 scripts 338 downloads
simplifyEnrichment - Simplify Functional Enrichment Results
A new clustering algorithm, "binary cut", for clustering similarity matrices of functional terms is implemented in this package. It also provides functions for visualizing, summarizing and comparing the clusterings.
Last updated 4 months ago
softwarevisualizationgoclusteringgenesetenrichment
8.02 score 113 stars 196 scripts 1.6k downloads
TOAST - Tools for the analysis of heterogeneous tissues
This package is devoted to analyzing high-throughput data (e.g. gene expression microarray, DNA methylation microarray, RNA-seq) from complex tissues. Current functionalities include 1. detection of cell-type specific or cross-cell type differential signals, 2. tree-based differential analysis, 3. improved variable selection in reference-free deconvolution and 4. partial reference-free deconvolution with prior knowledge.
Last updated 4 months ago
dnamethylationgeneexpressiondifferentialexpressiondifferentialmethylationmicroarraygenetargetepigeneticsmethylationarray
7.98 score 10 stars 3 dependents 106 scripts 784 downloads
maaslin3 - "Refining and extending generalized multivariate linear models for meta-omic association discovery"
MaAsLin 3 refines and extends generalized multivariate linear models for meta-omic association discovery. It finds abundance and prevalence associations between microbiome meta-omics features and complex metadata in population-scale epidemiological studies. The software includes multiple analysis methods (including support for multiple covariates, repeated measures, and ordered predictors), filtering, normalization, and transform options to customize analysis for your specific study.
Last updated 11 days ago
metagenomicssoftwaremicrobiomenormalizationmultiplecomparison
7.89 score 24 stars 31 scripts 59 downloads
BioNERO - Biological Network Reconstruction Omnibus
BioNERO aims to integrate all aspects of biological network inference in a single package, including data preprocessing, exploratory analyses, network inference, and analyses for biological interpretations. BioNERO can be used to infer gene coexpression networks (GCNs) and gene regulatory networks (GRNs) from gene expression data. Additionally, it can be used to explore topological properties of protein-protein interaction (PPI) networks. GCN inference relies on the popular WGCNA algorithm. GRN inference is based on the "wisdom of the crowds" principle, which consists of inferring GRNs with multiple algorithms (here, CLR, GENIE3 and ARACNE) and calculating the average rank for each interaction pair. As all steps of network analyses are included in this package, BioNERO spares users from having to learn the syntaxes of several packages and how to pass data between them. Finally, users can also identify consensus modules across independent expression sets and calculate intra- and interspecies module preservation statistics between different networks.
Last updated 4 months ago
softwaregeneexpressiongeneregulationsystemsbiologygraphandnetworkpreprocessingnetworknetworkinference
7.85 score 26 stars 1 dependents 50 scripts 734 downloads
COTAN - COexpression Tables ANalysis
Statistical and computational method to analyze the co-expression of gene pairs at single cell level. It provides the foundation for single-cell gene interactome analysis. The basic idea is studying the zero UMI counts' distribution instead of focusing on positive counts; this is done with a generalized contingency tables framework. COTAN can effectively assess the correlated or anti-correlated expression of gene pairs. It provides a numerical index related to the correlation and an approximate p-value for the associated independence test. COTAN can also evaluate whether single genes are differentially expressed, scoring them with a newly defined global differentiation index. Moreover, this approach provides ways to plot and cluster genes according to their co-expression pattern with other genes, effectively helping the study of gene interactions and becoming a new tool to identify cell-identity marker genes.
Last updated 15 days ago
systemsbiologytranscriptomicsgeneexpressionsinglecell
7.84 score 15 stars 97 scripts 318 downloads
FLAMES - FLAMES: Full Length Analysis of Mutations and Splicing in long read RNA-seq data
Semi-supervised isoform detection and annotation from both bulk and single-cell long read RNA-seq data. FLAMES provides automated pipelines for analysing isoforms, as well as intermediate functions for manual execution.
Last updated 3 days ago
rnaseqsinglecelltranscriptomicsdataimportdifferentialsplicingalternativesplicinggeneexpressionlongreadzlibcurlbzip2xz-utilscpp
7.84 score 27 stars 11 scripts 372 downloads
orthogene - Interspecies gene mapping
`orthogene` is an R package for easy mapping of orthologous genes across hundreds of species. It pulls up-to-date gene ortholog mappings across **700+ organisms**. It also provides various utility functions to aggregate/expand common objects (e.g. data.frames, gene expression matrices, lists) using **1:1**, **many:1**, **1:many** or **many:many** gene mappings, both within- and between-species.
Last updated 4 months ago
geneticscomparativegenomicspreprocessingphylogeneticstranscriptomicsgeneexpressionanimal-modelsbioconductorbioconductor-packagebioinformaticsbiomedicinecomparative-genomicsevolutionary-biologygenesgenomicsontologiestranslational-research
7.84 score 41 stars 2 dependents 31 scripts 556 downloads
LACE - Longitudinal Analysis of Cancer Evolution (LACE)
LACE is an algorithmic framework that processes single-cell somatic mutation profiles from cancer samples collected at different time points and in distinct experimental settings, to produce longitudinal models of cancer evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a weighted likelihood function computed on multiple time points.
Last updated 4 months ago
biomedicalinformaticssinglecellsomaticmutation
7.83 score 15 stars 3 scripts 376 downloads
biodb - biodb, a library and a development framework for connecting to chemical and biological databases
The biodb package provides access to standard remote chemical and biological databases (ChEBI, KEGG, HMDB, ...), as well as to in-house local database files (CSV, SQLite), with easy retrieval of entries, access to web services, search of compounds by mass and/or name, and mass spectra matching for LCMS and MSMS. Its architecture as a development framework facilitates the development of new database connectors for local projects or inside separate published packages.
Last updated 4 months ago
softwareinfrastructuredataimportkeggbiologycheminformaticschemistrydatabasescpp
7.82 score 11 stars 6 dependents 22 scripts 432 downloads
mistyR - Multiview Intercellular SpaTial modeling framework
mistyR is an implementation of the Multiview Intercellular SpaTial modeling framework (MISTy). MISTy is an explainable machine learning framework for knowledge extraction and analysis of single-cell, highly multiplexed, spatially resolved data. MISTy facilitates an in-depth understanding of marker interactions by profiling the intra- and intercellular relationships. MISTy is a flexible framework able to process a custom number of views. Each of these views can describe a different spatial context, i.e., define a relationship among the observed expressions of the markers, such as intracellular regulation or paracrine regulation. The views can also capture cell-type specific relationships, capture relations between functional footprints or focus on relations between different anatomical regions. Each MISTy view is considered as a potential source of variability in the measured marker expressions. Each MISTy view is then analyzed for its contribution to the total expression of each marker and is explained in terms of the interactions with other measurements that led to the observed contribution.
Last updated 4 months ago
softwarebiomedicalinformaticscellbiologysystemsbiologyregressiondecisiontreesinglecellspatialbioconductorbiologyintercellularmachine-learningmodularmolecular-biologymultiviewspatial-transcriptomics
7.81 score 51 stars 160 scripts 274 downloads
TreeSummarizedExperiment - TreeSummarizedExperiment: an S4 Class for Data with Tree Structures
TreeSummarizedExperiment has extended SingleCellExperiment to include hierarchical information on the rows or columns of the rectangular data.
Last updated 4 months ago
datarepresentationinfrastructure
7.81 score 15 dependents 226 scripts 3.2k downloads
target - Predict Combined Function of Transcription Factors
Implement the BETA algorithm for inferring direct target genes from DNA-binding and perturbation expression data, Wang et al. (2013) <doi: 10.1038/nprot.2013.150>. Extend the algorithm to predict the combined function of two DNA-binding elements from comparable binding and expression data.
Last updated 4 months ago
softwarestatisticalmethodtranscriptionalgorithmchip-seqdna-bindinggene-regulationtranscription-factors
7.79 score 4 stars 1.3k scripts 192 downloads
hermes - Preprocessing, analyzing, and reporting of RNA-seq data
Provides classes and functions for quality control, filtering, normalization and differential expression analysis of pre-processed `RNA-seq` data. Data can be imported from `SummarizedExperiment` as well as `matrix` objects and can be annotated from `BioMart`. Filtering for genes with sufficiently high expression or with required annotations, as well as filtering for samples with sufficient correlation to other samples or a sufficient total number of reads, is supported. The standard normalization methods including cpm, rpkm and tpm can be used, and `DESeq2` as well as voom differential expression analyses are available.
Last updated 4 months ago
rnaseqdifferentialexpressionnormalizationpreprocessingqualitycontrolrna-seqstatistical-engineering
7.77 score 11 stars 1 dependents 48 scripts 424 downloads
EBSeq - An R package for gene and isoform differential expression analysis of RNA-seq data
Differential Expression analysis at both gene and isoform level using RNA-seq data
Last updated 14 days ago
immunooncologystatisticalmethoddifferentialexpressionmultiplecomparisonrnaseqsequencingcpp
7.77 score 6 dependents 162 scripts 752 downloads
PhyloProfile - PhyloProfile
PhyloProfile is a tool for exploring complex phylogenetic profiles. Phylogenetic profiles, presence/absence patterns of genes over a set of species, are commonly used to trace the functional and evolutionary history of genes across species and time. With PhyloProfile we can enrich regular phylogenetic profiles with further data like sequence/structure similarity, to make phylogenetic profiling more meaningful. Besides the interactive visualisation powered by R-Shiny, the package offers a set of further analysis features to gain insights like the gene age estimation or core gene identification.
Last updated 4 days ago
softwarevisualizationdatarepresentationmultiplecomparisonfunctionalpredictiondimensionreductionbioinformaticsheatmapinteractive-visualizationsorthologsphylogenetic-profileshiny
7.76 score 33 stars 10 scripts 226 downloads
rrvgo - Reduce + Visualize GO
Reduce and visualize lists of Gene Ontology terms by identifying redundancy based on semantic similarity.
Last updated 4 months ago
annotationclusteringgonetworkpathwayssoftware
7.74 score 24 stars 190 scripts 880 downloads
MsFeatures - Functionality for Mass Spectrometry Features
The MsFeatures package defines functionality for Mass Spectrometry features. This includes functions to group (LC-MS) features based on some of their properties, such as retention time (coeluting features), or correlation of signals across samples. This package hence allows grouping of features, and its results can be used as an input for the `QFeatures` package, which allows aggregating abundance levels of features within each group. This package defines concepts and functions for base and common data types; implementations for more specific data types are expected to be provided in the respective packages (such as `xcms`). All functionality of this package is implemented in a modular way, which allows combination of different grouping approaches and enables its re-use in other R packages.
Last updated 4 months ago
infrastructuremassspectrometrymetabolomics
7.70 score 7 stars 12 dependents 32 scripts 2.1k downloads
dittoSeq - User Friendly Single-Cell and Bulk RNA Sequencing Visualization
A universal, user friendly, single-cell and bulk RNA sequencing visualization toolkit that allows highly customizable creation of color blindness friendly, publication-quality figures. dittoSeq accepts both SingleCellExperiment (SCE) and Seurat objects, as well as the import and usage, via conversion to an SCE, of SummarizedExperiment or DGEList bulk data. Visualizations include dimensionality reduction plots, heatmaps, scatterplots, percent composition or expression across groups, and more. Customizations range from size and title adjustments to automatic generation of annotations for heatmaps, overlay of trajectory analysis onto any dimensionality reduction plot, hidden data overlay upon cursor hovering via ggplotly conversion, and many more. All with simple, discrete inputs. Color blindness friendliness is powered by legend adjustments (enlarged keys), and by allowing the use of shapes or letter-overlay in addition to the carefully selected dittoColors().
Last updated 4 months ago
softwarevisualizationrnaseqsinglecellgeneexpressiontranscriptomicsdataimport
7.69 score 2 dependents 760 scripts 2.4k downloads
cola - A Framework for Consensus Partitioning
Subgroup classification is a basic task in genomic data analysis, especially for gene expression and DNA methylation data analysis. It can also be used to test the agreement to known clinical annotations, or to test whether there exist significant batch effects. The cola package provides a general framework for subgroup classification by consensus partitioning. It has the following features: 1. It modularizes the consensus partitioning processes so that various methods can be easily integrated. 2. It provides rich visualizations for interpreting the results. 3. It allows running multiple methods at the same time and provides functionalities to straightforwardly compare results. 4. It provides a new method to extract features which are more efficient to separate subgroups. 5. It automatically generates detailed reports for the complete analysis. 6. It allows applying consensus partitioning in a hierarchical manner.
Last updated 5 days ago
clusteringgeneexpressionclassificationsoftwareconsensus-clusteringcpp
7.61 score 61 stars 112 scripts 271 downloads
nnSVG - Scalable identification of spatially variable genes in spatially-resolved transcriptomics data
Method for scalable identification of spatially variable genes (SVGs) in spatially-resolved transcriptomics data. The method is based on nearest-neighbor Gaussian processes and uses the BRISC algorithm for model fitting and parameter estimation. Allows identification and ranking of SVGs with flexible length scales across a tissue slide or within spatial domains defined by covariates. Scales linearly with the number of spatial locations and can be applied to datasets containing thousands or more spatial locations.
Last updated 3 days ago
spatialsinglecelltranscriptomicsgeneexpressionpreprocessing
7.59 score 17 stars 1 dependents 153 scripts 290 downloads
AlpsNMR - Automated spectraL Processing System for NMR
Reads Bruker NMR data directories both zipped and unzipped. It provides automated and efficient signal processing for untargeted NMR metabolomics. It is able to interpolate the samples, detect outliers, exclude regions, normalize, detect peaks, align the spectra, integrate peaks, manage metadata and visualize the spectra. After spectra processing, it can apply multivariate analysis on extracted data. Efficient plotting with 1-D data is also available. Basic reading of 1D ACD/Labs exported JDX samples is also available.
Last updated 4 months ago
softwarepreprocessingvisualizationclassificationcheminformaticsmetabolomicsdataimport
7.59 score 15 stars 1 dependents 12 scripts 312 downloads
SpatialDecon - Deconvolution of mixed cells from spatial and/or bulk gene expression data
Using spatial or bulk gene expression data, estimates abundance of mixed cell types within each observation. Based on "Advances in mixed cell deconvolution enable quantification of cell types in spatial transcriptomic data", Danaher (2022). Designed for use with the NanoString GeoMx platform, but applicable to any gene expression data.
Last updated 4 months ago
immunooncologyfeatureextractiongeneexpressiontranscriptomicsspatial
7.56 score 35 stars 58 scripts 388 downloads
RBioFormats - R interface to Bio-Formats
An R package which interfaces the OME Bio-Formats Java library to allow reading of proprietary microscopy image data and metadata.
Last updated 4 months ago
dataimportbio-formatsbioconductorimage-processingopenjdk
7.55 score 25 stars 1 dependents 52 scripts 366 downloads
imcRtools - Methods for imaging mass cytometry data analysis
This R package supports the handling and analysis of imaging mass cytometry and other highly multiplexed imaging data. The main functionality includes reading in single-cell data after image segmentation and measurement, data formatting to perform channel spillover correction and a number of spatial analysis approaches. First, cell-cell interactions are detected via spatial graph construction; these graphs can be visualized with cells representing nodes and interactions representing edges. Furthermore, per cell, its direct neighbours are summarized to allow spatial clustering. Per image/grouping level, interactions between types of cells are counted, averaged and compared against random permutations. In that way, types of cells that interact more (attraction) or less (avoidance) frequently than expected by chance are detected.
Last updated 4 months ago
immunooncologysinglecellspatialdataimportclusteringimcsingle-cell
7.54 score 22 stars 126 scripts 428 downloads
MOSim - Multi-Omics Simulation (MOSim)
MOSim package simulates multi-omic experiments that mimic regulatory mechanisms within the cell, allowing flexible experimental design including time course and multiple groups.
Last updated 4 months ago
softwaretimecourseexperimentaldesignrnaseqcpp
7.52 score 9 stars 11 scripts 234 downloads
crisprScore - On-Target and Off-Target Scoring Algorithms for CRISPR gRNAs
Provides R wrappers of several on-target and off-target scoring methods for CRISPR guide RNAs (gRNAs). The following nucleases are supported: SpCas9, AsCas12a, enAsCas12a, and RfxCas13d (CasRx). The available on-target cutting efficiency scoring methods are RuleSet1, Azimuth, DeepHF, DeepCpf1, enPAM+GB, and CRISPRscan. Both the CFD and MIT scoring methods are available for off-target specificity prediction. The package also provides a Lindel-derived score to predict the probability of a gRNA to produce indels inducing a frameshift for the Cas9 nuclease. Note that DeepHF, DeepCpf1 and enPAM+GB are not available on Windows machines.
Last updated 4 months ago
crisprfunctionalgenomicsfunctionalpredictionbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgenomicsgrnagrna-sequencegrna-sequencesscoring-algorithmsgrnasgrna-design
7.52 score 16 stars 4 dependents 19 scripts 338 downloads
methrix - Fast and efficient summarization of generic bedGraph files from Bisulfite sequencing
Bedgraph files generated by Bisulfite pipelines often come in various flavors. A critical downstream step requires summarization of these files into methylation/coverage matrices. This step of data aggregation is done by Methrix, which also provides many other useful downstream functions.
Last updated 4 months ago
dnamethylationsequencingcoveragebedgraphbioinformaticsdna-methylation
7.51 score 31 stars 1 dependents 39 scripts 264 downloads
proDA - Differential Abundance Analysis of Label-Free Mass Spectrometry Data
Account for missing values in label-free mass spectrometry data without imputation. The package implements a probabilistic dropout model that ensures that the information from observed and missing values is properly combined. It adds empirical Bayesian priors to increase power to detect differentially abundant proteins.
Last updated 4 months ago
proteomicsmassspectrometrydifferentialexpressionbayesianregressionsoftwarenormalizationqualitycontrol
7.48 score 18 stars 1 dependents 47 scripts 378 downloads
EpiCompare - Comparison, Benchmarking & QC of Epigenomic Datasets
EpiCompare is used to compare and analyse epigenetic datasets for quality control and benchmarking purposes. The package outputs an HTML report consisting of three sections: (1. General metrics) Metrics on peaks (percentage of blacklisted and non-standard peaks, and peak widths) and fragments (duplication rate) of samples, (2. Peak overlap) Percentage and statistical significance of overlapping and non-overlapping peaks. Also includes upset plot and (3. Functional annotation) functional annotation (ChromHMM, ChIPseeker and enrichment analysis) of peaks. Also includes peak enrichment around TSS.
Last updated 4 months ago
epigeneticsgeneticsqualitycontrolchipseqmultiplecomparisonfunctionalgenomicsatacseqdnaseseqbenchmarkbenchmarkingbioconductorbioconductor-packagecomparisonhtmlinteractive-reporting
7.46 score 14 stars 46 scripts 91 downloads
standR - Spatial transcriptome analyses of Nanostring's DSP data in R
standR is a user-friendly R package providing functions to assist in conducting good-practice analysis of Nanostring's GeoMX DSP data. All functions in the package are built based on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. standR allows data inspection, quality control, normalization, batch correction and evaluation with informative visualizations.
Last updated 2 days ago
spatialtranscriptomicsgeneexpressiondifferentialexpressionqualitycontrolnormalizationexperimenthubsoftware
7.45 score 18 stars 45 scripts 313 downloads
lipidr - Data Mining and Analysis of Lipidomics Datasets
lipidr is an easy-to-use R package implementing a complete workflow for downstream analysis of targeted and untargeted lipidomics data. Lipidomics results can be imported into lipidr as a numerical matrix or a Skyline export, allowing integration into current analysis frameworks. Data mining of lipidomics datasets is enabled through integration with the Metabolomics Workbench API. lipidr allows data inspection, normalization, univariate and multivariate analysis, displaying informative visualizations. lipidr also implements a novel Lipid Set Enrichment Analysis (LSEA), harnessing molecular information such as lipid class, total chain length and unsaturation.
Last updated 4 months ago
lipidomicsmassspectrometrynormalizationqualitycontrolvisualizationbioconductor
7.44 score 29 stars 40 scripts 308 downloads
GenomicDistributions - GenomicDistributions: fast analysis of genomic intervals with Bioconductor
If you have a set of genomic ranges, this package can help you with visualization and comparison. It produces several kinds of plots, for example: chromosome distribution plots, which visualize how your regions are distributed over chromosomes; feature distance distribution plots, which visualize how your regions are distributed relative to a feature of interest, like Transcription Start Sites (TSSs); genomic partition plots, which visualize how your regions overlap given genomic features such as promoters, introns, exons, or intergenic regions. It also makes it easy to compare one set of ranges to another.
Last updated 4 months ago
softwaregenomeannotationgenomeassemblydatarepresentationsequencingcoveragefunctionalgenomicsvisualization
7.42 score 25 stars 25 scripts 284 downloads
mbkmeans - Mini-batch K-means Clustering for Single-Cell RNA-seq
Implements the mini-batch k-means algorithm for large datasets, including support for on-disk data representation.
Last updated 4 months ago
clusteringgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecellhuman-cell-atlascpp
7.41 score 10 stars 2 dependents 54 scripts 873 downloads
cytolib - C++ infrastructure for representing and interacting with the gated cytometry data
This package provides the core data structure and API to represent and interact with the gated cytometry data.
Last updated 13 days ago
immunooncologyflowcytometrydataimportpreprocessingdatarepresentation
7.38 score 60 dependents 7 scripts 4.4k downloads
methylSig - MethylSig: Differential Methylation Testing for WGBS and RRBS Data
MethylSig is a package for testing for differentially methylated cytosines (DMCs) or regions (DMRs) in whole-genome bisulfite sequencing (WGBS) or reduced representation bisulfite sequencing (RRBS) experiments. MethylSig uses a beta binomial model to test for significant differences between groups of samples. Several options exist for either site-specific or sliding window tests, and variance estimation.
Last updated 4 months ago
dnamethylationdifferentialmethylationepigeneticsregressionmethylseqdifferential-methylationdna-methylation
7.37 score 17 stars 23 scripts 235 downloads
PeacoQC - Peak-based selection of high quality cytometry data
This is a package that includes pre-processing and quality control functions that can remove margin events, compensate and transform the data and that will use PeacoQCSignalStability for quality control. This last function will first detect peaks in each channel of the flowframe. It will remove anomalies based on the IsolationTree function and the MAD outlier detection method. This package can be used for both flow- and mass cytometry data.
Last updated 4 months ago
flowcytometryqualitycontrolpreprocessingpeakdetection
7.36 score 15 stars 3 dependents 28 scripts 440 downloads
scry - Small-Count Analysis Methods for High-Dimensional Data
Many modern biological datasets consist of small counts that are not well fit by standard linear-Gaussian methods such as principal component analysis. This package provides implementations of count-based feature selection and dimension reduction algorithms. These methods can be used to facilitate unsupervised analysis of any high-dimensional data such as single-cell RNA-seq.
Last updated 4 months ago
dimensionreductiongeneexpressionnormalizationprincipalcomponentrnaseqsoftwaresequencingsinglecelltranscriptomics
7.34 score 19 stars 116 scripts 608 downloads
crisprBase - Base functions and classes for CRISPR gRNA design
Provides S4 classes for general nucleases, CRISPR nucleases, CRISPR nickases, and base editors. Several CRISPR-specific genome arithmetic functions are implemented to help extract genomic coordinates of spacer and protospacer sequences. Commonly-used CRISPR nuclease objects are provided that can be readily used in other packages. Both DNA- and RNA-targeting nucleases are supported.
Last updated 4 months ago
crisprfunctionalgenomicsbioconductorbioconductor-packagecrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequences
7.27 score 5 stars 6 dependents 52 scripts 346 downloads
CHETAH - Fast and accurate scRNA-seq cell type identification
CHETAH (CHaracterization of cEll Types Aided by Hierarchical classification) is an accurate, selective and fast scRNA-seq classifier. Classification is guided by a reference dataset, preferentially also a scRNA-seq dataset. By hierarchical clustering of the reference data, CHETAH creates a classification tree that enables a step-wise, top-to-bottom classification. Using a novel stopping rule, CHETAH classifies the input cells to the cell types of the references and to "intermediate types": more general classifications that ended in an intermediate node of the tree.
Last updated 4 months ago
classificationrnaseqsinglecellclusteringgeneexpressionimmunooncology
7.27 score 44 stars 70 scripts 260 downloads
animalcules - Interactive microbiome analysis toolkit
animalcules is an R package for utilizing up-to-date data analytics, visualization methods, and machine learning models to provide users an easy-to-use interactive microbiome analysis framework. It can be used as a standalone software package or users can explore their data with the accompanying interactive R Shiny application. Traditional microbiome analysis such as alpha/beta diversity and differential abundance analysis are enhanced, while new methods like biomarker identification are introduced by animalcules. Powerful interactive and dynamic figures generated by animalcules enable users to understand their data better and discover new insights.
Last updated 4 months ago
microbiomemetagenomicscoveragevisualization
7.23 score 55 stars 22 scripts 546 downloads
cosmosR - COSMOS (Causal Oriented Search of Multi-Omic Space)
COSMOS (Causal Oriented Search of Multi-Omic Space) is a method that integrates phosphoproteomics, transcriptomics, and metabolomics data sets based on prior knowledge of signaling, metabolic, and gene regulatory networks. It estimates the activities of transcription factors and kinases and performs network-level causal reasoning. Thereby, COSMOS provides mechanistic hypotheses for experimental observations across multi-omics datasets.
Last updated 4 months ago
cellbiologypathwaysnetworkproteomicsmetabolomicstranscriptomicsgenesignalingdata-integrationmetabolomic-datanetwork-modellingphosphoproteomics
7.22 score 59 stars 35 scripts 244 downloads
signatureSearch - Environment for Gene Expression Searching Combined with Functional Enrichment Analysis
This package implements algorithms and data structures for performing gene expression signature (GES) searches, and subsequently interpreting the results functionally with specialized enrichment methods.
Last updated 4 months ago
softwaregeneexpressiongokeggnetworkenrichmentsequencingcoveragedifferentialexpressioncpp
7.18 score 17 stars 1 dependents 74 scripts 373 downloads
GeomxTools - NanoString GeoMx Tools
Tools for NanoString Technologies GeoMx Technology. Package provides functions for reading in DCC and PKC files based on an ExpressionSet derived object. Normalization and QC functions are also included.
Last updated 4 months ago
geneexpressiontranscriptioncellbasedassaysdataimporttranscriptomicsproteomicsmrnamicroarrayproprietaryplatformsrnaseqsequencingexperimentaldesignnormalizationspatial
7.17 score 3 dependents 218 scripts 754 downloads
qpgraph - Estimation of genetic and molecular regulatory networks from high-throughput genomics data
Estimate gene and eQTL networks from high-throughput expression and genotyping assays.
Last updated 10 days ago
microarraygeneexpressiontranscriptionpathwaysnetworkinferencegraphandnetworkgeneregulationgeneticsgeneticvariabilitysnpsoftwareopenblas
7.16 score 3 dependents 20 scripts 448 downloads
iSEEu - iSEE Universe
iSEEu (the iSEE universe) contains diverse functionality to extend the usage of the iSEE package, including additional classes for the panels, or modes allowing easy configuration of iSEE applications.
Last updated 4 months ago
immunooncologyvisualizationguidimensionreductionfeatureextractionclusteringtranscriptiongeneexpressiontranscriptomicssinglecellcellbasedassayshacktoberfest
7.15 score 9 stars 1 dependents 35 scripts 265 downloads
SPsimSeq - Semi-parametric simulation tool for bulk and single-cell RNA sequencing data
SPsimSeq uses a specially designed exponential family for density estimation to construct the distribution of gene expression levels from a given real RNA sequencing dataset (single-cell or bulk), and subsequently simulates a new dataset from the estimated marginal distributions using Gaussian-copulas to retain the dependence between genes. It allows simulation of multiple groups and batches with any required sample size and library size.
Last updated 4 months ago
geneexpressionrnaseqsinglecellsequencingdnaseq
7.14 score 10 stars 1 dependents 29 scripts 403 downloads
basilisk.utils - Basilisk Installation Utilities
Implements utilities for installation of the basilisk package, primarily for creation of the underlying Conda instance. This allows us to avoid re-writing the same R code in both the configure script (for centrally administered R installations) and in the lazy installation mechanism (for distributed package binaries). It is highly unlikely that developers - or, heaven forbid, end-users! - will need to interact with this package directly; they should be using the basilisk package instead.
Last updated 4 months ago
infrastructure
7.14 score 38 dependents 8 scripts 4.9k downloads
gDRutils - A package with helper functions for processing drug response data
This package contains utility functions used throughout the gDR platform to fit data, manipulate data, and convert and validate data structures. This package also has the necessary default constants for gDR platform. Many of the functions are utilized by the gDRcore package.
Last updated 5 days ago
softwareinfrastructure
7.13 score 3 dependents 3 scripts 176 downloads
gDRimport - Package for handling the import of dose-response data
The package is a part of the gDR suite. It helps to prepare raw drug response data for downstream processing. It mainly contains helper functions for importing/loading/validating dose-response data provided in different file formats.
Last updated 7 days ago
softwareinfrastructuredataimport
7.11 score 2 stars 1 dependents 5 scripts 171 downloads
HiCExperiment - Bioconductor class for interacting with Hi-C files in R
R generic interface to Hi-C contact matrices in `.(m)cool`, `.hic` or HiC-Pro derived formats, as well as other Hi-C processed file formats. Contact matrices can be partially parsed using a random access method, allowing a memory-efficient representation of Hi-C data in R. The `HiCExperiment` class stores the Hi-C contacts parsed from local contact matrix files. `HiCExperiment` instances can be further investigated in R using the `HiContacts` analysis package.
Last updated 4 months ago
hicdna3dstructuredataimport
7.10 score 9 stars 2 dependents 47 scripts 256 downloads
dcanr - Differential co-expression/association network analysis
This package implements methods and an evaluation framework to infer differential co-expression/association networks. Various methods are implemented and can be evaluated using simulated datasets. Inference of differential co-expression networks can allow identification of networks that are altered between two conditions (e.g., health and disease).
Last updated 4 months ago
networkinferencegraphandnetworkdifferentialexpressionnetwork
7.05 score 6 stars 2 dependents 26 scripts 330 downloads
cardelino - Clone Identification from Single Cell Data
Methods to infer clonal tree configuration for a population of cells using single-cell RNA-seq data (scRNA-seq), and possibly other data modalities. Methods are also provided to assign cells to inferred clones and explore differences in gene expression between clones. These methods can flexibly integrate information from imperfect clonal trees inferred based on bulk exome-seq data, and sparse variant alleles expressed in scRNA-seq data. A flexible beta-binomial error model that accounts for stochastic dropout events as well as systematic allelic imbalance is used.
Last updated 4 months ago
singlecellrnaseqvisualizationtranscriptomicsgeneexpressionsequencingsoftwareexomeseqclonal-clusteringgibbs-samplingscrna-seqsingle-cellsomatic-mutations
7.05 score 60 stars 62 scripts 278 downloads
TADCompare - TADCompare: Identification and characterization of differential TADs
TADCompare is an R package designed to identify and characterize differential Topologically Associated Domains (TADs) between multiple Hi-C contact matrices. It contains functions for finding differential TADs between two datasets, finding differential TADs over time and identifying consensus TADs across multiple matrices. It takes all of the main types of HiC input and returns simple, comprehensive, easy to analyze results.
Last updated 4 months ago
softwarehicsequencingfeatureextractionclustering
7.04 score 23 stars 9 scripts 238 downloads
SharedObject - Sharing R objects across multiple R processes without memory duplication
This package is developed for facilitating parallel computing in R. It is capable of creating an R object in the shared memory space and sharing the data across multiple R processes. It avoids the overhead of memory duplication and data transfer, which makes sharing big data objects across many clusters possible.
Last updated 4 months ago
infrastructuresharedobjectcpp
7.03 score 45 stars 1 dependents 6 scripts 296 downloads
systemPipeShiny - systemPipeShiny: An Interactive Framework for Workflow Management and Visualization
systemPipeShiny (SPS) extends the widely used systemPipeR (SPR) workflow environment with a versatile graphical user interface provided by a Shiny App. This allows non-R users, such as experimentalists, to run many of systemPipeR’s workflow design, control, and visualization functionalities interactively without requiring knowledge of R. Most importantly, SPS has been designed as a general purpose framework for interacting with other R packages in an intuitive manner. Like most Shiny Apps, SPS can be used on both local computers as well as centralized server-based deployments that can be accessed remotely as a public web service for using SPR’s functionalities with community and/or private data. The framework can integrate many core packages from the R/Bioconductor ecosystem. Examples of SPS’ current functionalities include: (a) interactive creation of experimental designs and metadata using an easy to use tabular editor or file uploader; (b) visualization of workflow topologies combined with auto-generation of R Markdown preview for interactively designed workflows; (d) access to a wide range of data processing routines; and (e) an extendable set of visualization functionalities. Complex visual results can be managed on a 'Canvas Workbench' allowing users to organize and to compare plots in an efficient manner combined with a session snapshot feature to continue work at a later time. The present suite of pre-configured visualization examples. The modular design of SPR makes it easy to design custom functions without any knowledge of Shiny, as well as extending the environment in the future with contributions from the community.
Last updated 4 months ago
shinyappsinfrastructuredataimportsequencingqualitycontrolreportwritingexperimentaldesignclusteringbioconductorbioconductor-packagedata-visualizationshinysystempiper
7.03 score 33 stars 36 scripts 236 downloads
pipeComp - pipeComp pipeline benchmarking framework
A simple framework to facilitate the comparison of pipelines involving various steps and parameters. The `pipelineDefinition` class represents pipelines as, minimally, a set of functions consecutively executed on the output of the previous one, and optionally accompanied by step-wise evaluation and aggregation functions. Given such an object, a set of alternative parameters/methods, and benchmark datasets, the `runPipeline` function then proceeds through all combinations of arguments, avoiding recomputing the same step twice and compiling evaluations on the fly to avoid storing potentially large intermediate data.
Last updated 4 months ago
geneexpressiontranscriptomicsclusteringdatarepresentationbenchmarkbioconductorpipeline-benchmarkingpipelinessingle-cell-rna-seq
7.02 score 41 stars 43 scripts 203 downloads
affyPLM - Methods for fitting probe-level models
A package that extends and improves the functionality of the base affy package. Routines that make heavy use of compiled code for speed. Central focus is on implementation of methods for fitting probe-level models and tools using these models. PLM based quality assessment tools.
Last updated 13 days ago
microarrayonechannelpreprocessingqualitycontrolopenblaszlib
6.99 score 4 dependents 206 scripts 1.6k downloads
GenomicSuperSignature - Interpretation of RNA-seq experiments through robust, efficient comparison to public databases
This package provides a novel method for interpreting new transcriptomic datasets through near-instantaneous comparison to public archives without high-performance computing requirements. Through the pre-computed index, users can identify public resources associated with their dataset such as gene sets, MeSH terms, and publications. Functions to identify interpretable annotations and intuitive visualization options are implemented in this package.
Last updated 4 months ago
transcriptomicssystemsbiologyprincipalcomponentrnaseqsequencingpathwaysclusteringbioconductor-packageexploratory-data-analysisgseameshprincipal-component-analysisrna-sequencing-profilestransferlearning
6.97 score 16 stars 59 scripts 247 downloads
dir.expiry - Managing Expiration for Cache Directories
Implements an expiration system for access to versioned directories. Directories that have not been accessed by a registered function within a certain time frame are deleted. This aims to reduce disk usage by eliminating obsolete caches generated by old versions of packages.
Last updated 4 months ago
softwareinfrastructure
6.97 score 40 dependents 6 scripts 5.2k downloads
satuRn - Scalable Analysis of Differential Transcript Usage for Bulk and Single-Cell RNA-sequencing Applications
satuRn provides a highly performant and scalable framework for performing differential transcript usage analyses. The package consists of three main functions. The first function, fitDTU, fits quasi-binomial generalized linear models that model transcript usage in different groups of interest. The second function, testDTU, tests for differential usage of transcripts between groups of interest. Finally, plotDTU visualizes the usage profiles of transcripts in groups of interest.
Last updated 4 months ago
regressionexperimentaldesigndifferentialexpressiongeneexpressionrnaseqsequencingsoftwaresinglecelltranscriptomicsmultiplecomparisonvisualization
6.95 score 20 stars 1 dependents 74 scripts 552 downloads
CytoPipeline - Automation and visualization of flow cytometry data analysis pipelines
This package provides support for automation and visualization of flow cytometry data analysis pipelines. In its current state, the package focuses on the preprocessing and quality control part. The framework is based on two main S4 classes, i.e. CytoPipeline and CytoProcessingStep. The pipeline steps are linked to corresponding R functions that are either provided in the CytoPipeline package itself, or exported from a third party package, or coded by the user her/himself. The processing steps need to be specified centrally and explicitly using either a json input file or through step by step creation of a CytoPipeline object with dedicated methods. After having run the pipeline, obtained results at all steps can be retrieved and visualized thanks to file caching (the running facility uses a BiocFileCache implementation). The package also provides specific visualization tools like pipeline workflow summary display, and 1D/2D comparison plots of obtained flowFrames at various steps of the pipeline.
Last updated 4 months ago
flowcytometrypreprocessingqualitycontrolworkflowstepimmunooncologysoftwarevisualization
6.94 score 4 stars 2 dependents 18 scripts 196 downloads
scClassify - scClassify: single-cell Hierarchical Classification
scClassify is a multiscale classification framework for single-cell RNA-seq data based on ensemble learning and cell type hierarchies, enabling sample size estimation required for accurate cell type classification and joint classification of cells using multiple references.
Last updated 4 months ago
singlecellgeneexpressionclassification
6.92 score 23 stars 30 scripts 186 downloads
TileDBArray - Using TileDB as a DelayedArray Backend
Implements a DelayedArray backend for reading and writing dense or sparse arrays in the TileDB format. The resulting TileDBArrays are compatible with all Bioconductor pipelines that can accept DelayedArray instances.
Last updated 4 months ago
datarepresentationinfrastructuresoftware
6.89 score 10 stars 1 dependents 26 scripts 200 downloads
CRISPRseek - Design of guide RNAs in CRISPR genome-editing systems
The package encompasses functions to find potential guide RNAs for the CRISPR-based genome-editing systems including the Base Editors and the Prime Editors when supplied with target sequences as input. Users have the flexibility to filter resulting guide RNAs based on parameters such as the absence of restriction enzyme cut sites or the lack of paired guide RNAs. The package also facilitates genome-wide exploration for off-targets, offering features to score and rank off-targets, retrieve flanking sequences, and indicate whether the hits are located within exon regions. All detected guide RNAs are annotated with the cumulative scores of the top5 and topN off-targets together with the detailed information such as mismatch sites and restriction enzyme cut sites. The package also outputs INDELs and their frequencies for Cas9 targeted sites.
Last updated 4 days ago
immunooncologygeneregulationsequencematchingcrispr
6.88 score 2 dependents 50 scripts 352 downloads
BioTIP - BioTIP: An R package for characterization of Biological Tipping-Point
Adopting tipping-point theory to transcriptome profiles to unravel disease regulatory trajectory.
Last updated 4 months ago
sequencingrnaseqgeneexpressiontranscriptionsoftware
6.84 score 18 stars 37 scripts 204 downloads
peakPantheR - Peak Picking and Annotation of High Resolution Experiments
An automated pipeline for the detection, integration and reporting of predefined features across a large number of mass spectrometry data files. It enables the real time annotation of multiple compounds in a single file, or the parallel annotation of multiple compounds in multiple files. A graphical user interface as well as command line functions will assist in assessing the quality of annotation and update fitting parameters until a satisfactory result is obtained.
Last updated 4 months ago
massspectrometrymetabolomicspeakdetectionfeature-detectionmass-spectrometry
6.82 score 12 stars 23 scripts 181 downloads
SimBu - Simulate Bulk RNA-seq Datasets from Single-Cell Datasets
SimBu can be used to simulate bulk RNA-seq datasets with known cell type fractions. You can either use your own single-cell study for the simulation or the sfaira database. Different pre-defined simulation scenarios exist, as are options to run custom simulations. Additionally, expression values can be adapted by adding an mRNA bias, which produces more biologically relevant simulations.
Last updated 4 months ago
softwarernaseqsinglecell
6.81 score 14 stars 29 scripts 368 downloads
extraChIPs - Additional functions for working with ChIP-Seq data
This package builds on existing tools and adds some simple but extremely useful capabilities for working with ChIP-Seq data. The focus is on detecting differential binding windows/regions. One set of functions focusses on set-operations retaining mcols for GRanges objects, whilst another group of functions aids visualisation of results. Coercion to tibble objects is also implemented.
Last updated 4 months ago
chipseqhicsequencingcoverage
6.80 score 7 stars 25 scripts 262 downloads
MicrobiomeProfiler - An R/shiny package for microbiome functional enrichment analysis
This is an R/shiny package to perform functional enrichment analysis for microbiome data. This package is based on clusterProfiler. Moreover, MicrobiomeProfiler supports KEGG enrichment analysis, COG enrichment analysis, Microbe-Disease association enrichment analysis, and Metabo-Pathway analysis.
Last updated 4 months ago
microbiomesoftwarevisualizationkegg
6.79 score 37 stars 22 scripts 281 downloads
sechm - sechm: Complex Heatmaps from a SummarizedExperiment
sechm provides a simple interface between SummarizedExperiment objects and the ComplexHeatmap package. It enables plotting annotated heatmaps from SE objects, with easy access to rowData and colData columns, and implements a number of features to make the generation of heatmaps easier and more flexible. These functionalities used to be part of the SEtools package.
Last updated 4 months ago
geneexpressionvisualization
6.77 score 6 stars 2 dependents 55 scripts 292 downloads
VennDetail - A package for visualization and extract details
A set of functions to generate high-resolution Venn and Vennpie plots, and to extract and combine details of these subsets with user datasets in data frame format.
Last updated 4 months ago
datarepresentationgraphandnetworkextractvenndiagram
6.75 score 29 stars 65 scripts 422 downloads
scAnnotatR - Pretrained learning models for cell type prediction on single cell RNA-sequencing data
The package comprises a set of pretrained machine learning models to predict basic immune cell types. This enables all users to quickly get a first annotation of the cell types present in their dataset without requiring prior knowledge. scAnnotatR also allows users to train their own models to predict new cell types based on specific research needs.
Last updated 4 months ago
singlecelltranscriptomicsgeneexpressionsupportvectormachineclassificationsoftware
6.73 score 15 stars 20 scripts 494 downloads
ROTS - Reproducibility-Optimized Test Statistic
Calculates the Reproducibility-Optimized Test Statistic (ROTS) for differential testing in omics data.
Last updated 18 days ago
softwaregeneexpressiondifferentialexpressionmicroarrayrnaseqproteomicsimmunooncologycpp
6.72 score 3 dependents 84 scripts 492 downloads
waddR - Statistical tests for detecting differential distributions based on the 2-Wasserstein distance
The package offers statistical tests based on the 2-Wasserstein distance for detecting and characterizing differences between two distributions given in the form of samples. Functions for calculating the 2-Wasserstein distance and testing for differential distributions are provided, as well as a specifically tailored test for differential expression in single-cell RNA sequencing data.
Last updated 4 months ago
softwarestatisticalmethodsinglecelldifferentialexpressioncpp
6.70 score 25 stars 6 scripts 141 downloads
proActiv - Estimate Promoter Activity from RNA-Seq data
Most human genes have multiple promoters that control the expression of different isoforms. The use of these alternative promoters enables the regulation of isoform expression pre-transcriptionally. Alternative promoters have been found to be important in a wide number of cell types and diseases. proActiv is an R package that enables the analysis of promoters from RNA-seq data. proActiv uses aligned reads as input, and generates counts and normalized promoter activity estimates for each annotated promoter. In particular, proActiv accepts junction files from TopHat2 or STAR or BAM files as inputs. These estimates can then be used to identify which promoter is active, which promoter is inactive, and which promoters change their activity across conditions. proActiv also allows visualization of promoter activity across conditions.
Last updated 4 months ago
rnaseqgeneexpressiontranscriptionalternativesplicinggeneregulationdifferentialsplicingfunctionalgenomicsepigeneticstranscriptomicspreprocessingalternative-promotersgenomicspromoter-activitypromoter-annotationrna-seq-data
6.65 score 50 stars 15 scripts 192 downloads
ggspavis - Visualization functions for spatial transcriptomics data
Visualization functions for spatial transcriptomics data. Includes functions to generate several types of plots, including spot plots, feature (molecule) plots, reduced dimension plots, spot-level quality control (QC) plots, and feature-level QC plots, for datasets from the 10x Genomics Visium and other technological platforms. Datasets are assumed to be in either SpatialExperiment or SingleCellExperiment format.
Last updated 4 months ago
spatialsinglecelltranscriptomicsgeneexpressionqualitycontroldimensionreduction
6.64 score 2 stars 218 scripts 354 downloads
cellxgenedp - Discover and Access Single Cell Data Sets in the CELLxGENE Data Portal
The cellxgene data portal (https://cellxgene.cziscience.com/) provides a graphical user interface to collections of single-cell sequence data processed in standard ways to 'count matrix' summaries. The cellxgenedp package provides an alternative, R-based interface, allowing data discovery, viewing, and downloading.
Last updated 4 months ago
singlecelldataimportthirdpartyclient
6.64 score 8 stars 27 scripts 209 downloads
Rarr - Read Zarr Files in R
The Zarr specification defines a format for chunked, compressed, N-dimensional arrays. Its design allows efficient access to subsets of the stored array, and supports both local and cloud storage systems. Rarr aims to implement this specification in R with minimal reliance on external tools or libraries.
Last updated 8 days ago
dataimportome-zarron-diskout-of-memoryzarrc-blosclibzstd
6.63 score 34 stars 21 scripts 266 downloads
SPIA - Signaling Pathway Impact Analysis (SPIA) using combined evidence of pathway over-representation and unusual signaling perturbations
This package implements the Signaling Pathway Impact Analysis (SPIA), which uses the information from a list of differentially expressed genes and their log fold changes, together with signaling pathway topology, to identify the pathways most relevant to the condition under study.
Last updated 17 days ago
microarraygraphandnetwork
6.62 score 4 dependents 113 scripts 1.0k downloads
miaSim - Microbiome Data Simulation
Microbiome time series simulation with generalized Lotka-Volterra model, Self-Organized Instability (SOI), and other models. Hubbell's Neutral model is used to determine the abundance matrix. The resulting abundance matrix is applied to (Tree)SummarizedExperiment objects.
Last updated 4 months ago
microbiomesoftwaresequencingdnaseqatacseqcoveragenetwork
6.62 score 20 stars 23 scripts 192 downloads
CiteFuse - CiteFuse: multi-modal analysis of CITE-seq data
The CiteFuse package implements a suite of methods and tools for CITE-seq data from pre-processing to integrative analytics, including doublet detection, network-based modality integration, cell type clustering, differential RNA and protein expression analysis, ADT evaluation, ligand-receptor interaction analysis, and interactive web-based visualisation of the analyses.
Last updated 4 months ago
singlecellgeneexpressionbioinformaticssingle-cellcpp
6.61 score 28 stars 18 scripts 186 downloads
nipalsMCIA - Multiple Co-Inertia Analysis via the NIPALS Method
Computes Multiple Co-Inertia Analysis (MCIA), a joint dimensionality reduction (jDR) algorithm, for a multi-block dataset using a modification to the Nonlinear Iterative Partial Least Squares method (NIPALS) proposed in (Hanafi et al., 2010). Allows multiple options for row- and table-level preprocessing, and speeds up computation of variance explained. Vignettes detail application to bulk and single-cell multi-omics studies.
Last updated 5 days ago
softwareclusteringclassificationmultiplecomparisonnormalizationpreprocessingsinglecell
6.60 score 6 stars 7 scripts 162 downloads
Statial - A package to identify changes in cell state relative to spatial associations
Statial is a suite of functions for identifying changes in cell state. Statial provides robust quantification of cell type localisation that is invariant to changes in tissue structure. In addition, Statial uncovers changes in marker expression associated with varying levels of localisation. These features can be used to explore how the structure and function of different cell types may be altered by the agents they are surrounded with.
Last updated 4 months ago
singlecellspatialclassificationsingle-cell
6.56 score 5 stars 20 scripts 180 downloads
zenith - Gene set analysis following differential expression using linear (mixed) modeling with dream
Zenith performs gene set analysis on the result of differential expression using linear (mixed) modeling with dream by considering the correlation between gene expression traits. This package implements the camera method from the limma package proposed by Wu and Smyth (2012). Zenith is a simple extension of camera to be compatible with linear mixed models implemented in variancePartition::dream().
Last updated 4 months ago
rnaseqgeneexpressiongenesetenrichmentdifferentialexpressionbatcheffectqualitycontrolregressionepigeneticsfunctionalgenomicstranscriptomicsnormalizationpreprocessingmicroarrayimmunooncologysoftware
6.52 score 1 dependents 91 scripts 254 downloads
methylclock - Methylclock - DNA methylation-based clocks
This package allows estimation of chronological and gestational DNA methylation (DNAm) age as well as biological age using different methylation clocks. Chronological DNAm age (in years): Horvath's clock, Hannum's clock, BNN, Horvath's skin+blood clock, PedBE clock and Wu's clock. Gestational DNAm age: Knight's clock, Bohlin's clock, Mayne's clock and Lee's clocks. Biological DNAm clocks: Levine's clock and Telomere Length's clock.
Last updated 4 months ago
dnamethylationbiologicalquestionpreprocessingstatisticalmethodnormalizationcpp
6.52 score 39 stars 28 scripts 338 downloads
tricycle - tricycle: Transferable Representation and Inference of cell cycle
The package contains functions to infer and visualize cell cycle process using Single Cell RNASeq data. It exploits the idea of transfer learning, projecting new data to the previously learned biologically interpretable space. We provide a pre-learned cell cycle space, which could be used to infer cell cycle time of human and mouse single cell samples. In addition, we also offer functions to visualize cell cycle time on different embeddings and functions to build new references.
Last updated 4 months ago
singlecellsoftwaretranscriptomicsrnaseqtranscriptionbiologicalquestiondimensionreductionimmunooncology
6.51 score 24 stars 45 scripts 490 downloads
RNAmodR - Detection of post-transcriptional modifications in high throughput sequencing data
RNAmodR provides classes and workflows for loading and aggregating data from high throughput sequencing aimed at detecting post-transcriptional modifications through analysis of specific patterns. In addition, utilities are provided to validate and visualize the results. The RNAmodR package provides a core functionality from which specific analysis strategies can be easily implemented as a separate package.
Last updated 4 months ago
softwareinfrastructureworkflowstepvisualizationsequencingalkanilineseqbioconductormodificationsribomethseqrnarnamodr
6.51 score 3 stars 3 dependents 9 scripts 244 downloads
coMethDMR - Accurate identification of co-methylated and differentially methylated regions in epigenome-wide association studies
coMethDMR identifies genomic regions associated with continuous phenotypes by optimally leveraging covariations among CpGs within predefined genomic regions. Instead of testing all CpGs within a genomic region, coMethDMR carries out an additional step that selects co-methylated sub-regions first without using any outcome information. Next, coMethDMR tests association between methylation within the sub-region and continuous phenotype using a random coefficient mixed effects model, which models both variations between CpG sites within the region and differential methylation simultaneously.
Last updated 4 months ago
dnamethylationepigeneticsmethylationarraydifferentialmethylationgenomewideassociation
6.47 score 7 stars 42 scripts 230 downloads
bugsigdbr - R-side access to published microbial signatures from BugSigDB
The bugsigdbr package implements convenient access to bugsigdb.org from within R/Bioconductor. The goal of the package is to facilitate import of BugSigDB data into R/Bioconductor, provide utilities for extracting microbe signatures, and enable export of the extracted signatures to plain text files in standard file formats such as GMT.
Last updated 4 months ago
dataimportgenesetenrichmentmetagenomicsmicrobiomebioconductor-package
6.46 score 3 stars 48 scripts 257 downloads
recountmethylation - Access and analyze public DNA methylation array data compilations
Resources for cross-study analyses of public DNAm array data from NCBI GEO repo, produced using Illumina's Infinium HumanMethylation450K (HM450K) and MethylationEPIC (EPIC) platforms. Provided functions enable download, summary, and filtering of large compilation files. Vignettes detail background about file formats, example analyses, and more. Note the disclaimer on package load and consult the main manuscripts for further info.
Last updated 4 months ago
dnamethylationepigeneticsmicroarraymethylationarrayexperimenthub
6.45 score 9 stars 9 scripts 254 downloads
SpotClean - SpotClean adjusts for spot swapping in spatial transcriptomics data
SpotClean is a computational method to adjust for spot swapping in spatial transcriptomics data. Recent spatial transcriptomics experiments utilize slides containing thousands of spots with spot-specific barcodes that bind mRNA. Ideally, unique molecular identifiers at a spot measure spot-specific expression, but this is often not the case due to bleed from nearby spots, an artifact we refer to as spot swapping. SpotClean is able to estimate the contamination rate in observed data and decontaminate the spot swapping effect, thus increasing the sensitivity and precision of downstream analyses.
Last updated 4 months ago
dataimportrnaseqsequencinggeneexpressionspatialsinglecelltranscriptomicspreprocessingrna-seqspatial-transcriptomics
6.45 score 26 stars 36 scripts 200 downloads
NanoStringNCTools - NanoString nCounter Tools
Tools for NanoString Technologies nCounter Technology. Provides support for reading RCC files into an ExpressionSet derived object. Also includes methods for QC and normalization of NanoString data.
Last updated 4 months ago
geneexpressiontranscriptioncellbasedassaysdataimporttranscriptomicsproteomicsmrnamicroarrayproprietaryplatformsrnaseq
6.45 score 4 dependents 93 scripts 642 downloads
tanggle - Visualization of Phylogenetic Networks
Offers functions for plotting split (or implicit) networks (unrooted, undirected) and explicit networks (rooted, directed) with reticulations, extending 'ggtree' and using functions from 'ape' and 'phangorn'. It extends the 'ggtree' package [@Yu2017] to allow the visualization of phylogenetic networks using the 'ggplot2' syntax. It offers an alternative to the plot functions already available in 'ape' Paradis and Schliep (2019) <doi:10.1093/bioinformatics/bty633> and 'phangorn' Schliep (2011) <doi:10.1093/bioinformatics/btq706>.
Last updated 4 months ago
softwarevisualizationphylogeneticsalignmentclusteringmultiplesequencealignmentdataimport
6.44 score 11 stars 42 scripts 160 downloads
MSstatsShiny - MSstats GUI for Statistical Analysis of Proteomics Experiments
MSstatsShiny is an R-Shiny graphical user interface (GUI) integrated with the R packages MSstats, MSstatsTMT, and MSstatsPTM. It provides a point and click end-to-end analysis pipeline applicable to a wide variety of experimental designs. These include data-dependent acquisitions (DDA) which are label-free or tandem mass tag (TMT)-based, as well as DIA, SRM, and PRM acquisitions and those targeting post-translational modifications (PTMs). The application automatically saves users' selections and builds an R script that recreates their analysis, supporting reproducible data analysis.
Last updated 4 months ago
immunooncologymassspectrometryproteomicssoftwareshinyappsdifferentialexpressiononechanneltwochannelnormalizationqualitycontrolgui
6.43 score 15 stars 4 scripts 232 downloads
alabaster.se - Load and Save SummarizedExperiments from File
Save SummarizedExperiments into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated 4 months ago
dataimportdatarepresentation
6.39 score 6 dependents 8 scripts 3.9k downloads
SynExtend - Tools for Working With Synteny Objects
Shared order between genomic sequences provides a great deal of information. Synteny objects produced by the R package DECIPHER provide quantitative information about that shared order. SynExtend provides tools for extracting information from Synteny objects.
Last updated 13 hours ago
geneticsclusteringcomparativegenomicsdataimportfortranopenmp
6.39 score 1 stars 74 scripts 188 downloads
miQC - Flexible, probabilistic metrics for quality control of scRNA-seq data
Single-cell RNA-sequencing (scRNA-seq) has made it possible to profile gene expression in tissues at high resolution. An important preprocessing step prior to performing downstream analyses is to identify and remove cells with poor or degraded sample quality using quality control (QC) metrics. Two widely used QC metrics to identify a ‘low-quality’ cell are (i) if the cell includes a high proportion of reads that map to mitochondrial DNA encoded genes (mtDNA) and (ii) if a small number of genes are detected. miQC is a data-driven QC metric that jointly models both the proportion of reads mapping to mtDNA and the number of detected genes with mixture models in a probabilistic framework to predict the low-quality cells in a given dataset.
Last updated 4 months ago
singlecellqualitycontrolgeneexpressionpreprocessingsequencing
6.38 score 19 stars 63 scripts 281 downloads
SingleMoleculeFootprinting - Analysis tools for Single Molecule Footprinting (SMF) data
SingleMoleculeFootprinting provides functions to analyze Single Molecule Footprinting (SMF) data. Following the workflow exemplified in its vignette, the user will be able to perform basic data analysis of SMF data with minimal coding effort. Starting from an aligned bam file, we show how to perform quality controls over sequencing libraries, extract methylation information at the single molecule level accounting for the two possible kinds of SMF experiments (single enzyme or double enzyme), classify single molecules based on their patterns of molecular occupancy, and plot SMF information at a given genomic location.
Last updated 7 days ago
dnamethylationcoveragenucleosomepositioningdatarepresentationepigeneticsmethylseqqualitycontrolsequencing
6.35 score 2 stars 25 scripts 224 downloads
distinct - distinct: a method for differential analyses via hierarchical permutation tests
distinct is a statistical method to perform differential testing between two or more groups of distributions; differential testing is performed via hierarchical non-parametric permutation tests on the cumulative distribution functions (cdfs) of each sample. While most methods for differential expression target differences in the mean abundance between conditions, distinct, by comparing full cdfs, identifies both differential patterns involving changes in the mean and more subtle variations that do not involve the mean (e.g., unimodal vs. bi-modal distributions with the same mean). distinct is a general and flexible tool: due to its fully non-parametric nature, which makes no assumptions on how the data was generated, it can be applied to a variety of datasets. It is particularly suitable to perform differential state analyses on single cell data (i.e., differential analyses within sub-populations of cells), such as single cell RNA sequencing (scRNA-seq) and high-dimensional flow or mass cytometry (HDCyto) data. To use distinct one needs data from two or more groups of samples (i.e., experimental conditions), with at least 2 samples (i.e., biological replicates) per group.
Last updated 4 months ago
geneticsrnaseqsequencingdifferentialexpressiongeneexpressionmultiplecomparisonsoftwaretranscriptionstatisticalmethodvisualizationsinglecellflowcytometrygenetargetopenblascpp
6.35 score 11 stars 1 dependents 34 scripts 508 downloads
SpliceWiz - interactive analysis and visualization of alternative splicing in R
The analysis and visualization of alternative splicing (AS) events from RNA sequencing data remains challenging. SpliceWiz is a user-friendly and performance-optimized R package for AS analysis. It processes alignment BAM files to quantify read counts across splice junctions, performs IRFinder-based intron retention quantitation, and supports novel splicing event identification. We introduce a novel visualization for AS using normalized coverage, thereby allowing visualization of differential AS across conditions. SpliceWiz features a shiny-based GUI facilitating interactive data exploration of results including gene ontology enrichment. It is performance optimized with multi-threaded processing of BAM files and a new COV file format for fast recall of sequencing coverage. Overall, SpliceWiz streamlines AS analysis, enabling reliable identification of functionally relevant AS events for further characterization.
Last updated 4 months ago
softwaretranscriptomicsrnaseqalternativesplicingcoveragedifferentialsplicingdifferentialexpressionguisequencingcppopenmp
6.35 score 16 stars 8 scripts 311 downloads
CellMixS - Evaluate Cellspecific Mixing
CellMixS provides metrics and functions to evaluate batch effects, data integration and batch effect correction in single cell transcriptome data with single cell resolution. Results can be visualized and summarised on different levels, e.g. on cell, celltype or dataset level.
Last updated 4 months ago
singlecelltranscriptomicsgeneexpressionbatcheffect
6.35 score 7 stars 64 scripts 236 downloads
gscreend - Analysis of pooled genetic screens
Package for the analysis of pooled genetic screens (e.g. CRISPR-KO). The analysis of such screens is based on the comparison of gRNA abundances before and after a cell proliferation phase. The gscreend package takes gRNA counts as input and allows detection of genes whose knockout decreases or increases cell proliferation.
Last updated 4 months ago
softwarestatisticalmethodpooledscreenscrispr
6.34 score 11 stars 7 scripts 194 downloads
SingleCellAlleleExperiment - S4 Class for Single Cell Data with Allele and Functional Levels for Immune Genes
Defines an S4 class that is based on SingleCellExperiment. In addition to the usual gene layer, the object can also store data for immune genes such as HLAs, Igs and KIRs at allele and functional level. The package is part of a workflow named single-cell ImmunoGenomic Diversity (scIGD), which first incorporates allele-aware quantification data for immune genes. This new data can then be used with the data structure and functionalities implemented here for further data handling and data analysis.
Last updated 15 days ago
datarepresentationinfrastructuresinglecelltranscriptomicsgeneexpressiongeneticsimmunooncologydataimport
6.34 score 7 stars 12 scripts 168 downloads
alabaster.ranges - Load and Save Ranges-related Artifacts from File
Save GenomicRanges, IRanges and related data structures into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated 4 months ago
dataimportdatarepresentation
6.32 score 7 dependents 8 scripts 4.0k downloads
scDataviz - scDataviz: single cell dataviz and downstream analyses
In the single cell world, which includes flow cytometry, mass cytometry, single-cell RNA-seq (scRNA-seq), and others, there is a need to improve data visualisation and to bring analysis capabilities to researchers even from non-technical backgrounds. scDataviz attempts to fit into this space, while also catering for advanced users. Additionally, due to the way that scDataviz is designed, which is based on SingleCellExperiment, it has a 'plug and play' feel, and immediately lends itself as flexible and compatible with studies that go beyond scDataviz. Finally, the graphics in scDataviz are generated via the ggplot engine, which means that users can 'add on' features to these with ease.
Last updated 4 months ago
singlecellimmunooncologyrnaseqgeneexpressiontranscriptionflowcytometrymassspectrometrydataimport
6.30 score 63 stars 16 scripts 248 downloads
Herper - The Herper package is a simple toolset to install and manage conda packages and environments from R
Many tools for data analysis are not available in R, but are present in public repositories like conda. The Herper package provides a comprehensive set of functions to interact with the conda package management system. With Herper users can install, manage and run conda packages from the comfort of their R session. Herper also provides an ad-hoc approach to handling external system requirements for R packages. For people developing packages with python conda dependencies we recommend using basilisk (https://bioconductor.org/packages/release/bioc/html/basilisk.html) to internally support these system requirements pre-hoc.
Last updated 4 months ago
infrastructuresoftware
6.29 score 5 stars 52 scripts 258 downloads
RCX - R package implementing the Cytoscape Exchange (CX) format
Create, handle, validate, visualize and convert networks in the Cytoscape exchange (CX) format to standard data types and objects. The package also provides conversion to and from objects of iGraph and graphNEL. The CX format is also used by the NDEx platform, an online commons for biological networks, and the network visualization software Cytoscape.
Last updated 4 months ago
pathwaysdataimportnetwork
6.28 score 8 stars 1 dependents 10 scripts 189 downloads
StructuralVariantAnnotation - Variant annotations for structural variants
StructuralVariantAnnotation provides a framework for analysis of structural variants within the Bioconductor ecosystem. This package contains useful helper functions for dealing with structural variants in VCF format. The package contains functions for parsing VCFs from a number of popular callers as well as functions for dealing with breakpoints involving two separate genomic loci encoded as GRanges objects.
Last updated 4 months ago
dataimportsequencingannotationgeneticsvariantannotation
6.26 score 2 dependents 102 scripts 606 downloads
gDRstyle - A package with style requirements for the gDR suite
This package serves as a helper package for the whole gDR suite. It supports good development practices by keeping style requirements and style tests for the other packages. It also contains build helpers that ensure all package requirements are met.
Last updated 7 days ago
softwareinfrastructure
6.26 score 2 stars 2 scripts 204 downloads
doubletrouble - Identification and classification of duplicated genes
doubletrouble aims to identify duplicated genes from whole-genome protein sequences and classify them based on their modes of duplication. The duplication modes are i. segmental duplication (SD); ii. tandem duplication (TD); iii. proximal duplication (PD); iv. transposed duplication (TRD) and; v. dispersed duplication (DD). Transposon-derived duplicates (TRD) can be further subdivided into rTRD (retrotransposon-derived duplication) and dTRD (DNA transposon-derived duplication). If users want a simpler classification scheme, duplicates can also be classified into SD- and SSD-derived (small-scale duplication) gene pairs. Besides classifying gene pairs, users can also classify genes, so that each gene is assigned a unique mode of duplication. Users can also calculate substitution rates per substitution site (i.e., Ka and Ks) from duplicate pairs, find peaks in Ks distributions with Gaussian Mixture Models (GMMs), and classify gene pairs into age groups based on Ks peaks.
Last updated 4 months ago
softwarewholegenomecomparativegenomicsfunctionalgenomicsphylogeneticsnetworkclassificationbioinformaticscomparative-genomicsgene-duplicationmolecular-evolutionwhole-genome-duplication
6.25 score 13 stars 17 scripts 198 downloads
CopyNumberPlots - Create Copy-Number Plots using karyoploteR functionality
CopyNumberPlots provides a set of functions extending karyoploteR's functionality to create beautiful, customizable and flexible plots of copy-number related data.
Last updated 4 months ago
visualizationcopynumbervariationcoverageonechanneldataimportsequencingdnaseqbioconductorbioconductor-packagebioinformaticscopy-number-variationgenomicsgenomics-visualization
6.24 score 6 stars 2 dependents 16 scripts 313 downloads
VariantFiltering - Filtering of coding and non-coding genetic variants
Filter genetic variants using different criteria such as inheritance model, amino acid change consequence, minor allele frequencies across human populations, splice site strength, conservation, etc.
Last updated 10 days ago
geneticshomo_sapiensannotationsnpsequencinghighthroughputsequencing
6.23 score 4 stars 21 scripts 359 downloads
crisprViz - Visualization Functions for CRISPR gRNAs
Provides functionalities to visualize and contextualize CRISPR guide RNAs (gRNAs) on genomic tracks across nucleases and applications. Works in conjunction with the crisprBase and crisprDesign Bioconductor packages. Plots are produced using the Gviz framework.
Last updated 4 months ago
crisprfunctionalgenomicsgenetargetbioconductorbioconductor-packagecrispr-analysiscrispr-designgrnagrna-sequencegrna-sequencessgrnasgrna-designvisualization
6.23 score 7 stars 2 dependents 6 scripts 215 downloads
scMET - Bayesian modelling of cell-to-cell DNA methylation heterogeneity
High-throughput single-cell measurements of DNA methylomes can quantify methylation heterogeneity and uncover its role in gene regulation. However, technical limitations and sparse coverage can preclude this task. scMET is a hierarchical Bayesian model which overcomes sparsity, sharing information across cells and genomic features to robustly quantify genuine biological heterogeneity. scMET can identify highly variable features that drive epigenetic heterogeneity, and perform differential methylation and variability analyses. We illustrate how scMET facilitates the characterization of epigenetically distinct cell populations and how it enables the formulation of novel hypotheses on the epigenetic regulation of gene expression.
Last updated 4 months ago
immunooncologydnamethylationdifferentialmethylationdifferentialexpressiongeneexpressiongeneregulationepigeneticsgeneticsclusteringfeatureextractionregressionbayesiansequencingcoveragesinglecellbayesian-inferencegeneralised-linear-modelsheterogeneityhierarchical-modelsmethylation-analysissingle-cellcpp
6.23 score 20 stars 42 scripts 161 downloads
timeOmics - Time-Course Multi-Omics data integration
timeOmics is a generic data-driven framework to integrate multi-Omics longitudinal data measured on the same biological samples and select key temporal features with strong associations within the same sample group. The main steps of timeOmics are: 1. Platform and time-specific normalization and filtering steps; 2. Modelling each biological feature into one time expression profile; 3. Clustering features with the same expression profile over time; 4. Post-hoc validation step.
Last updated 4 months ago
clusteringfeatureextractiontimecoursedimensionreductionsoftwaresequencingmicroarraymetabolomicsmetagenomicsproteomicsclassificationregressionimmunooncologygenepredictionmultiplecomparisonclusterintegrationmulti-omicstime-series
6.23 score 24 stars 10 scripts 265 downloads
SBGNview - "SBGNview: Data Analysis, Integration and Visualization on SBGN Pathways"
SBGNview is a tool set for pathway based data visualization, integration and analysis. SBGNview is similar and complementary to the widely used Pathview, with the following key features: 1. Pathway definition by the widely adopted Systems Biology Graphical Notation (SBGN); 2. Supports multiple major pathway databases beyond KEGG (Reactome, MetaCyc, SMPDB, PANTHER, METACROP) and user defined pathways; 3. Covers 5,200 reference pathways and over 3,000 species by default; 4. Extensive graphics controls, including glyph and edge attributes, graph layout and sub-pathway highlight; 5. SBGN pathway data manipulation, processing, extraction and analysis.
Last updated 4 months ago
genetargetpathwaysgraphandnetworkvisualizationgenesetenrichmentdifferentialexpressiongeneexpressionmicroarrayrnaseqgeneticsmetabolomicsproteomicssystemsbiologysequencing
6.22 score 25 stars 22 scripts 374 downloads
rpx - R Interface to the ProteomeXchange Repository
The rpx package implements an interface to proteomics data submitted to the ProteomeXchange consortium.
Last updated 4 days ago
immunooncologyproteomicsmassspectrometrydataimportthirdpartyclientbioconductordatamass-spectrometryproteomexchange
6.20 score 5 stars 21 scripts 531 downloads
MOMA - Multi Omic Master Regulator Analysis
This package implements the Multi Omic Master Regulator Analysis (MOMA) algorithm for inferring candidate master regulator proteins from multi-omics data, as well as ancillary analysis and visualization functions.
Last updated 4 months ago
softwarenetworkenrichmentnetworkinferencenetworkfeatureextractionclusteringfunctionalgenomicstranscriptomicssystemsbiology
6.19 score 6 stars 13 scripts 162 downloads
alabaster.schemas - Schemas for the Alabaster Framework
Stores all schemas required by various alabaster.* packages. No computation should be performed by this package, as that is handled by alabaster.base. We use a separate package instead of storing the schemas in alabaster.base itself, to avoid conflating management of the schemas with code maintenance.
Last updated 4 months ago
datarepresentationdataimport
6.18 score 15 dependents 3.3k downloads
idpr - Profiling and Analyzing Intrinsically Disordered Proteins in R
‘idpr’ aims to integrate tools for the computational analysis of intrinsically disordered proteins (IDPs) within R. This package is used to identify known characteristics of IDPs for a sequence of interest with easily reported and dynamic results. Additionally, this package includes tools for IDP-based sequence analysis to be used in conjunction with other R packages. Described in McFadden WM & Yanowitz JL (2022). "idpr: A package for profiling and analyzing Intrinsically Disordered Proteins in R." PloS one, 17(4), e0266929. <https://doi.org/10.1371/journal.pone.0266929>.
Last updated 4 months ago
structuralpredictionproteomicscellbiology
6.16 score 4 stars 20 scripts 228 downloads
HIPPO - Heterogeneity-Induced Pre-Processing tOol
HIPPO selects features and clusters cells simultaneously for single-cell UMI data from scRNA-seq experiments. It uses a novel feature selection method based on zero inflation instead of gene variance, and is computationally faster than other existing methods since it relies only on PCA+Kmeans rather than graph-clustering or consensus clustering.
Last updated 4 months ago
sequencingsinglecellgeneexpressiondifferentialexpressionclustering
6.16 score 18 stars 4 scripts 154 downloads
biscuiteer - Convenience Functions for Biscuit
A test harness for bsseq loading of Biscuit output, summarization of WGBS data over defined regions and in mappable samples, with or without imputation, dropping of mostly-NA rows, age estimates, etc.
Last updated 4 months ago
dataimportmethylseqdnamethylation
6.16 score 6 stars 16 scripts 290 downloads
dearseq - Differential Expression Analysis for RNA-seq data through a robust variance component test
Differential expression analysis of RNA-seq data with a variance component score test accounting for data heteroscedasticity through precision weights. Performs both gene-wise and gene set analyses, and can deal with repeated or longitudinal data. Methods are detailed in: i) Agniel D & Hejblum BP (2017) Variance component score test for time-course gene set analysis of longitudinal RNA-seq data, Biostatistics, 18(4):589-604; and ii) Gauthier M, Agniel D, Thiébaut R & Hejblum BP (2020) dearseq: a variance component score test for RNA-Seq differential analysis that effectively controls the false discovery rate, NAR Genomics and Bioinformatics, 2(4):lqaa093.
Last updated 4 months ago
biomedicalinformaticscellbiologydifferentialexpressiondnaseqgeneexpressiongeneticsgenesetenrichmentimmunooncologykeggregressionrnaseqsequencingsystemsbiologytimecoursetranscriptiontranscriptomics
6.14 score 7 stars 1 dependents 11 scripts 585 downloads
struct - Statistics in R Using Class-based Templates
Defines and includes a set of class-based templates for developing and implementing data processing and analysis workflows, with a strong emphasis on statistics and machine learning. The templates can be used and where needed extended to 'wrap' tools and methods from other packages into a common standardised structure to allow for effective and fast integration. Model objects can be combined into sequences, and sequences nested in iterators using overloaded operators to simplify and improve readability of the code. Ontology lookup has been integrated and implemented to provide standardised definitions for methods, inputs and outputs wrapped using the class-based templates.
Last updated 4 months ago
workflowstep
6.14 score 3 dependents 76 scripts 276 downloads
scFeatures - scFeatures: Multi-view representations of single-cell and spatial data for disease outcome prediction
scFeatures is a tool that generates multi-view representations of single-cell and spatial data through the construction of a total of 17 feature types. These features can then be used for a variety of analyses using other software in Bioconductor.
Last updated 4 months ago
cellbasedassayssinglecellspatialsoftwaretranscriptomics
6.13 score 10 stars 15 scripts 177 downloads
atena - Analysis of Transposable Elements
Quantify expression of transposable elements (TEs) from RNA-seq data through different methods, including ERVmap, TEtranscripts and Telescope. A common interface is provided to use each of these methods, which consists of building a parameter object, calling the quantification function with this object and getting a SummarizedExperiment object as output container of the quantified expression profiles. The implementation allows one to quantify TEs and gene transcripts in an integrated manner.
Last updated 8 days ago
transcriptiontranscriptomicsrnaseqsequencingpreprocessingsoftwaregeneexpressioncoveragedifferentialexpressionfunctionalgenomics
6.13 score 9 stars 1 scripts 376 downloads
GPA - GPA (Genetic analysis incorporating Pleiotropy and Annotation)
This package provides functions for fitting GPA, a statistical framework to prioritize GWAS results by integrating pleiotropy information and annotation data. In addition, it also includes ShinyGPA, an interactive visualization toolkit to investigate pleiotropic architecture.
Last updated 4 months ago
softwarestatisticalmethodclassificationgenomewideassociationsnpgeneticsclusteringmultiplecomparisonpreprocessinggeneexpressiondifferentialexpressioncpp
6.11 score 13 stars 7 scripts 182 downloads
peco - A Supervised Approach for **P**r**e**dicting **c**ell Cycle Pr**o**gression using scRNA-seq data
Our approach provides a way to assign continuous cell cycle phase using scRNA-seq data and, consequently, allows identification of cyclic trends in gene expression levels along the cell cycle. This package provides the method and training data, which include scRNA-seq data collected from 6 individual cell lines of induced pluripotent stem cells (iPSCs), and also continuous cell cycle phase derived from FUCCI fluorescence imaging data.
Last updated 4 months ago
sequencingrnaseqgeneexpressiontranscriptomicssinglecellsoftwarestatisticalmethodclassificationvisualizationcell-cyclesingle-cell-rna-seq
6.09 score 12 stars 34 scripts 185 downloads
cogeqc - Systematic quality checks on comparative genomics analyses
cogeqc aims to facilitate systematic quality checks on standard comparative genomics analyses to help researchers detect issues and select the most suitable parameters for each data set. cogeqc can be used to assess: i. genome assembly and annotation quality with BUSCOs and comparisons of statistics with publicly available genomes on the NCBI; ii. orthogroup inference using a protein domain-based approach and; iii. synteny detection using synteny network properties. There are also data visualization functions to explore QC summary statistics.
Last updated 4 months ago
softwaregenomeassemblycomparativegenomicsfunctionalgenomicsphylogeneticsqualitycontrolnetworkcomparative-genomicsevolutionary-genomics
6.09 score 8 stars 17 scripts 158 downloads
TrajectoryUtils - Single-Cell Trajectory Analysis Utilities
Implements low-level utilities for single-cell trajectory analysis, primarily intended for re-use inside higher-level packages. Includes a function to create a cluster-level minimum spanning tree and data structures to hold pseudotime inference results.
Last updated 4 months ago
geneexpressionsinglecell
6.07 score 9 dependents 16 scripts 3.0k downloads
VERSO - Viral Evolution ReconStructiOn (VERSO)
Mutations that rapidly accumulate in viral genomes during a pandemic can be used to track the evolution of the virus and, accordingly, unravel the viral infection network. To this end, sequencing samples of the virus can be employed to estimate models from genomic epidemiology and may serve, for instance, to estimate the proportion of undetected infected people by uncovering cryptic transmissions, as well as to predict likely trends in the number of infected, hospitalized, dead and recovered people. VERSO is an algorithmic framework that processes variant profiles from viral samples to produce phylogenetic models of viral evolution. The approach solves a Boolean Matrix Factorization problem with phylogenetic constraints, by maximizing a log-likelihood function. VERSO includes two separate and subsequent steps; in this package we provide an R implementation of VERSO STEP 1.
Last updated 4 months ago
biomedicalinformaticssequencingsomaticmutation
6.05 score 7 stars 154 downloads
metaseqR2 - An R package for the analysis and result reporting of RNA-Seq data by combining multiple statistical algorithms
Provides an interface to several normalization and statistical testing packages for RNA-Seq gene expression data. Additionally, it creates several diagnostic plots, performs meta-analysis by combining the results of several statistical tests and reports the results in an interactive way.
Last updated 4 months ago
softwaregeneexpressiondifferentialexpressionworkflowsteppreprocessingqualitycontrolnormalizationreportwritingrnaseqtranscriptionsequencingtranscriptomicsbayesianclusteringcellbiologybiomedicalinformaticsfunctionalgenomicssystemsbiologyimmunooncologyalternativesplicingdifferentialsplicingmultiplecomparisontimecoursedataimportatacseqepigeneticsregressionproprietaryplatformsgenesetenrichmentbatcheffectchipseq
6.05 score 7 stars 3 scripts 232 downloads
vissE - Visualising Set Enrichment Analysis Results
This package enables the interpretation and analysis of results from a gene set enrichment analysis using network-based and text-mining approaches. Most enrichment analyses result in large lists of significant gene sets that are difficult to interpret. Tools in this package help build a similarity-based network of significant gene sets from a gene set enrichment analysis that can then be investigated for their biological function using text-mining approaches.
Last updated 4 months ago
softwaregeneexpressiongenesetenrichmentnetworkenrichmentnetworkbioinformatics
6.05 score 13 stars 19 scripts 224 downloads
RiboCrypt - Interactive visualization in genomics
R Package for interactive visualization and browsing of NGS data. It contains a browser for both transcript and genomic coordinate views. In addition, QC and general metaplots are included, among them differential translation plots and gene expression plots. The package is still under development.
Last updated 4 months ago
softwaresequencingriboseqrnaseq
6.04 score 5 stars 22 scripts 177 downloads
scanMiR - scanMiR
A set of tools for working with miRNA affinity models (KdModels), efficiently scanning for miRNA binding sites, and predicting target repression. It supports scanning using miRNA seeds, full miRNA sequences (enabling 3' alignment) and KdModels, and includes the prediction of slicing and TDMD sites. Finally, it includes utility and plotting functions (e.g. for the visual representation of miRNA-target alignment).
Last updated 4 months ago
mirnasequencematchingalignment
6.04 score 1 dependents 52 scripts 230 downloads
MOGAMUN - MOGAMUN: A Multi-Objective Genetic Algorithm to Find Active Modules in Multiplex Biological Networks
MOGAMUN is a multi-objective genetic algorithm that identifies active modules in a multiplex biological network. This allows analyzing different biological networks at the same time. MOGAMUN is based on NSGA-II (Non-Dominated Sorting Genetic Algorithm, version II), which we adapted to work on networks.
Last updated 4 months ago
systemsbiologygraphandnetworkdifferentialexpressionbiomedicalinformaticstranscriptomicsclusteringnetwork
6.03 score 12 stars 5 scripts 196 downloads
IgGeneUsage - Differential gene usage in immune repertoires
Detection of biases in the usage of immunoglobulin (Ig) genes is an important task in immune repertoire profiling. IgGeneUsage detects aberrant Ig gene usage between biological conditions using a probabilistic model which is analyzed computationally by Bayes inference. In this way, IgGeneUsage also avoids some common problems related to the current practice of null-hypothesis significance testing.
Last updated 4 months ago
differentialexpressionregressiongeneticsbayesianbiomedicalinformaticsimmunooncologymathematicalbiologyb-cell-receptorbcr-repertoiredifferential-analysisdifferential-gene-expressionhigh-throughput-sequencingimmune-repertoireimmune-repertoire-analysisimmune-repertoiresimmunogenomicsimmunoglobulinimmunoinformaticsimmunological-bioinformaticsimmunologytcr-repertoirevdj-recombinationcpp
6.03 score 6 stars 1 scripts 183 downloads
Dino - Normalization of Single-Cell mRNA Sequencing Data
Dino normalizes single-cell, mRNA sequencing data to correct for technical variation, particularly sequencing depth, prior to downstream analysis. The approach produces a matrix of corrected expression for which the dependency between sequencing depth and the full distribution of normalized expression is removed; many existing methods aim to remove only the dependency between sequencing depth and the mean of the normalized expression. This is particularly useful in the context of highly sparse datasets such as those produced by 10X Genomics and other unique molecular identifier (UMI) based microfluidics protocols for which the depth-dependent proportion of zeros in the raw expression data can otherwise present a challenge.
Last updated 4 months ago
softwarenormalizationrnaseqsinglecellsequencinggeneexpressiontranscriptomicsregressioncellbasedassays
6.02 score 9 stars 13 scripts 164 downloads
gemini - GEMINI: Variational inference approach to infer genetic interactions from pairwise CRISPR screens
GEMINI uses log-fold changes to model sample-dependent and independent effects, and uses a variational Bayes approach to infer these effects. The inferred effects are used to score and identify genetic interactions, such as lethality and recovery. More details can be found in Zamanighomi et al. 2019 (in press).
Last updated 4 months ago
softwarecrisprbayesiandataimportcomputational-biologygenetic-interactions
6.02 score 15 stars 9 scripts 158 downloads
autonomics - Unified Statistical Modeling of Omics Data
This package unifies access to Statistical Modeling of Omics Data. Across linear modeling engines (lm, lme, lmer, limma, and wilcoxon). Across coding systems (treatment, difference, deviation, etc). Across model formulae (with/without intercept, random effect, interaction or nesting). Across omics platforms (microarray, rnaseq, msproteomics, affinity proteomics, metabolomics). Across projection methods (pca, pls, sma, lda, spls, opls). Across clustering methods (hclust, pam, cmeans). It provides a fast enrichment analysis implementation. And an intuitive contrastogram visualisation to summarize contrast effects in complex designs.
Last updated 14 days ago
softwaredataimportpreprocessingdimensionreductionprincipalcomponentregressiondifferentialexpressiongenesetenrichmenttranscriptomicstranscriptiongeneexpressionrnaseqmicroarrayproteomicsmetabolomicsmassspectrometry
6.00 score 5 scripts 184 downloads
escape - Easy single cell analysis platform for enrichment
A bridging R package to facilitate gene set enrichment analysis (GSEA) in the context of single-cell RNA sequencing. Using raw count information, Seurat objects, or SingleCellExperiment format, users can perform and visualize ssGSEA, GSVA, AUCell, and UCell-based enrichment calculations across individual cells.
Last updated 17 days ago
softwaresinglecellclassificationannotationgenesetenrichmentsequencinggenesignalingpathways
5.98 score 146 scripts 772 downloads
FEAST - FEAture SelcTion (FEAST) for Single-cell clustering
Cell clustering is one of the most important and commonly performed tasks in single-cell RNA sequencing (scRNA-seq) data analysis. An important step in cell clustering is to select a subset of genes (referred to as “features”), whose expression patterns will then be used for downstream clustering. A good set of features should include the ones that distinguish different cell types, and the quality of such a set could have significant impact on the clustering accuracy. FEAST is an R library for selecting the most representative features before performing the core of scRNA-seq clustering. It can be used as a plug-in for established clustering algorithms such as SC3, TSCAN, SHARP, SIMLR, and Seurat. The core of the FEAST algorithm includes three steps: 1. consensus clustering; 2. gene-level significance inference; 3. validation of an optimized feature set.
Last updated 4 months ago
sequencingsinglecellclusteringfeatureextraction
5.97 score 10 stars 47 scripts 286 downloads
simpleSeg - A package to perform simple cell segmentation
Image segmentation is the process of identifying the borders of individual objects (in this case cells) within an image. This allows for the features of cells such as marker expression and morphology to be extracted, stored and analysed. simpleSeg provides functionality for user friendly, watershed based segmentation on multiplexed cellular images in R based on the intensity of user specified protein marker channels. simpleSeg can also be used for the normalization of single cell data obtained from multiple images.
Last updated 4 months ago
classificationsurvivalsinglecellnormalizationspatialspatial-statistics
5.96 score 2 dependents 19 scripts 193 downloads
transformGamPoi - Variance Stabilizing Transformation for Gamma-Poisson Models
Variance-stabilizing transformations help with the analysis of heteroskedastic data (i.e., data where the variance is not constant, like count data). This package provides two types of variance stabilizing transformations: (1) methods based on the delta method (e.g., 'acosh', 'log(x+1)'), (2) model residual based (Pearson and randomized quantile residuals).
Last updated 4 months ago
singlecellnormalizationpreprocessingregressioncpp
5.95 score 21 stars 21 scripts 193 downloads
scPCA - Sparse Contrastive Principal Component Analysis
A toolbox for sparse contrastive principal component analysis (scPCA) of high-dimensional biological data. scPCA combines the stability and interpretability of sparse PCA with contrastive PCA's ability to disentangle biological signal from unwanted variation through the use of control data. Also implements and extends cPCA.
Last updated 14 hours ago
principalcomponentgeneexpressiondifferentialexpressionsequencingmicroarrayrnaseqbioconductorcontrastive-learningdimensionality-reduction
5.94 score 12 stars 29 scripts 232 downloads
ceRNAnetsim - Regulation Simulator of Interaction between miRNA and Competing RNAs (ceRNA)
This package simulates the regulation of ceRNA (competing endogenous RNA) expression levels after an expression level change in one or more miRNAs/mRNAs. The methodology adopted by the package has the potential to incorporate any ceRNA (circRNA, lincRNA, etc.) into the miRNA:target interaction network. The package basically distributes miRNA expression over available ceRNAs, where each ceRNA attracts miRNAs in proportion to its amount. However, the package can also utilize multiple parameters that modify the miRNA effect on its target (seed type, binding energy, binding location, etc.). The functions handle the given dataset as a graph object, and the processes progress via edge and node variables.
Last updated 4 months ago
networkinferencesystemsbiologynetworkgraphandnetworktranscriptomicscernamirnanetwork-biologynetwork-simulatortcgatidygraphtidyverse
5.94 score 4 stars 12 scripts 149 downloads
spatialDE - R wrapper for SpatialDE
SpatialDE is a method to find spatially variable genes (SVG) from spatial transcriptomics data. This package provides wrappers to use the Python SpatialDE library in R, using reticulate and basilisk.
Last updated 4 months ago
softwaretranscriptomicspythonspatial-datawrapper
5.94 score 3 stars 16 scripts 240 downloads
demuxmix - Demultiplexing oligo-barcoded scRNA-seq data using regression mixture models
A package for demultiplexing single-cell sequencing experiments of pooled cells labeled with barcode oligonucleotides. The package implements methods to fit regression mixture models for a probabilistic classification of cells, including multiplet detection. Demultiplexing error rates can be estimated, and methods for quality control are provided.
Last updated 4 months ago
singlecellsequencingpreprocessingclassificationregression
5.93 score 5 stars 1 dependents 19 scripts 384 downloads
SCOPE - A normalization and copy number estimation method for single-cell DNA sequencing
Whole genome single-cell DNA sequencing (scDNA-seq) enables characterization of copy number profiles at the cellular level. This circumvents the averaging effects associated with bulk-tissue sequencing and has increased resolution yet decreased ambiguity in deconvolving cancer subclones and elucidating cancer evolutionary history. ScDNA-seq data is, however, sparse, noisy, and highly variable even within a homogeneous cell population, due to the biases and artifacts that are introduced during the library preparation and sequencing procedure. Here, we propose SCOPE, a normalization and copy number estimation method for scDNA-seq data. The distinguishing features of SCOPE include: (i) utilization of cell-specific Gini coefficients for quality controls and for identification of normal/diploid cells, which are further used as negative control samples in a Poisson latent factor model for normalization; (ii) modeling of GC content bias using an expectation-maximization algorithm embedded in the Poisson generalized linear models, which accounts for the different copy number states along the genome; (iii) a cross-sample iterative segmentation procedure to identify breakpoints that are shared across cells from the same genetic background.
Last updated 4 months ago
singlecellnormalizationcopynumbervariationsequencingwholegenomecoveragealignmentqualitycontroldataimportdnaseq
5.91 score 82 scripts 256 downloads
HiContacts - Analysing cool files in R with HiContacts
HiContacts provides a collection of tools to analyse and visualize Hi-C datasets imported in R by HiCExperiment.
Last updated 4 months ago
hicdna3dstructure
5.91 score 11 stars 49 scripts 270 downloads
MuData - Serialization for MultiAssayExperiment Objects
Save MultiAssayExperiments to h5mu files supported by muon and mudata. Muon is a Python framework for multimodal omics data analysis. It uses an HDF5-based format for data storage.
Last updated 4 months ago
dataimportanndatabioconductormudatamulti-omicsmultimodal-omicsscrna-seq
5.89 score 5 stars 26 scripts 204 downloads
DiscoRhythm - Interactive Workflow for Discovering Rhythmicity in Biological Data
Set of functions for estimation of cyclical characteristics, such as period, phase, amplitude, and statistical significance in large temporal datasets. Supporting functions are available for quality control, dimensionality reduction, spectral analysis, and analysis of experimental replicates. Contains an R Shiny web interface to execute all workflow steps.
Last updated 4 months ago
softwaretimecoursequalitycontrolvisualizationguiprincipalcomponentbioconductordata-visualizationoscillationsrhythm-detectionwebapp
5.89 score 13 stars 9 scripts 220 downloads
GBScleanR - Error correction tool for noisy genotyping by sequencing (GBS) data
GBScleanR is a package for quality check, filtering, and error correction of genotype data derived from next generation sequencer (NGS) based genotyping platforms. GBScleanR takes a Variant Call Format (VCF) file as input. The main function of this package is `estGeno()`, which estimates the true genotypes of samples from given read counts for genotype markers using a hidden Markov model that incorporates the uneven observation ratio of allelic reads. This implementation gives robust genotype estimation even in the noisy genotype data usually observed in Genotyping-By-Sequencing (GBS) and similar methods, e.g. RADseq. The current implementation accepts genotype data of a diploid population at any generation of a multi-parental cross, e.g. biparental F2 from inbred parents, biparental F2 from outbred parents, and 8-way recombinant inbred lines (8-way RILs), which can be referred to as a MAGIC population.
Last updated 6 days ago
geneticvariabilitysnpgeneticshiddenmarkovmodelsequencingqualitycontrolcpp
5.88 score 4 stars 6 scripts 273 downloads
cfDNAPro - cfDNAPro extracts and Visualises biological features from whole genome sequencing data of cell-free DNA
cfDNA fragments carry important features for building cancer sample classification ML models, such as fragment size, and fragment end motif etc. Analyzing and visualizing fragment size metrics, as well as other biological features in a curated, standardized, scalable, well-documented, and reproducible way might be time intensive. This package intends to resolve these problems and simplify the process. It offers two sets of functions for cfDNA feature characterization and visualization.
Last updated 4 months ago
visualizationsequencingwholegenomebioinformaticscancer-genomicscancer-researchcell-free-dnaearly-detectiongenomics-visualizationliquid-biopsyswgswhole-genome-sequencing
5.86 score 28 stars 13 scripts 254 downloads
crisprBowtie - Bowtie-based alignment of CRISPR gRNA spacer sequences
Provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bowtie. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Both DNA- and RNA-targeting nucleases are supported.
Last updated 4 months ago
crisprfunctionalgenomicsalignmentalignerbioconductorbioconductor-packagebowtiecrispr-analysiscrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequencessgrnasgrna-design
5.86 score 3 stars 4 dependents 7 scripts 340 downloads
ChromSCape - Analysis of single-cell epigenomics datasets with a Shiny App
ChromSCape - Chromatin landscape profiling for Single Cells - is a ready-to-launch user-friendly Shiny Application for the analysis of single-cell epigenomics datasets (scChIP-seq, scATAC-seq, scCUT&Tag, ...) from aligned data to differential analysis & gene set enrichment analysis. It is highly interactive, enables users to save their analysis and covers a wide range of analytical steps: QC, preprocessing, filtering, batch correction, dimensionality reduction, visualization, clustering, differential analysis and gene set analysis.
Last updated 4 months ago
shinyappssoftwaresinglecellchipseqatacseqmethylseqclassificationclusteringepigeneticsprincipalcomponentannotationbatcheffectmultiplecomparisonnormalizationpathwayspreprocessingqualitycontrolreportwritingvisualizationgenesetenrichmentdifferentialpeakcallingepigenomicsshinysingle-cellcpp
5.85 score 14 stars 17 scripts 180 downloads
GeoTcgaData - Processing Various Types of Data on GEO and TCGA
Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) provide us with a wealth of data, such as RNA-seq, DNA Methylation, SNP and Copy number variation data. It's easy to download data from TCGA using the gdc tool, but processing these data into a format suitable for bioinformatics analysis requires more work. This R package was developed to handle these data.
Last updated 4 months ago
geneexpressiondifferentialexpressionrnaseqcopynumbervariationmicroarraysoftwarednamethylationdifferentialmethylationsnpatacseqmethylationarray
5.84 score 24 stars 19 scripts 354 downloads
scBubbletree - Quantitative visual exploration of scRNA-seq data
scBubbletree is a quantitative method for the visual exploration of scRNA-seq data, preserving key biological properties such as local and global cell distances and cell density distributions across samples. It effectively resolves overplotting and enables the visualization of diverse cell attributes from multiomic single-cell experiments. Additionally, scBubbletree is user-friendly and integrates seamlessly with popular scRNA-seq analysis tools, facilitating comprehensive and intuitive data interpretation.
Last updated 4 months ago
visualizationclusteringsinglecelltranscriptomicsrnaseqbig-databigdatascrna-seqscrna-seq-analysisvisualvisual-exploration
5.82 score 6 stars 8 scripts 198 downloads
SingleCellSignalR - Cell Signalling Using Single Cell RNAseq Data Analysis
Supports single-cell RNA-seq data analysis and clustering, builds internal networks, and infers cell-cell interactions.
Last updated 4 months ago
singlecellnetworkclusteringrnaseqclassification
5.78 score 1 dependents 29 scripts 438 downloads
BiocFHIR - Illustration of FHIR ingestion and transformation using R
FHIR R4 bundles in JSON format are derived from https://synthea.mitre.org/downloads. Transformation inspired by a kaggle notebook published by Dr Alexander Scarlat, https://www.kaggle.com/code/drscarlat/fhir-starter-parse-healthcare-bundles-into-tables. This is a very limited illustration of some basic parsing and reorganization processes. Additional tooling will be required to move beyond the Synthea data illustrations.
Last updated 4 months ago
infrastructuredataimportdatarepresentationfhir
5.78 score 4 stars 15 scripts 167 downloads
ompBAM - C++ Library for OpenMP-based multi-threaded sequential profiling of Binary Alignment Map (BAM) files
This package provides C++ header files for developers wishing to create R packages that process BAM files. ompBAM automates file access, memory management, and handling of multiple threads 'behind the scenes', so developers can focus on creating domain-specific functionality. The included vignette contains detailed documentation of this API, including quick-start instructions to create a new ompBAM-based package, and a step-by-step explanation of the functionality behind the example package included within ompBAM.
Last updated 4 months ago
alignmentdataimportrnaseqsoftwaresequencingtranscriptomicssinglecell
5.78 score 4 stars 1 dependents 3 scripts 250 downloads
deconvR - Simulation and Deconvolution of Omic Profiles
This package provides a collection of functions designed for analyzing deconvolution of the bulk sample(s) using an atlas of reference omic signature profiles and a user-selected model. Users are given the option to create or extend a reference atlas and also to simulate the desired size of the bulk signature profile of the reference cell types. The package includes the cell-type-specific methylation atlas and Illumina Epic B5 probe IDs that can be used in deconvolution. Additionally, we included BSmeth2Probe to make mapping WGBS data to their probe IDs easier.
Last updated 4 months ago
dnamethylationregressiongeneexpressionrnaseqsinglecellstatisticalmethodtranscriptomicsbioconductor-packagedeconvolutiondna-methylationomics
5.78 score 10 stars 15 scripts 280 downloads
circRNAprofiler - circRNAprofiler: An R-Based Computational Framework for the Downstream Analysis of Circular RNAs
R-based computational framework for a comprehensive in silico analysis of circRNAs. This computational framework allows users to combine and analyze circRNAs previously detected by multiple publicly available annotation-based circRNA detection tools. It covers different aspects of circRNA analysis, from differential expression analysis, evolutionary conservation, and biogenesis to functional analysis.
Last updated 4 months ago
annotationstructuralpredictionfunctionalpredictiongenepredictiongenomeassemblydifferentialexpression
5.78 score 10 stars 5 scripts 288 downloads
scRNAseqApp - A single-cell RNAseq Shiny app-package
The scRNAseqApp is a Shiny app package designed for interactive visualization of single-cell data. It is an enhanced version derived from ShinyCell, repackaged to accommodate multiple datasets. The app enables users to visualize data containing various types of information simultaneously, facilitating comprehensive analysis. Additionally, it includes a user management system to regulate database accessibility for different users.
Last updated 11 days ago
visualizationsinglecellrnaseqinteractive-visualizationsmultiple-usersshiny-appssingle-cell-rna-seq
5.76 score 4 stars 3 scripts 201 downloads
BANDITS - BANDITS: Bayesian ANalysis of DIfferenTial Splicing
BANDITS is a Bayesian hierarchical model for detecting differential splicing of genes and transcripts, via differential transcript usage (DTU), between two or more conditions. The method uses a Bayesian hierarchical framework, which allows for sample specific proportions in a Dirichlet-Multinomial model, and samples the allocation of fragments to the transcripts. Parameters are inferred via Markov chain Monte Carlo (MCMC) techniques and a DTU test is performed via a multivariate Wald test on the posterior densities for the average relative abundance of transcripts.
Last updated 4 months ago
differentialsplicingalternativesplicingbayesiangeneticsrnaseqsequencingdifferentialexpressiongeneexpressionmultiplecomparisonsoftwaretranscriptionstatisticalmethodvisualizationopenblascpp
5.75 score 17 stars 1 dependents 11 scripts 390 downloads
APAlyzer - A toolkit for APA analysis using RNA-seq data
Perform 3'UTR APA, Intronic APA and gene expression analysis using RNA-seq data.
Last updated 4 months ago
sequencingrnaseqdifferentialexpressiongeneexpressiongeneregulationannotationdataimportsoftwareative-polyadenylationbioinformatics-toolrna-seq
5.75 score 7 stars 9 scripts 188 downloads
GWENA - Pipeline for augmented co-expression analysis
The development of high-throughput sequencing led to increased use of co-expression analysis to go beyond single feature (i.e. gene) focus. We propose GWENA (Gene Whole co-Expression Network Analysis), a tool designed to perform gene co-expression network analysis and explore the results in a single pipeline. It includes functional enrichment of modules of co-expressed genes, phenotypic association, topological analysis and comparison of network configurations between conditions.
Last updated 4 months ago
softwaregeneexpressionnetworkclusteringgraphandnetworkgenesetenrichmentpathwaysvisualizationrnaseqtranscriptomicsmrnamicroarraymicroarraynetworkenrichmentsequencinggoco-expressionenrichment-analysisgenenetwork-analysispipeline
5.72 score 22 stars 12 scripts 458 downloads
DAMEfinder - Finds DAMEs - Differential Allelicly MEthylated regions
'DAMEfinder' offers functionality for taking methtuple or bismark outputs to calculate ASM scores and compute DAMEs. It also offers nice visualization of methyl-circle plots.
Last updated 4 months ago
dnamethylationdifferentialmethylationcoverage
5.70 score 10 stars 9 scripts 300 downloads
PAST - Pathway Association Study Tool (PAST)
PAST takes GWAS output and assigns SNPs to genes, uses those genes to find pathways associated with the genes, and plots pathways based on significance. Implements methods for reading GWAS input data, finding genes associated with SNPs, calculating enrichment score and significance of pathways, and plotting pathways.
Last updated 4 months ago
pathwaysgenesetenrichment
5.70 score 5 stars 7 scripts 196 downloads
debCAM - Deconvolution by Convex Analysis of Mixtures
An R package for fully unsupervised deconvolution of complex tissues. It provides basic functions to perform unsupervised deconvolution on mixture expression profiles by Convex Analysis of Mixtures (CAM) and some auxiliary functions to help understand the subpopulation-specific results. It also implements functions to perform supervised deconvolution based on prior knowledge of molecular markers, S matrix or A matrix. Combining molecular markers from CAM and from prior knowledge can achieve semi-supervised deconvolution of mixtures.
Last updated 4 months ago
softwarecellbiologygeneexpressionopenjdk
5.69 score 7 stars 14 scripts 272 downloads
methyLImp2 - Missing value estimation of DNA methylation data
This package estimates missing values in DNA methylation data. The methyLImp method is based on linear regression, since methylation levels show a high degree of inter-sample correlation. The implementation is parallelised over chromosomes, since probes on different chromosomes are usually independent. A mini-batch approach that reduces the runtime for large numbers of samples is available.
Last updated 7 days ago
dnamethylationmicroarraysoftwaremethylationarrayregressionimputationmethylationmissing-value-imputation
5.68 score 6 stars 3 scripts 126 downloads
GenomicPlot - Plot profiles of next generation sequencing data in genomic features
Visualization of next generation sequencing (NGS) data is essential for interpreting high-throughput genomics experiment results. 'GenomicPlot' facilitates plotting of NGS data in various formats (bam, bed, wig and bigwig); both coverage and enrichment over input can be computed and displayed with respect to genomic features (such as UTR, CDS, enhancer), and user defined genomic loci or regions. Statistical tests on signal intensity within user defined regions of interest can be performed and represented as boxplots or bar graphs. Parallel processing is used to speed up computation on multicore platforms. In addition to genomic plots, which are suitable for displaying coverage of genomic DNA (such as ChIPseq data), metagenomic (without introns) plots can also be made for RNAseq or CLIPseq data.
Last updated 11 days ago
alternativesplicingchipseqcoveragegeneexpressionrnaseqsequencingsoftwaretranscriptionvisualizationannotation
5.68 score 3 stars 4 scripts 202 downloads
CytoGLMM - Conditional Differential Analysis for Flow and Mass Cytometry Experiments
The CytoGLMM R package implements two multiple regression strategies: A bootstrapped generalized linear model (GLM) and a generalized linear mixed model (GLMM). Most current data analysis tools compare expressions across many computationally discovered cell types. CytoGLMM focuses on just one cell type. Our narrower field of application allows us to define a more specific statistical model with easier to control statistical guarantees. As a result, CytoGLMM finds differential proteins in flow and mass cytometry data while reducing biases arising from marker correlations and safeguarding against false discoveries induced by patient heterogeneity.
Last updated 4 months ago
flowcytometryproteomicssinglecellcellbasedassayscellbiologyimmunooncologyregressionstatisticalmethodsoftware
5.68 score 2 stars 1 dependents 1 scripts 214 downloads
metabCombiner - Method for Combining LC-MS Metabolomics Feature Measurements
This package aligns LC-HRMS metabolomics datasets acquired from biologically similar specimens analyzed under similar, but not necessarily identical, conditions. Peak-picked and simply aligned metabolomics feature tables (consisting of m/z, rt, and per-sample abundance measurements, plus optional identifiers & adduct annotations) are accepted as input. The package outputs a combined table of feature pair alignments, organized into groups of similar m/z, and ranked by a similarity score. Input tables are assumed to be acquired using similar (but not necessarily identical) analytical methods.
Last updated 4 months ago
softwaremassspectrometrymetabolomicsmass-spectrometry
5.65 score 10 stars 5 scripts 202 downloads
ppcseq - Probabilistic Outlier Identification for RNA Sequencing Generalized Linear Models
Relative transcript abundance has proven to be a valuable tool for understanding the function of genes in biological systems. For the differential analysis of transcript abundance using RNA sequencing data, the negative binomial model is by far the most frequently adopted. However, common methods that are based on a negative binomial model are not robust to extreme outliers, which we found to be abundant in public datasets. So far, no rigorous and probabilistic methods for detection of outliers have been developed for RNA sequencing data, leaving the identification mostly to visual inspection. Recent advances in Bayesian computation allow large-scale comparison of observed data against its theoretical distribution given in a statistical model. Here we propose ppcseq, a key quality-control tool for identifying transcripts that include outlier data points in differential expression analysis, which do not follow a negative binomial distribution. Applying ppcseq to analyse several publicly available datasets using popular tools, we show that from 3 to 10 percent of differentially abundant transcripts across algorithms and datasets had statistics inflated by the presence of outliers.
Last updated 4 months ago
rnaseqdifferentialexpressiongeneexpressionnormalizationclusteringqualitycontrolsequencingtranscriptiontranscriptomicsbayesian-inferencedeseq2edgernegative-binomialoutlierstancpp
5.65 score 7 stars 16 scripts 214 downloads
rgoslin - Lipid Shorthand Name Parsing and Normalization
The R implementation for the Grammar of Succinct Lipid Nomenclature parses different shorthand notation dialects for lipid names. It normalizes them to a standard name. It further provides calculated monoisotopic masses and sum formulas for each successfully parsed lipid name and supplements it with LIPID MAPS Category and Class information. Also, the structural level and further structural details about the head group, fatty acyls and functional groups are returned, where applicable.
Last updated 4 months ago
softwarelipidomicsmetabolomicspreprocessingnormalizationmassspectrometrycpp
5.64 score 5 stars 22 scripts 269 downloads
VplotR - Set of tools to make V-plots and compute footprint profiles
The pattern of digestion and protection from DNA nucleases such as DNAse I, micrococcal nuclease, and Tn5 transposase can be used to infer the location of associated proteins. This package contains useful functions to analyze patterns of paired-end sequencing fragment density. VplotR facilitates the generation of V-plots and footprint profiles over single or aggregated genomic loci of interest.
Last updated 4 months ago
nucleosomepositioningcoveragesequencingbiologicalquestionatacseqalignment
5.64 score 10 stars 11 scripts 208 downloads
planet - Placental DNA methylation analysis tools
This package contains R functions to predict biological variables from placental DNA methylation data generated from Infinium arrays. This includes inferring ethnicity/ancestry, gestational age, and cell composition from placental DNA methylation array (450k/850k) data.
Last updated 9 days ago
softwaredifferentialmethylationepigeneticsmicroarraymethylationarraydnamethylationcpgislandancestrydna-methylation-datageneticsinferencemachine-learningplacenta
5.64 score 4 stars 1 dependents 12 scripts 370 downloads
SEtools - SEtools: tools for working with SummarizedExperiment
This includes a set of convenience functions for working with the SummarizedExperiment class. Note that plotting functions historically in this package have been moved to the sechm package (see vignette for details).
Last updated 4 months ago
geneexpression
5.63 score 2 stars 71 scripts 412 downloads
MSstatsLiP - LiP Significance Analysis in shotgun mass spectrometry-based proteomic experiments
Tools for LiP peptide and protein significance analysis. Provides functions for summarization, estimation of LiP peptide abundance, and detection of changes across conditions. Utilizes functionality across the MSstats family of packages.
Last updated 4 months ago
immunooncologymassspectrometryproteomicssoftwaredifferentialexpressiononechanneltwochannelnormalizationqualitycontrolcpp
5.62 score 7 stars 5 scripts 196 downloads
PICB - piRNA Cluster Builder
piRNAs (short for PIWI-interacting RNAs) and their PIWI protein partners play a key role in fertility and maintaining genome integrity by restricting mobile genetic elements (transposons) in germ cells. piRNAs originate from genomic regions known as piRNA clusters. The piRNA Cluster Builder (PICB) is a versatile toolkit designed to identify genomic regions with a high density of piRNAs. It constructs piRNA clusters through a stepwise integration of unique and multimapping piRNAs and offers wide-ranging parameter settings, supported by an optimization function that allows users to test different parameter combinations to tailor the analysis to their specific piRNA system. The output includes extensive metadata columns, enabling researchers to rank clusters and extract cluster characteristics.
Last updated 5 days ago
geneticsgenomeannotationsequencingfunctionalpredictioncoveragetranscriptomics
5.60 score 5 stars 22 downloads
gpuMagic - An openCL compiler with the capacity to compile R functions and run the code on GPU
The package aims to help users write openCL code with little or no effort. It is able to compile a user-defined R function and run it on a device such as a CPU or a GPU. The user can also write and run their openCL code directly by calling the .kernel function.
Last updated 4 months ago
infrastructureocl-icdcpp
5.60 score 10 stars 1 scripts 216 downloads
scviR - experimental interface from R to scvi-tools
This package defines interfaces from R to scvi-tools. A vignette works through the totalVI tutorial for analyzing CITE-seq data. Another vignette compares outputs of Chapter 12 of the OSCA book with analogous outputs based on totalVI quantifications. Future work will address other components of scvi-tools, with a focus on building understanding of probabilistic methods based on variational autoencoders.
Last updated 4 months ago
infrastructuresinglecelldataimportbioconductorcite-seqscverse
5.60 score 6 stars 11 scripts 147 downloads
UMI4Cats - UMI4Cats: Processing, analysis and visualization of UMI-4C chromatin contact data
UMI-4C is a technique that allows characterization of 3D chromatin interactions with a bait of interest, taking advantage of a sonication step to produce unique molecular identifiers (UMIs) that help remove duplication bias, thus allowing a better differential comparison of chromatin interactions between conditions. This package allows processing of UMI-4C data, starting from FastQ files provided by the sequencing facility. It provides two statistical methods for detecting differential contacts and includes a visualization function to plot integrated information from a UMI-4C assay.
Last updated 4 months ago
qualitycontrolpreprocessingalignmentnormalizationvisualizationsequencingcoveragechromatinchromatin-interactiongenomicsumi4c
5.57 score 5 stars 7 scripts 197 downloads
SpatialExperimentIO - Read in Xenium, CosMx, MERSCOPE or STARmapPLUS data as SpatialExperiment object
Read in imaging-based spatial transcriptomics technology data. Currently available modules are for Xenium by 10X Genomics, CosMx by Nanostring, MERSCOPE by Vizgen, or STARmapPLUS from Broad Institute. You can choose to read the data in as a SpatialExperiment or a SingleCellExperiment object.
Last updated 8 days ago
datarepresentationdataimportinfrastructuretranscriptomicssinglecellspatialgeneexpression
5.56 score 8 stars 17 downloads
bandle - An R package for the Bayesian analysis of differential subcellular localisation experiments
The Bandle package enables the analysis and visualisation of differential localisation experiments using mass-spectrometry data. Experimental methods supported include dynamic LOPIT-DC, hyperLOPIT, Dynamic Organellar Maps, Dynamic PCP. It provides Bioconductor infrastructure to analyse these data.
Last updated 15 days ago
bayesianclassificationclusteringimmunooncologyqualitycontroldataimportproteomicsmassspectrometryopenblascppopenmp
5.56 score 4 stars 3 scripts 158 downloads
TAPseq - Targeted scRNA-seq primer design for TAP-seq
Design primers for targeted single-cell RNA-seq used by TAP-seq. Create sequence templates for target gene panels and design gene-specific primers using Primer3. Potential off-targets can be estimated with BLAST. Requires working installations of Primer3 and BLASTn.
Last updated 4 months ago
singlecellsequencingtechnologycrisprpooledscreens
5.56 score 4 stars 9 scripts 194 downloads
BgeeCall - Automatic RNA-Seq present/absent gene expression calls generation
BgeeCall generates present/absent gene expression calls without using an arbitrary cutoff like TPM<1. Calls are generated based on reference intergenic sequences. These sequences are generated based on expression of all RNA-Seq libraries of each species integrated in Bgee (https://bgee.org).
Last updated 4 months ago
softwaregeneexpressionrnaseqbiologygene-expressiongene-levelintergenic-regionspresent-absent-callsrna-seqrna-seq-librariesscrna-seq
5.56 score 3 stars 9 scripts 207 downloads
HiLDA - Conducting statistical inference on comparing the mutational exposures of mutational signatures by using hierarchical latent Dirichlet allocation
A package built under the Bayesian framework of applying hierarchical latent Dirichlet allocation. It statistically tests whether the mutational exposures of mutational signatures (Shiraishi-model signatures) are different between two groups. The package also provides inference and visualization.
Last updated 4 months ago
softwaresomaticmutationsequencingstatisticalmethodbayesianmutational-signaturesrjagssomatic-mutationscppjags
5.56 score 3 stars 1 dependents 7 scripts 210 downloads
lionessR - Modeling networks for individual samples using LIONESS
LIONESS, or Linear Interpolation to Obtain Network Estimates for Single Samples, can be used to reconstruct single-sample networks (https://arxiv.org/abs/1505.06440). This code implements the LIONESS equation in the lioness function in R to reconstruct single-sample networks. The default network reconstruction method we use is based on Pearson correlation. However, lionessR can run on any network reconstruction algorithm that returns a complete, weighted adjacency matrix. lionessR works for both unipartite and bipartite networks.
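As a rough illustration of the LIONESS equation described above, the following Python sketch estimates the network of sample q from N samples as N * (net_all - net_without_q) + net_without_q, using Pearson correlation as the aggregate network method. It is not lionessR's API: the function names `pearson`, `correlation_network`, and `lioness` here are hypothetical, and the input layout (a list of genes, each a list of per-sample values) is an assumption made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_network(expr):
    """Aggregate network: gene-by-gene Pearson correlation matrix.

    expr is a list of genes, each a list of per-sample values
    (an assumed layout for this sketch).
    """
    g = len(expr)
    return [[pearson(expr[i], expr[j]) for j in range(g)] for i in range(g)]


def lioness(expr):
    """Return one network per sample using the LIONESS equation:

        net_q = N * (net_all - net_without_q) + net_without_q
    """
    n_samples = len(expr[0])
    net_all = correlation_network(expr)
    networks = []
    for q in range(n_samples):
        # Rebuild the aggregate network with sample q left out.
        reduced = [row[:q] + row[q + 1:] for row in expr]
        net_minus = correlation_network(reduced)
        networks.append([
            [n_samples * (a - m) + m for a, m in zip(row_all, row_minus)]
            for row_all, row_minus in zip(net_all, net_minus)
        ])
    return networks
```

The sketch assumes every gene still varies after any single sample is removed; otherwise the correlation is undefined. Swapping `correlation_network` for another function that returns a complete, weighted adjacency matrix leaves the rest unchanged.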
Last updated 4 months ago
networknetworkinferencegeneexpression
5.55 score 21 stars 17 scripts 202 downloads
alabaster.sce - Load and Save SingleCellExperiment from File
Save SingleCellExperiment into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated 4 months ago
dataimportdatarepresentation
5.51 score 3 dependents 4 scripts 1.4k downloads
GeoDiff - Count model based differential expression and normalization on GeoMx RNA data
A series of statistical models using count generating distributions for background modelling, feature and sample QC, normalization and differential expression analysis on GeoMx RNA data. The application of these methods is demonstrated in an example data analysis vignette.
Last updated 4 months ago
geneexpressiondifferentialexpressionnormalizationopenblascppopenmp
5.51 score 8 stars 9 scripts 222 downloads
cbpManager - Generate, manage, and edit data and metadata files suitable for the import in cBioPortal for Cancer Genomics
This R package provides an R Shiny application that enables the user to generate, manage, and edit data and metadata files suitable for the import in cBioPortal for Cancer Genomics. Create cancer studies and edit their metadata. Upload mutation data of a patient that will be concatenated to the data_mutation_extended.txt file of the study. Create and edit clinical patient data, sample data, and timeline data. Create custom timeline tracks for patients.
Last updated 4 months ago
immunooncologydataimportdatarepresentationguithirdpartyclientpreprocessingvisualizationcancer-genomicscbioportalclinical-datafilegeneratormutation-datapatient-data
5.51 score 8 stars 1 scripts 232 downloads
rprimer - Design Degenerate Oligos from a Multiple DNA Sequence Alignment
Functions, workflow, and a Shiny application for visualizing sequence conservation and designing degenerate primers, probes, and (RT)-(q/d)PCR assays from a multiple DNA sequence alignment. The results can be presented in data frame format and visualized as dashboard-like plots. For more information, please see the package vignette.
Last updated 4 months ago
alignmentddpcrcoveragemultiplesequencealignmentsequencematchingqpcr
5.49 score 4 stars 13 scripts 161 downloads
TDbasedUFE - Tensor Decomposition Based Unsupervised Feature Extraction
This is a comprehensive package for tensor-decomposition-based unsupervised feature extraction. It is applicable to gene expression, DNA methylation, histone modification, and similar data, and it can perform multiomics analysis. It is also potentially applicable to single cell omics data sets.
Last updated 4 months ago
geneexpressionfeatureextractionmethylationarraysinglecellbioinformaticsdna-methylationgene-expression-profileshistone-modificationsmultiomicstensor-decomposition
5.48 score 5 stars 1 dependents 9 scripts 142 downloads
tripr - T-cell Receptor/Immunoglobulin Profiler (TRIP)
TRIP is a software framework that provides analytics services on antigen receptor (B cell receptor immunoglobulin, BcR IG | T cell receptor, TR) gene sequence data. It is a web application written in R Shiny. It takes as input the output files of the IMGT/HighV-Quest tool. Users can select to analyze the data from each of the input samples separately, or the combined data files from all samples and visualize the results accordingly.
Last updated 4 months ago
batcheffectmultiplecomparisongeneexpressionimmunooncologytargetedresequencingbioconductorclonotype
5.48 score 2 stars 4 scripts 179 downloads
GSgalgoR - An Evolutionary Framework for the Identification and Study of Prognostic Gene Expression Signatures in Cancer
A multi-objective optimization algorithm for disease sub-type discovery based on a non-dominated sorting genetic algorithm. The 'Galgo' framework combines the advantages of clustering algorithms for grouping heterogeneous 'omics' data and the searching properties of genetic algorithms for feature selection. The algorithm searches for the optimal number of clusters, considering the features that maximize the survival difference between sub-types while keeping cluster consistency high.
Last updated 4 months ago
geneexpressiontranscriptionclusteringclassificationsurvival
5.48 score 15 stars 6 scripts 206 downloads
metagene2 - A package to produce metagene plots
This package produces metagene plots to compare coverages of sequencing experiments at selected groups of genomic regions. It can be used for such analyses as assessing the binding of DNA-interacting proteins at promoter regions or surveying antisense transcription over the length of a gene. The metagene2 package can manage all aspects of the analysis, from normalization of coverages to plot faceting according to experimental metadata. Bootstrapping analysis is used to provide confidence intervals of per-sample mean coverages.
Last updated 4 months ago
chipseqgeneticsmultiplecomparisoncoveragealignmentsequencing
5.45 score 4 stars 8 scripts 218 downloads
INTACT - Integrate TWAS and Colocalization Analysis for Gene Set Enrichment Analysis
This package integrates colocalization probabilities from colocalization analysis with transcriptome-wide association study (TWAS) scan summary statistics to implicate genes that may be biologically relevant to a complex trait. The probabilistic framework implemented in this package constrains the TWAS scan z-score-based likelihood using a gene-level colocalization probability. Given gene set annotations, this package can estimate gene set enrichment using posterior probabilities from the TWAS-colocalization integration step.
Last updated 4 months ago
bayesiangenesetenrichment
5.44 score 14 stars 13 scripts 144 downloads
TEKRABber - An R package that estimates the correlations of orthologs and transposable elements between two species
TEKRABber is made to provide a user-friendly pipeline for comparing orthologs and transposable elements (TEs) between two species. It considers the orthology confidence between two species from BioMart to normalize expression counts and detect differentially expressed orthologs/TEs. Then it provides one-to-one correlation analysis for desired orthologs and TEs. There is also an app function for a first look at the results. Users can prepare orthologs/TEs RNA-seq expression data according to their own preference to run TEKRABber, following the data structure described in the vignettes.
Last updated 4 months ago
differentialexpressionnormalizationtranscriptiongeneexpressionbioconductorcpp
5.43 score 3 stars 18 scripts 357 downloads
OmicsMLRepoR - Search harmonized metadata created under the OmicsMLRepo project
This package provides functions to browse the harmonized metadata for large omics databases. This package also supports data navigation if the metadata incorporates ontology.
Last updated 7 days ago
softwareinfrastructuredatarepresentation
5.42 score 14 scripts 63 downloads
speckle - Statistical methods for analysing single cell RNA-seq data
The speckle package contains functions for the analysis of single cell RNA-seq data. The speckle package currently contains functions to analyse differences in cell type proportions. There are also functions to estimate the parameters of the Beta distribution based on a given counts matrix, and a function to normalise a counts matrix to the median library size. There are plotting functions to visualise cell type proportions and the mean-variance relationship in cell type proportions and counts. As our research into specialised analyses of single cell data continues we anticipate that the package will be updated with new functions.
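The idea of normalising a counts matrix to the median library size can be sketched as follows. This is a generic Python illustration under stated assumptions, not speckle's own function: the name `normalise_to_median_library_size` is hypothetical, the matrix is assumed to hold genes in rows and cells in columns, and every cell is assumed to have at least one count.

```python
from statistics import median


def normalise_to_median_library_size(counts):
    """Scale each column (cell) so its total equals the median library size.

    counts is a list of rows (genes); each row holds one value per cell.
    Assumes no cell has a library size of zero.
    """
    n_cells = len(counts[0])
    # Library size of a cell = sum of its counts over all genes.
    lib_sizes = [sum(row[c] for row in counts) for c in range(n_cells)]
    target = median(lib_sizes)
    return [
        [row[c] * target / lib_sizes[c] for c in range(n_cells)]
        for row in counts
    ]
```

After scaling, every cell has the same total, so differences in sequencing depth no longer dominate comparisons between cells.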
Last updated 4 months ago
singlecellrnaseqregressiongeneexpression
5.41 score 258 scripts 390 downloads
BERT - High Performance Data Integration for Large-Scale Analyses of Incomplete Omic Profiles Using Batch-Effect Reduction Trees (BERT)
Provides efficient batch-effect adjustment of data with missing values. BERT arranges all batch-effect correction steps into a tree of pairwise computations and allows parallelization over sub-trees.
Last updated 13 days ago
batcheffectpreprocessingexperimentaldesignqualitycontrolbatch-effectbioconductor-packagebioinformaticsdata-integrationdata-science
5.40 score 2 stars 18 scripts 124 downloads
GRaNIE - GRaNIE: Reconstruction cell type specific gene regulatory networks including enhancers using single-cell or bulk chromatin accessibility and RNA-seq data
Genetic variants associated with diseases often affect non-coding regions, thus likely having a regulatory role. To understand the effects of genetic variants in these regulatory regions, identifying genes that are modulated by specific regulatory elements (REs) is crucial. The effect of gene regulatory elements, such as enhancers, is often cell-type specific, likely because the combinations of transcription factors (TFs) that are regulating a given enhancer have cell-type specific activity. This TF activity can be quantified with existing tools such as diffTF and captures differences in binding of a TF in open chromatin regions. Collectively, this forms a gene regulatory network (GRN) with cell-type and data-specific TF-RE and RE-gene links. Here, we reconstruct such a GRN using single-cell or bulk RNAseq and open chromatin (e.g., using ATACseq or ChIPseq for open chromatin marks) and optionally (Capture) Hi-C data. Our network contains different types of links, connecting TFs to regulatory elements, the latter of which is connected to genes in the vicinity or within the same chromatin domain (TAD). We use a statistical framework to assign empirical FDRs and weights to all links using a permutation-based approach.
Last updated 4 months ago
softwaregeneexpressiongeneregulationnetworkinferencegenesetenrichmentbiomedicalinformaticsgeneticstranscriptomicsatacseqrnaseqgraphandnetworkregressiontranscriptionchipseq
5.40 score 24 scripts 266 downloads
ReUseData - Reusable and reproducible Data Management
ReUseData is an _R/Bioconductor_ software tool to provide a systematic and versatile approach for standardized and reproducible data management. ReUseData facilitates transformation of shell or other ad hoc scripts for data preprocessing into workflow-based data recipes. Evaluation of data recipes generates curated data files in their generic formats (e.g., VCF, bed). Both recipes and data are cached using database infrastructure for easy data management and reuse. Prebuilt data recipes are available through the ReUseData portal ("https://rcwl.org/dataRecipes/") with full annotation and user instructions. Pregenerated data are available through the ReUseData cloud bucket and can be downloaded directly through "getCloudData()".
Last updated 4 months ago
softwareinfrastructuredataimportpreprocessingimmunooncology
5.38 score 4 stars 7 scripts 187 downloads
iSEEhub - iSEE for the Bioconductor ExperimentHub
This package defines a custom landing page for an iSEE app interfacing with the Bioconductor ExperimentHub. The landing page allows users to browse the ExperimentHub, select a data set, download and cache it, and import it directly into a Bioconductor iSEE app.
Last updated 4 months ago
dataimportimmunooncology infrastructureshinyappssinglecellsoftwarebioconductorbioconductor-packagehacktoberfestisee
5.38 score 3 stars 4 scripts 357 downloads
iSEEhex - iSEE extension for summarising data points in hexagonal bins
This package provides panels summarising data points in hexagonal bins for `iSEE`. It is part of `iSEEu`, the iSEE universe of panels that extend the `iSEE` package.
Last updated 4 months ago
softwareinfrastructurebioconductoriseeushiny-r
5.38 score 2 dependents 7 scripts 244 downloads
fastreeR - Phylogenetic, Distance and Other Calculations on VCF and Fasta Files
Calculate distances, build phylogenetic trees or perform hierarchical clustering between the samples of a VCF or FASTA file. Functions are implemented in Java and called via rJava. Parallel implementation that operates directly on the VCF or FASTA file for fast execution.
Last updated 4 months ago
phylogeneticsmetagenomicsclusteringopenjdk
5.38 score 3 stars 20 scripts 202 downloads
beer - Bayesian Enrichment Estimation in R
BEER implements a Bayesian model for analyzing phage-immunoprecipitation sequencing (PhIP-seq) data. Given a PhIPData object, BEER returns posterior probabilities of enriched antibody responses, point estimates for the relative fold-change in comparison to negative control samples, and more. Additionally, BEER provides a convenient implementation for using edgeR to identify enriched antibody responses.
Last updated 4 months ago
softwarestatisticalmethodbayesiansequencingcoveragejagscpp
5.38 score 10 stars 12 scripts 185 downloads
scReClassify - scReClassify: post hoc cell type classification of single-cell RNA-seq data
A post hoc cell type classification tool to fine-tune cell type annotations generated by any cell type classification procedure, using the semi-supervised learning technique AdaSampling. The current version of scReClassify supports Support Vector Machine and Random Forest as base classifiers.
Last updated 4 months ago
softwaretranscriptomicssinglecellclassificationsupportvectormachine
5.38 score 10 stars 12 scripts 158 downloads
diffUTR - diffUTR: Streamlining differential exon and 3' UTR usage
The diffUTR package provides a uniform interface and plotting functions for limma/edgeR/DEXSeq-powered differential bin/exon usage. It also includes an improved version of the limma::diffSplice method. Most importantly, diffUTR further extends the application of these frameworks to differential UTR usage analysis using poly-A site databases.
Last updated 4 months ago
geneexpression
5.38 score 6 stars 9 scripts 162 downloads
Qtlizer - Comprehensive QTL annotation of GWAS results
This R package provides access to the Qtlizer web server. Qtlizer annotates lists of common small variants (mainly SNPs) and genes in humans with associated changes in gene expression using the most comprehensive database of published quantitative trait loci (QTLs).
Last updated 4 months ago
genomewideassociationsnpgeneticslinkagedisequilibriumeqtlgwasvariant-annotation
5.38 score 3 stars 2 scripts 176 downloads
NPARC - Non-parametric analysis of response curves for thermal proteome profiling experiments
Perform non-parametric analysis of response curves as described by Childs, Bach, Franken et al. (2019): Non-parametric analysis of thermal proteome profiles reveals novel drug-binding proteins.
Last updated 4 months ago
softwareproteomics
5.36 score 38 scripts 234 downloads
svaNUMT - NUMT detection from structural variant calls
svaNUMT contains functions for detecting NUMT events from structural variant calls. It takes structural variant calls in GRanges of breakend notation and identifies NUMTs by nuclear-mitochondrial breakend junctions. The main function reports candidate NUMTs if there is a pair of valid insertion sites found on the nuclear genome within a certain distance threshold. The candidate NUMTs are reported by events.
Last updated 4 months ago
dataimportsequencingannotationgeneticsvariantannotation
5.35 score 3 stars 6 scripts 162 downloads
NewWave - Negative binomial model for scRNA-seq
A model designed for dimensionality reduction and batch effect removal for scRNA-seq data. It is designed to be massively parallelizable using shared objects that prevent memory duplication, and it can be used with different mini-batch approaches in order to reduce time consumption. It assumes a negative binomial distribution for the data with a dispersion parameter that can be either common across genes or gene-wise.
Last updated 4 months ago
softwaregeneexpressiontranscriptomicssinglecellbatcheffectsequencingcoverageregressionbatch-effectsdimensionality-reductionnegative-binomialscrna-seq
5.33 score 4 stars 27 scripts 228 downloads
sRACIPE - Systems biology tool to simulate gene regulatory circuits
sRACIPE implements a randomization-based method for gene circuit modeling. It allows us to study the effect of both the gene expression noise and the parametric variation on any gene regulatory circuit (GRC) using only its topology, and simulates an ensemble of models with random kinetic parameters at multiple noise levels. Statistical analysis of the generated gene expressions reveals the basin of attraction and stability of various phenotypic states and their changes associated with intrinsic and extrinsic noises. sRACIPE provides a holistic picture to evaluate the effects of both the stochastic nature of cellular processes and the parametric variation.
Last updated 4 months ago
researchfieldsystemsbiologymathematicalbiologygeneexpressiongeneregulationgenetargetcppgenegene-circuit-explorergeneticsraciperegulatory-networkssimulated-annealingsimulationsracipestochasticcpp
5.32 score 211 scripts 170 downloads
hoodscanR - Spatial cellular neighbourhood scanning in R
hoodscanR is a user-friendly R package providing functions to assist cellular neighborhood analysis of any spatial transcriptomics data with single-cell resolution. All functions in the package are built on the SpatialExperiment object, allowing integration into various spatial transcriptomics-related packages from Bioconductor. The package produces cell-level neighborhood annotation output, along with functions to perform neighborhood colocalization analysis and neighborhood-based cell clustering.
Last updated 2 days ago
spatialtranscriptomicssinglecellclusteringcpp
5.32 score 4 stars 15 scripts 157 downloads
VDJdive - Analysis Tools for 10X V(D)J Data
This package provides functions for handling and analyzing immune receptor repertoire data, such as produced by the CellRanger V(D)J pipeline. This includes reading the data into R, merging it with paired single-cell data, quantifying clonotype abundances, calculating diversity metrics, and producing common plots. It implements the E-M Algorithm for clonotype assignment, along with other methods, which makes use of ambiguous cells for improved quantification.
Last updated 4 months ago
softwareimmunooncologysinglecellannotationrnaseqtargetedresequencingcpp
5.32 score 7 stars 1 scripts 174 downloads
GEOexplorer - GEOexplorer: a webserver for gene expression analysis and visualisation
GEOexplorer is a webserver and R/Bioconductor package that enables users to perform gene expression analysis. The development of GEOexplorer was made possible because of the excellent code provided by GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/).
Last updated 4 months ago
softwaregeneexpressionmrnamicroarraydifferentialexpressionmicroarraymicrornaarraytranscriptomicsrnaseq
5.32 score 5 stars 14 scripts 152 downloads
fedup - Fisher's Test for Enrichment and Depletion of User-Defined Pathways
An R package that tests for enrichment and depletion of user-defined pathways using Fisher's exact test. The method is designed for versatile pathway annotation formats (e.g. gmt, txt, xlsx) to allow the user to run pathway analysis on custom annotations. This package is also integrated with Cytoscape to provide network-based pathway visualization that enhances the interpretability of the results.
Last updated 4 months ago
genesetenrichmentpathwaysnetworkenrichmentnetworkbioconductorenrichment
5.32 score 7 stars 10 scripts 186 downloads
katdetectr - Detection, Characterization and Visualization of Kataegis in Sequencing Data
Kataegis refers to the occurrence of regional hypermutation and is a phenomenon observed in a wide range of malignancies. Using changepoint detection, katdetectr aims to identify putative kataegis foci from common data formats housing genomic variants. Katdetectr has been shown to be a robust package for the detection, characterization and visualization of kataegis.
Last updated 4 months ago
wholegenomesoftwaresnpsequencingclassificationvariantannotation
5.30 score 5 stars 4 scripts 196 downloads
DifferentialRegulation - Differentially regulated genes from scRNA-seq data
DifferentialRegulation is a method for detecting differentially regulated genes between two groups of samples (e.g., healthy vs. disease, or treated vs. untreated samples), by targeting differences in the balance of spliced and unspliced mRNA abundances, obtained from single-cell RNA-sequencing (scRNA-seq) data. From a mathematical point of view, DifferentialRegulation accounts for the sample-to-sample variability, and embeds multiple samples in a Bayesian hierarchical model. Furthermore, our method also deals with two major sources of mapping uncertainty: i) 'ambiguous' reads, compatible with both spliced and unspliced versions of a gene, and ii) reads mapping to multiple genes. In particular, ambiguous reads are treated separately from spliced and unspliced reads, while reads that are compatible with multiple genes are allocated to the gene of origin. Parameters are inferred via Markov chain Monte Carlo (MCMC) techniques (Metropolis-within-Gibbs).
Last updated 4 months ago
differentialsplicingbayesiangeneticsrnaseqsequencingdifferentialexpressiongeneexpressionmultiplecomparisonsoftwaretranscriptionstatisticalmethodvisualizationsinglecellgenetargetopenblascpp
5.30 score 10 stars 4 scripts 193 downloads
RnaSeqSampleSize - RnaSeqSampleSize
The RnaSeqSampleSize package provides a sample size calculation method based on the negative binomial model and the exact test for assessing differential expression analysis of RNA-seq data. It controls FDR for multiple testing and utilizes the average read count and dispersion distributions from real data to estimate a more reliable sample size. It is also equipped with several unique features, including estimation for genes or pathways of interest, power curve visualization, and parameter optimization.
Last updated 4 months ago
immunooncologyexperimentaldesignsequencingrnaseqgeneexpressiondifferentialexpressioncpp
5.30 score 20 scripts 280 downloads
scCB2 - CB2 improves power of cell detection in droplet-based single-cell RNA sequencing data
scCB2 is an R package implementing CB2 for distinguishing real cells from empty droplets in droplet-based single cell RNA-seq experiments (especially for 10x Chromium). It is based on clustering similar barcodes and calculating a Monte Carlo p-value for each cluster to test against the background distribution. This cluster-level test outperforms single-barcode-level tests in dealing with low count barcodes and homogeneous sequencing libraries, while keeping FDR well controlled.
Last updated 4 months ago
dataimportrnaseqsinglecellsequencinggeneexpressiontranscriptomicspreprocessingclustering
5.30 score 10 stars 5 scripts 191 downloads
DEWSeq - Differential Expressed Windows Based on Negative Binomial Distribution
DEWSeq is a sliding window approach for the analysis of differentially enriched binding regions in eCLIP or iCLIP next generation sequencing data.
Last updated 4 months ago
sequencinggeneregulationfunctionalgenomicsdifferentialexpressionbioinformaticseclipngs-analysis
5.30 score 5 stars 4 scripts 190 downloads
preciseTAD - preciseTAD: A machine learning framework for precise TAD boundary prediction
preciseTAD provides functions to predict the location of boundaries of topologically associated domains (TADs) and chromatin loops at base-level resolution. As an input, it takes BED-formatted genomic coordinates of domain boundaries detected from low-resolution Hi-C data, and coordinates of high-resolution genomic annotations from ENCODE or other consortia. preciseTAD employs several feature engineering strategies and resampling techniques to address class imbalance, and trains an optimized random forest model for predicting low-resolution domain boundaries. Translated on a base-level, preciseTAD predicts the probability for each base to be a boundary. Density-based clustering and scalable partitioning techniques are used to detect precise boundary regions and summit points. Compared with low-resolution boundaries, preciseTAD boundaries are highly enriched for CTCF, RAD21, SMC3, and ZNF143 signal and more conserved across cell lines. The pre-trained model can accurately predict boundaries in another cell line using CTCF, RAD21, SMC3, and ZNF143 annotation data for this cell line.
Last updated 4 months ago
softwarehicsequencingclusteringclassificationfunctionalgenomicsfeatureextraction
5.29 score 7 stars 14 scripts 202 downloads
DeMixT - Cell type-specific deconvolution of heterogeneous tumor samples with two or three components using expression data from RNAseq or microarray platforms
DeMixT is a software package that performs deconvolution on transcriptome data from a mixture of two or three components.
Last updated 4 months ago
softwarestatisticalmethodclassificationgeneexpressionsequencingmicroarraytissuemicroarraycoveragecppopenmp
5.27 score 25 scripts 218 downloads
structToolbox - Data processing & analysis tools for Metabolomics and other omics
An extensive set of data (pre-)processing and analysis methods and tools for metabolomics and other omics, with a strong emphasis on statistics and machine learning. This toolbox allows the user to build extensive and standardised workflows for data analysis. The methods and tools have been implemented using class-based templates provided by the struct (Statistics in R Using Class-based Templates) package. The toolbox includes pre-processing methods (e.g. signal drift and batch correction, normalisation, missing value imputation and scaling), univariate (e.g. ttest, various forms of ANOVA, Kruskal–Wallis test and more) and multivariate statistical methods (e.g. PCA and PLS, including cross-validation and permutation testing) as well as machine learning methods (e.g. Support Vector Machines). The STATistics Ontology (STATO) has been integrated and implemented to provide standardised definitions for the different methods, inputs and outputs.
Last updated 14 days ago
workflowstepmetabolomicsbioconductor-packagedimslc-msmachine-learningmultivariate-analysisstatisticsunivariate
5.26 score 10 stars 12 scripts 242 downloads
epistack - Heatmaps of Stack Profiles from Epigenetic Signals
The main objective of the epistack package is the visualization of stacks of genomic tracks (such as, but not restricted to, ChIP-seq, ATAC-seq, DNA methylation or genomic conservation data) centered at genomic regions of interest. epistack needs three different inputs: 1) a genomic score object, such as ChIP-seq coverage or DNA methylation values, provided as a `GRanges` (easily obtained from `bigwig` or `bam` files). 2) a list of features of interest, such as peaks or transcription start sites, provided as a `GRanges` (easily obtained from `gtf` or `bed` files). 3) a score to sort the features, such as peak height or gene expression value.
Last updated 4 months ago
rnaseqpreprocessingchipseqgeneexpressioncoveragebioinformatics
5.26 score 6 stars 5 scripts 157 downloads
FindIT2 - find influential TF and Target based on multi-omics data
This package implements functions to find influential TFs and targets based on different input types. It has five modules: multi-peak multi-gene annotation (mmPeakAnno module), calculation of regulation potential (calcRP module), finding influential targets based on ChIP-Seq and RNA-Seq data (Find influential Target module), finding influential TFs based on different inputs (Find influential TF module), and calculation of peak-gene or peak-peak correlation (peakGeneCor module). There are also other useful functions, for example to integrate information from different sources or to calculate Jaccard similarity for your TF.
Last updated 4 months ago
softwareannotationchipseqatacseqgeneregulationmultiplecomparisongenetarget
5.26 score 6 stars 7 scripts 142 downloads
ReactomeGraph4R - Interface for the Reactome Graph Database
Pathways, reactions, and biological entities in Reactome knowledge are systematically represented as an ordered network. Instances are represented as nodes and relationships between instances as edges; they are all stored in the Reactome Graph Database. This package serves as an interface to query the interconnected data from a local Neo4j database, with the aim of minimizing the usage of Neo4j Cypher queries.
Last updated 4 months ago
dataimportpathwaysreactomenetworkgraphandnetwork
5.26 score 6 stars 6 scripts 115 downloads
periodicDNA - Set of tools to identify periodic occurrences of k-mers in DNA sequences
This R package helps the user identify k-mers (e.g. di- or tri-nucleotides) present periodically in a set of genomic loci (typically regulatory elements). The functions of this package provide a straightforward approach to find periodic occurrences of k-mers in DNA sequences, such as regulatory elements. It is not aimed at identifying motifs separated by a conserved distance; for this type of analysis, please visit the MEME website.
Last updated 4 months ago
sequencematchingmotifdiscoverymotifannotationsequencingcoveragealignmentdataimport
5.26 score 6 stars 5 scripts 197 downloads
EnMCB - Predicting Disease Progression Based on Methylation Correlated Blocks using Ensemble Models
Creation of the correlated blocks using DNA methylation profiles. Machine learning models can be constructed to predict differentially methylated blocks and disease progression.
Last updated 4 months ago
normalizationdnamethylationmethylationarraysupportvectormachine
5.26 score 9 stars 2 scripts 234 downloads
epimutacions - Robust outlier identification for DNA methylation data
The package includes several statistical outlier detection methods for detecting epimutations in DNA methylation data. The methods included in the package are MANOVA, multivariate linear models, isolation forest, robust Mahalanobis distance, quantile and beta. The methods compare a case sample with a suspected disease against a reference panel (composed of healthy individuals) to identify epimutations in the given case sample. It also contains functions to annotate and visualize the identified epimutations.
Last updated 4 months ago
dnamethylationbiologicalquestionpreprocessingstatisticalmethodnormalizationcpp
5.23 score 28 scripts 168 downloads
ProteoDisco - Generation of customized protein variant databases from genomic variants, splice-junctions and manual sequences
ProteoDisco is an R package to facilitate proteogenomics studies. It houses functions to create customized (variant) protein databases based on user-submitted genomic variants, splice-junctions, fusion genes and manual transcript sequences. The flexible workflow can be adopted to suit a myriad of research and experimental settings.
Last updated 4 months ago
softwareproteomicsrnaseqsnpsequencingvariantannotationdataimport
5.20 score 4 stars 4 scripts 142 downloads
methylscaper - Visualization of Methylation Data
methylscaper is an R package for processing and visualizing data jointly profiling methylation and chromatin accessibility (MAPit, NOMe-seq, scNMT-seq, nanoNOMe, etc.). The package supports both single-cell and single-molecule data, and a common interface for jointly visualizing both data types through the generation of ordered representational methylation-state matrices. The Shiny app allows for an interactive seriation process of refinement and re-weighting that optimally orders the cells or DNA molecules to discover methylation patterns and nucleosome positioning.
Last updated 4 months ago
dnamethylationepigeneticssequencingvisualizationsinglecellnucleosomepositioning
5.20 score 2 stars 3 scripts 250 downloads
SplicingFactory - Splicing Diversity Analysis for Transcriptome Data
The SplicingFactory R package uses transcript-level expression values to analyze splicing diversity based on various statistical measures, like Shannon entropy or the Gini index. These measures can quantify transcript isoform diversity within samples or between conditions. Additionally, the package analyzes the isoform diversity data, looking for significant changes between conditions.
Last updated 4 months ago
transcriptomicsrnaseqdifferentialsplicingalternativesplicingtranscriptomevariantgini-indexrna-seqshannon-entropysimpson-indexsplicing
5.20 score 4 stars 1 scripts 172 downloads
MSPrep - Package for Summarizing, Filtering, Imputing, and Normalizing Metabolomics Data
Package performs summarization of replicates, filtering by frequency, several different options for imputing missing data, and a variety of options for transforming, batch correcting, and normalizing data.
Last updated 4 months ago
metabolomicsmassspectrometrypreprocessing
5.20 score 10 stars 4 scripts 362 downloads
DegNorm - DegNorm: degradation normalization for RNA-seq data
This package performs degradation normalization in bulk RNA-seq data to improve differential expression analysis accuracy.
Last updated 4 months ago
rnaseqnormalizationgeneexpressionalignmentcoveragedifferentialexpressionbatcheffectsoftwaresequencingimmunooncologyqualitycontroldataimportopenblascppopenmp
5.20 score 1 stars 3 scripts 193 downloads
RNAAgeCalc - A multi-tissue transcriptional age calculator
It has been shown that both DNA methylation and RNA transcription are linked to chronological age and age-related diseases. Several estimators have been developed to predict human aging at the DNA level and the RNA level. Most human transcriptional age predictors are based on microarray data and limited to only a few tissues. To date, transcriptional studies on aging using RNASeq data from different human tissues are limited. The aim of this package is to provide a tool for across-tissue and tissue-specific transcriptional age calculation based on GTEx RNASeq data.
Last updated 4 months ago
rnaseqgeneexpressionbiological-ageelastic-netgene-expressiongenotype-tissue-expressionpredictionregularized-regressionrna-seq
5.20 score 8 stars 10 scripts 180 downloads
scGPS - A complete analysis of single cell subpopulations, from identifying subpopulations to analysing their relationship (scGPS = single cell Global Predictions of Subpopulation)
The package implements two main algorithms to answer two key questions: a SCORE (Stable Clustering at Optimal REsolution) to find subpopulations, followed by scGPS to investigate the relationships between subpopulations.
Last updated 4 months ago
singlecellclusteringdataimportsequencingcoverageopenblascpp
5.20 score 4 stars 7 scripts 256 downloads
scRecover - scRecover for imputation of single-cell RNA-seq data
scRecover is an R package for imputation of single-cell RNA-seq (scRNA-seq) data. It detects and imputes dropout values in a scRNA-seq raw read counts matrix while keeping the real zeros unchanged, since there are both dropout zeros and real zeros in scRNA-seq data. In combination with scImpute, SAVER and MAGIC, scRecover not only detects dropout and real zeros at higher accuracy, but also improves the downstream clustering and visualization results.
Last updated 4 months ago
geneexpressionsinglecellrnaseqtranscriptomicssequencingpreprocessingsoftware
5.20 score 8 stars 9 scripts 196 downloads
sitePath - Phylogeny-based sequence clustering with site polymorphism
Using site polymorphism is one of the ways to cluster DNA/protein sequences but it is possible for the sequences with the same polymorphism on a single site to be genetically distant. This package is aimed at clustering sequences using site polymorphism and their corresponding phylogenetic trees. By considering their location on the tree, only the structurally adjacent sequences will be clustered. However, the adjacent sequences may not necessarily have the same polymorphism. So a branch-and-bound like algorithm is used to minimize the entropy representing the purity of site polymorphism of each cluster.
Last updated 4 months ago
alignmentmultiplesequencealignmentphylogeneticssnpsoftwaremutationcpp
5.20 score 8 stars 9 scripts 216 downloads
DESpace - DESpace: a framework to discover spatially variable genes
Intuitive framework for identifying spatially variable genes (SVGs) via edgeR, a popular method for performing differential expression analyses. Based on pre-annotated spatial clusters as summarized spatial information, DESpace models gene expression using a negative binomial (NB) model, via edgeR, with spatial clusters as covariates. SVGs are then identified by testing the significance of spatial clusters. The method is flexible and robust, and is faster than most SV methods. Furthermore, to the best of our knowledge, it is the only SV approach that allows: (i) performing an SV test on each individual spatial cluster, hence identifying the key regions of the tissue affected by spatial variability; and (ii) jointly fitting multiple samples, targeting genes with consistent spatial patterns across replicates.
Last updated 4 months ago
spatialsinglecellrnaseqtranscriptomicsgeneexpressionsequencingdifferentialexpressionstatisticalmethodvisualization
5.19 score 4 stars 13 scripts 146 downloads
snapcount - R/Bioconductor Package for interfacing with Snaptron for rapid querying of expression counts
snapcount is a client interface to the Snaptron webservices which support querying by gene name or genomic region. Results include raw expression counts derived from alignment of RNA-seq samples and/or various summarized measures of expression across one or more regions/genes per-sample (e.g. percent spliced in).
Last updated 4 months ago
coveragegeneexpressionrnaseqsequencingsoftwaredataimport
5.19 score 3 stars 13 scripts 217 downloads
methylCC - Estimate the cell composition of whole blood in DNA methylation samples
A tool to estimate the cell composition of DNA methylation whole blood samples measured on any platform technology (microarray and sequencing).
Last updated 4 months ago
microarraysequencingdnamethylationmethylationarraymethylseqwholegenome
5.18 score 19 stars 8 scripts 218 downloads
smoppix - Analyze Single Molecule Spatial Omics Data Using the Probabilistic Index
Test for univariate and bivariate spatial patterns in spatial omics data with single-molecule resolution. The tests implemented allow for analysis of nested designs and are automatically calibrated to different biological specimens. Tests for aggregation, colocalization, gradients and vicinity to cell edge or centroid are provided.
Last updated 5 days ago
transcriptomicsspatialsinglecellcpp
5.18 score 1 stars 4 scripts 44 downloads
coMET - coMET: visualisation of regional epigenome-wide association scan (EWAS) results and DNA co-methylation patterns
Visualisation of EWAS results in a genomic region. In addition to phenotype-association P-values, coMET also generates plots of co-methylation patterns and provides a series of annotation tracks. It can be applied to other omic-wide association scans, as long as the data can be translated to the genomic level, and to any species.
Last updated 4 months ago
softwaredifferentialmethylationvisualizationsequencinggeneticsfunctionalgenomicsmicroarraymethylationarraymethylseqchipseqdnaseqriboseqrnaseqexomeseqdnamethylationgenomewideassociationmotifannotation
5.18 score 15 scripts 200 downloads
spaSim - Spatial point data simulator for tissue images
A suite of functions for simulating spatial patterns of cells in tissue images. Output images are multitype point data in SingleCellExperiment format. Each point represents a cell, with its 2D locations and cell type. Potential cell patterns include background cells, tumour/immune cell clusters, immune rings, and blood/lymphatic vessels.
Last updated 4 months ago
statisticalmethodspatialbiomedicalinformatics
5.18 score 2 stars 25 scripts 173 downloads
RolDE - RolDE: Robust longitudinal Differential Expression
RolDE detects longitudinal differential expression between two conditions in noisy high-throughput data. It is suitable even for data with a moderate amount of missing values. RolDE is a composite method, consisting of three independent modules with different approaches to detecting longitudinal differential expression. The combination of these diverse modules allows RolDE to robustly detect varying differences in longitudinal trends and expression levels in diverse data types and experimental settings.
Last updated 4 months ago
statisticalmethodsoftwaretimecourseregressionproteomicsdifferentialexpression
5.18 score 5 stars 1 scripts 178 downloadsPhIPData - Container for PhIP-Seq Experiments
PhIPData defines an S4 class for phage-immunoprecipitation sequencing (PhIP-seq) experiments. Building upon the RangedSummarizedExperiment class, PhIPData enables users to coordinate metadata with experimental data in analyses. Additionally, PhIPData provides specialized methods to subset and identify beads-only samples, subset objects using virus aliases, and use existing peptide libraries to populate object parameters.
Last updated 4 months ago
infrastructuredatarepresentationsequencingcoverage
5.18 score 5 stars 1 dependents 6 scripts 197 downloadsCNVfilteR - Identifies false positives of CNV calling tools by using SNV calls
CNVfilteR identifies those CNVs that can be discarded by using the single nucleotide variant (SNV) calls that are usually obtained in common NGS pipelines.
Last updated 4 months ago
copynumbervariationsequencingdnaseqvisualizationdataimport
5.18 score 5 stars 1 scripts 218 downloadsgDNAx - Diagnostics for assessing genomic DNA contamination in RNA-seq data
Provides diagnostics for assessing genomic DNA contamination in RNA-seq data, as well as plots representing these diagnostics. Moreover, the package can be used to get an insight into the strand library protocol used and, in case of strand-specific libraries, the strandedness of the data. Furthermore, it provides functionality to filter out reads of potential gDNA origin.
Last updated 10 days ago
transcriptiontranscriptomicsrnaseqsequencingpreprocessingsoftwaregeneexpressioncoveragedifferentialexpressionfunctionalgenomicssplicedalignmentalignment
5.15 score 1 stars 3 scripts 189 downloadslineagespot - Detection of SARS-CoV-2 lineages in wastewater samples using next-generation sequencing
Lineagespot is a framework written in R that aims to identify SARS-CoV-2 related mutations based on a single variant file or a list of variant files (i.e., variant call format). The method can facilitate the detection of SARS-CoV-2 lineages in wastewater samples using next generation sequencing, and attempts to infer the potential distribution of the SARS-CoV-2 lineages.
Last updated 4 months ago
variantdetectionvariantannotationsequencing
5.15 score 2 stars 4 scripts 180 downloadsptairMS - Pre-processing PTR-TOF-MS Data
This package implements a suite of methods to preprocess data from PTR-TOF-MS instruments (HDF5 format) and generates the 'sample by features' table of peak intensities in addition to the sample and feature metadata (as a single ExpressionSet object for subsequent statistical analysis). This package also provides useful tools for cohort management, such as progressive data analysis, visualization tools and quality control. The steps include calibration, expiration detection, peak detection and quantification, feature alignment, missing value imputation and feature annotation. Applications to exhaled air and cell culture in headspace are described in the vignettes and examples. This package was used for data analysis of the Gassin Delyle study on adults undergoing invasive mechanical ventilation in the intensive care unit due to severe COVID-19 or non-COVID-19 acute respiratory distress syndrome (ARDS), and made it possible to identify four potential biomarkers of the infection.
Last updated 4 months ago
softwaremassspectrometrypreprocessingmetabolomicspeakdetectionalignmentcpp
5.15 score 7 stars 3 scripts 140 downloadsmsImpute - Imputation of label-free mass spectrometry peptides
MsImpute is a package for imputation of peptide intensity in proteomics experiments. It additionally contains tools for MAR/MNAR diagnosis and assessment of distortions to the probability distribution of the data post imputation. The missing values are imputed by low-rank approximation of the underlying data matrix if they are MAR (method = "v2"), by a Barycenter approach if missingness is MNAR ("v2-mnar"), or by Peptide Identity Propagation (PIP).
Last updated 4 months ago
massspectrometryproteomicssoftwarelabel-free-proteomicslow-rank-approximation
5.15 score 14 stars 7 scripts 396 downloadsMouseFM - In-silico methods for genetic finemapping in inbred mice
This package provides methods for genetic finemapping in inbred mice by taking advantage of their very high homozygosity rate (>95%).
Last updated 4 months ago
geneticssnpgenetargetvariantannotationgenomicvariationmultiplecomparisonsystemsbiologymathematicalbiologypatternlogicgenepredictionbiomedicalinformaticsfunctionalgenomicsfinemapgene-candidatesinbred-miceinbred-strainsmouseqtlqtl-mapping
5.13 score 5 scripts 359 downloadsSGCP - SGCP: A semi-supervised pipeline for gene clustering using self-training approach in gene co-expression networks
SGC is a semi-supervised pipeline for gene clustering in gene co-expression networks. SGC consists of multiple novel steps that enable the computation of highly enriched modules in an unsupervised manner. But unlike all existing frameworks, it further incorporates a novel step that leverages Gene Ontology information in a semi-supervised clustering method that further improves the quality of the computed modules.
Last updated 4 months ago
geneexpressiongenesetenrichmentnetworkenrichmentsystemsbiologyclassificationclusteringdimensionreductiongraphandnetworkneuralnetworknetworkmrnamicroarrayrnaseqvisualizationbioinformaticsgenecoexpressionnetworkgraphsnetworkclusteringnetworksself-trainingsemi-supervised-learningunsupervised-learning
5.12 score 2 stars 44 scripts 245 downloadsSplineDV - Differential Variability (DV) analysis for single-cell RNA sequencing data. (e.g. Identify Differentially Variable Genes across two experimental conditions)
A spline-based scRNA-seq method for identifying differentially variable (DV) genes across two experimental conditions. Spline-DV constructs a 3D spline from 3 key gene statistics: mean expression, coefficient of variation (CV), and dropout rate. This is done for both conditions. The 3D spline provides the “expected” behavior of genes in each condition. The distance of the observed mean, CV and dropout rate of each gene from the expected 3D spline is used to measure variability. As the final step, the spline-DV method compares the variabilities of each condition to identify differentially variable (DV) genes.
Last updated 3 days ago
softwaresinglecellsequencingdifferentialexpressionrnaseqgeneexpressiontranscriptomicsfeatureextraction
5.08 score 2 stars 3 scripts 24 downloadscrisprVerse - Easily install and load the crisprVerse ecosystem for CRISPR gRNA design
The crisprVerse is a modular ecosystem of R packages developed for the design and manipulation of CRISPR guide RNAs (gRNAs). All packages share a common language and design principles. This package is designed to make it easy to install and load the crisprVerse packages in a single step. To learn more about the crisprVerse, visit <https://www.github.com/crisprVerse>.
Last updated 4 months ago
crisprfunctionalgenomicsgenetargetcrispr-analysiscrispr-designcrispr-targetgrnagrna-sequencegrna-sequences
5.08 score 12 stars 8 scripts 226 downloadsCNVMetrics - Copy Number Variant Metrics
The CNVMetrics package calculates similarity metrics to facilitate copy number variant comparison among samples and/or methods. Similarity metrics can be employed to compare CNV profiles of genetically unrelated samples as well as those with a common genetic background. Some metrics are based on the shared amplified/deleted regions while other metrics rely on the level of amplification/deletion. The data type used as input is a plain text file containing the genomic position of the copy number variations, as well as the status and/or the log2 ratio values. Finally, a visualization tool is provided to explore resulting metrics.
Last updated 4 months ago
biologicalquestionsoftwarecopynumbervariationcnvcopy-number-variationmetricsr-language
5.08 score 4 stars 8 scripts 184 downloadsBiocSet - Representing Different Biological Sets
BiocSet displays different biological sets in a triple tibble format. These three tibbles are `element`, `set`, and `elementset`. The user has the ability to activate one of these three tibbles to perform common functions from the dplyr package. Mapping functionality and accessing web references for elements/sets are also available in BiocSet.
Last updated 4 months ago
geneexpressiongokeggsoftware
5.06 score 4 dependents 32 scripts 656 downloadsspqn - Spatial quantile normalization
The spqn package implements spatial quantile normalization (SpQN). This method was developed to remove a mean-correlation relationship in correlation matrices built from gene expression data. It can serve as a pre-processing step prior to a co-expression analysis.
Last updated 4 months ago
networkinferencegraphandnetworknormalization
5.04 score 5 stars 22 scripts 186 downloadsSANTA - Spatial Analysis of Network Associations
This package provides methods for measuring the strength of association between a network and a phenotype. It does this by measuring clustering of the phenotype across the network (Knet). Vertices can also be individually ranked by their strength of association with high-weight vertices (Knode).
Last updated 4 months ago
networknetworkenrichmentclustering
5.02 score 6 scripts 220 downloadsMAI - Mechanism-Aware Imputation
A two-step approach to imputing missing data in metabolomics. Step 1 uses a random forest classifier to classify missing values as either Missing Completely at Random/Missing At Random (MCAR/MAR) or Missing Not At Random (MNAR). MCAR/MAR are combined because it is often difficult to distinguish these two missing types in metabolomics data. Step 2 imputes the missing values based on the classified missing mechanisms, using the appropriate imputation algorithms. Imputation algorithms tested and available for MCAR/MAR include Bayesian Principal Component Analysis (BPCA), Multiple Imputation No-Skip K-Nearest Neighbors (Multi_nsKNN), and Random Forest. Imputation algorithms tested and available for MNAR include nsKNN and a single imputation approach for imputation of metabolites where left-censoring is present.
Last updated 4 months ago
softwaremetabolomicsstatisticalmethodclassificationimputation-methodsmachine-learningmissing-data
5.00 score 2 stars 6 scripts 202 downloadsshinyepico - ShinyÉPICo
ShinyÉPICo is a graphical pipeline to analyze Illumina DNA methylation arrays (450k or EPIC). It allows users to calculate differentially methylated positions and differentially methylated regions in a user-friendly interface. Moreover, it includes several options to export the results and obtain files to perform downstream analysis.
Last updated 4 months ago
differentialmethylationdnamethylationmicroarraypreprocessingqualitycontrol
5.00 score 5 stars 1 scripts 177 downloadsVariantExperiment - A RangedSummarizedExperiment Container for VCF/GDS Data with GDS Backend
VariantExperiment is a Bioconductor package for saving data in VCF/GDS format into a RangedSummarizedExperiment object. The high-throughput genetic/genomic data are saved in GDSArray objects. The annotation data for features/samples are saved in DelayedDataFrame format with a mono-dimensional GDSArray in each column. The on-disk representation of both assay data and annotation data enables on-disk reading and processing and saves memory space significantly. The interface of the RangedSummarizedExperiment data format enables easy and common manipulations of high-throughput genetic/genomic data with the common SummarizedExperiment metaphor in R and Bioconductor.
Last updated 4 months ago
infrastructuredatarepresentationsequencingannotationgenomeannotationgenotypingarray
5.00 score 1 stars 2 scripts 190 downloadsfmrs - Variable Selection in Finite Mixture of AFT Regression and FMR Models
The package obtains parameter estimates, i.e., maximum likelihood estimators (MLE), via the Expectation-Maximization (EM) algorithm for the Finite Mixture of Regression (FMR) models with Normal distribution, and MLE for the Finite Mixture of Accelerated Failure Time Regression (FMAFTR) subject to right censoring with Log-Normal and Weibull distributions via the EM algorithm and the Newton-Raphson algorithm (for Weibull distribution). More importantly, the package obtains the maximum penalized likelihood estimators (MPLE) for both FMR and FMAFTR models (collectively called FMRs). A component-wise tuning parameter selection based on a component-wise BIC is implemented in the package. Furthermore, this package provides Ridge Regression and Elastic Net.
Last updated 4 months ago
survivalregressiondimensionreduction
5.00 score 3 stars 1 dependents 55 scripts 244 downloadsspatialFDA - A Tool for Spatial Multi-sample Comparisons
spatialFDA is a package to calculate spatial statistics metrics. The package takes a SpatialExperiment object and calculates spatial statistics metrics using the package spatstat. Then it compares the resulting functions across samples/conditions using functional additive models as implemented in the package refund. Furthermore, it provides exploratory visualisations using functional principal component analysis, also implemented in refund.
Last updated 14 hours ago
softwarespatialtranscriptomics
4.95 score 2 stars 6 scripts 41 downloadstadar - Transcriptome Analysis of Differential Allelic Representation
This package provides functions to standardise the analysis of Differential Allelic Representation (DAR). DAR compromises the integrity of Differential Expression analysis results as it can bias expression, influencing the classification of genes (or transcripts) as being differentially expressed. DAR analysis results in an easy-to-interpret value between 0 and 1 for each genetic feature of interest, where 0 represents identical allelic representation and 1 represents complete diversity. This metric can be used to identify features prone to false-positive calls in Differential Expression analysis, and can be leveraged with statistical methods to alleviate the impact of such artefacts on RNA-seq data.
Last updated 15 days ago
sequencingrnaseqsnpgenomicvariationvariantannotationdifferentialexpression
4.95 score 1 stars 4 scripts 166 downloadsalabaster.string - Save and Load Biostrings to/from File
Save Biostrings objects to file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated 4 months ago
dataimportdatarepresentation
4.95 score 2 dependents 5 scripts 220 downloadsHPiP - Host-Pathogen Interaction Prediction
HPiP (Host-Pathogen Interaction Prediction) uses an ensemble learning algorithm for prediction of host-pathogen protein-protein interactions (HP-PPIs) using structural and physicochemical descriptors computed from the amino acid composition of host and pathogen proteins. The proposed package can effectively address data shortages and data unavailability for HP-PPI network reconstructions. Moreover, establishing computational frameworks in that regard will reveal mechanistic insights into infectious diseases and suggest potential HP-PPI targets, thus narrowing down the range of possible candidates for subsequent wet-lab experimental validations.
Last updated 4 months ago
proteomicssystemsbiologynetworkinferencestructuralpredictiongenepredictionnetwork
4.95 score 3 stars 6 scripts 179 downloadsawst - Asymmetric Within-Sample Transformation
We propose an Asymmetric Within-Sample Transformation (AWST) to regularize RNA-seq read counts and reduce the effect of noise on the classification of samples. AWST comprises two main steps: standardization and smoothing. These steps transform gene expression data to reduce the noise of the lowly expressed features, which suffer from background effects and low signal-to-noise ratio, and the influence of the highly expressed features, which may be the result of amplification bias and other experimental artifacts.
Last updated 4 months ago
normalizationgeneexpressionrnaseqsoftwaretranscriptomicssequencingsinglecell
4.95 score 3 stars 15 scripts 304 downloadsInterCellar - InterCellar: an R-Shiny app for interactive analysis and exploration of cell-cell communication in single-cell transcriptomics
InterCellar is implemented as an R/Bioconductor Package containing a Shiny app that allows users to interactively analyze cell-cell communication from scRNA-seq data. Starting from precomputed ligand-receptor interactions, InterCellar provides filtering options, annotations and multiple visualizations to explore clusters, genes and functions. Finally, based on functional annotation from Gene Ontology and pathway databases, InterCellar implements data-driven analyses to investigate cell-cell communication in one or multiple conditions.
Last updated 4 months ago
softwaresinglecellvisualizationgotranscriptomics
4.95 score 9 stars 7 scripts 212 downloadsttgsea - Tokenizing Text of Gene Set Enrichment Analysis
Functional enrichment analysis methods such as gene set enrichment analysis (GSEA) have been widely used for analyzing gene expression data. GSEA is a powerful method to infer results of gene expression data at the level of gene sets by calculating enrichment scores for predefined sets of genes. GSEA depends on the availability and accuracy of gene sets. There are overlaps between terms of gene sets or categories because multiple terms may exist for a single biological process, and this can lead to redundancy within enriched terms. In other words, the sets of related terms are overlapping. Using deep learning, this package aims to predict enrichment scores for unique tokens or words from text in names of gene sets to resolve this overlapping set issue. Furthermore, we can coin a new term by combining tokens and find its enrichment score by predicting for such combined tokens.
Last updated 4 months ago
softwaregeneexpressiongenesetenrichment
4.95 score 3 dependents 3 scripts 202 downloadsaggregateBioVar - Differential Gene Expression Analysis for Multi-subject scRNA-seq
For single cell RNA-seq data collected from more than one subject (e.g. biological sample or technical replicates), this package contains tools to summarize single cell gene expression profiles at the level of subject. A SingleCellExperiment object is taken as input and converted to a list of SummarizedExperiment objects, where each list element corresponds to an assigned cell type. The SummarizedExperiment objects contain aggregate gene-by-subject count matrices and inter-subject column metadata for individual subjects that can be processed using downstream bulk RNA-seq tools.
Last updated 4 months ago
softwaresinglecellrnaseqtranscriptomicstranscriptiongeneexpressiondifferentialexpression
4.95 score 5 stars 18 scripts 382 downloadssnifter - R wrapper for the python openTSNE library
Provides an R wrapper for the implementation of FI-tSNE from the python package openTSNE. See Poličar et al. (2019) <doi:10.1101/731877> and the algorithm described by Linderman et al. (2018) <doi:10.1038/s41592-018-0308-4>.
Last updated 4 months ago
dimensionreductionvisualizationsoftwaresinglecellsequencing
4.95 score 3 stars 3 scripts 311 downloadspadma - Individualized Multi-Omic Pathway Deviation Scores Using Multiple Factor Analysis
Use multiple factor analysis to calculate individualized pathway-centric scores of deviation with respect to the sampled population based on multi-omic assays (e.g., RNA-seq, copy number alterations, methylation, etc). Graphical and numerical outputs are provided to identify highly aberrant individuals for a particular pathway of interest, as well as the gene and omics drivers of aberrant multi-omic profiles.
Last updated 4 months ago
softwarestatisticalmethodprincipalcomponentgeneexpressionpathwaysrnaseqbiocartamethylseq
4.95 score 3 stars 2 scripts 240 downloadsepigraHMM - Epigenomic R-based analysis with hidden Markov models
epigraHMM provides a set of tools for the analysis of epigenomic data based on hidden Markov Models. It contains two separate peak callers, one for consensus peaks from biological or technical replicates, and one for differential peaks from multi-replicate multi-condition experiments. In differential peak calling, epigraHMM provides window-specific posterior probabilities associated with every possible combinatorial pattern of read enrichment across conditions.
Last updated 4 months ago
chipseqatacseqdnaseseqhiddenmarkovmodelepigeneticszlibopenblascppopenmp
4.94 score 88 scripts 238 downloadsMBQN - Mean/Median-balanced quantile normalization
Modified quantile normalization for omics or other matrix-like data distorted in location and scale.
Last updated 4 months ago
normalizationpreprocessingproteomicssoftware
4.92 score 2 stars 14 scripts 216 downloadsspiky - Spike-in calibration for cell-free MeDIP
spiky implements methods and model generation for cfMeDIP (cell-free methylated DNA immunoprecipitation) with spike-in controls. CfMeDIP is an enrichment protocol which avoids destructive conversion of scarce template, making it ideal as a "liquid biopsy," but creating certain challenges in comparing results across specimens, subjects, and experiments. The use of synthetic spike-in standard oligos allows diagnostics performed with cfMeDIP to quantitatively compare samples across subjects, experiments, and time points in both relative and absolute terms.
Last updated 4 months ago
differentialmethylationdnamethylationnormalizationpreprocessingqualitycontrolsequencing
4.90 score 2 stars 3 scripts 140 downloadsEpiTxDb - Storing and accessing epitranscriptomic information using the AnnotationDbi interface
EpiTxDb facilitates the storage of epitranscriptomic information. More specifically, it can keep track of modification identity, position, the enzyme for introducing it on the RNA, a specifier which determines the position on the RNA to be modified and the literature references each modification is associated with.
Last updated 4 months ago
softwareepitranscriptomics
4.90 score 7 scripts 236 downloadsncGTW - Alignment of LC-MS Profiles by Neighbor-wise Compound-specific Graphical Time Warping with Misalignment Detection
The purpose of ncGTW is to help XCMS with LC-MS data alignment. Currently, ncGTW can detect the feature groups misaligned by XCMS, and the user can choose whether or not to realign these feature groups with ncGTW.
Last updated 4 months ago
softwaremassspectrometrymetabolomicsalignmentcpp
4.90 score 8 stars 3 scripts 362 downloadsPanomiR - Detection of miRNAs that regulate interacting groups of pathways
PanomiR is a package to detect miRNAs that target groups of pathways from gene expression data. This package provides functionality for generating pathway activity profiles, determining differentially activated pathways between user-specified conditions, determining clusters of pathways via the PCxN package, and generating miRNAs targeting clusters of pathways. These functions can be used separately or sequentially to analyze RNA-Seq data.
Last updated 4 months ago
geneexpressiongenesetenrichmentgenetargetmirnapathways
4.89 score 3 stars 13 scripts 192 downloadsVisiumIO - Import Visium data from the 10X Space Ranger pipeline
The package allows users to readily import spatial data obtained from either the 10X website or from the Space Ranger pipeline. Supported formats include tar.gz, h5, and mtx files. Multiple files can be imported at once with *List type of functions. The package represents data mainly as SpatialExperiment objects.
Last updated 14 days ago
softwareinfrastructuredataimportsinglecellspatialbioconductor-packagegenomicsu24ca289073
4.89 score 11 scripts 126 downloadsscatterHatch - Creates hatched patterns for scatterplots
The objective of this package is to efficiently create scatterplots where groups can be distinguished by color and texture. Visualizations in computational biology tend to have many groups making it difficult to distinguish between groups solely on color. Thus, this package is useful for increasing the accessibility of scatterplot visualizations to those with visual impairments such as color blindness.
Last updated 4 months ago
visualizationsinglecellcellbiologysoftwarespatial
4.89 score 7 stars 11 scripts 190 downloadsribosomeProfilingQC - Ribosome Profiling Quality Control
Ribo-Seq (also named ribosome profiling or footprinting) measures the translatome (unlike RNA-Seq, which sequences the transcriptome) by direct quantification of the ribosome-protected fragments (RPFs). This package provides the tools for quality assessment of ribosome profiling. In addition, it can preprocess Ribo-Seq data for subsequent differential analysis.
Last updated 5 days ago
riboseqsequencinggeneregulationqualitycontrolvisualizationcoverage
4.88 score 17 scripts 237 downloadsscanMiRApp - scanMiR shiny application
A shiny interface to the scanMiR package. The application enables the scanning of transcripts and custom sequences for miRNA binding sites, the visualization of KdModels and binding results, as well as browsing predicted repression data. In addition, it contains the IndexedFst class for fast indexed reading of large GenomicRanges or data.frames, and some utilities for facilitating scans and identifying enriched miRNA-target pairs.
Last updated 4 months ago
mirnasequencematchingguishinyapps
4.88 score 19 scripts 192 downloadsMsDataHub - Mass Spectrometry Data on ExperimentHub
The MsDataHub package uses the ExperimentHub infrastructure to distribute raw mass spectrometry data files, peptide spectrum matches or quantitative data from proteomics and metabolomics experiments.
Last updated 4 months ago
experimenthubsoftwaremassspectrometryproteomicsmetabolomicsbioconductordatamass-spectrometry
4.88 score 1 stars 1 scripts 168 downloadsILoReg - ILoReg: a tool for high-resolution cell population identification from scRNA-Seq data
ILoReg is a tool for identification of cell populations from scRNA-seq data. In particular, ILoReg is useful for finding cell populations with subtle transcriptomic differences. The method utilizes a self-supervised learning method, called Iterative Clustering Projection (ICP), to find cluster probabilities, which are used in noise reduction prior to PCA and the subsequent hierarchical clustering and t-SNE steps. Additionally, functions for differential expression analysis to find gene markers for the populations and gene expression visualization are provided.
Last updated 4 months ago
singlecellsoftwareclusteringdimensionreductionrnaseqvisualizationtranscriptomicsdatarepresentationdifferentialexpressiontranscriptiongeneexpression
4.88 score 5 stars 2 scripts 130 downloadsdStruct - Identifying differentially reactive regions from RNA structurome profiling data
dStruct identifies differentially reactive regions from RNA structurome profiling data. dStruct is compatible with a broad range of structurome profiling technologies, e.g., SHAPE-MaP, DMS-MaPseq, Structure-Seq, SHAPE-Seq, etc. See Choudhary et al., Genome Biology, 2019 for the underlying method.
Last updated 4 months ago
statisticalmethodstructuralpredictionsequencingsoftware
4.86 score 2 stars 12 scripts 164 downloadssosta - A package for the analysis of anatomical tissue structures in spatial omics data
sosta (Spatial Omics STructure Analysis) is a package for analyzing spatial omics data to explore tissue organization at the anatomical structure level. It reconstructs morphologically relevant structures based on molecular features or cell types. It further calculates a range of structural and shape metrics to quantitatively describe tissue architecture. The package is designed to integrate with other packages for the analysis of spatial (omics) data.
Last updated 3 days ago
softwarespatialtranscriptomicsvisualization
4.85 score 1 stars 2 scripts 49 downloadsclevRvis - Visualization Techniques for Clonal Evolution
clevRvis provides a set of visualization techniques for clonal evolution. These include shark plots, dolphin plots and plaice plots. Algorithms for time point interpolation as well as therapy effect estimation are provided. Phylogeny-aware color coding is implemented. A shiny-app for generating plots interactively is additionally provided.
Last updated 4 months ago
softwareshinyappsvisualization
4.85 score 7 stars 2 scripts 149 downloadsCelliD - Unbiased Extraction of Single Cell gene signatures using Multiple Correspondence Analysis
CelliD is a clustering-free multivariate statistical method for the robust extraction of per-cell gene signatures from single-cell RNA-seq. CelliD allows unbiased cell identity recognition across different donors, tissues-of-origin, model organisms and single-cell omics protocols. The package can also be used to explore functional pathways enrichment in single cell data.
Last updated 4 months ago
rnaseqsinglecelldimensionreductionclusteringgenesetenrichmentgeneexpressionatacseqopenblascppopenmp
4.85 score 70 scripts 535 downloadsAlphaBeta - Computational inference of epimutation rates and spectra from high-throughput DNA methylation data in plants
AlphaBeta is a computational method for estimating epimutation rates and spectra from high-throughput DNA methylation data in plants. The method has been specifically designed to: 1. analyze 'germline' epimutations in the context of multi-generational mutation accumulation lines (MA-lines). 2. analyze 'somatic' epimutations in the context of plant development and aging.
Last updated 4 months ago
epigeneticsfunctionalgenomicsgeneticsmathematicalbiology
4.85 score 8 scripts 302 downloadssSNAPPY - Single Sample directioNAl Pathway Perturbation analYsis
A single sample pathway perturbation testing method for RNA-seq data. The method propagates changes in gene expression down gene-set topologies to compute single-sample directional pathway perturbation scores that reflect potential direction of change. Perturbation scores can be used to test significance of pathway perturbation at both individual-sample and treatment levels.
Last updated 4 months ago
softwaregeneexpressiongenesetenrichmentgenesignaling
4.83 score 1 stars 15 scripts 374 downloadsevaluomeR - Evaluation of Bioinformatics Metrics
Evaluating the reliability of your own metrics and the measurements done on your own datasets by analysing the stability and goodness of the classifications of such metrics.
Last updated 4 months ago
clusteringclassificationfeatureextractionassessmentclustering-evaluationevaluomeevaluomermetrics
4.82 score 33 scripts 230 downloadsmultistateQTL - Toolkit for the analysis of multi-state QTL data
A collection of tools for doing various analyses of multi-state QTL data, with a focus on visualization and interpretation. The package 'multistateQTL' contains functions which can remove or impute missing data, identify significant associations, as well as categorise features into global, multi-state or unique. The analysis results are stored in a 'QTLExperiment' object, which is based on the 'SummarizedExperiment' framework.
Last updated 1 day ago
functionalgenomicsgeneexpressionsequencingvisualizationsnpsoftware
4.81 score 9 scripts 93 downloadsalabaster.mae - Load and Save MultiAssayExperiments
Save MultiAssayExperiments into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated 4 months ago
dataimportdatarepresentation
4.78 score 1 dependents 5 scripts 197 downloadsbiodbChebi - biodbChebi, a library for connecting to the ChEBI Database
The biodbChebi library provides access to the ChEBI Database, using the biodb package framework. It allows entries to be retrieved by their accession number. Web services can be accessed for searching the database by name, mass or other fields.
Last updated 4 months ago
softwareinfrastructuredataimport
4.78 score 2 stars 1 dependents 3 scripts 208 downloads
supersigs - Supervised mutational signatures
Generate SuperSigs (supervised mutational signatures) from single nucleotide variants in the cancer genome. Functions included in the package allow the user to learn supervised mutational signatures from their data and apply them to new data. The methodology is based on the one described in Afsari (2021, eLife).
Last updated 4 months ago
featureextractionclassificationregressionsequencingwholegenomesomaticmutation
4.78 score 3 stars 3 scripts 340 downloads
airpart - Differential cell-type-specific allelic imbalance
Airpart identifies sets of genes displaying differential cell-type-specific allelic imbalance across cell types or states, utilizing single-cell allelic counts. It makes use of a generalized fused lasso with binomial observations of allelic counts to partition cell types by their allelic imbalance. Alternatively, a nonparametric method for partitioning cell types is offered. The package includes a number of visualizations and quality control functions for examining single cell allelic imbalance datasets.
Last updated 4 months ago
singlecellrnaseqatacseqchipseqsequencinggeneregulationgeneexpressiontranscriptiontranscriptomevariantcellbiologyfunctionalgenomicsdifferentialexpressiongraphandnetworkregressionclusteringqualitycontrol
4.78 score 2 stars 2 scripts 356 downloads
customCMPdb - Customize and Query Compound Annotation Database
This package serves as a query interface for important community collections of small molecules, while also allowing users to include custom compound collections.
Last updated 4 months ago
softwarecheminformaticsannotationhubsoftware
4.78 score 2 stars 4 scripts 184 downloads
wpm - Well Plate Maker
The Well-Plate Maker (WPM) is a shiny application deployed as an R package. Functions for command-line/script use are also available. The WPM allows users to generate well plate maps to carry out their experiments while improving the handling of batch effects. In particular, it helps control the "plate effect" thanks to its ability to randomize samples over multiple well plates. The algorithm for placing the samples is inspired by the backtracking algorithm: the samples are placed at random while respecting specific spatial constraints.
Last updated 4 months ago
guiproteomicsmassspectrometrybatcheffectexperimentaldesign
4.78 score 6 stars 7 scripts 212 downloads
sarks - Suffix Array Kernel Smoothing for discovery of correlative sequence motifs and multi-motif domains
Suffix Array Kernel Smoothing (see https://academic.oup.com/bioinformatics/article-abstract/35/20/3944/5418797), or SArKS, identifies sequence motifs whose presence correlates with numeric scores (such as differential expression statistics) assigned to the sequences (such as gene promoters). SArKS smooths over sequence similarity, quantified by location within a suffix array based on the full set of input sequences. A second round of smoothing over spatial proximity within sequences reveals multi-motif domains (MMDs). Discovered motifs can then be merged or extended based on adjacency within MMDs. False positive rates are estimated and controlled by permutation testing.
Last updated 4 months ago
motifdiscoverygeneregulationgeneexpressiontranscriptomicsrnaseqdifferentialexpressionfeatureextractionopenjdk
4.78 score 3 stars 3 scripts 200 downloads
VaSP - Quantification and Visualization of Variations of Splicing in Population
Discovery of genome-wide variable alternative splicing events from short-read RNA-seq data and visualizations of gene splicing information for publication-quality multi-panel figures in a population. (Warning: The visualizing function has been removed because the Sushi package it depends on is deprecated. If you want to use it, please change back to an older version.)
Last updated 4 months ago
rnaseqalternativesplicingdifferentialsplicingstatisticalmethodvisualizationpreprocessingclusteringdifferentialexpressionkeggimmunooncology3s-scoresalternative-splicingballgownrna-seqsplicingsqtlstatistics
4.78 score 3 stars 3 scripts 214 downloads
packFinder - de novo Annotation of Pack-TYPE Transposable Elements
Algorithm and tools for in silico pack-TYPE transposon discovery. Filters a given genome for properties unique to DNA transposons and provides tools for the investigation of returned matches. Sequences are input in DNAString format, and ranges are returned as a data frame (in the format returned by as.data.frame(GRanges)).
Last updated 4 months ago
geneticssequencematchingannotationbioinformaticstext-mining
4.78 score 6 stars 6 scripts 236 downloads
MEB - A normalization-invariant minimum enclosing ball method to detect differentially expressed genes for RNA-seq and scRNA-seq data
This package provides a method to identify differentially expressed (DE) genes in the same or different species. Given that non-DE genes have some similarities in features, a scaling-free minimum enclosing ball (SFMEB) model is built to cover those non-DE genes in feature space. DE genes, which differ markedly from non-DE genes, are then regarded as outliers and fall outside the ball. The method in this package is described in the article 'A minimum enclosing ball method to detect differential expression genes for RNA-seq data'. The SFMEB method is extended to the scMEB method, which identifies DE genes in scRNA-seq datasets with two or more potential cell types or unknown labels.
Last updated 4 months ago
differentialexpressiongeneexpressionnormalizationclassificationsequencing
4.78 score 1 scripts 186 downloads
NBAMSeq - Negative Binomial Additive Model for RNA-Seq Data
High-throughput sequencing experiments followed by differential expression analysis are a widely used approach to detect genomic biomarkers. A fundamental step in differential expression analysis is to model the association between gene counts and covariates of interest. NBAMSeq is a flexible statistical model based on the generalized additive model that allows for information sharing across genes in variance estimation.
Last updated 4 months ago
rnaseqdifferentialexpressiongeneexpressionsequencingcoveragedifferential-expressiongene-expressiongeneralized-additive-modelsgeneralized-linear-modelsnegative-binomial-regressionsplines
4.78 score 2 stars 2 scripts 369 downloads
eds - eds: Low-level reader for Alevin EDS format
This package provides a single function, readEDS. This is a low-level utility for reading the Alevin EDS format into R. This function is not designed for end-users; instead, the package is predominantly for simplifying the package dependency graph for other Bioconductor packages.
Last updated 4 months ago
sequencingrnaseqgeneexpressionsinglecellcpp
4.76 score 1 dependents 19 scripts 654 downloads
MethReg - Assessing the regulatory potential of DNA methylation regions or sites on gene transcription
Epigenome-wide association studies (EWAS) detect a large number of DNA methylation differences, often hundreds of differentially methylated regions and thousands of CpGs, that are significantly associated with a disease; many are located in non-coding regions. Therefore, there is a critical need to better understand the functional impact of these CpG methylations and to further prioritize the significant changes. MethReg is an R package for integrative modeling of DNA methylation, target gene expression and transcription factor binding sites data, to systematically identify and rank functional CpG methylations. MethReg evaluates, prioritizes and annotates CpG sites with high regulatory potential using matched methylation and gene expression data, along with external TF-target interaction databases based on manual curation, ChIP-seq experiments or gene regulatory network analysis.
Last updated 4 months ago
methylationarrayregressiongeneexpressionepigeneticsgenetargettranscription
4.76 score 5 stars 19 scripts 171 downloads
MatrixQCvis - Shiny-based interactive data-quality exploration for omics data
Data quality assessment is an integral part of preparatory data analysis to ensure sound biological information retrieval. We present here the MatrixQCvis package, which provides shiny-based interactive visualization of data quality metrics at the per-sample and per-feature level. It is broadly applicable to quantitative omics data types that come in matrix-like format (features x samples). It enables the detection of low-quality samples, drifts, outliers and batch effects in data sets. Visualizations include, amongst others, bar and violin plots of the (count/intensity) values, mean vs standard deviation plots, MA plots, empirical cumulative distribution function (ECDF) plots, visualizations of the distances between samples, and multiple types of dimension reduction plots. Furthermore, MatrixQCvis allows for differential expression analysis based on the limma (moderated t-tests) and proDA (Wald tests) packages. MatrixQCvis builds upon the popular Bioconductor SummarizedExperiment S4 class and thus enables facile integration into existing workflows. The package is especially tailored towards metabolomics and proteomics mass spectrometry data, but also allows assessing the data quality of other data types that can be represented in a SummarizedExperiment object.
Last updated 4 months ago
visualizationshinyappsguiqualitycontroldimensionreductionmetabolomicsproteomicstranscriptomics
4.74 score 4 scripts 374 downloads
FuseSOM - A Correlation Based Multiview Self Organizing Maps Clustering For IMC Datasets
A correlation-based multiview self-organizing map for the characterization of cell types in highly multiplexed in situ imaging cytometry assays (`FuseSOM`) is a tool for unsupervised clustering. `FuseSOM` is robust and achieves high accuracy by combining a `Self Organizing Map` architecture and a `Multiview` integration of correlation based metrics. This allows FuseSOM to cluster highly multiplexed in situ imaging cytometry assays.
Last updated 4 months ago
singlecellcellbasedassaysclusteringspatial
4.71 score 1 stars 17 scripts 180 downloads
PhosR - A set of methods and tools for comprehensive analysis of phosphoproteomics data
PhosR is a package for the comprehensive analysis of phosphoproteomic data. There are two major components to PhosR: processing and downstream analysis. PhosR consists of various processing tools for phosphoproteomics data including filtering, imputation, normalisation, and functional analysis for inferring active kinases and signalling pathways.
Last updated 4 months ago
softwareresearchfieldproteomics
4.71 score 51 scripts 224 downloads
HGC - A fast hierarchical graph-based clustering method
HGC (short for Hierarchical Graph-based Clustering) is an R package for conducting hierarchical clustering on large-scale single-cell RNA-seq (scRNA-seq) data. The key idea is to construct a dendrogram of cells on their shared nearest neighbor (SNN) graph. HGC provides functions for building graphs and for conducting hierarchical clustering on the graph. Users with an older R version can visit https://github.com/XuegongLab/HGC/tree/HGC4oldRVersion to get the HGC package built for R 3.6.
Last updated 4 months ago
singlecellsoftwareclusteringrnaseqgraphandnetworkdnaseqcpp
4.70 score 25 scripts 178 downloads
epidecodeR - epidecodeR: a functional exploration tool for epigenetic and epitranscriptomic regulation
epidecodeR is a package capable of analysing the impact of the degree of DNA/RNA epigenetic chemical modifications on the dysregulation of genes or proteins. This package integrates chemical modification data generated from a host of epigenomic or epitranscriptomic techniques such as ChIP-seq, ATAC-seq, m6A-seq, etc. and dysregulated gene lists in the form of differential gene expression, ribosome occupancy or differential protein translation, and identifies the impact on gene dysregulation of varying degrees of chemical modifications associated with the genes. epidecodeR generates cumulative distribution function (CDF) plots showing shifts in the trend of overall log2FC between genes divided into groups based on the degree of modification associated with the genes. The tool also tests for significance of the difference in log2FC between groups of genes.
Last updated 4 months ago
differentialexpressiongeneregulationhistonemodificationfunctionalpredictiontranscriptiongeneexpressionepitranscriptomicsepigeneticsfunctionalgenomicssystemsbiologytranscriptomicschiponchipdifferential-expressiongenomicsgenomics-visualization
4.70 score 5 stars 1 scripts 194 downloads
barcodetrackR - Functions for Analyzing Cellular Barcoding Data
barcodetrackR is an R package developed for the analysis and visualization of clonal tracking data. The required data are samples and tag abundances in matrix form, usually from cellular barcoding experiments, integration site retrieval analyses, or similar technologies.
Last updated 4 months ago
softwarevisualizationsequencing
4.70 score 5 stars 6 scripts 253 downloads
weitrix - Tools for matrices with precision weights, test and explore weighted or sparse data
Data type and tools for working with matrices having precision weights and missing data. This package provides a common representation and tools that can be used with many types of high-throughput data. The meaning of the weights is compatible with usage in the base R function "lm" and the package "limma". Calibrate weights to account for known predictors of precision. Find rows with excess variability. Perform differential testing and find rows with the largest confident differences. Find PCA-like components of variation even with many missing values, rotated so that individual components may be meaningfully interpreted. DelayedArray matrices and BiocParallel are supported.
Last updated 4 months ago
softwaredatarepresentationdimensionreductiongeneexpressiontranscriptomicsrnaseqsinglecellregression
4.70 score 8 scripts 203 downloads
AWFisher - An R package for fast computing for adaptively weighted Fisher's method
Implementation of the adaptively weighted Fisher's method, including fast p-value computing, variability index, and meta-pattern.
Last updated 4 months ago
statisticalmethodsoftware
4.70 score 5 stars 4 scripts 342 downloads
DelayedTensor - R package for sparse and out-of-core arithmetic and decomposition of Tensor
DelayedTensor performs Tensor arithmetic directly on DelayedArray objects. DelayedTensor provides some generic functions related to Tensor arithmetic/decomposition and dispatches them on the DelayedArray class. DelayedTensor also supports Tensor contraction by the einsum function, which is inspired by numpy einsum.
Last updated 4 months ago
softwareinfrastructuredatarepresentationdimensionreduction
4.68 score 4 stars 3 scripts 188 downloads
immunogenViewer - Visualization and evaluation of protein immunogens
Plots protein properties and visualizes position of peptide immunogens within protein sequence. Allows evaluation of immunogens based on structural and functional annotations to infer suitability for antibody-based methods aiming to detect native proteins.
Last updated 2 days ago
featureextractionproteomicssoftwarevisualization
4.65 score 10 scripts 66 downloads
GrafGen - Classification of Helicobacter Pylori Genomes
To classify Helicobacter pylori genomes according to genetic distance from nine reference populations. The nine reference populations are hpgpAfrica, hpgpAfrica-distant, hpgpAfroamerica, hpgpEuroamerica, hpgpMediterranea, hpgpEurope, hpgpEurasia, hpgpAsia, and hpgpAklavik86-like. The vertex populations are Africa, Europe and Asia.
Last updated 18 days ago
geneticssoftwaregenomeannotationclassificationcpp
4.65 score 2 scripts 152 downloads
alabaster.vcf - Save and Load Variant Data to/from File
Save variant calling SummarizedExperiment objects to file and load them back as VCF objects. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated 4 months ago
dataimportdatarepresentation
4.65 score 1 dependents 6 scripts 208 downloads
alabaster.bumpy - Save and Load BumpyMatrices to/from file
Save BumpyMatrix objects into file artifacts, and load them back into memory. This is a more portable alternative to serialization of such objects into RDS files. Each artifact is associated with metadata for further interpretation; downstream applications can enrich this metadata with context-specific properties.
Last updated 4 months ago
dataimportdatarepresentation
4.65 score 1 dependents 5 scripts 197 downloads
EpiMix - EpiMix: an integrative tool for the population-level analysis of DNA methylation
EpiMix is a comprehensive tool for the integrative analysis of high-throughput DNA methylation data and gene expression data. EpiMix enables automated data downloading (from TCGA or GEO), preprocessing, methylation modeling, interactive visualization and functional annotation. To identify hypo- or hypermethylated CpG sites across physiological or pathological conditions, EpiMix uses beta mixture modeling to identify the methylation states of each CpG probe and compares the methylation of the experimental group to the control group. The output from EpiMix is the functional DNA methylation that is predictive of gene expression. EpiMix incorporates specialized algorithms to identify functional DNA methylation at various genetic elements, including proximal cis-regulatory elements of protein-coding genes, distal enhancers, and genes encoding microRNAs and lncRNAs.
Last updated 4 months ago
softwareepigeneticspreprocessingdnamethylationgeneexpressiondifferentialmethylation
4.65 score 1 stars 1 dependents 7 scripts 175 downloads
fgga - Hierarchical ensemble method based on factor graph
Package that implements the FGGA algorithm. This package provides a hierarchical ensemble method based on factor graphs for the consistent cross-ontology annotation of protein coding genes. FGGA embodies elements of predicate logic, communication theory, supervised learning and inference in graphical models.
Last updated 4 months ago
softwarestatisticalmethodclassificationnetworknetworkinferencesupportvectormachinegraphandnetworkgo
4.65 score 3 stars 6 scripts 141 downloads
Informeasure - R implementation of information measures
This package consolidates a comprehensive set of information measurements, encompassing mutual information, conditional mutual information, interaction information, partial information decomposition, and part mutual information.
Last updated 4 months ago
geneexpressionnetworkinferencenetworksoftware
4.65 score 3 stars 4 scripts 166 downloads
hca - Exploring the Human Cell Atlas Data Coordinating Platform
This package provides users with the ability to query the Human Cell Atlas data repository for single-cell experiment data. The `projects()`, `files()`, `samples()` and `bundles()` functions retrieve summary information on each of these indexes; corresponding `*_details()` are available for individual entries of each index. File-based resources can be downloaded using `files_download()`. Advanced use of the package allows the user to page through large result sets, and to flexibly query the 'list-of-lists' structure representing query responses.
Last updated 4 months ago
softwaresinglecell
4.64 score 55 scripts 196 downloads
corral - Correspondence Analysis for Single Cell Data
Correspondence analysis (CA) is a matrix factorization method, and is similar to principal components analysis (PCA). Whereas PCA is designed for application to continuous, approximately normally distributed data, CA is appropriate for non-negative, count-based data that are in the same additive scale. The corral package implements CA for dimensionality reduction of a single matrix of single-cell data, as well as a multi-table adaptation of CA that leverages data-optimized scaling to align data generated from different sequencing platforms by projecting into a shared latent space. corral utilizes sparse matrices and a fast implementation of SVD, and can be called directly on Bioconductor objects (e.g., SingleCellExperiment) for easy pipeline integration. The package also includes additional options, including variations of CA to address overdispersion in count data (e.g., Freeman-Tukey chi-squared residual), as well as the option to apply CA-style processing to continuous data (e.g., proteomic TOF intensities) with the Hellinger distance adaptation of CA.
Last updated 4 months ago
batcheffectdimensionreductiongeneexpressionpreprocessingprincipalcomponentsequencingsinglecellsoftwarevisualization
4.64 score 22 scripts 223 downloads
Cepo - Cepo for the identification of differentially stable genes
Defining the identity of a cell is fundamental to understanding the heterogeneity of cells in response to various environmental signals and perturbations. We present Cepo, a new method to explore cell identities from single-cell RNA-sequencing data using differential stability as a new metric to define cell identity genes. Cepo computes cell-type specific gene statistics pertaining to differentially stable gene expression.
Last updated 4 months ago
classificationgeneexpressionsinglecellsoftwaresequencingdifferentialexpression
4.62 score 1 dependents 14 scripts 270 downloads
Dune - Improving replicability in single-cell RNA-Seq cell type discovery
Given a set of clustering labels, Dune merges pairs of clusters to increase the mean adjusted Rand index (ARI) between labels, improving replicability.
Last updated 4 months ago
clusteringgeneexpressionrnaseqsoftwaresinglecelltranscriptomicsvisualization
4.62 score 42 scripts 192 downloads
GenomicInteractionNodes - An R/Bioconductor package to detect the interaction nodes from HiC/HiChIP/HiCAR data
The GenomicInteractionNodes package can import interactions from a bedpe file and define the interaction nodes, i.e. the genomic interaction sites with multiple interaction loops. An interaction node is a binding platform that regulates one or multiple genes. The detected interaction nodes will be annotated for downstream validation.
Last updated 5 days ago
hicsequencingsoftware
4.60 score 1 scripts 164 downloads
magpie - MeRIP-Seq data Analysis for Genomic Power Investigation and Evaluation
This package aims to perform power analysis for the MeRIP-seq study. It calculates FDR, FDC, power, and precision under various study design parameters, including but not limited to sample size, sequencing depth, and testing method. It can also output results into .xlsx files or produce corresponding figures of choice.
Last updated 4 months ago
epitranscriptomicsdifferentialmethylationsequencingrnaseqsoftware
4.60 score 40 scripts 119 downloads
oncoscanR - Secondary analyses of CNV data (HRD and more)
The software uses the copy number segments from a text file and identifies all chromosome arms that are globally altered and computes various genome-wide scores. The following HRD scores (characteristic of BRCA-mutated cancers) are included: LST, HR-LOH, nLST and gLOH. The package is tailored for the ThermoFisher Oncoscan assay analyzed with their Chromosome Alteration Suite (ChAS) but can be adapted to any input.
Last updated 4 months ago
copynumbervariationmicroarraysoftware
4.60 score 2 stars 6 scripts 203 downloads
rifi - 'rifi' analyses data from rifampicin time series created by microarray or RNAseq
'rifi' analyses data from rifampicin time series created by microarray or RNAseq. 'rifi' is a transcriptome data analysis tool for the holistic identification of transcription and decay associated processes. The decay constants and the delay of the onset of decay are fitted for each probe/bin. Subsequently, probes/bins of equal properties are combined into segments by dynamic programming, independent of an existing genome annotation. This allows detection of transcript segments of different stability or transcriptional events within one annotated gene. In addition to the classic decay constant/half-life analysis, 'rifi' detects processing sites, transcription pausing sites, internal transcription start sites in operons, sites of partial transcription termination in operons, identifies areas of likely transcriptional interference by the collision mechanism and gives an estimate of the transcription velocity. All data are integrated to give an estimate of continuous transcriptional units, i.e. operons. Comprehensive output tables and visualizations of the full genome result and the individual fits for all probes/bins are produced.
Last updated 4 months ago
rnaseqdifferentialexpressiongeneregulationtranscriptomicsregressionmicroarraysoftware
4.60 score 1 scripts 204 downloads
RAREsim - Simulation of Rare Variant Genetic Data
Haplotype simulations of rare variant genetic data that emulate real data can be performed with RAREsim. RAREsim uses the expected number of variants in MAC bins - either as provided by default parameters or estimated from target data - and an abundance of rare variants as simulated by HAPGEN2 to probabilistically prune variants. RAREsim produces haplotypes that emulate real sequencing data with respect to the total number of variants, allele frequency spectrum, haplotype structure, and variant annotation.
Last updated 4 months ago
geneticssoftwarevariantannotationsequencing
4.60 score 4 stars 4 scripts 136 downloads
terraTCGAdata - OpenAccess TCGA Data on Terra as MultiAssayExperiment
Leverage the existing open access TCGA data on Terra with well-established Bioconductor infrastructure. Make use of the Terra data model without learning its complexities. With a few functions, you can copy / download and generate a MultiAssayExperiment from the TCGA example workspaces provided by Terra.
Last updated 4 months ago
softwareinfrastructuredataimportbioconductor-package
4.60 score 4 scripts 236 downloads
MBECS - Evaluation and correction of batch effects in microbiome data-sets
The Microbiome Batch Effect Correction Suite (MBECS) provides a set of functions to evaluate and mitigate unwanted noise due to processing in batches. To that end it incorporates a host of batch correcting algorithms (BECA) from various packages. In addition it offers a correction and reporting pipeline that provides a preliminary look at the characteristics of a data-set before and after correcting for batch effects.
Last updated 4 months ago
batcheffectmicrobiomereportwritingvisualizationnormalizationqualitycontrol
4.60 score 4 stars 4 scripts 306 downloads
OGRE - Calculate, visualize and analyse overlap between genomic regions
OGRE calculates overlap between user defined genomic region datasets. Any regions can be supplied, e.g. genes, SNPs, or reads from sequencing experiments. Key numbers help analyse the extent of overlaps, which can also be visualized at a genomic level.
Last updated 4 months ago
softwareworkflowstepbiologicalquestionannotationmetagenomicsvisualizationsequencing
4.60 score 2 stars 4 scripts 196 downloads
IntramiRExploreR - Predicting Targets for Drosophila Intragenic miRNAs
Intra-miR-ExploreR, an integrative miRNA target prediction bioinformatics tool, identifies targets combining expression and biophysical interactions of a given microRNA (miR). Using the tool, we have identified targets for 92 intragenic miRs in D. melanogaster, using available microarray expression data from the Affymetrix 1 and Affymetrix 2 microarray platforms, providing a global perspective of intragenic miR targets in Drosophila. Predicted targets are grouped according to biological functions using the DAVID Gene Ontology tool and are ranked based on a biologically relevant scoring system, enabling the user to identify functionally relevant targets for a given miR.
Last updated 4 months ago
softwaremicroarraygenetargetstatisticalmethodgeneexpressiongeneprediction
4.60 score 4 scripts 182 downloads
svaRetro - Retrotransposed transcript detection from structural variants
svaRetro contains functions for detecting retrotransposed transcripts (RTs) from structural variant calls. It takes structural variant calls in GRanges of breakend notation and identifies RTs by exon-exon junctions and insertion sites. The candidate RTs are reported by events and annotated with information of the inserted transcripts.
Last updated 4 months ago
dataimportsequencingannotationgeneticsvariantannotationcoveragevariantdetection
4.60 score 4 scripts 156 downloads
segmenter - Perform Chromatin Segmentation Analysis in R by Calling ChromHMM
Chromatin segmentation analysis transforms ChIP-seq data into signals over the genome. The latter represents the observed states in a multivariate Markov model to predict the chromatin's underlying states. ChromHMM, written in Java, integrates histone modification datasets to learn the chromatin states de-novo. The goal of this package is to call chromHMM from within R, capture the output files in an S4 object and interface to other relevant Bioconductor analysis tools. In addition, segmenter provides functions to test, select and visualize the output of the segmentation.
Last updated 4 months ago
softwarehistonemodificationbioconductorchromhmmsegmentation-an
4.60 score 4 stars 9 scripts 244 downloads
RCSL - Rank Constrained Similarity Learning for single cell RNA sequencing data
A novel clustering algorithm and toolkit RCSL (Rank Constrained Similarity Learning) to accurately identify various cell types using scRNA-seq data from a complex tissue. RCSL considers both local similarity and global similarity among the cells to discern the subtle differences among cells of the same type as well as larger differences among cells of different types. RCSL uses Spearman’s rank correlations of a cell’s expression vector with those of other cells to measure its global similarity, and adaptively learns neighbour representation of a cell as its local similarity. The overall similarity of a cell to other cells is a linear combination of its global similarity and local similarity.
Last updated 4 months ago
singlecellsoftwareclusteringdimensionreductionrnaseqvisualizationsequencing
4.60 score 2 stars 10 scripts 200 downloads
KBoost - Inference of gene regulatory networks from gene expression data
Reconstructing gene regulatory networks and transcription factor activity is crucial to understanding biological processes and holds potential for developing personalized treatment. Yet, it is still an open problem as state-of-the-art algorithms are often not able to handle large amounts of data. Furthermore, many of the present methods predict numerous false positives and are unable to integrate other sources of information such as previously known interactions. Here we introduce KBoost, an algorithm that uses kernel PCA regression, boosting and Bayesian model averaging for fast and accurate reconstruction of gene regulatory networks. KBoost can also use a prior network built on previously known transcription factor targets. We have benchmarked KBoost using three different datasets against other high performing algorithms. The results show that our method compares favourably to other methods across datasets.
Last updated 4 months ago
networkgraphandnetworkbayesiannetworkinferencegeneregulationtranscriptomicssystemsbiologytranscriptiongeneexpressionregressionprincipalcomponent
4.60 score 4 stars 9 scripts 144 downloads
cellmigRation - Track Cells, Analyze Cell Trajectories and Compute Migration Statistics
Import TIFF images of fluorescently labeled cells, and track cell movements over time. Parallelization is supported for image processing and for fast computation of cell trajectories. In-depth analysis of cell trajectories is enabled by 15 trajectory analysis functions.
Last updated 4 months ago
cellbiologydatarepresentationdataimportbioconductor-packagecell-trackingshinytrajectory-analysis
4.60 score 4 scripts 166 downloads
sitadela - An R package for the easy provision of simple but complete tab-delimited genomic annotation from a variety of sources and organisms
Provides an interface to build a unified database of genomic annotations and their coordinates (gene, transcript and exon levels). It is intended for use when simple tab-delimited annotations (or simple GRanges objects) are required instead of the more complex annotation Bioconductor packages. Also useful when combinatorial annotation elements are required, such as RefSeq coordinates with Ensembl biotypes. Finally, it can download, construct and handle annotations with versioned genes and transcripts (where available, e.g. RefSeq and latest Ensembl). This is particularly useful in precision medicine applications where the latter must be reported.
Last updated 4 months ago
softwareworkflowsteprnaseqtranscriptionsequencingtranscriptomicsbiomedicalinformaticsfunctionalgenomicssystemsbiologyalternativesplicingdataimportchipseq
4.60 score 2 scripts 127 downloads
TrajectoryGeometry - This Package Discovers Directionality in Time and Pseudo-times Series of Gene Expression Patterns
Given a time series or pseudo-time series of gene expression data, we might wish to know: Do the changes in gene expression in these data exhibit directionality? Are there turning points in this directionality? Do different subsets of the data move in different directions? This package uses spherical geometry to probe these sorts of questions. In particular, if we are looking at (say) the first n dimensions of the PCA of gene expression, directionality can be detected as the clustering of points on the (n-1)-dimensional sphere.
Last updated 4 months ago
biologicalquestionstatisticalmethodgeneexpressionsinglecell
4.60 score 7 scripts 142 downloads
CAEN - Category encoding method for selecting feature genes for the classification of single-cell RNA-seq
With the development of high-throughput techniques, more and more gene expression analyses tend to replace hybridization-based microarrays with the revolutionary technology. The novel method encodes the category again by employing the rank of samples for each gene in each class. We then consider the correlation coefficient of gene and class with rank of sample and new rank of category. The genes with the highest correlation coefficients are considered the feature genes that are most effective for classifying the samples.
Last updated 4 months ago
differentialexpressionsequencingclassificationrnaseqatacseqsinglecellgeneexpressionripseq
4.60 score 2 scripts 166 downloads
GEOfastq - Downloads ENA Fastqs With GEO Accessions
GEOfastq is used to download fastq files from the European Nucleotide Archive (ENA) starting with an accession from the Gene Expression Omnibus (GEO). To do this, sample metadata is retrieved from GEO and the Sequence Read Archive (SRA). SRA run accessions are then used to construct FTP and aspera download links for fastq files generated by the ENA.
Last updated 4 months ago
rnaseqdataimportbioinformaticsfastqgene-expressiongeorna-seq
4.60 score 4 stars 6 scripts 260 downloads
nempi - Inferring unobserved perturbations from gene expression data
Takes as input an incomplete perturbation profile and differential gene expression in log odds and infers unobserved perturbations and augments observed ones. The inference is done by iteratively inferring a network from the perturbations and inferring perturbations from the network. The network inference is done by Nested Effects Models.
Last updated 4 months ago
softwaregeneexpressiondifferentialexpressiondifferentialmethylationgenesignalingpathwaysnetworkclassificationneuralnetworknetworkinferenceatacseqdnaseqrnaseqpooledscreenscrisprsinglecellsystemsbiology
4.60 score 2 stars 2 scripts 199 downloads
bnem - Training of logical models from indirect measurements of perturbation experiments
bnem combines the use of indirect measurements of Nested Effects Models (package mnem) with the Boolean networks of CellNOptR. Perturbation experiments of signalling nodes in cells are analysed for their effect on the global gene expression profile. Those profiles give evidence for the Boolean regulation of down-stream nodes in the network, e.g., whether two parents activate their child independently (OR-gate) or jointly (AND-gate).
Last updated 4 months ago
pathwayssystemsbiologynetworkinferencenetworkgeneexpressiongeneregulationpreprocessing
4.60 score 2 stars 5 scripts 196 downloads
marr - Maximum rank reproducibility
marr (Maximum Rank Reproducibility) is a nonparametric approach that detects reproducible signals using a maximal rank statistic for high-dimensional biological data. In this R package, we implement functions that measure the reproducibility of features per sample pair and sample pairs per feature in high-dimensional biological replicate experiments. The user-friendly plot functions in this package also plot histograms of the reproducibility of features per sample pair and sample pairs per feature. Furthermore, our approach also allows the users to select optimal filtering threshold values for the identification of reproducible features and sample pairs based on output visualization checks (histograms). This package also provides the subset of data filtered by reproducible features and/or sample pairs.
Last updated 4 months ago
qualitycontrolmetabolomicsmassspectrometryrnaseqchipseqcpp
4.60 score 2 stars 4 scripts 221 downloads
metabolomicsWorkbenchR - Metabolomics Workbench in R
This package provides functions for interfacing with the Metabolomics Workbench RESTful API. Study, compound, protein and gene information can be searched for using the API. Methods to obtain study data in common Bioconductor formats such as SummarizedExperiment and MultiAssayExperiment are also included.
Last updated 4 months ago
softwaremetabolomics
4.60 score 8 scripts 235 downloads
NoRCE - NoRCE: Noncoding RNA Sets Cis Annotation and Enrichment
While some non-coding RNAs (ncRNAs) are assigned critical regulatory roles, most remain functionally uncharacterized. This presents a challenge whenever an interesting set of ncRNAs needs to be analyzed in a functional context. Transcripts located close-by on the genome are often regulated together. This genomic proximity on the sequence can hint to a functional association. We present a tool, NoRCE, that performs cis enrichment analysis for a given set of ncRNAs. Enrichment is carried out using the functional annotations of the coding genes located proximal to the input ncRNAs. Other biologically relevant information such as topologically associating domain (TAD) boundaries, co-expression patterns, and miRNA target prediction information can be incorporated to conduct a richer enrichment analysis. To this end, NoRCE includes several relevant datasets as part of its data repository, including cell-line specific TAD boundaries, functional gene sets, and expression data for coding & ncRNAs specific to cancer. Additionally, the users can utilize custom data files in their investigation. Enrichment results can be retrieved in a tabular format or visualized in several different ways. NoRCE is currently available for the following species: human, mouse, rat, zebrafish, fruit fly, worm, and yeast.
Last updated 4 months ago
biologicalquestiondifferentialexpressiongenomeannotationgenesetenrichmentgenetargetgenomeassemblygo
4.60 score 1 stars 6 scripts 219 downloads
reconsi - Resampling Collapsed Null Distributions for Simultaneous Inference
Improves simultaneous inference under dependence of tests by estimating a collapsed null distribution through resampling. Accounting for the dependence between tests increases the power while reducing the variability of the false discovery proportion. This dependence is common in genomics applications, e.g. when combining flow cytometry measurements with microbiome sequence counts.
Last updated 4 months ago
metagenomicsmicrobiomemultiplecomparisonflowcytometry
4.60 score 2 stars 2 scripts 130 downloads
easyreporting - Helps creating report for improving Reproducible Computational Research
An S4 class for facilitating the automated creation of rmarkdown files inside other packages/software even without knowing rmarkdown language. Best if implemented in functions as "recursive" style programming.
Last updated 4 months ago
reportwriting
4.60 score 2 stars 2 scripts 206 downloads
brendaDb - The BRENDA Enzyme Database
R interface for importing and analyzing enzyme information from the BRENDA database.
Last updated 4 months ago
thirdpartyclientannotationdataimportbrendadatabaseenzymehacktoberfestcpp
4.60 score 2 stars 4 scripts 202 downloads
GNET2 - Constructing gene regulatory networks from expression data through functional module inference
Cluster genes into functional groups with an E-M process. TF assignment and gene assignment are performed iteratively until the assignment of genes no longer changes or the maximum number of iterations is reached.
Last updated 4 months ago
geneexpressionregressionnetworknetworkinferencesoftwarecpp
4.60 score 2 stars 3 scripts 178 downloads
NetActivity - Compute gene set scores from a deep learning framework
NetActivity enables the computation of gene set scores from previously trained sparsely-connected autoencoders. The package contains a function to prepare the data (`prepareSummarizedExperiment`) and a function to compute the gene set scores (`computeGeneSetScores`). The package `NetActivityData` contains different pre-trained models to be directly applied to the data. Alternatively, users can use the package to compute gene set scores using custom models.
Last updated 4 months ago
rnaseqmicroarraytranscriptionfunctionalgenomicsgogeneexpressionpathwayssoftware
4.59 score 26 scripts 184 downloads
scDDboost - A compositional model to assess expression changes from single-cell rna-seq data
scDDboost is an R package to analyze changes in the distribution of single-cell expression data between two experimental conditions. Compared to other methods that assess differential expression, scDDboost benefits uniquely from information conveyed by the clustering of cells into cellular subtypes. Through a novel empirical Bayesian formulation it calculates gene-specific posterior probabilities that the marginal expression distribution is the same (or different) between the two conditions. The implementation in scDDboost treats gene-level expression data within each condition as a mixture of negative binomial distributions.
Last updated 14 days ago
singlecellsoftwareclusteringsequencinggeneexpressiondifferentialexpressionbayesiancpp
4.58 score 19 scripts 208 downloads
goSorensen - Statistical inference based on the Sorensen-Dice dissimilarity and the Gene Ontology (GO)
This package implements inferential methods to compare gene lists in terms of their biological meaning as expressed in the GO. The compared gene lists are characterized by cross-tabulation frequency tables of enriched GO items. Dissimilarity between gene lists is evaluated using the Sorensen-Dice index. The fundamental guiding principle is that two gene lists are taken as similar if they share a great proportion of common enriched GO items.
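As a rough illustration of the dissimilarity mentioned above, the sketch below computes the Sorensen-Dice dissimilarity, 1 - 2|A∩B| / (|A| + |B|), between two sets of enriched GO items. It is written in Python for brevity and is not the goSorensen API: the function name, the example GO identifiers, and the convention of returning 0 for two empty sets are all assumptions made for this example.

```python
def sorensen_dice_dissimilarity(enriched_a, enriched_b):
    """Sorensen-Dice dissimilarity between two collections of enriched GO items.

    Hypothetical helper for illustration only; not part of goSorensen.
    """
    a, b = set(enriched_a), set(enriched_b)
    if not a and not b:
        # Assumed convention: two empty lists are treated as identical.
        return 0.0
    shared = len(a & b)
    return 1.0 - 2.0 * shared / (len(a) + len(b))


# Made-up enriched GO term sets for two gene lists.
list1 = {"GO:0006915", "GO:0008283", "GO:0006954", "GO:0007049"}
list2 = {"GO:0006915", "GO:0008283", "GO:0016032"}

d = sorensen_dice_dissimilarity(list1, list2)
print(round(d, 3))  # 2 shared items out of 4 + 3, so 1 - 4/7
```

The more enriched GO items two lists share, the closer the value is to 0. The package itself builds statistical inference on top of this index, which this sketch does not attempt.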
Last updated 4 months ago
annotationgogenesetenrichmentsoftwaremicroarraypathwaysgeneexpressionmultiplecomparisongraphandnetworkreactomeclusteringkegg
4.56 score 12 scripts 177 downloads
RedisParam - Provide a 'redis' back-end for BiocParallel
This package provides a Redis-based back-end for BiocParallel, enabling an alternative mechanism for distributed computation. The 'manager' distributes tasks to a 'worker' pool through a central Redis server, rather than directly to workers as with other BiocParallel implementations. This means that the worker pool can change dynamically during job evaluation. All features of BiocParallel are supported, including reproducible random number streams, logging to the manager, and alternative 'load balancing' task distributions.
Last updated 4 months ago
infrastructure
4.53 score 17 scripts 128 downloads
mumosa - Multi-Modal Single-Cell Analysis Methods
Assorted utilities for multi-modal analyses of single-cell datasets. Includes functions to combine multiple modalities for downstream analysis, perform MNN-based batch correction across multiple modalities, and to compute correlations between assay values for different modalities.
Last updated 4 months ago
immunooncologysinglecellrnaseq
4.51 score 13 scripts 213 downloads
ribor - An R Interface for Ribo Files
The ribor package provides an R Interface for .ribo files. It provides functionality to read the .ribo file, which is of HDF5 format, and performs common analyses on its contents.
Last updated 4 months ago
softwareinfrastructure
4.51 score 32 scripts 188 downloads
SQLDataFrame - Representation of SQL tables in DataFrame metaphor
Implements bindings for SQL tables that are compatible with Bioconductor S4 data structures, namely the DataFrame and DelayedArray. This allows SQL-derived data to be easily used inside other Bioconductor objects (e.g., SummarizedExperiments) while keeping everything on disk.
Last updated 4 months ago
datarepresentationinfrastructuresoftware
4.51 score 2 stars 5 scripts 130 downloads
FeatSeekR - FeatSeekR an R package for unsupervised feature selection
FeatSeekR performs unsupervised feature selection using replicated measurements. It iteratively selects features with the highest reproducibility across replicates, after projecting out those dimensions from the data that are spanned by the previously selected features. The selected set of features has a high replicate reproducibility and a high degree of uniqueness.
Last updated 14 days ago
softwarestatisticalmethodfeatureextractionmassspectrometry
4.48 score 2 stars 3 scripts 153 downloads
TDbasedUFEadv - Advanced package of tensor decomposition based unsupervised feature extraction
This is an advanced version of TDbasedUFE, which is a comprehensive package to perform tensor decomposition based unsupervised feature extraction. In contrast to TDbasedUFE, which can perform simple feature selection and multiomics analyses, this package provides more complicated and advanced features that are less commonly required. It is intended for users who need this more specific functionality.
Last updated 4 months ago
geneexpressionfeatureextractionmethylationarraysinglecellsoftwarebioconductor-packagebioinformaticstensor-decomposition
4.48 score 4 scripts 130 downloads
DNAfusion - Identification of gene fusions using paired-end sequencing
DNAfusion can identify gene fusions such as EML4-ALK based on paired-end sequencing results. This package was developed using position deduplicated BAM files generated with the AVENIO Oncology Analysis Software. These files are made using the AVENIO ctDNA surveillance kit and Illumina Nextseq 500 sequencing. This is a targeted hybridization NGS approach and includes ALK-specific but not EML4-specific probes.
Last updated 4 months ago
targetedresequencinggeneticsgenefusiondetectionsequencingbioconductor-packagecirculating-tumor-dnagene-fusionliquid-biopsynext-generation-sequencingtargeted-sequencingvariant-calling
4.48 score 3 stars 10 scripts 173 downloads
ccImpute - ccImpute: an accurate and scalable consensus clustering based approach to impute dropout events in the single-cell RNA-seq data (https://doi.org/10.1186/s12859-022-04814-8)
Dropout events make the lowly expressed genes indistinguishable from true zero expression and different than the low expression present in cells of the same type. This issue makes any subsequent downstream analysis difficult. ccImpute is an imputation algorithm that uses cell similarity established by consensus clustering to impute the most probable dropout events in the scRNA-seq datasets. ccImpute demonstrated performance which exceeds the performance of existing imputation approaches while introducing the least amount of new noise as measured by clustering performance characteristics on datasets with known cell identities.
Last updated 4 months ago
singlecellsequencingprincipalcomponentdimensionreductionclusteringrnaseqtranscriptomicsopenblascppopenmp
4.48 score 2 stars 2 scripts 196 downloads
epistasisGA - An R package to identify multi-snp effects in nuclear family studies using the GADGETS method
This package runs the GADGETS method to identify epistatic effects in nuclear family studies. It also provides functions for permutation-based inference and graphical visualization of the results.
Last updated 4 months ago
geneticssnpgeneticvariabilityopenblascpp
4.48 score 1 stars 5 scripts 118 downloads
seqArchR - Identify Different Architectures of Sequence Elements
seqArchR enables unsupervised discovery of _de novo_ clusters with characteristic sequence architectures characterized by position-specific motifs or composition of stretches of nucleotides, e.g., CG-richness. seqArchR does _not_ require any specifications w.r.t. the number of clusters, the length of any individual motifs, or the distance between motifs if and when they occur in pairs/groups; it directly detects them from the data. seqArchR uses non-negative matrix factorization (NMF) as its backbone, and employs a chunking-based iterative procedure that enables processing of large sequence collections efficiently. Wrapper functions are provided for visualizing cluster architectures as sequence logos.
Last updated 4 months ago
motifdiscoverygeneregulationmathematicalbiologysystemsbiologytranscriptomicsgeneticsclusteringdimensionreductionfeatureextractiondnaseqnmfnonnegative-matrix-factorizationpromoter-sequence-architecturesscikit-learnsequence-analysissequence-architecturesunsupervised-machine-learning
4.48 score 1 stars 1 dependents 9 scripts 188 downloads
updateObject - Find/fix old serialized S4 instances
A set of tools built around updateObject() to work with old serialized S4 instances. The package is primarily useful to package maintainers who want to update the serialized S4 instances included in their package. This is still work-in-progress.
Last updated 4 months ago
infrastructuredatarepresentationbioconductor-packagecore-package
4.48 score 1 stars 3 scripts 98 downloads
mitoClone2 - Clonal Population Identification in Single-Cell RNA-Seq Data using Mitochondrial and Somatic Mutations
This package primarily identifies variants in mitochondrial genomes from BAM alignment files. It filters these variants to remove RNA editing events, then estimates their evolutionary relationship (i.e. their phylogenetic tree) and groups single cells into clones. It also visualizes the mutations and provides additional genomic context.
Last updated 4 months ago
annotationdataimportgeneticssnpsoftwaresinglecellalignmentcurlbzip2xz-utilszlibcpp
4.48 score 1 stars 8 scripts 204 downloads
BUSseq - Batch Effect Correction with Unknown Subtypes for scRNA-seq data
The BUSseq R package fits an interpretable Bayesian hierarchical model, the Batch Effects Correction with Unknown Subtypes for scRNA-seq Data (BUSseq), to correct batch effects in the presence of unknown cell types. BUSseq is able to simultaneously correct batch effects, cluster cell types, and take care of the count data nature, the overdispersion, the dropout events, and the cell-specific sequencing depth of scRNA-seq data. After correcting the batch effects with BUSseq, the corrected values can be used for downstream analysis as if all cells were sequenced in a single batch. BUSseq can integrate read count matrices obtained from different scRNA-seq platforms and allows cell types to be measured in some but not all of the batches, as long as the experimental design fulfills the conditions listed in our manuscript.
Last updated 4 months ago
experimentaldesigngeneexpressionstatisticalmethodbayesianclusteringfeatureextractionbatcheffectsinglecellsequencingcppopenmp
4.48 score 30 scripts 190 downloads
tLOH - Assessment of evidence for LOH in spatial transcriptomics pre-processed data using Bayes factor calculations
tLOH, or transcriptomicsLOH, assesses evidence for loss of heterozygosity (LOH) in pre-processed spatial transcriptomics data. This tool requires spatial transcriptomics cluster and allele count information at likely heterozygous single-nucleotide polymorphism (SNP) positions in VCF format. Bayes factors are calculated at each SNP to determine the likelihood of a potential loss of heterozygosity event. Two plotting functions are included to visualize allele fraction and aggregated Bayes factor per chromosome. Data generated with the 10X Genomics Visium Spatial Gene Expression platform must be pre-processed to obtain an individual sample VCF with columns for each cluster. Required fields are allele depth (AD) with counts for reference/alternative alleles and read depth (DP).
Last updated 4 months ago
copynumbervariationtranscriptionsnpgeneexpressiontranscriptomics
4.48 score 3 stars 4 scripts 170 downloads
granulator - Rapid benchmarking of methods for *in silico* deconvolution of bulk RNA-seq data
granulator is an R package for the cell type deconvolution of heterogeneous tissues based on bulk RNA-seq data or single cell RNA-seq expression profiles. The package provides a unified testing interface to rapidly run and benchmark multiple state-of-the-art deconvolution methods. Data for the deconvolution of peripheral blood mononuclear cells (PBMCs) into individual immune cell types is provided as well.
Last updated 4 months ago
rnaseqgeneexpressiondifferentialexpressiontranscriptomicssinglecellstatisticalmethodregression
4.48 score 3 stars 6 scripts 260 downloads
treekoR - Cytometry Cluster Hierarchy and Cellular-to-phenotype Associations
treekoR is a novel framework that aims to utilise the hierarchical nature of single cell cytometry data to find robust and interpretable associations between cell subsets and patient clinical end points. These associations are aimed to recapitulate the nested proportions prevalent in workflows involving manual gating, which are often overlooked in workflows using automatic clustering to identify cell populations. We developed treekoR to: derive a hierarchical tree structure of cell clusters; quantify each cell type as a proportion relative to all cells in a sample (%total), and as the proportion relative to a parent population (%parent); perform significance testing using the calculated proportions; and provide an interactive html visualisation to help highlight key results.
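The two proportions described above can be sketched in a few lines. The Python snippet below is only an illustration of %total and %parent on a toy cluster hierarchy; it is not treekoR code, and the function name, the cluster names, the cell counts, and the assumption that each node's count already includes its descendants are all hypothetical.

```python
def cluster_proportions(counts, parent_of, root="all_cells"):
    """Return %total and %parent for every non-root node of a cluster tree.

    counts:    cells per node, each count including the node's descendants
               (an assumption of this sketch).
    parent_of: maps each non-root node to its parent node.
    """
    total = counts[root]
    result = {}
    for node, n in counts.items():
        if node == root:
            continue
        result[node] = {
            "pct_total": 100.0 * n / total,
            "pct_parent": 100.0 * n / counts[parent_of[node]],
        }
    return result


# Toy sample: 1000 cells, 400 of them T cells, split into CD4 and CD8 subsets.
counts = {"all_cells": 1000, "T_cells": 400, "CD4_T": 300, "CD8_T": 100}
parent_of = {"T_cells": "all_cells", "CD4_T": "T_cells", "CD8_T": "T_cells"}

props = cluster_proportions(counts, parent_of)
print(props["CD4_T"])
```

In this toy sample the CD4 subset makes up 30% of all cells but 75% of its parent T-cell population, which is the kind of nested signal that %total alone would hide.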
Last updated 4 months ago
clusteringdifferentialexpressionflowcytometryimmunooncologymassspectrometrysinglecellsoftwarestatisticalmethodvisualization
4.48 score 1 dependents 10 scripts 164 downloads
LRcell - Differential cell type change analysis using Logistic/linear Regression
The goal of LRcell is to identify specific sub-cell types that drive the changes observed in a bulk RNA-seq differential gene expression experiment. To achieve this, LRcell utilizes sets of cell marker genes acquired from single-cell RNA-sequencing (scRNA-seq) as indicators for various cell types in the tissue of interest. Next, for each cell type, using its marker genes as indicators, we apply Logistic Regression on the complete set of genes with differential expression p-values to calculate a cell-type significance p-value. Finally, these p-values are compared to predict which one(s) are likely to be responsible for the differential gene expression pattern observed in the bulk RNA-seq experiments. LRcell is inspired by the LRpath[@sartor2009lrpath] algorithm developed by Sartor et al., originally designed for pathway/gene set enrichment analysis. LRcell contains three major components: LRcell analysis, plot generation and marker gene selection. All modules in this package are written in R. This package also provides marker genes in the Prefrontal Cortex (pFC) human brain region, human PBMC and nine mouse brain regions (Frontal Cortex, Cerebellum, Globus Pallidus, Hippocampus, Entopeduncular, Posterior Cortex, Striatum, Substantia Nigra and Thalamus).
Last updated 4 months ago
singlecellgenesetenrichmentsequencingregressiongeneexpressiondifferentialexpressionenrichmentmarker-genes
4.48 score 3 stars 5 scripts 188 downloads
ComPrAn - Complexome Profiling Analysis package
This package is for analysis of SILAC labeled complexome profiling data. It uses a peptide table in tab-delimited format as input and produces ready-to-use tables and plots.
Last updated 4 months ago
massspectrometryproteomicsvisualization
4.48 score 5 scripts 184 downloads
midasHLA - R package for immunogenomics data handling and association analysis
MiDAS is an R package for immunogenetics data transformation and statistical analysis. MiDAS accepts input data in the form of HLA alleles and KIR types, and can transform it into biologically meaningful variables, enabling HLA amino acid fine mapping, analyses of HLA evolutionary divergence, KIR gene presence, as well as validated HLA-KIR interactions. Further, it allows comprehensive statistical association analysis workflows with phenotypes of diverse measurement scales. MiDAS closes a gap between the inference of immunogenetic variation and its efficient utilization to make relevant discoveries related to T cell, Natural Killer cell, and disease biology.
Last updated 4 months ago
cellbiologygeneticsstatisticalmethod
4.48 score 3 scripts 190 downloads
ORFhunteR - Predict open reading frames in nucleotide sequences
The ORFhunteR package is an R and C++ library for the automatic determination and annotation of open reading frames (ORF) in a large set of RNA molecules. It efficiently implements the machine learning model based on vectorization of nucleotide sequences and the random forest classification algorithm. The ORFhunteR package consists of a set of functions written in the R language in conjunction with C++. The efficiency of the package was confirmed by the examples of the analysis of RNA molecules from the NCBI RefSeq and Ensembl databases. The package can be used in basic and applied biomedical research related to the study of the transcriptome of normal as well as altered (for example, cancer) human cells.
Last updated 4 months ago
technologystatisticalmethodsequencingrnaseqclassificationfeatureextractioncpp
4.48 score 1 stars 163 downloads
RLassoCox - A reweighted Lasso-Cox by integrating gene interaction information
RLassoCox is a package that implements the RLasso-Cox model proposed by Wei Liu. The RLasso-Cox model integrates gene interaction information into the Lasso-Cox model for accurate survival prediction and survival biomarker discovery. It is based on the hypothesis that topologically important genes in the gene interaction network tend to have stable expression changes. The RLasso-Cox model uses random walk to evaluate the topological weight of genes, and then highlights topologically important genes to improve the generalization ability of the Lasso-Cox model. The RLasso-Cox model has the advantage of identifying small gene sets with high prognostic performance on independent datasets, which may play an important role in identifying robust survival biomarkers for various cancer types.
Last updated 4 months ago
survivalregressiongeneexpressiongenepredictionnetwork
4.48 score 3 stars 2 scripts 157 downloads
SCFA - SCFA: Subtyping via Consensus Factor Analysis
Subtyping via Consensus Factor Analysis (SCFA) can efficiently remove noisy signals from consistent molecular patterns in multi-omics data. SCFA first uses an autoencoder to select only important features and then repeatedly performs factor analysis to represent the data with different numbers of factors. Using these representations, it can reliably identify cancer subtypes and accurately predict risk scores of patients.
Last updated 4 months ago
survivalclusteringclassification
4.48 score 3 stars 2 scripts 168 downloads
AnVILPublish - Publish Packages and Other Resources to AnVIL Workspaces
Use this package to create or update AnVIL workspaces from resources such as R / Bioconductor packages. The metadata about the package (e.g., select information from the package DESCRIPTION file and from vignette YAML headings) are used to populate the 'DASHBOARD'. Vignettes are translated to python notebooks ready for evaluation in AnVIL.
Last updated 4 months ago
infrastructuresoftware
4.48 score 232 downloads
rnaEditr - Statistical analysis of RNA editing sites and hyper-editing regions
rnaEditr analyzes site-specific RNA editing events, as well as hyper-editing regions. The editing frequencies can be tested against binary, continuous or survival outcomes. Multiple covariate variables as well as interaction effects can also be incorporated in the statistical models.
Last updated 4 months ago
genetargetepigeneticsdimensionreductionfeatureextractionregressionsurvivalrnaseq
4.48 score 3 stars 9 scripts 234 downloads
uncoverappLib - Interactive graphical application for clinical assessment of sequence coverage at the base-pair level
A Shiny application containing a suite of graphical and statistical tools to support clinical assessment of low coverage regions. It displays three web pages, each providing a different analysis module: coverage analysis, calculation of AF by the allele frequency app, and binomial distribution. uncoverAPP provides a statistical summary of coverage given a target file or gene names.
Last updated 4 months ago
softwarevisualizationannotationcoverage
4.48 score 3 stars 4 scripts 206 downloads
combi - Compositional omics model based visual integration
This explorative ordination method combines quasi-likelihood estimation, compositional regression models and latent variable models for integrative visualization of several omics datasets. Both unconstrained and constrained integration are available. The results are shown as interpretable, compositional multiplots.
Last updated 4 months ago
metagenomicsdimensionreductionmicrobiomevisualizationmetabolomics
4.48 score 1 stars 7 scripts 201 downloads
biobtreeR - Using biobtree tool from R
The biobtreeR package provides an interface to the [biobtree](https://github.com/tamerh/biobtree) tool, which covers a large set of bioinformatics datasets and provides search and chain mapping functionality.
Last updated 4 months ago
annotationbioinformatics
4.48 score 3 stars 3 scripts 278 downloads
MACSQuantifyR - Fast treatment of MACSQuantify FACS data
Automatically processes the metadata of the MACSQuantify FACS sorter. It runs multiple modules: i) imports the raw file and supports graphical selection of duplicates in the well plate, ii) computes statistics on the data, and iii) can compute a combination index.
Last updated 4 months ago
dataimportpreprocessingnormalizationflowcytometrydatarepresentationgui
4.48 score 3 scripts 121 downloads
microbiomeDASim - Microbiome Differential Abundance Simulation
A toolkit for simulating differential microbiome data designed for longitudinal analyses. Several functional forms may be specified for the mean trend. Observations are drawn from a multivariate normal model. The objective of this package is to be able to simulate data in order to accurately compare different longitudinal methods for differential abundance.
Last updated 4 months ago
microbiomevisualizationsoftware
4.48 score 3 stars 1 scripts 183 downloads
chipenrich - Gene Set Enrichment For ChIP-seq Peak Data
ChIP-Enrich and Poly-Enrich perform gene set enrichment testing using peaks called from a ChIP-seq experiment. The method empirically corrects for confounding factors such as the length of genes, and the mappability of the sequence surrounding genes.
Last updated 14 days ago
immunooncologychipseqepigeneticsfunctionalgenomicsgenesetenrichmenthistonemodificationregression
4.46 score 29 scripts 362 downloads
Rtpca - Thermal proximity co-aggregation with R
R package for performing thermal proximity co-aggregation analysis with thermal proteome profiling datasets to analyse protein complex assembly and (differential) protein-protein interactions across conditions.
Last updated 4 months ago
softwareproteomicsdataimport
4.46 score 29 scripts 238 downloads
MMUPHin - Meta-analysis Methods with Uniform Pipeline for Heterogeneity in Microbiome Studies
MMUPHin is an R package for meta-analysis tasks of microbiome cohorts. It has function interfaces for: a) covariate-controlled batch- and cohort effect adjustment, b) meta-analysis differential abundance testing, c) meta-analysis unsupervised discrete structure (clustering) discovery, and d) meta-analysis unsupervised continuous structure discovery.
Last updated 4 months ago
metagenomicsmicrobiomebatcheffect
4.43 score 45 scripts 409 downloads
Macarron - Prioritization of potentially bioactive metabolic features from epidemiological and environmental metabolomics datasets
Macarron is a workflow for the prioritization of potentially bioactive metabolites from metabolomics experiments. Prioritization integrates the strengths of several kinds of evidence of bioactivity, such as covariation with a known metabolite, abundance relative to a known metabolite, and association with an environmental or phenotypic indicator of bioactivity. Broadly, the workflow consists of stratified clustering of metabolic spectral features that co-vary in abundance in a condition, transfer of functional annotations, estimation of relative abundance, and differential abundance analysis to identify associations between features and phenotype/condition.
Last updated 4 months ago
sequencingmetabolomicscoveragefunctionalpredictionclustering
4.41 score 13 scripts 234 downloads
CircSeqAlignTk - A toolkit for end-to-end analysis of RNA-seq data for circular genomes
CircSeqAlignTk is designed for end-to-end RNA-Seq data analysis of circular genome sequences, from alignment to visualization. It mainly targets viroids which are composed of 246-401 nt circular RNAs. In addition, CircSeqAlignTk implements a tidy interface to generate synthetic sequencing data that mimic real RNA-Seq data, allowing developers to evaluate the performance of alignment tools and workflows.
Last updated 4 months ago
sequencingsmallrnaalignmentsoftware
4.40 score 3 scripts 164 downloads
phenomis - Postprocessing and univariate analysis of omics data
The 'phenomis' package provides methods to perform post-processing (i.e. quality control and normalization) as well as univariate statistical analysis of single and multi-omics data sets. These methods include quality control metrics, signal drift and batch effect correction, intensity transformation, univariate hypothesis testing, but also clustering (as well as annotation of metabolomics data). The data are handled in the standard Bioconductor formats (i.e. SummarizedExperiment and MultiAssayExperiment for single and multi-omics datasets, respectively; the alternative ExpressionSet and MultiDataSet formats are also supported for convenience). As a result, all methods can be readily chained as workflows. The pipeline can be further enriched by multivariate analysis and feature selection, by using the 'ropls' and 'biosigner' packages, which support the same formats. Data can be conveniently imported from and exported to text files. Although the methods were initially targeted to metabolomics data, most of the methods can be applied to other types of omics data (e.g., transcriptomics, proteomics).
Last updated 4 months ago
batcheffectclusteringcoveragekeggmassspectrometrymetabolomicsnormalizationproteomicsqualitycontrolsequencingstatisticalmethodtranscriptomics
4.40 score 6 scripts 179 downloads
EasyCellType - Annotate cell types for scRNA-seq data
EasyCellType automatically examines input marker lists obtained from existing software such as Seurat against cell marker databases. Two quantification approaches to annotate cell types are provided: gene set enrichment analysis (GSEA) and a modified version of Fisher's exact test. The function presents annotation recommendations graphically: bar plots for each cluster showing candidate cell types, as well as a dot plot summarizing the top 5 significant annotations for each cluster.
Last updated 4 months ago
singlecellsoftwaregeneexpressiongenesetenrichment
4.40 score 6 scripts 180 downloads
chihaya - Save Delayed Operations to a HDF5 File
Saves the delayed operations of a DelayedArray to a HDF5 file. This enables efficient recovery of the DelayedArray's contents in other languages and analysis frameworks.
Last updated 4 months ago
dataimportdatarepresentationzlibcpp
4.38 score 16 scripts 310 downloads
CTSV - Identification of cell-type-specific spatially variable genes accounting for excess zeros
The R package CTSV implements the CTSV approach developed by Jinge Yu and Xiangyu Luo that detects cell-type-specific spatially variable genes accounting for excess zeros. CTSV directly models sparse raw count data through a zero-inflated negative binomial regression model, incorporates cell-type proportions, and performs hypothesis testing based on R package pscl. The package outputs p-values and q-values for genes in each cell type, and CTSV is scalable to datasets with tens of thousands of genes measured on hundreds of spots. CTSV can be installed in Windows, Linux, and Mac OS.
Last updated 4 months ago
geneexpressionstatisticalmethodregressionspatialgenetics
4.38 score 2 stars 12 scripts 164 downloads
flowSpecs - Tools for processing of high-dimensional cytometry data
This package is intended to fill the role of conventional cytometry pre-processing software, for spectral decomposition, transformation, visualization and cleanup, and to aid further downstream analyses, such as with DepecheR, by enabling transformation of flowFrames and flowSets to dataframes. Functions for flowCore-compliant automatic 1D-gating/filtering are in the pipeline. The package name has been chosen both as it will deal with spectral cytometry and as it will hopefully give the user a nice pair of spectacles through which to view their data.
Last updated 4 months ago
softwarecellbasedassaysdatarepresentationimmunooncologyflowcytometrysinglecellvisualizationnormalizationdataimport
4.38 score 6 stars 7 scripts 256 downloads
DepInfeR - Inferring tumor-specific cancer dependencies through integrating ex-vivo drug response assays and drug-protein profiling
DepInfeR integrates two experimentally accessible input data matrices: the drug sensitivity profiles of cancer cell lines or primary tumors ex-vivo (X), and the drug affinities of a set of proteins (Y), to infer a matrix of molecular protein dependencies of the cancers (β). DepInfeR deconvolutes the protein inhibition effect on the viability phenotype by using regularized multivariate linear regression. It assigns a “dependence coefficient” to each protein and each sample, and therefore could be used to gain a causal and accurate understanding of functional consequences of genomic aberrations in a heterogeneous disease, as well as to guide the choice of pharmacological intervention for a specific cancer type, sub-type, or an individual patient. For more information, please read our preprint on bioRxiv: https://doi.org/10.1101/2022.01.11.475864.
Last updated 4 months ago
softwareregressionpharmacogeneticspharmacogenomicsfunctionalgenomics
4.36 score 1 stars 23 scripts 157 downloads
ASURAT - Functional annotation-driven unsupervised clustering for single-cell data
ASURAT is software for single-cell data analysis. Using ASURAT, one can simultaneously perform unsupervised clustering and biological interpretation in terms of cell type, disease, biological process, and signaling pathway activity. Taking single-cell RNA-seq data and knowledge-based databases, such as Cell Ontology, Gene Ontology, KEGG, etc., as input, ASURAT transforms gene expression tables into original multivariate tables, termed sign-by-sample matrices (SSMs).
Last updated 4 months ago
geneexpressionsinglecellsequencingclusteringgenesignalingcpp
4.36 score 23 scripts 196 downloads
ggmanh - Visualization Tool for GWAS Result
Manhattan plots and QQ plots are commonly used to visualize the end results of genome-wide association studies (GWAS). The "ggmanh" package aims to keep the generation of these plots simple while maintaining customizability. Main functions include manhattan_plot, qqunif, and thinPoints.
Last updated 4 months ago
visualizationgenomewideassociationgenetics
4.36 score 23 scripts 441 downloads
ToxicoGx - Analysis of Large-Scale Toxico-Genomic Data
Contains a set of functions to perform large-scale analysis of toxicogenomic data, providing a standardized data structure to hold information relevant to annotation, visualization and statistical analysis of toxicogenomic data.
Last updated 4 months ago
geneexpressionpharmacogeneticspharmacogenomicssoftware
4.36 score 23 scripts 200 downloads
drugTargetInteractions - Drug-Target Interactions
Provides utilities for identifying drug-target interactions for sets of small molecule or gene/protein identifiers. The required drug-target interaction information is obtained from a local SQLite instance of the ChEMBL database. ChEMBL has been chosen for this purpose because it provides one of the most comprehensive and best annotated knowledge resources for drug-target information available in the public domain.
Last updated 4 months ago
cheminformaticsbiomedicalinformaticspharmacogeneticspharmacogenomicsproteomicsmetabolomics
4.34 score 1 stars 11 scripts 194 downloads
ROSeq - Modeling expression ranks for noise-tolerant differential expression analysis of scRNA-Seq data
ROSeq is a rank-based approach to modeling gene expression from a filtered and normalized read count matrix. ROSeq takes a filtered and normalized read count matrix and cell annotations/conditions as input and determines the differentially expressed genes between the contrasting groups of single cells. One of the input parameters is the number of cores to be used.
Last updated 4 months ago
geneexpressiondifferentialexpressionsinglecellcount-datagene-expressiongene-expression-profilesnormalizationpopulationsranktmmtungtung-datasettutorialvignette
4.34 score 2 stars 11 scripts 231 downloads
Spaniel - Spatial Transcriptomics Analysis
Spaniel includes a series of tools to aid the quality control and analysis of Spatial Transcriptomics data. Spaniel can import data from either the original Spatial Transcriptomics system or 10X Visium technology. The package contains functions to create a SingleCellExperiment Seurat object and provides a method of loading a histological image into R. The spanielPlot function allows visualisation of metrics contained within the S4 object overlaid onto the image of the tissue.
Last updated 4 months ago
singlecellrnaseqqualitycontrolpreprocessingnormalizationvisualizationtranscriptomicsgeneexpressionsequencingsoftwaredataimportdatarepresentationinfrastructurecoverageclustering
4.34 score 22 scripts 224 downloads
IFAA - Robust Inference for Absolute Abundance in Microbiome Analysis
This package offers a robust approach to making inference on the association of covariates with the absolute abundance (AA) of the microbiome in an ecosystem. It can also be applied directly to relative abundance (RA) data to make inference on AA, because the ratio of two RAs is equal to the ratio of their AAs. The algorithm can estimate and test the associations of interest while adjusting for potential confounders. The estimates are as easy to interpret as those of a typical regression analysis. High-dimensional covariates are handled with regularization, which is implemented with parallel computing. The false discovery rate is automatically controlled by this approach. Zeros do not need to be imputed with a positive value for the analysis. The IFAA package also offers the 'MZILN' function for estimating and testing associations of abundance ratios with covariates.
Last updated 4 months ago
softwaretechnologysequencingmicrobiomeregression
4.32 score 14 scripts 160 downloads
OutSplice - Comparison of Splicing Events between Tumor and Normal Samples
An easy-to-use tool that can compare splicing events in tumor and normal tissue samples using either a user-generated matrix or data from The Cancer Genome Atlas (TCGA). This package generates a matrix of splicing outliers that are significantly over- or underexpressed in tumor samples compared to normal samples, denoted by chromosome location. The package also calculates the splicing burden in each tumor and characterizes the types of splicing events that occur.
Last updated 6 days ago
alternativesplicingdifferentialexpressiondifferentialsplicinggeneexpressionrnaseqsoftwarevariantannotation
4.30 score 1 stars 4 scripts 140 downloads
cytofQC - Labels normalized cells for CyTOF data and assigns probabilities for each label
cytofQC is a package for initial cleaning of CyTOF data. It uses a semi-supervised approach for labeling cells with their most likely data type (bead, doublet, debris, dead) and the probability that they belong to each label type. This package does not remove data from the dataset, but provides labels and information to aid the data user in cleaning their data. Our algorithm is able to distinguish between doublets and large cells.
Last updated 4 months ago
softwaresinglecellannotation
4.30 score 2 stars 3 scripts 206 downloads
HiCool - HiCool
HiCool provides an R interface to process and normalize Hi-C paired-end fastq reads into .(m)cool files. .(m)cool is a compact, indexed HDF5 file format specifically tailored for efficiently storing HiC-based data. On top of processing fastq reads, HiCool provides a convenient reporting function to generate shareable reports summarizing Hi-C experiments and including quality controls.
Last updated 4 months ago
hicdna3dstructuredataimport
4.30 score 2 stars 10 scripts 148 downloads
microSTASIS - Microbiota STability ASsessment via Iterative cluStering
The toolkit 'µSTASIS', or microSTASIS, has been developed for the stability analysis of microbiota in a temporal framework by leveraging iterative clustering. Concretely, the core function uses the Hartigan-Wong k-means algorithm as many times as possible for stressing out paired samples from the same individuals to test if they remain together for multiple numbers of clusters over a whole data set of individuals. Moreover, the package includes multiple functions to subset samples from paired times, validate the results or visualize the output.
Last updated 4 months ago
geneticvariabilitybiomedicalinformaticsclusteringmultiplecomparisonmicrobiome
4.30 score 2 stars 1 scripts 156 downloads
ggtreeDendro - Drawing 'dendrogram' using 'ggtree'
Offers a set of 'autoplot' methods to visualize tree-like structures (e.g., hierarchical clustering and classification/regression trees) using 'ggtree'. You can adjust graphical parameters using grammar of graphics syntax and integrate external data into the tree.
Last updated 4 months ago
clusteringclassificationdecisiontreephylogeneticsvisualization
4.30 score 10 scripts 175 downloads
consICA - consensus Independent Component Analysis
consICA implements a data-driven deconvolution method, consensus independent component analysis (ICA), to decompose heterogeneous omics data and extract features suitable for patient diagnostics and prognostics. The method separates biologically relevant transcriptional signals from technical effects and provides information about the cellular composition and biological processes. The implementation of parallel computing in the package ensures efficient analysis on modern multicore systems.
Last updated 4 months ago
technologystatisticalmethodsequencingrnaseqtranscriptomicsclassificationfeatureextraction
4.30 score 2 scripts 362 downloads
regioneReloaded - RegioneReloaded: Multiple Association for Genomic Region Sets
RegioneReloaded is a package that allows simultaneous analysis of associations between genomic region sets, enabling clustering of data and the creation of ready-to-publish graphs. It takes over and expands on all the features of its predecessor regioneR. It also incorporates a strategy to improve p-value calculations and normalize z-scores coming from multiple analyses to allow for their direct comparison. RegioneReloaded builds upon regioneR by adding new plotting functions for obtaining publication-ready graphs.
Last updated 4 months ago
geneticschipseqdnaseqmethylseqcopynumbervariationclusteringmultiplecomparison
4.30 score 5 stars 2 scripts 204 downloads
crisprBwa - BWA-based alignment of CRISPR gRNA spacer sequences
Provides a user-friendly interface to map on-targets and off-targets of CRISPR gRNA spacer sequences using bwa. The alignment is fast, and can be performed using either commonly-used or custom CRISPR nucleases. The alignment can work with any reference or custom genomes. Currently not supported on Windows machines.
Last updated 4 months ago
crisprfunctionalgenomicsalignmentalignerbioconductorbioconductor-packagebwacrispr-analysiscrispr-cas9crispr-designcrispr-targetgrnagrna-sequencegrna-sequencessgrnasgrna-design
4.30 score 1 stars 6 scripts 162 downloads
qmtools - Quantitative Metabolomics Data Processing Tools
The qmtools (quantitative metabolomics tools) package provides basic tools for processing quantitative metabolomics data with the standard SummarizedExperiment class. This includes functions for imputation, normalization, feature filtering, feature clustering, dimension-reduction, and visualization to help users prepare data for statistical analysis. This package also offers a convenient way to compute empirical Bayes statistics for which metabolic features are different between two sets of study samples. Several functions in this package could also be used in other types of omics data.
Last updated 4 months ago
metabolomicspreprocessingnormalizationdimensionreductionmassspectrometry
4.30 score 1 stars 5 scripts 151 downloads
protGear - Protein Micro Array Data Management and Interactive Visualization
A generic three-step pre-processing package for protein microarray data. This package contains different data pre-processing procedures to allow comparison of their performance. These steps are background correction, coefficient of variation (CV) based filtering, batch correction and normalization.
Last updated 4 months ago
microarrayonechannelpreprocessingbiomedicalinformaticsproteomicsbatcheffectnormalizationbayesianclusteringregressionsystemsbiologyimmunooncologybackground-correctionmicroarray-datanormalisationproteomics-datashinyshinydashboard
4.30 score 1 stars 6 scripts 302 downloads
biodbHmdb - biodbHmdb, a library for connecting to the HMDB Database
The biodbHmdb library is an extension of the biodb framework package that provides access to the HMDB Metabolites database. It allows downloading the whole HMDB Metabolites database locally, accessing entries, and searching for entries by name or description. A future version of this package will also include a search by mass and mass spectra annotation.
Last updated 4 months ago
softwareinfrastructuredataimportcpp
4.30 score 2 stars 2 scripts 202 downloads
txcutr - Transcriptome CUTteR
Various mRNA sequencing library preparation methods generate sequencing reads specifically from the transcript ends. Analyses that focus on quantification of isoform usage from such data can be aided by using truncated versions of transcriptome annotations, both at the alignment or pseudo-alignment stage, as well as in downstream analysis. This package implements some convenience methods for readily generating such truncated annotations and their corresponding sequences.
Last updated 4 months ago
alignmentannotationrnaseqsequencingtranscriptomics
4.30 score 9 scripts 134 downloads
mosbi - Molecular Signature identification using Biclustering
This package is an implementation of the biclustering ensemble method MoSBi (Molecular signature Identification from Biclustering). MoSBi provides standardized interfaces for biclustering results and can combine their results with a multi-algorithm ensemble approach to compute robust ensemble biclusters on molecular omics data. This is done by computing similarity networks of biclusters and filtering for overlaps using a custom error model. After that, the Louvain modularity is used to extract bicluster communities from the similarity network, which can then be converted to ensemble biclusters. Additionally, MoSBi includes several network visualization methods to give an intuitive and scalable overview of the results. MoSBi comes with several biclustering algorithms, but can be easily extended to new biclustering algorithms.
Last updated 4 months ago
softwarestatisticalmethodclusteringnetworkcpp
4.30 score 8 scripts 178 downloads
biodbUniprot - biodbUniprot, a library for connecting to the Uniprot Database
The biodbUniprot library is an extension of the biodb framework package. It provides access to the UniProt database. It allows retrieving entries by their accession number and running web service queries to search for entries.
Last updated 4 months ago
softwareinfrastructuredataimport
4.30 score 2 stars 3 scripts 170 downloads
spatzie - Identification of enriched motif pairs from chromatin interaction data
Identifies motifs that are significantly co-enriched from enhancer-promoter interaction data. While enhancer-promoter annotation is commonly used to define groups of interaction anchors, spatzie also supports co-enrichment analysis between preprocessed interaction anchors. Supports BEDPE interaction data derived from genome-wide assays such as HiC, ChIA-PET, and HiChIP. Can also be used to look for differentially enriched motif pairs between two interaction experiments.
Last updated 4 months ago
dna3dstructuregeneregulationpeakdetectionepigeneticsfunctionalgenomicsclassificationhictranscription
4.30 score 5 scripts 130 downloads
cageminer - Candidate Gene Miner
This package aims to integrate GWAS-derived SNPs and coexpression networks to mine candidate genes associated with a particular phenotype. For that, users must define a set of guide genes, which are known genes involved in the studied phenotype. Additionally, the mined candidates can be given a score that favors candidates that are hubs and/or transcription factors. The scores can then be used to rank and select the top n most promising genes for downstream experiments.
Last updated 4 months ago
softwaresnpfunctionalpredictiongenomewideassociationgeneexpressionnetworkenrichmentvariantannotationfunctionalgenomicsnetwork
4.30 score 1 stars 2 scripts 139 downloads
m6Aboost - m6Aboost
This package helps users run the m6Aboost model on their own miCLIP2 data. The package includes functions to assign the read counts and get the features needed to run the m6Aboost model. The miCLIP2 data should be stored in a GRanges object. More details can be found in the vignette.
Last updated 4 months ago
sequencingepigeneticsgeneticsexperimenthubsoftware
4.30 score 2 stars 5 scripts 159 downloads
MSstatsLOBD - Assay characterization: estimation of limit of blank (LOB) and limit of detection (LOD)
The MSstatsLOBD package allows calculation and visualization of the limit of blank (LOB) and limit of detection (LOD). We define the LOB as the highest apparent concentration of a peptide expected when replicates of a blank sample containing no peptides are measured. The LOD is defined as the measured concentration value for which the probability of falsely claiming the absence of a peptide in the sample is 0.05, given a probability of 0.05 of falsely claiming its presence. These functionalities were previously a part of the MSstats package. The methodology is described in Galitzine (2018) <doi:10.1074/mcp.RA117.000322>.
Last updated 4 months ago
immunooncologymassspectrometryproteomicssoftwaredifferentialexpressiononechanneltwochannelnormalizationqualitycontrolmass-spectrometry
4.30 score 1 scripts 186 downloads
wppi - Weighting protein-protein interactions
Protein-protein interaction data is essential for omics data analysis and modeling. Database knowledge is general, not specific to cell type, physiological condition or any other context determining which connections are functional and contribute to the signaling. Functional annotations such as Gene Ontology and Human Phenotype Ontology might help to evaluate the relevance of interactions. This package predicts the functional relevance of protein-protein interactions based on functional annotations such as Human Phenotype Ontology and Gene Ontology, and prioritizes genes based on network topology, functional scores and a path search algorithm.
Last updated 4 months ago
graphandnetworknetworkpathwayssoftwaregenesignalinggenetargetsystemsbiologytranscriptomicsannotationgene-ontologygene-prioritizationhuman-phenotype-ontologyomnipathppi-networksrandom-walk-with-restart
4.30 score 1 stars 4 scripts 229 downloads
cyanoFilter - Phytoplankton Population Identification using Cell Pigmentation and/or Complexity
An approach to filter out and/or identify phytoplankton cells from all particles measured via flow cytometry, using pigment and cell complexity information. It does this using a sequence of one-dimensional gates on pre-defined channels measuring certain pigmentation and complexity. The package is especially tuned for cyanobacteria, but will work fine for phytoplankton communities where there is at least one cell characteristic that differentiates every phytoplankton in the community.
Last updated 4 months ago
flowcytometryclusteringonechannel
4.30 score 4 scripts 260 downloads
CIMICE - CIMICE-R: (Markov) Chain Method to Infer Cancer Evolution
CIMICE is a tool in the field of tumor phylogenetics. Its goal is to build a Markov chain (called the Cancer Progression Markov Chain, CPMC) in order to model the evolution of tumor subtypes. The input of CIMICE is a Mutational Matrix, that is, a Boolean matrix representing altered genes in a collection of samples. These samples are assumed to be obtained with single-cell DNA analysis techniques, and the tool is specifically written to use the peculiarities of this data for the CPMC construction.
Last updated 4 months ago
softwarebiologicalquestionnetworkinferenceresearchfieldphylogeneticsstatisticalmethodgraphandnetworktechnologysinglecell
4.30 score 5 scripts 172 downloads
geva - Gene Expression Variation Analysis (GEVA)
Statistical methods to evaluate variations of differential expression (DE) between multiple biological conditions. It takes into account the fold-changes and p-values from previous DE results that use large-scale data (e.g., microarray and RNA-seq) and evaluates which genes would react in response to the distinct experiments. This evaluation involves a unique pipeline of statistical methods, including weighted summarization, quantile detection, cluster analysis, and ANOVA tests, in order to classify a subset of relevant genes whose DE is similar or dependent on certain biological factors.
Last updated 4 months ago
classificationdifferentialexpressiongeneexpressionmicroarraymultiplecomparisonrnaseqsystemsbiologytranscriptomics
4.30 score 2 stars 4 scripts 179 downloads
MACSr - MACS: Model-based Analysis for ChIP-Seq
The Model-based Analysis of ChIP-Seq (MACS) is a widely used toolkit for identifying transcription factor binding sites. This package is an R wrapper of the latest MACS3.
Last updated 4 months ago
softwarechipseqatacseqimmunooncology
4.30 score 7 scripts 277 downloads
censcyt - Differential abundance analysis with a right censored covariate in high-dimensional cytometry
Methods for differential abundance analysis in high-dimensional cytometry data when a covariate is subject to right censoring (e.g. survival time) based on multiple imputation and generalized linear mixed models.
Last updated 4 months ago
immunooncologyflowcytometryproteomicssinglecellcellbasedassayscellbiologyclusteringfeatureextractionsoftwaresurvival
4.30 score 2 scripts 186 downloads
ADImpute - Adaptive Dropout Imputer (ADImpute)
Single-cell RNA sequencing (scRNA-seq) methods are typically unable to quantify the expression levels of all genes in a cell, creating a need for the computational prediction of missing values (‘dropout imputation’). Most existing dropout imputation methods are limited in the sense that they exclusively use the scRNA-seq dataset at hand and do not exploit external gene-gene relationship information. Here we propose two novel methods: a gene regulatory network-based approach using gene-gene relationships learnt from external data and a baseline approach corresponding to a sample-wide average. ADImpute can implement these novel methods and also combine them with existing imputation methods (currently supported: DrImpute, SAVER). ADImpute can learn the best performing method per gene and combine the results from different methods into an ensemble.
Last updated 4 months ago
geneexpressionnetworkpreprocessingsequencingsinglecelltranscriptomics
4.30 score 7 scripts 384 downloads
AnVILBilling - Provide functions to retrieve and report on usage expenses in NHGRI AnVIL (anvilproject.org).
AnVILBilling helps monitor AnVIL-related costs in R, using queries to a BigQuery table to which costs are exported daily. Functions are defined to help categorize tasks and associated expenditures, and to visualize and explore expense profiles over time. This package will be expanded to help users estimate costs for specific task sets.
Last updated 4 months ago
infrastructuresoftware
4.30 score 5 scripts 181 downloads
gmoviz - Seamless visualization of complex genomic variations in GMOs and edited cell lines
Genetically modified organisms (GMOs) and cell lines are widely used models in all kinds of biological research. As part of characterising these models, DNA sequencing technology and bioinformatics analyses are used systematically to study their genomes. As a result, large volumes of data are generated and various algorithms are applied to analyse this data, which makes it challenging to represent all findings in an informative and concise manner. `gmoviz` provides users with an easy way to visualise and facilitate the explanation of complex genomic editing events on a larger, biologically-relevant scale.
Last updated 4 months ago
visualizationsequencinggeneticvariabilitygenomicvariationcoverage
4.30 score 9 scripts 162 downloads
CeTF - Coexpression for Transcription Factors using Regulatory Impact Factors and Partial Correlation and Information Theory analysis
This package provides the necessary functions for performing the Partial Correlation coefficient with Information Theory (PCIT) (Reverter and Chan 2008) and Regulatory Impact Factors (RIF) (Reverter et al. 2010) algorithms. The PCIT algorithm identifies meaningful correlations to define edges in a weighted network and can be applied to any correlation-based network, including but not limited to gene co-expression networks, while the RIF algorithm identifies critical transcription factors (TFs) from gene expression data. Combined, these two algorithms provide a very relevant layer of information for gene expression studies (microarray, RNA-seq and single-cell RNA-seq data).
Last updated 4 months ago
sequencingrnaseqmicroarraygeneexpressiontranscriptionnormalizationdifferentialexpressionsinglecellnetworkregressionchipseqimmunooncology