Package: DaMiRseq 2.17.0

Mattia Chiesa

DaMiRseq: Data Mining for RNA-seq data: normalization, feature selection and classification

The DaMiRseq package offers a tidy pipeline of data mining procedures to identify transcriptional biomarkers and exploit them for both binary and multi-class classification. The package accepts any kind of data presented as a table of raw counts and allows including both continuous and factorial variables associated with the experimental setting. A series of functions enables the user to clean up the data by filtering genomic features and samples, to adjust the data by identifying and removing unwanted sources of variation (i.e. batches and confounding factors), and to select the best predictors for modeling. Finally, a "stacking" ensemble learning technique is applied to build a robust classification model. Every step includes a checkpoint that the user may use to assess the effects of data management by inspecting diagnostic plots, such as clustering and heatmaps, RLE boxplots, MDS, or correlation plots.

Authors: Mattia Chiesa, Luca Piacentini

DaMiRseq_2.17.0.tar.gz
DaMiRseq_2.17.0.zip (r-4.5), DaMiRseq_2.17.0.zip (r-4.4), DaMiRseq_2.17.0.zip (r-4.3)
DaMiRseq_2.17.0.tgz (r-4.4-any), DaMiRseq_2.17.0.tgz (r-4.3-any)
DaMiRseq_2.17.0.tar.gz (r-4.5-noble), DaMiRseq_2.17.0.tar.gz (r-4.4-noble)
DaMiRseq_2.17.0.tgz (r-4.4-emscripten), DaMiRseq_2.17.0.tgz (r-4.3-emscripten)
DaMiRseq.pdf | DaMiRseq.html
DaMiRseq/json (API)
NEWS

# Install 'DaMiRseq' in R:
install.packages('DaMiRseq', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))
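
Once the package is installed, the clean-up and adjustment steps described in the summary above might be chained roughly as follows. This is a minimal sketch, not the authors' reference workflow: the function names and the SE dataset come from this page, but the argument names and values (minCounts, fSample, hyper, th.corr, n.sv) are assumptions recalled from the package vignette and should be checked against the help pages of the installed release.

```r
# Sketch of the filtering / normalization / adjustment steps.
# ASSUMPTION: argument names and values below are not stated on this page;
# verify them with ?DaMiR.normalization, ?DaMiR.sampleFilt, ?DaMiR.SV, ?DaMiR.SVadjust.
if (requireNamespace("DaMiRseq", quietly = TRUE)) {
  library(DaMiRseq)

  data(SE)  # example gene-expression dataset listed on this page

  # Filter non-expressed features and normalize the counts.
  data_norm <- DaMiR.normalization(SE, minCounts = 10, fSample = 0.7, hyper = "no")

  # Drop samples that correlate poorly with the rest.
  data_filt <- DaMiR.sampleFilt(data_norm, th.corr = 0.9)

  # Identify surrogate variables, then remove their effect.
  # ASSUMPTION: DaMiR.SV returns a matrix with one column per surrogate variable.
  sv <- DaMiR.SV(data_filt)
  data_adjust <- DaMiR.SVadjust(data_filt, sv, n.sv = min(4, ncol(sv)))

  ran_pipeline <- TRUE
} else {
  message("DaMiRseq is not installed; skipping the sketch.")
  ran_pipeline <- FALSE
}
```

The guard with requireNamespace() keeps the snippet from failing on machines where the package or its Java dependency is missing.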

Uses libs:
  • openjdk - OpenJDK Java runtime, using Hotspot JIT
Datasets:
  • SE - Example gene-expression dataset for DaMiRseq package
  • SEtest_norm - A sample dataset with a normalized count matrix for "testthat" functions.
  • data_min - Example gene-expression dataset for DaMiRseq package
  • data_norm - A dataset with a normalized matrix to test several DaMiRseq functions: sample data are a subset of Genotype-Tissue Expression (GTEx) RNA-Seq database (dbGap Study Accession: phs000424.v6.p1)
  • data_reduced - Example gene-expression dataset for DaMiRseq package
  • data_relief - Example ranking dataset for DaMiRseq package
  • df - Example gene-expression dataset for DaMiRseq package
  • selected_features - Example gene-expression dataset for DaMiRseq package
  • sv - Example Surrogate Variables dataset for DaMiRseq package
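
The bundled datasets listed above can be loaded with base R's data() function. The sketch below loads them into a scratch environment and prints the class of each object; it assumes that every listed name is loadable by that name from the installed package, which this page does not confirm.

```r
# Dataset names copied from the list on this page.
dataset_names <- c("SE", "SEtest_norm", "data_min", "data_norm", "data_reduced",
                   "data_relief", "df", "selected_features", "sv")

if (requireNamespace("DaMiRseq", quietly = TRUE)) {
  # Load into a separate environment so the workspace is not overwritten
  # (note that "df" would otherwise mask the base function stats::df).
  env <- new.env()
  data(list = dataset_names, package = "DaMiRseq", envir = env)

  # Report what was actually loaded and the class of each object.
  for (nm in ls(env)) {
    cat(nm, ":", class(get(nm, envir = env))[1], "\n")
  }
} else {
  message("DaMiRseq is not installed; skipping the dataset check.")
}
```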

On Bioconductor: DaMiRseq-2.17.0 (bioc 3.20), DaMiRseq-2.16.0 (bioc 3.19)

This package does not link to any GitHub/GitLab/R-forge repository. No issue tracker or development information is available.

bioconductor-package

24 exports, 1.08 score, 242 dependencies, 1 dependent, 7 mentions

Last updated 2 months ago from:7961b11b2c

Exports: DaMiR.Allplot, DaMiR.Clustplot, DaMiR.corrplot, DaMiR.EnsembleLearning, DaMiR.EnsembleLearning2cl, DaMiR.EnsembleLearningNcl, DaMiR.EnsL_Predict, DaMiR.EnsL_Test, DaMiR.EnsL_Train, DaMiR.FBest, DaMiR.FReduct, DaMiR.FSelect, DaMiR.FSort, DaMiR.goldenDice, DaMiR.iTSadjust, DaMiR.iTSnorm, DaMiR.makeSE, DaMiR.MDSplot, DaMiR.ModelSelect, DaMiR.normalization, DaMiR.sampleFilt, DaMiR.SV, DaMiR.SVadjust, DaMiR.transpose

Dependencies: abind, annotate, AnnotationDbi, arm, aroma.light, askpass, backports, base64enc, bdsmatrix, BH, Biobase, BiocFileCache, BiocGenerics, BiocIO, BiocManager, BiocParallel, biomaRt, Biostrings, bit, bit64, bitops, blob, boot, broom, bslib, cachem, car, carData, caret, checkmate, class, cli, clock, cluster, coda, codetools, colorspace, corrplot, cowplot, cpp11, crayon, crosstalk, curl, data.table, DBI, dbplyr, DelayedArray, deldir, Deriv, DESeq2, diagram, digest, doBy, dplyr, DT, e1071, EDASeq, edgeR, ellipse, ellipsis, emmeans, entropy, estimability, evaluate, FactoMineR, fansi, farver, fastmap, filelock, flashClust, fontawesome, foreach, foreign, formatR, Formula, fs, FSelector, futile.logger, futile.options, future, future.apply, genalg, genefilter, generics, GenomeInfoDb, GenomeInfoDbData, GenomicAlignments, GenomicFeatures, GenomicRanges, ggplot2, ggrepel, globals, glue, gower, gridExtra, gtable, hardhat, highr, Hmisc, hms, htmlTable, htmltools, htmlwidgets, httpuv, httr, httr2, hwriter, igraph, ineq, interp, ipred, IRanges, isoband, iterators, jpeg, jquerylib, jsonlite, KEGGREST, KernSmooth, kknn, knitr, labeling, lambda.r, later, lattice, latticeExtra, lava, lazyeval, leaps, lifecycle, limma, listenv, lme4, locfit, lubridate, magrittr, MASS, Matrix, MatrixGenerics, MatrixModels, matrixStats, memoise, mgcv, microbenchmark, mime, minqa, ModelMetrics, modelr, multcompView, munsell, mvtnorm, nlme, nloptr, nnet, numDeriv, openssl, parallelly, pbkrtest, pheatmap, pillar, pkgconfig, plogr, pls, plsVarSel, plyr, png, praznik, prettyunits, pROC, prodlim, progress, progressr, promises, proxy, purrr, pwalign, quantreg, R.methodsS3, R.oo, R.utils, R6, randomForest, rappdirs, RColorBrewer, Rcpp, RcppArmadillo, RcppEigen, RCurl, recipes, reshape2, restfulr, Rhtslib, rJava, rjson, rlang, rmarkdown, rpart, Rsamtools, RSNNS, RSQLite, rstudioapi, rtracklayer, RWeka, RWekajars, S4Arrays, S4Vectors, sass, scales, scatterplot3d, shape, ShortRead, snow, SparseArray, SparseM, SQUAREM, statmod, stringi, stringr, SummarizedExperiment, survival, sva, sys, tibble, tidyr, tidyselect, timechange, timeDate, tinytex, tzdb, UCSC.utils, utf8, vctrs, viridis, viridisLite, withr, xfun, XML, xml2, xtable, XVector, yaml, zlibbioc

Data Mining for RNA-seq data: normalization, features selection and classification - DaMiRseq package

Rendered from DaMiRseq.Rnw using knitr::knitr on Jun 30 2024.

Last update: 2021-08-09
Started: 2017-02-09

Readme and manuals

Help Manual

Help page | Topics
Quality assessment and visualization of expression data | DaMiR.Allplot
Expression data clustering and heatmap | DaMiR.Clustplot
Correlation Plot | DaMiR.corrplot
Build Classifier using 'Stacking' Ensemble Learning strategy. | DaMiR.EnsembleLearning
Build a Binary Classifier using 'Stacking' Learning strategy. | DaMiR.EnsembleLearning2cl
Build a Multi-Class Classifier using 'Stacking' Learning strategy. | DaMiR.EnsembleLearningNcl
Predict new samples class | DaMiR.EnsL_Predict
Test Binary Classifiers | DaMiR.EnsL_Test
Train a Binary Classifier using 'Stacking' Learning strategy. | DaMiR.EnsL_Train
Select best predictors to build Classification Model | DaMiR.FBest
Remove highly correlated features, based on feature-per-feature correlation. | DaMiR.FReduct
Feature selection for classification | DaMiR.FSelect
Order features by importance, using RReliefF filter | DaMiR.FSort
Generate a Number to Set Seed | DaMiR.goldenDice
Batch correction of normalized Independent Test Set | DaMiR.iTSadjust
Normalization of Independent Test Set | DaMiR.iTSnorm
Import RNA-Seq count data and variables | DaMiR.makeSE
Plot multidimensional scaling (MDS) | DaMiR.MDSplot
Select the best classification model | DaMiR.ModelSelect
Filter non Expressed and 'Hypervariant' features and Data Normalization | DaMiR.normalization
Filter Samples by Mean Correlation Distance Metric | DaMiR.sampleFilt
Identification of Surrogate Variables | DaMiR.SV
Remove variable effects from expression data | DaMiR.SVadjust
Matrix transposition and replacement of '.' and '-' special characters | DaMiR.transpose
Example gene-expression dataset for DaMiRseq package | data_min
A dataset with a normalized matrix to test several DaMiRseq functions: sample data are a subset of Genotype-Tissue Expression (GTEx) RNA-Seq database (dbGap Study Accession: phs000424.v6.p1) | data_norm
Example gene-expression dataset for DaMiRseq package | data_reduced
Example ranking dataset for DaMiRseq package | data_relief
Example gene-expression dataset for DaMiRseq package | df
Example gene-expression dataset for DaMiRseq package | SE
Example gene-expression dataset for DaMiRseq package | selected_features
A sample dataset with a normalized count matrix for "testthat" functions. | SEtest_norm
Example Surrogate Variables dataset for DaMiRseq package | sv
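
Because this index lists only the topic names, it can help to inspect each function's argument list in the installed release before calling it. The sketch below does that with base R introspection for a handful of topics from the help index; the selection of topics is arbitrary and only illustrative.

```r
# A few topic names copied from the help index (an arbitrary subset).
topics <- c("DaMiR.makeSE", "DaMiR.normalization", "DaMiR.sampleFilt",
            "DaMiR.SV", "DaMiR.SVadjust")

if (requireNamespace("DaMiRseq", quietly = TRUE)) {
  for (fn in topics) {
    # Fetch the exported function without attaching the package.
    f <- getExportedValue("DaMiRseq", fn)
    # Print the function name followed by its formal argument names.
    cat(fn, "(", paste(names(formals(f)), collapse = ", "), ")\n", sep = "")
  }
  # For the full documentation of one topic, for example:
  # help("DaMiR.normalization", package = "DaMiRseq")
} else {
  message("DaMiRseq is not installed; skipping the argument check.")
}
```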