Package: RAIDS 1.3.0

Pascal Belleau

RAIDS: Accurate Inference of Genetic Ancestry from Cancer Sequences

This package implements specialized algorithms that enable genetic ancestry inference from various cancer sequence sources (RNA, Exome and Whole-Genome sequences). It also implements a simulation algorithm that generates synthetic cancer-derived data. The code and analysis pipeline were designed and developed for the following publication: Belleau, P et al. Genetic Ancestry Inference from Cancer-Derived Molecular Data across Genomic and Transcriptomic Platforms. Cancer Res 1 January 2023; 83 (1): 49–58.

Authors: Pascal Belleau [cre, aut], Astrid Deschênes [aut], David A. Tuveson [aut], Alexander Krasnitz [aut]

RAIDS_1.3.0.tar.gz
RAIDS_1.3.0.zip (r-4.5), RAIDS_1.3.0.zip (r-4.4), RAIDS_1.3.0.zip (r-4.3)
RAIDS_1.3.0.tgz (r-4.4-any), RAIDS_1.3.0.tgz (r-4.3-any)
RAIDS_1.3.0.tar.gz (r-4.5-noble), RAIDS_1.3.0.tar.gz (r-4.4-noble)
RAIDS_1.3.0.tgz (r-4.4-emscripten), RAIDS_1.3.0.tgz (r-4.3-emscripten)
RAIDS.pdf | RAIDS.html
RAIDS/json (API)
NEWS

# Install 'RAIDS' in R:
install.packages('RAIDS', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/krasnitzlab/raids/issues

Datasets:
  • demoKnownSuperPop1KG - The known super population ancestry of the demo 1KG reference profiles.
  • demoPCA1KG - The PCA results of the demo 1KG reference dataset, for demonstration purposes. Beware that the PCA has been run on a very small subset of the 1KG reference dataset and should not be used to call ancestry inference on a real profile.
  • demoPCASyntheticProfiles - The PCA result of demo synthetic profiles projected on the demo subset 1KG reference PCA.
  • demoPedigreeEx1 - The pedigree information about a demo profile called 'ex1'.
  • matKNNSynthetic - A small 'data.frame' containing the inferred ancestry on the synthetic profiles.
  • pedSynthetic - A small 'data.frame' containing the information related to synthetic profiles. The ancestry of the profiles used to generate the synthetic profiles must be present.
  • snpPositionDemo - A small 'data.frame' containing the SNV information.

On Bioconductor: RAIDS-1.3.0 (bioc 3.20), RAIDS-1.2.0 (bioc 3.19)

bioconductor-package

28 exports 1.51 score 145 dependencies

Last updated 2 months ago from: 8c31b43b86

Exports: add1KG2SampleGDS, addGeneBlockGDSRefAnnot, addRef2GDS1KG, addStudy1Kg, computeAncestryFromSyntheticFile, computeKNNRefSample, computeKNNRefSynthetic, computePCAMultiSynthetic, computePCARefSample, computePoolSyntheticAncestryGr, computeSyntheticROC, createStudy2GDS1KG, estimateAllelicFraction, generateGDS1KG, generateMapSnvSel, generatePhase1KG2GDS, getRef1KGPop, groupChr1KGSNV, identifyRelative, prepPed1KG, prepSynthetic, pruningSample, runExomeAncestry, runRNAAncestry, select1KGPop, snvListVCF, splitSelectByPop, syntheticGeno

Dependencies: abind, AnnotationDbi, AnnotationFilter, askpass, backports, BH, Biobase, BiocGenerics, BiocIO, BiocParallel, Biostrings, bit, bit64, bitops, blob, boot, broom, BSgenome, cachem, class, cli, clipr, codetools, cpp11, crayon, curl, data.table, DBI, DelayedArray, DNAcopy, dplyr, ensembldb, fansi, fastmap, forcats, foreach, formatR, formula.tools, futile.logger, futile.options, gdsfmt, generics, GENESIS, GenomeInfoDb, GenomeInfoDbData, GenomicAlignments, GenomicFeatures, GenomicRanges, glmnet, glue, GWASExactHW, GWASTools, haven, hms, httr, igraph, IRanges, iterators, jomo, jsonlite, KEGGREST, lambda.r, lattice, lazyeval, lifecycle, lme4, lmtest, logistf, magrittr, MASS, Matrix, MatrixGenerics, MatrixModels, matrixStats, memoise, mgcv, mice, mime, minqa, mitml, nlme, nloptr, nnet, numDeriv, openssl, operator.tools, ordinal, pan, pillar, pkgconfig, plogr, plyr, png, prettyunits, pROC, progress, ProtGenerics, purrr, quantreg, quantsmooth, R6, Rcpp, RcppEigen, RCurl, readr, reshape2, restfulr, Rhtslib, rjson, rlang, rpart, Rsamtools, RSQLite, rtracklayer, S4Arrays, S4Vectors, sandwich, SeqArray, SeqVarTools, shape, snow, SNPRelate, SparseArray, SparseM, stringi, stringr, SummarizedExperiment, survival, sys, tibble, tidyr, tidyselect, tzdb, ucminf, UCSC.utils, utf8, VariantAnnotation, vctrs, vroom, withr, XML, XVector, yaml, zlibbioc, zoo

Accurate Inference of Genetic Ancestry from Cancer-derived Sequences

Rendered from RAIDS.Rmd using knitr::rmarkdown on Jun 30 2024.

Last update: 2023-10-12
Started: 2023-03-30

Population reference dataset GDS files

Rendered from Create_Reference_GDS_File.Rmd using knitr::rmarkdown on Jun 30 2024.

Last update: 2023-09-29
Started: 2023-08-14

Readme and manuals

Help Manual

Help page | Topics
RAIDS: Accurate Inference of Genetic Ancestry from Cancer Sequences | RAIDS-package, RAIDS
Add the genotype information for the list of pruned SNVs into the Profile GDS file | add1KG2SampleGDS
Append information associated to blocks, as indexes, into the Population Reference SNV Annotation GDS file | addGeneBlockGDSRefAnnot
Add the information about the unrelated patients to the Reference GDS file | addRef2GDS1KG
Append information about the 1KG samples into the Profile GDS file | addStudy1Kg
Select the optimal K and D parameters using the synthetic data and infer the ancestry of a specific profile | computeAncestryFromSyntheticFile
Run a k-nearest neighbors analysis on one specific profile | computeKNNRefSample
Run a k-nearest neighbors analysis on a subset of the synthetic dataset | computeKNNRefSynthetic
Project synthetic profiles onto existing principal component axes generated using the reference 1KG profiles | computePCAMultiSynthetic
Project specified profile onto PCA axes generated using known reference profiles | computePCARefSample
Run a PCA analysis and a K-nearest neighbors analysis on a small set of synthetic data using all 1KG profiles except the ones used to generate the synthetic profiles | computePoolSyntheticAncestryGr
Calculate the AUROC of the inferences for specific values of D and K using the inferred ancestry results from the synthetic profiles. | computeSyntheticROC
Create the Profile GDS file(s) for one or multiple specific profiles using the information from a RDS Sample description file and the 1KG GDS file | createStudy2GDS1KG
The known super population ancestry of the demo 1KG reference profiles. | demoKnownSuperPop1KG
The PCA results of the demo 1KG reference dataset, for demonstration purposes. Beware that the PCA has been run on a very small subset of the 1KG reference dataset and should not be used to call ancestry inference on a real profile. | demoPCA1KG
The PCA result of demo synthetic profiles projected on the demo subset 1KG reference PCA. | demoPCASyntheticProfiles
The pedigree information about a demo profile called 'ex1'. | demoPedigreeEx1
Estimate the allelic fraction of the pruned SNVs for a specific profile | estimateAllelicFraction
Generate the GDS file that will contain the information from the Reference data set | generateGDS1KG
Generate the filter SNP information file in RDS format | generateMapSnvSel
Add the phase information into the Reference GDS file | generatePhase1KG2GDS
Extract the specified column from the 1KG GDS 'sample.ref' node for the reference profiles (real ancestry assignation) | getRef1KGPop
Merge the genotyping files per chromosome into one file | groupChr1KGSNV
Identify genetically unrelated patients in GDS Reference file | identifyRelative
A small 'data.frame' containing the inferred ancestry on the synthetic profiles. | matKNNSynthetic
A small 'data.frame' containing the information related to synthetic profiles. The ancestry of the profiles used to generate the synthetic profiles must be present. | pedSynthetic
Prepare the pedigree file using pedigree information from Reference | prepPed1KG
Add information related to the synthetic profiles (study and synthetic reference profiles information) into a Profile GDS file | prepSynthetic
Compute the list of pruned SNVs for a specific profile using the information from the Reference GDS file and a linkage disequilibrium analysis | pruningSample
Run most steps leading to the ancestry inference call on a specific exome profile | runExomeAncestry
Run most steps leading to the ancestry inference call on a specific RNA profile | runRNAAncestry
Random selection of a specific number of reference profiles in each subcontinental population present in the 1KG GDS file | select1KGPop
A small 'data.frame' containing the SNV information. | snpPositionDemo
Generate a VCF with the information from the SNPs that pass a cut-off threshold | snvListVCF
Group samples per subcontinental population | splitSelectByPop
Generate synthetic profiles for each cancer profile and 1KG reference profile combination and add them to the Profile GDS file | syntheticGeno