Package: SeqArray 1.53.1

Xiuwen Zheng

SeqArray: Data management of large-scale whole-genome sequence variant calls using GDS files

Data management of large-scale whole-genome sequencing variant calls with thousands of individuals: genotypic data (e.g., SNVs, indels and structural variation calls) and annotations in SeqArray GDS files are stored in an array-oriented and compressed manner, with efficient data access using the R programming language.

Authors: Xiuwen Zheng [aut, cre], Stephanie Gogarten [aut], David Levine [ctb], Cathy Laurie [ctb]

SeqArray_1.53.1.tar.gz
SeqArray_1.53.1.zip (r-4.7) | SeqArray_1.53.1.zip (r-4.6) | SeqArray_1.53.1.zip (r-4.5)
SeqArray_1.53.1.tgz (r-4.6-x86_64) | SeqArray_1.53.1.tgz (r-4.6-arm64) | SeqArray_1.53.1.tgz (r-4.5-x86_64) | SeqArray_1.53.1.tgz (r-4.5-arm64)
SeqArray_1.53.1.tar.gz (r-4.7-arm64) | SeqArray_1.53.1.tar.gz (r-4.7-x86_64) | SeqArray_1.53.1.tar.gz (r-4.6-arm64) | SeqArray_1.53.1.tar.gz (r-4.6-x86_64)
SeqArray_1.53.1.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
SeqArray/json (API)

# Install 'SeqArray' in R:
install.packages('SeqArray', repos = c('https://bioc.r-universe.dev', 'https://cloud.r-project.org'))
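
Once the package is installed, a quick way to confirm that it loads and can read a GDS file is to open the example file that ships with it. The following is a minimal sketch; the variable names (`gds_path`, `sample_ids`, `variant_ids`) are illustrative only.

```r
library(SeqArray)

# Path to the example SeqArray GDS file bundled with the package
gds_path <- seqExampleFileName("gds")

# Open the file (read-only by default)
gds <- seqOpen(gds_path)

# Read the sample and variant identifiers
sample_ids  <- seqGetData(gds, "sample.id")
variant_ids <- seqGetData(gds, "variant.id")
cat("samples:", length(sample_ids), " variants:", length(variant_ids), "\n")

# Close the file handle when done
seqClose(gds)
```

If this runs without error, the compiled code and the `gdsfmt` dependency are working.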

Bug tracker: https://github.com/zhengxwen/seqarray/issues

Uses libs:
  • c++ – GNU Standard C++ Library v3
Datasets:

On Bioconductor: SeqArray-1.53.1 (bioc 3.24) | SeqArray-1.52.1 (bioc 3.23)

infrastructure | datarepresentation | sequencing | genetics | bioinformatics | gds-format | snp | snv | wes | wgs | cpp

11.98 score | 46 stars | 10 packages | 1.1k scripts | 8 mentions | 70 exports | 11 dependencies

Last updated from: 6c3f9b6562. Checks: 1 WARNING, 10 OK, 3 NOTE. Indexed: yes.

Target                 Result    Time
bioc-checks            WARNING   234
linux-devel-arm64      OK        318
linux-devel-x86_64     OK        390
source / vignettes     OK        568
linux-release-arm64    OK        329
linux-release-x86_64   OK        380
macos-release-arm64    OK        258
macos-release-x86_64   OK        459
macos-oldrel-arm64     NOTE      204
macos-oldrel-x86_64    NOTE      356
windows-devel          OK        248
windows-release        OK        254
windows-oldrel         NOTE      241
wasm-release           OK        568

Exports: alt, colData, filt, fixed, geno, granges, header, info, qual, ref, rowRanges, seqAddValue, seqAlleleCount, seqAlleleFreq, seqApply, seqAsVCF, seqBCF2GDS, seqBED2GDS, seqBlockApply, seqCheck, seqClose, seqDelete, seqDigest, seqEmptyFile, seqExampleFileName, seqExport, seqFilterPop, seqFilterPush, seqGDS2BED, seqGDS2SNP, seqGDS2VCF, seqGet2bGeno, seqGetAF_AC_Missing, seqGetData, seqGetFilter, seqGetParallel, seqListVarData, seqMerge, seqMissing, seqMulticoreSetup, seqNewVarData, seqNumAllele, seqOpen, seqOptimize, seqParallel, seqParallelSetup, seqParApply, seqRecompress, seqResetFilter, seqResetVariantID, seqSetFilter, seqSetFilterAnnotID, seqSetFilterChrom, seqSetFilterCond, seqSetFilterPos, seqSNP2GDS, seqStorageOption, seqSummary, seqSystem, seqTranspose, seqUnitApply, seqUnitCreate, seqUnitFilterCond, seqUnitMerge, seqUnitSetDiff, seqUnitSlidingWindows, seqUnitSubset, seqVCF_Header, seqVCF_SampID, seqVCF2GDS

Dependencies: BiocGenerics, Biostrings, crayon, digest, gdsfmt, generics, GenomicRanges, IRanges, S4Vectors, Seqinfo, XVector
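
Several of the exported functions are typically combined in a filter, compute, reset pattern. The sketch below assumes the bundled example file contains variants on chromosome 22; the variable names are illustrative and the snippet is not taken from the package documentation.

```r
library(SeqArray)

gds <- seqOpen(seqExampleFileName("gds"))

# Restrict subsequent reads to variants on chromosome 22
seqSetFilterChrom(gds, include = "22", verbose = FALSE)

# Values are returned only for the variants selected by the filter
pos  <- seqGetData(gds, "position")
af   <- seqAlleleFreq(gds, ref.allele = 0L)    # reference allele frequency per variant
miss <- seqMissing(gds, per.variant = TRUE)    # missing genotype rate per variant

# Clear the filter so later calls see the whole file again
seqResetFilter(gds, verbose = FALSE)
seqClose(gds)
```

Because the filter is stored on the open file object, every data accessor called between `seqSetFilterChrom` and `seqResetFilter` sees the same subset.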

SeqArray Data Format and Access
Overview | Parallel Computing | Application Program Interface (API) | Preparing Data | Data Format used in SeqArray | Format Conversion from VCF Files | Export to VCF Files | Modification | Data Processing | Functions for Data Analysis | Get Data | Apply Functions Over Array Margins | Apply Functions in Parallel | Examples | The performance of seqApply | Missing Rates for Variants | seqApply | C++ Integration | seqBlockApply | seqBlockApply + Parallel | seqMissing | Missing Rates for Samples | Allele Frequency | seqAlleleFreq | Principal Component Analysis | Multi-process Implementation | Individual Inbreeding Coefficient | Resources | Session Information | References

Last update: 2026-04-15
Started: 2015-06-14

Integration with R
SeqArray Functions | Key R Functions | Calculating Allele Frequencies | PCA R Implementation | Parallel Implementation | Bioconductor Features | GRanges and GRangesList | VariantAnnotation | Integration with SeqVarTools | Linear Regression | Integration with SNPRelate | LD-based Marker Pruning | Principal Component Analysis | Relatedness Analysis | Identity-By-State Analysis | Fixation Index ($F_\text{st}$) | GENESIS | Resources | Session Information | References

Last update: 2022-07-16
Started: 2019-10-22

SeqArray Overview
Introduction | Methods | Methods -- Advantages | Methods -- File Contents | Methods -- Key Functions | Benchmark | Benchmark -- Test 1 (sequentially) | Benchmark -- Test 2 (in parallel) | Benchmark -- Test 3 (C++ Integration) | Conclusion | Resource | Acknowledgements

Last update: 2022-07-16
Started: 2015-12-02

Readme and manuals

Help Manual

Help page | Topics
Data Management of Large-scale Whole-Genome Sequence Variant Calls | SeqArray-package SeqArray
Simulated sample data for 1000 Genomes Phase 1 | KG_P1_SampData
Add values to a GDS File | seqAddValue
Get Allele Frequencies or Counts | seqAlleleCount seqAlleleFreq seqGetAF_AC_Missing
Apply Functions Over Array Margins | seqApply
VariantAnnotation objects | seqAsVCF
Conversion between PLINK BED and SeqArray GDS | seqBED2GDS seqGDS2BED
Apply Functions Over Array Margins via Blocking | seqBlockApply
Data Integrity Checking | seqCheck
Close the SeqArray GDS File | seqClose seqClose,gds.class-method seqClose,SeqVarGDSClass-method
Delete GDS Variables | seqDelete
Hash function digests | seqDigest
Empty GDS file | seqEmptyFile
Example files | seqExampleFileName
Export to a GDS File | seqExport
Convert to a SNP GDS File | seqGDS2SNP
Convert to a VCF File | seqGDS2VCF
Get packed genotypes | seqGet2bGeno
Get Data | seqGetData
Get the Filter of GDS File | seqGetFilter
Merge Multiple SeqArray GDS Files | seqMerge
Missing genotype percentage | seqMissing
Variable-length data | seqListVarData seqNewVarData
Number of alleles | seqNumAllele
Open a SeqArray GDS File | seqOpen
Optimize the Storage of Data Array | seqOptimize
Apply Functions in Parallel | seqParallel seqParApply
Setup/Get a Parallel Environment | seqGetParallel seqMulticoreSetup seqParallelSetup
Recompress the GDS file | seqRecompress
Reset Variant ID in SeqArray GDS Files | seqResetVariantID
Set a Filter to Sample or Variant | seqFilterPop seqFilterPush seqResetFilter seqSetFilter seqSetFilter,SeqVarGDSClass,ANY-method seqSetFilter,SeqVarGDSClass,GRanges-method seqSetFilter,SeqVarGDSClass,GRangesList-method seqSetFilter,SeqVarGDSClass,IRanges-method seqSetFilterAnnotID seqSetFilterChrom seqSetFilterPos
Set a Filter to Variant with Allele Count/Freq | seqSetFilterCond
Convert SNPRelate Format to SeqArray Format | seqSNP2GDS
Storage and Compression Options | seqStorageOption
Summarize a SeqArray GDS File | seqSummary
Get the parameters in the GDS system | seqSystem
Transpose Data Array | seqTranspose
Apply Function Over Variant Units | seqUnitApply
Subset and merge the units | seqUnitCreate seqUnitMerge seqUnitSetDiff seqUnitSubset
Filter unit variants | seqUnitFilterCond
Sliding units of selected variants | seqUnitSlidingWindows
SeqVarGDSClass | alt alt,SeqVarGDSClass-method colData colData,SeqVarGDSClass-method filt filt,SeqVarGDSClass-method fixed fixed,SeqVarGDSClass-method geno geno,SeqVarGDSClass,ANY-method geno,SeqVarGDSClass-method granges,SeqVarGDSClass-method header header,SeqVarGDSClass-method info info,SeqVarGDSClass-method qual qual,SeqVarGDSClass-method ref ref,SeqVarGDSClass-method rowRanges rowRanges,SeqVarGDSClass-method SeqVarGDSClass SeqVarGDSClass-class
Parse the Header of a VCF/BCF File | seqVCF_Header
Get the Sample IDs | seqVCF_SampID
Reformat VCF Files | seqBCF2GDS seqVCF2GDS
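
As an illustration of how the help topics above fit together, the following sketch converts the bundled example VCF file to a temporary GDS file with `seqVCF2GDS`, then computes per-variant missing rates twice: once with a user function passed to `seqApply`, and once with the built-in `seqMissing`. The output path and variable names are assumptions made for this example.

```r
library(SeqArray)

# Convert the example VCF shipped with the package to a temporary GDS file
vcf_path <- seqExampleFileName("vcf")
out_path <- tempfile(fileext = ".gds")
seqVCF2GDS(vcf_path, out_path, verbose = FALSE)

gds <- seqOpen(out_path)

# User-defined function applied variant by variant;
# missing genotype calls are read as NA
miss_apply <- seqApply(gds, "genotype",
                       FUN = function(g) mean(is.na(g)),
                       margin = "by.variant", as.is = "double")

# Built-in equivalent
miss_builtin <- seqMissing(gds, per.variant = TRUE)

seqClose(gds)
unlink(out_path)
```

The `seqApply` form is the more general one, since any R function of the per-variant genotype matrix can be substituted; the dedicated function is usually the faster choice when one exists.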