| Title: | An Interface for Immune Receptor and HLA Gene Reference Data |
|---|---|
| Description: | Provides a consistent interface for downloading, storing, and accessing immune receptor (TCR/BCR) and HLA sequences from IMGT, IPD-IMGT/HLA, and OGRDB (AIRR-C). Supports export to popular analysis tools including MiXCR, TRUST4, Cell Ranger, and IgBLAST. This package serves as a core dependency for immunogenomics packages, ensuring reliable and high-quality sequence access with local caching for reproducibility. |
| Authors: | Nick Borcherding [aut, cre] |
| Maintainer: | Nick Borcherding <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 1.1.0 |
| Built: | 2026-05-30 07:53:38 UTC |
| Source: | https://github.com/bioc/immReferent |
immReferent provides a stable, reproducible, and lightweight interface to reference sequences for immune receptors (TCR/BCR) and HLA genes sourced from IMGT, IPD-IMGT/HLA, and the AIRR-C's OGRDB. It centralizes downloading, caching, and querying of curated nucleotide and protein sequences, and plays a foundational role in computational immunology workflows.
The package is designed as a common reference layer across immunoinformatics tools, ensuring consistent provenance and offline reproducibility via caching.
Core functionality
Download and parse receptor and HLA sequences from IMGT, IPD-IMGT/HLA, and OGRDB
Local caching to support offline, reproducible analysis
Query by gene, allele, species, locus, and sequence type/format
Export to popular analysis tools (MiXCR, TRUST4, Cell Ranger, IgBLAST)
Interoperability with Bioconductor classes such as DNAStringSet and AAStringSet
Data retrieval functions
getIMGT: Download sequences from IMGT
getOGRDB: Download sequences from OGRDB
refreshIMGT, refreshOGRDB: Force re-download
Export functions
exportMiXCR: Export for MiXCR analysis
exportTRUST4: Export for TRUST4 analysis
exportCellRanger: Export for 10x Cell Ranger VDJ
exportIgBLAST: Export for IgBLAST analysis
Supported data sources
IMGT: The international ImMunoGeneTics information system (https://www.imgt.org/)
IPD-IMGT/HLA: The HLA Database (https://www.ebi.ac.uk/ipd/imgt/hla/)
OGRDB: Open Germline Receptor Database (AIRR-C) (https://ogrdb.airr-community.org/)
Getting started
browseVignettes("immReferent")
Data obtained from IMGT and OGRDB must be cited according to their terms. IMGT data are distributed under a CC BY-NC-ND 4.0 license. Proper attribution is required, and derivative or commercial use is restricted per IMGT policy. Always review the current licensing and citation requirements of each resource prior to use.
https://github.com/BorchLab/immReferent
Exports a DNAStringSet to FASTA format
suitable for creating a custom Cell Ranger VDJ reference. The function
generates a FASTA file with properly formatted headers for use with
cellranger mkvdjref.
    exportCellRanger(sequences, output_file, gene_type = NULL)
| Argument | Description |
|---|---|
| sequences | A DNAStringSet of sequences to export. |
| output_file | Character string specifying the path to the output FASTA file. The parent directory will be created if it does not exist. |
| gene_type | Character string specifying the type of gene region. One of … Default is NULL. |
Cell Ranger's mkvdjref command expects FASTA files with specific
header formats. This function creates a FASTA file that can be used as input
to build a custom VDJ reference.
Note: For a complete Cell Ranger VDJ reference, you also need a GTF file with gene annotations. This function only creates the FASTA component.
This function works with sequences from both IMGT (via
getIMGT) and OGRDB (via getOGRDB).
Character string with the path to the created file, returned invisibly.
getIMGT, getOGRDB for obtaining sequences
exportMiXCR, exportTRUST4,
exportIgBLAST for other export formats
https://www.10xgenomics.com/support/software/cell-ranger/latest/analysis/inputs/cr-5p-references for Cell Ranger documentation
    # Create a small example DNAStringSet
    seqs <- Biostrings::DNAStringSet(c(
      "ATGCGATCGATCGATCG",
      "ATGCGATCGATCG"
    ))
    names(seqs) <- c("IGHV1-2*01", "IGHV1-3*01")
    # Export to temporary file
    output_file <- tempfile(fileext = ".fa")
    exportCellRanger(seqs, output_file)
    # View the result
    cat(readLines(output_file), sep = "\n")
    # Clean up
    unlink(output_file)
Exports a DNAStringSet to FASTA files
formatted for use with IgBLAST. The function creates separate FASTA files
for V, D, and J gene segments with simplified headers compatible with
IgBLAST's requirements.
    exportIgBLAST(
      sequences,
      output_dir,
      organism = "custom",
      receptor_type = c("ig", "tcr")
    )
| Argument | Description |
|---|---|
| sequences | A DNAStringSet of sequences to export. |
| output_dir | Character string specifying the directory where output files will be written. The directory will be created if it does not exist. |
| organism | Character string specifying the organism name for the output files. Used in file naming. Default is "custom". |
| receptor_type | Character string specifying the receptor type. One of "ig" or "tcr". |
IgBLAST requires FASTA files with simplified headers containing only the
gene/allele name. This function mimics the output of IgBLAST's
edit_imgt_file.pl script, which truncates IMGT headers to keep only
the allele designation.
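The header simplification described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name simplify_imgt_header is hypothetical, and the sketch assumes the usual pipe-delimited IMGT FASTA header in which the allele name is the second field.

```r
# Hypothetical helper: keep only the allele designation from an IMGT-style
# FASTA header. Assumes pipe-delimited fields with the allele name second.
simplify_imgt_header <- function(headers) {
  headers <- sub("^>", "", headers)               # drop a leading ">" if present
  fields <- strsplit(headers, "|", fixed = TRUE)  # split on literal pipes
  vapply(fields, function(f) {
    # Headers without pipes are assumed to be allele names already
    if (length(f) >= 2) f[2] else f[1]
  }, character(1))
}

imgt_headers <- c(
  ">M99641|IGHV1-18*01|Homo sapiens|F|V-REGION|188..483|296 nt|1| | | | |296+24=320| | |",
  "IGHJ1*01"
)
simplify_imgt_header(imgt_headers)
```

Headers that are already bare allele names pass through unchanged, which is why the same sketch would also tolerate names that carry no IMGT annotation fields.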
Output files follow the naming convention used by IgBLAST:
<organism>_<receptor_type>_v.fasta
<organism>_<receptor_type>_d.fasta
<organism>_<receptor_type>_j.fasta
After exporting, use makeblastdb with the -parse_seqids flag
to create the BLAST database:
makeblastdb -parse_seqids -dbtype nucl -in <fasta_file> -out <db_name>
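As a rough sketch of how the naming convention and the makeblastdb step fit together, the base R snippet below builds the expected file names and one command string per file. The organism, receptor type, and output directory values are illustrative placeholders, and the commands are only printed, not executed. In practice exportIgBLAST may write fewer than three files, so the paths it returns should be preferred over reconstructed names.

```r
# Build the expected IgBLAST file names and matching makeblastdb commands.
# 'organism' and 'receptor_type' mirror the exportIgBLAST() arguments; the
# output directory is only an illustrative placeholder.
organism <- "human"
receptor_type <- "ig"
output_dir <- "igblast_ref"

segments <- c("v", "d", "j")
fasta_files <- file.path(
  output_dir,
  sprintf("%s_%s_%s.fasta", organism, receptor_type, segments)
)

# One makeblastdb call per FASTA file; the database name drops ".fasta"
db_names <- sub("\\.fasta$", "", fasta_files)
commands <- sprintf(
  "makeblastdb -parse_seqids -dbtype nucl -in %s -out %s",
  shQuote(fasta_files), shQuote(db_names)
)
cat(commands, sep = "\n")
```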
This function works with sequences from both IMGT (via
getIMGT) and OGRDB (via getOGRDB).
A named list containing the paths to the created files, returned
invisibly. The list may contain elements v_genes, d_genes,
and j_genes depending on which segment types were found in the
input sequences.
getIMGT, getOGRDB for obtaining sequences
exportMiXCR, exportTRUST4,
exportCellRanger for other export formats
https://ncbi.github.io/igblast/ for IgBLAST documentation
    # Create a small example DNAStringSet
    seqs <- Biostrings::DNAStringSet(c(
      "ATGCGATCGATCGATCG",
      "ATGCGATCGATCG",
      "ATGCGATC"
    ))
    names(seqs) <- c("IGHV1-2*01", "IGHD1-1*01", "IGHJ1*01")
    # Export to temporary directory
    output_dir <- tempdir()
    files <- exportIgBLAST(seqs, output_dir, organism = "human", receptor_type = "ig")
    print(files)
    # Clean up
    unlink(unlist(files))
Exports a DNAStringSet or
AAStringSet to FASTA files formatted for use with
MiXCR's buildLibrary command. The function creates separate FASTA
files for V, D, J, and C gene segments.
    exportMiXCR(
      sequences,
      output_dir,
      chain = c("IGH", "IGK", "IGL", "TRA", "TRB", "TRD", "TRG")
    )
| Argument | Description |
|---|---|
| sequences | A DNAStringSet or AAStringSet of sequences to export. |
| output_dir | Character string specifying the directory where output files will be written. The directory will be created if it does not exist. |
| chain | Character string specifying the chain type for the output files. Must be one of "IGH", "IGK", "IGL", "TRA", "TRB", "TRD", or "TRG". |
MiXCR expects FASTA files with simple headers containing only the gene name. The function filters sequences by gene type (V, D, J, C) based on the gene name pattern and writes separate files for each segment type.
Output files follow the naming convention:
v-genes.<chain>.fasta
d-genes.<chain>.fasta
j-genes.<chain>.fasta
c-genes.<chain>.fasta
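The segment filtering and file naming described above can be sketched in base R. The classifier below is a hypothetical heuristic, not the pattern the package actually uses: it assumes that a V, D, or J letter directly after the three-letter locus code and followed by a digit marks a gene segment, and that any other IG/TR name is a constant region gene.

```r
# Hypothetical classifier: assign V, D, J, or C from an IG/TR gene name.
# Names that do not start with an IG/TR locus code are returned as NA.
classify_segment <- function(gene_names) {
  is_receptor <- grepl("^(IG[HKL]|TR[ABDG])", gene_names)
  m <- regmatches(
    gene_names,
    regexec("^(IG[HKL]|TR[ABDG])([VDJ])[0-9]", gene_names)
  )
  # A successful match has three parts: full match, locus, segment letter
  segment <- vapply(m, function(x) if (length(x) == 3) x[3] else "C",
                    character(1))
  segment[!is_receptor] <- NA_character_
  segment
}

genes <- c("IGHV1-2*01", "IGHD1-1*01", "IGHJ1*01",
           "IGHC*01", "IGHD*01", "TRBC1*01")
by_segment <- split(genes, classify_segment(genes))
by_segment

# File names following the documented convention for one chain
chain <- "IGH"
file_names <- sprintf("%s-genes.%s.fasta", tolower(names(by_segment)), chain)
file_names
```

Requiring a digit after the segment letter is what keeps the constant delta gene (IGHD*01) apart from the D segments (IGHD1-1*01). Names with non-numeric subgroup labels would need extra handling.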
This function works with sequences from both IMGT (via
getIMGT) and OGRDB (via getOGRDB).
A named list containing the paths to the created files, returned
invisibly. The list may contain elements v_genes, d_genes,
j_genes, and c_genes depending on which segment types were
found in the input sequences.
getIMGT, getOGRDB for obtaining sequences
exportTRUST4, exportCellRanger,
exportIgBLAST for other export formats
https://mixcr.com/mixcr/guides/create-custom-library/ for MiXCR documentation
    # Create a small example DNAStringSet
    seqs <- Biostrings::DNAStringSet(c(
      "ATGCGATCGATCGATCG",
      "ATGCGATCGATCG",
      "ATGCGATC",
      "ATGCGATCGATCGATCGATCG"
    ))
    names(seqs) <- c("IGHV1-2*01", "IGHD1-1*01", "IGHJ1*01", "IGHC*01")
    # Export to temporary directory
    output_dir <- tempdir()
    files <- exportMiXCR(seqs, output_dir, chain = "IGH")
    print(files)
    # Clean up
    unlink(unlist(files))
Exports a DNAStringSet to a FASTA file
formatted for use with TRUST4. The output follows the format produced by
TRUST4's BuildImgtAnnot.pl script.
    exportTRUST4(sequences, output_file, include_constant = TRUE)
| Argument | Description |
|---|---|
| sequences | A DNAStringSet of sequences to export. |
| output_file | Character string specifying the path to the output FASTA file. The parent directory will be created if it does not exist. |
| include_constant | Logical controlling whether constant region sequences are included. Default is TRUE. |
TRUST4 expects FASTA files with headers containing only the allele name
(e.g., >IGHV1-2*01). The function reformats sequence headers to match
the output of TRUST4's BuildImgtAnnot.pl script.
TRUST4 uses this reference for the --ref parameter in its analysis
pipeline.
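To make the allele-only header format concrete, here is a minimal base R sketch that writes two records in that layout. It uses a plain named character vector rather than a DNAStringSet so that it runs without Biostrings, and it is not how exportTRUST4 itself is implemented.

```r
# Write allele-named sequences as a two-line-per-record FASTA file.
seqs <- c("IGHV1-2*01" = "ATGCGATCGATCGATCG",
          "IGHJ1*01"   = "ATGCGATCGATCG")

# Interleave ">name" header lines with their sequences
fasta_lines <- as.vector(rbind(paste0(">", names(seqs)), unname(seqs)))

output_file <- tempfile(fileext = ".fa")
writeLines(fasta_lines, output_file)
written <- readLines(output_file)
written

# Clean up
unlink(output_file)
```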
This function works with sequences from both IMGT (via
getIMGT) and OGRDB (via getOGRDB).
Character string with the path to the created file, returned invisibly.
getIMGT, getOGRDB for obtaining sequences
exportMiXCR, exportCellRanger,
exportIgBLAST for other export formats
https://github.com/liulab-dfci/TRUST4 for TRUST4 documentation
    # Create a small example DNAStringSet
    seqs <- Biostrings::DNAStringSet(c(
      "ATGCGATCGATCGATCG",
      "ATGCGATCGATCG",
      "ATGCGATC"
    ))
    names(seqs) <- c("IGHV1-2*01", "IGHJ1*01", "IGHC*01")
    # Export to temporary file
    output_file <- tempfile(fileext = ".fa")
    exportTRUST4(seqs, output_file)
    # View the result
    cat(readLines(output_file), sep = "\n")
    # Clean up
    unlink(output_file)
This is the main function to download and load reference sequences from IMGT and the IPD-IMGT/HLA database. It handles caching of downloaded files.
    getIMGT(
      species = "human",
      gene,
      type = c("NUC", "PROT"),
      refresh = FALSE,
      suppressMessages = FALSE
    )
| Argument | Description |
|---|---|
| species | Character string specifying the species for which to download data. Required for TCR/BCR queries. Currently supported species: … Default is "human". |
| gene | Character string specifying the gene or locus to download. For TCR/BCR, this can be a specific chain (e.g., …). |
| type | Character string specifying the type of sequence to retrieve. Either "NUC" or "PROT". |
| refresh | Logical. If TRUE, forces a re-download instead of using the cache. Default is FALSE. |
| suppressMessages | Logical controlling whether messages are suppressed. Default is FALSE. |
A DNAStringSet object (when
type = "NUC") or AAStringSet object (when
type = "PROT") containing the requested sequences.
loadIMGT, refreshIMGT for convenience wrappers
getOGRDB for OGRDB/AIRR-C germline sequences
exportMiXCR, exportTRUST4,
exportCellRanger, exportIgBLAST for exporting
sequences to analysis tools
    if (is_imgt_available()) {
      # Download human IGHV nucleotide sequences
      ighv_nuc <- getIMGT(species = "human", gene = "IGHV", type = "NUC")
      # Download all HLA protein sequences
      hla_prot <- getIMGT(gene = "HLA", type = "PROT")
      # Download all mouse TRB genes
      trb_mouse <- getIMGT(species = "mouse", gene = "TRB", type = "NUC")
    }
Downloads AIRR-compliant germline sets (or FASTA) from OGRDB
(Open Germline Receptor Database) and returns sequences as a
DNAStringSet or
AAStringSet.
    getOGRDB(
      species = "human",
      locus = c("IGH", "IGK", "IGL"),
      set_name = NULL,
      type = c("NUC", "PROT"),
      format = c("FASTA_GAPPED", "FASTA_UNGAPPED", "AIRR"),
      version = c("published", "latest"),
      species_subgroup = NULL,
      refresh = FALSE,
      suppressMessages = FALSE
    )
| Argument | Description |
|---|---|
| species | Character string specifying the species. Accepts … Default is "human". |
| locus | Character string specifying the locus short code. One of "IGH", "IGK", or "IGL". |
| set_name | Optional character string specifying an explicit OGRDB set name (e.g., …). Default is NULL. |
| type | Character string specifying the sequence type. Either "NUC" or "PROT". |
| format | Character string specifying the download format. One of "FASTA_GAPPED", "FASTA_UNGAPPED", or "AIRR". |
| version | Character string specifying the version. Either "published" or "latest". |
| species_subgroup | Optional character string specifying a subgroup (e.g., a mouse strain like …). Default is NULL. |
| refresh | Logical. If TRUE, forces a re-download instead of using the cache. Default is FALSE. |
| suppressMessages | Logical controlling whether messages are suppressed. Default is FALSE. |
OGRDB (Open Germline Receptor Database) is the AIRR Community's curated repository of germline receptor sequences. It complements IMGT with additional species support and standardized AIRR JSON format.
The function supports multiple download formats:
FASTA_GAPPED: FASTA with IMGT gaps preserved
FASTA_UNGAPPED: FASTA without gaps
AIRR: AIRR-C compliant JSON format
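The difference between the gapped and ungapped FASTA formats can be illustrated with a short base R sketch. IMGT-gapped sequences use "." as the gap character, so stripping the dots gives the ungapped sequence. The sequence and its name below are made up for illustration and are not real germline data.

```r
# IMGT-gapped sequences use "." as the gap character; removing the dots
# yields the ungapped sequence. The sequence below is made up.
gapped <- c("IGHV_example*01" = "CAGGTG...CAGCTG......GTGCAG")

ungapped <- gsub(".", "", gapped, fixed = TRUE)
ungapped

# Number of gap positions removed
n_gaps <- nchar(gapped) - nchar(ungapped)
n_gaps
```

Gapped sequences keep positions aligned to the IMGT numbering, which matters for region annotation, while tools that align reads directly generally expect the ungapped form.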
A DNAStringSet object (when
type = "NUC") or AAStringSet object (when
type = "PROT") containing the requested sequences.
loadOGRDB, refreshOGRDB for convenience wrappers
getIMGT for IMGT sequences
exportMiXCR, exportTRUST4,
exportCellRanger, exportIgBLAST for exporting
sequences to analysis tools
https://ogrdb.airr-community.org/ for OGRDB documentation
    if (is_ogrdb_available()) {
      # Download human IGH nucleotide sequences (gapped FASTA)
      igh_nuc <- getOGRDB(species = "human", locus = "IGH",
                          type = "NUC", format = "FASTA_GAPPED")
      # Download human IGK sequences in AIRR JSON format
      igk_airr <- getOGRDB(species = "human", locus = "IGK",
                           type = "NUC", format = "AIRR")
      # Download human IGL sequences and translate to AA
      igl_prot <- getOGRDB(species = "human", locus = "IGL",
                           type = "PROT", format = "FASTA_UNGAPPED")
      # Example using an explicit set name (instead of locus)
      igh_explicit <- getOGRDB(species = "human", set_name = "IGH_VDJ",
                               type = "NUC", format = "FASTA_GAPPED")
    }
Sends a lightweight HEAD request to the main IMGT page to check if the service is online and accessible. This function is used to conditionally run examples and tests that require an internet connection.
    is_imgt_available()
A logical value: TRUE if the IMGT website is accessible,
FALSE otherwise.
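A HEAD-based availability check of this kind can be sketched with base R alone. The helper below is hypothetical and is not the package's implementation: it assumes that any 2xx or 3xx status counts as "available" and that every error or warning (no network, DNS failure, timeout) counts as "unavailable".

```r
# Hypothetical helper: TRUE if a HEAD request to 'url' ends in a 2xx/3xx
# status, FALSE on any error or warning.
is_url_available <- function(url) {
  tryCatch({
    headers <- curlGetHeaders(url)      # base R; retrieves headers only
    status <- attr(headers, "status")   # last status code received
    is.numeric(status) && status >= 200 && status < 400
  },
  error = function(e) FALSE,
  warning = function(w) FALSE)
}

if (is_url_available("https://www.imgt.org/")) {
  message("IMGT reachable; network-dependent code can run")
} else {
  message("IMGT not reachable; skipping network-dependent code")
}
```

Returning FALSE instead of raising an error is what makes such a check usable as a guard around examples and tests.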
is_ogrdb_available for checking OGRDB availability
getIMGT which uses this function
    is_imgt_available()
Sends a lightweight HEAD request to the OGRDB API to check if the service is online and accessible. This function is used to conditionally run examples and tests that require an internet connection.
    is_ogrdb_available()
A logical value: TRUE if the OGRDB website is accessible,
FALSE otherwise.
is_imgt_available for checking IMGT availability
getOGRDB which uses this function
    is_ogrdb_available()
Scans the cache directory and returns a list of available datasets that have been downloaded.
    listIMGT()
A character vector of absolute file paths for the cached datasets. Returns an empty character vector if the cache directory does not exist or contains no files.
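The documented return behaviour can be mimicked with a small base R sketch. The helper name list_cached_files and the directory layout used in the demonstration are hypothetical. The real cache location and structure are managed by the package.

```r
# Hypothetical helper mirroring the documented behaviour: return absolute
# paths of all files under a cache directory, or an empty character vector
# when the directory is missing or empty.
list_cached_files <- function(cache_dir) {
  if (!dir.exists(cache_dir)) {
    return(character(0))
  }
  files <- list.files(cache_dir, recursive = TRUE, full.names = TRUE)
  normalizePath(files, mustWork = FALSE)
}

# Demonstrate with a throwaway directory standing in for the real cache
demo_cache <- file.path(tempdir(), "demo_cache")
dir.create(file.path(demo_cache, "human"), recursive = TRUE,
           showWarnings = FALSE)
file.create(file.path(demo_cache, "human", "IGHV.fasta"))

found <- list_cached_files(demo_cache)
basename(found)
missing <- list_cached_files(file.path(tempdir(), "no_such_dir"))
length(missing)

# Clean up
unlink(demo_cache, recursive = TRUE)
```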
getIMGT for downloading sequences
listOGRDB for listing OGRDB cached files
    # List all files in the cache
    cached_files <- listIMGT()
    # To see the structure, you can print the first few
    head(cached_files)
Scans the cache directory and returns a list of available OGRDB datasets that have been downloaded.
    listOGRDB()
A character vector of absolute file paths to cached OGRDB files.
Returns an empty character vector if no OGRDB files have been cached. Paths
are typically under the package cache directory (e.g.,
~/.immReferent/<Species>/ogrdb/).
getOGRDB for downloading sequences
listIMGT for listing IMGT cached files
    # List cached OGRDB files
    cached_files <- listOGRDB()
    head(cached_files)
Loads sequences from the local cache when they are available there. This
function is a convenience wrapper for getIMGT(refresh = FALSE): cached data
are used without re-downloading, and data not found in the cache are
downloaded, provided an internet connection is available.
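The cache-first behaviour behind the refresh argument can be sketched generically in base R. Everything in this sketch is hypothetical: cached_fetch is not a package function, and fake_download stands in for a real download step so that the example runs offline.

```r
# Hypothetical cache-first loader: 'fetch_fun' stands in for a download step.
cached_fetch <- function(cache_file, fetch_fun, refresh = FALSE) {
  if (file.exists(cache_file) && !refresh) {
    return(readLines(cache_file))   # cache hit: nothing is downloaded
  }
  lines <- fetch_fun()              # cache miss or forced refresh
  dir.create(dirname(cache_file), recursive = TRUE, showWarnings = FALSE)
  writeLines(lines, cache_file)
  lines
}

# Count how often the stand-in download is actually called
downloads <- 0
fake_download <- function() {
  downloads <<- downloads + 1
  c(">IGHV1-2*01", "ATGCGATCGATCGATCG")
}

cache_file <- file.path(tempdir(), "demo_cache_fetch", "IGHV.fasta")
unlink(cache_file)
first  <- cached_fetch(cache_file, fake_download)                  # downloads
second <- cached_fetch(cache_file, fake_download)                  # from cache
third  <- cached_fetch(cache_file, fake_download, refresh = TRUE)  # downloads again
downloads

# Clean up
unlink(dirname(cache_file), recursive = TRUE)
```

Only the first call and the call with refresh = TRUE reach the download step. The second call is served entirely from the cached file, which is what makes repeated analyses reproducible offline.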
    loadIMGT(
      species = "human",
      gene,
      type = c("NUC", "PROT"),
      suppressMessages = FALSE
    )
| Argument | Description |
|---|---|
| species | Character string specifying the species for which to download data. Required for TCR/BCR queries. Currently supported species: … Default is "human". |
| gene | Character string specifying the gene or locus to download. For TCR/BCR, this can be a specific chain (e.g., …). |
| type | Character string specifying the type of sequence to retrieve. Either "NUC" or "PROT". |
| suppressMessages | Logical controlling whether messages are suppressed. Default is FALSE. |
A DNAStringSet object (when
type = "NUC") or AAStringSet object (when
type = "PROT") containing the requested sequences.
getIMGT for the main download function
refreshIMGT to force re-download
    if (is_imgt_available()) {
      # First, download a file to ensure it's in the cache
      getIMGT(species = "human", gene = "IGHV", type = "NUC",
              suppressMessages = TRUE)
      # Now, load it from the cache
      ighv_cached <- loadIMGT(species = "human", gene = "IGHV", type = "NUC")
    }
Loads sequences from the local cache when they are available there. This
function is a convenience wrapper for getOGRDB(refresh = FALSE): cached data
are used without re-downloading, and data not found in the cache are
downloaded, provided an internet connection is available.
    loadOGRDB(
      species = "human",
      locus = c("IGH", "IGK", "IGL"),
      set_name = NULL,
      type = c("NUC", "PROT"),
      format = c("FASTA_GAPPED", "FASTA_UNGAPPED", "AIRR"),
      version = c("published", "latest"),
      species_subgroup = NULL,
      suppressMessages = FALSE
    )
| Argument | Description |
|---|---|
| species | Character string specifying the species. Accepts … Default is "human". |
| locus | Character string specifying the locus short code. One of "IGH", "IGK", or "IGL". |
| set_name | Optional character string specifying an explicit OGRDB set name (e.g., …). Default is NULL. |
| type | Character string specifying the sequence type. Either "NUC" or "PROT". |
| format | Character string specifying the download format. One of "FASTA_GAPPED", "FASTA_UNGAPPED", or "AIRR". |
| version | Character string specifying the version. Either "published" or "latest". |
| species_subgroup | Optional character string specifying a subgroup (e.g., a mouse strain like …). Default is NULL. |
| suppressMessages | Logical controlling whether messages are suppressed. Default is FALSE. |
A DNAStringSet object (when
type = "NUC") or AAStringSet object (when
type = "PROT") containing the requested sequences.
getOGRDB for the main download function
refreshOGRDB to force re-download
    if (is_ogrdb_available()) {
      # First, ensure the file is cached
      getOGRDB(species = "human", locus = "IGH", type = "NUC",
               format = "FASTA_GAPPED", suppressMessages = TRUE)
      # Now load from the cache
      igh_cached <- loadOGRDB(species = "human", locus = "IGH",
                              type = "NUC", format = "FASTA_GAPPED")
    }
A convenience wrapper for getIMGT(..., refresh = TRUE) to
ensure that the local cache is updated with the latest versions of the
requested sequences.
    refreshIMGT(
      species = "human",
      gene,
      type = c("NUC", "PROT"),
      suppressMessages = FALSE
    )
| Argument | Description |
|---|---|
| species | Character string specifying the species for which to download data. Required for TCR/BCR queries. Currently supported species: … Default is "human". |
| gene | Character string specifying the gene or locus to download. For TCR/BCR, this can be a specific chain (e.g., …). |
| type | Character string specifying the type of sequence to retrieve. Either "NUC" or "PROT". |
| suppressMessages | Logical controlling whether messages are suppressed. Default is FALSE. |
A DNAStringSet object (when
type = "NUC") or AAStringSet object (when
type = "PROT") containing the requested sequences.
getIMGT for the main download function
loadIMGT to load from cache without downloading
    if (is_imgt_available()) {
      # Force a re-download of human IGHV protein sequences
      ighv_prot_fresh <- refreshIMGT(species = "human", gene = "IGHV",
                                     type = "PROT")
    }
A convenience wrapper for getOGRDB(..., refresh = TRUE)
to ensure that the local cache is updated with the latest versions of the
requested sequences.
    refreshOGRDB(
      species = "human",
      locus = c("IGH", "IGK", "IGL"),
      set_name = NULL,
      type = c("NUC", "PROT"),
      format = c("FASTA_GAPPED", "FASTA_UNGAPPED", "AIRR"),
      version = c("published", "latest"),
      species_subgroup = NULL,
      suppressMessages = FALSE
    )
| Argument | Description |
|---|---|
| species | Character string specifying the species. Accepts … Default is "human". |
| locus | Character string specifying the locus short code. One of "IGH", "IGK", or "IGL". |
| set_name | Optional character string specifying an explicit OGRDB set name (e.g., …). Default is NULL. |
| type | Character string specifying the sequence type. Either "NUC" or "PROT". |
| format | Character string specifying the download format. One of "FASTA_GAPPED", "FASTA_UNGAPPED", or "AIRR". |
| version | Character string specifying the version. Either "published" or "latest". |
| species_subgroup | Optional character string specifying a subgroup (e.g., a mouse strain like …). Default is NULL. |
| suppressMessages | Logical controlling whether messages are suppressed. Default is FALSE. |
A DNAStringSet object (when
type = "NUC") or AAStringSet object (when
type = "PROT") containing the requested sequences.
getOGRDB for the main download function
loadOGRDB to load from cache without downloading
    if (is_ogrdb_available()) {
      # Force a re-download of the human IGK sequences
      igk_fresh <- refreshOGRDB(species = "human", locus = "IGK",
                                type = "NUC", format = "FASTA_GAPPED")
    }