Title: | Functional Annotation of Custom Transcriptomes |
---|---|
Description: | factR contain tools to process and interact with custom-assembled transcriptomes (GTF). At its core, factR constructs CDS information on custom transcripts and subsequently predicts its functional output. In addition, factR has tools capable of plotting transcripts, correcting chromosome and gene information and shortlisting new transcripts. |
Authors: | Fursham Hamid [aut, cre] |
Maintainer: | Fursham Hamid <[email protected]> |
License: | file LICENSE |
Version: | 1.9.0 |
Built: | 2024-12-29 07:13:15 UTC |
Source: | https://github.com/bioc/factR |
'buildCDS()' is designed to construct CDS information on transcripts from query GTF object.
buildCDS(query, ref, fasta)
buildCDS(query, ref, fasta)
query |
GRanges object containing query GTF data. |
ref |
GRanges object containing reference GTF data. |
fasta |
BSgenome or Biostrings object containing genomic sequence |
The 'buildCDS()'function will first search for known reference mRNAs in 'query' and annotate its CDS information. For the remaining transcripts, 'buildCDS()' will search for a putative translation start site using a database of annotated ATG codons from 'ref'. Transcripts containing an open-reading frame will be assigned the newly-determined CDS information.
GRanges object containing query exon entries and newly-constructed CDS information
Fursham Hamid
# Load genome and datasets library(BSgenome.Mmusculus.UCSC.mm10) data(matched_query_gtf, ref_gtf) # Build CDS buildCDS(matched_query_gtf, ref_gtf, Mmusculus)
# Load genome and datasets library(BSgenome.Mmusculus.UCSC.mm10) data(matched_query_gtf, ref_gtf) # Build CDS buildCDS(matched_query_gtf, ref_gtf, Mmusculus)
query_gtf data which have been corrected for its seqlevels
data(chrom_matched_query_gtf)
data(chrom_matched_query_gtf)
A GRanges object with 56 ranges and 3 metadata columns:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
ID given to transcripts
ID given to gene origin of transcripts
Name given to gene origin of transcripts
...
Output dataframe from predictDomains() function. mRNAs from GENCODE mouse annotation was predicted for putative domain families.
data(domains.known)
data(domains.known)
A data.frame with 85780 rows and 5 columns:
Transcript ID of protein-coding RNAs
Name of domain families
E-value score
Start position of domain in protein
End position of domain in protein
...
Output dataframe from predictDomains() function.
data(domains.out)
data(domains.out)
A data.frame with 14880 rows and 5 columns:
Transcript ID of protein-coding RNAs
Name of domain families
E-value score
Start position of domain in protein
End position of domain in protein
...
Internally filter each element of a GenomicRangesList
filtereach(x, ...)
filtereach(x, ...)
x |
GRangesList object |
... |
Logical conditions to filter each element in the GRanges by. Multiple conditions can be provided as comma-delimited inputs |
Filtered GRangesList object
Fursham Hamid
# Load dataset data(query_exons) # select first element of each GRangesList item filtereach(query_exons, dplyr::row_number() == 1)
# Load dataset data(query_exons) # select first element of each GRangesList item filtereach(query_exons, dplyr::row_number() == 1)
This function will determine if all input ranges objects have the same chromosome naming convention. Input objects can be GenomicRanges, BSgenome or Biostrings object with seqlevel information.
has_consistentSeqlevels(..., verbose = TRUE)
has_consistentSeqlevels(..., verbose = TRUE)
... |
Two or more objects with seqlevels information |
verbose |
Whether to print out message |
Logical value as to whether all objects have consistent seqlevel styles
Fursham Hamid
## --------------------------------------------------------------------- ## EXAMPLE USING TOY DATASET ## --------------------------------------------------------------------- require(GenomicRanges) ## Create toy GRanges objects gr1 <- GRanges("1", IRanges(start = c(1, 101), width = c(20, 20)), "+") gr2 <- GRanges("chr1", IRanges(start = c(1, 101), width = c(20, 20)), "+") ## Test for seqlevels consistency has_consistentSeqlevels(gr1, gr2) ## Input can be a Biostrings object with seqlevels information x0 <- c("chr2" = "CTCACCAGTAT", "chr3" = "TGTCAGTCGA") dna <- Biostrings::DNAStringSet(x0) ## Test for seqlevels consistency has_consistentSeqlevels(gr1, dna) has_consistentSeqlevels(gr2, dna)
## --------------------------------------------------------------------- ## EXAMPLE USING TOY DATASET ## --------------------------------------------------------------------- require(GenomicRanges) ## Create toy GRanges objects gr1 <- GRanges("1", IRanges(start = c(1, 101), width = c(20, 20)), "+") gr2 <- GRanges("chr1", IRanges(start = c(1, 101), width = c(20, 20)), "+") ## Test for seqlevels consistency has_consistentSeqlevels(gr1, gr2) ## Input can be a Biostrings object with seqlevels information x0 <- c("chr2" = "CTCACCAGTAT", "chr3" = "TGTCAGTCGA") dna <- Biostrings::DNAStringSet(x0) ## Test for seqlevels consistency has_consistentSeqlevels(gr1, dna) has_consistentSeqlevels(gr2, dna)
This function is a wrapper to Biostrings::readDNAStringSet() function to import FASTA genome sequence file and simultaneously convert long chromosome names (e.g. 1 dna:chromosome chromosome:GRCm38:1:1:195471971:1 REF) to short names (e.g. 1)
importFASTA(con)
importFASTA(con)
con |
Path to FASTA file |
Imported DNAStringSet object
Fursham Hamid
This function loads GTF files into R and converts it into
a wrapper to rtracklayer::import() function to conveniently import GTF file into R as a GenomicRanges object.
importGTF(con)
importGTF(con)
con |
Path to GTF file |
Imported GenomicRanges object in GTF format
Fursham Hamid
gtf <- system.file("extdata", "sc_merged_sample.gtf.gz", package = "factR") importGTF(gtf)
gtf <- system.file("extdata", "sc_merged_sample.gtf.gz", package = "factR") importGTF(gtf)
A convenient wrapper to match seqlevels of a query GRanges object to a reference object that contain seqlevels information. Reference can be a GRanges, GRangesList, BioString or DNAString object. Seqlevels which fail to match will be dropped.
matchChromosomes(x, to)
matchChromosomes(x, to)
x |
GRanges object with seqnames to change |
to |
GRanges object from which seqnames is referenced |
Corrected input GRanges
Fursham Hamid
## --------------------------------------------------------------------- ## EXAMPLE USING TOY DATASET ## --------------------------------------------------------------------- require(GenomicRanges) ## Create toy GRanges objects gr1 <- GRanges("1", IRanges(start = c(1, 101), width = c(20, 20)), "+") gr2 <- GRanges("chr1", IRanges(start = c(1, 101), width = c(20, 20)), "+") ## Match Ensembl-style chromosomes from gr1 to UCSC-style gr2 matchChromosomes(gr1, gr2) ## Possible to match chrosomomes from GRanges object to a Biostrings # object containing seqlevels x0 <- c("chr2" = "CTCACCAGTAT", "chr3" = "TGTCAGTCGA") dna <- Biostrings::DNAStringSet(x0) ## Match gr1 to dna matchChromosomes(gr1, dna)
## --------------------------------------------------------------------- ## EXAMPLE USING TOY DATASET ## --------------------------------------------------------------------- require(GenomicRanges) ## Create toy GRanges objects gr1 <- GRanges("1", IRanges(start = c(1, 101), width = c(20, 20)), "+") gr2 <- GRanges("chr1", IRanges(start = c(1, 101), width = c(20, 20)), "+") ## Match Ensembl-style chromosomes from gr1 to UCSC-style gr2 matchChromosomes(gr1, gr2) ## Possible to match chrosomomes from GRanges object to a Biostrings # object containing seqlevels x0 <- c("chr2" = "CTCACCAGTAT", "chr3" = "TGTCAGTCGA") dna <- Biostrings::DNAStringSet(x0) ## Match gr1 to dna matchChromosomes(gr1, dna)
query_gtf data which have been corrected for its seqlevels and gene_ids
data(matched_query_gtf)
data(matched_query_gtf)
A GRanges object with 56 ranges and 6 metadata columns:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
ID given to transcripts
Matched gene_id
Original gene_id
Level of matching performed
Name of gene
...
'matchGeneInfo()' matches and corrects Gene IDs from a query GTF object to a reference GTF
matchGeneInfo(query, ref, primary_gene_id = NULL, secondary_gene_id = NULL)
matchGeneInfo(query, ref, primary_gene_id = NULL, secondary_gene_id = NULL)
query |
Query GTF imported as GRanges object |
ref |
Reference GTF as GRanges object |
primary_gene_id |
Character name of the primary gene id metadata in query GTF. Input to this argument is typically 'gene_id' |
secondary_gene_id |
Character name of the secondary gene id in query file. Example of input to this argument is 'ref_gene_id' |
The default approach to this correction relies on finding overlaps between transcripts in query with transcripts in reference. Using this method alone could result in false positive matches (19 percent false positives). To improve this, users have the option to invoke two additional layers of matching. (1) Matching by ENSEMBL Gene_IDs. If both query and reference transcript annotations containg Ensembl-style Gene IDs, this program will try to match both IDs in a less stringent manner. This correction can be invoked by providing the 'primary_gene_id' argument
(2) Matching by secondary Gene_IDs. Depending on the transcript assembly program, GTF/GFF3 annotations may contain additional comments on the transcript information. This may include a distinct secondary Gene ID annotation that potentially matches with the reference. To invoke this correction, provide 'primary_gene_id' and 'secondary_gene_id' arguments. To determine if your transcript assembly contain possible secondary Gene IDs, import query GTF file using 'importGTF()' and check its metadata columns
Gene_id-matched query GRanges
Fursham Hamid
## --------------------------------------------------------------------- ## EXAMPLE USING SAMPLE DATASET ## --------------------------------------------------------------------- # Load datasets data(chrom_matched_query_gtf, ref_gtf) # Run matching function matchGeneInfo(chrom_matched_query_gtf, ref_gtf)
## --------------------------------------------------------------------- ## EXAMPLE USING SAMPLE DATASET ## --------------------------------------------------------------------- # Load datasets data(chrom_matched_query_gtf, ref_gtf) # Run matching function matchGeneInfo(chrom_matched_query_gtf, ref_gtf)
Internally create or transform metadata of a GenomicRangesList
mutateeach(x, ...)
mutateeach(x, ...)
x |
GRangesList object |
... |
Name-value pairs of expressions. The name of each argument will be the name of a new metadata column, and the value will be its corresponding value. |
Transformed GRangesList object
Fursham Hamid
# Load dataset data(query_exons) # Create chr:start-end id for each entry mutateeach(query_exons, id = paste0(seqnames, ":", start, "-", end))
# Load dataset data(query_exons) # Create chr:start-end id for each entry mutateeach(query_exons, id = paste0(seqnames, ":", start, "-", end))
matched_query_gtf data that has undergone buildCDS function and containing CDS features
data(new_query_gtf)
data(new_query_gtf)
A GRanges object with 105 ranges and 7 metadata columns:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
ID given to transcripts
Matched gene_id
Original gene_id
Level of matching performed
Name of gene
Phase of open-reading frame
...
Predict protein domain families from coding transcripts
predictDomains(x, fasta, ..., plot = FALSE, progress_bar = FALSE, ncores = 4)
predictDomains(x, fasta, ..., plot = FALSE, progress_bar = FALSE, ncores = 4)
x |
Can be a GRanges object containing 'CDS' features in GTF format Can be a GRangesList object containing CDS ranges for each transcript |
fasta |
BSgenome or Biostrings object containing genomic sequence |
... |
Logical conditions to pass to dplyr::filter to subset transcripts for analysis. Variables are metadata information found in 'x' and multiple conditions can be provided delimited by comma. Example: transcript_id == "transcript1" |
plot |
Argument whether to plot out protein domains (Default: FALSE). Note: only first 20 proteins will be plotted |
progress_bar |
Argument whether to show progress bar (Default: FALSE). Useful to track progress of predicting a long list of proteins. |
ncores |
Number of cores to utilise to perform prediction |
Dataframe containing protein features for each cds entry
Fursham Hamid
## --------------------------------------------------------------------- ## EXAMPLE USING SAMPLE DATASET ## --------------------------------------------------------------------- # Load Mouse genome sequence library(BSgenome.Mmusculus.UCSC.mm10) # Load dataset data(new_query_gtf) # predict domains of all CDSs in query GTF predictDomains(new_query_gtf, Mmusculus, ncores=1) # predict domains of CDSs from Ptbp1 gene predictDomains(new_query_gtf, Mmusculus, gene_name == "Ptbp1",ncores=1) # predict domains of CDSs from Ptbp1 gene and plot architecture out predictDomains(new_query_gtf, Mmusculus, gene_name == "Ptbp1", plot = TRUE,ncores=1)
## --------------------------------------------------------------------- ## EXAMPLE USING SAMPLE DATASET ## --------------------------------------------------------------------- # Load Mouse genome sequence library(BSgenome.Mmusculus.UCSC.mm10) # Load dataset data(new_query_gtf) # predict domains of all CDSs in query GTF predictDomains(new_query_gtf, Mmusculus, ncores=1) # predict domains of CDSs from Ptbp1 gene predictDomains(new_query_gtf, Mmusculus, gene_name == "Ptbp1",ncores=1) # predict domains of CDSs from Ptbp1 gene and plot architecture out predictDomains(new_query_gtf, Mmusculus, gene_name == "Ptbp1", plot = TRUE,ncores=1)
Predict NMD sensitivity on mRNA transcripts
predictNMD(x, ..., cds = NULL, NMD_threshold = 50, progress_bar = TRUE)
predictNMD(x, ..., cds = NULL, NMD_threshold = 50, progress_bar = TRUE)
x |
Can be a GRanges object containing exon and CDS transcript features in GTF format. Can be a GRangesList object containing exon features for a list of transcripts.If so, 'cds' argument have to be provided. Can be a GRanges object containing exon features for a transcript. If so, 'cds' argument have to be provided. |
... |
Logical conditions to pass to dplyr::filter to subset transcripts for analysis. Variables are metadata information found in 'x' and multiple conditions can be provided delimited by comma. Example: transcript_id == "transcript1" |
cds |
If 'x' is a GRangesList object, 'cds' has to be a GRangesList containing CDS features for the list of transcripts in 'x'. List names in 'x' and 'cds' have to match. If 'x' is a GRanges object, 'cds' has to be a GRanges containing CDS features for the transcript in 'x'. |
NMD_threshold |
Minimum distance of stop_codon to last exon junction (EJ) which triggers NMD. Default = 50bp |
progress_bar |
Whether to display progress Default = TRUE |
Dataframe with prediction of NMD sensitivity and NMD features:
is_NMD: logical value in prediciting transcript sensitivity to NMD
stop_to_lastEJ: Integer value of the number of bases between the first base of the stop_codon to the last base of EJ. A positive value indicates that the last EJ is downstream of the stop_codon.
num_of_down_EJs: Number of EJs downstream of the stop_codon.
‘3_UTR_length': Length of 3’ UTR
Fursham Hamid
## --------------------------------------------------------------------- ## EXAMPLE USING SAMPLE DATASET ## --------------------------------------------------------------------- # Load datasets data(new_query_gtf, query_exons, query_cds) ## Using GTF GRanges as input predictNMD(new_query_gtf) ### Transcripts for analysis can be subsetted using logical conditions predictNMD(new_query_gtf, transcript_id == "transcript1") predictNMD(new_query_gtf, transcript_id %in% c("transcript1", "transcript3")) ## Using exon and CDS GRangesLists as input predictNMD(query_exons, cds = query_cds) predictNMD(query_exons, cds = query_cds, transcript_id == "transcript3") ## Using exon and CDS GRanges as input predictNMD(query_exons[[3]], cds = query_cds[[3]]) ## --------------------------------------------------------------------- ## EXAMPLE USING TRANSCRIPT ANNOTATION ## --------------------------------------------------------------------- library(AnnotationHub) ## Retrieve GRCm38 trancript annotation ah <- AnnotationHub() GRCm38_gtf <- ah[["AH60127"]] ## Run tool on specific gene family predictNMD(GRCm38_gtf, gene_name == "Ptbp1")
## --------------------------------------------------------------------- ## EXAMPLE USING SAMPLE DATASET ## --------------------------------------------------------------------- # Load datasets data(new_query_gtf, query_exons, query_cds) ## Using GTF GRanges as input predictNMD(new_query_gtf) ### Transcripts for analysis can be subsetted using logical conditions predictNMD(new_query_gtf, transcript_id == "transcript1") predictNMD(new_query_gtf, transcript_id %in% c("transcript1", "transcript3")) ## Using exon and CDS GRangesLists as input predictNMD(query_exons, cds = query_cds) predictNMD(query_exons, cds = query_cds, transcript_id == "transcript3") ## Using exon and CDS GRanges as input predictNMD(query_exons[[3]], cds = query_cds[[3]]) ## --------------------------------------------------------------------- ## EXAMPLE USING TRANSCRIPT ANNOTATION ## --------------------------------------------------------------------- library(AnnotationHub) ## Retrieve GRCm38 trancript annotation ah <- AnnotationHub() GRCm38_gtf <- ah[["AH60127"]] ## Run tool on specific gene family predictNMD(GRCm38_gtf, gene_name == "Ptbp1")
A dataset containing coordinates of CDS from 4 transcripts of mouse Ptbp1. Transcript names and gene IDs have been modified
data(query_cds)
data(query_cds)
A GRangesList object with 4 elements:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
ID given to transcripts
Phase of open-reading frame
Method by which CDS was built
...
A dataset containing coordinates of exons from 4 transcripts of mouse Ptbp1. Transcript names and gene IDs have been modified
data(query_exons)
data(query_exons)
A GRangesList object with 4 elements:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
ID given to transcripts
Matched gene_id
Original gene_id
Level of matching performed
Name of gene
...
A dataset containing coordinates of transcript and exons from 4 transcripts of mouse Ptbp1. Transcript names and gene IDs have been modified to demonstrate de novo origin of GTF
data(query_gtf)
data(query_gtf)
A GRanges object with 56 ranges and 3 metadata columns:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
Name or ID given to transcripts
Name or ID given to gene origin of transcripts
...
A dataset containing coordinates of CDS from 2 reference transcripts of mouse Ptbp1.
data(ref_cds)
data(ref_cds)
A GRangesList object with 2 elements:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
Phase of open-reading frame
Matched gene_id
Name of gene
ID given to transcripts
...
A dataset containing coordinates of exons from 2 reference transcripts of mouse Ptbp1.
data(ref_exons)
data(ref_exons)
A GRangesList object with 2 elements:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
Phase of open-reading frame
Matched gene_id
Name of gene
ID given to transcripts
...
A dataset containing coordinates of transcript and exons from 2 reference transcripts of mouse Ptbp1.
data(ref_gtf)
data(ref_gtf)
A GRanges object with 64 ranges and 5 metadata columns:
Chromosome, start, end, and strand info of 4 transcripts and its exons
Entry type; transcript or exon
Phase of open-reading frame
Matched gene_id
Name of gene
ID given to transcripts
...
Internally sort each element of a GenomicRangesList
sorteach(x, ...)
sorteach(x, ...)
x |
GRangesList object |
... |
Comma separated list of unquoted variable names to sort by. Variables are names of metadata columns found in GRangesList object. Use desc() to sort a variable in descending order. Input can be 'exonorder' to sort each element in exon order |
Sorted GRangesList object
Fursham Hamid
# Load dataset data(query_exons) # sort elements in each GRangesList in descending coordinate order query_exons_desc <- sorteach(query_exons, dplyr::desc(start)) # sort elements in each GRangesList in its order in transcript query_exons_exonorder <- sorteach(query_exons_desc, exonorder) # test similarity of query_exons and query_exons_exonorder identical(query_exons, query_exons_exonorder)
# Load dataset data(query_exons) # sort elements in each GRangesList in descending coordinate order query_exons_desc <- sorteach(query_exons, dplyr::desc(start)) # sort elements in each GRangesList in its order in transcript query_exons_exonorder <- sorteach(query_exons_desc, exonorder) # test similarity of query_exons and query_exons_exonorder identical(query_exons, query_exons_exonorder)
'subsetNewTranscripts()' will retain transcripts in 'query' that are distinct from those in 'ref'
subsetNewTranscripts(query, ref, refine.by = c("none", "intron", "cds"))
subsetNewTranscripts(query, ref, refine.by = c("none", "intron", "cds"))
query |
GRanges object containing query GTF data. |
ref |
GRanges object containing reference GTF data. |
refine.by |
Whether to refine the selection process by removing query transcripts with similar introns or CDS structure to reference. Default input is "none", and can be changed to "intron" or "cds" respectively. |
'subsetNewTranscripts()' will compare query and reference GTF GRanges and return query transcripts with different exon structures from reference transcripts. Transcriptome assemblers may sometime extend 5' and 3' ends of known transcripts based on experimental data. These annotated transcripts can be removed by inputting "intron" to the refine.by argument. This will further compare and remove transcripts of identical intron structures. Alternatively, transcripts with unique CDS coordinates can be selected by typing "cds" to the refine.by argument.
Filtered GRanges GTF object
Fursham Hamid
# Load dataset data(matched_query_gtf, ref_gtf) # shortlist new transcripts subsetNewTranscripts(matched_query_gtf, ref_gtf)
# Load dataset data(matched_query_gtf, ref_gtf) # shortlist new transcripts subsetNewTranscripts(matched_query_gtf, ref_gtf)
Resize 5' and 3' ends of a transcript GenomicRanges
trimTranscripts(x, start = 0, end = 0)
trimTranscripts(x, start = 0, end = 0)
x |
GRanges or GRangesList object containing exon coordinates for each transcript |
start |
Number of bases to trim from the start of transcript. Providing a negative value will extend the transcript instead. If 'x' is a GRanges object, 'start' is a single integer. If 'x' is a GRangesList, 'start' can be a single integer or a list of integers of the same length as 'x' |
end |
Number of bases to trim from the end of transcript. Providing a negative value will extend the transcript instead. If 'x' is a GRanges object, 'end' is a single integer. If 'x' is a GRangesList, 'end' can be a single integer or a list of integers of the same length as 'x' |
Trimmed GenomicRanges object
Fursham Hamid
library(GenomicRanges) gr1 <- GRanges( seqnames = "chr1", strand = c("+", "+", "+"), ranges = IRanges( start = c(1, 500, 1000), end = c(100, 600, 1100) ) ) trimTranscripts(gr1, 20, 80) trimTranscripts(gr1, 110, 150)
library(GenomicRanges) gr1 <- GRanges( seqnames = "chr1", strand = c("+", "+", "+"), ranges = IRanges( start = c(1, 500, 1000), end = c(100, 600, 1100) ) ) trimTranscripts(gr1, 20, 80) trimTranscripts(gr1, 110, 150)
A wrapper around wiggleplotr's plotTranscripts function.
See the documentation for (plotTranscripts
)
for more information.
viewTranscripts(x, ..., rescale_introns = FALSE, ncol = 1)
viewTranscripts(x, ..., rescale_introns = FALSE, ncol = 1)
x |
GRanges object containing transcript annotation in GTF format |
... |
Character value of features to plot. Multiple features can be plotted by entering comma-delimited values. Features will be extracted from metadata gene_name, gene_id and transcript_id of the GTF. Can also be a conditional statement to filter values from variables in the GTF (e.g. gene_name == "Ptbp1") |
rescale_introns |
Specifies if the introns should be scaled to fixed length or not. (default: FALSE) |
ncol |
Number of columns to patch the output plots (default: 1) |
ggplot2 object. If multiple genes are detected, plots will be combined using patchwork
Fursham Hamid
## --------------------------------------------------------------------- ## EXAMPLE USING SAMPLE DATASET ## --------------------------------------------------------------------- # Load datasets data(query_gtf, ref_gtf) viewTranscripts(query_gtf) viewTranscripts(query_gtf, "transcript1") viewTranscripts(ref_gtf) ## --------------------------------------------------------------------- ## EXAMPLE USING TRANSCRIPT ANNOTATION ## --------------------------------------------------------------------- library(AnnotationHub) ## Retrieve GRCm38 trancript annotation ah <- AnnotationHub() GRCm38_gtf <- ah[["AH60127"]] ## Plot transcripts from Ptbp1 gene viewTranscripts(GRCm38_gtf, "Ptbp1") # Plot transcripts from Ptbp1 and Ptbp2 genes viewTranscripts(GRCm38_gtf, "Ptbp1", "Ptbp2")
## --------------------------------------------------------------------- ## EXAMPLE USING SAMPLE DATASET ## --------------------------------------------------------------------- # Load datasets data(query_gtf, ref_gtf) viewTranscripts(query_gtf) viewTranscripts(query_gtf, "transcript1") viewTranscripts(ref_gtf) ## --------------------------------------------------------------------- ## EXAMPLE USING TRANSCRIPT ANNOTATION ## --------------------------------------------------------------------- library(AnnotationHub) ## Retrieve GRCm38 trancript annotation ah <- AnnotationHub() GRCm38_gtf <- ah[["AH60127"]] ## Plot transcripts from Ptbp1 gene viewTranscripts(GRCm38_gtf, "Ptbp1") # Plot transcripts from Ptbp1 and Ptbp2 genes viewTranscripts(GRCm38_gtf, "Ptbp1", "Ptbp2")