This vignette answers the questions we receive most frequently about ampliCan.
Yes, `amplican` can be used more or less as normal. The expected edit site should still be given in UPPER case letters, but in the case of dimers it should span the region between the two binding sites. The guide sequence column is then typically set to the same sequence as the uppercase region. If you have controls, make sure their Guide column and Group column match those of the experiment, so that normalization can pair them.
`amplican` is versatile in its normalization. In the default pipeline the guideRNA and Group columns determine which experiments are normalized by which. The Control column specifies which experiments are considered controls as opposed to cases. The controls that match both the guideRNA and Group are averaged and used to normalize every read from the case group with the same guideRNA and Group.
ID | guideRNA | Group | Control |
---|---|---|---|
1 | ACTG | g1 | 0 |
2 | ACTG | g1 | 0 |
3 | ACTG | g1 | 1 |
4 | ACTG | g2 | 1 |
5 | ACTG | g2 | 0 |
6 | ACTG | g2 | 0 |
In the above example, with default configuration, Experiment ID 1 and 2 will be normalized with ID 3, while ID 5 and 6 with ID 4.
Alternatively, you can normalize by guideRNA match alone by specifying `normalize = c("guideRNA")` in `amplicanPipeline`. In that case, IDs 3 and 4 are averaged and used to normalize all cases, since all experiments have a matching guideRNA.
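The matching rule described above can be sketched in base R. This is only an illustration of which controls end up paired with which cases; `controls_for_cases` is a hypothetical helper written for this example and is not part of the package, and the data frame is a toy copy of the table above rather than a real config file.

```r
# Toy config mirroring the table above (not a real ampliCan config file).
config <- data.frame(
  ID       = 1:6,
  guideRNA = rep("ACTG", 6),
  Group    = rep(c("g1", "g2"), each = 3),
  Control  = c(0, 0, 1, 1, 0, 0)
)

# Hypothetical helper: for each case, list the IDs of the controls that
# share the same values in the columns named in `normalize`.
controls_for_cases <- function(config, normalize = c("guideRNA", "Group")) {
  # One matching key per experiment, built from the chosen columns.
  key <- do.call(paste, c(config[normalize], sep = "|"))
  is_control <- config$Control == 1
  matched <- lapply(which(!is_control), function(i) {
    config$ID[is_control & key == key[i]]
  })
  names(matched) <- config$ID[!is_control]
  matched
}

by_default <- controls_for_cases(config)                         # guideRNA + Group
by_guide   <- controls_for_cases(config, normalize = "guideRNA") # guideRNA only
```

With the default columns, cases 1 and 2 are paired with control 3, and cases 5 and 6 with control 4. With `normalize = "guideRNA"`, every case is paired with both controls 3 and 4.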
Unique reads is the number of reads when each duplicate is counted only once. For paired-end sequencing we require the combination of forward and reverse read to be unique. This is a simple metric of the heterogeneity of your reads.
If you have many reads but few unique ones, many of your reads are identical, possibly because CRISPR did not cut, or cut in a highly specific manner. If you have a very high number of unique reads, your reads mostly differ from each other. Sequencing errors, alignments and mosaic CRISPR activity can contribute to this. Both cases can happen in successful experiments, but usually a few reads tend to be sampled more frequently than the rest.
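As a minimal sketch of this definition for paired-end data, the base R snippet below counts a read pair as a duplicate only when both mates are identical to those of another pair. The sequences are made up for illustration, and this is not how the package computes the value internally.

```r
# Toy paired-end reads; each position is one read pair (made-up sequences).
forward <- c("ACGT", "ACGT", "ACGA", "ACGT", "TTGA")
reverse <- c("TTAA", "TTAA", "TTAA", "TTAC", "CCGA")

# A pair is a duplicate only if BOTH the forward and the reverse read
# match another pair, so uniqueness is judged on the combination.
pairs <- data.frame(forward, reverse)
total_reads  <- nrow(pairs)
unique_reads <- nrow(unique(pairs))

# Fraction of reads that are unique: a rough heterogeneity indicator.
unique_fraction <- unique_reads / total_reads
```

Here only the first two pairs are identical in both mates, so five reads collapse to four unique reads.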
Reads_Del is the number of reads that had a deletion, and Reads_Ins is the number of reads that had an insertion. Reads_Edited is the number of reads that had any edit. A read with both an insertion and a deletion is counted once in Reads_Edited, so Reads_Edited can be smaller than the sum of Reads_Del and Reads_Ins.
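The relationship between these counts can be illustrated with a toy example. The per-read flags below are invented, and the sketch assumes, for illustration only, that insertions and deletions are the only edit types being counted.

```r
# Made-up per-read flags for five reads.
has_del <- c(TRUE,  FALSE, TRUE, FALSE, FALSE)
has_ins <- c(FALSE, TRUE,  TRUE, FALSE, FALSE)

reads_del    <- sum(has_del)            # reads with a deletion
reads_ins    <- sum(has_ins)            # reads with an insertion
reads_edited <- sum(has_del | has_ins)  # reads with any edit, each counted once
```

The third read carries both an insertion and a deletion, so it contributes to both `reads_del` and `reads_ins` but only once to `reads_edited`.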
ampliCan cannot at present handle ABI files directly, but ABI files can be converted to fastq files using other software.
The main reasons to alter the normalization threshold are:
- When high precision is required (below 0.01%), it is beneficial to lower the normalization threshold, e.g. `min_freq = 0.001`, provided you have sufficient sequencing depth.
- When you have a homogeneous genetic background or your sequencing depth is low, it might be beneficial to set the threshold higher, e.g. `min_freq = 0.1`.
- When you suspect or expect Index Hopping in your reads, you should raise the threshold, e.g. `min_freq = 0.03`. Expected Index Hopping levels can reach a frequency of 0.02, and such events can be mistaken for genetic background during normalization if the threshold is kept at its default.
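To make the effect of the threshold concrete, here is a simplified base R sketch. It assumes that events seen in the control at or above `min_freq` are treated as genetic background, and it assumes a default of 0.01. The helper `background_events`, the event names and the frequencies are all hypothetical, and the real normalization in the package is more involved.

```r
# Made-up event frequencies observed in a control sample.
control_events <- data.frame(
  event     = c("background_SNP", "index_hop_del", "seq_error"),
  frequency = c(0.45, 0.02, 0.002)
)

# Hypothetical helper: events at or above min_freq are treated as
# background (assumed rule; the default of 0.01 is also an assumption).
background_events <- function(events, min_freq = 0.01) {
  events$event[events$frequency >= min_freq]
}

at_default <- background_events(control_events)
at_raised  <- background_events(control_events, min_freq = 0.03)
```

At the assumed default, the index-hopped deletion at frequency 0.02 is classified as background alongside the genuine variant. Raising the threshold to 0.03 leaves only the genuine background variant.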
This should be apparent from the mismatch plot, where the frequency line of mismatches in the control should give you an idea of what the background noise level is.
You can adjust the threshold for normalization to `min_freq = 0.15` or use the function `amplicanPipelineConservative`.