Title: | Cleavage of Polypeptide Sequences |
---|---|
Description: | In-silico cleavage of polypeptide sequences. The cleavage rules are taken from: http://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html |
Authors: | Sebastian Gibb [aut, cre] |
Maintainer: | Sebastian Gibb <[email protected]> |
License: | GPL (>= 3) |
Version: | 1.45.0 |
Built: | 2024-12-27 04:29:04 UTC |
Source: | https://github.com/bioc/cleaver |
This package cleaves polypeptide sequences. It provides
three functions: cleave
,
cleavageRanges
and
cleavageSites
.
The cleavage rules are taken from: https://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html
Package: | cleaver |
License: | GPL (>= 3) |
URL: | https://github.com/sgibb/cleaver/ |
Sebastian Gibb [email protected]
https://github.com/sgibb/cleaver/
Gasteiger E., Hoogland C., Gattiker A., Duvaud S.,
Wilkins M.R., Appel R.D., Bairoch A.; "Protein
Identification and Analysis Tools on the ExPASy Server".
(In) John M. Walker (ed): The Proteomics Protocols
Handbook, Humana Press (2005).
https://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html
cleave
,
cleavageRanges
and
cleavageSites
.
This functions cleave polypeptide sequences. Use cleavageSites
to find
the cleavage sites, cleavageRanges
to find the cleavage
ranges and cleave
to get the cleavage products.
## S4 method for signature 'character' cleave(x, enzym = "trypsin", missedCleavages = 0, custom = NULL, unique = TRUE) ## S4 method for signature 'AAString' cleave(x, enzym = "trypsin", missedCleavages = 0, custom = NULL, unique = TRUE) ## S4 method for signature 'AAStringSet' cleave(x, enzym = "trypsin", missedCleavages = 0, custom = NULL, unique = TRUE) ## S4 method for signature 'character' cleavageRanges(x, enzym = "trypsin", missedCleavages = 0, custom = NULL) ## S4 method for signature 'AAString' cleavageRanges(x, enzym = "trypsin", missedCleavages = 0, custom = NULL) ## S4 method for signature 'AAStringSet' cleavageRanges(x, enzym = "trypsin", missedCleavages = 0, custom = NULL) ## S4 method for signature 'character' cleavageSites(x, enzym = "trypsin", custom = NULL) ## S4 method for signature 'AAString' cleavageSites(x, enzym = "trypsin", custom = NULL) ## S4 method for signature 'AAStringSet' cleavageSites(x, enzym = "trypsin", custom = NULL)
## S4 method for signature 'character' cleave(x, enzym = "trypsin", missedCleavages = 0, custom = NULL, unique = TRUE) ## S4 method for signature 'AAString' cleave(x, enzym = "trypsin", missedCleavages = 0, custom = NULL, unique = TRUE) ## S4 method for signature 'AAStringSet' cleave(x, enzym = "trypsin", missedCleavages = 0, custom = NULL, unique = TRUE) ## S4 method for signature 'character' cleavageRanges(x, enzym = "trypsin", missedCleavages = 0, custom = NULL) ## S4 method for signature 'AAString' cleavageRanges(x, enzym = "trypsin", missedCleavages = 0, custom = NULL) ## S4 method for signature 'AAStringSet' cleavageRanges(x, enzym = "trypsin", missedCleavages = 0, custom = NULL) ## S4 method for signature 'character' cleavageSites(x, enzym = "trypsin", custom = NULL) ## S4 method for signature 'AAString' cleavageSites(x, enzym = "trypsin", custom = NULL) ## S4 method for signature 'AAStringSet' cleavageSites(x, enzym = "trypsin", custom = NULL)
x |
polypeptide sequences. |
enzym |
|
missedCleavages |
|
custom |
|
unique |
|
The cleavage rules are taken from: https://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html
Cleavage rules (cleavage between P1 and P1'):
Rule name | P4 | P3 | P2 | P1 | P1' | P2' |
arg-c proteinase |
- | - | - | R | - | - |
asp-n endopeptidase |
- | - | - | - | D | - |
bnps-skatole-c |
- | - | - | W | - | - |
caspase1 |
F,W,Y,L | - | H,A,T | D | not P,E,D,Q,K,R | - |
caspase2 |
D | V | A | D | not P,E,D,Q,K,R | - |
caspase3 |
D | M | Q | D | not P,E,D,Q,K,R | - |
caspase4 |
L | E | V | D | not P,E,D,Q,K,R | - |
caspase5 |
L,W | E | H | D | - | - |
caspase6 |
V | E | H,I | D | not P,E,D,Q,K,R | - |
caspase7 |
D | E | V | D | not P,E,D,Q,K,R | - |
caspase8 |
I,L | E | T | D | not P,E,D,Q,K,R | - |
caspase9 |
L | E | H | D | - | - |
caspase10 |
I | E | A | D | - | - |
chymotrypsin-high |
- | - | - | F,Y | not P | - |
- | - | - | W | not M,P | - | |
chymotrypsin-low |
- | - | - | F,L,Y | not P | - |
- | - | - | W | not M,P | - | |
- | - | - | M | not P,Y | - | |
- | - | - | H | not D,M,P,W | - | |
clostripain |
- | - | - | R | - | - |
cnbr |
- | - | - | M | - | - |
enterokinase |
D,E | D,E | D,E | K | - | - |
factor xa |
A,F,G,I,L,T,V,M | D,E | G | R | - | - |
formic acid |
- | - | - | D | - | - |
glutamyl endopeptidase |
- | - | - | E | - | - |
granzyme-b |
I | E | P | D | - | - |
hydroxylamine |
- | - | - | N | G | - |
iodosobenzoic acid |
- | - | - | W | - | - |
lysc |
- | - | - | K | - | - |
lysn |
- | - | - | - | K | - |
lysarginase |
- | - | - | - | K,R | - |
neutrophil elastase |
- | - | - | A,V | - | - |
ntcb |
- | - | - | - | C | - |
pepsin1.3 |
- | not H,K,R | not P | not R | F,L | not P |
pepsin |
- | not H,K,R | not P | not R | F,L,W,Y | not P |
- | not H,K,R | not P | F,L,W,Y | - | not P | |
- | not H,K,R | not P | F,L | - | not P | |
proline endopeptidase |
- | - | not H,K,R | P | not P | - |
proteinase k |
- | - | - | A,E,F,I,L,T,V,W,Y | - | - |
staphylococcal peptidase i |
- | - | not E | E | - | - |
thermolysin |
- | - | - | not D,E | A,F,I,L,M,V | - |
thrombin |
- | - | G | R | G | - |
A,F,G,I,L,T,V,M | A,F,G,I,L,T,V,W | P | R | not D,E | not D,E | |
trypsin |
- | - | - | K,R | not P | - |
- | - | W | K | P | - | |
- | - | M | R | P | - | |
trypsin-high |
- | - | - | K,R | not P | - |
- | - | W | K | P | - | |
- | - | M | R | P | - | |
trypsin-low |
- | - | - | K,R | not P | - |
- | - | W | K | P | - | |
- | - | M | R | P | - | |
trypsin-simple |
- | - | - | K,R | - | - |
Exceptions:
Rule name | Enzyme name | P4 | P3 | P2 | P1 | P1' | P2' |
trypsin | - | - | C,D | K | D | - | |
- | - | C | K | H,Y | - | ||
- | - | C | R | K | - | ||
- | - | R | R | H,R | - | ||
trypsin-high | - | - | C,D | K | D | - | |
- | - | C | K | H,Y | - | ||
- | - | C | R | K | - | ||
- | - | R | R | H,R | - | ||
Rule name | Enzyme name |
arg-c proteinase |
Arg-C proteinase |
asp-n endopeptidase |
Asp-N endopeptidase |
bnps-skatole-c |
BNPS-Skatole |
caspase1 |
Caspase 1 |
caspase2 |
Caspase 2 |
caspase3 |
Caspase 3 |
caspase4 |
Caspase 4 |
caspase5 |
Caspase 5 |
caspase6 |
Caspase 6 |
caspase7 |
Caspase 7 |
caspase8 |
Caspase 8 |
caspase9 |
Caspase 9 |
caspase10 |
Caspase 10 |
chymotrypsin-high
|
Chymotrypsin-high specificity (C-term to [FYW], not before P) |
chymotrypsin-low
|
Chymotrypsin-low specificity (C-term to [FYWML], not before P) |
clostripain |
Clostripain (Clostridiopeptidase B) |
cnbr |
CNBr |
enterokinase |
Enterokinase |
factor xa |
Factor Xa |
formic acid |
Formic acid |
glutamyl endopeptidase |
Glutamyl endopeptidase |
granzyme-b |
Granzyme B |
hydroxylamine |
Hydroxylamine |
iodosobenzoic acid |
Iodosobenzoic acid |
lysc |
LysC |
lysn |
LysN |
lysarginase |
LysargiNase |
neutrophil elastase |
Neutrophil elastase |
ntcb |
NTCB (2-nitro-5-thiocyanobenzoic acid) |
pepsin1.3 |
Pepsin (pH == 1.3) |
pepsin |
Pepsin (pH > 2) |
proline endopeptidase |
Proline-endopeptidase |
proteinase k |
Proteinase K |
staphylococcal peptidase i |
Staphylococcal Peptidase I |
thermolysin |
Thermolysin |
thrombin |
Thrombin |
trypsin |
Trypsin |
trypsin-high |
Trypsin, higher specificity as defined in PeptideMass, identical to trypsin |
trypsin-low |
Trypsin, C-term to K/R if C-term is not P, as defined in PeptideMass |
trypsin-simple |
Trypsin, C-term to K/R, even before P, as defined in PeptideMass |
cleave
If x
is a character
it returns a list
of the same
length as x
. Each element contains a character
vector with
the corresponding cleavage products of the polypeptides.
If x
is an AAString
or an
AAStringSet
an
AAStringSet
or an
AAStringSetList
instance of the same length as
x
is returned.
Each element contains an
AAString
or an AAStringSet
instance with the corresponding cleavage products of the polypeptides.
cleavageRanges
If x
is a character
it returns a list
of the same
length as x
. Each element contains a two-column matrix
with
the start and end positions of the peptides.
If x
is an AAString
or an
AAStringSet
instance an
IRanges
or an IRangesList
of the
same length as x
is returned.
cleavageSites
Returns a list
of the same length as x
. Each element
contains an integer
vector with the cleavage positions.
Overview:
Input | cleave |
cleavageRanges
|
cleavageSites |
character |
list of character
|
list of matrix
|
list of integer |
AAString |
AAStringSet
|
IRanges
|
list of integer |
AAStringSet |
AAStringSetList
|
IRangesList
|
list of integer |
Sebastian Gibb <[email protected]>
Gasteiger E., Hoogland C., Gattiker A., Duvaud S.,
Wilkins M.R., Appel R.D., Bairoch A.; "Protein
Identification and Analysis Tools on the ExPASy Server".
(In) John M. Walker (ed): The Proteomics Protocols
Handbook, Humana Press (2005).
https://web.expasy.org/peptide_cutter/peptidecutter_enzymes.html
PeptideMass https://web.expasy.org/peptide_mass/peptide-mass-doc.html#table1
AAString
,
AAStringSet
,
AAStringSetList
,
IRanges
,
IRangesList
library("cleaver") ## Gastric juice peptide 1 (UniProtKB/Swiss-Prot: GAJU_HUMAN/P01358) gaju <- "LAAGKVEDSD" cleave(gaju, "trypsin") # $LAAGKVEDSD # [1] "LAAGK" "VEDSD" cleavageRanges(gaju, "trypsin") # $LAAGKVEDSD # start end # [1,] 1 5 # [2,] 6 10 cleavageSites(gaju, "trypsin") # $LAAGKVEDSD # [1] 5 cleave(gaju, "trypsin", missedCleavages=1) # $LAAGKVEDSD # [1] "LAAGKVEDSD" cleavageRanges(gaju, "trypsin", missedCleavages=1) # $LAAGKVEDSD # start end # [1,] 1 10 cleave(gaju, "trypsin", missedCleavages=0:1) # $LAAGKVEDSD # [1] "LAAGK" "VEDSD" "LAAGKVEDSD" cleavageRanges(gaju, "trypsin", missedCleavages=0:1) # $LAAGKVEDSD # start end # [1,] 1 5 # [2,] 6 10 # [3,] 1 10 cleave(gaju, "pepsin") # $LAAGKVEDSD # [1] "LAAGKVEDSD" # (no cleavage) ## use AAStringSet gaju <- AAStringSet("LAAGKVEDSD") cleave(gaju) # AAStringSetList of length 1 # [["LAAGKVEDSD"]] LAAGK VEDSD ## Beta-enolase (UniProtKB/Swiss-Prot: ENOB_THUAL/P86978) enob <- "SITKIKAREILD" cleave(enob, "trypsin") # $SITKIKAREILD # [1] "SITK" "IK" "AR" "EILD" cleave(enob, "trypsin", missedCleavages=2) # $SITKIKAREILD # [1] "SITKIKAR" "IKAREILD" cleave(enob, "trypsin", missedCleavages=0:2) # $SITKIKAREILD # [1] "SITK" "IK" "AR" "EILD" "SITKIK" "IKAR" # [7] "AREILD" "SITKIKAR" "IKAREILD" ## define own cleavage rule: cleave at K cleave(enob, custom="K") # $SITKIKAREILD # [1] "SITK" "IK" "AREILD" cleavageRanges(enob, custom="K") # $SITKIKAREILD # start end # [1,] 1 4 # [2,] 5 6 # [3,] 7 12 ## define own cleavage rule: cleave at K but not if followed by A cleave(enob, custom=c("K", "K(?=A)")) # $SITKIKAREILD # [1] "SITK" "IKAREILD" cleavageRanges(enob, custom=c("K", "K(?=A)")) # $SITKIKAREILD # start end # [1,] 1 4 # [2,] 5 12 cleavageSites(enob, custom=c("K", "K(?=A)")) # $SITKIKAREILD # [1] 4
library("cleaver") ## Gastric juice peptide 1 (UniProtKB/Swiss-Prot: GAJU_HUMAN/P01358) gaju <- "LAAGKVEDSD" cleave(gaju, "trypsin") # $LAAGKVEDSD # [1] "LAAGK" "VEDSD" cleavageRanges(gaju, "trypsin") # $LAAGKVEDSD # start end # [1,] 1 5 # [2,] 6 10 cleavageSites(gaju, "trypsin") # $LAAGKVEDSD # [1] 5 cleave(gaju, "trypsin", missedCleavages=1) # $LAAGKVEDSD # [1] "LAAGKVEDSD" cleavageRanges(gaju, "trypsin", missedCleavages=1) # $LAAGKVEDSD # start end # [1,] 1 10 cleave(gaju, "trypsin", missedCleavages=0:1) # $LAAGKVEDSD # [1] "LAAGK" "VEDSD" "LAAGKVEDSD" cleavageRanges(gaju, "trypsin", missedCleavages=0:1) # $LAAGKVEDSD # start end # [1,] 1 5 # [2,] 6 10 # [3,] 1 10 cleave(gaju, "pepsin") # $LAAGKVEDSD # [1] "LAAGKVEDSD" # (no cleavage) ## use AAStringSet gaju <- AAStringSet("LAAGKVEDSD") cleave(gaju) # AAStringSetList of length 1 # [["LAAGKVEDSD"]] LAAGK VEDSD ## Beta-enolase (UniProtKB/Swiss-Prot: ENOB_THUAL/P86978) enob <- "SITKIKAREILD" cleave(enob, "trypsin") # $SITKIKAREILD # [1] "SITK" "IK" "AR" "EILD" cleave(enob, "trypsin", missedCleavages=2) # $SITKIKAREILD # [1] "SITKIKAR" "IKAREILD" cleave(enob, "trypsin", missedCleavages=0:2) # $SITKIKAREILD # [1] "SITK" "IK" "AR" "EILD" "SITKIK" "IKAR" # [7] "AREILD" "SITKIKAR" "IKAREILD" ## define own cleavage rule: cleave at K cleave(enob, custom="K") # $SITKIKAREILD # [1] "SITK" "IK" "AREILD" cleavageRanges(enob, custom="K") # $SITKIKAREILD # start end # [1,] 1 4 # [2,] 5 6 # [3,] 7 12 ## define own cleavage rule: cleave at K but not if followed by A cleave(enob, custom=c("K", "K(?=A)")) # $SITKIKAREILD # [1] "SITK" "IKAREILD" cleavageRanges(enob, custom=c("K", "K(?=A)")) # $SITKIKAREILD # start end # [1,] 1 4 # [2,] 5 12 cleavageSites(enob, custom=c("K", "K(?=A)")) # $SITKIKAREILD # [1] 4