Proteolytic resistance analysis

MSstatsLiP Workflow: Protease resistance analysis

This is an R Markdown Notebook describing the analysis of a LiP-MS experiment using MSstatsLiP. When you execute code within the notebook, the results appear beneath the code.

Here, we use LiP-MS data of human alpha-Synuclein in the monomeric (M) and fibrillar (F) forms, spiked into an S. cerevisiae lysate at 5 pmol/ug lysate (M1 and F1) and 20 pmol/ug lysate (M2 and F2). The data set is composed of four biological replicates per condition.

1. Installation

  • Install and load all necessary packages. The installation needs to be performed at first use only. Un-comment the lines for execution.
 knitr::opts_chunk$set(include = FALSE)
library(MSstatsLiP)
library(tidyverse)
library(data.table)
library(gghighlight)
  • Set the working directory
input_folder=choose.dir(caption="Choose the working directory")
knitr::opts_knit$set(root.dir = input_folder) 
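
The installation lines mentioned in the first bullet are not shown in this rendering. A minimal sketch of what they could look like, assuming MSstatsLiP is installed from Bioconductor through BiocManager and the remaining packages come from CRAN. The helper `missing_pkgs()` is hypothetical and only reports which packages are absent; the install calls stay commented out.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("MSstatsLiP", "tidyverse", "data.table", "gghighlight")
print(missing_pkgs(needed))

# Un-comment at first use only (assumed sources: Bioconductor and CRAN).
# install.packages("BiocManager")
# BiocManager::install("MSstatsLiP")
# install.packages(c("tidyverse", "data.table", "gghighlight"))
```

Note that choose.dir() and choose.files() are only available in R on Windows; on other systems the paths have to be supplied as plain character strings.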

2. Data preprocessing

2.1 Load datasets

Load the data from the Spectronaut export. LiP data is loaded as raw_lip, trypsin-only control data (TrP data) is loaded as raw_prot. The function choose.files() enables browsing for the input file.

Caution: Make sure the separator delim is set correctly. For comma-separated values (csv), the separator is set to delim=",".

raw_lip <- read_delim(file=choose.files(caption="Choose LiP dataset"), 
                         delim=",", escape_double = FALSE, trim_ws = TRUE)
raw_prot <- read_delim(file=choose.files(caption="Choose TrP dataset"), 
                          delim=",", escape_double = FALSE, trim_ws = TRUE)
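# Replace the versioned accession "P37840.1" with "P37840" in every column.
# across() with a lambda replaces the deprecated mutate_all(funs(...)) idiom.
raw_lip <- raw_lip %>% mutate(across(everything(), ~ ifelse(.x == "P37840.1", "P37840", .x)))
raw_prot <- raw_prot %>% mutate(across(everything(), ~ ifelse(.x == "P37840.1", "P37840", .x)))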
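
The two calls above replace the versioned accession P37840.1 with P37840 wherever it occurs. A base-R sketch of the same replacement on a toy table may make the effect easier to see. The column names and rows are illustrative assumptions, not taken from the actual export.

```r
# Toy table standing in for a Spectronaut export (illustrative values only).
toy <- data.frame(
  PG.ProteinAccessions = c("P37840.1", "P16622", "P37840.1"),
  PEP.StrippedSequence = c("EGVVAAAEK", "AFSENITK", "TVEGAGSIAAATGFVK"),
  stringsAsFactors = FALSE
)

# Replace the versioned accession in every character column.
char_cols <- vapply(toy, is.character, logical(1))
toy[char_cols] <- lapply(toy[char_cols], function(x) {
  ifelse(x == "P37840.1", "P37840", x)
})

print(unique(toy$PG.ProteinAccessions))
```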

Load the FASTA file that was used in the Spectronaut search.

fasta_file=choose.files(caption = "Choose FASTA file")

Convert the data to MSstatsLiP format. Load first the LiP data set raw_lip, then the FASTA file fasta_file used for searches. If the experiment contains TrP data, raw_prot is loaded last.

To remove information on iRT peptides, the default setting is removeiRT = TRUE. By default, peptides containing modifications are filtered out; this can be changed with the argument removeModifications. Peptides with multiple protein annotations are also filtered out by default. For data sets containing protein isoforms, however, this filter can be switched off by setting removeNonUniqueProteins = FALSE.

The default settings use PeakArea as measure of intensity, filter features based on the q-value, with a q-value cut-off of 0.01 and import all conditions. You can adjust the settings accordingly. For information on each option, refer to the vignette of the function.

msstats_data <- SpectronauttoMSstatsLiPFormat(raw_lip, fasta_file, raw_prot)
## INFO  [2024-08-30 02:52:59] ** Raw data from Spectronaut imported successfully.
## INFO  [2024-08-30 02:52:59] ** Raw data from Spectronaut cleaned successfully.
## INFO  [2024-08-30 02:52:59] ** Using annotation extracted from quantification data.
## INFO  [2024-08-30 02:52:59] ** Run labels were standardized to remove symbols such as '.' or '%'.
## INFO  [2024-08-30 02:52:59] ** The following options are used:
##   - Features will be defined by the columns: PeptideSequence, PrecursorCharge, FragmentIon, ProductCharge
##   - Shared peptides will be removed.
##   - Proteins with single feature will not be removed.
##   - Features with less than 3 measurements across runs will be removed.
## WARN  [2024-08-30 02:52:59] ** PGQvalue not found in input columns.
## INFO  [2024-08-30 02:52:59] ** Intensities with values not smaller than 0.01 in EGQvalue are replaced with 0
## INFO  [2024-08-30 02:52:59] ** Features with all missing measurements across runs are removed.
## INFO  [2024-08-30 02:52:59] ** Shared peptides are removed.
## INFO  [2024-08-30 02:52:59] ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows: max
## INFO  [2024-08-30 02:52:59] ** Features with one or two measurements across runs are removed.
## INFO  [2024-08-30 02:52:59] ** Run annotation merged with quantification data.
## INFO  [2024-08-30 02:52:59] ** Features with one or two measurements across runs are removed.
## INFO  [2024-08-30 02:52:59] ** Fractionation handled.
## INFO  [2024-08-30 02:52:59] ** Updated quantification data to make balanced design. Missing values are marked by NA
## INFO  [2024-08-30 02:52:59] ** Finished preprocessing. The dataset is ready to be processed by the dataProcess function.
## INFO  [2024-08-30 02:52:59] ** Raw data from Spectronaut imported successfully.
## INFO  [2024-08-30 02:52:59] ** Raw data from Spectronaut cleaned successfully.
## INFO  [2024-08-30 02:52:59] ** Using annotation extracted from quantification data.
## INFO  [2024-08-30 02:52:59] ** Run labels were standardized to remove symbols such as '.' or '%'.
## INFO  [2024-08-30 02:52:59] ** The following options are used:
##   - Features will be defined by the columns: PeptideSequence, PrecursorCharge, FragmentIon, ProductCharge
##   - Shared peptides will be removed.
##   - Proteins with single feature will not be removed.
##   - Features with less than 3 measurements across runs will be removed.
## WARN  [2024-08-30 02:52:59] ** PGQvalue not found in input columns.
## INFO  [2024-08-30 02:52:59] ** Intensities with values not smaller than 0.01 in EGQvalue are replaced with 0
## INFO  [2024-08-30 02:52:59] ** Features with all missing measurements across runs are removed.
## INFO  [2024-08-30 02:52:59] ** Shared peptides are removed.
## INFO  [2024-08-30 02:52:59] ** Multiple measurements in a feature and a run are summarized by summaryforMultipleRows: max
## INFO  [2024-08-30 02:52:59] ** Features with one or two measurements across runs are removed.
## INFO  [2024-08-30 02:52:59] ** Run annotation merged with quantification data.
## INFO  [2024-08-30 02:52:59] ** Features with one or two measurements across runs are removed.
## INFO  [2024-08-30 02:52:59] ** Fractionation handled.
## INFO  [2024-08-30 02:52:59] ** Updated quantification data to make balanced design. Missing values are marked by NA
## INFO  [2024-08-30 02:52:59] ** Finished preprocessing. The dataset is ready to be processed by the dataProcess function.
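
The q-value filter reported in the log ("Intensities with values not smaller than 0.01 in EGQvalue are replaced with 0") can be illustrated on a toy table. This is a sketch of the rule as stated in the log message, not the converter's own code; all values are made up.

```r
# Toy fragment-level table; values are made up for illustration.
toy <- data.frame(
  PeptideSequence = c("AFSENITK", "AFSENITK", "PLTAETYK", "PLTAETYK"),
  Run = c("run1", "run2", "run1", "run2"),
  Intensity = c(104817, 98000, 56000, 61000),
  EGQvalue = c(0.001, 0.02, 0.009, 0.01)
)
qvalue_cutoff <- 0.01

# Intensities whose q-value is not smaller than the cutoff are set to 0,
# mirroring the log message printed during conversion.
toy$Intensity[toy$EGQvalue >= qvalue_cutoff] <- 0
print(toy)
```

A q-value exactly equal to the cutoff is treated as failing the filter in this sketch, following the wording "not smaller than".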

2.2 Select only fully tryptic (FT) peptides in both LiP and TrP dataset

Proteolytic resistance is calculated as the ratio of the intensity of fully tryptic peptides in the LiP condition to that in the TrP condition. Half-tryptic (HT) peptides are excluded from this analysis. The function calculateTrypticity is used to annotate FT and HT peptides in the LiP dataset; peptides with missed cleavages are also removed. Next, FT peptides not identified in the LiP dataset are filtered out of the TrP dataset. The msstats_data list will finally contain only FT peptides measured in both the LiP and TrP datasets.

FullyTrP <- msstats_data[["LiP"]] %>% 
  distinct(ProteinName, PeptideSequence) %>% 
  calculateTrypticity(fasta_file) %>% 
  filter(fully_TRI) %>%
  filter(MissedCleavage == FALSE) %>% 
  select(ProteinName, PeptideSequence, StartPos, EndPos)
msstats_data[["LiP"]] <- msstats_data[["LiP"]] %>% 
  select(-ProteinName) %>% inner_join(FullyTrP)
msstats_data[["TrP"]] <- msstats_data[["TrP"]] %>% 
  select(-ProteinName) %>% inner_join(FullyTrP)
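
To make the FT/HT distinction concrete, here is a simplified base-R sketch of a trypticity check. It is not the implementation of calculateTrypticity: the function `is_fully_tryptic()` and the toy protein sequence are hypothetical, and the rule used (a peptide is fully tryptic if it follows a K/R or the protein start and ends on a K/R or the protein end) ignores missed cleavages and special cases such as proline.

```r
# Hypothetical helper: TRUE if both peptide termini are tryptic cleavage sites.
is_fully_tryptic <- function(peptide, protein) {
  start <- regexpr(peptide, protein, fixed = TRUE)[1]
  if (start == -1) return(NA)  # peptide not found in the protein
  end <- start + nchar(peptide) - 1
  n_term_ok <- start == 1 ||
    substr(protein, start - 1, start - 1) %in% c("K", "R")
  c_term_ok <- end == nchar(protein) ||
    substr(protein, end, end) %in% c("K", "R")
  n_term_ok && c_term_ok
}

toy_protein <- "MKAAAKDDDRGGG"               # made-up sequence
peptides <- c("AAAK", "AAKD", "DDDR", "GGG")
result <- vapply(peptides, is_fully_tryptic, logical(1), protein = toy_protein)
print(result)
```

In this toy case only AAKD is half-tryptic, because the residue preceding it is not K or R.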

2.3 Correct nomenclature

Step 1:

Ensure that the Condition nomenclature is identical in both data sets. If the output is TRUE for all conditions, continue to step 2.

unique(msstats_data[["LiP"]]$Condition)%in%unique(msstats_data[["TrP"]]$Condition)
## [1] TRUE TRUE TRUE TRUE

To correct the condition nomenclature, display the condition for both data sets.

paste("LiP Condition nomenclature:", unique(msstats_data[["LiP"]]$Condition), ",",
      "TrP Condition nomenclature:",unique(msstats_data[["TrP"]]$Condition))
## [1] "LiP Condition nomenclature: F1 , TrP Condition nomenclature: F2"
## [2] "LiP Condition nomenclature: M1 , TrP Condition nomenclature: M2"
## [3] "LiP Condition nomenclature: M2 , TrP Condition nomenclature: M1"
## [4] "LiP Condition nomenclature: F2 , TrP Condition nomenclature: F1"

If necessary, un-comment the following lines to correct the condition nomenclature in either of the data sets. For example, change the nomenclature of the TrP samples from Cond1 to cond1.

# msstats_data[["TrP"]] = msstats_data[["TrP"]] %>% 
#   mutate(Condition = case_when(Condition == "Cond1" ~ "cond1",
#                                Condition == "Cond2" ~ "cond2"))
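
The paste() output in Step 1 pairs the two vectors by position, so a row such as "F1 ... F2" does not by itself indicate a mismatch. A sketch of a set-based check that names the offending labels directly is given below. The vectors are toy values, and the label Cond1 is a deliberately introduced, hypothetical mismatch.

```r
lip_conditions <- c("F1", "M1", "M2", "F2")
trp_conditions <- c("F2", "M2", "M1", "Cond1")  # hypothetical mismatch

# Labels present in one data set but not in the other.
only_in_lip <- setdiff(lip_conditions, trp_conditions)
only_in_trp <- setdiff(trp_conditions, lip_conditions)

print(only_in_lip)
print(only_in_trp)
print(setequal(lip_conditions, trp_conditions))
```

Unlike the %in% check, setequal() also flags labels that occur only in the TrP data.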

Step 2:

Ensure that the BioReplicate nomenclature is correctly annotated (see also the MSstats user manual). Each BioReplicate needs a unique label, while technical replicates can have duplicate numbering. If the replicate nomenclature is correct, proceed to section 2.4.

paste("LiP BioReplicate nomenclature:", unique(msstats_data[["LiP"]]$BioReplicate), ",",
      "TrP BioReplicate nomenclature:",unique(msstats_data[["TrP"]]$BioReplicate))
## [1] "LiP BioReplicate nomenclature: 1 , TrP BioReplicate nomenclature: 1"
## [2] "LiP BioReplicate nomenclature: 2 , TrP BioReplicate nomenclature: 2"
## [3] "LiP BioReplicate nomenclature: 3 , TrP BioReplicate nomenclature: 3"
## [4] "LiP BioReplicate nomenclature: 4 , TrP BioReplicate nomenclature: 4"

Adjust the BioReplicate column to the correct nomenclature for a case-control experiment.

msstats_data[["LiP"]] = msstats_data[["LiP"]] %>% 
  mutate(BioReplicate = paste0(Condition,".",BioReplicate))

msstats_data[["TrP"]] = msstats_data[["TrP"]] %>% 
  mutate(BioReplicate = paste0(Condition,".",BioReplicate))

Inspect corrected BioReplicate column.

paste("LiP BioReplicate nomenclature:", unique(msstats_data[["LiP"]]$BioReplicate), ",",
      "TrP BioReplicate nomenclature:",unique(msstats_data[["TrP"]]$BioReplicate))
##  [1] "LiP BioReplicate nomenclature: F1.1 , TrP BioReplicate nomenclature: F2.1"
##  [2] "LiP BioReplicate nomenclature: M1.1 , TrP BioReplicate nomenclature: M2.1"
##  [3] "LiP BioReplicate nomenclature: M2.1 , TrP BioReplicate nomenclature: M1.1"
##  [4] "LiP BioReplicate nomenclature: F2.1 , TrP BioReplicate nomenclature: F1.1"
##  [5] "LiP BioReplicate nomenclature: M1.2 , TrP BioReplicate nomenclature: M2.2"
##  [6] "LiP BioReplicate nomenclature: M2.2 , TrP BioReplicate nomenclature: F1.2"
##  [7] "LiP BioReplicate nomenclature: F1.2 , TrP BioReplicate nomenclature: M1.2"
##  [8] "LiP BioReplicate nomenclature: F2.2 , TrP BioReplicate nomenclature: F2.2"
##  [9] "LiP BioReplicate nomenclature: M2.3 , TrP BioReplicate nomenclature: M1.3"
## [10] "LiP BioReplicate nomenclature: F1.3 , TrP BioReplicate nomenclature: M2.3"
## [11] "LiP BioReplicate nomenclature: M1.3 , TrP BioReplicate nomenclature: F1.3"
## [12] "LiP BioReplicate nomenclature: F2.3 , TrP BioReplicate nomenclature: F2.3"
## [13] "LiP BioReplicate nomenclature: F2.4 , TrP BioReplicate nomenclature: F1.4"
## [14] "LiP BioReplicate nomenclature: M2.4 , TrP BioReplicate nomenclature: M1.4"
## [15] "LiP BioReplicate nomenclature: M1.4 , TrP BioReplicate nomenclature: M2.4"
## [16] "LiP BioReplicate nomenclature: F1.4 , TrP BioReplicate nomenclature: F2.4"
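
Whether the relabelling achieved unique BioReplicate labels can be checked by counting how many conditions share each label. A toy sketch of that check follows; the helper `conditions_per_label()` is hypothetical and the data are made up.

```r
toy <- data.frame(
  Condition = rep(c("F1", "M1"), each = 2),
  BioReplicate = c(1, 2, 1, 2)
)

# Hypothetical helper: number of distinct conditions using each label.
conditions_per_label <- function(condition, label) {
  tapply(condition, label, function(x) length(unique(x)))
}

# Before: labels 1 and 2 are each reused in two conditions.
before <- conditions_per_label(toy$Condition, toy$BioReplicate)

# Same relabelling as in the workflow: prefix with the condition.
toy$BioReplicate <- paste0(toy$Condition, ".", toy$BioReplicate)
after <- conditions_per_label(toy$Condition, toy$BioReplicate)

print(before)
print(after)
```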

2.4 Data Summarization

Summarize the data. The default settings use a log2-transformation and normalize the data using the "equalizeMedians" method. The default summary method is "TMP" and imputation is set to "FALSE". For detailed information on all settings, please refer to the function vignette.
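
The idea behind log2-transformation followed by median equalization can be sketched on toy data: each run is shifted so that all runs share the same median log2 intensity. This is a simplified illustration under that assumption, not the package's internal code, which may handle fractions and other details differently.

```r
# Toy intensities for two runs; run2 is systematically twice as intense.
toy <- data.frame(
  Run = rep(c("run1", "run2"), each = 3),
  Intensity = c(1024, 2048, 4096, 2048, 4096, 8192),
  stringsAsFactors = FALSE
)

toy$Abundance <- log2(toy$Intensity)

# Median per run and the reference median across runs.
run_median <- tapply(toy$Abundance, toy$Run, median)
reference <- median(run_median)

# Shift every run so that its median equals the reference.
toy$Normalized <- unname(toy$Abundance - run_median[toy$Run] + reference)

print(tapply(toy$Normalized, toy$Run, median))
```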

This function will take some time and memory. If memory is limited, it is advisable to remove the raw files using the rm() function and to clear the memory cache using the gc() function.
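
A sketch of this clean-up step with stand-in objects; in the workflow the objects to remove would be raw_lip and raw_prot, which are no longer needed once msstats_data exists. The `_demo` names are placeholders.

```r
# Stand-ins for the raw Spectronaut tables.
raw_lip_demo <- data.frame(x = seq_len(1e5))
raw_prot_demo <- data.frame(x = seq_len(1e5))

# Drop the objects, then ask R to release the memory.
rm(raw_lip_demo, raw_prot_demo)
invisible(gc())

print(exists("raw_lip_demo"))
```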

MSstatsLiP_Summarized <- dataSummarizationLiP(msstats_data, normalization.LiP = "equalizeMedians")
## INFO  [2024-08-30 02:52:59] ** Features with one or two measurements across runs are removed.
## INFO  [2024-08-30 02:52:59] ** Fractionation handled.
## INFO  [2024-08-30 02:52:59] ** Updated quantification data to make balanced design. Missing values are marked by NA
## INFO  [2024-08-30 02:52:59] ** Use all features that the dataset originally has.
## INFO  [2024-08-30 02:52:59] 
##  # proteins: 9
##  # peptides per protein: 1-1
##  # features per peptide: 1-2
## INFO  [2024-08-30 02:52:59] Some proteins have only one feature: 
##  P14164_DLAIGAHGGK,
##  P16622_AFSENITK,
##  P17891_ALQLINQDDADIIGGR,
##  P38805_LFQSILPQNPDIEGR,
##  Q02908_IYPTLVIR ...
## INFO  [2024-08-30 02:52:59] 
##                     F1 F2 M1 M2
##              # runs  4  4  4  4
##     # bioreplicates  4  4  4  4
##  # tech. replicates  1  1  1  1
## INFO  [2024-08-30 02:52:59] Some features are completely missing in at least one condition:  
##  DLAIGAHGGK_2_y8_1,
##  PLTAETYK_2_y7_1,
##  ALQLINQDDADIIGGR_2_y11_1,
##  LFQSILPQNPDIEGR_2_y9_1,
##  NA ...
## INFO  [2024-08-30 02:52:59] The following runs have more than 75% missing values: 2,
##  4,
##  6,
##  10,
##  12,
##  16
## INFO  [2024-08-30 02:52:59]  == Start the summarization per subplot...
##   |======================================================================| 100%
## INFO  [2024-08-30 02:52:59]  == Summarization is done.
## INFO  [2024-08-30 02:52:59] ** Features with one or two measurements across runs are removed.
## INFO  [2024-08-30 02:52:59] ** Fractionation handled.
## INFO  [2024-08-30 02:52:59] ** Updated quantification data to make balanced design. Missing values are marked by NA
## INFO  [2024-08-30 02:52:59] ** Log2 intensities under cutoff = 15.979  were considered as censored missing values.
## INFO  [2024-08-30 02:52:59] ** Log2 intensities = NA were considered as censored missing values.
## INFO  [2024-08-30 02:52:59] ** Use all features that the dataset originally has.
## INFO  [2024-08-30 02:52:59] 
##  # proteins: 4
##  # peptides per protein: 1-3
##  # features per peptide: 1-3
## INFO  [2024-08-30 02:52:59] Some proteins have only one feature: 
##  P53235 ...
## INFO  [2024-08-30 02:52:59] 
##                     F1 F2 M1 M2
##              # runs  4  4  4  4
##     # bioreplicates  4  4  4  4
##  # tech. replicates  1  1  1  1
## INFO  [2024-08-30 02:52:59] Some features are completely missing in at least one condition:  
##  AFSENITK_2_y5_1,
##  AFSENITK_2_y6_1,
##  ELQDEAIK_2_y6_1,
##  SEVVDQWK_2_y5_1,
##  IYPTLVIR_2_y6_1 ...
## INFO  [2024-08-30 02:52:59]  == Start the summarization per subplot...
##   |======================================================================| 100%
## INFO  [2024-08-30 02:52:59]  == Summarization is done.

Inspect MSstatsLiP_Summarized.

names(MSstatsLiP_Summarized[["LiP"]])
## [1] "FeatureLevelData"  "ProteinLevelData"  "SummaryMethod"    
## [4] "ModelQC"           "PredictBySurvival"
head(MSstatsLiP_Summarized[["LiP"]]$FeatureLevelData)
## Key: <FULL_PEPTIDE>
##         FULL_PEPTIDE      PEPTIDE TRANSITION           FEATURE  LABEL  GROUP
##               <char>       <fctr>     <fctr>            <fctr> <fctr> <fctr>
## 1: P14164_DLAIGAHGGK DLAIGAHGGK_2       y8_1 DLAIGAHGGK_2_y8_1      L     F1
## 2: P14164_DLAIGAHGGK DLAIGAHGGK_2       y8_1 DLAIGAHGGK_2_y8_1      L     F1
## 3: P14164_DLAIGAHGGK DLAIGAHGGK_2       y8_1 DLAIGAHGGK_2_y8_1      L     F1
## 4: P14164_DLAIGAHGGK DLAIGAHGGK_2       y8_1 DLAIGAHGGK_2_y8_1      L     F1
## 5: P14164_DLAIGAHGGK DLAIGAHGGK_2       y8_1 DLAIGAHGGK_2_y8_1      L     F2
## 6: P14164_DLAIGAHGGK DLAIGAHGGK_2       y8_1 DLAIGAHGGK_2_y8_1      L     F2
##       RUN SUBJECT FRACTION originalRUN censored INTENSITY ABUNDANCE
##    <fctr>  <fctr>   <char>      <fctr>   <lgcl>     <num>     <num>
## 1:      1    F1.1        1      LM2480    FALSE  138063.2  17.05399
## 2:      2    F1.2        1      LM2494    FALSE        NA        NA
## 3:      3    F1.3        1      LM2497    FALSE        NA        NA
## 4:      4    F1.4        1      LM2511    FALSE        NA        NA
## 5:      5    F2.1        1      LM2483    FALSE  164339.6  17.25664
## 6:      6    F2.2        1      LM2495    FALSE        NA        NA
##    newABUNDANCE PROTEIN
##           <num>  <char>
## 1:     17.05399  P14164
## 2:           NA  P14164
## 3:           NA  P14164
## 4:           NA  P14164
## 5:     17.25664  P14164
## 6:           NA  P14164
head(MSstatsLiP_Summarized[["LiP"]]$ProteinLevelData)
## Key: <FULL_PEPTIDE>
##         FULL_PEPTIDE    RUN LogIntensities originalRUN  GROUP SUBJECT
##               <char> <fctr>          <num>      <fctr> <fctr>  <char>
## 1: P14164_DLAIGAHGGK      1       17.05399      LM2480     F1    F1.1
## 2: P14164_DLAIGAHGGK      5       17.25664      LM2483     F2    F2.1
## 3: P14164_DLAIGAHGGK     13       16.64724      LM2482     M2    M2.1
## 4:   P16622_AFSENITK      1       16.69163      LM2480     F1    F1.1
## 5:   P16622_AFSENITK      5       17.27895      LM2483     F2    F2.1
## 6:   P16622_AFSENITK      7       16.93570      LM2499     F2    F2.3
##    TotalGroupMeasurements NumMeasuredFeature MissingPercentage more50missing
##                     <int>              <int>             <num>        <lgcl>
## 1:                      4                  1                 0         FALSE
## 2:                      4                  1                 0         FALSE
## 3:                      4                  1                 0         FALSE
## 4:                      4                  1                 0         FALSE
## 5:                      4                  1                 0         FALSE
## 6:                      4                  1                 0         FALSE
##    NumImputedFeature Protein
##                <num>  <char>
## 1:                 0  P14164
## 2:                 0  P14164
## 3:                 0  P14164
## 4:                 0  P16622
## 5:                 0  P16622
## 6:                 0  P16622
head(MSstatsLiP_Summarized[["TrP"]]$FeatureLevelData)
##   PROTEIN    PEPTIDE TRANSITION         FEATURE LABEL GROUP RUN SUBJECT
## 1  P16622 AFSENITK_2       y5_1 AFSENITK_2_y5_1     L    F1   1    F1.1
## 2  P16622 AFSENITK_2       y5_1 AFSENITK_2_y5_1     L    F1   2    F1.2
## 3  P16622 AFSENITK_2       y5_1 AFSENITK_2_y5_1     L    F1   3    F1.3
## 4  P16622 AFSENITK_2       y5_1 AFSENITK_2_y5_1     L    F1   4    F1.4
## 5  P16622 AFSENITK_2       y5_1 AFSENITK_2_y5_1     L    F2   5    F2.1
## 6  P16622 AFSENITK_2       y5_1 AFSENITK_2_y5_1     L    F2   6    F2.2
##   FRACTION originalRUN censored INTENSITY ABUNDANCE newABUNDANCE predicted
## 1        1      LM2487     TRUE        NA        NA           NA        NA
## 2        1      LM2489    FALSE  104817.1  17.08109     17.08109        NA
## 3        1      LM2502    FALSE  405851.8  17.97301     17.97301        NA
## 4        1      LM2504     TRUE        NA        NA           NA        NA
## 5        1      LM2484     TRUE        NA        NA     17.21114  17.21114
## 6        1      LM2491     TRUE        NA        NA     17.25962  17.25962
head(MSstatsLiP_Summarized[["TrP"]]$ProteinLevelData)
##   RUN Protein LogIntensities originalRUN GROUP SUBJECT TotalGroupMeasurements
## 1   2  P16622       16.61602      LM2489    F1    F1.2                      8
## 2   3  P16622       17.34884      LM2502    F1    F1.3                      8
## 3   5  P16622       17.24025      LM2484    F2    F2.1                      8
## 4   6  P16622       17.30643      LM2491    F2    F2.2                      8
## 5   7  P16622       16.90965      LM2503    F2    F2.3                      8
## 6   8  P16622       17.11269      LM2507    F2    F2.4                      8
##   NumMeasuredFeature MissingPercentage more50missing NumImputedFeature
## 1                  1               0.5          TRUE                 1
## 2                  1               0.5          TRUE                 1
## 3                  1               0.5          TRUE                 1
## 4                  1               0.5          TRUE                 1
## 5                  1               0.5          TRUE                 1
## 6                  1               0.5          TRUE                 1

Save and/or load summarized data.

save(MSstatsLiP_Summarized, file = 'MSstatsLiP_summarized.rda')
load(file = 'MSstatsLiP_summarized.rda')
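
Note that load() restores objects under the names they were saved with and overwrites any object of the same name in the workspace. A small round-trip sketch with a stand-in object and a temporary file (both hypothetical):

```r
# Stand-in for the summarized object.
demo_summary <- list(LiP = data.frame(LogIntensities = c(17.05, 17.26)))

rda_path <- file.path(tempdir(), "demo_summarized.rda")
save(demo_summary, file = rda_path)
rm(demo_summary)

# load() returns the names of the restored objects.
restored_names <- load(rda_path)
print(restored_names)
```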

3. Modelling

Run the modeling to obtain significantly altered peptides and proteins. The function groupComparisonLiP outputs a list with three separate models: (1) LiP.Model, which contains the differential analysis on peptide level in the LiP sample without correction for protein abundance alterations; (2) Adjusted.LiP.Model, which contains the differential analysis on peptide level in the LiP sample with correction for protein abundance alterations; and (3) TrP.Model, which contains the differential analysis on protein level. The default setting of the function is a pairwise comparison of all existing groups. Alternatively, a contrast matrix can be provided to specify the comparisons of interest. See the vignette for details.

MSstatsLiP_model = groupComparisonLiP(MSstatsLiP_Summarized)
## INFO  [2024-08-30 02:52:59]  == Start to test and get inference in whole plot ...
##   |======================================================================| 100%
## INFO  [2024-08-30 02:52:59]  == Comparisons for all proteins are done.
## INFO  [2024-08-30 02:52:59]  == Start to test and get inference in whole plot ...
##   |======================================================================| 100%
## INFO  [2024-08-30 02:52:59]  == Comparisons for all proteins are done.
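
As an alternative to the pairwise default, a contrast matrix restricted to the comparisons of interest could be built as sketched below. The layout (one row per comparison, one column per condition, entries 1 and -1) follows the usual MSstats convention. The argument name contrast.matrix in the commented-out call is an assumption that should be checked against the help page of groupComparisonLiP.

```r
# Conditions in alphabetical order, as they appear in the summarization log.
conditions <- sort(c("F1", "F2", "M1", "M2"))

# One row per comparison: +1 for the numerator group, -1 for the denominator.
contrast_matrix <- rbind(
  "F1 vs M1" = c(1, 0, -1, 0),
  "F2 vs M2" = c(0, 1, 0, -1)
)
colnames(contrast_matrix) <- conditions
print(contrast_matrix)

# Assumed usage (argument name not confirmed by this notebook):
# MSstatsLiP_model <- groupComparisonLiP(MSstatsLiP_Summarized,
#                                        contrast.matrix = contrast_matrix)
```

Each row sums to zero, which is what makes it a comparison between groups rather than a group mean.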

Inspect MSstatsLiP_model.

head(MSstatsLiP_model[["LiP.Model"]])
## Key: <FULL_PEPTIDE>
##         FULL_PEPTIDE    Label     log2FC    SE Tvalue    DF pvalue adj.pvalue
##               <char>   <char>      <num> <num>  <num> <int>  <num>      <num>
## 1: P14164_DLAIGAHGGK F1 vs F2 -0.2026580   NaN    NaN     0    NaN        NaN
## 2: P14164_DLAIGAHGGK F1 vs M2  0.4067448   NaN    NaN     0    NaN        NaN
## 3: P14164_DLAIGAHGGK F1 vs M1        Inf    NA     NA    NA     NA          0
## 4: P14164_DLAIGAHGGK F2 vs M2  0.6094028   NaN    NaN     0    NaN        NaN
## 5: P14164_DLAIGAHGGK F2 vs M1        Inf    NA     NA    NA     NA          0
## 6: P14164_DLAIGAHGGK M2 vs M1        Inf    NA     NA    NA     NA          0
##                  issue MissingPercentage ImputationPercentage ProteinName
##                 <char>             <num>                <num>      <char>
## 1:                <NA>         0.7500000                    0      P14164
## 2:                <NA>         0.9166667                    0      P14164
## 3: oneConditionMissing         0.7500000                    0      P14164
## 4:                <NA>         0.9166667                    0      P14164
## 5: oneConditionMissing         0.7500000                    0      P14164
## 6: oneConditionMissing         0.9166667                    0      P14164
##    PeptideSequence
##             <char>
## 1:      DLAIGAHGGK
## 2:      DLAIGAHGGK
## 3:      DLAIGAHGGK
## 4:      DLAIGAHGGK
## 5:      DLAIGAHGGK
## 6:      DLAIGAHGGK
head(MSstatsLiP_model[["TrP.Model"]])
##    Protein    Label     log2FC        SE    Tvalue    DF     pvalue adj.pvalue
##     <fctr>   <char>      <num>     <num>     <num> <int>      <num>      <num>
## 1:  P16622 F1 vs F2 -0.1598201 0.3004019 -0.532021     7 0.61117223  0.6854285
## 2:  P16622 F1 vs M2 -0.5959528 0.3004019 -1.983852     7 0.08768248  0.2161724
## 3:  P16622 F1 vs M1        Inf        NA        NA    NA         NA  0.0000000
## 4:  P16622 F2 vs M2 -0.4361327 0.2452771 -1.778122     7 0.11862006  0.2372401
## 5:  P16622 F2 vs M1        Inf        NA        NA    NA         NA  0.0000000
## 6:  P16622 M2 vs M1        Inf        NA        NA    NA         NA  0.0000000
##                  issue MissingPercentage ImputationPercentage
##                 <char>             <num>                <num>
## 1:                <NA>             0.625                0.375
## 2:                <NA>             0.875                0.125
## 3: oneConditionMissing             0.500                0.250
## 4:                <NA>             0.750                0.250
## 5: oneConditionMissing             0.375                0.375
## 6: oneConditionMissing             0.625                0.125
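
In the output above, rows flagged oneConditionMissing carry an infinite log2FC, no test statistic, and an adj.pvalue of 0, so they do not appear to result from an actual test. A sketch of how such rows could be separated before ranking by significance, using three rows copied from the TrP.Model output; the decision to exclude them is an assumption about the downstream analysis, not a step of this workflow.

```r
# Three rows copied from the TrP.Model output shown above.
toy_model <- data.frame(
  Label = c("F1 vs F2", "F1 vs M2", "F1 vs M1"),
  log2FC = c(-0.1598201, -0.5959528, Inf),
  adj.pvalue = c(0.6854285, 0.2161724, 0),
  issue = c(NA, NA, "oneConditionMissing"),
  stringsAsFactors = FALSE
)

# Keep only comparisons with a finite fold change and no flagged issue.
testable <- toy_model[is.na(toy_model$issue) & is.finite(toy_model$log2FC), ]
print(testable)
```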

Save and/or load model data.

save(MSstatsLiP_model, file = 'MSstatsLiP_model.rda')
load(file = 'MSstatsLiP_model.rda')

4. Calculate proteolytic resistance ratios

Proteolytic resistance ratios are calculated as the ratio of the intensity of fully tryptic peptides in the LiP condition to that in the TrP condition. In general, a low protease resistance value indicates a high extent of cleavage, while a high protease resistance value indicates a low extent of cleavage.
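
The definition above can be written out for toy log2 intensities. This sketch only illustrates the ratio as described in the text; the values are made up, and the exact computation inside calculateProteolyticResistance may differ.

```r
# Made-up log2 intensities of two fully tryptic peptides in one run.
toy <- data.frame(
  Peptide = c("AFSENITK", "PLTAETYK"),
  LiP_log2 = c(16.0, 17.0),
  TrP_log2 = c(17.0, 17.0),
  stringsAsFactors = FALSE
)

# Ratio on the linear scale: intensity in LiP divided by intensity in TrP.
toy$Resistance <- 2^toy$LiP_log2 / 2^toy$TrP_log2
print(toy)
```

In this toy case the first peptide loses half of its intensity under limited proteolysis (ratio 0.5), while the second is unaffected (ratio 1).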

Accessibility = calculateProteolyticResistance(MSstatsLiP_Summarized, 
                                               fasta_file, 
                                               differential_analysis = TRUE)
## INFO  [2024-08-30 02:53:00]  == Start to test and get inference in whole plot ...
##   |======================================================================| 100%
## INFO  [2024-08-30 02:53:00]  == Comparisons for all proteins are done.
Accessibility$RunLevelData
## Key: <FULL_PEPTIDE, Protein, GROUP, RUN>
##                FULL_PEPTIDE Protein  GROUP    RUN  PeptideSequence
##                      <char>  <char> <fctr> <fctr>           <char>
##  1:       P14164_DLAIGAHGGK  P14164     F1      1       DLAIGAHGGK
##  2:       P14164_DLAIGAHGGK  P14164     F2      5       DLAIGAHGGK
##  3:       P14164_DLAIGAHGGK  P14164     M2     13       DLAIGAHGGK
##  4:         P16622_AFSENITK  P16622     F1      1         AFSENITK
##  5:         P16622_AFSENITK  P16622     F2      5         AFSENITK
##  6:         P16622_AFSENITK  P16622     F2      7         AFSENITK
##  7:         P16622_AFSENITK  P16622     F2      8         AFSENITK
##  8:         P16622_AFSENITK  P16622     M1      9         AFSENITK
##  9:         P16622_AFSENITK  P16622     M1     11         AFSENITK
## 10:         P16622_AFSENITK  P16622     M2     13         AFSENITK
## 11:         P16622_AFSENITK  P16622     M2     14         AFSENITK
## 12:         P16622_AFSENITK  P16622     M2     15         AFSENITK
## 13:         P16622_AFSENITK  P16622     M2     16         AFSENITK
## 14:         P16622_PLTAETYK  P16622     F1      1         PLTAETYK
## 15:         P16622_PLTAETYK  P16622     F1      2         PLTAETYK
## 16:         P16622_PLTAETYK  P16622     F1      3         PLTAETYK
## 17:         P16622_PLTAETYK  P16622     F2      5         PLTAETYK
## 18:         P16622_PLTAETYK  P16622     F2      7         PLTAETYK
## 19:         P16622_PLTAETYK  P16622     F2      8         PLTAETYK
## 20:         P16622_PLTAETYK  P16622     F2      6         PLTAETYK
## 21:         P16622_PLTAETYK  P16622     M1      9         PLTAETYK
## 22:         P16622_PLTAETYK  P16622     M1     11         PLTAETYK
## 23:         P16622_PLTAETYK  P16622     M1     10         PLTAETYK
## 24:         P16622_PLTAETYK  P16622     M1     12         PLTAETYK
## 25:         P16622_PLTAETYK  P16622     M2     13         PLTAETYK
## 26:         P16622_PLTAETYK  P16622     M2     14         PLTAETYK
## 27:         P16622_PLTAETYK  P16622     M2     15         PLTAETYK
## 28:         P16622_PLTAETYK  P16622     M2     16         PLTAETYK
## 29: P17891_ALQLINQDDADIIGGR  P17891     F1      1 ALQLINQDDADIIGGR
## 30: P17891_ALQLINQDDADIIGGR  P17891     M1      9 ALQLINQDDADIIGGR
## 31: P17891_ALQLINQDDADIIGGR  P17891     M1     11 ALQLINQDDADIIGGR
## 32: P17891_ALQLINQDDADIIGGR  P17891     M2     14 ALQLINQDDADIIGGR
## 33: P17891_ALQLINQDDADIIGGR  P17891     M2     15 ALQLINQDDADIIGGR
## 34:         P17891_ELQDEAIK  P17891     F1      1         ELQDEAIK
## 35:         P17891_ELQDEAIK  P17891     F2      5         ELQDEAIK
## 36:         P17891_ELQDEAIK  P17891     M1      9         ELQDEAIK
## 37:         P17891_ELQDEAIK  P17891     M1     11         ELQDEAIK
## 38:         P17891_ELQDEAIK  P17891     M2     13         ELQDEAIK
## 39:         P17891_ELQDEAIK  P17891     M2     14         ELQDEAIK
## 40:         P17891_SEVVDQWK  P17891     F1      1         SEVVDQWK
## 41:         P17891_SEVVDQWK  P17891     F2      5         SEVVDQWK
## 42:         P17891_SEVVDQWK  P17891     M1      9         SEVVDQWK
## 43:         P17891_SEVVDQWK  P17891     M1     10         SEVVDQWK
## 44:         P17891_SEVVDQWK  P17891     M2     13         SEVVDQWK
## 45:  P38805_LFQSILPQNPDIEGR  P38805     F1      1  LFQSILPQNPDIEGR
## 46:  P38805_LFQSILPQNPDIEGR  P38805     F1      3  LFQSILPQNPDIEGR
## 47:  P38805_LFQSILPQNPDIEGR  P38805     F2      5  LFQSILPQNPDIEGR
## 48:         Q02908_IYPTLVIR  Q02908     F1      1         IYPTLVIR
## 49:         Q02908_IYPTLVIR  Q02908     F1      3         IYPTLVIR
## 50:         Q02908_IYPTLVIR  Q02908     F2      5         IYPTLVIR
## 51:         Q02908_IYPTLVIR  Q02908     M1      9         IYPTLVIR
## 52:         Q02908_IYPTLVIR  Q02908     M2     13         IYPTLVIR
## 53:         Q02908_IYPTLVIR  Q02908     M2     15         IYPTLVIR
## 54:       Q02908_VQPDQVELIR  Q02908     F1      1       VQPDQVELIR
## 55:       Q02908_VQPDQVELIR  Q02908     F1      2       VQPDQVELIR
## 56:       Q02908_VQPDQVELIR  Q02908     F1      3       VQPDQVELIR
## 57:       Q02908_VQPDQVELIR  Q02908     F2      5       VQPDQVELIR
## 58:       Q02908_VQPDQVELIR  Q02908     F2      7       VQPDQVELIR
## 59:       Q02908_VQPDQVELIR  Q02908     F2      8       VQPDQVELIR
## 60:       Q02908_VQPDQVELIR  Q02908     M1      9       VQPDQVELIR
## 61:       Q02908_VQPDQVELIR  Q02908     M1     11       VQPDQVELIR
## 62:       Q02908_VQPDQVELIR  Q02908     M2     13       VQPDQVELIR
## 63:       Q02908_VQPDQVELIR  Q02908     M2     14       VQPDQVELIR
## 64:       Q02908_VQPDQVELIR  Q02908     M2     15       VQPDQVELIR
##                FULL_PEPTIDE Protein  GROUP    RUN  PeptideSequence
##     Accessibility_ratio originalRUN SUBJECT TotalGroupMeasurements
##                   <num>      <fctr>  <char>                  <int>
##  1:                  NA      LM2480    F1.1                      4
##  2:                  NA      LM2483    F2.1                      4
##  3:                  NA      LM2482    M2.1                      4
##  4:                  NA      LM2480    F1.1                      4
##  5:           1.0000000      LM2483    F2.1                      4
##  6:           1.0000000      LM2499    F2.3                      4
##  7:           0.8682778      LM2508    F2.4                      4
##  8:                  NA      LM2481    M1.1                      4
##  9:                  NA      LM2498    M1.3                      4
## 10:           1.0000000      LM2482    M2.1                      4
## 11:           0.5359730      LM2493    M2.2                      4
## 12:           0.6155192      LM2496    M2.3                      4
## 13:           0.4883203      LM2509    M2.4                      4
## 14:                  NA      LM2480    F1.1                      8
## 15:           0.7540955      LM2494    F1.2                      8
## 16:           0.4709501      LM2497    F1.3                      8
## 17:           0.8378363      LM2483    F2.1                      8
## 18:           0.8339002      LM2499    F2.3                      8
## 19:           0.6555789      LM2508    F2.4                      8
## 20:           0.4973797      LM2495    F2.2                      8
## 21:                  NA      LM2481    M1.1                      8
## 22:                  NA      LM2498    M1.3                      8
## 23:                  NA      LM2492    M1.2                      8
## 24:                  NA      LM2510    M1.4                      8
## 25:           0.9704908      LM2482    M2.1                      8
## 26:           0.4694516      LM2493    M2.2                      8
## 27:           0.5674287      LM2496    M2.3                      8
## 28:           0.4205355      LM2509    M2.4                      8
## 29:           1.0000000      LM2480    F1.1                      4
## 30:           1.0000000      LM2481    M1.1                      4
## 31:           1.0000000      LM2498    M1.3                      4
## 32:           1.0000000      LM2493    M2.2                      4
## 33:           1.0000000      LM2496    M2.3                      4
## 34:           1.0000000      LM2480    F1.1                      8
## 35:           0.8109229      LM2483    F2.1                      8
## 36:           1.0000000      LM2481    M1.1                      8
## 37:           1.0000000      LM2498    M1.3                      8
## 38:           0.8691530      LM2482    M2.1                      8
## 39:           1.0000000      LM2493    M2.2                      8
## 40:           1.0000000      LM2480    F1.1                      8
## 41:           0.8429855      LM2483    F2.1                      8
## 42:           1.0000000      LM2481    M1.1                      8
## 43:           0.7495376      LM2492    M1.2                      8
## 44:           0.9519563      LM2482    M2.1                      8
## 45:                  NA      LM2480    F1.1                      4
## 46:                  NA      LM2497    F1.3                      4
## 47:                  NA      LM2483    F2.1                      4
## 48:           1.0000000      LM2480    F1.1                      4
## 49:                  NA      LM2497    F1.3                      4
## 50:           0.8885471      LM2483    F2.1                      4
## 51:           0.8779434      LM2481    M1.1                      4
## 52:           0.8178262      LM2482    M2.1                      4
## 53:           1.0000000      LM2496    M2.3                      4
## 54:           1.0000000      LM2480    F1.1                      4
## 55:           1.0000000      LM2494    F1.2                      4
## 56:                  NA      LM2497    F1.3                      4
## 57:           1.0000000      LM2483    F2.1                      4
## 58:                  NA      LM2499    F2.3                      4
## 59:           0.9414730      LM2508    F2.4                      4
## 60:           1.0000000      LM2481    M1.1                      4
## 61:                  NA      LM2498    M1.3                      4
## 62:           1.0000000      LM2482    M2.1                      4
## 63:           1.0000000      LM2493    M2.2                      4
## 64:           1.0000000      LM2496    M2.3                      4
##     Accessibility_ratio originalRUN SUBJECT TotalGroupMeasurements
##     NumMeasuredFeature MissingPercentage more50missing NumImputedFeature
##                  <int>             <num>        <lgcl>             <num>
##  1:                  1               0.0         FALSE                 0
##  2:                  1               0.0         FALSE                 0
##  3:                  1               0.0         FALSE                 0
##  4:                  1               0.0         FALSE                 0
##  5:                  1               0.0         FALSE                 0
##  6:                  1               0.0         FALSE                 0
##  7:                  1               0.0         FALSE                 0
##  8:                  1               0.0         FALSE                 0
##  9:                  1               0.0         FALSE                 0
## 10:                  1               0.0         FALSE                 0
## 11:                  1               0.0         FALSE                 0
## 12:                  1               0.0         FALSE                 0
## 13:                  1               0.0         FALSE                 0
## 14:                  2               0.0         FALSE                 0
## 15:                  1               0.5          TRUE                 0
## 16:                  1               0.5          TRUE                 0
## 17:                  1               0.5          TRUE                 0
## 18:                  1               0.5          TRUE                 0
## 19:                  1               0.5          TRUE                 0
## 20:                  1               0.5          TRUE                 0
## 21:                  2               0.0         FALSE                 0
## 22:                  2               0.0         FALSE                 0
## 23:                  1               0.5          TRUE                 0
## 24:                  1               0.5          TRUE                 0
## 25:                  1               0.5          TRUE                 0
## 26:                  1               0.5          TRUE                 0
## 27:                  1               0.5          TRUE                 0
## 28:                  1               0.5          TRUE                 0
## 29:                  1               0.0         FALSE                 0
## 30:                  1               0.0         FALSE                 0
## 31:                  1               0.0         FALSE                 0
## 32:                  1               0.0         FALSE                 0
## 33:                  1               0.0         FALSE                 0
## 34:                  2               0.0         FALSE                 0
## 35:                  2               0.0         FALSE                 0
## 36:                  2               0.0         FALSE                 0
## 37:                  1               0.5          TRUE                 0
## 38:                  2               0.0         FALSE                 0
## 39:                  1               0.5          TRUE                 0
## 40:                  2               0.0         FALSE                 0
## 41:                  2               0.0         FALSE                 0
## 42:                  2               0.0         FALSE                 0
## 43:                  1               0.5          TRUE                 0
## 44:                  2               0.0         FALSE                 0
## 45:                  1               0.0         FALSE                 0
## 46:                  1               0.0         FALSE                 0
## 47:                  1               0.0         FALSE                 0
## 48:                  1               0.0         FALSE                 0
## 49:                  1               0.0         FALSE                 0
## 50:                  1               0.0         FALSE                 0
## 51:                  1               0.0         FALSE                 0
## 52:                  1               0.0         FALSE                 0
## 53:                  1               0.0         FALSE                 0
## 54:                  1               0.0         FALSE                 0
## 55:                  1               0.0         FALSE                 0
## 56:                  1               0.0         FALSE                 0
## 57:                  1               0.0         FALSE                 0
## 58:                  1               0.0         FALSE                 0
## 59:                  1               0.0         FALSE                 0
## 60:                  1               0.0         FALSE                 0
## 61:                  1               0.0         FALSE                 0
## 62:                  1               0.0         FALSE                 0
## 63:                  1               0.0         FALSE                 0
## 64:                  1               0.0         FALSE                 0
##     NumMeasuredFeature MissingPercentage more50missing NumImputedFeature
ResistanceBarcodePlotLiP(Accessibility,
                         fasta_file,
                         which.prot = "P16622",
                         which.condition = "F1",
                         address = FALSE)

5. Proteolytic resistance differential analysis

This section describes how to compare proteolytic resistance patterns between conditions, as reported in Cappelletti et al., 2021, Figure 3. As described in the “Protease digestion accessibility analysis” paragraph of Cappelletti et al., proteolytic resistance is calculated as the ratio of the intensity of fully tryptic peptides in the LiP condition to that in the TrP condition. It can be compared across conditions using the linear mixed-effects model-based differential analysis implemented in the MSstatsLiP package. First, infinite values are filtered out of the result of the groupComparisonLiP function. Next, the log fold changes and standard errors of the LiP (log2FC, s2) and TrP (log2FC_ref, s2_ref) models are combined, and Student’s t-test is applied to compare proteolytic resistance between conditions. Finally, p-values are adjusted for multiple comparisons (the default is the Benjamini & Hochberg method). In general, a low proteolytic resistance value indicates a high extent of cleavage, while a high value indicates a low extent of cleavage.
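The steps above can be sketched in base R on a toy table. This is a minimal illustration and not the MSstatsLiP implementation: the table `toy`, its values, and the columns `SE_ref` and `DF_ref` are hypothetical, and the way the two models are combined (difference of log fold changes, summed variances, Welch-Satterthwaite degrees of freedom) is an assumption about what "combined" means here.

```r
# Toy comparison table; every value below is made up for illustration.
toy <- data.frame(
  FULL_PEPTIDE = c("P1_AAAK", "P1_BBBK", "P2_CCCR"),
  log2FC       = c(1.2, -Inf, -0.4),    # LiP model estimate
  SE           = c(0.20, NA, 0.15),     # LiP standard error
  DF           = c(6, NA, 5),
  log2FC_ref   = c(0.2, 0.1, -0.1),     # TrP model estimate
  SE_ref       = c(0.10, 0.12, 0.20),   # TrP standard error
  DF_ref       = c(6, 6, 5)
)

# 1. Filter out rows with infinite (or missing) fold changes.
ok <- toy[is.finite(toy$log2FC) & is.finite(toy$log2FC_ref), ]

# 2. Combine the LiP and TrP estimates (assumed combination rule).
s2     <- ok$SE^2
s2_ref <- ok$SE_ref^2
ok$log2FC_adj <- ok$log2FC - ok$log2FC_ref
ok$SE_adj     <- sqrt(s2 + s2_ref)
ok$DF_adj     <- (s2 + s2_ref)^2 / (s2^2 / ok$DF + s2_ref^2 / ok$DF_ref)

# 3. Two-sided t-test on the combined estimate.
ok$Tvalue <- ok$log2FC_adj / ok$SE_adj
ok$pvalue <- 2 * pt(abs(ok$Tvalue), df = ok$DF_adj, lower.tail = FALSE)

# 4. Adjust for multiple comparisons (Benjamini & Hochberg).
ok$adj.pvalue <- p.adjust(ok$pvalue, method = "BH")

print(ok[, c("FULL_PEPTIDE", "log2FC_adj", "SE_adj", "pvalue", "adj.pvalue")])
```

The row with `log2FC = -Inf` is dropped in step 1, so only peptides with finite estimates in both the LiP and TrP models reach the test.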

Accessibility$groupComparison
##     Protein    Label        log2FC         SE        Tvalue    DF     pvalue
##      <char>   <char>         <num>      <num>         <num> <int>      <num>
##  1:  P16622 F1 vs F2          -Inf         NA            NA    NA         NA
##  2:  P16622 F1 vs M2          -Inf         NA            NA    NA         NA
##  3:  P16622 F1 vs M1            NA         NA            NA    NA         NA
##  4:  P16622 F2 vs M2  2.961395e-01 0.14247902  2.078478e+00     5 0.09224015
##  5:  P16622 F2 vs M1           Inf         NA            NA    NA         NA
##  6:  P16622 M2 vs M1           Inf         NA            NA    NA         NA
##  7:  P16622 F1 vs F2 -9.365098e-02 0.18144349 -5.161441e-01     7 0.62165284
##  8:  P16622 F1 vs M2  5.546153e-03 0.18144349  3.056683e-02     7 0.97646825
##  9:  P16622 F1 vs M1           Inf         NA            NA    NA         NA
## 10:  P16622 F2 vs M2  9.919714e-02 0.14814799  6.695814e-01     7 0.52458808
## 11:  P16622 F2 vs M1           Inf         NA            NA    NA         NA
## 12:  P16622 M2 vs M1           Inf         NA            NA    NA         NA
## 13:  P17891 F1 vs F2           Inf         NA            NA    NA         NA
## 14:  P17891 F1 vs M2 -4.965068e-16 0.00000000          -Inf     2 0.00000000
## 15:  P17891 F1 vs M1 -4.965068e-16 0.00000000          -Inf     2 0.00000000
## 16:  P17891 F2 vs M2          -Inf         NA            NA    NA         NA
## 17:  P17891 F2 vs M1          -Inf         NA            NA    NA         NA
## 18:  P17891 M2 vs M1 -9.860761e-32 0.00000000          -Inf     2 0.00000000
## 19:  P17891 F1 vs F2  1.890771e-01 0.09252279  2.043573e+00     2 0.17770087
## 20:  P17891 F1 vs M2  6.542349e-02 0.08012709  8.164966e-01     2 0.50000000
## 21:  P17891 F1 vs M1  6.461001e-16 0.08012709  8.063441e-15     2 1.00000000
## 22:  P17891 F2 vs M2 -1.236536e-01 0.08012709 -1.543219e+00     2 0.26274985
## 23:  P17891 F2 vs M1 -1.890771e-01 0.08012709 -2.359715e+00     2 0.14224810
## 24:  P17891 M2 vs M1 -6.542349e-02 0.06542349 -1.000000e+00     2 0.42264973
## 25:  P17891 F1 vs F2  1.570145e-01 0.25046243  6.268983e-01     1 0.64351635
## 26:  P17891 F1 vs M2  4.804366e-02 0.25046243  1.918198e-01     1 0.87934924
## 27:  P17891 F1 vs M1  1.252312e-01 0.21690682  5.773503e-01     1 0.66666667
## 28:  P17891 F2 vs M2 -1.089708e-01 0.25046243 -4.350785e-01     1 0.73874644
## 29:  P17891 F2 vs M1 -3.178325e-02 0.21690682 -1.465295e-01     1 0.90737557
## 30:  P17891 M2 vs M1  7.718755e-02 0.21690682  3.558558e-01     1 0.78235114
## 31:  Q02908 F1 vs F2  1.114529e-01 0.18217377  6.117947e-01     1 0.65046587
## 32:  Q02908 F1 vs M2  9.108688e-02 0.15776711  5.773503e-01     1 0.66666667
## 33:  Q02908 F1 vs M1  1.220566e-01 0.18217377  6.700010e-01     1 0.62419863
## 34:  Q02908 F2 vs M2 -2.036605e-02 0.15776711 -1.290893e-01     1 0.91827115
## 35:  Q02908 F2 vs M1  1.060366e-02 0.18217377  5.820630e-02     1 0.96298648
## 36:  Q02908 M2 vs M1  3.096972e-02 0.15776711  1.963002e-01     1 0.87660046
## 37:  Q02908 F1 vs F2  2.926351e-02 0.02069243  1.414214e+00     4 0.23019964
## 38:  Q02908 F1 vs M2 -2.280353e-16 0.01888951 -1.207206e-14     4 1.00000000
## 39:  Q02908 F1 vs M1 -2.318359e-16 0.02534294 -9.147948e-15     4 1.00000000
## 40:  Q02908 F2 vs M2 -2.926351e-02 0.01888951 -1.549193e+00     4 0.19626118
## 41:  Q02908 F2 vs M1 -2.926351e-02 0.02534294 -1.154701e+00     4 0.31250000
## 42:  Q02908 M2 vs M1 -3.800589e-18 0.02389355 -1.590633e-16     4 1.00000000
##     Protein    Label        log2FC         SE        Tvalue    DF     pvalue
##     adj.pvalue               issue MissingPercentage ImputationPercentage
##          <num>              <char>             <num>                <num>
##  1:  0.0000000 oneConditionMissing         0.5714286                    0
##  2:  0.0000000 oneConditionMissing         1.0000000                    0
##  3:         NA     completeMissing         0.4285714                    0
##  4:  0.5254997                <NA>         0.6250000                    0
##  5:  0.0000000 oneConditionMissing         0.1250000                    0
##  6:  0.0000000 oneConditionMissing         0.5000000                    0
##  7:  0.6504659                <NA>         0.6250000                    0
##  8:  1.0000000                <NA>         0.9000000                    0
##  9:  0.0000000 oneConditionMissing         0.6250000                    0
## 10:  0.7868821                <NA>         0.8000000                    0
## 11:  0.0000000 oneConditionMissing         0.5000000                    0
## 12:  0.0000000 oneConditionMissing         0.8000000                    0
## 13:  0.0000000 oneConditionMissing         0.9166667                    0
## 14:  0.0000000                <NA>         0.6250000                    0
## 15:  0.0000000                <NA>         0.6250000                    0
## 16:  0.0000000 oneConditionMissing         0.8333333                    0
## 17:  0.0000000 oneConditionMissing         0.8333333                    0
## 18:  0.0000000                <NA>         0.5000000                    0
## 19:  0.5754991                <NA>         0.7500000                    0
## 20:  1.0000000                <NA>         0.6875000                    0
## 21:  1.0000000                <NA>         0.6875000                    0
## 22:  0.5254997                <NA>         0.6875000                    0
## 23:  0.5689924                <NA>         0.6875000                    0
## 24:  1.0000000                <NA>         0.6250000                    0
## 25:  0.6504659                <NA>         0.7500000                    0
## 26:  1.0000000                <NA>         0.6875000                    0
## 27:  1.0000000                <NA>         0.7500000                    0
## 28:  0.8864957                <NA>         0.6875000                    0
## 29:  0.9629865                <NA>         0.7500000                    0
## 30:  1.0000000                <NA>         0.6875000                    0
## 31:  0.6504659                <NA>         0.7500000                    0
## 32:  1.0000000                <NA>         0.7500000                    0
## 33:  1.0000000                <NA>         0.6250000                    0
## 34:  0.9182711                <NA>         0.7500000                    0
## 35:  0.9629865                <NA>         0.6250000                    0
## 36:  1.0000000                <NA>         0.6250000                    0
## 37:  0.5754991                <NA>         0.5000000                    0
## 38:  1.0000000                <NA>         0.6250000                    0
## 39:  1.0000000                <NA>         0.3750000                    0
## 40:  0.5254997                <NA>         0.6250000                    0
## 41:  0.6250000                <NA>         0.3750000                    0
## 42:  1.0000000                <NA>         0.5000000                    0
##     adj.pvalue               issue MissingPercentage ImputationPercentage
##      PeptideSequence            FULL_PEPTIDE
##               <char>                  <char>
##  1:         AFSENITK         P16622_AFSENITK
##  2:         AFSENITK         P16622_AFSENITK
##  3:         AFSENITK         P16622_AFSENITK
##  4:         AFSENITK         P16622_AFSENITK
##  5:         AFSENITK         P16622_AFSENITK
##  6:         AFSENITK         P16622_AFSENITK
##  7:         PLTAETYK         P16622_PLTAETYK
##  8:         PLTAETYK         P16622_PLTAETYK
##  9:         PLTAETYK         P16622_PLTAETYK
## 10:         PLTAETYK         P16622_PLTAETYK
## 11:         PLTAETYK         P16622_PLTAETYK
## 12:         PLTAETYK         P16622_PLTAETYK
## 13: ALQLINQDDADIIGGR P17891_ALQLINQDDADIIGGR
## 14: ALQLINQDDADIIGGR P17891_ALQLINQDDADIIGGR
## 15: ALQLINQDDADIIGGR P17891_ALQLINQDDADIIGGR
## 16: ALQLINQDDADIIGGR P17891_ALQLINQDDADIIGGR
## 17: ALQLINQDDADIIGGR P17891_ALQLINQDDADIIGGR
## 18: ALQLINQDDADIIGGR P17891_ALQLINQDDADIIGGR
## 19:         ELQDEAIK         P17891_ELQDEAIK
## 20:         ELQDEAIK         P17891_ELQDEAIK
## 21:         ELQDEAIK         P17891_ELQDEAIK
## 22:         ELQDEAIK         P17891_ELQDEAIK
## 23:         ELQDEAIK         P17891_ELQDEAIK
## 24:         ELQDEAIK         P17891_ELQDEAIK
## 25:         SEVVDQWK         P17891_SEVVDQWK
## 26:         SEVVDQWK         P17891_SEVVDQWK
## 27:         SEVVDQWK         P17891_SEVVDQWK
## 28:         SEVVDQWK         P17891_SEVVDQWK
## 29:         SEVVDQWK         P17891_SEVVDQWK
## 30:         SEVVDQWK         P17891_SEVVDQWK
## 31:         IYPTLVIR         Q02908_IYPTLVIR
## 32:         IYPTLVIR         Q02908_IYPTLVIR
## 33:         IYPTLVIR         Q02908_IYPTLVIR
## 34:         IYPTLVIR         Q02908_IYPTLVIR
## 35:         IYPTLVIR         Q02908_IYPTLVIR
## 36:         IYPTLVIR         Q02908_IYPTLVIR
## 37:       VQPDQVELIR       Q02908_VQPDQVELIR
## 38:       VQPDQVELIR       Q02908_VQPDQVELIR
## 39:       VQPDQVELIR       Q02908_VQPDQVELIR
## 40:       VQPDQVELIR       Q02908_VQPDQVELIR
## 41:       VQPDQVELIR       Q02908_VQPDQVELIR
## 42:       VQPDQVELIR       Q02908_VQPDQVELIR
##      PeptideSequence            FULL_PEPTIDE

Save and/or load model data

# save(FullyTrp.Model, file = 'Protection_model.rda')
# load(file = 'Protection_model.rda')

6. Save outputs

Save the output of the modeling in a .csv file.

# write.csv(FullyTrp.Model, "Proteolytic_resistance_DA.csv")

7. Plot aSynuclein proteolytic resistance DA result as barcode

Proteolytic resistance barcodes visualize fully tryptic (FT) peptides along the sequence of aSynuclein. Significant peptides showing increased protease resistance are colored in red, significant peptides showing decreased protease resistance are colored in blue, and non-significant peptides (no change in protease resistance between conditions) are colored in grey. Black regions represent regions with no identified matching peptide. The position of the NAC domain is indicated by a rectangle.

ResistanceBarcodePlotLiP(Accessibility,
                         fasta_file,
                         which.prot = "P16622",
                         which.condition = "F1",
                         differential_analysis = TRUE,
                         which.comp = "F1 vs F2",
                         address = FALSE)
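
The color rule described for the barcode can be illustrated with a small helper. This is only a sketch: `classify_peptide` is a hypothetical function that is not part of MSstatsLiP, the 0.05 cutoff is an assumed significance threshold, and treating a positive log2FC as increased resistance is an assumed sign convention. Black regions are not modeled because they correspond to sequence stretches with no peptide in the table.

```r
# Hypothetical helper (not part of MSstatsLiP): map each peptide to a
# barcode color from its adjusted p-value and log2 fold change.
classify_peptide <- function(log2FC, adj.pvalue, alpha = 0.05) {
  # A peptide counts as significant only if both values are usable.
  significant <- !is.na(adj.pvalue) & is.finite(log2FC) & adj.pvalue < alpha
  ifelse(!significant, "grey",
         ifelse(log2FC > 0, "red", "blue"))
}

# Toy results table; all values are made up for illustration.
toy_da <- data.frame(
  PeptideSequence = c("AAAK", "BBBK", "CCCR", "DDDR"),
  log2FC          = c(1.5, -2.0, 0.1, NA),
  adj.pvalue      = c(0.01, 0.03, 0.80, NA)
)
toy_da$color <- classify_peptide(toy_da$log2FC, toy_da$adj.pvalue)
print(toy_da)
```

Peptides with missing or infinite estimates fall into the grey class here, which is a design choice of this sketch and not a statement about how the plotting function handles them.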