Title: | Statistical Modelling of AP-MS Data (SMAD) |
---|---|
Description: | Assigns probability scores to protein interactions captured in affinity purification mass spectrometry (AP-MS) experiments in order to infer protein-protein interactions. The output facilitates the removal of non-specific background, since contaminants are commonly found in AP-MS data. |
Authors: | Qingzhou Zhang [aut, cre] |
Maintainer: | Qingzhou Zhang <[email protected]> |
License: | MIT + file LICENSE |
Version: | 1.23.0 |
Built: | 2024-11-30 04:48:35 UTC |
Source: | https://github.com/bioc/SMAD |
CompPASS Comparative Proteomic Analysis Software Suite (CompPASS) is based on the spoke model. This algorithm was developed by Dr. Mathew Sowa for defining the human deubiquitinating enzyme interaction landscape (Sowa, Mathew E., et al., 2009). The implementation of this algorithm was inspired by Dr. Sowa's online tutorial (http://besra.hms.harvard.edu/ipmsmsdbs/cgi-bin/tutorial.cgi). The output includes Z-score, S-score, D-score and WD-score. This function also computes entropy and normalized WD-score. The source code for this function was based on the source code at https://github.com/dnusinow/cRomppass
CompPASS(datInput)
datInput |
A data frame with column names: idRun, idBait, idPrey, countPrey. Each row represents one unique protein captured in one pull-down experiment. |
A data frame of unique bait-prey pairs with Z-score, S-score, D-score and WD-score indicating interaction probabilities.
Qingzhou Zhang, [email protected]
Sowa, Mathew E., et al. "Defining the human deubiquitinating enzyme interaction landscape." Cell 138.2 (2009): 389-403. https://doi.org/10.1016/j.cell.2009.04.042
Huttlin, Edward L., et al. "The BioPlex network: a systematic exploration of the human interactome." Cell 162.2 (2015): 425-440. https://doi.org/10.1016/j.cell.2015.06.043
Huttlin, Edward L., et al. "Architecture of the human interactome defines protein communities and disease networks." Nature 545.7655 (2017): 505. https://www.nature.com/articles/nature22366
data(TestDatInput)
datScore <- CompPASS(TestDatInput)
head(datScore)
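To give a feel for what a spoke-model score looks like, the sketch below computes an S-score on a toy input. It assumes the commonly cited form sqrt(k / f * X), where k is the number of baits, f is the number of baits in which the prey was seen, and X is the mean spectral count of the prey for that bait. The helper `sScoreSketch` and the toy data are hypothetical; this is not the code used by CompPASS() and may differ from it in detail.

```r
# Toy input with the four columns described above (values are made up).
toySpoke <- data.frame(
  idRun     = c("r1", "r2", "r3", "r1", "r3"),
  idBait    = c("B1", "B2", "B3", "B1", "B3"),
  idPrey    = c("P1", "P1", "P1", "P2", "P2"),
  countPrey = c(4, 2, 6, 9, 1),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of SMAD: assumed S-score sqrt(k / f * X).
sScoreSketch <- function(dat) {
  k <- length(unique(dat$idBait))  # total number of baits
  # X: mean spectral count per bait-prey pair (averages replicate runs).
  avg <- aggregate(countPrey ~ idBait + idPrey, data = dat, FUN = mean)
  # f: number of distinct baits in which each prey was observed.
  f <- tapply(avg$idBait, avg$idPrey, function(b) length(unique(b)))
  avg$scoreS <- sqrt(avg$countPrey * k / as.numeric(f[as.character(avg$idPrey)]))
  avg
}

sScoreSketch(toySpoke)
```

A prey seen with every bait (P1 here) gets k / f = 1, so its score reduces to the square root of its spectral count, while a prey seen with fewer baits is weighted up.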
DICE The Dice coefficient is used to score the interaction affinity between two proteins.
DICE(datInput)
datInput |
A data frame with column names: idRun, idPrey. Each row represents one unique protein captured in one pull-down experiment. |
A data frame of pairwise combinations of the preys identified in the input, with DICE scores.
Qingzhou Zhang, [email protected]
Bing Zhang et al., From pull-down data to protein interaction networks and complexes with biological relevance, Bioinformatics, Volume 24, Issue 7, 1 April 2008, Pages 979–986, https://doi.org/10.1093/bioinformatics/btn036
data(TestDatInput)
datScore <- DICE(TestDatInput)
head(datScore)
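As an illustration of the Dice coefficient itself, the sketch below scores one pair of preys from the sets of runs in which each was observed, using the standard definition 2 * |A ∩ B| / (|A| + |B|). The helper `diceSketch` and the toy data are hypothetical and are not part of SMAD; DICE() may handle details such as replicate runs differently.

```r
# Toy input with the two columns described above (values are made up).
toyDice <- data.frame(
  idRun  = c("r1", "r1", "r2", "r2", "r3", "r3", "r3"),
  idPrey = c("P1", "P2", "P1", "P3", "P1", "P2", "P3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of SMAD: Dice coefficient for one prey pair.
diceSketch <- function(dat, preyA, preyB) {
  runsA <- unique(dat$idRun[dat$idPrey == preyA])  # runs containing prey A
  runsB <- unique(dat$idRun[dat$idPrey == preyB])  # runs containing prey B
  2 * length(intersect(runsA, runsB)) / (length(runsA) + length(runsB))
}

diceSketch(toyDice, "P1", "P2")  # P1 in 3 runs, P2 in 2 runs, 2 shared
```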
Hart Scoring algorithm based on a hypergeometric distribution error model (Hart et al., 2007).
Hart(datInput)
datInput |
A data frame with column names: idRun, idPrey. Each row represents one unique protein captured in one pull-down experiment. |
A data frame of pairwise combinations of the preys identified in the input, with Hart scores indicating interaction probabilities, computed as negative log-transformed hypergeometric test P-values.
Qingzhou Zhang, [email protected]
Hart, G. Traver, Insuk Lee, and Edward M. Marcotte. "A high-accuracy consensus map of yeast protein complexes reveals modular nature of gene essentiality." BMC Bioinformatics 8.1 (2007): 236. https://doi.org/10.1186/1471-2105-8-236
data(TestDatInput)
datScore <- Hart(TestDatInput)
head(datScore)
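The idea of a negative log-transformed hypergeometric P-value can be sketched with base R's phyper(). The version below is a simplified run-level illustration: it asks how surprising it is for two preys to share at least the observed number of runs. The helper `hyperScoreSketch`, the toy data, and the use of the natural logarithm are all assumptions; Hart() may count co-occurrences and transform P-values differently.

```r
# Toy input: four runs, three preys (values are made up).
toyHyper <- data.frame(
  idRun  = c("r1", "r1", "r2", "r2", "r3", "r4"),
  idPrey = c("P1", "P2", "P1", "P2", "P3", "P3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of SMAD.
hyperScoreSketch <- function(dat, preyA, preyB) {
  nRuns <- length(unique(dat$idRun))
  runsA <- unique(dat$idRun[dat$idPrey == preyA])
  runsB <- unique(dat$idRun[dat$idPrey == preyB])
  k <- length(intersect(runsA, runsB))  # runs shared by both preys
  # P(X >= k): draw length(runsB) runs from nRuns, of which length(runsA)
  # contain prey A; the upper tail therefore starts at k - 1.
  p <- phyper(k - 1, length(runsA), nRuns - length(runsA),
              length(runsB), lower.tail = FALSE)
  -log(p)  # natural log is an assumption here
}

hyperScoreSketch(toyHyper, "P1", "P2")  # always together: P = 1/6
hyperScoreSketch(toyHyper, "P1", "P3")  # never together: P = 1, score 0
```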
HGScore Scoring algorithm based on a hypergeometric distribution error model (Hart et al., 2007) that incorporates the normalized spectral abundance factor (NSAF) (Zybailov, Boris, et al., 2006). This algorithm was first introduced to predict the protein complex network of Drosophila melanogaster (Guruharsha, K. G., et al., 2011). It is based on the matrix model.
HG(datInput)
datInput |
A data frame with column names: idRun, idPrey, countPrey, lenPrey. Each row represents one unique protein captured in one pull-down experiment. |
A data frame of pairwise combinations of the preys identified in the input, with HG scores indicating interaction probabilities, computed as negative log-transformed hypergeometric test P-values.
Qingzhou Zhang, [email protected]
Guruharsha, K. G., et al. "A protein complex network of Drosophila melanogaster." Cell 147.3 (2011): 690-703. https://doi.org/10.1016/j.cell.2011.08.047
Hart, G. Traver, Insuk Lee, and Edward M. Marcotte. "A high-accuracy consensus map of yeast protein complexes reveals modular nature of gene essentiality." BMC Bioinformatics 8.1 (2007): 236. https://doi.org/10.1186/1471-2105-8-236
Zybailov, Boris, et al. "Statistical analysis of membrane proteome expression changes in Saccharomyces cerevisiae." Journal of Proteome Research 5.9 (2006): 2339-2347. https://doi.org/10.1021/pr060161n
data(TestDatInput)
datScore <- HG(TestDatInput)
head(datScore)
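The NSAF step explains why HG() needs countPrey and lenPrey in addition to the run and prey identifiers. A minimal sketch of the usual NSAF definition follows: each spectral count is divided by the protein length, then normalized by the sum of those ratios within the same run. The helper `nsafSketch` and the toy data are hypothetical; how HG() feeds NSAF values into the hypergeometric model is not shown here.

```r
# Toy input with the four columns described above (values are made up).
toyNsaf <- data.frame(
  idRun     = c("r1", "r1", "r2", "r2"),
  idPrey    = c("P1", "P2", "P1", "P3"),
  countPrey = c(10, 5, 4, 12),
  lenPrey   = c(100, 50, 200, 300),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of SMAD: adds an 'nsaf' column.
nsafSketch <- function(dat) {
  saf <- dat$countPrey / dat$lenPrey               # spectral abundance factor
  dat$nsaf <- saf / ave(saf, dat$idRun, FUN = sum) # normalize within each run
  dat
}

nsafSketch(toyNsaf)
```

By construction the NSAF values of all preys in one run sum to 1, which makes abundances comparable across runs with different total spectral counts.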
PE Incorporates both the spoke and matrix models.
PE(datInput, rBait = 0.37, cntPseudo = 1)
datInput |
A data frame with column names: idRun, idBait, idPrey. Each row represents one unique protein captured in one pull-down experiment. |
rBait |
The value of the 'r' parameter as described in the publication. |
cntPseudo |
The value of the 'pseudo count' parameter. |
A data frame of protein-protein interactions scored under both the spoke and matrix models.
Qingzhou Zhang, [email protected]
Collins, Sean R., et al. "Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae." Molecular & Cellular Proteomics 6.3 (2007): 439-450. https://doi.org/10.1074/mcp.M600381-MCP200
data(TestDatInput)
datScore <- PE(TestDatInput, 0.37, 1)
head(datScore)
A subset of the unfiltered BioPlex 2.0 data set in which apoptosis proteins are used as bait proteins.
data(TestDatInput)
A data frame with 5000 rows and 5 variables
idRun A unique identifier for one AP-MS run
idBait A unique identifier for the bait protein
idPrey A unique identifier for the prey protein
countPrey Spectral/peptide count for the prey protein
lenPrey Protein length for the prey protein
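Each scoring function documented in this manual expects a subset of these five columns. The sketch below gathers those requirements in one place and checks a data frame against them before scoring. The names `requiredCols` and `missingCols` are hypothetical helpers, not part of SMAD, and the toy data frame only stands in for a user's own input.

```r
# Column requirements as listed in the argument descriptions of each function.
requiredCols <- list(
  CompPASS = c("idRun", "idBait", "idPrey", "countPrey"),
  DICE     = c("idRun", "idPrey"),
  Hart     = c("idRun", "idPrey"),
  HG       = c("idRun", "idPrey", "countPrey", "lenPrey"),
  PE       = c("idRun", "idBait", "idPrey")
)

# Hypothetical helper: returns the required columns that 'dat' lacks.
missingCols <- function(dat, method) {
  setdiff(requiredCols[[method]], colnames(dat))
}

# Toy input without protein lengths (values are made up).
toyInput <- data.frame(
  idRun = "r1", idBait = "B1", idPrey = "P1", countPrey = 3,
  stringsAsFactors = FALSE
)

missingCols(toyInput, "HG")        # "lenPrey"
missingCols(toyInput, "CompPASS")  # character(0)
```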