Package 'cisPath' reference manual

Title:	Visualization and management of the protein-protein interaction networks.
Description:	cisPath is an R package that uses web browsers to visualize and manage protein-protein interaction networks.
Authors:	Likun Wang <[email protected]>
Maintainer:	Likun Wang <[email protected]>
License:	GPL (>= 3)
Version:	1.47.0
Built:	2025-03-27 03:17:39 UTC
Source:	https://github.com/bioc/cisPath

Visualization of the shortest paths of functional interaction between proteins

Description

This method is used to identify and visualize the shortest functional paths between proteins in the protein–protein interaction (PPI) network.

Usage

cisPath(infoFile, outputDir, proteinName=NULL, targetProteins=NULL, swissProtID=FALSE, sprotFile="", tremblFile="", 
        nodeColors=c("#1F77B4", "#FF7F0E", "#D62728", "#9467BD", "#8C564B", "#E377C2"), leafColor="#2CA02C", byStep=FALSE)
## S4 method for signature 'character,character'
cisPath(infoFile, outputDir, proteinName=NULL, targetProteins=NULL, swissProtID=FALSE, sprotFile="", tremblFile="",  
        nodeColors=c("#1F77B4", "#FF7F0E", "#D62728", "#9467BD", "#8C564B", "#E377C2"), leafColor="#2CA02C", byStep=FALSE)
cisPath(infoFile, outputDir, proteinName=NULL, targetProteins=NULL, swissProtID=FALSE, sprotFile="", tremblFile="", 
        nodeColors=c("#1F77B4", "#FF7F0E", "#D62728", "#9467BD", "#8C564B", "#E377C2"), leafColor="#2CA02C", byStep=FALSE)
## S4 method for signature 'character,character'
cisPath(infoFile, outputDir, proteinName=NULL, targetProteins=NULL, swissProtID=FALSE, sprotFile="", tremblFile="",  
        nodeColors=c("#1F77B4", "#FF7F0E", "#D62728", "#9467BD", "#8C564B", "#E377C2"), leafColor="#2CA02C", byStep=FALSE)

Arguments

`infoFile`	File that contains PPI data (character(1)). Please see the file `PPI_Info.txt` as an example.
`outputDir`	Output directory (character(1)).
`proteinName`	Gene name or Swiss-Prot accession number of the source protein (character(1)). If `null`, identify the shortest functional paths in the web page.
`targetProteins`	Gene names or Swiss-Prot accession numbers of the target proteins (character vector). If `null`, treat all other proteins as potential targets.
`swissProtID`	A logical value. If `targetProteins` contains Swiss-Prot accession numbers, set as `TRUE`. If `targetProteins` contains gene names, set as `FALSE`.
`sprotFile`	Input: File downloaded from the UniProt database (UniProtKB/Swiss-Prot) (character(1)).
`tremblFile`	Input: File downloaded from the UniProt database (UniProtKB/TrEMBL) (character(1)).
`nodeColors`	Represents colors for main nodes in the graph. If there are fewer values than main nodes, it will be recycled in the standard manner. Form: "#RRGGBB", each of the pairs RR, GG, BB consist of two hexadecimal digits giving a value in the range 00 to FF.
`leafColor`	Represents color for leaf nodes in the graph. (character(1)) Form: "#RRGGBB", each of the pairs RR, GG, BB consist of two hexadecimal digits giving a value in the range 00 to FF.
`byStep`	A logical value. If users wish to identify the paths utilizing the shortest number of steps (instead of minimal cost), set `byStep` as `TRUE`. In this situation, all the edge costs will be assigned as `1`. `Note:` If viewing more possible paths between two proteins is desired, we recommend this value be set as `TRUE`.

Details

The input PPI data file infoFile should follow the format as the output files of the method formatSTRINGPPI, formatPINAPPI, or formatiRefIndex. See files STRINGPPI.txt or PINAPPI.txt as examples. The first four fields contain the Swiss-Prot accession numbers and gene names for two interacting proteins. The PubMedID field should be stated to be NA if unavailable. The evidence field may present an introduction to the evidence. The edgeValue field should be assigned a value no less than 1. This value will be treated as the cost while identifying the shortest paths. If there is no method available to estimate this value, please give the value as 1.

The shortest functional paths between the proteins are calculated using Dijkstra's algorithm. The results are shown in an HTML file, and users can easily query them using a browser. Each shortest path is displayed as a force-directed graph (http://bl.ocks.org/4062045) with JavaScript library D3 (www.d3js.org). The HTML file follows HTML 4.01 Strict and CSS version 3 standards to maintain consistency across different browsers. Chrome, Firefox, Safari, and IE9 will all properly display the PPI view. Please contact us if the paths do not display correctly.

As an example, we have generated PPI interaction data for several species from the PINA database (http://cbg.garvan.unsw.edu.au/pina/), STRING database (http://string-db.org/), and iRefIndex database (http://www.irefindex.org/wiki/). Users can download these files from http://www.isb.pku.edu.cn/cisPath/. If you make use of these files, please cite PINA, STRING, and iRefIndex accordingly. Users can edit the PPI interactions generated with these two databases, or combine them with their private data to construct more complete PPI interaction networks. In this package, we select only a small portion of the available PPI interaction data as an example. An ID mapping file is also provided in this package, which was generated according to the data from the UniProt (http://www.uniprot.org/) database.

Using this method, a protein must be chosen as the source protein. However, target proteins need not be chosen. If target proteins are not provided, this method will identify the shortest paths between the source protein and all other relevant proteins. The user can query the results upon finding an interesting “target” protein. Although the output in such cases is large, we strongly suggest users give this method of selecting a source but not a target a try.

A protein often has several names, and some of these names have perhaps not been included in the input file infoFile. We therefore suggest users take a look at the output file targetIDs.txt to check whether the input protein names are valid. In order to avoid inputting invalid target protein names, the unique identifier Swiss-Prot accession numbers may alternatively be used as input. The Swiss-Prot accession numbers can be sought in the UniProt (http://www.uniprot.org/) database. We strongly suggest users provide the files from downloaded from the UniProt database (sprotFile and tremblFile).

All species: ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.dat.gz
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_trembl.dat.gz
Taxonomic divisions: ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/

uniprot_sprot_archaea.dat.gz and uniprot_trembl_archaea.dat.gz contain all archaea entries.
uniprot_sprot_bacteria.dat.gz and uniprot_trembl_bacteria.dat.gz contain all bacteria entries.
uniprot_sprot_fungi.dat.gz and uniprot_trembl_fungi.dat.gz contain all fungi entries.
uniprot_sprot_human.dat.gz and uniprot_trembl_human.dat.gz contain all human entries.
uniprot_sprot_invertebrates.dat.gz and uniprot_trembl_invertebrates.dat.gz contain all invertebrate entries.
uniprot_sprot_mammals.dat.gz and uniprot_trembl_mammals.dat.gz contain all mammalian entries except human and rodent entries.
uniprot_sprot_plants.dat.gz and uniprot_trembl_plants.dat.gz contain all plant entries.
uniprot_sprot_rodents.dat.gz and uniprot_trembl_rodents.dat.gz contain all rodent entries.
uniprot_sprot_vertebrates.dat.gz and uniprot_trembl_vertebrates.dat.gz contain all vertebrate entries except mammals.
uniprot_sprot_viruses.dat.gz and uniprot_trembl_viruses.dat.gz contain all eukaryotic entries except those from vertebrates, fungi and plants.
We suggest you take a look at the README file before you download these files.

If you make use of these files, please cite the UniProt database.

Value

A list will be returned, and each element will contain the shortest paths from the source protein to a target protein.

The output directory contains the shortest paths from a source protein to the target proteins. Users can search for the paths easily using a browser. The file validInputProteins.txt contains the proteins that are valid as input to the HTML file. Please take a look at the output file targetIDs.txt to check whether the input protein names to this method are valid.

References

Cowley, M.J. and et al. (2012) PINA v2.0: mining interactome modules. Nucleic Acids Res, 40, D862-865.

Wu, J. and et al. (2009) Integrated network analysis platform for protein-protein interactions. Nature methods, 6, 75-77.

Razick S. and et al. (2008) iRefIndex: A consolidated protein interaction database with provenance. BMC Bioinformatics, 9, 405

Aranda, B. and et al. (2011) PSICQUIC and PSISCORE: accessing and scoring molecular interactions, Nat Methods, 8, 528-529.

Szklarczyk,D. and et al. (2011) The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res, 39, D561-D568.

Franceschini,A. and et al. (2013) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res, 41, D808-D815.

UniProt Consortium and others. (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res, 40, D71-D75.

Examples

    # examples
    infoFile <- system.file("extdata", "PPI_Info.txt", package="cisPath")
    
    # source protein: TP53
    # Identify the shortest functional paths from TP53 to all other relevant proteins
    outputDir <- file.path(tempdir(), "TP53_example1")
    results <- cisPath(infoFile, outputDir, "TP53", byStep=TRUE)
    
    # Identify the shortest paths from TP53 to proteins MAGI1 and GH1
    outputDir <- file.path(tempdir(), "TP53_example2")
    results <- cisPath(infoFile, outputDir, "TP53", targetProteins=c("MAGI1", "GH1"), byStep=TRUE)
    results
    
    # Identify the shortest paths from TP53 to proteins Q96QZ7 and P01241 (with the Swiss-Prot accession numbers)
    outputDir <- file.path(tempdir(), "TP53_example3")
    results <- cisPath(infoFile, outputDir, "TP53", targetProteins=c("Q96QZ7", "P01241"), swissProtID=TRUE, byStep=FALSE)
    
    # Identify the shortest functional paths in the web page
    outputDir <- file.path(tempdir(), "cisPath_example")
    results <- cisPath(infoFile, outputDir)

## Not run: 
    # example of downloading PPI data from our website
    
    # Change to your own output directory
    outputDir <- file.path(getwd(), "TP53")
    # Create the output directory
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # infoFile: site where the PPI data file will be saved.
    infoFile <- file.path(outputDir, "PPIdata.txt")
    
    # Download PPI data from our website
    download.file("http://www.isb.pku.edu.cn/cispath/data/Homo_sapiens_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Caenorhabditis_elegans_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Drosophila_melanogaster_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Mus_musculus_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Rattus_norvegicus_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Saccharomyces_cerevisiae_PPI.txt", infoFile)
    
    results <- cisPath(infoFile, outputDir, "TP53")
    outputDir <- file.path(getwd(), "cisPathWeb")
    results <- cisPath(infoFile, outputDir)
    
## End(Not run)
# examples
    infoFile <- system.file("extdata", "PPI_Info.txt", package="cisPath")
    
    # source protein: TP53
    # Identify the shortest functional paths from TP53 to all other relevant proteins
    outputDir <- file.path(tempdir(), "TP53_example1")
    results <- cisPath(infoFile, outputDir, "TP53", byStep=TRUE)
    
    # Identify the shortest paths from TP53 to proteins MAGI1 and GH1
    outputDir <- file.path(tempdir(), "TP53_example2")
    results <- cisPath(infoFile, outputDir, "TP53", targetProteins=c("MAGI1", "GH1"), byStep=TRUE)
    results
    
    # Identify the shortest paths from TP53 to proteins Q96QZ7 and P01241 (with the Swiss-Prot accession numbers)
    outputDir <- file.path(tempdir(), "TP53_example3")
    results <- cisPath(infoFile, outputDir, "TP53", targetProteins=c("Q96QZ7", "P01241"), swissProtID=TRUE, byStep=FALSE)
    
    # Identify the shortest functional paths in the web page
    outputDir <- file.path(tempdir(), "cisPath_example")
    results <- cisPath(infoFile, outputDir)

## Not run: 
    # example of downloading PPI data from our website
    
    # Change to your own output directory
    outputDir <- file.path(getwd(), "TP53")
    # Create the output directory
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # infoFile: site where the PPI data file will be saved.
    infoFile <- file.path(outputDir, "PPIdata.txt")
    
    # Download PPI data from our website
    download.file("http://www.isb.pku.edu.cn/cispath/data/Homo_sapiens_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Caenorhabditis_elegans_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Drosophila_melanogaster_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Mus_musculus_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Rattus_norvegicus_PPI.txt", infoFile)
    download.file("http://www.isb.pku.edu.cn/cispath/data/Saccharomyces_cerevisiae_PPI.txt", infoFile)
    
    results <- cisPath(infoFile, outputDir, "TP53")
    outputDir <- file.path(getwd(), "cisPathWeb")
    results <- cisPath(infoFile, outputDir)
    
## End(Not run)

Combine PPI information from databases

Description

This method is used to combine the PPI information generated from different databases.

Usage

combinePPI(input, output, mappingFile="", dbNames="", maxEdgeValue=-1)
## S4 method for signature 'character,character'
combinePPI(input, output, mappingFile="", dbNames="", maxEdgeValue=-1)
combinePPI(input, output, mappingFile="", dbNames="", maxEdgeValue=-1)
## S4 method for signature 'character,character'
combinePPI(input, output, mappingFile="", dbNames="", maxEdgeValue=-1)

Arguments

`input`	Files that contain PPI information (character vector).
`output`	Output file (character(1)).
`mappingFile`	Identifier mapping file (character(1)). Generate this file with method `getMappingFile`.
`dbNames`	dbNames for input files (character vector).
`maxEdgeValue`	Filter out PPI information with edge cost greater than this value. (double(1)). Default: no filter.

Details

The input files should follow the format as the output files of the method formatSTRINGPPI, formatPINAPPI, formatSIFfile, or formatiRefIndex. See the files STRINGPPI.txt or PINAPPI.txt as examples. The first four fields contain the Swiss-Prot accession numbers and gene names for two interacting proteins. The PubMedID field should be stated to be NA if unavailable. The evidence field may present an introduction to the evidence. The edgeValue field should be given a value no less than 1. This value will be treated as the cost while identifying the shortest paths. If there is no method available to estimate this value, please give the value as 1.

In some cases, a protein may have several swiss-Prot accession numbers. The parameter mappingFile is used to unify the swiss-Prot accession numbers for proteins from different databases. The proteins whose swiss-Prot accession numbers are not included in the mappingfile will be ignored as well as the intereractions associated to them.

Examples

    library(cisPath)
    inputFile1 <- system.file("extdata", "PINA2PPI.txt", package="cisPath")
    inputFile2 <- system.file("extdata", "iRefIndex.txt", package="cisPath")
    inputFile3 <- system.file("extdata", "STRINGPPI.txt", package="cisPath")
    output <- file.path(tempdir(), "allPPI.txt")
    combinePPI(c(inputFile1, inputFile2, inputFile3), output, maxEdgeValue=1.4)
library(cisPath)
    inputFile1 <- system.file("extdata", "PINA2PPI.txt", package="cisPath")
    inputFile2 <- system.file("extdata", "iRefIndex.txt", package="cisPath")
    inputFile3 <- system.file("extdata", "STRINGPPI.txt", package="cisPath")
    output <- file.path(tempdir(), "allPPI.txt")
    combinePPI(c(inputFile1, inputFile2, inputFile3), output, maxEdgeValue=1.4)

Easy editor for network graphs

Description

This method is used to open the editor for network graphs.

Usage

easyEditor(outputDir)
## S4 method for signature 'character'
easyEditor(outputDir)
easyEditor(outputDir)
## S4 method for signature 'character'
easyEditor(outputDir)

Arguments

outputDir

Output directory (character(1)).

Value

The output HTML file of this method is a editor for network graphs. Users can draw and edit their own network with this editor.

Examples

    library(cisPath)
    outputDir <- file.path(tempdir(), "easyEditor")
    easyEditor(outputDir)
library(cisPath)
    outputDir <- file.path(tempdir(), "easyEditor")
    easyEditor(outputDir)

Format PPI files downloaded from the iRefIndex database

Description

This method is used to format the PPI file which is downloaded from the iRefIndex database.

Usage

formatiRefIndex(input, output, taxonId="")
## S4 method for signature 'character,character'
formatiRefIndex(input, output, taxonId="")
formatiRefIndex(input, output, taxonId="")
## S4 method for signature 'character,character'
formatiRefIndex(input, output, taxonId="")

Arguments

`input`	File downloaded from the iRefIndex database (character(1)).
`output`	Output file (character(1)).
`taxonId`	NCBI taxonomy specie identifier (character(1)). Process only data for this specie. Examples: `9606: Homo sapiens` `4932: Saccharomyces cerevisiae` `6239: Caenorhabditis elegans` `7227: Drosophila melanogaster` `10090: Mus musculus` `10116: Rattus norvegicus`

Details

The input file is downloaded from the iRefIndex database (http://irefindex.org/wiki/).
Access http://irefindex.org/download/irefindex/data/archive/release_13.0/psi_mitab/MITAB2.6/ to download PPI files with the MITAB2.6 format for different species.
If you make use of this file, please cite the iRefIndex database.

Value

Each line of the output file contains Swiss-Prot accession numbers and gene names for two interacting proteins. The edge value will be assigned as 1 for each link between two interacting proteins. This may be treated as the “cost” while identifying the shortest paths between proteins. Advanced users can edit the file and change this value for each edge.

References

Razick S. and et al. (2008) iRefIndex: A consolidated protein interaction database with provenance. BMC Bioinformatics, 9, 405

Aranda, B. and et al. (2011) PSICQUIC and PSISCORE: accessing and scoring molecular interactions, Nat Methods, 8, 528-529.

Examples

    library(cisPath)
    input <- system.file("extdata", "9606.mitab.100.txt", package="cisPath")
    output <- file.path(tempdir(), "iRefIndex.txt")
    formatiRefIndex(input, output)
    
## Not run: 
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Download iRefIndex PPI for humans only (compressed:~105M, decompressed:~757M)
    destfile <- file.path(outputDir, "9606.mitab.08122013.txt.zip")
    cat("Downloading...\n")
    httpURL <- "http://irefindex.org/download"
    filePath <- "irefindex/data/archive/release_13.0/psi_mitab/MITAB2.6"
    fileName <- "9606.mitab.08122013.txt.zip"
    URL <- paste(httpURL, filePath, fileName, sep="/")
    download.file(URL, destfile)
    unzip(destfile, overwrite=TRUE, exdir=outputDir)

    # Format iRefIndex PPI
    fileFromiRef <- file.path(outputDir, "9606.mitab.08122013.txt")
    iRefIndex <- file.path(outputDir, "iRefIndex.txt")
    formatiRefIndex(fileFromiRef, output=iRefIndex)
    
## End(Not run)
library(cisPath)
    input <- system.file("extdata", "9606.mitab.100.txt", package="cisPath")
    output <- file.path(tempdir(), "iRefIndex.txt")
    formatiRefIndex(input, output)
    
## Not run: 
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Download iRefIndex PPI for humans only (compressed:~105M, decompressed:~757M)
    destfile <- file.path(outputDir, "9606.mitab.08122013.txt.zip")
    cat("Downloading...\n")
    httpURL <- "http://irefindex.org/download"
    filePath <- "irefindex/data/archive/release_13.0/psi_mitab/MITAB2.6"
    fileName <- "9606.mitab.08122013.txt.zip"
    URL <- paste(httpURL, filePath, fileName, sep="/")
    download.file(URL, destfile)
    unzip(destfile, overwrite=TRUE, exdir=outputDir)

    # Format iRefIndex PPI
    fileFromiRef <- file.path(outputDir, "9606.mitab.08122013.txt")
    iRefIndex <- file.path(outputDir, "iRefIndex.txt")
    formatiRefIndex(fileFromiRef, output=iRefIndex)
    
## End(Not run)

Format PPI files downloaded from the PINA database (MITAB Format)

Description

This method is used to format the PPI file which is downloaded from the PINA database (MITAB Format).

Usage

formatPINAPPI(input, output)
## S4 method for signature 'character,character'
formatPINAPPI(input, output)
formatPINAPPI(input, output)
## S4 method for signature 'character,character'
formatPINAPPI(input, output)

Arguments

`input`	File downloaded from the PINA database (character(1)).
`output`	Output file (character(1)).

Details

The input file (MITAB Format) is downloaded from the PINA database (http://cbg.garvan.unsw.edu.au/pina/).
Access http://cbg.garvan.unsw.edu.au/pina/interactome.stat.do to download PPI files with the MITAB format for different species.
If you make use of this file, please cite the PINA database.

Value

References

Cowley, M.J. and et al. (2012) PINA v2.0: mining interactome modules. Nucleic Acids Res, 40, D862-865.

Wu, J. and et al. (2009) Integrated network analysis platform for protein-protein interactions. Nature methods, 6, 75-77.

Examples

    library(cisPath)
    input <- system.file("extdata", "Homo_sapiens_PINA100.txt", package="cisPath")
    output <- file.path(tempdir(), "PINAPPI.txt")
    formatPINAPPI(input, output)
    
## Not run: 
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Download PINA PPI (MITAB format) for humans only (~96M)
    destfile <- file.path(outputDir, "Homo_sapiens.txt")
    cat("Downloading...\n")
    download.file("http://cbg.garvan.unsw.edu.au/pina/download/Homo%20sapiens-20121210.txt", destfile)

    # Format PINA PPI
    fileFromPINA <- file.path(outputDir, "Homo_sapiens.txt")
    PINAPPI <- file.path(outputDir, "PINAPPI.txt")
    formatPINAPPI(fileFromPINA, output=PINAPPI)
    
## End(Not run)
library(cisPath)
    input <- system.file("extdata", "Homo_sapiens_PINA100.txt", package="cisPath")
    output <- file.path(tempdir(), "PINAPPI.txt")
    formatPINAPPI(input, output)
    
## Not run: 
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Download PINA PPI (MITAB format) for humans only (~96M)
    destfile <- file.path(outputDir, "Homo_sapiens.txt")
    cat("Downloading...\n")
    download.file("http://cbg.garvan.unsw.edu.au/pina/download/Homo%20sapiens-20121210.txt", destfile)

    # Format PINA PPI
    fileFromPINA <- file.path(outputDir, "Homo_sapiens.txt")
    PINAPPI <- file.path(outputDir, "PINAPPI.txt")
    formatPINAPPI(fileFromPINA, output=PINAPPI)
    
## End(Not run)

Format PPI file downloaded from the PINA2 database (SIF Format)

Description

This method is used to format the PPI file which is downloaded from the PINA2 database (SIF Format).

Usage

formatSIFfile(input, mappingFile, output)
## S4 method for signature 'character,character,character'
formatSIFfile(input, mappingFile, output)
formatSIFfile(input, mappingFile, output)
## S4 method for signature 'character,character,character'
formatSIFfile(input, mappingFile, output)

Arguments

`input`	File downloaded from the PINA2 database (SIF Format) (character(1)).
`mappingFile`	Identifier mapping file (character(1)). Generate this file with method `getMappingFile`.
`output`	Output file (character(1)).

Details

The input file (SIF Format) is downloaded from PINA2 database (http://cbg.garvan.unsw.edu.au/pina/).
Access http://cbg.garvan.unsw.edu.au/pina/interactome.stat.do to download PPI files with the SIF format for different species.
If you make use of this file, please cite the PINA database.

Value

References

Cowley, M.J. and et al. (2012) PINA v2.0: mining interactome modules. Nucleic Acids Res, 40, D862-865.

Wu, J. and et al. (2009) Integrated network analysis platform for protein-protein interactions. Nature methods, 6, 75-77.

Examples

    library(cisPath)
    
    # Generate the identifier mapping file 
    input <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
    mappingFile <- file.path(tempdir(), "mappingFile.txt")
    getMappingFile(input, output=mappingFile, taxonId="9606")
    
    # Format the file downloaded from STRING database
    output <- file.path(tempdir(), "PINA2PPI.txt")
    fileFromPINA2 <- system.file("extdata", "Homo_sapiens_PINA2.sif", package="cisPath")
    formatSIFfile(fileFromPINA2, mappingFile, output)
    
## Not run: 
    if (!requireNamespace("BiocManager", quietly=TRUE))
        install.packages("BiocManager")
    BiocManager::install("R.utils")
    library(R.utils)
    
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Generate the identifier mapping file 
    fileFromUniProt <- file.path(outputDir, "uniprot_sprot_human.dat")
    mappingFile <- file.path(outputDir, "mappingFile.txt")
    getMappingFile(fileFromUniProt, output=mappingFile)
    
    # Download PINA2 PPI (SIF format) for humans only (~2.8M)
    destfile <- file.path(outputDir, "Homo_sapiens.sif")
    cat("Downloading...\n")
    download.file("http://cbg.garvan.unsw.edu.au/pina/download/Homo%20sapiens-20140521.sif", destfile)
    
    # Format PINA2 PPI
    fileFromPINA2 <- file.path(outputDir, "Homo_sapiens.sif")
    PINA2PPI <- file.path(outputDir, "PINA2PPI.txt")
    formatSIFfile(fileFromPINA2, mappingFile, output=PINA2PPI)
    
## End(Not run)    
library(cisPath)
    
    # Generate the identifier mapping file 
    input <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
    mappingFile <- file.path(tempdir(), "mappingFile.txt")
    getMappingFile(input, output=mappingFile, taxonId="9606")
    
    # Format the file downloaded from STRING database
    output <- file.path(tempdir(), "PINA2PPI.txt")
    fileFromPINA2 <- system.file("extdata", "Homo_sapiens_PINA2.sif", package="cisPath")
    formatSIFfile(fileFromPINA2, mappingFile, output)
    
## Not run: 
    if (!requireNamespace("BiocManager", quietly=TRUE))
        install.packages("BiocManager")
    BiocManager::install("R.utils")
    library(R.utils)
    
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Generate the identifier mapping file 
    fileFromUniProt <- file.path(outputDir, "uniprot_sprot_human.dat")
    mappingFile <- file.path(outputDir, "mappingFile.txt")
    getMappingFile(fileFromUniProt, output=mappingFile)
    
    # Download PINA2 PPI (SIF format) for humans only (~2.8M)
    destfile <- file.path(outputDir, "Homo_sapiens.sif")
    cat("Downloading...\n")
    download.file("http://cbg.garvan.unsw.edu.au/pina/download/Homo%20sapiens-20140521.sif", destfile)
    
    # Format PINA2 PPI
    fileFromPINA2 <- file.path(outputDir, "Homo_sapiens.sif")
    PINA2PPI <- file.path(outputDir, "PINA2PPI.txt")
    formatSIFfile(fileFromPINA2, mappingFile, output=PINA2PPI)
    
## End(Not run)

Format PPI file downloaded from the STRING database

Description

This method is used to format the PPI file which is downloaded from the STRING database.

Usage

formatSTRINGPPI(input, mappingFile, taxonId, output, minScore=700)
## S4 method for signature 'character,character,character,character'
formatSTRINGPPI(input, mappingFile, taxonId, output, minScore=700)
formatSTRINGPPI(input, mappingFile, taxonId, output, minScore=700)
## S4 method for signature 'character,character,character,character'
formatSTRINGPPI(input, mappingFile, taxonId, output, minScore=700)

Arguments

`input`	File downloaded from the STRING database (character(1)).
`mappingFile`	Identifier mapping file (character(1)). Generate this file with method `getMappingFile`.
`taxonId`	NCBI taxonomy specie identifier (character(1)). Process only data for this specie. Examples: `9606: Homo sapiens` `4932: Saccharomyces cerevisiae` `6239: Caenorhabditis elegans` `7227: Drosophila melanogaster` `10090: Mus musculus` `10116: Rattus norvegicus`
`output`	Output file (character(1)).
`minScore`	Filter out PPI information with STRING scores less than this value. (integer(1)). Recommended default 700 (Only consider high confidence interactions).

Details

The input file is downloaded from the STRING database (http://string-db.org/). The URL of this file is http://string-db.org/newstring_download/protein.links.v9.1.txt.gz. Access http://string-db.org/newstring_download/species.v9.1.txt to determine the parameter taxonId. Access http://string-db.org/newstring_cgi/show_download_page.pl for more details.
If you make use of this file, please cite the STRING database.

Value

Each line of the output file contains Swiss-Prot accession numbers and gene names for two interacting proteins. An edge value is estimated for each link between two interacting proteins. This value is defined as max(1,log(1000-STRING_SCORE,100)). This may be treated as the “cost” while determining the shortest paths between proteins. Advanced users can edit the file and change this value for each edge.

References

Szklarczyk,D. and et al. (2011) The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res, 39, D561-D568.

Franceschini,A. and et al. (2013) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res, 41, D808-D815.

UniProt Consortium and others. (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res 40, D71-D75.

Examples

    library(cisPath)
    
    # Generate the identifier mapping file 
    input <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
    mappingFile <- file.path(tempdir(), "mappingFile.txt")
    getMappingFile(input, output=mappingFile, taxonId="9606")
    
    # Format the file downloaded from STRING database
    output <- file.path(tempdir(), "STRINGPPI.txt")
    fileFromSTRING <- system.file("extdata", "protein.links.txt", package="cisPath")
    formatSTRINGPPI(fileFromSTRING, mappingFile, "9606", output, 700)
    
## Not run: 
    if (!requireNamespace("BiocManager", quietly=TRUE))
        install.packages("BiocManager")
    BiocManager::install("R.utils")
    library(R.utils)
    
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Generate the identifier mapping file 
    fileFromUniProt <- file.path(outputDir, "uniprot_sprot_human.dat")
    mappingFile <- file.path(outputDir, "mappingFile.txt")
    getMappingFile(fileFromUniProt, output=mappingFile)
    
    # Download STRING PPI for Homo sapiens (compressed:~27M, decompressed:~213M)
    destfile <- file.path(outputDir, "9606.protein.links.v9.1.txt.gz")
    cat("Downloading...\n")
    download.file("http://string-db.org/newstring_download/protein.links.v9.1/9606.protein.links.v9.1.txt.gz", destfile)
    cat("Uncompressing...\n")
    gunzip(destfile, overwrite=TRUE, remove=FALSE)
    
    # Format STRING PPI
    fileFromSTRING <- file.path(outputDir, "9606.protein.links.v9.1.txt")
    STRINGPPI <- file.path(outputDir, "STRINGPPI.txt")
    formatSTRINGPPI(fileFromSTRING, mappingFile, "9606", output=STRINGPPI, 700)
    
## End(Not run)    
library(cisPath)
    
    # Generate the identifier mapping file 
    input <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
    mappingFile <- file.path(tempdir(), "mappingFile.txt")
    getMappingFile(input, output=mappingFile, taxonId="9606")
    
    # Format the file downloaded from STRING database
    output <- file.path(tempdir(), "STRINGPPI.txt")
    fileFromSTRING <- system.file("extdata", "protein.links.txt", package="cisPath")
    formatSTRINGPPI(fileFromSTRING, mappingFile, "9606", output, 700)
    
## Not run: 
    if (!requireNamespace("BiocManager", quietly=TRUE))
        install.packages("BiocManager")
    BiocManager::install("R.utils")
    library(R.utils)
    
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Generate the identifier mapping file 
    fileFromUniProt <- file.path(outputDir, "uniprot_sprot_human.dat")
    mappingFile <- file.path(outputDir, "mappingFile.txt")
    getMappingFile(fileFromUniProt, output=mappingFile)
    
    # Download STRING PPI for Homo sapiens (compressed:~27M, decompressed:~213M)
    destfile <- file.path(outputDir, "9606.protein.links.v9.1.txt.gz")
    cat("Downloading...\n")
    download.file("http://string-db.org/newstring_download/protein.links.v9.1/9606.protein.links.v9.1.txt.gz", destfile)
    cat("Uncompressing...\n")
    gunzip(destfile, overwrite=TRUE, remove=FALSE)
    
    # Format STRING PPI
    fileFromSTRING <- file.path(outputDir, "9606.protein.links.v9.1.txt")
    STRINGPPI <- file.path(outputDir, "STRINGPPI.txt")
    formatSTRINGPPI(fileFromSTRING, mappingFile, "9606", output=STRINGPPI, 700)
    
## End(Not run)

Generate the identifier mapping file

Description

This method is used to generate the identifier mapping file which is necessary for methods formatSIFfile and formatSTRINGPPI.

Usage

getMappingFile(sprotFile, output, tremblFile="", taxonId="")
## S4 method for signature 'character,character'
getMappingFile(sprotFile, output, tremblFile="", taxonId="")
getMappingFile(sprotFile, output, tremblFile="", taxonId="")
## S4 method for signature 'character,character'
getMappingFile(sprotFile, output, tremblFile="", taxonId="")

Arguments

`sprotFile`	Input: File downloaded from the UniProt database (UniProtKB/Swiss-Prot) (character(1)).
`output`	Output file (character(1)).
`tremblFile`	Input: File downloaded from the UniProt database (UniProtKB/TrEMBL) (character(1)).
`taxonId`	NCBI taxonomy specie identifier (character(1)). This method will process only data for this specie. Default: process all data (recommended).

Details

UniProtKB/Swiss-Prot: fully annotated curated entries.
UniProtKB/TrEMBL: computer-generated entries enriched with automated classification and annotation.
sprotFile is mandatory, while tremblFile is optional. If users only want to process the reviewed proteins from the UniProt database, tremblFile should be ignored.

If you make use of these files, please cite the UniProt database.

Value

The output file contains identifier mapping information which is necessary for methods formatSIFfile and formatSTRINGPPI. Each line contains both the Ensembl Genomes Protein identifier and the Swiss-Prot accession number for a given protein.

References

UniProt Consortium and others. (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res 40, D71-D75.

Examples

    library(cisPath)
    sprotFile <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
    output <- file.path(tempdir(), "mappingFile.txt")
    getMappingFile(sprotFile, output, taxonId="9606")
    
## Not run: 
    if (!requireNamespace("BiocManager", quietly=TRUE))
        install.packages("BiocManager")
    BiocManager::install("R.utils")
    library(R.utils)
    
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Download protein information file for humans only from UniProt (decompressed:~246M)
    destfile <- file.path(outputDir, "uniprot_sprot_human.dat.gz");
    cat("Downloading...\n")
    download.file("ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_sprot_human.dat.gz", destfile)
    gunzip(destfile, overwrite=TRUE, remove=FALSE)
    
    destfile <- file.path(outputDir, "uniprot_trembl_human.dat.gz");
    cat("Downloading...\n")
    download.file("ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_trembl_human.dat.gz", destfile)
    gunzip(destfile, overwrite=TRUE, remove=FALSE)
    
    # Generate identifier mapping file
    sprotFile <- file.path(outputDir, "uniprot_sprot_human.dat")
    tremblFile <- file.path(outputDir, "uniprot_trembl_human.dat")
    mappingFile <- file.path(outputDir, "mappingFile.txt")
    getMappingFile(sprotFile, output=mappingFile, tremblFile)
    
## End(Not run)
library(cisPath)
    sprotFile <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
    output <- file.path(tempdir(), "mappingFile.txt")
    getMappingFile(sprotFile, output, taxonId="9606")
    
## Not run: 
    if (!requireNamespace("BiocManager", quietly=TRUE))
        install.packages("BiocManager")
    BiocManager::install("R.utils")
    library(R.utils)
    
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Download protein information file for humans only from UniProt (decompressed:~246M)
    destfile <- file.path(outputDir, "uniprot_sprot_human.dat.gz");
    cat("Downloading...\n")
    download.file("ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_sprot_human.dat.gz", destfile)
    gunzip(destfile, overwrite=TRUE, remove=FALSE)
    
    destfile <- file.path(outputDir, "uniprot_trembl_human.dat.gz");
    cat("Downloading...\n")
    download.file("ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_trembl_human.dat.gz", destfile)
    gunzip(destfile, overwrite=TRUE, remove=FALSE)
    
    # Generate identifier mapping file
    sprotFile <- file.path(outputDir, "uniprot_sprot_human.dat")
    tremblFile <- file.path(outputDir, "uniprot_trembl_human.dat")
    mappingFile <- file.path(outputDir, "mappingFile.txt")
    getMappingFile(sprotFile, output=mappingFile, tremblFile)
    
## End(Not run)

Visualization of the interactions between input proteins

Description

This method can visualize the input proteins in the protein–protein interaction (PPI) network and display the evidence that supports the specific interactions among these proteins.

Usage

networkView(infoFile, proteinNames, outputDir, swissProtID=FALSE, mainNode=c(1), nodeColors=c("#1F77B4", "#FF7F0E", "#D62728","#9467BD","#8C564B","#E377C2"), displayMore=TRUE, leafColor="#2CA02C")
## S4 method for signature 'character,character,character'
networkView(infoFile, proteinNames, outputDir, swissProtID=FALSE, mainNode=c(1), nodeColors=c("#1F77B4", "#FF7F0E", "#D62728","#9467BD","#8C564B","#E377C2"), displayMore=TRUE, leafColor="#2CA02C")
networkView(infoFile, proteinNames, outputDir, swissProtID=FALSE, mainNode=c(1), nodeColors=c("#1F77B4", "#FF7F0E", "#D62728","#9467BD","#8C564B","#E377C2"), displayMore=TRUE, leafColor="#2CA02C")
## S4 method for signature 'character,character,character'
networkView(infoFile, proteinNames, outputDir, swissProtID=FALSE, mainNode=c(1), nodeColors=c("#1F77B4", "#FF7F0E", "#D62728","#9467BD","#8C564B","#E377C2"), displayMore=TRUE, leafColor="#2CA02C")

Arguments

`infoFile`	File that contains PPI data (character(1)). Please see the file `PPI_Info.txt` as an example.
`proteinNames`	Gene names or Swiss-Prot accession numbers of the proteins (character vector).
`outputDir`	Output directory (character(1)).
`swissProtID`	A logical value. If `proteinNames` contains Swiss-Prot accession numbers, set as `TRUE`. If `proteinNames` contains gene names, set as `FALSE`.
`mainNode`	A vector of value `0` and `1`. `0`: display the corresponding protein as a leaf node. `1`: display the corresponding protein as a main node. If there are fewer values than proteins, it will be recycled in the standard manner.
`nodeColors`	Represents colors for proteins. If there are fewer values than proteins, it will be recycled in the standard manner. Form: "#RRGGBB", each of the pairs RR, GG, BB consist of two hexadecimal digits giving a value in the range 00 to FF.
`displayMore`	A logical value. If `TRUE`, other proteins will be displayed which can interact with at least `two` proteins in `proteinNames`.
`leafColor`	Only used while `displayMore=TRUE`. This is the color for additional proteins. (character(1)) Form: "#RRGGBB", each of the pairs RR, GG, BB consist of two hexadecimal digits giving a value in the range 00 to FF.

Details

As an example, we have generated PPI interaction data for several species from the PINA database (http://cbg.garvan.unsw.edu.au/pina/) and the STRING database (http://string-db.org/). Users can download these files from http://www.isb.pku.edu.cn/cisPath/. If you make use of these files, please cite PINA or STRING accordingly. Users can edit the PPIs generated with these two databases, or combine them with their private data to construct more complete PPI networks. In this package, we select only a small portion of the available PPI data as an example.

A protein often has several names, and some of these names have perhaps not been included in the input file infoFile. We therefore suggest users take a look at the output file proteinIDs.txt to check whether the input protein names are valid. In order to avoid inputting invalid target protein names, the unique identifier Swiss-Prot accession numbers may alternatively be used as input. The Swiss-Prot accession numbers can be sought in the UniProt (http://www.uniprot.org/) database.

Value

The output HTML file contains the PPIs and related evidence supporting them. Users can view the PPI network and the evidence supporting these interactions easily using a browser. Please take a look at the output file proteinIDs.txt to check whether the input protein names are valid.

References

Cowley, M.J. and et al. (2012) PINA v2.0: mining interactome modules. Nucleic Acids Res, 40, D862-865.

Wu, J. and et al. (2009) Integrated network analysis platform for protein-protein interactions. Nature methods, 6, 75-77.

Szklarczyk,D. and et al. (2011) The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res, 39, D561-D568.

Franceschini,A. and et al. (2013) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res, 41, D808-D815.

UniProt Consortium and others. (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res 40, D71-D75.

Examples

    library(cisPath)
    infoFile <- system.file("extdata", "PPI_Info.txt", package="cisPath")
    outputDir <- file.path(tempdir(), "networkView")
    networkView(infoFile, c("MAGI1","TP53BP2","TP53", "PTEN"), outputDir, FALSE, c(1,1,1,0), displayMore=TRUE)
    
    outputDir2 <- file.path(tempdir(), "networkView2")
    inputFile <- system.file("extdata", "networkView.txt", package="cisPath")
    rt <- read.table(inputFile, sep=",", comment.char="", header=TRUE)
    proteins <- as.vector(rt[,1])
    sizes <- as.vector(rt[,2])
    nodeColors <- as.vector(rt[,3])
    networkView(infoFile, proteins, outputDir2, FALSE, sizes, nodeColors, displayMore=FALSE)
library(cisPath)
    infoFile <- system.file("extdata", "PPI_Info.txt", package="cisPath")
    outputDir <- file.path(tempdir(), "networkView")
    networkView(infoFile, c("MAGI1","TP53BP2","TP53", "PTEN"), outputDir, FALSE, c(1,1,1,0), displayMore=TRUE)
    
    outputDir2 <- file.path(tempdir(), "networkView2")
    inputFile <- system.file("extdata", "networkView.txt", package="cisPath")
    rt <- read.table(inputFile, sep=",", comment.char="", header=TRUE)
    proteins <- as.vector(rt[,1])
    sizes <- as.vector(rt[,2])
    nodeColors <- as.vector(rt[,3])
    networkView(infoFile, proteins, outputDir2, FALSE, sizes, nodeColors, displayMore=FALSE)

Package 'cisPath'

Help Index

Visualization of the shortest paths of functional interaction between proteins

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Combine PPI information from databases

Description

Usage

Arguments

Details

See Also

Examples

Easy editor for network graphs

Description

Usage

Arguments

Value

Examples

Format PPI files downloaded from the iRefIndex database

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Format PPI files downloaded from the PINA database (MITAB Format)

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Format PPI file downloaded from the PINA2 database (SIF Format)

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Format PPI file downloaded from the STRING database

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Generate the identifier mapping file

Description

Usage

Arguments

Details

Value

References

See Also

Examples

Visualization of the interactions between input proteins

Description

Usage

Arguments

Details

Value

References

See Also

Examples