Project author: lelimat

Project description:
Pipeline to create and analyze protein-protein interaction networks.
Language: Python
Repository: git://github.com/lelimat/PPI_network_pipeline.git
Created: 2017-03-10T20:08:00Z
Project page: https://github.com/lelimat/PPI_network_pipeline


PPI network pipeline

These commands create a network from a set of genes that you provide in a file named “seeds.txt”.

Download this pipeline

  • With git:

    1. git clone git@github.com:lelimat/PPI_network_pipeline.git
    2. cd PPI_network_pipeline
  • Or by downloading the zip file directly:

    1. wget https://github.com/lelimat/PPI_network_pipeline/archive/master.zip
    2. unzip master.zip
    3. cd PPI_network_pipeline-master

Download PPI database (iRefIndex)

Usage: R --slave --file=get_PPI_data_iRefIndex.R --args [min_evidence] [database_dir]

  • [min_evidence] is the minimum number of times an interaction must be reported to be kept. I usually choose 2.
  • [database_dir] is the directory where the iRefIndex file will be saved.

    1. R --slave --file=get_PPI_data_iRefIndex.R --args 2 databases
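
To illustrate what the [min_evidence] threshold means, here is a minimal Python sketch. It is not the code in get_PPI_data_iRefIndex.R. It assumes that each report of an interaction is available as a pair of gene symbols and that evidence is counted as the number of reports of the same unordered pair; the function name filter_by_evidence and the example pairs are hypothetical.

```python
from collections import Counter


def filter_by_evidence(reports, min_evidence=2):
    """Keep unordered gene pairs reported at least min_evidence times.

    `reports` is an iterable of (gene_a, gene_b) tuples, one per report.
    This input layout is an assumption, not the iRefIndex file format.
    """
    # Sort each pair so that (A, B) and (B, A) count as the same interaction.
    counts = Counter(tuple(sorted(pair)) for pair in reports)
    return sorted(pair for pair, n in counts.items() if n >= min_evidence)


# Hypothetical reports: the TP53/MDM2 pair appears twice, the others once.
reports = [
    ("TP53", "MDM2"),
    ("MDM2", "TP53"),
    ("TP53", "EP300"),
    ("BRCA1", "BARD1"),
]
kept = filter_by_evidence(reports, min_evidence=2)
print(kept)
```

With a threshold of 2, only the pair that was reported twice survives; a threshold of 1 keeps every reported pair.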

Download gene information

  1. wget ftp://ftp.ncbi.nih.gov/gene/DATA/GENE_INFO/Mammalia/Homo_sapiens.gene_info.gz
  2. gunzip Homo_sapiens.gene_info.gz

Create network from seed genes (initial set of interest)

  1. python create_net_irefindex.py --database=iRefIndex_human.txt --seeds=seeds.txt --output=. --distance=1

The distance is the number of levels of neighbors added to the network, starting from the seeds. If you only need to see how the seeds connect to each other, choose distance=0. For seeds plus first neighbors, choose distance=1, and so on.

Output:

  • edges.txt: File with edges created for the network.
  • genes_not_found.txt: List of genes (from the seeds) whose names were not found in the PPI database.
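
The effect of the distance option can be sketched in Python as a level-by-level expansion from the seeds. This is an illustration under stated assumptions, not the logic of create_net_irefindex.py: it assumes the interactions are given as pairs of gene symbols, and that an edge is kept whenever both of its genes are in the expanded node set. The function name expand_network and the single-letter gene names are hypothetical.

```python
def expand_network(interactions, seeds, distance=1):
    """Return (edges, genes_not_found) for seeds grown by `distance` levels."""
    # Build an undirected adjacency map from the interaction pairs.
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    # Seeds that never appear in any interaction cannot be placed in the network.
    genes_not_found = sorted(s for s in seeds if s not in neighbors)
    nodes = {s for s in seeds if s in neighbors}

    # Each pass adds one more level of neighbors; distance=0 adds none.
    frontier = set(nodes)
    for _ in range(distance):
        frontier = {n for g in frontier for n in neighbors[g]} - nodes
        nodes |= frontier

    # Assumption: keep every interaction whose two genes are both in the node set.
    edges = sorted(
        {tuple(sorted((a, b))) for a, b in interactions if a in nodes and b in nodes}
    )
    return edges, genes_not_found


# Hypothetical toy data: X is a seed that is absent from the interactions.
interactions = [("A", "B"), ("A", "C"), ("C", "D"), ("D", "E")]
seeds = ["A", "C", "X"]

edges0, missing = expand_network(interactions, seeds, distance=0)
edges1, _ = expand_network(interactions, seeds, distance=1)
print(edges0, missing)
print(edges1)
```

In this toy example, distance=0 keeps only the A-C edge between the two seeds that were found, while distance=1 also brings in B and D as first neighbors. The D-E edge stays out until distance=2.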

Calculate network topology

  1. echo -e "node_1\tnode_2" > edges_attributes.txt
  2. cat edges.txt >> edges_attributes.txt
  3. python network_topology.py edges_attributes.txt Homo_sapiens.gene_info
  4. rm edges.txt
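
Because the behavior of echo -e differs between shells, the header step (steps 1 and 2 above) can also be done in Python. The sketch below is an equivalent under the assumption that edges.txt is a plain tab-separated text file; the function name add_header and the example gene pairs are hypothetical, and the demo runs in a temporary directory so no real files are touched.

```python
import tempfile
from pathlib import Path


def add_header(edges_path, out_path):
    """Write a node_1/node_2 header line followed by the edge list."""
    edges = Path(edges_path).read_text()
    Path(out_path).write_text("node_1\tnode_2\n" + edges)


# Demo with a hypothetical two-edge file in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    edges_file = Path(tmp) / "edges.txt"
    out_file = Path(tmp) / "edges_attributes.txt"
    edges_file.write_text("TP53\tMDM2\nBRCA1\tBARD1\n")
    add_header(edges_file, out_file)
    result = out_file.read_text()

print(result)
```

In the pipeline directory the call would be add_header("edges.txt", "edges_attributes.txt"), followed by the network_topology.py step as before.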

Create plots for brokers, bridges and bottlenecks

  1. R --slave --file=get_network_info.R --args nodes_attributes.txt

Getting nodes in the initial list

  1. grep Yes nodes_attributes.txt | cut -f1 > in_list_Yes.txt

Separating the gene symbols in the network

  1. cut -f1 nodes_attributes.txt | grep -v node > all_genes.txt
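
For readers who prefer Python to grep and cut, the two extraction steps (the initial-list genes and all gene symbols) can be mirrored as follows. This sketch reproduces the shell commands literally: a line counts as "in the initial list" if it contains the text Yes anywhere, and a first field is dropped if it contains the text node. The column names in the example are hypothetical, since the exact layout of nodes_attributes.txt is not described here.

```python
def split_gene_lists(lines):
    """Return (in_list_genes, all_genes) from the lines of nodes_attributes.txt."""
    rows = [ln.rstrip("\n") for ln in lines]
    # Mirrors `grep Yes | cut -f1`: first field of every line containing "Yes".
    in_list = [row.split("\t")[0] for row in rows if "Yes" in row]
    # Mirrors `cut -f1 | grep -v node`: first fields that do not contain "node".
    all_genes = [g for g in (row.split("\t")[0] for row in rows) if "node" not in g]
    return in_list, all_genes


# Hypothetical file content: a header line and three genes.
example = [
    "node\tin_list\n",
    "TP53\tYes\n",
    "MDM2\tNo\n",
    "BRCA1\tYes\n",
]
in_list, all_genes = split_gene_lists(example)
print(in_list)
print(all_genes)
```

With the real file, the lines could be read with open("nodes_attributes.txt") and the two lists written to in_list_Yes.txt and all_genes.txt, one gene per line.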

Creating Cytoscape file

  1. python net_cytoscape.py

The command above generates a file, “network.xgmml”, to be opened with Cytoscape.