项目作者: tmkasun

项目描述 :
Streaming Graph Server with partitioning
高级语言: C++
项目地址: git://github.com/tmkasun/streaming_graph_partitioning.git
创建时间: 2019-05-10T02:35:41Z
项目社区:https://github.com/tmkasun/streaming_graph_partitioning

开源协议:Apache License 2.0

下载


Streaming Graph partitioning

This repository contains the slightly modified version of Fennel, and Linear Deterministic Greedy (LDG) partitioning algorithms to work with stream of edges (a pair of vertices).
The only difference is , Practitioner accept edge of a graph instead of single vertexes and add both vertexes(aka Nodes) to partitions.

I have develp this code for testing out different streaming graph partitioning algorithms, and validate and try out the partioninig quality with visualization.
Demo

Technology stack

Partitioning algorithms are implemented in cpp+11 and visualization is done using Cytoscape.js. Use Apache Kafka for consume stream of edges in a graph.
A python script is used for streaming edges from a graph in a file. It can also stream a graph by taking user inputs from the terminal.
Practitioner accept the graph as a stream of edges , where edge contains a space separated vertex pair, i:e 22 45

Running

Installing

This tool uses cmake for managing package versions and building the binaries. CMake 3.10 or latest version is required to build the tool.

Moreover following packages are being used for the purpose mentioned below. You need to install them seperatly.

  • Pistache : Use for developing the websocket and REST API for the live streaming and fetching graph meta data
  • GCC 9.1.0 (recommended) or above
  • SQLite3 (https://www.sqlite.org/download.html)

    • Download the sqlite-autoconf-3390200 from this URL.
    • Extract sqlite-autoconf-3390200.tar.gz to some location. E.g., /media/user/software/sqlite-autoconf-3390200-install
    • Change to that location and run ./configure --prefix=/media/user/software/sqlite-autoconf-3390200
    • Issue “make” followed by “sudo make install”
    • Once installed specify the target_link_libraries path to libsqlite3.
      E.g., target_link_libraries(JasmineGraph /media/user/software/sqlite-autoconf-3390200/lib/libsqlite3.so)
  • cppkafka (https://github.com/mfontanini/cppkafka)

    • Install librdkafka - Follow the Readme in (https://github.com/edenhill/librdkafka) - use sudo apt install librdkafka-dev
    • Install boost library - use sudo apt-get install libboost-all-dev
    • Follow the guidelines in (https://github.com/mfontanini/cppkafka#compiling) - use cppkafka release v0.3.1
    • When doing the above step for cmake <OPTIONS> .. use of cmake .. should be sufficient enough
    • Once cppkafka is built install it by running sudo make install from the build directory
  • SpdLog (https://github.com/gabime/spdlog)

    • Clone or download the repository from the above link
    • Issue “cmake .”
    • Issue “make” followed by “sudo make install”
  • SQLite3 (https://www.sqlite.org/download.html)

    • Download the sqlite-autoconf-3390200 from this URL.
    • Extract sqlite-autoconf-3390200.tar.gz to some location. E.g., /media/user/software/sqlite-autoconf-3390200-install
    • Change to that location and run ./configure --prefix=/media/user/software/sqlite-autoconf-3390200
    • Issue “make” followed by “sudo make install”
    • Once installed specify the target_link_libraries path to libsqlite3.
      E.g., target_link_libraries(JasmineGraph /media/user/software/sqlite-autoconf-3390200/lib/libsqlite3.so)
  • SQLite3 (https://www.sqlite.org/download.html)

    • Download the sqlite-autoconf-3390200 from this URL.
    • Extract sqlite-autoconf-3390200.tar.gz to some location. E.g., /media/user/software/sqlite-autoconf-3390200-install
    • Change to that location and run ./configure --prefix=/media/user/software/sqlite-autoconf-3390200
    • Issue “make” followed by “sudo make install”
    • Once installed specify the target_link_libraries path to libsqlite3.
      E.g., target_link_libraries(JasmineGraph /media/user/software/sqlite-autoconf-3390200/lib/libsqlite3.so)
  • SQLite3 (https://www.sqlite.org/download.html)

    • Download the sqlite-autoconf-3390200 from this URL.
    • Extract sqlite-autoconf-3390200.tar.gz to some location. E.g., /media/user/software/sqlite-autoconf-3390200-install
    • Change to that location and run ./configure --prefix=/media/user/software/sqlite-autoconf-3390200
    • Issue “make” followed by “sudo make install”
    • Once installed specify the target_link_libraries path to libsqlite3.
      E.g., target_link_libraries(JasmineGraph /media/user/software/sqlite-autoconf-3390200/lib/libsqlite3.so)
  • SQLite3 (https://www.sqlite.org/download.html)

    • Download the sqlite-autoconf-3390200 from this URL.
    • Extract sqlite-autoconf-3390200.tar.gz to some location. E.g., /media/user/software/sqlite-autoconf-3390200-install
    • Change to that location and run ./configure --prefix=/media/user/software/sqlite-autoconf-3390200
    • Issue “make” followed by “sudo make install”
    • Once installed specify the target_link_libraries path to libsqlite3.
      E.g., target_link_libraries(JasmineGraph /media/user/software/sqlite-autoconf-3390200/lib/libsqlite3.so)

More on Streaming Graphs

This was implemented as a part of adding new streaming graph partitioning algorithm to Jasmine Graph server. You could try out Jasmine graph server for more robust work on graph partitioning and analysis.

Algorithms

Implementation of Fennel

I have used following maximization formulae for deriving the partition index for most optimum partitioning. The hubristic used in here is, if a vertex having more neighbors and have not reached to it’s maximum capacity, Then it(Partition) will have most likelihood to accept that vertex.

Inter partition cost

CodeCogsEqn

Intra partition cost

CodeCogsEqn (1)

Results comparison

LDG

  1. ***Received the end of stream
  2. 0 => Vertext count = 2237
  3. 0 => Edges count = 1125
  4. 0 => Edge cuts count = 440
  5. 0 => Cut ratio = 0.28115
  6. 1 => Vertext count = 2235
  7. 1 => Edges count = 1162
  8. 1 => Edge cuts count = 430
  9. 1 => Cut ratio = 0.270101
  10. 2 => Vertext count = 2236
  11. 2 => Edges count = 1156
  12. 2 => Edge cuts count = 447
  13. 2 => Cut ratio = 0.278852
  14. 3 => Vertext count = 2236
  15. 3 => Edges count = 1153
  16. 3 => Edge cuts count = 455
  17. 3 => Cut ratio = 0.28296
  18. Time taken = 3.88941e+07 micro seconds

Fennel

  1. ***Received the end of stream
  2. 0 => Vertext count = 1711
  3. 0 => Edges count = 688
  4. 0 => Edge cuts count = 1758
  5. 0 => Cut ratio = 0.718724
  6. 1 => Vertext count = 1711
  7. 1 => Edges count = 580
  8. 1 => Edge cuts count = 1900
  9. 1 => Cut ratio = 0.766129
  10. 2 => Vertext count = 1710
  11. 2 => Edges count = 553
  12. 2 => Edge cuts count = 1973
  13. 2 => Cut ratio = 0.781077
  14. 3 => Vertext count = 1711
  15. 3 => Edges count = 545
  16. 3 => Edge cuts count = 1961
  17. 3 => Cut ratio = 0.782522
  18. Time taken = 5.8881e+07 micro seconds

Hash

  1. ***Received the end of stream
  2. 0 => Vertext count = 1644
  3. 0 => Edges count = 323
  4. 0 => Edge cuts count = 2601
  5. 0 => Cut ratio = 0.889535
  6. 1 => Vertext count = 1695
  7. 1 => Edges count = 310
  8. 1 => Edge cuts count = 2644
  9. 1 => Cut ratio = 0.895058
  10. 2 => Vertext count = 1672
  11. 2 => Edges count = 306
  12. 2 => Edge cuts count = 2648
  13. 2 => Cut ratio = 0.896412
  14. 3 => Vertext count = 1646
  15. 3 => Edges count = 326
  16. 3 => Edge cuts count = 2535
  17. 3 => Cut ratio = 0.886054
  18. Time taken = 428249 micro seconds

Sample output

  1. ***Received the end of stream
  2. 0 => Vertext count = 5
  3. 0 => Edges count = 4
  4. 0 => Edge cuts count = 4
  5. 0 => Cut ratio = 0.5
  6. Printing edge cuts of 0 partition
  7. 8 ____
  8. | ---> 9
  9. 39 ____
  10. | ---> 36
  11. 14 ____
  12. | ---> 17
  13. 35 ____
  14. | ---> 34
  15. Printing edge list of 0 partition
  16. 6 ____
  17. | ===> 8
  18. 7 ____
  19. | ===> 8
  20. 14 ____
  21. | ===> 16
  22. | ===> 29
  23. 1 => Vertext count = 5
  24. 1 => Edges count = 3
  25. 1 => Edge cuts count = 6
  26. 1 => Cut ratio = 0.666667
  27. Printing edge cuts of 1 partition
  28. 9 ____
  29. | ---> 8
  30. 36 ____
  31. | ---> 39
  32. 19 ____
  33. | ---> 20
  34. 36 ____
  35. | ---> 37
  36. 14 ____
  37. | ---> 13
  38. 36 ____
  39. | ---> 38
  40. Printing edge list of 1 partition
  41. 18 ____
  42. | ===> 19
  43. 35 ____
  44. | ===> 36
  45. 41 ____
  46. | ===> 42
  47. 2 => Vertext count = 4
  48. 2 => Edges count = 1
  49. 2 => Edge cuts count = 3
  50. 2 => Cut ratio = 0.75
  51. Printing edge cuts of 2 partition
  52. 20 ____
  53. | ---> 19
  54. 37 ____
  55. | ---> 36
  56. 15 ____
  57. | ---> 13
  58. Printing edge list of 2 partition
  59. 9 ____
  60. | ===> 10
  61. 3 => Vertext count = 4.5
  62. 3 => Edges count = 3
  63. 3 => Edge cuts count = 5
  64. 3 => Cut ratio = 0.625
  65. Printing edge cuts of 3 partition
  66. 17 ____
  67. | ---> 14
  68. 34 ____
  69. | ---> 35
  70. 13 ____
  71. | ---> 14
  72. 38 ____
  73. | ---> 36
  74. 13 ____
  75. | ---> 15
  76. Printing edge list of 3 partition
  77. 5 ____
  78. | ===> 13
  79. 12 ____
  80. | ===> 13
  81. 33 ____
  82. | ===> 34
  83. Time taken = 28433 micro seconds