Author: Abhi9k

Description:
Parallel implementation of the Aho-Corasick algorithm using OpenACC and CUDA
Language: C++
Repository: git://github.com/Abhi9k/AhoCorasickParallel.git
Created: 2018-05-06T16:42:07Z
Project page: https://github.com/Abhi9k/AhoCorasickParallel

License: Apache License 2.0


AhoCorasickParallel

Parallel implementation of the Aho-Corasick algorithm using OpenACC and CUDA

Test Environment

All experiments were conducted on the ironclaw1 and ironclaw2 machines.

  1. Hardware Specifications
    1. Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz
    2. 32 GB memory
    3. 2 Grid K1 GPUs
    4. 2 NUMA nodes
  2. Software Specifications
    1. Operating System: CentOS Linux 7 (Core)
    2. gcc version 4.8.5
    3. NVIDIA (R) Cuda compiler driver V8.0.61

Code walkthrough

  1. main.cpp: driver file where execution starts.
    1. It loads the tweets and the bad words, builds the DFA state transition table (STT) and the fail state table, and runs the various experiment setups.
  2. ac_utils.cpp: contains the logic for building the DFA state transition table (STT) and the fail state table.
  3. ac_serial.cpp: contains the serial implementation of the AC algorithm.
  4. ac_open_acc.cpp: contains the OpenACC implementation of the AC algorithm.
  5. ac.cu: contains the CUDA implementation of the AC algorithm.
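
The two tables named in the walkthrough can be illustrated with a minimal serial sketch. This is not the code from ac_utils.cpp or ac_serial.cpp: the names `Automaton`, `buildAutomaton`, `countMatches`, and `kAlphabet` are hypothetical, and the 256-column byte alphabet is an assumption. The sketch builds a trie as rows of a state transition table, fills the fail state table with a breadth-first pass, and then scans a text.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// All names here are hypothetical; they are not taken from the project.
constexpr int kAlphabet = 256;  // assumption: one STT column per byte value

struct Automaton {
    std::vector<std::array<int, kAlphabet>> stt;  // state transition table, -1 = no edge
    std::vector<int> fail;                        // fail state of each state
    std::vector<int> hits;                        // patterns recognised on reaching a state
};

Automaton buildAutomaton(const std::vector<std::string>& patterns) {
    Automaton ac;
    std::array<int, kAlphabet> empty;
    empty.fill(-1);
    ac.stt.push_back(empty);  // state 0 is the root
    ac.fail.push_back(0);
    ac.hits.push_back(0);

    // Phase 1: insert every pattern into a trie stored as table rows.
    for (const std::string& p : patterns) {
        int s = 0;
        for (unsigned char c : p) {
            if (ac.stt[s][c] == -1) {
                ac.stt[s][c] = static_cast<int>(ac.stt.size());
                ac.stt.push_back(empty);
                ac.fail.push_back(0);
                ac.hits.push_back(0);
            }
            s = ac.stt[s][c];
        }
        ++ac.hits[s];
    }

    // Phase 2: breadth-first pass that fills the fail state table.
    std::queue<int> q;
    for (int c = 0; c < kAlphabet; ++c) {
        int t = ac.stt[0][c];
        if (t == -1) ac.stt[0][c] = 0;  // the root loops on unmatched characters
        else q.push(t);                 // depth-1 states fail to the root
    }
    while (!q.empty()) {
        int s = q.front();
        q.pop();
        for (int c = 0; c < kAlphabet; ++c) {
            int t = ac.stt[s][c];
            if (t == -1) continue;
            int f = ac.fail[s];
            while (ac.stt[f][c] == -1) f = ac.fail[f];  // stops at the root at the latest
            ac.fail[t] = ac.stt[f][c];
            ac.hits[t] += ac.hits[ac.fail[t]];  // inherit matches that end at the fail state
            q.push(t);
        }
    }
    return ac;
}

// Serial scan: counts every occurrence of every pattern in the text.
std::size_t countMatches(const Automaton& ac, const std::string& text) {
    assert(!ac.stt.empty() && "automaton must be built first");
    std::size_t total = 0;
    int s = 0;
    for (unsigned char c : text) {
        while (ac.stt[s][c] == -1) s = ac.fail[s];  // follow fail links on a mismatch
        s = ac.stt[s][c];
        total += static_cast<std::size_t>(ac.hits[s]);
    }
    return total;
}
```

With the patterns "he", "she", "his", and "hers", scanning "ushers" with `countMatches` reports three matches (she, he, hers). Because the tables are read-only after construction, each record can in principle be scanned independently, which is what makes the algorithm a natural fit for OpenACC and CUDA.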

The ‘data’ directory holds all the input data sets.

Directions for running the experiment

The project has a Makefile which takes care of compiling the project.
Please enter the project folder and then follow the steps given below:

  1. make clean (makes sure old object files are removed)
  2. make
  3. ./acp -l "data/bad_words"

To search for the best 'block size' and 'thread size' for CUDA, run the following command after Steps 1 and 2:

    ./acp -l "data/bad_words" -s

After Step 3, you should see something like this on the console:

    Type,# of records,# of characters in each record,# of patterns,Runtime (in ms)
    SERIAL,100000,100,200,99.149376
    CUDA,100000,100,200,31.413729
    OpenACC,100000,100,200,54.380608
    SERIAL,100000,100,400,54.830593
    CUDA,100000,100,400,34.388512
    OpenACC,100000,100,400,54.935295
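
To compare the runs, the console output can be reduced to speedups relative to the SERIAL row with the same pattern count. The following is a small sketch under the assumption that the output keeps the five-column CSV layout shown above; `Row`, `parseRow`, and `speedupVsSerial` are hypothetical helper names, not part of the project.

```cpp
#include <exception>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper types and functions; not part of the project.
struct Row {
    std::string type;
    long records;
    long chars;
    long patterns;
    double ms;
};

// Parses one CSV line. Returns false for the header and for malformed lines.
bool parseRow(const std::string& line, Row& out) {
    std::stringstream ss(line);
    std::string field;
    std::vector<std::string> f;
    while (std::getline(ss, field, ',')) f.push_back(field);
    if (f.size() != 5) return false;
    try {
        out = Row{f[0], std::stol(f[1]), std::stol(f[2]), std::stol(f[3]), std::stod(f[4])};
    } catch (const std::exception&) {
        return false;  // non-numeric field, e.g. the header line
    }
    return true;
}

// Maps "<Type>/<pattern count>" to serial runtime divided by that run's runtime.
std::map<std::string, double> speedupVsSerial(const std::vector<std::string>& lines) {
    std::map<long, double> serialMs;
    std::vector<Row> rows;
    for (const std::string& line : lines) {
        Row r{};
        if (!parseRow(line, r)) continue;
        rows.push_back(r);
        if (r.type == "SERIAL") serialMs[r.patterns] = r.ms;
    }
    std::map<std::string, double> result;
    for (const Row& r : rows) {
        if (r.type == "SERIAL" || r.ms <= 0.0) continue;
        auto it = serialMs.find(r.patterns);
        if (it == serialMs.end()) continue;  // no serial baseline for this pattern count
        result[r.type + "/" + std::to_string(r.patterns)] = it->second / r.ms;
    }
    return result;
}
```

Fed the sample output above, the entry for "CUDA/200" comes out at about 3.16 (99.149376 / 31.413729), while "OpenACC/400" is slightly below 1.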