Project author: bondhugula

Project description:
Pluto: An automatic polyhedral parallelizer and locality optimizer
Language: C
Repository: git://github.com/bondhugula/pluto.git
Created: 2016-12-09T09:35:40Z
Project page: https://github.com/bondhugula/pluto

License: MIT License

Pluto

Overview


PLUTO is an automatic parallelization tool based on the polyhedral model.
The polyhedral model for compiler optimization
provides an abstraction to perform high-level transformations such as loop-nest
optimization and parallelization on affine loop nests. Pluto transforms C
programs from source to source for coarse-grained parallelism and data locality
simultaneously. The core transformation framework mainly works by finding affine
transformations for efficient tiling. The scheduling algorithm used by Pluto has
been published in [1]. OpenMP parallel code for multicores can be automatically
generated from sequential C program sections. Outer (communication-free), inner,
or pipelined parallelization is achieved purely with OpenMP parallel for
pragmas; the code is also optimized for locality and made amenable to
auto-vectorization. An experimental evaluation and comparison with previous
techniques can be found in [2]. Though the tool is fully automatic (C to OpenMP
C), a number of options are provided (both command-line and through meta files)
to tune aspects like tile sizes, unroll factors, and outer loop fusion
structure. CLooG is used for code generation.
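
To make the idea of tiling an affine loop nest concrete, here is a minimal hand-written sketch. It is not Pluto's output: the matrix-multiply kernel, the sizes N and T, and all function names are assumptions chosen for illustration. It shows an original loop nest, a manually tiled variant whose outermost tile loop carries no dependence (so an OpenMP parallel for is legal there), and a check that both compute the same result.

```c
#include <assert.h>
#include <string.h>

#define N 64 /* problem size (illustrative assumption) */
#define T 16 /* tile size (illustrative; Pluto chooses its own defaults) */

static double A[N][N], B[N][N], C_ref[N][N], C_tiled[N][N];

/* Fill inputs with small integers so that all sums are exact in double. */
static void init(void) {
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++) {
      A[i][j] = (i + j) % 7;
      B[i][j] = (i * j) % 5;
    }
  memset(C_ref, 0, sizeof C_ref);
  memset(C_tiled, 0, sizeof C_tiled);
}

/* Original affine loop nest: bounds and subscripts are affine in i, j, k. */
static void matmul_orig(void) {
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      for (int k = 0; k < N; k++)
        C_ref[i][j] += A[i][k] * B[k][j];
}

/* Hand-tiled variant: the three outer loops enumerate T x T x T tiles.
 * Different values of `it` touch disjoint rows of C_tiled, so the outer
 * loop is communication-free parallel. Without -fopenmp the pragma is
 * simply ignored and the code runs sequentially. */
static void matmul_tiled(void) {
  assert(N % T == 0); /* this sketch skips partial tiles */
#pragma omp parallel for
  for (int it = 0; it < N; it += T)
    for (int jt = 0; jt < N; jt += T)
      for (int kt = 0; kt < N; kt += T)
        for (int i = it; i < it + T; i++)
          for (int j = jt; j < jt + T; j++)
            for (int k = kt; k < kt + T; k++)
              C_tiled[i][j] += A[i][k] * B[k][j];
}

/* Returns 1 if the tiled and original versions agree exactly, else 0. */
int tiling_preserves_result(void) {
  init();
  matmul_orig();
  matmul_tiled();
  return memcmp(C_ref, C_tiled, sizeof C_ref) == 0;
}
```

Calling tiling_preserves_result() from a small main is enough to try it. Pluto derives such transformations (and less obvious ones, such as skewed tiles) automatically; the sketch only conveys the shape of the resulting code.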

  1. Automatic Transformations for Communication-Minimized Parallelization and
    Locality Optimization in the Polyhedral Model
    Uday Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan.
    International Conference on Compiler Construction (ETAPS CC), Apr 2008,
    Budapest, Hungary.

  2. A Practical Automatic Polyhedral Parallelizer and Locality Optimizer
    Uday Bondhugula, A. Hartono, J. Ramanujam, P. Sadayappan. ACM SIGPLAN
    Programming Languages Design and Implementation (PLDI), Jun 2008, Tucson,
    Arizona.

This package includes both the tool pluto and libpluto. The pluto tool is a
source-to-source transformer meant to be run via the polycc script; libpluto
provides a thread-safe library interface.

License

Pluto and libpluto are available under the MIT License. Please see the file
LICENSE in the top-level directory for more details.

Installing Pluto

Prerequisites

A Linux distribution. Pluto has been tested on x86 and x86-64 machines running
Fedora, Ubuntu, and CentOS.

  • In order to use the development version from Pluto’s git repository, automatic
    build system tools, including autoconf, automake, and libtool are needed.

  • LLVM/Clang 14.x (14.x recommended; 11.x and 12.x are tested to work as well),
    along with its development/header files, is needed for the pet submodule.
    On most modern distributions, these packages can be installed from the
    standard repositories; alternatively, build LLVM and Clang from sources.
    See pet/README for additional detail.

    Example:

    # On Ubuntu.
    sudo apt install -y llvm-14-dev libclang-14-dev
    # On Fedora.
    sudo dnf -y install llvm14-devel clang14-devel

  • LLVM FileCheck is used for Pluto’s test suite. (On a Fedora, this is part of
    the ‘llvm’ package.)

  • GMP (GNU multi-precision arithmetic library) is needed by ISL (one of the
    included libraries). If it’s not already on your system, it can be installed
    with, for example, sudo yum -y install gmp gmp-devel on Fedora (sudo apt-get
    install libgmp3-dev or something similar on Ubuntu).

Pluto includes all polyhedral libraries on which it depends. See pet/README for
pet’s pre-requisites.

Building Pluto

Stable release:

Download the latest stable release from GitHub releases.

    $ tar zxvf pluto-<version>.tar.gz
    $ cd pluto-<version>/
    $ ./configure [--with-clang-prefix=<clang install location>]
    $ make
    $ make test

configure can be given --with-isl-prefix=<isl install location> to build
with another isl; otherwise, the bundled isl is used.

Development version from Git:

    git clone git@github.com:bondhugula/pluto.git
    cd pluto/
    git submodule init
    git submodule update
    ./autogen.sh
    ./configure [--enable-debug] [--with-clang-prefix=<clang headers/libs location>]
    # Example: on Ubuntu: --with-clang-prefix=/usr/lib/llvm-14; on Fedora,
    # typically, it's /usr/lib64/llvm14.
    make
    make check-pluto

  • Use --with-clang-prefix=<location> to point to the specific clang to
    build with.

  • Use --with-isl-prefix=<isl install location> to compile and link with an
    already installed isl. By default, the version of isl bundled with Pluto will be
    used.

polycc is the wrapper script around src/pluto (core transformer) and all other
components. polycc runs all of these in sequence on an input C program (with
the section to parallelize/optimize marked) and is what a user should use on
input. Output generated is OpenMP parallel C code that can be readily compiled
and run on shared-memory parallel machines like general-purpose multicores.
libpluto.{so,a} is also built and can be found in src/.libs/. make install
will install it.

Trying a new example

  • Use #pragma scop and #pragma endscop around the section of code
    you want to parallelize/optimize.

  • Then, just run ./polycc <C source file>.

    The transformation is also printed out, and test.par.c will have the
    parallelized code. If you want to see intermediate files, like the
    .cloog file generated (.opt.cloog, .tiled.cloog, or .par.cloog
    depending on command-line options provided), use --debug on the command
    line.

  • Tile sizes can be specified in a file tile.sizes; otherwise, default
    sizes will be set. See doc/DOC.txt for instructions on how to specify the sizes.
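
As an illustration of the first step above, the following sketch marks a loop nest with #pragma scop and #pragma endscop. The matrix-vector kernel, the size N, and the function names are hypothetical and not taken from the Pluto sources; only the two pragmas come from this README. An ordinary C compiler ignores the unknown pragmas, so the file also builds and runs unchanged without Pluto.

```c
#include <assert.h>

#define N 8 /* illustrative size (assumption) */

/* Hypothetical kernel: y = M * x. The region between the pragmas is the
 * part one would ask polycc to parallelize/optimize. */
void matvec(double M[N][N], double x[N], double y[N]) {
  int i, j;
#pragma scop
  for (i = 0; i < N; i++)
    y[i] = 0.0;
  for (i = 0; i < N; i++)
    for (j = 0; j < N; j++)
      y[i] = y[i] + M[i][j] * x[j];
#pragma endscop
}

/* Small self-check: with M[i][j] = i + j and x[j] = 1, each y[i] equals
 * N*i + N*(N-1)/2. Returns the sum of all y[i]. */
double matvec_checksum(void) {
  double M[N][N], x[N], y[N], sum = 0.0;
  for (int i = 0; i < N; i++) {
    x[i] = 1.0;
    for (int j = 0; j < N; j++)
      M[i][j] = i + j;
  }
  matvec(M, x, y);
  for (int i = 0; i < N; i++) {
    assert(y[i] == N * i + N * (N - 1) / 2);
    sum += y[i];
  }
  return sum;
}
```

Saved under a name of your choice, such a file would be passed to ./polycc as described above. Keeping loop bounds and array subscripts affine in the loop iterators and parameters is what makes the marked region suitable for the polyhedral model.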

To run a good number of experiments on a code, it is best to use the setup
created for example codes in the examples/ directory. If you do not have
ICC (Intel C compiler), uncomment line 9 and comment line
8 of examples/common.mk to use GCC.

  • Copy one of the sample directories in examples/ and edit its Makefile (SRC =).

  • Run make. This builds all executables: orig is the original code
    compiled with the native compiler, tiled is the tiled code, and par is the
    OpenMP parallelized + locality-optimized code. One could also do make <target>,
    where target can be orig, orig_par, opt, tiled, par, pipepar, etc. (see
    examples/common.mk for the complete list).

  • Run make check-pluto to test for correctness and make perf to compare
    performance.

Command-line options

Run

    ./polycc -h

Or see documentation (doc/DOC.txt) for details.

Trying any included example code

Let’s say we are trying the 2-d Gauss-Seidel kernel. In examples/seidel, do
make par; this will generate seidel.par.c from seidel.c and also compile
it to generate par. Likewise, make tiled for tiled and make orig for
orig.

    cd examples/seidel

seidel.c: This is the original code (the kernel in this code is extracted).
orig is the corresponding executable when compiled with the native compiler
(gcc or icc, for example) with optimization flags, orig_par with the native
compiler’s auto-parallelization enabled.
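
For readers unfamiliar with the kernel, here is a minimal sketch of what a 2-d Gauss-Seidel style loop nest can look like. It is an assumption made for illustration and may differ from the actual examples/seidel/seidel.c: the array name, the sizes N and TSTEPS, and the nine-point averaging stencil are all chosen here. Each update reads neighbours already updated in the same sweep, which is why such kernels typically need pipelined rather than plain outer parallelism.

```c
#include <assert.h>

#define N 16     /* grid size (illustrative assumption) */
#define TSTEPS 4 /* number of sweeps (illustrative assumption) */

static double a[N][N];

/* In-place nine-point averaging sweep, repeated TSTEPS times. */
void seidel(void) {
  int t, i, j;
#pragma scop
  for (t = 0; t <= TSTEPS - 1; t++)
    for (i = 1; i <= N - 2; i++)
      for (j = 1; j <= N - 2; j++)
        a[i][j] = (a[i - 1][j - 1] + a[i - 1][j] + a[i - 1][j + 1] +
                   a[i][j - 1] + a[i][j] + a[i][j + 1] +
                   a[i + 1][j - 1] + a[i + 1][j] + a[i + 1][j + 1]) / 9.0;
#pragma endscop
}

/* Self-check: averaging a grid that is 1.0 everywhere must leave it at
 * 1.0 everywhere. Returns 1 on success (asserts otherwise). */
int seidel_selfcheck(void) {
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      a[i][j] = 1.0;
  seidel();
  for (int i = 0; i < N; i++)
    for (int j = 0; j < N; j++)
      assert(a[i][j] == 1.0);
  return 1;
}
```

The dependences along t, i, and j are what make this kernel a common tiling test case: rectangular tiles of the original loops are not legal here, so a transformation such as skewing is needed first.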

seidel.opt.c: This is the transformed code without tiling (this is not of much
use, except for seeing the benefits of fusion in some cases). opt is the
corresponding executable.

seidel.tiled.c: This is Pluto-generated code optimized for locality with
tiling and other transformations, but not parallelized - this should be used
for sequential execution. tiled is the corresponding executable.

seidel.par.c: This is Pluto parallelized code optimized for locality and
parallelism with tiling and other transformations. This code has OpenMP
pragmas. par is the corresponding executable.

  • To change any of the flags used for an example, edit the top section of
    examples/common.mk or the Makefile in the example directory

  • To manually specify tile sizes, create tile.sizes; see examples/matmul/
    for example or doc/DOC.txt for more information on setting tile sizes.

The executables already have timers; you just have to run them, and that will
print execution time for the core part of the computation as well.
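
If you add your own example and want similar timing output, one possible way to time the core computation is sketched below. This is an assumption about how such a timer could be written, not a copy of the code in examples/: the helper names rtclock, time_kernel, and sample_kernel are chosen here, and gettimeofday is used as a simple POSIX wall-clock source.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/time.h>

/* Wall-clock time in seconds, based on POSIX gettimeofday. */
static double rtclock(void) {
  struct timeval tp;
  gettimeofday(&tp, NULL);
  return tp.tv_sec + tp.tv_usec * 1.0e-6;
}

static double checksum; /* keeps the sample work observable */

/* Placeholder for the core computation (hypothetical workload). */
static void sample_kernel(void) {
  double s = 0.0;
  for (int i = 0; i < 1000000; i++)
    s += i * 0.5;
  checksum = s;
}

/* Runs `kernel` once and returns the elapsed wall-clock time in seconds. */
double time_kernel(void (*kernel)(void)) {
  assert(kernel != NULL);
  double t0 = rtclock();
  kernel();
  double t1 = rtclock();
  return t1 - t0;
}

/* Prints the elapsed time, similar in spirit to what the examples report. */
void run_and_report(void) {
  printf("%0.6lfs\n", time_kernel(sample_kernel));
}
```

Wall-clock time is used rather than CPU time because CPU time adds up across OpenMP threads and would hide any parallel speedup.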

To run the Pluto parallelized version:

    OMP_NUM_THREADS=4 ./par

To run native compiler optimized/auto-parallelized version:

    OMP_NUM_THREADS=4 ./orig_par

To run the original sequential code:

    ./orig

To run the locality-optimized version generated by Pluto:

    ./tiled

make clean in the particular example’s directory removes all executables as
well as generated codes.

To launch a complete verification that compares the output of tiled and par
with orig for all examples, run make check-pluto in examples/.

    [examples/ ]$ make check-pluto

More information

  • See doc/DOC.txt for an overview of the system and details on all
    command-line options.

Bugs and issues

Please report bugs and issues at https://github.com/bondhugula/pluto/issues.

For questions and general discussion, please email
pluto-development@googlegroups.com after joining the group:
https://groups.google.com/g/pluto-development.