Project author: szcompressor

Project description:
Portable Accelerator Implementation of SZ Lossy Compressor for Scientific Data Using Kokkos
Language: C++
Repository: git://github.com/szcompressor/kokkosSZ.git
Created: 2020-09-30T18:03:22Z
Project community: https://github.com/szcompressor/kokkosSZ

License: Other

kokkosSZ/kSZ: A Portable Accelerator Implementation of SZ Using Kokkos Programming Model

introduction

kSZ is a Kokkos-based implementation of the widely used SZ lossy compressor. We use Kokkos because it provides abstractions for both parallel code execution and data management, which make it possible to write a single implementation that is portable across different accelerator technologies. Kokkos supports OpenMP/OpenMPTarget, oneAPI, Pthreads, and CUDA as backend programming models.

(C) 2020 by Washington State University and Argonne National Laboratory. See COPYRIGHT in top-level directory.

Developers: Jiannan Tian, Dingwen Tao, Sheng Di, Franck Cappello

compile

The default toolchain on the login node is sufficient:

  git clone git@github.com:jtian0/kSZ-jtian.git ksz-omptarget
  cd ksz-omptarget
  make -j8

run on target testbed

  qsub -I -t 60 -n 1 -q <testbed name>

You may want to tune the OpenMP settings following the hints that Kokkos prints:

  # In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
  # For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
  # For unit testing set OMP_PROC_BIND=false
  export OMP_PROC_BIND=spread
  export OMP_PLACES=threads

Then run:

  ./ksz -f32 -m r2r -e 1e-4 -i ~/280953867/xx.f32 -1 280953867 -z
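
The command passes `-m r2r -e 1e-4`, and the sample output in this README reports `eb --> 0.0001 x 64 = 0.0064`, which suggests that in r2r (relative-to-value-range) mode the user-set bound is scaled by the data's value range to obtain the absolute bound. The following is a minimal sketch of that arithmetic only; `r2r_to_abs_eb` is a hypothetical helper name and is not taken from the kSZ sources.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper (not part of kSZ): converts a relative-to-value-range
// error bound into an absolute one, eb_abs = eb_rel * (max - min).
double r2r_to_abs_eb(const std::vector<float>& data, double eb_rel) {
    assert(!data.empty());
    // A single pass finds both the smallest and the largest element.
    const auto mm = std::minmax_element(data.begin(), data.end());
    const double value_range =
        static_cast<double>(*mm.second) - static_cast<double>(*mm.first);
    return eb_rel * value_range;
}
```

For data spanning roughly 0 to 64, as in the sample run, a user-set bound of 1e-4 yields an absolute bound of about 0.0064.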

The ./ksz command above prints output like the following:

  [info] bin.cap: 1024
  [info] user-set eb: 1 x 10^(-4) = 0.0001
  [info] change to r2r mode (relative-to-value-range)
  eb --> 0.0001 x 64 = 0.0064
  Kokkos::OpenMP::initialize WARNING: OMP_PROC_BIND environment variable not set
  In general, for best performance with OpenMP 4.0 or better set OMP_PROC_BIND=spread and OMP_PLACES=threads
  For best performance with OpenMP 3.1 set OMP_PROC_BIND=true
  For unit testing set OMP_PROC_BIND=false
  !! using team_size (blockDim) = 32
  throughput: 2.95641 GB/s
  [info] verification start ---------------------
  | min.val 0
  | max.val 63.999996185302734375
  | val.rng 63.999996185302734375
  | max.err.abs.val 0.00640106201171875
  | max.err.abs.idx 145043941
  | max.err.vs.rng 0.00010001659989455937414
  | max.pw.rel.err 1
  | PSNR 84.771440846981420236
  | NRMSE 5.7733509432378176415E-05
  | correl.coeff 0.99999997929479633729
  [info] verification end -----------------------
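
To help read the verification block, here is a small sketch of how several of these quantities are commonly defined: value range, maximum absolute error and its index, NRMSE (RMSE divided by the value range), and PSNR (20 log10 of the value range over the RMSE). The sample values are consistent with these definitions, since -20 log10(5.7733509E-05) is about 84.77, but whether kSZ computes them exactly this way is an assumption. The `Metrics` struct and `verify` function are hypothetical names, not kSZ code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical result type; fields loosely mirror the labels in the output.
struct Metrics {
    double val_rng;              // max - min of the original data
    double max_abs_err;          // largest |orig[i] - recon[i]|
    std::size_t max_abs_err_idx; // index where that largest error occurs
    double nrmse;                // RMSE / val_rng
    double psnr;                 // 20 * log10(val_rng / RMSE), in dB
};

// Compares original and reconstructed data using the common definitions.
// Note: a lossless reconstruction (RMSE == 0) gives an infinite PSNR, and
// constant input (val_rng == 0) makes NRMSE and PSNR undefined.
Metrics verify(const std::vector<float>& orig, const std::vector<float>& recon) {
    assert(!orig.empty() && orig.size() == recon.size());
    double lo = orig[0], hi = orig[0];
    double sum_sq = 0.0, max_err = 0.0;
    std::size_t max_idx = 0;
    for (std::size_t i = 0; i < orig.size(); ++i) {
        const double o = orig[i];
        const double err = std::fabs(o - static_cast<double>(recon[i]));
        lo = std::min(lo, o);
        hi = std::max(hi, o);
        sum_sq += err * err;
        if (err > max_err) {
            max_err = err;
            max_idx = i;
        }
    }
    Metrics m;
    m.val_rng = hi - lo;
    m.max_abs_err = max_err;
    m.max_abs_err_idx = max_idx;
    const double rmse = std::sqrt(sum_sq / static_cast<double>(orig.size()));
    m.nrmse = rmse / m.val_rng;
    m.psnr = 20.0 * std::log10(m.val_rng / rmse);
    return m;
}
```

Dividing `max_abs_err` by `val_rng` gives the counterpart of `max.err.vs.rng`, which in the sample run is about 1.0002e-4, close to the requested relative bound of 1e-4.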