Project author: icedac

Project description:
np_alloc - yet another lock-free per-thread memory pool
Language: C++
Repository: git://github.com/icedac/np_alloc.git
Created: 2018-07-20T09:15:42Z
Project page: https://github.com/icedac/np_alloc

License: Other


[WORK-IN-PROGRESS] np_alloc

  • yet another per-thread memory allocator/pool for c++
  • written from scratch, with some ideas from commercial code I wrote myself in 2009

toolset

  • c++14/gsl/x64
  • linux support is being considered, but the project does not build there yet

feature

  • lock-free, per-size, per-thread memory pool
  • wait-free most of the time, except when fetching from the global pool
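
To make the two features above concrete, here is a minimal sketch of a per-size, per-thread pool. It is not np_alloc's implementation: every name and constant in it (`pool_alloc`, `pool_free`, `kClassSizes`, the 4096-byte page, the 16-byte block header, the fixed 1 MiB arena) is an assumption chosen for illustration. The fast path touches only a `thread_local` free list, so it needs neither a lock nor an atomic; only a refill goes to the shared pool, and there a single atomic add claims a page.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// All names and constants below are hypothetical; none are taken from np_alloc.
constexpr std::size_t kArenaBytes  = 1u << 20;  // size of the shared global pool
constexpr std::size_t kPageBytes   = 4096;      // amount a thread fetches per refill
constexpr std::size_t kHeaderBytes = 16;        // per-block header, keeps payloads 16-byte aligned
constexpr std::size_t kClassSizes[] = {64, 256, 1024};
constexpr std::size_t kClassCount  = sizeof(kClassSizes) / sizeof(kClassSizes[0]);

alignas(64) static unsigned char g_arena[kArenaBytes];
static std::atomic<std::size_t> g_arena_used{0};

struct FreeNode { FreeNode* next; };

// One free list per size class, private to each thread.
static thread_local FreeNode* t_free[kClassCount] = {};

// Index of the smallest class that fits, or kClassCount if none does.
static std::size_t class_of(std::size_t bytes) {
    for (std::size_t c = 0; c < kClassCount; ++c)
        if (bytes <= kClassSizes[c]) return c;
    return kClassCount;
}

// Global-pool path: claim one page with a single atomic add instead of a mutex.
static unsigned char* fetch_page() {
    std::size_t offset = g_arena_used.fetch_add(kPageBytes, std::memory_order_relaxed);
    return (offset + kPageBytes <= kArenaBytes) ? g_arena + offset : nullptr;
}

void* pool_alloc(std::size_t bytes) {
    const std::size_t c = class_of(bytes);
    if (c == kClassCount) return nullptr;           // too large for this sketch
    if (t_free[c] == nullptr) {                     // local list empty: refill from the global pool
        unsigned char* page = fetch_page();
        if (page == nullptr) return nullptr;        // arena exhausted
        const std::size_t slot = kHeaderBytes + kClassSizes[c];
        for (std::size_t off = 0; off + slot <= kPageBytes; off += slot)
            t_free[c] = new (page + off) FreeNode{t_free[c]};
    }
    FreeNode* node = t_free[c];                     // fast path: pop from the thread-local list
    t_free[c] = node->next;
    unsigned char* raw = reinterpret_cast<unsigned char*>(node);
    std::memcpy(raw, &c, sizeof c);                 // remember the size class in the header
    return raw + kHeaderBytes;
}

void pool_free(void* ptr) {
    if (ptr == nullptr) return;
    unsigned char* raw = static_cast<unsigned char*>(ptr) - kHeaderBytes;
    std::size_t c = 0;
    std::memcpy(&c, raw, sizeof c);                 // read the size class back
    assert(c < kClassCount);
    t_free[c] = new (raw) FreeNode{t_free[c]};      // push onto the freeing thread's list
}
```

In this sketch a block freed on another thread simply joins that thread's list, and pages are never handed back to the arena, which is one way such a design ends up trading memory for speed.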

api

    void* np_alloc(size_t bytes);
    void* np_alloc(size_t bytes, const char file[], int line);
    void np_free(void* ptr);

usage & test code
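
A minimal usage sketch of the three api functions follows. So that it compiles on its own, it defines stand-in versions of `np_alloc` and `np_free` that simply forward to `malloc`/`free`; these stand-ins are not the library's code. The macro `NP_ALLOC_TRACKED` and the helper `duplicate_string` are hypothetical, and the idea that the `file`/`line` overload is meant for recording the call site is an assumption based only on its signature.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Stand-ins with the same signatures as the api above. They forward to
// malloc/free and are NOT np_alloc's implementation.
void* np_alloc(std::size_t bytes) { return std::malloc(bytes); }
void* np_alloc(std::size_t bytes, const char file[], int line) {
    (void)file;  // assumption: the real library would record the call site
    (void)line;
    return std::malloc(bytes);
}
void np_free(void* ptr) { std::free(ptr); }

// Hypothetical convenience macro that fills in the call site automatically.
#define NP_ALLOC_TRACKED(bytes) np_alloc((bytes), __FILE__, __LINE__)

// Hypothetical helper: copies a C string into memory obtained from np_alloc.
// The caller releases the result with np_free.
char* duplicate_string(const char* text) {
    assert(text != nullptr);
    const std::size_t length = std::strlen(text) + 1;  // include the terminator
    char* copy = static_cast<char*>(NP_ALLOC_TRACKED(length));
    if (copy != nullptr) std::memcpy(copy, text, length);
    return copy;
}
```

Every pointer returned by either `np_alloc` overload is paired with exactly one `np_free` call, in the same way as `malloc` and `free`.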

pros / cons

  • (+) fast, because getting new memory takes no lock
  • (+) better cache locality, because memory is pre-allocated in pages for the local thread
  • (-) uses a lot of memory, although this can be tuned per project

todo

garbage collection from the local thread pool back to the global pool

  • it will save memory, but cache locality will get worse because memory fragments spread out across threads

benchmark

  • these are only rough initial benchmarks; the benchmark code is in the repository
  • method: allocate until the maximum allocation count is reached, then deallocate until everything is freed, and repeat
    thread [28b0]: global pool created. pool=[2042de1a4b0]
    malloc {
        random_multithread (500000) - 5.11385s
        random_singlethread (5000000) - 1.43155s
        small_multithread (500000) - 0.733916s
        small_singlethread (5000000) - 0.281047s
        big_multithread (500000) - 3.08501s
        big_singlethread (5000000) - 1.76247s
    }
    np_alloc {
        random_multithread (500000) - 1.2951s
        random_singlethread (5000000) - 0.229591s
        small_multithread (500000) - 0.681932s
        small_singlethread (5000000) - 0.0977364s
        big_multithread (500000) - 0.826968s
        big_singlethread (5000000) - 0.232066s
    }
    thread [3a84]: global pool destroying. pool=[2042de1a4b0]
    [3a84] ~global_pool() this=[2042de1a4b0]
    Press any key to continue . . .
  • windows 10 x64 / 2 x xeon x5570 / 8 cores, 16 threads
  • env: 50 threads, 500k iterations per thread, max 10000 allocations per thread

| | random alloc(50-7100) | small(50-300) | big(5000-7500) |
|---|---|---|---|
| malloc | 5.1139 s | 0.7339 s | 3.0850 s |
| np_alloc | 1.2951 s | 0.6819 s | 0.8270 s |
| faster | x 3.95 | x 1.08 | x 3.73 |

  • env: 1 thread, 5000k iterations per thread, max 10000 allocations per thread

| | random alloc(50-7100) | small(50-300) | big(5000-7500) |
|---|---|---|---|
| malloc | 1.4316 s | 0.2810 s | 1.7625 s |
| np_alloc | 0.2296 s | 0.0977 s | 0.2321 s |
| faster | x 6.24 | x 2.88 | x 7.59 |
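
The fill-and-drain method described above can be sketched roughly as follows. This is not the project's benchmark code: `run_fill_drain` and its parameters are hypothetical, one "iteration" is assumed to mean one allocation, blocks are freed in reverse order of allocation, and the fixed seed is there only to make runs repeatable. The allocator is passed in as a pair of function pointers so that the same loop can time `malloc`/`free` and any replacement.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

using AllocFn = void* (*)(std::size_t);
using FreeFn  = void (*)(void*);

// Performs `iterations` allocations in fill/drain rounds and returns the
// elapsed wall-clock time in seconds.
double run_fill_drain(AllocFn alloc_fn, FreeFn free_fn, std::size_t iterations,
                      std::size_t max_live, std::size_t min_bytes, std::size_t max_bytes) {
    assert(max_live > 0 && min_bytes <= max_bytes);
    std::mt19937 rng(12345);  // fixed seed (assumption) for repeatable sizes
    std::uniform_int_distribution<std::size_t> size_dist(min_bytes, max_bytes);
    std::vector<void*> live;
    live.reserve(max_live);

    const auto start = std::chrono::steady_clock::now();
    std::size_t done = 0;
    while (done < iterations) {
        // Fill: allocate until max_live blocks are outstanding.
        while (live.size() < max_live && done < iterations) {
            live.push_back(alloc_fn(size_dist(rng)));
            ++done;
        }
        // Drain: free everything before the next round.
        while (!live.empty()) {
            free_fn(live.back());
            live.pop_back();
        }
    }
    const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Under these assumptions, the single-thread "small" case would correspond to a call such as `run_fill_drain(alloc_fn, free_fn, 5000000, 10000, 50, 300)`, and the multithreaded cases to running the same function on 50 threads at once.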