Project author: puzzlef

Project description:
Performance of vector element sum using float vs bfloat16 as the storage type.
Language: C++
Repository: git://github.com/puzzlef/sum-float-vs-bfloat16.git
Created: 2021-05-12T11:08:27Z
Project page: https://github.com/puzzlef/sum-float-vs-bfloat16

License: MIT License



Comparison of vector element sum using various data types.

For all experiments, each approach is attempted on a number of vector sizes,
running each approach 5 times per size to get a good time measure. The
experiments are done with guidance from Prof. Dip Sankar Banerjee and
Prof. Kishore Kothapalli.
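The repeated-run timing described above can be sketched roughly as follows. This is only an illustration, not the repository's actual harness: the name `measureDurationMs` and its parameters are hypothetical, and the choice of `std::chrono::steady_clock` is an assumption.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not from the repository): runs fn() `repeat` times
// and returns the average wall-clock time per run, in milliseconds.
template <class F>
double measureDurationMs(F fn, int repeat = 5) {
  assert(repeat > 0);
  using Clock = std::chrono::steady_clock;  // monotonic clock, suited to intervals
  auto start = Clock::now();
  for (int i = 0; i < repeat; ++i) fn();
  auto stop = Clock::now();
  std::chrono::duration<double, std::milli> total = stop - start;
  return total.count() / repeat;
}
```

Averaging over several runs smooths out one-off effects such as a cold cache on the first pass. A call might look like `measureDurationMs([&] { result = sum(x); }, 5)`, where `sum` and `x` stand in for the approach and the vector being measured.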


Comparison with Float and BFloat16 storage types

In this experiment (float-vs-bfloat16, main), we compare the performance of
finding the sum of numbers, with each number stored as either float or
bfloat16. While it seemed to me that the bfloat16 method would be a clear
winner because of its reduced memory bandwidth requirement, it is
only slightly faster. This is possibly because memory loads are
32-bit in either case. The only reason bfloat16 is slightly faster could
be that it allows data to be retained in cache for a longer period
of time (because of its smaller size). Note that neither approach makes use of
SIMD instructions, which are available on all modern hardware.
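The two storage schemes compared above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: all function names are hypothetical, bfloat16 values are held in plain `uint16_t`, and the float-to-bfloat16 conversion simply truncates the lower 16 bits (no rounding). It relies only on the fact that bfloat16 shares its sign and exponent layout with the upper half of a 32-bit IEEE 754 float. In both cases the accumulator is a float; only the storage type differs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(uint32_t), "expects 32-bit float");

// Hypothetical helper: keep the upper 16 bits of a float (truncation, an assumption).
inline uint16_t floatToBfloat16(float x) {
  uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  return static_cast<uint16_t>(bits >> 16);
}

// Hypothetical helper: widen bfloat16 back to float by zero-filling the low 16 bits.
inline float bfloat16ToFloat(uint16_t b) {
  uint32_t bits = static_cast<uint32_t>(b) << 16;
  float x;
  std::memcpy(&x, &bits, sizeof(x));
  return x;
}

// Convert a whole vector of floats to bfloat16 storage.
inline std::vector<uint16_t> toBfloat16(const std::vector<float>& x) {
  std::vector<uint16_t> a(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) a[i] = floatToBfloat16(x[i]);
  return a;
}

// Scalar (non-SIMD) sum over float storage: 4 bytes read per element.
inline float sumFloat(const float* x, std::size_t n) {
  assert(x != nullptr || n == 0);
  float a = 0;
  for (std::size_t i = 0; i < n; ++i) a += x[i];
  return a;
}

// Scalar (non-SIMD) sum over bfloat16 storage: 2 bytes read per element,
// each widened to float before being added.
inline float sumBfloat16(const uint16_t* x, std::size_t n) {
  assert(x != nullptr || n == 0);
  float a = 0;
  for (std::size_t i = 0; i < n; ++i) a += bfloat16ToFloat(x[i]);
  return a;
}
```

The bfloat16 loop reads half as many bytes per element but does extra work (a shift) to widen each value, which is one way the expected bandwidth advantage could be partly offset in a scalar loop.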








