Project author: ultravideo

Project description:
An open-source HEVC encoder
Language: C
Repository: git://github.com/ultravideo/kvazaar.git
Created: 2014-01-29T08:35:53Z
Project community: https://github.com/ultravideo/kvazaar

License: Other


Kvazaar

An open-source HEVC encoder licensed under 3-clause BSD

Join channel #ultravideo on the Libera.Chat IRC network to contact us, or come to our Discord.

Kvazaar is still under development. Speed and RD-quality will continue to improve.

See http://ultravideo.fi/#encoder for more information.

Using Kvazaar

Example:

    kvazaar --input BQMall_832x480_60.yuv --output out.hevc

The mandatory parameters are input and output. If the resolution of the input file is not in the filename, or when a pipe is used, the input resolution must also be given: --input-res=1920x1080.
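
To illustrate the filename convention, here is a minimal C sketch that scans a filename for the first <width>x<height> pattern. It is not Kvazaar's own detection code, whose exact rules may differ; the helper name find_resolution is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>

// Hypothetical helper, not part of Kvazaar: find the first
// "<digits>x<digits>" pattern in a filename.
// Returns 1 and fills width/height on success, 0 otherwise.
int find_resolution(const char *name, int *width, int *height)
{
  assert(name != NULL && width != NULL && height != NULL);
  const char *p = name;
  while (*p != '\0') {
    if (!isdigit((unsigned char)*p)) {
      ++p;
      continue;
    }
    char *end = NULL;
    long w = strtol(p, &end, 10);
    if (*end == 'x' && isdigit((unsigned char)end[1])) {
      long h = strtol(end + 1, NULL, 10);
      if (w > 0 && h > 0 && w <= INT_MAX && h <= INT_MAX) {
        *width = (int)w;
        *height = (int)h;
        return 1;
      }
    }
    p = end;  // resume after this run of digits
  }
  return 0;
}
```

With the example file name BQMall_832x480_60.yuv, the sketch reports 832 by 480. A name such as clip_60.yuv has no match, which corresponds to the case where --input-res must be given explicitly.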

The default input format is 8-bit yuv420p for 8-bit and yuv420p10le for 10-bit. Input format and bitdepth can be selected with --input-format and --input-bitdepth.
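
Because raw YUV input has no headers, the size of one frame follows directly from the resolution, chroma format and bit depth. The C sketch below computes it under two assumptions that are not taken from Kvazaar's source: samples deeper than 8 bits occupy two bytes each (as in yuv420p10le), and chroma planes round up for odd dimensions. The function name raw_frame_bytes is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Bytes in one raw planar frame.
// chroma_420 = 1 for P420, 0 for P400 (luma only).
size_t raw_frame_bytes(size_t width, size_t height, int bitdepth,
                       int chroma_420)
{
  assert(bitdepth >= 8 && bitdepth <= 16);
  // Assumption: more than 8 bits per sample means 2 bytes per sample.
  size_t bytes_per_sample = (bitdepth > 8) ? 2 : 1;
  size_t luma = width * height;
  size_t chroma = 0;
  if (chroma_420) {
    // Two chroma planes, each subsampled by 2 in both directions.
    chroma = 2 * ((width + 1) / 2) * ((height + 1) / 2);
  }
  return (luma + chroma) * bytes_per_sample;
}
```

Dividing the size of a raw file by this value gives the number of frames it contains, which can be a useful sanity check on --input-res and --input-bitdepth.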

Speed and compression quality can be selected with --preset, or by setting the options manually.

Parameters

    Kvazaar v2.3.1 2024-04-10
    Kvazaar license: 3-clause BSD
    Usage:
    kvazaar -i <input> --input-res <width>x<height> -o <output>
    Required:
      -i, --input <filename> : Input file
      --input-res <res> : Input resolution [auto]
          - auto: Detect from file name.
          - <int>x<int>: width times height
      -o, --output <filename> : Output file
    Presets:
      --preset <preset> : Set options to a preset [medium]
          - ultrafast, superfast, veryfast, faster,
            fast, medium, slow, slower, veryslow,
            placebo
    Input:
      -n, --frames <integer> : Number of frames to code [all]
      --seek <integer> : First frame to code [0]
      --input-fps <num>[/<denom>] : Frame rate of the input video [25]
      --source-scan-type <string> : Source scan type [progressive]
          - progressive: Progressive scan
          - tff: Top field first
          - bff: Bottom field first
      --input-format <string> : P420 or P400 [P420]
      --input-bitdepth <int> : 8-16 [8]
      --loop-input : Re-read input file forever.
      --input-file-format <string> : Input file format [auto]
          - auto: Check the file ending for format
          - y4m (skips frame headers)
          - yuv
    Options:
      --help : Print this help message and exit.
      --version : Print version information and exit.
      --(no-)aud : Use access unit delimiters. [disabled]
      --debug <filename> : Output internal reconstruction.
      --(no-)cpuid : Enable runtime CPU optimizations. [enabled]
      --hash <string> : Decoded picture hash [checksum]
          - none: 0 bytes
          - checksum: 18 bytes
          - md5: 56 bytes
      --(no-)psnr : Calculate PSNR for frames. [enabled]
      --(no-)info : Add encoder info SEI. [enabled]
      --(no-)enable-logging : Enable logging for regular encoder performance;
          error messages are always displayed. [enabled]
      --crypto <string> : Selective encryption. Crypto support must be
          enabled at compile-time. Can be 'on' or 'off' or
          a list of features separated with a '+'. [off]
          - on: Enable all encryption features.
          - off: Disable selective encryption.
          - mvs: Motion vector magnitudes.
          - mv_signs: Motion vector signs.
          - trans_coeffs: Coefficient magnitudes.
          - trans_coeff_signs: Coefficient signs.
          - intra_pred_modes: Intra prediction modes.
      --key <string> : Encryption key [16,213,27,56,255,127,242,112,
          97,126,197,204,25,59,38,30]
      --stats-file-prefix : A prefix used for stats files that include
          bits, lambda, distortion, and qp for each ctu.
          These are meant for debugging and are not
          written unless the prefix is defined.
    Video structure:
      -q, --qp <integer> : Quantization parameter [22]
      -p, --period <integer> : Period of intra pictures [64]
          - 0: Only first picture is intra.
          - 1: All pictures are intra.
          - N: Every Nth picture is intra.
      --vps-period <integer> : How often the video parameter set is re-sent [0]
          - 0: Only send VPS with the first frame.
          - N: Send VPS with every Nth intra frame.
      -r, --ref <integer> : Number of reference frames, in range 1..15 [4]
      --gop <string> : GOP structure [lp-g4d3t1]
          - 0: Disabled
          - 8: B-frame pyramid of length 8
          - 16: B-frame pyramid of length 16
          - lp-<string>: Low-delay P/B-frame GOP
            (e.g. lp-g8d4t2, see README)
      --intra-qp-offset <int> : QP offset for intra frames [-51..51] [auto]
          - N: Set QP offset to N.
          - auto: Select offset automatically based
            on GOP length.
      --(no-)open-gop : Use open GOP configuration. [enabled]
      --cqmfile <filename> : Read custom quantization matrices from a file.
      --scaling-list <string> : Set scaling list mode. [off]
          - off: Disable scaling lists.
          - custom: Use custom list (with --cqmfile).
          - default: Use default lists.
      --bitrate <integer> : Target bitrate [0]
          - 0: Disable rate control.
          - N: Target N bits per second.
      --rc-algorithm <string> : Select used rc-algorithm. [lambda]
          - lambda: rate control from:
            DOI: 10.1109/TIP.2014.2336550
          - oba: DOI: 10.1109/TCSVT.2016.2589878
      --(no-)intra-bits : Use Hadamard cost based allocation for intra
          frames. Default on for gop 8 and off for lp-gop
      --(no-)clip-neighbour : On oba based rate control whether to clip
          lambda values to same frame's ctus or previous'.
          Default on for RA GOPS and disabled for LP.
      --(no-)lossless : Use lossless coding. [disabled]
      --mv-constraint <string> : Constrain motion vectors. [none]
          - none: No constraint
          - frametile: Constrain within the tile.
          - frametilemargin: Constrain even more.
      --roi <filename> : Use a delta QP map for region of interest.
          Reads an array of delta QP values from a file.
          Text and binary files are supported and detected
          from the file extension (.txt/.bin). If a known
          extension is not found, the file is treated as
          a text file. The file can include one or many
          ROI frames each in the following format:
          width and height of the QP delta map followed
          by width * height delta QP values in raster
          order. In binary format, width and height are
          32-bit integers whereas the delta QP values are
          signed 8-bit values. The map can be of any size
          and will be scaled to the video size. The file
          reading will loop if end of the file is reached.
          See roi.txt in the examples folder.
      --set-qp-in-cu : Set QP at CU level keeping pic_init_qp_minus26
          in PPS and slice_qp_delta in slice header zero.
      --(no-)erp-aqp : Use adaptive QP for 360 degree video with
          equirectangular projection. [disabled]
      --level <number> : Use the given HEVC level in the output and give
          an error if level limits are exceeded. [6.2]
          - 1, 2, 2.1, 3, 3.1, 4, 4.1, 5, 5.1, 5.2, 6,
            6.1, 6.2
      --force-level <number> : Same as --level but warnings instead of errors.
      --high-tier : Used with --level. Use high tier bitrate limits
          instead of the main tier limits during encoding.
          High tier requires level 4 or higher.
      --(no-)vaq <integer> : Enable variance adaptive quantization with given
          strength, in range 1..20. Recommended: 5.
          [disabled]
    Compression tools:
      --(no-)deblock <beta:tc> : Deblocking filter. [0:0]
          - beta: Between -6 and 6
          - tc: Between -6 and 6
      --sao <string> : Sample Adaptive Offset [full]
          - off: SAO disabled
          - band: Band offset only
          - edge: Edge offset only
          - full: Full SAO
      --(no-)rdoq : Rate-distortion optimized quantization [enabled]
      --(no-)rdoq-skip : Skip RDOQ for 4x4 blocks. [disabled]
      --(no-)signhide : Sign hiding [disabled]
      --(no-)smp : Symmetric motion partition [disabled]
      --(no-)amp : Asymmetric motion partition [disabled]
      --rd <integer> : Mode search complexity [0]
          - 0: Skip intra if inter is good enough.
          - 1: Rough intra mode search with SATD.
          - 2: Refine mode search with SSE.
          - 3: More SSE candidates for inter and
            chroma mode search for 4x4 intra.
          - 4: Even more SSE candidates for both.
          - 5: Try all intra modes.
      --(no-)mv-rdo : Rate-distortion optimized motion vector costs
          [disabled]
      --(no-)zero-coeff-rdo : If a CU is set inter, check if forcing zero
          residual improves the RD cost. [enabled]
      --(no-)full-intra-search : Try all intra modes during rough search.
          [disabled]
      --(no-)intra-chroma-search : Test non-derived intra chroma modes.
          [disabled]
      --(no-)transform-skip : Try transform skip [disabled]
      --me <string> : Integer motion estimation algorithm [hexbs]
          - hexbs: Hexagon Based Search
          - tz: Test Zone Search
          - full: Full Search
          - full8, full16, full32, full64
          - dia: Diamond Search
      --me-steps <integer> : Motion estimation search step limit. Only
          affects 'hexbs' and 'dia'. [-1]
      --subme <integer> : Fractional pixel motion estimation level [4]
          - 0: Integer motion estimation only
          - 1: + 1/2-pixel horizontal and vertical
          - 2: + 1/2-pixel diagonal
          - 3: + 1/4-pixel horizontal and vertical
          - 4: + 1/4-pixel diagonal
      --(no-)fast-bipred : Only perform fast bipred search. [enabled]
      --pu-depth-inter <int>-<int> : Inter prediction units sizes [0-3]
          - 0, 1, 2, 3: from 64x64 to 8x8
          - Accepts a list of values separated by ','
            for setting separate depths per GOP layer
            (values can be omitted to use the first
            value for the respective layer).
      --pu-depth-intra <int>-<int> : Intra prediction units sizes [1-4]
          - 0, 1, 2, 3, 4: from 64x64 to 4x4
          - Accepts a list of values separated by ','
            for setting separate depths per GOP layer
            (values can be omitted to use the first
            value for the respective layer).
      --ml-pu-depth-intra : Predict the pu-depth-intra using machine
          learning trees, overrides the
          --pu-depth-intra parameter. [disabled]
      --(no-)combine-intra-cus : Whether the encoder tries to code a cu
          on lower depth even when search is not
          performed on said depth. Should only
          be disabled if cus absolutely must not
          be larger than limited by the search.
          [enabled]
      --force-inter : Force the encoder to use inter always.
          This is mostly for debugging and is not
          guaranteed to produce sensible bitstream or
          work at all. [disabled]
      --tr-depth-intra <int> : Transform split depth for intra blocks [0]
      --(no-)bipred : Bi-prediction [disabled]
      --cu-split-termination <string> : CU split search termination [zero]
          - off: Don't terminate early.
          - zero: Terminate when residual is zero.
      --me-early-termination <string> : Motion estimation termination [on]
          - off: Don't terminate early.
          - on: Terminate early.
          - sensitive: Terminate even earlier.
      --fast-residual-cost <int> : Skip CABAC cost for residual coefficients
          when QP is below the limit. [0]
      --fast-coeff-table <string> : Read custom weights for residual
          coefficients from a file instead of using
          defaults [default]
      --fast-rd-sampling : Enable learning data sampling for fast coefficient
          table generation
      --fastrd-accuracy-check : Evaluate the accuracy of fast coefficient
          prediction
      --fastrd-outdir : Directory to which to output sampled data or accuracy
          data, into <fastrd-outdir>/0.txt to 50.txt, one file
          for each QP that blocks were estimated on
      --(no-)intra-rdo-et : Check intra modes in rdo stage only until
          a zero coefficient CU is found. [disabled]
      --(no-)early-skip : Try to find skip cu from merge candidates.
          Perform no further search if skip is found.
          For rd=0..1: Try the first candidate.
          For rd=2.. : Try the best candidate based
          on luma satd cost. [enabled]
      --max-merge <integer> : Maximum number of merge candidates, 1..5 [5]
      --(no-)implicit-rdpcm : Implicit residual DPCM. Currently only supported
          with lossless coding. [disabled]
      --(no-)tmvp : Temporal motion vector prediction [enabled]
    Parallel processing:
      --threads <integer> : Number of threads to use [auto]
          - 0: Process everything with main thread.
          - N: Use N threads for encoding.
          - auto: Select automatically.
      --owf <integer> : Frame-level parallelism [auto]
          - N: Process N+1 frames at a time.
          - auto: Select automatically.
      --(no-)wpp : Wavefront parallel processing. [enabled]
          Enabling tiles automatically disables WPP.
          To enable WPP with tiles, re-enable it after
          enabling tiles. Enabling wpp with tiles is,
          however, an experimental feature since it is
          not supported in any HEVC profile.
      --tiles <int>x<int> : Split picture into width x height uniform tiles.
      --tiles-width-split <string>|u<int> :
          - <string>: A comma-separated list of tile
            column pixel coordinates.
          - u<int>: Number of tile columns of uniform
            width.
      --tiles-height-split <string>|u<int> :
          - <string>: A comma-separated list of tile
            row pixel coordinates.
          - u<int>: Number of tile rows of uniform
            height.
      --slices <string> : Control how slices are used.
          - tiles: Put tiles in independent slices.
          - wpp: Put rows in dependent slices.
          - tiles+wpp: Do both.
      --partial-coding <x-offset>!<y-offset>!<slice-width>!<slice-height>
          : Encode partial frame.
          Parts must be merged to form a valid bitstream.
          X and Y are CTU offsets.
          Slice width and height must be divisible by CTU
          in pixels unless it is the last CTU row/column.
          This parameter is used by kvaShare.
    Video Usability Information:
      --sar <width:height> : Specify sample aspect ratio
      --overscan <string> : Specify crop overscan setting [undef]
          - undef, show, crop
      --videoformat <string> : Specify video format [undef]
          - undef, component, pal, ntsc, secam, mac
      --range <string> : Specify color range [tv]
          - tv, pc
      --colorprim <string> : Specify color primaries [undef]
          - undef, bt709, bt470m, bt470bg,
            smpte170m, smpte240m, film, bt2020
      --transfer <string> : Specify transfer characteristics [undef]
          - undef, bt709, bt470m, bt470bg,
            smpte170m, smpte240m, linear, log100,
            log316, iec61966-2-4, bt1361e,
            iec61966-2-1, bt2020-10, bt2020-12
      --colormatrix <string> : Specify color matrix setting [undef]
          - undef, bt709, fcc, bt470bg, smpte170m,
            smpte240m, GBR, YCgCo, bt2020nc, bt2020c
      --chromaloc <integer> : Specify chroma sample location (0 to 5) [0]
    Deprecated parameters: (might be removed at some point)
      -w, --width <integer> : Use --input-res.
      -h, --height <integer> : Use --input-res.
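
The binary layout described under --roi (two 32-bit integers for width and height, then width * height signed 8-bit delta QP values in raster order) can be produced with a few lines of C. The sketch below serializes one ROI frame into a caller-supplied buffer. The help text does not state a byte order, so native byte order is an assumption here, and the function name roi_frame_serialize is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Serialize one ROI frame: int32 width, int32 height, then
// width * height int8 delta QP values in raster order.
// Returns the number of bytes written, or 0 if the buffer is too small.
// Assumption: the integers are stored in native byte order.
size_t roi_frame_serialize(uint8_t *out, size_t out_size,
                           int32_t width, int32_t height,
                           const int8_t *delta_qp)
{
  assert(out != NULL && delta_qp != NULL);
  assert(width > 0 && height > 0);
  size_t count = (size_t)width * (size_t)height;
  size_t needed = 2 * sizeof(int32_t) + count;
  if (out_size < needed) {
    return 0;
  }
  memcpy(out, &width, sizeof width);
  memcpy(out + sizeof width, &height, sizeof height);
  memcpy(out + 2 * sizeof(int32_t), delta_qp, count);
  return needed;
}
```

Writing the returned bytes to a file with a .bin extension, one frame after another, gives the multi-frame form described above. A 2x2 map takes 12 bytes per frame.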

LP-GOP syntax

The LP-GOP syntax is “lp-g(num)d(num)t(num)”, where

  • g = GOP length.
  • d = Number of GOP layers.
  • t = How many references to skip for temporal scaling, where 4 means only
    every fourth picture needs to be decoded.
    QP
    +4  o o o o
    +3   o   o    o o o o
    +2     o       o o o    ooooooo
    +1 o       o o       o o       o ooooooooo
       g8d4t1    g8d3t1    g8d2t1    g8d1t1
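
The “lp-g(num)d(num)t(num)” syntax is simple enough to split with sscanf. The following C sketch extracts the three numbers. It is an illustration only and not Kvazaar's own parser: the struct and function names are hypothetical, and the positivity check is a sanity check rather than the encoder's actual validity rules.

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical container for the three LP-GOP fields.
typedef struct {
  int gop_len;  // g: GOP length
  int layers;   // d: number of GOP layers
  int t_skip;   // t: references to skip for temporal scaling
} lp_gop_params;

// Returns 1 on success, 0 if the string does not match the syntax.
// Note: sscanf's %d also accepts leading whitespace and a sign, so this
// sketch is slightly more lenient than a strict parser would be.
int parse_lp_gop(const char *str, lp_gop_params *out)
{
  assert(str != NULL && out != NULL);
  int g = 0, d = 0, t = 0, consumed = 0;
  if (sscanf(str, "lp-g%dd%dt%d%n", &g, &d, &t, &consumed) != 3) {
    return 0;
  }
  if (str[consumed] != '\0') {
    return 0;  // trailing characters after the t value
  }
  if (g < 1 || d < 1 || t < 1) {
    return 0;  // sanity check only, not Kvazaar's real limits
  }
  out->gop_len = g;
  out->layers = d;
  out->t_skip = t;
  return 1;
}
```

For example, "lp-g8d4t2" yields a GOP length of 8, four layers and a t value of 2.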

Presets

The names of the presets are the same as with x264: ultrafast,
superfast, veryfast, faster, fast, medium, slow, slower, veryslow and
placebo. The effects of the presets are listed in the following table,
where the names have been abbreviated to fit the layout in GitHub.

option                0-uf   1-sf   2-vf   3-fr   4-f    5-m    6-s    7-sr   8-vs   9-p
rd                    0      0      0      0      0      0      1      2      2      2
pu-depth-intra        2-3    2-3    2-3    2-3    1-3    1-4    1-4    1-4    1-4    1-4
pu-depth-inter        1-2    1-2    1-3    1-3    1-3    0-3    0-3    0-3    0-3    0-3
me                    hexbs  hexbs  hexbs  hexbs  hexbs  hexbs  hexbs  hexbs  tz     tz
gop                   8      8      8      8      8      16     16     16     16     16
ref                   1      1      1      1      2      4      4      4      4      4
bipred                1      1      1      1      1      1      1      1      1      1
deblock               1      1      1      1      1      1      1      1      1      1
signhide              0      0      0      0      0      0      0      1      1      1
subme                 0      2      2      4      4      4      4      4      4      4
sao                   off    full   full   full   full   full   full   full   full   full
rdoq                  0      0      0      0      0      1      1      1      1      1
rdoq-skip             0      0      0      0      0      0      0      0      0      0
transform-skip        0      0      0      0      0      0      0      0      1      1
mv-rdo                0      0      0      0      0      0      0      0      0      1
full-intra-search     0      0      0      0      0      0      0      0      0      0
smp                   0      0      0      0      0      0      0      0      1      1
amp                   0      0      0      0      0      0      0      0      0      1
cu-split-termination  zero   zero   zero   zero   zero   zero   zero   zero   zero   off
me-early-termination  sens.  sens.  sens.  sens.  sens.  on     on     off    off    off
intra-rdo-et          0      0      0      0      0      0      0      0      0      0
early-skip            1      1      1      1      1      1      1      1      1      0
fast-residual-cost    28     28     28     0      0      0      0      0      0      0
max-merge             5      5      5      5      5      5      5      5      5      5

Kvazaar library

See kvazaar.h for the library API and its
documentation.

When using the static Kvazaar library on Windows, the macro KVZ_STATIC_LIB
must be defined. On other platforms it’s not strictly required.

The needed linker and compiler flags can be obtained with pkg-config.

Compiling Kvazaar

If you have trouble compiling the source code, please open an
issue about it on GitHub.
Others might encounter the same problem, and there is probably much to
improve in the build process. We want to make this as simple as
possible.

Autotools

Depending on the platform, some additional tools are required for compiling Kvazaar with autotools.
For Ubuntu, the required packages are automake autoconf libtool m4 build-essential.

Run the following commands to compile and install Kvazaar.

    ./autogen.sh
    ./configure
    make
    sudo make install
    sudo ldconfig

See ./configure --help for more options.
When building a shared library with Visual Studio, the tests will fail to link, but the main binary will still work.

Autotools on MinGW

It is recommended to use Clang instead of GCC in MinGW environments. GCC also works, but AVX2 optimizations will be disabled because of a known GCC issue from 2012, so performance will suffer badly. Instead of ./configure, run

    CC=clang ./configure

to build Kvazaar using Clang.

CMake

Depending on the platform, some additional tools are required for compiling Kvazaar with CMake.
For Ubuntu, the required packages are build-essential cmake.

OS X

  • Install Homebrew
  • Run brew install automake libtool yasm
  • Refer to the Autotools instructions

Visual Studio

  • Visual Studio 2015.2 or later is required.
  • Project files can be found under build/.
  • Requires an external vsyasm.exe in %PATH%.

Docker

This project includes a Dockerfile, which enables building for Docker. Kvazaar is also available on Docker Hub as ultravideo/kvazaar.
Build using Docker: docker build -t kvazaar .
Example usage: docker run -i -a STDIN -a STDOUT kvazaar -i - --input-res=320x240 -o - < testfile_320x240.yuv > out.265
For other examples, see the Dockerfile.

Visualization (Windows only)

Compiling kvazaar_cli project in the visualizer branch results in a Kvazaar executable with visualization enabled.

Additional Requirements: SDL2, SDL2-ttf.

Directory visualizer_extras has to be added into the same directory level as the kvazaar project directory. Inside should be directories include and lib found from the development library zip packages.

SDL2.dll, SDL2_ttf.dll, libfreetype-6.dll, and zlib1.dll should be placed in the working directory (i.e. the folder that kvazaar.exe is in after compiling the kvazaar_cli project/solution) when running the visualizer. The required .dll files can be found in the aforementioned lib folder (lib\x64).

Note: The solution should be compiled on the x64 platform in Visual Studio.

Optional font file arial.ttf is to be placed in the working directory, if block info tool is used.

Paper

Please cite this paper for Kvazaar:

M. Viitanen, A. Koivula, A. Lemmetti, A. Ylä-Outinen, J. Vanne, and T. D. Hämäläinen, “Kvazaar: open-source HEVC/H.265 encoder,” in Proc. ACM Int. Conf. Multimedia, Amsterdam, The Netherlands, Oct. 2016.

Or in BibTeX:

    @inproceedings{Kvazaar2016,
      author = {Viitanen, Marko and Koivula, Ari and Lemmetti, Ari and Yl\"{a}-Outinen, Arttu and Vanne, Jarno and H\"{a}m\"{a}l\"{a}inen, Timo D.},
      title = {Kvazaar: Open-Source HEVC/H.265 Encoder},
      booktitle = {Proceedings of the 24th ACM International Conference on Multimedia},
      year = {2016},
      isbn = {978-1-4503-3603-1},
      location = {Amsterdam, The Netherlands},
      url = {http://doi.acm.org/10.1145/2964284.2973796},
    }

Contributing to Kvazaar

We are happy to look at pull requests on GitHub. There is still a lot of work to be done.

Code documentation

You can generate Doxygen documentation pages by running the command
“doxygen docs.doxy”. Here is a rough sketch of the module structure:
[Figure: Kvazaar module hierarchy]

For version control we try to follow these conventions:

  • Master branch always produces a working bitstream (can be decoded with
    HM).
  • Commits for new features and major changes/fixes are put in a sensibly
    named feature branch first and later merged to the master branch.
  • Always merge the feature branch to the master branch, not the other
    way around, with fast-forwarding disabled if necessary.
  • Every commit should at least compile.

Testing

  • The main automatic way of testing is with Travis CI. Commits, branches
    and pull requests are tested automatically.
    • Uninitialized variables and such are checked with Valgrind.
    • Bitstream validity is checked with VTM.
    • Compilation is checked on GCC and Clang on Linux, and Clang on OS X.
  • Windows msys2 and msvc builds are checked automatically on Appveyor.
  • If your changes change the bitstream, decode with HM to check that
    it doesn’t throw checksum errors or asserts.
  • If your changes shouldn’t alter the bitstream, check that they don’t.
  • Automatic compression quality testing is in the works.

Unit tests

  • There are some unit tests located in the tests directory. We would
    like to have more.
  • The Visual Studio project links the unit tests against the actual .lib
    file used by the encoder. There is no Makefile as of yet.
  • The unit tests use the “greatest” unit testing framework. It is included
    as a submodule, but getting it requires the following commands to be
    run in the root directory of kvazaar:

        git submodule init
        git submodule update
  • On Linux, run make test.

Code style

We try to follow the following conventions:

  • C99 without features not supported by Visual Studio 2015 (VLAs).
    • // comments allowed and encouraged.
  • Follow overall conventions already established in the code.
  • Indent by 2 spaces. (no tabs)
  • { on the same line for control logic and on the next line for functions.
  • Reference and dereference operators next to the variable name.
  • Variable names in lowercase with words separated by underscores.
  • Maximum line length 79 characters when possible.
  • Functions only used inside the module shouldn’t be defined in the
    module header. They can be defined in the beginning of the .c file if
    necessary.
  • Symbols defined in headers prefixed with kvz_ or KVZ_.
  • Includes in alphabetic order.
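
As a rough illustration of these conventions, the short C function below uses 2-space indentation, the opening brace on its own line for the function and on the same line for control logic, the pointer star next to the variable name, lowercase names with underscores, // comments and alphabetically ordered includes. The function itself is hypothetical and not part of Kvazaar; it carries a kvz prefix only to show the naming rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical example function: sum of absolute values of coefficients.
uint32_t kvz_example_sum_abs(const int16_t *coeffs, size_t count)
{
  assert(coeffs != NULL || count == 0);
  uint32_t sum = 0;

  for (size_t i = 0; i < count; ++i) {
    int32_t value = coeffs[i];
    if (value < 0) {
      value = -value;
    }
    sum += (uint32_t)value;
  }
  return sum;
}
```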