Project author: mohit1997

Project description:
NN-based lossless compression
Primary language: Python
Project URL: git://github.com/mohit1997/DeepZip.git
Created: 2018-07-16T17:03:11Z
Project community: https://github.com/mohit1997/DeepZip

License: MIT License


DeepZip

Update: Please check out our newer work, DZip, presented at DCC 2021.

Description

Data compression using neural networks

DeepZip: Lossless Data Compression using Recurrent Neural Networks
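
In predictive lossless compressors of this kind, a model assigns a probability to each upcoming symbol and an arithmetic coder spends roughly -log2(p) bits on the symbol that actually occurs. The sketch below illustrates only that bit accounting. It is not code from this repository: the adaptive frequency counter is a deliberately simple stand-in for the recurrent network, and `adaptive_code_length_bits` is a hypothetical name chosen for illustration.

```python
import math
from collections import Counter


def adaptive_code_length_bits(data: bytes, alphabet_size: int = 256) -> float:
    """Ideal arithmetic-coding cost (in bits) of `data` under an adaptive model.

    The frequency counter stands in for a neural predictor: before each
    symbol is coded, the model produces a probability p for it, and an
    ideal arithmetic coder spends about -log2(p) bits on that symbol.
    """
    counts = Counter()
    seen = 0
    total_bits = 0.0
    for symbol in data:
        # Laplace smoothing keeps every probability strictly positive.
        p = (counts[symbol] + 1) / (seen + alphabet_size)
        total_bits += -math.log2(p)
        # A decoder can apply the same update after decoding the symbol,
        # so both sides stay in sync without transmitting the model.
        counts[symbol] += 1
        seen += 1
    return total_bits
```

The better the predictor, the higher the probability it gives to the true next symbol and the fewer bits are spent; a stronger model such as an RNN would replace the counter while the accounting stays the same.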

Requirements

  1. GPU, nvidia-docker (or try alternative installation)
  2. python 2/3
  3. numpy
  4. sklearn
  5. keras 2.2.2
  6. tensorflow (cpu/gpu) 1.8
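
Before running anything, it can help to see which of the listed Python packages are importable and at what version. The helper below is a minimal sketch, not part of the repository; `installed_version` and `report` are hypothetical names, and the module names are taken from the list above.

```python
import importlib

# Versions named in the requirements list; None means no version was stated.
REQUIRED = {"numpy": None, "sklearn": None, "keras": "2.2.2", "tensorflow": "1.8"}


def installed_version(module_name):
    """Return the module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Some modules do not expose __version__; report that explicitly.
    return getattr(module, "__version__", "unknown")


def report(required=REQUIRED):
    """Return (module, wanted, found) rows; `found` is None when missing."""
    return [(name, wanted, installed_version(name))
            for name, wanted in sorted(required.items())]
```

Calling `report()` inside the environment you plan to use returns one row per package, which makes a mismatch with the pinned keras or tensorflow versions easy to spot.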

(nvidia-docker is currently required to run the code)
A simple way to install and run is to use the docker files provided:

    cd docker
    make bash BACKEND=tensorflow GPU=0 DATA=/path/to/data/

Alternative Installation

    cd DeepZip
    python3 -m venv tf
    source tf/bin/activate
    bash install.sh

Code

To run a compression experiment:

Data Preparation

  1. Place all the data to be compressed in data/files_to_be_compressed
  2. Run the parser:
       cd data
       ./run_parser.sh
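
What the parser does internally is not described here. A common preprocessing step for symbol-level neural compressors is to map each distinct symbol of the input to an integer ID and keep the reverse mapping for decompression. The sketch below shows that idea under that assumption; `encode_symbols` and `decode_symbols` are hypothetical names, not functions from this repository.

```python
def encode_symbols(text):
    """Map each distinct character to an integer ID.

    Returns (ids, id2char): the input as a list of integer IDs, and the
    dictionary needed to turn the IDs back into characters.
    """
    alphabet = sorted(set(text))
    char2id = {ch: i for i, ch in enumerate(alphabet)}
    ids = [char2id[ch] for ch in text]
    id2char = {i: ch for ch, i in char2id.items()}
    return ids, id2char


def decode_symbols(ids, id2char):
    """Invert encode_symbols: rebuild the original text from the IDs."""
    return "".join(id2char[i] for i in ids)
```

For a DNA-like input such as "ACGTAC" the alphabet has four symbols, so a model only has to predict a distribution over four classes, and the round trip through `decode_symbols` recovers the input exactly.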

Running models

  1. All the models are listed in models.py
  2. Pick a model and run the compression experiment on all the data files in the data/files_to_be_compressed directory:
       cd src
       ./run_experiments.sh biLSTM GPUID

Note: GPUID can be set to 0 by default. The corresponding command is then ./run_experiments.sh biLSTM 0
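
How run_experiments.sh consumes GPUID is not shown here. One common mechanism for pinning a TensorFlow process to a single GPU is the CUDA_VISIBLE_DEVICES environment variable, which has to be set before the framework initializes CUDA. The sketch below assumes that mechanism; `select_gpu` is a hypothetical helper, not part of this repository.

```python
import os


def select_gpu(gpu_id):
    """Restrict CUDA-based frameworks in this process to one GPU.

    Must be called before TensorFlow (or any other CUDA library) is
    imported and initialized, otherwise the setting has no effect.
    """
    # Order devices by PCI bus ID so the numbering matches nvidia-smi.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)


select_gpu(0)  # corresponds to passing 0 as GPUID
```

With this setting the selected card appears to the process as device 0, whatever its physical index is.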

Please cite the following paper if you use the code in this repository.

    @inproceedings{7fcb664b03ac4d6497048954d756b91f,
      title = "DeepZip: Lossless Data Compression Using Recurrent Neural Networks",
      author = "Mohit Goyal and Kedar Tatwawadi and Shubham Chandak and Idoia Ochoa",
      year = "2019",
      month = "5",
      day = "10",
      doi = "10.1109/DCC.2019.00087",
      language = "English (US)",
      series = "Data Compression Conference Proceedings",
      publisher = "Institute of Electrical and Electronics Engineers Inc.",
      editor = "Ali Bilgin and Storer, {James A.} and Marcellin, {Michael W.} and Joan Serra-Sagrista",
      booktitle = "Proceedings - DCC 2019",
      address = "United States",
    }