Real-time video quality improvement for applications such as video chat using Perceptual Losses
This repository contains a PyTorch implementation of an algorithm for single-image super-resolution, applied to frames from a webcam. The model uses the method described in Perceptual Losses for Real-Time Style Transfer and Super-Resolution, along with Instance Normalization. The code is based on the PyTorch example code for fast-neural-style.
The programs are written in Python and use the PyTorch library to build and load the CNN models.
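For context, the perceptual loss from the paper compares VGG feature activations of the network output and the ground-truth image rather than raw pixel values. Below is a minimal sketch, assuming VGG16's relu2_2 layer as the feature extractor; the repository's exact layer choice and loss weighting may differ.

import torch.nn as nn
from torchvision import models

class PerceptualLoss(nn.Module):
    """Feature-reconstruction loss: MSE between VGG16 activations."""
    def __init__(self):
        super().__init__()
        # Frozen VGG16 feature extractor up to relu2_2 (modules 0-8).
        self.vgg = models.vgg16(pretrained=True).features[:9].eval()
        for p in self.vgg.parameters():
            p.requires_grad = False
        self.mse = nn.MSELoss()

    def forward(self, sr, hr):
        # Penalize perceptual differences between the super-resolved
        # frame (sr) and the high-resolution target (hr).
        return self.mse(self.vgg(sr), self.vgg(hr))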
An example of 4x super-resolution:
An example of 8x super-resolution:
All the programs were tested on the following setup:
Note: While a good GPU will help immensely with training the networks, it is not absolutely required to evaluate these programs.
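You can quickly check whether PyTorch sees a usable GPU; evaluation also runs on the CPU, just more slowly:

import torch

# True only when a CUDA-capable GPU and driver are present.
print('CUDA available:', torch.cuda.is_available())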
$ sudo apt-get install build-essential cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev python-dev python-numpy libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev libv4l-dev libxvidcore-dev libx264-dev libgtk-3-dev libatlas-base-dev gfortran python2.7-dev python3.5-dev
$ cd ~
$ wget -O opencv.zip https://github.com/opencv/opencv/archive/3.3.0.zip
$ unzip opencv.zip
$ cd ~
$ wget -O opencv_contrib.zip https://github.com/opencv/opencv_contrib/archive/3.3.0.zip
$ unzip opencv_contrib.zip
$ sudo apt-get install python-pip && pip install --upgrade pip
$ sudo pip install virtualenv virtualenvwrapper
$ sudo rm -rf ~/.cache/pip
$ export WORKON_HOME=$HOME/.virtualenvs
$ source /usr/local/bin/virtualenvwrapper.sh
$ source ~/.bashrc
$ mkvirtualenv supres -p python3
$ workon supres
$ pip install numpy
$ cd ~/opencv-3.3.0/
$ mkdir build
$ cd build
$ cmake -D CMAKE_BUILD_TYPE=RELEASE \
-D CMAKE_INSTALL_PREFIX=/usr/local \
-D INSTALL_PYTHON_EXAMPLES=ON \
-D INSTALL_C_EXAMPLES=OFF \
-D OPENCV_EXTRA_MODULES_PATH=~/opencv_contrib-3.3.0/modules \
-D PYTHON_EXECUTABLE=~/.virtualenvs/supres/bin/python \
-D BUILD_EXAMPLES=ON ..
$ make -j4
$ sudo make install
$ sudo ldconfig
After the install completes, you should see a file like cv2.cpython-35m-x86_64-linux-gnu.so in /usr/local/lib/python3.5/site-packages/:
$ cd /usr/local/lib/python3.5/site-packages/
$ ls -l
cv2.cpython-35m-x86_64-linux-gnu.so
Rename the bindings to cv2.so, then symlink them into the supres virtual environment:
$ sudo mv cv2.cpython-35m-x86_64-linux-gnu.so cv2.so
$ cd ~/.virtualenvs/supres/lib/python3.5/site-packages/
$ ln -s /usr/local/lib/python3.5/site-packages/cv2.so cv2.so
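To confirm that OpenCV is visible inside the virtual environment, a quick sanity check (the version should match the 3.3.0 sources built above):
$ python -c "import cv2; print(cv2.__version__)"
3.3.0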
$ pip3 install http://download.pytorch.org/whl/cu80/torch-0.2.0.post3-cp35-cp35m-manylinux1_x86_64.whl
$ pip3 install torchvision
$ pip install -r requirements.txt
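PyTorch can be verified the same way; it should report the 0.2.0 build installed above:
$ python -c "import torch; print(torch.__version__)"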
The code supports two scales of super-resolution: 4x and 8x.
Packaged with the code are two pretrained models:
- saved_models/coco4x_epoch_20.model for 4x Super-Resolution
- saved_models/coco8x_epoch_20.model for 8x Super-Resolution
In order to run the live demo for 4x Super-Resolution, run the following command:
$ python super-resolution.py eval --model saved_models/coco4x_epoch_20.model --downsample-scale 4
In order to run the live demo for 8x Super-Resolution, run the following command:
$ python super-resolution.py eval --model saved_models/coco8x_epoch_20.model --downsample-scale 8
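Conceptually, the demo loop grabs webcam frames with OpenCV, downsamples them by the chosen scale, and feeds them through the network. The sketch below uses a bilinear nn.Upsample as a stand-in for the trained model and the modern PyTorch API for illustration; loading the real checkpoint is handled by super-resolution.py itself.

import cv2
import torch

model = torch.nn.Upsample(scale_factor=4, mode='bilinear')  # stand-in for the trained network
cap = cv2.VideoCapture(0)  # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Simulate the degraded input by downsampling 4x.
    small = cv2.resize(frame, None, fx=0.25, fy=0.25)
    x = torch.from_numpy(small.transpose(2, 0, 1).copy()).float().unsqueeze(0)
    y = model(x)
    out = y.squeeze(0).numpy().transpose(1, 2, 0).clip(0, 255).astype('uint8')
    cv2.imshow('super-resolution', out)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # press q to quit
        break
cap.release()
cv2.destroyAllWindows()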
These are all the options available in eval mode:
$ python super-resolution.py eval --help
usage: super-resolution.py eval [-h] --model MODEL --downsample-scale
DOWNSAMPLE_SCALE
optional arguments:
-h, --help show this help message and exit
--model MODEL saved model to be used for super resolution
--downsample-scale DOWNSAMPLE_SCALE
amount that you wish to downsample by. Default = 8
It is possible to train your own models using super-resolution.py's train mode.
The packaged models were trained on the MS COCO dataset. In order to download the dataset and set aside 10k images for training, please run the download_dataset.sh script.
Once downloaded, you can run the following example command to train a model for 4x Super-Resolution:
$ python super-resolution.py train --epochs 20 --dataset data/ --save-model-dir saved_models/ --checkpoint-model-dir checkpoints/ --downsample-scale 4
The full set of help options for train mode is as follows:
$ python super-resolution.py train --help
usage: super-resolution.py train [-h] [--epochs EPOCHS]
[--batch-size BATCH_SIZE] [--dataset DATASET]
--save-model-dir SAVE_MODEL_DIR
[--checkpoint-model-dir CHECKPOINT_MODEL_DIR]
[--lr LR] [--log-interval LOG_INTERVAL]
[--checkpoint-interval CHECKPOINT_INTERVAL]
[--downsample-scale DOWNSAMPLE_SCALE]
optional arguments:
-h, --help show this help message and exit
--epochs EPOCHS number of training epochs, default is 2
--batch-size BATCH_SIZE
batch size for training, default is 4
--dataset DATASET path to training dataset, the path should point to a
folder containing another folder with all the training
images
--save-model-dir SAVE_MODEL_DIR
path to folder where trained model will be saved.
--checkpoint-model-dir CHECKPOINT_MODEL_DIR
path to folder where checkpoints of trained models
will be saved
--lr LR learning rate, default is 1e-3
--log-interval LOG_INTERVAL
number of images after which the training loss is
logged, default is 500
--checkpoint-interval CHECKPOINT_INTERVAL
number of batches after which a checkpoint of the
trained model will be created
--downsample-scale DOWNSAMPLE_SCALE
amount that you wish to downsample by. Default = 8
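Note that the --dataset layout described above (a folder containing another folder with all the training images) matches torchvision's ImageFolder convention, so the training data is most likely loaded along these lines; the crop size and batch size here are illustrative assumptions, not the repository's exact values.

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(288),      # assumed training resolution
    transforms.CenterCrop(288),
    transforms.ToTensor(),       # HWC uint8 -> CHW float in [0, 1]
])
# e.g. data/ containing data/train/ with all the training images
dataset = datasets.ImageFolder('data/', transform)
loader = DataLoader(dataset, batch_size=4, shuffle=True)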