Convolutional autoencoder for encoding/decoding RGB images in TensorFlow with high compression ratio
This is a sample template, adapted from Arash Saber Tehrani's Deep-Convolutional-AutoEncoder tutorial https://github.com/arashsaber/Deep-Convolutional-AutoEncoder, for encoding/decoding 3-channel images. The template is fully commented. I have tested this implementation on rescaled samples from the CelebA dataset from CUHK http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html, and it produces reasonably decent results after a short period of training. The compression ratio of this implementation is 108: an input tensor of shape [-1, 48, 48, 3] (48 x 48 x 3 = 6912 values per image) is reduced at the bottleneck layer to a tensor of shape [-1, 64].
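The ratio of 108 can be checked directly from the two shapes. The snippet below is a minimal sketch; the helper name compression_ratio is hypothetical and not part of ConvAutoencoder.py. It counts tensor elements per sample, not bytes on disk.

```python
from math import prod


def compression_ratio(input_shape, bottleneck_shape):
    """Ratio of per-sample input values to per-sample bottleneck values.

    The leading dimension (-1, the batch size) is skipped in both shapes.
    Hypothetical helper for illustration only.
    """
    n_input = prod(input_shape[1:])       # 48 * 48 * 3 = 6912
    n_code = prod(bottleneck_shape[1:])   # 64
    return n_input / n_code


print(compression_ratio([-1, 48, 48, 3], [-1, 64]))  # 108.0
```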
Add on features:
Caveats:
N.B. The input images are 48x48, hence the blurriness. Additionally, these outputs were produced with n_epochs set to 1000; increasing it may give even better results (note the cost function trend).
Inputs:
Outputs:
Make sure to create the directory ./logs/run1/ to save TensorBoard output. To push multiple runs to TensorBoard, simply save additional logs as ./logs/run2/, ./logs/run3/, etc.
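If you would rather not create the run directories by hand, something like the following could do it. This is a sketch under assumptions: make_run_dir is a hypothetical helper, not a function in this repository, and it simply picks the lowest unused ./logs/runN/ name.

```python
import os


def make_run_dir(log_root="./logs", run_prefix="run"):
    """Create and return the next unused <log_root>/<run_prefix>N/ directory.

    Hypothetical helper: it only prepares the directory layout described
    above and does not write any TensorBoard data itself.
    """
    os.makedirs(log_root, exist_ok=True)
    n = 1
    # Skip run1, run2, ... until a name that does not exist yet is found.
    while os.path.exists(os.path.join(log_root, f"{run_prefix}{n}")):
        n += 1
    run_dir = os.path.join(log_root, f"{run_prefix}{n}")
    os.makedirs(run_dir)
    return run_dir
```

Calling make_run_dir() once per training run yields ./logs/run1, ./logs/run2, and so on. Pointing TensorBoard at the parent directory (tensorboard --logdir ./logs) should then show each run separately.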
Unzip ./celebG.tar.gz and save the JPEGs in ./data/celebG/
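The extraction step can also be scripted. The sketch below assumes nothing about the internal folder layout of the archive, so it flattens every JPEG member directly into the output directory; extract_jpegs is a hypothetical helper, not part of this repository.

```python
import os
import tarfile


def extract_jpegs(archive_path="./celebG.tar.gz", out_dir="./data/celebG"):
    """Copy every .jpg/.jpeg file in a gzipped tar archive into out_dir.

    Any directory structure inside the archive is dropped (assumption:
    file names are unique). Returns the number of files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            name = os.path.basename(member.name)
            if not member.isfile():
                continue
            if not name.lower().endswith((".jpg", ".jpeg")):
                continue
            src = tar.extractfile(member)
            with open(os.path.join(out_dir, name), "wb") as dst:
                dst.write(src.read())
            count += 1
    return count
```

Run from the root directory, extract_jpegs() would use the default paths named above.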
Either use the provided image set or your own. If using your own dataset, I recommend ImageMagick for resizing: https://www.imagemagick.org/script/download.php
If using ImageMagick, start Bash in ./data/<your_dir>/:
for file in "$PWD"/*.jpg
do
    # Quote "$file" so paths containing spaces are handled; each image is overwritten in place.
    convert "$file" -resize 42x42 "$file"
done
In the root directory, run python ConvAutoencoder.py
Here is a list of common problems:
See https://github.com/carpedm20/DCGAN-tensorflow/blob/master/utils.py for several dynamic image resize functions that I have incorporated into my implementation.