Project author: yunzhusong

Project description:
Code for the paper: Character-Preserving Coherent Story Visualization
Language: Python
Repository: git://github.com/yunzhusong/ECCV2020_CPCSV.git
Created: 2020-07-05T07:10:03Z
Project community: https://github.com/yunzhusong/ECCV2020_CPCSV

License:



CPCStoryVisualization-Pytorch (ECCV 2020)


Author: @yunzhusong, @theblackcat102, @redman0226, Huiao-Han Lu, Hong-Han Shuai

Paper Link (ECCV 2020)

Organization

Code implementation for Character-Preserving Coherent Story Visualization

  "Objects in pictures should so be arranged as by their very position to tell their own story."
  - Johann Wolfgang von Goethe (1749-1832)

In this paper we propose a new framework named Character-Preserving Coherent Story Visualization (CP-CSV) to tackle the challenges in story visualization: generating a sequence of images that emphasizes preserving the global consistency of characters and scenes across different story pictures.

CP-CSV effectively learns to visualize the story by three critical modules: story and context encoder (story and sentence representation learning), figure-ground segmentation (auxiliary task to provide information for preserving character and story consistency), and figure-ground aware generation (image sequence generation by incorporating figure-ground information). Moreover, we propose a metric named Frechet Story Distance (FSD) to evaluate the performance of story visualization. Extensive experiments demonstrate that CP-CSV maintains the details of character information and achieves high consistency among different frames, while FSD better measures the performance of story visualization. The FVD evaluation metric is from here.
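
FSD is only named here, not defined. As a rough illustration, and assuming it uses the same Fréchet distance between Gaussian fits that FID and FVD use, the distance between two sets of pre-extracted features could be sketched as follows. The function name `frechet_distance` is hypothetical, and the feature extractor used in the paper is not shown.

```python
import numpy as np


def frechet_distance(real_feats, fake_feats):
    """Fréchet distance between Gaussian fits of two feature sets.

    Both inputs are arrays of shape (num_samples, feature_dim),
    with feature_dim >= 2.
    """
    mu_r, mu_f = real_feats.mean(axis=0), fake_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_f = np.cov(fake_feats, rowvar=False)

    # Tr(sqrt(cov_r @ cov_f)) equals the sum of the square roots of the
    # eigenvalues of the product; clip tiny negatives caused by round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_f).real
    trace_sqrt = np.sqrt(np.clip(eigvals, 0.0, None)).sum()

    mean_term = np.sum((mu_r - mu_f) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)
```

A lower value means the generated feature distribution is closer to the real one; two identical feature sets give a distance of approximately zero.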

Datasets

  1. PORORO: the images and segmentation images can be downloaded here. This is the original Pororo dataset with self-labeled segmentation masks of the characters.

  2. CLEVR with segmentation masks: 13755 image sequences, generated using Clevr-for-StoryGAN.

  images/
    CLEVR_new_013754_1.png
    CLEVR_new_013754_1_mask.png
    CLEVR_new_013754_2.png
    CLEVR_new_013754_2_mask.png
    CLEVR_new_013754_3.png
    CLEVR_new_013754_3_mask.png
    CLEVR_new_013754_4.png
    CLEVR_new_013754_4_mask.png
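
The listing suggests that each frame is named `CLEVR_new_<story>_<frame>.png` and its mask carries a `_mask` suffix. Assuming that pattern holds for the whole dataset, a minimal sketch for pairing frames with masks might look like this. The helper `group_frames` is hypothetical and is not part of the repository.

```python
import re
from collections import defaultdict

# Pattern inferred from the listing: CLEVR_new_<story>_<frame>[_mask].png
NAME_RE = re.compile(r"^CLEVR_new_(\d+)_(\d+)(_mask)?\.png$")


def group_frames(filenames):
    """Map story id -> sorted list of (frame_index, image_name, mask_name)."""
    images, masks = {}, {}
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            continue  # ignore files that do not follow the pattern
        key = (match.group(1), int(match.group(2)))
        (masks if match.group(3) else images)[key] = name

    stories = defaultdict(list)
    for (story, frame), image in sorted(images.items()):
        # mask_name is None when no matching *_mask.png was found
        stories[story].append((frame, image, masks.get((story, frame))))
    return dict(stories)
```

Passing the names from `os.listdir("images")` would return one entry per story, with the frames in order.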

Download link

Setup environment

  virtualenv -p python3 env
  source env/bin/activate
  pip install -r requirements.txt

Train CPCSV

Steps

  1. Download the Pororo dataset and place it in DATA_DIR. The dataset should contain SceneDialogues/ (where the gif files reside) and *.npy files.

  2. Modify DATA_DIR in ./cfg/final.yml.

  3. The default hyper-parameters in ./cfg/final.yml are set to reproduce the paper results. To train from scratch:

     ./script.sh

  4. To run the evaluation, set --cfg to ./output/yourmodelname/setting.yml, e.g.:

     ./script_inference.sh
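
The evaluation step expects a setting.yml under ./output/yourmodelname/. To see which trained runs have one, a small sketch such as the following could list the candidates. The helper `find_run_configs` is hypothetical and only assumes the ./output/<model name>/setting.yml layout mentioned above.

```python
from pathlib import Path


def find_run_configs(output_dir="output"):
    """Return sorted paths matching <output_dir>/<model name>/setting.yml."""
    return sorted(Path(output_dir).glob("*/setting.yml"))
```

Calling `find_run_configs()` from the repository root returns the paths that can be passed as the --cfg value.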

Evaluate CPCSV

The pretrained model can be downloaded here.

Steps

  1. Download the Pororo dataset and place it in DATA_DIR. The dataset should contain SceneDialogues/ (where the gif files reside) and *.npy files.

  2. Modify DATA_DIR in ./cfg/final.yml.

  3. To evaluate the FID and FSD of the pretrained model:

     ./script_inference.sh
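
Before running the script, it may help to confirm that DATA_DIR has the layout described in step 1. The following is a minimal sketch that checks only the two things the steps mention (a SceneDialogues/ directory and at least one *.npy file). The helper `check_pororo_layout` is hypothetical and does not validate file contents.

```python
from pathlib import Path


def check_pororo_layout(data_dir):
    """Return a list of problems found in data_dir (empty list means OK)."""
    root = Path(data_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    problems = []
    if not (root / "SceneDialogues").is_dir():
        problems.append("missing SceneDialogues/ directory")
    if not any(root.glob("*.npy")):
        problems.append("no *.npy files found")
    return problems
```

An empty list means both expected items are present; otherwise each string names what is missing.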

Tensorboard

Use TensorBoard to check the results.

  tensorboard --logdir output/ --host 0.0.0.0 --port 6009

Slides and presentation video

The slides and the presentation video can be found in slides.

Cite

  @inproceedings{song2020CPCSV,
    title={Character-Preserving Coherent Story Visualization},
    author={Song, Yun-Zhu and Tam, Zhi-Rui and Chen, Hung-Jen and Lu, Huiao-Han and Shuai, Hong-Han},
    booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
    year={2020}
  }