Project author: yufengm

Project description:
PyTorch Implementation of Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning
Primary language: Jupyter Notebook
Repository URL: git://github.com/yufengm/Adaptive.git
Created: 2017-10-11T20:04:02Z
Project community: https://github.com/yufengm/Adaptive


AdaptiveAttention

PyTorch Implementation of Adaptive Attention Model for Image Captioning

Knowing When to Look: Adaptive Attention via A Visual Sentinel for Image Captioning [Paper] [Review]

Dataset Preparation

First, download the MS-COCO dataset: create a data folder and run the download bash script.

  mkdir data && ./download.sh

Afterwards, create the Karpathy split for training, validation, and testing.

  python KarpathySplit.py
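As a rough illustration of what a Karpathy-style split involves, the sketch below groups image records by a `split` field. It is not taken from KarpathySplit.py: the record layout (`images`, `filename`, `split`), the `restval` label, the function name `split_images`, and the choice to fold `restval` into the training set are all assumptions based on the commonly distributed Karpathy split file, and the script in this repository may work differently.

```python
from collections import defaultdict


def split_images(dataset, merge_restval=True):
    """Group image records by their 'split' field (assumed layout)."""
    splits = defaultdict(list)
    for image in dataset["images"]:
        name = image["split"]
        # Assumption: 'restval' images are commonly added to the training set.
        if merge_restval and name == "restval":
            name = "train"
        splits[name].append(image)
    return dict(splits)


# Toy, hand-written records standing in for the real annotation file.
toy_dataset = {
    "images": [
        {"filename": "a.jpg", "split": "train"},
        {"filename": "b.jpg", "split": "restval"},
        {"filename": "c.jpg", "split": "val"},
        {"filename": "d.jpg", "split": "test"},
    ]
}

splits = split_images(toy_dataset)
for name, images in splits.items():
    print(name, [image["filename"] for image in images])
```

Passing `merge_restval=False` keeps the `restval` group separate, which can be useful for checking how many images each label covers before committing to a training set.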

Then we can build the vocabulary by running

  python build_vocab.py

The vocab.pkl should be saved in the data folder.
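To give a sense of what a vocabulary file like vocab.pkl might contain, here is a minimal sketch that counts words in captions, keeps the frequent ones, and pickles the resulting word-to-index mapping. Everything in it is an assumption for illustration: the special tokens, the `min_count` threshold, the whitespace tokenization, and the function name `build_vocab` are hypothetical, and the actual build_vocab.py may use a different tokenizer and data structure.

```python
import pickle
from collections import Counter


def build_vocab(captions, min_count=2):
    """Map each sufficiently frequent word to an integer index."""
    counter = Counter()
    for caption in captions:
        # Assumption: lowercase + whitespace split stands in for a real tokenizer.
        counter.update(caption.lower().split())

    # Hypothetical special tokens reserved at the start of the index range.
    specials = ["<pad>", "<start>", "<end>", "<unk>"]
    words = sorted(word for word, count in counter.items() if count >= min_count)
    return {word: index for index, word in enumerate(specials + words)}


captions = ["A dog runs", "a dog sleeps", "a cat sleeps"]
vocab = build_vocab(captions, min_count=2)

# Round-trip through pickle, the same serialization a .pkl file would use.
restored = pickle.loads(pickle.dumps(vocab))
print(restored)
```

Words below the threshold (here "runs" and "cat") are left out of the mapping, so at lookup time they would fall back to the `<unk>` index.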

Next, resize all the images in both the train and val folders. Here I create a new folder under data named 'resized', then run resize.py to resize all images to 256 x 256. You may specify different locations inside resize.py.

  mkdir data/resized && python resize.py

Once all images are resized, we can train our Adaptive Attention model with

  python train.py