Project author: arunmallya

Project description:
Implements an MLP for VQA
Language: Lua
Project address: git://github.com/arunmallya/simple-vqa.git
Created: 2016-10-05T20:32:09Z
Project community: https://github.com/arunmallya/simple-vqa


simple-vqa

Learns an MLP for VQA

This code implements the VQA MLP baseline from Revisiting Visual Question Answering Baselines.

Some numbers on VQA

Features/Methods | VQA Val Accuracy | VQA Test-dev Accuracy
---------------- | ---------------- | ----------------------
MCBP             | -                | 66.4
Baseline - MLP   | -                | 64.9
Imagenet - MLP   | 63.62            | 65.9

This README is a work in progress.

Installation

The MLP is implemented in Torch, and depends on the following packages:
torch/nn,
torch/nngraph,
torch/cutorch,
torch/cunn,
torch/image,
torch/tds,
lua-cjson,
nninit,
torch-word-emb,
torch-hdf5,
torchx

After installing Torch, you can install or update these dependencies by running the following:

  luarocks install nn
  luarocks install nngraph
  luarocks install image
  luarocks install tds
  luarocks install cutorch
  luarocks install cunn
  luarocks install lua-cjson
  luarocks install nninit
  luarocks install torch-word-emb
  luarocks install torchx

Install torch-hdf5 by following the instructions here.
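
To confirm that everything is in place, a small unofficial sanity check (not part of this repo) can simply try to require each package from Lua; torch-word-emb is left out here because its Lua module name may differ from the rock name:

  -- deps_check.lua: unofficial sanity check, not shipped with this repo.
  -- Each require fails if the corresponding rock is missing or broken.
  local deps = {'nn', 'nngraph', 'image', 'tds', 'cutorch', 'cunn',
                'cjson', 'nninit', 'hdf5', 'torchx'}
  for _, name in ipairs(deps) do
    local ok, err = pcall(require, name)
    print(string.format('%-10s %s', name, ok and 'OK' or ('MISSING: ' .. tostring(err))))
  end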

Running trained models

Download this repo

  git clone --recursive https://github.com/arunmallya/simple-vqa.git
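
If the repository was already cloned without --recursive, the submodules can be fetched afterwards with the usual git command:

  git submodule update --init --recursive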

Data Dependencies

  • Create a data/ folder and symlink or place the following datasets in it: vqa -> VQA dataset root, coco -> COCO dataset root (coco is needed only if you plan to extract and use your own features; it is not required if you use the cached features below).

  • Download the Word2Vec model file from here. This is needed to encode sentences into vectors. Place the .bin file in the data/models folder.

  • Download cached resnet-152 imagenet features for the VQA dataset splits and place them in data/feats: features

  • Download the VQA lite annotations and place them in data/vqa/Annotations/. These are required because the original VQA annotations do not fit within LuaJIT's 2 GB memory limit.

  • Download MLP models trained on the VQA train set and place them in checkpoint/: models

  • At this point, your data/ folder should contain the models/, feats/, coco/, and vqa/ folders.
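
As a rough sketch of that layout (the /path/to/... locations are placeholders for wherever your local copies of the datasets live):

  mkdir -p data/models data/feats checkpoint
  ln -s /path/to/VQA  data/vqa    # VQA dataset root
  ln -s /path/to/COCO data/coco   # only needed when extracting your own features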

Run Eval

For example, to run the model trained on the VQA train set with Imagenet features, on the VQA val set:

  th eval.lua -eval_split val \
    -eval_checkpoint_path checkpoint/MLP-imagenet-train.t7

In general, the command is:

  th eval.lua -eval_split (train/val/test-dev/test-final) \
    -eval_checkpoint_path <model-path>
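
For instance, reusing the checkpoint from the example above, a test-dev run would be:

  th eval.lua -eval_split test-dev \
    -eval_checkpoint_path checkpoint/MLP-imagenet-train.t7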

This will dump the results in checkpoint/ as a .json file, as well as a results.zip file in the case of test-dev and test-final. The results.zip can be uploaded to CodaLab for evaluation.
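
To peek at the dumped predictions, a small unofficial Lua sketch using the lua-cjson dependency could look like this; the results filename is a placeholder and the entry fields are assumed to follow the standard VQA submission format (question_id/answer):

  -- inspect_results.lua: unofficial sketch, not part of this repo.
  local cjson = require 'cjson'
  local path = 'checkpoint/results.json'   -- placeholder: use the .json that eval.lua wrote
  local f = assert(io.open(path, 'r'))
  local results = cjson.decode(f:read('*a'))
  f:close()
  -- Entries assumed to be { question_id = ..., answer = ... } (standard VQA format).
  print(('%d predictions loaded'):format(#results))
  print(results[1].question_id, results[1].answer)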

Training MLP from scratch

  th train.lua -im_feat_types imagenet -im_feat_dims 2048
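
For orientation, a VQA-style MLP of this general shape (concatenate image and question features, one hidden layer, softmax over answers) can be written in a few lines of nn. This is purely an illustrative sketch; the actual architecture, layer sizes, and answer vocabulary used by this repo are defined in its training code and may differ:

  -- Illustrative only; sizes below are assumptions, not this repo's settings.
  require 'nn'

  local im_dim, q_dim = 2048, 300          -- 2048 matches -im_feat_dims; 300 assumes word2vec
  local hidden, num_answers = 8192, 3000   -- chosen here purely for illustration

  local mlp = nn.Sequential()
    :add(nn.JoinTable(2))                  -- concatenate {image, question} features
    :add(nn.Linear(im_dim + q_dim, hidden))
    :add(nn.ReLU(true))
    :add(nn.Dropout(0.5))
    :add(nn.Linear(hidden, num_answers))
    :add(nn.LogSoftMax())

  -- Forward a dummy batch of 4 examples to check the shapes.
  local out = mlp:forward({torch.rand(4, im_dim), torch.rand(4, q_dim)})
  print(out:size())                        -- 4 x num_answers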