Project author: AhmedNasr7

Project description:
In this project, I use PyTorch to implement an automatic image captioning system as part of the Udacity Computer Vision Nanodegree.
Primary language: Jupyter Notebook
Project URL: git://github.com/AhmedNasr7/Automatic-Image-Captioning.git
Created: 2020-02-19T19:04:09Z
Project community: https://github.com/AhmedNasr7/Automatic-Image-Captioning


Automatic-Image-Captioning

Udacity Computer Vision Nanodegree

The second project in the Udacity Computer Vision Nanodegree: automatic captioning of images.

To install the required packages:

    pip install -r requirements.txt

The project implements the architecture introduced in the paper Show and Tell: A Neural Image Caption Generator.

The next figure shows the architecture used in the project:

Encoder Decoder Architecture
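
The encoder-decoder idea can be illustrated with a minimal PyTorch sketch. This is not the notebook code from this repo: the class names `EncoderCNN` and `DecoderRNN`, the layer sizes, and the tiny stand-in CNN (used here in place of a pretrained network so the example runs without downloads) are all assumptions made for illustration. The sketch only shows the data flow: the image embedding is fed to the LSTM as the first input, followed by the embedded caption words.

```python
import torch
import torch.nn as nn


class EncoderCNN(nn.Module):
    """Maps a batch of images to one embedding vector per image."""

    def __init__(self, embed_size):
        super().__init__()
        # Assumption: a tiny stand-in CNN; a pretrained network would
        # normally take this place.
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.embed = nn.Linear(16, embed_size)

    def forward(self, images):
        x = self.features(images).flatten(1)  # (batch, 16)
        return self.embed(x)                  # (batch, embed_size)


class DecoderRNN(nn.Module):
    """Predicts word scores for each caption position with an LSTM."""

    def __init__(self, embed_size, hidden_size, vocab_size):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_size)
        self.lstm = nn.LSTM(embed_size, hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, vocab_size)

    def forward(self, features, captions):
        # Drop the last token: it is only ever a prediction target.
        words = self.word_embed(captions[:, :-1])  # (batch, T-1, embed)
        # The image embedding acts as the first "word" of the sequence.
        inputs = torch.cat([features.unsqueeze(1), words], dim=1)
        hidden, _ = self.lstm(inputs)              # (batch, T, hidden)
        return self.fc(hidden)                     # (batch, T, vocab)
```

With a batch of 4 random images and 12-token captions over a hypothetical 100-word vocabulary, `DecoderRNN(32, 64, 100)(EncoderCNN(32)(images), captions)` yields a score tensor with one row of vocabulary scores per caption position.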

COCO Dataset Instructions

COCO Dataset Examples

  1. Clone this repo: https://github.com/cocodataset/cocoapi

    git clone https://github.com/cocodataset/cocoapi.git
  2. Set up the COCO API (also described in the cocoapi README)

    cd cocoapi/PythonAPI
    make
    cd ..
  3. Download the following data from http://cocodataset.org/#download (described below):

  • Under Annotations, download:

    • 2014 Train/Val annotations [241MB] (extract captions_train2014.json and captions_val2014.json, and place at locations cocoapi/annotations/captions_train2014.json and cocoapi/annotations/captions_val2014.json, respectively)
    • 2014 Testing Image info [1MB] (extract image_info_test2014.json and place at location cocoapi/annotations/image_info_test2014.json)
  • Under Images, download:

    • 2014 Train images [83K/13GB] (extract the train2014 folder and place at location cocoapi/images/train2014/)
    • 2014 Val images [41K/6GB] (extract the val2014 folder and place at location cocoapi/images/val2014/)
    • 2014 Test images [41K/6GB] (extract the test2014 folder and place at location cocoapi/images/test2014/)
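
Once everything is extracted, a quick way to confirm the layout is a small standard-library script. The sketch below is not part of this repo; the helper names `missing_coco_paths` and `captions_by_image` are hypothetical. It assumes the `cocoapi/` folder structure listed above and the standard COCO captions JSON layout, where the top-level `annotations` list holds entries with `image_id` and `caption` fields.

```python
import json
from collections import defaultdict
from pathlib import Path

# Paths relative to the cocoapi folder, as listed in the instructions above.
EXPECTED = [
    "annotations/captions_train2014.json",
    "annotations/captions_val2014.json",
    "annotations/image_info_test2014.json",
    "images/train2014",
    "images/val2014",
    "images/test2014",
]


def missing_coco_paths(root="cocoapi"):
    """Return the expected files or folders that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]


def captions_by_image(annotation_file):
    """Group caption strings by image id from a COCO captions JSON file."""
    with open(annotation_file, encoding="utf-8") as f:
        data = json.load(f)
    grouped = defaultdict(list)
    for ann in data["annotations"]:
        grouped[ann["image_id"]].append(ann["caption"])
    return dict(grouped)
```

Calling `missing_coco_paths()` from the directory that contains `cocoapi/` returns an empty list when all downloads are in place, and `captions_by_image("cocoapi/annotations/captions_val2014.json")` gives a dictionary from image id to its list of captions.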