Use end-to-end memory networks architecture for Question & Answering NLP system
This project uses an end-to-end memory network architecture to build a chatbot model able to answer simple questions about a text corpus (a 'story'). The learned memory representations let the model perform simple logical deduction over the memorized corpus. The model is written in Keras.
The project uses the bAbI dataset from Facebook Research. The dataset is available here. The bAbI dataset is composed of several sets supporting 20 tasks for testing text understanding and reasoning as part of the bAbI project. Each task is designed to test a unique aspect of text understanding and reasoning, and hence to probe different capabilities of learning models. The datasets are in English.
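The raw bAbI files use a simple line-numbered format: each line starts with an index, the index resets to 1 at the start of a new story, and question lines carry the question, the answer, and the supporting fact ids separated by tabs. A minimal parsing sketch (the function name and structure are illustrative, not the project's actual code):

```python
def parse_babi(lines):
    """Parse raw bAbI lines into (story, question, answer) triples."""
    data, story = [], []
    for line in lines:
        nid, text = line.strip().split(' ', 1)
        if int(nid) == 1:
            story = []                       # index 1 marks the start of a new story
        if '\t' in text:                     # question lines: question \t answer \t supporting ids
            question, answer, _ = text.split('\t')
            data.append((list(story), question.strip(), answer))
        else:
            story.append(text)
    return data

sample = [
    "1 Mary moved to the bathroom.",
    "2 John went to the hallway.",
    "3 Where is Mary? \tbathroom\t1",
]
print(parse_babi(sample))
```

Each question keeps a copy of the story accumulated so far, so one file yields one training example per question line.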
This deep learning architecture was published in 2015 (Sukhbaatar et al., "End-To-End Memory Networks") and you can refer to the original paper for a detailed description. The architecture shares some early principles with attention models.
The model takes two different inputs: a story (a list of sentences that together contain the facts needed to answer) and a question. The model must take the entire story context into consideration to answer the query, which is where end-to-end memory networks come in handy.
The model performs a series of calculations to combine these inputs and predict the answer. We can split the network into several functions:
Calculation steps:
Memory Networks model representation:
All parameters (embeddings, weight matrix to determine predicted answer) are learned during training.
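The calculation steps above can be sketched as a single-hop memory network in Keras. This is a simplified illustration, not the project's exact code: the sizes are placeholder assumptions (only the embedding size of 128 comes from the text), and the question is encoded as a plain bag-of-words sum.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative sizes (assumptions, not the project's exact values),
# except embed_dim = 128 which matches the hyperparameters in the text.
vocab_size, story_len, query_len, embed_dim = 50, 20, 6, 128

story = layers.Input(shape=(story_len,), name="story")
question = layers.Input(shape=(query_len,), name="question")

# Embedding A: story words -> memory vectors m_i
m = layers.Embedding(vocab_size, embed_dim)(story)        # (batch, story_len, d)
# Embedding C: story words -> output vectors c_i
c = layers.Embedding(vocab_size, embed_dim)(story)        # (batch, story_len, d)
# Embedding B: question -> internal state u (bag-of-words sum)
u = layers.Embedding(vocab_size, embed_dim)(question)
u = layers.Lambda(lambda t: tf.reduce_sum(t, axis=1))(u)  # (batch, d)

# Attention: match each memory vector against the question state
scores = layers.Dot(axes=(2, 1))([m, u])                  # (batch, story_len)
p = layers.Activation("softmax")(scores)

# Response: attention-weighted sum of the output vectors
o = layers.Dot(axes=(1, 1))([p, c])                       # (batch, d)

# Predicted answer: softmax over the vocabulary from o + u
answer = layers.Dense(vocab_size, activation="softmax")(layers.Add()([o, u]))

model = models.Model([story, question], answer)
model.summary()
```

All the trainable parameters mentioned above appear here: the three embeddings (A, B, C) and the final weight matrix inside the `Dense` layer.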
Model limitation: the whole vocabulary must be known at training time. Only words that appear in the corpus (training and testing sets) can be used during inference.
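Because inference is restricted to the words indexed at training time, the word index is typically built over every word seen in both splits. A hedged sketch of this step, with toy data standing in for the parsed bAbI triples:

```python
# Toy (story, question, answer) triples standing in for the parsed bAbI splits
train = [(["Mary moved to the bathroom ."], "Where is Mary ?", "bathroom")]
test = [(["John went to the hallway ."], "Where is John ?", "hallway")]

# Build the vocabulary over BOTH splits: any word absent here
# cannot be represented at inference time.
vocab = set()
for story, question, answer in train + test:
    for sentence in story:
        vocab.update(sentence.split())
    vocab.update(question.split())
    vocab.add(answer)

# Reserve index 0 for sequence padding
word_idx = {w: i + 1 for i, w in enumerate(sorted(vocab))}
print(len(word_idx))
```

A word missing from `word_idx` simply has no embedding row, which is exactly the limitation described above.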
The model trains quickly: 120 epochs with RMSprop at a learning rate of 0.01. Other hyperparameters: embedding size of 128, batch size of 256. Accuracy on unseen test data exceeds 97%.
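The training setup described above could be reproduced along these lines. To keep the snippet self-contained, a trivial stand-in model with the same two-input interface replaces the memory network; only the optimizer, learning rate, batch size, and epoch count come from the text.

```python
import numpy as np
from tensorflow.keras import layers, models, optimizers

# Stand-in model with the same (story, question) -> answer interface
# as the memory network (sizes are illustrative assumptions).
story = layers.Input(shape=(20,))
question = layers.Input(shape=(6,))
h = layers.Concatenate()([story, question])
out = layers.Dense(50, activation="softmax")(h)
model = models.Model([story, question], out)

# Hyperparameters from the text: RMSprop with lr = 0.01
model.compile(
    optimizer=optimizers.RMSprop(learning_rate=0.01),
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# Toy random data just to show the call signature
stories = np.random.randint(0, 50, size=(512, 20)).astype("float32")
questions = np.random.randint(0, 50, size=(512, 6)).astype("float32")
answers = np.random.randint(0, 50, size=(512,))

history = model.fit(
    [stories, questions], answers,
    batch_size=256, epochs=2,  # the project uses epochs=120; 2 here for brevity
    verbose=0,
)
```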
The model makes excellent predictions even on complex stories.
Story:
Daniel went to the office.