Project author: Viral-Doshi

Project description:
Subjective Answer Finder using various NLP-based techniques
Primary language: Jupyter Notebook
Project URL: git://github.com/Viral-Doshi/Auto-Answering-NLP.git
Created: 2021-08-25T04:42:41Z
Project community: https://github.com/Viral-Doshi/Auto-Answering-NLP


Subjective Answer Writer platform

Objectives:

  • Keyword Extraction: to get the keywords that best define the document
  • Summarization: to generate paragraph-wise summaries of the document
  • Creating an NLP model that generates automatic subjective answers using Information Retrieval and Summarization techniques
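
As an illustration of the Keyword Extraction objective, here is a minimal sketch that ranks the terms of each paragraph by TF-IDF. It is not necessarily one of the methods used in this project; the function names, the tiny stopword list, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "by",
             "are", "it", "into", "on", "for", "from"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def tfidf_keywords(paragraphs, top_k=3):
    """Return up to top_k highest-scoring TF-IDF terms for each paragraph."""
    token_lists = [tokenize(p) for p in paragraphs]
    n = len(token_lists)
    # Document frequency: the number of paragraphs in which each term appears.
    df = Counter(term for tokens in token_lists for term in set(tokens))
    results = []
    for tokens in token_lists:
        if not tokens:
            results.append([])
            continue
        tf = Counter(tokens)
        scores = {
            # Smoothed IDF so that terms present in every paragraph still score > 0.
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_k])
    return results


paragraphs = [
    "Photosynthesis converts light energy into chemical energy in plants.",
    "Respiration releases energy from glucose in cells of plants and animals.",
]
keywords = tfidf_keywords(paragraphs, top_k=3)
```

Terms repeated within a paragraph are rewarded, while terms shared by every paragraph are down-weighted, so each paragraph keeps the words that set it apart.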

Model Architecture:

[Image: model architecture diagram]

Methodology and Work-Flow:

  • Step 1: Raw Text Data to Organized DataFrame
  • Step 2: Paragraph-wise Keyword Extraction
  • Step 3: Vectorizing Keywords to form Representative Vectors for paragraphs
  • Step 4: Summarizing Paragraphs to generate fixed length Answers
  • Step 5: Query Question to Vector
  • Step 6: Scoring Function to calculate Paragraph Scores
  • Step 7: Selecting Best Answer based on Final Scores
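
Steps 3, 5, 6 and 7 can be sketched roughly as follows. The workflow above does not specify the scoring function, so this example assumes cosine similarity between averaged word vectors. `TOY_VECTORS` is a hypothetical three-dimensional stand-in for real GloVe vectors, and all function names are illustrative.

```python
import math

# Hypothetical 3-dimensional word vectors standing in for GloVe embeddings.
TOY_VECTORS = {
    "plant": [0.9, 0.1, 0.0],
    "leaf": [0.8, 0.2, 0.1],
    "sunlight": [0.7, 0.3, 0.0],
    "engine": [0.0, 0.9, 0.2],
    "fuel": [0.1, 0.8, 0.3],
    "piston": [0.0, 0.7, 0.4],
}


def mean_vector(words, vectors):
    """Average the vectors of the known words; return None if none are known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def best_paragraph(question_words, paragraph_keywords, vectors):
    """Return (index, score) of the best-matching paragraph, or None."""
    q = mean_vector(question_words, vectors)      # Step 5: question to vector
    if q is None or not paragraph_keywords:
        return None
    scored = []
    for idx, keywords in enumerate(paragraph_keywords):
        p = mean_vector(keywords, vectors)        # Step 3: representative vector
        score = cosine(q, p) if p is not None else 0.0   # Step 6: scoring
        scored.append((score, idx))
    best_score, best_idx = max(scored, key=lambda s: s[0])  # Step 7: selection
    return best_idx, best_score


paragraph_keywords = [["plant", "leaf", "sunlight"], ["engine", "fuel", "piston"]]
result = best_paragraph(["fuel", "engine"], paragraph_keywords, TOY_VECTORS)
```

In a full pipeline, the summary generated in Step 4 for the selected paragraph would be returned as the answer.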

Source Code:

  • This file has Analysis and Visualizations of the text document we are working with.
  • This file contains the Paragraph-wise Keyword Extraction using 6 different methods.
  • This file contains the Paragraph-wise Summarization using 5 different methods
  • This file is the final implementation of Subjective Answer Finder. The last cell contains a small GUI-like interface.
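
The summarization methods themselves are not named here, so as a rough illustration only, the sketch below shows one simple extractive approach: score each sentence by the average frequency of its words in the paragraph and keep the top few in their original order. The function name and the `max_sentences` parameter are assumptions made for the example.

```python
import re
from collections import Counter


def summarize(paragraph, max_sentences=2):
    """Keep the max_sentences highest-scoring sentences, in original order."""
    # Naive sentence split on whitespace that follows ., ! or ?
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]
    # Word frequencies over the whole paragraph.
    freq = Counter(re.findall(r"[a-z]+", paragraph.lower()))

    def score(sentence):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Rank sentence indices by descending score; earlier sentences win ties.
    ranked = sorted(range(len(sentences)), key=lambda i: (-score(sentences[i]), i))
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


text = "Cells use energy. Energy comes from food. The weather was nice."
summary = summarize(text, max_sentences=2)
```

Capping the number of sentences is one way to obtain answers of roughly fixed length, as required by Step 4 of the workflow.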

Requirements

  • Please install the required dependencies.
  • Download the GloVe encodings from here and place them in the same directory.
  • Text format of my document is as follows:
    • Chapter Name on the first line followed by a blank line.
    • Paragraph-title followed by the paragraph description.
    • An empty line after completion of each paragraph.
    • 2 empty lines at the end of chapter before the Question/Answer section.
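
A file in this format could be parsed along the following lines. This is a sketch, not the notebook's actual code: the function name and the returned dictionary layout are hypothetical, and it assumes each paragraph title occupies exactly one line and that the Question/Answer section starts after the first run of two empty lines.

```python
def parse_chapter(text):
    """Split a chapter into its name, titled paragraphs, and trailing Q/A text."""
    lines = text.replace("\r\n", "\n").split("\n")
    chapter = lines[0].strip()
    body = "\n".join(lines[1:]).strip("\n")
    # Two empty lines (three consecutive newlines) precede the Q/A section.
    main, _, qa = body.partition("\n\n\n")
    paragraphs = []
    # A single empty line separates one paragraph from the next.
    for block in main.split("\n\n"):
        block_lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
        if not block_lines:
            continue
        paragraphs.append({
            "title": block_lines[0],             # first line is the paragraph title
            "text": " ".join(block_lines[1:]),   # remaining lines are the description
        })
    return {"chapter": chapter, "paragraphs": paragraphs, "qa": qa.strip()}


sample = (
    "Photosynthesis\n"
    "\n"
    "Overview\n"
    "Plants convert light into chemical energy.\n"
    "\n"
    "Chlorophyll\n"
    "Chlorophyll absorbs light.\n"
    "\n"
    "\n"
    "Q1. What is photosynthesis?\n"
)
parsed = parse_chapter(sample)
```

The list in `parsed["paragraphs"]` can be passed to `pandas.DataFrame` to obtain the organized DataFrame of Step 1, with one row per paragraph.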

Conclusion:

  • This method can be used effectively for Information Retrieval, to obtain relevant information from large text documents
  • This Auto-Answering Model can also be used to find subjective answers to given questions from textbooks
  • Accuracy ~ 75% (spaCy + Model1)
  • Many NLP-based tasks such as Keyword Extraction, Vectorization and Summarization are performed, each of which has many individual applications