Project author: jiegzhan

Project description:
Classify Kaggle Consumer Finance Complaints into 11 classes. Build the model with a CNN (Convolutional Neural Network) and word embeddings on TensorFlow.
Language: Python
Project URL: git://github.com/jiegzhan/multi-class-text-classification-cnn.git
Created: 2016-10-30T05:27:57Z
Project community: https://github.com/jiegzhan/multi-class-text-classification-cnn

License: Apache License 2.0

Project: Classify Kaggle Consumer Finance Complaints

Highlights:

  • This is a multi-class text classification (sentence classification) problem.
  • The goal is to classify Kaggle Consumer Finance Complaints into 11 classes.
  • The model is built with a Convolutional Neural Network (CNN) and word embeddings on TensorFlow.
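To make the "CNN plus word embeddings" idea concrete, here is a minimal pure-Python sketch of the forward pass such a model typically performs: look up an embedding per token, slide filters of several widths over the sequence, apply ReLU, and keep the maximum of each feature map. This is not the repository's TensorFlow code. The function names, the toy embedding table, and the filter values are all made up for illustration.

```python
def embed(token_ids, embedding_table):
    """Look up one embedding vector per token id."""
    return [embedding_table[t] for t in token_ids]


def conv1d_relu(embedded, filt, bias=0.0):
    """Slide a (width x embed_dim) filter over the sequence, then apply ReLU."""
    width = len(filt)
    feature_map = []
    for start in range(len(embedded) - width + 1):
        total = bias
        for offset in range(width):
            for a, b in zip(embedded[start + offset], filt[offset]):
                total += a * b
        feature_map.append(max(0.0, total))
    return feature_map


def sentence_features(token_ids, embedding_table, filters):
    """One max-over-time pooled feature per filter."""
    embedded = embed(token_ids, embedding_table)
    return [max(conv1d_relu(embedded, f)) for f in filters]


# Hypothetical toy values: a 4-word vocabulary with 2-dimensional embeddings.
EMBEDDINGS = [
    [0.0, 0.0],  # id 0: padding
    [1.0, 0.0],  # id 1
    [0.0, 1.0],  # id 2
    [1.0, 1.0],  # id 3
]
FILTERS = [
    [[1.0, 0.0], [0.0, 1.0]],                 # width-2 filter
    [[0.5, 0.5], [-1.0, -1.0], [0.5, 0.5]],   # width-3 filter
]

features = sentence_features([1, 2, 3, 0], EMBEDDINGS, FILTERS)
print(features)  # [2.0, 0.5]
```

In a full classifier, the pooled features would feed a final layer that produces one score per class (11 here). In a trained model the embeddings and filters are learned rather than fixed.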

Data: Kaggle Consumer Finance Complaints

  • Input: consumer_complaint_narrative

    • Example: “someone in north Carolina has stolen my identity information and has purchased items including XXXX cell phones thru XXXX on XXXX/XXXX/2015. A police report was filed as soon as I found out about it on XXXX/XXXX/2015. A investigation from XXXX is under way thru there fraud department and our local police department.\n”
  • Output: product

    • Example: Credit reporting
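The input/output pairing above can be sketched with the standard library alone: read the `consumer_complaint_narrative` and `product` columns, skip rows without a narrative, and turn each product into a one-hot label. The sample rows and the helper names `load_examples` and `one_hot_labels` are hypothetical. Skipping empty narratives is an assumption about preprocessing, not something the project states.

```python
import csv
import io

# Tiny in-memory stand-in for the real CSV. The two column names come from the
# data description; the rows are invented for illustration.
SAMPLE_CSV = """product,consumer_complaint_narrative
Credit reporting,someone has stolen my identity information
Mortgage,my loan servicer misapplied a payment
Credit reporting,
Debt collection,a collector keeps calling about a debt I do not owe
"""


def load_examples(handle):
    """Return (texts, labels), skipping rows with an empty narrative."""
    texts, labels = [], []
    for row in csv.DictReader(handle):
        text = (row.get("consumer_complaint_narrative") or "").strip()
        if text:
            texts.append(text)
            labels.append(row["product"])
    return texts, labels


def one_hot_labels(labels):
    """Map each distinct label to a one-hot vector (classes sorted for determinism)."""
    classes = sorted(set(labels))
    index = {name: i for i, name in enumerate(classes)}
    vectors = [
        [1 if index[label] == i else 0 for i in range(len(classes))]
        for label in labels
    ]
    return classes, vectors


texts, labels = load_examples(io.StringIO(SAMPLE_CSV))
classes, y = one_hot_labels(labels)
```

With the real dataset, `classes` would hold the 11 product categories and each row of `y` would have 11 entries.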

Train:

  • Command: python3 train.py training_data.file parameters.json
  • Example: python3 train.py ./data/consumer_complaints.csv.zip ./parameters.json

    Training creates a new directory and saves the trained model in it.

Predict:

Provide the model directory (created when running train.py) and new data to predict.py.

  • Command: python3 predict.py ./trained_model_directory/ new_data.file
  • Example: python3 predict.py ./trained_model_1479757124/ ./data/small_samples.json
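Because each training run creates its own directory, it can be handy to pick the most recent one automatically. The sketch below assumes that directories follow the `trained_model_<number>` pattern seen in the example and that a larger number means a newer run (the number looks like a Unix timestamp, but the project does not say so). The helpers `latest_model_dir` and `build_predict_command` are hypothetical and not part of the repository.

```python
import re
from pathlib import Path

# Assumed naming pattern, inferred from the example directory name above.
MODEL_DIR_PATTERN = re.compile(r"^trained_model_(\d+)$")


def latest_model_dir(root):
    """Return the trained_model_<number> directory with the largest number, or None."""
    candidates = []
    for path in Path(root).iterdir():
        match = MODEL_DIR_PATTERN.match(path.name)
        if path.is_dir() and match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]


def build_predict_command(model_dir, data_file):
    """Build the argument list for predict.py, mirroring the documented command."""
    # The trailing slash copies the example; whether predict.py needs it is unknown.
    return ["python3", "predict.py", f"{model_dir}/", str(data_file)]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root to run the prediction.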
