Project author: aayux

Project description:
A text classification model with pretrained GloVe embeddings
Language: Python
Repository: git://github.com/aayux/glove-text-cnn.git
Created: 2017-12-22T11:09:30Z
Project community: https://github.com/aayux/glove-text-cnn

License: Apache License 2.0


Text Classification with Sentence Level Convolutional Neural Networks

About Model

A deep convolutional neural network architecture for text classification, based on TextCNN [1], with pretrained GloVe embeddings.
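To make the architecture concrete, here is a minimal NumPy sketch of the feature extractor in a TextCNN-style model: embedding lookup, one convolution per filter size, ReLU, max-over-time pooling, and concatenation. It is an illustration under stated assumptions, not the repository's TensorFlow code: the function name `text_cnn_features`, the random weights, and the omission of biases, dropout, and the output layer are all choices made for brevity here.

```python
import numpy as np

def text_cnn_features(token_ids, embeddings, filters):
    """Map one sentence (a sequence of token ids) to a pooled feature vector.

    `filters` maps a filter height h to a weight tensor of shape
    (h, embedding_dim, num_filters). Biases are omitted in this sketch.
    """
    x = embeddings[token_ids]  # (seq_len, embedding_dim)
    pooled = []
    for h, w in sorted(filters.items()):
        # Collect every window of h consecutive word vectors.
        windows = np.stack([x[i:i + h] for i in range(len(x) - h + 1)])
        # Contract the (h, embedding_dim) axes: result is (n_windows, num_filters).
        conv = np.tensordot(windows, w, axes=([1, 2], [0, 1]))
        conv = np.maximum(conv, 0.0)      # ReLU
        pooled.append(conv.max(axis=0))   # max-over-time pooling
    return np.concatenate(pooled)

# Hypothetical sizes chosen to mirror the documented defaults.
rng = np.random.default_rng(0)
vocab_size, embedding_dim, num_filters = 50, 300, 128
embeddings = rng.normal(size=(vocab_size, embedding_dim))
filters = {h: rng.normal(scale=0.1, size=(h, embedding_dim, num_filters))
           for h in (3, 4, 5)}
sentence = rng.integers(0, vocab_size, size=20)

features = text_cnn_features(sentence, embeddings, filters)
print(features.shape)  # (384,)
```

With three filter sizes and 128 filters each, every sentence is reduced to a 384-dimensional vector regardless of its length, which is what a final classification layer would consume.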

How to run

Generate the embedding matrix and vocabulary from the pretrained GloVe vectors:

python3 generate_embeddings.py -d data/glove.42B.300d.txt --npy_output data/embeddings.npy --dict_output data/vocab.pckl --dict_whitelist data/aclImdb/imdb.vocab
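The command above turns a GloVe text file into a NumPy matrix and a pickled vocabulary, restricted to a whitelist of words. What that conversion might look like is sketched below. This is not the contents of `generate_embeddings.py`: the helper `build_embeddings` is hypothetical, the sketch assumes each GloVe line is a word followed by space-separated floats, and it adds no padding or unknown-word rows, which the real script may well do.

```python
import os
import pickle
import tempfile

import numpy as np

def build_embeddings(glove_lines, whitelist=None):
    """Parse GloVe-style text lines into (matrix, word -> row index)."""
    vocab, vectors = {}, []
    for line in glove_lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if whitelist is not None and word not in whitelist:
            continue  # skip words outside the whitelist
        vocab[word] = len(vectors)
        vectors.append(np.array([float(v) for v in values], dtype=np.float32))
    return np.stack(vectors), vocab

# Tiny in-memory stand-in for the GloVe file (3 dimensions instead of 300).
sample = ["the 0.1 0.2 0.3", "movie 0.4 0.5 0.6", "zebra 0.7 0.8 0.9"]
matrix, vocab = build_embeddings(sample, whitelist={"the", "movie"})

# Write and reload the two outputs, mirroring --npy_output and --dict_output.
with tempfile.TemporaryDirectory() as tmp:
    np.save(os.path.join(tmp, "embeddings.npy"), matrix)
    with open(os.path.join(tmp, "vocab.pckl"), "wb") as f:
        pickle.dump(vocab, f)
    reloaded = np.load(os.path.join(tmp, "embeddings.npy"))

print(matrix.shape, vocab)
```

Filtering against a whitelist keeps the saved matrix small, since only words that can occur in the dataset need a row.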

  • Start training with python3 train.py

    optional arguments:
      -h, --help            show this help message and exit
      --embedding_dim EMBEDDING_DIM
                            Dimensionality of character embedding (default: 300)
      --filter_sizes FILTER_SIZES
                            Comma-separated filter sizes (default: '3,4,5')
      --num_filters NUM_FILTERS
                            Number of filters per filter size (default: 128)
      --l2_reg_lambda L2_REG_LAMBDA
                            L2 regularization lambda (default: 0.0)
      --dropout_keep_prob DROPOUT_KEEP_PROB
                            Dropout keep probability (default: 0.5)
      --batch_size BATCH_SIZE
                            Batch Size (default: 128)
      --num_epochs NUM_EPOCHS
                            Number of training epochs (default: 100)
      --evaluate_every EVALUATE_EVERY
                            Evaluate model on dev set after this many steps
                            (default: 500)
      --checkpoint_every CHECKPOINT_EVERY
                            Save model after this many steps (default: 1000)
      --num_checkpoints NUM_CHECKPOINTS
                            Number of checkpoints to store (default: 3)
      --allow_soft_placement ALLOW_SOFT_PLACEMENT
                            Allow device soft device placement
      --noallow_soft_placement
      --log_device_placement LOG_DEVICE_PLACEMENT
                            Log placement of ops on devices
      --nolog_device_placement
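As a rough guide to how the step-based options interact, the sketch below parses a comma-separated `--filter_sizes` value and lists the steps at which evaluation and checkpointing would fire. It assumes `train.py` triggers both on a simple "step is a multiple of N" rule, as the TextCNN code in [1] does; the helpers `parse_filter_sizes` and `schedule` are hypothetical and are not taken from the repository.

```python
def parse_filter_sizes(spec):
    """Turn a comma-separated flag value such as '3,4,5' into [3, 4, 5]."""
    return [int(s) for s in spec.split(",")]

def schedule(total_steps, evaluate_every=500, checkpoint_every=1000):
    """Return the steps at which evaluation and checkpointing would fire."""
    steps = range(1, total_steps + 1)
    evals = [s for s in steps if s % evaluate_every == 0]
    saves = [s for s in steps if s % checkpoint_every == 0]
    return evals, saves

filter_sizes = parse_filter_sizes("3,4,5")
evals, saves = schedule(total_steps=2000)
print(filter_sizes)  # [3, 4, 5]
print(evals)         # [500, 1000, 1500, 2000]
print(saves)         # [1000, 2000]
```

Under that assumption, the default values evaluate on the dev set twice for every checkpoint written.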

Sources

[1] TextCNN: dennybritz/cnn-text-classification-tf

[2] Data Helpers: rampage644/qrnn