CWS/POS/NER

Chinese word segmentation (CWS), part-of-speech (POS) tagging, and medical named entity recognition (NER), implemented from scratch.

Getting Started

Dependencies:

tensorflow

# training, testing and evaluation
python3 run.py

Generated files:

Evaluation.md - markdown table of evaluation results
Result/ - prediction results
FinalResult/ - final prediction results

Structure

├── Data         => data set given by TA
│   ├── devset
│   ├── testset1
│   └── trainset
├── Evaluation   => eval scripts given by TA
│
├── CWS          => CWS model
├── POS          => POS tagging model
├── NER          => NER model
│
├── constant.py  => some global constants and variables
│
├── dataset.py   => data preprocessing
├── model.py     => high-level model API for all our models
├── evaluate.py  => high-level evaluation API
└── run.py       => runs the entire process (training, testing, evaluation)

CWS
POS
NER (TODO)

Task Description

Data and scripts given by TA

Directory Structure

Data: (each split has its own _cws, _pos, and _ner files)
- devset
- testset1
- trainset
- final
  - test2.txt - raw article
Evaluation:
- pos_evaluate.py
- ner_evaluate.py

Resources

Article

Paper

Sequence Tagging

Bidirectional LSTM-CRF Models for Sequence Tagging
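
Word segmentation is commonly cast as per-character sequence tagging, which is what makes a tagger such as the BiLSTM-CRF applicable to it. The sketch below illustrates one widely used scheme, BMES (Begin, Middle, End, Single). It is an assumption for illustration only: this README does not state which tag scheme the CWS model uses, and the function names `words_to_bmes` and `bmes_to_words` are hypothetical.

```python
def words_to_bmes(words):
    """Convert a list of segmented words into one BMES tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        elif len(word) > 1:
            # first char B, last char E, everything in between M
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def bmes_to_words(chars, tags):
    """Rebuild words from characters and their BMES tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(current)
            current = ""
    if current:  # tolerate a sequence that ends mid-word
        words.append(current)
    return words


if __name__ == "__main__":
    words = ["我", "喜欢", "自然语言"]
    tags = words_to_bmes(words)
    print(tags)
    print(bmes_to_words("".join(words), tags))
```

Converting labels in both directions like this lets the training data (segmented words) feed a tagger and lets tagger output be turned back into words for evaluation.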

Chinese Word Segmentation

Tool References

pkuseg

ACM Digital Library - Fast online training with frequency-adaptive learning rates for Chinese word segmentation and new word detection

@inproceedings{DBLP:conf/acl/SunWL12,
author = {Xu Sun and Houfeng Wang and Wenjie Li},
title = {Fast Online Training with Frequency-Adaptive Learning Rates for Chinese Word Segmentation and New Word Detection},
booktitle = {The 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea- Volume 1: Long Papers},
pages = {253--262},
year = {2012}}

CRF

tensorflow/contrib/crf
CRFsuite - A fast implementation of Conditional Random Fields (CRFs)
- chokkan/crfsuite
sklearn-crfsuite
- TeamHG-Memex/sklearn-crfsuite
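
For readers new to CRFs, a linear-chain CRF picks the tag sequence with the highest total score, where the score sums per-position emission scores and tag-to-tag transition scores. The sketch below shows Viterbi decoding in plain Python under that assumption. It is independent of the libraries listed above and is not taken from this repository; `viterbi_decode`, `emissions`, and `transitions` are illustrative names.

```python
def viterbi_decode(emissions, transitions):
    """Return (best_path, best_score) for a linear-chain model.

    emissions:   list of T lists, emissions[t][k] = score of tag k at position t
    transitions: K x K list, transitions[i][j] = score of moving from tag i to tag j
    """
    num_tags = len(transitions)
    score = list(emissions[0])  # best score of any path ending in each tag
    backpointers = []
    for emit in emissions[1:]:
        new_score, pointers = [], []
        for j in range(num_tags):
            # best previous tag for landing on tag j at this position
            best_prev = max(range(num_tags),
                            key=lambda i: score[i] + transitions[i][j])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][j] + emit[j])
        score = new_score
        backpointers.append(pointers)
    # follow the backpointers from the best final tag
    best_last = max(range(num_tags), key=lambda j: score[j])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, score[best_last]


if __name__ == "__main__":
    emissions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
    sticky = [[0.0, -5.0], [-5.0, 0.0]]  # penalize switching tags
    print(viterbi_decode(emissions, sticky))
```

With the switching penalty, the decoder keeps tag 0 throughout even though position 1 prefers tag 1 locally, which is the kind of sequence-level consistency a CRF layer adds on top of per-token scores.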
