Project author: undertheseanlp

Project description:
Vietnamese Chunking experiments
Language: Python
Project URL: git://github.com/undertheseanlp/chunking.git
Created: 2017-05-25T11:10:52Z
Project community: https://github.com/undertheseanlp/chunking



Underthesea Chunking

This repository contains experiments on Vietnamese chunking. It is part of the underthesea project.

Corpus Summary

  1. Sentences : 7855
  2. Unique words : 14245
  3. Top words : ,, ., ", của, là, và, có, một, người, được, không, đã, những, cho, :, ..., ở, trong, với, đến
  4. POS Tags (28): A, Ab, C, CH, Cb, Cc, E, Eb, I, L, M, Mb, N, Nb, Nc, Np, Nu, Ny, P, Pb, R, T, V, Vb, Vy, X, Y, Z
  5. Chunking Tags (21): B-AP, B-MP, B-NP, B-PP, B-QP, B-TP, B-VP, B-WH, B-WP, B-XP, I-AP, I-MP, I-NP, I-PP, I-QP, I-VP, I-WH, I-WP, I-XP, N-NP, O
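The chunking tags above follow a B-/I-/O pattern: B-X opens a chunk of type X, I-X continues it, and O marks a token outside any chunk. The sketch below shows one way such a tag sequence could be grouped into chunks. It is an illustration only, not code from this repository: the function name `bio_to_chunks` and the sample sentence are hypothetical, and treating any tag without a B or I prefix (such as N-NP) as outside a chunk is an assumption.

```python
def bio_to_chunks(tokens, tags):
    """Group tokens into (chunk_type, [tokens]) spans from B-/I-/O tags.

    Hypothetical helper, not part of this repository.
    Assumption: tags whose prefix is neither B nor I (e.g. O, N-NP)
    are treated as outside any chunk.
    """
    chunks = []
    current_type, current_tokens = None, []
    for token, tag in zip(tokens, tags):
        prefix, _, chunk_type = tag.partition("-")
        # An I-X tag extends the open chunk only if the types match.
        if prefix == "I" and current_tokens and chunk_type == current_type:
            current_tokens.append(token)
            continue
        # Otherwise close whatever chunk is open.
        if current_tokens:
            chunks.append((current_type, current_tokens))
        if prefix in ("B", "I") and chunk_type:
            # B-X (or a stray I-X) starts a new chunk of type X.
            current_type, current_tokens = chunk_type, [token]
        else:
            current_type, current_tokens = None, []
    if current_tokens:
        chunks.append((current_type, current_tokens))
    return chunks


# Made-up example sentence, not taken from the corpus.
tokens = ["Tôi", "đang", "học", "tiếng", "Việt", "."]
tags = ["B-NP", "B-VP", "I-VP", "B-NP", "I-NP", "O"]
print(bio_to_chunks(tokens, tags))
```

For the sample above, the function returns three chunks: an NP ("Tôi"), a VP ("đang học"), and an NP ("tiếng Việt"); the final period is tagged O and is left out.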

Usage

Setup Environment

  # clone project
  $ git clone git@github.com:magizbox/underthesea.chunking.git
  # create environment
  $ cd underthesea.chunking
  $ conda create -n uts.chunking python=3.4
  # activate the environment so that pip installs into it
  $ source activate uts.chunking
  $ pip install -r requirement.txt

Run Experiments

  $ cd underthesea.chunking
  $ source activate uts.chunking
  $ python main.py

Last update: October 2017