Project author: cloudkj

Project description: Syllable counting and detection using an n-gram language model.
Language: Clojure
Project URL: git://github.com/cloudkj/ngram-syllables.git
Created: 2016-12-10T07:53:41Z
Project community: https://github.com/cloudkj/ngram-syllables

License: MIT License


ngram-syllables

Syllable counting and detection using an n-gram language model.

Usage

Training

    Usage: lein run -m ngram-syllables.train [options] corpus

    Options:
      -h, --help
      -n, --n GRAMS      1                 Number of grams
      -o, --output FILE  target/model.edn  Path to desired output location of model
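
The corpus format and the exact tokenization are not described in this README, so the following is only a rough sketch of what "counting n-gram sequences" could look like. It is written in Python for illustration and is not the project's Clojure code. It assumes a hypothetical corpus of hyphenated words (such as `but-ter`), tags each letter with whether a syllable boundary follows it, and counts every n-gram of those tagged letters up to the chosen `n`. The names `tag_letters` and `count_ngrams` are made up for this example.

```python
from collections import Counter


def tag_letters(hyphenated):
    """Turn 'but-ter' into [('b', False), ('u', False), ('t', True), ...].

    The boolean marks a letter that is followed by a syllable boundary.
    The hyphenated input format is an assumption made for this sketch.
    """
    tokens = []
    for ch in hyphenated:
        if ch == "-":
            if tokens:
                letter, _ = tokens[-1]
                tokens[-1] = (letter, True)
        else:
            tokens.append((ch, False))
    return tokens


def count_ngrams(words, max_n):
    """Count all n-grams of tagged letters for n = 1 .. max_n."""
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for word in words:
        tokens = tag_letters(word)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[n][tuple(tokens[i:i + n])] += 1
    return counts


# Two toy training words; a real corpus would be far larger.
counts = count_ngrams(["but-ter", "met-a"], 2)
```

Keeping one table per order, rather than only the highest order, is what would allow the lower-order counts to be blended in at prediction time.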

Predictions

    Usage: lein run -m ngram-syllables.predict [options] weight_1 ... weight_n

    Options:
      -d, --delim DELIM  Empty space       Output syllable delimiter
      -h, --help
      -m, --model FILE   target/model.edn  Path to location of model
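
The README does not say how the `weight_1 ... weight_n` arguments are used. One common reading, assumed here, is linear interpolation: `weight_i` scales the estimate from the i-gram table, and the weighted estimates are summed into a single score. The Python sketch below illustrates only that idea; the function name `interpolate` and the probability values are hypothetical, and the project's actual scoring may differ.

```python
def interpolate(estimates, weights):
    """Weighted sum of per-order probability estimates.

    estimates[i] is the estimate from the (i + 1)-gram table and
    weights[i] is the matching command-line weight (an assumption).
    """
    if len(estimates) != len(weights):
        raise ValueError("need exactly one weight per n-gram order")
    return sum(w * p for w, p in zip(weights, estimates))


# Hypothetical 1-, 2-, and 3-gram estimates for one candidate boundary,
# combined with the weights 0.1 0.1 0.8.
score = interpolate([0.2, 0.5, 0.9], [0.1, 0.1, 0.8])
```

Under this reading, putting most of the weight on the highest order favors the most specific context while still leaving some mass for shorter contexts when a 3-gram was never seen in training.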

Example

Generate syllable boundaries for some words not in the training corpus.

    % ./train.sh
    Training model with n = 3
    17490 1-gram sequences
    17489 2-gram sequences
    7434 3-gram sequences
    Output: target/model.edn
    % head -n 20 resources/pokemon_names.txt | ./predict.sh --delim · 0.1 0.1 0.8
    bulb·a·saur
    i·vy·saur
    ven·u·saur
    char·man·der
    char·mel·e·on
    char·i·zard
    squirt·le
    war·tor·tle
    blast·o·ise
    ca·ter·pie
    met·a·pod
    but·ter·free
    weed·le
    ka·ku·na
    bee·drill
    pid·gey
    pid·ge·ot·to
    pid·ge·ot
    rat·ta·ta
    ra·ti·cate

License

Copyright © 2016-2017