Project author: AidenWilliams

Project description:
An NGram Language Model
Primary language: HTML
Repository: git://github.com/AidenWilliams/Building-a-Language-Model.git
Created: 2021-03-07T21:46:03Z
Project page: https://github.com/AidenWilliams/Building-a-Language-Model


Building-a-Language-Model

Setup

For this assignment I wrote the Python package LanguageModel; code documentation and explanations are
included as docstrings inside the code. My particular coding and design choices are described in a Markdown cell with
the heading Coding Decisions. I am using the Maltese [1] corpus dataset for this assignment
and Python 3.7.
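
For readers unfamiliar with the idea behind an n-gram language model, the following is a minimal, illustrative sketch of n-gram counting and a maximum-likelihood probability estimate. It is not the LanguageModel package's code or API: the function names `ngrams`, `count_ngrams` and `mle_probability`, the `<s>` and `</s>` padding tokens, and the toy English sentences are all assumptions made for illustration, and the package itself may handle tokenisation, padding and smoothing differently.

```python
from collections import Counter
from typing import Iterator, List, Tuple

NGram = Tuple[str, ...]


def ngrams(tokens: List[str], n: int) -> Iterator[NGram]:
    """Yield every run of n consecutive tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


def count_ngrams(sentences: List[List[str]], n: int) -> Counter:
    """Count n-grams over tokenised sentences.

    Each sentence is padded with n-1 start markers and one end marker
    (hypothetical tokens, chosen only for this sketch).
    """
    counts: Counter = Counter()
    for sentence in sentences:
        padded = ["<s>"] * (n - 1) + sentence + ["</s>"]
        counts.update(ngrams(padded, n))
    return counts


def mle_probability(ngram: NGram, counts: Counter) -> float:
    """Unsmoothed estimate: count(ngram) / count(all n-grams sharing its context)."""
    context = ngram[:-1]
    context_total = sum(c for g, c in counts.items() if g[:-1] == context)
    return counts[ngram] / context_total if context_total else 0.0


# Toy data, not taken from the Maltese corpus.
sentences = [["the", "dog", "eats"], ["the", "cat", "eats"]]
counts = count_ngrams(sentences, 2)
print(mle_probability(("the", "dog"), counts))  # 0.5
```

In this toy example "the" is followed once by "dog" and once by "cat", so the unsmoothed bigram estimate for "dog" given "the" is 1/2.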

I have also included an HTML file generated from the Jupyter notebook, and I recommend viewing that instead of running
the Jupyter server. Alternatively, the JetBrains PyCharm IDE, which I used, also renders the Markdown components neatly.

Included is a requirements.txt file that lists the external libraries used in this assignment. To install them
with pip, run:

sudo pip install -r requirements.txt

Omit sudo if you are using Windows.
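
As an optional alternative that behaves the same on Linux, macOS and Windows, the pip command above can be issued through the Python interpreter you intend to use, which installs the libraries into that interpreter's environment. This is only a sketch: the helper names `pip_install_command` and `install_requirements` are hypothetical and not part of this project, and the sketch assumes requirements.txt is in the current working directory.

```python
import subprocess
import sys
from typing import List


def pip_install_command(requirements: str = "requirements.txt") -> List[str]:
    """Build a pip command bound to the interpreter running this script."""
    # sys.executable is the path of the current Python interpreter, so the
    # packages are installed for that interpreter rather than a system pip.
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def install_requirements(requirements: str = "requirements.txt") -> None:
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(requirements))
```

Calling `install_requirements()` from the project root runs the install; whether elevated permissions are still needed depends on where that interpreter keeps its packages.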

The file structure is as follows:

  Building a Language Model
  |
  +--Language Model
  |  |
  |  +-- __init__.py
  |  +-- Corpus.py
  |  +-- NGramCounts.py
  |  +-- NGRamModel.py
  +--Maltese
  |  |
  |  +-- various txt files (Not included in git/submission)
  +--Religion
  |  |
  |  +-- two txt files (Not included in git/submission)
  +--Sports
  |  |
  |  +-- two txt files (Not included in git/submission)
  +--Test Corpus
  |  |
  |  +-- Test.txt
  +--.gitignore
  +--README.md
  +--Building a Language Model.ipynb
  +--Building a Language Model.html
  +--Building a Language Model.pdf
  +--Plagiarism form.pdf
  +--requirements.txt

This project has also been uploaded to GitHub at:
https://github.com/AidenWilliams/Building-a-Language-Model