Project author: PhaelIshall
Project description:
Implementation of different methods for weighting the importance of words for similarity computation in different settings. We used TF-IDF, which indicates how important a term is to the meaning of a document or collection, and Mutual Information, which indicates the strength of association between two variables, such as two words or a word and a type of document. We used the resulting weights in two downstream tasks to compare the performance of the two weighting methods.
Language: Python
Project URL: git://github.com/PhaelIshall/Computational-Linguistics.git
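
The two weighting schemes named in the description can be illustrated with a minimal sketch. This is not taken from the repository: the function names `tf_idf` and `pmi`, the raw-count times log(N/df) TF-IDF variant, and the use of pointwise mutual information estimated from document-level presence are all assumptions made for illustration. The project itself may use different formulas or smoothing.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: raw term count times log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights


def pmi(docs, labels, term, label):
    """Pointwise mutual information (base 2) between a term and a label.

    Probabilities are estimated over documents from term presence.
    Returns -inf when the term and the label never co-occur.
    """
    n = len(docs)
    has_term = [term in doc for doc in docs]
    has_label = [lab == label for lab in labels]
    p_term = sum(has_term) / n
    p_label = sum(has_label) / n
    p_joint = sum(t and l for t, l in zip(has_term, has_label)) / n
    if p_joint == 0:
        return float("-inf")
    return math.log2(p_joint / (p_term * p_label))
```

Under these assumptions, a term that occurs in every document receives a TF-IDF weight of zero, while a term concentrated in documents of one type receives a positive PMI score with that type.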