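I have a set of words:
{company, dog, cat, animal, company, phone, permissions, vehicle, seat, lightweight, rules, residents, expertise}
I want to compute the semantic similarity between every pair of words in the set above. I have one problem:
Some of the words are incomplete, such as "vehicle". How can I ignore these words? Sample code (from "Python: Passing variables into Wordnet Synsets methods in NLTK"):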
import nltk.corpus as corpus
import itertools as IT
import fileinput

if __name__=="__main__":
    wordnet = corpus.wordnet
    list1 = ["apple", "honey", "drinks", "flowers", "paper"]
    list2 = ["pear", "shell", "movie", "fire", "tree"]

    for word1, word2 in IT.product(list1, list2):
        #print(word1, word2)
        wordFromList1 = wordnet.synsets(word1)[0]
        wordFromList2 = wordnet.synsets(word2)[0]
        print('{w1}, {w2}: {s}'.format(
            w1 = wordFromList1.name,
            w2 = wordFromList2.name,
            s = wordFromList1.wup_similarity(wordFromList2)))
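Suppose I add "vehicle" to either list. I get the following error:
IndexError: list index out of range
How can I use this error to ignore words that do not exist in the database?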
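One possible approach, as a rough sketch rather than a definitive answer: wordnet.synsets(word) returns an empty list for a word WordNet does not know, so indexing it with [0] is what raises the IndexError. Catching that error and skipping the word would look something like the code below. The names first_synset, pairwise_similarity, demo_with_wordnet and the lookup parameter are made up for illustration, "vehicl" is a hypothetical stand-in for an incomplete word, and the demo assumes NLTK 3 or later (where name() is a method) with the wordnet corpus downloaded.

```python
import itertools


def first_synset(word, lookup):
    """Return the first synset for `word`, or None if the lookup finds nothing."""
    try:
        return lookup(word)[0]
    except IndexError:
        # lookup() returned an empty list: the word is not in the database.
        return None


def pairwise_similarity(list1, list2, lookup):
    """Yield (synset1, synset2, score) for each pair in which both words are known."""
    for word1, word2 in itertools.product(list1, list2):
        syn1 = first_synset(word1, lookup)
        syn2 = first_synset(word2, lookup)
        if syn1 is None or syn2 is None:
            continue  # ignore pairs that contain an unknown word
        yield syn1, syn2, syn1.wup_similarity(syn2)


def demo_with_wordnet():
    """Run the example from the question (needs NLTK and its 'wordnet' corpus)."""
    from nltk.corpus import wordnet

    # "vehicl" is a hypothetical incomplete word added to trigger the skip.
    list1 = ["apple", "honey", "drinks", "flowers", "paper", "vehicl"]
    list2 = ["pear", "shell", "movie", "fire", "tree"]
    for syn1, syn2, score in pairwise_similarity(list1, list2, wordnet.synsets):
        print('{w1}, {w2}: {s}'.format(w1=syn1.name(), w2=syn2.name(), s=score))
```

Calling demo_with_wordnet() should print the same kind of lines as the original loop, minus any pair involving a word WordNet cannot find. Testing whether the returned list is empty before indexing it would work just as well as catching the exception; the try/except form is shown only because the question asks how to use the error itself.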