I suggest using the TextBlob library. An example implementation:
```python
from textblob import TextBlob

def sentiment(message):
    # create TextBlob object of passed tweet text
    analysis = TextBlob(message)
    # return the sentiment polarity
    return analysis.sentiment.polarity
```
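TextBlob's polarity is a float in [-1.0, 1.0]. A minimal sketch of bucketing that score into coarse labels; the 0.1 cutoffs below are illustrative assumptions, not part of TextBlob itself:

```python
def polarity_to_label(polarity: float) -> str:
    """Map a TextBlob polarity score (-1.0 to 1.0) to a coarse label.

    The 0.1 thresholds are arbitrary choices for illustration.
    """
    if polarity > 0.1:
        return "positive"
    if polarity < -0.1:
        return "negative"
    return "neutral"

print(polarity_to_label(0.8))   # positive
print(polarity_to_label(-0.5))  # negative
print(polarity_to_label(0.0))   # neutral
```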
stanfordcorenlp is a very good wrapper on top of Stanford CoreNLP that lets you use it from Python.
```shell
wget http://nlp.stanford.edu/software/stanford-corenlp-full-2018-10-05.zip
```
```python
# Simple usage
from stanfordcorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP('/Users/name/stanford-corenlp-full-2018-10-05')

sentence = 'Guangdong University of Foreign Studies is located in Guangzhou.'
print('Tokenize:', nlp.word_tokenize(sentence))
print('Part of Speech:', nlp.pos_tag(sentence))
print('Named Entities:', nlp.ner(sentence))
print('Constituency Parsing:', nlp.parse(sentence))
print('Dependency Parsing:', nlp.dependency_parse(sentence))

nlp.close()  # Do not forget to close! The backend server will consume a lot of memory.
```
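CoreNLP can also return its results as raw JSON, with one entry per sentence carrying a `sentiment` label and a 0–4 `sentimentValue`. A hedged sketch of pulling sentiment out of such a payload; the sample string below is a hand-written stand-in for real server output, so treat the exact shape as an assumption based on CoreNLP's documented JSON format:

```python
import json

def extract_sentiments(corenlp_json: str):
    # Collect (label, 0-4 value) for each sentence in a CoreNLP JSON response.
    doc = json.loads(corenlp_json)
    return [(s["sentiment"], int(s["sentimentValue"])) for s in doc["sentences"]]

# Hand-written sample shaped like a CoreNLP sentiment response (illustrative).
sample = '{"sentences": [{"index": 0, "sentiment": "Positive", "sentimentValue": "3"}]}'
print(extract_sentiments(sample))  # [('Positive', 3)]
```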
More information
I have run into a similar situation. Most of my projects are in Python, while the sentiment part is in Java. Fortunately, it is quite easy to use the Stanford CoreNLP jar directly.
Here is one of my scripts; you can download the jars and run it.
```java
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.neural.rnn.RNNCoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations.SentimentAnnotatedTree;
import edu.stanford.nlp.trees.Tree;
import edu.stanford.nlp.util.ArrayCoreMap;
import edu.stanford.nlp.util.CoreMap;

public class Simple_NLP {
    static StanfordCoreNLP pipeline;

    public static void init() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, parse, sentiment");
        pipeline = new StanfordCoreNLP(props);
    }

    public static String findSentiment(String tweet) {
        String SentiReturn = "";
        String[] SentiClass = {"very negative", "negative", "neutral", "positive", "very positive"};

        // Sentiment is an integer, ranging from 0 to 4.
        // 0 is very negative, 1 negative, 2 neutral, 3 positive and 4 very positive.
        int sentiment = 2;

        if (tweet != null && tweet.length() > 0) {
            Annotation annotation = pipeline.process(tweet);
            List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
            if (sentences != null && sentences.size() > 0) {
                ArrayCoreMap sentence = (ArrayCoreMap) sentences.get(0);
                Tree tree = sentence.get(SentimentAnnotatedTree.class);
                sentiment = RNNCoreAnnotations.getPredictedClass(tree);
                SentiReturn = SentiClass[sentiment];
            }
        }
        return SentiReturn;
    }
}
```
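On the Python side of such a mixed project, the same 0-to-4 scale can be decoded with a tiny helper mirroring the SentiClass array above; falling back to neutral for out-of-range values is an assumption here, chosen to match the Java default of 2:

```python
# Mirrors the SentiClass array used in the Java snippet.
SENTI_CLASS = ["very negative", "negative", "neutral", "positive", "very positive"]

def senti_label(sentiment: int) -> str:
    # 0..4 as produced by RNNCoreAnnotations.getPredictedClass;
    # anything else falls back to neutral (an illustrative choice).
    if 0 <= sentiment < len(SENTI_CLASS):
        return SENTI_CLASS[sentiment]
    return "neutral"

print(senti_label(4))  # very positive
print(senti_label(0))  # very negative
```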
I faced the same problem. One possible solution is stanford_corenlp_py, which uses Py4j, as pointed out by @roopalgarg.
stanford_corenlp_py: this repo provides a Python interface for calling the "sentiment" and "entitymentions" annotators of the Stanford CoreNLP Java package, current as of v3.5.1. It uses py4j to interact with the JVM; as such, in order to run a script like scripts/runGateway.py, you must first compile and run the Java classes that create the JVM gateway.
There has been a very recent development on this issue:
You can now use the stanfordnlp package in Python:
From the README:
```python
>>> import stanfordnlp
>>> stanfordnlp.download('en')   # This downloads the English models for the neural pipeline
>>> nlp = stanfordnlp.Pipeline() # This sets up a default neural pipeline in English
>>> doc = nlp("Barack Obama was born in Hawaii. He was elected president in 2008.")
>>> doc.sentences[0].print_dependencies()
```
TextBlob is a great sentiment analysis package written in Python; the documentation is available here. It performs sentiment analysis on any given sentence by examining the words and their corresponding emotional scores (sentiment). To get started, run:
```shell
$ pip install -U textblob
$ python -m textblob.download_corpora
```
The first pip install command will give you the latest version of textblob installed in your (virtualenv) system, since passing -U upgrades the package to its latest available version. The second command downloads all the required data, the corpus.