项目作者: sachinbiradar9

项目描述 :
Classify news into categories based on headline.
高级语言: Jupyter Notebook
项目地址: git://github.com/sachinbiradar9/News-Classification.git
创建时间: 2018-05-05T00:40:08Z
项目社区:https://github.com/sachinbiradar9/News-Classification

开源协议:

下载


News Classification

Classify news into categories based on their headline.
Various classifiers were tried - Decision Tree, Support Vector Classifier, Multinomial Naive Bayesian Classifier,
Multilayered Perceptron, Random Forest. Multinomial Naive Bayesian Classifier worked the best. It is logical
for Multinomial Naive Bayesian to work the best as even we as humans classify based on keywords. We are
likely to predict “Politics” is we see keywords like Obama, election, republic and we are likely to predict
“Criminal” if we see keywords like drugs, jail and so on. Naive bayesian scans whole dataset and finds the
probabilities of each word in headline being associated with a class and then find the probability for whole
headline hence it works good.

Installation

pip install numpy
pip install scikit-learn
pip install imblearn
pip install seaborn
Dataset: https://data.world/elenadata/vox-articles

Usage

You can look for detailed explanation with graphs at - news.ipynb
You can also run the python file using - python3 news.py

Credits

Dr. Elena Zeleva