Project author: marlesson

Project description:
Grouping of Facebook posts for engagement analysis
Language: Python
Repository: git://github.com/marlesson/facebook_post_clusters.git
Created: 2017-08-17T03:59:02Z
Project page: https://github.com/marlesson/facebook_post_clusters


Facebook Page Post Clusters

Dependencies

  • python (3.6)
  • pandas (0.19.2)
  • numpy (1.13.1)
  • nltk (3.2.2)
  • scipy (0.19.1)
  • scikit-learn (sklearn)

Usage

Download the Facebook crawler (Facebook-Page-Crawler), which is included as a git submodule:

> git submodule init

> git submodule update

Download the posts of a page for clustering, here the page SiteOmelete for July 2017:

> python Crawler/Facebook_Page_Crawler.py 'SiteOmelete' '2017-07-01 00:00:00' '2017-07-30 23:59:59' --resume

Run the clustering on the downloaded page:

> python Cluster/run.py 'SiteOmelete'

Default Parameters

Edit the values in the file parameters.json to change the defaults:

  {
    "fb_app_id": "",
    "fb_app_secret": "",
    "pca_variance_max": 0.7,
    "range_of_cluster": [10, 50],
    "tfidf_max_features": 200000,
    "tfidf_max_df": 0.1,
    "tfidf_min_df": 0.01,
    "tfidf_ngram_range": [1, 2]
  }
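
As a rough illustration of how these parameters could be read and checked before a run, here is a minimal sketch using only the standard library. The helper name `load_params`, the merging with defaults, and the validation rules are assumptions made for this example; they are not taken from Cluster/run.py. The checks also assume that `tfidf_min_df` and `tfidf_max_df` are document-frequency proportions between 0 and 1, as the default values suggest.

```python
import json

# Default values copied from the parameters.json listing above.
DEFAULTS = {
    "fb_app_id": "",
    "fb_app_secret": "",
    "pca_variance_max": 0.7,
    "range_of_cluster": [10, 50],
    "tfidf_max_features": 200000,
    "tfidf_max_df": 0.1,
    "tfidf_min_df": 0.01,
    "tfidf_ngram_range": [1, 2],
}


def load_params(text):
    """Hypothetical helper: parse a parameters.json string, fill in
    missing keys from DEFAULTS, and sanity-check the numeric values."""
    params = {**DEFAULTS, **json.loads(text)}

    # Fraction of variance to keep must be in (0, 1].
    if not 0 < params["pca_variance_max"] <= 1:
        raise ValueError("pca_variance_max must be in (0, 1]")

    # Cluster counts to try: at least 2 clusters, lower bound <= upper bound.
    k_min, k_max = params["range_of_cluster"]
    if not 2 <= k_min <= k_max:
        raise ValueError("range_of_cluster must satisfy 2 <= min <= max")

    # Assumed to be proportions of documents, not absolute counts.
    if not 0 <= params["tfidf_min_df"] < params["tfidf_max_df"] <= 1:
        raise ValueError("expected 0 <= tfidf_min_df < tfidf_max_df <= 1")

    # n-gram range: smallest n must be >= 1 and not exceed the largest n.
    n_min, n_max = params["tfidf_ngram_range"]
    if not 1 <= n_min <= n_max:
        raise ValueError("tfidf_ngram_range must satisfy 1 <= min <= max")

    return params
```

With a file on disk, this could be called as `load_params(open("parameters.json", encoding="utf-8").read())`. The `tfidf_*` keys appear to correspond to the `max_features`, `max_df`, `min_df` and `ngram_range` arguments of scikit-learn's `TfidfVectorizer`, although the README does not say so explicitly.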

Step by Step

  • Text feature extraction with bag of words and n-grams (TF-IDF)
  • Dimensionality reduction with PCA
  • Clustering with k-means
  • Automatic generation of cluster descriptors with a decision tree
  • Engagement analysis per cluster
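
The steps above can be sketched end to end with scikit-learn. This is only an illustration under stated assumptions, not the code in Cluster/run.py: the function name `cluster_posts` is hypothetical, the number of clusters is chosen by silhouette score (the README does not say how the project picks it), descriptors are taken from one small decision tree per cluster, and "engagement" is assumed to be a single number per post, such as a reaction count.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import silhouette_score
from sklearn.tree import DecisionTreeClassifier


def cluster_posts(texts, engagement, k_range=(2, 5), variance=0.7,
                  ngram_range=(1, 2), top_n=3, seed=0):
    """Hypothetical sketch: TF-IDF -> PCA -> k-means -> tree descriptors."""
    # 1. Bag of words and n-grams, weighted with TF-IDF.
    vectorizer = TfidfVectorizer(ngram_range=ngram_range)
    tfidf = vectorizer.fit_transform(texts)
    # Terms ordered by their column index in the TF-IDF matrix.
    terms = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)

    # 2. PCA keeping enough components to explain `variance` of the variance.
    #    PCA needs a dense matrix, so this only suits small corpora.
    pca = PCA(n_components=variance, svd_solver="full")
    reduced = pca.fit_transform(tfidf.toarray())

    # 3. k-means for each candidate k; keep the labelling with the best
    #    silhouette score (an assumption, see the note above).
    best_score, labels = None, None
    k_max = min(k_range[1], len(texts) - 1)  # silhouette needs k < n_samples
    for k in range(k_range[0], k_max + 1):
        candidate = KMeans(n_clusters=k, n_init=10,
                           random_state=seed).fit_predict(reduced)
        score = silhouette_score(reduced, candidate)
        if best_score is None or score > best_score:
            best_score, labels = score, candidate

    # 4. Descriptors: for each cluster, fit a shallow tree that separates
    #    it from the rest and keep the most important terms.
    # 5. Engagement: mean engagement of the posts in each cluster.
    engagement = np.asarray(engagement, dtype=float)
    descriptors, mean_engagement = {}, {}
    for c in np.unique(labels):
        member = labels == c
        tree = DecisionTreeClassifier(max_depth=3, random_state=seed)
        tree.fit(tfidf, member)
        importance = tree.feature_importances_
        top = np.argsort(importance)[::-1][:top_n]
        descriptors[int(c)] = [terms[i] for i in top if importance[i] > 0]
        mean_engagement[int(c)] = float(engagement[member].mean())

    return labels, descriptors, mean_engagement
```

Calling `cluster_posts(texts, engagement)` with a list of post messages and a matching list of engagement numbers returns the cluster label of each post, a few descriptive terms per cluster, and the mean engagement per cluster. Comparing those means is one simple way to see which topics draw the most engagement.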