Project author: tbackes

Project description:
Linking Yelp reviews to safety inspection outcomes
Primary language: Python
Project URL: git://github.com/tbackes/yelp-health.git
Created: 2015-10-18T20:02:39Z
Project community: https://github.com/tbackes/yelp-health


yelp-health

Linking Yelp Reviews to Safety Inspection Outcomes

City-Specific Pages

To see more details on each city that was modeled, browse the following links:

Datasets

Yelp Challenge Data

Yelp Challenge Data is available for the following cities:

Health Inspection Data

Health Inspection data (scores and reports) have varying availability/accessibility across cities.

  • Pittsburgh: PDFs of each inspection report are available going back to at least 2011 (I haven’t checked earlier dates yet).
  • Charlotte: Scores and grades are available going back to 2013. Violation details are available online, but are much trickier to scrape. I’m not sure if I’ll have time to scrape the violation details.
  • Urbana-Champaign: HTML tables of inspection/violation reports are available online. However, I haven’t been able to figure out the logic behind the URL queries used to generate these keys, so this city is on hold for now.
  • Phoenix: Grades and violation details are available going back to 2013. Web scraping of violation details is in progress.
  • Las Vegas: All data is available as a downloadable SQL database. Sweet!
  • Madison: Data is available online, but the HTML tables are generated by JavaScript. I’m not sure how to scrape them, so this city is on hold for now.
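
Several of the cities above publish inspection results as HTML tables. As a rough illustration only (this is not the scraper used in this project), here is a minimal sketch of turning a static HTML table into a list of records using nothing but the Python standard library. The class name `TableRowParser`, the helper `parse_table`, and the sample markup with its column names are all made up for the example; real pages will have different layouts. It also will not help with tables that are built by JavaScript in the browser, as in the Madison case.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left unclosed in this row


def parse_table(html):
    """Return the table rows found in `html` as lists of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup; the columns are invented for illustration.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Violation</th></tr>
  <tr><td>2014-03-02</td><td>Food held at improper temperature</td></tr>
</table>
"""

rows = parse_table(SAMPLE_HTML)
header, records = rows[0], rows[1:]
# One dict per inspection row, keyed by the header cells.
inspections = [dict(zip(header, record)) for record in records]
```

In practice the HTML string would come from a downloaded page rather than a literal, and each city would need its own handling for nested tables, pagination, and missing cells.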

Previous work