Project author: tbackes
Project description:
Linking Yelp reviews to safety inspection outcomes
Language: Python
Project URL: git://github.com/tbackes/yelp-health.git
yelp-health
Linking Yelp Reviews to Safety Inspection Outcomes
City-Specific Pages
To see more details on each city that was modeled, browse the following links:
Datasets
Yelp Challenge Data
Yelp Challenge Data is available for the following cities:
Health Inspection Data
Health Inspection data (scores and reports) have varying availability/accessibility across cities.
- Pittsburgh: PDFs of each inspection report are available as early as 2011 (I haven’t checked earlier dates yet).
- Charlotte: Scores and grades are available as early as 2013. Violation details are available online, but are much trickier to scrape. I’m not sure if I’ll have time to scrape the violation details.
- Urbana-Champaign: HTML tables of inspection/violation reports are available online. However, I haven’t been able to figure out the logic behind the URL queries to generate these keys, so this city is on hold for now.
- Phoenix: Grades and violation details are available as early as 2013. Web scraping of violation details is in progress.
- Las Vegas: All data is available as a downloadable SQL database. Sweet!
- Madison: Data is available online, but the HTML tables are generated by JavaScript. I’m not sure how to scrape them, so this city is on hold for now.
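For cities that publish inspection reports as static HTML tables (Urbana-Champaign, for example), the table contents could in principle be extracted with the Python standard library alone. The following is a minimal sketch, not the code used in this project: `TableRowParser`, `parse_table`, and the sample markup are hypothetical, and real inspection pages will be structured differently. It would not help with tables rendered client-side by JavaScript, as in Madison.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # completed rows, each a list of cell strings
        self._row = None   # cells of the row currently being parsed
        self._cell = None  # text fragments of the cell currently being parsed

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return all table rows found in `html` as lists of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample markup standing in for a downloaded report page.
SAMPLE = """
<table>
  <tr><th>Date</th><th>Violation</th></tr>
  <tr><td>2014-03-01</td><td>Improper food storage</td></tr>
</table>
"""

rows = parse_table(SAMPLE)
header, records = rows[0], rows[1:]
# One dict per inspection row, keyed by the header cells.
inspections = [dict(zip(header, record)) for record in records]
```

Treating the first row as the header is an assumption about the page layout; a real scraper would need to check that against each city's actual markup.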
Previous work