Project author: mvolfik

Project description:
Scraping, processing and presentation of participant calls from EYP Members Platform

Primary language: HTML
Repository: git://github.com/mvolfik/eyp-calls.git
Created: 2020-12-28T10:14:12Z
Project page: https://github.com/mvolfik/eyp-calls

License: MIT License


This project is discontinued: Members Platform now requires login with captcha verification. The reason for that is beyond my understanding.

Also, back when I created this, I even reached out to the GB of EYP, who responded that all the features from here would be included in the new platform. We all can see how that went :)))

eyp-calls.tk

A simple list of all calls that are currently up on the Members Platform, with sorting
and filtering by event type and position. There's not much else to
say; have a look at it yourself. The info below is mainly for developers.


Contributions are welcome, hit me up via mail or Telegram if you have
questions or ideas, I’ll be very happy to hear from you!

API Endpoint

Don’t want to go through the hassle of scraping yourself, but think you could deliver a better presentation? That is
possible too! Build your own app using the JSON endpoint at
https://eyp-calls.tk/data.json

I’m not a frontend dev, so I’ll be happy to ditch this presentation, keep only the data source and provide a link to
your app.
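As a sketch of what a consumer might do with the endpoint: since the service is discontinued and the actual item schema isn't documented here, the records and field names below are purely illustrative assumptions, not the real shape of `data.json`.

```javascript
// Hypothetical records in the shape /data.json might have served; the real
// field names were defined by the scraper's items and may have differed.
const calls = [
  { title: "Journalist at NSC Prague", eventType: "NSC", position: "Journalist" },
  { title: "Chair at IS Berlin", eventType: "IS", position: "Chair" },
];

// Example consumer logic: list the titles of all chairing calls.
const chairTitles = calls
  .filter((call) => call.position === "Chair")
  .map((call) => call.title);

console.log(chairTitles); // → [ 'Chair at IS Berlin' ]
```

In a real app you would replace the inline array with a `fetch` of the endpoint and build your own filtering UI on top.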

Repository structure

There are currently two main components – the scraper and the worker. The scraper is a scrapy project, which contains
the main spider call_spider. The worker contains a cron job that periodically starts the scraper and downloads the
scraped items, as well as the presentation frontend.

Scraper (/scraper)

Main source code is scraper/scraper/spiders/call_spider.py.

Run the scraper locally (a virtualenv is, of course, recommended):

```shell script
cd scraper
pip install scrapy
scrapy crawl call_spider -L INFO -O calls.json
```

Deploy to [ScrapingHub Scrapy Cloud](https://www.scrapinghub.com/scrapy-cloud/):

```shell script
cd scraper
pip install shub
shub deploy --version 1.0 # use your own version identifier
```

Frontend (/frontend-worker)

A Cloudflare Worker. Contains the Worker code (src/index.js), which
starts the scraper on cron triggers, serves the data endpoint, and includes some boilerplate magic for serving static
files. These live in the src/ directory; the most important one is index.html.

```shell script
npm install -g @cloudflare/wrangler
cd frontend-worker
wrangler preview --watch
```
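For orientation, the routing described above can be sketched as follows. This is a simplified illustration of the pattern, not the project's actual src/index.js; the handler structure and responses are assumptions (Node 18+ provides the Request/Response globals, so the sketch also runs outside Workers).

```javascript
// Simplified sketch of a Worker-style request router (names are assumptions).
const worker = {
  async fetch(request) {
    const url = new URL(request.url);
    if (url.pathname === "/data.json") {
      // In the real Worker this would serve the latest scraped items;
      // an empty array stands in for them here.
      return new Response(JSON.stringify([]), {
        headers: { "content-type": "application/json" },
      });
    }
    // Everything else would fall through to the static-file boilerplate
    // (index.html and friends); a plain 404 stands in for that here.
    return new Response("not found", { status: 404 });
  },
};
```

The cron side would be a separate `scheduled` handler that kicks off the scraper run; it is omitted here because it depends on the Scrapy Cloud API details.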