Project author: matiasraisanen

Project description: Insecam Crawler
Language: Python
Repository: git://github.com/matiasraisanen/insecrawl.git
Created: 2019-12-12T13:23:54Z
Project community: https://github.com/matiasraisanen/insecrawl

License: MIT License


Insecrawl

Python 3.7

Insecrawl is a crawler for insecam.org. Its purpose is to download still frames from cameras listed on said website.

The script can scrape an image from a single camera, from all cameras in a given country, or from every camera found on the site.

The script is automated, so the user can just issue a command, go make a sandwich, and come back to a folder full of stills from interesting cameras.

Downloaded images will be saved in ./images by default.

Camera IDs are those listed on insecam.org.
For example, the ID of https://www.insecam.org/en/view/241666/ is 241666.
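
The ID can also be pulled out of a view URL programmatically. The following is a minimal sketch, not code from insecrawl itself: the helper name camera_id_from_url is hypothetical, and it assumes the URL ends in /view/<digits>/ as in the example above.

```python
import re
from urllib.parse import urlparse


def camera_id_from_url(url: str) -> str:
    """Return the numeric camera ID from an insecam view URL (hypothetical helper)."""
    # Only the path matters; scheme, host and query string are ignored.
    path = urlparse(url).path
    # Assumption: the ID is the run of digits that follows "/view/".
    match = re.search(r"/view/(\d+)/?$", path)
    if match is None:
        raise ValueError(f"no camera ID found in {url!r}")
    return match.group(1)


print(camera_id_from_url("https://www.insecam.org/en/view/241666/"))
```

The returned string can then be passed to the -o option.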

DISCLAIMER:
I am in no way affiliated with insecam. I just wanted to have an easier way of browsing cameras on the site.


How to use

You can use the program as you like, but the typical flow is as follows:

0. Install dependencies

  $ pip3 install opencv-python opencv-contrib-python-headless beautifulsoup4 iso3166

1. Print a list of all countries that have cameras

  $ python3 insecrawl.py -l

2. Pick a country code from the list, and scrape its cameras

Here we use Finland as an example.

  $ python3 insecrawl.py -t -S -c FI

  Option   Explanation
  -t       Add a timestamp to the image filename. Prevents overwriting previous scrapes.
  -S       Automatically determine the file path from the country code, e.g. FI will be saved in ./images/Finland
  -c FI    Scrape cameras from FI
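
To make the effect of -t and -S concrete, here is a rough sketch of how an output path could be assembled. It is an illustration under stated assumptions, not the script's actual code: build_image_path and COUNTRY_NAMES are hypothetical names, the .jpg extension and the id_timestamp filename layout are guesses, and the timestamp pattern follows the YYYY-MM-DD_HH-MM-SS format given in the options list, reading the square brackets there as placeholders. The real script presumably resolves country names through the iso3166 package; a small dict stands in for it here so the example runs with the standard library only.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical stand-in for an ISO 3166-1 alpha-2 lookup.
COUNTRY_NAMES = {"FI": "Finland"}


def build_image_path(camera_id, country_code=None, when=None, root=Path("images")):
    """Sketch of the -S (country folder) and -t (timestamp) behaviour."""
    # -S: save under ./images/{COUNTRY_NAME} instead of ./images
    folder = root / COUNTRY_NAMES[country_code] if country_code else root
    name = camera_id
    # -t: append a local-time, 24h timestamp so earlier stills are not overwritten
    if when is not None:
        name += "_" + when.strftime("%Y-%m-%d_%H-%M-%S")
    # Assumption: stills are stored as .jpg files.
    return folder / f"{name}.jpg"


path = build_image_path("241666", "FI", datetime(2019, 12, 12, 13, 23, 54))
print(path.as_posix())
```

Without a timestamp, two scrapes of the same camera would map to the same path, which is why -t is recommended for repeated runs.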

3. Browse ./images/Finland to examine the scraped images

  $ browse ./images/Finland

Other examples

Example-1:
Scrape images from every camera listed on insecam, and save them in ./images/{COUNTRY_NAME}
This can take a couple of hours to finish.

  $ python3 insecrawl.py --scrapeAllCameras --sortByCountry

Example-2:
You can also combine terminal commands to your liking.

Scrape camera ID 241666 every 900 seconds.
Save images into ./images/241666_timelapse and use timestamps in filenames. This is a good way to create frames for timelapse videos.

  $ watch -n 900 "python3 insecrawl.py -o 241666 -f 241666_timelapse -t"

Options

  -h, --help            Print this help page
  -c, --country         Designate a country, and scrape stills from all cameras in
                        that country. Provide a two-letter country code (ISO 3166-1 alpha-2)
  -l, --listCountries   Prints all countries, country codes and camera counts listed on
                        insecam.org
  -d, --details         Prints details for a given camera ID.
  -f, --folder          Assign a custom download path under the ./images folder.
  -i, --identifier      Provide a custom identifier for the camera, used as the
                        filename for the saved image. Works only with the -u and -o flags.
  -o, --oneCamera       Provide a single insecam camera ID to download a still frame from.
  -n, --newCamsOnly     Scrape only the cameras that do not have a still saved on disk.
  --scrapeAllCameras    Downloads a still from every camera listed on insecam. This can
                        take hours to complete. Best used together with --sortByCountry
  -S, --sortByCountry   Images will be saved in ./images/{COUNTRY_NAME}
  --sortByCamera        A new folder will be created for each camera.
  --interval            Used for running the script at set intervals. Provide the number
                        of seconds you wish to wait between each run. Works only in
                        conjunction with -c or --country. The interval starts to run after
                        the last camera is scraped, so don't expect exact timing with
                        this. Can be exited only with CTRL+C.
  -t, --timeStamp       Append a timestamp to the image filename. Useful if you don't
                        want to overwrite previously saved images. Timestamp
                        format is [YYYY-MM-DD]_[HH-MM-SS], using the computer's
                        local time and a 24h clock.
  -u, --url             Provide a direct URL to a video stream to download a still frame
                        from. Useful if the camera is no longer on insecam, but still
                        has a publicly accessible web interface. Must be used in conjunction
                        with -i or --identifier.
  -v, --verbose         Debug level logging
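
The timing note on --interval (the wait begins only after the last camera is scraped) can be illustrated with a small loop. This is a sketch of that behaviour, not insecrawl's implementation: run_at_intervals, max_runs and the injectable sleep argument are hypothetical and exist only to keep the example finite and fast.

```python
import time


def run_at_intervals(task, interval_seconds, max_runs=None, sleep=time.sleep):
    """Run task, then wait interval_seconds, repeatedly (hypothetical helper).

    Because the wait starts after task() returns, the real period is
    task duration + interval_seconds, so runs drift over time.
    """
    runs = 0
    try:
        while max_runs is None or runs < max_runs:
            task()  # e.g. scrape every camera in one country
            runs += 1
            if max_runs is None or runs < max_runs:
                sleep(interval_seconds)
    except KeyboardInterrupt:
        # Mirrors "Can be exited only with CTRL+C".
        pass
    return runs


# Demo with a stubbed sleep so nothing actually waits.
waits = []
completed = run_at_intervals(lambda: None, 900, max_runs=3, sleep=waits.append)
print(completed, waits)
```

When exact spacing matters, an external scheduler such as the watch command from Example-2 or cron may be a better fit, since it does not depend on how long each scrape takes in the same way.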

3rd party dependencies