Project author: chuanenlin

Project description:
Speedy, lightweight web scraper for Shutterstock.
Language: Python
Repository: git://github.com/chuanenlin/shutterscrape.git
Created: 2018-07-21T06:34:04Z
Project page: https://github.com/chuanenlin/shutterscrape

License: MIT License


ShutterScrape

ShutterScrape is a web scraper for quickly bulk downloading images and videos from Shutterstock. ⚡

It uses Selenium for browser automation and Beautiful Soup for HTML parsing.
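
To illustrate the parsing half of that pipeline, here is a minimal sketch that pulls image URLs out of a page's HTML. It is not the project's code: it swaps in Python's standard-library html.parser for Beautiful Soup so that it runs without extra installs, and SAMPLE_HTML, ImageSourceCollector, and extract_image_sources are hypothetical names made up for this example.

```python
from html.parser import HTMLParser

# Made-up page fragment standing in for HTML returned by a browser session.
SAMPLE_HTML = """
<div class="results">
  <img src="https://example.com/thumbs/1.jpg" alt="drone">
  <img src="https://example.com/thumbs/2.jpg" alt="person">
  <img alt="no source">
</div>
"""


class ImageSourceCollector(HTMLParser):
    """Collect the src attribute of every <img> tag that has one."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def extract_image_sources(html):
    """Return the list of <img> src values found in the given HTML string."""
    parser = ImageSourceCollector()
    parser.feed(html)
    parser.close()
    return parser.sources


if __name__ == "__main__":
    print(extract_image_sources(SAMPLE_HTML))
```

In a real run the HTML would come from the automated browser rather than a hard-coded string, and the tag and attribute names to look for depend on the current Shutterstock page layout.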


Setting up

  1. Configure shutterscrape.py to your Python version.

  2. Install requirements from Terminal:

    1. pip install beautifulsoup4
    2. pip install selenium
    3. pip install lxml
  3. Install ChromeDriver.

  4. (Optional) Add the locations of python.exe and chromedriver.exe to your PATH environment variable.
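
Before the first run, it can help to confirm that the steps above took effect. The following is a small sketch using only the standard library, not part of ShutterScrape; check_setup and REQUIRED_MODULES are hypothetical names, and the check assumes ChromeDriver is meant to be discoverable through PATH.

```python
import importlib.util
import shutil

# Import names for the three pip packages listed above
# (beautifulsoup4 is imported as bs4).
REQUIRED_MODULES = ["bs4", "selenium", "lxml"]


def check_setup():
    """Return a dict mapping each requirement to True if it was found."""
    status = {}
    for name in REQUIRED_MODULES:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    # shutil.which searches the directories listed in PATH.
    status["chromedriver"] = shutil.which("chromedriver") is not None
    return status


if __name__ == "__main__":
    for requirement, found in check_setup().items():
        print(f"{requirement}: {'found' if found else 'missing'}")
```

Anything reported as missing points back to the corresponding setup step.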


Running

Open terminal in the directory of shutterscrape.py and enter:

  1. python shutterscrape.py

Go grab a cup of coffee while waiting… oh wait, it’s already done!


Definitions

  • Search mode: Enter i to scrape images or v to scrape videos.
  • Number of search terms: For example, if you want to search for drone single person, enter 3.
  • Search term: Keyword(s) for searching on Shutterstock.
  • Number of pages to scrape: A higher number of pages yields more content, but with lower keyword precision.
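
As a rough illustration of how these four inputs fit together, the sketch below turns a mode, a list of search terms, and a page count into one search URL per results page. The URL prefixes and the page query parameter are assumptions made for this example and are not taken from shutterscrape.py; build_page_urls and SEARCH_BASE are hypothetical names.

```python
from urllib.parse import quote_plus, urlencode

# Hypothetical URL prefixes, not taken from shutterscrape.py.
SEARCH_BASE = {
    "i": "https://www.shutterstock.com/search/",
    "v": "https://www.shutterstock.com/video/search/",
}


def build_page_urls(mode, terms, pages):
    """Build one search URL per results page for the given inputs."""
    if mode not in SEARCH_BASE:
        raise ValueError("mode must be 'i' (images) or 'v' (videos)")
    if pages < 1:
        raise ValueError("pages must be at least 1")
    # Join the individual terms into one query, e.g. "drone+single+person".
    query = quote_plus(" ".join(terms))
    return [
        SEARCH_BASE[mode] + query + "?" + urlencode({"page": page})
        for page in range(1, pages + 1)
    ]


if __name__ == "__main__":
    for url in build_page_urls("i", ["drone", "single", "person"], 2):
        print(url)
```

Here the three terms drone, single, and person correspond to entering 3 as the number of search terms, and each extra page adds one more URL to visit.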

Updates

10/1/2020

Updated for the new Shutterstock page layout as of 10/1/2020.

4/26/2019

Updated for the new Shutterstock page layout as of 4/26/2019.

10/1/2018

Added GUI for save directory selection.

07/31/2018

More stability fixes.

07/25/2018

Added gettyscrape.py for scraping videos from Getty Images.

07/23/2018

Stability fixes.