Project author: v-braun

Project description:
Find the hero (main) image of a URL
Language: Go
Repository: git://github.com/v-braun/hero-scrape.git
Created: 2018-12-01T19:52:59Z
Project community: https://github.com/v-braun/hero-scrape

License: MIT License

hero-scrape

Find the hero (main) image of a URL

By v-braun - viktor-braun.de.



Demo

See a demo on https://hero-scrape.viktor-braun.de

Description

hero-scrape extracts the main image of a webpage.
It uses different strategies to find the main image (OpenGraph HTML tags and heuristic search).
You can use the existing strategies or implement your own.
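
To make the idea of pluggable strategies concrete, here is a minimal, self-contained sketch of one plausible way to chain them: try each strategy in order and keep the first result. The `Strategy` interface, `FirstMatch`, and the two naive string-matching strategies are assumptions made up for this illustration; they are not hero-scrape's actual API, and the real strategies parse HTML properly.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Strategy is a hypothetical interface for this sketch only; it is not
// hero-scrape's real API. Find returns an image URL, or "" if none was found.
type Strategy interface {
	Find(html string) string
}

// StrategyFunc adapts a plain function to the Strategy interface.
type StrategyFunc func(html string) string

func (f StrategyFunc) Find(html string) string { return f(html) }

// ErrNoImage is returned when no strategy produced a result.
var ErrNoImage = errors.New("no hero image found")

// FirstMatch runs the strategies in order and returns the first non-empty result.
func FirstMatch(html string, strategies ...Strategy) (string, error) {
	for _, s := range strategies {
		if img := s.Find(html); img != "" {
			return img, nil
		}
	}
	return "", ErrNoImage
}

// attrAfter returns the text between marker and the next double quote, or "".
func attrAfter(html, marker string) string {
	i := strings.Index(html, marker)
	if i < 0 {
		return ""
	}
	rest := html[i+len(marker):]
	end := strings.Index(rest, `"`)
	if end < 0 {
		return ""
	}
	return rest[:end]
}

// ogImage is a deliberately naive OpenGraph lookup: it only matches this
// exact attribute order and quoting.
var ogImage = StrategyFunc(func(html string) string {
	return attrAfter(html, `<meta property="og:image" content="`)
})

// firstImg is a crude stand-in for a heuristic: the first <img src="...">.
var firstImg = StrategyFunc(func(html string) string {
	return attrAfter(html, `<img src="`)
})

func main() {
	page := `<html><head><meta property="og:image" content="https://example.com/hero.png"></head>
<body><img src="https://example.com/logo.png"></body></html>`
	img, err := FirstMatch(page, ogImage, firstImg)
	fmt.Println(img, err) // prints "https://example.com/hero.png <nil>"
}
```

A custom strategy in this sketch is just another value that satisfies `Strategy`, appended to the argument list in the priority you want.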

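To find the “biggest” image, the dimensions of each candidate must be known, which means downloading at least part of it. fastimage is well suited to that job: it finds the type and/or size of a remote image by fetching as little as needed.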

Installation

  go get github.com/v-braun/hero-scrape

Usage

With preconfigured strategies

  // Errors are ignored here for brevity; check them in real code.
  pageUrl, _ := url.Parse("https://github.com/v-braun/hero-scrape")
  res, _ := http.Get(pageUrl.String())
  defer res.Body.Close()
  // Scrape runs the preconfigured strategies against the page body.
  result, _ := heroscrape.Scrape(pageUrl, res.Body)
  fmt.Println(result.Image)

With custom strategies

  // Errors are ignored here for brevity; check them in real code.
  pageUrl, _ := url.Parse("https://github.com/v-braun/hero-scrape")
  res, _ := http.Get(pageUrl.String())
  defer res.Body.Close()
  // YourOwnStrategy stands in for a strategy you implement yourself.
  result, _ := heroscrape.ScrapeWithStrategy(pageUrl, res.Body, NewOgStrategy(), NewHeuristicStrategy(), YourOwnStrategy())
  fmt.Println(result.Image)

Related Projects

  • hero-scrape: Demo for this lib
  • fastimage: Finds the type and/or size of a remote image given its uri, by fetching as little as needed.
  • goquery: A little like that j-thing, only in Go.

Known Issues

If you discover any bugs, feel free to create an issue on GitHub, or fork the
project and send me a pull request.

Issues List.

Authors

v-braun

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

License

See LICENSE.