hero-scrape

Find the hero (main) image of a URL

Demo

See a demo at https://hero-scrape.viktor-braun.de

Description

hero-scrape extracts the main image of a webpage.
It uses different strategies to find the main image (OpenGraph HTML tags and heuristic search).
You can use the existing strategies or implement your own.

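To find the “biggest” image, the size of each candidate has to be determined, which normally means downloading it. fastimage fits that job well: it finds the size of a remote image by fetching as little as needed.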
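
The idea of comparing candidates by size can be sketched with the standard library alone. The example below swaps in `image.DecodeConfig`, which reads only the image header, in place of fastimage, and it works on in-memory PNGs instead of remote URLs so that it runs offline. The `candidate` type and the `biggest` and `pngOfSize` helpers are hypothetical names for this sketch, and "biggest" is assumed to mean largest pixel area.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/png" // importing this also registers the PNG decoder
	"io"
)

// candidate (hypothetical type) pairs a label with the image bytes.
type candidate struct {
	name string
	data io.Reader
}

// biggest returns the name of the candidate with the largest pixel area.
// Candidates whose header cannot be decoded are skipped.
func biggest(cands []candidate) (string, error) {
	best, bestArea := "", -1
	for _, c := range cands {
		// DecodeConfig reads the dimensions without decoding pixel data.
		cfg, _, err := image.DecodeConfig(c.data)
		if err != nil {
			continue
		}
		if area := cfg.Width * cfg.Height; area > bestArea {
			best, bestArea = c.name, area
		}
	}
	if bestArea < 0 {
		return "", fmt.Errorf("no decodable image among %d candidates", len(cands))
	}
	return best, nil
}

// pngOfSize builds a blank in-memory PNG, used here as test data.
func pngOfSize(w, h int) io.Reader {
	var buf bytes.Buffer
	_ = png.Encode(&buf, image.NewRGBA(image.Rect(0, 0, w, h)))
	return &buf
}

func main() {
	name, err := biggest([]candidate{
		{"icon", pngOfSize(16, 16)},
		{"hero", pngOfSize(640, 360)},
		{"thumb", pngOfSize(120, 90)},
	})
	fmt.Println(name, err)
}
```

With remote images, the same comparison would use the dimensions reported for each URL instead of a local reader.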

Installation

go get github.com/v-braun/hero-scrape

Usage

With preconfigured strategies

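// Error handling is omitted for brevity.
pageUrl, _ := url.Parse("https://github.com/v-braun/hero-scrape")
res, _ := http.Get(pageUrl.String())
defer res.Body.Close()
// Scrape runs the preconfigured strategies against the page body.
result, _ := heroscrape.Scrape(pageUrl, res.Body)
fmt.Println(result.Image)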

With custom strategies

pageUrl, _ := url.Parse("https://github.com/v-braun/hero-scrape")
res, _ := http.Get(pageUrl.String())
defer res.Body.Close()
result, _ := heroscrape.ScrapeWithStrategy(pageUrl, res.Body, heroscrape.NewOgStrategy(), heroscrape.NewHeuristicStrategy(), YourOwnStrategy())
fmt.Println(result.Image)
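
The shape of a custom strategy depends on the library's strategy type, which is not shown here. As a purely illustrative sketch of the pattern, the example below defines its own hypothetical `imageStrategy` interface, a toy `markerStrategy`, and a `firstMatch` helper that tries strategies in order. None of these names come from hero-scrape, and the assumption that the first successful strategy wins is ours.

```go
package main

import (
	"fmt"
	"strings"
)

// imageStrategy is a hypothetical interface for this sketch only;
// hero-scrape's real strategy type may look different.
type imageStrategy interface {
	Find(html string) (imageURL string, found bool)
}

// markerStrategy is a toy strategy: it returns the text that follows a
// fixed marker, up to the next double quote.
type markerStrategy struct{ marker string }

func (m markerStrategy) Find(html string) (string, bool) {
	_, rest, ok := strings.Cut(html, m.marker)
	if !ok {
		return "", false
	}
	url, _, ok := strings.Cut(rest, `"`)
	return url, ok && url != ""
}

// firstMatch tries each strategy in order and returns the first hit.
func firstMatch(html string, strategies ...imageStrategy) (string, bool) {
	for _, s := range strategies {
		if url, ok := s.Find(html); ok {
			return url, true
		}
	}
	return "", false
}

func main() {
	page := `<img data-hero="https://example.com/a.png"><img src="https://example.com/b.png">`
	url, ok := firstMatch(page,
		markerStrategy{marker: `data-hero="`},
		markerStrategy{marker: `src="`},
	)
	fmt.Println(url, ok)
}
```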

hero-scrape: Demo for this lib
fastimage: Finds the type and/or size of a remote image given its URI, by fetching as little as needed.
goquery: A little like that j-thing, only in Go.