Project author: armchairtheorist

Project description:
A multi-strategy approach to find the absolutely cleanest and most likely canonical URL of any given URL.
Language: Ruby
Repository: git://github.com/armchairtheorist/true_url.git
Created: 2016-12-27T06:04:04Z
Project page: https://github.com/armchairtheorist/true_url

License: MIT License


TrueURL

TrueURL helps normalize, clean and derive a canonical URL for any given URL. Unlike other similar projects, TrueURL uses a configurable multi-strategy approach that combines tailored strategies for specific sites (e.g. YouTube, DailyMotion, Twitter) with general strategies (e.g. rel="canonical").

Installation

Install the gem from RubyGems:

  gem install true_url

If you use Bundler, add it to your Gemfile and run bundle install:

  gem 'true_url'

I have only tested this gem on Ruby 2.3.0, but there shouldn't be any reason why it wouldn't work on earlier Ruby versions as well. TrueURL only requires the Addressable gem as a dependency. If page fetching is required, the HTTP and Nokogiri gems are also needed.
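
Because the page-fetching dependencies are optional, it can help to check whether they are installed before relying on strategies that fetch pages. The following is a minimal sketch, not part of TrueURL: the helper name `gem_available?` is hypothetical, and the RubyGems names `http` and `nokogiri` are assumed to correspond to the HTTP and Nokogiri gems mentioned above.

```ruby
# Hypothetical helper (not part of TrueURL): reports whether a gem is
# installed, using the RubyGems API that ships with Ruby.
def gem_available?(name)
  Gem::Specification.find_by_name(name)
  true
rescue Gem::LoadError
  # Raised (as Gem::MissingSpecError, a subclass) when no such gem is installed.
  false
end

# Assumed RubyGems names for the optional page-fetching dependencies.
missing = %w[http nokogiri].reject { |name| gem_available?(name) }
puts "Page fetching would need: gem install #{missing.join(' ')}" unless missing.empty?
```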

Usage

  require 'true_url'

  # Short YouTube link with a playlist parameter
  x = TrueURL.new("https://youtu.be/RDocnbkHjhI?list=PLs4hTtftqnlAkiQNdWn6bbKUr-P1wuSm0")
  puts x.canonical # => https://www.youtube.com/watch?v=RDocnbkHjhI

  # Embedded Niconico player URL
  x = TrueURL.new("http://embed.nicovideo.jp/watch/sm25956031/script?w=490&h=307&redirect=1")
  puts x.canonical # => http://www.nicovideo.jp/watch/sm25956031

  # Shortened link that redirects to the target page
  x = TrueURL.new("http://t.co/fvaGuRa5Za")
  puts x.canonical # => http://www.prdaily.com/Main/Articles/3_essential_skills_for_todays_PR_pro__18404.aspx
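
To illustrate the idea of trying a site-specific strategy before a general one, here is a minimal sketch that uses only Ruby's standard `uri` library. It is not TrueURL's actual implementation: the method names `youtube_canonical`, `generic_clean` and `canonicalize` are hypothetical, and the rules they apply are simplified assumptions chosen for the example.

```ruby
require 'uri'

# Hypothetical site-specific strategy: turns youtu.be short links and
# youtube.com watch links into a plain watch URL. Returns nil for other URLs.
def youtube_canonical(url)
  uri = URI.parse(url)
  host = uri.host.to_s.downcase.sub(/\Awww\./, '')

  video_id =
    case host
    when 'youtu.be'
      uri.path.split('/')[1]                            # "/ID" => "ID"
    when 'youtube.com', 'm.youtube.com'
      URI.decode_www_form(uri.query.to_s).to_h['v']     # "?v=ID" => "ID"
    end

  return nil if video_id.nil? || video_id.empty?
  "https://www.youtube.com/watch?v=#{video_id}"
end

# Hypothetical general strategy: lowercases the host and drops the fragment.
def generic_clean(url)
  uri = URI.parse(url)
  uri.host = uri.host.downcase if uri.host
  uri.fragment = nil
  uri.to_s
end

# Tries each strategy in order and returns the first non-nil result.
def canonicalize(url, strategies = %i[youtube_canonical generic_clean])
  strategies.each do |name|
    result = send(name, url)
    return result if result
  end
  url
end

puts canonicalize('https://youtu.be/RDocnbkHjhI?list=PLs4hTtftqnlAkiQNdWn6bbKUr-P1wuSm0')
# => https://www.youtube.com/watch?v=RDocnbkHjhI
```

Ordering the strategies from most to least specific lets a site-specific rule win when it applies, while the general rule still cleans every other URL. This sketch does no page fetching and does not follow redirects, so it cannot resolve shortened links such as the t.co example above.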

Other URL Canonicalization Projects (for Ruby)