Project author: wrzlbrmft

Project description:
A tool for domain whitelists/blacklists analysis and optimization.
Language: Java
Repository: git://github.com/wrzlbrmft/domains.git
Created: 2015-02-09T14:12:59Z
Project community: https://github.com/wrzlbrmft/domains

License: GNU General Public License v3.0


domains

domains is a command-line tool written in Java for analysis and optimization
of domain lists, e.g. to be used as whitelists or blacklists.

Features

  • Auto-correct malformed domain names
  • Sorting
  • Remove duplicate list entries
  • Remove redundant list entries
  • Check against an exception list and
    • Remove obsolete domain list entries
    • Remove unused exception list entries
  • Save optimized lists as new files
  • Check if a domain name would be whitelisted/blacklisted

Download

A pretty up-to-date version of domains can be downloaded here.

To build the latest version by yourself, see below for Build Instructions.

Usage

Having installed the Java Runtime Environment 7+, you can run domains at the
command-line or in Terminal with:

    java -jar domains.jar

Append -h or --help to get a list of all available options:

    java -jar domains.jar -h

All available options are:

    -b,--check-blacklist <domainName>   check if the domain name would be
                                        blacklisted, when treating the loaded
                                        domain(/exception) list(s) as
                                        blacklist configuration
    -d,--domains <file>                 load domain list from text file
    -e,--exceptions <file>              load exception list from text file
    -h,--help                           print this help message and exit
    -o,--remove-obsolete-domains        remove obsolete domain list entries
                                        (e.g. if "com" is on the exception
                                        list, the domain list entry "foo.com"
                                        is obsolete and removed)
    -r,--remove-redundant               remove redundant list entries (e.g.
                                        "com" includes "foo.com", so
                                        "foo.com" is redundant and removed)
    -s,--save-domains <file>            save optimized domain list as new
                                        text file
    -u,--remove-unused-exceptions       remove unused exception list entries
                                        (e.g. if "com" is not on the domain
                                        list, the exception list entry
                                        "foo.com" is unused and removed)
    -v,--verbose                        be more verbose
       --version                        print version info and exit
    -w,--check-whitelist <domainName>   check if the domain name would be
                                        whitelisted, when treating the loaded
                                        domain(/exception) list(s) as
                                        whitelist configuration
    -x,--save-exceptions <file>         save optimized exception list as new
                                        text file

Quick Start

Load the domain list from domains.txt, auto-correct and sort it, then remove
both duplicate and redundant entries. Finally save the optimized domain list as
domains-optimized.txt:

    java -jar domains.jar -d domains.txt -r -s domains-optimized.txt

Read further for more available optimizations.

Checking Domain Names

You can check if a given domain name would be whitelisted/blacklisted, when
treating the loaded domain(/exception) list(s) as whitelist/blacklist
configuration.

Load the domain list from domains.txt and the exception list from
exceptions.txt. Check if www.foo.com would be whitelisted, when treating the
loaded lists as whitelist configuration:

    java -jar domains.jar -d domains.txt -e exceptions.txt -w www.foo.com

Load the domain list from domains.txt and the exception list from
exceptions.txt. Check if www.bar.com would be blacklisted, when treating the
loaded lists as blacklist configuration:

    java -jar domains.jar -d domains.txt -e exceptions.txt -b www.bar.com

You can check domain names for being whitelisted/blacklisted at the same time:

    java -jar domains.jar -d domains.txt -e exceptions.txt -w www.foo.com -b www.bar.com

Domain Lists

A domain list is a simple text file, containing one domain name per line:

Example

    foo.com
    bar.com
    www.xyz.net
    org

Each domain includes all of its sub-domains. E.g. foo.com includes
www.foo.com, bar.foo.com etc.
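
The inclusion rule can be sketched as a suffix test on whole labels. This is only an illustration, not code from the domains source: the class and method names are hypothetical, the names are assumed to be auto-corrected already, and treating a name as included by itself is an assumption.

```java
final class InclusionSketch {
    /**
     * Returns true if {@code name} equals {@code parent} (assumption) or is a
     * sub-domain of it. Matching on "." + parent keeps the test on label
     * boundaries, so "foo.com" does not include "barfoo.com".
     */
    static boolean includes(String parent, String name) {
        return name.equals(parent) || name.endsWith("." + parent);
    }

    public static void main(String[] args) {
        System.out.println(includes("foo.com", "www.foo.com")); // true
        System.out.println(includes("foo.com", "barfoo.com"));  // false
        System.out.println(includes("org", "www.example.org")); // true
    }
}
```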

NOTE: To put a top-level-domain like .com on a list, simply use com,
without the leading dot. Otherwise it will be auto-corrected.

Exception Lists

domains also supports exception lists to express rules like “‘com’ except
‘youtube.com’”. An exception list is a second file loaded in conjunction with
a domain list; also a simple text file, containing one (exception) domain name
per line.
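
A rule such as “‘com’ except ‘youtube.com’” could be evaluated roughly as follows. This is a hedged sketch, not the tool's actual logic: it assumes a name is covered when some domain list entry includes it and no exception list entry includes it, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

final class ListCheckSketch {
    /** True if name equals parent or is a sub-domain of it (label-boundary match). */
    static boolean includes(String parent, String name) {
        return name.equals(parent) || name.endsWith("." + parent);
    }

    /** True if any entry of the list includes the given name. */
    static boolean anyIncludes(List<String> entries, String name) {
        for (String entry : entries) {
            if (includes(entry, name)) {
                return true;
            }
        }
        return false;
    }

    /** Assumed semantics: matched by the domain list and not carved out by an exception. */
    static boolean isListed(List<String> domains, List<String> exceptions, String name) {
        return anyIncludes(domains, name) && !anyIncludes(exceptions, name);
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList("com");
        List<String> exceptions = Arrays.asList("youtube.com");
        System.out.println(isListed(domains, exceptions, "www.foo.com"));     // true
        System.out.println(isListed(domains, exceptions, "www.youtube.com")); // false
        System.out.println(isListed(domains, exceptions, "foo.org"));         // false
    }
}
```

Whether the result means "whitelisted" or "blacklisted" depends only on how the loaded lists are interpreted; the matching itself is the same under this assumption.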

Optimizations

Domain and Exception Lists

The following optimizations can be applied to both domain and exception lists.

Auto-Correct Malformed Domain Names

(Auto-correction is always applied to any list loaded.)

Malformed domain names are auto-corrected with a set of rules applied in the
following order:

  1. remove the last :// and everything before it
  2. remove the first : and everything after it
  3. remove the first / and everything after it
  4. ensure that the domain name does not start with a dot (.)
  5. change to lower-case letters

NOTE: Even with rule #4, [.]foo.com does not include [.]barfoo.com,
regardless of whether you put foo.com on a list with or without the leading dot.

Example

All of the following entries are auto-corrected to www.foo.com:

    http://www.foo.com/
    WWW.FOO.COM/bar
    .www.foo.com:8080
    https://www.foo.com/bar/index.html
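
The five rules above can be sketched as a single method that applies them in the stated order. This is a minimal illustration with hypothetical names, not the tool's implementation; stripping every leading dot (rather than just one) in rule 4 is an assumption.

```java
import java.util.Locale;

final class AutoCorrectSketch {
    static String autoCorrect(String entry) {
        String s = entry;

        // Rule 1: remove the last "://" and everything before it.
        int i = s.lastIndexOf("://");
        if (i >= 0) {
            s = s.substring(i + 3);
        }
        // Rule 2: remove the first ":" and everything after it.
        i = s.indexOf(':');
        if (i >= 0) {
            s = s.substring(0, i);
        }
        // Rule 3: remove the first "/" and everything after it.
        i = s.indexOf('/');
        if (i >= 0) {
            s = s.substring(0, i);
        }
        // Rule 4: ensure the name does not start with a dot
        // (assumption: all leading dots are stripped).
        while (s.startsWith(".")) {
            s = s.substring(1);
        }
        // Rule 5: change to lower-case letters.
        return s.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String[] entries = {
            "http://www.foo.com/",
            "WWW.FOO.COM/bar",
            ".www.foo.com:8080",
            "https://www.foo.com/bar/index.html"
        };
        for (String entry : entries) {
            System.out.println(autoCorrect(entry)); // prints www.foo.com each time
        }
    }
}
```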

Sorting

(Sorting is always applied to any list loaded.)

Domain names are sorted as reverse-strings (foo.com as moc.oof) to keep
different sub-domains next to each other.

Example

    www.foo.com
    bar.net
    ftp.foo.com
    www.bar.net
    foo.com

becomes

    foo.com
    ftp.foo.com
    www.foo.com
    bar.net
    www.bar.net
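
Sorting by reversed strings can be sketched with a comparator that reverses each name before comparing. The names below are hypothetical and the sketch assumes Java 8 or later for `Comparator.comparing`; it is not taken from the tool's source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ReverseSortSketch {
    /** "foo.com" becomes "moc.oof". */
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** Returns a sorted copy; the input list is left untouched. */
    static List<String> sortReversed(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.comparing(ReverseSortSketch::reverse));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "www.foo.com", "bar.net", "ftp.foo.com", "www.bar.net", "foo.com");
        // prints [foo.com, ftp.foo.com, www.foo.com, bar.net, www.bar.net]
        System.out.println(sortReversed(names));
    }
}
```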

Remove Duplicate List Entries

(De-duplication is always applied to any list loaded.)

Each domain name is only allowed to appear once on a list.

NOTE: Because of auto-correction, different list entries can result in the
same domain name; such entries are then de-duplicated as well.

Remove Redundant List Entries

Domain names always include all their sub-domains. Therefore, in the following
list, all entries except com are redundant:

    foo.com
    com
    bar.com

Use the -r or --remove-redundant command-line option to remove the
redundant list entries.
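
One way to picture the redundancy check is a simple quadratic pass that drops every entry that is a sub-domain of another entry on the same list. This is a sketch under the assumption that the list is already auto-corrected and de-duplicated; the names are hypothetical and the tool may well use a different (e.g. sort-based) approach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RedundancySketch {
    /** True if name is a strict sub-domain of parent (label-boundary match). */
    static boolean isSubDomainOf(String name, String parent) {
        return name.endsWith("." + parent);
    }

    /** Keeps only entries that are not covered by another entry of the same list. */
    static List<String> removeRedundant(List<String> entries) {
        List<String> kept = new ArrayList<>();
        for (String candidate : entries) {
            boolean redundant = false;
            for (String other : entries) {
                if (isSubDomainOf(candidate, other)) {
                    redundant = true;
                    break;
                }
            }
            if (!redundant) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // prints [com]
        System.out.println(removeRedundant(Arrays.asList("foo.com", "com", "bar.com")));
    }
}
```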

Exception Lists

The following optimizations can only be applied when loading a domain list and
an exception list.

Use the -e or --exceptions command-line option to load an exception list.

Example

Load the domain list from domains.txt and the exception list from
exceptions.txt:

    java -jar domains.jar -d domains.txt -e exceptions.txt

Remove Obsolete Domain List Entries

Any domain list entry included in an exception list entry is obsolete.

Example

Domain list:

    www.foo.com

Exception list:

    foo.com

Since the exception foo.com includes www.foo.com on the domain list,
www.foo.com can be removed from the domain list.

Use the -o or --remove-obsolete-domains command-line option to remove the
obsolete domain list entries.
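
The obsolete-entry rule can be sketched as a filter over the domain list. The names are hypothetical, both lists are assumed to be auto-corrected, and treating a domain entry that is identical to an exception entry as obsolete too is an assumption not confirmed by the text above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ObsoleteSketch {
    /** True if name equals parent (assumption) or is a sub-domain of it. */
    static boolean includes(String parent, String name) {
        return name.equals(parent) || name.endsWith("." + parent);
    }

    /** Returns the domain list without entries covered by any exception entry. */
    static List<String> removeObsolete(List<String> domains, List<String> exceptions) {
        List<String> kept = new ArrayList<>();
        for (String domain : domains) {
            boolean obsolete = false;
            for (String exception : exceptions) {
                if (includes(exception, domain)) {
                    obsolete = true;
                    break;
                }
            }
            if (!obsolete) {
                kept.add(domain);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList("www.foo.com", "bar.net");
        List<String> exceptions = Arrays.asList("foo.com");
        // prints [bar.net]
        System.out.println(removeObsolete(domains, exceptions));
    }
}
```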

Remove Unused Exception List Entries

Any exception that is not a sub-domain of a domain list entry is unused.

Example

Domain list:

    foo.com

Exception list:

    bar.com

Since bar.com is not a sub-domain of any domain list entry, it can be removed
from the exception list.

Use the -u or --remove-unused-exceptions command-line option to remove the
unused exception list entries.
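
The unused-exception rule is the mirror image and can be sketched the same way. Again the names are hypothetical and the lists are assumed to be auto-corrected; the sketch reads "sub-domain" strictly, so an exception identical to a domain list entry counts as unused here, which is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class UnusedExceptionSketch {
    /** True if name is a strict sub-domain of parent (label-boundary match). */
    static boolean isSubDomainOf(String name, String parent) {
        return name.endsWith("." + parent);
    }

    /** Keeps only exceptions that are a sub-domain of at least one domain list entry. */
    static List<String> removeUnused(List<String> domains, List<String> exceptions) {
        List<String> kept = new ArrayList<>();
        for (String exception : exceptions) {
            for (String domain : domains) {
                if (isSubDomainOf(exception, domain)) {
                    kept.add(exception);
                    break;
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList("foo.com");
        List<String> exceptions = Arrays.asList("bar.com", "www.foo.com");
        // prints [www.foo.com]
        System.out.println(removeUnused(domains, exceptions));
    }
}
```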

Save Optimized Lists

All optimizations are applied to copies of the domain and/or exception list
files loaded into memory. The original files are never changed but you can save
the optimized lists from memory as new files.

Use the -s or --save-domains command-line option to save the optimized
domain list, and the -x or --save-exceptions command-line option to save the
optimized exception list as a new file.

Example

Load the domain list from domains.txt and save the optimized domain list as
domains-optimized.txt:

    java -jar domains.jar -d domains.txt -s domains-optimized.txt

NOTE: If the target file already exists, it will be overwritten.

Build Instructions

A pretty up-to-date version of domains can be downloaded here.

Or you can easily build the latest version by yourself.

Requirements

Download the latest source code as a ZIP file or use Git:

    git clone https://github.com/wrzlbrmft/domains.git

Change into the unzipped or the checkout directory and run Maven:

    mvn package

The uber-jar containing both the compiled source code and all of its
dependencies is saved in the target/ directory.

Simply run it:

    java -jar target/domains.jar

License

This software is distributed under the terms of the
GNU General Public License v3.