Project author: leonkozlowski

Project description:
Yet another (inverted) index generator
Language: Go
Repository: git://github.com/leonkozlowski/yaig.git
Created: 2020-08-06T03:26:12Z
Project community: https://github.com/leonkozlowski/yaig


yaig

Yet another index generator

Getting started

This project requires Go to be installed. On OS X with Homebrew you can just run brew install go.

Running it then should be as simple as:

  $ make
  $ ./bin/yaig

Or, for CLI usage:

  $ go build
  $ ./yaig input road.txt

Usage

Yaig creates an inverted index for a given "*.txt" input file.

Example

Given a sample input:

The Road Not Taken
BY ROBERT FROST
Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;

Then took the other, as just as fair,
And having perhaps the better claim,
Because it was grassy and wanted wear;
Though as for that the passing there
Had worn them really about the same,

And both that morning equally lay
In leaves no step had trodden black.
Oh, I kept the first for another day!
Yet knowing how way leads on to way,
I doubted if I should ever come back.

I shall be telling this with a sigh
Somewhere ages and ages hence:
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.

From this input, yaig produces an inverted index.

The inverted index is a map of keyword entries.

Each entry pairs a keyword string with a list of Entry values, one per occurrence of the word:

  ages:[{1 125} {1 127}]

The Entry struct comprises two parts:

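  • document: a unique document identifier
  • index: the zero-based position of the word in the document

For the sample input, the full index is shown below. Words are lowercased, and common words such as "the" and "and" do not appear in it: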
  map[
  ages:[{1 125} {1 127}]
  bent:[{1 40}]
  better:[{1 56}]
  black:[{1 91}]
  claim:[{1 57}]
  come:[{1 114}]
  day:[{1 99}]
  difference:[{1 150}]
  diverged:[{1 9} {1 131}]
  doubted:[{1 109}]
  equally:[{1 83}]
  fair:[{1 51}]
  far:[{1 33}]
  frost:[{1 6}]
  grassy:[{1 61}]
  having:[{1 53}]
  just:[{1 49}]
  kept:[{1 94}]
  knowing:[{1 101}]
  lay:[{1 84}]
  leads:[{1 104}]
  leaves:[{1 86}]
  long:[{1 25}]
  looked:[{1 29}]
  morning:[{1 82}]
  oh:[{1 92}]
  passing:[{1 70}]
  really:[{1 75}]
  road:[{1 1}]
  roads:[{1 8} {1 130}]
  robert:[{1 5}]
  shall:[{1 117}]
  sigh:[{1 123}]
  sorry:[{1 15}]
  step:[{1 88}]
  stood:[{1 27}]
  taken:[{1 3}]
  telling:[{1 119}]
  took:[{1 45} {1 138}]
  travel:[{1 19}]
  traveled:[{1 142}]
  traveler:[{1 24}]
  trodden:[{1 90}]
  undergrowth:[{1 43}]
  wanted:[{1 63}]
  way:[{1 103} {1 107}]
  wear:[{1 64}]
  wood:[{1 13} {1 134}]
  worn:[{1 73}]
  yellow:[{1 12}]
  ]
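
Sketch

The structure described above can be sketched in a few lines of Go. This is a minimal illustration, not yaig's actual source: the field names Document and Index, the BuildIndex function, the tokenizing rule (split on non-letters), and the tiny stop-word list are all assumptions made for the example. It only aims to reproduce the shape of the output shown above for the first three lines of the sample.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Entry records one occurrence of a word. The field names are assumed from
// the README's description; the real struct in yaig may differ.
type Entry struct {
	Document int // unique document identifier
	Index    int // zero-based position of the word in the document
}

// stopWords is a small, hypothetical stop list used only for this sketch.
// yaig's actual list is not shown in the README.
var stopWords = map[string]bool{
	"the": true, "not": true, "by": true, "two": true, "in": true, "a": true,
}

// BuildIndex splits text on non-letter characters, lowercases each token,
// and appends an Entry for every token that is not a stop word. Positions
// count all tokens, including the skipped stop words.
func BuildIndex(docID int, text string, index map[string][]Entry) {
	words := strings.FieldsFunc(text, func(r rune) bool {
		return !unicode.IsLetter(r)
	})
	for pos, w := range words {
		w = strings.ToLower(w)
		if stopWords[w] {
			continue
		}
		index[w] = append(index[w], Entry{Document: docID, Index: pos})
	}
}

func main() {
	index := make(map[string][]Entry)
	sample := "The Road Not Taken\nBY ROBERT FROST\nTwo roads diverged in a yellow wood,"
	BuildIndex(1, sample, index)

	// Look up the occurrences of two keywords.
	fmt.Println(index["road"], index["wood"]) // prints: [{1 1}] [{1 13}]
}
```

Because the map value is a slice, a word that appears several times simply accumulates more Entry values under the same key, which is how "ages" ends up with two entries in the full output.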

Testing

  $ make test