Project author: JuliaNeighbors

Project description:
Inverted file system for billion-scale ANN search
Language: Julia
Repository: git://github.com/JuliaNeighbors/IVFADC.jl.git
Created: 2019-04-19T17:04:38Z
Project community: https://github.com/JuliaNeighbors/IVFADC.jl

License: MIT License


Inverted file system with asymmetric distance computation for billion-scale approximate nearest neighbor search.

Installation

    using Pkg
    Pkg.add("IVFADC")

or

    Pkg.add(PackageSpec(url="https://github.com/JuliaNeighbors/IVFADC.jl", rev="master"))

for the latest master branch.

Examples

Create an index

    using IVFADC
    using Distances

    nrows, nvectors = 50, 1_000
    data = rand(Float32, nrows, nvectors)

    kc = 100  # coarse vectors (i.e. Voronoi cells)
    k = 256   # residual quantization levels/codebook
    m = 10    # residual quantizer codebooks

    ivfadc = IVFADCIndex(data,
                         kc=kc,
                         k=k,
                         m=m,
                         coarse_quantizer=:naive,
                         coarse_distance=SqEuclidean(),
                         quantization_distance=SqEuclidean(),
                         quantization_method=:pq,
                         index_type=UInt16)
    # IVFADCIndex, naive coarse quantizer, 12-byte encoding (2 + 1×10), 1000 Float32 vectors
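
The parameters above and the "12-byte encoding (2 + 1×10)" summary can be illustrated with a toy encoder written in plain Julia. This is only a sketch of the general idea (a coarse cell id plus product-quantized residual codes, with distances computed against the unquantized query), not the package's implementation. The names `sqdist`, `nearest`, `subvec`, `toy_encode`, `toy_adc` and `toy_reconstruct` are hypothetical, and the centroids are random rather than trained.

```julia
using Random

# Squared Euclidean distance (named `sqdist` to avoid clashing with Distances.jl).
sqdist(a, b) = sum(abs2, a .- b)

# 0-based index of the column of `centroids` that is nearest to `v`.
nearest(v, centroids) = argmin([sqdist(v, c) for c in eachcol(centroids)]) - 1

# Part of `v` that belongs to subspace `j` when `v` is split into `m` equal slices.
subvec(v, j, m) = view(v, (j - 1) * (length(v) ÷ m) + 1 : j * (length(v) ÷ m))

# Encode `x` as a coarse cell id plus m one-byte codes of its residual.
function toy_encode(x, coarse, codebooks)
    cell = nearest(x, coarse)
    residual = x .- coarse[:, cell + 1]
    m = length(codebooks)
    codes = [UInt8(nearest(subvec(residual, j, m), codebooks[j])) for j in 1:m]
    return UInt16(cell), codes
end

# Asymmetric distance: the query `q` is not quantized, while the database
# vector is known only through its (cell, codes) pair.
function toy_adc(q, cell, codes, coarse, codebooks)
    qres = q .- coarse[:, Int(cell) + 1]
    m = length(codebooks)
    return sum(sqdist(subvec(qres, j, m), view(codebooks[j], :, Int(codes[j]) + 1))
               for j in 1:m)
end

# Approximate vector rebuilt from the coarse centroid and the residual codes.
toy_reconstruct(cell, codes, coarse, codebooks) =
    coarse[:, Int(cell) + 1] .+
    reduce(vcat, [codebooks[j][:, Int(codes[j]) + 1] for j in eachindex(codebooks)])

Random.seed!(1)
dim, kc, k, m = 50, 100, 256, 10
coarse = rand(Float32, dim, kc)                                 # untrained coarse centroids
codebooks = [rand(Float32, dim ÷ m, k) .- 0.5f0 for _ in 1:m]   # untrained residual codebooks

x, q = rand(Float32, dim), rand(Float32, dim)
cell, codes = toy_encode(x, coarse, codebooks)

bytes_per_vector = sizeof(cell) + sizeof(codes)   # 2 + 1×10
approx = toy_adc(q, cell, codes, coarse, codebooks)
exact_to_reconstruction = sqdist(q, toy_reconstruct(cell, codes, coarse, codebooks))
```

With `index_type=UInt16` and `k = 256` (one byte per code), each vector takes 2 + 10 = 12 bytes in this sketch, compared with 50 × 4 = 200 bytes for the raw `Float32` vector. The asymmetric distance equals the distance from the query to the reconstructed vector, up to floating-point rounding.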

Add points to and delete points from the index

Points can be added to the index by using the push! and pushfirst! methods.
Removing points from the index can be performed using the pop!, popfirst! and
delete_from_index! methods.

    for i in 1:15
        push!(ivfadc, rand(Float32, nrows))
    end
    length(ivfadc)
    # 1015

    delete_from_index!(ivfadc, [1000, 1001, 1010, 1015])
    length(ivfadc)
    # 1011

The pop! and popfirst! methods remove the last and the first indexed point, respectively, and return the corresponding indexed (and quantized) vector.

    pop!(ivfadc)
    # 50-element Array{Float32,1}:
    #  0.30565456
    #  0.6903644
    #  ⋮
    #  0.20116138
    #  0.90699536

    popfirst!(ivfadc)
    # 50-element Array{Float32,1}:
    #  0.29412186
    #  0.0709379
    #  ⋮
    #  0.51727176
    #  0.69718516

    length(ivfadc)
    # 1009

Search the index

    point = data[:, 123];
    idxs, dists = knn_search(ivfadc, point, 3)
    # (UInt16[0x007a, 0x0237, 0x0081], Float32[4.303085, 10.026548, 10.06385])

    int_idxs = Int.(idxs) .+ 1  # retrieve 1-based integer neighbors
    # 3-element Array{Int64,1}:
    #  123
    #  568
    #  130
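
Because the search is approximate, it can be useful to compare its results against an exact brute-force search on the raw data. The following is a minimal sketch in plain Julia. `exact_knn` and `recall` are hypothetical helper names that are not part of the package, and the small `toy` matrix stands in for real data.

```julia
# Exact k nearest neighbours of `point` among the columns of `data`,
# using the squared Euclidean distance. Returns 1-based column indices.
function exact_knn(data::AbstractMatrix, point::AbstractVector, k::Integer)
    dists = vec(sum(abs2, data .- point, dims=1))
    idxs = collect(partialsortperm(dists, 1:k))
    return idxs, dists[idxs]
end

# Fraction of the exact neighbours that the approximate search also returned.
recall(approx_idxs, exact_idxs) =
    length(intersect(approx_idxs, exact_idxs)) / length(exact_idxs)

# Toy data: four 2-dimensional points stored as columns.
toy = Float32[0 1 5 9;
              0 1 5 9]
exact_idxs, exact_dists = exact_knn(toy, Float32[0, 0], 2)
toy_recall = recall([1, 3], exact_idxs)   # pretend the approximate search returned columns 1 and 3
```

On a freshly built index, something like `recall(Int.(idxs) .+ 1, first(exact_knn(data, point, 3)))` would give a rough quality estimate. This assumes the index positions still correspond to the columns of `data`, which may no longer hold after points have been added or removed.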

Features

To keep up with the latest features, please consult NEWS.md and the documentation.

License

The code is released under the MIT license and is therefore free to use.

Reporting Bugs

This is a work in progress and bugs may still be present… ¯\_(ツ)_/¯ Do not worry, just open an issue to report a bug or request a feature.

References