Project author: felipemoraes

Project description:
node-indri is an addon that brings Node.js and the Indri search engine together
Primary language: C++
Repository URL: git://github.com/felipemoraes/node-indri.git
Created: 2018-03-26T14:09:58Z
Project community: https://github.com/felipemoraes/node-indri


node-indri is a native Node.js module that integrates Node.js with the Indri search engine.

Please see releases for older versions.

Setup

These instructions are for Ubuntu and macOS, but the steps can be adapted to any major platform with a GCC compiler.

  • Install Node.js (version 12.0 or later)

  • Install CMake, zlib, and cmake-js

        sudo apt install cmake zlib1g-dev
        sudo npm install cmake-js -g
  • Clone node-indri

        git clone https://github.com/felipemoraes/node-indri.git
        cd node-indri
  • Install the customized Indri by Fernando Diaz (replace {NODE_INDRI_PATH} with the path to your node-indri checkout):

        git clone https://github.com/diazf/indri.git
        cd indri
        ./configure CXX="g++ -D_GNU_SOURCE=1 -D_GLIBCXX_USE_CXX11_ABI=0" --prefix={NODE_INDRI_PATH}/ext/
        make
        make install
        cd ..
  • Install node-indri

        npm install
  • To test that everything works:

        npm test

Examples

Require the native module that was built during setup. It is located at the path below, relative to the node-indri folder. Usually you will want to copy the module into the project that uses it and adjust the path accordingly.

    const indri = require('./build/Release/node-indri');
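
Because the module only exists after a successful build, a missing or misplaced binary shows up as an error thrown by require. The following is a minimal sketch of a guarded load. The helper name loadIndri is hypothetical and not part of node-indri, and unlike a plain require, the path here is resolved against the current working directory.

```javascript
const path = require('path');

// Hypothetical helper (not part of node-indri): returns the native module,
// or null if nothing can be loaded from the given path.
// The path is resolved against the current working directory.
function loadIndri(modulePath = './build/Release/node-indri') {
  try {
    return require(path.resolve(modulePath));
  } catch (err) {
    console.error(`Could not load node-indri from ${modulePath}: ${err.message}`);
    return null;
  }
}

const indri = loadIndri();
if (indri === null) {
  console.error('Run "npm install" in the node-indri folder, or adjust the path.');
}
```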

Searcher

Use the Searcher to execute a search with optional relevance feedback.
To use pseudo relevance feedback, leave feedback_docs empty and set fbDocs to the number of feedback documents you want to use for expanding the query. For no relevance feedback, pass an empty list as feedback_docs.

    const searcher = new indri.Searcher({
      "index": "index_path",
      "rules": "rules", // e.g. "method:dirichlet,mu:1000"
      "fbTerms": 10,
      "fbMu": 1500,
      "fbDocs": 10,
      "fbOrigWeight": 1.0,
      "includeDocument": true,
      "includeDocumentScore": true,
      "includeFields": [{
        "nameInIndex1": "nameInResponse1",
        "nameInIndex2": "nameInResponse2"
      }]
    });
    searcher.search(query, page, results_per_page, feedback_docs, callback);
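
Since search takes a callback as its last argument, it can be wrapped for use with async/await. The sketch below assumes a Node-style callback of the form callback(err, results), which is not confirmed above. It uses a hypothetical stubSearcher object in place of indri.Searcher so that it runs without an index, and the placeholder result objects do not reflect the real result fields.

```javascript
const { promisify } = require('util');

// Hypothetical stand-in for indri.Searcher so the sketch runs without an index.
// Assumption: the callback is Node-style, callback(err, results).
const stubSearcher = {
  search(query, page, resultsPerPage, feedbackDocs, callback) {
    // Placeholder results; the fields of real results are not modeled here.
    const results = Array.from({ length: resultsPerPage }, () => ({ query, page }));
    setImmediate(() => callback(null, results));
  },
};

// Wrap the callback-style search so it can be awaited.
// feedbackDocs defaults to an empty list.
function searchAsync(searcher, query, page, resultsPerPage, feedbackDocs = []) {
  const search = promisify(searcher.search.bind(searcher));
  return search(query, page, resultsPerPage, feedbackDocs);
}

async function main() {
  const results = await searchAsync(stubSearcher, 'information retrieval', 1, 10);
  console.log(`received ${results.length} results`);
}

main().catch(console.error);
```

With a real Searcher instance, the same searchAsync helper should apply unchanged, provided the callback assumption holds.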

Scorer

Use the Scorer to retrieve scores for a given list of documents with scoreDocuments, or the top K retrieved documents and their scores with retrieveTopKScores.

    const scorer = new indri.Scorer({
      "index": "index_path",
      "rules": "rules", // e.g. "method:dirichlet,mu:1000"
      "fbTerms": 10,
      "fbMu": 1500
    });
    scorer.scoreDocuments(query, page, docs, callback);
    scorer.retrieveTopKScores(query, page, number_results, callback);
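
The rules option for both Searcher and Scorer is a comma-separated string of key:value pairs, as in the comment above. If those parameters live in a configuration object, a small helper can assemble the string. The name buildRules is hypothetical and not part of node-indri; this is only a sketch of the string format.

```javascript
// Hypothetical helper (not part of node-indri): joins key/value pairs into
// the "key:value,key:value" form shown in the rules comment.
function buildRules(params) {
  return Object.entries(params)
    .map(([key, value]) => `${key}:${value}`)
    .join(',');
}

const rules = buildRules({ method: 'dirichlet', mu: 1000 });
console.log(rules); // method:dirichlet,mu:1000
```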

Reader

Use the Reader to retrieve an individual document by id.

    const reader = new indri.Reader({"index": "index_path"});
    reader.getDocument(docid, callback);
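
getDocument fetches one document per call, so retrieving several documents means issuing several calls. One way to collect them is sketched below, again assuming a Node-style callback(err, document). The stubReader object is a hypothetical stand-in for indri.Reader, and the shape of its documents is a placeholder.

```javascript
// Hypothetical stand-in for indri.Reader so the sketch runs without an index.
// Assumption: the callback is Node-style, callback(err, document).
const stubReader = {
  getDocument(docid, callback) {
    setImmediate(() => callback(null, { docid }));
  },
};

// Fetch several documents concurrently; results keep the order of docids.
function getDocuments(reader, docids) {
  return Promise.all(
    docids.map(
      (docid) =>
        new Promise((resolve, reject) => {
          reader.getDocument(docid, (err, doc) => (err ? reject(err) : resolve(doc)));
        })
    )
  );
}

getDocuments(stubReader, [3, 1, 2]).then((docs) => {
  console.log(docs.map((d) => d.docid)); // [ 3, 1, 2 ]
});
```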

Benchmark

The benchmark files used to compare node-indri and pyndri can be found in the benchmark folder.

Implementation details

node-indri was implemented using Native Abstractions for Node.js (NAN) and C++11. It supports a simple async searching function with relevance feedback.

Citation

If you use node-indri to produce results for your scientific publication, please refer to our ECIR 2019 paper.

    @inproceedings{moraes2019nodeindri,
      title={node-indri: moving the Indri toolkit to the modern Web stack},
      author={Moraes, Felipe and Hauff, Claudia},
      booktitle={ECIR},
      year={2019}
    }

License

MIT License