Author: loribeiro

Description:
Portuguese POS-Tagger in Node.JS
Language: JavaScript
Repository: git://github.com/loribeiro/Singularity-POS-Tagger.git
Created: 2021-02-06T02:39:42Z
Project page: https://github.com/loribeiro/Singularity-POS-Tagger


Singularity-POS-Tagger

Portuguese POS tagger written in core Node.js, without any external modules.

I developed this library to use as a base for another personal project. There is plenty of room to improve accuracy with heuristics and tweaks.

It is designed especially for Node.js Streams, which can improve speed and memory use when working on servers or with large corpora. Nonetheless, a built-in method is also available for working with strings.
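To illustrate why a stream-based design keeps memory use flat, here is a minimal sketch of a Transform stream that processes text one line at a time. It is not the package's implementation: `tagLine` and `createLineTagger` are hypothetical names, and `tagLine` is a placeholder that labels every token `UNKNOWN` instead of running a real model.

```javascript
const { Transform } = require("stream");
const { StringDecoder } = require("string_decoder");

// Hypothetical placeholder for a real tagger: labels every token UNKNOWN.
function tagLine(line) {
  return line
    .split(/\s+/)
    .filter(Boolean)
    .map((word) => `${word}/UNKNOWN`)
    .join(" ");
}

// Builds a Transform stream that tags text line by line, so only the current
// chunk and one unfinished trailing line are held in memory at any time.
function createLineTagger() {
  // StringDecoder keeps multi-byte UTF-8 characters (e.g. "ã") intact
  // when they are split across two chunks.
  const decoder = new StringDecoder("utf8");
  let pending = "";

  return new Transform({
    transform(chunk, encoding, callback) {
      pending += decoder.write(chunk);
      const lines = pending.split("\n");
      pending = lines.pop(); // the last piece may be an incomplete line
      for (const line of lines) {
        this.push(tagLine(line) + "\n");
      }
      callback();
    },
    flush(callback) {
      // Emit whatever is left once the input ends.
      pending += decoder.end();
      if (pending.length > 0) {
        this.push(tagLine(pending) + "\n");
      }
      callback();
    },
  });
}

// Example wiring (commented out so the sketch does not wait on stdin):
// process.stdin.pipe(createLineTagger()).pipe(process.stdout);
```

Because each line is pushed downstream as soon as it is complete, the size of the input does not determine how much text is buffered.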

There is no need to pre-process the corpus: a built-in function cleans the text before POS classification.
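As a rough idea of what such a cleaning step could involve, the sketch below normalizes Unicode, strips punctuation, and splits on whitespace. This is an assumption for illustration only: `cleanText` is a hypothetical function, and the package's actual cleaning rules may differ.

```javascript
// Hypothetical cleaning step; the package's real routine may differ.
function cleanText(text) {
  return text
    // Compose accented characters first, so "a" + combining tilde becomes "ã"
    // and is matched as a single letter by \p{L} below.
    .normalize("NFC")
    // Replace everything except letters, digits, whitespace, and hyphens.
    .replace(/[^\p{L}\p{N}\s\-]/gu, " ")
    .split(/\s+/)
    .filter(Boolean);
}

console.log(cleanText("Olá, mundo! Bem-vindo a São Paulo."));
// → [ 'Olá', 'mundo', 'Bem-vindo', 'a', 'São', 'Paulo' ]
```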

Because JavaScript is single-threaded and NLP jobs are usually resource-intensive, almost everything in this package runs asynchronously.

Installation

In a Node.js environment, run the following in your terminal:

npm i singularity-tagger

What is a POS Tagger?

A POS tagger, or part-of-speech tagger, is a piece of software that analyzes a corpus and tags each word with its respective grammatical class.

Applications of POS Taggers

  • Sentiment analysis
  • Question answering
  • Word sense disambiguation

Many Natural Language Processing tasks use POS tagging as a subtask.

Implementation

  • Model trained on the Mac-Morpho annotated corpus, available at: http://nilc.icmc.usp.br/macmorpho/
  • Stochastic algorithm used:
    • Hidden Markov Model (HMM) with Viterbi Algorithm
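
To make the HMM plus Viterbi idea concrete, here is a minimal, self-contained sketch of Viterbi decoding over a toy model. It is not the package's code: the three tags, the tiny vocabulary, and every probability below are invented for illustration, whereas the real model is estimated from Mac-Morpho.

```javascript
// Toy HMM. All numbers are made up for illustration only.
const states = ["ART", "N", "V"];

// P(tag at sentence start)
const start = { ART: 0.6, N: 0.3, V: 0.1 };

// P(next tag | previous tag)
const trans = {
  ART: { ART: 0.05, N: 0.9, V: 0.05 },
  N: { ART: 0.1, N: 0.2, V: 0.7 },
  V: { ART: 0.5, N: 0.3, V: 0.2 },
};

// P(word | tag)
const emit = {
  ART: { o: 0.9 },
  N: { gato: 0.6, dorme: 0.05 },
  V: { dorme: 0.7, gato: 0.01 },
};

const UNSEEN = 1e-6; // fallback probability for unseen word/tag pairs

function viterbi(words) {
  if (words.length === 0) return [];

  // Log probabilities avoid numeric underflow on long sentences.
  const logEmit = (tag, word) => Math.log(emit[tag][word] ?? UNSEEN);

  // best[i][tag]: best log-probability of a tag path ending in `tag` at word i
  // back[i][tag]: the previous tag on that best path
  const best = [{}];
  const back = [{}];
  for (const tag of states) {
    best[0][tag] = Math.log(start[tag]) + logEmit(tag, words[0]);
    back[0][tag] = null;
  }

  for (let i = 1; i < words.length; i++) {
    best.push({});
    back.push({});
    for (const tag of states) {
      let bestPrev = null;
      let bestScore = -Infinity;
      for (const prev of states) {
        const score = best[i - 1][prev] + Math.log(trans[prev][tag]);
        if (score > bestScore) {
          bestScore = score;
          bestPrev = prev;
        }
      }
      best[i][tag] = bestScore + logEmit(tag, words[i]);
      back[i][tag] = bestPrev;
    }
  }

  // Pick the best final tag, then follow the back-pointers to the start.
  const lastIndex = words.length - 1;
  let last = states[0];
  for (const tag of states) {
    if (best[lastIndex][tag] > best[lastIndex][last]) last = tag;
  }
  const path = [last];
  for (let i = lastIndex; i > 0; i--) {
    path.unshift(back[i][path[0]]);
  }
  return path;
}

console.log(viterbi(["o", "gato", "dorme"]));
// → [ 'ART', 'N', 'V' ]
```

The dynamic program keeps only the best-scoring path into each tag at each position, so decoding costs time proportional to the sentence length times the square of the number of tags, rather than enumerating every possible tag sequence.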

How to use

Singularity is designed to be used inside async functions or with ES6 Promises.
There are two main methods available:

  • analyzeString
    • receives a string as its parameter and returns a Promise that resolves to an array with the normalized words
      alongside their tags
  • analyzeStream
    • receives an input stream and an output stream as parameters and writes the normalized text alongside the tags to the output stream

Code example

  • Inside Asynchronous functions:
    • analyzeString:
      • const PosTagger = require("singularity-tagger")

        const tagger = await PosTagger()

        // resolves to an array of normalized words alongside their tags
        const taggedArray = await tagger.analyzeString(string)

        console.log(taggedArray)

    • analyzeStream:
      • const PosTagger = require("singularity-tagger")

        const tagger = await PosTagger()

        await tagger.analyzeStream(process.stdin, process.stdout) // can be any stream interface

  • Inside Synchronous functions:
    • analyzeString:
      • const PosTagger = require("singularity-tagger")

        PosTagger().then((tagger) => tagger.analyzeString(string)).then((resp) => console.log(resp)).catch((err) => console.log(err))

    • analyzeStream:
      • const PosTagger = require("singularity-tagger")

        PosTagger().then((tagger)=> tagger.analyzeStream(inputStream, outputStream)).catch(err=>console.log(err))

Tags meaning table

[Image: tag meanings table]