Kyo Suayan | suayan.com | Natural NPM Package - Named Entity Recognition (NER)

Natural NPM Package - Named Entity Recognition (NER)

Saturday, March 18th 2023

const natural = require('natural');
const tokenizer = new natural.WordTokenizer();
const ner = new natural.NER();

// Sample text to tag
const text = "John lives in New York City and works at Google.";

// Tokenize the text into individual words
const tokens = tokenizer.tokenize(text);

// Add named entities to the NER model
ner.addNamedEntityText('John', 'person');
ner.addNamedEntityText('New York City', 'location');
ner.addNamedEntityText('Google', 'organization');

// Recognize named entities in the text
const entities = ner.getEntities(tokens);

// Output the recognized entities
console.log(entities);

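In this example, we first import the natural module and create a WordTokenizer instance to split the text into individual words. We then create an NER instance, a Named Entity Recognizer that uses a rule-based approach to identify named entities in the text. Check that your installed version of natural actually exports NER before relying on this: if natural.NER is undefined, the constructor call throws a TypeError.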
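
To make the tokenization step concrete, here is a minimal standalone sketch that approximates a word tokenizer by splitting on runs of non-word characters. The helper simpleTokenize is hypothetical and is not part of natural; natural's WordTokenizer may differ in details such as how it treats apostrophes or non-Latin characters.

```javascript
// Hypothetical stand-in for a word tokenizer (not natural's implementation):
// split on runs of non-word characters and drop empty strings.
function simpleTokenize(text) {
  return text.split(/\W+/).filter(Boolean);
}

const sampleTokens = simpleTokenize('John lives in New York City and works at Google.');
console.log(sampleTokens);
```

A multi-word name such as "New York City" ends up as three separate tokens, so any matcher that works on tokens has to compare sequences of tokens rather than single words.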

Next, we register a few named entities with the NER model using the addNamedEntityText method: a person ("John"), a location ("New York City"), and an organization ("Google").

We then pass the tokens to the getEntities method of the NER instance, which matches them against the entities registered in the model. The method returns an array of objects, one per recognized entity, each containing the entity's label, start index, and end index.
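
The rule-based lookup described above can be sketched without any library. The following is a minimal dictionary (gazetteer) matcher, not natural's implementation: the names gazetteer, splitWords, and findEntities are hypothetical, and the choice of token indices with an exclusive end is an assumption about what "start index" and "end index" mean.

```javascript
// Hypothetical gazetteer: each entry pairs a phrase with its entity label.
const gazetteer = [
  { phrase: 'John', label: 'person' },
  { phrase: 'New York City', label: 'location' },
  { phrase: 'Google', label: 'organization' },
];

// Simple tokenizer: split on runs of non-word characters.
function splitWords(text) {
  return text.split(/\W+/).filter(Boolean);
}

// Scan the token list and report every position where a gazetteer phrase
// matches token for token. start and end are token indices (end exclusive).
function findEntities(tokenList, entries) {
  const found = [];
  for (let i = 0; i < tokenList.length; i++) {
    for (const { phrase, label } of entries) {
      const phraseWords = splitWords(phrase);
      if (phraseWords.length === 0) continue; // skip empty phrases
      const candidate = tokenList.slice(i, i + phraseWords.length);
      const isMatch =
        candidate.length === phraseWords.length &&
        phraseWords.every((word, k) => word === candidate[k]);
      if (isMatch) {
        found.push({ text: phrase, label, start: i, end: i + phraseWords.length });
      }
    }
  }
  return found;
}

const sentenceTokens = splitWords('John lives in New York City and works at Google.');
const matches = findEntities(sentenceTokens, gazetteer);
console.log(matches);
```

Matching is exact and case-sensitive here, which keeps the sketch short; a practical matcher would likely normalize case and handle overlapping phrases.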

Finally, we output the recognized entities to the console.

Note that the natural module also ships trainable classifiers, such as BayesClassifier and LogisticRegressionClassifier, which could serve as the basis for a machine learning-based approach to Named Entity Recognition trained on your own dataset.
