Amazon Elasticsearch Service now supports the Seunjeon plugin, a popular open-source Korean-language text analyzer that makes it easier for developers to implement full-text search on Korean documents. The plugin uses a Korean-language dictionary internally and can recognize compound words and split them into terms based on context. Developers can use the plugin for Korean text analysis operations such as tokenizing (separating a string into words), stemming (reducing words to their root form), removing stop words (frequent, low-value terms), and matching based on synonyms.
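As a rough illustration of how these operations might be wired together, the sketch below builds an index-creation request body with a custom analyzer. It rests on several assumptions not stated in the announcement: that the plugin registers its tokenizer under the type name `seunjeon_tokenizer`, that the standard Elasticsearch `stop` and `synonym` token filters are used for stop words and synonyms, and that the names `korean_tokenizer`, `korean_stop`, `korean_synonyms`, and `korean`, along with the sample word lists, are purely hypothetical. It only prints the JSON and does not contact a cluster.

```python
import json


def build_korean_index_settings(stopwords, synonyms):
    """Return an index-creation body with a custom Korean analyzer.

    Assumption: the Seunjeon plugin exposes a tokenizer type named
    "seunjeon_tokenizer". All other names here are hypothetical.
    """
    return {
        "settings": {
            "index": {
                "analysis": {
                    # Tokenizing: split Korean text into terms.
                    "tokenizer": {
                        "korean_tokenizer": {"type": "seunjeon_tokenizer"}
                    },
                    "filter": {
                        # Stop words: drop frequent, low-value terms.
                        "korean_stop": {
                            "type": "stop",
                            "stopwords": list(stopwords),
                        },
                        # Synonyms: treat listed terms as equivalent.
                        "korean_synonyms": {
                            "type": "synonym",
                            "synonyms": list(synonyms),
                        },
                    },
                    "analyzer": {
                        "korean": {
                            "type": "custom",
                            "tokenizer": "korean_tokenizer",
                            "filter": ["korean_stop", "korean_synonyms"],
                        }
                    },
                }
            }
        }
    }


# Sample word lists are placeholders, not recommended values.
body = build_korean_index_settings(["그리고"], ["노트북, 랩탑"])
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON could then be sent as the body of an index-creation request to a domain that has the plugin available, and a text field mapped with the `korean` analyzer would be tokenized and filtered at index and search time.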
from What's New http://ift.tt/2HyB5O3