NLPIR Chinese lexical analysis system

Haidian District, Beijing

Business Details

  The main functions include Chinese word segmentation, English tokenization, part-of-speech (POS) tagging, named entity recognition, new word identification, keyword extraction, and support for user-defined lexicons. The NLPIR system is compatible with encodings including GBK and UTF-8, runs on operating systems including Windows, Linux, Android, and iOS, and can be invoked from programming languages such as Java, Python, C, and C#.

  Word segmentation for both Chinese and English

  Automatic tokenization and POS tagging for both Chinese and English. The detailed functions are Chinese word segmentation, English tokenization, POS tagging, unknown word recognition, and support for user-defined lexicons.

  Keyword extraction

  We use an information entropy algorithm to extract keywords, including both listed and unlisted words. As an example, the system automatically extracted keywords from the political report of the 3rd Plenary Session of the 18th CPC Central Committee.

  New word identification and adaptive word segmentation

  New words are identified from ordinary untagged text using information entropy. They are then added to the training of the language model, which lets the segmenter adapt to the new vocabulary.
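
  The listing does not describe the entropy measure in detail. One common reading is branching entropy: a character string that occurs often and is preceded and followed by many different characters is likely to be a standalone word. The sketch below illustrates that idea only and is not NLPIR's implementation; the function names `entropy` and `find_new_words` and all thresholds are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(counter):
    """Shannon entropy (in bits) of a Counter of neighbor characters."""
    total = sum(counter.values())
    return -sum(c / total * math.log2(c / total) for c in counter.values())


def find_new_words(text, known=(), max_len=3, min_freq=3, min_entropy=1.0):
    """Return (candidate, frequency, score) tuples for likely new words.

    Illustrative only: thresholds and the scoring rule are assumptions.
    The score is the smaller of the left and right neighbor entropies.
    """
    known = set(known)
    freq = Counter()
    left = defaultdict(Counter)   # characters seen just before each candidate
    right = defaultdict(Counter)  # characters seen just after each candidate

    for n in range(2, max_len + 1):
        for i in range(len(text) - n + 1):
            gram = text[i:i + n]
            freq[gram] += 1
            if i > 0:
                left[gram][text[i - 1]] += 1
            if i + n < len(text):
                right[gram][text[i + n]] += 1

    result = []
    for gram, count in freq.items():
        if count < min_freq or gram in known:
            continue  # too rare, or already in the lexicon
        score = min(entropy(left[gram]), entropy(right[gram]))
        if score >= min_entropy:
            result.append((gram, count, score))

    # Most freely combining candidates first, ties broken by frequency.
    return sorted(result, key=lambda item: (-item[2], -item[1]))
```

  In a real pipeline the accepted candidates would be added to the lexicon or language model before the text is segmented again; that step is omitted here.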

  User-defined lexicon support

  User-defined words can be added to the NLPIR system one at a time or imported in batches. The user-defined lexicon refines the final segmentation results in real time.
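
  To show how a user lexicon can change segmentation output immediately, here is a minimal sketch using forward maximum matching, a simple dictionary-based technique. It is not NLPIR's algorithm or API; the class `LexiconSegmenter` and its methods `add_word`, `import_words`, and `segment` are hypothetical names chosen for illustration.

```python
class LexiconSegmenter:
    """Toy forward-maximum-matching segmenter with a mutable user lexicon."""

    def __init__(self, words=()):
        self.words = set(words)
        self.max_len = max((len(w) for w in self.words), default=1)

    def add_word(self, word):
        """Add a single user-defined word."""
        self.words.add(word)
        self.max_len = max(self.max_len, len(word))

    def import_words(self, lines):
        """Batch-import words, one per line; blank lines are skipped."""
        for line in lines:
            word = line.strip()
            if word:
                self.add_word(word)

    def segment(self, text):
        """Greedily take the longest lexicon match at each position."""
        tokens, i = [], 0
        while i < len(text):
            longest = min(self.max_len, len(text) - i)
            for size in range(longest, 0, -1):
                piece = text[i:i + size]
                # A single character is always accepted as a fallback.
                if size == 1 or piece in self.words:
                    tokens.append(piece)
                    i += size
                    break
        return tokens


segmenter = LexiconSegmenter(["中文", "分词"])
before = segmenter.segment("中文分词系统")
segmenter.add_word("分词系统")  # takes effect on the next call
after = segmenter.segment("中文分词系统")
```

  Because the lexicon is an in-memory set, a newly added word affects the very next `segment` call, which is the behavior the paragraph above describes as real-time refinement.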



Copyright © 2018 Tradezone.com. All Rights Reserved