1.1 Abstract

The Fumi package consists of several commands for text mining of Japanese text.

Data mining mostly deals with quantitative information in a structured format. In recent years, advances in computer performance and data analysis technology have increased the capacity to handle unstructured information. This type of data contains numbers, times, and facts. An example of unstructured information is an essay written by a human (natural language).

This package contains JUMAN, a morphological analyser for Japanese, and KNP, a case structure analyser. Both were developed by the Kurohashi and Kawahara lab at the Department of Intelligence Science and Technology, Kyoto University. Because the results are saved in CSV format, they can be processed further with mcommand to build analytical models. Please refer to the official JUMAN and KNP websites for more details.