
JUMAN at Kyoto University

 

JUMAN is a morphological parser for Japanese that was developed at Kyoto University by Prof. Nagao's team. Given a sentence, JUMAN segments it into morphemes and indicates the morphological class of each morpheme (e.g. noun, verb). It relies on two dictionaries to obtain this result.

One advantage of JUMAN is its modularity and adaptability. To handle a large number of different morphological formalisms or grammars, it was conceived as a kernel that works with the dictionaries supplied by any user. Although it is distributed with a complete configuration (including a 120,000-word dictionary and a list of 14 morphological classes), it can easily be adapted to any personal formalism and dictionary. At Kyoto University, it is currently used with the 230,000-entry EDR dictionary and deals with 3,000 classes of morphemes. When several parsing candidates exist for one sentence, JUMAN prefers the one that contains the smallest number of unknown words, morphemes and independent words.
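The preference rule described above can be illustrated with a small sketch. This is not JUMAN's actual implementation: the toy lexicon, the romanized entries, the `is_independent` flag and the function names are all hypothetical, and comparing the three counts in lexicographic order (unknown words first, then morphemes, then independent words) is an assumption, since the text does not say how the three criteria are combined.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: surface form -> (morphological class, is_independent).
# Romanized stand-ins, not real JUMAN dictionary entries.
LEXICON: Dict[str, Tuple[str, bool]] = {
    "hon": ("noun", True),
    "wo": ("particle", False),
    "yomu": ("verb", True),
    "yo": ("particle", False),
    "mu": ("noun", True),
}

Morpheme = Tuple[str, str, bool]  # (surface, class, is_independent)


def candidates(text: str) -> List[List[Morpheme]]:
    """Enumerate all segmentations of text (exponential; short inputs only)."""
    if not text:
        return [[]]
    results: List[List[Morpheme]] = []
    # Try every prefix that is a known lexicon entry.
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in LEXICON:
            cls, indep = LEXICON[prefix]
            for rest in candidates(text[end:]):
                results.append([(prefix, cls, indep)] + rest)
    # Fallback: treat the first character as an unknown word.
    for rest in candidates(text[1:]):
        results.append([(text[0], "unknown", True)] + rest)
    return results


def cost(parse: List[Morpheme]) -> Tuple[int, int, int]:
    """Assumed ordering: fewest unknown words, then morphemes, then independent words."""
    unknown = sum(1 for _, cls, _ in parse if cls == "unknown")
    independent = sum(1 for _, _, indep in parse if indep)
    return (unknown, len(parse), independent)


def best_parse(text: str) -> List[Morpheme]:
    """Return the candidate with the lowest cost under the assumed ordering."""
    return min(candidates(text), key=cost)


if __name__ == "__main__":
    for surface, cls, _ in best_parse("honwoyomu"):
        print(surface, cls)
```

In this toy setting, "honwoyomu" has two candidates made only of known entries (hon/wo/yomu and hon/wo/yo/mu); the rule keeps the first because it has fewer morphemes. A real analyser would search a lattice rather than enumerate every candidate.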

JUMAN is used as a morphological parser in many laboratories, in Japan as well as abroad. It is free and can be downloaded from Dr. Kurohashi's laboratory at Kyoto University, or from Prof. Matsumoto's laboratory at NAIST.



Jean-Philippe Vert
Sun Dec 6 11:05:42 MET 1998