
KNP: a parser for long sentences, developed at Kyoto University

The Kurohashi-Nagao Parser (KNP) is an algorithm designed for parsing long sentences, a classically hard problem when words far apart from each other are related. Its basic assumption is that a long sentence usually contains conjunctive structures that link noun phrases or clauses with each other. For example, in the sentence "Paul liked the book and enjoyed the movie", the word "and" links two parallel clauses. This kind of structure is very common in Japanese and creates many ambiguities for syntactic parsers.

To discover these conjunctive structures, members of the Nagao Laboratory at Kyoto University proposed the KNP algorithm. It computes the similarity between the sequences of words to the left and to the right of a conjunction and, using dynamic programming techniques, selects the two sequences that are similar enough to be considered parts of one conjunctive structure.
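
The idea can be illustrated with a small Python sketch. It is not the authors' implementation: it assumes a plain global-alignment score (with a hypothetical gap penalty of -0.5) as the dynamic programming step, a toy part-of-speech table as the only similarity cue, and an exhaustive search over candidate spans. The names align_score, best_conjunctive_structure, TOY_POS and toy_similarity are invented for this example.

```python
# Toy part-of-speech table (an assumption made for this example only).
TOY_POS = {
    "Paul": "PROPN", "liked": "VERB", "enjoyed": "VERB",
    "the": "DET", "book": "NOUN", "movie": "NOUN",
}


def toy_similarity(a, b):
    """Return 1.0 when both words have the same known category, else 0.0."""
    tag_a, tag_b = TOY_POS.get(a), TOY_POS.get(b)
    return 1.0 if tag_a is not None and tag_a == tag_b else 0.0


def align_score(left, right, sim, gap=-0.5):
    """Score the best alignment of two word sequences by dynamic programming."""
    rows, cols = len(left) + 1, len(right) + 1
    table = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = i * gap          # left words aligned with nothing
    for j in range(1, cols):
        table[0][j] = j * gap          # right words aligned with nothing
    for i in range(1, rows):
        for j in range(1, cols):
            table[i][j] = max(
                table[i - 1][j - 1] + sim(left[i - 1], right[j - 1]),  # pair two words
                table[i - 1][j] + gap,                                  # skip a left word
                table[i][j - 1] + gap,                                  # skip a right word
            )
    return table[-1][-1]


def best_conjunctive_structure(tokens, conj_index, sim):
    """Try every span ending before and starting after the conjunction."""
    best = None
    for start in range(conj_index):
        for end in range(conj_index + 2, len(tokens) + 1):
            left = tokens[start:conj_index]
            right = tokens[conj_index + 1:end]
            score = align_score(left, right, sim)
            if best is None or score > best[0]:
                best = (score, left, right)
    return best


tokens = "Paul liked the book and enjoyed the movie".split()
score, left, right = best_conjunctive_structure(tokens, tokens.index("and"), toy_similarity)
print(left, right, score)
```

On the example sentence, this sketch selects "liked the book" and "enjoyed the movie" as the two parts of the conjunctive structure, because including "Paul" on the left costs a gap and shorter spans lose matching pairs.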

The similarity computation between blocks of words is the key to the algorithm. It is based on comparing the morphological categories of the words in the two blocks, the Japanese characters they use, and their semantic classes as obtained from a thesaurus.
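
As a rough illustration of how these three cues could be combined for a pair of words, here is a minimal Python sketch. The Word record, the weights, the character-overlap measure and the semantic class labels are all assumptions made for the example; the actual scoring used by KNP is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    surface: str     # written form of the word
    category: str    # morphological category (hypothetical labels)
    sem_class: str   # semantic class, as if looked up in a thesaurus (hypothetical)


# Hypothetical weights for the three cues; not taken from KNP.
CATEGORY_WEIGHT = 1.0
CHARACTER_WEIGHT = 1.0
SEMANTIC_WEIGHT = 2.0


def character_overlap(a, b):
    """Fraction of distinct characters shared by the two surface forms."""
    chars_a, chars_b = set(a), set(b)
    union = chars_a | chars_b
    return len(chars_a & chars_b) / len(union) if union else 0.0


def word_similarity(w1, w2):
    """Combine category, character and semantic-class cues into one score."""
    score = 0.0
    if w1.category == w2.category:
        score += CATEGORY_WEIGHT
    score += CHARACTER_WEIGHT * character_overlap(w1.surface, w2.surface)
    if w1.sem_class == w2.sem_class:
        score += SEMANTIC_WEIGHT
    return score


tokyo_univ = Word("東京大学", "noun", "institution")
kyoto_univ = Word("京都大学", "noun", "institution")
run = Word("走る", "verb", "motion")

print(word_similarity(tokyo_univ, kyoto_univ))
print(word_similarity(tokyo_univ, run))
```

Two words with the same category, shared characters and the same semantic class score higher than an unrelated pair, which is the property the block-level comparison relies on.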



Jean-Philippe Vert
Sun Dec 6 11:05:42 MET 1998