
Kanji retrieval at Kyoto University

Doctor Kurohashi's laboratory at Kyoto University proposed a method for classifying Japanese documents that performs no morphological parsing of the documents and relies only on the kanji they contain.

The model was trained on a database of texts already classified by topic (philosophy, architecture, etc.) in order to extract the kanji characteristic of each topic using a χ² (chi-square) method. The kanji found to be characteristic can then be used to classify a new text according to the kanji observed in it.
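
The two steps described above (selecting characteristic kanji from labelled texts, then classifying a new text by the kanji it contains) can be illustrated with a minimal sketch. It is not the laboratory's implementation. Several things here are assumptions made for illustration: the selection score is a single-cell chi-square contribution, (observed - expected)² / expected; a kanji is taken to be any character in the basic CJK Unified Ideographs block; and the names `is_kanji`, `kanji_counts`, `train`, `classify` and the parameter `top_k` are hypothetical.

```python
from collections import Counter


def is_kanji(ch):
    # Assumption: "kanji" means the basic CJK Unified Ideographs block.
    return "\u4e00" <= ch <= "\u9fff"


def kanji_counts(text):
    # Count kanji only; kana, punctuation and Latin letters are ignored,
    # so no morphological parsing is needed.
    return Counter(ch for ch in text if is_kanji(ch))


def train(labeled_texts, top_k=20):
    """labeled_texts: iterable of (topic, text) pairs.

    Returns {topic: set of kanji judged characteristic of that topic}.
    """
    per_topic = {}
    for topic, text in labeled_texts:
        per_topic.setdefault(topic, Counter()).update(kanji_counts(text))

    total = Counter()
    for counts in per_topic.values():
        total.update(counts)
    grand_total = sum(total.values())

    characteristic = {}
    for topic, counts in per_topic.items():
        topic_total = sum(counts.values())
        scores = {}
        for kanji, observed in counts.items():
            # Count expected if the kanji were independent of the topic.
            expected = total[kanji] * topic_total / grand_total
            # Keep only kanji that are over-represented in this topic.
            if observed > expected:
                scores[kanji] = (observed - expected) ** 2 / expected
        best = sorted(scores, key=scores.get, reverse=True)[:top_k]
        characteristic[topic] = set(best)
    return characteristic


def classify(text, characteristic):
    # Assumption: the topic whose characteristic kanji occur most often wins.
    counts = kanji_counts(text)
    return max(
        characteristic,
        key=lambda topic: sum(counts[k] for k in characteristic[topic]),
    )
```

In use, `train` would be called once on the labelled database, for example `model = train([("philosophy", text_a), ("architecture", text_b)])`, and `classify(new_text, model)` would then return the topic label for an unseen text. A real system would need far more training text per topic and a more careful decision rule than this simple count.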



Jean-Philippe Vert
Sun Dec 6 11:05:42 MET 1998