
NTT's productions

This telecommunications giant has produced dictionaries and corpora for its research in NLP, especially machine translation. These resources were later used by many research centers.

NTT's dictionary contains 400,000 words. For each word, it provides the pronunciation and the canonical form as well as syntactic and semantic information. The semantic information is based on a hierarchical graph of 3,000 semantic attributes that uses "is a" and "has a" relationships between concepts. Semantic attributes are provided for every word in the dictionary.
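To make the idea of a hierarchical attribute graph concrete, here is a minimal sketch in Python. It is not NTT's actual format: the class `SemanticGraph`, the attribute names ("concrete", "animal", "bird", "wing") and the sample word entry are all hypothetical and chosen only for illustration. It shows how "is a" and "has a" links could be stored, and how one might test whether a general attribute covers a more specific one.

```python
from collections import defaultdict


class SemanticGraph:
    """Toy graph of semantic attributes linked by "is a" and "has a" edges."""

    def __init__(self):
        self.is_a = defaultdict(set)   # attribute -> more general attributes
        self.has_a = defaultdict(set)  # attribute -> attributes of its parts

    def add_is_a(self, child, parent):
        self.is_a[child].add(parent)

    def add_has_a(self, whole, part):
        self.has_a[whole].add(part)

    def ancestors(self, attribute):
        """Return every attribute reachable from `attribute` via "is a" edges."""
        seen = set()
        stack = [attribute]
        while stack:
            node = stack.pop()
            for parent in self.is_a.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def subsumes(self, general, specific):
        """True if `general` equals `specific` or lies above it in the hierarchy."""
        return general == specific or general in self.ancestors(specific)


# Hypothetical attributes, not taken from the NTT resource.
graph = SemanticGraph()
graph.add_is_a("animal", "concrete")
graph.add_is_a("bird", "animal")
graph.add_has_a("bird", "wing")

# Hypothetical dictionary entry: each word carries one or more attributes.
word_attributes = {"tori": ["bird"]}

# A rule stated for "animal" also applies to any word tagged "bird".
applies = any(graph.subsumes("animal", a) for a in word_attributes["tori"])
```

A design of this kind lets a translation rule be written once against a general attribute and apply to every word whose attributes fall under it, which is one plausible reason for attaching attributes to every dictionary entry.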

In parallel with the Japanese dictionary, NTT developed a Japanese/English bilingual dictionary of classical structures and idioms, with 17,000 entries, including 6,000 ambiguous verbs. This dictionary records the equivalences between Japanese and English structures.

Finally, a Japanese-to-English dictionary containing syntactic and semantic information is available.



Jean-Philippe Vert
Sun Dec 6 11:05:42 MET 1998