In April 1986, the Japan Electronic Dictionary Research Institute
(E.D.R.) was created to build an electronic dictionary for
use in advanced NLP research. To build it, the institute received funds
from the Japan Key Technology Center and eight major computer
manufacturers: Fujitsu, NEC, Hitachi, Sharp, Toshiba, Oki Electric,
Mitsubishi Electric and Matsushita Electric. The project lasted nine
years, from 1986 to 1994, and led to the creation of the following
dictionaries and related resources:
- Japanese dictionary: A 250,000-word dictionary
that contains, for each word, morphological information
(pronunciation, accent, etc.), syntactic information
(grammatical characteristics, aspect, etc.) and semantic
information (sense explanation and links to all concepts
involved).
- English dictionary: Built on the same principles as the
Japanese dictionary, this 190,000-word dictionary lists for
every word the concepts that can be attached to it, as well as
morphological (e.g. inflection, adjacency, pronunciation, accent),
syntactic (e.g. POS, countability) and semantic information.
- Technical dictionary: This dictionary, specialized in
information processing, contains 120,000 Japanese words and 90,000
English words.
- Concept dictionary: This original dictionary describes
and classifies the set of 400,000 concepts needed to fully understand
the meaning of each word. The classification is based on super/sub
relationships. The description contains binary semantic relationships
between concepts (e.g. agent/action or object/action).
- Bilingual dictionary
- Co-occurrence: A table that records which word sequences
can occur within sentences, together with information about
concept collocations.
- Japanese and English corpus: This corpus consists of
220,000 Japanese and 160,000 English sentences. Each sentence is
annotated with a morphological, syntactic and semantic analysis.
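The structure of the concept dictionary described above (a super/sub
classification plus binary semantic relationships between concepts) can be
illustrated with a minimal sketch. Everything in it is an assumption made
for illustration: the class names, the concept identifiers (c_dog, c_eat,
etc.), the relation label "agent" and the inheritance rule used in
allows() are hypothetical and do not reproduce the actual E.D.R. record
format or its relation inventory.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical concept node; not the actual E.D.R. record layout."""
    concept_id: str
    gloss: str
    supers: set = field(default_factory=set)  # ids of direct super-concepts


class ConceptDictionary:
    """Toy store for a super/sub hierarchy and binary semantic relations."""

    def __init__(self):
        self.concepts = {}      # concept_id -> Concept
        self.relations = set()  # (relation, source_id, target_id) triples

    def add_concept(self, concept_id, gloss, supers=()):
        self.concepts[concept_id] = Concept(concept_id, gloss, set(supers))

    def add_relation(self, relation, source_id, target_id):
        self.relations.add((relation, source_id, target_id))

    def ancestors(self, concept_id):
        """Return every super-concept reachable from concept_id."""
        seen = set()
        stack = list(self.concepts[concept_id].supers)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.concepts[current].supers)
        return seen

    def is_a(self, concept_id, super_id):
        return concept_id == super_id or super_id in self.ancestors(concept_id)

    def allows(self, relation, source_id, target_id):
        """Assumed rule: a relation stated on super-concepts is inherited."""
        sources = {source_id} | self.ancestors(source_id)
        targets = {target_id} | self.ancestors(target_id)
        return any((relation, s, t) in self.relations
                   for s in sources for t in targets)


# Illustrative data only; these identifiers are invented.
cd = ConceptDictionary()
cd.add_concept("c_animal", "animal")
cd.add_concept("c_dog", "dog", supers=["c_animal"])
cd.add_concept("c_action", "action")
cd.add_concept("c_eat", "eat", supers=["c_action"])
cd.add_relation("agent", "c_eat", "c_animal")  # an animal can be the agent of eating
```

With this toy data, cd.is_a("c_dog", "c_animal") holds, and
cd.allows("agent", "c_eat", "c_dog") holds as well because the relation
stated for "animal" is inherited by "dog" under the assumed rule.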
These dictionaries have been used to develop the morphological parser
JUMAN at Kyoto University. In 1996, E.D.R. joined the ANSI
Ad-Hoc Group for Ontology Standards, and it is trying to link its
dictionaries with WordNet.
In July 1998, the dictionaries cost 100,000 JPY
(around 1,000 USD) for universities and 1,200,000 JPY (around 12,000
USD) for companies.
Jean-Philippe Vert
Sun Dec 6 11:05:42 MET 1998