
5W1H classification at NEC

The Pattern Analysis and Human Language Technology group of NEC Corporation, a major computer and communications manufacturer, has developed a navigation engine for textual databases that answers 5W1H requests (who, when, where, what, why, how). While the database is being organized, the program extracts a 6-dimensional vector for every sentence, with one component for each of the six elementary questions. This information extraction relies on NLP and pattern matching techniques to identify the relevant information.
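To make the idea of a per-sentence 5W1H vector concrete, here is a minimal sketch in Python. It is not NEC's method, which is not described here: the `FiveW1H` type, the `PATTERNS` table and the `extract` function are hypothetical names, the components are assumed to be plain strings, and the regular expressions are toy rules that only cope with very simple English sentences.

```python
import re
from typing import NamedTuple, Optional


class FiveW1H(NamedTuple):
    """One component per elementary question; None when nothing was found."""
    who: Optional[str] = None
    when: Optional[str] = None
    where: Optional[str] = None
    what: Optional[str] = None
    why: Optional[str] = None
    how: Optional[str] = None


MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
SUBJECT = r"[A-Z]\w*(?: [A-Z]\w*)*"  # run of capitalised words

# Toy patterns (assumptions for illustration only); each captures its answer in group 1.
PATTERNS = {
    "who": re.compile(rf"^({SUBJECT})"),
    "when": re.compile(rf"\b(?:in|on) ((?:{MONTHS}) \d{{4}})"),
    "where": re.compile(rf"\bin (?!(?:{MONTHS})\b)([A-Z][a-z]+)"),
    "what": re.compile(rf"^{SUBJECT} (.+?)(?= in | because |\.|$)"),
    "why": re.compile(r"\bbecause ([^.]+)"),
    "how": re.compile(r"\bby ([^,.]+)"),
}


def extract(sentence: str) -> FiveW1H:
    """Fill each of the six components with the first pattern match, if any."""
    fields = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(sentence)
        fields[name] = match.group(1) if match else None
    return FiveW1H(**fields)


vector = extract("NEC opened a research lab in Tokyo in July 1998 "
                 "because demand was growing.")
print(vector)
```

On this made-up sentence the sketch fills five of the six components and leaves `how` empty. A real system would replace the regular expressions with proper linguistic analysis.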

The navigation stage then consists of asking the user to fill in one or several fields of the 6-dimensional question vector, and searching for the documents that match the query. The applications tested in July 1998 concerned economic news, a domain to which the 5W1H formalism is particularly well suited.
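The navigation stage can be sketched as a lookup over the stored vectors. The following Python fragment is only an illustration under stated assumptions: vectors are represented as dictionaries of strings, a filled field matches by case-insensitive substring, and a document is returned when at least one of its sentences matches every filled field. The names `FIELDS`, `INDEX`, `matches` and `search` are hypothetical, and the sample entries are invented.

```python
FIELDS = ("who", "when", "where", "what", "why", "how")

# Invented sample index: one (document id, 5W1H vector) pair per sentence.
INDEX = [
    ("doc1", {"who": "NEC", "when": "July 1998", "where": "Tokyo",
              "what": "opened a research lab"}),
    ("doc2", {"who": "Acme Corp", "when": "June 1998", "where": "Paris",
              "what": "cut prices"}),
    ("doc2", {"who": "Acme Corp", "when": "July 1998",
              "what": "reported a loss"}),
]


def matches(query, vector):
    """True if every field filled in the query occurs in the sentence vector."""
    for name in FIELDS:
        wanted = query.get(name)
        if wanted is None:
            continue  # field left blank by the user: no constraint
        value = vector.get(name)
        if value is None or wanted.lower() not in value.lower():
            return False
    return True


def search(query, index):
    """Return the sorted ids of documents with at least one matching sentence."""
    return sorted({doc_id for doc_id, vector in index if matches(query, vector)})


print(search({"who": "acme", "when": "July"}, INDEX))
```

Leaving a field blank imposes no constraint, so filling in more fields can only narrow the result set.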



Jean-Philippe Vert
Sun Dec 6 11:05:42 MET 1998