
ETL's GDA project

The Electrotechnical Laboratory (ETL) in Tsukuba, which developed the multilingual MULE environment now available in GNU Emacs 20, was trying to promote an annotation standard for HTML documents published on the Internet: the Global Document Annotation (GDA). This standard enables computers to recognize the semantic and pragmatic structures of HTML documents. The initiators of the project hope that a large quantity of tagged documents will appear on the Internet, which could in particular be used as a linguistic corpus. To promote the standard, a collection of tags was proposed that lets computers infer the structure of a document, and several applications were developed, such as automatic translation of GDA-tagged documents, data mining, automatic summarization, and automatic design of presentation slides for such a document.



Jean-Philippe Vert
Sun Dec 6 11:05:42 MET 1998