
An English parser based on decision trees at ATR

Using the "ATR English Grammar", researchers at ATR have developed a syntactic parser based on a grammar of 3000 tags and 1100 rules. The syntactic structure of a sentence is represented as a tree: every non-terminal node records the identifier of the rule that created it, and every terminal node records the part of speech (POS) of the corresponding word. A corpus annotated with this grammar was created in collaboration with Lancaster University (the "ATR/Lancaster Treebank of General English") and was used to train decision trees that automatically build the tree associated with a sentence. The questions asked in the decision trees concern the attributes of the terminal and non-terminal tree nodes, as well as other characteristics of the words or of the sentence (e.g. the length of the sentence).
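The tree representation described above can be sketched in a few lines of Python. This is only an illustration, not the ATR implementation: the class names, the rule identifiers ("S_1", "NP_1", "VP_1") and the POS tags ("DET", "NOUN", "VERB") are hypothetical placeholders and are not taken from the ATR English Grammar.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Leaf:
    """Terminal node: a word together with its POS tag."""
    word: str
    pos: str


@dataclass
class Node:
    """Non-terminal node: records the identifier of the rule that created it."""
    rule_id: str
    children: List[Union["Node", Leaf]] = field(default_factory=list)


def leaves(tree: Union[Node, Leaf]) -> List[Leaf]:
    """Collect the terminal nodes of a tree from left to right."""
    if isinstance(tree, Leaf):
        return [tree]
    collected: List[Leaf] = []
    for child in tree.children:
        collected.extend(leaves(child))
    return collected


def sentence_length(tree: Union[Node, Leaf]) -> int:
    """One sentence-level characteristic a question could inspect."""
    return len(leaves(tree))


# Hypothetical rule identifiers and tags, for illustration only.
tree = Node("S_1", [
    Node("NP_1", [Leaf("the", "DET"), Leaf("dog", "NOUN")]),
    Node("VP_1", [Leaf("barks", "VERB")]),
])
```

A question in a decision tree could then be any predicate over such a structure, for example whether `sentence_length(tree)` exceeds some threshold, or whether a given leaf carries a particular POS tag.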

An analysis is the result of a succession of partial analyses, which represent successive states in the construction of the tree. The transition from one state to the next occurs when a new node is tagged or when a new node is set to be terminal; these decisions are made by the decision trees.
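The succession of states can be pictured with a small Python sketch. It is a simplified toy under stated assumptions: the two kinds of transition ("tag" and "set_terminal") follow the description above, but the state layout, the names `PartialNode`, `step`, `analyse` and `toy_decider`, and the lookup table `TOY_TAGS` are all hypothetical. In particular, `toy_decider` is a hand-written stand-in for the trained decision trees, and the sketch does not build non-terminal nodes.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class PartialNode:
    name: str                    # here simply the word the node covers
    label: Optional[str] = None  # tag assigned so far, if any
    terminal: bool = False       # whether the node has been set terminal


State = Tuple[PartialNode, ...]
Action = Tuple[str, int, Optional[str]]  # (kind, node index, label)

# Hypothetical tags, for illustration only.
TOY_TAGS = {"the": "DET", "dog": "NOUN"}


def step(state: State, action: Action) -> State:
    """Return the next partial analysis after applying one decision."""
    kind, index, label = action
    node = state[index]
    if kind == "tag":
        new_node = PartialNode(node.name, label, node.terminal)
    elif kind == "set_terminal":
        new_node = PartialNode(node.name, node.label, True)
    else:
        raise ValueError(f"unknown action kind: {kind}")
    return state[:index] + (new_node,) + state[index + 1:]


def toy_decider(state: State) -> Optional[Action]:
    """Stand-in for the decision trees: pick the next decision, or None."""
    for index, node in enumerate(state):
        if node.label is None:
            return ("tag", index, TOY_TAGS.get(node.name, "UNK"))
        if not node.terminal:
            return ("set_terminal", index, None)
    return None


def analyse(state: State,
            decider: Callable[[State], Optional[Action]]) -> List[State]:
    """Apply decisions until none is left; return every successive state."""
    history = [state]
    while (action := decider(state)) is not None:
        state = step(state, action)
        history.append(state)
    return history


history = analyse((PartialNode("the"), PartialNode("dog")), toy_decider)
```

Each element of `history` is one partial analysis, and the last element is the finished one. In a system like the one described, the decider would instead query decision trees trained on the treebank.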



Jean-Philippe Vert
Sun Dec 6 11:05:42 MET 1998