XATA2008

XML-based Extraction of Terminological Information from Corpora
Ana Belén Crespo Bastos (Universidade de Vigo)
Xosé María Gómez Clemente (Universidade de Vigo)
Xavier Gómez Guinovart (Universidade de Vigo)
Susana López Fernández (Universidade de Vigo)

Abstract:
In this paper, we present a methodology for extracting terminological information from textual corpora. We describe the processes we follow to identify term candidates in corpora and to recognize term definitions and conceptual relations in textual data. Both the textual corpora used as the source of terminological information and the terminological database built from that information are stored and maintained by linguists in XML format, and are converted to MySQL for consultation through a PHP-based web application.

Keywords:
Document Processing using XML, XML-based natural language processing
