Christina Feilmayr, Claudia Vojinovic, Birgit Pröll,
"Designing a Multi-Dimensional Space for Hybrid Information Extraction",
in Abdelkader Hameurlain, A Min Tjoa, Roland R. Wagner (eds.): Twenty-Third International Workshop on Database and Expert Systems Applications, IEEE Computer Society, p. 121, September 2012, ISBN: 978-0-7695-4801-2
Designing a Multi-Dimensional Space for Hybrid Information Extraction
Information extraction systems are developed for various specific application domains to manage an increasing amount of unstructured data. The majority build either upon the knowledge-based approach, which promises high accuracy but involves labour-intensive coding of extraction rules, or upon the automatically trainable systems approach, which produces highly portable solutions but requires an appropriate learning set. In this paper, we present results of a project that aims to provide a new methodology which combines the knowledge-based and the machine learning approach into a hybrid one in order to compensate for their respective shortcomings and to achieve high IE performance. Firstly, we propose the idea of a multi-dimensional space that guides users in selecting appropriate methods, i.e., different hybrid concepts, depending on the extraction task and the level of available features. Secondly, we provide the concept of one hybrid approach, namely the sequential processing of a knowledge-based approach and a selection of different machine learning methods. Thirdly, we present the evaluation of an implementation of the sequential extraction on a curriculum vitae corpus. Thus, we provide first results for filling the multi-dimensional space for hybrid information extraction.
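
The sequential hybrid idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the rule patterns, the `KeywordLearner` class, the labels, and the "rules first, learner on the remaining lines" ordering are all assumptions chosen for illustration, and the toy learner merely stands in for whichever machine learning methods the paper selects.

```python
import re
from collections import Counter, defaultdict

# Stage 1 (knowledge-based): hand-written rules.
# These patterns are illustrative assumptions, not rules from the paper.
RULES = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "YEAR": re.compile(r"\b(?:19|20)\d{2}\b"),
}


def rule_stage(text):
    """Return (label, value) pairs matched by the hand-written rules."""
    found = []
    for label, pattern in RULES.items():
        for match in pattern.finditer(text):
            found.append((label, match.group()))
    return found


class KeywordLearner:
    """Stage 2 (trainable): a toy stand-in for a machine learning method.

    It counts which words occur with which label in a small learning set
    and labels a new line by majority vote over its words.
    """

    def __init__(self):
        self.word_counts = defaultdict(Counter)

    def fit(self, examples):
        for line, label in examples:
            for word in line.lower().split():
                self.word_counts[word][label] += 1
        return self

    def predict(self, line):
        votes = Counter()
        for word in line.lower().split():
            votes.update(self.word_counts.get(word, {}))
        return votes.most_common(1)[0][0] if votes else None


def extract(lines, learner):
    """Run the rules first; pass lines the rules do not cover to the learner."""
    results = []
    for line in lines:
        hits = rule_stage(line)
        if hits:
            results.extend(hits)
        else:
            label = learner.predict(line)
            if label is not None:
                results.append((label, line))
    return results


learner = KeywordLearner().fit([
    ("studied computer science at university", "EDUCATION"),
    ("worked as software engineer", "EXPERIENCE"),
])
cv_lines = ["Contact: jane@example.org", "Graduated 2010",
            "software engineer at a bank", "zzz"]
print(extract(cv_lines, learner))
```

In this sketch the rules handle the fields with a rigid surface form, while the trained component handles free-text lines where writing rules would be labour-intensive; lines neither stage recognises are left unlabelled.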