Keywords: Knowledge extraction, text mining, information structuring, newspaper advertisements
Contact Person: Martin Rajman
Phone: (+41 21) 693-5277
E-mail: Martin.Rajman@epfl.ch
Partners: Consultas SA
The EXTRACT project is the concrete implementation of a scientific collaboration between the AI Lab of EPFL and the CONSULTAS company.
The general objective of this project is to develop a prototype that automatically extracts and structures the information contained in the text of newspaper classified advertisements. The extracted information is intended to be integrated into the information system of the CONSULTAS company, allowing richer and more efficient use of that information.
The main objectives of the project include:
- automated classification of the advertisements into a predefined set of classes corresponding to identified content structures;
- analysis of the content of the advertisements based on the automated extraction and structuring of the different informational units contained in the texts.
Modules for advertisement filtering and reformulation will also be specified and implemented.
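As a rough illustration of the two objectives listed above (classification, then extraction of informational units), the following Python sketch processes a single advertisement. It is not the method used in EXTRACT: the class names, keyword lists, field names, and regular expressions are all hypothetical, and a simple keyword-overlap score stands in for whatever classifier the prototype actually uses.

```python
import re
from dataclasses import dataclass, field

# Hypothetical classes and trigger keywords (assumption: the project's
# real class set and features are not described in this document).
CLASS_KEYWORDS = {
    "real_estate": {"apartment", "rooms", "rent", "balcony"},
    "job_offer": {"seeking", "position", "salary", "experience"},
    "vehicle": {"car", "km", "diesel", "garage"},
}


@dataclass
class StructuredAd:
    """Structured view of one advertisement (illustrative field names)."""
    ad_class: str
    fields: dict = field(default_factory=dict)


def classify(text: str) -> str:
    """Pick the class whose keyword set overlaps most with the ad text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    scores = {name: len(tokens & kws) for name, kws in CLASS_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"


def extract_fields(text: str) -> dict:
    """Extract a few informational units with hand-written patterns."""
    fields = {}
    price = re.search(r"(?:CHF|Fr\.)\s*([\d']+)", text)
    if price:
        # Swiss-style thousands separator (1'850) is removed before parsing.
        fields["price_chf"] = int(price.group(1).replace("'", ""))
    phone = re.search(r"\b0\d{2}[ /]\d{3} \d{2} \d{2}\b", text)
    if phone:
        fields["phone"] = phone.group(0)
    rooms = re.search(r"(\d+(?:\.\d)?)\s*rooms", text, re.IGNORECASE)
    if rooms:
        fields["rooms"] = float(rooms.group(1))
    return fields


def structure(text: str) -> StructuredAd:
    """Classify the ad, then attach the extracted fields."""
    return StructuredAd(classify(text), extract_fields(text))


if __name__ == "__main__":
    ad = ("Lausanne: bright apartment, 3.5 rooms, balcony, "
          "rent CHF 1'850 per month. Tel. 021 555 12 34")
    print(structure(ad))
```

The resulting record (class plus named fields) is the kind of structured output that could then be loaded into an information system; a real system would likely need class-specific extraction rules and handling for the languages in which the advertisements are published.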