Distributional Semantic Application to Information Retrieval from Large Textual Databases

Keywords: Textual document retrieval, distributional semantics, co-frequency
Contact Person: Martin Rajman
Phone: (+41 21) 693-5277
E-mail: Martin.Rajman@epfl.ch

Project Description

This research project concerns the development of semantic models for textual document retrieval systems. The models we focus on are set in a framework based on "distributional semantics", in which semantic proximities are derived from co-frequency matrices computed on large textual corpora. Queries and documents are represented in a unified way as projections in a vector space of pertinent terms. Different similarity measures will be tested to characterize the proximity between queries and documents.
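
The idea can be illustrated with a minimal Python sketch. It is not the D-SIR implementation: the function names, the toy corpus, the choice of index terms, the use of one sentence as the co-occurrence unit, and the cosine measure are all assumptions made for illustration. Each word receives a co-frequency profile over the index terms, a text is projected as the sum of the profiles of its words, and a query is compared to documents in that common space.

```python
import math

def tokenize(text):
    # Naive whitespace tokenizer (assumption: no stemming or stop-word removal).
    return text.lower().split()

def cofrequency_matrix(corpus, index_terms):
    """Map each word to a vector counting the text units it shares with each index term.

    Assumption: a word counts as co-occurring with itself, so an index term
    always contributes to its own dimension.
    """
    matrix = {}
    for unit in corpus:
        present = set(tokenize(unit))
        for word in present:
            row = matrix.setdefault(word, [0] * len(index_terms))
            for j, term in enumerate(index_terms):
                if term in present:
                    row[j] += 1
    return matrix

def project(text, matrix, dim):
    """Represent a query or document as the sum of its words' co-frequency profiles."""
    vector = [0] * dim
    for word in tokenize(text):
        for j, value in enumerate(matrix.get(word, [0] * dim)):
            vector[j] += value
    return vector

def cosine(u, v):
    """Cosine similarity, one possible proximity measure; returns 0.0 for a null vector."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)

# Hypothetical toy corpus and index terms.
corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "stocks fell as the market closed",
    "the market rallied and stocks rose",
]
index_terms = ["cat", "dog", "market", "stocks"]
matrix = cofrequency_matrix(corpus, index_terms)

query = project("mouse", matrix, len(index_terms))
doc_animals = project("dog", matrix, len(index_terms))
doc_finance = project("rallied", matrix, len(index_terms))

sim_animals = cosine(query, doc_animals)
sim_finance = cosine(query, doc_finance)
```

In this toy setting the query "mouse" shares no word with the document "dog", yet the two end up close because both co-occur with "cat", while the finance document stays orthogonal to the query. Other similarity measures can be substituted for `cosine` without changing the representation.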

A software prototype, called D-SIR, has been implemented to validate the approach.

Last modified: Tue Apr 4 16:03:30 2000