Different approaches to Cross Language Information Retrieval

in Computational Linguistics in the Netherlands 2000

Abstract

This paper describes two experiments in the domain of Cross Language Information Retrieval (CLIR). Our basic approach is to translate queries word by word using machine-readable dictionaries. The first experiment compared three strategies for dealing with word sense ambiguity: i) keeping all translations and integrating translation probabilities in the retrieval model, ii) selecting a single translation on the basis of its number of occurrences in the dictionary, and iii) translating word by word after word sense disambiguation in the source language. In the second experiment we constructed parallel corpora from web documents in order to build bilingual dictionaries or improve translation probability estimates. We conclude that our best dictionary-based CLIR approach keeps all possible translations, not by simply substituting a query term with its translations but by creating a structured query and including reverse translation probabilities in the retrieval model.
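The difference between strategies i) and ii) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the toy dictionary `TOY_DICT`, the function names, the use of normalized dictionary counts as a stand-in for translation probabilities, and the simple smoothed unigram scoring. A structured query is represented here as one group of weighted translations per source term.

```python
import math
from collections import Counter

# Hypothetical toy dictionary: source term -> {translation: occurrence count}.
TOY_DICT = {
    "river": {"rivier": 2},
    "bank": {"bank": 3, "oever": 1},
}

def translate_all(query, dictionary):
    """Strategy i): keep every translation, weighted by normalized counts."""
    structured = []
    for term in query:
        counts = dictionary.get(term, {term: 1})  # unknown terms pass through
        total = sum(counts.values())
        structured.append({t: c / total for t, c in counts.items()})
    return structured

def translate_best(query, dictionary):
    """Strategy ii): keep only the translation with the most occurrences."""
    structured = []
    for term in query:
        counts = dictionary.get(term, {term: 1})
        best = max(counts, key=counts.get)
        structured.append({best: 1.0})
    return structured

def score(structured_query, doc_tokens, collection_tokens, lam=0.5):
    """Log score: per source term, sum the weighted, smoothed probabilities
    of its translations in the document (assumed scoring, for illustration)."""
    doc_tf = Counter(doc_tokens)
    coll_tf = Counter(collection_tokens)
    log_score = 0.0
    for group in structured_query:
        p = 0.0
        for translation, weight in group.items():
            p_doc = doc_tf[translation] / len(doc_tokens)
            p_coll = coll_tf[translation] / len(collection_tokens)
            p += weight * (lam * p_doc + (1 - lam) * p_coll)
        log_score += math.log(p) if p > 0 else float("-inf")
    return log_score

# Two toy Dutch documents: one about a river bank, one about a financial bank.
doc_river = ["de", "oever", "van", "de", "rivier"]
doc_money = ["de", "bank", "geeft", "krediet"]
collection = doc_river + doc_money

query = ["river", "bank"]
q_all = translate_all(query, TOY_DICT)
q_best = translate_best(query, TOY_DICT)
```

In this toy setting, calling `score(q_all, doc_river, collection)` and `score(q_all, doc_money, collection)` shows how keeping the less frequent translation "oever" lets the river document match both query terms, whereas `q_best` discards it. The example only illustrates the mechanism and says nothing about the experimental results reported in the paper.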

Computational Linguistics in the Netherlands 2000

Selected Papers from the Eleventh CLIN Meeting
