The impact of word normalization methods and merging strategies on multilingual IR

E Airio, H Keskustalo, T Hedlund, A Pirkola
Comparative Evaluation of Multilingual Information Access Systems: 4th …, 2004, Springer
Abstract
This article deals with both multilingual and bilingual IR. The source language is English, and the target languages are English, German, Finnish, Swedish, Dutch, French, Italian and Spanish. A separate index is built for each target language, and four different strategies for merging the per-index result lists are tested. Two of the merging methods are classical basic methods: the Raw Score method and the Round Robin method. Two simple new merging methods were created: the Dataset Size Based method and the Score Difference Based method. Two kinds of indexing methods are tested: morphological analysis and stemming. Morphologically analyzed indexes perform slightly better than stemmed indexes. The merging method based on the dataset size performs best.
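
As a rough illustration of the two classical merging methods named in the abstract, the sketch below merges per-language result lists in Python. It follows the textbook definitions of raw-score and round-robin merging, not necessarily the authors' implementation. The function names, the `(doc_id, score)` tuple layout, and the sample scores are assumptions made for this example. The Dataset Size Based and Score Difference Based methods are left out because the abstract does not define them.

```python
from itertools import zip_longest

# Hypothetical input layout: one ranked list of (doc_id, score) per language index.

def raw_score_merge(result_lists, k=10):
    """Pool all hits and sort them by their original retrieval scores."""
    merged = [hit for hits in result_lists.values() for hit in hits]
    merged.sort(key=lambda hit: hit[1], reverse=True)
    return merged[:k]

def round_robin_merge(result_lists, k=10):
    """Take the rank-1 hit from each index, then the rank-2 hits, and so on."""
    merged = []
    for tier in zip_longest(*result_lists.values()):
        # Shorter lists are padded with None; skip the padding.
        merged.extend(hit for hit in tier if hit is not None)
    return merged[:k]

# Made-up scores, for illustration only.
results = {
    "de": [("de-1", 0.9), ("de-2", 0.4)],
    "fi": [("fi-1", 0.7), ("fi-2", 0.6), ("fi-3", 0.1)],
}

raw_ranking = [doc_id for doc_id, _ in raw_score_merge(results, k=4)]
rr_ranking = [doc_id for doc_id, _ in round_robin_merge(results, k=4)]
```

Raw-score merging assumes that scores from different indexes are directly comparable, whereas round-robin merging ignores scores across indexes and relies only on within-index rank.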