Abstract
Developing a comprehensive morphological dictionary of multi-word units for Serbian is a demanding task because of the complexity of Serbian morphology, and producing such a dictionary manually has proved extremely time-consuming. In this paper we present a procedure that automatically produces dictionary lemmas for a given list of multi-word units. To accomplish this, the procedure relies on data in the e-dictionaries of Serbian simple words, which are already well developed. We also evaluate the proposed procedure on several different data sets. Finally, we discuss some implementation issues and describe how the same procedure is used for other languages.
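The general idea of deriving a multi-word lemma from simple-word data can be illustrated with a minimal sketch. It is not the authors' procedure: the toy dictionary `SIMPLE_WORDS`, the class codes `A1` and `N1`, the helper names, and the `form(lemma.class)` output notation are all assumptions made for illustration. Each component of a multi-word unit is looked up in a simple-word dictionary and, if found, annotated with its lemma and inflectional class; unknown components are left as they are.

```python
from typing import Dict, NamedTuple


class SimpleEntry(NamedTuple):
    """One entry of a (hypothetical) simple-word e-dictionary."""
    lemma: str       # dictionary base form of the word
    infl_class: str  # code of its inflectional class (invented here)


# Toy stand-in for an e-dictionary of simple words (illustrative data only).
SIMPLE_WORDS: Dict[str, SimpleEntry] = {
    "crveni": SimpleEntry("crven", "A1"),
    "krst": SimpleEntry("krst", "N1"),
}


def annotate_component(form: str) -> str:
    """Attach lemma and class to a component, or return it unchanged."""
    entry = SIMPLE_WORDS.get(form.lower())
    if entry is None:
        # Not in the simple-word dictionary: treated as non-inflecting here.
        return form
    return f"{form}({entry.lemma}.{entry.infl_class})"


def build_mwu_lemma(mwu: str) -> str:
    """Build an annotated lemma for a whitespace-separated multi-word unit."""
    return " ".join(annotate_component(part) for part in mwu.split())


if __name__ == "__main__":
    for unit in ["crveni krst", "crveni polumesec"]:
        print(build_mwu_lemma(unit))
```

A real system would also have to resolve ambiguous look-ups and decide which components inflect together; the sketch only shows the look-up and annotation step.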
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Krstev, C., Stanković, R., Obradović, I., Vitas, D., Utvić, M. (2010). Automatic Construction of a Morphological Dictionary of Multi-Word Units. In: Loftsson, H., Rögnvaldsson, E., Helgadóttir, S. (eds) Advances in Natural Language Processing. NLP 2010. Lecture Notes in Computer Science, vol 6233. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-14770-8_26
DOI: https://doi.org/10.1007/978-3-642-14770-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-14769-2
Online ISBN: 978-3-642-14770-8
eBook Packages: Computer Science (R0)