Abstract
In this paper, we present a method for the automatic generation of a digital resource that connects all indirect synonyms of a dialect term to all indirect synonyms of the corresponding term in the standard language, with the aim of improving search in a digital dialect dictionary. The method uses SWRL rules defined in the Serbian WordNet ontology to identify sets of synonymous words. It also uses e-dictionaries to produce correct lemmas in the standard language that users usually employ in searches. The method was applied to and evaluated on verbs and a group of nouns derived from verbs (verbal nouns). We compared the results obtained by the system with those produced by humans and achieved an accuracy of 89.7%.
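The idea of linking indirect synonyms can be illustrated with a minimal sketch. It is not the authors' implementation: the paper derives synonym sets with SWRL rules over the Serbian WordNet ontology, whereas this sketch swaps in a plain breadth-first traversal over synonym pairs. The function names `synonym_closure` and `link_terms` and the placeholder terms (`d1`, `s1`, and so on) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque


def synonym_closure(term, pairs):
    """Return every term reachable from `term` through chains of synonym pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)  # assumption: synonymy is treated as symmetric
    seen = {term}
    queue = deque([term])
    while queue:
        current = queue.popleft()
        for neighbour in graph[current]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def link_terms(dialect_term, standard_term, dialect_pairs, standard_pairs):
    """Map each (indirect) synonym of the dialect term to all (indirect)
    synonyms of the corresponding standard-language term."""
    dialect_set = synonym_closure(dialect_term, dialect_pairs)
    standard_set = synonym_closure(standard_term, standard_pairs)
    return {d: sorted(standard_set) for d in sorted(dialect_set)}


# Hypothetical placeholder data, not taken from the dictionary or from WordNet.
dialect_pairs = [("d1", "d2"), ("d2", "d3")]      # d3 is an indirect synonym of d1
standard_pairs = [("s1", "s2"), ("s3", "s4")]     # s3 and s4 are unrelated to s1
index = link_terms("d1", "s1", dialect_pairs, standard_pairs)
```

In this sketch a search for any of `d1`, `d2`, or `d3` resolves to the same set of standard-language terms, which is the kind of lookup such a resource is meant to support.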
Notes
1. On-line at http://www.vranje.co.rs.
2. Lemmatization was done using Unitex, the corpus processing system (http://unitexgramlab.org/).
3. We used the following software tools in this paper: the Eclipse Java EE IDE (Luna) development tool and Apache Jena, an open-source development environment that allows for reasoning at the level of the OWL 2 language by converting SWRL rules into the Jena rules format.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Mladenović, M., Stanković, R., Krstev, C. (2017). A WordNet Ontology in Improving Searches of Digital Dialect Dictionary. In: Kirikova, M., et al. New Trends in Databases and Information Systems. ADBIS 2017. Communications in Computer and Information Science, vol 767. Springer, Cham. https://doi.org/10.1007/978-3-319-67162-8_37
DOI: https://doi.org/10.1007/978-3-319-67162-8_37
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67161-1
Online ISBN: 978-3-319-67162-8
eBook Packages: Computer Science (R0)