Abstract
In this chapter we present a network-theoretic approach to linguistics. In particular, we introduce a network model of derivational morphology in languages. We focus on suffixation as a mechanism for deriving new words from existing ones. We induce networks from natural language data consisting of words, derivational suffixes and parts of speech (PoS), as well as the relations between them. By measuring the entropy of these networks by means of so-called information functionals, we aim to capture the variation between typologically different languages. In doing so, we rely on the work of Dehmer (Appl Math Comput 201:82–94, 2008), who introduced a framework for measuring the entropy of graphs. In addition, we compare several entropy measures recently presented for graphs. We check whether these measures allow us to distinguish between language networks on the one hand and random networks on the other. We found that linguistic variation among languages can be captured by investigating the topology of the underlying networks. Further, information functionals based on distributions of topological properties turned out to be better discriminators than those based on properties of single vertices.
MSC2010 Primary 94C15; Secondary 90C35, 05C90.
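The graph entropy referred to in the abstract can be illustrated with a minimal sketch. In Dehmer's framework, an information functional f assigns a positive value to each vertex, the values are normalised to a probability distribution p(v) = f(v) / Σ f(u), and the entropy is the Shannon entropy of that distribution. The code below is an illustration under stated assumptions, not the chapter's implementation: the vertex degree is used as a stand-in functional, and the names `graph_entropy`, `degree_functional`, `build_graph` and the toy graphs are hypothetical.

```python
import math
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)} from an edge list."""
    adjacency = defaultdict(set)
    for u, w in edges:
        adjacency[u].add(w)
        adjacency[w].add(u)
    return dict(adjacency)


def degree_functional(adjacency, v):
    """Stand-in information functional (an assumption): the degree of vertex v."""
    return len(adjacency[v])


def graph_entropy(adjacency, functional, log=math.log):
    """Shannon entropy of the distribution p(v) = f(v) / sum_u f(u)."""
    values = [functional(adjacency, v) for v in adjacency]
    total = sum(values)
    if total <= 0:
        raise ValueError("the functional must be positive on at least one vertex")
    entropy = 0.0
    for value in values:
        if value > 0:  # the term 0 * log(0) is taken to be 0
            p = value / total
            entropy -= p * log(p)
    return entropy


# Toy star graph: one suffix vertex linked to three words (made-up data).
star = build_graph([("-ness", "kindness"), ("-ness", "darkness"), ("-ness", "sadness")])
star_entropy = graph_entropy(star, degree_functional)

# Toy cycle: every vertex has the same degree, so the distribution is uniform.
cycle = build_graph([(0, 1), (1, 2), (2, 3), (3, 0)])
cycle_entropy = graph_entropy(cycle, degree_functional)

print(f"star:  {star_entropy:.4f}")
print(f"cycle: {cycle_entropy:.4f}")
```

The regular cycle reaches the maximum entropy ln 4 for four vertices, while the star, whose hub concentrates half of the probability mass, stays below the maximum for its size. Any other positive vertex functional can be passed in place of `degree_functional`.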
Notes
1. In fact, only 4 of the 136 nodes have a degree > 1.
2. Dehmer [12] uses log to calculate the entropy. We use ln here for all functionals, which has no impact on the resulting relative entropy values (see the definition below).
3. We selected these combinations since they performed best in the parameter study shown in Table 11.6.
4. ER graphs are connected, undirected random graphs [15] of the cardinality of German and English. BA [5] and WA [39] are randomly generated small world graphs of the cardinality of German. We generate ten graphs of each kind of random network (i.e., ten graphs for ER, ten for BA, etc.) and compare the averaged entropy values.
5. See [21] for details.
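Notes 2 and 4 can be made concrete with a small sketch. It assumes that the relative entropy is the entropy divided by its maximum value log |V|; this definition is not given in this excerpt and is an assumption here, as are the degree-based functional, the graph size n = 30, the edge probability p = 0.2 and all function names. Under that assumption the logarithm base cancels in the ratio, and the baseline is obtained by averaging over ten connected random graphs.

```python
import math
import random


def degree_entropy(adjacency, log=math.log):
    """Entropy of the degree-based vertex distribution (stand-in functional)."""
    degrees = [len(neighbours) for neighbours in adjacency.values()]
    total = sum(degrees)
    return -sum((d / total) * log(d / total) for d in degrees if d > 0)


def relative_entropy(adjacency, log=math.log):
    """Assumed definition: entropy divided by the maximum entropy log(|V|)."""
    return degree_entropy(adjacency, log) / log(len(adjacency))


def is_connected(adjacency):
    """Depth-first search from an arbitrary vertex; True if all vertices are reached."""
    start = next(iter(adjacency))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for w in adjacency[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adjacency)


def connected_er_graph(n, p, rng):
    """Sample Erdős–Rényi G(n, p) graphs until a connected one is drawn."""
    while True:
        adjacency = {v: set() for v in range(n)}
        for u in range(n):
            for w in range(u + 1, n):
                if rng.random() < p:
                    adjacency[u].add(w)
                    adjacency[w].add(u)
        if is_connected(adjacency):
            return adjacency


rng = random.Random(0)  # fixed seed so the sketch is reproducible
graphs = [connected_er_graph(30, 0.2, rng) for _ in range(10)]  # ten graphs, as in note 4

# Average the relative entropy over the ten graphs, once with ln and once with log2.
mean_ln = sum(relative_entropy(g, math.log) for g in graphs) / len(graphs)
mean_log2 = sum(relative_entropy(g, math.log2) for g in graphs) / len(graphs)

print(f"ln:   {mean_ln:.6f}")
print(f"log2: {mean_log2:.6f}")
```

Both averages agree up to floating-point rounding, since changing the base multiplies numerator and denominator by the same constant. Rejection sampling is only one simple way to obtain connected graphs; the chapter does not state how connectivity was enforced.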
References
Altmann, G., Lehfeldt, W.: Allgemeine Sprachtypologie. Wilhelm Fink, Germany (1973)
Aronoff, M.: Word Formation in Generative Grammar. MIT, Cambridge (1976)
Baayen, H.: Quantitative Aspects of Morphological Productivity. In: Geert Booij, J.M. (ed.) Yearbook of Morphology, pp. 109–149. Kluwer, Dordrecht, Boston, London (1991)
Baayen, H.: On frequency, transparency, and productivity. Yearbook of Morphology 1992, pp. 181–208 (1992)
Barabási, A.L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)
Bauer, L.: Morphological Productivity. Cambridge University Press, Cambridge (2001)
Bertinetto, P.M., Noccetti, S.: Prolegomena to ATAM acquisition. Theoretical premises and corpus labeling. Quaderni del Laboratorio di Linguistica della SNS n.6 ns. (2006)
Bonchev, D., Rouvray, D.H.: Complexity in Chemistry, Biology, and Ecology. Mathematical and Computational Chemistry. Springer, New York (2005)
Brandes, U.: A faster algorithm for betweenness centrality. J. Math. Sociol. 25, 163–177 (2001)
Bybee, J.L.: Morphology as Lexical Organization, Chap. 7, pp. 119–141. Academic, London (1988)
Clahsen, H., Sonnenstuhl, I., Blevins, J.P.: Derivational morphology in the German mental lexicon: a dual mechanism account. In: Baayen, H., Schreuder, R. (eds.) Morphological Structure in Language Processing, Mouton de Gruyter, pp. 125–155, 2006 (2003)
Dehmer, M.: Information processing in complex networks: Graph entropy and information functionals. Appl. Math. Comput. 201, 82–94 (2008)
Dehmer, M., Varmuza, K., Borgert, S., Emmert-Streib, F.: On entropy-based molecular descriptors: statistical analysis of real and synthetic chemical structures. J. Chem. Inform. Model. 49(7), 1655–1663 (2009)
Dressler, W.U., Karpf, A.: The theoretical relevance of pre- and protomorphology in language acquisition. Yearbook of Morphology 1994, pp. 99–122 (1995)
Erdős, P., Rényi, A.: On random graphs. Publicationes Mathematicae 6, 290–297 (1959)
Evert, S., Lüdeling, A.: Measuring Morphological Productivity: Is Automatic Preprocessing Sufficient? In: Rayson, P., Wilson, A., McEnery, T., Hardie, A., Khoja, S. (eds.) Proceedings of the Corpus Linguistics 2001 conference, pp. 167–175. Lancaster (2001)
Ferrer i Cancho, R., Mehler, A., Pustylnikov, O., Díaz-Guilera, A.: Correlations in the organization of large-scale syntactic dependency networks. In: TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pp. 65–72 (2007)
Ferrer i Cancho, R., Solé, R.V., Köhler, R.: Patterns in syntactic dependency networks. Phys. Rev. E 69, 051915 (2004)
Freeman, L.C.: Centrality in social networks conceptual clarification. Soc. Network. 1(3), 215–239 (1978–1979)
Habermann, M.: Verbale Wortbildung um 1500. Eine historisch-synchrone Untersuchung anhand von Texten Albrecht Dürers, Heinrich Deichlers und Veit Dietrichs. de Gruyter, Berlin (1994)
Hotho, A., Nürnberger, A., Paaß, G.: A Brief Survey of Text Mining. J. Lang. Technol. Comput. Ling. (JLCL) 20(1), 19–62 (2005)
Köhler, R.: Zur linguistischen Synergetik: Struktur und Dynamik der Lexik. Brockmeyer, Bochum (1986)
Konstantinova, E.V.: On some applications of information indices in chemical graph theory. In: General Theory of Information Transfer and Combinatorics. Springer, New York (2006)
Konstantinova, E.V., Paleev, A.A.: Sensitivity of topological indices of polycyclic graphs (Russian). Vichisl. Systemy 136, 38–48 (1990)
Liu, H.: The complexity of Chinese syntactic dependency networks. Phys. A 387, 3048–3058 (2008)
Mehler, A.: Structural similarities of complex networks: A computational model by example of wiki graphs. Appl. Artif. Intell. 22, 619–683 (2008)
Mehler, A.: A quantitative graph model of social ontologies by example of Wikipedia. In: Dehmer, M., Emmert-Streib, F., Mehler, A. (eds.) Towards an Information Theory of Complex Networks: Statistical Methods and Applications. Birkhäuser, Boston/Basel (2011)
Mehler, A., Pustylnikov, O., Diewald, N.: Geography of social ontologies: Testing a variant of the Sapir-Whorf hypothesis in the context of Wikipedia. Comput. Speech Lang. 25(3), 716–740 (2011)
Mehler, A., Lücking, A., Weiß, P.: A network model of interpersonal alignment. Entropy 12(6), 1440–1483 (2010)
Plag, I.: Morphological Productivity. Structural Constraints in English Derivation. Mouton de Gruyter, Berlin/New York (1999)
Prell, H.P.: Die Ableitung von Verben aus Substantiven in biblischen und nichtbiblischen Texten des Frühneuhochdeutschen. Lang, Frankfurt am Main (1991)
Pustylnikov, O.: Modeling learning of derivation morphology in a multi-agent simulation. In: Proceedings of IEEE Africon 2009. IEEE (2009)
Abramov, O., Mehler, A.: Automatic Language Classification by Means of Syntactic Dependency Networks. Journal of Quantitative Linguistics (2011)
Pustylnikov, O., Schneider-Wiejowski, K.: Measuring morphological productivity. Studies in Quantitative Linguistics 5: Issues in Quantitative Linguistics, pp. 106–125 (2009)
Schneider-Wiejowski, K.: Sprachwandel anhand von Produktivitätsverschiebungen in der schweizerdeutschen Derivationsmorphologie. Linguistik online 38 (2009)
Schultink, H.: Produktiviteit als Morfologisch Fenomeen. Forum der Letteren 2, 110–125 (1961)
Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL, USA (1997)
Snijders, T.A.B.: The degree variance: An index of graph heterogeneity. Soc. Network. 3(3), 163–174 (1981)
Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998)
Acknowledgements
We would like to express our gratitude to Alexander Mehler and Kirill Medvedev for fruitful discussions and comments. Our special thanks go to Matthias Dehmer, whose useful hints and recommendations helped to improve this chapter.
This work is supported by the Linguistic Networks project (http://www.linguistic-networks.net/) funded by the German Federal Ministry of Education and Research (BMBF), and by the German Research Foundation Deutsche Forschungsgemeinschaft (DFG) in the Collaborative Research Center 673 “Alignment in Communication.”
Copyright information
© 2011 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Abramov, O., Lokot, T. (2011). Typology by Means of Language Networks: Applying Information Theoretic Measures to Morphological Derivation Networks. In: Dehmer, M., Emmert-Streib, F., Mehler, A. (eds) Towards an Information Theory of Complex Networks. Birkhäuser, Boston, MA. https://doi.org/10.1007/978-0-8176-4904-3_11
DOI: https://doi.org/10.1007/978-0-8176-4904-3_11
Publisher Name: Birkhäuser, Boston, MA
Print ISBN: 978-0-8176-4903-6
Online ISBN: 978-0-8176-4904-3
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)