Typology by Means of Language Networks: Applying Information Theoretic Measures to Morphological Derivation Networks

Chapter in: Towards an Information Theory of Complex Networks

Abstract

In this chapter we present a network-theoretic approach to linguistics. In particular, we introduce a network model of derivational morphology in languages. We focus on suffixation as a mechanism for deriving new words from existing ones. We induce networks from natural language data consisting of words, derivational suffixes and parts of speech (PoS), as well as the relations between them. By measuring the entropy of these networks by means of so-called information functionals, we aim to capture the variation between typologically different languages. In doing so, we rely on the work of Dehmer (Appl Math Comput 201:82–94, 2008), who introduced a framework for measuring the entropy of graphs. In addition, we compare several entropy measures recently proposed for graphs. We check whether these measures allow us to distinguish between language networks on the one hand and random networks on the other. We found that linguistic variation among languages can be captured by investigating the topology of the underlying networks. Further, information functionals based on distributions of topological properties turned out to be better discriminators than those based on properties of single vertices.

MSC2010 Primary 94C15; Secondary 90C35, 05C90.
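
To make the idea of measuring graph entropy with an information functional concrete, the following is a minimal sketch in the spirit of Dehmer's framework: each vertex receives a positive value f(v), the values are normalized to a probability distribution, and the Shannon entropy of that distribution is taken. The functional used here raises a base `alpha` to a weighted sum of j-sphere sizes (the number of vertices at shortest-path distance j). The parameter values (`alpha=2.0`, weights `1/j`), the function names and the toy network are assumptions chosen for illustration, not the settings or data used in the chapter.

```python
import math
from collections import deque


def sphere_sizes(adj, source):
    """Return {j: |S_j(source)|} for j >= 1 using breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    sizes = {}
    for d in dist.values():
        if d > 0:
            sizes[d] = sizes.get(d, 0) + 1
    return sizes


def graph_entropy(adj, alpha=2.0, coeff=lambda j: 1.0 / j):
    """Entropy -sum_v p(v) ln p(v) with p(v) = f(v) / sum_u f(u).

    f(v) = alpha ** sum_j coeff(j) * |S_j(v)|; alpha and coeff are
    illustrative defaults, not values taken from the chapter.
    """
    f = {}
    for v in adj:
        exponent = sum(coeff(j) * n for j, n in sphere_sizes(adj, v).items())
        f[v] = alpha ** exponent
    total = sum(f.values())
    return -sum((x / total) * math.log(x / total) for x in f.values())


# Hypothetical toy network linking words, a suffix and PoS tags
# (undirected adjacency lists; not data from the chapter).
toy = {
    "teach": ["-er", "V"],
    "-er": ["teach", "teacher"],
    "teacher": ["-er", "N"],
    "V": ["teach"],
    "N": ["teacher"],
}
print(graph_entropy(toy))
```

When every vertex has the same neighbourhood structure, as in a cycle, all f(v) coincide and the entropy reaches its maximum ln |V|; less regular graphs give smaller values.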

Notes

  1. In fact, only 4 of the 136 nodes have a degree > 1.

  2. Dehmer [12] uses log to calculate the entropy. We use ln here for all functionals, which has no impact on the final relative entropy values (see the definition below).

  3. We selected these combinations since they performed best in the parameter study shown in Table 11.6.

  4. ER graphs are connected, undirected random graphs [15] of the cardinality of German and English. BA [5] and WA [39] are randomly generated small-world graphs of the cardinality of German. We generate ten graphs of each kind of random network (i.e., ten graphs for ER, ten for BA, etc.) and compare the averaged entropy values.

  5. See [21] for details.
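
The claim in Note 2 can be checked with a short derivation. The definition of relative entropy is not part of this excerpt, so the sketch below assumes the common normalization of the entropy by its maximum value, $\log_b |V|$; the chapter's own definition may differ. The symbols $p_i$ (vertex probabilities), $b$ (logarithm base), $H_b$ and $H_e$ are notation introduced here for illustration.

```latex
H_b = -\sum_{i=1}^{|V|} p_i \log_b p_i
    = \frac{1}{\ln b}\Bigl(-\sum_{i=1}^{|V|} p_i \ln p_i\Bigr)
    = \frac{H_e}{\ln b},
\qquad
\frac{H_b}{\log_b |V|}
    = \frac{H_e / \ln b}{\ln |V| / \ln b}
    = \frac{H_e}{\ln |V|}.
```

Under this assumption, changing the base rescales numerator and denominator by the same factor $1/\ln b$, so the ratio is identical for log and ln.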

References

  1. Altmann, G., Lehfeldt, W.: Allgemeine Sprachtypologie. Wilhelm Fink, Germany (1973)

  2. Aronoff, M.: Word Formation in Generative Grammar. MIT, Cambridge (1976)

  3. Baayen, H.: Quantitative Aspects of Morphological Productivity. In: Geert Booij, J.M. (ed.) Yearbook of Morphology, pp. 109–149. Kluwer, Dordrecht, Boston, London (1991)

  4. Baayen, H.: On frequency, transparency, and productivity. Yearbook of Morphology 1992, pp. 181–208 (1992)

  5. Barabási, A.L., Albert, R.: Emergence of scaling in random networks. Science 286, 509–512 (1999)

  6. Bauer, L.: Morphological Productivity. Cambridge University Press, Cambridge (2001)

  7. Bertinetto, P.M., Noccetti, S.: Prolegomena to ATAM acquisition. Theoretical premises and corpus labeling. Quaderni del Laboratorio di Linguistica della SNS n.6 ns. (2006)

  8. Bonchev, D., Rouvray, D.H.: Complexity in Chemistry, Biology, and Ecology. Mathematical and Computational Chemistry. Springer, New York (2005)

  9. Brandes, U.: A faster algorithm for betweenness centrality. J. Math. Sociol. 25, 163–177 (2001)

  10. Bybee, J.L.: Morphology as Lexical Organization, Chap. 7, pp. 119–141. Academic, London (1988)

  11. Clahsen, H., Sonnenstuhl, I., Blevins, J.P.: Derivational morphology in the German mental lexicon: a dual mechanism account. In: Baayen, H., Schreuder, R. (eds.) Morphological Structure in Language Processing, pp. 125–155. Mouton de Gruyter (2003)

  12. Dehmer, M.: Information processing in complex networks: Graph entropy and information functionals. Appl. Math. Comput. 201, 82–94 (2008)

  13. Dehmer, M., Varmuza, K., Borgert, S., Emmert-Streib, F.: On entropy-based molecular descriptors: statistical analysis of real and synthetic chemical structures. J. Chem. Inform. Model. 49(7), 1655–1663 (2009)

  14. Dressler, W.U., Karpf, A.: The theoretical relevance of pre- and protomorphology in language acquisition. Yearbook of Morphology 1994, pp. 99–122 (1995)

  15. Erdős, P., Rényi, A.: On random graphs. Publicationes Mathematicae 6, 290–297 (1959)

  16. Evert, S., Lüdeling, A.: Measuring Morphological Productivity: Is Automatic Preprocessing Sufficient? In: Rayson, P., Wilson, A., McEnery, T., Hardie, A., Khoja, S. (eds.) Proceedings of the Corpus Linguistics 2001 conference, pp. 167–175. Lancaster (2001)

  17. Ferrer i Cancho, R., Mehler, A., Pustylnikov, O., Díaz-Guilera, A.: Correlations in the organization of large-scale syntactic dependency networks. In: TextGraphs-2: Graph-Based Algorithms for Natural Language Processing, pp. 65–72 (2007)

  18. Ferrer i Cancho, R., Solé, R.V., Köhler, R.: Patterns in syntactic dependency networks. Phys. Rev. E 69, 051915 (2004)

  19. Freeman, L.C.: Centrality in social networks: Conceptual clarification. Soc. Network. 1(3), 215–239 (1978–1979)

  20. Habermann, M.: Verbale Wortbildung um 1500. Eine historisch-synchrone Untersuchung anhand von Texten Albrecht Dürers, Heinrich Deichlers und Veit Dietrichs. de Gruyter, Berlin (1994)

  21. Hotho, A., Nürnberger, A., Paaß, G.: A Brief Survey of Text Mining. J. Lang. Technol. Comput. Ling. (JLCL) 20(1), 19–62 (2005)

  22. Köhler, R.: Zur linguistischen Synergetik: Struktur und Dynamik der Lexik. Brockmeyer, Bochum (1986)

  23. Konstantinova, E.V.: On some applications of information indices in chemical graph theory. In: General Theory of Information Transfer and Combinatorics. Springer, New York (2006)

  24. Konstantinova, E.V., Paleev, A.A.: Sensitivity of topological indices of polycyclic graphs (Russian). Vichisl. Systemy 136, 38–48 (1990)

  25. Liu, H.: The complexity of Chinese syntactic dependency networks. Phys. A 387, 3048–3058 (2008)

  26. Mehler, A.: Structural similarities of complex networks: A computational model by example of wiki graphs. Appl. Artif. Intell. 22, 619–683 (2008)

  27. Mehler, A.: A quantitative graph model of social ontologies by example of Wikipedia. In: Dehmer, M., Emmert-Streib, F., Mehler, A. (eds.) Towards an Information Theory of Complex Networks: Statistical Methods and Applications. Birkhäuser, Boston/Basel (2011)

  28. Mehler, A., Pustylnikov, O., Diewald, N.: Geography of social ontologies: Testing a variant of the Sapir-Whorf hypothesis in the context of Wikipedia. Comput. Speech Lang. 25(3), 716–740 (2011)

  29. Mehler, A., Lücking, A., Weiß, P.: A network model of interpersonal alignment. Entropy 12(6), 1440–1483 (2010)

  30. Plag, I.: Morphological Productivity. Structural Constraints in English Derivation. Mouton de Gruyter, Berlin/New York (1999)

  31. Prell, H.P.: Die Ableitung von Verben aus Substantiven in biblischen und nichtbiblischen Texten des Frühneuhochdeutschen. Lang, Frankfurt am Main (1991)

  32. Pustylnikov, O.: Modeling learning of derivation morphology in a multi-agent simulation. In: Proceedings of IEEE Africon 2009. IEEE (2009)

  33. Abramov, O., Mehler, A.: Automatic Language Classification by Means of Syntactic Dependency Networks. Journal of Quantitative Linguistics (2011)

  34. Pustylnikov, O., Schneider-Wiejowski, K.: Measuring morphological productivity. Studies in Quantitative Linguistics 5: Issues in Quantitative Linguistics, pp. 106–125 (2009)

  35. Schneider-Wiejowski, K.: Sprachwandel anhand von Produktivitätsverschiebungen in der schweizerdeutschen Derivationsmorphologie. Linguistik online 38 (2009)

  36. Schultink, H.: Produktiviteit als Morfologisch Fenomeen. Forum der Letteren 2, 110–125 (1961)

  37. Shannon, C.E., Weaver, W.: The Mathematical Theory of Communication. University of Illinois Press, Urbana, IL, USA (1997)

  38. Snijders, T.A.B.: The degree variance: An index of graph heterogeneity. Soc. Network. 3(3), 163–174 (1981)

  39. Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998)

Acknowledgements

We would like to express our gratitude to Alexander Mehler and Kirill Medvedev for fruitful discussions and comments. Our special thanks go to Matthias Dehmer, whose useful hints and recommendations helped to improve this chapter.

This work is supported by the Linguistic Networks project (http://www.linguistic-networks.net/), funded by the German Federal Ministry of Education and Research (BMBF), and by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) in the Collaborative Research Center 673 “Alignment in Communication.”

Author information

Corresponding author

Correspondence to Olga Abramov.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Abramov, O., Lokot, T. (2011). Typology by Means of Language Networks: Applying Information Theoretic Measures to Morphological Derivation Networks. In: Dehmer, M., Emmert-Streib, F., Mehler, A. (eds) Towards an Information Theory of Complex Networks. Birkhäuser, Boston, MA. https://doi.org/10.1007/978-0-8176-4904-3_11
