Abstract
Currently, few language resources are available for French. In addition, there is a lack of available language models for tasks such as Named Entity Recognition and Classification (NERC), which makes it difficult to build natural language processing systems for this language. This paper presents a new publicly available supervised Apache OpenNLP NERC model that has been trained and tested under a maximum entropy approach. This new model achieves state-of-the-art results for French when compared with other systems. Finally, we have also extended the Apache OpenNLP libraries to support a part-of-speech feature extraction component, which has been used in our experiments.
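To make the idea of a maximum entropy tagger with part-of-speech features concrete, the following is a minimal Python sketch. It is not the authors' implementation and does not use the Apache OpenNLP API. The feature templates, the function names (token_features, predict_proba, train), the toy sentence, its POS tags and the entity labels are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def token_features(tokens, pos_tags, i):
    """Binary features for token i (hypothetical templates, not OpenNLP's)."""
    return [
        "w=" + tokens[i].lower(),                 # lowercased word form
        "cap=" + str(tokens[i][0].isupper()),     # initial capital?
        "pos=" + pos_tags[i],                     # POS tag of the token
        "prev_pos=" + (pos_tags[i - 1] if i > 0 else "BOS"),
        "next_pos=" + (pos_tags[i + 1] if i + 1 < len(tokens) else "EOS"),
    ]


def predict_proba(weights, labels, feats):
    """Maximum entropy model: softmax over summed feature weights per label."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in labels}
    top = max(scores.values())                    # subtract max for stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train(data, labels, epochs=50, lr=0.5):
    """Stochastic gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for feats, gold in data:
            probs = predict_proba(weights, labels, feats)
            for y in labels:
                # Gradient: observed indicator minus expected probability.
                grad = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(f, y)] += lr * grad
    return weights


# Toy example (assumed data): one French sentence with illustrative POS tags.
tokens = ["Marie", "habite", "à", "Paris", "."]
pos_tags = ["NPP", "V", "P", "NPP", "PONCT"]
gold = ["PER", "O", "O", "LOC", "O"]
labels = ["PER", "LOC", "O"]

data = [(token_features(tokens, pos_tags, i), gold[i]) for i in range(len(tokens))]
weights = train(data, labels)

predicted = []
for feats, _ in data:
    probs = predict_proba(weights, labels, feats)
    predicted.append(max(probs, key=probs.get))
print(list(zip(tokens, predicted)))
```

In this sketch the POS tags enter the model only as additional binary features next to the word form and capitalisation, which is one simple way a part-of-speech feature component can feed a maximum entropy classifier. A real system would train on an annotated corpus and evaluate on held-out data.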
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Azpeitia, A., Cuadros, M., Gaines, S., Rigau, G. (2014). NERC-fr: Supervised Named Entity Recognition for French. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech and Dialogue. TSD 2014. Lecture Notes in Computer Science, vol 8655. Springer, Cham. https://doi.org/10.1007/978-3-319-10816-2_20
DOI: https://doi.org/10.1007/978-3-319-10816-2_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10815-5
Online ISBN: 978-3-319-10816-2
eBook Packages: Computer Science (R0)