Abstract
This paper describes a new method for localizing color text in generic scene images that contain text in different scripts and at arbitrary orientations. A representative set of colors is first identified using edge information and is used to initialize an unsupervised clustering algorithm. Text components are then identified in each color layer using a combination of a support vector machine and a neural network classifier trained on a set of low-level features derived from geometric, boundary, stroke and gradient information. Experiments on camera-captured images with varying fonts, sizes and colors, irregular layouts, non-uniform illumination and multiple scripts illustrate the robustness of the method. The proposed method yields a precision of 0.8 and a recall of 0.86 on a database of 100 images. The method is also compared with others in the literature using the ICDAR 2003 robust reading competition dataset.
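For readers who want to relate the reported precision and recall to a single score, the following is a minimal sketch of how these measures are commonly computed and combined into a weighted harmonic mean (f-measure). The abstract does not state the matching protocol or report an f-measure, so the function names, the `alpha` parameter and the example counts are illustrative assumptions and not the authors' evaluation code.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) computed from detection counts."""
    detected = true_positives + false_positives   # everything the detector reported
    actual = true_positives + false_negatives     # everything in the ground truth
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


def f_measure(precision, recall, alpha=0.5):
    """Weighted harmonic mean of precision and recall.

    alpha = 0.5 weights both measures equally (an assumed default).
    """
    if precision == 0 or recall == 0:
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    p, r = precision_recall(true_positives=80, false_positives=20, false_negatives=13)
    print(f"precision={p:.2f} recall={r:.2f} f={f_measure(p, r):.3f}")
```

The harmonic mean is used rather than the arithmetic mean because it penalizes a large gap between precision and recall, so a detector cannot score well by maximizing only one of them.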
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kasar, T., Ramakrishnan, A.G. (2012). Multi-script and Multi-oriented Text Localization from Scene Images. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29364-1_1
DOI: https://doi.org/10.1007/978-3-642-29364-1_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29363-4
Online ISBN: 978-3-642-29364-1
eBook Packages: Computer Science, Computer Science (R0)