Abstract
Bound documents, whether scanned or captured with digital cameras, often exhibit a geometric warp that curls the text lines. Identifying the text lines is one of the steps in document de-warping when only a single image is available. This paper presents a new method for text-line segmentation. It builds on a simple but effective skew detector proposed by Ávila and Lins, and simplifies the coupled-snakes idea introduced by Bukhari to a moving parallel line regression. The proposed method outperformed the best of the similar algorithms in the literature.
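The abstract does not spell out the regression step, so the following is only an illustrative sketch of the general idea of fitting straight lines by least squares inside a window that moves along a curled text line. It is not the authors' algorithm: the function names `fit_line` and `moving_line_fits`, the `window` and `step` parameters, and the use of (x, y) points such as connected-component centroids are all assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fit_line(points: List[Point]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are equal; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def moving_line_fits(
    points: List[Point], window: float, step: float
) -> List[Tuple[float, float, float]]:
    """Fit a line to the points inside each window [x0, x0 + window).

    Returns a list of (x0, slope, intercept), one entry per window that
    contains at least two points with distinct x values.
    (Hypothetical helper; the window and step sizes are assumptions.)
    """
    pts = sorted(points)
    if not pts:
        return []
    fits = []
    x0, x_end = pts[0][0], pts[-1][0]
    while x0 <= x_end:
        inside = [p for p in pts if x0 <= p[0] < x0 + window]
        if len({x for x, _ in inside}) >= 2:
            a, b = fit_line(inside)
            fits.append((x0, a, b))
        x0 += step
    return fits
```

On a warped line, the slope returned for successive windows would drift as the curl increases, which is the kind of local orientation estimate a de-warping step could consume.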
References
Masalovitch, A., Mestetskiy, L.: Usage of continuous skeletal image representation for document images de-warping. In: Proceedings of International Workshop on Camera-Based Document Analysis and Recognition, Curitiba, pp. 45–53 (2007)
Fu, B., Wu, M., Li, R., Li, W., Xu, Z.: A model-based book de-warping method using text line detection. In: 2nd Int. Workshop on Camera-Based Document Analysis and Recognition, Curitiba, Brazil (September 2007)
Ávila, B.T., Lins, R.D.: A fast orientation and skew detection algorithm for monochromatic document images. In: Proceedings of the ACM Symposium on Document Engineering, Bristol, UK, pp. 118–126 (2005)
Lins, R.D., Oliveira, D.M., Torreão, G., Fan, J., Thielo, M.: Correcting Book Binding Distortion in Scanned Documents. In: Campilho, A., Kamel, M. (eds.) ICIAR 2010, Part II. LNCS, vol. 6112, pp. 355–365. Springer, Heidelberg (2010)
Shafait, F., Breuel, T.M.: Document Image De-warping Contest. In: 2nd Int. Workshop on Camera-Based Document Analysis and Recognition, CBDAR 2007, Brazil, September 2007, pp. 181–188 (2007)
Stamatopoulos, N., Gatos, B., Pratikakis, I., Perantonis, S.J.: A two-step de-warping of camera document images. In: Proceedings 8th IAPR Workshop on Document Analysis Systems, Nara, Japan, pp. 209–216 (2008)
Bukhari, S.S., Shafait, F., Breuel, T.M.: Coupled snakelet model for curled textline segmentation of camera-captured document images. In: Proceedings 10th International Conference on Document Analysis and Recognition, Barcelona, Spain, pp. 61–65 (2009)
Bukhari, S.S., Shafait, F., Breuel, T.M.: Ridges based curled textline region detection from grayscale camera-captured document images. In: Jiang, X., Petkov, N. (eds.) Computer Analysis of Images and Patterns. LNCS, vol. 5702, pp. 173–180. Springer, Heidelberg (2009)
Bukhari, S.S., Shafait, F., Breuel, T.M.: Segmentation of curled textlines using active contours. In: Proceedings 8th IAPR Workshop on Document Analysis Systems, Nara, Japan, pp. 270–277 (2008)
Bukhari, S.S., Shafait, F., Breuel, T.M.: Textline information extraction from grayscale camera-captured document images. In: Proc. The 13th International Conference on Image Processing, Cairo, Egypt (2009)
Bukhari, S.S.: Technical Report: Performance Evaluation and Benchmarking of Three Curled Textline Segmentation Algorithms. IUPR Technical Report, Kaiserslautern (2010)
Wolfram Research. Least Squares Fitting, http://mathworld.wolfram.com/LeastSquaresFitting.html (accessed January 15, 2010)
Shafait, F., Keysers, D., Breuel, T.M.: Performance evaluation and benchmarking of six page segmentation algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence 30(6), 941–954 (2008)
Naylor, M.: Typographic line terms, http://en.wikipedia.org/wiki/File:Typography_Line_Terms.svg
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oliveira, D.M., Lins, R.D., Torreão, G., Fan, J., Thielo, M. (2010). A New Method for Text-Line Segmentation for Warped Documents. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2010. Lecture Notes in Computer Science, vol 6112. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13775-4_40
DOI: https://doi.org/10.1007/978-3-642-13775-4_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13774-7
Online ISBN: 978-3-642-13775-4
eBook Packages: Computer Science (R0)