MRZ code extraction from visa and passport documents using convolutional neural networks

Yichuan Liu ORCID: orcid.org/0000-0002-1899-2082¹,
Hailey Joren¹,
Otkrist Gupta¹ &
…
Dan Raviv¹

A Correction to this article was published on 11 July 2023

This article has been updated

Abstract

Detecting and extracting information from the machine-readable zone (MRZ) on passports and visas is becoming increasingly important for verifying document authenticity. However, existing computer vision methods for similar tasks, such as optical character recognition, fail to extract the MRZ from digital images of passports with reasonable accuracy. We present a specially designed model based on convolutional neural networks that extracts MRZ information from digital images of passports of arbitrary orientation and size. Our model achieves a 100% MRZ detection rate and a 99.25% character recognition macro-F1 score on a passport and visa dataset.
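The abstract reports a character recognition macro-F1 score but does not say how it is computed. As a minimal sketch, assuming the predicted and ground-truth MRZ strings are aligned position by position and that every character seen in either string is its own class, macro-F1 is the unweighted mean of the per-class F1 scores. The function name `macro_f1` and the sample strings are hypothetical and are not taken from the article.

```python
from collections import Counter


def macro_f1(truth: str, pred: str) -> float:
    """Unweighted mean of per-character-class F1 over two aligned strings.

    Assumption: truth and pred have equal length and are compared
    position by position; this is not necessarily the authors' protocol.
    """
    if len(truth) != len(pred):
        raise ValueError("strings must be aligned and of equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(truth, pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fn[t] += 1      # true class t was missed
            fp[p] += 1      # class p was predicted where it did not occur

    classes = set(truth) | set(pred)
    if not classes:
        return 0.0          # nothing to score

    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical example: the digit 0 is misread as the letter O.
    print(macro_f1("AB<0", "AB<O"))
```

Because every class carries equal weight, a single confusion between rare characters (such as `0` and `O`) lowers the macro-F1 more than it lowers plain character accuracy.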

Availability of data and material

Not available.

Funding

This research was supported by Lendbuzz.

Author information

Authors and Affiliations

Lendbuzz, 125 High Street, Boston, MA, USA
Yichuan Liu, Hailey Joren, Otkrist Gupta & Dan Raviv

Corresponding author

Correspondence to Yichuan Liu.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Code availability

Not available.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article was revised to update the second author's name.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, Y., Joren, H., Gupta, O. et al. MRZ code extraction from visa and passport documents using convolutional neural networks. IJDAR 25, 29–39 (2022). https://doi.org/10.1007/s10032-021-00384-2

Received: 23 October 2020
Revised: 24 June 2021
Accepted: 29 June 2021
Published: 14 July 2021
Issue Date: March 2022
DOI: https://doi.org/10.1007/s10032-021-00384-2
