subscribe to arXiv mailings

ICDAR 2021 Competition on Components Segmentation Task of Document Photos

Authors: Celso A. M. Lopes Junior, Ricardo B. das Neves Junior, Byron L. D. Bezerra, Alejandro H. Toselli, Donato Impedovo

Abstract: This paper describes the short-term competition on the Components Segmentation Task of Document Photos that was prepared in the context of the 16th International Conference on Document Analysis and Recognition (ICDAR 2021). This competition aims to bring together researchers working in the field of identification document image processing and provides them a suitable benchmark to compare their tec… ▽ More This paper describes the short-term competition on the Components Segmentation Task of Document Photos that was prepared in the context of the 16th International Conference on Document Analysis and Recognition (ICDAR 2021). This competition aims to bring together researchers working in the field of identification document image processing and provides them a suitable benchmark to compare their techniques on the component segmentation task of document images. Three challenge tasks were proposed entailing different segmentation assignments to be performed on a provided dataset. The collected data are from several types of Brazilian ID documents, whose personal information was conveniently replaced. There were 16 participants whose results obtained for some or all the three tasks show different rates for the adopted metrics, like Dice Similarity Coefficient ranging from 0.06 to 0.99. Different Deep Learning models were applied by the entrants with diverse strategies to achieve the best results in each of the tasks. Obtained results show that the currently applied methods for solving one of the proposed tasks (document boundary detection) are already well established. However, for the other two challenge tasks (text zone and handwritten sign detection) research and development of more robust approaches are still required to achieve acceptable results. △ Less

Submitted 8 July, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: 15 pages; 5 figures; Accepted at ICDAR 2021: 16th International Conference on Document Analysis and Recognition

arXiv:2005.14229 [pdf, other]

FCN+RL: A Fully Convolutional Network followed by Refinement Layers to Offline Handwritten Signature Segmentation

Authors: Celso A. M. Lopes Junior, Matheus Henrique M. da Silva, Byron Leite Dantas Bezerra, Bruno Jose Torres Fernandes, Donato Impedovo

Abstract: Although secular, handwritten signature is one of the most reliable biometric methods used by most countries. In the last ten years, the application of technology for verification of handwritten signatures has evolved strongly, including forensic aspects. Some factors, such as the complexity of the background and the small size of the region of interest - signature pixels - increase the difficulty… ▽ More Although secular, handwritten signature is one of the most reliable biometric methods used by most countries. In the last ten years, the application of technology for verification of handwritten signatures has evolved strongly, including forensic aspects. Some factors, such as the complexity of the background and the small size of the region of interest - signature pixels - increase the difficulty of the targeting task. Other factors that make it challenging are the various variations present in handwritten signatures such as location, type of ink, color and type of pen, and the type of stroke. In this work, we propose an approach to locate and extract the pixels of handwritten signatures on identification documents, without any prior information on the location of the signatures. The technique used is based on a fully convolutional encoder-decoder network combined with a block of refinement layers for the alpha channel of the predicted image. The experimental results demonstrate that the technique outputs a clean signature with higher fidelity in the lines than the traditional approaches and preservation of the pertinent characteristics to the signer's spelling. To evaluate the quality of our proposal, we use the following image similarity metrics: SSIM, SIFT, and Dice Coefficient. The qualitative and quantitative results show a significant improvement in comparison with the baseline system. △ Less

Submitted 28 May, 2020; originally announced May 2020.

Comments: 7 pages, 6 figures, Accepted at IJCNN 2020: International Joint Conference on Neural Networks

arXiv:2004.01317 [pdf, other]

doi 10.1109/IJCNN48605.2020.9206711

A Fast Fully Octave Convolutional Neural Network for Document Image Segmentation

Authors: Ricardo Batista das Neves Junior, Luiz Felipe Verçosa, David Macêdo, Byron Leite Dantas Bezerra, Cleber Zanchettin

Abstract: The Know Your Customer (KYC) and Anti Money Laundering (AML) are worldwide practices to online customer identification based on personal identification documents, similarity and liveness checking, and proof of address. To answer the basic regulation question: are you whom you say you are? The customer needs to upload valid identification documents (ID). This task imposes some computational challen… ▽ More The Know Your Customer (KYC) and Anti Money Laundering (AML) are worldwide practices to online customer identification based on personal identification documents, similarity and liveness checking, and proof of address. To answer the basic regulation question: are you whom you say you are? The customer needs to upload valid identification documents (ID). This task imposes some computational challenges since these documents are diverse, may present different and complex backgrounds, some occlusion, partial rotation, poor quality, or damage. Advanced text and document segmentation algorithms were used to process the ID images. In this context, we investigated a method based on U-Net to detect the document edges and text regions in ID images. Besides the promising results on image segmentation, the U-Net based approach is computationally expensive for a real application, since the image segmentation is a customer device task. We propose a model optimization based on Octave Convolutions to qualify the method to situations where storage, processing, and time resources are limited, such as in mobile and robotic applications. We conducted the evaluation experiments in two new datasets CDPhotoDataset and DTDDataset, which are composed of real ID images of Brazilian documents. Our results showed that the proposed models are efficient to document segmentation tasks and portable. △ Less

Submitted 2 April, 2020; originally announced April 2020.

Comments: This paper was accepted for IJCNN 2020 Conference

Journal ref: 2020 International Joint Conference on Neural Networks (IJCNN)

Showing 1–3 of 3 results for author: Bezerra, B L D