Abstract
In this paper, we present a framework for handling recognition errors in an N-best list of output phrases produced by a handwriting recognition system, with the aim of using the resulting phrases as inputs to a higher-level application. The framework can be decomposed into four main steps: phrase alignment, followed by detection, characterization, and correction of word error hypotheses. First, the N-best phrases are aligned to the top-list phrase, and word posterior probabilities are computed and used as confidence indices: a word of the top-list phrase is flagged as an error hypothesis by comparing its posterior probability with a learned threshold. Then, the errors are characterized into predefined types by a trained SVM, which takes as input the word posterior probabilities of the top-list phrase along with other features. Finally, the output phrase is retrieved by a correction step that uses the characterized error hypotheses and a purpose-built word-to-class backoff language model. First experiments were conducted on the ImadocSen-OnDB handwritten sentence database and on the IAM-OnDB handwritten text database, using two recognizers. We report first results obtained with an implementation of the proposed framework on transcripts of handwritten phrases provided by these recognition systems.
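The alignment and detection steps can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each N-best phrase comes with a log-score, that alignment is done by word-level edit distance, and that the posterior of a top-list word is the normalized score mass of the phrases that agree with it at the aligned position. The function names (`align_to_top`, `word_posteriors`, `detect_errors`), the toy scores, and the threshold value are hypothetical; in the framework the threshold is learned.

```python
import math


def align_to_top(top, hyp):
    """For each word position in `top`, return the word of `hyp` aligned
    to it by word-level edit distance, or None if nothing is aligned."""
    n, m = len(top), len(hyp)
    # dp[i][j] = edit distance between top[:i] and hyp[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i
    for j in range(1, m + 1):
        dp[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dp[i - 1][j - 1] + (top[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    # Backtrace to recover which hyp word faces each top word.
    aligned = [None] * n
    i, j = n, m
    while i > 0 and j > 0:
        if dp[i][j] == dp[i - 1][j - 1] + (top[i - 1] != hyp[j - 1]):
            aligned[i - 1] = hyp[j - 1]
            i, j = i - 1, j - 1
        elif dp[i][j] == dp[i - 1][j] + 1:
            i -= 1  # top word has no counterpart in hyp
        else:
            j -= 1  # extra word in hyp
    return aligned


def word_posteriors(nbest):
    """nbest: list of (words, log_score), best phrase first.
    Returns one posterior estimate per word of the top-list phrase."""
    top = nbest[0][0]
    max_log = max(score for _, score in nbest)
    # Subtract the maximum before exponentiating for numerical stability.
    weights = [math.exp(score - max_log) for _, score in nbest]
    total = sum(weights)
    posteriors = [0.0] * len(top)
    for (words, _), weight in zip(nbest, weights):
        aligned = align_to_top(top, words)
        for k, word in enumerate(top):
            if aligned[k] == word:
                posteriors[k] += weight / total
    return posteriors


def detect_errors(nbest, threshold):
    """Flag top-list words whose posterior falls below `threshold`
    (assumed fixed here; learned from data in the framework)."""
    top = nbest[0][0]
    return [(w, p, p < threshold) for w, p in zip(top, word_posteriors(nbest))]


# Toy N-best list with hypothetical scores.
nbest = [
    ("the cat sat".split(), math.log(0.5)),
    ("the car sat".split(), math.log(0.3)),
    ("the cat set".split(), math.log(0.2)),
]
flags = detect_errors(nbest, threshold=0.75)
```

On this toy list, only the second word of the top-list phrase receives a posterior below the assumed threshold, so it alone becomes an error hypothesis for the later characterization and correction steps.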
Cite this article
Quiniou, S., Cheriet, M. & Anquetil, E. Error handling approach using characterization and correction steps for handwritten document analysis. IJDAR 15, 125–141 (2012). https://doi.org/10.1007/s10032-011-0156-6
DOI: https://doi.org/10.1007/s10032-011-0156-6