Author Leedham, G. ♦ Varma, S. ♦ Patankar, A. ♦ Govindaraju, V.
Sponsorship CEDAR, Univ. Buffalo ♦ Microsoft ♦ Siemens ♦ Hitachi ♦ Motorola ♦ U.S. Postal Service ♦ A2iA ♦ Int. Assoc. Pattern Recognition
Source IEEE Xplore Digital Library
Content type Text
Publisher Institute of Electrical and Electronics Engineers, Inc. (IEEE)
File Format PDF
Copyright Year ©2002
Language English
Subject Domain (in DDC) Technology ♦ Engineering & allied operations ♦ Other branches of engineering
Subject Keyword Degradation ♦ Pixel ♦ Entropy ♦ Writing ♦ Venus ♦ Text analysis ♦ Image analysis ♦ Performance analysis ♦ Image recognition ♦ Text recognition
Abstract Before any processing of the textual content of a document image can be performed, the text must be separated from the background of the image. Several thresholding algorithms have previously been proposed and are widely used in document processing. None has been shown to be effective at thresholding difficult documents where the background and foreground are non-uniform. In this paper we investigate the use of three global thresholding algorithms (Otsu's, Kapur's entropy and Solihin's quadratic integral ratio (QIR)) as the first stage in a multi-stage thresholding algorithm for use on degraded document images. It is concluded that Otsu's and Kapur's algorithms do not work well for difficult documents as they tend to over-threshold the image, thus losing much of the useful information. The QIR algorithm is more accurate in separating the foreground and background in these images, leaving a range of undecided, fuzzy pixels for processing in a subsequent stage.
Description Author affiliation: Sch. of Comput. Eng., Nanyang Technol. Univ., Singapore (Leedham, G.)
ISBN 0769516920
Educational Role Student ♦ Teacher
Age Range Above 22 years
Educational Use Research ♦ Reading
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2002-08-06
Publisher Place Canada
Rights Holder Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Size 530.46 kB
Page Count 6
Starting Page 244
Ending Page 249
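
Illustrative note on the abstract: the sketch below shows how a global threshold of the kind the abstract attributes to Otsu can be computed from a grey-level histogram, by picking the level that maximises the between-class variance. It is not the paper's implementation. The function names otsu_threshold and binarize are hypothetical, and the code assumes 8-bit integer grey levels with dark text on a lighter background. Kapur's entropy method and the QIR method are not shown.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form the lower class, the rest the upper class.
    Assumes integer grey levels in range(levels).
    """
    # Build the grey-level histogram.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of grey levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break             # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, t):
    """Mark pixels at or below t as text (1), assuming dark ink."""
    return [1 if p <= t else 0 for p in pixels]


if __name__ == "__main__":
    # Toy input: 50 dark "ink" pixels and 50 bright "paper" pixels.
    sample = [10] * 50 + [200] * 50
    t = otsu_threshold(sample)
    print(t, sum(binarize(sample, t)))
```

A single global threshold like this assigns every pixel to one of two classes. The abstract's multi-stage scheme instead leaves a range of undecided pixels for a later stage, which this sketch does not attempt.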
