Access Restriction Subscribed

Author Pal, U. ♦ Chaudhuri, B.B.
Source IEEE Xplore Digital Library
Content type Text
Publisher Institute of Electrical and Electronics Engineers, Inc. (IEEE)
File Format PDF
Copyright Year ©1999
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Special computer methods
Subject Keyword Natural languages ♦ Optical character recognition software ♦ Shape ♦ Optical filters ♦ Computer vision ♦ Pattern recognition ♦ Writing ♦ Read only memory ♦ Character generation
Abstract In a multilingual country like India, a document page may contain more than one script form. Under the three-language formula, the document may be printed in English, Devnagari and one of the other official Indian languages. For OCR of such a document page, it is necessary to separate these three script forms before feeding them to the OCRs of the individual scripts. In this paper, an automatic technique for separating the text lines using script characteristics and shape-based features is presented. At present, the system has an overall accuracy of about 98.5%.
Description Author affiliation: Comput. Vision & Pattern Recognition Unit, Indian Stat. Inst., Calcutta, India (Pal, U.)
ISBN 0769503187
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research ♦ Reading
Education Level UG and PG
Learning Resource Type Article
Publisher Date 1999-09-22
Publisher Place India
Rights Holder Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Size 49.84 kB
Page Count 4
Starting Page 406
Ending Page 409

