Guide to OCR for Indic Scripts

Document Recognition and Retrieval

By Venu Govindaraju, Srirangaraj (Ranga) Setlur

This is the first comprehensive text on Optical Character Recognition (OCR) for Indic scripts. It covers both the recognition and the retrieval of Indic documents, and describes OCR systems for eight scripts: Bangla, Devanagari, Gurmukhi, Gujarati, Kannada, Malayalam, Tamil and Urdu.

  • ISBN13: 978-1-8480-0329-3
  • 348 Pages
  • User Level: Science
  • Publication Date: September 25, 2009
  • Available eBook Formats: PDF
  • eBook Price: $149.00

Full Description
This unique guide and reference is the first comprehensive book on OCR (Optical Character Recognition) for Indic scripts.

Features:

  • contains contributions from the leading researchers in the field
  • discusses data set creation for OCR development
  • describes OCR systems that cover 8 different scripts: Bangla, Devanagari, Gurmukhi, Gujarati, Kannada, Malayalam, Tamil, and Urdu (Perso-Arabic)
  • explores the challenges of Indic script handwriting recognition in the online domain
  • examines the development of handwriting-based text input systems
  • describes ongoing work to increase access to Indian cultural heritage materials
  • provides a section on the enhancement of text and images obtained from historical Indic palm leaf manuscripts
  • investigates different techniques for word spotting in Indic scripts
  • reviews mono-lingual and cross-lingual information retrieval in Indic languages

This is an excellent reference for researchers and graduate students studying OCR technology and methodologies.

Table of Contents


  Part I: Recognition of Indic Scripts.
  1. Building Data Sets for Indian Language OCR Research.
  2. On OCR of Major Indian Scripts: Bangla and Devanagari.
  3. A Complete Machine Printed Gurmukhi OCR System.
  4. Progress in Gujarati Document Processing and Character Recognition.
  5. Design of a Bilingual Kannada-English OCR.
  6. Recognition of Malayalam Documents.
  7. A Complete OCR System for Tamil Magazine Documents.
  8. Experiments on Urdu Text Recognition.
  9. The BBN Byblos Hindi OCR System.
  10. Generalization of Hindi OCR using Adaptive Segmentation and Font Files.
  11. Online Handwriting Recognition for Indic Scripts.
  Part II: Retrieval of Indic Documents.
  12. Enhancing Access to Primary Cultural Heritage Materials of India.
  13. Digital Image Enhancement of Indic Historical Manuscripts.
  14. GFG based Compression and Retrieval of Document Images in Indian Scripts.
  15. Word Spotting for Indic Documents to Facilitate Retrieval.
  16. Indian Language Information Retrieval.
Errata

No errata are currently published