
Advances in Computer Vision and Pattern Recognition

Guide to OCR for Indic Scripts

Document Recognition and Retrieval

Editors: Govindaraju, Venu; Setlur, Srirangaraj Ranga (Eds.)

  • First comprehensive book on the topic of Indic Script OCRs

Buy this book

eBook $149.00
price for USA
  • ISBN 978-1-84800-330-9
  • Digitally watermarked, DRM-free
  • Included format: PDF, EPUB
  • ebooks can be used on all reading devices
  • Download immediately after purchase
Hardcover $189.00
price for USA
  • ISBN 978-1-84800-329-3
  • Free shipping for individuals worldwide
  • Usually dispatched within 3 to 5 business days.
Softcover $189.00
price for USA
  • ISBN 978-1-4471-2518-1
  • Free shipping for individuals worldwide
  • Usually dispatched within 3 to 5 business days.
About this book

Optical Character Recognition (OCR) is a key enabling technology critical to creating indexed, digital library content, and it is especially valuable for Indic scripts, for which there has been very little digital access.

Indic scripts, the ancient Brahmi scripts prevalent in the Indian subcontinent, present challenges for OCR that differ from those faced with Latin and Oriental scripts. Properly utilized, however, OCR will help make Indic digital archives practically accessible to researchers and lay users alike by creating searchable indexes and machine-readable text repositories.

This unique guide/reference is the very first comprehensive book on the subject of OCR for Indic scripts, providing an overview of the state-of-the-art research in this field as well as other issues related to facilitating query and retrieval of Indic documents from digital libraries. All major research groups working in this area are represented in this book, which is divided into sections on recognition of Indic scripts and retrieval of Indic documents.

Topics and features:

  • Contains contributions from the leading researchers in the field
  • Discusses data set creation for OCR development
  • Describes OCR systems that cover eight different scripts: Bangla, Devanagari, Gurmukhi, Gujarati, Kannada, Malayalam, Tamil, and Urdu (Perso-Arabic)
  • Explores the challenges of Indic script handwriting recognition in the online domain
  • Examines the development of handwriting-based text input systems
  • Describes ongoing work to increase access to Indian cultural heritage materials
  • Provides a section on the enhancement of text and images obtained from historical Indic palm leaf manuscripts
  • Investigates different techniques for word spotting in Indic scripts
  • Reviews mono-lingual and cross-lingual information retrieval in Indic languages

This is an excellent reference for researchers and graduate students studying OCR technology and methodologies. This volume will contribute to opening up the rich Indian cultural heritage embodied in millions of ancient and contemporary documents spanning topics such as science, literature, medicine, astronomy, mathematics and philosophy.

Venu Govindaraju, FIEEE, FIAPR, is a Distinguished Professor of Computer Science and Engineering at the University at Buffalo. He has over 20 years of research experience in pattern recognition, information retrieval and biometrics. His seminal work on handwriting recognition was at the core of the first handwritten address interpretation system used by the U.S. Postal Service.

Srirangaraj Setlur, SMIEEE, is a Principal Research Scientist at the University at Buffalo. He has over 15 years of research experience in pattern recognition, including NSF-sponsored work on multilingual OCR technologies for digital libraries and other applications. His work on postal automation has led to technology adopted by the U.S. Postal Service and by Royal Mail in the U.K.

Table of contents (16 chapters)

  • Building Data Sets for Indian Language OCR Research

    Jawahar, C.V. (et al.)

    Pages 3-25

  • On OCR of Major Indian Scripts: Bangla and Devanagari

    Chaudhuri, B. B.

    Pages 27-42

  • A Complete Machine-Printed Gurmukhi OCR System

    Lehal, G. S.

    Pages 43-71

  • Progress in Gujarati Document Processing and Character Recognition

    Dholakia, Jignesh (et al.)

    Pages 73-95

  • Design of a Bilingual Kannada–English OCR

    Umesh, R.S. (et al.)

    Pages 97-124

Bibliographic Information

Book Title
Guide to OCR for Indic Scripts
Book Subtitle
Document Recognition and Retrieval
Editors
  • Venu Govindaraju
  • Srirangaraj Ranga Setlur
Series Title
Advances in Computer Vision and Pattern Recognition
Copyright
2010
Publisher
Springer-Verlag London
Copyright Holder
Springer-Verlag London
eBook ISBN
978-1-84800-330-9
DOI
10.1007/978-1-84800-330-9
Hardcover ISBN
978-1-84800-329-3
Softcover ISBN
978-1-4471-2518-1
Series ISSN
2191-6586
Edition Number
1
Number of Pages
XXI, 325
Number of Illustrations and Tables
150 b/w illustrations, 11 illustrations in colour
Topics