A Survey of Optical Character Recognition Techniques on Indic Script

Published Sep 19, 2021
Ramya Ch, Dr. B. Vishnu Vardhan

Abstract

Optical Character Recognition (OCR) is a technique that converts printed text and images into a digitized form that a machine can manipulate. It has applications in many sectors, such as banking, finance, and law. Early researchers proposed many image-processing algorithms for character recognition and mapping. Most of them focused on English in the Latin script, because it was supported by the ASCII encoding standard. Later, OCR techniques for other languages began to gain momentum. With advances in technology and the adoption of Unicode, OCR solutions for native languages started to emerge. This paper surveys recent machine learning techniques applied to OCR for English and for two languages of the Indian subcontinent: Hindi, which is stroke-based, and Telugu, which uses a cursive script.

How to Cite

Ch, R., & B, V. V. (2021). A Survey of Optical Character Recognition Techniques on Indic Script. SPAST Abstracts, 1(01). Retrieved from https://spast.org/techrep/article/view/581

Keywords

Optical Character Recognition, Artificial Neural Network, Convolutional Neural Network, Image Processing

Section
GE3- Computers & Information Technology
