Neural network methods of text image recognition

Author(s): Tymchenko O. V., Havrysh B. M., Durniak B. V.
Collection number: № 1 (81)
Pages: 72-88

The paper considers a methodology for improving the quality of text image recognition, in particular for images distorted by noise and by geometric distortions caused by poor scanning or photography. It is shown that the problem of text document recognition is closely related to the problem of pattern recognition, so that template matching, statistical methods, and neural-network-based methods can all be applied to text recognition. A comparison of these methods shows the expediency of using structural recognition methods together with neural networks.

Methods and algorithms for constructing a fuzzy text image recognition system are considered, and a generalized structural scheme of a text image recognition system and the functions of its main nodes are presented. Methods for pre-processing text images are described that improve image quality, convert the initial color image to grayscale, and reduce noise using linear (Gaussian) and nonlinear (median) filtering. An algorithm for preliminary fuzzy processing of images for their segmentation is presented. Features of binarizing such images by means of a threshold surface, which reduces the influence of uneven illumination, are shown. A fuzzy processing algorithm has been created for edge detection and character segmentation.
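As an illustration of the pre-processing chain described above, the following is a minimal sketch in plain Python, not the authors' Matlab implementation. It assumes standard luminance weights for the grayscale conversion and builds the threshold surface as a local mean minus a fixed offset; the abstract does not state how the surface is constructed, so that choice, along with the function names `to_grayscale`, `median_filter`, `binarize_local` and the parameters `radius` and `offset`, is hypothetical.

```python
from statistics import median


def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to luminance (assumed BT.601 weights)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def window(img, y, x, radius):
    """Collect pixel values in a square window, clipped at the image border."""
    h, w = len(img), len(img[0])
    return [img[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))]


def median_filter(img, radius=1):
    """Nonlinear noise reduction: replace each pixel by its window median."""
    return [[median(window(img, y, x, radius)) for x in range(len(img[0]))]
            for y in range(len(img))]


def binarize_local(img, radius=2, offset=10):
    """Binarize against a threshold surface (assumed: local mean minus offset).

    Returns 1 for ink (darker than the local threshold), 0 for background.
    """
    out = []
    for y in range(len(img)):
        row = []
        for x in range(len(img[0])):
            vals = window(img, y, x, radius)
            threshold = sum(vals) / len(vals) - offset
            row.append(1 if img[y][x] < threshold else 0)
        out.append(row)
    return out
```

Because the threshold is computed per pixel from its neighborhood, a slowly varying illumination gradient shifts the threshold along with the background, which is the property the threshold-surface approach relies on. A Gaussian filter could be substituted for `median_filter` where the noise is not impulsive.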

For structural recognition of text images, methods for creating the corresponding grammar using graph theory and the theory of formal languages and grammars are considered. The methods used for horizontal and vertical concatenation of image elements are shown. Recognition consists in finding the derivation of an image in a given grammar that is best in a certain sense. Methods of training a neural network with the error backpropagation algorithm are considered, and the network is tested. A segmentation was entered for each line of text, and the methods were configured so that the result of line recognition coincided with the entered segmentation.
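The idea of building symbols by horizontal and vertical concatenation, and recognizing an image by choosing the best derivation, can be sketched as follows. This is a toy example under stated assumptions: the primitives (`bar_v`, `bar_h`, `gap`), the three-symbol `GRAMMAR`, and the use of Hamming distance as the meaning of "best" are all hypothetical, since the abstract specifies neither the grammar nor the quality criterion.

```python
def hcat(a, b):
    """Horizontal concatenation: place b to the right of a (equal heights)."""
    assert len(a) == len(b)
    return [ra + rb for ra, rb in zip(a, b)]


def vcat(a, b):
    """Vertical concatenation: place b below a (equal widths)."""
    assert len(a[0]) == len(b[0])
    return a + b


# Hypothetical primitives as tiny binary bitmaps (1 = ink, 0 = background).
def bar_v(h):
    return [[1] for _ in range(h)]


def bar_h(w):
    return [[1] * w]


def gap(h, w):
    return [[0] * w for _ in range(h)]


# Toy grammar: each 4x3 symbol is derived by concatenating primitives.
GRAMMAR = {
    "L": vcat(hcat(bar_v(3), gap(3, 2)), bar_h(3)),
    "T": vcat(bar_h(3), hcat(hcat(gap(3, 1), bar_v(3)), gap(3, 1))),
    "I": hcat(hcat(gap(4, 1), bar_v(4)), gap(4, 1)),
}


def hamming(a, b):
    """Number of pixels in which two equally sized bitmaps differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognize(image):
    """Return the symbol whose derived bitmap is closest to the image."""
    return min(GRAMMAR, key=lambda s: hamming(GRAMMAR[s], image))
```

For instance, an "L" bitmap with one flipped pixel, `[[1, 0, 0], [1, 0, 1], [1, 0, 0], [1, 1, 1]]`, is still assigned to `"L"`, because its distance to the derived "L" is 1 while its distance to either other symbol is 9. A realistic system would search over derivations rather than enumerate a fixed set of finished symbols.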

The result of recognizing a noisy image is presented. The developed neuro-fuzzy system (with a window of 5x5 pixels) is compared with the commercial product ABBYY FineReader 11 Corporate Edition (the page text image contains 702 characters; Gaussian noise is 0.03 of the black level). For practical implementation, the Matlab Simulink software environment with the built-in fuzzy logic elements of the Fuzzy Logic Toolbox was chosen. It is concluded that increasing the number of inner layers of the neural network increases the number of correctly recognized symbols, but at the cost of a much longer processing time. The time and space complexity of the developed algorithms are determined by the number of fragments examined during their operation.

Keywords: recognition system, fuzzy and noisy image processing, text image recognition, structural recognition, neural networks.

doi: 10.32403/0554-4866-2021-1-81-72-88

