Devanagari is the most widely used script among Indian languages of Proto-Indo-European origin. However, relatively little research has been conducted on Devanagari, leaving significant room for exploration and advancement. Generative Adversarial Networks (GANs) are unsupervised machine-learning models used for smart augmentation of a dataset. In this paper, we create a collection of high-quality images of Devanagari numeric digits from scratch using a novel method that collects the input data through an electronic medium only. This leaves little room for error and unwanted noise compared to conventional methods. The images are 128x128 pixels in size, totalling 12,000 images. We then train two types of GANs - DCGAN and CGAN - on the images and compare their results. In general, the DCGAN models were found to perform much better and produce more realistic images than the CGAN. Our work is useful in several applications, including Optical Character Recognition and accessibility tools.
Introduction
With the rise of machine learning, large, high-quality datasets—especially image data—are crucial, particularly for Optical Character Recognition (OCR). While Latin scripts have abundant datasets like MNIST, Devanagari script, used in Hindi, Marathi, and Sanskrit, lacks large, quality datasets due to its complexity and regional variations. Manual dataset creation is slow and error-prone, so techniques like image augmentation and advanced generative models such as GANs (Generative Adversarial Networks) and VAEs (Variational Autoencoders) are used to synthesize new data, potentially improving dataset size and variety.
This work focuses on creating a high-resolution dataset for Devanagari numeric digits and evaluating different GAN architectures for generating realistic synthetic images.
Literature Review
Previous attempts to generate Devanagari characters using GANs (e.g., DCGAN, CGAN) show promising but imperfect results, often producing lower quality images than original datasets. CGANs are simpler to train but typically produce less detailed images than DCGANs. Existing datasets mostly come from scanned handwritten samples, which can be noisy and biased.
Methodology
Data Acquisition: Instead of physical handwriting collection and scanning, a novel digital method using an Android app allows volunteers to draw digits directly on a 128x128 pixel canvas. This results in high-resolution, less noisy images with diverse stroke thickness, collected from a wide contributor base.
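The app's output can be thought of as a 128x128 grayscale raster produced from the volunteer's on-screen strokes. A minimal sketch of that idea, assuming stroke points arrive as normalized (x, y) canvas coordinates (the function name and interface are illustrative, not the paper's actual app code):

```python
import numpy as np

def rasterize_strokes(strokes, size=128, value=255):
    """Render (x, y) stroke points in [0, 1) canvas coordinates onto a size x size grid."""
    canvas = np.zeros((size, size), dtype=np.uint8)
    for x, y in strokes:
        col = min(int(x * size), size - 1)  # horizontal position -> column
        row = min(int(y * size), size - 1)  # vertical position -> row
        canvas[row, col] = value
    return canvas

# Example: two touch points land at the expected pixels
canvas = rasterize_strokes([(0.5, 0.5), (0.25, 0.75)])
```

A real app would interpolate between successive points and vary stroke thickness; this sketch only shows how digital capture sidesteps the scanning noise of paper-based collection.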
Data Preprocessing: Basic cleaning and augmentation (rotation, scaling, translation) are applied to enhance dataset variety.
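The three augmentations named above (rotation, scaling, translation) are all affine transforms and can be applied with a single inverse-mapped warp. A minimal nearest-neighbour sketch (an illustration of the technique, not the paper's actual preprocessing code):

```python
import numpy as np

def affine_augment(img, angle_deg=0.0, scale=1.0, tx=0, ty=0):
    """Rotate/scale/translate a 2D grayscale image about its centre (nearest-neighbour)."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    theta = np.deg2rad(angle_deg)
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    ys, xs = np.mgrid[0:h, 0:w]
    # Inverse mapping: for each output pixel, locate the source pixel it came from
    xr = xs - cx - tx
    yr = ys - cy - ty
    src_x = np.round(( cos_t * xr + sin_t * yr) / scale + cx).astype(int)
    src_y = np.round((-sin_t * xr + cos_t * yr) / scale + cy).astype(int)
    valid = (src_x >= 0) & (src_x < w) & (src_y >= 0) & (src_y < h)
    out = np.zeros_like(img)
    out[valid] = img[src_y[valid], src_x[valid]]
    return out
```

For handwritten digits, small parameter ranges (e.g. a few degrees of rotation, scale near 1.0, shifts of a few pixels) keep the augmented samples label-preserving.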
Synthetic Image Generation:
DCGAN: Two networks, a generator (which creates images from noise) and a discriminator (which distinguishes real from fake), are trained adversarially using a minimax or Wasserstein loss function.
CGAN: A conditional GAN that generates images based on specified digit labels, enabling one model to generate all classes.
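The adversarial training described above can be made concrete by writing out the standard minimax losses numerically. A minimal sketch, assuming the discriminator emits sigmoid probabilities for a batch of real and a batch of generated images (the non-saturating generator loss shown here is the common practical variant):

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """Minimax discriminator loss and non-saturating generator loss.

    d_real: discriminator probabilities on real images (wants these near 1).
    d_fake: discriminator probabilities on generated images (wants these near 0,
            while the generator wants them near 1).
    """
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss
```

When the discriminator is perfect (d_real = 1, d_fake = 0) its loss vanishes while the generator's loss is large, which is exactly the tension the adversarial game exploits.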
Model Architectures:
A separate DCGAN, with a simpler sequential architecture, is trained to generate each digit class.
CGAN uses more complex, non-sequential architectures with batch normalization and dropout to handle multiple digits simultaneously.
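The key mechanism that lets one CGAN cover all ten digits is label conditioning: the digit label, typically one-hot encoded, is fed into the generator alongside the noise vector. A minimal sketch of that input construction (the function and the 100-dimensional noise size are illustrative assumptions, not the paper's exact configuration):

```python
import numpy as np

def conditional_input(z, digit, n_classes=10):
    """Concatenate a one-hot digit label onto the noise vector, forming a CGAN generator input."""
    one_hot = np.zeros(n_classes)
    one_hot[digit] = 1.0
    return np.concatenate([z, one_hot])

# A 100-dim noise vector conditioned on the digit 3 yields a 110-dim input
v = conditional_input(np.random.randn(100), 3)
```

The discriminator receives the same label (concatenated or embedded), so both networks learn a class-aware mapping and a single trained model can generate any requested digit.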
Results
Dataset collected from 96 volunteers using the app yielded 12,000 images (1,200 per digit).
Augmented dataset expanded to 60,000 images total.
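The reported counts are internally consistent; a quick check (the five-fold expansion factor, i.e. four augmented variants per original, is our inference from the stated 12,000 and 60,000 totals, not a figure the paper spells out):

```python
digits = 10
per_digit = 1200
originals = digits * per_digit           # 12,000 collected images
variants_per_image = 4                   # inferred: four augmented copies per original
total = originals * (1 + variants_per_image)
print(originals, total)                  # 12000 60000
```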
The Android app approach improves quality and diversity compared to traditional methods.
Conclusion
This paper focuses on the synthetic generation of Devanagari numeric digits using Generative Adversarial Networks (GANs). We tested two types of GANs - DCGAN and CGAN - and found that the per-digit DCGAN models individually perform better than the single CGAN model. Collecting and using a dataset of large images (128x128) proved very beneficial. In general, the output generated by the DCGANs was nearly identical to the original dataset.