A language is a careful articulation of artefacts which, at a basic level, include a noun, an article, a verb, an adjective, a preposition, a connective, a clause, an adverb, and a certain amount of punctuation. Since ancient times, people have learned language for effective communication, beginning with alphabets whose letters combine to form words and, eventually, sentences. Recently, Artificial Intelligence (AI) and Machine Learning (ML) have produced language models that can assist in the generation of sentences. The most common techniques are Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs), Convolutional Neural Networks (CNNs), Gated Recurrent Units (GRUs), and others. For a sentence to be generated, the basic model of a Noun Phrase followed by a Verb Phrase must be applied. In this research article we present an algorithm called ANYA (Polynomial Approximation) to help generate sentences.
Introduction
This article discusses sentence structure and generation using AI methods, introducing a novel algorithm called ANYA (Polynomial Approximation) for generating simple sentences. A sentence is understood in terms of syntax (rules of structure) and semantics (meaning), and is typically composed of a Noun Phrase (NP) and a Verb Phrase (VP).
Sentence Structure
A sentence is generally formed as:
Noun Phrase (NP): An article + noun (e.g., A boy, The tree)
Verb Phrase (VP): Action or description (e.g., is eating, on a bicycle)
Examples:
A boy on a bicycle → NP: A boy, VP: on a bicycle
John is eating → NP: John, VP: is eating
A tree is beautiful → NP: A tree, VP: is beautiful
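The NP + VP composition above can be sketched in a few lines of Python. This is a minimal illustration using the examples from this section, not the article's implementation; the phrase lists and function name are assumptions made for the sketch.

```python
# Minimal sketch: composing sentences from a Noun Phrase (NP) and a
# Verb Phrase (VP), using the examples given in the text.
import random

noun_phrases = ["A boy", "John", "A tree"]                    # article + noun, or a name
verb_phrases = ["is eating", "on a bicycle", "is beautiful"]  # action or description

def make_sentence(np: str, vp: str) -> str:
    """Join an NP and a VP into the basic NP + VP sentence form."""
    return f"{np} {vp}"

print(make_sentence("A boy", "on a bicycle"))  # A boy on a bicycle
print(make_sentence("John", "is eating"))      # John is eating

# A random NP/VP pairing is syntactically valid but may be semantically
# odd (e.g. "A tree is eating") -- which is why a correctness check is
# needed later in the article.
print(make_sentence(random.choice(noun_phrases), random.choice(verb_phrases)))
```

Random pairing keeps the grammar intact while leaving semantics to chance, which motivates the correctness mapping used by ANYA.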
Technologies for Sentence Generation
Several AI/ML models are used:
ANN, LSTM, CNN, RNN, GRU, SVM
ANYA Algorithm Methodology
Input:
Sentence templates (3 files)
A thesaurus (word-number mappings)
An info file (correctness mapping)
Processing:
Convert sentences into polynomials
Load and structure thesaurus and info files
Generation:
Create a random sentence from polynomial samples
Output a correct sentence via polynomial approximation using correctness info
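The pipeline above can be sketched as follows. The text does not specify the file formats, the thesaurus contents, or the exact polynomial encoding, so everything in this sketch is an illustrative assumption: each word maps to an integer via the thesaurus, a sentence becomes the coefficient list of a polynomial, and the info table marks which coefficient lists are semantically correct.

```python
# Hedged sketch of the ANYA pipeline: encode sentences as number lists,
# sample random candidates, and keep only those the info table marks correct.
import random

# Thesaurus: word -> number mapping (assumed contents).
thesaurus = {"a": 1, "boy": 2, "is": 3, "eating": 4, "tree": 5, "beautiful": 6}
numbers = {v: k for k, v in thesaurus.items()}

# Info file: correctness mapping (assumed contents) -- encoded sentences
# that are acceptable.
correct = {(1, 2, 3, 4),   # "a boy is eating"
           (1, 5, 3, 6)}   # "a tree is beautiful"

def encode(sentence: str) -> tuple:
    """Convert a sentence into its polynomial coefficient list."""
    return tuple(thesaurus[w] for w in sentence.lower().split())

def decode(coeffs: tuple) -> str:
    """Map a coefficient list back to words via the thesaurus."""
    return " ".join(numbers[c] for c in coeffs)

def generate(max_tries: int = 1000) -> str:
    """Sample random coefficient lists until one matches the info table."""
    template = random.choice(list(correct))  # fixes the sentence length
    for _ in range(max_tries):
        candidate = tuple(random.choice(list(thesaurus.values()))
                          for _ in template)
        if candidate in correct:             # correctness check
            return decode(candidate)
    return decode(template)                  # fall back to a known-correct sentence

print(encode("a boy is eating"))  # (1, 2, 3, 4)
print(generate())
```

The random sampling step mirrors the article's observation that unconstrained generation respects syntax but not semantics; filtering through the correctness mapping is what yields an acceptable sentence.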
Related Work
This article reviews 19 research papers across various domains of sentence generation:
Corpus-based and neural models ([1], [2]): Focus on learning from large text datasets, sometimes incorporating external knowledge to enhance sentence generation.
Domain-specific generation ([3], [8]): Toolkits like Ascle for medical texts and deep learning methods for economic reports.
Model evaluation and customization ([4], [6], [7]): Propose new evaluation metrics, survey public perceptions, and assess customization.
Deep learning models ([5], [9], [14], [16], [17]): Employ RNN, LSTM, GRU, transformers, and GANs for sentence construction, chatbot development, and NLP tasks.
Syntax-aware and semantic-aware generation ([10], [11], [18], [19]): Explore methods using HMM, analogy, graph theory, and VAEs for structured or meaning-preserving output.
Simple sentence generation ([12], [13], [15]): Use rule-based, neural, and hybrid approaches to produce concise and clear sentences.
Conclusion
Sentence generation by computer has been an area of active research over the past decade.
The research so far has involved various machine learning and AI based algorithms, including Deep Neural Networks, Case-Based Reasoning, neural methods, and Large Language Models.
The literature review for this research article has been drawn from various sources available on the internet according to certain criteria.
An algorithm called ANYA (Polynomial Approximation) is presented as part of this research. The algorithm makes a polynomial approximation of a sentence so that the sentence can be represented as a list of numbers.
Numbers from this list are then chosen at random to generate a sentence that may not be semantically appropriate, but that still follows the syntactic rules of the grammar.
The ANYA algorithm then produces a correct sentence by consulting the correct interpretations of the sentence.
References
[1] Daza, A., Calvo, H., Figueroa-Nazuno (2016). Automatic Text Generation by Learning from Literary Structures, Proceedings of the Fifth Workshop on Computational Linguistics for Literature.
[2] Chen, S., Wang, J., Feng, X., Jiang, F., Qin, B., Lin, C. Y. (2019). Enhancing Neural Data-To-Text Generation Models with External Background Knowledge, Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, pages 3022-3032.
[3] Yang, R., Zeng, Q., You, K., Qiao, Y., Huang, L., Hsieh, C. C., Rosand, B., Goldwasser, J., Dave, A., Keenan, T., Ke, Y., Hong, C., Liu, N., Chew, C., Radev, D., Lu, Z., Xu, H., Chen, Q., Li, I. (2024). Ascle- A Python Language Processing Toolkit for Medical Text Generation: Development and Evaluation Study, Journal of Medical Internet Research, Vol. 26.
[4] Wang, Y., Jiang, J., Zhang, M., Li, C., Liang, Y. (2023). Automated Evaluation of Personalized Text Generation using Large Language Models
[5] Pawade, D., Sakhapara, A., Jain, M., Jain, N., Gada, K. (2017). Story Scrambler – Automated Text Generation using Word Level RNN-LSTM, I. J. Information Technology and Computer Science, 6, 44-53.
[6] Celikyilmaz, A., Clark, E., Gao, J. (2021). Evaluation of Text Generation: A Survey, arXiv
[7] Henestrosa, A. L., Kimmerle, J. (2024). Understanding and Perception of Automated Text Generation among the Public: Two Surveys with Representative Samples in Germany, Behavioral Sciences, 14, 353.
[8] Karkouri, A. A., Lazrak, M., Ghanimi, F., Amrani, H. E., Benammi,D., Bourekkadi, S. (2023). Journal of Theoretical and Applied Information Technology, Vol. 101, No. 23.
[9] Kumar, M., Kumar, A., Singh, A., Kumar, A. (2021). Analysis of Automated Text Generation Using Deep Learning, International Journal for Research in Advanced Computer Science and Engineering, Vol. 7, Issue 4.
[10] Harrison, B., Purdy, C., Riedl, M. O. (2017). Toward Automated Story Generation with Markov Chain Monte Carlo Methods and Deep Neural Networks.
[11] Hervas, R., Pereira, F. C., Gervas, P., Cardoso, A. Cross-Domain Analogy in Automated Text Generation.
[12] Upadhyay, L., Hasan. M. I., Patel, P. S. (2023). Demystifying Text Generation Approaches.
[13] Layne, S., Gehrmann, S., Dernoncourt, F., Wang, L., Bui, T., Chang, W. (2022). A Framework for Automated Text Generation Benchmarking.
[14] Iqbal, T., Qureshi, S. (2020). The Survey: Text Generation Models in Deep Learning. Journal of King Saud University – Computer and Information Sciences.
[15] Upadhyay, A., Massie, S., Singh, R. K., Gupta, G., Ojha, M. (2021). A case-based approach to data-to-text generation.
[16] Gayam, S. R. (2022). Generative AI for Content Creation: Advanced Techniques for Automated Text Generation, Image Synthesis, and Video Production, Journal of Science & Technology, Vol. 3, Issue 1.
[17] Li, J., Tang, T., Zhao, W. X., Wen, J. R. (2021). Pretrained Language Models for Text Generation: A Survey, Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21).
[18] Guo, Q., Qiu, X., Xue, X., Zhang, Z. (2019). Syntax-guided text generation via graph neural network, Science China, Information Sciences, Vol. 64.
[19] Hu, Z., Yang, Z., Liang, X., Salakhutdinov, R., Xing, E. P. (2017). Toward Controlled Generation of Text.