Iterative ROUGE-Enhanced Text Summarization via Connected Dominating Set

Authors: Dr. Arnab Kumar Das, Mrs. Mridusmita Baruah

DOI Link: https://doi.org/10.22214/ijraset.2026.77077

Abstract

The rapid growth of online content has made fast, efficient automatic summarization increasingly important. Text summarization is receiving more attention as readers try to obtain as much information as possible in the shortest time. This paper introduces an iterative text summarization method that combines the Connected Dominating Set (CDS) algorithm with a quality check based on Recall-Oriented Understudy for Gisting Evaluation (ROUGE). The CDS algorithm identifies the important sentences in a document so that content coverage is guaranteed and connectedness is preserved. An iterative procedure then improves the summary by using the ROUGE metric to assess its coherence and fidelity to reference summaries.

Introduction

The rapid growth of online information has increased the need for automatic text summarization, as manual summarization of large documents is time-consuming and difficult. To address this challenge, researchers have explored graph-based approaches, which effectively model relationships between sentences to identify the most informative content. Among these, Dominating Set (DS) and Connected Dominating Set (CDS) models have gained attention for reducing redundancy while maintaining semantic coverage and coherence.
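For readers unfamiliar with the terminology, the standard graph-theoretic definitions can be stated as follows. The symbols $G$, $V$, $E$, and $D$ are notation introduced here for illustration and are not taken from the paper.

```latex
Let $G = (V, E)$ be an undirected graph whose nodes are sentences.
A set $D \subseteq V$ is a \emph{dominating set} if
\[
  \forall v \in V \setminus D \;\; \exists u \in D : \{u, v\} \in E ,
\]
that is, every sentence outside $D$ is similar to at least one sentence in $D$.
$D$ is a \emph{connected dominating set} if, in addition, the induced
subgraph $G[D]$ is connected. A minimum connected dominating set solves
\[
  \min_{D \subseteq V} |D|
  \quad \text{subject to } D \text{ dominating and } G[D] \text{ connected}.
\]
```

Finding a minimum connected dominating set is NP-hard in general, so practical methods typically rely on greedy or approximation heuristics.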

The literature highlights key graph-based methods such as TextRank and LexRank, which build sentence similarity graphs and use centrality measures to select important sentences. Machine learning, deep learning, and optimization techniques (e.g., Genetic Algorithms and Particle Swarm Optimization, PSO) have also shown success, but they often require high computational resources and large datasets. Dominating-set-based methods therefore offer a more efficient alternative by selecting a minimal set of representative sentences.

The proposed method represents a document as a graph with sentences as nodes and similarity-based edges. A Connected Dominating Set algorithm is applied to select key sentences, followed by iterative refinement using ROUGE evaluation to improve summary quality. Experimental results show that the approach achieves competitive ROUGE scores with lower redundancy compared to TextRank and LexRank, while maintaining good readability and coherence.
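The pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the bag-of-words cosine similarity, the `threshold` value, the greedy CDS heuristic, and the drop-one refinement loop are all assumptions, and the function names (`build_graph`, `greedy_cds`, `refine`) are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase word tokens (assumed preprocessing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def build_graph(sentences, threshold=0.1):
    """Sentences are nodes; an edge joins pairs with similarity >= threshold."""
    vecs = [Counter(tokenize(s)) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        if cosine(vecs[i], vecs[j]) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def greedy_cds(adj):
    """Greedy connected dominating set (assumed heuristic).

    Starts at the highest-degree node and repeatedly adds the already
    dominated node that covers the most new nodes. Every added node is
    adjacent to the current set, so the set stays connected. On a
    disconnected graph, nodes unreachable from the start stay uncovered.
    """
    if not adj:
        return set()
    start = max(adj, key=lambda n: (len(adj[n]), -n))
    cds = {start}
    covered = {start} | adj[start]
    while len(covered) < len(adj):
        frontier = [n for n in covered if n not in cds]
        best = max(frontier, key=lambda n: (len(adj[n] - covered), -n), default=None)
        if best is None or not (adj[best] - covered):
            break  # nothing reachable is left to cover
        cds.add(best)
        covered |= adj[best]
    return cds


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a candidate and a reference summary."""
    c, r = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def refine(sentences, selected, reference):
    """Drop one sentence at a time while doing so raises ROUGE-1 F1."""
    chosen = sorted(selected)
    score = rouge1_f1(" ".join(sentences[i] for i in chosen), reference)
    improved = True
    while improved and len(chosen) > 1:
        improved = False
        for i in chosen:
            trial = [k for k in chosen if k != i]
            trial_score = rouge1_f1(" ".join(sentences[k] for k in trial), reference)
            if trial_score > score:
                chosen, score, improved = trial, trial_score, True
                break
    return chosen, score
```

A typical call would be `refine(sentences, greedy_cds(build_graph(sentences)), reference)`. Note that in this sketch the refinement step can remove a sentence that held the selected set together, so the final summary is not guaranteed to remain a connected dominating set; how the paper handles this is not stated in the text above.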

Conclusion

1) The proposed Connected Dominating Set (CDS) method with iterative ROUGE refinement consistently outperformed both TextRank and LexRank in ROUGE-1 and ROUGE-2 F1 scores across all input sizes (50, 100, and 150 words).
2) Redundancy, measured as the average pairwise cosine similarity, was lowest for the proposed method, indicating that it effectively reduces overlap between selected sentences and promotes diversity in the summary.
3) TextRank and LexRank tend to select highly connected sentences located within dense clusters of the similarity graph, which can lead to repetitive content.
4) The CDS-based approach selects sentences that span multiple clusters, ensuring that different semantic regions of the text are covered.
5) The iterative ROUGE refinement further improves summary quality by eliminating sentences that contribute little to overall informativeness.

The comparative study shows that the proposed CDS-based summarization framework with iterative ROUGE refinement outperforms the baseline TextRank method across multiple evaluation dimensions. The CDS framework ensures comprehensive content coverage by selecting a connected subset of representative sentences, while the iterative refinement improves summary precision by optimizing the ROUGE score at each step. The resulting summary maintains high informativeness and coherence with minimal redundancy. The results confirm that the CDS-based technique provides a more balanced and interpretable extractive summary. Future work can focus on integrating semantic embeddings and abstractive post-processing to further improve summary quality.
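The two measures referred to in the conclusion, redundancy as average pairwise cosine similarity and ROUGE-N F1, can be computed along these lines. This is a sketch under stated assumptions: sentences are vectorized as raw term counts (the paper does not say whether term counts, TF-IDF, or embeddings were used), and the function names `redundancy` and `rouge_n_f1` are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokens(text):
    """Lowercase word tokens (assumed preprocessing)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def redundancy(summary_sentences):
    """Average cosine similarity over all pairs of summary sentences."""
    vecs = [Counter(tokens(s)) for s in summary_sentences]
    pairs = list(combinations(vecs, 2))
    if not pairs:
        return 0.0  # fewer than two sentences: no overlap to measure
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def rouge_n_f1(candidate, reference, n=2):
    """F1 over overlapping n-grams (n=1 gives ROUGE-1, n=2 gives ROUGE-2)."""

    def ngrams(text):
        t = tokens(text)
        return Counter(tuple(t[i : i + n]) for i in range(len(t) - n + 1))

    c, r = ngrams(candidate), ngrams(reference)
    overlap = sum((c & r).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(c.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)
```

Lower `redundancy` values indicate more diverse summaries, while higher `rouge_n_f1` values indicate closer agreement with the reference summary.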

Copyright

Copyright © 2026 Dr. Arnab Kumar Das, Mrs. Mridusmita Baruah. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET77077

Publish Date : 2026-01-22

ISSN : 2321-9653

Publisher Name : IJRASET
