Every company or organization holds meetings to discuss topics, issues, and developments. A meeting summary lets participants keep track of the points that were discussed. Reading and understanding full meeting documents is generally time-consuming, so summaries matter: they convey the essential content of a discussion in concise form for readers who are interested only in its key points. The main idea of this project is to develop a text and video summarizer using Natural Language Processing (NLP), so that the user can obtain a summary instead of reading and analyzing the whole document. For text summarization, we build a Seq2Seq model with LSTM and an attention layer, then train and test the model on datasets; the trained model generates a summary of a text document. For video summarization, we consider NLP-based text summarization algorithms, namely TextRank, LexRank, Luhn, and LSA. These algorithms are run on the video's subtitles to generate the summary of the video. In this way, both text and video documents are summarized using NLP.
Introduction
Overview
Natural Language Processing (NLP), a subfield of Artificial Intelligence, enables computers to understand and process human language. This project applies NLP to automatically summarize text and video content from business meetings, reducing the time and effort needed for manual summarization.
Key Applications of NLP
Language translation
Chatbots
Voice assistants
Sentiment & survey analysis
Email filtering
Speech recognition
Text Summarization
Approach: Uses Abstractive Summarization via Seq2Seq (Sequence-to-Sequence) models built with LSTM and an Attention Layer.
Model Structure:
Encoder (LSTM): Reads and processes the input text sequence.
Decoder (LSTM): Predicts the output summary sequence word by word.
Attention Mechanism: Helps focus on relevant parts of the input for better accuracy.
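To make the attention step concrete, the following is a minimal sketch in plain Python of how a decoder state can be scored against the encoder states to form a context vector. The source does not say which attention variant the project uses; dot-product scoring is assumed here purely for illustration, and the function names and toy vectors are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attend(decoder_state, encoder_states):
    """Return (attention weights, context vector) for one decoder step.

    Assumption: dot-product scoring between the decoder state and each
    encoder state; the project's actual scoring function is not specified.
    """
    scores = [dot(decoder_state, h) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context vector: weighted average of the encoder states.
    context = [sum(w * h[i] for w, h in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context

# Toy example: three encoder states (one per input word), one decoder state.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(decoder_state, encoder_states)
```

In this toy case the first and third input positions receive equal, larger weights than the second, because their encoder states align better with the decoder state. In a trained model the states would come from the LSTM encoder and decoder rather than being written by hand.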
Video Summarization
Method: Applies text summarization algorithms on video subtitles.
Algorithms Used:
TextRank
LexRank
Luhn
LSA (Latent Semantic Analysis)
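As an illustration of how a graph-based ranker of this kind scores sentences, here is a small from-scratch sketch in the spirit of TextRank. It is not the implementation used in the project or in any library: the tokenizer, the word-overlap similarity, the damping value, and the iteration count are all assumptions chosen for brevity.

```python
import math
import re

def tokenize(sentence):
    """Lowercase a sentence and return its set of words."""
    return set(re.findall(r"[a-z']+", sentence.lower()))

def similarity(a, b):
    """Word overlap normalised by the log of each sentence's vocabulary size."""
    wa, wb = tokenize(a), tokenize(b)
    if len(wa) < 2 or len(wb) < 2:
        return 0.0  # avoids log(1) = 0 in the denominator
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))

def text_rank(sentences, damping=0.85, iterations=50):
    """Score sentences by running PageRank-style updates on a similarity graph."""
    n = len(sentences)
    if n == 0:
        return []
    sim = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(sim[j][i] / out_weight[j] * scores[j]
                           for j in range(n) if out_weight[j] > 0)
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores

def summarize(sentences, k):
    """Return the k best-scoring sentences in their original order."""
    scores = text_rank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]

sentences = [
    "The team reviewed the project budget for next quarter.",
    "The budget for the project will increase next quarter.",
    "Lunch was served at noon.",
    "The project team agreed on the new budget.",
]
summary = summarize(sentences, 2)
```

The sentence about lunch shares no words with the others, so it receives only the baseline score and is left out of the two-sentence summary. LexRank follows a similar graph-based idea with a different similarity measure, while Luhn and LSA score sentences by word significance and by latent topics respectively.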
Process:
Subtitles converted into text.
Python library sumy used for ranking and summarizing subtitles.
User-defined duration guides the number of subtitle sentences selected.
MoviePy used to edit video using selected timestamps.
Final summarized video is created by merging top-ranked segments.
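The selection and merging steps above can be sketched as follows. This is a hypothetical outline rather than the project's code: it assumes each subtitle is already available as a (start, end, text) tuple in seconds with a ranking score attached, fills the user's duration budget greedily, and joins neighbouring segments within an assumed 0.5-second gap. The actual cutting and concatenation with MoviePy is left out.

```python
def select_segments(subtitles, scores, target_seconds):
    """Pick the highest-scoring subtitles until the duration budget is used up.

    subtitles: list of (start_sec, end_sec, text) tuples.
    scores: one ranking score per subtitle (e.g. from a sentence ranker).
    """
    ranked = sorted(range(len(subtitles)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0.0
    for i in ranked:
        start, end, _ = subtitles[i]
        length = end - start
        if used + length <= target_seconds:
            chosen.append(i)
            used += length
    return sorted(chosen)  # restore chronological order

def merge_intervals(subtitles, chosen, gap=0.5):
    """Join selected subtitles whose timestamps touch or nearly touch."""
    merged = []
    for i in chosen:
        start, end, _ = subtitles[i]
        if merged and start - merged[-1][1] <= gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(m) for m in merged]

# Toy subtitles with made-up scores.
subtitles = [
    (0.0, 4.0, "Welcome everyone."),
    (4.0, 9.0, "Revenue grew ten percent."),
    (9.0, 12.0, "Let us take a short break."),
    (12.0, 18.0, "We will hire two engineers."),
    (18.0, 21.0, "Thanks, goodbye."),
]
scores = [0.1, 0.9, 0.05, 0.8, 0.2]

chosen = select_segments(subtitles, scores, target_seconds=12)
intervals = merge_intervals(subtitles, chosen)
# intervals == [(4.0, 9.0), (12.0, 18.0)]
```

Each resulting interval would then be cut from the original video and the clips concatenated in order to form the summarized video. Merging adjacent intervals first avoids producing many tiny clips with visible jumps between them.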
Proposed Methodology
Sequence-to-Sequence Modelling:
Suitable for converting long input sequences to shorter output summaries.
Utilizes LSTM for handling long-term dependencies in language.
<start> and <end> tokens used to control sequence decoding.
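The role of the <start> and <end> tokens during decoding can be shown with a short sketch. Greedy decoding is assumed here; the source does not state the decoding strategy. The `toy_next_token` function is a hypothetical stand-in for the trained LSTM decoder, which would normally return the most probable next word given the tokens generated so far.

```python
START, END = "<start>", "<end>"

def greedy_decode(next_token, max_len=20):
    """Generate tokens one at a time until <end> appears or max_len is reached."""
    tokens = [START]  # decoding always begins from the <start> token
    while len(tokens) - 1 < max_len:
        token = next_token(tokens)
        if token == END:  # the model signals that the summary is complete
            break
        tokens.append(token)
    return tokens[1:]  # drop <start> from the returned summary

# Hypothetical stand-in for a trained decoder: a fixed lookup table.
CANNED = {"<start>": "budget", "budget": "approved", "approved": END}

def toy_next_token(tokens):
    """Return the next token based only on the most recent one."""
    return CANNED.get(tokens[-1], END)

summary = greedy_decode(toy_next_token)
# summary == ["budget", "approved"]
```

The `max_len` cap matters in practice because a trained model is not guaranteed to emit <end>; without it, decoding could loop indefinitely.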
System Modules
Text Summarization Module:
Input: Long meeting text.
Output: Condensed summary using LSTM + Attention.
Video Summarization Module:
Input: Original meeting video + subtitles + desired summary duration.
Output: Shortened video containing only key segments.
Conclusion
This project introduces a time saver in the form of a summarizer. With its help, a reader can grasp the main idea of a document or video without spending hours on it. Both the text and the video summarizer are likely to become more useful as the volume of meeting content grows. People who dislike reading long material can also learn its content quickly from the summarized version of long paragraphs and videos. The project can be developed further as a mobile application that integrates both the text and the video summarizer.
Further, because not all videos come with subtitles, the subtitle file could be generated automatically as a separate text file whenever the user does not have one. Summarization of audio files in text form could also be included. Summary quality could be improved by training the model on more data.