Our project, redefining the reading experience with visual storytelling, explores new methods for improving how people read, understand, and engage with content by combining text with visual elements (images, videos, and illustrations or drawings). Text-only reading can make it hard for readers to grasp complex ideas in full and to sustain their reading. Visual storytelling addresses this problem by presenting the same content in a more beginner-friendly, easier-to-understand format. The project also examines how combining text with visual elements can substantially improve both the narration and the overall reading experience. The proposed solution aims to make reading more open and accessible to all types of users of the application by breaking complex content down into a part-by-part visual representation. Because visual communication is easy to engage with, it can turn traditional reading templates into a newer, more exploratory reading experience.
Introduction
The project focuses on improving traditional reading by combining text with visual storytelling elements such as images, illustrations, and videos to make content easier to understand and more engaging. It addresses the limitations of text-heavy reading, which can be difficult for complex topics, and proposes a web-based platform that enhances comprehension through multimedia integration.
The system is built using a React.js frontend and Node.js backend, where the frontend handles interactive UI and visual presentation, while the backend manages data processing, content delivery, and API communication. The application is deployed on IBM Cloud for scalable and remote access.
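The division of work between the two tiers can be illustrated with a minimal TypeScript sketch. It is not the project's actual code: the type names (`Visual`, `StorySegment`, `ApiResponse`), the functions `buildStory` and `getStory`, the in-memory `Map` store, and the media path are all assumptions introduced for illustration. The sketch shows one plausible way a backend could split a passage into ordered segments, pair each segment with a visual, and return the result as JSON for the frontend to render.

```typescript
// Hypothetical data model: none of these names come from the project itself.
type VisualKind = "image" | "video" | "illustration";

interface Visual {
  kind: VisualKind;
  url: string; // placeholder location of the media asset
  caption: string;
}

interface StorySegment {
  order: number; // position in the reading sequence, starting at 1
  text: string;
  visual?: Visual; // a segment may have no visual attached
}

interface ApiResponse {
  status: number;
  body: string; // JSON payload the frontend would parse and render
}

// Split a passage on blank lines and attach the visual that shares the
// same index, if one exists.
function buildStory(passage: string, visuals: Visual[]): StorySegment[] {
  return passage
    .split(/\n\s*\n/)
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => paragraph.length > 0)
    .map((text, i) => {
      const segment: StorySegment = { order: i + 1, text };
      const visual = visuals[i];
      if (visual !== undefined) {
        segment.visual = visual;
      }
      return segment;
    });
}

// Stand-in for a backend route handler (the route shape is assumed).
// It looks a story up in an in-memory store and serializes it.
function getStory(id: string, store: Map<string, StorySegment[]>): ApiResponse {
  const segments = store.get(id);
  if (segments === undefined) {
    return { status: 404, body: JSON.stringify({ error: "story not found" }) };
  }
  return { status: 200, body: JSON.stringify({ id, segments }) };
}

// Usage: one story with two paragraphs and a single illustration.
const exampleStore = new Map<string, StorySegment[]>();
exampleStore.set(
  "water-cycle",
  buildStory(
    "Water evaporates from the sea.\n\nVapour condenses into clouds.",
    [{ kind: "illustration", url: "/media/evaporation.png", caption: "Evaporation" }],
  ),
);
const exampleResponse = getStory("water-cycle", exampleStore);
```

Keeping the segment order and the text-visual pairing on the backend means the frontend only has to render each segment in sequence. A real deployment would replace the in-memory store with persistent storage and expose `getStory` through an HTTP route.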
The literature review highlights limitations in existing systems, such as lack of reusable content, poor integration of visuals with text, and dependence on heavy hardware (e.g., VR/AR systems). The proposed solution improves on these by offering a lightweight, web-based, and structured visual reading experience.
The methodology includes requirement analysis, system design, development, deployment, and testing. The system is evaluated through functional, usability, and performance testing, showing improved readability and user engagement.
Conclusion
Our proposed project addresses the limitations of traditional text-based reading, which readers often find hard to follow consistently when the material is complex. By developing a web-based application that combines visual elements and illustrations with text through our dual-architecture (frontend and backend) pipeline, our research offers useful insights into how these existing problems can be handled efficiently. The application handles complex passages by breaking them down into accessible, step-by-step sequential visual representations.
Moving forward, we are considering adding the following features to the application:
A) Automated Content Generation:
We will begin this feature by integrating AI frameworks to generate matching visuals for any user-provided text, improving on the current manual selection of visual blocks.
B) Enhanced Interactivity:
Expanding the reusable features of the application so that users can import their own video stories.
C) Resource Optimization:
Integrating AR/VR text environments that are optimized to use fewer system resources, so that advanced features remain usable without specialized hardware.
D) Dynamic Data Handling:
Scaling backend services to handle high traffic volumes and more complex data processing as the user base grows over the coming years.