This paper presents the design and development of a Mini Search Engine built using Python and the Trie data structure. The system lets users search for keywords across multiple uploaded text and PDF files. It supports autocomplete suggestions, keyword highlighting, and real-time search results with performance metrics. The backend uses a Trie for fast prefix-based retrieval, while the user interface is built using Flask. Additional features include dark mode, file and word statistics display, and search history tracking. This lightweight tool offers a simple solution for local document indexing and retrieval, making it well suited to personal and academic use. The proposed system demonstrates how fundamental data structures such as the Trie can be applied effectively in modern search-based applications.
Introduction
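In the age of information overload, fast and accurate local search tools are essential, yet general-purpose desktop and web search tools are often poorly suited to searching a small collection of local files. This paper introduces a lightweight, browser-based Mini Search Engine built with Python and Flask that indexes user-uploaded .txt and .pdf files using a Trie data structure. The Trie answers prefix-based word search and autocomplete queries in time proportional to the length of the prefix plus the size of the matching subtree, rather than the total number of indexed words. By contrast, a hash table supports only exact-key lookup, and a linear search must examine every indexed word to find prefix matches.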
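The prefix lookup described above can be illustrated with a minimal sketch. The class and method names (TrieNode, Trie, insert, starts_with) and the limit parameter are assumptions made for illustration and are not taken from the system's source code.

```python
from typing import Dict, List


class TrieNode:
    """One node per character; children are keyed by the next character."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_word = False  # True if a stored word ends at this node


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        """Add a word, creating one node per missing character."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def starts_with(self, prefix: str, limit: int = 10) -> List[str]:
        """Return up to `limit` stored words that begin with `prefix`."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # no stored word has this prefix
        results: List[str] = []
        stack = [(node, prefix)]
        while stack and len(results) < limit:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            # Push children in reverse sorted order so they pop alphabetically.
            for ch in sorted(current.children, reverse=True):
                stack.append((current.children[ch], text + ch))
        return results


trie = Trie()
for word in ["search", "sea", "seal", "trie"]:
    trie.insert(word)
print(trie.starts_with("sea"))
```

Walking down the prefix costs one dictionary lookup per character, and only the subtree below the prefix is explored, which is what makes autocomplete cheap when the index is large.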
The system workflow includes file upload, text extraction (using PyMuPDF for PDFs), tokenization, and indexing into the Trie. Users get real-time autocomplete suggestions and highlighted search results across files, plus additional features like dark mode, search history, and file statistics.
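The extraction, tokenization, indexing, and highlighting steps can be sketched as follows. This is an illustrative outline under stated assumptions, not the system's actual code: the function names, the tokenization rule (lowercased alphanumeric runs), the word-to-files mapping, and the use of <mark> tags for highlighting are all hypothetical. The PDF helper uses PyMuPDF's fitz.open and page.get_text calls and imports the library lazily, so the rest of the sketch runs without it installed.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def extract_pdf_text(path: str) -> str:
    """Concatenate the text of every page (requires PyMuPDF)."""
    import fitz  # PyMuPDF, imported lazily

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each token to the set of file names that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for name, text in documents.items():
        for token in tokenize(text):
            index[token].add(name)
    return index


def highlight(text: str, keyword: str) -> str:
    """Wrap whole-word, case-insensitive matches of `keyword` in <mark> tags."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return pattern.sub(lambda m: f"<mark>{m.group(0)}</mark>", text)


docs = {
    "a.txt": "Trie search is fast.",
    "b.txt": "Search engines index text.",
}
index = build_index(docs)
print(sorted(index["search"]))
print(highlight(docs["a.txt"], "search"))
```

In this sketch the keys of the index are the words that would be inserted into the Trie, while the associated file sets let a match be reported per document. In a web page, the text should be HTML-escaped before the highlight markup is added.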
The engine is implemented as separate Python modules that handle Trie logic, file processing, and the Flask web interface. It indexes five files in about 1 second and returns search results in under 0.5 seconds. It currently lacks support for image-based (scanned) PDFs, multilingual content, and phrase searches.
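Timings of this kind can be collected with a small wrapper around the call being measured. The sketch below is an assumption about how such metrics might be gathered, not the measurement code used in this work. The helper name timed and the sample word list are hypothetical.

```python
import time
from typing import Any, Callable, Tuple


def timed(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run `func` and return its result with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Stand-in for a real search call: a linear prefix scan over a small word list.
words = ["search", "sea", "seal", "trie"]
matches, seconds = timed(lambda: [w for w in words if w.startswith("sea")])
print(f"{len(matches)} matches in {seconds:.6f} s")
```

time.perf_counter is used because it is a monotonic, high-resolution clock intended for measuring short durations.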
The work demonstrates how combining classical data structures with modern web technologies can create efficient, user-friendly tools for localized information retrieval.
Conclusion
This paper presented the design and implementation of a Mini Search Engine using Python and the Trie data structure, focusing on efficient keyword-based search across local .txt and .pdf documents. The system offers real-time search with autocomplete suggestions, keyword highlighting, and document-level result display, all integrated through a user-friendly Flask-based interface.
By exploiting the shared-prefix structure of the Trie, the engine achieves fast and accurate prefix-based retrieval, making it especially suitable for educational or personal use cases where lightweight, local search is needed. Additional features such as dark mode, file statistics, and search history enhance the overall user experience.
While the current version fulfills essential search requirements, future enhancements could include support for multi-word and phrase search, OCR for scanned PDFs, multilingual content, and semantic ranking for more intelligent result ordering.
The project demonstrates that classical data structures, combined with modern web frameworks, can make it faster and easier for users to search and retrieve information from personal document collections.