Phoenix: AI Based Search Engine

Authors: Prof. Deepak Naik, Prof. Hyder Ali Hingoniwala, Mohit, Rishita Kapile, Rajiv Surgoniwar, Thanshu Agarkar, Shreyash Karpe

DOI Link: https://doi.org/10.22214/ijraset.2025.70852


Abstract

In today's world of information overload, smart and effective search technology plays a key role in providing accurate and relevant answers. Google and other traditional search engines use keyword-based retrieval and fixed relevance-ranking algorithms such as PageRank. These methods work well but cannot grasp user intent, handle multi-step queries, or offer personalized results beyond simple rephrasing.

Recent AI breakthroughs, particularly large language models (LLMs), have led to tools such as Perplexity.ai and You.com that combine results into clear summaries. Yet these tools still lack deep personalization, fine-tuning for specific fields, emotional awareness, and the ability to adapt to a user's changing search path.

This study introduces an AI-powered search engine. It merges the scalability of Google's Custom Search API with context-aware natural language processing ranking and smart recommendation systems. Our approach stands out by building an expanding map of what a user knows over time. It adjusts to multi-step queries on the fly and returns search results that are tailored to the user and evolve with user input.

Our system aims to bridge strict keyword searches and flexible, chat-like searches. It offers better relevance, less search fatigue, and a user-focused experience. These benefits are particularly useful for academic studies, exploring technical topics, and knowledge-intensive tasks.

Introduction

I. Problem & Purpose

Traditional search engines (e.g., Google) and LLM-based tools (e.g., Perplexity, You.com) focus on retrieving or summarizing existing content using static methods.
These tools lack understanding of:
- User intent
- Learning goals
- Emotions
- Search history
The proposed system acts as a dynamic, conversational knowledge guide that adapts to user needs, mood, pace, and intent using AI and real-time data.

II. Literature Review – Key Insights & Gaps

Traditional Search (e.g., Google)
- Powerful but lacks semantic understanding and personalization.
- Static rankings don’t adapt to evolving user needs.
LLM-Based Search (e.g., Perplexity, ChatGPT)
- Uses generative AI but:
  - Lacks accuracy for open-ended queries
  - Doesn’t track feedback or user progression
Human-Centered Systems
- Exist in academia or enterprise
- Not widely scaled or integrated with real-time LLMs
Conversational Interfaces (e.g., Bing Copilot, ChatGPT)
- Dialogue-capable, but with limited memory, emotion tracking, and reasoning across sessions
Identified Gaps
- No long-term personalization
- No evolving user modeling
- Weak emotional intent detection
- Lack of iterative, contextual refinement

III. Methodology – Core Principles

- Views search as a learning journey
- Transforms raw queries into intent vectors
- Creates real-time knowledge maps
- Uses a multi-layer feedback loop to learn from user behavior
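
To make the idea of an intent vector concrete, the following Python sketch maps a raw query to a weight for each of the three query types named in Section IV (exploratory, factual, decision-based). It is only an illustration under stated assumptions: the cue-word table `INTENT_CUES`, the function names, and the softmax normalization are hypothetical, and the system described here uses transformers for this step rather than keyword lookup.

```python
import math
import re

# Hypothetical cue words per intent type. A real system would replace this
# table with a trained transformer classifier.
INTENT_CUES = {
    "exploratory": {"overview", "explain", "learn", "introduction", "how", "why"},
    "factual": {"what", "when", "who", "define", "definition", "many"},
    "decision": {"best", "vs", "versus", "compare", "should", "recommend"},
}


def intent_vector(query: str) -> dict[str, float]:
    """Map a raw query to a probability-like vector over intent types."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    # Raw score: how many query tokens are cue words for each intent type.
    raw = {name: sum(t in cues for t in tokens) for name, cues in INTENT_CUES.items()}
    # Softmax turns the counts into weights that sum to 1, even if all are 0.
    exps = {name: math.exp(score) for name, score in raw.items()}
    total = sum(exps.values())
    return {name: value / total for name, value in exps.items()}


def dominant_intent(query: str) -> str:
    """Return the intent type with the largest weight."""
    vec = intent_vector(query)
    return max(vec, key=vec.get)
```

Calling `intent_vector("compare FAISS vs Pinecone")` puts most of the weight on the decision-based type, and downstream modules could use the full vector rather than a single label when a query mixes several intents.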

IV. System Components

Intent Detection Engine
- Uses transformers to classify and rewrite queries based on type (exploratory, factual, decision-based)
Emotion-Aware Context Processor
- Detects user mood/confusion via input patterns and adjusts responses (e.g., simpler language or visuals)
Google API Layer
- Pulls top results as raw input, not as answers, for further analysis
Knowledge Synthesizer
- Summarizes and verifies data using LLMs
- Scores based on relevance, novelty, confidence
Feedback Loop
- Tracks actions (clicks, time spent, ratings) to refine future responses
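
The Google API Layer and the Knowledge Synthesizer scoring can be sketched roughly as follows. The request URL uses the Custom Search JSON API endpoint with its `key`, `cx`, `q`, and `num` parameters, and the parser reads the `items` entries of a response. Everything else is an assumption made for illustration: the `Candidate` class, the function names, and in particular the weights in `combined_score`, since the source does not state how relevance, novelty, and confidence are combined.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

CSE_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(query: str, api_key: str, engine_id: str, num: int = 10) -> str:
    """Build a Custom Search JSON API request URL (no network call is made)."""
    params = {"key": api_key, "cx": engine_id, "q": query, "num": num}
    return f"{CSE_ENDPOINT}?{urlencode(params)}"


@dataclass
class Candidate:
    """One raw search result plus the three synthesizer scores (hypothetical)."""
    title: str
    link: str
    snippet: str
    relevance: float = 0.0
    novelty: float = 0.0
    confidence: float = 0.0


def extract_candidates(response: dict) -> list[Candidate]:
    """Treat API results as raw input: keep only title, link, and snippet."""
    return [
        Candidate(item.get("title", ""), item.get("link", ""), item.get("snippet", ""))
        for item in response.get("items", [])
    ]


def combined_score(c: Candidate, weights=(0.5, 0.2, 0.3)) -> float:
    """Weighted sum of relevance, novelty, and confidence (assumed weights)."""
    w_rel, w_nov, w_conf = weights
    return w_rel * c.relevance + w_nov * c.novelty + w_conf * c.confidence
```

In this sketch the three scores start at zero and would be filled in by the LLM-based summarization and verification step before candidates are ranked by `combined_score`.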

V. System Architecture Overview

Modular and layered structure:
- Frontend (React.js, Tailwind CSS)
- Backend (FastAPI, Python, NLP, ML)
- Database (PostgreSQL + Vector DB like Pinecone or FAISS)

Key Modules:

Query Processor – Cleans and tokenizes input
AI Engine – Classifies intent using LLMs
Result Enhancer – Reranks Google results based on semantics and user context
Personalization Module – Learns from user sessions for adaptive results
Renderer – Presents outputs with summaries, highlights, and suggestions
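
As a rough illustration of the Result Enhancer, the sketch below reranks result snippets by blending similarity to the query with similarity to a short description of the user's context. It is a minimal sketch under assumptions: bag-of-words cosine similarity stands in for the semantic embeddings a real system would use, and the function names and the blending weight `alpha` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rerank(query: str, snippets: list[str], user_context: str, alpha: float = 0.7) -> list[str]:
    """Order snippets by a blend of query similarity and user-context similarity.

    alpha is an assumed weight: 1.0 ignores the user context entirely.
    """
    q_vec = bag_of_words(query)
    u_vec = bag_of_words(user_context)
    scored = []
    for snippet in snippets:
        s_vec = bag_of_words(snippet)
        score = alpha * cosine(q_vec, s_vec) + (1 - alpha) * cosine(u_vec, s_vec)
        scored.append((score, snippet))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in scored]
```

With two snippets that match the query equally well, the one that also overlaps with the user context moves to the top, which is the behavior the Personalization Module is meant to feed.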

VI. Implementation Highlights

Frontend Features:
- Text, voice, and image-based search
- Interactive result display and feedback options
Backend Functions:
- Handles semantic search, intent detection, and multimodal queries
- Uses LLMs to generate human-like, factual summaries
Database Capabilities:
- Stores user history, embeddings, and feedback
- Enables fast, personalized semantic search
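
The interplay between stored embeddings and stored feedback might look like the following in-memory sketch. It is not the described PostgreSQL and vector database setup: the `UserMemory` class, the way clicks, time spent, and ratings are collapsed into one signal, the update rate, and the boost size are all assumptions chosen for illustration, and the brute-force search stands in for an index such as FAISS or Pinecone.

```python
import heapq
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class UserMemory:
    """In-memory stand-in for the relational store plus vector index."""

    def __init__(self):
        self.embeddings = {}  # doc_id -> embedding vector
        self.feedback = {}    # doc_id -> running preference score in [0, 1]

    def add_document(self, doc_id, embedding):
        self.embeddings[doc_id] = list(embedding)

    def record_feedback(self, doc_id, clicked, seconds_spent, rating=None, rate=0.3):
        """Fold a click, dwell time, and optional 1-5 rating into one score."""
        dwell = min(seconds_spent / 60.0, 1.0)  # cap dwell time at one minute
        signal = 0.5 * float(clicked) + 0.5 * dwell
        if rating is not None:
            signal = 0.5 * signal + 0.5 * (rating / 5.0)
        old = self.feedback.get(doc_id, 0.5)  # 0.5 means no preference yet
        # Exponential moving average: recent behavior shifts the score gradually.
        self.feedback[doc_id] = (1 - rate) * old + rate * signal

    def search(self, query_embedding, k=3, boost=0.2):
        """Return the k doc ids with the highest similarity plus feedback boost."""
        def score(doc_id):
            sim = _cosine(query_embedding, self.embeddings[doc_id])
            return sim + boost * (self.feedback.get(doc_id, 0.5) - 0.5)
        return heapq.nlargest(k, self.embeddings, key=score)
```

A document that is slightly less similar to the query can overtake a closer one once the user has engaged with it, which is one simple way to read "personalized semantic search".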

VII. Results and Benefits

Semantic Understanding – Accurately interprets meaning, not just keywords
Natural Language Fluency – Handles everyday conversational queries well
Speed & Scalability – Reported response time of about 0.12 seconds, including under load
Personalization – Learns and adapts to user preferences over time
Planned Enhancements – Integrating external data (Wikipedia, academic sources) for richer, fresher content

Conclusion

This AI-powered search engine shifts search from static keyword lookups to adaptive, emotionally aware, and goal-oriented experiences. It is especially suited for researchers, students, and decision-makers seeking depth, clarity, and evolving insight.

Copyright

Copyright © 2025 Prof. Deepak Naik, Prof. Hyder Ali Hingoniwala, Mohit, Rishita Kapile, Rajiv Surgoniwar, Thanshu Agarkar, Shreyash Karpe. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Paper Id : IJRASET70852

Publish Date : 2025-05-12

ISSN : 2321-9653

Publisher Name : IJRASET
