AI for Personalized Healthcare Recommendations using Wearable Data

Authors: Poonam ., Amandeep

DOI Link: https://doi.org/10.22214/ijraset.2026.84037

Abstract

The rapid development and adoption of wearable health monitoring devices generates continuous, real-time physiological data, which gives Artificial Intelligence (AI) an opportunity to deliver personalized healthcare at scale. This study presents the design, implementation and evaluation of an AI pipeline for cardiometabolic health risk classification and personalized health recommendation generation. Four machine learning architectures are trained and evaluated on the Hamon Google Fit Medical Realistic Dataset, which contains ~90,000 rows from ~3,000 users: Random Forest, Decision Tree, SVM and an enhanced Transformer classifier. The reported accuracies are 93%, 89.2%, 89.1% and 94.5%, with the Transformer reaching 94.5%. The dataset has a severe class imbalance (60:1 ratio), which is addressed through SMOTE oversampling and Focal Loss. Feature importance via Mean Decrease in Impurity (MDI) identifies fatigue_score (18.9%) and bp_systolic (10.1%) as the top predictors, while SHAP analysis reveals age and sex as the dominant global features. A Google Gemini LLM (gemini-2.5-flash) is integrated through a Flask REST API to translate ML risk predictions into actionable, personalized natural-language health recommendations. The results show that the proposed Transformer-based pipeline significantly outperforms classical ML and prior deep learning approaches, achieving 94.5% accuracy on five-class cardiometabolic risk classification.

Introduction

The rapid advancement of Artificial Intelligence (AI) and wearable health monitoring devices has transformed modern healthcare by enabling continuous monitoring of physiological data such as heart rate, blood oxygen levels, sleep quality, physical activity, blood pressure, and glucose levels. Although wearable devices generate vast amounts of health data, extracting meaningful insights requires AI systems capable of processing multidimensional, sequential, noisy, and incomplete data. This research proposes a multi-model AI healthcare recommendation system that predicts cardiometabolic disease risk using the Hamon Google Fit Medical Realistic Dataset and provides personalized health recommendations through a Google Gemini Large Language Model (LLM) integrated into a Flask web application.

Background and Significance

Cardiometabolic diseases—including cardiovascular disease, diabetes, hypertension, obesity, and metabolic syndrome—are among the leading causes of death worldwide. Continuous monitoring of health indicators such as heart rate, heart rate variability (HRV), sleep quality, physical activity, blood pressure, and glucose levels allows AI models to identify health risks early and recommend preventive lifestyle changes. The growing adoption of wearable devices and the expansion of AI in healthcare have created opportunities for intelligent, real-time health monitoring systems.

Evolution of AI in Healthcare

Healthcare AI has evolved through three major stages:

Rule-based expert systems (e.g., MYCIN, INTERNIST-1) that relied on predefined medical rules but lacked adaptability.
Classical Machine Learning (ML) algorithms such as Decision Trees, Random Forests, SVM, Naïve Bayes, and KNN, which improved predictive performance but ignored temporal relationships in health data.
Deep Learning and Transformers, including CNNs, RNNs, LSTMs, and Transformer architectures, which effectively model complex patterns and sequential physiological data. Modern Large Language Models (LLMs) such as Google Gemini further enhance healthcare by converting AI predictions into understandable natural-language recommendations.

Wearable Health Monitoring

Modern wearable devices use sensors such as Photoplethysmography (PPG) to measure heart rate, blood oxygen saturation (SpO₂), and Heart Rate Variability (HRV). HRV is an important indicator of cardiovascular health, where higher HRV reflects better autonomic function and lower HRV indicates increased cardiometabolic risk.

Challenges

Developing AI-based healthcare recommendation systems involves several challenges:

Processing noisy, incomplete, and multidimensional wearable data.
Handling severe class imbalance in healthcare datasets.
Ensuring model interpretability and explainability.
Generating understandable health recommendations for users.

To address class imbalance, this study combines SMOTE (Synthetic Minority Oversampling Technique) with Focal Loss, improving prediction performance on minority risk classes.
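To make the loss side of this combination concrete, the following is a minimal sketch of the multi-class focal loss, FL(p_t) = -α_t (1 - p_t)^γ log(p_t), for a single sample. The values of γ and α used in the study are not stated here, so the defaults below (gamma=2.0, no class weights) are assumptions, and the function name is illustrative.

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Multi-class focal loss for one sample.

    probs  : predicted class probabilities (should sum to 1)
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 recovers plain cross-entropy
    alpha  : optional per-class weights (assumed, e.g. larger for rare classes)
    """
    p_t = max(probs[target], 1e-12)                  # guard against log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)

# A confidently correct sample contributes almost nothing to the loss,
# while a misclassified sample keeps most of its cross-entropy.
easy = focal_loss([0.95, 0.02, 0.01, 0.01, 0.01], target=0)
hard = focal_loss([0.10, 0.60, 0.10, 0.10, 0.10], target=0)
```

The factor (1 - p_t)^γ shrinks the loss of well-classified samples, which are mostly majority-class samples under a 60:1 imbalance, so gradient updates are driven more by the hard, often minority-class, samples.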

Literature Review

Previous studies demonstrate the effectiveness of ML and DL models in healthcare applications such as disease diagnosis, ECG interpretation, activity recognition, and cardiovascular risk prediction. However, many existing works:

Evaluate only a single ML architecture.
Do not compare multiple models using the same dataset.
Ignore severe class imbalance.
Lack explainable AI techniques.
Do not integrate LLMs for personalized recommendations.

Research Contributions

This research addresses these limitations by:

Comparing Random Forest, Decision Tree, SVM, and Transformer models using the same wearable dataset and evaluation framework.
Mitigating class imbalance with SMOTE and Focal Loss.
Incorporating SHAP and feature importance analysis for explainability.
Integrating Google Gemini to generate personalized natural-language health advice.
Deploying the complete system through a Flask REST API for real-time use.

Methodology

The proposed methodology includes:

Collecting the Hamon Google Fit Medical Realistic Dataset.
Performing preprocessing, including missing-value handling, feature engineering, normalization, and class balancing.
Training four ML models (Random Forest, Decision Tree, SVM, and Transformer).
Selecting the best-performing model using validation metrics.
Predicting cardiometabolic risk levels.
Generating personalized recommendations using Google Gemini.
Delivering predictions through a Flask web application.

System Architecture

The system consists of five layers:

Data Ingestion Layer for loading wearable health data.
Health Data Preprocessing Layer for cleaning, feature engineering, and normalization.
Multi-Model Training Layer for training and selecting ML models.
Google Gemini LLM Layer for converting predictions into natural-language recommendations.
Flask REST API Layer for real-time user interaction.
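The hand-off between the model layer and the Gemini layer can be illustrated with a small prompt-building function. This is only a sketch: the prompt wording, the function name, the class labels and the example values are hypothetical, and the call to the Gemini client inside the Flask handler is deliberately omitted.

```python
def build_recommendation_prompt(indicators, risk_class, probabilities, focus_area):
    """Assemble a text prompt from model outputs (layout is an assumption)."""
    lines = [
        "You are a health assistant. Give lifestyle recommendations, not a diagnosis.",
        f"Predicted cardiometabolic risk class: {risk_class}",
        "Class probabilities: "
        + ", ".join(f"{name}={p:.2f}" for name, p in probabilities.items()),
        f"Focus area: {focus_area}",
        "Physiological indicators:",
    ]
    # One line per indicator so the LLM can quote specific numbers back.
    lines += [f"- {name}: {value}" for name, value in sorted(indicators.items())]
    lines.append("Refer to the specific values above and match the tone to the urgency.")
    return "\n".join(lines)

# Hypothetical example values; only the feature names come from the paper.
prompt = build_recommendation_prompt(
    indicators={"bp_systolic": 148, "fatigue_score": 7.2},
    risk_class="high",
    probabilities={"moderate": 0.21, "high": 0.79},
    focus_area="blood pressure",
)
```

Keeping prompt construction in a pure function like this makes it testable without network access; the API layer would pass the resulting string to the LLM and return the generated text to the user.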

Dataset and Preprocessing

The Hamon Google Fit Medical Realistic Dataset contains physiological data from 3,000 users over 30 days, with over 30 health features and five cardiometabolic risk classes. Because the dataset is highly imbalanced, preprocessing includes:

Missing-value imputation.
Label encoding.
Feature engineering.
Standardization.
SMOTE-based class balancing.
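The last step in the list above can be sketched in a few lines. SMOTE creates a synthetic minority sample by picking a minority sample, choosing one of its k nearest minority neighbours, and interpolating at a random point on the segment between them. The function name, the value of k and the toy data are assumptions for illustration; a production pipeline would more likely use an existing library implementation.

```python
import random

def smote_sketch(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic samples by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()                    # position on the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Toy minority class: (resting heart rate, systolic BP); values are made up.
minority = [[88.0, 150.0], [92.0, 158.0], [85.0, 147.0], [95.0, 162.0]]
new_rows = smote_sketch(minority, n_new=6)
```

Oversampling of this kind should be applied to the training split only, after the train/validation split, so that synthetic points derived from validation samples do not leak into training.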

Model Training

Four supervised learning models—Decision Tree, Random Forest, SVM, and Transformer—are trained and evaluated using balanced training data. The Transformer employs multi-head self-attention, early stopping, gradient clipping, and focal loss to improve classification performance. Model performance is assessed using accuracy, precision, recall, F1-score, and confusion matrices.
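The evaluation metrics named above can all be derived from the confusion matrix. The sketch below uses made-up labels for a three-class toy problem (the study itself uses five classes) and also computes macro-averaged F1, which is an added illustration rather than a metric the paper reports.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts samples of true class t predicted as class p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def per_class_metrics(cm):
    """Return (precision, recall, f1) for every class."""
    n = len(cm)
    out = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp   # column total minus diagonal
        fn = sum(cm[c]) - tp                        # row total minus diagonal
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        out.append((precision, recall, f1))
    return out

# Toy labels: class 0 is the majority, classes 1 and 2 are rare.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 1, 1, 0, 2, 2]
cm = confusion_matrix(y_true, y_pred, n_classes=3)
accuracy = sum(cm[i][i] for i in range(3)) / len(y_true)
metrics = per_class_metrics(cm)
macro_f1 = sum(f1 for _, _, f1 in metrics) / len(metrics)
```

With heavily imbalanced classes, accuracy is dominated by the majority class, so per-class recall and F1 give a clearer picture of how well the rare risk classes are detected.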

Conclusion

The main contribution of this paper is the integration of a Transformer classifier with an LLM (Gemini) to produce a risk state, class probabilities and actionable, personalized natural-language recommendations. Embedding all 25+ physiological indicators, the predicted risk class and the focus area in a structured prompt yields highly customized, clinical-level health recommendations that are quantitatively specific (citing numbers from the user's physiological indicators) and calibrated to urgency. In this way, the system bridges the gap between complex ML model predictions and easily actionable health recommendations for lay users. The most important near-term addition is real-time integration with the APIs of actual wearable devices (Google Fit REST API, Fitbit Web API, Apple HealthKit, Samsung Health SDK), which would enable continuous health monitoring and continuous recommendation generation. This will require OAuth 2.0 authentication, background data-polling services, configurable sampling rates, and an online preprocessing pipeline for single records (using the fitted preprocessor artifact), with push notifications for proactive health alerts. Because the HealthDataPreprocessor follows a fit–transform design, it can be applied directly to single records obtained from the API layer.
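The fit–transform idea mentioned for single records can be illustrated as follows. The internals of the paper's HealthDataPreprocessor are not given, so the class below is a hypothetical stand-in that only assumes median imputation followed by standardization; its name, methods and data are illustrative.

```python
import statistics

class HealthDataPreprocessorSketch:
    """Median imputation + standardization with a fit/transform split."""

    def fit(self, rows):
        # rows: list of dicts with identical keys; None marks a missing value.
        self.features = sorted(rows[0])
        self.median, self.mean, self.std = {}, {}, {}
        for f in self.features:
            observed = [r[f] for r in rows if r[f] is not None]
            self.median[f] = statistics.median(observed)
            filled = [r[f] if r[f] is not None else self.median[f] for r in rows]
            self.mean[f] = statistics.fmean(filled)
            self.std[f] = statistics.pstdev(filled) or 1.0  # avoid division by zero
        return self

    def transform_one(self, record):
        """Apply the statistics learned in fit() to a single new record."""
        out = []
        for f in self.features:
            value = record.get(f)
            if value is None:
                value = self.median[f]
            out.append((value - self.mean[f]) / self.std[f])
        return out

# Made-up training rows; feature order in the output is alphabetical.
train = [
    {"resting_hr": 60, "bp_systolic": 110},
    {"resting_hr": 80, "bp_systolic": None},
    {"resting_hr": 70, "bp_systolic": 130},
]
pre = HealthDataPreprocessorSketch().fit(train)
vector = pre.transform_one({"resting_hr": 70, "bp_systolic": None})
```

Because all statistics are computed once in fit() and stored, a record arriving from a wearable API can be transformed on its own, without access to the training data.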


Copyright

Copyright © 2026 Poonam ., Amandeep . This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET84037

Publish Date : 2026-06-29

ISSN : 2321-9653

Publisher Name : IJRASET
