Early detection of cardiac abnormalities is crucial for effective healthcare intervention, especially with the increasing adoption of IoT-based wearable devices. This study presents an AI-driven approach for anomaly detection in ECG time-series data using two unsupervised learning techniques, Isolation Forest and Local Outlier Factor (LOF). The analysis is conducted on publicly available ECG datasets, including a standard ECG time-series dataset and the PTB Diagnostic ECG Database (PTBDB). The models are compared on the anomalies they flag. Experimental results show that Isolation Forest detects a slightly higher number of anomalies (203 vs. 179 on the ECG time-series dataset), while LOF effectively captures local density variations. The study demonstrates that lightweight unsupervised models can be integrated into IoT healthcare systems for real-time cardiac monitoring. Future work includes extending the approach with deep learning models and real-time streaming data analytics.
Introduction
The rise of IoT-based medical devices has generated massive amounts of real-time ECG data, critical for detecting life-threatening cardiac anomalies like arrhythmias. Traditional monitoring relies on labeled data and clinical interpretation, limiting scalability in continuous or remote care. Unsupervised AI techniques, such as Isolation Forest (IF) and Local Outlier Factor (LOF), enable automated anomaly detection without large labeled datasets, making them suitable for IoT-enabled healthcare systems.
Contributions
This study:
Applies Isolation Forest and LOF to real-world ECG time-series and PTBDB datasets.
Compares the performance of both algorithms in detecting anomalies.
Visualizes detected anomalies for practical interpretation.
Demonstrates the feasibility of lightweight AI deployment in IoT-based cardiac monitoring.
Methodology
Datasets: ECG time-series from Kaggle and PTBDB from PhysioNet.
Preprocessing: Normalization, CSV conversion, and removal of duplicates/missing values.
Algorithms:
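Isolation Forest: Isolates points with random recursive splits; points with shorter average path lengths across the trees are more likely to be outliers.
LOF: Compares each point's local density with that of its neighbors; values close to 1 indicate a density similar to the neighborhood, while values substantially greater than 1 indicate anomalies.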
Implemented using Python (pandas, numpy, scikit-learn, matplotlib).
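The pipeline above can be sketched as follows. This is a minimal illustration, not the study's actual code: the helper names `preprocess` and `detect`, the synthetic sine-wave signal standing in for an ECG trace, and the settings `contamination=0.01` and `n_neighbors=20` are all assumptions, since the settings used in the study are not stated here.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler


def preprocess(df: pd.DataFrame) -> np.ndarray:
    """Drop missing values and duplicate rows, then z-score each column."""
    clean = df.dropna().drop_duplicates()
    return StandardScaler().fit_transform(clean.to_numpy(dtype=float))


def detect(X: np.ndarray, contamination: float = 0.01, n_neighbors: int = 20):
    """Return boolean anomaly masks for both detectors plus the LOF scores."""
    # contamination and n_neighbors are assumed values, not taken from the study.
    iso = IsolationForest(contamination=contamination, random_state=0)
    lof = LocalOutlierFactor(n_neighbors=n_neighbors, contamination=contamination)
    # Both estimators label inliers as 1 and outliers as -1.
    iso_mask = iso.fit_predict(X) == -1
    lof_mask = lof.fit_predict(X) == -1
    # negative_outlier_factor_ stores -LOF, so negate it to recover LOF values.
    lof_scores = -lof.negative_outlier_factor_
    return iso_mask, lof_mask, lof_scores


# Synthetic stand-in for an ECG trace: a noisy sine wave with injected spikes.
rng = np.random.default_rng(0)
t = np.linspace(0, 20 * np.pi, 2000)
signal = np.sin(t) + 0.05 * rng.standard_normal(t.size)
spike_idx = np.array([250, 900, 1500])
signal[spike_idx] += 5.0
frame = pd.DataFrame({"ecg": signal})

X = preprocess(frame)
iso_mask, lof_mask, lof_scores = detect(X)
print("Isolation Forest anomalies:", int(iso_mask.sum()))
print("LOF anomalies:", int(lof_mask.sum()))
```

In scikit-learn the `contamination` setting determines the decision threshold of each detector, so it strongly influences how many points are flagged. Any comparison of anomaly counts should therefore report the value used.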
Results
Isolation Forest detected slightly more anomalies than LOF (203 vs. 179 on the ECG time-series dataset).
Both algorithms effectively highlighted anomalous points, corresponding to sharp spikes or deviations in ECG signals.
Visualization confirmed the practical applicability of both methods for anomaly detection in real healthcare datasets.
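Beyond raw counts, one way to compare two detectors is to check how much their flagged points overlap. The sketch below uses a hypothetical helper, `compare_masks`, and small made-up boolean masks purely for illustration; it does not reproduce the study's results.

```python
import numpy as np


def compare_masks(mask_a: np.ndarray, mask_b: np.ndarray) -> dict:
    """Summarize agreement between two boolean anomaly masks of equal length."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    both = int(np.sum(a & b))
    either = int(np.sum(a | b))
    return {
        "count_a": int(a.sum()),
        "count_b": int(b.sum()),
        "both": both,
        "only_a": int(np.sum(a & ~b)),
        "only_b": int(np.sum(~a & b)),
        # Jaccard index: shared flags divided by all flags; 1.0 if neither flags anything.
        "jaccard": both / either if either else 1.0,
    }


# Illustrative masks only (True marks a flagged sample).
iso_flags = np.array([1, 1, 0, 0, 1, 0], dtype=bool)
lof_flags = np.array([1, 0, 0, 1, 1, 0], dtype=bool)
summary = compare_masks(iso_flags, lof_flags)
print(summary)
```

A low overlap would suggest the two methods respond to different kinds of deviation (global isolation versus local density), which a comparison of counts alone cannot show.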
Conclusion
This study presented a comparative analysis of two unsupervised anomaly detection algorithms, Isolation Forest and Local Outlier Factor, applied to ECG datasets. Both models identified anomalous patterns in ECG signals without requiring labelled data. Isolation Forest flagged slightly more points as anomalous, while LOF effectively captured local variations. The findings highlight the potential of integrating lightweight AI models into IoT-based healthcare systems for continuous and real-time cardiac monitoring. The analysis revealed that:
1) Isolation Forest consistently detected a slightly higher number of anomalies compared to LOF.
2) Both algorithms successfully identified significant outlier points in ECG signals, supporting their reliability in healthcare anomaly detection scenarios.
3) This practical case study bridges a gap in existing literature by comparatively evaluating multiple unsupervised techniques on real healthcare datasets.
The findings suggest that integrating these models into IoT-based wearable healthcare devices could enable early detection of critical heart conditions, improving patient safety and supporting proactive clinical intervention.
References
[1] Breunig, M.M., Kriegel, H.P., Ng, R.T., and Sander, J. LOF: Identifying Density-Based Local Outliers. Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, 29(2), 93–104, 2000. https://doi.org/10.1145/335191.335388
[2] Chandola, V., Banerjee, A., and Kumar, V. Anomaly Detection: A Survey. ACM Computing Surveys (CSUR), 41(3), 1–58, 2009. https://doi.org/10.1145/1541880.1541882
[3] Goldberger, A.L., Amaral, L.A.N., Glass, L., Hausdorff, J.M., Ivanov, P.C., Mark, R.G., Mietus, J.E., Moody, G.B., Peng, C.K., and Stanley, H.E. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation, 101(23), e215–e220, 2000. https://doi.org/10.1161/01.CIR.101.23.e215
[4] Kachuee, M., Fazeli, S., and Sarrafzadeh, M. ECG Heartbeat Classification: A Deep Transferable Representation. Proceedings of the IEEE International Conference on Healthcare Informatics (ICHI), 443–444, 2018. https://doi.org/10.1109/ICHI.2018.00091
[5] Kaggle. ECG Time Series Data for Anomaly Detection [Data set]. 2024. https://www.kaggle.com/dataset-url
[6] Liu, F.T., Ting, K.M., and Zhou, Z.H. Isolation Forest. Proceedings of the 2008 IEEE International Conference on Data Mining, 413–422, 2008. https://doi.org/10.1109/ICDM.2008.17
[7] Pedregosa, F., Varoquaux, G., Gramfort, A., et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research, 12, 2825–2830, 2011. https://jmlr.org/papers/v12/pedregosa11a.html