Medical laboratory reports carry dense clinical data that both physicians and patients struggle to interpret efficiently. Clinicians need to recognize patterns across test values quickly to inform diagnosis and care, while patients need accessible explanations in everyday language. Existing tools either fail when report formats change or rely on a single AI model with no mechanism to validate medical accuracy. This paper presents a multi-model AI pipeline that ingests lab reports as scanned images or PDFs, extracts clinical values through OCR and AI-based parsing, and compares them against standard reference ranges. Each extracted value passes through three sequential AI models that cross-check interpretations and reduce single-model error. The system produces a plain-language health summary with context-specific observations and exposes a retrieval-augmented generation (RAG) chat layer that lets users query their own report in natural language. The pipeline runs on FastAPI, LangGraph, Groq LPU, and FAISS, and completes end-to-end processing of a standard Complete Blood Count (CBC) report in 15 to 30 seconds.
Introduction
The paper presents an AI-based system that automatically interprets medical laboratory reports (such as CBC and metabolic panels) and converts raw numerical data into clear, patient-friendly explanations. It addresses a common problem where patients and clinicians struggle to quickly understand lab results due to time constraints, complex formats, and unreliable online explanations.
The proposed solution uses a structured LangGraph-based pipeline with multiple stages: document ingestion (PDF/image processing with OCR), parameter extraction using AI models, validation against medical reference ranges, multi-model clinical analysis, result synthesis, and safety filtering. The system also includes a retrieval-augmented generation (RAG) chat interface using FAISS to allow users to ask context-specific questions grounded in their own report.
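The staged design described above can be illustrated with a minimal sketch. It is not the system's actual LangGraph graph: the stages are plain Python functions run in order over a shared state object, the OCR and AI parsing step is replaced by a simple "name: value" line parser, and the names `ReportState`, `REFERENCE_RANGES`, and `run_pipeline` are hypothetical. The numeric ranges are illustrative placeholders, not clinical guidance.

```python
from dataclasses import dataclass, field

# Hypothetical reference ranges, for illustration only (not clinical guidance).
REFERENCE_RANGES = {
    "hemoglobin": (12.0, 17.5),   # g/dL
    "wbc": (4.0, 11.0),           # 10^3 cells/uL
    "platelets": (150.0, 450.0),  # 10^3 cells/uL
}


@dataclass
class ReportState:
    """State passed from stage to stage, similar in spirit to a graph state."""
    raw_text: str
    values: dict = field(default_factory=dict)
    flags: dict = field(default_factory=dict)
    summary: str = ""


def extract(state: ReportState) -> ReportState:
    # Stand-in for OCR + AI parsing: read lines of the form "name: value".
    for line in state.raw_text.splitlines():
        name, sep, value = line.partition(":")
        key = name.strip().lower()
        if sep and key in REFERENCE_RANGES:
            try:
                state.values[key] = float(value.strip())
            except ValueError:
                continue  # skip values that are not plain numbers
    return state


def validate(state: ReportState) -> ReportState:
    # Compare each extracted value against its reference range.
    for name, value in state.values.items():
        low, high = REFERENCE_RANGES[name]
        if value < low:
            state.flags[name] = "low"
        elif value > high:
            state.flags[name] = "high"
        else:
            state.flags[name] = "normal"
    return state


def synthesize(state: ReportState) -> ReportState:
    # Build a short plain-language summary from the flags.
    abnormal = [f"{n} is {f}" for n, f in state.flags.items() if f != "normal"]
    state.summary = "; ".join(abnormal) if abnormal else "all values within range"
    return state


STAGES = [extract, validate, synthesize]


def run_pipeline(raw_text: str) -> ReportState:
    state = ReportState(raw_text=raw_text)
    for stage in STAGES:
        state = stage(state)
    return state
```

Calling `run_pipeline("Hemoglobin: 10.2\nWBC: 3.1\nPlatelets: 250")` flags the first two parameters as low and leaves platelets as normal. Because each stage only reads and writes the shared state, a stage can be swapped or tested without touching the others.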
Unlike traditional rule-based or single-model approaches, the system uses a three-model analysis strategy to interpret individual abnormalities, detect patterns across parameters, and adjust interpretations based on patient age and sex. It then generates a unified health summary with risk scoring and simplified recommendations, while ensuring safety by avoiding direct prescriptions or diagnoses.
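A rough sketch of how three analysis steps might be chained is shown below. Each function is a deterministic placeholder for what would be a model call in the real system, and the function names, the `analysis` dictionary layout, and the numeric thresholds are all assumptions made for illustration. Only sex-based adjustment is shown; age handling is left out for brevity.

```python
# Illustrative ranges only; the real system's reference data is not specified here.
GENERIC_RANGES = {"hemoglobin": (12.0, 17.5), "wbc": (4.0, 11.0)}
# Hypothetical sex-specific lower bounds for hemoglobin (g/dL).
HEMOGLOBIN_LOWER_BY_SEX = {"female": 12.0, "male": 13.5}


def interpret_individual(analysis: dict) -> dict:
    """Step 1 placeholder: classify each parameter on its own."""
    findings = {}
    for name, value in analysis["values"].items():
        low, high = GENERIC_RANGES[name]
        if value < low:
            findings[name] = "low"
        elif value > high:
            findings[name] = "high"
        else:
            findings[name] = "normal"
    return {**analysis, "findings": findings}


def detect_patterns(analysis: dict) -> dict:
    """Step 2 placeholder: look for combinations across parameters."""
    findings = analysis["findings"]
    patterns = []
    if findings.get("hemoglobin") == "low" and findings.get("wbc") == "low":
        patterns.append("anemia with leukopenia")
    return {**analysis, "patterns": patterns}


def adjust_for_patient(analysis: dict) -> dict:
    """Step 3 placeholder: add notes based on patient demographics."""
    notes = []
    hemoglobin = analysis["values"].get("hemoglobin")
    sex = analysis["patient"].get("sex")
    lower = HEMOGLOBIN_LOWER_BY_SEX.get(sex)
    generic_ok = analysis["findings"].get("hemoglobin") == "normal"
    if hemoglobin is not None and lower is not None and generic_ok and hemoglobin < lower:
        notes.append(f"hemoglobin is borderline low for a {sex} patient")
    return {**analysis, "notes": notes}


def analyze(values: dict, patient: dict) -> dict:
    analysis = {"values": values, "patient": patient}
    # Run the three steps in sequence; each sees the output of the previous one.
    for step in (interpret_individual, detect_patterns, adjust_for_patient):
        analysis = step(analysis)
    return analysis
```

For example, `analyze({"hemoglobin": 10.2, "wbc": 3.1}, {"age": 34, "sex": "female"})` reports the combined pattern, while a male patient with hemoglobin 12.8 passes the generic range but receives a demographic note.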
Experimental results show that the system can process a full report in about 15–30 seconds, produces more reliable interpretations than single-model systems, and improves pattern detection (e.g., identifying combined conditions like anemia with leukopenia). The RAG-based chat further enables accurate, context-aware explanations with low latency.
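The retrieval step behind the RAG chat can be sketched in a few lines. This is a toy stand-in, not the system's implementation: a bag-of-words vector and cosine similarity replace sentence embeddings and a FAISS index, and the `SessionStore` class and its method names are hypothetical. It shows only the idea of answering a question from chunks that belong to the asking user's own report.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy "embedding": lowercase word counts (stand-in for a sentence encoder).
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class SessionStore:
    """Keeps one separate chunk list per session (stand-in for per-user indexes)."""

    def __init__(self) -> None:
        self._chunks = {}  # session_id -> list of (text, vector)

    def add_report(self, session_id: str, chunks: list) -> None:
        self._chunks[session_id] = [(c, embed(c)) for c in chunks]

    def retrieve(self, session_id: str, question: str, k: int = 2) -> list:
        query = embed(question)
        scored = [
            (cosine(query, vec), text)
            for text, vec in self._chunks.get(session_id, [])
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # Return the top-k chunks that share at least one term with the question.
        return [text for score, text in scored[:k] if score > 0]
```

In a full system the retrieved chunks would be passed to the language model as context for the answer. Because lookups are keyed by session, a question from one user never searches another user's report.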
Conclusion
This work shows that a staged AI pipeline can read a lab report, check each value, and explain the results to both doctors and patients in plain words. Older rule-based tools and single-model calls usually fall short on this kind of task. Splitting the work into stages makes each part easy to test and fix on its own, and the FAISS setup keeps each person's data isolated so that reports stay separate. The same approach fits other fields too: any document that needs to be read, checked, and explained to a non-expert suits this pipeline. Next steps are to add Redis so that sessions survive a server restart, a task queue so that large reports do not block the site, and a routine that clears stored FAISS index files once a user removes their report.
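The index cleanup routine named among the next steps could look roughly like the sketch below. The on-disk layout is an assumption: each report is taken to be stored as `<index_root>/<user_id>/<report_id>.faiss` with a `.json` metadata file next to it, and the function name `delete_report_index` is hypothetical. The actual file layout would need to be checked against how the indexes are written.

```python
from pathlib import Path

# Assumed file suffixes for one stored report index (hypothetical layout).
INDEX_SUFFIXES = (".faiss", ".json")


def delete_report_index(index_root: Path, user_id: str, report_id: str) -> int:
    """Delete the index files for one report; return how many files were removed."""
    # Reject identifiers that could escape the index root (e.g. "../other").
    for ident in (user_id, report_id):
        if not ident or ident == ".." or Path(ident).name != ident:
            raise ValueError(f"invalid identifier: {ident!r}")

    user_dir = index_root / user_id
    if not user_dir.is_dir():
        return 0

    removed = 0
    for suffix in INDEX_SUFFIXES:
        path = user_dir / f"{report_id}{suffix}"
        if path.is_file():
            path.unlink()
            removed += 1

    # Remove the user's directory once no index files remain in it.
    if not any(user_dir.iterdir()):
        user_dir.rmdir()
    return removed
```

A handler for report deletion could call this function after removing the report record, so that no orphaned index files stay on disk.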