Accurate segmentation and classification of dental abnormalities in three-dimensional (3D) dental models remain challenging due to complex anatomical structures, overlapping features, and variability across patients. This study proposes a novel hybrid deep learning framework that integrates Long Short-Term Memory (LSTM) networks with Faster Region-Based Convolutional Neural Networks (Faster R-CNN) to enhance automated dental image analysis. In the proposed approach, 3D dental models are first processed using an LSTM-based segmentation network to capture contextual and sequential dependencies within structural patterns of teeth. The segmented outputs are subsequently fed into a Faster R-CNN classifier for precise detection and classification of dental conditions, including caries and structural abnormalities. While the LSTM component models spatial–structural dependencies and progression-related patterns, Faster R-CNN effectively localizes and identifies pathological regions with high detection accuracy. Experimental results demonstrate that the integrated framework significantly improves segmentation precision and classification performance compared to conventional standalone models. The proposed method enhances diagnostic reliability, reduces manual intervention, and supports efficient clinical decision-making. By enabling timely and accurate identification of dental disorders, this approach contributes to improved patient outcomes and optimized dental healthcare workflows.
Introduction
The study presents a hybrid deep learning framework for automated segmentation and classification of dental abnormalities from Digital Panoramic Radiograph (DPR) images.
With advancements in computer vision and deep learning, 3D dental imaging technologies such as Cone Beam Computed Tomography (CBCT) have enabled detailed structural visualization. However, automated analysis remains challenging due to:
Complex anatomical variations
Overlapping dental structures
Heterogeneous disease presentations
Background noise in radiographs
To address these issues, the study integrates:
CNN-based feature learning
LSTM-based segmentation (for 3D contexts)
Faster R-CNN-based object detection
Problem Statement
Traditional methods:
Rely on manual interpretation
Are time-consuming and subjective
Lack generalization across diverse patients
Standalone CNN models may struggle to:
Capture long-range spatial dependencies
Perform precise object-level localization
Thus, a more integrated approach is required.
Proposed Hybrid Framework
The system combines:
LSTM-based segmentation
CNN backbone feature extraction
Faster R-CNN for region-level detection and classification
This enables both structural segmentation and precise abnormality localization.
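The three-component combination above can be sketched as a simple staged pipeline. This is an illustrative structural sketch only: the stage names, interfaces, and placeholder lambdas are hypothetical stand-ins for the trained LSTM segmentation, CNN backbone, and Faster R-CNN models, not the authors' implementation.

```python
from typing import Callable, List, Tuple, Any

class DentalPipeline:
    """Hypothetical orchestration of the hybrid framework's stages:
    preprocessing -> LSTM-based segmentation -> Faster R-CNN detection."""

    def __init__(self, preprocess: Callable, segment: Callable, detect: Callable):
        self.stages: List[Tuple[str, Callable]] = [
            ("preprocess", preprocess),
            ("segment", segment),
            ("detect", detect),
        ]

    def run(self, image: Any) -> Tuple[Any, List[str]]:
        out = image
        trace = []  # records which stages ran, in order
        for name, stage in self.stages:
            out = stage(out)
            trace.append(name)
        return out, trace

# Placeholder stage functions standing in for trained models.
pipeline = DentalPipeline(
    preprocess=lambda img: [(p - 0.5) / 0.25 for p in img],   # toy normalization
    segment=lambda img: img,                                   # LSTM segmentation stub
    detect=lambda seg: [{"label": "caries", "score": 0.9}],    # Faster R-CNN stub
)

detections, order = pipeline.run([0.2, 0.8])
```

In a real system, each stage would wrap a trained network (e.g. a torchvision Faster R-CNN for the detection stage), but the modular hand-off between segmentation output and detector input is the point being illustrated.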
System Architecture
The architecture consists of three major stages:
1. Image Preprocessing Module
DPR image acquisition
Normalization:
I_norm = (I − μ) / σ, where μ and σ are the mean and standard deviation of the image's pixel intensities
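As a minimal sketch (assuming NumPy and a grayscale intensity array), the z-score normalization step can be written as:

```python
import numpy as np

def normalize(image: np.ndarray) -> np.ndarray:
    """Z-score normalization: I_norm = (I - mu) / sigma, where mu and sigma
    are the mean and standard deviation of the image's pixel intensities."""
    mu = image.mean()
    sigma = image.std()
    return (image - mu) / sigma

img = np.array([[10.0, 20.0], [30.0, 40.0]])
norm = normalize(img)
# The normalized image has zero mean and unit standard deviation.
```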
Cross-validation during training improves generalization across patients and reduces overfitting.
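The k-fold cross-validation scheme can be sketched as a plain-Python index split (in practice a library routine such as scikit-learn's `KFold` would typically be used; the function below is illustrative):

```python
def kfold_indices(n_samples: int, k: int = 5):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.
    Each sample appears in exactly one validation fold."""
    indices = list(range(n_samples))
    # Distribute any remainder across the first (n_samples % k) folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size

folds = list(kfold_indices(10, k=5))
```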
Key Advantages
Modular pipeline design
Adaptive CNN backbone selection
Precise ROI localization
Integrated segmentation + detection
Reduced manual intervention
Clinically interpretable outputs
Main Contribution
The study demonstrates that integrating:
Sequential modeling (LSTM)
CNN-based feature learning
Faster R-CNN detection
significantly improves:
Segmentation accuracy
Abnormality classification reliability
Conclusion
This study presented a comprehensive deep learning framework for automated segmentation and classification of dental abnormalities from Digital Panoramic Radiograph (DPR) images. The proposed architecture integrates image preprocessing, CNN-based feature extraction, and Faster R-CNN-based region localization into a unified pipeline. The preprocessing module enhances image quality and isolates relevant dental structures, thereby improving feature representation and reducing background interference. A Convolutional Neural Network (CNN) was trained and optimized through systematic hyperparameter tuning, and a performance threshold-based selection mechanism ensured the use of an optimal base network. The selected backbone was subsequently integrated into a Faster R-CNN framework for precise region proposal, bounding box regression, and multi-class classification of dental abnormalities. The inclusion of a Region Proposal Network (RPN) enabled accurate localization of pathological regions such as caries, impacted teeth, and periapical lesions. Experimental evaluation demonstrated strong classification and detection performance. The CNN model achieved high validation accuracy, while the integrated Faster R-CNN framework improved detection precision and mean Average Precision (mAP). Training and validation curves confirmed stable convergence behaviour with minimal overfitting. Class-wise performance analysis further indicated balanced detection capability across multiple dental conditions. Overall, the proposed system enhances diagnostic automation, reduces manual effort, and supports reliable clinical decision-making. By combining segmentation, feature learning, and object detection into a structured pipeline, the framework contributes toward intelligent and time-efficient dental healthcare systems.