Evaluating Local-Global Landslide Features Combining CNN and Transformer for Landslide Susceptibility Mapping

Authors: Ashlin James, Mr. Ashish L

DOI Link: https://doi.org/10.22214/ijraset.2025.67275

Abstract

Landslide susceptibility mapping (LSM) is essential for assessing landslide risk and preventing geological hazards. Despite advances in deep learning, convolutional neural networks (CNNs) and transformer models still struggle to achieve optimal mapping accuracy and to extract multilevel landslide features effectively. This study introduces CTLGNet, a CNN-transformer local-global feature extraction network that combines the strengths of both models to capture local and global landslide features. We applied CTLGNet to LSM in the Three Gorges Reservoir area and Jiuzhaigou, using nine landslide conditioning factors to construct the dataset. The dataset was randomly split into training, validation, and test sets in a 6:2:2 ratio. CTLGNet was compared with CNN, ResNet, DenseNet, ViT, and FrIT using several evaluation metrics. CTLGNet outperformed the other models on every metric except Recall, which was slightly lower than that of some models, and achieved AUC values of 0.9817 and 0.9693 for the two regions. It extracts both local and global landslide features effectively, achieving precise landslide localization and detail capture. Overall, CTLGNet excels in multilevel feature extraction and shows strong potential for wider LSM applications.

Introduction

Overview

Landslides are severe natural disasters with major impacts on lives, property, and development. Landslide Susceptibility Mapping (LSM) predicts landslide-prone areas using environmental, geological, and historical data, aiding in disaster prevention. With advances in Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL) are increasingly used in LSM.

Traditional ML models require heavy feature engineering and are prone to overfitting.
DL models, such as Convolutional Neural Networks (CNNs) and Transformers, detect complex patterns automatically but have different strengths:
- CNNs extract landslide local features (LLFs) well but struggle with global context.
- Transformers (e.g., the Vision Transformer, ViT) capture landslide global features (LGFs) using self-attention but need large datasets and are weaker at local detail.

Proposed Solution

To overcome these limitations, a hybrid model called CTLGNet is proposed. It combines:

- CNNs for extracting LLFs (edges, textures, shapes)
- Transformers for capturing LGFs (landslide size, distribution, spatial patterns)

The hybrid design is intended to improve the accuracy and reliability of LSM. Its performance for LSM had not been fully explored before this study.
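The idea of pairing a local stage with a global stage can be illustrated with a toy sketch. This is not the CTLGNet architecture: the function names, the 3x3 kernel, the one-value tokens, and the use of attention without learned projections are all assumptions made only to show how a convolution sees a small window while self-attention mixes information from every position.

```python
import math

def conv3x3(image, kernel):
    """Valid 3x3 cross-correlation: each output depends only on a local window."""
    h, w = len(image), len(image[0])
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(3) for b in range(3))
             for j in range(w - 2)]
            for i in range(h - 2)]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Single-head attention with identity projections (assumption: no learned weights)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Each output is a weighted average over ALL tokens, i.e. global context.
        out.append([sum(w * v[c] for w, v in zip(weights, tokens)) for c in range(d)])
    return out

def local_global_features(image, kernel):
    """Hypothetical helper: local conv features, then global attention, concatenated."""
    local = conv3x3(image, kernel)
    tokens = [[float(v)] for row in local for v in row]  # one token per position
    global_ctx = self_attention(tokens)
    return [t + g for t, g in zip(tokens, global_ctx)]

# Toy 4x4 input with a vertical edge and a vertical-edge kernel (made-up values).
toy_image = [[0, 0, 1, 1]] * 4
edge_kernel = [[-1, 0, 1]] * 3
features = local_global_features(toy_image, edge_kernel)
```

Each entry of `features` holds a local response followed by a globally mixed value for the same position, which is the kind of local-global pairing the hybrid design relies on.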

Study Area and Data

Two high-risk landslide areas in China were selected:

Site A (Three Gorges, Hubei)
- Subtropical climate, heavy rainfall, unstable geology
- 202 documented landslides
Site B (Jiuzhaigou, Sichuan)
- Highland climate, tectonically active
- Nearly 4,000 landslides triggered by a 2017 earthquake

Landslide Inventory Maps were created using satellite data, historical records, and field surveys.

Nine landslide conditioning factors (LCFs) were used for modeling:

Elevation, Slope, Aspect, Lithology, Distance to Fault, Distance to River, Precipitation, Land Use, and NDVI

Methodology

1. Data Processing

- Unified spatial resolution and coordinate systems
- Applied Z-score normalization
- Extracted image patches centered on landslide and non-landslide pixels (10,000 each)
- Applied data augmentation (flipping, rotating, scaling)
- Split datasets (60% train, 20% validation, 20% test)
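The processing steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the `half` patch-size parameter, the use of the population standard deviation, and the fixed random seed are all hypothetical choices, and scaling augmentation is omitted.

```python
import random
import statistics

def zscore(values):
    """Standardize one factor to zero mean and unit variance (population SD assumed)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

def extract_patch(grid, row, col, half):
    """Return the (2*half+1)-sized square patch centered on (row, col), or None at the border."""
    if row - half < 0 or col - half < 0:
        return None
    if row + half >= len(grid) or col + half >= len(grid[0]):
        return None
    return [r[col - half:col + half + 1] for r in grid[row - half:row + half + 1]]

def flip_horizontal(patch):
    """Mirror the patch left to right."""
    return [list(reversed(r)) for r in patch]

def rotate90(patch):
    """Rotate the patch clockwise by 90 degrees."""
    return [list(r) for r in zip(*patch[::-1])]

def split_622(samples, seed=0):
    """Shuffle and split samples into 60% train, 20% validation, 20% test."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])
```

With 20,000 patches (10,000 per class), `split_622` would yield 12,000 training, 4,000 validation, and 4,000 test samples.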

2. Feature Analysis

- Used Variance Inflation Factor (VIF) to eliminate multicollinearity among LCFs
- Applied Random Forest to determine the importance of each LCF using the Gini index
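Both checks rest on simple quantities that can be computed by hand. The sketch below is an illustration, not the authors' code: it obtains each factor's VIF as the corresponding diagonal entry of the inverse correlation matrix (equivalent to 1 / (1 - R^2) from regressing that factor on the others) and shows the Gini impurity decrease that a Random Forest accumulates per split to rank factors. All function names and the toy values are assumptions.

```python
import statistics

def pearson(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF per factor: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inv = invert(corr)
    return [inv[i][i] for i in range(len(columns))]

def gini_impurity(labels):
    """Gini impurity of a set of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def impurity_decrease(parent, left, right):
    """Gini decrease of one split; summed over a forest this gives Gini importance."""
    n = len(parent)
    weighted = (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)
    return gini_impurity(parent) - weighted

# Made-up toy values for two factors, for illustration only.
toy_slope = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_elevation = [2.0, 1.0, 4.0, 3.0, 5.0]
toy_vif = vif([toy_slope, toy_elevation])
```

A factor with a high VIF is largely predictable from the others and is a candidate for removal; the cutoff used in the study is not stated here, so none is assumed.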

Model Architectures Compared

CNN-Based Models

- CNN (LeNet-5-based): Extracts high-dimensional local features
- ResNet-18: Uses residual learning to prevent vanishing gradients
- DenseNet-BC: Uses dense connections for efficient feature reuse and reduced complexity
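The difference between residual learning and dense connections comes down to how a layer's input is carried forward. The toy sketch below is an assumption-laden illustration, not ResNet-18 or DenseNet-BC: `transform` is a hypothetical stand-in for a learned layer, and each dense layer emits a single new feature.

```python
def transform(values, weight=0.5):
    """Hypothetical stand-in for a learned layer: scale each value, then apply ReLU."""
    return [max(0.0, weight * v) for v in values]

def residual_block(x):
    """ResNet-style skip connection: output = x + F(x), so the input passes through unchanged."""
    fx = transform(x)
    return [a + b for a, b in zip(x, fx)]

def dense_block(x, num_layers=2, weight=0.5):
    """DenseNet-style connectivity: every layer reads all earlier features and appends its own."""
    features = list(x)
    for _ in range(num_layers):
        summary = sum(features) / len(features)       # layer input: all features so far
        features.append(max(0.0, weight * summary))   # one new feature per layer
    return features
```

In the residual block the input is added back to the output, which keeps a direct path for gradients. In the dense block earlier features are kept and concatenated, which is what enables feature reuse.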

Transformer-Based Models

- ViT (Vision Transformer): Splits images into patches and extracts global features using self-attention
- FrIT: A variant of ViT that uses the 2D fractional Fourier transform for global context extraction
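The patch-splitting step that ViT-style models apply before self-attention can be shown in a few lines. This is a minimal sketch with a hypothetical function name and a single-channel toy image. Real implementations also apply a learned linear projection and add position embeddings, both omitted here.

```python
def split_into_patches(image, patch_size):
    """Split an H x W image into non-overlapping patches, each flattened into one token."""
    h, w = len(image), len(image[0])
    if h % patch_size or w % patch_size:
        raise ValueError("image sides must be multiples of patch_size")
    tokens = []
    for top in range(0, h, patch_size):
        for left in range(0, w, patch_size):
            token = [image[top + i][left + j]
                     for i in range(patch_size)
                     for j in range(patch_size)]
            tokens.append(token)
    return tokens

# Toy 4x4 image with values 0..15, split into four 2x2 patches.
toy_image = [[r * 4 + c for c in range(4)] for r in range(4)]
toy_tokens = split_into_patches(toy_image, 2)
```

Self-attention then compares every token with every other token, which is how the model relates distant parts of the input.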

Proposed Hybrid Model: CTLGNet

- CNN backbone extracts detailed local features (LLFs)
- Transformer component captures global spatial relationships (LGFs)
CTLGNet is evaluated against other models for:
- Accuracy
- Feature extraction quality
- Computational efficiency
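The accuracy-related metrics reported for the models (AUC and Recall among them) can be computed as sketched below. This is an illustrative implementation, not the authors' evaluation code: the function names and the toy labels and scores are assumptions, and AUC is computed in its pairwise-ranking form rather than by integrating a ROC curve.

```python
def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def recall(y_true, y_pred):
    """Share of true landslide samples (label 1) that the model flags as landslide."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Made-up toy labels and susceptibility scores, thresholded at 0.5 (an assumed cutoff).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]
```

AUC does not depend on the threshold, while Recall and accuracy do, which is one reason a model can lead on AUC yet trail slightly on Recall.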

Conclusion

In this article, we propose CTLGNet, a model that incorporates both LLFs and LGFs for landslide susceptibility mapping (LSM). It was applied in the Three Gorges Reservoir area and Jiuzhaigou, using historical landslide data and nine LCFs. CTLGNet's performance was evaluated against five models: CNN, ResNet, DenseNet, ViT, and FrIT. The results show that CTLGNet provides accurate LSM, with the very high (VH) and high (H) susceptibility zones closely matching historical landslide locations. It outperforms the other models on all evaluation metrics except Recall, achieving AUC values of 0.9817 and 0.9693 for the two regions. Additionally, CTLGNet produces the highest mean landslide susceptibility values and the lowest MAD and SD within historical landslide areas, indicating superior localization and detail extraction. It also has the lowest number of parameters and FLOPs among the transformer-based models, making it more computationally efficient. In conclusion, CTLGNet demonstrates excellent predictive power and generalization, making it highly promising for a wide range of LSM applications.

Copyright

Copyright © 2025 Ashlin James, Mr. Ashish L. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Paper Id : IJRASET67275

Publish Date : 2025-03-06

ISSN : 2321-9653

Publisher Name : IJRASET