Brain tumor segmentation from MRI is clinically important yet challenging to deploy reliably, since tumor appearance varies widely across patients and boundary visibility depends strongly on the MRI modality. In this work, we evaluate Segment Anything Model 2 (SAM2) as a training-free, prompt-driven approach for volumetric tumor delineation in a zero-shot setting on the BraTS2024 Glioma Post-treatment dataset. To reflect a practical inference workflow, we apply minimal preprocessing and generate an initial 2D mask on a selected slice before predicting the full volume using bidirectional slice propagation with state reset to limit drift. We benchmark eight prompting strategies spanning point-only prompts, bounding box prompts, and box-plus-point combinations, and compare performance across four modalities (T1n, T1ce, T2w, and FLAIR) using IoU and Dice on a binary tumor mask derived from the original multi-class annotations. The results show that prompt design substantially influences both accuracy and stability, with box-guided prompting consistently outperforming point-only interaction and additional positive points further improving robustness. We also observe a clear modality effect, where FLAIR and T2w provide more reliable delineation cues than T1-based modalities under the same prompting and propagation protocol. These findings clarify when SAM2 is dependable for zero-shot volumetric tumor segmentation and provide practical guidance on prompt selection and modality choice for interactive clinical use.
Introduction
Brain tumor segmentation is essential for diagnosis, treatment planning, and monitoring, but remains highly challenging due to:
Heterogeneous tumor subregions
Variability across patients and scanners
MRI modality-dependent appearance
Inconsistent boundary contrast
While deep learning has improved segmentation, most supervised models require large annotated datasets and heavy computation, and often struggle to generalize across MRI modalities.
This study evaluates SAM2 (Segment Anything Model 2) in a zero-shot, prompt-driven setting for 3D brain tumor MRI segmentation, analyzing:
Modality-dependent performance
Prompt strategy effectiveness
Slice-to-slice propagation stability
Background
Traditional Segmentation Approaches
U-Net / nnU-Net: Strong convolutional baselines
Transformer-based models: Capture global context
Multi-modal fusion: Combines MRI sequences for robustness
However, these approaches:
Require large labeled datasets
Demand heavy computation
Need careful training and tuning
SAM2 and Prompt-Based Segmentation
SAM2 extends the original SAM with a video-style memory and propagation mechanism, enabling slice-to-slice prediction in 3D MRI volumes.
Key advantages:
No retraining required (zero-shot)
Prompt-driven (points, boxes)
Slice propagation mimics volumetric continuity
Methodology
The evaluation pipeline consists of four steps:
Slice Selection
Initialization slice selected using ground truth.
Priority:
Non-enhancing tumor core (NETC)
Enhancing tumor (ET)
Edema
Designed to avoid modality bias (e.g., FLAIR favoring edema).
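The priority rule above can be sketched as a small NumPy routine. The label values used here (1 = NETC, 3 = ET, 2 = edema) are illustrative assumptions, not the official BraTS encoding; the function simply picks the axial slice with the largest area of the highest-priority subregion present.

```python
import numpy as np

# Illustrative label values (an assumption, not the official BraTS mapping):
# 1 = non-enhancing tumor core (NETC), 3 = enhancing tumor (ET), 2 = edema.
PRIORITY = [1, 3, 2]  # NETC first, then ET, then edema

def select_init_slice(gt_volume: np.ndarray) -> int:
    """Pick the axial slice with the largest area of the highest-priority
    tumor subregion present anywhere in the ground-truth volume."""
    for label in PRIORITY:
        # Per-slice area of this subregion (axis 0 = slice index).
        areas = (gt_volume == label).sum(axis=(1, 2))
        if areas.max() > 0:
            return int(areas.argmax())
    raise ValueError("volume contains no tumor labels")
```

Because the rule is applied to the ground-truth labels rather than to image intensities, the same initialization slice is used for all four modalities of a case.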
Prompt Strategies (8 Modes)
Point-only (PN, 2PN, 3PN)
Box-only (B)
Box + Points (BPN, B2PN, B3PN)
GT mask (upper bound reference)
Negative points control mask leakage.
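A minimal sketch of how box and point prompts might be derived from the ground-truth mask on the initialization slice. The sampling scheme (evenly spaced foreground pixels for positive points, background pixels for negative points, XYXY box convention) is an assumption for illustration, not the paper's exact procedure.

```python
import numpy as np

def mask_to_prompts(mask2d: np.ndarray, n_pos: int = 1, n_neg: int = 1):
    """Derive SAM2-style prompts from a binary 2D mask: a tight bounding
    box (XYXY), positive points inside the mask, and negative points
    outside it to control mask leakage."""
    ys, xs = np.nonzero(mask2d)
    box = [int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())]
    # Positive points: evenly sampled foreground pixels.
    idx = np.linspace(0, len(xs) - 1, n_pos, dtype=int)
    pos = [(int(xs[i]), int(ys[i])) for i in idx]
    # Negative points: background pixels, discouraging over-segmentation.
    bys, bxs = np.nonzero(mask2d == 0)
    bidx = np.linspace(0, len(bxs) - 1, n_neg, dtype=int)
    neg = [(int(bxs[i]), int(bys[i])) for i in bidx]
    return box, pos, neg
```

The eight modes then correspond to which of these outputs are passed to the model: points only (PN/2PN/3PN), box only (B), box plus points (BPN/B2PN/B3PN), or the full GT mask as the upper-bound reference.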
Initial Mask Prediction
SAM2 generates 2D segmentation from prompts.
First returned mask used.
Stored in memory for propagation.
Bidirectional Propagation
Forward pass through slices
Memory reset
Backward pass
Combined for final 3D prediction
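The propagation scheme above can be sketched with a generic per-slice predictor callback standing in for SAM2's memory-conditioned prediction. The per-slice union used to merge the two passes is an assumption made for this sketch; the combination rule is a design choice.

```python
import numpy as np

def propagate_bidirectional(volume, init_idx, init_mask, step_fn):
    """Bidirectional slice propagation with a state reset between passes.

    step_fn(prev_mask, slice_img) -> mask stands in for the per-slice
    predictor (in practice SAM2's memory-based prediction). The forward
    and backward passes are merged by per-slice union (an assumption)."""
    n = volume.shape[0]
    pred = np.zeros(volume.shape, dtype=bool)
    pred[init_idx] = init_mask
    # Forward pass from the initialization slice.
    m = init_mask
    for i in range(init_idx + 1, n):
        m = step_fn(m, volume[i])
        pred[i] = m
    # State reset: restart from the initial 2D mask, then pass backward,
    # so drift accumulated in the forward pass does not leak backward.
    m = init_mask
    for i in range(init_idx - 1, -1, -1):
        m = step_fn(m, volume[i])
        pred[i] |= m
    return pred
```

Resetting to the initial mask before the backward pass keeps both directions anchored to the same prompted prediction, which is what limits drift away from the initialization slice.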
Evaluation metrics:
IoU (Intersection over Union)
Dice coefficient
Binary tumor mask used (tumor vs background).
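The binarization and the two overlap metrics are straightforward; a minimal sketch:

```python
import numpy as np

def binarize(labels: np.ndarray) -> np.ndarray:
    """Collapse multi-class annotations to a binary tumor-vs-background mask."""
    return labels > 0

def iou_dice(pred: np.ndarray, gt: np.ndarray):
    """IoU = |P & G| / |P | G|;  Dice = 2|P & G| / (|P| + |G|)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()
    iou = inter / union if union else 1.0   # empty-vs-empty counts as perfect
    dice = 2 * inter / total if total else 1.0
    return float(iou), float(dice)
```

Both metrics are computed over the whole 3D volume, so per-slice failures during propagation directly lower the reported scores.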
Experimental Setup
Dataset: 50 cases from BraTS2024 Glioma Post-treatment
Key Findings
Box-assisted prompts significantly outperform sparse point prompts.
MRI modality is a primary determinant of performance.
FLAIR provides the most stable and highest overlap.
Bidirectional propagation reduces slice drift.
Limitations
Binary tumor mask evaluation (no subregion separation).
May bias results toward modalities highlighting broader tumor extent.
Does not evaluate fine-grained subregion segmentation.
However, the study successfully isolates the key question:
Can a training-free, prompt-driven model reliably recover global tumor extent and maintain volumetric consistency?
Conclusion
In this work, we investigated SAM2 as a training-free, prompt-driven alternative for volumetric brain tumor delineation on BraTS2024 post-treatment MRI, with an emphasis on how prompting design and MRI modality jointly shape zero-shot segmentation quality and propagation stability.
By standardizing the initialization slice selection and evaluating eight prompt modes, we observed a consistent advantage for box-guided prompting, where a bounding box provides reliable coarse extent and additional positive points further stabilize the prediction, yielding higher IoU and Dice than point-only strategies. The distributional evidence from box plots reinforces that this improvement is not limited to mean gains, but also reflects reduced variability and fewer unstable cases, which is critical when the initial 2D mask becomes the anchor for volumetric propagation.
From the modality perspective, the results show that SAM2's overlap accuracy is strongly influenced by modality-specific contrast, with FLAIR achieving the highest average performance, followed by T2w, while T1ce and T1n remain comparatively lower and statistically similar. Pairwise t-tests further substantiate these differences, indicating that the performance gaps between FLAIR and the other modalities are unlikely to be explained by random variation. Taken together, these findings suggest that effective zero-shot volumetric tumor segmentation with SAM2 benefits from combining geometric guidance through box-assisted prompts with modalities that provide clearer lesion extent cues, while acknowledging that our binary evaluation focuses on global tumor extent rather than subregion separation.