Robust and Interpretable Chest X-ray Classification via Diffusion Purification and Concept-Based Adversarial Detection
DOI: https://doi.org/10.56714/bjrs.51.2.19
Keywords: Concept Activation Vectors (TCAV), Medical images, Random Forest, ResNet18
Abstract
Adversarial attacks compromise the integrity of deep learning systems in medical imaging by adding subtle changes to model inputs that humans cannot notice, leading the model to make incorrect decisions; an attacker can thereby corrupt both the input data and the prediction system. In this paper, a hybrid detection method based on diffusion purification, deep feature extraction, and ensemble learning is proposed to address this issue using chest X-ray images.
The framework consists of two phases. In the first phase, adversarial samples are cleansed: a diffusion model regenerates the images so that the small perturbations are fully removed. In the second phase, images are categorized into two groups: clean and purified images are assigned to a safe category, while harmful images are assigned to an unsafe category. To this end, a pre-trained ResNet18 feature extractor retrieves salient features from the chest X-rays, and a Random Forest model classifies the extracted features as harmful or non-harmful. Experimental results show that the proposed framework detects more than 99% of adversarial samples. This detection rate indicates that the proposed purification and ensemble-based feature detection can make AI-assisted disease detection more reliable and safer.
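The second-phase detector described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes 512-dimensional feature vectors (the size of ResNet18's penultimate-layer output) have already been extracted, and substitutes synthetic feature clusters for real chest X-ray embeddings so the snippet runs stand-alone.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for ResNet18 penultimate-layer features (512-D):
# one cluster for safe (clean/purified) images, one for unsafe (adversarial).
n_per_class, dim = 500, 512
safe_feats = rng.normal(loc=0.0, scale=1.0, size=(n_per_class, dim))
unsafe_feats = rng.normal(loc=0.6, scale=1.0, size=(n_per_class, dim))

X = np.vstack([safe_feats, unsafe_feats])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = safe, 1 = unsafe

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# A Random Forest separates the two feature distributions, playing the role
# of the ensemble detector in the second phase of the pipeline.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(f"detection accuracy on synthetic features: {acc:.2f}")
```

In the actual pipeline the feature matrix `X` would come from the frozen ResNet18 backbone applied to clean, purified, and adversarial chest X-rays; only the Random Forest is trained on the detection task.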
License
Copyright (c) 2025 Basrah Researches Sciences

This work is licensed under a Creative Commons Attribution 4.0 International License.





