Mobrief

Research · Hugging Face Daily Papers

When Fine-Tuning Changes the Evidence: Architecture-Dependent Semantic Drift in Chest X-Ray Explanations

Transfer learning followed by fine-tuning is widely adopted in medical image classification due to consistent gains in diagnostic performance.

Apr 09, 2026 17:53 UTC · ~4 min read · Technical Source

Context

However, in multi-class settings with overlapping visual features, improvements in accuracy do not guarantee stability of the visual evidence used to support predictions. The paper defines semantic drift as systematic changes in the attribution structure supporting a model's predictions between transfer learning and full fine-tuning, reflecting potential shifts in underlying visual reasoning despite stable classification performance. Using a five-class chest X-ray task, the authors evaluate DenseNet201, ResNet50V2, and InceptionV3 under a two-stage training protocol and quantify drift with reference-free metrics capturing the spatial localization and structural consistency of attribution maps. Across architectures, coarse anatomical localization remains stable, while overlap IoU reveals pronounced architecture-dependent reorganization of evidential structure. Beyond single-method analysis, stability rankings can reverse between LayerCAM and Grad-CAM++ even under converged predictive performance, establishing explanation stability as an interaction between architecture, optimization phase, and attribution objective.
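To make the overlap-IoU idea concrete, here is a minimal sketch of one common way to compare two attribution maps: binarize each map by keeping its most salient pixels, then compute intersection-over-union of the resulting masks. The `keep_fraction` threshold and the top-k binarization scheme are illustrative assumptions, not the paper's exact protocol.

```python
import numpy as np

def attribution_iou(map_a, map_b, keep_fraction=0.2):
    """Overlap IoU between two attribution maps (illustrative sketch).

    Each map is binarized by keeping its top `keep_fraction` most salient
    pixels; IoU is then computed on the two binary masks. The thresholding
    rule here is an assumption for illustration, not the paper's metric.
    """
    def top_mask(m, frac):
        flat = m.ravel()
        k = max(1, int(frac * flat.size))          # number of pixels to keep
        thresh = np.partition(flat, -k)[-k]         # k-th largest value
        return m >= thresh

    a = top_mask(map_a, keep_fraction)
    b = top_mask(map_b, keep_fraction)
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

# Hypothetical usage: compare a CAM from the transfer-learning stage
# against a CAM from the fully fine-tuned model of the same input.
rng = np.random.default_rng(0)
cam_transfer = rng.random((7, 7))
cam_finetune = cam_transfer + 0.1 * rng.random((7, 7))  # mild drift
score = attribution_iou(cam_transfer, cam_finetune)
```

A low IoU under converged accuracy is exactly the signature the paper calls semantic drift: the model still predicts the same class well, but the evidence it highlights has reorganized.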
