Improved Swin Transformer-Based Model for Hot-Rolled Strip Defect Detecting

keywords: Object detection, Swin transformer, NEU-DET, multi-scale targets
Hot-rolled steel strip plays an important role in industrial manufacturing, and defects on its surface affect the appearance of downstream products as well as their corrosion resistance, wear resistance, and fatigue strength. However, existing methods struggle to learn discriminative feature representations, resulting in poor detection performance. Surface defect detection for hot-rolled strip therefore faces two main challenges: insufficient ability to extract local features, and limited ability to detect multi-scale targets. To address these issues, we propose RTCN (Residual Deformable Convolution and Double LayerNorm Swin Transformer and Channel Expansion Feature Pyramid Networks), a multi-scale hot-rolled strip surface defect detection model. It adopts the Double LayerNorm Swin Transformer (DLST) and the Residual Deformable Convolution Block (RDCB) as its backbone network to increase the model's sensitivity to small and irregular defects. In addition, we adopt Channel Expansion Feature Pyramid Networks (CEFPN) to introduce more feature dimensions and better capture structural and semantic image information. We evaluate the proposed model on the publicly available NEU-DET dataset. Comprehensive experiments show that it outperforms state-of-the-art approaches by 1.1% to 7.2% in mAP.
reference: Vol. 43, 2024, No. 6, pp. 1352–1371