INCSA-UNET: Spatial Attention Inception UNET for Aerial Images Segmentation

İbrahim Delibaşoğlu

Software Engineering, Faculty of Computer and Information Sciences, Sakarya University, 54050 Sakarya, Turkey

keywords: Segmentation, deep learning, CNN, INCSA-UNET, attention

Building segmentation from aerial images is essential in applications such as urban planning and population estimation. Since deep learning methods significantly advanced the performance of many computer vision tasks, fully convolutional networks (FCNs), and UNET in particular, have achieved promising results in segmentation problems. However, Convolutional Neural Networks (CNNs) built on standard convolution operations suffer from problems such as overfitting and difficulty in precisely extracting the boundaries of objects with different sizes and shapes. In this study, we use Inception blocks with UNET to enhance feature extraction, implementing a two-level Inception approach that covers the entire encoding stage. In the proposed architecture, a structured form of dropout (DropBlock) is used to prevent overfitting, and spatial/channel attention modules are applied to enhance important features by focusing on key areas. We evaluate the proposed INCSA-UNET architecture on the publicly available Massachusetts dataset and apply two-fold cross-validation experiments for a more reliable analysis. The experimental results show that the proposed architecture does not significantly increase the number of parameters of UNET and yields a significant improvement in terms of the F1 and Kappa quantitative measures.
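For readers unfamiliar with the two measures named above, the following is a minimal sketch of how F1 and Cohen's Kappa can be computed for a binary building mask. It uses the standard textbook definitions and is not taken from the paper's evaluation code; the function names (`confusion_counts`, `f1_score`, `cohen_kappa`) and the toy masks are hypothetical and chosen for illustration only.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1      # building predicted, building present
        elif p and not t:
            fp += 1      # building predicted, background present
        elif not p and t:
            fn += 1      # background predicted, building present
        else:
            tn += 1      # background predicted, background present
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of each class.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1:
        return 0.0
    return (observed - expected) / (1 - expected)


# Toy flattened masks (1 = building, 0 = background); illustrative only.
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
tp, fp, fn, tn = confusion_counts(pred, truth)
f1 = f1_score(tp, fp, fn)
kappa = cohen_kappa(tp, fp, fn, tn)
```

Kappa is often reported alongside F1 in building segmentation because background pixels usually dominate aerial tiles, so raw pixel agreement alone can look high even for a weak model.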

reference: Vol. 40, 2021, No. 6, pp. 1244–1262

doi: 10.31577/cai_2021_6_1244

Computing and Informatics

formerly Computers and Artificial Intelligence
