CTransNet: Convolutional Neural Network Combined with Transformer for Medical Image Segmentation

Zhixin Zhang

Information Engineering Department, Tianjin University of Commerce, Tianjin 300134, China
Shuhao Jiang

Information Engineering Department, Tianjin University of Commerce, Tianjin 300134, China
Xuhua Pan

Information Engineering Department, Tianjin University of Commerce, Tianjin 300134, China

keywords: Medical image segmentation, deep learning, attention mechanism

The Transformer has been widely used for many tasks in NLP, but its application to the image domain still leaves much room for exploration. In this paper, we propose a simple and efficient hybrid Transformer framework, CTransNet, which combines self-attention with a CNN to improve medical image segmentation performance by capturing long-range dependencies at different scales. To this end, we propose an efficient self-attention mechanism incorporating relative position encoding, which reduces the time complexity of self-attention from O(n^2) to O(n), and a new self-attention decoder that recovers fine-grained encoder features through skip connections. This paper aims to address a current dilemma of Transformer applications, namely the need to learn inductive bias from large amounts of training data. The hybrid layer in CTransNet allows the Transformer to be initialized as a CNN without pre-training. We have evaluated CTransNet on several medical segmentation datasets. CTransNet shows superior segmentation performance and robustness, and holds great promise for generalization to other medical image segmentation tasks.

reference: Vol. 42, 2023, No. 2, pp. 392–410

doi: 10.31577/cai_2023_2_392

Computing and Informatics

formerly Computers and Artificial Intelligence
