This article was automatically translated from the original Turkish version.

Xception is an advanced CNN architecture that achieves more efficient learning with fewer parameters by using depthwise separable convolutions.
Model: Xception
Year: April 4, 2017
Developer: François Chollet
Organization: Google Inc.
Core Component: Depthwise + pointwise separation
Result: ImageNet Top-1 ~79%
Variants: XceptionNEXcepTion

Xception is a convolutional neural network (CNN) architecture widely used in deep learning for image classification. Proposed in 2017 by François Chollet at Google, its name stands for “Extreme Inception,” reflecting its origin as an extension of the Inception architecture. The design is built almost entirely on depthwise separable convolutions, which need far fewer parameters than standard convolutions of the same shape and therefore use model capacity more efficiently.

Xception Architecture

Xception replaces most standard convolutional layers with depthwise separable convolutions, which makes the network more modular and efficient. The design rests on one principle: filter each channel independently in the spatial dimensions, then combine the channels.

[Figure: Xception architecture]

Depthwise Separable Convolutions

Depthwise separable convolutions consist of a two-step process:

  1. Depthwise Convolution: Each input channel is filtered separately using its own filter. This step performs spatial filtering.
  2. Pointwise Convolution: A 1x1 convolution then combines the depthwise outputs across channels and sets the number of output channels.

Together, these two steps approximate a standard convolution at a fraction of its parameter and computation cost.
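The two steps can be sketched in plain Python on a tiny input. This is a minimal illustration, assuming stride 1, no padding, and no bias or activation; the function names and the example values are hypothetical and are not taken from any library or from the Xception reference implementation.

```python
def depthwise_conv(x, kernels):
    """Filter each channel with its own k x k kernel (stride 1, no padding).

    x: list of C channels, each an H x W list of lists.
    kernels: list of C kernels, each a k x k list of lists.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        h_out = len(channel) - k + 1
        w_out = len(channel[0]) - k + 1
        out.append([
            [
                sum(channel[i + di][j + dj] * kernel[di][dj]
                    for di in range(k) for dj in range(k))
                for j in range(w_out)
            ]
            for i in range(h_out)
        ])
    return out


def pointwise_conv(x, weights):
    """Mix channels with a 1x1 convolution.

    weights: list of C_out rows, each holding C_in mixing coefficients.
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(wt * x[c][i][j] for c, wt in enumerate(row)) for j in range(w)]
            for i in range(h)
        ]
        for row in weights
    ]


# Hypothetical 2-channel, 3x3 input.
x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
# One 2x2 spatial kernel per input channel.
kernels = [
    [[1, 0], [0, 1]],  # adds the two main-diagonal values of each window
    [[1, 1], [1, 1]],  # sums each window
]
# Three output channels, each a weighted mix of the two depthwise outputs.
weights = [[1, 0], [0, 1], [1, -1]]

depthwise = depthwise_conv(x, kernels)       # step 1: spatial filtering per channel
output = pointwise_conv(depthwise, weights)  # step 2: channel mixing
print(depthwise)
print(output)
```

Note that the spatial kernels never see more than one channel, and the 1x1 weights never see more than one pixel position; that separation is what keeps the parameter count low.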

Structural Features of Xception

Xception's feature extractor consists of 36 convolutional layers, organized into three main stages:

  • Entry Flow: Transforms the input image into feature maps of lower spatial resolution. It begins with two standard convolutional layers followed by three depthwise separable convolution blocks. This stage extracts basic edge and texture features while shrinking the spatial dimensions, which lowers the computational load of the later stages.
  • Middle Flow: The core learning component of the architecture. It contains eight repeated units, each comprising three depthwise separable convolutions wrapped by a residual connection. This repetition adds depth and lets the model learn progressively more complex and abstract features.
  • Exit Flow: Converts the learned features into dense representations suitable for classification. This stage also uses depthwise separable convolutions and residual connections. It is typically followed by a global average pooling layer and then a fully connected classification layer.
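As a quick check on the figure of 36 layers, the following sketch tallies the convolutional layers per flow. The per-module counts (two separable convolutions in each entry-flow and exit-flow module, and a final exit module without a residual connection) are assumptions drawn from the diagram in the original Xception paper rather than from the description above, and the 1x1 convolutions on the shortcut paths are not counted.

```python
from collections import Counter

# Each tuple: (flow, convolutional layers in the module, has residual connection).
# Assumed layout, following the original paper's diagram.
MODULES = (
    [("entry", 2, False)]                       # two standard convolutions
    + [("entry", 2, True)] * 3                  # three separable-convolution blocks
    + [("middle", 3, True)] * 8                 # eight repeated units
    + [("exit", 2, True), ("exit", 2, False)]   # exit-flow modules
)

layers_per_flow = Counter()
for flow, n_layers, _has_residual in MODULES:
    layers_per_flow[flow] += n_layers

total_layers = sum(layers_per_flow.values())
print(dict(layers_per_flow))  # layers in each flow
print(total_layers)           # total convolutional layers
print(len(MODULES))           # number of modules
```

Under these assumptions the middle flow accounts for 24 of the 36 layers, which is why it is described as the core of the network.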


[Figure: (a) Standard CNN, (b) Depthwise Separable CNN]


By building on depthwise separable convolutions, Xception extracts features more efficiently than an architecture built from standard convolutions of the same shape.
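The efficiency gain can be made concrete by counting weights for a single layer. The sketch below compares a standard 3x3 convolution with its depthwise separable counterpart; the channel width of 728 is an illustrative assumption (it matches the middle-flow width reported in the original paper), biases are ignored, and the function names are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise k x k step plus a pointwise 1x1 step (bias ignored)."""
    depthwise = k * k * c_in   # one k x k kernel per input channel
    pointwise = c_in * c_out   # 1x1 convolution mixing channels
    return depthwise + pointwise


k, c_in, c_out = 3, 728, 728   # assumed layer shape, for illustration only
standard = standard_conv_params(k, c_in, c_out)
separable = separable_conv_params(k, c_in, c_out)
ratio = separable / standard   # equals 1/c_out + 1/k**2

print(standard)         # 4769856
print(separable)        # 536536
print(round(ratio, 4))  # 0.1125
```

For this shape the separable layer needs roughly one ninth of the weights, and the ratio approaches 1/k² as the number of output channels grows.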

Applications

The Xception architecture has been successfully applied to various image classification and object detection tasks, most notably on ImageNet. Similar separable convolution principles have also been adopted in models such as MobileNetV2 and EfficientNet. Its primary application areas include:

  • Image classification
  • Medical image analysis
  • Video analysis
  • Industrial quality control
  • Transfer learning-based models

Author Information

Author: Kaan Gümele, December 9, 2025 at 9:07 AM
