
This article was automatically translated from the original Turkish version.


Autoencoders are a machine learning method based on artificial neural networks that learn by compressing input data into a lower-dimensional, meaningful representation and reconstructing the input from this representation. First introduced in the 1980s, autoencoders operate within the framework of unsupervised learning, with the primary goal of learning important features within the data to construct a low-dimensional representation. In this architecture, where input and output data are identical, the training process aims to minimize data loss. Autoencoders are used in numerous applications including data compression, feature extraction, dimensionality reduction, denoising, and generative modeling.

Basic Structure of Autoencoders

Autoencoders consist fundamentally of two main components: an encoder and a decoder. The encoder transforms the input into a low-dimensional code, while the decoder attempts to reconstruct the original data from this code.


Encoder and Decoder Structure

The encoder transforms high-dimensional input data into a lower-dimensional hidden layer representation. The objective in this process is to represent the data without losing its most important features. This low-dimensional representation produced by the encoder is called the "code" or "latent representation."


The decoder takes this code and generates an output that approximates the original input data. Autoencoders are optimized so that the output closely matches the input.
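As a minimal sketch of this structure (using NumPy, with hypothetical single-layer maps and random weights standing in for a trained model), the encoder/decoder pair can be expressed as:

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 8, 3   # input dimension n, code dimension p (p < n)

# Hypothetical single-layer weights; a trained autoencoder would learn these.
W_enc = rng.normal(scale=0.1, size=(p, n))
W_dec = rng.normal(scale=0.1, size=(n, p))

def encode(x):
    # Encoder: maps the input from R^n to the low-dimensional code in R^p
    return np.tanh(W_enc @ x)

def decode(z):
    # Decoder: maps the code back from R^p to R^n
    return W_dec @ z

x = rng.normal(size=n)   # an input vector
z = encode(x)            # the "code" / latent representation
x_hat = decode(z)        # reconstruction approximating x

print(z.shape, x_hat.shape)   # (3,) (8,)
```

In a real autoencoder both maps are typically multi-layer networks, and the weights are fitted so that `x_hat` matches `x` as closely as possible.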

Mathematical Model

Autoencoders are defined by the learning of two functions:


  • A : R^n → R^p (Encoder)
  • B : R^p → R^n (Decoder)


The goal is to minimize the following optimization problem:


argmin_{A,B} E[Δ(x, B(A(x)))]


Here, the function Δ (Delta) measures the difference between the output and the original input; it is typically the squared ℓ2-norm, which corresponds to the mean squared error.
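This objective can be illustrated with a small NumPy sketch: a linear encoder A and decoder B trained by gradient descent on the empirical version of E[Δ(x, B(A(x)))] with the squared ℓ2 loss. The toy data, dimensions, learning rate, and step count below are illustrative choices, not prescribed by the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 200 points in R^5 that lie exactly on a 2-D subspace,
# so a code of dimension p = 2 suffices for perfect reconstruction.
X = rng.normal(size=(200, 2)) @ rng.normal(size=(2, 5))

n, p = 5, 2
A = rng.normal(scale=0.1, size=(p, n))   # encoder A : R^n -> R^p
B = rng.normal(scale=0.1, size=(n, p))   # decoder B : R^p -> R^n

def loss(A, B):
    # Empirical version of E[ ||x - B(A(x))||^2 ]
    residual = X - X @ A.T @ B.T
    return float(np.mean(np.sum(residual**2, axis=1)))

lr = 0.01
initial = loss(A, B)
for _ in range(500):
    Z = X @ A.T                      # codes, shape (200, p)
    G = 2 * (Z @ B.T - X) / len(X)   # gradient of the loss w.r.t. the reconstruction
    grad_B = G.T @ Z                 # shape (n, p)
    grad_A = (G @ B).T @ X           # shape (p, n)
    A -= lr * grad_A
    B -= lr * grad_B

final = loss(A, B)
print(initial, final)   # reconstruction error drops during training
```

With nonlinear activations between layers the same objective is optimized in exactly the same way, only the gradients are computed by backpropagation through the deeper network.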

Types of Autoencoders

Different variations of autoencoders have been developed to serve specific purposes. These variations maintain the basic architecture but employ different regularization strategies and loss functions.

Undercomplete and Overcomplete Autoencoders

  • Undercomplete Autoencoder: The hidden layer has fewer units than the input layer. This bottleneck forces the network to learn only the most important features.
  • Overcomplete Autoencoder: The hidden layer has at least as many units as the input layer. In this case, additional regularization is required to prevent the network from trivially copying the input, i.e. learning the identity mapping.

Sparse Autoencoders

In sparse autoencoders, most of the hidden layer activations are encouraged to be close to zero. This ensures that only the most important features of the data are extracted. Sparsity is typically enforced through L1 regularization or a Kullback-Leibler divergence penalty on the average activation.
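As an illustration, a sparse autoencoder's loss can be written as the reconstruction error plus an L1 penalty on the code. The weight `lam` and the toy vectors below are hypothetical values chosen for demonstration:

```python
import numpy as np

def sparse_loss(x, x_hat, z, lam=0.1):
    # Reconstruction term: mean squared error between input and output
    mse = np.mean((x - x_hat) ** 2)
    # Sparsity term: L1 penalty pushing most code entries toward zero
    l1 = lam * np.sum(np.abs(z))
    return mse + l1

x     = np.array([1.0, 2.0, 3.0])        # toy input
x_hat = np.array([1.1, 1.9, 3.0])        # toy reconstruction
z     = np.array([0.0, 0.5, 0.0, 0.0])   # mostly-zero code

print(sparse_loss(x, x_hat, z))
```

Because the L1 term grows with every nonzero code entry, minimizing the combined loss favors codes in which only a few units are active for any given input.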

Denoising Autoencoders

Denoising autoencoders introduce deliberate noise into the input data and train the network to reconstruct the original clean data. This approach helps the model learn more robust and generalizable features.
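The corruption step can be sketched as follows; additive Gaussian noise with the `noise_std` value below is one common choice, not the only one:

```python
import numpy as np

rng = np.random.default_rng(0)

def corrupt(x, noise_std=0.3):
    # Denoising autoencoder input: the clean sample plus Gaussian noise
    return x + rng.normal(scale=noise_std, size=x.shape)

x_clean = rng.normal(size=(4, 8))   # a batch of clean inputs
x_noisy = corrupt(x_clean)

# Training pair: the network receives x_noisy, but the loss is computed
# against x_clean, i.e. loss = ||x_clean - decoder(encoder(x_noisy))||^2
print(x_noisy.shape)
```

Since the clean target cannot be recovered by memorizing the corrupted input, the network is pushed toward features that capture the underlying structure of the data.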

Variational Autoencoders (VAE)

Variational autoencoders differ from classical autoencoders in that they are generative models. They generate data by sampling from a probability distribution over the latent representation. In this architecture, the encoder learns the parameters of a probability distribution (mean and variance), while the decoder generates data from samples drawn from these parameters.
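The two ingredients described above, a sample drawn via the reparameterization trick and the closed-form KL divergence to a standard normal prior, can be sketched as follows. The values of `mu` and `log_var` are placeholders standing in for a trained encoder's outputs:

```python
import numpy as np

rng = np.random.default_rng(0)

# The VAE encoder outputs the parameters (mean, log-variance) of a
# diagonal Gaussian over the latent space; placeholder values here.
mu      = np.array([0.5, -1.0])
log_var = np.array([0.0,  0.2])

# Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, I),
# which keeps sampling differentiable with respect to mu and log_var.
eps = rng.normal(size=mu.shape)
z = mu + np.exp(0.5 * log_var) * eps

# KL divergence between N(mu, sigma^2) and the standard normal prior,
# summed over latent dimensions (closed form for diagonal Gaussians).
kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var))

print(z.shape, kl)
```

During training this KL term is added to the reconstruction loss, pulling the learned distribution toward the prior so that sampling from the prior later produces meaningful outputs.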

Applications of Autoencoders

Autoencoders have a wide range of applications. Different configurations can be used to address various types of problems.


Dimensionality Reduction

Autoencoders can be used to reduce data to a lower-dimensional representation. Although similar in purpose to traditional dimensionality reduction methods such as Principal Component Analysis (PCA), autoencoders are capable of learning nonlinear relationships.

Denoising

Denoising autoencoders are particularly used in image processing to remove noise from data. The network learns to predict the original clean data from corrupted input.

Anomaly Detection

Once trained on normal data samples, autoencoders can be used to detect anomalies. Anomalous samples will exhibit high reconstruction error because the network cannot accurately reconstruct them.
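The scoring step can be sketched as follows; here `x_hat` stands in for the output of a trained autoencoder (near-perfect on normal points, poor on an out-of-distribution point), and the threshold is an illustrative choice:

```python
import numpy as np

def reconstruction_error(x, x_hat):
    # Per-sample mean squared error between input and reconstruction
    return np.mean((x - x_hat) ** 2, axis=-1)

# Hypothetical inputs and reconstructions: the last point is far from
# the training distribution, so its reconstruction is poor.
x     = np.array([[1.0, 2.0], [0.5, 0.4], [9.0, -7.0]])
x_hat = np.array([[1.1, 2.0], [0.5, 0.5], [1.0,  0.0]])

errors = reconstruction_error(x, x_hat)
threshold = 0.5                 # illustrative; chosen from validation data in practice
is_anomaly = errors > threshold

print(is_anomaly)   # [False False  True]
```

In practice the threshold is typically calibrated on held-out normal data, for example as a high percentile of the reconstruction errors observed there.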

Generative Models

In particular, variational autoencoders (VAEs) can be used to generate new data samples. New data can be created by randomly sampling from the distribution learned during training.
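Generation then amounts to drawing a latent vector from the prior and decoding it. In this sketch the one-layer decoder uses random placeholder weights standing in for a trained VAE decoder:

```python
import numpy as np

rng = np.random.default_rng(0)

p, n = 2, 6                        # latent dimension p, data dimension n
W_dec = rng.normal(size=(n, p))    # placeholder for trained decoder weights

def decode(z):
    # Hypothetical one-layer decoder mapping the latent space to data space
    return np.tanh(W_dec @ z)

z = rng.normal(size=p)    # sample from the standard normal prior N(0, I)
x_new = decode(z)         # a "generated" sample in data space

print(x_new.shape)   # (6,)
```

With a trained decoder, repeating this draw-and-decode step yields new samples resembling the training data.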

Feature Extraction

Autoencoders are also used for feature extraction from data. In particular, for other machine learning tasks such as classification or clustering, the encoder component of an autoencoder can serve as a feature extractor.
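Using the encoder as a feature extractor can be sketched as follows, with random placeholder weights standing in for a trained encoder:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for trained encoder weights mapping R^10 -> R^3
W_enc = rng.normal(size=(3, 10))

def encode(X):
    # Batch version of the encoder: each row of X becomes a 3-D code
    return np.tanh(X @ W_enc.T)

X = rng.normal(size=(50, 10))   # raw data
features = encode(X)            # compact features for a downstream model

print(features.shape)   # (50, 3)
```

The resulting `features` array would then be passed to a classifier or clustering algorithm in place of the raw inputs.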

Author Information

Author: Gülçin Özer, December 9, 2025 at 6:26 AM


Contents

  • Basic Structure of Autoencoders

    • Encoder and Decoder Structure

    • Mathematical Model

  • Types of Autoencoders

    • Undercomplete and Overcomplete Autoencoders

    • Sparse Autoencoders

    • Denoising Autoencoders

    • Variational Autoencoders (VAE)

  • Applications of Autoencoders

    • Dimensionality Reduction

    • Denoising

    • Anomaly Detection

    • Generative Models

    • Feature Extraction
