
This article was automatically translated from the original Turkish version.


Artificial Neural Networks and Single-Layer Perceptron


Artificial Neural Network

Classical von Neumann computer architectures and software are highly effective at numerical and symbolic processing but fail at complex perceptual problems. The human brain, although slower at numerical and symbolic processing, excels at complex perceptual tasks and at using knowledge acquired through experience. Artificial neural networks emulate the structure of the biological neural networks in the brain, including their ability to learn, remember, and generalize. Learning is performed using examples. After learning, artificial neural networks can be used for tasks such as pattern recognition/classification (sound, handwriting, face, shape, object, etc.), clustering/grouping, optimization, prediction, control, decision making, and more.


Artificial neural networks consist of a large number of simple artificial neurons (processing elements) connected in a parallel, distributed fashion, each with its own local memory. Artificial neural networks are also referred to as neural computation, network computation, connectionist networks, parallel distributed networks, and neuromorphic systems.


To define and later use any input example in an artificial neural network, it is necessary to understand how the data is represented within the network, where it is stored, and how it is retrieved. In classical computers, data is represented as sequences of 1s and 0s, whereas in neural networks, it is represented by mathematical functions. The weights of connections between neuron cells (processing elements) serve as variables of this function. Here, weights determine what information is stored; they themselves have no inherent meaning.

Information in artificial neural networks is distributed through connections and their associated weights. Classical computers store information in specific memory locations, while artificial neural networks distribute information across the entire network. This is called distributed representation.

In a classical computer, information is retrieved by accessing a specific memory location. In artificial neural networks, information is typically presented as noisy or incomplete input examples. The network combines the input with all possible alternatives and selects the best and most appropriate example as output. This output example represents the network’s interpretation of the input based on available knowledge. This is known as associative memory.


The greatest advantages of artificial neural networks are their ability to learn and to use different learning algorithms. They adapt to their environment, can operate with incomplete information, make decisions under uncertainty, and are tolerant to errors. Their most frequently cited disadvantages are the inability to analyze how the system operates and the risk of failing to achieve successful learning. The advantages and disadvantages of artificial neural networks are listed below.


Advantages

  • They do not require rule-based systems.
  • They have learning capability and can learn using different learning algorithms.
  • They can generate results (information) for unseen inputs.
  • They can be used for perception-related tasks.
  • They can perform pattern recognition and classification and can complete incomplete patterns.
  • They have the ability to self-organize and learn by adapting to their environment.
  • They are fault-tolerant and can operate with incomplete or uncertain information, exhibiting graceful degradation in faulty conditions.
  • They have distributed memory.
  • They can operate in parallel and process real-time information.


Disadvantages

  • It may be difficult to adapt a model developed for one problem to different problems.
  • Training is impossible if examples are unavailable. Even with example data, learning may not occur due to network architecture or learning rules.
  • Determining the appropriate network structure for a problem is typically done through trial and error.
  • There is no definitive method to determine when training should be stopped.
  • The internal workings of the system cannot be understood.
  • They suffer performance loss when run on serial computers that process only one piece of information at a time. Therefore, they must be implemented on hardware capable of parallel processing.
  • The behavior of the network cannot be explained. When a solution is produced, it is impossible to determine why or how it was generated.
  • Stability analysis cannot be performed for some networks.


Biological Neural Network

A biological neural network consists of nerve cells called neurons, which have information processing capabilities. Neurons connect with each other to perform their functions. It is estimated that the human brain contains 100 billion neurons. A single neuron can form connections with 50,000 to 250,000 other neurons, and it is estimated that there are more than 6x10^13 connections in the human brain. Biological neural networks underlie all human behavior and the ability to understand the environment. Humans learn relationships between events by using information received through the five senses and by developing mechanisms of perception and comprehension. Different regions of the human brain perform different functions. Information received from sensory organs is transmitted to the brain via the nervous system, and decisions made by the brain are sent back to the body’s organs via signals transmitted by the nervous system.


Neurons are classified into three main groups according to their functions:

  • Sensory neurons: Transmit information received from the external environment through smell, taste, touch, vision, and hearing to the brain.
  • Motor neurons: Control muscle contractions to enable movement.
  • Interneurons: Establish neural circuits and facilitate communication between sensory neurons, motor neurons, and the central nervous system.


The structure of a typical neuron is shown below.



A basic biological neuron consists of dendrites, a soma (cell body), an axon, and terminal buttons. Biological neurons can be roughly classified into four groups based on their physical structure.

  • Multipolar neurons
  • Bipolar neurons
  • Unipolar neurons
  • Anaxonic neurons


The neuron shown above is a multipolar neuron, which has many dendrites and a single axon. Dendrites receive information from other neurons, acting as the receivers of neural communication. The soma contains the nucleus and the mechanisms that sustain the cell’s vital functions, and it processes incoming signals in a generally nonlinear manner. The neuron transmits information to other cells via the axon. The axon is surrounded by a myelin sheath, an insulating material that reduces the capacitance between the plasma membrane and the extracellular fluid, thereby increasing the speed of signal propagation. Transmission along myelinated nerves is significantly faster than along unmyelinated ones. The firing frequency of neurons in the human brain is approximately 100 Hz. Along the myelinated axon there are periodic gaps called nodes of Ranvier, spaced a few millimeters apart, which regenerate the signal.


The ends of the axon are called terminal buttons. Signals transmitted from one neuron to another are transferred via synapses, which are junctions between the terminal buttons of the sending cell and the dendritic membrane or soma of the receiving cell.




The transmission of signals from one nerve cell to another across the synapse is mediated by specialized chemical messengers called neurotransmitters. When the chemical signal received through the synapse causes the intracellular electrical potential to reach a threshold, an electrical signal is sent down the axon. This is called cell activation (firing).


The function of a nerve cell is to receive stimuli arriving at the dendrites via synapses and either generate or not generate an action potential. An action potential typically takes the form of an electrical pulse of about 100 mV amplitude lasting about 1 ms and propagates along the axon at speeds of up to 120 m/s.


When the action potential reaches the synaptic terminal of the axon, the electrical pulse-coded signals are converted into chemical neurotransmitter signals that cross the synaptic cleft and reach the dendrites of the receiving neurons. In the dendritic membrane of the receiving neuron, the chemical stimulus is converted back into an electrical signal and transmitted toward the cell body. A synapse can be either excitatory or inhibitory. Excitatory synapses raise the membrane potential of the receiving neuron, while inhibitory synapses lower it. There is a potential difference between the inner and outer surfaces of the neuron’s membrane. Neurotransmitter signals sent from the presynaptic neuron adjust the membrane potential of the postsynaptic neuron.


A neuron evaluates incoming signals to determine how significant each signal is to it. In neurophysiological terms, the signal x(t) consists of electrical pulses, such as axonal spikes (action potentials), and the activity f(x(t)) corresponds to the temporal summation of the potential difference across the neuron’s membrane (its voltage) over time. Activity can be positive or negative and is theoretically unbounded.


Learning in the human brain occurs in three ways:

  • By generating new axons
  • By activating existing axons
  • By altering the strength of existing axons


Mathematical Model of a Neuron

The mathematical modeling of neurons was first introduced in 1943 by Warren McCulloch and Walter Pitts and is known as the McCulloch–Pitts (MCP) neuron, Binary Threshold Unit, Threshold Logic Unit (TLU), or Linear Threshold Unit.


In the TLU neuron model, there are n input values {x1,x2,x3,...,xn}, n weight values {w1,w2,w3,...,wn}, one threshold value (T), and one output value (y).



In the McCulloch–Pitts neuron model, input values are multiplied by their corresponding weights and then summed. If the resulting sum exceeds the threshold value, the output is set to "1"; if it is less than or equal to the threshold, the output is set to "0". The weights and threshold were fixed, and no learning rule was applied. Although the McCulloch–Pitts model attempted to mimic biological neurons, it failed to replicate their actual behavior. Differences between the McCulloch–Pitts model and biological neurons include:
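The computation just described can be sketched as a small Python function (the function name and the example weights are illustrative, not from the original article):

```python
# Minimal sketch of a McCulloch–Pitts (TLU) neuron with fixed weights
# and threshold, as described above; no learning rule is applied.

def mcp_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of inputs exceeds the threshold, else 0."""
    v = sum(x * w for x, w in zip(inputs, weights))
    return 1 if v > threshold else 0

# Example: with weights (1, 1) and threshold 1.5, the unit computes logical AND.
print(mcp_neuron([1, 1], [1, 1], 1.5))  # -> 1
print(mcp_neuron([1, 0], [1, 1], 1.5))  # -> 0
```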


  • Biological neurons do not operate according to a unit step function but produce continuous or graded outputs
  • Biological neurons apply nonlinear summation functions and logical operations to input values
  • Biological neurons produce a series of output signals rather than a single output value
  • Biological neurons are updated asynchronously


In 1958, Frank Rosenblatt extended the McCulloch–Pitts model and developed the "Perceptron" model for image recognition. The Perceptron model introduced a learning rule to the McCulloch–Pitts neuron, based on the Hebbian learning rule proposed by Donald Olding Hebb in his 1949 work "The Organization of Behavior". The learning rule enables the updating of weight values.


In an artificial neuron model, inputs represent values or signals received from other neurons or the external environment. Inputs are also referred to as features.


Weight values are coefficients that determine the influence of each input on the neuron. Each input has its own distinct weight value. The magnitude of a weight does not indicate importance or insignificance per se. Positive weight values represent excitatory synapses, while negative weight values represent inhibitory synapses. A large weight value indicates a strong connection between the input and the artificial neuron, while a small weight value indicates a weak connection.


The TLU neuron calculates the sum of the products of input values and their corresponding weights (v), then compares this sum with the threshold value (T) using an activation function. If the weighted sum exceeds the threshold, the output is "1"; if it is equal to or below the threshold, the output is "0". The linear threshold function ensures the output signal takes a binary value of "1" or "0". It is generally advisable to use a non-zero threshold value.


If the net input is positive (v > 0), the artificial neuron perceives the input as a positive example; if it is negative or zero (v <= 0), it perceives the input as a negative example. Inputs with positive weights contribute to increased activation and thus indicate positive examples, while inputs with negative weights reduce activation and indicate negative examples.


In artificial neurons, the results of the activation function can be passed through a scaling or limiting operation. Scaling is simply the multiplication of the activation value by a scaling factor. Limiting ensures that the scaled results do not exceed predefined minimum and maximum bounds.


The output of an artificial neuron is the point at which the result of the activation function is transmitted to the external environment or to other neurons. Each neuron has one output. This output can serve as an input to any number of subsequent neurons.


The mathematical formulas for the TLU neuron are shown below:
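Written out, the TLU computes the weighted sum and thresholds it:

```latex
v = \sum_{i=1}^{n} w_i x_i , \qquad
y = f(v) = \begin{cases} 1, & v > T \\ 0, & v \le T \end{cases}
```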



In the Perceptron model using the TLU neuron, if we treat the threshold value as an input, we can use an input x0 with a fixed value of -1 and its corresponding weight w0 representing the threshold T.
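With the threshold folded into the weight vector in this way, the formulas become:

```latex
x_0 = -1,\; w_0 = T
\;\Rightarrow\;
v = \sum_{i=0}^{n} w_i x_i = \sum_{i=1}^{n} w_i x_i - T , \qquad
y = \begin{cases} 1, & v > 0 \\ 0, & v \le 0 \end{cases}
```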



The linear function v = 0 defines a decision boundary (hyperplane) that separates the n-dimensional input (feature) space into two regions. Below is the geometric representation of a line for a two-dimensional (two-input), two-class pattern classification problem.
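For the two-input case, with the threshold written out explicitly, the boundary v = 0 is the line (assuming w2 is non-zero):

```latex
w_1 x_1 + w_2 x_2 - T = 0
\quad\Longleftrightarrow\quad
x_2 = -\frac{w_1}{w_2}\, x_1 + \frac{T}{w_2}
```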




In 1959, Bernard Widrow and Ted Hoff developed the ADALINE (Adaptive Linear Neuron) model. The ADALINE unit resembles the single-layer linear classifier Perceptron but uses a different learning rule.


In the ADALINE model, input values are multiplied by their weights and then added to a bias value. If the resulting sum is greater than zero, the output is set to 1; if it is less than or equal to zero, the output is set to -1. The ADALINE model is shown below.



As in the Perceptron model, if we represent the bias value as an input in the ADALINE model, we can use an input x0 with a fixed value of +1 and its corresponding weight w0 representing the bias b.



The mathematical formula for the ADALINE neuron is shown below:
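In text form, the ADALINE computation with the bias folded in as described above is:

```latex
v = \sum_{i=1}^{n} w_i x_i + b = \sum_{i=0}^{n} w_i x_i
\quad (x_0 = +1,\; w_0 = b), \qquad
y = \begin{cases} +1, & v > 0 \\ -1, & v \le 0 \end{cases}
```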




The bias value shifts the separating line (decision boundary) away from the origin.


Today, the summation and activation functions of artificial neurons have been generalized.


The summation function calculates the net input to an artificial neuron. The summation function used depends on the artificial neural network model. Not all neurons in a network need to use the same summation function. Examples of summation functions:
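A few commonly used summation functions can be sketched in Python (the particular selection and the function names here are illustrative assumptions, not taken from the article):

```python
# Illustrative summation functions mapping inputs and weights to a net input.

def weighted_sum(xs, ws):
    # The most common choice: sum of input-weight products.
    return sum(x * w for x, w in zip(xs, ws))

def product_sum(xs, ws):
    # Multiplicative combination of the weighted inputs.
    result = 1.0
    for x, w in zip(xs, ws):
        result *= x * w
    return result

def max_sum(xs, ws):
    # Keeps only the largest weighted input.
    return max(x * w for x, w in zip(xs, ws))

print(weighted_sum([1, 2], [3, 4]))  # -> 11
```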



The activation function determines the output of the neuron by processing the net input. As with the summation function, neurons in a network do not need to use the same activation function. Examples of activation functions:
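Some widely used activation functions, sketched in Python (the selection is illustrative, not from the original article):

```python
import math

# Illustrative activation functions applied to the net input v.

def step(v, threshold=0.0):
    # Linear threshold (the TLU activation): binary 1/0 output.
    return 1 if v > threshold else 0

def sigmoid(v):
    # Logistic function: smooth output in (0, 1).
    return 1.0 / (1.0 + math.exp(-v))

def tanh(v):
    # Hyperbolic tangent: smooth output in (-1, 1).
    return math.tanh(v)

def relu(v):
    # Rectified linear unit: passes positive values, zeroes negatives.
    return max(0.0, v)

print(step(0.7), relu(-2.0))  # -> 1 0.0
```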



Single-Layer Artificial Neural Networks and Training

Artificial neural networks composed of single-layer perceptrons consist only of input and output layers. The input layer contains input signals, while the output layer contains perceptron neurons. Input signals are connected to all output neurons. Each connection has a distinct weight value for each output neuron.



In single-layer perceptron networks used as linear classifiers, the output function produces binary values. Depending on the neuron model used (Perceptron or ADALINE), output values take the form {1, 0} or {-1, 1}. These output values represent classes. Input values are assigned to one of two classes, and the network attempts to find a line or plane that separates the two classes. This limits a single artificial neuron to recognizing only two distinct classes. To recognize more than two classes, multiple artificial neurons are required.


Training involves updating the weight values associated with the neuron’s inputs. In single-layer networks, the learning rule is based on the error correction principle. In the error correction algorithm, the difference between the desired output (d) and the calculated output (y) — the error value — is used to adjust the input weights, aiming to reduce the error. The goal of the learning rule is to determine the position of the separating line or plane such that it best distinguishes between the two classes.


Perceptron Learning Rule

Below is the formula for the output function of a TLU-type artificial neuron (Perceptron).



Both input values and the corresponding desired (expected) output values, along with the threshold, are presented to the artificial neuron. The output value can be {1, 0}. The neuron then calculates its output based on its summation and activation functions. If the calculated output differs from the expected output, the weights are updated:


(a) If the calculated result is greater than the expected result, the weights are decreased

(b) If the calculated result is less than the expected result, the weights are increased

(c) If the calculated result equals the expected result, the weights remain unchanged
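Cases (a)-(c) are commonly collapsed into a single update rule, where η > 0 is a learning-rate constant (the symbol is an assumption; the article does not name one):

```latex
w_i \leftarrow w_i + \eta\,(d - y)\,x_i
```

When y exceeds d, the factor (d - y) is negative, so weights on positive inputs decrease, matching case (a); when y is less than d, they increase, matching case (b); when they are equal, the weights are unchanged.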


This process is repeated for the entire training set until the calculated output matches the expected output for all examples. However, in nonlinear problems, the weight update loop may enter an infinite cycle; therefore, the process is halted once a predefined maximum number of iterations is reached.
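The training loop described above can be sketched in Python, here learning the logical AND function (the learning rate, epoch limit, and names are assumptions for illustration):

```python
# Sketch of perceptron training with the error-correction rule.

def train_perceptron(samples, n_inputs, lr=0.1, max_epochs=100):
    # Augmented weights: w[0] plays the role of the threshold with x0 = -1.
    w = [0.0] * (n_inputs + 1)
    for _ in range(max_epochs):
        errors = 0
        for xs, d in samples:
            aug = [-1.0] + list(xs)                    # x0 = -1 folds in the threshold
            v = sum(wi * xi for wi, xi in zip(w, aug))
            y = 1 if v > 0 else 0
            if y != d:
                errors += 1
                w = [wi + lr * (d - y) * xi for wi, xi in zip(w, aug)]
        if errors == 0:                                # stop: all examples correct
            break
    return w

and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = train_perceptron(and_samples, 2)

def predict(xs):
    v = sum(wi * xi for wi, xi in zip(w, [-1.0] + list(xs)))
    return 1 if v > 0 else 0

print([predict(xs) for xs, _ in and_samples])  # -> [0, 0, 0, 1]
```

Because AND is linearly separable, the loop converges; on a non-separable problem such as XOR it would run until the epoch limit, which is exactly the infinite-cycle risk noted above.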

Author Information

Author: Beyza Nur Türkü, December 25, 2025 at 8:23 AM



Contents

  • Artificial Neural Network

    • Advantages

    • Disadvantages

  • Biological Neural Network

  • Mathematical Model of a Neuron

  • Single-Layer Artificial Neural Networks and Training

    • Perceptron Learning Rule
