This article was automatically translated from the original Turkish version.
Computer architecture is the field that defines the organization of a computer system’s hardware components and how they interact with software. It covers the design of processors, memory units, and input/output systems, and it strongly influences system performance and efficiency, evolving continuously in response to technological advances.
Von Neumann Architecture and the Instruction Cycle

The Von Neumann architecture, which underlies most modern computer designs, is based on the principle that program instructions and data are stored in the same memory. The architecture comprises the central processing unit (CPU), main memory, input/output (I/O) units, and the buses that connect them. This sequential model, in which a single processor core fetches and executes one instruction at a time, has dominated computer design for decades.
Fetch-Decode-Execute Cycle

The fundamental operating mechanism of the Von Neumann architecture is the fetch-decode-execute cycle: the CPU fetches an instruction from memory, decodes it to determine what action to perform, and then executes it. Because instructions and data travel over the same processor-memory connection and are processed sequentially, the bandwidth of that connection limits overall throughput as processor speeds grow, a limitation known as the Von Neumann bottleneck.
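The cycle can be sketched with a toy stored-program machine. The instruction set, memory layout, and register names below are invented for illustration; real CPUs encode instructions in binary rather than as tuples.

```python
# A minimal sketch of the fetch-decode-execute cycle on a toy machine.
# Instructions and data share one memory, following the stored-program
# principle of the Von Neumann architecture.

def run(memory):
    pc = 0          # program counter: address of the next instruction
    acc = 0         # accumulator register
    while True:
        instr = memory[pc]      # FETCH the instruction at pc
        op, arg = instr         # DECODE the opcode and operand
        pc += 1
        if op == "LOAD":        # EXECUTE the decoded operation
            acc = memory[arg]
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            return memory

# Cells 0-3 hold the program, cells 4-6 hold data: compute 2 + 3.
memory = [("LOAD", 4), ("ADD", 5), ("STORE", 6), ("HALT", None), 2, 3, 0]
run(memory)
print(memory[6])  # → 5
```

Note that the program could, in principle, overwrite its own instructions with a `STORE` into cells 0-3; that this is even possible is a direct consequence of instructions and data sharing one memory.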
Memory Hierarchy and Cache Systems

The memory hierarchy is a layered structure designed to manage the performance gap between processor speed and memory speed. It is ordered from the fastest and smallest units closest to the processor, such as registers and cache, to the slowest, largest, and most distant ones, such as main memory and disk storage.
Multi-Level Cache Architectures

Cache is a small, fast memory that accelerates access to frequently used data and instructions and operates much faster than main memory. Caches are typically organized in three levels:
L1 Cache: The smallest and fastest memory unit, dedicated to each processor core.
L2 Cache: Larger and slower than L1, typically dedicated to each core.
L3 Cache: The largest and slowest cache, usually shared among all processor cores.
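The effect of this layering can be quantified with the standard average memory access time (AMAT) recurrence: each level's average cost is its hit time plus its miss rate times the cost of the next level. The latencies and hit rates below are invented round numbers for illustration; real values depend on the specific processor.

```python
# Illustrative average memory access time for a three-level cache.
# AMAT(level) = hit_time + miss_rate * AMAT(next level).

def amat(levels, memory_latency):
    """levels: list of (hit_time_cycles, hit_rate), from L1 down to L3."""
    t = memory_latency                      # cost if every cache misses
    for hit_time, hit_rate in reversed(levels):
        t = hit_time + (1 - hit_rate) * t   # fold in one level at a time
    return t

levels = [(4, 0.90),    # L1: 4 cycles, 90% hit rate (assumed numbers)
          (12, 0.80),   # L2: 12 cycles, 80% hit rate
          (40, 0.50)]   # L3: 40 cycles, 50% hit rate
print(amat(levels, 200))  # → 8.0 cycles, versus 200 with no caches
```

Even with these modest assumed hit rates, the hierarchy cuts the average access cost from 200 cycles to 8, which is why small, fast caches close to the core pay off so dramatically.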
In multi-core systems, a cache coherence mechanism is required to prevent copies of the same data held in different caches from diverging.
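A toy sketch of the write-invalidate idea behind coherence protocols such as MESI: when one core writes an address, every other core's cached copy of that address is invalidated, so the next read must fetch the fresh value. This is a simplified illustration of the principle, not an implementation of any real protocol.

```python
# Write-invalidate cache coherence between two cores, sketched.

class Core:
    def __init__(self, bus):
        self.cache = {}          # address -> cached value
        self.bus = bus           # all cores snooping the shared bus
        bus.append(self)

    def read(self, mem, addr):
        if addr not in self.cache:       # miss: fetch from main memory
            self.cache[addr] = mem[addr]
        return self.cache[addr]

    def write(self, mem, addr, value):
        for other in self.bus:           # broadcast an invalidation so
            if other is not self:        # no other cache keeps a stale copy
                other.cache.pop(addr, None)
        self.cache[addr] = value
        mem[addr] = value                # write-through, for simplicity

bus, mem = [], {0: 10}
a, b = Core(bus), Core(bus)
a.read(mem, 0)           # both cores now cache address 0 with value 10
b.read(mem, 0)
a.write(mem, 0, 99)      # b's copy of address 0 is invalidated
print(b.read(mem, 0))    # → 99, not the stale 10
```

Without the invalidation broadcast, core `b` would keep returning 10 from its cache after `a` wrote 99, which is exactly the inconsistency coherence mechanisms exist to prevent.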
Parallel Processing and Multi-Core Systems

As the performance gains of single processor cores have reached physical limits, parallel processing architectures have been developed. These architectures enable multiple operations to be performed simultaneously.
Instruction-Level Parallelism and Multi-Core Systems

Instruction-Level Parallelism (ILP) aims to overlap the execution of multiple instructions so that more than one completes per clock cycle. Pipelining achieves this by processing several instructions simultaneously, each at a different stage of the instruction cycle. Superscalar architectures go further, using multiple execution units to issue more than one instruction per cycle. Modern processors commonly combine both techniques.
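The payoff of pipelining can be shown with the standard cycle-count formulas: without pipelining, n instructions on a k-stage datapath take n·k cycles, while an ideal k-stage pipeline (no stalls or hazards, an assumption that real code rarely meets) takes k + (n − 1) cycles, since after the first instruction fills the pipeline one instruction completes every cycle.

```python
# Illustrative cycle counts for an ideal k-stage pipeline.

def cycles_unpipelined(n, k):
    return n * k            # each instruction occupies all k stages alone

def cycles_pipelined(n, k):
    return k + (n - 1)      # fill the pipeline, then one result per cycle

n, k = 100, 5               # 100 instructions, 5 pipeline stages (assumed)
print(cycles_unpipelined(n, k))   # → 500
print(cycles_pipelined(n, k))     # → 104
print(cycles_unpipelined(n, k) / cycles_pipelined(n, k))  # speedup ≈ 4.8
```

As n grows, the speedup approaches k, which is why deeper pipelines were long a favored route to higher clock-normalized throughput; in practice, hazards and stalls keep real pipelines below this ideal.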
Specialized Parallel Processing Units: GPUs

Graphics Processing Units (GPUs) consist of thousands of small cores and operate on the Single Instruction, Multiple Thread (SIMT) principle. This architecture is designed for tasks requiring high levels of parallel computation, such as graphics rendering, artificial intelligence algorithms, and scientific computing.
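The SIMT idea can be modeled in a few lines: one instruction stream is applied in lockstep across many threads, each operating on its own data element. Real GPUs schedule groups of threads (warps) in hardware and execute them on parallel cores; this sequential sketch only models the programming principle, and the pixel-scaling workload is an invented example.

```python
# SIMT sketched: the same instruction applied to every thread's data.

def simt_execute(instruction, thread_data):
    """Run one instruction across all 'threads' in lockstep (simulated)."""
    return [instruction(x) for x in thread_data]

# Thousands of "threads", each transforming one pixel value.
pixels = list(range(4096))
result = simt_execute(lambda x: x * 2 + 1, pixels)
print(result[:3])   # → [1, 3, 5]
```

The key contrast with a CPU core is that the control logic (fetch and decode) is paid for once per instruction here, not once per data element, which is what lets GPUs devote their silicon to thousands of simple arithmetic units.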