
This article was automatically translated from the original Turkish version.

Ironwood TPU

Publication Date: April 9, 2025
Website: https://blog.google/products/google-cloud/ironwood-tpu-age-of-inference/

Ironwood is the seventh-generation Tensor Processing Unit (TPU) chip developed by Google and unveiled at the Google Cloud Next 2025 event. Unlike earlier TPU generations, which were aimed largely at training workloads, Ironwood is engineered specifically for inference. It targets applications that require high computational capacity, such as Large Language Models (LLMs), Mixture of Experts (MoE) architectures, and other inference-heavy tasks.

Technical Specifications

Each Ironwood chip delivers a peak of 4,614 TFLOPS in FP8 format. Its 192 GB of High Bandwidth Memory (HBM) is six times the capacity of the previous-generation Trillium TPU, and memory bandwidth reaches 7.2 TBps per chip. The architecture aims to minimize data-transfer latency and improve efficiency in large tensor operations.
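The relationship between the compute and memory figures above can be illustrated with a simple roofline-style calculation. The following is a minimal sketch, not a performance model from Google: it reads the per-chip compute figure as 4,614 TFLOPS, assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes), and treats the quoted numbers as peak values. The function names are illustrative.

```python
# Per-chip figures quoted in the article (peak values, not measured throughput).
PEAK_FP8_TFLOPS = 4614.0    # peak FP8 compute per chip
HBM_BANDWIDTH_TBPS = 7.2    # HBM bandwidth per chip, in terabytes per second
HBM_CAPACITY_GB = 192.0     # HBM capacity per chip


def ridge_point_flops_per_byte(peak_tflops: float, bandwidth_tbps: float) -> float:
    """Arithmetic intensity (FLOPs per byte) where compute and memory limits meet."""
    return (peak_tflops * 1e12) / (bandwidth_tbps * 1e12)


def attainable_tflops(intensity: float, peak_tflops: float, bandwidth_tbps: float) -> float:
    """Roofline bound: the lower of the compute ceiling and the bandwidth ceiling."""
    return min(peak_tflops, intensity * bandwidth_tbps)


def full_hbm_sweep_ms(capacity_gb: float, bandwidth_tbps: float) -> float:
    """Time in milliseconds to read the whole HBM once at peak bandwidth."""
    return capacity_gb * 1e9 / (bandwidth_tbps * 1e12) * 1e3


if __name__ == "__main__":
    ridge = ridge_point_flops_per_byte(PEAK_FP8_TFLOPS, HBM_BANDWIDTH_TBPS)
    print(f"Ridge point: {ridge:.1f} FLOPs/byte")
    print(f"Bound at 100 FLOPs/byte: "
          f"{attainable_tflops(100.0, PEAK_FP8_TFLOPS, HBM_BANDWIDTH_TBPS):.1f} TFLOPS")
    print(f"Full HBM sweep: {full_hbm_sweep_ms(HBM_CAPACITY_GB, HBM_BANDWIDTH_TBPS):.1f} ms")
```

Under these assumptions, a kernel needs roughly 640 FLOPs per byte of HBM traffic before it becomes compute-bound; below that, memory bandwidth is the limiting factor, which is why the bandwidth figure matters for inference workloads.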

Ironwood chips are linked by a high-bandwidth technology called Inter-Chip Interconnect (ICI) and grouped into structures known as “pods.” A single pod can hold up to 9,216 chips, for a total of 42.5 exaflops of compute. Google describes this as 24 times the processing power of El Capitan, one of the world’s most powerful supercomputers. Google offers two configurations: a 256-chip setup and a full 9,216-chip pod.
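The pod-level figure follows from multiplying the per-chip peak by the chip count. The sketch below reproduces that arithmetic for both configurations, reading the per-chip figure as 4,614 TFLOPS and assuming ideal linear scaling with no interconnect or utilization losses; the aggregate HBM figure is derived here and is not stated in the article.

```python
CHIPS_PER_POD = 9216               # full pod, per the article
SMALL_CONFIG_CHIPS = 256           # smaller configuration, per the article
PEAK_FP8_TFLOPS_PER_CHIP = 4614.0  # per-chip peak FP8 compute
HBM_GB_PER_CHIP = 192.0            # per-chip HBM capacity


def aggregate_exaflops(chips: int, tflops_per_chip: float) -> float:
    """Sum of per-chip peaks, in exaflops (1 exaflop = 1e6 teraflops)."""
    return chips * tflops_per_chip / 1e6


def aggregate_hbm_tb(chips: int, gb_per_chip: float) -> float:
    """Total HBM across all chips, in terabytes (decimal units)."""
    return chips * gb_per_chip / 1e3


if __name__ == "__main__":
    for chips in (SMALL_CONFIG_CHIPS, CHIPS_PER_POD):
        ef = aggregate_exaflops(chips, PEAK_FP8_TFLOPS_PER_CHIP)
        tb = aggregate_hbm_tb(chips, HBM_GB_PER_CHIP)
        print(f"{chips:>5} chips: {ef:.2f} exaflops peak FP8, {tb:.1f} TB HBM")
```

The full pod works out to about 42.52 exaflops, matching the quoted 42.5 figure, and the 256-chip configuration to about 1.18 exaflops.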


Technical Comparison of Ironwood with Previous Generation TPUs (Source: Google)

Ironwood chips use liquid cooling, which lets them sustain high clock frequencies under continuous heavy workloads. According to Google, Ironwood is twice as energy-efficient as Trillium and approximately 30 times more energy-efficient than Google’s first-generation TPU.


Ironwood is supported by Pathways, the machine learning runtime developed by Google DeepMind. Pathways coordinates distributed computation across chips and allows thousands of Ironwood chips to work together, including in multi-pod configurations. This setup is used for both training and inference of large models.


Energy Consumption Comparison Among TPUs (Source: Google)

Application Areas

Ironwood is used as a scalable artificial intelligence infrastructure in sectors such as finance, retail, telecommunications, and healthcare. In finance, it is applied to transaction data analysis; in retail and services, to customer service systems; and in healthcare, to medical data processing and decision support systems.

Structural Components

Ironwood chips include a specialized accelerator called SparseCore. This component is designed to speed up large-scale embedding workloads, such as those found in recommendation systems. Expanded SparseCore support in Ironwood also allows it to be used for models in financial and scientific domains.
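To make the term "embedding workload" concrete, the following pure-Python sketch shows the kind of sparse gather-and-sum lookup that recommendation models perform. It is only an illustration of the operation, not SparseCore's programming interface; the table, the IDs, and the function name are hypothetical.

```python
from typing import List, Sequence


def embedding_bag_sum(table: Sequence[Sequence[float]], ids: Sequence[int]) -> List[float]:
    """Gather the rows named by `ids` and sum them element-wise."""
    dim = len(table[0])
    pooled = [0.0] * dim
    for row_id in ids:
        row = table[row_id]  # sparse access: only the requested rows are read
        for j in range(dim):
            pooled[j] += row[j]
    return pooled


# Hypothetical 4-item catalogue with 3-dimensional embeddings.
item_table = [
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [1.0, 1.0, 1.0],
]

# Hypothetical interaction history: item 0 once, item 3 twice.
user_history = [0, 3, 3]
user_vector = embedding_bag_sum(item_table, user_history)
print(user_vector)  # [3.0, 2.0, 2.0]
```

In production models the table can have millions or billions of rows while each request touches only a handful, so the work is dominated by irregular memory access rather than dense arithmetic.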


Initially, Ironwood is configured for internal use within Google’s own systems. Models such as Gemini 2.5 are already running on this infrastructure. Its release to developers will follow the completion of performance optimization within Google’s internal infrastructure.


Ironwood delivers a four- to five-fold improvement in overall performance over the Trillium TPU. ICI bandwidth reaches 1.2 TBps bidirectional. Compared with Trillium, memory capacity has increased sixfold and memory bandwidth 4.5-fold. These improvements allow very large models to run more efficiently.
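The generational ratios can be inverted to see what baseline they imply. The sketch below divides the Ironwood per-chip figures quoted in the article (192 GB of HBM, 7.2 TBps of memory bandwidth) by the stated multipliers. The results are values implied by the article's rounded ratios, not official Trillium specifications.

```python
IRONWOOD_HBM_GB = 192.0     # per-chip HBM capacity quoted in the article
IRONWOOD_HBM_TBPS = 7.2     # per-chip HBM bandwidth quoted in the article
HBM_CAPACITY_RATIO = 6.0    # "sixfold" increase over Trillium
HBM_BANDWIDTH_RATIO = 4.5   # "4.5-fold" increase over Trillium


def implied_baseline(new_value: float, ratio: float) -> float:
    """Previous-generation value implied by a new value and its improvement ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return new_value / ratio


if __name__ == "__main__":
    capacity = implied_baseline(IRONWOOD_HBM_GB, HBM_CAPACITY_RATIO)
    bandwidth = implied_baseline(IRONWOOD_HBM_TBPS, HBM_BANDWIDTH_RATIO)
    print(f"Implied Trillium HBM capacity:  {capacity:.1f} GB")
    print(f"Implied Trillium HBM bandwidth: {bandwidth:.2f} TBps")
```

This gives an implied baseline of 32 GB of HBM and 1.6 TBps of memory bandwidth per Trillium chip.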

Enterprise-Level Benefits

The computational density and energy efficiency of the Ironwood chip allow AI applications to be delivered at lower cost. Reduced energy consumption also offers a sustainability advantage. Shorter processing times shorten model development and deployment cycles.


Google positions Ironwood as a core component of its AI infrastructure. The company’s $75 billion capital investment is part of its full-stack AI strategy. Ironwood is the hardware base of a three-layer architecture comprising hardware, foundation models, and software tools for running multi-agent systems.


Ironwood is a high-performance chip architecture designed for the era of inference-driven AI. Its compute power, memory capacity, and energy efficiency address the technical demands of inference workloads. Its suitability for enterprise use and its support for distributed computing make it possible to deploy AI applications across diverse industries.


Author Information

Author: Ömer Said Aydın
December 6, 2025 at 8:14 AM
