August 17, 2025
Google has introduced Ironwood, its seventh-generation Tensor Processing Unit (TPU), a custom chip built to run its advanced Gemini models and, in the company’s words, to mark the beginning of the “age of inference.” Google presents the chip as more than an incremental upgrade: it is a shift in the company’s AI infrastructure toward the heavy computation required by workloads Google calls “thinking.” The company considers Ironwood an essential part of that infrastructure because it is designed to work closely with advanced AI models, speeding up inference substantially and improving their ability to process contextual information.
Google believes Ironwood will enable stronger “agentic AI” capabilities. The term describes autonomous AI that acts on a user’s behalf, gathering information and processing data on its own to produce relevant results without step-by-step instructions. Google’s vision is of highly proactive, helpful AI assistants, with Ironwood supplying the compute to run them.
Ironwood is a major step up in performance and architecture from previous Google TPUs. To supply the compute that next-generation AI systems require, the chips are deployed in massive liquid-cooled clusters of up to 9,216 chips. An enhanced Inter-Chip Interconnect (ICI) links the chips and lets them communicate at very high speed, which is essential for moving data efficiently and reducing communication bottlenecks across such large distributed systems.
This scalable architecture will serve Google’s internal research and development as well as external developers on Google Cloud Platform. Google plans to offer Ironwood in two configurations: a 256-chip configuration for smaller deployments and research work, and a full 9,216-chip cluster for demanding AI workloads and large production systems.
A fully configured Ironwood pod delivers 42.5 exaflops of inference compute, an unprecedented level of processing capability for complex AI tasks. According to Google’s specifications, each Ironwood chip reaches a peak throughput of 4,614 TFLOPS, a major advance over earlier TPU generations.
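The pod figure can be checked against the per-chip figure with simple arithmetic. The sketch below uses only the numbers quoted in this article and assumes peak throughput scales perfectly linearly with chip count, which real workloads will not achieve; the function and constant names are illustrative, not part of any Google API.

```python
# Figures quoted in the article; all names here are illustrative.
PEAK_TFLOPS_PER_CHIP = 4_614
CONFIG_CHIP_COUNTS = {"256-chip configuration": 256, "9,216-chip pod": 9_216}


def pod_exaflops(chips: int, tflops_per_chip: float = PEAK_TFLOPS_PER_CHIP) -> float:
    """Aggregate peak throughput in exaflops, assuming ideal linear scaling."""
    return chips * tflops_per_chip / 1_000_000  # 1 exaflop = 1e6 teraflops


for name, chips in CONFIG_CHIP_COUNTS.items():
    print(f"{name}: {pod_exaflops(chips):.2f} exaflops peak")
```

Under that assumption the full pod works out to about 42.5 exaflops, matching the quoted figure, and the 256-chip configuration to a little over 1 exaflop.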
Memory capacity has also grown substantially to keep pace with the added compute. Each Ironwood chip carries 192 GB of high-bandwidth memory (HBM), six times the capacity of the Trillium TPU. The larger per-chip memory lets the TPU hold bigger models and datasets and reduces how often data must be transferred, which improves performance. Memory bandwidth reaches 7.2 TBps (terabytes per second) per chip, a 4.5x increase that speeds up data access and processing.
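The quoted ratios also imply a Trillium baseline and a total memory figure for a full pod. The following is a rough sketch that derives both from the numbers in this article; the derived values are arithmetic consequences of those figures rather than separately sourced specifications, and all names are illustrative.

```python
# Figures quoted in the article; all names here are illustrative.
IRONWOOD_HBM_GB = 192
HBM_CAPACITY_RATIO = 6.0      # Ironwood vs. Trillium capacity
IRONWOOD_HBM_BANDWIDTH = 7.2  # per chip, in the units the article quotes
HBM_BANDWIDTH_RATIO = 4.5     # Ironwood vs. Trillium bandwidth
POD_CHIPS = 9_216


def implied_baseline(new_value: float, ratio: float) -> float:
    """Previous-generation value implied by a quoted 'N-fold increase'."""
    return new_value / ratio


def pod_hbm_petabytes(chips: int = POD_CHIPS, gb_per_chip: int = IRONWOOD_HBM_GB) -> float:
    """Total HBM across a pod in decimal petabytes (1 PB = 1e6 GB)."""
    return chips * gb_per_chip / 1_000_000


trillium_hbm_gb = implied_baseline(IRONWOOD_HBM_GB, HBM_CAPACITY_RATIO)
trillium_bandwidth = implied_baseline(IRONWOOD_HBM_BANDWIDTH, HBM_BANDWIDTH_RATIO)

print(f"Implied Trillium HBM capacity: {trillium_hbm_gb:.0f} GB")
print(f"Implied Trillium HBM bandwidth: {trillium_bandwidth:.1f} (same units)")
print(f"Total HBM in a full pod: {pod_hbm_petabytes():.2f} PB")
```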
Benchmarking and Contextualizing Ironwood’s Capabilities
Google supplies performance context for Ironwood, since direct comparisons of AI hardware are difficult when benchmarking methods differ. The company uses FP8 precision as its main basis for measuring Ironwood’s performance. Its claim that Ironwood “pods” are 24 times faster than comparable portions of the world’s leading supercomputers should be read with caution, because some of those supercomputers have no native hardware support for FP8. When systems differ this much in hardware capability, direct comparisons can be inaccurate or simply irrelevant.
Those direct performance comparisons did not include Google’s own TPU v6 (Trillium). Google does say that Ironwood delivers twice the performance per watt of Trillium. Company representatives described Ironwood as the direct successor to TPU v5p, whereas Trillium succeeded the less powerful TPU v5e. Trillium reached a peak FP8 performance of roughly 918 TFLOPS in benchmark tests.
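Although Google did not publish a direct comparison, the two per-chip peak figures quoted in this article allow a rough one. The sketch below assumes both numbers are peak FP8 throughput measured on comparable terms, which the article does not confirm, so the result is indicative only; the names are illustrative.

```python
# Per-chip peak figures quoted in the article; names are illustrative.
IRONWOOD_PEAK_TFLOPS = 4_614
TRILLIUM_PEAK_FP8_TFLOPS = 918  # "roughly 918" per the article


def per_chip_ratio(new_tflops: float, old_tflops: float) -> float:
    """Ratio of peak per-chip throughput between two chips."""
    return new_tflops / old_tflops


ratio = per_chip_ratio(IRONWOOD_PEAK_TFLOPS, TRILLIUM_PEAK_FP8_TFLOPS)
print(f"Ironwood vs. Trillium peak per-chip throughput: about {ratio:.1f}x")
```

This is a ratio of peak throughput only; it says nothing about sustained performance on real workloads.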
Ironwood as a New Milestone for Agentic AI
Ironwood’s higher speed, larger memory capacity, and better power efficiency should benefit Google’s AI ecosystem broadly and enable more advanced AI applications. Ironwood aims to expand the potential of agentic AI by building on the existing infrastructure that already supports large language models and other AI systems such as Gemini 2.5.
Such AI systems would independently collect information from multiple sources and generate suitable responses or take actions for users with very little direct guidance. Google sees Ironwood as the driving force behind this new era of intelligent, autonomous AI interaction, one it expects to bring further advances in natural language processing and machine learning and more effective AI agents.




