Google introduced its eighth-generation Tensor Processing Units (TPUs), featuring two distinct chips for different AI workloads. The TPU 8t is built for high-throughput model training, while the TPU 8i is optimized for low-latency inference. This marks the first time Google has released separate hardware for training and for running models at the same time.
The specialized design addresses rising demand for AI agents, which require faster response times. The chips aim to deliver better performance and cost-efficiency than previous general-purpose TPUs.
The launch intensifies Google's competition with Nvidia and with other cloud providers such as AWS. Google Cloud customers will have access to the new hardware in late 2026.