It’s Google Cloud Next in Las Vegas this week, and that means it’s time for a bunch of new instance types and accelerators to hit the Google Cloud Platform. In addition to the new custom Arm-based Axion chips, most of this year’s announcements are about AI accelerators, whether built by Google or by Nvidia.
Only a few weeks ago, Nvidia announced its Blackwell platform. But don’t expect Google to offer those machines anytime soon. Support for the high-performance Nvidia HGX B200 for AI and HPC workloads and the GB200 NVL72 for large language model (LLM) training will arrive in early 2025. One interesting nugget from Google’s announcement: The GB200 servers will be liquid-cooled.
This may sound like a premature announcement, but Nvidia has said its Blackwell chips won’t be publicly available until the last quarter of this year.
Before Blackwell
For developers who need more power to train LLMs today, Google also announced the A3 Mega instance. This instance, which the company developed together with Nvidia, features the industry-standard H100 GPUs but combines them with a new networking system that can deliver up to twice the bandwidth per GPU.
Another new A3 instance is A3 Confidential, which Google described as enabling customers to “better protect the confidentiality and integrity of sensitive data and AI workloads during training and inferencing.” The company has long offered confidential computing services that encrypt data in use, and on this instance, once confidential computing is enabled, data transfers between the Intel CPU and the Nvidia H100 GPU are encrypted over a protected PCIe connection. No code changes are required, Google says.
As for Google’s own chips, the company on Tuesday launched its Cloud TPU v5p processors — the most powerful of its homegrown AI accelerators yet — into general availability. These chips feature a 2x improvement in floating point operations per second and a 3x improvement in memory bandwidth speed.
All of those fast chips need an underlying architecture that can keep up with them. So in addition to the new chips, Google on Tuesday also announced new AI-optimized storage options. Hyperdisk ML, now in preview, is the company’s next-gen block storage service, which Google says can improve model load times by up to 3.7x.
Google Cloud is also launching a number of more traditional instances, powered by Intel’s fourth- and fifth-generation Xeon processors. The new general-purpose C4 and N4 instances, for example, will feature the fifth-generation Emerald Rapids Xeons, with the C4 focused on performance and the N4 on price. The new C4 instances are now in private preview, and the N4 machines are generally available today.
Also new, but still in preview, are the C3 bare-metal machines, powered by older fourth-generation Intel Xeons, the X4 memory-optimized bare metal instances (also in preview) and the Z3, Google Cloud’s first storage-optimized virtual machine that promises to offer “the highest IOPS for storage optimized instances among leading clouds.”