Amazon SageMaker HyperPod expands support to G7e and r5d.16xlarge instances

Amazon SageMaker HyperPod now supports G7e and r5d.16xlarge instances, adding GPU capacity for inference and fine-tuning alongside memory-optimized CPU capacity for data preprocessing and orchestration workloads.

Amazon SageMaker HyperPod has expanded its capabilities with support for G7e and r5d.16xlarge instances. SageMaker HyperPod is purpose-built for developing, training, and deploying foundation models at scale. It provides a resilient, efficient environment with built-in fault tolerance, automated cluster recovery, and optimized libraries for distributed training, significantly reducing the operational complexity of managing large AI/ML infrastructure.

The G7e instances are equipped with NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs, delivering up to 2.3 times better inference performance than the previous-generation G6e instances, which translates into more requests served per second at lower latency. With up to 768 GB of total GPU memory, G7e instances can host larger language models or serve multiple models on a single endpoint. They are well suited to deploying large language models (LLMs), agentic AI, multimodal generative AI, and physical AI models. They are also cost-effective for single-node fine-tuning or training of natural language processing (NLP), computer vision, and smaller generative AI models, offering up to 1.27 times the TFLOPS and up to 4 times the GPU-to-GPU bandwidth of G6e.
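As a back-of-envelope check on whether a model fits within that 768 GB budget, the weight footprint alone can be estimated from parameter count and numeric precision. This is a rough sketch only: it ignores KV cache, activations, and framework overhead, and the model sizes shown are illustrative, not from the announcement.

```python
def weight_memory_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate GPU memory (GB) for model weights only.

    bytes_per_param: 2 for FP16/BF16, 1 for FP8/INT8 quantization.
    Excludes KV cache, activations, and runtime overhead.
    """
    # params_billions * 1e9 params * bytes, divided by 1e9 bytes per GB
    return params_billions * bytes_per_param

G7E_TOTAL_GPU_MEMORY_GB = 768  # maximum G7e configuration, per the announcement

# Illustrative sizes: a 70B-parameter model in BF16 needs roughly 140 GB
# for weights, leaving headroom for KV cache when sharded across GPUs.
for size_b in (8, 70, 180):
    needed = weight_memory_gb(size_b)
    verdict = "fits" if needed < G7E_TOTAL_GPU_MEMORY_GB else "does not fit"
    print(f"{size_b}B params @ BF16: ~{needed:.0f} GB ({verdict})")
```

The same function with `bytes_per_param=1` models FP8 or INT8 quantization, which halves the weight footprint.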

The addition of r5d.16xlarge instances further extends HyperPod's capabilities. Each r5d.16xlarge instance provides 64 vCPUs, 512 GB of memory, and 5 × 600 GB of NVMe SSD instance storage, powered by Intel Xeon Platinum 8000 series processors with a sustained all-core turbo frequency of up to 3.1 GHz. This configuration is well suited to distributed training-data preprocessing, particularly with frameworks such as Ray, as well as large-scale feature engineering and memory-intensive orchestration services running alongside GPU compute.
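As an illustration, a HyperPod cluster mixing the two instance families could be created with the `aws sagemaker create-cluster` CLI. This is a hedged sketch: the cluster and group names, S3 lifecycle-script URI, role ARN, and the `ml.g7e.48xlarge` size string are placeholders, so confirm the exact instance type names available in your Region before using it.

```shell
# cluster.json -- names, S3 paths, the role ARN, and the G7e size string are placeholders
cat > cluster.json <<'EOF'
{
  "ClusterName": "my-hyperpod-cluster",
  "InstanceGroups": [
    {
      "InstanceGroupName": "gpu-inference",
      "InstanceType": "ml.g7e.48xlarge",
      "InstanceCount": 2,
      "LifeCycleConfig": {
        "SourceS3Uri": "s3://my-bucket/lifecycle-scripts/",
        "OnCreate": "on_create.sh"
      },
      "ExecutionRole": "arn:aws:iam::111122223333:role/MyHyperPodRole"
    },
    {
      "InstanceGroupName": "cpu-preprocessing",
      "InstanceType": "ml.r5d.16xlarge",
      "InstanceCount": 4,
      "LifeCycleConfig": {
        "SourceS3Uri": "s3://my-bucket/lifecycle-scripts/",
        "OnCreate": "on_create.sh"
      },
      "ExecutionRole": "arn:aws:iam::111122223333:role/MyHyperPodRole"
    }
  ]
}
EOF

aws sagemaker create-cluster --cli-input-json file://cluster.json
```

Defining the CPU preprocessing nodes as a separate instance group lets the r5d.16xlarge fleet scale independently of the GPU group.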

G7e instances are available in the US East (N. Virginia), US East (Ohio), US West (Oregon), and Asia Pacific (Tokyo) Regions. The r5d.16xlarge instances are available in all Regions where Amazon SageMaker HyperPod is offered.