To accelerate ML, AI and high-performance computing (HPC) workloads that you run in your Managed Service for Kubernetes clusters with GPUs, you can interconnect the GPUs using InfiniBand, a high-throughput, low-latency networking standard. For more details about InfiniBand in Nebius AI Cloud, see the Compute documentation. In this article, you will learn how to set up InfiniBand in a Managed Kubernetes cluster.
How to enable InfiniBand for a node group
In the web console, in the node group creation form (Compute → Kubernetes → your cluster → Node groups → Create node group), under Computing resources:
- Select With GPU.
- Select a platform and a preset compatible with GPU clusters. The compatible platforms and presets:

  | Platform | Presets | Regions |
  | --- | --- | --- |
  | NVIDIA® B300 NVLink with Intel Granite Rapids (gpu-b300-sxm) | 8gpu-192vcpu-2768gb | uk-south1 |
  | NVIDIA® B200 NVLink with Intel Emerald Rapids (gpu-b200-sxm) | 8gpu-160vcpu-1792gb | us-central1 |
  | NVIDIA® B200 NVLink with Intel Emerald Rapids (gpu-b200-sxm-a) | 8gpu-160vcpu-1792gb | me-west1 |
  | NVIDIA® H200 NVLink with Intel Sapphire Rapids (gpu-h200-sxm) | 8gpu-128vcpu-1600gb | eu-north1, eu-north2, eu-west1, us-central1 |
  | NVIDIA® H100 NVLink with Intel Sapphire Rapids (gpu-h100-sxm) | 8gpu-128vcpu-1600gb | eu-north1 |

- Select a GPU cluster or create one. If the field is inactive, make sure you have selected a compatible platform and preset.
- Under GPU settings, keep the Install NVIDIA GPU drivers and other components option enabled. If you want to install the drivers manually, disable this option.
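If the GPU cluster field stays inactive, it can help to check your choice against the compatibility list before retrying. The following is a minimal sketch that copies the platforms, presets and regions listed in the steps above into a lookup table. The function name `supports_gpu_cluster` and the dictionary layout are hypothetical helpers for illustration, not part of any Nebius tool or API, and the data reflects only what this page lists.

```python
# Compatibility data transcribed from the list of platforms and presets above.
# Keys are platform IDs; values are (preset, set of regions).
# This structure is an illustrative assumption, not a Nebius API.
COMPATIBLE = {
    "gpu-b300-sxm": ("8gpu-192vcpu-2768gb", {"uk-south1"}),
    "gpu-b200-sxm": ("8gpu-160vcpu-1792gb", {"us-central1"}),
    "gpu-b200-sxm-a": ("8gpu-160vcpu-1792gb", {"me-west1"}),
    "gpu-h200-sxm": (
        "8gpu-128vcpu-1600gb",
        {"eu-north1", "eu-north2", "eu-west1", "us-central1"},
    ),
    "gpu-h100-sxm": ("8gpu-128vcpu-1600gb", {"eu-north1"}),
}


def supports_gpu_cluster(platform: str, preset: str, region: str) -> bool:
    """Return True if the combination appears in the compatibility list."""
    entry = COMPATIBLE.get(platform)
    if entry is None:
        return False  # platform is not listed as GPU-cluster compatible
    expected_preset, regions = entry
    return preset == expected_preset and region in regions


if __name__ == "__main__":
    print(supports_gpu_cluster("gpu-h200-sxm", "8gpu-128vcpu-1600gb", "eu-west1"))
    print(supports_gpu_cluster("gpu-h100-sxm", "8gpu-128vcpu-1600gb", "us-central1"))
```

The first call uses a listed combination and the second uses a region that is not listed for that platform, so the two calls show a match and a mismatch.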
Example: NCCL tests
To test InfiniBand performance in a Managed Service for Kubernetes cluster, you can run the NVIDIA NCCL tests in the cluster. For instructions, see our tutorial.
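When reading NCCL test results, the all-reduce benchmark reports both an algorithm bandwidth (data size divided by elapsed time) and a bus bandwidth, which scales the algorithm bandwidth by 2(n-1)/n for n ranks so that results are comparable across different rank counts. The sketch below reproduces that arithmetic. The function name and the sample numbers are hypothetical and are not measurements from any Nebius cluster.

```python
def allreduce_bus_bandwidth(size_bytes: float, time_seconds: float, num_ranks: int) -> float:
    """Bus bandwidth in bytes per second for an all-reduce.

    Uses the correction factor applied by the NCCL tests for all-reduce:
    busbw = algbw * 2 * (n - 1) / n, where algbw = size / time.
    """
    if num_ranks < 2:
        raise ValueError("all-reduce needs at least two ranks")
    if time_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    alg_bw = size_bytes / time_seconds
    return alg_bw * 2 * (num_ranks - 1) / num_ranks


if __name__ == "__main__":
    # Hypothetical example: an 8 GB all-reduce across 16 GPUs taking 50 ms.
    bus_bw = allreduce_bus_bandwidth(8e9, 0.05, 16)
    print(f"bus bandwidth: {bus_bw / 1e9:.1f} GB/s")
```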
InfiniBand and InfiniBand Trade Association are registered trademarks of the InfiniBand Trade Association.