# InfiniBand™ networking for Compute virtual machines with GPUs

You can group your virtual machines with GPUs into a *GPU cluster*. A cluster accelerates high-performance computing (HPC) tasks, such as model training and inference, that require more processing power than a single VM can provide.

GPU clusters are built on secure, high-speed InfiniBand networking. Each GPU in a VM is connected through a network interface card (NIC) that provides 400 Gbps. Because a Compute VM for GPU clusters has 8 GPUs, the total bandwidth of a node is 3.2 Tbps.
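
The node figure follows directly from the per-GPU figure. The arithmetic can be sketched in the shell as follows; the variable names are illustrative and not part of any Nebius tooling, and the results are nominal line rates rather than measured throughput:

```bash
# Illustrative variable names; the values come from the paragraph above.
gpus_per_vm=8     # GPUs in one Compute VM for GPU clusters, one NIC each
nic_gbps=400      # bandwidth of each NIC, in gigabits per second

total_gbps=$(( gpus_per_vm * nic_gbps ))
# Integer-only conversion to Tbps with one decimal place.
total_tbps="$(( total_gbps / 1000 )).$(( total_gbps % 1000 / 100 ))"

# 8 bits per byte: the same rates in gigabytes per second.
nic_gbytes=$(( nic_gbps / 8 ))
total_gbytes=$(( total_gbps / 8 ))

echo "Per NIC:  ${nic_gbps} Gbps (${nic_gbytes} GB/s)"
echo "Per node: ${total_gbps} Gbps = ${total_tbps} Tbps (${total_gbytes} GB/s)"
```

The GB/s values are convenient when comparing against tools that report bandwidth in bytes per second, such as the NCCL tests.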

Nebius AI Cloud uses GPUDirect RDMA, an NVIDIA remote direct memory access (RDMA) technology that lets data flow directly between each GPU and its NIC. Bypassing the CPU speeds up data exchange.

## InfiniBand fabrics

Each GPU cluster is created in one of the physical *InfiniBand fabrics*. A fabric is where the GPUs interconnected over InfiniBand are located. Each fabric has limited GPU capacity.

When creating a GPU cluster, select an InfiniBand fabric for it. Take into account the type of GPUs you are going to use. For example, if you select `fabric-7`, you can only add VMs on the NVIDIA® H200 NVLink with Intel Sapphire Rapids platform to this cluster.

Available fabrics and corresponding regions ([private regions](../../../overview/regions) are marked with \*):

| Fabric                     | GPU platform                                                                | [Region](../../../overview/regions)                                                        |
| -------------------------- | --------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
| `fabric-2`                 | NVIDIA® H100 NVLink with Intel Sapphire Rapids (<code>gpu-h100-sxm</code>)  | <code>eu-north1</code>                                                                     |
| `fabric-3`                 | NVIDIA® H100 NVLink with Intel Sapphire Rapids (<code>gpu-h100-sxm</code>)  | <code>eu-north1</code>                                                                     |
| `fabric-4`                 | NVIDIA® H100 NVLink with Intel Sapphire Rapids (<code>gpu-h100-sxm</code>)  | <code>eu-north1</code>                                                                     |
| `fabric-5`                 | NVIDIA® H200 NVLink with Intel Sapphire Rapids (<code>gpu-h200-sxm</code>)  | <code>eu-west1</code>                                                                      |
| `fabric-6`                 | NVIDIA® H100 NVLink with Intel Sapphire Rapids (<code>gpu-h100-sxm</code>)  | <code>eu-north1</code>                                                                     |
| `fabric-7`                 | NVIDIA® H200 NVLink with Intel Sapphire Rapids (<code>gpu-h200-sxm</code>)  | <code>eu-north1</code>                                                                     |
| <code>eu-north2-a</code>   | NVIDIA® H200 NVLink with Intel Sapphire Rapids (<code>gpu-h200-sxm</code>)  | <code>eu-north2</code> <Tooltip href="/overview/regions" cta="Private region">\*</Tooltip> |
| <code>me-west1-a</code>    | NVIDIA® B200 NVLink with Intel Emerald Rapids (<code>gpu-b200-sxm-a</code>) | <code>me-west1</code>                                                                      |
| <code>uk-south1-a</code>   | NVIDIA® B300 NVLink with Intel Granite Rapids (<code>gpu-b300-sxm</code>)   | <code>uk-south1</code><Tooltip href="/overview/regions" cta="Private region">\*</Tooltip>  |
| <code>us-central1-a</code> | NVIDIA® H200 NVLink with Intel Sapphire Rapids (<code>gpu-h200-sxm</code>)  | <code>us-central1</code>                                                                   |
| <code>us-central1-b</code> | NVIDIA® B200 NVLink with Intel Emerald Rapids (<code>gpu-b200-sxm</code>)   | <code>us-central1</code>                                                                   |

<Note>
  In most cases, you do not need to change the preselected fabric. We recommend that you create a GPU cluster in another fabric only if it is better suited for a different platform or if you experience capacity issues with an existing GPU cluster.
</Note>
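
If you script fabric selection, the table above can be turned into a small lookup. The following is a minimal sketch, assuming a hypothetical helper named `fabrics_for`; its rows are copied from the table at the time of writing and can become outdated, so treat the table itself as the source of truth:

```bash
# Hypothetical helper, not part of the Nebius AI Cloud CLI.
# Prints every fabric that matches the given region and platform ID.
fabrics_for() {
  local region="$1" platform="$2"
  # Each row: <fabric> <platform ID> <region>, copied from the table above.
  while read -r fabric fabric_platform fabric_region; do
    if [ "$fabric_region" = "$region" ] && [ "$fabric_platform" = "$platform" ]; then
      echo "$fabric"
    fi
  done <<'EOF'
fabric-2 gpu-h100-sxm eu-north1
fabric-3 gpu-h100-sxm eu-north1
fabric-4 gpu-h100-sxm eu-north1
fabric-5 gpu-h200-sxm eu-west1
fabric-6 gpu-h100-sxm eu-north1
fabric-7 gpu-h200-sxm eu-north1
eu-north2-a gpu-h200-sxm eu-north2
me-west1-a gpu-b200-sxm-a me-west1
uk-south1-a gpu-b300-sxm uk-south1
us-central1-a gpu-h200-sxm us-central1
us-central1-b gpu-b200-sxm us-central1
EOF
}

# One match for H200 in eu-north1, several for H100.
fabrics_for eu-north1 gpu-h200-sxm
fabrics_for eu-north1 gpu-h100-sxm
```

When several fabrics match, as for `gpu-h100-sxm` in `eu-north1`, the choice between them depends on available capacity, which this sketch does not check.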

## How to enable InfiniBand for VMs with GPUs

<Tabs>
  <Tab title="Web console">
    1. Create a GPU cluster:

       1. In the sidebar, go to <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/sidebar/compute.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=b91340217b08a1456d88ae0347f281d1" width="16" height="16" data-path="_assets/sidebar/compute.svg" /> **Compute** → **GPU clusters**.
       2. Click <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/plus.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=7c9efc69d65fc58db0eb73702fd81aa1" width="16" height="16" data-path="_assets/plus.svg" /> **Create GPU cluster**.
       3. On the page that opens, specify the cluster name. The name must be 3 to 63 characters long and can contain only lowercase letters, numbers and hyphens.
       4. Select the InfiniBand fabric.
       5. Click **Create GPU cluster**.

    2. Add VMs to the cluster. You can add a VM to a GPU cluster only when creating the VM:

           <Warning>
             All virtual machines added to the GPU cluster, including Managed Service for Kubernetes® nodes, must be in the same [project](../../../iam/overview#projects).
           </Warning>

       1. In the sidebar, go to <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/sidebar/compute.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=b91340217b08a1456d88ae0347f281d1" width="16" height="16" data-path="_assets/sidebar/compute.svg" /> **Compute** → **Virtual machines**.
       2. Click <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/plus.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=7c9efc69d65fc58db0eb73702fd81aa1" width="16" height="16" data-path="_assets/plus.svg" /> **Create virtual machine**.
       3. On the page that opens, specify the VM's details and select the GPU cluster name in the **GPU cluster** list.

    You can also create a GPU cluster while [creating the first VM in it](../../virtual-machines/manage):

    1. In the **Computing resources** section of the VM creation form:

       1. Select **With GPU**.
       2. Select a platform and a preset compatible with GPU clusters. The compatible platforms and presets:

          | Platform                                                                | Presets               | [Regions](/overview/regions)                                                                                             |
          | ----------------------------------------------------------------------- | --------------------- | ------------------------------------------------------------------------------------------------------------------------ |
          | NVIDIA® B300 NVLink with Intel Granite Rapids  <br />(`gpu-b300-sxm`)   | `8gpu-192vcpu-2768gb` | `uk-south1`*<Tooltip href="/overview/regions" cta="Private region">\*</Tooltip>*                                         |
          | NVIDIA® B200 NVLink with Intel Emerald Rapids  <br />(`gpu-b200-sxm`)   | `8gpu-160vcpu-1792gb` | `us-central1`                                                                                                            |
          | NVIDIA® B200 NVLink with Intel Emerald Rapids  <br />(`gpu-b200-sxm-a`) | `8gpu-160vcpu-1792gb` | `me-west1`                                                                                                               |
          | NVIDIA® H200 NVLink with Intel Sapphire Rapids  <br />(`gpu-h200-sxm`)  | `8gpu-128vcpu-1600gb` | `eu-north1`, `eu-north2`*<Tooltip href="/overview/regions" cta="Private region">\*</Tooltip>*, `eu-west1`, `us-central1` |
          | NVIDIA® H100 NVLink with Intel Sapphire Rapids  <br />(`gpu-h100-sxm`)  | `8gpu-128vcpu-1600gb` | `eu-north1`                                                                                                              |

    2. In the **Boot disk** section of the VM creation form, select the boot disk for NVIDIA GPUs. For details, see [Boot disk images for Compute virtual machines](../../storage/boot-disk-images).
  </Tab>

  <Tab title="CLI">
    1. Check that your project ID is saved in the Nebius AI Cloud CLI profile configuration:
       ```bash theme={null}
       cat ~/.nebius/config.yaml
       ```

    2. If you have not set your project ID as `parent-id`, or you want to create resources in a different project, [get the project ID](/iam/manage-projects#how-to-get-a-project-id) and update your [CLI profile](/cli/configure):
       ```bash theme={null}
       nebius profile update --parent-id <project_ID>
       ```

    3. Depending on your project’s [region](../../../overview/regions), select an [InfiniBand fabric](#infiniband-fabrics) for VM interconnection and save it to an environment variable:

       ```bash theme={null}
       export INFINIBAND_FABRIC=<fabric>
       ```

    4. Create a GPU cluster and save its ID:

       ```bash theme={null}
       export GPU_CLUSTER_ID=$(nebius compute gpu-cluster create \
         --name <gpu_cluster_name> \
         --infiniband-fabric "$INFINIBAND_FABRIC" \
         --format json \
         | jq -r ".metadata.id")
       ```

       Where:

       * `--name`: A cluster name that you can use to quickly find the cluster.
       * `--infiniband-fabric`: The InfiniBand fabric that you selected in the previous step.

    5. Create a boot disk optimized for VMs with NVIDIA GPUs:

       ```bash theme={null}
       export BOOT_DISK_ID=$(nebius compute disk create \
         --name <boot_disk_name> \
         --size-gibibytes 200 \
         --type network_ssd \
         --source-image-family-image-family ubuntu24.04-cuda13.0 \
         --block-size-bytes 4096 \
         --format json \
         | jq -r ".metadata.id")
       ```

       For compatible boot disk images (`--source-image-family-image-family`), see [Boot disk images](../../storage/boot-disk-images).

    6. [Create a virtual machine](../../virtual-machines/manage#create-a-vm) with GPUs and specify the GPU cluster ID in its parameters.

           <Warning>
             All virtual machines added to the GPU cluster, including Managed Service for Kubernetes nodes, must be in the same project.
           </Warning>

       The way you set these values depends on how you pass parameters to the <code>nebius compute instance create</code> command:

       * *JSON*

         Use `.spec.gpu_cluster.id` for the GPU cluster ID. Specify a VM platform with GPUs in `.spec.resources.platform`, and a preset compatible with GPU clusters in `.spec.resources.preset`. The compatible platforms and presets are:

         | Platform                                                                | Presets               | [Regions](/overview/regions)                                                                                             |
         | ----------------------------------------------------------------------- | --------------------- | ------------------------------------------------------------------------------------------------------------------------ |
         | NVIDIA® B300 NVLink with Intel Granite Rapids  <br />(`gpu-b300-sxm`)   | `8gpu-192vcpu-2768gb` | `uk-south1`*<Tooltip href="/overview/regions" cta="Private region">\*</Tooltip>*                                         |
         | NVIDIA® B200 NVLink with Intel Emerald Rapids  <br />(`gpu-b200-sxm`)   | `8gpu-160vcpu-1792gb` | `us-central1`                                                                                                            |
         | NVIDIA® B200 NVLink with Intel Emerald Rapids  <br />(`gpu-b200-sxm-a`) | `8gpu-160vcpu-1792gb` | `me-west1`                                                                                                               |
         | NVIDIA® H200 NVLink with Intel Sapphire Rapids  <br />(`gpu-h200-sxm`)  | `8gpu-128vcpu-1600gb` | `eu-north1`, `eu-north2`*<Tooltip href="/overview/regions" cta="Private region">\*</Tooltip>*, `eu-west1`, `us-central1` |
         | NVIDIA® H100 NVLink with Intel Sapphire Rapids  <br />(`gpu-h100-sxm`)  | `8gpu-128vcpu-1600gb` | `eu-north1`                                                                                                              |

         For example:

         ```json theme={null}
         {
           "spec": {
             "resources": {
               "platform": "gpu-h100-sxm",
               "preset": "8gpu-128vcpu-1600gb"
             },
             "gpu_cluster": {
               "id": "$GPU_CLUSTER_ID"
             },
             "boot_disk": {
               "attach_mode": "READ_WRITE",
               "existing_disk": {
                 "id": "$BOOT_DISK_ID"
               }
             },
             ...
           },
           ...
         }
         ```

       * *CLI parameters*

         Use `--gpu-cluster-id` for the GPU cluster ID and `--boot-disk-existing-disk-id` for the boot disk ID. Specify a VM platform with GPUs in `--resources-platform`, and a preset compatible with GPU clusters in `--resources-preset`. The compatible platforms and presets are:

         | Platform                                                                | Presets               | [Regions](/overview/regions)                                                                                             |
         | ----------------------------------------------------------------------- | --------------------- | ------------------------------------------------------------------------------------------------------------------------ |
         | NVIDIA® B300 NVLink with Intel Granite Rapids  <br />(`gpu-b300-sxm`)   | `8gpu-192vcpu-2768gb` | `uk-south1`*<Tooltip href="/overview/regions" cta="Private region">\*</Tooltip>*                                         |
         | NVIDIA® B200 NVLink with Intel Emerald Rapids  <br />(`gpu-b200-sxm`)   | `8gpu-160vcpu-1792gb` | `us-central1`                                                                                                            |
         | NVIDIA® B200 NVLink with Intel Emerald Rapids  <br />(`gpu-b200-sxm-a`) | `8gpu-160vcpu-1792gb` | `me-west1`                                                                                                               |
         | NVIDIA® H200 NVLink with Intel Sapphire Rapids  <br />(`gpu-h200-sxm`)  | `8gpu-128vcpu-1600gb` | `eu-north1`, `eu-north2`*<Tooltip href="/overview/regions" cta="Private region">\*</Tooltip>*, `eu-west1`, `us-central1` |
         | NVIDIA® H100 NVLink with Intel Sapphire Rapids  <br />(`gpu-h100-sxm`)  | `8gpu-128vcpu-1600gb` | `eu-north1`                                                                                                              |

         For example:

         ```bash theme={null}
         nebius compute instance create \
           --resources-platform gpu-h100-sxm \
           --resources-preset 8gpu-128vcpu-1600gb \
           --gpu-cluster-id "$GPU_CLUSTER_ID" \
           --boot-disk-existing-disk-id "$BOOT_DISK_ID" \
           ...
         ```
  </Tab>
</Tabs>
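
When scripting the CLI steps above, two quick checks can catch mistakes early. The following is a minimal sketch, assuming two hypothetical helper functions that are not part of the Nebius AI Cloud CLI. The name check encodes only the rule stated for the web console (3 to 63 characters: lowercase letters, numbers and hyphens); the service may apply further constraints. The ID check guards against an ID capture that produced nothing, for example because the `create` command failed.

```bash
# Hypothetical helpers, not part of the Nebius AI Cloud CLI.

# Succeeds if the name has 3 to 63 characters, all of them
# lowercase letters, digits or hyphens.
is_valid_cluster_name() {
  # LC_ALL=C keeps the [a-z] range limited to ASCII lowercase letters.
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[a-z0-9-]{3,63}$'
}

# Succeeds if the value looks like a captured ID: non-empty and not
# the literal "null" that `jq -r` prints for a missing field.
is_captured_id() {
  [ -n "$1" ] && [ "$1" != "null" ]
}

is_valid_cluster_name "training-cluster-01" && echo "name ok"
is_valid_cluster_name "My_Cluster" || echo "name rejected"
is_captured_id "null" || echo "ID was not captured"
```

In a script, you could call `is_captured_id "$GPU_CLUSTER_ID"` and `is_captured_id "$BOOT_DISK_ID"` before creating the virtual machine and stop if either check fails.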

## How to test the connection with the NCCL tests

To test InfiniBand performance in a GPU cluster, you can run the NVIDIA NCCL tests in it. For instructions, see our tutorial on running distributed jobs with [MPIrun](../../../3p-integrations/mpirun), which uses the NCCL test as an example.

## How to delete a GPU cluster

Before deleting a GPU cluster, make sure all virtual machines in the cluster are deleted or moved to another cluster.

<Tabs group="interfaces">
  <Tab title="Web console">
    1. In the sidebar, go to <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/sidebar/compute.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=b91340217b08a1456d88ae0347f281d1" width="16" height="16" data-path="_assets/sidebar/compute.svg" /> **Compute** → **GPU clusters**.
    2. In the row of the GPU cluster you want to delete, click <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/button-vellipsis.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=e80b8e57c43bfd117679262e6a1334ad" width="12" height="24" data-path="_assets/button-vellipsis.svg" /> → **Delete**.
    3. In the window that opens, confirm the deletion.
  </Tab>

  <Tab title="CLI">
    1. Get the ID of the GPU cluster you want to delete:

       ```bash theme={null}
       nebius compute gpu-cluster list
       ```

    2. Delete the GPU cluster:

       ```bash theme={null}
       nebius compute gpu-cluster delete <GPU_cluster_ID>
       ```
  </Tab>
</Tabs>

## See also

* [How to test a GPU cluster physical state in Compute](./test)
* [How to create a virtual machine in Nebius AI Cloud](../../virtual-machines/manage)
* [Running the all-reduce NCCL performance test in Soperator clusters](../../../slurm-soperator/jobs/examples/nccl-all-reduce)

***

*InfiniBand and InfiniBand Trade Association are registered trademarks of the InfiniBand Trade Association.*
