To set up the infrastructure for ML workloads, create a virtual machine (VM) with eight GPUs and a shared filesystem for training, and a VM with one GPU for inference. In this guide, we will use the Nebius AI Cloud CLI to create VMs in a project in the eu-north1 region.

Before you start

Install the Nebius AI Cloud CLI

The Nebius AI Cloud CLI manages all Nebius AI Cloud resources. For more details, see the Nebius AI Cloud CLI documentation. To install and initialize the Nebius AI Cloud CLI, run the following commands one by one:
curl -sSL https://storage.eu-north1.nebius.cloud/cli/install.sh | bash
nebius profile create
The last command, nebius profile create, will open the Nebius AI Cloud web console sign-in screen in your browser. Sign in to the web console to complete the initialization. After that, get the project ID and save it in the CLI configuration:
nebius config set parent-id <project_ID>

Install jq

In this guide, we will use jq to extract IDs and tokens from JSON data returned by the Nebius AI Cloud CLI. For more details, see the jq documentation.
sudo apt-get install jq
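
The commands in this guide pipe CLI output into jq and extract the .metadata.id field. As a quick check that jq is installed and behaves as expected, you can run the same filter on a sample JSON document shaped like the CLI output (the ID value below is made up):

```shell
# Hypothetical sample of the JSON shape the CLI returns; the ID is made up
echo '{"metadata": {"id": "computedisk-example"}}' | jq -r ".metadata.id"
# prints: computedisk-example
```

The -r flag makes jq print the raw string without surrounding quotes, which is what you want when saving a value to an environment variable.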

Generate keys for SSH access to the VM

Generate a key pair for SSH access to the VM and save it to the default location:
ssh-keygen -t ed25519
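
If you prefer to skip the interactive prompts (for example, when provisioning from a script), ssh-keygen accepts the key path and passphrase as flags; the path below is the default one, and the empty passphrase is a choice you may want to change:

```shell
# Generate an Ed25519 key pair at the default path with an empty passphrase
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519 -N "" -q
# Print the public key that cloud-init will install on the VM
cat ~/.ssh/id_ed25519.pub
```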

Create a VM with eight GPUs, InfiniBand™ connectivity, and a shared filesystem for training

  1. Create a boot disk and save its ID to an environment variable:
    export TR_VM_BOOT_DISK_ID=$(nebius compute disk create \
      --name training-vm-disk-1 \
      --size-gibibytes 200 \
      --type network_ssd \
      --source-image-family-image-family ubuntu22.04-cuda12 \
      --block-size-bytes 4096 \
      --format json | jq -r ".metadata.id")
    
    The command creates a 200 GiB SSD disk with a 4 KiB block size and an Ubuntu boot image with pre-installed NVIDIA GPU drivers. For details about boot disk images (--source-image-family-image-family), see Boot disk images for Compute virtual machines.
  2. Create a shared filesystem and save its ID to an environment variable:
    export TR_VM_FILESYSTEM_ID=$(nebius compute filesystem create \
      --name training-vm-filesystem-1 \
      --size-gibibytes 1024 \
      --type network_ssd \
      --block-size-bytes 4096 \
      --format json | jq -r ".metadata.id")
    
    The command creates a 1 TiB SSD shared filesystem with 4 KiB blocks.
  3. Get the subnet ID and save it to an environment variable:
    export SUBNET_ID=$(nebius vpc subnet list \
      --format json \
      | jq -r ".items[0].metadata.id")
    
    The command takes the first subnet in the project. Example subnet ID: vpcsubnet-e0dcbaa76x2024xyz8.
  4. For high-speed networking and efficient training, consider interconnecting the GPUs of multiple VMs over InfiniBand by placing the VMs in a GPU cluster. To do this, create the GPU cluster before creating the VM and save its ID so the VM can join it:
    export GPU_CLUSTER_ID=$(nebius compute gpu-cluster create \
      --name gpu-cluster-name \
      --infiniband-fabric fabric-3 \
      --format json \
      | jq -r ".metadata.id")
    
  5. Create a VM with 8 GPUs for training:
    export NETWORK_INTERFACE_NAME=training-vm-network-interface
    export USER_DATA=$(jq -Rs '.' <<EOF
    users:
      - name: user
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
        ssh_authorized_keys:
          - $(cat ~/.ssh/id_ed25519.pub)
    EOF
    )
    
    export TR_VM_ID=$(nebius compute instance create \
      --format json \
      ${GPU_CLUSTER_ID:+--gpu-cluster-id} ${GPU_CLUSTER_ID:+"$GPU_CLUSTER_ID"} \
      - <<EOF | jq -r ".metadata.id"
    {
      "metadata": {
        "name": "training-instance"
      },
      "spec": {
        "stopped": false,
        "cloud_init_user_data": $USER_DATA,
        "resources": {
          "platform": "gpu-h100-sxm",
          "preset": "8gpu-128vcpu-1600gb"
        },
        "boot_disk": {
          "attach_mode": "READ_WRITE",
          "existing_disk": {
            "id": "$TR_VM_BOOT_DISK_ID"
          }
        },
        "filesystems": [
          {
            "attach_mode": "READ_WRITE",
            "mount_tag": "training-vm-filesystem-1",
            "existing_filesystem": {
              "id": "$TR_VM_FILESYSTEM_ID"
            }
          }
        ],
        "network_interfaces": [
          {
            "name": "$NETWORK_INTERFACE_NAME",
            "subnet_id": "$SUBNET_ID",
            "ip_address": {},
            "public_ip_address": {}
          }
        ]
      }
    }
    EOF
    )
    
    This example assumes that your VMs have public IP addresses, so you can later connect to them over SSH. If you need isolated VMs without public addresses, remove the "public_ip_address": {} line from the VM configuration. To access such a VM, you can later set up a WireGuard jump server: this approach improves security while still providing access to the VM within the same subnet. For more information about creating VMs and managing their network parameters, see How to create a virtual machine in Nebius AI Cloud.
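
Because each step above captures an ID with jq, a failed CLI call silently leaves the variable empty or set to the string null. A quick sanity check on the variables exported above before moving on:

```shell
# Fail fast if any captured ID is missing; "null" means jq found no .metadata.id
for var in TR_VM_BOOT_DISK_ID TR_VM_FILESYSTEM_ID SUBNET_ID TR_VM_ID; do
  val=$(eval "echo \$$var")
  if [ -z "$val" ] || [ "$val" = "null" ]; then
    echo "ERROR: $var is not set; re-run the corresponding command" >&2
  fi
done
```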

Create a VM with one GPU for inference

  1. Create a boot disk and save its ID to an environment variable:
    export INF_VM_BOOT_DISK_ID=$(nebius compute disk create \
      --name inference-vm-disk-1 \
      --size-gibibytes 200 \
      --type network_ssd \
      --source-image-family-image-family ubuntu22.04-cuda12 \
      --block-size-bytes 4096 \
      --format json | jq -r ".metadata.id")
    
The command creates a 200 GiB SSD disk with a 4 KiB block size and an Ubuntu boot image with pre-installed NVIDIA GPU drivers. For details about boot disk images (--source-image-family-image-family), see Boot disk images for Compute virtual machines.
  2. Create a VM with one GPU for inference:
    export NETWORK_INTERFACE_NAME=single-gpu-node-compute-api-network-interface
    export USER_DATA=$(jq -Rs '.' <<EOF
    users:
      - name: user
        sudo: ALL=(ALL) NOPASSWD:ALL
        shell: /bin/bash
        ssh_authorized_keys:
          - $(cat ~/.ssh/id_ed25519.pub)
    EOF
    )
    
    export INF_VM_ID=$(nebius compute instance create \
      --format json \
      - <<EOF | jq -r ".metadata.id"
    {
      "metadata": {
        "name": "inference-instance"
      },
      "spec": {
        "stopped": false,
        "cloud_init_user_data": $USER_DATA,
        "resources": {
          "platform": "gpu-h100-sxm",
          "preset": "1gpu-16vcpu-200gb"
        },
        "boot_disk": {
          "attach_mode": "READ_WRITE",
          "existing_disk": {
            "id": "$INF_VM_BOOT_DISK_ID"
          }
        },
        "network_interfaces": [
          {
            "name": "$NETWORK_INTERFACE_NAME",
            "subnet_id": "$SUBNET_ID",
            "ip_address": {},
            "public_ip_address": {}
          }
        ]
      }
    }
    EOF
    )
    
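
The jq -Rs '.' call used for USER_DATA wraps the cloud-init YAML into a single JSON string so it can be embedded in the request body. To preview the document the VM will actually receive, decode the string back with jq:

```shell
# Round-trip check: encode a cloud-init snippet as a JSON string, then decode it
USER_DATA=$(jq -Rs '.' <<EOF
users:
  - name: user
    shell: /bin/bash
EOF
)
printf '%s\n' "$USER_DATA" | jq -r .
```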

Connect to the VMs

Connect to the VM for training via SSH:
  1. Get your VM’s public IP address and save it to an environment variable:
    export TR_PUBLIC_IP_ADDRESS=$(nebius compute instance get \
      --id $TR_VM_ID \
      --format json \
      | jq -r '.status.network_interfaces[0].public_ip_address.address | split("/")[0]')
    
  2. Use the public IP address to connect to the VM:
    ssh user@$TR_PUBLIC_IP_ADDRESS
    
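The shared filesystem attached in the training VM spec is exposed to the guest under the mount tag set earlier (training-vm-filesystem-1), but it is not mounted automatically. A sketch of mounting it once you are connected to the training VM, assuming the virtiofs driver; the mount point /mnt/filesystem is an arbitrary choice:

```shell
# On the training VM: mount the attached shared filesystem by its mount tag
sudo mkdir -p /mnt/filesystem
sudo mount -t virtiofs training-vm-filesystem-1 /mnt/filesystem
df -h /mnt/filesystem   # verify the 1 TiB filesystem is mounted
```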
Connect to the VM for inference via SSH:
  1. Get your VM’s public IP address and save it to an environment variable:
    export INF_PUBLIC_IP_ADDRESS=$(nebius compute instance get \
      --id $INF_VM_ID \
      --format json \
      | jq -r '.status.network_interfaces[0].public_ip_address.address | split("/")[0]')
    
  2. Use the public IP address to connect to the VM:
    ssh user@$INF_PUBLIC_IP_ADDRESS
    

What’s next

InfiniBand and InfiniBand Trade Association are registered trademarks of the InfiniBand Trade Association.