# Running applications and custom containers on virtual machines

In Nebius AI Cloud, you can run applications in [container virtual machines](/compute/virtual-machines/containers) (VMs). A container VM lets you launch a VM with a pre-installed container image, such as Jupyter Notebook, or with a custom Docker image from a public registry. Container VMs are useful when you want to quickly deploy an application environment without manually configuring the VM or installing dependencies.

This tutorial demonstrates two alternative ways to run container VMs:

* [Run a container VM with a pre-installed application image](#run-a-container-vm-with-a-pre-installed-application-image)
* [Run a container VM with a custom Docker image from a public registry](#run-a-container-vm-with-a-custom-docker-image)

## Costs

The tutorial includes the following chargeable resources:

* [Compute virtual machines](/compute/resources/pricing#virtual-machines-gpus-vcpus-ram)
* [Compute disks](/compute/resources/pricing#disks)

## Prerequisites

Generate an [SSH key pair](/compute/virtual-machines/ssh-keys).

## Steps

### Run a container VM with a pre-installed application image

#### Create a single-GPU VM with Jupyter Notebook

1. In the [web console](https://console.nebius.com), go to **Compute** → **Container VMs**.
2. Click **Create container VM**.
3. Specify the VM name.
4. Select the **Jupyter Notebook** container image.
5. Copy and save the token that appears. You will need this token later to access the JupyterLab web interface.
6. In **Computing resources**, configure the VM with one GPU. For example, select NVIDIA® L40S PCIe with Intel Ice Lake and keep the predefined preset with eight CPUs.
7. In **Local storage**, specify the disk size.
8. In **Access**, add new credentials or select existing ones.

   To add new credentials:

   1. Specify the username. Do not use the `root` or `admin` usernames. They are reserved for internal needs and cannot be used for SSH access.
   2. Copy the contents of the `.pub` file generated earlier and paste it into the **Public key** field.
   3. Click **Add credentials**.
9. Click **Create container VM**.

#### Launch Jupyter Notebook

When the container VM is running, connect to the application:

1. In **Container VMs**, open the page of the created VM.
2. Click **Go to Web UI** at the top of the VM page.
3. When prompted, paste the token into the authentication field and click **Log in**.

   If you did not save the Jupyter token earlier, you can copy it from the **Container parameters** section on the container VM page.
4. In JupyterLab, create a new notebook.
5. Run the following code. It shows information about available GPUs:

   ```python
   import torch

   if torch.cuda.is_available():
       print("CUDA is available. PyTorch can use your GPU.")
       print(f"Number of GPUs available: {torch.cuda.device_count()}")
       print(f"GPU Name: {torch.cuda.get_device_name()}")
   else:
       print("CUDA is not available. PyTorch will run on CPU.")
   ```

   Example output:

   ```text
   CUDA is available. PyTorch can use your GPU.
   Number of GPUs available: 1
   GPU Name: NVIDIA L40S
   ```

#### Benchmark a VM with one GPU

Run a simple benchmark to measure how long the GPU takes to multiply large matrices. This test multiplies two 30,000×30,000 tensors several times and measures the total execution time. Later in the tutorial, you will repeat the same benchmark on a VM with eight GPUs and compare the results.
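
To get a feel for the size of this workload before running it, the following sketch estimates the memory and arithmetic involved. It is only a rough estimate under two assumptions that are not stated in the tutorial: the tensors use PyTorch's default `float32` type (4 bytes per element), and one multiplication of two n×n matrices costs about 2·n³ floating-point operations. The helper names `tensor_bytes` and `matmul_flops` are hypothetical and exist only for this illustration.

```python
def tensor_bytes(n: int, bytes_per_element: int = 4) -> int:
    """Memory for one n x n tensor; 4 bytes per element assumes float32."""
    return n * n * bytes_per_element


def matmul_flops(n: int) -> int:
    """Approximate operation count for one n x n matrix multiplication.

    Uses the common 2 * n^3 convention (one multiply and one add per term).
    """
    return 2 * n ** 3


matrix_size = 30000
num_runs = 10

# Convert to decimal gigabytes and tera-operations for readability
gb_per_tensor = tensor_bytes(matrix_size) / 1e9
total_tflop = num_runs * matmul_flops(matrix_size) / 1e12

print(f"Memory per tensor: {gb_per_tensor:.1f} GB")
print(f"Operations for {num_runs} multiplications: {total_tflop:.0f} TFLOP")
```

Under these assumptions, each tensor takes about 3.6 GB, so the two inputs and one result need roughly 10.8 GB of GPU memory. Dividing the total operation count by the time you measure gives an approximate throughput figure that you can compare between VMs.
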
Run the following code:

```python
import torch
import time

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

matrix_size = 30000
a = torch.randn(matrix_size, matrix_size, device=device)
b = torch.randn(matrix_size, matrix_size, device=device)
num_runs = 10

# Warm up so that one-time CUDA initialization is not included in the timing
for _ in range(3):
    torch.matmul(a, b)
if device.type == "cuda":
    torch.cuda.synchronize()

start_time = time.time()
for _ in range(num_runs):
    torch.matmul(a, b)
# Wait for all queued GPU work to finish before stopping the timer
if device.type == "cuda":
    torch.cuda.synchronize()
end_time = time.time()

total_time = end_time - start_time
print(f"Time for {num_runs} matrix multiplications ({matrix_size}x{matrix_size}): {total_time:.4f} seconds")
```

Example output:

```text
Time for 10 matrix multiplications (30000x30000): 16.7096 seconds
```

CUDA operations run asynchronously, so `torch.cuda.synchronize()` makes the script wait until all queued multiplications have finished before it reads the timer. Without it, the measured time could be shorter than the actual GPU execution time.

#### Replace the VM with an 8-GPU VM while preserving data

To scale from one GPU to eight GPUs and keep your notebooks:

1. Delete the current VM. When deleting the VM, select the option to keep the boot disk.
2. Create a new container VM as described in the [Create a single-GPU VM with Jupyter Notebook](#create-a-single-gpu-vm-with-jupyter-notebook) section, but:

   * In **Computing resources**, choose a configuration with eight GPUs.
   * Attach the existing disk that contains your data as an additional disk.

#### Benchmark a VM with eight GPUs

Run the benchmark test again on the VM with eight GPUs to measure how the workload performs after scaling.

1. Go to Jupyter Notebook and open your existing notebook.
2. 
   Replace the benchmark code with:

   ```python
   import torch
   import time

   num_gpus = torch.cuda.device_count()
   matrix_size = 30000
   num_runs = 10

   # Split the rows of A evenly across the GPUs
   chunk_size = matrix_size // num_gpus

   # Create B once on the CPU and copy the full matrix to every GPU
   B_cpu = torch.randn(matrix_size, matrix_size)
   B_chunks = [B_cpu.to(f"cuda:{i}") for i in range(num_gpus)]

   # Create one block of rows of A directly on each GPU
   A_chunks = [torch.randn(chunk_size, matrix_size, device=f"cuda:{i}") for i in range(num_gpus)]

   start_time = time.time()
   for _ in range(num_runs):
       C_chunks = []
       for i in range(num_gpus):
           C = A_chunks[i] @ B_chunks[i]
           C_chunks.append(C)

   # Wait for every GPU to finish before stopping the timer
   for i in range(num_gpus):
       torch.cuda.synchronize(i)
   end_time = time.time()

   print(f"Time for {num_runs} multi-GPU matrix multiplications ({matrix_size}x{matrix_size}): {(end_time - start_time):.4f} seconds")
   ```

   Example output:

   ```text
   Time for 10 multi-GPU matrix multiplications (30000x30000): 1.3283 seconds
   ```

In these example outputs, the same ten multiplications complete in about 1.3 seconds on eight GPUs, compared with about 16.7 seconds on one GPU.

### Run a container VM with a custom Docker image

In the previous section, you deployed a container by using a pre-installed Jupyter Notebook application image on a container VM. You can also deploy containers with custom Docker images from public registries.

In this section, you will create a container VM by using a Docker image from Docker Hub and access the application running inside the container. The [TensorFlow Jupyter image](https://hub.docker.com/r/tensorflow/tensorflow) is used as an example. This image includes both TensorFlow and Jupyter Notebook, so you can run TensorFlow workloads directly in a notebook environment.

#### Create a container VM with a custom Docker image

1. In the [web console](https://console.nebius.com), go to **Compute** → **Container VMs**.
2. Click **Create container VM**.
3. Specify the VM name.
4. Select **Custom Image**.
5. In **Docker Image**, enter `tensorflow/tensorflow:nightly-gpu-jupyter`.
6. In **Docker run arguments**, specify `--restart=always --gpus all --shm-size=16GB -p 8888:8888`.
   These arguments restart the container automatically if it stops, enable GPU access, allocate shared memory, and expose port 8888 for Jupyter Notebook.
7. In **Computing resources**, use at least one GPU.
8. In **Local storage**, specify the disk size.
9. In **Access**, select the previously created credentials.
10. Click **Create container VM**.

#### Connect to the VM

When the container VM is running, connect to the application:

1. In the **Container VMs** section, open the page of the VM with the custom Docker image installed.
2. In the **Network** section, copy the **Public IPv4** address.
3. Connect to the VM by using SSH:

   ```bash
   ssh <username>@<public_IPv4_address>
   ```

4. List the running containers and copy the name of the TensorFlow container:

   ```bash
   sudo docker ps
   ```

5. Get the Jupyter token from the container logs:

   ```bash
   sudo docker logs <container_name>
   ```

6. In your browser, open `http://<public_IPv4_address>:8888/?token=<token>`.
7. In JupyterLab, create a new notebook and run the following code to verify that TensorFlow works in the container:

   ```python
   import tensorflow as tf
   import time

   matrix_size = 10000
   a = tf.random.normal([matrix_size, matrix_size])
   b = tf.random.normal([matrix_size, matrix_size])

   start = time.time()
   c = tf.matmul(a, b)
   # Copy the result to host memory so the timer covers the finished computation
   _ = c.numpy()
   end = time.time()

   print(f"Matrix multiplication ({matrix_size}x{matrix_size}) took {end - start:.4f} seconds")
   ```

   Example output:

   ```text
   Matrix multiplication (10000x10000) took 1.9316 seconds
   ```

   This example multiplies two large matrices by using TensorFlow and prints the execution time.

## How to delete the created resources

The created Compute VMs and their boot disks are chargeable. If you do not need them, delete the resources created during this tutorial:

1. Go to **Compute** → **Container VMs**.
2. Open the VM page.
3. Switch to **Settings**.
4. Click **Delete virtual machine**.
5. In the window that opens, select **Delete the boot disk**.
6. Confirm the deletion.
7. Repeat these steps for any other VMs created during this tutorial.