You can create a virtual machine (VM) in Nebius AI Cloud, deploy the Qwen/Qwen2.5-72B-Instruct large language model (LLM) on the VM and then use Open WebUI to provide access to the model in a browser.
Before you start
Meet the following prerequisites, depending on the preferred interface:
- Web console:
  - Create a key pair for SSH access to the VM and save the key pair to the default location.
- CLI:
  - Install and configure the Nebius AI Cloud CLI.
  - To extract JSON data from the CLI output, install jq.
  - Create a key pair for SSH access to the VM and save the key pair to the default location.
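The key-pair prerequisite can be met with ssh-keygen. A minimal sketch; the scratch directory is this example's choice so the command is safe to re-run, but for the VM you should accept the default path (~/.ssh/id_ed25519), which the CLI steps below read with cat ~/.ssh/id_ed25519.pub:

```shell
# Sketch: generate an Ed25519 key pair for SSH access to the VM.
# A scratch directory is used here so the example is safe to re-run;
# for real use, accept the default path (~/.ssh/id_ed25519) instead.
KEY_DIR=$(mktemp -d)
ssh-keygen -t ed25519 -N '' -q -f "$KEY_DIR/id_ed25519"

# The .pub file is the public key you select during VM creation.
cat "$KEY_DIR/id_ed25519.pub"
```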
Create the VM
- Go to the web console, click Create resource and then select Virtual machine.
- On the VM creation page that opens, set the following parameters:
  - Platform: NVIDIA® H100 NVLink with Intel Sapphire Rapids.
  - Preset: 1 GPU - 16 CPUs - 200 GiB RAM.
  - Boot disk image: Ubuntu 22.04 LTS for NVIDIA® GPUs (CUDA® 12). For details about boot disk images, see Boot disk images for Compute virtual machines.
  - Boot disk size: 300 GiB SSD.
  - Network: Select the Public IP address: Auto assign static IP option.
  - Username and SSH key: Select the public key that you created earlier. Do not use the root or admin usernames in this field; they are reserved for internal needs and cannot be used to connect to the VM over SSH.
- Click Create VM.

This example assumes that you work with a VM that has a public address, so you can later connect to this VM by SSH. However, if you need an isolated VM, do not assign a public address. To access the VM, you can set up a WireGuard jump server later. This approach enhances security and still provides access to the VM within the same subnet.

For more information about creating VMs and managing their network parameters, see How to create a virtual machine in Nebius AI Cloud.
- Create a boot disk and save its ID to an environment variable:

```shell
export BOOT_DISK_ID=$(nebius compute disk create \
  --name openwebui-disk-1 \
  --size-gibibytes 300 \
  --type network_ssd \
  --source-image-family-image-family ubuntu22.04-cuda12 \
  --block-size-bytes 4096 \
  --format json | jq -r ".metadata.id")
```

The command creates a 300 GiB SSD disk with a 4 KiB block size from the Ubuntu boot image with pre-installed NVIDIA® GPU drivers. For details about boot disk images, see Boot disk images for Compute virtual machines.
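The trailing jq -r ".metadata.id" filter is what extracts the disk ID from the CLI's JSON response. A quick way to see what it does, using a mocked-up response (the field layout matches the filter above; the ID value is made up):

```shell
# Mock of the JSON shape the filter expects; "disk-e0example" is a fake ID.
MOCK_RESPONSE='{"metadata": {"id": "disk-e0example", "name": "openwebui-disk-1"}}'

# -r prints the raw string without surrounding quotes.
echo "$MOCK_RESPONSE" | jq -r ".metadata.id"   # prints disk-e0example
```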
- Get the default subnet ID and save it to an environment variable:

```shell
export SUBNET_ID=$(nebius vpc subnet list \
  --format json \
  | jq -r ".items[0].metadata.id")
```
- Create the VM with one GPU:

```shell
export USER_DATA=$(jq -Rs '.' <<EOF
users:
- name: <username>
  sudo: ALL=(ALL) NOPASSWD:ALL
  shell: /bin/bash
  ssh_authorized_keys:
  - $(cat ~/.ssh/id_ed25519.pub)
EOF
)
export VM_ID=$(nebius compute instance create \
  --format json \
  - <<EOF | jq -r ".metadata.id"
{
  "metadata": {
    "name": "openwebui"
  },
  "spec": {
    "stopped": false,
    "cloud_init_user_data": $USER_DATA,
    "resources": {
      "platform": "gpu-h100-sxm",
      "preset": "1gpu-16vcpu-200gb"
    },
    "boot_disk": {
      "attach_mode": "READ_WRITE",
      "existing_disk": {
        "id": "$BOOT_DISK_ID"
      }
    },
    "network_interfaces": [
      {
        "name": "default-subnet",
        "subnet_id": "$SUBNET_ID",
        "ip_address": {},
        "public_ip_address": {}
      }
    ]
  }
}
EOF
)
```
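The jq -Rs '.' step is what lets the multi-line cloud-init YAML be embedded in the JSON spec: -R reads raw text, -s slurps the whole input into one value, and '.' emits it as a single JSON-encoded string. A small demonstration with a two-line stand-in for the user data:

```shell
# -R: raw input, -s: slurp all lines into one string, '.': output as JSON.
# The YAML here is a minimal stand-in for the cloud-init user data above.
printf 'users:\n- name: demo\n' | jq -Rs '.'
# prints "users:\n- name: demo\n" (one JSON string, newlines escaped)
```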
This example assumes that you work with a VM that has a public address, so you can later connect to this VM by SSH. However, if you need an isolated VM without a public address, remove the "public_ip_address": {} line from the VM configuration. To access the VM, you can set up a WireGuard jump server later. This approach enhances security and still provides access to the VM within the same subnet.

For more information about creating VMs and managing their network parameters, see How to create a virtual machine in Nebius AI Cloud.
Connect to the VM
- Get the public IP address of the VM. Do one of the following:
  - In the web console, open the VM page and, in the Network block, copy the Public IPv4 value.
  - In the CLI, run the following command:

```shell
export PUBLIC_IP_ADDRESS=$(nebius compute instance get-by-name \
  --name openwebui \
  --format json \
  | jq -r '.status.network_interfaces[0].public_ip_address.address | split("/")[0]')
```
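The split("/")[0] part of the filter is there because the address may come back in CIDR form (for example, with a /32 suffix). A mocked response shows the effect; 203.0.113.10 is a documentation-range placeholder, not a real VM address:

```shell
# Mocked instance status; 203.0.113.10 is a documentation-only address.
MOCK='{"status": {"network_interfaces": [{"public_ip_address": {"address": "203.0.113.10/32"}}]}}'

# split("/")[0] drops the prefix length, leaving the bare IP.
echo "$MOCK" | jq -r '.status.network_interfaces[0].public_ip_address.address | split("/")[0]'
# prints 203.0.113.10
```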
- Connect to the VM:

```shell
ssh <username>@<public_ip_address>
```

Specify the public IP address that you received and the username that you set during the VM creation.
Create a virtual environment and install the necessary packages
To work with Open WebUI, you need a dedicated virtual environment. It lets you set up and run the Open WebUI server in isolation from other software on the VM. To create the virtual environment, use Miniconda.
To configure the environment:
- Download and install the latest Miniconda version:

```shell
mkdir -p ~/miniconda3
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3
rm ~/miniconda3/miniconda.sh
```
- Initialize Miniconda:

```shell
source ~/miniconda3/bin/activate
```

On initialization, Miniconda activates its base environment.
- Create an OpenWebUI environment with Python 3.11:

```shell
conda create -n OpenWebUI python=3.11
conda init bash
echo "conda activate OpenWebUI" >> ~/.bashrc
source ~/.bashrc
```

These commands create the environment and configure it to activate in every new shell session.
- Install Ollama, which provides access to the model:

```shell
curl -fsSL https://ollama.com/install.sh | sh
```
- Install Open WebUI from PyPI:

```shell
pip install open-webui
```
Start the Open WebUI server
- Start the server:

```shell
open-webui serve
```

- Open the Open WebUI interface in the browser. To do this, enter the http://<public_ip_address>:8080 address in the address bar.
- In the Open WebUI interface, create an account to work with LLMs locally within the VM. For details on working with Open WebUI, see their documentation.

If you need to restart the server, use the same command: open-webui serve.
Download the Qwen/Qwen2.5-72B-Instruct model
- In Open WebUI, click Select a model.
- Paste qwen2.5:72b into the search bar.
- Click Pull “qwen2.5:72b” from Ollama and wait for the download to finish.
- Click Select a model again and then choose Qwen/Qwen2.5-72B-Instruct.
Now you can chat with the model in the browser.
Make Open WebUI start automatically
With the current configuration, you need to manually start the Open WebUI server every time you connect to your VM. Alternatively, you can configure the server to start up whenever the VM starts. To do this:
- Create a systemd service file for Open WebUI and open the file in an editor:

```shell
sudo nano /etc/systemd/system/openwebui.service
```
- Paste the following contents into the file and save it. Specify the username that you set during the VM creation:

```ini
[Unit]
Description=OpenWebUI Server
After=network.target

[Service]
User=<username>
WorkingDirectory=/home/<username>/
ExecStart=/home/<username>/miniconda3/envs/OpenWebUI/bin/open-webui serve
Restart=always

[Install]
WantedBy=multi-user.target
```
- To make the new service file recognizable, reload systemd:

```shell
sudo systemctl daemon-reload
```
- To start the service now and have it launch automatically at boot, enable and start it:

```shell
sudo systemctl enable openwebui.service
sudo systemctl start openwebui.service
```
- Verify that the service is running:

```shell
sudo systemctl status openwebui.service
```
Whenever you start up your VM in the web console, Open WebUI now launches automatically in the background. You can access it directly in the browser at http://<public_ip_address>:8080 and work with the Qwen/Qwen2.5-72B-Instruct model.