You can use Serverless AI jobs to work with AI models and perform operations such as fine-tuning, scientific simulations, data processing, or batch inference. The example below demonstrates how to fine-tune the Qwen/Qwen2.5-0.5B large language model (LLM) using Low-Rank Adaptation (LoRA). The example relies on Axolotl, an open-source fine-tuning tool. Axolotl provides a public container image that you deploy in a job to run the fine-tuning.

Prerequisites

Make sure that you are in a group that has the admin role within your tenant; for example, the default admins group.

Steps

Prepare an Object Storage bucket

To store job results, mount an Object Storage bucket to your job. Serverless AI deletes the job after completion, so the bucket is what preserves the job data: checkpoints and the LoRA adapter weights of the fine-tuned model. To prepare a bucket:
  1. Create it:
    1. In the web console, go to Storage → Object Storage.
    2. Click Create bucket.
    3. Specify fine-tuning-axalotl as the bucket name.
    4. In the Maximum size field, select Unlimited. Leave other settings at their default values.
    5. Click Create bucket.
  2. Save the config.yaml file below. Axolotl requires it to run fine-tuning.
    base_model: Qwen/Qwen2.5-0.5B
    
    load_in_4bit: true
    adapter: qlora
    
    datasets:
      - path: Salesforce/wikitext
        name: wikitext-2-raw-v1
        split: "train[:2000]"
        type: completion
        field: text
    
    sequence_len: 128
    micro_batch_size: 1
    gradient_accumulation_steps: 1
    
    learning_rate: 2e-4
    max_steps: 30
    val_set_size: 0
    logging_steps: 5
    
    output_dir: /workspace/output
    
    lora_r: 8
    lora_alpha: 16
    lora_dropout: 0.05
    lora_target_modules:
      - q_proj
      - k_proj
      - v_proj
      - o_proj
      - gate_proj
      - up_proj
      - down_proj
    
  3. Upload this configuration file to the bucket:
    1. In the web console, go to Storage → Object Storage.
    2. Open the page of the fine-tuning-axalotl bucket.
    3. Click Add object.
    4. Upload the config.yaml file.
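Before uploading, it can help to sanity-check the training budget implied by the settings in config.yaml. A minimal sketch of the arithmetic (the numbers are taken directly from the file above; this does not use any Axolotl API):

```python
# Rough budget check for config.yaml: how many tokens does this
# short demo run actually train on?
micro_batch_size = 1
gradient_accumulation_steps = 1
sequence_len = 128
max_steps = 30

# Effective batch size per optimizer step.
effective_batch = micro_batch_size * gradient_accumulation_steps

# Upper bound on tokens seen during training
# (individual sequences may be shorter than sequence_len).
max_train_tokens = max_steps * effective_batch * sequence_len

print(effective_batch)   # 1
print(max_train_tokens)  # 3840
```

At roughly 3,840 tokens, this is a smoke-test run rather than a full fine-tune; raise max_steps and the batch settings for real training.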

Run a fine-tuning job

Create a job that performs the following actions:
  1. Runs an Axolotl container.
  2. Mounts the bucket with the prepared configuration file in read-write mode.
  3. Executes fine-tuning.
  4. Saves the fine-tuning results to the bucket.
To create and run such a job:
  1. In the web console, go to AI Services → Jobs.
  2. Click Create job.
  3. On the page that opens, specify the following job settings:
    • Name: fine-tuning-axalotl-qwen-lora.
    • Image path: docker.io/axolotlai/axolotl:main-20260309-py3.11-cu128-2.9.1.
    • Advanced settings → Arguments: -c "RUN_ID=run-$(date +%Y%m%d-%H%M%S); axolotl train /workspace/data/config.yaml && mkdir -p /workspace/data/output/$RUN_ID && cp -r /workspace/output/. /workspace/data/output/$RUN_ID".
    • Computing resources: With GPU.
    • Available platform: NVIDIA® L40S PCIe with Intel Ice Lake.
    • Preset: 1 GPU — 8 CPUs — 32 GiB RAM.
    • Container disk, Size GiB: 450.
    • Mount volumes: Bucket.
    • Mount path: /workspace/data. After that, click Attach bucket and then select the fine-tuning-axalotl bucket.
  4. Click Create.
The job takes several minutes to complete.
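The Arguments string above is a small shell script: it generates a timestamped run ID, runs axolotl train, and copies /workspace/output into the mounted bucket. The naming-and-copy part can be sketched in Python as follows (the paths and run-ID format mirror the command above; the training call itself is omitted):

```python
import shutil
from datetime import datetime
from pathlib import Path

def save_run(output_dir: str, bucket_mount: str) -> Path:
    """Copy training output into the bucket mount under a timestamped
    run ID, mirroring the cp -r step in the job's Arguments."""
    # RUN_ID=run-$(date +%Y%m%d-%H%M%S)
    run_id = "run-" + datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = Path(bucket_mount) / "output" / run_id
    dest.mkdir(parents=True, exist_ok=True)
    # cp -r /workspace/output/. /workspace/data/output/$RUN_ID
    shutil.copytree(output_dir, dest, dirs_exist_ok=True)
    return dest
```

With the bucket mounted at /workspace/data, save_run("/workspace/output", "/workspace/data") would leave the results under /workspace/data/output/run-YYYYMMDD-HHMMSS/, so repeated jobs never overwrite each other.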

Check the job results

  1. View information about the job:
    In the web console, go to AI Services → Jobs and then open the page of the fine-tuning-axalotl-qwen-lora job. It contains information about the job state and configuration.
  2. Download the LoRA adapter weights produced by the fine-tuning job. They are stored as files in the output directory of the fine-tuning-axalotl bucket. To view these files, open the page of the fine-tuning-axalotl bucket in the web console.
    To download a file, in its line, click ⋮ → Download.
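After downloading, you can verify that a run directory holds a complete adapter. The file names below (adapter_config.json, adapter_model.safetensors) are what recent versions of the PEFT library write; treat them as an assumption, not something this guide guarantees. A minimal sketch:

```python
import json
from pathlib import Path

def check_adapter_dir(path: str) -> dict:
    """Verify a downloaded LoRA run directory and return its key settings.

    Assumes PEFT-style output files: adapter_config.json holds the LoRA
    hyperparameters, adapter_model.safetensors holds the weights.
    """
    run = Path(path)
    config_file = run / "adapter_config.json"
    weights_file = run / "adapter_model.safetensors"
    for f in (config_file, weights_file):
        if not f.exists():
            raise FileNotFoundError(f"missing {f.name} in {run}")
    cfg = json.loads(config_file.read_text())
    # These should match config.yaml: lora_r: 8, lora_alpha: 16.
    return {"r": cfg.get("r"), "lora_alpha": cfg.get("lora_alpha")}
```

If the check raises, the job likely failed before the copy step; review the job state on its page in the web console.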

See also