
# Managing jobs in Serverless AI

*Serverless AI jobs* run container images as one-off or scheduled batch workloads. They suit training, fine-tuning and data processing, where you need computing resources only for the duration of a task. Each job runs on a [container over a Compute virtual machine (VM)](/compute/virtual-machines/containers) that is billed only while the job is running.

## How to create a job

To run a container image as a batch workload for training, fine-tuning or data processing, create a job:

<Tabs group="interfaces">
  <Tab title="Web console">
    1. In the sidebar, go to <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/sidebar/ai-services.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=ab4ff229f7690c99deb1dc52d3daf987" width="16" height="16" data-path="_assets/sidebar/ai-services.svg" /> **AI Services** → **Jobs**.

    2. Click <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/plus.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=7c9efc69d65fc58db0eb73702fd81aa1" width="16" height="16" data-path="_assets/plus.svg" /> **Create job**.

    3. Configure **Job settings**:

       1. In **Image path**, set the path to the container image.

       2. If you use a private registry, click <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/plus.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=7c9efc69d65fc58db0eb73702fd81aa1" width="16" height="16" data-path="_assets/plus.svg" /> **Add registry**, and provide the details for your registry.

       3. (Optional) In **Entrypoint command**, specify the entrypoint command for the container.

          If you need to pass container arguments, specify them in this field as well.

       4. (Optional) In **Environment variables**, specify environment variables in key-value pairs.

       5. (Optional) In **Job timeout in hours**, specify the number of hours after which the job will be canceled if not completed.

    4. (Optional) Configure the **Computing resources** section:

       1. Select whether the VM should have GPUs.

       2. Specify the VM type: regular or [preemptible](/compute/virtual-machines/preemptible).

          VMs without GPUs only support the regular type.

       3. Select the [platform and preset](/compute/virtual-machines/types).

    5. (Optional) Configure **Storage** settings:

       * Set the container disk size. See how [disk performance depends on disk size](/compute/storage/types#disk-performance).
       * Attach a bucket or a filesystem to provide storage. You can create a bucket or filesystem, or use an existing one. To create a new bucket, see [Bucket parameters](/object-storage/buckets/manage#bucket-parameters). To create a new filesystem, see [Volume parameters](/compute/storage/manage#volume-parameters).

    6. (Optional) In the **Access** section, add an SSH key for the VM's user so you can [connect to the VM](/compute/virtual-machines/connect#connect-to-the-vm-by-using-ssh).

       You can add new credentials or select existing ones. If you decide to use an existing credential, make sure that the SSH key is stored for the **nebius** username.

    7. Configure the **Network** section:
       * Select a subnet or create a new one.
       * Select the IP address type: **Public static IP** or **Private IP**. If you want to connect to the resource from the internet, select **Public static IP**.

    8. Click **Create**.
  </Tab>

  <Tab title="CLI">
    Run the following command:

    ```bash theme={null}
    nebius ai job create \
      --name <job_name> \
      --image <image_path> \
      --registry-username <username> \
      --registry-password <password> \
      --container-command "<command>" \
      --args "<arguments>" \
      --env <key=value> \
      --working-dir <absolute_path> \
      --timeout <duration> \
      --platform <platform_ID> \
      --preset <preset> \
      --disk-size <size> \
      --volume <source:container_path[:mode]|s3://bucket:/container_path[:mode[:profile]]> \
      --subnet-id <subnet_ID> \
      --shm-size <size> \
      --ssh-key <SSH_public_key>
    ```

    <Accordion title="Job creation example">
      ```bash theme={null}
      nebius ai job create \
        --name training-job \
        --image nvidia/cuda:13.1.1-runtime-ubuntu24.04 \
        --container-command bash \
        --args "-c nvidia-smi" \
        --platform gpu-l40s-a \
        --preset 1gpu-8vcpu-32gb \
        --timeout 1h \
        --subnet-id vpcsubnet-e***
      ```
    </Accordion>

    In the command, specify the following parameters:

    * **Job settings:**

      * `--name`: Job name.

      - `--image`: Container image reference in the `registry/path:tag` or `registry/path@digest` format. Use an image from a public registry or your authenticated private registry.

      - `--registry-username`, `--registry-password` (optional): Credentials to authenticate if you pull an image from a private registry. Alternatively, use `--registry-secret` for credentials stored in [MysteryBox](/mysterybox/).

        * `--registry-username`: Username.
        * `--registry-password`: Personal access token, password or API key, depending on where your registry is hosted: Docker Hub, Microsoft Azure, GitHub, NVIDIA or a custom registry.

        If you pull an image from a public registry or from [Container Registry](/container-registry) in the same project, you don't need to specify credentials.

      - `--registry-secret` (optional): [MysteryBox secret](/mysterybox/overview#secrets-and-versions) selector with `REGISTRY_USERNAME` and `REGISTRY_PASSWORD` payload keys. You can specify a secret name, secret ID, version ID or a combined secret/version selector such as `mbsec-e00***@mbsecver-e00***`.

      - `--container-command` (optional): Entrypoint command for the container.

      - `--args` (optional): Arguments to pass to the entrypoint command, as in `docker run`.

      - `--env` (optional): Environment variables for the container. Set them in the `key=value` format, where `key` is the variable name and `value` is its value. To set several variables, list the `key=value` pairs separated by commas.

      - `--env-secret` (optional): Environment variables loaded from a [MysteryBox secret](/mysterybox/overview#secrets-and-versions) in the `key=value` format. The value can be a secret name, secret ID, version ID or a combined secret/version selector such as `mbsec-e00***@mbsecver-e00***`. If you need to set several variables, list the pairs separated by commas.

      * `--working-dir` (optional): Working directory (absolute path).
      * `--timeout` (optional): Job timeout (e.g., `2h30m10s`, `24h`). Minimum: `1h`, maximum: `168h`. Default: `24h`.
      * `--volume` (optional): [Bucket](/object-storage/overview#buckets) or [shared filesystem](/compute/storage/types#shared-filesystems) to mount to the job container and to store the job results and checkpoints. Volumes persist if the job is recreated after a [maintenance event](/compute/virtual-machines/maintenance).

        Specify the value in either format:

        * `source:container_path[:mode]` for mounting Nebius shared filesystems and existing bucket or volume resources by ID or name.
        * `s3://bucket:/container_path[:mode[:profile]]` for mounting an Object Storage bucket with AWS profile credentials or S3 credentials stored in MysteryBox. The `profile` is the AWS credentials profile to use. If you manage your credentials with [MysteryBox](/mysterybox/overview), use `profile@<secret_selector>`, where `<secret_selector>` is a secret name, secret ID, version ID or a combined secret/version selector such as `mbsec-e00***@mbsecver-e00***`.

        The supported modes are `ro` (read-only) and `rw` (read-write, default). To mount multiple volumes, repeat the parameter. For example:

        ```bash theme={null}
        --volume 'computefilesystem-e***:/input:ro' \
        --volume 'storagebucket-e***:/output:rw' \
        --volume 's3://training-results:/output:rw:default'
        ```

    * **Characteristics of the underlying container over VM:**

      * `--subnet-id`: [Subnet ID](/vpc/networking/resources#how-to-get-a-subnet-id). Required if the project has multiple subnets.

      * `--platform`: VM platform. See available platforms in [Types of virtual machines and GPUs in Nebius AI Cloud](/compute/virtual-machines/types).

      * `--preset`: Number of GPUs, vCPUs and RAM allocated to the container. The preset must match the selected platform. See available presets in [Presets for GPU platforms](/compute/virtual-machines/types#presets-for-gpu-platforms).

      * `--disk-size`: Disk size of the container over VM. Specify a value such as `100Gi`, `500Gi` or `1Ti`. The default value is `250Gi`.

        See how [disk performance depends on disk size](/compute/storage/types#disk-performance).

      * `--shm-size` (optional): Shared memory size of `/dev/shm`. Specify a value such as `64Mi`, `128Mi` or `1Gi`. The default value is `16Gi`.

      * `--ssh-key` (optional): SSH key to access the container over VM by SSH. When you add an SSH key, a public dynamic IP address is assigned. Before you add the key, check the quota on the number of public IP addresses in the [web console](https://console.nebius.com/quota).
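
    The parameter rules above can be sketched as a small wrapper script. This is a minimal sketch, not part of the Nebius AI Cloud CLI: the helpers `duration_to_seconds` and `join_by_comma`, the variable names and the sample values are hypothetical, and the duration parser assumes only the `h`, `m` and `s` units. The script checks a `--timeout` value against the documented range of `1h` to `168h`, joins environment variables with commas, repeats `--volume` for each mount and prints the resulting command instead of running it.

    ```bash
    #!/usr/bin/env bash

    # Hypothetical helper: convert a duration such as 2h30m10s to seconds.
    # Assumes only the h, m and s units; returns 1 on anything else.
    duration_to_seconds() {
      local rest="$1" total=0 num unit
      while [[ -n "$rest" ]]; do
        [[ "$rest" =~ ^([0-9]+)([hms])(.*)$ ]] || return 1
        num="${BASH_REMATCH[1]}"
        unit="${BASH_REMATCH[2]}"
        rest="${BASH_REMATCH[3]}"
        case "$unit" in
          h) total=$(( total + 10#$num * 3600 )) ;;
          m) total=$(( total + 10#$num * 60 )) ;;
          s) total=$(( total + 10#$num )) ;;
        esac
      done
      echo "$total"
    }

    # Hypothetical helper: join all arguments with commas.
    join_by_comma() { local IFS=','; echo "$*"; }

    # Sample values (assumptions for illustration only).
    timeout="2h30m"
    env_vars=("EPOCHS=3" "BATCH_SIZE=32")
    volumes=('computefilesystem-e***:/input:ro' 's3://training-results:/output:rw:default')

    # Check the timeout against the documented range: 1h to 168h.
    seconds="$(duration_to_seconds "$timeout")" || { echo "Invalid duration: $timeout" >&2; exit 1; }
    if (( seconds < 3600 || seconds > 168 * 3600 )); then
      echo "Timeout must be between 1h and 168h: $timeout" >&2
      exit 1
    fi

    # Assemble the command as an array so that every value stays one argument.
    cmd=(nebius ai job create
      --name training-job
      --image nvidia/cuda:13.1.1-runtime-ubuntu24.04
      --platform gpu-l40s-a
      --preset 1gpu-8vcpu-32gb
      --timeout "$timeout"
      --env "$(join_by_comma "${env_vars[@]}")")
    for v in "${volumes[@]}"; do
      cmd+=(--volume "$v")   # one --volume per mount
    done

    # Print the command for review instead of running it.
    printf '%q ' "${cmd[@]}"; echo
    ```

    To create the job once the printed command looks right, replace the final `printf` line with `"${cmd[@]}"`.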
  </Tab>
</Tabs>

Job creation usually takes a few minutes. Jobs run until the workload finishes.

When the job completes successfully or fails, the container over VM is deleted automatically. Mounted volumes are retained; delete them manually if you no longer need them.
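
Because mounted volumes outlive the job, it can help to check each `--volume` value before you create the job. The following is a minimal sketch of such a check for the `source:container_path[:mode]` format. The function name `check_volume_spec` is hypothetical, the check assumes that the container path is absolute (as in the examples in this guide), and it covers neither the `s3://` format nor whether the resource actually exists.

```bash
# Hypothetical helper: validate a source:container_path[:mode] value
# and print it with the mode made explicit (rw is the documented default).
check_volume_spec() {
  local spec="$1" src path mode extra
  IFS=':' read -r src path mode extra <<< "$spec"
  # Require a source, an absolute container path and no extra fields.
  [[ -n "$src" && "$path" == /* && -z "$extra" ]] || return 1
  mode="${mode:-rw}"
  [[ "$mode" == "ro" || "$mode" == "rw" ]] || return 1
  echo "${src}:${path}:${mode}"
}

check_volume_spec 'computefilesystem-e***:/input' || echo "Invalid volume value" >&2
```

The function returns a non-zero status for a relative path, an unknown mode or extra fields, so a script can stop before calling `nebius ai job create`.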

## How to check job logs

To view logs from a running or completed job:

<Tabs group="interfaces">
  <Tab title="Web console">
    1. In the sidebar, go to <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/sidebar/ai-services.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=ab4ff229f7690c99deb1dc52d3daf987" width="16" height="16" data-path="_assets/sidebar/ai-services.svg" /> **AI Services** → **Jobs**.
    2. Next to the job, click **View logs**. Alternatively, select the job that you want to view the logs for and switch to the **Logs** tab.

    You can use the period or log level filters to filter the logs. You can also use the [LogQL query language](/observability/logs/query-language).
  </Tab>

  <Tab title="CLI">
    Run the following command:

    ```bash theme={null}
    nebius ai job logs <job_ID> --follow
    ```

    You can add the following options to control the output:

    * `--follow` or `-f`: Stream logs in real time.
    * `--since <value>`: Show logs starting from the specified time. For example, `1h` (from 1 hour ago), `30m` (from 30 minutes ago) or `2024-01-01` (from that date).
    * `--tail <value>`: Number of recent lines to show in the output.
    * `--timestamps`: Include timestamps in the output.
    * `--until <value>`: Show logs up to the specified time. For example, `1h` (up to 1 hour ago), `30m` (up to 30 minutes ago) or `2024-01-01` (up to that date).
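
    For example, to review a recent window of a job's output rather than stream it, you might combine several options. This sketch assumes that `--since`, `--tail` and `--timestamps` can be used together. The `logs_command` helper and its default values are hypothetical, and the helper only prints the command so that you can review it first.

    ```bash
    # Hypothetical helper: print a logs command for a recent time window.
    # Arguments: job ID, start of the window (default 1h), line limit (default 100).
    logs_command() {
      local job_id="$1" since="${2:-1h}" lines="${3:-100}"
      local cmd=(nebius ai job logs "$job_id" --since "$since" --tail "$lines" --timestamps)
      echo "${cmd[*]}"
    }

    logs_command "<job_ID>" 30m 200
    ```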
  </Tab>
</Tabs>

## How to cancel a job

If you don't need a job to continue running, you can cancel it. You don't need to cancel jobs that finish with the `COMPLETED` status: their container over VM is deleted automatically.

To cancel a job:

<Tabs group="interfaces">
  <Tab title="Web console">
    1. In the sidebar, go to <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/sidebar/ai-services.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=ab4ff229f7690c99deb1dc52d3daf987" width="16" height="16" data-path="_assets/sidebar/ai-services.svg" /> **AI Services** → **Jobs**.
    2. Find the job and then click <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/button-vellipsis.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=e80b8e57c43bfd117679262e6a1334ad" width="12" height="24" data-path="_assets/button-vellipsis.svg" /> → **Cancel**.
    3. In the window that opens, confirm canceling the job.
  </Tab>

  <Tab title="CLI">
    1. List jobs:

       ```bash theme={null}
       nebius ai job list
       ```

       In the output, copy the ID of the required job.

    2. To cancel a job, run:

       ```bash theme={null}
       nebius ai job cancel <job_ID>
       ```

  </Tab>
</Tabs>

Canceling a job immediately stops the container over VM and deletes the container disk. The job remains in the list of jobs. Mounted volumes are retained. You can remove the mounted volumes manually. See the guides on [deleting a filesystem](/kubernetes/storage/filesystem-over-csi#how-to-delete-the-created-resources) and [deleting a bucket](/object-storage/buckets/manage#how-to-delete-buckets).

To remove the job's record from the list of jobs, delete the job instead of canceling it.

## How to delete a job

To delete a job:

<Tabs group="interfaces">
  <Tab title="Web console">
    1. In the sidebar, go to <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/sidebar/ai-services.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=ab4ff229f7690c99deb1dc52d3daf987" width="16" height="16" data-path="_assets/sidebar/ai-services.svg" /> **AI Services** → **Jobs**.
    2. Locate the job and then click <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/button-vellipsis.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=e80b8e57c43bfd117679262e6a1334ad" width="12" height="24" data-path="_assets/button-vellipsis.svg" /> → **Delete**.
    3. In the window that opens, confirm deleting the job.
  </Tab>

  <Tab title="CLI">
    1. List jobs:

       ```bash theme={null}
       nebius ai job list
       ```

       In the output, copy the ID of the required job.

    2. Delete the job:

       ```bash theme={null}
       nebius ai job delete <job_ID>
       ```
  </Tab>
</Tabs>

When the job is deleted, it disappears from the list of jobs. If the job is running, deleting it cancels the job first.

If the job uses additional volumes, they are not deleted with it. You can remove the mounted volumes manually. See the guides on [deleting a filesystem](/kubernetes/storage/filesystem-over-csi#how-to-delete-the-created-resources) and [deleting a bucket](/object-storage/buckets/manage#how-to-delete-buckets).
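
To clean up several jobs at once, you can wrap the delete command in a loop. The following is a minimal sketch: the `delete_jobs` function and the `DRY_RUN` variable are hypothetical, and the job IDs are expected to be copied from the `nebius ai job list` output. By default, the function only prints the commands that it would run.

```bash
# Hypothetical helper: delete every job whose ID is passed as an argument.
# With DRY_RUN=1 (the default here), the commands are printed, not executed.
delete_jobs() {
  local id
  for id in "$@"; do
    if [[ "${DRY_RUN:-1}" == "1" ]]; then
      echo "nebius ai job delete $id"
    else
      nebius ai job delete "$id"
    fi
  done
}

delete_jobs "<job_ID_1>" "<job_ID_2>"
```

Set `DRY_RUN=0` to run the commands. Mounted volumes are not affected and still need to be removed separately.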
