# Autoscaling in Managed Service for Kubernetes

In Managed Service for Kubernetes, the **cluster autoscaler** integrates with the underlying infrastructure to monitor and manage [node groups](../components#node-group) in your cluster, adding or removing nodes as needed. It makes scaling decisions based on the following principles:

* If there are unschedulable pods in the cluster due to resource constraints, the cluster autoscaler adds new nodes to accommodate these pods.
* If nodes in the cluster are underutilized, the cluster autoscaler removes these nodes in order to optimize resource usage and reduce costs.
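
To see which of these two conditions currently applies, you can inspect the cluster with `kubectl`. The following is a minimal sketch, assuming `kubectl` is already configured for your cluster. The helper function names are hypothetical, and `Pending` pods only approximate unschedulable pods, because a pod can be pending for other reasons.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical; kubectl must point at your cluster.

# List pods in the Pending phase. Unschedulable pods are a subset of these.
list_pending_pods() {
  kubectl get pods --all-namespaces --field-selector=status.phase=Pending
}

# List nodes with their status, to compare the node count before and after scaling.
list_nodes() {
  kubectl get nodes -o wide
}
```

Run `list_pending_pods` when you expect a scale-up, and `list_nodes` afterwards to check whether the node count changed.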

<Tip>
  If you have a GPU node group with autoscaling, add a CPU node group with at least two nodes (or with autoscaling) to the cluster. This way, when no workloads are running, the [CoreDNS and Cilium networking add-ons](../networking/add-ons) can run on the CPU nodes, which lets the GPU node group scale down and reduces your costs.
</Tip>
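
To check where the add-on pods are running, you can list system pods together with the nodes they are scheduled on. This is a sketch under an assumption: it presumes the add-ons run in the `kube-system` namespace, which this page does not state. The function name is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: the kube-system default is an assumption, not taken from this page.

# List pods in a namespace with the NODE column, to see which nodes host them.
list_system_pods_with_nodes() {
  local ns="${1:-kube-system}"
  kubectl get pods --namespace "$ns" --output wide
}
```

If the CoreDNS and Cilium pods appear on GPU nodes, those nodes may not be eligible for scale-down.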

## Set up autoscaling for new node groups

You can set up autoscaling when creating a new node group:

<Tabs>
  <Tab title="Web console">
    When [creating a node group](./manage#how-to-create-node-groups):

    1. Under **Size**, toggle **Enable autoscaling**.

    2. Specify the **Min. nodes** and **Max. nodes** values for the group.
  </Tab>

  <Tab title="Nebius AI Cloud CLI">
    When [creating a node group](./manage#how-to-create-node-groups), add the following parameters to the [nebius mk8s node-group create](/cli/reference/mk8s/node-group/create) command:

    ```bash
    nebius mk8s node-group create \
      ... \
      --autoscaling-min-node-count <minimum_number_of_nodes> \
      --autoscaling-max-node-count <maximum_number_of_nodes>
    ```

    For example, to let the node group scale between 2 and 4 nodes, add `--autoscaling-min-node-count 2 --autoscaling-max-node-count 4` to the `nebius mk8s node-group create` command.
  </Tab>
</Tabs>

## Set up autoscaling for existing node groups

You can manage autoscaling for existing node groups only by using the Nebius AI Cloud CLI.

To enable autoscaling for an existing node group, add the following parameters to the [nebius mk8s node-group update](/cli/reference/mk8s/node-group/update) command:

```bash
nebius mk8s node-group update --id <node_group_ID> \
  --autoscaling-min-node-count <minimum_number_of_nodes> \
  --autoscaling-max-node-count <maximum_number_of_nodes>
```

For example, to let the node group scale between 2 and 4 nodes, add `--autoscaling-min-node-count 2 --autoscaling-max-node-count 4` to the `nebius mk8s node-group update` command.

## Configure autoscaling parameters

You can configure autoscaling parameters for existing node groups only by using the Nebius AI Cloud CLI.

To change the minimum and maximum number of nodes for autoscaling, add the following parameters to the [nebius mk8s node-group update](/cli/reference/mk8s/node-group/update) command:

```bash
nebius mk8s node-group update --id <node_group_ID> \
  --autoscaling-min-node-count <minimum_number_of_nodes> \
  --autoscaling-max-node-count <maximum_number_of_nodes>
```

For example, to let the node group scale between 2 and 4 nodes, add `--autoscaling-min-node-count 2 --autoscaling-max-node-count 4` to the `nebius mk8s node-group update` command.
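
If you script these updates, it can help to sanity-check the bounds before calling the CLI. The sketch below only prints the command so you can review it. `build_update_command` and `is_uint` are hypothetical helpers, and the check that the minimum does not exceed the maximum is a local assumption, not documented CLI behavior. Only the flags shown above are used.

```bash
#!/usr/bin/env bash
# Sketch only: these helpers are hypothetical and not part of the Nebius AI Cloud CLI.

# Succeed if the argument is a non-empty string of digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Print the update command for a node group ID and autoscaling bounds.
# Fails if a bound is not a non-negative integer or if min exceeds max.
build_update_command() {
  local id="$1" min="$2" max="$3"
  if ! is_uint "$min" || ! is_uint "$max"; then
    echo "error: node counts must be non-negative integers" >&2
    return 1
  fi
  if [ "$min" -gt "$max" ]; then
    echo "error: minimum ($min) exceeds maximum ($max)" >&2
    return 1
  fi
  printf 'nebius mk8s node-group update --id %s --autoscaling-min-node-count %s --autoscaling-max-node-count %s\n' \
    "$id" "$min" "$max"
}
```

For example, `build_update_command <node_group_ID> 2 4` prints the command for a range of 2 to 4 nodes, which you can then run after reviewing it.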

## Troubleshooting

### More GPU nodes than required

* **Issue**: When a Managed Kubernetes cluster has the NVIDIA GPU Operator and the NVIDIA Network Operator installed, and workloads run on a GPU node group with autoscaling enabled, the cluster autoscaler can create more nodes in the group than the workloads require.
* **Possible reason**: A bug in Kubernetes Autoscaler causes inconsistency in how nodes are considered ready or not ready for pods. For more information, see [Excess multiGPU nodes when using GPU + network operators](https://github.com/kubernetes/autoscaler/issues/7956) in the Kubernetes Autoscaler repository on GitHub.
* **Solution**:

  1. Uninstall the NVIDIA operators.
  2. [Create a GPU node group](../gpu/set-up#how-to-add-nodes-with-gpus-to-a-cluster) and [migrate your workloads](./moving-workload) to it. A node group created this way uses the GPU-adapted boot disk image offered by Managed Kubernetes, which solves the issue because the NVIDIA operators are no longer required.

## See also

* [Cluster autoscaler parameters](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca) in the Kubernetes Autoscaler repository on GitHub.
