Getting started with Managed Service for MLflow: create your first cluster

MLflow helps you track and manage your machine learning experiments. With clusters for MLflow in Nebius AI Cloud, you can deploy MLflow in the cloud in a couple of clicks and connect to it from the machines that run your workloads. This guide covers how to set up your environment for Managed Service for MLflow, create your first cluster, run a simple experiment and access its artifacts in Object Storage.

Prepare your environment

Meet the following prerequisites, depending on the preferred interface:

Web console
CLI

Install the MLflow Python package that provides an API to run and manage experiments in your code:
```
pip install mlflow
```
Make sure you are in a group that has the admin role within your tenant or project; for example, the default admins group. You can check this in the Administration → IAM section of the web console.

Install the following tools:
- MLflow: Python package that provides an API to run and manage experiments in your code.
- jq (How to install jq): parses JSON output from the Nebius AI Cloud CLI and extracts resource IDs for other commands.
- Nebius AI Cloud CLI (How to install the CLI): manages all Nebius AI Cloud resources.

Here are all the installation commands in a single copy-and-paste block:

pip install mlflow
sudo apt-get install unzip jq
curl -sSL https://storage.eu-north1.nebius.cloud/cli/install.sh | bash
nebius profile create

The last command, nebius profile create, will guide you through several prompts. After you complete the prompts, your browser will open the Nebius AI Cloud web console sign-in screen. Sign in to the web console to complete the initialization. If you have access to multiple tenants, the CLI will prompt you to choose a tenant ID. If the project ID was not configured during the nebius profile create flow, get the project ID and save it in the CLI configuration:
```
nebius config set parent-id <project_ID>
```
Make sure that you, or the service account that acts on your behalf, are in a group that has the admin role within your tenant; for example, the default admins group. You can check this in the Administration → IAM section of the web console.

Create a cluster

Web console
CLI

In the sidebar, go to ML tools → MLflow.
Click Create cluster.
On the cluster creation page that opens, set the following parameters:
- Name: Enter a name for your cluster.
- Service account: Select the mlflow-sa service account. It was created for Managed MLflow by default after you signed up for Nebius AI Cloud.
- Access: Select the Public and private option. In this case, the cluster has both public and private tracking endpoints. You can access the public endpoint from the internet and the private endpoint from the network where the cluster is located (for example, via a virtual machine in this network).
- Cluster size: Select Medium.
- Admin credentials: Enter a username and a strong password.
Click Create cluster.

Create a password for administrator access to the cluster and save it to an environment variable:
```
export PASSWORD='<password>'
```
The password must be between 8 and 64 characters long and must contain at least the following:
- One uppercase letter
- One lowercase letter
- One digit
- One special character from **-!@#$^&*_=+:;’”\|/?,.~\§\±()[]{}<>\`**.
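The rules above can be checked locally before you pass the password to the CLI. The following is a minimal sketch, not an official validator: the function name `password_problems` is hypothetical, and the set of special characters is transcribed from the list above, so treat its exact contents (including whether the backslash is allowed) as an assumption.

```python
# Assumption: this set is transcribed from the list above and may not match
# the service's validation exactly.
SPECIAL_CHARS = set("-!@#$^&*_=+:;'\"\\|/?,.~§±()[]{}<>`")


def password_problems(password: str) -> list[str]:
    """Return a list of violated rules; an empty list means all checks passed."""
    problems = []
    if not 8 <= len(password) <= 64:
        problems.append("length must be between 8 and 64 characters")
    if not any(c.isupper() for c in password):
        problems.append("needs at least one uppercase letter")
    if not any(c.islower() for c in password):
        problems.append("needs at least one lowercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("needs at least one digit")
    if not any(c in SPECIAL_CHARS for c in password):
        problems.append("needs at least one special character")
    return problems


# Print the problems found for a sample value (an empty list means none).
print(password_problems("Str0ng!Passw0rd"))
```

The service performs its own validation, so a password that passes this local check can still be rejected.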
Get the ID of the mlflow-sa service account. It was created by default after you signed up for Nebius AI Cloud. Save the ID to an environment variable:
```
export SA_ID=$(nebius iam service-account get-by-name \
  --name mlflow-sa \
  --format json | jq -r '.metadata.id')
```

Get the ID of the default network and save it to an environment variable:

export NETWORK_ID=$(nebius vpc network get-by-name \
  --name default-network \
  --format json | jq -r '.metadata.id')

Create a test cluster:

nebius msp mlflow v1alpha1 cluster create \
  --name mlflow-test-cluster \
  --service-account-id "$SA_ID" \
  --network-id "$NETWORK_ID" \
  --admin-username admin \
  --admin-password "$PASSWORD" \
  --public-access=true

Together with the cluster, Managed Service for MLflow creates a bucket in Object Storage to store the experiment logs and metrics. This bucket is not deleted automatically after you delete the cluster. You are still charged for the bucket until you delete it.

Configure connection to MLflow Tracking

Get the cluster tracking endpoint and save it to an environment variable:

Web console
CLI

Go to the cluster page and copy the Public Tracking URI value. Prepend https:// to it and save the resulting URL to an environment variable:

export MLFLOW_TRACKING_URI=https://<public_tracking_URI>

Run the following command:

export MLFLOW_TRACKING_URI=https://$(nebius msp mlflow v1alpha1 cluster get-by-name \
  --name mlflow-test-cluster \
  --format json | jq -r ".status.tracking_endpoint")
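If you drive the CLI from a Python script rather than a shell, the same extraction can be done with the standard json module. This is a minimal sketch: the function name `tracking_uri_from_cluster_json` and the sample payload are hypothetical, and only the `.status.tracking_endpoint` path is taken from the command above.

```python
import json


def tracking_uri_from_cluster_json(raw: str) -> str:
    """Mirror the jq filter .status.tracking_endpoint and add the https:// scheme."""
    cluster = json.loads(raw)
    endpoint = cluster["status"]["tracking_endpoint"]
    # Avoid doubling the scheme if the endpoint already includes one.
    if endpoint.startswith(("http://", "https://")):
        return endpoint
    return "https://" + endpoint


# Hypothetical sample; real CLI output contains more fields.
sample = (
    '{"metadata": {"name": "mlflow-test-cluster"},'
    ' "status": {"tracking_endpoint": "public-tracking.example.invalid"}}'
)
print(tracking_uri_from_cluster_json(sample))
```

In practice, `raw` would be the output of the nebius command with `--format json`, captured for example with `subprocess.run`.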

Set up other environment variables used by MLflow Tracking:

export MLFLOW_TRACKING_USERNAME=<admin_username>
export MLFLOW_TRACKING_PASSWORD=<admin_password>
export MLFLOW_EXPERIMENT_NAME=<default_experiment_name>
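Before running an experiment, it can help to confirm that these variables are visible to the Python process. The sketch below uses only the standard library; the helper name `missing_tracking_vars` is hypothetical, and treating MLFLOW_EXPERIMENT_NAME as optional is an assumption.

```python
import os

# Variables from the step above that the client needs to reach the server.
REQUIRED_VARS = (
    "MLFLOW_TRACKING_URI",
    "MLFLOW_TRACKING_USERNAME",
    "MLFLOW_TRACKING_PASSWORD",
)


def missing_tracking_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_tracking_vars()
if missing:
    print("Missing variables:", ", ".join(missing))
else:
    print("All MLflow Tracking variables are set.")
```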

For more details on connecting securely to the tracking server, see the MLflow documentation.

Run an experiment and check its results

Run an experiment:

Create a Python script, nebius_mlflow_test.py, that trains a linear regression model on a simple, randomly populated dataset and uses MLflow autologging to log the training:

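import mlflow
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# Generate a reproducible synthetic dataset: y is roughly 3.5 * x plus Gaussian noise
np.random.seed(42)
X = np.random.rand(100, 1)
y = 3.5 * X.squeeze() + np.random.randn(100) * 0.5
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Enable autologging before training so that MLflow records the fit
mlflow.autolog()

# Train the model; the run is logged to the server set in MLFLOW_TRACKING_URI
model = LinearRegression()
model.fit(X_train, y_train)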

Run the script:
```
python3 nebius_mlflow_test.py
```
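The script above holds out a test set but does not evaluate on it. As an optional extension, the sketch below computes a root mean squared error and logs it with `mlflow.log_metric`. The helper names `rmse` and `log_test_rmse` and the metric name `test_rmse` are hypothetical. If no run is active, MLflow starts a new one, so the metric may land in a separate run from the autologged training unless both happen inside the same `mlflow.start_run()` block.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error over paired true and predicted values."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("rmse needs at least one value pair")
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


def log_test_rmse(y_true, y_pred):
    """Compute the test RMSE and log it to the current (or a new) MLflow run."""
    import mlflow  # imported here so rmse() stays usable without MLflow installed

    value = rmse(y_true, y_pred)
    mlflow.log_metric("test_rmse", value)
    return value
```

In nebius_mlflow_test.py, this could be called after training as `log_test_rmse(y_test, model.predict(X_test))`.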

Check the experiment artifacts:
Web console
CLI

Go to the cluster page and click Go to web UI. Enter the admin credentials you specified when creating the cluster. In the left pane, you can see a list of experiments. Select the experiment you just ran and review its results.

Get the public tracking endpoint URL of the cluster:

nebius msp mlflow v1alpha1 cluster get-by-name \
  --name mlflow-test-cluster \
  --format json | jq -r ".status.tracking_endpoint"

Open the URL in your browser and enter the admin credentials you specified when creating the cluster. In the left pane, you can see a list of experiments. Select the experiment you just ran and review its results.

If the client you work on requires a certificate, use the certificate that your machine’s OS provides. For example, on macOS, check the /etc/ssl/ folder.
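To see which certificate locations Python itself considers the defaults on your machine, you can query the standard ssl module. This is a sketch for inspection only; the helper name `default_ca_locations` is hypothetical. MLflow documents an MLFLOW_TRACKING_SERVER_CERT_PATH environment variable for pointing the client at a CA bundle, but check the MLflow documentation for your version before relying on it.

```python
import os
import ssl


def default_ca_locations():
    """Return (path, exists) pairs for the CA file and directory defaults."""
    paths = ssl.get_default_verify_paths()
    candidates = [
        paths.cafile,
        paths.capath,
        paths.openssl_cafile,
        paths.openssl_capath,
    ]
    unique = []
    for candidate in candidates:
        # Skip unset entries and duplicates while keeping the original order.
        if candidate and candidate not in unique:
            unique.append(candidate)
    return [(path, os.path.exists(path)) for path in unique]


for path, exists in default_ca_locations():
    print(path, "(found)" if exists else "(not found)")
```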
