# How to view metrics in Prometheus

To work with metrics in Prometheus, connect Prometheus to Observability Metrics and query the data by using [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/).

## Prerequisites

1. [Install](/cli/install) and [configure](/cli/configure) the Nebius AI Cloud CLI.

2. If you don’t have a service account for observability services, [create one](/iam/service-accounts/manage).

3. Make sure that the service account is in a [group](/iam/authorization/groups) that has at least the `viewer` role within your tenant; for example, the default `viewers` group. You can check this in the [Administration → IAM](https://console.nebius.com/iam/service-accounts) section of the web console.

   If the service account is not in the required group, click <Icon icon="https://mintcdn.com/nebius-ai-cloud/1Ha0sWR6e1mnIaHS/_assets/button-vellipsis.svg?fit=max&auto=format&n=1Ha0sWR6e1mnIaHS&q=85&s=e80b8e57c43bfd117679262e6a1334ad" width="12" height="24" data-path="_assets/button-vellipsis.svg" /> → **Add to group**, and select `viewers`.

4. Issue a [static key](/iam/authorization/static-keys) for the service account using the following command:

   ```bash theme={null}
   nebius iam static-key issue \
     --name <name_for_the_key> \
     --account-service-account-id <service_account_ID> \
     --service=OBSERVABILITY
   ```

   Copy the value of the static key from the `token` parameter of the response. You will need it in later steps.

## How to connect Prometheus

<Note>
  Prometheus can only show a limited amount of monitoring data. If you have a large infrastructure, consider connecting a data source in [Grafana®](./grafana) instead.
</Note>

1. [Download the latest release](https://prometheus.io/download) of Prometheus for your platform.

2. Extract the contents and switch to the folder with Prometheus:

   ```bash theme={null}
   tar xvfz prometheus-***.tar.gz
   cd prometheus-***
   ```

3. Create a `prometheus.yml` configuration file that tells Prometheus how to retrieve the metrics. Use one of the following configurations, depending on your Prometheus version:

   <CodeGroup>
     ```yaml Prometheus 3.x theme={null}
     scrape_configs:
       - job_name: 'Export time series from Nebius Observability'
         honor_labels: true
         scrape_interval: 15s
         scheme: https
         metrics_path: '/projects/<project_ID>/service-provider/prometheus/federate'
         params:
           match[]:
             - '{__name__=~".+"}'
         bearer_token: '<static_key_for_service_account>'
         static_configs:
           - targets:
             - 'read.monitoring.api.nebius.cloud'
     ```

     ```yaml Prometheus 2.x theme={null}
     scrape_configs:
       - job_name: 'Export time series from Nebius Observability'
         honor_labels: true
         scrape_interval: 15s
         scheme: https
         metric_name_validation_scheme: legacy
         scrape_protocols:
           - OpenMetricsText0.0.1
         metrics_path: '/projects/<project_ID>/service-provider/prometheus/federate'
         params:
           match[]:
             - '{__name__=~".+"}'
         bearer_token: '<static_key_for_service_account>'
         static_configs:
           - targets:
             - 'read.monitoring.api.nebius.cloud'
     ```
   </CodeGroup>

   In this file, change the following parameters:

   * `bearer_token`: Enter the static key that you [got earlier](#prerequisites).

   * `metrics_path`: Specify your [project ID](/iam/manage-projects#how-to-get-a-project-id) in the URL.

     Optionally, add a service to the path in the following format:

     ```yaml theme={null}
     metrics_path: '/projects/<project_ID>/buckets/<service>/prometheus/federate'
     ```

     The following services are available:

     * `compute`: metrics related to Compute virtual machines.
     * `gpu`: GPU-related metrics.
     * `nbs`: metrics related to Compute volumes.
     * `sp_storage`: metrics related to Object Storage.
     * `msp`: metrics related to Managed Service for PostgreSQL® and Managed Service for MLflow.

   * `match[]`: Optionally, specify which data Prometheus collects by filtering by labels or metric names. For example, to collect only metrics whose names start with `disk`, set the following value:

     ```yaml theme={null}
     match[]:
       - '{__name__=~"^disk.*"}'
     ```

   * `scrape_interval`: You can change the interval, but we recommend an interval of at least 15 seconds.

4. Start Prometheus:

   ```bash theme={null}
   ./prometheus --config.file=prometheus.yml
   ```
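If the scrape target reports errors after startup, it can help to send the same request by hand. The following is a minimal sketch, assuming `curl` is installed and the static key is stored in an environment variable named `NEBIUS_TOKEN` (a name chosen for this example). The function names are illustrative too. The request is rebuilt from the `scheme`, target, `metrics_path`, `match[]` and `bearer_token` values in the configuration above:

```bash
#!/bin/sh
# Host taken from the static_configs target in prometheus.yml.
NEBIUS_READ_HOST='read.monitoring.api.nebius.cloud'

# Print the federate path for a project and an optional service
# (compute, gpu, nbs, sp_storage or msp). Illustrative helper.
federate_path() {
  project_id=$1
  service=${2:-}
  if [ -n "$service" ]; then
    printf '/projects/%s/buckets/%s/prometheus/federate\n' "$project_id" "$service"
  else
    printf '/projects/%s/service-provider/prometheus/federate\n' "$project_id"
  fi
}

# Send one federate request, similar to what a single scrape does.
# Arguments: project ID, optional service, optional metric name regex.
# NEBIUS_TOKEN (hypothetical variable name) must hold the static key.
check_federate() {
  project_id=$1
  service=${2:-}
  name_regex=${3:-.+}
  # -G sends the URL-encoded data as a query string, -g stops curl from
  # interpreting [] in the URL, -f turns HTTP errors into a non-zero exit code.
  curl -sS -f -g -G \
    -H "Authorization: Bearer ${NEBIUS_TOKEN}" \
    --data-urlencode "match[]={__name__=~\"${name_regex}\"}" \
    "https://${NEBIUS_READ_HOST}$(federate_path "$project_id" "$service")"
}
```

For example, `check_federate <project_ID> gpu 'disk.*'` requests only the series from the `gpu` service whose names start with `disk`, and prints whatever the endpoint returns. The Prometheus release archive also contains `promtool`, so `./promtool check config prometheus.yml` can validate the file syntax before you start the server.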

## How to shard large scraping jobs

If a scraping job needs to return a large amount of data, shard (split) it into several jobs.

Use sharding when any of the following is true:

* Prometheus takes too long to retrieve metrics because one job requests too many time series.
* A large scraping job intermittently times out or becomes unreliable.
* You expect your cluster to grow significantly and want to avoid reworking the Prometheus configuration later.

To shard a scraping job, create multiple `scrape_configs` entries that use the same `metrics_path` but different `match[]` selectors. Make the selectors non-overlapping so that the same metric is not collected more than once.

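> For example, when you collect only GPU metrics, split the requests by the first character of the `uuid` label. In the following configuration, shard 1 covers values that start with `GPU-0` to `GPU-7`, shard 2 covers values that start with `GPU-8` to `GPU-f`, and shard 3 covers series without a `uuid` label: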
>
> ```yaml theme={null}
> scrape_configs:
>  - job_name: 'Nebius Observability: GPU metrics, shard 1'
>    honor_labels: true
>    scrape_interval: 15s
>    scheme: https
>    metrics_path: '/projects/<project_ID>/buckets/gpu/prometheus/federate'
>    params:
>      match[]:
>        - '{uuid=~"GPU-[0-7].*"}'
>    bearer_token: '<static_key_for_service_account>'
>    static_configs:
>      - targets:
>        - 'read.monitoring.api.nebius.cloud'
>
>  - job_name: 'Nebius Observability: GPU metrics, shard 2'
>    honor_labels: true
>    scrape_interval: 15s
>    scheme: https
>    metrics_path: '/projects/<project_ID>/buckets/gpu/prometheus/federate'
>    params:
>      match[]:
>        - '{uuid=~"GPU-[8-9a-f].*"}'
>    bearer_token: '<static_key_for_service_account>'
>    static_configs:
>      - targets:
>        - 'read.monitoring.api.nebius.cloud'
>
>  - job_name: 'Nebius Observability: GPU metrics, shard 3'
>    honor_labels: true
>    scrape_interval: 15s
>    scheme: https
>    metrics_path: '/projects/<project_ID>/buckets/gpu/prometheus/federate'
>    params:
>      match[]:
>        - '{uuid=""}'
>    bearer_token: '<static_key_for_service_account>'
>    static_configs:
>      - targets:
>        - 'read.monitoring.api.nebius.cloud'
> ```

Choose one sharding strategy and use it consistently. For example, split requests by service, by metric name prefix or by a stable label that clearly partitions your infrastructure.
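Before deploying a sharded configuration, you may want to confirm that the selectors really are non-overlapping and leave no gaps. The sketch below checks the three `uuid` selectors from the GPU example against sample values. It assumes that the endpoint follows PromQL matching rules (regexes anchored at both ends, `uuid=""` selecting series with an empty or missing label). The function names and sample values are made up for illustration:

```bash
#!/bin/sh
# Print 1 if the value fully matches the regex, 0 otherwise.
# Label regexes in PromQL are anchored at both ends; grep -x emulates that here.
matches() {
  if printf '%s\n' "$1" | LC_ALL=C grep -Eqx -- "$2"; then
    echo 1
  else
    echo 0
  fi
}

# Count how many of the three example shards would select a given uuid value.
shards_for_uuid() {
  uuid=$1
  count=0
  for pattern in 'GPU-[0-7].*' 'GPU-[8-9a-f].*'; do
    count=$((count + $(matches "$uuid" "$pattern")))
  done
  # Shard 3 uses {uuid=""}: series whose uuid label is empty or missing.
  if [ -z "$uuid" ]; then
    count=$((count + 1))
  fi
  echo "$count"
}

# Try every possible first character of a lowercase hexadecimal UUID.
problems=0
for first in 0 1 2 3 4 5 6 7 8 9 a b c d e f; do
  n=$(shards_for_uuid "GPU-${first}0000000")
  if [ "$n" -ne 1 ]; then
    echo "uuid starting with GPU-${first} is selected by ${n} shards"
    problems=$((problems + 1))
  fi
done
echo "prefixes with problems: ${problems}"
```

A value that matches none of the patterns, such as one that does not start with `GPU-` followed by a lowercase hexadecimal digit, is selected by no shard and would not be collected. Run a check like this against the label values that actually occur in your project.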

## How to explore and manage metrics

Open [http://localhost:9090](http://localhost:9090) in your browser and explore the metrics by using [PromQL queries](https://prometheus.io/docs/prometheus/latest/querying/basics/).

For example, to get all metrics related to Compute virtual machines, enter the following query:

```
{instance_id=~"computeinstance-.*"}
```
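You can also run the same query from a terminal through the Prometheus HTTP API. This is a minimal sketch, assuming `curl` is installed and Prometheus listens on the default address. `PROM_URL` and `prom_query` are names chosen for this example:

```bash
#!/bin/sh
# Base URL of the local Prometheus server; override it if yours differs.
PROM_URL=${PROM_URL:-http://localhost:9090}

# Run an instant PromQL query through /api/v1/query and print the JSON response.
prom_query() {
  # -G sends the URL-encoded query as a query-string parameter.
  curl -sS -G --data-urlencode "query=$1" "${PROM_URL}/api/v1/query"
}
```

For example, `prom_query '{instance_id=~"computeinstance-.*"}'` returns the current values of all series related to Compute virtual machines as JSON.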

***

*The Grafana Labs Marks are trademarks of Grafana Labs, and are used with Grafana Labs' permission. We are not affiliated with, endorsed or sponsored by Grafana Labs or its affiliates.*

*Postgres, PostgreSQL and the Slonik Logo are trademarks or registered trademarks of the PostgreSQL Community Association of Canada, and used with their permission.*
