Prerequisites
- Install Nebius Observability Agent for Kubernetes.
- Create a Managed Kubernetes cluster and connect to it by using kubectl.
Update the agent
To enable metrics collection, create a values.yaml file and update your Nebius Observability Agent for Kubernetes installation:
- Create a values.yaml file with the metrics configuration.
- Update your Nebius Observability Agent for Kubernetes installation with this file.
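A minimal sketch of such a values.yaml, assuming that the dotted option names described under Configuration options (config.metrics.enabled and so on) map to nested YAML keys, as is usual for Helm values:

```yaml
# values.yaml (sketch; nesting inferred from the dotted option names)
config:
  metrics:
    # Turn metrics collection on.
    enabled: true
    # Also collect the agent's own metrics (documented default: false).
    collectAgentMetrics: false
    # Also collect Kubernetes infrastructure metrics (documented default: false).
    collectK8sMetrics: false
```

With Helm, a values file like this is normally passed to helm upgrade through the -f values.yaml flag; the release and chart names depend on how the agent was installed.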
Configuration options
- config.metrics.enabled: Enable or disable metrics collection. Default: true.
- config.metrics.collectAgentMetrics: Collect metrics from the Nebius Observability Agent for Kubernetes itself. Default: false.
- config.metrics.collectK8sMetrics: Enable collection of Kubernetes infrastructure metrics (API servers, nodes, cAdvisor, Hubble). Default: false.
- config.metrics.excludedNamespaces: List of namespaces to exclude from metrics collection.
Excluding namespaces
To exclude specific namespaces from metrics collection, add them to the excludedNamespaces list in the configuration:
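A minimal sketch, assuming the same nested layout as the other config.metrics options; the namespace names staging and sandbox are hypothetical placeholders:

```yaml
# values.yaml (sketch)
config:
  metrics:
    enabled: true
    # Targets in these namespaces are skipped by the agent.
    excludedNamespaces:
      - staging   # hypothetical namespace
      - sandbox   # hypothetical namespace
```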
Collecting both logs and metrics
To collect both logs and metrics, extend the configuration:
Collected targets
Nebius Observability Agent for Kubernetes automatically discovers and collects metrics from multiple Kubernetes targets:
Service endpoints targets
- kubernetes-service-endpoints (scrape interval: 15s)
  Collects metrics from Kubernetes services with the prometheus.io/scrape: "true" annotation. This target discovers services and scrapes their endpoints.
- kubernetes-service-endpoints-slow (scrape interval: 5m)
  Collects metrics from services with the prometheus.io/scrape_slow: "true" annotation for less frequent scraping with an extended timeout (30s).
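As a minimal sketch of the service endpoints case, the following Service carries the annotations that the kubernetes-service-endpoints target looks for. The name example-app, the label selector, and port 8080 are hypothetical; only the annotation keys come from this page.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-app                 # hypothetical service name
  annotations:
    prometheus.io/scrape: "true"    # opt in to the 15s scrape target
    prometheus.io/port: "8080"      # port that serves metrics (hypothetical)
    prometheus.io/path: "/metrics"  # optional; /metrics is the default
spec:
  selector:
    app.kubernetes.io/name: example-app
  ports:
    - name: metrics
      port: 8080
      targetPort: 8080
```

Annotation values must be strings in Kubernetes, which is why "true" and "8080" are quoted.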
Pod targets
- kubernetes-pods (scrape interval: 15s)
  Collects metrics directly from Pods with the prometheus.io/scrape: "true" annotation. Only scrapes Pods running on the same node as the agent (node-local collection).
- kubernetes-pods-slow (scrape interval: 5m)
  Collects metrics from Pods with the prometheus.io/scrape_slow: "true" annotation for less frequent scraping with an extended timeout (30s).
(Optional) Kubernetes infrastructure targets
When collectK8sMetrics is enabled, Nebius Observability Agent for Kubernetes also collects:
- kubernetes-apiservers
  Collects metrics from Kubernetes API servers for cluster health monitoring.
- kubernetes-nodes
  Collects node-level metrics via the kubelet /metrics endpoint, including node resource usage and status.
- kubernetes-nodes-cadvisor
  Collects container metrics via the kubelet /metrics/cadvisor endpoint, providing detailed container resource usage.
- hubble (scrape interval: 15s)
  Collects network observability metrics from Cilium Hubble in the kube-system namespace, if available.
Custom targets
- additionalTargets
  User-defined custom scrape targets that can be configured via config.metrics.additionalTargets in the Helm values.
Target filtering
- Namespace filtering: Targets in namespaces listed in excludedNamespaces are automatically excluded.
- Node locality: Pod and node targets are filtered to only collect metrics from the same node where the agent is running.
- Pod state filtering: Pods in Pending, Succeeded, Failed, or Completed states are excluded from collection.
Annotation requirements
For service and Pod targets to be discovered, they must have proper Prometheus annotations:

| Annotation | Required | Description | Default |
|---|---|---|---|
| prometheus.io/scrape | Yes | Enable metrics scraping | - |
| prometheus.io/port | Yes | Port number for metrics endpoint | - |
| prometheus.io/path | No | Metrics endpoint path | /metrics |
| prometheus.io/scheme | No | HTTP scheme (http/https) | http |
| prometheus.io/scrape_slow | No | Enable slow scraping (5m interval) | - |
Data enrichment
The Nebius Observability Agent for Kubernetes enriches metrics with the following metadata:
- k8s_cluster_id: Cluster ID
- k8s_node_group_id: Node group ID
- app.kubernetes.io/name: Application name label
- k8s.namespace.name: Namespace name
- k8s.deployment.name: Deployment name (if applicable)
- k8s.statefulset.name: StatefulSet name (if applicable)
- k8s.daemonset.name: DaemonSet name (if applicable)
- k8s.cronjob.name: CronJob name (if applicable)
- k8s.job.name: Job name (if applicable)
- k8s.node.name: Node name
- k8s.pod.name: Pod name
- k8s.pod.start_time: Pod start time
- container.image.tag: Container image tag
- k8s.container.restart_count: Restart count of the container in the Pod
- k8s_pod_uid: Pod unique identifier
Pod annotations for metrics scraping
For the Nebius Observability Agent for Kubernetes to scrape metrics from your applications, Pods must have the following annotations:
Required annotations
- prometheus.io/scrape: Set to "true" to enable metrics scraping for this Pod.
- prometheus.io/port: The port number where your application exposes metrics (as a string).
- prometheus.io/path: The path where metrics are available. Default: /metrics.
Example deployment with annotations
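A minimal sketch of a Deployment whose Pods carry these annotations. The application name example-app, the image reference, and port 8080 are hypothetical placeholders; the annotations go on the Pod template (spec.template.metadata), not on the Deployment's own metadata, because the agent discovers Pods.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app                       # hypothetical application name
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: example-app
  template:
    metadata:
      labels:
        app.kubernetes.io/name: example-app
      annotations:
        prometheus.io/scrape: "true"      # enable scraping for these Pods
        prometheus.io/port: "8080"        # metrics port, given as a string
        prometheus.io/path: "/metrics"    # optional; /metrics is the default
    spec:
      containers:
        - name: example-app
          image: registry.example.com/example-app:1.0.0  # hypothetical image
          ports:
            - containerPort: 8080
```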
Troubleshooting
If you encounter issues with metrics collection:
- Verify that the agent is running, for example by listing its Pods with kubectl get pods in the namespace where the agent is installed.
- Check the agent logs for errors, for example with kubectl logs.
- Check the status of targets by using the Prometheus API exposed by the agent:
  - Forward the agent's port to port 8080 on your local machine, for example with kubectl port-forward.
  - Open http://127.0.0.1:8080/api/v1/targets in your browser to inspect the targets that the agent currently scrapes. For more details, see the api/v1/targets Prometheus documentation.