Serverless AI is a Nebius AI Cloud service for running containerized AI workloads as interactive endpoints or non-interactive jobs. By deploying your workloads in Serverless AI, you can focus on them without worrying about the infrastructure: the service handles resource provisioning, lifecycle management, and usage-based, per-second billing. The service is available in all Nebius AI Cloud regions.

Documentation Index
Fetch the complete documentation index at: https://docs.nebius.com/llms.txt
Use this file to discover all available pages before exploring further.
Overview
Read about how Serverless AI works and how to choose between endpoints and jobs
Getting started with jobs
Create your first job that runs nvidia-smi and prints information about the GPUs in use
Getting started with endpoints
Launch a simple endpoint and send authenticated requests to it
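As a minimal sketch of what an authenticated request to an endpoint looks like, the example below builds an HTTPS POST with a Bearer token using only the Python standard library. The endpoint URL, the request payload, and the `NEBIUS_IAM_TOKEN` environment variable are assumptions for illustration; substitute the values shown for your endpoint in the console, and see the getting-started guide for the exact authentication flow.

```python
import os
import urllib.request

# Hypothetical endpoint URL and token source; replace with the URL and
# credentials issued for your own endpoint.
ENDPOINT_URL = "https://example-endpoint.invalid/v1/infer"
token = os.environ.get("NEBIUS_IAM_TOKEN", "<your-token>")

request = urllib.request.Request(
    ENDPOINT_URL,
    data=b'{"input": "ping"}',
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(request) would send the call; it is left out here
# so the sketch stays runnable without a live endpoint.
print(request.get_method(), request.get_full_url())
```

The same request can of course be issued with `curl -H "Authorization: Bearer $NEBIUS_IAM_TOKEN"` or any HTTP client; the only requirement illustrated here is attaching the token as a Bearer credential on every call.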
Monitoring
Track resource utilization to schedule quota increases and to quickly identify anomalies
Pricing and quotas
Learn what other services Serverless AI uses and how this affects pricing and quotas