In Soperator clusters, all Slurm nodes are Kubernetes Pods. Of the Slurm node types in a cluster, you can connect to login nodes and worker nodes.

Prerequisites

Before you connect to nodes of a Soperator cluster for the first time, complete the following steps:
  1. Get the public endpoint of the cluster.
  2. Create a user account.
  3. (Optional) Establish a connection in Visual Studio Code.

Get the public endpoint of the cluster

If you deployed the cluster in Managed Service for Soperator yourself, get the endpoint in the web console:
  1. In the sidebar, go to Compute → Soperator.
  2. Under General, copy the public endpoint.
If you are using the Pro Solution for Soperator, get the endpoint from your personal manager.

Create a user account

Ask the cluster administrator (the root user) to create a user account for you, and share your SSH public key with the administrator.
If you do not have an SSH key pair, generate one on your local machine:
  1. In the terminal, go to the ~/.ssh directory:
    cd ~/.ssh
    
  2. Create an SSH key pair:
    ssh-keygen -t ed25519 -C "<comment>"
    
    The -C "<comment>" option is not required, but a comment helps you distinguish this key from others.
  3. At the prompt that appears, enter the following information:
    • Name of the file where the key should be stored.
    • Passphrase for the key. Press Enter if you do not want to use a passphrase.
  4. Get the contents of the generated public key:
    cat <file_name>.pub
    
    Use the file name that you specified during the key pair creation.
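The interactive steps above can also be run without prompts. The following is a minimal sketch, not an official procedure: the key path `~/.ssh/slurm_ed25519`, the `KEY_FILE` variable and the comment text are assumptions chosen for illustration, and the empty passphrase set with `-N ""` is a convenience you may not want.

```shell
# Hypothetical key path; set KEY_FILE or edit the default to use another file name.
key_file="${KEY_FILE:-$HOME/.ssh/slurm_ed25519}"

# Create the key directory if it is missing (mode 700 applies only when it is created).
mkdir -p -m 700 "$(dirname "$key_file")"

# Generate the pair only if it does not exist yet, so an existing key is never overwritten.
# -f sets the file name, -N "" sets an empty passphrase, -q suppresses the progress output.
if [ ! -f "$key_file" ]; then
  ssh-keygen -q -t ed25519 -f "$key_file" -N "" -C "slurm-cluster-access"
fi

# Print the public key so that you can share it with the administrator.
cat "$key_file.pub"
```

To protect the key with a passphrase, remove `-N ""` and enter the passphrase at the prompt.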

(Optional) Establish a connection in Visual Studio Code

If you want to establish a connection in Visual Studio Code, do the following:
  1. Install the Remote - SSH extension.
  2. Open your ~/.ssh/config file and add the following configuration:
    Host slurm
      HostName <public_IP_address>
      User <username>
      IdentityFile ~/.ssh/<file_name>
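
If you often connect to the same node, the jump through the public endpoint can also be described in ~/.ssh/config. The following fragment is a sketch that relies on the `slurm` entry above. The aliases `slurm-login-0` and `slurm-worker-0`, the node names `login-0` and `worker-0`, and the key path are hypothetical, so replace them with the values for your cluster. The `ProxyJump` option requires OpenSSH 7.3 or later.

```
# Hypothetical alias for a specific login node, reached through the "slurm" entry.
Host slurm-login-0
  HostName login-0
  User <username>
  IdentityFile ~/.ssh/<file_name>
  ProxyJump slurm

# Hypothetical alias for a specific worker node, reached the same way.
Host slurm-worker-0
  HostName worker-0
  User <username>
  IdentityFile ~/.ssh/<file_name>
  ProxyJump slurm
```

With entries like these, `ssh slurm-login-0` should behave like the `ssh -J` form of the command with the same user and key.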
    

How to connect to login nodes

Connect to a login node to run Slurm commands and manage data stored on the shared filesystem.
By default, if a cluster has several login nodes, you connect to a random login node:
ssh <username>@<public_endpoint> -i <path_to_private_key>
If you are working in the terminal, you can choose which login node to connect to.
If you use the tmux terminal multiplexer, make sure to connect to the same node each time. tmux sessions are created on the node where they are started, so you can only access them on this node.
To connect to a particular login node, use the cluster public endpoint as a jump host and specify the name of the required node:
ssh -J <username>@<public_endpoint> <username>@login-<number> -i <path_to_private_key>
Specify the parameters that were set during the user creation:
  • Username. The administrator has the root username.
  • Path to the private key.
  • Exact name of the node you need to connect to, if you are connecting to a specific node (login-<number>). You can get the list of login nodes in your cluster from your personal manager.
Output example:
username@login-0:~$
You can now run Slurm commands. For example, run sinfo to get a list of available Slurm nodes.
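If you only need the node names, the sinfo output can be reduced to one name per line. The sketch below defines a hypothetical helper, `list_slurm_nodes`, to run on a login node. It assumes that the names reported by sinfo are the ones you use when connecting to a node, which may not hold for every cluster.

```shell
# Hypothetical helper: print each Slurm node name once, one per line.
# -N lists nodes individually, -h omits the header, -o "%N" prints only the node name.
# sort -u removes duplicates, which appear when a node belongs to several partitions.
list_slurm_nodes() {
  sinfo -N -h -o "%N" | sort -u
}

# Usage on a login node:
#   list_slurm_nodes
```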

How to connect to worker nodes

Connect to a worker node to monitor and manage Slurm jobs. You can only connect to a specific worker node. To establish a connection, do the following:
  1. From your personal manager, get a list of worker nodes in your cluster. Worker nodes are usually named worker-0, worker-1, worker-2 and so on.
  2. Connect to a login node by using a terminal.
  3. Connect to a specific worker node:
    ssh worker-<number>
    
    For example, if you have worker-0 and worker-1, specify one of these nodes. Output example:
    username@worker-0:~$
    
Now, you can monitor system usage with various tools, such as htop, nvtop or nvidia-smi.
If you are working in the terminal, you can also connect to a worker node directly: use the cluster public endpoint as a jump host and specify the worker node by name:
ssh -J <username>@<public_endpoint> <username>@worker-<number> -i <path_to_private_key>
Enter the exact name of the node you need to connect to. In addition, specify the parameters that were set during the user creation:
  • Username. The administrator has the root username.
  • Path to the private key.
Output example:
username@worker-0:~$
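
The connection commands in this article follow two patterns: a direct connection to the public endpoint, and a connection to a named node through the endpoint as a jump host. As a sketch, both can be wrapped in one helper that prints the command without running it, so you can check it first. The function name `soperator_ssh_cmd` and its argument order are assumptions made for this illustration.

```shell
# Hypothetical helper: print the ssh command for a Soperator node.
# Arguments: <username> <public_endpoint> <path_to_private_key> [node_name]
soperator_ssh_cmd() {
  if [ -z "${4:-}" ]; then
    # No node name: connect to the endpoint, which selects a login node for you.
    printf 'ssh %s@%s -i %s\n' "$1" "$2" "$3"
  else
    # Node name given: use the endpoint as a jump host to reach that node.
    printf 'ssh -J %s@%s %s@%s -i %s\n' "$1" "$2" "$1" "$4" "$3"
  fi
}

# Usage (example values only):
#   soperator_ssh_cmd alice 203.0.113.10 /home/alice/.ssh/slurm_ed25519 worker-0
```

The helper only formats the command. Copy the printed line into the terminal to open the connection.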