- Login nodes provide users with access to the cluster.
- Worker nodes execute Slurm jobs.
- Controller nodes manage scheduling and orchestration.
Prerequisites
Before you connect to nodes of a Soperator cluster for the first time, make sure that the following requirements are met:
- Get the public endpoint of the cluster.
- Create a user account.
- (Optional) Establish a connection in Visual Studio Code.
Get the public endpoint of the cluster
If you deployed the cluster in Managed Service for Soperator yourself, get the endpoint in the web console:
- In the sidebar, go to Compute → Soperator.
- Under General, copy the public endpoint.
Create a user account
Contact the cluster administrator (the root user) to create a user for you. Share your SSH public key with the administrator.
How to generate an SSH key pair
If you do not have an SSH key pair, generate it on your local machine:
- In the terminal, go to the ~/.ssh directory.
- Create an SSH key pair. The -C "<comment>" option is optional, but it helps distinguish the key from others.
- At the prompt that appears, enter the following information:
  - Name of the file where the key should be stored.
  - Passphrase for the key. Press Enter if you do not want to use a passphrase.
- Get the contents of the generated public key. Use the file name that you specified during the key pair creation.
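The commands for these steps are not preserved on this page, so the following is only a minimal sketch. The Ed25519 key type and the helper names `create_key_pair` and `show_public_key` are assumptions made for illustration. The functions do nothing until you call them.

```shell
# Assumption: an Ed25519 key; your administrator may require another type.
# create_key_pair and show_public_key are hypothetical helper names.

# Go to ~/.ssh and start ssh-keygen, which prompts for the file name
# and the passphrase. The optional argument is the key comment.
create_key_pair() {
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    cd "$HOME/.ssh" || return 1
    ssh-keygen -t ed25519 -C "${1:-<comment>}"
}

# Print the public key for the file name entered at the prompt,
# assuming the key was stored in ~/.ssh.
show_public_key() {
    cat "$HOME/.ssh/$1.pub"
}
```

For example, `create_key_pair "work laptop"` followed by `show_public_key <file_name>` prints the public key that you can share with the administrator.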
(Optional) Establish a connection in Visual Studio Code
If you want to establish a connection in Visual Studio Code, do the following:
- Install the Remote - SSH extension.
- Open your ~/.ssh/config file and add the following configuration:
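The configuration itself is missing from this page. A minimal sketch of such an entry, written as a shell snippet that appends it to the file, could look as follows. The `soperator` host alias and the `SSH_CONFIG_FILE` variable are hypothetical, and every value in angle brackets is a placeholder that you must replace.

```shell
# SSH_CONFIG_FILE is a hypothetical override used only in this sketch.
SSH_CONFIG_FILE="${SSH_CONFIG_FILE:-$HOME/.ssh/config}"
mkdir -p "$(dirname "$SSH_CONFIG_FILE")"

# Append a host entry; replace every <placeholder> before connecting.
# The quoted 'EOF' keeps the text literal, so nothing is expanded.
cat >> "$SSH_CONFIG_FILE" <<'EOF'
Host soperator
    HostName <cluster_public_endpoint>
    User <username>
    IdentityFile <path_to_private_key>
EOF
```

With an entry like this, the `soperator` alias should be available in the Remote - SSH host list and to `ssh soperator` in the terminal.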
How to connect to login nodes
Connect to a login node, so you can run Slurm commands and manage data stored on the shared filesystem.
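A terminal connection to a login node could look like the following sketch, which covers both a connection through the public endpoint and a connection to a named node with the endpoint as a jump host. The exact commands are not preserved on this page, so the `ssh` invocations, the `SOPERATOR_*` variables and the helper names are assumptions. The functions do nothing until you call them.

```shell
# Placeholders; replace them with the values set during the user creation.
SOPERATOR_USER="${SOPERATOR_USER:-<username>}"
SOPERATOR_KEY="${SOPERATOR_KEY:-<path_to_private_key>}"
SOPERATOR_ENDPOINT="${SOPERATOR_ENDPOINT:-<cluster_public_endpoint>}"

# Connect through the public endpoint without choosing a login node.
connect_login() {
    ssh -i "$SOPERATOR_KEY" "${SOPERATOR_USER}@${SOPERATOR_ENDPOINT}"
}

# Connect to a particular node, for example login-0, by using the
# public endpoint as a jump host (-J). Note that -i applies to the
# destination host; for the jump host, ssh relies on your agent,
# default keys or ~/.ssh/config.
connect_node() {
    ssh -i "$SOPERATOR_KEY" \
        -J "${SOPERATOR_USER}@${SOPERATOR_ENDPOINT}" \
        "${SOPERATOR_USER}@$1"
}
```

For example, `connect_node login-0` opens a session on that node, which is useful when you need to return to the same node each time.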
By default, if a cluster has several login nodes, you connect to a random login node. If you are working in the terminal, you can choose which login node to connect to. To connect to a particular login node, use the cluster public endpoint as a jump host and the name of the required node.
Specify the parameters that were set during the user creation:
- Username. The administrator has the root username.
- Path to the private key.
- Exact name of the node you need to connect to, if you are connecting to a specific node (login-<number>). You can get the list of login nodes in your cluster from your personal manager.
Now, you can run Slurm commands, for example, sinfo to get a list of available Slurm nodes.
If you use the tmux terminal multiplexer, make sure to connect to the same node each time. tmux sessions are created on the node where they are started, so you can only access them on this node.
How to connect to worker nodes
Connect to a worker node, so you can monitor, observe and manage Slurm jobs. You can only connect to a specific worker node. To establish a connection, do the following:
- From your personal manager, get a list of worker nodes in your cluster. Usually, worker nodes are named worker-0, worker-1, worker-2 and so on, depending on how many exist.
- Connect to a login node by using a terminal.
- Connect to a specific worker node. For example, if you have worker-0 and worker-1, specify one of these nodes.
On the worker node, you can run monitoring tools such as htop, nvtop or nvidia-smi.
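The last step can be sketched as follows for a session that is already open on a login node. The original command is not preserved here; this sketch assumes that worker nodes are reachable from a login node by name with plain `ssh`. The `WORKER` variable and the helper names are hypothetical.

```shell
# Run this on a login node. WORKER is a hypothetical variable that
# holds the name of the worker node, for example worker-0.
WORKER="${WORKER:-worker-0}"

# Open an interactive shell on the worker node.
open_worker_shell() {
    ssh "$WORKER"
}

# Alternatively, run a single monitoring command on the worker node
# and return to the login node when it finishes.
worker_gpu_status() {
    ssh "$WORKER" nvidia-smi
}
```

Calling `open_worker_shell` gives you a shell where you can start htop or nvtop interactively.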
If you are working in the terminal, you can connect to a worker node directly by using the cluster public endpoint as a jump host and then connecting to the specific node by name.
Specify the parameters that were set during the user creation:
- Username. The administrator has the root username.
- Path to the private key.
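A direct connection to a worker node with the public endpoint as a jump host could be sketched like this. The exact command is not preserved on this page, so the `ssh` invocation, the `SOPERATOR_*` variables and the `connect_worker` helper are assumptions. The function does nothing until you call it.

```shell
# Placeholders; replace them with the values set during the user creation.
SOPERATOR_USER="${SOPERATOR_USER:-<username>}"
SOPERATOR_KEY="${SOPERATOR_KEY:-<path_to_private_key>}"
SOPERATOR_ENDPOINT="${SOPERATOR_ENDPOINT:-<cluster_public_endpoint>}"

# Jump through the public endpoint (-J) and connect to the worker node
# whose name is passed as the first argument, for example worker-0.
# -i applies to the destination host; for the jump host, ssh relies on
# your agent, default keys or ~/.ssh/config.
connect_worker() {
    ssh -i "$SOPERATOR_KEY" \
        -J "${SOPERATOR_USER}@${SOPERATOR_ENDPOINT}" \
        "${SOPERATOR_USER}@$1"
}
```

For example, `connect_worker worker-0` opens a session on worker-0 from your local machine.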