Diagnostic logs from Compute virtual machines (VMs) help you troubleshoot issues with VM operations, networking and workloads. We strongly recommend collecting logs while the issue is still occurring, because they capture more information about the broken state than logs collected after the issue has been resolved.

Types of logs

This guide describes how to collect the following types of logs for troubleshooting:
  • GPU logs: nvidia-bug-report.sh.
  • General system logs, including more context about system services and package versions: sos report.
  • NVIDIA® Mellanox® adapter (InfiniBand™/NVSwitch/Ethernet) logs: sysinfo-snapshot.

Prerequisites

Make sure that you have configured SSH access to the VM.

How to collect logs

  1. Connect to the VM by using SSH.
  2. Generate GPU logs:
    sudo nvidia-bug-report.sh
    
    This command usually runs for about five minutes and generates nvidia-bug-report.log.gz in the current working directory. If the command stops responding, run it in safe mode:
    sudo nvidia-bug-report.sh --safe-mode
    
  3. If you need more system information, generate general system logs:
    sudo sos report --batch
    
    This command generates an archive in the following format: /tmp/sosreport-<VM_ID>-<date>-<random_ID>.tar.gz.
  4. If you are troubleshooting Mellanox adapter issues, generate Mellanox adapter logs:
    sudo /opt/nebius/sysinfo-snapshot
    
    This command generates an archive in the following format: /tmp/sysinfo-snapshot-<VM_ID>-<date>-<random_ID>.tgz.
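
The collection steps above can be combined into one script that you run on the VM. The following is a minimal sketch, not an official tool: the function names, the GPU_TIMEOUT variable and its 15-minute default are assumptions made for illustration. It skips any collector that is not installed and retries GPU log collection in safe mode if the first attempt fails or exceeds the time limit.

```shell
#!/bin/sh
# Sketch: run each collector from the steps above, skipping tools that are missing.
# GPU_TIMEOUT is a hypothetical setting; 15m is an assumed limit, not a documented value.
GPU_TIMEOUT="${GPU_TIMEOUT:-15m}"

# Return 0 if the given command name or path is executable on this VM.
have() {
  command -v "$1" >/dev/null 2>&1
}

collect_gpu_logs() {
  have nvidia-bug-report.sh || { echo "skip: nvidia-bug-report.sh not found"; return 0; }
  # If the normal run fails or exceeds the limit, retry in safe mode.
  sudo timeout "$GPU_TIMEOUT" nvidia-bug-report.sh || sudo nvidia-bug-report.sh --safe-mode
}

collect_system_logs() {
  have sos || { echo "skip: sos not found"; return 0; }
  sudo sos report --batch
}

collect_adapter_logs() {
  have /opt/nebius/sysinfo-snapshot || { echo "skip: sysinfo-snapshot not found"; return 0; }
  sudo /opt/nebius/sysinfo-snapshot
}

# Run all three collectors in the order used in this guide.
collect_all() {
  collect_gpu_logs
  collect_system_logs
  collect_adapter_logs
}
```

To use the sketch, save it on the VM, load it into your shell session and call collect_all, or call a single function if you need only one type of log.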

How to get generated log files

  1. Check that the files were generated on your VM by running the following commands:
    • To check for GPU logs:
      ls nvidia-bug-report.log.gz
      
    • To check for general system logs or Mellanox adapter logs:
      ls /tmp
      
  2. From your local shell, run the following command to copy the files from the VM to the current directory:
    scp -i ~/.ssh/id_ed25519 <username>@<public_IP_address>:<remote_file_path> .
    
    In the command, replace <remote_file_path> with the path to the generated file on the VM, for example: nvidia-bug-report.log.gz, /tmp/sosreport-*.tar.gz or /tmp/sysinfo-snapshot-*.tgz. If the path contains a wildcard, enclose the remote argument in quotes so that your local shell does not try to expand it. If copying a file from the /tmp directory fails with a permission error, the generated file is most likely owned by root. To fix this issue, proceed to the next step.
  3. If you successfully copied the generated log files, skip this step. Otherwise, reconnect to the VM, run the following command to grant read access to non-root users, and then rerun the scp command:
    sudo chmod 644 <remote_file_path>
    
    In the command, replace <remote_file_path> with /tmp/sosreport-*.tar.gz or /tmp/sysinfo-snapshot-*.tgz.
  4. Find the copied log files in your local directory.
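
The check in step 1 and the permission fix in step 3 can be done in one pass on the VM before you copy anything. The following is a minimal sketch under stated assumptions: prepare_logs is a hypothetical helper name, and the script is meant to be run as your regular SSH user rather than as root. It prints each generated log file that it finds and grants read access only to files that the current user cannot read.

```shell
#!/bin/sh
# Sketch: list generated log files and make unreadable ones readable for scp.
# prepare_logs is a hypothetical helper; its argument defaults to /tmp.
prepare_logs() {
  dir="${1:-/tmp}"
  found=0
  for f in "$dir"/sosreport-*.tar.gz "$dir"/sysinfo-snapshot-*.tgz ./nvidia-bug-report.log.gz; do
    # A pattern that matched nothing stays unexpanded, so skip it.
    [ -e "$f" ] || continue
    found=$((found + 1))
    if [ ! -r "$f" ]; then
      # Typically a root-owned archive: grant read access to non-root users.
      sudo chmod 644 "$f"
    fi
    echo "$f"
  done
  # Return non-zero if no log files were found.
  [ "$found" -gt 0 ]
}
```

Run prepare_logs from the directory where you ran nvidia-bug-report.sh, and then use the printed paths as <remote_file_path> in the scp command.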

InfiniBand and InfiniBand Trade Association are registered trademarks of the InfiniBand Trade Association.