If you’ve used a supercomputer, you’ve probably dealt with a queueing system like Slurm. It’s very convenient: you just submit jobs and let the scheduler take care of the rest. I decided to install it on my personal workstation as well.
That said, I’m not running a cluster — just a single workstation (PC). So I skipped authentication (like munge) and went for the bare minimum setup. My lab network is isolated from the outside world, and no one else uses this machine, so I’m ignoring security concerns. If you’re following this setup, proceed with caution.
Install via apt
We can install it using apt.
$ sudo apt install slurm-wlm

munge will also be installed by default, but we won’t be using it.
Create slurm.conf
$ sudo vim /etc/slurm/slurm.conf

Here’s a minimal configuration.
ClusterName=local
ControlMachine=hostname
NodeName=hostname
PartitionName=main Nodes=hostname Default=YES MaxTime=INFINITE State=UP
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/none
SlurmUser=slurm
StateSaveLocation=/var/spool/slurm
SlurmdSpoolDir=/var/spool/slurmd
SwitchType=switch/none
TaskPlugin=task/none

The hostname should match the value shown by hostname -s.
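For example, on a machine whose short hostname happens to be mybox (a made-up name for illustration), hostname -s prints mybox, and that exact string replaces hostname above:

$ hostname -s
mybox

ControlMachine=mybox
NodeName=mybox
PartitionName=main Nodes=mybox Default=YES MaxTime=INFINITE State=UP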
Technically, NodeName can include properties like CPUs, but since I’m not dividing resources, I left them out. Running slurmd -C will print the detected hardware, so Slurm may pick up the specs automatically. If you need resource partitioning, you may want to set those values explicitly.
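If you do want explicit limits, one approach (a sketch; the numbers below are made-up examples, so use whatever your own machine reports) is to copy the line printed by slurmd -C into slurm.conf:

$ slurmd -C
NodeName=mybox CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=64000

and then replace the bare NodeName line with something like:

NodeName=mybox CPUs=16 RealMemory=64000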
The key setting here is AuthType=auth/none, which turns authentication off.
Create necessary directories and set permissions
$ sudo mkdir -p /var/spool/slurm
$ sudo mkdir -p /var/spool/slurmd
$ sudo chown -R slurm: /var/spool/slurm /var/spool/slurmd

Disable munge
Since we’re using auth/none, munge isn’t required. It doesn’t hurt to leave it running, but I disabled it just in case.
$ sudo systemctl disable --now munge

Start Slurm
$ sudo systemctl enable --now slurmctld
$ sudo systemctl enable --now slurmd

Check that it’s running correctly.
$ sinfo
PARTITION AVAIL  TIMELIMIT  NODES  STATE NODELIST
main*        up   infinite      1   idle localhost
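If the node doesn’t come up as idle, a quick way to investigate (just a troubleshooting aside, not part of the setup itself) is to look at the node record and the daemon logs:

$ scontrol show node hostname      # check the State= and Reason= fields
$ journalctl -u slurmctld -u slurmd --since "10 min ago"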
Submit a job

Try submitting a test job.
$ set +H
$ echo -e "#!/bin/bash\necho Hello, Slurm!" > test.sh
$ chmod +x test.sh
$ sbatch test.sh
$ squeue
$ cat slurm-*.out
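For anything beyond this smoke test, it’s usually nicer to put #SBATCH directives in the script itself. A minimal sketch (the job name and resource numbers are arbitrary examples, and the CPU request only matters if CPUs are configured in slurm.conf):

#!/bin/bash
#SBATCH --job-name=example       # name shown in squeue
#SBATCH --output=slurm-%j.out    # %j expands to the job ID
#SBATCH --cpus-per-task=4        # honored only if CPUs are defined for the node
#SBATCH --time=01:00:00          # wall-clock limit (hh:mm:ss)

echo "Running on $(hostname) with ${SLURM_CPUS_PER_TASK:-1} CPUs"

Submit it the same way with sbatch.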
Fixing STATE=DOWN after reboot

Sometimes, after a reboot, the node STATE appears as DOWN. You can reset it with the command below, although the cause remains unclear.
$ sudo scontrol update nodename=hostname state=idle
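Before resetting, it can be informative to see what reason Slurm recorded for marking the node down (purely diagnostic, not required for the fix):

$ sinfo -R                                       # lists down/drained nodes with their Reason
$ scontrol show node hostname | grep -iE "state|reason"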
Prioritize a job (job preemption)

If you’ve submitted many jobs and want to prioritize a new one urgently, you can do the following:
Submit the job as usual.
$ sbatch job.sh

Adjust its priority.
$ sudo scontrol update jobid=<jobid> Nice=-10

By default, Nice is set to 0, and lower (negative) values are scheduled first. You’ll need sudo to set a negative value.
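To confirm the change took effect, the job’s Priority and Nice values can be read back from scontrol (a quick check, not part of the original steps):

$ scontrol show job <jobid> | grep -i priority   # the matching line also shows Nice=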
Using sacct
If we want to view job history using sacct, we’ll need the following setup.
$ sudo apt install slurmdbd mysql-server-8.0
$ sudo service mysql start
$ sudo mysql -u root
(mysql) CREATE DATABASE slurm_acct_db;
(mysql) CREATE USER 'slurm'@'localhost' IDENTIFIED BY '<password>';
(mysql) GRANT ALL ON slurm_acct_db.* TO 'slurm'@'localhost';
(mysql) FLUSH PRIVILEGES;

Add this to /etc/slurm/slurmdbd.conf.
AuthType=auth/none
DbdHost=localhost
DbdPort=6819
StorageType=accounting_storage/mysql
StorageHost=localhost
StoragePass=<password>
StorageUser=slurm
StorageLoc=slurm_acct_db
LogFile=/var/log/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
SlurmUser=slurm

Change the file’s ownership and permissions accordingly.
$ sudo chown slurm: /etc/slurm/slurmdbd.conf
$ sudo chmod 600 /etc/slurm/slurmdbd.conf

Then, add the following lines to /etc/slurm/slurm.conf.
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=<hostname where slurmdbd runs>

Start slurmdbd, and restart slurmctld and slurmd.
$ sudo systemctl start slurmdbd
$ sudo systemctl restart slurmctld slurmd
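Before querying, it’s worth making sure slurmdbd actually started and connected to MySQL; the log path is the one set in slurmdbd.conf above (a sanity check, not part of the original steps):

$ systemctl status slurmdbd
$ sudo tail /var/log/slurmdbd.log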
Try running sacct.

$ sacct -o User,JobID,Partition,NNodes,Submit,Start,End,Elapsed,State -X
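Note that sacct only reports jobs since midnight by default; to look further back, pass a start time (and optionally an end time), for example:

$ sacct -S 2024-01-01 -E now -X -o JobID,JobName,Partition,State,Elapsed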
Stop Slurm after current jobs finish (e.g., for maintenance)

We can keep new jobs from starting while letting the ones already running finish.
$ sudo scontrol update NodeName=<node name> State=DRAIN Reason="Maintenance after current job"

To revert this behavior, run the following.
$ sudo scontrol update NodeName=<node name> State=RESUME
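To confirm the transition, sinfo is enough: while a job is still running the node should show as drng (draining), then drain once it finishes, and idle again after RESUME (state abbreviations may vary slightly between Slurm versions):

$ sinfo           # watch the STATE column: drng -> drain -> idle
$ sinfo -R        # shows the Reason recorded for drained/down nodes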