Running your first job

This tutorial walks you through submitting and running your first SLURM job
on the HPC cluster.

By the end of this page, you will be able to:

  • Create a job script
  • Submit it to SLURM
  • Monitor its execution
  • Retrieve the output

Step 1 — Prepare a working directory

Move to your project work directory (recommended):

cd /work/<project_name>
mkdir first-job
cd first-job

Alternatively, if your environment defines it:

cd $WORK

This keeps job files organized and avoids cluttering your HOME directory.


Step 2 — Create a simple job script

Create a file called hello.slurm:

nano hello.slurm

Paste the following content:

#!/bin/bash
#SBATCH --job-name=hello-world
#SBATCH --output=hello.out
#SBATCH --error=hello.err
#SBATCH --time=00:05:00
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G

echo "Job started on $(date)"
echo "Running on node: $(hostname)"
echo "Working directory: $(pwd)"

sleep 10

echo "Job finished on $(date)"

Save and exit the editor.


Step 3 — Submit the job

Submit the job to SLURM:

sbatch hello.slurm

You should see output similar to:

Submitted batch job 123456

The number returned is your job ID.
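In wrapper scripts it is often useful to capture this ID automatically. A small sketch: sbatch --parsable prints only the job ID, and as a fallback the last word of the standard message can be extracted with plain shell string handling (the message below is the example shown above, not live sbatch output):

```shell
# On the cluster, the cleanest way to capture the job ID:
#   JOBID=$(sbatch --parsable hello.slurm)
# Fallback: take the last word of the usual sbatch message.
msg="Submitted batch job 123456"   # example sbatch output from above
jobid=${msg##* }                   # strip everything up to the last space
echo "$jobid"
```

The captured ID can then be passed to squeue, sacct, or scancel in the same script.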


Step 4 — Monitor the job

Check the job status:

squeue -u $USER

Possible states:

  • PD — pending (waiting for resources)
  • R — running
  • CD — completed

For very short jobs, the job may finish before it ever appears as running, and completed jobs leave the squeue listing shortly after finishing. If job accounting is enabled on the cluster, use sacct -j <job_id> to inspect jobs that have already completed.


Step 5 — Check job output

Once the job has completed, list the files:

ls -l

You should see:

  • hello.out — standard output
  • hello.err — standard error (empty if no errors occurred)

View the output:

cat hello.out

Understanding job output files

  • Standard output (--output)
    Contains everything printed to the terminal (stdout)

  • Standard error (--error)
    Contains error messages (stderr)

Always check both files when debugging a job.
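Note that the fixed filenames used above (hello.out, hello.err) are overwritten each time you resubmit the job. SLURM filename patterns avoid this; for example, %j expands to the job ID:

```shell
#SBATCH --output=hello.%j.out   # %j is replaced by the job ID
#SBATCH --error=hello.%j.err
```

This keeps one pair of output files per run, which is convenient when comparing attempts while debugging.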


Using SCRATCH for computations

For real workloads, use SCRATCH (or SCRATCH_LOCAL) for temporary data and heavy I/O.

Example pattern:

#!/bin/bash
#SBATCH --job-name=example-scratch
#SBATCH --time=01:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=8G

SCRATCH_DIR="$SCRATCH/$SLURM_JOB_ID"
mkdir -p "$SCRATCH_DIR"

cp /work/<project_name>/input_data.dat "$SCRATCH_DIR"
cd "$SCRATCH_DIR"

# Run your computation here

cp results.dat /work/<project_name>/

# Leave the directory before removing it
cd "$SCRATCH"
rm -rf "$SCRATCH_DIR"

This improves performance and reduces load on persistent filesystems.

Always copy important results to WORK before job completion.
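If the computation fails partway through, the copy-back line at the end of the script never runs. A bash EXIT trap makes the copy happen on any exit, success or failure. The sketch below is runnable anywhere because mktemp stands in for the real /work and $SCRATCH paths; on the cluster you would substitute those:

```shell
#!/bin/bash
# Sketch: guarantee results are copied back even if the job fails,
# using a bash EXIT trap. mktemp directories are placeholders so the
# snippet runs outside the cluster.
WORK_DIR=$(mktemp -d)      # placeholder for /work/<project_name>
SCRATCH_DIR=$(mktemp -d)   # placeholder for $SCRATCH/$SLURM_JOB_ID

(
    cleanup() {
        # Runs on any exit of this subshell: save results, then clean up.
        cp "$SCRATCH_DIR"/results.dat "$WORK_DIR"/ 2>/dev/null
        cd / && rm -rf "$SCRATCH_DIR"
    }
    trap cleanup EXIT

    cd "$SCRATCH_DIR"
    echo "42" > results.dat    # placeholder for the real computation
)

cat "$WORK_DIR"/results.dat    # the results survived scratch cleanup
```

In a real job script the trap would be set at the top, right after creating the scratch directory, so every later failure path is covered.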


Example: requesting GPUs

If your workload requires GPUs, request them explicitly:

#!/bin/bash
#SBATCH --job-name=gpu-example
#SBATCH --partition=gpu
#SBATCH --gres=gpu:1
#SBATCH --time=02:00:00
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G

nvidia-smi

Always request only the resources you actually need.
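Inside the job, SLURM typically restricts CUDA_VISIBLE_DEVICES to the GPUs it allocated, so frameworks see only your share of the node. A quick sanity check you can add to a job script (outside a GPU job the variable is normally unset):

```shell
# Print which GPUs SLURM made visible to this job; falls back to a
# message when run outside a GPU allocation.
out="Visible GPUs: ${CUDA_VISIBLE_DEVICES:-none (not inside a GPU job)}"
echo "$out"
```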


Common issues

Job stays in PENDING (PD)

Possible reasons:

  • Requested resources are not currently available
  • Requested walltime exceeds the partition's limit
  • Partition limits have been reached

Use:

squeue -j <job_id> -o "%i %t %r"

to see the reason.


Job fails immediately

Check:

  • Output files (.out and .err)
  • Requested resources
  • Script syntax

Make sure the script is correctly submitted via sbatch.
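One common cause of immediate failure, shown here as a hypothetical illustration: a job script saved with Windows (CRLF) line endings makes the kernel look for "/bin/bash\r", which does not exist. You can detect and strip the carriage returns with standard tools (dos2unix does the same, where installed):

```shell
# Simulate a script saved with CRLF line endings.
printf '#!/bin/bash\r\necho ok\r\n' > broken.slurm

# Strip carriage returns to produce a usable script.
tr -d '\r' < broken.slurm > fixed.slurm

crlf_before=$(grep -c $'\r' broken.slurm)   # lines carrying \r
crlf_after=$(grep -c $'\r' fixed.slurm)
echo "CRLF lines before: $crlf_before, after: $crlf_after"

rm broken.slurm fixed.slurm
```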


Cleaning up

After jobs complete:

  • Remove unnecessary output files
  • Clean temporary directories
  • Keep WORK organized

Good housekeeping improves overall system efficiency.


Next steps