Filesystems and storage on the HPC cluster¶
This page explains the storage areas available on the HPC cluster and how to use them correctly.
Choosing the right filesystem is essential for performance, data safety, and compliance.
Overview¶
The cluster provides multiple filesystems, each designed for a specific purpose:
- HOME – personal, persistent storage
- WORK – project-level persistent storage
- SCRATCH – shared temporary storage
- SCRATCH_LOCAL – node-local high-performance temporary storage
Each filesystem has different characteristics in terms of performance, persistence, and data retention.
HOME filesystem¶
Purpose¶
The HOME filesystem is your personal workspace.
Use it for:
- Configuration files
- Small scripts and lightweight files
Do not use HOME for:
- Source code repositories
- Job scripts and input/output data
- Large datasets or intensive workloads
Characteristics¶
- Path: $HOME
- Scope: user-specific
- Persistence: permanent
- Backup: not enabled
- Quota: limited (default quota applies)
Good practices¶
- Keep HOME clean and organized
- Do not store datasets here
- Do not run heavy I/O workloads from HOME
WORK filesystem¶
Purpose¶
The WORK filesystem is intended for project data and results.
Each user has access to one or more project directories, typically structured as:
/work/<project_name>
Use WORK for:
- Input datasets for jobs
- Intermediate results
- Final outputs that must be preserved
- Data shared within a project team
Characteristics¶
- Path: /work/<project_name>
- Scope: project-specific
- Persistence: persistent (for the duration of the project)
- Backup: not enabled
- Quota: project-based (expandable on request)
Good practices¶
- Organize data by project and workflow
- Share data only with authorized project members
- Remove obsolete files to stay within quota
SCRATCH filesystem¶
Purpose¶
The SCRATCH filesystem is designed for temporary, high-performance I/O during job execution.
Each user typically has a directory such as:
/scratch/<username>
Environment variable:
$SCRATCH
Use SCRATCH for:
- Temporary files
- Intermediate data
- I/O-intensive workloads
Characteristics¶
- Scope: user/project-specific directories
- Persistence: temporary
- Backup: not enabled
- Automatic cleanup: files may be deleted after a defined period
⚠️ Important: Data stored on SCRATCH can be removed at any time.
Never use SCRATCH as the only copy of important data.
Good practices¶
- Copy input data from WORK to SCRATCH at job start
- Write temporary output to SCRATCH
- Copy final results back to WORK before job completion
SCRATCH_LOCAL filesystem¶
Purpose¶
The SCRATCH_LOCAL filesystem provides node-local, high-performance storage (NVMe) for fast I/O during job execution.
Each job may use local storage on the compute node, typically exposed as:
/scratch_local/<username>
Environment variable:
${SCRATCH_LOCAL}
Use SCRATCH_LOCAL for:
- Very high I/O workloads
- Temporary data during job execution
- Performance-critical applications
Characteristics¶
- Scope: local to each compute node
- Persistence: temporary (data is lost after job completion or node reuse)
- Backup: not enabled
⚠️ Important: Data stored on SCRATCH_LOCAL is not persistent and must be copied back before the job ends.
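As a minimal sketch of this rule, a job can work in ${SCRATCH_LOCAL} and use a shell trap to guarantee cleanup of node-local data. The mktemp fallbacks and all file names below are illustrative assumptions, not cluster-specific paths:

```shell
#!/bin/bash
set -euo pipefail

# ${SCRATCH_LOCAL} is assumed to be set on the compute node; the mktemp
# fallback only lets this sketch run anywhere for illustration.
LOCAL_DIR="${SCRATCH_LOCAL:-$(mktemp -d)}"
RESULT_DIR="${RESULT_DIR:-$(mktemp -d)}"   # stands in for a WORK directory

# Remove node-local data even if the job fails part-way through
trap 'rm -rf "$LOCAL_DIR/job_tmp"' EXIT
mkdir -p "$LOCAL_DIR/job_tmp"

# ... computation writes to fast node-local storage ...
echo "result" > "$LOCAL_DIR/job_tmp/out.dat"

# Copy results off the node BEFORE the job ends; they are lost otherwise
cp "$LOCAL_DIR/job_tmp/out.dat" "$RESULT_DIR/"
```

The trap runs on any exit path, so temporary data does not linger on the node even when the computation step fails.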
Typical workflow example¶
A common pattern for batch jobs is:
- Read input data from WORK (/work/<project_name>)
- Copy data to SCRATCH ($SCRATCH) or SCRATCH_LOCAL (${SCRATCH_LOCAL})
- Run computations using temporary storage
- Copy final results back to WORK
- Clean up temporary files
This approach maximizes performance and minimizes load on persistent filesystems.
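The steps above can be sketched as a batch-script fragment. The directory locations fall back to temporary directories here so the sketch is self-contained; on the cluster they would be your actual WORK and SCRATCH paths, and the file names are placeholders:

```shell
#!/bin/bash
set -euo pipefail

# Stand-ins for /work/<project_name> and $SCRATCH (illustrative only)
WORK_DIR="${WORK_DIR:-$(mktemp -d)}"
SCRATCH_DIR="${SCRATCH:-$(mktemp -d)}"
echo "example input" > "$WORK_DIR/input.dat"

# 1-2. Stage input data from WORK to fast temporary storage
cp "$WORK_DIR/input.dat" "$SCRATCH_DIR/"

# 3. Run the computation against the temporary copy
tr 'a-z' 'A-Z' < "$SCRATCH_DIR/input.dat" > "$SCRATCH_DIR/output.dat"

# 4. Copy final results back to WORK before the job completes
cp "$SCRATCH_DIR/output.dat" "$WORK_DIR/"

# 5. Clean up temporary files
rm -f "$SCRATCH_DIR/input.dat" "$SCRATCH_DIR/output.dat"
```

Only the staged copies touch the fast temporary storage; the persistent filesystem sees one read at the start and one write at the end.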
Data access and protection¶
Data access on the HPC platform is controlled and restricted according to project authorization.
Users must ensure that:
- Data handling is consistent with the approved project scope
- Only authorized users can access the data
The platform enforces access control at the filesystem level:
- Project directories (/work/<project_name>) are accessible only to authorized project members
- Personal scratch directories (/scratch/<username> and ${SCRATCH_LOCAL}) are accessible only to the corresponding user
- Data belonging to other users or projects is not accessible
Users must not attempt to access data outside their authorized scope.
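This access control rests on standard POSIX file permissions, which you can inspect and tighten yourself. A minimal sketch, run here on a throwaway directory rather than a real project path:

```shell
#!/bin/bash
set -euo pipefail

# Demonstrated on a temporary directory; on the cluster the target would
# be a file or directory under /work/<project_name>.
DIR="$(mktemp -d)"

# Restrict access: owner full, group read + enter, no access for others
chmod 750 "$DIR"

# Inspect the resulting octal mode (GNU stat)
stat -c '%a' "$DIR"
```

Mode 750 is a common choice for shared project directories: team members in the group can read, while other users are locked out entirely.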
Checking quotas and usage¶
You can check disk usage with standard commands:
df -h
du -sh $HOME
du -sh /work/<project_name>
du -sh $SCRATCH
If you need additional space on WORK, contact support with your project details.
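When a quota fills up, the first step is usually to find what is consuming the space. One common pattern, sketched here with a placeholder target directory, sorts per-entry usage so the largest items appear last:

```shell
#!/bin/bash
# TARGET is a placeholder; point it at $HOME, $SCRATCH, or a WORK directory.
TARGET="${TARGET:-$HOME}"

# Per-entry usage, human-readable, largest last (sort -h understands K/M/G)
du -sh "$TARGET"/* 2>/dev/null | sort -h | tail -n 10
```

Deleting or archiving the top few entries listed is often enough to get back under quota.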
What NOT to do¶
- Do not store datasets in HOME
- Do not rely on SCRATCH or SCRATCH_LOCAL for long-term storage
- Do not bypass filesystem policies or quotas
- Do not share data outside your project scope
Next steps¶
- Read SLURM basics to learn how to run jobs
- See Running your first job for a complete example