First steps with S3 storage

This tutorial guides you through the first practical steps to access and use the S3-compatible storage service.

You will learn how to:

  • Configure an S3 client
  • Access your project bucket
  • Upload and download data
  • Use S3 storage together with the HPC cluster

Prerequisites

Before starting, make sure you have:

  • An approved project with S3 storage enabled
  • Valid S3 credentials (Access Key ID and Secret Access Key) provided by the administrators
  • The S3 endpoint provided by the administrators
  • The name of your assigned bucket

The recommended tool for interacting with S3 storage is rclone.

  • Command-line usage on Linux and macOS
  • GUI alternatives:
      • Commander One (macOS)
      • S3 Browser (Windows)

rclone is:

  • Widely used in research environments
  • Actively maintained
  • Compatible with S3 APIs
  • Available on most Linux systems and on the HPC cluster

Step 1 — Configure rclone

Start the interactive configuration:

rclone config

Choose:

  1. New remote
  2. Name it (e.g. unibo-s3)
  3. Storage type: s3

When prompted, select:

  • S3 provider: Other
  • Access Key ID: (provided by administrators)
  • Secret Access Key: (provided by administrators)
  • Endpoint: (provided by administrators)
  • Region: leave empty unless instructed otherwise

Save the configuration.

The configuration file is stored in:

~/.config/rclone/rclone.conf

Ensure this file is readable only by you.
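
On Linux, the permissions can be tightened and checked as follows (the path below is rclone's default config location):

```shell
# Create the config directory and file if `rclone config` has not run yet,
# then restrict the file so only the owner can read and write it.
mkdir -p ~/.config/rclone
touch ~/.config/rclone/rclone.conf
chmod 600 ~/.config/rclone/rclone.conf

# Verify the permissions (should print 600).
stat -c "%a" ~/.config/rclone/rclone.conf
```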


Accessing your bucket

Access to S3 storage is restricted to the buckets assigned to your project.

You must specify the bucket name explicitly in all commands.

Example:

unibo-s3:<your-bucket-name>

Listing all buckets (for example with rclone lsd unibo-s3:) may not be permitted; always address your bucket by name.


Step 2 — Browse bucket contents

List the contents of your bucket:

rclone ls unibo-s3:<your-bucket-name>
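
A few related listing commands are useful when browsing (all of them require valid credentials, and <your-bucket-name> is your assigned bucket):

```shell
# List top-level "directories" (prefixes) only
rclone lsd unibo-s3:<your-bucket-name>

# List files with sizes and modification times
rclone lsl unibo-s3:<your-bucket-name>

# Show the total object count and size of the bucket
rclone size unibo-s3:<your-bucket-name>
```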

Step 3 — Upload data to S3

Upload a file:

rclone copy local_file.dat unibo-s3:<your-bucket-name>/

Upload a directory:

rclone copy local_directory/ unibo-s3:<your-bucket-name>/data/

rclone copy transfers only new or modified files; files that are already identical on the destination are skipped.
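
To preview what a copy would transfer without writing anything, rclone supports a dry run:

```shell
# Show which files would be copied, without transferring them
rclone copy --dry-run local_directory/ unibo-s3:<your-bucket-name>/data/
```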


Step 4 — Download data from S3

Download a file:

rclone copy unibo-s3:<your-bucket-name>/results.dat .

Download a directory:

rclone copy unibo-s3:<your-bucket-name>/data/ local_data/

Step 5 — Synchronization (use with care)

To synchronize a local directory with a bucket:

rclone sync local_directory/ unibo-s3:<your-bucket-name>/data/

⚠️ Warning: sync deletes files on the destination that do not exist on the source.
Use it only if you fully understand the implications.
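
Before a real sync, it is safest to preview the deletions it would perform, or to keep a safety net (the trash/ prefix below is just an example):

```shell
# Report what sync WOULD do, including deletions, without changing anything
rclone sync --dry-run local_directory/ unibo-s3:<your-bucket-name>/data/

# Alternatively, move would-be-deleted files aside instead of deleting them
rclone sync local_directory/ unibo-s3:<your-bucket-name>/data/ \
  --backup-dir unibo-s3:<your-bucket-name>/trash/
```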


Using S3 storage with the HPC cluster

A common workflow is:

  1. Store raw datasets in S3
  2. Copy required data from S3 to $WORK/<project_name>
  3. Run HPC jobs
  4. Copy final results back to S3

Example (from the HPC login node):

rclone copy unibo-s3:<your-bucket-name>/input/ $WORK/<project_name>/input/

After job completion:

rclone copy $WORK/<project_name>/results/ unibo-s3:<your-bucket-name>/results/
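
Putting the workflow together, a job script might look like the sketch below. It assumes a SLURM scheduler and that rclone is available on the cluster nodes; the bucket, project, and analysis command are placeholders:

```shell
#!/bin/bash
#SBATCH --job-name=s3-workflow
#SBATCH --time=01:00:00

set -euo pipefail

BUCKET="unibo-s3:<your-bucket-name>"   # placeholder: your assigned bucket
WORKDIR="$WORK/<project_name>"         # placeholder: your project work area

# 1. Stage input data from S3 to fast cluster storage
rclone copy "$BUCKET/input/" "$WORKDIR/input/"

# 2. Run the actual computation (placeholder command)
# ./run_analysis "$WORKDIR/input/" "$WORKDIR/results/"

# 3. Copy final results back to S3
rclone copy "$WORKDIR/results/" "$BUCKET/results/"
```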

Performance tips

For best performance:

  • Transfer large files rather than many small files
  • Use --progress to monitor transfers
  • Use multipart uploads for large datasets
  • Avoid frequent overwrites of the same objects

Example:

rclone copy large_file.dat unibo-s3:<your-bucket-name>/ --progress
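
rclone performs multipart uploads automatically for large files; the flags below tune parallelism and chunk size and are reasonable starting points rather than mandated values:

```shell
# --transfers: number of files transferred in parallel
# --s3-upload-concurrency: parallel chunks per multipart upload
# --s3-chunk-size: size of each multipart chunk
rclone copy large_dataset/ unibo-s3:<your-bucket-name>/data/ \
  --progress --transfers 8 --s3-upload-concurrency 4 --s3-chunk-size 64M
```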

Security reminders

  • Never share S3 credentials
  • Do not store credentials in scripts or notebooks
  • Ensure data usage is consistent with project authorization
  • Remove temporary local copies when no longer needed
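
One way to keep credentials out of scripts and notebooks is rclone's support for per-remote environment variables of the form RCLONE_CONFIG_<REMOTE>_<OPTION>. The sketch below assumes a remote named unibos3 (no hyphen, since variable names cannot contain one) and that the secrets are provided by your environment:

```shell
# Define the remote entirely through the environment;
# nothing is written to scripts or to rclone.conf.
export RCLONE_CONFIG_UNIBOS3_TYPE=s3
export RCLONE_CONFIG_UNIBOS3_PROVIDER=Other
export RCLONE_CONFIG_UNIBOS3_ACCESS_KEY_ID="$MY_ACCESS_KEY"      # set elsewhere
export RCLONE_CONFIG_UNIBOS3_SECRET_ACCESS_KEY="$MY_SECRET_KEY"  # set elsewhere
export RCLONE_CONFIG_UNIBOS3_ENDPOINT="$MY_ENDPOINT"             # set elsewhere

rclone ls unibos3:<your-bucket-name>
```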

Troubleshooting

Access denied

Check:

  • Correct bucket name
  • Valid credentials
  • Project authorization

Slow transfers

Check:

  • Network connectivity
  • File size and number
  • Use of appropriate rclone options

If problems persist, contact support with error messages.
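
When reporting a problem, verbose output usually contains the relevant error; re-run the failing command with debug logging and attach the log:

```shell
# -vv enables debug-level logging; --log-file captures it for support
rclone ls unibo-s3:<your-bucket-name> -vv --log-file rclone-debug.log
```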


Next steps