Accessing Allas
The four main options for accessing Allas are:
- Web browser interfaces
- Command-line tools
- Graphical tools
- Other tools: Python and R libraries, etc.
The tool lists below are not exhaustive: any tool that supports the Swift or S3 protocol can in principle be used with Allas. You can also use the different Allas clients interchangeably, as long as you access Allas with the same protocol (Swift or S3).
When choosing the tool for accessing Allas, consider:
- Ease of getting started: the web interfaces require no installation, and the connection configuration is easy.
- Ease of use: the web interfaces and graphical tools are in general easier to use for basic tasks.
- The amount of data you have to move: the web interfaces are not suitable for big data transfers.
- Your other workflow: the Python or R libraries may be useful if you already use these programming languages for other tasks.
- The operating system of your local machine: some command-line and graphical tools support only Linux/Mac or only Windows.
- The Allas protocol of your choice: many of the command-line and graphical tools support only Swift or only S3.
- Packaging of files: when moving many files, a-tools packages them by default into a .tar file and adds metadata; other tools usually move files as they are (see the sketch after this list).
- Sensitivity of your data: for sensitive data, use tools that support client-side encryption.
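For example, a minimal sketch of this packaging difference (the directory, bucket, and rclone remote names are illustrative assumptions):

```bash
# a-commands: uploading a directory packages it into a single .tar object by default
a-put my-directory

# rclone: files are copied as-is, one object per file
# (assumes a remote named "allas" has been configured)
rclone copy my-directory allas:my-bucket/my-directory
```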
To use Allas from Puhti or Mahti, see Tutorial for using Allas in CSC supercomputers.
Web browser interfaces
At the moment CSC provides several web browser interfaces for Allas:
Web interface | Instructions | SWIFT support | S3 support | Use cases | Limits |
---|---|---|---|---|---|
Allas web UI | Instructions | ✔ | | General first choice, share data with another project | Max 5 GB files |
Puhti web UI | Instructions | ✔ | ✔ | Moving data to/from Puhti or local, also S3 usage and LUMI-O | Max 10 GB file uploads from local |
Mahti web UI | Instructions | ✔ | ✔ | Moving data to/from Mahti or local, also S3 usage and LUMI-O | Max 10 GB file uploads from local |
cPouta web UI | Instructions | ✔ | | Make your bucket public | Max 5 GB files, uploading/downloading only a single file at a time |
SD Connect | Instructions | ✔ | | Sensitive data | |
Command-line tools
To access Allas with command line commands, client software supporting the Swift or S3 protocol is required. This is the most flexible way to access Allas, but it is a little bit more complicated to get started.
Tools | SWIFT support | S3 support | Linux/Mac | Windows |
---|---|---|---|---|
a-commands | ✔ | ✔ | ✔ | - |
rclone | ✔ | ✔ | ✔ | ✔ |
swift python-swiftclient | ✔ | | ✔ | |
s3cmd | | ✔ | ✔ | |
aws-cli | | ✔ | ✔ | ✔ |
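As a first sketch of basic usage, the commands below assume that Allas credentials have already been configured for each client (and, for rclone, that a remote named "allas" exists); the bucket and file names are illustrative:

```bash
# List your buckets with rclone (Swift or S3, depending on the remote configuration)
rclone lsd allas:

# Upload a file with the swift client (Swift protocol)
swift upload my-bucket data.csv

# Download an object with s3cmd (S3 protocol)
s3cmd get s3://my-bucket/data.csv
```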
Additionally, for example curl and wget can be used for downloading public objects or objects with temporary URLs.
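For instance, assuming an object has been made public, it can be fetched via the public Allas endpoint (the bucket and object names here are illustrative):

```bash
# Download a public object with wget
wget https://a3s.fi/my-bucket/data.csv

# The same with curl, saving under the original object name
curl -O https://a3s.fi/my-bucket/data.csv
```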
Graphical tools
Tools | SWIFT support | S3 support | Linux/Mac | Windows |
---|---|---|---|---|
Cyberduck | ✔ | ✔ | ✔ | ✔ |
WinSCP | | ✔ | | ✔ |
S3browser | | ✔ | | ✔ |
WinSCP generally has rather slow data transfer speeds for S3, so it is likely not suitable for larger amounts of data.
Other tools: Python and R libraries, etc.
- Python:
    - The boto3 library can be used for working with Allas with the S3 protocol.
    - The python-swiftclient library can be used with the Swift protocol.
- R:
    - The aws.s3 R package can be used for working with Allas with the S3 protocol.
    - A geoscience-related example of how Allas can be used in R scripts, including aws.s3 setup.
- Nextcloud front end: can be set up in Pouta to get additional functionality.
These Python and R libraries can be installed on all operating systems.
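A minimal installation sketch (boto3 and python-swiftclient as mentioned above; aws.s3 from CRAN):

```bash
# Python: boto3 for the S3 protocol, python-swiftclient for Swift
pip install boto3 python-swiftclient

# R: aws.s3 for the S3 protocol
Rscript -e 'install.packages("aws.s3")'
```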
Clients comparison
A web client is suitable for the basic functions. a-commands offer easy-to-use functions for using Allas either on a personal computer or on a supercomputer. Power users might want to consider the clients rclone, swift and s3cmd. The table below displays the core data management functions of these clients in Allas.
 | Allas Web UI | a-commands | rclone | Swift | s3cmd |
---|---|---|---|---|---|
Usage | Basic | Basic | Power | Power | Power |
Create buckets | ✔ | ✔ | ✔ | ✔ | ✔ |
Upload objects | ✔ | ✔ | ✔ | ✔ | ✔ |
List objects | ✔ | ✔ | ✔ | ✔ | ✔ |
List buckets | ✔ | ✔ | ✔ | ✔ | ✔ |
Download objects | ✔ | ✔ | ✔ | ✔ | ✔ |
Download buckets | ✔ | ✔ | ✔ | ✔ | |
Remove objects | ✔ | ✔ | ✔ | ✔ | ✔ |
Remove buckets | ✔•• | ✔ | ✔ | ✔ | ✔•• |
Set public/private access | ✔ | ✔ | ✔ | | |
Give read/write access to another project | ✔ | ✔ | ✔ | ✔ | |
Create temp URLs | ✔ | ✔ | | | |
Set lifecycle policies | ✔ | | | | |
Move objects | ✔ | ✔ | | | |
Edit metadata | ✔ | ✔ | | | |
Download whole project | ✔ | ✔ | | | |
Remove whole project | ✔ | ✔ | | | |
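To illustrate the difference in style between the power clients, the same basic operation looks like this with each of them (the bucket name is illustrative; rclone again assumes a remote named "allas"):

```bash
# Create a new bucket with each power client
rclone mkdir allas:my-bucket   # rclone
swift post my-bucket           # swift: creates the container if it does not exist
s3cmd mb s3://my-bucket        # s3cmd
```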
Files larger than 5 GB
Files larger than 5 GB are divided into smaller segments during upload.
- Most tools split large files automatically
- With Swift, you can use Static Large Objects: see swift with large files
After upload, s3cmd connects these segments into one large object, but in the case of Swift-based uploads (a-put, rclone, swift), the large files remain stored as several objects. This is done automatically in a bucket that is named by adding the extension _segments to the original bucket name. For example, if you used a-put to upload a large file to the bucket 123-dataset, the actual data would be stored as several pieces in the bucket 123-dataset_segments. The target bucket 123-dataset would contain just a front object that holds information about which segments make up the stored file. Operations performed on the front object are automatically reflected in the segments. Normally, users do not need to operate on the segments buckets at all, and the objects inside these buckets should not be deleted or modified.
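As an illustrative sketch (bucket and file names are hypothetical), listing both buckets with the swift client shows the front object and its segments separately:

```bash
# After uploading a large file, e.g. big.dat, to 123-dataset with a-put:
swift list 123-dataset            # shows the front object
swift list 123-dataset_segments   # shows the individual data segments
```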
It is important not to mix Swift and S3, as these protocols are not fully mutually compatible.