Platform Configuration
Writing Platform Configurations
Added in version 8.0.0.
Listing available platforms
If you are working on an institutional network, platforms may already have been configured for you.
To see a list of available platforms:
cylc config --platform-names
To see the full configuration of available platforms:
cylc config --platforms
This is equivalent to cylc config -i 'platforms' -i 'platform groups'
What Are Platforms?
Platforms define settings, most importantly:
A set of hosts.
A job runner (formerly a batch system) where Cylc can submit a job.
An install target for Cylc to install job files on.
Why Were Platforms Introduced?
Allow a compute cluster with multiple login nodes to be treated as a single unit.
Allow Cylc to elegantly handle failure to communicate with login nodes.
Reduce the number of ssh connections required for job submission and polling.
What Are Install Targets?
Install targets represent file systems. More than one platform can use the
same file system. Cylc relies on the site configuration file global.cylc
to determine
which platforms share install targets.
Cylc will set up each remote install target once. During setup it will:
Install workflow files
Symlink directories
Copy authentication keys (to allow secure communication)
Note
If the install target is missing from the configuration, it will default to
the platform name. An incorrectly configured install target will cause
errors during Remote Initialization.
If you log into one system and see the same files as on another, then the
two platforms share a file system and should be given the same install
target in the global.cylc configuration file.
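As an illustrative sketch (the platform and host names here are
hypothetical), two platforms that see the same file system can be pointed
at a single install target in global.cylc:
[platforms]
    [[cluster]]
        hosts = login1
        job runner = slurm
        # install target defaults to the platform name, "cluster"
    [[cluster_background]]
        hosts = login1
        # shares a file system with "cluster", so reuse its install target
        install target = cluster
With this configuration Cylc installs workflow files only once, on the
"cluster" install target, for jobs submitted to either platform.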
Example Platform Configurations
Detailed below are some examples of common platform configurations.
Submit PBS Jobs from Localhost
The scheduler runs on the localhost platform.
Platforms can share hosts without sharing job runners.
Scenario
You have a cluster where you can submit jobs from the Cylc scheduler host using PBS.
The localhost platform is the Cylc Scheduler host, as configured in
global.cylc[scheduler][run hosts]available. This is the host that the
workflow will start on. For more information, see Platform Configuration.
Our platform pbs_cluster
shares this localhost
host and setting the
install target to localhost
ensures that Cylc knows this platform does not
require remote initialization.
[platforms]
    # The localhost platform is available by default
    # [[localhost]]
    #     hosts = localhost
    #     install target = localhost
    [[pbs_cluster]]
        hosts = localhost
        job runner = pbs
        install target = localhost
Our localhost platform does not have a job runner defined, so any job
submitted to it will run as a background job. Users can now set
flow.cylc[runtime][<namespace>]platform = pbs_cluster to run PBS jobs.
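For example, a task in flow.cylc could select this platform as follows
(the task name is just a placeholder):
[runtime]
    [[my_pbs_task]]
        platform = pbs_cluster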
Note
Both hosts
and install target
default to the platform name.
Multiple Platforms Sharing File System with Cylc Scheduler
Platform names can be defined as regular expressions.
Scenario
Everyone in your organization has a computer called desktopNNN, all of
which share a file system with the scheduler host. Many users will want
their desktop set up as a platform to run small jobs.
In this scenario, Cylc does not need to install files on the desktop, since
required files which are on the scheduler host will be accessible on the
desktop. From Cylc’s point of view, the desktop and scheduler hosts are
considered different platforms but must share an install target.
Cylc needs to be told that these platforms share an install target and so we
configure this using the designated configuration item:
global.cylc[platforms][<platform name>]install target
.
global.cylc[platforms][<platform name>]
has optional configuration
[[[meta]]]
which users can view with cylc config --platforms
. We will add
a description designed to help users in this example.
The following platform definition is simplified, taking advantage of the
default for hosts, which is the platform name.
[platforms]
    [[desktop\d\d\d]]
        install target = localhost
        [[[meta]]]
            description = "Background job on a desktop system"
As before, a localhost
platform is available by default.
desktop\d\d\d
is a pattern which defines multiple platforms.
When using a pattern the “hosts” setting must be left unset so that it defaults
to the platform name. This ensures each of the matching platforms is unique.
Note
Cylc carries out a “fullmatch” regular expression comparison with the
platform name, so desktop\d\d\d is effectively the same as ^desktop\d\d\d$.
If a user wants to run a job on their local desktop, e.g. “desktop123”, they should set:
[runtime]
    [[mytask]]
        platform = desktop123
in their workflow configuration.
If [runtime][mytask]platform
is unset, the job will run on the Cylc
Scheduler host using this default localhost
platform.
Neither platform will require remote initialization, as the install target
is set to localhost.
Cluster with Multiple Login Nodes
Platforms with multiple hosts require a job runner to be set.
Platforms can group multiple hosts together.
Scenario
You have a cluster where users submit jobs to Slurm from either of a pair of identical login nodes which share a file system.
[platforms]
    [[slurm_cluster]]
        hosts = login_node_1, login_node_2
        job runner = slurm
        retrieve job logs = True
The slurm_cluster
hosts do not share a file system with the scheduler,
therefore slurm_cluster
is a remote platform.
As the install target
setting for this platform has been omitted, this will
default to the platform name.
Cylc will initiate a remote installation to transfer the required files to
slurm_cluster before the first job is submitted on that platform.
If one of the login nodes becomes unavailable, Cylc will attempt to
communicate with jobs via the other login node.
With multiple hosts defined under slurm_cluster
, a job runner is required.
Note
The “background” and “at” job runners require single-host platforms, because the job ID is only valid on the submission host.
We have set retrieve job logs = True
. This will ensure our job logs are
fetched from the slurm_cluster
platform. This setting is recommended for
all remote platforms (i.e. where install target is not localhost).
Grouping Platforms
Platform groups allow users to ask for jobs to be run on any suitable computer.
Scenario
Extending the example from above, we now wish to set up slurm_cluster so
that its login nodes can also accept background jobs.
We would like to group these background platforms together so users can set
flow.cylc[runtime][<namespace>]platform = slurm_cluster_bg.
[platforms]
    # settings that apply to all three platforms:
    [[slurm_cluster, slurm_cluster_bg1, slurm_cluster_bg2]]
        install target = slurm_cluster
        retrieve job logs = True
    [[slurm_cluster]]
        job runner = slurm
        hosts = login_node_1, login_node_2
    [[slurm_cluster_bg1]]
        hosts = login_node_1
    [[slurm_cluster_bg2]]
        hosts = login_node_2
[platform groups]
    [[slurm_cluster_bg]]
        platforms = slurm_cluster_bg1, slurm_cluster_bg2
Group platforms together using the configuration item
global.cylc[platform groups]
. In the above example, the
slurm_cluster_bg
platforms all share a file system
(install target = slurm_cluster
). We advise caution when grouping platforms
with different install targets: a task could find that files created by an
earlier task using the same platform group are not available to it.
With the above configuration, users can now run background jobs on either of the login nodes, without the concern of selecting a specific platform.
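For example, in flow.cylc (again, the task name is just a placeholder):
[runtime]
    [[housekeeping]]
        # Cylc will pick a platform from the slurm_cluster_bg group
        platform = slurm_cluster_bg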
Warning
Platforms and platform groups are both configured by
flow.cylc[runtime][<namespace>]platform
.
Therefore a platform group cannot be given the same name as a platform.
The global.cylc
file will fail validation if the same name is
used for both.
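For instance, a configuration along these lines (a deliberately broken
variant of the example above) would fail validation, because the same name
is used for both a platform and a platform group:
[platforms]
    [[slurm_cluster_bg]]
        hosts = login_node_1
[platform groups]
    [[slurm_cluster_bg]]  # same name as the platform above: validation fails
        platforms = slurm_cluster_bg1, slurm_cluster_bg2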
Symlinking Directories
To minimize the disk space used by ~/cylc-run
, set
global.cylc[install][symlink dirs]
to offload files onto other
locations. The entire run directory can be symlinked, as well as
certain sub-directories:
run - the run directory itself
log
share (see share directory)
share/cycle (typically used by Rose tasks)
work (see work directory)
These should be configured per install target.
For example, to configure workflow log
directories (on the
scheduler host) so that they symlink to a different location,
you could write the following in global.cylc
:
[install]
    [[symlink dirs]]
        [[[localhost]]]
            log = /somewhere/else
This would result in the following file structure on the Cylc scheduler host:
~/cylc-run
└── myflow
├── flow.cylc
├── log -> /somewhere/else/cylc-run/myflow/log
...
/somewhere
└── else
└── cylc-run
└── myflow
└── log
├── flow-config
├── install
...
These localhost
symlinks are created during the cylc install process.
Symlinks for remote install targets are created during Remote Initialization following
cylc play
.
Advanced Platform Examples
Platform with no $HOME directory
Scenario
You are trying to run jobs on a platform where the compute nodes don’t
have a configured HOME
directory.
So long as the login and compute nodes share a filesystem the workflow can be
installed on the shared filesystem using
global.cylc[install][symlink dirs]
.
The $CYLC_RUN_DIR
variable can then be set on the compute node to point
at the cylc-run
directory on the shared filesystem using
global.cylc[platforms][<platform name>]global init-script
.
[platforms]
    [[homeless-hpc]]
        job runner = my-job-runner
        install target = homeless-hpc
        global init-script = """
            export CYLC_RUN_DIR=/shared/filesystem/cylc-run
        """
[install]
    [[symlink dirs]]
        [[[homeless-hpc]]]
            run = /shared/filesystem/
In this example Cylc will install workflows into
/shared/filesystem/cylc-run
.
Note
If you are running schedulers directly on the login node
and submitting jobs locally then the platform name and install target should
be localhost
.