Writing Platform Configurations
New in version 8.0.0.
Listing available platforms
If you are working on an institutional network, platforms may already have been configured for you.
To see a list of available platforms:
cylc config --platform-names
To see the full configuration of available platforms:
cylc config --platforms
This is equivalent to:
cylc config -i 'platforms' -i 'platform groups'
What Are Platforms?
Platforms define settings, most importantly:
A set of hosts.
A job runner (formerly a batch system) where Cylc can submit a job.
An install target for Cylc to install job files on.
Why Were Platforms Introduced?
Allow a compute cluster with multiple login nodes to be treated as a single unit.
Allow Cylc to elegantly handle failure to communicate with login nodes.
Reduce the number of ssh connections required for job submission and polling.
What Are Install Targets?
Install targets represent file systems. More than one platform can use the
same file system. Cylc relies on the site configuration file
global.cylc to determine
which platforms share install targets.
Cylc will set up each remote install target once. During setup it will:
Install workflow files
Copy authentication keys (to allow secure communication)
Note, if missing from configuration, the install target will default to the platform name. If incorrectly configured, this will cause errors in Remote Initialization.
If you log into one system and see the same files as on another, then these two
platforms will require the same install target in the
global.cylc config file.
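The defaulting rule above can be illustrated with a short Python sketch. It is not Cylc's implementation: the dictionary layout, the platform names (cluster_a, cluster_b, desktop) and both function names are hypothetical, chosen only to show how platforms end up grouped by install target when the setting is omitted.

```python
from collections import defaultdict

# Hypothetical, simplified stand-in for the [platforms] section of global.cylc.
platforms = {
    "cluster_a": {"install target": "cluster_a"},
    "cluster_b": {"install target": "cluster_a"},  # same file system as cluster_a
    "desktop": {},                                 # install target omitted
}


def install_target(name, settings):
    """Return the install target, defaulting to the platform name."""
    return settings.get("install target", name)


def group_by_install_target(platforms):
    """Map each install target to the platforms that use it."""
    groups = defaultdict(list)
    for name, settings in platforms.items():
        groups[install_target(name, settings)].append(name)
    return dict(groups)


print(group_by_install_target(platforms))
```

Platforms listed under the same key would be set up together, once; a platform that really shares a file system but appears under its own key is the misconfiguration the note above warns about.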
Example Platform Configurations
Detailed below are some examples of common platform configurations.
Submit PBS Jobs from Localhost
The scheduler runs on the localhost platform.
Platforms can share hosts without sharing job runners.
You have a cluster where you can submit jobs from the Cylc scheduler host using PBS.
The localhost platform is the Cylc scheduler host, as configured in
global.cylc[scheduler][run hosts]available. This is the host that
the workflow will start on.
The pbs_cluster platform shares this localhost host, and setting the
install target to localhost ensures that Cylc knows this platform does not
require remote initialization.
[platforms]
    # The localhost platform is available by default
    # [[localhost]]
    #     hosts = localhost
    #     install target = localhost
    [[pbs_cluster]]
        hosts = localhost
        job runner = pbs
        install target = localhost
Our Cylc scheduler does not have a job runner defined. Any job submitted to
the localhost platform will run as a background job. Users can now set
platform = pbs_cluster in a task's [runtime] section to run PBS jobs.
Both hosts and install target default to the platform name.
Multiple Platforms Sharing File System with Cylc Scheduler
Platform names can be defined as regular expressions.
Everyone in your organization has a computer called desktopNNN (where NNN is
a three-digit number), all with a file system shared with the scheduler host.
Many users will want their desktop set up as a platform to run small jobs.
In this scenario, Cylc does not need to install files on the desktop, since
required files which are on the scheduler host will be accessible on the
desktop. From Cylc’s point of view, the desktop and scheduler hosts are
considered different platforms but must share an install target.
Cylc needs to be told that these platforms share an install target and so we
configure this using the designated configuration item:
global.cylc[platforms][<platform name>]install target.
global.cylc[platforms][<platform name>] has optional configuration
[[[meta]]] which users can view with
cylc config --platforms. We will add
a description designed to help users in this example.
The following platform definition is simplified, taking advantage of defaults:
[platforms]
    [[desktop\d\d\d]]
        install target = localhost
        [[[meta]]]
            description = "Background job on a desktop system"
As before, a
localhost platform is available by default.
desktop\d\d\d is a pattern which defines multiple platforms.
When using a pattern the “hosts” setting must be left unset so that it defaults
to the platform name. This ensures each of the matching platforms is unique.
Cylc carries out a “fullmatch” regular expression comparison with
the platform name, so
desktop\d\d\d is effectively the same as ^desktop\d\d\d$.
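The effect of a “fullmatch” comparison can be seen with Python's standard re module. This is a minimal sketch of the matching semantics only, not Cylc's own lookup code; the helper name matches_platform is hypothetical.

```python
import re

# Pattern taken from the platform definition in this example.
PLATFORM_PATTERN = r"desktop\d\d\d"


def matches_platform(name, pattern=PLATFORM_PATTERN):
    """Return True only if the whole of `name` matches `pattern`."""
    return re.fullmatch(pattern, name) is not None


for candidate in ("desktop123", "desktop1234", "mydesktop123", "desktop12"):
    print(candidate, matches_platform(candidate))
```

Only desktop123 matches: names with extra leading or trailing characters, or too few digits, are rejected because the pattern must cover the entire name.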
If a user wants to run a job on their local desktop, e.g. “desktop123”, they should set:
[runtime]
    [[mytask]]
        platform = desktop123
in their workflow configuration.
If [runtime][mytask]platform is unset, the job will run on the Cylc
scheduler host using the default localhost platform.
Neither platform will require remote initialization as the install target
is set to localhost.
Cluster with Multiple Login Nodes
Platforms with multiple hosts require a job runner to be set.
Platforms can group multiple hosts together.
You have a cluster where users submit jobs to Slurm from either of a pair of identical login nodes which share a file system.
[platforms]
    [[slurm_cluster]]
        hosts = login_node_1, login_node_2
        job runner = slurm
        retrieve job logs = True
The slurm_cluster hosts do not share a file system with the scheduler,
therefore slurm_cluster is a remote platform.
As the install target setting for this platform has been omitted, it will
default to the platform name.
Cylc will initiate a remote installation, to transfer required files to
slurm_cluster which will commence before job submission for the first job
on that platform.
Cylc will attempt to communicate with jobs via the other login node if either of the login_nodes becomes unavailable.
With multiple hosts defined under
slurm_cluster, a job runner is required.
The “background” and “at” job runners require single-host platforms, because the job ID is only valid on the submission host.
We have set
retrieve job logs = True. This will ensure our job logs are
fetched from the
slurm_cluster platform. This setting is recommended for
all remote platforms (i.e. where install target is not localhost).
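The single-host requirement for the “background” and “at” job runners can be expressed as a simple check. The sketch below is illustrative only: the dictionary layout, the function name check_platform and the platform name bad_cluster are hypothetical, and it assumes that a platform with no job runner set submits background jobs.

```python
# Job runners whose job IDs are only valid on the submission host.
SINGLE_HOST_JOB_RUNNERS = {"background", "at"}


def check_platform(name, settings):
    """Return a list of problems found in one hypothetical platform dict."""
    hosts = settings.get("hosts", [name])  # hosts defaults to the platform name
    job_runner = settings.get("job runner", "background")  # assumed default
    problems = []
    if len(hosts) > 1 and job_runner in SINGLE_HOST_JOB_RUNNERS:
        problems.append(
            f"{name}: job runner '{job_runner}' needs a single-host platform, "
            f"but {len(hosts)} hosts are listed"
        )
    return problems


good = {"hosts": ["login_node_1", "login_node_2"], "job runner": "slurm"}
bad = {"hosts": ["login_node_1", "login_node_2"]}  # no job runner set

print(check_platform("slurm_cluster", good))
print(check_platform("bad_cluster", bad))
```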
Platform groups allow users to ask for jobs to be run on any suitable computer.
Extending the example from above, we now wish to set the slurm_cluster
up such that
slurm_cluster nodes can accept background jobs.
We would like to group these background platforms together so users can set
platform = slurm_cluster_bg.
[platforms]
    [[slurm_cluster, slurm_cluster_bg1, slurm_cluster_bg2]]
        # settings that apply to all:
        install target = slurm_cluster
        retrieve job logs = True
    [[slurm_cluster]]
        job runner = slurm
        hosts = login_node_1, login_node_2
    [[slurm_cluster_bg1]]
        hosts = login_node_1
    [[slurm_cluster_bg2]]
        hosts = login_node_2

[platform groups]
    [[slurm_cluster_bg]]
        platforms = slurm_cluster_bg1, slurm_cluster_bg2
Group platforms together using the configuration item
global.cylc[platform groups]. In the above example, the
slurm_cluster_bg platforms all share a file system
(install target =
slurm_cluster). We advise caution when grouping platforms
with different install targets as users could encounter a scenario whereby
files (created by a previous task using the same platform group) are
not available to them.
With the above configuration, users can now run background jobs on either of the login nodes, without the concern of selecting a specific platform.
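The caution about mixing install targets within a group can be checked mechanically. The following is a rough sketch under stated assumptions, not part of Cylc: the dictionaries mirror the example configuration above in simplified form, and group_install_targets is a hypothetical helper.

```python
# Simplified, hypothetical view of the example configuration above.
platforms = {
    "slurm_cluster": {"install target": "slurm_cluster"},
    "slurm_cluster_bg1": {"install target": "slurm_cluster"},
    "slurm_cluster_bg2": {"install target": "slurm_cluster"},
}
platform_groups = {
    "slurm_cluster_bg": ["slurm_cluster_bg1", "slurm_cluster_bg2"],
}


def group_install_targets(group, platform_groups, platforms):
    """Return the set of install targets used by the platforms in a group."""
    targets = set()
    for name in platform_groups[group]:
        # An omitted install target defaults to the platform name.
        targets.add(platforms[name].get("install target", name))
    return targets


targets = group_install_targets("slurm_cluster_bg", platform_groups, platforms)
if len(targets) > 1:
    print("Warning: group spans several file systems:", sorted(targets))
else:
    print("Group shares one install target:", sorted(targets))
```

A group that yields more than one install target is the case where files written by an earlier task might not be visible to a later task in the same group.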
Advanced Platform Example
Platform with no $HOME directory
You are trying to run jobs on a platform where the compute nodes don’t
have a configured HOME directory.
So long as the login and compute nodes share a filesystem the workflow can be
installed on the shared filesystem using global.cylc[install][symlink dirs].
The $CYLC_RUN_DIR variable can then be set on the compute node to point
at the cylc-run directory on the shared filesystem using
global.cylc[platforms][<platform name>]global init-script.
[platforms]
    [[homeless-hpc]]
        job runner = my-job-runner
        install target = homeless-hpc
        global init-script = """
            export CYLC_RUN_DIR=/shared/filesystem/cylc-run
        """

[install]
    [[symlink dirs]]
        [[[homeless-hpc]]]
            run = /shared/filesystem/
In this example Cylc will install workflows into /shared/filesystem/cylc-run.
If you are running schedulers directly on the login node
and submitting jobs locally then the platform name and install target should