Platform Configuration

Writing Platform Configurations

Added in version 8.0.0.

Listing available platforms

If you are working on an institutional network, platforms may already have been configured for you.

To see a list of available platforms:

cylc config --platform-names

To see the full configuration of available platforms:

cylc config --platforms

This is equivalent to cylc config -i 'platforms' -i 'platform groups'

What Are Platforms?

Platforms define settings, most importantly:

A set of hosts.

A job runner (formerly a batch system) where Cylc can submit a job.

An install target for Cylc to install job files on.

Why Were Platforms Introduced?

Allow a compute cluster with multiple login nodes to be treated as a single unit.
Allow Cylc to elegantly handle failure to communicate with login nodes.
Reduce the number of ssh connections required for job submission and polling.

What Are Install Targets?

Install targets represent file systems. More than one platform can use the same file system. Cylc relies on the site configuration file global.cylc to determine which platforms share install targets.

Cylc will set up each remote install target once. During setup it will:

Install workflow files

Symlink directories

Copy authentication keys (to allow secure communication)

Note: if the install target is not set in the configuration, it defaults to the platform name. An incorrectly configured install target will cause errors in Remote Initialization.

If you log into one system and see the same files as on another, then these two platforms require the same install target in the global.cylc config file.
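
As a minimal sketch of this rule, assume two hypothetical platforms, cluster_a and cluster_b, whose login nodes mount the same file system. The platform names, host names and the install target name shared_fs are illustrative only and do not come from any real site configuration:

```cylc
[platforms]
    # Hypothetical platforms whose hosts see the same files.
    [[cluster_a]]
        hosts = login_a
        # Same install target name on both platforms (assumed name).
        install target = shared_fs
    [[cluster_b]]
        hosts = login_b
        install target = shared_fs
```

Because both platforms name the same install target, Cylc should only need to set up that file system once for a given workflow.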

Example Platform Configurations

Detailed below are some examples of common platform configurations.

Submit PBS Jobs from Localhost

The scheduler runs on the localhost platform.
Platforms can share hosts without sharing job runners.

Scenario

You have a cluster where you can submit jobs from the Cylc scheduler host using PBS.

The localhost platform is the Cylc Scheduler host, as configured in global.cylc[scheduler][run hosts]available. This is the host that the workflow will start on. For more information, see Platform Configuration.

Our platform pbs_cluster shares this localhost host and setting the install target to localhost ensures that Cylc knows this platform does not require remote initialization.

part of a global.cylc config file

[platforms]
    # The localhost platform is available by default
    # [[localhost]]
    #     hosts = localhost
    #     install target = localhost
    [[pbs_cluster]]
        hosts = localhost
        job runner = pbs
        install target = localhost

The localhost platform does not have a job runner defined, so any job submitted to it will run as a background job. Users can now set flow.cylc[runtime][<namespace>]platform = pbs_cluster to run PBS jobs.
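
For illustration, a workflow might select this platform as sketched below. This is a fragment of a flow.cylc file, not a complete workflow; the task name my_task and its script are hypothetical:

```cylc
[runtime]
    # "my_task" is a made-up task name used only for this sketch.
    [[my_task]]
        # Submit this task's job via the pbs_cluster platform
        # defined in global.cylc above.
        platform = pbs_cluster
        script = echo "Hello from a PBS job"
```

Tasks that do not set a platform would be expected to run on the localhost platform as background jobs.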

Note

Both hosts and install target default to the platform name.

Cluster with Multiple Login Nodes

Platforms with multiple hosts require a job runner to be set.
Platforms can group multiple hosts together.

Scenario

You have a cluster where users submit jobs to Slurm from either of a pair of identical login nodes which share a file system.

part of a global.cylc config file

[platforms]
    [[slurm_cluster]]
        hosts = login_node_1, login_node_2
        job runner = slurm
        retrieve job logs = True

The slurm_cluster hosts do not share a file system with the scheduler, so slurm_cluster is a remote platform. Because the install target setting has been omitted for this platform, it defaults to the platform name. Cylc will perform a remote installation to transfer the required files to slurm_cluster; this takes place before the first job is submitted to that platform.

Cylc will attempt to communicate with jobs via the other login node if either of the login nodes becomes unavailable.

With multiple hosts defined under slurm_cluster, a job runner is required.

Note

The “background” and “at” job runners require single-host platforms, because the job ID is only valid on the submission host.

We have set retrieve job logs = True. This will ensure our job logs are fetched from the slurm_cluster platform. This setting is recommended for all remote platforms (i.e. where install target is not localhost).

Grouping Platforms

Platform groups allow users to ask for jobs to be run on any suitable computer.

Scenario

Extending the example from above, we now wish to set the slurm_cluster up such that slurm_cluster nodes can accept background jobs. We would like to group these background platforms together so users can set flow.cylc[runtime][<namespace>]platform = slurm_cluster_bg.

part of a global.cylc config file

[platforms]
    [[slurm_cluster, slurm_cluster_bg1, slurm_cluster_bg2]]  # settings that apply to all:
        install target = slurm_cluster
        retrieve job logs = True
    [[slurm_cluster]]
        job runner = slurm
        hosts = login_node_1, login_node_2
    [[slurm_cluster_bg1]]
        hosts = login_node_1
    [[slurm_cluster_bg2]]
        hosts = login_node_2
[platform groups]
    [[slurm_cluster_bg]]
        platforms = slurm_cluster_bg1, slurm_cluster_bg2

Group platforms together using the configuration item global.cylc[platform groups]. In the above example, the slurm_cluster_bg platforms all share a file system (install target = slurm_cluster). We advise caution when grouping platforms with different install targets as users could encounter a scenario whereby files (created by a previous task using the same platform group) are not available to them.

With the above configuration, users can now run background jobs on either of the login nodes, without the concern of selecting a specific platform.
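
A task could then request the group rather than an individual platform. The sketch below is a flow.cylc fragment; the task name housekeeping and its script are hypothetical:

```cylc
[runtime]
    # "housekeeping" is a made-up task name used only for this sketch.
    [[housekeeping]]
        # Name the platform group, not a specific platform. Cylc picks
        # one of slurm_cluster_bg1 or slurm_cluster_bg2 for the job.
        platform = slurm_cluster_bg
        script = echo "Running as a background job on a login node"
```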

Warning

Platforms and platform groups are both selected using flow.cylc[runtime][<namespace>]platform. A platform group therefore cannot be given the same name as a platform. The global.cylc file will fail validation if the same name is used for both.

Symlinking Directories

To minimize the disk space used by ~/cylc-run, set global.cylc[install][symlink dirs] to offload files onto other locations. The entire run directory can be symlinked, as well as certain sub-directories.

run - the run directory itself
log
log/job - contains job scripts and outputs.
share (see share directory)
share/cycle (typically used by Rose tasks)
work (see work directory)

These should be configured per install target.

For example, to configure workflow log directories (on the scheduler host) so that they symlink to a different location, you could write the following in global.cylc:

[install]
    [[symlink dirs]]
        [[[localhost]]]
            log = /somewhere/else

This would result in the following file structure on the Cylc scheduler host:

~/cylc-run
└── myflow
    ├── flow.cylc
    ├── log -> /somewhere/else/cylc-run/myflow/log
    ...

/somewhere
└── else
    └── cylc-run
        └── myflow
            └── log
                ├── flow-config
                ├── install
                ...

These localhost symlinks are created during the cylc install process. Symlinks for remote install targets are created during Remote Initialization following cylc play.
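
The same pattern can be sketched for a remote install target. The fragment below assumes the slurm_cluster install target from the earlier examples; the /scratch/projects location is hypothetical:

```cylc
[install]
    [[symlink dirs]]
        # One section per install target, as advised above.
        [[[slurm_cluster]]]
            # Hypothetical location with more space than $HOME.
            share = /scratch/projects
            work = /scratch/projects
```

By analogy with the localhost example, the share and work directories of a workflow named myflow would be expected to link to /scratch/projects/cylc-run/myflow/share and /scratch/projects/cylc-run/myflow/work on the remote file system.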

Advanced Platform Examples

Platform with no $HOME directory

Scenario

You are trying to run jobs on a platform where the compute nodes don’t have a configured $HOME directory.

So long as the login and compute nodes share a filesystem, the workflow can be installed on the shared filesystem using global.cylc[install][symlink dirs].

The $CYLC_RUN_DIR variable can then be set on the compute node to point at the cylc-run directory on the shared filesystem using global.cylc[platforms][<platform name>]global init-script.

part of a global.cylc config file

[platforms]
    [[homeless-hpc]]
        job runner = my-job-runner
        install target = homeless-hpc
        global init-script = """
            export CYLC_RUN_DIR=/shared/filesystem/cylc-run
        """

[install]
    [[symlink dirs]]
        [[[homeless-hpc]]]
            run = /shared/filesystem/

In this example Cylc will install workflows into /shared/filesystem/cylc-run.

Note

If you are running schedulers directly on the login node and submitting jobs locally then the platform name and install target should be localhost.
