Global Configuration

global.cylc

The global configuration defines default Cylc Flow settings for a user or site.

To view your global config, run:

$ cylc config

Cylc will attempt to load the global configuration (global.cylc) from a hierarchy of locations, including the site directory (defaults to /etc/cylc/flow/) and the user directory (~/.cylc/flow/). For example at Cylc version 8.0.1, the hierarchy would be, in order of ascending priority:

<site-conf-path>/flow/global.cylc
<site-conf-path>/flow/8/global.cylc
<site-conf-path>/flow/8.0/global.cylc
<site-conf-path>/flow/8.0.1/global.cylc
~/.cylc/flow/global.cylc
~/.cylc/flow/8/global.cylc
~/.cylc/flow/8.0/global.cylc
~/.cylc/flow/8.0.1/global.cylc

Where <site-conf-path> is /etc/cylc/ by default but can be changed by setting CYLC_SITE_CONF_PATH.

A setting in a file lower down in the list will override the same setting from those higher up (but if a setting is present in a file higher up and not in any files lower down, it will not be overridden).
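The override rule can be illustrated with a minimal sketch. It assumes each file has already been parsed into a nested dictionary; the function name merge_layers and the sample values are hypothetical and are not part of Cylc.

```python
from copy import deepcopy


def merge_layers(layers):
    """Merge parsed config layers given in ascending priority order."""
    merged = {}
    for layer in layers:
        _merge_into(merged, layer)
    return merged


def _merge_into(base, override):
    # Sections (dicts) are merged recursively; plain values are replaced.
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            _merge_into(base[key], value)
        else:
            base[key] = deepcopy(value)


# Hypothetical site and user layers: the user file only sets "UTC mode".
site = {"scheduler": {"UTC mode": False, "process pool size": 4}}
user = {"scheduler": {"UTC mode": True}}
merged = merge_layers([site, user])
```

Here the user file overrides UTC mode, while process pool size survives from the site file because no later file sets it.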

The following environment variables can change the files which are loaded:

CYLC_CONF_PATH

If set this bypasses the default site/user configuration hierarchy used to load the Cylc Flow global configuration.

This should be set to a directory containing a global.cylc file.

CYLC_SITE_CONF_PATH

By default the site configuration is located in /etc/cylc/. For installations where this is not convenient, this path can be overridden by setting CYLC_SITE_CONF_PATH to point at another location.

Configuration for different Cylc components should be in sub-directories within this location.

For example to configure Cylc Flow you could do the following:

$CYLC_SITE_CONF_PATH/
`-- flow/
    `-- global.cylc
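As a rough illustration of the search order described above, the following sketch builds the list of candidate files. The function name global_config_paths is hypothetical, and the /etc/cylc default is taken from the description of CYLC_SITE_CONF_PATH; Cylc's own loader may differ in detail.

```python
from pathlib import PurePosixPath


def global_config_paths(version, environ, home):
    """Return candidate global.cylc paths in ascending priority order."""
    # CYLC_CONF_PATH bypasses the site/user hierarchy entirely.
    conf_path = environ.get("CYLC_CONF_PATH")
    if conf_path:
        return [PurePosixPath(conf_path) / "global.cylc"]

    site_root = PurePosixPath(environ.get("CYLC_SITE_CONF_PATH", "/etc/cylc")) / "flow"
    user_root = PurePosixPath(home) / ".cylc" / "flow"

    # "8.0.1" -> ["", "8", "8.0", "8.0.1"]
    parts = version.split(".")
    subdirs = [""] + [".".join(parts[: i + 1]) for i in range(len(parts))]

    return [
        root / sub / "global.cylc"
        for root in (site_root, user_root)
        for sub in subdirs
    ]
```

Calling global_config_paths("8.0.1", {}, "/home/me") gives the eight paths listed earlier, with site files first and user files last.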

Note

The global.cylc file can be templated using Jinja2 variables. See Jinja2.

Note

Prior to Cylc 8, global.cylc was named global.rc, but that name is no longer supported.

[scheduler]

Default values for entries in flow.cylc[scheduler] section. This should not be confused with scheduling in the flow.cylc file.

UTC mode
Type

boolean

Default

False

Default for flow.cylc[scheduler]UTC mode.

process pool size
Type

integer

Default

4

Maximum number of concurrent processes used to execute external job submission, event handlers, and job poll and kill commands - see Managing External Command Execution.

process pool timeout
Type

time interval

Default

PT10M

Interval after which long-running commands in the process pool will be killed - see Managing External Command Execution.

Note

The default is set quite high to avoid killing important processes when the system is under load.

auto restart delay
Type

time interval

Relates to Cylc’s auto stop-restart mechanism (see Auto Stop-Restart). When a host is set to automatically shutdown/restart it will first wait a random period of time between zero and auto restart delay seconds before beginning the process. This is to prevent large numbers of workflows from restarting simultaneously.
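The idea of spreading restarts over a random window can be sketched as follows. This is only an illustration of the principle; the function name pick_restart_wait is hypothetical and the delay is assumed to be given in seconds.

```python
import random


def pick_restart_wait(auto_restart_delay_seconds, rng=random):
    """Return a random wait, in seconds, between zero and the configured delay."""
    return rng.uniform(0, auto_restart_delay_seconds)


# With a 300 second delay, each workflow waits somewhere in [0, 300] seconds.
wait = pick_restart_wait(300, random.Random(42))
```

Because every workflow draws its own wait, restarts triggered at the same moment are spread across the window rather than landing together.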

[run hosts]

Configure workflow hosts and ports for starting workflows. Additionally configure host selection settings specifying how to determine the most suitable run host at any given time from those configured.

available
Type

spaceless list

A list of workflow run hosts. One of these hosts will be appointed for a workflow to start on if an explicit host is not provided as an option to the cylc play command.

ports
Type

integer list

Default

43001 .. 43100

A list of allowed ports for Cylc to use to run workflows.

condemned
Type

absolute host list

Hosts specified in condemned hosts will not be considered as workflow run hosts. If workflows are already running on condemned hosts they will be automatically shut down and restarted (see Auto Stop-Restart).

ranking
Type

string

Rank and filter run hosts based on system information.

This can be used to provide load balancing to ensure no one run host is overloaded and provide thresholds beyond which Cylc will not attempt to start new schedulers on a host.

This should be a multiline string containing Python expressions to rank and/or filter hosts. All psutil attributes are available for use in these expressions.

Ranking

Rankings are expressions which return numerical values. The host which returns the lowest value is chosen. Examples:

# rank hosts by cpu_percent
cpu_percent()

# rank hosts by 15min average of server load
getloadavg()[2]

# rank hosts by the number of cores
# (multiply by -1 because the lowest value is chosen)
-1 * cpu_count()

Threshold

Thresholds are expressions which return boolean values. If a host returns a False value that host will not be selected. Examples:

# filter out hosts with a CPU utilisation of 70% or above
cpu_percent() < 70

# filter out hosts with less than 1GB of RAM available
virtual_memory().available > 1000000000

# filter out hosts with less than 1GB of disk space
# available on the "/" mount
disk_usage('/').free > 1000000000

Combining

Multiple rankings and thresholds can be combined in this section, e.g.:

# filter hosts
cpu_percent() < 70
disk_usage('/').free > 1000000000

# rank hosts by CPU count
1 / cpu_count()
# if two hosts have the same CPU count
# then rank them by CPU usage
cpu_percent()
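
How thresholds and rankings interact can be sketched in a few lines. This is not Cylc's implementation: the function choose_host and the helper fake_metrics are hypothetical, and the metric values are stand-ins for what psutil would report on each host.

```python
from types import SimpleNamespace


def choose_host(host_metrics, thresholds, rankings):
    """Pick the host that passes every threshold and ranks lowest.

    host_metrics maps a host name to a dict of psutil-like callables.
    """
    candidates = []
    for host, metrics in host_metrics.items():
        scope = {"__builtins__": {}, **metrics}
        # A host is dropped if any threshold evaluates to False.
        if not all(eval(expr, scope) for expr in thresholds):
            continue
        # Later rankings only matter when earlier ones tie.
        score = tuple(eval(expr, scope) for expr in rankings)
        candidates.append((score, host))
    return min(candidates)[1] if candidates else None


def fake_metrics(cpu, cores, free_bytes):
    """Stand-in for values that psutil would report on a real host."""
    return {
        "cpu_percent": lambda: cpu,
        "cpu_count": lambda: cores,
        "disk_usage": lambda path: SimpleNamespace(free=free_bytes),
    }


hosts = {
    "host_a": fake_metrics(cpu=85, cores=16, free_bytes=5_000_000_000),
    "host_b": fake_metrics(cpu=20, cores=4, free_bytes=5_000_000_000),
    "host_c": fake_metrics(cpu=50, cores=8, free_bytes=5_000_000_000),
    "host_d": fake_metrics(cpu=30, cores=8, free_bytes=5_000_000_000),
}
thresholds = ["cpu_percent() < 70", "disk_usage('/').free > 1000000000"]
rankings = ["1 / cpu_count()", "cpu_percent()"]
best = choose_host(hosts, thresholds, rankings)
```

In this made-up data host_a fails the CPU threshold, host_c and host_d tie on core count, and host_d wins the tie because its CPU usage is lower.
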
[host self-identification]

The workflow host’s identity must be determined locally by Cylc and passed to running tasks (via $CYLC_WORKFLOW_HOST) so that task messages can target the right workflow on the right host.

method
Type

string

Default

name

Options

name, address, hardwired

Determines how Cylc finds the identity of the workflow host.

Options:

name

(The default method) Self-identified host name. Cylc asks the workflow host for its host name. This should resolve on task hosts to the IP address of the workflow host; if it doesn’t, adjust network settings or use one of the other methods.

address

Automatically determined IP address (requires target). Cylc attempts to use a special external “target address” to determine the IP address of the workflow host as seen by remote task hosts.

hardwired

(only to be used as a last resort) Manually specified host name or IP address (requires host) of the workflow host.

target
Type

string

Default

google.com

This item is required for the address self-identification method. If your workflow host sees the internet, a common address such as google.com will do; otherwise choose a host visible on your intranet.

host
Type

string

Use this item to explicitly set the name or IP address of the workflow host if you have to use the hardwired self-identification method.

[events]

You can define site defaults for each of the following options, details of which can be found under flow.cylc[scheduler][events].

handlers
Type

list

handler events
Type

list

mail events
Type

list

startup handler
Type

list

timeout handler
Type

list

inactivity handler
Type

list

shutdown handler
Type

list

aborted handler
Type

list

stalled handler
Type

list

timeout
Type

time interval

inactivity
Type

time interval

abort on timeout
Type

boolean

abort on inactivity
Type

boolean

abort on stalled
Type

boolean

[mail]

Options for email handling.

from
Type

string

smtp
Type

string

to
Type

string

footer
Type

string

task event batch interval
Type

time interval

Default

PT5M

Default for flow.cylc[scheduler][mail]task event batch interval

[main loop]

Configuration of the Cylc Scheduler’s main loop.

plugins
Type

list

Default

'health check', 'prune flow labels', 'reset bad hosts'

Configure the default main loop plugins to use when starting new workflows.

[<plugin name>]

Configure a main loop plugin.

interval
Type

time interval

The interval with which this plugin is run.

[health check]
Inherits

global.cylc[scheduler][main loop][<plugin name>]

Checks the integrity of the workflow run directory.

interval
Type

time interval

Default

PT10M

The interval with which this plugin is run.

[prune flow labels]
Inherits

global.cylc[scheduler][main loop][<plugin name>]

Prune redundant flow labels.

interval
Type

time interval

Default

PT10M

The interval with which this plugin is run.

[reset bad hosts]
Inherits

global.cylc[scheduler][main loop][<plugin name>]

Periodically clear the scheduler list of unreachable (bad) hosts.

interval
Type

time interval

Default

PT30M

The interval with which this plugin is run.

[logging]

The workflow event log, held under the workflow run directory, is maintained as a rolling archive. Logs are rolled over (backed up and started anew) when they reach a configurable limit size.

rolling archive length
Type

integer

Default

5

How many rolled logs to retain in the archive.

maximum size in bytes
Type

integer

Default

1000000

Workflow event logs are rolled over when they reach this file size.
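The rolling-archive behaviour resembles Python's standard RotatingFileHandler, which the sketch below uses to mimic the two settings. This is an analogy under stated assumptions, not Cylc's logging code: the function name make_rolling_log is hypothetical, and whether Cylc counts the active log as part of the archive length is not specified here.

```python
import logging
import logging.handlers


def make_rolling_log(path, max_bytes=1_000_000, archive_length=5):
    """Return a logger whose file is rolled over when it reaches max_bytes."""
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=archive_length
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s - %(message)s")
    )
    logger = logging.getLogger(f"rolling-demo:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep demo output out of the root logger
    logger.addHandler(handler)
    return logger
```

With this handler the active file is renamed to log.1, log.2 and so on at each rollover, and backups beyond archive_length are discarded.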

[install]
source dirs
Type

list

Default

~/cylc-src

A list of paths where cylc install <flow_name> will look for a workflow of that name. All workflow source directories in these locations will also show up in the GUI, ready for installation.

Note

If workflow source directories of the same name exist in more than one of these paths, only the first one will be picked up.

Configure alternate workflow run directory locations. Symlinks from the standard $HOME/cylc-run locations will be created.

Type

string

If specified, the workflow run directory will be created in <run dir>/cylc-run/<workflow-name> and a symbolic link will be created from $HOME/cylc-run/<workflow-name>. If not specified the workflow run directory will be created in $HOME/cylc-run/<workflow-name>. All the workflow files and the .service directory get installed into this directory.

Type

string

If specified the workflow log directory will be created in <log dir>/cylc-run/<workflow-name>/log and a symbolic link will be created from $HOME/cylc-run/<workflow-name>/log. If not specified the workflow log directory will be created in $HOME/cylc-run/<workflow-name>/log.

Type

string

If specified the workflow share directory will be created in <share dir>/cylc-run/<workflow-name>/share and a symbolic link will be created from $HOME/cylc-run/<workflow-name>/share. If not specified the workflow share directory will be created in $HOME/cylc-run/<workflow-name>/share.

Type

string

If specified the workflow share/cycle directory will be created in <share/cycle dir>/cylc-run/<workflow-name>/share/cycle and a symbolic link will be created from $HOME/cylc-run/<workflow-name>/share/cycle. If not specified the workflow share/cycle directory will be created in $HOME/cylc-run/<workflow-name>/share/cycle.

Type

string

If specified the workflow work directory will be created in <work dir>/cylc-run/<workflow-name>/work and a symbolic link will be created from $HOME/cylc-run/<workflow-name>/work. If not specified the workflow work directory will be created in $HOME/cylc-run/<workflow-name>/work.
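The path rules above follow one pattern, which the sketch below computes. It only plans the links and creates nothing on disk; the function name plan_symlinks and the keys of the alternates mapping are hypothetical labels chosen for this illustration.

```python
from pathlib import PurePosixPath

SUBDIRS = ("log", "share", "share/cycle", "work")


def plan_symlinks(workflow, home, alternates):
    """Map each symlink under $HOME/cylc-run to the directory it points at.

    alternates maps "run", "log", "share", "share/cycle" or "work" to an
    alternate base location; keys that are absent produce no symlink.
    """
    default_run = PurePosixPath(home) / "cylc-run" / workflow
    links = {}
    if "run" in alternates:
        links[default_run] = PurePosixPath(alternates["run"]) / "cylc-run" / workflow
    for sub in SUBDIRS:
        if sub in alternates:
            target = PurePosixPath(alternates[sub]) / "cylc-run" / workflow / sub
            links[default_run / sub] = target
    return links
```

For example, plan_symlinks("my-flow", "/home/me", {"share": "/data/me"}) maps /home/me/cylc-run/my-flow/share to /data/me/cylc-run/my-flow/share and leaves every other directory in its default location.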

[editors]

Choose your favourite text editor for editing workflow configurations.

terminal
Type

string

An in-terminal text editor to be used by the cylc command line.

If unspecified Cylc will use the environment variable $EDITOR which is the preferred way to set your text editor.

If neither this nor $EDITOR is specified then Cylc will default to vi.

Note

You can set your $EDITOR in your shell profile file (e.g. ~/.bashrc).

Examples:

ed
emacs -nw
nano
vi

gui
Type

string

A graphical text editor to be used by cylc.

If unspecified Cylc will use the environment variable $GEDITOR which is the preferred way to set your text editor.

If neither this nor $GEDITOR is specified then Cylc will default to gvim -fg.

Note

You can set your $GEDITOR in your shell profile file (e.g. ~/.bashrc).

Examples:

atom --wait
code --new-window --wait
emacs
gedit -s
gvim -fg
nedit
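
The fallback order for both editor settings (configured value, then environment variable, then built-in default) can be sketched as below. The function name resolve_editor is hypothetical, and splitting the command with shlex is an assumption made for the illustration.

```python
import shlex


def resolve_editor(configured, env_var, fallback, environ):
    """Return the editor command as an argument list."""
    # The configured value wins, then the environment variable, then the default.
    command = configured or environ.get(env_var) or fallback
    return shlex.split(command)


# Nothing configured: fall back to $EDITOR, or to the built-in default.
terminal = resolve_editor(None, "EDITOR", "vi", {"EDITOR": "emacs -nw"})
gui = resolve_editor(None, "GEDITOR", "gvim -fg", {})
```
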
[platforms]
[<platform name>]
job runner
Type

string

Default

background

The batch system/job submit method used to run jobs on the platform, e.g. background, at, slurm, loadleveler.

job runner command template
Type

string

shell
Type

string

Default

/bin/bash

communication method
Type

string

Default

zmq

Options

poll, ssh, zmq

The means by which task progress messages are reported back to the running workflow.

Options:

zmq

Direct client-server TCP communication via network ports

poll

The workflow polls for task status (no task messaging)

ssh

Use non-interactive ssh for task communications

submission polling intervals
Type

time interval list

Cylc can also poll submitted jobs to catch problems that prevent the submitted job from executing at all, such as deletion from an external job runner queue. Routine polling is done only for the polling task communication method unless workflow-specific polling is configured in the workflow configuration. A list of interval values can be specified as for execution polling but a single value is probably sufficient for job submission polling.

Example:

5*PT1M, 10*PT5M

submission retry delays
Type

time interval list

execution polling intervals
Type

time interval list

Cylc can poll running jobs to catch problems that prevent task messages from being sent back to the workflow, such as hard job kills, network outages, or unplanned task host shutdown. Routine polling is done only for the polling task communication method (see communication method) unless polling is configured in the workflow configuration. A list of interval values can be specified, with the last value used repeatedly until the task is finished. This allows more frequent polling near the beginning and end of the anticipated task run time. Multipliers can be used as shorthand as in the example below.

Example:

5*PT1M, 10*PT5M
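
The multiplier shorthand and the "last value repeats" rule can be made concrete with a small sketch. It handles only simple PT#H#M#S durations and is not Cylc's parser; the function names duration_seconds, expand_intervals and poll_delays are hypothetical.

```python
import re
from itertools import chain, repeat

_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def duration_seconds(text):
    """Convert a simple PT#H#M#S duration to seconds."""
    match = _DURATION.match(text)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def expand_intervals(spec):
    """Expand "5*PT1M, 10*PT5M" into a flat list of seconds."""
    expanded = []
    for item in spec.split(","):
        count, _, duration = item.strip().rpartition("*")
        expanded.extend([duration_seconds(duration.strip())] * int(count or 1))
    return expanded


def poll_delays(spec):
    """Yield each delay in turn, then repeat the last one indefinitely."""
    delays = expand_intervals(spec)
    return chain(delays, repeat(delays[-1]))
```

Under these assumptions, the example value expands to five one-minute delays followed by ten five-minute delays, after which the five-minute delay repeats until the task finishes.
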
execution time limit polling intervals
Type

time interval list

The intervals between polling after a task job (submitted to the relevant job runner on the relevant host) exceeds its execution time limit. The default setting is PT1M, PT2M, PT7M. The accumulated times (in minutes) for these intervals will be roughly 1, 1 + 2 = 3 and 1 + 2 + 7 = 10 after a task job exceeds its execution time limit.
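The accumulated times quoted above are a running sum of the intervals, as this short check shows (values in minutes, taken from the stated default).

```python
from itertools import accumulate

intervals_minutes = [1, 2, 7]  # PT1M, PT2M, PT7M
# Minutes past the execution time limit at which each poll happens.
elapsed_minutes = list(accumulate(intervals_minutes))
```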

ssh command
Type

string

Default

ssh -oBatchMode=yes -oConnectTimeout=10

A string for the command used to invoke commands on this host. Not used on the workflow host unless you run local tasks under another user account. The value is assumed to be ssh with some initial options or a command that implements a similar interface to ssh.

use login shell
Type

boolean

Default

True

Whether to use a login shell or not for remote command invocation. By default Cylc runs remote ssh commands using a login shell:

ssh user@host 'bash --login cylc ...'

which will source the following files (in order):

  • /etc/profile

  • ~/.bash_profile

  • ~/.bash_login

  • ~/.profile

For more information on login shells see the “Invocation” section of the Bash man pages.

For security reasons some institutions do not allow unattended commands to start login shells, so you can turn off this behaviour to get:

ssh user@host 'cylc ...'

which will use the default shell on the remote machine, sourcing ~/.bashrc (or ~/.cshrc) to set up the environment.

hosts
Type

list

cylc path
Type

string

The path containing the cylc executable on a remote host.

This may be necessary if the cylc executable is not in the $PATH for an ssh call. Test whether this is the case by using ssh <host> command -v cylc.

This path is used for remote invocations of the cylc command and is added to the $PATH in job scripts for the configured platform.

Note

If use login shell=True (the default) then an alternative approach is to add cylc to the $PATH in the system or user Bash profile files (e.g. ~/.bash_profile).

Tip

For multi-version installations this should point to the Cylc wrapper script rather than the cylc executable itself.

See Managing Environments for more information on the wrapper script.

global init-script
Type

string

If specified, the value of this setting will be inserted just before the init-script section of all job scripts that are to be submitted to the specified remote host.

copyable environment variables
Type

list

A list containing the names of the environment variables to be copied from the scheduler to a job.

retrieve job logs
Type

boolean

Global default for flow.cylc[runtime][<namespace>][remote]retrieve joblogs.

retrieve job logs command
Type

string

Default

rsync -a

If rsync -a is unavailable or insufficient to retrieve job logs from a remote host, you can use this setting to specify a suitable command.

retrieve job logs max size
Type

string

Global default for the flow.cylc[runtime][<namespace>][remote]retrieve joblogs max size setting for the specified host.

retrieve job logs retry delays
Type

time interval list

Global default for the flow.cylc[runtime][<namespace>][remote]retrieve joblogs retry delays setting for the specified host.

tail command template
Type

string

Default

tail -n +1 -F %(filename)s

A command template (with %(filename)s substitution) to tail-follow job logs on HOST, by cylc cat-log. You are unlikely to need to override this.
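The %(filename)s placeholder uses Python's mapping-style string formatting, so a template can be filled as sketched below. The function name render_command and the example path are hypothetical, and shell-quoting the value is a precaution added for this illustration rather than documented Cylc behaviour.

```python
import shlex


def render_command(template, **values):
    """Fill a %(name)s command template, shell-quoting each value."""
    quoted = {name: shlex.quote(str(value)) for name, value in values.items()}
    return template % quoted


# Hypothetical log path containing a space, to show the effect of quoting.
tail_cmd = render_command(
    "tail -n +1 -F %(filename)s",
    filename="/home/me/cylc-run/my flow/log/job.out",
)
```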

err tailer
Type

string

A command template (with %(job_id)s substitution) that can be used to tail-follow the stderr stream of a running job if SYSTEM does not use the normal log file location while the job is running. This setting overrides tail command template.

Examples:

# for PBS
qcat -f -e %(job_id)s

out tailer
Type

string

A command template (with %(job_id)s substitution) that can be used to tail-follow the stdout stream of a running job if SYSTEM does not use the normal log file location while the job is running. This setting overrides tail command template.

Examples:

# for PBS
qcat -f -o %(job_id)s

err viewer
Type

string

A command template (with %(job_id)s substitution) that can be used to view the stderr stream of a running job if SYSTEM does not use the normal log file location while the job is running.

Examples:

# for PBS
qcat -e %(job_id)s

out viewer
Type

string

A command template (with %(job_id)s substitution) that can be used to view the stdout stream of a running job if SYSTEM does not use the normal log file location while the job is running.

Examples:

# for PBS
qcat -o %(job_id)s

job name length maximum
Type

integer

The maximum length for job name acceptable by a job runner on a given host. Currently, this setting is only meaningful for PBS jobs. For example, PBS 12 or older will fail a job submit if the job name has more than 15 characters; whereas PBS 13 accepts up to 236 characters.

install target
Type

string

This defaults to the platform name. This will be used as the target for remote file installation. For example, to indicate to Cylc that Platform_A shares a file system with localhost, we would configure as follows:

[platforms]
    [[Platform_A]]
        install target = localhost

clean job submission environment
Type

boolean

Default

False

Job submission subprocesses inherit their parent environment by default. So remote job submissions inherit the default non-interactive shell environment, but local ones inherit the scheduler environment. This means local jobs see the scheduler environment unless the local batch system prevents it, which can cause problems - e.g. scheduler $PYTHON... variables can affect Python programs executed by task job scripts. For consistent handling of local and remote jobs a clean job submission environment is recommended, but it is not the default because it prevents local task jobs from running unless the cylc version selection wrapper script is installed in $PATH (a clean environment prevents local jobs from seeing the scheduler’s virtual environment).

Specific environment variables can be singled out to pass through to the clean environment, if necessary.

A standard set of executable paths is passed through to clean environments, and can be added to if necessary.

job submission environment pass-through
Type

list

Minimal list of environment variable names to pass through to job submission subprocesses. $HOME is passed automatically. You are unlikely to need this.

job submission executable paths
Type

list

Additional executable locations to pass to the job submission subprocess beyond the standard locations /bin, /usr/bin, /usr/local/bin, /sbin, /usr/sbin, /usr/local/sbin. You are unlikely to need this.
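Taken together, the three settings above describe how a clean environment is assembled. The sketch below shows one way this could look; the function name clean_submission_env is hypothetical, and placing the additional paths after the standard ones is an assumption.

```python
STANDARD_PATHS = [
    "/bin", "/usr/bin", "/usr/local/bin",
    "/sbin", "/usr/sbin", "/usr/local/sbin",
]


def clean_submission_env(parent_env, pass_through=(), extra_paths=()):
    """Build a minimal environment for a job submission subprocess."""
    env = {}
    # $HOME is always passed; other variables only when explicitly listed.
    for name in ("HOME", *pass_through):
        if name in parent_env:
            env[name] = parent_env[name]
    # Standard executable locations first, then any configured additions.
    env["PATH"] = ":".join([*STANDARD_PATHS, *extra_paths])
    return env
```

A scheduler variable such as PYTHONPATH is dropped unless it is named in the pass-through list, which is the point of the clean environment.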

max batch submit size
Type

integer

Default

100

Limits the maximum number of jobs that can be submitted at once.

Where possible Cylc will batch together job submissions to the same platform for efficiency. Submitting very large numbers of jobs can cause problems with some submission systems so for safety there is an upper limit on the number of job submissions which can be batched together.
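The effect of the limit can be pictured as simple chunking of the pending jobs for one platform. The function name batch_submissions is hypothetical and the job names are placeholders.

```python
def batch_submissions(jobs, max_batch_submit_size=100):
    """Split jobs for one platform into batches no larger than the limit."""
    if max_batch_submit_size < 1:
        raise ValueError("max batch submit size must be at least 1")
    return [
        jobs[start:start + max_batch_submit_size]
        for start in range(0, len(jobs), max_batch_submit_size)
    ]


# 250 pending jobs become batches of 100, 100 and 50 with the default limit.
batches = batch_submissions([f"job_{n}" for n in range(250)])
```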

[selection]
method
Type

string

Default

random

Options

random, definition order

Host selection method for the platform. Available options:

  • random: Suitable for an identical pool of hosts.

  • definition order: Take the first host in the list unless that host has been unreachable. In many cases this is likely to cause load imbalances, but might be appropriate if your hosts were main, backup, failsafe.
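The two selection methods can be sketched as follows. The function name select_host is hypothetical, and skipping unreachable hosts for the random method as well is an assumption made for this illustration.

```python
import random


def select_host(hosts, method="random", unreachable=(), rng=random):
    """Choose a host for a platform, skipping hosts known to be unreachable."""
    usable = [host for host in hosts if host not in unreachable]
    if not usable:
        raise RuntimeError("no reachable hosts for this platform")
    if method == "definition order":
        # Always prefer the earliest host that is still reachable.
        return usable[0]
    if method == "random":
        return rng.choice(usable)
    raise ValueError(f"unknown selection method: {method!r}")
```

With hosts main, backup, failsafe and definition order, main is used until it becomes unreachable, at which point backup is chosen.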

[localhost]
Inherits

global.cylc[platforms][<platform name>]

hosts
Type

list

Default

localhost

[platform groups]
[<group>]
platforms
Type

list

[task events]

Global site/user defaults for flow.cylc[runtime][<namespace>][events].

execution timeout
Type

time interval

handlers
Type

list

handler events
Type

list

handler retry delays
Type

time interval list

mail events
Type

list

submission timeout
Type

time interval

[task mail]

Global site/user defaults for flow.cylc[runtime][<namespace>][mail].

from
Type

string

to
Type

string