Global Configuration¶
- global.cylc¶
The global configuration defines default Cylc Flow settings for a user or site.
To view your global config run:
$ cylc config
Cylc will attempt to load the global configuration (global.cylc) from a hierarchy of locations, including the site directory (defaults to /etc/cylc/flow/) and the user directory (~/.cylc/flow/). For example, at Cylc version 8.0.1 the hierarchy would be, in order of ascending priority:
<site-conf-path>/flow/global.cylc
<site-conf-path>/flow/8/global.cylc
<site-conf-path>/flow/8.0/global.cylc
<site-conf-path>/flow/8.0.1/global.cylc
~/.cylc/flow/global.cylc
~/.cylc/flow/8/global.cylc
~/.cylc/flow/8.0/global.cylc
~/.cylc/flow/8.0.1/global.cylc
Where <site-conf-path> is /etc/cylc/ by default but can be changed by CYLC_SITE_CONF_PATH.
A setting in a file lower down in the list will override the same setting from those higher up (but if a setting is present in a file higher up and not in any files lower down, it will not be overridden).
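The search order described above can be sketched in a few lines of Python. This is only an illustration of the listing, not Cylc's own loader: the function name config_hierarchy and its parameters are hypothetical, and the caller supplies the site and user base directories.

```python
from pathlib import PurePosixPath


def config_hierarchy(version, site_conf_path, user_conf_path):
    """List candidate global.cylc paths in order of ascending priority.

    Hypothetical helper: it only reproduces the listing in the text and
    does not check whether any of the files exist.
    """
    parts = version.split(".")
    # "" (unversioned), then "8", "8.0", "8.0.1" for version "8.0.1"
    version_dirs = [""] + [
        ".".join(parts[:i]) for i in range(1, len(parts) + 1)
    ]
    paths = []
    for base in (site_conf_path, user_conf_path):
        for version_dir in version_dirs:
            # Empty path segments are ignored when joining
            path = PurePosixPath(base) / "flow" / version_dir / "global.cylc"
            paths.append(str(path))
    return paths


for candidate in config_hierarchy("8.0.1", "<site-conf-path>", "~/.cylc"):
    print(candidate)
```

Files later in the returned list take priority, so a loader would apply them last.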
The following environment variables can change the files which are loaded:
- CYLC_CONF_PATH¶
If set, this bypasses the default site/user configuration hierarchy used to load the Cylc Flow global configuration.
This should be set to a directory containing a global.cylc file.
- CYLC_SITE_CONF_PATH¶
By default the site configuration is located in /etc/cylc/. For installations where this is not convenient, this path can be overridden by setting CYLC_SITE_CONF_PATH to point at another location.
Configuration for different Cylc components should be in sub-directories within this location.
For example to configure Cylc Flow you could do the following:
$CYLC_SITE_CONF_PATH/
`-- flow/
    `-- global.cylc
Note
The global.cylc file can be templated using Jinja2 variables. See Jinja2.
Note
Prior to Cylc 8, global.cylc was named global.rc, but that name is no longer supported.
- [scheduler]¶
Default values for entries in the flow.cylc[scheduler] section. This should not be confused with scheduling in the flow.cylc file.
- UTC mode¶
- Type
- Default
False
Default for flow.cylc[scheduler]UTC mode.
- process pool size¶
- Type
- Default
4
Maximum number of concurrent processes used to execute external job submission, event handlers, and job poll and kill commands - see Managing External Command Execution.
- process pool timeout¶
- Type
- Default
PT10M
Interval after which long-running commands in the process pool will be killed - see Managing External Command Execution.
Note
The default is set quite high to avoid killing important processes when the system is under load.
- auto restart delay¶
- Type
Relates to Cylc’s auto stop-restart mechanism (see Auto Stop-Restart). When a host is set to automatically shut down/restart, it will first wait a random period of time between zero and auto restart delay seconds before beginning the process. This is to prevent large numbers of workflows from restarting simultaneously.
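As a rough illustration of the random wait described above, a uniform draw between zero and the configured delay spreads restarts out over time. The function name restart_wait_seconds is hypothetical, and the uniform distribution is an assumption; the text only says the period is random.

```python
import random


def restart_wait_seconds(auto_restart_delay, rng=None):
    """Return a random wait between 0 and auto_restart_delay seconds.

    Assumption: a uniform distribution; the documentation does not
    state which distribution Cylc uses.
    """
    rng = rng or random.Random()
    return rng.uniform(0, auto_restart_delay)
```

With a delay of 300 seconds and many workflows on one host, each workflow would begin its restart at a different moment within a five minute window.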
- [run hosts]¶
Configure workflow hosts and ports for starting workflows. Additionally configure host selection settings specifying how to determine the most suitable run host at any given time from those configured.
- available¶
- Type
A list of workflow run hosts. One of these hosts will be appointed for a workflow to start on if an explicit host is not provided as an option to the cylc play command.
- ports¶
- Type
- Default
43001 .. 43100
A list of allowed ports for Cylc to use to run workflows.
- condemned¶
- Type
Hosts specified in condemned hosts will not be considered as workflow run hosts. If workflows are already running on condemned hosts they will be automatically shut down and restarted (see Auto Stop-Restart).
- ranking¶
- Type
Rank and filter run hosts based on system information.
This can be used to provide load balancing to ensure no one run host is overloaded and provide thresholds beyond which Cylc will not attempt to start new schedulers on a host.
This should be a multiline string containing Python expressions to rank and/or filter hosts. All psutil attributes are available for use in these expressions.
Ranking
Rankings are expressions which return numerical values. The host which returns the lowest value is chosen. Examples:
    # rank hosts by cpu_percent
    cpu_percent()
    # rank hosts by 15min average of server load
    getloadavg()[2]
    # rank hosts by the number of cores
    # (multiply by -1 because the lowest value is chosen)
    -1 * cpu_count()
Threshold
Thresholds are expressions which return boolean values. If a host returns a False value that host will not be selected. Examples:
    # filter out hosts with a CPU utilisation of 70% or above
    cpu_percent() < 70
    # filter out hosts with less than 1GB of RAM available
    virtual_memory().available > 1000000000
    # filter out hosts with less than 1GB of disk space
    # available on the "/" mount
    disk_usage('/').free > 1000000000
Combining
Multiple rankings and thresholds can be combined in this section e.g:
    # filter hosts
    cpu_percent() < 70
    disk_usage('/').free > 1000000000
    # rank hosts by CPU count
    1 / cpu_count()
    # if two hosts have the same CPU count
    # then rank them by CPU usage
    cpu_percent()
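To make the interaction of thresholds and rankings concrete, here is a minimal Python sketch that evaluates the combined example against made-up host metrics. It is not Cylc's implementation: the names HOSTS, namespace_for and choose_run_host are hypothetical, the metrics are invented, and the psutil functions are replaced by stand-ins so the sketch runs without psutil installed.

```python
from types import SimpleNamespace

# Invented metrics for three hosts (assumption: real values would be
# gathered with psutil on each host).
HOSTS = {
    "host-a": {"cpu_percent": 35.0, "cpu_count": 8, "disk_free": 5_000_000_000},
    "host-b": {"cpu_percent": 85.0, "cpu_count": 16, "disk_free": 9_000_000_000},
    "host-c": {"cpu_percent": 20.0, "cpu_count": 8, "disk_free": 2_000_000_000},
}

EXPRESSIONS = [
    "cpu_percent() < 70",                 # threshold
    "disk_usage('/').free > 1000000000",  # threshold
    "1 / cpu_count()",                    # first ranking
    "cpu_percent()",                      # tie-break ranking
]


def namespace_for(metrics):
    """Build psutil-like stand-in callables backed by the fake metrics."""
    return {
        "cpu_percent": lambda: metrics["cpu_percent"],
        "cpu_count": lambda: metrics["cpu_count"],
        "disk_usage": lambda path: SimpleNamespace(free=metrics["disk_free"]),
    }


def choose_run_host(hosts, expressions):
    """Drop hosts that fail any threshold, then pick the lowest ranking."""
    candidates = []
    for name, metrics in hosts.items():
        names = namespace_for(metrics)
        # eval is used only because the expressions are trusted config text
        results = [eval(expr, {"__builtins__": {}}, names) for expr in expressions]
        thresholds = [r for r in results if isinstance(r, bool)]
        rankings = tuple(r for r in results if not isinstance(r, bool))
        if all(thresholds):
            candidates.append((rankings, name))
    # Tuples compare element by element, so later rankings break ties
    return min(candidates)[1] if candidates else None
```

In this sketch host-b is filtered out by the CPU threshold, host-a and host-c tie on CPU count, and the lower CPU usage of host-c decides the result.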
- [host self-identification]¶
The workflow host’s identity must be determined locally by Cylc and passed to running tasks (via $CYLC_WORKFLOW_HOST) so that task messages can target the right workflow on the right host.
- method¶
- Type
- Default
name
- Options
name, address, hardwired
Determines how Cylc finds the identity of the workflow host.
Options:
- name
(The default method) Self-identified host name. Cylc asks the workflow host for its host name. This should resolve on task hosts to the IP address of the workflow host; if it doesn’t, adjust network settings or use one of the other methods.
- address
Automatically determined IP address (requires target). Cylc attempts to use a special external “target address” to determine the IP address of the workflow host as seen by remote task hosts.
- hardwired
(only to be used as a last resort) Manually specified host name or IP address (requires host) of the workflow host.
- target¶
- Type
- Default
google.com
This item is required for the address self-identification method. If your workflow host sees the internet, a common address such as google.com will do; otherwise choose a host visible on your intranet.
- [events]¶
You can define site defaults for each of the following options, details of which can be found under flow.cylc[scheduler][events].
- timeout¶
- Type
- inactivity¶
- Type
- [mail]¶
Options for email handling.
- Type
- task event batch interval¶
- Type
- Default
PT5M
Default for flow.cylc[scheduler][mail]task event batch interval.
- [main loop]¶
Configuration of the Cylc Scheduler’s main loop.
- plugins¶
- Type
- Default
'health check', 'prune flow labels', 'reset bad hosts'
Configure the default main loop plugins to use when starting new workflows.
- [<plugin name>]¶
Configure a main loop plugin.
- interval¶
- Type
The interval with which this plugin is run.
- [health check]¶
Checks the integrity of the workflow run directory.
- interval¶
- Type
- Default
PT10M
The interval with which this plugin is run.
- [prune flow labels]¶
Prune redundant flow labels.
- interval¶
- Type
- Default
PT10M
The interval with which this plugin is run.
- [logging]¶
The workflow event log, held under the workflow run directory, is maintained as a rolling archive. Logs are rolled over (backed up and started anew) when they reach a configurable limit size.
- [install]¶
- source dirs¶
- Type
- Default
~/cylc-src
A list of paths where cylc install <flow_name> will look for a workflow of that name. All workflow source directories in these locations will also show up in the GUI, ready for installation.
Note
If workflow source directories of the same name exist in more than one of these paths, only the first one will be picked up.
- [symlink dirs]¶
Configure alternate workflow run directory locations. Symlinks from the standard $HOME/cylc-run locations will be created.
- [<install target>]¶
- run¶
- Type
If specified, the workflow run directory will be created in <run dir>/cylc-run/<workflow-name> and a symbolic link will be created from $HOME/cylc-run/<workflow-name>. If not specified the workflow run directory will be created in $HOME/cylc-run/<workflow-name>. All the workflow files and the .service directory get installed into this directory.
- log¶
- Type
If specified, the workflow log directory will be created in <log dir>/cylc-run/<workflow-name>/log and a symbolic link will be created from $HOME/cylc-run/<workflow-name>/log. If not specified the workflow log directory will be created in $HOME/cylc-run/<workflow-name>/log.
- share¶
- Type
If specified, the workflow share directory will be created in <share dir>/cylc-run/<workflow-name>/share and a symbolic link will be created from $HOME/cylc-run/<workflow-name>/share. If not specified the workflow share directory will be created in $HOME/cylc-run/<workflow-name>/share.
- share/cycle¶
- Type
If specified, the workflow share/cycle directory will be created in <share/cycle dir>/cylc-run/<workflow-name>/share/cycle and a symbolic link will be created from $HOME/cylc-run/<workflow-name>/share/cycle. If not specified the workflow share/cycle directory will be created in $HOME/cylc-run/<workflow-name>/share/cycle.
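The link and target locations described for the run, log, share and share/cycle settings follow one pattern, which the Python sketch below spells out. It only computes paths and touches no files; the function name symlink_plan and the example directories are hypothetical.

```python
from pathlib import PurePosixPath


def symlink_plan(workflow, home, symlink_dirs):
    """Map each symlink location to the directory it would point at.

    symlink_dirs maps setting names ("run", "log", "share",
    "share/cycle") to the configured base directories; settings that
    are absent produce no symlink.
    """
    standard = PurePosixPath(home) / "cylc-run" / workflow
    plan = {}
    for setting in ("run", "log", "share", "share/cycle"):
        base = symlink_dirs.get(setting)
        if not base:
            continue
        # The run directory itself has no sub-directory suffix
        suffix = "" if setting == "run" else setting
        link = standard / suffix
        target = PurePosixPath(base) / "cylc-run" / workflow / suffix
        plan[str(link)] = str(target)
    return plan


plan = symlink_plan(
    "demo", "/home/user", {"run": "/scratch/user", "log": "/logs/user"}
)
for link, target in plan.items():
    print(f"{link} -> {target}")
```

Settings left out of the mapping (share and share/cycle here) stay under the standard $HOME/cylc-run location.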
- [editors]¶
Choose your favourite text editor for editing workflow configurations.
- terminal¶
- Type
An in-terminal text editor to be used by the cylc command line.
If unspecified, Cylc will use the environment variable $EDITOR, which is the preferred way to set your text editor.
If neither this nor $EDITOR is specified, Cylc will default to vi.
Note
You can set your $EDITOR in your shell profile file (e.g. ~/.bashrc).
Examples:
ed
emacs -nw
nano
vi
- gui¶
- Type
A graphical text editor to be used by Cylc.
If unspecified, Cylc will use the environment variable $GEDITOR, which is the preferred way to set your text editor.
If neither this nor $GEDITOR is specified, Cylc will default to gvim -fg.
Note
You can set your $GEDITOR in your shell profile file (e.g. ~/.bashrc).
Examples:
atom --wait
code --new-window --wait
emacs
gedit -s
gvim -fg
nedit
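Both editor settings resolve in the same order: the configured value, then the environment variable, then a built-in default. A minimal Python sketch of that fallback chain follows; the function resolve_editor is a hypothetical name, not part of Cylc.

```python
import os


def resolve_editor(configured, env_var, fallback, environ=None):
    """Pick the editor command using the fallback order in the text.

    configured: value of the terminal or gui setting (None if unset)
    env_var:    "EDITOR" for terminal, "GEDITOR" for gui
    fallback:   "vi" for terminal, "gvim -fg" for gui
    """
    environ = os.environ if environ is None else environ
    return configured or environ.get(env_var) or fallback


# Terminal editor with nothing configured and no $EDITOR set
print(resolve_editor(None, "EDITOR", "vi", environ={}))
```

Passing the environment as an argument keeps the sketch easy to test without modifying the real process environment.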
- [platforms]¶
- [<platform name>]¶
- job runner¶
- Type
- Default
background
The batch system/job submit method used to run jobs on the platform, e.g., background, at, slurm, loadleveler…
- communication method¶
- Type
- Default
zmq
- Options
poll, ssh, zmq
The means by which task progress messages are reported back to the running workflow.
Options:
- zmq
Direct client-server TCP communication via network ports
- poll
The workflow polls for task status (no task messaging)
- ssh
Use non-interactive ssh for task communications
- submission polling intervals¶
- Type
Cylc can also poll submitted jobs to catch problems that prevent the submitted job from executing at all, such as deletion from an external job runner queue. Routine polling is done only for the polling task communication method unless workflow-specific polling is configured in the workflow configuration. A list of interval values can be specified as for execution polling, but a single value is probably sufficient for job submission polling.
Example:
5*PT1M, 10*PT5M
- submission retry delays¶
- Type
- execution polling intervals¶
- Type
Cylc can poll running jobs to catch problems that prevent task messages from being sent back to the workflow, such as hard job kills, network outages, or unplanned task host shutdown. Routine polling is done only for the poll communication method unless polling is configured in the workflow configuration. A list of interval values can be specified, with the last value used repeatedly until the task is finished; this allows more frequent polling near the beginning and end of the anticipated task run time. Multipliers can be used as shorthand as in the example below.
Example:
5*PT1M, 10*PT5M
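The multiplier shorthand and the "last value repeats" rule can be illustrated with a short Python sketch. This is not Cylc's parser: expand_intervals and interval_for_poll are hypothetical helpers, and durations are kept as plain strings rather than parsed.

```python
import re


def expand_intervals(spec):
    """Expand 'N*DURATION' shorthand into a flat list of durations."""
    intervals = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        match = re.fullmatch(r"(?:(\d+)\s*\*\s*)?(\S+)", item)
        if match is None:
            raise ValueError(f"cannot parse interval item: {item!r}")
        count = int(match.group(1) or 1)
        intervals.extend([match.group(2)] * count)
    return intervals


def interval_for_poll(intervals, poll_number):
    """Return the interval before the given poll (0-based).

    Once the list is exhausted the last value is reused.
    """
    return intervals[min(poll_number, len(intervals) - 1)]


expanded = expand_intervals("5*PT1M, 10*PT5M")
print(len(expanded), expanded[0], expanded[-1])
```

For the example value, the first five polls are one minute apart and every later poll is five minutes apart.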
- execution time limit polling intervals¶
- Type
The intervals between polling after a task job (submitted to the relevant job runner on the relevant host) exceeds its execution time limit. The default setting is PT1M, PT2M, PT7M. The accumulated times (in minutes) for these intervals will be roughly 1, 1 + 2 = 3 and 1 + 2 + 7 = 10 after a task job exceeds its execution time limit.
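The accumulated times quoted above are a running sum of the intervals. The Python sketch below reproduces that arithmetic; the helper names are hypothetical and the parser deliberately handles only the PT<n>M form used in this example, not general ISO 8601 durations.

```python
import re
from itertools import accumulate


def minutes(duration):
    """Parse the 'PT<n>M' subset of ISO 8601 durations into minutes."""
    match = re.fullmatch(r"PT(\d+)M", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(match.group(1))


def cumulative_poll_minutes(intervals):
    """Minutes after the time limit at which each poll takes place."""
    return list(accumulate(minutes(item) for item in intervals))


print(cumulative_poll_minutes(["PT1M", "PT2M", "PT7M"]))
```

The times are described as rough in the text, so actual polls may land slightly later than these sums.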
- ssh command¶
- Type
- Default
ssh -oBatchMode=yes -oConnectTimeout=10
A string for the command used to invoke commands on this host. Not used on the workflow host unless you run local tasks under another user account. The value is assumed to be ssh with some initial options or a command that implements a similar interface to ssh.
- use login shell¶
- Type
- Default
True
Whether to use a login shell or not for remote command invocation. By default cylc runs remote ssh commands using a login shell:
ssh user@host 'bash --login cylc ...'
which will source the following files (in order):
/etc/profile
~/.bash_profile
~/.bash_login
~/.profile
For more information on login shells see the “Invocation” section of the Bash man pages.
For security reasons some institutions do not allow unattended commands to start login shells, so you can turn off this behaviour to get:
ssh user@host 'cylc ...'
which will use the default shell on the remote machine, sourcing ~/.bashrc (or ~/.cshrc) to set up the environment.
- cylc path¶
- Type
The path containing the cylc executable on a remote host.
This may be necessary if the cylc executable is not in the $PATH for an ssh call. Test whether this is the case by using ssh <host> command -v cylc.
This path is used for remote invocations of the cylc command and is added to the $PATH in job scripts for the configured platform.
Note
If use login shell=True (the default) then an alternative approach is to add cylc to the $PATH in the system or user Bash profile files (e.g. ~/.bash_profile).
Tip
For multi-version installations this should point to the Cylc wrapper script rather than the cylc executable itself.
See Managing Environments for more information on the wrapper script.
- global init-script¶
- Type
If specified, the value of this setting will be inserted just before the init-script section of all job scripts that are to be submitted to the specified remote host.
- copyable environment variables¶
- Type
A list containing the names of the environment variables to be copied from the scheduler to a job.
- retrieve job logs¶
- Type
Global default for flow.cylc[runtime][<namespace>][remote]retrieve joblogs.
- retrieve job logs command¶
- Type
- Default
rsync -a
If rsync -a is unavailable or insufficient to retrieve job logs from a remote host, you can use this setting to specify a suitable command.
- retrieve job logs max size¶
- Type
Global default for the flow.cylc[runtime][<namespace>][remote]retrieve joblogs max size setting for the specified host.
- retrieve job logs retry delays¶
- Type
Global default for the flow.cylc[runtime][<namespace>][remote]retrieve joblogs retry delays setting for the specified host.
- tail command template¶
- Type
- Default
tail -n +1 -F %(filename)s
A command template (with %(filename)s substitution) to tail-follow job logs on HOST, by cylc cat-log. You are unlikely to need to override this.
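The %(filename)s placeholder uses Python's named string formatting, so filling in the template can be sketched as below. The helper build_tail_command is hypothetical, and quoting the file name with shlex.quote is an assumption made here for shell safety; the text does not say how Cylc itself quotes the path.

```python
import shlex


def build_tail_command(template, filename):
    """Fill the %(filename)s placeholder in a tail command template."""
    # Assumption: quote the path so that spaces survive the shell
    return template % {"filename": shlex.quote(filename)}


print(build_tail_command("tail -n +1 -F %(filename)s", "/tmp/job.out"))
```

The same mechanism applies to templates that use a %(job_id)s placeholder, with a different key in the mapping.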
- err tailer¶
- Type
A command template (with %(job_id)s substitution) that can be used to tail-follow the stderr stream of a running job if SYSTEM does not use the normal log file location while the job is running. This setting overrides tail command template.
Examples:
# for PBS
qcat -f -e %(job_id)s
- out tailer¶
- Type
A command template (with %(job_id)s substitution) that can be used to tail-follow the stdout stream of a running job if SYSTEM does not use the normal log file location while the job is running. This setting overrides tail command template.
Examples:
# for PBS
qcat -f -o %(job_id)s
- err viewer¶
- Type
A command template (with %(job_id)s substitution) that can be used to view the stderr stream of a running job if SYSTEM does not use the normal log file location while the job is running.
Examples:
# for PBS
qcat -e %(job_id)s
- out viewer¶
- Type
A command template (with %(job_id)s substitution) that can be used to view the stdout stream of a running job if SYSTEM does not use the normal log file location while the job is running.
Examples:
# for PBS
qcat -o %(job_id)s
- job name length maximum¶
- Type
The maximum length for job name acceptable by a job runner on a given host. Currently, this setting is only meaningful for PBS jobs. For example, PBS 12 or older will fail a job submit if the job name has more than 15 characters; whereas PBS 13 accepts up to 236 characters.
- install target¶
- Type
This defaults to the platform name. This will be used as the target for remote file installation. For example, to indicate to Cylc that Platform_A shares a file system with localhost, we would configure as follows:
[platforms]
    [[Platform_A]]
        install target = localhost
- clean job submission environment¶
- Type
- Default
False
Job submission subprocesses inherit their parent environment by default. So remote job submissions inherit the default non-interactive shell environment, but local ones inherit the scheduler environment. This means local jobs see the scheduler environment unless the local batch system prevents it, which can cause problems - e.g. scheduler $PYTHON... variables can affect Python programs executed by task job scripts. For consistent handling of local and remote jobs a clean job submission environment is recommended, but it is not the default because it prevents local task jobs from running unless the cylc version selection wrapper script is installed in $PATH (a clean environment prevents local jobs from seeing the scheduler’s virtual environment).
Specific environment variables can be singled out to pass through to the clean environment, if necessary.
A standard set of executable paths is passed through to clean environments, and can be added to if necessary.
- job submission environment pass-through¶
- Type
Minimal list of environment variable names to pass through to job submission subprocesses. $HOME is passed automatically. You are unlikely to need this.
- job submission executable paths¶
- Type
Additional executable locations to pass to the job submission subprocess beyond the standard locations /bin, /usr/bin, /usr/local/bin, /sbin, /usr/sbin, /usr/local/sbin. You are unlikely to need this.
- max batch submit size¶
- Type
- Default
100
Limits the maximum number of jobs that can be submitted at once.
Where possible Cylc will batch together job submissions to the same platform for efficiency. Submitting very large numbers of jobs can cause problems with some submission systems so for safety there is an upper limit on the number of job submissions which can be batched together.
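The effect of the limit can be pictured as splitting a list of pending jobs into chunks no larger than the configured size. The Python sketch below shows only that splitting step; batch_jobs is a hypothetical helper and the job names are invented.

```python
def batch_jobs(jobs, max_batch_submit_size=100):
    """Split jobs into consecutive batches of at most the given size."""
    if max_batch_submit_size < 1:
        raise ValueError("max batch submit size must be at least 1")
    return [
        jobs[start:start + max_batch_submit_size]
        for start in range(0, len(jobs), max_batch_submit_size)
    ]


pending = [f"job_{n}" for n in range(250)]
print([len(batch) for batch in batch_jobs(pending)])
```

With the default of 100, a backlog of 250 jobs for one platform would go out as three submissions rather than one.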
- [selection]¶
- method¶
- Type
- Default
random
- Options
random, definition order
Host selection method for the platform. Available options:
random: Suitable for an identical pool of hosts.
definition order: Take the first host in the list unless that host has been unreachable. In many cases this is likely to cause load imbalances, but might be appropriate if your hosts were main, backup, failsafe.
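The difference between the two selection methods can be sketched in Python as follows. This is an illustration, not Cylc's code: select_platform_host is a hypothetical name, and skipping unreachable hosts for the random method as well is an assumption, since the text only mentions it for definition order.

```python
import random


def select_platform_host(hosts, method="random", unreachable=(), rng=None):
    """Choose a host from a platform's host list.

    hosts:       hosts in the order they are defined
    unreachable: hosts to skip (assumed to apply to both methods)
    """
    candidates = [host for host in hosts if host not in unreachable]
    if not candidates:
        return None
    if method == "definition order":
        return candidates[0]
    if method == "random":
        return (rng or random).choice(candidates)
    raise ValueError(f"unknown selection method: {method!r}")


hosts = ["main", "backup", "failsafe"]
print(select_platform_host(hosts, "definition order", unreachable={"main"}))
```

With definition order every job lands on main while it is reachable, which is the load imbalance the text warns about.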
- [localhost]¶
- [task events]¶
Global site/user defaults for flow.cylc[runtime][<namespace>][events].
- execution timeout¶
- Type
- handler retry delays¶
- Type
- submission timeout¶
- Type