Scheduler Configuration
The flow.cylc[scheduler] section configures certain aspects of scheduler behaviour at the workflow level.
Many of these settings can also be defined at the site or user level in the global.cylc[scheduler] section, where they apply to all workflows.
Event handlers
Note
Workflow event handlers are configured by:
Workflow event handlers allow configurable actions to be performed when workflow events occur.
Workflow Events
The list of events is:
- startup
The scheduler started running the workflow.
- shutdown
The workflow finished and the scheduler will shut down.
- abort
The scheduler shut down early with error status, due to a fatal error condition or a configured timeout.
- workflow timeout
The workflow run timed out.
- stall
The workflow stalled.
- stall timeout
The workflow timed out after stalling.
- inactivity timeout
The workflow timed out with no activity.
You can tell the scheduler to abort (i.e., shut down immediately with error status) on certain workflow events, with the following settings:
abort on stall timeout
abort on inactivity timeout
abort on workflow timeout
Mail Events
Cylc can send emails for workflow events. These are configured by flow.cylc[scheduler][events]mail events.
For example, with the following configuration, emails will be sent if a scheduler stalls or shuts down for an unexpected reason.
[scheduler]
    [[events]]
        mail events = stall, abort
Email addresses and servers are configured by global.cylc[scheduler][mail].
Workflow event emails can be customised using flow.cylc[scheduler][mail]footer, in which Workflow Event Template Variables can be used.
For example, to integrate with the Cylc 7 web interface (Cylc Review), the mail footer could be configured with a URL:
[scheduler]
    [[mail]]
        footer = http://cylc-review/taskjobs/%(owner)s/?suite=%(workflow)s
Custom Event Handlers
Cylc can also be configured to invoke scripts on workflow events.
Event handler scripts can be stored in the workflow bin directory, or anywhere in $PATH in the scheduler environment.
They should return quickly to avoid tying up the scheduler process pool - see External Command Execution.
Contextual information can be passed to the event handler via Workflow Event Template Variables.
For example, the following handler script and configuration will write some information to a file when a workflow is started:
#!/bin/bash
# my-handler: record the workflow ID ($1), host ($2) and port ($3) in a file
echo "Workflow $1 is running on $2:$3" > info
[scheduler]
    [[events]]
        startup handlers = my-handler %(workflow)s %(host)s %(port)s
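A handler only needs to be an executable that accepts the templated arguments, so it does not have to be a shell script. The following is a minimal sketch of a Python equivalent of the my-handler bash script above. The function name write_info and its path parameter are hypothetical, introduced here for illustration only.

```python
#!/usr/bin/env python3
"""Hypothetical Python equivalent of the my-handler bash script."""
import sys
from pathlib import Path


def write_info(workflow: str, host: str, port: str, path: str = "info") -> str:
    """Write a one-line status message to `path` and return the message."""
    message = f"Workflow {workflow} is running on {host}:{port}"
    # Overwrite the file each time, like `echo ... > info` in the bash version.
    Path(path).write_text(message + "\n")
    return message


# Act only when called with the three arguments supplied by the handler
# template: %(workflow)s %(host)s %(port)s
if __name__ == "__main__" and len(sys.argv) == 4:
    write_info(*sys.argv[1:])
```

Like the bash version, this does a single small file write and returns immediately, which keeps it from tying up the scheduler process pool.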
Workflow Event Template Variables
- event
The type of workflow event that has occurred, e.g. stall.
- message
Additional information about the event.
- workflow
The workflow ID
- host
The host where the workflow is running.
- port
The port where the workflow is running.
- owner
The user account under which the workflow is running.
- uuid
The unique identification string for this workflow run.
This string is preserved for the lifetime of the scheduler and is restored from the database on restart.
- workflow_url
The URL defined in flow.cylc[meta]URL.
- suite
The workflow ID
Deprecated since version 8.0.0: Use “workflow”.
- suite_uuid
The unique identification string for this workflow run.
Deprecated since version 8.0.0: Use “uuid”.
- suite_url
The URL defined in flow.cylc[meta]URL.
Deprecated since version 8.0.0: Use “workflow_url”.
The variables listed above are available to workflow event handlers.
They can be templated into event handlers with Python percent-style string formatting, e.g.:
%(workflow)s is running on %(host)s
Note
Substitution patterns should not be quoted in the template strings. This is done automatically where required.
For an explanation of the substitution syntax, see String Formatting Operations in the Python documentation.
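The substitution can be tried out in plain Python to check a template before using it in a handler. This is a minimal sketch: the values in event_data are made up for illustration, whereas in a real run the scheduler supplies them.

```python
# Hypothetical values for illustration; the scheduler provides the real ones.
event_data = {
    "event": "startup",
    "workflow": "my-workflow/run1",
    "host": "host_1",
    "port": 43001,
}

# Each pattern is %(name)s: the variable name in parentheses, followed by the
# "s" conversion, which renders the value as a string.
template = "%(workflow)s is running on %(host)s:%(port)s"
rendered = template % event_data
print(rendered)
```

Note that every pattern ends in the s conversion character. A pattern written as %(host) without it is not a valid substitution pattern.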
External Command Execution
Job submission commands, event handlers, and job poll and kill commands are executed by the scheduler in a subprocess pool. The pool size can be configured with global.cylc[scheduler]process pool size.
Event handlers should be lightweight and quick-running because they tie up a process pool member until complete, and the workflow will appear to stall if the pool is saturated with long-running processes.
To protect the scheduler, processes are killed on a timeout (global.cylc[scheduler]process pool timeout). This will be logged by the scheduler. If a job submission gets killed, the associated task goes to the submit-failed state.
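The kill-on-timeout behaviour can be illustrated with the Python standard library. This is a sketch of the general technique only, not Cylc's implementation. The helper name run_with_timeout is hypothetical.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout):
    """Run cmd; return its exit code, or None if it was killed on timeout."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired if the
        # command has not finished within `timeout` seconds.
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        return None


# A quick command completes normally; a long-running one is killed.
quick = run_with_timeout([sys.executable, "-c", "pass"], timeout=30)
slow = run_with_timeout(
    [sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.5
)
print(quick, slow)
```

A command that hits the timeout occupies its slot for the whole timeout period, which is why long-running handlers can make a workflow appear to stall.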
Submitting Workflows To a Pool Of Hosts
- Configured by:
By default, cylc play will run workflows on the machine where the command was invoked.
Cylc supports configuring a pool of hosts for workflows to run on. In this case, cylc play will automatically pick a host and submit the workflow to it.
Host Pool
- Configured by:
The hosts must:
- Share a common $HOME directory and therefore a common file system (with each other and anywhere the cylc play command is run).
- Share a common Cylc global config (global.cylc).
- Be set up to allow passwordless SSH between them.
Example:
[scheduler]
    [[run hosts]]
        available = host_1, host_2, host_3
Load Balancing
- Configured by:
Cylc can balance the load on the configured “run hosts” by ranking them in order of available resource or by excluding hosts which fail to meet certain criteria.
Example:
[scheduler]
    [[run hosts]]
        available = host_1, host_2, host_3
        ranking = """
            # filter out hosts with high server load
            getloadavg()[2] < 5
            # pick the host with the most available memory
            virtual_memory().available
        """
For more information see global.cylc[scheduler][run hosts]ranking.
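The filter-then-rank idea described by the comments in the ranking example can be sketched in plain Python. This is a conceptual illustration only, not Cylc's host selection code. The metric values, the dictionary keys and the function name pick_host are all hypothetical. It assumes that getloadavg()[2] is the 15-minute load average, as it is for the Python standard library function of the same name.

```python
# Hypothetical per-host metrics; in practice these are gathered from each host.
metrics = {
    "host_1": {"load_15min": 7.2, "mem_available": 64_000},
    "host_2": {"load_15min": 1.3, "mem_available": 16_000},
    "host_3": {"load_15min": 0.8, "mem_available": 32_000},
}


def pick_host(metrics, max_load=5):
    """Drop heavily loaded hosts, then pick the one with the most memory."""
    # Filter step: mirrors "getloadavg()[2] < 5" in the example above.
    candidates = {
        host: m for host, m in metrics.items() if m["load_15min"] < max_load
    }
    if not candidates:
        return None  # no host meets the criteria
    # Ranking step: mirrors "pick the host with the most available memory".
    return max(candidates, key=lambda host: candidates[host]["mem_available"])


chosen = pick_host(metrics)
print(chosen)
```

Here host_1 has the most memory but is excluded by the load filter, so the choice is made between host_2 and host_3 only.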
Workflow Migration
- Configured by:
Cylc can automatically stop workflows running on a particular host and, optionally, restart them on a different host. This can be useful if a host needs to be taken off-line, e.g. for scheduled maintenance.
Example:
[scheduler]
    [[run hosts]]
        available = host_1, host_2, host_3
        # tell workflows on host_1 to move to another available host
        condemned = host_1
Note
This feature requires the [auto restart] plugin to be enabled, e.g. in the configured list of plugins.
For more information see: global.cylc[scheduler][run hosts]condemned.
Platform Configuration
From the perspective of a running scheduler, localhost is the scheduler host.
The localhost platform is configured by global.cylc[platforms][localhost].
It configures:
- Jobs that run on the localhost platform, i.e. any jobs which have [runtime][<namespace>]platform=localhost or which don’t have a platform configured.
- Connections to the scheduler hosts (e.g. the ssh command).