Task Configuration

Related Tutorial

Runtime Tutorial

The [runtime] section of the flow.cylc file defines what job each task should run, and where and how to submit each one to run.

It is an inheritance hierarchy that allows common settings to be factored out and defined once in task families (duplication of configuration is a maintenance risk in a complex workflow).

Task and Family Names

Task and family names must match in the graph and runtime sections of the workflow config file. They do not need to match the names of the external applications wrapped by the tasks.

class cylc.flow.unicode_rules.TaskNameValidator

The rules for valid task and family names:

  • must start with: alphanumeric

  • can only contain: alphanumeric, -, +, %, @

  • cannot start with: _cylc

  • cannot be: root

Note

At runtime, tasks can access their own workflow task name as $CYLC_TASK_NAME in the job environment, if needed.

The following runtime configuration defines one family called FAM and two member tasks fm1 and fm2 that inherit settings from it. Members can also override inherited settings and define their own private settings.

[runtime]
    [[FAM]]  # <-- a family
        #...  settings for all FAM members

    [[fm1]]  # <-- task
        inherit = FAM
        #...  fm1-specific settings

    [[fm2]]  # <-- a task
        inherit = FAM
        #...  fm2-specific settings

Note that families are not nested in terms of the file sub-heading structure. A runtime subsection defines a family if others inherit from it, otherwise it defines a task.

The Root Family

All tasks inherit implicitly from a family called root that can provide default settings for all tasks in the workflow (non-root families require an explicit inherit statement).

For example, if all tasks are to run on the same platform, that can be specified once for all tasks under root:

[runtime]
    [[root]]
        # all tasks run on hpc1 (unless they override this setting)
        platform = hpc1

Defining Multiple Tasks or Families at Once

Runtime sub-section headings can be a comma-separated list of task or family names, in which case the settings below it apply to each list member.

Here a group of three related tasks all run the same script on the same platform, but pass their own names to it on the command line:

[runtime]
    [[ENSEMBLE]]
        platform = hpc1
        script = "run-model.sh $CYLC_TASK_NAME"

    [[m1, m2, m3]]
        inherit = ENSEMBLE

    [[m1]]
        #...  m1-specific settings

Particular tasks (such as m1 above) can still be singled out to add task-specific settings.

Note

Task parameters or template processing (see Jinja2 and EmPy) can be used to programmatically generate family members and associated dependencies.
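
For example, a minimal Jinja2 sketch that generates three ensemble members and their dependencies (member count, task names, and scripts are illustrative):

#!Jinja2
[scheduling]
    [[graph]]
        R1 = """
{% for n in range(1, 4) %}
            m{{ n }} => collate
{% endfor %}
        """
[runtime]
    [[ENSEMBLE]]
        script = "run-model.sh $CYLC_TASK_NAME"
{% for n in range(1, 4) %}
    [[m{{ n }}]]
        inherit = ENSEMBLE
{% endfor %}
    [[collate]]
        script = "collate-results.sh"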

Families of Families

Families can inherit from other families, to any depth.

[runtime]
    [[HPC1]]
        platform = hpc1

    [[BIG-HPC1]]
        inherit = HPC1
        #...  add in high memory batch system directives

    [[model]]  # a big task that runs on hpc1
        inherit = BIG-HPC1

If the same item is defined at several levels in the family tree, the definition closest to the task takes precedence.
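
For example, a value redefined closer to the task wins (platform names are illustrative):

[runtime]
    [[HPC1]]
        platform = hpc1
    [[BIG-HPC1]]
        inherit = HPC1
        platform = hpc1-bigmem  # overrides the value inherited from HPC1
    [[model]]
        inherit = BIG-HPC1      # model ends up with platform = hpc1-bigmem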

Inheriting from Multiple Parents

Sometimes a multi-level single-parent tree is not sufficient to avoid all duplication of settings. Fortunately tasks can inherit from multiple parents at once [1]:

[runtime]
    [[HPC1]]
        platform = hpc1

    [[BIG]]  # high memory batch system directives
        #...

    [[model]]  # a big task that runs on hpc1
        inherit = BIG, HPC1

Tip

Use cylc config to check exactly what settings a task or family ends up with after inheritance processing:

$ cylc config --item "[runtime][model]environment" <workflow-id>

First-parent Family Hierarchy for Visualization

Tasks can be collapsed into first-parent families in the Cylc GUI, so first parents should reflect the logical purpose of a task where possible, rather than (say) shared technical settings:

[runtime]
    [[HPC]]
        # technical platform settings

    [[MODEL]]
        # atmospheric model tasks

    [[atmos]]
        inherit = MODEL, HPC  # (not HPC, MODEL)

If this is not what you want (the primary purpose of the family hierarchy is, after all, inheritance of runtime settings), a dummy first parent None can be used to opt out of the visualization role without affecting inheritance:

[runtime]
    [[BAR]]
        #...
    [[foo]]
        # inherit from BAR but stay under root for visualization
        inherit = None, BAR

Job Environment

Job scripts export various environment variables before running script blocks (see Job Submission and Management).

Scheduler-defined variables appear first to identify the workflow, the task, and log directory locations. These are followed by user-defined variables from [runtime][<namespace>][environment]. Order of variable definition is preserved so that new variable assignments can reference previous ones.
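
For example, because definition order is preserved, a later variable can build on an earlier one (variable and path names are illustrative):

[runtime]
    [[foo]]
        [[[environment]]]
            DATA_DIR = $CYLC_WORKFLOW_SHARE_DIR/data
            INPUT_FILE = $DATA_DIR/input.nc  # references DATA_DIR defined above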

Note

Task environment variables are evaluated at runtime, by jobs, on the job platform. So $HOME in a task environment, for instance, evaluates at runtime to the home directory on the job platform, not on the scheduler platform.

In this example the task foo ends up with SHAPE=circle, COLOR=blue, and TEXTURE=rough in its environment:

[runtime]
    [[root]]
        [[[environment]]]
            COLOR = red
            SHAPE = circle
    [[foo]]
        [[[environment]]]
            COLOR = blue  # root override
            TEXTURE = rough # new variable

Job access to Cylc itself is configured first so that variable assignment expressions (as well as scripting) can use Cylc commands:

[runtime]
    [[foo]]
        [[[environment]]]
            REFERENCE_TIME = $(cylc cyclepoint --offset-hours=6)

Overriding Inherited Environment Variables

Warning

If you override an inherited task environment variable, the parent config item is replaced before it is ever used to define the shell variable in the job script. Consequently the job cannot see the parent value in addition to its own:

[runtime]
    [[FOO]]
        [[[environment]]]
            COLOR = red
    [[bar]]
        inherit = FOO
        [[[environment]]]
            tmp = $COLOR  # !! ERROR: $COLOR is undefined here
            COLOR = dark-$tmp  # !! as this overrides COLOR in FOO.

The compressed variant of this, COLOR = dark-$COLOR, is also an error for the same reason. To achieve the desired result, use a different name for the parent variable:

[runtime]
    [[FOO]]
        [[[environment]]]
            FOO_COLOR = red
    [[bar]]
        inherit = FOO
        [[[environment]]]
            COLOR = dark-$FOO_COLOR  # OK

Job Script Variables

These variables provided by the scheduler are available to job scripts:

CYLC_VERSION                       # Version of cylc installation used
CYLC_VERBOSE                       # Verbose mode, true or false
CYLC_DEBUG                         # Debug mode (even more verbose), true or false

CYLC_CYCLING_MODE                  # Cycling mode, e.g. gregorian
ISODATETIMECALENDAR                # Calendar mode for the `isodatetime` command,
                                   #   defined with the value of CYLC_CYCLING_MODE
                                   #   when in any datetime cycling mode

CYLC_WORKFLOW_FINAL_CYCLE_POINT    # Final cycle point
CYLC_WORKFLOW_INITIAL_CYCLE_POINT  # Initial cycle point
CYLC_WORKFLOW_ID                   # Workflow ID
                                   # e.g. "a/b/c/run1"
CYLC_WORKFLOW_NAME                 # Workflow ID with the run name removed
                                   # (use CYLC_WORKFLOW_ID for most purposes)
                                   # e.g. "a/b/c"
CYLC_WORKFLOW_NAME_BASE            # The basename of the workflow name
                                   # (use CYLC_WORKFLOW_ID for most purposes)
                                   # e.g. "c"
CYLC_UTC                           # UTC mode, True or False
TZ                                 # Set to "UTC" in UTC mode or not defined

CYLC_WORKFLOW_RUN_DIR              # Location of the run directory on
                                   # the job host, e.g. ~/cylc-run/foo
CYLC_WORKFLOW_HOST                 # Host running the workflow process
CYLC_WORKFLOW_OWNER                # User ID running the workflow process

CYLC_WORKFLOW_SHARE_DIR            # Workflow (or task!) shared directory (see below)
CYLC_WORKFLOW_UUID                 # Workflow UUID string
CYLC_WORKFLOW_WORK_DIR             # Workflow work directory (see below)

CYLC_TASK_JOB                      # Job identifier expressed as
                                   # CYCLE-POINT/TASK-NAME/SUBMIT-NUMBER
                                   #   e.g. 20110511T1800Z/t1/01

CYLC_TASK_CYCLE_POINT              # Cycle point, e.g. 20110511T1800Z
ISODATETIMEREF                     # Reference time for the `isodatetime` command,
                                   #   defined with the value of CYLC_TASK_CYCLE_POINT
                                   #   when in any datetime cycling mode

CYLC_TASK_NAME                     # Job's task name, e.g. t1
CYLC_TASK_ID                       # Task instance identifier CYCLE-POINT/TASK-NAME
                                   #   e.g. 20110511T1800Z/t1

CYLC_TASK_SUBMIT_NUMBER            # Job's submit number, e.g. 1,
                                   #   increments with every submit
CYLC_TASK_TRY_NUMBER               # Number of execution tries, e.g. 1
                                   #   increments with automatic execution retry delays.
CYLC_TASK_FLOW_NUMBERS             # Flows this task belongs to, e.g. 1,2

CYLC_TASK_LOG_DIR                  # Location of the job log directory
                                   #   e.g. ~/cylc-run/foo/log/job/20110511T1800Z/t1/01/
CYLC_TASK_LOG_ROOT                 # The job script path
                                   #   e.g. ~/cylc-run/foo/log/job/20110511T1800Z/t1/01/job
CYLC_TASK_WORK_DIR                 # Location of task work directory (see below)
                                   #   e.g. ~/cylc-run/foo/work/20110511T1800Z/t1

CYLC_TASK_NAMESPACE_HIERARCHY      # Linearised family namespace of the task,
                                   #   e.g. root postproc t1
CYLC_TASK_DEPENDENCIES             # List of met dependencies that triggered the task
                                   #   e.g. 1/foo 1/bar
                                   #   (note this variable will not be
                                   #   exported if there are more than 50
                                   #   dependencies)

CYLC_TASK_COMMS_METHOD             # Set to "ssh" if communication method is "ssh"
CYLC_TASK_SSH_LOGIN_SHELL          # With "ssh" communication, if set to "True",
                                   #   use login shell on workflow host

Some global shell variables are also defined in the job script, but not exported to subshells:

CYLC_FAIL_SIGNALS               # List of signals trapped by the error trap
CYLC_VACATION_SIGNALS           # List of signals trapped by the vacation trap
CYLC_TASK_MESSAGE_STARTED_PID   # PID of the "cylc message" (job started) command
CYLC_TASK_WORK_DIR_BASE         # Alternate task work directory,
                                #   relative to the workflow work directory

Workflow Share Directories

The workflow share directory is created automatically under the workflow run directory as a convenient shared space for tasks. The location is available to tasks as $CYLC_WORKFLOW_SHARE_DIR. In a cycling workflow, output files are typically held in cycle point sub-directories of the share directory.
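
For example, a task might stage its outputs under a cycle point sub-directory of the share directory like this (task and file names are illustrative):

[runtime]
    [[postproc]]
        script = """
            mkdir -p "$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT"
            cp forecast.nc "$CYLC_WORKFLOW_SHARE_DIR/$CYLC_TASK_CYCLE_POINT/"
        """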

The top level share directory location can be changed, e.g. to a large data area, by global config settings under global.cylc[install][symlink dirs].

If your workflow creates or installs executables or Python libraries as it is running, these can be placed in:

  • share/bin/ - for executables. This location is automatically added to PATH (before the top-level bin/ in the run dir).

  • share/lib/python/ - for Python modules. This location is automatically added to PYTHONPATH (before the top-level lib/python/ in the run dir).

Note

Cylc will not create these directories.

Task Work Directories

Job scripts are executed from within work directories created automatically under the workflow run directory. A task can access its own work directory via $CYLC_TASK_WORK_DIR (or simply $PWD if it does not change to another location at runtime). By default the location contains task name and cycle point, to provide a unique workspace for every instance of every task.
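
For example (task, script, and file names are illustrative):

[runtime]
    [[proc]]
        script = """
            # the job starts in its own work directory
            echo "running in $CYLC_TASK_WORK_DIR"
            ./process-data > results.txt  # written to the task work directory
        """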

The top level work directory location can be changed, e.g. to a large data area, by global config settings under global.cylc[install][symlink dirs].
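
For example, a global.cylc sketch that relocates the work and share directories for an install target called hpc1 (target name and paths are illustrative):

# global.cylc
[install]
    [[symlink dirs]]
        [[[hpc1]]]
            # symlink the work and share directories to a large data area
            work = /large/data/area
            share = /large/data/area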

Remote Task Hosting

Job platforms are defined in global.cylc[platforms].

If a task declares a different platform from the one where the scheduler is running, Cylc uses non-interactive SSH to submit the job to the platform's job runner on one of the platform hosts. Workflow source files will be installed on the platform, via its associated install target, just before the first job is submitted to run there.

[runtime]
   [[foo]]
       platform = orca

For this to work:

  • Non-interactive SSH is required from the scheduler host to the platform hosts

  • Cylc must be installed on the hosts of the destination platform

    • If polling task communication is used, there is no other requirement

    • If SSH task communication is configured, non-interactive SSH is required from the job platform to the scheduler platform

    • If TCP (default) task communication is configured, the task platform should have access to the Cylc ports on the scheduler host

Platforms, like other runtime settings, can be declared globally in the root family, or in other families, or for individual tasks.
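
For example (platform names are illustrative):

[runtime]
    [[root]]
        platform = hpc1  # default platform for all tasks
    [[big-job]]
        platform = hpc2  # override for this task only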

Note

The platform known as localhost is the platform where the scheduler is running, in many cases a dedicated server and not your desktop.

Internal Platform and Host Selection

The [runtime][<namespace>]platform item points to either a platform or a platform group.

Cylc platforms allow you to configure compute platforms you wish Cylc to run jobs on.

Platform groups allow you to group together platforms any of which would be suitable for a given job. Platform groups can improve robustness by allowing jobs to be submitted on any platform in the group, as well as providing an interface for basic load balancing.

Platforms are selected from a platform group once, when a job is submitted.

Hosts within a platform are re-selected each time the scheduler needs to communicate with a job.
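
A minimal global.cylc sketch of a platform group (platform and host names are illustrative; see Platform Configuration for the full syntax):

# global.cylc
[platforms]
    [[cluster-a]]
        hosts = a-login1, a-login2
    [[cluster-b]]
        hosts = b-login1
[platform groups]
    [[clusters]]
        platforms = cluster-a, cluster-b

A task with platform = clusters may then run on whichever platform Cylc selects from the group at submission time.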

See also

Platform Configuration: For details of how Platforms and Platform Groups are set up and in-depth examples.

External Platform Selection Scripts

Deprecated since version 8.0.0: Cylc 8 can select hosts from a group of suitable hosts listed in the platform config, so in many cases this logic should no longer be necessary.

Instead of hardwiring platform names into the workflow configuration you can give a command that prints a platform name, or an environment variable, as the value of [runtime][<namespace>]platform.

For example:

flow.cylc
[runtime]
    [[mytask]]
        platform = $(script-which-returns-a-platform-name)

Job hosts are always selected dynamically, for the chosen platform or platform group.

Caution

If $(script-which-returns-a-platform-name) returns a non-zero exit code then the scheduler will assign the submit-failed state to this job. If you have submit retries set up for the job, the scheduler will retry running your platform selection script in the same way as it would for any other submission failure.

Remote Job Log Directories

Job stdout and stderr streams are written to log files under the workflow run directory (see Task stdout and stderr Logs). For remote tasks the same directory is used, on the job host.

Implicit Tasks

An implicit task is one that appears in the graph but is not defined under flow.cylc[runtime].

Depending on the value of flow.cylc[scheduler]allow implicit tasks, Cylc can automatically create default task definitions for these, to submit local dummy jobs that just return the standard job status messages.

Implicit tasks can be used to mock up functional workflows very quickly. A default script can be added to the root family, e.g. to slow job execution down a little. Here is a complete workflow definition using implicit tasks:

[scheduler]
    allow implicit tasks = True
[scheduling]
    [[graph]]
        R1 = "prep => run-a & run-b => done"
[runtime]
    [[root]]
        script = "sleep 10"

Warning

Implicit tasks are somewhat dangerous because they can easily be created by mistake: misspelling a task’s name divorces it from its runtime definition.

For this reason implicit tasks are not allowed by default, and if used they should be turned off once the real task definitions are complete.

You can get the convenience without the danger with a little more effort, by adding empty runtime placeholders instead of allowing implicit tasks:

[scheduling]
    [[graph]]
        R1 = "prep => run-a & run-b => done"
[runtime]
    [[root]]
        script = "sleep 10"
    [[prep]]
    [[run-a, run-b]]
    [[done]]

Task Retry On Failure

Tasks can have a list of ISO8601 durations as retry intervals. If the job fails, the task returns to the waiting state with a clock trigger set to expire after the next retry delay.

Note

Tasks only enter the submit-failed state if job submission fails with no retries left. Otherwise they return to the waiting state, to wait on the next try.

Tasks only enter the failed state if job execution fails with no retries left. Otherwise they return to the waiting state, to wait on the next try.

In the following example, tasks bad and flaky each have 3 retries configured, with a 10 second delay between. On the final try, bad fails again and goes to the failed state, while flaky succeeds and triggers task whizz downstream. The scheduler will then stall with bad retained as an incomplete task.

[scheduling]
    [[graph]]
        R1 = """
            bad => cheese
            flaky => whizz
         """
[runtime]
    [[bad]]
        # retry 3 times then fail
        script = """
            sleep 10
            false
        """
        execution retry delays = 3*PT10S
    [[flaky]]
        # retry 3 times then succeed
        script = """
            sleep 10
            test $CYLC_TASK_TRY_NUMBER -gt 3
        """
        execution retry delays = 3*PT10S
    [[cheese, whizz]]
        script = "sleep 10"

Task Event Handling

Task event handlers allow configured commands to run when task events occur.

Note

Cylc supports workflow events (e.g. startup and shutdown) and task events (e.g. submitted and failed).

See also Workflow Event Handling.

Event handlers can be used to send a message, raise an alarm, or whatever you like. They can even call cylc commands to intervene in the workflow.

Task event handlers are configured by flow.cylc[runtime][<namespace>][events].

Note

Task event handlers are called by the scheduler, not by the task jobs that generate the events - so they do not see the job environment.

Event handlers can be stored in the workflow bin directory, or anywhere in $PATH in the scheduler environment.

They should return quickly to avoid tying up the scheduler process pool - see External Command Execution.

Event-Specific Handlers

Event-specific handlers are configured by <event> handlers under [runtime][<namespace>][events], where <event> can be:

Event                 Description

submitted             job submitted
submission retry      job submission failed but will retry later
submission failed     job submission failed
started               job started running
retry                 job failed but will retry later
failed                job failed
succeeded             job succeeded
submission timeout    job timed out in the submitted state
execution timeout     job timed out in the running state
warning               scheduler received a message of severity WARNING from job
critical              scheduler received a message of severity CRITICAL from job
custom                scheduler received a message of severity CUSTOM from job
                      (note: literally, the word CUSTOM)
expired               task expired and will not submit (too far behind)
late                  task running later than expected

Values should be a list of commands, command lines, or command line templates (see below) to call if the specified event is triggered.

General Event Handlers

Alternatively you can configure a list of generic event handlers to be run for configured handler events.

handler events

A list of events which may include any of the above events (e.g. submission failed or warning) or any of a task’s custom outputs.

handlers

A list of commands to be run for these events. Information about the event can be provided using Task Event Template Variables.

Example:

handlers = """
   my-handler %(event)s %(workflow)s,
   echo %(workflow)s-%(event)s >> my-log-file
"""
handler events = submission failed, failed, warning, my-custom-output
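
The handler command itself can be any executable in the workflow bin directory or on $PATH in the scheduler environment. A minimal sketch of what the hypothetical my-handler script above might look like:

#!/bin/bash
# my-handler (hypothetical): record a task event, then exit quickly
# $1 and $2 correspond to %(event)s and %(workflow)s in the handler template
set -eu
EVENT="$1"
WORKFLOW="$2"
echo "$(date -u +%FT%TZ) workflow=$WORKFLOW event=$EVENT" >> "$HOME/cylc-task-events.log"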

Task Event Template Variables

The following variables are available to task event handlers.

They can be templated into event handlers with Python percent style string formatting e.g:

%(workflow)s is running on %(host)s

The %(event)s string, for instance, will be replaced by the actual event name when the handler is invoked.

If no templates or arguments are specified the following default command line will be used:

<event-handler> %(event)s %(workflow)s %(id)s %(message)s

Note

Substitution patterns should not be quoted in the template strings. This is done automatically where required.

For an explanation of the substitution syntax, see String Formatting Operations in the Python documentation.

event

Event name.

workflow

Workflow ID.

suite

Workflow ID.

Deprecated since version 8.0.0: Use “workflow”.

uuid

The unique identification string for this workflow run.

This string is preserved for the lifetime of the scheduler and is restored from the database on restart.

suite_uuid

The unique identification string for this workflow run.

Deprecated since version 8.0.0: Use ‘uuid’.

point

The task’s cycle point.

submit_num

The job’s submit number.

This starts at 1 and increments with each additional job submission.

try_num

The job’s try number.

The number of execution attempts. It starts at 1 and increments with automatic flow.cylc[runtime][<namespace>]execution retry delays.

id

The task ID (i.e. %(point)s/%(name)s).

message

Event message, if any.

job_runner_name

The job runner name.

batch_sys_name

The job runner name.

Deprecated since version 8.0.0: Use “job_runner_name”.

job_id

The job ID in the job runner.

i.e. the job submission ID. For background jobs this is the process ID.

batch_sys_job_id

The job ID in the job runner.

Deprecated since version 8.0.0: Use “job_id”.

submit_time

Date-time when the job was submitted, in ISO8601 format.

start_time

Date-time when the job started, in ISO8601 format.

finish_time

Date-time when the job finished, in ISO8601 format.

platform_name

The name of the platform where the job is submitted.

user@host

The name of the platform where the job is submitted.

Deprecated since version 8.0.0: Use “platform_name”.

Changed in version 8.0.0: This now provides the platform name rather than user@host.

name

The name of the task.

task_url

The URL defined in the task’s metadata.

Deprecated since version 8.0.0: Use URL from <task metadata>.

workflow_url

The URL defined in the workflow’s metadata.

Deprecated since version 8.0.0: Use workflow_URL from workflow_<workflow metadata>.

<task metadata>

Any task metadata defined in flow.cylc[runtime][<namespace>][meta] can be used e.g:

%(title)s

Task title

%(URL)s

Task URL

%(importance)s

Example custom task metadata

workflow_<workflow metadata>

Any workflow metadata defined in flow.cylc[meta] can be used with the workflow_ prefix, e.g.:

%(workflow_title)s

Workflow title

%(workflow_URL)s

Workflow URL.

%(workflow_rating)s

Example custom workflow metadata.

Examples

The following flow.cylc snippets illustrate the two (general and task-specific) ways to configure event handlers:

[runtime]
    [[foo]]
        script = test ${CYLC_TASK_TRY_NUMBER} -eq 2
        execution retry delays = PT0S, PT30S
        [[[events]]]  # event-specific handlers:
            retry handlers = notify-retry.py
            failed handlers = notify-failed.py
[runtime]
    [[foo]]
        script = """
            test ${CYLC_TASK_TRY_NUMBER} -eq 2
            cylc message -- "${CYLC_WORKFLOW_ID}" "${CYLC_TASK_JOB}" 'oopsy daisy'
        """
        execution retry delays = PT0S, PT30S
        [[[events]]]  # general handlers:
            handlers = notify-events.py
            # Note: task output name can be used as an event in this method
            handler events = retry, failed, oops
        [[[outputs]]]
            oops = oopsy daisy

Built-in Email Event Handler

To send an email on task events, configure relevant tasks with a list of events to handle by email. Custom task output names can also be used as event names, in which case the event triggers when the output message is received.

E.g. to send an email on task failed, retry, and a custom message event:

[runtime]
    [[foo]]
        script = """
            test ${CYLC_TASK_TRY_NUMBER} -eq 3
            cylc message -- "${CYLC_WORKFLOW_ID}" "${CYLC_TASK_JOB}" 'oopsy daisy'
        """
        execution retry delays = PT0S, PT30S
        [[[events]]]
            mail events = failed, retry, oops
        [[[outputs]]]
            oops = oopsy daisy

By default, event emails will be sent to the current user with:

  • to: set as $USER

  • from: set as notifications@$(hostname)

  • SMTP server at localhost:25

These defaults can be changed via the [scheduler][mail] settings.

The scheduler batches events over a 5 minute interval, by default, to avoid flooding your Inbox if many events occur in a short time. The batching interval can be configured with [scheduler][mail]task event batch interval.
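
For example, a sketch of the relevant flow.cylc settings (addresses and interval are illustrative):

[scheduler]
    [[mail]]
        to = alerts@example.com
        from = cylc-notifications@example.com
        task event batch interval = PT10M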

Late Events

Warning

The scheduler can only check for lateness once a task has appeared in its active task window. In Cylc 8 this is usually when the task is actually ready to run, which severely limits the usefulness of late events as currently implemented.

If a real time (clock-triggered) workflow performs fairly consistently from one cycle to the next, you may want to be notified when certain tasks are running late with respect to the time they normally trigger in each cycle.

Cylc can generate a late event if a task has not triggered by a given offset from its cycle point in real time. For example, if a task forecast normally triggers at 30 minutes after cycle point, a late event could be configured like this:

[runtime]
   [[forecast]]
        script = run-model.sh
        [[[events]]]
            late offset = PT40M  # allow a 10 minute delay
            late handlers = my-handler %(message)s

Warning

Late offset intervals are not computed automatically so be careful to update them after any workflow change that affects triggering times.