Task Configuration
The [runtime] section of the flow.cylc file defines what job each task should run, and where and how to submit each one. It is an inheritance hierarchy that allows common settings to be factored out and defined once in task families (duplication of configuration is a maintenance risk in a complex workflow).
Task and Family Names
Task and family names must match in the graph and runtime sections of the workflow config file. They do not need to match the names of the external applications wrapped by the tasks.
The rules for valid task and family names (from cylc.flow.unicode_rules.TaskNameValidator):
- must start with: alphanumeric
- can only contain: alphanumeric, -, +, %, @
- cannot start with: _cylc
- cannot be: root
Note
At runtime, tasks can access their own workflow task name as $CYLC_TASK_NAME in the job environment, if needed.
The following runtime configuration defines one family called FAM and two member tasks fm1 and fm2 that inherit settings from it. Members can also override inherited settings and define their own private settings.
[runtime]
    [[FAM]]  # <-- a family
        #... settings for all FAM members
    [[fm1]]  # <-- a task
        inherit = FAM
        #... fm1-specific settings
    [[fm2]]  # <-- a task
        inherit = FAM
        #... fm2-specific settings
Note that families are not nested in terms of the file sub-heading structure. A runtime subsection defines a family if others inherit from it, otherwise it defines a task.
The Root Family
All tasks inherit implicitly from a family called root that can provide default settings for all tasks in the workflow (non-root families require an explicit inherit statement).
For example, if all tasks are to run on the same platform, that can be specified once for all tasks under root:
[runtime]
    [[root]]
        # all tasks run on hpc1 (unless they override this setting)
        platform = hpc1
Defining Multiple Tasks or Families at Once
Runtime sub-section headings can be a comma-separated list of task or family names, in which case the settings below apply to each member of the list.
Here a group of three related tasks all run the same script on the same platform, but pass their own names to it on the command line:
[runtime]
    [[ENSEMBLE]]
        platform = hpc1
        script = "run-model.sh $CYLC_TASK_NAME"
    [[m1, m2, m3]]
        inherit = ENSEMBLE
    [[m1]]
        #... m1-specific settings
Particular tasks (such as m1 above) can still be singled out to add task-specific settings.
Note
Task parameters or template processing (see Jinja2 and EmPy) can be used to programmatically generate family members and associated dependencies.
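As an illustration of the template approach, here is a minimal Jinja2 sketch that generates the three ensemble members from the example above (the loop bounds are an assumption for illustration):
#!Jinja2
[runtime]
    [[ENSEMBLE]]
        platform = hpc1
        script = "run-model.sh $CYLC_TASK_NAME"
{% for n in range(1, 4) %}
    [[m{{ n }}]]   # expands to m1, m2, m3
        inherit = ENSEMBLE
{% endfor %}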
Families of Families
Families can inherit from other families, to any depth.
[runtime]
    [[HPC1]]
        platform = hpc1
    [[BIG-HPC1]]
        inherit = HPC1
        #... add in high memory batch system directives
    [[model]]  # a big task that runs on hpc1
        inherit = BIG-HPC1
If the same item is defined (and redefined) at several levels in the family tree, the highest level (closest to the task) takes precedence.
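For example, in the following sketch (assuming a platform whose job runner accepts a --mem directive), BIG-HPC1 redefines a directive that is also set in HPC1, so model ends up with the BIG-HPC1 value:
[runtime]
    [[HPC1]]
        platform = hpc1
        [[[directives]]]
            --mem = 4G
    [[BIG-HPC1]]
        inherit = HPC1
        [[[directives]]]
            --mem = 256G   # overrides the value inherited from HPC1
    [[model]]
        inherit = BIG-HPC1   # model gets --mem = 256G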
Inheriting from Multiple Parents
Sometimes a multi-level single-parent tree is not sufficient to avoid all duplication of settings. Fortunately tasks can inherit from multiple parents at once [1]:
[runtime]
    [[HPC1]]
        platform = hpc1
    [[BIG]]
        # high memory batch system directives
        #...
    [[model]]  # a big task that runs on hpc1
        inherit = BIG, HPC1
Tip
Use cylc config to check exactly what settings a task or family ends up with after inheritance processing:
$ cylc config --item "[runtime][model]environment" <workflow-id>
First-parent Family Hierarchy for Visualization
Tasks can be collapsed into first-parent families in the Cylc GUI, so first parents should reflect the logical purpose of a task where possible, rather than (say) shared technical settings:
[runtime]
    [[HPC]]
        # technical platform settings
    [[MODEL]]
        # atmospheric model tasks
    [[atmos]]
        inherit = MODEL, HPC  # (not HPC, MODEL)
If this is not what you want, given that the primary purpose of the family hierarchy is inheritance of runtime settings, a dummy first parent None can be used to disable the visualization usage without affecting inheritance:
[runtime]
    [[BAR]]
        #...
    [[foo]]
        # inherit from BAR but stay under root for visualization
        inherit = None, BAR
Job Environment
Job scripts export various environment variables before running script blocks (see Job Submission and Management).
Scheduler-defined variables appear first to identify the workflow, the task, and log directory locations. These are followed by user-defined variables from [runtime][<namespace>][environment]. Order of variable definition is preserved, so new variable assignments can reference previous ones.
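For example, because definition order is preserved, a variable can reference one defined just above it (the directory and file names here are illustrative):
[runtime]
    [[foo]]
        [[[environment]]]
            DATA_DIR = $CYLC_WORKFLOW_SHARE_DIR/data
            INPUT_FILE = $DATA_DIR/input.nc   # references the variable defined above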
Note
Task environment variables are evaluated at runtime, by jobs, on the job platform. So $HOME in a task environment, for instance, evaluates at runtime to the home directory on the job platform, not on the scheduler platform.
In this example the task foo ends up with SHAPE=circle, COLOR=blue, and TEXTURE=rough in its environment:
[runtime]
    [[root]]
        [[[environment]]]
            COLOR = red
            SHAPE = circle
    [[foo]]
        [[[environment]]]
            COLOR = blue      # root override
            TEXTURE = rough   # new variable
Job access to Cylc itself is configured first so that variable assignment expressions (as well as scripting) can use Cylc commands:
[runtime]
    [[foo]]
        [[[environment]]]
            REFERENCE_TIME = $(cylc cyclepoint --offset-hours=6)
Overriding Inherited Environment Variables
Warning
If you override an inherited task environment variable, the parent config item gets replaced before it is ever used to define the shell variable in the job script. Consequently the job cannot see both the parent value and the task value:
[runtime]
    [[FOO]]
        [[[environment]]]
            COLOR = red
    [[bar]]
        inherit = FOO
        [[[environment]]]
            tmp = $COLOR        # !! ERROR: $COLOR is undefined here
            COLOR = dark-$tmp   # !! as this overrides COLOR in FOO.
The compressed variant of this, COLOR = dark-$COLOR, is also an error for the same reason. To achieve the desired result, use a different name for the parent variable:
[runtime]
    [[FOO]]
        [[[environment]]]
            FOO_COLOR = red
    [[bar]]
        inherit = FOO
        [[[environment]]]
            COLOR = dark-$FOO_COLOR  # OK
Job Script Variables
These variables provided by the scheduler are available to job scripts:
CYLC_VERSION # Version of cylc installation used
CYLC_VERBOSE # Verbose mode, true or false
CYLC_DEBUG # Debug mode (even more verbose), true or false
CYLC_CYCLING_MODE # Cycling mode, e.g. gregorian
ISODATETIMECALENDAR # Calendar mode for the `isodatetime` command,
# defined with the value of CYLC_CYCLING_MODE
# when in any datetime cycling mode
CYLC_WORKFLOW_FINAL_CYCLE_POINT # Final cycle point
CYLC_WORKFLOW_INITIAL_CYCLE_POINT # Initial cycle point
CYLC_WORKFLOW_ID # Workflow ID
# e.g. "a/b/c/run1"
CYLC_WORKFLOW_NAME # Workflow ID with the run name removed
# (use CYLC_WORKFLOW_ID for most purposes)
# e.g. "a/b/c"
CYLC_WORKFLOW_NAME_BASE # The basename of the workflow name
# (use CYLC_WORKFLOW_ID for most purposes)
# e.g. "c"
CYLC_UTC # UTC mode, True or False
TZ # Set to "UTC" in UTC mode or not defined
CYLC_WORKFLOW_RUN_DIR # Location of the run directory on the
# job host, e.g. ~/cylc-run/foo
CYLC_WORKFLOW_HOST # Host running the workflow process
CYLC_WORKFLOW_OWNER # User ID running the workflow process
CYLC_WORKFLOW_SHARE_DIR # Workflow (or task!) shared directory (see below)
CYLC_WORKFLOW_UUID # Workflow UUID string
CYLC_WORKFLOW_WORK_DIR # Workflow work directory (see below)
CYLC_TASK_JOB # Job identifier expressed as
# CYCLE-POINT/TASK-NAME/SUBMIT-NUMBER
# e.g. 20110511T1800Z/t1/01
CYLC_TASK_CYCLE_POINT # Cycle point, e.g. 20110511T1800Z
ISODATETIMEREF # Reference time for the `isodatetime` command,
# defined with the value of CYLC_TASK_CYCLE_POINT
# when in any datetime cycling mode
CYLC_TASK_NAME # Job's task name, e.g. t1
CYLC_TASK_ID # Task instance identifier CYCLE-POINT/TASK-NAME
# e.g. 20110511T1800Z/t1
CYLC_TASK_SUBMIT_NUMBER # Job's submit number, e.g. 1,
# increments with every submit
CYLC_TASK_TRY_NUMBER # Number of execution tries, e.g. 1
# increments with automatic execution retry delays.
CYLC_TASK_FLOW_NUMBERS # Flows this task belongs to, e.g. 1,2
CYLC_TASK_LOG_DIR # Location of the job log directory
# e.g. ~/cylc-run/foo/log/job/20110511T1800Z/t1/01/
CYLC_TASK_LOG_ROOT # The job script path
# e.g. ~/cylc-run/foo/log/job/20110511T1800Z/t1/01/job
CYLC_TASK_WORK_DIR # Location of task work directory (see below)
# e.g. ~/cylc-run/foo/work/20110511T1800Z/t1
CYLC_TASK_NAMESPACE_HIERARCHY # Linearised family namespace of the task,
# e.g. root postproc t1
CYLC_TASK_COMMS_METHOD # Set to "ssh" if communication method is "ssh"
CYLC_TASK_SSH_LOGIN_SHELL # With "ssh" communication, if set to "True",
# use login shell on workflow host
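For illustration, a task script might use some of these variables like this (the task and file names are hypothetical):
[runtime]
    [[archive]]
        script = """
            echo "Archiving cycle ${CYLC_TASK_CYCLE_POINT} of ${CYLC_WORKFLOW_ID}"
            # copy a (hypothetical) output file into the workflow share directory
            cp model-output.nc \
                "${CYLC_WORKFLOW_SHARE_DIR}/output.${CYLC_TASK_CYCLE_POINT}.nc"
        """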
Some global shell variables are also defined in the job script, but not exported to subshells:
CYLC_FAIL_SIGNALS # List of signals trapped by the error trap
CYLC_VACATION_SIGNALS # List of signals trapped by the vacation trap
CYLC_TASK_MESSAGE_STARTED_PID # PID of the "cylc message ... started" command
CYLC_TASK_WORK_DIR_BASE # Alternate task work directory,
# relative to the workflow work directory
Task Work Directories
Job scripts are executed from within work directories created automatically under the workflow run directory. A task can access its own work directory via $CYLC_TASK_WORK_DIR (or simply $PWD if it does not change to another location at runtime). By default the location contains the task name and cycle point, to provide a unique workspace for every instance of every task.
The top level work directory location can be changed, e.g. to a large data area, by global config settings under global.cylc[install][symlink dirs].
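A minimal sketch of such a global config setting, assuming an install target called hpc1 and an illustrative data area path:
# global.cylc
[install]
    [[symlink dirs]]
        [[[hpc1]]]
            work = /large/data/area   # work directories are symlinked under here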
Remote Task Hosting
Job platforms are defined in global.cylc[platforms].
If a task declares a different platform to that where the scheduler is running, Cylc uses non-interactive SSH to submit the job to the platform job runner on one of the platform hosts. Workflow source files will be installed on the platform, via the associated global.cylc[install targets], just before the first job is submitted to run there.
[runtime]
    [[foo]]
        platform = orca
For this to work:
- Non-interactive SSH is required from the scheduler host to the platform hosts.
- Cylc must be installed on the hosts of the destination platform.
- If polling task communication is used, there is no other requirement.
- If SSH task communication is configured, non-interactive SSH is required from the job platform to the scheduler platform.
- If TCP (default) task communication is configured, the task platform should have access to the Cylc ports on the scheduler host.
Platforms, like other runtime settings, can be declared globally in the root family, or in other families, or for individual tasks.
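For reference, a platform such as orca above would be defined in the global configuration. A minimal sketch (the platform name, hosts, and job runner are illustrative assumptions):
# global.cylc
[platforms]
    [[orca]]
        hosts = orca-login1, orca-login2
        job runner = slurm
        install target = orca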
Note
The platform known as localhost is the platform where the scheduler is running, in many cases a dedicated server and not your desktop.
Internal Platform and Host Selection
The [runtime][<namespace>]platform item points to either a platform or a platform group.
Cylc platforms allow you to configure compute platforms you wish Cylc to run jobs on. Platform groups allow you to group together platforms, any of which would be suitable for a given job. Platform groups can improve robustness by allowing jobs to be submitted on any platform in the group, as well as providing an interface for basic load balancing.
Platforms are selected from a platform group once, when a job is submitted.
Hosts within a platform are re-selected each time the scheduler needs to communicate with a job.
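Platform groups are also defined in the global configuration. A minimal sketch (the group and platform names are illustrative assumptions):
# global.cylc
[platform groups]
    [[big_jobs]]
        platforms = orca, kraken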
See also
Platform Configuration: For details of how Platforms and Platform Groups are set up and in-depth examples.
External Platform Selection Scripts
Deprecated since version 8.0.0: Cylc 8 can select hosts from a group of suitable hosts listed in the platform config, so in many cases this logic should no longer be necessary.
Instead of hardwiring platform names into the workflow configuration you can give a command that prints a platform name, or an environment variable, as the value of [runtime][<namespace>]platform.
For example:
[runtime]
    [[mytask]]
        platform = $(script-which-returns-a-platform-name)
Job hosts are always selected dynamically, for the chosen platform or platform group.
Caution
If $(script-which-returns-a-platform-name) returns a non-zero exit code then the scheduler will assign the submit-failed state to this job. If you have submit retries set up for the job, the scheduler will retry running your platform selection script in the same way as it would for any other submission failure.
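For example, submission retries can be configured alongside the selection script using submission retry delays (the intervals here are illustrative):
[runtime]
    [[mytask]]
        platform = $(script-which-returns-a-platform-name)
        # retry job submission (and hence platform selection) up to 3 times
        submission retry delays = 3*PT1M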
Remote Job Log Directories
Job stdout and stderr streams are written to log files under the workflow run directory (see Task stdout and stderr Logs). For remote tasks the same directory is used, on the job host.
Implicit Tasks
An implicit task is one that appears in the graph but is not defined under flow.cylc[runtime].
Depending on the value of flow.cylc[scheduler]allow implicit tasks, Cylc can automatically create default task definitions for these, to submit local dummy jobs that just return the standard job status messages.
Implicit tasks can be used to mock up functional workflows very quickly. A default script can be added to the root family, e.g. to slow job execution down a little. Here is a complete workflow definition using implicit tasks:
[scheduler]
    allow implicit tasks = True
[scheduling]
    [[graph]]
        R1 = "prep => run-a & run-b => done"
[runtime]
    [[root]]
        script = "sleep 10"
Warning
Implicit tasks are somewhat dangerous because they can easily be created by mistake: misspelling a task’s name divorces it from its runtime definition.
For this reason implicit tasks are not allowed by default, and if used they should be turned off once the real task definitions are complete.
You can get the convenience without the danger with a little more effort, by adding empty runtime placeholders instead of allowing implicit tasks:
[scheduling]
    [[graph]]
        R1 = "prep => run-a & run-b => done"
[runtime]
    [[root]]
        script = "sleep 10"
    [[prep]]
    [[run-a, run-b]]
    [[done]]
Task Retry On Failure
Tasks can have a list of ISO8601 durations as retry intervals. If the job fails, the task will return to the waiting state with a clock-trigger configured with the next retry delay.
Note
Tasks only enter the submit-failed state if job submission fails with no retries left. Otherwise they return to the waiting state, to wait on the next try.
Tasks only enter the failed state if job execution fails with no retries left. Otherwise they return to the waiting state, to wait on the next try.
In the following example, tasks bad and flaky each have 3 retries configured, with a 10 second delay between. On the final try, bad fails again and goes to the failed state, while flaky succeeds and triggers task whizz downstream. The scheduler will then stall because bad failed (which is a final status) with incomplete outputs.
[scheduling]
    [[graph]]
        R1 = """
            bad => cheese
            flaky => whizz
        """
[runtime]
    [[bad]]
        # retry 3 times then fail
        script = """
            sleep 10
            false
        """
        execution retry delays = 3*PT10S
    [[flaky]]
        # retry 3 times then succeed
        script = """
            sleep 10
            test $CYLC_TASK_TRY_NUMBER -gt 3
        """
        execution retry delays = 3*PT10S
    [[cheese, whizz]]
        script = "sleep 10"
Task Event Handling
Task event handlers allow configured commands to run when task events occur.
Note
Cylc supports workflow events, e.g. startup and shutdown, and task events, e.g. submitted and failed.
See also Workflow Events.
Event handlers can be used to send a message, raise an alarm, or whatever you like. They can even call cylc commands to intervene in the workflow.
Task event handlers are configured by flow.cylc[runtime][<namespace>][events].
Note
Task event handlers are called by the scheduler, not by the task jobs that generate the events - so they do not see the job environment.
Event handlers can be stored in the workflow bin directory, or anywhere in $PATH in the scheduler environment.
They should return quickly to avoid tying up the scheduler process pool - see External Command Execution.
Event-Specific Handlers
Event-specific handlers are configured by <event> handlers under [runtime][<namespace>][events], where <event> can be:
Event | Description
---|---
submitted | job submitted
submission retry | job submission failed but will retry later
submission failed | job submission failed
started | job started running
retry | job failed but will retry later
failed | job failed
succeeded | job succeeded
submission timeout | job timed out in the submitted state
execution timeout | job timed out in the running state
warning | scheduler received a message of severity WARNING from job
critical | scheduler received a message of severity CRITICAL from job
custom | scheduler received a message of severity CUSTOM from job
expired | task expired and will not submit (too far behind)
late | task running later than expected
Values should be a list of commands, command lines, or command line templates (see below) to call if the specified event is triggered.
General Event Handlers
Alternatively you can configure a list of generic event handlers to be run for configured handler events.
- handler events
  A list of events, which may include any of the above events (e.g. submission failed or warning) or any of a task’s custom outputs.
- handlers
  A list of commands to be run for these events. Information about the event can be provided using Task Event Template Variables.
Example:
handlers = """
my-handler %(event)s %(workflow)s,
echo %(workflow)s-%(event)s >> my-log-file
"""
handler events = submission failed, failed, warning, my-custom-output
Task Event Template Variables
- event
  Event name.
- workflow
  Workflow ID.
- suite
  Workflow ID.
  Deprecated since version 8.0.0: Use “workflow”.
- uuid
  The unique identification string for this workflow run.
  This string is preserved for the lifetime of the scheduler and is restored from the database on restart.
- suite_uuid
  The unique identification string for this workflow run.
  Deprecated since version 8.0.0: Use “uuid”.
- point
  The task’s cycle point.
- submit_num
  The job’s submit number.
  This starts at 1 and increments with each additional job submission.
- try_num
  The job’s try number: the number of execution attempts. It starts at 1 and increments with automatic flow.cylc[runtime][<namespace>]execution retry delays.
- id
  The task ID (i.e. %(point)s/%(name)s).
- message
  Event message, if any.
- job_runner_name
  The job runner name.
- batch_sys_name
  The job runner name.
  Deprecated since version 8.0.0: Use “job_runner_name”.
- job_id
  The job ID in the job runner, i.e. the job submission ID. For background jobs this is the process ID.
- batch_sys_job_id
  The job ID in the job runner.
  Deprecated since version 8.0.0: Use “job_id”.
- submit_time
  Date-time when the job was submitted, in ISO8601 format.
- start_time
  Date-time when the job started, in ISO8601 format.
- finish_time
  Date-time when the job finished, in ISO8601 format.
- platform_name
  The name of the platform where the job is submitted.
- user@host
  The name of the platform where the job is submitted.
  Deprecated since version 8.0.0: Use “platform_name”.
  Changed in version 8.0.0: This now provides the platform name rather than user@host.
- name
  The name of the task.
- task_url
  The URL defined in the task’s metadata.
  Deprecated since version 8.0.0: Use URL from <task metadata>.
- workflow_url
  The URL defined in the workflow’s metadata.
  Deprecated since version 8.0.0: Use workflow_URL from workflow_<workflow metadata>.
- <task metadata>
  Any task metadata defined in flow.cylc[runtime][<namespace>][meta] can be used, e.g.:
  %(title)s        Task title
  %(URL)s          Task URL
  %(importance)s   Example custom task metadata
- workflow_<workflow metadata>
  Any workflow metadata defined in flow.cylc[meta] can be used with the workflow_ prefix, e.g.:
  %(workflow_title)s    Workflow title
  %(workflow_URL)s      Workflow URL
  %(workflow_rating)s   Example custom workflow metadata
The variables listed above are available to task event handlers. They can be templated into handler commands with Python percent-style string formatting, e.g.:
%(workflow)s is running on %(host)s
The %(event)s string, for instance, will be replaced by the actual event name when the handler is invoked.
If no templates or arguments are specified the following default command line will be used:
<event-handler> %(event)s %(workflow)s %(id)s %(message)s
Note
Substitution patterns should not be quoted in the template strings. This is done automatically where required.
For an explanation of the substitution syntax, see String Formatting Operations in the Python documentation.
Examples
The following flow.cylc snippets illustrate the two (general and task-specific) ways to configure event handlers:
[runtime]
    [[foo]]
        script = test ${CYLC_TASK_TRY_NUMBER} -eq 2
        execution retry delays = PT0S, PT30S
        [[[events]]]  # event-specific handlers:
            retry handlers = notify-retry.py
            failed handlers = notify-failed.py
[runtime]
    [[foo]]
        script = """
            test ${CYLC_TASK_TRY_NUMBER} -eq 2
            cylc message -- "${CYLC_WORKFLOW_ID}" "${CYLC_TASK_JOB}" 'oopsy daisy'
        """
        execution retry delays = PT0S, PT30S
        [[[events]]]  # general handlers:
            handlers = notify-events.py
            # Note: task output name can be used as an event in this method
            handler events = retry, failed, oops
        [[[outputs]]]
            oops = oopsy daisy
Built-in Email Event Handler
To send an email on task events, configure relevant tasks with a list of events to handle by email. Custom task output names can also be used as event names, in which case the event triggers when the output message is received.
E.g. to send an email on task failed, retry, and a custom message event:
[runtime]
    [[foo]]
        script = """
            test ${CYLC_TASK_TRY_NUMBER} -eq 3
            cylc message -- "${CYLC_WORKFLOW_ID}" "${CYLC_TASK_JOB}" 'oopsy daisy'
        """
        execution retry delays = PT0S, PT30S
        [[[events]]]
            mail events = failed, retry, oops
        [[[outputs]]]
            oops = oopsy daisy
By default, event emails will be sent to the current user with:
- to: set as $USER
- from: set as notifications@$(hostname)
- SMTP server at localhost:25
These can be configured using the settings:
- [mail]to (list of email addresses)
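For example, recipients can be set per task or family alongside the mail events (the address is illustrative):
[runtime]
    [[foo]]
        [[[mail]]]
            to = alice@example.com
        [[[events]]]
            mail events = failed, retry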
The scheduler batches events over a 5 minute interval, by default, to avoid flooding your Inbox if many events occur in a short time. The batching interval can be configured with [scheduler][mail]task event batch interval.
Late Events
Warning
The scheduler can only check for lateness once a task has appeared in its active task window. In Cylc 8 this is usually when the task is actually ready to run, which severely limits the usefulness of late events as currently implemented.
If a real time (clock-triggered) workflow performs fairly consistently from one cycle to the next, you may want to be notified when certain tasks are running late with respect to the time they normally trigger in each cycle.
Cylc can generate a late event if a task has not triggered by a given offset from its cycle point in real time. For example, if a task forecast normally triggers at 30 minutes after cycle point, a late event could be configured like this:
[runtime]
    [[forecast]]
        script = run-model.sh
        [[[events]]]
            late offset = PT40M  # allow a 10 minute delay
            late handlers = my-handler %(message)s
Warning
Late offset intervals are not computed automatically so be careful to update them after any workflow change that affects triggering times.