Glossary

branching
graph branching

Cylc handles graphs in an event-driven manner, which means that a workflow can follow different paths in different eventualities. This is called “branching”.

For example, the following workflow follows one of two possible paths depending on the outcome of task b:

digraph example {
    subgraph cluster_success {
        label = ":succeed"
        color = "green"
        fontcolor = "green"
        style = "dashed"
        c
    }
    subgraph cluster_failure {
        label = ":fail"
        color = "red"
        fontcolor = "red"
        style = "dashed"
        r
    }
    a -> b -> c -> d
    b -> r -> d
}
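
A minimal sketch of how the graph pictured above might be written. It assumes a Cylc 8 release that supports the ? notation for optional outputs; that notation is not described in this glossary, so treat it as an assumption:

```cylc
[scheduling]
    [[graph]]
        R1 = """
            a => b
            # b may either succeed or fail; "?" marks each outcome as optional
            b? => c
            b:fail? => r
            # d runs after whichever branch was followed
            c | r => d
        """
```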

cold start

A cold start is one in which the workflow starts from the initial cycle point. This is the default behaviour of cylc play for a workflow that hasn’t been run before.

conditional dependency
conditional trigger

A conditional dependency is a dependency which uses the & (and) or | (or) operators, for example:

a & (b | c) => d

contact file

The contact file records information about a running workflow such as the host it is running on, the TCP port(s) it is listening on and the process ID. The file is called contact and lives inside the workflow’s service directory.

The contact file only exists when the workflow is running. If you delete the contact file, the workflow will (after a delay) notice this and shut down.

Warning

In the event that a workflow process dies in an uncontrolled way, for example if the process is killed or the host which is running the process crashes, the contact file may be erroneously left behind. Some Cylc commands will automatically detect such files and remove them; otherwise they should be removed manually.

custom task output

A custom task output is a user-defined message sent from the job to the workflow server. These can be used as message triggers.

cycle

In a cycling workflow one cycle is one repetition of the workflow.

For example, in the following workflow each dotted box represents a cycle and the tasks within it are the tasks belonging to that cycle. The numbers (i.e. 1, 2, 3) are the cycle points.

digraph example {
    size = "3,5"
    subgraph cluster_1 {
        label = "1"
        style = dashed
        "foo.1" [label="foo\n1"]
        "bar.1" [label="bar\n1"]
        "baz.1" [label="baz\n1"]
    }
    subgraph cluster_2 {
        label = "2"
        style = dashed
        "foo.2" [label="foo\n2"]
        "bar.2" [label="bar\n2"]
        "baz.2" [label="baz\n2"]
    }
    subgraph cluster_3 {
        label = "3"
        style = dashed
        "foo.3" [label="foo\n3"]
        "bar.3" [label="bar\n3"]
        "baz.3" [label="baz\n3"]
    }
    "foo.1" -> "bar.1" -> "baz.1"
    "foo.2" -> "bar.2" -> "baz.2"
    "foo.3" -> "bar.3" -> "baz.3"
    "bar.1" -> "bar.2" -> "bar.3"
}

cycle point

A cycle point is the unique label given to a particular cycle. If the workflow is using integer cycling then the cycle points will be numbers e.g. 1, 2, 3, etc. If the workflow is using datetime cycling then the labels will be ISO8601 datetimes e.g. 2000-01-01T00:00Z.

cycling

A cycling workflow is one in which the workflow repeats.

cylc-run directory

The directory that contains workflows. This is, by default, ~/cylc-run but may be configured using global.cylc[install][symlink dirs].

datetime cycling

Datetime cycling is the default for a cycling workflow. When using datetime cycling, cycle points will be ISO8601 datetimes, e.g. 2000-01-01T00:00Z, and ISO8601 recurrences can be used, e.g. P3D means every third day.
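
As a minimal sketch, a datetime cycling workflow that repeats every third day could look like this. The task names foo and bar are hypothetical and their [runtime] definitions are omitted for brevity:

```cylc
[scheduling]
    # datetime cycling is the default, so no "cycling mode" is set
    initial cycle point = 2000-01-01T00:00Z
    [[graph]]
        # run foo then bar every third day
        P3D = foo => bar
```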

dependency

A dependency is a relationship between two tasks which describes a constraint on one.

For example, the dependency foo => bar means that the task bar is dependent on the task foo. This means that the task bar will only run once the task foo has successfully completed.

directive

Directives are used by job runners to determine what a job’s requirements are, e.g. how much memory it requires.

Directives are set in [runtime][<namespace>][directives].
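
A hedged sketch of what this might look like for a task running on a Slurm platform. The task name big_model is hypothetical, the platform my_hpc is borrowed from the platform entry of this glossary, and --mem and --time are Slurm options chosen purely for illustration:

```cylc
[runtime]
    [[big_model]]
        platform = my_hpc
        [[[directives]]]
            # passed on to the job runner (Slurm here) with the job
            --mem = 8G
            --time = 00:30:00
```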

event handlers
handlers

An event handler is an action that you want the Cylc scheduler to run when it detects that an event has occurred:

  • For the scheduler; for example startup, stall or shutdown.

  • For a task; for example when the task state changes to succeeded, failed or submit-failed.

This allows Cylc to centralize the automated handling of critical events.

Possible use-cases include (but are not limited to):

  • Send an email message.

  • Run a Cylc command.

  • Run _any_ user-specified script or command.
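
A minimal sketch of task event handling, assuming the [events] settings of the [runtime] section. The task name foo and the echo command are hypothetical stand-ins for a real task and a real handler script:

```cylc
[runtime]
    [[foo]]
        [[[events]]]
            # send an email when foo fails
            mail events = failed
            # run a command when foo fails; the %(...)s fields are
            # filled in by the scheduler
            failed handlers = echo "%(id)s: %(event)s"
```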

family

In Cylc a family is a collection of tasks which share a common configuration and which can be referred to collectively in the graph.

By convention families are named in upper case with the exception of the special root family from which all tasks inherit.

family inheritance

A task can be “added” to a family by “inheriting” from it using the [runtime][<namespace>]inherit configuration.

For example, the task named task “belongs” to the family FAMILY in the following snippet:

[runtime]
    [[FAMILY]]
        [[[environment]]]
            FOO = foo
    [[task]]
        inherit = FAMILY

A task can inherit from multiple families by writing a comma-separated list, e.g.:

inherit = foo, bar, baz

family trigger

Tasks which “belong” to (inherit from) a family can be referred to collectively in the graph using a family trigger.

A family trigger is written using the name of the family followed by a special qualifier i.e. FAMILY_NAME:qualifier. The most commonly used qualifiers are:

succeed-all

The dependency will only be met when all of the tasks in the family have succeeded.

succeed-any

The dependency will be met as soon as any one of the tasks in the family has succeeded.

finish-all

The dependency will only be met once all of the tasks in the family have finished (either succeeded or failed).
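
As an illustrative sketch, a family trigger could be used as follows. The names FAMILY, member_1, member_2 and post are hypothetical:

```cylc
[scheduling]
    [[graph]]
        R1 = """
            # post runs once every member of FAMILY has succeeded
            FAMILY:succeed-all => post
        """
[runtime]
    [[FAMILY]]
    [[member_1, member_2]]
        inherit = FAMILY
    [[post]]
```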

final cycle point

In a cycling workflow the final cycle point is the point at which cycling ends.

It is set by [scheduling]final cycle point.

If the final cycle point were 2001 then the final cycle would be no later than the 1st of January 2001.
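
A small sketch combining an initial and a final cycle point. The task name foo is hypothetical and its [runtime] definition is omitted:

```cylc
[scheduling]
    initial cycle point = 2000
    final cycle point = 2001
    [[graph]]
        # one cycle per year: 1st January 2000, then 1st January 2001
        P1Y = foo
```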

flow

A flow is a single logical run of a workflow that is done by a scheduler.

A flow can be played and paused, stopped and restarted.

A flow begins at the start cycle point and ends at the stop cycle point.

It is possible to run more than one flow in a single scheduler.

graph

The graph of a workflow refers to the graph strings contained within the [scheduling][graph] section. For example, the following is, collectively, a graph:

P1D = foo => bar
PT12H = baz

digraph example {
    size = "7,15"
    subgraph cluster_1 {
        label = "2000-01-01T00:00Z"
        style = dashed
        "foo.01T00" [label="foo\n2000-01-01T00:00Z"]
        "bar.01T00" [label="bar\n2000-01-01T00:00Z"]
        "baz.01T00" [label="baz\n2000-01-01T00:00Z"]
    }
    subgraph cluster_2 {
        label = "2000-01-01T12:00Z"
        style = dashed
        "baz.01T12" [label="baz\n2000-01-01T12:00Z"]
    }
    subgraph cluster_3 {
        label = "2000-01-02T00:00Z"
        style = dashed
        "foo.02T00" [label="foo\n2000-01-02T00:00Z"]
        "bar.02T00" [label="bar\n2000-01-02T00:00Z"]
        "baz.02T00" [label="baz\n2000-01-02T00:00Z"]
    }
    "foo.01T00" -> "bar.01T00"
    "foo.02T00" -> "bar.02T00"
}

graph string

A graph string is a collection of dependencies which are placed inside the [scheduling][graph] section, e.g.:

foo => bar => baz & pub => qux
pub => bool

hold
held task
hold after cycle point

A task can be held using cylc hold, which prevents it from submitting jobs. Both active tasks (n=0) and future tasks (n>0) can be held; the latter will be immediately held when they spawn.

It is also possible to set a “hold after cycle point”; all tasks after this cycle point will be held.

Note

Workflows can be paused and unpaused/resumed.

Tasks can be held and released.

When a workflow is unpaused any held tasks remain held.

implicit task

An implicit task (previously known as a naked task) is a task in the graph that does not have an explicit runtime definition. For example, bar is an implicit task in the following workflow:

[scheduling]
    [[graph]]
        R1 = foo & bar
[runtime]
    [[foo]]

Implicit tasks are not allowed by default, as they are often typos. However, it is possible to allow them using flow.cylc[scheduler]allow implicit tasks during development of a workflow.
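
For illustration, the workflow above could be made to accept the implicit task bar during development like this (a sketch using the setting named in the previous paragraph):

```cylc
[scheduler]
    # development only: tolerate tasks that have no [runtime] definition
    allow implicit tasks = True
[scheduling]
    [[graph]]
        R1 = foo & bar
[runtime]
    [[foo]]
```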

initial cycle point

In a cycling workflow the initial cycle point is the point from which cycling begins.

It is set by [scheduling]initial cycle point.

If the initial cycle point were 2000 then the first cycle would be on the 1st of January 2000.

integer cycling

An integer cycling workflow is a cycling workflow that uses integers, rather than datetimes, as cycle points. When a workflow uses integer cycling, integer recurrences may be used in the graph, e.g. P3 means every third cycle. This is configured by setting [scheduling]cycling mode = integer.
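
A minimal sketch of an integer cycling workflow that uses the P3 recurrence. The task names foo and bar are hypothetical and their [runtime] definitions are omitted:

```cylc
[scheduling]
    cycling mode = integer
    initial cycle point = 1
    [[graph]]
        # every third cycle, counted from the initial cycle point: 1, 4, 7, ...
        P3 = foo => bar
```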

inter-cycle dependency
inter-cycle trigger

In a cycling workflow an inter-cycle dependency is a dependency between two tasks in different cycles.

For example, in the following workflow the task bar is dependent on its previous occurrence:

[scheduling]
    initial cycle point = 1
    cycling mode = integer
    [[graph]]
        P1 = """
            foo => bar => baz
            bar[-P1] => bar
        """

digraph example {
    size = "3,5"
    subgraph cluster_1 {
        label = "1"
        style = dashed
        "foo.1" [label="foo\n1"]
        "bar.1" [label="bar\n1"]
        "baz.1" [label="baz\n1"]
    }
    subgraph cluster_2 {
        label = "2"
        style = dashed
        "foo.2" [label="foo\n2"]
        "bar.2" [label="bar\n2"]
        "baz.2" [label="baz\n2"]
    }
    subgraph cluster_3 {
        label = "3"
        style = dashed
        "foo.3" [label="foo\n3"]
        "bar.3" [label="bar\n3"]
        "baz.3" [label="baz\n3"]
    }
    "foo.1" -> "bar.1" -> "baz.1"
    "foo.2" -> "bar.2" -> "baz.2"
    "foo.3" -> "bar.3" -> "baz.3"
    "bar.1" -> "bar.2" -> "bar.3"
}

ISO8601

ISO8601 is an international standard for writing dates and times which is used in Cylc with datetime cycling.

ISO8601 datetime

A date-time written in the ISO8601 format, e.g.:

  • 2000-01-01T00:00Z: midnight on the 1st of January 2000

ISO8601 duration

A duration written in the ISO8601 format, e.g.:

  • PT1H30M: one hour and thirty minutes.

job

A job is the realisation of a task. It consists of a file called the job script, which is executed when the job “runs”.

job host

The job host is the compute resource that a job runs on. For example, node_1 would be one of two possible job hosts on the platform my_hpc for the task some-task in the following workflow:

global.cylc:

[platforms]
    [[my_hpc]]
        hosts = node_1, node_2
        job runner = slurm

flow.cylc:

[runtime]
    [[some-task]]
        platform = my_hpc

job log
job log directory

When Cylc executes a job, stdout and stderr are redirected to the job.out and job.err files which are stored in the job log directory.

The job log directory lies within the run directory:

<run directory>/log/job/<cycle>/<task-name>/<submission-no>

Other files stored in the job log directory:

  • job: the job script.

  • job-activity.log: a log file containing details of the job’s progress.

  • job.status: a file holding Cylc’s most up-to-date understanding of the job’s present status.

job runner
batch system

A job runner (also known as batch system or job scheduler) is a system for submitting jobs to a job platform.

Job runners are set on a per-platform basis in global.cylc[platforms][<platform name>]job runner.

job script

A job script is the file containing a bash script which is executed when a job runs. A task’s job script can be found in the job log directory.

job submission number

Cylc may run multiple jobs per task (e.g. if the task failed and was re-tried). Each time Cylc runs a job it is assigned a submission number. The submission number starts at 1, incrementing with each submission.
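
A hedged sketch of a task that is retried, and so runs more than one job. The task name flaky is hypothetical, and the example assumes that Cylc exports the submission number to the job environment as CYLC_TASK_SUBMIT_NUMBER:

```cylc
[runtime]
    [[flaky]]
        # fails on the first submission, succeeds on the second
        script = test "${CYLC_TASK_SUBMIT_NUMBER}" -ge 2
        # retry once, one minute after a failure
        execution retry delays = PT1M
```

Here the first job (submission number 1) fails and the retry (submission number 2) succeeds, so the task ends up with two jobs.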

message trigger

A message trigger can be used to trigger a dependent task before the upstream task has completed.

We can use custom task outputs as triggers.

Messages should be defined in the runtime section of the workflow and the graph trigger notation refers to each message.
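
The following is a minimal sketch of a message trigger. The task names, the output name out1 and the message text are hypothetical, and the short form of the cylc message command used in the script is an assumption about how a job can send a message to the scheduler:

```cylc
[scheduling]
    [[graph]]
        R1 = """
            # bar triggers as soon as foo sends the "out1" message
            foo:out1 => bar
        """
[runtime]
    [[foo]]
        script = """
            cylc message -- "file 1 ready"
            sleep 60  # foo keeps running after the message is sent
        """
        [[[outputs]]]
            # <output name used in the graph> = <message sent by the job>
            out1 = "file 1 ready"
    [[bar]]
        script = true
```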

parameterisation

Parameterisation is a way to consolidate configuration in the Cylc flow.cylc file by implicitly looping over a set of pre-defined variables, e.g.:

[task parameters]
    foo = 1..3
[scheduling]
    [[graph]]
        R1 = bar<foo> => baz<foo>

digraph Mini_Cylc {
    baz_foo2
    bar_foo2 -> baz_foo2
    baz_foo3
    baz_foo1
    bar_foo3 -> baz_foo3
    bar_foo1 -> baz_foo1
}

pause

When a workflow is “paused” the scheduler is still running; however, it will not submit any new jobs.

This can be useful if you want to make a change to a running workflow.

Pause a workflow using cylc pause and resume it using cylc play.

platform
job platform

A platform is a configured setup for running jobs (usually remotely). Platforms are primarily defined by the combination of a job runner and a group of hosts (which share a file system).

For example, my_hpc could be the platform for the task some-task in the following workflow:

Global configuration (global.cylc):

[platforms]
    [[my_hpc]]
        hosts = node_1, node_2
        job runner = slurm

Workflow configuration (flow.cylc):

[runtime]
    [[some-task]]
        platform = my_hpc

play

We run a workflow using the cylc play command.

This starts a scheduler, the program that controls the flow; this is what we mean when we say that a workflow is “running”.

You can play, pause and stop a flow; Cylc will always carry on where it left off.

qualifier

A qualifier is used to determine the task state to which a dependency relates.
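
For illustration, a sketch of two qualifiers used in a graph. The task names are hypothetical and their [runtime] definitions are omitted:

```cylc
[scheduling]
    [[graph]]
        R1 = """
            # bar triggers when foo starts running
            foo:start => bar
            # baz triggers when foo succeeds; this is also what a plain
            # "foo => baz" means
            foo:succeed => baz
        """
```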

recurrence

A recurrence is a repeating sequence which may be used to define a cycling workflow. Recurrences determine how often something repeats and take one of two forms depending on whether the workflow is configured to use integer cycling or datetime cycling.

reflow

A reflow is a subsequent logical run of a workflow that is done by the same scheduler as the original flow.

Reflows are useful when you need to rewind your workflow run to allow it to evolve along a new path into the future.

release

Held tasks can be released using cylc release, allowing submission of jobs once again.

It is also possible to remove the “hold after cycle point” if set, using cylc release --all. This will also release all held tasks.

reload

Any changes made to the flow.cylc file whilst the workflow is running will not have any effect until the workflow is either stopped and restarted, or reloaded.

Reloading does not require the workflow to be shut down. When a workflow is reloaded, any currently “active” tasks will continue with their “pre-reload” configuration, whilst new tasks will use the new configuration.

Reloading changes is safe provided that they don’t affect the workflow’s graph. Changes to the graph have certain caveats attached; see the Cylc User Guide for details.

restart

When a stopped workflow is restarted, Cylc will pick up where it left off. Cylc will detect any jobs which have changed state (e.g. succeeded) during the period in which the workflow was stopped.

A restart is the behaviour of cylc play for a workflow that has been previously run.

run directory

This is a directory containing the configuration that Cylc uses to run the workflow.

Typically this is installed from the source directory using cylc install.

The run directory can be accessed by a running workflow using the environment variable CYLC_WORKFLOW_RUN_DIR.

scheduler

When we say that a workflow is “running” we mean that the Cylc scheduler is running.

The scheduler is responsible for running the workflow. It submits jobs, monitors their status and maintains the workflow state.

By default a scheduler is a daemon, meaning that it runs in the background (potentially on another host).

service directory

This directory is used to store information for internal use by Cylc.

It is called .service and is located in the run directory. It should exist for all installed workflows.

share directory

The share directory resides within a workflow’s run directory. It serves the purpose of providing a storage place for any files which need to be shared between different tasks.

<run directory>/share

The location of the share directory can be accessed by a job via the environment variable CYLC_WORKFLOW_SHARE_DIR.

In cycling workflows files are typically stored in cycle sub-directories.
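
A minimal sketch of two tasks exchanging a file through the share directory. The task names write and read and the file name greeting.txt are hypothetical:

```cylc
[scheduling]
    [[graph]]
        R1 = write => read
[runtime]
    [[write]]
        # create a file in the share directory
        script = echo "hello" > "${CYLC_WORKFLOW_SHARE_DIR}/greeting.txt"
    [[read]]
        # a different task reads the same file back
        script = cat "${CYLC_WORKFLOW_SHARE_DIR}/greeting.txt"
```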

source directory

Any directory where workflows are written and stored in preparation for installation with cylc install or reinstallation with cylc reinstall.

Tip

You can configure the default locations where cylc install will look for source directories using the global.cylc[install]source dirs configuration.

stalled workflow
stalled state

If Cylc is unable to continue running a workflow due to unmet dependencies, the workflow is said to be stalled.

This usually happens because of a task failure as in the following diagram:

digraph Example {
    foo [style="filled" color="#ada5a5"]
    bar [style="filled" color="#ff0000" fontcolor="white"]
    baz [color="#88c6ff"]
    foo -> bar -> baz
}

In this example the task bar has failed, meaning that baz is unable to run as its dependency (bar:succeed) has not been met.

When Cylc detects that a workflow has stalled, an email will be sent to the user. Human interaction is required to escape a stalled state.

start
startup

A start is when the Cylc scheduler runs a workflow for the first time. The scheduler is the program that controls the workflow and is what we refer to as “running”.

A workflow start can be either cold or warm (cold by default).

start cycle point

The start cycle point is the cycle point where the scheduler starts running from.

This may be before or after the initial cycle point.

See Start Cycle Point & Stop Cycle Point for more information.

stop
shutdown

When a workflow is shut down the scheduler is stopped. This means that no further jobs will be submitted.

By default Cylc waits for any submitted or running jobs to complete (either succeed or fail) before shutting down.

stop cycle point

The stop cycle point is the cycle point at which the scheduler shuts down.

This may be before or after the final cycle point.

See Start Cycle Point & Stop Cycle Point for more information.

suicide trigger

Suicide triggers remove tasks from the graph.

This allows Cylc to dynamically alter the graph based on events in the workflow.

Warning

Since Cylc 8 suicide triggers have been superseded by graph branching, which provides a simpler, superior solution.

Suicide triggers are denoted using an exclamation mark; for example, !foo would mean “remove the task foo from this cycle”.

a => b

# suicide trigger which removes the task "b" if "a" fails
# NOTE: since Cylc 8 this suicide trigger is not necessary
a:fail => !b

task

A task represents an activity in a workflow. It is a specification of that activity consisting of the script or executable to run and certain details of the environment it is run in.

The task specification is used to create a job which is executed on behalf of the task.

Tasks submit jobs and therefore each job belongs to one task. Each task can submit multiple jobs.
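
As a minimal sketch, a task specification might look like this. The task name hello and the variable NAME are hypothetical:

```cylc
[runtime]
    [[hello]]
        # the script to run
        script = echo "Hello ${NAME}"
        # details of the environment it runs in
        [[[environment]]]
            NAME = World
```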

task state

During a task’s life it will proceed through various states. These include:

  • waiting

  • running

  • succeeded

trigger
task trigger

Dependency relationships can be thought of the other way around as “triggers”.

For example the dependency foo => bar could be described in several ways:

  • bar depends on foo

  • foo triggers bar

  • bar triggers off of foo

In practice a trigger is the left-hand side of a dependency (foo in this example).

wall-clock time

In a Cylc workflow the wall-clock time refers to the actual time (in the real world).

warm start

In a cycling workflow, a warm start is one in which a workflow (that hasn’t been run before) starts from a start cycle point that is after the initial cycle point. Tasks in cycles before this point are treated as if they have succeeded.

work directory

When Cylc executes a job it does so inside the job’s working directory. This directory is created by Cylc and lies within the workflow’s run directory:

<run directory>/work/<cycle>/<task-name>

The location of the work directory can be accessed by a job via the environment variable CYLC_TASK_WORK_DIR.

workflow
cylc workflow

A Cylc workflow is a collection of tasks to carry out and dependencies that govern the order in which they run. This is represented in Cylc format in a flow.cylc file.

For example here is a Cylc workflow representing the brewing process:

flow.cylc:

[scheduling]
    cycling mode = integer
    initial cycle point = 1
    [[graph]]
        # repeat this for each batch
        P1 = """
            # the stages of brewing in the order they must occur in
            malt => mash => sparge => boil => chill => ferment => rack

            # must finish the sparge of one batch before
            # starting the next one
            sparge[-P1] => mash
        """

Cylc 7

In Cylc version 7 and earlier “workflows” were referred to as “suites”.

workflow log
workflow log directory

A Cylc workflow logs events and other information to the workflow log files when it runs. There are two log files:

  • log - a log of workflow events, consisting of information about user interaction.

  • file-installation-log - a log documenting the file installation process on remote platforms.

The workflow log directory lies within the run directory:

<run directory>/log/workflow